INTRODUCTION (CHAPTER 1):
- Lecture 01: Introduction (Sect. 1.0)
  - Welcome to EE 8993: Information Theory
  - Review the course syllabus
  - Review the concept of a histogram and probability density function
  - Key concepts in Information Theory
- Lecture 02: Motivation (Sect. 1.1)
  - Entropy Calculations
  - Introduce the Binary Symmetric Channel (BSC)
  - Generalization to a Finite State Machine/Transducer
  - Introduce the need for a measure such as relative entropy
ENTROPY, RELATIVE ENTROPY, AND MUTUAL INFORMATION (CHAPTER 2):
- Lecture 02: Entropy (Sects. 2.1, 2.2)
  - Definition of entropy
  - Examples of entropy calculations (see the sketch following this lecture's topics)
  - Joint entropy
  - Conditional entropy (see handout_01.pdf)
  - The Chain Rule
  - Examples of conditional entropy calculations
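  A minimal Python sketch of these entropy calculations, using an assumed joint pmf (the
  numbers are illustrative, not from the course notes' examples): it computes H(X), H(Y|X),
  and H(X,Y) and checks the chain rule H(X,Y) = H(X) + H(Y|X).

      import math

      # Assumed joint pmf p(x, y) over X in {0, 1}, Y in {0, 1}.
      p_xy = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.125, (1, 1): 0.125}

      def H(probs):
          """Entropy in bits of a collection of probabilities."""
          return -sum(p * math.log2(p) for p in probs if p > 0)

      # Marginal p(x), joint entropy H(X,Y), and conditional entropy H(Y|X).
      p_x = {x: sum(p for (xx, _), p in p_xy.items() if xx == x) for x in (0, 1)}
      H_x = H(p_x.values())
      H_xy = H(p_xy.values())
      H_y_given_x = sum(p_x[x] * H([p_xy[(x, y)] / p_x[x] for y in (0, 1)])
                        for x in (0, 1))

      print(H_x, H_y_given_x, H_xy)                    # 0.811..., 0.938..., 1.75
      assert abs(H_xy - (H_x + H_y_given_x)) < 1e-12   # chain rule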
- Lecture 03: Mutual Information (Sects. 2.3, 2.4, 2.5)
  - Definition of mutual information (see the sketch following this lecture's topics)
  - Relationship between entropy and mutual information
  - Chain rule for entropy
  - Conditional mutual information
  - Chain rule for information
  - Conditional relative entropy
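  A short numeric sketch of mutual information, self-contained but reusing the same assumed
  joint pmf as the entropy sketch above; it evaluates I(X;Y) = sum p(x,y) log2[p(x,y)/(p(x)p(y))].

      import math

      # Assumed joint pmf and its marginals.
      p_xy = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.125, (1, 1): 0.125}
      p_x = {x: sum(p for (xx, _), p in p_xy.items() if xx == x) for x in (0, 1)}
      p_y = {y: sum(p for (_, yy), p in p_xy.items() if yy == y) for y in (0, 1)}

      # I(X;Y) = sum_{x,y} p(x,y) log2 [ p(x,y) / (p(x) p(y)) ]
      I_xy = sum(p * math.log2(p / (p_x[x] * p_y[y])) for (x, y), p in p_xy.items())
      print(I_xy)   # about 0.016 bits for this pmf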
- Lecture 04: Useful Inequalities (Sects. 2.6, 2.7, 2.8)
  - Convex and concave functions
  - Jensen's inequality and the Information inequality
  - Independence bound on entropy
  - The log sum inequality and its application to relative entropy and mutual information
  - Properties of Markov chains
  - The data processing inequality and its consequences
- Lecture 05: Sufficient Statistics and Fano's Inequality (Sects. 2.9, 2.10, 2.11)
- Homework
  - Easy: 2.1, 2.8, 2.16
  - Harder: 2.10, 2.15, 2.20
  - Ouch!: 2.24, 2.31, 2.32, 2.35
THE ASYMPTOTIC EQUIPARTITION PROPERTY (CHAPTER 3):
- Lecture 06: The Asymptotic Equipartition Property Theorem (Sect. 3.1)
  - Introduction
  - The AEP Theorem
  - The definition of a Typical Set
  - Properties of typical sets (see the sketch following this lecture's topics)
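  A small numeric sketch of a typical set (the Bernoulli(0.3) source, n = 12, and epsilon = 0.1
  are assumptions): it enumerates all length-n binary sequences, keeps those whose sample
  entropy -(1/n) log2 p(x^n) is within epsilon of H(p), and reports the size and total
  probability of the set.

      import math
      from itertools import product

      p, n, eps = 0.3, 12, 0.1       # assumed source parameter, block length, tolerance
      H = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

      typical_count, typical_prob = 0, 0.0
      for seq in product((0, 1), repeat=n):
          k = sum(seq)                             # number of ones
          prob = (p ** k) * ((1 - p) ** (n - k))
          sample_entropy = -math.log2(prob) / n    # -(1/n) log2 p(x^n)
          if abs(sample_entropy - H) <= eps:       # membership in the typical set
              typical_count += 1
              typical_prob += prob

      print(f"H = {H:.3f} bits, |A| = {typical_count} of {2 ** n}, P(A) = {typical_prob:.3f}")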
- Lecture 07: Consequences of the AEP (Sects. 3.2, 3.3)
  - Generalized approaches to data compression
  - The Data Compression Theorem
  - High probability sets and the typical set
- Homework
  - Easy: 3.3
  - Harder: 3.5, 3.6
ENTROPY RATES OF STOCHASTIC PROCESSES (CHAPTER 4):
- Lecture 08: Markov Chains (Sect. 4.1)
  - Definition of a Markov chain
  - Definitions of stationarity and time-invariance
  - Examples of Markov chains
- Lecture 09: Entropy Rate (Sect. 4.2)
  - Definition of entropy rate
  - Examples of some stochastic processes and their rates
  - Theorems about the rates of stationary processes
  - Entropy rates of Markov chains (see handout_05.pdf and the sketch following this lecture's topics)
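  A minimal Python sketch of the entropy rate of a stationary two-state Markov chain,
  H = sum_i pi_i H(row i of P); the transition probabilities are assumptions.

      import math

      def H(row):
          """Entropy in bits of one row of a transition matrix."""
          return -sum(p * math.log2(p) for p in row if p > 0)

      # Assumed transition matrix P[i][j] = Pr(next = j | current = i).
      P = [[0.9, 0.1],
           [0.4, 0.6]]

      # Stationary distribution of a two-state chain (solves pi P = pi).
      a, b = P[0][1], P[1][0]
      pi = [b / (a + b), a / (a + b)]

      entropy_rate = sum(pi[i] * H(P[i]) for i in range(2))
      print(entropy_rate)   # about 0.57 bits per symbol for these numbers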
- Lecture 10: Random Walk (Sect. 4.3)
  - Definition of a random walk process
  - Computation of the entropy rate
  - An example drawn from chess
- Lecture 11: Hidden Markov Models (Sect. 4.4)
  - What is an HMM?
  - A lower bound on the entropy rate of a stationary HMM
  - An upper bound on the entropy rate of a stationary HMM
- Homework
  - Easy: 4.6, 4.10
  - Harder: 4.3, 4.9
  - Ouch!: 4.1, 4.13
DATA COMPRESSION (CHAPTER 5):
- Lecture 12: The Kraft Inequality (Sects. 5.1, 5.2)
  - Examples of source codes
  - Definitions of various properties of codes
  - The Kraft inequality (see the numeric check following this lecture's topics)
  - The extended Kraft inequality
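  A short numeric check of the Kraft inequality, sum_i D^(-l_i) <= 1, for an assumed set of
  binary codeword lengths.

      # Binary (D = 2) code with assumed lengths, e.g. the prefix code {0, 10, 110, 111}.
      lengths = [1, 2, 3, 3]
      kraft_sum = sum(2 ** (-l) for l in lengths)
      print(kraft_sum, kraft_sum <= 1)   # 1.0 True: a prefix code with these lengths exists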
- Lecture 13: Bounds on Optimal Codes (Sects. 5.3 - 5.5)
  - Expected length and entropy rate
  - Properties of the minimum expected codeword length
  - Uniquely decodable codes and the Kraft inequality
- Lecture 14: Huffman Coding (Sects. 5.6 - 5.8)
  - An example of a binary Huffman code (see the code sketch following this lecture's topics)
  - An example of a ternary Huffman code
  - Properties of an optimal code
  - Proof of the optimality of a Huffman code
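  A compact Python sketch of binary Huffman code construction using a heap; the source alphabet
  and probabilities are assumptions, and the expected length should land between H(X) and H(X) + 1.

      import heapq
      from itertools import count

      def huffman(probs):
          """Return a binary Huffman code {symbol: codeword} for a dict of probabilities."""
          tiebreak = count()                       # keeps heap comparisons well defined
          heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probs.items()]
          heapq.heapify(heap)
          while len(heap) > 1:
              p1, _, code1 = heapq.heappop(heap)   # the two least probable subtrees
              p2, _, code2 = heapq.heappop(heap)
              merged = {s: "0" + c for s, c in code1.items()}
              merged.update({s: "1" + c for s, c in code2.items()})
              heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
          return heap[0][2]

      probs = {"a": 0.45, "b": 0.25, "c": 0.15, "d": 0.10, "e": 0.05}   # assumed source
      code = huffman(probs)
      avg_len = sum(probs[s] * len(code[s]) for s in probs)
      print(code, avg_len)   # expected length 2.0 bits; H(X) is about 1.98 bits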
- Lecture 15: Shannon Coding (Sects. 5.9 - 5.11)
  - Description of the Shannon coding procedure (see handout_06.pdf)
  - An overview of arithmetic coding
  - Competitive optimality of the Shannon code
- Lecture 16: EXAM NO. 1 (Chaps. 1-4)
- Lecture 17: Discrete Distributions (Sect. 5.12)
  - Motivation based on computer-based random number generation
  - An example of how to generate a simple probability distribution
  - Formulation of the general problem as tree generation
  - Derivation of the relation between the expected tree depth and the entropy
  - Bounds on the expected number of fair bits required
  - An equality relation for dyadic distributions (see the code sketch following this lecture's topics)
  - An example of tree generation for a non-dyadic distribution
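  A small Python sketch of generating a dyadic distribution from fair coin flips by walking a
  binary tree; the assumed target pmf {1/2, 1/4, 1/4} is dyadic, so the expected number of fair
  bits used equals its entropy of 1.5 bits.

      import math
      import random

      # Assumed dyadic target: each symbol is a leaf of a binary tree (codeword = path).
      tree = {"a": "0", "b": "10", "c": "11"}      # Pr(a) = 1/2, Pr(b) = Pr(c) = 1/4

      def sample():
          """Flip fair bits and walk down the tree until a leaf (symbol) is reached."""
          path, bits = "", 0
          while True:
              path += str(random.getrandbits(1))
              bits += 1
              for sym, leaf in tree.items():
                  if path == leaf:
                      return sym, bits

      n = 100_000
      avg_bits = sum(sample()[1] for _ in range(n)) / n
      H = -(0.5 * math.log2(0.5) + 2 * 0.25 * math.log2(0.25))
      print(avg_bits, H)   # both close to 1.5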
- Homework
  - Easy: 5.4, 5.5, 5.6
  - Harder: 5.8, 5.12, 5.15
  - Ouch!: 5.21, 5.23
CHANNEL CAPACITY (CHAPTER 8):
- Lecture 18: Examples of Channel Capacity (Sects. 8.1 - 8.3)
  - Definition of a discrete channel and channel capacity
  - The capacity of a noiseless binary channel
  - The capacity of a binary symmetric channel (see the numeric sketch following this lecture's topics)
  - Symmetric channels
  - Properties of channel capacity
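  A two-line numeric sketch of the capacity of a binary symmetric channel, C = 1 - H(p); the
  crossover probability is an assumption.

      import math

      p = 0.1                                                 # assumed crossover probability
      Hp = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
      print(1 - Hp)                                           # C = 1 - H(p), about 0.531 bits per use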
- Lecture 19: Formal Definition of Rate and Jointly Typical Sequences (Sects. 8.5 - 8.6)
  - Definition of a discrete channel
  - Definition of a code
  - Definition of rate
  - Definition of a jointly typical sequence
  - The Joint Asymptotic Equipartition Theorem
- Lecture 20: The Channel Coding Theorem - Part I (Sect. 8.7)
  - Statement of the achievability theorem
  - New ideas introduced by Shannon to prove the theorem
  - Outline of the proof
- Lecture 21: The Channel Coding Theorem - Part II (Sects. 8.7 - 8.8)
  - Calculation of the probability of error
  - Bounds on the probability of error
  - Strengthening the conditions of the proof
  - Zero-error codes: proof that a probability of error equal to zero implies the rate is no greater than the capacity
- Lecture 22: Fano's Inequality (Sects. 8.9 - 8.11)
  - Proof of Fano's inequality
  - Capacity per transmission
  - Converse to the channel coding theorem
  - The critical nature of capacity
  - Characteristics of codes that achieve capacity
- Lecture 23: Joint Source-Channel Coding (Sects. 8.12 - 8.13)
  - Feedback capacity
  - Data compression and data transmission
  - The two-stage implementation
  - The Source-Channel Coding Theorem
- Homework
  - Easy: 8.3, 8.5, 8.9
  - Harder: 8.1, 8.2, 8.4
  - Ouch!: 8.6, 8.12
DIFFERENTIAL ENTROPY (CHAPTER 9):
- Lecture 24: Properties of Differential Entropy (Sects. 9.1 - 9.3)
  - Definition of differential entropy
  - Example computations and the Normal distribution
  - The AEP for continuous variables
  - Definitions and theorems analogous to the discrete case
- Lecture 25: Joint and Conditional Differential Entropy for Continuous Random Variables (Sects. 9.4 - 9.7)
  - Definition of joint entropy
  - Entropy of a multivariate normal distribution
  - Relative entropy
  - Properties of differential entropy and mutual information
- Homework
  - Easy: 9.1, 9.3
  - Harder: 9.4
  - Ouch!: 9.6
THE GAUSSIAN CHANNEL (CHAPTER 10):
- Lecture 26: The Coding Theorem for Gaussian Channels (Sects. 10.1 - 10.2)
  - The importance of the Gaussian channel model
  - An overview of the digital communications problem
  - Definition of the information capacity of a power-constrained channel
  - Calculation of the capacity of a power-constrained Gaussian channel
- Lecture 27: Bandlimited Gaussian Channels (Sects. 10.3 - 10.4)
  - The Sampling Theorem revisited
  - The capacity of a bandlimited channel
  - The capacity of parallel Gaussian channels
  - An implementation based on the "water-filling" approach (see the code sketch following this lecture's topics)
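  A Python sketch of the water-filling allocation for parallel Gaussian channels: powers
  P_i = max(nu - N_i, 0), with the water level nu found by bisection so that the powers sum to
  the budget; the noise variances and power budget are assumptions.

      import math

      noise = [1.0, 4.0, 9.0]        # assumed noise variances N_i of the parallel channels
      total_power = 6.0              # assumed total power budget

      def allocated(nu):
          """Total power used at water level nu."""
          return sum(max(nu - N, 0.0) for N in noise)

      # Bisection for nu such that sum_i max(nu - N_i, 0) = total_power.
      lo, hi = min(noise), min(noise) + total_power
      for _ in range(100):
          nu = 0.5 * (lo + hi)
          lo, hi = (nu, hi) if allocated(nu) < total_power else (lo, nu)

      powers = [max(nu - N, 0.0) for N in noise]
      capacity = sum(0.5 * math.log2(1 + P / N) for P, N in zip(powers, noise))
      print(powers, capacity)        # quieter channels get more power; here nu is about 5.5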
- Lecture 28: Colored Noise and Feedback (Sects. 10.5 - 10.6)
  - Properties of colored noise
  - Derivation of a strategy to maximize the capacity
  - The capacity of a Gaussian channel with colored noise
  - Does feedback increase capacity?
  - Derivation of the capacity of a feedback channel
- Homework
  - Easy: 10.2, 10.3
  - Harder: 10.1
  - Ouch!: 10.4
INFORMATION THEORY AND STATISTICS (CHAPTER 12):
- Lecture 29: The Method of Types (Sect. 12.1)
  - Definition of a type
  - Definition of a set of types
  - Definition of a type class
  - The size of a set of types
  - Probability and type
  - Size of a type class (see the numeric sketch following this lecture's topics)
  - Probability of a type class
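  A short numeric sketch of the method-of-types quantities for an assumed binary sequence and an
  assumed source Q: the type P, the exact size of the type class, and the probability of the
  type class, compared with the 2^(nH(P)) and 2^(-nD(P||Q)) estimates.

      import math

      x = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0]   # assumed observed sequence
      n, k = len(x), sum(x)
      P = [1 - k / n, k / n]                     # the type (empirical distribution) of x
      Q = [0.5, 0.5]                             # assumed true source distribution

      H_P = -sum(p * math.log2(p) for p in P if p > 0)
      D_PQ = sum(p * math.log2(p / q) for p, q in zip(P, Q) if p > 0)

      type_class_size = math.comb(n, k)          # exact |T(P)| for a binary alphabet
      type_class_prob = type_class_size * (Q[1] ** k) * (Q[0] ** (n - k))

      print(type_class_size, 2 ** (n * H_P))     # |T(P)| vs. its 2^(nH(P)) upper bound
      print(type_class_prob, 2 ** (-n * D_PQ))   # Q^n(T(P)) vs. its 2^(-nD(P||Q)) estimate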
- Lecture 30: EXAM NO. 2 (Chaps. 5, 6, 8, 9)
- Lecture 31: Universal Coding (Sects. 12.2 - 12.5)
  - Why universal codes?
  - Properties of universal codes
  - Sanov's theorem
  - Examples of Sanov's theorem: fair dice
- Lecture 32: The Conditional Limit Theorem (Sect. 12.6)
  - Approximations for the probability of a set of types
  - The Pythagorean Theorem for relative entropy
  - An overview of the proof of the theorem
  - An example involving n fair dice
- Lecture 33: Hypothesis Testing (Sect. 12.7)
  - The general hypothesis testing problem
  - The Neyman-Pearson lemma
  - Log-likelihood ratios
- Lecture 34: Chernoff Information and Bound (Sects. 12.8 - 12.9)
  - Stein's lemma
  - Chernoff information
  - Bayesian decision rules
  - An example involving baseball batting averages and football strategies
  - Chernoff's bound
- Lecture 35: Lempel-Ziv Coding (Sect. 12.10)
  - Description of the algorithm (see the code sketch following this lecture's topics)
  - Definition of a parse
  - The Lempel-Ziv lemma
  - Bounds on the entropy for Lempel-Ziv coding
  - Asymptotic bounds on Lempel-Ziv coding
  - Extensions and enhancements of the basic approach
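  A short Python sketch of the incremental parse underlying Lempel-Ziv coding: the string is
  split into the shortest phrases not seen before, and the phrase count c(n) drives the coding
  bound. The input string is an assumption.

      def lz_parse(s):
          """Parse s into the sequence of shortest phrases not seen before."""
          phrases, seen, current = [], set(), ""
          for ch in s:
              current += ch
              if current not in seen:        # the shortest new phrase ends here
                  seen.add(current)
                  phrases.append(current)
                  current = ""
          if current:                        # a possibly repeated tail phrase
              phrases.append(current)
          return phrases

      s = "1011010100010"                    # assumed input string
      phrases = lz_parse(s)
      print(phrases, len(phrases))   # ['1', '0', '11', '01', '010', '00', '10'], c(n) = 7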
- Lecture 36: Fisher Information (Sect. 12.11)
  - Definition of an estimator
  - Definition of Fisher information
  - An example of an efficient estimator
- Homework
  - Easy: 12.1, 12.12
  - Harder: 12.2, 12.7
  - Ouch!: 12.6, 12.8
RATE DISTORTION THEORY (CHAPTER 13):
- Lecture 37: Rate Distortion Theory (Sects. 13.1 - 13.2)
  - Intuitive arguments for joint quantization
  - Non-uniform quantization
  - Definition of a rate distortion code
  - Minimum achievable rate for a given distortion
- Lecture 38: Explanation of Rate Distortion Theory (Sect. 13.3)
  - The vector quantization problem
  - Clustering
  - Example calculations of rate distortion functions
- Lecture 39: Lossy Compression (Sects. 13.4, 13.5)
  - Achievability of the rate distortion function
  - Gaussian sources and channels
- Lecture 40: Characterization of the Rate Distortion Function (Sects. 13.6, 13.7)
  - The method of Lagrange multipliers
- Lecture 41: Computation of the Rate Distortion Function (Sect. 13.8)
  - Minimizing relative entropy
  - An iterative approach for finding a rate distortion function (see the code sketch following this lecture's topics)
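  A Python sketch of an alternating-minimization (Blahut-Arimoto style) iteration for one point
  on the rate distortion curve; the Bernoulli source, Hamming distortion, and slope parameter
  beta are assumptions. Each sweep re-estimates the test channel q(xhat|x) and the output
  marginal q(xhat), then reports the resulting (D, R) pair, which should lie near R(D) = 1 - H(D)
  for this source.

      import math

      p_x = [0.5, 0.5]                   # assumed Bernoulli(1/2) source
      d = [[0.0, 1.0], [1.0, 0.0]]       # Hamming distortion d(x, xhat)
      beta = 2.0                         # assumed slope parameter (larger beta -> smaller D)

      q_xhat = [0.5, 0.5]                # initial output marginal
      for _ in range(200):
          # Step 1: test channel q(xhat|x) proportional to q(xhat) * exp(-beta * d(x, xhat)).
          q_cond = []
          for x in range(2):
              row = [q_xhat[xh] * math.exp(-beta * d[x][xh]) for xh in range(2)]
              z = sum(row)
              q_cond.append([r / z for r in row])
          # Step 2: output marginal induced by the test channel.
          q_xhat = [sum(p_x[x] * q_cond[x][xh] for x in range(2)) for xh in range(2)]

      D = sum(p_x[x] * q_cond[x][xh] * d[x][xh] for x in range(2) for xh in range(2))
      R = sum(p_x[x] * q_cond[x][xh] * math.log2(q_cond[x][xh] / q_xhat[xh])
              for x in range(2) for xh in range(2) if q_cond[x][xh] > 0)
      print(D, R)                        # about (0.119, 0.473) for these assumptions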
- Homework
  - Easy: 13.1, 13.2
  - Harder: 13.5
  - Ouch!: 13.8
NETWORK INFORMATION THEORY (CHAPTER 14):
- Lecture 42: Network Information Theory (Sects. 14.1 - 14.5)
  - Multiple access channels
  - Channel capacity
  - Slepian-Wolf coding
- Lecture 43: The Relay Channel and Side Information (Sects. 14.6 - 14.10)
  - Degraded broadcast channels
  - Relay channels
  - Channels with side information
- Homework
  - Easy: 14.1, 14.2
  - Harder: 14.8
  - Ouch!: 14.10
- Lecture 44: Maximum Entropy Spectral Estimation (Chapter 11)
  - Maximum entropy distributions
  - Examples using maximum entropy solutions
  - Variational principles applied to spectral estimation
  - The relationship between linear prediction and maximum entropy spectral analysis
- Homework
  - Easy: 11.1, 11.3
  - Harder: 11.5
  - Ouch!: 11.7
- Lecture 45: The Stock Market (Chapter 15)
  - Mathematical definition of a portfolio
  - The mean-variance approach to stock market modeling
  - The Capital Asset Pricing Model
  - The log-optimal portfolio
  - Properties of the log-optimal portfolio
  - Investments in stationary markets
- Lecture 46: FINAL EXAM (CUMULATIVE)
  - May 12, 8 AM to 11 AM, 2nd Floor Seminar Room