INFOTH Note 19: Channel Coding Theorem for DMC 2

19.1 Proof of Channel Coding Theorem for DMC

We inherit the setup in Note 18.

The Setup

  • DMC: $p(y|x)$
  • Definition of capacity
  • Encoding and decoding maps, $((e_n,d_n),n\ge 1)$
  • Prob. of error: $\lambda_{i,n}=\Pr[d_n(\bold y^n)\ne i|\bold x^n=e_n(i)]$
    • Max: $\lambda_{\max,n}=\max_{i\in[2^{nR}]}\lambda_{i,n}$
    • Average: $P_e^{(n)}=\frac1{2^{nR}}\sum_{i=1}^{2^{nR}}\lambda_{i,n}$

The Randomized Codebook

Although the channel encoding and decoding maps are fixed, we randomize the encoding map into a codebook, so that ==the encoding of a message $m$ into $\bold x^n$ follows a probability distribution==.

Fix a distribution $p(x)$ over $\mathcal X$. We build a codebook mapping every message $m\in[M_n]$, $M_n=2^{nR}$, to a sequence of length $n$, where every symbol of this sequence (the $i$-th entry) is a random variable:

$$ \text{Codebook } \mathcal C=\begin{bmatrix} X_1(1)&X_2(1)&\cdots& X_n(1)\\ \vdots & \vdots & \ddots & \vdots \\ X_1(2^{nR}) & X_2(2^{nR}) & \cdots & X_n(2^{nR}) \end{bmatrix} $$

where $X_i(m)$ is the $i$-th symbol to be transmitted when the message is $m$.

We suppose $X_i(m)\sim(p(x),x\in\mathcal X)$ i.i.d. for every $i$ and $m$, so that every codeword $\bold x^n$ has the same distribution $p(\bold x^n)=\prod_{i=1}^n p(x_i)$. This matches our target $P_e^{(n)}$, which puts the uniform prior $\frac1{2^{nR}}$ on the messages.
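To make the construction concrete, here is a minimal Python sketch (our illustration, not part of the note) of drawing such a codebook; the function name `random_codebook` and the convention of identifying $\mathcal X$ with $\{0,\dots,|\mathcal X|-1\}$ are our assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_codebook(n, R, p_x):
    """Draw the random codebook C: one row per message m in [2^{nR}],
    n columns, every entry X_i(m) drawn i.i.d. ~ p(x) over the
    alphabet {0, ..., |X|-1}."""
    M = int(round(2 ** (n * R)))        # number of messages, 2^{nR}
    alphabet = np.arange(len(p_x))
    return rng.choice(alphabet, size=(M, n), p=p_x)

# e.g. binary alphabet, uniform p(x), n = 8, R = 0.5 -> 16 codewords
C = random_codebook(8, 0.5, np.array([0.5, 0.5]))   # shape (16, 8)
```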

Note that $\lambda_{i,n}$ is now conditioned on the instantiated codebook:

$$ \lambda_{i,n}(\kappa):= \Pr[d_n(\bold y^n)\ne i|\bold x^n=\kappa(i),\mathcal C=\kappa] $$

Decoding Scheme: Exact Match or Giving Up

Upon seeing any $\bold y^n$, the receiver (channel decoder) performs the following procedure:

  • Check whether $(\bold x^n(m),\bold y^n)$ is in the jointly typical set $A_{\epsilon,X,Y}^{(n)}$, for every $m\in[2^{nR}]$.
  • If there is exactly one such $m$, output it; otherwise (no match, or more than one) output $*$ (giving up). A code sketch follows this list.
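A minimal Python sketch of this typical-set decoder (our illustration; it assumes finite alphabets indexed by integers and a known joint pmf `p_xy`, and all function names are ours):

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a pmf given as an array."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def jointly_typical(x, y, p_xy, eps):
    """Check (x, y) against A_{eps,X,Y}^{(n)}: the per-symbol empirical
    -log2-probabilities of x, y, and (x, y) must each be within eps
    of H(X), H(Y), and H(X, Y) respectively."""
    n = len(x)
    p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)
    lx = -np.log2(p_x[x]).sum() / n
    ly = -np.log2(p_y[y]).sum() / n
    lxy = -np.log2(p_xy[x, y]).sum() / n
    return (abs(lx - entropy(p_x)) <= eps
            and abs(ly - entropy(p_y)) <= eps
            and abs(lxy - entropy(p_xy.ravel())) <= eps)

def decode(y, C, p_xy, eps):
    """Output m iff exactly one codeword C[m] is jointly typical with y;
    otherwise give up and output '*'."""
    hits = [m for m in range(len(C)) if jointly_typical(C[m], y, p_xy, eps)]
    return hits[0] if len(hits) == 1 else '*'
```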

Under this scheme, we continue our proof.

Goal of Proof

Since the codebook $\mathcal C$ is a random matrix, it has a distribution over its instances, $\Pr[\mathcal C=\kappa]$.

We will prove that if $R<I(X;Y)$ (and hence, choosing the capacity-achieving $p(x)$, whenever $R<C$), then

$$ E_{\mathcal C}[P_e^{(n)}]\to 0,\quad n\to\infty. $$

We first have the cardinality bounds:

$$ |A_{\epsilon,X,Y}^{(n)}|\le 2^{n(H(X,Y)+\epsilon)},\quad |A_{\epsilon,X,Y}^{(n)}|\ge (1-\delta)2^{n(H(X,Y)-\epsilon)} $$
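For reference, these are with respect to the standard jointly typical set (presumably as defined in Note 18):

$$ A_{\epsilon,X,Y}^{(n)}=\Big\{(\bold x^n,\bold y^n):\Big|-\tfrac1n\log p(\bold x^n)-H(X)\Big|\le\epsilon,\ \Big|-\tfrac1n\log p(\bold y^n)-H(Y)\Big|\le\epsilon,\ \Big|-\tfrac1n\log p(\bold x^n,\bold y^n)-H(X,Y)\Big|\le\epsilon\Big\} $$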

The Independence

Let $\bold Y^n=Y_{1:n}$ be the received output in the context of the randomized codebook: $\bold Y^n\sim \Pr[\bold Y^n\mid \bold X^n(m),(\mathcal C,m)]$ when message $m$ is transmitted.

We suppose the actual transmitted message $m\in[2^{nR}]$ is chosen uniformly at random, since the quantity whose limit we are proving, $P_e^{(n)}$, is defined under the uniform prior on messages. (A bound on $\lambda_{\max,n}$ then follows by the standard trick of discarding the worse half of the codewords, so we do not pay much attention to it.)

Then for every $m'\ne m$, its codeword is independent of the output sequence:

$$ \bold X^n(m')\perp\!\!\!\perp \bold Y^n~|~\{m\text{ is transmitted}\} $$

because $\bold Y^n$ depends only on $\bold X^n(m)$, and the rows of $\mathcal C$ are generated mutually independently.

So for every $m'\ne m$,

$$ \Pr[(\bold X^n(m'),\bold Y^n)\in A_{\epsilon,X,Y}^{(n)}~|~m\text{ is transmitted}]=\sum_{(\bold x^n,\bold y^n)\in A_{\epsilon,X,Y}^{(n)}}(\prod_{i=1}^np(x_i))\cdot (\prod_{i=1}^n p(y_i)) $$

because of the independence.

The Bound

(Bounding the probability that some $m'\ne m$ appears jointly typical when $m$ is the original message)

Given the cardinality

$$ |A_{\epsilon,X,Y}^{(n)}|\le 2^{n(H(X,Y)+\epsilon)}, $$

and given that the two products satisfy, for all $(\bold x^n,\bold y^n)\in A_{\epsilon,X,Y}^{(n)}$,

$$ \prod_{i=1}^np(x_i)=p(\bold x^n)\le 2^{-n(H(X)-\epsilon)},\quad \prod_{i=1}^np(y_i)=p(\bold y^n)\le 2^{-n(H(Y)-\epsilon)}, $$

hence

$$ \begin{aligned} \Pr[(\bold X^n(m'),\bold Y^n)\in A_{\epsilon,X,Y}^{(n)}~|~m\text{ is transmitted}] &=\sum_{(\bold x^n,\bold y^n)\in A_{\epsilon,X,Y}^{(n)}}(\prod_{i=1}^np(x_i))\cdot (\prod_{i=1}^n p(y_i)) \\&\le \sum_{(\bold x^n,\bold y^n)\in A_{\epsilon,X,Y}^{(n)}}2^{-n(H(X)-\epsilon)}2^{-n(H(Y)-\epsilon)} \\&\le 2^{n(H(X,Y)+\epsilon)}2^{-n(H(X)-\epsilon)}2^{-n(H(Y)-\epsilon)} \\&=2^{-n({\color{red}{I(X;Y)}}-3\epsilon)}. \end{aligned} $$
  • We have found the mutual information.
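As a concrete check (our example, not from the note): for a binary symmetric channel with crossover probability $0.1$ and uniform input, $I(X;Y)=1-H_b(0.1)\approx 1-0.469=0.531$ bits, so the probability that a wrong codeword looks jointly typical is at most $2^{-n(0.531-3\epsilon)}$, exponentially small for small $\epsilon$.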

The Limit

Since the message sent follows the uniform distribution $\text{Unif}([2^{nR}])$, the expectation of the error probability over codebooks is

$$ \begin{aligned} P_e^{(n)}&=E_{\mathcal C}[P_e^{(n)}(\mathcal C)]\\&=\sum_\kappa \Pr[\mathcal C=\kappa]P_e^{(n)}(\kappa) \\&=\sum_\kappa \Pr[\mathcal C=\kappa]\cdot \frac1{2^{nR}}\sum_{m=1}^{2^{nR}}\lambda_{m,n}(\kappa) \\&=\frac1{2^{nR}}\sum_{m=1}^{2^{nR}}\sum_\kappa\Pr[\mathcal C=\kappa]\cdot\lambda_{m,n}(\kappa) \end{aligned} $$

Because of the symmetry of the codebook distribution (the rows are i.i.d.), exchanging the $m$-th row with the first row in every $\kappa$ maps the set of codebooks onto itself and preserves $\Pr[\mathcal C=\kappa]$.

Hence

$$ \sum_\kappa \Pr[\mathcal C=\kappa]\cdot\lambda_{m,n}(\kappa)=\sum_\kappa \Pr[\mathcal C=\kappa]\cdot\lambda_{1,n}(\kappa). $$

And

$$ \begin{aligned} P_e^{(n)}&=\dots \\&=\sum_\kappa\Pr[\mathcal C=\kappa]\cdot\lambda_{1,n}(\kappa) \\&=E_{\mathcal C}[\lambda_{1,n}(\mathcal C)]=\Pr[d_n(\bold Y^n)\ne 1\mid \bold X^n=\bold X^n(1)] \end{aligned} $$

where we assume message $m=1$ is sent w.l.o.g.

Let the event $\mathcal E_m$ denote that $(\bold X^n(m),\bold Y^n)$ is in the jointly typical set, then by the decoding scheme we have set before, and by the definition of $P_e^{(n)}$,

$$ P_e^{(n)}=\dots=\Pr[\mathcal E_1^c\cup \mathcal E_2\cup \cdots \cup \mathcal E_{2^{nR}}|m=1\text{ is transmitted}]\le \Pr[\mathcal E_1^c|m=1]+(2^{nR}-1)\Pr[\mathcal E_2|m=1] $$

by the union bound (which holds with equality iff the events are disjoint); by the symmetry of codebook generation, $\Pr[\mathcal E_{m'}|m=1]$ is the same for every $m'\ne 1$, giving the factor $(2^{nR}-1)$.

From the definition of the typical set, $\Pr[\mathcal E_1^c|m=1]\le \epsilon$ for sufficiently large $n$; and

$$ \begin{aligned} (2^{nR}-1)\Pr[\mathcal E_2|m=1]&\le 2^{nR}\cdot 2^{-n(I(X;Y)-3\epsilon)} \\&=2^{3n\epsilon}\cdot 2^{n(R-I(X;Y))} \end{aligned} $$

So

$$ P_e^{(n)}\le \epsilon+2^{3n\epsilon}\cdot 2^{n(R-I(X;Y))}. $$
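To see the numbers (continuing the BSC example above, our illustration): with $I(X;Y)\approx 0.531$, $R=0.4$ and $\epsilon=0.01$, the exponent is $R-I(X;Y)+3\epsilon=-0.101$, so $P_e^{(n)}\le 0.01+2^{-0.101n}$, which is already below $0.011$ for $n\ge 100$.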

Since this holds for any $\epsilon>0$: as long as $R<I(X;Y)$, pick $\epsilon$ small enough that $R<I(X;Y)-3\epsilon$; then the second term vanishes as $n\to\infty$, and letting $\epsilon\to 0$ kills the first term, so

$$ \lim_{n\to \infty} P_e^{(n)}=0. $$
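The random-coding argument can also be watched numerically. Below is a small self-contained Monte Carlo sketch over a BSC (our illustration: it substitutes minimum-Hamming-distance decoding, which is ML for the BSC, for the typicality decoder, and all names are ours):

```python
import numpy as np

rng = np.random.default_rng(1)

def avg_error_bsc(n, R, p, trials=200):
    """Monte Carlo estimate of E_C[P_e^{(n)}] for random coding over
    a BSC(p), decoding to the nearest codeword in Hamming distance."""
    M = max(2, int(round(2 ** (n * R))))              # 2^{nR} messages
    errors = 0
    for _ in range(trials):
        C = rng.integers(0, 2, size=(M, n))           # fresh random codebook
        m = rng.integers(M)                           # uniform message
        y = C[m] ^ (rng.random(n) < p).astype(int)    # BSC flips bits w.p. p
        m_hat = (C ^ y).sum(axis=1).argmin()          # nearest codeword
        errors += int(m_hat != m)
    return errors / trials

# R = 0.4 < C = 1 - H_b(0.1) ~ 0.531, so the estimate should fall with n
for n in (10, 20, 30):
    print(n, avg_error_bsc(n, 0.4, 0.1))
```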

19.2 Converse Channel Coding Theorem (for DMC)

With the same setup, if we have a coding scheme $((e_n,d_n),n\ge 1)$ s.t.

$$ P_e^{(n)}\to 0,n\to\infty, $$

then $R\le C$.
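For reference (the standard argument via Fano's inequality, not worked out in this note): with the message $W\sim\text{Unif}([2^{nR}])$,

$$ nR=H(W)=H(W\mid \bold Y^n)+I(W;\bold Y^n)\le \big(1+P_e^{(n)}\,nR\big)+nC, $$

where the first term is bounded by Fano's inequality and $I(W;\bold Y^n)\le I(\bold X^n;\bold Y^n)\le nC$ by data processing and memorylessness; dividing by $n$ and letting $n\to\infty$ (so $P_e^{(n)}\to 0$) gives $R\le C$.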

  • The boundary case $R=C$ is (mysteriously) not discussed: achievability is shown only for $R<C$, and the converse only gives $R\le C$.