INFOTH Note 19: Channel Coding Theorem for DMC 2

19.1 Proof of Channel Coding Theorem for DMC

We inherit the setup in Note 18.

The Setup

  • DMC: $p(y|x)$
  • Definition of capacity
  • Encoding and decoding maps, $((e_n,d_n),n\ge 1)$
  • Prob. of error: $\lambda_{i,n}=\Pr[d_n(\bold y^n)\ne i|\bold x^n=e_n(i)]$
    • Max: $\lambda_{\max,n}=\max_{i\in[2^{nR}]}\lambda_{i,n}$
    • Average: $P_e^{(n)}=\frac1{2^{nR}}\sum_{i=1}^{2^{nR}}\lambda_{i,n}$

The Randomized Codebook

Although the channel encoding and decoding maps are fixed, we randomize the encoding map into a codebook, so that ==the encoding of a message $m$ into $\bold x^n$ follows a probability distribution==.

Fix a distribution $p(x)$ over $\mathcal X$. We build a codebook mapping every message $m\in[M_n]$, $M_n=2^{nR}$, to a sequence of length $n$, where every symbol of this sequence (the $i$-th entry) is a random variable:

$$ \text{Codebook } \mathcal C=\begin{bmatrix} X_1(1)&X_2(1)&\cdots& X_n(1)\\ \vdots & \vdots & \ddots & \vdots \\ X_1(2^{nR}) & X_2(2^{nR}) & \cdots & X_n(2^{nR}) \end{bmatrix} $$

where $X_i(m)$ is the $i$-th symbol to be transmitted when the message is $m$.

We suppose $X_i(m)\sim(p(x),x\in\mathcal X)$ i.i.d. for every $i$ and $m$, so that every codeword $\bold x^n$ has the same distribution $p(\bold x^n)=\prod_{i=1}^n p(x_i)$. This matches our target $P_e^{(n)}$, which puts the uniform prior $\frac1{2^{nR}}$ on the messages.
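To make the construction concrete, here is a minimal Python sketch (our illustration, not part of the note) of drawing such a codebook; the function name `random_codebook` and the convention of identifying $\mathcal X$ with $\{0,\dots,|\mathcal X|-1\}$ are our assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_codebook(n, R, p_x):
    """Draw the random codebook C: one row per message m in [2^{nR}],
    n columns, every entry X_i(m) drawn i.i.d. ~ p(x) over the
    alphabet {0, ..., |X|-1}."""
    M = int(round(2 ** (n * R)))        # number of messages, 2^{nR}
    alphabet = np.arange(len(p_x))
    return rng.choice(alphabet, size=(M, n), p=p_x)

# e.g. binary alphabet, uniform p(x), n = 8, R = 0.5 -> 16 codewords
C = random_codebook(8, 0.5, np.array([0.5, 0.5]))   # shape (16, 8)
```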

Note that $\lambda_{i,n}$ is now conditioned on the instantiated codebook:

$$ \lambda_{i,n}(\kappa):= \Pr[d_n(\bold y^n)\ne i|\bold x^n=\kappa(i),\mathcal C=\kappa] $$

Decoding Scheme: Exact Match or Giving Up

Upon seeing any $\bold y^n$, the receiver (channel decoder) performs the following procedure:

  • Check whether $(\bold x^n(m),\bold y^n)$ is in the jointly typical set $A_{\epsilon,X,Y}^{(n)}$, for every $m\in[2^{nR}]$.
  • If there is exactly one such $m$, output it; otherwise (no match, or more than one) output $*$ (giving up). A code sketch follows this list.
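A minimal Python sketch of this typical-set decoder (our illustration; it assumes finite alphabets indexed by integers and a known joint pmf `p_xy`, and all function names are ours):

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a pmf given as an array."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def jointly_typical(x, y, p_xy, eps):
    """Check (x, y) against A_{eps,X,Y}^{(n)}: the per-symbol empirical
    -log2-probabilities of x, y, and (x, y) must each be within eps
    of H(X), H(Y), and H(X, Y) respectively."""
    n = len(x)
    p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)
    lx = -np.log2(p_x[x]).sum() / n
    ly = -np.log2(p_y[y]).sum() / n
    lxy = -np.log2(p_xy[x, y]).sum() / n
    return (abs(lx - entropy(p_x)) <= eps
            and abs(ly - entropy(p_y)) <= eps
            and abs(lxy - entropy(p_xy.ravel())) <= eps)

def decode(y, C, p_xy, eps):
    """Output m iff exactly one codeword C[m] is jointly typical with y;
    otherwise give up and output '*'."""
    hits = [m for m in range(len(C)) if jointly_typical(C[m], y, p_xy, eps)]
    return hits[0] if len(hits) == 1 else '*'
```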

Under this scheme, we continue our proof.

Goal of Proof

Since the codebook $\mathcal C$ is a random matrix, it has a distribution over its instances, $\Pr[\mathcal C=\kappa]$.

We will prove that if $R<I(X;Y)$ (and hence, choosing the capacity-achieving $p(x)$, whenever $R<C$), then

$$ E_{\mathcal C}[P_e^{(n)}]\to 0,\quad n\to\infty. $$

We first have the cardinality bounds:

$$ |A_{\epsilon,X,Y}^{(n)}|\le 2^{n(H(X,Y)+\epsilon)},\quad |A_{\epsilon,X,Y}^{(n)}|\ge (1-\delta)2^{n(H(X,Y)-\epsilon)} $$
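For reference, these are with respect to the standard jointly typical set (presumably as defined in Note 18):

$$ A_{\epsilon,X,Y}^{(n)}=\Big\{(\bold x^n,\bold y^n):\Big|-\tfrac1n\log p(\bold x^n)-H(X)\Big|\le\epsilon,\ \Big|-\tfrac1n\log p(\bold y^n)-H(Y)\Big|\le\epsilon,\ \Big|-\tfrac1n\log p(\bold x^n,\bold y^n)-H(X,Y)\Big|\le\epsilon\Big\} $$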

The Independence

Let $\bold Y^n=Y_{1:n}$ be the received output in the context of the randomized codebook: $\bold Y^n\sim \Pr[\bold Y^n\mid \bold X^n(m),(\mathcal C,m)]$ when message $m$ is transmitted.

We suppose the actual transmitted message $m\in[2^{nR}]$ is chosen uniformly at random, since the quantity whose limit we are proving, $P_e^{(n)}$, is defined under the uniform prior on messages. (A bound on $\lambda_{\max,n}$ then follows by the standard trick of discarding the worse half of the codewords, so we do not pay much attention to it.)

Then for every $m'\ne m$, its codeword is independent of the output sequence:

$$ \bold X^n(m')\perp\!\!\!\perp \bold Y^n~|~\{m\text{ is transmitted}\} $$

because $\bold Y^n$ depends only on $\bold X^n(m)$, and the rows of $\mathcal C$ are generated mutually independently.

So for every $m'\ne m$,

$$ \Pr[(\bold X^n(m'),\bold Y^n)\in A_{\epsilon,X,Y}^{(n)}~|~m\text{ is transmitted}]=\sum_{(\bold x^n,\bold y^n)\in A_{\epsilon,X,Y}^{(n)}}(\prod_{i=1}^np(x_i))\cdot (\prod_{i=1}^n p(y_i)) $$

because of the independence.

The Bound

(Bounding the probability that some $m'\ne m$ appears jointly typical when $m$ is the original message)

Given the cardinality

$$ |A_{\epsilon,X,Y}^{(n)}|\le 2^{n(H(X,Y)+\epsilon)}, $$

and given that the two products satisfy, for all $(\bold x^n,\bold y^n)\in A_{\epsilon,X,Y}^{(n)}$,

$$ \prod_{i=1}^np(x_i)=p(\bold x^n)\le 2^{-n(H(X)-\epsilon)},\quad \prod_{i=1}^np(y_i)=p(\bold y^n)\le 2^{-n(H(Y)-\epsilon)}, $$

hence

$$ \begin{aligned} \Pr[(\bold X^n(m'),\bold Y^n)\in A_{\epsilon,X,Y}^{(n)}~|~m\text{ is transmitted}] &=\sum_{(\bold x^n,\bold y^n)\in A_{\epsilon,X,Y}^{(n)}}(\prod_{i=1}^np(x_i))\cdot (\prod_{i=1}^n p(y_i)) \\&\le \sum_{(\bold x^n,\bold y^n)\in A_{\epsilon,X,Y}^{(n)}}2^{-n(H(X)-\epsilon)}2^{-n(H(Y)-\epsilon)} \\&\le 2^{n(H(X,Y)+\epsilon)}2^{-n(H(X)-\epsilon)}2^{-n(H(Y)-\epsilon)} \\&=2^{-n({\color{red}{I(X;Y)}}-3\epsilon)}. \end{aligned} $$
  • We have found the mutual information.
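As a concrete check (our example, not from the note): for a binary symmetric channel with crossover probability $0.1$ and uniform input, $I(X;Y)=1-H_b(0.1)\approx 1-0.469=0.531$ bits, so the probability that a wrong codeword looks jointly typical is at most $2^{-n(0.531-3\epsilon)}$, exponentially small for small $\epsilon$.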

The Limit

Since the message sent follows the uniform distribution $\text{Unif}([2^{nR}])$, the expectation of the error probability over codebooks is

$$ \begin{aligned} P_e^{(n)}&=E_{\mathcal C}[P_e^{(n)}(\mathcal C)]\\&=\sum_\kappa \Pr[\mathcal C=\kappa]P_e^{(n)}(\kappa) \\&=\sum_\kappa \Pr[\mathcal C=\kappa]\cdot \frac1{2^{nR}}\sum_{m=1}^{2^{nR}}\lambda_{m,n}(\kappa) \\&=\frac1{2^{nR}}\sum_{m=1}^{2^{nR}}\sum_\kappa\Pr[\mathcal C=\kappa]\cdot\lambda_{m,n}(\kappa) \end{aligned} $$

Because of the symmetry of the codebook distribution (the rows are i.i.d.), exchanging the $m$-th row with the first row in every $\kappa$ maps the set of codebooks onto itself and preserves $\Pr[\mathcal C=\kappa]$.

Hence

$$ \sum_\kappa \Pr[\mathcal C=\kappa]\cdot\lambda_{m,n}(\kappa)=\sum_\kappa \Pr[\mathcal C=\kappa]\cdot\lambda_{1,n}(\kappa). $$

And

$$ \begin{aligned} P_e^{(n)}&=\dots \\&=\sum_\kappa\Pr[\mathcal C=\kappa]\cdot\lambda_{1,n}(\kappa) \\&=E_{\mathcal C}[\lambda_{1,n}(\mathcal C)]=\Pr[d_n(\bold Y^n)\ne 1\mid \bold X^n=\bold X^n(1)] \end{aligned} $$

where we assume message $m=1$ is sent w.l.o.g.

Let the event $\mathcal E_m$ denote that $(\bold X^n(m),\bold Y^n)$ is in the jointly typical set, then by the decoding scheme we have set before, and by the definition of $P_e^{(n)}$,

$$ P_e^{(n)}=\dots=\Pr[\mathcal E_1^c\cup \mathcal E_2\cup \cdots \cup \mathcal E_{2^{nR}}|m=1\text{ is transmitted}]\le \Pr[\mathcal E_1^c|m=1]+(2^{nR}-1)\Pr[\mathcal E_2|m=1] $$

by the union bound (which holds with equality iff the events are disjoint); by the symmetry of codebook generation, $\Pr[\mathcal E_{m'}|m=1]$ is the same for every $m'\ne 1$, giving the factor $(2^{nR}-1)$.

From the definition of the typical set, $\Pr[\mathcal E_1^c|m=1]\le \epsilon$ for sufficiently large $n$; and

$$ \begin{aligned} (2^{nR}-1)\Pr[\mathcal E_2|m=1]&\le 2^{nR}\cdot 2^{-n(I(X;Y)-3\epsilon)} \\&=2^{3n\epsilon}\cdot 2^{n(R-I(X;Y))} \end{aligned} $$

So

$$ P_e^{(n)}\le \epsilon+2^{3n\epsilon}\cdot 2^{n(R-I(X;Y))}. $$
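To see the numbers (continuing the BSC example above, our illustration): with $I(X;Y)\approx 0.531$, $R=0.4$ and $\epsilon=0.01$, the exponent is $R-I(X;Y)+3\epsilon=-0.101$, so $P_e^{(n)}\le 0.01+2^{-0.101n}$, which is already below $0.011$ for $n\ge 100$.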

Since this holds for any $\epsilon>0$: as long as $R<I(X;Y)$, pick $\epsilon$ small enough that $R<I(X;Y)-3\epsilon$; then the second term vanishes as $n\to\infty$, and letting $\epsilon\to 0$ kills the first term, so

$$ \lim_{n\to \infty} P_e^{(n)}=0. $$
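The random-coding argument can also be watched numerically. Below is a small self-contained Monte Carlo sketch over a BSC (our illustration: it substitutes minimum-Hamming-distance decoding, which is ML for the BSC, for the typicality decoder, and all names are ours):

```python
import numpy as np

rng = np.random.default_rng(1)

def avg_error_bsc(n, R, p, trials=200):
    """Monte Carlo estimate of E_C[P_e^{(n)}] for random coding over
    a BSC(p), decoding to the nearest codeword in Hamming distance."""
    M = max(2, int(round(2 ** (n * R))))              # 2^{nR} messages
    errors = 0
    for _ in range(trials):
        C = rng.integers(0, 2, size=(M, n))           # fresh random codebook
        m = rng.integers(M)                           # uniform message
        y = C[m] ^ (rng.random(n) < p).astype(int)    # BSC flips bits w.p. p
        m_hat = (C ^ y).sum(axis=1).argmin()          # nearest codeword
        errors += int(m_hat != m)
    return errors / trials

# R = 0.4 < C = 1 - H_b(0.1) ~ 0.531, so the estimate should fall with n
for n in (10, 20, 30):
    print(n, avg_error_bsc(n, 0.4, 0.1))
```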

19.2 Converse Channel Coding Theorem (for DMC)

With the same setup, if we have a coding scheme $((e_n,d_n),n\ge 1)$ s.t.

$$ P_e^{(n)}\to 0,n\to\infty, $$

then $R\le C$.
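For reference (the standard argument via Fano's inequality, not worked out in this note): with the message $W\sim\text{Unif}([2^{nR}])$,

$$ nR=H(W)=H(W\mid \bold Y^n)+I(W;\bold Y^n)\le \big(1+P_e^{(n)}\,nR\big)+nC, $$

where the first term is bounded by Fano's inequality and $I(W;\bold Y^n)\le I(\bold X^n;\bold Y^n)\le nC$ by data processing and memorylessness; dividing by $n$ and letting $n\to\infty$ (so $P_e^{(n)}\to 0$) gives $R\le C$.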

  • The boundary case $R=C$ is (mysteriously) not discussed: achievability is shown only for $R<C$, and the converse only gives $R\le C$.