INFOTH Note 20: Channel Coding Schemes
20.1 Maximum Likelihood Decoding
Note: This is referenced from Toyoaki Nishida’s slides: IT-09.
Setup

Since it is unlikely for multiple errors to occur, the output will, with high probability, not be far from the input (assuming the output space is the same as the input space).
For every message $i$, we have $w_i:=e_n(i)=\bold x^n(i)$ defined as the encoded code block or channel input.
We define $\Omega_i$ as the decoding region of message $i$.
Here comes the decoding algorithm:
- Compute $k(y)=\arg\max_{i}p(y|w_i)$, i.e. find the message $i$ whose codeword makes $y$ most likely.
- Given this $k(y)$, assign $y$ to $\Omega_{k(y)}$ i.e. $y\in\Omega_{k(y)}$.
- Upon seeing $y$, output the decoded output as $k(y)$.
Since the probability of correctness (a metric for evaluating the decoding scheme) is
$$ P_c^{(n)}=\frac1{2^{nR}}\sum_i \sum_{y\in\Omega_i}p(y|w_i)=\frac1{2^{nR}}\sum_i(1-\lambda_{i,n})=1-P_e^{(n)}, $$to maximize $P_c^{(n)}$ we need to assign each $y$ to $\Omega_{\arg\max_i p(y|w_i)}$.
Don’t forget we want to select a codebook $\mathcal C\subseteq\mathcal X^n$ with $|\mathcal C|=M_n$.
Example
The input alphabet is $\mathcal X=\{a_1,\dots,a_8\}$ and $\mathcal Y=\mathcal X$. The channel transition matrix $p(y|x)$ is $\Pi$ (rows indexed by $x$, columns by $y$):
$$ \Pi = \begin{bmatrix} 0.72 & 0.03 & 0.01 & 0.12 & 0.04 & 0.01 & 0.05 & 0.02 \\ 0.01 & 0.65 & 0.03 & 0.04 & 0.05 & 0.07 & 0.03 & 0.12 \\ 0.03 & 0.01 & 0.77 & 0.06 & 0.02 & 0.03 & 0.01 & 0.07 \\ 0.02 & 0.09 & 0.03 & 0.66 & 0.04 & 0.08 & 0.04 & 0.04 \\ 0.01 & 0.04 & 0.02 & 0.01 & 0.86 & 0.03 & 0.01 & 0.02 \\ 0.04 & 0.01 & 0.03 & 0.04 & 0.01 & 0.82 & 0.03 & 0.02 \\ 0.01 & 0.02 & 0.04 & 0.05 & 0.03 & 0.03 & 0.78 & 0.04 \\ 0.06 & 0.05 & 0.04 & 0.03 & 0.04 & 0.05 & 0.04 & 0.69 \end{bmatrix} $$
Set $\{a_2=w_1,a_4=w_2,a_7=w_3\}$ as the used codewords, i.e. we have 3 kinds of messages.
How do we choose the decoding regions $\Omega_i$ to maximize $P_c$?
- For each output $y$, look down its column and choose the codeword whose row has the largest probability.
For example, for output $a_1$, $k(a_1)$ is chosen by comparing $p(a_1|w_1)=0.01$, $p(a_1|w_2)=0.02$ and $p(a_1|w_3)=0.01$ (rows 2, 4, 7 of the first column). So $k(a_1)=2$ since the maximum corresponds to $w_2$ -> $a_1\in\Omega_2$.
$$ \Omega_1=\{a_2,a_5,a_8\},\quad \Omega_2=\{a_1,a_4,a_6\},\quad \Omega_3=\{a_3,a_7\}. $$
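A minimal Python sketch that recomputes these regions by applying the ML rule column by column (the variable names are mine, not from the slides):
```python
import numpy as np

# Channel transition matrix: Pi[x, y] = p(a_{y+1} | a_{x+1}) from the example
Pi = np.array([
    [0.72, 0.03, 0.01, 0.12, 0.04, 0.01, 0.05, 0.02],
    [0.01, 0.65, 0.03, 0.04, 0.05, 0.07, 0.03, 0.12],
    [0.03, 0.01, 0.77, 0.06, 0.02, 0.03, 0.01, 0.07],
    [0.02, 0.09, 0.03, 0.66, 0.04, 0.08, 0.04, 0.04],
    [0.01, 0.04, 0.02, 0.01, 0.86, 0.03, 0.01, 0.02],
    [0.04, 0.01, 0.03, 0.04, 0.01, 0.82, 0.03, 0.02],
    [0.01, 0.02, 0.04, 0.05, 0.03, 0.03, 0.78, 0.04],
    [0.06, 0.05, 0.04, 0.03, 0.04, 0.05, 0.04, 0.69],
])

codewords = [1, 3, 6]  # 0-based rows of a_2, a_4, a_7, i.e. w_1, w_2, w_3

regions = {1: [], 2: [], 3: []}
for y in range(8):                        # each output symbol a_{y+1}
    k = int(np.argmax(Pi[codewords, y]))  # ML rule: argmax_i p(y | w_i)
    regions[k + 1].append(f"a_{y + 1}")
print(regions)
# {1: ['a_2', 'a_5', 'a_8'], 2: ['a_1', 'a_4', 'a_6'], 3: ['a_3', 'a_7']}
```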
20.2 Block Coding Generally & BLC
Def: Block Coding
The channel encoder takes a message block $m$ of length $k$ and encodes it into a longer $n$-bit block, the codeword. $n-k$ is the number of redundant bits.
Since every $k$-bit message block is mapped with the same codebook $\mathcal C$, the coding scheme is memoryless (unlike convolutional coding).
Binary Linear Code $(n,k)$
- $n$: codeword (block) length in bits
- $k$: message length in bits
So $R=k/n$ is the rate of the code: we have $2^k$ messages, and the rate satisfies $2^{nR}=2^k$. The redundant $n-k$ bits can be used for error detection and correction.
Setup: DMC with input alphabet $\mathcal X=\{0,1\}=\Z_2$.
Def: Encoding Scheme & Generator Matrix
The encoding is via a generator matrix $G\in\Z_2^{k\times n}$ (a $k\times n$ matrix over the binary field $\Z_2$, where the addition is mod 2, equivalent to XOR), i.e.
$$ c=mG,\quad c\in\mathcal X^n,m\in\mathcal X^k $$where $m$ is the message of $k$ symbols and $c$ is the code.
We want $\rank(G)=k$ so that distinct messages map to distinct codewords, while generally $k<n$.
For example, if $m=\begin{bmatrix}1&0&1\end{bmatrix}$ (a row vector) and $G$ is $3\times 6$, then $c=(G_1+G_3)\bmod 2=G_1\oplus G_3$ where $G_i$ is the $i$-th row. See the sketch below.
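A minimal sketch of this encoding in Python (the particular systematic $G$ below is a made-up example, not from the notes):
```python
import numpy as np

# A hypothetical 3x6 generator matrix in systematic form [I_3 | A^T]
G = np.array([[1, 0, 0, 1, 1, 0],
              [0, 1, 0, 1, 0, 1],
              [0, 0, 1, 0, 1, 1]])

m = np.array([1, 0, 1])  # row-vector message
c = m @ G % 2            # c = mG over Z_2: XOR of rows G_1 and G_3
print(c)                 # [1 0 1 1 0 1]
```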
Def: Parity-check, Syndrome
Suppose we receive $y\in\mathcal X^n$. Ideally, $y\equiv c$, but with some noise, $y\ne c$ is possible.
For an $(n,k)$ BLC, the parity-check matrix $H\in\Z_2^{(n-k)\times n}$ is defined s.t.
$$ cH^\top=\bold 0,\quad\forall c\in\mathcal C $$For $y$, the syndrome is
$$ s=yH^\top\in\mathcal X^{n-k}. $$If $s=\bold 0$, $y$ is a valid codeword, i.e. an element of $\mathcal C$. Usually we can then say $y=c$, but the noise may have turned $c$ into another valid codeword, which this check cannot detect.
$H$ and $G$
If, by a column permutation, we can write $H$ in the form $[A\mid I_{n-k}]$, then $G$ can be derived as $G=[I_k\mid A^\top]$.
Also,
$$ HG^\top=\bold 0 $$should be noted.
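A quick numerical check of these two relations, assuming $H$ is already in the form $[A\mid I_{n-k}]$ (the particular $A$ below is arbitrary):
```python
import numpy as np

n, k = 6, 3
A = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1]])                      # arbitrary (n-k) x k block
H = np.hstack([A, np.eye(n - k, dtype=int)])   # H = [A | I_{n-k}]
G = np.hstack([np.eye(k, dtype=int), A.T])     # G = [I_k | A^T]
assert np.all(G @ H.T % 2 == 0)                # GH^T = 0, equivalently HG^T = 0, over Z_2
```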
Error Correction using Syndrome
We model the error as a vector $e\in\mathcal X^n$, then $y=c+e$ and
$$ s=cH^\top+eH^\top=eH^\top, $$and we pair each syndrome $s$ with its most probable error pattern $e$, making a table of pairs $(s,e)$, then recover $c=y-e\equiv y+e$ (over $\Z_2$, subtraction and addition are both XOR); see the sketch below.
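A minimal sketch of this lookup for the common case where the most probable error pattern has weight at most 1 (reasonable when errors are rare, as in the Setup above); `syndrome_correct` and the brute-force search are my own illustration, not the slides' algorithm:
```python
import numpy as np

def syndrome_correct(y, H):
    # Syndrome of the received word: s = yH^T over Z_2
    s = y @ H.T % 2
    if not s.any():
        return y                      # s = 0: assume e = 0
    n = H.shape[1]
    for i in range(n):                # try every weight-1 error pattern
        e = np.zeros(n, dtype=int)
        e[i] = 1
        if np.array_equal(e @ H.T % 2, s):
            return (y + e) % 2        # c = y - e = y XOR e over Z_2
    return y                          # no weight-1 match: give up
```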
20.3 Hamming Code (a type of Binary Linear Code)
- Precedent: Parity Checking Code, etc.
- Hamming Code: can correct 1 error!
Def: Hamming Code
A binary linear code with the parameter $(n,k)=(2^{r}-1,2^{r}-1-r)$ where $H\in\Z_2^{r\times (2^r-1)}$. For example, given $r=3$, there are 3 redundant bits (bits for parity check), where $(n,k)=(7,4)$.
For the parity check matrix $H$, the $i$-th column (starting from 1) is the binary representation of $i$ itself:
$$ H=\begin{bmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & \leftarrow \text{col idx} \\ \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \\ 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 \end{bmatrix} $$in the $r=3$ case. Also, $H$ is sometimes column-permuted to $H\bold P$ such that the right part is $I_r$ (the identity matrix), to make computation convenient.
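This column rule makes $H$ easy to build programmatically; a small sketch (the variable names are mine):
```python
import numpy as np

r = 3
n = 2**r - 1
# Column i (1-based) is the r-bit binary representation of i, MSB in the top row
H = np.array([[(i >> (r - 1 - row)) & 1 for i in range(1, n + 1)]
              for row in range(r)])
print(H)
# [[0 0 0 1 1 1 1]
#  [0 1 1 0 0 1 1]
#  [1 0 1 0 1 0 1]]
```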
Syndrome
Suppose the received $y$ contains exactly 1 error, at the $i$-th position of $c$, i.e. $e_i=1$ and all other entries of $e$ are 0. Since
$$ s=eH^\top, $$the syndrome is the $i$-th column of $H$, i.e. the binary representation of $i$.
Encoding Algorithm
Step 1: Number of Parity Bits $r$
For a message of $k$ bits, we want to find $r$ s.t.
$$ k\le 2^r-1-r $$where the encoded codeword has length $2^r-1$. For example, for 8 message bits, we need at least $r=4$ parity bits; see the sketch below.
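A small sketch of this search for $r$ (the function name is mine):
```python
def num_parity_bits(k: int) -> int:
    # Smallest r with k <= 2^r - 1 - r
    r = 1
    while 2**r - 1 - r < k:
        r += 1
    return r

print(num_parity_bits(8))  # 4
print(num_parity_bits(4))  # 3, the (7,4) Hamming code
```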
Step 2: Insert Parity Bits
We insert parity bits at the $2^{i-1}$-th indices of the codeword: 1, 2, 4, 8, …
The original message (data) bits fill the remaining positions.
For a message 1011 with $r=3$, $P_1,P_2,P_3$ are in index 1,2,4.
The data $D_1,D_2,D_3,D_4$ are in index 3,5,6,7 then.
Let $X_i$ denote the value at the $i$-th index, covering both data and parity bits.
| Index | 7 | 6 | 5 | 4 | 3 | 2 | 1 |
|---|---|---|---|---|---|---|---|
| Variable | $D_4$ | $D_3$ | $D_2$ | $P_3$ | $D_1$ | $P_2$ | $P_1$ |
| Value | 1 | 0 | 1 | $P_3$ | 1 | $P_2$ | $P_1$ |
Step 3: Calculate Parity Bit Values
For $P_i$, consider the $i$-th lowest bit of the binary representation of each data index. An index is taken into account only if that bit is 1.
For example, for $P_1$ (bit 001) the candidates are 3=011, 5=101, 6=110, 7=111. We only pick 3, 5, 7, since 6 has a 0 in that bit. Then $P_1=X_3\oplus X_5\oplus X_7=D_1\oplus D_2\oplus D_4=1$.
Similarly, $P_2$ -> 3, 6, 7 and $P_2=D_1\oplus D_3\oplus D_4=0$; $P_3$ -> 5, 6, 7 and $P_3=D_2\oplus D_3\oplus D_4=0$.
Filling these in, we get 1010101 as the codeword.
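Putting the three steps together for the $(7,4)$ case, with the bit layout and parity equations exactly as in the table above (a sketch; the function name is mine):
```python
def hamming74_encode(d4, d3, d2, d1):
    # Parity equations from Step 3 (indices with the matching bit set)
    p1 = d1 ^ d2 ^ d4   # covers indices 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers indices 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # covers indices 5, 6, 7
    # Codeword laid out from index 7 down to index 1
    return [d4, d3, d2, p3, d1, p2, p1]

print(hamming74_encode(1, 0, 1, 1))  # message 1011 -> [1, 0, 1, 0, 1, 0, 1]
```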
Decoding Algorithm with 1-bit Error Correction
Suppose $y=1000101$ (indices 7 down to 1), where the 5th bit has an error.
We compute the syndrome as $s=yH^\top$. Since we have 3 parity bits for error correction, each $s_j$ XORs the bits whose indices have a 1 in the $j$-th lowest bit of their binary representations:
$$ \begin{aligned} s_1&=y_1\oplus (y_3\oplus y_5\oplus y_7)=1,\\ s_2&=y_2\oplus (y_3\oplus y_6\oplus y_7)=0,\\ s_3&=y_4\oplus (y_5\oplus y_6\oplus y_7)=1, \end{aligned} $$and we see $s_3s_2s_1=101$, representing 5, which is the error location!
If $s=0$ (all syndrome bits are 0), we have no errors.
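A sketch of the full 1-bit correction loop for the $(7,4)$ case, with bits ordered from index 7 down to 1 as above (the function name is mine):
```python
def hamming74_correct(y):
    x = {i: b for i, b in zip(range(7, 0, -1), y)}  # x[i] = bit at index i
    s1 = x[1] ^ x[3] ^ x[5] ^ x[7]
    s2 = x[2] ^ x[3] ^ x[6] ^ x[7]
    s3 = x[4] ^ x[5] ^ x[6] ^ x[7]
    pos = (s3 << 2) | (s2 << 1) | s1   # read the syndrome as s3 s2 s1
    if pos:                            # nonzero syndrome: flip the suspected bit
        x[pos] ^= 1
    return [x[i] for i in range(7, 0, -1)], pos

print(hamming74_correct([1, 0, 0, 0, 1, 0, 1]))  # ([1, 0, 1, 0, 1, 0, 1], 5)
```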
However, we can only correct a 1-bit error (or, if the code is used for detection alone, detect a 2-bit error). When positions $i$ and $j$ both have errors, $s=i\oplus j$ (XOR of the binary representations), and the same XOR rule extends to more errors, so correction fails: the syndrome still points at a single index, which is usually wrong when 2 or more bits are in error. Hence, unless we are certain the error is 1-bit, applying the correction may itself corrupt the word.