EE229 Lecture Note 1

1.1 Axioms

Defining entropy

To (re)invent entropy, we require this quantity to satisfy a few axioms. Denote the entropy by $H(\cdot)$.

$H$ may be applied to a list of probabilities $p_1,\dots,p_n$, to a distribution tuple $p$, or to a random variable $X$.

Axiom 1:

Fair coin toss: random variable $X\in\{H,T\}$ with $P(X=H)=P(X=T)=1/2$.

We normalize the unit of entropy by setting $H(X)=H(\tfrac12,\tfrac12)=1$.

A natural axiomatic property is additivity over independent variables:

If $X \perp\!\!\!\perp Y$, i.e. $P(X,Y)=P(X)P(Y)$, then $H(X,Y)=H(X)+H(Y)$.
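The additivity property can be checked numerically. The sketch below assumes the Shannon formula $H(p_1,\dots,p_n)=-\sum_i p_i\log_2 p_i$ that this note derives from the axioms; the distributions `pX` and `pY` are arbitrary illustrative choices.

```python
import math

def H(*p):
    """Shannon entropy in bits of a distribution (p_1, ..., p_n), with 0 log 0 := 0."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

# X: a biased coin, Y: a fair die -- independent by construction.
pX = (0.3, 0.7)
pY = tuple(1 / 6 for _ in range(6))

# Joint distribution of (X, Y): product of the marginals.
pXY = tuple(px * py for px in pX for py in pY)

# Additivity for independent variables: H(X, Y) = H(X) + H(Y).
assert abs(H(*pXY) - (H(*pX) + H(*pY))) < 1e-12
```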

Axiom 2:

$$ H(p_1,\cdots,p_n)=H(p_1+p_2,p_3,\cdots,p_n)+(p_1+p_2)H({p_1\over p_1+p_2},{p_2\over p_1+p_2}) $$
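The grouping identity of Axiom 2 can likewise be verified numerically. This sketch again assumes the Shannon formula derived later in the note; the distribution `p` is an arbitrary example.

```python
import math

def H(*p):
    """Shannon entropy in bits, with 0 log 0 := 0."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

p = (0.1, 0.2, 0.3, 0.4)   # an arbitrary 4-state distribution
p1, p2 = p[0], p[1]
s = p1 + p2                # merged mass of the first two states

# Axiom 2: H(p1,...,pn) = H(p1+p2, p3,...,pn) + (p1+p2) H(p1/s, p2/s)
lhs = H(*p)
rhs = H(s, *p[2:]) + s * H(p1 / s, p2 / s)
assert abs(lhs - rhs) < 1e-12
```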

Axiom 3:

$p\mapsto H(p,1-p)$, $p\in[0,1]$, should be a continuous function.

Axiom 4:

$H$ is invariant under permutations of $(p_1,p_2,\cdots,p_n)$.

Only one choice for $H$

There is only one choice of $H$ satisfying these axioms for a finite-state distribution (this also extends to infinitely many states, which we will discuss later):

$$ H(p_1,\cdots,p_n)=-\sum_{i=1}^n p_i\log_2 p_i $$
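As a quick sanity check of this formula, a minimal sketch (the helper `H` is an illustrative implementation, not from the note):

```python
import math

def H(*p):
    """H(p_1,...,p_n) = -sum p_i log2 p_i, with the convention 0 log 0 := 0."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

print(H(0.5, 0.5))                # fair coin: 1.0, matching Axiom 1
print(H(0.25, 0.25, 0.25, 0.25))  # uniform over 4 states: 2.0 bits
```

Note that a uniform distribution over $2^k$ states gives exactly $k$ bits, which is consistent with additivity: $k$ independent fair coin tosses.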

Verification of Axiom 3

$$ p\mapsto H(p,1-p)=-p\log_2p - (1-p)\log_2(1-p) $$

  • is continuous on $[0,1]$
  • More importantly, it is concave.
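A few spot checks of the binary entropy function (the helper `Hb` is an illustrative implementation):

```python
import math

def Hb(p):
    """Binary entropy H(p, 1-p) in bits, with the convention 0 log 0 := 0."""
    return -sum(q * math.log2(q) for q in (p, 1 - p) if q > 0)

assert Hb(0.0) == 0.0 and Hb(1.0) == 0.0  # degenerate coins carry no uncertainty
assert Hb(0.5) == 1.0                     # fair coin, matching Axiom 1
assert abs(Hb(0.3) - Hb(0.7)) < 1e-12     # symmetry, matching Axiom 4
```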

The convention of logarithm notation

From now on, $\log$ denotes $\log_2(\cdot)$; the natural logarithm $\log_e(\cdot)$ will be written $\ln$.
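The two bases differ only by the constant factor $\ln 2$, so entropy in nats is entropy in bits times $\ln 2$. A minimal sketch of this conversion:

```python
import math

# Change of base: log2(x) = ln(x) / ln(2).
x = 10.0
assert abs(math.log2(x) - math.log(x) / math.log(2)) < 1e-12

# Entropy in nats = entropy in bits * ln 2.
p = (0.5, 0.5)
H_bits = -sum(pi * math.log2(pi) for pi in p)
H_nats = -sum(pi * math.log(pi) for pi in p)
assert abs(H_nats - H_bits * math.log(2)) < 1e-12
```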