Introduction to Information Theory: channel capacity and models
A.J. Han Vinck, University of Essen
May 2011
This lecture
- some channel models
- channel capacity
- Shannon channel coding theorem and its converse
some channel models
[Diagram: Input X → P(y|x) → Output Y; the P(y|x) are the channel transition probabilities]
memoryless:
- output at time i depends only on input at time i
- input and output alphabet finite
Example: binary symmetric channel (BSC)
[Diagram: an error source produces the error sequence E; the channel output is Y = X ⊕ E]
E is the binary error sequence s.t. P(1) = 1-P(0) = p
X is the binary information sequence
Y is the binary output sequence
[BSC diagram: 0→0 and 1→1 with probability 1−p; 0→1 and 1→0 with probability p]
from AWGN to BSC
Homework: calculate the capacity as a function of A and σ² (via the crossover probability p of the resulting BSC).
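A minimal numeric sketch of this homework, under the assumption of antipodal signalling with levels ±A, Gaussian noise of variance σ², and hard decisions at threshold 0, so the induced BSC has crossover probability p = Q(A/σ):

import math

def Q(x):
    """Gaussian tail function Q(x) = P(N(0,1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def h(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity_from_awgn(A, sigma):
    p = Q(A / sigma)       # crossover probability of the induced BSC
    return 1 - h(p)        # C_BSC = 1 - h(p)

print(bsc_capacity_from_awgn(A=1.0, sigma=0.5))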
Other models
[Z-channel (optical) diagram: X → Y with P(X=0) = P0; 0 (light on) → 0 with probability 1; 1 (light off) → 1 with probability 1−p, → 0 with probability p]
[Erasure channel (MAC) diagram: X → Y with P(X=0) = P0; 0→0 and 1→1 with probability 1−e; 0→E and 1→E with probability e]
Erasure with errors
[Diagram: 0→0 and 1→1 with probability 1−p−e; 0→E and 1→E with probability e; 0→1 and 1→0 with probability p]
burst error model (Gilbert-Elliott)
Error source:
- Random error channel: outputs independent; P(0) = 1 − P(1)
- Burst error channel: outputs dependent; P(0 | state = bad) = P(1 | state = bad) = 1/2; P(0 | state = good) = 1 − P(1 | state = good) = 0.999
State info: good or bad
[Two-state Markov diagram: transition probabilities Pgb (good→bad) and Pbg (bad→good); self-transitions Pgg and Pbb]
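A minimal simulation sketch of this error source; the transition probabilities Pgb and Pbg are free parameters (values below are assumptions for illustration), the per-state error probabilities are the slide's numbers:

import random

def gilbert_elliott(n, Pgb=0.01, Pbg=0.1, p_good=0.001, p_bad=0.5):
    """Generate n error bits from a two-state Gilbert-Elliott source."""
    state, errors = "good", []
    for _ in range(n):
        p = p_bad if state == "bad" else p_good
        errors.append(1 if random.random() < p else 0)
        # state transition of the Markov chain
        if state == "good" and random.random() < Pgb:
            state = "bad"
        elif state == "bad" and random.random() < Pbg:
            state = "good"
    return errors

e = gilbert_elliott(100_000)
print("error rate:", sum(e) / len(e))   # errors appear clustered in bursts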
channel capacity:
I(X;Y) = H(X) - H(X|Y) = H(Y) – H(Y|X) (Shannon 1948)
Note: capacity depends on the input probabilities, because the transition probabilities are fixed.
[Diagram: X → channel → Y]
capacity: C = max over P(x) of I(X;Y)
Practical communication system design
[Block diagram: message → encoder → code word in (length n) → channel → receive → decoder → message estimate]
There are 2^k code words of length n; k is the number of information bits transmitted in n channel uses.
[The encoder uses a code book of 2^k words; the receiver sees the code book with errors]
Channel capacity
Definition:
The rate R of a code is the ratio k/n, where
k is the number of information bits transmitted in n channel uses
Shannon showed that:
for R ≤ C encoding methods exist with decoding error probability → 0 (as the code length n → ∞).
Encoding and decoding according to Shannon
Code: 2^k binary codewords, where P(0) = P(1) = ½
Channel errors: P(0→1) = P(1→0) = p, i.e. # error sequences ≈ 2^{nh(p)}
Decoder: search around the received sequence for a codeword with ≈ np differences
[Picture: space of 2^n binary sequences, with decoding regions around the codewords]
decoding error probability
1. For t errors with |t/n − p| > ε:
   probability → 0 for n → ∞ (law of large numbers)
2. More than 1 codeword in the decoding region (codewords random):
   P(> 1) ≈ (2^k − 1) · 2^{nh(p)} · 2^{−n} ≤ 2^{k + nh(p) − n} = 2^{−n(1 − h(p) − R)} = 2^{−n(C_BSC − R)}
   → 0 for R = k/n < 1 − h(p) and n → ∞
channel capacity: the BSC
[BSC diagram: X → Y; 0→0 and 1→1 with probability 1−p; 0→1 and 1→0 with probability p]
I(X;Y) = H(Y) − H(Y|X)
The maximum of H(Y) = 1, since Y is binary.
H(Y|X) = P(X=0) h(p) + P(X=1) h(p) = h(p)
Conclusion: the capacity for the BSC is C_BSC = 1 − h(p).
Homework: draw C_BSC; what happens for p > ½?
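A small sketch for this homework: tabulate C_BSC = 1 − h(p) on a grid of p (any plotting tool can then draw the curve from these values):

import math

def h(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

for p in [0.0, 0.1, 0.25, 0.5, 0.75, 0.9, 1.0]:
    print(f"p = {p:4.2f}   C_BSC = {1 - h(p):.3f}")   # capacity is 0 at p = 1/2, 1 again at p = 1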
channel capacity: the BSC
[Plot: channel capacity C_BSC versus bit error probability p, for 0 ≤ p ≤ 1]
Explain the behaviour!
channel capacity: the Z-channel
Application in optical communications
[Z-channel diagram: X → Y; 0 (light on) → 0 with probability 1; 1 (light off) → 1 with probability 1−p, → 0 with probability p]
P(X=0) = P0
H(Y) = h( P0 + p(1 − P0) )
H(Y|X) = (1 − P0) h(p)
For capacity, maximize I(X;Y) over P0.
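A numeric sketch of this maximization by a simple grid search over P0 (a closed-form optimum exists, but the scan is enough to evaluate and plot the capacity):

import math

def h(q):
    """Binary entropy in bits."""
    return 0.0 if q in (0.0, 1.0) else -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def z_capacity(p, steps=100_000):
    """Capacity of the Z-channel with crossover p, by grid search over P0."""
    best = 0.0
    for i in range(steps + 1):
        P0 = i / steps
        I = h(P0 + p * (1 - P0)) - (1 - P0) * h(p)   # I(X;Y) from the slide
        best = max(best, I)
    return best

print(z_capacity(0.1))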
channel capacity: the erasure channel
Application: CDMA detection
[Erasure channel diagram: X → Y; 0→0 and 1→1 with probability 1−e; 0→E and 1→E with probability e]
P(X=0) = P0
I(X;Y) = H(X) − H(X|Y)
H(X) = h(P0)
H(X|Y) = e · h(P0)
Thus C_erasure = 1 − e.
(Check! Draw and compare with the BSC and the Z-channel.)
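A quick numeric check (function names are mine): maximizing I(X;Y) = (1 − e)·h(P0) over P0 indeed gives 1 − e, and a BSC whose crossover equals the same e is much worse:

import math

def h(q):
    """Binary entropy in bits."""
    return 0.0 if q in (0.0, 1.0) else -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def erasure_I(P0, e):
    # I(X;Y) = H(X) - H(X|Y) = h(P0) - e*h(P0) = (1 - e)*h(P0)
    return (1 - e) * h(P0)

e = 0.2
print(max(erasure_I(i / 1000, e) for i in range(1001)))   # -> 1 - e = 0.8
print(1 - h(e))                                           # BSC with p = e: about 0.28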
Erasure with errors: calculate the capacity!
[Diagram: inputs 0, 1; outputs 0, E, 1; 0→0 and 1→1 with probability 1−p−e; 0→E and 1→E with probability e; 0→1 and 1→0 with probability p]
example
Consider the following example
[Diagram: ternary channel with inputs {0, 1, 2} and outputs {0, 1, 2}; inputs 0 and 2 are received error-free; input 1 is received as 0, 1 or 2, each with probability 1/3]
For P(0) = P(2) = p, P(1) = 1 − 2p:
H(Y) = h(1/3 − 2p/3) + (2/3 + 2p/3); H(Y|X) = (1 − 2p) log2 3
Q: maximize H(Y) − H(Y|X) as a function of p.
Q: is this the capacity?
Hint, use the following: log2 x = ln x / ln 2; d ln x / dx = 1/x.
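A numeric sketch for the first question, scanning p over [0, ½] (the hint gives the route to the analytic answer):

import math

def h(q):
    """Binary entropy in bits."""
    return 0.0 if q in (0.0, 1.0) else -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def I(p):
    q = 1/3 - 2*p/3                                  # q = P(Y = 1)
    return h(q) + (1 - q) - (1 - 2*p) * math.log2(3)  # H(Y) - H(Y|X)

best_p = max((i / 10_000 for i in range(5001)), key=I)
print(best_p, I(best_p))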
channel models: general diagram
[Diagram: inputs x1, x2, …, xn on the left, outputs y1, y2, …, ym on the right, connected by transition probabilities P1|1, P2|1, P1|2, P2|2, …, Pm|n]
Input alphabet X = {x1, x2, …, xn}
Output alphabet Y = {y1, y2, …, ym}
Pj|i = PY|X(yj|xi)
In general:
calculating the capacity needs more theory.
The statistical behavior of the channel is completely defined by the channel transition probabilities Pj|i = PY|X(yj|xi).
* Clue:
I(X;Y) is concave (∩-convex) in the input probabilities,
i.e. finding the maximum is simple.
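Because of this concavity, the capacity of an arbitrary DMC can be found iteratively. A minimal sketch of the Blahut-Arimoto algorithm (not part of these slides, but the standard method):

import numpy as np

def blahut_arimoto(P, iters=500):
    """Capacity (bits) of a DMC with transition matrix P[i, j] = P(y_j | x_i)."""
    n = P.shape[0]
    p = np.full(n, 1.0 / n)                 # input distribution, start uniform
    for _ in range(iters + 1):
        q = p @ P                           # output distribution q(y)
        logs = np.zeros_like(P)
        np.log2(P / q, where=(P > 0), out=logs)
        D = np.sum(P * logs, axis=1)        # D[i] = D( P(.|x_i) || q ) in bits
        p = p * np.exp2(D)                  # multiplicative update
        p /= p.sum()
    return float(p @ D)                     # at convergence: C = sum_x p(x) D[x]

# BSC with p = 0.1: expect 1 - h(0.1), about 0.531
P = np.array([[0.9, 0.1],
              [0.1, 0.9]])
print(blahut_arimoto(P))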
Channel capacity: converse
For R > C the decoding error probability > 0
[Plot: decoding error probability Pe versus rate k/n; Pe stays bounded away from 0 for k/n > C]
Converse: For a discrete memoryless channel:
I(X^n;Y^n) = H(Y^n) − H(Y^n|X^n) = H(Y^n) − Σ_{i=1}^n H(Y_i|X_i)
           ≤ Σ_{i=1}^n H(Y_i) − Σ_{i=1}^n H(Y_i|X_i) = Σ_{i=1}^n I(X_i;Y_i) ≤ nC
[Block diagram: source → m → encoder → X^n → channel → Y^n → decoder → m′]
The source generates one out of 2^k equiprobable messages.
Let Pe = probability that m′ ≠ m.
converse: R := k/n
Pe ≥ 1 − C/R − 1/(nR)
Hence: for large n and R > C, the probability of error Pe > 0.
Derivation:
k = H(M) = I(M;Y^n) + H(M|Y^n)
          ≤ I(X^n;Y^n) + 1 + k·Pe     (X^n is a function of M; Fano)
          ≤ nC + 1 + k·Pe
⇒ Pe ≥ 1 − C·n/k − 1/k
We used the data processing theorem.
Cascading of Channels
[Diagram: X → channel 1 → Y → channel 2 → Z]
The overall transmission rate I(X;Z) for the cascade cannot be larger than I(Y;Z), that is:
I(X;Z) ≤ I(Y;Z)
Appendix:
Assume:
a binary sequence with P(0) = 1 − P(1) = 1 − p;
t is the # of 1's in the sequence.
Then for n → ∞ and any ε > 0 (weak law of large numbers):
Probability( |t/n − p| > ε ) → 0,
i.e. we expect with high probability ≈ pn 1's.
Appendix:
Consequence:
n(p − ε) < t < n(p + ε) with high probability
1. Σ_{t=n(p−ε)}^{n(p+ε)} (n choose t) ≈ 2nε · (n choose pn)
2. (n choose pn) ≈ 2^{nh(p)}
3. lim_{n→∞} (1/n) · log2 [ 2nε · (n choose pn) ] = h(p)
where h(p) = −p log2 p − (1 − p) log2 (1 − p)
Homework: prove the approximation using ln N! ~ N lnN for N large.
Or use the Stirling approximation: N! ≈ √(2πN) · (N/e)^N.
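A numeric check of approximation 2 (the values of n and p are assumptions, chosen only for illustration):

import math

def h(p):
    """Binary entropy in bits."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

n, p = 1000, 0.11
t = round(n * p)
exact = math.log2(math.comb(n, t)) / n   # (1/n) log2 of (n choose pn)
print(exact, h(p))                        # the two values agree closely for large n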
[Plot: the binary entropy function h(p) versus p, for 0 ≤ p ≤ 1]
Binary entropy: h(p) = −p log2 p − (1 − p) log2 (1 − p)
Note:
h(p) = h(1-p)
Capacity for Additive White Gaussian Noise
[Diagram: Input X → ⊕ (Noise) → Output Y]
Cap = sup over p(x) with σx² ≤ S/2W of [ H(Y) − H(Noise) ]
W is the (single-sided) bandwidth.
Input X is Gaussian with power spectral density (psd) ≤ S/2W;
Noise is Gaussian with psd σnoise²;
Output Y is Gaussian with psd σy² = S/2W + σnoise².
For Gaussian channels: σy² = σx² + σnoise².
For Gaussian Z with density p(z) = (1/√(2πσz²)) e^(−z²/(2σz²)): H(Z) = ½ log2(2πe σz²) bits.
Hence:
Cap = ½ log2( 2πe (σx² + σnoise²) ) − ½ log2( 2πe σnoise² ) = ½ log2( 1 + σx²/σnoise² ) bits/transmission
Cap = W log2( 1 + S/(2W σnoise²) ) bits/sec.
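A one-line numeric sketch of the resulting formula (parameter values are assumptions, chosen only for illustration):

import math

def awgn_capacity(S, W, sigma2_noise):
    """Cap = W log2(1 + S/(2 W sigma2_noise)) bits/sec, W = single-sided bandwidth."""
    return W * math.log2(1 + S / (2 * W * sigma2_noise))

print(awgn_capacity(S=1.0, W=1.0, sigma2_noise=0.05))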
Middleton type of burst channel model
Select channel k with probability Q(k); channel k has transition probability p(k).
[Diagram: a bank of binary symmetric channels 1, 2, …; the selected channel determines the transition probability]
Fritchman model: multiple good states G1 … Gn and only one bad state B;
closer to an actual real-world channel.
[Diagram: states G1 … Gn with error probability 0; state B with error probability h; transition from Gn to B with probability 1−p]
Interleaving: from bursty to random
[Diagram: message → encoder → interleaver → channel (bursty) → interleaver⁻¹ → decoder → message; after de-interleaving the decoder sees "random" errors]
Note: interleaving introduces encoding and decoding delay.
Homework: compare block and convolutional interleaving w.r.t. delay.
Interleaving: block
Channel models are difficult to derive:
- burst definition?
- random and burst errors?
For practical reasons: convert bursts into random errors.
Read in row-wise, transmit column-wise:
1 0 0 1 1
0 1 0 0 1
1 0 0 0 0
0 0 1 1 0
1 0 0 1 1
De-Interleaving: block
Read in column-wise. The burst of six channel errors (e) is now spread over the array:
1 0 0 e 1
0 1 e e 1
1 0 e 0 0
0 0 e 1 0
1 0 e 1 1
Read out row-wise: each row now contains at most two errors (e.g. the top row contains 1 error), which a random-error-correcting code can handle.
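A small end-to-end sketch of block (de-)interleaving with the 5×5 array above (function names are mine); a burst of six channel errors comes out spread over the rows:

def interleave(bits, rows, cols):
    """Write row-wise, transmit column-wise."""
    return [bits[r * cols + c] for c in range(cols) for r in range(rows)]

def deinterleave(bits, rows, cols):
    """Write column-wise, read out row-wise."""
    return [bits[c * rows + r] for r in range(rows) for c in range(cols)]

data = [1,0,0,1,1, 0,1,0,0,1, 1,0,0,0,0, 0,0,1,1,0, 1,0,0,1,1]
tx = interleave(data, 5, 5)
tx[11:17] = ["e"] * 6                  # burst of 6 channel errors
rx = deinterleave(tx, 5, 5)
for r in range(5):
    print(rx[r*5:(r+1)*5])             # each row holds at most 2 of the 6 errors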
Interleaving: convolutional
[Diagram: the input is distributed over m branches; input sequence 0 gets no delay, input sequence 1 a delay of b elements, …, input sequence m−1 a delay of (m−1)·b elements; the output commutator reassembles the stream]
Example: b = 5, m = 3.
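A minimal sketch of such a convolutional interleaver using shift registers (b = 5, m = 3 as in the example; the initial register contents are padding zeros, an assumption for illustration):

from collections import deque

def conv_interleave(symbols, b=5, m=3):
    """Branch i delays its symbols by i*b elements (FIFO shift registers)."""
    lines = [deque([0] * (i * b)) for i in range(m)]   # branch i: delay i*b
    out = []
    for t, s in enumerate(symbols):
        i = t % m                  # commutator selects branch i
        lines[i].append(s)
        out.append(lines[i].popleft())
    return out

print(conv_interleave(list(range(1, 16))))   # zeros are the register padding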