Chapter 4: Continuous channel and its capacity

Vahid Meghdadi
[email protected]
Reference: Elements of Information Theory by Cover and Thomas
Outline
- Differential entropy
  - Continuous random variable
  - Gaussian multivariate random variable
- Capacity of Gaussian channel
  - AWGN
  - Band limited channel
  - Parallel channels
- Capacity of fading channel
  - Flat fading channel
  - Shannon capacity of fading channel
  - Capacity with outage
  - CSI known at TX
Continuous random variable

In the case where X is a continuous RV, how is the entropy defined? For a discrete RV we used the probability mass function; here it is replaced by the probability density function (PDF).

Definition: The random variable X is said to be continuous if its cumulative distribution function F(x) = Pr(X ≤ x) is continuous.
Differential entropy

Definition: The differential entropy h(X) of a continuous random variable X with a PDF P_X(x) is defined as

h(X) = \int_S P_X(x) \log \frac{1}{P_X(x)} \, dx = E\left[ \log \frac{1}{P_X(x)} \right]   (1)

where S is the support set of the random variable.
Example: Uniform distribution

Show that for X ∼ U(0, a) the differential entropy is log a.

[Figure: PDF of U(0, a), constant at 1/a on the interval [0, a].]

Note: Unlike discrete entropy, the differential entropy can be negative. However, 2^{h(X)} = 2^{\log a} = a is the volume of the support set, which is always non-negative.
Note: A horizontal shift does not change the entropy.
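As a quick numeric check (not from the slides), the defining integral can be evaluated as a Riemann sum; for the uniform density it reproduces log2(a), and a support narrower than 1 indeed gives a negative entropy:

```python
import math

def diff_entropy_uniform(a, n=100_000):
    """Riemann-sum approximation of h(X) = ∫ P(x) log2(1/P(x)) dx for X ~ U(0, a)."""
    p = 1.0 / a                      # the density is constant on [0, a]
    dx = a / n
    # sum p * log2(1/p) * dx over n slices of the support
    return sum(p * math.log2(1.0 / p) * dx for _ in range(n))

# h(X) for U(0, 4) is log2(4) = 2 bits; for U(0, 0.5) it is -1 bit (negative!)
```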
Example: Normal and exponential distribution

Show that for X ∼ N(0, σ²) the differential entropy is

h(X) = \frac{1}{2} \log(2\pi e \sigma^2) \text{ bits}

Show that for P_X(x) = \lambda e^{-\lambda x}, x ≥ 0, the differential entropy is

h(X) = \log \frac{e}{\lambda} \text{ bits}

What is the entropy if P_X(x) = \frac{\lambda}{2} e^{-\lambda |x|}?
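The Gaussian formula can be sanity-checked by Monte Carlo, writing h(X) = E[log2(1/P_X(X))] and averaging over samples; this is a rough sketch with an arbitrary σ, not part of the original exercise:

```python
import math
import random

def gaussian_entropy_mc(sigma, n=200_000, seed=1):
    """Estimate h(X) = E[log2(1/P(X))] for X ~ N(0, sigma^2) by sampling."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, sigma)
        # evaluate the Gaussian density at the sample
        p = math.exp(-x * x / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))
        total += -math.log2(p)
    return total / n

sigma = 2.0
closed_form = 0.5 * math.log2(2 * math.pi * math.e * sigma ** 2)
estimate = gaussian_entropy_mc(sigma)
# estimate and closed_form agree to about two decimal places
```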
Exercise
Suppose an additive Gaussian channel defined by Y = X + N with:X ∼ N (0,PX ) and N ∼ N (0,PN). Because of the independenceof X and N, Y ∼ N (0,PX + PN).Defining I (X ;Y ) = h(Y )− h(Y |X ), show that
I (X ;Y ) =1
2log2
(1 +
PX
PN
)Hint: You can use the fact that h(Y |X ) = h(N) (why?).Actually this is the capacity of a noisy continuous channel.
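The hint can be checked directly with the closed-form Gaussian entropy: h(Y) − h(N) collapses to the stated formula. A minimal sketch with assumed example powers:

```python
import math

def h_gauss(var):
    """Differential entropy (bits) of a zero-mean Gaussian with variance `var`."""
    return 0.5 * math.log2(2 * math.pi * math.e * var)

P_X, P_N = 4.0, 1.0                      # assumed example powers, not from the slides
I = h_gauss(P_X + P_N) - h_gauss(P_N)    # h(Y) - h(N), since h(Y|X) = h(N)
direct = 0.5 * math.log2(1 + P_X / P_N)  # the capacity formula
# the two expressions give the same mutual information
```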
Gaussian random vector

Suppose that the vector X is defined as

X = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}

where X_1 and X_2 are i.i.d. N(0, 1). What is the entropy of X?

h(X) = h(X_1, X_2) = h(X_1) + h(X_2|X_1) = h(X_1) + h(X_2)

Therefore

h(X) = \frac{1}{2} \log(2\pi e)^2

And for a vector of dimension n:

h(X) = \frac{1}{2} \log(2\pi e)^n
Some properties

1. Chain rule: h(X, Y) = h(X|Y) + h(Y)
2. h(X + c) = h(X) for any constant c
3. h(cX) = h(X) + log |c| (note that in the discrete case, H(cX) = H(X))
4. Let X be a random vector and Y = AX where A is a square non-singular matrix. Then h(Y) = h(X) + log |det A|.
5. Suppose X is a random vector with E(X) = 0 and E(XX^T) = K. Then h(X) ≤ \frac{1}{2} \log\left((2\pi e)^n |K|\right), with equality if and only if X is Gaussian, X ∼ N(0, K).
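Property 3 can be illustrated with the Gaussian closed form: scaling N(0, σ²) by c gives N(0, c²σ²), whose entropy exceeds the original by exactly log2|c|. The values below are assumed for illustration:

```python
import math

def h_gauss(var):
    """Differential entropy (bits) of N(0, var)."""
    return 0.5 * math.log2(2 * math.pi * math.e * var)

sigma2, c = 3.0, 5.0             # assumed example variance and scale factor
lhs = h_gauss(c ** 2 * sigma2)   # h(cX): scaling by c multiplies the variance by c^2
rhs = h_gauss(sigma2) + math.log2(abs(c))
# lhs equals rhs, illustrating h(cX) = h(X) + log2|c|
```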
Shannon capacity

In the early 1940s it was thought to be impossible to send information at a positive rate with a negligible probability of error. Shannon showed in 1948 that:

- For every channel there exists a maximum information transmission rate, below which the BER can be made nearly zero.
- If the entropy of the source is less than the channel capacity, asymptotically error-free communication can be achieved.
- To obtain error-free communication, a coding scheme should be used.
- Shannon did not show the optimal coding.
- Today, the capacity predicted by Shannon can be achieved to within only a few tenths of a dB.
Additive white Gaussian channel

As we have seen before, with an additive Gaussian noise channel the mutual input-output information can be calculated as

I(X; Y) = h(Y) − h(Y|X) = h(Y) − h(Z) = h(Y) − \frac{1}{2} \log 2\pi e N

To maximize the mutual information, one should maximize h(Y) under the power constraint P_Y = P + N. The distribution maximizing the entropy of a continuous random variable with a given power is Gaussian; this is obtained when X is Gaussian.

C = \max_{p(x): E X^2 \le P} I(X; Y) = \frac{1}{2} \log \left( 1 + \frac{P}{N} \right)
Band limited channels

Suppose we have a continuous channel with bandwidth B where the power spectral density of the noise is N_0/2, so the analog noise power is N_0 B. Suppose also that the channel is used over the time interval [0, T]; the analog signal power times T gives the total energy of the signal in this period. By the Shannon sampling theorem there are 2B samples per second, so the power of the discrete signal per sample is PT/(2BT) = P/(2B). The same argument applies to the noise, so the power of the noise samples is \frac{N_0}{2} \, 2B \, T / (2BT) = N_0/2. So the capacity of the Gaussian channel per sample is:

C = \frac{1}{2} \log \left( 1 + \frac{P}{N_0 B} \right) \text{ bits per sample}
Band limited channel capacity

Since there are at most 2B independent samples per second, the capacity can be written as:

C = B \log \left( 1 + \frac{P}{N_0 B} \right) \text{ bits per second}

Sometimes this equation is divided by B to obtain:

\frac{C}{B} = \log \left( 1 + \frac{P}{N_0 B} \right) \text{ bits per second per Hz}

This is the maximum achievable spectral efficiency through the AWGN channel.
Parallel independent Gaussian channel

Here we consider k independent Gaussian channels in parallel with a common power constraint. The objective is to maximize the capacity by optimally distributing the power among the channels:

C = \max_{p_{X_1, ..., X_k}(x_1, ..., x_k): \, \sum E X_i^2 \le P} I(X_1, ..., X_k; Y_1, ..., Y_k)
Parallel independent Gaussian channel

Using the independence of Z_1, ..., Z_k:

C = I(X_1, ..., X_k; Y_1, ..., Y_k)
  = h(Y_1, ..., Y_k) − h(Y_1, ..., Y_k | X_1, ..., X_k)
  = h(Y_1, ..., Y_k) − h(Z_1, ..., Z_k)
  ≤ \sum_i h(Y_i) − h(Z_i)
  ≤ \sum_i \frac{1}{2} \log \left( 1 + \frac{P_i}{N_i} \right)

If there is no common power constraint, it is clear that the total capacity is the sum of the capacities of each channel.
Common power constraint

The question is: how should the power be distributed among the transmitters to maximize the capacity? The capacity for the equivalent channel is:

C = \max_{P_1 + P_2 \le P_x} \left[ B_1 \log \left( 1 + \frac{P_1 h_1^2}{N_0 B_1} \right) + B_2 \log \left( 1 + \frac{P_2 h_2^2}{N_0 B_2} \right) \right]
Common power constraint

So we should maximize C subject to P_1 + P_2 ≤ P_x. Using a Lagrangian, one can define:

L(P_1, P_2, \lambda) = B_1 \log \left( 1 + \frac{P_1 h_1^2}{N_0 B_1} \right) + B_2 \log \left( 1 + \frac{P_2 h_2^2}{N_0 B_2} \right) − \lambda (P_1 + P_2 − P_x)

Setting \partial L/\partial P_1 = 0 and \partial L/\partial P_2 = 0, and using ln instead of log_2:

\frac{B_1}{1 + \frac{P_1 h_1^2}{N_0 B_1}} \cdot \frac{h_1^2}{N_0 B_1} = \lambda
\quad \Rightarrow \quad \frac{P_1}{B_1 N_0} = \frac{1}{\lambda N_0} − \frac{1}{h_1^2}
With the same operations we obtain:

\frac{P_1}{B_1 N_0} = \text{Cst} − \frac{1}{h_1^2} \qquad \frac{P_2}{B_2 N_0} = \text{Cst} − \frac{1}{h_2^2}

where the constant Cst can be found by setting P_1 + P_2 = P_x. Once the two powers are found, the capacity of the channel is calculated easily. The only constraint to be considered is that P_1 and P_2 cannot be negative: if one of them comes out negative, the corresponding power is set to zero and all the power is assigned to the other one. This principle is called water-filling.
Exercise

Use the same principle (water-filling) to give the power allocation for a channel with three frequency bands defined as follows: h_1 = 1/2, h_2 = 1/3 and h_3 = 1; B_1 = B, B_2 = 2B and B_3 = B; N_0 B = 1; P_x = P_1 + P_2 + P_3 = 10.

Solution: P_1 = 3.5, P_2 = 0 and P_3 = 6.5.
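The exercise can be solved mechanically: set P_i/(B_i N_0) = K − 1/h_i^2, solve for the water level K from the total power budget, and drop any band whose allocation comes out negative. A sketch of this iteration (the function name and structure are illustrative):

```python
# Water-filling across parallel bands: P_i/(B_i*N0) = K - 1/h_i^2 (clipped at 0),
# with the water level K chosen so the active powers sum to the budget Px.
def water_fill(h, BN0, Px):
    """h: channel gains, BN0: B_i*N0 per band, Px: total power budget."""
    active = list(range(len(h)))
    while True:
        # solve sum_{i active} BN0[i]*(K - 1/h[i]^2) = Px for K
        K = (Px + sum(BN0[i] / h[i] ** 2 for i in active)) / sum(BN0[i] for i in active)
        powers = [max(0.0, BN0[i] * (K - 1.0 / h[i] ** 2)) if i in active else 0.0
                  for i in range(len(h))]
        negative = [i for i in active if K - 1.0 / h[i] ** 2 < 0]
        if not negative:          # every active band got non-negative power: done
            return powers
        active = [i for i in active if i not in negative]  # drop bad bands, redo

# exercise values: N0*B = 1, so B_i*N0 = (1, 2, 1); gains 1/2, 1/3, 1; budget 10
P = water_fill([0.5, 1 / 3, 1.0], [1.0, 2.0, 1.0], 10.0)
# → approximately [3.5, 0.0, 6.5], matching the stated solution
```

The first pass gives K = 8.25, which makes P_2 negative; dropping band 2 and re-solving gives K = 7.5 and the stated allocation.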
Flat fading channel (frequency non-selective)

A non-LOS urban transmission generally results in many multipaths: the received signal is the sum of many replicas of the transmitted signal. Using the I and Q components of the received signal:

r(t) = \cos(2\pi f_c t) \sum_{i=0}^{I} a_i \cos(\phi_i) − \sin(2\pi f_c t) \sum_{i=0}^{I} a_i \sin(\phi_i) + n(t)

By the central limit theorem, A = \sum_{i=0}^{I} a_i \cos(\phi_i) and B = \sum_{i=0}^{I} a_i \sin(\phi_i) are i.i.d. Gaussian random variables.
The envelope of the received signal h = \sqrt{A^2 + B^2} is a Rayleigh random variable with:

f_h(h) = \frac{h}{\sigma^2} \exp \left( \frac{−h^2}{2\sigma^2} \right), \quad h ≥ 0

with σ² the variance of A and B. The received power is an exponential RV with the pdf:

f(p) = \frac{1}{2\sigma^2} \exp \left( \frac{−p}{2\sigma^2} \right), \quad p ≥ 0

Therefore, the received signal can be modeled as:

Y = hX + N
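This model is easy to simulate: drawing A and B as i.i.d. Gaussians, the power A² + B² should behave as an exponential RV with mean 2σ². A small Monte Carlo sketch (σ and sample count are arbitrary choices):

```python
import random

# Simulate A, B ~ N(0, sigma^2) and check that the power A^2 + B^2 has
# mean 2*sigma^2, as the exponential model above predicts.
rng = random.Random(0)
sigma = 1.5
n = 200_000
total = 0.0
for _ in range(n):
    A = rng.gauss(0.0, sigma)
    B = rng.gauss(0.0, sigma)
    total += A * A + B * B      # instantaneous received power

mean_power = total / n          # should be close to 2*sigma^2 = 4.5
```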
Shannon (ergodic) capacity when RX knows CSI

- The channel coefficient h is an i.i.d. random variable independent of the signal and the noise.
- We assume that the receiver knows the channel coefficient but the transmitter does not.
- The capacity is: C = \max_{p_X : E[X^2] \le P} I(X; Y, h)
- Using the chain rule, and since X is independent of h (so I(X; h) = 0):

I(X; Y, h) = I(X; h) + I(X; Y|h) = I(X; Y|h)
Conditioned on the fading coefficient h, the channel becomes a simple AWGN channel with equivalent power |h|²P_X. So we can write:

I(X; Y | h) = \frac{1}{2} \log \left( 1 + \frac{|h|^2 P_X}{P_N} \right)

The ergodic capacity of the flat fading channel is then:

C = E_h \left[ \frac{1}{2} \log \left( 1 + \frac{P_X |h|^2}{P_N} \right) \right]

Note: Normally all the signals are complex; they are the baseband equivalents of real signals. In this case the capacity is multiplied by two, since the real and imaginary parts of the signals are decorrelated.
Example (Wireless Communications by Andrea Goldsmith)

Consider a wireless channel where the power falloff with distance follows the formula P_r(d) = P_t (d_0/d)^3 for d_0 = 10 m. Assume a channel bandwidth of B = 30 kHz and AWGN with noise PSD N_0/2, where N_0 = 10^{−9} W/Hz. For a transmit power of 1 W, find the capacity of the channel for a distance of 100 m and 1 km.
Solution: The received signal-to-noise ratio is γ = P_r(d)/P_N = P_t (d_0/d)^3/(N_0 B). That is γ ≈ 15 dB for d = 100 m, and −15 dB for d = 1 km. The capacity of complex transmission is C = B log(1 + SNR), which gives 156.6 kbps for d = 100 m and 1.4 kbps for d = 1000 m.
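The same calculation is easy to script. This sketch recomputes the example (small differences from the slide's 156.6 kbps come from rounding the SNR to whole dB before taking the log):

```python
import math

# Path-loss example: Pr(d) = Pt*(d0/d)^3, B = 30 kHz, N0 = 1e-9 W/Hz, Pt = 1 W
def capacity_at_distance(d, Pt=1.0, d0=10.0, B=30e3, N0=1e-9):
    """Capacity (bps) of the complex AWGN channel at distance d (meters)."""
    snr = Pt * (d0 / d) ** 3 / (N0 * B)   # received SNR
    return B * math.log2(1 + snr)

c100 = capacity_at_distance(100.0)    # on the order of 150 kbps
c1000 = capacity_at_distance(1000.0)  # on the order of 1.4 kbps
```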
Example (Wireless Communications by Andrea Goldsmith)

Consider a flat fading channel with i.i.d. channel gain \sqrt{h}, which can take on three possible values: 0.05 with probability 0.1, 0.5 with probability 0.5, and 1 with probability 0.4. The transmit power is 10 mW, N_0 = 10^{−9} W/Hz, and the channel bandwidth is 30 kHz. Assume the receiver knows the instantaneous value of h but the transmitter does not. Find the Shannon capacity of this channel.
Solution: The channel has three possible received SNRs: γ_1 = P_t h_1/(N_0 B) = 0.83, γ_2 = P_t h_2/(N_0 B) = 83.33, and γ_3 = P_t h_3/(N_0 B) = 333.33. So the Shannon capacity is given by:

C = \sum_i B \log_2(1 + γ_i) p(γ_i) = 199.26 \text{ kbps}

Note: The average SNR is 175 and the corresponding capacity would be 223.8 kbps.
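Both numbers in this example can be reproduced in a few lines; the gap between them is Jensen's inequality at work (log is concave, so averaging the SNR before the log gives an upper bound):

```python
import math

# Ergodic capacity of the three-state fading example: C = sum_i p_i*B*log2(1+g_i)
B = 30e3
gammas = [0.8333, 83.33, 333.33]
probs = [0.1, 0.5, 0.4]

C_ergodic = sum(p * B * math.log2(1 + g) for g, p in zip(gammas, probs))
# ≈ 199.2 kbps (the slide's 199.26 kbps, up to rounding of the SNRs)

# for comparison, the capacity at the *average* SNR is larger
g_avg = sum(p * g for g, p in zip(gammas, probs))   # ≈ 175
C_avg = B * math.log2(1 + g_avg)                    # ≈ 223.8 kbps
```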
Capacity with outage

- Shannon capacity defines the maximum data rate that can be sent over the channel with asymptotically small error probability.
- Since the TX does not know the channel, the transmitted rate is constant.
- When the channel is in a deep fade, the BER is not zero because the TX cannot adapt its rate to the CSI.
- So the capacity with outage is defined: it is the maximum rate that can be achieved with some outage probability (the probability of deep fading).
- By allowing some losses during deep fades, a higher data rate can be achieved.
Fixing the required rate C, a corresponding minimum SNR can be calculated (assuming complex transmission):

C = B \log_2(1 + γ_{min})

If the TX sends data at this rate, an outage (non-zero BER) occurs when γ < γ_{min}. Therefore the probability of outage is p_{out} = p(γ < γ_{min}). The average rate of data correctly received at the RX is

C_O = (1 − p_{out}) B \log_2(1 + γ_{min})

The value of γ_{min} is a design parameter based on the acceptable outage probability. Normally one draws the normalized capacity C/B = \log_2(1 + γ_{min}) as a function of p_{out} = p(γ < γ_{min}).
Example (Wireless Communications by Andrea Goldsmith)

Consider the same channel as in the last example with BW = 30 kHz and p(γ = 0.83) = 0.1, p(γ = 83.33) = 0.5, and p(γ = 333.33) = 0.4. Find the capacity versus outage and the average rate correctly received for outage probabilities p_{out} < 0.1, p_{out} = 0.1 and p_{out} = 0.6.
Solution: For p_{out} < 0.1, we must decode in all channel states, so the rate must be less than what the worst case supports: γ_{min} = γ_1 = 0.83. The corresponding capacity is 26.23 kbps.
For 0.1 ≤ p_{out} < 0.6, we may decode incorrectly only when the channel is in the weakest state, γ = 0.83. So γ_{min} = γ_2, with corresponding capacity 191.94 kbps.
For 0.6 ≤ p_{out} < 1, we may decode incorrectly if the received γ is γ_1 or γ_2. Thus γ_{min} = γ_3, with corresponding capacity 251.55 kbps.
Example (cont.)

For p_{out} < 0.1, data rates close to 26.23 kbps are always correctly received.
For p_{out} = 0.1 we transmit at the rate 191.94 kbps but decode correctly only when γ = γ_2 or γ_3, so the rate correctly received is (1 − 0.1) × 191.94 = 172.75 kbps.
For p_{out} = 0.6 the rate correctly received is (1 − 0.6) × 251.55 = 100.62 kbps.
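A short script makes the trade-off concrete: for each candidate threshold, transmit at B log2(1 + γ_min) and lose the fraction p_out of transmissions where the channel falls below γ_min (note that the p_out = 0.6 case gives (1 − 0.6) × 251.55 ≈ 100.62 kbps):

```python
import math

# Rates correctly received under the outage strategy, for each choice of g_min
B = 30e3
gammas = [0.8333, 83.33, 333.33]   # possible SNRs, ordered weakest first
probs = [0.1, 0.5, 0.4]

results = []
for k, g_min in enumerate(gammas):
    p_out = sum(probs[:k])             # probability the channel is below g_min
    rate = B * math.log2(1 + g_min)    # transmitted rate for this threshold
    results.append((p_out, rate, (1 - p_out) * rate))

# results[0]: (0.0, ~26.2 kbps, ~26.2 kbps)
# results[1]: (0.1, ~191.9 kbps, ~172.7 kbps)
# results[2]: (0.6, ~251.6 kbps, ~100.6 kbps)
```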
CSI known at TX

When the channel is also known at the TX, outage can be avoided: the TX can adapt its power to the channel state. The capacity (which is the same as the Shannon capacity above) is:

C = \int_0^\infty B \log_2(1 + γ) p(γ) \, dγ

Now we also add power adaptation under a power constraint:

\int_0^\infty P(γ) p(γ) \, dγ \le \bar{P}

So the problem is how to distribute the available power as a function of the SNR to maximize the rate while the average power does not exceed a predefined value.
Water-filling

The capacity is then

C = \max_{P(γ): \int P(γ) p(γ) dγ = \bar{P}} \int_0^\infty B \log_2 \left( 1 + \frac{P(γ) γ}{\bar{P}} \right) p(γ) \, dγ

Note that γ = \bar{P} |h|^2 / (N_0 B). For each channel realization, a coding scheme is employed to adjust the rate. To find the optimal power allocation P(γ) we form the Lagrangian:

J(P(γ)) = \int_0^\infty B \log_2 \left( 1 + \frac{P(γ) γ}{\bar{P}} \right) p(γ) \, dγ − \lambda \int_0^\infty P(γ) p(γ) \, dγ
Water-filling

Setting the derivative with respect to P(γ) equal to zero and solving for P(γ) with the constraint P(γ) ≥ 0:

\frac{P(γ)}{\bar{P}} = \begin{cases} 1/γ_0 − 1/γ & γ ≥ γ_0 \\ 0 & γ < γ_0 \end{cases}

This means that if γ is below a threshold γ_0, the channel is not used. The capacity formula is then:

C = \int_{γ_0}^\infty B \log_2 \left( \frac{γ}{γ_0} \right) p(γ) \, dγ
Water-filling

Therefore, the capacity can be achieved by adapting the rate as a function of the SNR. Another strategy would be to fix the rate and adapt only the power. Note that γ_0 must be found numerically: substituting the optimal power allocation into the power constraint gives the condition that determines γ_0:

\int_{γ_0}^\infty \left( \frac{1}{γ_0} − \frac{1}{γ} \right) p(γ) \, dγ = 1
Water-filling

[Figure: water-filling power allocation. The allocated power P(γ)/\bar{P} fills the gap between the water level 1/γ_0 and the curve 1/γ; channels with γ < γ_0 receive no power.]

The figure above shows why this principle is called "water-filling".
Example

With the same example as before: p(γ_1 = 0.83) = 0.1, p(γ_2 = 83.33) = 0.5, and p(γ_3 = 333.33) = 0.4. Find the ergodic capacity of the channel with CSI at TX and RX.
Solution: Since water-filling will be used, we must first calculate γ_0 satisfying:

\sum_{γ_i ≥ γ_0} \left( \frac{1}{γ_0} − \frac{1}{γ_i} \right) p(γ_i) = 1
Example (cont.)

First assume that all channel states are used. In the equation above everything is known except γ_0, which is calculated to be 0.884. Since this value exceeds γ_1 = 0.83, the first channel state should not be used.
In the next iteration, the equation is solved using only the second and third channel states, giving γ_0 = 0.893. This value is acceptable because the weakest remaining channel is better than this threshold. Using this value, the channel capacity can be calculated: it is 200.82 kbps.
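The iteration above can be sketched in code: solve the threshold equation over the active states, drop any state below the resulting γ_0, and repeat (the final capacity lands near the slide's 200.82 kbps; small differences come from rounding the SNRs):

```python
import math

# Water-filling over discrete fading states: find g0 such that
# sum_{g_i >= g0} (1/g0 - 1/g_i) * p_i = 1, dropping states below g0,
# then C = sum_{g_i >= g0} B * log2(g_i/g0) * p_i.
B = 30e3
gammas = [0.8333, 83.33, 333.33]
probs = [0.1, 0.5, 0.4]

def waterfill_fading(gammas, probs):
    active = list(range(len(gammas)))
    while True:
        # the threshold equation rearranges to (sum p_i)/g0 = 1 + sum p_i/g_i
        p_sum = sum(probs[i] for i in active)
        g0 = p_sum / (1 + sum(probs[i] / gammas[i] for i in active))
        if all(gammas[i] >= g0 for i in active):
            return g0, active
        active = [i for i in active if gammas[i] >= g0]  # drop weak states, redo

g0, active = waterfill_fading(gammas, probs)
C = sum(B * math.log2(gammas[i] / g0) * probs[i] for i in active)
# g0 ≈ 0.89 with the first state dropped, and C ≈ 200.7 kbps
```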