Computing and Communications
2. Information Theory – Channel Capacity
Ying Cui
Department of Electronic Engineering
Shanghai Jiao Tong University, China
2017, Autumn
Outline
• Communication system
• Examples of channel capacity
• Symmetric channels
• Properties of channel capacity
• Definitions
• Channel coding theorem
• Source-channel coding theorem
Reference
• T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley
CHANNEL CAPACITY
Communication System
– encoder: maps source symbols from a finite alphabet into a sequence of channel symbols, i.e., the input sequence of the channel
– channel: the output sequence of the channel is random but has a distribution that depends on the input sequence
• two different input sequences may give rise to the same output sequence, i.e., the inputs are confusable
• choose a “nonconfusable” subset of input sequences so that with high probability there is only one highly likely input that could have caused the particular output
– decoder: attempts to recover the transmitted message from the output sequence of the channel
• reconstruct the input sequence with a negligible probability of error
Channel Capacity
• The information channel capacity of a discrete memoryless channel is C = max_{p(x)} I(X; Y), where the maximum is taken over all possible input distributions p(x)
EXAMPLES OF CHANNEL CAPACITY
Noiseless Binary Channel
• Binary input is reproduced exactly at output
• C = max I(X; Y) = 1 bit, achieved using p(x) = (1/2, 1/2)
– one error-free bit can be transmitted per channel use
Noisy Channel with Nonoverlapping Outputs
• Two possible outputs correspond to each of the two inputs
– the channel appears to be noisy, but really is not
• C = max I(X; Y) = 1 bit, achieved using p(x) = (1/2, 1/2)
– the input can be determined from the output
– every transmitted bit can be recovered without error
Noisy Typewriter
• Channel input is either unchanged with probability 1/2 or is transformed into the next letter with probability 1/2
• If the input has 26 symbols and we use every alternate input symbol, we can transmit one of 13 symbols without error with each transmission
• C = max I(X; Y) = max (H(Y) − H(Y|X)) = max H(Y) − 1 = log 26 − 1 = log 13, achieved using p(x) = (1/26, …, 1/26) (numerical check below)
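A quick numerical check of this calculation (a minimal sketch, not from the slides, assuming the transition rule stated above):

```python
# Minimal sketch: verify the noisy typewriter capacity equals log2(13)
# by evaluating I(X;Y) = H(Y) - H(Y|X) under the uniform input distribution.
import numpy as np

A = 26                                    # alphabet size
P = np.zeros((A, A))                      # P[x, y] = p(y | x)
for x in range(A):
    P[x, x] = 0.5                         # letter unchanged w.p. 1/2
    P[x, (x + 1) % A] = 0.5               # next letter w.p. 1/2

px = np.full(A, 1.0 / A)                  # uniform input distribution
py = px @ P                               # output distribution (also uniform)

logP = np.log2(P, where=P > 0, out=np.zeros_like(P))
H_Y = -np.sum(py * np.log2(py))           # = log2(26)
H_Y_given_X = -np.sum(px[:, None] * P * logP)   # = 1 bit
print(H_Y - H_Y_given_X, np.log2(13))     # both ~ 3.7004 bits
```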
Binary Symmetric Channel
• Input symbols are complemented with probability p
• I(X; Y) = H(Y) − H(Y|X) = H(Y) − H(p) ≤ 1 − H(p), so C = 1 − H(p) bits
– equality is achieved when the input distribution is uniform (numerical check below)
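A minimal numerical sketch of this result (our own example, not from the slides): sweep the input prior over [0, 1] and confirm that I(X; Y) peaks at 1 − H(p) when the input is uniform.

```python
# Minimal sketch: BSC capacity C = 1 - H(p), found by brute-force search
# of I(X;Y) over the input prior pi = P(X = 1).
import numpy as np

def binary_entropy(p):
    p = np.clip(p, 1e-12, 1 - 1e-12)      # avoid log(0) at the endpoints
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def bsc_mutual_information(pi, p):
    q = pi * (1 - p) + (1 - pi) * p       # P(Y = 1)
    return binary_entropy(q) - binary_entropy(p)   # H(Y) - H(Y|X)

p = 0.1
priors = np.linspace(0.0, 1.0, 1001)
rates = [bsc_mutual_information(pi, p) for pi in priors]
print(max(rates), 1 - binary_entropy(p))  # both ~ 0.5310 bits per use
print(priors[int(np.argmax(rates))])      # maximizer ~ 0.5 (uniform input)
```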
Binary Erasure Channel
• Two inputs and three outputs; a fraction α of the bits are erased
• C = max I(X; Y) = 1 − α, achieved when π = P(X = 1) = 1/2
– we can recover at most a fraction 1 − α of the bits, as a fraction α of the bits are lost (numerical check below)
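As a numerical check (our sketch, assuming the standard three-output erasure model), I(X; Y) as a function of π peaks at 1 − α at π = 1/2:

```python
# Minimal sketch: BEC capacity C = 1 - alpha, achieved at pi = 1/2.
import numpy as np

def entropy(dist):
    dist = dist[dist > 0]
    return -np.sum(dist * np.log2(dist))

def bec_mutual_information(pi, alpha):
    # outputs are 0, erasure, 1; H(Y|X) = H(alpha) for either input symbol
    py = np.array([(1 - pi) * (1 - alpha), alpha, pi * (1 - alpha)])
    return entropy(py) - entropy(np.array([alpha, 1 - alpha]))

alpha = 0.25
priors = np.linspace(0.01, 0.99, 99)
rates = [bec_mutual_information(pi, alpha) for pi in priors]
print(max(rates), 1 - alpha)              # both = 0.75 bits per use
print(priors[int(np.argmax(rates))])      # maximizer = 0.5
```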
SYMMETRIC CHANNELS
Symmetric Channels
• A channel is symmetric if the rows of the transition matrix p(y|x) are permutations of each other and the columns are permutations of each other
• A channel is weakly symmetric if the rows are permutations of each other and all the column sums Σ_x p(y|x) are equal
• For a weakly symmetric channel, C = log|Y| − H(row of transition matrix), achieved by a uniform distribution on the input alphabet
– example of a symmetric channel: a 3×3 transition matrix in which every row is a permutation of (0.3, 0.2, 0.5) and every column is a permutation of the same probabilities
Proof
• I(X; Y) = H(Y) − H(Y|X) = H(Y) − H(row of transition matrix) ≤ log|Y| − H(row of transition matrix)
• equality holds iff the output distribution is uniform; a uniform input distribution makes the output uniform (since the column sums are equal), so it achieves C = log|Y| − H(row) (a numerical check follows)
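A minimal sketch evaluating the formula C = log|Y| − H(row) on a 3×3 symmetric matrix (the specific matrix is our illustrative choice):

```python
# Minimal sketch: capacity of a symmetric channel, C = log2(|Y|) - H(row).
import numpy as np

P = np.array([[0.3, 0.2, 0.5],   # rows are permutations of one another,
              [0.5, 0.3, 0.2],   # and so are the columns, so the channel
              [0.2, 0.5, 0.3]])  # is symmetric

H_row = -np.sum(P[0] * np.log2(P[0]))
C = np.log2(P.shape[1]) - H_row
print(C)   # ~ 0.0995 bits per channel use, achieved by a uniform input
```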
PROPERTIES OF CHANNEL CAPACITY
Properties of Channel Capacity
• C ≥ 0, since I(X; Y) ≥ 0
• C ≤ log|X|, since C = max I(X; Y) ≤ max H(X) = log|X|
• C ≤ log|Y|, since C = max I(X; Y) ≤ max H(Y) = log|Y|
• I(X; Y) is a continuous function of p(x)
• I(X; Y) is a concave function of p(x)
• The problem of computing channel capacity is a convex problem
– maximization of a bounded concave function over a closed convex set
– the maximum can then be found by standard nonlinear optimization techniques such as gradient search (a numerical sketch follows)
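Besides gradient search, a standard method for this concave maximization is the Blahut–Arimoto alternating-maximization algorithm. It is not covered on these slides; the sketch below is our own, and the function name is arbitrary.

```python
# Minimal sketch: Blahut-Arimoto iteration for the capacity of a DMC
# with transition matrix P[x, y] = p(y|x).
import numpy as np

def blahut_arimoto(P, iters=500):
    nx = P.shape[0]
    r = np.full(nx, 1.0 / nx)                  # current input distribution
    for _ in range(iters):
        q = r[:, None] * P                     # r(x) p(y|x)
        q /= q.sum(axis=0, keepdims=True)      # posterior q(x|y)
        logq = np.log(q, where=q > 0, out=np.full_like(q, -1e30))
        r = np.exp(np.sum(P * logq, axis=1))   # r(x) prop. to exp sum_y p(y|x) ln q(x|y)
        r /= r.sum()
    py = r @ P                                 # output distribution
    logratio = np.log2(P / py, where=P > 0, out=np.zeros_like(P))
    return np.sum(r[:, None] * P * logratio)   # I(X;Y) at the optimum, in bits

# BSC with p = 0.1: converges to 1 - H(0.1) ~ 0.5310 bits per use
print(blahut_arimoto(np.array([[0.9, 0.1], [0.1, 0.9]])))
```

Each iteration increases I(X; Y), and the iterates converge to the capacity of any discrete memoryless channel; this is why the convexity noted above matters in practice.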
DEFINITIONS
Discrete Memoryless Channel (DMC)
• A discrete channel is a system (X, p(y|x), Y) consisting of a finite input alphabet X, a finite output alphabet Y, and transition probabilities p(y|x)
• The channel is memoryless if the probability distribution of the output depends only on the input at that time and is conditionally independent of previous channel inputs and outputs
Code
• An (M, n) code for the channel (X, p(y|x), Y) consists of
– an index set {1, 2, …, M}
– an encoding function X^n : {1, …, M} → X^n, yielding codewords x^n(1), …, x^n(M); the set of codewords is the codebook
– a decoding function g : Y^n → {1, …, M}, a deterministic rule assigning a guess to each received sequence
Probability of Error
• Conditional probability of error given that index i was sent: λ_i = Pr(g(Y^n) ≠ i | X^n = x^n(i))
• Maximal probability of error: λ^(n) = max_i λ_i
• Average probability of error: P_e^(n) = (1/M) Σ_i λ_i
Rate and Capacity
• The rate of an (M, n) code is R = (log M)/n bits per transmission
• A rate R is achievable if there exists a sequence of (⌈2^{nR}⌉, n) codes such that the maximal probability of error λ^(n) tends to 0 as n → ∞
• The capacity of a channel is the supremum of all achievable rates
– write (2^{nR}, n) codes to mean (⌈2^{nR}⌉, n) codes to simplify the notation
CHANNEL CODING THEOREM (SHANNON’S SECOND THEOREM)
Basic Idea
• For large block lengths, every channel has a subset of inputs producing disjoint sequences at the output
• If we can ensure that no two input X sequences produce the same output Y sequence, we can determine which X sequence was sent
Basic Idea
• Total number of possible output Y sequences is ≈ 2^{nH(Y)}
• Divide them into sets of size 2^{nH(Y|X)} corresponding to the different input X sequences
• Total number of disjoint sets is less than or equal to 2^{n(H(Y) − H(Y|X))} = 2^{nI(X;Y)}
• Can send at most ≈ 2^{nI(X;Y)} distinguishable sequences of length n (a back-of-envelope example follows)
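A back-of-envelope instance of this counting argument (our illustrative numbers, assuming a BSC with p = 0.1):

```python
# Minimal sketch: the counting bound 2^{n I(X;Y)} for a BSC at block length n.
import math

def H2(p):
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

n, p = 1000, 0.1
I = 1 - H2(p)          # I(X;Y) under the uniform input (= C for a BSC)
print(n * I)           # ~ 531 error-free bits per length-1000 block
print(2 ** (n * I))    # ~ 2^531 distinguishable input sequences
```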
Channel Coding Theorem
• For a discrete memoryless channel, all rates below capacity C are achievable
– for every rate R < C, there exists a sequence of (2^{nR}, n) codes with maximal probability of error λ^(n) → 0
• Conversely, any sequence of (2^{nR}, n) codes with λ^(n) → 0 must have R ≤ C
New Ideas in Shannon’s Proof
• Allowing an arbitrarily small but nonzero probability of error
• Using the channel many times in succession, so that the law of large numbers comes into effect
• Calculating the average of the probability of error over a random choice of codebooks
– this symmetrizes the probability of error and can then be used to show the existence of at least one good code
• Shannon’s proof outline was based on the idea of typical sequences, but was not made rigorous until much later
Current Proof
• Uses the same essential ideas
– random code selection, calculation of the average probability of error for a random choice of codewords, and so on
• Main difference is in the decoding rule: decode by joint typicality
– look for a codeword that is jointly typical with the received sequence
– if we find a unique codeword satisfying this property, declare that word to be the transmitted codeword
– properties of joint typicality:
• with high probability, the transmitted codeword and the received sequence are jointly typical, since they are probabilistically related
• the probability that any other codeword looks jointly typical with the received sequence is ≈ 2^{−nI}
• thus, if we have fewer than 2^{nI} codewords, then with high probability there will be no other codeword that can be confused with the transmitted codeword, and the probability of error is small (a toy simulation follows)
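A toy simulation of the random-coding idea (our sketch; we use a BSC and minimum-distance decoding as a simple computational stand-in for the joint-typicality decoder described above): a codebook drawn uniformly at random at a rate R < C already yields a small error probability at moderate block lengths.

```python
# Toy sketch: random coding over a BSC. Draw a random codebook at rate R < C,
# transmit codeword 0 (enough, by the symmetry of the random code), and decode
# to the codeword nearest in Hamming distance.
import numpy as np

rng = np.random.default_rng(0)

def simulate(n, R, p, trials=100):
    M = int(2 ** (n * R))                           # number of codewords
    errors = 0
    for _ in range(trials):
        codebook = rng.integers(0, 2, size=(M, n))  # i.i.d. uniform bits
        x = codebook[0]
        y = x ^ (rng.random(n) < p)                 # BSC: flip each bit w.p. p
        dists = np.sum(codebook != y, axis=1)       # Hamming distances to y
        errors += int(np.argmin(dists) != 0)
    return errors / trials

# C = 1 - H(0.1) ~ 0.531; at rate R = 0.3 < C the error rate is already small
print(simulate(n=50, R=0.3, p=0.1))
```

Increasing n drives the measured error rate further toward zero, in line with the theorem, while no codebook could achieve this at rates above C.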
SOURCE–CHANNEL SEPARATION THEOREM (SHANNON’S THIRD THEOREM)
Two Basic Theorems
• Data compression: R > H
• Data transmission: R < C
• Is condition H < C necessary and sufficient for sending a source over a channel?
Example
• Consider two methods for sending digitized speech over a discrete memoryless channel
– one-stage method: design a code to map the sequence of speech samples directly into the input of the channel
– two-stage method: compress the speech into its most efficient representation, then use the appropriate channel code to send it over the channel
• Do we lose something by using the two-stage method?
– data compression does not depend on the channel
– channel coding does not depend on the source distribution
Joint vs. Separate Channel Coding
• Joint source and channel coding
• Separate source and channel coding
Source–Channel Coding Theorem
• If V_1, V_2, …, V_n is a finite-alphabet stochastic process satisfying the AEP with entropy rate H(V) < C, there exists a source–channel code with probability of error → 0; conversely, if H(V) > C, the source cannot be transmitted reliably
– consider the design of a communication system as a combination of two parts
• source coding: design source codes for the most efficient representation of the data
• channel coding: design channel codes appropriate for the channel (combat the noise and errors introduced by the channel)
– the separate encoders can achieve the same rates as the joint encoder
• this holds for the situation where one transmitter communicates to one receiver
Summary
[email protected]/Personal/yingcui