Lecture 02: Huffman Coding
Instructor: 高立人 (Lih-Jen Kau)
Signal Proc. & Intell. Electron. Group
Graduate Institute of Electronic Engineering (電子工程研究所), National Taipei Univ. of Technology (國立臺北科技大學)
Oct. 07, 2021
Huffman Code
Huffman Code: Variable-Length Code
Theorem:
Let X be a discrete random variable with pmf p(X). Let C be any prefix code for X, and let $C_{\text{Huff}}$ be a Huffman code. Then
$l(C_{\text{Huff}}) \le l(C)$
and
$H(X) \le l(C_{\text{Huff}}) < H(X) + 1$
The lower bound holds without doubt; the upper bound is the worst case.
The proof of the above theorem will be given in the next lecture.
Huffman Code Construction
Define a merged pmf: combine the two least probable symbols into a single symbol whose probability is the sum of the two, which gives an alphabet with $m-1$ symbols.
Thus minimizing $l(C)$ for the merged alphabet with $m-1$ symbols also minimizes $l(C)$ for the original alphabet with $m$ symbols.
Repeat the merging process until the size of the alphabet is 2. We know what to do when the alphabet has only 2 symbols:
Just assign a 0 to one symbol and a 1 to the other.
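The merging procedure above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the lecture's own code: the function name `huffman_code`, the heap-based bookkeeping and the insertion counter used to break ties are all choices made for this sketch.

```python
import heapq
from itertools import count

def huffman_code(pmf):
    """Return {symbol: codeword} for a dict {symbol: probability}."""
    tick = count()  # tie-breaker so the heap never compares the dict payloads
    # Each heap entry: (probability, tie-breaker, {symbol: partial codeword})
    heap = [(p, next(tick), {s: ""}) for s, p in pmf.items()]
    heapq.heapify(heap)
    if len(heap) == 1:  # degenerate alphabet: give the only symbol one bit
        return {s: "0" for s in pmf}
    while len(heap) > 1:
        # Merge the two least probable entries into one with the summed probability.
        p0, _, c0 = heapq.heappop(heap)
        p1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(tick), merged))
    return heap[0][2]

pmf = {1: 0.4, 2: 0.2, 3: 0.2, 4: 0.1, 5: 0.1}
code = huffman_code(pmf)
avg_len = sum(pmf[s] * len(w) for s, w in code.items())
print(code, avg_len)
```

The individual codewords depend on how ties are broken, but every Huffman code for this pmf has the same average length.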
Example of Huffman Code
[Figure: step-by-step Huffman tree for the probabilities 0.4, 0.2, 0.2, 0.1, 0.1. The combined probabilities shown are 0.2, 0.4, 0.6 and 1.0. Average codeword length: 2.2 bits/symbol.]
[Figure: the first merging steps of the example; combined probabilities 0.2 and 0.4.]
Step 3: Rearrange the order. Move every combined probability as far to the left as possible!
Step 4: [Figure: next merge; combined probabilities 0.4 and 0.6.]
Step 5: Without rearranging the order, i.e., without placing the larger probability on the left!
Step 6: [Figure: completed tree; combined probabilities 0.4, 0.6 and 1.0.]
Step 5: With the order rearranged, i.e., placing the larger probability on the left!
Step 6: [Figure: completed tree and resulting code for the rearranged order.]
Note: Though the above codes have the same average bit rate, C1 has a smaller variance in codeword length than C, so it is better suited to storage in memory.
Though the 4-bit codeword in C seldom occurs, the memory width must be enlarged in case this codeword does occur. Therefore, C is not as good as C1.
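The comparison between C and C1 can be checked numerically. The codeword lengths below are assumptions for this sketch (the code tables did not survive in the slides): lengths 1, 2, 3, 4, 4 for C, which matches the 4-bit codeword mentioned above, and lengths 2, 2, 2, 3, 3 for C1. Both give 2.2 bits/symbol for the probabilities 0.4, 0.2, 0.2, 0.1, 0.1.

```python
def length_stats(probs, lengths):
    """Mean and variance of the codeword length under the given pmf."""
    mean = sum(p * l for p, l in zip(probs, lengths))
    var = sum(p * (l - mean) ** 2 for p, l in zip(probs, lengths))
    return mean, var

probs = [0.4, 0.2, 0.2, 0.1, 0.1]
lengths_c = [1, 2, 3, 4, 4]   # assumed lengths for C (contains 4-bit codewords)
lengths_c1 = [2, 2, 2, 3, 3]  # assumed lengths for C1

mean_c, var_c = length_stats(probs, lengths_c)
mean_c1, var_c1 = length_stats(probs, lengths_c1)
print(round(mean_c, 2), round(var_c, 2))    # same mean, larger variance
print(round(mean_c1, 2), round(var_c1, 2))  # same mean, smaller variance
```

Under these assumed lengths the variance is 1.36 for C and 0.16 for C1, and the longest codeword drops from 4 bits to 3 bits.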
A Short Summary
For minimum codeword-length variance:
Combine the probabilities that have the shorter codeword length.
Sort the probabilities at each step, from large to small, left to right.
Move each combined probability as far to the left as possible, and combine the probabilities on the right.
With the requirements above, the resulting Huffman code is unique in most cases.
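One way to mimic the "move the combined probability as far left as possible" rule in code is to break ties by age, so that among equal weights a freshly merged node is always picked last. The sketch below is an assumption-laden illustration rather than the lecture's procedure: the function name `min_variance_lengths` is hypothetical, and integer counts (40, 20, 20, 10, 10) are used in place of probabilities so that ties are exact.

```python
import heapq

def min_variance_lengths(counts):
    """Codeword lengths from a Huffman merge that breaks ties toward older nodes."""
    # Heap entry: (weight, age, symbols in this subtree). Leaves get ages 0..m-1,
    # merged nodes get larger ages, so among equal weights a merged node is
    # popped last (it sits "as far left as possible" in the sorted list).
    heap = [(w, i, [i]) for i, w in enumerate(counts)]
    heapq.heapify(heap)
    lengths = [0] * len(counts)
    age = len(counts)
    while len(heap) > 1:
        w0, _, s0 = heapq.heappop(heap)
        w1, _, s1 = heapq.heappop(heap)
        for s in s0 + s1:  # every symbol under the merged node gains one bit
            lengths[s] += 1
        heapq.heappush(heap, (w0 + w1, age, s0 + s1))
        age += 1
    return lengths

print(min_variance_lengths([40, 20, 20, 10, 10]))
```

For these counts the sketch yields lengths 2, 2, 2, 3, 3, i.e., no 4-bit codeword.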
Remarks for Huffman Code
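Remark 1: Redundancy for a skewed pmf
Probabilities: 0.8, 0.18, 0.02 (0.8 is the value implied by the other two)
$H(X) = 0.816$ bits/symbol
A Huffman code for three symbols has codeword lengths 1, 2, 2, so the average length is 1.2 bits/symbol. The redundancy, $1.2 - 0.816 = 0.384$ bits/symbol, is 47% of the entropy.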
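The figures in this remark can be reproduced with a few lines of Python. The sketch assumes the source probabilities are 0.8, 0.18 and 0.02 (only 0.18 and 0.02 survive on the slide; 0.8 is the value that makes them sum to one) and uses the codeword lengths 1, 2, 2 that any Huffman code for three symbols has.

```python
from math import log2

def entropy(probs):
    """Entropy in bits/symbol of a pmf given as a list of probabilities."""
    return -sum(p * log2(p) for p in probs if p > 0)

probs = [0.8, 0.18, 0.02]  # 0.8 is an assumption implied by the other two values
lengths = [1, 2, 2]        # Huffman codeword lengths for a three-symbol alphabet

H = entropy(probs)
avg_len = sum(p * l for p, l in zip(probs, lengths))
redundancy = avg_len - H
print(round(H, 3), round(avg_len, 3), round(100 * redundancy / H))
```

The redundancy comes out at about 47% of the entropy, which is the percentage shown on the slide.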
Remark 2: Longest codeword length (worst case)
Alphabet size m => the longest codeword length is m-1.
E.g., alphabet size = 5 => longest codeword length = 4.
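The worst case arises when the pmf is so skewed that every merge involves the previously merged node. A small sketch, assuming weights that halve from one symbol to the next (16, 8, 4, 2, 1; these values and the function name `huffman_lengths` are illustrative, not from the slides):

```python
import heapq

def huffman_lengths(weights):
    """Codeword length of each symbol after standard Huffman merging."""
    heap = [(w, i, [i]) for i, w in enumerate(weights)]  # (weight, tie-breaker, symbols)
    heapq.heapify(heap)
    lengths = [0] * len(weights)
    tie = len(weights)
    while len(heap) > 1:
        w0, _, s0 = heapq.heappop(heap)
        w1, _, s1 = heapq.heappop(heap)
        for s in s0 + s1:  # every symbol under the merged node gets one more bit
            lengths[s] += 1
        heapq.heappush(heap, (w0 + w1, tie, s0 + s1))
        tie += 1
    return lengths

weights = [16, 8, 4, 2, 1]  # assumed skewed weights for an alphabet of size m = 5
lengths = huffman_lengths(weights)
print(lengths, max(lengths))
```

With m = 5 the longest codeword has 4 = m-1 bits; a flatter pmf gives a shallower tree.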
Extended Huffman Code (1/5)
X is a discrete random variable with alphabet H: {1, 2, …, m}.
Coding is done by blocks of size n; let Y denote a block of n symbols.
$H(Y)$: entropy of Y
The alphabet size of Y is $m^n$.
Extended Huffman Code (2/5)
Extended Huffman Code (3/5)
Entropy for a group of n symbols, each drawn independently from an alphabet of m symbols. See page 53, Chapter 3, in the textbook.
There are $m^n$ possible combinations (symbols) in the constructed alphabet.
These symbols can be viewed as letters of an extended alphabet from a source $S^n$.
Extended Huffman Code (4/5)
Extended Huffman Code (5/5)
Proof (cont.):
The n-1 summations in braces in each of the n terms sum to one.
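For independent, identically distributed symbols, the entropy of a block of n symbols is n times the single-symbol entropy, $H(S^n) = nH(S)$; this is the standard result that such a proof establishes. The sketch below checks it numerically by enumerating all $m^n$ blocks. The pmf (0.4, 0.2, 0.2, 0.1, 0.1) is borrowed from the earlier example and the function names are assumptions for illustration.

```python
from itertools import product
from math import log2, prod

def entropy(probs):
    """Entropy in bits of a pmf given as a list of probabilities."""
    return -sum(p * log2(p) for p in probs if p > 0)

def extended_pmf(probs, n):
    """pmf of blocks of n independent symbols: m**n products of probabilities."""
    return [prod(block) for block in product(probs, repeat=n)]

probs = [0.4, 0.2, 0.2, 0.1, 0.1]
for n in (1, 2, 3):
    ext = extended_pmf(probs, n)
    # block alphabet size, block entropy, and n times the symbol entropy
    print(n, len(ext), entropy(ext), n * entropy(probs))
```

The last two columns agree up to floating-point rounding, while the alphabet size grows as $m^n$, which is the practical cost of coding by blocks.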
Empirical PMF
For Huffman tree construction:
X: discrete random variable
Alphabet H: {1, 2, …, m}
$N$: length of the sequence to be coded
$N_k$: number of occurrences of symbol k
The binary tree can be obtained from $N_k$ instead of $p_k$.
An empirical pmf model is used because the true pmf of the source to be encoded is rarely available.
An example of using $N_k$, with $N = 100$:
x     1     2     3     4     5
N_k   40    20    20    10    10
p_k   0.4   0.2   0.2   0.1   0.1
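The counts and the empirical pmf in this example can be obtained from a sequence as sketched below. The helper name `empirical_pmf` and the synthetic sequence (40 ones, 20 twos, and so on) are assumptions made only to reproduce the numbers in the table.

```python
from collections import Counter

def empirical_pmf(sequence):
    """Return (counts N_k, empirical pmf p_k = N_k / N) for a sequence of symbols."""
    counts = Counter(sequence)
    n = len(sequence)
    return counts, {s: c / n for s, c in counts.items()}

# Synthetic sequence of length N = 100 matching the table above.
seq = [1] * 40 + [2] * 20 + [3] * 20 + [4] * 10 + [5] * 10
counts, pmf = empirical_pmf(seq)
print(dict(counts))
print(pmf)
```

Because the pmf is the counts scaled by 1/N, the merging order is the same whether the tree is built from $N_k$ or from $p_k$.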
Storage Requirement
Problem: How should the header be constructed so that the header size is minimized?
N_k        40    20    20    10    10
Codeword   0     100   101   110   111
Length     1     3     3     3     3
The codeword length in the worst case is (m-1), where m is the number of symbols.
Therefore, the space required to store the code is m x (m-1) bits.
Two-Pass Huffman Encoding
First pass: collect the statistics of the sequence and build the code.
Second pass: encode the sequence.
$p_k$ is not updated when the statistics vary.
Adaptive Huffman
$p_k$ is updated during the encoding process.
No need to transmit $p_k$ or the codewords.
Single pass.
Suppose the current counts of occurrence are $N_1, N_2, \dots, N_m$.
Now a symbol, say $x_1$, is to be encoded; the counts of occurrence then change to $N_1 + 1, N_2, \dots, N_m$.
Update the count of occurrence for each symbol that is encoded.
No need to regrow the whole tree.
Sibling Property
Definition: Siblings are nodes with the same parent.
A binary tree has the sibling property if:
Each node (except the root) has a sibling.
The nodes can be listed in order of increasing (non-decreasing) probability, with each node adjacent to its sibling.
Theorem:
A binary prefix code is a Huffman code iff it has the sibling property.
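One direction of the theorem can be illustrated on an example: if the nodes of a Huffman tree are listed in the order in which the merging procedure picks them, the weights never decrease and each node sits next to its sibling. The sketch below is only a check on one tree, not a proof; the function names and the counts 40, 20, 20, 10, 10 are assumptions for illustration.

```python
import heapq

def sibling_listing(counts):
    """List (weight, parent id) of all non-root nodes in the order Huffman merging picks them."""
    heap = [(w, i) for i, w in enumerate(counts)]  # (weight, node id)
    heapq.heapify(heap)
    next_id = len(counts)
    listing = []
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        # The two picked nodes become siblings under the new node next_id.
        listing += [(a[0], next_id), (b[0], next_id)]
        heapq.heappush(heap, (a[0] + b[0], next_id))
        next_id += 1
    return listing

def has_sibling_property(listing):
    """True if weights are non-decreasing and nodes 2k, 2k+1 share a parent."""
    weights = [w for w, _ in listing]
    ordered = all(x <= y for x, y in zip(weights, weights[1:]))
    adjacent = all(listing[i][1] == listing[i + 1][1]
                   for i in range(0, len(listing), 2))
    return ordered and adjacent

listing = sibling_listing([40, 20, 20, 10, 10])
print([w for w, _ in listing], has_sibling_property(listing))
```

A listing whose weights are out of order, such as weights 10, 30, 20, 40 with the same pairing, fails the check.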
Numbering the Binary Tree
[Figure: numbered binary tree; the external nodes shown have weights 20, 20, 10, 10.]
Number the nodes in increasing order.
Block: nodes of the same weight make up a block. Here, weight means probability, i.e., count of occurrence.
Procedure of Adaptation
Encode the symbol using the current code, i.e., encode the symbol before the tree is updated!
Update the binary tree:
Label the corresponding external node as the current node.
Exchange the current node with the node in the same block that has the largest node number (except the parent of the current node).
Increase the weight of the current node by one.
Make the parent the current node.
Go back to the exchange step if the current node is not the root. If it is the root, increase its weight by one and stop.
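The procedure can be sketched as a small encoder. This is a minimal illustration under several assumptions that are not stated in the slides: the class and method names are hypothetical, the NYT node is always the left child (bit 0) and a new symbol the right child (bit 1), a symbol seen for the first time is sent as the NYT path followed by a fixed-length binary index, and node numbers start at 2m-1 for the root and decrease as nodes are created.

```python
import string

class Node:
    def __init__(self, number, parent=None, symbol=None):
        self.weight = 0
        self.number = number
        self.parent = parent
        self.left = self.right = None
        self.symbol = symbol

class AdaptiveHuffmanEncoder:
    def __init__(self, alphabet):
        self.index = {s: i for i, s in enumerate(alphabet)}
        self.width = max(1, (len(alphabet) - 1).bit_length())  # fixed-length initial code
        self.root = self.nyt = Node(2 * len(alphabet) - 1)     # single NYT node, weight 0
        self.nodes = [self.root]
        self.leaf = {}

    def _path(self, node):
        bits = []
        while node.parent is not None:
            bits.append("0" if node.parent.left is node else "1")
            node = node.parent
        return "".join(reversed(bits))

    def _swap(self, a, b):
        # Exchange tree positions and node numbers of a and b.
        pa, pb = a.parent, b.parent
        a_left, b_left = pa.left is a, pb.left is b
        if a_left:
            pa.left = b
        else:
            pa.right = b
        if b_left:
            pb.left = a
        else:
            pb.right = a
        a.parent, b.parent = pb, pa
        a.number, b.number = b.number, a.number

    def _update(self, node):
        while node is not None:
            if node.parent is not None:
                # Block = nodes of the same weight, excluding the current node's parent.
                block = [n for n in self.nodes
                         if n.weight == node.weight and n is not node.parent]
                leader = max(block, key=lambda n: n.number)
                if leader is not node:
                    self._swap(node, leader)
            node.weight += 1
            node = node.parent

    def encode_symbol(self, s):
        if s in self.leaf:                # symbol seen before: send its current codeword
            node = self.leaf[s]
            out = self._path(node)
        else:                             # first occurrence: NYT path + fixed code
            out = self._path(self.nyt) + format(self.index[s], f"0{self.width}b")
            old = self.nyt
            old.left = self.nyt = Node(old.number - 2, parent=old)
            old.right = node = Node(old.number - 1, parent=old, symbol=s)
            self.nodes += [old.left, old.right]
            self.leaf[s] = node
        self._update(node)                # the tree is updated only after encoding
        return out

enc = AdaptiveHuffmanEncoder(string.ascii_lowercase)
print([enc.encode_symbol(s) for s in "aacdz"])
```

Under these assumptions the first five symbols of the sequence a a c d z are sent as 00000, 1, 0 + 00010, 00 + 00011 and 000 + 11001, where the leading bits of the last three are the NYT path. The linear scan for the block leader keeps the sketch short; a practical implementation would index nodes by number instead.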
Example of Adaptive Huffman (1/9)
Alphabet H: {a, b, …, z}, 26 letters
Initial code: a: 00000, b: 00001, c: 00010, …, etc.
Coding sequence: a a c d z a z m
NYT: not yet transmitted
Total of 51 nodes (26 + 25 = 51)
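Example of Adaptive Huffman (2/9)
Step: Before transmission, the tree is a single NYT node with weight = 0.
[Figure: tree after one "a": root (node 51, weight 1), NYT (node 49, weight 0), a (node 50, weight 1). Tree after "aa": root (node 51, weight 2), NYT (node 49, weight 0), a (node 50, weight 2).]
The old NYT becomes the root of the tree! Generate two new external nodes, for symbol a and the new NYT respectively. Update the weights of a and the root!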
Example of Adaptive Huffman (3/9)
Step: aac encoded
[Figure: tree after "aac": new NYT (node 47, weight 0) and c (node 48, weight 1) below the old NYT.]
Generate two new external nodes, for symbol c and the new NYT respectively. Update the weights of c and the old NYT!
Example of Adaptive Huffman (4/9)
Step: aacd encoded
[Figure: tree after "aacd"; recovered labels (weight, node number): (0, 45), (1, 46), (1, 48).]
Generate two new external nodes, for symbol d and the new NYT respectively. Update the weights of d and the old NYT!
Example of Adaptive Huffman (5/9)
Step: aacdz encoded
[Figure: tree after the external nodes for z and the new NYT are added; recovered label (weight, node number): (1, 46).]
Generate two new external nodes, for symbol z and the new NYT respectively. Update the weights of z and the old NYT!
Example of Adaptive Huffman (6/9)
Step (cont.):
Refer to the previous slide for node 47, i.e., the current node! Node 47 should exchange positions with node 48.
[Figure: tree after the exchange; recovered labels (weight, node number): (1, 47), (1, 46), (2, 48).]
Example of Adaptive Huffman (7/9)
Step (cont.):
Refer to the previous slide for node 49, i.e., the current node! Node 49 should exchange positions with node 50.
[Figure: tree after the exchange; recovered labels (weight, node number): (1, 47), (3, 50), (1, 46), (2, 48).]
Example of Adaptive Huffman (8/9)
Bit stream transmitted, with its segments labeled in order: a, NYT, c, NYT, d, NYT, z
Example of Adaptive Huffman (9/9)
Comments on Adaptive Huffman Coding
In the adaptive Huffman coding procedure, neither the transmitter nor the receiver knows anything about the statistics of the source sequence at the start of transmission.
The tree at both sides consists of a single node that corresponds to all symbols not yet transmitted (NYT) and has a weight of zero.