HUFFMAN CODING - Universiti Putra Malaysia
In this chapter, we describe a very popular coding algorithm called the Huffman coding algorithm
Present a procedure for building Huffman codes when the probability model for the source is known
Present a procedure for building codes when the source statistics are unknown
Describe a new technique for code design that is in some sense similar to the Huffman coding approach
Some applications
Algorithm in 5 steps:
1. Find the grey-level probabilities for the image by finding the histogram
2. Order the input probabilities (histogram magnitudes) from smallest to largest
3. Combine the smallest two by addition
4. GOTO step 2, until only two probabilities are left
5. By working backward along the tree, generate code by alternating assignment of 0 and 1
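The five steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the slides' own implementation: the function name `huffman_code` and the sample image are assumptions, and instead of stopping at two probabilities and working backward, the sketch merges down to a single root and prepends a bit at every merge, which yields the same codeword lengths.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_code(pixels):
    """Return {gray_level: bitstring} built from the histogram of `pixels`."""
    # Step 1: histogram -> probabilities.
    hist = Counter(pixels)
    total = sum(hist.values())
    tie = count()  # unique tie-breaker so the heap never compares dicts
    # Step 2: a min-heap keeps the probabilities ordered, smallest first.
    # Each entry is (probability, tie_breaker, {gray_level: partial_code}).
    heap = [(n / total, next(tie), {g: ""}) for g, n in hist.items()]
    heapq.heapify(heap)
    if len(heap) == 1:  # degenerate image with a single gray level
        return {g: "0" for g in hist}
    # Steps 3-4: repeatedly combine the two smallest probabilities.
    while len(heap) > 1:
        p0, _, low = heapq.heappop(heap)
        p1, _, high = heapq.heappop(heap)
        # Step 5 (folded in): one branch gets 0, the other gets 1.
        merged = {g: "0" + c for g, c in low.items()}
        merged.update({g: "1" + c for g, c in high.items()})
        heapq.heappush(heap, (p0 + p1, next(tie), merged))
    return heap[0][2]


# Hypothetical 100-pixel image with four gray levels.
image = [3] * 40 + [1] * 30 + [0] * 20 + [2] * 10
code = huffman_code(image)
code_lengths = {g: len(c) for g, c in code.items()}
```

The exact bit patterns depend on how ties and branch labels are resolved, so only the codeword lengths should be expected to be stable across implementations.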
Source reduction
List all probabilities in a descending order
Merge the two symbols with smallest probabilities into a new compound symbol
Repeat the above two steps N-2 times, until only two symbols remain
Codeword assignment
Start from the smallest source and work back to the original source
Each merging point corresponds to a node in binary codeword tree
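As a rough sketch of the source reduction stage alone, the function below records every reduced source. The name `source_reduction` and the use of integer percentages (to avoid floating-point noise) are assumptions made for illustration.

```python
def source_reduction(probs):
    """Return the successive reduced sources, each listed in descending order."""
    stages = [sorted(probs, reverse=True)]
    while len(stages[-1]) > 2:
        current = stages[-1]
        # Merge the two smallest entries (the last two in descending order).
        compound = current[-1] + current[-2]
        stages.append(sorted(current[:-2] + [compound], reverse=True))
    return stages


# Probabilities given as percentages (assumed example values).
stages = source_reduction([40, 30, 20, 10])
# stages == [[40, 30, 20, 10], [40, 30, 30], [60, 40]]
```

For N symbols the loop runs N-2 times, so the returned list holds N-1 sources, the last of which has exactly two entries.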
Huffman Coding (using binary tree)
Example 1
possible gray levels. The image is 10 rows by 10
columns. In step 1 we find the histogram for the
image.
necessary) and add the
remain.
In step 5, the actual code assignment is made.
Start on the right-hand side of the tree and assign 0s and 1s:
0 is assigned to the 0.6 branch and 1 to the 0.4 branch.
The assigned 0 and 1 are brought back along the tree, and wherever a branch occurs the code is put on both branches.
Assign 0 and 1 to the branches labeled 0.3, appending to the existing code.
(at 0.1 & 0.2 branch)
Now we have the Huffman code for this image.
Two gray levels have 3 bits to represent them and one gray level has 1 bit assigned.
The gray level represented by 1 bit, g3, is the most likely to occur (40% of the time) and thus carries the least information in the information-theoretic sense.
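The link between probability, information and codeword length can be checked numerically. In the sketch below only g3 = 0.4 is stated in the slides; the assignment of 0.3, 0.2 and 0.1 to g1, g0 and g2, and the 2-bit length for the remaining gray level, are assumptions inferred from the branch values in the example.

```python
import math

# g3 = 0.4 comes from the slide; the other label-to-probability pairs are assumed.
probs = {"g0": 0.2, "g1": 0.3, "g2": 0.1, "g3": 0.4}
# Codeword lengths: one 1-bit code, one 2-bit code (implied), two 3-bit codes.
lengths = {"g0": 3, "g1": 2, "g2": 3, "g3": 1}


def self_information(p):
    """Information content, in bits, of an event with probability p."""
    return -math.log2(p)


info = {g: self_information(p) for g, p in probs.items()}
least_informative = min(info, key=info.get)
average_length = sum(probs[g] * lengths[g] for g in probs)
```

Under these assumptions `least_informative` is g3, the level with the shortest codeword, and `average_length` comes out at about 1.9 bits per pixel, below the 2 bits a fixed-length code for four levels would need.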
using the minimum variance procedure.
EE465: Introduction to Digital Image Processing
The codeword assignment is not unique. In fact, at each merging point (node), we can arbitrarily assign “0” and “1” to the two branches (the average code length is the same).
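A small sketch of this non-uniqueness: swapping the two branch labels below any node changes the bit patterns but not the codeword lengths. The helper names and the starting code table are hypothetical, chosen to be consistent with the example tree (0 on the 0.6 branch, 1 on the 0.4 branch).

```python
def flip_at_node(code, prefix):
    """Swap the 0/1 labels on the two branches below the node reached by `prefix`."""
    k = len(prefix)
    flipped = {}
    for symbol, word in code.items():
        if word.startswith(prefix) and len(word) > k:
            bit = "1" if word[k] == "0" else "0"
            word = word[:k] + bit + word[k + 1:]
        flipped[symbol] = word
    return flipped


def is_prefix_free(code):
    """True if all codewords are distinct and none is a prefix of another."""
    words = list(code.values())
    if len(set(words)) != len(words):
        return False
    return not any(a != b and b.startswith(a) for a in words for b in words)


# Assumed code table for the four gray levels.
code = {"g3": "1", "g1": "00", "g0": "010", "g2": "011"}
at_root = flip_at_node(code, "")    # swap labels at the root
at_inner = flip_at_node(code, "0")  # swap labels one level down
```

Both `at_root` and `at_inner` remain prefix-free and keep every codeword length, so the average code length is unchanged.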
Code redundancy: $r = \bar{l} - H(X) = 0.078$ bps
If we use fixed-length codes, we have to spend three bits per sample, which gives a code redundancy of 3 - 2.122 = 0.878 bps.
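The redundancy arithmetic can be sketched as below. The entropy 2.122 bps and the 3-bit fixed length are taken from the slide; the Huffman average length of 2.2 bps is an assumption, inferred as 2.122 + 0.078 rather than stated, and the function name is hypothetical.

```python
def code_redundancy(average_length, entropy):
    """Redundancy in bits per sample: average code length minus source entropy."""
    return average_length - entropy


ENTROPY = 2.122        # H(X) in bps, from the slide
FIXED_LENGTH = 3       # bits per sample for the fixed-length code
HUFFMAN_LENGTH = 2.2   # assumption: inferred as 2.122 + 0.078, not stated explicitly

fixed_redundancy = code_redundancy(FIXED_LENGTH, ENTROPY)
huffman_redundancy = code_redundancy(HUFFMAN_LENGTH, ENTROPY)
```

With these values the fixed-length code wastes roughly 0.878 bps, while the Huffman code wastes roughly 0.078 bps.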
* 0 represents the null node
* t(1) denotes the occurrence of T with a frequency of 1
[Figures: tree diagrams for the input prefixes TE, ..., TENNESSE]
TENNESSEE
e : 0
n : 11
s : 101
t : 1001
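To see that this table is a usable prefix code, the sketch below encodes and decodes the whole word with it. Note the simplification: a real dynamic coder changes codewords as the tree grows, whereas this illustration applies only the final table, as the average-length calculation does. The function names are assumptions.

```python
# Final code table from the slide.
CODE = {"e": "0", "n": "11", "s": "101", "t": "1001"}


def encode(text, code=CODE):
    """Concatenate the codeword of every character in `text`."""
    return "".join(code[ch] for ch in text)


def decode(bits, code=CODE):
    """Decode a bit string by matching codewords greedily, left to right."""
    inverse = {word: symbol for symbol, word in code.items()}
    out, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in inverse:  # safe because no codeword prefixes another
            out.append(inverse[buffer])
            buffer = ""
    if buffer:
        raise ValueError("trailing bits do not form a codeword")
    return "".join(out)


bits = encode("tennessee")
# bits == "100101111010110100", 18 bits for 9 symbols
```

The 18-bit total agrees with the numerator of the average code length calculation.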
Average code length = $\sum_{i=0}^{n} (\text{length}_i \times \text{frequency}_i) \,/\, \sum_{i=0}^{n} \text{frequency}_i$
= { 1(4) + 2(2) + 3(2) + 4(1) } / (4+2+2+1)
= 18 / 9 = 2
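Entropy = $-(0.44 \log_2 0.44 + 0.22 \log_2 0.22 + 0.22 \log_2 0.22 + 0.11 \log_2 0.11)$
≈ 1.8366 bits per symbol (evaluated with the exact probabilities 4/9, 2/9, 2/9 and 1/9; the two-decimal values shown are rounded)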
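The entropy figure can be reproduced directly from the letter counts. This is a minimal sketch; the function name `entropy_bits` is an assumption.

```python
import math
from collections import Counter


def entropy_bits(text):
    """Shannon entropy of the symbol distribution in `text`, in bits per symbol."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


h = entropy_bits("TENNESSEE")  # about 1.84 bits per symbol
```

Because the counts are used as exact fractions rather than two-decimal probabilities, the result is slightly higher than what the rounded values 0.44, 0.22 and 0.11 would give.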
2*3 + 3*1) / 9 = 1.89
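The truncated expression above appears to be the average code length of ordinary (static) Huffman coding for the same word. As a hedged check, the sketch below builds static Huffman codeword lengths from the letter frequencies; the function name `huffman_lengths` is an assumption.

```python
import heapq
from collections import Counter


def huffman_lengths(freqs):
    """Return {symbol: codeword length} for a static Huffman code."""
    # Heap entries are (weight, unique_id, symbols_below_this_node).
    heap = [(f, i, [s]) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    next_id = len(heap)
    while len(heap) > 1:
        f0, _, a = heapq.heappop(heap)
        f1, _, b = heapq.heappop(heap)
        for s in a + b:
            lengths[s] += 1  # every merge adds one bit to the symbols beneath it
        heapq.heappush(heap, (f0 + f1, next_id, a + b))
        next_id += 1
    return lengths


freqs = Counter("TENNESSEE")
lengths = huffman_lengths(freqs)
total_bits = sum(freqs[s] * lengths[s] for s in freqs)
average = total_bits / sum(freqs.values())
```

Here `total_bits` is 17, so `average` is 17/9, about 1.89 bits per symbol, compared with 2 for the dynamic code table.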
SUMMARY
In this exercise, the average code length of ordinary Huffman coding seems to be better than that of the dynamic version.
In practice, however, the performance of dynamic coding is better. The problem with static coding is that the tree has to be constructed at the transmitter and sent to the receiver. The tree may change because the frequency distribution of English letters differs between plain text, a technical paper, a piece of code, etc.
Since the tree in dynamic coding is constructed at the receiver as well, it need not be sent. Considering this, dynamic coding is better.
Also, the average code length will improve if the transmitted text is longer.
Summary of Huffman Coding Algorithm
Achieves minimal redundancy subject to the constraint that the source symbols be coded one at a time
Sorting symbols in descending order of probability is the key step in source reduction
The codeword assignment is not unique. Exchanging the labels “0” and “1” at any node of the binary codeword tree produces another solution that works equally well