# HUFFMAN CODING - Universiti Putra Malaysia



In this chapter, we describe a very popular coding algorithm called the Huffman coding algorithm:

Present a procedure for building Huffman codes when the probability model for the source is known

A procedure for building codes when the source statistics are unknown

Describe a new technique for code design that is in some sense similar to the Huffman coding approach

Some applications

Algorithm in 5 steps:

1. Find the gray-level probabilities for the image by finding the histogram

2. Order the input probabilities (histogram magnitudes) from smallest to largest

3. Combine the smallest two by addition

4. Repeat from step 2 until only two probabilities are left

5. By working backward along the tree, generate the code by alternating assignment of 0 and 1
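The five steps above can be sketched in Python; a priority queue stands in for the repeated reordering of steps 2-4. This is a minimal illustration (the function name and the gray-level labels are hypothetical, chosen to match the example later in these slides):

```python
import heapq
from itertools import count

def huffman_codes(probabilities):
    """Build Huffman codewords from a {symbol: probability} map."""
    tie = count()  # tie-breaker so the heap never compares tree tuples
    heap = [(p, next(tie), sym) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, left = heapq.heappop(heap)    # step 3: combine the two
        p2, _, right = heapq.heappop(heap)   # smallest probabilities
        heapq.heappush(heap, (p1 + p2, next(tie), (left, right)))
    codes = {}
    def assign(node, prefix):                # step 5: walk back, assigning 0/1
        if isinstance(node, tuple):
            assign(node[0], prefix + "0")
            assign(node[1], prefix + "1")
        else:
            codes[node] = prefix or "0"
    assign(heap[0][2], "")
    return codes

# Hypothetical gray-level probabilities (the most likely level, g3,
# ends up with the single-bit codeword):
print(huffman_codes({"g0": 0.1, "g1": 0.2, "g2": 0.3, "g3": 0.4}))
```

The exact 0/1 labels depend on tie-breaking, but the codeword lengths (1, 2, 3, 3 bits here) are what Huffman's procedure guarantees.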


Source reduction

List all probabilities in a descending order

Merge the two symbols with smallest probabilities into a new compound symbol

Repeat the above two steps N − 2 times, until only two compound symbols remain (for a source with N symbols)

Codeword assignment

Start from the smallest reduced source and work back to the original source

Each merging point corresponds to a node in the binary codeword tree

Huffman Coding (using binary tree)

Example 1

The image has four possible gray levels and is 10 rows by 10 columns. In step 1 we find the histogram for the image.


In steps 2-4 we order the probabilities (reordering after each combination if necessary) and add the two smallest, repeating until only two probabilities remain.

In step 5 the actual code assignment is made. Start on the right-hand side of the tree and assign 0's and 1's: 0 is assigned to the 0.6 branch and 1 to the 0.4 branch.


The assigned 0 and 1 are brought back along the tree, and wherever a branch occurs the code is put on both branches.


Assign 0 and 1 to the branches labeled 0.3 (the 0.1 and 0.2 branches), appending to the existing code.

Now we have the Huffman code for this image: two gray levels are represented with 3 bits each and one gray level with a single bit.

The gray level represented by 1 bit, g3, is the most likely to occur (40% of the time) and thus carries the least information in the information-theoretic sense.
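As a quick arithmetic check, assuming the gray-level probabilities suggested by the branch labels (0.4, 0.3, 0.2 and 0.1) and codeword lengths of 1, 2, 3 and 3 bits, the expected bits per pixel work out as:

```python
# Probabilities suggested by the example's branch labels (0.4 merged with
# 0.6 at the root; the 0.3 branch formed from the 0.1 and 0.2 branches).
probs   = {"g3": 0.4, "g2": 0.3, "g1": 0.2, "g0": 0.1}
lengths = {"g3": 1,   "g2": 2,   "g1": 3,   "g0": 3}   # bits per codeword

# Average code length = sum over gray levels of probability * length.
avg_bits = sum(probs[g] * lengths[g] for g in probs)
print(round(avg_bits, 2))  # 1.9 bits per pixel
```

So the Huffman code averages 1.9 bits per pixel against 2 bits for a fixed-length code over four levels.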

An alternative code can be obtained using the minimum variance procedure.

EE465: Introduction to Digital Image Processing


The codeword assignment is not unique. In fact, at each merging point (node) we can arbitrarily assign "0" and "1" to the two branches; the average code length is the same either way.


The redundancy of the Huffman code is r = l̄ − H(X) = 2.2 − 2.122 = 0.078 bps.

If we use fixed-length codes, we have to spend three bits per sample, which gives a code redundancy of 3 − 2.122 = 0.878 bps.
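The redundancy comparison can be reproduced in a few lines. The distribution behind the H(X) = 2.122 figure is not shown here, so this sketch reuses the four-level example distribution (0.4, 0.3, 0.2, 0.1) purely for illustration:

```python
from math import log2

def redundancy(probs, lengths):
    """r = average code length minus source entropy, in bits per sample."""
    avg = sum(p * n for p, n in zip(probs, lengths))
    entropy = -sum(p * log2(p) for p in probs if p > 0)
    return avg - entropy

probs = [0.4, 0.3, 0.2, 0.1]                      # illustrative only
print(round(redundancy(probs, [1, 2, 3, 3]), 4))  # Huffman code: 0.0536
print(round(redundancy(probs, [3, 3, 3, 3]), 4))  # fixed 3-bit code: 1.1536
```

The same pattern holds as in the slides: the Huffman code's redundancy is a small fraction of what a fixed-length code wastes.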

Dynamic (Adaptive) Huffman Coding

Example 2: coding the string TENNESSEE

* 0 represents the null node

* t(1) denotes the occurrence of T with a frequency of 1

[Tree diagrams: the adaptive Huffman tree, with root r, shown after coding "T", "TE", …, "TENNESSE", and "TENNESSEE".]
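The key idea the diagrams illustrate is that the tree depends only on the symbol counts seen so far, which both ends can track. A naive sketch of that idea (real adaptive Huffman, e.g. the FGK/Vitter algorithms, updates the tree incrementally instead of recounting from scratch):

```python
from collections import Counter

# Both encoder and decoder update the same counts after every symbol, so
# each side can rebuild an identical code tree without transmitting it.
counts = Counter()
for prefix_len, symbol in enumerate("TENNESSEE", start=1):
    counts[symbol] += 1
    print("TENNESSEE"[:prefix_len], dict(counts))
```

After the last symbol the counts are E:4, N:2, S:2, T:1, which is exactly the frequency table used in the code-length calculations below.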

e : 0

n : 11

s : 101

t : 1001

Average code length = Σ (length × frequency) / Σ frequency

= ( 1(4) + 2(2) + 3(2) + 4(1) ) / (4 + 2 + 2 + 1)

= 18 / 9 = 2 bits/symbol

Entropy = − ( 0.44 log2(0.44) + 0.22 log2(0.22) + 0.22 log2(0.22) + 0.11 log2(0.11) )

≈ 1.8367 bits/symbol
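Both numbers can be verified directly from the code table and the symbol frequencies (the variable names here are just illustrative):

```python
from math import log2
from collections import Counter

code = {"e": "0", "n": "11", "s": "101", "t": "1001"}  # table above
freq = Counter("tennessee")                            # e:4, n:2, s:2, t:1
total = sum(freq.values())                             # 9 symbols

avg_length = sum(freq[s] * len(code[s]) for s in code) / total
entropy = -sum((f / total) * log2(f / total) for f in freq.values())

print(avg_length)           # 2.0 bits/symbol
print(round(entropy, 3))    # 1.837 bits/symbol
```

The average code length (2) exceeds the entropy (≈1.84), as it must for any uniquely decodable code.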


Ordinary (static) Huffman coding of the same string gives codeword lengths of 1, 2, 3 and 3 bits, so:

Average code length = ( 1(4) + 2(2) + 3(2) + 3(1) ) / 9 = 17/9 ≈ 1.89 bits/symbol


SUMMARY

The average code length of ordinary Huffman coding seems to be better than that of the dynamic version in this exercise.

But in practice the performance of dynamic coding is better. The problem with static coding is that the tree has to be constructed at the transmitter and sent to the receiver, and the tree may change because the frequency distribution of English letters differs between plain text, technical papers, pieces of code, etc.

Since the tree in dynamic coding is constructed at the receiver as well, it need not be sent. Considering this, dynamic coding is better. Also, the average code length improves as the transmitted text gets longer.


Summary of Huffman Coding Algorithm

Achieves minimal redundancy subject to the constraint that the source symbols are coded one at a time

Sorting the symbols in descending order of probability is the key step in source reduction

The codeword assignment is not unique: exchanging the labels "0" and "1" at any node of the binary codeword tree produces another code that works equally well
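A small check of the last point, using the TENNESSEE code table as an example: flipping every 0/1 label still yields a prefix-free code with identical codeword lengths, hence the same average length:

```python
code = {"e": "0", "n": "11", "s": "101", "t": "1001"}
flipped = {s: c.translate(str.maketrans("01", "10")) for s, c in code.items()}

def is_prefix_free(codewords):
    # In sorted order, any prefix relation shows up between neighbours.
    words = sorted(codewords)
    return all(not nxt.startswith(w) for w, nxt in zip(words, words[1:]))

print(flipped)                            # e -> "1", n -> "00", ...
print(is_prefix_free(flipped.values()))   # True
```

Flipping the labels at a single internal node (rather than all of them) has the same effect: lengths, and therefore average code length, are unchanged.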
