HUFFMAN CODING

Transcript of HUFFMAN CODING - Universiti Putra Malaysia

Page 1: HUFFMAN CODING

Page 2: Overview

In this chapter, we:

Describe a very popular coding algorithm called the Huffman coding algorithm

Present a procedure for building Huffman codes when the probability model for the source is known

Present a procedure for building codes when the source statistics are unknown

Describe a new technique for code design that is in some sense similar to the Huffman coding approach

Present some applications

Pages 3-10: Huffman Coding Algorithm

Pages 11-15: Minimum Variance Huffman Codes

Page 16: Huffman Coding (using binary tree)

Algorithm in 5 steps:

1. Find the gray-level probabilities for the image by computing its histogram

2. Order the input probabilities (histogram magnitudes) from smallest to largest

3. Combine the smallest two by addition

4. GOTO step 2, until only two probabilities are left

5. By working backward along the tree, generate the code by alternately assigning 0 and 1 (a sketch of the whole procedure follows below)
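These five steps translate directly into a few lines of code. Below is a minimal Python sketch (not from the original slides) that applies them to Example 1's gray-level counts; the variable names are illustrative, a heap stands in for the repeated reordering, and the particular 0/1 labels may differ from the figures because the assignment is not unique.

```python
import heapq
import itertools

gray_counts = {"g0": 20, "g1": 30, "g2": 10, "g3": 40}   # step 1: histogram
total = sum(gray_counts.values())

# Step 2: keep probabilities ordered smallest-first (a heap maintains this);
# the counter breaks ties so the dicts are never compared.
tie = itertools.count()
heap = [(n / total, next(tie), {g: ""}) for g, n in gray_counts.items()]
heapq.heapify(heap)

# Steps 3-4: combine the two smallest by addition until one probability remains.
while len(heap) > 1:
    p1, _, codes1 = heapq.heappop(heap)
    p2, _, codes2 = heapq.heappop(heap)
    # Step 5, done incrementally: working backward means every symbol under
    # one branch gets a 0 prepended, every symbol under the other gets a 1.
    merged = {g: "0" + c for g, c in codes1.items()}
    merged.update({g: "1" + c for g, c in codes2.items()})
    heapq.heappush(heap, (p1 + p2, next(tie), merged))

print(heap[0][2])  # {'g3': '0', 'g1': '10', 'g2': '110', 'g0': '111'}
```

The code lengths (1, 2, 3, 3 bits) match the walk-through below; only the 0/1 labels differ.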

Page 17: Coding Procedures for an N-symbol source

Source reduction

List all probabilities in descending order

Merge the two symbols with the smallest probabilities into a new compound symbol

Repeat the above two steps N-2 times

Codeword assignment

Start from the smallest reduced source and work back to the original source

Each merging point corresponds to a node in the binary codeword tree (a short trace of the reduction step follows)

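The reduction step can be traced mechanically. A small sketch, assuming the four-symbol source reconstructed in Example 2 below (S, N, W, E); the compound labels are generated by concatenation, so they differ cosmetically from the slides' (EW) and (NEW):

```python
probs = {"S": 0.5, "N": 0.25, "W": 0.125, "E": 0.125}

stages = [dict(probs)]
while len(probs) > 2:                      # N-2 reduction steps for N symbols
    # List in descending order, then merge the two smallest into a compound.
    ordered = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    (s2, p2), (s1, p1) = ordered[-2], ordered[-1]
    del probs[s1], probs[s2]
    probs["(" + s1 + s2 + ")"] = p1 + p2
    stages.append(dict(probs))

for stage in stages:
    print(stage)
# {'S': 0.5, 'N': 0.25, 'W': 0.125, 'E': 0.125}
# {'S': 0.5, 'N': 0.25, '(EW)': 0.25}
# {'S': 0.5, '((EW)N)': 0.5}
```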

Page 18: Example 1

We have an image with 2 bits/pixel, giving 4 possible gray levels. The image is 10 rows by 10 columns. In step 1 we find the histogram for the image.

Page 19: Example 1

a. Step 1: Histogram

Gray level 0 has 20 pixels

Gray level 1 has 30 pixels

Gray level 2 has 10 pixels

Gray level 3 has 40 pixels

The counts are converted into probabilities by normalizing to the total number of pixels.
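As a quick sketch (the counts as quoted above, 100 pixels in total):

```python
counts = [20, 30, 10, 40]                    # gray levels 0..3
probs = [c / sum(counts) for c in counts]
print(probs)                                  # [0.2, 0.3, 0.1, 0.4]
```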

Page 20: Example 1

b. Step 2: The probabilities are ordered.

Page 21: Example 1

c. Step 3: Combine the smallest two by addition.

Page 22: Example 1

d. Step 4: Reorder and add until only two values remain.

Step 4 repeats steps 2 and 3, where we reorder (if necessary) and add the two smallest probabilities.

Page 23: Example 1

In step 5, the actual code assignment is made.

Start on the right-hand side of the tree and assign 0's & 1's:

0 is assigned to the 0.6 branch & 1 to the 0.4 branch

Page 24: Example 1

The assigned 0 & 1 are brought back along the tree, & wherever a branch occurs the code is put on both branches.

Page 25: Example 1

Assign 0 & 1 to the branches labeled 0.3, appending to the existing code.

Page 26: Example 1

Finally, the codes are brought back one more level, & where the branch splits another 0 & 1 assignment occurs (at the 0.1 & 0.2 branch).

Page 27: Example 1

Now we have the Huffman code for this image:

Two gray levels are represented with 3 bits, one with 2 bits, and one with 1 bit.

The gray level represented by 1 bit, g3, is the most likely to occur (40% of the time) & thus carries the least information in the information-theoretic sense.
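As a quick check of the resulting rate (a sketch, assuming the lengths implied by the walk-through: g3 gets 1 bit, g1 gets 2, g0 and g2 get 3):

```python
probs = {"g0": 0.2, "g1": 0.3, "g2": 0.1, "g3": 0.4}
lengths = {"g0": 3, "g1": 2, "g2": 3, "g3": 1}
avg = sum(probs[g] * lengths[g] for g in probs)
print(round(avg, 3))   # 1.9 bits/pixel, versus 2 bits/pixel fixed-length
```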

Page 28: Exercise

Using Example 1, find a Huffman code using the minimum variance procedure.

Page 29: Example 2

Step 1: Source reduction

symbol x   p(x)     reduction 1   reduction 2
S          0.5      0.5           0.5
N          0.25     0.25          (NEW) 0.5
W          0.125    (EW) 0.25
E          0.125

(EW) and (NEW) are compound symbols

Page 30: Example 2

Step 2: Codeword assignment

symbol x   p(x)     codeword
S          0.5      0
N          0.25     10
W          0.125    110
E          0.125    111

Working back from the last reduction: the root splits into S ("0") and (NEW) ("1"); (NEW) splits into N ("10") and (EW) ("11"); (EW) splits into W ("110") and E ("111").

Page 31: Example 2

The codeword assignment is not unique. In fact, at each merging point (node), we can arbitrarily assign "0" and "1" to the two branches (the average code length is the same).

S = 0, N = 10, W = 110, E = 111

or, with the labels flipped at every node,

S = 1, N = 01, W = 001, E = 000
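A quick check, assuming the reconstructed probabilities, that the two labelings cost the same:

```python
p = {"S": 0.5, "N": 0.25, "W": 0.125, "E": 0.125}
first = {"S": "0", "N": "10", "W": "110", "E": "111"}
flipped = {"S": "1", "N": "01", "W": "001", "E": "000"}
for codes in (first, flipped):
    print(sum(p[s] * len(c) for s, c in codes.items()))   # 1.75 both times
# 1.75 bits/symbol also equals the source entropy here, since all the
# probabilities are powers of 1/2.
```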

Page 32: Example 2

Step 1: Source reduction

symbol x   p(x)    reduction 1   reduction 2   reduction 3
e          0.4     0.4           0.4           (aiou) 0.6
a          0.2     0.2           (iou) 0.4     0.4
i          0.2     0.2           0.2
o          0.1     (ou) 0.2
u          0.1

(ou), (iou) and (aiou) are compound symbols

Page 33: Example 2

Step 2: Codeword assignment

symbol x   p(x)    codeword
e          0.4     1
a          0.2     01
i          0.2     000
o          0.1     0010
u          0.1     0011

Page 34: Example 2

Binary codeword tree representation:

r
0/ \1
(aiou) e
0/ \1
(iou) a
0/ \1
i (ou)
0/ \1
o u

Page 35: Example 2

symbol x   p(x)    codeword   length
e          0.4     1          1
a          0.2     01         2
i          0.2     000        3
o          0.1     0010       4
u          0.1     0011       4

H(X) = - Σ i=1..5 (p_i log2 p_i) = 2.122 bps

Average length l = Σ i=1..5 l_i p_i = 1(0.4) + 2(0.2) + 3(0.2) + 4(0.1) + 4(0.1) = 2.2 bps

Code redundancy r = l - H(X) = 2.2 - 2.122 = 0.078 bps

If we use fixed-length codes, we have to spend three bits per sample, which gives a code redundancy of 3 - 2.122 = 0.878 bps.
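The three figures can be verified numerically; a small sketch (not from the slides), assuming the table as reconstructed above:

```python
from math import log2

p = {"e": 0.4, "a": 0.2, "i": 0.2, "o": 0.1, "u": 0.1}
length = {"e": 1, "a": 2, "i": 3, "o": 4, "u": 4}

H = -sum(px * log2(px) for px in p.values())        # entropy
avg = sum(p[x] * length[x] for x in p)              # average code length
print(round(H, 3), round(avg, 2), round(avg - H, 3))   # 2.122 2.2 0.078
```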

Page 36: Example 3

Step 1: Source reduction (compound symbol)

Page 37: Example 3

Step 2: Codeword assignment (compound symbol)

Pages 38-39: Adaptive Huffman Coding

Pages 40-45: Update Procedure

Page 46: Dynamic Huffman Coding

Page 47: T

Stage 1 (First occurrence of t)

r

/ \

0 t(1)

Order: 0,t(1)

* r represents the root

* 0 represents the null node

* t(1) denotes the occurrence of T with a frequency of 1

Page 48: TE

Stage 2 (First occurrence of e)

r

/ \

1 t(1)

/ \

0 e(1)

Order: 0,e(1),1,t(1)

Page 49: TEN

Stage 3 (First occurrence of n)

r

/ \

2 t(1)

/ \

1 e(1)

/ \

0 n(1)

Order: 0,n(1),1,e(1),2,t(1) : Misfit

Page 50: Reorder: TEN

r

/ \

t(1) 2

/ \

1 e(1)

/ \

0 n(1)

Order: 0,n(1),1,e(1),t(1),2

Page 51: TENN

Stage 4 (Repetition of n)

r

/ \

t(1) 3

/ \

2 e(1)

/ \

0 n(2)

Order: 0,n(2),2,e(1),t(1),3 : Misfit

Page 52: Reorder: TENN

r

/ \

n(2) 2

/ \

1 e(1)

/ \

0 t(1)

Order: 0,t(1),1,e(1),n(2),2

t(1),n(2) are swapped

Page 53: TENNE

Stage 5 (Repetition of e)

r

/ \

n(2) 3

/ \

1 e(2)

/ \

0 t(1)

Order: 0,t(1),1,e(2),n(2),3

Page 54: TENNES

Stage 6 (First occurrence of s)

r

/ \

n(2) 4

/ \

2 e(2)

/ \

1 t(1)

/ \

0 s(1)

Order: 0,s(1),1,t(1),2,e(2),n(2),4

Page 55: TENNESS

Stage 7 (Repetition of s)

r

/ \

n(2) 5

/ \

3 e(2)

/ \

2 t(1)

/ \

0 s(2)

Order: 0,s(2),2,t(1),3,e(2),n(2),5 : Misfit

Page 56: Reorder: TENNESS

r

/ \

n(2) 5

/ \

3 e(2)

/ \

1 s(2)

/ \

0 t(1)

Order : 0,t(1),1,s(2),3,e(2),n(2),5

s(2) and t(1) are swapped

Page 57: TENNESSE

Stage 8 (Second repetition of e)

r

/ \

n(2) 6

/ \

3 e(3)

/ \

1 s(2)

/ \

0 t(1)

Order : 0,t(1),1,s(2),3,e(3),n(2),6 : Misfit

Page 58: Reorder: TENNESSE

r

/ \

e(3) 5

/ \

3 n(2)

/ \

1 s(2)

/ \

0 t(1)

Order: 0,t(1),1,s(2),3,n(2),e(3),5

n(2) and e(3) are swapped

Page 59: TENNESSEE

Stage 9 (Third repetition of e)

r

0/ \1

e(4) 5

0/ \1

3 n(2)

0/ \1

1 s(2)

0/ \1

0 t(1)

Order: 0,t(1),1,s(2),3,n(2),e(4),5
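The codewords on the next page fall straight out of this final tree. A small sketch (not from the slides) that transcribes the Stage 9 tree as nested (left, right) pairs, 0 on the left branch and 1 on the right, and walks it:

```python
# Stage 9 tree: root -> (e, node5), node5 -> (node3, n),
# node3 -> (node1, s), node1 -> (null, t).
tree = ("e", ((("null", "t"), "s"), "n"))

def codes(node, prefix=""):
    if isinstance(node, str):                     # leaf
        return {node: prefix}
    left, right = node
    return {**codes(left, prefix + "0"), **codes(right, prefix + "1")}

print(codes(tree))
# {'e': '0', 'null': '1000', 't': '1001', 's': '101', 'n': '11'}
```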

Page 60: ENCODING

The letters can be encoded as follows (a decoding check appears after the list):

e : 0

n : 11

s : 101

t : 1001
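Because no codeword is a prefix of another, a bit stream built from this table decodes unambiguously. A minimal sketch (mine, not the slides'):

```python
code = {"e": "0", "n": "11", "s": "101", "t": "1001"}
decode = {v: k for k, v in code.items()}

bits = "".join(code[c] for c in "tennessee")
out, buf = [], ""
for b in bits:
    buf += b
    if buf in decode:          # safe to emit: the code is prefix-free
        out.append(decode[buf])
        buf = ""
print("".join(out))            # tennessee
```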

Page 61: Average Code Length

Average code length = Σ i (length_i × frequency_i) / Σ i frequency_i

= ( 1(4) + 2(2) + 3(2) + 4(1) ) / (4+2+2+1)

= 18 / 9 = 2

Page 62: ENTROPY

Entropy = - Σ i=1..n (p_i log2 p_i)

= - ( 0.44 × log2 0.44 + 0.22 × log2 0.22 + 0.22 × log2 0.22 + 0.11 × log2 0.11 )

≈ 1.8367
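Both this figure and the average code length can be recomputed from the raw frequencies; a small sketch using exact probabilities, which gives 1.8366 where the slide rounds to 1.8367:

```python
from math import log2

freq = {"e": 4, "n": 2, "s": 2, "t": 1}
length = {"e": 1, "n": 2, "s": 3, "t": 4}
total = sum(freq.values())                      # 9 letters in TENNESSEE

avg = sum(length[x] * freq[x] for x in freq) / total
H = -sum((f / total) * log2(f / total) for f in freq.values())
print(avg, round(H, 4))                         # 2.0 1.8366
```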

Page 63: Ordinary Huffman Coding

TENNESSEE

9
0/ \1
5 e(4)
0/ \1
s(2) 3
0/ \1
t(1) n(2)

ENCODING

E : 1

S : 00

T : 010

N : 011

Average code length = ( 1(4) + 2(2) + 3(2) + 3(1) ) / 9 = 17/9 ≈ 1.89

Page 64: SUMMARY

The average code length of ordinary Huffman coding seems to be better than that of the dynamic version in this exercise.

But in practice the performance of dynamic coding is better. The problem with static coding is that the tree has to be constructed at the transmitter and sent to the receiver. The tree may also change, because the frequency distribution of the English letters differs between plain text, a technical paper, a piece of code, etc.

Since the tree in dynamic coding is constructed at the receiver as well, it need not be sent. Considering this, dynamic coding is better.

Also, the average code length improves as the transmitted text gets longer.

Page 65: Summary of Huffman Coding Algorithm

Achieves minimal redundancy subject to the constraint that the source symbols are coded one at a time

Sorting symbols in descending order of probability is the key step in source reduction

The codeword assignment is not unique: exchanging the labels "0" and "1" at any node of the binary codeword tree produces another solution that works equally well

Only works for a source with a finite number of symbols (otherwise, it does not know where to start)