
### Transcript of 4. Huffman and Arithmetic Coding

• 7/29/2019 4. Huffman and Arithmetic Coding

1/27

Huffman and Arithmetic Coding

Coding and Its Application


Introduction

Huffman codes are instantaneous codes. They have the following properties:

Huffman codes are compact codes, i.e. they produce a code with an average length that is the smallest achievable for the given number of source symbols, code alphabet, and source statistics

Huffman codes operate by repeatedly reducing a source with q symbols to a source with r symbols, where r is the size of the code alphabet


Introduction

Consider the source S with q symbols {s_i : i = 1, 2, ..., q} and associated probabilities {P(s_i) : i = 1, 2, ..., q}

Let the symbols be renumbered so that P(s_1) ≥ P(s_2) ≥ ... ≥ P(s_q)

By combining the last r symbols of S, namely s_{q−r+1}, s_{q−r+2}, ..., s_q, into one symbol s'_{q−r+1} with probability

P(s'_{q−r+1}) = Σ_{i=q−r+1}^{q} P(s_i)

we obtain a reduced source. The trivial r-ary compact code for the reduced source with r symbols is used to design the compact code for the preceding reduced source


Binary Huffman Coding

The algorithm:

Re-order the source symbols in decreasing order of symbol probability

Reduce the source by combining the last two symbols and re-ordering the new set in decreasing order

Assign a compact code for the final reduced source. For a two-symbol source the trivial code is {0, 1}

Backtrack to the original source S, assigning a compact code at each step

Example: Consider a 5-symbol source with the following probabilities


Binary Huffman Coding

The average length is 2.2 bits/symbol

The efficiency is 96.5%
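As a concrete check, here is a minimal binary Huffman sketch in Python. The slide's probability table did not survive the transcript, so the probabilities {0.4, 0.2, 0.2, 0.1, 0.1} are an assumption, chosen because they reproduce the stated 2.2 bits/symbol and 96.5% efficiency:

```python
import heapq
from itertools import count
from math import log2

def huffman_lengths(probs):
    """Return the Huffman codeword length of each symbol."""
    tie = count()  # tie-breaker so the heap never compares the symbol lists
    heap = [(p, next(tie), [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    depth = [0] * len(probs)
    while len(heap) > 1:
        # Combine the two least probable (groups of) symbols ...
        p1, _, syms1 = heapq.heappop(heap)
        p2, _, syms2 = heapq.heappop(heap)
        for s in syms1 + syms2:
            depth[s] += 1  # ... which adds one bit to every codeword inside
        heapq.heappush(heap, (p1 + p2, next(tie), syms1 + syms2))
    return depth

probs = [0.4, 0.2, 0.2, 0.1, 0.1]  # assumed; not shown in the transcript
lengths = huffman_lengths(probs)
avg_len = sum(p * l for p, l in zip(probs, lengths))
entropy = -sum(p * log2(p) for p in probs)
print(round(avg_len, 2), round(100 * entropy / avg_len, 1))  # 2.2 96.5
```

Any valid Huffman tree for these probabilities gives the same average length, even though the individual codeword lengths can differ ({2,2,2,3,3} or {1,2,3,4,4}); this is one sense in which Huffman codes are not unique.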

Are Huffman codes unique?


r-ary Huffman Codes

Calculate (q − r)/(r − 1). If this is a non-integer value then append dummy symbols with zero probability to the source until there are

q = r + ⌈(q − r)/(r − 1)⌉ (r − 1)

symbols

Re-order the source symbols in decreasing order of symbol probability

Reduce the source S to S_1, then S_2 and so on, by combining the last r symbols of S_j into a combined symbol and re-ordering the new set of symbol probabilities for S_{j+1} in decreasing order. For each source keep track of the position of the combined symbol

Terminate the source reduction when a source with exactly r symbols is produced. For a source with q symbols (counted after padding) the reduced source with r symbols will be S_{(q−r)/(r−1)}


r-ary Huffman Codes

Assign a compact r-ary code for the final reduced source. For a source with r symbols the trivial code is {0, 1, ..., r − 1}

Backtrack to the original source S, assigning a compact code for the j-th reduced source at each step. The compact code assigned to S, minus the code words assigned to any dummy symbols, is the r-ary Huffman code


r-ary Huffman Codes

Example: we want to design a compact quaternary code for a source with 11 symbols

First, we calculate (q − r)/(r − 1) = (11 − 4)/(4 − 1) = 2.33, which is not an integer

We need to append dummy symbols, so that we have a source with q = 4 + ⌈2.33⌉ (4 − 1) = 13 symbols

The appended symbols are s_12 and s_13, with P(s_12) = P(s_13) = 0.00
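The padding step above can be written directly from the formula (a sketch, using the example's numbers):

```python
from math import ceil

q, r = 11, 4                      # 11 source symbols, quaternary code alphabet
steps = ceil((q - r) / (r - 1))   # ceil(7/3) = 3 reduction steps
q_padded = r + steps * (r - 1)    # 4 + 3*3 = 13 symbols after padding
dummies = q_padded - q            # 2 zero-probability dummy symbols
print(q_padded, dummies)          # 13 2
```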


r-ary Huffman Codes

(Figure: the quaternary Huffman code construction table for the symbols s_1 through s_13.)


Arithmetic Coding

Problems related to Huffman coding:

The size of the Huffman code table represents an exponential increase in memory and computational requirements (for the n-th extension of a source it has q^n entries)

The code table needs to be transmitted to the receiver

The source statistics are assumed stationary

Encoding and decoding are performed on a per-block basis: the code is not produced until a block of n symbols is received

One solution to using Huffman coding on increasingly larger extensions of the source is to use arithmetic coding

Arithmetic coding also needs source statistics


Arithmetic Coding

Consider the N-length source message (s_{i1}, s_{i2}, ..., s_{iN}), where {s_i : i = 1, 2, ..., q} are the source symbols and s_{ij} indicates that the j-th character in the message is the source symbol s_i

Arithmetic coding assumes that P(s_{ij} | s_{i1}, s_{i2}, ..., s_{i,j−1}) for j = 1, 2, ..., N can be calculated. Remember the Markov model!

The goal of this coding is to assign a unique interval along the unit number line, of length equal to the probability of the given source message, with its position on the number line given by the cumulative probability of the given source message, Cum(s_{i1}, s_{i2}, ..., s_{iN})


Arithmetic Coding

The basic operation of arithmetic coding is to produce this unique interval by starting with the interval [0,1) and iteratively subdividing it by P(s_{ij} | s_{i1}, s_{i2}, ..., s_{i,j−1}) for j = 1, 2, ..., N

Consider the first letter of the message, namely s_{i1}

The individual symbols are each assigned the interval [lb_i, hb_i), where

hb_i = Cum(s_i) = Σ_{k=1}^{i} P(s_k)

and

lb_i = Cum(s_i) − P(s_i) = Σ_{k=1}^{i−1} P(s_k)


Arithmetic Coding

The length of each interval is hb_i − lb_i = P(s_i), and the end of the interval is given by Cum(s_i)

The interval corresponding to the symbol s_{i1} is then selected

Next we consider the second letter of the message, s_{i2}

The individual symbols are now assigned the interval [lb_i, hb_i), where:

hb_i = lb_{i1} + Cum(s_i | s_{i1}) · P(s_{i1}) = lb_{i1} + [ Σ_{k=1}^{i} P(s_k | s_{i1}) ] · R_{i1}

lb_i = lb_{i1} + [ Cum(s_i | s_{i1}) − P(s_i | s_{i1}) ] · P(s_{i1}) = lb_{i1} + [ Σ_{k=1}^{i−1} P(s_k | s_{i1}) ] · R_{i1}

where R_{i1} = hb_{i1} − lb_{i1} = P(s_{i1})


Arithmetic Coding

The length of each interval is hb_i − lb_i = P(s_i | s_{i1}) · P(s_{i1})

The interval corresponding to the symbol s_{i2}, that is [lb_{i2}, hb_{i2}), is then selected

The length of the interval corresponding to the message seen so far, (s_{i1}, s_{i2}), is

hb_{i2} − lb_{i2} = P(s_{i2} | s_{i1}) · P(s_{i1}) = P(s_{i1}, s_{i2})
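This recursion can be checked numerically. Assuming the zero-memory 3-symbol source of the following example (P(s_1) = 0.2, P(s_2) = 0.5, P(s_3) = 0.3, so the conditional probabilities equal the marginals), encoding the first two letters s_2 s_2 gives:

```python
# Interval recursion for the first two letters of a message, zero-memory case.
P = [0.2, 0.5, 0.3]                        # P(s1), P(s2), P(s3) -- assumed example values
Cum = [sum(P[:k + 1]) for k in range(3)]   # cumulative bounds: 0.2, 0.7, 1.0

i1 = 1                                     # first letter s2 (0-based index)
lb1, hb1 = Cum[i1] - P[i1], Cum[i1]        # interval [0.2, 0.7)
R1 = hb1 - lb1                             # R_i1 = P(s2) = 0.5

i2 = 1                                     # second letter s2
hb2 = lb1 + Cum[i2] * R1                   # 0.2 + 0.7 * 0.5 = 0.55
lb2 = lb1 + (Cum[i2] - P[i2]) * R1         # 0.2 + 0.2 * 0.5 = 0.30
print(round(lb2, 6), round(hb2, 6), round(hb2 - lb2, 6))  # 0.3 0.55 0.25
```

The interval length 0.25 equals P(s_2) · P(s_2), the probability of the message seen so far, as the last formula states.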


Arithmetic Coding

Example: consider the message s_2 s_2 s_1 originating from the 3-symbol source with the following individual and cumulative probabilities

We assume that the source is zero-memory, such that P(s_{ij} | s_{i1}, s_{i2}, ..., s_{i,j−1}) = P(s_{ij})

Initially, the probability line [0,1) is divided into three intervals: [0, 0.2), [0.2, 0.7), and [0.7, 1.0), corresponding to s_1, s_2, s_3 and the length ratios 0.2 : 0.5 : 0.3


Arithmetic Coding

The first letter of the message is s_2, so the interval [0.2, 0.7) is selected

The interval [0.2, 0.7) is divided into three subintervals of length ratios 0.2 : 0.5 : 0.3, that is [0.2, 0.3), [0.3, 0.55), and [0.55, 0.7)

When the second letter s_2 is received, the subinterval [0.3, 0.55) is selected

The interval [0.3, 0.55) is subdivided into [0.3, 0.35), [0.35, 0.475), and [0.475, 0.55) with length ratios 0.2 : 0.5 : 0.3


Arithmetic Coding

When the last letter of the message, s_1, is received, the corresponding interval [0.3, 0.35) of length P(s_2, s_2, s_1) = 0.05 is selected

The final interval can be written, in binary, as [0.01001100..., 0.01011001...)

We need to select a number of significant bits that is able to represent it

It should be about −log_2 P(s_2, s_2, s_1) = −log_2 0.05 = 4.32 ≈ 4 bits


Arithmetic Coding

The graphical interpretation of the previous example


Arithmetic Coding

How does one select the number that falls within the final interval so that it can be transmitted in the least number of bits?

Let [low, high) denote the final interval

Since low < high, at the first place their binary expansions differ there will be a 0 in the expansion for low and a 1 in the expansion for high:

low = 0.a_1 a_2 ... a_{t−1} 0 ...

high = 0.a_1 a_2 ... a_{t−1} 1 ...

The value 0.a_1 a_2 ... a_{t−1} 1 is selected and transmitted as the t-bit code sequence a_1 a_2 ... a_{t−1} 1
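The selection rule is easy to implement: keep the bits shared by low and high, then append a single 1 (a sketch, not the slides' own code):

```python
def code_bits(low, high):
    """Shortest codeword 0.a1...a(t-1)1 that falls inside [low, high)."""
    bits = []
    while True:
        low, high = 2 * low, 2 * high   # shift out the next binary digit
        bl, bh = int(low), int(high)
        if bl != bh:                    # first place the expansions differ
            return ''.join(map(str, bits)) + '1'
        bits.append(bl)
        low, high = low - bl, high - bh

print(code_bits(0.3, 0.35))  # 0101  (0.0101 binary = 0.3125, inside [0.3, 0.35))
```

For the earlier example this gives 4 bits, matching the −log_2 0.05 = 4.32 ≈ 4 estimate.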


Encoding Algorithm
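The encoding-algorithm figure is not in the transcript; what follows is a minimal sketch of the interval-subdivision encoder described on the previous slides, assuming the zero-memory 3-symbol source of the worked example (names s1, s2, s3 and probabilities 0.2, 0.5, 0.3):

```python
# Minimal arithmetic encoder sketch for a zero-memory source.
probs = {'s1': 0.2, 's2': 0.5, 's3': 0.3}

cum, running = {}, 0.0
for sym, p in probs.items():      # cumulative upper bounds: 0.2, 0.7, 1.0
    running += p
    cum[sym] = running

def encode(message):
    low, high = 0.0, 1.0                             # start from [0, 1)
    for sym in message:
        width = high - low                           # R = hb - lb
        high = low + width * cum[sym]                # hb <- lb + Cum(s) * R
        low = low + width * (cum[sym] - probs[sym])  # lb <- lb + (Cum(s) - P(s)) * R
    return low, high

low, high = encode(['s2', 's2', 's1'])
print(round(low, 6), round(high, 6))  # 0.3 0.35, the interval from the example
```

Note that `high` is updated before `low`, since both updates need the old value of `low`.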



Decoding Algorithm
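The decoding-algorithm figure is likewise missing; a matching decoder sketch for the same assumed zero-memory source simply retraces which subinterval the received value falls into:

```python
# Minimal arithmetic decoder sketch for a zero-memory source.
probs = [('s1', 0.2), ('s2', 0.5), ('s3', 0.3)]

def decode(value, n_symbols):
    message = []
    for _ in range(n_symbols):
        low = 0.0
        for sym, p in probs:
            if value < low + p:            # value lies in this symbol's slot
                message.append(sym)
                value = (value - low) / p  # rescale back to [0, 1) and repeat
                break
            low += p
    return message

print(decode(0.3125, 3))  # ['s2', 's2', 's1'] -- 0.3125 is the codeword 0.0101
```

A real decoder does not know n_symbols in advance; that is what the EOF symbol in the next example (the final `.`) is for: decoding stops once the EOF symbol is produced.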



Example of Encoding

Consider this zero-memory source

Suppose that the message bad_dab. is generated, where . is the EOF symbol


Example of Encoding

Applying the algorithm yields

Binary representation for [0.434249583, 0.434250482):

So, we transmit the 16-bit value: 0110111100101011
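As a sanity check, applying the leading-bits selection rule described earlier to the interval above reproduces the transmitted codeword:

```python
# Shared leading bits of low and high, then a final 1.
low, high = 0.434249583, 0.434250482
bits = ''
while True:
    low, high = 2 * low, 2 * high
    bl, bh = int(low), int(high)
    if bl != bh:              # first bit position where the expansions differ
        bits += '1'
        break
    bits += str(bl)
    low, high = low - bl, high - bh

print(bits, len(bits))  # 0110111100101011 16
```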


Example of Decoding

Suppose we want to decode the result from the last example


Which is Better?

A question arises: which is better, arithmetic coding or Huffman coding?

Huffman coding and arithmetic coding exhibit similar performance in theory

Huffman coding on the n-th extension of a source becomes computationally prohibitive with increasing n, because the computational complexity is of the order of q^n