4. Huffman and Arithmetic Coding
7/29/2019 4. Huffman and Arithmetic Coding
1/27
Huffman and Arithmetic Coding
Coding and Its Application
Introduction
Huffman codes are instantaneous codes. They have the following properties:
Huffman codes are compact codes, i.e. they produce a code with an average length that is the smallest achievable for the given number of source symbols, code alphabet, and source statistics.
Huffman codes operate by repeatedly reducing a source with q symbols to a source with r symbols, where r is the size of the code alphabet.
Introduction
Consider the source S with q symbols s_i : i = 1, 2, …, q and associated probabilities P(s_i) : i = 1, 2, …, q.
Let the symbols be renumbered so that P(s_1) ≥ P(s_2) ≥ … ≥ P(s_q).
Combine the last r symbols of S, namely s_{q−r+1}, s_{q−r+2}, …, s_q, into one symbol s'_{q−r+1} with probability
P(s'_{q−r+1}) = Σ_{i=q−r+1}^{q} P(s_i)
The trivial r-ary compact code for the reduced source with r symbols is used to design the compact code for the preceding reduced source.
Binary Huffman Coding
The algorithm:
Re-order the source symbols in decreasing order of symbol probability.
Reduce the source by combining the last two symbols and re-ordering the new set in decreasing order.
Assign a compact code to the final reduced source. For a two-symbol source the trivial code is {0, 1}.
Backtrack to the original source S, assigning a compact code at each step.
Example: consider a 5-symbol source with the following probabilities.
Binary Huffman Coding
The average length is 2.2 bits/symbol
The efficiency is 96.5%
Are Huffman codes unique?
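Huffman codes are not unique (tie-breaking and bit-label choices differ), but every valid Huffman code for a source has the same average length. The probability table for the 5-symbol example did not survive this transcript; the sketch below assumes the probabilities {0.4, 0.2, 0.2, 0.1, 0.1}, which reproduce the stated average length of 2.2 bits/symbol, and the Python names are mine, not from the slides.

```python
import heapq

def binary_huffman(probs):
    """Repeatedly merge the two least probable symbols (the reduction step above)."""
    heap = [(p, i, [s]) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    codes = {s: "" for s in probs}
    tiebreak = len(heap)  # unique tag so tuple comparison never reaches the lists
    while len(heap) > 1:
        p0, _, group0 = heapq.heappop(heap)
        p1, _, group1 = heapq.heappop(heap)
        for s in group0:   # backtracking: prepend one bit at each merge
            codes[s] = "0" + codes[s]
        for s in group1:
            codes[s] = "1" + codes[s]
        heapq.heappush(heap, (p0 + p1, tiebreak, group0 + group1))
        tiebreak += 1
    return codes

probs = {"s1": 0.4, "s2": 0.2, "s3": 0.2, "s4": 0.1, "s5": 0.1}
codes = binary_huffman(probs)
avg = sum(probs[s] * len(codes[s]) for s in probs)
print(codes, avg)  # average length 2.2 bits/symbol
```

The individual codewords depend on how ties are broken, but the average length does not.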
r-ary Huffman Codes
Calculate α = (q − r)/(r − 1). If α is a non-integer value then append dummy symbols with zero probability to the source until there are q' = r + ⌈α⌉(r − 1) symbols.
Re-order the source symbols in decreasing order of symbol probability.
Reduce the source S to S_1, then S_2, and so on, by combining the last r symbols of S_j into a combined symbol and re-ordering the new set of symbol probabilities for S_{j+1} in decreasing order. For each source keep track of the position of the combined symbol.
Terminate the source reduction when a source with exactly r symbols is produced. For a source with q' symbols the reduced source with r symbols will be S_{(q'−r)/(r−1)}.
r-ary Huffman Codes
Assign a compact r-ary code to the final reduced source. For a source with r symbols the trivial code is {0, 1, …, r − 1}.
Backtrack to the original source S, assigning a compact code for the j-th reduced source. The compact code assigned to S, minus the code words assigned to any dummy symbols, is the r-ary Huffman code.
r-ary Huffman Codes
Example: we want to design a compact quaternary code for a source with 11 symbols.
First, we calculate α = (11 − 4)/(4 − 1) = 2.33, which is not an integer.
We need to append dummy symbols so that we have a source with q' = 4 + ⌈2.33⌉(4 − 1) = 13 symbols.
The appended symbols are s_12 and s_13, with P(s_12) = P(s_13) = 0.00.
r-ary Huffman Codes
(The original slide shows the reduction table for symbols s_1 through s_13; the probabilities and assigned code words appear only in that figure.)
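The steps above can be sketched in Python. The dummy-symbol count and the merge-r-at-a-time loop follow the algorithm as described; the 11 example probabilities below are placeholders of my own, since the slide's actual values survive only in the figure.

```python
import heapq

def rary_huffman(probs, r):
    """r-ary Huffman coding with zero-probability dummy symbols."""
    items = list(probs.items())
    # append dummies until (q' - r) is a multiple of (r - 1)
    n_dummy = (-(len(items) - r)) % (r - 1)
    items += [(f"dummy{k}", 0.0) for k in range(n_dummy)]
    heap = [(p, i, [s]) for i, (s, p) in enumerate(items)]
    heapq.heapify(heap)
    codes = {s: "" for s, _ in items}
    tiebreak = len(heap)
    while len(heap) > 1:
        total, merged = 0.0, []
        for digit in range(r):  # combine the last (least probable) r symbols
            p, _, group = heapq.heappop(heap)
            for s in group:
                codes[s] = str(digit) + codes[s]
            total += p
            merged += group
        heapq.heappush(heap, (total, tiebreak, merged))
        tiebreak += 1
    # drop the code words assigned to dummy symbols
    return {s: c for s, c in codes.items() if not s.startswith("dummy")}

# hypothetical 11-symbol source; 2 dummies are added internally (q' = 13)
probs = {f"s{i}": p for i, p in enumerate(
    [0.20, 0.15, 0.12, 0.10, 0.10, 0.08, 0.08, 0.07, 0.05, 0.03, 0.02], 1)}
codes = rary_huffman(probs, 4)
print(codes)
```

For q = 11 and r = 4 the reduction proceeds 13 → 10 → 7 → 4 symbols, exactly as the termination rule requires.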
Arithmetic Coding
Problems related to Huffman coding:
The size of the Huffman code table represents an exponential increase in memory and computational requirements.
The code table needs to be transmitted to the receiver.
The source statistics are assumed stationary.
Encoding and decoding are performed on a per-block basis: the code is not produced until a block of n symbols is received.
One solution to using Huffman coding on increasingly larger extensions of the source is arithmetic coding.
Arithmetic coding also needs source statistics.
Arithmetic Coding
Consider the N-length source message (s_{i1}, s_{i2}, …, s_{iN}), where s_i : i = 1, 2, …, q are the source symbols and s_{ij} indicates that the j-th character in the message is the source symbol s_i.
Arithmetic coding assumes that P(s_{ij} | s_{i1}, s_{i2}, …, s_{i,j−1}) for j = 1, 2, …, N can be calculated. Remember the Markov model!
The goal of this coding is to assign to the given source message a unique interval along the unit number line, of length equal to the probability of the message, with its position on the number line given by the cumulative probability of the message, Cum(s_{i1}, s_{i2}, …, s_{iN}).
Arithmetic Coding
The basic operation of arithmetic coding is to produce this unique interval by starting with the interval [0, 1) and iteratively subdividing it by P(s_{ij} | s_{i1}, s_{i2}, …, s_{i,j−1}) for j = 1, 2, …, N.
Consider the first letter of the message, namely s_{i1}. The individual symbols s_i are each assigned the interval [lb_i, hb_i), where
hb_i = Cum(s_i) = Σ_{k=1}^{i} P(s_k)
and
lb_i = Cum(s_i) − P(s_i) = Σ_{k=1}^{i−1} P(s_k)
Arithmetic Coding
The length of each interval is hb_i − lb_i = P(s_i), and the end of the interval is given by Cum(s_i).
The interval corresponding to the symbol s_{i1} is then selected.
Next we consider the second letter of the message, s_{i2}. The individual symbols s_i are now assigned the interval [lb_i, hb_i), where:
hb_i = lb_{i1} + Cum(s_i | s_{i1}) · P(s_{i1}) = lb_{i1} + [Σ_{k=1}^{i} P(s_k | s_{i1})] · R_{i1}
lb_i = lb_{i1} + [Cum(s_i | s_{i1}) − P(s_i | s_{i1})] · P(s_{i1}) = lb_{i1} + [Σ_{k=1}^{i−1} P(s_k | s_{i1})] · R_{i1}
where R_{i1} = hb_{i1} − lb_{i1} = P(s_{i1}).
Arithmetic Coding
The length of each interval is hb_i − lb_i = P(s_i | s_{i1}) · P(s_{i1}).
The interval corresponding to the symbol s_{i2}, that is [lb_{i2}, hb_{i2}), is then selected.
The length of the interval corresponding to the message seen so far, (s_{i1}, s_{i2}), is
hb_{i2} − lb_{i2} = P(s_{i2} | s_{i1}) · P(s_{i1}) = P(s_{i1}, s_{i2})
Arithmetic Coding
Example: consider the message (s_2, s_2, s_1) originating from the 3-symbol source with the following individual and cumulative probabilities.
We assume that the source is zero-memory, such that P(s_{ij} | s_{i1}, s_{i2}, …, s_{i,j−1}) = P(s_{ij}).
Initially, the probability line [0, 1) is divided into three intervals: [0, 0.2), [0.2, 0.7), and [0.7, 1.0), corresponding to s_1, s_2, s_3, with the length ratios 0.2 : 0.5 : 0.3.
Arithmetic Coding
The first letter of the message is s_2, so the interval [0.2, 0.7) is selected.
The interval [0.2, 0.7) is divided into three subintervals with length ratios 0.2 : 0.5 : 0.3, that is [0.2, 0.3), [0.3, 0.55), and [0.55, 0.7).
When the second letter s_2 is received, the subinterval [0.3, 0.55) is selected.
The interval [0.3, 0.55) is subdivided into [0.3, 0.35), [0.35, 0.475), and [0.475, 0.55), with length ratios 0.2 : 0.5 : 0.3.
Arithmetic Coding
When the last letter of the message, s_1, is received, the corresponding interval [0.3, 0.35), of length P(s_2, s_2, s_1) = 0.05, is selected.
The final interval can be written, in binary, as [0.01001100, 0.01011001).
We need to select a number of significant bits able to represent it. This should be about
−log2 P(s_2, s_2, s_1) = −log2 0.05 = 4.32 ≈ 4 bits
Arithmetic Coding
(The original slide gives a graphical interpretation of the previous example.)
Arithmetic Coding
How does one select the number that falls within the final interval so that it can be transmitted in the least number of bits?
Let [low, high) denote the final interval.
Since low < high, at the first place where their binary expansions differ there will be a 0 in the expansion of low and a 1 in the expansion of high:
low = 0.a_1 a_2 … a_{t−1} 0 …
high = 0.a_1 a_2 … a_{t−1} 1 …
The number 0.a_1 a_2 … a_{t−1} 1 is selected and transmitted as the t-bit code sequence a_1 a_2 … a_{t−1} 1.
Encoding Algorithm
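The encoding-algorithm pseudocode on these slides survives only in the original figures. As a substitute, here is a minimal Python sketch under the zero-memory assumption, following the interval-subdivision and bit-selection rules described above; the function and symbol names are my own, and it is applied to the earlier three-symbol example with P(s_1) = 0.2, P(s_2) = 0.5, P(s_3) = 0.3.

```python
def arith_encode(message, probs, order):
    """Subdivide [0, 1) once per symbol (zero-memory model)."""
    low, high = 0.0, 1.0
    for sym in message:
        rng = high - low
        cum_lo = sum(probs[s] for s in order[:order.index(sym)])
        high = low + (cum_lo + probs[sym]) * rng  # new upper bound hb
        low = low + cum_lo * rng                  # new lower bound lb
    return low, high

def select_code(low, high):
    """Emit shared leading bits until low and high first differ, then append a 1."""
    bits = []
    while True:
        low, high = 2 * low, 2 * high
        bl, bh = int(low), int(high)
        if bl == bh:
            bits.append(str(bl))
            low, high = low - bl, high - bh
        else:
            bits.append("1")
            return "".join(bits)

probs = {"s1": 0.2, "s2": 0.5, "s3": 0.3}
lo, hi = arith_encode(["s2", "s2", "s1"], probs, ["s1", "s2", "s3"])
print(lo, hi)               # final interval, approximately [0.3, 0.35)
print(select_code(lo, hi))  # 4-bit code: 0101
```

The transmitted code 0101 represents 0.0101 in binary = 0.3125, which indeed falls inside [0.3, 0.35).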
Decoding Algorithm
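The decoding pseudocode is likewise only in the original figures. A minimal sketch under the same zero-memory assumption is below; the names are mine, and for simplicity it stops after a known message length rather than on an EOF symbol as the later slides do.

```python
def arith_decode(bits, probs, order, length):
    """Locate the received value in nested subintervals, emitting one symbol per level."""
    value = sum(int(b) / 2 ** (i + 1) for i, b in enumerate(bits))
    low, high = 0.0, 1.0
    message = []
    for _ in range(length):
        rng = high - low
        cum = 0.0
        for s in order:
            lb = low + cum * rng
            hb = lb + probs[s] * rng
            if lb <= value < hb:   # value falls in this symbol's subinterval
                message.append(s)
                low, high = lb, hb
                break
            cum += probs[s]
    return message

probs = {"s1": 0.2, "s2": 0.5, "s3": 0.3}
print(arith_decode("0101", probs, ["s1", "s2", "s3"], 3))  # ['s2', 's2', 's1']
```

The decoder mirrors the encoder: it re-runs the same subdivision and, at each step, reads off the symbol whose subinterval contains the received value.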
Example of Encoding
Consider this zero-memory source (its probability table appears in the original slide).
Suppose that the message bad_dab. is generated, where . is the EOF symbol.
Example of Encoding
Applying the algorithm yields the final interval [0.434249583, 0.434250482); the step-by-step table appears in the original slide.
From the binary representation of this interval, we transmit the 16-bit value 0110111100101011.
Example of Decoding
Suppose we want to decode the result from the last example.
Which is Better?
A question arises: which is better, arithmetic coding or Huffman coding?
Huffman coding and arithmetic coding exhibit similar performance in theory.
Huffman coding becomes computationally prohibitive with increasing n, because coding the n-th extension of the source has complexity of the order of q^n.