4. Huffman and Arithmetic Coding


    Huffman and Arithmetic Coding

    Coding and Its Application


    Introduction

    Huffman codes can be classified as instantaneous codes. They have the following property:

    Huffman codes are compact codes, i.e. they produce a code with an average length which is the smallest possible to achieve for the given number of source symbols, code alphabet, and source statistics

    Huffman codes operate by reducing a source with q symbols to a source with r symbols, where r is the size of the code alphabet


    Introduction

    Consider the source S with q symbols $\{s_i : i = 1, 2, \ldots, q\}$ and associated probabilities $\{P(s_i) : i = 1, 2, \ldots, q\}$

    Let the symbols be renumbered so that $P(s_1) \ge P(s_2) \ge \cdots \ge P(s_q)$

    By combining the last r symbols of S, $s_{q-r+1}, s_{q-r+2}, \ldots, s_q$, into one symbol $s'_{q-r+1}$ with probability $P(s'_{q-r+1}) = \sum_{i=q-r+1}^{q} P(s_i)$, a reduced source is obtained

    The trivial r-ary compact code for the reduced source with r symbols is used to design the compact code for the preceding reduced source


    Binary Huffman Coding

    The algorithm:

    Re-order the source symbols in decreasing order of symbol probability

    Reduce the source by combining the last two symbols and re-ordering the new set in decreasing order

    Assign a compact code for the final reduced source. For a two-symbol source the trivial code is {0, 1}

    Backtrack to the original source S, assigning a compact code

    Example: Consider a 5-symbol source with the following probabilities


    Binary Huffman Coding

    The average length is 2.2 bits/symbol

    The efficiency is 96.5%

    Are Huffman codes unique?
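
    As an illustration of the algorithm, the sketch below builds a binary Huffman code in Python. The probabilities {0.4, 0.2, 0.2, 0.1, 0.1} and the function name are assumptions made for illustration; they are consistent with the quoted average length of 2.2 bits/symbol and efficiency of 96.5%. The tie-breaking choices in the merging step also hint at the answer to the uniqueness question: different merge orders give different, equally compact codes.

        import heapq

        def binary_huffman(probs):
            """Binary Huffman coding: repeatedly merge the two least probable
            entries, then read the codewords back off the merge tree."""
            # Heap entries are (probability, tie-breaker, node); a node is either
            # a symbol index or a pair of child nodes.
            heap = [(p, i, i) for i, p in enumerate(probs)]
            heapq.heapify(heap)
            counter = len(probs)
            while len(heap) > 1:
                p1, _, n1 = heapq.heappop(heap)      # least probable
                p2, _, n2 = heapq.heappop(heap)      # second least probable
                heapq.heappush(heap, (p1 + p2, counter, (n1, n2)))
                counter += 1
            codes = {}
            def assign(node, prefix):
                if isinstance(node, tuple):          # internal node: branch 0 / 1
                    assign(node[0], prefix + "0")
                    assign(node[1], prefix + "1")
                else:                                # leaf: original symbol index
                    codes[node] = prefix or "0"
            assign(heap[0][2], "")
            return codes

        probs = [0.4, 0.2, 0.2, 0.1, 0.1]            # assumed example probabilities
        codes = binary_huffman(probs)
        avg = sum(p * len(codes[i]) for i, p in enumerate(probs))
        print(codes)   # one of several equally compact codes (hence not unique)
        print(avg)     # 2.2 bits/symbol (up to floating-point rounding)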


    r-ary Huffman Codes

    Calculate $(q-r)/(r-1)$. If $(q-r)/(r-1)$ is a non-integer value then append dummy symbols with zero probability to the source until there are $q' = r + \lceil (q-r)/(r-1) \rceil (r-1)$ symbols

    Re-order the source symbols in decreasing order of symbol probability

    Reduce the source S to $S_1$, then $S_2$, and so on, by combining the last r symbols of $S_{j-1}$ into a combined symbol and re-ordering the new set of symbol probabilities for $S_j$ in decreasing order. For each source keep track of the position of the combined symbol

    Terminate the source reduction when a source with exactly r symbols is produced. For a source with q symbols the reduced source with r symbols will be $S_{(q-r)/(r-1)}$


    r-ary Huffman Codes

    Assign a compact r-ary code for the final reduced source. For a source with r symbols the trivial code is $\{0, 1, \ldots, r-1\}$

    Backtrack to the original source S, assigning a compact code for the j-th reduced source. The compact code assigned to S, minus the code words assigned to any dummy symbols, is the r-ary Huffman code


    r-ary Huffman Codes

    Example: we want to design a compact quaternary code for a source with 11 symbols

    First, we calculate $(q-r)/(r-1) = (11-4)/(4-1) = 2.33$, which is not an integer

    We need to append dummy symbols, so that we have a source with $q' = 4 + \lceil 2.33 \rceil (4-1) = 13$ symbols

    The appended symbols are $s_{12}$ and $s_{13}$, with $P(s_{12}) = P(s_{13}) = 0.00$


    r-ary Huffman Codes

    (Reduction table for the quaternary Huffman code over the symbols $s_1, s_2, \ldots, s_{13}$.)
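
    A minimal Python sketch of the same procedure, generalised to an r-ary alphabet. The padding rule follows the formula above; the function name and the 11 probabilities used in the call are assumptions made only for illustration, since the slide's probability table is not part of the text.

        import heapq, math

        def rary_huffman(probs, r):
            """r-ary Huffman coding: pad with zero-probability dummy symbols so the
            reductions end with exactly r symbols, then repeatedly merge the r
            least probable entries."""
            q = len(probs)
            # Padding rule: q' = r + ceil((q - r)/(r - 1)) * (r - 1).
            q_prime = r + math.ceil((q - r) / (r - 1)) * (r - 1)
            padded = list(probs) + [0.0] * (q_prime - q)
            heap = [(p, i, i) for i, p in enumerate(padded)]
            heapq.heapify(heap)
            counter = len(padded)
            while len(heap) > 1:
                group = [heapq.heappop(heap) for _ in range(r)]   # r least probable
                heapq.heappush(heap, (sum(g[0] for g in group), counter,
                                      tuple(g[2] for g in group)))
                counter += 1
            codes = {}
            def assign(node, prefix):
                if isinstance(node, tuple):          # internal node: digits 0..r-1
                    for digit, child in enumerate(node):
                        assign(child, prefix + str(digit))
                elif node < q:                       # drop codewords of dummy symbols
                    codes[node] = prefix
            assign(heap[0][2], "")
            return codes

        # q = 11, r = 4 as in the example: 2 dummy symbols are appended (q' = 13).
        example = [0.20, 0.15, 0.12, 0.10, 0.10, 0.08,
                   0.07, 0.07, 0.05, 0.04, 0.02]     # assumed probabilities
        print(rary_huffman(example, 4))              # 11 quaternary codewords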


    Arithmetic Coding

    Problems related to Huffman coding:

    The size of the Huffman code table represents an exponential increase in memory and computational requirements

    The code table needs to be transmitted to the receiver

    The source statistics are assumed stationary

    Encoding and decoding are performed on a per-block basis: the code is not produced until a block of n symbols is received

    One solution to the problems of using Huffman coding on increasingly larger extensions of the source is arithmetic coding

    Arithmetic coding also needs source statistics


    Arithmetic Coding

    Consider the N-length source message $s_{i_1}, s_{i_2}, \ldots, s_{i_N}$, where $\{s_i : i = 1, 2, \ldots, q\}$ are the source symbols and $i_j$ indicates that the j-th character in the message is the source symbol $s_{i_j}$

    Arithmetic coding assumes that $P(s_{i_j} \mid s_{i_1}, s_{i_2}, \ldots, s_{i_{j-1}})$ for $j = 1, 2, \ldots, N$ can be calculated. Remember the Markov model!

    The goal of this coding is to assign a unique interval along the unit number line of length equal to the probability of the given source message, with its position on the number line given by the cumulative probability of the given source message, $\mathrm{Cum}(s_{i_1}, s_{i_2}, \ldots, s_{i_N})$


    Arithmetic Coding

    The basic operation of arithmetic coding is to produce this unique interval by starting with the interval [0,1) and iteratively subdividing it by $P(s_{i_j} \mid s_{i_1}, s_{i_2}, \ldots, s_{i_{j-1}})$ for $j = 1, 2, \ldots, N$

    Consider the first letter of the message, namely $s_{i_1}$. The individual symbols are each assigned the interval $[lb_i, hb_i)$ where

    $hb_i = \mathrm{Cum}(s_i) = \sum_{k=1}^{i} P(s_k)$

    and

    $lb_i = \mathrm{Cum}(s_i) - P(s_i) = \sum_{k=1}^{i-1} P(s_k)$
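
    A small check of these formulas in Python, using the probabilities 0.2, 0.5 and 0.3 that appear in the worked example a few slides further on (the function name is just for illustration):

        def first_letter_intervals(probs):
            """[lb_i, hb_i) for the first letter: hb_i = Cum(s_i), lb_i = Cum(s_i) - P(s_i)."""
            intervals, cum = [], 0.0
            for p in probs:
                intervals.append((cum, cum + p))     # (lb_i, hb_i)
                cum += p
            return intervals

        print(first_letter_intervals([0.2, 0.5, 0.3]))
        # [(0.0, 0.2), (0.2, 0.7), (0.7, 1.0)]  (up to floating-point rounding)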


    Arithmetic Coding

    The length of each interval is $hb_i - lb_i = P(s_i)$ and the end of the interval is given by $\mathrm{Cum}(s_i)$

    The interval corresponding to the symbol $s_{i_1}$ is then selected

    Next we consider the second letter of the message, $s_{i_2}$

    The individual symbols are now assigned the interval $[lb_i, hb_i)$ where:

    $hb_i = lb_{i_1} + \mathrm{Cum}(s_i \mid s_{i_1}) \cdot P(s_{i_1}) = lb_{i_1} + \sum_{k=1}^{i} P(s_k \mid s_{i_1}) \cdot R_1$

    $lb_i = lb_{i_1} + \left[ \mathrm{Cum}(s_i \mid s_{i_1}) - P(s_i \mid s_{i_1}) \right] \cdot P(s_{i_1}) = lb_{i_1} + \sum_{k=1}^{i-1} P(s_k \mid s_{i_1}) \cdot R_1$

    where $R_1 = hb_{i_1} - lb_{i_1} = P(s_{i_1})$


    Arithmetic Coding

    The length of each interval is $hb_i - lb_i = P(s_i \mid s_{i_1}) \cdot P(s_{i_1})$

    The interval corresponding to the symbol $s_{i_2}$, that is $[lb_{i_2}, hb_{i_2})$, is then selected

    The length of the interval corresponding to the message seen so far, $(s_{i_1}, s_{i_2})$, is $hb_{i_2} - lb_{i_2} = P(s_{i_2} \mid s_{i_1}) \cdot P(s_{i_1}) = P(s_{i_1}, s_{i_2})$


    Arithmetic Coding

    Example: consider the message $s_2 s_2 s_1$ originating from the 3-symbol source with the following individual and cumulative probabilities: $P(s_1) = 0.2$, $P(s_2) = 0.5$, $P(s_3) = 0.3$ and $\mathrm{Cum}(s_1) = 0.2$, $\mathrm{Cum}(s_2) = 0.7$, $\mathrm{Cum}(s_3) = 1.0$

    We assume that the source is zero-memory, such that $P(s_{i_j} \mid s_{i_1}, s_{i_2}, \ldots, s_{i_{j-1}}) = P(s_{i_j})$

    Initially, the probability line [0,1) is divided into three intervals: [0,0.2), [0.2,0.7), and [0.7,1.0), corresponding to $s_1, s_2, s_3$ and the length ratios 0.2 : 0.5 : 0.3


    Arithmetic Coding

    The first letter of the message is $s_2$, so the interval [0.2,0.7) is selected

    The interval [0.2,0.7) is divided into three subintervals of length ratios 0.2 : 0.5 : 0.3, that is [0.2,0.3), [0.3,0.55), and [0.55,0.7)

    When the second letter $s_2$ is received, the subinterval [0.3,0.55) is selected

    The interval [0.3,0.55) is subdivided into [0.3,0.35), [0.35,0.475), and [0.475,0.55) with length ratios 0.2 : 0.5 : 0.3


    Arithmetic Coding

    When the last letter $s_1$ of the message is received, the corresponding interval [0.3,0.35) of length $P(s_2, s_2, s_1) = 0.05$ is selected

    The final interval can be written, in binary, as [0.01001100, 0.01011001)

    We need to select a number of significant bits that is able to represent it

    It is supposed to be $-\log_2 P(s_2, s_2, s_1) = -\log_2 0.05 = 4.32 \approx 4$ bits
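
    The interval arithmetic of this example can be reproduced with a few lines of Python, applying the subdivision recursion from the earlier slides to the zero-memory source with $P(s_1) = 0.2$, $P(s_2) = 0.5$, $P(s_3) = 0.3$ (the function name is just for illustration):

        def encode_intervals(message, probs):
            """Iteratively subdivide [0, 1) as in the recursion above
            (zero-memory source); 'message' holds 0-based symbol indices."""
            cum = [0.0]
            for p in probs:
                cum.append(cum[-1] + p)              # Cum(s_1), ..., Cum(s_q)
            low, high = 0.0, 1.0
            for i in message:
                span = high - low                    # current interval length R
                high = low + cum[i + 1] * span
                low = low + cum[i] * span
                print((low, high))
            return low, high

        # Message s2 s2 s1 with P = 0.2, 0.5, 0.3:
        encode_intervals([1, 1, 0], [0.2, 0.5, 0.3])
        # prints (0.2, 0.7), (0.3, 0.55), (0.3, 0.35) -- length 0.05, as above
        # (up to floating-point rounding)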


    Arithmetic Coding

    The graphical interpretation of the previous example


    Arithmetic Coding

    How does one select the number that falls within the final interval so that it can be transmitted in the least number of bits?

    Let [low, high) denote the final interval

    Since low < high, at the first place where they differ there will be a 0 in the expansion for low and a 1 in the expansion for high:

    $low = 0.a_1 a_2 \ldots a_{t-1} 0 \ldots$
    $high = 0.a_1 a_2 \ldots a_{t-1} 1 \ldots$

    The number $0.a_1 a_2 \ldots a_{t-1} 1$ is selected and transmitted as the t-bit code sequence $a_1 a_2 \ldots a_{t-1} 1$
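
    This rule can be sketched in Python as follows; applied to the final interval [0.3, 0.35) of the earlier example it produces the 4-bit sequence 0101, i.e. 0.0101 in binary = 0.3125, which indeed lies inside the interval (the function name is just for illustration):

        def interval_to_bits(low, high):
            """Emit the common leading bits a1..a_{t-1} of low and high, then a 1,
            following the selection rule above (assumes 0 <= low < high < 1)."""
            bits = ""
            while True:
                low, high = 2 * low, 2 * high
                bl, bh = int(low), int(high)         # next bit of each expansion
                if bl != bh:                         # first place where they differ
                    return bits + "1"
                bits += str(bl)
                low -= bl
                high -= bh

        print(interval_to_bits(0.3, 0.35))           # '0101'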


    Encoding Algorithm

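    Putting the interval subdivision and the bit-selection rule together gives a complete, if simplified, encoder for a zero-memory source. This is a sketch of the general idea rather than the algorithm shown on the slide; the function name is an assumption for illustration.

        def arithmetic_encode(message, probs):
            """Sketch of a zero-memory arithmetic encoder: narrow [0, 1) per symbol,
            then emit the common prefix bits of the final interval plus a 1
            (assumes the final interval lies strictly inside [0, 1))."""
            cum = [0.0]
            for p in probs:
                cum.append(cum[-1] + p)
            low, high = 0.0, 1.0
            for i in message:
                span = high - low
                low, high = low + cum[i] * span, low + cum[i + 1] * span
            bits = ""
            while int(2 * low) == int(2 * high):     # common leading bit
                bit = int(2 * low)
                bits += str(bit)
                low, high = 2 * low - bit, 2 * high - bit
            return bits + "1"                        # a1 .. a_{t-1} followed by 1

        print(arithmetic_encode([1, 1, 0], [0.2, 0.5, 0.3]))   # '0101'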


    Decoding Algorithm

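    For a zero-memory source the core decoding operation is the inverse of the encoder's: find which symbol's subinterval contains the received value, emit that symbol, rescale, and repeat. Below is a minimal sketch of this idea (not necessarily the algorithm on the slide; the function name is just for illustration), checked against the earlier 3-symbol example.

        def decode_value(value, probs, n_symbols):
            """Sketch of arithmetic decoding for a zero-memory source: locate the
            subinterval containing 'value', emit the symbol, rescale, repeat."""
            cum = [0.0]
            for p in probs:
                cum.append(cum[-1] + p)
            out = []
            for _ in range(n_symbols):
                # Find i with Cum(s_i) - P(s_i) <= value < Cum(s_i).
                i = next(k for k in range(len(probs)) if value < cum[k + 1])
                out.append(i)
                value = (value - cum[i]) / probs[i]  # rescale to [0, 1)
            return out

        # 0.0101 in binary = 0.3125, the 4-bit value from the 3-symbol example:
        print(decode_value(0.3125, [0.2, 0.5, 0.3], 3))   # [1, 1, 0] -> s2 s2 s1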


    Example of Encoding

    Consider this zero-memory source

    Suppose that the message bad_dab. is generated, where . is the EOF


    Example of Encoding

    Applying the algorithm, we obtain:

    Binary representation for [0.434249583, 0.434250482):

    So, we transmit the 16-bit value 0110111100101011


    Example of Decoding

    Suppose we want to decode the result from the last example


    Which is Better?

    A question arises: which is better, arithmetic coding or Huffman coding?

    Huffman coding and arithmetic coding exhibit similar performance in theory

    Huffman coding becomes computationally prohibitive with increasing n because the computational complexity is of the order of $q^n$