# Lecture7: Huffman Code

Source: eee.guc.edu.eg/Courses/Communications/COMM901

SOURCE CODING PROF. A.M.ALLAM

Huffman Code Application

Lossless Image Compression

A simple application of Huffman coding to image compression is the generation of a Huffman code for the set of values that any pixel may take. For monochrome images, this set usually consists of the integers from 0 to 255.


The original (uncompressed) image representation uses 8 bits/pixel. The image consists of 256 rows of 256 pixels, so the uncompressed representation uses 65,536 bytes.

Steps for lossless image compression:

1. Generate a Huffman code for the image.
2. Encode the image using the Huffman code.
3. Save it in a file.
4. Determine the compression ratio: number of bytes (uncompressed) / number of bytes (compressed).

Notes:

1. The number of bytes in the compressed representation includes the number of bytes needed to store the Huffman code itself.
2. The compression ratio is different for different images.
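As a rough illustration of the steps above, the following Python sketch builds a Huffman code over pixel values and computes the compression ratio. The image here is hypothetical (uniform random pixels), so the ratio comes out close to 1; real images, whose pixel statistics are skewed, compress better.

```python
import heapq
import random
from collections import Counter

def huffman_lengths(freqs):
    """Codeword length per symbol for a Huffman code built from frequencies."""
    heap = [(f, i, [s]) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    lengths = {s: 0 for s in freqs}
    uid = len(heap)                          # tie-breaker for the heap
    while len(heap) > 1:
        f1, _, syms1 = heapq.heappop(heap)   # merge the two least-frequent nodes
        f2, _, syms2 = heapq.heappop(heap)
        for s in syms1 + syms2:              # every symbol under the merge gains a bit
            lengths[s] += 1
        heapq.heappush(heap, (f1 + f2, uid, syms1 + syms2))
        uid += 1
    return lengths

# hypothetical 256x256 monochrome image; random pixels stand in for real data
random.seed(0)
pixels = [random.randint(0, 255) for _ in range(256 * 256)]

freqs = Counter(pixels)
lengths = huffman_lengths(freqs)
compressed_bits = sum(freqs[s] * lengths[s] for s in freqs)
ratio = (len(pixels) * 8) / compressed_bits
print(f"compression ratio: {ratio:.2f}")
```

Note that the ratio printed here ignores the cost of storing the code itself, which the compressed file size in the notes above would include.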


| Image Name | Bits/Pixel | Total Size (B) | Compression Ratio |
|---|---|---|---|
| Sena | 7.01 | 57,504 | 1.14 |
| Sensin | 7.49 | 61,430 | 1.07 |
| Earth | 4.94 | 40,534 | 1.62 |
| Omaha | 7.12 | 58,374 | 1.12 |

Huffman (lossless JPEG) compression based on pixel values

| Image Name | Bits/Pixel | Total Size (B) | Compression Ratio |
|---|---|---|---|
| Sena | 4.02 | 32,968 | 1.99 |
| Sensin | 4.70 | 38,541 | 1.70 |
| Earth | 4.13 | 33,880 | 1.93 |
| Omaha | 6.42 | 52,643 | 1.24 |

Huffman compression based on the difference between each pixel value and its neighbor


| Image Name | Bits/Pixel | Total Size (B) | Compression Ratio |
|---|---|---|---|
| Sena | 3.93 | 32,261 | 2.03 |
| Sensin | 4.63 | 37,896 | 1.73 |
| Earth | 4.82 | 39,504 | 1.66 |
| Omaha | 6.39 | 52,321 | 1.25 |

Huffman compression based on pixel difference values and an adaptive model

In the end, the particular application will determine which approach is more suitable. Notice that there is little difference between the performance of the adaptive Huffman coder and the static Huffman coder. The fact that the adaptive Huffman coder can be used as an online or real-time coder makes it a more attractive option in many applications. However, the adaptive Huffman coder is more susceptible to errors and may also be more difficult to implement.


Text Compression

- The probabilities in the left table are the probabilities of the 26 letters obtained from the U.S. Constitution and are representative of English text.
- The probabilities in the right table were obtained by counting the frequency of occurrence of letters in an earlier version of this chapter.
- While the two documents are substantially different, the two sets of probabilities are very much alike.
- Text compression seems natural for Huffman coding: in text we have a discrete alphabet that, in a given class, has relatively stationary probabilities.


Audio Compression

Another class of data that is very suitable for compression is CD-quality audio data. The audio signal for each stereo channel is sampled at 44.1 kHz, and each sample is represented by 16 bits. The three segments used in this example represent a wide variety of audio material, from symphonic pieces to popular music.

| File Name | Original File Size (bytes) | Entropy (bits) | Estimated Compressed File Size (bytes) | Compression Ratio |
|---|---|---|---|---|
| Mozart | 939,862 | 12.8 | 725,420 | 1.30 |
| Cohn | 402,442 | 13.8 | 349,300 | 1.15 |
| Mir | 884,020 | 13.7 | 759,540 | 1.16 |


Tunstall Code

It is clear that the Huffman code encodes letters from the source alphabet using codewords with varying numbers of bits: codewords with fewer bits for letters that occur more frequently, and codewords with more bits for letters that occur less frequently. It is a fixed-to-variable mapping. On the other hand, errors in codewords propagate: an error in one codeword will cause a series of errors to occur.

The Tunstall code instead encodes groups of letters (phrases) from the source into codewords of equal length. It is a variable-to-fixed mapping.

Algorithm

Given: an alphabet of size N.
Required: a Tunstall code of L bits for a given pmf.

- Start with the N letters of the source alphabet.
- Remove the entry with the highest probability.
- Add the N strings obtained by concatenating this letter with every letter in the alphabet (including itself); this increases the size from N to N + (N − 1).
- Calculate the probabilities of the new entries.
- Select the entry with the highest probability and repeat until the size reaches 2^L, i.e., the expansion is repeated k times, where k is the largest integer with

  N + k(N − 1) ≤ 2^L

Ex: S = {A, B, C}, P(A) = 0.6, P(B) = 0.3, P(C) = 0.1, L = 3 bits

Start:

| entry | P |
|---|---|
| A | 0.6 |
| B | 0.3 |
| C | 0.1 |

k = 1 (expand A, the most probable entry):

| entry | P |
|---|---|
| B | 0.3 |
| C | 0.1 |
| AA | 0.36 |
| AB | 0.18 |
| AC | 0.06 |

k = 2 (expand AA):

| entry | P |
|---|---|
| B | 0.3 |
| AAA | 0.216 |
| AB | 0.18 |
| AAB | 0.108 |
| C | 0.1 |
| AC | 0.06 |
| AAC | 0.036 |

A third expansion would give 9 entries, exceeding 2^3 = 8, so we stop and assign the 3-bit codewords:

| entry | code |
|---|---|
| AAC | 000 |
| AC | 001 |
| C | 010 |
| AAB | 011 |
| AB | 100 |
| AAA | 101 |
| B | 110 |
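The expansion loop above can be sketched in Python (a minimal illustration; the function name `tunstall` and the phrase-to-codeword assignment by sorted order are my own choices, and ties between equal probabilities are broken arbitrarily):

```python
def tunstall(probs, L):
    """Build a Tunstall code: variable-length source phrases mapped to
    fixed-length L-bit codewords.  probs maps each letter to its probability."""
    N = len(probs)
    table = dict(probs)                       # start with the N single letters
    # expand while another expansion still fits in the 2^L available codewords
    while len(table) + (N - 1) <= 2 ** L:
        best = max(table, key=table.get)      # remove the most probable entry
        p = table.pop(best)
        for letter, q in probs.items():       # concatenate it with every letter
            table[best + letter] = p * q
    # assign L-bit codewords to the surviving phrases
    return {ph: format(i, f"0{L}b") for i, ph in enumerate(sorted(table))}

code = tunstall({"A": 0.6, "B": 0.3, "C": 0.1}, L=3)
for phrase, cw in sorted(code.items(), key=lambda kv: kv[1]):
    print(phrase, cw)
```

Running this on the example alphabet yields the same seven phrases as the tables above (AAA, AAB, AAC, AB, AC, B, C), each with a distinct 3-bit codeword.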

That is, the Tunstall code encodes into binary codewords of fixed length L, making 2^L source phrases that are as nearly equally probable as we can. If the final source phrases are not nearly equally probable, you can apply a Huffman code to the phrases instead of the fixed-length codewords: a Tunstall/Huffman code.

| entry | P | code |
|---|---|---|
| B | 0.3 | 00 |
| AAA | 0.216 | 10 |
| AB | 0.18 | 010 |
| AAB | 0.108 | 011 |
| C | 0.1 | 110 |
| AC | 0.06 | 1110 |
| AAC | 0.036 | 1111 |

The average codeword length is

L̄ = 0.036×4 + 0.06×4 + 0.1×3 + 0.108×3 + 0.18×3 + 0.216×2 + 0.3×2 = 2.58 bits < 3 bits (the Tunstall code length)

(Figure: Huffman tree for the seven phrases; the successive merges produce internal-node probabilities 0.096, 0.196, 0.288, 0.412, 0.588, and finally 1, with 0/1 labels on the branches.)
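A quick check of the average-length computation above, using the phrase probabilities and the Tunstall/Huffman codewords from the table:

```python
# Tunstall/Huffman codewords and phrase probabilities from the table above
code = {"AAC": "1111", "AC": "1110", "C": "110", "AAB": "011",
        "AB": "010", "AAA": "10", "B": "00"}
prob = {"AAC": 0.036, "AC": 0.06, "C": 0.1, "AAB": 0.108,
        "AB": 0.18, "AAA": 0.216, "B": 0.3}

# expected codeword length per parsed phrase
avg = sum(prob[ph] * len(code[ph]) for ph in code)
print(f"average length: {avg:.2f} bits")  # 2.58, below the 3-bit Tunstall code
```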

Golomb Rice Code

Golomb-Rice codes belong to a family of codes designed to encode integers with the assumption that the larger an integer, the lower its probability of occurrence. The simplest code for this situation is the unary code.

The unary code for a positive integer n is n 1s followed by a 0:

- the code for 4 is 11110
- the code for 7 is 11111110

The unary code is the same as the Huffman code for the semi-infinite alphabet {1, 2, 3, …} with the probability model

P(k) = 2^(-k)

Both are optimal for their probability models.

One step up in complexity from the unary code is to split the integer into two parts, representing one part with a unary code and the other part with a different code.
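A one-line sketch of the unary code just described:

```python
def unary(n):
    """Unary code for a positive integer n: n ones followed by a zero."""
    return "1" * n + "0"

print(unary(4))  # 11110
print(unary(7))  # 11111110
```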


The Golomb code is parameterized by an integer m > 0. The integer n ≥ 0 to be encoded is represented by two numbers, the quotient q and the remainder r:

q = ⌊n/m⌋
r = n − qm

q can take the values 0, 1, 2, 3, …
r can take the values 0, 1, 2, …, m − 1

q is coded by the unary code of q.

r is coded as follows:

- if m is a power of 2: the log2(m)-bit binary representation of r;
- otherwise: the ⌊log2 m⌋-bit binary representation of r for the first 2^⌈log2 m⌉ − m values of r, and the ⌈log2 m⌉-bit binary representation of r + 2^⌈log2 m⌉ − m for the remaining values of r.

Ex: For m = 5, consider the Golomb code for the integers {0, 1, …, 15}. Here ⌊log2 5⌋ = log2 4 = 2 and ⌈log2 5⌉ = log2 8 = 3, so the first 2^3 − 5 = 3 remainders (r = 0, 1, 2) are coded in 2 bits as 00, 01, 10, while r = 3, 4 are coded in 3 bits as 110, 111.

Exercise: Find the Golomb codes for the integers {3, 4, 5, 18} for m = 7 and m = 8.
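A sketch of the encoder described above, assuming the remainder-splitting rule as stated (the function name `golomb` is mine); it reproduces the m = 5 example and can be used to check the exercise:

```python
from math import ceil, log2

def golomb(n, m):
    """Golomb code of a non-negative integer n with parameter m > 0."""
    q, r = divmod(n, m)
    out = "1" * q + "0"                 # quotient in unary
    b = ceil(log2(m))
    if (1 << b) == m:                   # m a power of 2: the Rice case
        return out + format(r, f"0{b}b")
    cutoff = (1 << b) - m               # first `cutoff` remainders use b-1 bits
    if r < cutoff:
        return out + format(r, f"0{b - 1}b")
    return out + format(r + cutoff, f"0{b}b")

# the m = 5 example: remainders 0,1,2 -> 00,01,10 and 3,4 -> 110,111
for n in range(8):
    print(n, golomb(n, 5))
```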