Lecture7: Huffman Code
Source Coding (COMM901), eee.guc.edu.eg/Courses/Communications/COMM901

SOURCE CODING PROF. A.M.ALLAM
Huffman Code Application
Lossless Image Compression
A simple application of Huffman coding to image compression is the generation of a Huffman code for the set of values that any pixel may take. For monochrome images, this set usually consists of the integers from 0 to 255.

Steps for lossless image compression:
1. Generate a Huffman code for the set of pixel values
2. Encode the image using this code
3. Save it in a file
4. Determine the compression ratio: number of bytes (uncompressed) / number of bytes (compressed)

The original (uncompressed) image representation uses 8 bits/pixel. The image consists of 256 rows of 256 pixels, so the uncompressed representation uses 65,536 bytes. A sketch of these steps in code follows the notes below.

Notes:
1. The number of bytes in the compressed representation includes the number of bytes needed to store the Huffman code itself
2. The compression ratio is different for different images
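A minimal sketch of the procedure, assuming the image is given as a flat sequence of 8-bit pixel values (the function names and the heap-based construction are illustrative, not from the lecture):

```python
import heapq
from collections import Counter

def huffman_code(freqs):
    """Build a Huffman code (symbol -> bit string) from a {symbol: count} map."""
    # Heap items are (count, tiebreak, tree); a tree is a symbol or a (left, right) pair.
    heap = [(c, i, s) for i, (s, c) in enumerate(freqs.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        c1, _, t1 = heapq.heappop(heap)   # merge the two least probable subtrees
        c2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (c1 + c2, next_id, (t1, t2)))
        next_id += 1
    code = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            code[tree] = prefix or "0"    # single-symbol alphabet edge case
    walk(heap[0][2], "")
    return code

def compress_image(pixels):
    """Huffman-encode a flat sequence of pixel values; return (code, payload bytes)."""
    code = huffman_code(Counter(pixels))
    bits = sum(len(code[p]) for p in pixels)
    return code, (bits + 7) // 8          # round up to whole bytes

# For a 256x256 image at 8 bits/pixel the uncompressed size is 65,536 bytes, so
# ratio = 65536 / (payload bytes + bytes needed to store the code table).
```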
Image Name   Bits/Pixel   Total Size (B)   Compression Ratio
Sena         7.01         57,504           1.14
Sensin       7.49         61,430           1.07
Earth        4.94         40,534           1.62
Omaha        7.12         58,374           1.12

Huffman (lossless JPEG) compression based on pixel values
Image Name   Bits/Pixel   Total Size (B)   Compression Ratio
Sena         4.02         32,968           1.99
Sensin       4.70         38,541           1.70
Earth        4.13         33,880           1.93
Omaha        6.42         52,643           1.24

Huffman compression based on the difference between each pixel and its neighbor
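A sketch of the differencing step, assuming the image is a 2-D NumPy array and taking the previous pixel in the same row as the neighbor (one common choice; the lecture does not specify which neighbor is used):

```python
import numpy as np

def pixel_differences(img):
    """Replace each pixel with its difference from the previous pixel in the row."""
    diff = img.astype(np.int16)              # differences can be negative
    diff[:, 1:] = diff[:, 1:] - diff[:, :-1] # first column is left unchanged
    return diff  # values cluster near 0, so Huffman coding them is cheaper
```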
Image Name   Bits/Pixel   Total Size (B)   Compression Ratio
Sena         3.93         32,261           2.03
Sensin       4.63         37,896           1.73
Earth        4.82         39,504           1.66
Omaha        6.39         52,321           1.25

Huffman compression based on pixel difference values and an adaptive model
Notice that there is little difference between the performance of the adaptive Huffman coder and the static Huffman coder. The fact that the adaptive Huffman coder can be used as an online or real-time coder makes it a more attractive option in many applications. However, the adaptive Huffman coder is more vulnerable to errors and may also be more difficult to implement. In the end, the particular application will determine which approach is more suitable.

Text Compression
- Text compression seems natural for Huffman coding: in text we have a discrete alphabet that, in a given class, has relatively stationary probabilities
- The probabilities in the left table are the probabilities of the 26 letters obtained from the U.S. Constitution and are representative of English text
- The probabilities in the right table were obtained by counting the frequency of occurrence of the letters in an earlier version of some chapter
- While the two documents are substantially different, the two sets of probabilities are very much alike

Audio Compression

Another class of data that is very suitable for compression is CD-quality audio data. The audio signal for each stereo channel is sampled at 44.1 kHz, and each sample is represented by 16 bits.

The three segments used in this example represent a wide variety of audio material, from symphonic pieces to other genres, as listed in the table below.
File Name   Original File Size (bytes)   Entropy (bits)   Estimated Compressed File Size (bytes)   Compression Ratio
Mozart      939,862                      12.8             725,420                                  1.30
Cohn        402,442                      13.8             349,300                                  1.15
Mir         884,020                      13.7             759,540                                  1.16

For example, for the Mozart file the ratio is 939,862 / 725,420 ≈ 1.30.
Tunstall Code

The Huffman code encodes letters from the source alphabet using codewords with varying numbers of bits: codewords with fewer bits for letters that occur more frequently, and codewords with more bits for letters that occur less frequently. It is a fixed-to-variable mapping. On the other hand, errors in codewords propagate: an error in one codeword will cause a series of errors to occur.

The Tunstall code instead encodes groups of letters from the source such that each group is mapped to a codeword of equal length. It is a variable-to-fixed mapping.

Algorithm

Given: an alphabet of size N. Required: a Tunstall code of L bits for a given pmf.

- Start with the N letters of the source alphabet
- Remove the entry with the highest probability
- Add the N strings obtained by concatenating this letter with every letter in the alphabet (including itself); this increases the size from N to N + (N - 1)
- Calculate the probabilities of the new entries
- Select the entry with the highest probability and repeat until the size reaches $2^L$, i.e., the expansion is repeated k times until

$$N + k(N-1) \le 2^L$$
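A minimal sketch of this greedy construction (the function name is illustrative); it reproduces the worked example below:

```python
def tunstall(pmf, L):
    """Build a Tunstall codebook: phrase -> fixed-length L-bit codeword."""
    N = len(pmf)
    entries = dict(pmf)                        # phrase -> probability
    while len(entries) + (N - 1) <= 2 ** L:    # stop before exceeding 2^L entries
        best = max(entries, key=entries.get)   # most probable phrase
        p = entries.pop(best)
        for letter, q in pmf.items():          # extend it by every letter
            entries[best + letter] = p * q
    phrases = sorted(entries, key=entries.get) # ascending probability, as in the slides
    return {ph: format(i, f"0{L}b") for i, ph in enumerate(phrases)}

# Example from the lecture: S = {A, B, C}, P(A)=0.6, P(B)=0.3, P(C)=0.1, L = 3
for phrase, word in tunstall({"A": 0.6, "B": 0.3, "C": 0.1}, 3).items():
    print(phrase, word)   # AAC 000, AC 001, C 010, AAB 011, AB 100, AAA 101, B 110
```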
Ex: S = {A, B, C}, P(A) = 0.6, P(B) = 0.3, P(C) = 0.1, L = 3 bits

k = 1 (X marks the entry removed next):

alphabet   P
A          0.6     X
B          0.3
C          0.1

k = 2:

alphabet   P
B          0.3
C          0.1
AA         0.36    X
AB         0.18
AC         0.06

k = 3:

alphabet   P
B          0.3
AAA        0.216
AB         0.18
AAB        0.108
C          0.1
AC         0.06
AAC        0.036

Arranged in order of probability, with the assigned 3-bit codewords:

phrase   CODE
AAC      000
AC       001
C        010
AAB      011
AB       100
AAA      101
B        110

That is, the Tunstall code encodes into binary codewords of fixed length L, forming $2^L$ source phrases that are as nearly equiprobable as we can make them.

If the final source phrases are not nearly equiprobable, a Huffman code can be applied to the phrases at that point instead of fixed-length codewords: a Tunstall/Huffman combination.
Phrase probabilities and the Huffman code built on the same phrases:

phrase   P       Huffman code
B        0.3     00
AAA      0.216   10
AB       0.18    010
AAB      0.108   011
C        0.1     110
AC       0.06    1110
AAC      0.036   1111
Average codeword length of the Huffman code on the phrases:

$$\bar{L} = 0.036 \times 4 + 0.06 \times 4 + 0.1 \times 3 + 0.108 \times 3 + 0.18 \times 3 + 0.216 \times 2 + 0.3 \times 2 = 2.58 \text{ bits}$$

versus the fixed Tunstall code length of 3 bits.
[Huffman tree construction over the phrase probabilities: AAC + AC = 0.096; 0.096 + C = 0.196; AAB + AB = 0.288; 0.196 + AAA = 0.412; 0.288 + B = 0.588; 0.412 + 0.588 = 1]
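A quick numerical check of the average length, with the probabilities and codewords copied from the table above:

```python
code = {"B": "00", "AAA": "10", "AB": "010", "AAB": "011",
        "C": "110", "AC": "1110", "AAC": "1111"}
prob = {"B": 0.3, "AAA": 0.216, "AB": 0.18, "AAB": 0.108,
        "C": 0.1, "AC": 0.06, "AAC": 0.036}

avg = sum(prob[s] * len(code[s]) for s in code)
print(round(avg, 2))   # 2.58 bits, versus the fixed Tunstall length of 3 bits
```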

Golomb Rice Code
Golomb Rice codes belong to a family of codes designed to encode integers with the assumption that the larger an integer, the lower its probability of occurrence.

The simplest code for this situation is the unary code. The unary code for a positive integer n is n 1s followed by a 0:

the code for 4 is 11110,
the code for 7 is 11111110

The unary code is the same as the Huffman code for the semi-infinite alphabet {1, 2, 3, ...} with the probability model

$$P(k) = \frac{1}{2^k}$$

Both are optimal for their probability models.
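As a one-line sketch (the helper name is illustrative):

```python
def unary(n):
    """Unary code for a positive integer n: n ones followed by a zero."""
    return "1" * n + "0"

assert unary(4) == "11110"
assert unary(7) == "11111110"
```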
One step higher in complexity than the unary code is a code that splits the integer into two parts, representing one part with a unary code and the other part with a different code.

The code is parameterized by an integer m > 0. The integer n >= 0 to be encoded is represented by two numbers, the quotient q and the remainder r:

$$q = \left\lfloor \frac{n}{m} \right\rfloor, \qquad r = n - qm$$

q can take the values 0, 1, 2, 3, ...
r can take the values 0, 1, 2, ..., m - 1

q is coded by the unary code of q.

r is coded by:
- the $\log_2 m$-bit binary representation of r, if m is a power of 2; otherwise:
- the $\lfloor \log_2 m \rfloor$-bit binary representation of r for the first $2^{\lceil \log_2 m \rceil} - m$ values of r, and
- the $\lceil \log_2 m \rceil$-bit binary representation of $r + 2^{\lceil \log_2 m \rceil} - m$ for the remaining values of r.
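A minimal sketch of an encoder following this definition (the function name is illustrative); the loop at the end reproduces the m = 5 example below:

```python
from math import ceil, log2

def golomb(n, m):
    """Golomb code of a non-negative integer n with parameter m > 0."""
    q, r = divmod(n, m)
    prefix = "1" * q + "0"                 # unary code of the quotient
    b = ceil(log2(m))
    if 2 ** b == m:                        # m is a power of 2 (Rice code)
        return prefix + format(r, f"0{b}b")
    cutoff = 2 ** b - m                    # this many remainders get the short form
    if r < cutoff:
        return prefix + format(r, f"0{b - 1}b")
    return prefix + format(r + cutoff, f"0{b}b")

for n in range(16):                        # the lecture's example: m = 5, n = 0..15
    print(n, golomb(n, 5))
```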

Ex: For m = 5, $\lfloor \log_2 5 \rfloor = \log_2 4 = 2$ and $\lceil \log_2 5 \rceil = \log_2 8 = 3$, so the first $2^3 - 5 = 3$ values of r (0, 1, 2) are coded with 2 bits, and the values r = 3, 4 are coded with 3 bits as r + 3. For the integers {0, 1, ..., 15} this gives, e.g., 3 -> 0110, 8 -> 10110, 15 -> 111000.
Exercise: Find the Golomb codes for the integers {3, 4, 5, 18} for the cases m = 7 and m = 8.