8/12/2019 Assignment No 6 HuffmanCoding
Assignment # 6
Huffman coding
Saad Iftikhar 039
Munhal Imran
Muhammad Hassan Zia 029
Moeez Aslam
Hamza Hashmi
Junaid Afzal Swatti
Ali Tausif 061
Armaghan Ahmed
Zohair Fakhar
Mohsin Altaf
Instructor:
Sir Qasim Umer Khan
Abstract: In this assignment we implemented the Huffman
entropy encoding algorithm for data compression. Testing
with different data sets gave acceptable results and
confirmed the notion that the more similar the elements of
a data set are, the better the compression achieved by the
Huffman compression algorithm.
I. INTRODUCTION
In computer science and information theory, Huffman
coding is an entropy encoding algorithm used for lossless
data compression. The term refers to the use of a variable-length
code table for encoding a source symbol (such as a
character in a file), where the variable-length code table has
been derived in a particular way based on the estimated
probability of occurrence for each possible value of the source
symbol. Huffman coding uses a specific method for choosing
the representation for each symbol, resulting in a prefix
code (sometimes called a "prefix-free code": the bit
string representing some particular symbol is never a prefix of
the bit string representing any other symbol) that expresses the
most common source symbols using shorter strings of bits
than are used for less common source symbols.
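To make the prefix property concrete, the following MATLAB sketch uses a small illustrative code table. The four probabilities and code words are assumptions chosen for this example, not data from the assignment. It checks that no code word is a prefix of another and compares the average code length with the source entropy.

```matlab
% Illustrative source: probabilities and code words are assumed for this example.
p     = [0.5 0.25 0.125 0.125];
codes = {'0', '10', '110', '111'};

% Prefix check: no code word may be the beginning of a different code word.
isPrefixFree = true;
for a = 1:numel(codes)
    for b = 1:numel(codes)
        if a ~= b && strncmp(codes{a}, codes{b}, length(codes{a}))
            isPrefixFree = false;
        end
    end
end

codeLengths = cellfun(@length, codes);
avgLength   = sum(p .* codeLengths);    % average bits per symbol
H           = -sum(p .* log2(p));       % source entropy in bits per symbol
fixedLength = ceil(log2(numel(p)));     % bits per symbol with a fixed-length code
```

For this table the average length equals the entropy (1.75 bits per symbol) because every probability is a power of 1/2, whereas a fixed-length code would need 2 bits per symbol.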
II. ASSIGNMENT
In this assignment we were required to implement
the Huffman algorithm in MATLAB.
III. PERFORMANCE
The following are the MATLAB codes:
CODE:
Class of Huffman code:
%----- huffman coding -----%
%----- version 1 -----%
%----- data structure (classes) -----%
%----- 18-12-2013 -----%
%%
classdef huffman
    % data structure: values that the node has in it
    properties
        leftNode = []
        rightNode = []
        probability
        code = [];
        symbol
        huffy   % will store the Huffman code, just for checking
    end
end
%----------------------------------------%
Code for probability finding of data:
%----- calculating frequency of elements -----%
%----- Saad Iftikhar -----%
%----- 17 december 2013 -----%
%% calculate how many times the same numbers occur
function [data_unique, data_freq] = frequency(data)
    clc;
    % data = [22 33 55 66 11 22 33 44 66];
    data_unique = unique(data);   % this function creates a sorted (ascending) array
                                  % with only unique elements, no two elements are repeated
    data_count = zeros(1, length(data_unique));
    for i = 1:length(data_unique)
        % this array has the corresponding frequency of the data
        % in the unique array
        data_count(i) = sum(data == data_unique(i));
    end
    data_freq = data_count / sum(data_count);   % normalise the counts to probabilities
end
%----------------------------------------%
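The same symbol probabilities can be obtained without an explicit loop. The sketch below is an alternative to the frequency function above, not the authors' implementation; it relies on the third output of unique and on accumarray, and reuses the sample vector from the commented-out line in the listing. The names idx and counts are introduced here for illustration.

```matlab
data = [22 33 55 66 11 22 33 44 66];

[data_unique, ~, idx] = unique(data);   % idx maps each element to its unique value
counts    = accumarray(idx(:), 1).';    % occurrences of each unique value
data_freq = counts / numel(data);       % relative frequencies, summing to 1
```

For this vector data_unique is [11 22 33 44 55 66] and counts is [1 2 2 1 1 2].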
Conversion of data from binary to decimal:
%----- calculating frequency of elements, binary version -----%
%----- 20-21 december 2013 -----%
%% convert the data to decimal
function [convData] = dataConv(data, M)
    % k always comes out a whole number, not a fraction
    % M = 8;
    k = log2(M);
    convData = [];
    remainder = mod(length(data), k);
    % this part checks whether the data is exactly divisible by k,
    % or else prepends 0 bits
    if (remainder ~= 0)
        append = k - remainder;
    else
        append = 0;
    end
    data = [zeros(1, append) data];
    for dataLength = 1:k:length(data)
        string = num2str(data(dataLength:dataLength+k-1));
        decimal = bin2dec(string);
        convData = [convData decimal];
    end
%----------------------------------------%
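The grouping of bits into M-ary symbols can also be written with reshape. This is a sketch under the same assumptions as dataConv (M is a power of 2, and zeros are prepended when the length is not a multiple of k); the names pad, groups and weights are introduced here for illustration.

```matlab
bits = [0 0 1 0 1 0 0 0 0 1];   % the information vector used in the main code
M = 4;
k = log2(M);                    % bits per symbol

pad     = mod(-numel(bits), k);          % zeros needed to reach a multiple of k
padded  = [zeros(1, pad) bits];
groups  = reshape(padded, k, []).';      % one row of k bits per symbol
weights = 2 .^ (k-1:-1:0);               % most significant bit first
convData = (groups * weights.').';       % row vector of decimal symbols
```

With M = 4 the ten bits are grouped as 00, 10, 10, 00, 01, which gives the symbols 0 2 2 0 1.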
Main code of Huffman:
%----- main code of huffman coding algo using classes -----%
%----- Saad Iftikhar 18-12-2013 -----%
%----- creating binary tree -----%
%----- huffman code using classes -----%
%% function codedData = sourceHuffman(information, M);
clc;
clear symbol;
clear codeHuff;
clear codeBits;
clear arrr;
%% initializing
global symbol;     % global variable will have the symbols
global codeHuff;   % global variable will have the symbols' Huffman codes
global codeBits;   % global variable will have the lengths of the symbols' Huffman codes
decodeData = [];
symbol = [];
codeHuff = [];
codeBits = [];
M = 4;
information = [0 0 1 0 1 0 0 0 0 1];
% information = randint(1, 3000);
convdata = dataConv(information, M);   % this function converts our data
% from binary format to decimal, as the rest of our program is written for
% decimal
[data, prob] = frequency(convdata);
%% Empty array of objects of class huffman
array = huffman.empty(length(prob), 0);
array_final = huffman.empty(length(prob), 0);
%% Assign initial values: save all the probabilities of the numbers in the
% probability property of the class/structure, and the symbols also
for i = 1:length(data)
    array(i).probability = prob(i);
    array(i).symbol = data(i);
    array_final(i).probability = prob(i);
    array_final(i).symbol = data(i);
end
% here we create a temporary array to do the sorting; the algorithm we are
% using is bubble sort in ascending order
temparray = array;
%
%% Creating the binary tree
% for k = 1:size(temparray,2)-1   % size(a,2) gives the number of columns
% a binary tree is where a node/parent has two children; the lower
% probability one is on the left and the higher one is on the right
% here to create a binary tree we have to loop for the number of nodes - 1
% here we take the size of the columns, as size is always given as a 2-dim vector
for k = 1:size(temparray,2)-1
    %
    % First sort the temp array using bubble sort
    %
    for i = 1:size(temparray,2)
        for j = 1:size(temparray,2)-1
            % bubble sort algorithm
            if (temparray(j).probability > temparray(j+1).probability)
                tempnode = temparray(j);
                % this is the swapping operation
                temparray(j) = temparray(j+1);
                temparray(j+1) = tempnode;
            end
        end
    end
    %
    % now we have to create a new node
    %
    newnode = huffman;   % a node of the class huffman
    % Add the probabilities: here we are creating the tree, the lowest two
    % probability nodes are added into one single node
    newnode.probability = temparray(1).probability + temparray(2).probability;
    % the new node has the sum of the previous two probabilities
    %
    % now assign the left, lowest probability one as 0 and the higher probability one
    % as 1
    temparray(1).code = [0];
    temparray(2).code = [1];
    %
    % Attach children nodes to the new node, the parent node created
    newnode.leftNode = temparray(1);
    newnode.rightNode = temparray(2);
    %
    % remove the previous two nodes and replace them by the parent node, just like
    % in C++ we would remove the pointers of the children nodes and replace them
    % by the pointer of the father node
    %
    temparray = temparray(3:size(temparray,2));   % first two nodes are gone
    %
    % now appending the new parent node
    %
    temparray = [newnode temparray];
    %
end   % end the looping and hence binary tree created
%%
rootNode = temparray(1);   % the root node is always the first node
le_code = [];   % that will be the final Huffman code
%%
% Looping through the tree
% See recursive function loop.m
%
final_data = [];   % variable definition, later used
check = huffman;
f = traverse(rootNode, le_code);   % here we traverse the tree and generate the Huffman codes
% here is the loop for replacing the data with its Huffman code
for codeLength = 1:length(convdata)
    for inner = 1:length(symbol)
        if (convdata(codeLength) == symbol(inner))
            level = sum(codeBits(1:inner-1)) + 1;
            final_data = [final_data codeHuff(level:level+codeBits(inner)-1)];
        else
        end
    end
end
% codedData = final_data
%----------------------------------------%
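The merge step at the heart of the listing above (sort, join the two least probable nodes, label them 0 and 1) can be sketched compactly without classes. This is an alternative illustration rather than the authors' method: it keeps, for every subtree, the list of leaves below it, and prepends a bit to those leaves' code words on each merge. The symbols and probabilities are assumed values chosen so that no ties occur.

```matlab
% Assumed example source (not taken from the assignment's test data).
symbols = [0 1 2];
prob    = [0.6 0.1 0.3];

codes  = repmat({''}, 1, numel(symbols));   % code word (char vector) per symbol
groups = num2cell(1:numel(symbols));        % leaves contained in each subtree
p = prob;

while numel(p) > 1
    [p, order] = sort(p);                   % ascending, as the bubble sort does
    groups = groups(order);
    % The least probable subtree gets bit 0, the next one gets bit 1.
    for idx = groups{1}
        codes{idx} = ['0' codes{idx}];
    end
    for idx = groups{2}
        codes{idx} = ['1' codes{idx}];
    end
    % Replace the two subtrees by their parent.
    groups = [{[groups{1} groups{2}]} groups(3:end)];
    p      = [p(1) + p(2), p(3:end)];
end
```

For these probabilities the resulting code words are '1' for symbol 0, '00' for symbol 1 and '01' for symbol 2, so the most probable symbol receives the shortest code.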
Traversal code for code generation:
%----- function for traversal of the binary tree -----%
%----- 20-12-2013 -----%
%----- algorithm for traversing the whole tree to get the code -----%
function f = traverse(tempNode, codec)
    global symbol;   % these are the global variables to store our code and
                     % data arrays, as in recursive functions they are
                     % continuously overwritten
    global codeHuff;
    global codeBits;
    if ~isempty(tempNode)   % whether we have a next node or not
        codec = [codec tempNode.code];   % append to the code of the previous nodes
        if ~isempty(tempNode.symbol)
            % disp(tempNode.symbol);
            tempNode.huffy = [codec];
            % disp(codec);
            symbol = [symbol tempNode.symbol];
            codeHuff = [codeHuff codec];
            codeBits = [codeBits length(codec)];
        end
        traverse(tempNode.leftNode, codec);
        traverse(tempNode.rightNode, codec);
    end
    f = codec;
end
%----------------------------------------%
Code for decoding of Huffman:
%----- Huffman decoding algorithm -----%
%----- Saad Iftikhar -----%
%----- 21-12-2013 -----%
%% function decodedData = decodeHuffman(data, M, rootNode)
% let's traverse the data and create the decoded string using the structures
function [realdata, olright] = dHuffman(tempNode, data, M)
    %% traversing the tree: when we reach a leaf we assign the leaf node's
    % value to the data vector
    decoded = [];   % defining variables
    realdata = [];
    i = 1;
    k = 1;
    centerNode = tempNode;   % variable of class huffman
    while (k ... length(data))   % part of this listing (the loop condition and the tree-walking steps) is missing in the source
                % if data is over, this is the leaf
                realdata = [realdata centerNode.symbol];
                flag = 1;
                k = i + 1;
                break
            end
        end
    end
    %% here we convert the data from decimal back to binary format
    binaryReal = dec2bin(realdata, log2(M));
    binaryReal1 = binaryReal';
    binaryRealFinal = binaryReal1(:);
    binaryRealFinal = binaryRealFinal';
    for loop = 1:length(binaryRealFinal)
        olright(loop) = str2double(binaryRealFinal(loop));
    end
end
%----------------------------------------%
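Because a Huffman code is prefix-free, decoding can also be done by reading bits into a buffer until the buffer matches a code word. The sketch below illustrates that idea with a lookup table instead of the tree walk used in dHuffman; it is not a reconstruction of the missing loop. The code table is a hypothetical prefix-free table for the symbols 0, 1 and 2, and the message is the symbol sequence obtained from the information vector with M = 4.

```matlab
% Hypothetical prefix-free code table (assumed for illustration).
symbols = [0 1 2];
codes   = {'1', '00', '01'};
M = 4;

message = [0 2 2 0 1];

% Encode: concatenate the code word of every symbol.
encoded = '';
for s = message
    encoded = [encoded codes{symbols == s}];
end

% Decode: grow a buffer bit by bit until it equals a code word.
decoded = [];
buffer  = '';
for b = encoded
    buffer = [buffer b];
    hit = find(strcmp(buffer, codes));
    if ~isempty(hit)
        decoded = [decoded symbols(hit)];
        buffer  = '';
    end
end

% Convert the decimal symbols back to a row vector of bits.
bits = reshape(dec2bin(decoded, log2(M)).' - '0', 1, []);
```

Here decoded equals message, and bits reproduces the original vector 0 0 1 0 1 0 0 0 0 1, which is the same kind of check the report performs with sum(ol==information).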
Results:
information is the original data; it is 21 bits long. final_data is the Huffman-compressed data; its size is greatly reduced, to 10 bits, with M = 8 and k = 3, because the data in this case is quite similar.
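The sizes reported above can be turned into a compression ratio and a space saving. This is only arithmetic on the two figures quoted in the text; the variable names are introduced here for illustration.

```matlab
originalBits   = 21;   % length of information, as reported above
compressedBits = 10;   % length of final_data, as reported above

compressionRatio = originalBits / compressedBits;     % 2.1
spaceSaving      = 1 - compressedBits / originalBits; % about 0.52
```

In other words, the encoded stream is a little under half the size of the original for this test vector.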
This window shows that the Huffman-encoded final_data, when sent to the decoding function, returns the original
data: sum(ol==information) returns 21, which means that all 21 bits of the original data and the decoded data match.
IV. CONCLUSION
The implementation of the Huffman lossless entropy
encoding compression algorithm has confirmed the notion that
when data has many similar elements in it, this compression
reduces the length of the code and hence increases the amount
of useful information sent per bit.