Data Structures Week 6: Assignment #2 Problem rhanha/rhanha_teaching.html
Data Structures
Week 6: Assignment #2 Problem
http://www.cs.hongik.ac.kr/~rhanha/rhanha_teaching.html/
Requirement
Encode a message using Huffman's algorithm
Use a min heap as the priority queue
  dynamic allocation
The input consists of strings
  A string consists of alphabetic characters only
  Upper case and lower case letters are treated as different characters
  The strings are stored in a text file, given on separate lines
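The min heap requirement can be illustrated with a minimal sketch. It assumes an array-based binary heap holding (key, item) pairs; the class name `MinHeap` and its method names are hypothetical, and a Python list stands in for dynamic allocation.

```python
class MinHeap:
    """Array-based binary min heap of (key, item) pairs (illustrative sketch)."""

    def __init__(self):
        self._data = []  # a Python list grows dynamically

    def __len__(self):
        return len(self._data)

    def insert(self, key, item):
        """Append the entry, then sift it up while its parent has a larger key."""
        data = self._data
        data.append((key, item))
        child = len(data) - 1
        while child > 0:
            parent = (child - 1) // 2
            if data[parent][0] <= data[child][0]:
                break
            data[parent], data[child] = data[child], data[parent]
            child = parent

    def delete_min(self):
        """Remove and return the entry with the smallest key.

        Raises IndexError if the heap is empty.
        """
        data = self._data
        smallest = data[0]
        last = data.pop()
        if data:
            # Move the last entry to the root and sift it down.
            data[0] = last
            parent = 0
            while True:
                left, right = 2 * parent + 1, 2 * parent + 2
                least = parent
                if left < len(data) and data[left][0] < data[least][0]:
                    least = left
                if right < len(data) and data[right][0] < data[least][0]:
                    least = right
                if least == parent:
                    break
                data[parent], data[least] = data[least], data[parent]
                parent = least
        return smallest


heap = MinHeap()
for key, symbol in [(3, "A"), (1, "B"), (2, "C"), (1, "D")]:
    heap.insert(key, symbol)
keys = [heap.delete_min()[0] for _ in range(4)]
print(keys)
```

Keys come out in nondecreasing order, which is the property Huffman's algorithm relies on when it repeatedly removes the two least frequent nodes.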
Requirement (cont'd)
Output should be stored in a text file in the following format
  Heap Traversal: [character or string]...
  Huffman Tree Traversal: [character or string]...
  character: frequency, code...
  the code for the message:
Due date: 2001/5/23 24:00
Encoding
Encode the message as a long bit string
  Assign a bit string code to each symbol of the alphabet
  Then concatenate the individual codes of the symbols making up the message to produce an encoding for the message
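As a small sketch of this idea: the function name `encode` is an assumption, and the code table is the fixed three-bit one from Example #1 in these slides.

```python
def encode(message, codes):
    """Concatenate the bit string code of every symbol in the message."""
    return "".join(codes[symbol] for symbol in message)


# Code table taken from Example #1.
codes = {"A": "010", "B": "100", "C": "000", "D": "111"}
bits = encode("ABACCDA", codes)
print(bits, len(bits))  # 010100010000000111010 21
```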
Example #1
Symbol  Code
A       010
B       100
C       000
D       111
ABACCDA is encoded as 010100010000000111010
Three bits are used for each symbol
21 bits are needed to encode the message
  This encoding is inefficient
Example #2
Symbol  Code
A       00
B       01
C       10
D       11
ABACCDA is encoded as 00010010101100
Two bits are used for each symbol
14 bits are needed to encode the message
Example #3
ABACCDA
Each of the letters B and D appears only once in the message
The letter A appears three times
The letter A is assigned a shorter bit string than the letters B and D
Example #3 (cont'd)
Symbol  Code
A       0
B       110
C       10
D       111
ABACCDA is encoded as 0110010101110
Encoding of the message requires only 13 bits
  more efficient
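The bit counts of the three examples can be checked with a short sketch. The helper name `encoded_length` is hypothetical; the three tables are the ones given in Examples #1 to #3.

```python
from collections import Counter


def encoded_length(message, codes):
    """Total bits = sum over symbols of (occurrences * code length)."""
    counts = Counter(message)
    return sum(counts[symbol] * len(codes[symbol]) for symbol in counts)


message = "ABACCDA"
tables = {
    "Example #1": {"A": "010", "B": "100", "C": "000", "D": "111"},
    "Example #2": {"A": "00", "B": "01", "C": "10", "D": "11"},
    "Example #3": {"A": "0", "B": "110", "C": "10", "D": "111"},
}
lengths = {name: encoded_length(message, codes) for name, codes in tables.items()}
print(lengths)
```

The variable-length table wins because A, the most frequent symbol, costs only one bit per occurrence.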
Variable-Length Code
If variable-length codes are used, the code for one symbol must not be a prefix of the code for another
Example
  Suppose the code for a symbol x, c(x), is a prefix of the code of another symbol y, c(y)
  When c(x) is encountered in a left-to-right scan, it is unclear whether c(x) represents the symbol x or whether it is the first part of c(y)
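The prefix condition can be illustrated with a rough sketch. The function names `is_prefix_free` and `decode` are assumptions; the decoder performs the left-to-right scan described above and only works unambiguously when no code is a prefix of another.

```python
def is_prefix_free(codes):
    """Return True if no code is a prefix of (or equal to) another code."""
    words = list(codes.values())
    for i, a in enumerate(words):
        for j, b in enumerate(words):
            if i != j and b.startswith(a):
                return False
    return True


def decode(bits, codes):
    """Scan left to right, emitting a symbol as soon as a full code is seen."""
    symbol_for = {code: symbol for symbol, code in codes.items()}
    message, current = [], ""
    for bit in bits:
        current += bit
        if current in symbol_for:
            message.append(symbol_for[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete code")
    return "".join(message)


good = {"A": "0", "B": "110", "C": "10", "D": "111"}  # table from Example #3
bad = {"x": "0", "y": "01"}                           # c(x) is a prefix of c(y)
print(is_prefix_free(good), is_prefix_free(bad))
print(decode("0110010101110", good))
```

With the `bad` table, a scan that sees 0 cannot tell whether it has read x or the start of y, which is exactly the ambiguity the slide describes.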
Optimal Encoding Scheme (1)
Symbol  Frequency
A       3
B       1
C       2
D       1
Find the two symbols that appear least frequently
  These are B and D
Combine these two symbols into the single symbol BD
The frequency of this new symbol is the sum of the frequencies of its two symbols
  The frequency of BD is 2
Optimal Encoding Scheme (2)
Symbol  Frequency
A       3
C       2
BD      2
Again choose the two symbols with the smallest frequency
  These are C and BD
Combine these two symbols into the single symbol CBD
The frequency of this new symbol is the sum of the frequencies of its two symbols
  The frequency of CBD is 4
Optimal Encoding Scheme (3)
Symbol  Frequency
A       3
CBD     4
There are now only two symbols remaining
These are combined into the single symbol ACBD
  The frequency of ACBD is 7
Symbol  Frequency
ACBD    7
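The three combining steps above can be reproduced with a table-based sketch. The function name `merge_steps` is hypothetical, and the tie-breaking rule (a stable sort, so earlier table entries win ties) is an assumption chosen because it matches the order shown in the slides.

```python
def merge_steps(frequencies):
    """Repeatedly combine the two least frequent symbols.

    Returns a list of (first, second, combined, combined_frequency) tuples.
    """
    table = list(frequencies.items())
    steps = []
    while len(table) > 1:
        # Stable sort: among equal frequencies, earlier entries stay first.
        table.sort(key=lambda entry: entry[1])
        (s1, f1), (s2, f2) = table[0], table[1]
        steps.append((s1, s2, s1 + s2, f1 + f2))
        table = table[2:] + [(s1 + s2, f1 + f2)]
    return steps


steps = merge_steps({"A": 3, "B": 1, "C": 2, "D": 1})
for step in steps:
    print(step)
```

This version re-sorts the whole table at every step, which is simple but slow; a priority queue avoids the repeated sorting.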
Optimal Encoding Scheme (4)
ACBD is split into A and CBD, which are assigned the codes 0 and 1
CBD is split into C and BD, which are assigned the codes 10 and 11
BD is split into B and D, which are assigned the codes 110 and 111
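The code assignment above amounts to undoing the combinations in reverse order. A minimal sketch, assuming the combinations are recorded as (first, second, combined) triples and that at least one combination exists; the name `assign_codes` is hypothetical.

```python
def assign_codes(merges):
    """Walk the combinations backwards, extending the parent's code by 0 or 1."""
    codes = {merges[-1][2]: ""}  # the last combined symbol has the empty code
    for first, second, combined in reversed(merges):
        prefix = codes.pop(combined)
        codes[first] = prefix + "0"
        codes[second] = prefix + "1"
    return codes


# Combinations in the order they were made in the slides.
merges = [("B", "D", "BD"), ("C", "BD", "CBD"), ("A", "CBD", "ACBD")]
codes = assign_codes(merges)
print(codes)
```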
The Huffman’s Algorithm (1)
[Figure: diagram with the nodes D1, C2, B1, A3]
The Huffman’s Algorithm (2)
[Figure: diagram with the nodes C2, B1, D1, A3]
The Huffman’s Algorithm (3)
[Figure: diagram with the nodes B1, D1, C2, A3 and the new node BD2]
The Huffman’s Algorithm (4)
[Figure: diagram with the nodes B1, D1, A3, BD2, C2]
The Huffman’s Algorithm (5)
[Figure: diagram with the nodes B1, D1, A3, BD2, C2 and the new node CBD4]
The Huffman’s Algorithm (6)
[Figure: diagram with the nodes B1, D1, A3, BD2, C2, CBD4]
The Huffman’s Algorithm (7)
[Figure: diagram with the nodes B1, D1, A3, BD2, C2, CBD4 and the new node ACBD7]
The Huffman’s Algorithm (8)
1. Build a min heap which contains the nodes of all symbols, with the frequency values as the keys
2. Delete two nodes from the heap, concatenate the two symbols, add their frequencies, and put the result back into the heap
3. Make the two deleted nodes the two children of the node of the concatenated symbol
   i.e., if s = s1 s2 is the symbol concatenated from s1 and s2, then s1 and s2 become the left child and right child of s
4. Repeat steps 2 and 3 until only one node, the root of the Huffman tree, remains in the priority queue
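The four steps can be sketched as follows. This is only an illustration: it swaps in Python's standard `heapq` module for the hand-written min heap the assignment asks for, represents a node as a plain tuple (symbol, frequency, left, right), and breaks frequency ties by insertion order. The function name and all of these choices are assumptions. It expects at least one symbol.

```python
import heapq
from itertools import count


def build_huffman_tree(frequencies):
    """Return the root node as a tuple (symbol, frequency, left, right).

    Leaves have None for both children.
    """
    order = count()  # tie-breaker so node tuples are never compared directly
    # Step 1: a min heap of all symbol nodes, keyed by frequency.
    heap = [(freq, next(order), (symbol, freq, None, None))
            for symbol, freq in frequencies.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Step 2: delete the two nodes with the smallest frequencies.
        f1, _, n1 = heapq.heappop(heap)
        f2, _, n2 = heapq.heappop(heap)
        # Step 3: they become the left and right child of the combined node.
        parent = (n1[0] + n2[0], f1 + f2, n1, n2)
        heapq.heappush(heap, (f1 + f2, next(order), parent))
    # Step 4: the single remaining node is the root.
    return heap[0][2]


root = build_huffman_tree({"A": 3, "B": 1, "C": 2, "D": 1})
print(root[0], root[1])  # ACBD 7
```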
The Huffman’s Algorithm (9)
Once the Huffman tree is constructed
  the code of any symbol can be constructed by starting at the leaf representing that symbol and climbing up to the root
  The code is initialized to null
  Each time that a left branch is climbed, 0 is added to the beginning of the code
  Each time that a right branch is climbed, 1 is added to the beginning of the code
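The climb from leaf to root can be sketched as below. The `Node` class with a `parent` link and the function name `code_for` are assumptions for illustration; the tree is built by hand to match the ACBD example.

```python
class Node:
    """Tree node that records its parent when used as a child."""

    def __init__(self, symbol, left=None, right=None):
        self.symbol = symbol
        self.left = left
        self.right = right
        self.parent = None
        for child in (left, right):
            if child is not None:
                child.parent = self


def code_for(leaf):
    """Climb from the leaf to the root, adding a bit at the front each step."""
    code = ""  # the code starts out empty
    node = leaf
    while node.parent is not None:
        bit = "0" if node.parent.left is node else "1"
        code = bit + code
        node = node.parent
    return code


a, b, c, d = Node("A"), Node("B"), Node("C"), Node("D")
root = Node("ACBD", a, Node("CBD", c, Node("BD", b, d)))
codes = {leaf.symbol: code_for(leaf) for leaf in (a, b, c, d)}
print(codes)
```

Bits are added at the front because the climb visits branches in the reverse of the order in which a decoder would follow them from the root.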
The Huffman’s Algorithm (10)
VAR
  position[i] : a pointer to the ith symbol
  n : the number of symbols /* nonzero frequency */
  frequency[i] : the relative frequency of the ith symbol
  code[i] : the code assigned to the ith symbol
  p, p1, p2 : pointers to a Min heap node or a huffman tree node
Main Function
{
  initialization;
  count the frequency of each symbol within the message;
  // construct a node for each symbol
  for(i=0; i < n; i++){
    <p> = create <frequency[i]> a node;
    position[i] = p; //a pointer to the leaf containing the ith symbol
    insert <p> into Min heap;
  }//end for
The Huffman’s Algorithm (11)
while(Min heap contains more than one item){
<p1> = delete Min heap;
<p2> = delete Min heap;
//combine p1 and p2 as branches of a single tree
<p> = create < info(p1)+info(p2) > a node;
set <p1> to be left_child of huffman tree p;
set <p2> to be right_child of huffman tree p;
insert <p> into Min heap;
}//end while
The Huffman’s Algorithm (12)
//the tree is now constructed; use it to find codes
<root> = delete Min heap;
for(i=0; i<n; i++){
  p = position[i];
  code[i] = NULL;
  while(p!=root){
    //travel up to the root
    if(is left<p>)
      code[i]= 0 followed by code[i];
    else
      code[i]= 1 followed by code[i];
    <p> = move <p> to father node;
  } // end while
}//end for
}//end main
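The pseudocode of slides (10) to (12) can be turned into a runnable sketch along these lines. It is one possible reading, not the reference solution: it uses Python's `heapq` in place of a hand-written min heap, keys `position` and `code` by symbol rather than by index, and breaks frequency ties by insertion order. The `Node` class and the function name `huffman_codes` are assumptions. It expects a non-empty message with at least two distinct symbols.

```python
import heapq
from collections import Counter


class Node:
    def __init__(self, symbol, freq, left=None, right=None):
        self.symbol, self.freq = symbol, freq
        self.left, self.right, self.father = left, right, None


def huffman_codes(message):
    # Count the frequency of each symbol within the message.
    frequency = Counter(message)
    position = {}  # symbol -> leaf node containing that symbol
    heap = []
    # Construct a node for each symbol and insert it into the min heap.
    for i, (symbol, freq) in enumerate(frequency.items()):
        p = Node(symbol, freq)
        position[symbol] = p
        heapq.heappush(heap, (freq, i, p))
    serial = len(heap)  # tie-breaker for nodes created later
    while len(heap) > 1:
        _, _, p1 = heapq.heappop(heap)
        _, _, p2 = heapq.heappop(heap)
        # Combine p1 and p2 as branches of a single tree.
        p = Node(p1.symbol + p2.symbol, p1.freq + p2.freq, p1, p2)
        p1.father = p2.father = p
        heapq.heappush(heap, (p.freq, serial, p))
        serial += 1
    # The tree is now constructed; use it to find the codes.
    root = heapq.heappop(heap)[2]
    code = {}
    for symbol, p in position.items():
        bits = ""
        while p is not root:  # travel up to the root
            bits = ("0" if p.father.left is p else "1") + bits
            p = p.father
        code[symbol] = bits
    return code


message = "ABACCDA"
code = huffman_codes(message)
encoded = "".join(code[s] for s in message)
print(code)
print(encoded, len(encoded))
```

For the message ABACCDA this reproduces the code table of Example #3, though a different tie-breaking rule could yield a different, equally short code.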