Post on 06-Feb-2016
description
sorting3 1
Sorting III:Sorting III:
Heapsort
sorting3 2
A good sorting algorithm is hard A good sorting algorithm is hard to find ...to find ...
• Quadratic sorting algorithms (with running times of O(N2), such as Selectionsort & Insertionsort are unacceptable for sorting large data sets
• Mergesort and Quicksort are both better alternatives, but each has its own set of problems
sorting3 3
A good sorting algorithm is hard A good sorting algorithm is hard to find ...to find ...
• Both Mergesort and Quicksort have average running times of O(N log N), but:– Quicksort’s worst-case running time is
quadratic– Mergesort uses dynamic memory and may fail
for large data sets because not enough memory is available to complete the sort
sorting3 4
Heapsort to the rescue!Heapsort to the rescue!
• Combines the time efficiency of Mergesort with the storage efficiency of Quicksort
• Uses an element interchange algorithm similar to Selectionsort
• Works by transforming the array to be sorted into a heap
sorting3 5
Review of HeapsReview of Heaps
• A heap is a binary tree, usually implemented as an array
• The data entry in any node is never less than the data entries in its children
• The tree must be complete:– every level except the deepest is fully populated– at the deepest level, nodes are placed as far to
the left as possible
sorting3 6
Review of HeapsReview of Heaps
• Data location in an array-based heap:– the root entry is in array[0]– for any node at array[n], its parent node may be
found at array[(n-1)/2]– if a node at array[n] has children, they will be
found at:• array[2n+1] (left child)
• array[2n+2] (right child)
sorting3 7
How Heapsort WorksHow Heapsort Works
• Begin with an array of values to be sorted
• Heapsort algorithm treats the array as if it were a complete binary tree
• Values are rearranged so that the other heap condition is met: each parent node is greater than or equal to its children
sorting3 8
Heapsort in ActionHeapsort in Action
42 19 33 8 12 97 54 85 29 60 26 71
Original array
42
19 33
8 12 97 54
85 29 60 26 71 … arranged as binarytree (but not yet heap)
sorting3 9
Heapsort in ActionHeapsort in Action
97 85 71 29 60 42 54 8 19 12 26 33
4229
338 12
97
54
85
19
60
26
71
… rearranged as heap
sorting3 10
Heapsort in ActionHeapsort in ActionThe goal of the sort algorithm is to arrange entries in order,smallest to largest. Since the root element is by definition thelargest element in a heap, that largest element can be placed inits final position by simply swapping it with whatever elementhappens to be in the last position:
9733
As a result, the shaded area can now be designated the sorted side of the array, and the unsorted side is almost, but not quite, a heap.
sorting3 11
Heapsort in ActionHeapsort in Action
The next step is to restore the heap condition by rearranging thenodes again; the root value is swapped with its largest child, andmay be swapped further down until the heap condition is restored:
85 33 3360
Once the heap is restored, the previous operation is then repeated:the root value, which is the largest on the unsorted side, is swapped with the last value on the unsorted side:
8526
sorting3 12
Heapsort in ActionHeapsort in Action
Again, the unsorted side is almost a heap, and must be restored toheap condition - this is accomplished the same way as before:
71 26 2654
The process continues: root is swapped with the last value andmarked sorted; the unsorted portion is restored to heap condition
7112 1260 1233 1219
sorting3 13
Heapsort in ActionHeapsort in ActionAnd so on ...
601254 1242 12
And so forth ...
54842 826 8
Etc. ...
42833 829 8 331229 1219 12291226 12 26819 8 1912 128
Tada!
sorting3 14
Pseudocode for HeapsortPseudocode for Heapsort
• Convert array of n elements into heap
• Set index of unsorted to n (unsorted = n)
• while (unsorted > 1)unsorted--;
swap (array[0], array[unsorted]);
reHeapDown
sorting3 15
Code for HeapsortCode for Heapsort
template <class item>void heapsort (item array[ ], size_t n){
size_t unsorted = n;makeHeap(array, n);while (unsorted > 1){
unsorted--;swap(array[0], array[unsorted]);reHeapDown(data, unsorted);
}}
sorting3 16
makeHeap functionmakeHeap function
• Two common ways to build initial heap
• First method builds heap one element at a time, starting at front of array
• Uses reHeapUp process (last seen in priority queue implementation) to enforce heap condition - children are <= parents
sorting3 17
Code for makeHeapCode for makeHeaptemplate <class item>void makeHeap (item data[ ], size_t n){
for (size_t x=1; x<n; x++){
size_t k = x;while (data[k] > data[parent(k)]){
swap (data[k], data[parent(k)]);k = parent(k);
}}
}
sorting3 18
Helper functions for makeHeapHelper functions for makeHeap
template <class item>void swap (item& a, item& b){
item temp = a;a = b;b = temp;
}
size_t parent (size_t index){
size_t n;n = (index-1)/2;return n;
}
sorting3 19
Alternative makeHeap methodAlternative makeHeap method
• Uses a function (subHeap) that creates a heap from a subtree of the complete binary tree
• The function takes three arguments: an array of items to be sorted, the size of the array(n), and a number representing the index of a subtree.
• Function subHeap is called within a loop in the makeHeap function, as follows:for (int x = (n/2); x>0; x--)
subHeap(data, x-1, n);
sorting3 20
Alternative makeHeap in ActionAlternative makeHeap in Action
42 19 33 8 12 97 54 85 29 60 26 71
Original array
First call to subHeap, in the given example:subHeap (data, 5, 12);
Algorithm examines subtree to determine if any change isnecessary; since the only child of element 5 (97) is element 11(71), no change from this call.
sorting3 21
Second call to subHeapSecond call to subHeap
42 19 33 8 12 97 54 85 29 60 26 71
The second call to subHeap looks at the subtree with rootindex 4
The children of this node are found at indexes 9 and 10
Both children are greater than the root node, so the rootnode’s data entry is swapped with the data entry of the largerchild
60 12
sorting3 22
Third call to subHeapThird call to subHeap
42 19 33 8 60 97 54 85 29 12 26 71
Continuing to work backward through the array, the nextroot index is 3
Again, the children of this node are examined, and theroot data entry is swapped with the larger of its two children
85 8
sorting3 23
The process continues ...The process continues ...
42 19 33 85 60 97 54 8 29 12 26 7197 33
In this case, a further swap is necessary because the dataentry’s new position has a child with a larger data entry
71 33
sorting3 24
The process continues ...The process continues ...
42 19 97 85 60 71 54 8 29 12 26 3385 19
Again, a further swap operation is necessary
29 19
sorting3 25
Last call to subHeapLast call to subHeap
42 85 97 29 60 71 54 8 19 12 26 3397 4271 42
We now have a heap:
sorting3 26
ReHeapDown functionReHeapDown function
• Both the original HeapSort function and the subHeap function will call reHeapDown to rearrange the array
• The reHeapDown fucntion swaps an out-of-place data entry with the larger of its children
sorting3 27
Code for reHeapDownCode for reHeapDowntemplate <class item>void reHeapDown(item data [ ], size_t n){
size_t current = 0, big_child, heap_ok = 0; while ((!heap_ok) && (left_child(current) < n)) { if (right_child(current) >= n)
big_child = left_child(current); else if (data[left_child(current)] > data[right_child(current)]) big_child = left_child(current); else big_child = right_child(current);
...
sorting3 28
Code for reHeapDownCode for reHeapDown
if (data[current] < data[big_child]) { swap(data[current], data[big_child]); current = big_child; } else heap_ok = 1; } // end of while loop} // end of function
sorting3 29
Helper functions for Helper functions for reHeapDownreHeapDown
size_t left_child(size_t n){
return (2*n)+1;}
size_t right_child(size_t n){
return (2*n)+2;}
sorting3 30
Time analysis for HeapSortTime analysis for HeapSort
• The time required for a HeapSort is the sum of the time required for the major operations:– building the initial heap: O(N log N)– removing items from the heap: O(N log N)
• So total time required is O(2N log N), but since constants don’t count, the worst-case (and average case) for HeapSort is O(N log N)