Internal Sorting 2
S. Thiel
Sorting
Quicksort
Mergesort
Heapsort
Radix Sort
References
Internal Sorting 2
S. Thiel
Department of Computer Science & Software Engineering, Concordia University
May 15, 2018
Outline
- Sorting
  - Quicksort
  - Mergesort
  - Heapsort
  - Radix Sort
- References
Sorting
- Three general-purpose sorting algorithms
- They sort by comparing
  - Quicksort
  - Mergesort
  - Heapsort
Quicksort
- Pick a pivot
- Partition around the pivot
  - Items that are smaller/equal go before
  - Items that are larger go after
- Recursively apply Quicksort on each partition
Quicksort Analysis
- In the best case: Θ(n log n)
- In the average case: Θ(n log n)
- In the worst case: Θ(n^2)
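A short sketch of where the best and worst cases come from, assuming partitioning a range of n items costs Θ(n) (the recurrences are the usual textbook ones, not taken from the slides):

```latex
\begin{align*}
\text{Best case (pivot splits evenly):}\quad T(n) &= 2\,T(n/2) + \Theta(n) = \Theta(n \log n)\\
\text{Worst case (pivot is an extreme value):}\quad T(n) &= T(n-1) + \Theta(n) = \Theta(n^2)
\end{align*}
```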
Quicksort Properties 1
- Quicksort...
  - is a divide and conquer algorithm
  - works best with good pivot selection
  - is recursive
  - puts a pivot in place every pass
  - is in-place
Quicksort Properties 2
- Is it Stable?
- Some neat optimizations to make it fast and stable with many duplicates
  - ...might be a bit slower otherwise
Quicksort Standard Approaches [1, p.242]
- Picking a pivot often uses median of three
- Diversion to Insertion Sort for small partitions
- Tail recursion
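The three approaches above can be combined roughly as follows. This is a minimal sketch, not Shaffer's code: the class name, the `CUTOFF` value of 8, and the simple Lomuto-style partition (used here instead of the two-pointer partition from the lecture) are all assumptions made for illustration.

```java
import java.util.Arrays;

final class QuicksortSketch {
    static final int CUTOFF = 8; // assumed threshold; tune by measurement

    static void sort(int[] a) { quicksort(a, 0, a.length - 1); }

    static void quicksort(int[] a, int lo, int hi) {
        // Loop instead of making the second recursive call (tail recursion removed)
        while (hi - lo + 1 > CUTOFF) {
            int p = partition(a, lo, hi);
            // Recurse into the smaller side, keep looping on the larger side
            if (p - lo < hi - p) { quicksort(a, lo, p - 1); lo = p + 1; }
            else                 { quicksort(a, p + 1, hi); hi = p - 1; }
        }
        insertionSort(a, lo, hi); // small partitions are diverted here
    }

    // Orders a[lo], a[mid], a[hi] and returns the index holding their median
    static int medianOfThree(int[] a, int lo, int hi) {
        int mid = lo + (hi - lo) / 2;
        if (a[mid] < a[lo]) swap(a, mid, lo);
        if (a[hi] < a[lo]) swap(a, hi, lo);
        if (a[hi] < a[mid]) swap(a, hi, mid);
        return mid;
    }

    // Lomuto-style partition: smaller/equal items go before the pivot
    static int partition(int[] a, int lo, int hi) {
        swap(a, medianOfThree(a, lo, hi), hi); // stick pivot at end
        int pivot = a[hi];
        int store = lo;
        for (int i = lo; i < hi; i++) {
            if (a[i] <= pivot) swap(a, i, store++);
        }
        swap(a, store, hi); // put pivot in place
        return store;
    }

    static void insertionSort(int[] a, int lo, int hi) {
        for (int i = lo + 1; i <= hi; i++) {
            int v = a[i], j = i - 1;
            while (j >= lo && a[j] > v) { a[j + 1] = a[j]; j--; }
            a[j + 1] = v;
        }
    }

    static void swap(int[] a, int i, int j) { int t = a[i]; a[i] = a[j]; a[j] = t; }

    public static void main(String[] args) {
        int[] data = {9, 3, 7, 1, 8, 2, 5, 4, 6, 0, 3, 3, 12, 11, 10};
        sort(data);
        System.out.println(Arrays.toString(data));
    }
}
```

Recursing only into the smaller side keeps the call stack at O(log n) depth even when the pivots turn out badly.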
Standard Implementation 1
    void qsort(int[] A, int i, int j) {      // Quicksort
      int pivotindex = findpivot(A, i, j);
      swap(A, pivotindex, j);                // Stick pivot at end
      // k will be the first position in the right subarray
      int k = partition(A, i-1, j, A[j]);
      swap(A, k, j);                         // Put pivot in place
      if ((k-i) > 1) qsort(A, i, k-1);       // Sort left
      if ((j-k) > 1) qsort(A, k+1, j);       // Sort right
    }
Standard Implementation 2
    int partition(int[] A, int l, int r, int pivot) {
      do {                       // Move bounds inward until they meet
        while (A[++l] < pivot);
        while ((r != 0) && (A[--r] > pivot));
        swap(A, l, r);           // Swap out-of-place values
      } while (l < r);           // Stop when they cross
      swap(A, l, r);             // Reverse last, wasted swap
      return l;                  // Return first position in right partition
    }
NumberSort Learning Aid
- Thanks to Justin Cotarla who took some time to make this!
- https://users.encs.concordia.ca/~sthiel/comp352/NumberSort/
Mergesort
- Splits input into halves repeatedly
- Merges halves back together
Mergesort Analysis
- In the best case: Θ(n log n)
- In the average case: Θ(n log n)
- In the worst case: Θ(n log n)
Figure: A Mergesort example [2].
Standard Implementation 1
    import java.util.LinkedList;
    import java.util.List;

    List<Integer> mergesort(List<Integer> inlist) {
      if (inlist.size() <= 1) return inlist;
      int mid = inlist.size() / 2;
      // Copy each half: subList returns a view, and merge removes elements
      List<Integer> L1 = new LinkedList<>(inlist.subList(0, mid));
      // The end index of subList is exclusive, so size() keeps the last item
      List<Integer> L2 = new LinkedList<>(inlist.subList(mid, inlist.size()));
      return merge(mergesort(L1), mergesort(L2));
    }
Standard Implementation 2
    // Uses java.util.List and java.util.LinkedList
    List<Integer> merge(List<Integer> L1, List<Integer> L2) {
      List<Integer> L = new LinkedList<>();
      while (!L1.isEmpty() && !L2.isEmpty()) {
        if (L1.get(0) <= L2.get(0))      // <= takes from L1 on ties
          L.add(L1.remove(0));
        else L.add(L2.remove(0));
      }
      if (L1.isEmpty()) L.addAll(L2);    // Append whatever is left over
      if (L2.isEmpty()) L.addAll(L1);
      return L;
    }
Mergesort with Lists
- The code above looks nice with Lists
- Finding the halfway point of a List is costly
- How costly? I say 2n. Why?
- We can alternate and skip finding halfway points for 1n
- We can use a List-of-Lists and just start merging sublists depth-first?
- We can use a List-of-Lists approach finding inherent structure first?
- Is the cost of merging lists of varying sizes worth it?
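The "alternate" idea above can be sketched as dealing elements into two lists in a single pass, so no halfway point is ever located. This is an illustrative sketch only; the class and method names are hypothetical and `LinkedList<Integer>` is an assumed element type.

```java
import java.util.Arrays;
import java.util.LinkedList;

final class AlternatingSplitSketch {
    static LinkedList<Integer> mergesort(LinkedList<Integer> in) {
        if (in.size() <= 1) return in;
        LinkedList<Integer> l1 = new LinkedList<>();
        LinkedList<Integer> l2 = new LinkedList<>();
        boolean toFirst = true;
        for (Integer x : in) {             // one pass over the input
            (toFirst ? l1 : l2).add(x);    // deal alternately, like cards
            toFirst = !toFirst;
        }
        return merge(mergesort(l1), mergesort(l2));
    }

    static LinkedList<Integer> merge(LinkedList<Integer> a, LinkedList<Integer> b) {
        LinkedList<Integer> out = new LinkedList<>();
        while (!a.isEmpty() && !b.isEmpty()) {
            // Removing from the front of a LinkedList is constant time
            out.add(a.getFirst() <= b.getFirst() ? a.removeFirst() : b.removeFirst());
        }
        out.addAll(a);                     // at most one of these is non-empty
        out.addAll(b);
        return out;
    }

    public static void main(String[] args) {
        LinkedList<Integer> data = new LinkedList<>(Arrays.asList(5, 1, 4, 2, 3, 3));
        System.out.println(mergesort(data));
    }
}
```

One trade-off worth noting: dealing alternately interleaves the original positions, so equal keys can end up reordered and this variant is not stable.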
Mergesort with Arrays
- Using two arrays solves most Array-related issues
- Using two arrays implements pretty smoothly
- There is no cost to finding the halfway point
- You need an empty array, unlike Quicksort/Mergesort w/ lists
- Can you benefit from sorting existing runs?
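A minimal sketch of the two-array approach, assuming `int` keys; the class name and the choice to allocate the second array once at the top level are illustrative, not taken from the lecture.

```java
import java.util.Arrays;

final class ArrayMergesortSketch {
    static void sort(int[] a) {
        int[] temp = new int[a.length];    // the extra (empty) array
        mergesort(a, temp, 0, a.length - 1);
    }

    static void mergesort(int[] a, int[] temp, int l, int r) {
        if (l >= r) return;                // zero or one element
        int mid = l + (r - l) / 2;         // halfway point in constant time
        mergesort(a, temp, l, mid);
        mergesort(a, temp, mid + 1, r);
        // Copy the two sorted halves out, then merge them back into a
        System.arraycopy(a, l, temp, l, r - l + 1);
        int i = l, j = mid + 1;
        for (int k = l; k <= r; k++) {
            if (i > mid)                   a[k] = temp[j++]; // left half used up
            else if (j > r)                a[k] = temp[i++]; // right half used up
            else if (temp[i] <= temp[j])   a[k] = temp[i++]; // ties go left: stable
            else                           a[k] = temp[j++];
        }
    }

    public static void main(String[] args) {
        int[] data = {36, 20, 17, 13, 28, 14, 23, 15};
        sort(data);
        System.out.println(Arrays.toString(data));
    }
}
```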
Mergesort with existing runs
- Does the best-case change?
Heapsort
- Build a heap in Θ(n)
- Take the value off the heap in Θ(1)
- Re-settle the heap in Θ(log n)
Heapsort Analysis
- In the best case: Θ(n log n)
- In the average case: Θ(n log n)
- In the worst case: Θ(n log n)
Heapsort Properties
- Can be done in-place
- Generally considered slower than Quicksort
- Effective when only the first few values in a list are needed
- Shaffer and most others show Top-Down Heapsort
- Bottom-Up Heapsort is twice as fast, faster with a bit extra memory
- Works well when you have more data than fits in memory
Heap navigation
- We can use math to navigate the heap
  - iParent(i) = floor((i − 1)/2)
  - iLeftChild(i) = 2 * i + 1
  - iRightChild(i) = 2 * i + 2
- http://faculty.simpson.edu/lydia.sinapova/www/cmsc250/LN250_Weiss/L13-HeapSortEx.htm
- https://en.wikipedia.org/wiki/Heapsort
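The index formulas above are enough to run Heapsort in place on a plain array. The following is a rough sketch assuming a max-heap over `int` keys stored from index 0; the class and helper names are hypothetical.

```java
import java.util.Arrays;

final class HeapsortSketch {
    static int parent(int i)     { return (i - 1) / 2; } // valid for i >= 1
    static int leftChild(int i)  { return 2 * i + 1; }
    static int rightChild(int i) { return 2 * i + 2; }

    static void heapsort(int[] a) {
        int n = a.length;
        if (n < 2) return;
        // Build the heap: sift down every internal node, last parent first
        for (int i = parent(n - 1); i >= 0; i--) siftDown(a, i, n);
        // Repeatedly take the max off the heap and re-settle what remains
        for (int end = n - 1; end > 0; end--) {
            swap(a, 0, end);        // max moves to its final position
            siftDown(a, 0, end);    // heap now occupies a[0..end-1]
        }
    }

    // Push a[i] down until neither child is larger; size bounds the heap
    static void siftDown(int[] a, int i, int size) {
        while (leftChild(i) < size) {
            int big = leftChild(i);
            int r = rightChild(i);
            if (r < size && a[r] > a[big]) big = r;
            if (a[i] >= a[big]) return;
            swap(a, i, big);
            i = big;
        }
    }

    static void swap(int[] a, int i, int j) { int t = a[i]; a[i] = a[j]; a[j] = t; }

    public static void main(String[] args) {
        int[] data = {73, 6, 57, 88, 60, 42, 83, 72, 48, 85};
        heapsort(data);
        System.out.println(Arrays.toString(data));
    }
}
```

Because the sorted values accumulate at the end of the same array, no second array is needed, which is what makes the in-place property possible.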
Two Types of Radix Sort
- Most Significant Digit looks good, but actually can be bad
- Least Significant Digit looks weird, but actually good
- LSD Radix sort is how most of us sort cards
LSD Radix Sort Analysis
- In the best case: Θ(n)
- In the average case: Θ(n)
- In the worst case: Θ(n)
MSD Radix Sort Analysis
- In the best case: Θ(n)
- In the average case: Θ(L)
- In the worst case: Θ(L)
- What the heck is L?
LSD Radix Sort Code
Algorithm 1 LSD Radix Sort
Require: input, val()
Ensure: sorted input
/* Initialization */
buf ← initArray[input.length]
counts ← initArray[passes][bucketCount]
/* perform initial counting pass */
for i in input do
    for j = 0 to passes − 1 do
        counts[j][bitsFor(val(i), j)]++
    end for
end for
/* convert counts to indices */
for j = 0 to passes − 1 do
    convertToIndices(counts[j])
end for
/* deal to buffer based on current radix */
for j = 0 to passes − 1 do
    for i in input do
        buf[counts[j][bitsFor(val(i), j)]++] = i
    end for
    swap(input, buf)
end for
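The pseudocode above can be made concrete roughly as follows. This is a sketch under stated assumptions: keys are non-negative `int` values, each pass looks at 8 bits (so 4 passes and 256 buckets), and `convertToIndices` is taken to mean an exclusive prefix sum. None of these choices come from the slides.

```java
import java.util.Arrays;

final class LsdRadixSketch {
    static final int BITS = 8;                 // assumed digit width
    static final int BUCKETS = 1 << BITS;      // 256 buckets per pass
    static final int PASSES = 32 / BITS;       // 4 passes cover an int

    // Extracts the digit examined on the given pass (pass 0 = least significant)
    static int bitsFor(int value, int pass) {
        return (value >>> (pass * BITS)) & (BUCKETS - 1);
    }

    // Sorts non-negative ints; returns the array that holds the sorted result
    static int[] sort(int[] input) {
        int[] buf = new int[input.length];
        int[][] counts = new int[PASSES][BUCKETS];

        // Initial counting pass: histogram every digit of every key
        for (int v : input)
            for (int j = 0; j < PASSES; j++) counts[j][bitsFor(v, j)]++;

        // Convert counts to starting indices (exclusive prefix sums)
        for (int j = 0; j < PASSES; j++) {
            int sum = 0;
            for (int b = 0; b < BUCKETS; b++) {
                int c = counts[j][b];
                counts[j][b] = sum;
                sum += c;
            }
        }

        // Deal to the buffer based on the current digit, then swap roles
        for (int j = 0; j < PASSES; j++) {
            for (int v : input) buf[counts[j][bitsFor(v, j)]++] = v;
            int[] t = input; input = buf; buf = t;
        }
        return input;
    }

    public static void main(String[] args) {
        int[] data = {170, 45, 75, 90, 802, 24, 2, 66};
        System.out.println(Arrays.toString(sort(data)));
    }
}
```

Each dealing pass preserves the order left by the previous one, which is why processing the least significant digit first still yields a fully sorted result.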
LSD Radix Sort Example
Figure: A Radix Sort Example.
LSD Radix Sort Analysis
- If the size of the keys grows with n, not so good
  - We go back to Θ(n log n)
- But! If that is the case, then comparison sorts become:
  - Θ(n log n log n). Why?
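One possible reading of the two bounds above, offered as a sketch rather than the lecture's own derivation. It assumes n distinct keys, a fixed digit width of r bits, and that comparing two keys costs time proportional to their length:

```latex
\begin{align*}
k &= \lceil \log_2 n / r \rceil = \Theta(\log n)
  && \text{digits needed to tell $n$ distinct keys apart}\\
T_{\text{radix}}(n) &= \Theta(n \cdot k) = \Theta(n \log n)
  && \text{one $\Theta(n)$ pass per digit}\\
T_{\text{comparison}}(n) &= \Theta(n \log n) \cdot \Theta(\log n) = \Theta(n \log n \log n)
  && \text{each comparison reads $\Theta(\log n)$ bits}
\end{align*}
```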
References I
[1] Clifford A. Shaffer. Data Structures and Algorithm Analysis in Java. 2013.
[2] Wikipedia. Mergesort. https://en.wikipedia.org/wiki/Merge_sort, May 2017.