
Internal Sorting 2

S. Thiel
Department of Computer Science & Software Engineering, Concordia University

May 15, 2018


Outline

- Sorting
  - Quicksort
  - Mergesort
  - Heapsort
  - Radix Sort
- References


Sorting

- Three general-purpose sorting algorithms
- They sort by comparing
- Quicksort
- Mergesort
- Heapsort


Quicksort

- Pick a pivot
- Partition around the pivot
  - Items that are smaller or equal go before
  - Items that are larger go after
- Recursively apply Quicksort on each partition


Quicksort Analysis

- In the best case: Θ(n log n)
- In the average case: Θ(n log n)
- In the worst case: Θ(n²)
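The best and worst cases above follow from the recurrence for the partition sizes. As a rough sketch, assuming partitioning n items costs cn for some constant c (the symbols T and c are introduced here for illustration):

```latex
\begin{align*}
\text{Best case (even split):}\quad
  & T(n) = 2\,T(n/2) + cn \;\Rightarrow\; T(n) = \Theta(n \log n) \\
\text{Worst case (one side empty):}\quad
  & T(n) = T(n-1) + cn \;\Rightarrow\; T(n) = T(0) + c\sum_{k=1}^{n} k = \Theta(n^2)
\end{align*}
```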


Quicksort Properties 1

- Quicksort...
  - is a divide and conquer algorithm
  - works best with good pivot selection
  - is recursive
  - puts a pivot in place every pass
  - is in-place


Quicksort Properties 2

- Is it stable?
- Some neat optimizations to make it fast and stable with many duplicates
  - ... might be a bit slower otherwise


Quicksort Standard Approaches [1, p.242]

- Picking a pivot often uses median of three
- Diversion to Insertion Sort for small partitions
- Tail recursion
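The three approaches above can be combined in one routine. The following is a minimal sketch, not Shaffer's code: the class name, the `CUTOFF` value of 8, and the Lomuto-style partition are all assumptions chosen for brevity. The tail call is replaced by a loop that always recurses on the smaller side.

```java
import java.util.Arrays;

class TunedQuicksortSketch {
    static final int CUTOFF = 8; // assumed threshold for switching to Insertion Sort

    static void swap(int[] a, int x, int y) {
        int tmp = a[x]; a[x] = a[y]; a[y] = tmp;
    }

    // Orders a[lo], a[mid], a[hi] and returns the index holding their median.
    static int medianOfThree(int[] a, int lo, int hi) {
        int mid = lo + (hi - lo) / 2;
        if (a[mid] < a[lo]) swap(a, mid, lo);
        if (a[hi] < a[lo]) swap(a, hi, lo);
        if (a[hi] < a[mid]) swap(a, hi, mid);
        return mid;
    }

    static void insertionSort(int[] a, int lo, int hi) {
        for (int i = lo + 1; i <= hi; i++) {
            int v = a[i];
            int j = i - 1;
            while (j >= lo && a[j] > v) { a[j + 1] = a[j]; j--; }
            a[j + 1] = v;
        }
    }

    // Lomuto-style partition around a[hi]; returns the pivot's final index.
    static int partition(int[] a, int lo, int hi) {
        int pivot = a[hi];
        int store = lo;
        for (int i = lo; i < hi; i++) {
            if (a[i] < pivot) swap(a, i, store++);
        }
        swap(a, store, hi);
        return store;
    }

    static void sort(int[] a, int lo, int hi) {
        while (hi - lo + 1 > CUTOFF) {
            swap(a, medianOfThree(a, lo, hi), hi);   // median-of-three pivot to the end
            int k = partition(a, lo, hi);
            // Recurse on the smaller side, loop on the larger one (tail call removed).
            if (k - lo < hi - k) { sort(a, lo, k - 1); lo = k + 1; }
            else { sort(a, k + 1, hi); hi = k - 1; }
        }
        insertionSort(a, lo, hi);                    // small partitions
    }

    public static void main(String[] args) {
        int[] data = {31, 4, 15, 9, 26, 5, 35, 8, 97, 93, 23, 84, 62, 64, 33};
        sort(data, 0, data.length - 1);
        System.out.println(Arrays.toString(data));
    }
}
```

Recursing only on the smaller side keeps the recursion depth at O(log n) even when the splits are uneven.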


Standard Implementation 1

    void qsort(int[] A, int i, int j) {        // Quicksort
        int pivotindex = findpivot(A, i, j);   // Pick a pivot
        swap(A, pivotindex, j);                // Stick pivot at end
        // k will be the first position in the right subarray
        int k = partition(A, i-1, j, A[j]);
        swap(A, k, j);                         // Put pivot in place
        if ((k-i) > 1) qsort(A, i, k-1);       // Sort left partition
        if ((j-k) > 1) qsort(A, k+1, j);       // Sort right partition
    }


Standard Implementation 2

    int partition(int[] A, int l, int r, int pivot) {
        do {                                   // Move bounds inward until they meet
            while (A[++l] < pivot);
            while ((r != 0) && (A[--r] > pivot));
            swap(A, l, r);                     // Swap out-of-place values
        } while (l < r);                       // Stop when they cross
        swap(A, l, r);                         // Reverse last, wasted swap
        return l;                              // Return first position in right partition
    }
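The two slides leave `findpivot` and `swap` undefined. The sketch below fills them in so the code can be run as a whole. It assumes a middle-element pivot, and the class name and the `sort` wrapper are hypothetical additions for illustration.

```java
import java.util.Arrays;

class QuicksortSketch {
    // Assumption: use the middle element of the subarray as the pivot.
    static int findpivot(int[] A, int i, int j) {
        return i + (j - i) / 2;
    }

    static void swap(int[] A, int a, int b) {
        int tmp = A[a]; A[a] = A[b]; A[b] = tmp;
    }

    static int partition(int[] A, int l, int r, int pivot) {
        do {                                   // Move bounds inward until they meet
            while (A[++l] < pivot);
            while ((r != 0) && (A[--r] > pivot));
            swap(A, l, r);                     // Swap out-of-place values
        } while (l < r);                       // Stop when they cross
        swap(A, l, r);                         // Reverse last, wasted swap
        return l;                              // First position in right partition
    }

    static void qsort(int[] A, int i, int j) {
        int pivotindex = findpivot(A, i, j);
        swap(A, pivotindex, j);                // Stick pivot at end
        int k = partition(A, i - 1, j, A[j]);
        swap(A, k, j);                         // Put pivot in place
        if ((k - i) > 1) qsort(A, i, k - 1);   // Sort left partition
        if ((j - k) > 1) qsort(A, k + 1, j);   // Sort right partition
    }

    // Hypothetical convenience wrapper; guards against arrays with fewer than two items.
    static void sort(int[] A) {
        if (A.length > 1) qsort(A, 0, A.length - 1);
    }

    public static void main(String[] args) {
        int[] data = {72, 6, 57, 88, 60, 42, 83, 73, 48, 85};
        sort(data);
        System.out.println(Arrays.toString(data));
    }
}
```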


NumberSort Learning Aid

- Thanks to Justin Cotarla who took some time to make this!
- https://users.encs.concordia.ca/~sthiel/comp352/NumberSort/


Mergesort

- Splits the input into halves repeatedly
- Merges the halves back together


Mergesort Analysis

- In the best case: Θ(n log n)
- In the average case: Θ(n log n)
- In the worst case: Θ(n log n)


Figure: A Mergesort example [2].


Standard Implementation 1

    // Needs java.util.List and java.util.ArrayList
    List<Integer> mergesort(List<Integer> inlist) {
        if (inlist.size() <= 1) return inlist;
        int mid = inlist.size() / 2;
        // subList's upper bound is exclusive; copy so merge() can remove elements
        List<Integer> L1 = new ArrayList<>(inlist.subList(0, mid));
        List<Integer> L2 = new ArrayList<>(inlist.subList(mid, inlist.size()));
        return merge(mergesort(L1), mergesort(L2));
    }


Standard Implementation 2

    List<Integer> merge(List<Integer> L1, List<Integer> L2) {
        List<Integer> L = new ArrayList<>();
        while (!L1.isEmpty() && !L2.isEmpty()) {
            if (L1.get(0) <= L2.get(0))        // <= takes from L1 on ties
                L.add(L1.remove(0));
            else L.add(L2.remove(0));
        }
        if (L1.isEmpty()) L.addAll(L2);        // Append whatever is left over
        if (L2.isEmpty()) L.addAll(L1);
        return L;
    }


Mergesort with Lists

- The code above looks nice with Lists
- Finding the halfway point of a List is costly
  - How costly? I say 2n. Why?
- We can alternate and skip finding halfway points for 1n
- We can use a List-of-Lists and just start merging sublists depth-first?
- We can use a List-of-Lists approach finding inherent structure first?
  - Is the cost of merging lists of varying sizes worth it?
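The "alternate" idea above can be sketched by dealing elements to two sublists in turn, so the split is a single pass and no halfway point is ever located. This is an illustrative sketch using `java.util.LinkedList`; the class and variable names are assumptions. Note that this split does not keep equal keys in their original relative order, so this variant is not stable.

```java
import java.util.Arrays;
import java.util.LinkedList;

class AlternatingMergesortSketch {
    // Sorts by consuming the input list; the sorted result is returned.
    static LinkedList<Integer> mergesort(LinkedList<Integer> in) {
        if (in.size() <= 1) return in;
        LinkedList<Integer> l1 = new LinkedList<>();
        LinkedList<Integer> l2 = new LinkedList<>();
        boolean toFirst = true;
        while (!in.isEmpty()) {                // Deal elements alternately: one pass
            (toFirst ? l1 : l2).addLast(in.removeFirst());
            toFirst = !toFirst;
        }
        return merge(mergesort(l1), mergesort(l2));
    }

    static LinkedList<Integer> merge(LinkedList<Integer> l1, LinkedList<Integer> l2) {
        LinkedList<Integer> out = new LinkedList<>();
        while (!l1.isEmpty() && !l2.isEmpty()) {
            if (l1.getFirst() <= l2.getFirst()) out.addLast(l1.removeFirst());
            else out.addLast(l2.removeFirst());
        }
        out.addAll(l1);                        // At most one of these is non-empty
        out.addAll(l2);
        return out;
    }

    public static void main(String[] args) {
        LinkedList<Integer> data = new LinkedList<>(Arrays.asList(36, 20, 17, 13, 28, 14, 23, 15));
        System.out.println(mergesort(data));
    }
}
```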


Mergesort with Arrays

- Using two arrays solves most Array-related issues
- Using two arrays implements pretty smoothly
- There is no cost to finding the halfway point
- You need an empty array, unlike Quicksort or Mergesort with lists
- Can you benefit from sorting existing runs?
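A two-array Mergesort along the lines described above might look like the following sketch. The class name and the `temp` parameter are assumptions. The halfway point is plain index arithmetic, and the second array is allocated once and reused by every merge.

```java
import java.util.Arrays;

class ArrayMergesortSketch {
    static void sort(int[] a) {
        int[] temp = new int[a.length];        // The extra (empty) array
        mergesort(a, temp, 0, a.length - 1);
    }

    static void mergesort(int[] a, int[] temp, int left, int right) {
        if (left >= right) return;             // Zero or one element
        int mid = left + (right - left) / 2;   // Halfway point costs O(1)
        mergesort(a, temp, left, mid);
        mergesort(a, temp, mid + 1, right);
        for (int i = left; i <= right; i++) temp[i] = a[i];   // Copy to temp
        int i1 = left, i2 = mid + 1;
        for (int curr = left; curr <= right; curr++) {        // Merge back into a
            if (i1 > mid) a[curr] = temp[i2++];               // Left half exhausted
            else if (i2 > right) a[curr] = temp[i1++];        // Right half exhausted
            else if (temp[i1] <= temp[i2]) a[curr] = temp[i1++];
            else a[curr] = temp[i2++];
        }
    }

    public static void main(String[] args) {
        int[] data = {36, 20, 17, 13, 28, 14, 23, 15};
        sort(data);
        System.out.println(Arrays.toString(data));
    }
}
```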


Mergesort with existing runs

- Does the best case change?


Heapsort

- Build a heap in Θ(n)
- Take the top value off the heap: Θ(1)
- Re-settle the heap: Θ(log n)


Heapsort Analysis

- In the best case: Θ(n log n)
- In the average case: Θ(n log n)
- In the worst case: Θ(n log n)


Heapsort Properties

- Can be done in-place
- Generally considered slower than Quicksort
- Effective when only the first few values in a list are needed
- Shaffer and most others show Top-Down Heapsort
- Bottom-Up Heapsort is twice as fast, faster with a bit of extra memory
- Works well when you have more data than fits in memory
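The point about needing only the first few values can be illustrated with `java.util.PriorityQueue`, which is a min-heap. This is a sketch with hypothetical names (`SmallestKSketch`, `smallestK`): building the heap from a collection is linear, and each of the k removals costs O(log n), for roughly Θ(n + k log n) overall instead of a full sort.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

class SmallestKSketch {
    // Returns the k smallest values in ascending order (fewer if the input is shorter).
    static List<Integer> smallestK(List<Integer> values, int k) {
        PriorityQueue<Integer> heap = new PriorityQueue<>(values); // Build the heap
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < k && !heap.isEmpty(); i++) {
            out.add(heap.poll());              // Take the top value, re-settle the heap
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(smallestK(Arrays.asList(9, 4, 7, 1, 8, 2), 3));
    }
}
```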


Heap navigation

- We can use math to navigate the heap (zero-based array indices)
  - iParent(i) = floor((i − 1)/2)
  - iLeftChild(i) = 2 ∗ i + 1
  - iRightChild(i) = 2 ∗ i + 2
- http://faculty.simpson.edu/lydia.sinapova/www/cmsc250/LN250_Weiss/L13-HeapSortEx.htm
- https://en.wikipedia.org/wiki/Heapsort
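The index formulas above are enough to write an in-place Heapsort over a plain array. The following is a minimal max-heap sketch, with class and method names chosen for illustration: it builds the heap by sifting down from the last parent, then repeatedly moves the top value to the end and re-settles the remainder.

```java
import java.util.Arrays;

class HeapsortSketch {
    static void swap(int[] a, int x, int y) {
        int tmp = a[x]; a[x] = a[y]; a[y] = tmp;
    }

    // Restore the max-heap property below index i, within a[0..size-1].
    static void siftDown(int[] a, int i, int size) {
        while (true) {
            int left = 2 * i + 1;              // iLeftChild(i)
            int right = 2 * i + 2;             // iRightChild(i)
            int largest = i;
            if (left < size && a[left] > a[largest]) largest = left;
            if (right < size && a[right] > a[largest]) largest = right;
            if (largest == i) return;
            swap(a, i, largest);
            i = largest;
        }
    }

    static void sort(int[] a) {
        int n = a.length;
        // Build the heap: start at iParent(n - 1) = n/2 - 1 and work back to the root.
        for (int i = n / 2 - 1; i >= 0; i--) siftDown(a, i, n);
        // Take the maximum off the heap, then re-settle the smaller heap.
        for (int end = n - 1; end > 0; end--) {
            swap(a, 0, end);
            siftDown(a, 0, end);
        }
    }

    public static void main(String[] args) {
        int[] data = {73, 6, 57, 88, 60, 42, 83, 72, 48, 85};
        sort(data);
        System.out.println(Arrays.toString(data));
    }
}
```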


Two Types of Radix Sort

- Most Significant Digit looks good, but actually can be bad
- Least Significant Digit looks weird, but is actually good
- LSD Radix sort is how most of us sort cards


LSD Radix Sort Analysis

- In the best case: Θ(n)
- In the average case: Θ(n)
- In the worst case: Θ(n)


MSD Radix Sort Analysis

- In the best case: Θ(n)
- In the average case: Θ(L)
- In the worst case: Θ(L)
- What the heck is L?


LSD Radix Sort Code

Algorithm 1 LSD Radix Sort

Require: input, val()
Ensure: sorted input

    /* Initialization */
    buf ← initArray[input.length]
    counts ← initArray[passes][bucketCount]

    /* perform initial counting pass */
    for i in input do
        for j = 0 to passes − 1 do
            counts[j][bitsFor(val(i), j)]++
        end for
    end for

    /* convert counts to indices */
    for j = 0 to passes − 1 do
        convertToIndices(counts[j])
    end for

    /* deal to buffer based on current radix */
    for j = 0 to passes − 1 do
        for i in input do
            buf[counts[j][bitsFor(val(i), j)]++] = i
        end for
        swap(input, buf)
    end for
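One possible Java rendering of the pseudocode above, as a sketch under stated assumptions: keys are non-negative `int` values, each pass looks at 8 bits (so 256 buckets and 4 passes), and `convertToIndices` is taken to mean turning each bucket's count into its starting offset. The class name and constants are hypothetical.

```java
import java.util.Arrays;

class LsdRadixSketch {
    static final int BITS = 8;                 // Assumed digit width
    static final int BUCKETS = 1 << BITS;      // bucketCount
    static final int PASSES = 32 / BITS;       // passes

    // The digit of value examined during the given pass (pass 0 is least significant).
    static int bitsFor(int value, int pass) {
        return (value >>> (pass * BITS)) & (BUCKETS - 1);
    }

    // Assumes non-negative keys. Returns the array that holds the sorted result.
    static int[] sort(int[] input) {
        int[] buf = new int[input.length];
        int[][] counts = new int[PASSES][BUCKETS];

        // Initial counting pass: one histogram per digit position.
        for (int v : input) {
            for (int j = 0; j < PASSES; j++) counts[j][bitsFor(v, j)]++;
        }
        // Convert counts to starting indices (exclusive prefix sums).
        for (int j = 0; j < PASSES; j++) {
            int running = 0;
            for (int b = 0; b < BUCKETS; b++) {
                int c = counts[j][b];
                counts[j][b] = running;
                running += c;
            }
        }
        // Deal to the buffer based on the current digit, then swap roles.
        for (int j = 0; j < PASSES; j++) {
            for (int v : input) buf[counts[j][bitsFor(v, j)]++] = v;
            int[] t = input; input = buf; buf = t;
        }
        return input;
    }

    public static void main(String[] args) {
        int[] data = {170, 45, 75, 90, 802, 24, 2, 66};
        System.out.println(Arrays.toString(sort(data)));
    }
}
```

Counting every digit position in one initial pass works because a histogram does not depend on the order of the elements, so later passes can reuse it even after the data has been rearranged.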


LSD Radix Sort Example

Figure: A Radix Sort Example.


LSD Radix Sort Analysis

- If the key size (number of digits) grows with n, not so good
  - We go back to Θ(n log n)
- But! If that is the case, then comparison sorts become:
  - Θ(n log n log n). Why?
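One way to reason about the bounds above, sketched under the assumptions that all n keys are distinct, the radix r is a constant, and k denotes the number of digits per key (k, r and the T names are symbols introduced here):

```latex
\begin{align*}
k &\ge \log_r n
  && \text{(digits needed to tell $n$ distinct keys apart)} \\
T_{\text{LSD}}(n) &= \Theta\bigl(k\,(n + r)\bigr) = \Theta(n \log n)
  && \text{(when $k = \Theta(\log n)$)} \\
T_{\text{cmp}}(n) &= \Theta(n \log n) \cdot \Theta(k) = \Theta(n \log n \cdot \log n)
  && \text{(a comparison may read up to $k$ digits)}
\end{align*}
```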


References

[1] Clifford A. Shaffer. Data Structures and Algorithm Analysis in Java. 2013.

[2] Wikipedia. Mergesort. https://en.wikipedia.org/wiki/Merge_sort, May 2017.