
2110211 Intro. to Data Structures Chapter 7 Sorting Veera Muangsin, Dept. of Computer Engineering, Chulalongkorn University


Chapter 7 Sorting

• Sorting is a very useful and frequently used operation
• It requires a fast algorithm
• Simple algorithms sort in O(N^2)
• More complicated algorithms sort in O(N log N)
• Any general-purpose sorting algorithm requires Ω(N log N) comparisons


Sorting

What is covered in this chapter:
• Sorting an array of integers
• Comparison-based sorting
  – the main operations are compare and swap
• Assume that the entire sort can be done in main memory


Sorting Algorithms

• Insertion Sort
• Shellsort
• Heapsort
• Mergesort
• Quicksort
• Bucket Sort


Insertion Sort

• Sorts N elements in N-1 passes (pass 1 to N-1)
• In pass p
  – insertion sort ensures that the elements in positions 0 through p are in sorted order
  – the elements in positions 0 through p-1 are already in sorted order
  – the element in position p is moved to the left until its correct place is found


Insertion Sort

                                         Positions moved
Original       34   8  64  51  32  21
After p = 1     8  34  64  51  32  21    1
After p = 2     8  34  64  51  32  21    0
After p = 3     8  34  51  64  32  21    1
After p = 4     8  32  34  51  64  21    3
After p = 5     8  21  32  34  51  64    4


Insertion Sort

• The element in position p is saved in tmp
• All larger elements prior to position p are moved one spot to the right
• Then tmp is placed in the correct spot


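    public static void insertionSort( Comparable [ ] a )
    {
        int j;

        for( int p = 1; p < a.length; p++ )
        {
            Comparable tmp = a[ p ];   // element to insert in pass p
            // shift larger elements one spot to the right
            for( j = p; j > 0 && tmp.compareTo( a[ j - 1 ] ) < 0; j-- )
                a[ j ] = a[ j - 1 ];
            a[ j ] = tmp;              // place tmp in the correct spot
        }
    }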


Analysis

• Insertion sort has nested loops, and each loop can have N iterations, so insertion sort is O(N^2).

• The inner loop test can be executed at most p+1 times for each value of p.

• Summing over p = 1 to N-1, the inner loop test is executed at most 2 + 3 + 4 + . . . + N = Θ(N^2) times.

• Input in reverse order achieves this bound.


Analysis

• If the input is presorted, the running time is O(N) because the test in the inner for loop always fails immediately.

• The average case is Θ(N^2)
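The best-case and worst-case claims above can be checked by counting element shifts. The following is a minimal sketch on int arrays, not the Comparable version from the slides; the class and method names are made up for illustration.

```java
final class InsertionSortShiftCount {
    // Sorts a in place with insertion sort and returns the number of
    // one-position shifts performed by the inner loop.
    static long sortAndCountShifts(int[] a) {
        long shifts = 0;
        for (int p = 1; p < a.length; p++) {
            int tmp = a[p];
            int j = p;
            while (j > 0 && tmp < a[j - 1]) {
                a[j] = a[j - 1];
                j--;
                shifts++;
            }
            a[j] = tmp;
        }
        return shifts;
    }

    public static void main(String[] args) {
        int n = 1000;
        int[] sorted = new int[n];
        int[] reversed = new int[n];
        for (int i = 0; i < n; i++) {
            sorted[i] = i;
            reversed[i] = n - i;
        }
        System.out.println("presorted: " + sortAndCountShifts(sorted));
        System.out.println("reversed:  " + sortAndCountShifts(reversed));
    }
}
```

For presorted input the inner loop never shifts anything, while reversed input needs N(N-1)/2 shifts.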


Shellsort

• Sorts N elements in t passes

• Each pass k has an associated value h_k

• The t passes use a sequence h_1, h_2, . . . , h_t (called the increment sequence)

• The first pass uses h_t and the last pass uses h_1

• h_t > . . . > h_2 > h_1 and h_1 = 1

• In each pass, all elements spaced h_k apart are sorted


Shellsort

Original       81  94  11  96  12  35  17  95  28  58  41  75  15
After 5-sort   35  17  11  28  12  41  75  15  96  58  81  94  95
After 3-sort   28  12  11  35  15  41  58  17  94  75  81  96  95
After 1-sort   11  12  15  17  28  35  41  58  75  81  94  95  96
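The trace above can be reproduced with a single-pass helper. This is a sketch on int arrays; the name hSort is hypothetical, and the input assumes the fourth element is 96, which the later rows of the trace imply.

```java
import java.util.Arrays;

final class ShellsortPasses {
    // One Shellsort pass: insertion sort on the elements spaced gap apart.
    static void hSort(int[] a, int gap) {
        for (int i = gap; i < a.length; i++) {
            int tmp = a[i];
            int j = i;
            for (; j >= gap && tmp < a[j - gap]; j -= gap)
                a[j] = a[j - gap];
            a[j] = tmp;
        }
    }

    public static void main(String[] args) {
        int[] a = {81, 94, 11, 96, 12, 35, 17, 95, 28, 58, 41, 75, 15};
        for (int gap : new int[]{5, 3, 1}) {
            hSort(a, gap);
            System.out.println("After " + gap + "-sort: " + Arrays.toString(a));
        }
    }
}
```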


Shellsort

• A sequence that is sorted using h_k is said to be h_k-sorted

• An h_k-sorted sequence that is then h_{k-1}-sorted remains h_k-sorted

• An h_k-sort performs an insertion sort on h_k independent subarrays


Shellsort

• Any increment sequence works, as long as h_1 = 1

• Some choices are better than others

• A popular (but poor) increment sequence is 1, 2, 4, 8, . . . , N/2, that is, h_t = N/2 and h_k = h_{k+1}/2 (integer division)


Shellsort vs. Insertion Sort

• The last pass of Shellsort performs an insertion sort on the whole array (h_1-sort).

• But Shellsort is better than insertion sort because the earlier passes leave the array partially sorted, so each insertion sort runs on nearly presorted input


    public static void shellsort( Comparable [ ] a )
    {
        int j;

        // increments N/2, N/4, ..., 1
        for( int gap = a.length / 2; gap > 0; gap /= 2 )
            for( int i = gap; i < a.length; i++ )
            {
                Comparable tmp = a[ i ];
                // insertion sort among elements spaced gap apart
                for( j = i; j >= gap && tmp.compareTo( a[ j - gap ] ) < 0; j -= gap )
                    a[ j ] = a[ j - gap ];
                a[ j ] = tmp;
            }
    }


A Bad Case of Shellsort

• N is a power of 2
• All the increments are even, except the last increment, which is 1
• The N/2 largest numbers are in the even positions and the N/2 smallest numbers are in the odd positions


A Bad Case of Shellsort

Original       1   9   2  10   3  11   4  12   5  13   6  14   7  15   8  16
After 8-sort   1   9   2  10   3  11   4  12   5  13   6  14   7  15   8  16
After 4-sort   1   9   2  10   3  11   4  12   5  13   6  14   7  15   8  16
After 2-sort   1   9   2  10   3  11   4  12   5  13   6  14   7  15   8  16
After 1-sort   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16


A Bad Case of Shellsort

• No sorting is performed until the last pass

• The i-th smallest number (i ≤ N/2) is in position 2i-1

• Restoring the i-th element to its correct place requires moving it i-1 spaces

• Restoring the N/2 smallest numbers requires $\sum_{i=1}^{N/2} (i-1) = \Omega(N^2)$ work
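The bad case can be observed by counting the shifts made in each pass. This is a sketch on int arrays using the increments N/2, N/4, ..., 1; the class and method names are illustrative, and the 16-element input assumes the eighth element is 12.

```java
import java.util.Arrays;

final class ShellsortBadCase {
    // Runs Shellsort with increments N/2, N/4, ..., 1 and returns the
    // number of element shifts made in each pass, first pass first.
    static int[] movesPerPass(int[] a) {
        int passes = 0;
        for (int gap = a.length / 2; gap > 0; gap /= 2) passes++;
        int[] moves = new int[passes];
        int k = 0;
        for (int gap = a.length / 2; gap > 0; gap /= 2, k++) {
            for (int i = gap; i < a.length; i++) {
                int tmp = a[i];
                int j = i;
                for (; j >= gap && tmp < a[j - gap]; j -= gap) {
                    a[j] = a[j - gap];
                    moves[k]++;
                }
                a[j] = tmp;
            }
        }
        return moves;
    }

    public static void main(String[] args) {
        int[] a = {1, 9, 2, 10, 3, 11, 4, 12, 5, 13, 6, 14, 7, 15, 8, 16};
        System.out.println(Arrays.toString(movesPerPass(a)));
    }
}
```

For N = 16 the 8-, 4-, and 2-sorts move nothing, and the final 1-sort does all the work: 0 + 1 + . . . + 7 = 28 shifts.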


Worst-Case Analysis

• A pass with increment h_k consists of h_k insertion sorts of about N/h_k elements

• Since insertion sort is quadratic, the total cost of a pass is O(h_k (N/h_k)^2) = O(N^2/h_k)

• Summing over all passes gives

$$O\left(\sum_{i=1}^{t} N^2/h_i\right) = O\left(N^2 \sum_{i=1}^{t} 1/h_i\right) = O(N^2)$$


Hibbard’s Increments

• Increment sequence 1, 3, 7, 15, . . . , 2^k - 1

• h_{k+1} = 2 h_k + 1

• Consecutive increments have no common factors

• The worst-case running time of Shellsort using Hibbard’s increments is Θ(N^{3/2})
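A sketch of Shellsort driven by this increment sequence follows. It works on int arrays, and the helper name hibbardIncrements is an assumption for illustration; the slides' own shellsort code uses the N/2, N/4, ... sequence instead.

```java
import java.util.ArrayList;
import java.util.List;

final class HibbardShellsort {
    // Increments 1, 3, 7, ..., 2^k - 1 that are smaller than n, in increasing order.
    static List<Integer> hibbardIncrements(int n) {
        List<Integer> incs = new ArrayList<>();
        for (int h = 1; h < n; h = 2 * h + 1)
            incs.add(h);
        return incs;
    }

    // Shellsort using the increments from largest to smallest.
    static void shellsort(int[] a) {
        List<Integer> incs = hibbardIncrements(a.length);
        for (int k = incs.size() - 1; k >= 0; k--) {
            int gap = incs.get(k);
            for (int i = gap; i < a.length; i++) {
                int tmp = a[i];
                int j = i;
                for (; j >= gap && tmp < a[j - gap]; j -= gap)
                    a[j] = a[j - gap];
                a[j] = tmp;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(hibbardIncrements(100));
    }
}
```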


Analysis of Hibbard’s Increments

• The input of an h_k-sort is already h_{k+1}-sorted and h_{k+2}-sorted (e.g. the input of the 3-sort is already 7-sorted and 15-sorted)

• Let i be the distance between two elements. If i is expressible as a linear combination of h_{k+1} and h_{k+2}, then a[p-i] ≤ a[p]

• For example, 52 = 1*7 + 3*15, so a[100] ≤ a[152] because a[100] ≤ a[107] ≤ a[122] ≤ a[137] ≤ a[152]


Analysis of Hibbard’s Increments

• All integers ≥ (h_{k+1} - 1)(h_{k+2} - 1) = 8h_k^2 + 4h_k can be expressed as a linear combination of h_{k+1} and h_{k+2}

• Proof sketch (using h_{k+2} = 2*h_{k+1} + 1):

i = x*h_{k+1} + y*h_{k+2}

i+1 = x*h_{k+1} + y*(2*h_{k+1} + 1) + 1

i+1 = x*h_{k+1} + y*(2*h_{k+1} + 1) - 2*h_{k+1} + 2*h_{k+1} + 1

i+1 = (x-2)*h_{k+1} + (y+1)*h_{k+2}
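The bound can be checked by brute force for small increments. The sketch below assumes that "linear combination" means non-negative integer coefficients x and y, as in the example 52 = 1*7 + 3*15; the class and method names are made up.

```java
final class HibbardCombination {
    // True if i = x*a + y*b for some integers x >= 0 and y >= 0.
    static boolean isCombination(int i, int a, int b) {
        for (int y = 0; y * b <= i; y++)
            if ((i - y * b) % a == 0)
                return true;
        return false;
    }

    public static void main(String[] args) {
        int hk = 3;                        // so h_{k+1} = 7 and h_{k+2} = 15
        int h1 = 2 * hk + 1;
        int h2 = 2 * h1 + 1;
        int bound = 8 * hk * hk + 4 * hk;  // (h_{k+1} - 1)(h_{k+2} - 1) = 84
        System.out.println(bound - 1 + ": " + isCombination(bound - 1, h1, h2));
        System.out.println(bound + ": " + isCombination(bound, h1, h2));
    }
}
```

For h_k = 3 the largest distance that cannot be expressed this way is 83, one below the bound 84.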


Analysis of Hibbard’s Increments

• So, a[p-i] ≤ a[p] if i ≥ 8h_k^2 + 4h_k

• In each pass, a[p] is therefore never moved further left than a[p-i], that is, by more than 8h_k^2 + 4h_k elements

• Each step of the innermost for loop moves the element h_k positions, so the loop is executed at most 8h_k + 4 = O(h_k) times for each position. So, each pass has O(N h_k) running time.


Analysis of Hibbard’s Increments

• For h_k > N^{1/2}, use the bound O(N^2/h_k)

• For h_k ≤ N^{1/2}, use the bound O(N h_k)

• About half of the increments satisfy h_k < N^{1/2}

• The total time is

$$O\left(\sum_{k=1}^{t/2} N h_k + \sum_{k=t/2+1}^{t} N^2/h_k\right) = O\left(N \sum_{k=1}^{t/2} h_k + N^2 \sum_{k=t/2+1}^{t} 1/h_k\right)$$

$$= O(N h_{t/2}) + O(N^2/h_{t/2}) = O(N^{3/2})$$


Sedgewick’s Increments

• Sedgewick’s increments are {1, 5, 19, 41, 109, . . .}, whose terms have the form 9*4^i - 9*2^i + 1 or 4^i - 3*2^i + 1

• O(N^{4/3}) worst-case time and O(N^{7/6}) average time (the average-case bound is a conjecture)
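The sequence can be generated by merging the two formulas. This is a sketch with made-up names; it assumes the first form is used for i = 0, 1, 2, ... and the second for i = 2, 3, ..., since the second form gives -1 for i = 0 and i = 1.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class SedgewickIncrements {
    // Increments smaller than n, in increasing order.
    static List<Integer> incrementsBelow(int n) {
        List<Integer> incs = new ArrayList<>();
        for (int i = 0; ; i++) {                 // 9*4^i - 9*2^i + 1
            long h = 9L * (1L << (2 * i)) - 9L * (1L << i) + 1;
            if (h >= n) break;
            incs.add((int) h);
        }
        for (int i = 2; ; i++) {                 // 4^i - 3*2^i + 1
            long h = (1L << (2 * i)) - 3L * (1L << i) + 1;
            if (h >= n) break;
            incs.add((int) h);
        }
        Collections.sort(incs);
        return incs;
    }

    public static void main(String[] args) {
        System.out.println(incrementsBelow(1000));
    }
}
```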


Heap Sort

• Build a binary heap of N elements and then perform N deleteMin operations

• Building a heap takes O(N) time and N deleteMin operations take O(N log N) time

• The total running time is O(N log N)


Heap Sort

• The sorted elements, which are taken out of the heap, can be placed in another array.

• To avoid using an extra array for the result, put each element taken out of the heap into the cell just vacated by the last element of the heap.

• To get the result in increasing order, use a max-heap instead.


[Figure: max-heap trees before and after one deleteMax]

Initial max-heap       97  53  59  26  41  58  31
After one deleteMax    59  53  58  26  41  31  97


[Figure: max-heap trees after two and three deleteMax operations]

After two deleteMax    58  53  31  26  41  59  97
After three deleteMax  53  41  31  26  58  59  97
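The array states in this example can be reproduced step by step. The sketch below works on int arrays, and deleteMax is a hypothetical helper name for the swap-then-percolate step of heapsort.

```java
import java.util.Arrays;

final class HeapsortTrace {
    // Restores max-heap order for the subtree rooted at i within a[0..n-1].
    static void percDown(int[] a, int i, int n) {
        int tmp = a[i];
        int child;
        for (; 2 * i + 1 < n; i = child) {
            child = 2 * i + 1;                       // left child
            if (child != n - 1 && a[child] < a[child + 1])
                child++;                             // right child is larger
            if (tmp < a[child])
                a[i] = a[child];
            else
                break;
        }
        a[i] = tmp;
    }

    // One deleteMax on the heap a[0..n-1]; the maximum is stored in a[n-1].
    static void deleteMax(int[] a, int n) {
        int max = a[0];
        a[0] = a[n - 1];
        a[n - 1] = max;
        percDown(a, 0, n - 1);
    }

    public static void main(String[] args) {
        int[] a = {97, 53, 59, 26, 41, 58, 31};
        for (int n = a.length; n > a.length - 3; n--) {
            deleteMax(a, n);
            System.out.println(Arrays.toString(a));
        }
    }
}
```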


    public static void heapsort( Comparable [ ] a )
    {
        for( int i = a.length / 2; i >= 0; i-- )    // build the max-heap
            percDown( a, i, a.length );
        for( int i = a.length - 1; i > 0; i-- )
        {
            swapReferences( a, 0, i );              // deleteMax
            percDown( a, 0, i );
        }
    }

    private static int leftChild( int i )
    {
        return 2 * i + 1;                           // array begins at index 0
    }

    public static final void swapReferences( Object [ ] a, int index1, int index2 )
    {
        Object tmp = a[ index1 ];
        a[ index1 ] = a[ index2 ];
        a[ index2 ] = tmp;
    }


    private static void percDown( Comparable [ ] a, int i, int n )
    {
        int child;
        Comparable tmp;

        for( tmp = a[ i ]; leftChild( i ) < n; i = child )
        {
            child = leftChild( i );
            // pick the larger of the two children
            if( child != n - 1 && a[ child ].compareTo( a[ child + 1 ] ) < 0 )
                child++;
            if( tmp.compareTo( a[ child ] ) < 0 )
                a[ i ] = a[ child ];
            else
                break;
        }
        a[ i ] = tmp;
    }


Analysis

• Building the heap uses at most 2N comparisons

• The deleteMax operations together use at most 2N log N - O(N) comparisons

• So, heapsort uses at most 2N log N - O(N) comparisons


Analysis

• Worst-case and average-case are only slightly different

• The average number of comparisons is 2N log N - O(N log log N)


Mergesort

• The fundamental operation is merging two sorted lists.

• Because the lists are sorted, this can be done in one pass through the input, if the output is put in a third list.


Mergesort

• The merge step takes two sorted input arrays A and B, an output array C, and three counters, Actr, Bctr, and Cctr.

• The smaller of A[Actr] and B[Bctr] is copied to the next entry in C, and the appropriate counters are advanced

• When one input array is exhausted, the remaining items of the other are copied to C
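The merge step described above can be sketched directly with the three counters. This version uses int arrays and returns a new array C, unlike the temporary-array merge used in the slides' mergeSort code; the class name is made up.

```java
import java.util.Arrays;

final class MergeSketch {
    // Merges two sorted arrays a and b into a new sorted array c.
    static int[] merge(int[] a, int[] b) {
        int[] c = new int[a.length + b.length];
        int actr = 0, bctr = 0, cctr = 0;
        while (actr < a.length && bctr < b.length) {
            if (a[actr] <= b[bctr])
                c[cctr++] = a[actr++];
            else
                c[cctr++] = b[bctr++];
        }
        while (actr < a.length)   // copy the rest of a
            c[cctr++] = a[actr++];
        while (bctr < b.length)   // copy the rest of b
            c[cctr++] = b[bctr++];
        return c;
    }

    public static void main(String[] args) {
        int[] c = merge(new int[]{1, 13, 24, 26}, new int[]{2, 15, 27, 38});
        System.out.println(Arrays.toString(c));
    }
}
```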


Mergesort

(Actr, Bctr, and Cctr mark the current positions in A, B, and C)

A: 1 13 24 26    B: 2 15 27 38    C: (empty)
A: 1 13 24 26    B: 2 15 27 38    C: 1
A: 1 13 24 26    B: 2 15 27 38    C: 1 2


Mergesort

A: 1 13 24 26    B: 2 15 27 38    C: 1 2 13
A: 1 13 24 26    B: 2 15 27 38    C: 1 2 13 15
A: 1 13 24 26    B: 2 15 27 38    C: 1 2 13 15 24


Mergesort

A: 1 13 24 26    B: 2 15 27 38    C: 1 2 13 15 24 26
(A is exhausted; the rest of B is copied to C)
A: 1 13 24 26    B: 2 15 27 38    C: 1 2 13 15 24 26 27 38


Mergesort

• If N > 1, recursively mergesort the first half and the second half

• If N = 1, there is only one element to sort; this is the base case


Mergesort: Divide and Conquer

Divide:
24 13 26 1 2 27 38 15
24 13 26 1 | 2 27 38 15
24 13 | 26 1 | 2 27 | 38 15

Merge:
13 24 | 1 26 | 2 27 | 15 38
1 13 24 26 | 2 15 27 38
1 2 13 15 24 26 27 38


Analysis

• Merging two sorted lists is linear, because at most N-1 comparisons are made

• For N = 1, the time to mergesort is constant

• Otherwise, the time to mergesort N numbers is the time to do two recursive mergesorts of size N/2, plus the linear time to merge: T(N) = 2T(N/2) + N

• With T(1) = 1, this gives T(N) = N log N + N = O(N log N)
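The closed form can be checked numerically against the recurrence. The sketch below assumes T(1) = 1, T(N) = 2T(N/2) + N, N a power of two, and a base-2 logarithm; the class and method names are illustrative.

```java
final class MergesortRecurrence {
    // T(1) = 1, T(N) = 2 T(N/2) + N, for N a power of two.
    static long t(long n) {
        return n == 1 ? 1 : 2 * t(n / 2) + n;
    }

    // N log N + N with log taken base 2, for N a power of two.
    static long closedForm(long n) {
        int log = 63 - Long.numberOfLeadingZeros(n);
        return n * log + n;
    }

    public static void main(String[] args) {
        for (long n = 1; n <= 1024; n *= 2)
            System.out.println(n + ": " + t(n) + " = " + closedForm(n));
    }
}
```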


    public static void mergeSort( Comparable [ ] a )
    {
        Comparable [ ] tmpArray = new Comparable[ a.length ];
        mergeSort( a, tmpArray, 0, a.length - 1 );
    }

    private static void mergeSort( Comparable [ ] a, Comparable [ ] tmpArray,
                                   int left, int right )
    {
        if( left < right )
        {
            int center = ( left + right ) / 2;
            mergeSort( a, tmpArray, left, center );         // sort first half
            mergeSort( a, tmpArray, center + 1, right );    // sort second half
            merge( a, tmpArray, left, center + 1, right );  // merge the halves
        }
    }


    private static void merge( Comparable [ ] a, Comparable [ ] tmpArray,
                               int leftPos, int rightPos, int rightEnd )
    {
        int leftEnd = rightPos - 1;
        int tmpPos = leftPos;
        int numElements = rightEnd - leftPos + 1;

        while( leftPos <= leftEnd && rightPos <= rightEnd )
            if( a[ leftPos ].compareTo( a[ rightPos ] ) <= 0 )
                tmpArray[ tmpPos++ ] = a[ leftPos++ ];
            else
                tmpArray[ tmpPos++ ] = a[ rightPos++ ];

        while( leftPos <= leftEnd )      // Copy rest of first half
            tmpArray[ tmpPos++ ] = a[ leftPos++ ];

        while( rightPos <= rightEnd )    // Copy rest of right half
            tmpArray[ tmpPos++ ] = a[ rightPos++ ];

        // Copy tmpArray back
        for( int i = 0; i < numElements; i++, rightEnd-- )
            a[ rightEnd ] = tmpArray[ rightEnd ];
    }


Quicksort

Divide-and-conquer recursive algorithm
1. If the number of elements in S is 0 or 1, then return
2. Pick any element v in S. This is called the pivot
3. Partition the remaining elements in S ( S - {v} ) into two disjoint groups: S1 and S2. S1 contains elements ≤ v, S2 contains elements ≥ v
4. Return {quicksort(S1), v, quicksort(S2)}
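The four steps can be written almost literally with lists. This is only a sketch of the idea, not the in-place version given later in the slides; it picks the middle element as the pivot and sends elements equal to the pivot to S1, both of which are arbitrary choices made here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class QuicksortSketch {
    static List<Integer> quicksort(List<Integer> s) {
        if (s.size() <= 1)                        // step 1
            return new ArrayList<>(s);
        int pivotIndex = s.size() / 2;            // step 2
        Integer v = s.get(pivotIndex);
        List<Integer> s1 = new ArrayList<>();     // step 3
        List<Integer> s2 = new ArrayList<>();
        for (int k = 0; k < s.size(); k++) {
            if (k == pivotIndex) continue;
            Integer x = s.get(k);
            if (x.compareTo(v) <= 0) s1.add(x);
            else s2.add(x);
        }
        List<Integer> result = quicksort(s1);     // step 4
        result.add(v);
        result.addAll(quicksort(s2));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(quicksort(Arrays.asList(13, 81, 92, 43, 65, 31, 57, 26, 75, 0)));
    }
}
```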


Input:                13 81 92 43 65 31 57 26 75 0
Select pivot:         65
Partition:            13 0 26 43 57 31 | 65 | 92 75 81
Quicksort each part:  0 13 26 31 43 57 | 65 | 75 81 92
Result:               0 13 26 31 43 57 65 75 81 92


Quicksort vs. Mergesort

• In quicksort, the subproblems need not be of equal size

• Quicksort is faster because the partitioning step can be performed in place and very efficiently


Picking the Pivot

• Use the first element as the pivot
  – Bad choice
  – If the input is presorted or in reverse order, the pivot gives a poor partition, because either all elements go into S1 or they all go into S2
  – If the input is presorted, quicksort will take quadratic time to do nothing useful

• Use the larger of the first two distinct elements
  – Also bad
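The quadratic behavior on presorted input can be seen by counting comparisons. The sketch below uses a simple single-scan partition around the first element, which is not the two-pointer strategy described in these slides, and all names are illustrative.

```java
final class FirstElementPivot {
    private static long comparisons;

    // Sorts a with a first-element pivot and returns the number of
    // element-to-pivot comparisons made.
    static long sortAndCount(int[] a) {
        comparisons = 0;
        quicksort(a, 0, a.length - 1);
        return comparisons;
    }

    private static void quicksort(int[] a, int left, int right) {
        if (left >= right) return;
        int pivot = a[left];
        int last = left;   // a[left+1..last] holds the elements smaller than the pivot
        for (int k = left + 1; k <= right; k++) {
            comparisons++;
            if (a[k] < pivot) swap(a, ++last, k);
        }
        swap(a, left, last);   // move the pivot between the two groups
        quicksort(a, left, last - 1);
        quicksort(a, last + 1, right);
    }

    private static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    public static void main(String[] args) {
        int[] sorted = new int[1000];
        for (int i = 0; i < sorted.length; i++) sorted[i] = i;
        System.out.println(sortAndCount(sorted));
    }
}
```

On presorted input of size N every partition leaves one side empty, so the count is (N-1) + (N-2) + . . . + 1 = N(N-1)/2.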


Picking the Pivot

• Choose the pivot randomly
  – generally safe
  – generating random numbers is expensive
  – does not reduce the average running time

• Median-of-Three Partitioning
  – The best choice would be the median of the array
  – A good estimate is the median of the left, right, and center elements


Partitioning Strategy

1. Swap the pivot with the last element
2. i starts at the first element and j starts at the next-to-last element
3. Move i right, skipping over elements smaller than the pivot. Move j left, skipping over elements larger than the pivot. Both i and j stop when they encounter an element equal to the pivot
4. When i and j stop, if i is to the left of j, swap their elements
5. Repeat 3 and 4 until i and j cross
6. Swap the pivot with i’s element
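The six steps can be sketched as a standalone method. It works on int arrays and takes the pivot's index as a parameter, which is an assumption made for illustration; the slides' own quicksort code places the pivot with median3 instead.

```java
import java.util.Arrays;

final class PartitionSketch {
    // Partitions a[left..right] around a[pivotIndex]; returns the pivot's final index.
    static int partition(int[] a, int left, int right, int pivotIndex) {
        swap(a, pivotIndex, right);                    // step 1
        int pivot = a[right];
        int i = left, j = right - 1;                   // step 2
        while (true) {
            while (i < right && a[i] < pivot) i++;     // step 3
            while (j > left && a[j] > pivot) j--;
            if (i < j) {                               // step 4
                swap(a, i, j);
                i++;
                j--;
            } else {
                break;                                 // step 5: i and j have crossed
            }
        }
        swap(a, i, right);                             // step 6
        return i;
    }

    private static void swap(int[] a, int x, int y) {
        int tmp = a[x];
        a[x] = a[y];
        a[y] = tmp;
    }

    public static void main(String[] args) {
        int[] a = {8, 1, 4, 9, 0, 3, 5, 2, 7, 6};
        int p = partition(a, 0, a.length - 1, a.length - 1);   // pivot 6 is already last
        System.out.println(p + " " + Arrays.toString(a));
    }
}
```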


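Partitioning (pivot 6 is in the last position)

8 1 4 9 0 3 5 2 7 6    start: i at 8, j at 7
8 1 4 9 0 3 5 2 7 6    i stops at 8, j stops at 2
2 1 4 9 0 3 5 8 7 6    after swapping 8 and 2
2 1 4 9 0 3 5 8 7 6    i stops at 9, j stops at 5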


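Partitioning

2 1 4 5 0 3 9 8 7 6    after swapping 9 and 5
2 1 4 5 0 3 9 8 7 6    i stops at 9, j stops at 3 (i and j have crossed)
2 1 4 5 0 3 6 8 7 9    after swapping the pivot with i’s element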


Small Arrays

• For very small arrays (N ≤ 20), quicksort does not perform as well as insertion sort.

• Quicksort is recursive, so these small cases occur frequently.

• So, for small subarrays, switch to a sorting algorithm that is efficient for small arrays, such as insertion sort.

• A good cutoff is N = 10


    public static void quicksort( Comparable [ ] a )
    {
        quicksort( a, 0, a.length - 1 );
    }

    // CUTOFF is a small constant; insertionSort( a, left, right ) is a
    // subarray version of insertion sort (neither is shown on these slides)
    private static void quicksort( Comparable [ ] a, int left, int right )
    {
        if( left + CUTOFF <= right )
        {
            Comparable pivot = median3( a, left, right );

            // Begin partitioning
            int i = left, j = right - 1;
            for( ; ; )
            {
                while( a[ ++i ].compareTo( pivot ) < 0 ) { }
                while( a[ --j ].compareTo( pivot ) > 0 ) { }
                if( i < j )
                    swapReferences( a, i, j );
                else
                    break;
            }

            swapReferences( a, i, right - 1 );   // Restore pivot

            quicksort( a, left, i - 1 );         // Sort small elements
            quicksort( a, i + 1, right );        // Sort large elements
        }
        else  // Do an insertion sort on the subarray
            insertionSort( a, left, right );
    }


    private static Comparable median3( Comparable [ ] a, int left, int right )
    {
        int center = ( left + right ) / 2;
        // order a[ left ], a[ center ], a[ right ]
        if( a[ center ].compareTo( a[ left ] ) < 0 )
            swapReferences( a, left, center );
        if( a[ right ].compareTo( a[ left ] ) < 0 )
            swapReferences( a, left, right );
        if( a[ right ].compareTo( a[ center ] ) < 0 )
            swapReferences( a, center, right );

        // Place pivot at position right - 1
        swapReferences( a, center, right - 1 );
        return a[ right - 1 ];
    }


Analysis

T(N) = T(i) + T(N - i - 1) + cN, where i = |S1|

Worst Case

T(N) = T(N - 1) + cN

$$T(N) = T(1) + c \sum_{i=2}^{N} i = O(N^2)$$


Analysis

Best Case

T(N) = 2T(N/2) + cN

T(N) = cN log N + N = O(N log N)

Average Case

$$T(N) = \frac{2}{N} \sum_{j=0}^{N-1} T(j) + cN$$

T(N) = O(N log N)


Bucket Sort

• Sorts N integers in the range 1 to M
• Use M buckets, one bucket for each integer i
• Bucket i stores how many times i appears in the input. Initially, all buckets are empty.
• Read the input and increase the values in the buckets
• Finally, scan the buckets and print the sorted list


Bucket Sort

Input:   3, 1, 3, 5, 8, 7, 4, 2, 9, 5, 4, 10, 4

Bucket:  1   2   3   4   5   6   7   8   9  10
Count:   1   1   2   3   2   0   1   1   1   1

Output:  1, 2, 3, 3, 4, 4, 4, 5, 5, 7, 8, 9, 10


Bucket Sort

• The input A_1, A_2, . . . , A_N consists of positive integers smaller than M

• Keep an array count[ ] of size M, which is initialized to all 0s

• When A_i is read, increment count[A_i] by 1

• After all input is read, scan the count array, printing out the sorted list

• This algorithm takes O(M+N) time
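The counting scheme above can be sketched as follows. The method returns the sorted values in a new array instead of printing them, which is a choice made here for illustration, and the class and method names are made up.

```java
import java.util.Arrays;

final class BucketSortSketch {
    // Sorts positive integers smaller than m in O(m + n) time.
    static int[] bucketSort(int[] input, int m) {
        int[] count = new int[m];             // all 0s initially
        for (int x : input)
            count[x]++;                       // one more occurrence of x
        int[] out = new int[input.length];
        int pos = 0;
        for (int v = 0; v < m; v++)           // scan the buckets in order
            for (int c = 0; c < count[v]; c++)
                out[pos++] = v;
        return out;
    }

    public static void main(String[] args) {
        int[] in = {3, 1, 3, 5, 8, 7, 4, 2, 9, 5, 4, 10, 4};
        System.out.println(Arrays.toString(bucketSort(in, 11)));
    }
}
```

With values up to 10, M = 11 is used so that count[10] exists.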