Sorting: Quick Sort, Shell Sort, Counting Sort, Radix Sort, and Bucket Sort

Page 1:

Page 2:

Motivation of Sorting

Sorting algorithms contain interesting and important ideas for code optimization as well as algorithm design.

Page 3:

Merge Sort
Last time we talked about Merge Sort:
◦ Recursively calls MergeSort(1st half of list) and MergeSort(2nd half of list)
◦ Then merges the results

Page 4:

Quick Sort
This is probably the most common sort used in practice, since it is usually the quickest in practice.
It uses the idea of a partition, without using an additional array, and recursion to achieve this efficiency.

Page 5:

QuickSort
Basically, the partition works like this:
◦ Given an array of n values, randomly pick an element in the array to partition by.
◦ Once you have picked this value, compare all of the rest of the elements to it. If they are greater, put them to the “right” of the partition element. If they are less, put them to the “left” of the partition element.
So if we sort those two sides, the whole array will be sorted.

88 35 44 99 71 20 45 42 67 61

35 44 20 42 45 88 61 99 67 71

The partition element, 45, is now in the right spot; the left side (35 44 20 42) and the right side (88 61 99 67 71) still need to be sorted.

Page 6:

QuickSort
Thus, similar to MergeSort, we can use a partition to break the sorting problem into two smaller sorting problems.
◦ QuickSort at a general level:
1) Partition the array with respect to a random element.
2) Sort the left part of the array, using Quick Sort.
3) Sort the right part of the array, using Quick Sort.
It should be clear that this algorithm will work.
◦ But it may not be clear why it is faster than MergeSort.
Like MergeSort, it recursively solves two subproblems and requires linear additional work.
BUT unlike MergeSort, the subproblems are NOT guaranteed to be of equal size.
The reason QuickSort is faster is that the partitioning step can actually be performed in place and very efficiently. This efficiency can more than make up for the lack of equal-sized recursive calls.
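The three steps above can be sketched in Python. This is a simplified version that uses extra lists for clarity rather than the in-place partition described on the next page; the function name is my own.

```python
def quicksort(lst):
    # Base case: a list of 0 or 1 elements is already sorted.
    if len(lst) <= 1:
        return lst
    pivot = lst[-1]                                  # partition element
    left  = [x for x in lst[:-1] if x <= pivot]      # values <= pivot
    right = [x for x in lst[:-1] if x >  pivot]      # values >  pivot
    # Sort each side; the pivot sits in its final spot between them.
    return quicksort(left) + [pivot] + quicksort(right)
```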

Page 7:

How to Partition in Place

8 3 6 9 2 4 7 5
LOW at index 0, HIGH at the 2nd-to-last index (Pivot Element: 5)

Assume for now that we partition based on the last element in the array, 5.
Start 2 counters:
◦ Low at array index 0
◦ High at the 2nd-to-last index in the array
Advance the Low counter forward until a value greater than the pivot is encountered.
Advance the High counter backward until a value less than the pivot is encountered.

8 3 6 9 2 4 7 5
LOW stops at 8, HIGH stops at 4 (Pivot Element: 5)

Now, swap these 2 elements, since we know that they are both on the “wrong” side.

4 3 6 9 2 8 7 5

Continue to advance the counters as before.

Page 8:

How to Partition in Place

4 3 6 9 2 8 7 5
LOW stops at 6, HIGH stops at 2 (Pivot Element: 5)

SWAP these two:

4 3 2 9 6 8 7 5

When the counters meet or cross, SWAP the last element (the pivot) with the element at the Low counter position to finish the partition.

4 3 2 5 6 8 7 9

Now, as you can see, our array is partitioned into a “left” and a “right”.
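The walkthrough above can be sketched in Python. This is a sketch of the two-counter scheme with the last element as pivot; the function and variable names are my own.

```python
def partition(a, low, high):
    """Partition a[low..high] in place around the pivot a[high].
    Returns the pivot's final index."""
    pivot = a[high]
    i, j = low, high - 1              # the Low and High counters
    while True:
        while i <= j and a[i] <= pivot:   # advance Low until a value > pivot
            i += 1
        while j >= i and a[j] >= pivot:   # retreat High until a value < pivot
            j -= 1
        if i >= j:                    # counters have met or crossed: done scanning
            break
        a[i], a[j] = a[j], a[i]       # both on the wrong side, so swap
    a[i], a[high] = a[high], a[i]     # put the pivot between the two halves
    return i
```

Running it on the slide's array 8 3 6 9 2 4 7 5 performs the same swaps as above and yields 4 3 2 5 6 8 7 9 with the pivot 5 at index 3.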

Page 9:

Picking the Pivot
Although the partition algorithm works no matter which element is chosen as the pivot, some choices are better than others.
◦ Wrong choice: just use the first element in the list.
 If the input is random, this is acceptable.
 BUT what if the list is already sorted, or in reverse order? Then all the elements go into S1 or S2, consistently throughout the recursive calls. So it would take O(n²) time to do nothing at all (if presorted)!
EMBARRASSING!

Page 10:

Picking the Pivot
A safer way:
◦ Choose the pivot randomly.
 Generally safe, since it’s unlikely a random pivot would consistently give a poor partition.
 But random number generation is generally expensive.
Median-of-Three Partitioning
◦ The best choice would be the median of the array.
 But that would be hard to calculate and slow.
 A good estimate is to pick 3 elements and use the median of those as the pivot.
 The rule of thumb: pick the left, center, and right elements and take the median as the pivot.

8 1 4 9 6 3 5 2 7 0
Left: 8, Center: 6, Right: 0 → Pivot Element: 6
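The rule of thumb can be sketched in Python; the function name is my own.

```python
def median_of_three(a, low, high):
    # Look at the left, center, and right elements and return the
    # index of the one holding the median of the three values.
    mid = (low + high) // 2
    trio = sorted([(a[low], low), (a[mid], mid), (a[high], high)])
    return trio[1][1]   # index of the middle of the three values
```

For the array above, the candidates are 8 (left), 6 (center), and 0 (right), so the pivot is 6, at index 4.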

Page 11:

Analysis of Quicksort
Shown on the board.

Page 12:

Quickselect
Given an array of n elements, determine the kth smallest element.
◦ Clearly k must lie between 1 and n, inclusive.
◦ The selection problem is different from, but related to, the sorting problem.
The idea is:
◦ Partition the array. This gives 2 subarrays:
 One of size m, with the m smallest elements.
 The other of size n-m-1.
◦ If k ≤ m, we know the kth smallest element is in the 1st partition.
◦ If k == m+1, we know the kth smallest element IS the pivot.
◦ Otherwise, the kth smallest element is in the 2nd partition.

35 44 20 42 45 88 61 99 67 71

Size m: if (k ≤ m), the kth smallest element is in here (35 44 20 42).
If (k == m+1), we know the kth smallest is the pivot (45).
Size n-m-1: if (k > m+1), the kth smallest element is in here (88 61 99 67 71).

Page 13:

Quickselect
Algorithm: Quickselect(A, low, high, k):

1) m = Partition(A, low, high) // m is how many values are less than the partition element.
2) if k ≤ m, return Quickselect(A, low, low+m-1, k)
3) if k == m+1, return the pivot, A[low+m]
4) else return Quickselect(A, low+m+1, high, k-m-1)

So instead of making 2 recursive calls, we only make one here.

◦ It turns out that on average Quickselect takes O(n) time, which is far better than its worst-case performance of O(n²) time.
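The algorithm above can be sketched in Python. For brevity this version partitions with extra lists rather than in place, so it shows the single-recursive-call structure but not the in-place efficiency; the names are my own.

```python
def quickselect(a, k):
    """Return the kth smallest element of a (1-indexed, 1 <= k <= len(a))."""
    pivot = a[-1]
    left  = [x for x in a[:-1] if x < pivot]    # the m values less than the pivot
    right = [x for x in a[:-1] if x >= pivot]   # the other n-m-1 values
    m = len(left)
    if k <= m:
        return quickselect(left, k)             # answer is in the left part
    if k == m + 1:
        return pivot                            # answer IS the pivot
    return quickselect(right, k - m - 1)        # adjust k for the right part
```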

Page 14:

Quickselect Analysis
Shown on the board.

Page 15:

Shellsort
Although this doesn’t have the fastest running time of all the sorting algorithms...
◦ It is fairly competitive with Quick and Merge sort for fairly decent-sized data sets.
The basic idea:
◦ Instead of sorting all the elements at once, sort a small set of them
 Such as every 5th element (you can use insertion sort)
 Then sort every 3rd element, etc.
 Finally, sort all the elements using insertion sort.
Rationale:
◦ A small insertion sort is quite efficient.
◦ A larger insertion sort can be efficient if the elements are already “close” to sorted order.
 By doing the smaller insertion sorts, the elements get closer to being in order before the larger insertion sorts are done.

Page 16:

Shellsort example

Unsorted:
12 4 3 9 18 7 2 17 13 1 5 6

Sort every 5th element:
5 2 3 9 1 7 4 17 13 18 12 6

Sort every 3rd element:
4 1 3 5 2 6 9 12 7 18 17 13

Notice that by the time we do this last insertion sort, most elements don’t have a long way to go before being inserted.

Finally, a normal insertion sort:
1 2 3 4 5 6 7 9 12 13 17 18

Page 17:

Shellsort – choosing increments
Do we always do a 5, 3, 1 sort?
◦ No; in general, Shellsort will ALWAYS work as long as the last “pass” is a 1-sort.
What tends to work well is if the values in the increment sequence form a geometric series.
◦ e.g. 1, 2, 4, 8, 16, etc.
◦ The initial passes will be really quick, O(n).
It turns out that, in practice, a geometric ratio of 2.2 produces the best results.
◦ The actual average-case analysis of this sort is too difficult to derive, but experimental results indicate an average running time of about O(n^1.25), dependent on the gap sequence.
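The idea can be sketched in Python as a gapped insertion sort. The gap sequence is a parameter; the 5, 3, 1 default matches the example two pages back, and the names are my own.

```python
def shellsort(a, gaps=(5, 3, 1)):
    """Shellsort sketch: one gapped insertion sort per increment,
    finishing with a 1-sort (a plain insertion sort)."""
    for gap in gaps:
        # Insertion sort over each interleaved slice a[i], a[i-gap], ...
        for i in range(gap, len(a)):
            val = a[i]
            j = i
            while j >= gap and a[j - gap] > val:
                a[j] = a[j - gap]   # shift larger elements one gap to the right
                j -= gap
            a[j] = val
    return a
```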

Page 18:

Counting Sort
In counting sort, we know that each of the values being sorted is in the range from 0 to m, inclusive.
◦ We use this information to count the occurrences of each value.
Here is the algorithm for sorting an array A[0], …, A[n-1]:
1) Create an auxiliary array C, indexed from 0 to m, and initialize each value to 0.
2) Run through the input array A, counting the occurrences of each value 0 through m by adding 1 to the appropriate index in C. (Thus C is a frequency array.)
3) Run through the array C a 2nd time, so that the value stored in each array slot represents the # of elements ≤ the index value in the original array A.
4) Now run through the original input array A, and for each value in A, use the auxiliary array C to tell you the proper placement of that value in the sorted output, which will be a new array B.
5) Copy the sorted array B into A.
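The five steps can be sketched in Python. Note that this common variant decrements C[v] before placing, so it uses plain prefix sums rather than the “-1” adjustment shown in the worked example on the next pages; the effect is the same, and the names are my own.

```python
def counting_sort(a, m):
    """Sort a list of integers in the range 0..m inclusive."""
    c = [0] * (m + 1)           # step 1: auxiliary array C, all zeros
    for v in a:                 # step 2: count occurrences (frequency array)
        c[v] += 1
    for i in range(1, m + 1):   # step 3: prefix sums, c[i] = # of elements <= i
        c[i] += c[i - 1]
    b = [0] * len(a)
    for v in reversed(a):       # step 4: place values (reversed keeps it stable)
        c[v] -= 1
        b[c[v]] = v
    a[:] = b                    # step 5: copy the sorted array B back into A
    return a
```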

Page 19:

Counting Sort Example

Consider the input array to sort; it contains values from 0 to 6:

Index: 0 1 2 3 4 5 6 7
A:     3 6 4 1 3 4 1 4

First, create the frequency array C:

Index: 0 1 2 3 4 5 6
C:     0 2 0 2 3 0 1

Now we want to change C so that each array element stores (the # of values less than or equal to the given index) - 1:

Index: 0 1 2 3 4 5 6
C:     -1 1 1 3 6 6 7

Now that C is completed, we can start…

Page 20:

Counting Sort Example

Original array:
Index: 0 1 2 3 4 5 6 7
A:     3 6 4 1 3 4 1 4

Auxiliary array (after the cumulative step):
Index: 0 1 2 3 4 5 6
C:     -1 1 1 3 6 6 7

Now that C is completed, we can start putting the elements of A into their correct spots in B.
Start with A[7], which contains 4. C[A[7]] = C[4] = 6; with our -1 adjustment, this is exactly the index where this 4 belongs, so place it at B[6]:

Index: 0 1 2 3 4 5 6 7
B:     _ _ _ _ _ _ 4 _

Update C: we must decrement C[4] so that the next time we place a 4, it goes in a new location.

Index: 0 1 2 3 4 5 6
C:     -1 1 1 3 5 6 7

Page 21:

Counting Sort Example
Finish on the board.

Page 22:

Counting Sort
Note: counting sort is a stable sort.
◦ This means that ties in the original input stay in the same relative order after being sorted.
 For example, the last 4 in the input will be in array index 6 of the output, the second-to-last 4 in the input will be in array index 5 of the output, and the 4 in array index 2 of the input will be placed in index 4 of the output.
 So there is no unnecessary switching of equal values.
Another note:
◦ After getting the frequency array, why didn’t we just loop through each index in C and place each corresponding number in the array A directly?
 i.e., since C[1] = 2 originally, why not just place a 1 in A[0] and A[1] and move on…
 The reason is that these numbers may be keys with associated data, and this approach would place the keys, but not the data.

Page 23:

Counting Sort Analysis
O(n+k) best-, worst-, and average-case runtime,
◦ where n is the length of the input array, and
◦ k is the length of the counting array.

Page 24:

Radix Sort
The input to this sort must be non-negative integers, all with a fixed number of digits.
◦ For example, numbers in the range 100 – 999, or 1000 – 9999, have a fixed number of digits.
The sort works as follows:
1) Sort the values using an O(n) stable sort on the kth most significant digit (starting with k equal to the number of digits, i.e., the least significant digit).
2) Decrement k by 1.
3) Repeat step 1. (Unless k = 0, then you’re done.)
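The loop above can be sketched in Python using bucket-based stable passes, one per base-10 digit. This is one common implementation choice (counting sort per digit also works); the names are my own.

```python
def radix_sort(a, num_digits):
    """LSD radix sort sketch for non-negative integers that all fit
    in num_digits base-10 digits."""
    for d in range(num_digits):                      # least significant digit first
        buckets = [[] for _ in range(10)]
        for v in a:
            buckets[(v // 10 ** d) % 10].append(v)   # appending keeps ties in order (stable)
        a = [v for bucket in buckets for v in bucket]
    return a
```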

Page 25:

Radix Sort Example

Unsorted   Sort ones   Sort tens   Sort hundreds
235        162         628         162
162        734         734         175
734        674         235         235
175        235         237         237
237        175         162         628
674        237         674         674
628        628         175         734

Page 26:

Radix Sort Analysis
The running time of this sort is O(nk),
◦ since we do k stable sorts that each run in O(n) time, where k is the number of digits.
 (A stable sort is one where, if two values being sorted, say vi and vj, are equal, and vi comes before vj in the unsorted list, then vi will STILL come before vj in the sorted list.)
Depending on how many digits the numbers have, this sort can be more efficient than any O(n log n) sort.

Page 27:

Radix Sort
Question:
◦ Would it work the other way around, namely from most significant digit to least significant digit?

Page 28:

A lower time bound for comparison-based sorting
Shown on the board.