Post on 18-Dec-2015
Quicksort
Divide-and-Conquer
Quicksort Algorithm
Given an array S of n elements (e.g., integers):
• If the array contains at most one element, return it.
• Else
– Pick one element to use as pivot.
– Divide step: Partition the remaining elements into two sub-arrays:
  • S1: elements less than or equal to pivot
  • S2: elements greater than pivot
– Conquer step: Quicksort the two sub-arrays, recursively.
– Combine step: return the sorted S1, followed by the pivot, followed by the sorted S2.
(When sorting in place, nothing extra needs to be done.)
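The divide, conquer, and combine steps above can be sketched in Python as follows. This is a minimal out-of-place version for illustration only; the function name `quicksort_simple` and the choice of the first element as pivot are assumptions, not something the slides prescribe.

```python
def quicksort_simple(s):
    """Return a new sorted list (illustrative sketch, not in-place)."""
    # Base case: zero or one element is already sorted.
    if len(s) <= 1:
        return list(s)
    # Assumption: use the first element as the pivot.
    pivot, rest = s[0], s[1:]
    # Divide: S1 holds elements <= pivot, S2 holds elements > pivot.
    s1 = [x for x in rest if x <= pivot]
    s2 = [x for x in rest if x > pivot]
    # Conquer and combine: sorted S1, then the pivot, then sorted S2.
    return quicksort_simple(s1) + [pivot] + quicksort_simple(s2)
```

Building new lists keeps the three steps easy to see, at the cost of extra memory; the in-place scheme described later avoids that copying.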
General Example
Pseudo-code
Input: an array A[left..right]
QuickSort (A, left, right) {
    if (left < right) {
        pivot = Partition (A, left, right)
        QuickSort (A, left, pivot-1)
        QuickSort (A, pivot+1, right)
    }
}
Two key steps
• How to pick a pivot?
• How to partition?
Pick a pivot
There are a number of ways to pick the pivot element, such as:
• Use the first element as pivot
• Choose the pivot randomly
• Pick median value of three elements from data array:
data[0], data[n/2], and data[n-1].
Example
We are given an array of n integers to sort:
40 20 10 80 60 50 7 30 100
Pick Pivot Element
In this example, we will use the first element in the array (40) as the pivot:
40 20 10 80 60 50 7 30 100
Partitioning Array
Given a pivot, partition the elements of the array such that the resulting array consists of:
1. One sub-array that contains elements <= pivot
2. Another sub-array that contains elements > pivot
The sub-arrays are stored in the original data array.
Partitioning loops through, swapping elements below/above pivot.
40 20 10 80 60 50 7 30 100        pivot_index = 0
[0] [1] [2] [3] [4] [5] [6] [7] [8]
too_big_index starts at the left end of the array and too_small_index at the right end.
1. While data[too_big_index] <= data[pivot]
       ++too_big_index
2. While data[too_small_index] > data[pivot]
       --too_small_index
3. If too_big_index < too_small_index
       swap data[too_big_index] and data[too_small_index]
4. While too_small_index > too_big_index, go to 1.
5. Swap data[too_small_index] and data[pivot_index]
First pass:
• Step 1 stops at too_big_index = 3 (80 > 40).
• Step 2 stops at too_small_index = 7 (30 <= 40).
• Step 3 swaps data[3] and data[7]:
40 20 10 30 60 50 7 80 100        pivot_index = 0
[0] [1] [2] [3] [4] [5] [6] [7] [8]
• Step 4: too_small_index (7) > too_big_index (3), so go to 1.
Second pass:
• Step 1 stops at too_big_index = 4 (60 > 40).
• Step 2 stops at too_small_index = 6 (7 <= 40).
• Step 3 swaps data[4] and data[6]:
40 20 10 30 7 50 60 80 100        pivot_index = 0
[0] [1] [2] [3] [4] [5] [6] [7] [8]
• Step 4: too_small_index (6) > too_big_index (4), so go to 1.
Third pass:
• Step 1 stops at too_big_index = 5 (50 > 40).
• Step 2 stops at too_small_index = 4 (7 <= 40).
• Step 3: too_big_index (5) is not less than too_small_index (4), so no swap.
• Step 4: the indices have crossed, so the loop ends.
• Step 5 swaps data[4] and data[0], putting the pivot in its final position:
7 20 10 30 40 50 60 80 100        pivot_index = 4
[0] [1] [2] [3] [4] [5] [6] [7] [8]
Partition Result
7 20 10 30 40 50 60 80 100
[0] [1] [2] [3] [4] [5] [6] [7] [8]
data[0..3] <= data[pivot]        data[5..8] > data[pivot]
Recursion: Quicksort Sub-arrays
7 20 10 30 40 50 60 80 100
[0] [1] [2] [3] [4] [5] [6] [7] [8]
data[0..3] <= data[pivot]        data[5..8] > data[pivot]
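The partition steps and the recursive calls can be put together as the following Python sketch. It follows steps 1 to 5 with the first element as pivot; the function names and the bounds check in step 1 are additions for illustration (the check keeps too_big_index from running past the end when the pivot is the largest element), not part of the slides.

```python
def partition(data, left, right):
    """Partition data[left..right] around data[left]; return the pivot's final index."""
    pivot_index = left
    pivot = data[pivot_index]
    too_big_index = left + 1
    too_small_index = right
    while True:
        # Step 1: advance past elements that belong on the left.
        # Added guard: stop at `right` so the scan cannot leave the subarray.
        while too_big_index <= right and data[too_big_index] <= pivot:
            too_big_index += 1
        # Step 2: retreat past elements that belong on the right.
        # This stops at `left` at the latest, because data[left] == pivot.
        while data[too_small_index] > pivot:
            too_small_index -= 1
        # Step 3: swap an out-of-place pair, then (step 4) repeat.
        if too_big_index < too_small_index:
            data[too_big_index], data[too_small_index] = (
                data[too_small_index], data[too_big_index])
        else:
            break  # Step 4: the indices have met or crossed.
    # Step 5: move the pivot into its final position.
    data[pivot_index], data[too_small_index] = (
        data[too_small_index], data[pivot_index])
    return too_small_index


def quicksort(data, left=0, right=None):
    """Sort data[left..right] in place."""
    if right is None:
        right = len(data) - 1
    if left < right:
        p = partition(data, left, right)
        quicksort(data, left, p - 1)   # elements <= pivot
        quicksort(data, p + 1, right)  # elements > pivot
```

Calling `partition` on the example array 40 20 10 80 60 50 7 30 100 with left = 0 and right = 8 reproduces the partition result shown above, with the pivot ending at index 4.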
Analysis of Quicksort
Quicksort: Best Case
Analysis of quicksort—best case
• Suppose each partition operation divides the array almost exactly in half
• Then the depth of the recursion is log₂ n
– Because that’s how many times we can halve n
• We note that
– Each partition is linear over its subarray
– All the partitions at one level cover the array
Partitioning at various levels
Best Case Analysis
• We cut the array size in half each time
• So the depth of the recursion is log₂ n
• At each level of the recursion, all the partitions at that level do work that is linear in n
• O(log₂ n) * O(n) = O(n log₂ n)
• Hence in the best case, quicksort has time complexity O(n log₂ n)
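The same bound can be read off the usual divide-and-conquer recurrence. As a sketch, assume n is a power of two and that partitioning a subarray of size n costs at most cn for some constant c (both are simplifying assumptions that the slides do not state):

```latex
\begin{aligned}
T(n) &= 2\,T(n/2) + cn \\
     &= 4\,T(n/4) + 2cn \\
     &= \dots \\
     &= 2^{k}\,T(n/2^{k}) + k\,cn \\
     &= n\,T(1) + cn\log_2 n \qquad (k = \log_2 n) \\
     &= O(n \log_2 n)
\end{aligned}
```

Each line unrolls one more level of recursion: the number of subproblems doubles while their size halves, so every level contributes the same cn of partitioning work.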
• What about the worst case?
Quicksort: Worst Case
• Assume first element is chosen as pivot.
• Assume we get array that is already in order:
2 4 10 12 13 50 57 63 100        pivot_index = 0
[0] [1] [2] [3] [4] [5] [6] [7] [8]
Applying the same five partition steps:
• Step 1 stops at too_big_index = 1 (4 > 2).
• Step 2 moves too_small_index from [8] all the way down to [0], because every other element is greater than the pivot.
• Step 3: too_big_index (1) is not less than too_small_index (0), so no swap.
• Step 4: the indices have crossed, so the loop ends.
• Step 5 swaps data[0] with itself, so the pivot stays at index 0.
2 4 10 12 13 50 57 63 100        pivot_index = 0
[0] [1] [2] [3] [4] [5] [6] [7] [8]
data[0] <= data[pivot]        data[1..8] > data[pivot]
Worst case
• In the worst case, partitioning always divides the size n array into these three parts:
– A length one part, containing the pivot itself
– A length zero part, and
– A length n-1 part, containing everything else
• We don’t recur on the zero-length part
• Recurring on the length n-1 part requires (in the worst case) recurring to depth n-1
Worst case partitioning
Worst case for quicksort
• In the worst case, recursion may be n levels deep (for an array of size n)
• But the partitioning work done at each level is still O(n)
• O(n) * O(n) = O(n^2)
• So worst case for Quicksort is O(n^2)
• When does this happen?
– There are many arrangements that could make this happen
– Here are two common cases:
  • When the array is already sorted
  • When the array is inversely sorted (sorted in the opposite order)
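The quadratic behavior can be checked with a small counting experiment. The sketch below is an illustration rather than the partition scheme from the slides: it assumes a first-element pivot and charges one comparison per non-pivot element at each level. The function name `count_comparisons` is hypothetical.

```python
def count_comparisons(data):
    """Count pivot comparisons for quicksort with a first-element pivot."""
    if len(data) <= 1:
        return 0
    pivot, rest = data[0], data[1:]
    smaller = [x for x in rest if x <= pivot]
    larger = [x for x in rest if x > pivot]
    # One comparison against the pivot for each remaining element,
    # plus whatever the two recursive calls need.
    return len(rest) + count_comparisons(smaller) + count_comparisons(larger)


n = 100
sorted_input = list(range(n))
reversed_input = list(range(n, 0, -1))
# Both already-ordered inputs cost (n-1) + (n-2) + ... + 1 = n(n-1)/2.
print(count_comparisons(sorted_input), n * (n - 1) // 2)
print(count_comparisons(reversed_input), n * (n - 1) // 2)
```

On sorted or reverse-sorted input every level removes only the pivot, so the counts add up to n(n-1)/2, which is the O(n^2) bound stated above.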
Typical case for quicksort
• If the array is sorted to begin with (and the first element is used as pivot), Quicksort is terrible: O(n^2)
• However, Quicksort is usually O(n log₂ n)
• The constant factors are small enough that Quicksort is generally among the fastest comparison sorts in practice
• Much real-world sorting is done by Quicksort or variants of it
What can we do to avoid worst case?
Picking a better pivot
• Before, we picked the first element of the subarray to use as a pivot
– If the array is already sorted, this results in O(n^2) behavior
– It’s no better if we pick the last element
• Pick median value of three elements from data array: data[0], data[n/2], and data[n-1].
Use this median value as pivot.
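Median-of-three selection can be sketched as below. The helper name `median_of_three_index` and its `left`/`right` parameters are assumptions made so the same function also works on subarrays; for the whole array (left = 0, right = n-1) it looks at data[0], data[n/2], and data[n-1] as described above.

```python
def median_of_three_index(data, left, right):
    """Return the index of the median of the first, middle, and last elements."""
    # For left = 0 and right = n - 1 this is n // 2.
    mid = left + (right - left + 1) // 2
    candidates = [(data[left], left), (data[mid], mid), (data[right], right)]
    candidates.sort()            # order the three (value, index) pairs by value
    return candidates[1][1]      # index of the middle value


data = [2, 4, 10, 12, 13, 50, 57, 63, 100]    # already sorted
m = median_of_three_index(data, 0, len(data) - 1)
# Moving the chosen pivot to the front lets a first-element partition use it.
data[0], data[m] = data[m], data[0]
```

On the sorted array this picks 13 (index 4) instead of 2, so the first partition splits the array roughly in half rather than into sizes 0 and n-1.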
Compare with other Sorting Algorithms
Quicksort for Small Arrays
• For very small arrays (N <= 20), quicksort does not perform as well as insertion sort
• A good cutoff is N = 10
• Switching to insertion sort for small arrays can save about 15% in the running time
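A hybrid along these lines might look like the sketch below. The names `CUTOFF`, `insertion_sort`, and `hybrid_quicksort` are hypothetical, and for brevity the partition here is a simple single-scan (Lomuto-style) scheme with a first-element pivot, not the two-index scheme from the earlier slides. The best cutoff depends on the machine and data, so the value 10 is only the slide's suggestion.

```python
CUTOFF = 10  # suggested cutoff from the slide; tune by measurement


def insertion_sort(data, left, right):
    """Sort data[left..right] in place by insertion."""
    for i in range(left + 1, right + 1):
        item = data[i]
        j = i - 1
        # Shift larger elements one slot right to make room for item.
        while j >= left and data[j] > item:
            data[j + 1] = data[j]
            j -= 1
        data[j + 1] = item


def hybrid_quicksort(data, left=0, right=None):
    """Quicksort that hands small subarrays to insertion sort."""
    if right is None:
        right = len(data) - 1
    if right - left + 1 <= CUTOFF:
        insertion_sort(data, left, right)
        return
    pivot = data[left]
    boundary = left  # invariant: data[left+1..boundary] <= pivot
    for i in range(left + 1, right + 1):
        if data[i] <= pivot:
            boundary += 1
            data[boundary], data[i] = data[i], data[boundary]
    # Place the pivot between the two parts.
    data[left], data[boundary] = data[boundary], data[left]
    hybrid_quicksort(data, left, boundary - 1)
    hybrid_quicksort(data, boundary + 1, right)
```

The recursion stops as soon as a subarray has at most `CUTOFF` elements, which removes the many tiny recursive calls at the bottom of the recursion tree.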
Mergesort vs Quicksort
• Both run in O(n lg n)
– Mergesort: always
– Quicksort: on average
• Compared with Quicksort, Mergesort makes fewer comparisons but moves more elements
• In C++, copying objects can be expensive while comparing objects is often relatively cheap. Therefore, quicksort is the sorting routine commonly used in C++ libraries.
• In Java, an element comparison is expensive but moving elements is cheap. Therefore, Mergesort is used in the standard Java library for generic sorting.