Copyright (C) Gal Kaminka 2003 1
Data Structures and Algorithms
Sorting II:
Divide and Conquer Sorting
Gal A. Kaminka
Computer Science Department
2
Last week: in-place sorting
Bubble Sort: O(n^2) comparisons, O(n) best-case comparisons, O(n^2) exchanges
Selection Sort: O(n^2) comparisons, O(n^2) best-case comparisons, O(n) exchanges (always)
Insertion Sort: O(n^2) comparisons, O(n) best-case comparisons, fewer exchanges than bubble sort; best in practice for small lists (<30)
3
This week
MergeSort: O(n log n) always, O(n) storage
QuickSort: O(n log n) average, O(n^2) worst; good in practice (>30), O(log n) storage
4
MergeSort
A divide-and-conquer technique
Each unsorted collection is split into 2
  Then again, and again, ...
  Until we have collections of size 1
Now we merge sorted collections
  Then again, and again, ...
  Until we merge the two halves
5
MergeSort(array a, indexes low, high)
  if (low < high)
    middle ← (low + high)/2    // integer division
    MergeSort(a,low,middle)     // split 1
    MergeSort(a,middle+1,high)  // split 2
    Merge(a,low,middle,high)    // merge 1+2
6
Merge(array a, indexes low, mid, high)
  b ← empty array, t ← mid+1, i ← low, tl ← low
  while (tl<=mid AND t<=high)
    if (a[tl]<=a[t])
      b[i] ← a[tl]
      i ← i+1, tl ← tl+1
    else
      b[i] ← a[t]
      i ← i+1, t ← t+1
  if tl<=mid copy a[tl…mid] into b[i…]
  else if t<=high copy a[t…high] into b[i…]
  copy b[low…high] onto a[low…high]
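The two routines above can be sketched in runnable form. This is a minimal Python sketch, not the course's reference code; it assumes 0-based indexing and inclusive bounds, and the names merge and merge_sort are chosen here for illustration.

```python
def merge(a, low, mid, high):
    """Merge the sorted runs a[low..mid] and a[mid+1..high] (inclusive bounds)."""
    b = []                      # temporary storage, grows to high - low + 1 items
    tl, t = low, mid + 1        # read positions in the left and right runs
    while tl <= mid and t <= high:
        if a[tl] <= a[t]:       # <= keeps equal items in their original order
            b.append(a[tl])
            tl += 1
        else:
            b.append(a[t])
            t += 1
    b.extend(a[tl:mid + 1])     # leftover items of the left run, if any
    b.extend(a[t:high + 1])     # leftover items of the right run, if any
    a[low:high + 1] = b         # copy the merged run back into a


def merge_sort(a, low=0, high=None):
    """Sort a[low..high] in place by splitting, sorting each half, and merging."""
    if high is None:
        high = len(a) - 1
    if low < high:
        middle = (low + high) // 2
        merge_sort(a, low, middle)        # split 1
        merge_sort(a, middle + 1, high)   # split 2
        merge(a, low, middle, high)       # merge 1+2


data = [25, 57, 48, 37, 12, 92, 86, 33]
merge_sort(data)
print(data)  # [12, 25, 33, 37, 48, 57, 86, 92]
```

Building b with append instead of writing b[i] is only a Python convenience; the temporary list still ends up holding every item of the two runs.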
7
An example
Initial: [25 57 48 37 12 92 86 33]
Split: [25 57 48 37] [12 92 86 33]
Split: [25 57] [48 37] [12 92] [86 33]
Split: [25] [57] [48] [37] [12] [92] [86] [33]
Merge: [25 57] [37 48] [12 92] [33 86]
Merge: [25 37 48 57] [12 33 86 92]
Merge: [12 25 33 37 48 57 86 92]
8
The complexity of MergeSort
Every split, we halve the collection
How many times can this be done?
We are looking for x, where 2^x = n
x = log_2 n
So there are a total of log n levels of splitting
9
The complexity of MergeSort
Each merge is of what run-time?
First merge step: n/2 merges of 2 items → n steps
Second merge step: n/4 merges of 4 items → n steps
Third merge step: n/8 merges of 8 items → n steps
….
How many merge steps? Same as splits: log n
Total: n log n steps
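The count above can be written as a single sum. This is a sketch that assumes n is a power of two; k is an index introduced here for the merge step.

```latex
\sum_{k=1}^{\log_2 n}
  \underbrace{\frac{n}{2^k}}_{\text{merges in step } k}
  \cdot
  \underbrace{2^k}_{\text{items per merge}}
  \;=\; \sum_{k=1}^{\log_2 n} n
  \;=\; n \log_2 n
```

Each merge step touches all n items once, and there are log_2 n such steps.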
10
Storage complexity of MergeSort
Every merge, we need to hold the merged array:
[Figure: sub-arrays and the temporary array holding their merged items]
11
Storage complexity of MergeSort
So we need temporary storage for merging
  Which is the same size as the two collections together
To merge the last two sub-arrays (each of size n/2)
  We need n/2 + n/2 = n temporary storage
Total: O(n) storage
12
MergeSort summary
O(n log n) runtime (best and worst)
O(n) storage (not in-place)
Very naturally done using recursion
  But note: it can be done without recursion!
In practice: can be improved by combining with insertion sort
  Split down to arrays of size 20-30, then insertion-sort each
  Then merge
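The insertion-sort combination can be sketched as follows. This is a minimal Python sketch under stated assumptions: CUTOFF = 24 is an arbitrary value picked from the 20-30 range, and insertion_sort, merge and hybrid_merge_sort are names chosen here for illustration.

```python
CUTOFF = 24  # assumed threshold, picked from the 20-30 range


def insertion_sort(a, low, high):
    """Sort a[low..high] in place by insertion (inclusive bounds)."""
    for i in range(low + 1, high + 1):
        item = a[i]
        j = i - 1
        while j >= low and a[j] > item:
            a[j + 1] = a[j]     # shift larger items one slot to the right
            j -= 1
        a[j + 1] = item


def merge(a, low, mid, high):
    """Merge sorted runs a[low..mid] and a[mid+1..high] via a temporary list."""
    b, tl, t = [], low, mid + 1
    while tl <= mid and t <= high:
        if a[tl] <= a[t]:
            b.append(a[tl])
            tl += 1
        else:
            b.append(a[t])
            t += 1
    b.extend(a[tl:mid + 1])
    b.extend(a[t:high + 1])
    a[low:high + 1] = b


def hybrid_merge_sort(a, low=0, high=None):
    """Merge sort that hands runs of at most CUTOFF items to insertion sort."""
    if high is None:
        high = len(a) - 1
    if high - low + 1 <= CUTOFF:
        insertion_sort(a, low, high)      # small run: stop splitting
        return
    middle = (low + high) // 2
    hybrid_merge_sort(a, low, middle)
    hybrid_merge_sort(a, middle + 1, high)
    merge(a, low, middle, high)
```

The best cutoff depends on the machine and the data, so it is worth measuring rather than fixing in advance.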
13
QuickSort
Key idea:
Select an item (called the pivot)
Put it into its proper FINAL position
Make sure:
  All greater items are on one side (side 1)
  All smaller items are on the other side (side 2)
Repeat for side 1
Repeat for side 2
14
Short example
25 57 48 37 12 92 86 33
Let’s select 25 as our initial pivot.
We move items such that:
  All left of 25 are smaller
  All right of 25 are larger
As a result 25 is now in its final position
12 25 57 48 37 92 86 33
15
Now, repeat (recursively) for left and right sides
12 25 57 48 37 92 86 33
Sort 12
Sort 57 48 37 92 86 33
12 needs no sorting
For the other side, we repeat the process
  Select a pivot item (let’s take 57)
  Move items around such that left items are smaller, etc.
16
12 25 57 48 37 92 86 33
Changes into
12 25 48 37 33 57 92 86
And now we repeat the process for left
12 25 37 33 48 57 92 86
12 25 33 37 48 57 92 86
12 25 33 37 48 57 92 86
And for the right
12 25 33 37 48 57 86 92
12 25 33 37 48 57 86 92
17
QuickSort(array a; index low, hi)
  if (low >= hi)
    return    // a[low..hi] is sorted
  pivot ← FindPivot(a,low,hi)
  p_index = partition(a,low,hi,pivot)
  QuickSort(a,low,p_index-1)
  QuickSort(a,p_index+1,hi)
18
Key questions
How do we select an item (FindPivot())?
  If we always select the largest item as the pivot
    Then this process becomes Selection Sort
    Which is O(n^2)
  So this works only if we select items “in the middle”
    Since then we will have log n divisions
How do we move items around efficiently (Partition())?
  A slow partition would offset the benefit of partitioning
19
FindPivot
To find a real median (middle item) takes O(n)
In practice however, we want this to be O(1)
So we approximate:
  Take the first item (a[low]) as the pivot, or
  Take the median of {a[low], a[hi], a[(low+hi)/2]}
FindPivot(array a; index low, high)
  return a[low]
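The median-of-three alternative can be sketched in O(1) as well. This is a minimal Python sketch; the function name and the choice to return both the pivot value and its index are assumptions made here, not part of the slides' FindPivot.

```python
def find_pivot_median_of_three(a, low, hi):
    """Return (pivot value, pivot index): the median of a[low], a[mid], a[hi]."""
    mid = (low + hi) // 2
    # Sort the three (value, index) pairs; the middle pair holds the median value.
    candidates = sorted([(a[low], low), (a[mid], mid), (a[hi], hi)])
    return candidates[1]


data = [25, 57, 48, 37, 12, 92, 86, 33]
print(find_pivot_median_of_three(data, 0, len(data) - 1))  # (33, 7)
```

Here the three candidates are 25, 37 and 33, so 33 is chosen. A partition routine that expects the pivot at a[low] would first need the chosen item swapped into that slot.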
20
Partition (in O(n))
Key idea: Keep two indexes into the array
  down starts at low and moves right, stopping at an item > pivot
  up starts at hi and moves left, stopping at an item <= pivot
We move up, down in the array
  Whenever they point inconsistently, interchange
At end: up and down have met or crossed, and up marks the final location of the pivot
21
partition(array a; index low,hi ; pivot; index pivot_i)
1. down ← low, up ← hi
2. while (down<up)
3.   while (a[down]<=pivot && down<hi)
4.     down ← down + 1
5.   while (a[up]>pivot)
6.     up ← up - 1
7.   if (down < up)
8.     swap(a[down],a[up])
9. a[pivot_i] = a[up]
10. a[up] = pivot
11. return up
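The partition scheme and the QuickSort driver can be sketched together in runnable form. This is a minimal Python sketch, assuming 0-based indexing and that the pivot is the first item of the range (pivot_i == low); quick_sort is a name chosen here for illustration.

```python
def partition(a, low, hi, pivot, pivot_i):
    """Partition a[low..hi] around pivot; assumes pivot == a[pivot_i] and pivot_i == low."""
    down, up = low, hi
    while down < up:
        while a[down] <= pivot and down < hi:
            down += 1           # skip items that already belong on the left
        while a[up] > pivot:
            up -= 1             # skip items that already belong on the right
        if down < up:
            a[down], a[up] = a[up], a[down]
    a[pivot_i] = a[up]          # move a small item into the pivot's old slot
    a[up] = pivot               # the pivot lands in its final position
    return up


def quick_sort(a, low=0, hi=None):
    """Sort a[low..hi] in place, using the first item of each range as pivot."""
    if hi is None:
        hi = len(a) - 1
    if low >= hi:
        return                  # zero or one item: already sorted
    p_index = partition(a, low, hi, a[low], low)
    quick_sort(a, low, p_index - 1)
    quick_sort(a, p_index + 1, hi)


data = [25, 57, 48, 37, 12, 92, 86, 33]
print(partition(data, 0, len(data) - 1, data[0], 0), data)
# 1 [12, 25, 48, 37, 57, 92, 86, 33]
quick_sort(data)
print(data)  # [12, 25, 33, 37, 48, 57, 86, 92]
```

With 0-based indexing the pivot 25 ends at index 1, which is the second position in the array.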
22
Example: partition() with pivot=25
First pass through loop on line 2:
25 57 48 37 12 92 86 33
down = 1 (item 25), up = 8 (item 33) [positions counted from 1]
23
Example: partition() with pivot=25
First pass through loop on line 2:
25 57 48 37 12 92 86 33
down = 2 (item 57), up = 8 (item 33)
We go into loop in line 3 (while a[down]<=pivot)
24
Example: partition() with pivot=25
First pass through loop on line 2:
25 57 48 37 12 92 86 33
down = 2 (item 57), up = 7 (item 86)
We go into loop in line 5 (while a[up]>pivot)
25
Example: partition() with pivot=25
First pass through loop on line 2:
25 57 48 37 12 92 86 33
down = 2 (item 57), up = 6 (item 92)
We go into loop in line 5 (while a[up]>pivot)
26
Example: partition() with pivot=25
First pass through loop on line 2:
25 57 48 37 12 92 86 33
down = 2 (item 57), up = 5 (item 12)
Now we found an inconsistency!
27
Example: partition() with pivot=25
First pass through loop on line 2:
25 12 48 37 57 92 86 33
down = 2 (item 12), up = 5 (item 57)
So we swap a[down] with a[up]
28
Example: partition() with pivot=25
Second pass through loop on line 2:
25 12 48 37 57 92 86 33
down = 2 (item 12), up = 5 (item 57)
29
Example: partition() with pivot=25
Second pass through loop on line 2:
25 12 48 37 57 92 86 33
down = 3 (item 48), up = 5 (item 57)
Move down again (increasing) – loop on line 3
30
Example: partition() with pivot=25
Second pass through loop on line 2:
25 12 48 37 57 92 86 33
down = 3 (item 48), up = 4 (item 37)
Now we begin to move up again – loop on line 5
31
Example: partition() with pivot=25
Second pass through loop on line 2:
25 12 48 37 57 92 86 33
down = 3 (item 48), up = 3 (item 48)
Again – loop on line 5
32
Example: partition() with pivot=25
Second pass through loop on line 2:
25 12 48 37 57 92 86 33
down = 3 (item 48), up = 2 (item 12)
down < up? No. So we don’t swap.
33
Example: partition() with pivot=25
Second pass through loop on line 2:
25 12 48 37 57 92 86 33
down = 3 (item 48), up = 2 (item 12)
Instead, we are done. Just put pivot in place.
34
Example: partition() with pivot=25
Second pass through loop on line 2:
12 25 48 37 57 92 86 33
down = 3 (item 48), up = 2 (item 25)
Instead, we are done. Just put pivot in place.
(swap it with a[up] – for us a[low] was the pivot)
35
Example: partition() with pivot=25
Second pass through loop on line 2:
12 25 48 37 57 92 86 33
down = 3 (item 48), up = 2 (item 25)
Now we return 2 as the new pivot index
36
Notes
We need the initial pivot_index in partition()
  For instance, change FindPivot():
    return the pivot (a[low]), as well as the initial pivot_index (low)
  Then use pivot_index in the final swap
QuickSort: average O(n log n), worst case O(n^2)
  Works very well in practice (collections >30)
  Space requirements O(log n) for recursion (average case)