Merge sort and quick sort

Description: CSC-391 Data Structure and Algorithm course material

  • 1. Merge Sort and Quick Sort CSC 391
  • 2. Sorting. Insertion sort: design approach: incremental; sorts in place: yes; best case: Θ(n); worst case: Θ(n²). Bubble sort: design approach: incremental; sorts in place: yes; running time: Θ(n²).
  • 3. Sorting. Selection sort: design approach: incremental; sorts in place: yes; running time: Θ(n²). Merge sort: design approach: divide and conquer; sorts in place: no; running time: let's see!
  • 4. Divide-and-Conquer. Divide the problem into a number of similar sub-problems of smaller size. Conquer the sub-problems: solve them recursively; when a sub-problem is small enough, solve it in a straightforward manner. Combine the solutions of the sub-problems to obtain the solution for the original problem.
  • 5. Merge Sort Approach. To sort an array A[p..r]: Divide: split the n-element sequence to be sorted into two subsequences of n/2 elements each. Conquer: sort the subsequences recursively using merge sort; when the size of a sequence is 1, there is nothing more to do. Combine: merge the two sorted subsequences.
  • 6. Merge Sort
    Alg.: MERGE-SORT(A, p, r)
        if p < r                      (check for base case)
            then q ← ⌊(p + r)/2⌋      (divide)
                 MERGE-SORT(A, p, q)      (conquer)
                 MERGE-SORT(A, q + 1, r)  (conquer)
                 MERGE(A, p, q, r)        (combine)
    Initial call: MERGE-SORT(A, 1, n).
    Example: A = [5, 2, 4, 7, 1, 3, 2, 6], with p = 1, q = 4, r = 8.
  • 7. Example (n a power of 2), divide phase: the array [5, 2, 4, 7, 1, 3, 2, 6] (q = 4) is split into [5, 2, 4, 7] and [1, 3, 2, 6], then into [5, 2], [4, 7], [1, 3], [2, 6], and finally into the single elements 5, 2, 4, 7, 1, 3, 2, 6.
  • 8. Example (n a power of 2), conquer and merge phase: the single elements are merged into sorted pairs [2, 5], [4, 7], [1, 3], [2, 6], then into [2, 4, 5, 7] and [1, 2, 3, 6], and finally into the sorted array [1, 2, 2, 3, 4, 5, 6, 7].
  • 9. Example (n not a power of 2), divide phase: the 11-element array [4, 7, 2, 6, 1, 4, 7, 3, 5, 2, 6] (q = 6) is split into [4, 7, 2, 6, 1, 4] (q = 3) and [7, 3, 5, 2, 6] (q = 9), then into [4, 7, 2], [6, 1, 4], [7, 3, 5], [2, 6], and so on down to single elements.
  • 10. Example (n not a power of 2), conquer and merge phase: the pieces are merged back into the sorted runs [2, 4, 7], [1, 4, 6], [3, 5, 7], [2, 6], then into [1, 2, 4, 4, 6, 7] and [2, 3, 5, 6, 7], and finally into the sorted array [1, 2, 2, 3, 4, 4, 5, 6, 6, 7, 7].
  • 11. Merge - Pseudocode
    Alg.: MERGE(A, p, q, r)
    1. Compute n1 = q - p + 1 and n2 = r - q
    2. Copy the first n1 elements into L[1 . . n1 + 1] and the next n2 elements into R[1 . . n2 + 1]
    3. L[n1 + 1] ← ∞; R[n2 + 1] ← ∞
    4. i ← 1; j ← 1
    5. for k ← p to r
    6.     do if L[i] ≤ R[j]
    7.            then A[k] ← L[i]
    8.                 i ← i + 1
    9.            else A[k] ← R[j]
    10.                j ← j + 1
    Example: merging L = [2, 4, 5, 7] and R = [1, 2, 3, 6] (plus the ∞ sentinels), the two sorted halves of A[p . . r] = [2, 4, 5, 7, 1, 2, 3, 6].
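    The pseudocode above translates almost line for line into the Java style used on slide 12. The sketch below is illustrative rather than part of the original material: it assumes 0-based array indices and uses Integer.MAX_VALUE in place of the ∞ sentinel.

        // Illustrative sketch of MERGE(A, p, q, r) in Java (0-based indices).
        static void merge(int[] a, int p, int q, int r) {
            int n1 = q - p + 1;              // size of the left run a[p..q]
            int n2 = r - q;                  // size of the right run a[q+1..r]
            int[] left  = new int[n1 + 1];
            int[] right = new int[n2 + 1];
            for (int i = 0; i < n1; i++) left[i]  = a[p + i];
            for (int j = 0; j < n2; j++) right[j] = a[q + 1 + j];
            left[n1]  = Integer.MAX_VALUE;   // sentinels: never chosen before real elements
            right[n2] = Integer.MAX_VALUE;
            int i = 0, j = 0;
            for (int k = p; k <= r; k++) {   // each of the r - p + 1 positions is filled once
                if (left[i] <= right[j]) {
                    a[k] = left[i++];
                } else {
                    a[k] = right[j++];
                }
            }
        }

    Because every position k is filled exactly once, the merge runs in Θ(n) time for n = r - p + 1 elements, which is the fact the recurrence on the next slides relies on.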
  • 12. Complexity of Merge Sort
    Algorithm:
        mergesort(int[] a, int left, int right) {
            if (right > left) {
                int middle = left + (right - left) / 2;
                mergesort(a, left, middle);
                mergesort(a, middle + 1, right);
                merge(a, left, middle, right);
            }
        }
    Assumption: N is a power of two.
    For N = 1 the time is a constant (denoted by 1).
    Otherwise, the time to mergesort N elements = the time to mergesort two sets of N/2 elements + the time to merge two arrays of N/2 elements each.
    The time to merge two arrays of N/2 elements each is linear, i.e. N.
    Thus we have:
    (1) T(1) = 1
    (2) T(N) = 2T(N/2) + N
    Next we will solve this recurrence relation. First we divide (2) by N:
    (3) T(N) / N = T(N/2) / (N/2) + 1
  • 13. Complexity of Merge Sort (continued)
    N is a power of two, so we can keep halving and write:
    (4) T(N/2) / (N/2) = T(N/4) / (N/4) + 1
    (5) T(N/4) / (N/4) = T(N/8) / (N/8) + 1
    (6) T(N/8) / (N/8) = T(N/16) / (N/16) + 1
    (7) ...
    (8) T(2) / 2 = T(1) / 1 + 1
    Now we add equations (3) through (8): the sum of their left-hand sides equals the sum of their right-hand sides:
    T(N)/N + T(N/2)/(N/2) + T(N/4)/(N/4) + ... + T(2)/2
        = T(N/2)/(N/2) + T(N/4)/(N/4) + ... + T(2)/2 + T(1)/1 + log N
    (log N is the sum of the 1s on the right-hand sides, one per equation).
    After cancelling the terms that appear on both sides, we get
    (9) T(N)/N = T(1)/1 + log N
    T(1) is 1, hence we obtain
    (10) T(N) = N + N log N = O(N log N)
    Hence the complexity of the MergeSort algorithm is O(N log N).
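    The same telescoping can be written as one chain of equalities (a compact restatement of steps (3) through (10), under the same assumption that N is a power of two):

        % Telescoping T(N) = 2T(N/2) + N with T(1) = 1 and N = 2^k
        \frac{T(N)}{N}
          = \frac{T(N/2)}{N/2} + 1
          = \frac{T(N/4)}{N/4} + 2
          = \cdots
          = \frac{T(1)}{1} + \log_2 N
        \quad\Longrightarrow\quad
        T(N) = N + N\log_2 N = O(N\log N)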
  • 14. Quicksort. Sort an array A[p..r]. Divide: partition the array A into two subarrays A[p..q] and A[q+1..r], such that each element of A[p..q] is smaller than or equal to each element of A[q+1..r]. We need to find the index q that defines this partition.
  • 15. Quicksort. Conquer: recursively sort A[p..q] and A[q+1..r] using quicksort. Combine: trivial; the subarrays are sorted in place, no additional work is required to combine them, and the entire array is now sorted.
  • 16. Recurrence
    Alg.: QUICKSORT(A, p, r)
        if p < r
            then q ← PARTITION(A, p, r)
                 QUICKSORT(A, p, q)
                 QUICKSORT(A, q + 1, r)
    Initially: p = 1, r = n.
    Recurrence: T(n) = T(q) + T(n - q) + n
  • 17. Partitioning the Array
    Alg.: PARTITION(A, p, r)
    1. x ← A[p]
    2. i ← p - 1
    3. j ← r + 1
    4. while TRUE
    5.     do repeat j ← j - 1
    6.            until A[j] ≤ x
    7.        repeat i ← i + 1
    8.            until A[i] ≥ x
    9.        if i < j
    10.           then exchange A[i] ↔ A[j]
    11.           else return j
    Running time: Θ(n), where n = r - p + 1 (each element is visited once!).
    Example: partitioning A = [5, 3, 2, 6, 4, 1, 3, 7] around the pivot x = A[p] = 5 produces A[p..q] ≤ x ≤ A[q+1..r].
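    Combining this partition procedure with the QUICKSORT recursion from slide 16 gives a runnable Java sketch (illustrative, in the same style as the mergesort code on slide 12; the method names partition and quicksort and the 0-based indexing are mine, not the original course code):

        // Illustrative Java sketch of the Hoare-style PARTITION and QUICKSORT (0-based indices).
        static int partition(int[] a, int p, int r) {
            int x = a[p];                       // pivot: first element of the subarray
            int i = p - 1;
            int j = r + 1;
            while (true) {
                do { j--; } while (a[j] > x);   // scan right-to-left until a[j] <= x
                do { i++; } while (a[i] < x);   // scan left-to-right until a[i] >= x
                if (i < j) {                    // out-of-place pair: swap and continue
                    int tmp = a[i]; a[i] = a[j]; a[j] = tmp;
                } else {
                    return j;                   // j is the split point q
                }
            }
        }

        static void quicksort(int[] a, int p, int r) {
            if (p < r) {
                int q = partition(a, p, r);     // a[p..q] <= a[q+1..r]
                quicksort(a, p, q);             // note: q, not q - 1, with this partition scheme
                quicksort(a, q + 1, r);
            }
        }

    Example call: quicksort(a, 0, a.length - 1). With this partition scheme the left recursive call must include index q itself, exactly as in the pseudocode.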
  • 18. Analysis of Quicksort. Partition can be done in O(n) time, where n is the size of the array. Let T(n) be the number of comparisons required by quicksort. If the pivot ends up at position k, then we have T(n) = T(k - 1) + T(n - k) + n. To determine the best-, worst-, and average-case complexity we need to determine the values of k that correspond to these cases.
  • 19. Best-Case Complexity. The best case is clearly when the pivot always partitions the array equally. Intuitively, this leads to a recursion depth of at most lg n, and we can actually prove it: in this case T(n) ≤ T(n/2) + T(n/2) + n, which solves to Θ(n lg n).
  • 20. Best-Case Partitioning. Partitioning produces two regions of size n/2. Recurrence (q = n/2): T(n) = 2T(n/2) + Θ(n), so T(n) = Θ(n lg n) by the Master theorem.
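    For completeness, the Master theorem step can be spelled out; this is a standard application with a = 2, b = 2, f(n) = Θ(n):

        % Master theorem applied to T(n) = 2T(n/2) + \Theta(n)
        a = 2, \quad b = 2, \quad f(n) = \Theta(n), \qquad
        n^{\log_b a} = n^{\log_2 2} = n
        % f(n) = \Theta(n^{\log_b a}), so case 2 applies:
        T(n) = \Theta\!\left(n^{\log_b a} \log n\right) = \Theta(n \log n)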
  • 21. Average-Case Complexity. The average case is rather complex to analyze, but it is where the algorithm earns its name. The bottom line is: T(n) = Θ(n lg n).
  • 22. Worst-Case Partitioning. The worst-case behavior for quicksort occurs when the partitioning routine produces one region with n - 1 elements and one with only 1 element. Let us assume that this unbalanced partitioning arises at every step of the algorithm. Since partitioning costs Θ(n) time and T(1) = Θ(1), the recurrence for the running time is T(n) = T(n - 1) + Θ(n). To evaluate this recurrence, we observe that T(1) = Θ(1) and then iterate: the recursion tree degenerates into a chain of subproblems of sizes n, n - 1, n - 2, ..., 2, 1, whose per-level costs n, n - 1, ..., 2, 1 sum to Θ(n²).
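    Written out, the iteration is simply an arithmetic series (a spelled-out version of the recursion-tree totals on this slide):

        % Iterating T(n) = T(n-1) + \Theta(n) down to T(1) = \Theta(1)
        T(n) = \Theta(n) + \Theta(n-1) + \cdots + \Theta(2) + \Theta(1)
             = \Theta\!\left(\sum_{k=1}^{n} k\right)
             = \Theta\!\left(\frac{n(n+1)}{2}\right)
             = \Theta(n^2)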
  • 23. Summary of quicksort's behavior. Best case: split in the middle, Θ(n log n). Worst case: an already-sorted array, Θ(n²). Average case: random arrays, Θ(n log n).