Intro to CS – Honors I Merge Sort GEORGIOS PORTOKALIDIS [email protected].

MergeSort

MergeSort can be very easily expressed using recursion
◦ Also called Top-Down MergeSort

A fine example of a divide-and-conquer algorithm
◦ Break down a problem into smaller pieces and attack those

In simple words
◦ The array to be sorted is divided in half
◦ The two halves are sorted using recursion
◦ The two sorted arrays are merged to form a single sorted array

MergeSort Algorithm

1. If the array a has at most one element, do nothing (base case). Otherwise, do the following (recursive case):

2. Copy the first half of the elements in a to a smaller array named firstHalf.

3. Copy the rest of the elements in the array a to another smaller array named lastHalf.

4. Sort the array firstHalf using a recursive call.

5. Sort the array lastHalf using a recursive call.

6. Merge the elements in the arrays firstHalf and lastHalf into the array a.

Visualizing MergeSort
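As a text stand-in for a diagram, the following minimal sketch prints every divide and merge step for a small array. The class name MergeSortTrace, the indent and log parameters, and the compact merge helper are assumptions made for this illustration; they are not part of the slides.

```java
import java.util.Arrays;

// Hypothetical demo class (not from the slides): logs each divide and merge step.
class MergeSortTrace {

    // Sorts a and appends one line per divide/merge step to log.
    static void sort(int[] a, String indent, StringBuilder log) {
        if (a.length < 2) {
            return; // base case: 0 or 1 element is already sorted
        }
        int half = a.length / 2;
        int[] firstHalf = Arrays.copyOfRange(a, 0, half);
        int[] lastHalf = Arrays.copyOfRange(a, half, a.length);
        log.append(indent).append("divide ").append(Arrays.toString(a))
           .append(" -> ").append(Arrays.toString(firstHalf))
           .append(" ").append(Arrays.toString(lastHalf)).append("\n");
        sort(firstHalf, indent + "  ", log);
        sort(lastHalf, indent + "  ", log);
        merge(a, firstHalf, lastHalf);
        log.append(indent).append("merge  -> ").append(Arrays.toString(a)).append("\n");
    }

    // Compact merge of two sorted arrays into a.
    static void merge(int[] a, int[] firstHalf, int[] lastHalf) {
        int i = 0, j = 0, k = 0;
        while (i < firstHalf.length && j < lastHalf.length) {
            a[k++] = (firstHalf[i] < lastHalf[j]) ? firstHalf[i++] : lastHalf[j++];
        }
        while (i < firstHalf.length) a[k++] = firstHalf[i++];
        while (j < lastHalf.length) a[k++] = lastHalf[j++];
    }

    public static void main(String[] args) {
        int[] data = {7, 5, 11, 2};
        StringBuilder log = new StringBuilder();
        sort(data, "", log);
        System.out.print(log);
    }
}
```

Each extra level of indentation in the printed log corresponds to one level of recursion, so the log reads like the usual tree picture turned on its side.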

Merge Process

MergeSort divides an array into two parts
◦ firstHalf and lastHalf
◦ Once the recursive calls return, both of these arrays are sorted
◦ Their smallest elements are in firstHalf[0] and lastHalf[0]

The smallest element overall is the smaller of firstHalf[0] and lastHalf[0]
◦ We can copy/move that into the result array a

Assuming the smallest element was in firstHalf[0], the next smallest element is the smaller of firstHalf[1] and lastHalf[0]

Merge Process

int firstHalfIndex = 0, lastHalfIndex = 0, aIndex = 0;
while (Some_Condition)
{
    if (firstHalf[firstHalfIndex] < lastHalf[lastHalfIndex])
    {
        a[aIndex] = firstHalf[firstHalfIndex];
        aIndex++;
        firstHalfIndex++;
    }
    else
    {
        a[aIndex] = lastHalf[lastHalfIndex];
        aIndex++;
        lastHalfIndex++;
    }
}

What is the condition to terminate the loop?

while ((firstHalfIndex < firstHalf.length) && (lastHalfIndex < lastHalf.length))

Loop until one of the arrays is exhausted

When the loop ends, all the elements of one array have been moved, so we just need to move the remaining elements of the other array into a

//Precondition: Arrays firstHalf and lastHalf are sorted from
//smallest to largest; a.length = firstHalf.length + lastHalf.length.
//Postcondition: Array a contains all the values from firstHalf
//and lastHalf and is sorted from smallest to largest.
private static void merge(int[] a, int[] firstHalf, int[] lastHalf)
{
    int firstHalfIndex = 0, lastHalfIndex = 0, aIndex = 0;
    while ((firstHalfIndex < firstHalf.length) &&
           (lastHalfIndex < lastHalf.length))
    {
        if (firstHalf[firstHalfIndex] < lastHalf[lastHalfIndex])
        {
            a[aIndex] = firstHalf[firstHalfIndex];
            firstHalfIndex++;
        }
        else
        {
            a[aIndex] = lastHalf[lastHalfIndex];
            lastHalfIndex++;
        }
        aIndex++;
    }

    //At least one of firstHalf and lastHalf has been completely
    //copied to a. Copy rest of firstHalf, if any.
    while (firstHalfIndex < firstHalf.length)
    {
        a[aIndex] = firstHalf[firstHalfIndex];
        aIndex++;
        firstHalfIndex++;
    }
    //Copy rest of lastHalf, if any.
    while (lastHalfIndex < lastHalf.length)
    {
        a[aIndex] = lastHalf[lastHalfIndex];
        aIndex++;
        lastHalfIndex++;
    }
}
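To see the merge step on concrete data, here is a small sketch that merges two sorted arrays of different lengths. It follows the same comparison loop, but as a deliberate variation it copies the leftovers with System.arraycopy instead of the two trailing while loops. The class name MergeDemo and the sample values are assumptions for illustration only.

```java
import java.util.Arrays;

// Hypothetical demo class (not from the slides).
class MergeDemo {

    // Precondition: firstHalf and lastHalf are sorted and
    // a.length == firstHalf.length + lastHalf.length.
    static void merge(int[] a, int[] firstHalf, int[] lastHalf) {
        int firstHalfIndex = 0, lastHalfIndex = 0, aIndex = 0;
        while (firstHalfIndex < firstHalf.length && lastHalfIndex < lastHalf.length) {
            if (firstHalf[firstHalfIndex] < lastHalf[lastHalfIndex]) {
                a[aIndex++] = firstHalf[firstHalfIndex++];
            } else {
                a[aIndex++] = lastHalf[lastHalfIndex++];
            }
        }
        // Variation: bulk-copy the leftovers. At most one of the two
        // calls moves any elements; the other copies zero elements.
        int firstRemaining = firstHalf.length - firstHalfIndex;
        System.arraycopy(firstHalf, firstHalfIndex, a, aIndex, firstRemaining);
        aIndex += firstRemaining;
        System.arraycopy(lastHalf, lastHalfIndex, a, aIndex, lastHalf.length - lastHalfIndex);
    }

    public static void main(String[] args) {
        int[] firstHalf = {1, 4, 7};
        int[] lastHalf = {2, 3, 9, 10};
        int[] a = new int[firstHalf.length + lastHalf.length];
        merge(a, firstHalf, lastHalf);
        System.out.println(Arrays.toString(a)); // prints [1, 2, 3, 4, 7, 9, 10]
    }
}
```

In this run the comparison loop stops once firstHalf is exhausted, and the remaining 9 and 10 are copied from lastHalf in one step.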

Dividing an Array

//Precondition: a.length = firstHalf.length + lastHalf.length.
//Postcondition: All the elements of a are divided
//between the arrays firstHalf and lastHalf.
private static void divide(int[] a, int[] firstHalf, int[] lastHalf)
{
    for (int i = 0; i < firstHalf.length; i++)
        firstHalf[i] = a[i];

    for (int i = 0; i < lastHalf.length; i++)
        lastHalf[i] = a[firstHalf.length + i];
}

Back to MergeSort

/**
 Precondition: Every indexed variable of the array a has a value.
 Postcondition: a[0] <= a[1] <= . . . <= a[a.length - 1].
*/
public static void sort(int[] a)
{
    if (a.length >= 2)
    {
        int halfLength = a.length / 2;
        int[] firstHalf = new int[halfLength];
        int[] lastHalf = new int[a.length - halfLength];
        divide(a, firstHalf, lastHalf);
        sort(firstHalf);
        sort(lastHalf);
        merge(a, firstHalf, lastHalf);
    }
    //else do nothing. a.length <= 1, so a is sorted.
}

Why not create the arrays within divide?
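The slide leaves this question open. One plausible answer, offered here as an assumption rather than as the author's stated one, is that Java passes array references by value: a method can fill an array it receives, but assigning a new array to a parameter is invisible to the caller. The sketch below shows this with a hypothetical divideAndAllocate method in a hypothetical DivideParameterDemo class.

```java
// Hypothetical demo class (not from the slides).
class DivideParameterDemo {

    // Hypothetical variant of divide that tries to allocate the halves itself.
    static void divideAndAllocate(int[] a, int[] firstHalf, int[] lastHalf) {
        int halfLength = a.length / 2;
        // These assignments rebind only the local parameters;
        // the caller's variables are not affected.
        firstHalf = new int[halfLength];
        lastHalf = new int[a.length - halfLength];
        for (int i = 0; i < firstHalf.length; i++)
            firstHalf[i] = a[i];
        for (int i = 0; i < lastHalf.length; i++)
            lastHalf[i] = a[firstHalf.length + i];
    }

    public static void main(String[] args) {
        int[] a = {4, 3, 2, 1};
        int[] firstHalf = null;
        int[] lastHalf = null;
        divideAndAllocate(a, firstHalf, lastHalf);
        // The arrays created inside the method never reach the caller.
        System.out.println(firstHalf == null && lastHalf == null); // prints true
    }
}
```

A divide that creates the arrays would therefore have to hand them back some other way, for example by returning both in an int[][]; allocating them in sort and passing them in avoids that.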

MergeSort Characteristics

For an array with n elements, we repeatedly divide the array in half, similarly to a binary search

If the length of the array is
◦ odd, the array is split into segments of length (n-1)/2 and (n+1)/2
◦ even, the array is split into two segments of length n/2

So the halving goes x levels deep and, similarly to binary search, x <= ⌈log2(n)⌉

However, at each level we also need to merge the halves back together, and this requires up to n copies per level

So in total merging requires k copies, where k <= n⌈log2(n)⌉

The time complexity is O(n log(n))

MergeSort also has space overhead: it requires 2n locations at the top level (n for a plus n for firstHalf and lastHalf), and the recursive calls allocate further, smaller arrays
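The bounds above can be checked empirically. The following sketch instruments a merge sort to record how many levels of division occur and how many element copies the merges perform, then compares the copy count with n⌈log2(n)⌉. The class MergeSortCost, its counters, and the helper names are assumptions introduced for this check; they are not from the slides.

```java
import java.util.Arrays;

// Hypothetical instrumentation class (not from the slides).
class MergeSortCost {
    static long copies; // elements written into a by merge
    static int levels;  // deepest level at which an array was divided

    static void sort(int[] a, int depth) {
        if (a.length < 2) {
            return; // nothing to divide or merge
        }
        levels = Math.max(levels, depth);
        int halfLength = a.length / 2;
        int[] firstHalf = Arrays.copyOfRange(a, 0, halfLength);
        int[] lastHalf = Arrays.copyOfRange(a, halfLength, a.length);
        sort(firstHalf, depth + 1);
        sort(lastHalf, depth + 1);
        merge(a, firstHalf, lastHalf);
    }

    static void merge(int[] a, int[] firstHalf, int[] lastHalf) {
        int i = 0, j = 0, k = 0;
        while (i < firstHalf.length && j < lastHalf.length) {
            a[k++] = (firstHalf[i] < lastHalf[j]) ? firstHalf[i++] : lastHalf[j++];
        }
        while (i < firstHalf.length) a[k++] = firstHalf[i++];
        while (j < lastHalf.length) a[k++] = lastHalf[j++];
        copies += a.length; // every merge writes each element of a exactly once
    }

    // Smallest x with 2^x >= n, i.e. ceil(log2(n)) for n >= 1.
    static int ceilLog2(int n) {
        int x = 0;
        for (long size = 1; size < n; size *= 2) {
            x++;
        }
        return x;
    }

    // Returns {levels, copies} for a reverse-ordered array of length n.
    static long[] measure(int n) {
        copies = 0;
        levels = 0;
        int[] a = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = n - i;
        }
        sort(a, 1);
        return new long[] {levels, copies};
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 2, 3, 5, 8, 1000}) {
            long[] r = measure(n);
            System.out.printf("n=%d levels=%d copies=%d bound=%d%n",
                    n, r[0], r[1], (long) n * ceilLog2(n));
        }
    }
}
```

For n = 8 the run reports 3 levels and 24 merge copies, which meets the bound exactly; for lengths that are not powers of two, such as n = 5, the copy count stays below it.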

Other Versions of Merge Sort

There are variants of MergeSort that sort in place, but they are slower
◦ No additional space is required
◦ Complexity rises to O(n (log n)^2)
◦ Additional reading
  ◦ http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.22.5514&rep=rep1&type=pdf
  ◦ http://thomas.baudel.name/Visualisation/VisuTri/inplacestablesort.html
