Chapter 7 Sorting

Post on 06-Jan-2016


Chapter 7 Sorting

Instructors: C. Y. Tang and J. S. Roger Jang

All the material is integrated from the textbook "Fundamentals of Data Structures in C", with some supplements from the slides of Prof. Hsin-Hsi Chen (NTU).

Sequential Search

/* maximum size of list plus one */
#define MAX_SIZE 1000

typedef struct {
    int key;
    /* other fields */
} element;

element list[MAX_SIZE];

int seqsearch(element list[], int searchnum, int n)
{
    /* Search an array, list, that has n records. Return i if
       list[i].key == searchnum; return -1 if searchnum is not in the list.
       list must have room for one extra record at index n. */
    int i;
    list[n].key = searchnum;    /* sentinel that signals the end of the list */
    for (i = 0; list[i].key != searchnum; i++)
        ;
    return (i < n) ? i : -1;
}

Analysis of Seq Search

Example: 44, 55, 12, 42, 94, 18, 06, 67

Unsuccessful search: n+1 comparisons

The average number of comparisons for a successful search is

$$\sum_{i=0}^{n-1} \frac{i+1}{n} = \frac{n+1}{2}$$
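The average for a successful search can be checked with a short derivation. It assumes that each of the n keys is equally likely to be the search target (an assumption the slide leaves implicit) and that finding the key at position i costs i+1 comparisons:

```latex
\frac{1}{n}\sum_{i=0}^{n-1}(i+1)
  = \frac{1}{n}\cdot\frac{n(n+1)}{2}
  = \frac{n+1}{2}
```

For the 8-key example list this gives 4.5 comparisons on average.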

Binary Search

Input: sorted list (i.e., list[0].key ≤ list[1].key ≤ ... ≤ list[n-1].key)

Compare searchnum and list[middle].key, where middle = (n-1)/2. There are three possible outcomes:

searchnum < list[middle].key: search list[0] through list[middle-1]

searchnum = list[middle].key: the search terminates successfully

searchnum > list[middle].key: search list[middle+1] through list[n-1]

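Binary Search

Input: 4, 15, 17, 26, 30, 46, 48, 56, 58, 82, 90, 95

It is easy to see that binary search makes no more than O(log n) comparisons (i.e., the height of the decision tree).

Decision tree for binary search (each node is shown as key [index]; children are indented under their parent):

46 [5]
    17 [2]
        4 [0]
            15 [1] (right child)
        26 [3]
            30 [4] (right child)
    58 [8]
        48 [6]
            56 [7] (right child)
        90 [10]
            82 [9]
            95 [11]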

int binsearch(element list[], int searchnum, int n)
{
    /* COMPARE(x, y) yields -1, 0, or 1 for x < y, x == y, x > y */
    int left = 0, right = n - 1, middle;
    while (left <= right) {
        middle = (left + right) / 2;
        switch (COMPARE(list[middle].key, searchnum)) {
            case -1: left = middle + 1; break;
            case 0:  return middle;
            case 1:  right = middle - 1; break;
        }
    }
    return -1;
}
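As a small self-contained illustration of the same idea, the sketch below searches a plain sorted int array rather than an array of element records. The name binsearch_int is hypothetical and not part of the slides; the midpoint is computed as left + (right - left) / 2, which gives the same value as (left + right) / 2 but cannot overflow for large indices.

```c
/* Hypothetical helper, not from the slides: binary search over a plain
   sorted int array. Returns the index of searchnum, or -1 if absent. */
int binsearch_int(const int keys[], int n, int searchnum)
{
    int left = 0, right = n - 1;
    while (left <= right) {
        /* Same midpoint as (left + right) / 2, but cannot overflow. */
        int middle = left + (right - left) / 2;
        if (keys[middle] < searchnum)
            left = middle + 1;      /* continue in the right half */
        else if (keys[middle] > searchnum)
            right = middle - 1;     /* continue in the left half */
        else
            return middle;          /* found */
    }
    return -1;
}
```

On the 12-key input from the decision tree, searching for 46 succeeds on the first comparison, while an absent key such as 50 is rejected after four.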

Binary Search

Compare lists to verify that they are identical, or to identify the discrepancies. The problem of list verification is an instance of repeatedly searching one list, using each key in the other list as the search key.

Example: Internal Revenue Service (e.g., employee reports vs. employer reports)

Complexities:
random order: O(mn)
ordered list: O(tsort(n) + tsort(m) + m + n)

List Verification

/* compare two unordered lists list1 and list2 */
void verify1(element list1[], element list2[], int n, int m)
{
    int i, j;
    int marked[MAX_SIZE];
    for (i = 0; i < m; i++)
        marked[i] = FALSE;
    for (i = 0; i < n; i++)
        /* seqsearch takes (list, searchnum, n) */
        if ((j = seqsearch(list2, list1[i].key, m)) < 0)
            printf("%d is not in list 2\n", list1[i].key);
        else
            marked[j] = TRUE;   /* list2[j] has a match in list1 */
    for (i = 0; i < m; i++)
        if (!marked[i])
            printf("%d is not in list 1\n", list2[i].key);
}

Verifying using a sequential search

/* Same task as verify1, but list1 and list2 are sorted first */
void verify2(element list1[], element list2[], int n, int m)
{
    int i, j;
    sort(list1, n);
    sort(list2, m);
    i = j = 0;
    while (i < n && j < m)
        if (list1[i].key < list2[j].key) {
            printf("%d is not in list 2\n", list1[i].key);
            i++;
        }
        else if (list1[i].key == list2[j].key) {
            i++;
            j++;
        }
        else {
            printf("%d is not in list 1\n", list2[j].key);
            j++;
        }
    for (; i < n; i++)
        printf("%d is not in list 2\n", list1[i].key);
    for (; j < m; j++)
        printf("%d is not in list 1\n", list2[j].key);
}

Fast verification of two lists

Complexities:
verify1: O(mn) (randomly arranged lists)
verify2: O(tsort(n) + tsort(m) + m + n) (lists are sorted first)

Sorting Problem

Definitions: We are given a list of records (R0, R1, ..., Rn-1). Each record, Ri, has a key value, Ki. Find a permutation σ such that:

sorted: Kσ(i-1) ≤ Kσ(i), for 0 < i ≤ n-1.

stable: if i < j and Ki = Kj in the list, then Ri precedes Rj in the sorted list.

Criteria: # of key comparisons, # of data movements

Insertion sort

Insert a record Ri into a sequence of ordered records, R0, R1, ..., Ri-1 (K0 ≤ K1 ≤ ... ≤ Ki-1).

Time complexity: O(n^2)

Stable

The fastest sort when n is small (n ≤ 20)

Insertion sort

26 5 77 1 61 11 59 15 48 19

5 26 77 1 61 11 59 15 48 19

5 26 77 1 61 11 59 15 48 19

1 5 26 77 61 11 59 15 48 19

1 5 26 61 77 11 59 15 48 19

1 5 11 26 61 77 59 15 48 19

1 5 11 26 59 61 77 15 48 19

1 5 11 15 26 59 61 77 48 19

1 5 11 15 26 48 59 61 77 19

1 5 11 15 19 26 48 59 61 77

Insertion sort

/* perform an insertion sort on the list */
void insertion_sort(element list[], int n)
{
    int i, j;
    element next;
    for (i = 1; i < n; i++) {
        next = list[i];
        /* shift larger records one slot right to open a gap for next */
        for (j = i-1; j >= 0 && next.key < list[j].key; j--)
            list[j+1] = list[j];
        list[j+1] = next;
    }
}

Insertion sort

Worst case

i    [0] [1] [2] [3] [4]
-     5   4   3   2   1
1     4   5   3   2   1
2     3   4   5   2   1
3     2   3   4   5   1
4     1   2   3   4   5

$$O\left(\sum_{i=0}^{n-1} i\right) = O(n^2)$$

Insertion sort

Best case

i    [0] [1] [2] [3] [4]
-     2   3   4   5   1
1     2   3   4   5   1
2     2   3   4   5   1
3     2   3   4   5   1
4     1   2   3   4   5

O(n)

Only the last record is left out of order (LOO).

Insertion sort

Ri is LOO if Ri < max{Rj : 0 ≤ j < i}

k: # of records LOO

Computing time: O((k+1)n)

44 55 12* 42* 94 18* 06* 67*   (* marks the LOO records, so k = 5)
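The LOO count k can be computed in one pass by tracking the running maximum. The sketch below works on a plain int array; the name count_loo is hypothetical and not part of the slides.

```c
/* Hypothetical helper: count records that are left out of order (LOO).
   keys[i] is LOO if it is smaller than the largest key before it. */
int count_loo(const int keys[], int n)
{
    int i, k = 0;
    int max_so_far;
    if (n <= 0)
        return 0;
    max_so_far = keys[0];
    for (i = 1; i < n; i++) {
        if (keys[i] < max_so_far)
            k++;                    /* keys[i] must move left during insertion */
        else
            max_so_far = keys[i];   /* new running maximum, not LOO */
    }
    return k;
}
```

For the example list 44, 55, 12, 42, 94, 18, 06, 67 the function reports five LOO records (12, 42, 18, 06, 67); an already sorted list gives k = 0, which corresponds to the O(n) case.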

Variation

Binary insertion sort: sequential search --> binary search; reduces the # of comparisons, # of moves unchanged

List insertion sort: array --> linked list; sequential search, # of moves --> 0
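A minimal sketch of the binary insertion sort variation follows, on a plain int array. The function name is illustrative, not from the slides. The binary search finds the insertion point with O(log i) comparisons, but the shifting loop still performs the same number of moves as ordinary insertion sort.

```c
/* Sketch of binary insertion sort on a plain int array (names are
   illustrative, not from the slides). */
void binary_insertion_sort(int a[], int n)
{
    int i, j;
    for (i = 1; i < n; i++) {
        int next = a[i];
        int lo = 0, hi = i;          /* insertion point lies in [lo, hi] */
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;
            if (a[mid] <= next)
                lo = mid + 1;        /* go right of equal keys: keeps the sort stable */
            else
                hi = mid;
        }
        for (j = i; j > lo; j--)     /* shift a[lo..i-1] one slot right */
            a[j] = a[j - 1];
        a[lo] = next;
    }
}
```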

Quick Sort (C.A.R. Hoare)

Given (R0, R1, ..., Rn-1), pick a pivot key Ki. If Ki is placed in position s(i), then:

Kj ≤ Ks(i) for j < s(i)

Kj ≥ Ks(i) for j > s(i)

This leaves two partitions: R0, …, Rs(i)-1 and Rs(i)+1, …, Rn-1, with Rs(i) between them.

Quick sort example Given 10 records with keys (26, 5, 37, 1, 61, 11, 59, 15, 48, 19)

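Quick sort

void quicksort(element list[], int left, int right)
{
    /* Sort list[left..right] into nondecreasing order on the key field.
       Assumes list[right+1].key is no smaller than any key in
       list[left..right], so the scan with i cannot run past the end. */
    int pivot, i, j;
    element temp;
    if (left < right) {
        i = left;
        j = right + 1;
        pivot = list[left].key;
        do {
            /* scan from both ends, swapping out-of-order records
               until the two scans cross or meet */
            do i++; while (list[i].key < pivot);
            do j--; while (list[j].key > pivot);
            if (i < j) SWAP(list[i], list[j], temp);
        } while (i < j);
        SWAP(list[left], list[j], temp);
        quicksort(list, left, j-1);
        quicksort(list, j+1, right);
    }
}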

Analysis for quick sort

Assume that each time a record is positioned, the list is divided into two parts of roughly the same size.

Positioning a record in a list with n elements needs O(n) time. T(n) is the time taken to sort n elements.

T(n) ≤ cn + 2T(n/2), for some constant c
     ≤ cn + 2(cn/2 + 2T(n/4))
     ...
     ≤ cn log2 n + nT(1) = O(n log n)
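The elided steps of the expansion can be written out as follows, assuming n is a power of two so that the halving stops after log2 n levels. Each level contributes cn, and the n sublists of size one contribute nT(1):

```latex
\begin{aligned}
T(n) &\le cn + 2T(n/2) \\
     &\le cn + 2\left(\tfrac{cn}{2} + 2T(n/4)\right) = 2cn + 4T(n/4) \\
     &\le 3cn + 8T(n/8) \\
     &\;\;\vdots \\
     &\le cn\log_2 n + nT(1) = O(n \log n)
\end{aligned}
```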

Analysis for quick sort

Time complexity:
Worst case: O(n^2)
Best case: O(n log n)
Average case: O(n log n)

Space complexity:
Worst case: O(n)
Best case: O(log n)
Average case: O(log n)

Unstable

Variation

Our version of quick sort always picked the key of the first record in the current sublist as the pivot.

A better choice for this pivot is the median of the first, middle, and last keys in the current sublist.
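The median-of-three choice can be sketched as below on a plain int array. This is one possible arrangement, not the textbook's code: the three candidate keys are ordered in place, the median is moved to the front, and the partition then proceeds as in the quick sort above. The names quicksort_m3 and swap_int are illustrative. Because a[right] is at least the pivot after the ordering step, the left scan is already bounded; the extra i <= right test only makes that explicit.

```c
static void swap_int(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Sketch: quick sort with a median-of-three pivot on a plain int array.
   Function names are illustrative, not from the slides. */
void quicksort_m3(int a[], int left, int right)
{
    int middle, pivot, i, j;
    if (left >= right)
        return;
    middle = left + (right - left) / 2;
    /* Order a[left], a[middle], a[right] so that a[middle] holds the median. */
    if (a[middle] < a[left])   swap_int(&a[middle], &a[left]);
    if (a[right]  < a[left])   swap_int(&a[right],  &a[left]);
    if (a[right]  < a[middle]) swap_int(&a[right],  &a[middle]);
    /* Move the median to the front so it plays the role of list[left]. */
    swap_int(&a[left], &a[middle]);
    pivot = a[left];
    i = left;
    j = right + 1;
    for (;;) {
        do i++; while (i <= right && a[i] < pivot);
        do j--; while (a[j] > pivot);
        if (i >= j)
            break;
        swap_int(&a[i], &a[j]);
    }
    swap_int(&a[left], &a[j]);   /* pivot lands in its final position j */
    quicksort_m3(a, left, j - 1);
    quicksort_m3(a, j + 1, right);
}
```

A call such as quicksort_m3(a, 0, n - 1) sorts a[0..n-1]. On input that is already sorted, the median of three is the middle key, so the partitions stay balanced where the first-key pivot would degrade to O(n^2).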

Optimal sorting time

How quickly can we hope to sort a list of n objects? If we restrict our question to algorithms that permit only the comparison and interchange of keys, then the best possible time is O(n log2 n).

Decision tree for insertion sort

[Figure: decision tree for insertion sort on three keys K0, K1, K2. Each internal node is a comparison (K0 ≤ K1, K0 ≤ K2, or K1 ≤ K2) with Yes and No branches; each of the six leaves ("stop") is labeled with the permutation that sorts the input, such as [0,1,2], [1,0,2], [1,2,0], or [2,1,0].]

Optimal sorting time

Theorem 7.1: Any decision tree that sorts n distinct elements has a height of at least log2(n!) + 1.

Corollary: Any algorithm that sorts by comparisons only must have a worst case computing time of Ω(n log2 n).
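One way to see why the corollary follows from the theorem is the standard bound below: the largest n/2 factors of n! are each at least n/2, so

```latex
n! \ge \left(\tfrac{n}{2}\right)^{n/2}
\quad\Longrightarrow\quad
\log_2(n!) \ge \tfrac{n}{2}\log_2\tfrac{n}{2} = \Omega(n \log_2 n)
```

The theorem itself comes from counting leaves: a decision tree must have at least n! leaves, one per possible ordering, and a binary tree with that many leaves cannot be shallower than log2(n!) levels of comparisons.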

Merge sort

How do we merge two sorted lists (list[i], ..., list[m] and list[m+1], ..., list[n]) into a single sorted list (sorted[i], ..., sorted[n])?

There are two algorithms to do this: one needs O(n) additional space, the other requires only O(1).

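Merge sort (O(n) space)

void merge(element list[], element sorted[], int i, int m, int n)
{
    /* merge list[i..m] and list[m+1..n] into sorted[i..n] */
    int j, k, t;
    j = m + 1;      /* index for the second sublist */
    k = i;          /* index for the sorted list */
    while (i <= m && j <= n) {
        if (list[i].key <= list[j].key)
            sorted[k++] = list[i++];
        else
            sorted[k++] = list[j++];
    }
    if (i > m)
        for (t = j; t <= n; t++)    /* copy the rest of the second sublist */
            sorted[k+t-j] = list[t];
    else
        for (t = i; t <= m; t++)    /* copy the rest of the first sublist */
            sorted[k+t-i] = list[t];
}

Analysis

Time complexity: O(n)
Space complexity: O(n)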

*Figure 7.5: First eight lines for O(1) space merge example (p. 337)

Step labels in the figure: file 1 (sorted), file 2 (sorted); preprocessing; discover the √n largest keys; exchange; sort; exchange; sort by the rightmost records of the √n - 1 blocks; compare and exchange (repeated).

0 1 2 y w z u x 4 6 8 a v 3 5 7 9 b c e g i j k d f h o p q l m n r s t

0 1 2 3 w z u x 4 6 8 a v y 5 7 9 b c e g i j k d f h o p q l m n r s t

0 1 2 3 4 z u x w 6 8 a v y 5 7 9 b c e g i j k d f h o p q l m n r s t

0 1 2 3 4 5 u x w 6 8 a v y z 7 9 b c e g i j k d f h o p q l m n r s t

*Figure 7.6:Last eight lines for O(1) space merge example(p.338)

6, 7, 8 are merged

Segment one is merged (i.e., 0, 2, 4, 6, 8, a)

Change place marker (longest sorted sequence of records)

Segment one is merged (i.e., b, c, e, g, i, j, k)

Change place marker

Segment one is merged (i.e., o, p, q)

No other segment. Sort the largest keys.

O(1) Space merge sort

Iterative Merge Sort

We assume that the input sequence has n sorted lists each of length 1.

We merge these lists pairwise to obtain n/2 lists of size 2.

We then merge the n/2 lists pairwise, and so on, until a single list remains.

Iterative Merge Sort example

merge_pass function

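void merge_pass(element list[], element sorted[], int n, int length)
{
    /* merge adjacent segments of size length from list into sorted */
    int i, j;
    for (i = 0; i < n - 2*length; i += 2*length)
        merge(list, sorted, i, i + length - 1, i + 2*length - 1);
    if (i + length < n)
        /* one complete segment and one partial segment remain */
        merge(list, sorted, i, i + length - 1, n - 1);
    else
        /* only one segment remains: copy it */
        for (j = i; j < n; j++)
            sorted[j] = list[j];
}

Each merge call combines list[i..i+length-1] with list[i+length..i+2*length-1], a span of 2*length records.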

merge_sort function

void merge_sort(element list[], int n)
{
    int length = 1;     /* current length being merged */
    element extra[MAX_SIZE];
    while (length < n) {
        merge_pass(list, extra, n, length);
        length *= 2;
        merge_pass(extra, list, n, length);
        length *= 2;
    }
}

Segment lengths per pass: l, l, l, l, ... then 2l, 2l, ... then 4l, ...

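[Figure: sublist partitioning for recursive merge sort. With positions numbered 1 to 10, the list is split at (1+10)/2, and the two halves at (1+5)/2 and (6+10)/2. Sublists that are not merged at a level are copied.]

26 5 77 1 61 11 59 15 48 19

5 26 | 11 59   (first merges; the other records are copied)

5 26 77 | 1 61 | 11 15 59 | 19 48

1 5 26 61 77 | 11 15 19 48 59

1 5 11 15 19 26 48 59 61 77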

Data Structure: array (copy subfiles) vs. linked list (no copy)

Recursive Formulation of Merge Sort

i R0 R1 R2 R3 R4 R5 R6 R7 R8 R9

key 26 5 77 1 61 11 59 15 48 19

link 8 5 -1 1 2 7 4 9 6 0

start=3

Simulation of merge sort

int rmerge(element list[], int lower, int upper)
{
    /* Sort list[lower..upper] and return the index of the first record of
       the sorted chain. The link fields are assumed to be -1 initially. */
    int middle;
    if (lower >= upper)
        return lower;
    else {
        middle = (lower + upper) / 2;
        return listmerge(list,
                         rmerge(list, lower, middle),
                         rmerge(list, middle + 1, upper));
    }
}

Recursive Merge Sort

List Merge

int listmerge(element list[], int first, int second)
{
    /* Merge the chains that start at first and second. n is assumed to be
       a global record count; list[n] serves as a dummy header. */
    int start = n;
    while (first != -1 && second != -1) {
        if (list[first].key <= list[second].key) {
            /* key in first list is lower, link this element to start and
               change start to point to first */
            list[start].link = first;
            start = first;
            first = list[first].link;
        }
        else {
            /* key in second list is lower, link this element into the
               partially sorted list */
            list[start].link = second;
            start = second;
            second = list[second].link;
        }
    }
    if (first == -1)
        list[start].link = second;   /* first is exhausted */
    else
        list[start].link = first;    /* second is exhausted */
    return list[n].link;
}

Time complexity of recursive merge sort: O(n log2 n)

Natural Merge Sort

26 | 5 77 | 1 61 | 11 59 | 15 48 | 19   (initial runs)

5 26 77 | 1 11 59 61 | 15 19 48

1 5 11 26 59 61 77 | 15 19 48

1 5 11 15 19 26 48 59 61 77
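Natural merge sort starts from the runs that already exist in the input instead of runs of length 1. A minimal sketch of the run detection step, on a plain int array, is shown below; the name count_runs is hypothetical and not part of the slides.

```c
/* Hypothetical helper: count the maximal nondecreasing runs in keys[0..n-1].
   A natural merge sort starts from these runs instead of runs of length 1. */
int count_runs(const int keys[], int n)
{
    int i, runs;
    if (n <= 0)
        return 0;
    runs = 1;
    for (i = 1; i < n; i++)
        if (keys[i] < keys[i - 1])
            runs++;                 /* a descent ends the current run */
    return runs;
}
```

For the example, the input has 6 runs, the next line has 3, then 2, then 1, so each merge pass roughly halves the number of runs.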

Heap Sort

The heap sort algorithm will require only a fixed amount of additional storage and at the same time will have as its worst case and average computing time O(n log n).

While heap sort is slightly slower than merge sort using O(n) additional space, it is faster than merge sort using O(1) additional space.

Heap Sort Example

The trees in the figures are listed here as arrays list[1..10]; the children of node i are nodes 2i and 2i+1. A bar separates the heap from the records already in their final positions.

Array interpreted as a binary tree:
26 5 77 1 61 11 59 15 48 19

Max heap following first for loop of heapsort (initial heap):
77 61 59 48 19 11 26 15 1 5

After each exchange of the root with the last heap record, followed by adjust:

heap size 9: 61 48 59 15 19 11 26 5 1 | 77
heap size 8: 59 48 26 15 19 11 1 5 | 61 77
heap size 7: 48 19 26 15 5 11 1 | 59 61 77
heap size 6: 26 19 11 15 5 1 | 48 59 61 77

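Heap Sort

void adjust(element list[], int root, int n)
{
    /* Adjust the binary tree rooted at root to establish the max heap.
       The children of node i are nodes 2i and 2i+1. */
    int child, rootkey;
    element temp;
    temp = list[root];
    rootkey = list[root].key;
    child = 2 * root;                       /* left child */
    while (child <= n) {
        if ((child < n) && (list[child].key < list[child+1].key))
            child++;                        /* pick the larger child */
        if (rootkey > list[child].key)
            break;
        else {
            list[child/2] = list[child];    /* move the child up to its parent */
            child *= 2;
        }
    }
    list[child/2] = temp;
}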

Heap Sort

void heapsort(element list[], int n)
{
    /* sort list[1..n] into ascending order using a max heap */
    int i;
    element temp;
    for (i = n/2; i > 0; i--)       /* build the initial heap bottom-up */
        adjust(list, i, n);
    for (i = n-1; i > 0; i--) {     /* n-1 cycles, each adjusting top-down */
        SWAP(list[1], list[i+1], temp);
        adjust(list, 1, i);
    }
}
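The slides index the heap from 1, which leaves list[0] unused. As a sketch of how the same algorithm looks with ordinary 0-based C arrays, the version below sorts a plain int array; the names sift_down and heapsort_int are illustrative, not from the slides. The only real change is the index arithmetic: with 0-based indexing the children of node i are 2i+1 and 2i+2, and the parent of node c is (c-1)/2.

```c
/* Sketch: heap sort on a 0-based int array. With 0-based indexing the
   children of node i are 2i+1 and 2i+2 (the slides use 1-based indexing,
   where they are 2i and 2i+1). Names are illustrative. */
static void sift_down(int a[], int root, int n)
{
    int temp = a[root];
    int child = 2 * root + 1;
    while (child < n) {
        if (child + 1 < n && a[child] < a[child + 1])
            child++;                       /* pick the larger child */
        if (temp >= a[child])
            break;                         /* heap property restored */
        a[(child - 1) / 2] = a[child];     /* move child up one level */
        child = 2 * child + 1;
    }
    a[(child - 1) / 2] = temp;             /* drop the saved key into the hole */
}

void heapsort_int(int a[], int n)
{
    int i;
    for (i = n / 2 - 1; i >= 0; i--)       /* build the max heap bottom-up */
        sift_down(a, i, n);
    for (i = n - 1; i > 0; i--) {          /* n-1 extraction steps */
        int t = a[0]; a[0] = a[i]; a[i] = t;
        sift_down(a, 0, i);
    }
}
```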

Radix Sort

We now examine the problem of sorting records that have several keys.

These keys are labeled K0, K1, ..., Kr-1, with K0 being the most significant key and Kr-1 the least.

R0, ..., Rn-1 is lexically sorted with respect to the keys K0, K1, ..., Kr-1 iff

$$(k_i^0, k_i^1, \ldots, k_i^{r-1}) \le (k_{i+1}^0, k_{i+1}^1, \ldots, k_{i+1}^{r-1}), \quad 0 \le i < n-1$$

Most significant digit first (MSD): sort on K0, then K1, ...

Least significant digit first (LSD): sort on Kr-1, then Kr-2, ...

Arrangement of cards after first pass of an MSD sort

Suits: ♣ < ♦ < ♥ < ♠

Face values: 2 < 3 < 4 < … < J < Q < K < A

(1) MSD sort first, e.g., bin sort with four bins; LSD sort second, e.g., insertion sort

(2) LSD sort first, e.g., bin sort with 13 bins (2, 3, 4, …, 10, J, Q, K, A); MSD sort second, e.g., bin sort with four bins

Arrangement of cards after first pass of an LSD sort

LSD radix sort example


LSD Radix Sort

In an LSD radix-r sort, the records R0, R1, ..., Rn-1 have keys that are d-tuples (x0, x1, ..., xd-1), with 0 ≤ xi < r.

#define MAX_DIGIT 3
#define RADIX_SIZE 10
typedef struct list_node *list_pointer;
typedef struct list_node {
    int key[MAX_DIGIT];
    list_pointer link;
} list_node;

LSD Radix Sort

list_pointer radix_sort(list_pointer ptr)
{
    /* radix sort using a linked list */
    list_pointer front[RADIX_SIZE], rear[RADIX_SIZE];
    int i, j, digit;
    for (i = MAX_DIGIT - 1; i >= 0; i--) {
        /* initialize bins to be empty queues: O(r) */
        for (j = 0; j < RADIX_SIZE; j++)
            front[j] = rear[j] = NULL;
        /* put records into queues: O(n) */
        while (ptr) {
            digit = ptr->key[i];
            if (!front[digit])
                front[digit] = ptr;
            else
                rear[digit]->link = ptr;
            rear[digit] = ptr;
            ptr = ptr->link;    /* get next record */
        }
        /* reestablish the linked list for the next pass: O(r) */
        ptr = NULL;
        for (j = RADIX_SIZE - 1; j >= 0; j--)
            if (front[j]) {
                rear[j]->link = ptr;
                ptr = front[j];
            }
    }
    return ptr;
}

Time complexity: O(d(n+r)) for d digits, n records, and radix r
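For comparison, here is a sketch of an array-based LSD radix sort on non-negative ints with radix 10. It is not the slides' method: instead of linked queues it uses one stable counting pass per digit, which needs an O(n) scratch array. The function name and the digits parameter are assumptions made for this example, and digits is assumed to be at most 9 so that the divisor fits in an int.

```c
#include <stdlib.h>
#include <string.h>

/* Sketch: array-based LSD radix sort for non-negative ints, radix 10.
   Uses a stable counting pass per digit instead of the linked queues in
   the slides. Returns 0 on success, -1 if the scratch array cannot be
   allocated. Names and the digits parameter are illustrative. */
int radix_sort_int(int a[], int n, int digits)
{
    int *out, d, i, divisor = 1;
    int count[10];
    if (n <= 0)
        return 0;
    out = malloc((size_t)n * sizeof *out);
    if (out == NULL)
        return -1;
    for (d = 0; d < digits; d++) {
        memset(count, 0, sizeof count);
        for (i = 0; i < n; i++)
            count[(a[i] / divisor) % 10]++;      /* bin sizes */
        for (i = 1; i < 10; i++)
            count[i] += count[i - 1];            /* count[b] = end of bin b */
        for (i = n - 1; i >= 0; i--)             /* backwards keeps the pass stable */
            out[--count[(a[i] / divisor) % 10]] = a[i];
        memcpy(a, out, (size_t)n * sizeof *out);
        if (d + 1 < digits)
            divisor *= 10;                       /* move to the next digit */
    }
    free(out);
    return 0;
}
```

Each pass costs O(n + r), so d passes cost O(d(n + r)), the same bound as the linked version.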

List and Table Sort

Most sorting methods require excessive data movement since we must physically move records following some comparisons.

If the records are large, this slows down the sorting process.

We can reduce data movement by using a linked list representation.

List and Table Sort

However, in some applications we must physically rearrange the records so that they are in the required order.

We can achieve considerable savings by first performing a linked list sort and then physically rearranging the records according to the order specified in the list.

List and Table Sort

In this section, we examine three methods for rearranging the lists into sorted order.

list_sort1 and list_sort2 require a linked list representation.

table_sort uses an auxiliary table that indirectly references the list's records.

In Place Sorting

i      R0  R1  R2  R3  R4  R5  R6  R7  R8  R9
key    26   5  77   1  61  11  59  15  48  19
link    8   5  -1   1   2   7   4   9   6   0

After exchanging R0 <--> R3:

key     1   5  77  26  61  11  59  15  48  19
link    1   5  -1   8   2   7   4   9   6   3

How do we know where the predecessor is? Its link must be modified (here the link of R9 changes from 0 to 3).

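list_sort 1

void list_sort1(element list[], int n, int start)
{
    /* start points to the record with the smallest key; each element
       has two link fields, link (forward) and linkb (backward) */
    int i, last, current;
    element temp;
    /* convert the chain into a doubly linked list using linkb */
    last = -1;
    for (current = start; current != -1; current = list[current].link) {
        list[current].linkb = last;
        last = current;
    }
    for (i = 0; i < n-1; i++) {
        /* move list[start] to position i while maintaining the list */
        if (start != i) {
            if (list[i].link + 1)
                list[list[i].link].linkb = start;
            list[list[i].linkb].link = start;
            SWAP(list[start], list[i], temp);
        }
        start = list[i].link;
    }
}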

Example of list_sort 1


list_sort 1

list_sort 1 requires converting the singly linked list into a doubly linked one.

However, we can avoid this by introducing list_sort 2.

list_sort 2

/* list sort with only one link field */
void list_sort2(element list[], int n, int start)
{
    int i, next;
    element temp;
    for (i = 0; i < n-1; i++) {
        /* records before position i are final; follow the forwarding links */
        while (start < i)
            start = list[start].link;
        next = list[start].link;    /* save index of next largest key */
        if (start != i) {
            SWAP(list[i], list[start], temp);
            list[i].link = start;   /* record where the old list[i] went */
        }
        start = next;
    }
}

Example of list_sort 2


Table sort

In the previous section, we used a linked list to represent the sorted order.

Now we use an auxiliary one-dimensional table t to represent it (i.e., R[t[i]] is the record with the ith smallest key).

How do we physically rearrange the records according to this table?

table_sort

void table_sort(element list[], int n, int table[])
{
    /* rearrange list[0],...,list[n-1] to correspond to the sequence
       list[table[0]],...,list[table[n-1]] */
    int i, current, next;
    element temp;
    for (i = 0; i < n-1; i++)
        if (table[i] != i) {
            /* nontrivial cycle starting at i */
            temp = list[i];
            current = i;
            do {
                next = table[current];
                list[current] = list[next];
                table[current] = current;
                current = next;
            } while (table[current] != i);
            list[current] = temp;
            table[current] = current;
        }
}

table_sort

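Before sort:

i          0   1   2   3   4
record    R0  R1  R2  R3  R4
key       50   9  11   8   3
table T    0   1   2   3   4

After sort (data are not moved):

table T    4   3   1   2   0

The rank-0 record is at position 4, the rank-1 record is at position 3, …, the rank-4 record is at position 0.

The table (4,3,1,2,0) is a permutation of (0,1,2,3,4); a linked list sort represents the same order with links instead.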

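Every permutation is made of disjoint cycles.

Example

i        R0  R1  R2  R3  R4  R5  R6  R7
key      35  14  12  42  26  50  31  18
table     2   1   7   4   6   0   3   5

Two nontrivial cycles:
R0 -> R2 -> R7 -> R5 -> R0 (table entries 2, 7, 5, 0)
R3 -> R4 -> R6 -> R3 (table entries 4, 6, 3)

Trivial cycle:
R1 -> R1 (table entry 1)

table_sort

(1) Cycle R0, R2, R7, R5: temp = 35 (from position 0); positions 0, 2, 7, 5 receive keys 12, 18, 50, 35.

(2) i = 1, 2: t[i] = i, so nothing is moved.

(3) Cycle R3, R4, R6: temp = 42 (from position 3); positions 3, 4, 6 receive keys 26, 31, 42.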
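The cycle structure of a table can be inspected with a short helper. The sketch below counts the nontrivial cycles of a permutation; the function name and the caller-provided visited array are assumptions made for this example and do not appear in the slides.

```c
/* Hypothetical helper: count the nontrivial cycles (length > 1) of the
   permutation stored in table[0..n-1]. visited[] is caller-provided
   scratch space of n ints. */
int count_nontrivial_cycles(const int table[], int n, int visited[])
{
    int i, j, cycles = 0;
    for (i = 0; i < n; i++)
        visited[i] = 0;
    for (i = 0; i < n; i++) {
        if (visited[i] || table[i] == i)
            continue;               /* already seen, or a trivial cycle */
        for (j = i; !visited[j]; j = table[j])
            visited[j] = 1;         /* walk the cycle that starts at i */
        cycles++;
    }
    return cycles;
}
```

For the 8-record example the result is 2, matching the two nontrivial cycles listed above. table_sort saves one record in temp per nontrivial cycle, so this count is also the number of times temp is used.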

Summary of internal sorting

Of the several sorting methods we have studied no one method is best. Some methods are good for small n, others for large n.

An insertion sort works well when the list is already partially ordered. It is also the best sorting method for small n.

Merge sort has the best worst case behavior, but it requires more storage than a heap sort, and has slightly more overhead than quick sort.

Quick sort has the best average behavior, but its worst case behavior is O(n^2).

The behavior of radix sort depends on the size of the keys and the choice of the radix.

Practical Considerations for Internal Sorting

Data movement slows down the sorting process.

Insertion sort and merge sort --> use a linked file.

Perform a linked list sort, then rearrange the records.

Complexity of Sort

Sort            stability  space      best        average     worst
Bubble Sort     stable     little     O(n)        O(n^2)      O(n^2)
Insertion Sort  stable     little     O(n)        O(n^2)      O(n^2)
Quick Sort      unstable   O(log n)   O(n log n)  O(n log n)  O(n^2)
Merge Sort      stable     O(n)       O(n log n)  O(n log n)  O(n log n)
Heap Sort       unstable   little     O(n log n)  O(n log n)  O(n log n)
Radix Sort      stable     O(np)      O(n log n)  O(n log n)  O(n log n)
List Sort       ?          O(n)       O(1)        O(n)        O(n)
Table Sort      ?          O(n)       O(1)        O(n)        O(n)

(best, average, and worst are time complexities)

Comparison

n < 20: insertion sort

20 ≤ n < 45: quick sort

n ≥ 45: merge sort

Hybrid methods: merge sort + quick sort, merge sort + insertion sort

External Sort

External sorting is used to sort large files that cannot be contained in internal memory.

The following overheads apply:
Seek time
Latency time
Transmission time

The most popular method for sorting on external storage devices is merge sort.
phase 1: Segment the input file and sort the segments (runs)
phase 2: Merge the runs

File: 4500 records, A1, …, A4500
Internal memory: 750 records (3 blocks)
Block length: 250 records
Input disk vs. scratch pad (disk)

(1) Sort three blocks at a time and write them out onto the scratch pad.
(2) Use the three blocks as two input buffers and one output buffer.

Merging the runs:
6 runs of 750 records
3 runs of 1500 records
1 run of 3000 records and 1 run of 1500 records
1 run of 4500 records

2 2/3 passes

Time Complexity of External Sort

Input/output time:
ts = maximum seek time
tl = maximum latency time
trw = time to read/write one block of 250 records
tIO = ts + tl + trw

CPU processing time:
tIS = time to internally sort 750 records
n tm = time to merge n records from the input buffers to the output buffer

Time Complexity of External Sort

Operation: time

(1) read 18 blocks of input (18 tIO), internally sort (6 tIS), write 18 blocks (18 tIO): 36 tIO + 6 tIS

(2) merge runs 1-6 in pairs: 36 tIO + 4500 tm

(3) merge two runs of 1500 records each, 12 blocks: 24 tIO + 3000 tm

(4) merge one run of 3000 records with one run of 1500 records: 36 tIO + 4500 tm

Total time: 132 tIO + 12000 tm + 6 tIS

Consider Parallelism

Carry out the CPU operations and the I/O operations in parallel.

132 tIO = 12000 tm + 6 tIS Two disks: 132 tIO is reduced to 66 tIO

k-way Merging

1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4

A 4-way merge on 16 runs

2 passes (4-way) vs. 4 passes (2-way)

k-way Merging

A k-way merge on m runs requires at most ⌈log_k m⌉ passes over the data.

To begin with, k runs of size S1, S2, ..., Sk can no longer be merged internally in $O\left(\sum_{i=1}^{k} S_i\right)$ time.

The most direct way to merge k runs would be to make k-1 comparisons to determine the next record to output.

Hence, the total number of key comparisons being made is n(k-1) log_k m.

We can achieve a significant reduction in the number of comparisons by using a loser tree with k leaves (Chapter 5).
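The pass count can be computed without floating-point logarithms by simulating the passes: each pass replaces every group of up to k runs with one run. The helper name merge_passes below is hypothetical, and k is assumed to be at least 2.

```c
/* Hypothetical helper: number of passes a k-way merge needs to reduce
   m runs to one, i.e. the smallest p with k^p >= m. Assumes k >= 2. */
int merge_passes(long m, int k)
{
    int passes = 0;
    long runs = m;
    while (runs > 1) {
        runs = (runs + k - 1) / k;   /* each pass merges groups of up to k runs */
        passes++;
    }
    return passes;
}
```

For 16 runs this gives 2 passes with a 4-way merge and 4 passes with a 2-way merge, matching the comparison above.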

Optimal Merging Of Runs

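[Figure: two merge trees for four runs of lengths 2, 4, 5, and 15.]

Weighted external path length:
first tree (skewed): 2*3 + 4*3 + 5*2 + 15*1 = 43
second tree (balanced): 2*2 + 4*2 + 5*2 + 15*2 = 52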

Optimal Merging Of Runs

When we merge by using the first merge tree, we merge some records only once, while others may be merged up to three times.

In the second merge tree, we merge each record exactly twice.

We can construct a Huffman tree to solve this problem.

Construction of a Huffman tree

Runs of length: 2, 3, 5, 7, 9, 13

Step 1: merge 2 and 3 into 5
Step 2: merge 5 and 5 into 10
Step 3: merge 7 and 9 into 16
Step 4: merge 10 and 13 into 23
Step 5: merge 16 and 23 into 39 (root)
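The cost of merging in Huffman order can be computed with a short loop that always merges the two shortest remaining runs. The sketch below is illustrative only: the function name is an assumption, the input array is overwritten, and the simple quadratic selection stands in for the min heap one would use with many runs.

```c
/* Sketch: total number of record moves when runs are merged two at a
   time in Huffman order (always merge the two shortest runs). This equals
   the weighted external path length of the merge tree. len[] is
   overwritten. Names are illustrative. */
long huffman_merge_cost(long len[], int n)
{
    long cost = 0;
    while (n > 1) {
        int i, a = 0, b = 1, t;
        if (len[b] < len[a]) { t = a; a = b; b = t; }
        /* find the indices a, b of the shortest and second shortest runs */
        for (i = 2; i < n; i++) {
            if (len[i] < len[a]) { b = a; a = i; }
            else if (len[i] < len[b]) { b = i; }
        }
        len[a] += len[b];            /* merged run replaces run a */
        cost += len[a];              /* every record in it is moved once */
        len[b] = len[n - 1];         /* delete run b by moving the last run in */
        n--;
    }
    return cost;
}
```

For runs of length 2, 3, 5, 7, 9, 13 the merged sizes are 5, 10, 16, 23, and 39, for a total cost of 93. For runs of length 2, 4, 5, 15 the result is 43, the external path length of the first merge tree shown earlier.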