CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

50
CS235102 CS235102 Data Structures Data Structures Chapter 7 Sorting Chapter 7 Sorting (Concentrating on (Concentrating on Internal Sorting Internal Sorting ) )
  • date post

    15-Jan-2016
  • Category

    Documents

  • view

    231
  • download

    6

Transcript of CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Page 1: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

CS235102 CS235102 Data StructuresData Structures

Chapter 7 SortingChapter 7 Sorting(Concentrating on (Concentrating on Internal Internal

SortingSorting))

Page 2: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Chapter 7 Sorting: OutlineChapter 7 Sorting: Outline IntroductionIntroduction

Searching and List VerificationSearching and List Verification DefinitionsDefinitions

Insertion SortInsertion Sort Quick SortQuick Sort Merge SortMerge Sort

MergeMerge Iterative / Recursive Merge SortIterative / Recursive Merge Sort

Heap SortHeap Sort Radix SortRadix Sort Summary of Internal SortingSummary of Internal Sorting

Page 3: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Introduction (1/9)Introduction (1/9)

Why efficient sorting methods are so important ? Why efficient sorting methods are so important ? The efficiency of a searching strategy depends The efficiency of a searching strategy depends

on the assumptions we make about the on the assumptions we make about the arrangement of records in the listarrangement of records in the list

No single sorting technique is the “best” for all No single sorting technique is the “best” for all initial orderings and sizes of the list being sorted.initial orderings and sizes of the list being sorted.

We examine several techniques, indicating when We examine several techniques, indicating when one is superior to the others.one is superior to the others.

Page 4: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Introduction (2/9)Introduction (2/9) Sequential searchSequential search

We search the list by examining the key valueWe search the list by examining the key values s listlist[[00]].key, … , list.key, … , list[[n-1n-1]].key..key. Example: Example: ListList has has nn records. records.

4, 15, 17, 26, 30, 46, 48, 56, 58, 82, 90, 954, 15, 17, 26, 30, 46, 48, 56, 58, 82, 90, 95

In that order, until the correct record is located,In that order, until the correct record is located, or we have examined all the records in the lis or we have examined all the records in the listt

Unsuccessful search: Unsuccessful search: nn+1 +1 O( O(nn)) Average successful searchAverage successful search

)(O2/)1(/)1(1

0

nnnin

i

Page 5: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Introduction (3/9)Introduction (3/9) Binary searchBinary search Binary search assumes that the list is ordered on the key Binary search assumes that the list is ordered on the key

field such that field such that listlist[0].[0].keykey listlist[1]. [1]. keykey … … listlist[[nn-1]. -1]. keykey.. This search begins by comparing This search begins by comparing searchnumsearchnum (search ke (search ke

y) and y) and listlist[[middlemiddle].].keykey where where middlemiddle=(=(nn-1)/2-1)/2

56

[7]

17

[2]

58

[8]

26

[3]

4

[0]

48

[6]

90

[10]

15

[1]

30

[4]

46

[5]

82

[9]

95

[11]Figure 7.1:Figure 7.1:Decision tree for Decision tree for

binary search (p.323)binary search (p.323)

4, 15, 17, 26, 30, 46, 48, 56, 58, 82, 90, 95

Page 6: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Introduction (4/9)Introduction (4/9) Binary search (cont’d)Binary search (cont’d)

Analysis of Analysis of binsearchbinsearch: : makes no more than O(log makes no more than O(log nn) comparisons) comparisons

Page 7: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Introduction (5/9)Introduction (5/9) List VerificationList Verification

Compare lists to verify that they are identical or Compare lists to verify that they are identical or identify the discrepancies.identify the discrepancies.

ExampleExample International Revenue Service (IRS) International Revenue Service (IRS)

(e.g., employee vs. employer)(e.g., employee vs. employer)

Reports three types of errors:Reports three types of errors: all records found in all records found in listlist1 but not in 1 but not in listlist22 all records found in all records found in listlist2 but not in 2 but not in listlist11 all records that are in all records that are in listlist1 and 1 and listlist2 with the same key 2 with the same key

but have different values for different fieldsbut have different values for different fields

Page 8: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Introduction (6/9)Introduction (6/9) Verifying using a sequential searchVerifying using a sequential search

Check whether the elements in list1 are also in list2

And the elements in list2 but not in list1 would be show up

Page 9: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Introduction (7/9)Introduction (7/9) Fast Fast

verification verification of two listsof two lists

The remainder elements of a list is not a member of another list

The element of two lists are matched

The element of list1 is not in list2

The element of list2 is not in list1

Page 10: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Introduction (8/9)Introduction (8/9) ComplexitiesComplexities Assume the two lists are randomly arrangedAssume the two lists are randomly arranged Verify1: O(Verify1: O(mnmn)) Verify2: sorts them before verificationVerify2: sorts them before verification

O(O(tsorttsort((nn) + ) + tsorttsort((mm) + ) + mm + + nn) ) O( O(maxmax[[nnloglogn, mn, mloglogmm])]) tsorttsort((nn): the time needed to sort the ): the time needed to sort the nn records in records in listlist11 tsorttsort((mm): the time needed to sort the ): the time needed to sort the mm records in records in listlist22 we will show it is possible to sort we will show it is possible to sort nn records in O( records in O(nnloglognn) time) time

DefinitionDefinition Given (Given (RR00, , RR11, …, , …, RRnn-1-1), each ), each RRii has a key value has a key value KKii

find a permutation find a permutation , such that , such that KK((ii-1)-1) KK((ii)), 0<, 0<i i nn-1-1

denotes an unique permutationdenotes an unique permutation Sorted: Sorted: KK((ii-1)-1) KK((ii)), 0<, 0<ii<<nn-1-1

Stable: if Stable: if ii < < jj and and KKii = = KKjj then then RRii precedes precedes RRjj in the sort in the sort

ed listed list

Page 11: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Introduction (9/9)Introduction (9/9) Two important applications of sorting:Two important applications of sorting:

An aid to searchAn aid to search Matching entries in listsMatching entries in lists

Internal sortInternal sort The list is small enough to sort entirely in main The list is small enough to sort entirely in main

memorymemory

External sortExternal sort There is too much information to fit into main There is too much information to fit into main

memorymemory

Page 12: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Insertion Sort (1/3)Insertion Sort (1/3)

Concept:Concept: The basic step in this method is to insert a record The basic step in this method is to insert a record RR in in

to a sequence of ordered records, to a sequence of ordered records, RR11, , RR22, …, , …, RRii ( (KK11

KK22 , …, , …, KKii) in such a way that the resulting sequen) in such a way that the resulting sequen

ce of size ce of size ii is also ordered is also ordered

VariationVariation Binary insertion sortBinary insertion sort

reduce search timereduce search time

List insertion sortList insertion sort reduce insert timereduce insert time

Page 13: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Insertion Sort (2/3)Insertion Sort (2/3) Insertion sort programInsertion sort program

i = 1234

[0]

[1]

[2]

[3]

[4]

list

5 2 3 1 4 next = 2

5 2 3

5 3 1

5 3 2 1 4

5 4

Page 14: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Insertion Sort (3/3)Insertion Sort (3/3) Analysis of InsertionSort:Analysis of InsertionSort:

If If kk is the number of records LOO, then the computing is the number of records LOO, then the computing time is O((time is O((kk+1)+1)nn))

The worst-case time is O(The worst-case time is O(nn22).). The average time is O(The average time is O(nn22).). The best time is O(The best time is O(nn).).

)(O)(O2

0

2

n

j

ni

left out of order (LOO)

O(n)

}{max LOOis 0

jij

ii RRiffR

Page 15: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Quick Sort (1/6)Quick Sort (1/6) The quick sort scheme developed by C. A. R. HoaThe quick sort scheme developed by C. A. R. Hoa

re has the re has the best average behaviorbest average behavior among all the so among all the sorting methods we shall be studyingrting methods we shall be studying

Given (Given (RR00, , RR11, …, , …, RRn-1n-1) and ) and KKii denote a pivot keydenote a pivot key

If If KKii is placed in position is placed in position ss((ii),),

then then KKjj KKss((ii)) for for jj < < ss((ii), ), KKjj KKss((ii)) for for jj > > ss((ii).).

After a positioning has been made, the original file After a positioning has been made, the original file is partitioned into two subfiles, is partitioned into two subfiles, {{RR00, …, , …, RRss((ii)-1)-1}}, , RRss((ii)), ,

{{RRss((ii)+1)+1, …, , …, RRss((nn-1)-1)}}, and they will be sorted independ, and they will be sorted independ

entlyently

Page 16: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

6159

112615

Quick Sort (2/6)Quick Sort (2/6) Quick Sort ConceptQuick Sort Concept

select a pivot keyselect a pivot key interchange the elements to their correct positions accinterchange the elements to their correct positions acc

ording to the pivotording to the pivot the original file is partitioned into two subfiles and they the original file is partitioned into two subfiles and they

will be sorted will be sorted independentlyindependently

R0 R1 R2 R3 R4 R5 R6 R7 R8 R9

26 5 37 1 61 11 59 15 48 1919 376111

191

15 193748

37 48

5 1 59 485 15 26 59 61 48 37

191 115 15 26 59 61 48 371 115 26 59 61 48 37

15 191 115 2615 191 115 26 6159

37 4815 191 115 26 6159

Page 17: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Quick Sort ProgramQuick Sort ProgramR0 R1 R2 R3 R4 R5 R6 R7 R8 R9

right

left

26 5 37 1 61 11 59 15 48 19

0

9

i: j:

pivot=2619 3715 6111 26

4

111 191 11

1

11

-11

10

4

3

1915 19

34

530

9

6

5937 6148 59

7

4837 48

67

86

9

960

Page 18: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Quick Sort (4/6)Quick Sort (4/6) Analysis for Quick SortAnalysis for Quick Sort Assume that each time a record is positioned, the list is divAssume that each time a record is positioned, the list is div

ided into the rough same size of two parts.ided into the rough same size of two parts. Position a list with Position a list with nn element needs element needs O(O(nn)) TT((nn) is the time taken to sort ) is the time taken to sort nn elements elements TT((nn)<=)<=cncn+2+2TT((nn/2) for some /2) for some cc

<= <=cncn+2(+2(cncn/2+2/2+2TT((nn/4))/4)) ... ... <= <=cncnloglog22nn++nTnT(1)=(1)=O(O(nnloglognn))

Time complexityTime complexity Average case and best case:Average case and best case: O( O(nnloglognn)) Worst case: Worst case: O(O(nn22)) Best internal sorting method considering the average caseBest internal sorting method considering the average case

UnstableUnstable

Page 19: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Quick Sort (5/6)Quick Sort (5/6)

Lemma 7.1:Lemma 7.1: Let Let TTavgavg((nn) be the expected time for quicksort to sort a fil) be the expected time for quicksort to sort a fil

e with e with nn records. Then there exists a constant records. Then there exists a constant kk such th such that at TTavgavg((nn) ) knknloglogeenn for for nn 2 2

Space for Quick SortSpace for Quick Sort The smaller of the two subarrays is always stored first, tThe smaller of the two subarrays is always stored first, t

he maximum stack space is less thanhe maximum stack space is less than 2log 2lognn

Stack space complexity: Stack space complexity: Average case and best case:Average case and best case: O(log O(lognn)) Worst case: Worst case: O(O(nn))

Page 20: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Quick Sort (6/6)Quick Sort (6/6)

Quick Sort VariationsQuick Sort Variations Quick sort using a median of three: Pick the median oQuick sort using a median of three: Pick the median o

f the first, middle, and last keys in the current sublist af the first, middle, and last keys in the current sublist as the pivot. Thus, pivot = median{s the pivot. Thus, pivot = median{KKll, , KK((l+rl+r)/2)/2, , KKrr}.}.

Page 21: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Merge Sort (1/13)Merge Sort (1/13) Before looking at the merge sort algorithm to sort Before looking at the merge sort algorithm to sort

nn records, let us see how one may merge two so records, let us see how one may merge two sorted lists to get a single sorted list.rted lists to get a single sorted list.

MergingMerging The first one, Program 7.7, uses O(The first one, Program 7.7, uses O(nn) additional space.) additional space. It merges the sorted lists It merges the sorted lists

((listlist[[ii], … , ], … , listlist[[mm]) and (]) and (listlist[[m+1m+1], …, ], …, listlist[[nn]), ]), into a single sorted list, (into a single sorted list, (sortedsorted[[ii], … , ], … , sortedsorted[[nn]).]).

Page 22: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Merge (using O(Merge (using O(nn) space)) space)

Page 23: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Merge Sort (3/13)Merge Sort (3/13) OO(1) space merge(1) space merge

Steps in an Steps in an OO(1) space merge when the total number of (1) space merge when the total number of records, n is a perfect square */records, n is a perfect square */and the number of records in each of the files to be and the number of records in each of the files to be merged is a multiple of merged is a multiple of n */n */

Step1:Step1:Identify the Identify the n records with largest keyn records with largest key. This is done by . This is done by following right to left along the two files to be merged.following right to left along the two files to be merged.

Step2:Step2:Exchange the records of the second file that were Exchange the records of the second file that were identified in Step1identified in Step1 with those just to the left of those with those just to the left of those identified from the first file so that the identified from the first file so that the n record with n record with largest keys form a contiguous blocklargest keys form a contiguous block

Page 24: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Merge Sort Merge Sort (4/13)(4/13)

OO(1) space merge (cont’d)(1) space merge (cont’d) Step3:Step3:

SwapSwap the block of the block of n largestn largest records with the records with the leftmostleftmost block (unless it is already the leftmost block). block (unless it is already the leftmost block). Sort the Sort the rightmost blockrightmost block

Step4:Step4:ReorderReorder the blocks excluding the block of largest reco the blocks excluding the block of largest records into rds into nondecreasing ordernondecreasing order of the of the last key in the blolast key in the blockscks

Page 25: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Merge Sort (5/13)Merge Sort (5/13) OO(1) space merge (cont’d)(1) space merge (cont’d)

Step5:Step5:Perform as many merge sub steps as needed to Perform as many merge sub steps as needed to merge the merge the n-1 blocks other than the block with the n-1 blocks other than the block with the largest keys.largest keys.

0 1 2 3 w z u x 4 6 8 a | v y 5 7 9 b | c e g i j k | d f h o p q | l m n r s t

0 1 2 3 4 z u x w 6 8 a | v y 5 7 9 b | c e g i j k | d f h o p q | l m n r s t

Page 26: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Merge Sort (6/13)Merge Sort (6/13)

6, 7, 8 are merged

Segment one is merged (i.e., 0, 2, 4, 6, 8, a)

Change place marker (longest sorted sequence of records)

Segment one is merged (i.e., b, c, e, g, i, j, k)

Change place marker

Segment one is merged (i.e., o, p, q)

No other segment. Sort the largest keys.

Step6:Step6:Sort the block with the largest keysSort the block with the largest keys

Page 27: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Merge Sort (7/13)Merge Sort (7/13) Iterative merge sortIterative merge sort

1.1. We assume that the input sequence has We assume that the input sequence has nn sorted lis sorted lists, each of length 1.ts, each of length 1.

2.2. We merge these lists pairwise to obtain We merge these lists pairwise to obtain nn/2 lists of s/2 lists of size 2.ize 2.

3.3. We then merge the n/2 lists pairwise, and so on, untWe then merge the n/2 lists pairwise, and so on, until a single list remains.il a single list remains.

AnalysisAnalysis Total number of passes is the celling of logTotal number of passes is the celling of log22nn

merge two sorted list in linear time: O(merge two sorted list in linear time: O(nn)) The total computing time is O(The total computing time is O(n n loglog n n).).

Page 28: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Merge Sort (8/13)Merge Sort (8/13) merge_passmerge_pass Invokes Invokes mergemerge (Program 7.7) to merge the sorted sublists (Program 7.7) to merge the sorted sublists Perform one pass of the merge sort. It merges adjancent Perform one pass of the merge sort. It merges adjancent

pairs of subfiles from list into sorted.pairs of subfiles from list into sorted.

the number of elements in the listthe length of the subfile

[0]

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]length=2

n=10i= 0

0 1 3

4

list

sorted

4 5 7

8

Page 29: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

merge_sortmerge_sort: Perform a merge sort on the file: Perform a merge sort on the file

[0]

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

length=1

list

extra

n=102

list

4

extra

8

list

16

Page 30: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Merge Sort (10/13)Merge Sort (10/13) Recursive merge sort conceptRecursive merge sort concept

Page 31: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Merge Sort (10/13)Merge Sort (10/13) Recursive merge sort conceptRecursive merge sort concept

Page 32: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Merge Sort (10/13)Merge Sort (10/13) Recursive merge sort conceptRecursive merge sort concept

Page 33: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Merge Sort (10/13)Merge Sort (10/13) Recursive merge sort conceptRecursive merge sort concept

Page 34: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Merge Sort (10/13)Merge Sort (10/13) Recursive merge sort conceptRecursive merge sort concept

Page 35: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

listmergelistmerge:: Takes two sorted chains and returns an integer that points Takes two sorted chains and returns an integer that points

to the start of to the start of the sorted listthe sorted list

The link field in each record is initially set to -1

Since the elements were numbered from 0 to n-1, we use list[n] to store the start pointer

Page 36: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

rmergermerge: sort the list, list[lower], …, list[upper]. : sort the list, list[lower], …, list[upper]. The link field in each record is initially set to -1The link field in each record is initially set to -1

start = rmerge(list, 0, n-1);

[0]

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

lower=upper=

middle=

0

9442211001

0

1

1

02

1

2

2

0

42

3

334

3

43

4

0

3

0

294

5

776655

5

6

6

6

5

76

7

7

5

97

8

88

8

9

9

9

8

5

8

5

7

0

5

0

4= 0

list

Page 37: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Merge Sort (13/13)Merge Sort (13/13) Variation:Variation: Natural merge sort Natural merge sort ::

We can modify We can modify merge_sortmerge_sort to take into account the prevaili to take into account the prevailing order within the input list.ng order within the input list.

In this implementation we make an initial pass over the data In this implementation we make an initial pass over the data to determine the sequences of records that are in order.to determine the sequences of records that are in order.

The merge sort then uses these initially ordered sublists for The merge sort then uses these initially ordered sublists for the remainder of the passes.the remainder of the passes.

Page 38: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Heap Sort (1/3)Heap Sort (1/3) The challenges of merge sortThe challenges of merge sort

The merge sort requires additional storage proportional to The merge sort requires additional storage proportional to the number of records in the file being sorted.the number of records in the file being sorted.

By using the By using the OO(1) space merge algorithm, the space requir(1) space merge algorithm, the space requirements can be reduced to ements can be reduced to OO(1), but significantly slower th(1), but significantly slower than the original one.an the original one.

Heap sortHeap sort Require only a fixed amount of additional storageRequire only a fixed amount of additional storage Slightly slower than merge sort using O(Slightly slower than merge sort using O(nn) additional spac) additional spac

ee Faster than merge sort using Faster than merge sort using OO(1) additional space.(1) additional space. The worst case and average computing time is O(The worst case and average computing time is O(n n loglog n n), ),

same as merge sortsame as merge sort UnstableUnstable

Page 39: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

adjustadjust adjust the binary tree to establish the heapadjust the binary tree to establish the heap

/* compare root and max. root */

/* move to parent */ [1]

[2] [3]

[4] [5] [6][7]

[8] [9] [10]

26

5 77

1 61 11 59

15 48 19

rootkey =

root = 1

n = 10

26

child = 23

77

67

59

14

26

Page 40: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Heap Sort (3/3)Heap Sort (3/3)

[1]

[2] [3]

[4][5] [6][7]

[8] [9] [10]

26

5 77

1 61 11 59

15 48 19

heapsortheapsort

n = 10i = 54

48

1

32

61

19

5

1

77

59

26

9

5

77

61

48

15

5

8

1

61

59

26

1

7

5

59

48

19

5

6

1

48

26

11

1

5

1

26

19

15

1

4

5

19

15

5

3

1

15

11

1

2

1

11

5

1

1

1

5

ascending order(max heap)

bottom-up

top-down

Page 41: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Radix Sort (1/8)Radix Sort (1/8)

We considers the problem of sorting records that We considers the problem of sorting records that have several keyshave several keys These keys are labeled These keys are labeled KK0 0 (most significant key)(most significant key), , KK11, ,

… , … , KKr-1 r-1 (least significant key)(least significant key).. Let Let KKi i

jj denote key denote key KKjj of record of record RRii..

A list of records A list of records RR00, … , , … , RRnn-1-1, is , is lexically sortedlexically sorted with re with re

spect to the keys spect to the keys KK00, , KK11, … , , … , KKrr-1-1 iffiff ((KKii

00, , KKii11, …, , …, KKii

r-1r-1) ) ( (KK00i+1i+1, , KK11

i+1i+1, …, , …, KKr-1r-1i+1i+1), 0), 0 i i < < nn-1-1

Page 42: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Radix Sort (2/8)Radix Sort (2/8)

ExampleExample sorting a deck of cards on two keys, suit and face sorting a deck of cards on two keys, suit and face

value, in which the keys have the ordering relation:value, in which the keys have the ordering relation:KK0 0 [Suit]:[Suit]: < < < < < < KK1 1 [Face value]: 2 < 3 < 4 < … < 10 < J < Q < K < A[Face value]: 2 < 3 < 4 < … < 10 < J < Q < K < A

Thus, a sorted deck of cards has the ordering:Thus, a sorted deck of cards has the ordering:22, …, A, …, A, … , 2, … , 2, … , A, … , A

Two approaches to sort:Two approaches to sort:1.1. MSD (Most Significant Digit) first:MSD (Most Significant Digit) first: sort on K0, then K1, ...

2.2. LSD (Least Significant Digit) first:LSD (Least Significant Digit) first: sort on Kr-1, then Kr-2, ...

Page 43: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Radix Sort (3/8)Radix Sort (3/8) MSD first

1. MSD sort first, e.g., bin sort, four bins

2. LSD sort second Result: 22, …, A, …, A, … , 2, … , 2, … , A, … , A

Page 44: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Radix Sort (4/8)Radix Sort (4/8) LSD first1.LSD sort first, e.g., face sort,

13 bins 2, 3, 4, …, 10, J, Q, K, A

2.MSD sort second (may not needed, we can just classify these 13 piles into 4 separated piles by considering them from face 2 to face A)

Simpler than the MSD one because we do not have to sort the subpiles independently

Result: 22, …, A, …, A, … , , … , 22, …, A, …, A

Page 45: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Radix Sort (5/8)Radix Sort (5/8) We also can use an LSD or MSD sort when we We also can use an LSD or MSD sort when we

have only one logical key, if we interpret this key have only one logical key, if we interpret this key as a composite of several keys.as a composite of several keys.

Example:Example: integer: the digit in the far right position is the least siginteger: the digit in the far right position is the least sig

nificant and the most significant for the far left positionnificant and the most significant for the far left position range: range: 0 K 999 using LSD or MSD sort for three keys (K0, K1, K2) since an LSD sort does not require the maintainence

of independent subpiles, it is easier to implement

MSD LSD0-9 0-9 0-9

Page 46: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Radix Sort (6/8)Radix Sort (6/8) radix sortradix sort

decompose the sort key into digits using a radix decompose the sort key into digits using a radix rr. . Ex: When Ex: When rr =10, we get the common base 10 or deci =10, we get the common base 10 or deci

mal decomposition of the keymal decomposition of the key

LSD radix LSD radix rr sort sort The records, The records, RR00, … ,, … ,RRnn-1-1

Keys: Keys: dd-tuples (-tuples (xx00, , xx11, …, , …, xxdd-1-1) and that 0 ) and that 0 xxii < < rr.. Each record has a link field, and that the input list is stEach record has a link field, and that the input list is st

ored as a dynamically linked list.ored as a dynamically linked list. We implement the bins as queuesWe implement the bins as queues

frontfront[[ii], 0 ], 0 ii < < rr, pointing to the first record in bin , pointing to the first record in bin ii rearrear[[ii], 0 ], 0 ii < < rr, pointing to the last record in bin , pointing to the last record in bin ii

#define MAX_DIGIT 3 /* 0 to 999 */#define RADIX_SIZE 10typedef struct list_node *list_pointer;typedef struct list_node {

int key[MAX_DIGIT];list_pointer link;};

Page 47: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

LSD Radix SortLSD Radix Sort Time complexity: O(MAX_DIGIT(RADIX_SIZE+n))

MAX_DIGIT passes

O(RADIX_SIZE)

O(RADIX_SIZE)

O(n)

RADIX_SIZE = 10

MAX_DIGIT = 3

f[9]

f[8]

f[7]

f[6]

f[5]

f[4]

f[3]

f[2]

f[1]

f[0]

271NULL

NULL

93

33 NULL

984NULL

55

306NULL

208NULL

179

859

9 NULL r[9]

r[8]

r[7]

r[6]

r[5]

r[4]

r[3]

r[2]

r[1]

r[0]

Initial input:179→208→306→93→859→984→55→9→271→33

Chain after first pass, i=2:271→93→33→984→55→306→208→179→859→9

Page 48: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Radix Sort (8/8)Radix Sort (8/8) Simulation of Simulation of radix_sortradix_sort

f[9]f[8]

f[7]

f[6]

f[5]

f[4]

f[3]

f[2]

f[1]

f[0]

271

NULL

93

33 NULL

984NULL

55

306

208

NULL

179

859

9 NULL

r[9]

r[8]

r[7]

r[6]r[5]

r[4]

r[3]

r[2]

r[1]

r[0]

f[9]

f[8]

f[7]

f[6]

f[5]

f[4]

f[3]

f[2]

f[1]

f[0]

271NULL

NULL93

33

984NULL

55

306NULL

208

179

859

9

NULL

r[9]

r[8]

r[7]

r[6]

r[5]

r[4]

r[3]

r[2]

r[1]

r[0]

NULL

NULL

Chain after second pass, i=1:306→208→9→33→55→859→271→179→984→93

Chain after third pass, i=0:9→33→55→93→179→208→271→306→859→984

Page 49: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Summary of Internal Sorting (1/2)Summary of Internal Sorting (1/2) Insertion SortInsertion Sort

Works well when the list is already partially orderedWorks well when the list is already partially ordered The best sorting method for small The best sorting method for small nn

Merge SortMerge Sort The best/worst case (O(The best/worst case (O(nnloglognn)))) Require more storage than a heap sortRequire more storage than a heap sort Slightly more overhead than quick sortSlightly more overhead than quick sort

Quick SortQuick Sort The best average behaviorThe best average behavior The worst complexity in worst case (O(The worst complexity in worst case (O(nn22))))

Radix SortRadix Sort Depend on the size of the keys and the choice of the radixDepend on the size of the keys and the choice of the radix

Page 50: CS235102 Data Structures Chapter 7 Sorting (Concentrating on Internal Sorting)

Summary of Internal Sorting (2/2)Summary of Internal Sorting (2/2) Analysis of the average running timesAnalysis of the average running times