CSC 213 – Large Scale Programming Lecture 24: Radix & Bucket Sorts.

Date posted: 20-Dec-2015

Page 1

CSC 213 – Large Scale Programming
Lecture 24: Radix & Bucket Sorts

Page 2

Today’s Goal

Discuss two new ways to sort data
    Very different from other forms of sorting
    Can* be a much faster method of sorting
    Follow a simple pattern, but can be confusing to learn

Page 3

Bucket-Sort

Uses B, an array of Sequences (i.e., buckets)
Sorts Sequence, S, in two phases:

1. Repeatedly remove the first Entry, <k, v>, from S and add it to B[k] until S is empty

2. For i ← 0, …, B.size()-1, move the entries from bucket B[i] to the end of S

Page 4

Bucket-Sort Example

Suppose keys are in the range [0, 9]

S (input):      (7, d) (1, c) (3, a) (7, g) (3, b) (7, e)

Phase 1, B:     B[1] = (1, c)
                B[3] = (3, a) (3, b)
                B[7] = (7, d) (7, g) (7, e)
                all other buckets are empty

Phase 2, S:     (1, c) (3, a) (3, b) (7, d) (7, g) (7, e)

Page 5

Bucket-Sort Algorithm

Algorithm bucketSort(Sequence<E> S, Comparator<E> c)
    B ← new Sequence[c.getMaxKey()]
    // instantiate the Sequence at each index within B
    while ¬S.isEmpty() do    // Phase 1
        entry ← S.removeFirst()
        B[c.compare(entry, null)].insertLast(entry)
    for i ← 0 to B.length - 1    // Phase 2
        while ¬B[i].isEmpty() do
            entry ← B[i].removeFirst()
            S.insertLast(entry)
    return S
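
The pseudocode above can be sketched in Java roughly as follows. This is a minimal sketch, not the course's own code: it assumes `java.util.Deque` stands in for the lecture's Sequence, `Map.Entry<Integer, String>` stands in for Entry, and a hypothetical `bucketOf` function plays the role of `c.compare(entry, null)`. The number of buckets is passed in directly instead of being read from the Comparator.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.function.ToIntFunction;

class BucketSortSketch {
    /**
     * Sorts s in place. bucketOf (an assumed helper) must return an index
     * in [0, numBuckets - 1] for every element of s.
     */
    static <E> void bucketSort(Deque<E> s, int numBuckets, ToIntFunction<E> bucketOf) {
        // Instantiate an empty bucket at each index of B.
        List<Deque<E>> b = new ArrayList<>(numBuckets);
        for (int i = 0; i < numBuckets; i++) {
            b.add(new ArrayDeque<>());
        }
        // Phase 1: empty S, appending each entry to the end of its bucket.
        while (!s.isEmpty()) {
            E entry = s.removeFirst();
            b.get(bucketOf.applyAsInt(entry)).addLast(entry);
        }
        // Phase 2: drain the buckets in index order back onto the end of S.
        for (Deque<E> bucket : b) {
            while (!bucket.isEmpty()) {
                s.addLast(bucket.removeFirst());
            }
        }
    }

    /** Runs the sort on the entries from the Bucket-Sort Example slide. */
    static String demo() {
        Deque<Map.Entry<Integer, String>> s = new ArrayDeque<>(List.of(
                Map.entry(7, "d"), Map.entry(1, "c"), Map.entry(3, "a"),
                Map.entry(7, "g"), Map.entry(3, "b"), Map.entry(7, "e")));
        bucketSort(s, 10, e -> e.getKey()); // keys are in the range [0, 9]
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Integer, String> e : s) {
            sb.append(e.getKey()).append(e.getValue()).append(' ');
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "1c 3a 3b 7d 7g 7e"
    }
}
```

Because each bucket is filled and drained in first-in, first-out order, the three entries with key 7 come out in their original order (d, g, e).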

Page 6

Bucket-Sort Properties

Keys are used as indices into the array
    Must be non-negative integers
    Does not require an external Comparator

Sort is stable
    Two entries with the same key keep their relative ordering
    Bubble-sort & Merge-sort are also stable

Page 7

Bucket-Sort Extensions

Extend Bucket-sort with a Comparator
    Specify the maximum number of buckets
    c.compare(key, null) returns the index for key

For Integer keys from a to b:
    Comparator maps k to k – a

For Boolean keys, Comparator returns:
    0 when the key is false
    1 when the key is true
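
The two mappings above might look like this in Java. This is only a sketch: `rangeIndexer` and `booleanIndex` are hypothetical helper names, and the range check is an added assumption about how out-of-range keys should be handled.

```java
import java.util.function.ToIntFunction;

class BucketIndexSketch {
    /** Maps Integer keys in [a, b] to bucket indices in [0, b - a]. */
    static ToIntFunction<Integer> rangeIndexer(int a, int b) {
        return k -> {
            if (k < a || k > b) {
                // Assumed behavior: reject keys outside the bounded range.
                throw new IllegalArgumentException("key out of range: " + k);
            }
            return k - a;
        };
    }

    /** Maps Boolean keys to two buckets: false to 0, true to 1. */
    static int booleanIndex(boolean key) {
        return key ? 1 : 0;
    }

    public static void main(String[] args) {
        System.out.println(rangeIndexer(5, 9).applyAsInt(7)); // prints 2
        System.out.println(booleanIndex(true));               // prints 1
    }
}
```

For keys from a to b this needs b - a + 1 buckets; for Boolean keys it needs 2.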

Page 8

Bucket-Sort Extensions

Use Bucket-sort with any keys
    Keys must be from a bounded set, D, of values
    D could be U.S. states, molecular structures, To-Whack items on an assassin's hit list…

Comparator ranks each value in D
    Rank states alphabetically or by admission order
    Ranks are used as the index into the bucket array, B
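
One way to picture ranking a bounded set is with a Java enum, whose `ordinal()` gives each value a fixed rank. The sketch below is an illustration only: the `State` enum is a hypothetical four-value subset of D, declared in alphabetical order so that the ordinal is the alphabetical rank.

```java
import java.util.ArrayList;
import java.util.List;

class RankedBucketSketch {
    // Illustrative subset of D; declaration order defines each value's rank.
    enum State { ALABAMA, ALASKA, ARIZONA, ARKANSAS }

    /** Bucket-sorts the input using each value's rank as its bucket index. */
    static List<State> sortByRank(List<State> input) {
        List<List<State>> buckets = new ArrayList<>();
        for (int i = 0; i < State.values().length; i++) {
            buckets.add(new ArrayList<>());
        }
        for (State s : input) {
            buckets.get(s.ordinal()).add(s); // rank is the index into B
        }
        List<State> sorted = new ArrayList<>();
        for (List<State> bucket : buckets) {
            sorted.addAll(bucket);
        }
        return sorted;
    }

    static String demo() {
        return sortByRank(List.of(State.ARIZONA, State.ALABAMA,
                State.ARKANSAS, State.ALASKA, State.ALABAMA)).toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
        // prints "[ALABAMA, ALABAMA, ALASKA, ARIZONA, ARKANSAS]"
    }
}
```

Ranking by admission order instead would only require a different ranking function; the bucket logic stays the same.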

Page 9

d-Tuples

Combination of d keys (k_1, k_2, …, k_d)
    k_i is the “i-th dimension of the tuple”

Example: Point p = (x, y) is a 2-tuple
    x is the value of the 1st dimension
    y is the value of the 2nd dimension

Page 10

Lexicographic Order

Order of d-tuples defined recursively:

$$(x_1, x_2, \ldots, x_d) < (y_1, y_2, \ldots, y_d) \iff x_1 < y_1 \,\lor\, \bigl(x_1 = y_1 \,\land\, (x_2, \ldots, x_d) < (y_2, \ldots, y_d)\bigr)$$

Lexicographic order of 2-tuples?
    (3, 4) (7, 8) (3, 2) (1, 4) (4, 8)

Answer:
    (1, 4) (3, 2) (3, 4) (4, 8) (7, 8)
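
The recursive definition can be turned into a comparison function fairly directly. The following is a sketch under the assumption that tuples are `int[]` arrays of equal length; `compare` and `demo` are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LexOrderSketch {
    /**
     * Compares x and y lexicographically starting at dimension `from`.
     * Returns a negative value, zero, or a positive value.
     */
    static int compare(int[] x, int[] y, int from) {
        if (from == x.length) {
            return 0; // no dimensions left: the tuples are equal
        }
        if (x[from] != y[from]) {
            return Integer.compare(x[from], y[from]); // first dimension decides
        }
        return compare(x, y, from + 1); // tie: recurse on the remaining dimensions
    }

    /** Sorts the 2-tuples from the slide using the comparison above. */
    static String demo() {
        List<int[]> tuples = new ArrayList<>(List.of(
                new int[] {3, 4}, new int[] {7, 8}, new int[] {3, 2},
                new int[] {1, 4}, new int[] {4, 8}));
        tuples.sort((x, y) -> compare(x, y, 0));
        StringBuilder sb = new StringBuilder();
        for (int[] t : tuples) {
            sb.append(Arrays.toString(t)).append(' ');
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "[1, 4] [3, 2] [3, 4] [4, 8] [7, 8]"
    }
}
```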

Page 11

Lexicographic Sorting

Uses d calls to a stable sorting algorithm
    Each call sorts along a single dimension of the tuple
    So must sort from the last (least significant) dimension to the first (most significant)

Algorithm lexicographicSort(Sequence<E> s, Comparator<E> c, Sort<E> stableSort)
    for i ← c.size() downto 1
        stableSort.sort(s, c, i)
    return s
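
A minimal Java sketch of this idea follows, assuming tuples are `int[]` arrays with d dimensions. It uses `List.sort` as the stable sort (the Java library documents that sort as stable) in place of the lecture's `stableSort` parameter, and it indexes dimensions from 0 rather than 1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class LexSortSketch {
    /** Sorts d-dimensional tuples with d stable passes, last dimension first. */
    static void lexicographicSort(List<int[]> s, int d) {
        for (int i = d - 1; i >= 0; i--) {
            final int dim = i;
            // Each pass compares a single dimension only; stability keeps
            // the ordering established by the earlier passes among ties.
            s.sort(Comparator.comparingInt((int[] t) -> t[dim]));
        }
    }

    static String demo() {
        List<int[]> tuples = new ArrayList<>(List.of(
                new int[] {3, 4}, new int[] {7, 8}, new int[] {3, 2},
                new int[] {1, 4}, new int[] {4, 8}));
        lexicographicSort(tuples, 2);
        StringBuilder sb = new StringBuilder();
        for (int[] t : tuples) {
            sb.append(Arrays.toString(t)).append(' ');
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "[1, 4] [3, 2] [3, 4] [4, 8] [7, 8]"
    }
}
```

After the first pass (on y) the order is (3, 2) (3, 4) (1, 4) (7, 8) (4, 8); the second pass (on x) then yields the lexicographic order, with (3, 2) staying ahead of (3, 4) only because the sort is stable.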

Page 12

Radix-Sort

Lexicographic sort using Bucket-sort
Good for tuples where each dimension can be made into an index in the range [0, N-1]
    Compare each character in two Strings
    Compare each bit in two Integers

Requires modification to the comparator
    Key is still the first parameter to compare
    But the dimension is now passed as the second parameter

Page 13

Radix-Sort for Integers

Represent Integer as a tuple of bits:
    62₁₀ = 111110₂
    04₁₀ = 000100₂

With decimal representation, need 10 buckets
With binary representation, need 2 buckets

Radix-sort runs in O(bn) time
    b is the length of the longest element in the input
    For 32-bit integers, b = 32
    Takes O(32n) = O(n) time!

Page 14

Radix-Sort for Integers

Algorithm binaryRadixSort(Sequence<E> S, Comparator<E> c)
    for i ← 0 to c.size() - 1
        bucketSort(S, 2, i, c)
    return S

Value of the i-th bit of Integer k is:

    ((k >> i) & 1)
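
Put together, a binary radix sort might be sketched in Java as below. This assumes non-negative `int` keys and a caller-supplied bit count; instead of calling a general `bucketSort(S, 2, i, c)`, it inlines the two buckets as the lists `zeros` and `ones`, which are names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class BinaryRadixSortSketch {
    /** Sorts non-negative integers that fit in `bits` bits, in place. */
    static void binaryRadixSort(List<Integer> s, int bits) {
        for (int i = 0; i < bits; i++) {         // least significant bit first
            List<Integer> zeros = new ArrayList<>();
            List<Integer> ones = new ArrayList<>();
            for (int k : s) {
                if (((k >> i) & 1) == 0) {       // value of the i-th bit of k
                    zeros.add(k);
                } else {
                    ones.add(k);
                }
            }
            // Stable two-bucket pass: bucket 0, then bucket 1.
            s.clear();
            s.addAll(zeros);
            s.addAll(ones);
        }
    }

    static String demo() {
        List<Integer> s = new ArrayList<>(
                List.of(0b1001, 0b0010, 0b1101, 0b0001, 0b1110));
        binaryRadixSort(s, 4);
        return s.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "[1, 2, 9, 13, 14]"
    }
}
```

Each of the b passes touches every element once, which is where the O(bn) running time comes from.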

Page 15

Example

Sorting a sequence of 4-bit integers

Input    After bit 0    After bit 1    After bit 2    After bit 3
1001     0010           1001           1001           0001
0010     1110           1101           0001           0010
1101     1001           0001           0010           1001
0001     1101           0010           1101           1101
1110     0001           1110           1110           1110

Page 16

For Friday…

Come with questions to review for the Midterm
    Last chance to ask me about the material
    Also have sort-based problems to discuss