Sorting Quick Sort, Shell Sort, Counting Sort, Radix Sort AND Bucket Sort.
Linear Sort
-
Upload
debasmita-datta -
Category
Documents
-
view
221 -
download
0
Transcript of Linear Sort
-
8/3/2019 Linear Sort
1/37
Linear Sorting
Sorting in O(n)
-
8/3/2019 Linear Sort
2/37
Previously
Previous sorts had the same property: they are
based only on comparisons
Any comparison-based sorting algorithm must
run in (n log n)
Thus, MERGESORT and HEAPSORT are
asymptotically optimal
Obviously, this lower bound does not apply to
linear sorting
-
8/3/2019 Linear Sort
3/37
Lower Bound for Sorting
Assume (for simplicity) that all elements in an input sequence a1,a2, ..., an are distinct
We can view comparison-based sorting using a decision tree
Each internal node in the decision tree corresponds to one of thecomparisons in the algorithm.
start at the root and do first comparison:
- if take left branch, if > take right branch, etc.
each leaf represents one possible ordering of the input one leaf for each of n! possible orderings
Each permutation must appear as a leaf in the tree
One decision tree exists for each algorithm and input size
-
8/3/2019 Linear Sort
4/37
The Decision Tree
a1:a2
a2:a3 a1:a3
a2:a3a1:a3 2,1,3
2,3,1 3,2,13,1,21,3,2
1,2,3
>
>
>
>
>
a1 = 6, a2 = 8, a3 = 5
Note: The length of the longest root to leaf path in this tree= worst-case number of comparisons
worst-case number of operations of algorithm
-
8/3/2019 Linear Sort
5/37
The (nlgn) Lower BoundTheorem:Any decision tree T for sorting n elements has height
(nlgn) (therefore, any comparison sort algorithm requires(nlgn)
compares in the worst case).
Proof: Let hbe the height of T. Then we know
T has at least n! leaves T is binary, so it has at most 2h leaves
n! #leaves2h
2hn!
lg(2h) lg(n!)
h (nlgn) (Eq. 3.18)
-
8/3/2019 Linear Sort
6/37
Proof
Consider a decision tree with height h with l
reachable leaves
n! l
Since a binary tree of height h has no more than
2h leaves:
n! l 2h
Taking the log of both sides:
h lg (n!) = (n lg n)
-
8/3/2019 Linear Sort
7/37
Counting Sort
Each element is an integer in the range from 1
to k
For each element, determine the number of
elements less than it
Works well on small ranges
Does notsort in place
-
8/3/2019 Linear Sort
8/37
The Code
COUNTING-SORT (A,B, k)
fori1 tok //Init CdoC[i] 0 //(k)
forj1 tolength[A] //Build C
doC[A[j]] C[A[j]] + 1 //(n)
fori 2 tok //Make C'
doC[i] C[i] + C[i-1] //(k)
forj length[A] downto 1 //Copy info
doB[C[A[j]]] A[j] //(n)
C[A[j]] C[A[j]] -1
Requirement: input elements are integers in known range [0..k] for
some constant k.
Idea: for each input element x, find the number of elements x (say
this number = m) and put x in the (m+1)stspot in the output array.
-
8/3/2019 Linear Sort
9/37
3 6 4 1 3 4 1 4
2 0 2 3 0 1
1 2 3 4 5 6 7 8
1 2 3 4 5 6
A
C
Original
Number of 1's, 2's..
How do you construct C's size?
How do you fill C's data?
-
8/3/2019 Linear Sort
10/37
Modify CSuch that C now tells how many elements are less than i
2 0 2 3 0 1
1 2 3 4 5 6
C Number of 1's, 2's..
How do you construct C'?
2 2 4 7 7 8
1 2 3 4 5 6
C' Number of slots
-
8/3/2019 Linear Sort
11/37
Move into Bfrom A
2 2 4 7 7 8
1 2 3 4 5 6 7 8
1 2 3 4 5 6
B
C
Sorted
Number of slots.
3 6 4 1 3 4 1 4
1 2 3 4 5 6 7 8
A Original
-
8/3/2019 Linear Sort
12/37
Move into Bfrom A
4
2 2 4 6 7 8
1 2 3 4 5 6 7 8
1 2 3 4 5 6
B
C
Sorted
Number of slots.
3 6 4 1 3 4 1 4
1 2 3 4 5 6 7 8
A Original
-
8/3/2019 Linear Sort
13/37
Move into Bfrom A
4
2 2 4 6 7 8
1 2 3 4 5 6 7 8
1 2 3 4 5 6
B
C
Sorted
Number of slots.
3 6 4 1 3 4 1 4
1 2 3 4 5 6 7 8
A Original
-
8/3/2019 Linear Sort
14/37
Move into Bfrom A
4
1 2 4 6 7 8
1 2 3 4 5 6 7 8
1 2 3 4 5 6
B
C
Sorted
Number of slots.
3 6 4 1 3 4 1 4
1 2 3 4 5 6 7 8
A Original
1
-
8/3/2019 Linear Sort
15/37
Move into Bfrom A
1 4
1 2 4 6 7 8
1 2 3 4 5 6 7 8
1 2 3 4 5 6
B
C
Sorted
Number of slots.
3 6 4 1 3 4 1 4
1 2 3 4 5 6 7 8
A Original
-
8/3/2019 Linear Sort
16/37
Move into Bfrom A
1 4
1 2 4 5 7 8
1 2 3 4 5 6 7 8
1 2 3 4 5 6
B
C
Sorted
Number of slots.
3 6 4 1 3 4 1 4
1 2 3 4 5 6 7 8
A Original
4
-
8/3/2019 Linear Sort
17/37
Move into Bfrom A
1 4
1 2 4 5 7 8
1 2 3 4 5 6 7 8
1 2 3 4 5 6
B
C
Sorted
Number of slots.
3 6 4 1 3 4 1 4
1 2 3 4 5 6 7 8
A Original
4
-
8/3/2019 Linear Sort
18/37
Move into Bfrom A
1 4
1 2 3 5 7 8
1 2 3 4 5 6 7 8
1 2 3 4 5 6
B
C
Sorted
Number of slots.
3 6 4 1 3 4 1 4
1 2 3 4 5 6 7 8
A Original
43
-
8/3/2019 Linear Sort
19/37
Move into Bfrom A
1 4
1 2 3 5 7 8
1 2 3 4 5 6 7 8
1 2 3 4 5 6
B
C
Sorted
Number of slots.
3 6 4 1 3 4 1 4
1 2 3 4 5 6 7 8
A Original
43
-
8/3/2019 Linear Sort
20/37
Move into Bfrom A
1 1 4
0 2 3 5 7 8
1 2 3 4 5 6 7 8
1 2 3 4 5 6
B
C
Sorted
Number of slots.
3 6 4 1 3 4 1 4
1 2 3 4 5 6 7 8
A Original
43
-
8/3/2019 Linear Sort
21/37
Move into Bfrom A
1 1 4
0 2 3 5 7 8
1 2 3 4 5 6 7 8
1 2 3 4 5 6
B
C
Sorted
Number of slots.
3 6 4 1 3 4 1 4
1 2 3 4 5 6 7 8
A Original
43
-
8/3/2019 Linear Sort
22/37
Move into Bfrom A
1 1 4
0 2 3 4 7 8
1 2 3 4 5 6 7 8
1 2 3 4 5 6
B
C
Sorted
Number of slots.
3 6 4 1 3 4 1 4
1 2 3 4 5 6 7 8
A Original
43 4
-
8/3/2019 Linear Sort
23/37
Move into Bfrom A
1 1 4
0 2 3 4 7 8
1 2 3 4 5 6 7 8
1 2 3 4 5 6
B
C
Sorted
Number of slots.
3 6 4 1 3 4 1 4
1 2 3 4 5 6 7 8
A Original
43 4
-
8/3/2019 Linear Sort
24/37
Move into Bfrom A
1 1 4
0 2 3 4 7 7
1 2 3 4 5 6 7 8
1 2 3 4 5 6
B
C
Sorted
Number of slots.
3 6 4 1 3 4 1 4
1 2 3 4 5 6 7 8
A Original
43 4 6
-
8/3/2019 Linear Sort
25/37
Move into Bfrom A
1 1 4
0 2 3 4 7 7
1 2 3 4 5 6 7 8
1 2 3 4 5 6
B
C
Sorted
Number of slots.
3 6 4 1 3 4 1 4
1 2 3 4 5 6 7 8
A Original
43 4 6
-
8/3/2019 Linear Sort
26/37
Move into Bfrom A
1 1 4
0 2 2 4 7 7
1 2 3 4 5 6 7 8
1 2 3 4 5 6
B
C
Sorted
Number of slots.
3 6 4 1 3 4 1 4
1 2 3 4 5 6 7 8
A Original
43 4 63
-
8/3/2019 Linear Sort
27/37
Running Time of Counting Sort
for loop in lines 1-2 takes (n) time.for loop in lines 3-4 takes (k) time.for loop in lines 5-7 takes (n) time.
In practice, use counting sort when we have k = (n),
so running time is (n).
Counting sort has the important property ofstability:
A sorting algorithm isstable when numbers with the same values
appear in the output array in the same order as they do in the input array.
Important when satellite data is stored with elements being sorted and
when counting sort is used as a subroutine for radix sort.
Overall time is (k + n).
-
8/3/2019 Linear Sort
28/37
Radix Sort
Sorting by "column"
A d-digit number would create dcolumns
Start with least-significant row Usually requires counting sort to be used on the
columns
-
8/3/2019 Linear Sort
29/37
Example(by hand)
331
429190
127
982
784
318
190
331982
784
127
318
429
318
127429
331
982
784
190
127
190318
331
429
784
982
-
8/3/2019 Linear Sort
30/37
The Code
RADIX-SORT (A, d)
fori 1 tod
do use a stable sort to sortA on digitI
Note:
radix sort sorts the leastsignificant digit first.
correctness can be shown by induction on the digit being sorted. often counting sort is used as the internal sort in step 2.
-
8/3/2019 Linear Sort
31/37
Givenn d-digit numbers in which each digit can take on up
to k possible values, RADIXSORT correctly sorts these
numbers in (d(n + k)) time.
Proof:
Each digit is in the range 0(k-1)
Takesktime to constructC
Each pass over nd-digit numbers takes
(n+k)
Thus, the total running time is (d(n+k))
-
8/3/2019 Linear Sort
32/37
Givenn b-bit numbers and any positive integer r b,
RADIX-SORT correctly sorts these numbers in
((b/r)(n + 2r)) time.
Proof: For a value r b, we view each key as having d =b/r digits of r bits each. Each digit is an integer in the range 0
to 2r
- 1, so that we can use counting sort with k = 2r
- 1. (Forexample, we can view a 32-bit word as having 4 8-bit digits,
so that b = 32, r = 8, k = 2r - 1 = 255, and d = b/r = 4.) Each
pass of counting sort takes time (n + k) = (n + 2r) and there
are d passes, for a total running time of (d(n + 2r )) =
((b/r)(n + 2r)).
-
8/3/2019 Linear Sort
33/37
Bucket Sort
Bucket sortruns in linear time when the input isdrawn from a uniform distribution.
Like counting sort, bucket sort is fast because itassumes something about the input.
Whereas counting sort assumes that the input consistsof integers in a small range, bucket sort assumes thatthe input is generated by a random process thatdistributes elements uniformly over the interval [0, 1).
The idea of bucket sort is to divide the interval [0, 1)into nequal-sized subintervals, or buckets, and thendistribute the ninput numbers into the buckets.
To produce the output, we simply sort the numbers ineach bucket and then go through the buckets in order,listing the elements in each.
-
8/3/2019 Linear Sort
34/37
Example
.06
.43
.48
.37
.91
.44
.98
0
1
2
3
4
5
6
7
8
9
.06
.37
.43 .44 .48
.98
/
/
/
/
/
/
-
8/3/2019 Linear Sort
35/37
The Code
The bucket sort assumes that the input is an n-elementarrayA and that each elementA[i] in the array satisfies0 A[i] < 1. The code requires an auxiliary arrayB[0n- 1] of linked lists (buckets) and there is a
mechanism for maintaining such lists.
BUCKET-SORT (A)
nlength[A]
for i 1 to ndo insertA[i] into list B[A[i]]
for i 0 to n- 1
do sort list B[i] with insertion sort
concatenate the lists together
-
8/3/2019 Linear Sort
36/37
Proof of Bucket Sort
Consider two elements A[i] and A[j]
Assume A[i]A[j]
Then, A[i] is placed into either the same bucketas A[j], or a bucket with a lower index.
The sort of each bucket guarantees A[i] and A[j]
are ordered correctly
-
8/3/2019 Linear Sort
37/37
Of Interest(only to me)
As long as the input has the property that thesum of the squares of the bucket sizes is linear
in the total number of elements... bucket sort
will run in linear time
In other words, each bucket should get the
square root of the number of bucket elements.