Linear Sort

download Linear Sort

of 37

Transcript of Linear Sort

  • 8/3/2019 Linear Sort

    1/37

    Linear Sorting

    Sorting in O(n)

  • 8/3/2019 Linear Sort

    2/37

    Previously

    Previous sorts had the same property: they are

    based only on comparisons

    Any comparison-based sorting algorithm must

    run in (n log n)

    Thus, MERGESORT and HEAPSORT are

    asymptotically optimal

    Obviously, this lower bound does not apply to

    linear sorting

  • 8/3/2019 Linear Sort

    3/37

    Lower Bound for Sorting

    Assume (for simplicity) that all elements in an input sequence a1,a2, ..., an are distinct

    We can view comparison-based sorting using a decision tree

    Each internal node in the decision tree corresponds to one of thecomparisons in the algorithm.

    start at the root and do first comparison:

    - if take left branch, if > take right branch, etc.

    each leaf represents one possible ordering of the input one leaf for each of n! possible orderings

    Each permutation must appear as a leaf in the tree

    One decision tree exists for each algorithm and input size

  • 8/3/2019 Linear Sort

    4/37

    The Decision Tree

    a1:a2

    a2:a3 a1:a3

    a2:a3a1:a3 2,1,3

    2,3,1 3,2,13,1,21,3,2

    1,2,3

    >

    >

    >

    >

    >

    a1 = 6, a2 = 8, a3 = 5

    Note: The length of the longest root to leaf path in this tree= worst-case number of comparisons

    worst-case number of operations of algorithm

  • 8/3/2019 Linear Sort

    5/37

    The (nlgn) Lower BoundTheorem:Any decision tree T for sorting n elements has height

    (nlgn) (therefore, any comparison sort algorithm requires(nlgn)

    compares in the worst case).

    Proof: Let hbe the height of T. Then we know

    T has at least n! leaves T is binary, so it has at most 2h leaves

    n! #leaves2h

    2hn!

    lg(2h) lg(n!)

    h (nlgn) (Eq. 3.18)

  • 8/3/2019 Linear Sort

    6/37

    Proof

    Consider a decision tree with height h with l

    reachable leaves

    n! l

    Since a binary tree of height h has no more than

    2h leaves:

    n! l 2h

    Taking the log of both sides:

    h lg (n!) = (n lg n)

  • 8/3/2019 Linear Sort

    7/37

    Counting Sort

    Each element is an integer in the range from 1

    to k

    For each element, determine the number of

    elements less than it

    Works well on small ranges

    Does notsort in place

  • 8/3/2019 Linear Sort

    8/37

    The Code

    COUNTING-SORT (A,B, k)

    fori1 tok //Init CdoC[i] 0 //(k)

    forj1 tolength[A] //Build C

    doC[A[j]] C[A[j]] + 1 //(n)

    fori 2 tok //Make C'

    doC[i] C[i] + C[i-1] //(k)

    forj length[A] downto 1 //Copy info

    doB[C[A[j]]] A[j] //(n)

    C[A[j]] C[A[j]] -1

    Requirement: input elements are integers in known range [0..k] for

    some constant k.

    Idea: for each input element x, find the number of elements x (say

    this number = m) and put x in the (m+1)stspot in the output array.

  • 8/3/2019 Linear Sort

    9/37

    3 6 4 1 3 4 1 4

    2 0 2 3 0 1

    1 2 3 4 5 6 7 8

    1 2 3 4 5 6

    A

    C

    Original

    Number of 1's, 2's..

    How do you construct C's size?

    How do you fill C's data?

  • 8/3/2019 Linear Sort

    10/37

    Modify CSuch that C now tells how many elements are less than i

    2 0 2 3 0 1

    1 2 3 4 5 6

    C Number of 1's, 2's..

    How do you construct C'?

    2 2 4 7 7 8

    1 2 3 4 5 6

    C' Number of slots

  • 8/3/2019 Linear Sort

    11/37

    Move into Bfrom A

    2 2 4 7 7 8

    1 2 3 4 5 6 7 8

    1 2 3 4 5 6

    B

    C

    Sorted

    Number of slots.

    3 6 4 1 3 4 1 4

    1 2 3 4 5 6 7 8

    A Original

  • 8/3/2019 Linear Sort

    12/37

    Move into Bfrom A

    4

    2 2 4 6 7 8

    1 2 3 4 5 6 7 8

    1 2 3 4 5 6

    B

    C

    Sorted

    Number of slots.

    3 6 4 1 3 4 1 4

    1 2 3 4 5 6 7 8

    A Original

  • 8/3/2019 Linear Sort

    13/37

    Move into Bfrom A

    4

    2 2 4 6 7 8

    1 2 3 4 5 6 7 8

    1 2 3 4 5 6

    B

    C

    Sorted

    Number of slots.

    3 6 4 1 3 4 1 4

    1 2 3 4 5 6 7 8

    A Original

  • 8/3/2019 Linear Sort

    14/37

    Move into Bfrom A

    4

    1 2 4 6 7 8

    1 2 3 4 5 6 7 8

    1 2 3 4 5 6

    B

    C

    Sorted

    Number of slots.

    3 6 4 1 3 4 1 4

    1 2 3 4 5 6 7 8

    A Original

    1

  • 8/3/2019 Linear Sort

    15/37

    Move into Bfrom A

    1 4

    1 2 4 6 7 8

    1 2 3 4 5 6 7 8

    1 2 3 4 5 6

    B

    C

    Sorted

    Number of slots.

    3 6 4 1 3 4 1 4

    1 2 3 4 5 6 7 8

    A Original

  • 8/3/2019 Linear Sort

    16/37

    Move into Bfrom A

    1 4

    1 2 4 5 7 8

    1 2 3 4 5 6 7 8

    1 2 3 4 5 6

    B

    C

    Sorted

    Number of slots.

    3 6 4 1 3 4 1 4

    1 2 3 4 5 6 7 8

    A Original

    4

  • 8/3/2019 Linear Sort

    17/37

    Move into Bfrom A

    1 4

    1 2 4 5 7 8

    1 2 3 4 5 6 7 8

    1 2 3 4 5 6

    B

    C

    Sorted

    Number of slots.

    3 6 4 1 3 4 1 4

    1 2 3 4 5 6 7 8

    A Original

    4

  • 8/3/2019 Linear Sort

    18/37

    Move into Bfrom A

    1 4

    1 2 3 5 7 8

    1 2 3 4 5 6 7 8

    1 2 3 4 5 6

    B

    C

    Sorted

    Number of slots.

    3 6 4 1 3 4 1 4

    1 2 3 4 5 6 7 8

    A Original

    43

  • 8/3/2019 Linear Sort

    19/37

    Move into Bfrom A

    1 4

    1 2 3 5 7 8

    1 2 3 4 5 6 7 8

    1 2 3 4 5 6

    B

    C

    Sorted

    Number of slots.

    3 6 4 1 3 4 1 4

    1 2 3 4 5 6 7 8

    A Original

    43

  • 8/3/2019 Linear Sort

    20/37

    Move into Bfrom A

    1 1 4

    0 2 3 5 7 8

    1 2 3 4 5 6 7 8

    1 2 3 4 5 6

    B

    C

    Sorted

    Number of slots.

    3 6 4 1 3 4 1 4

    1 2 3 4 5 6 7 8

    A Original

    43

  • 8/3/2019 Linear Sort

    21/37

    Move into Bfrom A

    1 1 4

    0 2 3 5 7 8

    1 2 3 4 5 6 7 8

    1 2 3 4 5 6

    B

    C

    Sorted

    Number of slots.

    3 6 4 1 3 4 1 4

    1 2 3 4 5 6 7 8

    A Original

    43

  • 8/3/2019 Linear Sort

    22/37

    Move into Bfrom A

    1 1 4

    0 2 3 4 7 8

    1 2 3 4 5 6 7 8

    1 2 3 4 5 6

    B

    C

    Sorted

    Number of slots.

    3 6 4 1 3 4 1 4

    1 2 3 4 5 6 7 8

    A Original

    43 4

  • 8/3/2019 Linear Sort

    23/37

    Move into Bfrom A

    1 1 4

    0 2 3 4 7 8

    1 2 3 4 5 6 7 8

    1 2 3 4 5 6

    B

    C

    Sorted

    Number of slots.

    3 6 4 1 3 4 1 4

    1 2 3 4 5 6 7 8

    A Original

    43 4

  • 8/3/2019 Linear Sort

    24/37

    Move into Bfrom A

    1 1 4

    0 2 3 4 7 7

    1 2 3 4 5 6 7 8

    1 2 3 4 5 6

    B

    C

    Sorted

    Number of slots.

    3 6 4 1 3 4 1 4

    1 2 3 4 5 6 7 8

    A Original

    43 4 6

  • 8/3/2019 Linear Sort

    25/37

    Move into Bfrom A

    1 1 4

    0 2 3 4 7 7

    1 2 3 4 5 6 7 8

    1 2 3 4 5 6

    B

    C

    Sorted

    Number of slots.

    3 6 4 1 3 4 1 4

    1 2 3 4 5 6 7 8

    A Original

    43 4 6

  • 8/3/2019 Linear Sort

    26/37

    Move into Bfrom A

    1 1 4

    0 2 2 4 7 7

    1 2 3 4 5 6 7 8

    1 2 3 4 5 6

    B

    C

    Sorted

    Number of slots.

    3 6 4 1 3 4 1 4

    1 2 3 4 5 6 7 8

    A Original

    43 4 63

  • 8/3/2019 Linear Sort

    27/37

    Running Time of Counting Sort

    for loop in lines 1-2 takes (n) time.for loop in lines 3-4 takes (k) time.for loop in lines 5-7 takes (n) time.

    In practice, use counting sort when we have k = (n),

    so running time is (n).

    Counting sort has the important property ofstability:

    A sorting algorithm isstable when numbers with the same values

    appear in the output array in the same order as they do in the input array.

    Important when satellite data is stored with elements being sorted and

    when counting sort is used as a subroutine for radix sort.

    Overall time is (k + n).

  • 8/3/2019 Linear Sort

    28/37

    Radix Sort

    Sorting by "column"

    A d-digit number would create dcolumns

    Start with least-significant row Usually requires counting sort to be used on the

    columns

  • 8/3/2019 Linear Sort

    29/37

    Example(by hand)

    331

    429190

    127

    982

    784

    318

    190

    331982

    784

    127

    318

    429

    318

    127429

    331

    982

    784

    190

    127

    190318

    331

    429

    784

    982

  • 8/3/2019 Linear Sort

    30/37

    The Code

    RADIX-SORT (A, d)

    fori 1 tod

    do use a stable sort to sortA on digitI

    Note:

    radix sort sorts the leastsignificant digit first.

    correctness can be shown by induction on the digit being sorted. often counting sort is used as the internal sort in step 2.

  • 8/3/2019 Linear Sort

    31/37

    Givenn d-digit numbers in which each digit can take on up

    to k possible values, RADIXSORT correctly sorts these

    numbers in (d(n + k)) time.

    Proof:

    Each digit is in the range 0(k-1)

    Takesktime to constructC

    Each pass over nd-digit numbers takes

    (n+k)

    Thus, the total running time is (d(n+k))

  • 8/3/2019 Linear Sort

    32/37

    Givenn b-bit numbers and any positive integer r b,

    RADIX-SORT correctly sorts these numbers in

    ((b/r)(n + 2r)) time.

    Proof: For a value r b, we view each key as having d =b/r digits of r bits each. Each digit is an integer in the range 0

    to 2r

    - 1, so that we can use counting sort with k = 2r

    - 1. (Forexample, we can view a 32-bit word as having 4 8-bit digits,

    so that b = 32, r = 8, k = 2r - 1 = 255, and d = b/r = 4.) Each

    pass of counting sort takes time (n + k) = (n + 2r) and there

    are d passes, for a total running time of (d(n + 2r )) =

    ((b/r)(n + 2r)).

  • 8/3/2019 Linear Sort

    33/37

    Bucket Sort

    Bucket sortruns in linear time when the input isdrawn from a uniform distribution.

    Like counting sort, bucket sort is fast because itassumes something about the input.

    Whereas counting sort assumes that the input consistsof integers in a small range, bucket sort assumes thatthe input is generated by a random process thatdistributes elements uniformly over the interval [0, 1).

    The idea of bucket sort is to divide the interval [0, 1)into nequal-sized subintervals, or buckets, and thendistribute the ninput numbers into the buckets.

    To produce the output, we simply sort the numbers ineach bucket and then go through the buckets in order,listing the elements in each.

  • 8/3/2019 Linear Sort

    34/37

    Example

    .06

    .43

    .48

    .37

    .91

    .44

    .98

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    .06

    .37

    .43 .44 .48

    .98

    /

    /

    /

    /

    /

    /

  • 8/3/2019 Linear Sort

    35/37

    The Code

    The bucket sort assumes that the input is an n-elementarrayA and that each elementA[i] in the array satisfies0 A[i] < 1. The code requires an auxiliary arrayB[0n- 1] of linked lists (buckets) and there is a

    mechanism for maintaining such lists.

    BUCKET-SORT (A)

    nlength[A]

    for i 1 to ndo insertA[i] into list B[A[i]]

    for i 0 to n- 1

    do sort list B[i] with insertion sort

    concatenate the lists together

  • 8/3/2019 Linear Sort

    36/37

    Proof of Bucket Sort

    Consider two elements A[i] and A[j]

    Assume A[i]A[j]

    Then, A[i] is placed into either the same bucketas A[j], or a bucket with a lower index.

    The sort of each bucket guarantees A[i] and A[j]

    are ordered correctly

  • 8/3/2019 Linear Sort

    37/37

    Of Interest(only to me)

    As long as the input has the property that thesum of the squares of the bucket sizes is linear

    in the total number of elements... bucket sort

    will run in linear time

    In other words, each bucket should get the

    square root of the number of bucket elements.