Quicksort Optimization



  • Improvements to Quicksort

    1) [Choice of pivot element]

    In partitioning, don't choose the pivot element as a[first].

    If a is almost in order, this choice leads to a very uneven split.

    We have a worst or near-worst case: Θ(n^2) running time.

    If n is large, sorting will be extremely slow.
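The quadratic behaviour is easy to observe by counting comparisons. The sketch below is an illustration only: it assumes `int` elements and a simple partition that always uses a[first] as the pivot (the class and method names are hypothetical, not taken from the slides). On already-sorted input every split is as uneven as possible, so the count is exactly n(n-1)/2.

```java
public class FirstPivotWorstCase {
    static long comparisons = 0;

    static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }

    // Partition a[first..last] around the pivot a[first]; return the pivot's final index.
    static int partitionFirst(int[] a, int first, int last) {
        int pivot = a[first];
        int store = first;                 // a[first+1..store] holds the values < pivot
        for (int i = first + 1; i <= last; i++) {
            comparisons++;
            if (a[i] < pivot) {
                store++;
                swap(a, i, store);
            }
        }
        swap(a, first, store);             // move the pivot between the two parts
        return store;
    }

    static void quicksort(int[] a, int first, int last) {
        if (first >= last) return;         // sizes 0 and 1 are already sorted
        int s = partitionFirst(a, first, last);
        quicksort(a, first, s - 1);
        quicksort(a, s + 1, last);
    }

    // Number of comparisons made when sorting the already-sorted array 0, 1, ..., n-1.
    static long countOnSortedInput(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = i;
        comparisons = 0;
        quicksort(a, 0, n - 1);
        return comparisons;
    }

    public static void main(String[] args) {
        System.out.println(countOnSortedInput(1000));   // n(n-1)/2 = 499500
    }
}
```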

    Some reasonable choices:

    a) [middle element] Choose the pivot element as a[(first+last) / 2], i.e. the middle element.

    If a is almost in order, partitioning tends to split the array into equal-sized subarrays.

    Quicksort runs a bit faster than in the expected case (n lg(n) comparisons instead of 1.39 n lg(n)).

    b) [random element] Choose the pivot element as a[rnd], rnd = pseudo-random integer in range [first,last].

    The probability of a bad case is extremely low, regardless of any non-randomness in the input.

    Drawback: The added time required by the pseudo-random number generator.

    c) [median-of-three] Choose the pivot element as median( a[first], a[last], a[(first+last)/2] ).

    In practice, a bad case appears to be very unlikely.

    Our previous analysis of quicksort no longer applies.

    We assumed the pivot element was equally likely to be the k-th smallest element of the array, k = 1, ..., n.

    With the median-of-three, the pivot element is more likely to lie closer to the median (k ≈ (n+1)/2). (This is good; we want an even split.)

    Suppose we call a split "bad" if it is worse than 90-10 (i.e., k < 0.1 n or k > 0.9 n).

    If a is randomly ordered, with (a) or (b) the probability of a bad split is 20%.

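    With (c), if n is large, a bad split requires at least two of the three sampled elements to fall in the same extreme tenth of the array, so the probability of a bad split is about 2[3(0.1)^2 (0.9) + (0.1)^3] = 6(0.1)^2 - 4(0.1)^3, or about 6%.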

    An analysis (fairly complex) shows the expected number of comparisons is about 1.19 n lg(n), instead of 1.39 n lg(n).

    Drawback: The added time required to find the median, which is insignificant if the sub-array is large.

    Good strategy: Median-of-three for large sub-arrays, middle element for smaller ones.
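The three pivot choices (a), (b) and (c) might be sketched as index-selection helpers like the ones below. This is a minimal illustration assuming `int` elements; the class name, the method names and the `LARGE` threshold are hypothetical, since the slides do not say where "large" begins.

```java
import java.util.Random;

public class PivotChoice {
    private static final Random RNG = new Random();
    static final int LARGE = 40;   // hypothetical size above which median-of-three is used

    // (a) Middle element; equals (first+last)/2 for non-negative indices, without int overflow.
    static int middleIndex(int first, int last) {
        return first + (last - first) / 2;
    }

    // (b) Pseudo-random index in the range [first, last].
    static int randomIndex(int first, int last) {
        return first + RNG.nextInt(last - first + 1);
    }

    // (c) Index of the median of a[first], a[middle], a[last].
    static int medianOfThreeIndex(int[] a, int first, int last) {
        int mid = middleIndex(first, last);
        int x = a[first], y = a[mid], z = a[last];
        if ((x <= y && y <= z) || (z <= y && y <= x)) return mid;
        if ((y <= x && x <= z) || (z <= x && x <= y)) return first;
        return last;
    }

    // Combined strategy: median-of-three for large sub-arrays, middle element for smaller ones.
    static int choose(int[] a, int first, int last) {
        return (last - first + 1 >= LARGE)
                ? medianOfThreeIndex(a, first, last)
                : middleIndex(first, last);
    }
}
```

The helpers return an index p rather than a value, so the result can be passed directly to a partition routine that takes the pivot position as a parameter.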

    2) [Eliminating the overhead of many recursive function calls to sort small sub-arrays]

    Which is faster?

    (i) quicksort

    (ii) a simple non-recursive sort (e.g., insertion sort)?

    Unless n is very small, quicksort is much faster.

    n lg(n) grows far more slowly than n^2.

    If n = 10^6, n lg(n) ≈ 2×10^7 and n^2 = 10^12.

    What if n is very small?

    Say n = 15.

    For simplicity, suppose quicksort always splits its input evenly (15 → 7,7 → 3,3,3,3).

    a) Quicksort performs (15 - 1) + 2(7 - 1) + 4(3 - 1) = 34 comparisons, and it makes 6 recursive calls to itself (even if we eliminate recursive calls to sort sub-arrays of size 0 or 1).

    b) Insertion sort performs about 15(15 - 1)/4 ≈ 53 comparisons (a bit more than quicksort), but makes no recursive calls.

    Quicksort might well be slower, due (in part) to the overhead of the recursive function calls.
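The arithmetic for n = 15 can be checked with a few lines of code. This sketch assumes, as the slide does, perfectly even splits and n - 1 comparisons per partition; it is exact only for n of the form 2^k - 1, and the names are hypothetical.

```java
public class SmallInputCounts {
    // Comparisons made by quicksort on n elements when every split is perfectly even
    // (the pivot takes one slot, each side gets (n-1)/2 elements).
    static int quicksortComparisons(int n) {
        if (n <= 1) return 0;
        int half = (n - 1) / 2;
        return (n - 1) + 2 * quicksortComparisons(half);
    }

    // Recursive calls made, not counting calls on sub-arrays of size 0 or 1.
    static int recursiveCalls(int n) {
        if (n <= 1) return 0;
        int half = (n - 1) / 2;
        int callsHere = (half >= 2) ? 2 : 0;
        return callsHere + 2 * recursiveCalls(half);
    }

    // Average-case comparison count for insertion sort, about n(n-1)/4.
    static double insertionSortAverage(int n) {
        return n * (n - 1) / 4.0;
    }

    public static void main(String[] args) {
        System.out.println(quicksortComparisons(15));   // 34
        System.out.println(recursiveCalls(15));         // 6
        System.out.println(insertionSortAverage(15));   // 52.5
    }
}
```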

    Suppose we determine experimentally (or guess) that insertion sort is faster for n < 16, and slower otherwise.

    We can speed up quicksort with this strategy:

    After partitioning, when quicksort needs to sort a sub-array, it will check the size of the sub-array.

    If the size is less than 16, it will invoke insertion sort to sort the sub-array, rather than calling itself recursively.

    // M is a constant (for example 16, the threshold assumed above). It is our estimate of how large
    // the input must be in order for quicksort to be faster than insertion sort.

    void quicksort2( Element[] a, int first, int last )
    {
        // Partition a[first]..a[last] just as before.
        Choose p with first <= p <= last;
        int splitPoint = partition( a, first, last, p );

        // Sort left subarray. If it is small, use insertion sort; otherwise quicksort2.
        if ( splitPoint - first < M )
            insertionSort( a, first, splitPoint - 1 );
        else
            quicksort2( a, first, splitPoint - 1 );

        // Sort right subarray. If it is small, use insertion sort; otherwise quicksort2.
        if ( last - splitPoint < M )
            insertionSort( a, splitPoint + 1, last );
        else
            quicksort2( a, splitPoint + 1, last );
        return;
    }
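A runnable version of this cutoff strategy might look like the sketch below. Several things are assumptions made for illustration: elements are `int`, M is fixed at 16, the pivot is the middle element, and `partition` is a simple stand-in for the routine the slides refer to as "just as before" (it is only required to return the pivot's final index). The `sort` wrapper is also an addition, so that arrays shorter than M are handled.

```java
public class Quicksort2 {
    static final int M = 16;   // assumed cutoff; the slides suggest finding it by experiment

    static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }

    // Stand-in partition: moves a[p] to its final position and returns that index.
    static int partition(int[] a, int first, int last, int p) {
        int pivot = a[p];
        swap(a, p, last);                      // park the pivot at the end
        int store = first;
        for (int i = first; i < last; i++) {
            if (a[i] < pivot) swap(a, i, store++);
        }
        swap(a, store, last);                  // pivot lands between the two parts
        return store;
    }

    // Straight insertion sort on a[first..last]; does nothing for an empty range.
    static void insertionSort(int[] a, int first, int last) {
        for (int i = first + 1; i <= last; i++) {
            int key = a[i];
            int j = i - 1;
            while (j >= first && a[j] > key) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }

    // Requires first <= last. Small sub-arrays go to insertion sort instead of recursion.
    static void quicksort2(int[] a, int first, int last) {
        int p = first + (last - first) / 2;    // middle element as pivot (one possible choice)
        int splitPoint = partition(a, first, last, p);

        if (splitPoint - first < M) insertionSort(a, first, splitPoint - 1);
        else quicksort2(a, first, splitPoint - 1);

        if (last - splitPoint < M) insertionSort(a, splitPoint + 1, last);
        else quicksort2(a, splitPoint + 1, last);
    }

    // Hypothetical entry point: whole-array sort that also covers arrays shorter than M.
    static void sort(int[] a) {
        if (a.length < M) insertionSort(a, 0, a.length - 1);
        else quicksort2(a, 0, a.length - 1);
    }
}
```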

  • We can use this strategy to speed up any divide-and-conquer algorithm.

    For quicksort, there is an even faster alternative.

    Recall insertion sort is very fast if its input is almost in order.

    // Recursive function quicksort2a should be called only by qsort, shown below.
    void quicksort2a( Element[] a, int first, int last )
    {
        // Partition a[first]..a[last] just as before.
        Choose p with first <= p <= last;
        int splitPoint = partition( a, first, last, p );

        // If left subarray is small, leave it unsorted. Otherwise, sort with quicksort2a.
        if ( splitPoint - first >= M )
            quicksort2a( a, first, splitPoint - 1 );

        // If right subarray is small, leave it unsorted. Otherwise, sort with quicksort2a.
        if ( last - splitPoint >= M )
            quicksort2a( a, splitPoint + 1, last );
        return;
    }

    // qsort(a) sorts the array a.
    void qsort( Element[] a )
    {
        n = size of a;
        if ( n >= M )
            quicksort2a( a, 0, n - 1 );

        // At this point, a is not completely sorted, but it is almost in order.
        // Straight insertion sort can complete the sort very quickly.
        insertionSort( a );
        return;
    }
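The same idea in runnable form, as a sketch under the same assumptions as before (`int` elements, M = 16, middle-element pivot, and a stand-in `partition` that returns the pivot's final index). After the quicksort phase, every element sits inside an unsorted block of fewer than M positions that already contains exactly the values belonging there, which is why the single insertion-sort pass is cheap.

```java
public class Quicksort2a {
    static final int M = 16;   // assumed cutoff

    static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }

    // Stand-in partition: moves a[p] to its final position and returns that index.
    static int partition(int[] a, int first, int last, int p) {
        int pivot = a[p];
        swap(a, p, last);
        int store = first;
        for (int i = first; i < last; i++) {
            if (a[i] < pivot) swap(a, i, store++);
        }
        swap(a, store, last);
        return store;
    }

    // Leaves sub-arrays with fewer than M elements unsorted.
    static void quicksort2a(int[] a, int first, int last) {
        int p = first + (last - first) / 2;
        int splitPoint = partition(a, first, last, p);
        if (splitPoint - first >= M) quicksort2a(a, first, splitPoint - 1);
        if (last - splitPoint >= M) quicksort2a(a, splitPoint + 1, last);
    }

    // Straight insertion sort over the whole array.
    static void insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int key = a[i];
            int j = i - 1;
            while (j >= 0 && a[j] > key) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }

    // Quicksort phase leaves a almost in order; one insertion-sort pass finishes the job.
    static void qsort(int[] a) {
        int n = a.length;
        if (n >= M) quicksort2a(a, 0, n - 1);
        insertionSort(a);
    }
}
```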

    3) [Making quicksort work almost in place]

    Does quicksort work in place (i.e., only constant extra space)?

    Partitioning works in place.

    But quicksort doesn't.

    Each time quicksort calls itself recursively, an activation record (stack frame) is pushed onto the run-time stack.

    The activation record has space for parameters, automatic local variables, temporary objects, etc.

    The activation record isn't popped until the recursive call returns.

    The extra space required by quicksort is proportional to the maximum depth of recursion (the height of the tree of recursive function calls).

    What is the maximum depth of recursion?

    About n in the worst case.

    Θ(lg(n)) in the expected case. (Not easy to prove.)
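The worst-case depth can be observed by instrumenting a plain quicksort. The sketch below is illustrative only: it assumes `int` elements and deliberately uses a[first] as the pivot on sorted input to provoke the worst case; the `depth` parameter and `maxDepth` field are additions for measurement. Only calls that actually partition (size 2 or more) are counted.

```java
public class RecursionDepth {
    static int maxDepth;

    static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }

    // Partition a[first..last] around a[first]; return the pivot's final index.
    static int partitionFirst(int[] a, int first, int last) {
        int pivot = a[first];
        int store = first;
        for (int i = first + 1; i <= last; i++) {
            if (a[i] < pivot) {
                store++;
                swap(a, i, store);
            }
        }
        swap(a, first, store);
        return store;
    }

    // Plain recursive quicksort that records the deepest level at which partitioning happens.
    static void quicksort(int[] a, int first, int last, int depth) {
        if (first >= last) return;               // sizes 0 and 1 need no work
        maxDepth = Math.max(maxDepth, depth);
        int s = partitionFirst(a, first, last);
        quicksort(a, first, s - 1, depth + 1);
        quicksort(a, s + 1, last, depth + 1);
    }

    // Maximum depth reached when sorting the already-sorted array 0, 1, ..., n-1.
    static int depthOnSortedInput(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = i;
        maxDepth = 0;
        quicksort(a, 0, n - 1, 1);
        return maxDepth;
    }

    public static void main(String[] args) {
        System.out.println(depthOnSortedInput(1000));   // 999, i.e. n - 1
    }
}
```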

    We can modify quicksort to make the maximum depth of recursion approximately lg(n) even in the worst case.

    // Recursive function quicksort3 should be called only by qsort, shown below.
    // Like quicksort2a, except that the maximum depth of recursion cannot exceed lg(n).
    void quicksort3( Element[] a, int first, int last )
    {
        // The second recursive call to quicksort is replaced by a loop.
        while ( last - first + 1 >= M )
        {
            // Partition a[first]..a[last] just as before.
            Choose p with first <= p <= last;
            int splitPoint = partition( a, first, last, p );

            // Sort the smaller subarray by a recursive call to quicksort3, unless its
            // size is less than M. Then sort the larger subarray by repeating
            // the loop, rather than calling quicksort3 recursively.
            if ( splitPoint - first < last - splitPoint )    // Left subarray is smaller.
            {
                if ( splitPoint - first >= M )
                    quicksort3( a, first, splitPoint - 1 );
                first = splitPoint + 1;
            }
            else                                              // Right subarray is smaller.
            {
                if ( last - splitPoint >= M )
                    quicksort3( a, splitPoint + 1, last );
                last = splitPoint - 1;
            }
        }
        return;
    }

    // qsort(a) sorts the array a.
    void qsort( Element[] a )
        // As with quicksort2a, except that it calls quicksort3 instead of quicksort2a.

    Each time quicksort calls itself recursively, the size of the subarray to be sorted decreases by more than a factor of 2.

    So, for a recursive call at depth k, the size of the subarray is less than n/2^k.

    If the depth of recursion exceeded lg(n), the subarray size would be less than 1.
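The depth bound can be checked empirically with a runnable sketch of quicksort3. As before, the `int` element type, M = 16, the middle-element pivot and the stand-in `partition` are assumptions; the `depth` parameter and `maxDepth` field exist only to measure recursion depth (the top-level call counts as depth 0) and would be dropped in real use.

```java
public class Quicksort3 {
    static final int M = 16;   // assumed cutoff
    static int maxDepth;       // deepest recursive call seen (instrumentation only)

    static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }

    // Stand-in partition: moves a[p] to its final position and returns that index.
    static int partition(int[] a, int first, int last, int p) {
        int pivot = a[p];
        swap(a, p, last);
        int store = first;
        for (int i = first; i < last; i++) {
            if (a[i] < pivot) swap(a, i, store++);
        }
        swap(a, store, last);
        return store;
    }

    // Recurse only on the smaller side; handle the larger side by looping.
    static void quicksort3(int[] a, int first, int last, int depth) {
        maxDepth = Math.max(maxDepth, depth);
        while (last - first + 1 >= M) {
            int p = first + (last - first) / 2;
            int splitPoint = partition(a, first, last, p);
            if (splitPoint - first < last - splitPoint) {          // left side is smaller
                if (splitPoint - first >= M) quicksort3(a, first, splitPoint - 1, depth + 1);
                first = splitPoint + 1;
            } else {                                               // right side is smaller
                if (last - splitPoint >= M) quicksort3(a, splitPoint + 1, last, depth + 1);
                last = splitPoint - 1;
            }
        }
    }

    // Straight insertion sort over the whole array.
    static void insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int key = a[i];
            int j = i - 1;
            while (j >= 0 && a[j] > key) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }

    // Quicksort phase with bounded recursion depth, then one insertion-sort pass.
    static void qsort(int[] a) {
        maxDepth = 0;
        quicksort3(a, 0, a.length - 1, 0);
        insertionSort(a);
    }
}
```

Because a recursive call is made only on the side holding fewer than half of the elements, `maxDepth` stays below lg(n) whatever the input order; a poor pivot costs time in the loop but not stack space.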