Quicksort!. A practical sorting algorithm Quicksort Very efficient Tight code Good cache...

Post on 21-Dec-2015

219 views 0 download

Tags:

Transcript of Quicksort!. A practical sorting algorithm Quicksort Very efficient Tight code Good cache...

Quicksort!Quicksort!

A practical sorting algorithm

• Quicksort Very efficient

• Tight code

• Good cache performance

• Sort in place

Easy to implement Used in older version of C++ STL sort(begin, end)

Average case performance: O(n log n) Worst case performance: O(n2)

Quicksort

• Divide & Conquer

Divide into two arrays around the last element

Recursively sort each array

QUICKSORT

Divide around this element

Quicksort

• Divide & Conquer

Divide into two arrays around the last element

Recursively sort each array

QUICKSORT

Divide around this element

Quicksort

• Divide & Conquer

Divide into two arrays around the last element

Recursively sort each array

QUICKSORT

Exchange

Quicksort

• Divide & Conquer

Divide into two arrays around the last element

Recursively sort each array

QUICKSORT QUICKSORT

Partitioning the array

PARTITION(A, p, r)

x = A[r]

i = p–1

for j = p to r-1

if A[j] <= x

then i++; exchange A[i] and A[j]

exchange A[i+1] and A[r]

return i+1

xi j

Partitioning the array

PARTITION(A, p, r)

x = A[r]

i = p–1

for j = p to r-1

if A[j] <= x

then i++; exchange A[i] and A[j]

exchange A[i+1] and A[r]

return i+1

xi j

Partitioning the array

PARTITION(A, p, r)

x = A[r]

i = p–1

for j = p to r-1

if A[j] <= x

then i++; exchange A[i] and A[j]

exchange A[i+1] and A[r]

return i+1

xi j

Partitioning the array

PARTITION(A, p, r)

x = A[r]

i = p–1

for j = p to r-1

if A[j] <= x

then i++; exchange A[i] and A[j]

exchange A[i+1] and A[r]

return i+1

xi j

Partitioning the array

PARTITION(A, p, r)

x = A[r]

i = p–1

for j = p to r-1

if A[j] <= x

then i++; exchange A[i] and A[j]

exchange A[i+1] and A[r]

return i+1

xi j

Putting it together: Quicksort

QUICKSORT(A, p, r)

if p < r

then q = PARTITION(A, p, r)

QUICKSORT(A, p, q-1)

QUICKSORT(A, q+1, r)

QUICKSORT QUICKSORT

QUICKSORT

rp q

Putting it together: Quicksort

QUICKSORT(A, p, r)

if p < r

then q = PARTITION(A, p, r)

QUICKSORT(A, p, q-1)

QUICKSORT(A, q+1, r)

PARTITION(A, p, r)

x = A[r]; i = p–1

for j = p to r-1

if A[j] <= x

then i++; exchange A[i] and A[j]

exchange A[i+1] and A[r]

return i+1

• To sort an array A of length n:

QUICKSORT(A, 1, n)

Running Time

QUICKSORT(A, p, r)

if p < r

then q = PARTITION(A, p, r)

QUICKSORT(A, p, q-1)

QUICKSORT(A, q+1, r)

PARTITION(A, p, r)

x = A[r]; i = p–1

for j = p to r-1

if A[j] <= x

then i++; exchange A[i] and A[j]

exchange A[i+1] and A[r]

return i+1

• PARTITION(A, p, r) Time O(r-p) Outside loop:

• ~5 operations

• 0 comparisons

Inside loop: • r-p comparisons

• c(r-p) total operations, for some c ~= 5

Running Time

QUICKSORT(A, p, r)

if p < r

then q = PARTITION(A, p, r)

QUICKSORT(A, p, q-1)

QUICKSORT(A, q+1, r)

PARTITION(A, p, r)

x = A[r]; i = p–1

for j = p to r-1

if A[j] <= x

then i++; exchange A[i] and A[j]

exchange A[i+1] and A[r]

return i+1

• QUICKSORT (A, 1, n)

• Let T(n) be the worst-case running time, for any n

• Then, T(n) ≤ max{1 ≤ q ≤ n-1} ( T(q-1) + T(n-q) + (n) ) (why?)

Worst-Case Running Time

• Sort A = [n, n-1, n-2, …, 3, 2, 1]

1. PARTITION(A, 1, n) A = [1, n-1, n-2, …, 3, 2, n]2. PARTITION(A, 2, n) A = [1, n-1, n-2, …, 3, 2, n]3. PARTITION(A, 2, n-1) A = [1, 2, n-2, …, 3, n-1, n]4. PARTITION(A, 3, n-1) A = [1, 2, n-2, …, 3, n-1, n]5. PARTITION(A, 3, n-2) A = [1, 2, 3, …, n-2, n-1, n]6. PARTITION(A, 4, n-2)7. PARTITION(A, 4, n-3)

…n. PARTITION(A, n/2, n/2+1)

Total comparisons: (n-1) + (n-2) + … + 1 = (n2)

Worst-Case Running Time

T(n) ≤ max{1 ≤ q ≤ n-1} ( T(q-1) + T(n – q) + (n) )

Substitution Method:• Guess T(n) < cn2

T(n) ≤ max{1 ≤ q ≤ n} ( c(q – 1)2 + c(n – q)2 ) + dn

= c×max{1 ≤ q ≤ n} ((q – 1)2 + (n – q)2 ) + dn

This is max for q = 1 (or, q = n)

= c(n2 – 2n + 1) + dn

= cn2 – c(2n – 1) + dn

≤ cn2, if we pick c such that c(2n – 1) > dn

A randomized version of Quicksort

• Pick a random number in the array as a pivot

RANDOMIZED-PARTITION(A, p, r)

i = RANDOM(p, r)

exchange A[r] and A[i]

return PARTITION(A, p, r)

RANDOMIZED-QUICKSORT(A, p, r)

if p < r

then q = RANDOMIZED-PARTITION(A, p, r)

RANDOMIZED-QUICKSORT(A, p, q – 1)

RANDOMIZED-QUICKSORT(A, q + 1, r)

• What is the expected running time?

Expected running time of Quicksort

• Observe that the running time is a constant times the number of comparison

Lemma

Let X be the number of comparisons (A[j] <= x) performed by PARTITION.

Then, the running time of QUICKSORT is (n+X)

Proof

There are n calls to PARTITION.

Each call does a constant amount of work, plus the for loop.

Each iteration of the for loop executes the comparison A[j] <= x.

Therefore, running time of PARTITION is a (# comparisons)

A quick review of probability

Define a sample space S, which is a set of events

S = { A1, A2, …. }

Each event A has a probability P(A), such that

1. P(A) ≥ 0

2. P(S) = P(A1) + P(A2) + … = 1

3. P(A or B) = P(A) + P(B)

More generally, for any subset T = { B1, B2, … } of S,

P(T) = P(B1) + P(B2) + …

A quick review of probability

• Example: Let S be all sequences of 2 coin tosses

S = {HH, HT, TH, TT}

• Let each head/tail occur with 50% = 0.5 probability, independent of the other coin flips

• Then P(HH) = P(HT) = P(TH) = P(TT) = 0.25

• Let T = “no two tails occurring one after another”

= { HH, HT, TH }

• Then, P(T) = P(HH) + P(TH) + P(TT) = 0.75

A quick review of probability

• A random variable X is a function from S to real numbers

X: S RR

Example: X = “$3 for every heads”

X: {HH, HT, TH, TT} RR

X(HH) = 6;

X(HT) = X(TH) = 3;

X(TT) = 0

A quick review of probability

• Given two random variables X and Y,

• The joint probability density function f(x, y) is f(x, y) = P(X = x and Y = y)

• Then,

P(X = x0) = y P(X = x0 and Y = y)

P(Y = yo) = x P(X = x and Y = y0)

• The conditional probability P(X = x | Y = y) = P(X = x and Y = y) / P(Y = y)

A quick review of probability

• Two random variables X and Y are independent, if for all values x and y,

P(X = x and Y = y) = P(X = x) P(Y = y)

Example:

X = 1, if first coin toss is heads; 0 otherwise

Y = 1, if second coin toss is heads; 0 otherwise

Easy to check that X and Y are independent

A quick review of probability

• The expected value E(X) of a random variable X, is

E(X) = x x P(X = x)

• Going back to the coin tosses, estimate E(X)

E(X) = 6 P(X = 6) + 3 P(X = 3) + 0 P(X = 0)

= 6×0.25 + 3×0.5 + 0×0.25

= 1.5 + 1.5 + 0 = 3

A quick review of probability

Linearity of expectation:E(X + Y) = E(X) + E(Y)

For any function g(x),

E(g(X)) = x g(x) P(X = x)

Assume X and Y are independent:

E[XY] = x y xyP(X = x and Y = y)

= x y P(X = x) P(Y = y)

= x P(X = x) y P(Y = y)

= E(X) E(Y)

A quick review of probability

Define indicator variable I(A), where A is an event

1, if A occurs I(A) =

0, otherwise

Lemma: Given a sample space S, and an event A,

E[ I(A) ] = P(A)

Proof: E[ I(A) ] = 1×P(A) + 0×P(S – A) = P(A)

Back to Quicksort

Lemma

Let X be the number of comparisons (A[j] <= x) performed by PARTITION.

Then, the running time of QUICKSORT is O(n+X)

Define indicator random variables:

Xij = I( zi is compared to zj during the course of the algorithm )

Then, the total number of comparisons performed is:

X = i=1…n-1 j=i+1…n Xij

Expected running time of Quicksort

• Running time = X = i=1…n-1 j=i+1…n Xij

• Expected running time = E[X]

E[X] = E[ i=1…n-1 j=i+1…n Xij ]

= i=1…n-1 j=i+1…n E[ Xij ]

= i=1…n-1 j=i+1…n P( zi is compared to zj )

• We just need to estimate P( zi is compared to zj )

Expected running time of Quicksort

During partition of zi……zj

• Once a pivot p is selected

p is compared to all elements except itself

no element < p is ever compared to an element > p

smaller than pivot larger than pivot

Expected running time of Quicksort

• Denote by Zij = { zi,……,zj }

In order to compare zi and zj

• First element chosen as pivot within Zij should be zi, or zj

P( zi is compared to zj ) =

= P( zi or zj is first pivot from Zij )

= P( zi is first pivot from Zij ) + P( zj is first pivot from Zij )

= 1/(j – i +1) + 1/(j – i + 1)= 2/(j – i + 1)

Expected running time of Quicksort

Now we are ready to compute E[X]:

E[X] = i=1…n-1 j=i+1…n 2/(j – i + 1)

Let k = j – i

E[X] = i=1…n-1 j=i+1…n 2/(k + 1)

< i=1…n j=1…n 2/k

= i=1…n (lg n + O(1)) by CLRS A.7

= O(n lg n )

Selection: Find the ith number

• The ith order statistics is the ith smallest element

1

2

34 5 6

7 8 9 1011

12

1314 15