Heapsort quick sort


Transcript of Heapsort quick sort

Page 1: Heapsort quick sort

Algorithms

Sandeep Kumar Poonia, Head Of Dept. CS/IT

B.E., M.Tech., UGC-NET

LM-IAENG, LM-IACSIT,LM-CSTA, LM-AIRCC, LM-SCIEI, AM-UACEE

Page 2: Heapsort quick sort


Algorithms

Introduction to heapsort

Quicksort

Page 3: Heapsort quick sort

Recap


Asymptotic notation

Merge Sort

Solving Recurrences

The Master Theorem

Page 4: Heapsort quick sort

Sorting Revisited

So far we’ve talked about two algorithms to sort an array of numbers

What is the advantage of merge sort?

What is the advantage of insertion sort?

Next on the agenda: Heapsort

Combines advantages of both previous algorithms

Sort in Place – Like insertion sort

O(n lg n) Worst case – Like Merge sort

Page 5: Heapsort quick sort


A heap can be seen as a complete binary tree:

What makes a binary tree complete?

Is the example above complete?

Heaps

[Figure: the example heap drawn as a binary tree – root 16; second level 14, 10; third level 8, 7, 9, 3; bottom level 2, 4, 1]

Page 6: Heapsort quick sort


A heap can be seen as a complete binary tree:

We call them “nearly complete” binary trees; can think of unfilled slots as null pointers

Heaps

[Figure: the same heap, with the unfilled slots on the last level drawn as null children]

Page 7: Heapsort quick sort


Heaps

In practice, heaps are usually implemented as arrays:

[Figure: the same heap drawn as a tree and stored in an array]

A = [16 14 10 8 7 9 3 2 4 1]

Page 8: Heapsort quick sort


Heaps

To represent a complete binary tree as an array:

The root node is A[1]

Node i is A[i]

The parent of node i is A[i/2] (note: integer divide)

The left child of node i is A[2i]

The right child of node i is A[2i + 1]

[Figure: the same heap and its array layout]

A = [16 14 10 8 7 9 3 2 4 1]

Page 9: Heapsort quick sort


Referencing Heap Elements

So…

Parent(i) { return i/2; }

Left(i) { return 2*i; }

Right(i) { return 2*i + 1; }

An aside: How would you implement this most efficiently?
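One common answer (an assumption on my part, not stated on the slide): because the indices are 1-based and the children sit at 2i and 2i+1, the divides and multiplies can be done with shifts. A minimal C sketch:

/* 1-based heap indexing via shifts; a sketch, not code from the slides. */
static inline int Parent(int i) { return i >> 1;        /* i / 2    */ }
static inline int Left(int i)   { return i << 1;        /* 2 * i    */ }
static inline int Right(int i)  { return (i << 1) | 1;  /* 2*i + 1  */ }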

Page 10: Heapsort quick sort


The Heap Property

Heaps also satisfy the heap property:

A[Parent(i)] ≥ A[i] for all nodes i > 1

In other words, the value of a node is at most the value of its parent

Where is the largest element in a heap stored?

Definitions:

The height of a node in the tree = the number of edges on the longest downward path to a leaf

The height of a tree = the height of its root

Page 11: Heapsort quick sort

Heap Height

Q: What are the minimum and maximum numbers of elements in a heap of height h?

Ans: Since a heap is an almost-complete binary tree (complete at all levels except possibly the lowest), it has at most 2^(h+1) − 1 elements (if it is complete) and at least 2^h − 1 + 1 = 2^h elements (if the lowest level has just 1 element and the other levels are complete).


Page 12: Heapsort quick sort

Heap Height

Q: What is the height of an n-element heap? Why?

Ans: h = ⌊lg n⌋ = Θ(lg n).

Given an n-element heap of height h, we know that 2^h ≤ n ≤ 2^(h+1) − 1.

Thus h ≤ lg n < h + 1.

Since h is an integer, h = ⌊lg n⌋.

This is nice: basic heap operations take at most time proportional to the height of the heap.

Page 13: Heapsort quick sort


Heap Operations: Heapify()

Heapify(): maintain the heap property

Given: a node i in the heap with children l and r

Given: two subtrees rooted at l and r, assumed to be heaps

Problem: The subtree rooted at i may violate the heap property.

Action: let the value of the parent node “float down” so the subtree at i satisfies the heap property

What do you suppose will be the basic operation between i, l, and r?

Page 14: Heapsort quick sort

Procedure MaxHeapify

MaxHeapify(A, i)
1. l ← Left(i)
2. r ← Right(i)
3. if l ≤ heap-size[A] and A[l] > A[i]
4.    then largest ← l
5.    else largest ← i
6. if r ≤ heap-size[A] and A[r] > A[largest]
7.    then largest ← r
8. if largest ≠ i
9.    then exchange A[i] ↔ A[largest]
10.      MaxHeapify(A, largest)

Assumption: the subtrees rooted at Left(i) and Right(i) are max-heaps.
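A C sketch of the procedure above (my own transcription, not code from the slides), using 1-based indexing with A[0] unused:

void max_heapify(int A[], int heap_size, int i)
{
    int l = 2 * i;                      /* Left(i)  */
    int r = 2 * i + 1;                  /* Right(i) */
    int largest = i;

    if (l <= heap_size && A[l] > A[largest])
        largest = l;
    if (r <= heap_size && A[r] > A[largest])
        largest = r;

    if (largest != i) {                 /* let the value at i float down one level */
        int tmp = A[i];
        A[i] = A[largest];
        A[largest] = tmp;
        max_heapify(A, heap_size, largest);
    }
}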

Page 15: Heapsort quick sort

Running Time for MaxHeapify

MaxHeapify(A, i)
1. l ← Left(i)
2. r ← Right(i)
3. if l ≤ heap-size[A] and A[l] > A[i]
4.    then largest ← l
5.    else largest ← i
6. if r ≤ heap-size[A] and A[r] > A[largest]
7.    then largest ← r
8. if largest ≠ i
9.    then exchange A[i] ↔ A[largest]
10.      MaxHeapify(A, largest)

Time to fix node i and its children = Θ(1)

PLUS

Time to fix the subtree rooted at one of i’s children = T(size of subtree at largest)

Page 16: Heapsort quick sort

Running Time for MaxHeapify(A, n)

T(n) = T(largest) + Θ(1)

largest ≤ 2n/3 (worst case occurs when the last row of the tree is exactly half full)

T(n) ≤ T(2n/3) + Θ(1)  ⇒  T(n) = O(lg n)

Alternately, MaxHeapify takes O(h) where h is the height of the node where MaxHeapify is applied

Page 17: Heapsort quick sort

Building a heap

Use MaxHeapify to convert an array A into a max-heap.

Call MaxHeapify on each element in a bottom-up manner.

BuildMaxHeap(A)
1. heap-size[A] ← length[A]
2. for i ← ⌊length[A]/2⌋ downto 1
3.    do MaxHeapify(A, i)
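A matching C sketch (again my transcription; it reuses the max_heapify function sketched after the MaxHeapify slide):

/* Body as sketched earlier for MaxHeapify. */
void max_heapify(int A[], int heap_size, int i);

/* Convert A[1..n] into a max-heap.  Nodes n/2+1 .. n are leaves,
 * i.e. already one-element heaps, so start fixing at node n/2. */
void build_max_heap(int A[], int n)
{
    for (int i = n / 2; i >= 1; i--)
        max_heapify(A, n, i);
}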

Page 18: Heapsort quick sort

BuildMaxHeap – Example

Input Array: 24 21 23 22 36 29 30 34 28 27

Initial Heap (not a max-heap yet):

[Figure: the input array drawn as a complete binary tree – root 24; second level 21, 23; third level 22, 36, 29, 30; fourth level 34, 28, 27]

Page 19: Heapsort quick sort

BuildMaxHeap – Example

MaxHeapify is called on the internal nodes, from i = ⌊10/2⌋ = 5 down to i = 1:

MaxHeapify(5): 36 already exceeds its only child 27 – no change

MaxHeapify(4): 22 is exchanged with its larger child 34

MaxHeapify(3): 23 is exchanged with its larger child 30

MaxHeapify(2): 21 is exchanged with 36, then with 27, floating down to a leaf

MaxHeapify(1): 24 is exchanged with 36, then 34, then 28, floating down

Page 20: Heapsort quick sort

Correctness of BuildMaxHeap

Loop Invariant: At the start of each iteration of the for loop, each node i+1, i+2, …, n is the root of a max-heap.

Initialization:

Before the first iteration, i = ⌊n/2⌋

Nodes ⌊n/2⌋+1, ⌊n/2⌋+2, …, n are leaves and hence roots of max-heaps.

Maintenance:

By LI, subtrees at children of node i are max heaps.

Hence, MaxHeapify(i) renders node i a max heap root (while preserving the max heap root property of higher-numbered nodes).

Decrementing i reestablishes the loop invariant for the next iteration.

Page 21: Heapsort quick sort

Running Time of BuildMaxHeap

Loose upper bound:

Cost of a MaxHeapify call × No. of calls to MaxHeapify = O(lg n) × O(n) = O(n lg n)

Tighter bound:

Cost of a call to MaxHeapify at a node depends on the height, h, of the node – O(h).

The height of most nodes is much smaller than lg n.

Height of nodes h ranges from 0 to ⌊lg n⌋.

No. of nodes of height h is at most ⌈n/2^(h+1)⌉.
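Combining the two facts above gives the O(n) bound; the slide stops short of the closing summation, so here is a sketch of the standard calculation (my addition), using the fact that the series Σ h/2^h converges to 2:

\sum_{h=0}^{\lfloor \lg n \rfloor} \left\lceil \frac{n}{2^{h+1}} \right\rceil O(h)
  \;=\; O\!\left( n \sum_{h=0}^{\lfloor \lg n \rfloor} \frac{h}{2^{h}} \right)
  \;=\; O\!\left( n \sum_{h=0}^{\infty} \frac{h}{2^{h}} \right)
  \;=\; O(n).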

Page 22: Heapsort quick sort


Heapsort

Sort by maintaining the as yet unsorted elements as a max-heap.

Start by building a max-heap on all elements in A.

Maximum element is in the root, A[1].

Move the maximum element to its correct final position.

Exchange A[1] with A[n].

Discard A[n] – it is now sorted.

Decrement heap-size[A].

Restore the max-heap property on A[1..n–1].

Call MaxHeapify(A, 1).

Repeat until heap-size[A] is reduced to 2.

Page 23: Heapsort quick sort

Heapsort(A)

HeapSort(A)
1. Build-Max-Heap(A)
2. for i ← length[A] downto 2
3.    do exchange A[1] ↔ A[i]
4.       heap-size[A] ← heap-size[A] − 1
5.       MaxHeapify(A, 1)
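Putting the pieces together, a C sketch of the pseudocode above (my transcription; it runs when compiled together with the max_heapify and build_max_heap sketches from the earlier slides):

#include <stdio.h>

void max_heapify(int A[], int heap_size, int i);   /* sketched earlier */
void build_max_heap(int A[], int n);               /* sketched earlier */

void heap_sort(int A[], int n)
{
    build_max_heap(A, n);
    int heap_size = n;
    for (int i = n; i >= 2; i--) {
        int tmp = A[1]; A[1] = A[i]; A[i] = tmp;   /* exchange A[1] <-> A[i]        */
        heap_size--;                               /* A[i..n] is now sorted         */
        max_heapify(A, heap_size, 1);              /* restore the heap on A[1..i-1] */
    }
}

int main(void)
{
    int A[] = {0, 16, 4, 10, 14, 7, 9, 3, 2, 8, 1};   /* A[0] unused, n = 10 */
    heap_sort(A, 10);
    for (int i = 1; i <= 10; i++)
        printf("%d ", A[i]);                          /* prints 1 2 3 4 7 8 9 10 14 16 */
    printf("\n");
    return 0;
}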

Page 24: Heapsort quick sort

Heapify() Example

MaxHeapify(A, 2) is traced on the array below; A[2] = 4 violates the max-heap property.

[Figure: the array drawn as a tree – root 16; second level 4, 10; third level 14, 7, 9, 3; fourth level 2, 8, 1]

A = [16 4 10 14 7 9 3 2 8 1]

Page 25: Heapsort quick sort

Heapify() Example

Node i = 2 holds 4; its children hold 14 and 7, so the largest of the three is 14 (node 4).

A = [16 4 10 14 7 9 3 2 8 1]

Page 26: Heapsort quick sort

Heapify() Example

Exchange A[2] = 4 with A[4] = 14.

A = [16 4 10 14 7 9 3 2 8 1]

Page 27: Heapsort quick sort

Heapify() Example

After the exchange:

A = [16 14 10 4 7 9 3 2 8 1]

Page 28: Heapsort quick sort

Heapify() Example

MaxHeapify recurses at node 4, which now holds 4; its children hold 2 and 8, so the largest is 8 (node 9).

A = [16 14 10 4 7 9 3 2 8 1]

Page 29: Heapsort quick sort

Heapify() Example

Exchange A[4] = 4 with A[9] = 8.

A = [16 14 10 4 7 9 3 2 8 1]

Page 30: Heapsort quick sort

Heapify() Example

After the exchange:

A = [16 14 10 8 7 9 3 2 4 1]

Page 31: Heapsort quick sort

Heapify() Example

The value 4 is now at node 9, a leaf, so the recursion stops.

A = [16 14 10 8 7 9 3 2 4 1]

Page 32: Heapsort quick sort

Heapify() Example

Final state – the tree is a max-heap again:

A = [16 14 10 8 7 9 3 2 4 1]

Page 33: Heapsort quick sort

Algorithm Analysis

In-place

Not Stable

Build-Max-Heap takes O(n) and each of the n−1 calls to Max-Heapify takes time O(lg n).

Therefore, T(n) = O(n lg n)

HeapSort(A)
1. Build-Max-Heap(A)
2. for i ← length[A] downto 2
3.    do exchange A[1] ↔ A[i]
4.       heap-size[A] ← heap-size[A] − 1
5.       MaxHeapify(A, 1)

Page 34: Heapsort quick sort


Heap Procedures for Sorting

MaxHeapify O(lg n)

BuildMaxHeap O(n)

HeapSort O(n lg n)

Page 35: Heapsort quick sort


Priority Queue

Popular & important application of heaps.

Max and min priority queues.

Maintains a dynamic set S of elements.

Each set element has a key – an associated value.

Goal is to support insertion and extraction efficiently.

Applications:

Ready list of processes in operating systems by their priorities – the list is highly dynamic

In event-driven simulators to maintain the list of events to be simulated in order of their time of occurrence.

Page 36: Heapsort quick sort

Basic Operations

Operations on a max-priority queue:

Insert(S, x) - inserts the element x into the set S, i.e. S ← S ∪ {x}.

Maximum(S) - returns the element of S with the largest key.

Extract-Max(S) - removes and returns the element of S with the largest key.

Increase-Key(S, x, k) – increases the value of element x’s key to the new value k.

Min-priority queue supports Insert, Minimum, Extract-Min, and Decrease-Key.

Heap gives a good compromise between fast insertion but slow extraction and vice versa.

Page 37: Heapsort quick sort

Heap Property (Max and Min)

Max-Heap: For every node excluding the root, the value is at most that of its parent: A[parent[i]] ≥ A[i]

Largest element is stored at the root.

In any subtree, no values are larger than the value stored at the subtree root.

Min-Heap: For every node excluding the root, the value is at least that of its parent: A[parent[i]] ≤ A[i]

Smallest element is stored at the root.

In any subtree, no values are smaller than the value stored at the subtree root.

Page 38: Heapsort quick sort

Heap-Extract-Max(A)

Heap-Extract-Max(A, n)
1. if n < 1
2.    then error “heap underflow”
3. max ← A[1]
4. A[1] ← A[n]
5. n ← n − 1
6. MaxHeapify(A, 1)
7. return max

Running time: Dominated by the running time of MaxHeapify = O(lg n)

Implements the Extract-Max operation.
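A C sketch of the same operation (my transcription; error handling is simplified to a sentinel return, and max_heapify is the sketch from the heapsort slides):

#include <stdio.h>

void max_heapify(int A[], int heap_size, int i);   /* sketched earlier */

/* Remove and return the largest element of the max-heap A[1..*heap_size]. */
int heap_extract_max(int A[], int *heap_size)
{
    if (*heap_size < 1) {
        fprintf(stderr, "heap underflow\n");
        return -1;                     /* sentinel; a real version might abort */
    }
    int max = A[1];
    A[1] = A[*heap_size];              /* move the last leaf to the root */
    (*heap_size)--;
    max_heapify(A, *heap_size, 1);     /* O(lg n): let the new root float down */
    return max;
}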

Page 39: Heapsort quick sort

Heap-Insert(A, key)

Heap-Insert(A, key)
1. heap-size[A] ← heap-size[A] + 1
2. i ← heap-size[A]
3. while i > 1 and A[Parent(i)] < key
4.    do A[i] ← A[Parent(i)]
5.       i ← Parent(i)
6. A[i] ← key

Running time is O(lg n)

The path traced from the new leaf to the root has length O(lg n)
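The same procedure in C (my transcription); the while loop walks the new key's slot up the O(lg n)-long path toward the root:

/* Insert key into the max-heap A[1..*heap_size]; the caller must ensure
 * the array has room for one more element. */
void heap_insert(int A[], int *heap_size, int key)
{
    (*heap_size)++;
    int i = *heap_size;                /* start at the new leaf          */
    while (i > 1 && A[i / 2] < key) {  /* A[Parent(i)] < key             */
        A[i] = A[i / 2];               /* pull the smaller parent down   */
        i = i / 2;
    }
    A[i] = key;
}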

Page 40: Heapsort quick sort

Heap-Increase-Key(A, i, key)

Heap-Increase-Key(A, i, key)
1. if key < A[i]
2.    then error “new key is smaller than the current key”
3. A[i] ← key
4. while i > 1 and A[Parent[i]] < A[i]
5.    do exchange A[i] ↔ A[Parent[i]]
6.       i ← Parent[i]

Heap-Insert(A, key)
1. heap-size[A] ← heap-size[A] + 1
2. A[heap-size[A]] ← −∞
3. Heap-Increase-Key(A, heap-size[A], key)

Page 41: Heapsort quick sort

Quicksort

Sorts in place

Sorts O(n lg n) in the average case

Sorts O(n²) in the worst case

So why would people use it instead of merge sort?

Page 42: Heapsort quick sort

Quicksort

Another divide-and-conquer algorithm

The array A[p..r] is partitioned into two non-empty subarrays A[p..q] and A[q+1..r]

Invariant: All elements in A[p..q] are less than all elements in A[q+1..r]

The subarrays are recursively sorted by calls to quicksort

Unlike merge sort, no combining step: two subarrays form an already-sorted array

Page 43: Heapsort quick sort

Quicksort Code

Quicksort(A, p, r)
{
    if (p < r)
    {
        q = Partition(A, p, r);
        Quicksort(A, p, q-1);
        Quicksort(A, q+1, r);
    }
}

Page 44: Heapsort quick sort


Partition

Clearly, all the action takes place in the partition() function

Rearranges the subarray in place

End result:

Two subarrays

All values in first subarray ≤ all values in second

Returns the index of the “pivot” element separating the two subarrays

How do you suppose we implement this function?

Page 45: Heapsort quick sort


Partition In Words

Partition(A, p, r):

Select an element to act as the “pivot” (which?)

Grow two regions, A[p..i] and A[j..r]

All elements in A[p..i] <= pivot

All elements in A[j..r] >= pivot

Increment i until A[i] >= pivot

Decrement j until A[j] <= pivot

Swap A[i] and A[j]

Repeat until i >= j

Return j

Page 46: Heapsort quick sort


Partition Code

What is the running time of partition()?
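The partition code on this slide did not survive the transcript. A minimal sketch (mine, not the slide's): a Lomuto-style partition that pivots on A[r], which matches the recursive calls Quicksort(A, p, q-1) / Quicksort(A, q+1, r) on the earlier code slide; the in-words description on the previous slide corresponds to Hoare's two-index scheme instead. Either way, partition() touches each element of A[p..r] a constant number of times, so it runs in Θ(n).

int Partition(int A[], int p, int r)
{
    int pivot = A[r];                  /* pivot choice: last element (an assumption) */
    int i = p - 1;                     /* A[p..i] holds elements <= pivot            */
    for (int j = p; j < r; j++) {
        if (A[j] <= pivot) {
            i++;
            int tmp = A[i]; A[i] = A[j]; A[j] = tmp;
        }
    }
    int tmp = A[i + 1]; A[i + 1] = A[r]; A[r] = tmp;   /* put the pivot in its place */
    return i + 1;                      /* q: final index of the pivot                */
}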

Page 47: Heapsort quick sort


Review: Analyzing Quicksort

What will be the worst case for the algorithm?

Partition is always unbalanced

What will be the best case for the algorithm?

Partition is balanced

Which is more likely?

The latter, by far, except...

Will any particular input elicit the worst case?

Yes: Already-sorted input

Page 48: Heapsort quick sort

Review: Analyzing Quicksort

In the worst case: one subarray has 0 elements and the other has n−1 elements

T(1) = Θ(1)

T(n) = T(n − 1) + Θ(n)

Works out to

T(n) = Θ(n²)

Page 49: Heapsort quick sort

Review: Analyzing Quicksort

In the best case: each subarray has ≤ n/2 elements

T(n) = 2T(n/2) + Θ(n)

What does this work out to?

T(n) = Θ(n lg n)

Page 50: Heapsort quick sort

Review: Analyzing Quicksort (Average Case)

Intuitively, a real-life run of quicksort will produce a mix of “bad” and “good” splits

Randomly distributed among the recursion tree

Pretend for intuition that they alternate between best-case (n/2 : n/2) and worst-case (n−1 : 1)

What happens if we bad-split the root node, then good-split the resulting size (n−1) node?

Page 51: Heapsort quick sort

Review: Analyzing Quicksort (Average Case)

Intuitively, a real-life run of quicksort will produce a mix of “bad” and “good” splits

Randomly distributed among the recursion tree

Pretend for intuition that they alternate between best-case (n/2 : n/2) and worst-case (n−1 : 1)

What happens if we bad-split the root node, then good-split the resulting size (n−1) node?

We end up with three subarrays, size 1, (n-1)/2, (n-1)/2

Combined cost of splits = n + n -1 = 2n -1 = O(n)

No worse than if we had good-split the root node!

Page 52: Heapsort quick sort

Review: Analyzing Quicksort (Average Case)

Intuitively, the O(n) cost of a bad split (or 2 or 3 bad splits) can be absorbed into the O(n) cost of each good split

Thus running time of alternating bad and good splits is still O(n lg n), with slightly higher constants

Page 53: Heapsort quick sort

Analyzing Quicksort: Average Case

For simplicity, assume:

All inputs distinct (no repeats)

Slightly different partition() procedure

partition around a random element, which is not included in subarrays

all splits (0:n-1, 1:n-2, 2:n-3, … , n-1:0) equally likely

What is the probability of a particular split happening?

Answer: 1/n

Page 54: Heapsort quick sort

Analyzing Quicksort: Average Case

So partition generates splits

(0 : n−1, 1 : n−2, 2 : n−3, … , n−2 : 1, n−1 : 0)

each with probability 1/n

If T(n) is the expected running time,

T(n) = (1/n) Σ_{k=0}^{n−1} [ T(k) + T(n−k−1) ] + Θ(n)

What is each term under the summation for?

What is the Θ(n) term for?

Page 55: Heapsort quick sort

Analyzing Quicksort: Average Case

So…

T(n) = (1/n) Σ_{k=0}^{n−1} [ T(k) + T(n−k−1) ] + Θ(n)
     = (2/n) Σ_{k=0}^{n−1} T(k) + Θ(n)

Write it on the board

Page 56: Heapsort quick sort

Analyzing Quicksort: Average Case

We can solve this recurrence using the substitution method

Guess the answer

Assume that the inductive hypothesis holds

Substitute it in for some value < n

Prove that it follows for n

Page 57: Heapsort quick sort

Analyzing Quicksort: Average Case

We can solve this recurrence using the substitution method

Guess the answer

What’s the answer?

Assume that the inductive hypothesis holds

Substitute it in for some value < n

Prove that it follows for n

Page 58: Heapsort quick sort

Analyzing Quicksort: Average Case

We can solve this recurrence using the dreaded substitution method

Guess the answer

T(n) = O(n lg n)

Assume that the inductive hypothesis holds

Substitute it in for some value < n

Prove that it follows for n

Page 59: Heapsort quick sort

Analyzing Quicksort: Average Case

We can solve this recurrence using the dreaded substitution method

Guess the answer

T(n) = O(n lg n)

Assume that the inductive hypothesis holds

What’s the inductive hypothesis?

Substitute it in for some value < n

Prove that it follows for n

Page 60: Heapsort quick sort

Analyzing Quicksort: Average Case

We can solve this recurrence using the dreaded substitution method

Guess the answer

T(n) = O(n lg n)

Assume that the inductive hypothesis holds

T(n) ≤ an lg n + b for some constants a and b

Substitute it in for some value < n

Prove that it follows for n

Page 61: Heapsort quick sort

Analyzing Quicksort: Average Case

We can solve this recurrence using the dreaded substitution method

Guess the answer

T(n) = O(n lg n)

Assume that the inductive hypothesis holds

T(n) ≤ an lg n + b for some constants a and b

Substitute it in for some value < n

What value?

Prove that it follows for n

Page 62: Heapsort quick sort

Analyzing Quicksort: Average Case

We can solve this recurrence using the dreaded substitution method

Guess the answer

T(n) = O(n lg n)

Assume that the inductive hypothesis holds

T(n) ≤ an lg n + b for some constants a and b

Substitute it in for some value < n

The value k in the recurrence

Prove that it follows for n

Page 63: Heapsort quick sort

Analyzing Quicksort: Average Case

We can solve this recurrence using the dreaded substitution method

Guess the answer

T(n) = O(n lg n)

Assume that the inductive hypothesis holds

T(n) ≤ an lg n + b for some constants a and b

Substitute it in for some value < n

The value k in the recurrence

Prove that it follows for n

Grind through it…

Page 64: Heapsort quick sort

Analyzing Quicksort: Average Case

Note: leaving the same recurrence as the book

T(n) = (2/n) Σ_{k=0}^{n−1} T(k) + Θ(n)                        [the recurrence to be solved]

     ≤ (2/n) Σ_{k=0}^{n−1} (a k lg k + b) + Θ(n)               [plug in the inductive hypothesis]

     = (2/n) [ b + Σ_{k=1}^{n−1} (a k lg k + b) ] + Θ(n)       [expand out the k = 0 case]

     = (2/n) Σ_{k=1}^{n−1} (a k lg k + b) + 2b/n + Θ(n)

     = (2/n) Σ_{k=1}^{n−1} (a k lg k + b) + Θ(n)               [2b/n is just a constant, so fold it into Θ(n)]

Page 65: Heapsort quick sort

Analyzing Quicksort: Average Case

T(n) = (2/n) Σ_{k=1}^{n−1} (a k lg k + b) + Θ(n)                          [the recurrence to be solved]

     = (2/n) Σ_{k=1}^{n−1} a k lg k + (2/n) Σ_{k=1}^{n−1} b + Θ(n)         [distribute the summation]

     = (2a/n) Σ_{k=1}^{n−1} k lg k + (2b/n)(n−1) + Θ(n)                    [evaluate the summation: b + b + … + b = b(n−1)]

     ≤ (2a/n) Σ_{k=1}^{n−1} k lg k + 2b + Θ(n)                             [since n−1 < n, 2b(n−1)/n < 2b]

This summation gets its own set of slides later

Page 66: Heapsort quick sort

Analyzing Quicksort: Average Case

T(n) ≤ (2a/n) Σ_{k=1}^{n−1} k lg k + 2b + Θ(n)                   [the recurrence to be solved]

     ≤ (2a/n) ( (1/2) n² lg n − (1/8) n² ) + 2b + Θ(n)            [we’ll prove this bound later]

     = a n lg n − (a/4) n + 2b + Θ(n)                             [distribute the (2a/n) term]

     = a n lg n + b + ( Θ(n) + b − (a/4) n )                      [remember, our goal is to get T(n) ≤ a n lg n + b]

     ≤ a n lg n + b                                               [pick a large enough that (a/4) n dominates Θ(n) + b]
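The deck defers the proof of the summation bound used in the second step (“this summation gets its own set of slides later”). For reference, one quick way to see it (my sketch, not the deck's proof): since x lg x is increasing on [1, n], the sum is at most the corresponding integral, and for n ≥ 2:

\sum_{k=1}^{n-1} k \lg k
  \;\le\; \int_{1}^{n} x \lg x \, dx
  \;=\; \frac{n^{2}}{2}\lg n \;-\; \frac{n^{2}}{4\ln 2} \;+\; \frac{1}{4\ln 2}
  \;\le\; \frac{1}{2}\, n^{2} \lg n \;-\; \frac{1}{8}\, n^{2}.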

Page 67: Heapsort quick sort

Analyzing Quicksort: Average Case

So T(n) ≤ a n lg n + b for certain a and b

Thus the induction holds

Thus T(n) = O(n lg n)

Thus quicksort runs in O(n lg n) time on average