Insertion Sorting Efficient algorithm for sorting a small number of elements. Insert(A,i) {//...

Page 1:

Insertion Sorting
• Efficient algorithm for sorting a small number of elements.

Insert(A, i) {
    // pre-condition: A[0 .. i-1] is sorted.
    // post-condition: A[0 .. i] is sorted.
    temp = A[i];
    while (i > 0 && A[i-1] > temp) { A[i] = A[i-1]; i--; }
    A[i] = temp;
}

InsertionSort(A) {
    // A[0 .. 0] is trivially sorted, so start at index 1.
    for (j = 1; j < A.size; ++j) Insert(A, j);
}
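A minimal runnable sketch of the pseudo-code above, written in Python rather than the slides' Java/C++ style. The `i > 0` guard and the comparison counter are additions for illustration; the function names are hypothetical.

```python
def insert(a, i):
    """Insert a[i] into the sorted prefix a[0 .. i-1]; return key comparisons used."""
    temp = a[i]
    comparisons = 0
    while i > 0:
        comparisons += 1           # one key comparison: a[i-1] > temp
        if a[i - 1] <= temp:
            break
        a[i] = a[i - 1]            # shift the larger key one slot to the right
        i -= 1
    a[i] = temp
    return comparisons


def insertion_sort(a):
    """Sort the list a in place; return the total number of key comparisons."""
    return sum(insert(a, j) for j in range(1, len(a)))
```

On a reverse-sorted list of n keys this counts n(n-1)/2 comparisons, and on an already sorted list it counts n-1.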

Page 2:

• Our pseudo-code style is a little different from that of the text. (Ours will be a lot closer to Java or C++.)
• Analysis of InsertionSort:
  • Focus on a specific operation (key comparison). The text considers every operation.
  • Measured as a function of parameter(s). In this case, it is the size of the array.
  • Cost of the operation is a constant (1).
  • Cases of interest:
    • worst-case
    • best-case
    • average-case

Page 3:

Insert(A, i) {
    // pre-condition: A[0 .. i-1] is sorted.
    // post-condition: A[0 .. i] is sorted.
    temp = A[i];
    while (i > 0 && A[i-1] > temp) { A[i] = A[i-1]; i--; }
    A[i] = temp;
}
Key comparisons in Insert(A, i): worst-case: i; best-case: 1; average: will do later.

InsertionSort(A) {
    for (j = 1; j < A.size; ++j) Insert(A, j);
}
Key comparisons in InsertionSort on n keys: worst-case: n(n-1)/2; best-case: n-1.

Page 4:

Review of background material

Mathematical Preliminaries
• induction: proof technique for assertions of the form "for all integers n, ..."
  Example 1: For all n, $n^3 + 5n$ is divisible by 6.
  Example 2: $1 + r + \dots + r^k = (r^{k+1} - 1)/(r - 1)$ for $r \neq 1$. Proof by induction on k.
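As a worked instance of the technique, here is a sketch of the induction for Example 2. It assumes $r \neq 1$, which the formula needs for the division to make sense.

```latex
\textbf{Base case } ($k = 0$): \quad 1 = \frac{r^{1} - 1}{r - 1}.

\textbf{Inductive step:} assume
$1 + r + \dots + r^{k} = \dfrac{r^{k+1} - 1}{r - 1}$. Then
\[
1 + r + \dots + r^{k} + r^{k+1}
  = \frac{r^{k+1} - 1}{r - 1} + r^{k+1}
  = \frac{r^{k+1} - 1 + r^{k+2} - r^{k+1}}{r - 1}
  = \frac{r^{k+2} - 1}{r - 1}.
\]
```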

Page 5:

Inequalities, $O$, $\Omega$ and $\Theta$ notation
• For any real constants c > 1 and d > 0, $c^n > n^d$ for all sufficiently large n.
• There are positive constants $c_1$ and $c_2$ such that for all n, $c_1 n \log n \le \log n! \le c_2 n \log n$.

Definition: Let f(n) and g(n) be such that, for some constant c > 0, $f(n) \le c\, g(n)$ for all large enough n. Then, we say $f(n) = O(g(n))$.
• If $\le$ is replaced by $\ge$, we have $f(n) = \Omega(g(n))$.
• If both hold (possibly with different constants), $f(n) = \Theta(g(n))$.

Page 6:

From the previous slide, we have: $\log n! = \Theta(n \log n)$

Exercise:
1) $1 + 1/2 + 1/3 + \dots + 1/n = \Theta(\log n)$
2) $1 + 1/2^2 + 1/3^2 + \dots + 1/n^2 = \Theta(1)$
3) Estimate $1/2 + 2/2^2 + 3/2^3 + \dots + n/2^n$

Summation formulas for arithmetic, geometric and arithmetic-geometric series

Page 7:

Recurrence formulas/equations
A recursive definition of a function T(n) in terms of itself. E.g. $T(n) = 2\,T(n/2) + \log n$

Problem: In how many ways can we fill a 2 by n rectangle using 1 by 1, 1 by 2 and 2 by 1 tiles?

[Figure: the 2 by n board and the three tile types.]

Let f(n) be the number of ways of tiling a 2 by n board.

Page 8:

Exercise: What are f(1), f(2), f(3)?

We will now try to obtain a recurrence formula for f(n). Consider the top-left corner. It can be tiled using one of three types of tiles. Let g(n) be the number of ways of tiling a 2 by n board with one corner cell removed.
• 1 by 1 tile: the rest of the board can be tiled in g(n) ways.
• 2 by 1 tile (covering the whole first column): the rest can be tiled in f(n-1) ways.
• 1 by 2 tile (covering the first two cells of the top row): the rest can be tiled in g(n-1) + f(n-2) ways.

Page 9:

Therefore f(n) = g(n) + f(n-1) + g(n-1) + f(n-2)

To use this formula, we need a way to compute g(n). So, let us try to create a recurrence formula for g(n).
g(1) = 1, g(2) = 3

Lower-left corner can be filled with a 1 by 1 tile or a 1 by 2 tile. Thus, g(n) = g(n-1) + f(n-1)

Now, we can compute f(.) for any value. Closed-form formula?

Page 10:

Divide and conquer recurrence formulas

Problem: Assuming that single-bit operations cost unit time, how long would it take to compute the square of an n-bit integer? (hardware algorithm)

[Figure: school-method multiplication of 101100110 by itself, written as a column of shifted partial products.]

Page 11:

This school method of multiplication requires $O(n^2)$ bit-level operations. Why? Essentially, we need to multiply each bit of the number by every bit.

Karatsuba's algorithm: write $m = p \cdot 2^{n/2} + q$. So,
$m^2 = (p \cdot 2^{n/2} + q)^2 = p^2 \cdot 2^n + 2^{n/2} \cdot 2pq + q^2$

Since $2pq = (p + q)^2 - p^2 - q^2$,
$m^2 = p^2 \cdot 2^n + 2^{n/2} \cdot [(p + q)^2 - p^2 - q^2] + q^2$

Page 12:

$m^2 = p^2 \cdot 2^n + 2^{n/2} \cdot [(p + q)^2 - p^2 - q^2] + q^2$

Karatsuba's algorithm:
Input: m with n bits.
1. If m has one bit, then output the answer directly and halt.
2. Let p = most significant n/2 bits of m, q = least significant n/2 bits.
3. Recursively compute $ps = p^2$ and $qs = q^2$.
4. Compute r = p + q, and recursively compute $ss = r^2$.
5. Output shift(ps, n) + shift(ss - ps - qs, n/2) + qs
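The five steps above can be sketched in Python as follows. This is an illustrative sketch, not the hardware algorithm itself: Python's `<<` stands in for shift, and because n may be odd the code splits at `half = n // 2` and shifts by `2 * half` where the slide writes n. The function name is hypothetical.

```python
def karatsuba_square(m):
    """Square a non-negative integer using three recursive squarings of about half the size."""
    if m < 2:                          # step 1: a one-bit input is its own square
        return m
    n = m.bit_length()
    half = n // 2
    p = m >> half                      # step 2: most significant bits
    q = m & ((1 << half) - 1)          #         least significant `half` bits
    ps = karatsuba_square(p)           # step 3
    qs = karatsuba_square(q)
    ss = karatsuba_square(p + q)       # step 4
    # step 5: m^2 = p^2 * 2^(2*half) + (ss - ps - qs) * 2^half + q^2
    return (ps << (2 * half)) + ((ss - ps - qs) << half) + qs
```

For example, `karatsuba_square(0b101100110)` squares the number used in the school-method illustration.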

Page 13:

Analysis of Karatsuba's algorithm:
• shift(p, n) costs O(n) bit operations.
• Adding two n-bit integers takes O(n) bit operations.
• Subtracting an n-bit integer from an n-bit integer takes O(n) bit operations.

Cost of each step:
1. If m has one bit, then ...
2. Let p = most significant n/2 bits of m, q = least significant n/2 bits. Cost: O(n)
3. Recursively compute $ps = p^2$ and $qs = q^2$. Cost: 2 T(n/2)
4. Compute r = p + q, and recursively compute $ss = r^2$. Cost: O(n) + T(n/2)
5. Output shift(ps, n) + shift(ss - ps - qs, n/2) + qs. Cost: O(n)

Page 14:

Recurrence formula: T(n) = 3 T(n/2) + O(n)
Solution: $T(n) = O(n^{\log_2 3}) = O(n^{1.59})$

The textbook gives a master theorem to handle a general type of divide-and-conquer recurrence formula
T(n) = a T(n/b) + f(n)   (a >= 1, b > 1)
• Problem is divided into a sub-problems, each of size n/b and each recursively solved.
• Overhead f(n) to combine the sub-problem results.

Page 15:

There are three cases to consider.

(1) $f(n) = O(n^{\log_b a - \epsilon})$ for some constant $\epsilon > 0$. Then, $T(n) = \Theta(n^{\log_b a})$.

(2) $f(n) = \Theta(n^{\log_b a})$. Then, $T(n) = \Theta(n^{\log_b a} \lg n)$.

(3) $f(n) = \Omega(n^{\log_b a + \epsilon})$ for some constant $\epsilon > 0$, and if $a\, f(n/b) \le c\, f(n)$ for some constant c < 1 and for all large enough n, then $T(n) = \Theta(f(n))$.

Previous example belongs to case (1) where a = 3 and b = 2.
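To make case (1) concrete, the sketch below evaluates the Karatsuba recurrence numerically. It assumes the specific instance T(n) = 3 T(n/2) + n with T(1) = 1 and n a power of two; those constants are chosen here for illustration and are not fixed by the slides.

```python
import math


def T(n):
    """T(n) = 3 T(n/2) + n with T(1) = 1, for n a power of two."""
    return 1 if n == 1 else 3 * T(n // 2) + n


for k in (5, 10, 15, 20):
    n = 2 ** k
    # Case (1) predicts T(n) = Theta(n^(log2 3)), so this ratio should level off.
    print(k, T(n) / n ** math.log2(3))
```

For this instance the recurrence unrolls to $T(2^k) = 3^{k+1} - 2^{k+1}$, so the printed ratio approaches 3 from below.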

Page 16:

We will now consider examples for other cases.

Merge Sorting. To sort an array of n elements, recursively sort two halves, and merge. T(n) = 2 T(n/2) + O(n). Solution: $T(n) = \Theta(n \log n)$ (case 2).

Heap building. Convert an array of n keys into a heap.
Question: What is the definition of a heap?
For all i = 1, 2, ..., n, $A[i] \ge A[\lfloor i/2 \rfloor]$. (Assume A[0] is -maxint.)
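Returning to the merge sort example on this slide, here is a minimal sketch of the "sort two halves, then merge" scheme behind the recurrence T(n) = 2 T(n/2) + O(n). It returns a new list rather than sorting in place, which is a simplification made for this example.

```python
def merge_sort(a):
    """Return a sorted copy of the list a."""
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    left = merge_sort(a[:mid])         # T(n/2)
    right = merge_sort(a[mid:])        # T(n/2)
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):   # O(n) merge step
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])            # at most one of these two is non-empty
    merged.extend(right[j:])
    return merged
```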

Page 17:

Heap Sort: Based on a clever data structure called a heap.

A heap is an array in which the keys will always be stored in index 1 to k for some k. Such an array is called a heap if, when you view it as a complete binary tree, the parent key is always <= the child key.

[Figure: a heap drawn as a binary tree with nodes numbered 1 to 5, next to its array A.]

index:  1  2  3  4  5
A:      3  7  4  7  8

class Heap { int[] HeapElements; int Size; }

Page 18:

An obvious but important fact: HeapElements[1] contains the smallest key.

Heap can support the following operations very fast, i.e. in time O(log n) where n is the size of the heap.
• void Insert(int x)
• int DeleteMin( )

Informal description of the algorithm for Insert(x):
• Find a place for x along the path from the new leaf (index Size, after Size is incremented) to the root (index = 1).

Page 19:

More formal description:
    Size++;
    int j;    // declared outside the loop because it is used after it
    for (j = Size; HeapElements[j/2] > x; j /= 2)
        HeapElements[j] = HeapElements[j/2];
    HeapElements[j] = x;

Although the code is extremely simple, it is tricky. Test it with some examples. What happens if the key inserted is the smallest?
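A small Python sketch of the insertion loop, for trying out examples. It assumes the sentinel convention mentioned earlier (index 0 holds a key smaller than every real key), here `-math.inf`; the class layout and the method name are hypothetical.

```python
import math


class Heap:
    def __init__(self):
        self.elements = [-math.inf]    # index 0 holds the sentinel
        self.size = 0

    def insert(self, x):
        self.size += 1
        self.elements.append(x)        # make room at index `size`
        j = self.size
        # Walk up from the new leaf, pulling larger parents down one level.
        while self.elements[j // 2] > x:
            self.elements[j] = self.elements[j // 2]
            j //= 2
        self.elements[j] = x
```

With the sentinel in place, inserting a new smallest key stops the loop at j = 1 because the comparison against index 0 fails.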

Page 20:

First consider Max-Heapify. Problem is to convert the tree rooted at node i into a heap given that trees rooted at 2i and 2i+1 are heaps. (The comparisons below follow the min-heap convention used in these slides.)

[Figure: node i with key k, and its left and right subtrees L and R.]

Algorithm is given in page 130 of the text. Suppose H[i] = k, H[2i] = x, H[2i+1] = y.

Case 1: (k <= x) && (k <= y). In this case, the tree rooted at i is a heap. So, the procedure terminates.

Page 21:

Case 2: (k > x) && (y > x). Thus, x is the minimum of the three keys k, x and y. Suppose we swap the keys k and x. Now the problem reduces to Adjust(2i).

Case 3: y = min {k, x, y}. In this case, swap k and y; Adjust(2i+1).

Number of comparisons performed by Max-Heapify is at most 2h, where h is the height of the heap. Since the heap is a complete binary tree, h is at most $\lfloor \log n \rfloor + 1$, so overall cost of Heapify is O(log n).

Page 22:

Build-Max-Heap. To convert a tree rooted at node i into a heap, recursively convert trees rooted at 2i and at 2i+1 into heaps, then heapify.

BuildHeap(A, i, n) {
    if (2*i > n) return;
    BuildHeap(A, 2*i, n);
    BuildHeap(A, 2*i+1, n);
    Heapify(A, i, n);
}

Analysis: T(n) = 2 T(n/2) + O(log n).
Solution (by case 1 of the master theorem, since $\log n = O(n^{1 - \epsilon})$): T(n) = O(n).
Note: The algorithm in the text is the iterative version of the above algorithm.
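The heapify cases and the recursive heap construction can be sketched together in Python. This follows the min-heap convention of these slides with keys stored in `h[1 .. n]` and `h[0]` unused; the iterative loop in `heapify` replaces the recursive Adjust calls, and both function names are hypothetical.

```python
def heapify(h, i, n):
    """Sift h[i] down, assuming the subtrees rooted at 2i and 2i+1 are already min-heaps."""
    while 2 * i <= n:
        child = 2 * i
        if child + 1 <= n and h[child + 1] < h[child]:
            child += 1                     # the smaller of the two children
        if h[i] <= h[child]:
            break                          # Case 1: the tree rooted at i is a heap
        h[i], h[child] = h[child], h[i]    # Case 2 or 3: swap, then continue one level down
        i = child


def build_heap(h, i, n):
    """Turn the subtree rooted at i into a min-heap."""
    if 2 * i > n:                          # a leaf is already a heap
        return
    build_heap(h, 2 * i, n)
    build_heap(h, 2 * i + 1, n)
    heapify(h, i, n)
```

A typical call is `build_heap(h, 1, n)` on a list whose slot 0 is a placeholder.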

Page 23:

Review of sorting

Worst-case quadratic time sorting algorithms
• bubble-sort
• selection-sort
• insertion-sort
• quick-sort

Worst-case O(n log n) algorithms
• heap-sort
• merge-sort

Average-case O(n log n) algorithms
• heap-sort, merge-sort
• quick-sort

Intermediate (and hard to analyze) algorithm
• shell-sort (best: $O(n (\log n)^2)$)

Page 24:

Quicksort - Expected case analysis.

Quicksort(A, low, high)
// sort the segment A[low ... high]
// k is a constant to be determined experimentally.
{
    if (high <= low) return;
    if (high - low <= k) { call insertion sort to sort A[low ... high]; return; }
    p = partition(A, low, high, pivot);    // p is the final position of the pivot
    Quicksort(A, low, p-1);
    Quicksort(A, p+1, high);
}

If the pivot splits the array into sizes i and n-i-1, then the cost of the recursive calls is T(i) and T(n-i-1). Cost of the partition step is n-1 key comparisons (roughly n).
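A runnable sketch of this scheme in Python. Several details are assumptions made for the example, not taken from the slides: the cutoff value `CUTOFF = 8` standing in for the constant k, the random choice of pivot, and the Lomuto-style `partition`.

```python
import random

CUTOFF = 8    # hypothetical value for the experimentally chosen constant k


def partition(a, low, high):
    """Partition a[low .. high] around a randomly chosen pivot; return the pivot's final index."""
    r = random.randint(low, high)
    a[r], a[high] = a[high], a[r]
    pivot = a[high]
    store = low
    for idx in range(low, high):           # high - low key comparisons
        if a[idx] < pivot:
            a[idx], a[store] = a[store], a[idx]
            store += 1
    a[store], a[high] = a[high], a[store]
    return store


def quicksort(a, low, high):
    """Sort the segment a[low .. high] in place."""
    if high - low + 1 <= CUTOFF:
        # Small segment: fall back to insertion sort on a[low .. high].
        for j in range(low + 1, high + 1):
            temp, i = a[j], j
            while i > low and a[i - 1] > temp:
                a[i] = a[i - 1]
                i -= 1
            a[i] = temp
        return
    p = partition(a, low, high)
    quicksort(a, low, p - 1)
    quicksort(a, p + 1, high)
```

The whole list is sorted with `quicksort(a, 0, len(a) - 1)`.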

Page 25:

split = (0, n-1) -> T(0) + T(n-1)
split = (1, n-2) -> T(1) + T(n-2)
. . .
split = (n-1, 0) -> T(n-1) + T(0)

Crucial fact: all these cases are equally likely when averaged over all inputs. So, we get the recurrence formula:

$T(n) = (n - 1) + \frac{1}{n} \{ T(0) + T(n-1) + T(1) + T(n-2) + \dots + T(n-1) + T(0) \}$

Solution to the equation is T(n) = O(n log n). Text uses induction to prove this. Alternative approach is presented next.

Page 26:

$T(n) = (n - 1) + \frac{1}{n} \{ T(0) + T(n-1) + T(1) + T(n-2) + \dots + T(n-1) + T(0) \}$

Multiplying by n, and writing the same identity for n+1:
n T(n) = n(n-1) + 2{T(0) + T(1) + ... + T(n-1)}
(n+1) T(n+1) = (n+1)n + 2{T(0) + T(1) + ... + T(n)}

Subtracting the first from the second:
(n+1) T(n+1) - n T(n) = 2n + 2 T(n)
(n+1) T(n+1) - (n+2) T(n) = 2n

Dividing by (n+1)(n+2):
$\frac{T(n+1)}{n+2} = \frac{T(n)}{n+1} + \frac{2n}{(n+1)(n+2)}$

The last term is less than 2/(n+1), so summing over n bounds T(n)/(n+1) by a harmonic sum, which is O(log n). It follows that T(n) = O(n log n).
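The derivation can be checked with exact arithmetic. The sketch below assumes T(0) = 0 and a partition cost of n-1, matching the n(n-1) term above; the function name is hypothetical.

```python
from fractions import Fraction


def expected_comparisons(limit):
    """Return [T(0), ..., T(limit)] for T(n) = (n-1) + (2/n) * (T(0) + ... + T(n-1))."""
    T = [Fraction(0)]
    running_sum = Fraction(0)              # T(0) + ... + T(n-1)
    for n in range(1, limit + 1):
        running_sum += T[n - 1]
        T.append((n - 1) + 2 * running_sum / n)
    return T


T = expected_comparisons(50)
# The simplified recurrence derived on the slide holds exactly:
assert all((n + 1) * T[n + 1] - (n + 2) * T[n] == 2 * n for n in range(0, 50))
```

Telescoping the last equation on the slide gives the closed form $T(n) = 2(n+1)H_n - 4n$, where $H_n = 1 + 1/2 + \dots + 1/n$; this makes the O(n log n) bound explicit.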

Page 27:

Searching
• Dictionary problem
  • search, insert and delete operations
  • findmin, deletemin, rangesearch, successor
  • merge, split
• Search tree
  • pointer-based
  • cost of search, insert, delete = O(h), where h is the height of the tree
• Balanced search tree
  • to ensure h = O(log |S|), where S is the set of keys stored in the tree.
  • worst-case (deterministic): AVL, red-black
  • amortized: splay tree
  • expected (randomized): treap

Page 28:

Searching - hash table
• array-based.
• Store key x in location h(x), where h maps keys to an index of the table.
• Typically, |U| >> |T| >> |S| where
  • U = set of all possible keys
  • T = hash table
  • S = set of keys actually present
• Since |U| >> |T|, collisions can and will happen.
• Chaining uses a linked list to chain colliding keys.
• Open addressing uses rehashing using a collection of hash functions.
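A minimal sketch of chaining in Python. Python lists stand in for the linked lists, and `hash(key) % m` stands in for the hash function h; the class and method names are hypothetical.

```python
class ChainedHashTable:
    def __init__(self, m=8):
        self.table = [[] for _ in range(m)]    # one chain per table slot

    def _slot(self, key):
        return hash(key) % len(self.table)     # plays the role of h(x)

    def insert(self, key):
        chain = self.table[self._slot(key)]
        if key not in chain:                    # keep each key at most once
            chain.append(key)

    def search(self, key):
        return key in self.table[self._slot(key)]

    def delete(self, key):
        chain = self.table[self._slot(key)]
        if key in chain:
            chain.remove(key)
```

With m = 4, the keys 1, 5 and 9 all land in the same slot and are kept on one chain, which is the collision case the slide describes.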

Page 29:

Cost of searching
• Chaining: Let m = table size and n = number of keys to be stored. Assuming uniform hashing, cost of both successful and unsuccessful searching is $\Theta(1 + \alpha)$, where $\alpha = n/m$ is the load factor.
• Open addressing: Assuming uniform hashing, cost of unsuccessful search is $1/(1 - \alpha)$. The cost of successful search is $(1/\alpha) \ln(1/(1 - \alpha))$.
• Applications

Page 30:

Standard design techniques

Divide and conquer
• merge-sorting, binary search, Karatsuba
• Fast Fourier Transform to multiply polynomials p(x) and q(x), both of degree n-1 (each has n coefficients):
  1) evaluate p(x) and q(x) at carefully chosen 2n (complex) numbers $c_1, \dots, c_{2n}$.
  2) the product polynomial r(x)'s value is then known at $c_1, \dots, c_{2n}$: $r(c_j) = p(c_j) \cdot q(c_j)$
  3) from these values, interpolate the polynomial back.

Step (1) is the Fourier transform and (3) is the inverse Fourier transform.
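The evaluate, multiply pointwise, interpolate scheme can be sketched with a textbook recursive radix-2 FFT. This is an illustration under a few assumptions: coefficients are integers (so the result can be rounded), the evaluation points are roots of unity, and the number of points is padded up to a power of two. The function names are hypothetical.

```python
import cmath


def fft(a, invert=False):
    """Evaluate the polynomial with coefficients a at the len(a)-th roots of unity.
    len(a) must be a power of two. With invert=True the conjugate roots are used."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = -1 if invert else 1
    w_n = cmath.exp(sign * 2j * cmath.pi / n)
    out = [0] * n
    w = 1
    for k in range(n // 2):
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
        w *= w_n
    return out


def multiply(p, q):
    """Multiply two integer-coefficient polynomials given as coefficient lists (lowest degree first)."""
    result_len = len(p) + len(q) - 1
    size = 1
    while size < result_len:
        size *= 2
    fp = fft(list(p) + [0] * (size - len(p)))        # step 1: evaluate
    fq = fft(list(q) + [0] * (size - len(q)))
    fr = [x * y for x, y in zip(fp, fq)]             # step 2: pointwise product
    r = fft(fr, invert=True)                         # step 3: interpolate
    return [round((v / size).real) for v in r[:result_len]]
```

For instance, `multiply([1, 2, 3], [4, 5])` computes the coefficients of $(1 + 2x + 3x^2)(4 + 5x)$.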

Page 31:

Dynamic programming

Main idea: typically, solving a problem on a standard data structure (e.g. an array or a tree) involves solving sub-problems on parts of the same data structure (e.g. a sub-array or a sub-tree). If the same instance is needed more than once, simply use a table to store the result, instead of making multiple recursive calls to the same instance. (Often, this results in a non-recursive version.)

• longest common subsequence
• optimal matrix chain product
• knapsack problem
• a simple two-person game

Page 32:

Recall the recurrence formula for f(n), the number of ways to place three types of tiles in a 2 by n board.
f(n) = g(n) + f(n-1) + g(n-1) + f(n-2)
g(n) = g(n-1) + f(n-1)
Here is a direct conversion of this formula into program segments to compute f and g:

int f(int n) {
    if (n == 1) return 2;
    else if (n == 2) return 7;    // g(2) + f(1) + g(1) + f(0) = 3 + 2 + 1 + 1, taking f(0) = 1
    else return g(n) + f(n-1) + g(n-1) + f(n-2);
}

int g(int n) {
    if (n == 1) return 1;
    else return g(n-1) + f(n-1);
}

Page 33:

This is very inefficient. Why? Redundant calls.
f(n) = g(n) + f(n-1) + g(n-1) + f(n-2)
g(n) = g(n-1) + f(n-1)
• To compute f(n), g(.) is called with argument n, which in turn calls f(.) with argument n-1. Then, after completing g(n), f calls itself with n-1, thus computing f(n-1) twice.
• This redundancy doubles after each call, leading to exponential time to compute f(n).
• Instead, simply use arrays F and G to store f(i) and g(i) as they are computed.
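The table-based version suggested in the last bullet might look like the sketch below. The base values F[0] = 1 and G[0] = 0 are assumptions chosen so that the recurrences reproduce f(1) = 2 and g(1) = 1; the function name is hypothetical.

```python
def count_tilings(n):
    """Return f(n), the number of tilings of a 2 by n board, filling F and G left to right."""
    F = [0] * (n + 2)
    G = [0] * (n + 2)
    F[0], G[0] = 1, 0          # assumed base values: the empty board has one tiling
    F[1], G[1] = 2, 1
    for i in range(2, n + 1):
        G[i] = G[i - 1] + F[i - 1]
        F[i] = G[i] + F[i - 1] + G[i - 1] + F[i - 2]
    return F[n]
```

Each entry is computed once from entries already in the tables, so the running time is linear in n instead of exponential.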

Page 34:

A problem from Olympiad on Informatics

An arbitrary list of integers (both positive and negative) is presented to two players A and B with A playing first. In their turn, either player is allowed to pick the first or last number from the (remaining) list, and the number is deleted from the list. As usual, players take turns. When the list turns empty, the game ends. The player with higher total wins. The problem is to design an algorithm to determine for a given input list if the first player can force a win (against any opponent).

Page 35:

Example: < 1 -8 -2 4 > Player 1 chooses 4. Player 2 chooses 1. Player 1 chooses -2. Player 2 chooses -8. Player 1 wins. Decision tree representation of possibilities:

Page 36:

Observations:
• All leaves are at the same level in the game tree.
• To determine the best strategy for player 1, start at the root and recursively evaluate the two subtrees and choose the maximum of the two.
• However, when you expand a subtree rooted at a node which corresponds to the second player's choice, the rule is to choose the minimum of the values of the subtrees.

[Figure: game tree. The root (Player 1) holds the input a1 a2 … an; picking a1 or an leads to the two Player 2 nodes a2 .. an and a1 .. an-1, whose children include a3 .. an and a2 .. an-1.]

Page 37:

algorithm Play(A[ ], m, n)
// returns the maximum score player 1 can get when playing with the list < A[m], A[m+1], …, A[n] >
// (the list is assumed to have an even number of elements)
{
    if (n - m == 1) return max { A[m], A[n] };
    else return max { A[m] + min { Play(A, m+2, n), Play(A, m+1, n-1) },
                      A[n] + min { Play(A, m+1, n-1), Play(A, m, n-2) } };
}
// The inner min reflects player 2's reply: it leaves player 1 the worse of the two remaining lists.

Page 38:

Complexity of the algorithm:
Main operations: key comparisons, +, function calls. Since function calls are the most expensive, we may just focus on them.
T(n) = 4 T(n-2) + c with T(2) = 2. Solution: ? ? ?

Can we improve the algorithm? Yes, look at the 4 sub-problems two levels down from the root. Of the 4 sub-problems, only three are distinct. So, we just skip one of the two duplicate calls.

Page 39:

Complexity of the resulting algorithm: T(n) = 3 T(n-2) + c with T(2) = 1.
• We can significantly improve this solution by dynamic programming.
• Instead of making recursive calls, simply use "table look-up".
• A two-dimensional array R[ , ] will be used to store the result of Play(A, m, n).
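A sketch of the table look-up idea in Python. It assumes an even-length list, 0-based indices, and the max/min rule from the observations (player 2's reply leaves player 1 the smaller of the two remaining values). The table `R` follows the slide; the function names are hypothetical.

```python
def max_first_player_score(a):
    """Return the best total player 1 can guarantee on the even-length list a."""
    size = len(a)
    if size == 0 or size % 2 != 0:
        raise ValueError("this sketch assumes a non-empty list of even length")
    # R[m][n] stores the result of Play(A, m, n) for segments of even length.
    R = [[None] * size for _ in range(size)]
    for length in range(2, size + 1, 2):
        for m in range(0, size - length + 1):
            n = m + length - 1
            if length == 2:
                R[m][n] = max(a[m], a[n])
            else:
                take_first = a[m] + min(R[m + 2][n], R[m + 1][n - 1])
                take_last = a[n] + min(R[m + 1][n - 1], R[m][n - 2])
                R[m][n] = max(take_first, take_last)
    return R[0][size - 1]


def first_player_can_force_win(a):
    """Player 1 wins if their guaranteed total exceeds what is left for player 2."""
    best = max_first_player_score(a)
    return best > sum(a) - best
```

On the example list < 1 -8 -2 4 > this gives a guaranteed score of 2 for player 1, matching the play shown earlier. The table has O(n^2) entries and each is filled in constant time.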