CSE 331: Review

Page 1: CSE 331: Review.

CSE 331: Review

Page 2: CSE 331: Review.

Main Steps in Algorithm Design

Problem Statement (real-world problem)

Problem Definition (precise mathematical definition)

Algorithm

"Implementation" (data structures)

Analysis (correctness/run time)

Page 3: CSE 331: Review.

Stable Matching Problem

Gale-Shapley Algorithm

Page 4: CSE 331: Review.

Stable Marriage problem

Set of men M and women W

Matching (no polygamy in M × W)

Perfect Matching (everyone gets married)

Instability: S contains (m, w) and (m’, w’), but m and w’ prefer each other to their partners

Preferences (ranking of potential spouses)

Stable matching = perfect matching + no instability

Page 5: CSE 331: Review.

Gale-Shapley Algorithm

Initially all men and women are free

While there exists a free woman who can propose

Let w be such a woman and m be the best man she has not proposed to

w proposes to m

If m is free

(m,w) get engaged

Else (m,w’) are engaged

If m prefers w’ to w

w remains free

Else

(m,w) get engaged and w’ is free

Output the engaged pairs as the final output

At most n² iterations

Each iteration can be implemented in O(1) time
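Not on the slides, but a minimal Python sketch of the woman-proposing GS algorithm above may help; the dict layout, variable names, and tiny example are illustrative.

```python
# A minimal sketch of the woman-proposing Gale-Shapley algorithm (illustrative).
# Preference lists map each person to the opposite side, best first.
def gale_shapley(w_prefs, m_prefs):
    # rank[m][w] = position of w in m's list (lower = better)
    rank = {m: {w: i for i, w in enumerate(prefs)} for m, prefs in m_prefs.items()}
    next_proposal = {w: 0 for w in w_prefs}   # index of the next man w proposes to
    engaged_to = {}                           # engaged_to[m] = w
    free_women = list(w_prefs)
    while free_women:
        w = free_women.pop()
        m = w_prefs[w][next_proposal[w]]      # best man w has not proposed to
        next_proposal[w] += 1
        if m not in engaged_to:               # m is free: (m, w) get engaged
            engaged_to[m] = w
        elif rank[m][w] < rank[m][engaged_to[m]]:  # m prefers w to his fiancee
            free_women.append(engaged_to[m])  # the old fiancee becomes free
            engaged_to[m] = w
        else:
            free_women.append(w)              # w remains free
    return engaged_to

w_prefs = {"w1": ["m1", "m2"], "w2": ["m1", "m2"]}
m_prefs = {"m1": ["w2", "w1"], "m2": ["w1", "w2"]}
print(gale_shapley(w_prefs, m_prefs))  # {'m1': 'w2', 'm2': 'w1'}
```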

Page 6: CSE 331: Review.

GS algorithm: Firefly Edition

[Illustration: a run of GS on three men (Mal, Wash, Simon) and three women (Inara, Zoe, Kaylee) with numbered preference lists]

Page 7: CSE 331: Review.

GS algo outputs a stable matching

Lemma 1: GS outputs a perfect matching S

Lemma 2: S has no instability

Page 8: CSE 331: Review.

Proof technique du jour


Proof by contradiction

Assume the negation of what you want to prove

After some reasoning, arrive at a contradiction

Page 9: CSE 331: Review.

Two observations

Obs 1: Once m is engaged he keeps getting engaged to “better” women

Obs 2: If w proposes to m’ first and then to m (or never proposes to m) then she prefers m’ to m

Page 10: CSE 331: Review.

Proof of Lemma 2

By contradiction

S contains the pairs (m, w) and (m’, w’)

Assume there is an instability (m, w’)

m prefers w’ to w

w’ prefers m to m’

w’ last proposed to m’

Page 11: CSE 331: Review.

Contradiction by Case Analysis

Depending on whether w’ had proposed to m or not

Case 1: w’ never proposed to m

By Obs 2, w’ prefers m’ to m

But we assumed w’ prefers m to m’, a contradiction

Page 12: CSE 331: Review.

Case 2: w’ had proposed to m

Case 2.1: m had accepted the proposal from w’

m is finally engaged to w, so by Obs 1, m prefers w to w’

Case 2.2: m had rejected the proposal from w’

m was engaged to some w’’ (by Obs 1, m prefers w’’ to w’)

m is finally engaged to w (by Obs 1, m prefers w to w’’)

So m prefers w to w’

Either way, this contradicts the assumption that m prefers w’ to w

Page 13: CSE 331: Review.

Overall structure of case analysis

Did w’ propose to m?

Did m accept the proposal from w’?


Page 14: CSE 331: Review.

Graph Searching

BFS/DFS

Page 15: CSE 331: Review.

O(m+n) BFS Implementation

BFS(s)

CC[s] = T and CC[w] = F for every w ≠ s

Set i = 0 and L_0 = {s}

While L_i is not empty

L_{i+1} = Ø

For every u in L_i

For every edge (u,w)

If CC[w] = F then

CC[w] = T and add w to L_{i+1}

i++

CC is stored as an array; each layer L_i as a linked list

Input graph given as an adjacency list

The version in KT also computes a BFS tree
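A minimal Python sketch of the layered BFS above (not from the slides); the adjacency list is an 8-vertex graph chosen to be consistent with the visit order in the illustration on the next page.

```python
# Layered O(m+n) BFS; adj is a dict of adjacency lists (illustrative names).
def bfs_layers(adj, s):
    CC = {v: False for v in adj}      # the "connected component" marks from the slide
    CC[s] = True
    layers = [[s]]                    # L_0 = {s}
    while layers[-1]:
        nxt = []                      # L_{i+1}
        for u in layers[-1]:
            for w in adj[u]:
                if not CC[w]:
                    CC[w] = True
                    nxt.append(w)
        layers.append(nxt)
    return layers[:-1]                # drop the final empty layer

adj = {1: [2, 3], 2: [1, 4, 5], 3: [1, 7, 8], 4: [2, 5], 5: [2, 4, 6],
       6: [5], 7: [3, 8], 8: [3, 7]}
print(bfs_layers(adj, 1))  # [[1], [2, 3], [4, 5, 7, 8], [6]]
```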

Page 16: CSE 331: Review.

An illustration

[Illustration: BFS on an 8-vertex example graph starting from vertex 1; the vertices are discovered in the order 1, 2, 3, 4, 5, 7, 8, 6]

Page 17: CSE 331: Review.

O(m+n) DFS implementation

DFS(s)

CC[s] = T and CC[w] = F for every w ≠ s

Initialize Q = {s}   [Q is maintained as a stack]

While Q is not empty

Delete the front (top) element u in Q

For every edge (u,w)

If CC[w] = F then

CC[w] = T and add w to the front of Q

Costs: initialization O(n); each delete and each add is O(1)

The edge loop for vertex u runs n_u times: O(n_u); the While body runs at most once for each vertex u

Σ_u O(n_u) = O(Σ_u n_u) = O(m), for O(m+n) overall
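For comparison, a minimal Python sketch of DFS with an explicit stack (not from the slides). This variant marks a vertex when it is popped rather than when it is pushed, so a vertex may sit on the stack more than once; the adjacency list is the same illustrative 8-vertex graph.

```python
# DFS with an explicit stack; adj is a dict of adjacency lists.
def dfs(adj, s):
    explored = {v: False for v in adj}
    stack = [s]
    order = []
    while stack:
        u = stack.pop()               # take the top element
        if explored[u]:
            continue                  # u may appear on the stack more than once
        explored[u] = True
        order.append(u)
        for w in reversed(adj[u]):    # reversed so the first-listed neighbor
            if not explored[w]:       # is explored first
                stack.append(w)
    return order

adj = {1: [2, 3], 2: [1, 4, 5], 3: [1, 7, 8], 4: [2, 5], 5: [2, 4, 6],
       6: [5], 7: [3, 8], 8: [3, 7]}
print(dfs(adj, 1))  # [1, 2, 4, 5, 6, 3, 7, 8] with this adjacency list
```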

Page 18: CSE 331: Review.

A DFS run using an explicit stack

[Illustration: a DFS run on the same 8-vertex graph, showing the contents of the explicit stack as vertices are pushed and popped]

Page 19: CSE 331: Review.

Topological Ordering

Page 20: CSE 331: Review.

Run of TopOrd algorithm

Page 21: CSE 331: Review.

Greedy Algorithms

Page 22: CSE 331: Review.

Interval Scheduling: Maximum Number of Intervals

Schedule by Finish Time

Page 23: CSE 331: Review.

End of Semester blues

[Illustration: a Monday-to-Friday calendar with overlapping tasks: Project, 331 HW, Exam study, Party!, Write up a term paper]

You can only do one thing on any given day: what is the maximum number of tasks that you can do?

Page 24: CSE 331: Review.

Schedule by Finish Time

Set A to be the empty set

While R is not empty

Choose i in R with the earliest finish time

Add i to A

Remove all requests that conflict with i from R

Return A*=A

O(n log n) time sort intervals such that f(i) ≤ f(i+1)

O(n) time build array s[1..n] s.t. s[i] = start time for i

Do the removal on the fly
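A minimal Python sketch of this greedy (not from the slides); each request is an illustrative (start, finish) pair, intervals are treated as half-open, and the single pass after sorting is the "removal on the fly".

```python
# Earliest-finish-time greedy for interval scheduling.
def schedule(requests):
    A = []
    last_finish = float("-inf")
    for s, f in sorted(requests, key=lambda r: r[1]):  # sort by finish time
        if s >= last_finish:        # no conflict with the last accepted request
            A.append((s, f))
            last_finish = f
    return A

tasks = [(1, 3), (3, 5), (4, 7), (5, 8)]
print(schedule(tasks))  # [(1, 3), (3, 5), (5, 8)]
```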

Page 25: CSE 331: Review.

The final algorithm

[Illustration: the same Monday-to-Friday calendar of tasks, with the greedily selected tasks highlighted]

Order tasks by their END time

Page 26: CSE 331: Review.

Proof of correctness uses “greedy stays ahead”

Page 27: CSE 331: Review.

Interval Scheduling: Maximum Intervals

Schedule by Finish Time

Page 28: CSE 331: Review.

Scheduling to minimize lateness

[Illustration: the Monday-to-Friday calendar of tasks again: Project, 331 HW, Exam study, Party!, Write up a term paper]

All the tasks have to be scheduled

GOAL: minimize the maximum lateness

Page 29: CSE 331: Review.

The Greedy Algorithm

(Assume jobs sorted by deadline: d_1 ≤ d_2 ≤ … ≤ d_n)

f = s

For every i in 1..n do

Schedule job i from s(i) = f to f(i) = f + t_i

f = f + t_i
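A minimal Python sketch of this Earliest-Deadline-First greedy (not from the slides); each job is an illustrative (t_i, d_i) = (length, deadline) pair and the schedule starts at time s.

```python
# Schedule jobs in deadline order and report the maximum lateness.
def schedule_edf(jobs, s=0):
    f = s
    schedule, max_lateness = [], 0
    for t, d in sorted(jobs, key=lambda job: job[1]):  # d_1 <= d_2 <= ... <= d_n
        start, finish = f, f + t       # job i runs from s(i)=f to f(i)=f+t_i
        schedule.append((start, finish, d))
        max_lateness = max(max_lateness, finish - d)
        f = finish
    return schedule, max_lateness

jobs = [(3, 6), (2, 8), (1, 9), (4, 9), (3, 14), (2, 15)]
print(schedule_edf(jobs))  # max lateness 1 for this input
```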

Page 30: CSE 331: Review.

Proof of Correctness uses “Exchange argument”

Page 31: CSE 331: Review.

Proved the following

Any two schedules with 0 idle time and 0 inversions have the same max lateness

Greedy schedule has 0 idle time and 0 inversions

There is an optimal schedule with 0 idle time and 0 inversions

Page 32: CSE 331: Review.

Shortest Path in a Graph: non-negative edge weights

Dijkstra’s Algorithm

Page 33: CSE 331: Review.

Shortest Path problem

Input: Directed graph G=(V,E)

Edge lengths ℓ_e for e in E

“start” vertex s in V

Output: All shortest paths from s to all nodes in V

[Illustration: a small example graph on vertices s, u, w with edge lengths showing that the shortest s-to-w path need not be the direct edge]

Page 34: CSE 331: Review.

Dijkstra’s shortest path algorithm

Input: Directed G=(V,E), ℓ_e ≥ 0, s in V

R = {s}, d(s) = 0

While there is an x not in R with (u,x) in E, u in R

d’(w) = min over edges e=(u,w) in E with u in R of d(u) + ℓ_e

Pick w that minimizes d’(w)

Add w to R and set d(w) = d’(w)

[Illustration: a run of Dijkstra’s algorithm on a 6-vertex example graph with vertices s, u, w, x, y, z; the final distances are d(s) = 0, d(u) = 1, d(w) = 2, d(x) = 2, d(y) = 3, d(z) = 4, and the shortest-path edges are highlighted]

Page 35: CSE 331: Review.

Dijkstra’s shortest path algorithm (formal)

Input: Directed G=(V,E), ℓ_e ≥ 0, s in V

S = {s}, d(s) = 0

While there is a v not in S with (u,v) in E, u in S

Pick w that minimizes d’(w)

Add w to S and set d(w) = d’(w)

At most n iterations

O(m) time

O(mn) time bound is trivial

O(m log n) time implementation is possible
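A minimal Python sketch of Dijkstra’s algorithm with a heap, one way to get the O(m log n) bound mentioned above (not from the slides); the small graph is chosen so the distances match the example run.

```python
# Dijkstra with a binary heap; adj[u] is a list of (w, length) pairs, lengths >= 0.
import heapq

def dijkstra(adj, s):
    d = {s: 0}
    heap = [(0, s)]                       # (tentative distance d'(v), v)
    while heap:
        dist, u = heapq.heappop(heap)
        if dist > d.get(u, float("inf")):
            continue                      # stale entry; u already finalized
        for w, le in adj[u]:
            if dist + le < d.get(w, float("inf")):
                d[w] = dist + le          # better d'(w) through u
                heapq.heappush(heap, (d[w], w))
    return d

adj = {"s": [("u", 1), ("w", 2)], "u": [("w", 3), ("x", 1)],
       "w": [("y", 1)], "x": [("y", 4), ("z", 2)], "y": [("z", 1)], "z": []}
print(dijkstra(adj, "s"))  # {'s': 0, 'u': 1, 'w': 2, 'x': 2, 'y': 3, 'z': 4}
```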

Page 36: CSE 331: Review.

Proved that d’(v) is best when v is added

Page 37: CSE 331: Review.

Minimum Spanning Tree

Kruskal/Prim

Page 38: CSE 331: Review.

Minimum Spanning Tree (MST)

Input: A connected graph G=(V,E), c_e > 0 for every e in E

Output: A tree containing all vertices in V that minimizes the sum of edge weights

Page 39: CSE 331: Review.

Kruskal’s Algorithm

Joseph B. Kruskal

Input: G=(V,E), c_e > 0 for every e in E

T = Ø

Sort edges in increasing order of their cost

Consider edges in sorted order

If an edge can be added to T without creating a cycle then add it to T
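A minimal Python sketch of Kruskal’s algorithm (not from the slides); the cycle test uses a simple union-find structure, which is an implementation choice rather than something from the lecture.

```python
# Kruskal's algorithm: edges is a list of (cost, u, v), vertices are 0..n-1.
def kruskal(n, edges):
    parent = list(range(n))
    def find(x):                             # root of x's component
        while parent[x] != x:
            parent[x] = parent[parent[x]]    # path halving
            x = parent[x]
        return x
    T = []
    for c, u, v in sorted(edges):            # consider edges in increasing cost
        ru, rv = find(u), find(v)
        if ru != rv:                         # different components: no cycle created
            parent[ru] = rv
            T.append((u, v, c))
    return T

edges = [(1, 0, 1), (2, 0, 2), (3, 1, 2), (4, 1, 3), (5, 2, 3)]
print(kruskal(4, edges))  # [(0, 1, 1), (0, 2, 2), (1, 3, 4)]
```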

Page 40: CSE 331: Review.

Prim’s algorithm

Robert Prim

Similar to Dijkstra’s algorithm

Input: G=(V,E), c_e > 0 for every e in E

[Illustration: an example graph with edge costs 0.5, 1, 2, 3, 50, 51]

S = {s}, T = Ø

While S is not the same as V

Among edges e= (u,w) with u in S and w not in S, pick one with minimum cost

Add w to S, e to T

[Illustration: the resulting minimum spanning tree, using the edges of cost 0.5, 1, 2, and 50]
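A minimal Python sketch of Prim’s algorithm using a heap of crossing edges, mirroring the similarity to Dijkstra noted above (not from the slides); the example graph only echoes the illustrated edge costs and assumes the graph is connected.

```python
# Prim's algorithm; adj[u] is a list of (w, cost) pairs.
import heapq

def prim(adj, s):
    S = {s}
    T = []
    heap = [(c, s, w) for w, c in adj[s]]    # edges crossing out of S
    heapq.heapify(heap)
    while len(S) < len(adj):
        c, u, w = heapq.heappop(heap)        # minimum-cost crossing edge
        if w in S:
            continue                         # both endpoints already in S
        S.add(w)
        T.append((u, w, c))
        for x, cx in adj[w]:
            if x not in S:
                heapq.heappush(heap, (cx, w, x))
    return T

adj = {"a": [("b", 2), ("c", 1)], "b": [("a", 2), ("c", 3), ("d", 50)],
       "c": [("a", 1), ("b", 3), ("d", 51)],
       "d": [("b", 50), ("c", 51), ("e", 0.5)], "e": [("d", 0.5)]}
print(prim(adj, "a"))  # [('a', 'c', 1), ('a', 'b', 2), ('b', 'd', 50), ('d', 'e', 0.5)]
```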

Page 41: CSE 331: Review.

Cut Property Lemma for MSTs

[Illustration: a cut separating S from V \ S]

Cheapest crossing edge is in all MSTs

Condition: S and V\S are non-empty

Assumption: All edge costs are distinct

Page 42: CSE 331: Review.

Divide & Conquer

Page 43: CSE 331: Review.

Sorting

Merge-Sort

Page 44: CSE 331: Review.

Sorting

Given n numbers, order them from smallest to largest

Works for any set of elements on which there is a total order

Page 45: CSE 331: Review.

Mergesort algorithm

Input: a_1, a_2, …, a_n; Output: numbers in sorted order

MergeSort( a, n )

If n = 2 return the order min(a_1,a_2); max(a_1,a_2)

a_L = a_1, …, a_{n/2}

a_R = a_{n/2+1}, …, a_n

return MERGE( MergeSort(a_L, n/2), MergeSort(a_R, n/2) )
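A minimal Python sketch of MergeSort as described above (not from the slides); the n = 1 base case noted later is included.

```python
# MergeSort: split, recurse, then MERGE the two sorted halves.
def merge_sort(a):
    n = len(a)
    if n <= 1:
        return list(a)
    aL, aR = a[:n // 2], a[n // 2:]
    return merge(merge_sort(aL), merge_sort(aR))

def merge(left, right):
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):   # repeatedly take the smaller front element
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

print(merge_sort([51, 1, 19, 100, 2, 8, 4, 3]))  # [1, 2, 3, 4, 8, 19, 51, 100]
```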

Page 46: CSE 331: Review.

An example run

MergeSort( a, n )

If n = 2 return the order min(a_1,a_2); max(a_1,a_2)

a_L = a_1, …, a_{n/2}

a_R = a_{n/2+1}, …, a_n

return MERGE( MergeSort(a_L, n/2), MergeSort(a_R, n/2) )

[Illustration: an example run on eight numbers; the left half sorts to 1 19 51 100, the right half sorts to 2 3 4 8, and MERGE produces 1 2 3 4 8 19 51 100]

Page 47: CSE 331: Review.

Correctness

Input: a_1, a_2, …, a_n; Output: numbers in sorted order

MergeSort( a, n )

If n = 2 return the order min(a_1,a_2); max(a_1,a_2)

a_L = a_1, …, a_{n/2}

a_R = a_{n/2+1}, …, a_n

return MERGE( MergeSort(a_L, n/2), MergeSort(a_R, n/2) )

By induction on n

Inductive step follows from correctness of MERGE

If n = 1, return the order a_1

Page 48: CSE 331: Review.

Counting Inversions

Merge-Count

Page 49: CSE 331: Review.

Mergesort-Count algorithm

Input: a_1, a_2, …, a_n; Output: numbers in sorted order + #inversions

MergeSortCount( a, n )

If n = 2 return ( a_1 > a_2, min(a_1,a_2); max(a_1,a_2) )

a_L = a_1, …, a_{n/2}; a_R = a_{n/2+1}, …, a_n

(c_L, a_L) = MergeSortCount(a_L, n/2)

(c_R, a_R) = MergeSortCount(a_R, n/2)

(c, a) = MERGE-COUNT(a_L, a_R)   [counts #crossing-inversions + MERGE]

return (c + c_L + c_R, a)

MERGE-COUNT takes O(n) time

T(2) = c and T(n) = 2T(n/2) + cn, so O(n log n) time

If n = 1 return ( 0, a_1 )
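A minimal Python sketch of MergeSortCount (not from the slides); MERGE-COUNT charges, for each element taken from the right half, the number of left-half elements still unmerged, which are exactly its crossing inversions.

```python
# Count inversions while merge-sorting.
def merge_sort_count(a):
    n = len(a)
    if n <= 1:
        return 0, list(a)
    cL, aL = merge_sort_count(a[:n // 2])
    cR, aR = merge_sort_count(a[n // 2:])
    c, merged = merge_count(aL, aR)
    return c + cL + cR, merged

def merge_count(left, right):
    out, i, j, crossings = [], 0, 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:                                  # right[j] jumps ahead of left[i:]
            crossings += len(left) - i
            out.append(right[j]); j += 1
    return crossings, out + left[i:] + right[j:]

print(merge_sort_count([2, 4, 1, 3, 5]))  # (3, [1, 2, 3, 4, 5])
```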

Page 50: CSE 331: Review.

Closest Pair of Points

Closest Pair of Points Algorithm

Page 51: CSE 331: Review.

Closest pairs of points

Input: n 2-D points P = {p_1, …, p_n}; p_i = (x_i, y_i)

Output: Points p and q that are closest

d(p_i, p_j) = ( (x_i - x_j)² + (y_i - y_j)² )^{1/2}

Page 52: CSE 331: Review.

The algorithm

Input: n 2-D points P = {p_1, …, p_n}; p_i = (x_i, y_i)

Sort P to get Px and Py   [O(n log n)]

Closest-Pair (Px, Py)

If n < 4 then find the closest pair by brute force

Q is the first half of Px and R is the rest   [O(n)]

Compute Qx, Qy, Rx and Ry   [O(n)]

(q0,q1) = Closest-Pair (Qx, Qy)

(r0,r1) = Closest-Pair (Rx, Ry)

δ = min ( d(q0,q1), d(r0,r1) )

S = points (x,y) in P s.t. |x – x*| < δ   [O(n)]

return Closest-in-box (S, (q0,q1), (r0,r1))   [assumed to run in O(n)]

Running time: O(n log n) + T(n), where T(< 4) = c and T(n) = 2T(n/2) + cn, giving O(n log n) overall
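A condensed Python sketch of the whole algorithm (not from the slides); it assumes at least two distinct points, and the strip loop at the end plays the role of Closest-in-box.

```python
# Divide-and-conquer closest pair of points.
from math import dist

def closest_pair(points):
    Px = sorted(points)                                  # sorted by x (then y)
    Py = sorted(points, key=lambda p: (p[1], p[0]))      # sorted by y
    return _closest(Px, Py)

def _closest(Px, Py):
    n = len(Px)
    if n < 4:                                            # brute force on tiny inputs
        return min(((dist(p, q), p, q)
                    for i, p in enumerate(Px) for q in Px[i + 1:]),
                   key=lambda t: t[0])
    mid = n // 2
    x_star = Px[mid - 1][0]                              # dividing x-coordinate
    Qx, Rx = Px[:mid], Px[mid:]
    Qset = set(Qx)
    Qy = [p for p in Py if p in Qset]                    # keep y-order while splitting
    Ry = [p for p in Py if p not in Qset]
    best = min(_closest(Qx, Qy), _closest(Rx, Ry), key=lambda t: t[0])
    delta = best[0]
    S = [p for p in Py if abs(p[0] - x_star) < delta]    # the strip, in y-order
    for i, p in enumerate(S):
        for q in S[i + 1:i + 16]:                        # O(1) strip neighbors suffice
            d = dist(p, q)
            if d < best[0]:
                best = (d, p, q)
    return best

pts = [(0, 0), (3, 4), (1, 1), (7, 7), (2, 0), (6, 7.5)]
print(closest_pair(pts))   # (1.118..., (6, 7.5), (7, 7))
```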

Page 53: CSE 331: Review.

Dynamic Programming

Page 54: CSE 331: Review.

Weighted Interval Scheduling

Scheduling Algorithm

Page 55: CSE 331: Review.

Weighted Interval Scheduling

Input: n jobs (s_i, t_i, v_i)

Output: A schedule S s.t. no two jobs in S have a conflict

Goal: max Σ_{i in S} v_i

Assume: jobs are sorted by their finish time

Page 56: CSE 331: Review.

A recursive algorithm

Compute-Opt(j)

If j = 0 then return 0

return max { v_j + Compute-Opt( p(j) ), Compute-Opt( j-1 ) }

OPT(j) = max { v_j + OPT( p(j) ), OPT(j-1) }

Proof of correctness by induction on j; correct for j = 0

(By the inductive hypothesis, the recursive calls return OPT( p(j) ) and OPT( j-1 ))

Page 57: CSE 331: Review.

Exponential Running Time

[Illustration: five intervals with p(j) = j-2; the recursion tree for OPT(5) branches into OPT(4) and OPT(3), and so on, recomputing the same values many times]

Formal proof: exercise

Only 5 OPT values!

Page 58: CSE 331: Review.

Bounding # recursions

M-Compute-Opt(j)

If j = 0 then return 0

If M[j] is not null then return M[j]

M[j] = max { v_j + M-Compute-Opt( p(j) ), M-Compute-Opt( j-1 ) }

return M[j]

Whenever a recursive call is made, an M value is assigned

At most n values of M can be assigned

O(n) overall

Page 59: CSE 331: Review.

Property of OPT

OPT(j) = max { v_j + OPT( p(j) ), OPT(j-1) }

Given OPT(1), …, OPT(j-1), one can compute OPT(j)

Page 60: CSE 331: Review.

Recursion + memory = Iteration

Iteratively compute the OPT(j) values

Iterative-Compute-Opt

M[0] = 0

For j = 1, …, n

M[j] = max { v_j + M[p(j)], M[j-1] }

M[j] = OPT(j); O(n) run time
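A minimal Python sketch of Iterative-Compute-Opt (not from the slides); jobs are illustrative (s, t, v) triples sorted by finish time, and p(j) is computed with a binary search over the finish times.

```python
# Iterative DP for weighted interval scheduling.
import bisect

def weighted_interval(jobs):
    jobs = sorted(jobs, key=lambda job: job[1])        # sort by finish time
    finish = [t for _, t, _ in jobs]
    n = len(jobs)
    # p[j] = largest i < j (1-indexed) whose finish time is <= start of job j
    p = [0] * (n + 1)
    for j, (s, t, v) in enumerate(jobs, start=1):
        p[j] = bisect.bisect_right(finish, s, 0, j - 1)
    M = [0] * (n + 1)                                  # M[j] = OPT(j)
    for j, (s, t, v) in enumerate(jobs, start=1):
        M[j] = max(v + M[p[j]], M[j - 1])
    return M[n]

jobs = [(0, 3, 2), (1, 5, 4), (4, 6, 4), (3, 9, 7), (5, 10, 2)]
print(weighted_interval(jobs))  # 9
```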

Page 61: CSE 331: Review.

Knapsack Problem

Knapsack Algorithm

Page 62: CSE 331: Review.

Subset Sum Problem

Maximize the weight packed into a bag


[Illustration: items of weight 10, 8, 15, 33, 12, 5, 12, 17 and a bag of capacity W = 40]

Page 63: CSE 331: Review.

Subset Sum Problem

Input: A set of n items, each with weight w_i > 0, and a capacity W

Output: A subset of the n items with maximum sum of weights, under the constraint that the sum of weights is ≤ W

Page 64: CSE 331: Review.

Knapsack Problem

Maximize the value packed into the bag


[Illustration: items (w=10, v=1), (w=8, v=200), (w=15, v=76), (w=35, v=50), (w=12, v=14), (w=1000, v=20), (w=5, v=19), (w=18, v=21) and a bag of capacity W = 40]

Page 65: CSE 331: Review.

Knapsack Problem

Input: A set of n items, each with weight w_i > 0 and value v_i > 0, and a capacity W

Output: A subset of the n items with maximum sum of values, under the constraint that the sum of weights is ≤ W

Page 66: CSE 331: Review.

Dynamic Programming Algorithm: Subset Sum

OPT(i,w) = Maximum weight packed given the first i items and capacity w

If item i can’t fit (w_i > w) then OPT(i,w) = OPT(i-1,w)

If item i can fit, decide whether item i should be packed or not:

OPT(i,w) = max { OPT(i-1,w), w_i + OPT(i-1, w-w_i) }

(first term: don’t pack item i; second term: pack item i)

Output OPT(n,W): with some bookkeeping, can also output the packed set

Page 67: CSE 331: Review.

Dynamic Programming Algorithm: Knapsack Problem

OPT(i,w) = Maximum value packed given the first i items and capacity w

If item i can’t fit (w_i > w) then OPT(i,w) = OPT(i-1,w)

If item i can fit, decide whether item i should be packed or not:

OPT(i,w) = max { OPT(i-1,w), v_i + OPT(i-1, w-w_i) }

(first term: don’t pack item i; second term: pack item i)

Output OPT(n,W): with some bookkeeping, can also output the packed set
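A minimal Python sketch of this recurrence (not from the slides); setting v_i = w_i gives the Subset Sum version, and the items simply echo the earlier illustration.

```python
# Table-filling DP for the knapsack recurrence; items are (w_i, v_i) pairs.
def knapsack(items, W):
    n = len(items)
    # OPT[i][w] = max value using the first i items with capacity w
    OPT = [[0] * (W + 1) for _ in range(n + 1)]
    for i, (wi, vi) in enumerate(items, start=1):
        for w in range(W + 1):
            if wi > w:                                # item i can't fit
                OPT[i][w] = OPT[i - 1][w]
            else:                                     # don't pack vs. pack item i
                OPT[i][w] = max(OPT[i - 1][w], vi + OPT[i - 1][w - wi])
    return OPT[n][W]

items = [(10, 1), (8, 200), (15, 76), (35, 50),
         (12, 14), (1000, 20), (5, 19), (18, 21)]
print(knapsack(items, 40))  # 309
```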

Page 68: CSE 331: Review.

Runtime

OPT(i,w) = max { OPT(i-1,w), v_i + OPT(i-1, w-w_i) }

OPT(i,w) is computed in constant time

nW entries in OPT to be computed for an O(nW) runtime

Page 69: CSE 331: Review.

Shortest Path in a Graph

Bellman-Ford

Page 70: CSE 331: Review.

Shortest Path Problem

Input: A directed graph G=(V,E) where every edge e has a cost c_e (can be < 0)

t in V

Output: Shortest path from every vertex s to t

[Illustration: an s-to-t graph containing a negative-cost cycle (edge costs 1, 1, 100, -1000, 899); the shortest path then has cost negative infinity]

Assume that G has no negative cycle

Page 71: CSE 331: Review.

Recurrence Relation

OPT(i,u) = cost of shortest path from u to t with at most i edges

OPT(i,u) = min { OPT(i-1,u), min_{(u,w) in E} { c_{u,w} + OPT(i-1, w) } }

(first term: path uses ≤ i-1 edges; second term: best path through a neighbor of u)
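A minimal Python sketch of this recurrence, i.e. Bellman-Ford (not from the slides); it uses the common space-saving variant that keeps a single OPT array and relaxes every edge n-1 times, and assumes no negative cycles.

```python
# Bellman-Ford toward t: edges is a list of (u, w, cost) for edge u -> w.
from math import inf

def bellman_ford(vertices, edges, t):
    OPT = {u: inf for u in vertices}         # OPT[u] = best known cost of a u -> t path
    OPT[t] = 0
    for _ in range(len(vertices) - 1):       # shortest paths use at most n-1 edges
        for u, w, cost in edges:
            if cost + OPT[w] < OPT[u]:       # go from u to neighbor w, then on to t
                OPT[u] = cost + OPT[w]
    return OPT

vertices = ["s", "a", "b", "t"]
edges = [("s", "a", 1), ("a", "b", 1), ("s", "b", 100),
         ("b", "t", -10), ("a", "t", 5)]
print(bellman_ford(vertices, edges, "t"))  # {'s': -8, 'a': -9, 'b': -10, 't': 0}
```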

Page 72: CSE 331: Review.

P vs NP

Page 73: CSE 331: Review.

P vs NP question

P: problems that can be solved by poly time algorithms

NP: problems that have a polynomial-time verifiable witness to the optimal solution

Is P=NP?

Alternate NP definition: Guess witness and verify!