Overview What is Dynamic Programming? A Sequence of 4 Steps Longest Common Subsequence Elements of...

Post on 15-Dec-2015

223 views 1 download

Transcript of Overview What is Dynamic Programming? A Sequence of 4 Steps Longest Common Subsequence Elements of...

Overview

• What is Dynamic Programming?

• A Sequence of 4 Steps

• Longest Common Subsequence

• Elements of Dynamic Programming

S-1

What is Dynamic Programming?

• Dynamic programming solves optimization problems by combining solutions to sub-problems

• “Programming” refers to a tabular method with a series of choices, not “coding”

S-2

What is Dynamic Programming? …contd

• A set of choices must be made to arrive at an optimal solution

• As choices are made, subproblems of the same form arise frequently

• The key is to store the solutions of subproblems to be reused in the future

S-3

What is Dynamic Programming? …contd

• Recall the divide-and-conquer approach– Partition the problem into independent

subproblems– Solve the subproblems recursively– Combine solutions of subproblems

• This contrasts with the dynamic programming approach

S-4

What is Dynamic Programming? …contd

• Dynamic programming is applicable when subproblems are not independent

– i.e., subproblems share subsubproblems– Solve every subsubproblem only once and

store the answer for use when it reappears

• A divide-and-conquer approach will do more work than necessary

S-5

A Sequence of 4 Steps

• A dynamic programming approach consists of a sequence of 4 steps

1. Characterize the structure of an optimal solution

2. Recursively define the value of an optimal solution

3. Compute the value of an optimal solution in a bottom-up fashion

4. Construct an optimal solution from computed information

S-6

Longest Common Subsequence (LCS)

Application: comparison of two DNA stringsEx: X= {A B C B D A B }, Y= {B D C A B A}

Longest Common Subsequence:

X = A B C B D A B

Y = B D C A B A

Brute force algorithm would compare each subsequence of X with the symbols in Y

S-7

LCS Algorithm• if |X| = m, |Y| = n, then there are 2m

subsequences of x; we must compare each with Y (n comparisons)

• So the running time of the brute-force algorithm is O(n 2m)

• Notice that the LCS problem has optimal substructure: solutions of subproblems are parts of the final solution.

• Subproblems: “find LCS of pairs of prefixes of X and Y”

S-8

LCS Algorithm• First we’ll find the length of LCS. Later we’ll

modify the algorithm to find LCS itself.

• Define Xi, Yj to be the prefixes of X and Y of length i and j respectively

• Define c[i,j] to be the length of LCS of Xi and Yj

• Then the length of LCS of X and Y will be c[m,n]

otherwise]),1[],1,[max(

],[][ if1]1,1[],[

jicjic

jyixjicjic

S-9

LCS recursive solution

• We start with i = j = 0 (empty substrings of x

and y)

• Since X0 and Y0 are empty strings, their LCS

is always empty (i.e. c[0,0] = 0)

• LCS of empty string and any other string is

empty, so for every i and j: c[0, j] = c[i,0] = 0

otherwise]),1[],1,[max(

],[][ if1]1,1[],[

jicjic

jyixjicjic

S-10

LCS recursive solution

• When we calculate c[i,j], we consider two

cases:

• First case: x[i]=y[j]: one more symbol in

strings X and Y matches, so the length of LCS

Xi and Yj equals to the length of LCS of

smaller strings Xi-1 and Yi-1 , plus 1

otherwise]),1[],1,[max(

],[][ if1]1,1[],[

jicjic

jyixjicjic

S-11

LCS recursive solution

• Second case: x[i] != y[j]

• As symbols don’t match, our solution is not

improved, and the length of LCS(Xi , Yj) is the

same as before (i.e. maximum of LCS(Xi, Yj-1)

and LCS(Xi-1,Yj)

otherwise]),1[],1,[max(

],[][ if1]1,1[],[

jicjic

jyixjicjic

S-12

LCS Length AlgorithmLCS-Length(X, Y)1. m = length(X) 2. n = length(Y)

3. for i = 1 to m c[i,0] = 0 // special case: Y0

4. for j = 1 to n c[0,j] = 0 // special case: X0

5. for i = 1 to m // for all Xi

6. for j = 1 to n // for all Yj

7. if ( Xi == Yj )8. c[i,j] = c[i-1,j-1] + 19. else c[i,j] = max( c[i-1,j], c[i,j-1] )10. return c S-13

LCS ExampleWe’ll see how LCS algorithm works on the

following example:

• X = ABCB

• Y = BDCAB

LCS(X, Y) = BCBX = A B C BY = B D C A B

What is the Longest Common Subsequence of X and Y?

S-14

LCS Example (0)j 0 1 2 3 4 5

0

1

2

3

4

i

Xi

A

B

C

B

Yj BB ACD

X = ABCB; m = |X| = 4Y = BDCAB; n = |Y| = 5Allocate array c[4,5]

ABCBBDCAB

S-15

LCS Example (1)j 0 1 2 3 4 5

0

1

2

3

4

i

Xi

A

B

C

B

Yj BB ACD

0

0

00000

0

0

0

for i = 1 to m c[i,0] = 0 for j = 1 to n c[0,j] = 0

ABCBBDCAB

S-16

LCS Example (2)j 0 1 2 3 4 5

0

1

2

3

4

i

Xi

A

B

C

B

Yj BB ACD

0

0

00000

0

0

0

if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] )

0

ABCBBDCAB

S-17

LCS Example (3)j 0 1 2 3 4 5

0

1

2

3

4

i

Xi

A

B

C

B

Yj BB ACD

0

0

00000

0

0

0

if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] )

0 0 0

ABCBBDCAB

S-18

LCS Example (4)j 0 1 2 3 4 5

0

1

2

3

4

i

Xi

A

B

C

B

Yj BB ACD

0

0

00000

0

0

0

if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] )

0 0 0 1

ABCBBDCAB

S-19

LCS Example (5)j 0 1 2 3 4 5

0

1

2

3

4

i

Xi

A

B

C

B

Yj BB ACD

0

0

00000

0

0

0

if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] )

000 1 1

ABCBBDCAB

S-20

LCS Example (6)j 0 1 2 3 4 5

0

1

2

3

4

i

Xi

A

B

C

B

Yj BB ACD

0

0

00000

0

0

0

if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] )

0 0 10 1

1

ABCBBDCAB

S-21

LCS Example (7)j 0 1 2 3 4 5

0

1

2

3

4

i

Xi

A

B

C

B

Yj BB ACD

0

0

00000

0

0

0

if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] )

1000 1

1 1 11

ABCBBDCAB

S-22

LCS Example (8)j 0 1 2 3 4 5

0

1

2

3

4

i

Xi

A

B

C

B

Yj BB ACD

0

0

00000

0

0

0

if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] )

1000 1

1 1 1 1 2

ABCBBDCAB

S-23

LCS Example (10)j 0 1 2 3 4 5

0

1

2

3

4

i

Xi

A

B

C

B

Yj BB ACD

0

0

00000

0

0

0

if ( Xi == Yj )c[i,j] = c[i-1,j-1] + 1

else c[i,j] = max( c[i-1,j], c[i,j-1] )

1000 1

21 1 11

1 1

ABCBBDCAB

S-24

LCS Example (11)j 0 1 2 3 4 5

0

1

2

3

4

i

Xi

A

B

C

B

Yj BB ACD

0

0

00000

0

0

0

if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] )

1000 1

1 21 11

1 1 2

ABCBBDCAB

S-25

LCS Example (12)j 0 1 2 3 4 5

0

1

2

3

4

i

Xi

A

B

C

B

Yj BB ACD

0

0

00000

0

0

0

if ( Xi == Yj )c[i,j] = c[i-1,j-1] + 1

else c[i,j] = max( c[i-1,j], c[i,j-1] )

1000 1

1 21 1

1 1 2

1

22

ABCBBDCAB

S-26

LCS Example (13)j 0 1 2 3 4 5

0

1

2

3

4

i

Xi

A

B

C

B

Yj BB ACD

0

0

00000

0

0

0

if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] )

1000 1

1 21 1

1 1 2

1

22

1

ABCBBDCAB

S-27

LCS Example (14)j 0 1 2 3 4 5

0

1

2

3

4

i

Xi

A

B

C

B

Yj BB ACD

0

0

00000

0

0

0

if ( Xi == Yj )c[i,j] = c[i-1,j-1] + 1

else c[i,j] = max( c[i-1,j], c[i,j-1] )

1000 1

1 21 1

1 1 2

1

22

1 1 2 2

ABCBBDCAB

S-28

LCS Example (15)j 0 1 2 3 4 5

0

1

2

3

4

i

Xi

A

B

C

B

Yj BB ACD

0

0

00000

0

0

0

if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] )

1000 1

1 21 1

1 1 2

1

22

1 1 2 2 3

ABCBBDCAB

S-29

LCS Algorithm Running Time

• LCS algorithm calculates the values of each entry of the array c[m,n]

• So what is the running time?

O(m*n)

since each c[i,j] is calculated in constant time, and there are m*n elements in the array

S-30

How to find actual LCS• So far, we have just found the length of LCS,

but not LCS itself.

• We want to modify this algorithm to make it output Longest Common Subsequence of X and Y

Each c[i,j] depends on c[i-1,j] and c[i,j-1]

or c[i-1, j-1]

For each c[i,j] we can say how it was acquired:

2

2 3

2 For example, here c[i,j] = c[i-1,j-1] +1 = 2+1=3

S-31

How to find actual LCS - continued

• Remember that

otherwise]),1[],1,[max(

],[][ if1]1,1[],[

jicjic

jyixjicjic

So we can start from c[m,n] and go backwards

Whenever c[i,j] = c[i-1, j-1]+1, remember x[i] (because x[i] is a part of LCS)

When i=0 or j=0 (i.e. we reached the beginning), output remembered letters in reverse order S-32

Finding LCSj 0 1 2 3 4 5

0

1

2

3

4

i

Xi

A

B

C

Yj BB ACD

0

0

00000

0

0

0

1000 1

1 21 1

1 1 2

1

22

1 1 2 2 3B

S-33

Finding LCS (2)j 0 1 2 3 4 5

0

1

2

3

4

i

Xi

A

B

C

Yj BB ACD

0

0

00000

0

0

0

1000 1

1 21 1

1 1 2

1

22

1 1 2 2 3B

B C BLCS (reversed order):

LCS (straight order): B C B

S-34

Elements of Dynamic Programming

• For dynamic programming to be applicable, an optimization problem must have:

1. Optimal substructure– An optimal solution to the problem contains within it

optimal solution to subproblems

2. Overlapping subproblems– The space of subproblems must be small; i.e., the

same subproblems are encountered over and over

S-35

Review: Dynamic programming

• DP is a method for solving certain kind of problems

• DP can be applied when the solution of a problem includes solutions to subproblems

• We need to find a recursive formula for the solution

• We can recursively solve subproblems, starting from the trivial case, and save their solutions in memory

• In the end we’ll get the solution of the whole problem

S-36

Properties of a problem that can be solved with dynamic

programming• Simple Subproblems

– We should be able to break the original problem to smaller subproblems that have the same structure

• Optimal Substructure of the problems– The solution to the problem must be a

composition of subproblem solutions

• Subproblem Overlap– Optimal subproblems to unrelated problems can

contain subproblems in common

S-37

Review: Longest Common Subsequence (LCS)

• Problem: how to find the longest pattern of characters that is common to two text strings X and Y

• Dynamic programming algorithm: solve subproblems until we get the final solution

• Subproblem: first find the LCS of prefixes of X and Y.

• this problem has optimal substructure: LCS of two prefixes is always a part of LCS of bigger strings

S-38

Review: Longest Common Subsequence (LCS)

continued• Define Xi, Yj to be prefixes of X and Y of

length i and j; m = |X|, n = |Y| • We store the length of LCS(Xi, Yj) in c[i,j]

• Trivial cases: LCS(X0 , Yj ) and LCS(Xi, Y0) is empty (so c[0,j] = c[i,0] = 0 )

• Recursive formula for c[i,j]:

otherwise]),1[],1,[max(

],[][ if1]1,1[],[

jicjic

jyixjicjic

c[m,n] is the final solutionS-39

Review: Longest Common Subsequence (LCS)

• After we have filled the array c[ ], we can use this data to find the characters that constitute the Longest Common Subsequence

• Algorithm runs in O(m*n), which is much better than the brute-force algorithm: O(n 2m)

S-40

Conclusion• Dynamic programming is a useful

technique of solving certain kind of problems

• When the solution can be recursively described in terms of partial solutions, we can store these partial solutions and re-use them as necessary

• Running time (Dynamic Programming algorithm vs. brute-force algorithm):– LCS: O(m*n) vs. O(n * 2m)

S-41

The End

S-42