CS 231: Algorithmic Problem Solving
Naomi Nishimura

Module 5
Date of this version: November 22, 2019

WARNING: Drafts of slides are made available prior to lecture for your convenience. After lecture, slides will be updated to reflect material taught. Check the date on this page to make sure you have the correct, updated version.

WARNING: Slides do not include all class material; if you have missed a lecture, make sure to find out from a classmate what material was presented verbally or on the board.
CS 231 Module 5 1 / 39
Matrix-chain multiplication
Matrix-chain multiplication
Input: A sequence (chain) of matrices M0, . . . , Mn−1, where Mi has dimension di × di+1
Output: A parenthesization that results in the smallest number of multiplications of pairs of values
Possible parenthesizations: (M0M1)M2 and M0(M1M2)
Example: M0 has dimensions 2 × 10, M1 has dimensions 10 × 30, and M2 has dimensions 30 × 3.
CS 231 Module 5 Matrix-chain multiplication 2 / 39
Approach 3: Divide-and-conquer
What if we knew that the last multiplication in an optimal solution was (M0 . . . Mk)(Mk+1 . . . Mn−1)?
Then the total cost would be:
• The optimal cost of forming A by multiplying M0 through Mk .
• The optimal cost of forming B by multiplying Mk+1 through Mn−1.
• The cost of multiplying A and B.
CS 231 Module 5 Matrix-chain multiplication 3 / 39
Divide-and-conquer, continued
Issues to resolve:
1. How do we handle the fact that we don’t know k?
2. How do we determine the optimal costs of forming A and B?
Ideas:
1. Try all possible values of k .
2. Use the same algorithm for each instance.
Problems:
• We end up trying all possibilities.
• We recompute some pieces more than once.
CS 231 Module 5 Matrix-chain multiplication 4 / 39
Paradigm: Dynamic programming

Features:
• Define a solution to an instance of a problem in terms of solutions to smaller instances formed from the original instance.
• Solve each instance and store solutions in a table.
• Choose an order such that solutions can be found in the table when they are needed.
• Extract the solution to the original instance from the table.

Works when:
• The Optimal Substructure Property holds.
• Instances can be ordered so that smaller instances are solved before their solutions are needed to solve bigger instances.

History:
• Popularized by Richard Bellman through work started in 1955
• “Programming” means the use of a tabular solution method
CS 231 Module 5 Paradigm: Dynamic programming 5 / 39
Optimal substructure and dynamic programming
Dynamic programming, like the greedy approach, takes advantage of optimal substructure.

A dynamic programming algorithm may end up trying various options to figure out which way of splitting up the problem is the best.
CS 231 Module 5 Paradigm: Dynamic programming 6 / 39
Dynamic programming recipe
Recipe for dynamic programming:
1. Determine how the optimal solution can be formed from optimal solutions to smaller instances.
2. Determine what information should be stored in each table entry.
3. Determine the shape of the table or tables needed to store the solutions to the smaller instances.
4. Determine the base cases.
5. Choose an order of evaluation.
6. Extract the solution from the table.
CS 231 Module 5 Paradigm: Dynamic programming 7 / 39
Using dynamic programming for matrix-chain multiplication

Step 1: Determine how the optimal solution can be formed from optimal solutions to smaller instances.

Starting with a generic optimal solution to the original instance:
• Show that the Optimal Substructure Property holds.
• Consider all the possible ways it may contain optimal solutions to smaller instances.
• Write a formula that encompasses all the possibilities.
CS 231 Module 5 Paradigm: Dynamic programming 8 / 39
A generic optimal solution to matrix-chain multiplication
In an optimal solution, the very last multiplication will consist of multiplying a pair of matrices A and B such that for some value of k:
• A is formed by multiplying M0 through Mk
• B is formed by multiplying Mk+1 through Mn−1

We have already executed the first two steps in the Optimal Substructure recipe. We still need to show that if any piece O′ of O is not an optimal solution for a smaller instance I′ of I, then O is not an optimal solution for I.
CS 231 Module 5 Paradigm: Dynamic programming 9 / 39
Determining a formula
Note: We don’t know the value of k, other than that it must be in the range from 0 to n−2.

Suppose we consider the optimal costs of multiplying M0 through Mk and Mk+1 through Mn−1 for all values of k.

One value (or more) will give us the optimal values.

The wrong choice of k will result in a value that is bigger than the optimal value.

If we take the minimum value obtained for any choice of k, we will obtain the correct value for the optimal solution.
CS 231 Module 5 Paradigm: Dynamic programming 10 / 39
Defining smaller instances
To be able to repeat the process for each smaller instance, we need to be able to break up each smaller instance in all possible ways.

The smaller pieces may not start at M0 or end at Mn−1.

We define m[i, j] to be the smallest cost to multiply Mi through Mj, where
m[i, j] = min over i ≤ k < j of { m[i, k] + m[k+1, j] + di · dk+1 · dj+1 }.

The matrix formed by multiplying Mi through Mk will have dimensions di × dk+1, and the matrix formed by multiplying Mk+1 through Mj will have dimensions dk+1 × dj+1.
CS 231 Module 5 Paradigm: Dynamic programming 11 / 39
Matrix chain multiplication
Step 2: Determine what information should be stored in each table entry.
We store each value m[i, j].

Step 3: Determine the shape of the table or tables needed to store the solutions to the smaller instances.
A triangle (or a grid).

Step 4: Determine the base cases.
When m[i, j] requires no multiplications, the cost is 0. Thus, m[i, i] = 0 for all values of i.

Step 5: Choose an order of evaluation.
When we calculate m[i, j], we want to be able to look up the values of m[i, k] and m[k+1, j] in the table.
m[i, j] = min_k { m[i, k] + m[k+1, j] + di · dk+1 · dj+1 }
To ensure that the “smaller” problems have been stored before they are used, we view the “size” of m[i, j] as j − i and fill in entries in nondecreasing order of size.
CS 231 Module 5 Paradigm: Dynamic programming 12 / 39
How to implement the filling of a table
To write an algorithm that fills in table entries, typically one loop is used for each dimension in the table.

For a one-dimensional table or list, the single loop will typically iterate over indices.

For a two-dimensional table:
• If filled row by row, typically the outer loop will iterate over row indices and the inner loop will iterate over column indices.
• If filled column by column, typically the outer loop will iterate over column indices and the inner loop will iterate over row indices.
• If filled diagonal by diagonal, typically the outer loop will iterate over diagonals and the inner loop will iterate either over row or column indices.
CS 231 Module 5 Paradigm: Dynamic programming 13 / 39
Filling diagonal by diagonal
Information to determine:
• Number of diagonals
• Starting and ending number of columns (or rows)
• For each diagonal and column (or row), the row (or column) of the entry in the given diagonal
CS 231 Module 5 Paradigm: Dynamic programming 14 / 39
Filling in the table for matrix chain multiplication
Information to determine:
• Number of diagonals: fill in the table by increasing value of j − i. One diagonal for each value of j − i, for a total of n.
• Starting and ending number of columns (or rows): values range from 0 to n−1.
• For each diagonal and column (or row), the row (or column) of the entry in the given diagonal: for diagonal d = j − i and column j, use row j − d.
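The three answers above can be sketched as a short loop (my own variable names, with n = 4 as in the running example) that lists the entries in the order they will be filled:

```python
n = 4
order = []                      # entries (i, j) in fill order
for d in range(1, n):           # one diagonal per value of d = j - i
    for j in range(d, n):       # columns d .. n-1 on this diagonal
        i = j - d               # row determined by diagonal and column
        order.append((i, j))
print(order)
# [(0, 1), (1, 2), (2, 3), (0, 2), (1, 3), (0, 3)]
```

Every entry m[i, j] with i < j appears exactly once, and every entry on a shorter diagonal appears before any entry on a longer one.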
CS 231 Module 5 Paradigm: Dynamic programming 15 / 39
Details of order of evaluation
Order in which entries are filled (n = 4):

        j=0  j=1  j=2  j=3
  i=0    1    5    8   10
  i=1         2    6    9
  i=2              3    7
  i=3                   4

Fill in increasing order of j − i (diagonals).
Within each diagonal, work in increasing order of j.
The first diagonal has i = j; those costs are all 0.
The solution is at the top right corner.
CS 231 Module 5 Paradigm: Dynamic programming 16 / 39
Choosing what to store
Step 6: Extract the solution from the table.
Table entry [0, n−1] will contain the minimum cost, but how do we figure out the actual sequence of multiplications?

Go back to Step 2 and also store which k is used to obtain the minimum value.

Once the table is full, use values stored in entry [0, n−1] to trace back through entries to extract ordering information.
CS 231 Module 5 Paradigm: Dynamic programming 17 / 39
Filling in diagonal j− i = 1
In each entry, store:
• minimum value, and
• which value k results in that value.
Recall: m[i, j] = min_k { m[i, k] + m[k+1, j] + di · dk+1 · dj+1 }

d0 = 2, d1 = 10, d2 = 30, d3 = 3, d4 = 1

        j=0  j=1        j=2        j=3
  i=0    0   600 (k=0)
  i=1         0         900 (k=1)
  i=2                    0          90 (k=2)
  i=3                               0

m[0,1] = m[0,0] + m[1,1] + d0·d1·d2 = 600
m[1,2] = m[1,1] + m[2,2] + d1·d2·d3 = 900
m[2,3] = m[2,2] + m[3,3] + d2·d3·d4 = 90
CS 231 Module 5 Paradigm: Dynamic programming 18 / 39
Filling in diagonal j− i = 2
Recall: m[i, j] = min_k { m[i, k] + m[k+1, j] + di · dk+1 · dj+1 }
d0 = 2, d1 = 10, d2 = 30, d3 = 3, d4 = 1

        j=0  j=1        j=2        j=3
  i=0    0   600 (k=0)  780 (k=1)
  i=1         0         900 (k=1)  390 (k=1)
  i=2                    0          90 (k=2)
  i=3                               0

Try values k = 0 and k = 1 for m[0,2]:
m[0,2] = min { m[0,0] + m[1,2] + d0·d1·d3 = 960,
               m[0,1] + m[2,2] + d0·d2·d3 = 780 } = 780

Try values k = 1 and k = 2 for m[1,3]:
m[1,3] = min { m[1,1] + m[2,3] + d1·d2·d4 = 390,
               m[1,2] + m[3,3] + d1·d3·d4 = 930 } = 390
CS 231 Module 5 Paradigm: Dynamic programming 19 / 39
Filling in diagonal j − i = 3

Try values k = 0, k = 1, and k = 2 for m[0,3]:
m[0,3] = min { m[0,0] + m[1,3] + d0·d1·d4 = 410,
               m[0,1] + m[2,3] + d0·d2·d4 = 750,
               m[0,2] + m[3,3] + d0·d3·d4 = 786 } = 410

        j=0  j=1        j=2        j=3
  i=0    0   600 (k=0)  780 (k=1)  410 (k=0)
  i=1         0         900 (k=1)  390 (k=1)
  i=2                    0          90 (k=2)
  i=3                               0
m[0,3] uses k = 0 to split M0 and M1 through M3
m[1,3] uses k = 1 to split M1 and M2 through M3
Solution: M0(M1(M2M3))
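The worked example can be reproduced with a short bottom-up implementation (a sketch; function and variable names are my own, not from the course code):

```python
def matrix_chain(d):
    """d has length n+1; matrix Mi has dimensions d[i] x d[i+1].
    Returns (m, s): m[i][j] is the minimum cost for Mi..Mj,
    s[i][j] the split k that achieves it."""
    n = len(d) - 1
    m = [[0] * n for _ in range(n)]
    s = [[0] * n for _ in range(n)]
    for size in range(1, n):               # diagonal: size = j - i
        for i in range(n - size):
            j = i + size
            best = None
            for k in range(i, j):          # try every last split
                cost = m[i][k] + m[k + 1][j] + d[i] * d[k + 1] * d[j + 1]
                if best is None or cost < best:
                    best, s[i][j] = cost, k
            m[i][j] = best
    return m, s

def parenthesize(s, i, j):
    """Trace back through the stored splits to rebuild the parenthesization."""
    if i == j:
        return "M%d" % i
    k = s[i][j]
    return "(%s%s)" % (parenthesize(s, i, k), parenthesize(s, k + 1, j))

m, s = matrix_chain([2, 10, 30, 3, 1])
print(m[0][3], parenthesize(s, 0, 3))   # 410 (M0(M1(M2M3)))
```

The traceback starting from entry [0, n−1] mirrors the slides: k = 0 splits off M0, then k = 1 inside m[1,3] splits off M1, giving M0(M1(M2M3)).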
CS 231 Module 5 Paradigm: Dynamic programming 20 / 39
Assessing a dynamic programming algorithm
Recipe for assessing an algorithm:
1. Determine the (worst-case) running time.
2. Ensure that the algorithm produces the correct output for any instance.

For dynamic programming algorithms:
1. Typically, the cost is dominated by the product of the number of table entries and the cost of filling one entry. The cost of extracting a solution is usually no greater than the cost of filling the table.
2. Correctness is proved by showing that the problem has optimal substructure and that the order of evaluation ensures that the algorithm finds the correct smaller instances to check.

The space required is typically the number of table entries.
CS 231 Module 5 Analysis 21 / 39
Expressing the solution in terms of other solutions
Common methods based on types of inputs:
Set: Determine an order on the elements. Smaller instances are defined using smaller subsets of the elements. Bigger instances are formed by adding elements one at a time.

Sequence or Grid: Define a problem in terms of the position or positions in the sequence. Smaller instances are at earlier positions. Bigger instances are at later positions.

Tree: Define a problem on a subtree. Smaller instances are on smaller subtrees. Bigger instances are on bigger subtrees.
CS 231 Module 5 General approaches to dynamic programming 22 / 39
Determining what information should be stored in eachtable entry
Common types of information stored:
Decision problem: True or False (solutions to smaller instances)
Evaluation problem: Values (solutions to smaller instances)
Search or constructive problem: Information indicating which smaller instances led to the optimal solution

Because the cost of the algorithm will depend on the amount of information stored in each entry (the cost of calculating it as well as the cost of reading it), often full solutions are not stored.

Instead, they can be reconstructed in asymptotically the same amount of time required to fill the table.
CS 231 Module 5 General approaches to dynamic programming 23 / 39
Common table shapes
1D: Typically used when each problem has only one variable.

2 1D: Typically used when each problem has two variables, but bigger instances depend only on one-smaller values of one variable. Examples: the input naturally has two variables (e.g. grids or graphs) or there are two inputs.

2D: Typically used when each problem has two variables, and bigger instances depend on more than just one-smaller values of one variable. Sometimes a grid is used, and sometimes a triangle if, for example, M[i,j] = M[j,i] or one of the two is not defined.

2 2D: Typically used in situations like those for 2 1D tables, but this time for problems with three variables.

3D or higher: Typically used when problems have three or more variables.

For a tree, often nodes are ordered from leaves to root.
CS 231 Module 5 General approaches to dynamic programming 24 / 39
Longest common subsequence
A subsequence of a string x0x1 . . . xn−1 is any string xi1xi2 . . . xij such that 0 ≤ i1 < i2 < · · · < ij ≤ n−1.

Example: The string "abcde" is a subsequence of the string "000a0b0000cd0000e".
Longest common subsequence
Input: A string X of length m and a string Y of length n
Output: A string Z that is a subsequence of both X and Y and of maximum length
Example:
X = a r m a g e d d o n
Y = a a r d v a r k s
CS 231 Module 5 Longest common subsequence 25 / 39
Recipe Step 1 for LCS

Suppose Z is an optimal solution. Where in X and Y can the last symbol in Z be found?
• Matching the last symbols in both X and Y.
• Matching a non-last symbol in X.
• Matching a non-last symbol in Y.
CS 231 Module 5 Longest common subsequence 26 / 39
Defining smaller instances
CS 231 Module 5 Longest common subsequence 27 / 39
Recipe Step 2 for LCS
We’ll use Python slice notation to represent smaller strings:
• Z[:k] is the first k symbols (positions 0 to k−1)
• Z[k:] is all but the first k symbols (positions k and on)
• Z[j:k] is symbols j to k−1
Suppose Z[:k] is an LCS of X[:i] and Y[:j].

Claim
Case 1: xi−1 = yj−1 ⇒ Z[:k−1] is an LCS of X[:i−1] and Y[:j−1].
Case 2: xi−1 ≠ yj−1 and xi−1 ≠ zk−1 ⇒ Z[:k] is an LCS of X[:i−1] and Y[:j].
Case 3: xi−1 ≠ yj−1 and yj−1 ≠ zk−1 ⇒ Z[:k] is an LCS of X[:i] and Y[:j−1].

For C[i, j] the length of the LCS of X[:i] and Y[:j]:
C[i, j] = C[i−1, j−1] + 1 if xi−1 = yj−1
C[i, j] = max{ C[i−1, j], C[i, j−1] } otherwise
CS 231 Module 5 Longest common subsequence 28 / 39
Steps 2 through 5

Step 2: Determine what information should be stored in each table entry.
Store the length of the subsequence as well as whether the best value of C[i, j] was found using C[i−1, j−1], C[i−1, j], or C[i, j−1].

Step 3: Determine the shape of the table or tables needed to store the solutions to the smaller instances.
A grid of dimensions (m+1) × (n+1).

Step 4: Determine the base cases.
Use value 0 to denote the empty string, so C[0,0] = C[0, j] = C[i, 0] = 0.

Step 5: Choose an order of evaluation.
C[i, j] = C[i−1, j−1] + 1 if xi−1 = yj−1
C[i, j] = max{ C[i−1, j], C[i, j−1] } otherwise
We need to know C[i−1, j−1], C[i−1, j], and C[i, j−1] before C[i, j].
Use increasing values of i and then increasing values of j (hence row by row, and within each row, column by column).
CS 231 Module 5 Longest common subsequence 29 / 39
Completion and analysis
Step 6: Extract the solution from the table.
C[m, n] contains the length of the longest common subsequence of the inputs.

To extract the sequence, store information on which of the three values led to the best answer.
Analysis:
Space: O(mn)
Time: O(mn)
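The recurrence and traceback can be sketched in a few lines (my own names; the course's sample code may differ). The table is filled row by row, and the traceback follows the "which value won" decisions backwards from C[m][n]:

```python
def lcs(X, Y):
    """Bottom-up LCS: C[i][j] = length of an LCS of X[:i] and Y[:j]."""
    m, n = len(X), len(Y)
    C = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):              # row by row ...
        for j in range(1, n + 1):          # ... then column by column
            if X[i - 1] == Y[j - 1]:
                C[i][j] = C[i - 1][j - 1] + 1
            else:
                C[i][j] = max(C[i - 1][j], C[i][j - 1])
    # trace back from C[m][n], collecting matched characters
    out, i, j = [], m, n
    while i > 0 and j > 0:
        if X[i - 1] == Y[j - 1]:
            out.append(X[i - 1]); i -= 1; j -= 1
        elif C[i - 1][j] >= C[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))

print(lcs("abcde", "000a0b0000cd0000e"))    # abcde
print(len(lcs("armageddon", "aardvarks")))  # 3
```

Here the decisions are re-derived from C during the traceback rather than stored in a second grid; storing them explicitly (as the slides suggest) trades space for a simpler walk.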
CS 231 Module 5 Longest common subsequence 30 / 39
Filling in the table for LCS
[Table diagram: entries numbered in the order in which they are filled]

Information to determine:
• Number of diagonals
• Starting and ending number of columns
• For each diagonal and column, the row of the entry in the given diagonal
CS 231 Module 5 Longest common subsequence 31 / 39
Completing the implementation
Possible implementation choices:
• Use a Grid or two to store values and to keep track of which gave a best solution (plus the character if a match is found).
• Extract the solution by working backwards from C[m, n] and following “arrows” to previous best solutions, adding characters when found.

See sample code on the website for calculating just the value and also for calculating the value and extracting the subsequence.
CS 231 Module 5 Longest common subsequence 32 / 39
All-pairs cheapest paths
All-pairs cheapest paths
Input: A graph G with non-negative edge weights
Output: The least-cost paths between each pair of vertices in G
Note: Other methods needed when there can be negative cycles.
Naïve method: run Dijkstra’s algorithm n times, once from each vertex.
Floyd-Warshall dynamic programming algorithm
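A sketch of Floyd–Warshall (assuming an adjacency-matrix input with math.inf for missing edges; names are my own): after iteration k, dist[i][j] is the cheapest i-to-j cost using only vertices 0 . . . k as intermediates, so the smaller instances are "paths with fewer allowed intermediates".

```python
import math

def floyd_warshall(w):
    """w: n x n matrix of edge weights (math.inf if no edge, 0 on the diagonal).
    Returns the matrix of cheapest path costs between all pairs."""
    n = len(w)
    dist = [row[:] for row in w]        # base case: no intermediates allowed
    for k in range(n):                  # now allow vertex k as an intermediate
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

INF = math.inf
w = [[0, 3, INF],
     [INF, 0, 2],
     [1, INF, 0]]
print(floyd_warshall(w))   # [[0, 3, 5], [3, 0, 2], [1, 4, 0]]
```

The three nested loops give O(n^3) time over O(n^2) table entries, matching the "entries × cost per entry" analysis pattern from earlier.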
CS 231 Module 5 All-pairs cheapest paths 33 / 39
Knapsack (or 0-1 knapsack)
Note: In this version, each entire object is either included in or excluded from the knapsack.
Knapsack
Input: A set of n types of objects, where object i has positive integer weight wi and positive integer value vi, and an integer weight bound W
Output: A subset of objects with total weight at most W and the maximum total value
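The slides stop at the problem statement; as a sketch (my own names, using the standard include-or-exclude recurrence rather than anything given here), the table T[i][w] holds the best value achievable using only the first i objects with capacity w:

```python
def knapsack(weights, values, W):
    """0-1 knapsack: T[i][w] = best value using the first i objects, capacity w."""
    n = len(weights)
    T = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for w in range(W + 1):
            T[i][w] = T[i - 1][w]                 # exclude object i-1
            if weights[i - 1] <= w:               # or include it, if it fits
                T[i][w] = max(T[i][w],
                              T[i - 1][w - weights[i - 1]] + values[i - 1])
    return T[n][W]

print(knapsack([2, 3, 4], [3, 4, 6], 5))   # 7: take the objects of weight 2 and 3
```

This is a "2 1D"-style table in the earlier classification: row i depends only on row i−1, so two 1D rows suffice if the full table is not needed for traceback.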
CS 231 Module 5 Knapsack 34 / 39
Options for dynamic programming on graphs
Idea:
• Remove a set of one or more vertices to break a graph into smaller graphs
• Solve instances on smaller graphs that can be combined to solve the original instance on a larger graph
• Solutions to additional problems on smaller graphs may also be needed

Challenges in splitting up a graph:
• Using a large set of vertices to break up the graph may make it expensive to account for solutions that involve vertices in the set
• Breaking a graph into a large number of smaller graphs may make it expensive to combine the information in the smaller graphs
CS 231 Module 5 Graphs 35 / 39
Using dynamic programming on assignments and exams
In many of the examples seen in lectures, the biggest challenge is figuring out how to express an optimal solution in terms of optimal solutions to smaller instances.

It is not expected that on your own you would be able to devise expressions for problems as challenging as, for example, all-pairs cheapest paths.

For assignments and exams, you can expect either to be provided with the expression or to be asked a question in which finding such an expression is relatively simple.

You will be expected to understand the rest of the steps in the recipe to be able to figure those out on your own.
CS 231 Module 5 Summary 36 / 39
Comparing paradigms
Aspect              Exhaustive  Greedy  D-and-C  DP
Feasible solutions  All         Some    Some     Some
Applicability       Wide        Narrow  Medium   Medium
Speed               Slow        Fast    Medium   Medium
Divide-and-conquer:
• assumes you know which problems are needed
• works top down
• may repeat smaller instances
Dynamic programming:
• does not assume you know which smaller instances are needed
• works bottom up (a top-down version, memoization, exists)
• does not repeat smaller instances
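The top-down memoized variant mentioned above can be sketched for the matrix-chain example from earlier slides (a sketch using functools.lru_cache; names are my own): the recurrence is unchanged, but each subproblem is computed on first demand and cached instead of being scheduled bottom up.

```python
from functools import lru_cache

def matrix_chain_memo(d):
    """Top-down matrix-chain cost: same recurrence as the bottom-up table,
    but each m(i, j) is computed once, when first needed, and cached."""
    @lru_cache(maxsize=None)
    def m(i, j):
        if i == j:
            return 0
        return min(m(i, k) + m(k + 1, j) + d[i] * d[k + 1] * d[j + 1]
                   for k in range(i, j))
    return m(0, len(d) - 2)

print(matrix_chain_memo((2, 10, 30, 3, 1)))   # 410
```

Unlike the plain divide-and-conquer version, no subchain cost is ever recomputed; unlike the bottom-up version, no order of evaluation has to be chosen by hand.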
CS 231 Module 5 Summary 37 / 39
Module summary
Topics covered:
• Matrix-chain multiplication
• Paradigm: Dynamic programming
• Dynamic programming recipe
• Solving matrix-chain multiplication
• Dynamic programming analysis
• General approaches to dynamic programming
• Longest common subsequence
• All-pairs cheapest paths
• Knapsack
• Finding the height of a tree
• Comparing paradigms
CS 231 Module 5 Summary 38 / 39