
CPSC 413 Lecture Notes — Part II

Department of Computer Science

Fall, 1998


Contents

I  Design of Algorithms

1  Divide and Conquer
   1.1  Overview
   1.2  Divide and Conquer
   1.3  Binary Search
   1.4  Merge Sort
   1.5  Asymptotically Fast Integer Multiplication
   1.6  Asymptotically Fast Matrix Multiplication
   1.7  A Bit About Implementations
   1.8  Additional References
   1.9  Exercises
   1.10 Hints for Selected Exercises
   1.11 Sample Tests

2  Dynamic Programming
   2.1  Overview
   2.2  Example: Computation of the Fibonacci Numbers
   2.3  Dynamic Programming
   2.4  Example: The “Optimal Fee” Problem
   2.5  Example: The Matrix Chain Problem
   2.6  Additional Examples
   2.7  Exercises
   2.8  Hints for Selected Exercises
   2.9  Sample Tests

3  Greedy Methods
   3.1  Overview
   3.2  A Modified Optimal Fee Problem and a Generalization
   3.3  Optimization Problems
   3.4  Greedy Methods
   3.5  Correctness: Proving Incorrectness of a Greedy Heuristic
   3.6  Correctness: Proving Correctness of a Greedy Algorithm
   3.7  Application: Proving Correctness of the Greedy Algorithm for the (Generalized) Optimal Fee Problem
   3.8  Proving Correctness of an “Optimized” Version
   3.9  Application: The Activity Selection Problem
   3.10 What to Do if There are No Solutions
   3.11 Additional Examples
   3.12 Exercises
   3.13 Hints for Selected Exercises
   3.14 Sample Tests

II  Solutions for Selected Exercises and Tests

4  Solutions for Algorithm Design Exercises and Tests
   4.1  Divide and Conquer
   4.2  Dynamic Programming
   4.3  Greedy Methods

Bibliography

Index


Part I

Design of Algorithms


Chapter 1

Divide and Conquer

1.1 Overview

This chapter introduces “Divide and Conquer,” which is a technique for designing recursive algorithms that are (sometimes) asymptotically efficient. It also presents several algorithms that have been designed using it, along with their analysis.

While the algorithms are interesting by themselves (and some textbooks include exercises on this topic which ask students to trace execution of them on given inputs), they’re intended as examples in these notes, rather than subjects for study by themselves.

Thus, CPSC 413 students may be asked to design similar algorithms using “Divide and Conquer” — when given appropriate time and guidance: it’s frequently not obvious how one should break a problem down into subproblems in order to solve it quickly. Students might also be expected to analyze these algorithms. However, students won’t be expected to memorize the algorithms that are given here as examples, and they won’t be asked to trace the execution of these algorithms on particular inputs, in assignments or tests for this course in 1998.

Much of the material presented in Chapter 8 in Part I will be useful for the analysis of “Divide and Conquer” algorithms. The versions of the “Master Theorem” presented there will be particularly useful here.

A set of exercises is given at the end of this chapter and can be used to assess the algorithm design and analysis skills mentioned above.

1.2 Divide and Conquer

As described by Cormen, Leiserson, and Rivest [5] (among others), an algorithm uses “Divide and Conquer” if it solves a problem (given some “instance” of the problem as input) by decomposing the given instance into several instances of the same problem, so that these can be solved by a recursive application of the same algorithm; recursively solving the derived instances; and then combining their solutions in order to obtain a solution for the original problem.

This technique can sometimes be used to design algorithms that use polynomial time (that is, a polynomial number of operations in the input size) in the worst case. However, the algorithms that are designed using this technique aren’t always this efficient.

Consider, for example, the algorithm for the computation of the Fibonacci numbers that was discussed in Section 8.2.1 in Part I. This algorithm computes the nth Fibonacci number by calling itself recursively, twice, to compute the (n − 1)th and (n − 2)th Fibonacci numbers, and then performing (at most) a constant number of additional operations.

This algorithm requires more than polynomial time — we’ve seen already that the number of operations it uses is in Θ(((1 + √5)/2)^n). Thus this algorithm would use time that’s exponential in the size of the input even if we insisted that its input (n) was given in unary — as a string of n ones. The problem here is that, taken altogether, an exponential number of smaller instances of the problem are eventually created and recursively solved when the algorithm attempts to compute the nth Fibonacci number.

So, this is an example of an algorithm that uses “Divide and Conquer.” However, it isn’t an example of an efficient algorithm, since it uses time that’s more than polynomial in the size of its input in the worst case.

It turns out that an algorithm using Divide and Conquer will be asymptotically efficient, provided that the following additional conditions are satisfied:

1. The algorithm generates (at most) a constant number of smaller instances of the given problem, which are to be recursively solved, when processing any given instance.

2. The size of each new instance that the algorithm generates is smaller than the size of the originally given instance by at least a constant factor that’s less than one.

3. The time required by the algorithm to generate the new instances, and to solve the original instance using the solutions of the derived ones, is at most a polynomial function of the input size.

The above algorithm for computing the nth Fibonacci number satisfies the first of these conditions, since it generates at most two smaller instances to be solved when asked to compute Fn — not counting the even smaller instances that it generates when it tries to solve these instances recursively. It also satisfies the third condition if integer addition and subtraction are assumed to have unit cost. However, it fails to satisfy the second condition, because it calls itself recursively to compute Fn−1 (and Fn−2) when it’s asked to generate Fn — and the recursively derived input, n − 1, is not smaller than n by a constant factor.

These conditions aren’t strictly necessary — it’s possible to design an algorithm using Divide and Conquer that violates one or more of them and that is asymptotically efficient anyway. However, many of the efficient algorithms that have been designed using Divide and Conquer do satisfy them, and we’ll consider several such algorithms in the rest of this chapter.

It will turn out that the “Master Theorem” from Chapter 8 in Part I will be extremely useful for the analysis of this kind of algorithm. At least, this will be useful when all the recursively derived instances of the problem have roughly the same size. You’ll need to rely on other techniques, such as the substitution method and the iteration method that were also introduced in Chapter 8 in Part I, when the sizes of the recursively derived instances aren’t all the same.

1.3 Binary Search

Consider again the binary search algorithm that was presented in Section 8.2.2 in Part I. We’ve seen already that the algorithm uses a number T(n) of operations, in the worst case, that is given by the recurrence

    T(n) = 2                        if n ≤ 0,
         = T(⌈(n−1)/2⌉) + 5         if n ≥ 1.


Here, n is the number of elements in the array to be searched. This recurrence isn’t quite in the form that the “Master Theorem” considers. However, since (n − 1)/2 is pretty close to n/2, one might guess or conjecture that T(n) ∈ Θ(U(n)), where

    U(n) = c                        if n ≤ 1,
         = U(⌈n/2⌉) + 5             if n ≥ 2,

where c is some positive constant. The Master Theorem can be used to prove that U(n) ∈ Θ(log2 n), so the above guess is equivalent to a conjecture that T(n) ∈ Θ(log2 n), and the substitution method (introduced in Section 8.4 in Part I) could be used to confirm that this is the case.
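For concreteness, here is a minimal sketch of such a binary search in Python (the choice of language is mine; the notes present the algorithm itself in Part I). A constant number of operations at the midpoint of a subarray of length n leaves a subarray of length at most ⌈(n−1)/2⌉ still to be searched, which is where a recurrence of the above shape comes from.

    def binary_search(A, x, lo=0, hi=None):
        # Search the sorted list A for x; return an index i with A[i] == x,
        # or -1 if x does not occur in A.
        if hi is None:
            hi = len(A) - 1
        if lo > hi:                      # empty subarray: constant work
            return -1
        mid = (lo + hi) // 2
        if A[mid] == x:
            return mid
        elif A[mid] < x:                 # search the right half
            return binary_search(A, x, mid + 1, hi)
        else:                            # search the left half
            return binary_search(A, x, lo, mid - 1)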

Alternatively (if you wish to avoid using the substitution method), you could note that application of the recurrence confirms that T(1) = 7, so this can be rewritten as

    T(n) = 2                        if n ≤ 0,
         = 7                        if n = 1,
         = T(⌈(n−1)/2⌉) + 5         if n ≥ 2.

You could next note that

    ⌈(n−1)/2⌉ = ⌊n/2⌋

for every integer n (prove this by considering the cases that n is even and that n is odd separately), so that the above recurrence can be rewritten as

    T(n) = 2                        if n ≤ 0,
         = 7                        if n = 1,
         = T(⌊n/2⌋) + 5             if n ≥ 2.

Now it follows that

    T(n) ≤ 7                        if n ≤ 1,
    T(n) ≤ T(⌊n/2⌋) + 5             if n ≥ 2,

and

    T(n) ≥ 2                        if n ≤ 1,
    T(n) ≥ T(⌊n/2⌋) + 5             if n ≥ 2;

these recurrences can be analyzed using the Master Theorem, so that we can conclude from them that T(n) ∈ Θ(log2 n) as well, without having to resort to other techniques.

Binary search is discussed in many texts on algorithm design and analysis, including Brassard and Bratley [2] (in Section 7.3), Horowitz, Sahni, and Rajasekaran [6] (in Section 3.2), and Neapolitan and Naimipour [9] (in Section 2.1).

1.4 Merge Sort

One way to sort an array of length n is to break it into two pieces, of lengths ⌊n/2⌋ and ⌈n/2⌉, sort these subarrays recursively, and then “merge” the contents of the subarrays together to produce a sorted array — provided that n ≥ 2; the problem is trivial (because the given array is already sorted to start with) otherwise.

It’s easy to merge two sorted arrays together, to produce a larger sorted array, using a number of comparisons that’s linear in the sum of the lengths of the input arrays. In particular, all you need to do is maintain an output array that’s initially empty, as well as pointers into the input arrays that initially point to the front elements. As long as the pointers point to elements of the input arrays (so that you haven’t fallen off the end of one or the other of the input arrays), it’s sufficient to compare the elements that the two pointers point to, append the smaller of these two elements onto the end of the output array, and then advance the pointer for the list from which this element was taken, to point to the next element after the one that’s just been appended to the output. As soon as you reach the end of one of the two input arrays (by adding its last element to the output), all you need to do is append the remaining elements of the other array to the output in order to finish.

Since only a constant number of operations are performed each time an element is added to the output array, and at least one operation is performed each time this happens, it’s clear that both the best and worst case running times for this merge operation are linear in the input size.
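To make this concrete, here is a minimal sketch of the merge operation and the sorting algorithm built on it, in Python (an assumption on my part — the notes don’t fix an implementation language); list indices play the role of the pointers described above.

    def merge(left, right):
        # Merge two sorted lists into one sorted list, using a number of
        # comparisons that is linear in len(left) + len(right).
        out, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                out.append(left[i]); i += 1
            else:
                out.append(right[j]); j += 1
        out.extend(left[i:])             # at most one of these two
        out.extend(right[j:])            # slices is nonempty
        return out

    def merge_sort(A):
        # Sort A by splitting it into pieces of lengths floor(n/2) and
        # ceil(n/2), sorting these recursively, and merging the results.
        n = len(A)
        if n <= 1:
            return A                     # already sorted
        mid = n // 2
        return merge(merge_sort(A[:mid]), merge_sort(A[mid:]))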

Now let T(n) be the number of steps used in the worst case by the recursive sorting algorithm that’s described above, when given an array of length n. It follows by the above analysis of the cost of a “merge” that T(n) satisfies the following recurrences:

    T(n) ≤ c1                               if n ≤ 1,
    T(n) ≤ T(⌈n/2⌉) + T(⌊n/2⌋) + d1·n       if n ≥ 2,

and

    T(n) ≥ c2                               if n ≤ 1,
    T(n) ≥ T(⌈n/2⌉) + T(⌊n/2⌋) + d2·n       if n ≥ 2,

for positive constants c1, c2, d1, and d2.

Since these recurrences involve two applications of the function T on inputs that are both approximately half as large as the original input size, you might “guess” at this point that T(n) ∈ Θ(U(n)) where

    U(n) = c                        if n ≤ 1,
         = 2·U(⌊n/2⌋) + d·n         if n ≥ 2,

where c and d are positive constants, use the Master Theorem to establish that U(n) ∈ Θ(n log2 n), and then use the substitution method to prove that T(n) ∈ Θ(n log2 n) as well.

Alternatively you could note (or prove, using a straightforward induction) that T(n) is a nondecreasing function, so that it also satisfies the recurrences

    T(n) ≤ c1                       if n ≤ 1,
    T(n) ≤ 2·T(⌈n/2⌉) + d1·n        if n ≥ 2,

and

    T(n) ≥ c2                       if n ≤ 1,
    T(n) ≥ 2·T(⌊n/2⌋) + d2·n        if n ≥ 2,

if T(n) satisfies the recurrences that were originally given for it. Now you can apply the Master Theorem more directly to these recurrences and then argue from the first that

T (n) ∈ O(n log2 n)


and from the second that

T (n) ∈ Ω(n log2 n),

establishing again that T(n) ∈ Θ(n log2 n) (this time, without having to use the substitution method to confirm a guess).

See Horowitz, Sahni, and Rajasekaran [6] (Section 3.4), Neapolitan and Naimipour [9] (Section 2.2), or Sedgewick [11] (Chapter 8) for additional information about this algorithm.

1.5 Asymptotically Fast Integer Multiplication

Next consider the problem of multiplying two nonnegative integers together. Let’s call the input integers x and y, their product (the output) z, and let’s try to assess the cost of performing this operation as a function of the maximum of the lengths of decimal representations of x and y.

The case x = y = 0 is trivial (z = 0 as well, so this instance of the problem could be solved by reading the input and confirming that both input integers are zero, and then writing zero out as the answer), so we’ll ignore this case from now on, and we’ll assume that at least one of x or y is positive.

Thus, we’ll consider the input size to be the natural number n, where either

    10^(n−1) ≤ x < 10^n and 0 ≤ y < 10^n

or

    10^(n−1) ≤ y < 10^n and 0 ≤ x < 10^n

or both.

We’ll count operations on digits as having unit cost.

Note, by the way, that if we were to use the “unit cost criterion” (as previously defined) instead, then we’d be considering the input size to be “2” in all cases. We’d probably also be charging “1” as the cost of this computation, and this wouldn’t give a very useful analysis.

The “standard” integer multiplication algorithm that you learned in public school can be analyzed without too much difficulty, and it shouldn’t be too difficult to write this down in pseudocode (perhaps, assuming that x and y are given by arrays of their digits and that the output z is to be represented this way too); if you do this then you should discover that the algorithm uses Θ(n^2) operations on digits in the worst case. It isn’t too hard to see that the algorithm can’t use more than O(n^2) of these operations under any circumstances, and it also shouldn’t be too hard to see that Ω(n^2) of these operations are used if the decimal representations of both x and y have length n.

Here is one more property of this standard algorithm that should be noted, because it will be useful later on: if implemented reasonably carefully, this algorithm uses only a linear amount (Θ(n)) of work space, provided that you assume that each decimal digit has “constant size.”

Here are a few more observations that will be useful later on:

1. It’s possible to compute the sum x + y from x and y using only a linear number of operations on digits and only a constant amount of additional work space, if you don’t count the space needed to write down the output: simply use the grade school method for addition. You only need a constant number of operations to compute each digit of the sum, and you only need to remember one extra digit — namely, the “carry” digit that was computed in the previous step — at any point in the computation. (A short sketch of this appears after this list.)


2. You can implement subtraction, computing the difference x − y on inputs x and y (assuming, if you like, that x ≥ y, so that the answer is always nonnegative, or adding a bit to the output to represent the sign of the answer, otherwise) using asymptotically the same time and storage space as you’d need to implement addition, for the same inputs.

3. If integers are given by their decimal representations then the operations of multiplying, or performing “division with remainder,” by a power of ten are also quite inexpensive.
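As an illustration of the first observation, here is a minimal sketch of the grade school addition method in Python (the little-endian list-of-digits representation is an assumption; the notes don’t fix one).

    def add_digits(a, b):
        # a and b are lists of decimal digits, least significant first;
        # each step computes one digit of the sum and remembers a single
        # carry digit, as described in item 1 above.
        out, carry = [], 0
        for i in range(max(len(a), len(b))):
            s = (a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) + carry
            out.append(s % 10)
            carry = s // 10
        if carry:
            out.append(carry)
        return out

For example, add_digits([9, 9], [1]) returns [0, 0, 1], the representation of 99 + 1 = 100.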

1.5.1 An Ineffective Approach

Let m = ⌈n/2⌉. Since 0 ≤ x, y < 10^n, it’s clear that there exist nonnegative integers xL, xU, yL, and yU such that 0 ≤ xL, xU, yL, yU < 10^m, and

    x = 10^m·xU + xL and y = 10^m·yU + yL.

Furthermore, if x and y are given by their decimal representations, then you can use these to extract decimal representations of all four of xL, xU, yL and yU using only O(n) operations on digits. (Why?)

In fact, the lengths of xU and yU will be at most ⌊n/2⌋, which is equal to m − 1 if n is odd. On the other hand, at least one of these two integers will have length ⌊n/2⌋, since x and y’s decimal representations would both have length less than n otherwise.

Now, it should be clear that if z = x · y then

    z = 10^(2m)·xU·yU + 10^m·(xL·yU + xU·yL) + xL·yL.

Based on this, you might compute z using the following steps, assuming n ≥ 2 (you’d just go ahead and compute the product of a single pair of digits if n = 1):

1. Decompose the inputs to obtain (decimal representations of) xL, xU , yL, yU .

2. Recursively compute the product z1 = xL · yL.

3. Recursively compute the product z2 = xL · yU .

4. Recursively compute the product z3 = xU · yL.

5. Recursively compute the product z4 = xU · yU .

6. Return z = 10^(2m)·z4 + 10^m·(z2 + z3) + z1.

The number of operations on digits, T(n), used by this algorithm in the worst case, can be seen to satisfy a recurrence of the form

    T(n) ≤ c                                if n ≤ 1,
    T(n) ≤ 3·T(⌈n/2⌉) + T(⌊n/2⌋) + d·n      if n ≥ 2,

where c and d are positive constants (note, again, that one of the four recursive multiplications in the above algorithm — the computation of z4 = xU · yU — uses slightly smaller inputs than the other three, if n is odd).

If you can prove (or are allowed to assume) that T(n) is a nondecreasing function then you may conclude that

    T(n) ≤ c                        if n ≤ 1,
    T(n) ≤ 4·T(⌈n/2⌉) + d·n         if n ≥ 2,


which makes it easier to apply the Master Theorem. This can be used to establish that T(n) ∈ O(n^2).

It isn’t quite as easy as it might seem to establish a lower bound, in all cases, that has this form. On the other hand, it is easy to argue that if n is a power of two then

    T(n) ≥ c                        if n ≤ 1,
    T(n) ≥ 4·T(⌈n/2⌉) + d·n         if n ≥ 2,

for some positive constant c, by considering the input x = y = 10^n − 1 (with decimal representations consisting of strings of nines) in this special case. This, and the fact (or assumption) that T(n) is nondecreasing, can be used to establish that T(n) ∈ Ω(n^2) as well.

Thus, T(n) ∈ Θ(n^2), so this algorithm has the same “asymptotic” behaviour as the standard one.

A more careful analysis will confirm that the “hidden multiplicative constant” in the running time for this algorithm is larger than the corresponding constant for the standard algorithm — at least in the special case that n is a power of two. The new algorithm isn’t significantly better for other values of n. So, while the two algorithms have the same asymptotic cost, this new algorithm is always slower (although, by “only” a constant factor) than the simpler one.

To make matters worse, the new algorithm requires quadratic storage space as well as quadratic time, so it is of no practical interest.

1.5.2 A More Effective Approach

The Algorithm

Now, let

    u = xU − xL and v = yU − yL

and note that the absolute values of u and v are nonnegative integers such that

    0 ≤ |u|, |v| < 10^m

for m as above. Since

    u · v = xU·yU − xU·yL − xL·yU + xL·yL,

it is reasonably easy to confirm that

    z = x · y = 10^(2m)·xU·yU + 10^m·(xU·yU + xL·yL − u·v) + xL·yL.

While, at first glance, this might not look any better than the expressions given above, it is the basis for the correctness of a recursive integer multiplication algorithm, in which you perform the following steps when n ≥ 2:

1. Decompose the inputs to obtain (decimal representations) of xL, xU , yL, and yU .

2. Use integer subtraction to compute u = xU − xL and v = yU − yL. Then compute (and remember) the signs of u and v, and compute |u| and |v| as well.

3. Recursively compute z1 = xU · yU .

4. Recursively compute |u| · |v| — and use this, along with the signs of u and v, to recover the product z2 = u · v.

5. Recursively compute z3 = xL · yL.

6. Compute z = 10^(2m)·z1 + 10^m·(z1 + z3 − z2) + z3.
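Here is a minimal sketch of these six steps in Python (my choice of language; for brevity, Python’s divmod by a power of ten stands in for the digit-array decomposition of step 1). It recurses all the way down to single digits; Section 1.7 discusses why a real implementation should instead switch to the standard algorithm below a threshold.

    def karatsuba(x, y):
        # Multiply nonnegative integers x and y using the three recursive
        # multiplications of steps 3-5 above.
        n = max(len(str(x)), len(str(y)))
        if n == 1:
            return x * y                        # a single pair of digits
        m = (n + 1) // 2                        # m = ceil(n/2)
        xU, xL = divmod(x, 10**m)               # step 1: x = 10^m * xU + xL
        yU, yL = divmod(y, 10**m)               #         y = 10^m * yU + yL
        u, v = xU - xL, yU - yL                 # step 2
        z1 = karatsuba(xU, yU)                  # step 3
        z2 = karatsuba(abs(u), abs(v))          # step 4: |u| * |v| ...
        if (u < 0) != (v < 0):
            z2 = -z2                            # ... with the sign restored
        z3 = karatsuba(xL, yL)                  # step 5
        return 10**(2*m) * z1 + 10**m * (z1 + z3 - z2) + z3   # step 6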


Analysis

After proving (or, if you are allowed to, assuming) that the number of steps T(n) used by this algorithm is nondecreasing, you can argue that this running time satisfies the recurrence

    T(n) ≤ c                        if n ≤ 1,
    T(n) ≤ 3·T(⌈n/2⌉) + d·n         if n ≥ 2,

for positive constants c and d, because the algorithm uses only three recursive multiplications of integers that are approximately half as large as the original inputs, along with additional operations that can be performed in linear time.

The Master Theorem can be used to find a closed form for this recurrence, and this can be used to establish that

    T(n) ∈ O(n^(log2 3)).

Since log2 3 < 1.6, this implies that T(n) ∈ O(n^1.6).

This is very similar to the first sub-quadratic integer multiplication algorithm, “Karatsuba’s Algorithm,” which was discovered by Karatsuba and Ofman [7] in the early 1960’s. The fact that n-digit integer multiplication could be performed using o(n^2) operations on digits was extremely surprising at the time.

There is good news, and bad news, concerning this approach. Here is the good news: while these algorithms aren’t quite efficient enough to replace standard multiplication for single- or double-precision integer computations, they are efficient enough to be considered practical when used to multiply integers that are slightly larger. The threshold (between input sizes on which the standard algorithm is superior, and sizes on which the asymptotically efficient algorithm is the better of the two) is low enough that, in the 1990’s, it is reasonably common to find integer multiplication algorithms using time in O(n^(log2 3)) being implemented and used for extended-precision computations.

Here is the bad news: the storage space required by the asymptotically faster algorithm is (also) in Θ(n^(log2 3)), so the standard algorithm is preferable if storage space is the resource bottleneck to be concerned about, rather than running time.

Neapolitan and Naimipour [9] also discuss the above algorithm (in Section 2.6).

1.5.3 More About the Complexity of Integer Arithmetic

This topic is beyond the scope of this course, so this subsection is not required reading.

Algorithms for integer multiplication that are asymptotically even faster do exist and have been known for some time. Indeed, one of the exercises at the end of this chapter involves the derivation of one such algorithm.

The asymptotically fastest algorithm that is currently known is the algorithm of Schonhage and Strassen [10], which is based on the fast Fourier transform and can be used to multiply two n-digit integers together using O(n(log n)(log log n)) operations on digits. Unlike the above “Karatsuba-like” algorithms, this is currently only considered to be of theoretical interest. That is, implementations of it aren’t common and it isn’t widely used in practice.

The fast Fourier transform over fields is discussed in Chapter 32 of Cormen, Leiserson, and Rivest [5], and the generalization of this to computations over rings (that’s needed for Schonhage and Strassen’s algorithm) is discussed in exercises there. A more extensive discussion of this topic, which includes Schonhage and Strassen’s algorithm and its analysis, appears in Chapter 7 of Aho, Hopcroft and Ullman [1].

Divide and Conquer can also be used to design asymptotically efficient algorithms for related integer computations, including integer division with remainder and the computation of the greatest common divisor of two integers. These algorithms and their analysis can be found in Chapter 8 of Aho, Hopcroft and Ullman [1].

Finally, Knuth [8] includes a much more extensive discussion of algorithms for integer arithmetic than any of the above.

1.6 Asymptotically Fast Matrix Multiplication

This last example will be skipped if there isn’t time for it.

Consider now the problem of computing the product of two n × n matrices. To simplify the analysis we’ll consider “field operations” or “operations on scalars” to have unit cost.

The standard matrix multiplication algorithm, which you may have learned in high school, can be used to perform this computation using approximately 2n^3 (more precisely, 2n^3 − n^2) of these operations: it computes each of the n^2 entries of the product matrix, one at a time, by taking the inner product of a pair of vectors, using n multiplications and n − 1 additions of scalars for each.

In contrast, matrix addition and subtraction seem to be much cheaper, since you add and subtract matrices “componentwise;” exactly n^2 scalar operations are needed to either add or subtract two n × n matrices together.

A recursive algorithm can be developed using “Divide and Conquer” for matrix multiplication as well. To simplify the description of this algorithm, let’s suppose henceforth that n is a power of two — we’ll remove this assumption later. Now note that if X and Y are two n × n matrices, then you can write them as

    X = [ X1,1  X1,2 ]    and    Y = [ Y1,1  Y1,2 ]
        [ X2,1  X2,2 ]               [ Y2,1  Y2,2 ]

where Xi,j and Yi,j are (n/2) × (n/2) matrices for all i and j. In this case the product of X and Y is a matrix

    Z = [ Z1,1  Z1,2 ]
        [ Z2,1  Z2,2 ]

where Zi,j = Xi,1·Y1,j + Xi,2·Y2,j for 1 ≤ i, j ≤ 2. It isn’t too difficult to write a recursive matrix multiplication algorithm, using this approach, that uses T(n) operations on scalars to multiply a pair of n × n matrices, where

    T(n) = 1                        if n = 1,
         = 8·T(n/2) + f(n)          if n ≥ 2,

for an asymptotically positive function f(n) ∈ Θ(n^2). Unfortunately, an analysis of this recurrence proves that T(n) ∈ Θ(n^3). So, asymptotically, this is no better than the standard algorithm, and a more careful analysis will confirm that it has no practical interest (in that you should expect it to be consistently slower than the standard algorithm, and it requires additional storage space as well).


1.6.1 Strassen’s Algorithm

In the late 1960’s, Strassen [13] described an algorithm for the multiplication of two n × n matrices using seven recursive multiplications of ⌈n/2⌉ × ⌈n/2⌉ matrices and Θ(n^2) additional operations on scalars, rather than eight. Several such algorithms are now known, and one of them is presented below.

Suppose again that n is a power of two, and consider the matrices X, Y, Z, and Xi,j, Yi,j and Zi,j for 1 ≤ i, j ≤ 2 mentioned above. The next seven (n/2) × (n/2) matrices can be computed from these using exactly ten additions or subtractions of (n/2) × (n/2) matrices and exactly seven multiplications of (n/2) × (n/2) matrices:

    P = (X1,1 + X2,2) · (Y1,1 + Y2,2);
    Q = (X2,1 + X2,2) · Y1,1;
    R = X1,1 · (Y1,2 − Y2,2);
    S = X2,2 · (Y2,1 − Y1,1);                                   (1.1)
    T = (X1,1 + X1,2) · Y2,2;
    U = (X2,1 − X1,1) · (Y1,1 + Y1,2);
    V = (X1,2 − X2,2) · (Y2,1 + Y2,2).

The above expressions indicate how these seven matrices should be computed as part of an asymptotically fast matrix multiplication algorithm, in that they show which matrix additions and subtractions are needed to form seven pairs of (n/2) × (n/2) matrices whose products should be recursively computed. However, since matrix multiplication and addition satisfy the usual “distributive laws” (even though matrix multiplication isn’t commutative), these matrices also satisfy the following equations.

    P = X1,1·Y1,1 + X1,1·Y2,2 + X2,2·Y1,1 + X2,2·Y2,2;
    Q = X2,1·Y1,1 + X2,2·Y1,1;
    R = X1,1·Y1,2 − X1,1·Y2,2;
    S = X2,2·Y2,1 − X2,2·Y1,1;
    T = X1,1·Y2,2 + X1,2·Y2,2;
    U = X2,1·Y1,1 + X2,1·Y1,2 − X1,1·Y1,1 − X1,1·Y1,2;
    V = X1,2·Y2,1 + X1,2·Y2,2 − X2,2·Y2,1 − X2,2·Y2,2.

While it’s tedious, the above equations can be used to confirm that the following identities are satisfied too:

    Z1,1 = P + S − T + V;
    Z1,2 = R + T;                                               (1.2)
    Z2,1 = Q + S;
    Z2,2 = P + R − Q + U.

It follows that the product matrix Z can be computed from P, Q, R, S, T, U, and V using an additional eight additions and subtractions of (n/2) × (n/2) matrices (and no more matrix multiplications).

In total, then, the computation of Z from X and Y (based on equations 1.1 and 1.2 above) uses eighteen additions or subtractions of (n/2) × (n/2) matrices (which require a total of (9/2)·n^2 additions or subtractions of scalars), as well as seven multiplications of (n/2) × (n/2) matrices (each of which should be performed recursively).
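To make the structure of this algorithm concrete, here is a minimal sketch in Python for the case where n is a power of two (the language and the list-of-lists matrix representation are my assumptions; the notes don’t prescribe an implementation).

    def mat_add(A, B):
        return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

    def mat_sub(A, B):
        return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

    def strassen(X, Y):
        # Multiply the n-by-n matrices X and Y, for n a power of two, using
        # the seven recursive products of equations (1.1) and the eight
        # additions and subtractions of equations (1.2).
        n = len(X)
        if n == 1:
            return [[X[0][0] * Y[0][0]]]
        h = n // 2
        def blocks(M):                   # split M into four h-by-h blocks
            return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
                    [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])
        X11, X12, X21, X22 = blocks(X)
        Y11, Y12, Y21, Y22 = blocks(Y)
        P = strassen(mat_add(X11, X22), mat_add(Y11, Y22))
        Q = strassen(mat_add(X21, X22), Y11)
        R = strassen(X11, mat_sub(Y12, Y22))
        S = strassen(X22, mat_sub(Y21, Y11))
        T = strassen(mat_add(X11, X12), Y22)
        U = strassen(mat_sub(X21, X11), mat_add(Y11, Y12))
        V = strassen(mat_sub(X12, X22), mat_add(Y21, Y22))
        Z11 = mat_add(mat_sub(mat_add(P, S), T), V)    # P + S - T + V
        Z12 = mat_add(R, T)
        Z21 = mat_add(Q, S)
        Z22 = mat_add(mat_sub(mat_add(P, R), Q), U)    # P + R - Q + U
        return ([r1 + r2 for r1, r2 in zip(Z11, Z12)] +
                [r1 + r2 for r1, r2 in zip(Z21, Z22)])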


Let T(n) be the number of operations on scalars used by the Divide and Conquer algorithm for n × n matrix multiplication that’s been sketched. Then, if n is a power of two,

    T(n) = 1                        if n = 1,
         = 7·T(n/2) + (9/2)·n^2     if n ≥ 2.

The Master Theorem can be used to solve this recurrence and to establish that

    T(n) ∈ Θ(n^(log2 7)).

Since log2 7 < 2.81, this implies that T(n) ∈ O(n^2.81) (indeed, T(n) ∈ o(n^2.81)), and therefore that T(n) ∈ o(n^3). It follows that this algorithm is asymptotically faster than standard matrix multiplication.

Now suppose that n is not a power of two and that you want to compute the product of two n × n matrices X and Y, as above. Set n′ = 2^⌈log2 n⌉, so that n′ is a power of two such that n ≤ n′ < 2n, and consider the following two n′ × n′ matrices,

    X′ = [ X  0 ]    and    Y′ = [ Y  0 ]
         [ 0  0 ]                [ 0  0 ],

which have X and Y, respectively, as their top left n × n submatrices, and which don’t have any nonzero entries anywhere else. It should be clear that the product of X′ and Y′ is a matrix whose top left n × n submatrix is XY, so that you can multiply X by Y by forming and multiplying together X′ and Y′ instead.

Since n′ is a power of two, the above asymptotically fast matrix multiplication algorithm can be used to multiply X′ by Y′. Since n′ < 2n, the resulting algorithm still uses O(n^(log2 7)) operations to multiply X and Y together.

To my knowledge, this algorithm is not widely considered to be of practical interest, in part because it is not clear that it is numerically stable, and in part because it does not perform well on small- or moderately-sized inputs. However, this opinion may be changing (at least, when computations are exact, so that numerical stability isn’t an issue).

You can find more information about Divide and Conquer algorithms for matrix multiplication with the above asymptotic cost in Aho, Hopcroft and Ullman [1] (Chapter 6), Brassard and Bratley [2] (Section 7.6), Cormen, Leiserson, and Rivest [5] (Section 31.2), Horowitz, Sahni, and Rajasekaran [6] (Section 3.7), or Neapolitan and Naimipour [9] (Section 2.5).

Cormen, Leiserson, and Rivest also attempt to describe how one could go about deriving equations like equations 1.1 and 1.2, so you might want to look at their presentation in order to see this.

1.6.2 More About the Complexity of Matrix Multiplication

This topic is beyond the scope of CPSC 413, so this subsection is not required reading.

Asymptotically faster matrix multiplication algorithms are also known to exist, and Brassard and Bratley [2] include a brief discussion of the history of research on the complexity of matrix multiplication. The most recent result that has improved the theoretical upper bound on the complexity of matrix multiplication is that of Coppersmith and Winograd [4], who show that it is possible to multiply two n × n matrices together using O(n^α) operations on scalars, for a constant α < 2.39. However at the moment (and, probably, for the foreseeable future), it seems highly unlikely that this result will ever be of more than theoretical interest; indeed, the only matrix multiplication algorithms that are currently considered to be practical are the standard algorithm and, possibly, the algorithms with complexity Θ(n^(log2 7)) discussed above.

The problem of solving a nonsingular n × n system of linear equations can be proved to have the same asymptotic complexity as matrix multiplication, so this problem can also be solved (at least, theoretically) in sub-cubic time. It’s known that one can also compute various factorizations of matrices (including an LUP factorization of a nonsingular matrix), and one can compute the rank of a matrix, at this cost. Both Aho, Hopcroft, and Ullman [1] (Chapter 6) and Cormen, Leiserson, and Rivest [5] (Chapter 31) present at least some of this material — with Aho, Hopcroft, and Ullman providing more of it.

Finally, Burgisser, Clausen, and Shokrollahi [3] includes far more information about the complexity of matrix multiplication and related problems.

1.7 A Bit About Implementations

This is beyond the scope of CPSC 413 (so this section is also optional). However, at least one thing can be noted: while the above Divide and Conquer algorithms have been presented as purely recursive, in that a recursive approach is described as being used even when n is very small (namely, when n = 2), this is certainly not how Divide and Conquer algorithms should be implemented.

For example, one way to write an algorithm with almost the same performance as the standard integer multiplication algorithm on small inputs, but with the same asymptotic complexity as the “Karatsuba-like” algorithms that were described above, would be to start by comparing n to some pre-determined “threshold value,” k. If it was found that n ≤ k then the standard algorithm would be used, and Karatsuba’s algorithm would be used otherwise. The threshold would be chosen using a more careful theoretical analysis, experimental techniques (including profiling of code, etc.) or both, and the “best” choice of the threshold might depend on such things as the skill of the programmer, the programming language, operating system, hardware, and so on.

It would likely be even better, though, to write a recursive algorithm A with the structure

    if n ≤ k then
        Perform the multiplication using the standard algorithm
    else
        Proceed as with a Karatsuba-like algorithm, calling algorithm A
        when it is necessary to multiply smaller integers recursively.
    end if

The difference between this algorithm and the previous one is that if n is large then it will behave initially like the algorithm based on Divide and Conquer that has already been described — forming smaller and smaller instances of the problem that are to be solved recursively. At some point, though, the smaller instances would have size less than or equal to k, and the standard algorithm would then be applied to solve all of these, even though the recursive approach was used to form these smaller problem instances at the beginning.

This algorithm has a small amount of additional overhead that the previous version lacks — input sizes are being compared to the threshold value k more often. In spite of that, it should be at least plausible (since k is a constant, so that comparing it to the input size will be inexpensive) that you could profitably choose a larger threshold value when using this version of a hybrid algorithm than you could for the previous one, and also that a careful implementation of this version of the algorithm could prove to be superior to the previous ones (Why?).
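In Python, algorithm A could look like the following sketch (the threshold value 32 is an arbitrary placeholder rather than a recommendation, and Python’s built-in product stands in for the standard algorithm).

    def hybrid_multiply(x, y, k=32):
        # Algorithm A: below the threshold k, fall back on the standard
        # algorithm; above it, recurse exactly as in the Karatsuba-like
        # algorithm of Section 1.5.2.
        n = max(len(str(x)), len(str(y)))
        if n <= k:
            return x * y                 # stand-in for the standard algorithm
        m = (n + 1) // 2
        xU, xL = divmod(x, 10**m)
        yU, yL = divmod(y, 10**m)
        u, v = xU - xL, yU - yL
        z1 = hybrid_multiply(xU, yU, k)
        z2 = hybrid_multiply(abs(u), abs(v), k)
        if (u < 0) != (v < 0):
            z2 = -z2
        z3 = hybrid_multiply(xL, yL, k)
        return 10**(2*m) * z1 + 10**m * (z1 + z3 - z2) + z3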


Something more to think about: what would be a good choice of the threshold value, k, for each of the hybrid algorithms described above?

Another implementation detail, one that is more specific to the problem of integer multiplication, has to do with the fact that the above integer multiplication algorithms were presented as if decimal representations of the input and output integers were being used. Similar algorithms can be obtained in which you produce a base B representation of the output integer z from base B representations of the inputs, for any constant B ≥ 2 that you want. So, you can produce “Karatsuba-like” multiplication algorithms that work with binary, or hexadecimal, representations of integers.

You can also choose B to be much larger, so that only one digit, or a pair of “digits,” can be fit into a word of machine memory. This can lead to an algorithm that is more efficient than it would be if B = 10 (or if B has some other small value) while making more efficient use of storage space at the same time. (The choice such that a pair of digits fits into a word should be considered, because it allows an arbitrary product of a pair of digits to fit into a word of memory as well, and this might simplify the implementation of an integer multiplication algorithm.)

1.8 Additional References

Brassard and Bratley [2], Horowitz, Sahni, and Rajasekaran [6], and Neapolitan and Naimipour [9] all include chapters on Divide and Conquer. Each includes one or more additional examples, and several include a bit more information about how you’d choose threshold values when implementing these algorithms.

1.9 Exercises

Clearly, most of these exercises are too long to be used on tests in this course. This will be true for many of the exercises included for the “algorithm design” topics in the next two chapters as well. However, you will certainly be well prepared for tests if you’re able to solve these problems without too much trouble.

Hints for some of these exercises are given in the section after this one. Solutions for the first two of these exercises can be found in Subsection 4.1.1.

1. Suppose you’re given a positive integer n and that you wish to construct a binary search tree storing the values 1, 2, . . . , n that’s as balanced as possible.

Once you’ve decided which value to store at the root, you will have no choice about which values to include in the left subtree and which values to include in the right.

In order to make your tree as balanced as possible, you should store (n + 1)/2 at the root if n is odd, and you should store either n/2 or n/2 + 1 at the root if n is even. In the latter case, it doesn’t matter which you choose, so you might as well choose the smaller element, n/2.

Note, also, that if you have a balanced binary search tree storing the values 1, 2, . . . , n, then it’s easy to turn this into a balanced binary search tree storing k + 1, k + 2, . . . , k + n for any given integer k — all you need to do is add k to the values stored at each node of the given tree.

(a) Based on this, design a Divide and Conquer algorithm to construct a balanced binary search tree storing the values 1, 2, . . . , n when you’re given n as input.


(b) Next, write down a recurrence for the time used by your algorithm on input n, assuming that it takes constant time to create a node or a pointer to one, to change the value stored at a node (by adding some value to it), or to divide a given value by two.

(c) Finally, find a function f(n) in closed form such that your algorithm uses Θ(f(n)) of the above steps in the worst case.

2. Now change your algorithm so that it takes a second input, k, and produces a balanced binary search tree storing k + 1, k + 2, . . . , k + n instead. Your new algorithm should use the same number of recursive calls as the old one, but it should do less additional work (since you can change the value of the second parameter, k, when you recurse).

Once again, form and solve a recurrence for the worst case running time of your algorithm. You should discover that the new algorithm requires substantially fewer operations in the worst case than the original one did.

3. If n ≥ 1 and 0 ≤ i ≤ n then the binomial coefficient (n choose i) satisfies the following equation:

       (n choose i) = 1                                        if i = 0 or i = n,
                    = (n−1 choose i−1) + (n−1 choose i)        if 1 ≤ i ≤ n − 1.

Suppose, for the purposes of this question, that this is all you know about this value — in particular, you should pretend that you don’t know any other expression for (n choose i).

(a) Design a Divide and Conquer algorithm that computes (n choose i) on inputs n and i, assuming that 0 ≤ i ≤ n (you may return the output 0 if i is out of range).

(b) Then write down a recurrence for the time required by your algorithm on inputs n and i, as a function of n alone, in the worst case. You should assume that it’s possible to add two integers together or to compare two integers in constant time when you generate this recurrence.

(c) Finally, find a function f(n) in closed form such that your algorithm uses time O(f(n)) on inputs n and i in the worst case.

4. Recall that the Fibonacci numbers F0, F1, F2, . . . are defined by the recurrence

       Fn = 0                      if n = 0,
          = 1                      if n = 1,
          = Fn−1 + Fn−2            if n ≥ 2,

   and note that this implies that

       Fn = F1·Fn + F0·Fn−1 = F2·Fn−1 + F1·Fn−2

whenever n ≥ 2.

(a) Prove by induction on i that if i ≥ 1 and n is any integer such that n ≥ i, then

        Fn = Fi·Fn−i+1 + Fi−1·Fn−i.


(b) Use the result from part (a) to show that if n ≥ 1 and l = ⌊n/2⌋ then

        Fn = Fl·(Fl−1 + Fl+1) = Fl·(Fl + 2·Fl−1)                 if n is even (so that n = 2l),
        Fn = (Fl)² + (Fl+1)² = 2·(Fl)² + 2·Fl·Fl−1 + (Fl−1)²     if n is odd (so that n = 2l + 1).

(c) Use the above result to design a Divide and Conquer algorithm for computation of the nth Fibonacci number Fn on input n, such that the number of arithmetic operations (additions, multiplications, subtractions, and divisions with remainder of pairs of integers) is in O(n), and prove that your algorithm does have this worst case running time (if integer arithmetic is assumed to have unit cost).
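For comparison with your own solution, here is one possible shape of such an algorithm in Python (a sketch, not the only correct answer): it returns the pair (Fn, Fn−1) and applies the identities from part (b) once per call, so it actually performs only O(log n) arithmetic operations, which is certainly in O(n).

    def fib_pair(n):
        # Return (F(n), F(n-1)) for n >= 1.
        if n == 1:
            return (1, 0)                        # (F(1), F(0))
        l = n // 2
        fl, fl1 = fib_pair(l)                    # fl = F(l), fl1 = F(l-1)
        if n % 2 == 0:                           # n = 2l
            return (fl * (fl + 2 * fl1),         # F(2l) = F(l)(F(l) + 2F(l-1))
                    fl * fl + fl1 * fl1)         # F(2l-1) = F(l)^2 + F(l-1)^2
        else:                                    # n = 2l + 1
            return (fl * fl + (fl + fl1) ** 2,   # F(2l+1) = F(l)^2 + F(l+1)^2
                    fl * (fl + 2 * fl1))         # F(2l), as above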

5. Of course, it isn’t realistic to assume that integer arithmetic can be performed in constant time if the integers can be arbitrarily large: we should probably be counting the number of operations on digits (or operations on integers of some larger fixed size) that are performed by an algorithm instead, when considering the algorithm’s running time.

It’s possible to add or subtract two m-digit integers together using Θ(m) operations on digits.

At this point, we’ve also seen — or know about — at least three different algorithms for integer multiplication, which could be used to multiply two m-digit integers together:

(a) “Standard multiplication” (which you learned to use in public school) uses Θ(m^2) operations on digits.

(b) The “Karatsuba-like” multiplication algorithm given in this chapter of the notes uses Θ(m^(log2 3)) operations on digits.

(c) The “Schonhage-Strassen” multiplication algorithm that is described in [10], [1], or [8] uses Θ(m·(log2 m)·(log2 log2 m)) operations on digits.

It’s possible to perform integer division with remainder on m-digit integers using O(m^2) operations on digits. One can do even better than this as well, but you won’t need to in order to solve this problem.

Finally, you should note that the length of a decimal representation of Fm is in Θ(m). This is implied by the results that are proved or left as exercises in Chapter 2 in Part I, and you may use this fact without proving it here.

Perform a more careful analysis of the algorithm you designed for the previous exercise, to show that it’s possible to compute Fn on input n at the following costs:

(a) If standard multiplication is used for integer multiplication, then Fn can be computed from n using O(n^2) operations on digits.

(b) If a Karatsuba-like multiplication algorithm is used for integer multiplication, then Fn can be computed from n using O(n^(log2 3)) operations on digits.

(c) If the Schonhage-Strassen algorithm is used for integer multiplication, then Fn can be computed from n using O(n·(log2 n)^2·(log2 log2 n)) operations on digits.

You don’t need to know anything about these multiplication algorithms, except for the fact that they use the numbers of operations listed above, in order to solve this problem. On the other hand, it definitely won’t be sufficient just to multiply the number of integer arithmetic operations that your algorithm uses by the cost (number of operations on digits) of the most expensive arithmetic operation — a more careful analysis will be needed!


6. Suppose h(x) = h4·x⁴ + h3·x³ + h2·x² + h1·x + h0 is an integer polynomial with degree at most four (so that h0, h1, . . . , h4 ∈ Z).

   Suppose as well that v−2, v−1, v0, v1, and v2 are defined as follows.

       v−2 = h(−2) = 16·h4 − 8·h3 + 4·h2 − 2·h1 + h0;
       v−1 = h(−1) = h4 − h3 + h2 − h1 + h0;
       v0  = h(0)  = h0;
       v1  = h(1)  = h4 + h3 + h2 + h1 + h0;
       v2  = h(2)  = 16·h4 + 8·h3 + 4·h2 + 2·h1 + h0.

In other words,

    [ v−2 ]   [ 16  −8   4  −2   1 ]   [ h4 ]
    [ v−1 ]   [  1  −1   1  −1   1 ]   [ h3 ]
    [ v0  ] = [  0   0   0   0   1 ] · [ h2 ]
    [ v1  ]   [  1   1   1   1   1 ]   [ h1 ]
    [ v2  ]   [ 16   8   4   2   1 ]   [ h0 ].

(a) Confirm that the following identities are satisfied as well (or, at least, explain how you could do this):

        h0 = v0;
        h1 = (1/12)·v−2 − (2/3)·v−1 + (2/3)·v1 − (1/12)·v2 = (1/12)·(v−2 − 8·v−1 + 8·v1 − v2);
        h2 = −(1/24)·v−2 + (2/3)·v−1 − (5/4)·v0 + (2/3)·v1 − (1/24)·v2 = (1/24)·(−v−2 + 16·v−1 − 30·v0 + 16·v1 − v2);
        h3 = −(1/12)·v−2 + (1/6)·v−1 − (1/6)·v1 + (1/12)·v2 = (1/12)·(−v−2 + 2·v−1 − 2·v1 + v2);
        h4 = (1/24)·v−2 − (1/6)·v−1 + (1/4)·v0 − (1/6)·v1 + (1/24)·v2 = (1/24)·(v−2 − 4·v−1 + 6·v0 − 4·v1 + v2).

That is,

    [ h4 ]   [  1/24   −1/6    1/4   −1/6    1/24 ]   [ v−2 ]
    [ h3 ]   [ −1/12    1/6     0    −1/6    1/12 ]   [ v−1 ]
    [ h2 ] = [ −1/24    2/3   −5/4    2/3   −1/24 ] · [ v0  ]
    [ h1 ]   [  1/12   −2/3     0     2/3   −1/12 ]   [ v1  ]
    [ h0 ]   [   0       0      1      0      0   ]   [ v2  ].

Note that these equations imply that the coefficients of any polynomial h(x) with degree at most four can be recovered from the polynomial’s values at −2, −1, 0, 1, and 2.
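One mechanical way to confirm part (a) is to check that the second matrix above is the inverse of the first; the following short Python check (using the standard fractions module for exact arithmetic) does exactly that.

    from fractions import Fraction as Fr

    M = [[16, -8, 4, -2, 1],
         [ 1, -1, 1, -1, 1],
         [ 0,  0, 0,  0, 1],
         [ 1,  1, 1,  1, 1],
         [16,  8, 4,  2, 1]]
    Minv = [[Fr(1, 24), Fr(-1, 6), Fr(1, 4), Fr(-1, 6), Fr(1, 24)],
            [Fr(-1, 12), Fr(1, 6), 0, Fr(-1, 6), Fr(1, 12)],
            [Fr(-1, 24), Fr(2, 3), Fr(-5, 4), Fr(2, 3), Fr(-1, 24)],
            [Fr(1, 12), Fr(-2, 3), 0, Fr(2, 3), Fr(-1, 12)],
            [0, 0, 1, 0, 0]]
    # The product Minv * M should be the 5-by-5 identity matrix.
    prod = [[sum(Minv[i][k] * M[k][j] for k in range(5)) for j in range(5)]
            for i in range(5)]
    assert prod == [[1 if i == j else 0 for j in range(5)] for i in range(5)]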

(b) Suppose α(x) and β(x) are both integer polynomials with degree at most two, and that γ(x) = α(x) · β(x) is their product, so that γ(x) is an integer polynomial with degree at most four.

    Explain how you could compute the coefficients of γ(x) from α(−2), α(−1), . . . , α(2) and β(−2), β(−1), . . . , β(2) — without computing the coefficients of α(x) or β(x) first.

    Hint: Note that γ(i) = α(i) · β(i) for any integer i.

(c) Now note that if a and b are two nonnegative integers whose decimal representations have length at most n (so that 0 ≤ a, b < 10^n), and if m = ⌈n/3⌉, then

        a = a2·10^(2m) + a1·10^m + a0 = α(10^m)

    and

        b = b2·10^(2m) + b1·10^m + b0 = β(10^m)

    where a0, a1, a2, b0, b1, b2 are nonnegative integers such that 0 ≤ ai, bi < 10^m for i between 0 and 2,

        α(x) = a2·x² + a1·x + a0 and β(x) = b2·x² + b1·x + b0,

    and so that a · b = γ(10^m) if γ(x) = α(x) · β(x).

    Use this observation and the results from parts (a) and (b) to design another Divide and Conquer algorithm for nonnegative integer multiplication. As well as calling itself recursively, your algorithm might

    • add or subtract integers together;
    • multiply an integer by a small integer constant (between −4 and 4); or
    • perform exact division by a small integer constant (between 1 and 24), where the division is “exact” because you’ll always be dividing one integer k by another integer l such that k is an integer multiple of l — in other words, the remainder will always be zero.

    For the rest of this question, you should assume (correctly) that all three of the above operations can be performed using a number of operations that’s linear in the length of the decimal representation(s) of the integer(s) you started with.

    Here’s a hint to make sure you’re on the right track: your algorithm should form and recursively solve exactly five small integer multiplication problems in order to multiply a by b whenever n ≥ 3. There’s no need to recurse at all if n ≤ 2.

(d) Write down a recurrence for the number T(n) of operations on digits used by your algorithm and prove that T(n) ∈ O(n^(log3 5)) ⊆ O(n^1.47).

7. Consider the Selection problem, in which you’re given as inputs

• an array A of integers, where A has length n (that is, n integers, which aren’t necessarily distinct, are stored in array locations A[1], A[2], . . . , A[n]), and

• an integer k such that 1 ≤ k ≤ n,

and whose output should be an integer x that is the kth smallest element stored in A — so that

• A[i] = x for some integer i between 1 and n,

• A[j] ≤ x for (at least) k integers j between 1 and n, and

• A[l] ≥ x for (at least) n − k + 1 integers l between 1 and n.

If the integers stored in A are distinct then x and the above array index i will be unique. If A stores multiple copies of one or more of the same integers then x will still be unique, but the array index i might not be.

The median of the above array A is the value x that satisfies the above conditions when k = ⌈n/2⌉, so that at least half the integers stored in A are less than or equal to x. The Median Finding problem has the array A as input and returns the median of A as output.


Note that it’s easy to write an algorithm for the Median Finding problem if you already know an algorithm for the Selection problem — all you’d need to do is set k = ⌈n/2⌉, execute the algorithm for Selection on inputs A and k, and then return the output that the Selection algorithm generates.

In this question, you’ll be asked to design a Divide and Conquer algorithm for Selection. We’ll define T(n) to be the number of comparisons of (pairs of) integers stored in A that this algorithm uses, when the input array A has length n, in the worst case.

If n < 20 then the kth smallest element stored in A can be found (so that an instance of the Selection problem involving A can be solved) by sorting A and returning the entry that’s in position k in the resulting sorted array. Clearly this uses at most some constant number of comparisons of integers (since n < 20). So we’ll consider both of the Selection and Median Finding problems to be solved for the case n < 20, and we’ll assume that n ≥ 20 from now on.

In the first stage of the algorithm (for the case n ≥ 20), the array will be split into ⌈n/5⌉ subarrays, where each subarray has length at most five — the first subarray includes the elements A[1], A[2], . . . , A[5], the second subarray includes A[6], A[7], . . . , A[10], and so on.

Then, for 1 ≤ i ≤ ⌈n/5⌉, the median xi of the ith of these subarrays will be computed, and written into the ith location, B[i], of a new array B (which has length ⌈n/5⌉).

Note that this implies that B[i] ∈ {A[5i−4], A[5i−3], . . . , A[5i]} if 1 ≤ i ≤ ⌊n/5⌋ and that B[i] ∈ {A[5⌈n/5⌉−4], A[5⌈n/5⌉−3], . . . , A[n]} if i = ⌈n/5⌉, and that B[i] is less than or equal to at least three of A[5i−4], A[5i−3], . . . , A[5i], and is also greater than or equal to at least three of these values, whenever i ≤ ⌊n/5⌋.

(a) Argue that the above array B can be constructed from A using at most cn comparisons, for some constant c that’s independent of n, in the worst case.

In the second stage of the algorithm, the algorithm is recursively applied to find the median y of the array B. Note that y also belongs to the array A.

(b) Prove that there are at least (3/10)·n − 5 integers i between 1 and n such that A[i] ≤ y and that there are also at least (3/10)·n − 5 integers j between 1 and n such that A[j] ≥ y.

In the third stage of the algorithm, y is compared to each of the elements stored in A, in order to compute the values

• nL, which is the number of integers i such that 1 ≤ i ≤ n and A[i] < y,

• nE, which is the number of integers j such that 1 ≤ j ≤ n and A[j] = y, and

• nG = n − nL − nE, which is the number of integers l such that A[l] > y,

and to create two arrays, C and D, with lengths nL and nG respectively, such that

• if k ≤ nL then the kth smallest element in A is equal to the kth smallest element in C,

• if nL < k ≤ nL + nE then the kth smallest element in A is equal to y, and

• if nL + nE < k ≤ n then the kth smallest element in A is equal to the (k − nL − nE)th smallest element in D.


(c) Argue that this third stage of the algorithm can also be performed using a number of comparisons that is at most linear in n (in the worst case).

(d) Prove that nL ≤ (7/10)n + 5 < n and nG ≤ (7/10)n + 5 < n whenever n ≥ 20. This should be easy, if you've managed to answer all the previous parts of this question.

In the fourth and final stage of the algorithm, either an instance of the Selection problem including the array C or D is formed (without performing any more comparisons of array elements), and this is recursively solved to discover the kth smallest element of A, or the value y is returned as this element (depending on how k, nL, and nL + nE are related).

(e) Complete the above sketch in order to write (pseudocode for) a Divide and Conquer algorithm for the Selection problem that has all the properties mentioned above. Note that this will be a deterministic algorithm.

(f) Prove that the number of comparisons used by this algorithm to solve the Selection problem is in Θ(n) in the worst case.
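One way the finished algorithm might look, as a hedged Python sketch rather than an official solution (0-based lists; the function name select is our own), is as follows — try parts (a) through (f) yourself before reading it:

    def select(A, k):
        # Deterministic selection: return the kth smallest element of A,
        # where 1 <= k <= len(A), using the four stages described above.
        n = len(A)
        if n < 20:
            # Base case: sort directly, using O(1) comparisons since n < 20.
            return sorted(A)[k - 1]
        # Stage 1: the medians of the ceil(n/5) subarrays of length <= 5.
        B = [sorted(A[i:i + 5])[(len(A[i:i + 5]) + 1) // 2 - 1]
             for i in range(0, n, 5)]
        # Stage 2: recursively find the median y of B.
        y = select(B, (len(B) + 1) // 2)
        # Stage 3: compare y with every element of A, forming C and D.
        C = [x for x in A if x < y]        # the nL elements below y
        D = [x for x in A if x > y]        # the nG elements above y
        nL, nG = len(C), len(D)
        # Stage 4: recurse on C or D, or return y itself.
        if k <= nL:
            return select(C, k)
        elif k > n - nG:                   # that is, k > nL + nE
            return select(D, k - (n - nG))
        else:
            return y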

8. Recall (probably from CPSC 331) that "Quicksort" is a sorting algorithm that uses Θ(n²) comparisons to sort an array of length n in the worst case, but that only uses O(n log₂ n) operations "most of the time" (or "in the average case").

Use the deterministic algorithm for Selection you designed to solve the previous problem, and modify the "Quicksort" algorithm, in order to produce a deterministic sorting algorithm that uses O(n log₂ n) comparisons in the worst case instead of just "most of the time."

Unfortunately, this new algorithm will probably be of theoretical interest only — it's likely that "Heap Sort" or "Merge Sort" (or both) are faster algorithms than the one you produce in this way.
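Again as a hedged sketch only (reusing the select function from the previous exercise, with a function name of our own): choosing the exact median as the pivot guarantees that each side of the partition has at most about n/2 elements, so the recursion depth is O(log₂ n) and each level costs O(n) comparisons.

    def median_quicksort(A):
        # Quicksort whose pivot is the true median, found deterministically
        # with select(); worst-case comparisons are then in O(n log n).
        if len(A) < 2:
            return list(A)
        pivot = select(A, (len(A) + 1) // 2)   # median, in O(n) comparisons
        smaller = [x for x in A if x < pivot]
        equal = [x for x in A if x == pivot]
        larger = [x for x in A if x > pivot]
        return median_quicksort(smaller) + equal + median_quicksort(larger)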

1.10 Hints for Selected Exercises

Exercise #4(a): Remember to use induction on i, instead of on n.

Exercise #5: You'll need to form and solve recurrences in order to answer this question. You'll be able to use the Master Theorem to solve some of the recurrences you obtain, but not all of them.

Exercise #6(a): There’s a reason why the equations were restated using vectors and matrices!

Exercise #6(d): Forming and solving the recurrence you need here will be complicated by the fact that the integers used in recursively derived instances are just slightly larger than you'd need them to be, in order for the resulting recurrence to be easy to solve using the Master Theorem.

Under these circumstances, you should consider two approaches (and you should probably consider them in the order in which they're given here). You can examine the easy-to-solve recurrence you'd get if the integers were a little bit smaller, solve this recurrence using the Master Theorem, and then use the substitution method to prove that the recurrence you started with has the same solution. Alternatively, you could try to apply the techniques that have been given to "simplify" recurrences, in Chapter 8 in Part I, in order to avoid using the substitution method at all.


Exercise #7(b): If all else fails, consider a case analysis (possibly with ten cases corresponding to ten different remainders you could get when dividing n by ten; you might find that a "case analysis" involving a much smaller number of cases can be used, too).

Exercise #7(f): The Master Theorem should not be used here, because you should discover that the algorithm usually forms and recursively solves two instances of the problem, when the two instances have very different sizes.

If you give away too much, by pretending that both instances have the same size (namely, the maximum of the two sizes they really do have), then you'll get another recurrence for a function that is an upper bound on the number of comparisons used by this algorithm — because the number of comparisons used by the algorithm really does increase with n. Unfortunately, the function that corresponds to this simpler recurrence (which you'd be able to solve using the Master Theorem) will grow more than linearly with n.

Instead of using the Master Theorem, you should take advantage of the fact that you've been told what the final answer should look like (namely, a function that's linear in n) and then use a different method for solving recurrences instead. Apply this method to the first recurrence you got, and not to some other recurrence that you replaced it with when trying to "simplify" the problem.

1.11 Sample Tests

There haven't been any tests that were solely about Divide and Conquer in CPSC 413 in the last few years.

However, you can find one or two questions on this topic in some of the tests on forming and solving recurrences that are given at the end of Chapter 8 in Part I.


Chapter 2

Dynamic Programming

2.1 Overview

While Divide and Conquer can frequently be used to solve a problem in polynomial time, we've seen that this isn't always the case: sometimes — if, for example, you aren't able to reduce the problem size substantially when you recurse — too many instances of the problem must be solved recursively for the algorithm to be efficient.

Dynamic Programming is a design technique that sometimes allows you to produce polynomial-time algorithms anyway, by trading the storage space that you must declare and maintain for running time, and — usually — by working from "the bottom up" rather than from "the top down."

In this chapter, we'll begin by modifying our procedure for computing the Fibonacci numbers, and we'll develop a simple procedure that uses time that is polynomial in the size of the output in the worst case — pretty much the best we can hope for, when the output is substantially larger than the input.

The approach that was used to develop the more efficient algorithm — Dynamic Programming — will then be generalized and described in more detail. It will be applied to additional (progressively more complicated) examples after that. However (as with the material in the previous chapter) it's the algorithm design technique that's important here, and not the algorithms that were produced using it.

CPSC 413 students should know how to design and analyze a Dynamic Programming algorithm. In particular, given sufficient time and guidance, they should be able to design and analyze such algorithms in order to solve problems on tests and exams.

A set of exercises and sample tests is given at the end of this chapter. These can be used to assess the algorithm design and analysis skills mentioned above.

2.2 Example: Computation of the Fibonacci Numbers

Consider, once again, the problem of computing the nth Fibonacci number Fn on input n.

2.2.1 An Inefficient Algorithm

A simple recursive algorithm has already been developed and used as a motivating example in Chapter 8 in Part I. It was noted in Chapter 1 that this could be regarded as a "Divide and Conquer" algorithm but that the number of steps it uses would be exponential in the size of its input, even if that input (an integer n) were represented in unary.


function fib2 (n: integer)
    var fib store: array [0 . . . n] of integer
    var i: integer

    if n ≤ 0 then
        return 0
    elsif n = 1 then
        return 1
    else
        fib store[0] := 0; fib store[1] := 1
        for i := 2 . . . n do
            fib store[i] := fib store[i − 1] + fib store[i − 2]
        end for
        return fib store[n]
    end if
end function

Figure 2.1: A More Efficient Way to Compute the Fibonacci Numbers

In particular, we saw that this simple algorithm used an exponential number of steps — Θ(((1 + √5)/2)ⁿ) operations on input n — in the worst case.

2.2.2 A More Efficient Solution

Recall the recursive definition of the Fibonacci numbers:

Fn = 0 if n = 0,
Fn = 1 if n = 1,
Fn = Fn−1 + Fn−2 if n ≥ 2.

The inefficient recursive function applied this recurrence directly in order to compute Fn, generating two recursive calls to do so whenever n ≥ 2.

It failed to take advantage of the fact that there is actually only a very small number of different values that are computed and used during this recursive computation — specifically, the values Fn, Fn−1, Fn−2, Fn−3, . . . , F1, F0. The problem with the Divide and Conquer algorithm for this computation is that it computes these values over and over again, rather than keeping them around and reusing them (without recomputing them) as needed.

An inspection of the above recurrence confirms that F0 and F1 can be computed without needing to know any of the other Fibonacci numbers, and that Fi can be computed if Fi−1 and Fi−2 (which are values of the function on "smaller" inputs) are available. This should suggest to you that it will be simpler to try to generate these values in the order F0, F1, F2, . . . , Fn−1, Fn of increasing argument (or "index") rather than in the decreasing order that we followed using Divide and Conquer.

That is, we will produce a simpler efficient algorithm by working "from the bottom up," in this case, than we could if we tried to work "from the top down."

Pseudocode for a procedure that uses this strategy is shown in Figure 2.1. Note that if n ≥ 2 then the values F0, F1, . . . , Fn are written into an array as soon as they're computed, and then read from this array whenever they're needed later. Thus the algorithm is iterative instead of recursive, works from small inputs to larger ones (instead of in the opposite order, which the Divide and Conquer solution used), and maintains a data structure to explicitly store and retrieve the solutions for smaller instances of the same problem.


function fib3 (n: integer)
    var previous, current, next, i: integer

    if n ≤ 0 then
        return 0
    elsif n = 1 then
        return 1
    else
        previous := 0; current := 1
        for i := 2 . . . n do
            next := previous + current; previous := current; current := next
        end for
        return current
    end if
end function

Figure 2.2: An Even More Efficient Way to Compute the Fibonacci Numbers


It should be clear (by this point, during the course) that this procedure uses Θ(n) arithmetic operations, when given n as input, in the worst case. Thus, this uses a linear (rather than exponential) number of operations, as a function of n.

2.2.3 An Even More Efficient Solution

While it isn't clear how one could reduce the number of operations used by this new procedure by very much, one more observation will allow us to reduce the storage space by a considerable amount: note that, once this algorithm computes the ith Fibonacci number, Fi, it never uses Fj again for j ≤ i − 2. Therefore, there's no reason to keep these values around, and it isn't too difficult to design an improved version of the algorithm that uses almost the same number of operations as the above one (the running times differ by only a constant factor) but which only needs to store a constant number of integers.

Pseudocode for such a procedure is shown in Figure 2.2. To see that this is correct, you should try to prove that the values of the variables previous and current are Fi−1 and Fi, respectively, at the end of the ith execution of the algorithm's for loop, for 2 ≤ i ≤ n. In particular, the variable current has value Fn on termination, as needed.

2.2.4 An Efficient “Top Down” Computation of the Fibonacci Numbers

Yet another variation on the efficient algorithm for computing the Fibonacci numbers is possible: if we're willing to add a mechanism for explicitly checking whether a given Fibonacci number Fm has been computed already, such as an array computed of boolean values that will be used along with the array fib store storing the Fibonacci numbers, then it's possible to write a recursive top-down algorithm for computing the Fibonacci numbers that is also efficient.


function fib4 (n: integer)
    if n ≤ 0 then
        return 0
    elsif n = 1 then
        return 1
    else
        Declare an array fib store[2 . . . n] of integer and an array computed[2 . . . n] of boolean
        fib init(n)
        return rfib(n)
    end if
end function

Note: The arrays that were initialized by the above function should only be accessible by it and by the following routines. These routines should only be invoked by fib4 and themselves.

procedure fib init(n: integer)
    for i := 2 . . . n do
        computed[i] := false
    end for
end procedure

function rfib(m: integer)
    if m ≤ 0 then
        return 0
    elsif m = 1 then
        return 1
    else
        if not computed[m] then
            fib store[m] := rfib(m − 2) + rfib(m − 1)    (Read this line carefully!)
            computed[m] := true
        end if
        return fib store[m]
    end if
end function

Figure 2.3: An Efficient "Top Down" Computation of the Fibonacci Numbers


In particular, it's possible to write a recursive algorithm whose structure (at the top levels) resembles that of the inefficient Divide and Conquer algorithm for this problem that we started with. However, the new algorithm will behave differently whenever it's asked to compute Fm when m ≥ 2: instead of immediately recursing, the new algorithm will first check whether Fm has already been found. If it has, then the algorithm will simply read the value it needs from the array fib store and will report this value without recursing at all. If this value hasn't been found, then the algorithm will form and recursively solve two smaller instances of the problem, like the original one did — but it will write the resulting value to the array fib store just before it returns this value, in order to make sure that the algorithm will use the data structure, instead of recursive calls, on any future attempts to find the same value Fm.

An algorithm that uses this strategy is shown in Figure 2.3. This includes a main routine, fib4, which returns an answer immediately if n ≤ 1, and which declares the data structures mentioned above otherwise. This routine also calls a procedure, fib init, which initializes the data structures, and a recursive function, rfib, which computes the desired Fibonacci number "from the top down," as described above.

It should be easy to show that the number of arithmetic operations this algorithm uses in the worst case is linear in the number used by the first Dynamic Programming solution for the problem on the same input. However, this last solution might be a bit slower (by a constant factor), partly because of the extra overhead needed to check whether the problem has been solved already.

Exercise: Write a program to implement this algorithm in an object-oriented programming language like C++, so that the data structures and subroutines used by the algorithm are better protected.

You'll probably end up defining a class that exports a function for computation of the nth Fibonacci number, given an integer n as input. The data structures and subroutines used for this computation should not be accessible from outside the class.

You might also want to change the implementation of the algorithm so that a different data structure than a pair of arrays is used — consider, for example, a balanced binary search tree which is initially empty and which stores ordered pairs (i, Fi) for positive integers i during the computation.

How would this replacement of the data structure affect the number of operations (including comparisons of integers at unit cost) used by the algorithm to compute Fn in the worst case?

2.2.5 Storage Space

As suggested at the beginning of this chapter, it has been necessary to declare and manage more storage space using this approach (even with the last version of this algorithm) than is necessary if you're using a purely recursive "Divide and Conquer" strategy. Therefore it might seem that the "Dynamic Programming" solution requires more storage space, in general.

However, this isn't really clear — or, perhaps, it depends on what you count.

Recall that when a recursive algorithm is executed, information about the recursive calls that have been started but not completed (usually, values of input and local variables) must be maintained somewhere — usually on a stack. When the original recursive algorithm for computing the Fibonacci numbers is executed on input n, the "stack" of recursive calls might have depth as large as n. Therefore, if you include the size of this stack when you count the storage space required by the algorithm, then you should discover that the first "Dynamic Programming" solution for this problem does not use significantly more storage space than the "Divide and Conquer" algorithm, after all — and the second "Dynamic Programming" solution uses significantly less storage space than either one.

Since the Divide and Conquer algorithm uses exponential time, because it makes an exponential number of recursive calls, you might initially think that things are even worse than this, in the sense that you might think that the stack depth is sometimes more than linear in n.

Fortunately, this isn't the case: with a bit of work, you should be able to prove that you never have more than n (or, perhaps, n + 1) recursive executions "in progress" at the same time, when the Divide and Conquer algorithm is executed on input n. It follows that the stack won't have depth more than n or n + 1 at any point during the computation, either. Thus the space usage of this recursive algorithm is not significantly worse than it is for the first "Dynamic Programming" solution, even though it also isn't significantly better.
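If you'd like some empirical support for this claim before proving it, the naive recursive algorithm can be instrumented to record the maximum number of calls in progress; the following Python sketch (with names of our own choosing) does exactly that:

    def fib_with_max_depth(n):
        # Naive recursive Fibonacci, instrumented to track the maximum
        # number of recursive calls in progress at any one time.
        state = {"depth": 0, "max": 0}

        def fib(m):
            state["depth"] += 1
            state["max"] = max(state["max"], state["depth"])
            result = m if m <= 1 else fib(m - 1) + fib(m - 2)
            state["depth"] -= 1
            return result

        return fib(n), state["max"]

    # The total number of calls grows exponentially, but the depth never
    # exceeds n:
    print(fib_with_max_depth(20))   # prints (6765, 20)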

2.2.6 A More Reasonable Analysis

Note: This subsection doesn't have much to do with "Dynamic Programming," but it does relate to the ongoing example involving the Fibonacci numbers; you can consider it to be recommended reading (but not required).

The above analyses have been a bit inconsistent or, perhaps, unrealistic, in the sense that we have been considering running times as functions of the input value. This doesn't correspond to the way we've applied "the unit cost criterion" to other examples, in that we haven't tried to express the running time as a function of the "input size." However, since this "input size" would always be "one" according to the unit cost criterion, this wouldn't be very helpful.

Now, we could try to proceed as we did when considering integer multiplication algorithms. That is, we could consider the input size to be the length of a decimal representation of the input n (or the length of a base-B representation, for a constant B ≥ 2), and count the number of operations on digits used by the algorithm.

In order to do this it's necessary to estimate the lengths of representations of all the intermediate values computed by the algorithm as well.

In this case, these are the ith Fibonacci numbers, for 0 ≤ i ≤ n − 1, and additional properties of the Fibonacci numbers that were proved or left as exercises in Chapter 2 in Part I can be used to prove that the length of a representation of Fi is in Θ(i).

In both of the last two algorithms given above, the ith Fibonacci number is obtained (for i ≥ 2) by adding together two smaller values that had already been computed. Since integer addition can be implemented using a linear number of operations on digits, this addition can be performed using Θ(i) of these operations.

This can be used to establish that both of the above algorithms use Θ(n²) operations on digits to compute the nth Fibonacci number. Since this output has a decimal representation of length Θ(n), it follows that both of the above algorithms use a quadratic number of operations on digits when this is considered as a function of the input and output size. Therefore these are both "polynomial time" algorithms, using this cost measure.
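The claim that the representation of Fn has length Θ(n) is easy to check empirically; here is a small Python sketch (fib_digits is our own name) that reuses the constant-space iteration of Figure 2.2:

    def fib_digits(n):
        # Number of decimal digits of the nth Fibonacci number,
        # computed with the constant-space iteration of Figure 2.2.
        if n <= 0:
            return 1
        previous, current = 0, 1
        for _ in range(2, n + 1):
            previous, current = current, previous + current
        return len(str(current))

    # Doubling n roughly doubles the number of digits:
    for n in (100, 200, 400, 800):
        print(n, fib_digits(n))   # 21, 42, 84, and 167 digits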

Similarly, you can estimate the storage space used by each algorithm, measuring this as the number of digits that must be stored at any time. If you do so, then you should discover that the first algorithm requires quadratic (Θ(n²)) space, while the second only requires linear (Θ(n)) space. Neither of these algorithms is recursive, so there's no need to worry about the size of any stack that is used to keep track of recursive calls.

A more precise analysis can be performed if you have a bit more information — including a more precise estimate (including a multiplicative constant) for the number of operations on digits needed to add two integers together, or to assign an integer to a variable, as well as the base B used for the representation of values (since this will determine the lengths of their representations).

If you don't have this information, then determining the information about running time and storage space that's been given above is about as much as you can do, using the techniques that have been introduced in this course.

You might wonder whether experimental techniques, including such things as profiling, can give you sharper results than this.

It turns out that they can, but perhaps not to the extent that you think.

You can run experiments, perform profiling, and so on, in order to get a better idea of the running time used by any particular implementation of the above algorithm(s), under particular conditions.

The results will be affected by such things as the underlying hardware, the operating system used, and the implementation of machine arithmetic, as well as the load that the system is under when the tests are performed.

Change any or all of the above, and you can expect your experimental results to change (somewhat) as well.

On the other hand, it is quite possible that these experimental results can help you to choose between two or more algorithms with the same "asymptotic" complexity, provided that you test both of them under similar conditions. You're comparing apples and oranges if you test the algorithms under different conditions and then try to compare the algorithms based on your test results.

2.3 Dynamic Programming

In this section we'll describe the properties of a problem that you should try to establish if you want to use "Dynamic Programming" to solve it, and then we'll describe a process that you can follow, once these properties have been established, in order to generate a "Dynamic Programming" algorithm that solves the problem.

2.3.1 Necessary Characteristics of Problem to be Solved

You will need to establish that the problem has two properties — the "Optimal Substructure Property," and the fact that an instance can be solved from the solutions of a small number of "Overlapping Subproblems." These are described below.

The Optimal Substructure Property

We'll say that a problem has the Optimal Substructure Property if an instance of the problem can be solved efficiently, using the solutions for some number of smaller "derived" instances of the same problem (and these smaller instances can be generated from the original one, as well).

This property is necessary — and sufficient — for it to be possible to solve the problem using "Divide and Conquer." However, it isn't sufficient to guarantee that a "Divide and Conquer" algorithm for the problem is efficient.

In the case of the problem of computing the Fibonacci numbers, the fact that the function to be computed is defined by a recurrence was sufficient to establish the "Optimal Substructure Property" since, by definition, a recurrence expresses the value of a function on some input in terms of the values of the same function on smaller inputs.


Overlapping Subproblems

We'll say that a problem can be solved using a small number of "overlapping subproblems" — or, that it has the "Overlapping Subproblem" property — if the solution for a given instance can be produced (without recursing), using the solutions of a small number of instances in total — and, furthermore, a recursive algorithm for the problem would only (eventually) recursively form and solve these instances, given the original one as input — even though these instances might be formed and solved over and over again if such an algorithm was used to solve the problem.

In the case of computing the Fibonacci numbers, the problem has the "overlapping subproblem" property because the nth Fibonacci number (the problem's output, on input n) can be computed without recursing, if the "outputs" or "function values" for the inputs 0, 1, . . . , n − 1 (that is, the values F0, F1, . . . , Fn−1) are already known. Furthermore, a straightforward recursive algorithm for computing Fn would only recursively compute the above Fibonacci numbers (and not any more) along the way — even though it would recursively compute at least a few of these an exponential number of times.

2.3.2 Algorithm Development

Now, suppose that you've managed to confirm that each of the above properties is satisfied. In order to design a Dynamic Programming algorithm for the problem, you must also do the following.

1. Give a procedure or process that can be used to list all of the smaller instances (that is, inputs) of the problem whose solutions will eventually be needed in order to solve a given one.

The procedure should list these in a particular order: namely, it should only produce a given instance I after it has listed all the instances that would be needed in order to solve I; the instance it was given as input should be the final instance that it lists.

In the case of the problem of computing the Fibonacci numbers, this is easy: on input n (corresponding to the problem of computing the nth Fibonacci number), such an algorithm would just list the instances (that is, inputs) 0, 1, 2, . . . , n − 1, n in this order.

A procedure that listed the instances in decreasing order, n, n − 1, . . . , 1, 0, would be "incorrect," because it would list the instance n before the instances (including n − 1 and n − 2) that are needed to solve it, rather than after.

2. Choose a data structure that can be used to store the solutions of these instances and retrieve them efficiently when they are needed.

In the case of the problem of computing the Fibonacci numbers, such a data structure was easy to describe: an array indexed by integers (which are the same as the inputs for this problem) is sufficient.

3. Write an algorithm to solve the problem using Dynamic Programming: this will be an iterative algorithm that uses the procedure (or "process") identified in the first step to list the instances of the problem that will be needed, and that uses the data structure described above in order to organize the solutions for these instances.

As each instance is listed (by the above "procedure"), it is solved in essentially the way that it would be using the "Divide and Conquer" algorithm you could have written after establishing the Optimal Substructure Property.


However, the Dynamic Programming algorithm does not call itself recursively whenever it's found that a solution for a smaller instance is needed: instead, the solution for this instance is read from the data structure that is being used to maintain these solutions.

Eventually, after looking up values in this data structure when solutions for smaller problem instances are needed, and performing whatever additional computations would be needed (by a "Divide and Conquer" algorithm for the same problem), the solution for the "current" instance will be obtained. At this point, this should be written to the data structure for problem solutions.

And, eventually, the procedure for enumerating subproblems will terminate, after having listed the instance that it was originally given. At this point — provided that all the "listed" instances, including the final one, have been solved — the instance that was given as input for the entire algorithm will have been solved, so its solution can be found in the data structure that is being maintained and this solution can be returned as output. (Of course, you can make a small optimization by returning this solution as output once you've realized that it's been computed, instead of writing it to your data structure and then looking it up again.)

At this point it might be a good idea to go back and check that the first "efficient" algorithm for computing the Fibonacci numbers, shown in Figure 2.1, matches the above description.

4. Next, consider whether the solutions for some of these smaller instances of subproblems are actually needed for very long.

If you can determine the point in the computation when a given instance's solution is no longer needed (because it will never be referred to again) then you can "discard" this instance's solution at this point in the computation, by removing it from your data structure for problem solutions, or by allowing it to be overwritten.

You may therefore be able to design a smaller data structure than the one described above, which only stores solutions for smaller instances for as long as they are needed, and deletes (or overwrites) them after that.

In the case of the Fibonacci numbers, it was observed that the ith Fibonacci number would not be used after the (i + 2)th Fibonacci number had been computed, so that it was never necessary to keep track of more than two previous function values at once — the Fibonacci number Fi could be overwritten as soon as Fi+2 had been obtained (provided, of course, that Fi+1 was also available).

Thus, the "array" data structure, used in the first version of a Dynamic Programming algorithm for this problem, could be replaced by a data structure that consisted only of a pair of integer variables, with a third variable used (only) to ensure that the data structure was updated correctly whenever a new solution was to be added to it.

At this point it might be a good idea to go back and check that the second "efficient" algorithm for computing the Fibonacci numbers (given in Figure 2.2) is essentially what you'd get from the first one by replacing data structures as described above.

Unless the new data structure can be accessed and modified more quickly than the original one, you shouldn't expect to reduce the running time by making this change. Indeed, the running time might increase if the time needed to maintain the new data structure exceeds the time that would be needed to maintain the old one. However, you may discover that this substitution reduces the storage space needed to solve the problem, by a substantial amount, even if the change has no significant (positive) effect on the algorithm's running time.


2.3.3 A Similar Strategy: “Memoization”

The alternative strategy that is described here might not be discussed in class, if there isn't time for it. If this is the case, then it won't be mentioned in class tests.

A "Dynamic Programming" algorithm that is developed by following the above steps will generally not be recursive (it will frequently be "iterative," having a loop in its outer structure, instead), and it will typically form and solve small instances of a problem first, and use the solutions for these to construct solutions for larger ones. Thus it typically works "from the bottom up," unlike a Divide and Conquer algorithm (which generally works "from the top down").

However, a similar strategy can be used to produce an efficient algorithm that works from the top down, instead. Such an algorithm will often call two subroutines, described below, one after the other:

1. The first subroutine initializes the data structure for solutions of subproblems to be empty — or (if the data structure has a fixed length, like an array) fills the data structure with a designated value that is different from any instance's output value, or does something else to mark the subproblems as "unsolved."

2. The second subroutine is a recursive procedure or function that has almost the same structure as a Divide and Conquer algorithm for the same problem: it always "begins" by attempting to look up the solution for the instance it's been asked to solve, in the data structure that was initialized by the first subroutine.

If this solution is found, then the second subroutine simply returns this solution and terminates.

If this solution isn't found (which will always be the case for the first attempt to solve this instance, but never after that), then the subroutine proceeds essentially as a Divide and Conquer algorithm for the problem would — calling itself recursively whenever a solution for a subproblem is needed.

Before this subroutine terminates, it writes the solution that it's obtained into the data structure that was initialized by the first subroutine, so that it won't ever be necessary to compute this particular solution again.

After the second subroutine terminates, the solution for the problem instance that was originally given as input will have been written into the data structure for problem solutions. Therefore, the "main" procedure can simply look this solution up, return it as output, and terminate.

If it's the case that a call to the second subroutine, when the instance's solution has already been written to the data structure, is not much more expensive than simply "looking up the solution directly" would have been, then the running time used by the above "top down" algorithm would not be significantly larger than the running time for a corresponding "Dynamic Programming" algorithm for the same problem.

It might be a good idea to go back and check that the final algorithm for computing the nth Fibonacci number, shown in Figure 2.3, has this structure.

This kind of top down "Memoization" strategy has one significant advantage over the bottom up "Dynamic Programming" strategy, but it also has a significant disadvantage.

The advantage is that you can sometimes write such an algorithm even though you aren't able to determine, in advance, which smaller instances' solutions you'll need in order to solve a given instance (so that you aren't able to carry out the first step that was described for the design of a Dynamic Programming algorithm).


In particular, you may be able to use a hash table that stores input-output pairs and uses inputs (problem instances) to form the hash table's keys — and write a top-down algorithm of the form given above — even though you wouldn't necessarily know where to start, if you tried to solve the problem by working from the bottom up.

There is at least one programming environment — namely, the computer algebra package Maple — which makes it very easy to turn a Divide and Conquer algorithm into a top down "Memoization" algorithm: if you take a recursive (Divide and Conquer) algorithm that has been written in Maple's programming language, and add the instruction

option remember

to its beginning, then a "table of solutions" will be maintained when the resulting program is executed — without your needing to make any additional changes to the source code. (If you're interested, see the tutorial or online help for Maple, for further details.)
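Python offers a similarly effortless mechanism: the standard library decorator functools.lru_cache maintains the table of input-output pairs, so the naive Divide and Conquer function becomes an efficient Memoization algorithm without any other change to its body. A sketch:

    from functools import lru_cache

    @lru_cache(maxsize=None)   # the decorator maintains the table of solutions
    def fib(n):
        # The naive Divide and Conquer recurrence; with the cache, each
        # value F_m is computed only once and afterwards simply looked up.
        if n <= 0:
            return 0
        elif n == 1:
            return 1
        return fib(n - 2) + fib(n - 1)

    print(fib(100))   # 354224848179261915075, using only O(n) additions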

The disadvantage of a top-down approach is that it is frequently difficult, or impossible, to conserve storage space (by choosing a more efficient data structure) or to make similar optimizations if Memoization is used instead of the bottom-up strategy, "Dynamic Programming." This is extremely likely to be the case if Memoization is being used instead of Dynamic Programming in order to overcome the difficulty (not knowing which smaller instances' solutions you need, or the order in which to solve these smaller instances of the problem) that has just been described. That is, it's likely if you're using Memoization because of the "advantage" that has been given above.

2.4 Example: The “Optimal Fee” Problem

2.4.1 Definition

Suppose now that you are given a list of “fees” of n jobs as input

f1, f2, . . . , fn

where fi is the "fee" for the ith job and is a nonnegative integer, for 1 ≤ i ≤ n, and that you wish to compute a set S ⊆ {1, 2, . . . , n} of (indices of) these jobs such that

fS = ∑i∈S fi

is maximized, subject to the constraint that

if i ∈ S then i + 1 ∉ S, for 1 ≤ i ≤ n − 1.

That is, you are not allowed to include any pair of "consecutive" jobs in the output set S.

A variation of this problem in which the total fee fS is returned as output, instead of the set S, will also be considered.

Note: We'll call a set S ⊆ {1, 2, . . . , n} an admissible set if it satisfies the above constraint. Therefore a set must be "admissible" if it can be the output for an instance of the first version of the above problem.

In order to keep things simple, we'll return to the "unit cost criterion": we'll consider the "input size" for the problem instance that's been described above to be n, and we'll compute the running times of algorithms according to the unit cost criterion (as previously described), as well.


2.4.2 Brute Force Solutions Require Exponential Time

One inefficient approach would be to consider every possible set S ⊆ {1, 2, . . . , n}, rejecting sets that fail to satisfy the constraint (that is, that are not admissible) and computing fS and keeping track of the set S that maximizes this value, for the rest.

There are clearly 2ⁿ sets to be considered, so this is clearly (at best) an exponential-time solution for this problem.

We might be able to do slightly better than this if we could manage just to list the sets S that satisfy the given constraint.

Indeed, we can improve on this still further if we recognize that there are admissible sets S that aren't worth considering. In particular, suppose that S is any set such that i ∉ S, i + 1 ∉ S, and i + 2 ∉ S for some integer i such that 1 ≤ i ≤ n − 2. Set Ŝ = S ∪ {i + 1}; then (after a bit of thought) it should be clear that Ŝ satisfies the above constraint if S does, and

fŜ = fS + fi+1 ≥ fS,

so that it isn't necessary to consider the set S when looking for a solution for the problem — provided that you consider the set Ŝ (or another set that's been formed by making a similar "improvement" to S) instead.

Unfortunately, this is still not good enough: one can prove that there are still at least 2^⌊n/5⌋ sets that must be considered.

However, the idea that we can safely "throw some sets away," even though they satisfy the above constraint, will be useful when we develop an efficient solution for this problem.
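For small n, the brute-force method is still useful as a correctness check for the efficient algorithm developed next. Here is a hedged Python sketch (the function names are ours; fees are stored 0-indexed, so job i has fee f[i − 1]):

    from itertools import combinations

    def is_admissible(S):
        # A set of job indices is admissible when it contains no pair
        # of consecutive indices.
        return all(i + 1 not in S for i in S)

    def brute_force_best_fee(f):
        # Examine all 2^n subsets of {0, ..., n-1} and keep the best
        # admissible one; exponential time, but fine for tiny n.
        n = len(f)
        best = 0
        for r in range(n + 1):
            for S in combinations(range(n), r):
                if is_admissible(set(S)):
                    best = max(best, sum(f[i] for i in S))
        return best

    print(brute_force_best_fee([3, 5, 2, 7]))   # 12, by taking jobs 2 and 4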

2.4.3 Solving the Second Version Using Dynamic Programming

To begin with, we'll consider the variation of the problem in which the optimal total fee fS is returned instead of an optimal set S; the original version of the problem will be considered again later.

Establishing the Necessary Properties

Optimal Substructure Property. Suppose 1 ≤ i ≤ n and consider an instance of this problem of size i that includes the fees

f1, f2, . . . , fi

for the first i jobs in the original instance of the problem. Let best feei be the corresponding total fee (the correct output for the second version of the problem, for the same input). We'll also consider one particular set Si that is a correct solution for the first version of this problem, given this instance as input, as well.

Note that there might be more than one admissible set whose total fee is best feei, so we really are picking out one of several possible correct solutions when we choose or define Si.

Since f1 ≥ 0, it is clear that best fee1 = f1 and that we can set S1 = {1}.

Similarly (since we can't include both 1 and 2 in S2, but both f1 and f2 are nonnegative),

best fee2 = max(f1, f2)

and we can set

S2 = {1} if f1 ≥ f2,
S2 = {2} if f1 < f2.


Suppose now that i ≥ 3, and note that we can either include i in Si or exclude it.

If we do not include i in Si then Si is also an admissible set for the instance of this problem of size i − 1 consisting of the fees for the first i − 1 jobs, so that

best feei ≤ best feei−1

in this case. However, the set Si−1 is always an admissible set for the input of size i that includes the fees for the first i jobs, so

best feei−1 ≤ best feei

as well. Thus

best feei = best feei−1 if i ∉ Si.

Furthermore, we can correctly define Si to be the same set as Si−1 if we can somehow decide that i should not be included in it.

In the only other case, that i should be included in Si, it is clear that i − 1 cannot be included — otherwise, Si would include both i − 1 and i, so that Si would not be admissible after all. It follows in this case that if we remove i from Si then the result is an admissible set for the input instance of size i − 2 that includes the fees for the first i − 2 jobs.

On the other hand, if we add i to the set Si−2 then we always obtain an admissible set for the above instance of size i. Therefore

best feei = best feei−2 + fi if i ∈ Si,

and, furthermore, we can correctly define Si to be Si−2 ∪ {i} if we can somehow decide that i should be included in this set.

Now, how do we decide whether to include i?

Note that, by the above argument, there exist admissible sets for the above input of size i with total fees best feei−1 and best feei−2 + fi respectively, and that best feei must always be equal to one or the other of these values. On the other hand, best feei must be as large as possible, so it will always be true that

best feei = max(best feei−1, best feei−2 + fi).

In order to define a set Si with the above total fee, we will exclude i from Si and set Si to be Si−1 if best feei = best feei−1, and we will include i and define Si to be Si−2 ∪ {i} otherwise. In other words, we can now define best feei by the recurrence

best feei = f1 if i = 1,
best feei = max(f1, f2) if i = 2,
best feei = max(best feei−1, best feei−2 + fi) if 3 ≤ i ≤ n,

and we can define Si using the recurrence

Si = {1} if i = 1,
Si = {1} if i = 2 and f1 ≥ f2,
Si = {2} if i = 2 and f1 < f2,
Si = Si−1 if 3 ≤ i ≤ n and best feei = best feei−1,
Si = Si−2 ∪ {i} if 3 ≤ i ≤ n and best feei ≠ best feei−1,        (2.1)


function best fee(n)
    if n = 1 then
        return f [1]
    elsif n = 2 then
        return max(f [1], f [2])
    else
        return max(best fee(n − 2) + f [n], best fee(n − 1))
    end if
end function

Figure 2.4: An Exponential-Time Divide and Conquer Algorithm for the Optimal Fee

provided that we've managed to compute best feei for 1 ≤ i ≤ n.

Note that, if we're only interested in solving the second version of the problem, then we don't need to compute Si at all.

We've also used the approach of "safely ignoring admissible sets when we know that better ones are being considered" about as far as anyone could expect to — only two sets are being considered as candidates here for Si.

At this point, it is possible to write a function that solves this problem using Divide and Conquer; a recursive function that solves the second version of this problem in this way is given in Figure 2.4. This function is correct if you assume that the input fees f1, f2, . . . , fn are stored in the positions with indices 1, 2, . . . , n, respectively, of an array f; the function has an integer input n and returns the value best feen as output.

This function is reasonably easy to analyze, if you've understood the analysis of the original recursive function for the computation of the Fibonacci numbers. Unfortunately, this is the case because the two algorithms have virtually the same running times (and virtually the same analysis); thus, this recursive function also uses exponential time.

Overlapping Subproblems. For 1 ≤ i ≤ n let Ii denote an instance of (the second version of) this problem of size i that includes the fees

f1, f2, . . . , fi.

Then we wish to solve the instance In of this problem (to compute best feen) and, since best feei is the solution for instance Ii of this problem, a consideration of the above recurrence for best feen should confirm that the only instances of this problem, whose solutions are needed to find best feen, are the instances

I1, I2, . . . , In

that have just been defined. Since there are only n of these instances (most of which are solved repeatedly when the recursive function in Figure 2.4 is used), this problem does have the desired "overlapping subproblems" property.

Algorithm Design

Subproblems, and Their Solution Order. As previously described, we now continue by finding a procedure that can be used to list subproblems in an order that guarantees that solutions for "smaller" problem instances will always be available when they are needed, such that the list includes (and ends with) the problem instance that we were originally asked to solve.


function best fee2(n)
    var i: integer

    if n = 1 then
        return f [1]
    elsif n = 2 then
        return max(f [1], f [2])
    else
        bestFee[1] := f [1]; bestFee[2] := max(f [1], f [2])
        i := 2
        while i < n do
            i := i + 1
            bestFee[i] := max(bestFee[i − 2] + f [i], bestFee[i − 1])
        end while
        return bestFee[n]
    end if
end function

Figure 2.5: A More Efficient, Dynamic Programming, Solution for the Problem


This procedure is easy to describe: It simply lists the problem instances

I1, I2, . . . , In

by order of increasing input size, as shown here. The solutions of the (second version of the) problem, corresponding to these instances, are the values

best fee1, best fee2, . . . , best feen,

so it should be clear from the above recurrence for best feen that each value best feei appears after the solutions for problems (best feei−1 and best feei−2) that are needed to derive it. The above list also ends with the original problem instance, In, as desired.

Choice of Data Structure. Once again, an array (this time, of length n) can be used as a data structure to store the solutions for these instances of (the second version of) the problem: we'll call this array bestFee, and we'll eventually store the solution best feei for the instance Ii of the problem in location i of this array, for 1 ≤ i ≤ n.

A First Efficient Algorithm. Assuming that the input fees are stored in the entries of an array f (as was the case for the recursive solution for the problem) a Dynamic Programming solution for the second version of the problem, using the above subproblem ordering and data structure, is as shown in Figure 2.5.

Once again, since bestFee is an array, so that a reference to "bestFee[i]" corresponds to an array access instead of a recursive function call, it is reasonably easy to show that this function uses Θ(n) operations in order to solve the second version of the problem, on an input of size n.


function best set(n)
    if (n = 1) then
        return {1}
    elsif (n = 2) then
        if bestFee[2] = f [1] then
            return {1}
        else
            return {2}
        end if
    else
        if bestFee[n] = bestFee[n − 1] then
            return best set(n − 1)
        else
            return best set(n − 2) ∪ {n}
        end if
    end if
end function

Figure 2.6: A Divide and Conquer Completion for the First Version of the Problem

Improvement: Reduction of Storage Space. After thinking about how long solutions for problem instances are really needed, we can make pretty much exactly the same improvement for this function as was made in order to produce the final version of a procedure that computes the Fibonacci numbers. That is, the array bestFee of length n that is used by the above function could be replaced by a pair of variables (along with a third variable that is only used to "update" this data structure as needed).

It might be a useful exercise to write pseudocode for this final solution of the second version of this problem.
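For comparison after you've tried it, here is one way this space-reduced solution might look as a Python sketch (best_fee3 is our own name; the fee fi is stored in f[i − 1]):

    def best_fee3(f):
        # Second version of the problem with constant storage: two_back
        # and one_back play the roles of best_fee_{i-2} and best_fee_{i-1}.
        n = len(f)
        if n == 1:
            return f[0]
        two_back, one_back = f[0], max(f[0], f[1])
        for i in range(2, n):
            # Python's tuple assignment takes the place of the third,
            # temporary variable mentioned above.
            two_back, one_back = one_back, max(two_back + f[i], one_back)
        return one_back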

2.4.4 Solving the Original Version of the Problem (Sketch)

Let's suppose now that the Dynamic Programming algorithm for the second version of the problem that is shown in Figure 2.5 has been used to compute best feei for 1 ≤ i ≤ n and these values have been written into the array bestFee.

Then — provided that this array has not been erased or destroyed — a correct solution for the first version of the problem can be generated using a recursive function that simply implements the recursive definition of Sn that is given in Equation (2.1). Such a recursive function is shown in Figure 2.6.

By now we've seen several exponential-time recursive functions, so you may be surprised to read that the recursive function in Figure 2.6 uses Θ(n) operations to compute the set Sn (as the set it returns as output, when given n as input) in the worst case.

Here is the important difference between the previous recursive functions and this one: the previous recursive functions called themselves recursively, with inputs (of size) n − 1 and n − 2, when given an input (of size) n. This last recursive function recursively calls itself only once on input n — either with input n − 1 or with input n − 2 (but not both), depending on the contents of the arrays bestFee and f.


At this point, it might be a useful exercise to analyze this recursive function more carefully, in order to prove more formally that this claim about its running time is correct.

Another (possibly, more interesting) exercise would be to modify this solution for the first version of the problem, to reduce the storage space used. Try to modify this solution, in order to produce an algorithm with (asymptotically) the same running time, that uses only two or three integer variables, and a bit vector (that is, an array of boolean values) of length n, as storage.

2.5 Example: The Matrix Chain Problem

The previous example was useful because it showed that Dynamic Programming could be used to do more than evaluate functions that were (initially) given by recurrences. It was somewhat unsatisfactory, because it was a bit too similar to the original "Fibonacci numbers" example — its solution used a similar arrangement of subproblems, and almost exactly the same data structure for the management of problem solutions.

The next example corrects this difficulty, in the sense that it isn't quite so similar to the first example. It's a very common example, and you can find it in Aho, Hopcroft and Ullman [1], Brassard and Bratley [2], Cormen, Leiserson and Rivest [5], and Neapolitan and Naimipour [9].

2.5.1 Definition of the Problem

Multiplying Chains of Matrices. Suppose we are given a list of n matrices A1, A2, . . . , An, where the ith matrix has pi−1 rows and pi columns, for 1 ≤ i ≤ n — so that the number of columns in the ith matrix is the same as the number of rows in the (i + 1)th matrix, for all i. Then the matrix product

A1 · A2 · · · An

would be defined, and this would be a matrix with p0 rows and pn columns.

Strategies for Multiplying Chains. If n > 2 then there would be several different ways to compute this matrix product by using a sequence of multiplications of pairs of matrices. For example, if n = 3 then we could either multiply A1 by A2 first, and then multiply this product by A3 on the right, or we could first multiply A2 by A3 and then multiply the resulting product by A1 on the left. In other words, we can take advantage of the fact that matrix multiplication is an "associative operation," so that

(A1 · A2) · A3 = A1 · (A2 · A3).

Indeed, we "take advantage" of this fact every time we write a matrix product of three or more matrices as

A1 · A2 · A3

without specifying the order in which the multiplications should be performed (by adding parentheses, as above).

Note, as well (to avoid any misunderstanding), that matrix multiplication is not also a "commutative" operation. It is not true, in general, that A1 · A2 = A2 · A1. Indeed, the second product isn't even defined unless p0 = p2; the two matrices have different shapes if p0 = p2 but p0 ≠ p1; and they'll have the same shapes, but will frequently be different matrices, even when p0 = p1 = p2.


Therefore the two alternatives given above, "(A1 · A2) · A3" and "A1 · (A2 · A3)", are the only ways to compute A1 · A2 · A3 that we'll be interested in.

In general (when n ≥ 3) there will be a different "strategy" for computing the product A1 · A2 · · · An for every way to include parentheses in this expression, to completely specify the order in which pairs of matrices are to be multiplied together in order to form the above matrix product.

Cost of Multiplying Chains. Let's suppose — as is usually the case — that we'll be using "standard" matrix multiplication (the algorithm you probably learned first, to multiply matrices together) in order to perform this computation, rather than Strassen's method or one of the other alternatives mentioned in Chapter 1. Then the number of multiplications of "scalars" that is needed to multiply an i × j matrix by a j × k matrix is ijk, and the number of additional operations (namely, additions of scalars) is in Θ(ijk) as well. To simplify the arithmetic that's to follow (in this presentation) we'll pretend that the "cost" of this matrix multiplication is exactly ijk (or, that's what we'll "define" it to be, if you'd prefer to think of it that way instead).

Using this cost measure, the total cost to compute A1 · A2 · A3 (and, more generally, A1 · A2 · · · An) turns out to depend upon the strategy that has been selected in order to perform the computation. Assuming, as above, that Ai has pi−1 rows and pi columns, you should be able to confirm that the product A1 · A2 can be computed using p0p1p2 operations, and the resulting p0 × p2 matrix can be multiplied by A3 using an additional p0p2p3 operations, so that the cost of computing (A1 · A2) · A3 using the multiplications suggested by the parentheses is

p0p1p2 + p0p2p3,

while the cost of computing the p1 × p3 matrix A2 · A3 is p1p2p3, and the cost of multiplying A1 by this matrix is p0p1p3 — so that the cost of computing A1 · (A2 · A3) using the multiplications suggested by the parentheses is

p1p2p3 + p0p1p3.

Suppose (for the sake of example) that p0 = 1 and p1 = p2 = p3 = 2. Then, in this case,

p0p1p2 + p0p2p3 = 8 and p1p2p3 + p0p1p3 = 12,

so the strategy you choose to perform this “matrix chain” multiplication really can affect its total cost.
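To make the arithmetic above easy to check, here is a small Python sketch (ours, not part of the original notes; the function names are arbitrary) that evaluates the two costs under the ijk cost measure:

    def cost_left_first(p):
        # cost of (A1 * A2) * A3: p0*p1*p2 scalar multiplications for A1 * A2,
        # then p0*p2*p3 more to multiply the result by A3
        return p[0] * p[1] * p[2] + p[0] * p[2] * p[3]

    def cost_right_first(p):
        # cost of A1 * (A2 * A3): p1*p2*p3 scalar multiplications for A2 * A3,
        # then p0*p1*p3 more to multiply A1 by the result
        return p[1] * p[2] * p[3] + p[0] * p[1] * p[3]

    p = [1, 2, 2, 2]               # p0 = 1 and p1 = p2 = p3 = 2, as above
    print(cost_left_first(p))      # prints 8
    print(cost_right_first(p))     # prints 12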

Problems to be Solved. We’re now ready to define two versions of a problem to be solved. The input for both versions of the problem will be the same — a sequence of n + 1 positive integers

p0, p1, p2, . . . , pn

representing the rows and columns of a sequence of matrices

A1, A2, . . . , An

(as above), whose product A1 · A2 · · · An one might like to obtain.

For the first version of the problem, the output will be the total cost (that is, total number

of multiplications of scalars) that is required in order to compute the above matrix product, A1 ·A2 · · ·An, when you choose a strategy that makes this total cost as small as possible.


For the second version of the problem, the output will somehow include a specification of this “optimal strategy,” as well. A data structure that can be used to represent this strategy (which will therefore be included in the output for this version of the problem) will be described later.

In order to keep things simple, once again, we’ll use the “unit cost criterion” when analyzing algorithms for this problem. Thus the input size will be considered to be n + 1 when the input is as given above, because this input consists of a sequence of n + 1 positive integers.

2.5.2 Brute Force Solutions Require Exponential Time

A “brute force” solution for this problem would be one in which you list every possible strategy for performing the given matrix chain computation, compute the total cost for each, and then choose the cost (and strategy) that is found to be at least as good as all the others.

While it might not be clear now, it’s to be hoped that it will be clear by the time you’ve reached the end of these notes that the number of strategies you can choose to compute the product

A1 ·A2 · · ·An

is at least as large as a function P(n) that satisfies a recurrence,

P(n) =
    1                                      if n = 1,
    ∑_{k=1}^{n−1} P(k) P(n − k)            if n > 1.

It can be proved by induction on n that P(n) = Cn−1, where Cm is the mth Catalan number,

Cm = (1/(m + 1)) · (2m choose m).

Using “Stirling’s approximation for m!” (given, for example, by Cormen, Leiserson, and Rivest [5]), it can be shown that

Cm ∈ Ω(4^m / m^(3/2)),

so that P(n) ∈ Ω(4^n / n^(3/2)) as well. It should now be clear that a “brute force” strategy for solving (either version of) the given problem will require exponential time.
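If you’d like to see the growth directly, the recurrence for P(n) is easy to evaluate; this Python sketch (ours, for illustration; the name num_strategies is arbitrary) prints the first few values, which are the Catalan numbers 1, 1, 2, 5, 14, 42, . . . :

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def num_strategies(n):
        # P(n): the number of strategies for a chain of n matrices
        if n == 1:
            return 1
        return sum(num_strategies(k) * num_strategies(n - k) for k in range(1, n))

    print([num_strategies(n) for n in range(1, 8)])   # [1, 1, 2, 5, 14, 42, 132]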

2.5.3 Solution Using Divide and Conquer

In order to design a solution for this problem using Divide and Conquer, and to start to design a solution using Dynamic Programming, we’ll need to find a recurrence that we can use to compute the optimal total cost for this computation.

If n = 1 then the problem is trivial, and the “total cost” that we should be reporting is zero, since no matrix multiplications are required.

Otherwise (if n ≥ 2), you should note that every strategy for computing A1 · A2 · · · An has to end with a multiplication of a pair of matrices. Now you may assume “without loss of generality” that the strategy will have the form,

1. Somehow, compute the matrix product A1 ·A2 · · ·Ak.

2. Somehow, compute the matrix product Ak+1 ·Ak+2 · · ·An.

45

Page 46: CPSC 413 Lecture Notes | Part IIpages.cpsc.ucalgary.ca/~eberly/Courses/CPSC413/1998/...Neapolitan and Naimipour [9] (in Section 2.1). 1.4 Merge Sort On way to sort an array of length

3. Perform the final matrix multiplication to compute the product

A1 ·A2 · · ·An = (A1 ·A2 · · ·Ak) · (Ak+1 ·Ak+2 · · ·An),

where you can choose k to be any positive integer such that 1 ≤ k ≤ n − 1. (If k = 1 or k = n − 1 then one, or the other, of the first two steps will be trivial and will have total cost zero; both of the first two steps are nontrivial, otherwise.)

It was claimed above that you can make this assumption “without loss of generality,” in spite of the fact that there are other strategies besides these. For example, you could reverse the order of Steps 1 and 2 (they’re independent), or you could mingle their multiplications together in certain ways.

However, that’s all you can do, and every other strategy will have the same cost as one of the strategies with the form given above. Put another way: You can take any other strategy that ends with the same matrix multiplication that’s listed above, and reorder the matrix multiplications that it includes, in order to produce a strategy that has the form given above, and that has the same cost.

Therefore it’s sufficient to consider only those strategies with the above form, as claimed above. Furthermore we may assume that “Step 1” and “Step 2” are both strategies of this form, and so on.

Now, you should be able to confirm that the function P(n) mentioned in the previous section (on the cost of brute force solutions) is the number of strategies of this form, provided that the restriction is applied inductively to “sub-strategies,” as the previous paragraph suggests.

It’s also the number of ways to add pairs of parentheses to the product

A1 · A2 · · · An

in order to “parenthesize” this expression “fully.” (See Cormen, Leiserson, and Rivest’s discussion of this problem, if you’re interested in a more precise definition of an expression’s being “fully parenthesized.”)

Here is the next observation: Once you’ve chosen the integer k that defines the top level of the above “strategy” (it specifies precisely which multiplication you’ll perform at the end), you’re left with the task of deciding how to implement Steps 1 and 2. As argued above, we might as well use strategies of the same general form.

Furthermore, if we want to reduce the cost of the entire computation, we should choose “optimal” ways to perform Steps 1 and 2, which are computations of the same general form, as well.

We’ve now made progress towards finding a recursive solution for this problem.

Let m[1, n] denote the optimal cost of the entire computation, which is what we wish to compute (to solve the first version of the problem, given the input that’s been described). For 1 ≤ i ≤ n, let m[1, i] denote the optimal cost to compute the matrix product

A1 · A2 · · · Ai

(so that m[1, 1] = 0) and let m[i, n] denote the optimal cost to compute the matrix product

Ai · Ai+1 · · · An

(so that m[n, n] = 0 as well). Then the above claims about strategies for performing this computation imply that m[1, n] = 0 if n = 1 and that

m[1, n] = min_{1≤k≤n−1} (m[1, k] + m[k + 1, n] + p0pkpn)


function min cost(i, j)
    var current, best, k: integer
    if i ≥ j then
        return 0
    else
        best := pi−1 pi pj + min cost(i + 1, j)
        k := i
        while k < j − 1 do
            k := k + 1
            current := pi−1 pk pj + min cost(i, k) + min cost(k + 1, j)
            best := min(best, current)
        end while
        return best
    end if
end function

Figure 2.7: An Inefficient Solution Using Divide and Conquer

if n ≥ 2. Note that if n = 2 this is just the claim that m[1, n] = p0p1p2, as we’d hope, and that it also agrees with the expression you could obtain for m[1, n] when n = 3 by performing the case analysis that was included in the description of this problem.

Of course, we aren’t finished — we need a way to compute m[1, i] and m[i, n] as well. If 1 ≤ i ≤ j ≤ n, let m[i, j] denote the (optimal) cost required to compute the matrix product

Ai ·Ai+1 · · ·Aj .

Then if you apply the argument that was given above (to analyze m[1, n]) over again, in order to analyze m[1, j] where 1 ≤ j ≤ n, you should be able to confirm that

m[1, j] =
    0                                                     if j = 1,
    min_{1≤k≤j−1} (m[1, k] + m[k + 1, j] + p0pkpj)        if j > 1.

It follows by a similar argument that, for 1 ≤ i ≤ n,

m[i, n] =
    0                                                     if i = n,
    min_{i≤k≤n−1} (m[i, k] + m[k + 1, n] + pi−1pkpn)      if i < n.

We’re still not done, because these expressions now refer to the values m[i, j] for all i and j such that 1 ≤ i ≤ j ≤ n (both in order to define m[1, j], and m[i, n]).

So, we’ll apply the argument once again, to try to derive an expression for m[i, j]. You should be able to conclude that if 1 ≤ i ≤ j ≤ n then

m[i, j] =
    0                                                     if i = j,
    min_{i≤k≤j−1} (m[i, k] + m[k + 1, j] + pi−1pkpj)      if i < j.

Now, finally, there is some good news: This recursive definition “only” defines m[i, j] in terms of the same function (for different choices of i and j).


A recursive function which implements this recursive definition is shown in Figure 2.7; given i and j such that 1 ≤ i ≤ j ≤ n as input, it returns the value m[i, j] as output if it terminates. Thus — if it terminates — then it can be used to solve the first version of our problem by calling it with inputs i = 1 and j = n.

It also refers to the inputs p0, p1, . . . , pn; of course, if you want to make this look a bit more like source code, then you can define an array p of length n + 1 and rewrite this, assuming that pi is stored in the array location p[i] for 0 ≤ i ≤ n.
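For example (a sketch of ours, not the notes’ official code), the function in Figure 2.7 might be written in Python with the dimensions passed in as such a list p:

    def min_cost(p, i, j):
        # returns m[i, j], the optimal cost to compute Ai * ... * Aj, where the
        # matrix Ai has p[i - 1] rows and p[i] columns
        if i >= j:
            return 0
        best = p[i - 1] * p[i] * p[j] + min_cost(p, i + 1, j)   # the case k = i
        for k in range(i + 1, j):                               # the cases i < k < j
            current = p[i - 1] * p[k] * p[j] + min_cost(p, i, k) + min_cost(p, k + 1, j)
            best = min(best, current)
        return best

    print(min_cost([1, 2, 2, 2], 1, 3))   # prints 8, matching the example above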

Unfortunately, it may not be immediately clear that this function does terminate. To see that it does, you should note that it terminates whenever it is given inputs i and j such that i = j (or if i > j, in case you accidentally call it with inputs that make no sense).

If it is called with inputs i and j such that i < j then it recursively calls itself with inputs r and s, where i ≤ r ≤ s ≤ j and s − r < j − i. Fortunately, that’s all we need to know in order to prove that this terminates, since |s − r| is an integer which is strictly less than the integer |j − i| in all such cases — so it is possible to prove, formally, that this function terminates whenever it is called with inputs i and j such that 1 ≤ i ≤ j ≤ n, using induction on j − i.

Furthermore, you can also establish, by induction on j − i, that the depth of the stack used to store information about recursive calls (when the function is executed) never exceeds j − i. Therefore the space used by this algorithm to compute m[1, n] is in O(n), even when you include the size of this stack when you count storage space (at least, provided that the unit cost criterion is used to measure this).

Unfortunately, you can also establish one more thing by induction on j − i — namely, that the number of operations used by this algorithm on these inputs is in Ω(2^(j−i)), so that this is definitely (at best) an exponential-time solution for this problem.

2.5.4 Solution Using Dynamic Programming

We did quite a bit of the work that we need for a “Dynamic Programming” solution above, in order to solve the problem using “Divide and Conquer.”

Establishing the Necessary Properties

Optimal Substructure Property. For 1 ≤ i ≤ j ≤ n let Ii,j be an instance of this problem of size j − i + 2 that includes the inputs

pi−1, pi, pi+1, . . . , pj ;

then the solution for (the first version of) the problem given this instance as input is the value m[i, j] that was discussed in the previous section. We established, when developing a solution using Divide and Conquer, that it is possible to solve the original instance of this problem using only the values p0, p1, . . . , pn and the solutions m[i, j] for the instances Ii,j, for 1 ≤ i ≤ j ≤ n.

Therefore this problem does have the “optimal substructure property.” The fact that we did manage to develop an inefficient Divide and Conquer solution for the problem confirms this, as well.

Overlapping Subproblems. Furthermore, as argued above, there are at most n^2 “smaller” instances of problems whose solutions we need in order to solve the originally given instance (of size n + 1) — namely, the solutions m[i, j] for the instances Ii,j, for 1 ≤ i ≤ j ≤ n. (As usual, the Divide and Conquer solution for the problem used exponential time because it formed and solved many of these instances over and over again.)


Therefore, this problem has the desired “overlapping subproblems” property too.

Algorithm Design

Subproblems, and Their Solution Order. It isn’t quite as easy to choose an order for the subproblems, in this case, as it was for previous examples.

However, the proof that the above recursive function (in Figure 2.7) terminates suggests at least one order that can be used, because it used the observation that you only need to know values m[r, s], such that |s − r| < |j − i|, in order to compute m[i, j].

Here is a procedure that lists (the names of) the needed instances in an order that has the properties we need, and which were identified when the design of Dynamic Programming algorithms was discussed:

for h := 0 . . . n − 1 do
    for i := 1 . . . n − h do
        List the instance Ii,i+h
    end for
end for

You should confirm that this lists instances, starting with I1,1, I2,2, . . . , In,n (when h = 0), continuing in such a way that every instance Ii,j such that 1 ≤ i ≤ j ≤ n is eventually listed, using an order such that |j − i| never decreases, and such that the final instance listed is I1,n. Since the difference between the second parameter j and the first parameter i never decreases, and all the instances you might want are included, this ensures that every solution m[r, s] you need would have been computed and stored already, when you tried to compute the solution m[i, j]. That is, if you need m[r, s] in order to compute m[i, j], then Ir,s is listed before Ii,j by the above procedure.

You should also (eventually) note that the most expensive part of the Dynamic Programming solution for the problem that will shortly be presented is (as far as its outer structure is concerned) a doubly-nested for-loop, in which the instances are generated and solved using exactly the same order as the order in which they’re listed by the code listed above. (More precisely, this loop will list all but the first n of these instances. It will be preceded by a smaller loop that is used to list and solve the first n instances of the problem that the above procedure lists, because these are part of a “base” case and it’s more convenient to handle them separately from the rest.)

Choice of Data Structure. A two-dimensional array chain cost[1 . . . n][1 . . . n] can be used as the data structure to store solutions for instances of problems; for 1 ≤ i ≤ j ≤ n, the solution m[i, j] for the instance Ii,j will eventually be stored in the location chain cost[i, j] of the array.

Of course, this is rather wasteful, because slightly less than half the array locations (the locations chain cost[i, j] for i > j) are never used, so you could reduce the storage space required by declaring and using a slightly different data structure (probably, at the cost of complicating the source code and, perhaps, increasing the time required by a small amount). A two-dimensional array will be used in the solution of the problem that’s presented next, in order to try to make the code more readable than it might otherwise be.

A First Efficient Algorithm. A Dynamic Programming algorithm for (the first version of) this problem, based on the above choices, is given in Figure 2.8. The most expensive part of this is a


function min cost2(n)
    var current, best, i, h, k: integer
    if (n ≤ 1) then
        return 0
    else
        for i := 1 . . . n do
            chain cost[i, i] := 0
        end for
        for h := 1 . . . n − 1 do
            for i := 1 . . . n − h do
                best := pi−1 pi pi+h + chain cost[i + 1, i + h]
                k := i
                while k < i + h − 1 do
                    k := k + 1
                    current := pi−1 pk pi+h + chain cost[i, k] + chain cost[k + 1, i + h]
                    best := min(best, current)
                end while
                chain cost[i, i + h] := best
            end for
        end for
        return chain cost[1, n]
    end if
end function

Figure 2.8: An Efficient Solution Using Dynamic Programming

triply-nested loop, and it should not be difficult (at this point in the course) for you to establish that it uses Θ(n^3) operations to solve this problem, when given the input

p0, p1, . . . , pn

of size n+ 1 that has been described.
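As one possible rendering in source code (a sketch of ours; a Python dictionary stands in for the two-dimensional array), the algorithm of Figure 2.8 might look like this:

    def min_cost2(p):
        # p is the list p[0..n] of dimensions; returns m[1, n]
        n = len(p) - 1
        if n <= 1:
            return 0
        chain_cost = {}
        for i in range(1, n + 1):        # base cases: single matrices cost nothing
            chain_cost[(i, i)] = 0
        for h in range(1, n):            # solve instances in order of increasing j - i
            for i in range(1, n - h + 1):
                j = i + h
                chain_cost[(i, j)] = min(
                    p[i - 1] * p[k] * p[j] + chain_cost[(i, k)] + chain_cost[(k + 1, j)]
                    for k in range(i, j))
            # every value chain_cost[(r, s)] with s - r < h is available already
        return chain_cost[(1, n)]

    print(min_cost2([1, 2, 2, 2]))   # prints 8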

Improvement: Reduction of Storage Space. Apart from the minor improvement mentioned above, there’s no obvious way to reduce the storage space requirements in this case, because you need to refer to almost every value m[i, j] that’s generated, almost until the end of the computation. In other words, it’s not true in this case that you can reduce storage requirements substantially by discarding the values m[i, j] once they’re no longer needed.

2.5.5 The Second Version of This Problem

In an alternative version of this problem, you would try to compute and return an optimal “strategy” to compute the matrix product

A1 ·A2 · · ·An

along with its cost.

50

Page 51: CPSC 413 Lecture Notes | Part IIpages.cpsc.ucalgary.ca/~eberly/Courses/CPSC413/1998/...Neapolitan and Naimipour [9] (in Section 2.1). 1.4 Merge Sort On way to sort an array of length

One way to represent such a strategy (that is, to generate it as output, in a way that would allow you to use this output along with the matrices A1, A2, . . . , An to perform the matrix multiplications you need) would be to use another two-dimensional array, split. For 1 ≤ i ≤ j ≤ n, split[i, j] would store an integer k such that i ≤ k ≤ j − 1 and such that one optimal strategy for computing the matrix product

Ai ·Ai+1 · · ·Aj

would be to execute the following steps, in order:

1. Compute the matrix product Ai ·Ai+1 · · ·Ak as efficiently as possible.

2. Compute the matrix product Ak+1 ·Ak+2 · · ·Aj as efficiently as possible.

3. Perform one more matrix multiplication to compute (and return)

(Ai ·Ai+1 · · ·Ak) · (Ak+1 ·Ak+2 · · ·Aj).

As an exercise, you should try to modify the Dynamic Programming solution that was given for the first version of this problem, so that it computes the above array as well. It shouldn’t be necessary to increase either the running time or the storage space required by more than a constant factor.
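To see how the split array would be used once it has been filled in, here is a hedged Python sketch (the names split, matmul, and multiply_chain are ours; it assumes split was computed by the modified algorithm just described):

    def matmul(X, Y):
        # standard multiplication of an i x j matrix X by a j x k matrix Y
        return [[sum(X[r][t] * Y[t][c] for t in range(len(Y)))
                 for c in range(len(Y[0]))] for r in range(len(X))]

    def multiply_chain(A, split, i, j):
        # returns Ai * ... * Aj, where A[i] holds the ith matrix (1 <= i <= n)
        if i == j:
            return A[i]
        k = split[(i, j)]                             # the recorded split point
        left = multiply_chain(A, split, i, k)         # step 1
        right = multiply_chain(A, split, k + 1, j)    # step 2
        return matmul(left, right)                    # step 3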

2.6 Additional Examples

Brassard and Bratley [2], Cormen, Leiserson and Rivest [5], Horowitz, Sahni and Rajasekaran [6], and Neapolitan and Naimipour [9] all include chapters on Dynamic Programming with additional long examples.

2.7 Exercises

Hints for some of the following exercises are given in the section after this one. Solutions for Exercises #1, 2, and 4 can be found in Subsection 4.2.1.

1. Design a Dynamic Programming algorithm that computes the binomial coefficient, (n choose i), given inputs n and i such that n > 0 and 0 ≤ i ≤ n, using only the recursive definition

   (n choose i) =
       1                                            if i = 0 or i = n,
       (n−1 choose i−1) + (n−1 choose i)            if 1 ≤ i ≤ n − 1,

   — that is, without using the fact that (n choose i) = n!/(i!(n − i)!). Your algorithm should use a number of integer operations that is polynomial in n, in the worst case.

2. Consider the “optimal fee” problem discussed in Section 2.4. Suppose the problem is changed, so that the original constraint,

“if i ∈ S then i + 1 ∉ S, for 1 ≤ i ≤ n − 1”

is replaced with a weaker constraint,


“if i ∈ S and i + 1 ∈ S then i + 2 ∉ S, for 1 ≤ i ≤ n − 2.”

That is, suppose the requirements are changed so that it is possible to choose two jobs in a row, but not three.

Produce an algorithm that solves this version of the problem (that is, that finds a best set S ⊆ {1, 2, . . . , n} and the corresponding best fee), using a number of operations that is polynomial in n. Briefly say why your algorithm is correct, and estimate its worst case complexity as a function of n as precisely as you can.

3. Find an algorithm that takes a positive integer n as input and returns the number of binary search trees storing the values 1, 2, . . . , n as output. For example, if n = 1 then your algorithm should return 1; if n = 2 then your algorithm should return 2; and if n = 3 then your algorithm should return 5 (why?). Your algorithm should use a number of integer operations that is polynomial in n.

4. A (binary) heap is a kind of binary tree that is described in many algorithms and data structures texts, including Section 7.1 of Cormen, Leiserson and Rivest [5]. As described there, every heap of size n has the same shape, and values are stored in the heap in “heap order.” Among other things, this implies that the largest value in the heap is always stored at the root.

Consider the number of binary heaps storing the distinct values 1, 2, . . . , n, for a positive integer n. There is only one such heap if n = 1, or if n = 2, while there are two such heaps if n = 3, three heaps if n = 4, eight heaps if n = 5, twenty heaps if n = 6, and eighty such heaps if n = 7 (and if I haven’t made any errors in arithmetic).

Design a Dynamic Programming algorithm that takes a positive integer n as input and that computes the number of different heaps storing the distinct values 1, 2, . . . , n as output. Once again, your algorithm should only use a number of integer operations that is polynomial in n.

5. Design a Dynamic Programming algorithm that takes a positive integer n as input and computes the average depth of all the binary search trees storing 1, 2, . . . , n as output, under the assumption that each of these trees is equally likely to be used.

Your algorithm should return 0 if n = 1, since every binary search tree of size one has depth zero. Similarly, it should return 1 if n = 2 because every binary search tree of size two has depth one. It should return 9/5 if n = 3, because there are exactly five different binary search trees storing the values 1, 2, and 3, four of these trees have depth 2, the remaining tree has depth 1, and (1/5)(4 · 2 + 1 · 1) = 9/5.

6. Each of the above problems can be solved using Dynamic Programming with a number of operations that is polynomial in n in the worst case. Can any of them be solved this efficiently using Divide and Conquer instead? Which ones, and why?

7. Recall that the Fibonacci numbers satisfy the following property (as proved in an exercise in Chapter 1): If n ≥ 2 and l = ⌊n/2⌋, then

Fn =
    Fl (Fl + 2Fl−1)                      if n is even, so that n = 2l,
    2Fl^2 + 2Fl Fl−1 + Fl−1^2            if n is odd, so that n = 2l + 1.


Now consider a problem with inputs n ≥ 0 and m ≥ 2, and with output Fn mod m (which is an integer between 0 and m − 1). Note that the above formula can be used to express Fn mod m in terms of Fl mod m and Fl−1 mod m, if n ≥ 2 and l = ⌊n/2⌋.

(a) Design an algorithm that uses “Memoization” to compute Fn mod m on inputs n and m, and which uses a binary search tree (with ordered pairs (i, Fi mod m) stored at nodes) to store solutions for subproblems.

(b) While it might not be obvious, it will be the case that there are only O(log2 n) positive integers i such that Fi mod m is computed by such an algorithm on inputs n and m. Using this fact without proof, argue that your algorithm uses a number of operations on digits that is polynomial in the input size, log2 n + log2 m (when a more realistic notion of “input size” is used).

(c) Now prove that if n ≥ 3 then the values Fn−2, Fn−1 and Fn can be computed (mod m) from the values Fl−2, Fl−1, and Fl (mod m), for l = ⌊n/2⌋, using only a constant number of integer operations modulo m.

To do this, you should consider the cases “n is even” and “n is odd” separately, and you should give formulas for Fn−2, Fn−1, and Fn in terms of Fl−2, Fl−1, and Fl in each case.

(d) Use the above to design a Dynamic Programming algorithm (instead of a “Memoization” algorithm) to compute Fn mod m. This algorithm should also use a number of operations on digits that’s polynomial in log2 n + log2 m.

(e) Finally, design a Divide and Conquer algorithm that takes n and m as input and that returns an ordered triple

(Fn−2 mod m,Fn−1 mod m,Fn mod m)

as output. Show that this algorithm uses a number of operations on digits that’s polynomial in log2 n + log2 m in the worst case, as well.

2.8 Hints for Selected Exercises

Exercise #2: Try to modify the analysis that was used for the first version of the problem, so that it applies to this second version instead.

Exercise #3: The hardest part of this problem might be to figure out how to solve it using “Divide and Conquer” — that is, to establish the “Optimal Substructure” property.

Consider the value stored at the root of any binary search tree storing 1, 2, . . . , n — it must be one of these values. Suppose it’s k; then the left subtree will be one of the binary search trees storing 1, 2, . . . , k − 1, and the right subtree will be one of the binary search trees storing k + 1, k + 2, . . . , n. There will be as many possible right subtrees, in this case, as there are binary search trees storing the values 1, 2, . . . , n − k (why?).

Exercise #4: You should start by reading the material about binary heaps mentioned in the question.

Once again, the hardest part will probably be to show that the problem could be solved using Divide and Conquer. In this case you know the value stored at the root — it’s n — and you also know the shapes of the left and right sub-heaps. You don’t know precisely which of the remaining values, 1, 2, . . . , n − 1, are stored in the left sub-heap — but you should discover that, once you’ve selected these, counting the number of ways to “complete” the heap is reasonably straightforward.


Exercise #5: Consider designing an algorithm that computes a different set of values — namely, the number of trees storing 1, 2, . . . , n with depth d, for all possible choices of d. You should find that it’s not too difficult to compute this kind of value using Dynamic Programming — and that the “average depth” you want is easily computed after that.

2.9 Sample Tests

The following tests were used in fall 1996 and 1997, respectively. Solutions for these tests can be found in Section 4.2.2.

2.9.1 Class Test for 1996

Instructions:

Attempt all questions. Write answers on the question sheets.

No aids allowed.

Duration: 50 minutes.

Suppose r and u are positive integers, and that you are trying to move from one point to another inside a 2-dimensional grid, where the points on the grid have labels (i, j), for integers i and j such that 0 ≤ i ≤ r and 0 ≤ j ≤ u.

Now, suppose that 0 ≤ i ≤ r and 0 ≤ j ≤ u, and that you are currently at node (i, j). You are allowed to move in at most two directions — “up” or “to the right,” and you can never take a move that takes you off the grid:

• If you move “to the right” from node (i, j), then you end up at node (i + 1, j) — unless i = r, in which case you are not allowed to move “to the right” from node (i, j) at all.

• If you move “up” from node (i, j), then you end up at node (i, j + 1) — unless j = u, in which case you are not allowed to move “up” from node (i, j) at all.

For r ≥ 0 and u ≥ 0, let P(r, u) be the number of paths you can follow in order to move from node (0, 0) to node (r, u).

Here are some useful “examples” of the values of P (r, u) to consider:

• P(0, u) = 1 for every integer u ≥ 0, because you can only get from node (0, 0) to node (0, u) by taking u steps up.

• P(r, 0) = 1 for every integer r ≥ 0, because you can only get from node (0, 0) to node (r, 0) by taking r steps to the right.

• “In particular,” P(0, 0) = 1.

• P(1, 1) = 2, because you can move from (0, 0) to (1, 1), either by first moving right (to (1, 0)) and then moving up, or by first moving up (to (0, 1)) then moving to the right.


1. (1 mark) Suppose r ≥ 0 and u > 0. How many paths can you follow from node (0, 0) to node (r, u) such that the last move is up? Your answer should be of the form “P(a, b),” where a and b are integers that might depend on r and u.

2. (1 mark) Suppose r > 0 and u ≥ 0. How many paths can you follow from node (0, 0) to node (r, u) such that your last move is to the right? Again, your answer should be of the form “P(a, b),” where a and b are integers that might depend on r and u.

3. (3 marks) Give a recursive definition of P(r, u) for positive r and u (as a function of some values “P(a, b)” for a ≤ r, b ≤ u, and such that a ≠ r or b ≠ u or both), and explain briefly why your answer is correct.

It might help to assume that P(r, u) = 0 whenever r < 0 or u < 0 (or both), so you may assume this if you wish to.

4. (15 marks) Write an algorithm that takes two nonnegative integers r and u as input and that returns P(r, u) as output.

For full marks, your algorithm should use a number of steps that is (at most) polynomial in max(r + 1, u + 1) in all cases. A small number of part marks will be given, if your algorithm is slower than that.

Note: This quiz is supposed to test your knowledge of “dynamic programming.” Therefore, please don’t use a “closed form” for the function P(r, u) in your algorithm, even if you happen to know one! You will not receive full marks (or even very high part marks) if you ignore this instruction!

Another Note: If you couldn’t answer Question 3, then you may use the following completely incorrect recurrence for P(r, u) in order to answer Question 4, for almost full marks: For r, u ≥ 0, you may pretend that

P(r, u) =
    1                                          if r = 0 or u = 0 (or both),
    2P(r − 1, u) + 3P(r, u − 1) + r ∗ u        otherwise.

Yet Another Note: If you can’t write down the algorithm immediately then try to follow the steps given in class to design your algorithm, and try to document what you are doing well enough so that part marks can be awarded!

Finally: If your algorithm is readable and correct (whether you used the above “bogus” recurrence, or not) then a small number of bonus marks will be awarded if you give a correct closed form for the running time of your algorithm, as a function of r and u.


2.9.2 Class Test for 1997

Instructions:

Attempt all questions. Write answers on the question sheets.

No aids allowed.

Duration: 50 minutes.

Suppose someone is trying to walk along a straight line, perhaps after attending a social gathering. At the beginning (that is, at time 0) this person is standing at the origin (that is, at position 0).

This person is able either to move forward one step, from some position i to position i + 1, or to move backward one step, from some position i to position i − 1 — but not both at once — during each time step.

For any nonnegative integer t ≥ 0 and for any integer p, let P(p, t) be the number of ways that this person could reach position p after exactly t time steps (that is, at time t).

Since the person starts at the origin,

P(0, 0) = 1 and P(p, 0) = 0 if p ≠ 0.

This implies that

P (1, 1) = P (−1, 1) = 1

and that

P(p, 1) = 0 if p ∉ {−1, 1},

since this person started at the origin and took exactly one step, either forward or backward, after that (until time 1 was reached).

Finally, it should be noted that

P (p, t) = 0 if p > t or if p < −t

because it is impossible for this person to travel a distance of more than t away from the origin, after having taken only t steps.

1. (4 marks) Give a recursive definition (a “recurrence”) for the function P(p, t) that is described above. Your answer should define P(p, t) in terms of values P(p′, t′) where t′ < t. Note, though, that it is possible that p′ < p, p′ = p, or even that p′ > p.

If you don’t want to read all that, then you “sacrifice” at least one mark by skipping straight to Question #2 now, and then attempt Question #1 later (if you have time for it).

2. (12 marks) Design a “Dynamic Programming” algorithm that can be used to compute P(p, t) given any integer p and any nonnegative integer t as input.

For full marks, and as an aid for marking, you should follow the design steps that were introduced and applied to examples during lectures (including observing or proving that the


problem to be solved satisfies the two properties that were necessary). BRIEFLY name or describe each step — using at most two lines, and ideally only one — before you carry it out.

Don’t give the code or consider optimizations to reduce storage space, yet! You’ll be asked to do this in later questions.

If you haven’t answered Question #1 (or you aren’t very sure about your answer, and think you might have trouble using it here) then you may use the following incorrect recursive definition of P(p, t) in order to answer Question #2. One mark will be deducted if you use this recurrence (or any other incorrect one) instead of a correct recurrence for this function — and the recurrence that you use must be either the one given here, or your answer for Question #1.

P(p, t) =
    1                                                        if t = 0 and p = 0,
    0                                                        if t ≥ 0 and (p < −t or p > t),
    P(p − 1, t − 1) + P(p, t − 1) − P(p + 1, t − 1)          if t > 0 and −t ≤ p ≤ t.

3. (6 marks) Give pseudocode (at approximately the same level of detail as the pseudocode in the online lecture notes) for the algorithm you designed in your answer for Question #2.

4. (3 marks) BRIEFLY describe any changes you could make in order to reduce storage space and briefly explain why these changes can be made (while keeping the algorithm correct).


Chapter 3

Greedy Methods

3.1 Overview

This chapter presents another algorithm design method that can sometimes be used to solve problems using polynomial time in the worst case — in particular, it introduces greedy methods for solving problems.

Unfortunately, “greedy methods” aren’t always correct — and proving correctness of a greedy method is frequently harder than designing or analyzing the complexity of an algorithm that uses this approach.

The chapter will begin with a modified version of the “Optimal Fee Problem” that was used as an example in Chapter 2. We’ll see that a “greedy method” can be used to solve the modified version of the problem, but that a plausible greedy method for the original problem would be incorrect.

After that, “optimization problems” will be defined; these are the kinds of problems on which greedy methods can sometimes be used. A general form will be given for a “greedy method” for an optimization problem. A set of steps will be described for showing that a greedy method is incorrect, and then a (somewhat more complicated) set of steps will be given for showing that a greedy method is correct.

The latter steps will be used to prove that the greedy method for the modified version of the optimal fee problem is correct.

A second long example — an “activity selection problem” — will be considered after that. Two greedy methods will be considered; the first will be shown to be incorrect, and the second will be proved to be correct.

As usual, it isn’t the algorithms themselves that will be most important — they’ll merely serve as examples. It’s more important that CPSC 413 students know how to write programs that use greedy methods and that they know both how to prove that greedy methods are incorrect and that (other) greedy methods are correct.

Proving both “incorrectness” and “correctness” can be challenging; a set of exercises and sample tests that can be used to assess these skills can be found at the end of this chapter.


3.2 A Modified Optimal Fee Problem and a Generalization

3.2.1 Definition of the Modified Problem

Consider a modified version of the “Optimal Fee Problem” that had been used as an example when Dynamic Programming was discussed.

As before, the input for the problem will include a sequence of n “fees” for jobs, f1, f2, . . . , fn. As before, these will be required to be nonnegative integers.

We’ll define the “size” of this instance to be the number of fees, n, that it includes.

As before, the output will be a set S ⊆ {1, 2, . . . , n} that satisfies a constraint and that

maximizes the function

fS = ∑_{i∈S} fi.

The most important difference between this modified problem and the original one is the constraint: Instead of insisting that two or more consecutive integers could not be included, we will now consider the new constraint that

|S| ≤ ⌈n/2⌉.

3.2.2 A Solution for the Modified Problem

Since we’re now free to choose any subset of size at most ⌈n/2⌉, we can choose the set S of jobs of this size whose fees are largest. One way to do this is to sort the input fees into nonincreasing order, keeping track of their indices, and then set S to be the indices of the first ⌈n/2⌉ fees in the list.

It may seem “obvious” that this algorithm is correct. We’ll see a way to prove this shortly.

3.2.3 A Generalization

Now consider another variation of this problem in which another input, an integer k, is added, and such that the “constraint” that must be satisfied by the output set is that

|S| ≤ k.

We’ll define the “size” of an instance to be n + 1, since it now includes one more integer input.

The previous problem is “reducible” to this one, in the sense that it can be solved efficiently

if a solution for this new version of the problem can be used as a subroutine: Given a sequence f1, f2, . . . , fn of fees (an instance of the previous problem), one can solve the previous problem as follows:

1. Count the input fees, in order to compute the integer n.

2. Set k = ⌈n/2⌉.

3. Append this integer k to the input sequence of fees, to produce an instance of the “generalized” problem.

4. Solve this instance to obtain a set S.

5. Return the set S as the solution of the instance for the original problem.

Therefore, it will be sufficient to find a (provably correct) algorithm that solves the “generalized” problem, in order to obtain a (provably correct) algorithm that solves the previous one.
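A sketch of this reduction in Python (assuming some solver solve_generalized for the generalized problem; both function names are ours):

    import math

    def solve_modified(fees):
        n = len(fees)                       # step 1: count the input fees
        k = math.ceil(n / 2)                # step 2: set k = ceiling(n/2)
        return solve_generalized(fees, k)   # steps 3-5: solve the generalized
                                            # instance and return its set S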


1.  if (k = 0 or n = 0) then
2.      return ∅
3.  else
4.      Set i to be any integer such that 1 ≤ i ≤ n and fi ≥ fj for all j such that 1 ≤ j ≤ n.
        {Form a new, smaller instance of the problem}
5.      for j := 1 . . . n − 1 do
6.          if j = i then  {Rename the nth job}
7.              fj := fn
8.          else  {Leave the other names unchanged}
9.              fj := fj
10.         end if
11.     end for
12.     k := k − 1
13.     Recursively solve the instance of the problem including fees f1, f2, . . . , fn−1 and the
        bound k, to obtain a solution S ⊆ {1, 2, . . . , n − 1}
        {Note that if i ∈ S then this refers to the old nth job}
14.     if i ∈ S then
15.         S := S ∪ {n}
16.     else
17.         S := S ∪ {i}
18.     end if
19.     return S
20. end if

Figure 3.1: A Recursive Solution for the Generalized Fee Selection Problem

3.2.4 A Recursive Solution for the Generalization

If either k = 0 or n = 0 (or both) then the empty set is the only set that can be returned. Otherwise, an integer i such that fi is as large as possible should be included, and the remaining k − 1 elements should be chosen from the rest.

A recursive algorithm for the generalized problem, that implements this sketch, is shown in Figure 3.1. It is complicated by the process of forming a smaller instance that can be solved recursively (and recovering a solution for the original instance from the solution of the smaller one).
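A Python sketch of this recursive algorithm (ours, for illustration; jobs are stored in a dictionary indexed from 1, so that the “renaming” on lines 5–12 becomes a single assignment):

    def greedy_recursive(fees, k):
        # fees maps job indices 1..n to fees; returns a set S with |S| <= k
        n = len(fees)
        if k == 0 or n == 0:
            return set()
        i = max(fees, key=lambda j: fees[j])          # a greedy choice (line 4)
        smaller = {j: fees[j] for j in range(1, n)}   # jobs 1..n-1
        if i != n:
            smaller[i] = fees[n]                      # rename the nth job as job i
        S = greedy_recursive(smaller, k - 1)          # line 13
        if i in S:
            return S | {n}    # job i in S stood for the old nth job (lines 14-15)
        return S | {i}        # otherwise just add the greedy choice (line 17)

    print(greedy_recursive({1: 9, 2: 10, 3: 8, 4: 1}, 2))   # {1, 2}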

3.2.5 A More Efficient Solution for the Generalization

Of course, this is not the algorithm you should implement in order to solve this problem: It will be simpler, and faster, when k > 0 and n > 0, to sort the jobs in order of nonincreasing fee and then choose the (indices of the) first k of the jobs in the list to be included in S.

However, the first algorithm is correct if and only if the second is, because they can be shown to return the same outputs in all cases — more precisely, every output that the second algorithm can generate on a given instance I can be generated by the first algorithm on instance I as well, and vice-versa. We’ll therefore use the first algorithm in order to prove correctness of the second.
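The second, sort-based algorithm is short enough to state directly; a sketch (again ours):

    def greedy_sorted(fees, k):
        # fees maps job indices 1..n to fees; returns the indices of the
        # (at most) k largest fees
        by_fee = sorted(fees, key=lambda j: fees[j], reverse=True)
        return set(by_fee[:k])

    print(greedy_sorted({1: 9, 2: 10, 3: 8, 4: 1}, 2))   # {1, 2}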


Sort the input jobs, to obtain a sequence i1, i2, . . . , in such that {i1, i2, . . . , in} = {1, 2, . . . , n} (that is, the two sets are the same) and fi1 ≥ fi2 ≥ · · · ≥ fin.
S := ∅
j := 1
while (j ≤ n) do
    if (ij − 1 ∉ S and ij + 1 ∉ S) then
        S := S ∪ {ij}
    end if
    j := j + 1
end while
return S

Figure 3.2: A Greedy Heuristic for the Original Optimal Fee Problem

3.2.6 Back to the Original Problem

It might — or might not — seem plausible that we could use a similar “greedy” approach to solve the version of the problem that we studied as a “Dynamic Programming” example. Thus, we might want to consider the heuristic for this version of the problem shown in Figure 3.2.

However, consider an input sequence 9, 10, 8, 1 (so that f1 = 9, f2 = 10, f3 = 8 and f4 = 1).

into the order 2, 1, 3, 4, because the four fees included in the input are distinct, and f2 > f1 > f3 >f4.

After this, integers 2, 1, 3 and 4 will be considered as candidates for inclusion in S. The integer 2 (representing the second job, with fee 10) will be the first thing that is considered, and it will be included in S. The integers 1 and 3 will both be considered after that, because the first and third jobs have the next highest fees (9 and 8 respectively); neither will be included because the integer 2 has been included already, and two consecutive jobs can’t belong to S. After that, the fourth integer (corresponding to the fourth job, with fee 1) will be included, so that on termination of the heuristic

S = {2, 4}

with total fee

fS = f2 + f4 = 11,

while a better choice (indeed, the only correct solution for this instance of the problem) is the set

T = {1, 3}

with total fee

fT = f1 + f3 = 17.

Thus the above heuristic is incorrect because it does not always return a correct solution for a well-formed instance of the given problem. In particular, it does not return a correct solution for the instance that is given above.
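The trace above can be checked mechanically; here is a Python sketch of the heuristic in Figure 3.2 (ours), run on the counterexample:

    def greedy_heuristic(fees):
        # fees maps job indices 1..n to fees
        order = sorted(fees, key=lambda j: fees[j], reverse=True)  # nonincreasing fee
        S = set()
        for i in order:
            if (i - 1) not in S and (i + 1) not in S:   # keep S free of consecutive jobs
                S.add(i)
        return S

    print(greedy_heuristic({1: 9, 2: 10, 3: 8, 4: 1}))   # {2, 4}: fee 11, not 17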


3.3 Optimization Problems

3.3.1 Definition

All of the versions of the “Optimal Fee” problem that have been discussed are examples of optimization problems.

Definition 3.1. An optimization problem is a computational problem whose instances (or inputs) include (or, somehow “represent”)

• a finite set Q of elements drawn from some “universe,”

such that a correct solution for any instance is a subset S of the input set Q. Either the definition of the problem itself, or additional inputs (or both) also specify

• a property or constraint that the output set must satisfy. Any subset of the input set satisfying this constraint is said to be valid.

• an optimization function Opt, from subsets of Q to the nonnegative real numbers.

A correct output is a valid subset S of Q such that Opt(S) is “optimal,” as described in more detail below.

Every optimization problem is either a maximization problem or a minimization problem (but not both at once).

If a problem is a maximization problem, then the optimization function should be thought of as a “benefit function,” whose value we want to be as large as possible. In this case, a valid subset S of the input set Q is said to be optimal if and only if Opt(S) ≥ Opt(R), for every valid subset R of Q (and, a subset of the input set can only be said to be optimal if it is valid).

If the problem is a minimization problem, then the optimization function should be thought of as a “cost function,” whose value we want to be as small as possible. In this case, a valid subset S of the input set Q is said to be optimal if and only if Opt(S) ≤ Opt(R), for every valid subset R of Q (and, a subset of the input set can only be said to be optimal if it is valid).

In either case a solution (or, “correct output”) for an instance of an optimization problem is an optimal subset of the input set (as mentioned above).

3.3.2 Example

Consider, for example, the first version of the “Optimal Fee” problem, which was used as a “Dynamic Programming” example.

In this case the “input set” Q is implicitly given — it’s the set {1, 2, . . . , n}, where n is the length of the sequence of “fees” (f1, f2, . . . , fn) that were “explicitly” given as inputs.

The “optimization function” is a function from subsets of {1, 2, . . . , n} to the nonnegative real numbers: For R ⊆ {1, 2, . . . , n},

Opt(R) = fR = ∑_{i∈R} fi.

This is specified in the input by the fees f1, f2, . . . , fn that were explicitly given.

Finally, the “constraint” is part of the definition of the problem, and no additional inputs are

needed for it: In this version of the problem, a subset R of {1, 2, . . . , n} is only valid if, for 1 ≤ i ≤ n − 1, if i ∈ R then i + 1 ∉ R; that is, R is not valid if R contains two or more consecutive integers.
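Expressed in code (a sketch of ours, with illustrative names), the three ingredients of this example are:

    # the input set Q is implicit: Q = {1, 2, ..., n}, where n = len(fees)

    def opt(fees, R):
        # the optimization function: Opt(R) = the total fee of the jobs in R
        return sum(fees[i] for i in R)

    def is_valid(n, R):
        # the constraint: R may not contain two consecutive integers
        return all(not (i in R and i + 1 in R) for i in range(1, n))

    fees = {1: 9, 2: 10, 3: 8, 4: 1}
    print(is_valid(4, {1, 3}), opt(fees, {1, 3}))   # True 17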


3.4 Greedy Methods

Greedy methods are heuristics (or, sometimes, algorithms) that can be applied to optimization problems.

3.4.1 Definition

A “greedy method” for an optimization problem is defined by

• a set of “base instances,” including all instances of the optimization problem such that the “input set” is the empty set and, frequently, all instances in which the “input set” has size 1 as well (but, sometimes, including other instances, too). In order to develop a greedy method, it is necessary to be able to “recognize” base instances (that is, distinguish them from other problem instances) as well.

• a procedure that can be used to solve the problem whenever a base instance is given as input.

• a “local optimization function,” loc, which is a function from elements of the input set Q to the nonnegative real numbers. (Recall that, in contrast, the optimization function Opt that was part of a problem instance was a function from subsets of the input set to the nonnegative real numbers.)

Local optimization functions can be used in two ways (depending on the greedy method that is being developed): Depending on the definition of the problem and the greedy method — but not depending on the instance of the problem being solved — we either say that

• an element x of the input set Q is a greedy choice if loc(x) ≥ loc(y) for every element y of the input set Q, or

• an element x of the input set Q is a greedy choice if loc(x) ≤ loc(y) for every element y of the input set Q.

In order to complete a definition (or development) of a “greedy method” it is also necessary to have

• a construction that can be used to take an instance I of the optimization problem, and a greedy choice x (belonging to the “input set” Q defined by I), and produce a smaller instance Î of the same problem;

• a construction that can be used to take the above instance I, greedy choice x, derived instance Î and any correct solution Ŝ for the derived instance Î, and to produce a correct solution S for the original instance I.

Given all this, the corresponding “greedy method” has the form shown in Figure 3.3, before any “optimizations” are made. It is assumed here that “I” is the instance of the problem that is given as input.

It’s often the case that an algorithm based on this can be made more efficient by the use of a preprocessing step that sets up data structures, sorts values, or performs other computations in order to simplify the process of making “greedy choices,” deriving smaller problem instances, and recovering solutions later on. The resulting procedure is frequently “iterative” rather than “recursive” after being optimized, as well.


if (I is a base instance) then

    Apply the method for “solving base instances” to compute a solution for
    the instance I, and return this solution as output.

else

    Find a “greedy choice” x for this instance.

    Use I and x to produce a smaller instance Î of the same problem.

    Apply the greedy method recursively to try to find a solution Ŝ for the
    derived instance Î.

    Use I, x, Î, and Ŝ to generate a solution S for the original instance I.
    Return S as output.

end if

Figure 3.3: A Generic Greedy Method, Before Optimizations

However, the “unoptimized” version, that has a structure like the above, is worthwhile, because it’s generally (reasonably) straightforward to prove that the “optimized” version is correct if the original version is — and (as we’ll see) there is a well defined process you can follow in order to prove that the “unoptimized” version is correct.

3.4.2 Example

The recursive algorithm shown in Figure 3.1 on page 61 is a greedy method with the form described above. In this case, each of the following is true.

• The “base instances” are all those instances such that k = 0 or n = 0, so that the only correct output set is the empty set. Note that these include all the instances such that the input set has size zero, so that this is consistent with the description of “base instances” given above. These instances are also easy to recognize, since k is given as one of the inputs and n is also easy to compute.

• The “procedure to solve base instances” is trivial — one just returns the empty set as the output (as in line 2 of the algorithm in Figure 3.1).

• The “local optimization function” is the function loc such that loc(i) = fi for 1 ≤ i ≤ n, and a “greedy choice” is an element i of the “input set” {1, 2, . . . , n} such that loc(i) is as large as possible — note line 4 of the algorithm.

• The construction that can be used to form a smaller “derived” instance of the problem, from a given instance and greedy choice, is shown on lines 5–12 of the algorithm given in Figure 3.1.

• The construction that can be used to recover a solution for the original instance from the original instance, greedy choice, derived instance, and a solution for the derived instance, is shown on lines 14–18 of the algorithm given in Figure 3.1.


3.5 Correctness: Proving Incorrectness of a Greedy Heuristic

Proving that a greedy method (or “greedy heuristic,” which is what I’ll call these if I’m not sure that they’re correct, or know that they’re incorrect) is incorrect is “relatively” easy: All you need to do is provide at least one well-formed instance of the problem such that the greedy heuristic gives an incorrect solution for it (and show that this is what happens).

We’ll consider two cases — namely, that the output is unique, and that multiple outputs are possible.

3.5.1 Proving Incorrectness When Outputs are Guaranteed to be Unique

Suppose first that the heuristic is “completely specified,” or “deterministic,” so that there is only one output set that it can return as output, when given some instance I as input.

In order to prove that the heuristic is incorrect, you must do the following.

1. Describe (at least) one well-formed instance I for the problem.

2. Describe the output that the greedy heuristic would return when it is given I as input; you can generally do this by tracing the heuristic’s behaviour on input I and keeping track of the output it generates.

3. Prove that the output that the heuristic generated is not optimal.

In order to perform this last step, you must either

(a) demonstrate that the heuristic’s output is not well formed at all — that is, it’s not syntactically correct, so that it doesn’t even represent a subset of the input set, or

(b) demonstrate that the heuristic’s output is syntactically correct but not valid, by showing that it does not satisfy the constraint that is defined as part of the optimization problem (possibly, together with part of the input), or

(c) demonstrate that the heuristic’s output is (possibly) valid, but definitely not optimal. Suppose the output is a subset S of the input; then if the problem is a minimization problem, you must find (and report) a valid subset T of the input set such that Opt(T) < Opt(S), and if the problem is a maximization problem then you must find (and report) a valid subset T of the input set such that Opt(T) > Opt(S).

Note that it isn’t necessary to prove that T is optimal, itself — there might exist another valid subset of the input that is even “better” than T! However, if you can prove the above then you’ll have demonstrated that S is not optimal, which is all you need to do.

It will usually be clear which of the above three cases you’re in (and, much of the time, it’ll be the last case). It’s possible (but quite unlikely during CPSC 413) that you could end up proving that one of the above three cases arises without actually knowing which one it is.

3.5.2 Proving Incorrectness When Outputs are Not (Necessarily) Unique

Suppose, now, that the heuristic is not completely specified, so that it’s at least possible that several different output sets might be returned by an implementation of the heuristic, on any given instance, depending on how the heuristic is implemented.

The heuristic that is shown in Figure 3.2 on page 62 has this property, because it doesn’t completely specify the order in which jobs are listed when more than one of them has the same fee.


For example, on an instance I such that n = 4 and f1 = f3 = 9, f2 = 10, and f4 = 1, the heuristic might start either by sorting the jobs into the order 2, 1, 3, 4 or 2, 3, 1, 4. It will turn out that the same output set S = {2, 4} is returned in either case. In contrast, if another instance contained only two jobs, such that f1 = f2 = 1, then either of the output sets {1} or {2} could be computed by the heuristic, depending on whether the inputs were “sorted” in order to obtain the sequence 1, 2, or the sequence 2, 1.

In a case like this, things will be simplest if you can discover an instance I such that the behaviour of the heuristic is completely specified (and, not implementation-dependent) when it is executed on instance I. Then, it will be clear that there is only one output set S that can be returned by the heuristic on input I — and you can proceed (after observing that this is the case) as described above for the case that “outputs are guaranteed to be unique.”

This is the process that was followed in Subsection 3.2.6 in order to prove that the heuristic described there was incorrect, using an instance I in which n = 4 and f1 = 9, f2 = 10, f3 = 8, and f4 = 1.

If you can’t find an instance I such that the heuristic works in an “implementation-independent” way on input I and I can serve as a counterexample, then you may still be able to find an instance I such that there is only one output set that the heuristic could return when given I as input.

Consider, for example, the instance I of the original version of the Optimal Fee problem such that n = 4, f2 = 10, f1 = f3 = 9, and f4 = 1 that is mentioned above. In this case, the only output set that the heuristic can return on input I is the set {2, 4}, even though there are two ways in which this could be computed (depending on how the sorting required at the beginning of the heuristic is implemented). Once again, the set T = {1, 3} is a valid output set with a larger total fee, so that S is not optimal.

Finding an example with this property (and that can serve as a counterexample) is slightly less desirable because you may need to do a bit more work than you need to in the first case, in order to prove that the heuristic’s output is unique. Once you manage to do this, you can proceed as above, for the case that outputs are “guaranteed” to be unique.

Finally, suppose that you can’t find an instance I that seems to be a useful counterexample and such that there is only one output set S that the heuristic could possibly return on input I. In this case you’ll be forced to consider an instance I such that several different output sets could be returned by the heuristic on input I.

A “Weak” Proof of Incorrectness

One way to proceed, in this case, is to pick any one output set S that the heuristic could return on input I and then prove that this is not a correct output for the instance — either because it’s syntactically incorrect, not valid, or valid but not optimal; see the discussion of the case that outputs are guaranteed to be unique, for a few more details.

This proves incorrectness, but only in a very “weak” way: It establishes that there are at least some ways to implement the heuristic that would produce “algorithms” that are actually incorrect. Unfortunately, it fails to eliminate the possibility that other implementations might actually be correct.

A “Strong” Proof of Incorrectness

A more time-consuming alternative is to somehow consider every output set S that the heuristic could return on input I and then prove that every one of these is incorrect.


This proves incorrectness in a stronger way, because it establishes that every implementation of the heuristic must be incorrect.

Unless there is an explicit statement to the contrary, you should try to prove the “strong” result when you’re asked to establish that a greedy heuristic is incorrect, in CPSC 413. As suggested above, this will probably be simplest if you can find a “counterexample” instance I such that the heuristic’s behaviour on input I is well defined (implementation-independent), so that the heuristic’s output on input I is clearly unique.

3.5.3 Why Bother?

You might be wondering why it’s important to prove that greedy heuristics are incorrect (after all, we’re generally interested in designing algorithms and proving that they’re correct).

One reason is that there are often “natural” or “obvious” greedy methods for problems, and it might be tempting to assume that the heuristics are correct and then rush ahead and implement them. Frequently, these heuristics really are incorrect, but you may discover that it’s necessary to provide “evidence” that this is the case, in order to convince other developers that they haven’t found an easy solution for the problem they’re trying to solve.

An even worse — but plausible — situation would be one in which an incorrect heuristic is already implemented and in use. Then (depending on the importance of the problem it solves, or the difficulty of making a change) you might be in the position of having to demonstrate that a working system should be replaced. To make things worse, it might be the case that the implemented system is incorrect but fast, while correct solutions are slow, so that any replacement for the working system would initially seem to be “inferior.”

Of course, in an ideal world, the incorrect method would not have been implemented and released at all (at least, unless it was understood that its output might not be optimal and that valid but sub-optimal output sets would be acceptable).

3.5.4 An Even Stronger Proof of “Incorrectness”

Up until this point, we’ve considered proofs in which you demonstrate that there is at least one instance on which a heuristic returns an incorrect output.

Since it is sometimes tempting to “patch” an incorrect algorithm — sometimes by handling a small collection of instances as special cases — it may be “helpful” if you can provide an even stronger proof of incorrectness. One way to do so would be to prove, somehow, that there are infinitely many — ideally, arbitrarily large — instances on which the heuristic would give an incorrect answer.

For example, consider again the heuristic shown in Figure 3.2. Let n = 2h be an even number that is greater than or equal to four, and consider an instance

9, 10, 8, 1, 1, 1, . . . , 1

so that f1 = 9, f2 = 10, f3 = 8, and fi = 1 for 4 ≤ i ≤ n. As before, the heuristic would sort the jobs. It would be forced to continue after that by considering jobs 2, 1, and 3, and then all the rest of the jobs in some order. At best, the remaining jobs could be considered in some order so that h jobs are eventually included, and so that the output set that is eventually obtained includes the jobs 2, 4, 6, 8, . . . , n, with total fee h + 9. (At worst the remaining jobs could be considered in such a way that only every third job got included, with a resulting total fee that’s lower than this.)

On the other hand, the set T = {1, 3, 5, . . . , n − 1} is a valid set (for this instance) and it has total fee h + 15 > h + 9. Since it is not possible to include job 2 in any set whose total fee is as large


as this, the heuristic is guaranteed to produce incorrect outputs on any one of an infinite family of arbitrarily large inputs — eliminating (one would expect) the possibility that a “simple patch” would be sufficient to correct the heuristic.
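Under the same assumptions as the earlier sketch (and reusing its greedy_heuristic and total_fee functions), the whole family of counterexamples can be checked mechanically:

    # For every even n = 2h >= 4, the heuristic is beaten by the valid set
    # {1, 3, 5, ..., n - 1}, whose total fee is h + 15.
    for h in range(2, 10):
        n = 2 * h
        fees = [9, 10, 8] + [1] * (n - 3)
        S = greedy_heuristic(fees)
        T = set(range(1, n, 2))
        assert total_fee(S, fees) <= h + 9    # at best h + 9
        assert total_fee(T, fees) == h + 15   # always strictly larger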

You could take this still farther by expanding the infinite family of counterexamples; this won’t be discussed any further here.

By the way, this subsection is “for interest only”: you won’t be asked to find an infinite family of counterexamples, for a greedy heuristic, on a test in CPSC 413.

3.5.5 Expectations for CPSC 413

The first step of the above process — discovering an instance on which a greedy heuristic fails — can be quite challenging! It might involve a careful consideration of both the problem that’s to be solved, and a study of the heuristic as well. Furthermore, counterexamples (instances on which the heuristic will fail) might not be very common, so that a random selection of an instance might not yield a counterexample very often.

Even given the “counterexample,” proving that the heuristic fails might be nontrivial (at least, in the case when the heuristic’s output is valid but not optimal), because this may involve finding another valid subset of the input set that is “better” than the heuristic’s output. This is relatively easy when a correct algorithm for the problem is also available (all you should need to do is run the correct algorithm on the “counterexample,” and compute and compare values of the optimization function Opt), but that isn’t always the case.

In CPSC 413, you will be expected to be able to prove that greedy heuristics are incorrect by following the method that’s been sketched above. Either counterexamples and “better” output sets will be given, or hints will be given to help you find them, when this seems necessary.

3.6 Correctness: Proving Correctness of a Greedy Algorithm

In order to prove that a greedy method for an optimization problem is correct, you must prove that base instances can be correctly recognized and solved. This will usually be pretty easy, and it won’t be described further here.

You must also prove that the rest of the instances are correctly solved, as well. In order to do this, it is necessary and sufficient to prove that the following two properties are satisfied.

1. The Greedy Choice Property: For every instance I of the problem that is not a base instance, at least one greedy choice for this instance exists. Furthermore, for every possible greedy choice x for this instance, there exists an optimal subset S of the input set (that is, a correct solution for this instance) such that x ∈ S.

This property says, essentially, that you’ll always be able to make a greedy choice, and that you can’t make a mistake (by making it impossible to construct a correct output set) if you start by including one.

2. The Optimal Substructure Property (for Greedy Methods): For every instance I of the problem that is not a base instance, and every greedy choice x for I, there exists a smaller instance I' such that a correct solution for the instance I is easy to compute from x and from any correct solution for the instance I'. Furthermore, this “smaller instance” I' of the problem must be easy to generate from I and x.


This property says, essentially, that it will be possible to solve the original instance of the problem in a recursive way, by choosing x, generating a smaller instance of the problem, solving that smaller instance recursively, and then recovering a correct solution for the original instance I from x and from a correct solution of the “derived” instance I'. (A generic sketch of this recursive shape is given after this list.)

Note: This isn’t quite the same as any other property that may have been called an “Optimal Substructure Property” when either Divide and Conquer or Dynamic Programming was discussed. In this chapter (on Greedy Methods) the phrase “the Optimal Substructure Property” will refer to the property that’s just been defined.

This different use of the name “Optimal Substructure Property” is unfortunate, but it also seems to be consistent with other references. For example, you should keep this in mind when reading the discussion of greedy methods given by Cormen, Leiserson and Rivest [5].
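Taken together, the two properties say that the recursive shape of a greedy method can be written down once and for all. Here is a hypothetical Python skeleton of the (unoptimized) method; every parameter is a placeholder that a concrete problem must supply, not a function defined in these notes:

    def greedy_method(instance, is_base, solve_base, greedy_choice,
                      derive_smaller, recover_solution):
        # Base instances are recognized and solved directly.
        if is_base(instance):
            return solve_base(instance)
        # The Greedy Choice Property says some correct solution contains x.
        x = greedy_choice(instance)
        # First construction: generate the smaller instance I'.
        smaller = derive_smaller(instance, x)
        # Solve the smaller instance recursively.
        sub = greedy_method(smaller, is_base, solve_base, greedy_choice,
                            derive_smaller, recover_solution)
        # Second construction: recover a solution for I from x and sub.
        return recover_solution(instance, x, sub)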

Proving these properties is nontrivial. However, there is a set of steps you can frequently follow in order to prove them. These steps are given in the next two subsections.

3.6.1 Proving the Greedy Choice Property

Here is an “outline” of a proof that a greedy method for an optimization problem has the greedy choice property. You should make sure that you understand this outline and that you can “fill it in” in order to obtain a proof of the greedy choice property, for a given optimization problem and greedy method, if you’ve been given enough time (and hints, where necessary) to do this.

It’s a good idea to start by stating the greedy choice property using the vocabulary and notation used for the optimization problem that is to be solved, since this will probably give you a better idea of what you need to prove for this particular problem and algorithm.

To begin the proof of the property, suppose that I is an instance of the given problem that is not a base instance. Then, the “input set” Q that is defined by this instance will be nonempty.

Since Q is also finite, there will be at least one “greedy choice” x ∈ Q — after all, since Q is finite there will be an element x ∈ Q that maximizes (or minimizes) the “local optimization function” loc (whichever is necessary).

Now, let x be an arbitrary greedy choice for the instance I. Next, let S ⊆ Q be any correct (valid and optimal) output for this instance. Two cases must now be considered: Either x ∈ S or x ∉ S.

1. Case: x ∈ S. In this case we’re done: All we have to do is set S' to be S. Then S' is a correct solution for this instance of the problem such that x ∈ S', as desired.

2. Case: x ∉ S. This is the meat of the proof. There are (at least) two strategies you can try to follow, in order to handle this case.

(a) Prove that this case can’t ever arise at all — that is, prove that x must belong to S (by somehow establishing a contradiction without assuming anything more).

This strategy may be feasible if there is only one correct solution for any given instance of the problem. If that isn’t the case then it is probably not feasible at all, because there really might be correct output sets that don’t contain x.

Don’t make the mistake of assuming that the output is unique (and using this “fact”) when it isn’t true!


(b) Use S to “construct” a correct output set S' that does contain x.

It will often — but not always — be the case that we can set S' to be T ∪ {x} where T is some cleverly chosen subset of S.

Whether that is true or not, we must do the following after selecting the set S':

i. Confirm that x ∈ S'.

ii. Show that S' is valid.

iii. Show that S' is optimal. Since S is already known to be optimal, it is sufficient to show (somehow) that Opt(S') = Opt(S). And, since S' is already known to be valid, it is really only necessary to prove that Opt(S') ≤ Opt(S) if this is a minimization problem, or to prove that Opt(S') ≥ Opt(S) if this is a maximization problem.

Exercise (to Assess Your Understanding): Convince yourself that this is true, and explain why this is the case.

The details involved will vary from problem to problem, and it might be convenient (in some cases) to combine some of these steps, or to change the order in which they’re performed. You might even find a correct proof of the greedy choice property that doesn’t follow this outline at all! However, this is a good outline to consider first, when you are trying to establish the greedy choice property in order to prove correctness of a greedy algorithm in CPSC 413.

3.6.2 Proving the Optimal Substructure Property

In order to establish that the optimal substructure property holds, you must describe the following two constructions and prove that they are correct. These will be used in the (unoptimized) “greedy method” that is being proved correct — just before, and just after, a recursive call that is used to handle non-base instances. Indeed, they are the last two constructions mentioned in the definition of a “Greedy Method” that was given in Subsection 3.4.1.

1. The first construction uses

• an arbitrary instance I of the problem that is not a base instance (which defines an “input set” Q), and

• a greedy choice x ∈ Q that could be made for I,

to produce a new instance I' of the problem whose size is strictly less than the size of the instance I.

2. The second construction uses

• an instance I of the problem that is not a base instance,

• an element x that could be a greedy choice for I,

• the smaller instance I' that would be produced from I and x using the construction described above, and

• any correct solution S' for the instance I',

to generate a correct solution S for the original instance I that includes x.


In order to prove correctness of these constructions you must prove that they both terminate, and that they generate the outputs described above when they do. Proving termination will usually not be difficult (and frequently, it will be “obvious” that the procedures terminate, so that all you need to do is to say so), and this won’t be discussed any further here.

To prove correctness of the first construction, you should let I be an arbitrary instance of the problem that isn’t a base instance and you should let x be an arbitrary greedy choice for I.

Let I' be the output that is generated by the first construction using I and x. You should at least mention the following, and you should provide a proof of any of these that are not obvious:

• I' is a well-formed instance of the optimization problem that is to be solved;

• I' is a strictly smaller instance than I is.

It will often (but not always) be the case that both of these are easy to prove.

After you’ve proved correctness of the first construction, you should let I and x be as above, and you should let I' be the instance of the problem that is generated from I and x using the first construction.

Now, let S' be an arbitrary solution of the problem for the instance I'. Finally, let S be the output generated by the second construction using I, x, I', and S'. You will now need to prove that S is a correct solution of the problem for instance I. In order to do this, you will need at least to mention each of the following, and you will need to prove any of these that are nontrivial:

• S is “syntactically correct,” so that it is (or represents) a subset of the input set Q that is defined by the instance I;

• S is a valid set for the instance I;

• S is an optimal set for the instance I.

Here is a way to try to do the last step (which will usually be the most difficult) — namely, to prove that S is optimal, provided that you’ve done everything before that, including having proved that the greedy choice property holds.

Note that, since the greedy choice property holds, there exists an optimal set T for the instance I of the problem, such that x ∈ T.

Somehow, use T to find a set T', and prove (or, if one or the other is obvious, “observe”) that

• T' is a valid set for the smaller instance I', and

• if you applied the second construction to the set T' (with I, x, and I' unchanged), then the set T (that you used to construct T') would be produced by this construction.

Suppose, now, that you’ve managed to do this.

Since T' is valid for the instance I' and S' is optimal for this instance, Opt(S') ≥ Opt(T') if this is a maximization problem and Opt(S') ≤ Opt(T') if this is a minimization problem. (Note that, in the previous sentence, “Opt” refers to the optimization function defined for the smaller instance I'.)

Next, try to use this fact, and the fact that S and T can be generated from S' and T' respectively using the second construction, to prove that Opt(S) ≥ Opt(T) for a maximization problem, and Opt(S) ≤ Opt(T) for a minimization problem, as well. Here, “Opt” refers to the optimization function defined for the original instance I (as usual).


It will sometimes be the case that you can do this, by proving that

Opt(S) − Opt(T) = Opt(S') − Opt(T'),

where the values on the right-hand side are computed using the optimization function defined for the smaller instance I'.

Now, if the above is true then S must be optimal for the instance I, because T is, and S is valid.

Regardless of whether you’ve used this method (of “comparing S to T”) or not, you’re finished once you’ve proved that S is optimal: Since the instance I, greedy choice x, and solution S' for the smaller instance I' were “arbitrarily chosen,” this implies that the second construction is correct.

3.6.3 An Important Special Case

Suppose that you’ve managed to establish the greedy choice property, prove that the first construction is correct, and prove that the second construction terminates.

Furthermore, suppose that the second construction forms the solution S for the original instance by adding the greedy choice x to the solution S' for the derived instance (without making any other changes).

Then the following claim implies that the second construction is also correct — which is all you’d need in order to complete the proof of the optimal substructure property, in this case.

Claim: Suppose that

• the greedy choice property is satisfied;

• the optimization function “Opt” is the same for the instance I and the smaller instance I' that the first construction produces from the original instance I and the greedy choice x;

• the function Opt is additive: For every subset S of Q,

Opt(S) = Σ_{y∈S} Opt({y});

• the “input set” Q' defined by I' is a subset of Q \ {x} (so that Q ⊇ Q' ∪ {x});

• if S' is a solution of the problem for instance I' then S' ∪ {x} is a valid set for the instance I;

• if T is any solution of the problem for instance I such that x ∈ T, then T \ {x} is a valid set for the instance I'.

Then if S' is a solution of the problem for the instance I', then S' ∪ {x} is a solution for the instance I, so that we can set S = S' ∪ {x}.

Proof. It is given that the first construction is correct and that both constructions terminate. All that remains is to prove that the second construction’s output is correct.

Suppose now that the problem is a minimization problem, and let T be a solution of the problem for the instance I such that x ∈ T: Since the greedy choice property is satisfied and x is a greedy choice for this instance, such a set T does exist.

Then T' = T \ {x} is a valid set for the instance I'. On the other hand, S' is an optimal set for this instance, so

Opt(S') ≤ Opt(T').


Let S = S' ∪ {x}, as above. Then, since the function Opt is additive, and x ∉ S' (since x ∉ Q'),

Opt(S) = Opt(S' ∪ {x}) = Opt(S') + Opt({x}) ≤ Opt(T') + Opt({x}) = Opt(T).

On the other hand, S is a valid set and T is an optimal set for the instance I, so that

Opt(T) ≤ Opt(S)

as well. Therefore Opt(S) = Opt(T), and S is optimal for I (since T is, and S is valid for I).

The same argument, with the direction of all inequalities reversed, establishes the result if the problem is a maximization problem, instead.

3.6.4 Completing a Proof of Correctness

If you’ve managed to state and prove the above properties then you may conclude that the given greedy method is “correct.”

To see that these properties really do imply correctness of the method, I suggest that you try to establish that the method always returns an optimal output set, using induction on the size of the input.

You should discover that you’ll use the correctness of your method for solving “base” inputs in order to complete the basis in your proof by induction. And you should discover that you’ll use the above properties in order to complete the inductive step that you need.

While this “exercise” of completing the proof of correctness (using induction) is recommended, to improve your understanding of this topic, you won’t need to include such a proof by induction when you’re asked to prove correctness of a greedy method, for CPSC 413.

3.6.5 A Mistake to Watch For and Avoid

Recall that (in this chapter) we started with a version of the Optimal Fee problem in which we required an output set with size at most ⌈n/2⌉. In order to design a greedy algorithm more easily, we considered a “generalized” version of the problem that added a size bound k as an input, instead.

By adding k, we made it easier to design a recursive solution — because this made it easier to reduce the problem of solving an instance of the problem to that of solving another instance of the same problem. This would have been much more difficult to do if we’d stuck with the first version of the problem that was introduced in this chapter — try it, and see for yourself!

Sometimes, rather than adding an extra “size bound,” as we did here, you can add an extra input set T ⊆ Q as a parameter, and add the additional constraint that the output must be a subset of T, and not just a subset of Q. You can turn an instance of the original problem into an instance of this new one, by setting T to be Q and leaving everything else unchanged.

For example, this would have been a useful way to proceed if you’d managed to find a useful greedy method for the original version of the Optimal Fee problem (used as an example for Dynamic Programming), and run into trouble proving correctness.

It’s sometimes the case that you only discover the need to generalize (or slightly modify) the problem to be solved when you’re trying to establish the “optimal substructure property,” because this is when you’ll see that you need to add some additional parameters or constraints (as we did above) in order to recurse.

If you do modify the description of the problem — even slightly — at this point, then you must make sure to do the following, as well:


1. Provide an efficient reduction from the original problem to the modified problem that you’ve produced. That is, describe an efficient way to start with an arbitrary instance I of the original problem and use this to produce an instance I* of the modified problem, and describe an efficient way to take any solution S* of the instance I* of the modified problem and use this to recover a solution for the instance I of the problem that you started with.

This was done for the Optimal Fee problem, in these notes, by giving an algorithm for the first version of the problem that used an algorithm for the generalized version as a subroutine. (A sketch of this reduction is given after this list.)

2. Prove correctness of the greedy method for the generalized problem — making sure to establish that base instances are correctly recognized and solved, and that both the greedy choice property and the optimal substructure property hold for this version of the problem and the algorithm for it. (It’s certainly not sufficient to prove part of this for a version of an algorithm that solved the “original” problem and to do the rest of it for a version that solved the “generalized” version instead.)

We’ll do this, for the generalized version of the problem, in the next section of these notes.
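As a sketch of the reduction in the first step (using, as the subroutine, the optimal_fee function sketched below in Section 3.8.1, and assuming that the first version’s size bound is ⌈n/2⌉ as stated earlier): an instance of the original problem becomes an instance of the generalized one with k = ⌈n/2⌉, and any solution of the generalized instance is directly a solution of the original instance, so the recovery step is trivial.

    import math

    # Hypothetical reduction: solve the first version of the Optimal Fee
    # problem by calling the generalized algorithm with k = ceil(n / 2).
    def optimal_fee_original(fees):
        return optimal_fee(fees, math.ceil(len(fees) / 2))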

If you forget the first of these, but remember the second, then you will be able to produce a correct greedy algorithm that solves the modified version of the problem. However, you won’t (necessarily) have any way to use this algorithm as part of another algorithm for the problem you wanted to solve in the first place.

If you forget the second of these (or prove part of what’s needed for the original version and prove the rest for the modification) then you cannot conclude that there is a correct greedy algorithm for either version of the problem, because you haven’t established that base instances are correctly handled and the greedy choice property holds and the optimal substructure property holds, for either one.

3.6.6 Expectations for CPSC 413

It is expected that you will be able to state the greedy choice and optimal substructure properties and that you will be able to prove correctness of greedy algorithms by following the outlines that have been given above — provided that sufficient time is allowed for this and that reasonable hints are given for the parts that require inspiration or creativity, as needed.

3.7 Application: Proving Correctness of the Greedy Algorithm for the (Generalized) Optimal Fee Problem

Now consider the greedy method for the final version of the “Optimal Fee” problem that was presented (as pseudocode) in Figure 3.1 on page 61.

It’s been noted already that the “base instances” of the problem were chosen to be those instances such that n = 0 (that is, no fees were included) or k = 0, or both. The only possible output set, for such an instance, is the empty set.

It should be clear by inspection of the first two lines of the pseudocode that base instances are correctly recognized and solved by this greedy algorithm.


3.7.1 The Greedy Choice Property

If I is an instance that is not a base instance then it includes n fees where n ≥ 1, and it includes the input k ≥ 1.

As defined by the greedy algorithm, a “greedy choice” for this instance is any integer i such that 1 ≤ i ≤ n and fi ≥ fj for all j such that 1 ≤ j ≤ n.

Thus the “Greedy Choice Property” is the following, for this problem and algorithm:

Claim. If I is an instance of the final version of the Optimal Fee problem that includes n jobs with fees f1, f2, . . . , fn, where n ≥ 1, and if it also includes an input k ≥ 1, then there exists an integer i such that 1 ≤ i ≤ n and fi ≥ fj for all integers j such that 1 ≤ j ≤ n. Furthermore, for any such integer i, there exists an optimal set S of jobs (for this instance) such that i ∈ S.

Proof of Claim. Suppose I is as described in the claim (so that it’s an instance of the problem that is not a base instance, it includes n jobs with fees as above, for n ≥ 1, and so that it includes the input k ≥ 1).

Since the set of integers between 1 and n is a nonempty finite set (it is given that n ≥ 1), and since fi is a nonnegative integer for 1 ≤ i ≤ n, it is clear that there exists at least one integer i such that 1 ≤ i ≤ n and fi ≥ fj for all j such that 1 ≤ j ≤ n, as claimed.

Let i be any such integer (so that i is a “greedy choice”). Now, let S ⊆ {1, 2, . . . , n} be any correct (valid and optimal) output set corresponding to the instance I. Either i ∈ S or i ∉ S.

Case: i ∈ S. In this case it is sufficient to set S' to be S; then S' is an optimal set of jobs for this instance that includes i, as desired.

Case: i ∉ S.

Two “subcases” will be considered: Either |S| < k or |S| = k. Note that, since S is valid, |S| ≤ k, so that one of these two subcases must hold.

Subcase: |S| < k.

In this case, since |S| and k are integers, |S| ≤ k − 1. Let

S' = S ∪ {i}.

Clearly i ∈ S' (by construction). Since i ∉ S, |S'| = |S| + 1 ≤ (k − 1) + 1 = k, and S' ⊆ {1, 2, . . . , n}, so that S' is a valid set. Furthermore, since i ∉ S and fi ≥ 0,

fS' = fS + fi ≥ fS.

On the other hand, this is a maximization problem, S is optimal, and S' is valid, so that fS ≥ fS' as well. Therefore fS' = fS and S' is optimal (since S is).

Subcase: |S| = k.

In this case, since k ≥ 1, S is nonempty. Let j be some element of S, and let

S' = (S \ {j}) ∪ {i}.

Once again it is clear that i ∈ S', by construction.


It is also clear that S' contains all the elements that S does, except for j, and S' contains i as well. Therefore |S'| = (|S| − 1) + 1 = |S| = k, and S' ⊆ {1, 2, . . . , n}, so that S' is valid.

Since i was a greedy choice, fi ≥ fj. Therefore

fS' = (fS − fj) + fi = fS + (fi − fj) ≥ fS.

On the other hand, this is a maximization problem, S is optimal, and S' is valid, so fS ≥ fS' as well. Therefore fS' = fS and S' is optimal (since S is).

It has been shown that S' is an optimal set that contains i in every case. Since the (non-base) instance I and the greedy choice i were arbitrarily chosen, it now follows that the claim is correct.

Note: You should be able to confirm that this follows the general “outline” for proofs of the greedy choice property that has been presented in these notes. It uses the “second” strategy that was proposed to handle the case “x ∉ S.”

You may be wondering why the first strategy wasn’t used, instead. The reason is that there might exist optimal sets for the above instance that don’t include the greedy choice i. In particular, this can happen (only) if k < n and there are k + 1 or more integers between 1 and n that represent jobs with the same (maximal) fee.

You might also be wondering why the “first subcase” needed to be considered. In fact, it can arise, but only for extremely unusual instances — namely, those such that f1 = f2 = · · · = fn = 0.

So, these cases must be considered, because they’re possible, even though they might not arisevery often.

3.7.2 Proving the Optimal Substructure Property

As described in the definition of the “Optimal Substructure Property (for Greedy Methods),” two constructions must be described and proved to be correct.

The first construction uses a non-base instance I and a greedy choice (i) for I to produce a smaller instance I'; this is shown in lines 5–12 of the pseudocode algorithm that is shown in Figure 3.1 on page 61.

The second construction uses all of this, as well as a correct (optimal) solution S' for the instance I', and uses this to produce a correct (optimal) solution for the original instance I. This is shown in lines 14–18 of the above algorithm.

Since the only loops in either construction are for loops, it is clear that both constructions terminate.

It should also be clear, by inspection of the first construction, that it produces an instance that includes n − 1 jobs whose fees are also fees of the original instance. Since the original instance is not a base instance, n ≥ 1, so that n − 1 ≥ 0 (as one would hope). It’s also true that k ≥ 1, so that k' = k − 1 ≥ 0.

Therefore the instance I', which includes the fees f'1, f'2, . . . , f'n−1 and the integer k' defined by this construction, is a well-formed instance of the problem. It’s also a strictly “smaller” instance than I is, since it includes one fewer job.

Since the non-base instance I and the greedy choice i were “arbitrarily chosen,” it follows that this first construction is correct.

Now consider the second construction and the output set S that it produces. Since 1 ≤ i ≤ n and either S = S' ∪ {i} or S = S' ∪ {n} (by inspection of the second construction), S ⊆ {1, 2, . . . , n}.


Furthermore, since |S'| ≤ k' = k − 1, |S| ≤ |S'| + 1 ≤ k. Therefore, S is a valid output set (for the instance I).

It remains only to prove that S is also optimal. In order to do this we’ll need to consider both the optimization function that is defined as part of the instance I, and the optimization function that is defined as part of the instance I'. Unfortunately these aren’t the same (this is unfortunate, because it complicates the proof).

For any subset R of {1, 2, . . . , n} let

fR = Σ_{j∈R} fj,

as usual; then fR is the value of the “optimization function for the instance I” on input R. For any subset R' of {1, 2, . . . , n − 1}, let

f'R' = Σ_{j∈R'} f'j;

then f'R' is the value of the “optimization function for the instance I'” on input R'.

Now, let T be any optimal set for the instance I such that i ∈ T. Since i is a greedy choice for the instance I and the greedy choice property is satisfied, such a set T does exist. Set

T' = T \ {n} if n ∈ T, and T' = T \ {i} if n ∉ T.

It should be clear by inspection of this definition that T' is always a subset of {1, 2, . . . , n − 1} and that |T'| = |T| − 1. Since T is an optimal (and, therefore, valid) output set for the instance I, |T| ≤ k. Therefore |T'| ≤ k − 1 = k'.

Thus, T' is a valid output set for the instance I'.

Now, let T* be the subset of {1, 2, . . . , n} that would be obtained by applying the second construction to T'. Either n ∈ T or n ∉ T. Furthermore, if n ∈ T, then either n = i or n ≠ i.

If n ∈ T and n = i then T' = T \ {n} = T \ {i}, so that T* = T' ∪ {i} = T' ∪ {n} = T, by inspection of the second construction.

If n ∈ T and n ≠ i, then T' = T \ {n}, and i ∈ T' (since i ∈ T). Therefore T* = T' ∪ {n} = T, again by inspection of the second construction.

Finally, if n ∉ T, then T' = T \ {i}, so that i ∉ T', and T* = T' ∪ {i} = T, by inspection of the second construction once again.

Therefore T = T*; that is, T is the set that would be obtained from T' by applying the second construction to this smaller set.

Next, it should be noted that, since S' is an optimal output set for the instance I', while T' is a valid set for this instance, and since this is a maximization problem,

f'S' ≥ f'T'.

Now, consider any set R' that is a valid output set for the instance I' and the set R that would be produced by applying the second construction to it. Either i ∈ R' or i ∉ R'.


If i ∈ R' then R = R' ∪ {n}, so that i ∈ R as well and

fR = Σ_{j∈R} fj
   = fn + fi + Σ_{j∈R', j≠i} fj
   = f'i + fi + Σ_{j∈R', j≠i} f'j
   = fi + Σ_{j∈R'} f'j
   = fi + f'R'.

On the other hand, if i ∉ R', then R = R' ∪ {i}, so that

fR = Σ_{j∈R} fj
   = fi + Σ_{j∈R'} fj
   = fi + Σ_{j∈R'} f'j
   = fi + f'R',

in this case too.

Therefore fR = fi + f'R' in all cases.

It follows that fS = fi + f'S', and fT = fi + f'T'. Therefore, since f'S' ≥ f'T',

fS = fi + f'S' ≥ fi + f'T' = fT,

as well.

However, this is a maximization problem, S is a valid output set for the instance I, and T is an optimal output set for the instance I, so that fT ≥ fS too.

Therefore fS = fT and S is an optimal output set for the instance I (since T is).

Since the instance I, greedy choice i, and solution S' for the derived instance I' were all arbitrarily chosen, this implies that the second construction is correct (since the output it produces is correct in all cases), as is required to complete the proof of the Optimal Substructure Property.

Note: You should confirm that this proof follows the outline that was given for “proofs of the Optimal Substructure Property.”

That outline ended with a “claim” that was useful for proving this property in an important special case. Unfortunately, that “special case” didn’t apply, because it was necessary to do something different from simply “adding i to S' in order to get S” in this case. (There were also additional problems, as an inspection of the claim will verify.)

However, if you examine the proof of the claim, you should be able to verify that the argument that was used there has been used at the end of the above proof as well. It’s just been complicated, somewhat, by the need to “rename” the nth job in order to define the instance I' and to redefine the optimization function accordingly.


if (n = 0 or k = 0) then
    return ∅
else
    {Preprocessing: Set up the array A}
    for i := 1 . . . n do
        A[i] := (i, fi)
    end for
    Sort the ordered pairs in the array A into nonincreasing order of their
    second coordinates.
    S := ∅
    for i := 1 . . . min(n, k) do
        Add the first coordinate of the ordered pair in position i of the array A to S
    end for
    return S
end if

Figure 3.4: An Optimized Greedy Algorithm for the Modified Optimal Fee Problem

3.8 Proving Correctness of an “Optimized” Version

It’s already been mentioned that the “unoptimized” (recursive) versions of greedy algorithms are mainly of interest because we can prove that they are correct, and not because they’re the versions that should be implemented. Optimized versions, which add preprocessing, use data structures to iterate over subproblems quickly, and which are often iterative rather than recursive, should be implemented instead.

In order to prove that these are correct, you should prove that they always return the same outputs as the “unoptimized” versions; then the optimized algorithms are correct because the unoptimized ones are.

3.8.1 Application: The Optimal Fee Problem

Consider, once again, the final version of the “Optimal Fee Problem.” Here is an “optimized version” of the greedy algorithm that we’ve proved to be correct.

The algorithm will use an array A of length n; the entries of the array will be ordered pairs (i, fi) for 1 ≤ i ≤ n, and the algorithm is shown in Figure 3.4.

In order to prove correctness of this optimized version of the algorithm, you should prove that every output set that can be returned by the above algorithm, given an instance I as input, can also be returned as an output set by the original “unoptimized” algorithm, when given the same instance I as input.

One way to prove that the two algorithms return the same output sets — indeed, a way to “discover” an optimized version of the algorithm from the unoptimized one — is to characterize the sets that the unoptimized algorithm returns in a more helpful way. Suppose now that I is some instance of the problem that includes n fees, f1, f2, . . . , fn, and a bound k on the size of the output set (as usual).

Claim. Any subset S of {1, 2, . . . , n} of size m = min(n, k), such that

• for all integers i and j such that 1 ≤ i, j ≤ n, if i ∈ S and j ∉ S then fi ≥ fj,


is a correct (that is, optimal) output set for the instance I.

Proof. It is sufficient to prove that any set S that has the above property can be returned as output by the “unoptimized” version of the algorithm, when given the instance I as input. This can be established using induction on m = min(n, k).

Basis: If m = min(n, k) = 0 then the only set S of size m is the empty set — and it is clear by inspection of the unmodified version of the algorithm that this set is returned as output.

Inductive Step: Suppose m > 0, and that the result is true for all instances that include n' fees and a bound k' on output set size, for all integers n' and k' such that min(n', k') < m.

Once again, let I be an instance including n fees f1, f2, . . . , fn and a bound k on the output set size, where min(n, k) = m.

Finally, let S be any subset of {1, 2, . . . , n} of size m that has the property given in the claim (that is, if i ∈ S and j ∉ S then fi ≥ fj). Since m > 0, S is nonempty.

Let i be any element of S such that fi ≥ fh for every element h of S; since S is finite and nonempty, such an element does exist. Then, since S satisfies the property given in the claim, fi ≥ fh for every integer h such that 1 ≤ h ≤ n and h ∉ S, as well. That is, i is an element that could be chosen as the “greedy choice” when the unoptimized algorithm is run with the instance I as input.

Given S and i, let

S' = S \ {n} if n ∈ S, and S' = S \ {i} if n ∉ S.

As well, for 1 ≤ j ≤ n − 1, let

f'j = fj if j ≠ i, and f'j = fn if j = i.

In addition, set k' = k − 1, and note that the fees f'1, f'2, . . . , f'n−1 and the integer k' are the fees and bound that would be included in the derived smaller instance I' that would be formed and recursively solved, if the unoptimized algorithm was run with the instance I as input and if i was the greedy choice that was initially made. Set n' = n − 1.

Now, it is not difficult (but somewhat tedious) to prove that S' is a subset of {1, 2, . . . , n'} with size min(n', k') = m − 1 such that

• for all integers r and s such that 1 ≤ r, s ≤ n', if r ∈ S' and s ∉ S' then f'r ≥ f's.

That is, S' satisfies the property given in the claim, with respect to the instance I'. (This “not difficult but tedious proof” is left as an exercise.)

Since min(n', k') < m, it follows by the inductive hypothesis that S' is a correct (valid and optimal) output set for the instance I' including the fees f'1, f'2, . . . , f'n−1 and the bound k' — because the unoptimized greedy algorithm can return S' as output when given the instance I' as input.

To finish, it is sufficient to confirm that if S' is the solution that is returned when the problem is recursively solved on input I', then the set S that we started with is also the set that the unoptimized greedy algorithm will compute and return, as a solution for the instance I, from S' and the greedy choice i. (Exercise: Prove this, too — this shouldn’t be difficult, if you’ve read the proof of correctness of the “unoptimized” algorithm carefully.)

Since the instance I was arbitrarily chosen, this establishes the claim.


Now, in order to complete the “discovery” and proof of correctness of the “optimized” version of the algorithm, it suffices to note that a correct and efficient way to obtain a set S with the property given in the claim is to sort the jobs into nonincreasing order of fee and choose the first min(n, k) of them that appear in the sorted list. This is exactly what the “optimized” algorithm does.

The most expensive part of this computation is the sorting of the array. If an asymptotically efficient sorting algorithm like heap sort or merge sort is used, then this, and the rest of the algorithm, can be performed using O(n log n) integer operations.
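As a cross-check, the algorithm of Figure 3.4 is short enough to render directly in Python. This is a sketch rather than part of the notes; the sort is done in nonincreasing order of fee, so that the largest fees come first:

    def optimal_fee(fees, k):
        n = len(fees)
        if n == 0 or k == 0:
            return set()
        A = [(i, fees[i - 1]) for i in range(1, n + 1)]   # pairs (i, f_i)
        A.sort(key=lambda pair: pair[1], reverse=True)    # largest fee first
        # Take the job numbers from the first min(n, k) positions of A.
        return {A[i][0] for i in range(min(n, k))}

For example, optimal_fee([9, 10, 9, 1], 2) returns a two-job set with total fee 19 (here {1, 2}); any set consisting of job 2 and one of the two fee-9 jobs would be an equally correct output.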

3.9 Application: The Activity Selection Problem

The following problem is also considered at the beginning of Chapter 17 of Cormen, Leiserson, and Rivest [5].

3.9.1 Definition

Consider the problem of trying to schedule as many activities as possible in a single lecture hall.

If the input specifies n activities then we will assume that these are represented by the integers between 1 and n, so that the output is a subset of the “input set” {1, 2, . . . , n}.

For 1 ≤ i ≤ n, the ith activity has a start time si and a finish time fi (which are both nonnegative real numbers), and the problem input consists of the n pairs of numbers

(s1, f1), (s2, f2), . . . , (sn, fn).

Since each activity must occupy some positive amount of time, fi > si if 1 ≤ i ≤ n.

For 1 ≤ i, j ≤ n such that i ≠ j, the ith and jth activities are compatible if and only if their times don’t overlap, that is, if

fi ≤ sj or fj ≤ si.

These activities are incompatible otherwise.

An output set S ⊆ {1, 2, . . . , n} (for the instance of the problem described by the above input) is valid if and only if it satisfies the constraint that, for 1 ≤ i, j ≤ n such that i ≠ j, if i ∈ S and j ∈ S then i and j are compatible.

That is, S is valid if it does not contain any pair of incompatible activities.

An output set S ⊆ {1, 2, . . . , n} is optimal (for this instance of the problem) if S is valid and |S| ≥ |T| for every valid subset T ⊆ {1, 2, . . . , n}.

Thus, this is another example of an optimization problem — specifically, a maximization problem.
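These definitions translate directly into code. Here is a small Python sketch (with illustrative names, not code from the notes) of the compatibility and validity tests, where the input is a list of (start, finish) pairs for activities 1, 2, . . . , n:

    from itertools import combinations

    def compatible(a, b):
        # Activities a = (s_i, f_i) and b = (s_j, f_j) must not overlap.
        (si, fi), (sj, fj) = a, b
        return fi <= sj or fj <= si

    def is_valid(S, activities):
        # S is valid iff every pair of distinct activities in S is compatible.
        return all(compatible(activities[i - 1], activities[j - 1])
                   for i, j in combinations(S, 2))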

3.9.2 A Generalization

In order to define recursive solutions for this problem it will be useful to consider a generalization of this problem whose input also includes a nonnegative integer, “Start”.

This integer represents the earliest time at which the hall is available, so now a set S ⊆ {1, 2, . . . , n} is only valid if it meets the requirement that has already been given (it does not include any incompatible activities) and if it includes no activities that begin before Start — that is, if for all i ∈ S, si ≥ Start.

An optimal set will still be defined to be a valid set whose size is as large as possible.


The “size” of an instance of this generalized problem will be defined in an unusual way: The size of the instance that includes the above start times and finish times, and the integer Start, will be defined to be the number of integers i between 1 and n such that si ≥ Start. Note that this is an upper bound on the size of any output set that is a correct solution.

Now, it’s easy to turn an instance of the original problem into an instance of the modification (since the start times are all nonnegative) — all you need to do is set Start to be 0 and include this additional input. Then, any correct solution for the resulting instance of the generalized problem is a correct solution for the given instance of the original problem, as well.

The generalized version of the problem will be considered instead of the original version in the rest of these notes.

3.9.3 Base Instances

Before considering greedy strategies it will be useful to define “base instances” of this problem; these will be instances such that

si < Start for 1 ≤ i ≤ n;

that is, instances such that no activities can be included in a valid output set, so that the only correct solution for any of these instances is the empty set, ∅.

Note that this set is a correct output set, for any such instance of this problem.

3.9.4 An Incorrect Strategy

Strategy

Suppose now that we have a non-base instance, so that si ≥ Start for at least one integer i such that 1 ≤ i ≤ n.

Consider a strategy in which we begin by including an activity whose start time is as early as possible (without being less than Start). This is the same as choosing an activity that minimizes the value of the function loc, if this function is defined as follows for 1 ≤ i ≤ n:

loc(i) = 1 + max_{1≤j≤n} sj if si < Start, and loc(i) = si if si ≥ Start.

Suppose the activity i with start time si and finish time fi was selected; then a “smaller instance” of the problem (that can be solved recursively to complete the definition of the output set) can be obtained by changing the value of Start to be fi and leaving everything else unchanged.

Proof That the Strategy is Incorrect

Now consider an instance that includes three activities, so that n = 3, and such that s1 = 1, f1 = 6, s2 = 2, f2 = 3, s3 = 4, f3 = 5, and such that Start = 0.

Using the above strategy, the first activity will be selected and included in the output set (since its start time is strictly less than the start time of all the other activities). However, the second and third activities are incompatible with the first, so that only the first activity can be included. Therefore the above strategy will produce the output set {1}.

However, the second and third activities both begin after the start time and are compatible (since f2 ≤ s3), so {2, 3} is a valid set that is strictly larger than the output produced by the greedy strategy.


Therefore, the greedy strategy’s output is not an optimal set for this instance of the problem. It follows that a greedy method using this strategy would be incorrect.
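The counterexample can be replayed in a few lines of Python, reusing the is_valid sketch given earlier (the function earliest_start is an illustrative rendering of the strategy, not code from the notes):

    activities = [(1, 6), (2, 3), (4, 5)]   # (s_i, f_i) for i = 1, 2, 3

    def earliest_start(activities, Start=0):
        chosen, t = set(), Start
        while True:
            candidates = [i for i in range(1, len(activities) + 1)
                          if i not in chosen and activities[i - 1][0] >= t]
            if not candidates:
                return chosen
            i = min(candidates, key=lambda c: activities[c - 1][0])
            chosen.add(i)
            t = activities[i - 1][1]        # the hall is next free at f_i

    print(earliest_start(activities))       # {1}
    print(is_valid({2, 3}, activities))     # True, and |{2, 3}| > |{1}|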

3.9.5 A Correct Strategy

Strategy

Suppose, again, that we have a non-base instance of the problem, so that there is at least one integer i such that si ≥ Start. Since fi > si, fi ≥ Start as well.

Now consider a strategy in which we begin by including an activity (that doesn’t begin before Start) that finishes as early as possible. This is the same as including an activity that minimizes the value of the function loc, if this function is defined as follows for 1 ≤ i ≤ n:

loc(i) = 1 + max_{1≤j≤n} fj if si < Start, and loc(i) = fi if si ≥ Start.

We’ll prove in the rest of these notes that this greedy strategy is correct — it can be used to produce a correct solution for this problem.
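Pseudocode for this method appears in Figure 3.5, later in this section; the following Python sketch renders the same recursion, repeatedly taking the earliest-finishing activity that starts at or after Start:

    def activity_selection(activities, Start=0):
        candidates = [i for i in range(1, len(activities) + 1)
                      if activities[i - 1][0] >= Start]
        if not candidates:                  # a base instance
            return set()
        # Greedy choice: earliest finish time among eligible activities.
        i = min(candidates, key=lambda c: activities[c - 1][1])
        # Derived instance: same activities, with Start replaced by f_i.
        return activity_selection(activities, activities[i - 1][1]) | {i}

On the instance used in the incorrectness proof above, activity_selection([(1, 6), (2, 3), (4, 5)]) returns {2, 3}, which is optimal.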

Proof of Correctness

Base Instances. If base instances are defined as above then they are easy to recognize: An instance including the above start times and finish times, and the integer Start, is a base instance if and only if si < Start for every integer i such that 1 ≤ i ≤ n; this is trivially satisfied if n = 0.

Base instances are also easy to solve, since the empty set ∅ is a correct solution for any of them (and is the only correct solution, as well).

Greedy Choice Property. The “greedy choice property” for this problem and greedy method is as follows.

Claim. Suppose I is an instance of this problem that includes n activities, such that the ith activity has start time si and finish time fi (with fi > si) for 1 ≤ i ≤ n, as well as the integer Start.

Suppose, furthermore, that this is not a base instance of the problem, so that si ≥ Start for at least one integer i between 1 and n.

Finally, let i be any integer such that 1 ≤ i ≤ n and loc(i) ≤ loc(j) for all j such that 1 ≤ j ≤ n, for the local optimization function loc that is defined above.

Then there exists a valid and optimal set S ⊆ {1, 2, . . . , n} for this instance of the problem such that i ∈ S.

Proof of Claim. Let S be a valid and optimal output set for this instance of the problem. Then either i ∈ S or i ∉ S.

Case: i ∈ S. In this case it is sufficient to let S' = S; then S' is a valid and optimal output set for this instance such that i ∈ S', as required.

Case: i ∉ S.

To begin, note that S is nonempty — for otherwise, since I is not a base instance of the problem, there exists at least one integer j such that 1 ≤ j ≤ n and sj ≥ Start, so that {j} would be a valid output set whose size is greater than that of S.


Thus there exists at least one integer j in S. Suppose, now, that j is an integer between 1 and n such that j ∈ S and loc(j) ≤ loc(k) for every integer k ∈ S. Since i ∉ S it is clear that j ≠ i.

Let S' = (S \ {j}) ∪ {i}. Then, clearly, i ∈ S', by construction.

Since S ⊆ {1, 2, . . . , n} and 1 ≤ i ≤ n it is clear that S' ⊆ {1, 2, . . . , n} as well.

We will next show that S' is valid. In order to do this we must show that sk ≥ Start for every integer k ∈ S', and that u and v are compatible for every pair of (distinct) activities u and v in S'.

Suppose that 1 ≤ k ≤ n and k ∈ S'. It has already been noted that there exists at least one integer ℓ such that sℓ ≥ Start, so that loc(ℓ) ≤ max_{1≤h≤n} fh. Since i was a “greedy choice,” loc(i) ≤ loc(ℓ), so that loc(i) ≤ max_{1≤h≤n} fh as well, implying that si ≥ Start (and that loc(i) = fi). Therefore sk ≥ Start if k = i.

On the other hand, if k ≠ i then k ∈ S (by the construction of S'), and it follows that sk ≥ Start (because S is a valid set) in this case as well.

Therefore, sk ≥ Start for every integer k ∈ S'.

Now let u and v be distinct elements of S'.

If u = i then v ≠ i, so that v ∈ S. Therefore j and v are distinct elements of S (they both belong to S and they’re not the same, since v ∈ S' and j ∉ S'), so that v and j are compatible, since S is valid.

This implies that either fj ≤ sv or fv ≤ sj. The latter possibility can be ruled out by the choice of j: loc(j) ≤ loc(v), and this implies that fj ≤ fv because j and v both belong to a valid output set. Now, if fv were less than or equal to sj as well, then this would imply that fj ≤ sj, which we know is impossible.

Therefore, fj ≤ sv. However, fi = loc(i) ≤ loc(j) = fj, so fi ≤ sv as well. Therefore, the activities u and v are compatible if u = i.

It follows by a symmetric argument (exchanging the roles of u and v) that u and v are compatible if v = i, as well.

Finally, if neither u = i nor v = i then u and v are distinct activities that belong to S, and u and v are compatible because S is valid.

Therefore (since this has been proved in every possible case), every distinct pair of activities u, v ∈ S' is compatible.

It follows that S' is a valid set.

Now, since S' = (S \ {j}) ∪ {i}, j ∈ S, and i ∉ S,

|S'| = (|S| − 1) + 1 = |S|,

and it follows that S' is also optimal, since S is optimal and S' is valid.

Therefore, S' is an optimal set that contains i, as required.

Thus, the claim is correct — that is, the “greedy choice property” is satisfied.

Optimal Substructure Property. In order to prove the optimal substructure property it is necessary to describe two constructions and prove that each is correct.

The first construction accepts a non-base instance I and a greedy choice i and produces a smaller instance, I'. This is easily described: If I includes n activities such that the jth activity has start time sj and finish time fj for 1 ≤ j ≤ n, as well as the time Start, and if i is a greedy choice for I, then it suffices to set I' to be another instance with the same n activities, such that the jth activity has the same start time sj and finish time fj as above — but which includes the additional input (as start time) fi, instead of Start.


The second construction accepts the above instance I, greedy choice i, derived smaller instance I', and a correct solution S' ⊆ {1, 2, . . . , n} for the instance I', and generates a solution S for I. In this case, this is easy: It is sufficient to set S = S' ∪ {i}.

It should be clear that (straightforward implementations of) both of these constructions terminate when given syntactically correct inputs.

It should also be clear that if I is a non-base instance of the problem, and i is a greedy choice for I, then the instance I' produced by the first construction is a syntactically correct instance of the problem — it includes the same activities (with the same start times and finish times) as I does, so these must all be syntactically correct. Since fi is a nonnegative integer, the additional input fi that is included in I' (to replace Start) is a nonnegative integer as well, so that the entire instance I' is syntactically correct.

Next recall the definition of the “size” of an instance; using this definition, the size of the instance I is the number of integers j such that 1 ≤ j ≤ n and sj ≥ Start, while the size of the instance I' is the number of integers j such that 1 ≤ j ≤ n and sj ≥ fi.

Since i is a greedy choice for I, loc(i) ≤ loc(j) for every integer j such that 1 ≤ j ≤ n. Since I is a non-base instance there is at least one integer j such that sj ≥ Start, so that loc(j) ≤ max{fℓ : 1 ≤ ℓ ≤ n}.

Thus loc(i) ≤ max{fℓ : 1 ≤ ℓ ≤ n} as well, implying that si ≥ Start. Since fi > si, fi ≥ Start as well.

This implies that, for every integer j such that 1 ≤ j ≤ n, if sj ≥ fi then sj ≥ Start as well — so the “size” of the instance I (as defined above) is at least as large as the “size” of the instance I′.

However, si ≥ Start (again, because i was a greedy choice for I), but si < fi, and this implies that the “size” of the instance I′ is strictly less (by at least one) than the “size” of the instance I.

Since the non-base instance I and greedy choice i were arbitrarily chosen, it follows that the first construction is correct.

Now, to prove correctness of the second construction, let I, i, and I′ be as above, and let S′ be any valid and optimal output set for the instance I′. Let S be the set generated by the second construction using these, so that S = S′ ∪ {i}.

Clearly S ⊆ {1, 2, . . . , n} since S′ ⊆ {1, 2, . . . , n} and 1 ≤ i ≤ n. Thus the output set S is “syntactically correct.”

Let k ∈ S; then either k = i or k ∈ S′.

It has been noted already that si ≥ Start because i is a greedy choice for the instance I. Therefore sk ≥ Start if k = i.

On the other hand, if k ∈ S′ then sk ≥ fi, since S′ is a valid output set for the instance I′. It has already been shown that fi > si ≥ Start, so this implies that sk ≥ Start in this case, as well.

Therefore sk ≥ Start for every element k of S.

Next let u and v be any distinct pair of activities in S.

If u = i then v ≠ i, so v ∈ S′. It follows immediately that sv ≥ fi = fu, so that u and v are compatible in this case.

If v = i then u and v are compatible as well, by a symmetric argument (exchanging the roles of u and v).

Finally, if u ≠ i and v ≠ i then u and v are distinct activities in S′, so u and v must be compatible in this case as well (because S′ is a valid output set for the instance I′).

Therefore u and v are compatible for every pair of distinct activities u, v ∈ S, and S is a valid output set for the instance I.

Finally, let T be any optimal output set for the instance I such that i ∈ T; since the greedy choice property is satisfied, some such set T exists.

Let T′ = T \ {i}; then clearly T′ ⊆ {1, 2, . . . , n}, since T ⊆ {1, 2, . . . , n}.


if (si < Start for all i such that 1 ≤ i ≤ n) then
    return ∅
else
    Let i be any integer such that 1 ≤ i ≤ n, si ≥ Start, and fi ≤ fj for every
        integer j such that 1 ≤ j ≤ n and sj ≥ Start.
    Let I′ be an instance with the same activities, start times, and finish times as the
        given instance, but with fi replacing the input Start.
    Apply the algorithm recursively to the input instance I′ to obtain an output set S′.
    return S′ ∪ {i}
end if

Figure 3.5: A Greedy Method for the Activity Selection Problem

Let j ∈ T′; then i and j are distinct activities in T, so i and j are compatible, and either fi ≤ sj or fj ≤ si. However, sj ≥ Start (because j ∈ T and T is valid for the instance I), so loc(j) = fj ≥ fi = loc(i) because i is a greedy choice for I. Therefore it cannot be true that fj ≤ si, and it must be the case that fi ≤ sj.

Now, let u and v be any pair of distinct activities in T′; then u and v are also a pair of distinct activities in T, so that u and v are compatible.

Since sj ≥ fi for every activity j ∈ T′, and u and v are compatible for every pair of distinct activities u, v ∈ T′, T′ is a valid output set for the instance I′.

Therefore, |T′| ≤ |S′|, since S′ is an optimal output set for the same instance. However, this implies that

|S| = 1 + |S′| ≥ 1 + |T′| = |T|,

so S is an optimal output set for the instance I, because S is a valid output set and T is an optimal output set for this instance.

Since the non-base instance I, greedy choice i, and solution S′ for the derived instance I′ were arbitrarily chosen, it follows that the second construction is also correct.

Therefore, the optimal substructure property is satisfied.

Pseudocode for the Correct Algorithm, without Optimizations. Pseudocode for a recursive algorithm that implements the above strategy (without optimizations) is shown in Figure 3.5.
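For concreteness, here is one way the algorithm in Figure 3.5 might be rendered as runnable Python. This is only a sketch: the activities are given as parallel lists s and f of start and finish times, and the indices are 0-based rather than the 1-based names used above.

def select_activities(s, f, start):
    # Base instance: no activity starts at or after `start`.
    available = [i for i in range(len(s)) if s[i] >= start]
    if not available:
        return set()
    # Greedy choice: an available activity with the smallest finish time.
    i = min(available, key=lambda j: f[j])
    # Smaller instance: same activities, with f[i] replacing `start`;
    # solve it recursively and add the greedy choice to its solution.
    return select_activities(s, f, f[i]) | {i}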

An Optimized Version. The algorithm shown in Figure 3.6 is also a correct solution for this problem, because it produces exactly the same output sets as the one given above; this can be proved by induction on the “size” of the instance being solved.

Since the initial sorting step is the most expensive operation (the rest is just a single sweep through the sequence of inputs), this problem can be solved using O(n log n) operations (assuming that the “unit cost criterion” is used, and n is the number of activities included in the input).

3.10 What to Do if There are No Solutions

The above treatment concerns optimization problems such that solutions for “well-formed” (or “syntactically correct”) instances are always guaranteed to exist.


Sort the activities by finish times, to obtain a sequence i1, i2, . . . , in such that
    {i1, i2, . . . , in} = {1, 2, . . . , n} (that is, the two sets are the same), and
    fi1 ≤ fi2 ≤ · · · ≤ fin
S := ∅
start := Start
for j := 1 . . . n do
    if sij ≥ start then
        S := S ∪ {ij}
        start := fij
    end if
end for
return S

Figure 3.6: An Optimized Greedy Method for the Activity Selection Problem
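The optimized method in Figure 3.6 translates just as directly; the following Python sketch makes the same assumptions as the recursive sketch above (parallel lists of start and finish times, 0-based indices).

def select_activities_fast(s, f, start):
    # Sorting by finish time is the dominant O(n log n) step.
    order = sorted(range(len(s)), key=lambda i: f[i])
    selected = set()
    for i in order:  # a single sweep through the sorted activities
        if s[i] >= start:
            selected.add(i)
            start = f[i]
    return selected

For example, select_activities_fast([1, 3, 0], [2, 4, 6], 0) returns {0, 1}.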

If you are to consider optimization problems for which solutions might not exist (because there might not be any valid sets corresponding to an instance, or possibly because there are infinitely many valid sets but none of them “maximizes” or “minimizes” the value of the optimization function), then a few things will need to change:

• You’ll need to change the notion of “correctness” of a greedy algorithm — it should be acceptable for the algorithm to fail to find a solution and report this fact — but only when no solutions for the given instance exist at all.

• The outline of a “greedy algorithm” that has been given must be modified, very slightly: If it is not possible to make a greedy choice for some non-base instance, if a smaller instance can’t be formed, if the output of a recursive call is a report that there is no solution, or if a correct solution can’t be recovered after that, then the “main” algorithm should report that there is no solution (for the originally given instance).

• The definition of the “greedy choice property” should be modified, to make it clear that the claims it includes apply to any non-base instances I for which solutions exist. In particular, if there aren’t any solutions that correspond to a non-base instance I, then it is quite possible that there are no greedy choices for I, or that there are greedy choices that can’t be included in correct outputs (after all, there aren’t any correct outputs) — but the algorithm might still be “correct,” as defined above.

• The “optimal substructure” property should be qualified in a similar way (that is, the two constructions should be able to detect and report “failure” to do what they’re supposed to, when they’re given a non-base instance as input and that instance has no solutions).

If you have the time for it, and wish to test your understanding, then you should review these notes and see whether any additional changes would be necessary.

By the way, it should be clear that there are always correct outputs for well-formed instances of all versions of the “Optimal Fee” Problem. Unless it’s otherwise indicated, you’ll be allowed to assume that this is true of all the other optimization problems you’ll see, when considering Greedy Methods in CPSC 413, as well.


And, to conclude, here is one example of an “Optimization Problem” for which solutions might not exist: Consider the problem of computing a “Minimum Cost Spanning Tree” of an undirected graph (as defined, for example, in Chapter 24 of Cormen, Leiserson, and Rivest’s book), if it isn’t guaranteed that the input graph is connected.

3.11 Additional Examples

Brassard and Bratley [2], Cormen, Leiserson, and Rivest [5], Horowitz, Sahni, and Rajasekaran [6], and Neapolitan and Naimipour [9] all include chapters on Greedy Methods that include one or more longer examples; Cormen, Leiserson, and Rivest include additional examples in other chapters (notably, in a long section on “Graph Algorithms”) as well.

In particular, each of these references includes most or all of the following “greedy algorithms”; they’re all well-known examples of algorithms which computer science students should probably know about.

• There are at least two greedy methods for computing a “minimum cost spanning tree” of a graph — Prim’s algorithm and Kruskal’s algorithm.

• There is also a well-known greedy method to compute all “single-source” shortest paths — that is, the shortest paths from a given node in an undirected graph (with nonnegative distances on the edges) to each of the other nodes.

• There is a famous greedy method to design “Huffman codes” for messages. In this case, you are given a set of characters that can be included in messages, along with probabilities that each character might appear, and you wish to design an “encoding” of each character as a string of 0’s and 1’s so that each message can be encoded or decoded by sweeping over the message (or its code) from left to right, in such a way that the “expected length” of the code is as small as possible. (A small sketch of this method appears below.)
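As one illustration, here is a minimal Python sketch of the greedy step in Huffman’s method. It assumes the input is a dictionary mapping characters to probabilities (or frequencies); the representation and the names are mine, not taken from any of the references above.

import heapq

def huffman_code(freq):
    # Heap entries: (total probability, tiebreak counter, {char: code so far}).
    heap = [(p, i, {c: ""}) for i, (c, p) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Greedy choice: merge the two least probable subtrees.
        p0, _, zeros = heapq.heappop(heap)
        p1, _, ones = heapq.heappop(heap)
        merged = {c: "0" + code for c, code in zeros.items()}
        merged.update({c: "1" + code for c, code in ones.items()})
        heapq.heappush(heap, (p0 + p1, counter, merged))
        counter += 1
    return heap[0][2] if heap else {}

For example, huffman_code({"a": 0.5, "b": 0.3, "c": 0.2}) assigns “a” a one-bit code and “b” and “c” two-bit codes.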

This list isn’t complete but it should suggest (along with the examples in this chapter) that “greedy methods” have been used to find efficient solutions for problems in a variety of areas. As time permits, you may want to look at these other algorithms, in order to see more examples of greedy methods and (nontrivial) proofs of their correctness.

3.12 Exercises

Hints for some of the following exercises are given in the section after this one.

You’ll find most or all of these exercises in one or more of the textbooks mentioned above that include chapters on “greedy algorithms.” For example, Exercise 1, below, is a thinly disguised version of Problem 17.2-4 from Cormen, Leiserson, and Rivest [5]. Solutions for Exercises #2 and 3 can be found in Subsection 4.3.1.

1. “Professor Midas” drives an automobile from Calgary to Vancouver (and has already chosen a route to take). His car’s gas tank, when full, holds enough gas to travel n miles, and his map gives the distances between gas stations on his route.

The professor wishes to make as few gas stops as possible along the way.

Give an efficient method by which Professor Midas can determine at which gas stations he should stop, and prove that your strategy yields an optimal solution. You may assume that


the professor will be able to fill up the tank completely (if he should wish to) every time he stops. You may also assume that all gas stations are on the professor’s route, so that choosing one doesn’t change the distance that the professor needs to travel.

2. Consider the problem of making change for n cents using the smallest possible number of coins.

You may assume that k + 1 different kinds of coins are available, for some positive integer k, and that these coins have denominations 1, c, c^2, . . . , c^k for some other positive integer c. For example, if c = 5 and k = 3 then you could use coins worth 1¢, 5¢, 25¢, and 125¢ (that is, $1.25). You may also assume that an infinite supply of coins of each denomination is available.

Give a greedy method that can be used to make change for n cents (on input n) using as few coins as possible, and show that your approach yields an optimal solution. Your algorithm should take the integers n, k, and c as input and should report the number of coins used in each denomination in order to make change.

Then, if you want more of a challenge, try to do the same thing, assuming that you can make change using pennies, nickels, dimes, quarters, and loonies (that is, when your coins have denominations 1¢, 5¢, 10¢, 25¢, and 100¢ = $1.00).

3. Now consider a generalization of the above “coin changing” problem, in which the denominations of coins are not necessarily powers of some integer c. Instead, the inputs c and k are replaced by a sequence of k integers d1, d2, . . . , dk such that 1 = d1 < d2 < · · · < dk; these are the denominations of the coins you’re allowed to use.

Show that a greedy strategy, under which you always choose the highest denomination coin that you can (that is, the largest denomination that’s less than or equal to the amount of change you need to make) does not produce an optimal solution for this problem in some cases.

4. Describe an efficient greedy method that, given a set {x1, x2, . . . , xn} of points on the real line, determines the smallest set of unit-length intervals that contain all the given points, and prove that your method is correct.

5. Suppose that you have a set of activities to schedule among a large number of lecture theatres.

As with the “Activity Selection Problem,” you should assume that there are n activities, for some positive integer n, that these activities have names 1, 2, . . . , n, and that the ith activity has a start time si and a finish time ti associated with it. These times are nonnegative real numbers, and 0 ≤ si < ti for all i.

You should also assume that there are up to n lecture theatres available (so, there are at least as many lecture theatres available as you could possibly need), and that these are initially completely “unbooked.” You want to assign each of the activities to the lecture theatres so that each activity is assigned to exactly one of the lecture theatres, and (if we consider a lecture theatre to be “used” if at least one activity has been assigned to it) the number of lecture theatres used is as small as possible — subject to the constraint that you can’t assign two activities to the same lecture theatre if their times overlap. That is, you can’t assign two activities to the same theatre if the activities are “incompatible,” as this was defined for the Activity Selection Problem.

Here are two possible greedy methods to consider.


(a) Initially, consider all the activities to be “unassigned.” Consider the lecture theatres, one after another, and try to assign as many compatible activities to the current lecture theatre as you can — until a lecture theatre has been chosen for each of the given activities.

In other words: Repeatedly use a correct (greedy) algorithm for the “Activity Selection Problem” to assign as many of the currently unassigned activities to the current lecture theatre as you can. Continue doing this for as many lecture theatres as you need to, until there are no unassigned activities left.

(b) Associate a “time when available” with each lecture theatre. Initially this “time when available” will be zero for each theatre (so that this is less than or equal to the start time for every activity). Every time you assign an activity to a lecture theatre, you should adjust the “time when available” for that lecture theatre, so that it’s equal to the finish time for the latest activity that has been assigned to it. Since si < ti for all i, this adjustment will always increase the lecture theatre’s “time when available.”

Consider the activities in order of increasing start time. For each activity, assign it to a lecture theatre whose “time when available” is less than or equal to the activity’s start time, and as close to the activity’s start time as possible.

For each of these approaches, decide whether the approach corresponds to a correct greedy algorithm for the problem, and prove that your answer is correct.

3.13 Hints for Selected Exercises

Exercise #1: Consider a heuristic in which Professor Midas always waits as long as possible before stopping for gas. That is, the professor doesn’t stop for gas at any gas station if he still has enough gas to get to the next one (or to Vancouver).

Exercise #2: Consider a heuristic in which you always choose a coin with the largest denomination possible.
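If it helps, that heuristic can be sketched in a few lines of Python (the name and conventions here are mine; deciding when the heuristic is actually optimal is still the point of the exercise).

def greedy_change(n, denominations):
    # Assumes 1 is among the denominations, so change can always be made.
    counts = {}
    for d in sorted(denominations, reverse=True):
        counts[d], n = divmod(n, d)  # take as many of this coin as possible
    return counts

For instance, greedy_change(67, [1, 5, 25, 125]) returns {125: 0, 25: 2, 5: 3, 1: 2}.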

Exercise #3: Suppose k = 6, d1 = 1, d2 = 3, d3 = 7, d4 = 29, d5 = 43, d6 = 59, and n = 72.

Exercise #4: You should start by thinking about the best way to cover the leftmost of the input points.

Exercise #5: One of the methods is correct and the other is incorrect. To decide which is which, consider an instance of the problem in which n = 6, s1 = 1, t1 = 6, s2 = 1, t2 = 2, s3 = 3, t3 = 4, s4 = 5, t4 = 10, s5 = 7, t5 = 8, s6 = 9, and t6 = 10.


3.14 Sample Tests

One test on greedy methods was used in fall, 1996, and two were used in fall, 1997. These are given below.

You’ll notice that there was a lot to read! Indeed, some of the material (that didn’t reveal the question) was distributed before several of the tests, in order to reduce the time spent reading in the test itself.

A take-home test will be used in fall, 1998, in order to try to reduce any problems that the length of the questions might cause.

3.14.1 Class Test for 1996

Instructions:

Attempt all questions. Write answers on the question sheets.

No aids allowed.

Duration: 50 minutes.

1. (4 marks) Consider a version of the coin changing problem, in which you are given as input a sequence of “arbitrary” denominations d1, d2, . . . , dk of k kinds of coins, so that d1, d2, . . . , dk are integers and 1 = d1 < d2 < · · · < dk, and where you are also given an integer n ≥ 0 as input. You want to find nonnegative integers s1, s2, . . . , sk (where si is the number of coins of denomination di you choose) such that s1d1 + s2d2 + · · · + skdk = n (so, the total value of the coins is n) and such that s1 + s2 + · · · + sk is as small as possible (subject to the above constraint).

A proposed greedy strategy for this problem is to start by including a coin with the largest denomination possible — that is, to ensure that si ≥ 1, where i = k if n ≥ dk, and where i < k and di ≤ n < di+1 if 0 ≤ n < dk.

Prove that a “greedy algorithm” for this problem using this greedy strategy is incorrect. It will be useful to consider an instance in which k = 4, d1 = 1, d2 = 7, d3 = 15, d4 = 31, and n = 37.

2. Suppose, now, that you start at a city “c0” and that you wish to make a trip in which you visit the cities c1, c2, . . . , cn in this order. Suppose as well that, for 1 ≤ i ≤ n, there are hi choices of highways between city ci−1 and city ci, and that (for 1 ≤ j ≤ hi) the jth of these highways is di,j kilometres long. You want to pick a route from city c0 to city cn using these highways, such that the total distance traveled is as small as possible.

You can model this problem more formally as follows:


The Route Selection Problem

Input: Integer n ≥ 0,
    Integers h1, h2, . . . , hn > 0,
    Integers di,j > 0 for 1 ≤ i ≤ n and 1 ≤ j ≤ hi.

Output: A sequence 〈r1, r2, . . . , rn〉 of n integers such that 1 ≤ ri ≤ hi for 1 ≤ i ≤ n and such that d1,r1 + d2,r2 + · · · + dn,rn is as small as possible.

A “base instance” of this problem is an instance such that n = 0; the only correct output for such an instance is an empty sequence.

We’ll consider the “size” of an instance of this problem to be the value of the input n — that is, the number of cities (excluding c0) that are to be visited.

A proposed greedy strategy for this problem is to minimize the distance that must be traveled in order to get from city cn−1 to city cn — that is, to choose rn to be an integer such that 1 ≤ rn ≤ hn and such that dn,rn ≤ dn,ℓ for every integer ℓ such that 1 ≤ ℓ ≤ hn.

(10 marks) State and prove the greedy choice property, for this problem and greedy strategy.

Make your statement of the property as specific (to this problem) as you can. For full marks, your proof should be (almost) as formal and as detailed as the proofs of the “greedy choice property” that have been included in examples during recent lectures — a one-line proof isn’t good enough!

3. (6 marks) Give a correct and asymptotically efficient (greedy) algorithm that solves the Route Selection Problem, which was considered in the previous question.

The following constructions can be used to establish that the “Optimal Substructure Property” also holds for this problem and the greedy strategy given in Question 2. If you wish to, you may use these constructions for Question 3 without proving that they are correct.

Construction #1

Input: An instance of the Route Selection Problem such that n > 0.

Output: A smaller instance (including n′, h′i for 1 ≤ i ≤ n′, and d′i,j for 1 ≤ i ≤ n′ and 1 ≤ j ≤ h′i) of the same problem.

n′ := n − 1;
h′i := hi for 1 ≤ i ≤ n′;
d′i,j := di,j for 1 ≤ i ≤ n′ and 1 ≤ j ≤ h′i.
That is: “discard” the last city and the highways leading to it.


Construction #2

Input: The input for Construction #1,
    A greedy choice (an integer rn such that 1 ≤ rn ≤ hn, chosen using the
        strategy described for Question #2), and
    A correct solution S′ = 〈r′1, r′2, . . . , r′n−1〉 for the smaller instance
        produced using Construction #1.

Output: A correct solution S for the original instance.

S := 〈r′1, r′2, . . . , r′n−1, rn〉; that is, obtain S by appending the greedy choice rn to the sequence S′.
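For reference, the entire strategy unwinds into a single loop over the stages. The following Python sketch is one possible shape for an answer to Question 3; the names and the 0-based indexing are mine, with d given as a list whose ith entry lists the highway lengths from city ci to city ci+1.

def select_route(d):
    # Greedy choice at each stage: an index of a shortest highway into the
    # next city; the stages are independent, so the recursion of the general
    # method flattens into one pass over the groups of highways.
    return [min(range(len(lengths)), key=lambda j: lengths[j])
            for lengths in d]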

4. (Bonus Question) For up to five bonus marks, prove that the “Optimal Substructure Property” is satisfied for this problem and greedy strategy.

3.14.2 First Class Test for 1997

Instructions:

Attempt all questions. Write answers on the question sheets.

No aids allowed.

Duration: 50 minutes.

Definition of a Problem. A clique in an undirected graph G = (V, E) is a subset V′ of the vertices V such that there is an edge (u, v) ∈ E in the graph connecting each pair of distinct nodes u, v ∈ V′.

The MaxClique problem takes an undirected graph G = (V, E) as input and returns a subset V′ ⊆ V as output. An output set V′ ⊆ V is valid if and only if V′ is a clique in the input graph. An output set V′ ⊆ V is optimal if and only if it is valid and |V′| ≥ |V′′| for every (other) clique V′′ ⊆ V.

As usual, an output is “correct” if and only if it is valid and optimal. Thus, this problem takes an undirected graph and returns a clique that is as large as possible as its (correct) output.

We will consider the size of an instance of this problem to be the number of vertices in the input graph. That is, the “size” of an instance G = (V, E) is the size |V| of the set of vertices V.

Additional Useful Definitions. The degree of a vertex v ∈ V is the number of edges in E that connect v to other vertices in the graph. This is the same as the number of neighbours of v — other vertices that are connected by an edge to v — in the graph.

If U ⊆ V is a subset of the vertices in the graph G = (V, E) then the induced subgraph on the vertices in U is the graph G′ = (U, E′) that includes the vertices in U and all the edges in the original graph that connect nodes in U. That is, if u, v ∈ U then (u, v) is an edge in E′ if and only if (u, v) is an edge in E.


Assumptions You’re Allowed to Make. When writing pseudocode you may assume that there are efficient subroutines for the following operations, and you may use these subroutines without writing code for them:

• Computation of the number of vertices in the input graph.

• Computation of the number of edges in the input graph.

• Iteration over all the vertices in G, so that you can write a for loop with the structure

for v ∈ V do
    (Code for loop body)
end for

• Computation of the degree of any given vertex.

• Computation of the set of neighbours of any given vertex.

• Computation of the induced subgraph corresponding to any subset U of the set of vertices V in G.

• Deciding whether a given pair of vertices are neighbours — that is, deciding whether (u, v) ∈ E for any given pair of vertices u, v ∈ V.

Now, consider the MaxClique problem defined above.

1. (1 mark) What output should be returned when the input instance G = (V, E) has “size 0” as defined above — so that V = ∅?

2. (2 marks) Suppose again that the input graph G = (V, E) has “size” greater than zero, so that V is nonempty, and suppose you’ve decided that some node v ∈ V should be included in the output clique. If another node u is not a neighbour of v then should u be included in the output as well? Why, or why not?

Note: This should be easy!

3. (1 mark) Based on your answers for the above questions, describe the output that should be returned if the input graph has size greater than zero (that is, V is nonempty) but there are no edges in the graph.

4. (8 marks) Design and give pseudocode for a greedy method for this problem in which a “greedy choice” is a vertex whose degree is as large as possible. Your method should generate output that is valid — it should definitely be a clique — but it might not necessarily be optimal.

Don’t try to optimize this code — write it as a recursive function maxClique that takes an undirected graph as its input and returns a clique as its output value.

Note: It’s possible that you won’t have to write very much besides the pseudocode, in order to answer this question!
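One possible shape for such a function, sketched here in Python with the graph represented as a dictionary mapping each vertex to its set of neighbours; this stands in for the abstract subroutines listed above, and is not necessarily the intended answer.

def max_clique(adj):
    # adj: dict mapping each vertex to the set of its neighbours.
    if not adj:                        # the size-0 instance
        return set()
    # Greedy choice: a vertex of maximum degree.
    v = max(adj, key=lambda u: len(adj[u]))
    # Any vertex outside v's neighbourhood cannot join a clique
    # containing v, so recurse on the induced subgraph.
    neighbours = adj[v]
    induced = {u: adj[u] & neighbours for u in neighbours}
    return {v} | max_clique(induced)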


5. (8 marks) Prove that the algorithm you gave in your answer for the previous question — or, indeed, any greedy method using the greedy choice mentioned in that question — is incorrect.

The following graph G = (V, E) (such that V = {A, B, C, D, E, F, G} and E includes seven edges) will probably be useful as you try to answer this question.

[Figure: an undirected graph on the vertices A, B, C, D, E, F, and G, with seven edges.]

6. (5 marks) Finally, sketch a brief proof that your algorithm always terminates and that it always returns a valid output set (that is, a clique) as its output. You may assume that the subroutines mentioned at the beginning of this test are all correct.

Hint: Consider using induction on the size of the input.

3.14.3 Second Class Test for 1997

Instructions:

Attempt all questions. Write answers on the question sheets.

No aids allowed.

Duration: 50 minutes.

Description of Useful Concepts and Notation

Note: The material in this subsection was distributed the day before the quiz was written.

This section describes a situation (or “area”) that is discussed (in some detail) in Computer Science 457; those of you who have taken Computer Science 457 already should find it to be familiar, and those of you who haven’t are going to get to see it, a bit ahead of time.

It also introduces some additional notation that is needed to define some problems in this area.

Memory Management. Consider an organization of a computer’s memory into

• a main memory that consists of a set of n “pages,” with names 1, 2, . . . , n;


• a cache that always contains copies of exactly k of these pages (where 1 ≤ k ≤ n), so that the contents of the cache at any one time can be represented by a subset C of {1, 2, . . . , n} whose size is k.

Suppose that the following happens when a request is made for information that is stored on page i (for 1 ≤ i ≤ n), or when you try to write information to this page:

If there is currently a copy of page i in the cache (if i ∈ C), then the information is read from or written to this copy of the page — and no other changes are made.

Otherwise it is necessary to move a copy of the page into the cache first. Therefore,

1. Some other page j with a copy in the cache already (so that j ∈ C) is selected; this copy is used to update the “real” page j in main memory.

2. A new copy of page i is created and overwrites the old copy of page j. In effect, the set of pages with copies in the cache has been changed to be the set

(C \ {j}) ∪ {i}.

3. The required information is read from, or written to, the copy of page i in the cache that has just been created.

In this second case — when we need to create a new copy of page i — we say that a page fault has occurred.

By the way, there is more than one way to choose the integer j, above: You could choose any one of the integers j ∈ C in order to do what has been described.

Definitions and Notation. Now let integers n and k be as above, so that 1 ≤ k ≤ n.

Let C0 be a subset of {1, 2, . . . , n} whose size is k. C0 represents the set of pages with copies in the cache at the beginning of a sequence of requests.

Let m be a nonnegative integer: m ≥ 0.

Let r1, r2, . . . , rm be a sequence of integers between 1 and n (not necessarily distinct): These represent a sequence of pages of memory that we wish to access, in this order.

Next, let p1, p2, . . . , pm be another sequence of integers, of the same length m.

This sequence, p1, p2, . . . , pm, represents a valid way to meet the requests r1, r2, . . . , rm when the contents of the cache is initially the set of pages in C0, if the following is true (and these requirements also define a sequence of sets C1, C2, . . . , Cm, representing the contents of the cache after requests are served): For all i such that 1 ≤ i ≤ m,

1. pi ∈ Ci−1 (in particular, p1 ∈ C0);

2. if ri ∈ Ci−1 then pi = ri and Ci−1 = Ci;

3. if ri ∉ Ci−1 then pi ∈ Ci−1 (so, clearly, pi ≠ ri), and Ci = (Ci−1 \ {pi}) ∪ {ri}. Note that Ci−1 ≠ Ci in this case.

This means that pi is the page whose copy you remove from the cache in order to add the page ri, if you need to.

If Ci ≠ Ci−1 then a page fault has occurred at time i (and a page fault has not occurred at this time, otherwise).


An Example. Suppose that n = 5, k = 3, C0 = {1, 2, 3} (so, C0 ⊆ {1, 2, . . . , 5}, as needed), m = 2, r1 = 4 and r2 = 3.

A First Case: If p1 = 3 and p2 = 1, then

• r1 ∉ C0, but p1 ∈ C0 (as needed), so C1 = (C0 \ {p1}) ∪ {r1} = {1, 2, 4} ≠ C0;

• r2 ∉ C1, but p2 ∈ C1 (as needed), so C2 = (C1 \ {p2}) ∪ {r2} = {2, 3, 4} ≠ C1.

Thus the sequence p1, p2 is “valid,” and two page faults occur when it is used.

A Second Case: Suppose instead that p1 = 2 and p2 = 3; then

• r1 ∉ C0, but p1 ∈ C0 (as needed), so C1 = (C0 \ {p1}) ∪ {r1} = {1, 3, 4} ≠ C0;

• Now, r2 ∈ C1, so p2 = r2 (as needed), and C2 = C1.

Thus this sequence p1, p2 is also “valid,” and only one page fault occurs (at time 1) when it is used.

A Third Case: Suppose next that p1 = 5 and p2 = 3; then this sequence is not valid, because r1 ∉ C0 but p1 is not in C0, either.

A Fourth Case: Finally, suppose p1 = 2 and p2 = 4; then this is also not valid, but for a different reason: The set C1 is defined to be {1, 3, 4}, just as for the second example. Thus r2 ∈ C1 in this case, but p2 is not equal to r2 in spite of the fact that this is required.
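The four cases above can be checked mechanically. Here is a small Python sketch (the name and conventions are mine) that decides whether a sequence p is valid for the requests r and initial cache C0, returning the number of page faults it causes, or None if it is not valid.

def check_sequence(C0, r, p):
    # Assumes len(p) == len(r).
    C = set(C0)
    faults = 0
    for ri, pi in zip(r, p):
        if ri in C:
            if pi != ri:             # requirement 2: pi must equal ri
                return None
        else:
            if pi not in C:          # requirements 1 and 3: must evict a cached page
                return None
            C = (C - {pi}) | {ri}    # swap pi out and ri in
            faults += 1
    return faults

On the instance above, check_sequence({1, 2, 3}, [4, 3], [3, 1]) returns 2, check_sequence({1, 2, 3}, [4, 3], [2, 3]) returns 1, and the third and fourth cases both return None.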

The Problem To Be Solved

Needed for Question #1: A Definition. Consider an optimization problem, “Offline Memory Management,” that is defined as follows.

An instance of this problem includes

• positive integers n and k representing the number of pages in main memory and the size of a cache, respectively;

• A set C0 ⊆ {1, 2, . . . , n} of size k, representing the initial contents of the cache;

• A sequence r1, r2, . . . , rm of integers between 1 and n (not necessarily distinct), representing a sequence of pages from memory that are to be accessed in order.

An output is a sequence p1, p2, . . . , pm of integers between 1 and n of the same length m as the sequence of requests. This is valid if it meets the requirements for a “valid” sequence that were given above. This is optimal if it is valid and the number of page faults that occur when it is used is as small as possible.

An Example (Which Might Not Be Needed). An instance of this problem was defined in the “example” at the end of the previous section, and four possible sequences of outputs were given after that.

The last two are not valid at all, for the reasons given above.

The first is valid — but it isn’t optimal, because the second is also valid, and causes one fewer page fault than the first.

It turns out that the second is optimal (and, therefore, a correct output). You can’t ever avoid having a page fault at time 1 for this instance since r1 ∉ C0 — and this is the only page fault that occurs when this sequence is used.


Needed for Question #1: Size. The size of the instance will be defined to be m — the number of requests for memory accesses that it includes.

A Greedy Algorithm — Base Instances. These are instances such that m = 0; an “empty sequence” of integers (that is, one of length zero) is a correct output for any such instance.

A Greedy Algorithm — Greedy Choice. A “greedy choice” will define the first integer p1 that is listed in the output sequence.

If r1 ∈ C0 then p1 = r1 (because no other choice is possible). If r1 ∉ C0 then either

• p1 is some element of C0 such that p1 ∉ {r1, r2, . . . , rm}, so that no memory access is ever made for page p1 at all, or

• if no such element p1 of C0 exists (because all are requested), p1 is the element of C0 whose first request comes after all the other pages in C0 have already been requested.

In other words (in this second case), there is an integer i such that p1 = ri, p1 ≠ rj for 1 ≤ j ≤ i − 1, and, for every other element q of C0, there exists an integer j such that 1 ≤ j ≤ i − 1 and rj = q.

Needed for Question #2: Defining a Smaller Instance. Once this “greedy choice” p1 has been made you can define a smaller instance of the same problem by

• leaving the values of n and k unchanged,

• replacing C0 by C′0 = C1 = (C0 \ {p1}) ∪ {r1} — the contents of the cache you’d get after serving the first request (using the greedy choice);

• using the sequence of requests r2, r3, . . . , rm of length m − 1 (that is, removing r1 from the front of the sequence).

Needed for Question #2: Recovering a Solution of the Original Instance. If the sequence p′1, p′2, . . . , p′m−1 is a correct solution for the “smaller instance” that’s just been defined, then the sequence

p1, p′1, p′2, . . . , p′m−1

of length m should be returned as a solution for the original instance of the problem.

The Test Questions

Consider the “Offline Memory Management” problem and the greedy algorithm for it that is described above.

1. (15 marks) State the Greedy Choice Property that you must prove in order to show that this algorithm is correct, and then prove that this property is satisfied.

You may use the following fact without having to prove it:

Lemma: Given a non-base instance (including m requests, where m > 0), a greedy choice p′1 for this instance, and a valid sequence p1, p2, . . . , pm for this instance, there exists (another) valid sequence p′1, p′2, . . . , p′m for the same instance that starts with the greedy choice p′1, and that does not cause any more page faults than the sequence p1, p2, . . . , pm does.

2. (10 marks) Describe, as specifically, and in as much detail as you can, what you would need to do in order to show that the Optimal Substructure Property is also satisfied for this problem as well. It is not necessary to give a proof of this property, in order to answer this question.

3. (5 marks) Very Challenging Bonus Question: Try to prove that the “lemma” given in Question 1 is correct.

This is much more difficult than the other questions, so you shouldn’t spend any time on it until you’ve finished the rest!


Part II

Solutions for Selected Exercises and Tests


Chapter 4

Solutions for Algorithm Design Exercises and Tests

4.1 Divide and Conquer

4.1.1 Solutions for Selected Exercises

Solution for Exercise #1 in Section 1.9

Solution for Part (a): This problem requires a recursive algorithm to produce a balanced binary search tree storing the first n positive integers, given n as input, and its analysis.

Suppose that a binary tree is to be given by a pointer to the root node of the tree, and that nodes are structures (or objects, or records) whose components include the “value” stored, and a “left child” and “right child,” each of which is a pointer to another node.

A sketch of the algorithm to be written was also given in the question. In order to solve part of the problem using this algorithm, it is useful to write an auxiliary function, add to value, which takes an integer k and a pointer p to a node as input, and which adds k to the value of every node in the subtree whose root is pointed to by p. In the following pseudocode, “↑node” will be used to declare something to be a pointer to a node and, if p has this type, then ∗p will represent the node that p points to.

add to value(k : integer, p : ↑node)

if p ≠ null then
    ∗p.value := ∗p.value + k
    add to value(k, ∗p.left child)
    add to value(k, ∗p.right child)
end if

It can be argued that if the pointer p points to a binary tree of size s (for any integer s ≥ 0), then the above function uses Θ(s) pointer dereferences and accesses of data, arithmetic operations, and assignments of values to variables.

Now, a function build tree, which takes a positive integer n as input and returns a pointer to a balanced binary search tree storing the first n positive integers, corresponds to the pseudocode shown in Figure 4.1, below.


build tree(n : integer) : ↑node

p, q1, q2 : ↑node; middle : integer

if n ≤ 0 then
    return null
else
    middle := ⌊(n + 1)/2⌋
    q1 := build tree(middle − 1)
    q2 := build tree(n − middle)
    add to value(middle, q2)
    new(p)
    ∗p.value := middle
    ∗p.left child := q1
    ∗p.right child := q2
    return p
end if

Figure 4.1: Algorithm build tree
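If it helps to see this outside of pseudocode, here is a minimal Python transcription, with node objects standing in for pointers (the class and its names are mine).

class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def add_to_value(k, p):
    # Add k to the value stored at every node of the subtree rooted at p.
    if p is not None:
        p.value += k
        add_to_value(k, p.left)
        add_to_value(k, p.right)

def build_tree(n):
    # A balanced binary search tree storing 1, 2, ..., n.
    if n <= 0:
        return None
    middle = (n + 1) // 2
    q1 = build_tree(middle - 1)
    q2 = build_tree(n - middle)
    add_to_value(middle, q2)   # shift the right subtree's values upward
    return Node(middle, q1, q2)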

Solution for Part (b): Let T(n) be the number of operations used by the function build tree, when it is given a positive integer n as input. Since this function calls the function add to value with a pointer to a binary tree of size n − middle = n − ⌊(n+1)/2⌋ = ⌈(n−1)/2⌉ as input, and since middle − 1 = ⌊(n−1)/2⌋, it can be shown that T(n) satisfies a recurrence

T(n) ≤ c1 if n ≤ 1,
T(n) ≤ T(⌊(n−1)/2⌋) + T(⌈(n−1)/2⌉) + c2·n if n ≥ 2,

for positive constants c1 and c2, and that it also satisfies a recurrence

T(n) ≥ c′1 if n ≤ 1,
T(n) ≥ T(⌊(n−1)/2⌋) + T(⌈(n−1)/2⌉) + c′2·n if n ≥ 2,

for (smaller) positive constants c′1 and c′2. Now, suppose that R(n) satisfies the recurrence

R(n) = 1 if n ≤ 1,
R(n) = R(⌊(n−1)/2⌋) + R(⌈(n−1)/2⌉) + n if n ≥ 2.

Then, it can be shown that T(n) ∈ Θ(R(n)), so it’s sufficient to find an asymptotic bound in closed form for R(n) instead.

Exercise: If you’d like to have more practice, use the substitution method to prove that T(n) ∈ Θ(R(n)).

It can also be shown that R(n) is a nondecreasing function of n.

Exercise: If you’d like to have more practice in the use of mathematical induction, then prove that R(n) ≤ R(n + 1) for every integer n ≥ 1 using induction on n.


Solution for Part (c): The fact that R is a nondecreasing function implies that R(⌊(n−1)/2⌋) ≤ R(⌈n/2⌉) and that R(⌈(n−1)/2⌉) ≤ R(⌈n/2⌉), and that if S(n) satisfies the recurrence

S(n) = 1 if n ≤ 1,
S(n) = 2S(⌈n/2⌉) + n if n ≥ 2,

then R(n) ≤ S(n) for every integer n ≥ 1 — so that, in particular, R(n) ∈ O(S(n)).

Now, the “master theorem” can be used to establish that S(n) ∈ Θ(n log n), and this implies that R(n) ∈ O(n log n), and hence that T(n) ∈ O(n log n) as well.

We would also like to show that R(n) ∈ Ω(n log n). To do this, note first that since R(n) is an increasing function, R(⌈(n−1)/2⌉) ≥ R(⌊(n−1)/2⌋), and that if U(n) satisfies the recurrence

U(n) = 1 if n ≤ 1,
U(n) = 2U(⌊(n−1)/2⌋) + n if n ≥ 2,

then R(n) ≥ U(n) for every integer n ≥ 1. The recurrence for U(n) is still not in the form required so that we can apply the master theorem (although it’s getting closer). Now, let V(n) = U(n − 1) for all n. Then V(n) = U(n − 1) = 1 if n ≤ 2, and if n > 2 then

V(n) = U(n − 1)
     = 2U(⌊(n−2)/2⌋) + n − 1     (since n − 1 ≥ 2)
     = 2U(⌊n/2⌋ − 1) + n − 1     (since ⌊n/2 − 1⌋ = ⌊n/2⌋ − 1)
     = 2V(⌊n/2⌋) + n − 1.

Now, finally, set

W(n) = 1/2 if n ≤ 1,
W(n) = 2W(⌊n/2⌋) + n − 2 if n ≥ 2.

Then, W(1) = 1/2 < V(1), W(2) = 1 ≤ V(2), and one can show using induction on n that V(n) ≥ W(n) for every integer n ≥ 1.

(Note that W(1) was defined to be 1/2 instead of 1 in order to ensure that all these inequalities are satisfied: If W(1) had been set to be 1 and W(2) had been defined using the recursive definition shown above, then W(2) would have turned out to be greater than V(2), so that the above claim “V(n) ≥ W(n) for every integer n ≥ 1” would have been false.)

Now (at last) the recurrence for W(n) has a form that allows us to apply the master theorem. This theorem implies that W(n) ∈ Θ(n log n). Since V(n) ≥ W(n) for all n ≥ 1, it follows that V(n) ∈ Ω(n log n). Since U(n) = V(n + 1), U(n) ∈ Ω((n + 1) log(n + 1)) = Ω(n log n), too. Finally, since R(n) ≥ U(n) for every positive integer n, it follows that R(n) ∈ Ω(n log n).

Since we have already shown that R(n) ∈ O(n log n), it follows that R(n) ∈ Θ(n log n). Thus, Θ(R(n)) = Θ(n log n), and we can conclude that T(n) (the running time of the algorithm we started with) is in Θ(n log n), since we had already observed that T(n) ∈ Θ(R(n)).

Exercise: If you want still more practice, try to confirm directly that T(n) ∈ Θ(n log_2 n) without performing changes of recurrences, and so on, but by working with the original recurrences (shown at the start of these solutions) instead.
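If you would like to sanity-check the bound numerically, the recurrence for R(n) is easy to evaluate directly. In the throwaway Python check below, the printed ratios should level off near a constant if R(n) ∈ Θ(n log n); the memoization is there only to keep the evaluation fast.

from functools import lru_cache
from math import log2

@lru_cache(maxsize=None)
def R(n):
    # R(n) = R(floor((n-1)/2)) + R(ceil((n-1)/2)) + n for n >= 2;
    # note that ceil((n-1)/2) == n // 2.
    if n <= 1:
        return 1
    return R((n - 1) // 2) + R(n // 2) + n

for n in (2**10, 2**14, 2**18):
    print(n, R(n) / (n * log2(n)))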


Solution for Exercise #2 in Section 1.9

This question required an algorithm that generalizes the one obtained in the solution for Exercise #1 above and the analysis of the new algorithm.

In particular, we now wish to write an algorithm which accepts two integers n and k as inputs (with n being nonnegative) and which produces a balanced binary search tree with nodes k, k + 1, . . . , k + n − 1 as output.

An algorithm that solves this problem (and agrees with the description given in the question) is shown in Figure 4.2, below.

build tree(n, k : integer) : ↑node

p, q1, q2 : ↑node; middle : integer

if n ≤ 0 then
    return null
else
    middle := ⌊(n − 1)/2⌋
    q1 := build tree(middle, k)
    q2 := build tree(n − middle − 1, k + middle + 1)
    new(p)
    ∗p.value := k + middle
    ∗p.left child := q1
    ∗p.right child := q2
    return p
end if

Figure 4.2: Algorithm build tree
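In Python, the generalized algorithm might look as follows; this sketch reuses the Node class from the solution for Exercise #1 above.

def build_tree(n, k):
    # A balanced binary search tree storing k, k+1, ..., k+n-1,
    # built without any after-the-fact adjustment of node values.
    if n <= 0:
        return None
    middle = (n - 1) // 2
    left = build_tree(middle, k)
    right = build_tree(n - middle - 1, k + middle + 1)
    return Node(k + middle, left, right)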

In order to convince yourself that the algorithm is correct, consider the cases n = 0, n = 1, and n = 2, and then try to prove using induction on n that the algorithm always returns the required binary search tree, after that.

It isn’t too hard to see that this algorithm has a running time T(n) that’s a function of the input n and that satisfies a recurrence

T(n) = c1 if n ≤ 0,
T(n) = T(⌊(n−1)/2⌋) + T(⌈(n−1)/2⌉) + c2 if n ≥ 1,

for positive constants c1 and c2, assuming that pointer dereferences, reading the current values of variables, assignments of values to variables, and integer arithmetic are all counted as having unit cost.

Using this recurrence it’s easy to confirm that T(0) = c1, T(1) = c2 + 2c1, T(2) = 2c2 + 3c1, and T(3) = 3c2 + 4c1. After this you might guess that

T(n) = c2·n + (n + 1)·c1 = (c1 + c2)·n + c1

for every integer n ≥ 0 — and it’s easy to use induction on n to prove that this is correct.


An alternative strategy (for finding a solution for the above recurrence) would be to “guess” that it has the same asymptotic complexity as a function U(n) (that is, it’s in Θ(U(n))), where

U(n) = c1 if n ≤ 1,
U(n) = 2U(⌊n/2⌋) + c2 if n ≥ 2,

so that U(n) satisfies a similar recurrence, but also so that an asymptotic bound for U(n) can be obtained using the master theorem. The master theorem could then be used to confirm that U(n) ∈ Θ(n), and the substitution method could then be used to prove that T(n) ∈ Θ(n) as well.

Thus there are several ways to prove that T(n) ∈ Θ(n) (assuming the unit cost criterion as it’s given above).

Note that this implies that the above “generalized” algorithm is asymptotically more efficient than the algorithm that was given as an answer for the previous question, as this question suggested.


4.2 Dynamic Programming

4.2.1 Solutions for Selected Exercises

Solution for Exercise #1 in Section 2.7

The Problem. In this question you were asked to design an algorithm that used Dynamic Programming to compute the binomial coefficient C(n, i), for n > 0 and 0 ≤ i ≤ n, given n and i as inputs, using only the recursive definition

C(n, i) = 1 if i = 0 or i = n,
C(n, i) = C(n−1, i−1) + C(n−1, i) if 1 ≤ i ≤ n − 1.

Establishing the Optimal Substructure Property. In this case, this is easy, because a recursive definition of C(n, i) has already been given. Furthermore the definition states C(n, i) directly if n = 0 and defines C(n, i) in terms of values C(n−1, j) (for j = i − 1 and j = i) when n > 0, so that it is clear that a recursive function that simply implemented this definition would terminate, whenever given integers n ≥ 0 and i such that 0 ≤ i ≤ n as input.

Establishing the Overlapping Subproblems Property. Since the recursive definition defines C(n, i) in terms of values C(n−1, j) where 0 ≤ j ≤ i, and since it gives a value for C(m, m) directly whenever m ≥ 0, it should be clear that the only instances of the problem that arise, when you try to compute C(n, i) using this definition, are instances in which you’re asked to compute C(m, j), for integers m and j such that 0 ≤ m ≤ n and 0 ≤ j ≤ min(m, i).

Therefore, there are only Θ(ni) instances of this problem that are formed and solved when the above recursive definition is used to compute C(n, i).

Design Step 1: Characterizing Instances to be Solved and Determining an Order of Solution. The set of instances to be solved was identified above. There are several orders you could choose from, in order to solve these, but it’s probably simplest if the algorithm proceeds in n + 1 “stages.” In the hth stage (for h = 0, 1, 2, . . . , n), the algorithm should compute C(h, j) for 0 ≤ j ≤ min(h, i) in some order (it won’t matter which). To keep the code as simple as possible, one might as well compute the values C(h, 0), C(h, 1), C(h, 2), . . . , C(h, s) for s = min(h, i) in this order, during the hth stage.

Design Step 2: Choosing a Data Structure. A two-dimensional array with n + 1 rows and i + 1 columns, whose (r, s)th position will be used to store C(r, s) for 0 ≤ r ≤ n and 0 ≤ s ≤ min(r, i), can be used as the data structure needed to maintain solutions of instances for this computation.

Design Step 3: A First Algorithm. An algorithm that is based on these choices is shown in Figure 4.3. For the sake of completeness this returns a value (0) if its inputs are out of the ranges given above; it might be appropriate to return an error in these cases instead.

It should be clear that this uses O(n^2) steps on inputs n and i in the worst case (namely, i = n), assuming that the “unit cost criterion” is used.


function binom(n, i: integer): integer

if (n < 0) or (i < 0) or (i > n) then
    return 0
elsif (n = 0) or (i = 0) or (i = n) then
    return 1
else
    B: array[0 . . . n, 0 . . . i] of integer
    r, s: integer
    for r := 0 . . . n do
        for s := 0 . . . min(r, i) do
            if (s = 0) or (s = r) then
                B[r, s] := 1
            else
                B[r, s] := B[r − 1, s − 1] + B[r − 1, s]
            end if
        end for
    end for
    return B[n, i]
end if

end function

Figure 4.3: A Dynamic Programming Algorithm for Computing a Binomial Coefficient
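A Python transcription of Figure 4.3, kept close to the pseudocode (everything beyond the figure’s own names is a sketch of mine):

def binom(n, i):
    if n < 0 or i < 0 or i > n:
        return 0
    if n == 0 or i == 0 or i == n:
        return 1
    # B[r][s] holds C(r, s); the rows are filled stage by stage.
    B = [[0] * (i + 1) for _ in range(n + 1)]
    for r in range(n + 1):
        for s in range(min(r, i) + 1):
            if s == 0 or s == r:
                B[r][s] = 1
            else:
                B[r][s] = B[r - 1][s - 1] + B[r - 1][s]
    return B[n][i]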

Design Step 4: Conserving Storage Space. The above algorithm uses Θ(n^2) storage space in the worst case (n = i), provided that the “unit cost criterion” is used to measure space as well as time.

Note that the value C(r, s) is never used by this function after all the values C(r+1, t) have been computed for 0 ≤ t ≤ min(r + 1, i). Therefore the algorithm doesn’t really need to remember the entries in more than two (adjacent) rows of the table stored in the array at any time, and a slightly more complicated algorithm that uses only a linear amount (O(n)) of storage space, and at most twice as much time (with the extra time needed to copy elements back and forth between data structures), could also be developed.

You should consider writing the pseudocode for this improved algorithm as an exercise, if it isn’t obvious what this should look like. (One possible rendering is sketched below.)
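One way the space-saving version might look in Python (again a sketch of mine, not the notes’ own code):

def binom_linear_space(n, i):
    if n < 0 or i < 0 or i > n:
        return 0
    prev = [1]                      # row 0 of the table: C(0, 0)
    for r in range(1, n + 1):
        cur = [0] * (min(r, i) + 1)
        for s in range(len(cur)):
            if s == 0 or s == r:
                cur[s] = 1
            else:
                cur[s] = prev[s - 1] + prev[s]
        prev = cur                  # only two adjacent rows are ever kept
    return prev[i]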

More Information about Algorithm Analysis. Of course the “unit cost criterion” is unrealistic, especially when applied to algorithms that manipulate large integers (like this one).

If a more realistic cost measure is used, then integer addition should be considered to be a “linear time” operation — it requires time that is linear in the lengths of the representations of its inputs. This is true for comparisons of integers, as well.

The output returned by this function is an integer whose (decimal or binary) representation has length at most linear in n, and it can be argued (since they are all binomial coefficients) that all the “intermediate values” manipulated by this algorithm are integers whose representations have at most this length, as well.

Using this information, you should be able to prove that the above (first) algorithm could be implemented to use O(n^3) operations on digits (or bits), and to require O(n^3) storage space (if a digit or bit required unit storage space), on inputs n and i. The improved algorithm mentioned after that would also use O(n^3) operations on digits or bits, but only O(n^2) storage space.

Solution for Exercise #2 in Section 2.7

The Problem. In this question, you were asked to produce an efficient algorithm for a modified version of the “optimal fee” problem discussed in Chapter 2, in which you were allowed to choose two jobs in a row, but not three.

Establishing the Optimal Substructure Property. As for the original problem, suppose best set[i] is a solution for the problem when you are given the first i jobs (and their fees) as inputs, and let best fee[i] be the sum of the fees of all the jobs in the set best set[i], for 0 ≤ i ≤ n. Then,

• best set[i] ⊆ {1, 2, . . . , i},

• for 1 ≤ j ≤ i − 2, if j ∈ best set[i] and j + 1 ∈ best set[i], then j + 2 ∉ best set[i],

• best fee[i] = Σ_{j ∈ best set[i]} fee[j], and

• if S ⊆ {1, 2, . . . , i} such that j, j + 1, and j + 2 are not all in S for any integer j, then Σ_{j ∈ S} fee[j] ≤ best fee[i].

Now, since it is given that fee[j] ≥ 0 for all j, we can correctly set the sets best set[0], best set[1], and best set[2] and the corresponding values best fee[0], best fee[1], and best fee[2] to be as follows:

• best set[0] = ∅ and best fee[0] = 0;

• best set[1] = {1} and best fee[1] = fee[1];

• best set[2] = {1, 2} and best fee[2] = fee[1] + fee[2].

Consider the possible choices of best set[i] when i ≥ 3.

If i − 1 ∉ best set[i], then we may assume without loss of generality that i ∈ best set[i], because i could be added to this set without violating any of the constraints given above for best set[i]. Therefore, there are three cases that should be considered (corresponding to decisions about whether to include i − 1 and i):

(a) i ∈ best set[i] and i− 1 ∈ best set[i];

(b) i ∈ best set[i], but i− 1 /∈ best set[i];

(c) i− 1 ∈ best set[i], but i /∈ best set[i].

In the first case (i ∈ best set[i] and i − 1 ∈ best set[i]), it is not possible for i − 2 to belong to best set[i]: Otherwise, j, j + 1, and j + 2 would all belong to this set, for j = i − 2. Therefore, in this first case, it can be shown that (we may assume, without loss of generality)

best set[i] = best set[i − 3] ∪ {i − 1, i}

and

best fee[i] = best fee[i − 3] + fee[i − 1] + fee[i].

In order to prove this, suppose that a set S has been chosen as “best set[i]” by some means, and that this first case holds. Then i ∈ S, i − 1 ∈ S, i − 2 ∉ S (as argued above), and if S′ = S \ {i, i − 1} then S′ is one of the sets that should have been considered as a candidate when choosing best set[i − 3]. That is, it satisfies the first two constraints given for best set[i − 3]. Now, since the set best set[i − 3] must satisfy the fourth condition given in the same list, it follows that

Σ_{j ∈ S′} fee[j] ≤ best fee[i − 3],

by the definitions of best set[i − 3] and best fee[i − 3]. Since the sum of the fees of jobs in S is fee[i − 1] + fee[i] plus the sum of the fees of jobs in S′, it follows that

Σ_{j ∈ S} fee[j] ≤ best fee[i − 3] + fee[i − 1] + fee[i].

On the other hand, the set best set[i − 3] ∪ {i − 1, i} is one of the sets that should have been considered as a candidate when choosing S = best set[i], so (by the definition of best set[i]),

Σ_{j ∈ S} fee[j] ≥ best fee[i − 3] + fee[i − 1] + fee[i],

as well. Thus,

best fee[i] = Σ_{j ∈ S} fee[j] = best fee[i − 3] + fee[i − 1] + fee[i].

Now, the set best set[i − 3] ∪ {i − 1, i} satisfies the first two conditions given above for the set best set[i], and the sum of the fees of the jobs in this set is the same as best fee[i] (as shown above), so it follows that we could correctly choose best set[i] to be best set[i − 3] ∪ {i − 1, i}, as claimed.

In the second case (i ∈ best set[i], but i − 1 ∉ best set[i]), it can be shown that (we may assume without loss of generality that)

best set[i] = best set[i − 2] ∪ {i}

and

best fee[i] = best fee[i − 2] + fee[i].

The proof that these choices are correct is similar to the proof given at the end of the discussion of the first case.

In the third case (i − 1 ∈ best set[i] but i ∉ best set[i]), we may assume without loss of generality that

best set[i] = best set[i − 1]

and

best fee[i] = best fee[i − 1].


Once again, the proof of this is similar to the proof given at the end of the discussion of the first case.

Now, each of the above three possible values for best fee[i] is the sum of the fees of all the jobs in some set T ⊆ {1, 2, . . . , i} that satisfies the first two conditions given above for best set[i]. Therefore (since we have argued already that best fee[i] must take on one of these values),

best fee[i] = max(best fee[i− 3] + fee[i− 1] + fee[i], best fee[i− 2] + fee[i], best fee[i− 1]),

and we can choose best set[i] as

best set[i] :=
    best set[i − 3] ∪ {i − 1, i}   if best fee[i] = best fee[i − 3] + fee[i − 1] + fee[i],
    best set[i − 2] ∪ {i}          otherwise, if best fee[i] = best fee[i − 2] + fee[i],
    best set[i − 1]                in all other cases.

Remaining Steps. Now, development of a polynomial-time algorithm that uses dynamic programming to solve this problem is extremely similar to the development of the algorithm given in the original example. Once again, it will be useful to use an array to store the values best fee[i], for 0 ≤ i ≤ n, and another array choice can be used to keep track of the sets best set[i] for 1 ≤ i ≤ n. However, this time, it won’t be (quite) good enough to define choice to be an array of boolean values. Instead, it is sufficient to specify that, for 3 ≤ i ≤ n,

• choice[i] ∈ {“a”, “b”, “c”};

• if choice[i] = “a” then i and i− 1 both belong to best set[i];

• if choice[i] = “b” then i belongs to best set[i] but i− 1 does not; and, finally,

• if choice[i] = “c” then i− 1 belongs to best set[i] but i does not.

Exercise: Using the solution for the original problem, and the above information, complete the development of a polynomial-time algorithm that uses dynamic programming to solve this modified version of the “job selection” problem.
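For concreteness, here is one possible shape for that completed algorithm. This is a sketch in Python (which these notes themselves do not use); the names fee, best, choice, and best_jobs are ours, and fee[1..n] is assumed to hold the job fees, with fee[0] ignored.

    def best_jobs(fee):
        # fee[1..n] are the (nonnegative) job fees; fee[0] is ignored.
        # best[i] plays the role of best fee[i]; choice[i] in {"a", "b", "c"}
        # records which case of the recurrence produced best fee[i].
        n = len(fee) - 1
        best = [0] * (n + 1)
        choice = [None] * (n + 1)
        if n >= 1:
            best[1] = fee[1]
        if n >= 2:
            best[2] = fee[1] + fee[2]
        for i in range(3, n + 1):
            a = best[i - 3] + fee[i - 1] + fee[i]   # case (a): choose i - 1 and i
            b = best[i - 2] + fee[i]                # case (b): choose i but not i - 1
            c = best[i - 1]                         # case (c): do not choose i
            best[i] = max(a, b, c)
            choice[i] = "a" if best[i] == a else ("b" if best[i] == b else "c")
        # Recover best set[n] by walking the recorded choices backwards.
        chosen, i = set(), n
        while i >= 3:
            if choice[i] == "a":
                chosen |= {i - 1, i}
                i -= 3
            elif choice[i] == "b":
                chosen.add(i)
                i -= 2
            else:
                i -= 1
        chosen |= set(range(1, i + 1))   # base cases: best set[0], [1], or [2]
        return best[n], chosen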

Solution for Exercise #4 in Section 2.7

The Problem. In this question you were asked to design an algorithm that counted the number H(n) of binary heaps storing the integers 1, 2, . . . , n when given n ≥ 0 as input.

Establishing the Optimal Substructure Property. In this case, there is a bit of work to be done: You must start by developing a recurrence for the function H(n).

There’s clearly only one heap of size 1 storing the value 1, so H(1) = 1. Since there’s only one “empty heap,” we can set H(0) = 1 as well. This will turn out to be useful (and, it will give the answer we want), when a recursive definition is developed for H(n) where n > 1.

Now, heaps differ from binary search trees in at least one important respect: every binary heap of size n has the same shape as any other binary heap of size n.

This is obvious if n = 0 or n = 1, so suppose, now, that n ≥ 2. Let k be an integer such that

2^k − 1 ≤ n < 2^{k+1} − 1,


so that k = ⌊log₂(n + 1)⌋. Since n ≥ 2, k ≥ 1.

Now, since 2^k − 1 ≤ n < 2^{k+1} − 1, either 2^k − 1 ≤ n < 2^k + 2^{k−1} − 1, or 2^k + 2^{k−1} − 1 ≤ n < 2^{k+1} − 1.

In the first case, 2^k − 1 ≤ n < 2^k + 2^{k−1} − 1, the right sub-heap of the root is a “full” binary heap with depth k − 2 and size exactly 2^{k−1} − 1, and the left sub-heap has depth either k − 2 or k − 1 and (larger) size n − 2^{k−1}. Note that in this case, either n = 2^k − 1, so that n − 2^{k−1} = 2^{k−1} − 1 and the left sub-heap is a full binary heap with depth k − 2, or

2^{k−1} ≤ n − 2^{k−1} < 2^k − 1,

and the left sub-heap has depth k − 1.

In the second case, 2^k + 2^{k−1} − 1 ≤ n < 2^{k+1} − 1, the left sub-heap of the root is a “full” binary heap with depth k − 1 and size exactly 2^k − 1, and the right sub-heap has depth either k − 2 or k − 1 and size n − 2^k. Note that in this case

2^{k−1} − 1 ≤ n − 2^k < 2^k − 1,

so that either n = 2^k + 2^{k−1} − 1 and the right sub-heap is a full binary heap with depth k − 2 and size exactly 2^{k−1} − 1, or n > 2^k + 2^{k−1} − 1, and the right sub-heap has depth k − 1 and size between 2^{k−1} and 2^k − 2.

In order to simplify the presentation of a recurrence, let nL and nR denote the sizes of the left and right sub-heaps of a heap of size n, respectively. Then, if n ≥ 2 and k = ⌊log₂(n + 1)⌋ (as above), then

nL = n − 2^{k−1}    if 2^k − 1 ≤ n < 2^k + 2^{k−1} − 1,
nL = 2^k − 1        if 2^k + 2^{k−1} − 1 ≤ n < 2^{k+1} − 1,

and

nR = n − 1 − nL = 2^{k−1} − 1    if 2^k − 1 ≤ n < 2^k + 2^{k−1} − 1,
nR = n − 1 − nL = n − 2^k        if 2^k + 2^{k−1} − 1 ≤ n < 2^{k+1} − 1.

There is a second significant difference between a binary heap storing the values 1, 2, . . . , n and a binary search tree that stores the same values: The value stored at the root of the heap is always guaranteed to be the largest value (n) stored in the heap.

Unfortunately, there’s a third significant difference: This information doesn’t determine the set of values that are stored in the left sub-heap. Indeed, every subset of {1, 2, . . . , n − 1} of size nL might be stored in the left sub-heap.

However, there’s good news: For every set SL of size nL that is a subset of {1, 2, . . . , n − 1}, there are exactly H(nL) binary heaps of size nL that store the values in SL, and there are exactly H(nR) binary heaps of size nR = n − nL − 1 that store the values in {1, 2, . . . , n − 1} that don’t belong to SL.

If your “discrete mathematics” is rusty then it would be a good exercise (in set theory) to prove that this is the case.

This information is sufficient to define a recurrence for H(n):

H(n) = 1                                          if n = 0 or n = 1,
H(n) = (n − 1 choose nL) · H(nL) · H(n − 1 − nL)  if n ≥ 2.

Since 0 ≤ nL, n − 1 − nL < n when n ≥ 2, this really is a “recurrence,” so that it can be used as the basis for a “Divide and Conquer” solution for this problem — and it establishes the “Optimal Substructure Property.”


Establishing the Overlapping Subproblems Property. It should now be clear that at most n + 1 instances of this problem are formed and solved when “Divide and Conquer” is used to solve this problem and compute H(n): One only needs to compute the values H(i) for integers i such that 0 ≤ i ≤ n.

Design Step 1: Determining the Order in Which Subproblems are Solved. It should also be reasonably clear that if you compute the values H(0), H(1), . . . , H(n) in this order (that is, if you compute H(i) in order of increasing argument i), then the values H(j) that you need in order to compute H(i) will already be available when H(i) is to be computed — because j < i if H(j) is needed.

Design Step 2: Choice of Data Structure. Now, we can use an array of size n + 1 in order to store the values H(0), H(1), . . . , H(n) that are needed.

Remaining Steps. The rest of this “algorithm design” problem — writing down the pseudocode corresponding to these choices — is left as a straightforward (but tedious) exercise.
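For reference, a minimal sketch of that algorithm in Python (our transcription, not part of the original notes; the name count_heaps is ours) could look as follows. It computes H(0), H(1), . . . , H(n) in increasing order and stores them in an array, as described above.

    from math import comb

    def count_heaps(n):
        # H[i] will hold the number of binary heaps storing the values 1, ..., i.
        H = [1] * (n + 1)                     # H(0) = H(1) = 1
        for i in range(2, n + 1):
            k = (i + 1).bit_length() - 1      # k = floor(log2(i + 1)), exactly
            if i < 2**k + 2**(k - 1) - 1:
                nL = i - 2**(k - 1)           # left sub-heap is the larger one
            else:
                nL = 2**k - 1                 # left sub-heap is full
            H[i] = comb(i - 1, nL) * H[nL] * H[i - 1 - nL]
        return H[n]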

At this point, you’ll have done everything that is necessary to answer the question that was asked on the problem set. The rest of this “solution” is for interest only. It provides part of an answer for a later question on the problem set, and considers the complexity of this problem, when a more realistic measure than “the unit cost criterion” is applied.

Complexity of a “Divide and Conquer” Solution. Let d be the depth of a binary heap whose size is n — then d = ⌈log₂ n⌉.

If nL and nR = n − nL − 1 are as defined above, then binary heaps with size nL or nR have depth at most d − 1.

Now, consider the number of operations that would be used by a “Divide and Conquer” algorithm for computing H(n). If you try to measure this as a function of d = ⌈log₂ n⌉ rather than as a function of n, then the above information can be used to prove that this number of steps is (singly) exponential in d.

However, this fact can be used to establish that this number of steps is also polynomial in n — and not exponential in n, which has been the case for the “Divide and Conquer” solutions for other problems that we’ve solved by Dynamic Programming!

Solution Using Dynamic Programming — A Better Solution. A much more careful analysis (which won’t be given here) will reveal that it isn’t necessary to compute all of the values H(0), H(1), . . . , H(n), if you wish to use “Dynamic Programming” to find H(n).

Look back at the recurrence for H(n), and note that it always expresses H(n) in terms of H(n − 2^h) and H(2^h − 1) for some integer h — so that at least one of nL + 1 or nR + 1 is always a power of two. Furthermore, nR ≤ nL < (2/3)n whenever n ≥ 2. With some care (and a bit of clever thinking) you should be able to use these observations to prove that you only need to compute and store H(i) for O(log₂ n) choices of i between 0 and n, in order to compute H(n) using Dynamic Programming.

Figuring out which values H(i) are needed is slightly tricky, but not too difficult. However, you can avoid having to do this by using “Memoization” instead of “Dynamic Programming” in order to perform this computation. If time permits, then it might be a useful exercise to review the information about “Memoization” in Chapter 2, and then design an algorithm that uses this technique to find H(n).


Finally, a More Careful Analysis. You shouldn’t conclude, at this point, that the problem can be solved using time that is polynomial in log n — at least, not unless you’re using the “unit cost criterion” and have a way to compute binomial coefficients, like (n − 1 choose nL), at “unit cost.”

Computing the binomial coefficients is nontrivial. There’s another problem, if you wish to compute the number of operations on digits or bits that are needed to solve this problem: You’ll need to find a usable bound on the length of decimal or binary representations for the values that this algorithm computes.

This can be done, but it isn’t trivial: You’ll need to look at the recurrence for H(n) and use this to generate a recurrence for the length of a decimal representation of H(n).

If you’ve done this correctly, and take advantage of the fact that nR ≤ nL ≤ (2/3)n whenever n ≥ 2, then you should be able to apply a result that was introduced in CPSC 413 — namely, the master theorem! — to prove that the length of a decimal representation of H(n) is bounded by a polynomial function of n.

If you have the time for it, and wish to exercise your “algorithm analysis” skills, then you should try to perform the analysis that’s been sketched (in part) here.

As mentioned above, you won’t be able to conclude that the original problem can be solved using a number of operations on digits that is polynomial in log n. This shouldn’t be surprising, since the output is a number whose decimal representation has length at least linear in n. So it might not be clear that analyzing the problem more carefully, to improve the “Dynamic Programming” solution, is really worthwhile.

It is (if you’re interested in solving this problem at all), since the final algorithm is faster than the original by (almost) a factor of n, even though its running time is not “polylogarithmic” in n. So, careful “algorithm design” really can yield substantial reductions in the time needed for computations (and “algorithm analysis” techniques really can be used to prove results that aren’t obvious, and that might be difficult to discover using experimentation alone).


4.2.2 Solutions for Sample Tests

Solution for Sample Test in Section 2.9.1

1. Every path from (0, 0) to (r, u − 1) can be extended by adding a final move up in order to obtain a path from (0, 0) to (r, u) (in which the last move is up). If two distinct paths to (r, u − 1) are extended in this way, then two distinct paths to (r, u) are obtained.

On the other hand, every path to (r, u) in which the last move is up can be obtained by extending some path to (r, u − 1) in this way.

Therefore, the number of paths from (0, 0) to (r, u) in which the last move is up is P(r, u − 1).

2. An argument that is similar to the one given above can be used to show that the number of paths from (0, 0) to (r, u) in which the last move is to the right is P(r − 1, u).

3. If r > 0 and u > 0 then every path from (0, 0) to (r, u) either ends with a final move up, or a final move to the right, but not both. Therefore, it follows (using the solutions for the previous two questions) that

P(r, u) = P(r − 1, u) + P(r, u − 1) if r > 0 and u > 0.

4. It is known already that P(r, u) = 1 if r, u ≥ 0 and at least one of r and u equals zero, so the solution for Question 3 (and this information) provides a recursive definition of P(r, u), for all r, u ≥ 0.

It is easy to show (by induction on r + u) that if P(r, u) is computed using this recursive definition, and both r and u are positive, then the value P(a, b) will be used at some point during the computation, if and only if 0 ≤ a ≤ r and 0 ≤ b ≤ u. These values can be stored in a two-dimensional array, with r + 1 rows and u + 1 columns.

We can compute these values iteratively, in such a way that all the values that are needed have already been computed whenever a new value is being computed, by computing these values in any one of several different orders:

• Using a sequence of u + 1 stages, in which the values P(0, ℓ), P(1, ℓ), . . . , P(r, ℓ) are computed (in this order) during stage ℓ, for ℓ = 0, 1, . . . , u;

• Using a sequence of r + 1 stages, in which the values P(ℓ, 0), P(ℓ, 1), . . . , P(ℓ, u) are computed (in this order) during stage ℓ, for ℓ = 0, 1, . . . , r;

• Or, even, using a sequence of r + u stages, in which all the needed values P(a, b) such that a + b = ℓ are computed (in any order you want to) during stage ℓ, for ℓ = 1, 2, . . . , r + u.

The first of these schedules will be used in the following algorithm. The second schedule produces an algorithm that is almost identical to this one, while the third will produce one that’s a bit more complicated (and has no real advantages over either of the ones that would be produced using the first two schedules).


P : array[0 . . . r, 0 . . . u] of integer

if ((r = 0) or (u = 0)) then
    return 1
else
    {Stage 0}
    for a := 0 . . . r do
        P(a, 0) := 1
    end for
    {Stages 1 – u}
    for b := 1 . . . u do
        {Stage b}
        P(0, b) := 1
        for a := 1 . . . r do
            P(a, b) := P(a − 1, b) + P(a, b − 1)
        end for
    end for
    return P(r, u)
end if

Figure 4.4: A simple dynamic programming algorithm that computes P(r, u) for r, u ≥ 0

A “dynamic programming” algorithm that solves the problem using this schedule is shown in Figure 4.4. Since the only control structures used by the algorithm are an if-then-else structure and nested for loops, it is easy to see that this algorithm uses Θ(ru) steps (for positive r and u).

The algorithm also uses space Θ(ru) — assuming (realistically for small r and u, so that P(r, u) can be stored using a single word of memory, and unrealistically otherwise) that each value P(a, b) can be stored using “constant space.” The use of space can be improved, while complicating the algorithm very slightly, and without significantly changing the running time, if you note that the only values needed during stage ℓ of the computation are the ones computed in stages ℓ − 1 and ℓ, for ℓ > 0. Thus (if you complicate things slightly more, by choosing between the first two schedules after seeing the values r and u), it is possible to solve this problem using a “dynamic programming” algorithm with Θ(ru) steps and Θ(min(r, u)) storage locations. However, full marks will be given for an algorithm that is correct and resembles the one given in Figure 4.4.

Finally, if the “bogus recursive definition” given on the quiz was used instead of the correct one, then an algorithm that is identical to this one — except for a change on the line in which the recursive definition is used — could be obtained. The same improvements (as described above) would also be possible, and the running time would also be the same.
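To illustrate the space improvement described above, here is a sketch (ours, in Python; the name num_paths is an assumption) that keeps only one row of the table, indexed along the shorter dimension:

    def num_paths(r, u):
        # Computes P(r, u) using Theta(min(r, u)) storage locations.
        m, M = (r, u) if r <= u else (u, r)   # keep the row along the shorter side
        row = [1] * (m + 1)                   # stage 0: P(a, 0) = 1 for every a
        for _ in range(M):                    # stages 1, ..., max(r, u)
            for a in range(1, m + 1):
                row[a] += row[a - 1]          # P(a, b) = P(a - 1, b) + P(a, b - 1)
        return row[m]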


Solution for Sample Test in Section 2.9.2

1.

P(p, t) =
    1                                     if t = 0 and p = 0,
    0                                     if t = 0 and p ≠ 0,
    P(p − 1, t − 1) + P(p + 1, t − 1)     if t > 0 and −t ≤ p ≤ t,
    0                                     if t > 0, and (p < −t or p > t)

The last two of these rules could actually be replaced by the single rule,

“P(p, t) = P(p − 1, t − 1) + P(p + 1, t − 1) if t > 0,”

because this would give the same definition for P(p, t) whenever t > 0, if P(p, 0) is defined using the first two rules.

2. Establishing the Optimal Subproblems Property. In this case, this is straightforward, because a recursive definition for the function P(p, t) has been obtained. This defines P(p, 0) directly for every integer p, and if t > 0 then it defines P(p, t) in terms of function values P(p′, t − 1) such that p − 1 ≤ p′ ≤ p + 1 if −t ≤ p ≤ t, and it defines P(p, t) directly (as 0) otherwise.

This all implies that a divide and conquer algorithm that simply implements the above recursive definition will terminate, whenever it is given integers p and t ≥ 0 as inputs, and this establishes the “Optimal Subproblems” property.

Establishing the Overlapping Subproblems Property. If we take care to solve the “border” cases intelligently — that is, if we note that (for t > 0) P(t, t) = P(t − 1, t − 1) because P(t + 1, t − 1) = 0, and P(−t, t) = P(1 − t, t − 1) because P(−t − 1, t − 1) = 0, and also simplify the recurrence used to compute P(t − 1, t) and P(1 − t, t) in a similar way — then the only other instances of the problem that we would form and solve, when computing P(p, t) recursively, would be (a subset of) the problems of computing the values P(q, u), where 0 ≤ u ≤ t − 1 and −u ≤ q ≤ u.

Thus, if we now “include” the instance in which we compute P(p, t), the total number of instances that must be formed and solved is at most

1 + ∑_{u=0}^{t−1} (2u + 1) = t² + 1 ∈ Θ(t²).

Since the number of problem instances whose solutions are required is (at most) quadratic in t, this establishes the “Overlapping Subproblems” property.

Design Step 1: Determining an Order of Solution for Subproblems. Several orders of solution are possible, but the simplest is probably to proceed in t stages, before P(p, t) is computed, and then to add the instance of computing P(p, t) at the end, after that.

In the ith stage, for 0 ≤ i ≤ t − 1, the values P(p, i) for −i ≤ p ≤ i could be computed in any order one might want. However, we’ll choose the order of “increasing p” (so that P(p − 1, i) is computed before P(p, i) for 1 − i ≤ p ≤ i), again, just to simplify the code.


Thus, we’ll choose an order (for these solutions) that begins with the computation of the following values (if t ≥ 3):

P(0, 0), P(−1, 1), P(0, 1), P(1, 1), P(−2, 2), P(−1, 2), . . .

Design Step 2: Choice of Data Structure. A two-dimensional array A and one more variable, “solution,” can be used to store the above values. To simplify the code that should be given as an answer for the next question, it will be assumed there is an array location A(q, u) that can be accessed, for 0 ≤ u ≤ t − 1 and 1 − t ≤ q ≤ t − 1, and that this location will be used to store the value P(q, u) if −u ≤ q ≤ u. The variable solution will be used to store the value P(p, t).

Note: These are certainly not the only choices that one might make. Other correct (and reasonable) sets of problem instances (for example, ones including the problem of computing P(q, u) when −t ≤ q ≤ t even when q < −u or q > u) will also be perfectly acceptable, and other correct “orders for solving problem instances” will be perfectly acceptable, too.

3. Pseudocode is given in Figure 4.5. This code returns 0 if t < 0, just so that some “sensible” value is returned in this case. It would be acceptable if you reported an error in this case or even (on a quiz) simply ignored this case completely.

Note: Of course, you’ll need to handle the “border cases” correctly to receive full marks for this question. You will not lose more than two marks for incorrect handling of the border cases, and you will not lose more than one mark for this unless you make several (at least three) distinct errors that have to do with this.

4. Since a value P(q, u) is never used again, after all the values P(q, u + 1) have been computed, it is not necessary to remember more than two rows of the table of solutions, stored in the array, at once. (Furthermore, if you’re sufficiently clever about it, then you can make do with only one row, and at most two extra variables.) Therefore you could take advantage of this information to reduce the storage space, by using one or two one-dimensional arrays, instead of a two-dimensional array — reducing the storage space to O(t) space (if the “unit cost criterion” is used to measure this).

Note: The above information will be considered to be a complete answer for this question. However, you can do even better than this:

It can be proved that P(−q, u) = P(q, u) for all q and u, so you could reduce the storage space by half, again, by only computing and storing the values P(q, u) for 0 ≤ u ≤ t − 1 and 0 ≤ q ≤ u. (You’d reduce the running time by approximately a factor of two, as well, by doing this.)

Something you almost certainly didn’t notice (pat yourself on the back, if you did): It can be proved by induction on t that if −t ≤ p ≤ t then

P(p, t) =
    0                         if t − p is odd,
    (t choose (t − p)/2)      if t − p is even,

so that you could solve this problem by checking the parity of t − p and then computing and returning a binomial coefficient. Of course, you wouldn’t be using “dynamic programming”


if (t < 0 or (p < −t or p > t)) then
    return 0
elsif (t = 0 and p = 0) then
    return 1
else
    A[0, 0] := 1
    A[−1, 1] := 1; A[0, 1] := 0; A[1, 1] := 1
    for u := 2 . . . t − 1 do
        A[−u, u] := A[1 − u, u − 1]; A[1 − u, u] := A[2 − u, u − 1]
        for q := 2 − u . . . u − 2 do
            A[q, u] := A[q − 1, u − 1] + A[q + 1, u − 1]
        end for
        A[u − 1, u] := A[u − 2, u − 1]; A[u, u] := A[u − 1, u − 1]
    end for
    if (p = −t or p = 1 − t) then
        solution := A[p + 1, t − 1]
    elsif (p = t − 1 or p = t) then
        solution := A[p − 1, t − 1]
    else
        solution := A[p − 1, t − 1] + A[p + 1, t − 1]
    end if
    return solution
end if

Figure 4.5: A Dynamic Programming Solution for the Quiz Problem

at all if you solved the problem this way. On the other hand, you’d be reducing both time and storage space using this solution.
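For interest, the closed form above is easy to transcribe. The following sketch (ours, in Python) can be used to spot-check a dynamic programming implementation:

    from math import comb

    def P(p, t):
        # P(p, t) via the binomial-coefficient formula noted above.
        if t < 0 or p < -t or p > t or (t - p) % 2 != 0:
            return 0
        return comb(t, (t - p) // 2)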


4.3 Greedy Methods

4.3.1 Solutions for Selected Exercises

Solution for Exercise #2 in Section 3.12

In this question, you were asked to find a correct and efficient greedy algorithm to make change. Two versions of the problem were described.

The First Version of the Problem. For the first version of the problem, you were to use coins worth 1, c, c^2, . . . , c^k cents, for some integer constants c ≥ 2 and k ≥ 0. You had “as many coins as you could possibly need” to choose from for each of these denominations of coins, and you wanted to make change for n cents, for some integer n ≥ 0, using as few coins as possible.

An “instance” of this problem will include the values c, k, and n as inputs. The “size” of this instance will be considered to be n (for the purposes of establishing correctness of the following approach). A correct output corresponding to the instance will be a sequence of integers 〈s0, s1, . . . , sk〉 such that si ≥ 0 for all i,

∑_{i=0}^{k} si·c^i = n,

and such that if t0, t1, . . . , tk are nonnegative integers such that ∑_{i=0}^{k} ti·c^i = n as well, then

∑_{i=0}^{k} si ≤ ∑_{i=0}^{k} ti.

Base Instances. We will consider the “base instances” for this problem to be the instances such that n = 0: The only possible correct output for one of these instances is the sequence of integers 〈s0, s1, . . . , sk〉 such that si = 0 for 0 ≤ i ≤ k.

The Greedy Approach. The “greedy method” that will be used to solve this problem will be to begin by choosing the coin in the largest denomination possible. In other words, if n > 0 then you will always start by using a coin with value c^k if n ≥ c^k, and with value c^ℓ where ℓ < k and c^ℓ ≤ n < c^{ℓ+1} otherwise.

As usual, we will need to prove “the greedy choice property” and “the optimal substructure property” in order to show that a greedy algorithm based on this idea is correct.

The Greedy Choice Property. The version of the property for this problem and greedy method is as follows.

Claim (The Greedy Choice Property). For all instances c ≥ 2, k ≥ 0, and n > 0 of the above problem, there exists a correct output Ŝ = 〈ŝ0, ŝ1, . . . , ŝk〉 such that ŝℓ > 0, where ℓ = k if n ≥ c^k and where ℓ = ⌊log_c n⌋ < k if n < c^k.

Proof. Let n, c, k, and ℓ be as above, so that a “greedy choice” for the given instance of the problem would be a coin with denomination c^ℓ.

Let S = 〈s0, s1, . . . , sk〉 be a correct output for this instance of the problem.


Case: sℓ > 0. In this case, it is sufficient to set Ŝ = S (and ŝi = si for 0 ≤ i ≤ k). Then Ŝ satisfies the properties given in the proposition.

Case: sℓ = 0. Since c^{ℓ+1} > n, sh = 0 as well, for ℓ + 1 ≤ h ≤ k. Therefore, since S is a correct output for the given instance of the problem,

∑_{i=0}^{k} si·c^i = ∑_{i=0}^{ℓ−1} si·c^i = n,

and there cannot exist a set of fewer than s0 + s1 + · · · + sℓ−1 coins (in the given denominations) worth exactly n cents.

Under these circumstances, si ≤ c − 1 for 0 ≤ i ≤ ℓ − 1: Otherwise, one could subtract c from si and add 1 to si+1 — replacing c coins in denomination c^i with a single one in denomination c^{i+1} — to obtain a smaller set of coins worth n cents. Therefore,

n = ∑_{i=0}^{ℓ−1} si·c^i
  ≤ ∑_{i=0}^{ℓ−1} (c − 1)·c^i        (since si ≤ c − 1)
  = (c − 1) · (c^ℓ − 1)/(c − 1)      (summing the geometric series)
  = c^ℓ − 1 < n,                     (since c^ℓ ≤ n)

and we have a contradiction. Thus the only assumption we made — that a correct output S = 〈s0, s1, . . . , sk〉 such that sℓ = 0 exists at all — is incorrect. That is, this case can never arise.

Thus the result is correct in the only possible case, as required.

The Optimal Substructure Property. It is necessary to give two constructions and to prove that (taken together) they are correct. The first construction takes an instance of the problem (such that n > 0) as input, along with a “greedy choice,” and produces a smaller instance of the same problem. The second construction takes a correct solution for this smaller instance, along with the original instance and greedy choice, and produces a correct solution for the original instance from all this.

Construction #1: Given integers c ≥ 2, k ≥ 0, and n > 0, let

ℓ = k             if c^k ≤ n,
ℓ = ⌊log_c n⌋     if c^k > n,

so that “the greedy choice” for this instance of the problem would be a coin of denomination c^ℓ. Define the new instance of the problem to consist of integers ĉ, k̂, and n̂, where ĉ = c, k̂ = k, and n̂ = n − c^ℓ.

Construction #2: Let c, k, n, ℓ, ĉ, k̂, and n̂ be as described above in Construction #1, and let Ŝ = 〈ŝ0, ŝ1, . . . , ŝk〉 be a correct output for this problem, for the instance including ĉ, k̂, and n̂.

Let S = 〈s0, s1, . . . , sk〉, where (for 0 ≤ i ≤ k),

si = ŝi + 1     if i = ℓ,
si = ŝi         otherwise.


Since 0 ≤ ℓ ≤ k and c^ℓ ≤ n (by the choice of ℓ), 0 ≤ n̂ < n. Since ĉ = c and k̂ = k, the instance produced by Construction #1 is a well formed instance of the problem that is smaller than the original instance (since it has “size” n̂).

Now, consider the sequence of integers S = 〈s0, s1, . . . , sk〉 produced by Construction #2. Since Ŝ = 〈ŝ0, ŝ1, . . . , ŝk〉 is a correct output for the smaller instance (ĉ, k̂, and n̂), ŝi is a nonnegative integer for all i. Since si is either ŝi or ŝi + 1 for all i, si is a nonnegative integer, for all i, as well. Furthermore,

∑_{i=0}^{k} si·c^i = (∑_{i=0}^{ℓ−1} ŝi·c^i) + sℓ·c^ℓ + ∑_{i=ℓ+1}^{k} ŝi·c^i
                   = (∑_{i=0}^{ℓ−1} ŝi·c^i) + (ŝℓ + 1)·c^ℓ + ∑_{i=ℓ+1}^{k} ŝi·c^i
                   = (∑_{i=0}^{k} ŝi·c^i) + c^ℓ        (reordering terms)
                   = n̂ + c^ℓ = n.

Finally, let m = ŝ0 + ŝ1 + · · · + ŝk be the number of coins used in any correct solution for the smaller instance of the problem, and let M be the number of coins used in some correct solution S∗ of the original instance of the problem that uses at least one coin of denomination c^ℓ; the “greedy choice property” proved above implies that such a solution exists.

Since n̂ = n − c^ℓ and S∗ includes a coin of denomination c^ℓ, we could remove such a coin from S∗ to produce a set of M − 1 coins worth n̂ cents. Therefore (since this would be a “valid” set of coins for the smaller instance of the problem)

M − 1 ≥ m.

On the other hand, if we take any correct solution for the smaller instance of the problem, and add in a coin with denomination c^ℓ, we obtain a set of coins worth n cents. Since the “correct” solution of the smaller problem must include exactly m coins,

m + 1 ≥ M.

The above two inequalities imply that M = m + 1. Since the set of coins produced using Construction #2 is worth exactly n cents and includes exactly m + 1 coins, this identity implies that this set is “optimal” as well as “valid.” Thus, Construction #2 produces a correct solution for the original instance of the problem.

Therefore, these two constructions are “correct,” and the optimal substructure property holds.

A Correct and Efficient Greedy Algorithm. While one could produce a correct recursive greedy algorithm using the above information, we can use this information, and consider the problem a bit more carefully, in order to produce an “optimized” greedy algorithm that is correct and much more efficient when n is large.

Claim. Let c ≥ 2, k > 0, and n > 0 be integers, and let S = 〈s0, s1, . . . , sk〉 be a correct solution for the above problem for these inputs.


Input: Integers c ≥ 2, k ≥ 0, n ≥ 0
Output: A correct solution S = 〈s0, s1, . . . , sk〉 for the coin changing problem, for these inputs

denom := c^k
index := k
while index ≥ 0 do
    s_index := ⌊n/denom⌋
    n := n − s_index ∗ denom
    denom := denom/c
    index := index − 1
end while
return 〈s0, s1, . . . , sk〉

Figure 4.6: A Greedy Method for the First Coin Changing Problem

If n ≥ c^k then

sk = ⌊n/c^k⌋

and Ŝ = 〈s0, s1, . . . , sk−1, 0〉 is a correct solution for this problem, for the inputs ĉ = c, k̂ = k, and n̂ = n − sk·c^k < n.

If n < c^k, let ℓ = ⌊log_c n⌋. Then 0 ≤ ℓ < k, si = 0 for ℓ + 1 ≤ i ≤ k,

sℓ = ⌊n/c^ℓ⌋,

and Ŝ = 〈s0, s1, . . . , sℓ−1, 0, 0, . . . , 0〉 is a correct solution for this problem, for the inputs ĉ = c, k̂ = k, and n̂ = n − sℓ·c^ℓ < n.

This can be established using induction on n (and using the correctness of the greedy choice and optimal substructure properties, and the correctness of the two constructions described above, as well); this is left as an additional exercise.

The claim implies that there is only one correct solution for this problem for any choice of c, k, and n. An algorithm based on this claim is shown in Figure 4.6.

The recursive algorithm one would obtain (without “optimizations”) from the original greedy strategy, and constructions used to prove “the optimal substructure property,” would not be as efficient as this one — because it would (essentially) construct a solution, one coin at a time. If n is substantially larger than c^k, this would imply that Ω(n/c^k) steps are used. In contrast, the algorithm given above uses O(k) arithmetic operations in all cases.
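A direct transcription of the algorithm in Figure 4.6 into Python might look like the following sketch (the name make_change is ours):

    def make_change(c, k, n):
        # Greedy change-making for denominations 1, c, c^2, ..., c^k;
        # s[i] is the number of coins of denomination c**i used.
        s = [0] * (k + 1)
        denom = c ** k
        for index in range(k, -1, -1):
            s[index], n = divmod(n, denom)   # s[index] := floor(n / denom), etc.
            denom //= c
        return s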

The Second Version of the Problem. For the second version of the problem, you were asked to make change for n cents using pennies, nickels, dimes, quarters, and loonies.

We will now consider an instance of the problem to include a single integer, n ≥ 0, as input. The output will be a 5-tuple of nonnegative integers 〈p, k, d, q, l〉 such that

p + 5k + 10d + 25q + 100l = n


and such that p + k + d + q + l is as small as possible. As before, we will consider n to be the “size” of the corresponding instance of the problem (for the purposes of establishing correctness of an algorithm).

Base Instances. The only “base instance” will be the instance of this problem for which n = 0; the only correct output for this instance is the 5-tuple 〈0, 0, 0, 0, 0〉.

The Greedy Approach. As for the first version of the problem, the “greedy approach” used to solve this problem on input n > 0 will be to begin by choosing the largest denomination possible — a loonie if n ≥ 100, a quarter if 25 ≤ n ≤ 99, a dime if 10 ≤ n ≤ 24, a nickel if 5 ≤ n ≤ 9, and a penny if 1 ≤ n ≤ 4.

The Greedy Choice Property. Let S = 〈p, k, d, q, l〉 be a correct output, corresponding to the instance n > 0.

Lemma 4.1. p ≤ 4.

Proof. Suppose, instead, that p ≥ 5. If we set p̂ = p − 5, k̂ = k + 1, d̂ = d, q̂ = q, and l̂ = l (replacing five pennies with a nickel), then p̂, k̂, d̂, q̂, l̂ ≥ 0,

p̂ + 5k̂ + 10d̂ + 25q̂ + 100l̂ = p + 5k + 10d + 25q + 100l = n,

and

p̂ + k̂ + d̂ + q̂ + l̂ = p + k + d + q + l − 4 < p + k + d + q + l,

contradicting the choice of 〈p, k, d, q, l〉 as a correct (and hence optimal) solution for the problem.

Lemma 4.2. k ≤ 1.

Proof. If k ≥ 2 then we can subtract two from k and add one to d (replacing two nickels with a dime), and obtain another way to make change for n that uses a smaller number of coins — again contradicting the choice of S as a correct solution for this problem.

Lemma 4.3. p+ 5k ≤ 9.

Proof. This follows from Lemmas 4.1 and 4.2.

Lemma 4.4. d ≤ 2.

Proof. If d ≥ 3 then three dimes can be replaced by one quarter and one nickel (that is, we can subtract three from d and increase q and k by one each) in order to make change for n cents using a smaller number of coins.

Lemma 4.5. If k > 0 then d ≤ 1.

Proof. If k > 0 and d ≥ 2 then one nickel and two dimes can be replaced by one quarter.

Lemma 4.6. p+ 5k + 10d ≤ 24.

Proof. By Lemma 4.4, d ≤ 2, and, by Lemma 4.5, either k = 0 or d ≤ 1. Thus, if d = 2 then k = 0, so (by Lemma 4.1) p + 5k + 10d ≤ 4 + 5 · 0 + 10 · 2 = 24 if d = 2. On the other hand, if d ≤ 1, then (by Lemma 4.3), p + 5k + 10d ≤ 9 + 10 · 1 = 19 ≤ 24. Thus, p + 5k + 10d ≤ 24 in all cases.


The following has now been proved, as well.

Lemma 4.7. If p+ 5k + 10d ≥ 20 then d = 2.

Proof. It was established in the proof of Lemma 4.6 that p+ 5k + 10d ≤ 19 if d ≤ 1.

Lemma 4.8. q ≤ 3.

Proof. If q ≥ 4 then four quarters can be replaced with one loonie, in order to find a way to make change for n cents using a smaller number of coins.

Lemma 4.9. p+ 5k + 10d+ 25q ≤ 99.

Proof. This follows from Lemmas 4.6 and 4.8.

Claim (Unique Solution). There is exactly one correct solution for the given problem for any nonnegative integer n: namely, a 5-tuple 〈p, k, d, q, l〉, such that

(i) l = ⌊n/100⌋;

(ii) q = ⌊(n − 100l)/25⌋;

(iii) d = ⌊(n − 100l − 25q)/10⌋;

(iv) k = ⌊(n − 100l − 25q − 10d)/5⌋; and

(v) p = n − 100l − 25q − 10d − 5k.

Proof. Let p, k, d, q, and l be the number of pennies, nickels, dimes, quarters, and loonies respectively in some correct solution for this problem. We will prove that the proposition is correct by showing that p, k, d, q, and l must have the values given above.

Since these integers are all nonnegative, n = p + 5k + 10d + 25q + 100l ≥ 100l, so clearly l ≤ ⌊n/100⌋ (since l is an integer). On the other hand, if l < ⌊n/100⌋, then (again, since l is an integer), l ≤ ⌊n/100⌋ − 1 ≤ n/100 − 1, so that p + 5k + 10d + 25q = n − 100l ≥ n − 100(n/100 − 1) = 100, contradicting Lemma 4.9. Thus l = ⌊n/100⌋, as claimed in equation (i), above.

Now n − 100l − 25q ≥ 0, so q ≤ ⌊(n − 100l)/25⌋. However, since n − 100l − 25q = p + 5k + 10d ≤ 24 (by Lemma 4.6), it can be shown that q ≥ ⌊(n − 100l)/25⌋ as well (for, otherwise, we could use the fact that q is an integer to show that p + 5k + 10d = n − 100l − 25q ≥ 25). Thus q = ⌊(n − 100l)/25⌋ as claimed above in equation (ii).

Similarly, 0 ≤ n − 100l − 25q − 10d = 5k + p ≤ 9 (by Lemma 4.3), and it can be shown that this implies that d = ⌊(n − 100l − 25q)/10⌋ as claimed above in equation (iii).

Again, by a similar argument, 0 ≤ n − 100l − 25q − 10d − 5k = p ≤ 4, so k = ⌊(n − 100l − 25q − 10d)/5⌋, as claimed above in equation (iv).

Finally, equation (v) follows directly from the fact that the proposed solution is correct, so that p + 5k + 10d + 25q + 100l = n.

Corollary: The above problem and greedy strategy (of choosing the largest denomination of coin possible) satisfies the “greedy choice property.”

Proof. This follows from the equations given in the claim about a “Unique Solution,” which establish that the number of coins “of the largest denomination possible” that are used in a correct solution is always greater than zero.


The Optimal Substructure Property. The constructions needed to establish this property, and the proofs that they are correct, are virtually identical to those given for the first version of the problem. That is, the first construction should simply change the value for the input parameter “n” by subtracting the denomination of the coin that was chosen using the greedy strategy, in order to obtain a “smaller” instance of the same problem, and the second construction should convert a correct solution for this smaller instance to a correct solution for the original one, by adding one to the number of coins of this denomination that are used.

A Correct and Efficient Greedy Algorithm. The above claim contains a set of five equations that can be used to solve this problem, using a program that is (approximately) five lines long and uses (approximately) five arithmetic operations, for all inputs n ≥ 0.
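Such a program might look like the following Python sketch (the name change is ours; divmod returns a quotient and a remainder at once):

    def change(n):
        # Equations (i)-(v): make change for n cents with the fewest coins.
        l, n = divmod(n, 100)   # loonies
        q, n = divmod(n, 25)    # quarters
        d, n = divmod(n, 10)    # dimes
        k, p = divmod(n, 5)     # nickels; p pennies remain
        return p, k, d, q, l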

Solution for Exercise #3 in Section 3.12

In this question, you were asked to show that a natural greedy heuristic for the coin changing problem (always use a coin with the largest denomination possible) is incorrect if arbitrary integer denominations of coins are allowed.

As suggested in the hint for this problem, consider the case that k = 6 denominations of coins are used, and that the denominations are d1 = 1, d2 = 3, d3 = 7, d4 = 29, d5 = 43, and d6 = 59. Suppose you want to make change for n = 72 cents.

Using the given greedy heuristic, you must start by choosing a coin with the highest denomination, 59. After that, you need to make change for another 72 − 59 = 13 cents.

You must continue by choosing a coin with denomination d3 = 7 (leaving you with the problem of making change for 6 cents), and you must finish by choosing two coins with denomination d2 = 3. You’ll have used four coins in total.

On the other hand, if you started by choosing a coin with denomination d5 = 43 instead, then you’d be left with the problem of making change for 72 − 43 = 29 cents, and you could finish by choosing a second coin with denomination d4 = 29.

Thus it’s possible to make change for the same amount using only two coins, instead of four, and the given greedy heuristic is clearly incorrect.
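This counterexample is easy to check mechanically. The following sketch (ours, in Python) compares the heuristic against an exact dynamic programming solution:

    from functools import lru_cache

    DENOMS = (1, 3, 7, 29, 43, 59)

    def greedy_coins(n):
        # The largest-denomination-first heuristic.
        count = 0
        for d in sorted(DENOMS, reverse=True):
            count += n // d
            n %= d
        return count

    @lru_cache(maxsize=None)
    def fewest_coins(n):
        # Exact minimum number of coins, by dynamic programming.
        if n == 0:
            return 0
        return 1 + min(fewest_coins(n - d) for d in DENOMS if d <= n)

    assert greedy_coins(72) == 4    # 59 + 7 + 3 + 3
    assert fewest_coins(72) == 2    # 43 + 29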


4.3.2 Solutions for Sample Tests

Solution for Sample Test in Section 3.14.1

1. In this problem you were asked to prove that a greedy heuristic for the coin changing problem was incorrect, if coins of arbitrary denominations could be used.

The proposed greedy strategy for this problem was to start by including a coin with the largest denomination possible — that is, to ensure that si ≥ 1, where i = k if n ≥ dk, and where i < k and di ≤ n < di+1 if 0 ≤ n < dk.

It was suggested that you consider an instance in which k = 4, d1 = 1, d2 = 7, d3 = 15, d4 = 31, and n = 37.

Now, if the above greedy strategy is used then you must use one coin worth 31 cents. After that, it is necessary to make change for six cents — using six pennies. Therefore, seven coins must be used.

On the other hand, you can also make change for 37 cents using two coins worth 15 cents each, and another coin worth 7 cents — using only three coins in total.

Therefore, the solution obtained using the proposed greedy choice for this instance isn’t optimal — so it isn’t correct. It follows that the proposed greedy approach isn’t correct, either.

2. In this question, you were given a “Route Selection Problem,” and a proposed greedy method to solve it.

(10 marks) State and prove the greedy choice property, for this problem and greedy strategy.

Solution:

The Greedy Choice Property: Given any instance of the “Route Selection Problem” such that n > 0, and given a corresponding greedy choice rn as described above, there exists a correct solution Ŝ = 〈r1, r2, . . . , rn〉 for this instance that ends with the greedy choice rn.

Proof: Let S = 〈t1, t2, . . . , tn〉 be a correct solution for the given instance of the “Route Selection Problem.” Then either tn = rn, or tn ≠ rn.

Case: tn = rn.

In this case it is sufficient to set Ŝ = S: Then Ŝ is a correct solution for the problem and Ŝ ends with rn.

Case: tn ≠ rn.

Let Ŝ be the sequence 〈r1, r2, . . . , rn−1, rn〉 obtained from S by replacing the final element tn with rn — so that ri = ti for 1 ≤ i ≤ n − 1.

For 1 ≤ i ≤ n − 1, 1 ≤ ri ≤ hi because ri = ti and S is a correct solution for this instance of the problem. 1 ≤ rn ≤ hn as well, by definition of “the greedy choice” rn. Thus, Ŝ is a “valid” solution for this instance of the problem.

Since S is a correct (and hence, optimal) solution for this instance, and Ŝ is a “valid” one,

d1,t1 + d2,t2 + · · ·+ dn,tn ≤ d1,r1 + d2,r2 + · · ·+ dn,rn .


On the other hand, di,ri = di,ti for 1 ≤ i ≤ n − 1 because ri = ti, while dn,rn ≤ dn,tn, because rn was “the greedy choice.” Therefore

d1,t1 + d2,t2 + · · ·+ dn,tn ≥ d1,r1 + d2,r2 + · · ·+ dn,rn .

The above two inequalities imply that

d1,r1 + d2,r2 + · · ·+ dn,rn = d1,t1 + d2,t2 + · · ·+ dn,tn ,

so that Ŝ is optimal (since S is).

Therefore Ŝ is a correct output sequence and (by construction) Ŝ ends with rn in this case as well.

3. (6 marks) Give a correct and asymptotically efficient (greedy) algorithm that solves the Route Selection Problem, which was considered in the previous question.

(In the question, two constructions were given that could be used to establish that the “Optimal Substructure Property” also held for this problem and the greedy strategy given in Question 2. You were allowed to use these constructions for Question 3 without proving that they are correct.)

Solution:

An efficient greedy algorithm that uses the given components (but that has also been somewhat “optimized”) is as follows. The order in which the entries of S are computed matches the choice of “greedy choice strategy” that has been given here. However, the order could be reversed (so that they are computed in the order r1, r2, . . . , rn) while keeping the algorithm correct.

for i := n down to 1 do
    {Choose ri}
    candidate := 1
    candidate value := di,1
    for j := 2 . . . hi do
        if di,j < candidate value then
            candidate value := di,j
            candidate := j
        end if
    end for
    ri := candidate
end for
return 〈r1, r2, . . . , rn〉
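The same loop can be sketched in Python as follows (our conventions, not the test’s: d[i][j] is the distance of option j at stage i, both 0-indexed, and h[i] is the number of options at stage i):

    def route_selection(d, h):
        # Greedy route selection: pick the least-distance option at each stage.
        n = len(h)
        r = [0] * n
        for i in range(n - 1, -1, -1):            # choose r[i], last stage first
            candidate, candidate_value = 0, d[i][0]
            for j in range(1, h[i]):
                if d[i][j] < candidate_value:
                    candidate_value, candidate = d[i][j], j
            r[i] = candidate
        return r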


4. (Bonus Question) For up to five bonus marks, prove that the “Optimal Substructure Property” is satisfied for this problem and greedy strategy.

Solution: Since n ≥ 1, n̂ = n − 1 ≥ 0, and since ĥi = hi and d̂i,j = di,j for 1 ≤ i ≤ n̂ and 1 ≤ j ≤ ĥi, the ĥi and d̂i,j are all positive. Thus, Construction #1 produces a well-formed smaller instance of the same problem (it is smaller, because n̂ < n).

Let Ŝ be the solution obtained from S and the greedy choice rn using Construction #2. Since S is a correct solution for the smaller instance that was produced using Construction #1, 1 ≤ ri ≤ hi for 1 ≤ i ≤ n − 1. As well, 1 ≤ rn ≤ hn because rn was the greedy choice. Thus, Ŝ is a “valid” solution for the original problem.

Now, let d be the total distance travelled using any correct solution for the smaller instance, and let D be the total distance travelled using any correct solution for the original instance.

By the greedy choice property, there exists a correct solution for the original instance that ends with rn. If rn is removed, then a valid solution for the smaller instance is obtained. Therefore, D − dn,rn ≥ d.

On the other hand, appending rn to any correct solution for the smaller instance produces a valid solution for the original one, so d + dn,rn ≥ D.

Thus D = d + dn,rn, implying that the solution Ŝ obtained by appending rn to S is “optimal” as well as “valid.” Thus this solution, and Construction #2, are correct.

It follows that the “optimal substructure property” is satisfied.


Solution for Sample Test in Section 3.14.2

1. (1 mark) What output should be returned when the input instance G = (V,E) has “size 0” as defined on the test — so that V = ∅?

Answer: The empty set, ∅, should be returned.

2. (2 marks) Suppose again that the input graph G = (V,E) has “size” greater than zero, so that V is nonempty, and suppose you’ve decided that some node v ∈ V should be included in the output clique. If another node u is not a neighbour of v then should u be included in the output as well? Why, or why not?

Note: This should be easy!

Answer: No, u should not be included, because (u, v) ∉ E, so that the resulting output set could not possibly be a clique (it would already have to include u and v).

3. (1 mark) Based on your answers for the above questions, describe the output that should be returned if the input graph has size greater than zero (that is, V is nonempty) but there are no edges in the graph.

Answer: The output should be a set that includes exactly one vertex in V (and, any vertex in V could be chosen).

4. (8 marks) Design and give pseudocode for a greedy method for this problem in which a “greedy choice” is a vertex whose degree is as large as possible. Your method should generate output that is valid — it should definitely be a clique — but it might not necessarily be optimal.

Don’t try to optimize this code — write this as a recursive function maxClique that takes an undirected graph as its input and returns a clique as its output value.

Note: It’s possible that you won’t have to write very much besides the pseudocode, in order to answer this question!

Answer: Algorithm Design:

Base Instances: These are the graphs with no vertices — that is, the graphs G = (V,E) where V = ∅. It will be sufficient to check whether V = ∅ in order to recognize these, and the empty set should always be returned as output when a base instance is given as input.

Greedy Choice: If a graph G = (V,E) is given as input where V is nonempty then the greedy choice should be a vertex v ∈ V whose degree is as large as possible.

Construction of a Smaller Instance from a Non-Base Instance G = (V,E) and a Greedy Choice v ∈ V: Use the induced subgraph Ĝ = (V̂, Ê), where V̂ is the set of neighbours of v, as a derived smaller instance formed from G and v.

Recovery of a Solution for the Original Instance from a Solution for the Smaller Instance: Add the greedy choice to the solution for the smaller instance, to obtain a solution for the original instance.

131

Page 132: CPSC 413 Lecture Notes | Part IIpages.cpsc.ucalgary.ca/~eberly/Courses/CPSC413/1998/...Neapolitan and Naimipour [9] (in Section 2.1). 1.4 Merge Sort On way to sort an array of length

Pseudocode: It is assumed here that the input is an undirected graph G = (V,E).

function maxClique(G)
    if V = ∅ then
        return ∅
    else
        Choose a vertex v ∈ V such that the degree of v is greater than
            or equal to the degree of every vertex u ∈ V.
        Set Ĝ to be the induced subgraph on the set of vertices that are
            neighbours of v in G.
        return maxClique(Ĝ) ∪ {v}
    end if
end function

Note that this corresponds to the above components and the “algorithm template” in the chapter on Greedy Methods. If your pseudocode is clear enough so that the marker can discover what the above “design components” would be from it, and your pseudocode is consistent with the above information (or, with the instructions for this question and your answers for the first three), then you’ll receive full marks for the pseudocode alone. (However, the above “design documentation” will be useful, too, since it will be used as an aid by the marker when trying to award partial credit for incomplete solutions.)
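In executable form, the same heuristic might be sketched in Python as follows (our conventions: the graph is a dictionary adj mapping each vertex to the set of its neighbours):

    def max_clique(adj):
        # Greedy clique heuristic: valid (returns a clique) but not optimal.
        if not adj:
            return set()                              # base instance: no vertices
        v = max(adj, key=lambda u: len(adj[u]))       # greedy choice: max degree
        neighbours = adj[v]
        sub = {u: adj[u] & neighbours for u in neighbours}   # induced subgraph
        return max_clique(sub) | {v}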

5. (8 marks) Prove that the algorithm you gave in your answer for the previous question — or, indeed, any greedy method using the greedy choice mentioned in that question — is incorrect.

The following graph G = (V,E) (such that V = {A,B,C,D,E, F,G} and E includes seven edges) will probably be useful as you try to answer this question.

[Figure: the graph G on the vertices A, B, C, D, E, F, and G; B is adjacent to A, C, D, and E, and the vertices E, F, and G form a triangle.]

Answer: Consider the above graph G = (V,E).

The degree of the vertex B is four, and the degree of every other vertex is at most three. Therefore, B must be the greedy choice when the previous algorithm (or, any other algorithm corresponding to the “greedy choice” identified in the question) is executed, with this graph as input.

Therefore, the vertex B will be included in the output set and the algorithm will be applied recursively to the induced subgraph Ĝ = (V̂, Ê), where V̂ is the set of neighbours of B in G.


It is clear by inspection that V̂ = {A,C,D,E} and that Ê = ∅; that is, there are no edges in this induced subgraph. Therefore, the heuristic will return a clique of size one — including exactly one of the vertices A, C, D, or E — as its solution for the instance Ĝ = (V̂, Ê).

The algorithm will add B to this set and then terminate, so that its output (when given the above graph G as input) will be one of the four sets {A,B}, {B,C}, {B,D}, or {B,E}. Therefore, the output will always be a clique of size two.

However, {E,F,G} is a clique of size three in the input graph.

Therefore the output of the algorithm, on this input, is not a clique of maximum size, so the output is incorrect.

It follows that the algorithm is incorrect, as well.

6. (5 marks) Finally, sketch a brief proof that your algorithm always terminates and that it always returns a valid output set (that is, a clique) as its output. You may assume that the subroutines mentioned at the beginning of the test are all correct.

Hint: Consider using induction on the size of the input.

Answer:

Claim: The algorithm always terminates and returns a valid output (that is, a clique) when it is given an undirected graph G = (V,E) as input.

Proof of Claim: By induction on n = |V|.

Basis: If n = 0 then V is the empty set and this is a base instance of the problem. It follows that the algorithm will terminate immediately and return the empty set — a clique in the input graph — as output.

Inductive Step: Suppose n ≥ 0 and that the algorithm always terminates, and returns a clique of its input graph as output, whenever it is executed with a graph with at most n vertices as input.

Let G = (V,E) be an undirected graph such that |V| = n + 1.

Since |V| = n + 1 > 0 the set V is nonempty and finite. Since every vertex in this set has a degree, and this is a finite nonnegative integer, it is clear that there exists some vertex v in V such that the degree of v is greater than or equal to the degree of every (other) vertex u ∈ V. That is, a “greedy choice” exists and the algorithm succeeds in executing the first statement in the else part of its if-then-else structure.

Next, the algorithm forms the induced subgraph Ĝ = (V̂, Ê) such that V̂ is the set of neighbours of v in G. Since v is not a neighbour of itself, v ∉ V̂, so that |V̂| ≤ |V| − 1 = n.

It follows by the inductive hypothesis that this algorithm terminates and returns some clique Ŝ of Ĝ as output, when it is executed with Ĝ as input.

It is clear by inspection of the code that the algorithm terminates and returns a set S = Ŝ ∪ {v} if it is executed on input G, where Ŝ is the clique of Ĝ returned by the recursive call.

Obviously, S ⊆ V.


In order to prove that S is a clique it is necessary and sufficient to show that (r, s) ∈ E for every pair r and s of distinct elements of S. Suppose, then, that r, s ∈ S and that r ≠ s.

If r = v then s ∈ V̂, and (r, s) = (v, s) ∈ E, because V̂ only includes neighbours of v in G. By a symmetric argument (exchanging the roles of r and s), (r, s) ∈ E if s = v.

Finally, if neither r = v nor s = v, then r, s ∈ Ŝ. Then (r, s) ∈ Ê because Ŝ is a clique. This implies that (r, s) ∈ E, as desired, since Ê ⊆ E.

Therefore (r, s) ∈ E in every case — and S is a clique.

Since G was chosen to be an arbitrary undirected graph such that |V| = n + 1, this completes the inductive step — the algorithm terminates and returns a clique as output, whenever it is executed with an undirected graph that includes (at most) n + 1 vertices as its input.

It therefore follows by induction on n that the claim is correct.


Solution for Sample Test in Section 3.14.3

Consider the minimization problem “Offline Memory Management” and the greedy algorithm defined for it.

1. (15 marks) State the Greedy Choice Property that you must prove in order to show that this algorithm is correct, and then prove that this property is satisfied.

You may use the following fact without having to prove it:

Lemma: Given a non-base instance (including m requests, where m > 0), a greedy choice p1 for this instance, and a valid sequence p′1, p′2, . . . , p′m for this instance, there exists (another) valid sequence p̂1, p̂2, . . . , p̂m for the same instance that starts with the greedy choice, and that does not cause any more page faults than the sequence p′1, p′2, . . . , p′m does.

Solution: The Greedy Choice Property for this problem and algorithm can be stated as follows (this version is longer than necessary: more succinct versions that say essentially the same thing will receive full credit):

Claim: Let n and k be positive integers such that 1 ≤ k ≤ n, let C0 ⊆ {1, 2, . . . , n} such that |C0| = k, and let r1, r2, . . . , rm be a sequence of integers between 1 and n (comprising a “non-base” instance of the “Offline Memory Management” problem).

A greedy choice p1 exists for this instance. That is, there is an integer p1 between 1 and n such that: p1 = r1 if r1 ∈ C0; p1 ∈ C0 and p1 ∉ {r1, r2, . . . , rm} if r1 ∉ C0 and there is any element q of C0 such that q ∉ {r1, r2, . . . , rm}; and, finally, p1 ∈ C0 and p1 = ri for some integer i such that 1 ≤ i ≤ m, p1 ∉ {r1, r2, . . . , ri−1}, and q ∈ {r1, r2, . . . , ri−1} for every other element q of C0, otherwise.

Furthermore, if p1 is any greedy choice for this instance, then there is a correct solutionp1, p2, . . . , pm for this instance of the problem that begins with the greedy choice, p1.

Proof of Claim:

The existence of a “greedy choice” follows from the fact that m is finite and positive (so that the sequence r1, r2, . . . , rm is finite and nonempty), and that the set C0 ⊆ {1, 2, . . . , n} is finite and nonempty as well (since k = |C0| ≥ 1).

(To see that these really do imply the existence of a greedy choice, note that if r1 ∈ C0 then p1 = r1 satisfies the above condition. If r1 ∉ C0 but the “set difference” C0 \ {r1, r2, . . . , rm} is nonempty (so that the second case described for the greedy choice applies), then p1 can be chosen to be any element of this “set difference.” In the only remaining case, r1 ∉ C0 and C0 ⊆ {r1, r2, . . . , rm}, there must be a unique element p1 ∈ C0 whose first appearance in the sequence comes after those of all the other elements of C0, because the set C0 and the sequence r1, r2, . . . , rm are both nonempty and finite; this element p1 is a greedy choice in this last case.)
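The case analysis above translates directly into a procedure for computing a greedy choice. The following Python sketch is illustrative only (the function name and the list representation of r1, r2, . . . , rm are assumptions, not part of the test); it returns one greedy choice for a non-base instance.

    def greedy_choice(C0, requests):
        """Return one greedy choice p1, given the initial cache contents
        C0 (a nonempty set) and requests = [r1, ..., rm] with m >= 1."""
        r1 = requests[0]
        if r1 in C0:                     # first case: r1 is already cached
            return r1
        never_requested = C0 - set(requests)
        if never_requested:              # second case: some cached page is
            return min(never_requested)  # never requested (any one would do)
        # Remaining case: the cached page whose first request occurs
        # latest in the sequence.
        return max(C0, key=requests.index)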

Now let p1 be any such greedy choice.

Let p′1, p′2, . . . , p′m be any correct solution for this instance of the problem.

(By the way, it is not too difficult to see that a valid sequence exists, because the cache size k is greater than or equal to one; and, then, an optimal sequence exists, because the optimization function defined for this minimization problem — “number of page faults” — maps all valid sequences to nonnegative integers. The number of valid sequences is finite, so there must exist at least one valid sequence for which the value of the optimization function is minimal. That is, a correct solution does exist.)

Clearly, either p′1 = p1, or p′1 ≠ p1.

If p′1 = p1, then it suffices to set pi = p′i for 2 ≤ i ≤ m, so that p1, p2, . . . , pm and p′1, p′2, . . . , p′m are the same sequence. Then p1, p2, . . . , pm is a valid and optimal sequence that starts with p1, as required.

If p′1 ≠ p1 then the lemma implies that there exists a valid sequence p1, p2, . . . , pm that starts with p1 and that doesn’t cause any more page faults than p′1, p′2, . . . , p′m. However, p′1, p′2, . . . , p′m is an optimal sequence, so it doesn’t cause any more page faults than the sequence p1, p2, . . . , pm does, either. Therefore these two sequences cause the same number of page faults, and p1, p2, . . . , pm is optimal, since it is valid and p′1, p′2, . . . , p′m is optimal. Since it also starts with p1, the sequence p1, p2, . . . , pm has all the properties stated in the claim, as required to complete the proof.

Thus, the claim is correct.

Notes About This Proof: Of course, this proof only establishes that the claim is correct assuming that the lemma is. See the solution for Question #3 for a proof that the lemma is correct as well.

This proof is also longer than necessary: It would have been acceptable if the two paragraphs in parentheses (the second and fifth paragraphs in the proof) had been left out, so that you simply stated that it was “obvious” or “clear” that a greedy choice exists (because the set C0 is nonempty and finite and the sequence r1, r2, . . . , rm is nonempty and finite), and it would have been acceptable if you’d simply said, “Let p′1, p′2, . . . , p′m be a correct solution for this instance,” without even claiming that a solution had to exist at all (assuming it implicitly, instead).

However, to receive full marks for your statement of the Greedy Choice Property, you must have given a statement that clearly concerned “non-base instances,” made the point that at least one greedy choice for such an instance always exists, and that for any such greedy choice there is always a valid and optimal solution that contains it — and at least some effort must have been made to state this in terms of the problem and algorithm that was defined (rather than giving a statement that was completely “generic”). Furthermore, to receive full marks, your proof must have at least included a “claim” that greedy choices for non-base instances always exist, and an argument (perhaps, like the one above) that there would always be a correct solution containing any given greedy choice.

In case you were concerned by the fact that the output for this problem must be a sequence instead of a set: A finite sequence can actually be defined to be a special kind of set of ordered pairs of values, so that “the sequence r1, r2, . . . , rm” corresponds to a set {(1, r1), (2, r2), . . . , (m, rm)}. If you think of sequences as sets, in this way, and add the “constraint on the output set” that it must correspond to a sequence, then you should find that the above proof follows the outline given in the lecture notes pretty much exactly, as long as you considered “(1, p1)” to be the greedy choice.

On the other hand, the lectures and online notes also stated that the method could be applied when other structures besides sets were used — so if you were able to apply the method to this problem without trying to think of sequences as sets (probably the more likely case), then that’s perfectly acceptable, too.

Finally, in case you got stuck and were unable to apply the method to this problem: A completely “generic” definition of a Greedy Choice Property, and an outline of how you would prove that this property holds, that have nothing in particular to do with the above problem and algorithm, will receive up to ten marks for this question — provided that the statement and proof outline are both reasonably clear, complete, and correct.

2. (10 marks) Describe, as specifically and in as much detail as you can, what you would need to do in order to show that the Optimal Substructure Property is satisfied for this problem as well. It is not necessary to give a proof of this property.

Solution:

In order to prove the “Optimal Substructure Property,” you must prove that the constructions that have been given for

• defining a smaller instance of the problem from a non-base instance and any greedy choice for it, and

• recovering a solution for the original non-base instance from that instance, a greedy choice, the smaller instance derived from this using the first construction, and any correct solution for the smaller instance

are both correct — that is, they terminate and produce the expected outputs, whenever they receive well-formed inputs.

These constructions are given with the test. It should be clear, by inspection of these, that they both terminate. Therefore it would suffice to argue that they both “produce the expected outputs,” in order to complete a proof of the Optimal Substructure Property.

In order to prove this for the first construction, you should probably proceed by arguing that each of the following claims is correct (in the given order).

(a) The output produced by the construction is a “well-formed” instance of the problem.

(b) The instance returned as output by this construction is strictly smaller than the given one.

In order to prove this for the second construction, you should probably proceed by arguing that each of the following claims is correct (in order).

(a) The output of the second construction is “well formed”: In this case, this means that it’s a sequence of integers between 1 and n of the expected length, m.

(b) The output is a valid sequence for the instance of the problem that was given.

(c) This sequence is also optimal: It does not cause any more page faults than any other valid sequence would for the same instance.

The above answer would receive almost full marks — specifically, eight out of ten. Some of the following details would be required in order to earn full marks — for example, all of the details that follow about the first construction would suffice — and an answer that included all of the following, correctly, will receive a few “bonus” marks, as well.

Correctness of the First Construction: It is clear by inspection that this construction uses the same parameters n and k as for the given instance, so these are clearly integers such that 1 ≤ k ≤ n (assuming, of course, that the given instance was “well formed”).

Since the given instance is a non-base instance, m ≥ 1, and since p1 is a greedy choice, p1 is an integer between 1 and n (and, furthermore, p1 = r1 if r1 ∈ C0). Since C0 is a subset of {1, 2, . . . , n} with size k (assuming, again, that the given instance was “well formed”), this implies that the set C′0 = (C0 \ {p1}) ∪ {r1} is a subset of {1, 2, . . . , n} of size k as well, as desired.

Furthermore, the “derived” instance includes the sequence r′1, r′2, . . . , r′m′, where m′ = m − 1 ≥ 0 and r′i = ri+1 for 1 ≤ i ≤ m′. This clearly implies that the integers r′1, r′2, . . . , r′m′ are all between 1 and n — implying that the output returned by the first construction is a “well formed” instance of the problem, if the given instance was “well formed” as well.

Finally, the “size” of an instance was defined to be the length m of the sequence r1, r2, . . . , rm of requests that it included. Thus the given instance has size m ≥ 1 and the derived instance has size m′ = m − 1 < m. That is, the derived instance is strictly smaller than the original one, as required.
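Here is a sketch of the first construction, in the same illustrative Python representation as above (packaging an instance as the tuple (n, k, C0, requests) is an assumption of the sketch, not part of the test):

    def derive_smaller_instance(n, k, C0, requests, p1):
        """First construction: the smaller instance obtained by serving
        the first request r1 using the greedy choice p1."""
        r1 = requests[0]
        C0_prime = (C0 - {p1}) | {r1}   # equals C0 whenever r1 is in C0
        return (n, k, C0_prime, requests[1:])   # m' = m - 1 requests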

Correctness of the Second Construction: The output returned by the second construction is the sequence p1, p2, . . . , pm, where p1 was the greedy choice and pi+1 = p′i for 1 ≤ i ≤ m′ = m − 1, where p′1, p′2, . . . , p′m′ is a correct solution for the derived smaller instance.

Therefore pi is an integer between 1 and n for all i: This is true for i = 1 because p1 was a “greedy choice,” and it is true for i > 1 because pi = p′i−1 and p′1, p′2, . . . , p′m′ is a valid sequence for the derived smaller instance (which includes the same parameter n).

This sequence also has the required length, m, since m′ = m − 1. Therefore, the output generated by the second construction is “well formed”: it’s a sequence of m integers that are each between 1 and n.
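The second construction is even shorter; again, the Python packaging is only illustrative:

    def recover_solution(p1, smaller_solution):
        """Second construction: prepend the greedy choice p1 to a correct
        solution p'1, ..., p'(m-1) for the smaller instance."""
        return [p1] + list(smaller_solution)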

It’s been noted already that the smaller instance included the requests r′1, r′2, . . . , r′m−1, where r′i = ri+1 for 1 ≤ i ≤ m − 1, and the original instance included the requests r1, r2, . . . , rm.

Similarly, the second construction returns as output the sequence p1, p2, . . . , pm, where p′i = pi+1 for 1 ≤ i ≤ m − 1, and p′1, p′2, . . . , p′m−1 is the given correct solution for the smaller instance.

It is also clear by inspection that C′0 (the initial contents of the cache, in the smaller instance) is the same set as C1, the contents of the cache that would be obtained using the original instance after serving the first request, r1, using the greedy choice p1.

At this point, a completely straightforward induction on i can be used to prove that, for 0 ≤ i ≤ m − 1, the contents of the cache C′i obtained after serving the first i requests for the smaller instance, using p′1, p′2, . . . , p′i, are the same as the contents of the cache Ci+1 obtained after serving the first i + 1 requests for the original instance, using p1, p2, . . . , pi+1. Furthermore, pi+1 ∈ Ci, and pi+1 = ri+1 if ri+1 ∈ Ci.

This implies that the output p1, p2, . . . , pm returned by the second construction is a valid sequence for the original instance.

Let F be the number of page faults caused when the sequence p1, p2, . . . , pm is used to serve the requests in the original instance, and let F′ be the number of page faults caused when the sequence p′1, p′2, . . . , p′m−1 is used to serve the requests in the smaller instance. Then, one more thing can be established as part of (or, as a consequence of) the above inductive proof: F = F′ if r1 ∈ C0, and F = F′ + 1 if r1 ∉ C0.
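Since the remainder of the argument repeatedly compares fault counts for different replacement sequences, it may help to see the serving process spelled out. The following Python sketch (names and representation assumed, as before) serves a request sequence with a given replacement sequence, enforcing the validity rules stated above and counting page faults; F and F′ are exactly what it would report for the original and smaller instances.

    def count_page_faults(C0, requests, swaps):
        """Serve requests with the replacement sequence swaps =
        [p1, ..., pm], checking validity and counting page faults."""
        cache, faults = set(C0), 0
        for r, p in zip(requests, swaps):
            if r in cache:
                assert p == r       # validity: p_i = r_i if r_i is cached
            else:
                assert p in cache   # validity: evict a cached page
                cache = (cache - {p}) | {r}
                faults += 1
        return faults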

Now, since the Greedy Choice Property is satisfied, there is a correct solution p1, q2, q3, . . . , qm for the original instance that begins with p1. Using an argument similar to the one given above, it can be demonstrated that the sequence q2, q3, . . . , qm is a valid sequence for the smaller instance of the problem. Furthermore, if G is the number of page faults caused when the sequence p1, q2, . . . , qm is used to serve the requests in the original instance, and G′ is the number of page faults caused when the sequence q2, q3, . . . , qm is used to serve the requests in the smaller instance, then G = G′ if r1 ∈ C0 and G = G′ + 1 if r1 ∉ C0, as well.

Note that F − F′ = G − G′ in either case. However (adding F′ and subtracting G on both sides), this implies that F − G = F′ − G′.

Since the sequence p1, q2, q3, . . . , qm is an optimal sequence for the original instance, it cannot cause more page faults than the valid sequence p1, p2, . . . , pm, so F ≥ G and F − G ≥ 0.

However, the optimal sequence p′1, p′2, . . . , p′m−1 for the smaller instance cannot cause more page faults than the valid sequence q2, q3, . . . , qm, so F′ − G′ ≤ 0.

Since F − G = F′ − G′, it follows that F − G = F′ − G′ = 0. Therefore, the sequence p1, p2, . . . , pm (output by the construction) is optimal, since it is valid and causes the same number of page faults as the optimal sequence p1, q2, q3, . . . , qm.

Since the construction returns a correct solution (for this “arbitrarily chosen” input), this construction is correct — as is needed to complete the proof of the Optimal Substructure Property.

3. (5 marks) Very Challenging Bonus Question: Try to prove that the “lemma” given in Question 1 is correct.

Solution: Here, again, is the lemma:

Lemma: Given a non-base instance (including m requests, where m > 0), a greedy choice p1 for this instance, and a valid sequence p′1, p′2, . . . , p′m for this instance, there exists (another) valid sequence p1, p2, . . . , pm for the same instance that starts with the greedy choice, and that does not cause any more page faults than the sequence p′1, p′2, . . . , p′m does.

Proof: Either p1 = p′1, or p1 ≠ p′1.

If p1 = p′1 then the result is trivial, since it suffices to set pi = p′i for 2 ≤ i ≤ m as well — that is, to set p1, p2, . . . , pm to be the same sequence as p′1, p′2, . . . , p′m — in order to satisfy the claim (clearly, p1, p2, . . . , pm will cause exactly as many page faults as p′1, p′2, . . . , p′m, in this case).

Suppose, then, that p1 ≠ p′1. Then it must be the case that r1 ∉ C0 — for otherwise, it would have to be the case that p1 = p′1 = r1, since p1 is the greedy choice (and, therefore, equal to r1), and p′1 is at the beginning of a valid sequence for these requests (so, p′1 = r1 as well).

Let C′0, C′1, . . . , C′m be the sequence of sets obtained by serving the requests in the given instance using the sequence p′1, p′2, . . . , p′m, so that C′i represents the contents of the cache just after the ith request ri has been served. (Of course, C′0 = C0.)

Since r1 ∉ C0, a page fault occurs at time 1, both when the sequence p′1, p′2, . . . , p′m is used, and when any sequence starting with p1 is used, as well. Let C1 be the contents of the cache after the first request is served using p1; then

C1 = (C0 \ {p1}) ∪ {r1}

and

C′1 = (C0 \ {p′1}) ∪ {r1},

so that

|C1 ∩ C′1| = k − 1, C1 = (C1 ∩ C′1) ∪ {p′1}, and C′1 = (C1 ∩ C′1) ∪ {p1}.
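These set identities are easy to check numerically. A tiny Python illustration follows; the particular values k = 3, C0 = {1, 2, 3}, r1 = 4, p1 = 1, and p′1 = 2 are assumptions made for the example only.

    # Two distinct replacement choices p1 and p1' for the same fault.
    C0, r1, p1, p1_prime = {1, 2, 3}, 4, 1, 2
    C1       = (C0 - {p1}) | {r1}         # {2, 3, 4}
    C1_prime = (C0 - {p1_prime}) | {r1}   # {1, 3, 4}
    common = C1 & C1_prime                # {3, 4}: exactly k - 1 pages
    assert len(common) == len(C0) - 1
    assert C1 == common | {p1_prime}
    assert C1_prime == common | {p1}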

A final point should now be made: Since k ≥ 1, it should be clear that there is some valid sequence that starts with p1. Indeed, since the Greedy Choice Property has been proved, there is even an optimal sequence that starts with p1.

Now, we’ll attempt to construct successive values p2, p3, . . . (and sets C2, C3, . . . that correspond to them) for as long as possible, in order to maintain the following properties: If p1, p2, . . . , pi have been chosen so far, then

• There is some valid sequence that starts with p1, p2, . . . , pi;

• A page fault occurred at time j using a valid sequence that begins with p1, p2, . . . , pi if and only if it occurred at time j using the sequence p′1, p′2, . . . , p′m, for all j such that 1 ≤ j ≤ i;

• |Cj ∩ C′j| = k − 1, Cj = (Cj ∩ C′j) ∪ {p′1}, and C′j = (Cj ∩ C′j) ∪ {p1}, for all j such that 1 ≤ j ≤ i.

Note that this has been established already for the case i = 1.

Suppose now that 1 ≤ i ≤ m and that we’ve managed to find p1, p2, . . . , pi satisfying the above properties.

If i = m then p1, p2, . . . , pm is a valid sequence (because of the first property above, and the fact that there is no valid sequence whose length is greater than m), it clearly starts with p1, and it causes exactly as many page faults as p′1, p′2, . . . , p′m. Thus, p1, p2, . . . , pm is a sequence satisfying all the properties claimed in the lemma, in this case.

If i < m and ri+1 ∈ Ci ∩ C′i, then it must be the case that p′i+1 = ri+1, since ri+1 ∈ C′i and the sequence p′1, p′2, . . . , p′m is valid.

The above properties imply that there is at least one valid sequence that begins with the sequence p1, p2, . . . , pi. Since ri+1 ∈ Ci, every one of these sequences must also begin with p1, p2, . . . , pi+1, where pi+1 = ri+1 as well — so there exists at least one valid sequence beginning with p1, p2, . . . , pi+1 when pi+1 = ri+1 in this case, too.

In this case, no page fault is caused at time i + 1, either when the sequence p′1, p′2, . . . , p′m is used or when a valid sequence beginning with p1, p2, . . . , pi+1 is used (for pi+1 = ri+1, as above). Therefore, Ci+1 = Ci and C′i+1 = C′i, so |Ci+1 ∩ C′i+1| = |Ci ∩ C′i| = k − 1, Ci+1 = (Ci+1 ∩ C′i+1) ∪ {p′1}, and C′i+1 = (Ci+1 ∩ C′i+1) ∪ {p1} — the above properties are satisfied for i + 1 as well as i.

Suppose next that i < m and ri+1 ∉ Ci ∪ C′i; then page faults are caused at time i + 1, both when the sequence p′1, p′2, . . . , p′m is used, and when any valid sequence beginning with p1, p2, . . . , pi is used. Since p′1, p′2, . . . , p′m is a valid sequence, and the above properties are satisfied for i,

p′i+1 ∈ C′i = (Ci ∩ C′i) ∪ {p1}.

Thus, either p′i+1 ∈ Ci ∩ C′i, or p′i+1 = p1. It will be useful to consider these two “subcases” separately.

If p′i+1 ∈ Ci ∩ C′i then we should choose pi+1 = p′i+1. It can be argued (without too much trouble) that there is a valid sequence that begins with p1, p2, . . . , pi+1, and this choice ensures that |Ci+1 ∩ C′i+1| = k − 1, Ci+1 = (Ci+1 ∩ C′i+1) ∪ {p′1}, and C′i+1 = (Ci+1 ∩ C′i+1) ∪ {p1}, so that the above properties are satisfied for i + 1, in this case as well.

On the other hand, if p′i+1 = p1, then it isn’t possible to choose pi+1 to be p1, because p1 ∉ Ci. We will choose pi+1 = p′1, instead. Now, a different set of properties has been established (or, can easily be proved) for i + 1. We’ll call these the Set Equality properties:

• There is a valid sequence that starts with p1, p2, . . . , pi+1;

• For every integer j such that 1 ≤ j ≤ i + 1, if a page fault occurs at time j when any valid sequence starting with p1, p2, . . . , pi+1 is used, then a page fault occurs at time j when the sequence p′1, p′2, . . . , p′m is used, as well. Thus, the number of page faults occurring between times 1 and i + 1 when p′1, p′2, . . . , p′m is used is at least as large as the number of page faults occurring at these times when a valid sequence starting with p1, p2, . . . , pi+1 is used, instead;

• Ci+1 = C′i+1.

If none of the above cases holds then either ri+1 = p1 or ri+1 = p′1, because p1 and p′1 are the only elements of Ci ∪ C′i that don’t belong to Ci ∩ C′i as well.

Suppose i < m and ri+1 = p′1, so that a page fault occurs at time i + 1 when the sequence p′1, p′2, . . . , p′m is used (since p′1 ∉ C′i), but a page fault does not occur at this time when a valid sequence starting with p1, p2, . . . , pi is used instead (because p′1 ∈ Ci). In this case, since p′1, p′2, . . . , p′m is a valid sequence,

p′i+1 ∈ C′i = (Ci ∩ C′i) ∪ {p1},

so either p′i+1 ∈ Ci ∩ C′i, or p′i+1 = p1. Once again, it will be useful to consider these two “subcases” separately.

If p′i+1 ∈ Ci ∩ C′i, then

C′i+1 = (C′i \ {p′i+1}) ∪ {p′1}
      = (((Ci ∩ C′i) ∪ {p1}) \ {p′i+1}) ∪ {p′1}
      = (((Ci ∩ C′i) ∪ {p1}) ∪ {p′1}) \ {p′i+1}
      = (Ci ∪ {p1}) \ {p′i+1}.

On the other hand, since ri+1 = p′1 ∈ Ci, we must choose pi+1 = p′1, in order to ensure that a valid sequence starting with p1, p2, . . . , pi+1 exists (and, one does!). Then Ci+1 = Ci.

This implies that |Ci+1 ∩ C′i+1| = k − 1, but now C′i+1 = (Ci+1 ∩ C′i+1) ∪ {p1} and Ci+1 = (Ci+1 ∩ C′i+1) ∪ {p′i+1}.

The following properties have now been established for i + 1. We’ll call these the Page Faults To Spare properties:

• There exists a valid sequence beginning with p1, p2, . . . , pi+1;

• At least one more page fault is caused between times 1 and i + 1 when the sequence p′1, p′2, . . . , p′m is used than is caused at these times when a valid sequence starting with p1, p2, . . . , pi+1 is used instead;

• |Ci+1 ∩ C′i+1| = k − 1.

If p′i+1 = p1, instead, then, once again, we are forced to choose pi+1 = ri+1 = p′1 in order to ensure that a valid sequence starting with p1, p2, . . . , pi+1 exists. Now, we’ve achieved a set of properties for i + 1 that you’ve seen before — namely, the “Set Equality” properties.

We are left with only one other case — namely, that the initial set of properties holds for i, i < m, and ri+1 = p1. However, since p1 ≠ p′1 (so that p1 ≠ r1) and p1 was the greedy choice, either p1 is not in the set {r1, r2, . . . , rm} at all, or the first appearance of p1 in this sequence comes after the first appearances of all the other members of C0 in this sequence — including p′1. Since p1 = ri+1 it is clear that p1 ∈ {r1, r2, . . . , rm}, so (since “ri+1” is the first appearance of p1 in the sequence) there must have been some integer j such that 1 ≤ j ≤ i and rj = p′1. However, this contradicts the fact that the original set of properties has been established for i — because it implies that either the sets Cj and C′j are not related as these properties state, or that the page faults at time j could not have been related in the way these properties require. Either way, we have a contradiction — so this is not possible, either. Thus this case (ri+1 = p1) cannot hold, and can therefore be ignored.

Now, one of four things has been achieved: Either

(a) i = m, and the needed sequence has been constructed;

(b) i < m, but the original set of properties has been established for i + 1 as well as for i (so that this construction of the sequence can continue);

(c) i < m, but the Set Equality properties have been established for i + 1;

(d) i < m, but the Page Faults To Spare properties have been established for i + 1.

It follows that either the original set of properties holds for i = m (because we can continue to iterate until the case i = m is reached), or one of the Set Equality or Page Faults To Spare sets of properties is established for some integer (“i + 1”) between two and m. It’s already been shown that the needed sequence exists in the first of these cases, so all we need to do is show that the needed sequence exists in the other two cases, as well, in order to complete the proof of the lemma.

Suppose that the “Set Equality” properties have been established for i + 1, so that the values p1, p2, . . . , pi+1 have been found, Ci+1 = C′i+1, and the other properties included in the “Set Equality” properties also hold. If i + 1 = m then we are finished, because a valid sequence p1, p2, . . . , pm causing no more page faults than p′1, p′2, . . . , p′m (and starting with the greedy choice) would have been obtained. Otherwise, it suffices to consider the sequence

p1, p2, . . . , pi+1, p′i+2, p′i+3, . . . , p′m

that ends with the same sequence of m − i − 1 values as the original sequence did. This clearly starts with the greedy choice, and it can be proved (using induction) that this sequence is valid, and that it does not cause any more page faults than the original sequence did, as required.

Finally, suppose that the “Page Faults To Spare” properties have been achieved for the integer i + 1, instead. If m = i + 1 then, once again, we’ve found a valid sequence p1, p2, . . . , pm that starts with the greedy choice and, in this case, the sequence causes fewer page faults than the use of p′1, p′2, . . . , p′m would. That is, we’ve found a sequence with the properties given in the lemma.

If i + 1 < m, instead, then we’ll try to extend the sequence p1, p2, . . . , pi+1 again, in order to achieve either the “Set Equality” properties or the “Page Faults To Spare” properties for one of the integers i + 2, i + 3, . . . , m.

Suppose, now, that the “Page Faults To Spare” properties have been established for an integer j such that i + 1 ≤ j ≤ m. These properties have been established if j = i + 1; as pointed out above, we’re finished (and we’ve found the sequence we need) if j = m.

Now suppose j < m.

If either rj+1 ∈ Cj ∩ C′j, or rj+1 ∉ Cj ∪ C′j, then a page fault occurs at time j + 1 when a valid sequence starting with p1, p2, . . . , pj is used if and only if a page fault occurs at time j + 1 when the sequence p′1, p′2, . . . , p′m is used, and it is possible to choose pj+1 (just as in the corresponding cases above) so that either the “Page Faults To Spare” properties or the “Set Equality” properties are achieved for j + 1 — you should review the earlier part of this proof, if this isn’t clear. If the “Page Faults To Spare” properties are achieved for j + 1 then we can continue to iterate this construction, and if the “Set Equality” properties are achieved, then

p1, p2, . . . , pj+1, p′j+2, p′j+3, . . . , p′m     (4.1)

is a valid sequence with the properties given in the lemma (as needed).

On the other hand, if rj+1 ∈ Cj ∪ C′j but rj+1 ∉ Cj ∩ C′j, then either rj+1 ∈ Cj \ C′j or rj+1 ∈ C′j \ Cj.

If rj+1 ∈ Cj \ C′j then a page fault occurs at time j + 1 when the sequence p′1, p′2, . . . , p′m is used, but one does not occur at this time if a valid sequence beginning with p1, p2, . . . , pj is used (and we’re forced to choose pj+1 = rj+1 in order to ensure that a valid sequence beginning with p1, p2, . . . , pj+1 exists). Now, either p′j+1 ∈ Cj ∩ C′j or p′j+1 ∈ C′j \ Cj, and either the “Page Faults To Spare” properties or the “Set Equality” properties are achieved at time j + 1, depending on which is the case. As above, we can continue to iterate the construction if the “Page Faults To Spare” properties are achieved, and we can conclude that the sequence shown in equation (4.1) is a valid sequence satisfying the needed properties, otherwise.

Finally, if rj+1 ∈ C′j \ Cj then a page fault occurs at time j + 1 when a valid sequence beginning with p1, p2, . . . , pj is used, but a page fault does not occur at this time if the sequence p′1, p′2, . . . , p′m is used, instead.

Since there were strictly more page faults between times 1 and j using p′1, p′2, . . . , p′j than there were at these times using a valid sequence beginning with p1, p2, . . . , pj, it is still the case that there are at least as many page faults caused between times 1 and j + 1 if the sequence p′1, p′2, . . . , p′m is used as there are at these times if a valid sequence starting with p1, p2, . . . , pj+1 is used instead.

Since a page fault did not occur at time j + 1 using the sequence p′1, p′2, . . . , p′m, it must be the case (since this sequence is valid) that p′j+1 = rj+1 and C′j+1 = C′j.

On the other hand, since a page fault did occur at this time when the other sequence (being discussed) was used, rj+1 must be the unique element of the set C′j \ Cj = C′j+1 \ Cj. In this case we should choose pj+1 to be the unique element of the set Cj \ C′j = Cj \ C′j+1. Then it can be shown that a valid sequence starting with p1, p2, . . . , pj+1 does exist — and Cj+1 = C′j+1. Thus, the “Set Equality” properties have now been established at time j + 1. Once again, we can conclude at this point that the sequence shown in equation (4.1) is a valid sequence, starting with p1, that does not cause any more page faults than p′1, p′2, . . . , p′m, as desired.

At this point we have — finally — shown that the desired sequence exists in every possible case, and we can conclude that the lemma is correct.

Note: Clearly, this proof (or any other correct proof of the lemma) deserves more than five bonus marks; the mark was set at five in order to encourage students to work on Questions #1 and #2 before spending time on this bonus question.

If it’s discovered that students have given correct proofs for Question #3 (but made mistakes on, or were unable to complete, one or the other of the first two questions), then the marking scheme for “Question #3” will be reconsidered.

Final Remarks About Paging

It should be clear that the “paging” algorithm that is the subject of this test isn’t one that you could generally use in practice, because it requires knowledge of future requests in order to decide which page to “swap out” of the cache, in order to serve the current one.

Therefore, you might wonder why anyone would bother studying this algorithm at all.

More reasonable (that is, more practical) algorithms are “online” algorithms: These are algorithms that decide which page to swap out of the cache at time i (if a page fault has occurred) using only the information that is currently available — namely, the initial contents of the cache and the requests that have been made already.

It turns out that the optimal “offline” algorithm (which we’ve studied) is useful because its performance gives a “benchmark” to which the performance of more practical “online” algorithms may be compared.

Here are a few things to note, in particular:

It isn’t sufficient to measure the number of page faults as a function of the “input size” alone: There are arbitrarily long sequences of requests that cause no page faults at all (because they never ask for anything that isn’t in the initial contents of the cache). On the other hand, if m + k ≤ n then there are also sequences of length m such that a page fault must occur at every time between 1 and m (because the members of C0 and the integers r1, r2, . . . , rm are all distinct). For this problem, you must measure the number of page faults as a function of the input itself, and not just as a function of the input size.

Furthermore, the choice of paging strategy (represented here by the pi’s) can make a significant difference, too. Suppose, for example, that k ≥ 2, C0 contains the integer 2 but not the integer 1, and consider the sequence of requests (of arbitrarily large length m)

1, 2, 1, 2, 1, 2, . . .

An “intelligent” choice of p1 would be any member of C0 except for 2: Then C1 would contain both 1 and 2, and there would never be another page fault after the first one when the above sequence of requests was served. That is, the total number of page faults caused would be 1.

A “foolish” choice of p1, p2, . . . , pm would be the sequence

2, 1, 2, 1, 2, 1, . . .

This is valid (one could prove this by induction on m), but it also causes a page fault to occur at every time between 1 and m.

So, here is a sequence where the “optimal” number of page faults is 1, but where you could manage to swap pages in the cache in order to cause m page faults to occur, instead.
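The gap is easy to reproduce. In the Python sketch below the values k = 2 and C0 = {2, 5} are assumed for the example (any C0 containing 2 but not 1 would do), and faults() only consults the replacement sequence when a fault actually occurs.

    def faults(C0, requests, swaps):
        cache, count = set(C0), 0
        for r, p in zip(requests, swaps):
            if r not in cache:              # page fault: evict p, load r
                cache = (cache - {p}) | {r}
                count += 1
        return count

    m = 10
    requests = [1, 2] * (m // 2)            # 1, 2, 1, 2, ...
    # "Intelligent": evict 5 (not 2) on the first fault; one fault total.
    print(faults({2, 5}, requests, [5] + requests[1:]))   # prints 1
    # "Foolish": 2, 1, 2, 1, ... always evicts the page requested next.
    print(faults({2, 5}, requests, [2, 1] * (m // 2)))    # prints 10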

Next you should note that every “online” paging strategy can be considered to be an “offline” strategy as well — it’s just one that chooses to “ignore” the future requests. Therefore the number of page faults caused using the best sequence that an offline strategy could choose for an input (namely, the one described above) is a valid lower bound for the number of page faults caused when any online strategy is used to solve the same instance of this paging problem, instead.

This might not seem very useful if the lower bound weren’t very realistic (or “tight”). Fortunately — provided that you’re willing to compare the performance of the online strategy using one cache size to the performance of the optimal offline strategy, for the same instance, when another (smaller) cache size is used — it can be proved that there is an online strategy (called “Least Recently Used”) that is not significantly worse than the offline strategy: If you use a cache for the online strategy that’s larger by a constant factor (say, twice as large) than the cache for the offline strategy, then the numbers of page faults differ only by a constant factor, as well.
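For reference, here is a minimal Python sketch of the “Least Recently Used” strategy itself; the OrderedDict representation, and the arbitrary initial recency order chosen for C0, are assumptions of the sketch.

    from collections import OrderedDict

    def lru_faults(C0, requests):
        """Page faults caused by the online LRU strategy."""
        cache = OrderedDict((page, None) for page in C0)
        faults = 0
        for r in requests:
            if r in cache:
                cache.move_to_end(r)        # r is now most recently used
            else:
                cache.popitem(last=False)   # evict the least recently used
                cache[r] = None
                faults += 1
        return faults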

Since m is not a “constant,” this is clearly much better evidence that “Least Recently Used” is a good “online” strategy than you could (easily) provide without considering the offline strategy as well.

If you’re interested in this, you can find the details (including a proof of the above result about the relative performance of the two strategies) in a paper by D. D. Sleator and R. E. Tarjan [12].

Bibliography

[1] Alfred V. Aho, John E. Hopcroft, and Jeffrey D. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley, 1974.

[2] Gilles Brassard and Paul Bratley. Fundamentals of Algorithmics. Prentice Hall, 1996.

[3] P. Bürgisser, M. Clausen, and M. A. Shokrollahi. Algebraic Complexity Theory. Springer, 1997.

[4] D. Coppersmith and S. Winograd. Matrix multiplication via arithmetic progressions. Journal of Symbolic Computation, 9:251–280, 1990.

[5] Thomas H. Cormen, Charles E. Leiserson, and Ronald L. Rivest. Introduction to Algorithms. McGraw-Hill/MIT Press, 1990.

[6] Ellis Horowitz, Sartaj Sahni, and Sanguthevar Rajasekaran. Computer Algorithms/C++. Computer Science Press, 1996.

[7] A. Karatsuba and Y. Offman. Multiplication of multidigit numbers on automata. Soviet Physics Doklady, 7:714–716, 1963.

[8] Donald E. Knuth. The Art of Computer Programming, Volume 2: Seminumerical Algorithms. Addison-Wesley, third edition, 1997.

[9] Richard Neapolitan and Kumarss Naimipour. Foundations of Algorithms Using C++ Pseudocode. Jones and Bartlett, second edition, 1997.

[10] A. Schönhage and V. Strassen. Schnelle Multiplikation großer Zahlen. Computing, 7:281–292, 1971.

[11] Robert Sedgewick. Algorithms in C, Parts 1–4: Fundamentals, Data Structures, Sorting, Searching. Addison-Wesley, 1998.

[12] D. D. Sleator and R. E. Tarjan. Amortized efficiency of list update and paging rules. Communications of the ACM, 28:202–208, 1985.

[13] Volker Strassen. Gaussian elimination is not optimal. Numerische Mathematik, 14:354–356, 1969.

Index

Divide and Conquer
    description, 7
    example: Binary Search, 8
    example: Integer Multiplication, 11
    example: Matrix Multiplication, 15
    example: Merge Sort, 9
    implementation, 18

Dynamic Programming
    description, 33
    example: Fibonacci Numbers, 27
    example: Matrix Chain Problem, 43
    example: Optimal Fee Problem, 37
    storage space, 31

Greedy Choice Property
    description, 69
    proof outline, 70

Greedy Methods
    algorithm template, 64
    description, 64
    example: Activity Selection Problem, 82
    example: Optimal Fee Problem, 60
    example: Optimal Fee Problem, correctness, 75
    proof of correctness, 69
    proof of incorrectness, 66

Maximization Problem, 63

Memoization
    description, 36
    example: Fibonacci numbers, 29

Minimization Problem, 63

Optimal Substructure Property
    description for Greedy Methods, 69
    for Divide and Conquer, 33
    for Dynamic Programming, 33
    proof outline for Greedy Methods, 71

Optimization Problem, 63

Overlapping Subproblems Property, 34
