Dynamic Programming

Study Guide for ES205

Yu-Chi Ho, Jonathan T. Lee
Jan. 11, 2001

Outline

• Sample Problem
• General Formulation
• Linear-Quadratic Problem
• General Problems

Path-Cost Problem

• Find the path with minimal cost

[Figure: path network over stages 1, 2, 3, 4, ..., N, with a cost labeled on each path segment]

Principle of Optimality

“An optimal policy has the property that whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision.”

Path-Cost Problem (cont.)

[Figure, repeated over three slides: the same path network, with additional values marked on each successive slide]

Formulation for the Path-Cost Problem

• $J_0^0(x_N)$: terminal cost as a function of position $x_N$

• $J_1^0(x_{N-1})$: cost-to-go at stage $N-1$ (with 1 step left to go) as a function of position $x_{N-1}$

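Formulation (cont.)

$$J_1^0(x_{N-1}) = \min_{\text{choice of path}} \left[ \text{cost of path segment} + J_0^0(x_N) \right]$$

More generally,

$$J_i^0(x_{N-i}) = \min_{\text{choice of path}} \left[ \text{cost of path segment} + J_{i-1}^0(x_{N-i+1}) \right]$$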
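The path-cost recursion can be sketched in a few lines of Python. The network below is hypothetical: its node names and segment costs are made up for illustration and are not the values from the slide figure. The sketch works backward from the terminal stage, storing the cost-to-go and the best next node for every position.

```python
# Hypothetical staged network (illustrative costs, not the slide's figure).
# costs[k][(a, b)] is the cost of the path segment from node a at stage k
# to node b at stage k + 1.
costs = [
    {("A", "B1"): 1, ("A", "B2"): 5},
    {("B1", "C1"): 3, ("B1", "C2"): 6, ("B2", "C1"): 2, ("B2", "C2"): 4},
    {("C1", "D"): 7, ("C2", "D"): 3},
]


def backward_pass(costs, terminal_cost):
    """Build cost-to-go and policy tables, one per stage, starting from the end."""
    cost_to_go = [dict(terminal_cost)]  # terminal cost at the last stage
    policy = []
    for stage in reversed(costs):
        best, choice = {}, {}
        for (a, b), c in stage.items():
            # cost of this path segment plus the cost-to-go of where it lands
            candidate = c + cost_to_go[0][b]
            if a not in best or candidate < best[a]:
                best[a], choice[a] = candidate, b
        cost_to_go.insert(0, best)
        policy.insert(0, choice)
    return cost_to_go, policy


def optimal_path(start, policy):
    """Follow the stored best choices forward from the start node."""
    path = [start]
    for choice in policy:
        path.append(choice[path[-1]])
    return path


cost_to_go, policy = backward_pass(costs, {"D": 0})
best_cost = cost_to_go[0]["A"]
best_path = optimal_path("A", policy)
print(best_cost, best_path)
```

Each segment cost is examined once, whereas enumerating every complete path grows exponentially with the number of stages.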

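General Formulation

• Multistage optimization problem:

$$\min_{u(0),\ldots,u(N-1)} J = \phi(x(N)) + \sum_{i=0}^{N-1} L(x(i), u(i), i)$$

• The cost-to-go

$$J_i^0(x(N-i)) = \min_{u(N-i)} \left[ L(x(N-i), u(N-i), N-i) + J_{i-1}^0(x(N-i+1)) \right]$$

with initial condition

$$J_0^0(x(N)) = \phi(x(N))$$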
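For finite sets of states and controls, the cost-to-go recursion can be tabulated directly. The following is a minimal sketch under stated assumptions: the function name `solve_dp`, the transition function `step`, and the toy problem are all hypothetical. The table `J[k]` is indexed by time `k`, so it corresponds to the cost-to-go with `N - k` steps left.

```python
def solve_dp(states, controls, step, stage_cost, terminal_cost, N):
    """Tabulate J[k][x], the optimal cost from state x at time k, and the minimizing control."""
    J = [dict() for _ in range(N + 1)]
    policy = [dict() for _ in range(N)]
    for x in states:
        J[N][x] = terminal_cost(x)  # initial condition of the recursion
    for k in range(N - 1, -1, -1):
        for x in states:
            best_u, best = None, float("inf")
            for u in controls:
                x_next = step(x, u, k)
                if x_next not in J[k + 1]:
                    continue  # this control would leave the state grid
                total = stage_cost(x, u, k) + J[k + 1][x_next]
                if total < best:
                    best, best_u = total, u
            J[k][x] = best
            policy[k][x] = best_u
    return J, policy


# Toy problem (assumed for illustration): integer state, unit moves, quadratic costs.
J, policy = solve_dp(
    states=range(4),
    controls=(-1, 0, 1),
    step=lambda x, u, k: x + u,
    stage_cost=lambda x, u, k: x * x + u * u,
    terminal_cost=lambda x: x * x,
    N=2,
)
print(J[0][2], policy[0][2])
```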

Multistage or Optimal Control Problem

• Can be approached as a static optimization problem with specialized equality (staircase) constraints

• See study guides titled “Dynamic Systems”

• The equivalence of these two approaches (dynamic programming and constrained static optimization) will be made clear below in the solution of a specific class of problems

Linear-Quadratic Problem

$$\min_{u(i)} J = \frac{1}{2} a(N) x(N)^2 + \sum_{i=0}^{N-1} \left[ \frac{1}{2} a(i) x(i)^2 + \frac{1}{2} b(i) u(i)^2 \right]$$

subject to linear system dynamics

$$x(i+1) = f(i) x(i) + g(i) u(i)$$

given the initial state $x(0)$, where
• $x(i)$ is the state variable at time $i$
• $u(i)$ is the control variable at time $i$
• $a(i)$ and $b(i)$ are the cost factors at time $i$

LQ Problem (cont.)

$$J_0^0(x(N)) = \frac{1}{2} a(N) x(N)^2$$

$$J_1^0(x(N-1)) = \min_{u(N-1)} \left[ \frac{1}{2} a(N-1) x(N-1)^2 + \frac{1}{2} b(N-1) u(N-1)^2 + J_0^0(x(N)) \right]$$

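LQ Problem (cont.)

• Substitute

$$x(N) = f(N-1) x(N-1) + g(N-1) u(N-1)$$

into $J_1^0(x(N-1))$

• Set

$$\frac{\partial J_1^0(x(N-1))}{\partial u(N-1)} = 0$$

• We get

$$u(N-1) = -\frac{a(N) f(N-1) g(N-1) x(N-1)}{b(N-1) + a(N) g(N-1)^2}$$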

LQ Problem (cont.)

• With some work, we have

$$u(N-1) = -\frac{1}{b(N-1)} g(N-1) a(N) x(N)$$

• Let

$$\lambda(N) = a(N) x(N)$$

• Then, we have

$$u(N-1) = -\frac{1}{b(N-1)} g(N-1) \lambda(N)$$
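One way to fill in the step described as "some work" is to differentiate the one-step cost-to-go with the chain rule before eliminating $x(N)$. The following is a sketch that uses only the cost and the dynamics of the Linear-Quadratic Problem; it is not taken from the original slides.

```latex
\begin{aligned}
x(N) &= f(N-1)\,x(N-1) + g(N-1)\,u(N-1), \\
0 &= \frac{\partial}{\partial u(N-1)}
     \Big[ \tfrac12 a(N-1)\,x(N-1)^2 + \tfrac12 b(N-1)\,u(N-1)^2 + \tfrac12 a(N)\,x(N)^2 \Big] \\
  &= b(N-1)\,u(N-1) + a(N)\,x(N)\,\frac{\partial x(N)}{\partial u(N-1)} \\
  &= b(N-1)\,u(N-1) + a(N)\,g(N-1)\,x(N), \\
u(N-1) &= -\frac{1}{b(N-1)}\, g(N-1)\, a(N)\, x(N).
\end{aligned}
```

Replacing $x(N)$ by $f(N-1)x(N-1) + g(N-1)u(N-1)$ in the last line and solving for $u(N-1)$ gives the explicit form $u(N-1) = -a(N) f(N-1) g(N-1) x(N-1) / \left( b(N-1) + a(N) g(N-1)^2 \right)$.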

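LQ Problem (cont.)

• With

$$J_0^0(x(N)) = \frac{1}{2} \lambda(N) x(N)$$

$$J_1^0(x(N-1)) = \min_{u(N-1)} \left[ \frac{1}{2} a(N-1) x(N-1)^2 + \frac{1}{2} b(N-1) u(N-1)^2 + \frac{1}{2} \lambda(N) x(N) \right]$$

$$u(N-1) = -\frac{1}{b(N-1)} g(N-1) \lambda(N)$$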

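LQ Problem (cont.)

• Substitute the optimum $u(N-1)$; then we have

$$J_1^0(x(N-1)) = \frac{1}{2} \left[ a(N-1) x(N-1) + f(N-1) a(N) x(N) \right] x(N-1)$$

• Define

$$\lambda(N-1) = f(N-1) \lambda(N) + a(N-1) x(N-1)$$

so that

$$J_1^0(x(N-1)) = \frac{1}{2} \lambda(N-1) x(N-1)$$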

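LQ Problem (cont.)

$$J_2^0(x(N-2)) = \min_{u(N-2)} \left[ \frac{1}{2} a(N-2) x(N-2)^2 + \frac{1}{2} b(N-2) u(N-2)^2 + \frac{1}{2} \lambda(N-1) x(N-1) \right]$$

$$u(N-2) = -\frac{1}{b(N-2)} g(N-2) \lambda(N-1)$$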

LQ Problem (cont.)

• By induction, the optimal solution is

$$u(i) = -\frac{1}{b(i)} g(i) \lambda(i+1)$$

where

$$\lambda(i) = f(i) \lambda(i+1) + a(i) x(i)$$

with boundary condition

$$\lambda(N) = a(N) x(N)$$
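The optimal solution couples a forward recursion in the state with a backward recursion in the multiplier, so it cannot be evaluated in a single pass as written. A common way around this, sketched below, is to assume the multiplier is proportional to the state, `lambda(i) = p[i] * x(i)`. That assumption, the backward recursion for `p`, and all function names and numbers here are illustrative and do not come from the slides. The script then checks numerically that the resulting trajectory satisfies the control law, the multiplier recursion, and the boundary condition.

```python
def lq_backward_sweep(f, g, a, b):
    """Compute p[0..N] backward, assuming lambda(i) = p[i] * x(i)."""
    N = len(f)
    p = [0.0] * (N + 1)
    p[N] = a[N]  # boundary condition: lambda(N) = a(N) x(N)
    for i in range(N - 1, -1, -1):
        p[i] = a[i] + f[i] ** 2 * p[i + 1] * b[i] / (b[i] + g[i] ** 2 * p[i + 1])
    return p


def lq_simulate(x0, f, g, a, b):
    """Run the closed-loop system forward from x0 using the gains implied by p."""
    N = len(f)
    p = lq_backward_sweep(f, g, a, b)
    x, u = [x0], []
    for i in range(N):
        gain = p[i + 1] * f[i] * g[i] / (b[i] + g[i] ** 2 * p[i + 1])
        u.append(-gain * x[i])
        x.append(f[i] * x[i] + g[i] * u[i])
    lam = [p[i] * x[i] for i in range(N + 1)]
    cost = 0.5 * a[N] * x[N] ** 2 + 0.5 * sum(
        a[i] * x[i] ** 2 + b[i] * u[i] ** 2 for i in range(N)
    )
    return x, u, lam, cost, p


# Illustrative data (assumed): N = 3 stages with time-varying coefficients.
f = [1.0, 0.9, 1.1]
g = [0.5, 1.0, 0.8]
a = [1.0, 2.0, 1.0, 3.0]  # a(0), ..., a(N)
b = [1.0, 0.5, 2.0]
x, u, lam, cost, p = lq_simulate(1.0, f, g, a, b)
N = len(f)

# Residuals of the three relations in the optimal solution; all should be ~0.
control_residual = max(abs(u[i] + g[i] * lam[i + 1] / b[i]) for i in range(N))
costate_residual = max(abs(lam[i] - (f[i] * lam[i + 1] + a[i] * x[i])) for i in range(N))
boundary_residual = abs(lam[N] - a[N] * x[N])
print(control_residual, costate_residual, boundary_residual)
```

Under the same assumption the optimal cost equals one half of `p[0]` times the square of the initial state, which gives an independent check on the simulated cost.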

General Problems

• Stochastic problems
• Combinatorial problems
• Variable termination time
• Constraints in the problem

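Stochastic Problem

$$\min_{u(0),\ldots,u(N-1)} J = E\left[ \phi(x(N), \xi) + \sum_{i=0}^{N-1} L(x(i), u(i), \xi, i) \right]$$

• The cost-to-go

$$J_i^0(x(N-i)) = \min_{u(N-i)} E\left[ L(x(N-i), u(N-i), \xi, N-i) + J_{i-1}^0(x(N-i+1)) \right]$$

$$J_0^0(x(N)) = E\left[ \phi(x(N), \xi) \right]$$

where $\xi$ denotes the random quantity over which the expectation $E$ is taken.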
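When the random quantity takes finitely many values, the expectation in the cost-to-go becomes a probability-weighted sum. The sketch below assumes a hypothetical toy problem: the function name `solve_stochastic_dp`, the clamped dynamics, the noise distribution, and the costs are all made up for illustration. For simplicity the terminal cost here does not depend on the noise.

```python
def solve_stochastic_dp(states, controls, noise, step, stage_cost, terminal_cost, N):
    """Tabulate the expected cost-to-go J[k][x] and the minimizing control."""
    J = [dict() for _ in range(N + 1)]
    policy = [dict() for _ in range(N)]
    for x in states:
        J[N][x] = terminal_cost(x)
    for k in range(N - 1, -1, -1):
        for x in states:
            best_u, best = None, float("inf")
            for u in controls:
                # expectation over the noise of stage cost plus next cost-to-go
                expected = sum(
                    prob * (stage_cost(x, u, w, k) + J[k + 1][step(x, u, w, k)])
                    for w, prob in noise
                )
                if expected < best:
                    best, best_u = expected, u
            J[k][x] = best
            policy[k][x] = best_u
    return J, policy


# Toy problem (assumed): state in {0, 1, 2}, noise adds 0 or 1 with equal probability,
# and the next state is clamped to stay on the grid.
J, policy = solve_stochastic_dp(
    states=[0, 1, 2],
    controls=[-1, 0],
    noise=[(0, 0.5), (1, 0.5)],
    step=lambda x, u, w, k: min(2, max(0, x + u + w)),
    stage_cost=lambda x, u, w, k: x * x + u * u,
    terminal_cost=lambda x: x * x,
    N=2,
)
print(J[0], policy[1])
```

The resulting policy is a feedback rule: the control is chosen from the state actually reached, not fixed in advance, since the state depends on the realized noise.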

Combinatorial Problem

$$\min_{x(1),\ldots,x(N)} \sum_{i=1}^{N} L_i(x(i))$$

• The cost-to-go

$$J_i^0(x(N-i)) = \min_{x(N-i)} \left[ L_{N-i}(x(N-i)) + J_{i-1}^0(x(N-i+1)) \right]$$

$$J_0^0(x(N)) = L_N(x(N))$$

References:

• Bellman, R., Dynamic Programming, Princeton University Press, 1957.
• Bryson, Jr., A. E. and Y.-C. Ho, Applied Optimal Control: Optimization, Estimation, and Control, Taylor & Francis, 1975.
• Dreyfus, S. E. and A. M. Law, The Art and Theory of Dynamic Programming, Academic Press, 1977.
• Ho, Y.-C., Lecture Notes, Harvard University, 1997.
• National Institute of Standards and Technology, Dictionary of Algorithms, Data Structures, and Problems, http://hissa.nist.gov/dads/HTML/principle.html
• Ortega, A. and K. Ramchandran, “Rate-Distortion Methods for Image and Video Compression: An Overview,” IEEE Signal Processing Magazine, Nov. 1998. http://sipi.usc.edu/~ortega/RD_Examples/boxDP.html
