Lecture 17: Maximum Principle
Optimal Control Problem
For the control system ẋ(t) = f(x(t), u(t), t), x(0) = x0, solve

min_{u(t), t ∈ [0, tf]} J(u) = ∫_0^{tf} F(x, u, t) dt + g(x(tf))

subject to ẋ = f(x, u, t), ∀t ∈ [0, tf]; x(0) = x0
• Solve for optimal control u∗ and optimal cost J∗
• Running cost F (x , u, t) ≥ 0 and terminal cost g(·) ≥ 0
• Free terminal time problem: tf may also need to be optimized
• Possible additional constraints:
  • State constraint x(t) ∈ Ωx(t) and control constraint u(t) ∈ Ωu(t)
  • Terminal state constraint x(tf) ∈ Sf
Examples
Example 1 (LQR)
• Dynamics ẋ = Ax + Bu, with x(0) = x0, x(tf) ∈ Sf
• Cost function ∫_0^{tf} (x^T Q x + u^T R u) dt
Example 2 (Brachistochrone Problem) Find the slide with the fastest descent time:

τ_descent = (1/√(2g)) ∫_0^{y0} √((1 + (dx/dy)^2) / y) dy

• Dynamics dx/dy = u, with fixed x(0) = 0, x(y0) = x0
• Cost function (1/√(2g)) ∫_0^{y0} √((1 + u^2) / y) dy
Dynamic Programming Approach
Value function V(x, t) is the optimal cost over the time interval [t, tf] starting from x(t) = x:

V(x, t) := min_{u(τ), τ ∈ [t, tf]} ∫_t^{tf} F(x, u, τ) dτ + g(x(tf))

subject to ẋ = f(x, u, τ), ∀τ ∈ [t, tf]; x(t) = x
Optimality Principle: assuming u(·) ≡ v in [t, t + δ] for some small δ > 0,

V(x, t) = min_v [ ∫_t^{t+δ} F(x, v, τ) dτ + V(x(t + δ), t + δ) ] + o(δ)

• x(t + δ) = x(t) + f(x(t), v, t) δ + o(δ)
• Taylor series expansion of V (·, ·) at (x(t), t)
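The optimality principle above can be checked numerically by backward dynamic programming on a grid. The sketch below (assuming NumPy, with an illustrative scalar problem that is not from the slides) uses ẋ = u, F = x² + u², tf = 1, g ≡ 0, whose exact value function V(x, t) = tanh(tf − t)·x² follows from the Riccati equation derived on a later slide:

```python
import numpy as np

# Backward dynamic programming on a grid for an illustrative scalar problem
# (not from the slides): dx/dt = u, F = x^2 + u^2, tf = 1, g = 0.
# Its exact value function is V(x, t) = tanh(tf - t) * x^2, from the
# Riccati equation -p' = 1 - p^2, p(tf) = 0.
dt = 0.002
xs = np.arange(-4.0, 4.0 + 1e-9, 0.01)   # state grid
us = np.arange(-3.0, 3.0 + 1e-9, 0.05)   # candidate controls v
V = np.zeros_like(xs)                     # terminal condition V(x, tf) = 0
for _ in range(int(1.0 / dt)):            # march backward from tf to 0
    xnext = np.clip(xs[:, None] + us[None, :] * dt, xs[0], xs[-1])
    Vnext = np.interp(xnext.ravel(), xs, V).reshape(xnext.shape)
    stage = (xs[:, None] ** 2 + us[None, :] ** 2) * dt
    V = np.min(stage + Vnext, axis=1)     # Bellman update over v
i = np.argmin(np.abs(xs - 1.0))           # grid index closest to x = 1
print(V[i])                               # close to tanh(1) ~ 0.762
```

The grid value at x = 1, t = 0 comes out close to tanh(1) x², confirming the backward recursion V(x, t) = min_v [F δ + V(x + fδ, t + δ)] up to discretization error.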
Hamilton-Jacobi-Bellman equation
V (x , t) satisfies the Hamilton-Jacobi-Bellman (HJB) equation:
min_u [ F(x, u, t) + ∇x V(x, t) · f(x, u, t) ] = −∇t V(x, t)
with boundary condition: V (·, tf ) = g(·)
• A partial differential equation, typically solved backward in time
• Optimal control u∗ is the one achieving minimum above
• V may not be differentiable everywhere (viscosity solution)
Linear Quadratic Regulation
Continuous-time LQR problem:
min_{u(t), t ∈ [0, tf]} ∫_0^{tf} (x^T Q x + u^T R u) dt + x(tf)^T Qf x(tf)

subject to ẋ = Ax + Bu, x(0) = x0
• Value function is quadratic: V(x, t) = x^T P(t) x, with P(tf) = Qf
• HJB equation implies P(·) satisfies the Riccati differential equation
min_u [ x^T Q x + u^T R u + 2 x^T P (Ax + Bu) ] = −x^T Ṗ x

⇒ min_u [x; u]^T [ Q + P A + A^T P    P B
                    B^T P              R   ] [x; u] = −x^T Ṗ x

⇒ −Ṗ = Q + P A + A^T P − P B R^{-1} B^T P
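The Riccati differential equation can be integrated backward in time from P(tf) = Qf. A minimal NumPy sketch, with assumed system matrices (a double integrator, not from the slides); over a long horizon P(0) approaches the algebraic Riccati equation solution, which the residual check at the end confirms:

```python
import numpy as np

# Integrate the Riccati ODE  -dP/dt = Q + P A + A^T P - P B R^{-1} B^T P
# backward from P(tf) = Qf, for an illustrative double integrator (values
# are assumptions, not from the slides).  Over a long horizon, P(0)
# approaches the algebraic Riccati equation (ARE) solution.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
Rinv = np.array([[1.0]])        # R = I, so R^{-1} = I
tf, dt = 20.0, 1e-3

P = np.zeros((2, 2))            # P(tf) = Qf = 0 (no terminal cost)
for _ in range(int(tf / dt)):
    # One Euler step backward in time: P(t - dt) = P(t) + dt * (-dP/dt)
    P = P + dt * (Q + P @ A + A.T @ P - P @ B @ Rinv @ B.T @ P)

residual = Q + P @ A + A.T @ P - P @ B @ Rinv @ B.T @ P
print(np.linalg.norm(residual))  # tiny: P(0) has converged to the ARE solution
```

The Euler recursion and the ARE share the same fixed point, so the residual at t = 0 is essentially zero once the backward integration has converged.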
Pontryagin Maximum Principle
Suppose u∗(·), x∗(·) are a solution to the optimal control problem
min_u ∫_0^{tf} F(x, u, t) dt + g(x(tf))

s.t. ẋ = f(x, u, t), t ∈ [0, tf]; x(0) = x0

Then there exists a co-state λ∗(·) ∈ R^n such that

ẋ∗ = ∇λ H(x∗, u∗, λ∗, t)
λ̇∗ = −∇x H(x∗, u∗, λ∗, t) (adjoint equation)

where H(x, u, λ, t) := F(x, u, t) + λ^T f(x, u, t) is the Hamiltonian
Optimal control u∗ satisfies ∇uH(x∗, u, λ∗, t) = 0, or more generally
H(x∗, u∗, λ∗, t) = inf_u H(x∗, u, λ∗, t)
L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze, and E. F. Mishchenko, The Mathematical Theory of Optimal Processes, Interscience, 1962.
Connection with Dynamic Programming
• Via dynamic programming, optimal control is decided from

u∗ = arg min_u [ F(x∗, u, t) + ∇x V(x∗, t) · f(x∗, u, t) ]

• Via maximum principle, optimal control is decided from

u∗ = arg min_u H(x∗, u, λ∗, t) = arg min_u [ F(x∗, u, t) + λ∗ · f(x∗, u, t) ]

• Indeed, the co-state is the gradient of the value function w.r.t. the state:

λ∗(t) = ∇x V(x∗(t), t), ∀t ∈ [t0, tf]
Calculus of Variations
Perturb the control u(·) to u(·) + δu(·).

The state is perturbed from x(·) to x(·) + δx(·), with, to first order,

δẋ = ∇x f(x, u, t) δx + ∇u f(x, u, t) δu

(linearized dynamics around x(·))

Optimality condition (assume g ≡ 0): the constrained optimization problem

min_{δu, δx} δJ = ∫_0^{tf} [ ∇x F(x, u, t) δx + ∇u F(x, u, t) δu ] dt

s.t. δẋ = ∇x f(x, u, t) δx + ∇u f(x, u, t) δu, ∀t ∈ [0, tf]

achieves its minimum value of 0 at δu = 0 and δx = 0
Unconstrained Optimization
By introducing the Lagrange multiplier function λ(·), the constrained optimization problem is converted to an unconstrained one with Lagrangian

∫_0^{tf} [ (∇x H(x, u, λ, t) + λ̇^T) δx + ∇u H(x, u, λ, t) δu ] dt − λ^T(tf) δx(tf)

where H = F + λ^T f is the Hamiltonian defined before.

To achieve the minimum at δx = 0 and δu = 0, the coefficients of δx and δu should be zero, which implies the maximum principle.

Further, we require λ^T(tf) δx(tf) = 0:

1. If x(tf) is fixed, λ(tf) is unconstrained
2. If x(tf) is unconstrained, λ(tf) = 0
3. If x(tf) ∈ Sf, then λ(tf) ⊥ tangent space of Sf at x(tf) (transversality condition)

Extension to the g ≢ 0 case is straightforward (replace λ(tf) with λ(tf) − ∇x g(x(tf)))
Example
• Dynamics ẋ1 = u1, ẋ2 = u2, i.e., f(x, u, t) = [u1; u2]
• x(0) = x0, x(tf) ∈ Sf
• Cost ∫_0^{tf} (u1^2 + u2^2) dt, i.e., F(x, u, t) = u1^2 + u2^2
Solution via maximum principle
• Co-state λ = [λ1 λ2]^T
• Hamiltonian H(x, u, λ, t) = F + λ^T f = u1^2 + u2^2 + λ1 u1 + λ2 u2
• Co-state dynamics: λ̇∗ = −∇x H(x∗, u∗, λ∗, t) = 0; thus λ∗ is constant
• Optimal control: ∂H/∂ui = 2 ui∗ + λi∗ = 0; hence u∗ = −λ∗/2 is constant
• Transversality condition: u∗ (hence the optimal path) ⊥ Sf
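The conclusion that u∗ is constant and perpendicular to Sf can be cross-checked by discretizing the control and taking the minimum-norm solution of the terminal constraint. A sketch assuming NumPy, with illustrative data x0 = (1, 2), Sf = {x : x2 = 0}, tf = 1 (these values are assumptions, not from the slides):

```python
import numpy as np

# Discretized check of the example: dynamics x' = u, cost the integral of
# |u|^2, terminal set Sf = {x : x2 = 0}.  Illustrative data (assumptions):
# x0 = (1, 2), tf = 1, N piecewise-constant control pieces.  The minimum-
# norm control meeting the terminal constraint should be constant and
# perpendicular to Sf, as the maximum principle predicts.
x0 = np.array([1.0, 2.0])
tf, N = 1.0, 50
dt = tf / N
# Terminal constraint on x2 only: x0[1] + dt * sum_k u2_k = 0.
# u1 is unconstrained, so its minimum-norm value is identically 0.
Arow = dt * np.ones((1, N))
b = np.array([-x0[1]])
u2 = Arow.T @ np.linalg.solve(Arow @ Arow.T, b)   # minimum-norm solution
print(u2[:3])   # all pieces equal: the constant control -x0[1]/tf
```

The least-squares minimum-norm solution makes every control piece identical, u2 ≡ −x0,2/tf with u1 ≡ 0, i.e. a straight-line path orthogonal to Sf.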
Dubins Path
Vehicle dynamics:

ẋ = v0 cos θ
ẏ = v0 sin θ
θ̇ = u

with fixed x(0) = x0 and x(tf) = xf
• Constant speed v0 and bounded turn rate u ∈ [−1, 1]
• Cost J(u) = tf = ∫_0^{tf} 1 dt, i.e., F ≡ 1 (free terminal time problem)
• Shortest curve with bounded curvature connecting x0 and xf
Solution via maximum principle
• Hamiltonian H = 1 + v0(λ1 cos θ + λ2 sin θ) + λ3u
• Co-state dynamics:

  λ̇1 = 0 (i.e., λ1 is constant)
  λ̇2 = 0 (i.e., λ2 is constant)
  λ̇3 = v0 (λ1 sin θ − λ2 cos θ)

  and the optimal control is

  u∗ = +1 if λ3 < 0, u∗ = −1 if λ3 > 0, u∗ undetermined (singular) if λ3 = 0
Dubins Path

Fact: the optimal path is a concatenation of no more than three motion primitives
• S (straight): u ≡ 0
• L (left turn): u ≡ 1
• R (right turn): u ≡ −1
Further, the only possible combinations are LRL, RLR, LSL, LSR, RSL, RSR
“On curves of minimal length with a constraint on average curvature, and with prescribed initial and terminal positions and tangents”, L. E. Dubins, American Journal of Mathematics, 79:497–516, 1957.
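Each motion primitive is straightforward to simulate. A small sketch (assuming NumPy, with illustrative values v0 = 1, u ≡ +1) integrates the vehicle dynamics through a quarter left turn and lands on the exact unit-radius circular arc:

```python
import numpy as np

# Simulate one motion primitive of the Dubins car (assumed values v0 = 1,
# u = +1, i.e. an L segment).  A quarter left turn from (0, 0, 0) should
# trace the unit-radius circle and end near x = 1, y = 1, theta = pi/2.
v0, dt = 1.0, 1e-4
x, y, th = 0.0, 0.0, 0.0
for _ in range(int((np.pi / 2) / dt)):    # Euler integration of the dynamics
    x += v0 * np.cos(th) * dt
    y += v0 * np.sin(th) * dt
    th += 1.0 * dt                        # u = +1 (left turn)
print(x, y, th)                           # approximately (1, 1, pi/2)
```

Chaining three such segments with u ∈ {−1, 0, +1} produces any of the six candidate words above; the shortest word is then selected by comparing their total durations.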
A relook at LQR Problem
Continuous-time LQR problem:
minimize J(x, u) = (1/2) ∫_0^{tf} (x^T Q x + u^T R u) dt

subject to ẋ = Ax + Bu, t ∈ [0, tf], x(0) = x0
Hamiltonian function: H(x, u, λ) = (1/2)(x^T Q x + u^T R u) + λ^T (Ax + Bu)
By the Maximum Principle, the optimal u, x, λ satisfy

H_u = R u + B^T λ = 0 ⇒ u∗ = −R^{-1} B^T λ∗
ẋ∗ = H_λ = A x∗ − B R^{-1} B^T λ∗
λ̇∗ = −H_x = −Q x∗ − A^T λ∗

⇒ d/dt [x∗; λ∗] = [ A     −B R^{-1} B^T
                     −Q    −A^T         ] [x∗; λ∗]

with two-point boundary conditions x(0) = x0, λ(tf) = 0
Connection to the value function V(x, t) = x^T P(t) x: λ(t) = P(t) x(t)
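For the scalar case A = 0, B = Q = R = 1 (assumed values, not from the slides), the two-point boundary value problem can be solved in closed form via the matrix exponential of the Hamiltonian matrix, and the shooting value λ(0) matches the Riccati prediction λ(0) = P(0) x0 with P(0) = tanh(tf):

```python
import numpy as np

# Scalar LQR via the maximum principle, with assumed values A = 0 and
# B = Q = R = 1, so the Hamiltonian system is d/dt [x; lam] = M [x; lam]
# with M = [[0, -1], [-1, 0]].  Shooting on lam(0) to enforce lam(tf) = 0
# recovers the Riccati prediction lam(0) = P(0) x0 = tanh(tf) x0.
M = np.array([[0.0, -1.0], [-1.0, 0.0]])
tf, x0 = 2.0, 1.5
w, U = np.linalg.eigh(M)                  # M is symmetric
expM = U @ np.diag(np.exp(w * tf)) @ U.T  # matrix exponential e^{M tf}
# lam(tf) = expM[1,0] * x0 + expM[1,1] * lam0 = 0, solved for lam0:
lam0 = -expM[1, 0] * x0 / expM[1, 1]
print(lam0, np.tanh(tf) * x0)             # the two values agree
```

Here e^{M tf} = [[cosh tf, −sinh tf], [−sinh tf, cosh tf]], so the boundary condition λ(tf) = 0 gives λ(0) = tanh(tf) x0 exactly, confirming λ(t) = P(t) x(t).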
Optimal Control of Hybrid System

• Two modes 1 and 2 with domains D1 = {x : x2 ≥ 0} and D2 = {x : x2 ≤ 0}
• Identical dynamics f1 = f2, given by ẋ1 = u1, ẋ2 = u2
• Switching surface (guard) D1 ∩ D2
• Trivial reset condition

Suppose the two modes have different running costs:

F1(x, u, t) = u1^2 + u2^2,   F2(x, u, t) = 2 u1^2 + 2 u2^2
Optimal control problem: Among all solutions that start from x0 ∈ D1 at time 0, switch exactly once from mode 1 to mode 2, and end at xf ∈ D2 at a fixed terminal time tf, find the one with the least cost.
• With a switching time t1 ∈ (0, tf), the cost is

∫_0^{t1} F1(x, u, t) dt + ∫_{t1}^{tf} F2(x, u, t) dt
Variational Method
To see if u is optimal, perturb it to u + δu
• Switching time t1 is perturbed to t1 + δt1
• State trajectory is perturbed from x to x + δx (two segments)
• Cost J is perturbed to J + δJ
Optimality condition: For u, x to be optimal, the following problem should have optimal solution δx = 0 and δu = 0:

min_{δu} δJ subject to ODE(δx, δu), δx(0) = δx(tf) = 0

• Introduce a Lagrange multiplier (co-state) λ(t), t ∈ [0, tf], to convert the above constrained problem to an unconstrained one
• Integrate by parts and set the coefficients of δu and δx to zero
Optimality Condition
(Hybrid) Hamiltonian function H(x , u, λ, t) is defined as
H(x, u, λ, t) := F1(x, u, t) + λ^T f1(x, u, t) = u1^2 + u2^2 + λ1 u1 + λ2 u2, if x ∈ D1
H(x, u, λ, t) := F2(x, u, t) + λ^T f2(x, u, t) = 2 u1^2 + 2 u2^2 + λ1 u1 + λ2 u2, if x ∈ D2
Suppose u∗, x∗ are an optimal solution with switching time t1 ∈ (0, tf). Then there exists a co-state λ∗(t), t ∈ [0, tf], such that

ẋ∗ = H_λ(x∗, u∗, λ∗, t), t ∈ [0, t1) ∪ (t1, tf]
λ̇∗ = −H_x(x∗, u∗, λ∗, t), t ∈ [0, t1) ∪ (t1, tf]

Moreover, u∗ satisfies ∇u H(x∗, u∗, λ∗, t) = 0, and the transversality condition

(λ(t1+) − λ(t1−)) ⊥ T_{x∗(t1)}(D1 ∩ D2)
Solving Optimality Condition
• As H does not depend on x, λ̇∗ = −H_x = 0 implies

λ∗(t) = λ⁻ for t ∈ [0, t1), λ∗(t) = λ⁺ for t ∈ (t1, tf]

• The transversality condition (λ(t1+) − λ(t1−)) ⊥ T_{x∗(t1)}(D1 ∩ D2) implies λ1⁻ = λ1⁺
• ∇u H(x∗, u∗, λ∗, t) = 0 implies u∗ is constant in each mode:

u∗ = −λ⁻/2 for t ∈ [0, t1), u∗ = −λ⁺/4 for t ∈ (t1, tf]
Finding Optimal Solution
Optimal u∗, x∗ are specified by the two vectors λ⁻, λ⁺ satisfying

λ1⁻ = λ1⁺
(λ2⁻/2) · t1 = x0,2 (the vertical coordinate of x0, so that the mode-1 segment reaches the switching surface x2 = 0)
−(λ⁻/2) · t1 − (λ⁺/4) · (tf − t1) = xf − x0

• Four unknowns and four equations
• In general admits a unique solution
• Solution determines u∗, x∗ (and t1)
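These conditions can be cross-checked by brute force: with constant control in each mode, the total cost depends only on the switching point (s, 0) and the switching time t1, and grid-minimizing it reproduces the transversality condition λ1⁻ = λ1⁺. A sketch assuming NumPy, with illustrative data x0 = (0, 1), xf = (2, −1), tf = 2 (assumptions, not from the slides):

```python
import numpy as np

# Brute-force check of the hybrid example, with assumed data x0 = (0, 1),
# xf = (2, -1), tf = 2.  With constant control in each mode, the cost of
# switching at point (s, 0) at time t1 is
#   J(s, t1) = |(s,0) - x0|^2 / t1 + 2 |xf - (s,0)|^2 / (tf - t1).
# Grid-minimizing J should reproduce lam1- = lam1+, i.e. the horizontal
# control in mode 1 equals twice that in mode 2.
x0, xf, tf = np.array([0.0, 1.0]), np.array([2.0, -1.0]), 2.0
s = np.arange(-1.0, 3.0, 0.005)           # candidate switching abscissas
t1 = np.arange(0.05, tf - 0.05, 0.005)    # candidate switching times
S, T = np.meshgrid(s, t1)
J = ((S - x0[0])**2 + x0[1]**2) / T + 2 * ((xf[0] - S)**2 + xf[1]**2) / (tf - T)
i, j = np.unravel_index(np.argmin(J), J.shape)
u1_minus = (S[i, j] - x0[0]) / T[i, j]          # mode-1 control, x1 component
u1_plus = (xf[0] - S[i, j]) / (tf - T[i, j])    # mode-2 control, x1 component
print(u1_minus, 2 * u1_plus)                    # approximately equal
```

Since λ1⁻ = −2 u1⁻ and λ1⁺ = −4 u1⁺, the continuity λ1⁻ = λ1⁺ is equivalent to u1⁻ = 2 u1⁺, which is exactly the stationarity of J in s at the grid minimizer.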
Snell’s Law
A light ray passing through a boundary between two isotropic media:

• v1, v2: velocities of light
• n1, n2: refractive indices

Snell's Law: sin θ1 / sin θ2 = v1 / v2 = n2 / n1
Hybrid Maximum Principle
A general theory of maximum principle for hybrid systems:
• H. J. Sussmann, A maximum principle for hybrid optimal control problems, CDC, pp. 425–430, 1999.
Notable references on optimal control of hybrid systems:
• S. C. Bengea and R. A. DeCarlo, Optimal control of switching systems, Automatica, 2005.
• X. Xu and P. Antsaklis, Optimal control of switched systems based on parameterization of the switching instants, TAC, 2004.
• M. Egerstedt, Y. Wardi and H. Axelsson, Transition-time optimization for switched-mode dynamical systems, TAC, 2006.
Embedding Technique
Optimal control problem for a switched system with σ(t) ∈ {1, . . . , m}:

min_{u(·), σ(·)} J = ∫_0^{tf} F_{σ(t)}(x(t), u(t), t) dt + g(x(tf), tf)

subject to ẋ(t) = f_{σ(t)}(x(t), u(t), t), ∀t ∈ [0, tf]; x(0) = x0    (1)

Idea: solve the optimal control problem for a non-switched system:

min_{u(·), Δ(·)} J = ∫_0^{tf} Σ_{i=1}^{m} Δi(t) · Fi(x(t), u(t), t) dt + g(x(tf), tf)

subject to ẋ(t) = Σ_{i=1}^{m} Δi(t) · fi(x(t), u(t), t)    (2)

• Non-switched system (2) has inputs u(·) and [Δ1(t) · · · Δm(t)] taking values in the m-simplex: Δi(t) ≥ 0, Σ_{i=1}^{m} Δi(t) = 1
• Under some mild conditions, solutions of (1) are dense in the solutions of (2)
• Under some mild conditions, solutions of (1) are dense in solutions of (2)
Embedding Technique
Hamiltonian of the embedded system:

H(x, u, Δ, λ, t) = Σ_{i=1}^{m} Δi · Hi(x, u, λ, t)

where Hi(x, u, λ, t) = Fi + λ^T fi is the Hamiltonian of the i-th subsystem
Optimal solution of the embedded system:

ẋ = H_λ = Σ_{i=1}^{m} Δi · fi(x, u, t),    λ̇ = −Σ_{i=1}^{m} Δi · ∂Hi/∂x (x, u, λ, t)

Optimal u and Δ: H(x, u∗, Δ∗, λ, t) = min_{i,u} Hi(x, u, λ, t)
• Generally, the optimal Δ∗ takes values at a corner of the m-simplex
• In some cases, Δ∗ takes interior values and the optimal costs of (1) and (2) differ
“Optimal control of switching systems”, S. C. Bengea and R. A. DeCarlo,Automatica, 2005.