Slides 10b


Transcript of Slides 10b

  • 7/31/2019 Slides 10b (slide 1/29)

  • Slide 2/29

    Given:

        \dot{x}(t) = f(x(t), u(t), t), \quad x(t_0) \text{ given}, \quad t \in [t_0, t_f]

    Find u(t) that minimizes

        J = \phi(x(t_f), t_f) + \int_{t_0}^{t_f} L(x(t), u(t), t)\, dt

    Use Lagrange multiplier functions \lambda(t):

        J_A = \phi(x(t_f), t_f) + \int_{t_0}^{t_f} L(x(t), u(t), t)\, dt + \int_{t_0}^{t_f} \lambda^T(t) \left[ f(x(t), u(t), t) - \dot{x}(t) \right] dt

  • Slide 3/29

    Define a Hamiltonian

        H(x(t), u(t), t) = L(x(t), u(t), t) + \lambda^T(t) f(x(t), u(t), t)

    Plug into the Lagrangian and integrate the \lambda^T(t)\dot{x}(t) term by parts:

        J_A = \phi(x(t_f), t_f) - \lambda^T(t_f) x(t_f) + \lambda^T(t_0) x(t_0) + \int_{t_0}^{t_f} \left[ H(x(t), u(t), t) + \dot{\lambda}^T x(t) \right] dt

    Its variation (w.r.t. changes in x and u) is:

        \delta J_A = \left[ \left( \frac{\partial \phi}{\partial x} - \lambda^T \right) \delta x \right]_{t=t_f} + \left[ \lambda^T \delta x \right]_{t=t_0} + \int_{t_0}^{t_f} \left[ \left( \frac{\partial H}{\partial x} + \dot{\lambda}^T \right) \delta x + \frac{\partial H}{\partial u} \delta u \right] dt

  • Slide 4/29

    Zeroing it out (+ keeping the dynamics requirements) leads to the Euler-Lagrange equations:

        \dot{x}(t) = f(x(t), u(t), t)
        \dot{\lambda}^T(t) = -\frac{\partial L}{\partial x} - \lambda^T(t) \frac{\partial f}{\partial x}
        \lambda^T(t_f) = \frac{\partial \phi}{\partial x}\bigg|_{t=t_f}
        \frac{\partial H}{\partial u} = 0
        x_0 \text{ given}

  • Slide 5/29

    If L and f are not explicit functions of t (happens often) then the Hamiltonian is constant (i.e. invariant) on the system's trajectory!

    On the extremal path both terms zero out, meaning that H(t) = constant.
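
The argument behind the invariance claim can be sketched as follows (a reconstruction along the lines of Bryson & Ho; the two grouped terms are the ones the slide says zero out on the extremal path):

```latex
\frac{dH}{dt}
  = \underbrace{\left(\frac{\partial H}{\partial x} + \dot{\lambda}^{T}\right)}_{=\,0
      \text{ since } \dot{\lambda}^{T} = -\partial H/\partial x} \dot{x}
  \;+\; \underbrace{\frac{\partial H}{\partial u}}_{=\,0 \text{ by optimality}} \dot{u}
  \;+\; \frac{\partial H}{\partial t}
  \;=\; \frac{\partial H}{\partial t}
```

Since \partial H / \partial t = \partial L / \partial t + \lambda^T \partial f / \partial t, it vanishes exactly when L and f do not depend explicitly on t, giving dH/dt = 0.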

  • Slide 6/29

    Given x(0), t_f and a linear system:

        \dot{x}(t) = A x(t) + B u(t)

    Find a path that brings (some of) x close to zero without allowing it to vary too much on the way and without too much control energy. I.e. find u that minimizes

        J = \frac{1}{2} x^T(t_f) S_f x(t_f) + \frac{1}{2} \int_{t_0}^{t_f} \left[ x^T(t) Q x(t) + u^T(t) R u(t) \right] dt

    where S_f, Q and R are positive definite.

    Note: this can easily be generalized to time-varying systems and costs.

    The solution will be very similar to the discrete case. The resulting u turns out to be a continuous state feedback rule.
  • Slide 7/29

    Solution:

    The Hamiltonian is

        H = \frac{1}{2} \left( x^T Q x + u^T R u \right) + \lambda^T \left( A x + B u \right)

    The end constraint on \lambda is

        \lambda(t_f) = S_f x(t_f)

    \lambda's diff. eq. is

        \dot{\lambda}(t) = -Q x(t) - A^T \lambda(t)

    The third Euler-Lagrange condition gives

        \frac{\partial H}{\partial u} = u^T R + \lambda^T B = 0

    Rearranging:

        u(t) = -R^{-1} B^T \lambda(t)

    This is a two-point boundary value problem with x known at t_0 and \lambda known at t_f. The two linear differential equations are coupled by u.

  • Slide 8/29

    Plugging u into the state dynamics gives

        \dot{x}(t) = A x(t) - B R^{-1} B^T \lambda(t)    (*)

    One can show (see Bryson & Ho, sec. 5.2) that there exists a time-varying matrix S(t) that provides a linear relation between \lambda and x:

        \lambda(t) = S(t) x(t)

    Plugging this into the above state dynamics gives

        \dot{x}(t) = \left( A - B R^{-1} B^T S(t) \right) x(t)

    i.e. u(t) = -R^{-1} B^T S(t) x(t) is a linear state feedback. Similarly to the discrete case, to know the control rule we simply need to find S(t).

    We plug \lambda = S x into \dot{\lambda} = -Q x - A^T \lambda and get

        \dot{S} x + S \dot{x} = -Q x - A^T S x

    Next we plug (*) into the above equation and after a bit of rearranging get

        \left[ \dot{S} + S A + A^T S - S B R^{-1} B^T S + Q \right] x = 0

    Since x(t) \neq 0 (otherwise there is no need for regulation) we can drop x and get...

  • Slide 9/29

    ...

        -\dot{S}(t) = S(t) A + A^T S(t) - S(t) B R^{-1} B^T S(t) + Q

    This is a quadratic differential equation in S with a boundary condition S(t_f) = S_f.

    This equation is known as a matrix Riccati equation. It can be solved (e.g. by numeric integration from S_f backwards) to get S(t).

    Usually, this dynamic system is stable and reaches a steady state S as t_f - t \to \infty. The Matlab function care solves for S (if a real-valued solution exists).

    S is good for regulating the system for a long duration (i.e. forever).

    Since this differential equation is quadratic, it may have more than one solution. The desired solution is PSD. Starting from S = 0 (instead of S = S_f) and numerically integrating until convergence (i.e. till \dot{S} = 0) will give the PSD solution (see Bryson & Ho, sec. 5.4).

    We will see (in a future lecture) that the HJB equations show that J = \frac{1}{2} x^T(t_0) S(t_0) x(t_0). (This is also true for the discrete case, where we used the notation P instead of S.)
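
The slide mentions Matlab's care; SciPy exposes an equivalent steady-state solver. The sketch below uses an illustrative double-integrator system (A, B, Q, R values are placeholders, not from the slides):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Illustrative double-integrator system (placeholder values).
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)          # state cost weight
R = np.array([[1.0]])  # control cost weight

# Steady state of  -S' = SA + A^T S - S B R^{-1} B^T S + Q  (set S' = 0).
S = solve_continuous_are(A, B, Q, R)

# The optimal feedback gain: u = -K x with K = R^{-1} B^T S.
K = np.linalg.solve(R, B.T @ S)

# Sanity check: the algebraic Riccati residual should be numerically zero.
residual = S @ A + A.T @ S - S @ B @ np.linalg.solve(R, B.T) @ S + Q
```

The closed loop A - BK should then be stable, which is what makes S "good for regulating the system forever".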

  • Slide 10/29

    The same variation of J holds:

        \delta J_A = \left[ \left( \frac{\partial \phi}{\partial x} - \lambda^T \right) \delta x \right]_{t=t_f} + \left[ \lambda^T \delta x \right]_{t=t_0} + \int_{t_0}^{t_f} \left[ \left( \frac{\partial H}{\partial x} + \dot{\lambda}^T \right) \delta x + \frac{\partial H}{\partial u} \delta u \right] dt

    Suppose x_k(t_f) is given; then \delta x_k(t_f) = 0 and therefore we do not require

        \frac{\partial \phi}{\partial x_k}\bigg|_{t=t_f} - \lambda_k(t_f) = 0

    in order to zero out the variation.

    Similarly, if x_k(t_0) is not given, then \delta x_k(t_0) \neq 0 and to zero out \delta J_A we require that

        \lambda_k(t_0) = 0

    Note that if the system is not controllable the condition may be impossible to satisfy.

  • Slide 11/29

    The Euler-Lagrange equations are now

        \dot{x}(t) = f(x(t), u(t), t)
        \dot{\lambda}^T(t) = -\frac{\partial H}{\partial x} = -\frac{\partial L}{\partial x} - \lambda^T(t) \frac{\partial f}{\partial x}
        \lambda_k(t_f) = \frac{\partial \phi}{\partial x_k}\bigg|_{t=t_f} \quad \text{or} \quad x_k(t_f) \text{ given}
        \frac{\partial H}{\partial u} = 0
        \forall k: \ \lambda_k(0) = 0 \quad \text{or} \quad x_k(0) \text{ given}

  • Slide 12/29

    Find a path x(t), starting at rest (zero velocity & acceleration) from x_0 at time t_0 and ending at rest at x_f at time t_f, so that the squared cumulative change in acceleration (jerk) along the path is minimal.

    Define the state to be x = [x, v, a] (position, velocity, acceleration) and the control signal to be

        u(t) = \dot{a}(t)

    The cost is:

        J = \frac{1}{2} \int_{t_0}^{t_f} u^2(t)\, dt

    where L = \frac{1}{2} u^2 and \phi = 0.

    We solve for the one-dimensional (scalar position) case but the solution holds for the multidimensional case as well.

  • Slide 13/29

    The dynamics equations \dot{x}(t) = f(x(t), u(t), t) are

        f_1 = \dot{x} = v
        f_2 = \dot{v} = a
        f_3 = \dot{a} = u

    The Jacobian is therefore

        \frac{\partial f}{\partial x} = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}

    and (remembering L = \frac{1}{2} u^2)

        \frac{\partial L}{\partial x} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, \quad \frac{\partial f}{\partial u} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}

    so, by the Euler-Lagrange equations, we get

        \dot{\lambda}^T(t) = -\frac{\partial H}{\partial x} = -\frac{\partial L}{\partial x} - \lambda^T(t) \frac{\partial f}{\partial x} = -[0, 0, 0] - [\lambda_1, \lambda_2, \lambda_3] \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}

    i.e. \dot{\lambda}_1 = 0, \ \dot{\lambda}_2 = -\lambda_1, \ \dot{\lambda}_3 = -\lambda_2.

  • Slide 14/29

    We can solve these differential equations:

        \lambda_1 = c_1
        \lambda_2 = -c_1 t + c_2
        \lambda_3 = \frac{1}{2} c_1 t^2 - c_2 t + c_3

    The Hamiltonian (H = L + \lambda^T f) is

        H = \frac{1}{2} u^2(t) + \lambda_1 v + \lambda_2 a + \lambda_3 u

    so

        \frac{\partial H}{\partial u} = 0 \implies 0 = u + \lambda_3 \implies u = -\frac{1}{2} c_1 t^2 + c_2 t - c_3
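
The closed-form multipliers and control can be verified mechanically; this is a checking sketch (the constants below are arbitrary illustrative values), not part of the slides:

```python
import numpy as np
from numpy.polynomial import Polynomial as P

# Arbitrary illustrative constants.
c1, c2, c3 = 2.0, -3.0, 0.5

# Costates as polynomials in t (coefficients in ascending order).
lam1 = P([c1])                    # lambda_1 = c1
lam2 = P([c2, -c1])               # lambda_2 = -c1*t + c2
lam3 = P([c3, -c2, 0.5 * c1])     # lambda_3 = (1/2)c1 t^2 - c2 t + c3

# They must satisfy the costate ODEs: lam1' = 0, lam2' = -lam1, lam3' = -lam2.
assert np.allclose(lam1.deriv().coef, [0.0])
assert np.allclose(lam2.deriv().coef, (-lam1).coef)
assert np.allclose(lam3.deriv().coef, (-lam2).coef)

# dH/du = u + lambda_3 = 0 gives the optimal control u = -lambda_3.
u = -lam3
```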

  • Slide 15/29

    Plugging u into the dynamics equations:

        \dot{a} = u \implies a = -\frac{1}{6} c_1 t^3 + \frac{1}{2} c_2 t^2 - c_3 t + c_4
        \dot{v} = a \implies v = -\frac{1}{24} c_1 t^4 + \frac{1}{6} c_2 t^3 - \frac{1}{2} c_3 t^2 + c_4 t + c_5
        \dot{x} = v \implies x = -\frac{1}{120} c_1 t^5 + \frac{1}{24} c_2 t^4 - \frac{1}{6} c_3 t^3 + \frac{1}{2} c_4 t^2 + c_5 t + c_6

    Using the initial conditions x(0) = x_0 and v(0) = a(0) = 0 we see that c_4 = c_5 = 0 and c_6 = x_0, leaving us with

        a = -\frac{1}{6} c_1 t^3 + \frac{1}{2} c_2 t^2 - c_3 t
        v = -\frac{1}{24} c_1 t^4 + \frac{1}{6} c_2 t^3 - \frac{1}{2} c_3 t^2
        x = -\frac{1}{120} c_1 t^5 + \frac{1}{24} c_2 t^4 - \frac{1}{6} c_3 t^3 + x_0

  • Slide 16/29

    Using the final conditions x(t_f) = x_f and v(t_f) = a(t_f) = 0 we get 3 equations in 3 unknowns (with t_f as a parameter):

        \begin{bmatrix} 0 \\ 0 \\ x_f - x_0 \end{bmatrix} = \begin{bmatrix} -\frac{1}{6} t_f^3 & \frac{1}{2} t_f^2 & -t_f \\ -\frac{1}{24} t_f^4 & \frac{1}{6} t_f^3 & -\frac{1}{2} t_f^2 \\ -\frac{1}{120} t_f^5 & \frac{1}{24} t_f^4 & -\frac{1}{6} t_f^3 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix}

    Solving the above equations yields:

        c_1 = -(x_f - x_0) \frac{720}{t_f^5}, \quad c_2 = -(x_f - x_0) \frac{360}{t_f^4}, \quad c_3 = -(x_f - x_0) \frac{60}{t_f^3}

    Resulting in

        x(t) = x_0 + (x_f - x_0) \left( 6 \tau^5 - 15 \tau^4 + 10 \tau^3 \right)

    where \tau = t / t_f.
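
A short numerical sketch of the closed form and its boundary conditions (the values x0 = 0, xf = 1, tf = 100 are illustrative):

```python
import numpy as np

def min_jerk(t, t_f, x0, xf):
    """Minimum-jerk position, velocity, acceleration at time t."""
    tau = t / t_f
    d = xf - x0
    x = x0 + d * (6*tau**5 - 15*tau**4 + 10*tau**3)
    v = d / t_f * (30*tau**4 - 60*tau**3 + 30*tau**2)     # dx/dt
    a = d / t_f**2 * (120*tau**3 - 180*tau**2 + 60*tau)   # dv/dt
    return x, v, a

# Illustrative values.
t_f, x0, xf = 100.0, 0.0, 1.0
x_s, v_s, a_s = min_jerk(0.0, t_f, x0, xf)   # start of the path
x_e, v_e, a_e = min_jerk(t_f, t_f, x0, xf)   # end of the path

# The path starts and ends at rest at the prescribed positions.
assert np.isclose(x_s, x0) and np.isclose(x_e, xf)
assert np.isclose(v_s, 0.0) and np.isclose(v_e, 0.0)
assert np.isclose(a_s, 0.0) and np.isclose(a_e, 0.0)
```

The velocity profile is the symmetric bell 30(x_f - x_0) \tau^2 (1 - \tau)^2 / t_f, peaking at the midpoint.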

  • Slide 17/29

    [Figure: plots of the resulting trajectory x(t), v(t) (scale x10^-3), a(t) (scale x10^-4) and u(t) (scale x10^-5), for t from 0 to 100.]

  • Slide 18/29

    When t_f is not constrained it becomes a parameter of the variation problem and affects the cost of the solution. All the previous optimality conditions still apply and an extra condition is added:

        \left[ \frac{\partial \phi}{\partial t} + H \right]_{t=t_f} = 0

    Proof sketch (see Bryson & Ho, sec. 2.7 for the real proof):

    The Lagrange-multiplier-augmented cost is, as before,

        J_A = \phi(x(t_f), t_f) + \int_{t_0}^{t_f} L\, dt + \int_{t_0}^{t_f} \lambda^T(t) \left[ f - \dot{x} \right] dt

    The variation is now also in t_f. To simplify the proof we assume that x(t_f) is unchanged by the variation in t_f, that is x(t) = x(t_f) for t \in [t_f, t_f + \delta t_f].

  • Slide 19/29

    The resulting variation is

        \delta J_A = \delta J_A \big|_{t_f} + \left[ \frac{\partial \phi}{\partial t} + H \right]_{t=t_f} \delta t_f

    where \delta J_A |_{t_f} is the variation given t_f.

    Note that the same result is reached without the assumption on x not varying.

  • Slide 20/29

    This is an example of a common case where t_f is not given.

    Problem setting: given a dynamical system

        \dot{x}(t) = f(x(t), u(t), t)

    and some of the start/end state values x_k(t_0) and x_k(t_f), find the control that minimizes the following simple cost

        J = \int_{t_0}^{t_f} 1\, dt

    That is, J = t_f - t_0: a minimum-time problem.

  • Slide 21/29

    The Euler-Lagrange equations are (with L = 1, so \partial L / \partial x = 0):

        \dot{x}(t) = f(x(t), u(t), t), \quad \dot{\lambda}^T(t) = -\lambda^T(t) \frac{\partial f}{\partial x}, \quad \frac{\partial H}{\partial u} = \lambda^T \frac{\partial f}{\partial u} = 0

    We have 2n boundary constraints for 2n differential equations, m optimality constraints for m control variables, and one constraint for t_f.

  • Slide 22/29

    Suppose we have an extra requirement from the optimal trajectory: an integral constraint

        \int_{t_0}^{t_f} N(x(t), u(t), t)\, dt = c

    E.g. given a fixed amount of fuel, reach the destination with an empty tank (while obeying some optimality criteria). N(x, u, t) would be the fuel consumption rate and c the given fuel amount.

    Solution (by reduction to a regular problem with an extra state variable that has given boundary values): add a new state variable, x_{n+1}, with the following dynamics function

        f_{n+1} = \dot{x}_{n+1} = N(x(t), u(t), t)

  • Slide 23/29

    It is clear that

        x_{n+1}(t_f) - x_{n+1}(t_0) = \int_{t_0}^{t_f} N(x(t), u(t), t)\, dt

    Now require that

        x_{n+1}(t_0) = 0, \quad x_{n+1}(t_f) = c

    Obviously, the augmented system then obeys the integral constraint.

    (see Bryson & Ho, sec. 3.1)
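
The augmentation can be sketched numerically; the scalar dynamics f, consumption rate N and control signal below are illustrative placeholders, not from the slides:

```python
import numpy as np
from scipy.integrate import solve_ivp

def f(x, u):         # original dynamics  x' = f(x, u)  (placeholder)
    return -x + u

def N(x, u):         # consumption rate to be accumulated (placeholder)
    return u**2

def u_of_t(t):       # some fixed control signal, just for the demonstration
    return np.sin(t)

def augmented(t, z):
    x, x_aug = z
    u = u_of_t(t)
    return [f(x, u), N(x, u)]   # the extra state obeys x_{n+1}' = N

t0, tf = 0.0, 5.0
sol = solve_ivp(augmented, (t0, tf), [1.0, 0.0], rtol=1e-9, atol=1e-12)

# x_{n+1}(tf) - x_{n+1}(t0) equals the integral of N along the trajectory,
# so fixing x_{n+1}(t0) = 0 and x_{n+1}(tf) = c enforces the constraint.
total = sol.y[1, -1]
```

Here N = u^2 depends only on u, so the accumulated value can be checked against the closed-form integral of sin^2.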

  • Slide 24/29

    Maximum area under a curve of given length:

    We formulate it as a control problem: a particle moves with constant velocity (1) along the horizontal axis, starting at time 0 and ending at time a. The position along the x_1 axis changes as a function of the angle \theta, which is the control signal. The length of the particle's path (in the x_1-t plane) must be p, and the area under the particle is maximized.

    We assume that -\pi/2 \le \theta(t) \le \pi/2.

  • Slide 25/29

    The corresponding equations are

        \dot{x}_1(t) = \tan \theta(t), \quad \int_0^a \sqrt{1 + \tan^2 \theta(t)}\, dt = \int_0^a \frac{dt}{\cos \theta(t)} = p

    The cost is

        J = \int_0^a x_1(t)\, dt \quad \text{(to be maximized)}

  • Slide 26/29

    Solution:

    Add another state variable, x_2, and replace the integral constraint with

        \dot{x}_2(t) = \frac{1}{\cos \theta(t)}, \quad x_2(0) = 0, \quad x_2(a) = p

    The Hamiltonian is

        H = x_1 + \lambda_1 \tan \theta + \frac{\lambda_2}{\cos \theta}

    We know that it is a constant because L and f are not explicit functions of time (see previous slide).

    The Lagrange multipliers obey

        \dot{\lambda}_1 = -\frac{\partial H}{\partial x_1} = -1, \quad \dot{\lambda}_2 = -\frac{\partial H}{\partial x_2} = 0

    so \lambda_1 = c_1 - t and \lambda_2 = c_2.

  • Slide 27/29

    The Lagrange equation for u (here \theta) gives

        \frac{\partial H}{\partial \theta} = \frac{\lambda_1}{\cos^2 \theta} + \frac{\lambda_2 \sin \theta}{\cos^2 \theta} = 0

    so

        \lambda_1 = -\lambda_2 \sin \theta = -c_2 \sin \theta

    i.e.

        \sin \theta(t) = \frac{t - c_1}{c_2}

    This means that under this control scheme the sine of \theta(t) is linear.

    One can then use the given boundary conditions to find H, c_1 and c_2, solve for x_1, and show that the optimal path is an arc of a circle whose radius is p/2 (its center follows from the boundary conditions).
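
As a check (illustrative constants c1, c2 and start point; not from the slides), numerically integrating \dot{x}_1 = \tan \theta(t) with \sin \theta(t) linear in t does trace a circular arc:

```python
import numpy as np
from scipy.integrate import solve_ivp

# sin(theta(t)) = (t - c1)/c2 with illustrative constants.
c1, c2 = 1.0, 2.0

def xdot(t, x1):
    s = (t - c1) / c2
    return [s / np.sqrt(1.0 - s**2)]     # tan(theta) recovered from sin(theta)

t0, tf = 0.0, 1.8                        # keeps |sin(theta)| < 1 on [t0, tf]
x1_0 = -np.sqrt(c2**2 - (t0 - c1)**2)    # start on a circle centered at (c1, 0)
sol = solve_ivp(xdot, (t0, tf), [x1_0],
                rtol=1e-10, atol=1e-12, dense_output=True)

# Every point of the trajectory should satisfy (t - c1)^2 + x1^2 = c2^2,
# i.e. lie on the circle of radius c2 centered at (c1, 0).
ts = np.linspace(t0, tf, 50)
r = np.sqrt((ts - c1)**2 + sol.sol(ts)[0]**2)
```

(The center lands at (c1, 0) only because of the chosen start point; in the slide's problem, center and radius follow from the boundary conditions.)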

  • Slide 28/29

    Suppose we wish to find a trajectory that obeys an extra set of equality constraints, c(x, u, t) = 0.

    This is no different from requiring that f(x(t), u(t), t) - \dot{x}(t) = 0.

    We therefore treat the new constraint in a similar manner. We rewrite the Hamiltonian to be:

        H = L + \lambda^T f + \mu^T c

    where \mu is an extra vector of Lagrange multipliers that gets the exact same treatment as \lambda.

    The rest of the Euler-Lagrange constraint derivation is unchanged.

    (see Bryson & Ho, sec. 3.3)

  • Slide 29/29

    Constraints given as functions of the state variables at t_0 and/or t_f, with given or unconstrained terminal time t_f. (E.g. manipulate a robotic hand from one curved wall to another along the shortest path.)

    Via-point problems (the trajectory is constrained to obey some rule at a certain time or position along its path).

    Problems where the Lagrange equations do not produce a constraint on the control signal (although it is obviously constrained). E.g. bang-bang control.

    ...