ME 233 ReviewSolve the LQG problem under the assumption that the state vector is measurable 2. Solve...

1

ME 233 Advanced Control II

Lecture 8

Discrete Time

Linear Quadratic Gaussian (LQG)

Optimal Control

(ME233 Class Notes pp.LQG1-LQG7)

2

Outline

• Stochastic optimization

• Finite horizon LQG

– State feedback optimal LQG control

– Output feedback optimal LQG control

3

Stochastic ControlLinear system contaminated by noise:

YX(zI-A )-1 CB

+ +U

W

VBw

Two random disturbances:

• Input noise w(k) - contaminates the state x(k)

• Measurement noise v(k) - contaminates the

output y(k)

4

Stochastic state model

Where:

• available output

• control input

• Gaussian, uncorrelated, zero mean, input noise

• Gaussian, uncorrelated, zero mean, meas. noise

• Gaussian initial state

5

Assumptions (same as for KF)

• Initial conditions:

• Noise properties:

Zero-mean

Gaussian

uncorrelated

noises

Some notation- control and measurements

The control sequence from k to N-1

6

The optimal control sequence from k to N-1

The output measurements up to k

7

Finite-horizon LQG

For N > 0, find the optimal control sequence:

Which minimizes the cost functional:

where can only be based on the observations

8

Separation Principle

Main Theorem:

The optimal control is given by:

Where:

• The feedback gain K(k) is obtained from the

deterministic LQR solution.

• The state estimate is the a-posteriori

Kalman Filter state estimate.

9

Separation Principle

U Y

zB

YX

(zI-A )-1 CB

+ -

F(k)

A

+

-

X

o

Y~

X

C

o

-+

- K(k+1)

A-posteriori KF

Deterministic LQR

+

10

Separation Principle Proof

The proof of the separation principle is conducted

in two steps:

1. Solve the LQG problem under the assumption

that the state vector is measurable

2. Solve the LQG problem and show that the

optimal solution is obtained by replacing

by the a-posteriori state estimate

11

Finite-horizon state feedback LQG

This problem is similar to the standard

deterministic finite-horizon LQR…

…except that there is an additional input noise…

…and the control is only allowed to be a

function of

Functionality constraint on control

• The control u(k) is only allowed to be a

function of x(0),…,x(k)

• We write this constraint as

• We write the constraints

for k=m,…,N-1 as

12

13


We want to solve using dynamic programming:

Need 2 preliminary results:

1. Functional optimization

2. Stochastic Bellman equation

Functional optimization

Lemma 1:

Let X be a random vector and let denote

the constraint that u is a function of X

Also assume that there exists uo(x) such that

Then

14

Functional optimization

Proof is in 2 parts:

1.

2.

15

Proof:

Let uo(x) minimize f(x,u)

u is a function of X

Because

Proof:

• Let

• Minimizing the right-hand side over

completes the proof

u is a function of X

This holds, regardless of how was chosen

Definitions

• Terminal cost

• Stage cost (transient cost)

• Optimal cost to go

18

Stochastic Bellman equation

Lemma 2:

If for

Then

19

Proof: (m=N-1 case is trivial, and thus omitted)

21


Theorem 1:

a) The optimal control is given by

Standard deterministic LQR solution!

22

Theorem 1:

b) The optimal cost Jo is given by


23

Theorem 1:

b) The optimal cost is given by

This term reflects the detrimental effect of w(k) on the cost

b(k) is a dynamic function of the noise intensity

b(k) is computed backwards in time with b(N) = 0


24

Theorem 1:



Deterministic LQR

cost associated

with mean of x(0)

Detrimental effect of

randomness of x(0)

on the cost

Detrimental effect

of w(0),…,w(k)

on the cost

Proof consists of 2 steps:

1. Prove

and using induction

on decreasing m, Lemma 1, and the stochastic

Bellman equation (Lemma 2)

2. Prove

25


Proof of Theorem 1: and .

Start with base case: m=N

26


For m=0,1,…,N-1:

(We use induction on decreasing m)

27

By the induction hypothesis,

Term 1

Term 2

Term 3

Proof of Theorem 1: and . 28

Term 2

Since x(m) and u(m) only depend on quantities

that are independent from w(m)

Ax(m) + Bu(m) is independent from w(m)

0


Term 3


Therefore

30


Now use stochastic Bellman equation


• Use Lemma 1 to exchange min and E

• b(m) does not depend on u(m)


This is the same optimization we

solved for deterministic LQR!

Optimal value:

Proof consists of 2 steps:

1. Prove

and using induction

on decreasing m, Lemma 1, and the stochastic

Bellman equation (Lemma 2)

2. Prove

34


35

Proof:

0

36

Proof: (cont’d)

37

Separation Principle Proof

The proof of the separation principle is conducted

in two steps:

1. Solve the LQG problem under the assumption

that the state vector is measurable

2. Solve the LQG problem and show that the

optimal solution is obtained by replacing

by the a-posteriori state estimate

38

Finite-horizon LQG

This problem is similar to the standard

deterministic finite-horizon LQR…

…except that there is an additional input noise…

…and the control is only allowed to be a

function of

Functionality constraint on control

• The control u(k) is only allowed to be a

function of y(0),…,y(k)

• As before, we write this constraint as

• As before, we write the constraints

for k=m,…,N-1 as

39

40

Finite-horizon LQG

We want to solve:

We will relate this to an optimal

state feedback LQG control problem

For simplicity, assume S = 0

Reformulation of LQG

• Examine

41

0 (by LS property 1)

Reformulation of LQG

• Therefore,

• Similarly,

• Want to apply these identities to LQG

42

(Recall that we assumed S = 0)

Reformulation of LQG43


Terms

minimized by

the Kalman

filter

We will show that this corresponds to a

state feedback LQG control problem

• From the Kalman filter :

• Recall that is uncorrelated and



Initial conditions:

Notate this as


Initial conditions:

Correlation of with :


Want to solve:

u(k) is a function of


(because are functions of )


(because , i.e. knowledge of does not

give any “information” about by LS property 1 )


Want to solve:


This is a state feedback LQG control problem!

Apply results from first half of lecture

Uncorrelated with

50

Optimal finite-horizon LQG, S=0

Main Theorem:

a) The optimal control is given by

Standard deterministic LQR solution!

51

Optimal finite-horizon LQG, S=0

A-posteriori state observer structure:

Main Theorem:

52

Optimal finite-horizon LQG, S=0Main Theorem:


State space form of LQG controller53

Kalman

filter

LQR

Plugging this expression for uo(k) into the expression for

yields the state space model on the next slide

Eliminating from the expression for uo(k) yields

State space form of LQG controller54

where

K(k+1) is the standard deterministic LQR gain

F(k) and L(k) are the standard Kalman filter gains

ME 233 ReviewSolve the LQG problem under the assumption that the state vector is measurable 2. Solve...

Documents

Transcript of ME 233 ReviewSolve the LQG problem under the assumption that the state vector is measurable 2. Solve...