ME 233 Advanced Control II
Lecture 8
Discrete Time
Linear Quadratic Gaussian (LQG)
Optimal Control
(ME233 Class Notes pp.LQG1-LQG7)
Outline
• Stochastic optimization
• Finite horizon LQG
– State feedback optimal LQG control
– Output feedback optimal LQG control
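Stochastic Control
Linear system contaminated by noise:
[Block diagram: the control U (through B) and the input noise W (through Bw) are summed and drive (zI-A)^-1, producing the state X; the output Y is CX plus the measurement noise V.]
Two random disturbances:
• Input noise w(k) - contaminates the state x(k)
• Measurement noise v(k) - contaminates the output y(k)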
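Stochastic state model
$$x(k+1) = A x(k) + B u(k) + B_w w(k)$$
$$y(k) = C x(k) + v(k)$$
Where:
• $y(k)$: available output
• $u(k)$: control input
• $w(k)$: Gaussian, uncorrelated, zero mean, input noise
• $v(k)$: Gaussian, uncorrelated, zero mean, measurement noise
• $x(0)$: Gaussian initial state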
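A minimal sketch of how a model of the form x(k+1) = A x(k) + B u(k) + Bw w(k), y(k) = C x(k) + v(k) can be simulated, assuming a scalar system. The function name `simulate`, its parameters, and the numerical values are hypothetical and chosen only for illustration.

```python
import random


def simulate(a, b, bw, c, x0, u_seq, w_std, v_std, seed=0):
    """Simulate x(k+1) = a x(k) + b u(k) + bw w(k), y(k) = c x(k) + v(k)."""
    rng = random.Random(seed)
    xs, ys = [x0], []
    for u in u_seq:
        x = xs[-1]
        ys.append(c * x + rng.gauss(0.0, v_std))   # noisy measurement of x(k)
        w = rng.gauss(0.0, w_std)                   # input noise sample w(k)
        xs.append(a * x + b * u + bw * w)           # state update to x(k+1)
    return xs, ys


# Hypothetical values: a stable scalar plant, zero control, small noise levels.
xs, ys = simulate(a=0.9, b=1.0, bw=1.0, c=1.0, x0=1.0,
                  u_seq=[0.0] * 5, w_std=0.1, v_std=0.2)
```

Setting both standard deviations to zero recovers the deterministic response, which is a quick way to sanity-check the update equations.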
Assumptions (same as for KF)
• Initial conditions: x(0) is Gaussian with known mean and covariance
• Noise properties: w(k) and v(k) are zero-mean, Gaussian, uncorrelated noises
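Written out under one common notation, these assumptions might read as follows. The symbols $x_o$, $X_o$, $W(k)$, $V(k)$ and the Kronecker delta $\delta(l)$ are assumptions made here; they are not taken from the slide.

```latex
\begin{align*}
E\{x(0)\} &= x_o, & E\{(x(0)-x_o)(x(0)-x_o)^T\} &= X_o \\
E\{w(k)\} &= 0, & E\{w(k+l)\,w^T(k)\} &= W(k)\,\delta(l) \\
E\{v(k)\} &= 0, & E\{v(k+l)\,v^T(k)\} &= V(k)\,\delta(l) \\
E\{(x(0)-x_o)\,w^T(k)\} &= 0, & E\{(x(0)-x_o)\,v^T(k)\} &= 0
\end{align*}
```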
Some notation: control and measurements
• The control sequence from k to N-1: u(k),…,u(N-1)
• The optimal control sequence from k to N-1: $u^o(k),…,u^o(N-1)$
• The output measurements up to k: y(0),…,y(k)
Finite-horizon LQG
For N > 0, find the optimal control sequence $u^o(0),…,u^o(N-1)$ which minimizes the cost functional J, where each u(k) can only be based on the observations y(0),…,y(k)
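For reference, a finite-horizon quadratic cost of the kind used in LQG can be sketched as follows. The weight names $Q_f$, $Q$ and $R$ are assumptions (they do not appear in this transcript), and no state-input cross term is included.

```latex
J = E\left\{ x^T(N)\, Q_f\, x(N)
    + \sum_{k=0}^{N-1} \left[ x^T(k)\, Q\, x(k) + u^T(k)\, R\, u(k) \right] \right\}
```

In the usual setting $Q_f$ and $Q$ are positive semidefinite and $R$ is positive definite.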
Separation Principle
Main Theorem:
The optimal control is given by:
$$u^o(k) = -K(k+1)\,\hat{x}(k)$$
Where:
• The feedback gain K(k+1) is obtained from the deterministic LQR solution.
• The state estimate $\hat{x}(k)$ is the a-posteriori Kalman Filter state estimate.
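Separation Principle
[Block diagram: the plant, with input matrix B, dynamics (zI-A)^-1 and output matrix C, in feedback with the LQG controller. An a-posteriori KF (gain F(k)) forms the a-priori estimate, the output residual and the a-posteriori state estimate; the deterministic LQR gain -K(k+1) maps the a-posteriori estimate to the control U.]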
Separation Principle Proof
The proof of the separation principle is conducted in two steps:
1. Solve the LQG problem under the assumption that the state vector is measurable
2. Solve the LQG problem and show that the optimal solution is obtained by replacing the state x(k) by the a-posteriori state estimate $\hat{x}(k)$
Finite-horizon state feedback LQG
This problem is similar to the standard
deterministic finite-horizon LQR…
…except that there is an additional input noise…
…and the control u(k) is only allowed to be a function of x(0),…,x(k)
Functionality constraint on control
• The control u(k) is only allowed to be a
function of x(0),…,x(k)
• We write this constraint as
• We write the constraints
for k=m,…,N-1 as
Finite-horizon state feedback LQG
We want to solve this problem using dynamic programming.
Need 2 preliminary results:
1. Functional optimization
2. Stochastic Bellman equation
Functional optimization
Lemma 1:
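Let X be a random vector and consider controls u that are constrained to be functions of X.
Also assume that there exists $u^o(x)$ such that $f(x, u^o(x)) = \min_u f(x,u)$ for every x.
Then
$$\min_{u = u(X)} E\{f(X,u)\} = E\{\min_u f(X,u)\}$$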
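The exchange of min and E in Lemma 1 can be checked by brute force on a small discrete example. This is only a sketch: the distribution of X, the finite control set, and the cost `f` are hypothetical choices made so that every function u(·) of X can be enumerated.

```python
from itertools import product

# Hypothetical example: X takes three values with these probabilities,
# and u is restricted to a small finite set so every policy can be enumerated.
xs = [-1.0, 0.0, 2.0]
probs = [0.2, 0.5, 0.3]
us = [-2.0, -1.0, 0.0, 1.0, 2.0]


def f(x, u):
    """Cost that couples x and u; chosen arbitrarily for illustration."""
    return (x + u) ** 2 + 0.5 * u ** 2


# Left-hand side: minimize E{f(X, u(X))} over all functions u(.) of X.
lhs = min(
    sum(p * f(x, u) for p, x, u in zip(probs, xs, policy))
    for policy in product(us, repeat=len(xs))
)

# Right-hand side: minimize pointwise in x, then take the expectation.
rhs = sum(p * min(f(x, u) for u in us) for p, x in zip(probs, xs))

# A single constant u (not allowed to depend on X) generally does worse.
best_constant = min(sum(p * f(x, u) for p, x in zip(probs, xs)) for u in us)
```

Here `lhs` and `rhs` agree, while `best_constant` is larger, which illustrates why the constraint "u is a function of X" matters.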
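Functional optimization
Proof is in 2 parts:
1. $\min_{u = u(X)} E\{f(X,u)\} \le E\{\min_u f(X,u)\}$
2. $E\{\min_u f(X,u)\} \le \min_{u = u(X)} E\{f(X,u)\}$
Proof of part 1:
Let $u^o(x)$ minimize $f(x,u)$. Because $u^o(X)$ is a function of X, it is an admissible choice, so
$$\min_{u = u(X)} E\{f(X,u)\} \le E\{f(X,u^o(X))\} = E\{\min_u f(X,u)\}$$
Proof of part 2:
• Let $\bar{u}(X)$ be any function of X. Since $\min_u f(x,u) \le f(x,\bar{u}(x))$ for every x,
$$E\{\min_u f(X,u)\} \le E\{f(X,\bar{u}(X))\}$$
• This holds, regardless of how $\bar{u}$ was chosen
• Minimizing the right-hand side over $\bar{u}$ completes the proof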
Definitions
• Terminal cost
• Stage cost (transient cost)
• Optimal cost to go
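A sketch of these three quantities under assumed notation. The names $L_N$, $L$, $J_m^o$ and the weights $Q_f$, $Q$, $R$ are assumptions made here for illustration, and the stage cost is written without a cross term.

```latex
\begin{align*}
L_N(x(N)) &= x^T(N)\, Q_f\, x(N) \\
L(x(k),u(k)) &= x^T(k)\, Q\, x(k) + u^T(k)\, R\, u(k) \\
J_m^o(x(m)) &= \min_{u(m),\ldots,u(N-1)}
  E\Big\{ L_N(x(N)) + \sum_{k=m}^{N-1} L(x(k),u(k)) \,\Big|\, x(m) \Big\}
\end{align*}
```

The minimization is understood to be over controls that satisfy the functionality constraint, so each u(k) is a function of x(0),…,x(k).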
Stochastic Bellman equation
Lemma 2:
If for
Then
Proof: (m=N-1 case is trivial, and thus omitted)
Finite-horizon state feedback LQG
Theorem 1:
a) The optimal control is given by
$$u^o(k) = -K(k+1)\,x(k)$$
Standard deterministic LQR solution!
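For the case without a state-input cross term in the cost, the standard deterministic LQR recursion can be sketched as below. The symbols $P(k)$, $Q$, $R$ and $Q_f$ are assumed names for the Riccati solution and the cost weights.

```latex
\begin{align*}
K(k+1) &= [B^T P(k+1) B + R]^{-1} B^T P(k+1) A \\
P(k) &= A^T P(k+1) A + Q - A^T P(k+1) B\,[B^T P(k+1) B + R]^{-1} B^T P(k+1) A \\
P(N) &= Q_f
\end{align*}
```

The recursion runs backwards from k = N-1 to k = 0 and does not involve the noise statistics.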
Finite-horizon state feedback LQG
Theorem 1:
b) The optimal cost $J^o$ is given by
$$J^o = x_o^T P(0) x_o + \mathrm{trace}[P(0) X_o] + b(0)$$
where $x_o$ and $X_o$ are the mean and covariance of x(0), and P(k) is the deterministic LQR Riccati solution.
• $x_o^T P(0) x_o$: deterministic LQR cost associated with the mean of x(0)
• $\mathrm{trace}[P(0) X_o]$: detrimental effect of the randomness of x(0) on the cost
• $b(0)$: detrimental effect of w(0),…,w(N-1) on the cost
b(k) is a dynamic function of the noise intensity
b(k) is computed backwards in time with b(N) = 0
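The cost expression in Theorem 1 b) can be checked numerically for a scalar system. This is a sketch under stated assumptions: the plant and weight values are hypothetical, the cost has no cross term, and the recursion used for b(k) is b(k) = b(k+1) + Bw' P(k+1) Bw W with a constant input-noise variance W. The check propagates E{x(k)^2} under u(k) = -K(k+1) x(k) and sums the expected cost directly.

```python
def lqr_backward(a, b, q, r, qf, N):
    """Backward Riccati recursion for a scalar system (no cross term).

    Returns P[0..N] and K[1..N], where K[k+1] is the gain used at time k.
    """
    P = [0.0] * (N + 1)
    K = [0.0] * (N + 1)
    P[N] = qf
    for k in range(N - 1, -1, -1):
        K[k + 1] = b * P[k + 1] * a / (b * P[k + 1] * b + r)
        P[k] = a * P[k + 1] * a + q - a * P[k + 1] * b * K[k + 1]
    return P, K


def noise_cost(P, bw, W, N):
    """b(k) = b(k+1) + bw * P(k+1) * bw * W, computed backwards from b(N) = 0."""
    bcost = [0.0] * (N + 1)
    for k in range(N - 1, -1, -1):
        bcost[k] = bcost[k + 1] + bw * P[k + 1] * bw * W
    return bcost


# Hypothetical scalar plant, weights, and statistics.
a, b, bw, q, r, qf, N = 0.9, 0.5, 1.0, 1.0, 0.1, 2.0, 20
x_mean, X0, W = 1.0, 0.3, 0.05

P, K = lqr_backward(a, b, q, r, qf, N)
bcost = noise_cost(P, bw, W, N)
J_formula = x_mean * P[0] * x_mean + P[0] * X0 + bcost[0]

# Independent check: propagate E{x(k)^2} under u(k) = -K(k+1) x(k) and sum the cost.
second_moment = x_mean ** 2 + X0
J_direct = 0.0
for k in range(N):
    J_direct += (q + r * K[k + 1] ** 2) * second_moment
    second_moment = (a - b * K[k + 1]) ** 2 * second_moment + bw ** 2 * W
J_direct += qf * second_moment
```

The two values `J_formula` and `J_direct` agree up to rounding, and setting `W = 0` and `X0 = 0` reduces both to the deterministic LQR cost.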
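Proof consists of 2 steps:
1. Prove that the optimal cost-to-go from time m is $x^T(m)P(m)x(m) + b(m)$ and that $u^o(m) = -K(m+1)x(m)$, using induction on decreasing m, Lemma 1, and the stochastic Bellman equation (Lemma 2)
2. Prove the expression for the optimal cost $J^o$ in part b)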
Finite-horizon state feedback LQG
Proof of Theorem 1, step 1:
Start with base case: m=N
At m=N no control remains to be chosen, so the cost-to-go is just the terminal cost and b(N) = 0.
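Proof of Theorem 1, step 1 (cont’d):
For m=0,1,…,N-1:
(We use induction on decreasing m)
By the induction hypothesis, the optimal cost-to-go from time m+1 is $x^T(m+1)P(m+1)x(m+1) + b(m+1)$. Substituting $x(m+1) = Ax(m) + Bu(m) + B_w w(m)$, the expected value of the quadratic part splits into three terms:
• Term 1: $E\{[Ax(m)+Bu(m)]^T P(m+1) [Ax(m)+Bu(m)]\}$
• Term 2: $2E\{[Ax(m)+Bu(m)]^T P(m+1) B_w w(m)\}$
• Term 3: $E\{w^T(m) B_w^T P(m+1) B_w w(m)\}$
Term 2
Since x(m) and u(m) only depend on quantities that are independent from w(m), Ax(m) + Bu(m) is independent from w(m). Because w(m) is zero mean, Term 2 = 0.
Term 3
$$E\{w^T(m) B_w^T P(m+1) B_w w(m)\} = \mathrm{trace}[B_w^T P(m+1) B_w E\{w(m)w^T(m)\}]$$
Therefore
$$E\{x^T(m+1)P(m+1)x(m+1)\} = E\{[Ax(m)+Bu(m)]^T P(m+1) [Ax(m)+Bu(m)]\} + \mathrm{trace}[B_w^T P(m+1) B_w E\{w(m)w^T(m)\}]$$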
Proof of Theorem 1, step 1 (cont’d):
Now use stochastic Bellman equation
• Use Lemma 1 to exchange min and E
• b(m) does not depend on u(m)
What remains to be minimized over u(m) is the same optimization we solved for deterministic LQR!
Optimal value: $x^T(m)P(m)x(m)$, attained by $u^o(m) = -K(m+1)x(m)$
Proof consists of 2 steps:
1. Prove the cost-to-go and optimal control expressions using induction on decreasing m, Lemma 1, and the stochastic Bellman equation (Lemma 2)
2. Prove the expression for the optimal cost $J^o$ in part b)
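Finite-horizon state feedback LQG
Proof of step 2:
By step 1 with m=0, $J^o = E\{x^T(0)P(0)x(0)\} + b(0)$. Write $x(0) = x_o + [x(0) - x_o]$, where $x_o = E\{x(0)\}$ and $x(0) - x_o$ has zero mean and covariance $X_o$:
$$E\{x^T(0)P(0)x(0)\} = x_o^T P(0) x_o + 2x_o^T P(0) E\{x(0) - x_o\} + E\{[x(0) - x_o]^T P(0) [x(0) - x_o]\}$$
The middle term is 0.
Proof: (cont’d)
$$E\{[x(0) - x_o]^T P(0) [x(0) - x_o]\} = \mathrm{trace}[P(0) X_o]$$
so $J^o = x_o^T P(0) x_o + \mathrm{trace}[P(0) X_o] + b(0)$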
Separation Principle Proof
The proof of the separation principle is conducted in two steps:
1. Solve the LQG problem under the assumption that the state vector is measurable
2. Solve the LQG problem and show that the optimal solution is obtained by replacing the state x(k) by the a-posteriori state estimate $\hat{x}(k)$
Finite-horizon LQG
This problem is similar to the standard
deterministic finite-horizon LQR…
…except that there is an additional input noise…
…and the control u(k) is only allowed to be a function of y(0),…,y(k)
Functionality constraint on control
• The control u(k) is only allowed to be a
function of y(0),…,y(k)
• As before, we write this constraint as
• As before, we write the constraints
for k=m,…,N-1 as
Finite-horizon LQG
We want to solve:
We will relate this to an optimal
state feedback LQG control problem
For simplicity, assume S = 0
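Reformulation of LQG
• Examine $E\{x^T(k) Q x(k)\}$, where Q is a state weighting matrix and $x(k) = \hat{x}(k) + \tilde{x}(k)$ splits the state into the a-posteriori estimate and its estimation error:
$$E\{x^T(k) Q x(k)\} = E\{\hat{x}^T(k) Q \hat{x}(k)\} + E\{\tilde{x}^T(k) Q \tilde{x}(k)\} + 2E\{\hat{x}^T(k) Q \tilde{x}(k)\}$$
The last term is 0 (by LS property 1)
• Therefore,
$$E\{x^T(k) Q x(k)\} = E\{\hat{x}^T(k) Q \hat{x}(k)\} + E\{\tilde{x}^T(k) Q \tilde{x}(k)\}$$
• Similarly, the same decomposition holds for the terminal term at k = N
• Want to apply these identities to LQG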
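The orthogonality behind this decomposition can be verified in closed form for a single scalar measurement. The setup is hypothetical: x is zero mean with variance `X0`, y = x + v with independent zero-mean v of variance `V`, and the least-squares estimate is x_hat = F y with F = X0 / (X0 + V). The weight `q` plays the role of a scalar state weight.

```python
# Hypothetical scalar setup: x ~ N(0, X0), y = x + v with v ~ N(0, V), x and v independent.
X0, V, q = 2.0, 0.5, 3.0

F = X0 / (X0 + V)                  # least-squares gain, so x_hat = F * y

# Second moments computed in closed form from the model above.
E_x2 = X0                                     # E{x^2}
E_xhat2 = F ** 2 * (X0 + V)                   # E{x_hat^2}, using E{y^2} = X0 + V
E_xtilde2 = (1 - F) ** 2 * X0 + F ** 2 * V    # E{(x - x_hat)^2}, with x - x_hat = (1-F) x - F v
E_cross = F * (1 - F) * X0 - F ** 2 * V       # E{x_hat * (x - x_hat)}

lhs = q * E_x2                      # E{x q x}
rhs = q * E_xhat2 + q * E_xtilde2   # E{x_hat q x_hat} + E{x_tilde q x_tilde}
```

The cross moment `E_cross` vanishes and `lhs` equals `rhs`, which is the scalar version of the identity used in the reformulation.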
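(Recall that we assumed S = 0)
Reformulation of LQG
Applying these identities, the cost separates into two groups of terms:
• Terms in the estimation error $\tilde{x}(k)$: these are the terms minimized by the Kalman filter
• Terms in the estimate $\hat{x}(k)$ and the control u(k): we will show that minimizing these corresponds to a state feedback LQG control problem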
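• From the Kalman filter: $\hat{x}(k+1) = A\hat{x}(k) + Bu(k) + F(k+1)\tilde{y}^o(k+1)$
• Recall that the output residual $\tilde{y}^o(k)$ is uncorrelated and zero mean
Reformulation of LQG
Initial conditions: the initial state of this model is the a-posteriori estimate $\hat{x}(0)$
Correlation of $\hat{x}(0)$ with $\tilde{y}^o(k+1)$: zero for k ≥ 0, because $\hat{x}(0)$ depends on the measurements only through $\tilde{y}^o(0)$ and the residual sequence is uncorrelated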
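Reformulation of LQG
Want to solve the reformulated problem, in which:
• u(k) is a function of y(0),…,y(k)
• It suffices to take u(k) as a function of $\hat{x}(0),…,\hat{x}(k)$: these estimates are functions of y(0),…,y(k), and knowledge of y(0),…,y(k) does not give any “information” about the future residual $\tilde{y}^o(k+1)$ (by LS property 1)
The problem to solve therefore has:
• u(k) a function of $\hat{x}(0),…,\hat{x}(k)$
• a driving noise $\tilde{y}^o(k+1)$ that is uncorrelated with $\hat{x}(0)$
This is a state feedback LQG control problem!
Apply results from first half of lecture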
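Optimal finite-horizon LQG, S=0
Main Theorem:
a) The optimal control is given by
$$u^o(k) = -K(k+1)\,\hat{x}(k)$$
Standard deterministic LQR solution!
A-posteriori state observer structure:
$$\hat{x}(k) = \hat{x}^o(k) + F(k)\,[y(k) - C\hat{x}^o(k)]$$
$$\hat{x}^o(k+1) = A\hat{x}(k) + Bu^o(k)$$
b) The optimal cost is the optimal cost of the equivalent state feedback problem in $\hat{x}(k)$ plus the estimation-error terms minimized by the Kalman filter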
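State space form of LQG controller
• Kalman filter: $\hat{x}^o(k+1) = A\hat{x}^o(k) + Bu^o(k) + L(k)\tilde{y}^o(k)$ and $\hat{x}(k) = \hat{x}^o(k) + F(k)\tilde{y}^o(k)$, with $\tilde{y}^o(k) = y(k) - C\hat{x}^o(k)$
• LQR: $u^o(k) = -K(k+1)\hat{x}(k)$
Eliminating $\hat{x}(k)$ from the expression for $u^o(k)$ yields
$$u^o(k) = -K(k+1)[I - F(k)C]\,\hat{x}^o(k) - K(k+1)F(k)\,y(k)$$
Plugging this expression for $u^o(k)$ into the expression for $\hat{x}^o(k+1)$ yields the state space model
$$\hat{x}^o(k+1) = [A - L(k)C - BK(k+1)(I - F(k)C)]\,\hat{x}^o(k) + [L(k) - BK(k+1)F(k)]\,y(k)$$
$$u^o(k) = -K(k+1)[I - F(k)C]\,\hat{x}^o(k) - K(k+1)F(k)\,y(k)$$
where
K(k+1) is the standard deterministic LQR gain
F(k) and L(k) are the standard Kalman filter gains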
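A minimal scalar sketch of the LQG controller in both forms: the a-posteriori observer followed by the LQR gain, and the state space form with the a-posteriori estimate eliminated. Everything here is an illustrative assumption: the function names, the numerical values, the absence of a cost cross term, the constant noise variances `W` and `V`, and the relation L(k) = A F(k), which holds when the input and measurement noises are uncorrelated with each other.

```python
def lqr_gains(a, b, q, r, qf, N):
    """Scalar backward Riccati recursion; K[k+1] is the gain applied at time k."""
    P, K = qf, [0.0] * (N + 1)
    for k in range(N - 1, -1, -1):
        K[k + 1] = b * P * a / (b * P * b + r)
        P = a * P * a + q - a * P * b * K[k + 1]
    return K


def kf_gains(a, bw, c, W, V, M0, N):
    """Scalar Kalman filter gains F(k) and L(k) = a F(k), assuming uncorrelated w and v."""
    F, L, M = [], [], M0
    for _ in range(N):
        Fk = M * c / (c * M * c + V)       # a-posteriori gain
        Z = M - Fk * c * M                 # a-posteriori error variance
        F.append(Fk)
        L.append(a * Fk)
        M = a * Z * a + bw * W * bw        # next a-priori error variance
    return F, L


def control_via_estimate(a, b, c, K, F, x0_hat, ys):
    """u(k) = -K(k+1) x_hat(k) with the a-posteriori observer structure."""
    xo, us = x0_hat, []
    for k, y in enumerate(ys):
        xhat = xo + F[k] * (y - c * xo)    # a-posteriori estimate
        us.append(-K[k + 1] * xhat)
        xo = a * xhat + b * us[-1]         # next a-priori estimate
    return us


def control_via_state_space(a, b, c, K, F, L, x0_hat, ys):
    """Same controller with x_hat(k) eliminated; the state is the a-priori estimate."""
    xo, us = x0_hat, []
    for k, y in enumerate(ys):
        u = -K[k + 1] * (1 - F[k] * c) * xo - K[k + 1] * F[k] * y
        xo_next = (a - L[k] * c - b * K[k + 1] * (1 - F[k] * c)) * xo \
                  + (L[k] - b * K[k + 1] * F[k]) * y
        us.append(u)
        xo = xo_next
    return us


# Hypothetical plant, weights, noise variances, and measurement record.
a, b, bw, c = 0.9, 0.5, 1.0, 2.0
N = 6
K = lqr_gains(a, b, q=1.0, r=0.1, qf=2.0, N=N)
F, L = kf_gains(a, bw, c, W=0.05, V=0.2, M0=0.3, N=N)
ys = [1.0, -0.4, 0.7, 0.1, -0.2, 0.5]
u1 = control_via_estimate(a, b, c, K, F, 1.0, ys)
u2 = control_via_state_space(a, b, c, K, F, L, 1.0, ys)
```

Both functions map the same measurement record to the same control sequence, which is a direct check of the elimination step. The LQR gains depend only on the plant and cost weights, and the Kalman gains only on the plant and noise variances, reflecting the separation principle.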