Neighboring-Optimal Control via Linear-Quadratic (LQ) Feedback
Robert Stengel Optimal Control and Estimation, MAE 546
Princeton University, 2018
• Linearization of nonlinear dynamic models
  – Nominal trajectory
  – Perturbations about the nominal trajectory
  – Linear, time-invariant dynamic models
  – Examples
• Continuous-time, time-varying optimal LQ feedback control
• Discrete-time and sampled-data linear dynamic models
• Dynamic programming approach to optimal LQ sampled-data control
Copyright 2018 by Robert Stengel. All rights reserved. For educational use only.
http://www.princeton.edu/~stengel/MAE546.html
http://www.princeton.edu/~stengel/OptConEst.html
Neighboring Trajectories Satisfy Same Dynamic Equations
• Neighboring-trajectory dynamic model is the same as the nominal dynamic model

$$\dot{x}_N(t) = f[x_N(t), u_N(t), w_N(t), t], \quad x_N(t_0) \text{ given}$$

$$\dot{x}(t) = f[x(t), u(t), w(t), t], \quad x(t_0) \text{ given}$$

$$\Delta x(t_0) = x(t_0) - x_N(t_0), \quad \Delta x(t) = x(t) - x_N(t), \quad \Delta\dot{x}(t) = \dot{x}(t) - \dot{x}_N(t)$$

$$\Delta u(t) = u(t) - u_N(t), \quad \Delta w(t) = w(t) - w_N(t)$$

$$\dot{x}(t) = \dot{x}_N(t) + \Delta\dot{x}(t) = f[x_N(t) + \Delta x(t),\, u_N(t) + \Delta u(t),\, w_N(t) + \Delta w(t),\, t]$$
Solve for the Nominal and Perturbation Trajectories Separately
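$$\dot{x}(t) = \dot{x}_N(t) + \Delta\dot{x}(t) \approx f[x_N(t), u_N(t), w_N(t), t] + \frac{\partial f}{\partial x}\Delta x(t) + \frac{\partial f}{\partial u}\Delta u(t) + \frac{\partial f}{\partial w}\Delta w(t), \quad x(t_0) = x_N(t_0) + \Delta x(t_0) \text{ given}$$

$$\dot{x}_N(t) = f[x_N(t), u_N(t), w_N(t), t], \quad x_N(t_0) \text{ given}$$

$$\Delta\dot{x}(t) \approx \frac{\partial f}{\partial x}(t)\,\Delta x(t) + \frac{\partial f}{\partial u}(t)\,\Delta u(t) + \frac{\partial f}{\partial w}(t)\,\Delta w(t), \quad \Delta x(t_0) \text{ given}$$

Jacobian matrices of the linear model are evaluated along the nominal trajectory

$$F(t) \triangleq \left.\frac{\partial f}{\partial x}\right|_{\substack{x = x_N(t) \\ u = u_N(t) \\ w = w_N(t)}}; \quad G(t) \triangleq \left.\frac{\partial f}{\partial u}\right|_{\substack{x = x_N(t) \\ u = u_N(t) \\ w = w_N(t)}}; \quad L(t) \triangleq \left.\frac{\partial f}{\partial w}\right|_{\substack{x = x_N(t) \\ u = u_N(t) \\ w = w_N(t)}}$$

$$\Delta\dot{x}(t) = F(t)\Delta x(t) + G(t)\Delta u(t) + L(t)\Delta w(t), \quad \Delta x(t_0) \text{ given}$$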
Linearization Example
Cubic Springs
Force is a nonlinear function of deflection
$$f(x) = -k_1 x \pm k_2 x^3$$

Example: Expand $g\sin\theta$ in a power series

$$\text{moment}(\theta) \approx -k_1\theta + k_2\theta^3$$
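A short worked version of the power-series example may help here. It assumes the restoring term is written as $-g\sin\theta$ (a normalization chosen for illustration; the slide does not state one) and keeps terms through third degree, which identifies the two spring constants.

```latex
\sin\theta = \theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \cdots
\quad\Longrightarrow\quad
-g\sin\theta \approx -g\,\theta + \frac{g}{6}\,\theta^3,
\qquad k_1 = g, \quad k_2 = \frac{g}{6}
```

Under this assumption the cubic term has the opposite sign to the linear term, so the pendulum behaves like a softening spring, in contrast to the stiffening spring used in the next example.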
Stiffening Cubic Spring Example
2nd-order nonlinear dynamic model

$$\dot{x}_1(t) = f_1[x(t)] = x_2(t)$$

$$\dot{x}_2(t) = f_2[x(t)] = -10x_1(t) - 10x_1^3(t) - x_2(t)$$

Integrate equations to produce nominal path

$$\begin{bmatrix} x_1(0) \\ x_2(0) \end{bmatrix} \rightarrow \int_0^{t_f} \begin{bmatrix} f_{1N}[x(t)] \\ f_{2N}[x(t)] \end{bmatrix} dt \rightarrow \begin{bmatrix} x_{1N}(t) \\ x_{2N}(t) \end{bmatrix} \text{ in } [0, t_f]$$

Evaluate partial derivatives of the Jacobian matrices

$$\frac{\partial f_1}{\partial x_1} = 0; \quad \frac{\partial f_1}{\partial x_2} = 1$$

$$\frac{\partial f_2}{\partial x_1} = -10 - 30x_{1N}^2(t); \quad \frac{\partial f_2}{\partial x_2} = -1$$

$$\frac{\partial f_1}{\partial u} = 0; \quad \frac{\partial f_1}{\partial w} = 0; \quad \frac{\partial f_2}{\partial u} = 0; \quad \frac{\partial f_2}{\partial w} = 0$$
Nominal and Perturbation Dynamic Equations
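Nonlinear, Time-Invariant (NTI) Equation

$$\dot{x}_{1N}(t) = x_{2N}(t)$$

$$\dot{x}_{2N}(t) = -10x_{1N}(t) - 10x_{1N}^3(t) - x_{2N}(t)$$

Linear, Time-Varying (LTV) Equation

$$\begin{bmatrix} \Delta\dot{x}_1(t) \\ \Delta\dot{x}_2(t) \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -\left(10 + 30x_{1N}^2(t)\right) & -1 \end{bmatrix} \begin{bmatrix} \Delta x_1(t) \\ \Delta x_2(t) \end{bmatrix}$$

Initial Conditions for Nonlinear and Linear Models

$$\begin{bmatrix} x_{1N}(0) \\ x_{2N}(0) \end{bmatrix} = \begin{bmatrix} 0 \\ 9 \end{bmatrix}; \quad \begin{bmatrix} \Delta x_1(0) \\ \Delta x_2(0) \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$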
Comparison of Approximate and Exact Solutions
[Figure: time histories of $x_N(t)$, $\Delta x(t)$, $x_N(t) + \Delta x(t)$, and $x(t)$, and of the rates $\dot{x}_N(t)$, $\Delta\dot{x}(t)$, $\dot{x}_N(t) + \Delta\dot{x}(t)$, and $\dot{x}(t)$. Initial conditions: $x_{2N}(0) = 9$, $\Delta x_2(0) = 1$, $x_{2N}(0) + \Delta x_2(0) = 10$, $x_2(0) = 10$.]
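The comparison can be reproduced numerically. The sketch below is one possible way to do it, not the code behind the slide's plots: it integrates the nominal path and the LTV perturbation together, integrates the exact nonlinear model from the perturbed initial condition, and returns both end states. The fixed-step RK4 integrator, the step size, and the function names (`f`, `f_aug`, `rk4`, `compare`) are assumptions made for illustration.

```python
def f(x):
    """Stiffening cubic spring: x1' = x2, x2' = -10 x1 - 10 x1^3 - x2."""
    return [x[1], -10.0 * x[0] - 10.0 * x[0] ** 3 - x[1]]


def f_aug(z):
    """Nominal state z[0:2] stacked with the linear perturbation z[2:4]."""
    x1n, x2n, dx1, dx2 = z
    nominal = f([x1n, x2n])
    # LTV perturbation model; the Jacobian is evaluated on the nominal path
    ddx1 = dx2
    ddx2 = -(10.0 + 30.0 * x1n ** 2) * dx1 - dx2
    return nominal + [ddx1, ddx2]


def rk4(fun, z, tf, dt=1e-3):
    """Fixed-step 4th-order Runge-Kutta integration from t = 0 to t = tf."""
    for _ in range(int(round(tf / dt))):
        k1 = fun(z)
        k2 = fun([zi + 0.5 * dt * ki for zi, ki in zip(z, k1)])
        k3 = fun([zi + 0.5 * dt * ki for zi, ki in zip(z, k2)])
        k4 = fun([zi + dt * ki for zi, ki in zip(z, k3)])
        z = [zi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
             for zi, a, b, c, d in zip(z, k1, k2, k3, k4)]
    return z


def compare(tf):
    """Return (x_N + dx, x) at time tf for the initial conditions on the slide."""
    x1n, x2n, dx1, dx2 = rk4(f_aug, [0.0, 9.0, 0.0, 1.0], tf)
    exact = rk4(f, [0.0, 10.0], tf)
    return [x1n + dx1, x2n + dx2], exact


for tf in (0.1, 0.5, 2.0):
    approx, exact = compare(tf)
    print(tf, approx, exact)
```

The printed pairs show how closely the sum of nominal and perturbation solutions tracks the exact solution at each time; the gap comes from the second- and third-degree terms that the linearization drops.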
Euler-Lagrange Equations for Minimizing Variational Cost Function
Optimize Nominal and Neighboring Dynamic Models Separately
Nominal, Nonlinear, Open-Loop Optimal Control

$$u(t) = u_{opt}(t) \triangleq u^*(t)$$

Linear, Neighboring-Optimal Control Added to Nominal Control

$$u(t) = u^*(t) + \Delta u(t)$$

$$\Delta u(t) = -C(t)\Delta x(t) = -C(t)\left[x(t) - x^*(t)\right]$$
Expand Optimal Control Function
Expand optimized cost function to second degree

$$J\left\{\left[x^*(t_0) + \Delta x(t_0)\right], \left[x^*(t_f) + \Delta x(t_f)\right]\right\} \simeq J^*\left[x^*(t_0), x^*(t_f)\right] + \Delta J\left[\Delta x(t_0), \Delta x(t_f)\right] + \Delta^2 J\left[\Delta x(t_0), \Delta x(t_f)\right]$$

$$= J^*\left[x^*(t_0), x^*(t_f)\right] + \Delta^2 J\left[\Delta x(t_0), \Delta x(t_f)\right]$$

because First Variation, $\Delta J\left[\Delta x(t_0), \Delta x(t_f)\right] = 0$

Nominal optimized cost, plus nonlinear dynamic constraint

$$J^*\left[x^*(t_0), x^*(t_f)\right] = \phi\left[x^*(t_f)\right] + \int_{t_0}^{t_f} L\left[x^*(t), u^*(t)\right] dt$$

subject to nonlinear dynamic equation

$$\dot{x}^*(t) = f\left[x^*(t), u^*(t)\right], \quad x(t_0) = x_0$$
2nd Variation of the Cost Function
Objective: Given optimal nominal solution, minimize 2nd-variational cost subject to linear dynamic constraint

$$\min_{\Delta u} \Delta^2 J = \frac{1}{2}\Delta x^T(t_f)\,\phi_{xx}(t_f)\,\Delta x(t_f) + \frac{1}{2}\int_{t_0}^{t_f} \begin{bmatrix} \Delta x^T(t) & \Delta u^T(t) \end{bmatrix} \begin{bmatrix} L_{xx}(t) & L_{xu}(t) \\ L_{ux}(t) & L_{uu}(t) \end{bmatrix} \begin{bmatrix} \Delta x(t) \\ \Delta u(t) \end{bmatrix} dt$$

subject to perturbation dynamics

$$\Delta\dot{x}(t) = F(t)\Delta x(t) + G(t)\Delta u(t), \quad \Delta x(t_0) = \Delta x_0$$

Cost weighting matrices expressed as

$$P(t_f) \triangleq \phi_{xx}(t_f) = \frac{\partial^2\phi}{\partial x^2}(t_f)$$

$$\begin{bmatrix} Q(t) & M(t) \\ M^T(t) & R(t) \end{bmatrix} \triangleq \begin{bmatrix} L_{xx}(t) & L_{xu}(t) \\ L_{ux}(t) & L_{uu}(t) \end{bmatrix}$$

$$\dim[P(t_f)] = \dim[Q(t)] = n \times n; \quad \dim[R(t)] = m \times m; \quad \dim[M(t)] = n \times m$$
2nd Variational Hamiltonian
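Variational cost function

$$\Delta^2 J = \frac{1}{2}\Delta x^T(t_f)P(t_f)\Delta x(t_f) + \frac{1}{2}\int_{t_0}^{t_f} \begin{bmatrix} \Delta x^T(t) & \Delta u^T(t) \end{bmatrix} \begin{bmatrix} Q(t) & M(t) \\ M^T(t) & R(t) \end{bmatrix} \begin{bmatrix} \Delta x(t) \\ \Delta u(t) \end{bmatrix} dt$$

$$= \frac{1}{2}\Delta x^T(t_f)P(t_f)\Delta x(t_f) + \frac{1}{2}\int_{t_0}^{t_f} \left[\Delta x^T(t)Q(t)\Delta x(t) + 2\Delta x^T(t)M(t)\Delta u(t) + \Delta u^T(t)R(t)\Delta u(t)\right] dt$$

Variational Lagrangian plus adjoined dynamic constraint

$$H\left[\Delta x(t), \Delta u(t), \Delta\lambda(t)\right] = L\left[\Delta x(t), \Delta u(t)\right] + \Delta\lambda^T(t)\, f\left[\Delta x(t), \Delta u(t)\right]$$

$$= \frac{1}{2}\left[\Delta x^T(t)Q(t)\Delta x(t) + 2\Delta x^T(t)M(t)\Delta u(t) + \Delta u^T(t)R(t)\Delta u(t)\right] + \Delta\lambda^T(t)\left[F(t)\Delta x(t) + G(t)\Delta u(t)\right]$$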
2nd Variational Euler-Lagrange Equations
Terminal condition, solution for adjoint vector, and optimality condition

E-L #1

$$\Delta\lambda(t_f) = \phi_{xx}(t_f)\Delta x(t_f) = P(t_f)\Delta x(t_f)$$

E-L #2

$$\Delta\dot{\lambda}(t) = -\left\{\frac{\partial H\left[\Delta x(t), \Delta u(t), \Delta\lambda(t)\right]}{\partial \Delta x}\right\}^T = -Q(t)\Delta x(t) - M(t)\Delta u(t) - F^T(t)\Delta\lambda(t)$$

E-L #3

$$\left\{\frac{\partial H\left[\Delta x(t), \Delta u(t), \Delta\lambda(t)\right]}{\partial \Delta u}\right\}^T = M^T(t)\Delta x(t) + R(t)\Delta u(t) + G^T(t)\Delta\lambda(t) = 0$$

Can explicitly solve for control
Two-Point Boundary-Value Problem
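State Equation

$$\Delta\dot{x}(t) = F(t)\Delta x(t) + G(t)\Delta u(t), \quad \Delta x(t_0) = \Delta x_0$$

Adjoint Vector Equation

$$\Delta\dot{\lambda}(t) = -Q(t)\Delta x(t) - M(t)\Delta u(t) - F^T(t)\Delta\lambda(t), \quad \Delta\lambda(t_f) = P(t_f)\Delta x(t_f)$$

From $H_u = 0$

$$\Delta u(t) = -R^{-1}(t)\left[M^T(t)\Delta x(t) + G^T(t)\Delta\lambda(t)\right]$$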
Use Control Law to Solve the LTV Two-Point Boundary-Value Problem
From $H_u = 0$: Control law that feeds back state and adjoint vectors

$$\Delta u(t) = -R^{-1}(t)\left[M^T(t)\Delta x(t) + G^T(t)\Delta\lambda(t)\right]$$

Substitute for control in system and adjoint equations

$$\begin{bmatrix} \Delta\dot{x}(t) \\ \Delta\dot{\lambda}(t) \end{bmatrix} = \begin{bmatrix} \left[F(t) - G(t)R^{-1}(t)M^T(t)\right] & -G(t)R^{-1}(t)G^T(t) \\ \left[-Q(t) + M(t)R^{-1}(t)M^T(t)\right] & -\left[F(t) - G(t)R^{-1}(t)M^T(t)\right]^T \end{bmatrix} \begin{bmatrix} \Delta x(t) \\ \Delta\lambda(t) \end{bmatrix}$$

where $\Delta x(t)$ is the perturbation state vector and $\Delta\lambda(t)$ is the perturbation adjoint vector

Adjoint relationship at end point

$$\begin{bmatrix} \Delta x(t_0) \\ \Delta\lambda(t_f) \end{bmatrix} = \begin{bmatrix} \Delta x_0 \\ P_f \Delta x_f \end{bmatrix}$$
Use Control Law to Solve the LTV Two-Point Boundary-Value Problem
Assume the adjoint relationship at the end point applies over the entire interval

$$\Delta\lambda(t) = P(t)\Delta x(t)$$

Substitute for adjoint: Control law feeds back state alone

$$\Delta u^*(t) = -R^{-1}(t)\left[M^T(t)\Delta x(t) + G^T(t)P(t)\Delta x(t)\right] = -R^{-1}(t)\left[M^T(t) + G^T(t)P(t)\right]\Delta x(t)$$

$$\Delta u^*(t) = -C(t)\Delta x(t), \quad \dim(C) = m \times n$$

$$\Delta u^*(t) \triangleq \arg\min_{\Delta u(t) \text{ in } (t_0, t_f)} \Delta^2 J\left[\Delta x(\Delta u)\right]$$
Time-Varying Linear-Quadratic (LQ) Optimal Control Gain Matrix
$$\Delta u(t) = -C(t)\Delta x(t)$$

Optimal feedback gain matrix

$$C(t) = R^{-1}(t)\left[G^T(t)P(t) + M^T(t)\right]$$

• Properties of feedback gain matrix
  – Full state feedback (m x n)
  – Time-varying matrix
• R, G, and M given
  – Control weighting matrix, R
  – State-control weighting matrix, M
  – Control effect matrix, G
• P(t) remains to be determined
Solution for the Adjoining Matrix, P(t)
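Time-derivative of adjoint vector

$$\Delta\dot{\lambda}(t) = \dot{P}(t)\Delta x(t) + P(t)\Delta\dot{x}(t)$$

Rearrange

$$\dot{P}(t)\Delta x(t) = \Delta\dot{\lambda}(t) - P(t)\Delta\dot{x}(t)$$

Recall coupled state/adjoint equation

$$\begin{bmatrix} \Delta\dot{x}(t) \\ \Delta\dot{\lambda}(t) \end{bmatrix} = \begin{bmatrix} \left[F(t) - G(t)R^{-1}(t)M^T(t)\right] & -G(t)R^{-1}(t)G^T(t) \\ \left[-Q(t) + M(t)R^{-1}(t)M^T(t)\right] & -\left[F(t) - G(t)R^{-1}(t)M^T(t)\right]^T \end{bmatrix} \begin{bmatrix} \Delta x(t) \\ \Delta\lambda(t) \end{bmatrix}$$

Substitute in adjoint matrix equation

$$\dot{P}(t)\Delta x(t) = \left[-Q(t) + M(t)R^{-1}(t)M^T(t)\right]\Delta x(t) - \left[F(t) - G(t)R^{-1}(t)M^T(t)\right]^T\Delta\lambda(t) - P(t)\left\{\left[F(t) - G(t)R^{-1}(t)M^T(t)\right]\Delta x(t) - G(t)R^{-1}(t)G^T(t)\Delta\lambda(t)\right\}$$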
Solution for the Adjoining Matrix, P(t)
Substitute for adjoint vector

$$\Delta\lambda(t) = P(t)\Delta x(t)$$

$$\dot{P}(t)\Delta x(t) = \left[-Q(t) + M(t)R^{-1}(t)M^T(t)\right]\Delta x(t) - \left[F(t) - G(t)R^{-1}(t)M^T(t)\right]^T P(t)\Delta x(t) - P(t)\left\{\left[F(t) - G(t)R^{-1}(t)M^T(t)\right]\Delta x(t) - G(t)R^{-1}(t)G^T(t)P(t)\Delta x(t)\right\}$$

... and eliminate state vector
Matrix Riccati Equation for P(t)
$$\dot{P}(t) = \left[-Q(t) + M(t)R^{-1}(t)M^T(t)\right] - \left[F(t) - G(t)R^{-1}(t)M^T(t)\right]^T P(t) - P(t)\left[F(t) - G(t)R^{-1}(t)M^T(t)\right] + P(t)G(t)R^{-1}(t)G^T(t)P(t)$$

$$P(t_f) = \phi_{xx}(t_f)$$

Nonlinear, ordinary differential equation for the Riccati Matrix, P(t), with terminal boundary conditions
Characteristics of the Adjoining (Riccati) Matrix, P(t)
• P(tf) is symmetric, n x n, and typically positive semi-definite
• Matrix Riccati equation is symmetric
• Therefore, P(t) is symmetric and positive semi-definite throughout
• Once P(t) has been determined, optimal feedback control gain matrix, C(t) can be calculated
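As a rough illustration of the last point, the sketch below integrates the matrix Riccati equation backward from $P(t_f)$ and then forms the gain $C = R^{-1}G^T P$. Everything specific in it is an assumption chosen for illustration rather than taken from the lecture: the plant is a hypothetical double integrator, the cross weighting $M$ is zero, $Q = I$, $R = 1$, and the integrator is a fixed-step RK4 written with plain lists.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(r) for r in zip(*A)]


def lincomb(terms):
    """Sum of (coefficient, matrix) pairs of equal shape."""
    rows, cols = len(terms[0][1]), len(terms[0][1][0])
    return [[sum(c * mat[i][j] for c, mat in terms) for j in range(cols)]
            for i in range(rows)]


# Hypothetical plant and weights (not from the lecture): double integrator
F = [[0.0, 1.0], [0.0, 0.0]]
G = [[0.0], [1.0]]
Q = [[1.0, 0.0], [0.0, 1.0]]
R_INV = 1.0  # scalar control weighting r = 1, so R^-1 = 1


def p_dot(P):
    """Riccati right-hand side with M = 0: -Q - F'P - PF + P G R^-1 G' P."""
    quad = matmul(matmul(P, G), matmul(transpose(G), P))
    return lincomb([(-1.0, Q), (-1.0, matmul(transpose(F), P)),
                    (-1.0, matmul(P, F)), (R_INV, quad)])


def integrate_backward(P_f, horizon, dt=0.01):
    """RK4 with negative step: march P from t_f back to t_f - horizon."""
    P = P_f
    for _ in range(int(round(horizon / dt))):
        k1 = p_dot(P)
        k2 = p_dot(lincomb([(1.0, P), (-0.5 * dt, k1)]))
        k3 = p_dot(lincomb([(1.0, P), (-0.5 * dt, k2)]))
        k4 = p_dot(lincomb([(1.0, P), (-dt, k3)]))
        P = lincomb([(1.0, P), (-dt / 6.0, k1), (-dt / 3.0, k2),
                     (-dt / 3.0, k3), (-dt / 6.0, k4)])
    return P


def gain(P):
    """C = R^-1 G' P (M = 0 in this sketch)."""
    return [[R_INV * v for v in row] for row in matmul(transpose(G), P)]


P0 = integrate_backward([[0.0, 0.0], [0.0, 0.0]], 10.0)
print(P0, gain(P0))
```

For this particular plant the backward solution settles, far from $t_f$, to the symmetric matrix with $\sqrt{3}$ on the diagonal and 1 off the diagonal, which can be checked by substituting it into the right-hand side and confirming that $\dot{P} = 0$.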
Example of Neighboring-Optimal Control: Improved Infection Treatment via Feedback
Natural Response to Pathogen Assault (No Therapy)
Optimal Open-Loop Control

$$J = \frac{1}{2}\left(s_{11}x_{1f}^2 + s_{44}x_{4f}^2\right) + \frac{1}{2}\int_{t_0}^{t_f}\left(q_{11}x_1^2 + q_{44}x_4^2 + ru^2\right) dt$$

Tradeoff Between Rate of Killing Pathogen, Preservation of Organ Health, and Drug Use

50% Increased Infection: Nominal and Neighboring-Optimal Control (u1)

$$u(t) = u^*(t) - C(t)\Delta x(t) = u^*(t) - C(t)\left[x(t) - x^*(t)\right]$$
Optimal Control of Linear Systems
Time-Varying Linear-Quadratic (LQ) Feedback Control
LTV Continuous-time linear dynamic system

$$\Delta\dot{x}(t) = F(t)\Delta x(t) + G(t)\Delta u(t)$$

LTV optimal control law

$$\Delta u(t) = -R^{-1}(t)\left[M^T(t) + G^T(t)P(t)\right]\Delta x(t) \triangleq -C(t)\Delta x(t)$$

Linear dynamic system with LTV optimal feedback control

$$\Delta\dot{x}(t) = F(t)\Delta x(t) + G(t)\Delta u(t) = F(t)\Delta x(t) + G(t)\left[-C(t)\Delta x(t)\right] = \left[F(t) - G(t)C(t)\right]\Delta x(t)$$
LTI System with Time-Varying LQ Feedback Control
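Open-loop system is LTI, but optimal control gain matrix is time-varying

LTI dynamic system

$$\Delta\dot{x}(t) = F\Delta x(t) + G\Delta u(t)$$

$$\Delta u(t) = -R^{-1}\left[M^T + G^T P(t)\right]\Delta x(t) \triangleq -C(t)\Delta x(t)$$

Closed-loop system

$$\Delta\dot{x}(t) = F\Delta x(t) + G\left[-C(t)\Delta x(t)\right] = \left[F - GC(t)\right]\Delta x(t) = F_{closed\text{-}loop}(t)\,\Delta x(t)$$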
Example: LTV Optimal Control of a LTI First-Order System
$$\Delta\dot{x} = f\Delta x + g\Delta u$$

$$\Delta^2 J = \frac{1}{2}p_f\Delta x^2(t_f) + \frac{1}{2}\int_{t_0}^{t_f}\left(q\Delta x^2 + (0)\Delta x\Delta u + r\Delta u^2\right) dt$$

$$\Delta u = -r^{-1}\left[g\,p(t)\right]\Delta x(t) = -\frac{g\,p(t)}{r}\Delta x$$

$$\dot{p}(t) = -q - 2f\,p(t) + \frac{g^2 p^2(t)}{r}, \quad p(t_f) = p_f$$
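Example: LTV Optimal Control of a Stable First-Order System
f = −1; g = 1; q = r = 1

$$\Delta\dot{x} = -\Delta x + \Delta u; \quad x(0) = 1$$

$$\dot{p}(t) = -1 + 2p(t) + p^2(t), \quad p(t_f) = 1$$

$$\Delta u = -p(t)\Delta x$$

$$\Delta\dot{x} = -\left[1 + p(t)\right]\Delta x$$

Control gain = p(t)

[Figure: time histories of $\Delta x_{open\text{-}loop}(t)$, $\Delta x_{closed\text{-}loop}(t)$, $p(t)$, and $\Delta u(t)$]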
Example: LTV Optimal Control of an Unstable First-Order System
f = +1; g = 1

$$\Delta\dot{x} = \Delta x + \Delta u; \quad x(0) = 1$$

$$\dot{p}(t) = -1 - 2p(t) + p^2(t), \quad p(t_f) = 1$$

$$\Delta u = -p(t)\Delta x$$

$$\Delta\dot{x} = \left[1 - p(t)\right]\Delta x$$

Control gain = p(t)

[Figure: time histories of $\Delta x_{open\text{-}loop}(t)$, $\Delta x_{closed\text{-}loop}(t)$, $p(t)$, and $\Delta u(t)$]
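The two first-order examples can be reproduced with a few lines of code. The following is a minimal sketch under stated assumptions, not the code used for the lecture plots: the scalar Riccati equation is integrated backward with a fixed-step RK4, the closed loop is simulated with forward Euler, and the final time $t_f = 10$, the step size, and the function names are all chosen here for illustration.

```python
def riccati_backward(f, g, q, r, p_f, t_f, dt=1e-3):
    """Integrate p' = -q - 2 f p + g^2 p^2 / r backward from p(t_f) = p_f.

    Returns p sampled at t = 0, dt, 2 dt, ..., t_f.
    """
    def p_dot(p):
        return -q - 2.0 * f * p + g * g * p * p / r

    n = int(round(t_f / dt))
    p = [0.0] * (n + 1)
    p[n] = p_f
    h = -dt  # negative step: march from t_f toward 0
    for k in range(n, 0, -1):
        k1 = p_dot(p[k])
        k2 = p_dot(p[k] + 0.5 * h * k1)
        k3 = p_dot(p[k] + 0.5 * h * k2)
        k4 = p_dot(p[k] + h * k3)
        p[k - 1] = p[k] + h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return p


def closed_loop(f, g, r, p, x0, dt=1e-3):
    """Forward-Euler simulation of x' = (f - g c(t)) x with c(t) = g p(t) / r."""
    x = [x0]
    for k in range(len(p) - 1):
        c = g * p[k] / r
        x.append(x[-1] + dt * (f - g * c) * x[-1])
    return x


# Stable (f = -1) and unstable (f = +1) examples with g = q = r = 1, p(t_f) = 1
for f in (-1.0, 1.0):
    p = riccati_backward(f, 1.0, 1.0, 1.0, 1.0, 10.0)
    x = closed_loop(f, 1.0, 1.0, p, 1.0)
    print(f, p[0], x[-1])
```

Far from $t_f$ the gain approaches the constant root of $-q - 2fp + g^2p^2/r = 0$, which is $\sqrt{2} - 1$ for the stable plant and $\sqrt{2} + 1$ for the unstable one; in both cases the closed-loop coefficient $f - gp$ tends to $-\sqrt{2}$.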
Discrete-Time and Sampled-Data Linear, Time-Invariant (LTI) Systems
Continuous- and Discrete-Time LTI System Models
Continuous-time (“analog”) model is based on an ordinary differential equation

Dynamic Process

$$\Delta\dot{x}(t) = F\Delta x(t) + G\Delta u(t) + L\Delta w(t), \quad \Delta x(t_0) \text{ given}$$

Observation Process

$$\Delta y(t) = H_x\Delta x(t) + H_u\Delta u(t) + H_w\Delta w(t)$$

Discrete-time (“digital”) model is based on an ordinary difference equation

Dynamic Process

$$\Delta x(t_{k+1}) = \Phi\Delta x(t_k) + \Gamma\Delta u(t_k) + \Lambda\Delta w(t_k), \quad \Delta x(t_0) \text{ given}$$

Observation Process

$$\Delta y(t_k) = H_x\Delta x(t_k) + H_u\Delta u(t_k) + H_w\Delta w(t_k)$$
Digital Control Systems Use Sampled Data
• Periodic sequence

$$\Delta x_k = \Delta x(t_k) = \Delta x(k\Delta t)$$

• Sampler is an analog-to-digital (A/D) converter
• Reconstructor is a digital-to-analog (D/A) converter
LTI System Response to Inputs and Initial Conditions
Solution of a LTI dynamic model

$$\Delta\dot{x}(t) = F\Delta x(t) + G\Delta u(t) + L\Delta w(t), \quad \Delta x(t_0) \text{ given}$$

$$\Delta x(t) = \Delta x(t_0) + \int_{t_0}^{t}\left[F\Delta x(\tau) + G\Delta u(\tau) + L\Delta w(\tau)\right] d\tau$$

• ... has two parts
  – Unforced (homogeneous) response to initial conditions
  – Forced response to control and disturbance inputs

[Figure panels: Initial condition response; Step input response]
Unforced Response to Initial Conditions
Neglecting forcing functions

$$\Delta x(t) = \Delta x(t_0) + \int_{t_0}^{t}\left[F\Delta x(\tau)\right] d\tau = \Phi(t, t_0)\Delta x(t_0)$$

The state transition matrix propagates the state from t0 to t by a single multiplication

The state transition matrix is the matrix exponential

$$\Delta x(t) = \Delta x(t_0) + \int_{t_0}^{t}\left[F\Delta x(\tau)\right] d\tau = \Phi(t - t_0)\Delta x(t_0) = e^{F(t - t_0)}\Delta x(t_0)$$
LTI State Transition Matrix is the Matrix Exponential
$$e^{F(t - t_0)} \triangleq e^{F\delta t} = \text{Matrix Exponential} \triangleq \Phi(\delta t) = \text{State Transition Matrix}$$

$$= I + F\delta t + \frac{1}{2!}\left[F\delta t\right]^2 + \frac{1}{3!}\left[F\delta t\right]^3 + \ldots$$

See pages 79-84 of Optimal Control and Estimation for a description of how the State Transition Matrix is calculated for LTV and LTI systems
Response to Inputs
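Solution of the LTI model with piecewise-constant forcing functions

$$\Delta x(t_k) = \Delta x(t_{k-1}) + \int_{t_{k-1}}^{t_k}\left[F\Delta x(\tau) + G\Delta u(\tau) + L\Delta w(\tau)\right] d\tau$$

$$\Delta x(t_k) = e^{F\delta t}\Delta x(t_{k-1}) + e^{F\delta t}\int_{t_{k-1}}^{t_k}\left[e^{-F(\tau - t_{k-1})}\right] d\tau\,\left[G\Delta u(t_{k-1}) + L\Delta w(t_{k-1})\right]$$

$$= \Phi\Delta x(t_{k-1}) + \Gamma\Delta u(t_{k-1}) + \Lambda\Delta w(t_{k-1})$$

$$\delta t \triangleq t_k - t_{k-1} = \text{constant}$$

$$\Phi \triangleq e^{F\delta t}$$

$$\Gamma \triangleq e^{F\delta t}\left(I - e^{-F\delta t}\right)F^{-1}G = \left(e^{F\delta t} - I\right)F^{-1}G$$

$$\Lambda \triangleq e^{F\delta t}\left(I - e^{-F\delta t}\right)F^{-1}L = \left(e^{F\delta t} - I\right)F^{-1}L$$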
Discrete-Time LTI System Response to Step Input
Propagation of $\Delta x(t_k)$ with constant $\delta t$, $\Phi$, $\Gamma$, and $\Lambda$

$$\Delta x(t_1) = \Phi\Delta x(t_0) + \Gamma\Delta u(t_0) + \Lambda\Delta w(t_0)$$

$$\Delta x(t_2) = \Phi\Delta x(t_1) + \Gamma\Delta u(t_1) + \Lambda\Delta w(t_1)$$

$$\Delta x(t_3) = \Phi\Delta x(t_2) + \Gamma\Delta u(t_2) + \Lambda\Delta w(t_2)$$

$$\vdots$$

Step Input: Discrete-time solution is exact at sampling instants
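Series Expressions for Discrete-Time LTI System Matrices

$$\Phi = e^{F\delta t} = I + F(\delta t) + \frac{1}{2!}\left[F(\delta t)\right]^2 + \frac{1}{3!}\left[F(\delta t)\right]^3 + \cdots$$

$$\Gamma = \left(e^{F\delta t} - I\right)F^{-1}G = \left(I + \frac{1}{2!}\left[F(\delta t)\right] + \frac{1}{3!}\left[F(\delta t)\right]^2 + \ldots\right)G\,\delta t$$

$$\Lambda = \left(e^{F\delta t} - I\right)F^{-1}L = \left(I + \frac{1}{2!}\left[F(\delta t)\right] + \frac{1}{3!}\left[F(\delta t)\right]^2 + \ldots\right)L\,\delta t$$

As time interval becomes very small, discrete-time model approaches continuous-time model

$$\Phi \xrightarrow{\delta t \to 0} (I + F\delta t); \quad \Gamma \xrightarrow{\delta t \to 0} G\delta t; \quad \Lambda \xrightarrow{\delta t \to 0} L\delta t$$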
Sampled-Data Cost Function
$$\min_{\Delta u(t)} \Delta^2 J = \min_{\Delta u(t)}\left\{\frac{1}{2}\Delta x^T(t_f)P(t_f)\Delta x(t_f) + \frac{1}{2}\int_{t_0}^{t_f} \begin{bmatrix} \Delta x^T(t) & \Delta u^T(t) \end{bmatrix} \begin{bmatrix} Q & M \\ M^T & R \end{bmatrix} \begin{bmatrix} \Delta x(t) \\ \Delta u(t) \end{bmatrix} dt\right\}$$

Minimize subject to sampled-data dynamic constraint

$$\Delta x(t_{k+1}) = \Phi(\delta t)\Delta x(t_k) + \Gamma(\delta t)\Delta u(t_k)$$

Sampled-Data Cost Function: a Discrete-Time Cost Function that accounts for system response between sampling instants

Sum integrals over short time intervals, (tk, tk+1)

$$\min_{\Delta u(t)} \Delta^2 J = \min_{\Delta u(t)}\left\{\frac{1}{2}\Delta x_{k_f}^T P_{k_f}\Delta x_{k_f} + \frac{1}{2}\sum_{k=0}^{k_f - 1}\int_{t_k}^{t_{k+1}} \begin{bmatrix} \Delta x^T(t) & \Delta u^T(t) \end{bmatrix} \begin{bmatrix} Q & M \\ M^T & R \end{bmatrix} \begin{bmatrix} \Delta x(t) \\ \Delta u(t) \end{bmatrix} dt\right\}$$
Integrand of Sampled-Data Cost Function
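Use dynamic equation ...

$$\Delta x(t) = \Phi(t, t_k)\Delta x(t_k) + \Gamma(t, t_k)\Delta u(t_k) \triangleq \Phi(t, t_k)\Delta x_k + \Gamma(t, t_k)\Delta u_k$$

...to express the integrand in the sampling interval, (tk,tk+1)

$$\frac{1}{2}\sum_{k=0}^{k_f - 1}\left\{\int_{t_k}^{t_{k+1}} \begin{bmatrix} \Delta x^T(t) & \Delta u^T(t) \end{bmatrix} \begin{bmatrix} Q & M \\ M^T & R \end{bmatrix} \begin{bmatrix} \Delta x(t) \\ \Delta u(t) \end{bmatrix} dt\right\}$$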
Integrand of Sampled-Data Cost Function
$$\frac{1}{2}\sum_{k=0}^{k_f - 1}\left\{\int_{t_k}^{t_{k+1}} \begin{bmatrix} \Delta x^T(t) & \Delta u^T(t) \end{bmatrix} \begin{bmatrix} Q & M \\ M^T & R \end{bmatrix} \begin{bmatrix} \Delta x(t) \\ \Delta u(t) \end{bmatrix} dt\right\}$$

Bring state and control out of integral
Assume control is constant in sampling interval

$$= \frac{1}{2}\sum_{k=0}^{k_f - 1}\left\{\begin{bmatrix} \Delta x_k^T & \Delta u_k^T \end{bmatrix} \left(\int_{t_k}^{t_{k+1}} \begin{bmatrix} \Phi^T(t, t_k)Q\Phi(t, t_k) & \Phi^T(t, t_k)\left[Q\Gamma(t, t_k) + M\right] \\ \left[Q\Gamma(t, t_k) + M\right]^T\Phi(t, t_k) & \left[R + \Gamma^T(t, t_k)M + M^T\Gamma(t, t_k) + \Gamma^T(t, t_k)Q\Gamma(t, t_k)\right] \end{bmatrix} dt\right)_k \begin{bmatrix} \Delta x_k \\ \Delta u_k \end{bmatrix}\right\}$$

Integration has been replaced by summation

$$= \frac{1}{2}\sum_{k=0}^{k_f - 1}\left\{\begin{bmatrix} \Delta x_k^T & \Delta u_k^T \end{bmatrix} \begin{bmatrix} \hat{Q} & \hat{M} \\ \hat{M}^T & \hat{R} \end{bmatrix}_k \begin{bmatrix} \Delta x_k \\ \Delta u_k \end{bmatrix}\right\}$$

Berman, Gran, J. Aircraft, 1974
Sampled-Data Cost Function Weighting Matrices

$$\hat{Q} = \int_{t_k}^{t_{k+1}} \Phi^T(t, t_k)\,Q\,\Phi(t, t_k)\, dt$$

$$\hat{M} = \int_{t_k}^{t_{k+1}} \Phi^T(t, t_k)\left[Q\Gamma(t, t_k) + M\right] dt$$

$$\hat{R} = \int_{t_k}^{t_{k+1}} \left[R + \Gamma^T(t, t_k)M + M^T\Gamma(t, t_k) + \Gamma^T(t, t_k)Q\Gamma(t, t_k)\right] dt$$

The integrand accounts for continuous-time variations of the LTI system between sampling instants (“Inter-sample ripple”)

Assume Q, M, and R are constant in the integration interval

$\Phi(t, t_k)$ and $\Gamma(t, t_k)$ vary in the integration interval
Dynamic Programming Approach to Sampled-Data Optimal Control
Discrete-Time Hamilton-Jacobi-Bellman Equation
Value Function at to

$$V(t_0) = \phi_{k_f} + \sum_{k=0}^{k_f - 1} L_k$$

• Begin at terminal point with optimal value function
• Working backward, add minimum value function increment (Bellman’s Principle of Optimality)

Discrete HJB equation

$$V_k^* = \min_{\Delta u_k}\left\{L_k + V_{k+1}^*\right\} = \min_{\Delta u_k} H_k, \quad V^*\left[\Delta x_{k_f}^*\right] = \text{given}$$

subject to

$$\Delta x_{k+1} = \Phi\Delta x_k + \Gamma\Delta u_k$$

… optimal policy … whatever the initial state and initial decision … remaining decisions must constitute an optimal policy with regard to the current state
Sampled-Data Cost Function Contains Terminal and Summation Costs
Integral cost has been replaced by a summation cost
Terminal cost is the same

$$\min_{\Delta u_k} J_{sampled} = \min_{\Delta u_k}\left\{\frac{1}{2}\Delta x_{k_f}^T P_{k_f}\Delta x_{k_f} + \frac{1}{2}\sum_{k=0}^{k_f - 1}\left\{\begin{bmatrix} \Delta x_k^T & \Delta u_k^T \end{bmatrix} \begin{bmatrix} \hat{Q} & \hat{M} \\ \hat{M}^T & \hat{R} \end{bmatrix}_k \begin{bmatrix} \Delta x_k \\ \Delta u_k \end{bmatrix}\right\}\right\}$$

subject to

$$\Delta x_{k+1} = \Phi\Delta x_k + \Gamma\Delta u_k$$
Dynamic Programming Approach to Sampled-Data LQ Control
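Quadratic Value Function at to

$$V(t_0) = \frac{1}{2}\Delta x_{k_f}^T P_{k_f}\Delta x_{k_f} + \frac{1}{2}\sum_{k=0}^{k_f - 1}\left\{\begin{bmatrix} \Delta x_k^T & \Delta u_k^T \end{bmatrix} \begin{bmatrix} \hat{Q} & \hat{M} \\ \hat{M}^T & \hat{R} \end{bmatrix} \begin{bmatrix} \Delta x_k \\ \Delta u_k \end{bmatrix}\right\}$$

Discrete HJB equation

$$V_k^* = \min_{\Delta u_k}\left\{\frac{1}{2}\left[\Delta x_k^{*T}\hat{Q}\Delta x_k^* + 2\Delta x_k^{*T}\hat{M}\Delta u_k + \Delta u_k^T\hat{R}\Delta u_k\right] + V_{k+1}^*\right\} = \min_{\Delta u_k} H_k$$

$$V^*\left[\Delta x_{k_f}^*\right] = \frac{1}{2}\Delta x_{k_f}^{*T} P_{k_f}\Delta x_{k_f}^*$$

subject to

$$\Delta x_{k+1} = \Phi\Delta x_k + \Gamma\Delta u_k$$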
Optimality Condition
Assume value function takes a quadratic form

$$V_k = \frac{1}{2}\Delta x_k^T P_k\Delta x_k; \quad V_{k+1} = \frac{1}{2}\Delta x_{k+1}^T P_{k+1}\Delta x_{k+1}$$

Optimality condition

$$\frac{\partial H_k}{\partial \Delta u_k} = \left[\Delta x_k^T\hat{M} + \Delta u_k^T\hat{R}\right] + \frac{\partial V_{k+1}}{\partial \Delta u_k} = 0$$

where

$$\frac{\partial V_{k+1}}{\partial \Delta u_k} = \frac{\partial\left[\frac{1}{2}\Delta x_{k+1}^T P_{k+1}\Delta x_{k+1}\right]}{\partial \Delta u_k} = \left[\Phi\Delta x_k + \Gamma\Delta u_k\right]^T P_{k+1}\Gamma$$

hence

$$\left[\Delta x_k^T\hat{M} + \Delta u_k^T\hat{R}\right] + \left[\Delta x_k^T\Phi^T + \Delta u_k^T\Gamma^T\right]P_{k+1}\Gamma = 0$$
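Minimizing Value of Control

$$\frac{\partial H_k}{\partial \Delta u_k} = \Delta x_k^T\left[\hat{M} + \Phi^T P_{k+1}\Gamma\right] + \Delta u_k^T\left[\hat{R} + \Gamma^T P_{k+1}\Gamma\right] = 0$$

$$\Delta u_k = -\left[\hat{R} + \Gamma^T P_{k+1}\Gamma\right]^{-1}\left[\hat{M}^T + \Gamma^T P_{k+1}\Phi\right]\Delta x_k \triangleq -C_k\Delta x_k$$

Must find $P_k$ in $(0, k_f)$
Use definitions of $V^*$ and $\Delta u$ in HJB equation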
Solution for Pk
$$V_k = \frac{1}{2}\Delta x_k^T P_k\Delta x_k; \quad V_{k+1} = \frac{1}{2}\Delta x_{k+1}^T P_{k+1}\Delta x_{k+1}$$

$$\Delta u_k = -\left[\hat{R} + \Gamma^T P_{k+1}\Gamma\right]^{-1}\left[\hat{M}^T + \Gamma^T P_{k+1}\Phi\right]\Delta x_k \triangleq -C_k\Delta x_k$$

Substitute for $V_k$, $V_{k+1}$, and $\Delta u_k$ in discrete-time HJB equation

$$\frac{1}{2}\Delta x_k^T P_k\Delta x_k = \min_{\Delta u_k}\left\{\frac{1}{2}\left[\Delta x_k^T\hat{Q}\Delta x_k + 2\Delta x_k^T\hat{M}\left(-C_k\Delta x_k\right) + \left(-C_k\Delta x_k\right)^T\hat{R}\left(-C_k\Delta x_k\right) + \Delta x_{k+1}^T P_{k+1}\Delta x_{k+1}\right]\right\}$$

Rearrange and cancel $\Delta x_k$ on both sides of the equation to yield the discrete-time Riccati equation

$$P_k = \hat{Q} + \Phi^T P_{k+1}\Phi - \left[\hat{M}^T + \Gamma^T P_{k+1}\Phi\right]^T\left[\hat{R} + \Gamma^T P_{k+1}\Gamma\right]^{-1}\left[\hat{M}^T + \Gamma^T P_{k+1}\Phi\right], \quad P_{k_f} \text{ given}$$
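The backward recursion for $P_k$ and $C_k$ is easy to exercise on a scalar system, where every matrix product reduces to ordinary multiplication. The sketch below is illustrative only: the function names are made up for this example, and the numbers passed in ($\Phi = 0.9$, $\Gamma = 0.1$, $\hat{Q} = 0.09$, $\hat{M} = 0.005$, $\hat{R} = 0.1$, $P_{k_f} = 1$) are placeholder values, not data from the lecture.

```python
def riccati_recursion(phi, gamma, q_hat, m_hat, r_hat, p_final, k_final):
    """Backward sweep of the scalar discrete-time Riccati equation.

    Returns (P, C) with P[k] for k = 0..k_final and C[k] for k = 0..k_final-1.
    """
    P = [0.0] * (k_final + 1)
    C = [0.0] * k_final
    P[k_final] = p_final
    for k in range(k_final - 1, -1, -1):
        p_next = P[k + 1]
        denom = r_hat + gamma * p_next * gamma   # R_hat + Gamma' P Gamma
        cross = m_hat + gamma * p_next * phi     # M_hat' + Gamma' P Phi
        C[k] = cross / denom                     # feedback gain C_k
        P[k] = q_hat + phi * p_next * phi - cross * cross / denom
    return P, C


def simulate(phi, gamma, C, dx0):
    """Closed-loop propagation dx[k+1] = (Phi - Gamma C_k) dx[k]."""
    dx = [dx0]
    for c in C:
        dx.append((phi - gamma * c) * dx[-1])
    return dx


# Placeholder numbers chosen for illustration
P, C = riccati_recursion(0.9, 0.1, 0.09, 0.005, 0.1, 1.0, 200)
dx = simulate(0.9, 0.1, C, 1.0)
print(P[0], C[0], dx[-1])
```

With a long horizon the gain far from $k_f$ becomes nearly constant, and $P_0$ approximately satisfies the Riccati equation as a fixed point, which gives a convenient sanity check on the recursion.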
Discrete-Time System with Linear-Quadratic Feedback Control
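Dynamic System

$$\Delta x_{k+1} = \Phi\Delta x_k + \Gamma\Delta u_k$$

Control law

$$\Delta u_k = -\left[\hat{R} + \Gamma^T P_{k+1}\Gamma\right]^{-1}\left[\hat{M}^T + \Gamma^T P_{k+1}\Phi\right]\Delta x_k \triangleq -C_k\Delta x_k$$

Dynamic system with LQ feedback control

$$\Delta x_{k+1} = \Phi\Delta x_k + \Gamma\Delta u_k = \Phi\Delta x_k + \Gamma\left(-C_k\Delta x_k\right) = \left(\Phi - \Gamma C_k\right)\Delta x_k$$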
Next Time: Dynamic System Stability
Reading
OCE: Section 2.5
Supplemental Material
Neighboring Trajectories
• Nominal (or reference) trajectory and control history

$$\{x_N(t), u_N(t), w_N(t)\} \text{ for } t \text{ in } [t_0, t_f]$$

• Trajectory perturbed by
  – Small initial condition variation
  – Small control variation
  – Small disturbance variation

$$\{x(t), u(t), w(t)\} \text{ for } t \text{ in } [t_0, t_f] = \{x_N(t) + \Delta x(t),\, u_N(t) + \Delta u(t),\, w_N(t) + \Delta w(t)\}$$

x : dynamic state
u : control input
w : disturbance input
Example: Separate Solutions for Nominal and Perturbation Trajectories
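Original nonlinear equation describes nominal dynamics

$$\dot{x}_N = \begin{bmatrix} \dot{x}_{1N} \\ \dot{x}_{2N} \\ \dot{x}_{3N} \end{bmatrix} = \begin{bmatrix} x_{2N} + d\,w_{1N} \\ a_2\left(x_{3N} - x_{2N}\right) + a_1\left(x_{3N} - x_{1N}\right)^2 + b_1 u_{1N} + b_2 u_{2N} \\ c_2 x_{3N}^3 + c_1\left(x_{1N} + x_{2N}\right) + b_3 x_{1N} u_{1N} \end{bmatrix}, \quad \begin{bmatrix} x_{1N}(t_0) \\ x_{2N}(t_0) \\ x_{3N}(t_0) \end{bmatrix} \text{ given}$$

Linear, time-varying equation describes perturbation dynamics

$$\begin{bmatrix} \Delta\dot{x}_1 \\ \Delta\dot{x}_2 \\ \Delta\dot{x}_3 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 \\ -2a_1\left(x_{3N} - x_{1N}\right) & -a_2 & a_2 + 2a_1\left(x_{3N} - x_{1N}\right) \\ \left(c_1 + b_3 u_{1N}\right) & c_1 & 3c_2 x_{3N}^2 \end{bmatrix} \begin{bmatrix} \Delta x_1 \\ \Delta x_2 \\ \Delta x_3 \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ b_1 & b_2 \\ b_3 x_{1N} & 0 \end{bmatrix} \begin{bmatrix} \Delta u_1 \\ \Delta u_2 \end{bmatrix} + \begin{bmatrix} d \\ 0 \\ 0 \end{bmatrix}\Delta w_1; \quad \begin{bmatrix} \Delta x_1(t_0) \\ \Delta x_2(t_0) \\ \Delta x_3(t_0) \end{bmatrix} \text{ given}$$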
Example: LQ Optimal Control, Stable First-Order System, “White-Noise” Disturbance

$$\dot{p}(t) = -q + 2p(t) + p^2(t); \quad q = 1 \text{ or } 100; \quad p(t_f) = 1$$

$$\Delta\dot{x} = -\Delta x + \Delta u + \Delta w; \quad x(0) = 0$$

$$\Delta u = -p(t)\Delta x; \quad \Delta\dot{x} = -\left[1 + p(t)\right]\Delta x + \Delta w$$

[Figure: two columns of plots, for q = 1 and q = 100. Panels: Open-Loop Response, $\Delta x_{open\text{-}loop}(t)$; $p(t)$; $\Delta u(t)$; Open- and Closed-Loop Responses]
Example: LQ Optimal Control, Stable First-Order System, “White-Noise” Disturbance

$$\dot{p}(t) = -1 + 2p(t) + p^2(t); \quad p(t_f) = 1 \text{ or } 1000$$

$$\Delta\dot{x} = -\Delta x + \Delta u + \Delta w; \quad x(0) = 0$$

$$\Delta u = -p(t)\Delta x; \quad \Delta\dot{x} = -\left[1 + p(t)\right]\Delta x + \Delta w$$

[Figure: two columns of plots, for $p_f = 1$ and $p_f = 1000$. Panels: Open-Loop Response, $\Delta x_{open\text{-}loop}(t)$; $p(t)$; $\Delta u(t)$; Open- and Closed-Loop Responses]
Example: LQ Optimal Control, Stable First-Order System, “White-Noise” Disturbance

$$\dot{p}(t) = -100 + 2p(t) + p^2(t); \quad p(t_f) = 1 \text{ or } 0$$

$$\Delta\dot{x} = -\Delta x + \Delta u + \Delta w; \quad x(0) = 0$$

$$\Delta u = -p(t)\Delta x; \quad \Delta\dot{x} = -\left[1 + p(t)\right]\Delta x + \Delta w$$

[Figure: two columns of plots, for $p_f = 1$ and $p_f = 0$. Panels: $\Delta x_{open\text{-}loop}(t)$; $p(t)$; $\Delta u(t)$; $\Delta x_{open\text{-}loop}(t)$ and $\Delta x_{closed\text{-}loop}(t)$]
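Multivariable LQ Optimal Control with Cross Weighting, M = 0

$$\Delta\dot{x}(t) = F\Delta x(t) + G\Delta u(t)$$

No state/control coupling in cost function

$$\Delta^2 J = \frac{1}{2}\Delta x^T(t_f)P(t_f)\Delta x(t_f) + \frac{1}{2}\int_{t_0}^{t_f} \begin{bmatrix} \Delta x^T(t) & \Delta u^T(t) \end{bmatrix} \begin{bmatrix} Q & 0 \\ 0 & R \end{bmatrix} \begin{bmatrix} \Delta x(t) \\ \Delta u(t) \end{bmatrix} dt$$

$$\Delta u(t) = -R^{-1}\left[G^T P(t)\right]\Delta x(t)$$

$$\dot{P}(t) = -Q - F^T P(t) - P(t)F + P(t)GR^{-1}G^T P(t), \quad P(t_f) = P_f$$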
Nonlinear System with Neighboring-Optimal Feedback Control
Nonlinear dynamic system

$$\dot{x}(t) = f\left[x(t), u(t)\right]$$

Neighboring-optimal control law

$$u(t) = u^*(t) - C(t)\Delta x(t) = u^*(t) - C(t)\left[x(t) - x^*(t)\right]$$

Nonlinear dynamic system with neighboring-optimal feedback control

$$\dot{x}(t) = f\left\{x(t), \left[u^*(t) - C(t)\left[x(t) - x^*(t)\right]\right]\right\}$$
Development of Neighboring-Optimal Therapy
• Expand dynamic equation to first degree

$$x(t) = x^*(t) + \Delta x(t); \quad u(t) = u^*(t) + \Delta u(t)$$

• Compute nominal optimal control history using original nonlinear dynamic model

$$u(t) = u_{opt}(t)$$

• Compute optimal perturbation control using locally linearized dynamic model

$$\Delta u(t) = -C(t)\left[x(t) - x_{opt}(t)\right]$$

• Sum the two for neighboring-optimal control of the dynamic system

$$u(t) = u_{opt}(t) + \Delta u(t)$$
First-Order LQ Example Code

% First-Order LQ Example
% Rob Stengel
% 2/23/2011
clear
global tp p q tw w

xo = 0; to = 0; tf = 10;
tspanx = [to tf];
tw = 0:0.01:10;
w = randn(size(tw));                  % "white-noise" disturbance samples

% Open-loop response
[tx,x] = ode15s(@First,tspanx,xo);

% Riccati equation, integrated backward from tf to to
pf = 0; q = 100;
tspanp = [tf to];
[tp,p] = ode15s(@FirstRiccati,tspanp,pf);

% Closed-loop response and control history
[tc,xc] = ode15s(@FirstCL,tspanx,xo);
u = -interp1(tp,p,tc).*xc;            % control law: u = -p(t)*x

figure
subplot(4,1,1)
plot(tx,x),grid,xlabel('Time'),ylabel('x-open-loop')
subplot(4,1,2)
plot(tp,p,'r'),grid,xlabel('Time'),ylabel('p')
subplot(4,1,3)
plot(tc,u),grid,xlabel('Time'),ylabel('u')
subplot(4,1,4)
plot(tc,xc,'r',tx,x,'b'),grid,xlabel('Time'),ylabel('x-closed-loop')

function xdot = First(t,x)
% Open-loop dynamics with disturbance
global tp p tw w
wt = interp1(tw,w,t,'nearest');
xdot = -x + wt;

function pdot = FirstRiccati(t,p)
% Scalar Riccati equation, f = -1, g = 1, r = 1
global q
pdot = -q + 2*p + p^2;

function xdot = FirstCL(tc,xc)
% Closed-loop dynamics with time-varying gain p(t)
global tp p tw w
wt = interp1(tw,w,tc,'nearest');
pt = interp1(tp,p,tc);
xdot = -(1 + pt)*xc + wt;
Initial-Condition Response via State Transition
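Propagation of $\Delta x(t_k)$ in LTI system

$$
\begin{aligned}
\Delta x(t_1) &= \Phi(t_1 - t_0)\,\Delta x(t_0) \\
\Delta x(t_2) &= \Phi(t_2 - t_1)\,\Delta x(t_1) \\
\Delta x(t_3) &= \Phi(t_3 - t_2)\,\Delta x(t_2) \\
&\;\;\vdots
\end{aligned}
$$

State transition matrix is constant if $(t_k - t_{k-1}) = \delta t = \text{constant}$

$$\Phi = e^{F\delta t} = I + F\,\delta t + \frac{1}{2!}\left[F\,\delta t\right]^2 + \frac{1}{3!}\left[F\,\delta t\right]^3 + \dots$$

$$
\begin{aligned}
\Delta x(t_1) &= \Phi(\delta t)\,\Delta x(t_0) = \Phi\,\Delta x(t_0) \\
\Delta x(t_2) &= \Phi\,\Delta x(t_1) = \Phi^2\,\Delta x(t_0) \\
\Delta x(t_3) &= \Phi\,\Delta x(t_2) = \Phi^3\,\Delta x(t_0) \\
&\;\;\vdots
\end{aligned}
$$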
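As a rough numerical check of the series and of the repeated-multiplication property, the following MATLAB sketch compares a truncated power series with the built-in `expm`. The matrix `F` is borrowed from the example slide that follows; the 10-term truncation, the step `dt`, and the initial perturbation `dx0` are arbitrary choices made for illustration.

```matlab
F  = [-1.2794 -7.9856; 1 -1.2709];   % system matrix from the example that follows
dt = 0.1;                            % sampling interval, s (assumed)

% Truncated power series: I + F*dt + (F*dt)^2/2! + ... (10 terms, assumed)
Phi_series = eye(2);
term = eye(2);
for n = 1:10
    term = term * (F*dt) / n;        % builds (F*dt)^n / n!
    Phi_series = Phi_series + term;
end

% Matrix exponential from the built-in function
Phi = expm(F*dt);
series_error = norm(Phi_series - Phi);

% Three propagation steps: step by step, and with a matrix power
dx0 = [1; 0];                        % arbitrary initial perturbation
dx3_stepwise = Phi * (Phi * (Phi * dx0));
dx3_power    = Phi^3 * dx0;
```

For a time step this small the truncated series and `expm` should agree closely, and the two propagation results should match to within roundoff.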
Example: Equivalent Continuous-Time and Discrete-Time System Matrices
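Continuous-time (“analog”) system

$$
F = \begin{bmatrix} -1.2794 & -7.9856 \\ 1 & -1.2709 \end{bmatrix}, \qquad
G = \begin{bmatrix} -9.069 \\ 0 \end{bmatrix}, \qquad
L = \begin{bmatrix} -7.9856 \\ -1.2709 \end{bmatrix}
$$

Discrete-time (“digital”) system, δt = 0.1 s

$$
\Phi = \begin{bmatrix} 0.845 & -0.6936 \\ 0.0869 & 0.8457 \end{bmatrix}, \qquad
\Gamma = \begin{bmatrix} -0.8404 \\ -0.0414 \end{bmatrix}, \qquad
\Lambda = \begin{bmatrix} -0.6936 \\ -0.1543 \end{bmatrix}
$$

Discrete-time (“digital”) system, δt = 0.5 s

$$
\Phi = \begin{bmatrix} 0.0823 & -1.4751 \\ 0.1847 & 0.0839 \end{bmatrix}, \qquad
\Gamma = \begin{bmatrix} -2.4923 \\ -0.6429 \end{bmatrix}, \qquad
\Lambda = \begin{bmatrix} -1.4751 \\ -0.9161 \end{bmatrix}
$$

Time interval has a large effect on the discrete-time matrices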
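The discrete-time matrices can be recomputed from the continuous-time ones with a few lines of MATLAB. This is a sketch that assumes the relations Φ = e^{Fδt}, Γ = (Φ − I)F⁻¹G, and Λ = (Φ − I)F⁻¹L, and that F is invertible; it uses only base-MATLAB functions, and the variable names are illustrative.

```matlab
F = [-1.2794 -7.9856; 1 -1.2709];    % continuous-time system matrix
G = [-9.069; 0];                     % control-effect matrix
L = [-7.9856; -1.2709];              % disturbance-effect matrix

for dt = [0.1 0.5]                   % sampling intervals, s
    Phi = expm(F*dt);                % state transition matrix
    Gam = (Phi - eye(2)) / F * G;    % (Phi - I) * inv(F) * G
    Lam = (Phi - eye(2)) / F * L;    % (Phi - I) * inv(F) * L
    fprintf('dt = %.1f s\n', dt);
    disp(Phi), disp(Gam), disp(Lam)
end
```

The printed matrices can be compared against the values tabulated on the slide. Because L here equals the second column of F, Λ reduces to the second column of Φ − I, which gives a quick consistency check.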
Evaluating Sampled-Data Weighting Matrices
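$\Phi(t, t_k)$ and $\Gamma(t, t_k)$ vary in the integration interval

Break interval into smaller intervals, and approximate as sum of short rectangular integration steps

$$
\begin{aligned}
\hat{Q} &= \int_0^{\Delta t} \Phi^T(t,0)\,Q\,\Phi(t,0)\,dt \\
&\approx \sum_{k=1}^{100} \left[\Phi^T(t_{k-1},0)\,Q\,\Phi(t_{k-1},0)\,\delta t\right], \qquad \delta t = \Delta t/100, \quad t_k = k\,\delta t \\
&\approx \sum_{k=1}^{100} \left[e^{F^T t_{k-1}}\,Q\,e^{F t_{k-1}}\,\delta t\right]
\end{aligned}
$$

$\hat{Q}$, $\hat{M}$, and $\hat{R}$ evaluated just once for LTI system

$$\Gamma = \left(e^{F\delta t} - I\right)F^{-1}G = \left(I + \frac{1}{2!}\left[F\,\delta t\right] + \frac{1}{3!}\left[F\,\delta t\right]^2 + \dots\right)G\,\delta t$$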
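$$
\begin{aligned}
\hat{M} &= \int_0^{\Delta t} \Phi^T(t,0)\left[Q\,\Gamma(t,0) + M\right]dt \\
&\approx \sum_{k=1}^{100} \left\{ e^{F^T t_{k-1}} \left[ Q\left(I + \frac{1}{2!}\left[F t_{k-1}\right] + \frac{1}{3!}\left[F t_{k-1}\right]^2 + \dots\right)G\,t_{k-1} + M \right] \right\}\delta t
\end{aligned}
$$

$$\hat{R} \approx \sum_{k=1}^{100} \left[ R + \Gamma^T(t_{k-1})\,M + M^T\,\Gamma(t_{k-1}) + \Gamma^T(t_{k-1})\,Q\,\Gamma(t_{k-1}) \right]\delta t$$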
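The rectangular-integration sums for the sampled-data weighting matrices might be coded as in the sketch below. `F` and `G` are taken from the earlier example; the continuous-time weights `Q`, `M`, `R` and the sampling interval `Dt` are assumed values chosen only for illustration, and Γ(t) is evaluated in closed form as (e^{Ft} − I)F⁻¹G rather than by series.

```matlab
F = [-1.2794 -7.9856; 1 -1.2709];    % system matrix from the earlier example
G = [-9.069; 0];                     % control-effect matrix from the earlier example
Q = eye(2);                          % assumed state weighting
M = zeros(2,1);                      % assumed cross weighting (none)
R = 1;                               % assumed control weighting

Dt = 0.1;                            % sampling interval, s (assumed)
N  = 100;                            % number of rectangular sub-steps
dt = Dt / N;

Qhat = zeros(2); Mhat = zeros(2,1); Rhat = 0;
for k = 1:N
    t   = (k-1)*dt;                  % left edge of the k-th sub-step
    Phi = expm(F*t);                 % Phi(t,0)
    Gam = (Phi - eye(2)) / F * G;    % Gamma(t,0)
    Qhat = Qhat + Phi' * Q * Phi * dt;
    Mhat = Mhat + Phi' * (Q*Gam + M) * dt;
    Rhat = Rhat + (R + Gam'*M + M'*Gam + Gam'*Q*Gam) * dt;
end
```

With `M = 0` as assumed here, `Mhat` still comes out nonzero, because the term `Phi' * Q * Gam` accumulates over the interval.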
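Sampled-Data Cost Function Weighting Always Includes State-Control Weighting

$$
\begin{aligned}
\hat{M} &= \int_{t_k}^{t_{k+1}} \Phi^T(t,t_k)\left[Q\,\Gamma(t,t_k) + M\right]dt \\
&= \int_{t_k}^{t_{k+1}} \Phi^T(t,t_k)\,Q\,\Gamma(t,t_k)\,dt \quad \text{even if } M = 0
\end{aligned}
$$

Sampled-Data Lagrangian

$$L_k = \frac{1}{2}\left[\Delta x_k^T \hat{Q}\,\Delta x_k + 2\,\Delta x_k^T \hat{M}\,\Delta u_k + \Delta u_k^T \hat{R}\,\Delta u_k\right]$$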
Continuous-Time LQ Optimization via Dynamic Programming
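Dynamic Programming Approach to Continuous-Time LQ Control

Value Function at $t_o$

$$
V(t_o) = \frac{1}{2}\Delta x^T(t_f)\,P(t_f)\,\Delta x(t_f)
+ \frac{1}{2}\int_{t_o}^{t_f} \left\{
\begin{bmatrix} \Delta x^T(t) & \Delta u^T(t) \end{bmatrix}
\begin{bmatrix} Q(t) & M(t) \\ M^T(t) & R(t) \end{bmatrix}
\begin{bmatrix} \Delta x(t) \\ \Delta u(t) \end{bmatrix}
\right\} dt
$$

Value Function at $t_1$

$$
V(t_1) = \frac{1}{2}\Delta x^T(t_f)\,P(t_f)\,\Delta x(t_f)
- \frac{1}{2}\int_{t_f}^{t_1} \left\{
\begin{bmatrix} \Delta x^T(t) & \Delta u^T(t) \end{bmatrix}
\begin{bmatrix} Q(t) & M(t) \\ M^T(t) & R(t) \end{bmatrix}
\begin{bmatrix} \Delta x(t) \\ \Delta u(t) \end{bmatrix}
\right\} dt
$$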
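Time Derivative of Value Function

$$
\frac{\partial V^*\left[\Delta x^*(t_1)\right]}{\partial t} =
-\min_{\Delta u} \left\{
\begin{aligned}
&\frac{1}{2}\left[\Delta x^{*T}(t_1)\,Q(t_1)\,\Delta x^*(t_1) + 2\,\Delta x^{*T}(t_1)\,M(t_1)\,\Delta u(t_1) + \Delta u^T(t_1)\,R(t_1)\,\Delta u(t_1)\right] \\
&+ \frac{\partial V^*\left[\Delta x^*(t_1)\right]}{\partial \Delta x}\left[F(t_1)\,\Delta x^*(t_1) + G(t_1)\,\Delta u(t_1)\right]
\end{aligned}
\right\}
$$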
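HJB Equation

$$
\frac{\partial V^*\left[\Delta x^*\right]}{\partial t} = -\min_{\Delta u} H\left[\Delta x^*, \Delta u, \frac{\partial V^*}{\partial \Delta x}\right], \qquad
V^*\left[\Delta x^*(t_f)\right] = \frac{1}{2}\Delta x^{*T}(t_f)\,P(t_f)\,\Delta x^*(t_f)
$$

Hamiltonian

$$
H\left[\Delta x^*, \Delta u, \frac{\partial V^*}{\partial \Delta x}\right] \triangleq
\frac{1}{2}\left[\Delta x^{*T} Q\,\Delta x^* + 2\,\Delta x^{*T} M\,\Delta u + \Delta u^T R\,\Delta u\right]
+ \frac{\partial V^*\left[\Delta x^*\right]}{\partial \Delta x}\left[F\,\Delta x^* + G\,\Delta u\right]
$$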
Plausible Form for the Value Function
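Quadratic Function of State Perturbation

$$V^*\left[\Delta x^*(t)\right] = \frac{1}{2}\Delta x^{*T}(t)\,P(t)\,\Delta x^*(t)$$

Time Derivative of the Value Function

$$\frac{\partial V^*}{\partial t} = \frac{1}{2}\Delta x^{*T}(t_1)\,\dot{P}(t_1)\,\Delta x^*(t_1)$$

Gradient of the Value Function with respect to the state

$$\frac{\partial V^*}{\partial \Delta x} = \Delta x^{*T}(t)\,P(t)$$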
Optimal Control Law and HJB Equation
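Optimal control law

$$\frac{\partial H}{\partial \Delta u} = \Delta x^T M + \Delta u^T R + \Delta x^T P\,G = 0$$

$$\Delta u(t) = -R^{-1}\left(G^T P + M^T\right)\Delta x(t)$$

Incorporate Value Function Model in HJB equation

$$
\Delta x^T \dot{P}\,\Delta x =
\Delta x^T \left\{ \left[-Q + M R^{-1} M^T\right] - \left[F - G R^{-1} M^T\right]^T P - P\left[F - G R^{-1} M^T\right] + P\,G R^{-1} G^T P \right\} \Delta x
$$

$\Delta x(t)$ can be cancelled on left and right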
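The substitution behind this result can be written out step by step. The following is a sketch that takes $V^* = \tfrac{1}{2}\Delta x^T P\,\Delta x$ and assumes $P$ and $R$ are symmetric; the shorthand $C$ for the gain matrix follows the neighboring-optimal control law given earlier.

$$
\begin{aligned}
C &\triangleq R^{-1}\left(G^T P + M^T\right), \qquad \Delta u = -C\,\Delta x \\
\tfrac{1}{2}\Delta x^T \dot{P}\,\Delta x
&= -\left\{ \tfrac{1}{2}\left[\Delta x^T Q\,\Delta x + 2\,\Delta x^T M\,\Delta u + \Delta u^T R\,\Delta u\right] + \Delta x^T P\left(F\,\Delta x + G\,\Delta u\right) \right\} \\
&= -\tfrac{1}{2}\Delta x^T \left[ Q + P F + F^T P - 2\left(M + P G\right) C + C^T R\,C \right] \Delta x \\
&= -\tfrac{1}{2}\Delta x^T \left[ Q + P F + F^T P - \left(P G + M\right) R^{-1} \left(G^T P + M^T\right) \right] \Delta x
\end{aligned}
$$

The last line uses $(M + PG)\,C = C^T R\,C$. Expanding the product $(PG + M)R^{-1}(G^T P + M^T)$ and grouping the cross terms with $F$ gives the bracketed expression above.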
Matrix Riccati Equation
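$$
\begin{aligned}
\dot{P}(t) = {} & \left[-Q(t) + M(t)\,R^{-1}(t)\,M^T(t)\right] \\
& - \left[F(t) - G(t)\,R^{-1}(t)\,M^T(t)\right]^T P(t) \\
& - P(t)\left[F(t) - G(t)\,R^{-1}(t)\,M^T(t)\right] + P(t)\,G(t)\,R^{-1}(t)\,G^T(t)\,P(t)
\end{aligned}
$$

$$P(t_f) = \phi_{xx}(t_f)$$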
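A minimal MATLAB sketch of integrating the matrix Riccati equation backward in time is given below, in the spirit of the first-order example code. The constant matrices `F` and `G` come from the earlier second-order example; the weights `Q`, `M`, `R`, the terminal condition `Pf`, and the final time `tf` are assumptions made for illustration.

```matlab
F  = [-1.2794 -7.9856; 1 -1.2709];   % system matrix from the earlier example
G  = [-9.069; 0];                    % control-effect matrix from the earlier example
Q  = eye(2);                         % assumed state weighting
M  = zeros(2,1);                     % assumed cross weighting
R  = 1;                              % assumed control weighting
Pf = zeros(2);                       % assumed terminal condition P(tf)
tf = 10;                             % assumed final time, s

Fm = F - G*(R\M');                   % F - G*inv(R)*M'
Qm = Q - M*(R\M');                   % Q - M*inv(R)*M'

% Riccati right-hand side, acting on P stored as a 4-element column
unvec = @(p) reshape(p, 2, 2);
Pdot  = @(P) -Qm - Fm'*P - P*Fm + P*G*(R\G')*P;
rhs   = @(t, p) reshape(Pdot(unvec(p)), [], 1);

% Integrate backward from tf to 0
[tp, p] = ode45(rhs, [tf 0], Pf(:));

P0 = unvec(p(end,:));                % P at t = 0
C0 = R \ (G'*P0 + M');               % feedback gain at t = 0
closed_loop_poles = eig(F - G*C0);
```

Interpolating the rows of `p` against `tp` would give the time-varying gain needed by the control law at intermediate times.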
Example: 1st-Order System with LQ Feedback Control
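1st-order discrete-time dynamic system

$$\Delta x_{k+1} = \phi\,\Delta x_k + \gamma\,\Delta u_k$$

LQ optimal control law

$$\Delta u_k = -\frac{m + \phi\gamma\,p_{k+1}}{r + \gamma^2 p_{k+1}}\,\Delta x_k \triangleq -c_k\,\Delta x_k$$

$$p_k = q + \phi^2 p_{k+1} - \frac{\left(m + \phi\gamma\,p_{k+1}\right)^2}{r + \gamma^2 p_{k+1}}, \qquad p_{k_f} \text{ given}$$

Dynamic system with LQ feedback control

$$
\begin{aligned}
\Delta x_{k+1} &= \phi\,\Delta x_k + \gamma\,\Delta u_k \\
&= \phi\,\Delta x_k + \gamma\left(-c_k\,\Delta x_k\right) \\
&= \left(\phi - \gamma\,c_k\right)\Delta x_k
\end{aligned}
$$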
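The backward recursion for $p_k$ and the forward closed-loop simulation can be sketched in a few lines of MATLAB. All numerical values (`phi`, `gam`, `q`, `r`, `m`, the horizon `kf`, the terminal value of `p`, and the initial state) are assumptions chosen only to make the example run.

```matlab
phi = 0.9; gam = 0.1;                % assumed system parameters
q = 1; r = 1; m = 0;                 % assumed cost weights
kf = 50;                             % assumed number of steps

% MATLAB indices start at 1, so p(k) holds the value at step k-1
p = zeros(1, kf+1);
c = zeros(1, kf);
p(kf+1) = 0;                         % assumed terminal value p_kf

% Backward recursion for p and the feedback gain c
for k = kf:-1:1
    c(k) = (m + phi*gam*p(k+1)) / (r + gam^2*p(k+1));
    p(k) = q + phi^2*p(k+1) - (m + phi*gam*p(k+1))^2 / (r + gam^2*p(k+1));
end

% Forward simulation of the closed-loop system
x = zeros(1, kf+1);
x(1) = 1;                            % assumed initial perturbation
for k = 1:kf
    x(k+1) = (phi - gam*c(k)) * x(k);
end
```

Plotting `p`, `c`, and `x` against the step index shows the gain settling to a nearly constant value away from the final step.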