
LECTURE NOTES ON LINEAR QUADRATIC CONTROL

VERSION OF 02-07-2010

J. FREDERIC BONNANS

Contents

1. Unconstrained finite horizon linear quadratic problem
1.1. Critical points of quadratic functionals
1.2. Shooting function and Hamiltonian flow
1.3. Riccati equation
1.4. Expression of the critical value
1.5. Legendre forms and minima of quadratic functions
1.6. Spectral analysis
2. Infinite horizon problems
2.1. Setting
2.2. Kalman state space theory
2.3. The algebraic Riccati equation
2.4. Computation of the solution of the ARE
2.5. Illustration
3. Notes
References
Index

Overview. We give in this chapter a concise presentation of the theory of optimal control for unconstrained linear quadratic problems (i.e., with a linear state equation and a quadratic cost). The first section starts with the discussion of critical points of quadratic functionals, the shooting equations, and the associated Riccati equation. It continues with the analysis of minimization problems, their well-posedness, and the spectral characterization of conjugate points.

The second section is devoted to infinite horizon problems. We recall some basic concepts of Kalman's state space theory, obtain an existence and uniqueness result for the solution of the algebraic Riccati equation, and give a numerical method for solving it. We end the chapter with a numerical illustration on the problem of the (linearized) inverted pendulum.


Notations. We denote the Euclidean norm of $x \in \mathbb{R}^n$ by $|x|$. The transpose of a matrix $A$ is $A^\top$.

The set of $n$-dimensional real column vectors is denoted $\mathbb{R}^n$; its dual, whose elements are row vectors, is denoted $\mathbb{R}^{n*}$. Generally speaking, Lagrange multipliers (including the costate) are viewed as dual vectors. The scalar product and norm in $\mathbb{R}^n$ are denoted by "$\cdot$" and "$|\cdot|$", respectively.

1. Unconstrained finite horizon linear quadratic problem

1.1. Critical points of quadratic functionals. Consider the following dynamical system

(1)  $\dot y_t = A_t y_t + B_t u_t, \quad t \in [s,T]; \qquad y_s = x,$

with initial time $s \le T$, where the matrices $A_t$ and $B_t$, of size $n \times n$ and $n \times m$ respectively, are measurable and essentially bounded functions of time. Denote the control and state spaces by

$\mathcal{U}_{s,T} := L^2(s,T,\mathbb{R}^m); \qquad \mathcal{Y}_{s,T} := H^1(s,T,\mathbb{R}^n).$

With each $u \in \mathcal{U}_{s,T}$ is associated a unique solution in $\mathcal{Y}_{s,T}$ of (1), called the state and denoted $y[u]$. Define the criterion

(2)  $F(u,y) := \tfrac12 \int_s^T \left[ y_t \cdot C_t y_t + 2 u_t \cdot D_t y_t + u_t \cdot R_t u_t \right] dt + \tfrac12\, y_T \cdot M y_T.$

The matrices $C_t$, $D_t$ and $R_t$ are measurable, essentially bounded functions of time of dimension $n \times n$, $m \times n$ and $m \times m$, respectively. The function $F$ is therefore well defined from $\mathcal{U}_{s,T} \times \mathcal{Y}_{s,T}$ into $\mathbb{R}$. Denote

(3) J(u) := F (u, y[u]).

Being quadratic and continuous, $J$ has a derivative which is an affine function of $u$. We say that $u$ is a critical point of $J$ if $DJ(u) = 0$.

Let us show how to express the derivative of $J$. The (topological) dual spaces, i.e. the spaces of continuous linear forms over $\mathcal{U}_{s,T}$ and $\mathcal{Y}_{s,T}$, are

(4)  $\mathcal{U}^*_{s,T} := L^2(s,T,\mathbb{R}^{m*}); \qquad \mathcal{Y}^*_{s,T} := H^1(s,T,\mathbb{R}^{n*}).$

It is known that $\mathcal{Y}_{s,T} \subset C([s,T],\mathbb{R}^n)$, with compact inclusion (see footnote 1). We will repeatedly use the following integration by parts formula:

Lemma 1.1. Let $y \in \mathcal{Y}_{s,T}$ and $p \in \mathcal{Y}^*_{s,T}$. Then

(5)  $p_T y_T = p_s y_s + \int_s^T \left( \dot p_t y_t + p_t \dot y_t \right) dt.$

Footnote 1. If $y$ belongs to the unit ball of $\mathcal{Y}$, and $s \le t \le \tau \le T$, then by the Cauchy-Schwarz inequality

$|y_\tau - y_t| = \left| \int_t^\tau \dot y_r \, dr \right| \le \sqrt{\tau - t}\, \|\dot y\|_2 \le \sqrt{\tau - t}.$

Bounded subsets of $\mathcal{Y}$ are therefore equicontinuous, so that, by the Ascoli-Arzelà theorem, any bounded sequence has a uniformly convergent subsequence.


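Proof. The mapping $a : \mathcal{Y}_{s,T} \times \mathcal{Y}^*_{s,T} \to \mathbb{R}$, defined by

(6)  $a(y,p) := p_T y_T - p_s y_s - \int_s^T \left( \dot p_t y_t + p_t \dot y_t \right) dt$

is bilinear and continuous, and equal to zero on the set $E$ of pairs of $C^1$ functions. Since $E$ is a dense subset of $\mathcal{Y}_{s,T} \times \mathcal{Y}^*_{s,T}$ (see footnote 2), for any pair $(y,p)$ there exists a sequence $(y^k,p^k)$ in $E$ such that $(y^k,p^k) \to (y,p)$ in $\mathcal{Y}_{s,T} \times \mathcal{Y}^*_{s,T}$. It follows that $a(y,p) = \lim_k a(y^k,p^k) = 0$. □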

The Lagrangian function associated with the state equation (1) and the criterion $F$ is the sum of the criterion and of the duality product of some multiplier with the state equation:

(7)  $L(u,y,p,q) := F(u,y) + \int_s^T p_t \left( A_t y_t + B_t u_t - \dot y_t \right) dt - q y_s,$

where $u \in \mathcal{U}_{s,T}$, $y \in \mathcal{Y}_{s,T}$, $p \in H^1(s,T,\mathbb{R}^{n*})$ so that the integral is well defined, and $q \in \mathbb{R}^{n*}$. The costate equation is then obtained by setting to zero the derivative of the Lagrangian with respect to the state. Let us see how to verify this condition.

Lemma 1.2. We have that $D_y L(u,y,p,q) = 0$ whenever the following two conditions hold: (i) $q = p_s$, and (ii) $p$, called the costate, is the unique solution in $\mathcal{Y}^*_{s,T}$ of the backward equation

(8)  $-\dot p_t = p_t A_t + y_t^\top C_t + u_t^\top D_t, \quad t \in [s,T]; \qquad p_T = y_T^\top M.$

Proof. Take $p \in \mathcal{Y}^*_{s,T}$. Integrating by parts (Lemma 1.1), we have that, for any $z \in \mathcal{Y}_{s,T}$, denoting by $D_y L(u,y,p,q) z$ the directional derivative of $L$ in direction $z$:

(9)  $D_y L(u,y,p,q) z = \int_s^T \left( y_t^\top C_t + u_t^\top D_t + p_t A_t + \dot p_t \right) z_t \, dt + \left( y_T^\top M - p_T \right) z_T + (p_s - q) z_s.$

Taking $z$ in the set $\mathcal{D}(s,T,\mathbb{R}^n)$ of $C^\infty$ functions with compact support in $(s,T)$, we deduce that $y_t^\top C_t + u_t^\top D_t + p_t A_t + \dot p_t$ is orthogonal to $\mathcal{D}(s,T,\mathbb{R}^n)$. Since the latter is a dense subset of $L^2(s,T,\mathbb{R}^n)$, the first relation in (8) follows. But then, since $D_y L(u,y,p,q) = 0$, (9) implies

(10)  $\left( y_T^\top M - p_T \right) z_T + (p_s - q) z_s = 0, \quad \text{for all } z \in \mathcal{Y}_{s,T},$

from which the end point conditions $p_T = y_T^\top M$ and $p_s = q$ easily follow. □

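Footnote 2. Since $\mathcal{Y}_{s,T}$ and $\mathcal{Y}^*_{s,T}$ are essentially the same space, it suffices to prove that the set of $C^1$ functions $[s,T] \to \mathbb{R}^n$ is dense in $\mathcal{Y}_{s,T}$. For that, extend $y \in \mathcal{Y}_{s,T}$ over $(-\infty,+\infty)$ by setting $y_t = y_s$ for $t < s$ and $y_t = y_T$ for $t > T$. For $\varepsilon > 0$ and $\tau \in \mathbb{R}$, define

$y^\varepsilon_\tau := \int_{-\infty}^{+\infty} y_{\tau - \varepsilon t}\, \rho_t \, dt,$

where $\rho : \mathbb{R} \to \mathbb{R}_+$ is of class $C^\infty$ with support in $[-1,1]$ and such that $\int_{\mathbb{R}} \rho = 1$. Then $y^\varepsilon$ is of class $C^\infty$, and its restriction to $(s,T)$ converges to $y$ in $\mathcal{Y}_{s,T}$ when $\varepsilon \downarrow 0$.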


Proposition 1.3. The quadratic mapping $u \mapsto J(u)$ has a continuous derivative $DJ(u) \in L^2(s,T,\mathbb{R}^{m*})$, given by

(11)  $DJ(u)_t = p_t B_t + u_t^\top R_t + y_t^\top D_t^\top, \quad t \in [s,T],$

where $y$ and $p$ are the state and costate associated with $u$.

Proof. By the chain rule, we have that

(12)  $DJ(u) v = \int_s^T \left( y_t^\top C_t z_t + v_t \cdot D_t y_t + u_t \cdot D_t z_t + u_t^\top R_t v_t \right) dt + y_T^\top M z_T,$

where $z \in \mathcal{Y}_{s,T}$ is the linearized state in direction $v$, solution of

(13)  $\dot z_t = A_t z_t + B_t v_t, \quad t \in [s,T]; \qquad z_s = 0.$

On the other hand, using Lemma 1.1, (8) and (13), we get:

(14)  $y_T^\top M z_T = p_T z_T = \int_s^T \left( \dot p_t z_t + p_t \dot z_t \right) dt = \int_s^T \left( p_t B_t v_t - y_t^\top C_t z_t - u_t \cdot D_t z_t \right) dt.$

Substituting (14) into (12), we obtain

(15)  $DJ(u) v = \int_s^T \left( p_t B_t + u_t^\top R_t + y_t^\top D_t^\top \right) v_t \, dt.$

The conclusion follows. □
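The costate formula (11) can be checked numerically on a discretized problem. The sketch below is only an illustration under stated assumptions: scalar data ($n = m = 1$), constant coefficients with hypothetical values, an explicit Euler discretization of (1), and a backward recursion that is the discrete counterpart of (8). None of these choices come from the notes. Since the discrete cost is quadratic, central finite differences reproduce its gradient up to rounding, which gives a simple consistency check.

```python
def state(u, x, a, b, h):
    """Explicit Euler states y_0..y_N for y' = a*y + b*u, y_0 = x."""
    y = [x]
    for uk in u:
        y.append(y[-1] + h * (a * y[-1] + b * uk))
    return y

def cost(u, x, a, b, c, d, r, m, h):
    """Discrete analogue of the criterion (2)-(3)."""
    y = state(u, x, a, b, h)
    running = sum(c * yk * yk + 2 * d * uk * yk + r * uk * uk
                  for uk, yk in zip(u, y))
    return 0.5 * h * running + 0.5 * m * y[-1] ** 2

def gradient(u, x, a, b, c, d, r, m, h):
    """Gradient of the discrete cost via a backward costate recursion."""
    y = state(u, x, a, b, h)
    N = len(u)
    p = [0.0] * (N + 1)
    p[N] = m * y[N]                          # terminal condition p_T = M y_T
    for k in range(N - 1, 0, -1):            # discrete counterpart of (8)
        p[k] = p[k + 1] + h * (a * p[k + 1] + c * y[k] + d * u[k])
    # discrete counterpart of (11), weighted by the step h
    return [h * (b * p[k + 1] + r * u[k] + d * y[k]) for k in range(N)]

# Hypothetical data, chosen only for the illustration.
a, b, c, d, r, m = -0.5, 1.0, 2.0, 0.3, 1.5, 0.7
x, N, T = 1.0, 20, 1.0
h = T / N
u = [0.1 * k - 0.5 for k in range(N)]

g = gradient(u, x, a, b, c, d, r, m, h)

eps = 1e-4
fd = []
for k in range(N):
    up = list(u); up[k] += eps
    um = list(u); um[k] -= eps
    fd.append((cost(up, x, a, b, c, d, r, m, h)
               - cost(um, x, a, b, c, d, r, m, h)) / (2 * eps))

max_err = max(abs(gi - fi) for gi, fi in zip(g, fd))
print("largest gap between adjoint and finite-difference gradients:", max_err)
```

The adjoint computation costs one forward and one backward sweep, whatever the number of control variables, whereas finite differences need two cost evaluations per variable.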

The critical points of $J$ are therefore characterized by the (algebraic-differential) system below:

(16)  $\dot y_t = A_t y_t + B_t u_t, \quad t \in [s,T]; \qquad y_s = x,$

(17)  $-\dot p_t = p_t A_t + y_t^\top C_t + u_t^\top D_t, \quad t \in [s,T]; \qquad p_T = y_T^\top M,$

(18)  $0 = p_t B_t + u_t^\top R_t + y_t^\top D_t^\top, \quad t \in [s,T].$

Although our main motivation is to study the problem of minimizing $J$, some of the theory extends without difficulty to the study of critical points, assuming uniform invertibility of the matrices $R_t$:

(19)  $\exists\, \alpha > 0; \quad |R_t v| \ge \alpha |v|, \quad \text{for all } v \in \mathbb{R}^m, \ t \in (s,T).$

This hypothesis allows us to eliminate the control variable from relation (18):

(20)  $u_t = -R_t^{-1} \left( B_t^\top p_t^\top + D_t y_t \right), \quad t \in [s,T].$

We then obtain that the triple $(u,y,p)$ is a solution of (16)-(18) iff $(y,p)$ is a solution of the differential two-point boundary value problem (TPBVP) below ("two-point" refers to the fact that, unlike Cauchy problems where the initial condition is fixed, here conditions have to be satisfied at both end points):

(21)  $\dot y_t = (A_t - B_t R_t^{-1} D_t) y_t - B_t R_t^{-1} B_t^\top p_t^\top, \quad t \in [s,T];$

(22)  $-\dot p_t = y_t^\top (C_t - D_t^\top R_t^{-1} D_t) + p_t (A_t - B_t R_t^{-1} D_t), \quad t \in [s,T];$

(23)  $y_s = x, \qquad p_T = y_T^\top M.$


The previous equations can be formulated using the Hamiltonian function $H(\cdot,\cdot,\cdot,t) : \mathbb{R}^m \times \mathbb{R}^n \times \mathbb{R}^{n*} \to \mathbb{R}$ associated with the original system, defined as the sum of the cost integrand and of the product of the costate by the dynamics:

(24)  $H(u,y,p,t) := \tfrac12 \left( y \cdot C_t y + 2 u \cdot D_t y + u \cdot R_t u \right) + p (A_t y + B_t u).$

By substituting $u = -R_t^{-1}(B_t^\top p^\top + D_t y)$, we obtain the reduced Hamiltonian

(25)  $\mathcal{H}(y,p,t) := \tfrac12\, y \cdot C_t y + p A_t y - \tfrac12 \left( p B_t + y^\top D_t^\top \right) R_t^{-1} \left( B_t^\top p^\top + D_t y \right).$

Denoting the partial derivatives of $\mathcal{H}$ by $D_y \mathcal{H}$, etc., we see that (21)-(22) can be written in the Hamiltonian system form:

(26)  $\dot y_t = D_p \mathcal{H}(y_t,p_t,t); \qquad -\dot p_t = D_y \mathcal{H}(y_t,p_t,t).$

If the matrices $A_t$, $B_t$, $C_t$, $D_t$ and $R_t$ are of class $C^1$, we deduce that

(27)  $\frac{d}{dt} \mathcal{H}(y_t,p_t,t) = D_y \mathcal{H}(y_t,p_t,t)\, \dot y_t + D_p \mathcal{H}(y_t,p_t,t)\, \dot p_t + D_t \mathcal{H}(y_t,p_t,t) = D_t \mathcal{H}(y_t,p_t,t),$

since, by (26), the first two terms cancel. In particular, for autonomous problems (whose data do not depend on time), the Hamiltonian is constant along the trajectory. We may also write (21)-(22) in the form of a single linear differential equation

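(28)  $\frac{d}{dt} \begin{pmatrix} y_t \\ p_t^\top \end{pmatrix} = M^H_t \begin{pmatrix} y_t \\ p_t^\top \end{pmatrix}.$

The expression of the above Hamiltonian matrix is

(29)  $M^H_t = \begin{pmatrix} A_t - B_t R_t^{-1} D_t & -B_t R_t^{-1} B_t^\top \\ -C_t + D_t^\top R_t^{-1} D_t & -A_t^\top + D_t^\top R_t^{-1} B_t^\top \end{pmatrix}.$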
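As a small sanity check of the statement that the reduced Hamiltonian is constant along trajectories of autonomous problems, the sketch below evaluates (25) along the flow of (28) for a scalar problem with $D = 0$. The numerical values are hypothetical, and the closed form used for the matrix exponential is specific to this scalar case, where the square of the Hamiltonian matrix is a multiple of the identity.

```python
import math

# Hypothetical scalar data (n = m = 1), with D = 0.
a, b, c, r = 0.3, 1.0, 2.0, 0.5
MH = [[a, -b * b / r], [-c, -a]]          # matrix (29) in the scalar case
mu = math.sqrt(a * a + b * b * c / r)     # MH * MH = mu^2 * Id

def flow(t):
    """exp(t*MH) = cosh(mu t) Id + (sinh(mu t)/mu) MH, valid since MH^2 = mu^2 Id."""
    ch, sh = math.cosh(mu * t), math.sinh(mu * t) / mu
    return [[ch + sh * MH[0][0], sh * MH[0][1]],
            [sh * MH[1][0], ch + sh * MH[1][1]]]

def reduced_hamiltonian(y, p):
    """Formula (25) with D = 0 and scalar data."""
    return 0.5 * c * y * y + p * a * y - 0.5 * (p * b) ** 2 / r

y0, p0 = 1.0, -0.4                        # arbitrary initial state and costate
values = []
for t in [0.0, 0.25, 0.5, 1.0]:
    Phi = flow(t)
    y = Phi[0][0] * y0 + Phi[0][1] * p0
    p = Phi[1][0] * y0 + Phi[1][1] * p0
    values.append(reduced_hamiltonian(y, p))

spread = max(values) - min(values)
print("Hamiltonian values along the trajectory:", values)
```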

It turns out that the spectrum of these matrices has a very particular structure.

Remark 1.4. Recall that a square matrix $G$ is in Jordan form if it is block diagonal, with as many blocks as distinct eigenvalues, the block $J_\lambda$ associated with the eigenvalue $\lambda$ having a size equal to the multiplicity of the eigenvalue, and being itself block diagonal, each sub-block being of the form

(30)  $\begin{pmatrix} \lambda & 1 & & \\ & \lambda & \ddots & \\ & & \ddots & 1 \\ & & & \lambda \end{pmatrix}.$

It is known that there exists an invertible matrix $U$ of the same size as $G$ such that $U^{-1} G U = J_G$, where $J_G$ denotes a Jordan form of $G$. Equivalently, $G U = U J_G$. The columns of $U$ are called the generalized eigenvectors of $G$. The subset $U_\lambda$ of generalized eigenvectors associated with an eigenvalue $\lambda$ is characterized by the relation $G U_\lambda = U_\lambda J_\lambda$. We recall that $G$ and $G^\top$ have the same eigenvalues, with the same multiplicity. Note that multiplying the $i$-th generalized eigenvector by $(-1)^i$ we obtain a set of generalized eigenvectors for which the Jordan form has $-1$ instead of $1$ above the diagonal.

Lemma 1.5. The spectrum of $M^H_t$ is symmetric with respect to the origin, and opposite eigenvalues have equal multiplicity. More precisely, if $\lambda \in \mathbb{C}$ is an eigenvalue of $M^H_t$, and $\begin{pmatrix} x \\ y \end{pmatrix}$ is an associated generalized eigenvector, then $\begin{pmatrix} -y \\ x \end{pmatrix}$ is a generalized eigenvector of $(M^H_t)^\top$ for the eigenvalue $-\lambda$ (with the convention of having $-1$ instead of $+1$ in the Jordan form).

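Proof. a) We show that this property holds in fact for any matrix of the form $M := \begin{pmatrix} A & B \\ C & -A^\top \end{pmatrix}$, where all blocks are of size $n \times n$, and $B$ and $C$ are symmetric. We argue by induction, starting with the case of an eigenvector of $M$ with eigenvalue $\lambda$. Then

(31)  $M \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} Ax + By \\ Cx - A^\top y \end{pmatrix} = \lambda \begin{pmatrix} x \\ y \end{pmatrix},$

and therefore

(32)  $M^\top \begin{pmatrix} -y \\ x \end{pmatrix} = \begin{pmatrix} -A^\top y + Cx \\ -By - Ax \end{pmatrix} = -\lambda \begin{pmatrix} -y \\ x \end{pmatrix}.$

Since $M$ and $M^\top$ have the same eigenvalues, the desired property holds.

b) Next it suffices to consider the case of two consecutive generalized eigenvectors $\begin{pmatrix} x \\ y \end{pmatrix}$ and $\begin{pmatrix} x' \\ y' \end{pmatrix}$ of a sub-block associated with the eigenvalue $\lambda$. Since

(33)  $M \begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} Ax' + By' \\ Cx' - A^\top y' \end{pmatrix} = \begin{pmatrix} x \\ y \end{pmatrix} + \lambda \begin{pmatrix} x' \\ y' \end{pmatrix},$

we have that

(34)  $M^\top \begin{pmatrix} -y' \\ x' \end{pmatrix} = \begin{pmatrix} -A^\top y' + Cx' \\ -By' - Ax' \end{pmatrix} = -\begin{pmatrix} -y \\ x \end{pmatrix} - \lambda \begin{pmatrix} -y' \\ x' \end{pmatrix},$

which is the desired property. □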
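One way to observe the symmetry of the spectrum without computing eigenvalues is to look at the characteristic polynomial: the spectrum of a $2n \times 2n$ matrix is symmetric with equal multiplicities for $\lambda$ and $-\lambda$ exactly when the polynomial is even, i.e. when its odd-degree coefficients vanish. The sketch below checks this in exact rational arithmetic on one hypothetical matrix of the form considered in the proof. The use of the Faddeev-LeVerrier recursion is a choice made here for the illustration, not a method taken from the notes.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def char_poly(M):
    """Coefficients c[0..N] of det(lambda*Id - M), by the Faddeev-LeVerrier recursion."""
    N = len(M)
    M = [[Fraction(v) for v in row] for row in M]
    c = [Fraction(0)] * (N + 1)
    c[N] = Fraction(1)
    Mk = [[Fraction(int(i == j)) for j in range(N)] for i in range(N)]
    for k in range(1, N + 1):
        if k > 1:
            Mk = matmul(M, Mk)
            for i in range(N):
                Mk[i][i] += c[N - k + 1]
        AM = matmul(M, Mk)
        c[N - k] = -sum(AM[i][i] for i in range(N)) / k
    return c

# Hypothetical blocks: A arbitrary, B and C symmetric (n = 2).
A = [[1, 2], [0, -1]]
B = [[1, 0], [0, 2]]
C = [[3, 1], [1, 0]]
n = 2

# M = [[A, B], [C, -A^T]] assembled row by row.
M = [A[i] + B[i] for i in range(n)] + \
    [C[i] + [-A[j][i] for j in range(n)] for i in range(n)]

coeffs = char_poly(M)
odd = [coeffs[k] for k in range(1, 2 * n, 2)]
print("characteristic polynomial coefficients (degree 0 first):", coeffs)
print("odd-degree coefficients:", odd)
```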

1.2. Shooting function and Hamiltonian flow. Equations (21)-(23) may be rewritten in an abstract way as

(35)  $\mathcal{A}(y,p) = \mathcal{B} x,$

where $\mathcal{A}$ and $\mathcal{B}$ are continuous linear operators, from the spaces $\mathcal{Y} \times \mathcal{Y}^*$ and $\mathbb{R}^n$, respectively, into $L^2(s,T,\mathbb{R}^n) \times L^2(s,T,\mathbb{R}^n) \times \mathbb{R}^{2n}$. The set of solutions is therefore a closed affine space, which reduces to a singleton iff $\mathcal{A}$ is one-to-one. Let us reduce this equation to a finite dimensional system.

Let us introduce the shooting function $S_{s,T} : \mathbb{R}^{n*} \to \mathbb{R}^{n*}$ that with $q \in \mathbb{R}^{n*}$ associates $M y_T - p_T^\top$, where $(y,p) \in \mathcal{Y} \times \mathcal{Y}^*$ is the solution of (21)-(22) with initial condition $(x,q)$ at time $s$. We can easily see that

Lemma 1.6. If (19) holds, a control $u \in \mathcal{U}$ is a critical point of $J$ iff the associated costate $p$ is such that $p_s$ is a zero of $S_{s,T}$.

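The problem of finding the critical points of $J$ therefore reduces to that of solving a linear equation in $\mathbb{R}^n$. Denote by $\Phi_{s,t}$, $s \le t \le T$, the Hamiltonian flow, i.e., the linear mapping that associates with $\begin{pmatrix} x \\ q^\top \end{pmatrix} \in \mathbb{R}^{2n}$ the value $\begin{pmatrix} y_t \\ p_t^\top \end{pmatrix} \in \mathbb{R}^{2n}$ obtained by integrating (21)-(22) over $[s,t]$. We may write $\Phi_{s,t}$ in the block form

(36)  $\Phi_{s,t} \begin{pmatrix} x \\ q^\top \end{pmatrix} = \begin{pmatrix} \Phi^{yy}_{s,t} & \Phi^{yp}_{s,t} \\ \Phi^{py}_{s,t} & \Phi^{pp}_{s,t} \end{pmatrix} \begin{pmatrix} x \\ q^\top \end{pmatrix} = \begin{pmatrix} \Phi^{yy}_{s,t} x + \Phi^{yp}_{s,t} q^\top \\ \Phi^{py}_{s,t} x + \Phi^{pp}_{s,t} q^\top \end{pmatrix}.$

We may write the shooting equation $S_{s,T}(q) = M y_T - p_T^\top = 0$ in the form

(37)  $\left( M \Phi^{yp}_{s,T} - \Phi^{pp}_{s,T} \right) p_s^\top = \left( \Phi^{py}_{s,T} - M \Phi^{yy}_{s,T} \right) x.$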
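To make the shooting equation (37) concrete, here is a minimal sketch for a scalar problem with constant data and $D = 0$. The numerical values are hypothetical, and the flow is obtained from a closed form of the $2 \times 2$ matrix exponential that only holds in this scalar setting; for general data one would integrate (21)-(22) numerically instead. The script solves (37) for $q = p_s$ and then verifies the terminal condition $M y_T - p_T = 0$.

```python
import math

# Hypothetical scalar data (n = m = 1), with D = 0.
a, b, c, r, M = 0.3, 1.0, 2.0, 0.5, 1.2
s, T, x = 0.0, 1.0, 1.0

MH = [[a, -b * b / r], [-c, -a]]          # Hamiltonian matrix (29)
mu = math.sqrt(a * a + b * b * c / r)     # MH * MH = mu^2 * Id
tau = T - s

# Blocks of the flow Phi_{s,T} = exp(tau * MH), see (36).
ch, sh = math.cosh(mu * tau), math.sinh(mu * tau) / mu
Phi_yy, Phi_yp = ch + sh * MH[0][0], sh * MH[0][1]
Phi_py, Phi_pp = sh * MH[1][0], ch + sh * MH[1][1]

# Shooting equation (37): (M Phi_yp - Phi_pp) q = (Phi_py - M Phi_yy) x.
lhs = M * Phi_yp - Phi_pp
rhs = (Phi_py - M * Phi_yy) * x
q = rhs / lhs                              # initial costate p_s

# Propagate (x, q) to time T and evaluate the shooting function.
y_T = Phi_yy * x + Phi_yp * q
p_T = Phi_py * x + Phi_pp * q
residual = M * y_T - p_T

print("initial costate p_s =", q)
print("shooting residual   =", residual)
```

In this scalar case the coefficient of $q$ in (37) is nonzero for every horizon when $C$, $M \ge 0$ and $R > 0$, which reflects the absence of conjugate points for such data.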

Lemma 1.7. Assume that (19) holds. Then when $s$ is close to $T$, there exists a unique critical point of $J$.

Proof. It is easy to check that $S_{s,T}$ is a continuous function of $s$, and that $S_{s,T}(q) \to M x - q^\top$ when $s \uparrow T$. Since the limit is an invertible affine function of $q$, $S_{s,T}$ is invertible for $s$ close to $T$. The conclusion follows. □

Definition 1.8. We say that $s < T$ is a conjugate point of $T$ if $S_{s,T}$ is not invertible. Denote by $\mathcal{T}$ the set of times $s < T$ which are not conjugate, i.e., for which $S_{s,T}$ is invertible.

Since $S_{s,T}$ is a continuous function of $s$, $\mathcal{T}$ is an open set. In the case when all matrices are (real) analytic functions of time (i.e., locally expandable in power series), the shooting function is also an analytic function of $s$, and there are at most finitely many conjugate points in any bounded interval. To see this, observe that the determinant of the Jacobian of the shooting function is an analytic function of time that is nonzero for $s$ close to $T$, so that it may have only a finite number of zeros over a bounded interval of $\mathbb{R}$.

We say that $(y,p)$ is a singular solution of the two-point boundary value problem (21)-(23) if it is a nonzero solution of (21)-(23) with $x = 0$. We can characterize conjugate points using singular solutions.

Lemma 1.9. A time $\tau$ is a conjugate point of $T$ iff there exists a singular solution of (21)-(23) with initial time $s = \tau$.

Proof. We have that $\tau$ is a conjugate point iff the shooting equation with $s = \tau$ has a nonzero solution $q$ for the zero initial condition $x = 0$. Integrating (21)-(22) with initial condition $(0,q)$, we derive the conclusion. □


1.3. Riccati equation. Let $s \in \mathcal{T}$. In view of (37), we can express $q = p_s$ as a linear mapping of $x$, i.e. $p_s^\top = P_s x$, where $P_s$ is a square matrix of size $n$ defined by

(38)  $P_s = -\left( M \Phi^{yp}_{s,T} - \Phi^{pp}_{s,T} \right)^{-1} \left( M \Phi^{yy}_{s,T} - \Phi^{py}_{s,T} \right).$

Since $\Phi_{s,T} \in W^{1,\infty}(s,T,\mathbb{R}^{2n \times 2n})$, we have that $P_s \in W^{1,\infty}(s,T,\mathbb{R}^{n \times n})$. For all $\sigma \in \mathcal{T} \cap \,]s,T[$, the solution $(y,p)$ of (21)-(23), restricted to $[\sigma,T]$, is a critical point with initial condition $y_\sigma$, and so

$p_\sigma^\top = P_\sigma y_\sigma.$

Substituting $P_t y_t$ for $p_t^\top$ in (22), and factoring out $y_t$, we get

(39)  $0 = P_t \dot y_t + \left[ \dot P_t + (C_t - D_t^\top R_t^{-1} D_t) + (A_t^\top - D_t^\top R_t^{-1} B_t^\top) P_t \right] y_t, \quad t \in \mathcal{T}.$

Using the expression of $\dot y_t$ in (21) with $p_t^\top = P_t y_t$, we obtain

(40)  $0 = \left[ \dot P_t + P_t A_t + A_t^\top P_t + C_t - (P_t B_t + D_t^\top) R_t^{-1} (B_t^\top P_t + D_t) \right] y_t, \quad t \in \mathcal{T}.$

Since this must be satisfied for all possible values of $y_t$ (take $s = t$ and then $y_t = x$ arbitrary), we obtain that $P$ is a solution of the Riccati equation

(41)  $0 = \dot P_t + P_t A_t + A_t^\top P_t + C_t - (P_t B_t + D_t^\top) R_t^{-1} (B_t^\top P_t + D_t), \quad t \in \mathcal{T}; \qquad P_T = M.$

Denote by $\tau_0$ the largest conjugate point (i.e., the first one encountered when going backwards from $T$). If no conjugate point exists, we set $\tau_0 = -\infty$.

Lemma 1.10. (i) The Riccati equation (41) has a unique solution over $(\tau_0,T]$.
(ii) Let $\tau$ be an isolated conjugate point. Then $\lim_{t \downarrow \tau} \|P_t\| = +\infty$.
(iii) The Riccati operator $P_t$ is symmetric, for all $t \in \mathcal{T}$.

Proof. (i) It is a standard result of the theory of ODEs that, since (41) is a differential equation with locally Lipschitz dynamics, it has a unique solution over a maximal interval of the form $(\tau_1,T]$, and if $\tau_1$ is finite, $\lim_{t \downarrow \tau_1} \|P_t\| = +\infty$. Since (41) has a solution over $\mathcal{T}$, we obtain that $\tau_1 \le \tau_0$.
(ii) If $\tau_1 < \tau_0$, we easily check that $p_t^\top = P_t y_t$ is a solution of the two-point boundary value problem over $[\tau_0,T]$, for any initial condition $x$. This contradicts the non-invertibility of the shooting mapping at $\tau_0$.
(iii) It is easily checked that if $P_t$ is a solution of the differential Riccati equation (DRE) (41) over an interval $(a,b]$ of $\mathcal{T}$, then $P_t^\top$ is also a solution of the DRE over $(a,b]$. Since the DRE has a unique solution over $(\tau_0,T]$, it follows that $P_t$ is symmetric over $(\tau_0,T]$.

If the data $A_t$, $B_t$, $C_t$, and $D_t$ are analytic functions of time, so is $\Phi_{s,T}$, and therefore $P_s$ is analytic too, for all $s \in \mathcal{T}$. Being symmetric for values close to $T$, it must be symmetric for all $s \in \mathcal{T}$.

For non-analytic data it suffices to approximate them by convolution with a smooth kernel (so as to obtain $C^\infty$ data), and then by polynomials. We obtain the convergence of the perturbed $\Phi_{s,T}$ and $P_s$, and since the latter is symmetric, the conclusion follows. □

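1.4. Expression of the critical value. Let $u$ be a critical point of $J$ when starting at time $s$ with initial condition $x$. Since

(42)  $y_T \cdot M y_T = p_T y_T = p_s x + \int_s^T \left( \dot p_t y_t + p_t \dot y_t \right) dt$

we obtain, combining with the state and costate equations (1) and (8), that

(43)  $y_T \cdot M y_T = p_s x + \int_s^T \left( p_t B_t u_t - y_t \cdot C_t y_t - y_t \cdot D_t^\top u_t \right) dt.$

Eliminating $p_t B_t$ thanks to (18), i.e. using $p_t B_t u_t = -u_t \cdot R_t u_t - y_t \cdot D_t^\top u_t$, the integral in (43) becomes the opposite of the integral in (2), and we see that the critical value is equal to $\tfrac12\, p_s x$. In particular, if $s \in \mathcal{T}$, the critical equations having a unique solution, we may denote the critical value by $\mathrm{val}_s(x)$, and we have

(44)  $\mathrm{val}_s(x) = \tfrac12\, x \cdot P_s x, \quad \text{for all } s \in \mathcal{T}.$

It follows that $\mathrm{val}_s(\cdot)$ is a nonnegative function iff $P_s$ is positive semidefinite.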
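Formula (44) can be checked by simulation on the scalar example $A = 0$, $B = C = R = 1$, $D = 0$, $M = 0$, for which the Riccati solution is $P_t = \tanh(T - t)$. In the sketch below the critical control is taken in feedback form $u_t = -P_t y_t$, which follows from (20) with $p_t^\top = P_t y_t$; the state and the running cost are integrated together with a Runge-Kutta scheme. The data, the scheme and the function names are assumptions of the illustration.

```python
import math

T, x = 1.0, 2.0                  # horizon and initial state (s = 0)

def P(t):
    """Riccati solution for A = 0, B = C = R = 1, D = 0, M = 0."""
    return math.tanh(T - t)

def rhs(t, y):
    """Closed-loop dynamics y' = u with u = -P_t y, and running cost 0.5*(y^2 + u^2)."""
    u = -P(t) * y
    return u, 0.5 * (y * y + u * u)

def simulate(steps=2000):
    """RK4 on the pair (state, accumulated cost)."""
    h = T / steps
    t, y, J = 0.0, x, 0.0
    for _ in range(steps):
        k1 = rhs(t, y)
        k2 = rhs(t + h / 2, y + h / 2 * k1[0])
        k3 = rhs(t + h / 2, y + h / 2 * k2[0])
        k4 = rhs(t + h, y + h * k3[0])
        y += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        J += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        t += h
    return y, J

y_T, J_val = simulate()
predicted = 0.5 * x * P(0.0) * x          # formula (44) at s = 0
print("simulated cost =", J_val, " predicted 0.5*x*P_0*x =", predicted)
```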

1.5. Legendre forms and minima of quadratic functions. We consider in this section the problem of minimizing the quadratic cost $J$. A local minimum $u$ satisfies the second-order necessary condition (see footnote 3)

(45)  $DJ(u) = 0 \quad \text{and} \quad D^2 J(u) \succeq 0.$

Since $D^2 J(\cdot)$ is constant, this means that $u$ is a critical point of $J$ and that $J$ is convex. In that case we know that critical points coincide with global minima.

The next step is to study the well-posedness of local minima. The latter may be defined as the invertibility of $D^2 J(u)$, so that the implicit function theorem applies to a smooth perturbation of the critical point equation $DJ(u) = 0$. The following is proved in [2, Lemma 4.124].

Given a Banach space $X$ and a continuous linear operator $H : X \to X^*$, we say that $H$ is self-adjoint if $\langle Hx, x' \rangle = \langle Hx', x \rangle$ for all $x, x' \in X$, that $H$ is nonnegative if $\langle Hx, x \rangle \ge 0$ for all $x \in X$, and that $H$ is invertible if it is one-to-one and onto, and the inverse operator $H^{-1}$ is continuous. We say that $H$ is uniformly positive if there exists $\alpha > 0$ such that $\langle Hx, x \rangle \ge \alpha \|x\|^2$ for all $x \in X$. Note that if there exists a self-adjoint uniformly positive operator $H : X \to X^*$, then $\|x\|_1 := \langle Hx, x \rangle^{1/2}$ defines a norm on $X$ that is equivalent to the original norm, and hence the space $X$ is Hilbertizable, and is therefore reflexive.

Lemma 1.11. Let $X$ be a Banach space and let $H \in L(X,X^*)$ be self-adjoint and nonnegative. Then $H$ is invertible iff it is uniformly positive.

Footnote 3. If $Q$ is a quadratic form, $Q \succeq 0$ means that $Q$ is nonnegative, i.e., $Q(x) \ge 0$ for all $x$.


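Proof. Assume first that $H$ is invertible. Given $\bar x \in X$, set $x^* := H \bar x$. Since $H$ is nonnegative, the (quadratic) optimization problem

(46)  $\min_{x \in X} \ \tfrac12 \langle Hx, x \rangle - \langle x^*, x \rangle$

is convex. Therefore, its optimal solutions are characterized by the optimality condition $Hx = x^*$. Since $H$ is invertible, the unique solution is $\bar x = H^{-1} x^*$. Therefore,

(47)  $\tfrac12 \langle H \bar x, \bar x \rangle - \langle x^*, \bar x \rangle = -\tfrac12 \langle x^*, H^{-1} x^* \rangle \le \tfrac12 \langle Hx, x \rangle - \langle x^*, x \rangle, \quad \forall\, x \in X.$

Since $\langle x^*, H^{-1} x^* \rangle = \langle H \bar x, \bar x \rangle$, the last inequality can be written as

(48)  $\langle H \bar x, \bar x \rangle \ge 2 \langle x^*, x \rangle - \langle Hx, x \rangle, \quad \forall\, x \in X.$

There exists $x_1 \in X$ such that $\|x_1\| = 1$ and $\langle x^*, x_1 \rangle \ge \tfrac12 \|x^*\|$. Setting $x := \gamma x_1$, with $\gamma \in \mathbb{R}$, in (48), we obtain

$\langle H \bar x, \bar x \rangle \ge 2 \gamma \langle x^*, x_1 \rangle - \gamma^2 \langle H x_1, x_1 \rangle, \quad \forall\, \gamma \in \mathbb{R}.$

Maximizing the right-hand side of the above inequality over $\gamma \in \mathbb{R}$, and using $\|\bar x\| = \|H^{-1} x^*\| \le \|H^{-1}\| \, \|x^*\|$, we obtain

$\langle H \bar x, \bar x \rangle \ge \frac{\langle x^*, x_1 \rangle^2}{\langle H x_1, x_1 \rangle} \ge \frac{\|x^*\|^2}{4 \|H\|} \ge \tfrac14 \|H^{-1}\|^{-2} \|H\|^{-1} \|\bar x\|^2,$

which is the desired inequality with $\alpha := \tfrac14 \|H^{-1}\|^{-2} \|H\|^{-1}$.

We now show that the condition is sufficient. Since $H$ is uniformly positive, the quadratic optimization problem (46) has a strongly convex objective function. Therefore, for any $x^* \in X^*$, problem (46) has a unique optimal solution $x$, characterized by the first-order optimality system $Hx = x^*$. This shows that $H$ is one-to-one and onto. By the Open Mapping Theorem it follows that $H^{-1}$ is continuous, and hence $H$ is invertible. □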

Since $J$ is quadratic, its Hessian is uniformly positive iff $J$ satisfies the following quadratic growth condition.

Definition 1.12. Let $\bar u$ be a critical point of $J$. We say that the quadratic growth property is satisfied if there exists $\alpha > 0$ such that $J(u) \ge J(\bar u) + \alpha \|u - \bar u\|^2_{\mathcal{U}}$, for all $u$ in some neighborhood of $\bar u$.

Let us now relate these notions to that of Legendre forms, see [6], [2, Sections 3.3.2 and 3.4.3].

Definition 1.13. Let $X$ be a Hilbert space. We say that $Q : X \to \mathbb{R}$ is a Legendre form if it is a sequentially weakly lower semicontinuous (w.l.s.c.) quadratic form over $X$, such that, if $y_k \to y$ weakly in $X$ and $Q(y_k) \to Q(y)$, then $y_k \to y$ strongly.

Set $w_k := y_k - y$. Using

$Q(y_k) = Q(y) + DQ(y) w_k + Q(w_k),$

and since $DQ(y) w_k \to 0$ as $w_k \to 0$ weakly, we have that $Q$ is a Legendre form iff, for any sequence $w_k$ weakly converging to 0, $Q(w_k) \to 0$ implies that $w_k \to 0$ strongly.

The following examples apply easily to the quadratic costs of optimal control problems:

Example 1.14. Let $Q$ be a quadratic form over a Hilbert space $X$.
(i) Let $Q(y) = \|y\|^2$ be the square of the norm. Then obviously $Q(w_k) \to 0$ iff $w_k \to 0$ strongly. Therefore $Q$ is a Legendre form.
(ii) Assume that $Q$ is nonnegative, and that $y \mapsto \sqrt{Q(y)}$ is a norm equivalent to that of $X$. Then (the weak topology being invariant under a change to an equivalent norm) $Q$ is a Legendre form.
(iii) Assume that $Q(y) = Q_1(y) + Q_2(y)$, where $Q_1$ is a Legendre form, and $Q_2$ is weakly continuous. Then $Q$ is a Legendre form.

The notions of quadratic growth and Legendre form are related in the following way:

Lemma 1.15. Let $Q : X \to \mathbb{R}$ be a Legendre form, and $C$ a closed convex cone of $X$. Then the two statements below are equivalent:

(49)  $Q(h) > 0, \quad \text{for all } h \in C \setminus \{0\};$

(50)  $\exists\, \alpha > 0; \quad Q(h) \ge \alpha \|h\|^2, \quad \text{for all } h \in C.$

Proof. It suffices to prove that, if $Q$ is Legendre and satisfies (49), then (50) holds. Assume on the contrary that, for some sequence $h_k$ of nonzero vectors in $C$, $Q(h_k) \le o(\|h_k\|^2)$. Changing $h_k$ into $h_k / \|h_k\|$, we may assume that $\|h_k\| = 1$, so that $\limsup_k Q(h_k) \le 0$. Since $X$ is a Hilbert space, extracting if necessary a subsequence, we may assume that $h_k$ has a weak limit $h$. By weak l.s.c. of $Q$, $Q(h) \le \liminf_k Q(h_k) \le 0$. Since $C$ is a closed convex cone, and hence is weakly closed, $h \in C$, so that $Q(h) \le 0$ implies $h = 0$ by (49). But then (by weak l.s.c. of $Q$ again) $Q(h) = 0 = \lim_k Q(h_k)$, so that $h_k \to h$ for the strong topology, in contradiction with the fact that $h = 0$ and $\|h_k\| = 1$. □

Lemma 1.16. The functional $J$ is w.l.s.c. over $\mathcal{U}$ iff $R_t \succeq 0$ a.e., and $D^2 J$ is a Legendre form iff there exists $\alpha > 0$ such that $R_t \succeq \alpha\, \mathrm{Id}$ a.e.

Proof. (i) We can decompose $J$ as $J = J_1 + J_2$, where $J_1$ is the part that does not depend on the state, i.e.

(51)  $J_1(u) := \tfrac12 \int_s^T u_t \cdot R_t u_t \, dt,$

and $J_2(u) := J(u) - J_1(u)$. Since the mapping $u \mapsto y[u]$ is compact (a sequence of states associated with bounded controls has a strongly convergent subsequence), $J_2$ is weakly continuous. Therefore $J$ is w.l.s.c. iff $J_1$ is w.l.s.c.
(ii) If $R_t \succeq 0$ a.e., then $J_1$ is convex, and since it is also continuous, $J_1$ is w.l.s.c. If not, there exist $\beta > 0$ and a measurable set $I \subset (s,T)$ of nonzero measure such that

(52)  $h \cdot R_t h \le -\beta |h|^2, \quad \text{for all } h \in \mathbb{R}^m, \ \text{a.e. } t \in I.$

Let $\mathcal{U}_I$ be the subset of $\mathcal{U}$ of functions that are zero a.e. outside $I$. Since $\mathcal{U}_I$ is infinite dimensional, there is an orthonormal sequence $u^k$ in $\mathcal{U}_I$. We have that $u^k \to 0$ weakly in $\mathcal{U}$, whereas

(53)  $\limsup_k J_1(u^k) \le -\tfrac12 \beta < 0 = J_1(0).$

This implies that $J_1$ is not w.l.s.c., and hence, by (i), neither is $J$. So we have proved that $J$ is w.l.s.c. iff $R_t \succeq 0$ a.e.
(iii) If $R_t \succeq \alpha\, \mathrm{Id}$ a.e., then $\sqrt{J_1}$ defines a norm equivalent to that of $\mathcal{U}$, and since $J_2$ is weakly continuous, $D^2 J$ is a Legendre form (cases (ii) and (iii) of Example 1.14).
Otherwise, if $R_t$ is not uniformly positive, we may construct an orthonormal sequence $u^k$ such that $a := \limsup_k J_1(u^k) \le 0$. Since $J_1$ is nonnegative, $a = 0$, so that $J_1(u^k) \to J_1(0)$. At the same time $u^k$ converges to 0 weakly but not strongly, contradicting the definition of a Legendre form. □
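The proof relies on the fact that an orthonormal sequence converges weakly to 0 while keeping unit norm. A small numerical illustration, under assumptions that are not in the notes: the interval is $(0,1)$, the sequence is $u^k_t = \sqrt{2} \sin(k \pi t)$, and weak convergence is only probed against one fixed test function $g(t) = t$, for which the exact pairing is $\sqrt{2}\,(-1)^{k+1} / (k \pi)$.

```python
import math

def inner(f, g, n=20000):
    """Midpoint-rule approximation of the L2(0,1) scalar product of f and g."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) * g((i + 0.5) * h) for i in range(n))

def u(k):
    """k-th element of the orthonormal sequence sqrt(2) sin(k pi t)."""
    return lambda t: math.sqrt(2.0) * math.sin(k * math.pi * t)

def g(t):
    """Fixed test function used to probe weak convergence."""
    return t

ks = (1, 5, 25)
norms = [inner(u(k), u(k)) for k in ks]       # stay equal to 1
pairings = [inner(u(k), g) for k in ks]       # decay like 1/k

for k, nrm, pr in zip(ks, norms, pairings):
    print(f"k={k:2d}  squared norm={nrm:.6f}  pairing with g={pr:+.6f}")
```

The squared norms stay at 1, so a quadratic form such as $J_1$ with $R = \mathrm{Id}$ does not tend to 0 along the sequence, while the pairings with $g$ decay; this is the mechanism used in steps (ii) and (iii) of the proof.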

1.6. Spectral analysis. In this section, for simplicity, we assume that all matrices in the definition of the quadratic problem are constant over time, and that $R$ is positive definite. We can make a change of variable on $\mathbb{R}^m$,

$v = L u$

such that $L^\top L = R$, and then $|v|^2 = u \cdot R u$. The corresponding change of variables on $\mathcal{U}$ has the effect of reducing $R$ to the identity. So in the sequel we assume that $R$ is the identity matrix. Also for simplicity we assume that $D = 0$. So we may write $J = J_1 + J_2$, with

(54)  $J_1(u) = \tfrac12 \int_s^T |u_t|^2 \, dt = \tfrac12 \|u\|^2, \qquad J_2(u) = \tfrac12 \int_s^T y_t \cdot C y_t \, dt + \tfrac12\, y_T \cdot M y_T.$

Let $H_s$ denote the Hessian of $J_2$, and $Q_s$ denote the associated quadratic form.

If $X$, $Y$ are Banach spaces, an operator $A \in L(X,Y)$ is said to be compact if the image of the unit ball $B_X$ by $A$ has a compact closure. The following lemma is classical.

Lemma 1.17. The operator $H_s$ is self-adjoint and compact. Consequently, there is an orthonormal basis of $\mathcal{U}_s$ composed of eigenvectors of $H_s$.

Proof. That $H_s$ is self-adjoint is obvious. Its compactness is a consequence of that of the mapping $\mathcal{U}_s \to \mathcal{Y}_s$, $v \mapsto z$, where $z$ is the unique solution of the linearized equation

(55)  $\dot z = A z + B v; \qquad z_s = 0.$

The second statement comes from the well-known theory of compact operators; see e.g. Balakrishnan [1, Section 3.3] or Dunford and Schwartz [5]. □

Lemma 1.18. We have that

(56)  $\lim_{s \uparrow T} \ \sup_{v \neq 0} \ \frac{|H_s(v,v)|}{\|v\|^2_{\mathcal{U}_s}} = 0.$

Proof. The conclusion follows easily from the inequalities below, which are consequences of Gronwall's lemma and the Cauchy-Schwarz inequality:

(57)  $\|z\|_\infty \le C \int_s^T |v_t| \, dt \le C \sqrt{T - s}\, \|v\|_{\mathcal{U}_s}.$ □

For $s$ close to $T$, the above lemma implies that the Hessian of $J$, i.e., $\mathrm{Id} + H_s$, is uniformly positive, and hence $J$ is strongly convex and has a unique critical point, which is a minimum point. By Lemma 1.17, $s$ is a conjugate point iff $H_s$ has an eigenvalue equal to $-1$.

2. Infinite horizon problems

2.1. Setting. We consider in this section the case of a linear system whose matrices $A$ ($n \times n$) and $B$ ($n \times m$) do not depend on time:

(58)  $\dot y_t = A y_t + B u_t, \quad 0 \le t \le T; \qquad y_0 = x,$

where $x \in \mathbb{R}^n$. The horizon $T$ belongs to $(0,+\infty]$; if $T < \infty$ (resp. $T = \infty$) we speak of a finite (resp. infinite) horizon system.

The control space is $\mathcal{U}_T := L^2(0,T,\mathbb{R}^m)$. We already know that, when $T < \infty$, (58) has a unique solution $y[u] \in H^1(0,T,\mathbb{R}^n)$. Therefore, when $T = \infty$, with each $u \in \mathcal{U}_\infty$ is associated a unique solution $y[u]$ in the space $H^1_{loc}(0,\infty,\mathbb{R}^n)$ of measurable functions $(0,\infty) \to \mathbb{R}^n$ whose restriction to $(0,t)$ belongs to $H^1(0,t,\mathbb{R}^n)$, for all $t > 0$. Consider the quadratic criterion

(59)  $J_T(u) := \tfrac12 \int_0^T \left( u_t^\top R u_t + y_t^\top C y_t \right) dt.$

The matrices $C$ and $R$, independent of time, are symmetric; $C$ is positive semidefinite, and $R$ is positive definite. Since the integrand is nonnegative, the integral is well defined, with value in $[0,+\infty]$. The infinite horizon linear quadratic optimal control problem is:

$(P_{x,\infty}) \qquad \min_{u \in \mathcal{U}_\infty} J_\infty(u).$

2.2. Kalman state space theory. In order to state the main result, weneed to recal some basic concepts of the Kalman theory.

Page 14: LECTURE NOTES ON LINEAR QUADRATIC CONTROL VERSION …bonnans/notes/oc/lq.pdf · LECTURE NOTES ON LINEAR QUADRATIC CONTROL VERSION OF 02-07-2010 3 Proof. The mapping a : Y s,T ×Y∗

14 J. FREDERIC BONNANS

Controllability. We say that the pair (A,B) is controllable if, for any T > 0 and any x and z in $\mathbb{R}^n$, there exists a control $u \in U_T$ such that $y_T[u] = z$. In view of the linearity of the state equation, an equivalent condition is that, starting from the initial point x = 0, any value of the state at time T can be reached. Since the set of possible states at time T is a vector space, denoted by E, controllability means that any $q \in \mathbb{R}^{n*}$ in $E^\perp$ is equal to 0. The costate associated with q is the unique solution of

(60)  $-\dot p_t = p_t A; \qquad p_T = q,$

i.e., $p_t = q e^{(T-t)A}$. In view of the expression of the derivatives of p and y, and since $y_0 = 0$, we have that q belongs to $E^\perp$ iff

(61)  $0 = q\, y_T[u] = \big[\, p_t\, y_t[u] \,\big]_{t=0}^{T} = \int_0^T p_t B u_t \, dt, \quad \text{for all } u \in U_T.$

This is equivalent to $p_t B = 0$ identically. Since $p_t B = q e^{(T-t)A} B$ is an analytic function of time, an equivalent condition is that all its derivatives at time T are equal to zero, i.e.,

(62)  $0 = qB = -qAB = \cdots = (-1)^k q A^k B = \cdots, \quad \text{for all } k \in \mathbb{N}.$

In view of the Cayley-Hamilton theorem, it suffices to take the first n terms, and hence we have the following.

Lemma 2.1. (i) The set $E^\perp$ is the left kernel of the n × nm controllability matrix

(63)  $\mathcal{C} := [B \;\; AB \;\; \cdots \;\; A^{n-1}B].$

(ii) The pair (A,B) is controllable iff $\mathcal{C}$ has rank n.
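The rank criterion of Lemma 2.1 is easy to test numerically. The following is a minimal sketch, assuming NumPy is available; the helper names `controllability_matrix` and `is_controllable` are hypothetical, and the second pair (two decoupled states, only one of them driven by the control) is an illustrative choice that is not taken from the notes.

```python
import numpy as np

def controllability_matrix(A, B):
    """Return [B, AB, ..., A^(n-1) B] as an n x (n*m) array, as in (63)."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

def is_controllable(A, B):
    """Kalman rank test: (A, B) is controllable iff the matrix (63) has rank n."""
    return bool(np.linalg.matrix_rank(controllability_matrix(A, B)) == A.shape[0])

# Linearized inverted pendulum (the pair used in Section 2.5).
A = np.array([[0.0, 1.0], [1.0, 0.0]])
B = np.array([[0.0], [1.0]])
pendulum_ok = is_controllable(A, B)

# Two decoupled states, the control acting on the first one only.
A_dec = np.diag([1.0, 2.0])
B_dec = np.array([[1.0], [0.0]])
decoupled_ok = is_controllable(A_dec, B_dec)
```

For the pendulum pair the controllability matrix is a permutation matrix, hence of rank 2; for the decoupled pair its second row is zero, so the rank test fails.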

We can give an explicit form of the minimal energy control transferring a state ξ to another state η in time t. Let

(64)  $Q := \int_0^t e^{(t-s)A} B B^\top e^{(t-s)A^*} \, ds,$

where $A^*$ denotes the transpose of A.

Lemma 2.2. (i) The matrix Q is invertible iff (A,B) is controllable.
(ii) If (A,B) is controllable, then the minimal energy transfer from state ξ to another state η in time t is achieved by the control

(65)  $u_s := B^\top e^{(t-s)A^*} Q^{-1} (\eta - e^{tA}\xi).$

Proof. (i) Let $y \in \mathbb{R}^n$. We have that

(66)  $y^\top Q y = \int_0^t y^\top e^{(t-s)A} B B^\top e^{(t-s)A^*} y \, ds = \int_0^t \left| y^\top e^{(t-s)A} B \right|^2 ds.$

So $y^\top Q y \ge 0$, with equality iff $y^\top$ belongs to the left kernel of $e^{(t-s)A}B$, for all s ∈ [0, t]. Since this function of s is analytic, equality holds iff all its derivatives vanish at s = t, i.e., $0 = y^\top \frac{d^i}{ds^i} e^{(t-s)A} B \big|_{s=t} = (-1)^i y^\top A^i B$, for all i = 0, 1, . . ., which in view of the Cayley-Hamilton theorem holds iff $y^\top$ belongs to the left kernel of the controllability matrix. The result follows.
(ii) We easily check with (65) that the associated trajectory satisfies

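(67)  $x_t = e^{tA}\xi + \int_0^t e^{(t-s)A} B u_s \, ds = \eta.$

Therefore the state ξ is transferred to the state η in time t. Let v be another control transferring ξ to η in time t, and let w := v − u. We have that $\int_0^t e^{(t-s)A} B w_s \, ds = 0$, and hence,

(68)  $\int_0^t u_s^\top w_s \, ds = (\eta - e^{tA}\xi)^\top Q^{-1} \int_0^t e^{(t-s)A} B w_s \, ds = 0.$

It follows that $\int_0^t |v_s|^2 ds = \int_0^t |u_s|^2 ds + \int_0^t |w_s|^2 ds \ge \int_0^t |u_s|^2 ds$, which proves that u minimizes the energy. □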
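Formulas (64), (65) and (67) can be checked on a case where the matrix exponential is known in closed form. The sketch below assumes NumPy and uses the double integrator (a hypothetical example, not taken from the notes), for which A is nilpotent and $e^{rA} = I + rA$; the integrals are evaluated by Gauss-Legendre quadrature, which is exact here because all integrands are polynomials of low degree in s.

```python
import numpy as np

# Double integrator: A is nilpotent, so exp(rA) = I + rA exactly.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])

def expA(r):
    return np.eye(2) + r * A

t = 2.0
xi = np.array([1.0, 0.0])    # initial state
eta = np.array([0.0, 0.0])   # target state

# Gauss-Legendre nodes and weights mapped from [-1, 1] to [0, t].
nodes, weights = np.polynomial.legendre.leggauss(8)
s = 0.5 * t * (nodes + 1.0)
w = 0.5 * t * weights

# Matrix (64): Q = int_0^t e^{(t-s)A} B B^T e^{(t-s)A^T} ds
Q = sum(wk * expA(t - sk) @ B @ B.T @ expA(t - sk).T for sk, wk in zip(s, w))

# Control (65): u_s = B^T e^{(t-s)A^T} Q^{-1} (eta - e^{tA} xi)
c = np.linalg.solve(Q, eta - expA(t) @ xi)

def u(sk):
    return B.T @ expA(t - sk).T @ c     # array of shape (1,)

# Final state from (67): x_t = e^{tA} xi + int_0^t e^{(t-s)A} B u_s ds
x_final = expA(t) @ xi + sum(wk * expA(t - sk) @ B @ u(sk) for sk, wk in zip(s, w))

# Energy of u; it equals (eta - e^{tA} xi)^T Q^{-1} (eta - e^{tA} xi).
energy = sum(wk * float(u(sk) @ u(sk)) for sk, wk in zip(s, w))
```

In this example Q has the closed form with entries $t^3/3$, $t^2/2$, $t^2/2$, $t$, and the computed final state coincides with η up to rounding.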

Standard non controllable form. Assume that the controllability matrix has rank r < n. Choose a state space basis formed by a basis of E, complemented by primal vectors orthogonal to E, and denote by $(x_1, x_2)$ the corresponding block decomposition of vectors. The state equation (58), skipping the time argument, may be written as

(69)  $\frac{d}{dt}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} B_1 \\ B_2 \end{pmatrix} u.$

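In view of (62), the last n − r rows of the controllability matrix $\mathcal{C}$ are zero. We first observe that $B_2 = 0$. Then $AB = \begin{pmatrix} A_{11}B_1 \\ A_{21}B_1 \end{pmatrix}$ implies $A_{21}B_1 = 0$. By induction, we obtain that

(70)  $A^k B = \begin{pmatrix} A_{11}^k B_1 \\ A_{21} A_{11}^{k-1} B_1 \end{pmatrix}$ and so $A_{21} A_{11}^{k-1} B_1 = 0.$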

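Therefore, the rows of $A_{21}$ belong to the left kernel of the reduced controllability matrix

(71)  $\mathcal{C}_1 := [B_1 \;\; A_{11}B_1 \;\; \cdots \;\; A_{11}^{n-1}B_1].$

The latter coincides with the first r rows of $\mathcal{C}$, and hence has rank r, so that its left kernel has no nonzero vector. It follows that $A_{21} = 0$, and the standard non controllable form reads

(72)  $\frac{d}{dt}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} B_1 \\ 0 \end{pmatrix} u,$ with $(A_{11}, B_1)$ controllable.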
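The change of basis leading to (72) can be sketched numerically. The snippet below assumes NumPy and builds the basis from a singular value decomposition of the controllability matrix, which is one convenient way (not prescribed by the notes) of getting an orthonormal basis of E completed by vectors orthogonal to E. The function name `noncontrollable_form` and the 3 × 3 data are hypothetical.

```python
import numpy as np

def noncontrollable_form(A, B, tol=1e-10):
    """Return (U^T A U, U^T B, r), where the first r columns of U span E."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    ctrb = np.hstack(blocks)
    # Left singular vectors: the first r span E = range(ctrb),
    # the remaining n - r span the orthogonal complement of E.
    U, sing, _ = np.linalg.svd(ctrb)
    r = int(np.sum(sing > tol * max(sing[0], 1.0)))
    # New coordinates x = U z, so the pair becomes (U^T A U, U^T B).
    return U.T @ A @ U, U.T @ B, r

A = np.array([[1.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, -3.0]])
B = np.array([[0.0], [1.0], [0.0]])
A_new, B_new, r = noncontrollable_form(A, B)
A21 = A_new[r:, :r]   # expected to vanish, as in (72)
B2 = B_new[r:, :]     # expected to vanish as well
A22 = A_new[r:, r:]   # dynamics of the non controllable part
```

Here the third state is not influenced by the control, so r = 2, the blocks $A_{21}$ and $B_2$ vanish up to rounding, and $A_{22}$ reduces to the scalar −3.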

Single input, single output systems. Consider single input, single output (or SISO) systems of the form

(73)  $h^{(n)}_t = a_1 h_t + a_2 \dot h_t + \cdots + a_n h^{(n-1)}_t + u_t.$

Setting $x_i = h^{(i-1)}$, we obtain the standard form (58) for the state equation, with B equal to the last element of the canonical basis, and with A in the companion form A = Comp(a), where

(74)  $\mathrm{Comp}(a) := \begin{pmatrix} 0 & 1 & & & \\ & 0 & 1 & & \\ & & \ddots & \ddots & \\ & & & 0 & 1 \\ a_1 & a_2 & \cdots & a_{n-1} & a_n \end{pmatrix}.$

Note that the looped matrix with gain denoted k (since it is a vector), $A_k = A + Bk$, satisfies $A_k = \mathrm{Comp}(a + k)$. We recall the following classical result, see e.g. [3].

Lemma 2.3. The characteristic polynomial of the matrix Comp(a) is

(75)  $\det(\lambda I - \mathrm{Comp}(a)) = \lambda^n - a_n\lambda^{n-1} - a_{n-1}\lambda^{n-2} - \cdots - a_1.$

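Proof. We have that

(76)  $\lambda I - \mathrm{Comp}(a) = \begin{pmatrix} \lambda & -1 & & & \\ & \lambda & -1 & & \\ & & \ddots & \ddots & \\ & & & \lambda & -1 \\ -a_1 & -a_2 & \cdots & -a_{n-1} & \lambda - a_n \end{pmatrix}.$

Add to column 1 of this matrix the product of columns 2, . . . , n by $\lambda, \ldots, \lambda^{n-1}$, resp. All terms in the first column cancel except the last one, equal to $\lambda^n - a_n\lambda^{n-1} - \cdots - a_1$, whose cofactor is equal to 1 (the corresponding submatrix being lower triangular). The result easily follows. □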
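Lemma 2.3, combined with the remark that the looped matrix is Comp(a + k), can be checked on a small case. This is a sketch assuming NumPy; the function name `comp`, the coefficient vector a and the target eigenvalues −1, −2, −3 are hypothetical choices made for illustration.

```python
import numpy as np

def comp(a):
    """Companion matrix (74): ones on the superdiagonal, a in the last row."""
    n = len(a)
    M = np.zeros((n, n))
    M[np.arange(n - 1), np.arange(1, n)] = 1.0
    M[-1, :] = a
    return M

a = np.array([1.0, -2.0, 0.5])

# np.poly returns the coefficients of det(lambda I - M), highest degree first.
char_coeffs = np.poly(comp(a))
# Coefficients of lambda^n - a_n lambda^(n-1) - ... - a_1.
expected = np.concatenate(([1.0], -a[::-1]))

# Choice of the gain k: Comp(a + k) should have eigenvalues -1, -2, -3.
target = np.poly([-1.0, -2.0, -3.0])   # lambda^3 + 6 lambda^2 + 11 lambda + 6
k = -target[:0:-1] - a                 # a_i + k_i = -(coefficient of lambda^(i-1))
closed_loop_eigs = np.sort(np.linalg.eigvals(comp(a + k)).real)
```

The gain is obtained by matching coefficients only; no linear system has to be solved, which is what makes the companion form convenient for assigning eigenvalues.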

Expanding the determinant of D(a) := λI − Comp(a) along the first column, and defining $a_{2:n}$ as the (n − 1)-dimensional vector with components $a_2, \ldots, a_n$, we find that

(77)  $\det D(a) = \lambda \det D(a_{2:n}) - a_1,$

and so by induction

(78)  $\det D(a) = \lambda^n - \sum_{i=1}^{n} a_i \lambda^{i-1}.$

It follows that we can arbitrarily set the values of the coefficients of the characteristic polynomial by choosing the gain k of the looped matrix. In particular we can give arbitrary values to the eigenvalues of the looped system.

Multiple input systems. We say that the pair (A,B) is in the Brunovsky form if it is a parallel union of m SISO systems, of size $n_i$, with $n_1 + \cdots + n_m = n$, i.e., can be put in the form

(79)  $h^{(n_i)}_{i,t} = a_{i1} h_{i,t} + a_{i2} \dot h_{i,t} + \cdots + a_{i n_i} h^{(n_i-1)}_{i,t} + u_{i,t}, \quad i = 1, \ldots, m.$

Obviously a parallel union of controllable systems is controllable. Since SISO systems of the form (73) are controllable, any pair in the Brunovsky form is controllable. We prove a partial converse.


Definition 2.4. We define a change of control variables as a transformation u = Kx + Gv, with G invertible of size m, and a change of state variables as a transformation x = My, with M invertible of size n.

Exercise 2.5. Check that controllability is invariant under the above changes of variables.

We note that there is no loss of generality in assuming B to be of rank m for studying the controllability properties.

Lemma 2.6. If the pair (A,B) is controllable, with B of rank m, then after some change of control and state variables, it is in the Brunovsky form.

Proof. We get the result by induction over n. For n = 1 the conclusion trivially holds. Assume that it holds also for n − 1. After a change of state variables we obtain that the columns of B are the last m elements of the canonical basis of $\mathbb{R}^n$. Then, by the change of control variables $u = -\bar A x + v$, where $\bar A$ is the restriction of A to its last m rows, we obtain a system of the form

(80)  $\dot x = \begin{pmatrix} A_{11} & A_{12} \\ 0 & 0 \end{pmatrix} x + \begin{pmatrix} 0 \\ I_m \end{pmatrix} v.$

Necessarily $(A_{11}, A_{12})$ is a controllable pair. Extracting from $A_{12}$ a maximal set $B_1$ of independent columns, we have that $(A_{11}, B_1)$ is a controllable pair that by our induction hypothesis can be put in Brunovsky form. But then (80) gives the desired Brunovsky form for the original system. □

Stabilizability. A square matrix A is said to be stable if all its eigenvalues have negative real part. The dynamical system $\dot x = Ax$ is said to be exponentially stable if there exist constants a > 0 and α > 0, not depending on the initial state, such that $|x_t| \le a e^{-\alpha t}|x_0|$. It can be easily proved using the Jordan form that exponential stability holds iff A is stable.

We say that the pair (A,B) is stabilizable if there exists a state feedback, i.e., an m × n gain matrix K, such that the matrix of the looped system $A_K := A + BK$ is stable. Using the standard non controllable form (72), we see that the matrix of the looped system is

(81)  $\begin{pmatrix} A_{11} + B_1K_1 & A_{12} + B_1K_2 \\ 0 & A_{22} \end{pmatrix}$

where $K_1$ (resp. $K_2$) stands for the first r (resp. last n − r) columns of the feedback matrix K.

As a consequence of the Brunovsky normal form for controllable systems, and of the study of SISO systems, controllability implies stabilizability. In view of (81), we deduce the following:

Lemma 2.7. When put in the standard non controllable form (72), the pair (A,B) is stabilizable iff the matrix $A_{22}$ is stable.
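The block structure (81) behind Lemma 2.7 can be observed numerically: whatever the gain, the eigenvalues of $A_{22}$ remain eigenvalues of the looped matrix. The sketch below assumes NumPy; the blocks, the random gains and the helper `has_eigenvalue` are hypothetical and only serve as an illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# A pair already in the form (72): r = 2 controllable states, one uncontrolled state.
A11 = np.array([[0.0, 1.0], [1.0, 0.0]])
A12 = np.array([[1.0], [2.0]])
A22 = np.array([[-0.5]])
B1 = np.array([[0.0], [1.0]])
A = np.block([[A11, A12], [np.zeros((1, 2)), A22]])
B = np.vstack([B1, np.zeros((1, 1))])

def has_eigenvalue(M, lam, tol=1e-6):
    return bool(np.min(np.abs(np.linalg.eigvals(M) - lam)) < tol)

# Whatever the gain K = (K1 K2), the eigenvalue of A22 stays in the looped matrix.
kept = all(has_eigenvalue(A + B @ rng.normal(size=(1, 3)), -0.5) for _ in range(20))

# A gain acting on the controllable block only: A11 + B1 K1 = Comp((-2, -3)).
K = np.array([[-3.0, -3.0, 0.0]])
looped_eigs = np.sort(np.linalg.eigvals(A + B @ K).real)
```

Since $A_{22} = -0.5$ is stable in this example, the pair is stabilizable although it is not controllable; with a positive value of $A_{22}$ no gain could stabilize it.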


Observability. Let A and C be n × n and ℓ × n matrices, resp. The pair (A,C), related to the observation system

(82)  $\dot x_t = A x_t; \qquad y_t = C x_t$

is said to be observable if y is not identically zero whenever x is not identically zero. Since $x_t = e^{tA}x_0$, we have that $y_t = Ce^{tA}x_0$ is analytic, and hence is identically zero iff all its derivatives vanish at time 0, i.e.,

(83)  $0 = Cx_0 = CAx_0 = \cdots = CA^kx_0 = \cdots, \quad \text{for all } k \in \mathbb{N}.$

Since by the Cayley-Hamilton theorem it suffices to check for k ≤ n − 1, (83) holds iff $x_0$ belongs to the (right) kernel of the nℓ × n observability matrix below:

(84)  $\mathcal{O} := \begin{pmatrix} C \\ CA \\ \vdots \\ CA^{n-1} \end{pmatrix}.$

It follows that (A,C) is observable iff $\mathcal{O}$ has rank n. Note that $\mathcal{O}$ is the transpose of the controllability matrix of the “dual system” $(A^\top, C^\top)$. It is therefore possible to deduce from the standard non controllable form a corresponding standard non observable form.
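The rank test on (84) and the duality with the controllability matrix can be illustrated as follows. This is a sketch assuming NumPy; the helper names and the two observation matrices are hypothetical choices.

```python
import numpy as np

def observability_matrix(A, C):
    """Stack C, CA, ..., CA^(n-1) as in (84)."""
    rows = [C]
    for _ in range(A.shape[0] - 1):
        rows.append(rows[-1] @ A)
    return np.vstack(rows)

def controllability_matrix(A, B):
    """Return [B, AB, ..., A^(n-1) B] as in (63)."""
    cols = [B]
    for _ in range(A.shape[0] - 1):
        cols.append(A @ cols[-1])
    return np.hstack(cols)

A = np.array([[0.0, 1.0], [1.0, 0.0]])
C_pos = np.array([[1.0, 0.0]])    # observe the first state only
C_diff = np.array([[1.0, -1.0]])  # observe x1 - x2 only

O_pos = observability_matrix(A, C_pos)
O_diff = observability_matrix(A, C_diff)
rank_pos = int(np.linalg.matrix_rank(O_pos))
rank_diff = int(np.linalg.matrix_rank(O_diff))

# Duality: O is the transpose of the controllability matrix of (A^T, C^T).
dual_ok = bool(np.allclose(O_pos, controllability_matrix(A.T, C_pos.T).T))
```

With the second observation the initial state (1, 1) belongs to the kernel of the observability matrix: the corresponding trajectory is $e^t(1, 1)$, which grows while the output stays identically zero.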

Detectability. If C is of size r × n, and $x_t$, $y_t$ satisfy (82), then, given a matrix G of dimension n × r, we say that z, solution of

(85)  $\dot z_t = A z_t + G(C z_t - y_t),$

is a (convergent) linear estimator of the state if $|z_t - x_t| \to 0$ for all initial conditions on x and z. Since e := z − x satisfies $\dot e_t = (A + GC)e_t$, an equivalent condition is that A + GC is stable. We say that the pair (A,C) is detectable if it is possible to find G such that A + GC is stable. Since a matrix is stable iff its transpose is, A + GC is stable iff $A^\top + C^\top K$ is, with $K = G^\top$. Consequently, (C,A) is detectable iff $(A^\top, C^\top)$ is stabilizable. In view of the standard non controllable form (72), we see that we may write the dynamical system in the form

(86)  $\frac{d}{dt}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} A_{11} & 0 \\ A_{21} & A_{22} \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}; \qquad y = \begin{pmatrix} C_1 & 0 \end{pmatrix} x = C_1 x_1,$

with $(C_1, A_{11})$ observable. By Lemma 2.7, the pair (C,A) is detectable iff $A_{22}$ is stable. The observability matrix of (C,A) is the one of $(C_1, A_{11})$ supplemented by null columns.
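The estimator (85) can be simulated on a small unstable system. The sketch below assumes NumPy; the gain G is a hypothetical value chosen by hand so that A + GC has eigenvalues −1 and −2, and the integration uses a classical fourth-order Runge-Kutta step written for the occasion.

```python
import numpy as np

A = np.array([[0.0, 1.0], [1.0, 0.0]])   # unstable: eigenvalues +1 and -1
C = np.array([[1.0, 0.0]])               # only the first state is measured
G = np.array([[-3.0], [-3.0]])           # hypothetical gain, chosen by hand

# A + GC = [[-3, 1], [-2, 0]] has characteristic polynomial lambda^2 + 3 lambda + 2.
error_eigs = np.sort(np.linalg.eigvals(A + G @ C).real)

def rk4_step(f, v, dt):
    k1 = f(v)
    k2 = f(v + 0.5 * dt * k1)
    k3 = f(v + 0.5 * dt * k2)
    k4 = f(v + dt * k3)
    return v + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0

def joint_dynamics(v):
    """State x follows (82), estimator z follows (85)."""
    x, z = v[:2], v[2:]
    y = C @ x
    return np.concatenate([A @ x, A @ z + G @ (C @ z - y)])

v = np.array([1.0, 0.0, 0.0, 0.0])  # true state (1, 0), estimator started at 0
dt = 0.01
for _ in range(1000):                # integrate up to time 10
    v = rk4_step(joint_dynamics, v, dt)
final_error = float(np.linalg.norm(v[2:] - v[:2]))
```

The state itself grows exponentially, yet the estimation error z − x decays, since it only sees the stable matrix A + GC.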

2.3. The algebraic Riccati equation. Consider now the algebraic Riccati equation (ARE)

(87)  $0 = PA + A^\top P + C - PBR^{-1}B^\top P,$

to be compared to the differential Riccati equation (DRE) on a finite horizon:

(88)  $0 = \dot P + PA + A^\top P + C - PBR^{-1}B^\top P; \qquad P_T = 0.$


In the sequel we denote when necessary by $P_t[T]$ the value at time t of the solution of the DRE with horizon T, and by $v_\infty(x)$ the value of problem $(P_{x,\infty})$. The main result of this section is as follows.

Theorem 2.8. Assume that (A,B) is stabilizable and that (A,C) is detectable. Then (87) has a unique symmetric stabilizing solution denoted by $P_\infty$, and we have $v_\infty(x) = \frac{1}{2} x^\top P_\infty x$. In addition, the optimal control is given by

(89)  $u_t = -R^{-1}B^\top P_\infty x_t, \quad t \ge 0.$

The theorem follows from the following lemmas, which have their own interest since they use weaker hypotheses.

Lemma 2.9. The ARE has at most one symmetric solution that is stabilizing, i.e., such that the looped dynamics $A - BR^{-1}B^\top P$ is stable.

Proof. Let $P_1$ and $P_2$ be two symmetric solutions with stable associated feedback dynamics $A_i := A - BR^{-1}B^\top P_i$, for i = 1, 2. Set $Q := P_1 - P_2$. One easily checks that

(90)  $QA_1 + A_2^\top Q = 0.$

Let x be an eigenvector of $A_1$, with associated eigenvalue λ. Multiplying both sides of (90) by x, we obtain $\lambda Qx + A_2^\top Qx = 0$. This means that y := Qx is (if it is nonzero) an eigenvector of $A_2^\top$ with associated eigenvalue −λ. Since both $A_1$ and $A_2$ are stable (and $A_2^\top$ has the same eigenvalues as $A_2$), λ and −λ cannot both have negative real part, and this implies that y = Qx = 0. We have proved that all eigenvectors of $A_1$ belong to Ker Q. Now we may assume that $A_1$ is in Jordan form (see (30)). So we know that $Qe_1 = 0$ (where $e_1$ is the first basis vector). But then, in view of the Jordan form, $QA_1e_2 = \lambda_1 Qe_2$, and hence by (90):

(91)  $0 = \left( QA_1 + A_2^\top Q \right) e_2 = \lambda_1 Qe_2 + A_2^\top Qe_2,$

so that again $Qe_2 = 0$. By induction we prove that the subspace associated with $\lambda_1$ belongs to the kernel of Q. Since that is true for any eigenvalue, we obtain that Q = 0. The conclusion follows. □
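Lemma 2.9 does not say that the ARE has a single symmetric solution, only that at most one of them is stabilizing. A scalar example makes the distinction concrete. The sketch below uses only the Python standard library; the scalar data a, b, c, r are hypothetical values.

```python
import math

# Scalar data (hypothetical values): dynamics a, input b, weights c and r.
a, b, c, r = 1.0, 1.0, 1.0, 1.0

# Scalar ARE: 0 = 2 a p + c - (b^2 / r) p^2, a quadratic equation in p.
disc = math.sqrt(a * a + b * b * c / r)
solutions = [(a + disc) * r / (b * b), (a - disc) * r / (b * b)]

def closed_loop(p):
    """Looped dynamics a - b r^{-1} b p associated with a solution p."""
    return a - b * b * p / r

residuals = [2 * a * p + c - (b * b / r) * p * p for p in solutions]
stabilizing = [p for p in solutions if closed_loop(p) < 0.0]
```

With these values the two solutions are $1 \pm \sqrt{2}$; the looped dynamics equal $\mp\sqrt{2}$, so only the positive solution is stabilizing.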

Lemma 2.10. If the ARE (87) has a (necessarily unique) symmetric stabilizing solution P, then $v_\infty(x) \le \frac{1}{2} x^\top P x$.

Proof. Let (u, x) be the trajectory defined by $u_t = -R^{-1}B^\top P x_t$, with initial condition $x_0$; since P is stabilizing, the linear ODE $\dot x_t = (A - BR^{-1}B^\top P)x_t$ has a unique solution in $H^1(0,\infty;\mathbb{R}^n)$, so that (u, x) is well-defined. We have then, skipping the time argument and using the ARE:

(92)
$$\begin{aligned}
J_\infty[u] &= \tfrac{1}{2} \int_0^\infty \left[ u\cdot Ru + x\cdot Cx \right] dt \\
&= \tfrac{1}{2} \int_0^\infty \left[ -u\cdot B^\top Px + x\cdot (PBR^{-1}B^\top P - PA - A^\top P)x \right] dt \\
&= -\int_0^\infty x\cdot P(Ax + Bu)\, dt \\
&= -\int_0^\infty x\cdot P\dot x\, dt = -\tfrac{1}{2}\int_0^\infty \frac{d}{dt}(x\cdot Px)\, dt = \tfrac{1}{2}\, x_0\cdot Px_0,
\end{aligned}$$

where in the last equality we use the fact that $x_t \to 0$ as t ↑ ∞. The conclusion follows. □
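The identity (92) can be checked by simulation on the data of Section 2.5, for which the stabilizing solution is known in closed form. This is a sketch assuming NumPy; the horizon 30, the step size and the Runge-Kutta integration of the state together with the accumulated cost are choices made for this illustration.

```python
import numpy as np

A = np.array([[0.0, 1.0], [1.0, 0.0]])
B = np.array([[0.0], [1.0]])
C = np.eye(2)
R = np.array([[1.0]])
s2 = np.sqrt(2.0)
P = np.array([[2.0 + s2, 1.0 + s2], [1.0 + s2, 1.0 + s2]])  # exact form of (104)

K = np.linalg.solve(R, B.T @ P)        # feedback u = -K x
A_cl = A - B @ K
x0 = np.array([1.0, 0.0])

def rate(v):
    """Time derivative of (x, accumulated cost)."""
    x = v[:2]
    u = -K @ x
    running = 0.5 * (u @ R @ u + x @ C @ x)
    return np.concatenate([A_cl @ x, [running]])

v = np.concatenate([x0, [0.0]])
dt = 0.005
for _ in range(6000):                  # horizon 30, long enough for x to vanish
    k1 = rate(v)
    k2 = rate(v + 0.5 * dt * k1)
    k3 = rate(v + 0.5 * dt * k2)
    k4 = rate(v + dt * k3)
    v = v + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0

cost = float(v[2])
predicted = float(0.5 * x0 @ P @ x0)
residual = P @ A + A.T @ P + C - P @ B @ np.linalg.solve(R, B.T @ P)
```

The accumulated cost agrees with $\frac{1}{2} x_0 \cdot P x_0$ up to integration error, and `residual` confirms that P solves (87).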

Lemma 2.11. Assume that the pair (A,B) is stabilizable. Then (88) has, for any T > 0, a unique solution, and when T ↑ +∞, $P_0[T]$ is nondecreasing (for the Löwner order, see footnote 4) and converges to a solution denoted $P_\infty$ of (87).

Proof. It follows from (44) that the value of problem $(P_{x,T})$, denoted $v_T(x)$, is equal to $\frac{1}{2} x^\top P_0[T] x$. Since the state equation is autonomous and the integrand is nonnegative and independent of time, $T \mapsto v_T(x)$ is nondecreasing; therefore $P_0[T]$ is nondecreasing for the Löwner order. On the other hand, (A,B) being stabilizable, there exist a stabilizing feedback K and positive constants a and α such that the control $u_t = K y_t[u]$ satisfies $|y_t[u]| \le a e^{-\alpha t}|x|$, and therefore $|u_t| \le a\|K\| e^{-\alpha t}|x|$. We deduce that $J_\infty(u) \le b|x|^2$ for some b > 0, so that $v_T(x) \le v_\infty(x) \le b|x|^2$. Therefore, for each x, $T \mapsto x^\top P_0[T] x$ is nondecreasing and bounded, hence has a limit, say q(x). Since a limit of quadratic functions is quadratic, there exists a limit $P_\infty$ of $P_0[T]$. Since $v_T(x) \le v_\infty(x)$, we have that $\frac{1}{2} x^\top P_\infty x \le v_\infty(x)$.

Finally, if a solution of an autonomous ODE converges, then its limit is necessarily a critical point of the ODE. Therefore $P_\infty$ is a solution of (87). □
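The monotone convergence stated in Lemma 2.11 can be observed by integrating the DRE on the data of Section 2.5. The sketch below assumes NumPy and uses the reversed time τ = T − t, so that $P_0[T]$ is the value at τ = T of a forward ODE started from 0; the step size, the horizon 15 and the Runge-Kutta scheme are choices made for this illustration.

```python
import numpy as np

A = np.array([[0.0, 1.0], [1.0, 0.0]])
B = np.array([[0.0], [1.0]])
C = np.eye(2)
BRB = B @ B.T                       # B R^{-1} B^T with R = 1

def riccati_rhs(P):
    return P @ A + A.T @ P + C - P @ BRB @ P

# With tau = T - t, the DRE (88) becomes dP/dtau = riccati_rhs(P), P(0) = 0,
# and the value at tau = T is P_0[T].
dt, P = 0.01, np.zeros((2, 2))
snapshots = []
for step in range(1, 1501):                 # up to T = 15
    k1 = riccati_rhs(P)
    k2 = riccati_rhs(P + 0.5 * dt * k1)
    k3 = riccati_rhs(P + 0.5 * dt * k2)
    k4 = riccati_rhs(P + dt * k3)
    P = P + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    if step % 100 == 0:
        snapshots.append(P.copy())          # P_0[T] for T = 1, 2, ..., 15

# Loewner monotonicity: successive differences are positive semidefinite.
min_eigs = [float(np.linalg.eigvalsh(Q2 - Q1).min())
            for Q1, Q2 in zip(snapshots, snapshots[1:])]
s2 = np.sqrt(2.0)
P_inf = np.array([[2.0 + s2, 1.0 + s2], [1.0 + s2, 1.0 + s2]])
gap = float(np.abs(snapshots[-1] - P_inf).max())
```

The smallest eigenvalues in `min_eigs` are nonnegative up to rounding, and `gap` shows that $P_0[15]$ already coincides with the stabilizing solution of the ARE to high accuracy.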

Lemma 2.12. Assume that (A,B) is stabilizable and that (A,C) is detectable. Then the matrix $P_\infty$ is stabilizing.

Proof. Since $P_T := P_0[T]$ is nondecreasing in T, we have that

(93)  $\|P_\infty\||x|^2 \ge \|P_T\||x|^2 \ge \int_0^T x_t^\top \left( C + P_{T-t}BR^{-1}B^\top P_{T-t} \right) x_t \, dt,$

where $x_t$ is the optimal state with dynamics $\dot x_t = (A - BR^{-1}B^\top P_{T-t})x_t$, and the horizon is T. In particular, for any τ < T:

(94)  $\|P_\infty\||x|^2 \ge \int_0^\tau x_t^\top \left[ C + P_{T-t}BR^{-1}B^\top P_{T-t} \right] x_t \, dt.$

Footnote 4. For two square symmetric matrices A and B, the Löwner order $A \succeq B$ means that A − B is positive semidefinite.


When T ↑ ∞, since $P_{T-t} \to P_\infty$ uniformly on [0, τ], we obtain the same relation with $P_\infty$ instead of $P_{T-t}$. Now when τ ↑ ∞, it follows that

(95)  $\|P_\infty\||x|^2 \ge \int_0^\infty x_t^\top \left[ C + P_\infty BR^{-1}B^\top P_\infty \right] x_t \, dt,$

where $\dot x_t = (A - BR^{-1}B^\top P_\infty)x_t$. We know that $x_t$ is a linear combination of exponentials of t times the eigenvalues of $A - BR^{-1}B^\top P_\infty$, with coefficients that are polynomial in time, of degree less than the multiplicity of these eigenvalues. By (95), $u_t$ and $Cx_t$ are exponentially stable, and so is

(96)  $CAx_t = C(\dot x_t - Bu_t) = \frac{d}{dt}(Cx_t) - CBu_t.$

By induction, for k ≥ 1 we have exponential stability for

(97)  $CA^{k+1}x_t = CA^kAx_t = CA^k(\dot x_t - Bu_t) = \frac{d}{dt}(CA^kx_t) - CA^kBu_t.$

It follows that $\mathcal{O}x_t \to 0$ exponentially, where $\mathcal{O}$ is the observability matrix defined in (84). Using the canonical form (86), the fact that the observability matrix of (C,A) is the one of $(C_1, A_{11})$ supplemented by null columns, and the fact that, $(C_1, A_{11})$ being observable, its observability matrix is injective, it follows that $x_1 \to 0$ exponentially. Since $A_{22}$ is stable and $x_1$, u converge to 0 exponentially, so does $x_2$. The conclusion follows. □

2.4. Computation of the solution of the ARE. The numerical computation of the solution of the ARE is based on the computation of the Jordan form of the Hamiltonian matrix (remark 1.4) whose expression is here, setting $\mathcal{B} := BR^{-1}B^\top$:

(98)  $M = \begin{pmatrix} A & -\mathcal{B} \\ -C & -A^\top \end{pmatrix}.$

Proposition 2.13. Assume that (A,B) is stabilizable and that (A,C) is observable. Then M has exactly n generalized eigenvectors associated with stable eigenvalues. Denote by $Z := \begin{pmatrix} X \\ Y \end{pmatrix}$ the corresponding matrix, with X and Y of size n × n. Then the stabilizing solution of the ARE is $P = YX^{-1}$.

Proof. Let $G = A - \mathcal{B}P$, with $\mathcal{B} = BR^{-1}B^\top$, denote the closed loop matrix. In view of the ARE (87) we have that

(99)  $PG = PA - P\mathcal{B}P = -C - A^\top P.$

Let X be such that $X^{-1}GX = J$, where J is a Jordan form of G. Let Y := PX. Then, using (99) for the second relation:

(100)  $XJ = GX = AX - \mathcal{B}Y; \qquad YJ = PXJ = PGX = -CX - A^\top Y.$

Equivalently, MZ = ZJ, meaning that the columns of Z are generalized eigenvectors of M associated with the (stable) eigenvalues of G. In view of Lemma 1.5, M has exactly n generalized eigenvectors associated with stable eigenvalues. □


Remark 2.14. The Jordan form is not unique, since we can order the eigenvalues in different ways, but the product $P = YX^{-1}$ is invariant since the ARE has at most one stabilizing solution.
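Proposition 2.13 translates into a short computation. The sketch below assumes NumPy and uses ordinary eigenvectors returned by `numpy.linalg.eig` instead of a Jordan basis, so it is only meaningful when the stable eigenvalues of M are non-defective (which is the case for the data of Section 2.5 used here). The function name `are_stabilizing_solution` is hypothetical.

```python
import numpy as np

def are_stabilizing_solution(A, B, C, R):
    """Stabilizing solution of (87) from the stable eigenvectors of M in (98)."""
    n = A.shape[0]
    BRB = B @ np.linalg.solve(R, B.T)
    M = np.block([[A, -BRB], [-C, -A.T]])
    eigvals, eigvecs = np.linalg.eig(M)
    stable = eigvals.real < 0.0
    if stable.sum() != n:
        raise ValueError("expected exactly n stable eigenvalues")
    Z = eigvecs[:, stable]            # columns span the stable invariant subspace
    X, Y = Z[:n, :], Z[n:, :]
    P = Y @ np.linalg.inv(X)
    return P.real                     # imaginary parts, if any, are rounding noise

A = np.array([[0.0, 1.0], [1.0, 0.0]])
B = np.array([[0.0], [1.0]])
C = np.eye(2)
R = np.array([[1.0]])
P = are_stabilizing_solution(A, B, C, R)
residual = P @ A + A.T @ P + C - P @ B @ np.linalg.solve(R, B.T @ P)
closed_loop_eigs = np.sort(np.linalg.eigvals(A - B @ np.linalg.solve(R, B.T @ P)).real)
```

As Remark 2.14 points out, the ordering and the scaling of the selected eigenvectors do not matter, since they cancel in the product $YX^{-1}$. For defective or nearly defective cases an ordered Schur decomposition is the usual numerical substitute for the Jordan form; that variant is not covered by this sketch.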

2.5. Illustration. Here we illustrate the use of the ARE for stabilizing a nonlinear system at an equilibrium point. Consider the simple control system

(101)  $\ddot h_t = h_t + u_t, \quad t \ge 0,$

with $h_t$ and $u_t$ scalar valued. This system arises for instance when linearizing the equation of an inverted pendulum at the (unstable) equilibrium point, and therefore is the prototype of an unstable system. In the state space framework the state equation reads

(102)  $\frac{d}{dt}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} 0 \\ 1 \end{pmatrix} u.$

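We take R = 1 and C diagonal, so that

(103)  $A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}; \quad B = \begin{pmatrix} 0 \\ 1 \end{pmatrix}; \quad C = \begin{pmatrix} c_{11} & 0 \\ 0 & c_{22} \end{pmatrix}.$

For solving $PA + A^\top P + C - P\mathcal{B}P = 0$, with $\mathcal{B} = BR^{-1}B^\top$, in Scilab, type P=riccati(A,B,C,'c'), where the second argument stands for the matrix $\mathcal{B}$. For $c_{11} = c_{22} = 1$, the solution found and the closed loop dynamics $A_P = A - BB^\top P$ are

(104)  $P = \begin{pmatrix} 3.4142136 & 2.4142136 \\ 2.4142136 & 2.4142136 \end{pmatrix}; \quad A_P = \begin{pmatrix} 0 & 1 \\ -1.4142136 & -2.4142136 \end{pmatrix}.$

Since $A_P$ is a 2 × 2 matrix with negative trace and positive determinant, it is stable as expected. The eigenvalues are −1 and −1.414, and the corresponding eigenvectors are

(105)  $\begin{pmatrix} -1 \\ 1 \end{pmatrix}; \quad \begin{pmatrix} -0.7071 \\ 1 \end{pmatrix}.$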

We can test the application of this feedback to the pendulum equation, the angle θ being zero for the unstable position. The dynamics equation, assuming all constants to be equal to one, is

(106)  $\ddot\theta_t = \sin\theta_t + u_t.$

In figure 1 we display a simulation of the looped system, with zero initial speed, and initial positions 0.1 and 0.5, resp. The plots on the top (resp. bottom) are those of position (resp. speed). We see that the looped system behaves quite well. From the practical point of view, an obvious drawback is that we do not take into account the limitation of the power delivered by the control. This is a motivation for the study of constrained systems in the next chapters.
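A simulation of this kind can be reproduced along the following lines. This is a sketch assuming NumPy, not the code used for the figure: the time horizon, the step size and the Runge-Kutta scheme are choices made here, and P is taken from (104) in exact form.

```python
import numpy as np

s2 = np.sqrt(2.0)
P = np.array([[2.0 + s2, 1.0 + s2], [1.0 + s2, 1.0 + s2]])  # exact form of (104)
B = np.array([0.0, 1.0])
gain = B @ P                      # feedback u = -B^T P x, since R = 1

def pendulum(x):
    """Nonlinear dynamics (106) looped with the linear feedback."""
    theta, speed = x
    u = -gain @ x
    return np.array([speed, np.sin(theta) + u])

def simulate(theta0, horizon=10.0, dt=0.01):
    x = np.array([theta0, 0.0])   # zero initial speed
    path = [x]
    for _ in range(int(round(horizon / dt))):
        k1 = pendulum(x)
        k2 = pendulum(x + 0.5 * dt * k1)
        k3 = pendulum(x + 0.5 * dt * k2)
        k4 = pendulum(x + dt * k3)
        x = x + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        path.append(x)
    return np.array(path)         # columns: position, speed

path_small = simulate(0.1)
path_large = simulate(0.5)
```

The first column of each returned array is the position and the second one the speed; both trajectories return to the unstable equilibrium, even though the feedback was designed on the linearized model only.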

[Figure 1. Regulated inverse pendulum. Plot not reproduced; the horizontal axis ranges from 0 to 900 and the vertical axis from −0.3 to 0.5.]

3. Notes

Finite horizon problems. The main difference with classical presentations of the subject consists in the discussion of critical points rather than minimum points. This can be useful in the study of mechanical problems where action principles deal with critical points.

Infinite horizon problems. Kalman [8] introduced the decompositions into controllable / non controllable and observable / non observable parts.

Kalman [7] introduced the ARE and obtained the existence of a solution for a controllable system. In [9] he proved the uniqueness of a positive definite solution for a controllable and observable system. Wonham [11] weakened these conditions using the concepts of stabilizability and detectability. Potter [10] obtained the representation of the solution of the Riccati equation in terms of the eigenvectors of the Hamiltonian matrix. The stability of the solution of the Riccati equation is discussed in Bucy [4].


References

[1] A.V. Balakrishnan. Applied functional analysis, volume 3 of Applications of Mathematics. Springer-Verlag, New York, second edition, 1981.

[2] J.F. Bonnans and A. Shapiro. Perturbation analysis of optimization problems. Springer-Verlag, New York, 2000.

[3] Louis Brand. The companion matrix and its properties. Amer. Math. Monthly, 71(6).

[4] R. S. Bucy. Structural stability for the Riccati equation. SIAM J. Control, 13:749–753, 1975.

[5] N. Dunford and J. Schwartz. Linear operators, Vol I and II. Interscience, New York, 1958, 1963.

[6] A.D. Ioffe and V.M. Tihomirov. Theory of Extremal Problems. North-Holland Publishing Company, Amsterdam, 1979. Russian Edition: Nauka, Moscow, 1974.

[7] R. E. Kalman. Contributions to the theory of optimal control. Bol. Soc. Mat. Mexicana (2), 5:102–119, 1960.

[8] R. E. Kalman. Canonical structure of linear dynamical systems. Proc. Nat. Acad. Sci. U.S.A., 48:596–600, 1962.

[9] R. E. Kalman. When is a linear control system optimal? Trans. ASME, J. Basic Eng., Ser. D, 86:51–60, 1964.

[10] James E. Potter. Matrix quadratic solutions. SIAM J. Appl. Math., 14:496–501, 1966.

[11] W. M. Wonham. On a matrix Riccati equation of stochastic control. SIAM J. Control, 6:681–697, 1968.
