Stochastic tube MPC with state estimation


Transcript of Stochastic tube MPC with state estimation

Page 1: Stochastic tube MPC with state estimation

Automatica 48 (2012) 536–541


Brief paper

Stochastic tube MPC with state estimation✩

Mark Cannon a,1, Qifeng Cheng a, Basil Kouvaritakis a, Saša V. Raković b

a Department of Engineering Science, University of Oxford, Parks Road, Oxford, OX1 3PJ, United Kingdom
b Institute for Automation Engineering, Otto-von-Guericke-Universität Magdeburg, Germany

Article info

Article history: Received 9 June 2010; Received in revised form 8 July 2011; Accepted 9 August 2011; Available online 12 January 2012

Keywords: Model predictive control; Output feedback; State estimation; Constrained control; Stochastic systems

Abstract

An output feedback Model Predictive Control (MPC) strategy for linear systems with additive stochastic disturbances and probabilistic constraints is proposed. Given the probability distributions of the disturbance input, the measurement noise and the initial state estimation error, the distributions of future realizations of the constrained variables are predicted using the dynamics of the plant and a linear state estimator. From these distributions, a set of deterministic constraints is computed for the predictions of a nominal model. The constraints are incorporated in a receding horizon optimization of an expected quadratic cost, which is formulated as a quadratic program. The constraints are constructed so as to provide a guarantee of recursive feasibility, and the closed loop system is stable in a mean-square sense. All uncertainties in this paper are taken to be bounded; in most control applications this gives a more realistic representation of process and measurement noise than the more traditional Gaussian assumption.

© 2011 Elsevier Ltd. All rights reserved.

1. Introduction

Model predictive control (MPC) has proved successful because it attains approximate optimality in the presence of constraints. Most MPC strategies consider hard constraints, and a number of robust MPC (RMPC) techniques exist to handle uncertainty. However, model and measurement uncertainties are often stochastic, and therefore RMPC can be conservative since it ignores information on the probabilistic distribution of the uncertainty. In addition, not all constraints are hard (i.e. inviolable), and it may be possible to improve performance by tolerating violations of constraints provided that the frequency of violations remains within allowable limits (see e.g. van Hessem and Bosgra (2002), Kouvaritakis, Cannon, and Tsachouridis (2004) or, more recently, Magni, Pala, and Scattolini (2009); Oldewurter, Jones, and Morari (2009); Primbs (2007)).

Recent work (Cannon, Kouvaritakis, Raković, & Cheng, 2011; Kouvaritakis, Cannon, Raković, & Cheng, 2010) proposed Stochastic MPC (SMPC) algorithms that use probabilistic information on additive disturbances in order to minimize the expected value of a predicted cost subject to hard and soft (probabilistic) constraints. Stochastic tubes were used to provide a recursive guarantee of feasibility and thus ensure closed loop stability and constraint satisfaction. Tubes enable probabilistic constraints to be invoked via linear constraints on nominal predictions, and thus the online algorithms are computationally efficient. Moreover, Kouvaritakis et al. (2010) proposed conditions that, for the parameterization of predictions employed, are necessary and sufficient for recursive feasibility, thereby incurring no additional conservatism. The approach was based on state feedback, which assumes that the states are measurable. In practice this is often not the case, and it is then necessary to estimate the state via an observer.

✩ The material in this paper was partially presented at the 19th International Symposium on Mathematical Theory of Networks and Systems (MTNS 2010), July 5–9, 2010, Budapest, Hungary. This paper was recommended for publication in revised form by Associate Editor Fabrizio Dabbene under the direction of Editor Roberto Tempo.

E-mail addresses: [email protected] (M. Cannon), [email protected] (Q. Cheng), [email protected] (B. Kouvaritakis), [email protected] (S.V. Raković). 1 Tel.: +44 1865 273189; fax: +44 1865 273906.

0005-1098/$ – see front matter © 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.automatica.2011.08.058

The introduction of state estimation into RMPC is well understood (see e.g. Bemporad and Garulli (2000), Lee and Kouvaritakis (2001), Chisci and Zappa (2002), Wan and Kothare (2002), Løvaas, Seron, and Goodwin (2008), Mayne, Raković, Findeisen, and Allgöwer (2009)) and uses lifting to describe the combined system and observer dynamics. The current paper extends these ideas to include probabilistic information on measurement noise and the unknown initial plant state. Extending the approach of Kouvaritakis et al. (2010), which considered the propagation of the effects of bounded additive stochastic disturbances, this paper examines how the effects of bounded measurement noise and uncertain state estimates propagate along predicted input/state trajectories. This analysis provides conditions for satisfaction of probabilistic constraints over an infinite prediction horizon, as well as conditions that guarantee the existence of feasible solutions at all times, given a feasible solution at initial time. Thus, unlike earlier work (e.g. Yan and Bitmead (2005)) which did not consider feasibility and stability, the approach guarantees recursive feasibility with respect to hard and probabilistic constraints and ensures stability in a mean-square sense. We provide a numerical example which verifies that the rate of constraint violation can be as tight as the specified probability. As is common in the literature on probabilistic robustness (e.g. Calafiore, Dabbene, and Tempo (2000)), all stochastic uncertainties are assumed to have bounded support. Not only is this necessary for asserting feasibility and stability, but it matches the real world more closely than the mathematically convenient Gaussian assumption, since noise and disturbances derived from physical processes are finite.

2. Problem formulation

Consider a linear system with state x_k ∈ R^n, measured output y_k ∈ R^{n_y} and control input u_k ∈ R^{n_u}:

x_{k+1} = A x_k + B u_k + D w_k  (1a)
y_k = C x_k + F v_k  (1b)

where the disturbance w_k ∈ R^{n_w} and measurement noise v_k ∈ R^{n_v} sequences are independent, identically distributed, with known, compactly supported distributions such that the elements of w_k and v_k are zero-mean and independent. Let

x̂_{k+1} = A x̂_k + B u_k + L(y_k − ŷ_k)  (2a)
ŷ_k = C x̂_k  (2b)

where x̂_k is the state estimate and L ∈ R^{n×n_y} is chosen such that A − LC is strictly stable. The error dynamics are then²

ε_{k+1} = (A − LC) ε_k + D w_k − L F v_k,  (3)

where ε_k := x_k − x̂_k is the estimation error; ε_0 is assumed to be a random variable with a known and compactly supported distribution, so that ε_0 ∈ Π_0 almost surely (a.s.) for some compact set Π_0 ⊂ R^n. Let the state estimate and error sequences predicted at time k be {x̂_{k+i|k}, ε_{k+i|k}, i = 0, 1, . . .} and let ξ_{k+i|k} = [x̂_{k+i|k}^T  ε_{k+i|k}^T]^T. Then, using the closed loop paradigm (Kouvaritakis, Rossiter, & Schuurmans, 2000), the predictions evolve according to

ξ_{k+i+1|k} = Ψ ξ_{k+i|k} + B c_{k+i|k} + D δ_{k+i|k}  (4a)
u_{k+i|k} = K x̂_{k+i|k} + c_{k+i|k}  (4b)

where: δ_{k+i|k} = [w_{k+i|k}^T  v_{k+i|k}^T]^T is a stochastic disturbance; c_{k+i|k} for i = 0, . . . , N − 1 are decision variables and c_{k+i|k} = 0 for i ≥ N, for some chosen finite horizon N; also Ψ (and hence also A + BK) is assumed to be strictly stable, with

Ψ = [ A + BK   LC
      0        A − LC ],   B = [ B
                                 0 ],   D = [ 0    LF
                                              D   −LF ].

Given the assumptions on w_k and v_k, the distribution of δ_k is compactly supported, so δ_k ∈ ∆ a.s. for compact ∆ ⊂ R^{n_w+n_v}.
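As a quick numerical illustration, the block-triangular structure of Ψ means its spectrum is the union of the spectra of A + BK and A − LC, so Ψ is strictly stable whenever the controller and observer are separately stabilizing. The sketch below checks this using the plant and LQ gain from the example of Section 5, together with a hypothetical observer gain L chosen only for illustration (it is not the gain used in the paper):

```python
import numpy as np

A = np.array([[1.6, 1.1], [-0.7, 1.2]])   # plant data from Section 5
B = np.array([[1.0], [1.0]])
C = np.array([[0.9, 0.2]])
K = -np.array([[1.0313, 1.0682]])         # LQ-optimal gain quoted in Section 5
L = np.array([[2.85], [1.17]])            # hypothetical observer gain (illustration only)

# Augmented prediction matrix Psi of (4a)
Psi = np.block([[A + B @ K, L @ C],
                [np.zeros((2, 2)), A - L @ C]])

eig_Psi = np.sort_complex(np.linalg.eigvals(Psi))
eig_blocks = np.sort_complex(np.concatenate(
    [np.linalg.eigvals(A + B @ K), np.linalg.eigvals(A - L @ C)]))
spectral_radius = float(max(abs(eig_Psi)))

assert np.allclose(eig_Psi, eig_blocks)   # spectrum = union of the block spectra
assert spectral_radius < 1.0              # hence Psi is strictly stable
```

Because Ψ is block upper-triangular, no joint design is needed: any stabilizing K and L separately make Ψ strictly stable.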

The system is subject to probabilistic constraints:

Pr{η_x^T x_{k+1} + η_u^T u_k ≤ h} ≥ p,  k = 0, 1, . . .  (5)

for fixed scalar h, vectors η_x, η_u, and probability p ∈ [0, 1]. Applied to the predictions of (4), these are equivalent to

Pr{g_0^T ξ_{k+i|k} + g_1^T D δ_{k+i|k} + f^T c_{k+i|k} ≤ h} ≥ p,  i = 0, 1, . . .  (6)

where g_0^T = [η_x^T (A + BK) + η_u^T K   η_x^T A], g_1^T = [η_x^T  η_x^T], and f^T = η_x^T B + η_u^T. The case of hard constraints is treated simply by setting p = 1 in (5) and (6).

² Following a reviewer's suggestion we note that (3) is based on a prior state estimate, whereas a posterior estimation scheme may be advantageous, i.e. replacing the RHS of (3) by (A − LCA) ε_k + (I − LC) D w_k − L F v_{k+1} can lead to improved constraint handling.

We split the predictions ξ_{k+i|k} into nominal (z_{k+i|k}) and uncertain (e_{k+i|k}) components, which evolve for i = 0, 1, . . . as

ξ_{k+i|k} = z_{k+i|k} + e_{k+i|k}  (7a)
z_{k+i+1|k} = Ψ z_{k+i|k} + B c_{k+i|k}  (7b)
e_{k+i+1|k} = Ψ e_{k+i|k} + D δ_{k+i|k}  (7c)

with initial conditions³ pertaining to i = 0: z_{k|k} = [x̂_k^T  0]^T, e_{k|k} = [0  ε_k^T]^T, e_{k|k} ∼ D_k; where the distribution D_k is calculated using (3) and the distribution of the initial estimation error ε_0. This decomposition simplifies constraint handling by separating stochastic (e) and deterministic (z) predictions.
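The decomposition (7a)–(7c) is exact by linearity: propagating the nominal and uncertain parts separately and summing them reproduces the full prediction dynamics (4a). A minimal sketch, with random illustrative stand-ins for the matrices of (4a):

```python
import numpy as np

rng = np.random.default_rng(1)
Psi = 0.5 * rng.standard_normal((4, 4))   # any Psi works for this identity
Bt = rng.standard_normal((4, 1))          # stands in for the augmented "B"
Dt = rng.standard_normal((4, 3))          # stands in for the augmented "D"

z = np.array([1.0, 2.0, 0.0, 0.0])        # nominal part  [x_hat; 0]
e = np.array([0.0, 0.0, 0.3, -0.2])       # uncertain part [0; eps]
xi = z + e                                # full prediction (7a) at i = 0
for i in range(10):
    c = rng.standard_normal(1)            # decision variable c_{k+i|k}
    delta = rng.standard_normal(3)        # disturbance delta_{k+i|k}
    xi = Psi @ xi + Bt @ c + Dt @ delta   # full dynamics (4a)
    z = Psi @ z + Bt @ c                  # nominal dynamics (7b)
    e = Psi @ e + Dt @ delta              # uncertain dynamics (7c)

assert np.allclose(xi, z + e)             # (7a) holds at every step
```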

Our problem is to devise an MPC strategy that minimizes, subject to (5), the predicted cost

J_k = Σ_{i=0}^{∞} E(x_{k+i|k}^T Q x_{k+i|k} + u_{k+i|k}^T R u_{k+i|k})  (8)

with R ≻ 0, while ensuring that the closed loop system is stable and that x_k converges to a neighborhood of the origin.

3. Recursively feasible probabilistic tubes

To handle the constraints of (5), we first give necessary and sufficient conditions that guarantee that (6) is satisfied by predicted state and input sequences. In the sequel we define c_k = [c_{k|k}^T  c_{k+1|k}^T · · · c_{k+N−1|k}^T]^T ∈ R^{N n_u}.

Theorem 1. At time k, the constraints (6) are satisfied by predictions of (4) if and only if c_k satisfies

(g_0^T H_i + f^T E_i) c_k + g_0^T Ψ^i z_{k|k} ≤ h − γ_{i|k},  i = 0, 1, . . .  (9)

where H_i = [Ψ^{i−1} B · · · B 0 · · · 0], E_i c_k = c_{k+i|k}, and, for each k, γ_{i|k} is defined as the minimum value such that

Pr{g_0^T e_{k|k} + g_1^T D δ_{k|k} ≤ γ_{0|k}} = p,  for i = 0,  (10a)
Pr{g_0^T (Ψ^i e_{k|k} + Ψ^{i−1} D δ_{k|k} + · · · + D δ_{k+i−1|k}) + g_1^T D δ_{k+i|k} ≤ γ_{i|k}} = p,  for i = 1, 2, . . . .  (10b)

Proof. The predictions of (7) give

z_{k+i|k} = Ψ^i z_{k|k} + H_i c_k  (11a)
e_{k+i|k} = Ψ^i e_{k|k} + Ψ^{i−1} D δ_{k|k} + · · · + D δ_{k+i−1|k}  (11b)

and since γ_{i|k} is the minimum value that satisfies (10a)–(10b), it follows that (9) is equivalent to the constraint of (6). □
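The explicit formula (11a) can be checked against step-by-step propagation of (7b). The sketch below builds H_i = [Ψ^{i−1}B · · · B 0 · · · 0] for illustrative matrices and confirms both computations of z_{k+i|k} agree:

```python
import numpy as np

rng = np.random.default_rng(2)
n, nu, N = 4, 1, 5
Psi = 0.4 * rng.standard_normal((n, n))   # illustrative prediction matrix
Bt = rng.standard_normal((n, nu))         # illustrative augmented input matrix
z0 = rng.standard_normal(n)               # z_{k|k}
ck = rng.standard_normal(N * nu)          # stacked [c_{k|k}; ...; c_{k+N-1|k}]

def H(i):
    """H_i = [Psi^{i-1} B, ..., Psi B, B, 0, ..., 0]  (n x N*nu)."""
    blocks = [np.linalg.matrix_power(Psi, i - 1 - j) @ Bt for j in range(min(i, N))]
    blocks += [np.zeros((n, nu))] * (N - len(blocks))
    return np.hstack(blocks)

z = z0.copy()
for i in range(N):
    explicit = np.linalg.matrix_power(Psi, i) @ z0 + H(i) @ ck   # formula (11a)
    assert np.allclose(z, explicit)
    z = Psi @ z + Bt @ ck[i * nu:(i + 1) * nu]                   # recursion (7b)
ok = True
```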

Many problems of practical interest have more than one constrained variable. For the case of r constrained scalar variables, we assume that each constraint can be expressed as a condition of the form of (5), and hence (6), i.e.

Pr{g_{0,j}^T ξ + g_{1,j}^T D δ + f_j^T c ≤ h_j} ≥ p_j

for j = 1, . . . , r. Then Theorem 1 can be applied to each of the r individual constraints to derive a set of constraints (9).

³ Note that the initial conditions for predictions are updated using output measurements at each time k, and the predictions generated by (7c) do not retain the structure of the initial condition e_{k|k}, since the first block of e_{k+i|k} contains a random variable for i > 0.

Page 3: Stochastic tube MPC with state estimation

538 M. Cannon et al. / Automatica 48 (2012) 536–541

Due to linearity, the conditions (9) can easily be incorporated into an online optimization of c_k. To determine γ_{i|k}, the distribution of g_0^T (Ψ^i e_{k|k} + Ψ^{i−1} D δ_k + · · · + D δ_{k+i−1}) + g_1^T D δ_{k+i} is needed, and this requires a multivariate convolution integral. Instead, by discretizing the distributions of δ_k and ε_0 and performing discrete scalar convolutions, the values of γ_{i|k} can be approximated (to a specified degree of accuracy) at reasonable computational cost. Importantly, the calculation of γ_{i|k} does not require knowledge of x̂_k and can be performed offline. Moreover, the error, δp_i, due to discretization can be accounted for by replacing p by p + δp_i, thereby incurring some conservatism, but as mentioned above, this can be made as small as desired.
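A minimal sketch of the discrete-convolution idea: the pmf of a sum of i.i.d. bounded scalar terms is obtained by repeatedly convolving a discretized pmf, and a probabilistic bound γ is read off as the smallest grid point whose cumulative probability reaches p. The truncated-Gaussian shape and the three-term sum here are illustrative, not the paper's exact computation:

```python
import numpy as np

h = 1e-3                                    # grid spacing
x = np.arange(-0.12, 0.12 + h / 2, h)       # support of one scalar term
pmf = np.exp(-x**2 / (2 * 0.1**2))          # truncated-Gaussian weights
pmf /= pmf.sum()                            # normalize to a pmf

m, p = 3, 0.8
conv = pmf.copy()
for _ in range(m - 1):
    conv = np.convolve(conv, pmf)           # pmf of the running sum
grid = np.arange(conv.size) * h + m * x[0]  # support grid of the m-term sum
cdf = np.cumsum(conv)
gamma = grid[np.searchsorted(cdf, p)]       # smallest grid point with CDF >= p

assert m * x[0] <= gamma <= m * x[-1]       # gamma lies inside the sum's support
```

Refining the grid spacing h tightens the discretization error δp_i at the cost of longer (but still offline) convolutions.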

Although the conditions of Theorem 1 ensure that (6) is satisfied over the entire prediction horizon at time k, the existence of c_k satisfying (9) does not ensure the existence of c_{k+1} generating predictions at time k + 1 satisfying (6). Consequently (9) does not guarantee the future feasibility of an online optimization problem incorporating (9) as a constraint. This is because γ_{i|k} in (10a)–(10b) is a probabilistic bound on the stochastic components of the predicted value of g_0^T ξ_{k+i|k} + g_1^T D δ_{k+i|k} at time k, whereas the same probabilistic bound on g_0^T ξ_{k+i|k+1} + g_1^T D δ_{k+i|k+1} could, on the basis of information available at time k, take its maximum value over all possible realizations of the estimation error e_k and disturbance δ_k at time k.

Define T c_k = [c_{k+1|k}^T · · · c_{k+N−1|k}^T  0]^T ∈ R^{N n_u}; then the preceding argument implies that c_{k+1} = T c_k satisfies (9) at time k + 1 if and only if γ_{i|k} is replaced in (9) by the bounds that are obtained when worst-case values for e_{k|k} and δ_{k|k} are used in (10a). The following theorem generalizes this approach by deriving the conditions for feasibility of c_{k+i} = T^i c_k in (9) at time k + i for all i ≥ 1. These conditions therefore ensure that the probabilistic constraints (6) are satisfied at time k and are also recursively feasible in the sense that they remain feasible at all future times k + 1, k + 2, . . .. We define γ_i as the minimum value such that

Pr{g_0^T (Ψ^{i−1} D δ_k + · · · + D δ_{k+i−1}) + g_1^T D δ_{k+i} ≤ γ_i} = p,  (12)

and also define α_{i|k} and d_i for i = 1, 2, . . . by

d_i = max_{δ∈∆} g_0^T Ψ^{i−1} D δ  (13a)
α_{i|k} = ess sup_{e_{k|k}∼D_k} g_0^T Ψ^i e_{k|k}  (13b)

where ess sup_{e∼D} f(e) denotes the maximum of a function f over the support of the distribution D.

Theorem 2. At time k, the constraints (6) are satisfied by the predictions of (4) if c_k satisfies

(g_0^T H_i + f^T E_i) c_k + g_0^T Ψ^i z_{k|k} ≤ h − β_{i|k},  i = 0, 1, . . .  (14)

where β_{i|k} is the maximum element of the (i + 1)th column of the matrix B_k defined by

B_k = [ γ_{0|k}   γ_{1|k}                γ_{2|k}                       · · ·
        0         α_{1|k} + d_1 + γ_0    α_{2|k} + d_2 + γ_1           · · ·
        0         0                      α_{2|k} + d_2 + d_1 + γ_0     · · ·
        ⋮         ⋮                      ⋮                             ⋱  ].  (15)

Furthermore, if (14) is feasible at time k, then (14) will remain feasible at all times k + 1, k + 2, . . . .

Proof. If β_{i|k} is defined by the (i + 1)th element of the first row of B_k, then (14) is equivalent to (9). The (i + 1)th element of the second row of B_k is obtained by replacing e_{k|k} and δ_{k|k} in (10b) by their worst-case values, so that (14) also ensures that T c_k is feasible for (9) at time k + 1. Similarly the jth row corresponds to the worst-case values of e_{k|k} and δ_{k|k}, . . . , δ_{k+j−2|k} in (10b), thus ensuring that T^j c_k is feasible for (9) at time k + j. By the same argument, T^j c_k is also feasible for (14) at time k + j, for all j > 0. □

Lemma 3. For all i ≥ 1 we have β_{i|k} = α_{i|k} + Σ_{j=1}^{i} d_j + γ_0.

Proof. From the definitions of d_i and α_{i|k} in (13a)–(13b) we have g_0^T Ψ^{i−1} D δ_k ≤ d_i and g_0^T Ψ^i e_{k|k} ≤ α_{i|k} for all realizations of e_{k|k} and δ_k. Hence γ_{1|k} ≤ α_{1|k} + d_1 + γ_0, and from the definition of γ_{i−1} it also follows that

Pr{g_0^T (Ψ^i e_{k|k} + Ψ^{i−1} D δ_k + · · · + D δ_{k+i−1}) + γ_0 ≤ α_{i|k} + d_i + γ_{i−1}} ≥ p

so that γ_{i|k} ≤ α_{i|k} + d_i + γ_{i−1} for i = 2, 3, . . . . Similarly, we obtain γ_i ≤ d_i + γ_{i−1} for i = 1, 2, . . . , and therefore the maximum element in each column of B_k lies on the diagonal. □

Remark 4. Although the bounds in (13) account for worst-case uncertainty, the bounds in (10a) and (12) hold with probability p ≤ 1. The constraints of (14)–(15) are therefore less restrictive than the corresponding RMPC constraints, which would require p = 1 in (6), and hence would assume the worst-case bounds on all future uncertainty variables.

Since e_{k|k} and {δ_k, δ_{k+1}, . . .} are bounded, and since Ψ is strictly stable, the predicted sequence {e_{k|k}, e_{k+1|k}, . . .} generated by (7c), and hence also {β_{0|k}, β_{1|k}, . . .}, is necessarily bounded for any k. We next show (similarly to Raković (2007)) how to compute bounds on β_{i|k} that hold for all i, k exceeding specified values. These are used in Section 4 to derive a recursively feasible SMPC algorithm involving a finite number of constraints. These bounds also enable the constraints in the online SMPC optimization to be computed offline.

Define the symmetric positive definite matrix S ∈ R^{2n×2n} and scalar ρ ∈ (0, 1) as the solution of the semidefinite program:

(ρ, S) = arg min_{ρ>0, S=S^T≻0} ρ  subject to  Ψ S Ψ^T ≼ ρ² S, and  (16a)

[ 1     δ^T D^T
  D δ   S       ] ≻ 0  for all δ ∈ ∆.  (16b)

Note that the convexity of (16a), (16b) and the strict stability of Ψ ensure the existence and uniqueness of the optimal solution of (16a)–(16b). Define E as the ellipsoidal set E = {e : ‖e‖_{S^{−1}} ≤ 1}, where ‖e‖_{S^{−1}} = √(e^T S^{−1} e), and let E_ε denote its projection onto the subspace {e ∈ R^{2n} : [I_n  0] e = 0}, so that

E_ε = {ε : ε^T S_{22}^{−1} ε ≤ 1},

where S_{22} is the relevant block of S = [ S_{11}  S_{12} ; S_{21}  S_{22} ]. Also define positive scalars τ and σ by

τ = ess sup_{e_0∼D_0} ‖e_0‖_{S^{−1}} = max_{ε_0∈Π_0} ‖[0^T  ε_0^T]^T‖_{S^{−1}}  (17a)

σ = max_{ε∈E_ε} ‖[0^T  ε^T]^T‖_{S^{−1}} = λ^{1/2}( S_{22}^{1/2} [0  I] S^{−1} [0  I]^T S_{22}^{1/2} )  (17b)

(where λ(A) denotes the maximum eigenvalue of A).
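A feasible (not optimal) pair (ρ, S) for (16a)–(16b) can be constructed without an SDP solver, which is useful for sanity-checking the conditions: for any ρ between the spectral radius of Ψ and 1, the Lyapunov series X = Σ_k (Ψ/ρ)^k ((Ψ/ρ)^T)^k converges and satisfies (Ψ/ρ) X (Ψ/ρ)^T + I = X; scaling S = cX, with c the largest value of (Dδ)^T X^{−1} (Dδ) over the vertices of a box ∆, then puts D∆ inside E. All numbers below are illustrative:

```python
import numpy as np

Psi = np.array([[0.5, 0.2], [0.0, 0.4]])  # illustrative strictly stable Psi
Dd = np.array([[0.3, 0.0], [0.1, 0.2]])   # illustrative augmented "D"
rho = 0.8
assert max(abs(np.linalg.eigvals(Psi))) < rho < 1

Ar = Psi / rho
X = np.zeros((2, 2))
M = np.eye(2)
for _ in range(200):                      # truncated Lyapunov series
    X += M
    M = Ar @ M @ Ar.T

# Scale so every vertex of the box Delta = [-1, 1]^2 maps into E;
# the quadratic is convex in delta, so vertices suffice for a box.
Xinv = np.linalg.inv(X)
verts = [np.array([sx, sy]) for sx in (-1, 1) for sy in (-1, 1)]
c = max(float((Dd @ v) @ Xinv @ (Dd @ v)) for v in verts)
S = c * X

# (16a): rho^2 S - Psi S Psi^T is positive semidefinite
gap = rho**2 * S - Psi @ S @ Psi.T
assert min(np.linalg.eigvalsh(gap)) > -1e-9
# (16b): D delta lies in E for each vertex delta
Sinv = np.linalg.inv(S)
assert all((Dd @ v) @ Sinv @ (Dd @ v) <= 1 + 1e-9 for v in verts)
```

The SDP (16a)–(16b) would additionally minimize ρ; the construction above only exhibits feasibility for a given ρ.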

Lemma 5. For any integer ν > 1, β_{i|k} satisfies:

β_{i|k} = Σ_{j=1}^{i} d_j + Σ_{j=1}^{k} b_{ij} + a_{ik} + γ_0,  1 ≤ i < ν, k < ν  (18a)

β_{i|k} ≤ β_{i|ν} = Σ_{j=1}^{i} d_j + Σ_{j=1}^{ν−i} b_{ij} + α + γ_0,  1 ≤ i < ν, k ≥ ν  (18b)

β_{i|k} ≤ β = Σ_{j=1}^{ν} d_j + α + γ_0,  i ≥ ν  (18c)

where b_{ij} = max_{δ∈∆} g_0^T Ψ^i [0 0; 0 I] Ψ^{j−1} D δ, a_{ik} = ess sup_{e_0∼D_0} g_0^T Ψ^i [0 0; 0 I] Ψ^k e_0, and α = ρ^ν σ (1/(1−ρ) + τ) ‖g_0‖_S.

Proof. Using e_{k|k} = [0  ε_k^T]^T and (3), the definition of α_{i|k} in (13) gives α_{i|k} = Σ_{j=1}^{k} b_{ij} + a_{ik}, which implies that the expression for β_{i|k} in (18a) holds for all i ≥ 1 and k ≥ 0. To demonstrate the bounds in (18b) and (18c), note that (16b) implies D∆ ⊆ E (i.e. Dδ ∈ E for all δ ∈ ∆), and (16a) therefore implies Ψ^j D∆ ⊆ ρ^j E. Since max_{e∈E} g_0^T e = ‖g_0‖_S, it follows that d_j ≤ ρ^{j−1} ‖g_0‖_S. Furthermore, (17b) implies

[0 0; 0 I] E ⊆ σ E,

so we have, similarly, b_{ij} ≤ ρ^{i+j−1} σ ‖g_0‖_S, and from (17a) we have a_{ik} ≤ ρ^{i+k} σ τ ‖g_0‖_S. The bounds in (18b) and (18c) follow by substituting these bounds into (18a) and using the properties that ρ < 1 (since Ψ in (16a) is strictly stable) and σ > 1 (due to the definition of σ in (17b)). □

Remark 6. The bounds in (18b), (18c) can be made arbitrarily tight by using a sufficiently large value for ν.

Remark 7. The values of β_{0|k} = γ_{0|k} for k ≥ ν may be approximated for sufficiently large ν (since Ψ is strictly stable) using the asymptotic distribution D_∞ of lim_{k→∞} e_{k|k}, i.e.

β_{0|k} ≈ γ_{0|∞} = γ_∞  ∀ k ≥ ν.

Combining this with the bounds on β_{i|k} for i ≥ 1 in Lemma 5 enables recursively feasible constraints to be computed offline for all k ≥ 0 on the basis of a finite number of bounds:

γ_0, {γ_{0|k}, k = 0, . . . , ν − 1}, γ_∞, {d_j, j = 1, . . . , ν},
{b_{ij}, i, j = 1, . . . , ν − 1}, {a_{ij}, i, j = 1, . . . , ν − 1}, α.  (19)

This approach assumes that the transient response of the coupled plant and observer due to the initial estimation error becomes negligible after ν time steps. For ν less than the duration of these transients, the degree of conservativeness of the bounds of Lemma 5 could be non-negligible.

4. SMPC algorithm

This section describes a receding horizon control strategy based on the predictions of (4) and establishes closed loop stability and convergence properties. The strategy minimizes the predicted expected value cost (8) over the decision variables c_k = [c_{k|k}^T · · · c_{k+N−1|k}^T]^T subject to the constraints (6), which are invoked via the recursively feasible constraints of (14). As mentioned earlier, these are less restrictive than the corresponding RMPC constraints (with p = 1), and therefore allow for more control authority with which to improve performance and enlarge the region of attraction. The predictions of (4b), with c_{k+i|k} = 0 for i ≥ N, imply a dual mode prediction strategy, with mode 1 comprising the first N prediction time steps during which the control inputs are defined by c_k, and mode 2 the subsequent infinite horizon on which control inputs are given by the feedback law u_{k+i|k} = K x̂_{k+i|k}. Constraints are handled explicitly in mode 1 and implicitly in mode 2 through the constraint that the predicted state at the end of mode 1 should lie in a terminal set. This set is derived by re-writing (14) in mode 2 at time k as

g_0^T Ψ^j z_{k+N|k} ≤ h − β_{N+j|k},  j = 0, 1, . . . .  (20)

The maximal admissible set at time k, denoted S_{∞|k}, is the subset of R^n containing all z_{k+N|k} such that (20) holds. The bounds (18a) imply conditions for the existence of S_{∞|k}.

Lemma 8. If the probability p is such that h > β and

h > { max_{i=N,...,ν−1} β_{i|k}  if k < ν
      max_{i=N,...,ν−1} β_{i|ν}  otherwise }  (21)

then S_{∞|k} is non-empty.

Proof. Using (18b), (18c) to bound β_{N+j|k} in (20) defining S_{∞|k}, it is easy to show that the conditions on h given in the lemma ensure that S_{∞|k} contains the origin, and so is non-empty. □

The approach of Gilbert and Tan (1991) provides a method of computing an inner approximation of S_{∞|k}. Consider first the case of k < ν, and define a terminal set S_{ν|k} as follows:

S_{ν|k} = {z : g_0^T Ψ^{i−N} z ≤ h − β_{i|k}, i = N, . . . , ν − 1;
           g_0^T Ψ^{i−N} z ≤ h − β, i = ν, ν + 1, . . .}.  (22)

From (18c) it follows that S_{ν|k} ⊆ S_{∞|k}, and furthermore, it can be shown (Gilbert & Tan, 1991) that there exists N_k^∗ such that S_{ν|k} is determined by the finite set of inequalities:

S_{ν|k} = {z : g_0^T Ψ^{i−N} z ≤ h − β_{i|k}, i = N, . . . , ν − 1;
           g_0^T Ψ^{i−N} z ≤ h − β, i = ν, . . . , ν + N_k^∗}  (23)

whenever S_{∞|k} is bounded. For k ≥ ν, (18b) enables a time-invariant terminal set to be defined as S_{ν|k} = S_ν, where

S_ν = {z : g_0^T Ψ^{i−N} z ≤ h − β_{i|ν}, i = N, . . . , ν − 1;
       g_0^T Ψ^{i−N} z ≤ h − β, i = ν, . . . , ν + N_ν^∗}  (24)

for some finite N_ν^∗. Thus a finite collection of terminal sets: S_{ν|k}, for k = 0, . . . , ν − 1, and S_ν for all k ≥ ν, can be used to invoke the mode 2 constraints at any time k. Moreover, for each k = 0, . . . , ν, the parameter N_k^∗ can be determined by solving a finite number of linear programming problems (see e.g. Gilbert and Tan (1991)). Note that if S_{∞|k} is unbounded, then S_{ν|k} and S_ν can be defined as bounded inner approximations of S_{∞|k} by incorporating suitable additional constraints in (22). Using the approach described above, it is then possible to express S_{ν|k} and S_ν in terms of a finite number of inequalities as in (23) and (24).
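The Gilbert–Tan determination of the constraint-checking horizon amounts to a sequence of linear programs: the horizon is the smallest n for which the next inequality, g^T Ψ^{n+1} z ≤ h − β, is already implied by those imposed so far. The sketch below runs this test over a bounded box (mirroring the bounded-inner-approximation remark above); Ψ, g and the bound are illustrative stand-ins:

```python
import numpy as np
from scipy.optimize import linprog

Psi = np.array([[0.7, 0.3], [-0.2, 0.6]])  # illustrative strictly stable Psi
g = np.array([1.0, 0.5])                   # stands in for g_0 restricted to z
hb = 1.0                                   # stands in for h - beta

rows, Nstar = [g], None
for n in range(50):
    nxt = rows[-1] @ Psi                   # row vector g^T Psi^{n+1}
    # maximize nxt.z subject to the accumulated constraints over a box
    # (linprog minimizes, so negate the objective)
    res = linprog(-nxt, A_ub=np.array(rows), b_ub=[hb] * len(rows),
                  bounds=[(-10, 10)] * 2)
    assert res.status == 0
    if -res.fun <= hb + 1e-9:              # new constraint is redundant
        Nstar = n
        break
    rows.append(nxt)

assert Nstar is not None                   # a finite horizon was found
```

Strict stability of Ψ guarantees termination: g^T Ψ^n shrinks geometrically, so the later inequalities are eventually redundant over any bounded set.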

The cost defined by (8) is the summation of the expected value of each stage cost, which is infinite over the infinite prediction horizon since the asymptotic value

l_ss = lim_{i→∞} E(x_{k+i|k}^T Q x_{k+i|k} + u_{k+i|k}^T R u_{k+i|k})

is non-zero for the case of persistent additive uncertainty considered here. Thus we redefine the cost function in terms of the deviation of the expected stage cost from this asymptotic limit:

J_k = Σ_{i=0}^{∞} [E(x_{k+i|k}^T Q x_{k+i|k} + u_{k+i|k}^T R u_{k+i|k}) − l_ss].  (25)

This modification provides a finite cost that can be minimized online, although clearly the optimal value of c_k is unchanged.

Remark 9. Given the distribution of δ, the cost (25) can be expressed as a quadratic function of the degrees of freedom c_k in the predictions:

J_k(c_k) = c_k^T P_cc c_k + 2 x_k^T P_xc c_k + p_k(x_k)  (26)

where P_cc ∈ R^{N n_u × N n_u}, P_xc ∈ R^{n × N n_u} are constants that can be computed offline, and p_k(x_k) is independent of c_k. Furthermore, if K is the optimal feedback gain for the case of no constraints, then P_xc = 0 by construction.


Algorithm 1 (SMPC).

Offline. Use the distributions of e_0 and δ_k to compute the parameters (19) that define the recursively feasible probabilistic constraints (6). Determine the constraint checking horizons N_k^∗, for k = 0, . . . , ν, in the terminal constraint sets defined in (23) and (24).

Online. At each time k = 0, 1, . . .:

1. Compute the state estimate x̂_k and set z_{k|k} = (x̂_k, 0).
2. Solve the quadratic program:

   c_k^∗ = arg min_{c_k} J_k(c_k)
   subject to (g_0^T H_i + f^T E_i) c_k + g_0^T Ψ^i z_{k|k} ≤ h − β_i,  i = 0, . . . , N − 1
   and z_{k+N|k} ∈ S_k

   where β_i = β_{i|k} and S_k = S_{ν|k} if k < ν, while β_i = β_{i|ν} and S_k = S_ν if k ≥ ν.

3. Set u_k = K x̂_k + c_{k|k}^∗.
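Step 2 is a standard convex QP. A minimal sketch with illustrative data follows (a general-purpose solver stands in for whatever QP solver an implementation would use, the linear term stands in for the P_xc contribution, and the terminal-set condition, which is just more linear inequalities, is omitted):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
N = 4
Pcc = np.eye(N) + 0.1 * np.ones((N, N))   # positive definite cost weight (illustrative)
q = 0.1 * rng.standard_normal(N)          # stands in for P_xc^T x (zero if P_xc = 0)
Ain = rng.standard_normal((3, N))         # stands in for the rows g_0^T H_i + f^T E_i
bin_ = np.array([0.5, 0.8, 1.0])          # stands in for h - beta_i - g_0^T Psi^i z_{k|k}

res = minimize(lambda c: c @ Pcc @ c + 2 * q @ c,
               x0=np.zeros(N),
               jac=lambda c: 2 * Pcc @ c + 2 * q,
               constraints=[{"type": "ineq",
                             "fun": lambda c: bin_ - Ain @ c}],
               method="SLSQP")

assert res.success
assert np.all(Ain @ res.x <= bin_ + 1e-6)  # tightened constraints satisfied
assert res.fun <= 1e-8                     # c = 0 is feasible here, so optimum <= J(0) = 0
```

Since the constraint data and β_i bounds are computed offline, the per-step online work reduces to this single QP in the N·n_u decision variables.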

Theorem 10. Given feasibility at k = 0, SMPC remains feasible for all k > 0. The closed loop system under SMPC satisfies the probabilistic constraint (5) and the quadratic stability condition:

lim_{r→∞} (1/r) Σ_{k=0}^{r} E(x_k^T Q x_k + u_k^T R u_k) ≤ l_ss.  (27)

Proof. The recurrence of feasibility is due to the definition of the constraints in the online optimization of SMPC, and this ensures that the probabilistic constraints (5) are satisfied in closed loop operation. In addition, Theorem 2 ensures that T c_k^∗ is feasible for the online optimization at k + 1, so that

J_k(c_k^∗) = E_k[J_{k+1}(T c_k^∗)] + x_k^T Q x_k + u_k^T R u_k − l_ss

which therefore implies that the optimal cost value satisfies the stochastic Lyapunov-like condition

J_k(c_k^∗) ≥ E_k[J_{k+1}(c_{k+1}^∗)] + x_k^T Q x_k + u_k^T R u_k − l_ss.  (28)

Summing (28) over k = 0, 1, . . . , r gives

Σ_{k=0}^{r} [E(x_k^T Q x_k + u_k^T R u_k) − l_ss] ≤ J_0(c_0^∗) − E[J_{r+1}(c_{r+1}^∗)]

which implies (27) because J_0(c_0^∗) is by assumption finite and, since the quadratic form (26) implies that J_r(c_r^∗) is lower-bounded, (28) ensures that E[J_r(c_r^∗)] is finite for all r. □

Corollary 11. If K is the optimal feedback gain for the case of no constraints, then x_k ∈ T in the limit as k → ∞ under SMPC, where T denotes the minimal robust invariant set for the state of (1) under u_k = K x_k.

Proof. The quadratic form of J_k(c_k) in (26) and the feasibility of c_{k+1} = T c_k^∗ imply that

c_k^∗T P_cc c_k^∗ ≥ c_{k+1}^∗T P_cc c_{k+1}^∗ + c_{k|k}^∗T (R + B^T P_x B) c_{k|k}^∗

(where P_x is the steady state solution of the Riccati equation associated with (1) and (8)). From the sum of both sides of this inequality over k ≥ 0 and the fact that R + B^T P_x B ≻ 0, it follows that c_{k|k}^∗ → 0 as k → ∞, so that x_k converges to T. □

5. Simulation example

In this example, the model parameters in (1) are:

A = [ 1.6   1.1
     −0.7   1.2 ],  B = [ 1
                          1 ],  C = [0.9  0.2],  D = I,  F = 1.

There are four probabilistic constraints of the form (5), with

η_x^T = [±1  ±0.3],  η_u = 0,  h = 1.5,  p = 0.8.

The cost weights are Q = C^T C and R = 0.1. The disturbance w_k, initial state estimation error ε_0, and measurement noise v_k are bounded random variables with truncated and scaled Gaussian distributions. For example, v_k is derived from a normally distributed v_k′ ∼ N(0, 1/100), which is truncated so that |v_k| ≤ 0.12. Therefore the distribution function for v_k, denoted F_v(V) = Pr{v_k ≤ V}, is obtained from that of v_k′ by an affine transformation

  F_v(V) = a F_{v′}(V) + b for all V ∈ [−0.12, 0.12],

where a, b are such that F_v(−0.12) = 0 and F_v(0.12) = 1. As explained in Section 1, this is not an approximation which ignores large uncertainties but rather conforms with real-world uncertainties whose distributions have compact support. The distributions of w_k, ε_0 are defined analogously by truncating the distributions of Gaussian variables w_k′, ε_0′:

  w_k′ ∼ N(0, I/100),   ∥w_k∥_∞ ≤ 0.12,
  ε_0′ ∼ N(0, I/100),   ∥ε_0∥_∞ ≤ 0.12.
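This truncate-and-rescale construction can be sketched in plain Python; the rejection sampler and the helper names below are illustrative choices, not taken from the paper:

```python
import math
import random

def sample_truncated_gaussian(sigma, bound, rng):
    """Draw from N(0, sigma^2) conditioned on |v| <= bound (rejection sampling)."""
    while True:
        v = rng.gauss(0.0, sigma)
        if abs(v) <= bound:
            return v

def truncated_cdf(V, sigma, bound):
    """F_v(V) = a*F_v'(V) + b, with a, b chosen so F_v(-bound) = 0, F_v(bound) = 1."""
    Phi = lambda x: 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))
    a = 1.0 / (Phi(bound) - Phi(-bound))
    b = -a * Phi(-bound)
    return a * Phi(V) + b

rng = random.Random(0)
sigma, bound = math.sqrt(1.0 / 100.0), 0.12   # v'_k ~ N(0, 1/100), |v_k| <= 0.12
samples = [sample_truncated_gaussian(sigma, bound, rng) for _ in range(1000)]
```

By symmetry of the truncation interval the rescaled distribution keeps zero median, and `truncated_cdf` reproduces the affine map F_v(V) = a F_{v′}(V) + b used above.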

In accordance with the assumptions of Section 2, the sequences {w_k, k = 0, 1, . . .} and {v_k, k = 0, 1, . . .} are i.i.d. The feedback gain in (4b) is K = −[1.0313 1.0682], which is the unconstrained LQ-optimal gain for the given plant model and cost. The observer gain (based on the posterior estimation scheme) is chosen as L = −[0.8348 1.2234]^T, which gives the spectral radius of A − LCA as 0.12. The mode 1 prediction horizon N and the horizon ν over which constraints are handled explicitly are chosen (by considering the transient response when constraints are absent) as N = 6 and ν = 13. The sequence {γ_{0|k}, k = 0, 1, . . . , ν − 1} is computed by discrete convolution based on an equi-spaced grid with spacing 10^{−4}. The values of β_{i|k} are bounded using (18a)–(18b). This gives the condition of (21) as h ≥ β where, for the four constraints respectively, β is 1.31, 1.20, 1.31, 1.20; as required, all these values are less than h = 1.5. For the chosen N and ν, the terminal sets S_{ν|k} and S_ν are given by (23) and (24) with N_k^∗ = 10 for all k. For 10^4 realizations of disturbance and noise sequences, the response of Algorithm 1 was simulated and compared with that of the optimal unconstrained feedback law. In each simulation, the initial state was defined as x_0 = x̂_0 + ε_0, with the initial state estimate fixed at x̂_0 = (1.8, −4) and with ε_0 obtained by sampling D_0.
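The quoted estimator eigenvalue placement can be checked numerically. The sketch below (plain Python, with an illustrative helper for the 2×2 case) computes the spectral radius of the estimation error dynamics A − LCA; note that the value 0.12 is recovered when the gain entries are applied with positive sign, so the sign of L should be read against the paper's observer convention:

```python
import math

# Plant data from the example; the observer gain entries are taken with
# positive sign here so that A - L*C*A matches the quoted spectral radius
A = [[1.6, 1.1], [-0.7, 1.2]]
C = [0.9, 0.2]
L = [0.8348, 1.2234]

def spectral_radius_2x2(M):
    """Spectral radius of a 2x2 matrix via trace/determinant."""
    t = M[0][0] + M[1][1]                        # trace
    d = M[0][0] * M[1][1] - M[0][1] * M[1][0]    # determinant
    disc = t * t - 4.0 * d
    if disc >= 0.0:
        return max(abs((t + math.sqrt(disc)) / 2.0),
                   abs((t - math.sqrt(disc)) / 2.0))
    return math.sqrt(d)                          # complex pair: |lambda| = sqrt(det)

CA = [C[0] * A[0][j] + C[1] * A[1][j] for j in range(2)]           # row vector C*A
A_err = [[A[i][j] - L[i] * CA[j] for j in range(2)] for i in range(2)]

rho = spectral_radius_2x2(A_err)   # approximately 0.12, as quoted in the text
```

As a side observation, the same helper shows the plant itself is open-loop unstable (spectral radius of A above 1), which is why a stabilizing K and a fast observer are needed.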

The evolution of the state sequence {x_k, k = 0, 1, . . .} for 200 realizations of disturbance and noise sequences is shown in Fig. 1, which also shows that the chosen x_0 lies on the boundary of the feasible region for SMPC and outside that for RMPC (note that the constraints in (5) apply to the one-step-ahead state). For SMPC, the observed probability of violating the two active constraints is (as expected) 20% (at k = 1), whereas the violation rate is 100% at k = 1 under unconstrained optimal control. The cumulative costs of (25), for closed loop operation averaged over 10^4 simulations, are 3.56 and 2.38 under SMPC and unconstrained optimal control respectively. Although the constrained SMPC performance is, as expected, worse than that of the unconstrained LQR, it is noteworthy that RMPC is infeasible for the given initial condition. Moreover, for the initial condition x_0 = (1.35, −4), which lies on the boundary of the feasible region for RMPC, the costs for SMPC and RMPC in closed loop operation (averaged over 10^4 simulations) were 1.72 and 2.42 respectively, indicating that SMPC outperforms RMPC.



Fig. 1. Evolution of x_k for SMPC and unconstrained optimal control; state constraints (dashed line); the sets of feasible initial conditions x_0 for SMPC (solid black line) and RMPC (dash–dot line).

6. Conclusion

This paper discusses and proposes a solution for a stochastic MPC problem for systems with bounded random additive disturbances and soft constraints, and for which the states are not available for direct measurement. The proposed algorithm carries a guarantee of recursive feasibility and ensures a form of quadratic stability and convergence of the state to a neighborhood of the origin. Simulation results show that it achieves near-optimal performance while satisfying the soft constraints non-conservatively, in the sense that the observed probability of constraint satisfaction in closed loop operation is the same as the specified value.

Acknowledgment

Qifeng Cheng is supported by the K.C. Wong Foundation.

References

Bemporad, A., & Garulli, A. (2000). Output feedback predictive control of constrained linear systems via set-membership state estimation. International Journal of Control, 73(8), 655–665.

Calafiore, G. C., Dabbene, F., & Tempo, R. (2000). Randomized algorithms for probabilistic robustness with real and complex structured uncertainty. IEEE Transactions on Automatic Control, 45(12), 2218–2235.

Cannon, M., Kouvaritakis, B., Raković, S. V., & Cheng, Q. (2011). Stochastic tubes in model predictive control with probabilistic constraints. IEEE Transactions on Automatic Control, 56(1), 194–200.

Chisci, L., & Zappa, G. (2002). Feasibility in constrained predictive control of linear systems: the output feedback case. International Journal of Robust and Nonlinear Control, 12(5), 465–487.

Gilbert, E. G., & Tan, K. T. (1991). Linear systems with state and control constraints: the theory and application of maximal output admissible sets. IEEE Transactions on Automatic Control, 36(9), 1008–1020.

Kouvaritakis, B., Cannon, M., Raković, S. V., & Cheng, Q. (2010). Explicit use of probabilistic distributions in linear predictive control. Automatica, 46(10), 1719–1724.

Kouvaritakis, B., Cannon, M., & Tsachouridis, V. (2004). Recent developments in stochastic MPC and sustainable development. Annual Reviews in Control, 28, 23–35.

Kouvaritakis, B., Rossiter, J. A., & Schuurmans, J. (2000). Efficient robust predictive control. IEEE Transactions on Automatic Control, 45, 1545–1549.

Lee, Y. I., & Kouvaritakis, B. (2001). Receding horizon output feedback control for linear systems with input saturation. IEE Proceedings - Control Theory and Applications, 148(2), 109–115.

Løvaas, C., Seron, M. M., & Goodwin, G. C. (2008). Robust output-feedback model predictive control for systems with unstructured uncertainty. Automatica, 44(8), 1933–1943.

Magni, L., Pala, D., & Scattolini, R. (2009). Stochastic model predictive control of constrained linear systems with additive uncertainty. In Proc. European Control Conference, Budapest, Hungary.

Mayne, D. Q., Raković, S. V., Findeisen, R., & Allgöwer, F. (2009). Robust output feedback model predictive control of constrained linear systems: time varying case. Automatica, 45(9), 2082–2087.

Oldewurter, F., Jones, C. N., & Morari, M. (2009). A tractable approximation of chance constrained stochastic MPC based on affine disturbance feedback. In Proc. 48th IEEE Conf. Decision Control, Cancun.

Primbs, J. A. (2007). A soft constraint approach to stochastic receding horizon control. In Proc. 46th IEEE Conf. Decision Control.

Raković, S. V. (2007). Minkowski algebra and Banach contraction principle in set invariance for linear discrete time systems. In Proc. 46th IEEE Conf. Decision Control (pp. 2169–2174), New Orleans.

van Hessem, D. H., & Bosgra, O. H. (2002). A conic reformulation of model predictive control including bounded and stochastic disturbances under state and input constraints. In Proc. 41st IEEE Conf. Decision Control (pp. 4643–4648), Las Vegas.

Wan, Z., & Kothare, M. V. (2002). Robust output feedback model predictive control using off-line linear matrix inequalities. Journal of Process Control, 12(7), 763–774.

Yan, J., & Bitmead, R. R. (2005). Incorporating state estimation into model predictive control and its application to network traffic control. Automatica, 41, 595–604.

Mark Cannon received the degrees of M.Eng. in Engineering Science in 1993 and D.Phil. in Control Engineering in 1998, both from Oxford University, and received the M.S. in Mechanical Engineering in 1995 from Massachusetts Institute of Technology. He is an Official Fellow of St. John's College and a University Lecturer in Engineering Science, Oxford University.

Qifeng Cheng obtained the degrees of B.Eng. and M.Eng. in Engineering Science in 2006 and 2008, both from Tsinghua University, China. He is currently a D.Phil. student in Engineering Science and a member of St. Hugh's College at Oxford University.

Basil Kouvaritakis was awarded a B.Sc. in Electrical Engineering from the University of Manchester Institute of Science and Technology, where he also received his Masters and Doctorate. He is currently a Professor of Engineering at the Department of Engineering Science and a Tutorial Fellow at St. Edmund Hall, Oxford University.

Saša V. Raković received the B.Sc. degree in Electrical Engineering from the Technical Faculty Čačak, University of Kragujevac (Serbia), the M.Sc. degree in Control Engineering and the Ph.D. degree in Control Theory from Imperial College London. His Ph.D. thesis, ''Robust Control of Constrained Discrete Time Systems: Characterization and Implementation'', was awarded the Eryl Cadwaladr Davies Prize for the best thesis in the Electrical and Electronic Engineering Department of Imperial College London in 2005. He held the posts of Research Associate at Imperial College London (until 11/2006) and Postdoctoral Researcher at ETH Zurich (11/2006–11/2008). He is currently a Scientific Associate at the Institute for Automation Engineering, Otto-von-Guericke-Universität Magdeburg and an Academic Visitor at the Department of Engineering Science, University of Oxford. His main research interests lie in the areas of control synthesis and analysis as well as decision making under constraints and uncertainty.