Retrospective Cost Adaptive Controldsbaero/library/RCAC_to_appear.pdf · 2017-04-06 · Adaptive...

Retrospective Cost Adaptive Control

Pole Placement, Frequency Response, and Connections withLQG Control

Yousaf Rahman, Antai Xie, and Dennis S. Bernstein

Aerospace Engineering Department, University of Michigan, 1320 Beal Ave., Ann Arbor, MI 48109

A vast range of technological systems—from aerospace vehicles to chemical process plantsto Segways—depend on feedback control. These applications typically rely on a combinationof classical and modern control techniques, logic for mode switching, and diagnostics for faultdetection in order to ensure safety and reliability, validated and verified through simulation andtesting. A feedback control system is the quintessential cyberphysical system, where real-timedigital computing elements interact bidirectionally with the full complexity of the real worldthrough noisy transducers and limited communication channels [1]. In many applications, thecontrol system is crucial to safe operation, and the potential benefits of new ideas and techniquesfor feedback control must be weighed against the risk of unanticipated response and failure.

A successful application of feedback control depends first and foremost on the availabilityof reliable and affordable sensors and actuators that can monitor the response of the plantand modify its behavior. An autopilot for an aircraft is a classical example, where inertial andnoninertial sensors provide measurements of position, velocity, attitude, and angular rates, whilethrust and aerodynamic surfaces provide forces and moments. In other applications, such ascontrolled combustion [2], flow control [3]–[5], and controlled fusion [6], the development ofsensing and actuation strategies presents a challenge. Control research is typically concernedwith applications for which effective sensing and actuation technologies are available, and thegoal is to use this technology reliably and efficiently in order to achieve stabilization, commandfollowing, and disturbance rejection.

Despite these successes, many applications of feedback control remain beyond the reach ofmodern tools and techniques. These applications may be highly under-sensed and under-actuatedrelative to the order of their significant dynamics; they may be difficult to model due to complex,unknown, or unpredictably changing physics; and they may require reliable high-performance

DRAFT

1

control systems that must be engineered within tight deadlines and budgets. Adaptive controloffers a viable approach to these applications.

The underlying motivation for research in adaptive control is to develop control algorithmsthat can accommodate sensor and actuator limitations, respect communication constraints,account for complex, uncertain, and unpredictably changing dynamics, and operate robustly inthe presence of noise and reliably in the event of sensor/actuator failure. The promise of adaptivecontrol is the ability to account for all of these effects with minimal prior modeling, tuning, andanalysis for applications that are beyond the reach of fixed-gain and fixed-logic model-basedcontrol laws.

Adaptive control is different from robust control, which also accounts for uncertainty.In particular, robust control views uncertainty as static and seeks to trade performance forrobustness to the assumed level of uncertainty. In contrast, adaptive control strives to learnabout the plant during operation, either explicitly or implicitly, in order to overcome prioruncertainty. Consequently, by tuning itself to the actual plant, an adaptive controller transcendsthe performance/robustness tradeoff of robust control. To illustrate this point, consider a plantsubject to a harmonic disturbance whose frequency is ω. If ω is known, then a stabilizingcontroller based on the internal model principle can be used to apply infinite gain at thedisturbance fequency in order to asymptotically reject the disturbance. If, however, ω is unknown,then a robust controller must apply high but finite gain across a bandwidth of possible disturbancefrequencies, thus sacrificing asymptotic disturbance rejection and unnecessarily amplifying sensornoise. In contrast, an adaptive controller can tune itself to the actual disturbance frequencyand thereby asymptotically reject the disturbance. The same principle applies to the case of aplant with a lightly damped mode subject to broadband disturbances. If the modal frequencyis uncertain, then the H∞ distance between possible plants is large. Consequently, a robustcontroller based on an H∞ uncertainty metric must account for a large set of uncertain plants.In contrast, an adaptive controller can tune itself to the actual modal frequency.

Plants that are inherently difficult to control due to achievability constraints on closed-loopperformance [7] pose a potentially severe challenge to adaptive control as articulated in [8]:

Control engineers grounded in classical control know it is possible to formulate control design problemswhich in practical terms are not possible to solve. An inverted pendulum with more than two rods is awell-known example; again, a plant with nonminimum phase zeros well inside the passband and unstablepoles may be near impossible to control, unless additional inputs or outputs are used; another famousexample was provided in [9] and so on. When the plant is initially known, as well as the control objective,it will generally become clear at some point in the design process, if not ab initio, that the control objectiveis impractical.

DRAFT

2

Now what happens in adaptive control? The catch is that a full description of the plant is lacking. Theremay be no way to decide on the basis of the a priori information that the projected design task is or isnot practical. So what will happen if an adaptive control algorithm is run in such a case? At the least,the algorithm will not converge. At worst, an unstable closed loop will be established.

This article focuses on retrospective cost adaptive control (RCAC). RCAC is a direct,discrete-time adaptive control technique for stabilization, command following, and disturbancerejection. As a discrete-time approach, RCAC is motivated by the desire to implement controlalgorithms that operate at a fixed measurement sampling rate without the need for controllerdiscretization. This also means that the required modeling information can be estimated based ondata sampled at the same rate as the control update. Adaptive control algorithms for continuous-time plants are developed in [10]–[18] and for discrete-time plants in [19]–[30], and the abilityto handle plants with nonminimum-phase (NMP) zeros, that is, zeros outside of the open unitdisk, is demonstrated in [20]–[23].

RCAC was motivated by the concept of retrospectively optimized control, where pastcontroller coefficients used to generate past control inputs are re-optimized in the sense that,if the re-optimized coefficients had been used over a previous window of operation, then theperformance would have been better. However, unlike signal processing applications such asestimation and identification, it is impossible to change past control inputs, and thus the re-optimized controller coefficients are used only to generate the next control input. Since RCACdepends heavily on data for the controller update, this technique is similar to data-driven control[31]–[36].

RCAC was originally developed within the context of active noise control experiments[37]. The algorithm used in [37] is gradient-based, where the direction and step size are derivedfrom different cost functions. In subsequent work [38], the gradient algorithm was replaced bybatch least-squares optimization. In both [37] and [38], the modeling information is given byMarkov parameters (impulse response coefficients) of the open-loop transfer function Gzu fromthe control input u to the performance variable z. More recently, in [39], a recursive least squares(RLS) algorithm is used along with knowledge of the NMP zeros of Gzu. The approaches in[37]–[39] are closely related in the sense that all of the NMP zeros outside of the spectral radiusof Gzu are approximate roots of a polynomial whose coefficients are Markov parameters of Gzu;this polynomial is the numerator of a truncated Laurent expansion of Gzu about infinity. Thetruncated Laurent expansion serves as the transfer function Gf that defines the retrospective costby filtering the difference between the actual past control inputs and the re-optimized controlinputs. A key contribution of this article is to show that Gf serves as a target model for a specificclosed-loop transfer function, as explained below. To construct Gf , Markov parameters are used

DRAFT

3

in [37], [38], and NMP zeros are used in [39].

RCAC is applied to Rohrs’s counterexamples [40] in [41], broadband disturbance rejectionin [42], decentralized control in [43], sampled-data plants with aliasing in [44], MIMO plantsin [45], and Hammerstein plants in [46]–[48]. Numerical simulation studies are given in [49],[50] for flow control; in [51] for noncolocated control of a linkage; in [42], [43], [52], [53] forvibration control; in [54] for engine control; in [55]–[59] for aircraft control; in [60] for spacecraftcontrol; in [61] for quadrotor control; in [62] for missile control; in [63] for scramjet control;and in [48], [64], [65] for control of plants with hysteresis and hysteretic friction. Laboratoryexperiments are reported in [37], [66], [67] for noise control; in [68] for control of a ductedflame; and in [69] for control of a 6DOF shaker table.

The present article has several objectives. The first objective is to present the RCACalgorithm in sufficient detail that readers can grasp all of the key steps. This version of thealgorithm is based on RLS optimization. The algorithm is presented within the context ofthe adaptive standard problem, which includes command-following and disturbance-rejectionproblems with and without feedforward control as special cases; several of these problems arepresented as special cases of the adaptive servo problem. The adaptive standard problem andthe adaptive servo problem are stated for vector signals, and thus the RCAC algorithm is fullyMIMO.

The next contribution of this article concerns the modeling data used by RCAC asincorporated in Gf . In particular, we show that Gf serves as a target model for a closed-looptransfer function Gzu,k whose zeros include the zeros of Gzu. The special closed-loop transferfunction Gzu,k arises from the way in which RCAC updates the controller coefficients. Thiscontroller update can be interpreted as a virtual external control perturbation u that is injectedinternally to the control update. We call this intercalated injection.

The intercalated injection of the virtual external control perturbation u gives rise tothe closed-loop transfer function Gzu,k. At the same time, minimization of the retrospectivecost updates the controller coefficients so as to fit Gzu,k to the target model Gf . With thisinsight, it immediately becomes clear why RCAC requires knowledge of the NMP zeros of Gzu.Specifically, if a NMP zero of Gzu is not included in Gf , then RCAC cancels the zero throughfeedback in order to match Gzu,k to Gf . Although the RCAC algorithm is presented in the fullyMIMO case, the construction of Gf in the present article is confined to the case where the controlinput and performance variable are scalar signals.

Since RCAC updates the controller so as to match Gzu,k to the target model Gf , it is

DRAFT

4

straightforward to use Gf to place the closed-loop poles, that is, for adaptive pole placement.Pole placement aside, RCAC also matches the frequency response of Gzu,k to the frequencyresponse of Gf . What is, perhaps, more surprising is the fact that, if Gf is chosen to be afinite impulse response target model and the controller order is chosen to be sufficiently large,then RCAC yields a controller whose closed-loop frequency response matches the frequencyresponse of the closed-loop system given by high-authority LQG synthesis, that is, LQG withzero control weighting and zero measurement-noise covariance. This connection is surprisingdue to the fact that LQG uses a complete and exact plant model, whereas RCAC uses extremelylimited modeling information.

In addition to presenting a self-contained description of the RCAC algorithm, the goalof this article is to illustrate RCAC and investigate the role of Gf through examples. Theseexamples include plants with various types of stability (asymptotically stable, Lyapunov stable,and unstable), various types of control architectures (feedback, feedforward, MRAC, and PID),various problem objectives (stabilization, command following, and disturbance rejection), andvarious types of exogenous signals, including step, ramp, and harmonic commands as well asstep, ramp, harmonic, and stochastic sensor noise and disturbances. In addition, we apply RCACto linear plants with magnitude and rate control saturation, and we test it on nonlinear oscillators.Many of these examples go beyond the theory presented in [39], which is confined to feedbackcontrol of linear plants in the absence of sensor noise and with exogenous signals generatedby Lyapunov-stable exogenous dynamics. Neither broadband disturbances nor saturation areconsidered in [39]. Consequently, these examples provide insight into the performance of RCACin situations that lie outside the current theory. To this end, for several examples we vary theproblem data in order to make the problem progressively more difficult until RCAC fails. Thesetrends highlight the strengths and weaknesses of RCAC, and provide guidelines for applications.

The contents of this article are as follows. In the next section, we present the standardcontrol problem and the servo problem. Next, we define the adaptive standard problem and theadaptive servo problem. We then present the RCAC algorithm, and we discuss the construction ofthe target model Gf , which incorporates the modeling information required by RCAC. We thenshow examples of adaptive pole placement, as well as adaptive control for harmonic commandsand disturbances. Next, we consider stochastic exogenous signals and compare RCAC withhigh-authority LQG. Finally, we investigate the effect of sensor noise and modeling errors, inparticular, erroneous and unmodeled NMP zeros, erroneous relative degree, unmodeled timedelay, and erroneous plant gain.

DRAFT

5

The main text of this article is complemented by five sidebars. These sidebars demonstratePID autotuning using RCAC, compare RCAC to traditional MRAC, review the properties ofhigh-authority discrete-time LQG control, consider control magnitude and rate saturation aswell as integrator windup, and apply RCAC to nonlinear oscillators.

Standard Problem

Consider the standard problem consisting of the discrete-time, linear time-invariant plant

x(k + 1) = Ax(k) +Bu(k) +D1w(k), (1)

y(k) = Cx(k) +D0u(k) +D2w(k), (2)

z(k) = E1x(k) + E2u(k) + E0w(k), (3)

where x(k) ∈ Rn is the state, y(k) ∈ Rly is the measurement, u(k) ∈ Rlu is the control input,w(k) ∈ Rlw is the exogenous input, and z(k) ∈ Rlz is the performance variable. The plant(1)–(3) may represent a continuous-time, linear time-invariant plant sampled at a fixed rate. Thegoal is to develop a feedback or feedforward controller that operates on y to minimize z inthe presence of the exogenous signal w. The components of w can represent either a commandsignal r to be followed, an external disturbance d to be rejected, or sensor noise v that corruptsthe measurement as determined by the choice of D1, D2, and E0. Depending on the application,components of w may or may not be measured, and, for feedforward control, the measuredcomponents of w can be included in y by suitable choice of C and D2. For fixed-gain control,z need not be measured. For adaptive control, however, z is assumed to be measured.

Using the forward shift operator q, we can rewrite (1)–(3) as

z(k) = Gzw(q)w(k) +Gzu(q)u(k), (4)

y(k) = Gyw(q)w(k) +Gyu(q)u(k), (5)

where

Gzw(q)4= E1(qI − A)−1D1 + E0, Gzu(q)

4= E1(qI − A)−1B + E2, (6)

Gyw(q)4= C(qI − A)−1D1 +D2, Gyu(q)

4= C(qI − A)−1B +D0. (7)

Furthermore, the discrete-time, linear time-invariant controller has the form

u(k) = Gc(q)y(k). (8)

Note that q is a time-domain operator that accounts for initial conditions, and, although (6)and (7) are written as transfer functions, these expressions are convenient representations of

DRAFT

6

time-domain dynamics. For pole-zero analysis, q can be replaced by the Z-transform complexvariable z, in which case (4), (5), and (8) do not account for the initial conditions. Figures 1and 2 illustrate (4)–(8).

The closed-loop transfer function from the exogenous signal w to the performance variablez is given by

Gzw4= Gzw +GzuGc(I −GyuGc)

−1Gyw. (9)

We refer to the poles of Gzw as the closed-loop poles, and the transmission zeros of Gzw asthe closed-loop zeros. In the case where y, z, u, and w are scalar signals, (6)–(7) and (9) can bewritten as

Gzw =Nzw

D, Gzu =

Nzu

D, Gyw =

Nyw

D, Gyu =

Nyu

D, (10)

Gzw =Nzw

Dzw

=NzuNywNc +Nzw(DDc −NyuNc)

D(DDc −NyuNc), (11)

where

Gc =Nc

Dc

. (12)

We assume that D and Dc are monic. In the case where y = z, that is, C = E1, D0 = E2, andD2 = E0, (10) and (11) can be written as

Gzw = Gyw =Nw

D, Gzu = Gyu =

Nu

D, Gzw =

NwDc

DDc −NuNc

. (13)

In the case where w is matched with u, that is, B = D1, E0 = E2, and D0 = D2, (10) and (11)can be written as

Gzw = Gzu =Nz

D, Gyw = Gyu =

Ny

D, Gzw =

NzDc

DDc −NyNc

. (14)

In the case where y = z and w is matched with u, (10) and (11) can be written as

G4= Gzw = Gyw = Gzu = Gyu =

N

D, Gzw =

NDc

DDc −NNc

. (15)

For examples where y = z and w is matched with u, we use G to define the plant (1)–(3);otherwise, we use the state space representation.

DRAFT

7

Servo Problem

As a special case of the standard problem, we consider the discrete-time, linear time-invariant plant

x(k + 1) = Ax(k) +Bu(k) + D1d(k), (16)

y0(k) = Cx(k) + D0u(k), (17)

yn(k) = y0(k) + v(k), (18)

e0(k) = r(k)− y0(k), (19)

en(k) = r(k)− yn(k), (20)

where x(k) ∈ Rn is the state, yn(k) ∈ Rly is the measurement, u(k) ∈ Rlu is the control input,d(k) ∈ Rld is the disturbance, r(k) ∈ Rly is the command, v(k) ∈ Rly is the sensor noise, anden(k) ∈ Rly is the performance variable. We can rewrite (17) in terms of q as

y0(k) = Gu(q)u(k) +Gd(q)d(k), (21)

where

Gu(q)4= C(qI − A)−1B + D0, Gd(q)

4= C(qI − A)−1D1. (22)

Furthermore, the linear time-invariant controller has the form

u(k) = Gc(q)en(k). (23)

The measured error signal en is the difference between the command r and the measurementyn, which may be corrupted by noise. Since only the measured error available for feedback,it serves as the performance variable within RCAC. However, the true error signal e0, whichis the difference between the command r and the plant output y0, provides a true measure ofthe command-following performance. Since this signal is not available for feedback, it is usedonly as a diagnostic. If, however, sensor noise is absent, then en and e0 are identical. Figure 3illustrates (21)–(23). The servo problem is a special case of the standard problem with

w =

r

d

v

, y = en, z = en, (24)

and

D1 = [0 D1 0], C = E1 = −C, D0 = E2 = −D0, (25)

D2 = [Ily 0 − Ily ], E0 = [Ily 0 0], (26)

Gzw = [Ily −Gd 0], Gzu = −Gu, (27)

Gyw = [Ily −Gd − Ily ], Gyu = −Gu. (28)

DRAFT

8

In the case where d and u are colocated, it follows that D1 = B and D0 = 0, and thus Gd = Gu.In this case, we define G 4

= Gd = Gu. However, w is not necessarily matched with u. Moreover,if r = v = 0 and d and u are colocated, then w is matched with u. For examples where d

and u are colocated, we use G to define the plant (16)–(20); otherwise, we use the state spacerepresentation.

Retrospective Cost Adaptive Control Algorithm

Adaptive Standard Problem and Adaptive Servo Problem

Figure 4 shows the adaptive standard problem, which is the standard problem with anadaptive controller, while Figure 5 shows the adaptive servo problem, which is the servo problemwith an adaptive controller. Note that, for the adaptive servo problem, it is desirable to minimizethe true error e0. However, since e0 is not available, RCAC minimizes the measured error en,which may be corrupted by noise, as shown in Figure 3. For the adaptive standard problem,z = en.

Controller Structure

Define the dynamic compensator

u(k) =nc∑

i=1

Pi(k)u(k − i) +nc∑

i=kc

Qi(k)y(k − i), (29)

where Pi(k) ∈ Rlu×lu and Qi(k) ∈ Rlu×ly are the controller coefficient matrices, and kc ≥ 0.For controller startup, we implement (29) as

u(k) =

0, k < kw,

Φ(k)θ(k), k ≥ kw,(30)

where the regressor matrix Φ(k) is defined by

Φ(k)4=

u(k − 1)...

u(k − nc)

y(k − kc)...

y(k − nc)

T

⊗ Ilu ∈ Rlu×lθ , (31)

DRAFT

9

kw ≥ nc is an initial waiting period during which Φ(k) is populated with data, and the controllercoefficient vector θ(k) is defined by

θ(k)4= vec

[P1(k) · · · Pnc(k) Qkc(k) · · · Qnc(k)

]T

∈ Rlθ , (32)

lθ4= l2unc + luly(nc + 1− kc), “⊗” is the Kronecker product, and “vec” is the column-stacking

operator. Note that kc = 0 allows an exactly proper controller, whereas kc ≥ 1 yields a strictlyproper controller of relative degree of at least kc. In all examples in this article, we use kc = 1,and, unless specified otherwise, we use kw = nc. In terms of q, the time-domain transfer functionof the controller from y to u is given by

Gc,k(q) =(qncIlu − qnc−1P1(k)− · · · − Pnc(k)

)−1 (qnc−kcQkc(k) + · · ·+Qnc(k)). (33)

Note that the coefficients of Gc,k are given by the components of θ(k), which are time-dependent,and thus Gc,k is a linear, time-varying controller. Also, note that (33) is expressed in terms ofthe forward-shift operator q rather than the Z-transform variable z. Consequently, although (33)is written as a transfer function, this expression is merely a convenient representation of thetime-domain operator represented by (29). If y and u are scalar signals, then Gc,k is SISO and(33) can be written as

Gc,k(q) =qnc−kcQkc(k) + · · ·+Qnc(k)

qnc − qnc−1P1(k)− · · · − Pnc(k). (34)

Note that (33) is an infinite impulse response (IIR) controller. By removing u(k −1), . . . , u(k − nc) from (29) and Φ(k), and by modifying the structure of θ, we can enforcea finite impulse response (FIR) controller structure, where

u(k) =nc∑

i=kc

Qi(k)y(k − i). (35)

In this case (33) becomes

Gc,k(q) =1

qnc

(qnc−kcQkc(k) + · · ·+Qnc(k)

). (36)

Retrospective Performance Variable

We define the retrospective performance variable as

z(k, θ)4= z(k) +Gf(q)[Φ(k)θ − u(k)], (37)

where θ ∈ Rlθ and Gf is an nz × nu filter specified below. The rationale underlying (37)is to replace the control u(k) with Φ(k)θ∗, where θ∗ is the retrospectively optimized controller

DRAFT

10

coefficient vector obtained by optimization below. The updated controller thus has the coefficientsθ(k + 1) = θ∗. Consequently, the implemented control at step k + 1 is given by

u(k + 1) = Φ(k + 1)θ(k + 1). (38)

The filter Gf is constructed in section “Virtual External Control Perturbation and Target ModelGf” based on the required modeling information. This filter has the form

Gf4= D−1

f Nf , (39)

where Df is an lz × lz polynomial matrix with leading coefficient Ilz , and Nf is an lz × lu

polynomial matrix. For reasons given below, we refer to Gf as the target model. By definingthe filtered versions Φf(k) ∈ Rlz×lθ and uf(k) ∈ Rlz of Φ(k) and u(k), respectively, (37) can bewritten as

z(k, θ) = z(k) + Φf(k)θ − uf(k), (40)

where

Φf(k)4= Gf(q)Φ(k), uf(k)

4= Gf(q)u(k). (41)

Note that implementation requires kw ≥ max(nc, nf), where nf is the McMillan degree of Gf .

Retrospective Cost

Using the retrospective performance variable z(k, θ) defined by (37), we define thecumulative retrospective cost function

J(k, θ)4=

k∑

i=1

λk−i[zT(i, θ)Rz(i)z(i, θ) + (Gf(Φ(i)θ))TRu(i)Gf(Φ(i)θ)]

+ λk(θ − θ(0))TRθ(θ − θ(0)), (42)

where λ ∈ (0, 1] is the forgetting factor, Rθ ∈ Rlθ×lθ is positive definite, and, for all i ≥ 1,Rz(i) ∈ Rlz×lz is positive definite and Ru(i) ∈ Rlz×lz is positive semidefinite. The performance-variable and control-input weighting matrices Rz(i) and Ru(i) are time-dependent and thus maydepend on present and past values of y, z, and u. For example, choosing Ru(i) to be a functionof z(i)Tz(i) can help prevent unstable pole-zero cancellation in the case of unmodeled NMPzeros [70]. Recursive minimization of (42) is used to update the controller coefficient vector θ.The following result uses recursive least squares to obtain the minimizer of (42).

Proposition: Let P (0) = R−1θ , and, for all k ≥ 1, let θ∗ be the unique global minimizer

of the retrospective cost function (42). Then, θ∗ is given by

θ∗ = θ(k)− P (k)ΦTf (k)Υ−1(k)

[Φf(k)θ(k) + (Rz(k) +Ru(k))−1Rz(k)(z(k)− uf(k))

], (43)

DRAFT

11

P (k + 1) =1

λP (k) − 1

λP (k)ΦT

f (k)Υ−1(k)Φf(k)P (k), (44)

where

Υ(k)4= λ(Rz(k) +Ru(k))−1 + Φf(k)P (k)ΦT

f (k). (45)

Setting θ(k + 1) = θ∗, (43) yields the recursive controller coefficient update equation

θ(k + 1) = θ(k)− P (k)ΦTf (k)Υ−1(k)

[Φf(k)θ(k) + (Rz(k) +Ru(k))−1Rz(k)(z(k)− uf(k))

].

(46)

Note that, if λ = 1, then the covariance P (k) decreases monotonically, and thus the rate ofadaptation decreases. To maintain adaptation in cases where the plant or exogenous signals arechanging, the covariance can be reset using suitable logic. Alternatively, choosing the forgettingfactor λ < 1 prevents monotonic decrease of P (k), but can lead to instability in the presenceof noise and in the absence of persistency [71], [72]. Yet another approach is to include anadditional positive-semidefinite term Q(k) on the right-hand side of (44) of the form

P (k + 1) = P (k) − P (k)ΦTf (k)Υ−1(k)Φf(k)P (k) +Q(k), (47)

where λ = 1 in (45). Note that (47) is the discrete-time Kalman predictor Riccati error-covarianceupdate equation with the dynamics matrix A = Ilθ , output matrix C(k) = Φf(k), and process-noise covariance Q(k) [73]. Consequently, persistency in (47) is determined by the observabilityof the time-varying pair (Ilθ ,Φf), and the corresponding state-estimate update is given by (43).Alternatively, we can also use the discrete-time Kalman filter Riccati error-covariance updateequation

P (k + 1) = P (k) − P (k)ΦTf (k + 1)Υ−1(k)Φf(k + 1)P (k) +Q(k), (48)

where the corresponding state-estimate update is given by

θ(k + 1) = θ(k)− P (k)ΦTf (k + 1)Υ−1(k + 1)

·[Φf(k + 1)θ(k) + (Rz(k) +Ru(k))−1Rz(k)(z(k + 1)− uf(k + 1))

]. (49)

Note that, in (49), the estimate θ(k+ 1) depends on z(k+ 1). Therefore, implementation of (49)requires instantaneous update of the controller coefficient vector. In contrast, in (46), θ(k + 1)

depends on z(k), and thus (46) is implementable.

In practice, θ(0) can be chosen based on an initial design, if one is available. However, forall examples in this article, we initialize θ(0) = 0 in order to reflect the absence of additionalprior modeling information. Furthermore, for all i ≥ 1, we use Rz(i) = Ilz . Note that RCACcan use batch least squares optimization instead of recursive minimization.

DRAFT

12

Virtual External Control Perturbation and Target Model Gf

The target model Gf is a key feature of RCAC. In [38], Gf is viewed as a model of Gzu

that captures the sign of the leading coefficient of Nzu along with the NMP zeros of Gzu. In[39], the analysis of RCAC involves an ideal filter Gf , which is a closed-loop transfer functioninvolving an ideal feedback controller Gc. These insights lead to an alternative interpretation ofGf as a target model for a specific closed-loop transfer function, as demonstrated below.

Using (30), the retrospective performance variable (37) can be written as

z(k, θ) = z(k)−Gf(q)[u(k)− Φ(k)θ]. (50)

It can be seen from (50) that minimizing the cumulative retrospective cost function (42)determines the controller coefficient vector θ that best fits Gf(q)[u(k)−Φ(k)θ] to the performancedata z(k). In terms of the optimal controller coefficient vector θ∗, (50) can be written as

z(k, θ∗) = z(k)−Gf(q)[u(k)− Φ(k)θ∗]. (51)

For convenience, we define

u∗(k)4= Φ(k)θ∗, (52)

u(k)4= u(k)− u∗(k), (53)

so that

u(k) = u∗(k) + u(k). (54)

With this notation, (51) can be written as

z(k, θ∗) = z(k)−Gf(q)u(k). (55)

Using (54) to replace u in Φ by u∗ + u, it follows from (29)–(31) and (52) that u∗ satisfies

u∗(k) =nc∑

i=1

P ∗i u∗(k − i) +

nc∑

i=1

P ∗i u(k − i) +nc∑

i=kc

Q∗i y(k − i). (56)

Note that the actual input to the plant at step k is u(k), which, in (54), is written as the sumof the pseudo control input u∗(k) and the virtual external control perturbation u(k). Althoughthe signals u∗ and u are not explicitly used by RCAC, we now show that they are crucial tounderstanding the role of Gf . From (56) it follows that

u∗(k) = D∗−1c (q)[(qncIlu −D∗c(q))u(k) +N∗c (q)y(k)], (57)

DRAFT

13

where

D∗c(q)4= qncIlu − qnc−1P ∗1 − · · · − P ∗nc

, (58)

N∗c (q)4= qnc−kcQ∗kc + · · ·+Q∗nc

, (59)

G∗c4= D∗−1

c N∗c . (60)

The special closed-loop transfer function from u to z arises from the way in which RCACupdates the controller coefficients. This controller update can be interpreted as a virtual externalcontrol perturbation u that is injected internally to the control update. We call this intercalatedinjection. Figures 6 and 7 show the equivalent transfer function representations of (54) and (57)with u represented as an external input. Figure 6 illustrates the intercalated injection of u insidethe control update.

It follows from (4), (5), (54), and (57) that

z(k) = Gzw(q)w(k) +Gzu(q)[D∗−1c (q)[(qncIlu −D∗c(q))u(k) +N∗c (q)y(k)] + u(k)], (61)

y(k) = Gyw(q)w(k) +Gyu(q)[D∗−1c (q)[(qncIlu −D∗c(q))u(k) +N∗c (q)y(k)] + u(k)]. (62)

Solving (62) for y(k) and substituting y(k) into (61) yields

z(k) = G∗zw(q)w(k) + G∗zu(q)u(k), (63)

where G∗zw is given by (9) with Gc replaced by G∗c , that is,

G∗zw4= Gzw +GzuG

∗c(I −GyuG

∗c)−1Gyw, (64)

and where

G∗zu(q)4= qncGzu(q)[D∗−1

c (q) +G∗c(q)[Ily −Gyu(q)G∗c,(q)]−1Gyu(q)D∗−1c (q)]. (65)

Now assume that y, z, and u are scalar signals. Using the notation in (10), (65) can be writtenas

G∗zu(q) =Nzu(q)qnc

D(q)D∗c(q)+

Nzu(q)N∗c (q)Nyu(q)qnc

D(q)D∗c(q)[D(q)D∗c(q)−Nyu(q)N∗c (q)](66)

=Nzu(q)qnc

D(q)D∗c(q)−Nyu(q)N∗c (q). (67)

It can be seen from (55) that z(k, θ∗) = z(k) − Gf(q)u(k) is the residual of the fitbetween z and the output of the target model Gf with input u. However, (63) shows that G∗zu,whose coefficients are given by θ∗, is the actual transfer function from u(k) to z(k). Therefore,minimizing the retrospective cost function (42) yields the value θ(k+ 1) = θ∗ of θ, and thus the

DRAFT

14

controller Gc,k+1, that provides the best fit of Gf by the transfer function Gzu,k+1 from u to z.In other words, RCAC determines Gc,k+1 so as to optimally fit Gzu,k+1 to Gf .

The intercalated closed-loop transfer function Gzu,k+1 is distinct from the closed-looptransfer function Gzu,k from an external control-input perturbation u to the performance variablez. In the case where y, z, and u are scalar signals, Gzu,k is given by

Gzu,k =NzuDc,k

DDc,k −NyuNc,k

. (68)

The difference between (67) and (68) is the fact that the numerator of (67) has the fixedpolynomial qnc in place of the time-varying polynomial Dc,k in the numerator of (68). Thisdistinction implies that the only NMP zeros in (67) are those arising from Gzu, unlike (68),which includes “time-varying” zeros arising from Dc,k.

Modeling Information Required for Gf

In this section we specify the modeling information required by RCAC. For the standardproblem, this information includes the relative degree, the leading numerator coefficient, andall of the NMP zeros of Gzu. For the adaptive servo problem, it follows from (28) that thisinformation is obtained from Gu = −Gzu. The discussion in this section is confined to the casewhere z and u are scalar signals. Note, however, that y may be a vector signal, and thus thecontrollers based on Gf as specified below may be multiple-input, single-output.

Relative Degree

Since Gzu,k+1 approximates Gf , it is advantageous to choose the relative degree of Gf tobe equal to the relative degree of Gzu,k+1. It follows from (67) that the relative degree of Gzu,k+1

is equal to the relative degree of Gzu. We thus choose the relative degree of Gf to be equal tothe relative degree of Gzu. This choice requires knowledge of the relative degree dzu of Gzu

[39].

NMP Zeros

In [39], the target model Gf is chosen such that the roots of Nf include the NMP zeros ofGzu. As can be seen from (67), a key feature of Gzu,k+1 is the factor Nzu in its numerator. Thismeans that, since RCAC adapts Gc,k so as to match Gzu,k+1 to Gf , RCAC may cancel NMPzeros of Gzu that are not included in the roots of Nf in order to remove them from Gzu,k+1.This observation motivates the need to include all of the NMP zeros of Gzu in Nf . As an aside,

DRAFT

15

Example PP1 shows that RCAC cancels all of the minimum-phase zeros of Gzu that are notincluded in the roots of Nf in order to remove them from Gzu,k+1.

Markov Parameters

In [37], [38], Gf is based on the Markov parameters of Gzu. In particular, for each complexnumber z whose absolute value is greater than the spectral radius of A, it follows that Gzu hasthe Laurent expansion

Gzu(z) = E1(zI − A)−1B =∞∑

i=dzu

Hi

zi, (69)

where H04= E2 and, for all i ≥ 1, the ith Markov parameter of Gzu is given by

Hi4= E1A

i−1B. (70)

As shown in [38], a sufficiently large number n > dzu of Markov parameters in a truncation of(69) yields an FIR target model Gf(z) =

∑ni=dzu

Hizi whose zeros approximate the NMP zeros of

Gzu with absolute value greater than the spectral radius of A. In addition, every truncation of(69) with n ≥ dzu has the correct relative degree, that is, the relative degree of Gzu. Note that,since Gzu = Nzu

Dand D is monic, Hdzu is the leading numerator coefficient of Gzu.

FIR Target Model

In the case where Gzu is minimum phase, we define the FIR target model

Gf(q)4=Hdzu

qdzu. (71)

This choice of Gf requires knowledge of the relative degree dzu of Gzu and the first nonzeroMarkov parameter Hdzu of Gzu. Note that, for the adaptive servo problem, since Gzu = −Gu,it follows that Hdzu = −Hdu , where Hdu is the first nonzero Markov parameter of Gu.

In the case where Gzu is NMP, we define the FIR target model

Gf(q)4=HdzuNzu,u(q)

qdzu+deg(Nzu,u), (72)

where the roots of the monic polynomial Nzu,u are the NMP zeros of Gzu. This choice of Gf

requires knowledge of the relative degree dzu of Gzu, the first nonzero Markov parameter Hdzu

of Gzu, and the NMP zeros of Gzu. In both cases, the relative degree of Gf is equal to therelative degree of Gzu.

DRAFT

16

Adaptive Pole Placement

In this section, we consider adaptive pole placement for the adaptive standard problem andthe adaptive servo problem using RCAC. We begin by defining the IIR target model for poleplacement.

IIR Target Model for Pole Placement

Since Gzu approximates Gf , RCAC attempts to place the poles of Gzu at the locations ofthe poles of Gf . It can be seen from (11) and (67) that the denominator of Gzu is equal to thedenominator of the closed-loop transfer function Gzw. Consequently, RCAC attempts to place theclosed-loop poles at the locations of the poles of Gf . In order to use Gf for pole placement, letDp be a monic polynomial of degree np whose roots are the desired closed-loop pole locations.Then, in the case where Gzu is minimum phase, we define the IIR target model

Gf(q)4=Hdzuqnp−dzu

Dp(q), (73)

and, in the case where Gzu is NMP, we define the IIR target model

Gf(q)4=Hdzuqnp−dzu−deg(Nzu,u)Nzu,u(q)

Dp(q). (74)

The target models (71) and (73) for minimum-phase Gzu along with the target models (72) and(74) for NMP Gzu represent the modeling information required by RCAC.

In the case where np < n + nc, RCAC attempts to place np closed-loop poles at thelocations of the poles of Gf . The remaining n+ nc − np closed-loop poles are placed at eitherthe locations of the minimum-phase zeros of Gzu that are not included in the roots of Nf so thatGzu approximates Gf , or at zero.

Example PP1. Pole placement for the adaptive servo problem. Consider the unstable,minimum-phase plant

G(q) =q2 − 1.4q + 0.85

(q− 1.05)(q2 − 1.6q + 0.89). (75)

Let r be a unit step command, and let d = v = 0. To place five closed-loop poles at 0.3, 0.4, 0.6,

and ±0.1, we use the IIR target model (73) with

Dp(q) = (q− 0.3)(q− 0.4)(q− 0.6)(q2 + 0.01), (76)

and set Rθ = 10−20, Ru = 0, and nc = 4. RCAC asymptotically follows the step command andplaces five closed-loop poles near the locations of the roots of Dp, as shown in Figure 8. Note

DRAFT

17

that the remaining two closed-loop poles cancel the minimum-phase zeros of G, which are notincluded in the target model (73).

Next, we set Rθ = 10−40. Figure 9 shows the locations of the closed-loop poles. Note thatRCAC places the closed-loop poles closer to the desired locations as Rθ is decreased and thusP (0) is increased. �

Example PP2. Pole placement for the adaptive standard problem. Consider the unstable,NMP plant

G(q) =q− 1.2

(q− 1.1)(q− 2). (77)

Let w be a unit step. We apply RCAC with Rθ = 10−30, Ru = 0, and nc = 3. To place fiveclosed-loop poles at 0.1, 0.3, 0.5, and ±0.75, we use the IIR target model (74) with

Dp(q) = (q− 0.1)(q− 0.3)(q− 0.5)(q2 + 0.5625). (78)

Note that the NMP zero of (77) lies between the two unstable poles. It follows from rootlocus analysis and the parity interlacing property [74] that stabilization of (77) requires that thecontroller be unstable [21]. RCAC places five closed-loop poles near the locations of the rootsof Dp, as shown in Figure 10. As expected, Gc,k converges to an unstable controller. �

The examples in this section show how RCAC can be used to assign a subset of theclosed-loop poles, which can be viewed as partial pole placement. Example PP2 demonstratesthis objective for an unstable plant that requires an unstable controller for stabilization. In Effectof Sensor Noise, we investigate the effect of sensor noise on the ability to achieve pole placement.

Adaptive Harmonic Command Following and Disturbance Rejection

In this section, we demonstrate the ability of RCAC to develop internal models ofcommands and disturbances by considering two examples of adaptive harmonic commandfollowing and disturbance rejection. In the first example, we use RCAC to follow harmoniccommands with an IIR feedback controller as well as with combined feedback-feedforwardcontrol. Next, for harmonic disturbance rejection, we investigate the ability of RCAC to readaptto changing disturbance frequencies.

DRAFT

18

Example H1. Harmonic command following for the adaptive servo problem. Considerthe asymptotically stable, minimum-phase plant

G(q) =q− 0.8

(q− 0.95)(q− 0.99). (79)

Let r be the harmonic command r(k) = cosωk, where ω = 0.5 rad/sample, and let d = v = 0.We apply RCAC with Rθ = 0.2, Ru = 0, and nc = 5, and we use the FIR target model (71). Werestrict Gc,k to be an FIR controller of the form (35), (36). Since RCAC cannot develop an internalmodel of the command r due to the FIR structure of Gc,k, the command-following performanceis severely restricted, as shown in Figure 11. The closed-loop system is instantaneously unstableat most time steps up to k = 500.

Next, we allow Gc,k to be an IIR of the form (29), (34). In this case, RCAC automaticallydevelops an internal model in the form of controller poles located on the unit circle at thecommand frequency ω, and asymptotically follows the harmonic command, as shown in Figure12.

Finally, we use combined feedback-feedforward control with decentralized adaptation, asshown in Figure 13, where the feedback controller is FIR and the feedforward controller is IIR.Both controllers are of order nc = 5. In this case, since the controller is feedback-feedforward,RCAC asymptotically follows the harmonic command without developing an internal model, asshown in Figure 14. �

Example H2. Two-tone harmonic disturbance rejection for the adaptive servo problemusing an IIR controller. Consider the asymptotically stable plant

A =

0.9 −0.5625 0

1 0 0

0 1 0

, B =

1

0

0

, D1 =

0 0

1 0

0 1

, (80)

C = [0.78 − 1.18 1], D0 = 0, (81)

where Gu is NMP. Let r = v = 0 and d(k) = [cosω1k cosω2k]T, where ω1 = π8

rad/sampleand ω2 = π

12rad/sample. We apply RCAC with kw = 50, Rθ = 0.01, Ru = 0, and nc = 12,

and we use the FIR target model (72) with an IIR controller. RCAC automatically develops aninternal model of the harmonic disturbance and thus rejects the harmonic disturbance, as shownin Figure 15.

Next, let r(k) = cosω1k, v = 0, and d(k) = [cosω2k 1]T, where ω1 = π15

rad/sampleand ω2 = π

5rad/sample for 1 ≤ k ≤ 2000, and where ω2 = π

8rad/sample for 2000 < k ≤

DRAFT

19

4000. RCAC automatically develops internal models of the command and disturbance, and thusasymptotically follows the harmonic command and rejects the step and harmonic disturbances,as shown in Figure 16. Note that, after the disturbance frequency changes at step k = 2000,RCAC readapts and rejects the disturbance. �

The examples in this section show that, for harmonic command following and harmonicdisturbance rejection, RCAC has the ability to automatically develop an internal model of thecommand and disturbance without knowledge of the spectrum of these signals. In Adaptive PIDControl, we apply RCAC to step-command following using an adaptively tuned PID controller. InRCAC for Model Reference Adaptive Control, we compare RCAC to a continuous-time MRACmethod for minimum-phase and NMP plants.

Adaptive Control with Stochastic w and d

We now apply RCAC to the adaptive standard problem in the case where the exogenoussignal w is stochastic, as well as to the adaptive servo problem in the case where the disturbanced is stochastic. We also compare the closed-loop frequency response and the H2 cost of RCACto high-authority LQG, presented in Discrete-Time LQG Control.

H2 Cost of Strictly Proper Controllers

For the plant (1)–(3), we can compute the H2 cost of an arbitrary stabilizing strictly proper

controller Gc ∼[Ac Bc

Cc 0

]as follows. Defining

D4=

[D1

BcD2

], V = DDT, (82)

the H2 cost is given by

J(Ac, Bc, Cc) = tr(Q1R1) + tr(Q2CTc R2Cc), (83)

where Q1 ∈ Rn×n, Q2 ∈ Rnc×nc satisfy

Q =

[Q1 Q12

QT12 Q2

], (84)

and Q ∈ Rn+nc×n+nc is the solution of the discrete-time Lyapunov equation

Q = AQAT + V , (85)

DRAFT

20

where A is defined in (S24).

High-Authority LQG Target Model

Since RCAC tends to match Gzu to Gf , we choose Gf with the numerator Nzu(q)qnc andthe closed-loop denominator DHA of high-authority LQG in order to construct the high-authorityLQG target model

Gf(q) =Nzu(q)qnc

DHA(q)=

HdzuqmNzu,u(q)

Nzu,u(q−1)Nyw,s(q)Nyw,u(q−1), (86)

where m 4= nc − dzu − dyu. Note that m may be negative. The target model (86) is based on

(S29). By choosing (86), the goal is to compare the performance of RCAC with the performanceof high-authority LQG in the case nc = n. Note that using (86) as the target model requiresknowledge of DHA in addition to the modeling information required by (71) and (72). The useof (86) is thus only for conceptual illustration. In order to avoid using knowledge of DHA, wesubsequently use the FIR target models (71) and (72), but with nc > n.

In all examples in this section, unless specified otherwise, we set Rθ = 10−10, kw = 50,and Ru = 0.

Example SD1. Adaptive control with stochastic w for the adaptive standard problem.Consider the asymptotically stable plant

A =

0.855 1 0 0 0

−0.1715 0.855 −0.4266 −0.3607 0.4952

0 0 0.5 −0.5072 0.6964

0 0 0 0.6716 1

0 0 0 −0.4514 0.6716

, B =

0

0

0

0

1

, D1 =

−0.6269

0.3985

−0.3306

0.4415

0.3794

,

(87)

C = [0.7298 − 0.3954 − 0.3605 0.4003 0.1447], D0 = 0, D2 = 0, (88)

E1 = [0.1717 0.3351 − 0.5294 − 0.4476 0.6145], E0 = 0, E2 = 0, (89)

where Gzu, Gzw, Gyu, and Gyw are NMP. The H2 cost of the LQG controller is 1.059. Theexogenous signal w is zero-mean Gaussian white noise with standard deviation 0.1. We applyRCAC with nc = n, and we use the high-authority LQG target model (86). RCAC places theclosed-loop poles near the high-authority LQG closed-loop poles and approximates the closed-loop frequency response of high-authority LQG, as shown in Figure 17. The H2 cost of theRCAC controller is 1.0591.

DRAFT

21

Next, we show that, for sufficiently large nc > n, RCAC approximates the performanceof the high-authority LQG controller using the FIR target model (72), which uses knowledgeof dzu, Hdzu , and the NMP zeros of Gzu, but no other modeling data and no knowledge ofDHA. We apply RCAC with nc = 4n = 20. In this case, RCAC approximates the closed-loopfrequency response of high-authority LQG, as shown in Figure 18, and the H2 cost of theRCAC controller is 1.061. �

Example SD2. Adaptive control with nonzero-mean, stochastic w for the adaptive standardproblem. Consider the Lyapunov-stable, NMP plant

G(q) =(q− 0.5)(q2 − 1.92q + 1.44)

(q− 1)(q− 0.9)(q2 − 1.62q + 0.81), (90)

and let w be Gaussian white noise with mean 0.1 and standard deviation 0.05. The H2 costof the LQG controller is 28.96. We apply RCAC with nc = 5n = 20, and we use the FIRtarget model (72), which uses no knowledge of DHA. RCAC approximates the closed-loopfrequency response of high-authority LQG except for the internal model of the disturbance bias,which has the form of a notch at DC, as shown in Figure 19. Note that RCAC automaticallydevelops an internal model of the disturbance bias. The H2 cost of the RCAC controller is 35.79.�

Example SD3. Harmonic command following and stochastic disturbance rejection forthe adaptive servo problem. Consider the asymptotically stable, minimum-phase plant

G(q) =(q2 − 1.7q + 0.785)(q2 − 1.4q + 0.85)

(q− 0.5)(q2 − 1.8q + 0.97)(q2 − 1.4q + 0.98). (91)

The H2 cost of the LQG controller is 1.072. Let r be the harmonic command r(k) = cosωk,where ω = 0.8 rad/sample, let d be zero-mean Gaussian white noise with standard deviation0.01, and let v = 0. We apply RCAC with nc = 8n = 40, and we use the FIR target model(71), which uses no knowledge of DHA. RCAC asymptotically follows the harmonic commandand approximates the closed-loop frequency response of high-authority LQG except at thecommand frequency due to the internal model of the command, which has the form of a notchat the command frequency, as shown in Figure 20. As in Example SD2, RCAC automaticallydevelops an internal model in response to the harmonic command. The H2 cost of the RCACcontroller is 1.15. �

The examples in this section show that, as Gc,k adapts, the frequency response of Gzu

tends to the frequency response of Gf . It is also shown that, for sufficiently large nc > n,the frequency response of the closed-loop transfer function Gzw obtained from RCAC with theFIR target models (71) and (72) approximates the closed-loop frequency response and H2 cost

DRAFT

22

of high-authority LQG. In addition, for command following and disturbance rejection, RCACmatches the frequency response of high-authority LQG at nearly all frequencies, apart from thecommand frequency, where RCAC places an internal model.

Effect of Sensor Noise

In all of the examples considered so far in this article, the measurement y is not corruptedby noise. In contrast, the examples in this section consider the adaptive servo problem withstochastic sensor noise v. Difficulties arising from sensor noise in adaptive control systems areconsidered in [40], [75].

Example SN1. Pole placement for the adaptive servo problem with sensor noise. Considerthe unstable, minimum-phase plant

G(q) =q2 − 1.4q + 0.85

(q− 1.1)(q2 − 1.8q + 1.06). (92)

Let r be the harmonic command r(k) = cosωk, where ω = 0.3 rad/sample, and let d = v = 0.We apply RCAC with Rθ = 10−10, Ru = 0, and nc = 6. To place four closed-loop poles at 0,0.5, and ±0.1, we use the IIR target model (73) with

Dp(q) = q4 − 0.5q3 + 0.01q2 − 0.005q. (93)

RCAC places closed-loop poles near the locations of the roots of Dp, as shown in Figure 21.Now let v be zero-mean Gaussian white noise with standard deviation σ. For σ = 0.1, RCACfails to place closed-loop poles near the locations of the roots of Dp, but stabilizes the plant atstep k = 500. For σ = 0.2, RCAC fails to place closed-loop poles near the locations of the rootsof Dp, and the closed-loop system is unstable at step k = 500 (not shown). �

Example SN2. Stochastic disturbance rejection for the adaptive servo problem with sensornoise. Consider the asymptotically stable, minimum-phase plant

A =

0.9 1 0 0 0

0 0.95 −0.1507 −0.1674 0.5189

0 0 0.65 1 0

0 0 −0.4225 0.65 0.434

0 0 0 0 0.95

, B = D1 =

0

0

0

0

1

, (94)

C = E1 = [0.1596 0.3991 − 0.2405 − 0.2672 0.8283], D0 = E0 = E2 = 0, D2 = 1.

(95)

DRAFT

23

Let d and v be zero-mean Gaussian white noise signals with standard deviation 1. We applyRCAC with kw = 50, Rθ = 10−20, Ru = 0, and nc = 4n = 20, and we use the FIR target model(71). Instead of approximating the closed-loop frequency response of high-authority LQG, RCACapproximates the closed-loop frequency response of the LQG controller designed for the actualsensor noise level, namely, V2 = 1, as shown in Figure 22. �

Example SN3. Step command following and stochastic disturbance rejection for theadaptive servo problem with sensor noise. Consider the asymptotically stable, minimum-phaseplant

A =

0.8882 1 0 0 0

−0.1715 0.8882 −0.3624 −0.3238 0.6679

0 0 0.693 1 0

0 0 −0.4802 0.693 0.7276

0 0 0 0 0.5

, B =

0

0

0

0

1

, D1 =

0.5537

0.0603

−0.5457

0.5596

−0.2806

,

(96)

C = [0.2158 0.4234 − 0.3861 − 0.3449 0.7115]. D0 = 0. (97)

Let r be a unit step command, let d be zero-mean Gaussian white noise with standard deviation0.05, and let v be zero-mean Gaussian white noise with standard deviation 0.025. In order toaccount for the standard deviation of the sensor noise, it follows from (26) that

V2 = 0.025D2DT2 = 0.025[1 0 − 1][1 0 − 1]T = 0.05. (98)

We apply RCAC with Rθ = 10−10, Ru = 0, and nc = 8n = 40, and we use the FIR targetmodel (71). RCAC asymptotically follows the step command and approximates the closed-loopfrequency response of LQG except at DC due to the internal model of the command, which hasthe form of a notch at DC, as shown in Figure 23. For this example, RCAC approximates theclosed-loop frequency response of the LQG controller designed for the actual sensor noise level,namely, V2 = 0.05. �

The examples in this section investigate the effect of sensor noise on the closed-loopperformance of RCAC. Example SN1 shows that, as the level of the sensor noise increases,RCAC fails to place closed-loop poles near the target locations, but does stabilize the system.Examples SN2 and SN3 show that RCAC approximates the closed-loop frequency response ofLQG in the presence of sensor noise.

DRAFT

24

Robustness to Erroneous and Unmodeled NMP Zeros

As shown by the construction of Gf given by (71)–(74), the modeling information requiredby RCAC is dzu, Hdzu , and the NMP zeros of Gzu. In this section we investigate the effect oferroneous and unmodeled NMP zeros by considering four examples. In the first example, weconsider erroneous estimates of real and complex NMP zeros for the adaptive servo problem.Next, we consider an unmodeled change in the location of the NMP zero during operation. In thethird example, we consider unmodeled NMP sampling zeros for a sampled-data system. Finally,we consider an unstable plant with an unmodeled NMP zero.

Example NMP1. Erroneous NMP-zero estimates for the adaptive servo problem. Considerthe unstable, NMP plant

G(q) =(q− 0.95)(q− 0.975)(q− 1.1)

(q− 0.99)(q− 1.01)(q2 − 1.4q + 1.13). (99)

Let r be the harmonic command r(k) = cosωk, where ω = 0.01 rad/sample, and let d = v = 0.We use the target model Gf(q) = Hdzu(q − 1.265)/q2, where the estimate 1.265 of the NMPzero 1.1 is erroneous by 15%, and set Rθ = 0.1, Ru = 0, and nc = 12. Figure 24 shows thecommand-following performance. Figure 25 shows the NMP-zero estimates for which RCACasymptotically follows the harmonic command as well as the NMP-zero estimates for which theclosed-loop system becomes unstable. Figure 25 shows that, for the plant (99), RCAC is morerobust to overestimation of the NMP zero than underestimation.

Next, let d be zero-mean Gaussian white noise with standard deviation 0.5. Figure 25 alsoshows the NMP-zero estimates for which RCAC asymptotically follows the harmonic commandas well as the NMP-zero estimates for which the closed-loop system becomes unstable. Figure25 shows that, for the plant (99), RCAC is less robust to erroneous NMP-zero estimates in thecase of stochastic disturbances than in the case of harmonic commands.

Next, consider the asymptotically stable plant with complex NMP zeros given by

G(q) =q2 − 2.1q + 1.1125

(q− 0.999)(q2 − 1.9q + 0.9125). (100)

Let r be the harmonic command r(k) = cosωk, where ω = 0.01 rad/sample, and let d = v = 0.We use the target model Gf(q) = Hdzu(q − ξ1)(q − ξ2)/q2, where ξ1 and ξ2 are estimates ofthe NMP zeros 1.05± 0.1. Figure 26 shows the locations of the NMP-zero estimates for whichRCAC asymptotically follows the harmonic command, as well as the NMP-zero estimates forwhich the closed-loop system becomes unstable. �

DRAFT

25

Example NMP2. Unmodeled change in the NMP zeros for the adaptive standard problem.Consider the asymptotically stable, minimum-phase plant

G(q) =(q− 0.5)(q2 − 1.92q + 1.44)

(q− 0.35)(q− 0.6)(q2 − 0.8q + 0.32), (101)

and let w be zero-mean Gaussian white noise with standard deviation 0.01. We apply RCACwith Rθ = 10−5, Ru = 0.1z2, and nc = 4, and we use the IIR target model (74) with theroots of Dp(q) chosen as the closed-loop poles of high-authority LQG. RCAC approximatesthe closed-loop frequency response of high-authority LQG, as shown in Figure 27. At stepk = 5000, the plant dynamics change so that the NMP zeros move from 0.96 ± 0.72 to0.99 ± 1.38. If Gc,k is fixed to be Gc,5000, then the closed-loop system becomes unstable.However, under adaptation, the plant is restabilized, and RCAC approximates the closed-loopfrequency response of high-authority LQG for the modified plant, as shown in Figure 27. �

Example NMP3. Unmodeled NMP sampling zero for the adaptive servo problem.Consider the asymptotically stable, continuous-time plant T (s) = Λ(s)T0(s), where

Λ(s) =229

(s− 15 + 2)(s− 15− 2), T0(s) =

2

s+ 1, (102)

where Λ(s) represents unmodeled high-frequency dynamics [40]. Since the relative degree ofT0(s) is 1, the discrete-time sampled-data plant G0(z) obtained by discretizing T0(s) does notyield any sampling zeros [76]. However, since the relative degree of T (s) is 3, the discrete-timesampled-data plant G(z) obtained by discretizing T (s) possesses two sampling zeros due to Λ(s).It is shown in [41] that, if the sampling period h . 0.2, then one of the sampling zeros is NMP.Since Λ(s) represents unmodeled dynamics, neither the presence nor the location of this NMPzero can be assumed to be known. We choose h = 0.1, in which case G has a NMP samplingzero at −2.1481 as well as a minimum-phase sampling zero at −0.0672. Let r be the harmoniccommand r(k) = cosωk, where ω = 0.2 rad/sample, and let d = v = 0. We apply RCAC withRθ = 10, Ru = 0.1z2, and nc = 10, and we use the minimum-phase FIR target model (71).RCAC avoids unstable pole-zero cancellation and asymptotically follows the command withoutknowledge of the unmodeled NMP zero, as shown in Figure 28. �

Example NMP4. Unmodeled NMP zero for the adaptive servo problem. Consider theunstable, NMP double integrator

G(q) =q− 1.15

(q− 1)2. (103)

DRAFT

26

Let r be the harmonic command r(k) = cosωk, where ω = 0.55 rad/sample, and let d =

v = 0. We apply RCAC with Rθ = 104, Ru = z2, and nc = 6. Assuming that the NMPzero of G is unmodeled, we use the FIR target model (71). Figure 29 shows the command-following performance. As z becomes unbounded, the term

∑ki=1 λ

k−iθTΦTf (i)Ru(i)Φf(i)θ in

(42) dominates the remaining terms, and u converges to 0. Therefore, the closed-loop systemreverts to the unstable open-loop plant, and RCAC does not follow the harmonic command. Forthis example, RCAC is not robust to unmodeled NMP zeros for unstable plants, despite usingthe performance-dependent control weighting Ru = z2. �

The examples in this section investigate the robustness of RCAC to modeling errors in thetarget model. For Example NMP1, RCAC is less robust to erroneous NMP-zero estimates in thecase of stochastic disturbance rejection than in the case of harmonic command following. ForExample NMP2 where the NMP zeros are uncertain, performance-dependent cost regularization,that is, choosing Ru to be a function of z, improves the robustness of RCAC to uncertainty inthe location of the NMP zero. For Rohrs’s counterexample involving unmodeled dynamics [40],Example NMP3 shows that, with performance-dependent regularization, RCAC stabilizes theclosed-loop system. Example NMP4 shows that unstable plants with uncertain NMP zeros areespecially difficult for RCAC to control.

Robustness to Erroneous Relative Degree and Unmodeled Delay

In this section we investigate the effect of erroneous relative degree and unmodeled delaysby considering three examples that involve critical changes in the plant during operation. Inthe first example, we consider the effect of erroneous estimates of the relative degree of Gzu.Next, we consider unmodeled time delays both in the initial modeling of the system and duringoperation. Finally, we apply RCAC to a plant with limited achievable delay margin, and weinvestigate the effect of unmodeled time delays exceeding the delay margin. In all cases, weprovide design guidelines for each type of modeling error.

Example DM1. Erroneous dzu for the adaptive servo problem. Consider the asymptoti-cally stable, minimum-phase plant

G(q) =q− 0.95

(q− 0.85)(q2 − 1.5q + 0.985). (104)

Let r be the harmonic command r(k) = cosωk, where ω = 0.15 rad/sample, and let d = v = 0.We apply RCAC with Rθ = 10, Ru = 0, and nc = 6. For G given by (104), dzu = 2. We

DRAFT

27

use the FIR target model (71), but with dzu replaced by dzu, where dzu is an estimate of dzu.Figure 30 shows the command-following performance for dzu = 1, dzu = 3, and dzu = 4. Fordzu = 3 and dzu = 4, RCAC asymptotically follows the harmonic command. For dzu = 1,however, RCAC does not follow the harmonic command, and the plant output y0 diverges.RCAC asymptotically follows the command for 2 ≤ dzu ≤ 4. Moreover, by using Ru = z2,RCAC asymptotically follows the command for 1 ≤ dzu ≤ 10. For this example, RCAC ismore robust to overestimation of dzu than underestimation of dzu. Note that dzu > dzu accountsfor an unmodeled time delay of dzu − dzu steps in the sense that, if the plant experiences anunmodeled time delay of dzu− dzu steps, then dzu is the true relative degree. The next exampleconsiders time delay directly. �

Example DM2. Unmodeled time delay for the adaptive servo problem. Consider theasymptotically stable, minimum-phase plant G = GTDG0, where

GTD(q)4= q−kd , G0(q) =

q− 0.9

(q− 0.95)(q2 − 1.6q + 0.89), (105)

and GTD represents an unmodeled time delay of kd steps. Let r be the harmonic commandr(k) = cosωk, where ω = 0.1 rad/sample, and let d = v = 0. We apply RCAC with Rθ = 100,Ru = 0.1z2, and nc = 10. Since GTD is unmodeled, we use the FIR target model (71) based onG0. Figure 31 shows the command-following error e0 for kd = 1, kd = 2, kd = 3, and kd = 4.RCAC asymptotically follows the harmonic command in each case.

Next, let r(k) = cosωk, where ω = 0.5 rad/sample, and let d = v = 0. We apply RCACwith Rθ = 0.1, Ru = 0.1z2, and nc = 15, and we use the FIR target model (71). At stepk = 15000, an unmodeled 1-step time delay is inserted into the loop. At step k = 30000,an additional unmodeled 2-step time delay is inserted, and, at step k = 60000, an additionalunmodeled 6-step time delay is inserted. Table 1 shows the magnitude crossover frequency ωmco,the phase margin PM, and the delay margin DM prior to each insertion of additional delays,where

DM =PMωmco

π

180, (106)

where the units of PM are degrees and the units of ωmco are rad/sample. Note that each timedelay exceeds the delay margin at the time step of insertion. In each case, RCAC re-adapts andrestabilizes the closed-loop system, as shown in Figure 32. After the third time delay is insertedin the loop at k = 60000, RCAC restabilizes the closed-loop system at step k = 100000 (notshown). �

DRAFT

28

k ωmco (rad/sample) PM (deg) DM (steps) Additional UnmodeledDelay Inserted (steps)

15000 1.8872 19.0799 0.1765 130000 1.4003 86.1123 1.0733 260000 0.5742 181.7065 5.5233 6

TABLE 1: Example DM2: Unmodeled time-varying time delay for the adaptive servo problem.The values of magnitude crossover frequency ωmco, phase margin PM, and delay margin DM aredetermined prior to the insertion of additional time delays into the loop. Each additional timedelay inserted at time step k exceeds the delay margin at time step k.

Example DM3. Limited delay margin for the adaptive standard problem. Consider theunstable, minimum-phase, continuous-time plant given by [77]

A =

−0.08 −0.03 0.2

0.2 −0.04 −0.005

−0.06 0.2 −0.07

, B = D1 =

−0.1

−0.2

0.1

, (107)

C = E1 =[

0 −1 0], D2 = 0. (108)

This plant has an unstable pole at 0.1081. It is shown in [77] that the maximum achievabledelay margin for this plant is 18.51 sec. For the standard problem, we discretize (107) and(108) with the sampling period h = 0.1 sec. Using the controller given by (23) in [77] anddiscretizing the continuous-time plant with the sampling period h = 0.1 sec, the delay marginof the discrete-time closed-loop system is 6.07 steps.

Next, we use RCAC with the adaptive standard problem in order to stabilize (107) and(108). We apply RCAC with Rθ = 100, Ru = 0.1z2, and nc = 3, and we use the FIR targetmodel (71). The delay margin of the closed-loop system at step k = 3000 using RCAC is 0.31steps, as shown by Table 2. Figure 33 shows the closed-loop responses for the initial conditionx(0) = [0.1 0.1 0.1]T for both RCAC and the controller given by [77] discretized with thesampling period h = 0.1 sec. At step k = 3000, an unmodeled 7-step time delay is insertedinto the loop, which destabilizes both closed-loop systems. Under continued adaptation and aprolonged transient response, RCAC restabilizes the closed-loop system at time step k = 11800.�

DRAFT

29

Controller ωmco (rad/sec) PM (deg) DM (steps)[77] 0.3708 129.12 6.07RCAC 2.1834 38.83 0.31

TABLE 2: Margins for the controller from [77] and RCAC at step k = 3000. Note that theRCAC controller has significantly smaller phase margin and delay margin than the optimizedcontroller given in [77].

Robustness to Erroneous Plant Gain

In this section we investigate the effect of the estimate of the plant gain as determined bythe leading numerator coefficient by considering three examples that involve either erroneousestimates of the plant gain or unmodeled changes in the plant gain. In the first example, weconsider erroneous estimates of Hdzu for the adaptive servo problem. Next, we compare the gainmargin of RCAC to high-authority LQG, and consider an unmodeled change in the plant gain.Finally, we consider a plant with limited achievable gain margin for both high-authority LQGand RCAC, and investigate the effect of unmodeled changes in the plant gain for this system.

Example GM1. Erroneous Hdzu for the adaptive servo problem. Consider theasymptotically stable, minimum-phase plant (104), let r be a unit step command, and letd = v = 0. We apply RCAC with Rθ = 10, Ru = 0, and nc = 5. We use the FIR target model(71) with Hdzu replaced by Hdzu , where Hdzu is an estimate of the true value Hdzu = 1. Figure34 shows the command-following performance for Hdzu = −1, Hdzu = 0.1, and Hdzu = 10.RCAC asymptotically follows the command for Hdzu = 0.1 and Hdzu = 10, but not forHdzu = −1. For this example, RCAC is robust to errors in the magnitude of the estimate ofHdzu , but is not robust to errors in the sign of Hdzu . �

Example GM2. Unmodeled change in the static gain for the adaptive standard problem.Consider the asymptotically stable, NMP plant

G(q) =(q2 − 1.7q + 0.785)(q2 − 1.4q + 0.85)

(q− 0.5)(q2 − 1.8q + 0.97)(q2 − 1.4q + 0.98), (109)

and let w be zero-mean Gaussian white noise with standard deviation 0.01. We apply RCACwith Rθ = 10−5, Ru = 0, and nc = 10, and we use the FIR target model (72). Figure 35 showsthat RCAC approximates the closed-loop frequency response of high-authority LQG. At stepk = 5000, the closed-loop system has a gain margin of 1.8105 at the phase crossover frequency

DRAFT

30

ωpco = 0 rad/sample. At step k = 5000, G is replaced by 2.9G, where the additional gain 2.9is unmodeled. If Gc,k is fixed to be Gc,5000, then the closed-loop system becomes unstable.However, under adaptation, the plant is restabilized, and RCAC approximates the closed-loopfrequency response of LQG for the modified plant, as shown in Figure 35. �

Example GM3. Severely limited gain margin for the adaptive standard problem. Considerthe unstable, minimum-phase, continuous-time plant from [9] given by

A =

[1 1

0 1

], B =

[0

1

], D1 =

[1

1

], (110)

C = E1 =[

1 1], D2 = 1. (111)

For the standard problem, we discretize (110) and (111) with the sampling period h = 0.01

sec. We apply RCAC with Rθ = 10−10, Ru = 0, nc = 6, and we use the FIR target model (72).Figure 36 shows that RCAC approximates the closed-loop frequency response of high-authorityLQG. The LQG controller yields a gain margin of 0.04, and the RCAC controller yields again margin of 0.0012. Hence both controllers yield small gain margins. At step k = 10000,G is replaced by 1.1G, where the additional gain 1.1 is unmodeled. If Gc,k is fixed to beGc,10000, then the closed-loop system becomes unstable. However, under adaptation, the plantis restabilized, as shown in Figure 36. �

In this section and the previous section, we applied RCAC to a collection of examplesinvolving plants that are practically impossible to control using fixed-gain controllers due toextremely small gain and phase margins. Plants of this type are viewed in [8] as potentiallyproblematic for adaptive control as well. At convergence, the closed-loop systems possess smallgain or phase margin, as expected, and thus the insertion of additional gain or time delay causedinstability. For these examples, however, RCAC was able to re-adapt and restabilize the closed-loop system.

DRAFT

31

Discussion

This expository article presented a self-contained description of the RCAC algorithm foradaptive control. Control objectives include stabilization, command following, and disturbancerejection. RCAC is based on a recursive least squares procedure for updating the coefficients ofa linear controller for feedback or feedforward control, where all signals are vectors.

A key contribution of this article is the demonstration that retrospective cost optimizationupdates the controller coefficients so as to match the intercalated closed-loop transfer functionGzu,k to a target model Gf . It was shown that Gzu,k is the transfer function from the virtualexternal controller perturbation u to the performance variable z. The special nature of Gzu,k isdue to the fact that u enters the feedback loop through intercalated injection, which means thatu is injected internally to the controller as opposed to simply being added to the control input.

The target model Gf is selected by the user, and the choice of Gf is guided by its rolein the controller adaptation. In particular, since RCAC tends to match Gzu,k to the target modeland, since the target model possesses the NMP zeros of Gzu, the NMP zeros must be reproducedin the target model; otherwise, RCAC may cancel them, resulting in a hidden instability. Thismodeling information, along with the relative degree of Gzu and its leading numerator coefficient,constitutes the basic modeling information required by RCAC. These statements apply to thecase where Gzu is SISO.

The role of the target model was examined from various angles. First, it was shown that,in the absence of sensor noise and control weighting Ru, RCAC tends to match the closed-loopfrequency response of the high-authority LQG controller. This connection is surprising in viewof the fact that RCAC uses extremely limited modeling information relative to LQG. In effect,RCAC uses data to compensate for missing or erroneous modeling information. In addition tomatching closed-loop properties of the LQG controller, RCAC can be used for adaptive poleplacement by choosing the poles of the target model as the desired closed-loop spectrum. Forthe adaptive servo problem, RCAC is used to follow step and harmonic commands as well asto reject step, harmonic, and broadband disturbances.

For some adaptive control algorithms, sensor noise may produce bursting and gaindivergence [75]. To assess the performance of RCAC in the presence of sensor noise, it wasshown that RCAC matches the closed-loop frequency response of the LQG controller synthesizedfor the actual sensor noise variance. Stabilization, command following, and disturbance rejectionproblems were also considered with broadband sensor noise with and without bias. For several

DRAFT

32

examples, especially involving plants that are not asymptotically stable, the level of sensor noisewas increased until RCAC failed to either follow the command or stabilize the plant. In allexamples, no bursting was observed.

We then examined the performance of RCAC in the off-nominal case, that is, the case wherethe NMP zeros, relative degree, and leading numerator coefficient are uncertain. In the case wherethe NMP zeros are uncertain, it was shown that performance-dependent cost regularization, thatis, choosing Ru to be a function of z, improves the robustness of RCAC to uncertainty in thelocation of the NMP zero. Likewise, in the case where the relative degree is uncertain, it wasshown that performance-dependent cost regularization improves the robustness of RCAC. Thisuncertainty was related to uncertain time delay, that is, unknown latency in the feedback loop.

The examples in this article provide insight and guidelines for the application of RCAC.In particular, Example PP1 shows that RCAC can be used to assign a subset of the closed-looppoles, and that the remaining closed-loop poles may cancel unmodeled minimum-phase zeros.This example also shows that the accuracy of pole placement depends on the choice of Rθ, whichinitializes the RLS covariance. Example H2 shows that, for harmonic command following andharmonic disturbance rejection, RCAC has can develop an internal model of the command anddisturbance without knowledge of the spectrum of the exogenous dynamics. Under stochasticdisturbances, Example SD1 shows that, by choosing nc >> n, RCAC tends to match the closed-loop frequency response of high-authority LQG. For Example NMP1, RCAC is less robust toerroneous NMP-zero estimates in the case of stochastic disturbance rejection than in the case ofharmonic command following. For Example NMP3, which is Rohrs’s counterexample involvingunmodeled dynamics [40], RCAC with performance-dependent cost regularization stabilizes theclosed-loop system. Example NMP4 shows that unstable plants with uncertain NMP zeros areespecially difficult for RCAC to control. For a plant with severely limited delay margin, ExampleDM3 illustrates the ability of RCAC to readapt in the case where the margin is exceeded dueto the introduction of an unmodeled time delay. Example GM3 illustrates the correspondingproperty for the case of limited gain margin. The ability of RCAC to restabilize the plant afterunmodeled destabilizing time delays and gains are inserted into the loop suggests that the viewsexpressed in [8] and quoted in the opening section of this article may be overly pessimistic.

DRAFT

33

References

[1] K. J. Astrom and P. R. Kumar, “Control: A Perspective,” Automatica, vol. 50, no. 1, pp.3–43, 2014.

[2] J. Hathout, M. Fleifil, A. Annaswamy, and A. Ghoniem, “Role of Actuation in CombustionControl,” Comb. Sci. Tech., vol. 167, no. 1, pp. 57–82, 2001.

[3] M. Pastoor, L. Henning, B. R. Noack, R. King, and G. Tadmor, “Feedback Shear LayerControl for Bluff Body Drag Reduction,” J. Fluid Mech., vol. 608, pp. 161–196, 2008.

[4] N. Gautier, J.-L. Aider, T. Duriez, B. R. Noack, M. Segond, and M. Abel, “Closed-loopSeparation Control Using Machine Learning,” J. Fluid Mech., vol. 770, pp. 442–457, 2015.

[5] S. L. Brunton and B. R. Noack, “Closed-loop Turbulence Control: Progress and Challenges,”Appl. Mech. Rev., vol. 67, pp. 050 801–1–050 801–48, 2015.

[6] G. Ambrosino, M. Ariola, G. De Tommasi, and A. Pironti, “Plasma Vertical Stabilizationin the ITER Tokamak via Constrained Static Output Feedback,” IEEE Trans. Contr. Sys.Tech., vol. 19, pp. 376–381, 2011.

[7] M. M. Seron, J. H. Braslavsky, and G. C. Goodwin, Fundamental Limitations in Filteringand Control. Springer, 1997.

[8] B. D. O. Anderson and A. Dehghani, “Challenges of Adaptive Control–Past, Permanentand Future,” Ann. Rev. in Contr., vol. 32, pp. 123–135, 2008.

[9] J. C. Doyle, “Guaranteed Margins for LQG Regulators,” IEEE Trans. Autom. Control,vol. 23, pp. 756–757, 1978.

[10] K. Narendra and A. Annaswamy, “A New Adaptive Law for Robust Adaptive ControlWithout Persistent Excitation,” IEEE Trans. Autom. Contr., vol. AC-32, pp. 134–145, 1987.

[11] P. A. Ioannou and J. Sun, Robust Adaptive Control. Prentice Hall, 1996.[12] A. M. Annaswamy and J. Wong, “Adaptive Control in the Presence of Saturation Non-

linearity,” Int. J. Adapt. Contr. Signal Proc., vol. 11, no. 1, pp. 3–19, 1997.[13] K. S. Narendra and J. Balakrishnan, “Adaptive Control Using Multiple Models,” IEEE

Trans. Autom. Control, vol. 42, no. 2, pp. 171–187, 1997.[14] G. Tao, Adaptive Control Design and Analysis. Wiley, 2003.[15] N. T. Nguyen, K. S. Krishnakumar, and J. D. Boskovic, “An Optimal Control Modification

to Model-Reference Adaptive Control for Fast Adaptation,” in Proc. AIAA Guid. Nav. Contr.Conf., Honolulu, HI, August 2008, AIAA-2008-7283.

[16] “Combined/Composite Model Reference Adaptive Control,” in Proc. AIAA Guid. Nav.Contr. Conf., Chicago, IL, August 2009, AIAA-2009-6065.

[17] N. Hovakimyan and C. Cao, L1 Adaptive Control Theory: Guaranteed Robustness withFast Adaptation. SIAM, 2010.

[18] G. Chowdhary, T. Yucelen, M. Muhlegg, and E. N. Johnson, “Concurrent Learning Adaptive

DRAFT

34

Control of Linear Systems with Exponentially Convergent Bounds,” Int. J. Adapt. Contr.Signal Proc., vol. 27, no. 4, pp. 280–301, 2013.

[19] G. C. Goodwin, P. J. Ramadge, and P. E. Caines, “Discrete-Time Multi-Variable AdaptiveControl,” IEEE Trans. Autom. Control, vol. 25, pp. 449–456, 1980.

[20] K. J. Astrom, “Direct Methods for Nonminimum Phase Systems,” in Proc. Conf. Dec.Contr., Albuquerque, NM, December 1980, pp. 611–615.

[21] G. C. Goodwin and K. S. Sin, “Adaptive Control of Nonminimum Phase Systems,” IEEETrans. Autom. Contr., vol. 26, pp. 478–483, 1981.

[22] R. Johansson, “Parametric Models of Linear Multivariable Systems for Adaptive Control,”IEEE Trans. Autom. Contr., vol. 32, pp. 303–313, 1987.

[23] L. Praly, S. T. Hung, and D. S. Rhode, “Towards a Direct Adaptive Scheme for a Discrete-Time Control of a Minimum Phase Continuous-Time System,” in Proc. Conf. Dec. Contr.,Fort Lauderdale, FL, December 1989, pp. 1188–1191.

[24] R. Venugopal, V. G. Rao, and D. S. Bernstein, “Lyapunov-Based Backward-HorizonDiscrete-Time Adaptive Control,” Adaptive Contr. Sig. Proc., vol. 17, pp. 67–84, 2003.

[25] T. Hayakawa, W. M. Haddad, and A. Leonessa, “A Lyapunov-Based Adaptive ControlFramework for Discrete-time Non-linear Systems with Exogenous Disturbances,” Int. J.Control, vol. 77, pp. 250–263, 2004.

[26] S. Akhtar, R. Venugopal, and D. S. Bernstein, “Logarithmic Lyapunov Functions for DirectAdaptive Stabilization with Normalized Adaptive Laws,” Int. J. Contr., vol. 77, pp. 630–638,2004.

[27] J. B. Hoagg, M. A. Santillo, and D. S. Bernstein, “Discrete-time Adaptive CommandFollowing and Disturbance Rejection with Unknown Exogenous Dynamics,” IEEE Trans.Autom. Contr., vol. 53, pp. 912–928, 2008.

[28] C.-C. Cheng and S. H.-S. Fu, “On Trajectory-Dependent Direct Adaptive Control for a Classof Uncertain Linear Discrete-time Systems,” Int. J. Adapt. Contr. Signal Proc., vol. 23, pp.1014–1030, 2009.

[29] C. Li and J. Lam, “Stabilization of Discrete-Time Nonlinear Uncertain Systems by FeedbackBased on LS Algorithm,” SIAM J. Contr. Optim., vol. 51, pp. 1128–1151, 2013.

[30] M. Yu, J.-X. Xu, and D. Huang, “Discrete-Time Periodic Adaptive Control for ParametricSystems with Non-sector Nonlinearities,” Int. J. Adapt. Contr. Signal Proc., vol. 28, pp.987–1001, 2014.

[31] M. C. Campi, A. Lecchini, and S. M. Savaresi, “Virtual Reference Feedback Tuning(VRFT): A new Direct Approach to the Design of Feedback Controllers,” in Proc. Conf.Dec. Contr., Sydney, Australia, 2000, pp. 623–629.

[32] M. C. Campi and S. M. Savaresi, “Direct Nonlinear Control Design: The Virtual Reference

DRAFT

35

Feedback Tuning (VRFT) Approach,” IEEE Trans. Autom. Control, vol. 51, no. 1, pp.14–27, 2006.

[33] Z. Hou and S. Jin, “A Novel Data-driven Control Approach for a Class of Discrete-timeNonlinear Systems,” IEEE Trans. Contr. Sys. Tech., vol. 19, no. 6, pp. 1549–1558, 2011.

[34] A. S. Bazanella, L. Campestrini, and D. Eckhard, Data-Driven Controller Design: The H2

Approach. Springer, 2011.[35] S. S. Ge, Z. Li, and H. Yang, “Data Driven Adaptive Predictive Control for Holonomic

Constrained Under-actuated Biped Robots,” IEEE Trans. Contr. Sys. Tech., vol. 20, no. 3,pp. 787–795, 2012.

[36] Z. S. Hou and Z. Wang, “From Model-Based Control to Data-Driven Control: Survey,Classification and Perspective,” Infor. Sci., vol. 235, pp. 3–35, 2013.

[37] R. Venugopal and D. S. Bernstein, “Adaptive Disturbance Rejection Using ARMARKOVSystem Representations,” IEEE Trans. Contr. Sys. Tech., vol. 8, pp. 257–269, 2000.

[38] M. A. Santillo and D. S. Bernstein, “Adaptive Control Based on Retrospective CostOptimization,” J. Guid. Contr. Dyn., vol. 33, pp. 289–304, 2010.

[39] J. B. Hoagg and D. S. Bernstein, “Retrospective Cost Model Reference Adaptive Controlfor Nonminimum-Phase Systems,” J. Guid. Contr. Dyn., vol. 35, pp. 1767–1786, 2012.

[40] C. E. Rohrs, L. Velavani, M. Athans, and G. Stein, “Robustness of Continuous-TimeAdaptive Control Algorithms in the Presence of Unmodeled Dynamics,” IEEE Trans.Autom. Contr., vol. AC-30, pp. 881–889, 1985.

[41] E. D. Sumer and D. S. Bernstein, “Robust Sampled-Data Adaptive Control of the RohrsCounterexamples,” in Proc. Conf. Dec. Contr., Maui, HI, December 2012, pp. 7273–7278.

[42] E. D. Sumer, J. B. Hoagg, and D. S. Bernstein, “Broadband Disturbance Rejection UsingRetrospective Cost Adaptive Control,” in Proc. Dyn. Sys. Contr. Conf., no. DSCC2012-MOVIC2012-8572, Fort Lauderdale, FL, October 2012, pp. 1–10.

[43] E. D. Sumer and D. S. Bernstein, “Adaptive Decentralized Noise and Vibration Controlwith Conflicting Performance Objectives,” in Proc. Dyn. Sys. Contr. Conf., no. DSCC2012-MOVIC2012-8580, Fort Lauderdale, FL, October 2012, pp. 1–10.

[44] ——, “Aliasing Effects in Direct Digital Adaptive Control of Plants with High-FrequencyDynamics and Disturbances,” in Proc. Amer. Contr. Conf., Washington, DC, June 2013, pp.4934–4939.

[45] ——, “On the Role of Subspace Zeros in Retrospective Cost Adaptive Control of NonsquarePlants,” Int. J. Contr., vol. 88, pp. 295–323, 2015.

[46] J. Yan and D. S. Bernstein, “Minimal Modeling Retrospective Cost Adaptive Control ofUncertain Hammerstein Systems Using Auxiliary Nonlinearities,” Int. J. Contr., vol. 87,pp. 483–505, 2014.

[47] B. C. Coffer, J. B. Hoagg, and D. S. Bernstein, “Cumulative Retrospective Cost Adaptive

DRAFT

36

Control with Amplitude and Rate Saturation,” in Proc. Amer. Contr. Conf., San Francisco,CA, June 2011, pp. 2344–2349.

[48] M. Al Janaideh and D. S. Bernstein, “Adaptive Control of Hammerstein Systems withUnknown Prandtl–Ishlinskii Hysteresis,” Proc. Inst. Mech. Eng., Part I: J. Sys. Contr. Eng.,vol. 229, no. 2, pp. 149–157, 2015.

[49] M. Rizzo, M. A. Santillo, A. Padthe, J. B. Hoagg, S. Akhtar, D. S. Bernstein, andK. G. Powell, “CFD-Based Identification for Adaptive Flow Control Using ARMARKOVDisturbance Rejection,” in Proc. Amer. Contr. Conf., Minneapolis, MN, June 2006, pp.3783–3788.

[50] Y.-C. Cho, J. B. Hoagg, D. S. Bernstein, and W. Shyy, “Retrospective Cost Adaptive FlowControl of Low-Reynolds Number Aerodynamics Using a Dielectric Barrier DischargeActuator,” in Proc. 5th AIAA Flow Contr. Conf., Chicago, IL, August 2010, AIAA-2010-4841.

[51] M. Isaacs, J. B. Hoagg, A. V. Morozov, and D. S. Bernstein, “A Numerical Study onControlling a Nonlinear Multilink Arm Using A Retrospective Cost Model ReferenceAdaptive Controller,” in Proc. Conf. Dec. Contr., Orlando, FL, December 2011, pp. 8008–8013.

[52] E. D. Sumer and D. S. Bernstein, “Adaptive Control of Flexible Structures with UncertainDynamics and Uncertain Disturbance Spectra,” in Proc. AIAA Guid. Nav. Contr. Conf.,Minneapolis, MN, August 2012, AIAA-2012-4437-323.

[53] A. K. Padthe, P. P. Friedmann, and D. S. Bernstein, “Retrospective Cost Adaptive Controlfor Helicopter Vibration Reduction,” in Proc. AHS 69th Annual Forum, Phoenix, AZ, May2013, AHS2013-000179.

[54] J. Yan, A. M. D’Amato, K. Butts, I. Kolmanovsky, and D. S. Bernstein, “Adaptive Controlof the Air Flow System in a Diesel Engine,” in Proc. Dyn. Sys. Contr. Conf., no. DSCC2012-MOVIC2012-8597, Fort Lauderdale, FL, October 2012, pp. 1–9.

[55] M. J. Yu, Y. Rahman, E. M. Atkins, I. Kolmanovsky, and D. S. Bernstein, “MinimalModeling Adaptive Control of the NASA Generic Transport Model with Unknown Control-Surface Faults,” in Proc. AIAA Guid. Nav. Contr. Conf., Boston, MA, August 2013, AIAA-2013-4693.

[56] F. Sobolic and D. S. Bernstein, “Aerodynamic-Free Adaptive Control of the NASA GenericTransport Model,” in Proc. AIAA Guid. Nav. Contr. Conf., Boston, MA, August 2013,AIAA-2013-4999.

[57] M. J. Yu, J. Zhong, E. M. Atkins, I. V. Kolmanovsky, and D. S. Bernstein, “Trim-Commanded Adaptive Control for Waypoint-Defined Trajectory Following,” in Proc. AIAAGuid. Nav. Contr. Conf., Boston, MA, August 2013, AIAA-2013-4999.

[58] A. Ansari and D. S. Bernstein, “Adaptive Control of an Aircraft with Uncertain

DRAFT

37

Nonminimum-Phase Dynamics,” in Proc. Amer. Contr. Conf., Chicago, IL, July 2015, pp.844–849.

[59] ——, “Retrospective Cost Adaptive Control of the Generic Transport Model UnderUncertainty and Failure,” Submitted.

[60] M. Camblor, G. Cruz, S. Esteban, F. A. Leve, and D. S. Bernstein, “Retrospective CostAdaptive Spacecraft Attitude Control Using Control Moment Gyros,” in Proc. Amer. Contr.Conf., Portland, OR, June 2014, pp. 2492–2497.

[61] S. Dai, T. Lee, and D. S. Bernstein, “Adaptive Control of a Quadrotor UAV Transportinga Cable-Suspended Load with Unknown Mass,” in Proc. Conf. Dec. Contr., Los Angeles,CA, December 2014, pp. 6149–6154.

[62] F. Sobolic and D. S. Bernstein, “An Inner-Loop/Outer-Loop Architecture for an AdaptiveMissile Autopilot,” in Proc. Amer. Contr. Conf., Chicago, IL, July 2015, pp. 850–855.

[63] A. Goel, A. Xie, K. Duraisamy, and D. S. Bernstein, “Retrospective Cost Adaptive ThrustControl of a 1D Scramjet with Mach Number Disturbance,” in Proc. Amer. Contr. Conf.,Chicago, IL, July 2015, pp. 5551–5556.

[64] M. Al Janaideh and D. S. Bernstein, “Adaptive Control of Hammerstein Systems withUnknown Prandtl-Ishlinskii Hysteresis,” Proc. IMechE Part 1: J. Sys. Contr. Eng., pp. 1–9,2014.

[65] ——, “Adaptive Control of Mass-Spring Systems with Unknown Hysteretic ContactFriction,” in Proc. Amer. Contr. Conf, Boston, MA, July 2016, pp. 667–672.

[66] H. Sane and D. S. Bernstein, “Active Noise Control Using an Acoustic Servovalve,” inProc. Amer. Contr. Conf., Philadelphia, PA, June 1998, pp. 2621–2625.

[67] H. Sane, R. Venugopal, and D. S. Bernstein, “Disturbance Rejection Using Self-TuningARMARKOV Adaptive Control with Simultaneous Identification,” IEEE Trans. Contr. Sys.Tech., vol. 19, pp. 101–106, 2001.

[68] S. L. Lacy, R. Venugopal, and D. S. Bernstein, “ARMARKOV Adaptive Control of Self-Excited Oscillations of a Ducted Flame,” in Proc. Conf. Dec. Contr., Tampa, FL, December1998, pp. 4527–4528.

[69] B. L. Pence, M. A. Santillo, and D. S. Bernstein, “Markov-Parameter-Based AdaptiveControl of 3-Axis Angular Velocity in a 6DOF Stewart Platform,” in Proc. Amer. Contr.Conf., Seattle, WA, June 2008, pp. 4767–4772.

[70] E. D. Sumer and D. S. Bernstein, “Retrospective Cost Adaptive Control with Error-Dependent Regularization for MIMO Systems with Uncertain Nonminimum-Phase Trans-mission Zeros,” in Proc. AIAA Guid. Nav. Contr. Conf., Minneapolis, MN, August 2012,AIAA-2012-4670-123.

[71] K. J. Astrom and B. Wittenmark, Adaptive Control, 2nd ed. Addison-Wesley, 1995.

DRAFT

38

[72] G. C. Goodwin and K. S. Sin, Adaptive Filtering, Prediction, and Control. Prentice Hall,1984.

[73] B. Teixeira, “Kalman Filters,” IEEE Contr. Sys. Mag., vol. 2, no. 28, pp. 16–18, 2008.[74] M. Vidyasagar, Control System Synthesis: A Factorization Approach. MIT Press, 1985.[75] B. D. Anderson, “Adaptive Systems, Lack of Persistency of Excitation and Bursting

Phenomena,” Automatica, vol. 21, no. 3, pp. 247–258, 1985.[76] K. J. Astrom, P. Hagander, and J. Sternby, “Zeros of Sampled Systems,” Automatica, vol. 20,

pp. 31–38, 1984.[77] R. H. Middleton and D. E. Miller, “On the Achievable Delay Margin Using LTI Control

for Unstable Plants,” IEEE Trans. Autom. Control, vol. 52, no. 7, pp. 1194–1207, 2007.[78] E. Lavretsky and K. Wise, Robust and Adaptive Control: With Aerospace Applications.

Springer, 2012.[79] D. S. Bernstein, Matrix Mathematics, 2nd ed. Princeton, 2009.

DRAFT

39

Gzw Gzu

GyuGyw

Gc

w z

yu

1

Figure 1: Transfer function representation of the standard problem with the controller Gc.

Gyw Gc Gzu

Gzw

Gyu

uw y z

Figure 2: Equivalent transfer function representation of the standard problem with the controllerGc.

Gc

[Gd Gu]u

d

r

−

en

e0

y0vyn

−

1

Figure 3: Transfer function representation of the servo problem. en is the measured error, ande0 is the true error.

DRAFT

40

Gzw Gzu

GyuGyw

Gc,k

w z

yu

Gc,k

1

Figure 4: Transfer function representation of the adaptive standard problem with the adaptivecontroller Gc,k.

Gc,k[Gd Gu]u

d

r

−

en

e0

y0vyn

−

1

Figure 5: Transfer function representation of the adaptive servo problem with the adaptivecontroller Gc,k.

q−ncN∗c (q)

Ilu − q−ncD∗c (q)

[Gzw(q) Gzu(q)Gyw(q) Gyu(q)

]

y(k)

z(k)

u(k)u∗(k)

w(k)

u(k)

1

Figure 6: Transfer function representation of (54) and (57) with the virtual external controlperturbation u represented as an external input. The inner feedback loop, which represents (57),illustrates the intercalated injection of u inside the control update.

DRAFT

41

[Gzw(q) Gzu(q)Gyw(q) Gyu(q)

]

D∗−1c (q)N∗

c (q)

qncD∗−1c (q)

y(k)

z(k)w(k)

u(k)u(k)

1

Figure 7: Equivalent transfer function representation of (54) and (57) with u represented asan external input. In this representation, the inner feedback loop in Figure 6 is replaced by aprefilter.

0 50 100 150 200-30

-20

-10

0

10

y 0(k

)

r

y0

0 50 100 150 200Time Step

-150

-100

-50

0

50

100

150

3(k)

0 50 100 150 200Time Step

-20

-10

0

10

20

u(k)

-3 -2 -1 0 1Real Axis

-1

-0.5

0

0.5

1

Imag

inar

yA

xis eGzw

Dp

Figure 8: Example PP1: Pole placement for the adaptive servo problem. With Rθ = 10−20,RCAC places five closed-loop poles near the locations of the roots of Dp. The closed-loop polesand zeros are shown at step k = 100.

0 50 100 150 200-30

-20

-10

0

10

y 0(k

)

ry0

0 50 100 150 200Time Step

-150

-100

-50

0

50

100

3(k)

0 50 100 150 200Time Step

-20

-10

0

10

20

u(k)

-3 -2 -1 0 1Real Axis

-1

-0.5

0

0.5

1

Imag

inar

yA

xis eGzw

Dp

Figure 9: Example PP1: Pole placement for the adaptive servo problem. With Rθ = 10−40,RCAC places the closed-loop poles closer to the target locations.

DRAFT

42

0 20 40 60 80 100

-50

0

50z(k

)

OL

CL

0 20 40 60 80 100Time Step

-400

-200

0

200

3(k

)

0 20 40 60 80 100-200

-100

0

100

200

u(k

)

0 20 40 60 80 100Time Step

0

5

10

Spec

tralR

adiu

s

GceGzw

-10 -5 0 5Real Axis

-1

-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

1

Imag

inar

yA

xis

Gc

-10 -5 0 5Real Axis

-1

-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

1

Imag

inar

yA

xis

eGzw

Dp

Figure 10: Example PP2: Pole placement for the adaptive standard problem. RCAC places fiveclosed-loop poles near the locations of the roots of Dp. Note that the controller becomes unstableafter a few steps, and the closed-loop system is stabilized at step k = 16. The closed-loop polesand zeros are shown at step k = 100.

0 50 100 150 200 250 300-4

-2

0

2

y 0(k

)

r

y0

0 50 100 150 200 250 300Time Step

-2

-1

0

1

2

3(k)

0 2000 4000 6000 8000 10000-10

-5

0

5

10

u(k

)

0 2000 4000 6000 8000 10000Time Step

0

0.5

1

1.5

Spec

tral

Rad

ius eGzw

Figure 11: Example H1: Harmonic command following for the adaptive servo problem usingan FIR controller. Since RCAC cannot develop an internal model of the command r due to theFIR structure of Gc,k, the command-following performance is severely restricted. Note that thecontroller coefficients do not converge, and the closed-loop system alternates between asymptoticstability and instability.

DRAFT

43

0 50 100 150 200 250 300-3

-2

-1

0

1

2

3

y 0(k

)

r

y0

0 50 100 150 200 250 300Time Step

-2

-1

0

1

2

3

3(k)

0 50 100 150 200 250 300Time Step

-1.5

-1

-0.5

0

0.5

1

u(k)

-1.5 -1 -0.5 0 0.5 1Real Axis

-1

-0.5

0

0.5

1

Imag

inar

yA

xis eGzw

e|!

Figure 12: Example H1: Harmonic command following for the adaptive servo problem. With anIIR controller structure, RCAC achieves an internal model of the harmonic command signal byplacing controller poles on the unit circle at the command frequency. The internal-model polesof the controller are evident in the form of two closed-loop zeros on the unit circle, which areshown by the red plus signs. The closed-loop poles and zeros are shown at step k = 300.

Gfb,k[Gd Gu]

Gff,k

u

d

r

−

z

z0

y0

vyn

−

1

Figure 13: Transfer function representation of combined feedback-feedforward control withdecentralized adaptation for the adaptive servo problem.

0 50 100 150 200 250 300-3

-2

-1

0

1

2

3

y 0(k

)

r

y0

0 50 100 150 200 250 300Time Step

-1

-0.5

0

0.5

1

1.5

3 fb(

k)

0 50 100 150 200 250 300Time Step

-1

-0.5

0

0.5

1

u(k)

-1 -0.5 0 0.5 1Real Axis

-1

-0.5

0

0.5

1

Imag

inar

yA

xis G,

e|!

Figure 14: Example H1: Harmonic command following for the adaptive servo problem usingfeedback-feedforward control with decentralized adaptation. RCAC asymptotically follows thecommand without developing an internal model.

DRAFT

44

0 100 200 300 400 500-4

-2

0

2

4

y 0(k

)

y0,OL

y0

0 100 200 300 400 500-5

0

53(

k)

-3 -2 -1 0 1Real Axis

-1.5

-1

-0.5

0

0.5

1

1.5

Imag

inar

yA

xis

eGzw

e|!

0 100 200 300 400 500Time Step

-2

-1.5

-1

-0.5

0

0.5

1

1.5

2

2.5

u(k)

-3 -2 -1 0 1Real Axis

-1.5

-1

-0.5

0

0.5

1

1.5

Imag

inar

yA

xis

Gc

Figure 15: Example H2: Two-tone harmonic disturbance rejection for the adaptive servo problemusing an IIR controller. y0,OL indicates the open-loop response. RCAC achieves an internalmodel of the two-tone harmonic disturbance by placing controller poles at the two disturbancefrequencies. The internal-model poles of the controller are evident in the form of four closed-loop zeros on the unit circle, which are shown by the red plus signs. The closed-loop poles andzeros are shown at step k = 104.

0 1000 2000 3000 4000-6

-4

-2

0

2

4

6

y 0(k

)

y0,OL

r

y0

-6 -4 -2 0 2Real Axis

-1

-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

1

Imag

inar

yA

xis

eGzw

e|!

0 1000 2000 3000 4000Time Step

-3

-2

-1

0

1

2

3

u(k)

0 1000 2000 3000 4000Time Step

-150

-100

-50

0

50

100

3(k)

Figure 16: Example H2: Harmonic command following and step-plus-harmonic disturbancerejection for the adaptive servo problem. Note that, after the disturbance frequency changesat step k = 2000, RCAC readapts and rejects the disturbance. RCAC achieves an internalmodel of the command and disturbance signals by placing controller poles on the unit circle atthe command frequency and the two disturbance frequencies. The internal-model poles of thecontroller are evident in the form of five closed-loop zeros on the unit circle, which are shownby the red plus signs. The closed-loop poles and zeros are shown at step k = 105.

DRAFT

45

-1 -0.5 0 0.5 1 1.5 2 2.5Real Axis

-2.5

-2

-1.5

-1

-0.5

0

0.5

1

1.5

2

2.5

Imag

inar

yA

xis

Gzu

Gzw

Gyu

Gyw

OL

-1 -0.5 0 0.5 1 1.5Real Axis

-1

-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

1

Imag

inar

yA

xis

eGzw,LQG

0 :/4 :/2 3:/4 :

-20

-10

0

10

20

Mag

nitu

de(d

B)

0 :/4 :/2 3:/4 :

Frequency (rad/sample)

-180

-90

0

90

180

Pha

se(d

eg)

GzweGzw ,LQGeGzw ,RCAC

-1 -0.5 0 0.5 1 1.5Real Axis

-1

-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

1

Imag

inar

yA

xis

eGzw,RCAC

Figure 17: Example SD1: Adaptive standard problem, with the high-authority LQG target model(86). RCAC places the closed-loop poles near the high-authority LQG closed-loop poles andapproximates the closed-loop frequency response of high-authority LQG. The frequency-responseplots and closed-loop poles and zeros are shown at step k = 105.

0 :/4 :/2 3:/4 :

-40

-20

0

Mag

nitu

de(d

B)

0 :/4 :/2 3:/4 :


-180

-90

0

90

180

Pha

se(d

eg) GfeG

zeu

0 :/4 :/2 3:/4 :

-20

-10

0

10

20

Mag

nitu

de(d

B) GzweGzw ,LQGeGzw ,RCAC

0 :/4 :/2 3:/4 :


-180

-90

0

90

180

Pha

se(d

eg)

Figure 18: Example SD1: Adaptive standard problem with nc = 20. RCAC approximates theclosed-loop frequency response of high-authority LQG. In addition, the frequency response ofGzu,k approximates the frequency response of Gf . The frequency-response plots are shown atstep k = 105.

DRAFT

46

0 50 100 150 200 250 300-2000

-1000

0

1000

2000u(k

)

-3 -2 -1 0 1 2Real Axis

-1

-0.5

0

0.5

1

Imagin

ary

Axis

RCAC

LQG

eGzw

0 200 400 600 800 1000-2000

-1000

0

1000

2000

y0(k

)

y0,OLy0

0 200 400 600 800 1000Time Step

-20

-10

0

10

20

3(k

)

0 :/4 :/2 3:/4 :

-20

-10

0

10

20

Mag

nitu

de(d

B)

0 :/4 :/2 3:/4 :


-180-150-100-50

050

100150180

Pha

se(d

eg) GfeG

zeu

0 :/4 :/2 3:/4 :

0

20

40

60

Mag

nitu

de(d

B)

0 :/4 :/2 3:/4 :


-100

0

100

Pha

se(d

eg)

GzweGzw ,LQGeGzw ,RCAC

Figure 19: Example SD2: Adaptive standard problem with nc = 5n = 20. RCAC approximatesthe closed-loop frequency response of high-authority LQG except at DC due to the internalmodel needed to reject the step disturbance. The internal model has the form of a notch atDC corresponding to the closed-loop zero at 1. In addition, the frequency response of Gzu,k

approximates the frequency response of Gf . The frequency-response plots and closed-loop polesand zeros are shown at step k = 105.

DRAFT

47

0 100 200 300 400Time Step

-2

-1

0

1

u(k

)

-1 -0.5 0 0.5 1Real Axis

-1

-0.5

0

0.5

1

Imagin

ary

Axis

RCAC

LQG

eGzw

0 100 200 300 400-5

0

5

y0(k

)

y0;OL

r

y0

0 100 200 300 400Time Step

-2

-1

0

1

2

3(k

)

0 :/4 :/2 3:/4 :

-10

-5

0

5

10

Mag

nitu

de(d

B)

0 :/4 :/2 3:/4 :


-180-150-100-50

050

100150180

Pha

se(d

eg)

GfeGzeu

0 :/4 :/2 3:/4 :

-10

0

10

20

30

Mag

nitu

de(d

B)

GzweGzw ;LQGeGzw ;RCAC

0 :/4 :/2 3:/4 :


-200

-100

0

100

200

Pha

se(d

eg)

Figure 20: Example SD3: Command following and stochastic disturbance rejection for theadaptive servo problem with nc = 8n = 40. RCAC approximates the closed-loop frequencyresponse of high-authority LQG except at the command frequency due to the internal model.The internal model has the form of a notch at the command frequency corresponding to the twoclosed-loop zeros on the unit circle. The frequency-response plots and closed-loop poles andzeros are shown at step k = 105.

DRAFT

48

0 100 200 300 400 500-6

-4

-2

0

2y0(k

)

r

y0

0 100 200 300 400 500-50

0

50

100

3(k

)

0 100 200 300 400 500Time Step

-2

-1

0

1

2

u(k

)

-5 -4 -3 -2 -1 0 1Real Axis

-1

-0.5

0

0.5

1

Imagin

ary

Axis eGzw

Dp

0 100 200 300 400 500-10

-5

0

5

10

y0(k

)

r

y0

0 100 200 300 400 500-40

-20

0

20

3(k

)

0 100 200 300 400 500Time Step

-10

-5

0

5

10

u(k

)

-1.5 -1 -0.5 0 0.5 1Real Axis

-1

-0.5

0

0.5

1

Imagin

ary

Axis eGzw

Dp

0 20 40 60 80 100 120 140-10000

-5000

0

5000

y0(k

)

r

y0

0 20 40 60 80 100 120 140Time Step

-1000

-500

0

500

1000

3(k

)

0 20 40 60 80 100 120 140Time Step

-15

-10

-5

0

5

u(k

)

#10158

-2 -1.5 -1 -0.5 0 0.5 1 1.5Real Axis

-2

-1

0

1

2

Imagin

ary

Axis eGzw

Dp

Figure 21: Example SN1: Pole placement for the adaptive servo problem without sensor noise(upper plots), as well as with zero-mean Gaussian white sensor noise v with σ = 0.1 (middleplots) and σ = 0.2 (lower plots). In the case where v = 0, RCAC places closed-loop poles nearthe locations of the roots of Dp. For σ = 0.1, RCAC fails to place closed-loop poles near thelocations of the roots of Dp, but stabilizes the plant. However, for σ = 0.2, RCAC fails to placeclosed-loop poles near the locations of the roots of Dp, and the closed-loop system is unstableat step k = 500. The closed-loop poles and zeros are shown at step k = 500.

DRAFT

49

0 200 400 600 800 1000-50

0

50

100

y 0(k

)y0,OL

y0

0 200 400 600 800 1000Time Step

-2

-1

0

1

3(k)

0 200 400 600 800 1000Time Step

-40

-30

-20

-10

0

10

20

30

u(k

)

0 0.5 1 1.5 2 2.5 3

-10

-5

0

5

10

Mag

nitu

de(d

B)

0 0.5 1 1.5 2 2.5 3Frequency (rad/sample)

-180-150-100-50

050

100150180

Pha

se(d

eg)

GfeGzeu

0 0.5 1 1.5 2 2.5 3

0

20

40

Magnitude(dB) GzweGzw ,LQG(V2 = 0)eGzw ,LQG(V2 = 1)eGzw ,RCAC


-200

-100

0

100

200

Phase(deg)

Figure 22: Example SN2: Stochastic disturbance rejection for the adaptive servo problem withzero-mean Gaussian white sensor noise. RCAC approximates the closed-loop frequency responseof LQG for V2 = 1. In addition, the frequency response of Gzu,k approximates the frequencyresponse of Gf . The frequency-response plots are shown at step k = 105.

DRAFT

50

0 100 200 300 400

-2

0

2y 0

(k)

y0;OLry0

0 100 200 300 400Time Step

-2

-1

0

1

2

3(k

)

0 100 200 300 400Time Step

0

0.5

1

1.5

2

2.5

3

u(k

)

-20 -15 -10 -5 0 5Real Axis

-1

-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

1

Imag

inar

yA

xis

RCAC

LQG eGzw

0 0.5 1 1.5 2 2.5 3

-20

0

20

Magnitude(dB)


-200

-100

0

100

200Pha

se(deg) GzweGzw ,RCACeGzw ,LQG(V2 = 0)eGzw ,LQG(V2 = 0:05)

Figure 23: Example SN3: Command following and stochastic disturbance rejection for theadaptive servo problem with zero-mean Gaussian white sensor noise. RCAC approximates theclosed-loop frequency response of LQG for V2 = 2 except at DC due to the internal model. Theinternal model has the form of a notch at DC corresponding to the closed-loop zero at 1. Forthis example, RCAC approximates the closed-loop frequency response of LQG in the presenceof sensor noise. The frequency-response plots and closed-loop poles and zeros are shown at stepk = 105.

0 100 200 300 400 500 600

-5

0

5

y 0(k

)

r

y0

0 100 200 300 400 500 600Time Step

-2

-1

0

1

2

3(k)

0 100 200 300 400 500 600Time Step

-8

-6

-4

-2

0

2

4

6

8

u(k

)

Figure 24: Example NMP1: Erroneous NMP-zero estimates for the adaptive servo problem.RCAC automatically develops an internal model of the command and asymptotically followsthe command despite a 15% error in the estimate of the NMP zero used by Gf .

DRAFT

51

0.9 1 1.1 1.2 1.3 1.4 1.5 1.6Real Axis

-0.5

-0.4

-0.3

-0.2

-0.1

0

0.1

0.2

0.3

0.4

0.5Im

agin

ary

Axi

s

Success

Failure

NMP Zero

0.9 1 1.1 1.2 1.3 1.4 1.5 1.6Real Axis

-0.5

-0.4

-0.3

-0.2

-0.1

0

0.1

0.2

0.3

0.4

0.5

Imag

inar

yA

xis

Success

Failure

NMP Zero

Figure 25: Example NMP1: Erroneous NMP-zero estimates for the adaptive servo problem.For the case where d = 0 and the command is harmonic, the plot on the left shows the NMP-zero estimates for which RCAC asymptotically follows the harmonic command as well as theNMP-zero estimates for which the closed-loop system becomes unstable. For the case wherethe command is harmonic and d is zero-mean Gaussian white noise with standard deviation 0.5,the plot on the right shows the NMP-zero estimates for which RCAC asymptotically followsthe harmonic command as well as the NMP-zero estimates for which the closed-loop systembecomes unstable. Note that RCAC is less robust to erroneous NMP-zero estimates for the caseof stochastic disturbance than in the case of harmonic commands. For this example, RCAC ismore robust to overestimation of the NMP zero than underestimation.

0.95 1 1.05 1.1 1.15 1.2Real Axis

-0.4

-0.3

-0.2

-0.1

0

0.1

0.2

0.3

0.4

Imag

inar

yA

xis

Success

Failure

NMP Zeros

Figure 26: Example NMP1: Erroneous NMP-zero estimates for the adaptive servo problem.For the case where the NMP zeros are complex, this plot shows the NMP-zero estimates forwhich RCAC asymptotically follows the harmonic command as well as the NMP-zero estimatesfor which the closed-loop system becomes unstable. For this example, RCAC is more robust tooverestimation of the NMP zeros than underestimation.

DRAFT

52

0 2500 5000 7500 10000

-0.2

0

0.2z(k

)OL

CL

0 2500 5000 7500 10000Time Step

-4

-2

0

2

4

3(k

)

0 :/4 :/2 3:/4 :

-10

0

10

Magnitude(dB)

0 :/4 :/2 3:/4 :


-180

-90

0

90

180

Phase(deg)

LQG1

RCAC1

LQG2

RCAC2

Figure 27: Example NMP2: Unmodeled change in the NMP zeros for the adaptive standardproblem. At step k = 5000, the NMP zeros move to 0.99± 1.38. If Gc,k is fixed to be Gc,5000,then the closed-loop system becomes unstable. With continued adaptation, however, the plant isrestabilized. LQG1 is the high-authority LQG controller for G with NMP zeros at 0.96± 0.72,whereas LQG2 is the high-authority LQG controller for G with NMP zeros at 0.99± 1.38.

0 2000 4000 6000 8000 10000-10

-5

0

5

10

y 0(k

)

r

y0

0 2000 4000 6000 8000 10000Time Step

-4

-2

0

2

4

6

8

3(k)

0 2000 4000 6000 8000 10000-100

-50

0

50

100

150

u(k)

-1 -0.5 0 0.5 1 1.5 2 2.5Real Axis

-1

-0.5

0

0.5

1

Imag

inar

yA

xis eGzw

e|!

Figure 28: Example NMP3: Unmodeled NMP sampling zero for the adaptive servo problem.RCAC achieves an internal model of the harmonic command signal by placing controller poles atthe command frequency, and Gzw is asymptotically stable at step k = 10000. The internal-modelpoles of the controller are evident in the form of two closed-loop zeros on the unit circle at thecommand frequency, which are shown by the red plus signs. For this example, the performance-dependent control weighting Ru allows RCAC to asymptotically follows the command for asampled-data plant with an unmodeled NMP zero. The closed-loop poles and zeros are shownat step k = 104.

DRAFT

53

0 0.5 1 1.5 2

#104

-2000

-1000

0

1000

y 0(k

)

ry0

0 0.5 1 1.5 2Time Step #104

-0.05

0

0.05

0.1

0.15

3(k)

0 0.5 1 1.5 2Time Step #104

-0.25

-0.2

-0.15

-0.1

-0.05

0

0.05

0.1

0.15

0.2

u(k

)

Figure 29: Example NMP4: Unmodeled NMP zero for the adaptive servo problem. RCAC doesnot follow the harmonic command, and, as z becomes unbounded, the closed-loop system tendsto the unstable open-loop plant. For this unstable example, RCAC is not robust to unmodeledNMP zeros, despite using the performance-dependent control weighting Ru = z2.

0 500 1000 1500

-5

0

5

y 0(k

)

r

y0

0 500 1000 1500Time Step

-1

0

1

2

3(k)

0 500 1000 1500

-5

0

5y 0

(k)

r

y0

0 500 1000 1500Time Step

-2

-1

0

1

3(k)

0 5 10 15 20 25 30 35-20

-10

0

10

y 0(k

)

r

y0

0 5 10 15 20 25 30 35Time Step

-4

-2

0

2

3(k)

0 500 1000 1500-5

0

5

y 0(k

)

r

y0

0 500 1000 1500Time Step

-1

-0.5

0

0.5

1

3(k)

Figure 30: Example DM1: Effect of erroneous relative degree on command-following perfor-mance for the adaptive servo problem. For dzu = 3 (upper left) and dzu = 4 (upper right),RCAC asymptotically follows the harmonic command. For dzu = 1 (lower left), RCAC doesnot follow the harmonic command, and the plant output y0 diverges. However, by using theperformance-dependent control weighting Ru = z2 (lower right), RCAC asymptotically followsthe harmonic command, although the transient response is poor. For this example, RCAC isrobust to overestimation of dzu, but is less robust to underestimation of dzu.

DRAFT

54

0 0.5 1 1.5 2

#104

-5

0

5

e 0(k

)

0 0.5 1 1.5 2

#104

-1

-0.5

0

0.5

1

3(k)

0 0.5 1 1.5 2

#104

-10

0

10

e 0(k

)

0 0.5 1 1.5 2

#104

-1

0

1

2

3(k)

0 0.5 1 1.5 2

#104

-20

0

20

e 0(k

)

0 0.5 1 1.5 2Time Step #104

-2

-1

0

1

2

3(k)

0 0.5 1 1.5 2

#104

-20

0

20

e 0(k

)

0 0.5 1 1.5 2Time Step #104

-4

-2

0

2

3(k)

Figure 31: Example DM2: Unmodeled time delay for the adaptive servo problem. For kd = 1

(upper left), kd = 2 (upper right), kd = 3 (lower left), and kd = 4 (lower right), RCACasymptotically follows the harmonic command.

0 1.5 3 4.5 6

#104

0

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

2

je 0(k)

j

0 1.5 3 4.5 6

#104

-3

-2

-1

0

1

2

3

4

u(k)

0 1 2 3 4 5 6Time Step #104

-1

-0.5

0

0.5

3(k)

0 1.5 3 4.5 6Time Step #104

0.95

1

1.05

1.1

1.15

1.2

1.25

Spect

ralRa

dius

eGzw

Figure 32: Example DM2: Unmodeled time-varying time delay for the adaptive servo problem.At step k = 15000, an unmodeled 1-step time delay is inserted into the loop. At step k = 30000,an additional unmodeled 2-step time delay is inserted, and at step k = 60000, an additionalunmodeled 6-step time delay is inserted. With Gc,k fixed, each time delay is destabilizing. Withcontinued adaptation, however, RCAC re-adapts and restabilizes the closed-loop system

DRAFT

55

0 5000 10000 15000

100

105

z(k)

RCAC

[77]

0 5000 10000 15000-15

-10

-5

0

5

10

15

20

u(k)

0 5000 10000 15000Time Step

-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

1

1.2

3(k)

0 5000 10000 15000Time Step

0.995

1

1.005

1.01

1.015

1.02

1.025

Spec

tralR

adius

eGzw

Figure 33: Example DM3: Limited delay margin for the adaptive standard problem. These plotsshow the closed-loop responses for the initial condition x(0) = [0.1 0.1 0.1]T for the controllergiven by [77] as well as for RCAC. At step k = 3000, an unmodeled 7-step time delay is insertedinto the loop, which destabilizes both closed-loop systems. With continued adaptation, however,RCAC re-adapts and restabilizes the closed-loop system.

0 50 100 150 200 250 300-4

-2

0

2

4

y 0(k

)

r

y0

0 50 100 150 200 250 300-0.4

-0.2

0

0.2

0.4

3(k)

0 50 100 150 200 250 3000

1

2

3

y 0(k

)

r

y0

0 50 100 150 200 250 300-0.5

0

0.5

1

1.5

3(k)

0 50 100 150 200 250 3000

2

4

y 0(k

)

#10188

r

y0

0 10 20 30 40 50Time Step

-5

0

5

3(k)

#1022

0 50 100 150 200 250 300-1

0

1

y 0(k

)

r

y0

0 50 100 150 200 250 300Time Step

-0.2

0

0.2

3(k)

Figure 34: Example GM1: Effect of erroneous Hd on command-following performance for theadaptive servo problem. Command-following performance for Hd = 0.1 (top left), Hd = 10 (topright), Hdzu = −1 (bottom left), and Hdzu = −1 with performance-dependent control weightingRu = z2 (bottom right). RCAC asymptotically follows the command for Hd = 0.1 and Hd = 10.For Hd = −1, RCAC causes instability in the case where Ru = 0, but the closed-loop systemremains asymptotically stable at step k = 5000 by using Ru (not shown). For this example,RCAC is robust to errors in the magnitude of the estimate of Hd, but is not robust to errors inthe sign of the estimate of Hd.

DRAFT

56

0 2500 5000 7500 10000-0.4

-0.2

0

0.2

0.4

z(k)

OL

CL

0 2500 5000 7500 10000Time Step

-1

0

1

2

3(k)

0 2500 5000 7500 10000-0.4

-0.2

0

0.2

0.4

z(k)

OL

CL

0 2500 5000 7500 10000Time Step

-1

0

1

2

3(k)

0 2500 5000 7500 10000-0.2

-0.1

0

0.1

0.2

u(k

)

Gc adapts

Gc -xed

0 2500 5000 7500 10000Time Step

0.9

1

1.1

1.2

Spec

tral

Rad

ius

Gc adaptsGc -xed

0 :/4 :/2 3:/4 :

-10

0

10

Magnitude

(dB) LQG1

RCAC1

LQG2

RCAC2

0 :/4 :/2 3:/4 :


-200

-100

0

100

200

Phase

(deg)

Figure 35: Example GM2: Unmodeled change in the static gain for the adaptive standardproblem. At step k = 5000, the gain margin is 1.8105, and G is replaced by 2.9G, wherethe gain 2.9 is unmodeled. If Gc,k is fixed to be Gc,5000, then the closed-loop system becomesunstable. Under continued adaptation, however, RCAC restabilizes the closed-loop system. LQG1is the high-authority LQG controller for G, and LQG2 is the high-authority LQG controller for2.9G.

0 0.5 1 1.5 2

#104

-2

0

2

z(k)

OL

CL

0 0.5 1 1.5 2Time Step #104

-5

0

5

10

3(k)

0 0.5 1 1.5 2Time Step #104

0.5

0.6

0.7

0.8

0.9

1

1.1

1.2

1.3

Spec

tral

Rad

ius

eGzw

Figure 36: Example GM3: Discretized plant from [9] with severely limited gain margin. Atstep k = 10000, the gain margin is 1.0034, and G is replaced by 1.1G, where the gain 1.1 isunmodeled. If Gc,k is fixed to be Gc,10000, then the closed-loop system becomes unstable. Undercontinued adaptation, however, RCAC restabilizes the closed-loop system.

DRAFT

57

Sidebar S1: Adaptive PID Control

Proportional-integral-derivative (PID) control is likely the most widely used feedbackcontrol technique [S1]–[S3]. Adaptive PID control is considered in [S4]. In this section, weconsider the discrete-time PID controller structure

u(k) = uP(k) + uI(k) + uD(k), (S1)

where

uP(k) = KP(k)y(k − 1), (S2)

uI(k) = KI(k)γ(k − 1), (S3)

uD(k) = KD(k)[y(k − 1)− y(k − 2)], (S4)

and the integrator state γ satisfies

γ(k) = γ(k − 1) + y(k − 1). (S5)

Note that the PID controller (S1)–(S5) is strictly proper. We use RCAC to adaptively tune KP,KI, and KD.

Example PID1. Step command following for the adaptive servo problem using adaptivePID control. Consider the asymptotically stable, NMP plant

G(q) =(q− 0.975)(q− 1.2)

(q− 0.99)(q2 − 1.6q + 0.965), (S6)

let r be a step command with height 2, let d be a step disturbance with height −1.1, and letv = 0. We apply RCAC to the adaptive PID controller (S1)–(S5) with Rθ = 104 and Ru = 0.RCAC rejects the step disturbance and asymptotically follows the step command, as shown inFigure S1. �

References

[S1] K. J. Astrom and T. Hagglund, Advanced PID Control, Instrumentation, Systems, andAutomation Society, 2006.

[S2] Y. Li, K. H. Ang, and G. C. Y. Chong, PID Control System Analysis and Design: Problems,Remedies, and Future Directions, IEEE Contr. Sys. Mag., vol. 26, pp. 32–41, 2006.

[S3] A. O’Dwyer, Handbook of PI and PID Controller Tuning Rules, Imperial College Press,2009.

[S4] V. Bobal, J. Bohm, J. Fessl, and J. Machaceck, Digital Self-tuning Controllers, Springer,2005.

DRAFT

58

0 200 400 600 800-2

0

2

4

y 0(k

)

r

y0

0 200 400 600 800Time Step

-6

-4

-2

0

2

3(k)

#10-3

KPKIKD

0 200 400 600 800Time Step

-0.45

-0.4

-0.35

-0.3

-0.25

-0.2

-0.15

-0.1

-0.05

0

0.05

uPuIuD

Figure S1: Example PID1: Step command following for the adaptive servo problem usingadaptive PID control. RCAC rejects the disturbance and asymptotically follows the stepcommand.

DRAFT

59

Sidebar S2: RCAC for Model Reference Adaptive Control

The model reference adaptive control (MRAC) problem is shown in Figure S2. Thisproblem is a special case of the adaptive standard problem with

w =

r

d

v

, y =

[r

yn

], ym = Gmr, z = ym − yn, (S7)

Gzw = [Gm −Gd − Ilz ], Gzu = −Gu, Gyw =

[Ilr 0 0

0 Gd Ilz

], Gyu = Gu, (S8)

where r ∈ Rlr and Gm is the reference model. Since y includes the command r, the controllerincludes both feedback and feedforward action. Note that the role of the target model Gf usedby RCAC is distinct from the role of the reference model Gm used in MRAC.

The MRAC problem is widely studied, and numerous techniques have been developed[11], [12], [14], [17], [78]. For comparison with RCAC, we consider the MRAC controllersin [11, pp. 343–371] for continuous-time plants. These controllers require knowledge of therelative degree of the plant as well as the leading coefficient of the plant numerator. The formof the controller depends on the relative degree of the plant, and the approach is confined tominimum-phase plants. An extension to NMP plants is given in [12].

Example MRAC1. MRAC for a minimum-phase plant. Consider the continuous-timeunstable, minimum-phase triple integrator

G(s) =s+ 0.1

s3. (S9)

Let r be an alternating sequence of step commands with heights ±1, let d = v = 0, and let Gm

be the continuous-time reference model Gm(s) = 1s+1

. We use the MRAC algorithm given in[11, p. 355] for plants with relative degree 2. Figure S3 shows the closed-loop response. Next,we discretize the plant (S9) and reference model Gm with the sampling period h = 0.01 sec,yielding

G(q) =10−5(5q2 + 0.0067q− 5)

(q− 1)3, Gm(q) =

0.01

q− 0.99. (S10)

We apply RCAC to the MRAC problem with Rθ = 10−10, Ru = 0, and nc = 10. We use 20Markov parameters to construct the target model. RCAC asymptotically follows the output ofthe reference model, as shown in Figure S3. Table S1 shows the value of the performance metricε4=∑20000

k=1 z2(k) for the MRAC controller given in [11, p. 355] and for RCAC.

DRAFT

60

Finally, let the disturbance d be a step with height −0.2. Figure S4 shows the command-following performance for the MRAC controller given in [11, p. 355] and RCAC. Table S1shows the value of the performance metric for the MRAC controller and for RCAC. �

Method d ε

[11, p. 355] 0 216RCAC 0 132

[11, p. 355] −0.2 3667RCAC −0.2 153

TABLE S1: Performance metric ε for the MRAC controller given in [11, p. 355] and RCAC.Note that RCAC yields better performance in the case where d = 0 as well as in the presenceof a constant disturbance.

Example MRAC2. MRAC for a NMP plant. Consider the continuous-time unstable, NMPdouble integrator

G(s) =s− 0.0952

s2. (S11)

Let r be a sequence of step commands with sign-alternating heights, let d = v = 0, and letGm be the continuous-time reference model Gm(s) = s−0.0952

s2+30s+1. Note that G and Gm have

the same NMP zero. Attempting to use the MRAC algorithm given in [11, p. 345] yields anunstable closed-loop system (not shown). We discretize the plant (S9) and the reference modelGm(s) = 1

s2+30s+1with the sampling period h = 1 sec, yielding

G(q) =q− 1.1

(q− 1)2, Gm(q) =

0.03174q + 0.001078

q2 − 0.9672q. (S12)

We apply RCAC to the MRAC problem with Rθ = 0.002, Ru = 0, and nc = 6, and we usethe FIR target model (72). RCAC asymptotically follows the output of the reference model, asshown in Figure S5. �

Gc,k

Gm

[Gd Gu]u

dr

ym−

z

y0vyn

−z0

Gc,k

1

Figure S2: Transfer function representation of the model reference adaptive control problemwith the adaptive controller Gc,k.

DRAFT

61

0 50 100 150 200Time (sec)

-2

-1.5

-1

-0.5

0

0.5

1

1.5

2

y 0(k

)

rymy0

0 50 100 150 200Time (sec)

-2

-1.5

-1

-0.5

0

0.5

1

1.5

2

y 0(k

)

rymy0

Figure S3: Example MRAC1: MRAC for the minimum-phase triple integrator (S9). The MRACcontroller from [11, p. 355] (left) and RCAC (right) both follow the output of the reference model.

0 50 100 150 200Time (sec)

-6

-5

-4

-3

-2

-1

0

1

2

3

4

y 0(k

)

rymy0

0 50 100 150 200Time (sec)

-2

-1.5

-1

-0.5

0

0.5

1

1.5

2

y 0(k

)

rymy0

Figure S4: Example MRAC1: MRAC for the minimum-phase triple integrator (S9) in thepresence of a step disturbance. The left plot shows the response of the MRAC controller from[11, p. 355], and the right plot shows the response of RCAC.

0 1000 2000 3000 4000 5000Time Step

-10

-8

-6

-4

-2

0

2

4

6

8

y 0(k

)

rymy0

0 1000 2000 3000 4000 5000-40

-20

0

20

40

3(k)

0 1000 2000 3000 4000 5000Time Step

-20

-10

0

10

20

u(k

)

Figure S5: Example MRAC2: MRAC for the NMP double integrator (S12). RCAC asymptoti-cally follows the output of the reference model.

DRAFT

62

Sidebar S3: Discrete-Time LQG Control

Discrete-time LQG is not as widely used as the continuous-time version [S5], [S6]; therelevant equations for discrete-time LQG can be found in [79, p. 878], and a complete derivationin a more general context is given in [S7]. Solutions of the discrete-time Riccati equations arediscussed in [S8]. Here we focus on the high-authority LQG solution.

For the standard problem (1)–(3), define

R14= ET

1 E1 ∈ Rn×n, R124= ET

1 E2 ∈ Rn×lu , R24= ET

2 E2 ∈ Rlu×lu , (S13)

V14= D1D

T1 ∈ Rn×n, V12

4= D1D

T2 ∈ Rn×ly , V2

4= D2D

T2 ∈ Rly×ly . (S14)

Assuming that w is zero-mean Gaussian white noise with covariance Ilw , the nth-order strictly

proper LQG controller Gc ∼[Ac Bc

Cc 0

]minimizes

J(Ac, Bc, Cc)4= lim

k→∞E

[1

k

k∑

i=0

zT(i)z(i)

], (S15)

and is given by

Ac = A+BCc −BcC −BcD0Cc, (S16)

Bc = (AQCT + V12)(V2 + CQCT)−1, (S17)

Cc = −(R2 +BTPB)−1(RT12 +BTPA), (S18)

where the positive-semidefinite matrices P ∈ Rn×n and Q ∈ Rn×n are solutions of the discrete-time algebraic Riccati equations

P = ATRPAR − AT

RPB(R2 +BTPB)−1BTPAR + R1, (S19)

Q = AEQATE − AEQC

T(V2 + CQCT)−1CQATE + V1, (S20)

where

AR4= A−BR−1

2 RT12, R1

4= R1 −R12R

−12 RT

12, (S21)

AE4= A− V12V

−12 C, V1

4= V1 − V12V

−12 V T

12. (S22)

The eigenvalues of the closed-loop system are given by

mspec(A) = mspec(A+BCc) ∪mspec(A−BcC), (S23)

where

A4=

[A BCc

BcC Ac +BcD0Cc

](S24)

DRAFT

63

and “mspec” denotes the spectrum of a matrix including eigenvalue multiplicity. Under theassumptions

i) (A,B) is stabilizable.ii) (AR, R1) has no unobservable eigenvalues on the unit circle.

iii) (A,C) is detectable.iv) (AE, V1) has no uncontrollable eigenvalues on the unit circle.

it follows that (S19) and (S20) have unique positive-semidefinite solutions P and Q, and,furthermore, A is asymptotically stable.

Note that the LQG controller is independent of E0. This is due to the fact that, since LQGis based on the assumption that w is zero-mean Gaussian white noise, the contribution of E0w toJ(Ac, Bc, Cc) is not affected by the choice of Ac, Bc, and Cc. However, for the servo problemshown in Figure 3, E0 is not zero, and the command r, which is a component of w, is notGaussian white noise. Likewise, in some applications, the disturbance d in the servo problemis not Gaussian white noise, and thus the exogenous signal w in the standard problem is notGaussian white noise. Therefore, in these cases the LQG controller is not necessarily optimal.Nevertheless, for comparison with RCAC, we use the LQG controller in these cases withoutmodification. In examples SD1–SD3, we compare the closed-loop frequency response and H2

cost of RCAC to high-authority LQG.

Properties of Discrete-Time High-Authority LQG

In this section, we review properties of high-authority LQG, that is, the case where R2 = 0

and V2 = 0. In this case, AR = A, R1 = R1, AE = A, and V1 = V1. The properties of discrete-time high-authority LQG are analogous to the properties of continuous-time high-authority LQGgiven in [S5, pp. 281–289].

For simplicity, we assume that y, z, u, and w are scalar signals. We consider thefactorizations of the numerators Nzu and Nyw of Gzu and Gyw, respectively, given by

Nzu = HdzuNzu,sNzu,u, (S25)

Nyw = HdywNyw,sNyw,u, (S26)

where dzu ≥ 0 is the relative degree of Gzu, Hdzu is the first nonzero Markov parameter of Gzu,dyw ≥ 0 is the relative degree of Gyw, Hdyw is the first nonzero Markov parameter of Gyw, theroots of the monic polynomials Nzu,s and Nyw,s are the minimum-phase zeros of Gzu and Gyw,respectively, and the roots of the monic polynomials Nzu,u and Nyw,u are the NMP zeros of Gzu

and Gyw, respectively. Note that Hdzu is the leading nonzero coefficient of Nzu, and Hdyw is theleading nonzero coefficient of Nyw. With this notation it follows that

DRAFT

64

mspec(A+BCc) = mzeros(zdzuNzu,s(z)Nzu,u(z−1)), (S27)

mspec(A−BcC) = mzeros(zdywNyw,s(z)Nyw,u(z−1)), (S28)

where mzeros denotes the multiset of zeros of a rational function including multiplicity. Notethat the zeros of Nzu,u(z−1) are the reflections across the unit circle (that is, the reciprocals) ofthe NMP zeros of Gzu. For example, if Nzu,u(z) = z− 1.2, then Nzu,u(z−1) = 1−1.2z

z . It followsfrom (S27) and (S28) that the closed-loop poles of high-authority LQG control are the zeros of

DHA(z) = zdzu+dywNzu,s(z)Nzu,u(z−1)Nyw,s(z)Nyw,u(z−1). (S29)

It thus follows from (S23) that mspec(A) = mzeros(DHA). Similar observations are made forcontinuous-time systems in [S5] and for discrete-time systems in [42]. A surprising aspect ofhigh-authority LQG control is the fact that the poles and zeros of Gyu, which is present in thefeedback loop and thus determines the gain and phase margins, do not affect the locations ofthe closed-loop poles.

Example LQG1. High-authority LQG control for the standard problem with y 6= z, withstochastic w not matched with u, and with minimum-phase Gzu, Gzw, Gyu, and Gyw. Considerthe asymptotically stable, minimum-phase plant

A =

0.4 0.0958 0.1183 0.3162

0 0.81 1 0

0 −0.1539 0.81 0.4813

0 0 0 −0.5

, B =

0

0

0

1

, D1 =

−0.3807

−0.2039

0.1771

0.8844

, (S30)

C = [0.4456 0.0832 − 0.332 − 0.8272], D0 = 0, D2 = 0, (S31)

E1 = [−0.274 0.2625 0.3241 0.8666], E0 = 0, E2 = 0. (S32)

The open-loop and closed-loop poles are shown in Figure S6. Note that mspec(A+BCc) consistsof the zeros of Gzu as well as 0 with multiplicity 1, while mspec(A − BcC) consists of thezeros of Gyw as well as 0 with multiplicity 1. This example illustrates (S27) and (S28), whichrelate the closed-loop spectrum to the zeros of Gzu and Gyw. �

Example LQG2. High-authority LQG control for the standard problem with y 6= z, withstochastic w not matched with u, and with NMP Gzu, Gzw, Gyu, and Gyw, all of which havedifferent NMP zeros. Consider the asymptotically stable, NMP plant

A =

0.81 1 0

−0.1539 0.81 0.6998

0 0 0.5

, B =

0

0

1

, D1 =

0.1841

0.1074

0.9770

, (S33)

DRAFT

65

C = [0.9280 0.1102 0.3558], D0 = 0, D2 = 0, (S34)

E1 = [0.4531 − 0.3513 0.8193], E0 = 0, E2 = 0. (S35)

The open-loop and closed-loop poles are shown in Figure S7. Note that mspec(A+BCc) consistsof the reciprocals of the NMP zeros of Gzu as well as 0 with multiplicity 1, while mspec(A−BcC) consists of the reciprocals of the NMP zeros of Gyw as well as 0 with multiplicity 1. Thisexample illustrates (S27) and (S28) in the case where both Gyu and Gzw are NMP. This examplealso shows that the NMP zeros of Gzw and Gyu have no effect on the closed-loop poles. �

References

[S5] H. Kwakernaak and R. Sivan, Linear Optimal Control Systems. Wiley, 1972.[S6] K. Zhou, J. C. Doyle, and K. Glover, Robust and Optimal Control. Prentice Hall, 1996.[S7] W. M. Haddad, V. Kapila, and E. G. Collins, Jr., Optimality Conditions for Reduced-Order

Modeling, Estimation, and Control for Discrete-Time Linear Periodic Plants. J. Math. Sys.Est. Contr., vol. 6, pp. 437-460, 1996.

[S8] V. Ionescu, C. Oara, and M. Weiss, Generalized Riccati Theory and Robust Control. Wiley,1999.

-1 -0.5 0 0.5 1-1

-0.5

0

0.5

1

Gzu

Gzw

Gyu

Gyw

OL

-1 -0.5 0 0.5 1-1

-0.5

0

0.5

1 eGzw

Gzu

Gyw

Figure S6: Example LQG1: The closed-loop transfer function Gzw has one pole at each zeroof Gzu and Gyw as well as two poles at 0.

-1 -0.5 0 0.5 1 1.5-2

-1.5

-1

-0.5

0

0.5

1

1.5

2

Gzu

Gzw

Gyu

Gyw

OL

-1 -0.5 0 0.5 1 1.5-1.5

-1

-0.5

0

0.5

1

1.5 eGzw

Gzu

Gyw

Figure S7: Example LQG2: The closed-loop transfer function Gzw has one pole at the reciprocalof each NMP zero of Gzu and Gyw as well as two poles at 0.

DRAFT

66

Sidebar S4: Application to Control Saturation

The most common nonlinearity encountered in practice is control saturation, which canlead to integrator windup and possibly instability. Since control magnitude and rate saturationaffect all real-world control systems, it is not surprising that an extensive literature is devotedto this problem [S9]–[S14].

We now investigate the performance of RCAC in the presence of control magnitude andrate saturation, as shown in Figure S8. The output of RCAC is the requested control u(k), andthe input to the plant is the actual control ua(k). In all examples, the regressor Φ(k) containsua(k). This means that either the nonlinearity is known or its output is measured. The case wherethe nonlinearity is unknown and its output is not measured is considered in [46].

Example SAT1. Control magnitude and rate saturation for the adaptive servo problem.Consider the unstable, NMP triple integrator

G(q) =(q− 1.075)(q− 0.95)

(q− 1)3. (S36)

Let r be the ramp command r(k) = k. We apply RCAC with Rθ = 103, Ru = 0, and nc = 10,and we use the FIR target model (72). The control u is magnitude saturated at ±50, and rate(that is, control-change) saturated at ±60. The rate saturation limit of ±60 prevents bang-bangbehavior, but allows the control u to change sign when it reaches the magnitude limit. Despitethe magnitude and rate saturation, Figure S9 shows that RCAC asymptotically follows thecommand. Stabilization and tracking for a chain of integrators in the presence of controlsaturation is considered in [S15], [S16] under full-state feedback. �

Example SAT2. Control magnitude saturation for the adaptive servo problem withadaptive PID control. Consider the asymptotically stable, minimum-phase plant

G(q) =(q− 0.09)(q− 0.8)

(q2 − q + 0.5)(q− 0.9). (S37)

Let r be a sequence of step commands with heights ±0.4 and ±1, and let d = v = 0. We usethe PID controller structure (S1), and set Rθ = 0.1 and Ru = 0. The control u is saturated at±0.2. This saturation level allows the controller to follow step commands with height ±0.4,but not with height ±1. Figure S10 shows the response of the adaptive PID controller. For stepcommands with height ±0.4, the adaptive PID controller uses integral action to follow the stepcommand. However, for step commands with height ±1, the adaptive PID controller drives theintegral gain KI to zero. The reduction in KI allows RCAC to avoid integrator windup. �

DRAFT

67

References

[S9] D. S. Bernstein and A. N. Michel, A Chronological Bibliography on Saturating Actuators,Int. J. Robust Nonlinear Contr., vo. 5, pp. 375–580, 1995.

[S10] N. Kapoor, A. R. Teel, and P. Daoutidis, An Anti-windup Design for Linear Systems withInput Saturation, Automatica, vol. 34, pp. 559-574, 1998.

[S11] A. H. Glattfelder and W. Schaufelberger, Control Systems with Input and OutputConstraints, Springer, 2003.

[S12] P. Hippe, Windup in Control: Its Effects and Their Prevention, Springer, 2006.[S13] L. Zaccarian and A. R. Teel, Modern Anti-windup Synthesis: Control Augmentation for

Actuator Saturation, Princeton University Press, 2011.[S14] S. Tarbouriech, G. Garcia, J. M. G. da Silva, and I. Queinnec, Stability and Stabilization

of Linear Systems with Saturating Actuators, Springer, 2011.[S15] A. R. Teel, Global Stabilization and Restricted Tracking for Multiple Integrators with

Bounded Controls, Sys. Contr. Lett., vol. 18, pp. 165–171, 1992.[S16] T. Lauvdal, R. M. Murray, and T. I. Fossen, Stabilization of Integrator Chains in the

Presence of Magnitude and Rate Saturations: A Gain Scheduling Approach, Proc. Conf.Dec. Contr., San Diego, CA, pp. 4004–4005, 1997.

Gc,k[Gd Gu]u ua

d

r

−

z

z0

y0vyn

−

1

Figure S8: Adaptive servo problem with control magnitude and rate saturation.

DRAFT

68

0 50 100 150 200-100

-50

0

50

100

150

200

y 0(k

)

ry0, Mag Saty0, Mag & Rate Sat

0 50 100 150 200Time Step

-400

-300

-200

-100

0

100

200

300

u(k)

u

ua

0 50 100 150 200Time Step

-300

-200

-100

0

100

200

300

u(k)

u

ua

0 50 100 150 200Time Step

-300

-200

-100

0

100

200

300

400

u(k)!

u(k!

1)

u

ua

Figure S9: Example SAT1: Control magnitude and rate saturation for the adaptive servo problem.The upper left plot shows the closed-loop response, the green line shows the response in thepresence of magnitude saturation, and the blue line shows the response in the presence ofmagnitude and rate saturation. The upper right plot shows the requested and actual control inthe presence of magnitude saturation. The lower left plot shows the requested and actual controlin the presence of magnitude and rate saturation. The lower right plot shows the requestedand actual control change in the presence of magnitude and rate saturation. Despite the controlmagnitude and rate saturation, RCAC asymptotically follows the command.

0 2000 4000 6000 8000-2

-1

0

1

2

y0(k

)

r

y0

0 2000 4000 6000 8000Time Step

-2

-1

0

1

2

u(k

)

uau

0 2000 4000 6000 8000-0.02

0

0.02

0.04

0.06

3(k

)

KI

0 2000 4000 6000 8000Time Step

-1

0

1

2

3(k

)

KPKD

Figure S10: Example SAT2: Control magnitude saturation for the adaptive servo problem withadaptive PID control. For step commands with height ±0.4, the adaptive PID controller usesintegral action to follow the step command. However, for step commands with height ±1, whichcannot be followed due to the magnitude saturation, the adaptive PID controller reduces theintegral gain KI to zero, and thus avoids integrator windup.

DRAFT

69

Sidebar S5: Application to Nonlinear Oscillators

Aside from nonlinearity due to control saturation, it is interesting to apply RCAC to plantswith nonlinear dynamics and observe the resulting response. For illustration, we consider the Vander Pol and Duffing oscillators, which possess limit-cycle and bistable dynamics, respectively.

Example NLO1. Harmonic command following for the Van der Pol oscillator. Considerthe discretized Van der Pol oscillator

x1(k) = x1(k − 1) + hx2(k − 1), (S38)

x2(k) = x2(k − 1) + h[(1− x1(k − 1)2)x2(k − 1)− x1(k − 1) + u(k − 1)], (S39)

where h = 0.01 sec, y0(k) = x2(k), and z(k) = r(k)− y0(k). Let r be the harmonic commandr(k) = cosωk, where ω = 0.002 rad/sample, and let d = v = 0. We apply RCAC with Rθ = 1,Ru = 0, and nc = 10, and we use the FIR target model (71). Figure S11 shows the command-following performance.

Next, consider the harmonic command r(k) = 10 cosωk, where ω = 0.0008 rad/sample,and let d = v = 0. We apply RCAC with Rθ = 100, Ru = 0, and nc = 20, and we use the FIRtarget model (71). Figure S12 shows the command-following performance. Note that, in bothcases, the resulting trajectory is harmonic in both states despite the nonlinearities. �

Example NLO2. Harmonic command following for the Duffing oscillator. Consider thediscretized Duffing oscillator with constant disturbance

x1(k) = x1(k − 1) + hx2(k − 1), (S40)

x2(k) = x2(k − 1) + h[−14x2(k − 1) + 4x1(k − 1)− x3

1(k − 1) + u(k − 1) + 1], (S41)

where h = 0.01 sec. The open-loop plant has asymptotically stable equilibria (−1.86, 0)

and (2.11, 0). Let y0(k) = x2(k), z(k) = r(k) − y0(k), let r be a unit-amplitude harmoniccommand with frequency ω = 0.0025 rad/sample, and let v = 0. We linearize the plantabout the equilibrium xe = [0 0]T, and we use 10 Markov parameters of the linearizedmodel to construct the target model. We apply RCAC with Rθ = 10, Ru = 0, and nc = 10.Figure S13 shows the harmonic command-following performance for the initial conditionx(0) = [0.1 0.1]T. In the open-loop case with zero command, the open-loop plant approachesthe equilibrium (2.11, 0). The state x2 of the closed-loop system asymptotically follows theharmonic command. �

DRAFT

70

0 1000 2000 3000 4000-5

-4

-3

-2

-1

0

1

2

3

4

5

y 0(k

)r

y0

y0,OL

0 1000 2000 3000 4000Time Step

-50

-40

-30

-20

-10

0

10

20

30

40

50

u(k)

0 1000 2000 3000 4000Time Step

-5

-4

-3

-2

-1

0

1

2

3(k)

-3 -2 -1 0 1 2 3x1(k)

-3

-2

-1

0

1

2

3

x 2(k

)

Open loopClosed loopCL asymptotic

Figure S11: Example NLO1: Harmonic command following for the Van der Pol oscillator (S38)–(S39) with a harmonic command whose phase portrait is inside the limit cycle. The open-loopplant approaches the limit cycle, and RCAC asymptotically follows the harmonic command.

0 2000 4000 6000 8000 10000Time Step

-140

-120

-100

-80

-60

-40

-20

0

20

40

y 0(k

)

r

y0

0 2000 4000 6000 8000 10000Time Step

-8000

-6000

-4000

-2000

0

2000

4000

6000

8000

u(k)

0 2000 4000 6000 8000 10000Time Step

-8

-6

-4

-2

0

2

4

6

3(k)

-6 -4 -2 0 2 4x1(k)

-50

-40

-30

-20

-10

0

10

20

x 2(k

)

Closed loopCL asymptoticOpen loop

Figure S12: Example NLO1: Harmonic command following for the Van der Pol oscillator (S38)–(S39) with a harmonic command whose phase portrait is outside the limit cycle. The open-loopplant approaches the limit cycle, and RCAC asymptotically follows the harmonic command.

DRAFT

71

0 500 1000 1500 2000Time Step

-4

-3

-2

-1

0

1

2

3

4

y 0(k

)r

y0

y0,OL

0 500 1000 1500 2000Time Step

-40

-30

-20

-10

0

10

20

30

u(k)

0 500 1000 1500 2000Time Step

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

3(k)

0 0.5 1 1.5 2 2.5 3x1(k)

-4

-3

-2

-1

0

1

2

3

4

x 2(k

)

Open loopClosed loopCL asymptotic

Figure S13: Example NLO2: Harmonic command following for the Duffing oscillator (S40)–(S41). The lower plot shows the phase portraits of the open-loop limit cycle and the closed-loopresponse. The open-loop plant approaches the equilibrium (2.11, 0), and RCAC asymptoticallyfollows the harmonic command.

DRAFT

Retrospective Cost Adaptive Controldsbaero/library/RCAC_to_appear.pdf · 2017-04-06 · Adaptive...

Documents

Transcript of Retrospective Cost Adaptive Controldsbaero/library/RCAC_to_appear.pdf · 2017-04-06 · Adaptive...