The Model Reduction Enterprise



Chapter 1

The Model Reduction Enterprise

Dynamical systems are important tools for the modeling, prediction, and control of physical phenomena arising in many areas of human endeavor, such as regulation of heat dissipation in complex microelectronic devices, vibration suppression in large wind turbines, and the timely prediction of storm surges before an advancing hurricane. In many cases, direct numerical simulation may be the only possibility that can yield an accurate prediction or produce a feasible control for such complex phenomena. Yet the inevitable need for refined predictions and higher performance leads to the need for improved accuracy. This in turn drives the inclusion of more detail at the modeling stage, which in turn leads to consideration of larger-scale, ever more complex dynamical systems, often containing subsystems that are generated through spatial discretization of an underlying time-dependent system of coupled partial differential equations (PDEs). Simulations serving the goals of science and engineering in this way can create unmanageably large demands on computational resources.

Model reduction seeks to temper these demands while still meeting the needs for accuracy. In broad terms, model reduction seeks to transform large, complicated models of time-dependent processes into smaller, simpler models that are nonetheless capable of accurately representing the behavior of the original process under a variety of operating conditions. While this statement could describe many activities related to science and engineering, the principal focus of this book is on ways to accomplish the transformation from an original model to a reduced model systematically and without recourse to expert intervention. The ultimate goal is to produce efficient, methodical strategies that can yield dynamical systems of much lower order (that is, evolving in a substantially lower dimensional space, and hence requiring far fewer computational resources for simulation). Further, these lower-order dynamical systems should retain response characteristics that are very close to the original system. Such reduced models could be used as efficient surrogates for the original model by either replacing it as a component in larger simulations, or, in other contexts, enabling the development of simpler and faster controllers that are better suited to real-time applications. Our plan to accomplish this involves system-theoretic ideas blended with scalable computational linear algebra strategies.

To illustrate a typical context where model reduction provides dramatic benefits, we consider an important quality control step in chip manufacturing, physical verification, in which a manufactured chip sample is tested with a wide variety of random inputs, and then observed responses are required to match up with simulated responses using the same inputs. To be useful, the simulated responses must be generated quickly. But quick simulation using the original chip model is simply not feasible due to a combination of efficiency and accuracy constraints. On the other hand, reduced-order models having order of a few hundred can adequately capture the full-order system response in most cases. So, simulation based upon a reduced model allows completion of the computation and comparison with observed behavior in a reasonable amount of time. Naturally, it is essential that the fidelity of the reduced-order simulations be sufficient to accurately represent the original chip response and, further, that salient physical features present in the response of the original chip model be preserved in the reduced model response.

Copyright © 2020 Society for Industrial and Applied Mathematics From Interpolatory Methods for Model Reduction - Antoulas, Beattie, Güğercin (9781611976076)

The last two decades have seen significant advances in model reduction: new methods have been developed; advances in numerics and increases in computational power have dramatically expanded applicability and utility of existing methods; and all the while, model reduction has proved vital in several state-of-the-art endeavors including optimization, inverse problems, and uncertainty quantification. The proper orthogonal decomposition (POD), balanced truncation, and interpolatory methods have emerged as the most frequently used methods among those few that remain feasible for the reduction of large-scale dynamical systems. In this book, we focus on interpolatory methods. We consider both linear and nonlinear dynamical systems, as well as systems that are parameter dependent. We develop and analyze both the projection modeling framework, where state-space equations, i.e., internal dynamics, are presumed known and explicitly specified, as well as the data-driven modeling framework, where only input/output measurements for the system are available and no direct access or knowledge of internal dynamics is presumed.

Interpolatory model reduction methods have matured quickly in the last decade, evolving from earlier formulations that bore labels such as rational Krylov methods and moment matching methods. New developments include algorithms for constructing (locally) optimal interpolatory reduced models at modest cost, extension of these methods to the reduction of nonlinear and parametric systems, and the ability to produce reduced models directly from input/output data without having the benefit of knowledge of internal system dynamics (or, taken from a different perspective, without bearing the burden of requiring access to internal system dynamics). These methods have been adopted and used by a growing number of researchers and practitioners and have emerged as one of the few leading choices that are available for use with truly large-scale problems.

Interpolatory model reduction methods have their roots in approximation theory, numerical analysis, and linear algebra; they include highly refined methods that exploit the structure of linear dynamical systems and are closely related to rational interpolation and Padé approximation. Indeed, the main thrust of the approaches we describe, at least in the context of linear dynamical systems, involves generating reduced models that interpolate the original system in a very specific sense, which we describe later. As a practical matter, these methods typically require only the solution of large sparse linear systems and are capable of producing locally optimal, high-fidelity reduced models. Although the basic working framework of interpolation methods exploits the linearity of the dynamical system of interest, one is able to be far more ambitious and demanding in the quality of outcomes that are achieved, and, indeed, the methods we discuss here are often dramatically successful in this setting.
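As a rough illustration of what interpolation can mean for a linear system, the following sketch builds a small model, forms a reduced model by projection, and checks that the reduced transfer function $\mathbf{C}_r(s\mathbf{E}_r - \mathbf{A}_r)^{-1}\mathbf{B}_r$ matches the full one $\mathbf{C}(s\mathbf{E} - \mathbf{A})^{-1}\mathbf{B}$ at a few chosen points. Everything in it is an assumption made for illustration rather than something taken from the text: the test matrices, the interpolation points `sigmas`, the helper `transfer`, and the choice of a one-sided Galerkin projection onto the span of the vectors $(\sigma_i\mathbf{E} - \mathbf{A})^{-1}\mathbf{B}$. The precise sense of interpolation used in this book is described in later chapters.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200                                   # full order (hypothetical test size)

# Hypothetical full-order model E x' = A x + B u, y = C x
E = np.eye(n)
A = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)   # symmetric, stable
B = rng.standard_normal((n, 1))
C = rng.standard_normal((1, n))


def transfer(E, A, B, C, s):
    """Evaluate H(s) = C (sE - A)^{-1} B with one linear solve."""
    return (C @ np.linalg.solve(s * E - A, B)).item()


# Assumed interpolation points; each one costs a linear solve with (sigma E - A)
sigmas = [0.1, 1.0, 10.0, 100.0]
V = np.column_stack([np.linalg.solve(s * E - A, B)[:, 0] for s in sigmas])
V, _ = np.linalg.qr(V)                    # orthonormal basis with the same span

# Galerkin projection gives a reduced model of order r = len(sigmas)
Er, Ar, Br, Cr = V.T @ E @ V, V.T @ A @ V, V.T @ B, C @ V

# The reduced transfer function should agree with the full one at each sigma
interp_errors = [abs(transfer(E, A, B, C, s) - transfer(Er, Ar, Br, Cr, s))
                 for s in sigmas]
# Away from the interpolation points the two generally differ
off_point_error = abs(transfer(E, A, B, C, 3.0) - transfer(Er, Ar, Br, Cr, 3.0))
print(max(interp_errors), off_point_error)
```

Note that building the basis only required linear solves with shifted versions of the full-order matrices, which is consistent with the remark above about the computational kernel of these methods; in a genuinely large-scale setting one would use sparse solvers rather than the dense calls used in this toy example.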

The general outlook for model reduction methods that are applicable to nonlinear systems is more complex, which is perhaps not surprising considering the vast range of types of nonlinearity that may be encountered. Of course, one may consider linear dynamical systems as a special trivial case within the wider class of nonlinear dynamical systems, but one often finds that methods applicable to nonlinear systems can give disappointing results when applied to linear systems. Indeed, large-scale linear dynamical systems can present significant model reduction challenges themselves, and the field matured first in the development of methods to meet these challenges, drawing heavily on techniques that have their roots in large-scale computational linear algebra and systems theory. Conversely, to the extent that many methods for approaching nonlinear systems build upon the analysis of carefully chosen linearizations of such systems, refined approaches for linear dynamical systems can play a role in producing effective strategies for the reduction of large-scale nonlinear dynamical systems as well; see, for instance, [45]. Our approach to nonlinear systems goes significantly further: by narrowing the scope of problems and the character of nonlinearities that one considers, we are able to extend methods based on flexibly configured methods for linear dynamical systems into domains that include bilinear and quadratic nonlinearities, and, remarkably, this extension itself opens up many possibilities for systematic model reduction approaches for broader classes of nonlinearities (e.g., see the discussion of the FitzHugh–Nagumo model in Section 1.2.4).

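1.1 The problem setting

Dynamical systems in the context that we consider here are characterized by their input-output map $\mathcal{S} : \mathbf{u} \mapsto \mathbf{y}$, mapping inputs $\mathbf{u}(t) \in \mathbb{R}^m$ to outputs $\mathbf{y}(t) \in \mathbb{R}^p$ that we write initially as

\[
\mathcal{S}:\ \left\{
\begin{aligned}
\mathbf{E}(\mathbf{p})\,\dot{\mathbf{x}}(t;\mathbf{p}) &= \mathbf{A}(\mathbf{p})\,\mathbf{x}(t;\mathbf{p}) + \mathbf{B}(\mathbf{p})\,\mathbf{u}(t) + \mathbf{f}(\mathbf{x},\mathbf{u},\mathbf{p}),\\
\mathbf{y}(t;\mathbf{p}) &= \mathbf{g}(\mathbf{x},\mathbf{p}) + \mathbf{D}(\mathbf{p})\,\mathbf{u}(t),
\end{aligned}
\right.
\qquad \text{with } \mathbf{x}(0) = \mathbf{x}_0. \tag{1.1}
\]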

The dependence on $\mathbf{p}$ here represents a possible parametrization, allowing for the common circumstance that one develops reduced models not just for a single full-scale model but also for a family of full-scale models (indexed here by $\mathbf{p}$). For any fixed $\mathbf{p}$, $\mathbf{E}(\mathbf{p}), \mathbf{A}(\mathbf{p}) \in \mathbb{R}^{n\times n}$, $\mathbf{B}(\mathbf{p}) \in \mathbb{R}^{n\times m}$, and $\mathbf{D}(\mathbf{p}) \in \mathbb{R}^{p\times m}$ are constant matrices, and $\mathbf{f}(\,\cdot\,,\,\cdot\,;\mathbf{p}) : \mathbb{R}^n \times \mathbb{R}^m \mapsto \mathbb{R}^n$ and $\mathbf{g}(\,\cdot\,;\mathbf{p}) : \mathbb{R}^n \mapsto \mathbb{R}^p$ are (jointly) continuously differentiable functions. Here $\mathbf{u}(t) \in \mathbb{R}^m$ and $\mathbf{y}(t;\mathbf{p}) \in \mathbb{R}^p$ are, respectively, the input and the output of the system $\mathcal{S}$, while $\mathbf{x}(t;\mathbf{p}) \in \mathbb{R}^n$ is an internal variable that we refer to as the state of the system.⁴ Typically, we are interested in situations where $n \gg \max(m, p)$, allowing also for cases where either $n \gg m$ or $n \gg p$ (but possibly not both). $\mathcal{S}$ is a single-input/single-output (SISO) system when $m = p = 1$ (scalar-valued input, scalar-valued output); $\mathcal{S}$ will be designated a multi-input/multi-output (MIMO) system otherwise.

Systems of the form (1.1) with extremely large state-space dimension, $n$, arise in many disciplines; see the discussion in the following section as well as [22] and [75] for a collection of such examples. Despite evolving in a large-dimensional state space, in many cases the system trajectories, $\mathbf{x}(t;\mathbf{p})$, will hew closely to subspaces having substantially lower dimension. That is, they evolve in ways that do not fully occupy the state space; the original model $\mathcal{S}$ behaves very nearly as if it had many fewer internal degrees of freedom. State-space dimension is commonly used as a proxy for system complexity and the level of computational resources required for simulation. Given the rich variety of dynamical systems and the vast range of strategies employed for simulation, this statement should be viewed cautiously as a rough heuristic, plausible, at least, in the sense that, with all other things being equal,⁵ the simulation of a dynamical system will demand computational resources that grow proportionally with some power of the state-space dimension.
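One informal way to see this low-dimensional behavior is to simulate a system, collect state snapshots, and look at the singular values of the snapshot matrix. The sketch below does this for a hypothetical one-dimensional heat-conduction model with a single point input. The model, the input signal, the step size, and the tolerance `1e-8` are all illustrative assumptions, and the singular value decomposition is used here only as a diagnostic, not as the reduction method developed in this book.

```python
import numpy as np

n, steps, dt = 300, 200, 1e-3
h = 1.0 / (n + 1)

# Hypothetical full-order model x' = A x + b u (1D heat equation, E = I)
A = (-2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)) / h**2
b = np.zeros(n)
b[n // 2] = 1.0                           # point source at the midpoint

# Implicit Euler: (I - dt A) x_{k+1} = x_k + dt * b * u_k
# (a dense inverse is acceptable only because this demo is tiny)
step = np.linalg.inv(np.eye(n) - dt * A)

x = np.zeros(n)
snapshots = []
for k in range(steps):
    u_k = np.sin(2.0 * np.pi * 5.0 * k * dt)      # assumed input signal
    x = step @ (x + dt * b * u_k)
    snapshots.append(x)

X = np.column_stack(snapshots)            # n-by-steps snapshot matrix
sing_vals = np.linalg.svd(X, compute_uv=False)

# Number of directions needed to capture the snapshots to relative tol 1e-8
numerical_rank = int(np.sum(sing_vals > 1e-8 * sing_vals[0]))
print(n, steps, numerical_rank)
```

For a diffusive model like this one, the count of significant singular values should come out well below both the state dimension and the number of snapshots, which is the sense in which trajectories occupy only a small part of the state space.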

The first goal of model reduction, then, is to produce a surrogate dynamical system that evolves in a much lower dimensional state space (say, of dimension $r \ll n$) yet still mimics the original dynamical system, recovering very nearly the original input-output map. “Very nearly” in this context means that we want the reduced input-output map, $\mathcal{S}_r : \mathbf{u} \mapsto \mathbf{y}_r$, to be close to the full-order input-output map, $\mathcal{S}$, in an appropriate sense that is made precise in later chapters.

⁴ If $\mathbf{E}(\mathbf{p})$ is singular, (1.1) is a differential algebraic equation (DAE) (see Chapter 9) and the solution trajectory, $\mathbf{x}(t;\mathbf{p})$, might only feasibly occupy a proper subset of $\mathbb{R}^n$. Although in such circumstances, $\mathbf{x}(t;\mathbf{p})$ harbors implicit constraints and is not properly considered a system state, we will nonetheless continue to refer to $\mathbf{x}(t;\mathbf{p})$ in this way in order to avoid mental clutter. In this context, $\mathbb{R}^n$ will still be referred to as the state space as well.

⁵ They never are.


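Since the input-output map $\mathcal{S}_r$ is associated with a smaller version of the original model, we seek to describe it with a similar realization:

\[
\mathcal{S}_r:\ \left\{
\begin{aligned}
\mathbf{E}_r(\mathbf{p})\,\dot{\mathbf{x}}_r(t;\mathbf{p}) &= \mathbf{A}_r(\mathbf{p})\,\mathbf{x}_r(t;\mathbf{p}) + \mathbf{B}_r(\mathbf{p})\,\mathbf{u}(t) + \mathbf{f}_r(\mathbf{x}_r,\mathbf{u},\mathbf{p}),\\
\mathbf{y}_r(t;\mathbf{p}) &= \mathbf{g}_r(\mathbf{x}_r,\mathbf{p}) + \mathbf{D}_r(\mathbf{p})\,\mathbf{u}(t),
\end{aligned}
\right.
\qquad \text{with } \mathbf{x}_r(0) = \mathbf{x}_{r0}. \tag{1.2}
\]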

The input dimension, $m$, and the output dimension, $p$, for the reduced model are the same as, respectively, the input dimension and the output dimension for the original model; only internal state-space dimensions differ: $r \ll n$. There may be attributes in the specification of a full-order model, $\mathcal{S}$, that reflect underlying conservation laws or other physically important structural features that one might wish to be reflected in any “similar” (albeit smaller) realization.

This reduction in order inevitably comes at a cost, and generally one anticipates that repeated use of the reduced model as a surrogate for the full-order model will accrue savings that should balance the initial construction cost of the reduced model. It is important that clear goals be laid out to provide context for the choice of $\mathbf{E}_r$, $\mathbf{A}_r$, $\mathbf{B}_r$, $\mathbf{D}_r$, $\mathbf{f}_r$, and $\mathbf{g}_r$:

Goals for reduced-order modeling:

1. The reduced input-output map $\mathcal{S}_r$ should be uniformly close to $\mathcal{S}$ in an appropriate sense. That is, when both the full and reduced systems are presented with the same inputs, $\mathbf{u}(t)$, the difference between full and reduced system outputs, $\mathbf{y} - \mathbf{y}_r$, should be small with respect to a physically relevant norm over a wide range of system inputs, such as over all $\mathbf{u}(t)$ with bounded energy (e.g., in the unit ball of $L_2([0,\infty),\mathbb{R}^m)$).

2. Critical system structure should be preserved in the reduced-order system. This can include parametric dependence, and second-order and internal-delay structure.

3. Strategies for obtaining the reduced input-output map, $\mathcal{S}_r$, should lead to numerically stable, efficient algorithms that require minimal application-specific tuning and little or no expert intervention. Methods should be robust and largely automatic to allow the broadest level of flexibility and applicability in complex multiphysics settings.

Computational efficiency enters the game at many different levels, but minimally we must be certain that the treatment of very large problems remains tractable. At least with regard to current practice, many elegant model reduction approaches requiring the solution of linear matrix inequalities are removed from serious consideration for this reason (although hybrid methods that incorporate these ideas remain in play). Computational efficiency plays a more subtle role here: the cost of model reduction must be regarded in the broader context of efficiency gains that accrue from later use of the reduced model as a high-fidelity surrogate for the original full model, e.g., in design optimization or the development of control algorithms. The importance of having methods that may be deployed without significant need for expert oversight acknowledges the increasing relative cost of human intervention as well as the enormous increase of complexity that occurs when systems from different physical domains are interconnected, confounding “common wisdom” and making appropriate expert intervention much more difficult.

1.2 Motivating examples

Model reduction is an activity that comes up in diverse settings, often coupled with special goals and contexts, so that it can be difficult to identify “typical” applications. Nonetheless, it may be useful to motivate the model reduction enterprise with a few examples drawn from various application domains.


• The first example arises in diffuse optical tomography, where a model of optical transmission through heterogeneous media with reduced models of parametrized diffusion and absorption forms the core of a nonlinear inversion process that underlies an image reconstruction problem that serves diagnostic goals. The parametric dependence is both nonlinear and of large order, and so significant benefits accrue from using parametrized reduced models as surrogates for the original optical transmission model so as to reduce cost and improve diagnostic utility.

• The second example describes the role of reduced models in reducing energy consumption associated with close control of interior environments in office and residential buildings. HVAC (heating, ventilation, and air conditioning) is a major component of urban energy consumption; close regulation that ensures a comfortable environment without energy waste entails highly complex and flexible models of air and heat transport that may lead to overly complex and failure-prone controllers. Reduced models play an important role in the derivation and implementation of responsive, efficient controllers for these settings.

• The third example describes an application where model reduction helps accelerate the design and certification cycle in the early development stages for modern commercial aircraft. This generally involves expensive simulations that must be done repeatedly with every design modification, but one may streamline the process by maintaining reduced models for subsystems that mostly remain the same through many design cycles, so that the bulk of simulation resources are directed toward the portions of the system that change significantly in that cycle.

• The fourth and final example illustrates the construction of reduced models for a distributed FitzHugh–Nagumo system, which is a prototype for many nonlinear reaction-diffusion processes. This simple example captures many of the key recent advances in model reduction that have dramatically changed the landscape for model reduction of nonlinear systems, yet are direct outgrowths of a system-theoretic framework that until recently was fully articulated mainly for linear dynamical systems.

1.2.1 Image reconstruction in diffuse optical tomography

Nonlinear inverse problems appear frequently in many fields of science and engineering. Image reconstruction is one example of such an inverse problem, and if one interprets the notion of an image broadly, a closely related example involves the identification and spatial localization of material anomalies within a given substrate. In such inverse problems, one wishes to identify through image reconstruction the spatial distribution and physical extent of something that is not directly observable and generally problematic, such as tumors in the body [98], contaminant pools in the earth [199], or cracks in a material sample [283]. The principal tool linking reconstructed images that reflect information on sought-after anomalies to data that may be concretely observed and measured is a mathematical model describing how observable and measurable quantities can be expressed parametrically with respect to the (otherwise unobservable) spatial distribution of material anomalies; this is the forward model. The deviation between what these forward models predict and what is actually observed is attributable (ideally) to errors in the hypothesized spatial distribution and, hence, may be viewed as an objective function that achieves its minimum value at a spatial distribution that is at least approximately close to the true spatial distribution of material anomalies. In this way, the solution of an inverse problem is recast as an optimization problem, and so these forward models must be evaluated/simulated many times (every time the objective function is evaluated). In most applications, the forward models are large-scale, discretized 2D or 3D PDEs, and thus are expensive to evaluate. This typically constitutes the largest computational bottleneck in inverse problems, and model reduction can be very effective in making the forward models dramatically cheaper to evaluate. Below, following [117], we give more details for a specific inverse problem, diffuse optical tomography. However, the discussion extends to similar inverse problems.

Diffuse optical tomography is an imaging procedure that may be considered as a radiation-free alternative to X-ray mammography. For this application, the breast is in compression between two parallel plates containing optical sources and detectors. Assume then that the breast tissue to be imaged is contained in a rectangular cuboid, $\Omega = [-a_1, a_1] \times [-a_2, a_2] \times [-a_3, a_3]$, and let $\mathbf{z} = (z_1, z_2, z_3)^T$ refer to spatial location within $\Omega$. The top surface ($z_3 = a_3$) and the bottom surface ($z_3 = -a_3$) of this tissue slab will be denoted as $\partial\Omega_+$ and $\partial\Omega_-$, respectively. The lateral surfaces where either $z_1 = \pm a_1$ or $z_2 = \pm a_2$ will be denoted by $\Gamma$. Arridge [32] developed a diffusion model for photon flux $\eta(\mathbf{z}, t)$ driven by an input source $g(\mathbf{z}, t)$ that is selected out of a set of $n_{\mathrm{src}}$ possible sources. These sources are assumed to be physically stationary, independently driven, and positioned on the top plate, $\partial\Omega_+$. Let $b_j(\mathbf{z})$, $j = 1, \ldots, n_{\mathrm{src}}$, be functions describing the transmittance field of the $j$th source, so that $g(\mathbf{z}, t) = b_j(\mathbf{z})\,u_j(t)$ for some $j$ and some given pulse profile $u_j(t)$. Suppose that there are $n_{\mathrm{det}}$ optical sensors located on both top and bottom plates, $\partial\Omega_\pm$, and let $y_i(t)$, $i = 1, \ldots, n_{\mathrm{det}}$, denote the observations of tissue illumination by the combined sources. The response characteristics of the sensors are presumed to be captured by functions $c_i(\mathbf{z})$, so that $y_i(t) = \int_{\partial\Omega} c_i(\mathbf{z})\,\eta(\mathbf{z}, t)\,d\mathbf{z}$, $i = 1, \ldots, n_{\mathrm{det}}$.

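Following [32], the model for illumination of the tissue contained in $\Omega$ is given by

\begin{align}
\frac{1}{\nu}\,\partial_t \eta(\mathbf{z}, t) &= \nabla \cdot \bigl( D(\mathbf{z})\,\nabla \eta(\mathbf{z}, t) \bigr) - \mu(\mathbf{z})\,\eta(\mathbf{z}, t) + b_j(\mathbf{z})\,u_j(t) && \text{for } \mathbf{z} \in \Omega, \tag{1.3}\\
0 &= \eta(\mathbf{z}, t) + 2 A\, D(\mathbf{z})\,\frac{\partial}{\partial \xi}\eta(\mathbf{z}, t) && \text{for } \mathbf{z} \in \partial\Omega_\pm, \tag{1.4}\\
0 &= \eta(\mathbf{z}, t) && \text{for } \mathbf{z} \in \Gamma, \tag{1.5}\\
0 &= \eta(\mathbf{z}, 0), \tag{1.6}\\
y_i(t) &= \int_{\partial\Omega} c_i(\mathbf{z})\,\eta(\mathbf{z}, t)\,d\mathbf{z} && \text{for } i = 1, \ldots, n_{\mathrm{det}}, \tag{1.7}
\end{align}

where $D(\mathbf{z})$ and $\mu(\mathbf{z})$ denote diffusion and absorption coefficients, respectively; $A$ is a constant related to diffusive boundary reflection; and $\xi$ denotes the outward unit normal.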

In general, the scalar fields defined by $D(\mathbf{z})$ and $\mu(\mathbf{z})$ are only partially known. Then, the goal of the inverse problem is to utilize observations, $\mathbf{y}(t) = (y_1(t), y_2(t), \ldots, y_{n_{\mathrm{det}}}(t))^T$, made when the system is illuminated by a variety of sources, $\mathbf{u}(t) = (u_1(t), u_2(t), \ldots, u_{n_{\mathrm{src}}}(t))^T$, in order to more accurately determine $D(\mathbf{z})$ and $\mu(\mathbf{z})$. Accurate determination of $D(\mathbf{z})$ and $\mu(\mathbf{z})$ amounts to the “image reconstruction” problem yielding information about distribution and extent of tissue anomalies (such as tumors). For simplicity we proceed as in [117] and assume that the diffusivity $D(\mathbf{z})$ is constant and known so that only the absorption field, $\mu(\mathbf{z})$, remains to be determined. We assume further that the absorption field, $\mu(\cdot)$, although unknown, is expressible in terms of a finite set of parameters, $\mathbf{p} = [p_1, \ldots, p_\nu]^T$. An effective parametrization of $\mu(\cdot) = \mu(\cdot, \mathbf{p})$ can be achieved by radial basis functions; see, e.g., [1]. A further simplification involves averaging over the tissue thickness, effectively producing a two-dimensional model of the same form but constraining $a_2 \to 0$ so that $\Omega$ becomes a rectangle in the $z_2$ plane with Dirichlet conditions at $z_1 = \pm a_1$ and Robin conditions on the top and bottom edges where the sources and detectors are located, $z_3 = \pm a_3$.⁶

⁶ A three-dimensional formulation using the same ideas developed here is considered in [275].


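Spatial discretization of (1.3)–(1.7) yields a differential algebraic system of equations given by

\[
\mathbf{E}\,\dot{\mathbf{x}}(t;\mathbf{p}) = \mathbf{A}(\mathbf{p})\,\mathbf{x}(t;\mathbf{p}) + \mathbf{B}\,\mathbf{u}(t)
\quad \text{with} \quad \mathbf{y}(t;\mathbf{p}) = \mathbf{C}\,\mathbf{x}(t;\mathbf{p}). \tag{1.8}
\]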

Here $\mathbf{x}$ denotes the (discretized) photon flux; $\mathbf{y} = [y_1, \ldots, y_{n_{\mathrm{det}}}]^T$ is the vector of detector outputs; $\mathbf{C}\mathbf{x}$ constitutes a set of quadrature rules for (1.7) applied to the discretized photon flux; the columns of $\mathbf{B}$ are discretizations of the source “footprints” $b_j(\mathbf{z})$ for $j = 1, \ldots, n_{\mathrm{src}}$; and $\mathbf{A}(\mathbf{p}) = \mathbf{A}_0 + \mathbf{A}_1(\mathbf{p})$, with $\mathbf{A}_0$ and $\mathbf{A}_1(\mathbf{p})$ corresponding to the diffusion and absorption terms, respectively ($\mathbf{A}_1(\mathbf{p})$ inherits the absorption field parametrization, $\mu(\cdot, \mathbf{p})$). $\mathbf{E}$ is singular, reflecting the inclusion of the discretized Robin condition (1.4) as an algebraic constraint.

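Suppose $\hat{\mathbf{y}}(\imath\omega;\mathbf{p})$ denotes the Fourier transform of $\mathbf{y}(t;\mathbf{p})$. Then, for a parameter choice $\mathbf{p}$, associated with the absorption field $\mu(\cdot,\mathbf{p})$, the vector of estimated observations attributable to the $i$th input source at frequency $\omega_j$ is predicted by the forward model to be $\hat{\mathbf{y}}_i(\omega_j;\mathbf{p}) \in \mathbb{C}^{n_{\mathrm{det}}}$. If we stack the estimated observation vectors for the $n_{\mathrm{src}}$ sources and $n_\omega$ frequencies, we obtain

\[
\mathbb{Y}(\mathbf{p}) = \bigl[\hat{\mathbf{y}}_1(\omega_1;\mathbf{p})^T, \ldots, \hat{\mathbf{y}}_1(\omega_{n_\omega};\mathbf{p})^T, \hat{\mathbf{y}}_2(\omega_1;\mathbf{p})^T, \ldots, \hat{\mathbf{y}}_{n_{\mathrm{src}}}(\omega_{n_\omega};\mathbf{p})^T\bigr]^T,
\]

which is a (complex) vector of dimension $n_{\mathrm{det}} \cdot n_{\mathrm{src}} \cdot n_\omega$. We construct the corresponding empirical observation vector, $\mathbb{D}$, from acquired data. Then, the optimization problem corresponding to the underlying nonlinear parameter inversion problem for diffuse optical tomography is

\[
\min_{\mathbf{p} \in \mathbb{R}^{\ell}} \ \| \mathbb{Y}(\mathbf{p}) - \mathbb{D} \|_2 \tag{1.9}
\]

such that

\[
\begin{aligned}
\mathbf{E}\,\dot{\mathbf{x}}(t;\mathbf{p}) &= \mathbf{A}(\mathbf{p})\,\mathbf{x}(t;\mathbf{p}) + \mathbf{B}\,\mathbf{u}(t),\\
\mathbf{y}(t;\mathbf{p}) &= \mathbf{C}\,\mathbf{x}(t;\mathbf{p}).
\end{aligned} \tag{1.10}
\]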

Note that (1.9) is a constrained optimization problem. Solving this optimization problem requires repeated evaluation of the cost function and its Jacobian, which, in this case, correspond to evaluating the large-scale discretized problem (1.10) in the frequency domain for every parameter point $\mathbf{p}^{(k)}$ that the optimization algorithm visits. More specifically, for this problem, the function and Jacobian evaluations require solving large-scale linear systems for computing $\bigl(\imath\omega\mathbf{E} - \mathbf{A}(\mathbf{p}^{(k)})\bigr)^{-1}\mathbf{B}$ and $\mathbf{C}\bigl(\imath\omega\mathbf{E} - \mathbf{A}(\mathbf{p}^{(k)})\bigr)^{-1}$ for $k = 1, 2, \ldots, k_{\mathrm{opt}}$; see [117]. This is the main computational bottleneck and is where model reduction is employed. Model reduction in this problem replaces the large-scale dynamics (1.10) with the “reduced” version

\[
\mathbf{E}_r\,\dot{\mathbf{x}}_r(t;\mathbf{p}) = \mathbf{A}_r(\mathbf{p})\,\mathbf{x}_r(t;\mathbf{p}) + \mathbf{B}_r\,\mathbf{u}(t)
\quad \text{with} \quad \mathbf{y}_r(t;\mathbf{p}) = \mathbf{C}_r\,\mathbf{x}_r(t;\mathbf{p}), \tag{1.11}
\]

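where $\mathbf{E}_r$ and $\mathbf{A}_r$ have drastically smaller column and row dimensions, $\mathbf{B}_r$ has a drastically smaller column dimension (its columns have length $r$ rather than $n$), and $\mathbf{C}_r$ has a drastically smaller row dimension (its rows have length $r$ rather than $n$). Thus, throughout the optimization, whenever the cost function and the Jacobian need to be evaluated, we use the reduced (surrogate) model and approximate these quantities by solving much smaller linear systems, namely

\[
\bigl(\imath\omega\mathbf{E}_r - \mathbf{A}_r(\mathbf{p}^{(k)})\bigr)^{-1}\mathbf{B}_r
\quad \text{and} \quad
\mathbf{C}_r\bigl(\imath\omega\mathbf{E}_r - \mathbf{A}_r(\mathbf{p}^{(k)})\bigr)^{-1}.
\]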
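The cost argument above can be made concrete with a small sketch. It sets up a hypothetical one-dimensional stand-in for (1.10) with an affine parametrization $\mathbf{A}(\mathbf{p}) = \mathbf{A}_0 + \sum_k p_k \mathbf{A}_{1,k}$, builds a projection basis from a few full-order solves, and then evaluates the stacked observation vector with the reduced matrices. Several assumptions should be noted: the grid, the source and detector locations, the Gaussian basis functions, and the sampled parameters and frequencies are invented for illustration; $\mathbf{E}$ is taken to be the identity, whereas the $\mathbf{E}$ of the text is singular; and the basis is a simple one-sided Galerkin choice, not necessarily the construction used in [117].

```python
import numpy as np

n, n_src, n_det = 400, 2, 3
z = np.linspace(0.0, 1.0, n)
h = z[1] - z[0]

# Hypothetical stand-in for (1.10): E x' = A(p) x + B u, y = C x
E = np.eye(n)                                    # assumption: nonsingular E
A0 = 1e-3 * (-2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)) / h**2
# Absorption field mu(z; p) = sum_k p_k * phi_k(z), Gaussian basis functions
phis = [np.exp(-((z - c) / 0.1) ** 2) for c in (0.3, 0.7)]
A1_terms = [-np.diag(phi) for phi in phis]
B = np.zeros((n, n_src))
B[50, 0] = 1.0
B[350, 1] = 1.0
C = np.zeros((n_det, n))
C[0, 100] = 1.0
C[1, 200] = 1.0
C[2, 300] = 1.0


def assemble(A0, A1_terms, p):
    """A(p) = A0 + sum_k p_k * A1_k (affine in p)."""
    return A0 + sum(pk * Ak for pk, Ak in zip(p, A1_terms))


def stacked_observations(E, A0, A1_terms, B, C, p, omegas):
    """Stack C (i*omega*E - A(p))^{-1} B source by source, then by frequency."""
    A = assemble(A0, A1_terms, p)
    H = [C @ np.linalg.solve(1j * w * E - A, B) for w in omegas]
    return np.concatenate([H[j][:, i] for i in range(B.shape[1])
                           for j in range(len(omegas))])


omegas = [1.0, 10.0]
p_samples = [(0.5, 0.5), (2.0, 1.0)]      # assumed sampling of parameter space

# Basis from full-order solves at the sampled (omega, p) pairs
cols = []
for p in p_samples:
    A = assemble(A0, A1_terms, p)
    for w in omegas:
        X = np.linalg.solve(1j * w * E - A, B)
        cols += [X.real, X.imag]          # keep the basis real
V, _ = np.linalg.qr(np.hstack(cols))

# Reduced matrices are computed once; A_r(p) stays affine in p
Er, A0r = V.T @ E @ V, V.T @ A0 @ V
A1r_terms = [V.T @ Ak @ V for Ak in A1_terms]
Br, Cr = V.T @ B, C @ V

p_k = p_samples[1]                        # stands in for an iterate p^(k)
Y_full = stacked_observations(E, A0, A1_terms, B, C, p_k, omegas)
Y_red = stacked_observations(Er, A0r, A1r_terms, Br, Cr, p_k, omegas)
rel_err = np.linalg.norm(Y_full - Y_red) / np.linalg.norm(Y_full)
print(V.shape[1], Y_full.size, rel_err)
```

Because `p_k` is one of the sampled parameters and the basis contains the corresponding full-order solutions, the reduced evaluation reproduces the full one up to rounding in this sketch. At other parameter values it is only an approximation, and its quality depends on how the samples were chosen.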

1.2.2 Simulation and control of indoor environments

The heating, cooling, and lighting of buildings accounts for more than three-fourths of all electrical energy consumption in the United States [250]; hence, the design of energy efficient buildings has become a priority in the pursuit of improved energy stewardship. Models of indoor-air circulation and temperature distribution are critical components in the development of indoor environment HVAC control systems and are factored into decisions regarding sensor and duct placement in new building designs.

[Figure 1.1. Geometry for our indoor-air simulation [88]. Labeled features: four inlets, two windows, two lights, one vent, and a table, with axes X, Y, Z.]

To illustrate this, we consider an example taken from [88] modeling the indoor-air environment in a conference room with four inlets, one return vent, and thermal loads provided by two windows, two overhead lights, and occupants as shown in Figure 1.1. The dynamics for the indoor-air velocity, temperature, and moisture quantities are modeled using the Boussinesq equations and the transport equation:

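\[
\begin{aligned}
\frac{\partial v}{\partial t} + v\cdot\nabla v &= -\nabla P + \frac{1}{\mathrm{Re}}\,\Delta v + \frac{\mathrm{Gr}}{\mathrm{Re}^2}\,T\,k, && (1.12)\\
\nabla\cdot v &= 0, && (1.13)\\
\frac{\partial T}{\partial t} + v\cdot\nabla T &= \frac{1}{\mathrm{Re}\,\mathrm{Pr}}\,\Delta T + B u, && (1.14)\\
\frac{\partial S}{\partial t} + v\cdot\nabla S &= \frac{1}{\mathrm{Pe}}\,\Delta S, && (1.15)
\end{aligned}
\]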

where $v$ is the velocity vector, $P$ is the pressure, $T$ is the temperature, $S$ is the moisture concentration, $k$ is the unit vector in the vertical ($z$) direction, $u$ is the input, $\mathrm{Re}$ is the Reynolds number, $\mathrm{Pr}$ is the Prandtl number, $\mathrm{Gr}$ is the Grashof number, and $\mathrm{Pe}$ is the Peclet number. Dirichlet boundary conditions for $v$ are assumed everywhere except at the outflow, where a zero stress condition is applied. The moisture concentration is prescribed at the inflows, and a zero flux condition is applied everywhere else. Adiabatic boundary conditions on all surfaces except the inlets, windows, and lights are assumed. A time-averaged velocity field $\bar{v}$ was computed by simulating the airflow in the room for a 30-minute period using FLUENT CFD software [88]. A finite element model for thermal energy transfer was computed using a convection/diffusion model with the computed time-averaged velocity field $\bar{v}$:

\[ \frac{\partial T}{\partial t} + \bar{v}\cdot\nabla T = \frac{1}{\mathrm{Re}\,\mathrm{Pr}}\,\Delta T + B u, \]

where $B$ represents control and disturbance inputs. Finally, a finite element discretization of this equation leads to a high-order state-space model in the form

\[
\begin{aligned}
E\,\dot{x}(t) &= A\,x(t) + B\,u(t),\\
y(t) &= C\,x(t),
\end{aligned}
\]

with $n = 202140$ degrees of freedom, i.e., $x \in \mathbb{R}^{202140}$. The states $x$ correspond to nodal values of the temperature, the matrix $E$ corresponds to the finite element mass matrix, $A$ is the finite element approximation of the convection and diffusion operators, $B$ represents the influence of two input temperatures, and $C$ describes the resulting system effect through two output temperatures:

Inputs | Outputs
Temperature of inflow air at all four vents | Temperature at sensor location on max x wall
Disturbance due to occupancy around conference table | Average temperature in an occupied volume around conference table
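To make the roles of $E$, $A$, $B$, and $C$ concrete, here is a minimal one-dimensional analogue, not the conference-room model itself. It assembles linear finite elements for $T_t + \bar{v}\,T_x = \kappa\,T_{xx} + u$ on $(0,1)$ with $T = 0$ at both ends. The function name `assemble_1d_model`, the uniform source, the averaged output, and the parameter values are all assumptions chosen for illustration; $\kappa$ plays the role of $1/(\mathrm{Re}\,\mathrm{Pr})$.

```python
# Sketch: 1D linear finite elements for T_t + vbar*T_x = kappa*T_xx + u,
# giving matrices in the form E x' = A x + B u, y = C x.
# This is an illustrative toy, not the 3D model described in the text.

def assemble_1d_model(n, vbar, kappa):
    """Return (E, A, B, C) for n interior nodes on a uniform mesh of (0, 1)."""
    h = 1.0 / (n + 1)
    E = [[0.0] * n for _ in range(n)]
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        E[i][i] = 2.0 * h / 3.0        # mass matrix diagonal
        A[i][i] = -2.0 * kappa / h     # diffusion (stiffness) diagonal
        if i + 1 < n:
            E[i][i + 1] = E[i + 1][i] = h / 6.0
            # Diffusion contributes +kappa/h off the diagonal; the convection
            # term is skew-symmetric and contributes -/+ vbar/2.
            A[i][i + 1] = kappa / h - vbar / 2.0
            A[i + 1][i] = kappa / h + vbar / 2.0
    B = [h] * n   # uniform source: integral of each hat function
    C = [h] * n   # output: average of T over the unit interval
    return E, A, B, C


E, A, B, C = assemble_1d_model(n=4, vbar=1.0, kappa=0.1)
```

Here $E$ is symmetric (a mass matrix), while $A$ is nonsymmetric because convection adds a skew-symmetric part to the symmetric diffusion term.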

For this example, a reduced model of the same form, i.e.,

\[
\begin{aligned}
E_r\,\dot{x}_r(t) &= A_r\,x_r(t) + B_r\,u(t),\\
y_r(t) &= C_r\,x_r(t),
\end{aligned}
\]

yet with many fewer degrees of freedom, is constructed to predict the indoor-air environment for different forcing/disturbances as represented by $u(t)$. This reduced model is well suited for use in the design of an optimal controller: optimal controllers typically have the same order as the system being controlled, so a controller designed from the reduced model inherits its low order.
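The passage above does not say how the reduced matrices are obtained. One common construction, treated later in the book, is projection: pick basis vectors $v$ and $w$ and set $E_r = w^T E v$, $A_r = w^T A v$, $B_r = w^T B$, $C_r = C v$. The sketch below assumes a single interpolation point $\sigma$, a reduced order of one, and a made-up three-state model. It only illustrates that this choice reproduces the full transfer function value at $\sigma$.

```python
# Sketch: one-dimensional interpolatory projection of a small SISO model.
# The model data, sigma, and all helper names are illustrative assumptions.

def solve(M, rhs):
    """Solve M z = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[row][j] -= factor * aug[col][j]
    z = [0.0] * n
    for i in reversed(range(n)):
        acc = aug[i][n] - sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = acc / aug[i][i]
    return z


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def matvec(M, x):
    return [dot(row, x) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


# Stand-in "full" model with n = 3.
E = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
A = [[-2.0, 1.0, 0.0], [1.0, -2.0, 1.0], [0.0, 1.0, -2.0]]
B = [1.0, 0.0, 0.0]
C = [0.0, 0.0, 1.0]
sigma = 1.0

K = [[sigma * E[i][j] - A[i][j] for j in range(3)] for i in range(3)]
v = solve(K, B)              # v = (sigma*E - A)^{-1} B
w = solve(transpose(K), C)   # w = (sigma*E - A)^{-T} C^T

# Reduced (here scalar) quantities obtained by projection.
Er = dot(w, matvec(E, v))
Ar = dot(w, matvec(A, v))
Br = dot(w, B)
Cr = dot(C, v)

H_full = dot(C, solve(K, B))              # H(sigma) of the full model
H_red = Cr * Br / (sigma * Er - Ar)       # H_r(sigma) of the reduced model
```

Because $(\sigma E - A)v = B$, the reduced denominator $\sigma E_r - A_r$ equals $w^T B = B_r$, so $H_r(\sigma) = C_r = H(\sigma)$ up to rounding.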

1.2.3 Accelerating the commercial aircraft design and certification cycle

The design and optimization of modern commercial aircraft typically requires accurate predictions of airframe response to diverse operational loads, which may include, for example, wind gusts, turbulence, wake vortices, asymmetric thrust, and vibration. The verification and validation of such predictions contribute to the assurance of future passenger comfort and safety and are generally part of the clearance certification process mandated by civil flight authorities. Beyond the minimal requirements of comfort and safety, design objectives must also work toward increasing efficiency and decreasing weight, which often has the complicating effect of coupling different subsystems much more tightly; this generally leads to expensive simulations of extremely complex models that must reliably capture coupled aeroelastic and control system response. This must be done repeatedly at different stages of the design process, where different features of the response may be the focus of primary attention, reflecting the systematic progression of focus on the behavior of different subsystems. High-fidelity reduced models can play a significant role in dramatically reducing overall simulation time and resources by providing cheap and accurate surrogates for different subsystems. The data-driven Loewner approaches of Chapter 4 are well suited to this purpose, since the construction of reduced models can be done independently of the availability of analytical system models and can accommodate a mix of simulated response data and experimentally acquired response data. This has been explored in [263] and [268]; we illustrate the dramatic reduction of complexity using data presented and analyzed in [263].⁷ The system input-output response was measured at 421 frequencies. The system input is a distributed gust disturbance, and the system output is the resulting rate of global heave and pitch, as well as the lift and pitch moment at 44 locations on the wing and tail. The full-order system had around $5 \times 10^5$ fluid degrees of freedom and $2 \times 10^3$ structural degrees of freedom, while the reduced dynamical system has 30 degrees of freedom. The relative deviation between the reduced system response and the full-order system response is never greater than 0.35%. The frequency responses of the full-order and reduced-order models are depicted in Figure 1.2.

⁷ See also https://morwiki.mpi-magdeburg.mpg.de/morwiki/index.php?title=Flexible_Aircraft.


Figure 1.2. Frequency responses of the full-order and reduced-order models in the aircraft design cycle. (Legend: original (full-order) frequency response; reduced (order 30) frequency response.)
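The text does not state which norm underlies the quoted relative deviation. As one plausible reading, the sketch below computes the largest pointwise relative deviation $|H(\imath\omega) - H_r(\imath\omega)| / |H(\imath\omega)|$ over a set of sampled frequencies. The 421-point count follows the text; the frequency range, both response functions, and the name `max_relative_deviation` are invented for illustration and have nothing to do with the aircraft data.

```python
# Sketch: comparing sampled frequency responses of a full and a reduced model.
# The responses below are toy scalar functions, not the aircraft model.

def max_relative_deviation(H_full, H_red):
    """Largest pointwise relative deviation |H - Hr| / |H| over the samples."""
    return max(abs(h - hr) / abs(h) for h, hr in zip(H_full, H_red))


# 421 logarithmically spaced frequencies between 1e-1 and 1e2 (assumed range).
freqs = [10.0 ** (-1.0 + 3.0 * i / 420) for i in range(421)]

# Toy "full" response with a weak fast mode, and a "reduced" one without it.
H_full = [1.0 / (1j * w + 1.0) + 1e-3 / (1j * w + 50.0) for w in freqs]
H_red = [1.0 / (1j * w + 1.0) for w in freqs]

dev = max_relative_deviation(H_full, H_red)
within_tolerance = dev <= 0.0035   # the 0.35% level quoted in the text
print(f"max relative deviation: {100 * dev:.4f}%")
```

For multi-output data such as the aircraft example, the absolute values would be replaced by a matrix norm at each frequency; which norm to use is a modeling choice.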

1.2.4 Nonlinear reaction-diffusion and the FitzHugh–Nagumo system

The FitzHugh–Nagumo model was conceived originally as a simplified model of nonlinear wave propagation in excitable media (such as nerve axons and heart muscle). It is a distillation of key features found in the physiologically more descriptive (and complicated) Hodgkin–Huxley model of neuron action potential propagation and has been developed as a fundamental model for more general reaction-diffusion processes, among them processes in population genetics [259] and flame propagation [31].

The principal dynamics are described by the coupled system of PDEs

\[
\begin{aligned}
\varepsilon\,v_t(x,t) &= \varepsilon^2 v_{xx}(x,t) + f(v(x,t)) - w(x,t) + u_2(t),\\
w_t(x,t) &= h\,v(x,t) - \gamma\,w(x,t) + u_2(t),
\end{aligned}
\tag{1.16}
\]

where $f(v)$ is a cubic polynomial in $v$: $f(v) = v(v-\alpha)(\beta - v)$; $\alpha$, $\beta$, $\gamma$, $\varepsilon$, $q$, and $h$ are positive constants. Initial and boundary conditions are given as

\[
\begin{aligned}
v(x,0) &= 0, & w(x,0) &= 0, & x &\in (0,L),\\
v_x(0,t) &= u_1(t), & v_x(L,t) &= 0, & t &\ge 0.
\end{aligned}
\]

Here $u_1(t)$ and $u_2(t)$ act as control inputs: $u_1(t)$ acts on the boundary, and $u_2(t)$ is typically taken as constant for $t \ge 0$. Outputs are defined as $y_1(t) = v(\zeta,t)$ and $y_2(t) = w(\zeta,t)$ for a fixed $\zeta \in (0,L)$.

In the original model conception, $v$ is an idealization of the neuron membrane potential, and $w$ is a “recovery variable” describing the (slower) dynamics associated with membrane repolarization. The cubic nonlinearity is fundamental to the model dynamics, yet if we introduce a new state variable $s(x,t) = v(x,t)^2$, we are able to rewrite (1.16) as a system of differential equations that is now quadratic in the state variables at the cost of adding a single additional differential equation:

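\[
\begin{aligned}
\varepsilon\,v_t &= \varepsilon^2 v_{xx} - s\cdot v + (\alpha+\beta)\,s - (\alpha\beta)\,v - w + u_2,\\
w_t &= h\,v - \gamma\,w + u_2,\\
s_t &= 2\,v\cdot v_t.
\end{aligned}
\tag{1.17}
\]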
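The rewriting of the cubic term can be checked numerically. Expanding $f(v) = v(v-\alpha)(\beta - v)$ gives $-v^3 + (\alpha+\beta)v^2 - \alpha\beta\,v$, which becomes $-s\,v + (\alpha+\beta)\,s - \alpha\beta\,v$ once $s = v^2$ is substituted. The sketch below compares the two forms on a grid of sample values. The values $\alpha = 0.1$ and $\beta = 1$ are assumptions made for the check, since the text does not give them.

```python
# Sketch: the cubic nonlinearity and its quadratic "lifted" form agree
# whenever s = v**2. The alpha and beta values are assumed for illustration.

def f_cubic(v, alpha, beta):
    """Original cubic nonlinearity f(v) = v (v - alpha) (beta - v)."""
    return v * (v - alpha) * (beta - v)


def f_lifted(v, s, alpha, beta):
    """Same quantity written with the auxiliary variable s standing for v**2."""
    return -s * v + (alpha + beta) * s - (alpha * beta) * v


alpha, beta = 0.1, 1.0
samples = [-1.0 + 0.05 * i for i in range(61)]   # v values from -1 to 2

max_gap = max(
    abs(f_cubic(v, alpha, beta) - f_lifted(v, v * v, alpha, beta))
    for v in samples
)
```

The two expressions differ only by floating-point rounding, so `max_gap` stays at the level of machine precision.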

This maneuver is representative of a broader approach, where for many nonlinear dynamical systems a judicious introduction of new state variables can substantially simplify the character of the nonlinearity and produce an equivalent nonlinear dynamical system involving no worse than quadratic nonlinearities in state (see [266] and the discussion in our Remark 7.11.1). As we will see in Chapter 7, quadratic dynamical systems are, in turn, amenable to highly effective, systematic interpolatory approaches that directly extend the optimal model reduction approaches originally developed for large-scale linear dynamical systems, creating significant opportunities for robust nonlinear model reduction using methods that are explored in this book.

These modified governing equations are discretized using finite differences, leading to an ordinary differential equation (ODE) system, using the values $\varepsilon = 0.015$, $h = 0.5$, $\gamma = 2$, $L = 0.3$, $u_1(t) = (5 \times 10^{-4}) \cdot t^3 \exp(-15t)$, and $u_2(t) = 0.05$ for $t \ge 0$. With $k = 500$ spatial grid points, the dynamical system dimension is $3k = 1500$. An $\mathcal{H}_2$-optimal reduced-order model of order 20 was developed using the methods of Section 7.11. We show the time response of the original system together with that of a reduced model in Figure 1.3 at six different output locations $\zeta \in (0, L)$.
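A minimal sketch of such a semi-discretization follows, assuming a uniform grid, second-order central differences, and ghost nodes for the two Neumann conditions; the text does not specify these details. The values $\alpha = 0.1$ and $\beta = 1$ and the function name `fhn_lifted_rhs` are also assumptions. The state is stacked as $[v; w; s]$, and the equation for $s$ uses the chain rule applied to $s = v^2$.

```python
# Sketch: right-hand side of a finite-difference semi-discretization of the
# lifted FitzHugh-Nagumo system. Grid layout, boundary treatment, alpha, and
# beta are assumptions made for illustration.
import math


def u1(t):
    """Boundary input, using the expression quoted in the text."""
    return 5e-4 * t**3 * math.exp(-15.0 * t)


def fhn_lifted_rhs(t, z, k, L=0.3, eps=0.015, h=0.5, gamma=2.0,
                   alpha=0.1, beta=1.0, u2=0.05):
    """Return dz/dt for the stacked state z = [v; w; s] on k grid points."""
    v, w, s = z[:k], z[k:2 * k], z[2 * k:]
    dx = L / (k - 1)
    dv, dw, ds = [], [], []
    for j in range(k):
        if j == 0:
            # Ghost node from v_x(0, t) = u1(t): v[-1] = v[1] - 2*dx*u1(t).
            vxx = 2.0 * (v[1] - v[0]) / dx**2 - 2.0 * u1(t) / dx
        elif j == k - 1:
            # Ghost node from v_x(L, t) = 0: v[k] = v[k-2].
            vxx = 2.0 * (v[k - 2] - v[k - 1]) / dx**2
        else:
            vxx = (v[j - 1] - 2.0 * v[j] + v[j + 1]) / dx**2
        vt = (eps**2 * vxx - s[j] * v[j] + (alpha + beta) * s[j]
              - alpha * beta * v[j] - w[j] + u2) / eps
        dv.append(vt)
        dw.append(h * v[j] - gamma * w[j] + u2)
        ds.append(2.0 * v[j] * vt)   # chain rule for s = v**2
    return dv + dw + ds


k = 500
z0 = [0.0] * (3 * k)               # zero initial conditions for v, w, and s
dz0 = fhn_lifted_rhs(0.0, z0, k)
```

The returned vector has length $3k = 1500$, matching the dimension quoted above. Passing `fhn_lifted_rhs` to a stiff ODE integrator would be the natural next step, which is omitted here.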

Figure 1.3. Time response of original/reduced FitzHugh–Nagumo system.⁸

1.3 A roadmap for what follows

Our goal in creating this book was, first, to produce a reader-friendly⁹ research monograph that presents basic machinery and results of the area in a unified way, offering a path for students and researchers into the literature. Inevitably, there have been many difficult choices that we have had to make in selecting what topics to include, what topics to omit, and, for those topics that are included, how much detail to include. The greatest detail is reserved for topics appearing in Part II (Core Concepts); Parts III and IV successively step down the level of detail while broadening the scope of topics discussed. The familiar trade-offs of depth versus breadth have been further complicated by the rapidity with which the field is advancing; moving targets are hard to hit. We had to decide to work with a snapshot of the field and either ignore or only lightly touch on a variety of exciting recent developments that otherwise merit more attention. A second goal in this project, after all, was to finish it.

⁸ Figure provided by Dr. P. Goyal, Max Planck Institute for Dynamics of Complex Technical Systems Magdeburg.
⁹ Dissent will be greeted with earnest concern but not with a refund.


Giving a closer view of paths forward, we offer at the trailhead, in Chapter 2, a brief overview of the main concepts from systems theory that will be used throughout the book. The point of view provided by systems theory constitutes the foundation of many of the developments described in this book. Much of the material in Chapter 2 could be viewed as standard background, but obviously what is “standard” for one will be “supplemental” for another. If you are not sure how much time to invest here, we suggest skimming initially and coming back to it later for reference or moral support when needed.

Chapter 3 presents the general problem setting for projection-based interpolatory model reduction of linear dynamical systems. Fundamental theorems for construction of interpolatory reduced models are presented together with extensions into settings where natural system realizations often reflect physics-based modeling structures that may be important to retain in the reduced-order versions.

We devote Chapter 4 to a thorough analysis of data-driven interpolatory model reduction, using input/output measurements in lieu of knowledge of internal system dynamics. This chapter introduces the Loewner modeling framework, a fundamental tool for data-driven model reduction, and establishes a connection to classical rational interpolation.

In Chapter 5, we examine the fundamental issue of optimality in reduced models and, pragmatically, what constitutes “good” interpolation data. We focus on optimality measured with respect to the $\mathcal{H}_2$ system norm, for which (delightfully) optimal reduced-order models are guaranteed to be tangential interpolants of the original full-order model. Both theoretical and algorithmic aspects are introduced and discussed here. A computational framework is presented for both the state-space projection framework and the data-driven Loewner framework. The data-driven computational framework allows one to construct locally optimal rational approximants requiring only transfer function evaluations. In particular, this provides a mechanism for constructing locally optimal rational approximants for infinite-dimensional systems, including, for example, systems having internal delays.

For a broad class of problems, the underlying state-space dynamics depend on parameters. Such parameters may represent, for example, the variability of material properties, domain geometry and shape, or interface and boundary conditions. Chapter 6 is devoted to interpolatory model reduction of parametrized dynamical systems. We consider how the interpolatory projection framework that we have previously developed can be extended to the construction of reduced parametric systems that interpolate the original model in both the frequency domain and the parameter domain. Parameter selection strategies are briefly discussed. We also illustrate how one may use the data-driven Loewner framework to construct parametric reduced models directly from observations.

An important milestone in the progress of interpolatory model reduction has been its extension to the reduction of special classes of nonlinear systems, including bilinear and quadratic-in-state systems. Frequency-domain representations, which are analogues of transfer functions, are central in deriving optimal/high-fidelity reduced models for linear systems. For nonlinear systems, the concept of a transfer function generally is lost. Nonetheless, the Volterra series representation for bilinear systems and quadratic-in-state systems is able to fill a gap left by the general absence of transfer functions. Taking advantage of this, we devote Chapter 7 to the application of interpolatory model reduction methods to nonlinear systems, principally to bilinear and quadratic-in-state systems. We discuss how the concept of transfer function interpolation generalizes to these settings and present how interpolatory optimal $\mathcal{H}_2$ model reduction methods generalize directly to the case of bilinear systems. The chapter also introduces the data-driven Loewner framework for nonlinear systems.

Besides the $\mathcal{H}_2$ norm, a variety of other error metrics can be useful in quantifying the quality of reduced models; we consider some of these in Chapter 8, including a weighted-$\mathcal{H}_2$ error measure, which can be especially important for controller reduction. We present an effective model reduction framework for this setting as well. We also consider the use of a discrete least-squares measure, which could be considered as a quadrature approximation to the regular $\mathcal{H}_2$ norm, and proceed to explain the vector fitting framework from this perspective. The $\mathcal{H}_\infty$ norm is the final error metric that we consider here, discussing the corresponding $\mathcal{H}_\infty$ approximation problem from a rational interpolation perspective. This provides some insight into how best to provide high-fidelity $\mathcal{H}_\infty$ models using large-scale interpolatory methods.

Chapter 3 introduces some methods for retaining in the reduced model special structural features that reflect the underlying physics of processes being modeled. Chapter 9 continues this theme, and we discuss system interpolation for differential algebraic equations (DAEs) where additional side constraints reflecting system constraints may need to be satisfied. We consider special cases where explicit computation of the deflating projectors can be avoided.

As a practical matter, the requirement for solving large-scale shifted linear systems of equations in interpolatory projection-based approaches can be a burden for cases where the linear systems that must be solved to form the model reduction subspaces can only be solved approximately by iterative methods such as GMRES and BiCG. In such cases, the data that is marshalled for use in constructing reduced-order models will necessarily be approximate, and reasonably one should be concerned with the accuracy of the final computed reduced-order model. The book concludes with Chapter 10, where we develop some of the perturbation theory necessary to understand these effects and discuss some additional matters that arise in practical application and implementation.
