On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of...

50
On the adaptive discretization of PDE-based optimization problems Rolf Rannacher Institute of Applied Mathematics, University of Heidelberg [email protected] Summary. This article presents recent developments of a posteriori error esti- mation and mesh adaptation in the solution of optimization problems governed by partial differential equations. The finite element Galerkin method provides the framework for the Dual Weighted Residual (DWR) method for residual-based er- ror estimation. Here, the impact of cellwise discretization residuals on certain target quantities is controlled by sensitivity factors which are obtained from the solution of associated dual problems. In this way highly economical meshes can be generated for solving the optimization problem with minimal cost. This approach to model reduction by adaptive discretization is described in detail and illustrated by various examples from optimal control, parameter identification, and model calibration. Contents 1. Introduction: Principles of error estimation and mesh adaptation 2. A general paradigm of a posteriori error analysis Application for variational equations Application for eigenvalue problems Application for general optimization problems 3. Examples of special optimization problems A stationary tracking problem A nonstationary tracking problem A stationary flow control problem Parameter identification Model calibration 4. Conclusion and outlook 5. References * This article contains the material of a survey lecture given at the Summer School ‘PDE Constrained Optimization’ in Tomar, Portugal, July 27-29, 2005. The mate- rial is based on joint work with Roland Becker (University of Pau, France), Boris Vexler (RADON Institute, Linz, Austria), and Dominik Meidner (University of Heidelberg, Germany). We gratefully acknowledge the support by the German Research Foundation (DFG) through SFB 359 at Heidelberg University.

Transcript of On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of...

Page 1: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

On the adaptive discretization of PDE-based

optimization problems

Rolf Rannacher∗

Institute of Applied Mathematics, University of [email protected]

Summary. This article presents recent developments of a posteriori error esti-mation and mesh adaptation in the solution of optimization problems governedby partial differential equations. The finite element Galerkin method provides theframework for the Dual Weighted Residual (DWR) method for residual-based er-ror estimation. Here, the impact of cellwise discretization residuals on certain targetquantities is controlled by sensitivity factors which are obtained from the solutionof associated dual problems. In this way highly economical meshes can be generatedfor solving the optimization problem with minimal cost. This approach to modelreduction by adaptive discretization is described in detail and illustrated by variousexamples from optimal control, parameter identification, and model calibration.

Contents

1. Introduction: Principles of error estimation and mesh adaptation2. A general paradigm of a posteriori error analysis

− Application for variational equations− Application for eigenvalue problems− Application for general optimization problems

3. Examples of special optimization problems− A stationary tracking problem− A nonstationary tracking problem− A stationary flow control problem− Parameter identification− Model calibration

4. Conclusion and outlook5. References∗ This article contains the material of a survey lecture given at the Summer School

‘PDE Constrained Optimization’ in Tomar, Portugal, July 27-29, 2005. The mate-rial is based on joint work with Roland Becker (University of Pau, France), BorisVexler (RADON Institute, Linz, Austria), and Dominik Meidner (University ofHeidelberg, Germany). We gratefully acknowledge the support by the GermanResearch Foundation (DFG) through SFB 359 at Heidelberg University.

Page 2: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

2 Rolf Rannacher

1 Introduction: Principles of error estimation and mesh

adaptation

The application of mesh adaptation in the discretization of optimal controlproblems

J(u, q) → min, A(u, q) = 0, (1)

with states u and controls q , raises several fundamental questions:

• What is the right notion of ‘admissibility’ of states u = u(q)? Discretiza-tion inevitably introduces perturbation of the state equation. Achievinghigh accuracy in the discretization of PDEs is expensive. Hence, the extentto which ‘admissibility’ is relevant for the optimization process becomes acritical question and is actually a modeling issue.

• How should admissibility be ‘measured’? In solving ODEs, one may requirethe error to be uniformly ‘small’, but in the context of PDEs the choice ofan appropriate error measure is less clear.

The efficient numerical solution of optimal control problems governed by PDEsrequires work reduction by adaptive discretization and, in the nonstationarycase, storage reduction by checkpointing techniques.

In the following, we consider finite element discretization using continuouspiecewise polynomial trial and test functions for all unknowns on meshesTh = K which consist of non-degenerate quadrilaterals (‘cells’) K of widthhK := diam(K) . The ‘global mesh size’ is h := maxK∈Th

hK . These meshesare allowed to have ‘hanging nodes’ for simplifying local mesh refinement andcoarsening.

The question of how to organize systematic mesh adaptation in the ap-proximation of optimal control problems will be discussed within the generalframework of ‘goal-oriented’ a posteriori error analysis. Let the goal of a nu-merical simulation be the computation of a quantity J(u) from the solutionof a continuous model

A(u) = 0, (2)

with accuracy TOL by using a discrete model

Ah(uh) = 0, (3)

in a finite element space of dimension N . Accordingly, the goal of adaptivityis the optimal use of computing resources, i.e.,

TOL given : N → min or N given : TOL→ min.

To achieve this goal, a posteriori information is used in terms of error repre-sentations of the form

Page 3: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

Adaptive discretization of optimization problems 3

J(u)−J(uh) ≈ η(uh) :=∑

K∈Th

ρK(uh)ωK , (4)

where the terms ρK(uh) contain ‘smoothness’ or ‘residual’ information ofthe approximate solution uh , while the weights ωK represent the sensitivityof the error J(u)−J(uh) with respect to the cell residuals ρK(uh) . Theseweights are obtained from the approximate solution of an associated global lin-ear dual problem which is driven by the target functional J(·) as right-handside. This approach, called the ’Dual Weighted Residual (DWR) Method’,has been developed in Becker/Rannacher [16, 17]. For a comprehensive dis-cussion of the DWR method and related references, we refer to the survey arti-cles Becker/Rannacher [18], Becker/Heuveline/Rannacher [10], and the bookBangerth/Rannacher [2]. Related strategies of duality-based error control andmesh adaptation are described in Eriksson et al. [34] and Giles/Suli [40]. An al-ternative approach using duality is described in the book Neittaanmaki/Repin[71]. The application of the DWR method for mesh adaptation in optimal con-trol problems has been developed in Becker/Kapp/Rannacher [13], Kapp [54],Becker [5], Bangerth [1], and Vexler [78]. For publications of these results andother references, we refer to the survey articles Becker et al. [12] and Car-raro/Heuveline/Rannacher [27]. Most of the numerical results presented havebeen obtained using the software packages GASCOIGNE Becker/Braack [6]and RODOBO Becker/Meidner/Vexler [14]. For the visualization of resultsthe package VISUSIMPLE Becker/Dunne/Meidner [9] has been used.

Throughout this paper, we use the common notation for the Lebesguespaces L2(Ω) and the Sobolev spaces Hm(Ω) and Hm

0 (Γ ;Ω) ⊂ Hm(Ω)on a bounded domain Ω ⊂ Rd , with the corresponding norms denoted by‖ · ‖ = ‖ · ‖Ω and ‖ · ‖m = ‖ · ‖m;Ω , respectively. The L2 scalar productis denoted by (·, ·) = (·, ·)Ω . The subscript Ω is usually omitted. Spacesof Rd-valued functions v = (v1, . . . , vd) are denoted by boldface-type, butno distinction is made in the notation of norms and inner products; thusH1

0(Γ ;Ω) = H10 (Γ ;Ω)d has norm ‖v‖1 = (

∑di=1 ‖vi‖

2)1/2, etc. All othernotation are self-evident, e.g., ∂tu = ∂u/∂t and ∂nv = n · ∇v, where n is anouter normal unit vector.

1.1 A model situation

To illustrate the mechanism of the DWR method, we consider the linear modelproblem

−∆u = f in Ω, u|∂Ω = 0, (5)

on a polygonal domain Ω ⊂ R2 . Its variational formulation seeks u ∈ V :=H1

0 (Ω) , such that

a(u, ψ) := (∇u,∇ψ) = (f, ψ) ∀ψ ∈ V. (6)

Page 4: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

4 Rolf Rannacher

The Galerkin finite element method uses subspaces Vh ⊂ V := H10 (Ω) (e.g.,

P1- or Q1-elements) and determines approximations uh ∈ Vh by

a(uh, ψh) = (f, ψh) ∀ψh ∈ Vh. (7)

An essential feature of this scheme is the ‘Galerkin orthogonality’ of the errore := u− uh,

a(e, ψh) = 0, ψh ∈ Vh. (8)

The goal in solving problem (5) may be to obtain insight into the overallstructure of its solution. For this purpose, the most natural error measure isthe energy norm associated to the bilinear from a(·, ·) , i.e., for a prescribedtolerance TOL, we want to achieve that

‖∇(u−uh)‖ ≤ TOL.

However, in many practical situations, one is more interested in computingcertain functionals of the solution, e.g., point values J(u) := u(P ) . Then, wewant to achieve that

|J(u)−J(uh)| ≤ TOL.

For controlling this functional error, we employ a standard duality argument.Let z ∈ V be the solution (the ‘dual solution’) of the auxiliary problem (the‘dual problem’)

a(ϕ, z) = J(ϕ) ∀ϕ ∈ V, (9)

and zh ∈ Vh its Galerkin approximation defined by

a(ϕh, zh) = J(ϕh) ∀ϕh ∈ Vh. (10)

For the point value functional, we use the regularized functional

Jǫ(u) := |Bǫ(P )|−1

Bǫ(P )

u(x) dx,

with ǫ ≈ TOL , which is well defined on V . The corresponding dual solutionbehaves like a regularized Green function, i.e., z(x) ≈ − log(‖x−P‖ + ǫ) .Analogously, the dual solution corresponding to the derivative point value,J(u) ≈ ∂1u(P ) , behaves like the regularized derivative Green function. Suchhighly irregular functions for ǫ→ 0 can be computed with good accuracy onappropriately refined meshes as is shown in Figure 1.

Page 5: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

Adaptive discretization of optimization problems 5

Fig. 1. Approximate dual solutions (Green functions) for evaluating u(P ) (left)and ∂1u(P ) (right).

Assuming the functional J(·) as linear, we take the test function ϕ = e in(9) and use Galerkin orthogonality (8) for an arbitrary approximation ihz ∈Vh , to obtain

J(e) = a(e, z) = a(e, z−ihz)

= (f, z−ihz) − a(uh, z−ihz) =: ρ(uh)(z−ihz).

The residual term on the right-hand side is rewritten using cellwise integrationby parts as

ρ(uh)(z−ihz) =∑

K∈Th

(f +∆uh, z−ihz)K − (∂nuh, z−ihz)∂K

=∑

K∈Th

(R(uh), z−ihz)K + (r(uh), z−ihz)∂K

,

with the residuals R(uh) and r(uh) defined by

R(uh)|K := f +∆uh, r(uh)|Γ :=

− 12n · [∇uh], if Γ ⊂ ∂K\∂Ω,

0, if Γ ⊂ ∂Ω,

where [∇uh] denotes the jump across inter-element edges. From this errorrepresentation, we obtain the a posteriori error estimate

|J(e)| ≤ ηω :=∑

K∈Th

ρK(uh)ωK(z), (11)

where the cell residuals ρK(uh) and cell weights ωK(z) are given by

ρK(uh) := ‖R(uh)‖K + h−1/2K ‖r(uh)‖∂K ,

ωK(z) := ‖z−ihz‖K + h1/2K ‖z−ihz‖∂K .

Page 6: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

6 Rolf Rannacher

From the estimate (11), we can derive the ‘energy-norm error’ a posterioriestimate (for details see [18] or [2])

‖∇(u−uh)‖ ≤ ηE := cScI

(

K∈Th

h2KρK(uh)2

)1/2

, (12)

with an ‘interpolation constant’ cI , which is computable but large in general,and an ‘stability constant’ cS , which equals 1 in the present model case.

Evaluation of the a posteriori error estimate

In order to evaluate the above error representation or the resulting error es-timate (11), we need easily computable approximations z to the generallyunknown dual solution z . In the present case it suffices to apply local post-processing to the Galerkin approximation zh ∈ Vh , e.g., by patchwise higher-order interpolation,

(z−ihz)|K ≈ (z−ihz)|K := (i(2)2h zh−zh)|K

where i(2)2h denotes the operator of cell-patchwise quadratic or biquadratic

interpolation. Accordingly, we use the cell error indicators

ηK :=∣

∣(R(uh), z−ihz)K + (r(uh), z−ihz)∂K

∣,

for steering the mesh adaptation.

Strategy of mesh adaptation

The mesh adaptation aims at the equilibration of the cell error indicators,

η :=∑

K∈Th

ηK , ηK ≈η

N, N := #K ∈ Th ⇒ η ≈ TOL,

which can be argued to be the criterion for an optimally adapted, i.e., mosteconomical, mesh for approximating the quantity J(u) with tolerance TOL.In practice this process may be organized as follows: One determines the 25%cells with largest and the 10% cells with smallest values of ηK . The cells ofthe first group are refined and those of the second group coarsened. In twodimensions this strategy leads to about a doubling of the number of cellsin each refinement cycle. By a similar strategy it can be achieved that thenumber of cells stays about constant during the adaptation process within atime stepping procedure. For more details on mesh adaptation strategies inthe context of ‘goal-oriented’ error estimation, we refer to [18] or [2].

Page 7: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

Adaptive discretization of optimization problems 7

Remark 1. The described approach to automatic mesh adaptation using weightedresidual-based a posteriori error estimates for quantities of interest is the DualWeighted Residual (DWR) method, Becker/Rannacher [16, 17, 18]. In con-trast to mesh adaptation based on estimates of the form (12), which onlycontain information on the ‘features’ (‘residuals’) of the approximate solution,mesh adaptation by the DWR method is ‘sensitivity driven’ as it combines lo-cal residual information with information (‘weights’) about their global effecton the target error. In general situations, particularly for complex systemsand those with anisotropic transport this seems to be the only way to achievereally economical meshes in goal-oriented computation.

Literature

A priori error analysis of finite element discretization of PDE-based optimiza-tion problems started with the papers of Falk [35, 36]. This work was later ex-tended, e.g., to nonlinear optimal control problems in Gunzburger/Hou [43], todistributed parameter identification in Neittaanmaki/Tai [70] and Karkkainen[55], and to state-constrained control problems in Casas/Mateos [29], to men-tion only a few relevant papers. Adaptive discretization in solving optimizationproblems governed by PDEs using residual-based a posteriori error estimateshas been studied in the literature since only recently, starting in 1997 withthe article Becker/Kapp [11]. In the following, we give a brief account of thatliterature, which has relations to the present article.

• W.-B. Liu and coworkers (University of Kent, UK) since 2000 have pro-duced a series of papers on a posteriori energy-norm error estimates(‘feature-oriented’ error estimates) for different kinds of optimal con-trol problems including stationary and nonstationary tracking problems,[62, 59, 63, 64, 61, 65, 66, 67, 60].

• In the group of C. Johnson (Chalmers Technical University of Gothenburg,Sweden) duality-based a posteriori error estimeates have been used forinverse scattering problems, [21].

• W. Dahmen (Technical University of Aachen, Germany) and A. Kunoth(University of Bonn, Germany) have derived energy-norm a posteriori errorestimates for wavelet approximations of stationary tracking-type optimalcontrol problems, [30, 57].

• In the groups of R. Hoppe (University of Augsburg, Germany, and Univer-sity of Houston, USA) and S. Repin (St Petersburg University, Russia) aposteriori energy-norm error estimates have been proved for finite elementapproximations of stationary control problems, [50, 38, 39, 47, 51].

• B. Mohammadi [69] has used mesh adaptivity by metric control in opti-mal shape design. M.S. Gockenbach and W.W. Symes [41] based adaptivityin optimization on the adjoint state methods. L. Dede and A. Quarteroni[31] have applied mesh adaptivity to the control of advection-diffusion pro-cesses, and H. Johansson et al. [52, 53] describe duality-based sensitivityassessment and error control in parameter estimation.

Page 8: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

8 Rolf Rannacher

• The group of the author (University of Heidelberg, Germany) studies aposteriori error estimates and mesh adaptation by the DWR method forvarious kinds of optimization problems since 1997. The material of thissurvey article is largely collected from these references.- The basic concepts and application for stationary elliptic control prob-

lems are decribed in Becker/Kapp [11], Becker/Kapp/Rannacher [13],Kapp [54], and Rannacher [73].

- An example of stationary flow control (drag minimization in viscousincompressible fluid flow) is treated in Becker [4, 5].

- Stationary distributed parameter estimation problems are studied inBangerth [1].

- Stationary discrete parameter estimation problems are considered inBecker/Vexler [19, 20] and Vexler [78, 79].

- Stationary parameter estimation problems in models of chemically re-active flows are considered in Becker/Braack/Vexler [7, 8].

- Nonstationary parameter estimation problems are studied in Becker/Meidner/Vexler [15].

2 A general paradigm of a posteriori error analysis

In order to use the duality-based approach to mesh adaptation described inthe preceding section for more general problems, we consider the followingabstract setup. Let L(·) be a differentiable functional defined on some linearspace X and x ∈ X a stationary point,

L′(x)(y) = 0 ∀y ∈ X. (13)

For a (finite dimensional) subspace Xh ⊂ X the corresponding Galerkinapproximation xh ∈ Xh is defined by

L′(xh)(yh) = 0 ∀ yh ∈ Xh. (14)

Proposition 1. Let the functional L(·) posses directional derivatives of upto third order. Then, for any solutions x ∈ X of (13) and xh ∈ Xh of (14),there holds the error representation

L(x)−L(xh) = 12L

′(xh)(x−ihx) + Rh, (15)

with arbitrary ihx ∈ Xh. The remainder Rh is cubic in e := x−xh,

Rh := 12

∫ 1

0

L′′′(xh+se)(e, e, e) s(s−1) ds.

Proof. By elementary calculus, we have

L(x)−L(xh) =

∫ 1

0

L′(xh+se)(e) ds,

Page 9: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

Adaptive discretization of optimization problems 9

and approximating the integral by the trapezoidal rule,

L(x)−L(xh) = 12

L′(xh)(e) + L′(x)(e)

+ 12

∫ 1

0

L′′′(xh+i) s(s−1) ds.

Then, observing L′(x)(e) = 0 and employing Galerkin orthogonality,

L′(xh)(e) = L′(xh)(x−ihx) + L′(xh)(ihx−xh), ihx ∈ Xh,

we obtain the error representation (15).

2.1 Application for variational equations

First, we use the abstract result of Proposition 1 to derive a posteriori er-ror estimates for the Galerkin approximation of general nonlinear variationalequations. With a differentiable semilinear form A(·)(·) and a linear func-tional F (·) defined on a linear space V , we consider the variational equation

A(u)(ψ) = F (ψ) ∀ψ ∈ V, (16)

and for a (finite dimensional) subspace Vh ⊂ V its Galerkin approximation

A(uh)(ψh) = F (ψh) ∀ψh ∈ Vh. (17)

Suppose that, for a given functional J(·) , we are interested in the value J(u)and therefore in the error J(u)−J(uh) . To embed this problem into the gen-eral framework of the preceding section, we rewrite it as a (trivial) constrainedoptimization problem,

J(u) → min, A(u)(ψ) = F (ψ) ∀ψ ∈ V. (18)

For solving this problem, we use the Euler-Lagrange approach by introducingthe Lagrangian functional

L(u, z) := J(u) + F (z) −A(u)(z),

with the adjoint variable z ∈ V . Then, any stationary point u, z ∈ V × Vof L(·, ·) characterized by the set of equations

A′(u)(ϕ, z) = J ′(u)(ϕ) ∀ϕ ∈ V, (19)

A(u)(ψ) = F (ψ) ∀ψ ∈ V, (20)

yields a solution of (16). The second of these equations (20) is just the givenvariational equations (16) to be solved, while the first one (19) is the ‘dualproblem’ associated to the functional J(·) . The corresponding Galerkin ap-proximation uh, zh ∈ Vh × Vh is determined by the set of equations

Page 10: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

10 Rolf Rannacher

A′(uh)(ϕh, zh) = J ′(uh)(ϕh) ∀ϕh ∈ Vh, (21)

A(uh)(ψh) = F (ψh) ∀ψh ∈ Vh. (22)

To these equations, we associate the ‘primal’ and ‘dual’ residual functionals

ρ(·) := F (·) − a(uh)(·), ρ∗(·) := J ′(uh)(·) − a′(uh)(·, zh),

defined on V . Then, from Proposition 1, we infer the following result.

Proposition 2. Let the form A(·)(·) and the functional J(·) possess direc-tional derivatives of up to third order. Then, for any solutions u, z ∈ V×Vof (19,20) and uh, zh ∈ Vh×Vh of (21, 22), there holds the error represen-tation

J(u)−J(uh) = 12ρ(z−ihz) + 1

2ρ∗(u−ihu) + Rh, (23)

for arbitrary approximations ihu, ihz ∈ Vh , with a remainder Rh cubic inu−uh and z−zh ,

Rh = 12

∫ 1

0

J ′′′(uh+se)(e, e, e)−A′′′(uh+se)(e, e, e, zh+se∗)

− 3A′′(uh+se)(e, e, e∗)

s(s−1) ds.

Proof. For embedding the present situation into the general framework ofProposition 1, we introduce the product spaces X := V×V and Xh := Vh×Vh

and for x := u, z, xh := uh, zh set

L(x) := L(u, z).

Note that for x = u, z, ϕ = ϕu, ϕz ∈ V ×V ,

L′(x)(ϕu, ϕz) = L′u(u, z)(ϕu) + L′

z(u, z)(ϕz)

= J ′(u)(ϕu) −A′(u)(ϕu, z) + F (ϕz) − a(uh)(ϕz).

Then, by applying Proposition 1, we obtain the error representation

L(x) − L(xh) = 12L

′(xh)(x−ihx) + Rh

= 12ρ

∗(u−ihu) + 12ρ(z−ihz) + Rh,

with the cubic remainder

Rh = 12

∫ 1

0

L′′′(xh+se)(e, e, e) s(s−1) ds

= 12

∫ 1

0

J ′′′(uh+se)(e, e, e)−A′′′(uh+se)(e, e, e, zh+se∗)

− 3A′′(uh+se)(e, e, e∗)

s(s−1) ds.

Finally, observing that at the solution u, z and uh, zh ,

L(x) − L(xh) = J(u) + F (z) −A(u)(z) − J(uh) + F (zh) −A(uh)(zh)

= J(u) − J(uh),

we obtain the desired error representation (23).

Page 11: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

Adaptive discretization of optimization problems 11

Remark 2. The statement of Proposition 2 needs some comments.

- The a posteriori error representation (23) holds, provided that solutionsto the systems (19, 20) and (21, 22) exist without requiring their unique-ness. This will be important in applying this result for example to theapproximation of eigenvalue problems. For nonunique solutions the a pri-ori assumption

uh, zh → u, z (h→0)

is needed to make the error representation useful.- The cubic remainder term Rh is usually neglected.- The evaluation of the error identity requires guesses for the primal anddual solution u and z , e.g., by postprocessing of the computed approxi-mations uh and zh , e.g., the method described in Section 1 for the modelsituation.

- The dual problem is linear. Hence, the whole error estimation may amountto only a relatively small fraction of the total cost for the solution process.This has to be compared to the usually much higher cost when workingon non-optimized meshes.

Example of flow simulation

For illustrating the potential of the DWR method for complexity reductionin numerical simulation of complex processes, we use an example from fluidmechanics. The example is drag computation in a viscous incompressible chan-nel flow around a cylinder with square cross section. The configuration of theexample is depicted in Figure 2. The flow is modeled by the stationary Navier-Stokes equations for the pair u := v, p of velocity and pressure,

−ν∆v + v · ∇v + ∇p = f, ∇ · v = 0, (24)

supplemented by the usual boundary conditions

v|Γrigid= 0, v|Γin

= vin, ν∂nv − np|Γout= 0.

The viscosity ν , the maximum inflow velocity v , and the diameter D of thecylinder are chosen such that the Reynolds number is Re = Lv/ν = 20, whichguarantees a laminar stationary flow.

The goal is to compute the ‘drag coefficient’, which is defined by

Jdrag(u) :=2

vD

S

nTσ(v, p)e1 ds,

taken over the surface S of the cylinder, with the stress tensor σ(v, p) =−pI + ν(∇v+∇vT ) , and the unit vector e1 of the main flow direction. Thediscretization of this problem is by a conforming finite element scheme usingequal-order bilinear trial functions for both velocity and pressure and which

Page 12: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

12 Rolf Rannacher

is stabilized by LPS stabilization (local projection stabilization), for detailsof this discretization and the computational results see Braack/Richter [25].A prior study of the corresponding 2-dimensional flow benchmark can befound in Becker/Rannacher [16] and Becker [3]. For a survey of adaptive finiteelement discretization of flow problems see Rannacher [74].

Fig. 2. Configuration of the 3D channel flow around a cylinder.

The results of automatic mesh adaption by the DWR method are comparedto those obtained on uniformly refined meshes (using the high-order Q2/Q1

Taylor-Hood Stokes element) and those on meshes, which are generated usingan error indicator based on local smoothness, e.g., in terms of second-orderdifference quotients of the discrete flow field.

(a)Q2/Q1 Taylor-Hood element with ’globally uniform’ refinement,(b) Stabilized Q1/Q1 element with feature-based local refinement (‘energy-

norm’),(c) Stabilized Q1/Q1 element with sensitivity-based local refinement by the

DWR method.

These results are collected in Table 1. A corresponding optimized mesh isshown in Figure 3.

Fig. 3. Locally refined mesh for the drag computation obtained by the DWRmethod.

Page 13: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

Adaptive discretization of optimization problems 13

Table 1. Computed drag values using the three different methods (a), (b) and (c),the error level of < 1% is indicated by boldface type, Nxxx number of unknowns.

a) Nglobal cd b) Nenergy cd c) Nweighted cd15, 960 8.2559 3, 696 12.7888 3, 696 12.7888

117, 360 7.9766 21, 512 8.7117 8, 456 9.8262899, 040 7.8644 80, 864 7.9505 15, 768 8.1147

7,035,840 7.8193 182, 352 7.9142 30, 224 8.184855, 666, 560 7.7959 473, 000 7.8635 84,832 7.8282

− − 1,052,000 7.7971 162, 680 7.7788− − − − 367, 040 7.7784∞ 7.7730 ∞ 7.7730 ∞ 7.7730

2.2 Application to eigenvalue problems

Next, we want to demonstrate that the DWR approach can also be appliedto less standard situations, e.g., to eigenvalue problems such as occurring inhydrodynamic stability analysis. For details of the following discussion seeHeuveline/Rannacher [46]. Eigenvalue problems may be viewed as special pa-rameter identification problems and can therefore be used to illustrate someof the features of this type of optimization problems.

Let V ⊂ H ⊂ V ′ be a Gelfand triple of function Hilbert spaces withscalar products and corresponding norms denoted by (·, ·)V , ‖ · ‖V , (·, ·)H ,and ‖ · ‖H , respectively. The embedding V ⊂ H is assumed to be compact.The typical example of such a situation is H = L2(Ω) and V = H1

0 (Ω)for a bounded domain Ω ⊂ Rd . Further, let a(·, ·) be a bounded elliptic(not necessarily hermitean) sesquiliner form on V and m(·, ·) a boundedhermitean semidefinit sesquilinear form on H ,

|a(u, u)| ≥ γ‖u‖2V − β‖u‖2

H , u ∈ V,

m(u, v) = m(v, u), m(u, u) ≥ 0, u, v ∈ H,

with certain constants γ > 0 and β ≥ 0 . Within this framework we considerthe eigenvalue problem

a(u, ψ) = λm(u, ψ) ∀ψ ∈ V, (25)

with λ ∈ C and the normalization condition m(u, u) = 1 , and its Galerkinapproximation on (finite dimensional) subspaces Vh ⊂ V ,

a(uh, ψh) = λhm(uh, ψh) ∀ψh ∈ Vh, (26)

with λh ∈ C , and m(uh, uh) = 1 . By the general Fredholm theory the eigen-value problem (25) possesses a countable set of isolated eigenvalues without

Page 14: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

14 Rolf Rannacher

finite accumulation point and each with finite geometric as well as algebraicmultiplicity. Under the approximation assumption

infϕh∈Vh

‖u−ϕh‖V → 0 (h → 0), u ∈ V, (27)

the eigenvalues and eigenfunctions of (26) converge in a suitable sense to theeigenvalues and eigenvectors of (25); see Bramble/Osborn [26]. We want toderive a posteriori error representations for this approximation process.

To this end, we use an interpretation of the eigenvalue problem in thecontext of variational equations. We introduce the spaces V := V ×C andVh := Vh×C with elements U := u, λ and Uh := uh, λh , respectively.On V , we define the semilinear form

A(U)(Ψ) := λm(u, ψ) − a(u, ψ) + µ

m(u, u)−1

.

With this notation the eigenvalue problems (25) and (26) can equivalently bewritten in the form

A(U)(Ψ) = 0 ∀Ψ ∈ V , (28)

A(Uh)(Ψh) = 0 ∀Ψh ∈ Vh. (29)

We want to apply the result of Proposition 2 to this situation. For controlingthe eigenvalue error, we use the functional

J(Ψ) := µm(ψ, ψ), Ψ = ψ, µ ∈ V ,

which for an eigenpair U = u, λ has the property J(U) = λ . The associ-ated dual solutions Z = z, λ∗ and Zh = zh, λ

∗h are determined by the

equations

A′(U)(Φ,Z) = J ′(U)(Φ) ∀Φ ∈ V , (30)

A′(Uh)(Φh, Zh) = J ′(Uh)(Φh) ∀Φh ∈ Vh. (31)

A straightforward calculation shows that this reduces to the adjoint eigenvalueproblems

a(ϕ, u∗) = λ∗m(ϕ, u∗) ∀ϕ ∈ V, (32)

a(ϕh, u∗h) = λ∗hm(ϕh, u

∗h) ∀ϕh ∈ Vh, (33)

with normalization m(u, u∗) = 1 , m(uh, u∗h) = 1 . Note that λ∗ = λ and

λ∗h = λh . The corresponding primal and dual residuals are defined by

ρ(·) := a(uh, ·) − λhm(uh, ·),

ρ∗(·) := a(·, u∗h) − λ∗hm(·, u∗h).

Then, from Proposition 2, we obtain the following result.

Page 15: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

Adaptive discretization of optimization problems 15

Proposition 3. Let u, λ and uh, λh be any eigenpairs of the eigenvalueproblems (25) and (26), respectively. Then, there holds the error representa-tion

λ−λh = 12ρ(u

∗−ihu∗) + 1

2ρ∗(u−ihu) + Rh, (34)

with arbitrary ihz, ihu ∈ Vh and the cubic remainder

Rh = 12 (λ−λh)m(u−uh, u

∗−u∗h).

Remark 3. The normalization condition m(uh, u∗h) = 1 preassumes that the

considered eigenvalue λh is nondegenerate, i.e., its geometric and algebraicmultiplicities coincide. The convergence behavior ‖u∗h‖ → ∞ (h → 0) in-dicates that the limiting eigenvalue λ is degenerate. A discussion of thisconsequence of non-normality of the eigenvalue problem is given in Heuve-line/Rannacher [46].

Remark 4. In stability analysis the eigenvalue problem (25) results from lin-earizing a semilinear variational problem, such as (16), about a stationarysolution u ∈ V . For a semilinear problem

a(u)(ψ) = 0 ∀ψ ∈ V, (35)

the resulting stability eigenvalue problem has the form

a′(u)(u, ψ) = λm(u, ψ) ∀ψ ∈ V, (36)

with the corresponding Galerkin approximation

a′(uh)(uh, ψh) = λhm(uh, ψh) ∀ψh ∈ Vh. (37)

For this situation, a generalization of the argument used for deriving Propo-sition 3 yields the following extended error representation (for details seeHeuveline/Rannacher [46]),

λ−λh = 12 ρ(z−ihz) + 1

2 ρ∗(u−ihu)

+ 12ρ(z−ihz) + 1

2ρ∗(u−ihu) + Rh,

(38)

with a cubic remainder Rh . The first two terms are the primal and dualresidual corresponding to the computation of the base solution uh , while theremaining two terms are the residuals of the eigenvalue computation analo-gously to (34). This error relation allows us to control the effect of an inac-curate computation of the approximate base solution uh (on a possibly toocoarse mesh) on the accuracy of the eigenvalue computation. This aspect willbe explained by an example below.

In order to determine the explicit form of the residual terms in the eigen-value error representation (34), we need to be more specific about the par-ticular structure of the eigenvalue problem considered. For example, considerthe diffusion-transport eigenvalue problem

Page 16: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

16 Rolf Rannacher

Au := −∆u+ b · ∇u = λu in Ω, u|∂Ω = 0, (39)

on a polygonal domain Ω ⊂ R2 , with a smooth transport coefficient b ∈ R

2 .In this case, the natural Hilbert spaces are H := L2(Ω) , V := H1

0 (Ω) , andthe sesquilinear forms are given by

a(u, v) := (∇u,∇v)Ω + (b · ∇u, v)Ω, m(u, v) := (u, v)Ω.

Suppose that this eigenvalue problem is approximated by the Galerkin methodusing piecewise linear or bilinear finite elements on meshes Th = K , asdescribed in Section 1. Within this setting, we can proceed analogously asbefore, obtaining

ρ(·) =∑

K∈Th

(Auh−λhuh, ·)K + 12 ([∂nuh], ·)∂K

,

ρ∗(·) =∑

K∈Th

(·,A∗u∗h − λ∗hu∗h)K + 1

2 (·, [∂nu∗h])∂K

,

with [·] again indicating jumps across intercell boundaries. Assuming thatthe approximation is good enough such that

|m(u−uh, u∗−u∗h)| ≤ 1, (40)

the error representation (34) implies the a posteriori eigenvalue error bound

|λ−λh| ≤ ηωλ :=

K∈Th

ρKω∗K + ρ∗KωK

. (41)

Here, the primal and dual cell-residual terms and corresponding weights are

ρK :=(

‖Rh‖2K + h

−1/2K ‖rh‖

2∂K\∂Ω

)1/2,

ωK :=(

‖u−ihu‖2K + 1

2h1/2K ‖u−ihu‖

2∂K

)1/2,

ρ∗K :=(

‖R∗h‖

2K + h

−1/2K ‖r∗h‖

2∂K\∂Ω

)1/2,

ω∗K :=

(

‖u∗−ihu∗‖2

K + h1/2K ‖u∗−ihu

∗‖2∂K

)1/2,

where Rh := Auh−λhuh, R∗h := A∗zh−λ

∗hu

∗h , and rh := − 1

2 [∂nuh], r∗h :=− 1

2 [∂nu∗h] . Then, carrying the estimation further, from the estimate (41), we

deduce the ‘energy-norm-type’ error estimate

|λ−λh| ≤ η(1)λ := cλ

K∈Th

h2K

ρ2K + ρ∗2K

, (42)

with a constant cλ growing linearly with |λ| . The analogue of the a posteri-

ori error estimator η(1)λ for symmetric eigenvalue problems has been given by

Nystedt [72]. There, the symmetry of the problem is extensively used. Further-more, H2-regularity of the eigenfunctions is required which excludes domains

Page 17: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

Adaptive discretization of optimization problems 17

with reentrant corners, the most interesting case for an adaptive approach.

Our derivation of η(1)λ does not need these assumptions.

The essential difference between the ‘weighted’ eigenvalue error estimator

ηωλ and the ‘energy-norm-type’ error estimate η

(1)λ is that in ηω

λ the localresidual properties of primal and dual eigenfunctions are multiplied while in

η(1)λ they appear as square sums. Neglecting the presence of the dual eigenvalue

problem, one may try to control the error in approximating the eigenvalueusing the primal residual part alone, i.e., using the ‘reduced’ error estimator

ηredλ :=

K∈Th

h2Kρ

2K .

The meshes generated on the basis of these three different error estimatorscan be expected to look quite differently for problems involving transport.

2.3 Numerical example

We consider the convection-diffusion model eigenvalue problem (39) on theslit-domain Ω = (−1, 1)× (−1, 3) \ x ∈ R2, x1 = 0,−1 < x2 ≤ 0 , withthe constant transport field b ≡ (0, 3)T . More details on this test case canbe found in Heuveline/Rannacher [45]. Figure 4 shows the computed primaland dual eigenfunctions corresponding to the leading eigenvalue, i.e., thatwith smallest real part. Due to the transport term these eigenfunctions havesignificantly different behavior. The primal eigenfunction has a boundary layerat the upper boundary and is only weakly affected by the slit, while the dualeigenfunction has no boundary layer but a more pronounced singularity atthe tip of the slit.

Fig. 4. Primal (left) and dual (right) eigenfunction.

Figure 5 contains the meshes obtained on the basis of the three different

error estimators ηweightλ , η

(1)λ , and ηred

λ . Clearly, the estimator ηredλ does

not reflect the features of the dual eigenfunction, while η(1)λ emphasizes the

Page 18: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

18 Rolf Rannacher

slit singularity too much. The resulting mesh efficiencies for computing theeigenvalue are shown in Figure 6.

P

b

Fig. 5. Configuration of the slid domain and adapted meshes with about 10, 000cells constructed by the error estimator η

(1)λ

(left), by its reduced version ηredλ

(middle), and by the ’weighted’ estimator ηweightλ

(right).

101

102

103

104

105

10−4

10−3

10−2

10−1

100

Number of cells

e λ := |λ

−λ h|

eλ(1)

ηλ(1)

eλred

ηλred

eλweight

ηλweight

eλ(Global)

Fig. 6. Mesh-efficiencies of the error estimators η(1)λ

(symbol ’O’), ηredλ (symbol

’×’), ηweightλ

(symbol ’∗’), compared to uniform refinement (symbol ’’). The dashedcurves correspond to the estimates for the eigenvalue error.

Page 19: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

Adaptive discretization of optimization problems 19

Mesh adaptation by the DWR approach proves to be superior over theother two strategies. However, though reflecting the correct asymptotic be-havior for N → ∞ (‘reliability’), all three estimators crudely overestimatethe true error (‘efficiency’). This is due to the many steps of applying Holderand triangle inequality in their derivation which lose sharpness. A more re-alistic estimate of the true eigenvalue error may be obtained be monitoringthe convergence behavior of the approximate eigenvalues on the generatedsequence of adapted meshes.

2.4 Application for general optimization problems

Now, we describe the application of the DWR method to the Galerkin approx-imation of general optimization problems. This is the natural extension of theEuler-Lagrange approach used in Subsection 2.1 for variational equations toreal optimization problems.

Let V be a Hilbert space of states and Q a space of controls. Then, witha differentiable cost functional J(·, ·) : V×Q→ R , a differentiable semi-linearform A(·, ·)(·) and a linear functional F (·) on V ×Q×V , we consider thefollowing optimization problem:

J(u, q) → min, A(u, q)(ψ) = F (ψ) ∀ψ ∈ V. (43)

We assume that there is a locally unique minimum u, q ∈ V ×Q which ischaracterized as corresponding to a saddle-point u, q, λ ∈ V ×Q×V of theLagrangian functional

L(u, q, λ) := J(u, q) + F (λ) −A(u, q)(λ),

where λ denotes the associated co-state (’adjoint’ or ‘dual’ variable). Con-ditions under which this is true can be found in the standard literature onoptimal control theory, e.g., Lions [58], Fursikov [37], and the survey Heinken-schloss/Troltzsch [44]. The triplet u, q, λ ∈ V ×Q×V is determined bythe saddle-point problem, the so-called ‘optimality system’ or ‘Karush-Kuhn-Tucker (KKT) system’,

A′u(u, q)(ϕ, λ) = J ′

u(u, q)(ϕ) ∀ϕ ∈ V,

A′q(u, q)(χ, λ) = J ′

q(u, q)(χ) ∀χ ∈ Q,

A(u, q)(ψ) = 0 ∀ψ ∈ V.

(44)

Here, A′u(u, q)(ϕ, ·) , A′

q(u, q)(χ, ·) , J ′u(u, q)(ϕ), and J ′

q(u, q)(χ) denote thedirectional derivatives of the semilinear form A(u, q)(·) and the functionalJ(u, q) in the directions ϕ and χ , respectively.

The optimization problem is discretized by a standard Galerkin methodusing finite dimensional subspaces Vh×Qh ⊂ V ×Q:

J(uh, qh) → min, A(uh, qh)(ψh) = F (ψh) ∀ψh ∈ Vh. (45)

Page 20: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

20 Rolf Rannacher

Its local solutions uh, qh ∈ Vh × Qh correspond to saddle points xh :=uh, qh, λh ∈ Xh := Vh×Qh×Vh of the Lagrangian L(·, ·, ·) , i.e., to solutionsof the discrete saddle-point problems

A′u(uh, qh)(ϕh, λh) = J ′

u(uh, qh)(ϕh) ∀ϕh ∈ Vh,

A′q(uh, qh)(χh, λh) = J ′

q(uh, qh)(χh) ∀χh ∈ Qh,

A(uh, qh)(ψh) = 0 ∀ψh ∈ Vh.

(46)

It seems to be a natural idea to seek control of the discretization errorwith respect to the cost functional J(·, ·) of the optimal control problem interms of the ‘primal’, ‘dual’, and ‘control’ residuals defined by

ρ∗(·) := J ′u(uh, qh)(·) −A′

u(uh, qh)(·, λh) (dual residual)

ρq(·) := J ′q(uh, qh)(·) −A′

q(uh, qh)(·, λh) (control residual)

ρ(·) := F (·) −A(uh, qh)(·) (primal residual)

From Becker/Rannacher [18], we recall the following fundamental result.

Proposition 4. Let the cost functional J(·, ·) and the semilinear form A(·, ·)(·)posses directional derivatives of up to third order. Then, for any solutionsu, q, λ and uh, qh, λh of the saddle point problems (44) and (46), respec-tively, there holds the a posteriori error representation

J(u, q)−J(uh, qh) = 12ρ

∗(u−ihu) + 12ρ

q(q−jhq) + 12ρ(λ−ihλ) + Rh, (47)

with arbitrary approximations ihu, ihλ ∈ Vh and jhq ∈ Qh. The remainderRh is cubic in the errors eu := u−uh , eq := q−qh , and eλ := λ−λh .

Proof. For embedding the present situation into the general framework ofProposition 1, we introduce the product spaces X := V ×Q×V and Xh :=Vh×Qh ×Vh and for x := u, q, λ, xh := uh, qh, λh set L(x) := L(u, q, λ) .Note that for x = u, q, λ, ϕ = ϕu, ϕq, ϕλ ∈ V ×Q×V ,

L′(x)(ϕu, ϕλ) = L′u(u, q, λ)(ϕu) + L′

q(u, q, λ)(ϕλ) + L′

λ(u, q, λ)(ϕλ)

= J ′u(u, q)(ϕu) −A′

u(u, q)(ϕu, λ) + J ′q(u, q)(ϕ

u)

−A′q(u, q)(ϕ

q, λ) + F (ϕλ) −A(uh, qh)(ϕλ).

Then, by applying Proposition 1, we obtain the error representation

L(x)−L(xh) = 12L

′(xh)(x−ihx) + Rh

= 12ρ

∗(u−ihu) + 12ρ

q(q−jhq) + 12ρ(λ−ihλ) + Rh,

with remainder Rh , which is cubic in the error x−xh = u−uh, q−qh, λ−λh .Finally, observing that at the saddle points u, q, λ and uh, qh, λh ,

L(x)−L(xh) = J(u, q)+F (z)−A(u, q)(z)−J(uh, qh)+F (zh)−A(uh, qh)(zh)

= J(u, q)−J(uh, qh),

we obtain the desired error representation (47).

Page 21: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

Adaptive discretization of optimization problems 21

Remark 5. The statement of Proposition 4 requires some comments.

• As in the case of variational equations the derivation of the error repre-sentation (47) does not require the uniqueness of solutions. The a prioriassumption uh, qh, λh → u, q, λ as h→0 makes the result meaningfulfor nonunique solutions.

• The generally nonlinear discrete saddle-point problem (46) can be solvedby a Newton or SQP iteration with mesh adaptation in each step. Hence,the combined process may be viewed as a successive ‘model enrichment’starting from an initial coarse mesh.

• The evaluation of the second-order optimality condition requires the solu-tion of auxiliary boundary value problems, Beckewr/Vexler [19].

• In concrete situations the error identity (47) can again be converted intoan a posteriori error estimates of the form

|J(u, q)−J(uh, qh)| ≈ η(uh) :=∑

K∈Th

ρuK ωλ

K + ρqK ωq

K + ρλK ωu

K

,

with cell residuals ρu/q/λK and cell weights ω

u/q/λK , the sensitivity factors.

The cubic remainder term Rh is usually neglected.• The systematic mesh adaptation based on the a posteriori error repre-

sentation (47) allows us to compute approximate optimal controls qopth

on highly nonuniform meshes such that the corresponding discrete statesuopt

h may lack admissibility, though the optimal cost value is well achieved.In order to recover more admissible states, one may compute a better ap-proximation uopt

h′ to uopt from the optimal control qopth by solving the

state equation in a larger space Vh′ ⊃ Vh ,

A(uopth′ , q

opth )(ψh′) = 0 ∀ψh′ ∈ Vh′ .

• The natural choice of the cost functional J(·, ·) for error control may notbe appropriate in parameter identification problems, in which the error inthe control itself is more interesting. This requires a more refined approachas will be discussed below.

Solution process and mesh adaptation

We briefly discuss the solution process of the optimal control problems dis-cretized by Galerkin finite element methods and the essential steps of thecorresponding mesh adaptation based on the a posteriori error representation(47). For more details and extensions, we refer to Becker [4], Vexler [78], andBecker/Meidner/Vexler [15].

Evaluation of a posteriori error estimates

Neglecting the cubic remainder Rh in (47), we obtain the approximate errorrepresentation

Page 22: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

22 Rolf Rannacher

J(u, q)−J(uh, qh) ≈ η := 12ρ

∗(u−ihu) + 12ρ

q(q−ihq) + 12ρ(λ−ihλ). (48)

Suppose that Vh is an intermediate finite element space obtained in the courseof a successive mesh adaptation process. Let uh, qh, λh ∈ Vh×Qh ×Vh bea corresponding solution of the saddle-point problem (46). In order to dothe next adaptation step, we have to evaluate the cellwise contributions tothe residual terms in the error representation (48). This follows the proce-dure already described for the model situation in Section 1 and the eigen-value problem in Subsection 2.2. Again the weighting terms u−ihu , λ−ihλ ,and q−jhq are approximated by locally postprocessing the discrete solutionsuh, qh, λh , e.g., as described in Section 1. In the case of (isoparametric)bilinear finite elements on quadrilateral meshes Th = K a patchwise bi-

quadratic interpolation i(2)2h uh is constructed using the nodal values of uh .

Then it is set(u−ihu)|K ≈ (i

(2)2h uh−uh)|K ,

and analogously for λ and q . This results in local error indicators ηK asso-ciated to the cells of the mesh Th .

2.5 Mesh adaptation strategy

Mesh adaptation based on a posteriori error estimates usually aims at theequilibration of the cell error indicators,

η :=∑

K∈Th

ηK , ηK ≈η

N, N := #K ∈ Th ⇒ η ≈ TOL,

A standard and simple strategy of mesh adaptation is the so-called ‘fixedfraction strategy’, which proceeds as follows. The cells K of the current meshTh are ordered with respect to the size of the associated error indicators,i.e., ηK1

≤ · · · ≤ ηKNh. The cells are split into two groups, according to the

conditionsN∗∑

i=1

ηKi≈ aη,

Nh∑

i=N∗

ηKi≈ bη,

with pre-assigned fractions 0 < a < b < 1 (say a = 0.25, b = 0.1 ). Then, thecells K1, . . . ,KN∗

are refined and the cells KN∗ , . . .KNhare coarsened. This

process is continued until the prescribed error tolerance TOL or the maximalavailable number Nmax of cells are reached, i.e., η ≈ TOL or Nh ≈ Nmax .We note that there are much more sophisticated strategies for mesh adaptation(see Becker/Rannacher [18]).

2.6 Solution process

Let a desired error tolerance TOL or a maximum mesh complexity Nmax

be given. Starting from a coarse initial mesh T0 , a hierarchy of successively

Page 23: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

Adaptive discretization of optimization problems 23

refined meshes Ti, i ≥ 1 , and corresponding finite element spaces Vi, isgenerated using the procedure described above. On each intermediate meshthe problem may be solved by reformulating the constrained optimal controlproblem (43) as an unconstrained optimal control problem

j(q) := J(S(q), q) → min, (49)

where S : Q→ V is the solution operator of the state equation

A(S(q), q)(ψ) = 0 ∀ψ ∈ V.

The local existence and sufficient regularity of S is assumed. Then, the firstand second-order necessary conditions for an optimal q are that

j′(q)(χ) = 0, j′′(q)(χ, χ) ≥ 0 ∀χ ∈ Q. (50)

The derivatives of the reduced functional can be computed using the La-grangian functional L(u, q, λ) = J(u, q)+F (λ)−A(u, q)(λ) introduced above.This is the basis of Newton-SQP methods for solving the optimality equationsuch as described in Troltzsch [77] and Hinze/Kunisch [48]. For given q ∈ Q ,let u = S(q) ∈ V be the corresponding state and λ ∈ V the correspond-ing co-state determined by the adjoint equation (the first equations in theoptimality system (46)),

L′u(u, q, λ)(ψ) = 0 ∀ψ ∈ V.

Then, for δq ∈ Q , there holds

j′(q)(δq) = L′q(u, q, λ)(δq). (51)

Further, for given δq ∈ Q , let δu ∈ V and δλ ∈ V fulfill the equation

L′′qλ(u, q, λ)(δq, ϕ) + L′′

uλ(u, q, λ)(δu, ϕ) = 0 ∀ϕ ∈ V,

and the equation

L′′qu(u, q, λ)(δq, ϕ) + L′′

uu(u, q, λ)(δu, ϕ) + L′′uλ(u, q, λ)(δλ, ϕ) = 0 ∀ϕ ∈ V,

respectively. Then, for δr ∈ Q , there holds

j′′(q)(δq, δr) = L′′qq(u, q, λ)(δq, δr) + L′′

uq(u, q, λ)(δu, δr)

+ L′′λq(u, q, λ)(δλ, δr).

(52)

Now, with this notation the classical Newton method (written on the contin-uous level) for solving the optimality equation in (51) reads

j′′(qn)(δqn, χ) = −j′(qn)(χ) ∀χ ∈ Q, (53)

with the update step qn+1 := qn + δqn . If for the solution of this linear sys-tem Krylov space and multigrid methods are used, usually only the evaluationsj′′(qn)(δqn, χ) and j′(qn)(χ) are required for fixed χ . This can efficiently beachieved based on the formulas (51) and (52), respectively, after a suitablediscretization (for details see Becker/Meidner/Vexler [15]). After having ob-tained a limit solution the above mesh adaptation process is used to generatethe new mesh.

Page 24: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

24 Rolf Rannacher

3 Examples of special optimization problems

3.1 A stationary tracking problem

In order to illustrate the performance of the DWR method applied to opti-mal control problems, we consider a simple model situation, namely a sta-tionary tracking problem with Neumann boundary control, Kapp [54] andBecker/Kapp/Rannacher [13]. Let the following semi-linear diffusion problembe given:

−∆u+ s(u) = f in Ω ⊂ R2,

∂nu|ΓN= 0, ∂nu|ΓC

= q,(54)

with the ‘control boundary’ ΓC ⊂ ∂Ω and ΓN := ∂Ω \ ΓC . The boundarycontrol q on ΓC is to be determined as to satisfy the tracking condition

J(u, q) = 12‖u−u‖

2ΓO

+ 12α‖q‖

2ΓC

→ min,

where ΓO ⊂ ∂Ω is the ‘observation boundary’, u ≡ 1 , and α ≥ 0 . Thecorresponding configurations are depicted in Figure 7. Using the spaces V :=H1(Ω) and Q := L2(ΓC) the variational formulation of the state equationreads

(∇u,∇ψ) + (s(u), ψ) − (q, ψ)ΓC= (f, ψ) ∀ψ ∈ V. (55)

Fig. 7. Configuration 1 (left) and Configuration 2 (right).

The KKT system corresponding to this problem has the form

(ϕ, u−u0)ΓO+ (∇ϕ,∇λ)Ω + (s′(u)ϕ, λ)Ω = 0 ∀ϕ ∈ V,

α(q, χ)ΓC− (λ, χ)ΓC

= 0 ∀χ ∈ Q,

(∇u,∇ψ)Ω + (s(u), ψ)Ω − (f, ψ)Ω − (q, ψ)ΓC= 0 ∀ψ ∈ V.

(56)

Page 25: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

Adaptive discretization of optimization problems 25

Its Galerkin approximation uses piecewise bilinear finite elements on quadri-lateral meshes as described in Subsection 1.1, Vh := Q1-elements andQh := ∂nvh|ΓC

, vh ∈ Vh. The discrete KKT system reads

(ϕh, uh−c0)ΓO+ (∇ϕh,∇λh)Ω + (s′(uh)ϕh, λh)Ω = 0 ∀ϕh ∈ Vh,

α(qh, χh)ΓC− (λh, χh)ΓC

= 0 ∀χh ∈ Qh,

(∇uh,∇ψh)Ω + (s(uh), ψh)Ω − (f, ψh)Ω − (qh, ψh)ΓC= 0 ∀ψh ∈ Vh.

(57)

For this situation the approximate error estimate (48) takes the form

|J(u, q)−J(uh, qh)| ≤ ηω :=∑

K∈Th

ρuK ωλ

K + ρqK ωq

K + ρλK ωu

K

, (58)

where the cell residuals and weights are given by

ρλK := ‖Rλ

h‖K + h−1/2K ‖rλ

h‖∂K , ωuK := ‖u−ihu‖K + h

1/2K ‖u−ihu‖∂K ,

ρqK := h

−1/2K ‖rq

h‖∂K , ωqK := h

1/2K ‖q−jhq‖∂K ,

ρuK := ‖Ru

h‖K + h−1/2K ‖ru

h‖∂K , ωλK := ‖λ−ihλ‖K + h

1/2K ‖λ−ihλ‖∂K .

with arbitrary ihu, jhq, ihλ ∈ Vh×Qh×Vh, and

Ruh|K := f +∆uh − s(uh), Rλ

h|K := ∆λh − s′(uh)λh,

ruh|Γ :=

12 [∂nuh], if Γ 6⊂ ∂Ω,

∂nuh, if Γ ⊂∂Ω\ΓC ,

∂nuh−qh, if Γ ⊂ ΓC ,

rqh|Γ :=

λh−αqh, if Γ ⊂ΓC ,

0, if Γ 6⊂ΓC .

rλh|Γ :=

12 [∂nλh], if Γ 6⊂∂Ω,

∂nλh, if Γ ⊂∂Ω\ΓO,

∂nλh+uh−u, if Γ ⊂ΓO,

The performance of the ‘weighted’ a posteriori error estimator ηω will becompared with that of the heuristic ‘energy-norm-type’ error indicators

ηprimalE := cI

(

K∈Th

h2K (ρu

K)2)1/2

,

ηprimal/dualE := cI

(

K∈Th

h2K

(ρuK)2 + (ρλ

K)2

)1/2

,

where the calibration constant cI is set to one. The indicator ηprimalE only con-

tains the residuals of the primal state equation, while the indicator ηprimal/dualE

also contains the residuals of the adjoint equation. However, both do not con-tain the control residuals ρq

K and miss the sensitivity information containedin the weighting factors.

Page 26: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

26 Rolf Rannacher

Numerical results

Figures 8 and 9 show the solutions and corresponding meshes obtained for thetwo different problem configurations by using the different error estimators,Kapp [54] and Becker/Kapp/Rannacher [13]. As to be expected, for the con-figuration 1 the ‘weighted’ estimator ηω in contrast to the ‘energy-norm-type’error estimator ηE does not enforce any mesh refinement at the reentrant cor-ners. From the corresponding results in Figure 10, we see that in this case ηω

yields by far more economical meshes than ηE . The corresponding results forconfiguration 2 are somewhat different. Again the efficiency of the ‘weighted’error estimator is acceptable, but its superiority over the ‘energy-type’ errorestimator is less distinct. This is due to the fact that both estimators mustallow the information to pass through the entire domain and consequentlyhave to suppress the effect of the reentrant corners by mesh refinement.

Fig. 8. Solutions and meshes obtained by ηω (left) and ηE (right) for problemconfiguration 1.

Fig. 9. Solutions and meshes obtained by ηω (left) and ηE (right) for problemconfiguration 2.

Page 27: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

Adaptive discretization of optimization problems 27

1e-07

1e-06

1e-05

0.0001

1000 10000 100000

E(V

_h)

Number of elements N

"energy_u""energy_u+lambda"

"dual_weighted"

1e-10

1e-09

1e-08

1e-07

1e-06

1e-05

0.0001

1000 10000 100000

E(V

_h)

Number of elements N

"energy_u""energy_u+lambda"

"dual_weighted"

Fig. 10. Mesh efficiencies for ηE and ηω for problem configuration 1 (left) andproblem configuration 2 (right).

3.2 A nonstationary tracking problems

We consider ‘initial control’ in the nonstationary semilinear diffusion problem,Becker/Meidner/ Vexler [15],

∂tu−∆u + s(u) = f in QT = Ω×(0, T ],

∂nu|∂Ω = 0, u|t=0 = q.(59)

The control q is to be betermined in Ω , such that a prescribed profile u ismatched in the least-squares sense by the solution at the final time T ,

J(u, q) := 12‖u(·, T )−u‖2

Ω + 12α‖q‖

2Ω → min .

The variational formulation of the state equation in the space-time domainQT reads

(∂tu, ψ)QT+ (∇u,∇ψ)QT

+ (s(u), ψ)QT= (f, ψ)QT

+ (q−u(0), ψ0)Ω , (60)

for all test functions ψ ∈ V . Here, the initial condition is incorporated intothe variational formulation. Applying the Lagrangian formalism results in thefollowing KKT system in space-time for the triplet u, q, λ :

−(ϕ, ∂tλ)QT+ (∇ϕ,∇λ)QT

+ (ϕ, s′(u)λ)QT= (ϕ(T ), u(T )−u−λ(T ))Ω,

(λ(0) + αq, χ)Ω = 0,

(∂tu, ψ)QT+ (∇u,∇ψ)QT

+ (s(u), ψ)QT= (q−u0, ψ0)Ω,

for all admissible test triplets ψ, χ, ϕ . This system reads in strong form like

−∂tλ−∆λ+ s′(u)λ = 0 in QT ,λ|t=T = u(T )−u, ∂nλ|∂Ω = 0,

λ|t=0 = −αq,

∂tu−∆u + s(u) = 0 in QT ,u|t=0 = q, ∂nu|∂Ω = 0.

(61)

Page 28: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

28 Rolf Rannacher

For solving this saddle-point system, we consider a space-time finite elementGalerkin discretization using the so-called dG(r) (‘discontinuous Galerkin’)or cG(r) (‘continuous Galerkin’) time discretization method (see, e.g., Eriks-son/Johnson/Thomee [33]). For a time grid

0 = t0 < ... < tm < ... < tM = T, km := tm−tm−1,

and a corresponding sequence of spatial meshes Tmh , we introduce finite ele-

ment spaces V mh consisting of spatially continuous functions which are cell-

wise (isoparametric) bilinear in space and polynomial (constant or linear)in time. Depending on the time discretization used the trial functions maybe discontinuous or continuous in time, while the test functions are alwaysdiscontinuous in time. Then, the two coupled subproblems for the primalstate uh and the adjoint state λh can be written in form of successive time-stepping schemes running from t0 = 0 forward to tM = T and from tM = Tbackward to t0 = 0 , respectively. This space-time Galerkin finite elementmethod is very similar to the backward Euler (cG(0) method) and the Crank-Nicolson (cG(1) method) scheme in time together with a time-varying spa-tial finite element discretization. For more details, we refer to Becker [5] andBangerth/Rannacher [2].

The main difficulty in solving the space-time KKT problem (61) is dueto the enormous storage requirements for evaluating the u-dependent coef-ficients of the adjoint equation over the whole time interval. These storagerequirements can be reduced by so-called ‘checkpointing’ techniques (see Gri-wank [42] and Hinze/Walther/Sternberg [49]), which will be described in thefollowing:

Checkpointing algorithm: We sketch the checkpointing method in its simplestform. First, the discrete time values um of the primal state are computedover the whole time interval [t0, tM ], and the P+1 samples u0, uQ, . . . , uQP are stored. Additionally, the Q−1 values of um in the last section are storedsuch that now the adjoint state λm can be computed backward in timein the last section. In the following process the Q − 1 stored values of theprimal state in the last section are no longer needed and their storage canbe used for storing the primal state over other time sections. Next, startingfrom the stored value u(P−2)Q the primal state is recomputed in the (P − 1)-st section and the resulting Q − 1 of its values are stored. Then, in turn,the corresponding adjoint state can be calculated backward in time in the(P − 1)-st section. This process is repeated until primal and adjoint state arecomputed on the first section. Denoting by S1 the storage (per level) and byW1 the work (number of forward time steps), we have

S1(P,Q) = (P+1) + (Q−1) = P+Q,

W1(P,Q) = M + (P−1)(Q−1) = 2M−P−Q+1.

This ‘simple’ checkpointing can be recursively refined to ‘multilevel’ check-pointing. Assuming that M = M0 · . . . · ML , with certain Ml ∈ N+ , the

Page 29: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

Adaptive discretization of optimization problems 29

’one-level checkpointing’ can be applied for the factorization M = PQ withP = M0 and Q = M1 ·. . .·ML , and then recursively to each of the P sections.For the corresponding memory requirement and total number of forward timesteps one finds by induction:

SL =L

l=0

(Ml−1) + 2, WL = (L+1)M −L

l=0

M

Ml+1.

Hence, for L ∼ log2(M) minimal storage Sopt and work count Wopt areachieved for Ml ∼M1/(L+1), l = 0, . . . , L :

Smin = O(

log2(M))

, Wmin = O(

M log2(M))

.

We see that multilevel checkpointing makes the simultaneous computation ofprimal and dual states feasible even for fine space-time meshes.

Remark 6. The checkpointing method can drastically reduce the storage re-quirement in an nonstationary optimal control calculation. However, the re-sulting increase in computational work may be rather significant, such aslog2(M) ∼ 10 . In this case the time taken by the additional time steps maylargely exceed the time needed for storing and reading all the data of the pri-mal solution from fast exterior memory devices. Thus, checkpointing may notbe the most effective method in dealing with the complexity problem in solv-ing nonstationary optimal control problems compared to using simple datacompression techniques.

Numerical results

The model problem (59) is considered with nonlinearity s(u) = u2 , posedon the space-time region QT := Ω × [0, T ] where Ω = (0, 1)3 ⊂ R3 and[0, T ] = [0, 1] . The initial control is finitely parametrized, such that the controlspace is Q = Rn , while the space for states and co-states is V = H1(Ω) asbefore. The parameter in the regularization term is set to α = 10−4 . Thediscretization of this problem uses the cG(1) method in space with 32768cells (trilinear shape functions) and the dG(0) method in time with 500 timesteps. Table 2 shows the effect on storage requirements compared to work ofdifferent forms of multi-checkpointing, Becker/Meidner/Vexler [15]. For thediscretization used a storage reduction by about a factor of 1/30 can beachieved while the total number of time steps only grows by a factor of 4 .

Page 30: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

30 Rolf Rannacher

Factorization Memory in MB Time Steps500 274 35,000

5 · 100 58 87,9485 · 10 · 10 13 113,4634 · 5 · 5 · 5 9 130,788

Table 2. Reduction of the storage requirement due to multilevel checkpointingwith cG(1) discretization and 32, 768 cells in each time step.

3.3 A stationary flow control problem

Next, we consider a model problem from stationary flow control, Becker [4, 5].Consider the (stationary) incompressible Navier-Stokes system

−ν∆v + v · ∇v + ∇p = 0, ∇ · v = 0, in Ω, (62)

where the physical unknowns are the velocity and the pressure u = v, p forprescribed viscosity ν , and constant density ρ ≡ 1 . The flow configuration isshown in Figure 11.

.

Γin S Γout

ΓQ

ΓQ

Fig. 11. Configuration of the flow control problem

The boundary conditions are

v|Γrigid= 0, v|Γin

= vin, ν∂nv − np|Γout= 0,

with the ‘rigid’ boundary Γrigid , the ‘inflow’ boundary Γin , the ‘outflow’boundary Γout , and the ‘control’ boundary ΓQ . The Reynolds number isRe = v2Dν−1 = 40 , defined in terms of viscosity ν , the characteristic lengthD = diam(cylinder) , and the maximal inflow velocity v , corresponding tostationary (uncontrolled) flow.

The goal of the optimization is the minimization of the drag coefficient

Jdrag(v, p) :=2

vD

S

nTσ(v, p)e1 ds,

taken over the surface S of the cylinder, with the stress tensor σ(v, p) =−pI + ν(∇v+∇vT ) , and the unit vector e1 of the main flow direction. The

Page 31: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

Adaptive discretization of optimization problems 31

piecewise constant control q = q1, q2 ∈ R2 acts as a Neumann control atthe two components of the control boundary ΓQ . The corresponding stateequation posed in variational form seeks u = v, p ∈ vin+H×L2(Ω) andq ∈ R2, such that

ν(∇v,∇ϕv) + (v · ∇v, ϕv) − (p,∇ · ϕv) − (ϕp,∇ · v) = (q, n · ϕv)ΓQ, (63)

for test functions ϕ = ϕv, ϕp ∈ V , with the spaces V = H×L2(Ω) andH = H1

0 (Ω;Γrigid∪Γin)2 . In this case the KKT system for the triplet u, q, λ ,with u = v, p and λ = z, π , in variational form reads as follows:

ν(∇ψv,∇z) + (v · ∇ψv + ψv · ∇v, z) − (π,∇ · ψv)

−(ψp,∇ · z) = Jdrag(ψv, ψp),

(r, z · n)ΓQ= 0,

ν(∇v,∇ϕv) + (v · ∇v, ϕv) − (p,∇ · ϕv) − (ϕp,∇ · v) = (q, n · ϕv)ΓQ,

(64)

for all ψ, r, ϕ ∈ V ×R2×V . This nonlinear system is discretized by the fi-nite element Galerkin method using (isoparametric) bilinear trial functions(equal-order ‘Q1-Stokes element’) on quadrilateral meshes with consistentleast-squares stabilization of pressure-velocity coupling and transport. Weomit the explicit statement of the discrete Galerkin equations and the cor-responding a posteriori error estimates. For the details see Becker [4, 5]. Theresulting discrete equations are solved by a simplified Newton method.

Table 3 shows the advantages of using mesh adaptation in this optimalcontrol model problem, Becker [4, 5]. The same minimal drag value can bereached on adapted meshes which have only about 6.5% of the cells of acorresponding uniform meshes. Figure 12 shows the streamline plots for theuncontrolled flow (q = 0) and the optimally controlled flow together withthe corresponding adapted ‘control mesh’. The controlled flow has been com-puted by post-processing as described above on a globally refined mesh. Thestructure of this (stationary) flow pattern indicates that it my not be dynami-cally stable and hence not physically relevant. In fact, an accompanying linearstability analysis confirms this expectation, Heuveline/Rannacher [46].

Table 3. Uniform refinement versus adaptive refinement in the drag minimization.

Uniform refinement Adaptive refinementN Jdrag N Jdrag

10512 3.31321 1572 3.2862541504 3.21096 4264 3.16723164928 3.11800 11146 3.11972

Page 32: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

32 Rolf Rannacher

Fig. 12. Velocity of the uncontrolled flow (top), controlled flow (middle), andadapted mesh (bottom) for the optimization process.

Stability of stationary flow

The structure of the stationary flow state corresponding to the optimal con-trol for drag minimization shown in Figure 12 raises the concern of whetherthis state is dynamically stable, i.e. physically meaningful. Instability maybe possible since the Newton method used for solving the nonlinear KKTsystem is capable of converging toward (dynamically) unstable points. Forthe present situation this question may be investigated using the standardlinearized stability approach, which leads to the eigenvalue problem

− ν∆v + v · ∇v + v · ∇v + ∇q = λv, ∇ · v = 0, in Ω,

v|Γrigid= 0, v|Γin

= vin, ν∂nv − pn|Γout= 0.

(65)

Here u := v, p denotes the stationary base flow, the stability of which is tobe investigated. Then, instability under small perturbations is to be expectedif there exits an eigenvalue with negative real part, Reλ < 0 . For more detailson hydrodynamic stability theory and for the finite element approximation ofthe stability eigenvalue problem (65), we refer to Heuveline/Rannacher [46]and the literature cited therein. By a simple extension of the approach usedin Subsection 2.2 for deriving a posteriori error representations for eigenvalueapproximations, we obtain an error estimate of the form

|λ−λh| ≈∑

K∈Th

ηK + ηλK

, (66)

Page 33: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

Adaptive discretization of optimization problems 33

where the residual terms ηK measure the local error source by the approx-imation of the state v, p and the ηλ

K measure the error source by thediscretization of the eigenvalue problem, such as in (41). Therefore, in orderto guarantee a reliable stability analysis, we have to ensure that on the finalmesh there holds

K∈Th

ηK ≤∑

K∈Th

ηλK . (67)

The interplay of the two components of the error estimator in the course of therefinement process in computing the critical eigenvalue is shown in Figure 13.It turns out that a reliable eigenvalue computation requires the resolution oflarger parts of the flow field than needed for the optimal control process. Thesefinal meshes are shown in Figure 14. The computed approximations for themost critical eigenvalue shown in Figure 15 demonstrate that the ‘optimal’stationary flow state, which corresponds to a control value of qopt ∼ 0.5 , isactually dynamically unstable, Heuveline/Rannacher [46]

103

104

105

10−4

10−3

10−2

10−1

Number of cells

Err

or e

stim

ator

o: Foward simulation error control*: Eigenvalue error control

Fig. 13. Discretization errors for base solution and eigenvalue approximation.

Fig. 14. Meshes obtained by the error estimators for the drag minimization (top)and the eigenvalue computation (bottom).

Page 34: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

34 Rolf Rannacher

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7−5

−4

−3

−2

−1

0

1

Pressure drop

Rea

l par

t

0 0.1 0.2 0.3 0.4 0.5 0.6 0.70

10

20

30

40

50

60

70

80

Pressure drop

Imag

inar

y pa

rt

Fig. 15. Real (left) and imaginary (right) parts of the critical eigenvalue as functionof the control pressure, qopt

∼ 0.5 .

3.4 Parameter identification

Next, we discuss the application of the DWR method to the special situa-tion of parameter identification problems. The special character of this typeof optimization problems is explained by the following consideration. Let usconsider the model problem

−∆u+ qu = f in Ω, u|∂Ω = 0. (68)

The goal is to determine the coefficient q by comparing the resulting stateu = u(q) with given measurements u ,

J(u, q) := 12‖u−u‖

2 + 12α‖q‖

2 → min!

where the regularization parameter α is chosen positive but small. In thiscase the KKT system reads

−(ϕ, u−u) + (∇ϕ,∇λ) + (qϕ, λ) = 0,

α(χ, q) + (χu, λ) = 0,

(∇u,∇ψ) + (qu, ψ) − (f, ψ) = 0,

(69)

for test triples ϕ, χ., ψ. In the case of an ‘identifiable’ parameter q > 0 ,there holds

−∆λ+ qλ = u−u = 0,

and consequently λ ≡ 0 . This would imply that in an a posteriori error esti-mate for the least-squares functional J(·, ·) the weights vanish and no meshrefinement would be induced. This demonstrates that the least-squares costfunctional in parameter estimation may not be appropriate for residual-basedmesh design. Further, in such problems one is more interested in estimatingthe actual error in the control, ‖q−qh‖ , rather than the error in the costfunctional.

Page 35: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

Adaptive discretization of optimization problems 35

Remark 7. We note the similarity of the above situation with that of eigen-value problems, which can be written in the form

−∆u = λu in Ω, u|∂Ω = 0,

with the minimization condition

J(u) := (‖u‖−1)2 → min .

The control of the discretiaztion error with respect to this cost functional J(·)is obviously not of particular interest. Therefore, in using the DWR methodfor this situation, we have employed the error control functional

E(u, λ) := λ‖u‖2.

The estimation of the error with respect to this functional was achieved byusing an outer dual problem which, however, could largely be reduced byexploiting the special structure of the eigenvalue problem. A similar approachwill also be the key to an effective way of mesh adaptation in parameteridentification problems.

Error estimation based on stability

The traditional way of proving a priori as well as a posteriori error estimatesfor parameter identification problems uses the inherent stability (coercivity)properties of the problem. However, since parameter identification is typi-cally ill-posed, stability needs to be artificially enhanced by introducing theregularization term 1

2α‖q‖2Q in the least-squares cost functional.

Let the KKT system of the parameter identification problem be writtenin compact form for arguments U = (u, q, λ) and Φ = (ϕu, ϕq, ϕλ) as

A(U)(Φ) = F (Φ),

where F (Φ) := (f, ϕu) and

A(U)(Φ) := (ϕu, u− u) + (∇ϕλ,∇λ) + (qϕλ, λ)

+ α(ϕq , q) + (ϕqu, λ) + (∇u,∇ϕu) + (qu, ϕu).

For this situation one may expect a worst-case oriented stability estimate ofthe form

supΨ∈X

A′(U)(Φ, Ψ)

‖Ψ‖X≥ β(α)‖Φ‖X , Φ ∈ X,

to hold with a constant β(α) ∼ α . On this basis one can derive a correspond-ing a posteriori error estimate

‖U−Uh‖X ≤ γ(α)‖R(Uh)‖X∗ ,

which is likewise worst-case oriented with a constant γ(α) ∼ α−1 . This ap-proach is underlying the traditional a posteriori error analysis given, e.g., in[62] and [50].

Page 36: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

36 Rolf Rannacher

Error estimation based on duality

We consider a discrete parameter identification problem in abstract form withstate space V , control space Q = Rnp , observation space Z = Rnm , observa-tion operator C : V → Z, and given observation C ∈ Z,

J(u) := 12‖C(u)−C‖2

Z → min, A(u, q)(ψ) = F (ψ) ∀ψ ∈ V. (70)

The semilinear form A(·)(·) in the state equation is assumed to possess direc-tional derivates and the functional F (·) as linear. The corresponding KKTsystem reads

A′λ(u, q)(ϕ, λ) = J ′

λ(u, q)(ϕ) ∀ϕ ∈ V,

A′q(u, q)(χ, λ) = J ′

q(u, q)(χ) ∀χ ∈ Q,

A(u, q)(ψ) = F (ψ) ∀ψ ∈ V.

(71)

The details of the following discussion can be found in Becker/Vexler [19],Vexler [78], and Bangerth [1].

The discretization of (70) and (71) uses finite element spaces Vh ⊂ V anddetermines uh, qh ∈ Vh ×Q , such that

J(uh) = 12‖C(uh)−C‖2

Z → min, A(uh, qh)(ψh) = F (ψh) ∀ψh ∈ Vh. (72)

In the present case discretization of the control space is not necessary. If theform A(·, q)(·) is regular for any q ∈ Q , we can define the solution opera-tor S : Q → V and, setting u = S(q) , obtain the following unconstrainedequivalent of (70) in the control space Q = Rnp :

j(q) := 12‖c(q)−C‖

2Z → min, c(q) := C(S(q)), q ∈ Q. (73)

The derivatives

Gij := ∂qjci(q) = C′

i(u)(wj), G = (Gij)np

i,j=1,

are determined by the solutions wj ∈ V of the equations

A′u(u, q)(wj , ψ) = −A′

qj(u, q)(1, ψ) ∀ψ ∈ V.

Using this notation the necessary first-order optimality condition is

j′(q) = 0 ⇔ G∗(c(q) − C) = 0.

This equation may be solved by a fixed-point iteration of the form

qk+1 = qk + δq, Hk δq = G∗k(c− c(qk)), Gk = c′(qk),

with a suitably chosen preconditioning matrix Hk . Popular choices are:

• The full ‘Newton algorithm’, Hk := G∗kGk + 〈c(qk) − c, c′′(qk)〉Z .

Page 37: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

Adaptive discretization of optimization problems 37

• The ‘Gauß-Newton algorithm’, Hk := G∗kGk .

• The ‘update method’, Hk := G∗kGk +Mk with certain corrections Mk .

However, the full Newton algorithm is rarely used in this context since itinvolves the evaluation of the term 〈c(qk

h) − C, c′′(qkh)〉Z , particularly the

second derivative c′′h(qkh) . This is rather expensive since it requires the solution

of several auxiliary problems, depending on the dimension of Q . Since in thelimit k → ∞ the deviation ch(qh) − C is expected to be small, the Gauß-Newton algorithm is justified and its convergence is even superlinear.

For controlling the error in this discretization, we use an error controlfunctional E(·) which addresses the error in the controls more directly, e.g.,

E(uh, qh) := − 12‖q−qh‖

2Q,

which yields E(u, q)−E(uh, qh) = 12‖q−qh‖

2Q. Then, the systematic error con-

trol by the general approach for variational equations described in Subsection2.1 applied to the Galerkin approximation of the saddle-point system (71) hasto use an extra ‘outer’ dual problem with solution z = zu, zq, zλ ∈ V×Q×V ,

A′(u, q, λ)(ϕ, χ, ψ, zu, zq, zλ) = E′(u, q, λ)(ϕ, χ, ψ), (74)

for all ϕ, χ, ψ ∈ V ×Q×V , where

A(u, q, λ)(ϕ, χ, ψ) := A′u(u, q)(ϕ, λ) − J ′

u(u, q)(ϕ)

+A′q(u, q)(χ, λ) − J ′

q(u, q)(χ) +A(u, q)(ψ) − F (ψ).

A careful analysis of this setting results in an a posteriori error representationof the following form (Becker/Vexler [19, 20]).

Proposition 5. Let u, q be a local solution of the optimization problem anduh, qh its finite element approximation. Further, let G be the derivative ofsolution operator u = S(q) with respect to q . For a prescribed control errorfunctional E(·) let z ∈ V be the corresponding dual solution determined by

a′u(u, q)(ϕ, z) = −〈G(G∗G)−1∇E(q), C′(u)(ϕ)〉Z ∀ϕ ∈ V. (75)

Then, there holds the a posteriori error representation

E(q)−E(qh) = 12ρ(z−ihz) + 1

2ρq(q−jhq) + 1

2ρ∗(u−ihu) + Ph + Rh, (76)

for arbitrary ihu, ihλ ∈ Vh and jhq ∈ Qh, with the discretization residuals

ρ(·) := F (·) −A(uh, qh)(·), ρq(·) := α〈q, ·〉Q −A′q(u, q)(·, λ),

ρ∗(·) := 〈Gh(G∗hGh)−1∇E(qh), C′(uh)(·)〉 − a′u(uh, qh)(·, zh).

The remainder Ph is bounded by

|Ph| ≤ C ‖e‖V ‖C(u)−C‖Z , (77)

and the linearization remainder Rh is again cubic in the error e := eu, eq, ez .

Page 38: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

38 Rolf Rannacher

If the parameter space Q is finite dimensional and therefore not dis-cretized, the control residual ρq(·) vanishes. Due to the particular features ofthe parameter identification problem, we can expect that ‖C(u)−C‖Z ≪ 1 forthe optimal state. Hence, this term is neglected compared to the leading termη := 1

2ρ(z−ihz)+ 12ρ

∗(u−ihu) . Based on the a posteriori error representation(76) the mesh adaptation is organized as described above.

Numerical example: determination of reaction parameters

We consider the reaction-diffusion problem, Becker/Vexler [19, 20, 78],

β · ∇u − µ∆u+ f(u) = 0 in Ω,

u|Γin= u, ∂nu|∂Ω\Γin

= 0,(78)

on the domain depicted in Figure 16. The reaction term has the form of anArrhenius law,

f(u) = A exp(

−E

1−u

)

u(1−u).

The goal is to determine the parameters A and E from given line averages∫

Γi

u ds, i = 1, . . ., 10.

The results obtained by using the DWR approach for mesh adaptation areshown in Figure 17. The clear superiority of systematic local mesh adaptationover uniform mesh refinement is seen in Figure 18.

Ox

Ox

F

Fig. 16. Configuration of the reaction-diffusion problem, measurements are verticalline-integrals of concentration (dotted lines), initial solution for A = 54.6, E = 0.15 .

Fig. 17. Estimated solution for A = 992.3, E = 0.07 and refined mesh.

Page 39: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

Adaptive discretization of optimization problems 39

0.01

0.1

10000 100000

rela

tive

erro

r

nodes

globallocal

Fig. 18. Quality of generated meshes: error in estimated parameters versus numberof nodes for uniform and adaptive mesh refinement.

A note on distributed parameter estimation

In solving distributed parameter identification problems also the control spaceneeds to be discretized. We consider the following model problem of determi-nation of a distributed diffusion parameter, Bangerth [1]:

−∇ · (q∇u) = f in Ω, u|∂Ω = 0, (79)

from distributed measurements u ∈ L2(Ω) . The usual least-squares approachseeks to minimize the cost functional

J(u, q) := 12‖u−u‖

2 + 12α‖q‖

2,

with a stabilization parameter 0 < α ≪ 1 . The corresponding KKT systemin strong form reads

−∇ · (q∇λ) = u− u in Ω, λ|∂Ω = 0,

∇u · ∇λ = αq in Ω,

−∇ · (q∇u) = f in Ω, u|∂Ω = 0.

(80)

This system is discretized by the finite element method using Q1-elements forprimal and adjoint variable u and λ , and Q0-elements for the distributedcontrol q . For this approximation the general DWR approach yields the aposteriori error representation

J(u, q)−J(uh, qh) = 12ρ

∗(u−ihu) + 12ρ

q(q−ihq) + 12ρ(λ−ihλ) + Rh, (81)

with the cell-residuals and weights defined by

Page 40: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

40 Rolf Rannacher

ρλK := ‖Rλ

h‖K + h−1/2K ‖rλ

h‖∂K , ωuK := ‖u−ihu‖K + h

1/2K ‖u−ihu‖∂K ,

ρqK := h

−1/2K ‖Rq

h‖∂K , ωqK := h

1/2K ‖q−jhq‖∂K ,

ρuK := ‖Ru

h‖K + h−1/2K ‖ru

h‖∂K , ωλK := ‖λ−ihλ‖K + h

1/2K ‖λ−ihλ‖∂K .

with arbitrary ihu, jhq, ihλ ∈ Vh×Qh×Vh, and

Ruh|K := f+∇ · (qh∇uh), Rλ

h|K := u−uh+∇ · (qh∇λh)

Rqh|K := αqh+∇λh · ∇uh

ruh|Γ :=

12 [∂nuh], if Γ 6⊂ ∂Ω,

0, if Γ ⊂∂Ω,rλh|Γ :=

12 [∂nλh], if Γ 6⊂∂Ω,

0, if Γ ⊂∂Ω.

The remainder has the explicit form

Rh := 112

(

(q−qh)∇(λ−λh),∇(u−uh))

.

Remark 8. From first numerical experience for this kind of problem, we drawthe following conclusions:

1. Using different meshes for primal/dual solutions u, λ and for the controlq may be necessary for maximum solution efficiency.

2. The design of the control q-mesh based on ‘dual’ information on coarsemeshes is delicate for Q0-elements. Feature-based refinement seems morerobust in this case. The situation is better for ‘smoother’ parameter ap-proximations, e.g., by continuous Q1-elements.

3. The a posteriori error representation (81) is very simular to the a poste-riori error representation derived before for eigenvalue approximation inSubsection 2.2.

3.5 Model calibration

The calibration of mathematical models by using available data from exper-imental measurements may be considered as special parameter estimationproblems. Usually these problems are better conditioned than pure parame-ter identification problems and do not depend so much on regularization.

Example: Calibration of diffusion parameters

We consider a reactive flow problem governed by the full set of stationary con-servation equations for mass, momentum, energy and species concentrations(the ‘compressible’ Navier-Stokes equations):

Page 41: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

Adaptive discretization of optimization problems 41

∇ · (ρv) = 0,

(ρv · ∇)v + ∇ · π + ∇p = 0,

ρv · ∇T − c−1p ∇ · Q = −

9∑

i=1

hifi,

ρv · ∇yk + ∇ · Fk = fk, k ∈ S, #S = 9,

Fk = qkD∗k∇yk, D∗

k = (1−yk)(

l 6=k

xl

Dbinkl

)−1

.

(82)

The setup of this configuration of a hydrogen diffusion flame is taken fromBraack/Ern [23] and is shown schematically in Figure 19. At the inflow bound-ary of the center tube, 10% mass fraction of hydrogen yO2

and 90% of ni-trogen yN2

is prescribed. The inflow temperature is T = 273 K . Along theupper and lower boundary the constant values yO2

= 0.22 and yN2= 0.78

are given. The peak velocity of the three parabolic velocity profiles is 1 m/s .For a more detailed description of such a model, we refer to Braack/Rannacher[24] and Becker/Braack/Vexler [7, 8].

2 cm

5 mm 2 mm

1 mm

1 mm

Air

Air

H2 / N2Measurements:point-values ofconcentrations

Fig. 19. Configuration of the diffusion-reaction problem.

The numerical evaluation of the above so-called ‘multi-component diffu-sion’ model Fk = qkD

∗k∇yk is rather expensive since it requires the inver-

sion of linear systems in all nodal points. Therefore, it is desirably to replacethis complicated model by the much simpler ‘Fick’s model’ Fk = qkDk∇yk ,with diffusion coefficients independent of the other species. The goal is nowto choose these coefficients Dk in such a way that the resulting solution isclose to that obtained by using the full multi-component diffusion. This modelcalibration is done on the basis of given measured point values of species con-centrations. Since the case of pointwise data is not directly covered by ourHilbert-space-based theory, we may replace point values by certain local av-erages. An a priori error analysis of the discretization of point value-basedparameter estimation problems is given in Rannacher/Vexler [76].

Figure 20 shows the result of a successful adaptation measured in termsof the position of the combustion front which is usually a very sensitive pa-rameter. Figure 21 shows zooms into automatically adapted meshes (for moredetail see Vexler [78, 79]).

Page 42: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

42 Rolf Rannacher

Fig. 20. Left: solution by multicomponent diffusion (reference solution), middle:solution by Fick’s law (initial parameters), right: solution by optimally calibratedFick’s law.

Fig. 21. Locally refined meshes (zooms).

Calibration of boundary conditions

Another type of model calibration concerns the proper choice of boundaryconditions for making the underlying model well-posed. We consider a high-temperature flow reactor depicted in Figure 22, which is used for determiningthe rates of elementary chemical reactions from local measurements of speciesconcentrations under different temperature and pressure conditions. As anexample the reaction rate

k(T ) = A( T

300K

exp(

−Ea

RT

)

of the elementary chemical reaction O(1D) + H2 → OH + H is to be de-termined at temperature T = 780K . This reaction is important for the un-derstanding of atmospheric chemistry and until recently its reaction velocityhad been known only for room temperature T ∼ 300K . The reaction modelinvolves 7 species and 6 elementary reactions, with argon as inert gas. By laser-induced fluorescence (LIF) techniques local average values of H-concentrationcan be measured in the reactor. The result of the parameter identificationprocess is a significantly increased value of k = 1.5 · 10−10 [cm3 mol−1 s−1] ,for T = 780 K , over the known value k = 1.0 · 10−10 [cm3 mol−1 s−1] ,for T = 300 K . For more details, we refer to Carraro [28] and Car-raro/Heuveline/Rannacher [27].

Page 43: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

Adaptive discretization of optimization problems 43

250mm

1000mm

15mm

44,5/2

25/1

6/1

6/1

6/1

41mm

he

atin

g z

on

e

depth of cooled gas inlet

cooled gas inlet

Simulation of Temperature in a Flow Reactor

pulsed laserphotolysis

gas inlet

gas inlet

time resolvedH-atom LIFdetection

300

440

580

720

860

1000

1140

1280

1420

1560

1700K

5 cm 10 cm 15 cm 20 cm 25 cm 30 cm

Fig. 22. Configuration of the high-temperature reactor.

The first step in preparing for the simulation of the flow conditions in thereactor is the determination of a complete set of initial and boundary condi-tions for which the model can be expected to be well-posed. For the initialconcentration of the species, their partial pressures are calculated from thevalues of their fluxes and the total pressure, and from the partial pressuresthe initial number of molecules per m3 is obtained. The inflow profiles arecalculated from the geometry of the inflow and the fluxes measured experi-mentally by the flux controllers. The outflow boundary conditions, the no-slipcondition at rigid walls and the symmetry condition along the cylinder axisare imposed as usual. Only the temperature boundary condition at the outerwall is not so clearly defined. This is a typical difficulty for the simulation ofreal-life experiments. The reactor is heated at the exterior wall in order toachieve the desired high temperature in the interior particularly in the smallmeasurement zone. However, due to constructional constraints the exact tem-perature distribution along the outer wall cannot be directly measured and isactually nonuniform due to the cold inflow at the upper inlets. The only infor-mation available is the temperature profile along the inner reactor axis whichis obtained from thermo-sensor measurements. This is taken as input datafor the implicit determination of the unknown temperature boundary data bya parameter estimation process. The setting of this calibration process is asfollows (see Figure 23):

1. The temperature profile along the wall is assumed as a piecewise linearfunction described by 6 parameters, the temperature values at the posi-tions y2, . . . , y7 , to be determined.

Page 44: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

44 Rolf Rannacher

2. Input data for the parameter estimation are measured profiles Tmeas ofthe temperature along the symmetry axis.

3. The parameters are determined by minimizing the least squares functional

J(T ) = 12‖Tsim−Tmeas‖

2Γsym

for the measured and computed temperature profile along the symmetryaxis subject to the flow equations.

Such a ‘model calibration’ has to be done within each measurement cycle inthe reactor. Results for an experiment with 580K in the heating zone areshown in Figure 24, Carraro [28] and Carraro/Heuveline/Rannacher [27].

y5

y6

y7

y8

y2

y3

y4

y1

N2OH2 ArAr

mea

sure

d te

mpe

ratu

re

tem

pera

ture

’s b

ound

ary

cond

itio

n

Laser

Fig. 23. Experimental setup (left), computed temperature with 780 K in themeasurement area (middle), adapted mesh for the flow simulation (right).

0

50

100

150

200

250

300

350

-20 -10 0 10 20 30 40 50 60

Tem

pera

ture

[C]

Position [cm]

Calibration of the temperature at the boundary

Experimental MeasurementsBoundary Temp.: result of param. estim.

Simulated Temperature

Fig. 24. Temperature profiles along the reactor middle axis after calibration ofboundary conditions.

Page 45: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

Adaptive discretization of optimization problems 45

Remark 9. Prior simulation of the flow conditions in the reactor have shownthat at high temperature and ‘high’ (atmospheric) pressure the inclusion ofgravity may result in persisting nonstationary flow behavior, see Figure 25.This has enforced the restriction of the experiment to low pressure conditions(15 − 50 Torr). In order to do controlled measurements in the reactor at hightemperature and atmospheric pressure the flow needs to be stabilized. Forsuppressing this nonstationary behavior, one may try to minimize the costfunctional

J(u) = 12K

∫ T

0

‖∂tv(t)‖2Ωmeas

+ ‖∂tT (t)‖2Ωmeas

dt + ‘Regularization’,

where Ωmeas is the measurement zone in the reactor and u = v, p, T, yis determined by the flow model. It is expected that in this way the LIFmeasurements can be extended to atmospheric pressure. The realization ofthis program is subject of ongoing work.

Fig. 25. A time series of temperature plots in the case of non-zero gravity indicatingnonstationary behavior

4 Conclusion and outlook

Duality-based error estimation can be applied, in principle, for all problemsposed in variational form. The approach described in this article for stationaryand nonstationary control problems is currently extended into the followingdirections:

• Estimation of distributed parameters: The identification of distributed pa-rameters, for example in wave propagation problems in seismics, requiresthe proper discretization of the control space. Doing this adaptively mayresult in different meshes for states and control as suggested by the obser-vations in Bangerth [1].

Page 46: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

46 Rolf Rannacher

• Adaptive nonstationary optimal control: The multilevel checkpointing tech-nique is to be realized in the context of duality based simultaneous mesh-size and time-step adaptation following ideas as developed in Becker [5],Becker/Meidner/Vexler [15], and Meidner/Vexler[68]. It has to be com-pared with regards to time efficiency with the usage of external storagedevices together with data compression techniques.

• Control- and state-constrained problems: Most practical optimal controlproblems involve (inequality) constraints for the controls and, what ismuch more difficult to handle, also for the states. Though the algorith-mic treatment of this complication is reasonably understood (see, e.g.,Kunisch/Rosch [56]), its interaction with sensitivity-controlled mesh adap-tation based on a posteriori error estimates is largely open. In this con-text techniques used for the adaptive discretization of variational inequal-ities corresponding to Signorini-type problems, Blum/Suttmeier [22], or inelasto-plasticity, Rannacher/Suttmeier [75], may be useful.

• Optimal experimental design: Parameter estimation is usually troubledby inherent ill-conditioning due to insufficient data from measurements.This raises the question of how to find ‘good’ data which would be ‘op-timal’, i.e., maximal sensitive, for the parameter identification process.First steps towards such ‘optimal experimental design’ in the context ofPDEs and adaptive discretization are described in Carraro [28] and Car-raro/Heuveline/Rannacher [27].

• Further aspects currently considered are summarized by the key words ro-bust parameter identification, parameter identification with uncertainties,multi-goal optimization, and shape optimization in the context of fluid-structure interaction, Dunne/Rannacher [32].

References

1. W. Bangerth: Adaptive Finite Element Methods for the Identification of Dis-tributed Parameters in Partial Differential Equations, Dissertation, Institute ofApplied Mathematics, Univ. of Heidelberg, 2002.

2. W. Bangerth and R. Rannacher: Adaptive Finite Element Methods for Differen-tial Equations, Lectures in Mathematics, ETH Zurich, Birkhauser, Basel 2003.

3. R. Becker: An optimal control approach to a posteriori error estimation for fi-nite element discretizations of the Navier-Stokes equations, East-West J. Numer.Math. 8, 257-274 (2000).

4. R. Becker: Mesh adaptation for stationary flow control, J. Math. Fluid Mech. 3,317-341 (2001).

5. R. Becker: Adaptive Finite Elements for Optimal Control Problems, Habilitationthesis, Institute of Applied Mathematics, Univ. of Heidelberg, 2001.

6. R. Becker and M. Braack: The finite element toolkit GASCOIGNE, In-stitute of Applied Mathematics, University of Heidelberg, URL http://

www.gascoigne.uni-hd.de, 2005.

Page 47: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

Adaptive discretization of optimization problems 47

7. R. Becker, M. Braack, and B. Vexler: Numerical parameter estimation for chemi-cal models in multidimensional reactive flows, Combustion Theory and Modelling8, 661–682 (2004).

8. R. Becker, M. Braack, and B. Vexler: Parameter identification for chemical mod-els in combustion problems, Appl. Numer. Math. 54 (3-4), 519-536, (2005).

9. R. Becker, T. Dunne, and D. Meidner: VISUSIMPLE: An interactive visualiza-tion and graphics/mpeg-generation program, Institute of Applied Mathematics,University of Heidelberg, URL http://www.visusimple.uni-hd.de, 2005.

10. R. Becker, V. Heuveline, and R. Rannacher: An optimal control approach toadaptivity in computational fluid mechanics, Int. J. Numer. Meth. Fluids. 40,105-120 (2002).

11. R. Becker and H. U. Kapp: Optimization in PDE models with adaptive finite ele-ment discretization, Proc. ENUMATH’97, Heidelberg, Sept 28, 1997 (H. G. Bocket al., eds), pp. 147-155, World Scientific, Singapore, 1998.

12. R. Becker, D. Meidner, R. Rannacher, and B. Vexler: Adaptive finite elementmethods for PDE-constrained optimal control problems, in Reactive Flows, Dif-fusion and Transport (W. Jager et al., eds), Springer, Heidelberg, 2006.

13. R. Becker, H. Kapp, and R. Rannacher: Adaptive finite element methods foroptimal control of partial differential equations: basic concepts, SIAM J. Opti-mization Control 39, 113-132 (2000).

14. R. Becker, D. Meidner, and B. Vexler: RODOBO: A C++ library for optimiza-tion with stationary and nonstationary PDEs, Institute of Applied Mathematics,University of Heidelberg, URL http://www.rodobo.uni-hd.de, 2005.

15. R. Becker, D. Meidner, and B. Vexler: Efficient numerical solution of parabolicoptimization problems by finite element methods, submitted to OptimizationMethods and Software, 2005.

16. R. Becker and R. Rannacher: Weighted a-posteriori error estimates in FE meth-ods, Lecture ENUMATH-95, Paris, Sept. 18-22, 1995, in: Proc. ENUMATH-97,Heidelberg, Sept. 28 - Oct.3, 1997 (H.G. Bock, et al., eds), pp. 621-637, WorldScientific Publ., Singapore, 1998.

17. R. Becker and R. Rannacher: A feed-back approach to error control in finite ele-ment methods: Basic analysis and examples, East-West J. Numer. Math., 4:237–264 (1996).

18. R. Becker and R. Rannacher: An optimal control approach to error estimationand mesh adaptation in finite element methods, Acta Numerica 2000 (A. Iserles,ed.), pp. 1-102, Cambridge University Press, 2001.

19. R. Becker and B. Vexler: A Posteriori error estimation for finite elementdiscretization of parameter identification problems, Numer. Math. 96, 435–459(2004).

20. R. Becker and B. Vexler: Mesh refinement and numerical sensitivity analysisfor parameter calibration of partial differential equations, J. Comput. Phys. 206,95-110 (2005).

21. L. Beilina and C. Johnson: A posteriori error estimation in computational in-verse scattering, M3AS, 2004.

22. H. Blum and F.-T. Suttmeier. Weighted error estimates for finite element solu-tions of variational inequalities, Computing 65, 119–134 (2000).

23. M. Braack and A. Ern. Coupling multimodeling with local mesh refinement forthe numerical solution of laminar flames, Combust. Theory Modelling 8, 771–788(2004).

Page 48: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

48 Rolf Rannacher

24. M. Braack and R. Rannacher. Adaptive finite element methods for low-Mach-number flows with chemical reactions, in 30th Computational Fluid Dynamics,VKI LS 1999-03 (H. Deconinck, ed.), pages 1–93. von Karman Institute for FluidDynamics, Brussels, Belgium, 1999.

25. M. Braack and T. Richter: Solutions of 3D Navier-Stokes benchmark problemswith adaptive finite elements, Computers and Fluids, in press, online May 2005.

26. J.H. Bramble and J.E. Osborn. Rate of convergence estimates for nonselfadjointeigenvalue approximations. Math. Comp. 27, 525–545 (1973).

27. T. Carraro, V. Heuveline, and R. Rannacher: Determination of kinetic param-eters in laminar flow reactors. I. Numerical aspects, Reactive Flows, Diffusionand Transport, (W. Jager et al., eds), Springer, Heidelberg, 2006.

28. T. Carraro: Parameter estimation and optimal experimental design in flow re-actors, Dissertation, Institute of Applied Mathematics, Univ. Heidelberg, 2005.

29. E. Casas and M. Mateos: Uniform convergence of the finite element method.Applications to state constrained control problems, Comput. Appl. Math. 21,67–100 (2002).

30. W. Dahmen and A. Kunoth: Adaptive wavelet methods for linear-quadratic el-liptic control problems, SIAM J. Contr. Optim. 43, 1640–1675 (2005).

31. L. Dede and A. Quarteroni: Optimal control and numerical adaptivity foradvection-diffusion equations, Math. Model. Numer. Anal. 39, 1019-1040 (2005).

32. Th. Dunne and R. Rannacher: Adaptive finite element approximation of fluid-structure interaction based on an Eulerian variational formulation, in Fluid-Structure Interaction: Modelling, Simulation, Optimisation (H.-J. Bungartz andM. Schafer, eds), Springer’s LNCSE-Series, to appear 2006.

33. K. Eriksson, C. Johnson, and V. Thomee: Time discretization of parabolic prob-lems by the discontinuous Galerkin method, RAIRO Model. Math. Anal. Numer.19, 611-643 (1985).

34. K. Eriksson, D. Estep, P. Hansbo, and C. Johnson: Introduction to adaptivemethods for differential equations. Acta Numerica 1995 (A. Iserles, ed.), pp.105–158, Cambridge University Press, 1995.

35. R. S. Falk: Approximation of a class of optimal control problems with order ofconvergence estimates, J. Math. Anal. Appl. 44, 28–47 (1973).

36. R. S. Falk: Error estimates of the numerical identification of a variable coeffi-cient, Math. Comp. 40, 537–546 (1983).

37. A. V. Fursikov: Optimal Control of Distributed Systems: Theory and Applica-tions, Vol. 187, Trans l. Math. Monogr. AMS, Providence, 1999.

38. A. Gaevskaya, R. H. W. Hoppe, Y. Iliash, and M. Kieweg: Convergence analysisof an adaptive finite element method for distributed control problems with controlconstraints, to appear in Proc. Conf. Optimal Control for PDEs, Oberwolfach,April 13-17, 2005 (G. Leugering et al., eds), Birkhauser, Basel.

39. A. Gaevskaya, R. H. W. Hoppe, ans S. Repin: A posteriori estimates for costfunctionals of optimal control problems, to appear in Proc. ENUMATH-2005,Santiago de Compostela, July 18-22, 2005, Springer, Berlin-Heidelberg.

40. M. B. Giles and E. Suli: Adjoint methods for PDEs: a posteriori error analysisand postprocessing by duality, Acta Numerica 2002 (A. Iserles, ed.), pp. 145–236,Cambridge University Press, 2002.

41. M. S. Gockenbach and W. W. Symes: Adaptive simulation, the adjointstate method, and optimization, Large-Scale PDE-Constrained Optimization(L. T. Biegler et al., eds), 281-297, LNCSE 30, Springer, Berlin, 2003.

Page 49: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

Adaptive discretization of optimization problems 49

42. A. Griewank: Achieving logarithmic growth of temporal and spatial complexityin reverse automatic differentiation, Optim. Methods Software 1, 35-54 (1992).

43. M. Gunzburger and L. Hou: Finite-dimensional approximation of a class ofconstrained nonlinear optimal control problems, SIAM J. Control Optim. 34,1001–1043 (1996).

44. M. Heinkenschloss and F. Troltzsch: Introduction to the theory and numericalsolution of PDE constrained optimization problems, Lecture at Workshop onPDE Constrained Optimization, Center for International Mathematics, Tomar,Portugal, July 26–29, 2005.

45. V. Heuveline and R. Rannacher: A posteriori error control for finite elementapproximations of elliptic eigenvalue problems, Advances in Comput. Math. 15,1-32 (2001).

46. V. Heuveline and R. Rannacher: Adaptive finite elements for stability eigenvalueproblems, Preprint, Inst. of Applied Mathematics, Univ. of Heidelberg, 2006.

47. M. Hintermuller, R. H. W. Hoppe, Y. Iliash, and M. Kieweg: An a posteriorierror analysis of adaptive finite element methods for distributed elliptic controlproblems with control constraints, to appear in ESAIM Contro, Optimisation andCalculus of Variations, 2006.

48. M. Hinze and K. Kunisch: Second order methods for optimal control of time-dependent fluid flow, SIAM J. Optim. 13, 321–334 (2002).

49. M. Hinze and K. Kunisch: An optimal memory-reduced procedure for calculat-ing adjoints of the instationary Navier-Stokes equations, to appear in OptimalControl Applications and Methods, 2006

50. R. Hoppe: Adaptive finite element methods for PDE-constrained optimal controlproblems, Lecture at Workshop on PDE Constrained Optimization, Center forInternational Mathematics, Tomar, Portugal, July 26–29,2005.

51. R. H. W. Hoppe, Y. Iliash, C. Iyyunni, and N. H. Sweilam: A posteriori errorestimates for adaptive finite element discretizations of boundary control problems,J. Numer. Math. 14, 57-82, 2006.

52. H. Johansson and K. Runesson: Parameter identification in constitutive modelsvia optimization with a posteriori error control, Int. J. Numner. Meth. Engrg 62,1315-1340 (2005).

53. H. Johansson, K. Runesson, and F. Larsson: Parameter identification with sensi-tivity assessment and error control, Department of Applied Mechanics, ChalmersUniversity of Technology, Goteborg, Sweden, to appear in Special Volume on Pa-rameter Identification, GAMM-Mitteilungen, 2006.

54. H. Kapp: Adaptive Finite Element Methods for Optimization in Partial Dif-ferential Equations, Dissertation, Institute of Applied Mathematics, Univ. ofHeidelberg, 2000.

55. T. Karkkainen: Error estimates for distributed parameter identification in linearelliptic equations, J. Math. Syst. Estim. Control 6, 117–120 (1996).

56. K. Kunisch and A. Rosch: Primal-dual active set strategy for a general class ofconstrained optimal control problems, SIAM J. Optim. 13, 321-334 (2002).

57. A. Kunoth: Adaptive wavelet schemes for an elliptic control problem with Dirich-let boundary control, Numer. Algor. 39, 199-220 (2005).

58. J.-L. Lions: Optimal Control of Systems Governed by Partial Differential Equa-tions, Springer, Berlin-Heidelberg-New York, 1971.

59. W. Liu: Recent advances in mesh adaptivity for optimal control problems, in FastSolution of Discretized Optimization Problems (K.-H. Hoffmann et al., eds), pp.154-166, ISNM 138, Birkhauser, Basel, 2001.

Page 50: On the adaptive discretization of PDE-based optimization ... · On the adaptive discretization of PDE-based optimization problems Rolf Rannacher∗ Institute of Applied Mathematics,

50 Rolf Rannacher

60. W. Liu: A posteriori error estimates of DG time-stepping method for optimalcontrol problems governed by parabolic equations, SIAM J. Numer.Anal, 2003.

61. W. Liu, R. Li, H. P. Ma, and T. Tang: Adaptive finite element method for ellipticoptimal control, SIAM J. Control. Optim. 41, 1321-1349 (2002).

62. W. Liu and N. Yan: A posteriori error estimates for some model boundary controlproblems, J. Comput. Appl- Math. 120, 159–173 (2000).

63. W. Liu and N. Yan: A posteriori error estimates for optimal boundary control,SIAM. J. Numer. Anal. 39, 73-99 (2001).

64. W. Liu and N. Yan: A posteriori error estimates for distributed optimal controlproblems, Advan. Comp. Math. 15, 285-309 (2001).

65. W. Liu and N. Yan: A posteriori error estimates for control problems governedby parabolic equations, Numer. Math. 93, 497–521 (2003).

66. W. Liu and N. Yan: A posteriori error estimates for nonlinear elliptic controlProblems, App. Numer. Anal. 47, 173-187 (2003).

67. W. Liu and N. Yan: A posteriori error estimates for control problems governedby Stokes’ equations, SIAM J. Numer. Anal. 40, 1805-1869 (2003).

68. D. Meidner and B. Vexler: Adaptive space-time finite element methods forparabolic optimization problems, Preprint, Institute of Applied Mathematics,University of Heidelberg, 2006.

69. B. Mohammadi: Mesh adaption and automatic differentiation for optimal shapedesign, Int. J. Comput. Fluid Dyn. 10, 199-211 (1998).

70. P. Neittaanmaki and X.-C. Tai: Error estimates for numerical identification ofdistributed parameters, J. Comput. Math. 10, 66–78 (1992).

71. P. Neittaanmaki and S. Repin: Reliable methods for computer simulation. Errorcontrol and a posteriori estimates, Studies in Mathematics and its Applications,Elsevier Science B. V., Amsterdam, 2004.

72. C. Nystedt. A priori and a posteriori error estimates and adaptive finite el-ement methods for a model eigenvalue problem. Technical Report N0 1995-05,Department of Mathematics, Chalmers University of Technology, 1995.

73. R. Rannacher: Adaptive finite element discretization of optimization problems,Proc. Conf. Numer. Analysis 1999 (D.F. Griffiths and G.A. Watson, eds.), pp.21-42, Chapman&Hall CRC, London, 2000.

74. R. Rannacher: Incompressible viscous flow, in Encyclopedia of ComputationalMechanics (E. Stein, R. de Borst, T.J.R. Hughes, eds), Volume 3 ‘Fluids’, JohnWiley, Chichester, 2004.

75. R. Rannacher and F.-T. Suttmeier: Error estimation and adaptive mesh designfor FE models in elasto-plasticity, in Error-Controlled Adaptive FEMs in SolidMechanics (E. Stein, ed.), pp. 5–52, John Wiley, Chichster, 2002.

76. R. Rannacher and B. Vexler: A priori error analysis for the finite element dis-cretization of elliptic parameter identification problems, SIAM J. Control Opti-mization 44, 1844–1863 (2005).

77. F. Troltzsch: Optimale Steuerung partieller Differentialgleichungen - Theorie,Verfahren und Anwendungen, Vieweg, 2005.

78. B. Vexler: Adaptive Finite Element Methods for Parameter Identification Prob-lems, Dissertation, Institute of Applied Mathematics, Univ. of Heidelberg, 2004.

79. B. Vexler: Finite element approximation of elliptic Dirichlet optimal controlproblems, SIAM J. Control Optim., submitted, 2005.