Introduction to the Modern Calculus of Variations · MA4G6 Lecture Notes Introduction to the Modern...

MA4G6 Lecture Notes

Introduction to theModern Calculus of Variations

Filip Rindler

Spring Term 2015

Filip RindlerMathematics InstituteUniversity of WarwickCoventry CV4 7ALUnited [email protected]

http://www.warwick.ac.uk/filiprindler

Copyright ©2015 Filip Rindler.Version 1.1.

mailto:[email protected]

http://www.warwick.ac.uk/filiprindler

Preface

These lecture notes, written for the MA4G6 Calculus of Variations course at the Universityof Warwick, intend to give a modern introduction to the Calculus of Variations. I have triedto cover different aspects of the field and to explain how they fit into the “big picture”.This is not an encyclopedic work; many important results are omitted and sometimes Ionly present a special case of a more general theorem. I have, however, tried to strike abalance between a pure introduction and a text that can be used for later revision of forgottenmaterial.

The presentation is based around a few principles:

• The presentation is quite “modern” in that I use several techniques which are perhapsnot usually found in an introductory text or that have only recently been developed.

• For most results, I try to use “reasonable” assumptions, not necessarily minimal ones.

• When presented with a choice of how to prove a result, I have usually preferred the(in my opinion) most conceptually clear approach over more “elementary” ones. Forexample, I use Young measures in many instances, even though this comes at theexpense of a higher initial burden of abstract theory.

• Wherever possible, I first present an abstract result for general functionals defined onBanach spaces to illustrate the general structure of a certain result. Only then do Ispecialize to the case of integral functionals on Sobolev spaces.

• I do not attempt to trace every result to its original source as many results have evolvedover generations of mathematicians. Some pointers to further literature are given inthe last section of every chapter.

Most of what I present here is not original, I learned it from my teachers and col-leagues, both through formal lectures and through informal discussions. Published worksthat influenced this course were the lecture notes on microstructure by S. Muller [73], theencyclopedic work on the Calculus of Variations by B. Dacorogna [25], the book on Youngmeasures by P. Pedregal [81], Giusti’s more regularity theory-focused introduction to theCalculus of Variations [44], as well as lecture notes on several related courses by J. Ball,J. Kristensen, A. Mielke. Further texts on the Calculus of Variations are the elementaryintroductions by B. van Brunt [96] and B. Dacorogna [26], the more classical two-part trea-tise [39, 40] by M. Giaquinta & S. Hildebrandt, as well as the comprehensive treatment of

Com

men

t...

http://hyperurl.co/t1zpfn?entry.1778399955=1.1-i

ii PREFACE

the geometric measure theory aspects of the Calculus of Variations by Giaquinta, Modica& Soucek [41, 42].

In particular, I would like to thank Alexander Mielke and Jan Kristensen, who are re-sponsible for my choice of research area. Through their generosity and enthusiasm in shar-ing their knowledge they have provided me with the foundation for my study and research.Furthermore, I am grateful to Richard Gratwick and the participants of the MA4G6 coursefor many helpful comments, corrections, and remarks.

These lecture notes are a living document and I would appreciate comments, correctionsand remarks – however small. They can be sent back to me via e-mail at

[email protected]

or anonymously via

http://tinyurl.com/MA4G6-2015

or by scanning the following QR-Code:

Moreover, every page contains its individual QR-code for immediate feedback. I encourageall readers to make use of them.

F. R.January 2015

Com

men

t...

mailto:[email protected]

http://tinyurl.com/MA4G6-2015

http://hyperurl.co/t1zpfn?entry.1778399955=1.1-ii

Contents

Preface i

Contents iii

1 Introduction 11.1 The brachistochrone problem . . . . . . . . . . . . . . . . . . . . . . . . . 21.2 Electrostatics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41.3 Stationary states in quantum mechanics . . . . . . . . . . . . . . . . . . . 51.4 Hyperelasticity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61.5 Optimal saving and consumption . . . . . . . . . . . . . . . . . . . . . . . 81.6 Sailing against the wind . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

2 Convexity 112.1 The Direct Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122.2 Functionals with convex integrands . . . . . . . . . . . . . . . . . . . . . . 142.3 Integrands with u-dependence . . . . . . . . . . . . . . . . . . . . . . . . 182.4 The Euler–Lagrange equation . . . . . . . . . . . . . . . . . . . . . . . . . 192.5 Regularity of minimizers . . . . . . . . . . . . . . . . . . . . . . . . . . . 242.6 Lavrentiev gap phenomenon . . . . . . . . . . . . . . . . . . . . . . . . . 312.7 Invariances and the Noether theorem . . . . . . . . . . . . . . . . . . . . . 332.8 Side constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 372.9 Notes and bibliographical remarks . . . . . . . . . . . . . . . . . . . . . . 41

3 Quasiconvexity 433.1 Quasiconvexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 443.2 Null-Lagrangians . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 513.3 Young measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 543.4 Gradient Young measures . . . . . . . . . . . . . . . . . . . . . . . . . . . 623.5 Gradient Young measures and rigidity . . . . . . . . . . . . . . . . . . . . 663.6 Lower semicontinuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 683.7 Integrands with u-dependence . . . . . . . . . . . . . . . . . . . . . . . . 733.8 Regularity of minimizers . . . . . . . . . . . . . . . . . . . . . . . . . . . 743.9 Notes and bibliographical remarks . . . . . . . . . . . . . . . . . . . . . . 75

Com

men

t...

http://hyperurl.co/t1zpfn?entry.1778399955=1.1-iii

iv CONTENTS

4 Polyconvexity 774.1 Polyconvexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 784.2 Existence Theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 794.3 Global injectivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 854.4 Notes and bibliographical remarks . . . . . . . . . . . . . . . . . . . . . . 86

5 Relaxation 875.1 Quasiconvex envelopes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 885.2 Relaxation of integral functionals . . . . . . . . . . . . . . . . . . . . . . . 915.3 Young measure relaxation . . . . . . . . . . . . . . . . . . . . . . . . . . . 955.4 Characterization of gradient Young measures . . . . . . . . . . . . . . . . 1015.5 Notes and bibliographical remarks . . . . . . . . . . . . . . . . . . . . . . 105

A Prerequisites 107A.1 General facts and notation . . . . . . . . . . . . . . . . . . . . . . . . . . 107A.2 Measure theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109A.3 Functional analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111A.4 Sobolev and other function spaces spaces . . . . . . . . . . . . . . . . . . 112

Bibliography 117

Index 125

Com

men

t...

http://hyperurl.co/t1zpfn?entry.1778399955=1.1-iv

Chapter 1

Introduction

Physical modeling is a delicate business – trying to balance the validity of the resultingmodel, that is, its agreement with nature and its predictive capabilities, with the feasibilityof its mathematical treatment is often highly non-straightforward. In this quest to formulatea useful picture of an aspect of the world, it turns out on surprisingly many occasions thatthe formulation of the model becomes clearer, more compact, or more descriptive, if oneintroduces some form of a variational principle, that is, if one can define a quantity, such asenergy or entropy, which obeys a minimization, maximization or saddle-point law.

How “fundamental” such a scalar quantity is perceived depends on the situation: Forexample, in classical mechanics, one calls forces conservative if they can be expressed asthe derivative of an “energy potential” along a curve. It turns out that many, if not most,forces in physics are conservative, which seems to imply that the concept of energy is fun-damental. Furthermore, Einstein’s famous formula E = mc2 expresses the equivalence ofmass and energy, positioning energy as the “most” fundamental quantity, with mass becom-ing “condensed” energy. On the other hand, the other ubiquitous scalar physical quantity,the entropy, has a more “artificial” flavor as a measure of missing information in a model,as is now well understood in Statistical Mechanics, where entropy becomes a derived quan-tity. We leave it to the field of Philosophy of Science to understand the nature of variationalconcepts. In the words of William Thomson, the Lord Kelvin1:

The fact that mathematics does such a good job of describing the Universe isa mystery that we don’t understand. And a debt that we will probably never beable to repay.

Our approach to variational quantities here is a pragmatic one: We see them as providingstructure to a problem and to enable different mathematical, so-called variational, methods.For instance, in elasticity theory it is usually unrealistic that a system will tend to a globalminimizer by itself, but this does not mean that – as an approximation – such a principlecannot be useful in practice. On a fundamental level, there probably is no minimization assuch in physics, only quantum-mechanical interactions (or whatever lies “below” quantum

1Quotes prove nothing, however. Thomson also said “X-rays will prove to be a hoax”.

Com

men

t...

http://hyperurl.co/t1zpfn?entry.1778399955=1.1-1

2 CHAPTER 1. INTRODUCTION

mechanics). However, if we wait long enough, the inherent noise in a realistic system willmove our system around and it will settle with a high probability in a low-energy state.

In this course, we will focus solely on minimization problems for integral functionalsof the form (Ω ⊂ Rd open and bounded)∫

Ωf (x,u(x),∇u(x)) dx → min, u : Ω → Rm,

possibly under conditions on the boundary values of u and further side constraints. Theseproblems form the original core of the Calculus of Variations and are as relevant today asthey have always been. The systematic understanding of these integral functionals startsin Euler’s and Bernoulli’s times in the late 1600s and the early 1700s, and their studywas boosted into the modern age by Hilbert’s 19th, 20th, 23rd problems, formulated in1900 [47]. The excitement has never stopped since and new and important discoveries aremade every year.

We start this course by looking at a parade of examples, which we treat at varying levelsof detail. All these problems will be investigated further along the course once we havedeveloped the necessary mathematical tools.

1.1 The brachistochrone problem

In June 1696 Johann Bernoulli published a problem in the journal Acta Eruditorum, whichcan be seen as the time of birth of the Calculus of Variations (the name, however, is fromLeonhard Euler’s 1766 treatise Elementa calculi variationum). Additionally, Bernoulli senta letter containing the question to Gottfried Wilhelm Leibniz on 9 June 1696, who returnedhis solution only a few days later on 16 June, and commented that the problem tempted him“like the apple tempted Eve” because of its beauty. Isaac Newton published a solution (afterthe problem had reached him) without giving his identity, but Bernoulli identified him “exungue leonem” (from Latin, “by the lion’s claw”). The problem was formulated as follows2:

Given two points A and B in a vertical [meaning “not horizontal”] plane, oneshall find a curve AMB for a movable point M, on which it travels from A to Bin the shortest time, only driven by its own weight.

The resulting curve is called the brachistochrone (from Ancient Greek, “shortest time”)curve.

We formulate the problem as follows: We look for the curve y connecting the origin(0,0) to the point (x, y), where x > 0, y < 0, such that a point mass m > 0 slides from restat (0,0) to (x, y) quickest among all such curves. See Figure 1.1 for several possible slidepaths. We parametrize a point (x,y) on the curve by the time t ≥ 0. The point mass has

2The German original reads “Wenn in einer verticalen Ebene zwei Punkte A und B gegeben sind, sollman dem beweglichen Punkte M eine Bahn AMB anweisen, auf welcher er von A ausgehend vermoge seinereigenen Schwere in kurzester Zeit nach B gelangt.”

Com

men

t...

1.1. THE BRACHISTOCHRONE PROBLEM 3

Figure 1.1: Several slide curves from the origin to (x, y).

kinetic and potential energies

Ekin =m2

(dxdt

)2

+

(dydt

)2=

m2

(dxdt

)21+(

dydx

)2,

Epot = mgy,

where g ≈ 9.81m/s2 is the (constant) gravitational acceleration on Earth. The total energyEkin +Epot is zero at the beginning and conserved along the path. Hence, for all y we have

m2

(dxdt

)21+(

dydx

)2=−mgy.

We can solve this for dt/dx (where t = t(x) is the inverse of the x-parametrization) to get

dtdx

=

√1+(y′)2

−2gy

(dtdx

≥ 0),

where we wrote y′ = dydx . Integrating over the whole x-length along the curve from 0 to x,

we get for the total time duration T [y] that

T [y] =1√2g

∫ x

0

√1+(y′)2

−ydx.

We may drop the constant in front of the integral, which does not influence the minimizationproblem, and set x = 1 by reparametrization, to arrive at the problemF [y] :=

∫ 1

0

√1+(y′)2

−ydx → min,

y(0) = 0, y(1) = y < 0.

Com

men

t...


Clearly, the integrand is convex in y′, which will be important for the solution theory. Wewill come back to this problem in Example 2.34.

A related problem concerns Fermat’s Principle expressing that light (in vacuum or in amedium) always takes the fastest path between two points, from which many optical laws,e.g. about reflection and refraction, can be derived.

1.2 Electrostatics

Consider an electric charge density ρ : R3 → R (in units of C/m3) in three-dimensionalvacuum. Let E : R3 → R3 (in V/m) and B : R3 → R3 (in T = Vs/m2) be the electric andmagnetic fields, respectively, which we assume to be constant in time (hence electrostatics).Assuming that R3 is a linear, homogeneous, isotropic electric material, the Gauss law forelectricity reads

∇ ·E = divE =ρε0,

where ε0 ≈ 8.854 ·10−12 C/(Vm). Moreover, we have the Faraday law of induction

∇×E = curlE =dBdt

= 0.

Thus, since E is curl-free, there exist an electric potential φ : R3 → R (in V) such that

E =−∇φ.

Combining this with the Gauss law, we arrive at the Poisson equation,

∆φ = ∇ · [∇φ] =− ρε0. (1.1)

We can also look at electrostatics in a variational way: We use the norming conditionφ(0) = 0. Then, the electric potential energy UE(x;q) of a point charge q > 0 (in C) at pointx ∈ R3 in the electric field E is given by the path integral (which does not depend on thepath chosen since E is a gradient)

UE(x;q) =−∫ x

0qE ·ds =−

∫ 1

0qE(hx) dh = qφ(x).

Thus, the total electric energy of our charge distribution ρ is (the factor 1/2 is necessary tocount mutual reaction forces correctly)

UE :=12

∫R3

ρφ dx =ε0

2

∫R3(∇ ·E)φ dx,

which has units of CV = J. Using (∇ ·E)φ = ∇ · (Eφ)−E · (∇φ), the divergence theorem,and the natural assumption that φ vanishes at infinity, we get further

UE =ε0

2

∫R3

∇ · (Eφ)−E · (∇φ) dx =−ε0

2

∫R3

E · (∇φ) dx =ε0

2

∫R3

|∇φ|2 dx.

Com

men

t...


1.3. STATIONARY STATES IN QUANTUM MECHANICS 5

The integral∫

Ω |∇φ|2 dx is called Dirichlet integral.In Example 2.15 we will see that (1.1) is equivalent to the minimization problem

UE +∫R3

ρφ dx =∫R3

ε0

2|∇φ |2 +ρφ dx → min .

The second term is the energy stored in the electric field caused by its interaction with thecharge density ρ .

1.3 Stationary states in quantum mechanics

The non-relativistic evolution of a quantum mechanical system with N degrees of freedomin an electrical field is described completely through its wave function Ψ : RN ×R → Cthat satisfies the Schrodinger equation

ihddt

Ψ(x, t) =[−h2µ

∆+V (x, t)]

Ψ(x, t), x ∈ RN , t ∈ R,

where h ≈ 1.05 · 10−34 Js is the reduced Planck constant, µ > 0 is the reduced mass, andV =V (x, t) ∈R is the potential energy. The (self-adjoint) operator H :=−(2µ)−1h∆+V iscalled the Hamiltonian of the system.

The value of the wave function itself at a given point in spacetime has no physical inter-pretation, but according to the Copenhagen Interpretation of quantum mechanics, |Ψ(x)|2is the probability density of finding a particle at the point x. In order for |Ψ|2 to be a proba-bility density, we need to impose the side constraint

∥Ψ∥L2 = 1.

In particular, Ψ(x, t) has to decay as |x| → ∞.Of particular interest are the so-called stationary states, that is, solutions of the station-

ary Schrodinger equation[−h2

2µ∆+V (x)

]Ψ(x) = EΨ(x), x ∈ RN ,

where E > 0 is an energy level. We then solve this eigenvalue problem (in a weak sense) forΨ ∈ W1,2(RN ;C) with ∥Ψ∥L2 = 1 (see Appendix A.4 for a collection of facts about Sobolevspaces like W1,2).

If we are just interested in the lowest-energy state, the so-called ground state, we canfind minimizers of the energy functional

E [Ψ] :=∫RN

h2

2µ|∇Ψ(x)|2 + 1

2V (x)|Ψ(x)|2 dx,

again under the side constraint

∥Ψ∥L2 = 1.

The two parts of the integral above correspond to kinetic and potential energy, respectively.We will continue this investigation in Example 2.39.

Com

men

t...


1.4 Hyperelasticity

Elasticity theory is one of the most important theories of continuum mechanics, that is, thestudy of the mechanics of (idealized) continuous media. We will not go into much detailabout elasticity modeling here and refer to the book [20] for a thorough introduction.

Consider a body of mass occupying a domain Ω ⊂ R3, called the reference configu-ration. If we deform the body, any material point x ∈ Ω is mapped into a spatial pointy(x) ∈ R3 and we call y(Ω) the deformed configuration. For a suitable continuum mechan-ics theory, we also need to require that y : Ω → y(Ω) is a differentiable bijection and that itis orientation-preserving, i.e. that

det∇y(x)> 0 for all x ∈ Ω.

For convenience one also introduces the displacement

u(x) := y(x)− x.

One can show (using the implicit function theorem and the mean value theorem) that theorientation-preserving condition and the invertibility are satisfied if

supx∈Ω

|∇u(x)|< δ (Ω) (1.2)

for a domain-dependent (small) constant δ (Ω)> 0.Next, we need a measure of local “stretching”, called a strain tensor, which should serve

as the parameter to a local energy density. On physical grounds, rigid body motions, that is,deformations u(x)=Rx+u0 with a rotation R∈R3×3 (RT =R−1 and detR= 1) and u0 ∈R3,should not cause strain. In this sense, strain measures the deviation of the displacement froma rigid body motion. One popular choice is the Green–St. Venant strain tensor3

E :=12(∇u+∇uT +∇uT ∇u

). (1.3)

We first consider fully nonlinear (“finite strain”) elasticity. For our purposes we simplypostulate the existence of a stored-energy density W : R3×3 → [0,∞] and an external bodyforce field b : Ω → R3 (e.g. gravity) such that

F [y] :=∫

ΩW (∇y)−b · y dx

represents the total elastic energy stored in the system. If the elastic energy can be writtenin this way as

∫ΩW (∇y(x)) dx, we call the material hyperelastic. In applications, W is

often given as depending on the Green–St. Venant strain tensor E instead of ∇y, but for ourmathematical theory, the above form is more convenient. We require several properties ofW for arguments F ∈ R3×3:

3There is an array of choices for the strain tensor, but they are mostly equivalent within nonlinear elasticity.

Com

men

t...

1.4. HYPERELASTICITY 7

(i) The undeformed state costs no energy: W (I) = 0.

(ii) Invariance under change of frame4: W (RF) =W (F) for all R ∈ SO(3).

(iii) Infinite compression costs infinite energy: W (F)→+∞ as detF ↓ 0.

(iv) Infinite stretching costs infinite energy: W (F)→+∞ as |F | → ∞.

The main problem of nonlinear hyperelasticity is to minimize F as above over ally : Ω → R3 with given boundary values. Of course, it is not a-priori clear in which spacewe should look for a solution. Indeed, this depends on the growth properties of W . Forexample, for the prototypical choice

W (F) := dist(F,SO(3))2,

which however does not satisfy (3) from our list of requirements, we would look for squareintegrable functions. More realistic in applications are W of the form

W (F) :=M

∑i=1

ai tr[(FT F)γi/2]+ N

∑j=1

b j tr cof[(FT F)δ j/2]+Γ(detF), F ∈ R3×3,

where ai > 0, γi ≥ 1, b j > 0, δ j ≥ 1, and Γ : R → R∪ +∞ is a convex function withΓ(d) → +∞ as d ↓ 0, Γ(d) = +∞ for s ≤ 0. These materials are called Ogden materialsand occur in a wide range of applications.

In the setting of linearized elasticity, we make the “small strain” assumption that thequadratic term in (1.3) can be neglected and that (1.2) holds. In this case, we work with thelinearized strain tensor

E u :=12(∇u+∇uT ).

In this infinitesimal deformation setting, the displacements that do not create strain areprecisely the skew-affine maps5 u(x) = Qx+u0 with QT =−Q and u0 ∈ R.

For linearized elasticity we consider an energy of the special “quadratic” form

F [u] :=∫

Ω

12E u : CE u−b ·u dx,

where C(x) = Ci jkl(x) is a symmetric, strictly positive definite (A : C(x)A > c|A|2 for somec > 0) fourth-order tensor, called the elasticity/stiffness tensor, and b : Ω → R3 is ourexternal body force. Thus we will always look for solutions in W1,2(Ω;R3).

For homogeneous, isotropic media, C does not depend on x or the direction of strain,which translates into the additional condition

(AR) : C(x)(AR) = A : C(x)A for all x ∈ Ω, A ∈ R3×3 and R ∈ SO(3).4Rotating or translating the (inertial) frame of reference must give the same result.5This becomes more meaningful when considering a bit more algebra: The Lie group SO(3) of rotations

has as its Lie algebra Lie(SO(3)) = so(3) the space of all skew-symmetric matrices, which then can be seen as“infinitesimal rotations”.

Com

men

t...



In this case, it can be shown that F simplifies to

F [u] =12

∫Ω

2µ|E u|2 +(

κ − 23

µ)| trE u|2 −b ·u dx

for µ > 0 the shear modulus and κ > 0 the bulk modulus, which are material constants. Forexample, for cold-rolled steel µ ≈ 75 GPa and κ ≈ 160 GPa.

1.5 Optimal saving and consumption

Consider a capitalist worker earning a (constant) wage w per year, which he can either spendon consumption or save. Denote by S(t) the savings at time t, where t ∈ [0,T ] is in years,with t = 0 denoting the beginning of his work life and t = T his retirement. Further, letC(t) be the consumption rate (consumption per time) at time t. On the saved capital, theworker earns interest, say with gross-continuous rate ρ > 0, meaning that a capital amountm > 0 grows as exp(ρt)m. If we were given an APR ρ1 > 0 instead of ρ , we could computeρ = ln(1+ ρ1). We further assume that salary is paid continuously, not in intervals, forsimplicity. So, w is really the rate of pay, given in money per time. Then, the worker’ssavings evolve according to the differential equation

S(t) = w+ρS(t)−C(t). (1.4)

Now, assume that our worker is mathematically inclined and wants to optimize thehappiness due to consumption in his life by finding the optimal amount of consumptionat any given time. Being a pure capitalist, the worker’s happiness only depends on hisconsumption rate C. So, if we denote by U(C) the utility function, that is, the “happiness”due to the consumption rate C, our worker wants to find C : [0,T ]→ R such that

H [C] :=∫ T

0U(C(t)) dt

is maximized. The choice of U depends on our worker’s personality, but it is probably real-istic to assume that there is a law of diminishing returns, i.e. for twice as much consumption,our worker is happier, but not twice as happy. So, let us assume U ′ > 0 and U ′(C)→ 0 asC → ∞. Also, we should have U(0) = −∞ (starvation). Moreover, it is realistic for U tobe concave, which implies that there are no local maxima. One function that satisfies all ofthese requirements is the logarithm, and so we use

U(C) = ln(C), C > 0.

Let us also assume that the worker starts with no savings, S(0) = 0, and at the endof his work life wants to retire with savings S(T ) = ST ≥ 0. Rearranging (1.4) for C(t)and plugging this into the formula for H , we therefore want to solve the optimal savingproblemF [S] :=

∫ T

0− ln(w+ρS(t)− S(t)) dt → min,

S(0) = 0, S(T ) = ST ≥ 0.

This will be solved in Example 2.18.

Com

men

t...


1.6. SAILING AGAINST THE WIND 9

Figure 1.2: Sailing against the wind in a channel.

1.6 Sailing against the wind

Every sailor knows how to sail against the wind by “beating”: One has to sail at an angle ofapproximately 45 to the wind6, then tack (turn the bow through the wind) and finally, afterthe sail has caught the wind on the other side, continue again at approximately 45 to thewind. Repeating this procedure makes the boat follow a zig-zag motion, which gives a netmovement directly against the wind, see Figure 1.2. A mathematically inclined sailor mightask the question of “how often to tack”. In an idealized model we can assume that tackingcosts no time and that the forward sailing speed vs of the boat depends on the angle α to thewind as follows (at least qualitatively):

vs(α) =−cos(4α),

which has maxima at α =±45 (in real boats, the maximum might be at a lower angle, i.e.“closer to the wind”). Assume furthermore that our sailor is sailing along a straight riverwith the current. Now, the current is fastest in the middle of the river and goes down to zeroat the banks. In fact, a good approximation would be the formula of Poiseuille (channel)flow, which can be derived from the flow equations of fluids: At distance r from the centerof the river the current’s flow speed is approximately

vc(r) := vmax

(1− r2

R2

),

where R > 0 is half the width of the river.

6The optimal angle depends on the boat. Ask a sailor about this and you will get an evening-filling lecture.

Com

men

t...



If we denote by r(t) the distance of our boat from the middle of the channel at timet ∈ [0,T ], then the total speed (called the “velocity made good” in sailing parlance) is

v(t) := vs(arctanr′(t))+ vc(r(t))

=−cos(4arctanr′(t))+ vmax

(1− r(t)2

R2

).

The key to understand this problem is now the fact that a 7→ −cos(4arctana) has two max-ima at a =±1. We say that this function is a double-well potential.

The total forward distance traveled over the time interval [0,T ] is∫ T

0v(t) dt =

∫ T

0−cos(4arctanr′(t))+ vmax

(1− r(t)2

R2

)dt.

If we also require the initial and terminal conditions r(0) = r(T ) = 0, we arrive at theoptimal beating problem:F [r] :=

∫ T

0cos(4arctanr′(t))− vmax

(1− r(t)2

R2

)dt → min,

r(0) = r(T ) = 0, |r(t)| ≤ R.

Our intuition tells us that in this idealized model, where tacking costs no time, we shouldbe tacking as “infinitely fast” in order to stay in the middle of the river. Later, once we haveadvanced tools at our disposal, we will make this idea precise, see Example 5.6.

Com

men

t...


Chapter 2

Convexity

In the introduction we got a glimpse of the many applications of minimization problems inphysics, technology, and economics. In this chapter we start to develop the abstract theory.Consider a minimization problem of the formF [u] :=

∫Ω

f (x,u(x),∇u(x)) dx → min,

over all u ∈ W1,p(Ω;Rm) with u|∂Ω = g.

Here, and throughout the course (if not otherwise stated) we will make the standard assump-tion that Ω ⊂ Rd is a bounded Lipschitz domain, that is, Ω is open, bounded, and has aLipschitz boundary. Furthermore, we assume 1 < p < ∞, so that the integrand

f : Ω×Rm ×Rm×d → R,

is measurable in the first and continuous in the second and third arguments (hence f is aCaratheodory integrand). For the prescribed boundary values g we assume

g ∈ W1−1/p,p(∂Ω;Rm).

In this context recall that W1−1/p,p(∂Ω;Rm) is the space of all traces of functions u ∈W1,p(Ω;Rm), see Appendix A.4 for some background on Sobolev spaces. These conditionswill be assumed from now on unless stated otherwise. Furthermore, to make the aboveintegral finite, we assume the growth bound

| f (x,v,A)| ≤ M(1+ |v|p + |A|p) for some M > 0.

Below, we will investigate the solvability and regularity properties of this minimizationproblem; in particular, we will take a close look at the way in which convexity propertiesof f in its gradient (third) argument determine whether F is lower semicontinuous, whichis the decisive attribute in the fundamental Direct Method of the Calculus of Variations.

We will also consider the partial differential equation associated with the minimizationproblem, the so-called Euler–Lagrange equation that is a necessary (but not sufficient) con-dition for minimizers. This leads naturally to the question of regularity of solutions. Finally,

Com

men

t...

12 CHAPTER 2. CONVEXITY

we also treat invariances, leading to the famous Noether theorem before having a glimpseat side constraints.

Most results here are already formulated for vector-valued u, as this has many applica-tions. Some aspects, however, in particular regularity theory, are specific to scalar problems,where u is assumed to be merely real-valued.

2.1 The Direct Method

Fundamental to all of the existence theorems in this text is the conceptionally simple DirectMethod1 of the Calculus of Variations. Let X be a topological space2 (e.g. a Banach spacewith the strong or weak topology) and let F : X → R∪+∞ be our objective functionalsatisfying the following two assumptions:

(H1) Coercivity: For all Λ ∈ R, the sublevel set x ∈ X : F [x]≤ Λ is sequentially rel-atively compact, that is, if F [x j]≤ Λ for a sequence (x j)⊂ X and some Λ ∈ R, then(x j) has a convergent subsequence.

(H2) Lower semicontinuity: For all sequences (x j)⊂ X with x j → x it holds that

F [x]≤ liminfj→∞

F [x j].

Consider the abstract minimization problem

F [x]→ min over all x ∈ X . (2.1)

Then, the Direct Method is the following simple strategy:

Theorem 2.1 (Direct Method). Assume that F is both coercive and lower semicontinu-ous. Then, the abstract problem (2.1) has at least one solution, that is, there exists x∗ ∈ Xwith F [x∗] = minF [x] : x ∈ X .

Proof. Let us assume that there exists at least one x ∈ X such that F [x] < +∞; otherwise,any x ∈ X is a “solution” to the (degenerate) minimization problem.

To construct a minimizer we take a minimizing sequence (x j)⊂ X such that

limj→∞

F [x j]→ α := inf

F [x] : x ∈ X<+∞.

Then, since convergent sequences are bounded, there exists Λ ∈ R such that F [x j]≤ Λ forall j ∈N. Hence, by the coercivity (H1), the sequence (x j) is contained in a compact subset

1“Direct” refers to the fact that we construct solutions without employing a differential equation.2In fact, we only need a space together with a notion of convergence as all our arguments will involve

sequences as opposed to more general topological tools such as nets. In all of this text we use “topology” and“convergence” interchangeably.

Com

men

t...

2.1. THE DIRECT METHOD 13

of X . Now select a subsequence, which here as in the following we do not renumber, suchthat

x j → x∗ ∈ X .

By the lower semicontinuity (H2), we immediately conclude

α ≤ F [x∗]≤ liminfj→∞

F [x j] = α.

Thus, F [x∗] = α and x∗ is the sought minimizer.

Example 2.2. Using the Direct Method, one can easily see that

h(t) :=

1− t if t < 0,t if t ≥ 0,

t ∈ R (= X),

has minimizer t = 0, despite not even being continuous there.

Despite its nearly trivial proof, the Direct Method is very useful and flexible in applica-tions. Of course, neither coercivity nor lower semicontinuity are necessary for the existenceof a minimizer. For instance, we really only needed coercivity and lower semicontinuityalong a minimizing sequence, but this weaker condition is hard to check in most practicalsituations; however, such a refined approach can be useful in special cases.

The abstract principle of the Direct Method pushes the difficulty in proving existenceof a minimizer toward establishing coercivity and lower semicontinuity. This, however, is abig advantage, since we have many tools at our disposal to establish these two hypothesesseparately. In particular, for integral functionals lower semicontinuity is tightly linked toconvexity properties of the integrand, as we will see throughout this course.

At this point it is crucial to observe how coercivity and lower semicontinuity interactwith the topology (or, more accurately, convergence) in X : If we choose a stronger topology,i.e. one for which there are fewer convergent sequences, then it becomes easier for F tobe lower semicontinuous, but harder for F to be coercive. The opposite holds if choosinga weaker topology. In the mathematical treatment of a problem we are most likely in asituation where F and X are given by the concrete problem. We then need to find a suitabletopology in which we can establish both coercivity and lower semicontinuity.

In this text, X will always be an infinite-dimensional Banach space and we have a realchoice between using the strong or weak convergence. Usually, it turns out that coerciv-ity with respect to the strong convergence is false since strongly compact sets in infinite-dimensional spaces are very restricted, whereas coercivity with respect to the weak conver-gence is true under reasonable assumptions. However, whereas strong lower semicontinuityposes few challenges, lower semicontinuity with respect to weakly converging sequences isa delicate matter and we will spend considerable time on this topic.

We will always use the Direct Method in the following version:

Theorem 2.3 (Direct Method for weak convergence). Let X be a Banach space and letF : X → R∪+∞. Assume the following:

Com

men

t...

(WH1) Weak Coercivity: For all Λ ∈ R, the sublevel set

x ∈ X : F [x]≤ Λ is sequentially weakly relatively compact,

that is, if F [x j]≤ Λ for a sequence (x j)⊂ X and some Λ ∈R, then (x j) has a weaklyconvergent subsequence.

(WH2) Weak lower semicontinuity: For all sequences (x j) ⊂ X with x j x (weak conver-gence X) it holds that


F [x j].

Then, the minimization problem

F [x]→ min over all x ∈ X,

has at least one solution.

Before we move on to the more concrete theory, let us record one further remark: Whileit might appear as if “nature does not give us the topology” and it is up to mathematicians to“invent” a suitable one, it is remarkable that the topology that turns out to be mathematicallyconvenient is also often physically relevant. This is for instance expressed in the followingobservation: The only phenomena which are present in weakly but not strongly convergingsequences are oscillations and concentrations, as can be seen in the classical Vitali Con-vergence Theorem A.6. On the other hand, these phenomena very much represent physicaleffects, for example in crystal microstructure. Thus, the fact that we work with the weakconvergence means that we have to consider microstructure (which might or might not beexhibited), a fact that we will come back to many times throughout this course.

2.2 Functionals with convex integrands

Returning to the minimization problem at the beginning of this chapter, we first considerthe minimization problem for the simpler functional

F [u] :=∫

Ωf (x,∇u(x)) dx

over all u∈W1,p(Ω;Rm) for some 1< p<∞ to be chosen later (depending on lower growthproperties of f ). The following lemma together with (2.3) allows us to conclude that F [u]is indeed well-defined.

Lemma 2.4. Let f : Ω×RN → R be a Caratheodory integrand, that is,

(i) x 7→ f (x,A) is Lebesgue-measurable for all fixed A ∈ RN .

(ii) A 7→ f (x,A) is continuous for all fixed x ∈ Ω.

Com

men

t...

2.2. FUNCTIONALS WITH CONVEX INTEGRANDS 15

Then, for any Lebesgue-measurable function V : Ω → RN the composition x 7→ f (x,V (x))is Lebesgue-measurable.

We will prove this lemma via the following “Lusin-type” theorem for Caratheodoryfunctions:

Theorem 2.5 (Scorza Dragoni). Let f : Ω×RN →R be Caratheodory. Then, there existsan increasing sequence of compact sets Sk ⊂ Ω (k ∈ N) with |Ω\Sk| ↓ 0 such that f |Sk×RN

is continuous.

See Theorem 6.35 in [37] for a proof of this theorem, which is rather measure-theoreticand similar to the proof of the Lusin theorem, cf. Theorem 2.24 in [85].

Proof of Lemma 2.4. Let Sk ⊂ Ω (k ∈ N) be as in the Scorza Dragoni Theorem. Set fk :=f |Sk×RN and

gk(x) := fk(x,V (x)), x ∈ Sk.

Then, for any open set E ⊂ R, the pre-image of E under gk is the set of all x ∈ Sk such that(x,V (x)) ∈ f−1

k (E). As fk is continuous, f−1k (E) is open and from the Lebesgue measura-

bility of the product function x 7→ (x,V (x)) we infer that g−1k (E) is a Lebesgue-measurable

subset of Sk. We can extend the definition of gk to all of Ω by setting gk(x) := 0 wheneverx ∈ Ω \ Sk. This function is still Lebesgue-measurable as Sk is compact, hence measur-able. The conclusion follows from the fact that gk(x)→ f (x,V (x)) and pointwise limits ofLebesgue-measurable functions are themselves Lebesgue-measurable.

We next investigate coercivity: Let 1 < p < ∞ The most basic assumption, and theonly one we want to consider here, to guarantee coercivity is the following lower growth(coercivity) bound:

µ|A|p −µ−1 ≤ f (x,A) for all x ∈ Ω, A ∈ Rm×d and some µ > 0. (2.2)

This coercivity also suggests the exponent p for the definition of the function spaces wherewe look for solutions. We further assume the upper growth bound

| f (x,A)| ≤ M(1+ |A|p) for all x ∈ Ω, A ∈ Rm×d and some M > 0, (2.3)

which gives finiteness of F [u] for all u ∈ W1,p(Ω;Rm).

Proposition 2.6. If f satisfies the lower growth bound (2.2), then F is weakly coercive onthe space W1,p

g (Ω;Rm) =

u ∈ W1,p(Ω;Rm) : u|∂Ω = g

.

Proof. We need to show that any sequence (u j)⊂ W1,pg (Ω;Rm) with

sup j F [u j]< ∞

Com

men

t...

is weakly relatively compact. From (2.2) we get

µ · sup j

∫Ω|∇u j|p dx− |Ω|

µ≤ sup j F [u j]< ∞,

whereby sup j ∥∇u j∥Lp < ∞. By the Poincare–Friedrichs inequality, Theorem A.16 (I), inconjunction with the fact that we fix the boundary values, we therefore get sup j ∥u j∥W1,p <

∞. We conclude by the fact that bounded sets in reflexive Banach spaces, like W1,p(Ω;Rm)for 1 < p < ∞, are sequentially weakly relatively compact, see Theorem A.11.

Having settled the question of (weak) coercivity, we can now turn to investigate theweak lower semicontinuity.

Theorem 2.7. If A 7→ f (x,A) is convex for all x∈Ω and f (x,A)≥ κ for some κ ∈R (whichfollows in particular from (2.2)), then F is weakly lower semicontinuous on W1,p(Ω;Rm).

Proof. Step 1. We first establish that F is strongly lower semicontinuous, so let u j → uin W1,p and ∇u j → ∇u almost everywhere (which holds after selecting a non-renumberedsubsequence, see Appendix A.2). By assumption we have

f (x,∇u j(x))−κ ≥ 0.

Thus, applying Fatou’s Lemma,

liminfj→∞

(F [u j]−κ|Ω|

)≥ F [u]−κ|Ω|.

Thus, F [u] ≤ liminf j→∞ F [u j]. Since this holds for all subsequences, it also follows forour original sequence.

Step 2. To show weak lower semicontinuity take (u j)⊂ W1,p(Ω;Rm) with u j u. Weneed to show that

F [u]≤ liminfj→∞

F [u j] =: α. (2.4)

Taking a subsequence (not relabeled), we can in fact assume that F [u j] converges to α .By the Mazur Lemma A.14, we may find convex combinations

v j =N( j)

∑n= j

θ j,nun, where θ j,n ∈ [0,1] andN( j)

∑n= j

θ j,n = 1,

such that v j → u strongly. Since f (x, q) is convex,

F [v j] =

∫Ω

f

(x,

N( j)

∑n= j

θ j,n∇un(x)

)dx ≤

N( j)

∑n= j

θ j,nF [un].

Now, F [un]→ α as n → ∞ and n ≥ j inside the sum. Thus,

liminfj→∞

F [v j]≤ α = liminfj→∞

F [u j].

On the other hand, from the first step and v j → u strongly, we have F [u]≤ liminf j→∞ F [v j].Thus, (2.4) follows and the proof is finished.

Com

men

t...

2.2. FUNCTIONALS WITH CONVEX INTEGRANDS 17

We can summarize our findings in the following theorem:

Theorem 2.8. Let f : Ω×Rm×d be a Caratheodory integrand such that f satisfies thelower growth bound (2.2), the upper growth bound (2.3) and is convex in its second argu-ment. Then, the associated functional F has a minimizer over the space W1,p

g (Ω;Rm).

Proof. This follows immediately from the Direct Method in Banach spaces, Theorem 2.3together with Proposition 2.6 and Theorem 2.7.

Example 2.9. The Dirichlet integral (functional) is

F [u] :=∫

Ω|∇u(x)|2 dx, u ∈ W1,2(Ω).

We already encountered this integral functional when considering electrostatics in Sec-tion 1.2. It is easy to see that this functional satisfies all requirements of Theorem 2.8and so there exists a minimizer for any prescribed boundary values g ∈ W1/2,2(∂Ω).

Example 2.10. In the prototypical problem of linearized elasticity from Section 1.4, weare tasked with the minimization problemF [u] :=

12

∫Ω

2µ|E u|2 +(

κ − 23

µ)| trE u|2 − f u dx → min,

over all u ∈ W1,2(Ω;R3) with u|∂Ω = g,

where µ,κ > 0, f ∈ L2(Ω;R3), and g ∈ W1/2,2(∂Ω;Rm). It is clear that F has quadraticgrowth. Let us first consider the coercivity: For u ∈ W1,2(Ω;R3) with u|∂Ω = g we have thefollowing version of the L2-Korn inequality

∥u∥2W1,2 ≤CK

(∥E u∥2

L2 +∥g∥2W1/2,2

).

Here, CK =CK(Ω)> 0 is a constant. Let us for a simplified estimate assume that κ− 23 µ ≥ 0

(this is not necessary, but shortens the argument). Then, also using the Young inequality,we get for any δ > 0,

F [u]≥ µ∥E u∥2L2 −∥ f∥L2∥u∥L2

≥ µ(∥E u∥2

L2 +∥g∥2W1/2,2

)− 1

2δ∥ f∥2

L2 −δ2∥u∥2

L2 −µ∥g∥2W1/2,2

≥(

µCK

− δ2

)∥u∥2

W1,2 −1

2δ∥ f∥2

L2 −µ∥g∥2W1/2,2 .

Choosing δ = µ/CK , we get the coercivity estimate

F [u]≥ µ2CK

∥u∥2W1,2 −

CK

2µ∥ f∥2

L2 −µ∥g∥2W1/2,2 .

Hence, F [u] controls ∥u∥W1,2 and our functional is weakly coercive. Moreover, it is clearthat our integrand is convex in in the E u-argument (note that the trace function is linear).Hence, Theorem 2.8 yields the existence of a solution u∗ ∈W1,2(Ω;R3) to our minimizationproblem of linearized elasticity.

Com

men

t...


We finish this section with the following converse to Theorem 2.7:

Proposition 2.11. Let F be an integral functional with a Caratheodory integrand f : Ω×Rm×d → R satisfying the upper growth bound (2.3). Assume furthermore that F is weaklylower semicontinuous on W1,p(Ω;Rm). If either m = 1 or d = 1 (the scalar case), thenA 7→ f (x,A) is convex for almost every x ∈ Ω.

Proof. We will not prove the full result here, since Proposition 3.27 together with Corol-lary 3.4 in the next chapter will give a stronger version (including the corresponding resultfor the vectorial case as well).

In the vectorial case, i.e. m = 1 or d = 1, the necessity of convexity is far from beingtrue and there is indeed another “canonical” condition ensuring weak lower semicontinuity,which is weaker than (ordinary) convexity; we will explore this in the next chapter.

2.3 Integrands with u-dependence

If we try to extend the results in the previous section to more general functionals

F [u] :=∫

Ωf (x,u(x),∇u(x)) dx → min,

we discover that our proof strategy of using the Mazur Lemma runs into difficulties: Wecannot “pull out” the convex combination inside∫

Ωf

(x,

N( j)

∑n= j

θ j,nun(x),N( j)

∑n= j

θ j,n∇un(x)

)dx

anymore. Nevertheless, a lower semicontinuity result analogous to the one for the scalarcase turns out to be true:

Theorem 2.12. Let f : Ω×Rm ×Rm×d → R satisfy the growth bound

| f (x,v,A)| ≤ M(1+ |v|p + |A|p) for some M > 0, 1 < p < ∞,

and the convexity property

A 7→ f (x,v,A) is convex for all (x,v) ∈ Ω×Rm.

Then, the functional

F [u] :=∫

Ωf (x,u(x),∇u(x)) dx, u ∈ W1,p(Ω;Rm),

is weakly lower semicontinuous.

While it would be possible to give an elementary proof of this theorem here, we post-pone the verification to the next chapter. There, using more advanced and elegant techniques(blow-ups and Young measures), we will in fact establish a much more general result andsee that, when viewed from the right perspective, u-dependence does not pose any additionalcomplications.

Com

men

t...

2.4. THE EULER–LAGRANGE EQUATION 19

2.4 The Euler–Lagrange equation

In analogy to the elementary fact that the derivative of a function at its minimizers is zero,we will now derive a necessary condition for a function to be a minimizer. This conditionfurnishes the connection between the Calculus of Variations and PDE theory and is of greatimportance for computing particular solutions.

Theorem 2.13 (Euler–Lagrange equation (system)). Let f : Ω×Rm ×Rm×d → R bea Caratheodory integrand that is continuously differentiable in v and A and satisfies thegrowth bounds

|Dv f (x,v,A)|, |DA f (x,v,A)| ≤ M(1+ |v|p + |A|p), (x,v,A) ∈ Ω×Rm ×Rm×d ,

for some M > 0, 1 ≤ p < ∞. If u∗ ∈ W1,pg (Ω;Rm), where g ∈ W1−1/p,p(∂Ω;Rm), minimizes

the functional

F [u] :=∫

Ωf (x,u(x),∇u(x)) dx, u ∈ W1,p

g (Ω;Rm),

then u∗ is a weak solution of the following system of PDEs, called the Euler–Lagrangeequation3,

−div[DA f (x,u,∇u)

]+Dv f (x,u,∇u) = 0 in Ω,

u = g on ∂Ω.(2.5)

Here we used the common convention to omit the x-arguments whenever this does notcause any confusion in order to curtail the proliferation of x’s, for example in f (x,u,∇u) =f (x,u(x),∇u(x)).

Recall that u∗ is a weak solution of (2.5) if∫Ω

DA f (x,u∗,∇u∗) : ∇ψ +Dv f (x,u∗,∇u∗) ·ψ dx = 0

for all ψ ∈ C∞c (Ω;Rm), where

DA f (x,v,A) : B := limh↓0

f (x,v,A+hB)− f (x,v,A)h

, A,B ∈ Rm×d,

is the directional derivative of f (x,v, q) at A in direction B, likewise for Dv f (x,v,A) ·w.The derivatives DA f (x,v,A) and Dv f (x,v,A) can also be represented as a matrix in Rm×d

and a vector in Rm, respectively, namely

DA f (x,v,A) :=(∂A j

kf (x,v,A)

) jk, Dv f (x,v,A) :=

(∂v j f (x,v,A)

) j.

Hence we use the notation with the Frobenius matrix vector product “:” (see the appendix)and the scalar product “·”. The boundary condition u = g on ∂Ω is to be understood in thesense of trace.

3This is a system of PDEs, but we simply speak of it as an “equation”.

Com

men

t...

Proof. For all ψ ∈ C∞c (Ω;Rm) and all h > 0 we have

F [u∗]≤ F [u∗+hψ]

since u∗+hψ ∈ W1,pg (Ω;Rm) is admissible in the minimization. Thus,

0 ≤∫

Ω

f (x,u∗+hψ,∇u∗+h∇ψ)− f (x,u∗,∇u∗)h

dx

=∫

Ω

∫ 1

0

1h

ddt

[f (x,u∗+ thψ,∇u∗+ th∇ψ)

]dt dx

=∫

Ω

∫ 1

0DA f (x,u∗+ thψ,∇u∗+ th∇ψ) : ∇ψ

+Dv f (x,u∗+ thψ,∇u∗+ th∇ψ) ·ψ dt dx.

By the growth bounds on the derivative, the integrand can be seen to have an h-uniform L1-majorant, namely C(1+ |u∗|p + |ψ|p + |∇u∗|p + |∇ψ|p) and so we may apply the Lebesguedominated convergence theorem to let h ↓ 0 under the (double) integral. This yields

0 ≤∫

ΩDA f (x,u∗,∇u∗) : ∇ψ +Dv f (x,u∗,∇u∗) ·ψ dx

and we conclude by taking ψ and −ψ in this inequality.

Remark 2.14. If we want to allow ψ ∈ W1,p0 (Ω;Rm) in the weak formulation of (2.5),

then we need to assume the stronger growth conditions

|Dv f (x,v,A)|, |DA f (x,v,A)| ≤ M(1+ |v|p−1 + |A|p−1)

for some M > 0, 1 ≤ p < ∞.

Example 2.15. Coming back to the Dirichlet integral from Example 2.9, we see that theassociated Euler–Lagrange equation is the Laplace equation

−∆u = 0,

where ∆ := ∂ 2x1+ · · ·+∂ 2

xdis the Laplace operator. Solutions u are called harmonic func-

tions. Since the Dirichlet integral is convex, all solutions of the Laplace equation are in factminimizers. In particular, it can be seen that solutions of the Laplace equation are uniquefor given boundary values.

Example 2.16. In the linearized elasticity problem from Example 2.10, we may computethe Euler–Lagrange equation as−div

[2µ E u+

(κ − 2

3µ)(trE u)I

]= f in Ω,

u = g on ∂Ω.

Com

men

t...


Sometimes, one also defines the first variation δF [u] of F at u ∈ W1,p(Ω;Rm) as thelinear map

δF [u] : W1,p0 (Ω;Rm)→ R

through (whenever this exists)

δF [u][ψ] := limh↓0

F [u+hψ]−F [u]h

, ψ ∈ W1,p0 (Ω;Rm).

Then, the assertion of the previous theorem can equivalently be formulated as

δF [u∗] = 0 if u∗ minimizes F over W1,pg (Ω;Rm).

Of course, this condition is only necessary for u to be a minimizer. Indeed, any solution ofthe Euler–Lagrange equation is called a critical point of F , which could be a minimizer,maximizer, or saddle point.

One crucial consequence of the results in this section is that we can use all availablePDE methods to study minimizers. Immediately, one can ask about the type of PDE we aredealing with. In this respect we have the following result:

Proposition 2.17. Let f be twice continuously differentiable in v and A. Also let A 7→f (x,v,A) be convex for all (x,v)∈Ω×Rm. Then, the Euler–Lagrange equation is an ellipticPDE, that is,

0 ≤ D2A f (x,v,A)[B,B] :=

d2

dt2 f (x,v,A+ tB)∣∣∣∣t=0

for all x ∈ Ω, v ∈ Rm, and A,B ∈ Rm×d .

This proposition is not difficult to establish and just needs a bit of notation; we omit theproof.

One main use of the Euler–Lagrange equation is to find concrete solutions of variationalproblems:

Example 2.18. Recall the optimal saving problem from Section 1.5,F [S] :=∫ T

0− ln(w+ρS(t)− S(t)) dt → min,

S(0) = 0, S(T ) = ST ≥ 0.

The Euler–Lagrange equation is

− ddt

[1

w+ρS(t)− S(t)

]=− ρ

w+ρS(t)− S(t),

which resolves to

ρ S(t)− S(t)w+ρS(t)− S(t)

= ρ .

Com

men

t...


Figure 2.1: The solution to the optimal saving problem.

With the consumption rate C(t) := w+ρS(t)− S(t), this is equivalent to

C(t)C(t)

= ρ,

and so, if C(0) =C0 (to be determined later),

w+ρS(t)− S(t) =C(t) = eρtC0.

This differential equation for S(t) can be solved, for example via the Duhamel formula,which yields

S(t) = eρt ·0+∫ t

0eρ(t−s)(w− eρsC0) ds

=eρt −1

ρw− teρtC0.

and C0 can now be chosen to satisfy the terminal condition S(T ) = ST , in fact, C0 = (1−e−ρT )w/(ρT )− e−ρT ST/T .

Since C 7→ − lnC is convex, we know that S(t) as above is a minimizer of our problem.In Figure 2.1 we see the optimal savings strategy for a worker earning a (constant) salaryof w = £30,000 per year and having a savings goal of ST = £100,000. The worker has tosave for approximately 27 years, reaching savings of just over £168,000, and then startswithdrawing his savings (S(t)< 0) for the last 13 years. The worker’s consumption C(t) =eρtC0 goes up continuously during the whole working life, making him/her (materially)happy.

Com

men

t...


Example 2.19. For functions u = u(t,x) : R×Rd → R, consider the action functional

A [u] :=12

∫R×Rd

−|∂tu|2 + |∇xu|2 d(t,x),

where ∇xu is the gradient of u with respect to x ∈Rd . This functional should be interpretedas the usual Dirichlet integral, see Example 2.9, where, however, we use the Lorentz metric.Then, the Euler–Lagrange equation is the wave equation

∂ 2t u−∆u = 0.

Notice that A is not convex and consequently the wave equation is hyperbolic instead ofelliptic.

It is an important question whether a weak solution of the Euler–Lagrange equation (2.5)is also a strong solution, that is, whether u ∈ W2,2(Ω;Rm) and

−div[DA f (x,u(x),∇u(x))

]+Dv f (x,u(x),∇u(x)) = 0 for a.e. x ∈ Ω,

u = g on ∂Ω.(2.6)

If u ∈ C2(Ω;Rm)∩C(Ω;Rm) satisfies this PDE for every x ∈ Ω, then we call u a classicalsolution.

Multiplying (2.6) by a test function ψ ∈ C∞c (Ω;Rm), integrating over Ω, and using the

Gauss–Green Theorem, it follows that any solution of (2.6) also solves (2.5). The converseis also true whenever u is sufficiently regular:

Proposition 2.20. Let the integrand f be twice continuously differentiable and let u ∈W2,2(Ω;Rm) be a weak solution of the Euler–Lagrange equation (2.5). Then, u solves theEuler–Lagrange equation (2.6) in the strong sense.

Proof. If u ∈ W2,2(Ω;Rm) is a weak solution, then∫Ω

DA f (x,u,∇u) : ∇ψ +Dv f (x,u,∇u) ·ψ dx = 0

for all ψ ∈ C∞c (Ω;Rm). Integration by parts (more precisely, the Gauss–Green Theorem)

gives ∫Ω

−div

[DA f (x,u,∇u)

]+Dv f (x,u,∇u)

·ψ dx = 0,

again for all ψ as before. We conclude using the following important lemma.

Lemma 2.21 (Fundamental lemma of the Calculus of Variations). Let Ω⊂Rd be open.If g ∈ L1(Ω) satisfies∫

Ωgψ dx = 0 for all ψ ∈ C∞

c (Ω),

then g ≡ 0 almost everywhere.

Com

men

t...


Proof. We can assume that Ω is bounded by considering subdomains if necessary. Also, letg be extended by zero to all of Rd . Fix ε > 0 and let (ρδ )δ ⊂ C∞

c (B(0,δ )) be a family ofmollifying kernels, see Appendix A.2 for details. Then, since ρδ ⋆g → g in L1, we may findh ∈ (L1 ∩C∞)(Rd) with

∥g−h∥L1 ≤ε4

and ∥h∥∞ < ∞.

Set φ(x) := h(x)/|h(x)| for h(x) = 0 and φ(x) := 0 for h(x) = 0, so that hφ = |h|. Then takeψ = ρδ ⋆φ ∈ (L1 ∩C∞)(Rd) with

∥φ −ψ∥L1 ≤ε

2∥h∥∞.

Since |ψ| ≤ 1 (this follows from the convolution formula),

∥g∥L1 ≤ ∥g−h∥L1 +

∫Ω

hφ dx

≤ ∥g−h∥L1 +∫

Ωh(φ −ψ)+(h−g)ψ +gψ dx

≤ 2∥g−h∥L1 +∥h∥∞ · ∥φ −ψ∥L1 +0

≤ ε.

We conclude by letting ε ↓ 0.

2.5 Regularity of minimizers

As we saw at the end of the last section, whether a weak solution of the Euler–Lagrangeequation is also a strong or even classical solution, depends on its regularity, that is, on theamount of differentiability. More generally, one would like to know how much regularitywe can expect from solutions of a variational problem. Such a question also was the contentof David Hilbert’s 19th problem [47]:

Does every Lagrangian partial differential equation of a regular variationalproblem have the property of exclusively admitting analytic integrals?4

In modern language, Hilbert asked whether “regular” variational problems (defined below)admit only analytic solutions, i.e. ones that have a local power series representation.

In this section, we will prove some basic regularity assertions, but we will only sketchthe solution of Hilbert’s 19th problem, as the techniques needed are quite involved. We re-mark from the outset that regularity results are very sensitive to the dimensions involved, inparticular the behavior of the scalar case (m = 1) and the vector case (m > 1) is fundamen-tally different. We also only consider the quadratic (p = 2) case for reasons of simplicity.

4The German original asks “ob jede Lagrangesche partielle Differentialgleichung eines regularen Varia-tionsproblems die Eigenschaft hat, daß sie nur analytische Integrale zulaßt.”

Com

men

t...

2.5. REGULARITY OF MINIMIZERS 25

In the spirit of Hilbert’s 19th problem, call

F [u] :=∫

Ωf (∇u(x)) dx, u ∈ W1,2(Ω;Rm),

a regular variational integral if f ∈ C2(Rm×d) and there are constants 0 < µ ≤ M with

µ|B|2 ≤ D2A f (A)[B,B]≤ M|B|2 for all A,B ∈ Rm×d .

In this context,

D2A f (A)[B1,B2] =

ddt

dds

f (A+ sB1 + tB2)

∣∣∣∣s,t=0

for all A,B1,B2 ∈ Rm×d ,

which is a bilinear form. Clearly, regular variational problems are convex. In fact, inte-grands f that satisfy the above lower bound µ|B|2 ≤ D2

A f (A)[B,B] are called strongly con-vex. For example, the Dirichlet integral from Example 2.9 is regular. Also, the followingCauchy–Schwarz-type estimate can be shown:∣∣D2

A f (A)[B1,B2]∣∣≤ M|B1||B2|. (2.7)

The basic regularity theorem is the following:

Theorem 2.22 (W2,2loc-regularity). Let F be a regular variational integral. Then, for

any minimizer u ∈ W1,2(Ω;Rm) of F , it holds that u ∈ W2,2loc(Ω;Rm). Moreover, for any

B(x0,3r)⊂ Ω (x0 ∈ Ω, r > 0) the Caccioppoli inequality∫B(x0,r)

|∇2u(x)|2 dx ≤(

2Mµ

)2 ∫B(x0,3r)

|∇u(x)− [∇u]B(x0,3r)|2

r2 dx (2.8)

holds, where [∇u]B(x0,3r) := −∫

B(x0,3r) ∇u dx. Consequently, the Euler–Lagrange equationholds strongly,

−divD f (∇u) = 0, a.e. in Ω.

For the proof we employ the difference quotient method, which is fundamental in regu-larity theory. Let u : Ω →Rm, x ∈ Ω, k ∈ 1, . . . ,d, and h ∈R. Then, define the differencequotients

Dhku(x) :=

u(x+hek)−u(x)h

, Dhu := (Dh1u, . . . ,Dh

du),

where e1, . . . ,ed is the standard basis of Rd . The key is the following characterization ofSobolev spaces in terms of difference quotients:

Lemma 2.23. Let 1 < p < ∞, D ⊂⊂ Ω ⊂ Rd be open, and u ∈ Lp(Ω;Rm).

(I) If u ∈ W1,p(Ω;Rm), then

∥Dhku∥Lp(D) ≤ ∥∂ku∥Lp(Ω) for all k ∈ 1, . . . ,d, |h|< dist(D,∂Ω).

Com

men

t...

(II) If for some 0 < δ < dist(D,∂Ω) it holds that

∥Dhku∥Lp(D) ≤C for all k ∈ 1, . . . ,d and all |h|< δ ,

then u ∈ W1,ploc (Ω;Rm) and ∥∂ku∥Lp(D) ≤C for all k ∈ 1, . . . ,d.

Proof. For (I), assume first that u ∈ (Lp ∩C1)(Ω;Rm). In this case, by the FundamentalTheorem of Calculus, at x ∈ Ω it holds that

Dhku(x) =

1h

∫ 1

0

ddt

u(x+ thek) dt =∫ 1

0∂ku(x+ thek) dt.

Thus, ∫D|Dh

ku|p dx ≤∫

Ω|∂ku|p dx,

from which the assertion follows. The general case follows from the density of (Lp ∩C1)(Ω;Rm) in Lp(Ω;Rm).

For (II), we observe that for fixed k ∈ 1, . . . ,d by assumption (Dhku)0<h<δ is uniformly

Lp-bounded. Thus, for an arbitrary fixed sequence of h’s tending to zero, there exists asubsequence h j ↓ 0 with

Dh jk u vk in Lp

for some vk ∈ Lp(Ω;Rm). Let ψ ∈ C∞c (Ω;Rm). Using an “integration-by-parts” rule for

difference quotients, which is elementary to check, we get∫D

vk ·ψ dx = limj→∞

∫D

Dh jk u ·ψ dx =− lim

j→∞

∫D

u ·D−h jk ψ dx =−

∫D

u ·∂kψ dx.

Thus, u ∈ W1,p(D;Rm) and vk = ∂ku. The norm estimate follows from the lower semicon-tinuity of the norm under weak convergence.

Proof of Theorem 2.22. The idea is to emulate the a-priori non-existent second derivativesusing difference quotients and to derive estimates which allow one to conclude that thesedifference quotients are in L2. Then we can conclude by the preceding lemma.

Let u ∈ W1,2(Ω;Rm) be a minimizer of F . By Theorem 2.13,

0 =∫

ΩD f (∇u) : ∇ψ dx (2.9)

holds for all ψ ∈ C∞c (Ω;Rm), and then by density also for all ψ ∈ W1,2(Ω;Rm). Here we

have used the notation D f (A) : B for the directional derivative of f at A in direction B. Fixa ball B(x0,3r)⊂ Ω and take a Lipschitz cut-off function ρ ∈ W1,∞(Ω) such that

1B(x0,r) ≤ ρ ≤ 1B(x0,2r) and |∇ρ | ≤ 1r.

Com

men

t...

With the W2,2loc-regularity result at hand, we may use ψ = ∂kψ , k = 1, . . . ,d, as test

function in the weak formulation of the Euler–Lagrange equation −div[D f (∇u)] = 0 toconclude that

−div[D2 f (∇u) : ∇(∂ku)

]= 0 (2.11)

holds in the weak sense. Indeed,

0 =−∫

ΩdivD f (∇u) ·∂kψ dx =

∫Ω

div[D2 f (∇u) : ∇(∂ku)

]·ψ dx

for all ψ ∈ C∞c (Ω;Rm). In the special case that f is quadratic, D2 f is constant and the above

PDE has the same structure (with ∂ku as unknown function) as the original Euler–Lagrangeequation. Thus it is itself an Euler–Lagrange equation and we can bootstrap the regularitytheorem to get u ∈ Wk,2

loc(Ω;Rm) for all k ∈ N, hence k ∈ C∞(Ω;Rm).

Example 2.24. For a minimizer u ∈ W1,2(Ω) of the Dirichlet integral as in Example 2.9,the theory presented here immediately gives u ∈ C∞(Ω).

Example 2.25. Similarly, for minimizers u ∈ W1,2(Ω;R3) to the problem of linearizedelasticity from Section 1.4 and Example 2.10, we also get u ∈ C∞(Ω;R3). The reasoninghas to be slightly adjusted, however, to take care of the fact that we are dealing with E uinstead of ∇u.

In the general case, however, our Euler–Lagrange equation is nonlinear and the abovebootstrapping procedure fails. In particular, this includes the problem posed in Hilbert’s19th problem, which was only solved, in the scalar case m = 1, by Ennio De Giorgi in1957 [28] and, using different methods, by John Nash in 1958 [78]. After the proof wasimproved by Jurgen Moser in 1960/1961 [71, 72], the results are now subsumed in the DeGiorgi–Nash–Moser theory.

The current standard solution (based on De Giorgi’s approach) is based on the De Giorgiregularity theorem and the classical Schauder estimates, which establish Holder regular-ity of weak solutions of a PDE.

Theorem 2.26 (De Giorgi regularity theorem). Let the function A : Ω → Rd×d be mea-surable, symmetric, i.e. A(x) = A(x)T for x ∈ Ω, and satisfy the ellipticity and boundednessestimates

µ|v|2 ≤ vT A(x)v ≤ M|v|2, x ∈ Ω,v ∈ Rd , (2.12)

for constants 0 < µ ≤ M. If u ∈ W1,2(Ω) is a weak solution of

−div[A∇u] = 0, (2.13)

then u ∈ C0,α0loc (Ω), that is, u is α0-Holder continuous, for some α0 = α0(d,M/µ) ∈ (0,1).

Com

men

t...


Theorem 2.27 (Schauder estimate). Let A : Ω → Rd×d be as above but in addition as-sumed to be of class Cs−1,α for some s ∈N and α ∈ (0,1). If u ∈ W1,2(Ω) is a weak solutionof

−div[A∇u] = 0,

then u ∈ Cs,αloc (Ω).

We remark that the Schauder estimates also hold for systems of PDEs (u : Ω → Rm,m > 1), but the De Giorgi regularity theorem does not. The proofs of these results are quiteinvolved and reserved for an advanced course on regularity theory, but we establish theirarguably most important consequence, namely the solution of Hilbert’s 19th problem in thescalar case:

Theorem 2.28 (De Giorgi–Nash–Moser). Let F be a regular variational integral andf ∈ Cs(Rd), s ∈ 2,3, . . .. If u ∈ W1,2(Ω) minimizes F , then it holds that u ∈ Cs−1,α

loc (Ω)for some α ∈ (0,1). In particular, if f ∈ C∞(Rd), then u ∈ C∞(Ω).

Proof. We saw in (2.11) that the partial derivative ∂ku, k = 1, . . . ,d, of a minimizer u ∈W1,2(Ω) of F satisfy (2.13) for

A(x) := D2 f (∇u(x)), x ∈ Ω.

From general properties of the Hessian we conclude that A(x) is symmetric and the upperand lower estimates (2.12) on A follow from the respective properties of D2 f . However,we cannot conclude (yet) any regularity of A beyond measurability. Nevertheless, we mayapply the De Giorgi Regularity Theorem 2.26, whereby ∂ku ∈ C0,α0

loc (Ω) for some α0 ∈ (0,1)and all k = 1, . . . ,d. Hence u ∈ C1,α0

loc (Ω). This is the assertion for s = 2.If s = 3, our arguments so far in conjunction with the regularity assumptions on f imply

D2 f (∇u(x))∈C0,α0loc (Ω). Hence, the Schauder estimates from Theorem 2.27 apply and yield

∂ku ∈ C1,α0loc (Ω), whereby u ∈ C2,α0

loc (Ω) = Cs−1,α0loc (Ω). For higher s, this procedure can be

iterated until we run out of f -derivatives and we have achieved u ∈ Cs−1,α0loc (Ω).

The above argument also yields analyticity of u if f is analytic. Another refinementshows that in the situation of the De Giorgi–Nash–Moser Theorem, we even have u ∈Cs−1,α

loc (Ω) for all α ∈ (0,1). Here one needs the additional Schauder–type result that weaksolutions u to −div

[A∇(∂ku)

]= 0 for A continuous are of class C0,α

loc for all α ∈ (0,1).In the above proof we can apply this result in the case s = 2 since A(x) = D2 f (∇u(x)) iscontinuous. Thus, u ∈ C1,α

loc (Ω) for any α ∈ (0,1). The other parts of the proof are adaptedaccordingly. See [86] for details and other refinements.

We close our look at regularity theory by considering the vectorial case m > 1. It wasshown again by De Giorgi in 1968 that if d = m > 2, then his regularity theorem does nothold:

Com

men

t...


Example 2.29 (De Giorgi). Let d = m > 2 and define

u(x) := |x|−γx, γ :=d2

(1− 1√

(2d −2)2 +1

), x ∈ B(0,1).

Note that 1 < γ < d2 and so u ∈ W1,2(B(0,1);Rd) but u /∈ L∞

loc(B(0,1);Rd). It can bechecked, though, that u solves (2.13) for

vT A(x)v := |v|2+[(

(d −1)I +d · x⊗ x|x|2

)v]2

,

which satisfies all assumptions of the De Giorgi Theorem. Moreover, u is a minimizer ofthe quadratic variational integral∫

B(0,1)∇u(x)T A(x)∇u(x) dx.

However, this is not a regular variational integral because of the x-dependence of theintegrand is not smooth.

In principle, this still leaves the possibility that the vectorial analogue of Hilbert’s 19thproblem has a positive solution, just that its proof would have to proceed along a differentavenue than via the De Giorgi Theorem. However, after first results of Necas in 1975 [79],the current state of the art was established by Sverak and Yan in 2000–2002 [90, 91]. Theyprove that there exist regular (and C∞) variational integrals such that minimizers must be:

• non-Lipschitz if d ≥ 3, m ≥ 5,

• unbounded if d ≥ 5, m ≥ 14.

However, there are also positive results, in particular we have that for d = 2 min-imizers to regular variational problems are always as regular as the data allows (as inTheorem 2.28), this is the Morrey regularity theorem from 1938, see [68]. If we con-fine ourselves to Sobolev-regularity, then using the difference quotient technique, we canprove W2,2+δ

loc -regularity for minimizers of regular variational problems for some dimension-dependent δ > 0, this result is originally due to Sergio Campanato. By an embeddingtheorem this yields C0,α -regularity for some α ∈ (0,1) when d ≤ 4. Only in 2008 wereJan Kristensen and Christof Melcher able to establish W2,2+δ

loc -regularity for a dimension-independent δ > 0 (in fact, δ = µ/(50M)), see [55]. We will learn about further regularityresults in Section 3.8. Regularity theory is still an area of active research and new resultsare constantly discovered.

Finally, we mention that regularity of solutions might in general be extended up to andincluding the boundary if the boundary is smooth enough, see Section 6.3.2 in [35] andalso [44] for results in this direction.

Com

men

t...

2.6. LAVRENTIEV GAP PHENOMENON 31

2.6 Lavrentiev gap phenomenon

So far we have chosen the function spaces in which we look for solutions of a minimizationproblem according to coercivity and growth assumptions. However, at first sight, classicallydifferentiable functions may appear to be more appealing. So the question arises whetherthe minimum (infimum) value is actually the same when considering different functionspaces. Formally, given two spaces X ⊂ Y such that X is dense in Y and a functionalF : Y → R∪+∞, is

infY

F = infX

F ?

This is certainly true if we assume good growth conditions:

Theorem 2.30. Let f : Ω×Rm ×Rm×d → R be a Caratheeodory function satisfying thegrowth bound

| f (x,v,A)| ≤ M(1+ |v|p + |A|p), (x,v,A) ∈ Ω×Rm ×Rm×d ,

for some M > 0, 1 < p < ∞. Then, the functional

F [u] :=∫

Ωf (x,u(x),∇u(x)) dx, u ∈ W1,p(Ω;Rm)

is strongly continuous. Consequently,

infW1,p(Ω;Rm)

F = infC∞(Ω;Rm)

F .

Proof. Let u j → u in W1,p, whereby u j → u and ∇u j →∇u almost everywhere (which holdsafter selecting a non-renumbered subsequence). Then, from the growth assumption we get

F [u j] =∫

Ωf (x,u j,∇u j) dx ≤

∫Ω

M(1+ |u j|p + |∇u j|p) dx

and by the Pratt Theorem A.5,

F [u j]→ F [u].

This shows the continuity of F with respect to strong convergence in W1,p.The assertion about the equality of the infima now follows readily since C∞(Ω;Rm) is

(strongly) dense in W1,p(Ω;Rm).

If we dispense with the (upper) growth, however, the infimum over different spacesmay indeed be different – this is the Lavrentiev gap phenomenon, discovered in 1926 byMikhail Lavrentiev. We here give an example between the spaces W1,1 and W1,∞ (withboundary conditions) due to Basilio Mania from 1934.

Com

men

t...

Example 2.31 (Mania). Consider the minimization problemF [u] :=∫ 1

0(u(t)3 − t)2u(t)6 dt → min,

u(0) = 0, u(1) = 1

for u from either W1,1(0,1) or W1,∞(0,1) (Lipschitz functions). We claim that

infW1,1(0,1)

F < infW1,∞(0,1)

F ,

where here and in the following these infima are to be taken only over functions u withboundary values u(0) = 0, u(1) = 1.

Clearly, F ≥ 0, and for u∗(t) := t1/3 ∈ (W1,1 \W1,∞)(0,1) we have F [u∗] = 0. Thus,

infW1,1(0,1)

F = 0.

On the other hand, for any u ∈ W1,∞(0,1) with u(0) = 0, u(1) = 1, the Lipschitz-continuityimplies that

u(t)≤ h(t) :=t1/3

2for all t ∈ [0,τ ] and some τ > 0,

and u(τ) = h(τ). Then, u(t)3 − t ≤ h(t)3 − t for t ∈ [0,τ ] and, since both of these terms arenegative,

(u(t)3 − t)2 ≥ (h(t)3 − t)2 =72

82 t2 for all t ∈ [0,τ].

We then estimate

F [u]≥∫ τ

0(u(t)3 − t)2 u(t)6 dt ≥ 72

82

∫ τ

0t2 u(t)6 dt.

Further, by Holder’s inequality,∫ τ

0u(t) dt =

∫ τ

0t−1/3 · t1/3 u(t) dt

≤(∫ τ

0t−2/5 dt

)5/6

·(∫ τ

0t2 u(t)6 dt

)1/6

=55/6

35/6 τ1/2(∫ τ

0t2 u(t)6 dt

)1/6

.

Since also∫ τ

0u(t) dt = u(τ)−u(0) = h(τ) =

τ1/3

2,

Com

men

t...

2.7. INVARIANCES AND THE NOETHER THEOREM 33

we arrive at

F [u]≥ 7235

825526τ≥ 7235

825526 > 0.

Thus,

infW1,∞(0,1)

F > 0 = infW1,1(0,1)

F .

In a more recent example, Ball & Mizel [13] showed that the problemF [u] :=∫ 1

−1(t4 −u(t)6)2|u(t)|2m + ε u(t)2 dt → min,

u(−1) = α, u(1) = β ,

also exhibits a Lavrentiev gap phenomenon between the spaces W1,2 and W1,∞ if m ∈ Nsatisfies m > 13, ε > 0 is sufficiently small, and −1 ≤ α < 0 < β ≤ 1. This example issignificant because the Ball–Mizel functional is coercive on W1,2(−1,1).

Finally, we note that despite its theoretical significance, the Lavrentiev gap phenomenonis a major obstacle for the numerical approximation of minimization problems. For in-stance, standard piecewise-affine finite elements are in W1,∞ and hence in the presence ofthe Lavrentiev gap phenomenon we cannot approximate the true solution with such ele-ments! Thus, one is forced to work with non-conforming elements and other advancedschemes. This issue does not only affect “academic” examples such as the ones above, butis also of great concern in applied problems, such as nonlinear elasticity theory.

2.7 Invariances and the Noether theorem

In physics and other applications of the Calculus of Variations, we are often highly inter-ested in symmetries of minimizers or, more generally, critical points. These symmetriesmanifest themselves in other differential or pointwise relations that are automatically satis-fied for any critical point. They can be used to identify concrete solutions or are interestingin their own right, for example as “gauge symmetries” in modern physics. In this section,for simplicity we only consider “sufficiently smooth” (W2,2

loc) critical points, which is thenatural level of regularity, see Theorem 2.22.

As a first concrete example of a symmetry, consider the Dirichlet integral

F [u] :=∫

Ω|∇u(x)|2 dx, u ∈ W1,2(Ω),

which was introduced in Example 2.9. First we notice that F is invariant under translationsin space: Let u ∈ (W1,2 ∩W2,2

loc)(Ω), τ ∈ R, k ∈ 1, . . . ,d, and set

xτ := x+ τek, uτ(x) := u(x+ τek).

Com

men

t...


Then, for any open D ⊂ Rd with Dτ := D+ τek ⊂ Ω, we have the invariance∫D|∇uτ(x)|2 dx =

∫Dτ

|∇u(xτ)|2 dxτ . (2.14)

The Dirichlet integral also exhibits invariance with respect to scaling: For λ > 0 set

xλ := λx, uλ (x) := λ (d−2)/2u(λx), Dλ := λD. (2.15)

Then it is not hard to see that (2.14) again holds if we set λ = eτ (to allow τ ∈R as before).The main result of this section, the Noether theorem, roughly says that “differentiable

invariances of a functional give rise to conservation laws”. More concretely, the two in-variances of the Dirichlet integral presented above will yield two additional PDEs that anyminimizer of the Dirichlet integral must satisfy.

To make these statement precise in the general case, we need a bit of notation: Letu ∈ (W1,2 ∩W2,2

loc)(Ω;Rm) be a critical point of the functional

F [u] :=∫

Ωf (x,u(x),∇u(x)) dx,

where f : Ω×Rm ×Rm×d → R is twice continuously differentiable. Then, u satisfies thestrong Euler–Lagrange equation

−div[DA f (x,u,∇u)

]+Dv f (x,u,∇u) = 0 a.e. in Ω.

see Proposition 2.20. We consider u to be extended to all of Rd (it will not matter belowhow we extend u).

Our invariance is given through maps g : Rd ×R → Rd and H : Rd ×R → Rm, whichwill depend on u above, such that

g(x,0) = x and H(x,0) = u(x), x ∈ Rd .

We also require that g,H are continuously differentiable in their second argument τ foralmost every x ∈ Ω. Then set for x ∈ Rd , τ ∈ R, D ⊂⊂ Rd

xτ := g(x,τ), uτ(x) := H(x,τ), Dτ := g(D,τ).

Therefore, g,H can be thought of as a form of homotopy.We call F invariant under the transformation defined by (g,H) if∫

Df (x,uτ(x),∇uτ(x)) dx =

∫Dτ

f (x′,u(x′),∇u(x′)) dx′. (2.16)

for all τ ∈ R sufficiently small and all open D ⊂ Rd such that Dτ ⊂⊂ Ω.The main result of this section goes back to Emmy Noether5 and is considered one of

the most important mathematical theorems ever proved. Its pivotal idea of systematicallygenerating conservation laws from invariances has proved to be immensely influential inmodern physics.

5The German mathematician Emmy Noether (23 March 1882 – 14 April 1935) has been called the mostimportant woman in the history of mathematics by Einstein and others.

Com

men

t...


2.7. INVARIANCES AND THE NOETHER THEOREM 35

Theorem 2.32 (Noether). Let f : Ω×Rm ×Rm×d → R be twice continuously differen-tiable and satisfy the growth bounds

|Dv f (x,v,A)|, |DA f (x,v,A)| ≤ M(1+ |v|p + |A|p), (x,v,A) ∈ Ω×Rm ×Rm×d ,

for some M > 0, 1 ≤ p < ∞, and let the associated functional F be invariant underthe transformation defined by (g,H) as above, for which we assume that there exist a p-majorant g ∈ Lp(Ω) such that

|∂τH(x,τ)|, |∂τg(x,τ)| ≤ g(x) for a.e. x ∈ Ω and all τ ∈ R. (2.17)

Then, for any critical point u ∈ (W1,2 ∩W2,2loc)(Ω;Rm) of F the conservation law

div[µ(x) ·DA f (x,u,∇u)−ν(x) · f (x,u,∇u)

]= 0 for a.e. x ∈ Ω. (2.18)

holds, where

µ(x) := ∂τH(x,0) ∈ Rm and ν(x) := ∂τg(x,0) ∈ Rd (x ∈ Ω)

are the Noether multipliers.

Corollary 2.33. If f = f (v,A) : R×Rd → R does not depend on x, then every criticalpoint u ∈ (W1,2 ∩W2,2

loc)(Ω) of F satisfies

d

∑i=1

∂i

[(∂ku) ·∂Ak f (u,∇u)−δik f (u,∇u)

]= 0 (2.19)

for all k = 1, . . . ,d.

Proof. Differentiate (2.16) with respect to τ and then set τ = 0. To be able to differentiatethe left hand side under the integral, we use the growth assumptions on the derivatives|Dv f (x,v,A)|, |DA f (x,v,A)| and (2.17) to get a uniform (in τ) majorant, which allows tomove the differentiation under the intgral sign. For the right hand side, we need to employthe formula for the differentiation of an integral with respect to a moving domain (this is aspecial case of the Reynolds transport theorem),

ddτ

∫Dτ

f (x,u,∇u) dx =∫

∂Dτf (x,u,∇u)∂τg(x,τ) ·n dσ ,

where n is the unit outward normal; this formula can be checked in an elementary way. Ab-breviating for readability F := f (x,u(x),∇u(x)), DAF := DA f (x,u(x),∇u(x)), and DvF :=Dv f (x,u(x),∇u(x)), we get∫

DDAF : ∇µ +DvF ·µ dx =

∫∂D

Fν ·n dσ .

Applying the Gauss–Green theorem,∫D

[−divDAF +DvF

]·µ dx =

∫∂D

[−µ ·DAF +νF

]·n dσ

=∫

Ddiv[−µ ·DAF +νF

]dx

Com

men

t...

Now, the Euler–Lagrange equation −divDAF +DvF = 0 immediately implies∫D

div[−µ ·DAF +νF

]dx = 0.

and varying D we conclude that (2.18) holds.The corollary follows by considering the invariance xτ := x+ τek, uτ(x) := u(x+ τek),

k = 1, . . . ,d.

Example 2.34 (Brachistochrone problem). In the brachistochrone problem presented inSection 1.1, we were tasked to minimize the functional

F [y] :=∫ 1

0

√1+(y′)2

−ydx

over all y : [0,1]→ R with y(0) = 0, y(1) = y < 0. The integrand f (v,a) =√

−(1+a2)/vis independent of x, hence from (2.19) we get

y′ ·Da f (y,y′)− f (y,y′) = const =− 1√2r

for some r > 0 (positive constants lead to inadmissible y). So,

(y′)2√1+(y′)2 ·

√−y

−√

1+(y′)2√−y

=− 1√2r

,

which we transform into the differential equation

(y′)2 =−2ry−1.

The solution of this differential equation is called an inverted cycloid with radius r > 0,which is the curve traced out by a fixed point on a sphere of radius r (touching the y-axis atthe beginning) that rolls to the right on the bottom of the x-axis, see Figure 2.2. It can bewritten in parametric form as

x(t) = r(t − sin t),

y(t) =−r(1− cos t),t ∈ R.

The radius r > 0 has to be chosen as to satisfy the boundary conditions on one cycloidsegment. This is the solution of the brachistochrone problem.

Example 2.35. We return to our canonical example, the Dirichlet integral, see Exam-ple 2.9. We know from Example 2.24 that critical points u are smooth. We noticed at thebeginning of this section that the Dirichlet integral is invariant under the scaling transforma-tion (2.15) with λ = eτ . By the Noether theorem, this yields the (non-obvious) conservationlaw

div[(2∇u(x) · x+(d −2)u)∇u−|∇u|2x

]= 0.

Com

men

t...

2.8. SIDE CONSTRAINTS 37

Figure 2.2: The brachistochrone curve (here, x = 1).

Integrating this over B(0,r)⊂ Ω, r > 0, and using the Gauss–Green theorem, we get

(d −2)∫

B(0,r)|∇u(x)|2 dx = r

∫∂B(0,r)

|∇u(x)|2 −2(

∇u(x) · x|x|

)2

dσ

and then, after some computations,

ddr

(1

rd−2

∫B(0,r)

|∇u(x)|2 dx)=

2rd−2

∫∂B(0,r)

(∇u(x) · x

|x|

)2

dσ ≥ 0.

This monotonicity formula implies that

1rd−2

∫B(0,r)

|∇u(x)|2 dx is increasing in r > 0.

Any harmonic function (−∆u = 0) that is defined on all of Rd satisfies this monotonicityformula. For example, this allows us to draw the conclusion that if d ≥ 3, then u cannotbe compactly supported. The formula also shows that the growth around a singularity atzero (or anywhere else by translation) has to behave “more smoothly” than |x|−2 (in fact,we already know that solutions are C∞). While these are not particularly strong remarks,they serve to illustrate how Noether’s theorem restricts the candidates for solutions.

The last example exhibited a conservation law that was not obvious from the Euler–Lagrange equation. While in principle it could have been derived directly, the Noethertheorem gave us a systematic way to find this conservation law from an invariance.

2.8 Side constraints

In many minimization problems, the class of functions over which we minimize our func-tional may be restricted to include one or several side constraints. To establish existenceof a minimizer also in these cases, we first need to extend the Direct Method:

Com

men

t...



Theorem 2.36 (Direct Method with side constraint). Let X be a Banach space and letF ,G : X → R∪+∞. Assume the following:

(WH1) Weak Coercivity of F : For all Λ ∈ R, the sublevel set

x ∈ X : F [x]≤ Λ is sequentially weakly compact,

that is, if F [x j] ≤ Λ for a sequence (x j) ⊂ X, then (x j) has a weakly convergentsubsequence.

(WH2) Weak lower semicontinuity of F : For all sequences (x j) ⊂ X with x j x it holdsthat


F [x j].

(WH3) Weak continuity of G : For all sequences (x j)⊂ X with x j x it holds that

G [x j]→ G [x].

Assume also that there exists at least one x0 ∈ X with G [x0] = 0. Then, the minimizationproblem

F [x]→ min over all x ∈ X with G [x] = α ∈ R,

has a solution.

Proof. The proof is almost exactly the same as the one for the standard Direct Method inTheorem 2.3. However, we need to select the x j for an infimizing sequence with G [x j] = α .Then, by (WH3), this property also holds for any weak limit x∗ of a subsequence of thex j’s.

A large class of side constraints can be treated using the following simple result:

Lemma 2.37. Let g : Ω×Rm → R be a Caratheodory integrand such that there existsM > 0 with

|g(x,v)| ≤ M(1+ |v|p), (x,v) ∈ Ω×Rm.

Then, G : W1,p(Ω;Rm)→ R defined through

G [u] :=∫

Ωg(x,u(x)) dx.

is weakly continuous.

Com

men

t...


2.8. SIDE CONSTRAINTS 39

Proof. Let u j u in W1,p, whereby after selecting a subsequence and employing theRellich–Kondrachov Theorem A.18 and Lemma A.3, u j → u in Lp (strongly) and almosteverywhere. By assumption we have

±g(x,v)+M(1+ |v|p)≥ 0.

Thus, applying Fatou’s Lemma separately to these two integrands, we get

liminfj→∞

(±G [u j]+

∫Ω

M(1+ |u j|p) dx)≥±G [u]+

∫Ω

M(1+ |u|p) dx.

Since ∥u j∥Lp → ∥u∥Lp , we can combine these two assertions to get G [u j] → G [u]. Sincethis holds for all subsequences, it also follows for our original sequence.

We will later see more complicated side constraints, in particular constraints involvingthe determinant of the gradient.

The following result is very important for the calculation of minimizers under sideconstraints. It effectively says that at a minimizer u∗ of F subject to the side constraintG [u∗] = 0, the derivatives of F and G have to be parallel, in analogy with the finite-dimensional situation.

Theorem 2.38 (Lagrange multipliers). Let f : Ω×Rm ×Rm×d →R, g : Ω×Rm →R beCaratheodory integrands and satisfy the growth bounds

| f (x,v,A)| ≤ M(1+ |v|p + |A|p),|g(x,v)| ≤ M(1+ |v|p), (x,v,A) ∈ Ω×Rm ×Rm×d,

for some M > 0, 1 ≤ p < ∞, and let f ,g be continuously differentiable in v and A. Supposethat u∗ ∈ W1,p

g (Ω;Rm), where g ∈ W1−1/p,p(∂Ω;Rm), minimizes the functional

F [u] :=∫

Ωf (x,u(x),∇u(x)) dx, u ∈ W1,p

g (Ω;Rm),

under the side constraint

G [u] :=∫

Ωg(x,u(x)) dx = α ∈ R.

Assume furthermore the consistency condition

δG [u∗][v] =∫

ΩDvg(x,u∗(x)) · v(x) dx = 0 (2.20)

for at least one v ∈ W1,p0 (Ω;Rm). Then, there exists a Lagrange multiplier λ ∈R such that

u∗ is a weak solution of the system of PDEs−div

[DA f (x,u,∇u)

]+Dv f (x,u,∇u) = λDvg(x,u) in Ω,

u = g on ∂Ω.(2.21)

Com

men

t...


Proof. The proof is structurally similar to its finite-dimensional analogue.Let u∗ ∈ W1,p

g (Ω;Rm) be a minimizer of F under the side constraint G [u] = 0. Fromthe consistency condition (2.20) we infer that there exists w ∈ W1,p

0 (Ω;Rm) such that

δG [u∗][w] =∫

ΩDvg(x,u∗) ·w dx = 1.

Now fix any variation v ∈ W1,p0 (Ω;Rm) and define for s, t ∈ R,

H(s, t) := G [u∗+ sv+ tw].

It is not difficult to see that H is continuously differentiable in both s and t (this uses strongcontinuity of the integral, see Theorem 2.30) and

∂sH(s, t) = δG [u∗+ sv+ tw][v],

∂tH(s, t) = δG [u∗+ sv+ tw][w].

Thus, by definition of w,

H(0,0) = α and ∂tH(0,0) = 1.

By the implicit function theorem, there exists a function τ : R→ R such that τ(0) = 0 and

H(s,τ(s)) = α for small |s|.

The chain rule yields for such s,

0 = ∂s[H(s,τ(s))] = ∂sH(s,τ(s))+∂tH(s,τ(s))τ ′(s),

whereby

τ ′(0) =−∂sH(0,0) =−∫

ΩDvg(x,u∗) · v dx. (2.22)

Now define for small |s| as above,

J(s) := F [u∗+ sv+ τ(s)w].

We have

G [u∗+ sv+ τ(s)w] = H(s,τ(s)) = α

and the continuously differentiable function J has a minimum at s = 0 by the minimiza-tion property of u∗. Thus, with the shorthand notations DA f = DA f (x,u∗,∇u∗) and Dv f =Dv f (x,u∗,∇u∗),

0 = J′(0) =∫

ΩDA f : (∇v+ τ ′(0)∇w)+Dv f · (v+ τ ′(0)w) dx.

Com

men

t...


2.9. NOTES AND BIBLIOGRAPHICAL REMARKS 41

Rearranging and using (2.22), we get∫Ω

DA f : ∇v+Dv f · v dx =−τ ′(0)∫

ΩDA f : ∇w+Dv f ·w dx

= λ∫

ΩDvg(x,u∗) · v dx, (2.23)

where we have defined

λ :=∫

ΩDA f : ∇w+Dv f ·w dx.

Since (2.23) shows that u∗ is a weak solution of (2.21), the proof is finished.

Example 2.39 (Stationary Schrodinger equation). When looking for ground states inquantum mechanics as in Section 1.3, we have to minimize

E [Ψ] :=∫RN

h2

2µ|∇Ψ|2 + 1

2V (x)|Ψ|2 dx

over all Ψ ∈ W1,2(RN ;C) under the side constraint

∥Ψ∥L2 = 1.

From Theorem 2.36 in conjunction with Lemma 2.37 (slightly extended to also apply tothe whole space Ω = RN) we see that this problem always has at least one solution. Theo-rem 2.38 (likewise extended) yields that this Ψ satisfies[

−h2

2µ∆+V (x)

]Ψ(x) = EΨ(x), x ∈ Ω,

for some Lagrange multiplier E ∈ R, which is precisely the stationary Schrodinger equa-tion. One can also show that E > 0 is the smallest eigenvalue of the operator Ψ 7→ [−h2

2µ ∆+

V (x)]Ψ.

2.9 Notes and bibliographical remarks

If a given problem has some convexity, this is invariable very useful and the analysis candraw on a vast theory. We have not mentioned the general field of Convex Analysis, see forexample the classic texts [83] or [34]. In this area, however, the emphasis is more on ex-tending the convex theory to “non-smooth” situations, see for example the monographs [84]and [66, 67].

For the u-dependent variational integrals, the growth in the u-variable can be improvedup to q-growth, where q < p/(p− d) (the Sobolev embedding exponent for W1,p). More-over, we can work with the more general growth bounds | f (x,v,A)| ≤ M(g(x)+ |v|q+ |A|p),with g ∈ L1(Ω; [0,∞)) and q < p/(p−d). For reasons of simplicity, we have omitted thesegeneralization here.

Com

men

t...


The Fundamental Lemma of Calculus of Variations 2.21 is due to Du Bois-Raymondand is sometimes named after him.

Difference quotients were already considered by Newton, their application to regular-ity theory is due to Nirenberg in the 1940s and 1950s. Many of the fundamental resultsof regularity theory (albeit in a non-variational context) can be found in [43]. The bookby Giusti [44] contains theory relevant for variational questions. A nice framework forSchauder estimates is [86]. De Giorgi’s 1968 counterexample is from [29].

Noether’s theorem has many ramifications and can be put into a very general form inHamiltonian systems and Lie-group theory. The idea is to study groups of symmetries andtheir actions. For an introduction into this diverse field see [80].

Example 2.35 about the monotonicity formula is from [36].Side-constraints can also take the form of inequalities, which is often the case in Opti-

mization Theory or Mathematical Programming. This leads to Karush–Kuhn–Tucker con-ditions or obstacle problems.

Com

men

t...


Chapter 3

Quasiconvexity

We saw in the last chapter that convexity of the integrand implies lower semicontinuity forintegral functionals. Moreover, we saw in Proposition 2.11 that also the converse is true inthe scalar case d = 1 or m = 1. In the vectorial case (d,m > 1), however, it turns out that onecan find weakly lower semicontinuous integral functionals whose integrand is non-convex,the following is the most prominent one: Let p ≥ d (Ω ⊂ Rd a bounded Lipschitz domainas usual) and define

F [u] :=∫

Ωdet∇u(x) dx, u ∈ W1,p

0 (Ω;Rd).

Then, we can argue using Stokes’ Theorem (this can also be computed in a more elementaryway, see Lemma 3.8 below):

F [u] =∫

Ωdet∇u dx =

∫Ω

du1 ∧·· ·∧dud =∫

∂Ωu1 ∧du2 ∧·· ·∧dud = 0

because u ∈ W1,d0 (Ω;Rd) is zero on the boundary ∂Ω. Thus, F is in fact constant on

W1,p0 (Ω;Rd), hence trivially weakly lower semicontinuous. However, the determinant func-

tion is far from being convex if d ≥ 2; for instance we can easily convex-combine a matrixwith positive determinant from two singular ones. But we can also find examples not in-volving singular matrices: For

A :=(−1 −22 1

), B :=

(1 −22 −1

),

12

A+12

B =

(0 −22 0

),

we have det A = detB = 3, but det(A/2+B/2) = 4.Furthermore, convexity of the integrand is not compatible with one of the most funda-

mental principles of continuum mechanics, the frame-indifference or objectivity. Assumethat our integrand f = f (A) is frame-indifferent, that is,

f (RA) = f (A) for all A ∈ Rd×d , R ∈ SO(d),

where SO(d) is the set of (d × d)-orthogonal matrices with determinant 1 (i.e. rotations).Furthermore, assume that every purely compressive or purely expansive deformation costs

Com

men

t...


44 CHAPTER 3. QUASICONVEXITY

energy, i.e.

f (αI)> f (I) for all α = 1, (3.1)

which is very reasonable in applications. Then, f cannot be convex: Let us for simplicityassume d = 2. Set, for a fixed γ ∈ (0,2π),

R :=(

cosγ −sinγsinγ cosγ

)∈ SO(2).

Then, if f was convex, for M := (R+RT )/2 = (cosγ)I we would get

f ((cosγ)I) = f (M)≤ 12(

f (R)+ f (RT ))= f (I),

contradicting (3.1). Sharper arguments are available, but the essential conclusion is thesame: convexity is not suitable for many variational problems originating from continuummechanics.

3.1 Quasiconvexity

We are now ready to replace convexity with a more general notion, which has become oneof the cornerstones of the modern Calculus of Variations. This is the notion of quasicon-vexity, introduced by Charles B. Morrey in the 1950s: A locally bounded Borel-measurablefunction h : Rm×d → R is called quasiconvex if

h(A)≤ −∫

B(0,1)h(A+∇ψ(z)) dz (3.2)

for all A∈Rm×d and all ψ ∈W1,∞0 (B(0,1);Rm). Here, −

∫B(0,1) := |B(0,1)|−1 ∫ , where B(0,1)

is the unit ball in Rd .Let us give a physical interpretation to the notion of quasiconvexity: Let d = m = 3 and

assume that

F [y] :=∫

B(0,1)h(∇y(x)) dx, y ∈ W1,∞(B(0,1);R3),

models the physical energy of an elastically deformed body, whose deformation from thereference configuration is given through y : Ω = B(0,1) → R3 (see Section 1.4 for moredetails on this modeling). A special class of deformations are the affine ones, say a(x) =y0 +Ax for some y0 ∈ R3, A ∈ R3×3. Then, quasiconvexity of f entails that

F [a] =∫

B(0,1)h(A) dx ≤

∫B(0,1)

h(A+∇ψ(z)) dz

for all ψ ∈ W1,∞0 (B(0,1);R3). This means that the affine deformation a is always energet-

ically favorable over the internally distorted deformation a(x)+ψ(x); this is very often a

Com

men

t...


3.1. QUASICONVEXITY 45

reasonable assumption for real materials. This argument also holds on any other domain byLemma 3.2 below.

To justify its name, we also need to convince ourselves that quasiconvexity is indeeda notion of convexity: For A ∈ Rm×d and V ∈ L1(B(0,1);Rm×d) with

∫B(0,1)V (x) dx = 0

define the probability measure µ ∈ M1(Rm×d) via its action as an element of C0(Rm×d)∗ asfollows (recall that M(Rm×d)∼= C0(Rm×d)∗ by the Riesz Representation Theorem A.10):⟨

h,µ⟩

:= −∫

B(0,1)h(A+V (x)) dx for h ∈ C0(Rm×d).

This µ is easily seen to be an element of the dual space to C0(Rm×d), and in fact µ is aprobability measure: For the boundedness we observe |⟨h,µ⟩| ≤ ∥h∥∞, whereas the positiv-ity ⟨h,µ⟩ ≥ 0 for h ≥ 0 and the normalization ⟨1,µ⟩= 1 follow at once. The barycenter [µ]of µ is

[µ] =⟨id,µ

⟩= A+ −

∫B(0,1)

V (x) dx = A.

Therefore, if h is convex, we get from then Jensen inequality, Lemma A.9,

h(A) = h([µ])≤⟨h,µ⟩= −∫

B(0,1)h(A+V (x)) dx.

In particular, (3.2) holds if we set V (x) := ∇ψ(x) for any ψ ∈ W1,∞0 (B(0,1);Rm). Thus, we

have shown:

Proposition 3.1. All convex functions h : Rm×d → R are quasiconvex.

However, quasiconvexity is weaker than classical convexity, at least if d,m ≥ 2. Thedeterminant function and, more generally, minors are quasiconvex, as will be proved in thenext section, but these minors (except for the (1× 1)-minor) are not convex. We will seemany further quasiconvex functions in the course of this and the next chapter. The relationbetween (quasi)convexity and Jensen’s inequality plays a major role in the theory, as wewill see soon.

Other basic properties of quasiconvexity are the following:

Lemma 3.2. In the definition of quasiconvexity we can replace the domain B(0,1) byany bounded Lipschitz domain Ω′ ⊂ Rd . Furthermore, if h has p-growth, i.e. |h(A)| ≤M(1+ |A|p) for some 1 ≤ p < ∞, M > 0, then in the definition of quasiconvexity we canreplace testing with ψ of class W1,∞

0 by ψ of class W1,p0 .

Proof. To see the first statement, we will prove the following: If ψ ∈ W1,p(Ω;Rm), whereΩ ⊂Rd is a bounded Lipschitz domain, satisfies ψ|∂Ω = Ax for some A ∈ Rm×d , then thereexists ψ ∈ W1,p(Ω′;Rm) with ψ|∂Ω′ = Ax and, if one of the following integrals exists andis finite,

−∫

Ωh(∇ψ) dx = −

∫Ω′

h(∇ψ) dy for all measurable h : Rm×d → R. (3.3)

Com

men

t...


In particular, ∥∇ψ∥Lp = ∥∇ψ∥Lp . Clearly, this implies that the definition of quasiconvexityis independent of the domain.

To see (3.3), take a Vitali cover of Ω′ with re-scaled versions of Ω, see Theorem A.8,

Ω′ = Z ⊎N⊎

k=1

Ω(ak,rk), |Z|= 0,

with ak ∈ Ω, rk > 0, Ω(ak,rk) := ak + rkΩ, where k = 1, . . . ,N ∈ N. Then let

ψ(y) :=

rkψ(

y−ak

rk

)+Aak if y ∈ Ω(ak,rk),

0 otherwise,y ∈ Ω′.

It holds that ψ ∈ W1,∞(Ω′;Rm). We compute for any measurable h : Rm×d → R,∫Ω′

h(∇ψ) dy = ∑k

∫Ω(ak,rk)

h(

∇ψ(

y−ak

rk

))dy

= ∑k

rdk

∫Ω

h(∇ψ) dx

=|Ω′||Ω|

∫Ω

h(∇ψ) dx

since ∑k rdk = |Ω′|/|Ω|. This shows (3.3).

The second assertion follows since W1,∞0 (B(0,1);Rm) is dense in W1,p

0 (B(0,1);Rm) andunder a p-growth assumption, the integral functional

ψ 7→ −∫

Ωh(∇ψ) dx, ψ ∈ W1,p

0 (B(0,1);Rm),

is well-defined and continuous, the latter for example by Pratt’s Theorem A.5, see the proofof Theorem 2.30 for a similar argument.

An even weaker notion of convexity is the following one: A locally bounded Borel-measurable function h : Rm×d → R is called rank-one convex if it is convex along anyrank-one line, that is,

h(θA+(1−θ)B

)≤ θh(A)+(1−θ)h(B)

for all A,B ∈ Rm×d with rank(A−B) ≤ 1 and all θ ∈ (0,1). In this context, recall that amatrix M ∈ Rm×d has rank one if and only if M = a⊗b = abT for some a ∈ Rm, b ∈ Rd .

Proposition 3.3. If h : Rm×d → R is quasiconvex, then it is rank-one convex.

Proof. Let A,B ∈ Rm×d with B−A = a⊗ n for a ∈ Rm and n ∈ Rd with |n| = 1. Denoteby Qn a unit cube (|Qn| = 1) centered at the origin and with one side normal to n. ByLemma 3.2 we can require h to satisfy

h(M)≤ −∫

Qn

h(∇ψ(z)) dz (3.4)

Com

men

t...



Figure 3.1: The function φ0.

for all M ∈Rm×d and all ψ ∈ W1,∞(Qn;Rm) with ψ(x) = Mx on ∂Qn (in the sense of trace).For fixed θ ∈ (0,1), we now let M := θA+(1− θ)B and define the sequence of test

functions φ j ∈ W1,∞(Qn;Rm) as follows:

φ j(x) := Mx+1jφ0(

jx ·n−⌊ jx ·n⌋)a, x ∈ Qn,

where ⌊s⌋ denotes the largest integer below or equal to s ∈ R, and

φ0(t) :=

−(1−θ)t if t ∈ [0,θ ],θ t −θ if t ∈ (θ ,1],

t ∈ [0,1).

Such φ j are called laminations in direction n. See Figures 3.1 and 3.2 for φ0 and φ j,respectively.

We calculate

∇φ j(x) =

M− (1−θ)a⊗n = A if jx ·n−⌊ jx ·n⌋ ∈ (0,θ),M+θa⊗n = B if jx ·n−⌊ jx ·n⌋ ∈ (θ ,1).

Thus,

limj→∞

−∫

Qn

h(∇φ j(x)) dx = θh(A)+(1−θ)h(B).

Also notice that φ j∗ a in W1,∞ for a(x) := Mx since φ0 is uniformly bounded. We will

show below that we may replace the sequence (φ j) with a sequence (ψ j) ⊂ W1,∞(Qn;Rm)with the additional property ψ(x) = Mx on ∂Qn, but such that still

limj→∞

−∫

Qn

h(∇ψ j(x)) dx = θh(A)+(1−θ)h(B).

Com

men

t...



Figure 3.2: The laminate φ j.

Combining this with (3.4) allows us to conclude

h(θA+(1−θ)B

)≤ θh(A)+(1−θ)h(B)

and h is indeed rank-one convex.It remains to construct the sequence (ψ j) from (φ j), for which we employ a standard

cut-off construction: Take a sequence (ρ j) ⊂ C∞c (Qn; [0,1]) of cut-off functions such that

for G j :=

x ∈ Ω : ρ j(x) = 1

it holds that |Qn \G j| → 0 as j → ∞. With a(x) = Mx set

ψ j,k := ρ jφk +(1−ρ j)a,

which lies in W1,∞(Qn;Rm) and satisfies ψ j,k(x) = Mx near ∂Qn. Also,

∇ψ j,k = ρ j∇φk +(1−ρ j)M+(φk −a)⊗∇ρ j.

Since W1,∞(Qn;Rm) embeds compactly into L∞(Qn;Rm) by the Rellich-Kondrachov theo-rem (or the classical Arzela–Ascoli theorem), we have φk → a uniformly. Thus, in

∥∇ψ j,k∥L∞ ≤ ∥∇φk∥L∞ + |M|+∥(φk −a)⊗∇ρ j∥L∞

the last term vanishes as k → ∞. As the ∇φk furthermore are uniformly L∞-bounded, wecan for every j ∈N choose k( j) ∈N such that ∥∇ψ j,k( j)∥L∞ is bounded by a constant that isindependent of j. Then, as h is assumed to be locally bounded, this implies that there existsC > 0 (again independent of j) with

∥h(∇φ j)∥L∞ +∥h(∇ψ j,k( j))∥L∞ ≤C.

Com

men

t...



Hence, for ψ j := ψ j,k( j), we may estimate

limj→∞

∫Qn

|h(∇φ j)−h(∇ψ j)| dx = limj→∞

∫Qn\G j

|h(∇φ j)|+ |h(∇ψ j,k( j))| dx

≤ limj→∞

C|Qn \G j|= 0.

This shows that above we may replace (φ j) by (ψ j).

Since for d = 1 or m = 1, rank-one convexity obviously is equivalent to convexity, wehave for the scalar case:

Corollary 3.4. If h : Rm×d → R is quasiconvex and d = 1 or m = 1, then h is convex.

The following is a standard example:

Example 3.5 (Alibert–Dacorogna–Marcellini). For d = m = 2 and γ ∈ R define

hγ(A) := |A|2(|A|2 −2γ det A

), A ∈ R2×2.

For this function it is known that

• hγ is convex if and only if |γ| ≤ 2√

23

≈ 0.94,

• hγ is rank-one convex if and only if |γ| ≤ 2√3≈ 1.15,

• hγ is quasiconvex if and only if |γ| ≤ γQC for some γQC ∈(

1,2√3

].

It is currently unknown whether γQC = 2√3. We do not prove these statements here, see

Section 5.3.8 in [25] for the (involved!) details.

We end this section with the following regularity theorem for rank-one convex (andhence also for quasiconvex) functions:

Proposition 3.6. If h : Rm×d → R is rank-one convex (and locally bounded), then it islocally Lipschitz continuous.

Proof. We will prove the stronger quantitative bound

lip(h;B(A0,r))≤√

min(d,m) · osc(h;B(A0,6r))3r

, (3.5)

where

lip(h;B(A0,r)) := supA,B∈B(A0,r)

|h(A)−h(B)||A−B|

Com

men

t...


is the Lipschitz constant of h on the ball B(A0,r)⊂ Rm×d , and

osc(h;B(A0,r)) := supA,B∈B(A0,r)

|h(A)−h(B)|

is the oscillation of h on B(A0,r). By the local boundedness, the oscillation is bounded onevery ball, and so, locally, the Lipschitz constant is finite.

To show (3.5), let A,B ∈ B(x0,r) and assume first that rank(A−B)≤ 1. Define the pointC ∈ Rm×d as the intersection of ∂B(A0,2r) with the ray starting at B and going through A.Then, because h is convex along this ray,

|h(A)−h(B)||A−B|

≤ |h(C)−h(B)||C−B|

≤ osc(h;B(A0,2r))r

=: α(2r). (3.6)

For general A,B ∈ B(A0,r), use the (real) singular value decomposition to write

B−A =min(d,m)

∑i=1

σiP(ei ⊗ ei)QT ,

where σi ≥ 0 is the i’th singular value, and P ∈ Rm×m, Q ∈ Rd×d are orthogonal matrices.Set

Ak := A+k−1

∑i=1

σiP(ei ⊗ ei)QT , k = 1, . . . ,min(d,m)+1,

for which we have A1 = A, Amin(d,m)+1 = B. We estimate (recall that we are employing the

Frobenius norm |M| :=√

∑ j,k(Mjk )

2 =√

∑i σi(M)2)

|Ak −A0| ≤ |A−A0|+

√k−1

∑i=1

σ 2i ≤ |A−A0|+ |B−A|< 3r

andmin(d,m)

∑k=1

|Ak −Ak+1|2 =min(d,m)

∑k=1

σ2i = |A−B|2.

Applying (3.6) to Ak,Ak+1 ∈ B(A0,3r), k = 1, . . . ,min(d,m), we get

|h(A)−h(B)| ≤min(d,m)

∑k=1

|h(Ak)−h(Ak+1)|

≤ α(6r)min(d,m)

∑k=1

|Ak −Ak+1|

≤ α(6r)√

min(d,m) ·

[min(d,m)

∑k=1

|Ak −Ak+1|2]1/2

= α(6r)√

min(d,m) · |A−B|.

This yields (3.5).

Com

men

t...

3.2. NULL-LAGRANGIANS 51

Remark 3.7. An improved argument (see Lemma 2.2 in [12]), where one orders the sin-gular values in a particular way, allows to establish the better estimate

lip(h;B(A0,r))≤√

min(d,m) · osc(h;B(A0,2r))r

.

3.2 Null-Lagrangians

The determinant is quasiconvex, but it is only one representative of a larger class of canon-ical examples of quasiconvex, but not convex, functions: In this section, we will investigatethe properties of minors (subdeterminants) as integrands. Let for r ∈ 1,2, . . . ,min(d,m),

I ∈ P(m,r) :=(i1, i2, . . . , ir) ∈ 1, . . . ,mr : i1 < i2 < · · ·< ir

and J ∈ P(d,r) be ordered multi-indices. Then, a (r× r)-minor M : Rm×d → R is a func-tion of the form

M(A) = MIJ(A) := det

(AI

J),

where AIJ is the (square) matrix consisting of the I-rows and J-columns of A. The number

r ∈ 1, . . . ,min(d,m) is called the rank of the minor M.The first result shows that all minors are null-Lagrangians, which by definition is the

class of integrands h : Rm×d such that∫

Ω h(∇u) dx only depends on the boundary values ofu.

Lemma 3.8. Let M : Rm×d → R be a (r × r)-minor, r ∈ 1, . . . ,min(d,m). If u,v ∈W1,p(Ω;Rm), p ≥ r, with u− v ∈ W1,p

0 (Ω;Rm), then∫Ω

M(∇u) dx =∫

ΩM(∇v) dx.

Proof. In all of the following, we will assume that u,v are smooth and supp(u− v) ⊂⊂ Ω,which can be achieved by approximation (incorporating a cut-off procedure), see Theo-rem. A.19. We also need that taking the minor M of the gradient commutes with strongconvergence, i.e. the strong continuity of u 7→ M(∇u) in W1,p for p ≥ r; this follows byHadamard’s inequality |M(A)| ≤ |A|r and Pratt’s Theorem A.5. Finally, we assume (againby approximation) that supp(u− v)⊂⊂ Ω.

All first-order minors are just the entries of ∇u,∇v and the result follows from∫Ω

∇u dx =∫

∂Ωu ·n dσ =

∫∂Ω

v ·n dσ =∫

Ω∇v dx

since u− v ∈ W1,p0 (Ω;Rm).

For higher-order minors, the crucial observation is that minors of gradients can be writ-ten as divergences, which we will establish below. So, if M(∇u) = divG(u,∇u) for someG(u,∇u) ∈ C∞(Ω;Rd), then, since supp(u− v)⊂⊂ Ω,∫

ΩM(∇u) dx =

∫∂Ω

G(u,∇u) ·n dσ =∫

∂ΩG(v,∇v) ·n dσ =

∫Ω

M(∇v) dx

Com

men

t...


and the result follows.We first consider the physically most relevant cases d = m ∈ 2,3.For d = m = 2 and u = (u1,u2)T , the only second-order minor is the Jacobian determi-

nant and we easily see from the fact that second derivatives of smooth functions commutethat

det∇u = ∂1u1∂2u2 −∂2u1∂1u2

= ∂1(u1∂2u2)−∂2(u1∂1u2)

= div(u1∂2u2,−u1∂1u2).

For d = m = 3, consider a second-order minor M¬k¬l (A), i.e. the determinant of A after

deleting the k’th row and l’th column. Then, analogously to the situation in dimension two,we get, using cyclic indices k, l ∈ 1,2,3,

M¬k¬l (∇u) = (−1)k+l[∂l+1uk+1∂l+2uk+2 −∂l+2uk+1∂l+1uk+2]

= (−1)k+l[∂l+1(uk+1∂l+2uk+2)−∂l+2(uk+1∂l+1uk+2)]. (3.7)

For the three-dimensional Jacobian determinant, we will show

det∇u =3

∑l=1

∂l(u1(cof∇u)1

l), (3.8)

where we recall that (cofA)kl = (−1)k+lM¬k

¬l (A). To see this, use the Cramer formula(det A)I = A(cofA)T , which holds for any matrix A ∈ Rn×n, to get

det∇u =3

∑l=1

∂lu1 · (cof∇u)1l .

Then, our formula (3.8) follows from the Piola identity

divcof∇u = 0, (3.9)

which can be verified directly from the expression (3.7) for M¬k¬l (∇u). Hence, (3.8) holds.

For general dimensions d,m we use the notation of differential geometry to tame the mul-tilinear algebra involved in the proof. So, let M be a (r× r)-minor. Reordering x1, . . . ,xd

and u1, . . . ,um, we can assume without loss of generality that M is a principal minor, i.e.M is the determinant of the top-left (r× r)-submatrix. Then, in differential geometry it iswell-known that

M(∇u)dx1 ∧·· ·∧dxd = du1 ∧·· ·∧dur ∧dxr+1 ∧·· ·∧dxd

= d(u1 ∧du2 ∧·· ·∧dur ∧dxr+1 ∧·· ·∧dxd).

Thus, Stokes Theorem from Differential Geometry gives∫Ω

M(∇u)dx1 ∧·· ·∧dxd =∫

Ωd(u1 ∧du2 ∧·· ·∧dur ∧dxr+1 ∧·· ·∧dxd)

=∫

∂Ωu1 ∧du2 ∧·· ·∧dur ∧dxr+1 ∧·· ·∧dxd .

Therefore,∫

Ω M(∇u)dx1 ∧·· ·∧dxd only depends on the values of u around ∂Ω.

Com

men

t...


3.2. NULL-LAGRANGIANS 53

Consequently, we immediately have:

Corollary 3.9. All (r× r)-minors M : Rm×d →R are quasiaffine, that is, both M and −Mare quasiconvex.

We will later use this property to define a large and useful class of quasiconvex func-tions, namely the polyconvex ones; this is the topic of the next chapter.

Further, minors enjoy a surprising continuity property:

Lemma 3.10 (Weak continuity of minors). Let M : Rm×d → R be an (r× r)-minor, r ∈1, . . . ,min(d,m), and (u j)⊂ W1,p(Ω;Rm) where r < p ≤ ∞. If

u j u in W1,p (or u j∗ u in W1,∞).

then

M(∇u j) M(∇u) in Lp/r (or M(∇u j)∗ M(∇u) in W1,∞).

Proof. For d = m ∈ 2,3, we use the special structure of minors as divergences, as ex-hibited in the proof of Lemma 3.8. For a (2× 2)-minor M¬k

¬l in three dimensions (that is,deleting the k’th row and l’th column; in two dimensions there is only one (2× 2)-minor,the determinant, but we still use the same notation), we rely on (3.7) to observe that withcyclic indices k, l ∈ 1,2,3,∫

ΩM¬k

¬l (∇u j)ψ dx =−(−1)k+l∫

Ω(uk+1

j ∂l+2uk+2j )∂l+1ψ − (uk+1

j ∂l+1uk+2j )∂l+2ψ dx

for all ψ ∈ C∞c (Ω) and then by density also for all ψ ∈ Lp/r(Ω)∗ ∼= Lp/(p−r)(Ω). Since

u j u in W1,p, we have u j → u strongly in Lp. The above expression consists of productsof one Lp-weakly and one Lp-strongly convergent factor as well as a uniformly boundedfunction under the integral. Hence the integral is weakly continuous and as j →∞ convergesto ∫

ΩM¬k

¬l (∇u)ψ dx.

This finishes the proof for d = m = 2.For d = m = 3, we additionally need to consider the determinant. However, as a conse-

quence of the two-dimensional reasoning, cof∇u j cof∇u in Lp/2. Then, (3.8) implies∫Ω

det∇u j ψ dx =−3

∑l=1

∫Ω

[u1

j(cof∇u j)1l]∂lψ dx

and by a similar reasoning as above, this converges to

−3

∑l=1

∫Ω

[u1(cof∇u)1

l]∂lψ dx =

∫Ω

det∇uψ dx.

In the general case, one proceeds by induction. We omit the details.

Com

men

t...


It can also be shown that any quasiaffine function can be written as an affine functionof all the minors of the argument. This characterization of quasiaffine functions is due toBall [5], a different proof (also including further characterizing statements) can be found inTheorem 5.20 of [25].

3.3 Young measures

We now introduce a very versatile tool for the study of minimization problems – the Youngmeasure, named after its inventor Laurence Chisholm Young. In this chapter, we will onlyuse it as a device to organize the proof of lower semicontinuity for quasiconvex integralfunctions, see Theorems 3.25, 3.29, but later it will also be important as an object in its ownright.

Let us motivate this device by the following central issue: Assume that we have a weaklyconvergent sequence v j v in L2(Ω), where Ω ⊂ Rd is a bounded Lipschitz domain, andan integral functional

F [v] :=∫

Ωf (x,v(x)) dx

with f : Ω×R→ R continuous and bounded, say. Then, we may want to know the limit ofF [v j] as j → ∞, for instance if (v j) is a minimizing sequence. In fact, this will be corner-stone of the proof of weak lower semicontinuity for integral functionals with quasiconvexintegrand.

Equivalently, we could ask for the L∞-weak* limit of the sequence of compound func-tions g j(x) := f (x,v j(x)). It is easy to see that the weak* limit in general is not equal tox 7→ f (x,v(x)). For example, in Ω = (0,1), consider the sequence

v j(x) :=

a if jx−⌊ jx⌋ ≤ θ ,b otherwise,

x ∈ Ω,

where a,b ∈R, θ ∈ (0,1), and ⌊s⌋ is the largest integer below s ∈R. Then, if f (x,a) = α ∈R and f (x,b) = β ∈ R (and smooth and bounded elsewhere), we see v j

∗ θa+(1− θ)b

and

f (x,v j(x))∗ h(x) = θα +(1−θ)β .

The right-hand side, however, is not necessarily f (x,θa+ (1− θ)b). Instead, we couldwrite

h(x) =∫

f (x,v) dνx(v)

with a x-parametrized family of probability measures νx ∈ M1(R), namely

νx = θδα +(1−θ)δβ , x ∈ (0,1).

This family (νx)x∈Ω is the “Young measure generated by the sequence (v j)”.We start with the following fundamental result:

Com

men

t...


3.3. YOUNG MEASURES 55

Theorem 3.11 (Fundamental theorem of Young measure theory). Let (Vj)⊂Lp(Ω;RN)be a bounded sequence, where 1 ≤ p ≤ ∞. Then, there exists a subsequence (not relabeled)and a family (νx)x∈Ω ⊂ M1(RN) of probability measures, called the p-Young measure gen-erated by the sequence (Vj), such that the following assertions are true:

(I) The family (νx) is weakly* measurable, that is, for all Caratheodory integrandsf : Ω×RN → R, the compound function

x 7→∫

f (x,A) dνx(A), x ∈ Ω,

is Lebesgue-measurable.

(II) If 1 ≤ p < ∞, it holds that⟨⟨| q|p,ν⟩⟩ :=

∫Ω

∫|A|p dνx(A) dx < ∞

or, if p = ∞, there exists a compact set K ⊂ RN such that

suppνx ⊂ K for a.e. x ∈ Ω.

(III) For all Caratheodory integrands f : Ω×RN → R such that the family ( f (x,Vj(x))) j

is uniformly L1-bounded and equiintegrable, it holds that

f (x,Vj(x))

(x 7→

∫f (x,A) dνx(A)

)in L1(Ω). (3.10)

For parametrized measures ν = (νx)x∈Ω that satisfy (I) and (II) above, we write ν =(νx) ∈ Yp(Ω;RN), the latter being called the set of p-Young measures. In symbols weexpress the generation of a Young measure ν by as sequence (Vj) as

VjY→ ν .

Furthermore, the convergence (3.10) can equivalently be expressed as (absorb a test functionfor weak convergence into f )∫

Ωf (x,Vj(x)) dx →

∫Ω

∫f (x,A) dνx(A) dx =:

⟨⟨f ,ν⟩⟩.

In this context, recall from the Dunford–Pettis Theorem A.13 that ( f (x,Vj(x))) j is equiin-tegrable if and only it is weakly relatively compact in L1(Ω).

Remark 3.12. Some assumptions in the fundamental theorem can be weakened: For theexistence of a Young measure, we only need to require limK↑∞ sup j ||Vj| ≥ K|= 0, whichfor example follows from sup j

∫Ω |Vj|p dx < ∞ for some p > 0. Of course, in this case (II)

needs to be suitably adapted.

Com

men

t...

Proof. We associate with each Vj an elementary Young measure (δV j) ∈ Yp(Ω;RN) asfollows:

(δV j)x := δV j(x) for a.e. x ∈ Ω. (3.11)

Below we will show the following Young measure compactness principle, which is alsouseful in other contexts. We only formulate the principle for 1 ≤ p < ∞ for reasons ofconvenience, but it holds (suitably adapted) also for p = ∞.

Let (ν( j)) j ⊂ Yp(Ω;RN) be a sequence of p-Young measures such that

sup j⟨⟨| q|p,ν( j)⟩⟩= sup j

∫Ω

∫|A|p dν( j)

x (A) dx < ∞. (3.12)

Then, there exists a subsequence of the (ν j) (not relabeled) and ν ∈ Yp(Ω;RN)such that

limj→∞

⟨⟨f ,ν( j)⟩⟩= ⟨⟨ f ,ν

⟩⟩(3.13)

for all Caratheodory f : Ω×RN → R for which the sequence of functions x 7→⟨ f (x, q),ν( j)

x ⟩ is uniformly L1-bounded and the equiintegrability condition

sup j⟨⟨

f (x,A)1| f (x,A)|≥m,ν( j)⟩⟩ → 0 as m → ∞. (3.14)

holds. Moreover,⟨⟨| q|p,ν⟩⟩≤ liminf

j→∞⟨⟨| q|p,ν( j)⟩⟩.

In the theorem, our assumption of Lp-boundedness of (Vj) corresponds to (3.12), so (I)–(III) follow immediately from the compactness principle applied to our sequence (δV j) j ofelementary Young measures from (3.11) above once we realize that for the sequence ofYoung measures (δV j) j the condition (3.14) corresponds precisely to the equiintegrabilityof ( f (x,Vj(x))) j.

It remains to show the compactness principle. To this end define the measures

µ( j) := L dx Ω⊗ν( j)

x ,

which is just a shorthand notation for the Radon measures µ( j) ∈ M(Ω×RN) ∼= C0(Ω×RN)∗ defined through their action⟨

f ,µ( j)⟩= ∫Ω

∫f (x,A) dν( j)

x (A) dx for all f ∈ C0(Ω×RN).

So, for the ν( j) = δV j from (3.11), we recover the familiar form

⟨f ,µ( j)⟩= ∫

Ωf (x,Vj(x)) dx for all f ∈ C0(Ω×RN),

Com

men

t...


but in the compactness principle we might be in a more general situation.Clearly, every so-defined µ( j) is a positive measure and∣∣⟨ f ,µ( j)⟩∣∣≤ |Ω| · ∥ f∥∞,

whereby the (µ( j)) constitute a uniformly bounded sequence in the dual space C0(Ω×RN)∗.Thus, by the Sequential Banach–Alaoglu Theorem A.12, we can select a (non-relabeled)subsequence such that with a µ ∈ C0(Ω×RN)∗ it holds that⟨

f ,µ( j)⟩→ ⟨f ,µ⟩

for all f ∈ C0(Ω×RN). (3.15)

Next we will show that µ can be written as µ = L dx Ω⊗νx for a weakly* measurable

parametrized family ν = (νx)x∈Ω ⊂ M1(RN) of probability measures.

Theorem 3.13 (Disintegration). Let µ ∈ M(Ω×RN) be a finite Radon measure. Then,there exists a weakly* measurable family (νx)x∈Ω ⊂ M1(RN) of probability measures suchthat with κ(A) := µ(A×RN),

µ = κ(dx)⊗νx,

that is,⟨f ,µ⟩=∫

Ω

∫f (x,A) dνx(A) dκ(x) for all f ∈ C0(Ω×RN).

A proof of this measure-theoretic result can be found in Theorem 2.28 of [2].In our situation, we have for f (x,v) := φ(x) with arbitrary φ ∈ C0(Ω) that⟨

φ ,µ⟩= lim

j→∞

⟨φ,µ( j)⟩= ∫

Ωφ(x) dx,

and so κ = L d Ω. Hence, the Disintegration Theorem immediately yields the weak*measurability of (νx) and thus lim j→∞ ⟨⟨ f ,ν( j)⟩⟩ =

⟨⟨f ,ν⟩⟩

for f ∈ C0(Ω ×RN), whichis (3.13).

Next, we show (3.13) for the case of f merely Caratheodory and bounded and suchthat there exists a compact set K ⊂RN with supp f ⊂ Ω×K. We invoke the Scorza DragoniTheorem 2.5 to get an increasing sequence of compact sets Sk ⊂⊂Ω (k ∈N) with |Ω\Sk| ↓ 0such that f |Sk×RN is continuous. Let fk ∈ C(Ω×RN) be an extension of f |Sk×RN to all ofΩ×RN with the property that the fk are uniformly bounded. Indeed, since K is boundedand | fk(x,A)| ≤ M(1+ |A|p), f is bounded and hence we may require uniform boundednessof the fk.

Now, since fk ∈ C0(Ω×RN), (3.15) implies⟨fk(x, q),ν( j)

x⟩⟨

fk(x, q),νx⟩

in L1 as j → ∞.

Thus, we also get⟨f (x, q),ν( j)

x⟩⟨

f (x, q),νx⟩

in L1(Sk) as j → ∞.

Com

men

t...


On the other hand,∫Ω

∣∣⟨ f (x, q),ν( j)x⟩−1Sk

⟨f (x, q),ν( j)

x⟩∣∣ dx ≤

∫Ω\Sk

∣∣⟨ f (x, q),ν( j)x⟩∣∣ dx

and this converges to zero as k → ∞, uniformly in j, by the uniform boundedness of fk; thesame estimate holds with ν in place of ν( j). Therefore, we may conclude⟨

f (x, q),ν( j)x⟩⟨

f (x, q),νx⟩

in L1(Ω) as j → ∞,

which directly implies (3.13) (for f with supp f ⊂ Ω×K).Finally, to remove the restriction of compact support in A, we remark that it suffices

to show the assertion under the additional constraint that f ≥ 0 by considering the positiveand negative part separately. Then, in case f positive, choose for any m ∈ N a functionρm ∈ C∞

c (RN ; [0,1]) with ρm ≡ 1 on B(0,m) and suppρm ⊂ B(0,2m). Set

f m(x,A) := ρm(|A|p/2)ρm( f (x,A)) f (x,A).

Then,

E j,m :=∫

Ω

⟨f (x, q),ν( j)

x⟩−⟨

f m(x, q),ν( j)x⟩

dx

≤∫

Ω

∫ [1−ρm(|A|p/2)ρm( f (x,A))

]f (x,A) dν( j)

x (A) dx

≤∫∫

|A| ≥ m2/p or f (x,A)≥ mf (x,A) dν( j)

x (A) dx

≤ M∫

Ω

∫|A|≥m2/p

m dν( j)x (A) dx+

∫ f (x,A)≥m

f (x,A) dν( j)x (A) dx

≤ Mm

sup j⟨⟨|A|p,ν( j)⟩⟩+ sup j

⟨⟨f (x,A)1| f (x,A)|≥m,ν( j)⟩⟩.

Both of these terms converge to zero as m → ∞ by the assumption from the compactnessprinciple, in particular (3.12) and the equiintegrability condition (3.14).

As the Young measure convergence holds for f m by the previous proof step, we get

limsupj→∞

(⟨⟨f ,ν( j)⟩⟩−⟨⟨ f m,ν

⟩⟩)= limsup

j→∞

(⟨⟨f m,ν( j)⟩⟩−⟨⟨ f m,ν

⟩⟩)+E j,m

)≤ sup j E j,m.

Then, since f m ↑ f and sup j E j,m vanishes as m → ∞ and f ≥ 0, we get that indeed⟨⟨f ,ν( j)⟩⟩→ ⟨⟨

f ,ν⟩⟩

as j → ∞.

The last assertion to be shown in the compactness principle is, for 1 ≤ p < ∞,⟨⟨| q|p,ν⟩⟩≤ liminf

j→∞

⟨⟨| q|p,ν( j)⟩⟩.

Com

men

t...

For K ∈ N define |A|K := min|A|,K ≤ |A|. Then, by the Young measure representationfor equiintegrable integrands,

liminfj→∞

⟨⟨| q|p,ν( j)⟩⟩≥ lim

j→∞

⟨⟨| q|pK ,ν( j)⟩⟩= ⟨⟨| q|pK ,ν⟩⟩ for all K ∈ N.

We conclude by letting K ↑ ∞ and using the monotone convergence lemma.

Remark 3.14. The Fundamental Theorem can also be proved in a more functional analyticway as follows: Let L∞

w∗(Ω;M(RN)) be the set of essentially bounded weakly* measurablefunctions defined on Ω with values in the Radon measures M(RN). It turns out (see forexample [7] or [30, 31]) that L∞

w∗(Ω;M(RN)) is the dual space to L1(Ω;C0(RN)). One canshow that the maps ν( j) = (x 7→ ν( j)

x ) form a uniformly bounded set in L∞w∗(Ω;M(RN)) and

by the Sequential Banach–Alaoglu Theorem A.12 we can again conclude the existence of aweak* limit point ν of the ν( j)’s. The extended representation of limits for Caratheodory ffollows as before.

It is known that every weakly* measurable parametrized measure (νx)x∈Ω ⊂ M1(RN)with (x 7→ ⟨| q|p,νx⟩)∈L1(Ω) (1≤ p<∞) can be generated by a sequence (u j)⊂Lp(Ω;RN)(one starts first by approximating Dirac masses and then uses the approximation of a generalmeasure by linear combinations of Dirac masses, a spatial gluing argument completes thereasoning). Thus, in the definition of Yp(Ω;RN) we did not need to include (III) from theFundamental Theorem.

A Young measure ν = (νx) ∈ Yp(Ω;RN) is called homogeneous if νx = νy for a.e.x,y ∈ Ω; by a mild abuse of notation we then simply write ν in place of νx.

Before we come to further properties of Young measures, let us consider a few exam-ples:

Example 3.15. In Ω := (0,1) define u := 1(0,1/2)−1(1/2,1) and extend this function peri-odically to all of R. Then, the functions u j(x) := u( jx) for j ∈ N (see Figure 3.3) generatethe homogeneous Young measure ν ∈ Y∞((0,1);R) with

νx =12

δ−1 +12

δ+1 for a.e. x ∈ (0,1).

Indeed, for φ ⊗ h ∈ C0((0,1)×R) we have that φ is uniformly continuous, say |φ(x)−φ(y)| ≤ω(|x−y|) with a modulus of continuity ω : [0,∞)→ [0,∞) (that is, ω is continuous,increasing, and ω(0) = 0). Then,

limj→∞

∫ 1

0φ(x)h(u j(x)) dx = lim

j→∞

j−1

∑k=0

(∫ (k+1)/ j

k/ jφ(k/ j)h(u j(x)) dx+

ω(1/ j)∥h∥∞

j

)= lim

j→∞

j−1

∑k=0

1jφ(k/ j)

∫ 1

0h(u(y)) dy

=∫ 1

0φ(x) dx ·

(12

h(−1)+12

h(+1))

since the Riemann sums converge to the integral of φ . The last line implies the assertion.

Com

men

t...


Figure 3.3: An oscillating sequence.

Figure 3.4: Another oscillating sequence.

Example 3.16. Take Ω := (0,1) again and let u j(x)= sin(2π jx) for j ∈N (see Figure 3.4).The sequence (u j) generates the homogeneous Young measure ν ∈ Y∞((0,1);R) with

νx =1

π√

1− y2L 1

y (−1,1) for a.e. x ∈ (0,1),

as should be plausible from the oscillating sequence (there is more mass close to the hori-zontal axis than farther away).

Example 3.17. Take a bounded Lipschitz domain Ω ⊂ R2, let A,B ∈ R2×2 with B−A =a⊗n (this is equivalent to rank(A−B)≤ 1), where a,n ∈ R2, and for θ ∈ (0,1) define

u(x) := Ax+(∫ x·n

0χ(t) dt

)a, x ∈ R2,

where χ := 1∪z∈Z[z,z+1−θ). If we let u j(x) := u( jx)/ j, x ∈ Ω, then (∇u j) generates the

homogeneous Young measure ν ∈ Y∞(Ω;R2) with

νx = θδA +(1−θ)δB for a.e. x ∈ Ω.

This example can also be extended to include multiple scales, cf. [73].

The Fundamental Theorem has many important consequences, two of which are thefollowing:

Com

men

t...


Lemma 3.18. Let (Vj) ⊂ Lp(Ω;RN), 1 1 are weakly relatively compact and it sufficesto identify the limit of any weakly convergent subsequence. From the Dunford–Pettis The-orem A.13 it follows that any such sequence is in fact equiintegrable (in L1). Now simplyapply assertion (III) of the Fundamental Theorem for the integrand f (x,A) := A (or, morepedantically, f i

j(A) := Aij for i = 1, . . . ,d, m = 1, . . . ,m).

The preceding lemma does not hold for p = 1. One example is Vj := j1(0,1/ j), whichconcentrates.

Proposition 3.19. Let (Vj) ⊂ Lp(Ω;RN), 1 ≤ p ≤ ∞, be a bounded sequence generatingthe Young measure ν ∈ Yp(Ω;RN), and let f : Ω×RN → [0,∞) be any Caratheodory inte-grand (not necessarily satisfying the equiintegrability property in (III) of the FundamentalTheorem). Then, it holds that

liminfj→∞

∫Ω

f (x,Vj(x)) dx ≥⟨⟨

f ,ν⟩⟩.

Proof. For M ∈ N define fM(x,A) := min( f (x,A),M). Then, (III) from the FundamentalTheorem is applicable with fM and we get∫

ΩfM(x,Vj(x)) dx →

⟨⟨fM,ν

⟩⟩=∫

Ω

∫fM(x,A) dνx(A) dx.

Since f ≥ fM , we have

liminfj→∞

∫Ω

f (x,Vj(x)) dx ≥⟨⟨

fM,ν⟩⟩.

We conclude by letting M ↑ ∞ and using the monotone convergence lemma.

Another important feature of Young measures is that they allow us to read off whetheror not our sequence converges in measure:

Proposition 3.20. Let (νx) ⊂ M1(RN) be the Young measure generated by a boundedsequence (Vj)⊂ Lp(Ω;RN), 1 ≤ p ≤ ∞. Let furthermore K ⊂ RN be compact. Then,

dist(Vj(x),K)→ 0 in measure if and only if suppνx ⊂ K for a.e. x ∈ Ω.

Moreover,

Vj →V in measure if and only if νx = δV (x) for a.e. x ∈ Ω.

Com

men

t...


Proof. We only show the second assertion, the proof of the first one is similar. Take thebounded, positive Caratheodory integrand

f (x,A) :=|A−V (x)|

1+ |A−V (x)|, x ∈ Ω, A ∈ RN .

Then, for all δ ∈ (0,1), the Markov inequality and the Fundamental Theorem on Youngmeasures yield

limsupj→∞

∣∣x ∈ Ω : f (x,Vj(x))≥ δ∣∣≤ lim

j→∞

1δ

∫Ω

f (x,Vj(x)) dx

=1δ

∫Ω

⟨f (x, q),νx

⟩dx.

On the other hand,∫Ω

⟨f (x, q),νx

⟩dx = lim

j→∞

∫Ω

f(x,Vj(x)

)dx

≤ δ |Ω|+ limsupj→∞

∣∣x ∈ Ω : f (x,Vj(x))≥ δ∣∣.

The preceding estimates show that f (x,Vj(x)) converges to zero in measure if and onlyif ⟨ f (x, q),νx⟩ = 0 for L d-almost every x ∈ Ω, which is the case if and only if νx = δV (x)almost everywhere. We conclude by observing that f (x,Vj(x)) converges to zero in measureif and only if Vj converges to V in measure.

3.4 Gradient Young measures

The most important subclass of Young measures for our purposes is that of Young measuresthat can be generated by a sequence of gradients: Let ν ∈ Yp(Ω;Rm×d) (use RN = Rm×d

in the theory of the last section). We say that ν is a p-gradient Young measure, in sym-bols ν ∈ GYp(Ω;Rm×d), if there exists a bounded sequence (u j)⊂ W1,p(Ω;Rm) such that

∇u jY→ ν , i.e. the sequence (∇u j) generates ν . Note that it is not necessary that every se-

quence that generates ν is a sequence of gradients (which, in fact, is never the case). Wehave already considered examples of gradient Young measures in the previous section, seein particular Example 3.17.

Is it the case that every Young measure is a gradient Young measure? The answer isno, as can be easily seen: Let V ∈ (Lp ∩C∞)(Ω;Rm×d) with curlV (x) = 0 for some x ∈ Ω.Then, νx := δV (x) cannot be a gradient Young measure since for such measures it must hold

that [ν ] is a gradient itself: if there existed a sequence (u j) ⊂ W1,p(Ω;Rm) with ∇u jY→ ν ,

then ∇u j V by Lemma 3.18, and it would follow that curlV = 0. Consequently, for everyν ∈ GYp(Ω;Rm×d) there must exist u ∈ W1,p(Ω;Rm) with [ν ] = ∇u. In this case we callany such u an underlying deformation of ν .

We first show the following technical, but immensely useful, result about gradientYoung measures. Recall that we assume Ω to be open, bounded, and to have a Lipschitzboundary (to make the trace well-defined).

Com

men

t...


3.4. GRADIENT YOUNG MEASURES 63

Lemma 3.21. Let ν ∈ GYp(Ω;Rm×d), 1 < p < ∞, and let u ∈ W1,p(Ω;Rm) be an under-lying deformation of ν , i.e. [ν ] = ∇u. Then, there exists a sequence (u j) ⊂ W1,p(Ω;Rm)such that

u j|∂Ω = u|∂Ω and ∇u jY→ ν .

Furthermore, in addition we can ensure that (∇u j) is p-equiintegrable.

Proof. Step 1. Since Ω is assumed to have a Lipschitz boundary, we can extend a generatingsequence (∇v j) for ν to all of Rd , so we assume (v j)⊂ W1,p(Rd ;Rm) with sup j ∥v j∥W1,p ≤C < ∞.

In the following, we need the maximal function M f : Rd → R∪+∞ of f : Rd →R∪+∞, which is defined as

M f (x0) := supr>0

−∫

B(x0,r)| f (x)| dx, x0 ∈ Rd .

We quote the following two results about the maximal function whose proofs can be foundin [87]:

(i) ∥M f∥Lp ≤C∥M f∥Lp , where C =C(d, p) is a constant.

(ii) For every M > 0 and f ∈ W1,p(Rd ;Rm), the maximal function M f is Lipschitz-continuous on the set M (| f |+ |∇ f |) < M and its Lipschitz-constant is boundedby CM, where C =C(d,m, p) is a dimensional constant.

With this tool at hand, consider the sequence

Vj := M (|v j|+ |∇v j|), j ∈ N.

By (i) above, (Vj) is uniformly bounded in Lp(Ω) and we may select a (non-renumbered)subsequence such that Vj generates a Young measure µ ∈ Yp(Ω).

Next, define for h ∈ N the (nonlinear) truncation operator Th,

Thv :=

v if |v| ≤ h,h v|v| if |v|> h,

v ∈ R,

and let (φl)l ⊂ C∞(Ω) be a countable family that is dense in C(Ω).Then, for fixed h ∈ N, the sequence (ThVj) j is uniformly bounded in L∞(Ω) and so, by

Young measure representation,

limh→∞

limj→∞

∫Ω

φl(x)|ThVj(x)|p dx = limh→∞

∫Ω

φl(x)∫

|Thv|p dµx(v) dx =∫

Ωφl(x)

⟨| q|p,µx

⟩dx,

where in the last step we used the monotone convergence lemma. Thus, for some diagonalsequence Wk := TkVj(k), we have that

limk→∞

∫Ω

φl(x) |Wk(x)|p dx =∫

Ωφl(x)

⟨| q|p,µx

⟩dx for all l ∈ N.

Com

men

t...

Thus, |Wk|p ⟨| q|p,µx⟩ in L1 and |Wk|pk is an equiintegrable family by the Dunford–Pettis Theorem. We can see also the equiintegrability in a more elementary way as follows:For a Borel subset A ⊂ Ω take φ ∈ C∞(Ω) with 1A ≤ φ and estimate

limsupk→∞

∫A|Wk(x)|p dx ≤ lim

k→∞

∫Ω

φ(x) |Wk(x)|p dx =∫

Ωφ(x)

⟨| q|p,µx

⟩dx

by the L1-convergence of |Wk|p to x 7→ ⟨| q|p,µx⟩. The last integral tends to∫

A⟨| q|p,µx⟩ dx asφ ↓ 1A, which vanishes if |A| → 0. Thus the family Wkk is p-equiintegrable.

By (ii) above, vk = v j(k) is Ck-Lipschitz on the set Ek := Wk ≤ k and by a well-known result for Lipschitz functions due to Kirszbraun, we may extend vk to a functionwk : Rd → Rm that is globally Lipschitz continuous with the same Lipschitz constant Ckand wk = vk in Ek. Thus,

|∇wk| ≤CWk in Ω.

Thus, |∇wk|k inherits the p-equiintegrable from Wkk.Moreover, by the Markov inequality,

|Ω\Ek| ≤∥Wk∥p

Lp

kp → 0 as k → ∞.

Therefore, for all φ ∈ C0(Ω) and h ∈ C0(Rm),∫Ω|φh(∇wk)−φh(∇vk)| dx ≤ ∥φh∥∞|Ω\Ek| → 0.

Thus, since all such φ,h determine Young measure convergence (see the proof of the Fun-damental Theorem 3.11), we have shown that (∇wk) generates the same Young measure νas (∇v j) and additionally the family ∇wkk is p-equiintegrable.

Step 2. Since W1,p(Ω;Rm) embeds compactly into the space Lp(Ω;Rm) by the Rellich-Kondrachov Theorem A.18, we have wk → u in Lp. Let moreover (ρ j) ⊂ C∞

c (Ω; [0,1])be a sequence of cut-off functions such that for G j :=

x ∈ Ω : ρ j(x) = 1

it holds that

|Ω\G j| → 0 as j → ∞. For

u j,k := ρ jwk +(1−ρ j)u ∈ W1,pu (Ω;Rm)

we observe ∇u j,k = ρ j∇wk +(1−ρ j)∇u+(wk −u)⊗∇ρ j.We only consider the case p < ∞ in the following; the proof for the case p = ∞ is in fact

easier. For every Caratheodory function f : Ω×Rm×d →R with | f (x,A)| ≤ M(1+ |A|p) wehave

limsupk→∞

∫Ω

∣∣ f (x,∇wk(x))− f (x,∇u j,k(x))∣∣ dx

≤ limsupk→∞

CM∫

Ω\G j

2+2|∇wk|p + |∇u|p + |wk −u|p · |∇ρ j|p dx

≤ limsupk→∞

CM∫

Ω\G j

2+2|∇wk|p + |∇u|p dx,

Com

men

t...

3.4. GRADIENT YOUNG MEASURES 65

where C > 0 is a constant. However, by assumption, the last integrand is equiintegrable andhence

limj→∞

limsupk→∞

∫Ω

∣∣ f (x,∇wk(x))− f (x,∇u j,k(x))∣∣ dx = 0.

We can now select a diagonal sequence u j = u j,k( j) such that (∇u j) generates ν and satisfiesall requirements from the statement of the lemma. Note that the equiintegrability is notaffected by the cut-off procedure.

The following result is now easy to prove:

Lemma 3.22 (Jensen-type inequality). Let ν ∈ GYp(Ω;Rm×d) be a homogeneous Youngmeasure, i.e. νx = ν ∈ M1(Rm×d) for almost every x ∈ Ω, and let B := [ν ] ∈ Rm×d be itsbarycenter (which is a constant). Then, for all quasiconvex functions h : Rm×d → R with|h(A)| ≤ M(1+ |A|p) (M > 0, 1 < p < ∞) it holds that

h(B)≤∫

h dν . (3.16)

Notice that the conclusion of this lemma is trivial (and also holds for general Youngmeasures, not just the gradient ones) if h is convex. Thus, also (3.16) expresses the convexityin the notion of quasiconvexity.

Proof. Let (u j) ⊂ W1,p(Ω;Rm) with ∇u jY→ ν , u j|∂Ω(x) = Bx for x ∈ ∂Ω (in the sense of

trace) and (∇u j) p-equiintegrable, the latter two conditions being realizable by Lemma 3.21.Then, from the definition of quasiconvexity, we get

h(B)≤ −∫

Ωh(∇u j) dx.

Passing to the Young measure limit as j → ∞ on the right-hand side, for which we note thath(∇u j) is equiintegrable by the growth assumption on h, we arrive at

h(B)≤ −∫

Ω

∫h dν dx =

∫h dν,

which already is the sought inequality.

The last result will be of crucial importance in proving weak lower semicontinuity in thenext section. It is remarkable that the converse also holds, i.e. the validity of (3.16) for allquasiconvex h with p-growth precisely characterizes the class of (homogeneous) gradientYoung measures in the class of all (homogeneous) Young measures; this is the content ofthe Kinderlehrer–Pedregal Theorem 5.9, which we will meet in Chapter 5.

Com

men

t...


3.5 Gradient Young measures and rigidity

To round off the introduction to Young measures let us return to the question if every Youngmeasure is a gradient Young measure, but now with the additional requirement that theYoung measure in question is homogeneous. Then, the barycenter is constant and hencealways the gradient of an affine function, so the simple reasoning from the beginning ofthe section no longer applies. Still, there are homogeneous Young measures that are notgradients, but proving that no generating sequence of gradients can be found, is not sotrivial. One possible way would be to show that there is a quasiconvex function h : Rm×d →R such that the Jensen-type inequality of Lemma 3.22 fails; this is, however, not easy.

Let us demonstrate this assertion instead through the so-called (approximate) two-gradient problem: Let A,B ∈Rm×d , θ ∈ (0,1) and consider the homogeneous Young mea-sure

ν := θδA +(1−θ)δB ∈ Y∞(B(0,1);Rm×d). (3.17)

We know from Example 3.17 that for rank(A−B) ≤ 1, ν is a gradient Young measure. Inany case, suppν = A,B and by Proposition 3.20 it follows that dist(Vj,A,B) → 0 in

measure for any generating sequence VjY→ ν .

The following is the first instance of so-called rigidity results, which play a prominentrole in the field:

Theorem 3.23 (Ball–James). Let Ω ⊂Rd be open, bounded, and convex. Also, let A,B ∈Rm×d .

(I) Let u ∈ W1,∞(Ω;Rm) satisfy the exact two-gradient inclusion

∇u ∈ A,B a.e. in Ω.

(a) If rank(A−B)≥ 2, then ∇u = A a.e. or ∇u = B a.e.

(b) If B−A = a⊗ n (a ∈ Rm, n ∈ Rd \ 0), then there exists a Lipschitz functionh : R→ R with h′ ∈ 0,1 almost everywhere, and a constant v0 ∈ Rm such that

u(x) = v0 +Ax+h(x ·n)a.

(II) Let rank(A−B) ≥ 2 and assume that u j ⊂ W1,∞(Ω;Rm) is uniformly bounded andsatisfies the approximate two-gradient inclusion

dist(∇u j,A,B

)→ 0 in measure.

Then,

∇u j → A in measure or ∇u j → B in measure.

Com

men

t...


3.5. GRADIENT YOUNG MEASURES AND RIGIDITY 67

Proof. Ad (I) (a). Assume after a translation that B = 0 and thus rankA ≥ 2. Then, ∇u = Agfor a scalar function g : Ω → R. Mollifying u (i.e. working with uε := ρε ⋆ u instead of uand passing to the limit ε ↓ 0), we may assume that in fact g ∈ C1(Ω).

The idea of the proof is that the curl of ∇u vanishes, expressed as follows: for alli, j = 1, . . . ,d and k = 1, . . . ,m, it holds that

∂i(∇u)kj = ∂i juk = ∂ jiuk = ∂ j(∇u)k

i

where Mkj denotes the element in the k’th row and j’th column of the matrix M. For our

special ∇u = Ag, this reads as

Akj∂ig = Ak

i ∂ jg. (3.18)

Under the assumptions of (I) (a), we claim that ∇g ≡ 0. If otherwise ξ (x) := ∇g(x) = 0 forsome x ∈ Ω, then with ak(x) := Ak

j/ξ j(x) (k = 1, . . . ,m) for any j such that ξ j(x) = 0, whereak(x) is well-defined by the relation (3.18), we have

Akj = ak(x)ξ j(x), i.e. A = a(x)⊗ξ (x).

This, however, is impossible if rankA ≥ 2. Hence, ∇g ≡ 0 and u is an affine function. Thisproperty is also stable under mollification.

Ad (I) (b). As in (I) (a), we assume ∇u = Ag (B = 0) and A = a⊗ n. Pick any b ⊥ n.Then,

ddt

u(x+ tb)∣∣∣∣t=0

= ∇u(x)b = [anT b]g(x) = 0.

This implies that u is constant in direction b. As b ⊥ n was arbitrary and Ω is assumedconvex, u(x) can only depend on x ·n. This implies the claim.

Ad (II). Assume once more that B = 0 and that there exists a (2× 2)-minor M withM(A) = 0. By assumption there are sets D j ⊂ Ω with

∇u j −A1D j → 0 in measure.

Let us also assume that we have selected a subsequence such that

u j∗ u in W1,∞ and 1D j

∗ χ in L∞.

In the following we use that under an assumption of uniform L∞-boundedness conver-gence in measure implies weak* convergence in L∞. Indeed, for any ψ ∈ L1(Ω) and anyε > 0 we have∫

Ω(∇u j −A1D j)ψ dx

≤∣∣x ∈ Ω : |∇u j(x)−A1D j(x)|> ε

∣∣ · ∥∇u j −A1D j∥L∞ · ∥ψ∥L1 + ε∥ψ∥L1

→ 0+ ε∥ψ∥L1

Com

men

t...


As ε > 0 is arbitrary, we have indeed ∇u j −A1D j

∗ 0 in L∞.

Now,

w*-limj→∞

M(∇u j) = w*-limj→∞

M(A1D j) = M(A) ·w*-limj→∞

1D j = M(A)χ .

By a similar argument, ∇u j∗ ∇u = Aχ , and so, by the weak* continuity of minors proved

in Lemma 3.10,

w*-limj→∞

M(∇u j) = M(∇u) = M(Aχ) = M(A)χ2.

Thus, χ = χ2 and we can conclude that there exists D ⊂ Ω such that χ = 1D and ∇u = A1D.Furthermore, combining the above convergence assertions,

∇u j → ∇u = A1D in measure.

Finally, (I) (a) implies that ∇u = A or ∇u = 0 = B a.e. in Ω, finishing the proof.

With this result at hand it is easy to see that our example (3.17) cannot be a gradientYoung measure if rank(A−B)≥ 2: If there was a sequence of gradients ∇u j

Y→ ν , then bythe reasoning above we would have dist(∇u j,A,B)→ 0, but from (II) of the Ball–Jamesrigidity theorem, we would get ∇u j → A or ∇u j → B in measure, either one of which yieldsa contradiction by Proposition 3.20.

3.6 Lower semicontinuity

After our detour through Young measure theory, we now return to the central subject of thischapter, namely to minimization problems of the formF [u] :=

∫Ω

f (x,∇u(x)) dx → min,

over all u ∈ W1,p(Ω;Rm) with u|∂Ω = g,

where Ω ⊂ Rd is a bounded Lipschitz domain, 1 0,

and g ∈ W1−1/p,p(∂Ω;Rm) specifies the boundary values. In Chapter 2 we solved this prob-lem via the Direct Method of the Calculus of Variations, a coercivity result, and, crucially,the Lower Semicontinuity Theorem 2.7. In this section, we recycle the Direct Method andthe coercivity result, but extend lower semicontinuity to quasiconvex integrands; some mo-tivation for this was given at the beginning of the chapter.

Let us first consider how we could approach the proof of lower semicontinuity (it shouldbe clear that the proof via Mazur’s Lemma that we used for the convex lower semicontinuity

Com

men

t...

3.6. LOWER SEMICONTINUITY 69

theorem, does not extend). Assume we have a sequence (u j) ⊂ W1,p(Ω;Rm) such thatu j u in W1,p and we want to show lower semicontinuity of our functional F . If weassume that our (norm-bounded) sequence ∇u j also generates the gradient Young measureν ∈ GYp(Ω;Rm×d), which is true up to a subsequence, and that the sequence of integrands( f (x,∇u j(x))) j is equiintegrable, then we already have a limit:

F [u j]→∫

Ω

∫f (x,A) dνx(A) dx.

This is useful, because it now suffices to show the inequality∫f (x, q) dνx ≥ f (x,∇u(x)) dx

for almost every x ∈ Ω, to conclude lower semicontinuity. Now, this inequality is close tothe Jensen-type inequality (3.16) for gradient Young measures from Lemma 3.22, and wealready explained how this is related to (quasi)convexity. The only difference in comparisonto (3.16) is that here we need to “localize” in x ∈ Ω and make νx a gradient Young measurein its own right. This is accomplished via the fundamental blow-up technique:

Lemma 3.24 (Blow-up technique). Let ν = (νx) ∈ GYp(Ω;Rm×d) be a gradient Youngmeasure. Then, for almost every x0 ∈ Ω, νx0 ∈ GYp(B(0,1);Rm×d) is a homogeneous gra-dient Young measure in its own right.

Proof. We start with a general property of Young measures: Let φkk and hll be count-able dense subsets of C0(Ω) and C0(RN), respectively. Assume furthermore that for auniformly norm-bounded sequence (Vj)⊂ Lp(Ω;RN) and ν ∈ Yp(Ω;RN) we have

limj→∞

∫Ω

φk(x)hl(Vj(x)) dx =∫

Ωφk(x)

∫hl dνx dx for all k, l ∈ N.

Then we claim that VjY→ ν . This is in fact clear once we realize that the set of linear

combinations of the functions fk,l := φk ⊗ hl (that is, fk,l(x,A) := φk(x)hl(A)) is dense inC0(Ω×RN) and testing with such functions determines Young measure convergence, seethe proof of the Fundamental Theorem 3.11.

Now let x0 ∈ Ω be a Lebesgue point of all the functions x 7→⟨hl,νx

⟩(l ∈ N), that is,

limr↓0

∫B(0,1)

∣∣⟨hl,νx0+ry⟩−⟨hl,νx0

⟩∣∣ dy = 0.

By Theorem A.7, almost every x ∈ Ω has this property. Then, at such a Lebesgue point x0,set

v(r)j (y) :=u j(x0 + ry)− [u j]B(x0,r)

r, y ∈ B(0,1),

where [u]B(x0,r) := −∫

B(x0,r) u dx.

Com

men

t...


For φk, hl from the collections exhibited at the beginning of the proof, we get∫B(0,1)

φk(y)hl(∇v(r)j (y)) dy =∫

B(0,1)φk(y)hl(∇u j(x0 + ry)) dy

=1rd

∫B(x0,r)

φk

(x− x0

r

)hl(∇u j(x)) dx

after a change of variables. Letting j → ∞ and then r ↓ 0, we get

limr↓0

limj→∞

∫B(0,1)

φk(y)hl(∇v(r)j (y)) dy = limr↓0

1rd

∫B(x0,r)

φk

(x− x0

r

)⟨hl,νx

⟩dx

= limr↓0

∫B(0,1)

φk(y)⟨hl,νx0+ry

⟩dx

=∫

B(0,1)φk(y)

⟨hl,νx0

⟩dy,

the last convergence following from the Lebesgue point property of x0. Moreover,∫B(0,1)

|∇v(r)j |p dy =∫

B(0,1)|∇u j(x0 + ry)|p dy =

1rd

∫B(x0,r)

|∇u j(x)|p dx

and the last integral is uniformly bounded in j (for fixed r). Denote by λ ∈ M(Ω;Rm) theweak* limit of the measures |∇u j|pL d Ω, which exists after taking a subsequence. If werequire of x0 additionally that

limsupr↓0

λ (B(x0,r))rd < ∞,

which hold at L d-almost every x0 ∈ Ω, then

limsupr↓0

limj→∞

∫B(0,1)

|∇v(r)j |p dy < ∞.

Since also [v(r)j ]B(0,1) = 0, the Poincare inequality from Theorem A.16 yields that there exists

a diagonal sequence w j := vR( j)J( j) that is uniformly bounded in W1,p(B(0,1);Rm) and such

that for all k, l ∈ N,

limj→∞

∫B(0,1)

φk(y)hl(∇w j(y)) dy =∫

B(0,1)φk(y)

⟨hl,νx0

⟩dy.

Therefore, ∇w jY→ νx0 , where we now understand νx0 as a homogeneous (gradient) Young

measure on B(0,1).

The proof of weak lower semicontinuity is then straightforward:

Theorem 3.25. Let f : Ω ×Rm×d → R be a Caratheodory integrand with | f (x,A)| ≤M(1+ |A|p) for some M > 0, 1 < p < ∞. Assume furthermore that f (x, q) is quasiconvex forevery fixed x ∈ Ω. Then the corresponding functional F is weakly lower semicontinuous onW1,p(Ω;Rm).

Com

men

t...

3.6. LOWER SEMICONTINUITY 71

Proof. Let (∇u j) ⊂ W1,p(Ω;Rm) with u j u in W1,p. Assume that (∇u j) generates thegradient Young measure ν = (νx) ∈ GYp(Ω;Rm×d), for which it holds that [ν ] = ∇u. Thisis only true up to a (non-relabeled) subsequence, but if we can establish the lower semi-continuity for every such subsequence it follows that the result also holds for the originalsequence.

From Proposition 3.19 we get

liminfj→∞

∫Ω

f (x,∇u j(x)) dx ≥⟨⟨

f ,ν⟩⟩=∫

Ω


Now, for almost every x ∈ Ω, we can consider νx as a homogeneous Young measure inGYp(B(0,1);Rm×d) by the Blow-up Lemma 3.24. Thus, the Jensen-type inequality fromLemma 3.22 reads∫

f (x,A) dνx(A)≥ f (x,∇u(x)) for a.e. x ∈ Ω.

Combining, we arrive at

liminfj→∞

F [u j]≥ F [u],

which is what we wanted to show.

At this point is is worthwhile to reflect on the role of Young measures in the proof ofthe preceding result, namely that they allowed us to split the argument into two parts: First,we passed to the limit in the functional via the Young measure. Second, we establisheda Jensen-type inequality, which then yielded the lower semicontinuity inequality. It is re-markable that the Young measure preserves exactly the “right” amount of information toserve as an intermediate object.

We can now sum up and prove existence to our minimization problem:

Theorem 3.26. Let f : Ω×Rm×d → R be a Caratheodory integrand such that f satisfiesthe lower growth bound

µ|A|p −µ−1 ≤ f (x,A) for some µ > 0,

the upper growth bound

| f (x,A)| ≤ M(1+ |A|p) for some M > 0,

and is quasiconvex in its second argument. Then, the associated functional F has a mini-mizer over the space W1,p

g (Ω;Rm), where g ∈ W1−1/p,p(∂Ω;Rm).

Proof. This follows directly by combining the Direct Method from Theorem 2.3 with thecoercivity result in Proposition 2.6 and the Lower Semicontinuity Theorem 3.25.

The following result shows that quasiconvexity is also necessary for weak lower semi-continuity; we only state and show this for x-independent integrands, but we note that it alsoholds for x-dependent ones by a localization technique.

Com

men

t...



Proposition 3.27. Let f : Rm×d → R satisfy the p-growth condition.

| f (x,A)| ≤ M(1+ |A|p) for some M > 0.

If the associated functional

F [u] :=∫

Ωf (∇u(x)) dx, u ∈ W1,p(Ω;Rm),

is weakly lower semicontinuous (with or without fixed boundary values), then f is quasi-convex.

Proof. We may assume that B(0,1) ⊂⊂ Ω, otherwise we can translate and rescale the do-main. Let A ∈ Rm×d and ψ ∈ W1,∞

0 (B(0,1);Rm). We need to show

h(A)≤ −∫

B(0,1)h(A+∇ψ(z)) dz.

Take for every j ∈ N a Vitali cover of B(0,1), see Theorem A.8,

B(0,1) = Z ⊎N( j)⊎k=1

B(a( j)k ,r( j)

k ), |Z|= 0,

with a( j)k ∈ B(0,1), r( j)

k ≤ 1/ j (k = 1, . . . ,N( j)), and fix a smooth function h : Ω\B(0,1)→Rm with h(x) = Ax for x ∈ ∂B(0,1) (and h|∂Ω equal to the prescribed boundary values ifthere are any). Define

u j(x) := Ax+ r( j)k ψ

(x−a( j)

k

r( j)k

)if x ∈ B(a( j)

k ,r( j)k ),

and u j := h on Ω\B(0,1). Then, it is not hard to see that u j u in W1,p for

u(x) =

Ax if x ∈ B(0,1),h(x) if x ∈ Ω\B(0,1),

x ∈ Ω.

Thus, the lower semicontinuity yields, after eliminating the constant part of the functionalon Ω\B(0,1),∫

B(0,1)f (A) dx ≤ liminf

j→∞

∫B(0,1)

f (∇u j(x)) dx

= liminfj→∞ ∑

k

∫B(a( j)

k ,r( j)k )

f(

A+∇ψ(

x−a( j)k

r( j)k

))dx

= liminfj→∞ ∑

kr( j)

k

∫B(0,1)

f (A+∇ψ(y)) dy

=∫

B(0,1)f (A+∇ψ(y)) dy

since ∑k r( j)k = 1. This is nothing else than quasiconvexity.

We note that also for quasiconvex integrands, our results about the Euler–Lagrangeequations hold just the same (if we assume the same growth conditions).

Com

men

t...


3.7. INTEGRANDS WITH U-DEPENDENCE 73

3.7 Integrands with u-dependence

One very useful feature of our Young measure approach is that it allows to derive a lowersemicontinuity result for u-dependent integrands with minimal additional effort. So con-sider F [u] :=

∫Ω

f (x,u(x),∇u(x)) dx → min,

over all u ∈ W1,p(Ω;Rm) with u|∂Ω = g,

where Ω ⊂ Rd is a bounded Lipschitz domain, 1 0, (3.19)

and g ∈ W1−1/p,p(∂Ω;Rm).The main idea of our approach is to consider Young measures generated by the pairs

(u j,∇u j) ∈ Rm+md . We have the following result:

Lemma 3.28. Let (u j) ⊂ Lp(Ω;RM) and (Vj) ⊂ Lp(Ω;RN) be norm-bounded sequencessuch that

u j → u pointwise a.e. and VjY→ ν = (νx) ∈ Yp(Ω;RN).

Then, (u j,Vj)Y→ µ = (µx) ∈ Yp(Ω;RM+N) with

µx = δu(x)⊗νx for a.e. x ∈ Ω,

that is,∫Ω

f (x,u j(x),Vj(x)) dx →∫

Ω

∫f (x,u(x),A) dνx(A) dx (3.20)

for all Caratheodory f : Ω×RM ×RN → R satisfying the p-growth bound (3.19).

Proof. By a similar density argument as in the proof of Lemma 3.24, it suffices to showthe convergence (3.20) for f (x,v,A) = φ(x)ψ(v)h(A), where φ ∈ C0(Ω), ψ ∈ C0(RM),h ∈ C0(RN). We already know by assumption

h(Vj)∗(x 7→

⟨h,νx

⟩)in L∞.

Furthermore, ψ(u j)→ ψ(u) almost everywhere and thus strongly in L1 since ψ is bounded.Thus, since the product of a L∞-weakly* converging sequence and a L1-strongly convergingsequence converges itself weakly* in the sense of measures,∫

Ωφ(x)ψ(u j(x))h(Vj(x)) dx →

∫Ω

φ(x)ψ(u(x))⟨h,νx

⟩dx,

which is (3.20) for our special f . This already finishes the proof.

Com

men

t...

The trick of the preceding lemma is that in our situation, where Vj = ∇u jY→ ν ∈

GY(Ω;Rm×d), it allows us to “freeze” u(x) in the integrand. Then, we can apply the Jensen-type inequality from Lemma 3.22 just as we did in Theorem 3.25 to get:

Theorem 3.29. Let f : Ω×Rm ×Rm×d →R be a Caratheodory integrand with p-growth,i.e. | f (x,v,A)| ≤ M(1+ |v|p + |A|p) for some M > 0, 1 < p < ∞. Assume furthermore thatf (x,v, q) is quasiconvex for every fixed (x,v) ∈ Ω×Rm. Then, the corresponding functionalF is weakly lower semicontinuous on W1,p(Ω;Rm).

Remark 3.30. It is not too hard to extend the previous theorem to integrands f satisfyingthe more general growth condition

| f (x,v,A)| ≤ M(1+ |v|q + |A|p)

for a 1 ≤ q < p/(d− p). By the Sobolev Embedding Theorem A.17, u j → u in Lq (strongly)for all such q and thus Lemma 3.28 and then Theorem 3.29 can be suitably generalized.

3.8 Regularity of minimizers

At the end of Section 2.5 we discussed the failure of regularity for vector-valued problems.In particular, we mentioned various counterexamples to regularity. Upon closer inspection,however, it turns out that in all counterexamples, the points where regularity fails form aclosed set. This is not a coincidence.

Another issue was that all regularity theorems discussed so far require (strong) con-vexity of the integrand and our discussion so far has shown that convexity is not a goodnotion for vector-valued problems (however, much of the early regularity theory for vector-valued problems focused on convex problems, we will not elaborate on this aspect here). Toremedy this, we need a new notion, which will take over from strong convexity in regular-ity theory: A locally bounded Borel-measurable function h : Rm×d → R is called strongly(2-)quasiconvex if there exists µ > 0 such that

A 7→ h(A)−µ|A|2 is quasiconvex.

Equivalently, we may require that

µ∫

B(0,1)|∇ψ(z)|2 dz ≤

∫B(0,1)

h(A+∇ψ(z))−h(A) dz

for all A ∈ Rm×d and all ψ ∈ W1,∞0 (B(0,1);Rm).

The most well-known regularity result in this situation is by Lawrence C. Evans from1986 (there is significant overlap with work by Acerbi & Fusco in the same year):

Theorem 3.31 (Evans partial regularity theorem). Let f : Rm×d → R be of class C2,strongly quasiconvex, and assume that there exists M > 0 such that

D2A f (A)[B,B]≤ M|B|2 for all A,B ∈ Rm×d .

Com

men

t...


Let u ∈ W1,2g (Ω;Rm) be a minimizer of F over W1,2

g (Ω;Rm) where g ∈ W1/2,2(∂Ω;Rm).Then, there exists a relatively closed singular set Σu ⊂ Ω with |Σu| = 0 and it holds thatu ∈ C1,α

loc (Ω\Σu) for all α ∈ (0,1).

The proof is long and builds on earlier work of Almgren and De Giorgi in GeometricMeasure Theory.

The result is called a “partial regularity” result because the regularity does not holdeverywhere. It should be noted that while the scalar regularity theory was essentially atheory for PDEs, and hence applies to all solutions of the Euler–Lagrange equations, thisis not the case for the present result: There is no regularity theory for critical points of theEuler–Lagrange equation for a quasiconvex or even polyconvex (see next chapter) integralfunctional. This was shown by Muller & Sverak in 2003 [75] (for the quasiconvex case) andSzekelyhidi Jr. in 2004 [92] (for the polyconvex case). We finally remark that sometimesbetter estimates on the “smallness” of Σu than merely |Σu| = 0 are available. In fact, forstrongly convex integrands it can be shown that the (Hausdorff-)dimension of the singularset is at most d − 2, this is actually within reach of our discussion in Section 2.5, becausewe showed W2,2

loc-regularity, and general properties of such functions imply the conclusion,see Chapter 2 of [44]. In the strongly quasiconvex case, much less is known, but at least forminimizers that happen to be W1,∞ it was established in 2007 by Kristensen & Mingionethat the dimension of the singular set is strictly less than d, see [56]. Many other questionsare open.


The notion of quasiconvexity was first introduced in Morrey’s fundamental paper [69] andlater refined by Meyers in [65]. The results about null-Lagrangians, in particular Lemma 3.8and Lemma 3.10, go back to Morrey [70], Reshetnyak [82] and Ball [5]. Our proof is basedon the idea that certain combinations of derivatives might have good convergence proper-ties even if the individual derivatives do not. This is also pivotal for so-called compensatedcompactness, which originated in work of Tartar [94, 95] and Murat [76, 77], in particularin their celebrated Div–Curl Lemma, see for example the discussion in [35]. It is quiteremarkable that one can prove Brouwer’s fixed point theorem using the fact that minorsare null-Lagrangians, see Theorem 8.1.3 in [36]. Proposition 3.6 is originally due to Mor-rey [70], we follow the presentation in [12]. We also remark that if only r = p, then a weakerversion of Lemma 3.10 is true where we only have convergence of the minors in the senseof distributions, see Theorem 8.20 in [25].

The convexity properties of the quadratic function A 7→ MA : A, where M is a fourth-order tensor (or, equivalently, a matrix in Rmd×md) has received considerable attention be-cause it corresponds to a linear Euler–Lagrange equation. In this case, quasi-convexity andrank-one convexity are the same, and polyconvexity (see next chapter) is also equivalent toquasiconvexity if d = 2 or m = 2, but this does not hold anymore for d,m ≥ 3. These resultstogether with pointers to the vast literature can be found in Section 5.3.2 in [25].

Com

men

t...



A more general result on why convexity is inadmissible for realistic problems in non-linear elasticity can be found in Section 4.8 of [20].

The result that rank-one convex functions are locally Lipschitz continuous is well-known for convex functions, see for example Corollary 2.4 in [34] and an adapted versionfor rank-one convex (even separately convex) functions is in Theorem 2.31 in [25]. Ourproof with a quantitative bound is from Lemma 2.2 in [12].

The Fundamental Theorem of Young Measure Theory 3.11 has seen considerable evolu-tion since its original proof by Young in the 1930s and 1940s [97–99]. It was popularized inthe Calculus of Variations by Tartar [95] and Ball [7]. Further, much more general, versionshave been contributed by Balder [4], also see [19] for probabilistic applications.

The Ball–James Rigidity Theorem 3.23 is from [10], see [73] for further results in thesame spirit.

Lemma 3.21 is an expression of the well-known decomposition lemma from [38], an-other version is in [53].

Many results in this section that are formulated for Caratheodory integrands continue tohold for f : Ω×RN →R that are Borel measurable and lower semicontinuous in the secondargument, so called normal integrands, see [16] and [37].

The linear growth case p = 1 is very special and requires more sophisticated tools, so-called generalized Young measures, see [1, 32, 57, 58].

One issue in the linear-growth case is demonstrated in the following counterpart ofLemma 3.18 for p = 1:

Lemma 3.32. Let (Vj)⊂L1(Ω;RN) be a bounded sequence generating the Young measure(νx)x∈Ω ⊂ M1(RN). Define v(x) := [νx] = ⟨id,νx⟩. Then there exists an increasing sequenceΩk ⊂ Ω with |Ωk| ↑ |Ω| such that Vj v in L1(Ωk;RN) for all k; this is called bitingconvergence.

Once one knows the so-called Chacon Biting Lemma, which asserts that every L1-bounded sequence has a subsequence converging in the biting sense, the preceding lemmais easy to see using (III) from the Fundamental Theorem of Young Measure Theory 3.11.More information on this topic can be found in [14] and some of the references cited therein.

It is possible to prove the Lower Semicontinuity Theorem 3.25 for quasiconvex inte-grands without the use of Young measures, see for instance Chapter 8 in [25] for such anapproach. However, many of the ideas are essentially the same, they are just carried out di-rectly without the intermediate steps of Young measures. However, Young measures allowone to very clearly organize and compartmentalize these ideas, which aids the understand-ing of the material. We will also use Young measures later for the relaxation of non-lowersemicontinuous functionals. More on lower semicontinuity and Young measures can befound in the book [81]. The first use of Young measures to investigate questions of lowersemicontinuity for quasiconvex integral functionals seems to be [50].

The blow-up technique is related to the use of tangent measures in Geometric MeasureTheory, see for example [61].

Com

men

t...


Chapter 4

Polyconvexity

At the beginning of Chapter 3 we saw that convexity cannot hold concurrently with frame-indifference (and a mild positivity condition). Thus, we were lead to consider quasiconvexintegrands. However, while quasiconvexity is of huge importance in the theory of the Cal-culus of Variations, our Lower Semicontinuity Theorem 3.25 has one major drawback: weneeded to require the p-growth bound

| f (x,A)| ≤ M(1+ |A|p) for all x ∈ Ω, A ∈ Rd×d and some M > 0, 1 < p < ∞.

Unfortunately, this is not a realistic assumption for nonlinear elasticity because it ignoresthe fact that infinite compressions should cost infinite energy. Indeed, realistic integrandshave the property that

f (A)→+∞ as det A ↓ 0

and

f (A) = +∞ if det A ≤ 0.

For example, the family of matrices

Aα :=(

1 00 α

), α > 0,

satisfies det Aα ↓ 0 as α ↓ 0, but |Aα | remains uniformly bounded, so the above p-growthbound cannot hold.

The question whether the Lower Semicontinuity Theorem 3.25 for quasiconvex inte-grands can be extended to integrands with the above growth is currently a major unsolvedproblem, see [8]. For the time being, we have to confine ourself to a more restrictive notionof convexity if we want to allow for the above “elastic” growth. This type of convexityis called polyconvexity by John Ball in [5], who proved the corresponding existence theo-rem for minimization problems and enabled a mature mathematical treatment of nonlinearelasticity theory, including the realistic Ogden materials, see Section 1.4 and Example 4.3.

We will only look at the three-dimensional theory because it is by far the most physi-cally important case and because this restriction eases the notational burden; however, otherdimensions can be treated as well, we refer to the comprehensive treatment in [25].

Com

men

t...

78 CHAPTER 4. POLYCONVEXITY

4.1 Polyconvexity

A continuous integrand h : R3×3 →R∪+∞ (now we allow the value +∞) is called poly-convex if it can be written in the form

h(A) = H(A,cofA,det A), A ∈ R3×3,

where H : R3×3×R3×3×R→R∪+∞ is convex. Here, cofA denotes the cofactor matrixas defined in Appendix A.1.

While convexity obviously implies polyconvexity, the converse is clearly wrong. How-ever, we have:

Proposition 4.1. A polyconvex h : R3×3 → R (not taking the value +∞) is quasiconvex.

Proof. Let h be as stated and let ψ ∈ W1,∞(B(0,1);R3) with ψ(x) = Ax on ∂B(0,1). Then,using Jensen’s inequality

−∫

B(0,1)h(∇ψ) dx = −

∫B(0,1)

H(∇ψ,cof∇ψ,det∇ψ) dx

≥ H(−∫

B(0,1)∇ψ dx,−

∫B(0,1)

cof∇ψ dx,−∫

B(0,1)det∇ψ dx

)= H(A,cofA,det A) = h(A),

where for the last line we used the fact that minors are null-Lagrangians as proved inLemma 3.8.

At this point we have seen all four major convexity conditions that play a role in themodern Calculus of Variations. They can be put into a linear order by implications:

convexity ⇒ polyconvexity ⇒ quasiconvexity ⇒ rank-one convexity

While in the scalar case (d = 1 or m = 1) all these notions are equivalent, in higher dimen-sions the reverse implications do not hold in general. Clearly, the determinant function ispolyconvex but not convex. However, the fact that the notions of polyconvexity, quasicon-vexity, and rank-one convexity are all different in higher dimensions, are non-trivial. Thefact that rank-one convexity does not imply quasiconvexity, at least for d ≥ 2, m ≥ 3, is thesubject of Sverak’s famous counterexample, see Example 5.10. The case d = m = 2 is stillopen and one of the major unsolved problems in the field. Here we discuss an example of aquasiconvex, but not polyconvex function:

Example 4.2 (Alibert–Dacorogna–Marcellini). Recall from Example 3.5 the Alibert–Dacorogna–Marcellini function

hγ(A) := |A|2(|A|2 −2γ det A

), A ∈ R2×2.

It can be shown that hγ is polyconvex if and only if |γ| ≤ 1. Since it is quasiconvex if andonly if |γ| ≤ γQC, where γQC > 1, we see that there are quasiconvex functions that are notpolyconvex. See again Section 5.3.8 in [25] for details.

Com

men

t...


4.2. EXISTENCE THEOREMS 79

Example 4.3 (Ogden materials). Functions W of the form

W (F) :=M

∑i=1

ai tr[(FT F)γi/2]+ N

∑j=1

b j tr cof[(FT F)δ j/2]+Γ(detF), F ∈ R3×3,

where ai > 0, γi ≥ 1, b j > 0, δ j ≥ 1, and Γ : R → R∪ +∞ is a convex function withΓ(d) → +∞ as d ↓ 0, Γ(d) = +∞ for s ≤ 0 and good enough growth from below, can beshown to be polyconvex. These stored energy functionals correspond to so-called Ogdenmaterials and occur in a wide range of elasticity applications, see [20] for details.

It can also be shown that in three dimensions convex functions of certain combinationsof the singular values of a matrix (the singular values of A are the square roots of the eigen-values of AT A) are polyconvex, see Section 5.3.3 in [25] for details.

4.2 Existence Theorems

Let Ω ⊂ R3 be a connected bounded Lipschitz domain. In this section we will prove exis-tence of a minimizer of the variational problemF [u] :=

∫Ω

W (x,∇u(x))−b(x) ·u(x) dx → min,

over all u ∈ W1,p(Ω;R3) with det∇u > 0 a.e. and u|∂Ω = g,(4.1)

where W : Ω×R3×3 → R∪+∞ is Caratheodory and W (x, q) is polyconvex for almostevery x ∈ Ω. Thus,

W (x,A) =W(x,A,cofA,det A), (x,A) ∈ Ω×R3×3,

for W : Ω×R3×3 ×R3×3 ×R→ R∪+∞ with W(x, q, q, q) jointly convex and continuousfor almost every x ∈ Ω. As the only upper growth assumptions on W we impose

W (x,A)→+∞ as det A ↓ 0,

W (x,A) = +∞ if det A ≤ 0.(4.2)

Furthermore, we assume as usual that g ∈ W1−1/p,p(∂Ω;R3) and that b ∈ Lq(Ω;R3) where1/p+ 1/q = 1. The exponent p > 1 will remain unspecified for now. Later, when weimpose conditions on the coercivity of W , we will also specify p.

The first lower semicontinuity result is relatively straightforward:

Theorem 4.4. If in addition to the above assumptions it holds that

µ|A|p −µ−1 ≤W (x,A) for some µ > 0 and p > 3, (4.3)

then the minimization problem (4.1) has at least one solution in the space

A :=

u ∈ W1,p(Ω;R3) : det∇u > 0 a.e. and u|∂Ω = g,

whenever this set is non-empty.

Com

men

t...



Proof. We apply the usual Direct Method. For a minimizing sequence (u j) ⊂ A withF [u j]→ min, we get from the coercivity estimate (4.3) in conjunction with the Poincare–Friedrichs inequality (and the fixed boundary values) that for any δ > 0,∫

ΩW (x,∇u j)−b(x) ·u j(x) dx

≥ µ∥∇u j∥pLp −µ−1|Ω|−∥b∥Lq · ∥u j∥Lp

≥ µC∥u j∥p

W1,p −µC∥g∥p

W1−1/p,p −|Ω|µ

− 1δq

∥b∥qLq −

δp∥u j∥p

W1,p .

Choosing δ = pµ/(2C), we arrive at

supj∈N

∥u j∥W1,p ≤C(

supj∈N

F [u j]+1).

Thus, we may select a subsequence (not renumbered) such that u j u in W1,p(Ω;R3).Now, by Lemma 3.10 and p > 3,

det∇u j det∇u in Lp/3 and

cof∇u j cof∇u in Lp/2.

Thus, an argument entirely analogous to the proof of Theorem 2.7 (note that in that theoremwe did not need any upper growth condition) yields that the hyperelastic part of F ,

u 7→∫

ΩW (x,∇u j) dx =

∫ΩW(x,∇u j,cof∇u j,det∇u j) dx

is weakly lower semicontinuous in W1,p. The second part of F is weakly continuous byLemma 2.37. Thus,

F [u]≤ liminfj→∞

F [u j] = infA

F .

In particular, det∇u > 0 almost everywhere and u|∂Ω = g by the weak continuity of thetrace. Hence u ∈ A and the proof is finished.

Unfortunately, the preceding theorem has a major drawback in that p > 3 has to beassumed. In applications in elasticity theory, however, a more realistic form of W is

W (A) =λ2(trE)2 +µ trE2 +O(|E|2), E =

12(AT A− I), (4.4)

where λ ,µ > 0 are the Lame constants. Except for the last term O(|E|2), which vanishesas |E| ↓ 0, this energy corresponds to a so-called St. Venant–Kirchhoff material. It wasshown by Ciarlet & Geymonat [21] (also see Theorem 4.10-2 in [20]) that for any suchLame constants, there exists a polyconvex function W of the form

W (A) = a|A|2 +b|cofA|2 + γ(det A)+ c,

Com

men

t...



where a,b,c > 0 and γ : R → [0,∞] is convex, such that the above expansion (4.4) holds.Clearly, such W has 2-growth in |A|. Thus, we need an existence theorem for functions withthese growth properties.

The core of the refined argument will be an improvement of Lemma 3.10:

Lemma 4.5. Let 1 ≤ p,q,r ≤ ∞ with

p ≥ 2,1p+

1q≤ 1, r ≥ 1,

and assume that the sequence (u j)⊂ W1,p(Ω;R3) satisfies for some H ∈ Lq(Ω;R3×3), d ∈Lr(Ω) that

u j u in W1,p,

cof∇u j H in Lq,

det∇u j d in Lr.

Then, H = cof∇u and d = det∇u.

Proof. The idea is to use distributional versions of the cofactor matrix and Jacobian deter-minant and to show that they agree with the usual definitions for sufficiently regular func-tions. To this aim we will use the representation of minors as divergences already employedin Lemmas 3.8, 3.10.

Step 1. From (3.7) we get that for all φ = (φ1,φ2,φ3) ∈ C1(Ω;R3),

(cof∇φ)kl = ∂l+1(φk+1∂l+2φk+2)−∂l+2(φk+1∂l+1φk+2),

where k, l ∈ 1,2,3 are cyclic indices. Thus, for all ψ ∈ C∞c (Ω),∫

Ω(cof∇φ)k

l ψ dx =−∫

Ω(φk+1∂l+2φk+2)∂l+1ψ − (φk+1∂l+1φk+2)∂l+2ψ dx

=:⟨(Cof∇φ)k

l ,ψ⟩

(4.5)

and we call the quantity “Cof∇φ” the distributional cofactors.To investigate the continuity properties of the distributional cofactors, we consider

G(φ) :=∫

Ω(φk∂lφm)∂nψ dx, φ ∈ W1,p(Ω;R3),

for some k, l,m,n ∈ 1,2,3 and fixed ψ ∈ C∞c (Ω). We can estimate this using the Holder

inequality as follows:

|G(φ)| ≤ ∥φ∥Ls∥∇φ∥Lp∥∇ψ∥∞

whenever

1s+

1p≤ 1. (4.6)

Com

men

t...


In particular this is true for s = p ≥ 2. Thus, since C1(Ω;R3) is dense in W1,p(Ω;R3), wehave that the equality (4.5) also holds for φ ∈ W1,p(Ω;R3) whenever p ≥ 2.

Moreover, the above arguments yield for a sequence (φ j)⊂ W1,p(Ω;R3) that

G(φ j)→ G(φ) if

φ j → φ in Ls,

∇φ j ∇φ in Lp.

The first convergence is automatically satisfied by the compact Rellich–Kondrachov Theo-rem A.18 if

s <

3p

3−p if p < 3,

∞ if p ≥ 3.(4.7)

We can always choose s such that it simultaneously satisfies (4.6) and (4.7), as a quickcalculation shows. Thus,⟨

Cof∇φ j,ψ⟩→⟨Cof∇φ,ψ

⟩if φ j φ in W1,p, p ≥ 2. (4.8)

Step 2. First, if φ = (φ1,φ2,φ3) ∈ C1(Ω;R3), then we know from (3.8) that

det∇φ =3

∑l=1

∂lφ1(cof∇φ)1l =

3

∑l=1

∂l(φ1(cof∇φ)1

l).

Thus, in this case, we have for all ψ ∈ C∞c (Ω) that∫

Ω(det∇φ)ψ dx =

3

∑l=1

∫Ω

∂lφ1(cof∇φ)1l ψ dx =−

3

∑l=1

∫Ω

(φ1(cof∇φ)1

l)∂lψ dx. (4.9)

The key idea now is that by Holder’s inequality this quantity makes sense if only φ ∈Lp(Ω;R3) and cof∇φ ∈ Lq(Ω;R3×3) with 1

p +1q ≤ 1. Analogously to the argument for the

cofactor matrix, this motivates us to define for

φ ∈ Lp(Ω;R3) with cof∇φ ∈ Lq(Ω;R3), where1p+

1q≤ 1,

the distributional determinant⟨Det∇φ,ψ

⟩:=−

3

∑l=1

∫Ω

(φ1(cof∇φ)1

l)∂lψ dx, ψ ∈ C∞

c (Ω).

From (4.9) we see that if φ ∈ C1(Ω;R3), then “det = Det”, i.e.∫Ω(det∇φ)ψ dx =

⟨Det∇φ,ψ

⟩(4.10)

for all ψ ∈ C∞c (Ω). We want to show that this equality remains to hold if merely φ ∈

W1,p(Ω;R3) and cof∇φ ∈ Lq(Ω;R3×3) with p,q as in the statement of the lemma. For themoment fix ψ ∈ C∞

c (Ω) and define for φ ∈ C1(Ω;R3) and F ∈ C1(Ω;R3×3),

Z(φ,F) :=3

∑l=1

∫Ω

∂lφ1F1l ψ +φ1F1

l ∂lψ dx =3

∑l=1

∫Ω

F1l ∂l(φ1ψ) dx.

Com

men

t...

To establish (4.10), we need to show Z(φ,cof∇φ) = 0.Recall the Piola identity (3.9),

divcof∇v = 0 if v ∈ C1(Ω;R3).

As a consequence,

Z(φ,cof∇v) = 0 for φ,v ∈ C1(Ω;R3).

Furthermore, we have the estimate

|Z(φ,F)| ≤ ∥∇φ∥Lp∥F∥Lq∥ψ∥∞ whenever1p+

1q≤ 1.

Thus, we get, via two approximation procedures (first in F = cof∇v, then in φ)

Z(φ,cof∇v) = 0 for φ ∈ W1,p(Ω;R3), cof∇v ∈ Lq(Ω;R3×3).

In particular,

Z(φ,cof∇φ) = 0 for φ ∈ W1,p(Ω;R3), cof∇φ ∈ Lq(Ω;R3×3).

Therefore, (4.10) holds also for φ = u as in the statement of the lemma.Moreover, we see from the definition of the distributional determinant that for a se-

quence (φ j)⊂ W1,p(Ω;R3) we have

⟨Det∇φ j,ψ

⟩→⟨Det∇φ ,ψ

⟩if

φ j → φ in Ls,

cof∇φ j cof∇φ in Lq.

whenever

1s+

1q≤ 1. (4.11)

The first convergence follows, as before, from the Rellich–Kondrachov Theorem A.18 if

s <

3p

3−p if p < 3,

∞ if p ≥ 3.(4.12)

However, since we assumed

1p+

1q≤ 1,

we can always choose s such that it satisfies (4.11) and (4.12) simultaneously, as can beseen by some elementary algebra. Thus,

⟨Det∇φ j,ψ

⟩→⟨Det∇φ ,ψ

⟩if

φ j φ in W1,p,

cof∇φ j cof∇φ in Lq,(4.13)

Com

men

t...

where p,q satisfy the assumptions of the lemma.Step 3. Assume that, as in the statement of the lemma, we are given a sequence (u j)⊂

W1,p(Ω;R3) such that for H ∈ Lq(Ω;R3×3), d ∈ Lr(Ω) it holds thatu j u in W1,p,

cof∇u j H in Lq,

det∇u j d in Lr.

Then, (4.5), (4.10) imply that for all ψ ∈ C∞c (Ω),⟨

Cof∇u j,ψ⟩→⟨H,ψ

⟩and

⟨Det∇u j,ψ

⟩→⟨d,ψ

⟩.

On the other hand, from (4.8), (4.13),⟨Cof∇u j,ψ

⟩→⟨Cof∇u,ψ

⟩and

⟨Det∇u j,ψ

⟩→⟨Det∇u,ψ

⟩.

Thus, ∫Ω(Cof∇u−H)ψ dx = 0 and

∫Ω(Det∇u−d)ψ dx = 0

for all ψ ∈ C∞c (Ω). The Fundamental Lemma of the Calculus of Variations 2.21 then gives

immediately

H = cof∇u and d = det∇u.

This finishes the proof.

With this tool at hand, we can now prove the improved existence result:

Theorem 4.6 (Ball). Let 1 ≤ p,q,r < ∞

p ≥ 2,1p+

1q≤ 1, r > 1,

such that in addition to the assumptions at the beginning of this section, it holds that

W (x,A)≥ µ(|A|p + |cofA|q + |det A|r

)−µ−1 for some µ > 0. (4.14)

Then, the minimization problem (4.1) has at least one solution in the space

A :=

u ∈ W1,p(Ω;R3) : cof∇u ∈ Lq(Ω;R3×3), det∇u ∈ Lr(Ω),

det∇u > 0 a.e. and u|∂Ω = g,


Com

men

t...

4.3. GLOBAL INJECTIVITY 85

Proof. This follows in a completely analogous way to the proof of Theorem 4.4, but nowwe select a subsequence of a minimizing sequence (u j) ⊂ A such that for some H ∈Lq(Ω;R3×3), d ∈ Lr(Ω) we have

u j u in W1,p,

cof∇u j H in Lq,

det∇u j d in Lr,

which is possible by the usual weak compactness results in conjunction with the coercivityassumption (4.14). Thus, Lemma 4.5 yields

u j u in W1,p,

cof∇u j cof∇u in Lq,

det∇u j det∇u in Lr,

and we may argue as in Theorem 4.4 to conclude that u is indeed a minimizer. The proofthat u ∈ A is the same as in Theorem 4.4.

4.3 Global injectivity

For reasons of physical admissibility, we need to additionally prove that we can find a min-imizer that is globally injective, at least almost everywhere (since we work with Sobolevfunctions). There are several approaches to this issue, for example via the topological de-gree. We here present a classical argument by Ciarlet & Necas [22]. It is important tonotice that on physical grounds it is only realistic to expect this injectivity for p > d. Forlower exponents, complex effects such as cavitation and (microscopic) fracture have to beconsidered. This is already expressed in the fact that Sobolev functions in W1,p for p ≤ dare not necessarily continuous anymore.

We will prove the following theorem:

Theorem 4.7. In the situation of Theorem 4.4, in particular p > 3, the minimization prob-lem (4.1) has at least one solution in the space

A :=

u ∈ W1,p(Ω;Rd) : det∇u > 0 a.e., u is injective a.e., u|∂Ω = g,


Proof. Let (u j)⊂ A be a minimizing sequence. The proof is completely analogous to theone of Theorem 4.4, we only need to show in addition that u is injective almost everywhere.

For our choice p > 3, the space W1,p(Ω;R3) embeds continuously into C(Ω;R3), so wehave that

u j → u uniformly.

Com

men

t...


Let U ⊂R3 be any relatively compact open set with u(Ω)⊂⊂U (note that u(Ω) is relativelycompact). Then, u j(Ω)⊂U for j sufficiently large by the uniform convergence. Hence, forsuch j,

|u j(Ω)| ≤ |U |.

Furthermore, since det∇u j converges weakly to det∇u in Lp/3 by Lemma 3.10 and all u j

are injective almost everywhere by definition of the space A ,∫Ω

det∇u dx = limj→∞

∫Ω

det∇u j dx = limj→∞

|u j(Ω)| ≤ |U |.

Letting |U | ↓ |u(Ω)|, we get the Ciarlet–Necas non-interpenetration condition∫Ω

det∇u dx ≤ |u(Ω)|. (4.15)

It is known (see for example [60] or [17]) that even if only u ∈ W1,p(Ω;R3) it holds that∫Ω|det∇u| dx =

∫u(Ω)

H 0(u−1(x′)) dx′,

where we denote by H 0 the counting measure. We estimate

|u(Ω)| ≤∫

u(Ω)H 0(u−1(x′)) dx′ =

∫Ω

det∇u dx ≤ |u(Ω)|,

where we used that det∇u > 0 almost everywhere and (4.15). Thus,

H 0(u−1(x′)) = 1 for a.e. x′ ∈ u(Ω),

which is nothing else than the almost everywhere injectivity of u.


Theorem 4.6 is a refined version by Ball, Currie & Olver [9] of Ball’s original result in [5].Also see [10, 11] for further reading.

Many questions about polyconvex integral functionals remain open to this day: In par-ticular, the regularity of solutions and the validity of the Euler–Lagrange equations arelargely open in the general case. Note that the regularity theory from the previous chapter isnot in general applicable, at least if we do not assume the upper p-growth. These questionsare even open for the more restricted situation of nonlinear elasticity theory. See [8] for asurvey on the current state of the art and a collection of challenging open problems.

As for the almost injectivity (for p > 3), this is in fact sometimes automatic, as shownby Ball [6], but the arguments do not apply to all situations. Tang [93] extended this top > 2, but since then we have to deal with non-continuous functions, it is not even clearhow to define u(Ω). For fracture, there is a huge literature, see [74] and the recent [46],which also contains a large bibliography. A study on the surprising (mis)behavior of thedeterminant for p < d can be found in [52].

Com

men

t...

Chapter 5

Relaxation

When trying to minimize an integral functional over a Sobolev space, it it a very real pos-sibility that there is no minimizer if the integrand is not quasiconvex; of course, even func-tionals with non-quasiconvex integrands can have minimizers, but, generally speaking, weshould expect non-quasiconvexity to be correlated with the non-existence of minimizers.

For instance, consider the functional

F [u] :=∫ 1

0|u(x)|2 +

(|u′(x)|2 −1

)2 dx, u ∈ W1,40 (0,1).

The gradient part of the integrand, a 7→ (a2 − 1)2, is called a double-well potential, seeFigure 5.1. Approximate minimizers of F try to satisfy u′ ∈ −1,+1 as closely as pos-sible, while at the same time staying close to zero because of the first term under the in-tegral. These contradicting requirements lead to minimizing sequences that develop fasterand faster oscillations similar to the ones shown in Figure 3.2 on page 48. It should beintuitively clear that no classical function can be a minimizer of F .

In this situation, we have essentially two options: First, if we only care about the infimalvalue of F , we can compute the relaxation F∗ of F , which (by definition) will be lowersemicontinuous and then find its minimizer. This will give us the infimum of F over alladmissible functions, but the minimizer of F∗ says only very little about the minimizingsequence of our original F since all oscillations (and concentrations in some cases) havebeen “averaged out”.

Second, we can focus on the minimizing sequences themselves and try to find a gener-alized limit object to a minimizing sequence encapsulating “interesting” information aboutthe minimization. This limit object will be a Young measure and in fact this was the originalmotivation for introducing them. Effectively, Young measure theory allows one to replacea minimization problem over a Sobolev space by a generalized minimization problem over(gradient) Young measures that always has a solution. In applications, the emerging os-cillations correspond to microstructure, which is very important for example in materialscience.

In this chapter we will consider both approaches in turn.

Com

men

t...


88 CHAPTER 5. RELAXATION

Figure 5.1: A double-well potential.

5.1 Quasiconvex envelopes

We have seen in Chapter 3 that integral functionals with quasiconvex integrands are weaklylower semicontinuous in W1,p, where the exponent 1 < p < ∞ is determined by growthproperties of the integrand. If the integrand is not quasiconvex, then we would like tocompute the functional’s relaxation, i.e. the largest (weakly) lower semicontinuous functionbelow it. We can expect that when passing from the original functional to the relaxed one,we should also pass from the original integrand to a related but quasiconvex integrand. Forthis, we define the quasiconvex envelope Qh : Rm×d → R∪−∞ of a locally boundedBorel-function h : Rm×d → R as

Qh(A) := inf−∫

ωh(A+∇ψ(z)) dz : ψ ∈ W1,∞

0 (ω;Rm)

, A ∈ Rm×d , (5.1)

where ω ⊂ Rd is a bounded Lipschitz domain. Clearly, Qh ≤ h. By a similar argumentas in Lemma 3.2, one can see that this formula is independent of the choice of ω . If hhas p-growth at infinity, we may replace the space W1,∞

0 (ω;Rm) in the above formula byW1,p

0 (ω;Rm). Finally, by a covering argument as for instance in Proposition 3.27, we mayfurthermore restrict the class of ψ in the above infimum to those satisfying ∥ψ∥L∞ ≤ ε forany ε > 0.

It is well-known that for continuous h the quasiconvex envelope Qh is equal to

Qh = sup

g : g quasiconvex and g ≤ h, (5.2)

see Section 6.3 in [25]. Note that the equality also holds if one (hence both) of the twoexpressions is −∞.

We first need the following lemma:

Com

men

t...

5.1. QUASICONVEX ENVELOPES 89

Lemma 5.1. For continuous h : Rm×d →R with p-growth, |h(A)| ≤ M(1+ |A|p), the qua-siconvex envelope Qh as defined in (5.1) is either identically −∞ or real-valued with p-growth, and quasiconvex.

Proof. Step 1. If Qh takes the value −∞, say Qh(B0) =−∞, then for all N ∈ N there existsψN ∈ W1,∞

0 (B(0,1/2);Rm) such that∫B(0,1/2)

h(A0 +∇ψN(z)) dz ≤−N.

For any A0 ∈ Rm×d pick ψ∗ ∈ W1,∞A0x(B(0,1);R

m) that has trace B0x on ∂B(0,1/2) and set

ψN(z) :=

ψN(z) if z ∈ B(0,1/2),ψ∗(z otherwise,

z ∈ B(0,1).

Hence,

Qh(A0)≤ −∫

B(0,1)h(∇ψN(z)) dz ≤ −N +C

B(0,1)

for some fixed C > 0 (depending on our choice of ψ∗). Letting N → ∞, we get Qh(A0) =−∞. Thus, Qh is identically −∞.

Step 2. For any ψ ∈ W1,∞0 (B(0,1);Rm) and any A0 ∈ Rm×d we need to show

−∫

B(0,1)Qh(A0 +∇ψ(z)) dz ≥ Qh(A0). (5.3)

It can be shown that also Qh has p-growth, see for instance Lemma 2.5 in [54]. Thus, we canuse an approximation argument in conjunction with Theorem 2.30 to prove that it suffices toshow the inequality (5.3) for piecewise affine ψ ∈ W1,∞

0 (B(0,1);Rm), say ψ(x) = vk +Akx(vk ∈Rm, Ak ∈Rm×d) for x ∈ Gk from a disjoint collection of Lipschitz subdomains Gk ⊂ Ω(k = 1, . . . ,N) with |Ω \

∪k Gk| = 0. Here we use the W1,p-density of piecewise affine

functions in W1,∞0 , which can be proved by mollification and an elementary approximation

of smooth functions by piecewise affine ones (e.g. employing finite elements).Fix ε > 0. By the definition of Qh, for every k we can find φk ∈ W1,∞

0 (Gk;Rm) such that

Qh(A0 +Ak)≥ −∫

Gk

f (A0 +Ak +∇φk(z)) dz− ε .

Then, ∫B(0,1)

Qh(A0 +∇ψ(z)) dz =N

∑k=1

|Gk|Qh(A0 +Ak)

≥N

∑k=1

∫Gk

h(A0 +Ak +∇φk(z)) dz− ε|Gk|

=∫

B(0,1)h(A0 +∇φ(z)) dz− ε |B(0,1)|

≥ |B(0,1)|(Qh(A0)− ε

),

Com

men

t...


where φ ∈ W1,∞0 (B(0,1);Rm) is defined as

φ(x) :=

vk +Akx+φk(x) if x ∈ Gk,0 otherwise,

x ∈ B(0,1),

and the last step follows from the definition of Qh. Letting ε ↓ 0, we have shown (5.3).

We can also use the notion of the quasiconvex envelope to introduce a class of non-trivial quasiconvex functions with p-growth, 1 0 and the quasiconvex envelope Qh is not convex (at zero).

Proof. We first show Qh(0) > 0. Assume to the contrary that Qh(0) = 0. Then, by (5.1)there exists a sequence (ψ j)⊂ W1,∞

0 (B(0,1);R2) with

−∫

B(0,1)h(∇ψ j) dz → 0. (5.4)

Let P : R2×2 → R2×2 be the projection onto the orthogonal complement of spanF. Somelinear algebra shows |P(A)|p ≤ h(A) for all A ∈ R2×2. Therefore,

P(∇ψ j)→ 0 in Lp(B(0,1);R2×2). (5.5)

With the definition

g(ξ ) :=∫

B(0,1)g(x)e−2πiξ ·x dx, ξ ∈ Rd ,

for the Fourier transform g of a function g ∈ L1(B(0,1);RN), we get the rule ∇ψ j(ξ ) =(2πi) ψ j(ξ )⊗ξ . The fact that a⊗ξ /∈ spanF for all a,ξ ∈R2 \0 implies P(a⊗ξ ) = 0.Some algebra implies that we may write

∇ψ j(ξ ) = M(ξ )P(ψ j(ξ )⊗ξ ), ξ ∈ Rd ,

where M(ξ ) : R2×2 → R2×2 is linear and furthermore depends smoothly and positively0-homogeneously on ξ . We omit the details of this construction.

For p = 2, Parseval’s formula ∥g∥L2 = ∥g∥L2 together with (5.5) then implies

∥∇ψ j∥L2 = ∥∇ψ j∥L2 = ∥M(ξ )P(ψ j(ξ )⊗ξ )∥L2

≤ ∥M∥∞∥P(ψ j(ξ )⊗ξ )∥L2 = ∥M∥∞∥P(∇ψ j)∥L2 → 0.

But then h(∇ψ j)→ |F | in L1, contradicting (5.4).

Com

men

t...

5.2. RELAXATION OF INTEGRAL FUNCTIONALS 91

For 1 < p < ∞, we may apply the Mihlin Multiplier Theorem (see for example Theo-rem 5.2.7 in [45] and Theorem 6.1.6 in [15]) to get analogously to the case p = 2 that

∥∇ψ j∥Lp ≤ ∥M∥Cd∥P(∇ψ j)∥Lp → 0,

which is again at odds with (5.4).Finally, if Qh was convex at zero, we would have

Qh(0)≤ 12(Qh(F)+Qh(−F)

)≤ 1

2(h(F)+h(−F)

)= 0,

a contradiction.

5.2 Relaxation of integral functionals

We first consider the abstract principles of relaxation before moving on to more concreteintegral functionals. Suppose we have a functional F : X →R∪+∞, where X is a reflex-ive Banach space. Assume furthermore that our F is not weakly lower semicontinuous1,i.e. there exists at least one sequence x j x in X with F [x]> liminf j→∞ F [x j]. Then, therelaxation F∗ : X → R∪−∞,+∞ of F is defined as

F∗[x] := inf

liminfj→∞

F [x j] : x j x in X.

Theorem 5.3 (Abstract relaxation theorem). Let X be a separable, reflexive Banachspace and let F : X → R∪+∞ be a functional. Assume furthermore:

(WH1) Weak Coercivity: For all Λ > 0, the sublevel set

x ∈ X : F [x]≤ Λ is sequentially weakly relatively compact.

Then, the relaxation F∗ of F is weakly lower semicontinuous and minX F∗ = infX F .

Proof. Since X∗ is also separable, we can find a countable set fll∈N ⊂ X∗ such that

x j x in X ⇐⇒ sup j ∥x j∥< ∞ and ⟨x j, fl⟩ → ⟨x, fl⟩ for all l ∈ N.

Step 1. We first show weak lower semicontinuity of F∗: Let x j x in X such thatliminf j→∞ F∗[x j] < ∞ (if this is not satisfied, there is nothing to show). Selecting a (notrenumbered) subsequence if necessary, we may further assume that α := sup j F∗[x j]< ∞.

For every j ∈ N, there exists a sequence (x j,k)k ⊂ X such that x j,k x j in X as k → ∞and

liminfk→∞

F [x j,k]≤ F∗[x j]+1j.

1In principle, if F nevertheless is weakly lower semicontinuous for a minimizing sequence, then no relax-ation is necessary. It is, however, very difficult to establish such a property without establishing general weaklower semicontinuity.

Com

men

t...

Now select a y j := x j,k( j) such that∣∣⟨y j, fl⟩−⟨x j, fl⟩∣∣≤ 1

jfor all l = 1, . . . , j,

and

F [y j]≤ F∗[x j]+2j≤ α +

2j.

Then, from the weak coercivity (WH1) we get sup j ∥y j∥ < ∞. Thus, it follows that theso-selected diagonal sequence (y j) satisfies y j x and

F∗[x]≤ liminfj→∞

F [y j]≤ liminfj→∞

F∗[x j]

and F∗ is shown to be weakly lower semicontinuous.Step 2. Next take a minimizing sequence (x j)⊂ X such that

limj→∞

F∗[x j] = infX

F∗.

Now, for every x j select a y j ∈ X such that

F [y j]≤ F∗[x j]+1j.

From the coercivity assumption (WH1) we get that we may select a weakly convergingsubsequence (not renumbered) y j y, where y ∈ X . Then, since F∗ is weakly lower semi-continuous and F∗ ≤ F ,

F∗[y]≤ liminfj→∞

F∗[y j]≤ liminfj→∞

F [y j]≤ limj→∞

F∗[x j] = infX

F∗,

so F∗[y] = minX F∗. Furthermore,

infX

F ≤ liminfj→∞

F [y j]≤ minX

F∗ ≤ infX

F .

Thus, minX F∗ = infX F , completing the proof.

As usual, we here are most interested in the concrete case of integral functionals

F [u] :=∫

Ωf (x,∇u(x)) dx, u ∈ W1,p(Ω;Rm),

where, as usual, f : Ω×Rm×d →R is a Caratheodory integrand satisfying the p-growth andcoercivity assumption

µ|A|p −µ−1 ≤ f (x,A)≤ M(1+ |A|p), (x,A) ∈ Ω×Rm×d , (5.6)

for some µ,M > 0, 1 < p < ∞, and Ω ⊂ Rd a bounded Lipschitz domain. The main resultof this section is:

Com

men

t...

5.2. RELAXATION OF INTEGRAL FUNCTIONALS 93

Theorem 5.4 (Relaxation). Let F be as above and assume furthermore that there exists amodulus of continuity ω , that is, ω : [0,∞)→ [0,∞) is continuous, increasing, and ω(0) =0, such that

| f (x,A)− f (y,A)| ≤ ω(|x− y|)(1+ |A|p), x,y ∈ Ω, A ∈ Rm×d . (5.7)

Then, the relaxation F∗ of F is

F∗[u] =∫

ΩQ f (x,∇u(x)) dx, u ∈ W1,p(Ω;Rm),

where Q f (x, q) is the quasiconvex envelope of f (x, q) for all x ∈ Ω.

Proof. We define

G [u] :=∫

ΩQ f (x,∇u(x)) dx, u ∈ W1,p(Ω;Rm).

As Q f (x, q) is quasiconvex for all x ∈ Ω by Lemma 5.1, it is continuous by Proposition 3.6.We will see below that Q f is continuous in its first argument, so Q f is Caratheodory and Gis thus well-defined.

We will show in the following that (I) G ≤ F∗ and (II) F∗ ≤ G .To see (I), it suffices to observe that G is weakly lower semicontinuous by Theorem 3.25

and that G ≤ F . Thus, from the definition of F∗ we immediately get G ≤ F∗.We will show (II) in several steps.Step 1. Fix A ∈ Rm×d . For x ∈ Ω let ψx ∈ W1,p

0 (B(0,1);Rm) be such that

−∫

B(0,1)f (x,A+∇ψx(z)) dz ≤ Q f (x,A)+1.

Then we use (5.6) to observe

µ −∫

B(0,1)|A+∇ψx(z)|p dz−µ−1 ≤ Q f (x,A)+1 ≤ M(1+ |A|p)+1.

Let now x,y ∈ Ω and estimate using (5.7) and the definition of Q f (x, q),Q f (y,A)−Q f (x,A)≤ −

∫B(0,1)

∣∣ f (y,A+∇ψx(z))− f (x,A+∇ψx(z))∣∣ dz+1

≤ ω(|x− y|)−∫

B(0,1)1+ |A+∇ψx(z)|p dz+1

≤Cω(|x− y|)(1+ |A|p),

where C =C(µ,M) is a constant. Exchanging the roles of x and y, we see∣∣Q f (x,A)−Q f (y,A)∣∣≤Cω(|x− y|)(1+ |A|p) (5.8)

for all x,y ∈ Ω and A ∈ Rm×d . In particular, Q f is continuous in x.

Com

men

t...


Step 2. Next we show that it suffices to consider piecewise affine u. If u ∈ W1,p(Ω;Rm),then there exists a sequence (v j) ⊂ W1,p(Ω;Rm) of piecewise affine functions such thatv j → u in W1,p. Since Q f is Caratheodory and has p-growth (as Q f ≤ f ), Theorem 2.30shows that G [v j] → G [u] and, by lower semicontinuity, F∗[u] ≤ liminf j→∞ F∗[v j]. Thus,if (II) holds for all piecewise affine u, it also holds for all u ∈ W1,p(Ω;Rm).

Step 3. Fix ε > 0 and let u ∈ W1,p(Ω;Rm) be piecewise affine, say u(x) = vk +Akx(where vk ∈ Rm, Ak ∈ Rm×d) on the set Gk from a disjoint collection of open sets Gk ⊂ Ω(k = 1, . . . ,N) such that |Ω \

∪Nk=1 Gk| = 0. Fix such k and cover Gk up to a nullset with

countably many balls Bk,l := B(xk,l,rk,l) ⊂ Gk, where xk,l ∈ Gk, 0 < rk,l < ε (l ∈ N) suchthat ∣∣∣∣−∫

Bk,l

Q f (x,Ak) dx−Q f (xk,l,Ak)

∣∣∣∣≤Cω(ε)(1+ |Ak|p).

This is possible by the Vitali Covering Theorem A.8 in conjunction with (5.8).Now, from the definition of Q f and the independence of this definition from the domain,

we can find a function ψk,l ∈ W1,∞0 (Bk,l;Rm) with ∥ψk,l∥L∞ ≤ ε and∣∣∣∣Q f (xk,l,Ak)− −

∫Bk,l

f(xk,l,Ak +∇ψk,l(z)

)dz∣∣∣∣≤ ε.

Set

vε(x) := u(x)+

ψk,l(x) if x ∈ Bk,l ,0 otherwise,

x ∈ Ω.

In a similar way to Step 1 we can show that

µ −∫

Bk,l

|Ak +∇ψk,l(z)|p dz−µ−1 ≤ M(1+ |Ak|p)+ ε.

Thus, ∫Ω|∇vε |p dx = ∑

k,l

∫Bk,l

|Ak +∇ψk,l(x)|p dx

≤ Mµ

∫Ω

1+ |∇u(x)|p dx+|Ω|µ2 +

ε |Ω|µ

< ∞,

so, by the Poincare–Friedrichs inequality, (vε)ε>0 ∈ W1,p(Ω;Rm) is uniformly bounded inW1,p.

By the continuity assumption (5.7),∣∣∣∣−∫Bk,l

f (xk,l,∇vε(x)) dx− −∫

Bk,l

f (x,∇vε(x)) dx∣∣∣∣≤ ω(ε)−

∫Bk,l

1+ |∇vε(x)|p dx.

Com

men

t...

5.3. YOUNG MEASURE RELAXATION 95

Then, we can combine all the above arguments to get∣∣∣∣−∫Bk,l

Q f (x,∇u(x)) dx− −∫

Bk,l

f (x,∇vε(x)) dx∣∣∣∣

≤Cω(ε)−∫

Bk,l

2+ |Ak|p + |∇vε(x)|p dx+ ε.

Multiplying this by |Bk,l| and summing over all k, l, we get∣∣G [u]−F [vε ]∣∣≤Cω(ε)

∫Ω

2+ |∇u(x)|p + |∇vε(x)|p dx+ ε |Ω|

and this vanishes as ε ↓ 0. For u j := v1/ j we can convince ourselves that u j u since (u j)is weakly relatively compact in W1,p and ∥u j −u∥L∞ ≤ 1/ j → 0. Hence,

G [u] = limj→∞

F [u j]≥ F∗[u],

which is (II) for piecewise affine u.

5.3 Young measure relaxation

As discussed at the beginning of this chapter, the relaxation strategy as outlined in the pre-ceding section has one serious drawback: While it allows to find the infimal value, therelaxed functional does not tell us anything about the structure (e.g. oscillations) of mini-mizing sequences. Often, however, in applications this structure of minimizing sequences isdecisive as it represents the development of microstructure. For example, in material sci-ence, this microstructure can often be observed and greatly influences material properties.Therefore, in this section we implement the second strategy mentioned at the beginningof the chapter, that is, we extend the minimization problem to a larger space and look forsolutions there.

Let us first, in an abstract fashion, collect a few properties that our extension shouldsatisfy. Assume we are given a space X together with a notion of convergence “→” anda functional F : X → R∪+∞, which is not lower semicontinuous with respect to theconvergence “→” in X . Then, we need to extend X to a larger space X with a convergence“” and F to a functional F : X → R∪+∞, called the extension–relaxation such thatthe following conditions are satisfied:

(i) Extension property: X ⊂ X (in the sense of an embedding) and, using this embed-ding, F |X = F .

(ii) Lower bound: If x j → x in X , then, up to a subsequence, there exists x ∈ X such thatx j x in X and

liminfj→∞

F [x j]≥ F [x].

Com

men

t...


(iii) Recovery sequence: For all x ∈ X there exists a recovery sequence (x j) ⊂ X withx j x in X and such that

limj→∞

F [x j] = F [x].

Intuitively, (ii) entails that we can solve our minimization problem for F by passing to theextended space; in particular, if (x j) ⊂ X is minimizing and relatively compact (which, upto a subsequence, implies x j → x in X), then (ii) tells us that x should be considered the“minimizer”. On the other hand, (iii) ensures that the minimization problems for F and forF are sufficiently related. In particular, it is easy to see that the infima agree and indeed Fattains its minimum (for this, we need to assume some coercivity of F ):

minX

F = infX

F .

If we consider integral functionals defined on X = W1,p(Ω;Rm), we have already seen avery convenient extended space X , namely the space of gradient p-Young measures. In fact,these generalized variational problems are the original raison d’etre for Young measures. Inthe following we will show how this fits into the above abstract framework. This, we returnto our prototypical integral functional

F [u] :=∫

Ωf (x,∇u(x)) dx, u ∈ W1,p(Ω;Rm),

with f : Ω×Rm×d → R Caratheodory and

µ|A|p −µ−1 ≤ f (x,A)≤ M(1+ |A|p), (x,A) ∈ Ω×Rm×d , (5.9)

for some µ,M > 0, 1 < p < ∞, and Ω ⊂ Rd a bounded Lipschitz domain.Motivated by the considerations in Sections 3.3 and 3.4, we let for a gradient Young

measure ν = (νx) ∈ GYp(Ω;Rm×d) the extension–relaxation F : GYp(Ω;Rm×d)→ R bedefined as

F [ν ] :=⟨⟨

f ,ν⟩⟩=∫

Ω


Then, our relaxation result takes the following form:

Theorem 5.5 (Relaxation with Young measures). Let F ,F be as above.

(I) Extension property: For every u ∈ W1,p(Ω;Rm) define the elementary Young mea-sure (δ∇u) ∈ GYp(Ω;Rm×d) as

(δ∇u)x := δ∇u(x) for a.e. x ∈ Ω.

Then, F [u] = F [δ∇u].

Com

men

t...


(II) Lower bound: If u j u in W1,p(Ω;Rm), then, up to a subsequence, there exists a

Young measure ν ∈ GYp(Ω;Rm×d) such that ∇u jY→ ν (i.e. (∇u j) generates ν) with

[ν] = ∇u a.e., and

liminfj→∞

F [u j]≥ F [ν ]. (5.10)

(III) Recovery sequence: For all ν ∈ GYp(Ω;Rm×d) there exists a recovery generating

sequence (u j)⊂ W1,p(Ω;Rm) with ∇u jY→ ν and such that

limj→∞

F [u j] = F [ν ]. (5.11)

(IV) Equality of minima: F attains its minimum and

minGYp(Ω;Rm×d)

F = infW1,p(Ω;Rm)

F .

Furthermore, all these statements remain true if we prescribe boundary values. Fora Young measure we here prescribe the boundary values for an underlying deformation(which is only determined up to a translation, of course).

Proof. Ad (I). This follows directly from the definition of F .Ad (II). The existence of ν follows from the Fundamental Theorem 3.11. The lower

bound (5.10) is a consequence of Proposition 3.19.Ad (III). By virtue of Lemma 3.21 we may construct an (improved) sequence (u j) ⊂

W1,p(Ω;Rm) such that

(i) u j|∂Ω = u, where u∈W1,p(Ω;Rm) is an underlying deformation of ν , that is, ∇u= [ν ],

(ii) the family ∇u j j is p-equiintegrable, and

(iii) ∇u jY→ ν .

For this improved generating sequence we can now use the statement about representationof limits of integral functionals from the Fundamental Theorem 3.11 (here we need thep-equiintegrability) to get

F [u j]→⟨⟨

f ,ν⟩⟩= F [ν ],

which is nothing else than (5.11).Ad (IV). This is not hard to see using (II), (III) and the coercivity assumption, that is,

the lower bound in (5.9).

Com

men

t...



Figure 5.2: The double-well potential in the sailing example.

Example 5.6. In our sailing example from Section 1.6, we were tasked with solvingF [r] :=∫ T

0cos(4arctanr′(t))− vmax

(1− r2

R2

)dt → min,

r(0) = r(T ) = 0, |r(t)| ≤ R.

Here, because of the additional constraints we can work in any Lp-space that we want, evenL∞. Clearly, the integrand

W (r,a) := cos(4arctana)− vmax

(1− r2

R2

), |r(t)| ≤ R,

is not convex in a, see Figure 5.2. Technically, the preceding theorem is not applicable,since W depends on r and a and p = ∞, but it is obvious that we can simply extend it toconsider Young measures (δr(x)⊗νx)x ∈ Y∞(Ω;R×R) with ν ∈ GY∞(Ω;R) and [ν ] = r′ asin Section 3.7 (we omit the details for p = ∞). We collect all such product Young measuresin the set GY∞(Ω;R×R).

Then, we consider the extended–relaxed variational problemF [δr ⊗ν ] :=

⟨⟨W,δr ⊗ν

⟩⟩=∫ T

0W (r(t),a) dνx(a) dx → min,

δr ⊗ν = (δr(x)⊗νx)x ∈ GY∞(Ω;R×R),r(0) = r(T ) = 0, |r(t)| ≤ R.

Let us also construct a sequence of solutions that generate the optimal Young measuresolution. Some elementary analysis yields that g(a) := cos(4arctana) has two minima ina =±1, where g(a) =−1 (see Figure 5.2). Moreover, h(r) :=−vmax(1− r2/R2) attains itsminimum −vmax for r = 0. Thus,

W ≥−(1+ vmax

)=: Wmin.

Com

men

t...



We let

h(s) :=

s−1 if s ∈ [0,2],3− s if s ∈ (2,4],

s ∈ [0,4],

and consider h to be extended to all s ∈ R by periodicity. Then set

r j(t) :=T4 j

h(

4 jT

t +1).

It is easy to see that r′j ∈ −1,+1 and r j → 0 uniformly. Thus,

F [r j]→ T ·Wmin = infF .

By the Fundamental Theorem 3.11 on Young measures, we know that we may select asubsequence of j’s (not renumbered) such that (see Lemma 3.28)

(r j,r′j)Y→ δr ⊗ν = (δr(x)⊗νx)x ∈ GY∞(Ω;R×R).

From the construction of r j we see

δr ⊗ν = δ0 ⊗(

12

δ−1 +12

δ+1

).

Clearly, this δr ⊗ν is minimizing for F .

The following lemma creates a connection to the other relaxation approach from the lastsection:

Lemma 5.7. Let h : Rm×d → R be continuous and satisfy

µ|A|p −µ−1 ≤ h(A)≤ M(1+ |A|p) A ∈ Rm×d ,

for some µ,M > 0. Then, for all A ∈ Rm×d there exists a homogeneous gradient Youngmeasure νA ∈ GYp(B(0,1);Rm×d) such that

Qh(A) =∫

h dνA.

Proof. According to (5.1) and the remarks following it,

Qh(A) := inf−∫

B(0,1)h(A+∇ψ(z)) dz : ψ ∈ W1,p

0 (B(0,1);Rm)

.

Let now (ψ j) ⊂ W1,p(B(0,1);Rm) be a minimizing sequence in the above formula. Thisminimization problem is admissible in Theorem 5.5, which yields a Young measure mini-mizer ν ∈ GYp(B(0,1);Rm) such that

Qh(A) = −∫

B(0,1)

∫h dνx dx.

It only remains to show that we can replace (νx) by a homogeneous Young measure νA.This follows from the following lemma.

Com

men

t...


Lemma 5.8 (Homogenization). Let ν = (νx) ∈ GYp(Ω;Rm) such that [ν ] = ∇u for someu∈W1,p(Ω;Rm) with linear boundary values. Then, for any Lipschitz domain D⊂Rd thereexists a homogeneous ν = (νy) ∈ GYp(D;Rm) such that∫

h dν =1|D|

∫Ω

∫h dνx dx (5.12)

for all continuous h : Rm×d → R with |h(A)| ≤ M(1+ |A|p). Furthermore, this result re-mains valid if Ω = Qd = (−1/2,1/2)d , the d-dimensional unit cube and u has periodicboundary values.

Proof. Let (u j) ⊂ W1,p(Ω;Rm) with u j(x) = Mx on ∂Ω (in the sense of trace) for a fixed

matrix M ∈Rm×d such that ∇u jY→ ν , in particular sup j ∥∇u j∥Lp <∞. Now choose for every

j ∈ N a Vitali cover of D, see Theorem A.8,

D = Z ⊎N( j)⊎k=1

Ω(a( j)k ,r( j)

k ), |Z|= 0,

with a( j)k ∈ D, r( j)

k ≤ 1/ j (k = 1, . . . ,N( j)), and Ω(a,r) := a+ rΩ. Then define

v j(y) :=

r( j)k u j

(y−a( j)

k

r( j)k

)+Ma( j)

k if y ∈ Ω(a( j)k ,r( j)

k ),

0 otherwise,

y ∈ D.

We have v j ∈ W1,p(D;Rm) (it is easy to see that there are no jumps over the gluing bound-aries) and

∇v j(y) = ∇u j

(y−a( j)

k

r( j)k

)if y ∈ Ω(a( j)

k ,r( j)k ).

We can then use a change of variables to compute for all φ ∈ C0(D) and h ∈ C(Rm×d) withp-growth,

∫D

φ(y)h(∇v j(y)) dy =N( j)

∑k=1

∫Ω(a( j)

k ,r( j)k )

φ(y)h(

∇u j

(y−a( j)

k

r( j)k

))dy

=N( j)

∑k=1

(r( j)k )dφ(a( j)

k )∫

Ωh(∇u j(x)) dx+O

(1j

)|D|

where we also used that φ is uniformly continuous. Letting j → ∞ and using that (Riemannsums)

limj→∞

N( j)

∑k=1

(r( j)k )dφ(a( j)

k ) = −∫

Dφ(x) dx,

Com

men

t...

5.4. CHARACTERIZATION OF GRADIENT YOUNG MEASURES 101

we arrive at

limj→∞

∫D

φ(y)h(∇v j(y)) dy = −∫

Dφ(x) dx ·

∫Ω

∫h(A) dνx dx. (5.13)

For φ = 1 and h(A) := |A|p, this gives

sup j ∥∇v j∥pLp = sup j ∥∇u j∥p

Lp < ∞.

Thus, there exists ν ∈ GYp(D;Rm) such that ∇v jY→ ν .

Using Lemma 3.21 we may moreover assume that (∇u j), and hence also (∇v j), is p-equiintegrable. Then, (5.13) implies∫

D

∫φ(y)h(A) dνy dy = −

∫D

φ(x) dx ·∫

Ω

∫h(A) dνx dx

for all φ,h as above. This implies in particular that (νy) = ν is homogeneous. For φ = 1,we get (5.12).

The additional claim about Ω = Qd and an underlying deformation u with periodicboundary values follows analogously, since also in this case we can “glue” generating func-tions on a (special) covering.

5.4 Characterization of gradient Young measures

In the previous section we replaced a minimization problem over a Sobolev space by itsextension–relaxation, defined on the space of gradient Young measures. Unfortunately, thissubset of the space of Young measures is so far defined only “extrinsically”, i.e. throughthe existence of a generating sequence of gradients. The question arises whether thereis also an “intrinsic” characterization of gradient Young measures. Intuitively, trying tounderstand gradient Young measures amounts to understanding the “structure” of sequencesof gradients, which is a useful endeavor.

Recall that in Lemma 3.22 we showed that a homogeneous gradient Young measure ν ∈GYp(B(0,1);Rm×d) with barycenter B := [ν ] ∈ Rm×d satisfies the Jensen-type inequality

h(B)≤∫

h dν

for all quasiconvex functions h : Rm×d → R with |h(A)| ≤ M(1+ |A|p) (M > 0). There, weinterpreted this as an expression of the (generalized) convexity of h, in analogy with theclassical Jensen-inequality.

However, we can also dualize the situation and see the validity of the above Jensen-type inequality for quasiconvex functions as a property of gradient Young measures. Thefollowing (perhaps surprising) result shows that this dual point of view is indeed valid andthe Jensen-type inequality (essentially) characterizes gradient Young measures, a result dueto David Kinderlehrer and Pablo Pedregal from the early 1990s:

Com

men

t...


Theorem 5.9 (Kinderlehrer–Pedregal). Let ν ∈ Yp(Ω;Rm×d) be a Young measure suchthat [ν ] = ∇u for some u ∈ W1,p(Ω;Rm) and such that for almost every x ∈ Ω the Jensen-type inequality

h(∇u(x))≤∫

h dνx

holds for all quasiconvex h : Rm×d → R with |h(A)| ≤ M(1+ |A|p) (M > 0). Then ν ∈GYp(Ω;Rm×d), i.e. ν is a gradient p-Young measure.

The idea of the proof is to show that the set of gradient Young measures is convexand (weakly*) closed in the set of Young measures and the Jensen-type inequalities entailthat ν cannot be separated from this set by a hyperplane (which turn out to be in one-to-one correspondence with integrands). Then, the Hahn–Banach Theorem implies that νactually lies in this set and is hence a gradient Young measure. One first shows this forhomogeneous Young measures and then uses a gluing procedure to extend the result tox-dependent ν = (νx). The details, however, are quite technical, see [81] or the originalworks [49, 51] for a full proof.

This theorem is conceptually very important. It entails that if we could understand theclass of quasiconvex functions, then we also could understand gradient Young measures andthus sequences of gradients. Unfortunately, our knowledge of quasiconvex functions (andhence of gradient Young measures) is very limited at present. Most results are for specificsituations only and much of it has an “ad-hoc” flavor.

We close this chapter with a result showing that quasiconvex functions are not easy tounderstand. In Proposition 3.3 we proved that quasiconvexity implies rank-one convexityand Proposition 4.1 showed that polyconvexity implies quasiconvexity. Hence, in a sense,rank-one convexity is a lower bound and polyconvexity an upper bound on quasiconvexity.Moreover, the Alibert–Dacorogna–Marcellini function (see Examples 3.5, 4.2)

hγ(A) := |A|2(|A|2 −2γ det A

), A ∈ R2×2, γ ∈ R.

has the following convexity properties:

• hγ is convex if and only if |γ| ≤ 2√

23

≈ 0.94,

• hγ is rank-one convex if and only if |γ| ≤ 2√3≈ 1.15,

• hγ is quasiconvex if and only if |γ| ≤ γQC for some γQC ∈(

1,2√3

],

• hγ is polyconvex if and only if |γ| ≤ 1.

However, it is open whether γQC = 2√3

and so it could be that quasiconvexity and rank-one

convexity are actually equivalent. This was open for a long time until Vladimir Sverak’s1992 counterexample [89], which settled the question in the negative, at least for d ≥ 2,

Com

men

t...


5.4. CHARACTERIZATION OF GRADIENT YOUNG MEASURES 103

m ≥ 3. The case d = m = 2 is still open and considered to be very important, also becauseof its connection to other branches of mathematics.

Example 5.10 (Sverak). We will show the non-equivalence of quasiconvexity and rank-one convexity for m = 3, d = 2 only. Higher dimensions can be treated using an embeddingof R3×2 into Rm×d . To accomplish this, we construct h : R3×2 → R that is quasiconvex, butnot rank-one convex. Define a linear subspace L of R3×2 as follows:

L :=

x 0

0 yz z

: x,y,z ∈ R

.

We denote by P : R3×2 → L a linear projection onto L given as

P(A) :=

a 00 d

(e+ f )/2 (e+ f )/2

for A =

a bc de f

∈ R3×2.

Also define g : L → R to be

g

x 00 yz z

:=−xyz.

For α,β > 0, we let hα,β : R3×2 → R be given as

hα,β (A) := g(P(A))+α(|A|2 + |A|4

)+β |A−P(A)|2.

We show the following two properties of hα,β :

(I) For every α > 0 sufficiently small and all β > 0, the function hα,β is not quasiconvex.

(II) For every α > 0, there exists β = β (α)> 0 such that hα,β is rank-one convex.

Ad (I): For A = 0, D = (0,1)2, and a periodic ψ ∈ W1,∞(D;R3) given as

ψ(x1,x2) :=1

2π

sin2πx1sin2πx2

sin2π(x1 + x2)

,

we have ∇ψ ∈ L and so P(∇ψ) = ∇ψ . Thus, we may compute (using the identity for thecosine of a sum and the fact that the sine is an odd function)∫

Dg(∇ψ) dx =−

∫ 1

0

∫ 1

0(cos2πx1)

2(cos2πx2)2 dx1 dx2 =−1

4< 0.

Thus, for α > 0 sufficiently small,∫D

hα,β (∇ψ) dx < 0 = hα,β (0).

Com

men

t...

It turns out that in the definition of quasiconvexity we may alternatively test with functionsthat have merely periodic boundary values on D (see Proposition 5.13 in [25]). Hence, (I)follows.

Ad (II): Rank-one convexity is equivalent to the Legendre–Hadamard condition

D2hα,β (A)[B,B] :=d2

dt2 hα ,β (A+ tB)∣∣∣∣t=0

≥ 0 for all A,B ∈ R3×2, rankB ≤ 1.

Here, D2hα,β (A)[B,B] is the second (Gateaux-)derivative of hα ,β at A in direction B.The function g is a homogeneous polynomial of degree three, whereby we can find c> 0

such that for all A,B ∈ R3×2 with rankB ≤ 1,

G(A,B) := D2(gP)(A)[B,B] :=d2

dt2 g(P(A+ tB))∣∣∣∣t=0

≥−c|A||B|2.

A computation then shows that

D2hα,β (A)[B,B] = G(A,B)+2α|B|2 +4α|A|2|B|2 +8α(A : B)2 +2β |B−P(B)|2

≥ (−c+4α|A|)|A||B|2.

where we used the Frobenius product A : B := ∑i, j AijB

ij. Thus, for |A| ≥ c/(4α), the

Legendre–Hadamard condition (and hence rank-one convexity) holds.We still need to prove the Legendre–Hadamard condition for |A|< c/(4α) and any fixed

α > 0. Since D2hα,β (A)[B,B] is homogeneous of degree two in B, we only need to considerA,B from the compact set

K :=(A,B) ∈ R3×2 ×R3×2 : |A| ≤ c

4α, |B|= 1, rankB = 1

.

From the estimate

D2hα,β (A)[B,B]≥ G(A,B)+2α|B|2 +2β |B−P(B)|2 =: κ(A,B,β )

it suffices to show that there exists β = β (α)> 0 such that κ(A,B,β )≥ 0 for all (A,B)∈ K.Assume that this is not the case. Then there exists β j → ∞ and (A j,B j) ∈ K with

0 > κ(A j,B j,β j) = G(A j,B j)+2α +2β |B j −P(B j)|2.

As K is compact, we may assume (A j,B j)→ (A,B) ∈ K, for which it holds that

G(A,B)+2α ≤ 0, P(B) = B, and rankB = 1.

However, since rankB= 1, we have g(P(A+tB))= 0 for all t ∈R. This implies in particularG(A,B)=D2(gP)(A)[B,B] = 0, yielding 2α ≤ 0, which contradicts our assumption α > 0.Thus, we have shown that for a suitable choice of α,β > 0 it indeed holds that hα,β is rank-one convex.

So, quasiconvexity and, dually, gradient Young measures, must remain somewhat mys-terious concepts. This also shows that the theory of elliptic systems of PDEs, for whichtraditionally the Legendre–Hadamard condition is the expression of “ellipticity”, is incom-plete. In particular, non-variational systems (i.e. systems of PDEs which do not arise as theEuler–Lagrange equations to a variational problem) are only poorly understood.

Com

men

t...



Classically, one often defines the quasiconvex envelope through (5.2) and not as we havedone through (5.1). In this case, (5.2) is often called the Dacorogna formula, see Sec-tion 6.3 in [25] for further references.

The construction of Lemma 5.2 is based on a construction of Sverak [88] and someellipticity arguments similar to those in Lemma 2.7 of [73].

Laurence Chisholm Young originally introduced the objects that are now called Youngmeasures as “generalized curves/surfaces” in the late 1930s and early 1940s [97–99] totreat problems in the Calculus of Variations and Optimal Control Theory that could not besolved using classical methods. At the end of the 1960s he also wrote a book [100] (2ndedition [101]) on the topic, which explained these objects and their applications in greatdetail (including philosophical discussions on the semantics of the terminology “generalizedcurve/surface”).

More recently, starting from the seminal work of Ball & James in 1987 [10], Youngmeasures have provided a convenient framework to describe fine phase mixtures in thetheory of microstructure, also see [33, 73] and the references cited therein. Further, andmore related to the present work, relaxation formulas involving Young measures for non-convex integral functionals have been developed and can for example be found in Pedregal’sbook [81] and also in Chapter 11 of the recent text [3]; historically, Dacorogna’s lecturenotes [24] were also influential.

A second avenue of development – somewhat different from Young’s original intention– is to use Young measures only as a technical tool (that is without them appearing inthe final statement of a result). This approach is in fact quite old and was probably firstadopted in a series of articles by McShane from the 1940s [62–64]. There, the author firstfinds a generalized Young measure solution to a variational problem, then proves additionalproperties of the obtained Young measures, and finally concludes that these properties entailthat the generalized solution is in fact classical. This idea exhibits some parallels to thehugely successful approach of first proving existence of a weak solution to a PDE and thenshowing that this weak solution has (in some cases) additional regularity.

The Kinderlehrer–Pedregal Theorem 5.9 also holds for p = ∞, which was in fact thefirst case to be established. See [81] for an overview and proof.

A different proof of Proposition 5.2 can be found in Section 5.3.9 of [25].The conditions (i)–(iii) at the beginning of Section 5.3 are modeled on the concept

of Γ-convergence (introduced by De Giorgi) and are also the basis of the usual treatmentof parameter-dependent integral functionals. See [18] for an introduction and [27] for athorough treatise.

Com

men

t...


Appendix A

Prerequisites

This appendix recalls some facts that are needed throughout the course.

A.1 General facts and notation

First, recall the following version of the fundamental Young inequality

ab ≤ δ2

a2 +1

2δb2,

which holds for all a,b ≥ 0 and δ > 0.In this text (as in much of the literature), the matrix space Rm×d of (m× d)-matrices

always comes equipped with the the Frobenius matrix (inner) product

A : B := ∑j,k

A jkB j

k, A,B ∈ Rm×d ,

which is just the Euclidean product if we identify such matrices with vectors in Rmd . Thisinner product induces the Frobenius matrix norm

|A| :=√

∑j,k(A j

k)2, A ∈ Rm×d.

Very often we will use the tensor product of vectors a ∈ Rm, b ∈ Rd ,

a⊗b := abT ∈ Rm×d .

The tensor product interacts well with the Frobenius norm,

|a⊗b|= |a| · |b|, a ∈ Rm, b ∈ Rd .

While all norms on the finite-dimensional space Rm×d are equivalent, some “finer” argu-ments require to specify a matrix norm; if nothing else is stated, we always use the Frobe-nius norm.

Com

men

t...


108 APPENDIX A. PREREQUISITES

It is proved in texts on Matrix Analysis, see for instance [48], that this Frobenius normcan also be expressed as

|A|=√

∑i

σi(A)2, A ∈ Rm×d ,

where σi(A)≥ 0 is the i’th singular value of A, where i = 1, . . . ,min(d,m). Recall that everymatrix A ∈ Rm×d has a (real) singular value decomposition

A = PΣQT

for orthogonal P ∈ Rm×m, Q ∈ Rd×d , and a diagonal Σ = diag(σ1, . . . ,σmin(d,m)) ∈ Rm×d

with only positive entries

σ1 ≥ σ2 ≥ . . .≥ σr > 0 = σr+1 = σr+2 = · · ·= σmin(d,m),

called the singular values of A, where r is the rank of A.For A ∈ Rd×d the cofactor matrix cofA ∈ Rd×d is the matrix whose ( j,k)’th entry is

(−1) j+kM¬ j¬k(A) with M¬ j

¬k (A) being the ( j,k)-minor of A, i.e. the determinant of the matrixthat originates from A by deleting the j’th row and k’th column. One important formula(and in fact another way to define the cofactor matrix) is

∂A[det A] = cofA.

Furthermore, the Cramer rule entails that

A(cofA)T = det A · I for all A ∈ Rd×d ,

where I denotes the identity matrix as usual. In particular, if A is invertible,

A−1 =(cofA)T

det A.

Sometimes, the matrix (cofA)T is called the adjugate matrix to A in the literature (we willnot use this terminology here, however).

A fundamental inequality involving the determinant is the Hadamard inequality: LetA ∈ Rd×d with columns A j ∈ Rn ( j = 1, . . . ,d). Then,

|det A| ≤d

∏j=1

|A j| ≤ |A|d.

Of course, an analogous formula holds with the rows of A.

Com

men

t...


A.2. MEASURE THEORY 109

A.2 Measure theory

We assume that the reader is familiar with the notion of Lebesgue- and Borel-measurability,nullsets, and Lp-spaces; a good introduction is [85]. For the d-dimensional Lebesgue mea-sure we write L d (or L d

x if we want to stress the integration variable), but the Lebesguemeasure of a Borel- or Lebesgue-measurable set A ⊂ Rd is simply |A|. THe characteristicfunctionf such an A is

1A(x) :=

1 if x ∈ A,0 otherwise,

x ∈ Rd .

In the following, we only recall some results that are needed elsewhere.

Lemma A.1 (Fatou). Let f j : RN → [0,+∞], j ∈N, be a sequence of Lebesgue-measurablefunctions. Then,

liminfj→∞

∫f j(x) dx ≥

∫liminf

j→∞f j(x) dx.

Lemma A.2 (Monotone convergence). Let f j : RN → [0,+∞], j ∈ N, be a sequence ofLebesgue-measurable functions with f j(x) ↑ f (x) for almost every x ∈ Ω. Then, f : RN →[0,+∞] is measurable and

limj→∞

∫f j(x) dx =

∫f (x) dx.

Lemma A.3. If f j → f strongly in L1(Rd), 1 ≤ p ≤ ∞, that is,

∥ f j − f∥Lp → 0 as j → ∞,

then there exists a subsequence (not renumbered1) such that f j → f almost everywhere.

Theorem A.4 (Lebesgue dominated convergence theorem). Let f j : RN → Rn, j ∈ N,be a sequence of Lebesgue-measurable functions such that there exists an Lp-integrablemajorant g ∈ Lp(RN), 1 ≤ p < ∞, that is,

| f j| ≤ g for all j ∈ N.

If f j → f pointwise almost everywhere, then also f j → f strongly in Lp.

The following strengthening of Lebesgue’s theorem is very convenient:

Theorem A.5 (Pratt). If f j → f pointwise almost everywhere and there exists a sequence(g j)⊂ L1(RN) with g j → g in L1 such that | f j| ≤ g j, then f j → f in L1.

The following convergence theorem is of fundamental significance:

1We generally do not choose different indices for subsequences if the original sequence is discarded.

Com

men

t...

Theorem A.6 (Vitali convergence theorem). Let Ω ⊂ Rd be bounded and let ( f j) ⊂Lp(Ω;Rm), 1 ≤ p < ∞. Assume furthermore that

(i) No oscillations: f j → f in measure, that is, for all ε > 0,∣∣x ∈ Ω : | f j − f |> ε∣∣→ 0 as j → ∞.

(ii) No concentrations: f j j is p-equiintegrable, i.e.

limK↑∞

sup j

∫| f j|>K

| f j|p dx = 0.

Then, f j → f strongly in Lp.

Lebesgue-integrable functions also have a very useful “continuity” property:

Theorem A.7. Let f ∈ L1(Rd). Then, almost every x0 ∈ Rd is a Lebesgue point of f , i.e.

limr↓0

−∫

B(x0,r)| f (x)− f (x0)| dx = 0,

where B(x0,r)⊂Rd is the ball with center x0 and radius r > 0 and −∫

B(x0,r) := 1|B(x0,r)|

∫B(x0,r).

The following covering theorem is a handy tool for several constructions, see Theo-rems 2.19 and 5.51 in [2] for a proof:

Theorem A.8 (Vitali covering theorem). Let B ⊂ RN be a bounded Borel set and letK ⊂ Rd be compact. Then, there exist ak ∈ B, rk > 0, where k ∈ N, such that

B = Z ⊎N⊎

k=1

K(ak,rk), K(ak,rk) = ak + rkK,

with Z ⊂ B a Lebesgue-nullset, |Z| = 0, and “⊎

” denoting that the sets in the union aredisjoint.

We also need other measures than Lebesgue measure on subsets of RN (this could beeither Rd or a matrix space Rm×d , identified with Rmd). All of these abstract measures willbe Borel measures, that is, they are defined on the Borel σ -algebra B(RN) of RN . Inthis context, recall that the Borel-σ -algebra is the smallest σ -algebra that contains the opensets. All Borel measures defined on RN that do not take the value +∞ are collected in theset M(RN) of (finite) Radon measures, its subclass of probability measures is M1(RN).We also use M(Ω),M1(Ω) for the subset of measures that only charge Ω ⊂ RN . We definethe duality pairing⟨

h,µ⟩

:=∫

h(A) dµ(A)

Com

men

t...

A.3. FUNCTIONAL ANALYSIS 111

for h : RN → Rm and µ ∈ M(RN), whenever this integral makes sense (we need at least hµ-measurable). The following notation is convenient for the barycenter of a finite Borelmeasure µ:

[µ] :=⟨id,µ

⟩=∫

A dµ(A).

Probability measures and convex functions interact well:

Lemma A.9 (Jensen inequality). For all probability measures µ ∈ M1(RN) and all con-vex h : RN → R it holds that

h([µ]) = h(∫

A dµ(A))≤∫

h(A) dµ(A) =⟨h,µ⟩.

There is an alternative, “dual”, view on finite measures, expressed in the followingimportant theorem:

Theorem A.10 (Riesz representation theorem). The space of Radon measures M(RN)is isometrically isomorphic to the dual space C0(RN)∗ via the duality pairing⟨

h,µ⟩=∫

h dµ.

As an easy, often convenient, consequence, we can define measures through their actionon C0(RN). Note, however, that we then need to check the boundedness |⟨h,µ⟩| ≤ ∥h∥∞.

If the element µ ∈ C0(RN)∗ is additionally positive, that is, ⟨h,µ⟩ ≥ 0 for h ≥ 0, andnormalized, that is, ⟨1,µ⟩ = 1 (here, 1 ≡ 1 on the whole space), then µ from the Rieszrepresentation theorem is in fact a probability measure, µ ∈ M1(RN).

A.3 Functional analysis

We assume that the reader has a solid foundation in the basic notions of Functional Analysissuch as Banach spaces and their duals, weak/weak* convergence (and topology2), reflex-ivity, and weak/weak* compactness. The application of x∗ ∈ X∗ to x ∈ X is often writtenvia the duality pairing ⟨x,x∗⟩ = ⟨x∗,x⟩ := x∗(x). We write x j x for weak convergenceand x j

∗ x for weak* convergence. In this context recall that in Banach spaces topologi-

cal weak compactness is equivalent to sequential weak compactness, this is the Eberlein–Smulian Theorem. Also recall that the norm in a Banach space is lower semicontinuouswith respect to weak convergence: If x j x in X , then ∥x∥ ≤ liminf j→∞ ∥x j∥. One verythorough reference for most of this material is [23]. The following are just reminders:

Theorem A.11 (Weak compactness). Let X be a reflexive Banach space. Then, norm-bounded sets in X are sequentially weakly relatively compact.

2However, we here really only need convergence of sequences.

Com

men

t...



Theorem A.12 (Sequential Banach–Alaoglu theorem). Let X be a separable Banachspace. Then, norm-bounded sets in the dual space X∗ are weakly* sequentially relativelycompact.

Theorem A.13 (Dunford–Pettis). Let Ω ⊂ Rd be bounded. A bounded family f j ⊂L1(Ω) ( j ∈ N) is equiintegrable if and only if it is weakly relatively (sequentially) compactin L1(Ω).

Finally, weak convergence can be “improved” to strong convergence in the followingway:

Lemma A.14 (Mazur). Let x j x in a Banach space X. Then, there exists a sequence(y j)⊂ X of convex combinations,

y j =N( j)

∑n= j

θ j,nxn, θ j,n ∈ [0,1],N( j)

∑n= j

θ j,n = 1.

such that y j → x in X.

A.4 Sobolev and other function spaces spaces

We give a brief introduction to Sobolev spaces, see [59] or [36] for more detailed accountsand proofs.

Let Ω ⊂ Rd . As usual we denote by C(Ω) = C0(Ω),Ck(Ω), k = 1,2, . . ., the spaces ofcontinuous and k-times continuously differentiable functions. As norms in these spaces wehave

∥u∥Ck := ∑|α|≤k

∥∂ αu∥∞, u ∈ Ck(Ω), k = 0,1,2, . . . ,

where ∥ q∥∞ is the supremum norm. Here, the sum is over all multi-indices α ∈ (N∪0)d

with |α| := α1 + · · ·+αd ≤ k and

∂ α := ∂ α11 ∂ α2

2 · · ·∂ αdd

is the α-derivative operator. Similarly, we define C∞(Ω) of infinitely-often differentiablefunctions (but this is not a Banach space anymore). A subscript “c” indicates that all func-tions u in the respective function space (e.g. C∞

c (Ω)) must have their support

suppu := x ∈ Ω : u(x) = 0

compactly contained in Ω (so, suppu ⊂ Ω and suppu compact). For the compact contain-ment of a A in an open set B we write A ⊂⊂ B, which means that A ⊂ B. In this way, theprevious condition could be written suppu ⊂⊂ Ω.

For k ∈ N a positive integer and 1 ≤ p ≤ ∞, the Sobolev space Wk,p(Ω) is defined tocontain all functions u ∈ Lp(Ω) such that the weak derivative ∂ αu exists in Lp(Ω) for all

Com

men

t...


A.4. SOBOLEV AND OTHER FUNCTION SPACES SPACES 113

multi-indices α ∈ (N∪0)d with |α| ≤ k. This means that for each such α , there is a(unique) function vα ∈ Lp(Ω) satisfying∫

vα ·ψ dx = (−1)|α|∫

u ·∂ αψ dx for all ψ ∈ C∞c (Ω),

and we write ∂ αu for this vα . The uniqueness follows from the Fundamental Lemma of theCalculus of Variations 2.21. Clearly, if u ∈ Ck(Ω), then the weak derivative is the classicalderivative. For u ∈ W1,p(Ω) we further define the weak gradient and weak divergence,

∇u := (∂1u,∂2u, . . . ,∂du), divu := ∂1u+∂2u+ · · ·+∂du.

As norm in Wk,p(Ω), 1 ≤ p < ∞, we use

∥u∥Wk,p :=

(∑

|α|≤k∥∂ αu∥p

Lp

)1/p

, u ∈ Wk,p(Ω).

For p = ∞, we set

∥u∥Wk,∞ := max|α|≤k

∥∂ αu∥L∞ , u ∈ Wk,∞(Ω).

Under these norms, the Wk,p(Ω) become Banach spaces.Concerning the boundary values of Sobolev functions, we have:

Theorem A.15 (Trace theorem). For every domain Ω ⊂Rd with Lipschitz boundary andevery 1 ≤ p ≤ ∞, there exists a bounded linear trace operator trΩ : W1,p(Ω) → Lp(∂Ω)such that

trφ = φ|∂Ω if φ ∈ C(Ω).

We write trΩ u simply as u|∂Ω. Furthermore, trΩ is bounded and weakly continuous betweenW1,p(Ω) and Lp(∂Ω).

For 1 < p < ∞ denote the image of W1,p(Ω) under trΩ by W1−1/p,p(∂Ω), called thetrace space of W1,p(Ω). As a makeshift norm on W1−1/p,p(∂Ω) one can use the Lp(∂Ω)-norm, but W1−1/p,p(∂Ω) is not closed under this norm. This can be remedied by intro-ducing a more complicated norm involving “fractional” derivatives (hence the notation forW1−1/p,p(∂Ω)), under which this set becomes a Banach space.

We write W1,p0 (Ω) for the linear subspace of W1,p(Ω) consisting of all W1,p-functions

with zero boundary values (in the sense of trace). More generally, we use W1,pg (Ω) with

g ∈ W1−1/p,p(∂Ω), for the affine subspace of all W1,p-functions with boundary trace g.The following are some properties of Sobolev spaces, stated for simplicity only for the

first-order space W1,p(Ω) with Ω ⊂ Rd a bounded Lipschitz domain.

Theorem A.16 (Poincare–Friedrichs inequality). Let u ∈ W1,p(Ω).

Com

men

t...

(I) If u|∂Ω = 0, then

∥u∥Lp ≤C∥∇u∥Lp ,

where C =C(Ω)> 0 is a constant. If 1 0 is a constant.

Sobolev functions have higher integrability than originally assumed:

Theorem A.17 (Sobolev embedding). Let u ∈ W1,p(Ω).

(I) If p < d, then u ∈ Lp∗(Ω), where

p∗ =d p

d − p

and there is a constant C =C(Ω)> 0 such that ∥u∥Lp∗ ≤C∥u∥W1,p .

(II) If p = d, then u ∈ Lq(Ω) for all 1 ≤ q < ∞ and ∥u∥Lq ≤Cq∥u∥W1,p .

(III) If p > d, then u ∈ C(Ω) and ∥u∥∞ ≤C∥u∥W1,p .

The second part can in fact be made more precise by considering embeddings intoHolder spaces, see Section 5.6.3 in [36] for details.

Theorem A.18 (Rellich–Kondrachov). Let (u j)⊂ W1,p(Ω) with u j u in W1,p.

(I) If p < d, then u j → u in Lq(Ω;RN) (strongly) for any q < p∗ = d p/(d − p).

(II) If p = d, then u j → u in Lq(Ω;RN) (strongly) for any q < ∞.

(III) If p > d, then u j → u uniformly.

The following is a consequence of the Stone–Weierstrass Theorem:

Theorem A.19 (Density). For every 1 ≤ p < ∞, u ∈ W1,p(Ω), and all ε > 0 there existsv ∈ (W1,p ∩C∞)(Ω) with u− v ∈ W1,p

0 (Ω) and ∥u− v∥W1,p < ε .

Finally, we note that for 0 < γ ≤ 1 a function u : Ω → R is γ-Holder-continuous func-tion, in symbols u ∈ C0,γ(Ω), if

∥u∥C0,γ := ∥u∥∞ + supx,y∈Ωx =y

|u(x)−u(y)||x− y|γ

< ∞.

Com

men

t...

A.4. SOBOLEV AND OTHER FUNCTION SPACES SPACES 115

Functions in C0,1(Ω) are called Lipschitz-continuous. We construct higher-order spacesCk,γ(Ω) analogously.

Next, we define a family of mollifiers as follows: Let ρ ∈C∞c (Rd) be radially symmetric

and positive. Then, the family (ρδ )δ ⊂ C∞c (B(0,δ ))(Rd) consists of the functions

ρδ (x) :=1

δ d ρ(

xδ d

), x ∈ Rd .

For u ∈ Wk,p(Rd), where k ∈ N∪0 and 1 ≤ p < ∞, we define the mollification uδ ∈Wk,p(Rd) as the convolution between u and ρδ , i.e.

uδ (x) := (ρδ ⋆u)(x) :=∫

ρδ (x− y)u(y) dy, x ∈ Rd .

Lemma A.20. If u ∈ Wk,p(Rd), then uδ → u strongly in Wk,p as δ ↓ 0.

Analogous results also hold for continuously differentiable functions.Finally, all the above notions and theorems continue to hold for vector-valued functions

u = (u1, . . . ,um)T : Ω → Rm and in this case we set

∇u :=

∂1u1 ∂2u1 · · · ∂du1

∂1u2 ∂2u2 · · · ∂du2

......

...∂1um ∂2um · · · ∂dum

∈ Rm×d .

We use the spaces C(Ω;Rm),Ck(Ω;Rm),Wk,p(Ω;Rm),Ck,γ(Ω;Rm) with analogous mean-ings.

We also use local versions of all spaces, namely Cloc(Ω),Ckloc(Ω),Wk,p

loc(Ω),Ck,γloc(Ω),

where the defining norm is only finite on every compact subset of Ω.

Com

men

t...

Bibliography

[1] J. J. Alibert and G. Bouchitte. Non-uniform integrability and generalized Youngmeasures. J. Convex Anal., 4:129–147, 1997.

[2] L. Ambrosio, N. Fusco, and D. Pallara. Functions of Bounded Variation andFree-Discontinuity Problems. Oxford Mathematical Monographs. Oxford Univer-sity Press, 2000.

[3] H. Attouch, G. Buttazzo, and G. Michaille. Variational Analysis in Sobolev and BVSpaces – Applications to PDEs and Optimization. MPS-SIAM Series on Optimiza-tion. Society for Industrial and Applied Mathematics, 2006.

[4] E. J. Balder. A general approach to lower semicontinuity and lower closure in optimalcontrol theory. SIAM J. Control Optim., 22:570–598, 1984.

[5] J. M. Ball. Convexity conditions and existence theorems in nonlinear elasticity. Arch.Ration. Mech. Anal., 63:337–403, 1976/77.

[6] J. M. Ball. Global invertibility of Sobolev functions and the interpenetration of mat-ter. Proc. Roy. Soc. Edinburgh Sect. A, 88:315–328, 1981.

[7] J. M. Ball. A version of the fundamental theorem for Young measures. In PDEs andcontinuum models of phase transitions (Nice, 1988), volume 344 of Lecture Notes inPhysics, pages 207–215. Springer, 1989.

[8] J. M. Ball. Some open problems in elasticity. In Geometry, mechanics, and dynamics,pages 3–59. Springer, 2002.

[9] J. M. Ball, J. C. Currie, and P. J. Olver. Null Lagrangians, weak continuity, andvariational problems of arbitrary order. J. Funct. Anal., 41:135–174, 1981.

[10] J. M. Ball and R. D. James. Fine phase mixtures as minimizers of energy. Arch.Ration. Mech. Anal., 100:13–52, 1987.

[11] J. M. Ball and R. D. James. Proposed experimental tests of a theory of fine mi-crostructure and the two-well problem. Phil. Trans. R. Soc. Lond. A, 338:389–450,1992.

Com

men

t...


118 BIBLIOGRAPHY

[12] J. M. Ball, B. Kirchheim, and J. Kristensen. Regularity of quasiconvex envelopes.Calc. Var. Partial Differential Equations, 11:333–359, 2000.

[13] J. M. Ball and V. J. Mizel. One-dimensional variational problems whose minimizersdo not satisfy the Euler-Lagrange equation. Arch. Rational Mech. Anal., 90:325–388,1985.

[14] J. M. Ball and F. Murat. Remarks on Chacon’s biting lemma. Proc. Amer. Math.Soc., 107:655–663, 1989.

[15] J. Bergh and J. Lofstrom. Interpolation Spaces, volume 223 of Grundlehren dermathematischen Wissenschaften. Springer, 1976.

[16] H. Berliocchi and J.-M. Lasry. Integrandes normales et mesures parametrees en cal-cul des variations. Bull. Soc. Math. France, 101:129–184, 1973.

[17] B. Bojarski and T. Iwaniec. Analytical foundations of the theory of quasiconformalmappings in Rn. Ann. Acad. Sci. Fenn. Ser. A I Math., 8:257–324, 1983.

[18] A. Braides. Γ-convergence for Beginners, volume 22 of Oxford Lecture Series inMathematics and its Applications. Oxford University Press, 2002.

[19] C. Castaing, P. R. de Fitte, and M. Valadier. Young Measures on Topological Spaces:With Applications in Control Theory and Probability Theory. Mathematics and itsApplications. Kluwer, 2004.

[20] Ph. G. Ciarlet. Mathematical Elasticity, volume 1: Three Dimensional Elasticity.North-Holland, 1988.

[21] Ph. G. Ciarlet and G. Geymonat. Sur les lois de comportement en elasticite nonlineaire compressible. C. R. Acad. Sci. Paris Ser. II Mec. Phys. Chim. Sci. UniversSci. Terre, 295:423–426, 1982.

[22] Ph. G. Ciarlet and J. Necas. Injectivity and self-contact in nonlinear elasticity. Arch.Rational Mech. Anal., 97:171–188, 1987.

[23] J. B. Conway. A Course in Functional Analysis, volume 96 of Graduate Texts inMathematics. Springer, 2nd edition, 1990.

[24] B. Dacorogna. Weak Continuity and Weak Lower Semicontinuity for Nonlinear Func-tionals, volume 922 of Lecture Notes in Mathematics. Springer, 1982.

[25] B. Dacorogna. Direct Methods in the Calculus of Variations, volume 78 of AppliedMathematical Sciences. Springer, 2nd edition, 2008.

[26] B. Dacorogna. Introduction to the calculus of variations. Imperial College Press,2nd edition, 2009.

Com

men

t...


BIBLIOGRAPHY 119

[27] G. Dal Maso. An Introduction to Γ-Convergence, volume 8 of Progress in NonlinearDifferential Equations and their Applications. Birkhauser, 1993.

[28] E. De Giorgi. Sulla differenziabilita e l’analiticita delle estremali degli integrali mul-tipli regolari. Mem. Accad. Sci. Torino. Cl. Sci. Fis. Mat. Nat. (3), 3:25–43, 1957.

[29] E. De Giorgi. Un esempio di estremali discontinue per un problema variazionale ditipo ellittico. Boll. Un. Mat. Ital., 1:135–137, 1968.

[30] J. Diestel and J. J. Uhl, Jr. Vector measures, volume 15 of Mathematical Surveys.American Mathematical Society, 1977.

[31] N. Dinculeanu. Vector measures, volume 95 of International Series of Monographsin Pure and Applied Mathematics. Pergamon Press, 1967.

[32] R. J. DiPerna and A. J. Majda. Oscillations and concentrations in weak solutions ofthe incompressible fluid equations. Comm. Math. Phys., 108:667–689, 1987.

[33] G. Dolzmann and S. Muller. Microstructures with finite surface energy: the two-wellproblem. Arch. Ration. Mech. Anal., 132:101–141, 1995.

[34] I. Ekeland and R. Temam. Convex analysis and variational problems. North-Holland,1976.

[35] L. C. Evans. Weak convergence methods for nonlinear partial differential equations,volume 74 of CBMS Regional Conference Series in Mathematics. Conference Boardof the Mathematical Sciences, 1990.

[36] L. C. Evans. Partial differential equations, volume 19 of Graduate Studies in Math-ematics. American Mathematical Society, 2nd edition, 2010.

[37] I. Fonseca and G. Leoni. Modern Methods in the Calculus of Variations: Lp Spaces.Springer, 2007.

[38] I. Fonseca, S. Muller, and P. Pedregal. Analysis of concentration and oscillationeffects generated by gradients. SIAM J. Math. Anal., 29:736–756, 1998.

[39] M. Giaquinta and S. Hildebrandt. Calculus of variations. I – The Lagrangian for-malism, volume 310 of Grundlehren der mathematischen Wissenschaften. Springer,1996.

[40] M. Giaquinta and S. Hildebrandt. Calculus of variations. II – The Hamiltonian for-malism, volume 311 of Grundlehren der mathematischen Wissenschaften. Springer,1996.

[41] M. Giaquinta, G. Modica, and J. Soucek. Cartesian currents in the calculus of varia-tions. I, volume 37 of Ergebnisse der Mathematik und ihrer Grenzgebiete. Springer,1998.

Com

men

t...


120 BIBLIOGRAPHY

[42] M. Giaquinta, G. Modica, and J. Soucek. Cartesian currents in the calculus of varia-tions. II, volume 38 of Ergebnisse der Mathematik und ihrer Grenzgebiete. Springer,1998.

[43] D. Gilbarg and N. S. Trudinger. Elliptic Partial Differential Equations of SecondOrder, volume 224 of Grundlehren der mathematischen Wissenschaften. Springer,1998.

[44] E. Giusti. Direct methods in the calculus of variations. World Scientific, 2003.

[45] L. Grafakos. Classical Fourier Analysis, volume 249 of Graduate Texts in Mathe-matics. Springer, 2nd edition, 2008.

[46] D. Henao and C. Mora-Corral. Invertibility and weak continuity of the determinantfor the modelling of cavitation and fracture in nonlinear elasticity. Arch. Ration.Mech. Anal., 197:619–655, 2010.

[47] D. Hilbert. Mathematische Probleme – Vortrag, gehalten auf dem internationalenMathematiker-Kongreß zu Paris 1900. Nachrichten von der Koniglichen Gesellschaftder Wissenschaften zu Gottingen, Mathematisch-Physikalische Klasse, pages 253–297, 1900.

[48] R. A. Horn and C. R. Johnson. Matrix analysis. Cambridge University Press, 2ndedition, 2013.

[49] D. Kinderlehrer and P. Pedregal. Characterizations of Young measures generated bygradients. Arch. Ration. Mech. Anal., 115:329–365, 1991.

[50] D. Kinderlehrer and P. Pedregal. Weak convergence of integrands and the Youngmeasure representation. SIAM J. Math. Anal., 23:1–19, 1992.

[51] D. Kinderlehrer and P. Pedregal. Gradient Young measures generated by sequencesin Sobolev spaces. J. Geom. Anal., 4:59–90, 1994.

[52] K. Koumatos, F. Rindler, and E. Wiedemann. Differential inclusions and Youngmeasures involving prescribed Jacobians. SIAM J. Math. Anal., 2015. to appear.

[53] J. Kristensen. Lower semicontinuity of quasi-convex integrals in BV. Calc. Var.Partial Differential Equations, 7(3):249–261, 1998.

[54] J. Kristensen. Lower semicontinuity in spaces of weakly differentiable functions.Math. Ann., 313:653–710, 1999.

[55] J. Kristensen and C. Melcher. Regularity in oscillatory nonlinear elliptic systems.Math. Z., 260:813–847, 2008.

[56] J. Kristensen and G. Mingione. The singular set of Lipschitzian minima of multipleintegrals. Arch. Ration. Mech. Anal., 184:341–369, 2007.

Com

men

t...


BIBLIOGRAPHY 121

[57] J. Kristensen and F. Rindler. Characterization of generalized gradient Young mea-sures generated by sequences in W1,1 and BV. Arch. Ration. Mech. Anal., 197:539–598, 2010. Erratum: Vol. 203 (2012), 693-700.

[58] M. Kruzık and T. Roubıcek. On the measures of DiPerna and Majda. Math. Bohem.,122:383–399, 1997.

[59] G. Leoni. A first course in Sobolev spaces, volume 105 of Graduate Studies in Math-ematics. American Mathematical Society, 2009.

[60] M. Marcus and V. J. Mizel. Transformations by functions in Sobolev spaces andlower semicontinuity for parametric variational problems. Bull. Amer. Math. Soc.,79:790–795, 1973.

[61] P. Mattila. Geometry of Sets and Measures in Euclidean Spaces, volume 44 of Cam-bridge Studies in Advanced Mathematics. Cambridge University Press, 1995.

[62] E. J. McShane. Generalized curves. Duke Math. J., 6:513–536, 1940.

[63] E. J. McShane. Necessary conditions in generalized-curve problems of the calculusof variations. Duke Math. J., 7:1–27, 1940.

[64] E. J. McShane. Sufficient conditions for a weak relative minimum in the problem ofBolza. Trans. Amer. Math. Soc., 52:344–379, 1942.

[65] N. G. Meyers. Quasi-convexity and lower semi-continuity of multiple variationalintegrals of any order. Trans. Amer. Math. Soc., 119:125–149, 1965.

[66] B. S. Mordukhovich. Variational analysis and generalized differentiation. I – Basictheory, volume 330 of Grundlehren der mathematischen Wissenschaften. Springer,2006.

[67] B. S. Mordukhovich. Variational analysis and generalized differentiation. II – Appli-cations, volume 331 of Grundlehren der mathematischen Wissenschaften. Springer,2006.

[68] C. B. Morrey, Jr. On the solutions of quasi-linear elliptic partial differential equations.Trans. Amer. Math. Soc., 43:126–166, 1938.

[69] C. B. Morrey, Jr. Quasiconvexity and the semicontinuity of multiple integrals. PacificJ. Math., 2:25–53, 1952.

[70] C. B. Morrey, Jr. Multiple Integrals in the Calculus of Variations, volume 130 ofGrundlehren der mathematischen Wissenschaften. Springer, 1966.

[71] J. Moser. A new proof of De Giorgi’s theorem concerning the regularity problem forelliptic differential equations. Comm. Pure Appl. Math., 13:457–468, 1960.

Com

men

t...


122 BIBLIOGRAPHY

[72] J. Moser. On Harnack’s theorem for elliptic differential equations. Comm. Pure Appl.Math., 14:577–591, 1961.

[73] S. Muller. Variational models for microstructure and phase transitions. In Calculusof variations and geometric evolution problems (Cetraro, 1996), volume 1713 ofLecture Notes in Mathematics, pages 85–210. Springer, 1999.

[74] S. Muller and S. J. Spector. An existence theory for nonlinear elasticity that allowsfor cavitation. Arch. Rational Mech. Anal., 131:1–66, 1995.

[75] S. Muller and V. Sverak. Convex integration for Lipschitz mappings and counterex-amples to regularity. Ann. of Math., 157:715–742, 2003.

[76] F. Murat. Compacite par compensation. Ann. Scuola Norm. Sup. Pisa Cl. Sci., 5:489–507, 1978.

[77] F. Murat. Compacite par compensation. II. In Proceedings of the InternationalMeeting on Recent Methods in Nonlinear Analysis (Rome, 1978), pages 245–256.Pitagora, 1979.

[78] J. Nash. Continuity of solutions of parabolic and elliptic equations. Amer. J. Math.,80:931–954, 1958.

[79] J. Necas. On regularity of solutions to nonlinear variational inequalities for second-order elliptic systems. Rend. Mat., 8:481–498, 1975.

[80] P. J. Olver. Applications of Lie groups to differential equations, volume 107 of Grad-uate Texts in Mathematics. Springer, 2nd edition, 1993.

[81] P. Pedregal. Parametrized Measures and Variational Principles, volume 30 ofProgress in Nonlinear Differential Equations and their Applications. Birkhauser,1997.

[82] Y. G. Reshetnyak. On the stability of conformal mappings in multidimensionalspaces. Siberian Math. J., 8:69–85, 1967.

[83] R. T. Rockafellar. Convex Analysis, volume 28 of Princeton Mathematical Series.Princeton University Press, 1970.

[84] R. T. Rockafellar and R. J.-B. Wets. Variational analysis, volume 317 of Grundlehrender mathematischen Wissenschaften. Springer, 1998.

[85] W. Rudin. Real and Complex Analysis. McGraw–Hill, 3rd edition, 1986.

[86] L. Simon. Schauder estimates by scaling. Calc. Var. Partial Differential Equations,5:391–407, 1997.

[87] E. M. Stein. Harmonic Analysis. Princeton University Press, 1993.

Com

men

t...


BIBLIOGRAPHY 123

[88] V. Sverak. Quasiconvex functions with subquadratic growth. Proc. Roy. Soc. LondonSer. A, 433:723–725, 1991.

[89] V. Sverak. Rank-one convexity does not imply quasiconvexity. Proc. Roy. Soc.Edinburgh Sect. A, 120:185–189, 1992.

[90] V. Sverak and X. Yan. A singular minimizer of a smooth strongly convex functionalin three dimensions. Calc. Var. Partial Differential Equations, 10:213–221, 2000.

[91] V. Sverak and X. Yan. Non-Lipschitz minimizers of smooth uniformly convex func-tionals. Proc. Natl. Acad. Sci. USA, 99:15269–15276, 2002.

[92] L. Szekelyhidi, Jr. The regularity of critical points of polyconvex functionals. Arch.Ration. Mech. Anal., 172:133–152, 2004.

[93] Q. Tang. Almost-everywhere injectivity in nonlinear elasticity. Proc. Roy. Soc. Ed-inburgh Sect. A, 109:79–95, 1988.

[94] L. Tartar. Compensated compactness and applications to partial differential equa-tions. In Nonlinear analysis and mechanics: Heriot-Watt Symposium, Vol. IV, vol-ume 39 of Res. Notes in Math., pages 136–212. Pitman, 1979.

[95] L. Tartar. The compensated compactness method applied to systems of conservationlaws. In Systems of nonlinear partial differential equations (Oxford, 1982), volume111 of NATO Adv. Sci. Inst. Ser. C Math. Phys. Sci., pages 263–285. Reidel, 1983.

[96] B. van Brunt. The calculus of variations. Universitext. Springer, 2004.

[97] L. C. Young. Generalized curves and the existence of an attained absolute minimumin the calculus of variations. C. R. Soc. Sci. Lett. Varsovie, Cl. III, 30:212–234, 1937.

[98] L. C. Young. Generalized surfaces in the calculus of variations. Ann. of Math.,43:84–103, 1942.

[99] L. C. Young. Generalized surfaces in the calculus of variations. II. Ann. of Math.,43:530–544, 1942.

[100] L. C. Young. Lectures on the calculus of variations and optimal control theory.Saunders, 1969.

[101] L. C. Young. Lectures on the calculus of variations and optimal control theory.Chelsea, 2nd edition, 1980. (reprinted by AMS Chelsea Publishing 2000).

Com

men

t...


Index

action of a measure, 111adjugate matrix, 108

Ball existence theorem, 84Ball–James rigidity theorem, 66Banach–Alaoglu theorem, 112barycenter, 111biting convergence, 76blow-up technique, 69bootstrapping, 28Borel σ -algebra, 110Borel measure, 110brachistochrone curve, 2

Caccioppoli inequality, 25Caratheodory integrand, 14Ciarlet–Necas non-interpenetration con-

dition, 86coercivity, 12

weak, 14cofactor matrix, 108compactness principle for Young mea-

sures, 56compensated compactness, 75continuum mechanics, 6convexity, 11

strong, 25convolution, 115Cramer rule, 108critical point, 21cut-off construction, 48

Dacorogna formula, 105De Giorgi regularity theorem, 28

De Giorgi–Nash–Moser regularity theo-rem, 29

density theorem for Sobolev spaces, 114difference quotient, 25Direct Method, 12

for weak convergence, 13with side constraint, 38

directional derivative, 19Dirichlet integral, 5, 17, 33disintegration theorem, 57distributional cofactors, 81distributional determinant, 82double-well potential, 10, 87duality pairing, 111duality pairing, for measures, 110Dunford–Pettis theorem, 112

elasticity tensor, 7electric energy, 4electric potential, 4ellipticity, 21Euler–Lagrange equation, 19Evans partial regularity theorem, 74extension–relaxation, 95, 96

Faraday law of induction, 4Fatou lemma, 109frame-indifference, 43Frobenius matrix (inner) product, 107Frobenius matrix norm, 107Fundamental lemma of the Calculus of

Variations, 23Fundamental theorem of Young measure

theory, 55

Com

men

t...


126 INDEX

Gauss law, 4Green–St. Venant strain tensor, 6ground state in quantum mechanics, 5growth bound

lower, 15upper, 15

Holder-continuity, 114Hadamard inequality, 108Hamiltonian, 5harmonic function, 20homogenization lemma, 100hyperelasticity, 6

integrand, 11invariance, 34

Jensen inequality, 111Jensen-type inequality, 65

Kinderlehrer–Pedregal theorem, 102Korn inequality, 17

Lagrange multiplier, 39Lame constants, 80lamination, 47Laplace equation, 20Laplace operator, 20Lavrentiev gap phenomenon, 31Lebesgue dominated convergence theo-

rem, 109Lebesgue point, 110Legendre–Hadamard condition, 104linearized strain tensor, 7Lipschitz domain, 11Lipschitz-continuity, 115lower semicontinuity, 12

maximal function, 63Mazur lemma, 112microstructure, 87, 95minimizing sequence, 12minor, 51modulus of continuity, 93mollification, 115

mollifier, 115Monotone convergence lemma, 109multi-index, 112

ordered, 51

Noether multipliers, 35Noether theorem, 35normal integrand, 76null-Lagrangian, 51

o, 109objective functional, 12Ogden material, 7optimal beating problem, 10optimal saving problem, 8

Philosphy of Science, 1Piola identity, 52Poincare–Friedrichs inequality, 113Poisson equation, 4polyconvexity, 78Pratt theorem, 109probability measure, 110

quasiaffine, 53quasiconvex envelope, 88quasiconvexity, 44

strong, 74

Radon measure, 110rank of a minor, 51rank-one convexity, 46recovery sequence, 96regular variational integral, 25relaxation, 88, 91relaxation theorem

abstract, 91for integral functionals, 93with Young measures, 96

Rellich–Kondrachov theorem, 114Riesz representation theorem, 111rigid body motion, 6

scalar problem, 18Schauder estimate, 28, 29

Com

men

t...


INDEX 127

Schrodinger equation, 5stationary, 5, 41

Scorza Dragoni theorem, 15side constraint, 37singular set, 75singular value, 108singular value decomposition, 108Sobolev embedding theorem, 114Sobolev space, 112solution

classical, 23strong, 23weak, 19

St. Venant–Kirchhoff material, 80Statistical Mechanics, 1stiffness tensor, 7strong convergence

in Lp-spaces, 109support, 112symmetries of minimizers, 33

tensor product, 107trace operator, 113trace space, 113trace theorem, 113two-gradient inclusion

approximate, 66exact, 66

two-gradient problem, 66

underlying deformation, 62utility function, 8

variationfirst, 21

variational principle, 1vectorial problem, 18Vitali convergence theorem, 110Vitali covering theorem, 110

W2,2loc-regularity theorem, 25

wave equation, 23wave function, 5weak compactness in Banach spaces, 111weak continuity, 38weak continuity of minors, 53weak derivative, 112weak divergence, 113weak gradient, 113weak lower semicontinuity, 14

Young inequality, 107Young measure, 55

elementary, 56gradient, 62homogeneous, 59set of ˜s, 55

Com

men

t...


Introduction to the Modern Calculus of Variations · MA4G6 Lecture Notes Introduction to the Modern...

Documents

Transcript of Introduction to the Modern Calculus of Variations · MA4G6 Lecture Notes Introduction to the Modern...