THE VARIATIONAL FORMULATION OF THE FOKKER ...THE VARIATIONAL FORMULATION OF THE FOKKER–PLANCK...

17
THE VARIATIONAL FORMULATION OF THE FOKKER–PLANCK EQUATION RICHARD JORDAN , DAVID KINDERLEHRER , AND FELIX OTTO § SIAM J. MATH .A NAL . c 1998 Society for Industrial and Applied Mathematics Vol. 29, No. 1, pp. 1–17, January 1998 001 In memory of Richard Dun Abstract. The Fokker–Planck equation, or forward Kolmogorov equation, describes the evolu- tion of the probability density for a stochastic process associated with an Ito stochastic dierential equation. It pertains to a wide variety of time-dependent systems in which randomness plays a role. In this paper, we are concerned with Fokker–Planck equations for which the drift term is given by the gradient of a potential. For a broad class of potentials, we construct a time discrete, iterative variational scheme whose solutions converge to the solution of the Fokker–Planck equation. The major novelty of this iterative scheme is that the time-step is governed by the Wasserstein metric on probability measures. This formulation enables us to reveal an appealing, and previously unexplored, relationship between the Fokker–Planck equation and the associated free energy functional. Namely, we demonstrate that the dynamics may be regarded as a gradient flux, or a steepest descent, for the free energy with respect to the Wasserstein metric. Key words. Fokker–Planck equation, steepest descent, free energy, Wasserstein metric AMS subject classifications. 35A15, 35K15, 35Q99, 60J60 PII. S0036141096303359 1. Introduction and overview. The Fokker–Planck equation plays a central role in statistical physics and in the study of fluctuations in physical and biological systems [7, 22, 23]. It is intimately connected with the theory of stochastic dierential equations: a (normalized) solution to a given Fokker–Planck equation represents the probability density for the position (or velocity) of a particle whose motion is described by a corresponding Ito stochastic dierential equation (or Langevin equation). We shall restrict our attention in this paper to the case where the drift coecient is a gradient. The simplest relevant physical setting is that of a particle undergoing diusion in a potential field [23]. It is known that, under certain conditions on the drift and diusion coecients, the stationary solution of a Fokker–Planck equation of the type that we consider here satisfies a variational principle. It minimizes a certain convex free energy func- tional over an appropriate admissible class of probability densities [12]. This free energy functional satisfies an H-theorem: it decreases in time for any solution of the Fokker–Planck equation [22]. In this work, we shall establish a deeper, and appar- ently previously unexplored, connection between the free energy functional and the Fokker–Planck dynamics. Specifically, we shall demonstrate that the solution of the Received by the editors May 13, 1996; accepted for publication (in revised form) December 9, 1996. The research of all three authors is partially supported by the ARO and the NSF through grants to the Center for Nonlinear Analysis. In addition, the third author is partially supported by the Deutsche Forschungsgemeinschaft (German Science Foundation), and the second author is partially supported by grants NSF/DMS 9505078 and DAAL03-92-0003. http://www.siam.org/journals/sima/29-1/30335.html Center for Nonlinear Analysis, Carnegie Mellon University. Present address: Department of Mathematics, University of Michigan ([email protected]). Center for Nonlinear Analysis, Carnegie Mellon University ([email protected]). § Center for Nonlinear Analysis, Carnegie Mellon University and Department of Applied Math- ematics, University of Bonn. Present address: Courant Institute of Mathematical Sciences ([email protected]). 1

Transcript of THE VARIATIONAL FORMULATION OF THE FOKKER ...THE VARIATIONAL FORMULATION OF THE FOKKER–PLANCK...

Page 1: THE VARIATIONAL FORMULATION OF THE FOKKER ...THE VARIATIONAL FORMULATION OF THE FOKKER–PLANCK EQUATION∗ RICHARD JORDAN†, DAVID KINDERLEHRER‡, AND FELIX OTTO SIAM J. MA T H.AN

THE VARIATIONAL FORMULATION OF THE FOKKER–PLANCKEQUATION∗

RICHARD JORDAN† , DAVID KINDERLEHRER‡ , AND FELIX OTTO§

SIAM J. M A T H . A N A L . c 1998 Society for Industrial and Applied MathematicsVol. 29, No. 1, pp. 1–17, January 1998 001

In memory of Richard Duffin

Abstract. The Fokker–Planck equation, or forward Kolmogorov equation, describes the evolu-tion of the probability density for a stochastic process associated with an Ito stochastic differentialequation. It pertains to a wide variety of time-dependent systems in which randomness plays a role.In this paper, we are concerned with Fokker–Planck equations for which the drift term is given bythe gradient of a potential. For a broad class of potentials, we construct a time discrete, iterativevariational scheme whose solutions converge to the solution of the Fokker–Planck equation. Themajor novelty of this iterative scheme is that the time-step is governed by the Wasserstein metric onprobability measures. This formulation enables us to reveal an appealing, and previously unexplored,relationship between the Fokker–Planck equation and the associated free energy functional. Namely,we demonstrate that the dynamics may be regarded as a gradient flux, or a steepest descent, for thefree energy with respect to the Wasserstein metric.

Key words. Fokker–Planck equation, steepest descent, free energy, Wasserstein metric

AMS subject classifications. 35A15, 35K15, 35Q99, 60J60

PII. S0036141096303359

1. Introduction and overview. The Fokker–Planck equation plays a centralrole in statistical physics and in the study of fluctuations in physical and biologicalsystems [7, 22, 23]. It is intimately connected with the theory of stochastic differentialequations: a (normalized) solution to a given Fokker–Planck equation represents theprobability density for the position (or velocity) of a particle whose motion is describedby a corresponding Ito stochastic differential equation (or Langevin equation). Weshall restrict our attention in this paper to the case where the drift coefficient isa gradient. The simplest relevant physical setting is that of a particle undergoingdiffusion in a potential field [23].

It is known that, under certain conditions on the drift and diffusion coefficients,the stationary solution of a Fokker–Planck equation of the type that we considerhere satisfies a variational principle. It minimizes a certain convex free energy func-tional over an appropriate admissible class of probability densities [12]. This freeenergy functional satisfies an H-theorem: it decreases in time for any solution of theFokker–Planck equation [22]. In this work, we shall establish a deeper, and appar-ently previously unexplored, connection between the free energy functional and theFokker–Planck dynamics. Specifically, we shall demonstrate that the solution of the

∗Received by the editors May 13, 1996; accepted for publication (in revised form) December 9,1996. The research of all three authors is partially supported by the ARO and the NSF throughgrants to the Center for Nonlinear Analysis. In addition, the third author is partially supportedby the Deutsche Forschungsgemeinschaft (German Science Foundation), and the second author ispartially supported by grants NSF/DMS 9505078 and DAAL03-92-0003.

http://www.siam.org/journals/sima/29-1/30335.html†Center for Nonlinear Analysis, Carnegie Mellon University. Present address: Department of

Mathematics, University of Michigan ([email protected]).‡Center for Nonlinear Analysis, Carnegie Mellon University ([email protected]).§Center for Nonlinear Analysis, Carnegie Mellon University and Department of Applied Math-

ematics, University of Bonn. Present address: Courant Institute of Mathematical Sciences([email protected]).

1

Page 2: THE VARIATIONAL FORMULATION OF THE FOKKER ...THE VARIATIONAL FORMULATION OF THE FOKKER–PLANCK EQUATION∗ RICHARD JORDAN†, DAVID KINDERLEHRER‡, AND FELIX OTTO SIAM J. MA T H.AN

2 R. JORDAN, D. KINDERLEHRER, AND F. OTTO

Fokker–Planck equation follows, at each instant in time, the direction of steepestdescent of the associated free energy functional.

The notion of a steepest descent, or a gradient flux, makes sense only in contextwith an appropriate metric. We shall show that the required metric in the case of theFokker–Planck equation is the Wasserstein metric (defined in section 3) on probabilitydensities. As far as we know, the Wasserstein metric cannot be written as an inducedmetric for a metric tensor (the space of probability measures with the Wassersteinmetric is not a Riemannian manifold). Thus, in order to give meaning to the assertionthat the Fokker–Planck equation may be regarded as a steepest descent, or gradientflux, of the free energy functional with respect to this metric, we switch to a discretetime formulation. We develop a discrete, iterative variational scheme whose solutionsconverge, in a sense to be made precise below, to the solution of the Fokker–Planckequation. The time-step in this iterative scheme is associated with the Wassersteinmetric. For a different view on the use of implicit schemes for measures, see [6, 16].

For the purpose of comparison, let us consider the classical diffusion (or heat)equation

∂ρ(t, x)

∂t= ∆ρ(t, x) , t ∈ (0,∞) , x ∈ Rn ,

which is the Fokker–Planck equation associated with a standard n-dimensional Brow-nian motion. It is well known (see, for example, [5, 24]) that this equation is thegradient flux of the Dirichlet integral 1

2

Rn |∇ρ|2 dx with respect to the L2(Rn) met-

ric. The classical discretization is given by the scheme

Determine ρ(k) that minimizes

12 ρ

(k−1) − ρ2L2 (Rn) + h2

Rn|∇ρ|2 dx

over an appropriate class of densities ρ. Here, h is the time step size. On the otherhand, we derive as a special case of our results below that the scheme

Determine ρ(k) that minimizes

12 d(ρ

(k−1), ρ)2 + h

Rnρ log ρ dx

over all ρ ∈ K ,

(1)

where K is the set of all probability densities on Rn having finite second moments,is also a discretization of the diffusion equation when d is the Wasserstein metric.In particular, this allows us to regard the diffusion equation as a steepest descentof the functional

Rn ρ log ρ dx with respect to the Wasserstein metric. This re-

veals a novel link between the diffusion equation and the Gibbs–Boltzmann entropy(−Rn ρ log ρ dx) of the density ρ. Furthermore, this formulation allows us to at-

tach a precise interpretation to the conventional notion that diffusion arises from thetendency of the system to maximize entropy.

The connection between the Wasserstein metric and dynamical problems involvingdissipation or diffusion (such as strongly overdamped fluid flow or nonlinear diffusionequations) seems to have first been recognized by Otto in [19]. The results in [19]together with our recent research on variational principles of entropy and free energytype for measures [12, 11, 15] provide the impetus for the present investigation. Thework in [12] was motivated by the desire to model and characterize metastability

Page 3: THE VARIATIONAL FORMULATION OF THE FOKKER ...THE VARIATIONAL FORMULATION OF THE FOKKER–PLANCK EQUATION∗ RICHARD JORDAN†, DAVID KINDERLEHRER‡, AND FELIX OTTO SIAM J. MA T H.AN

VARIATIONAL FORMULATION OF THE FOKKER–PLANCK EQUATION 3

and hysteresis in physical systems. We plan to explore in subsequent research therelevance of the developments in the present paper to the study of such phenomena.Some preliminary results in this direction may be found in [13, 14].

The paper is organized as follows. In section 2, we first introduce the Fokker–Planck equation and briefly discuss its relationship to stochastic differential equations.We then give the precise form of the associated stationary solution and of the freeenergy functional that this density minimizes. In section 3, the Wasserstein metric isdefined, and a brief review of its properties and interpretations is given. The iterativevariational scheme is formulated in section 4, and the existence and uniqueness of itssolutions are established. The main result of this paper—namely, the convergenceof solutions of this scheme (after interpolation) to the solution of the Fokker–Planckequation—is the topic of section 5. There, we state and prove the relevant convergencetheorem.

2. The Fokker–Planck equation, stationary solutions, and the free en-ergy functional. We are concerned with Fokker–Planck equations having the form

∂ρ

∂t= div (∇Ψ(x)ρ) + β−1∆ρ , ρ(x, 0) = ρ0(x),(2)

where the potential Ψ(x) : Rn → [0,∞) is a smooth function, β > 0 is a givenconstant, and ρ0(x) is a probability density on Rn. The solution ρ(t, x) of (2) must,therefore, be a probability density on Rn for almost every fixed time t. That is,ρ(t, x) ≥ 0 for almost every (t, x) ∈ (0,∞) × Rn, and

Rn ρ(t, x) dx = 1 for almost

every t ∈ (0,∞).It is well known that the Fokker–Planck equation (2) is inherently related to the

Ito stochastic differential equation [7, 22, 23]

dX(t) = −∇Ψ(X(t))dt+

2β−1 dW (t) , X(0) = X0 .(3)

Here, W (t) is a standard n-dimensional Wiener process, and X0 is an n-dimensionalrandom vector with probability density ρ0. Equation (3) is a model for the motionof a particle undergoing diffusion in the potential field Ψ. X(t) ∈ Rn then representsthe position of the particle, and the positive parameter β is proportional to the in-verse temperature. This stochastic differential equation arises, for example, as theSmoluchowski–Kramers approximation to the Langevin equation for the motion of achemically bound particle [23, 4, 17]. In that case, the function Ψ describes the chemi-cal bonding forces, and the term

2β−1 dW (t) represents white noise forces resulting

from molecular collisions [23]. The solution ρ(t, x) of the Fokker–Planck equation (2)furnishes the probability density at time t for finding the particle at position x.

If the potential Ψ satisfies appropriate growth conditions, then there is a uniquestationary solution ρs(x) of the Fokker–Planck equation, and it takes the form of theGibbs distribution [7, 22]

ρs(x) = Z−1 exp(−βΨ(x)),(4)

where the partition function Z is given by the expression

Z =

Rnexp(−βΨ(x)) dx.

Note that, in order for equation (4) to make sense, Ψ must grow rapidly enough toensure that Z is finite. The probability measure ρs(x) dx, when it exists, is the unique

Page 4: THE VARIATIONAL FORMULATION OF THE FOKKER ...THE VARIATIONAL FORMULATION OF THE FOKKER–PLANCK EQUATION∗ RICHARD JORDAN†, DAVID KINDERLEHRER‡, AND FELIX OTTO SIAM J. MA T H.AN

4 R. JORDAN, D. KINDERLEHRER, AND F. OTTO

invariant measure for the Markov process X(t) defined by the stochastic differentialequation (3).

It is readily verified (see, for example, [12]) that the Gibbs distribution ρs satisfiesa variational principle—it minimizes over all probability densities on Rn the freeenergy functional

F (ρ) = E(ρ) + β−1S(ρ),(5)

where

E(ρ) :=

RnΨρ dx(6)

plays the role of an energy functional, and

S(ρ) :=

Rnρ log ρ dx(7)

is the negative of the Gibbs–Boltzmann entropy functional.Even when the Gibbs measure is not defined, the free energy (5) of a density ρ(t, x)

satisfying the Fokker–Planck equation (2) may be defined, provided that F (ρ0) isfinite. This free energy functional then serves as a Lyapunov function for the Fokker–Planck equation: if ρ(t, x) satisfies (2), then F (ρ(t, x)) can only decrease with time[22, 14]. Thus, the free energy functional is an H-function for the dynamics. Thedevelopments that follow will enable us to regard the Fokker–Planck dynamics as agradient flux, or a steepest descent, of the free energy with respect to a particularmetric on an appropriate class of probability measures. The requisite metric is theWasserstein metric on the set of probability measures having finite second moments.We now proceed to define this metric.

3. The Wasserstein metric. The Wasserstein distance of order two, d(µ1, µ2),between two (Borel) probability measures µ1 and µ2 on Rn is defined by the formula

d(µ1, µ2)2 = inf

p∈P(µ1 ,µ2 )

Rn×Rn|x− y|2 p(dxdy),(8)

where P(µ1, µ2) is the set of all probability measures on Rn ×Rn with first marginalµ1 and second marginal µ2, and the symbol | · | denotes the usual Euclidean normon Rn. More precisely, a probability measure p is in P(µ1, µ2) if and only if for eachBorel subset A ⊂ Rn there holds

p(A× Rn) = µ1(A), p(Rn ×A) = µ2(A).

Wasserstein distances of order q with q different from 2 may be analogously defined[10]. Since no confusion should arise in doing so, we shall refer to d in what followsas simply the Wasserstein distance.

It is well known that d defines a metric on the set of probability measures µ on Rn

having finite second moments:

Rn |x|2µ(dx) < ∞ [10, 21]. In particular, d satisfiesthe triangle inequality on this set. That is, if µ1, µ2, and µ3 are probability measureson Rn with finite second moments, then

d(µ1, µ3) ≤ d(µ1, µ2) + d(µ2, µ3) .(9)

We shall make use of this property at several points later on.

Page 5: THE VARIATIONAL FORMULATION OF THE FOKKER ...THE VARIATIONAL FORMULATION OF THE FOKKER–PLANCK EQUATION∗ RICHARD JORDAN†, DAVID KINDERLEHRER‡, AND FELIX OTTO SIAM J. MA T H.AN

VARIATIONAL FORMULATION OF THE FOKKER–PLANCK EQUATION 5

We note that the Wasserstein metric may be equivalently defined by [21]

d(µ1, µ2)2 = inf E|X − Y |2 ,(10)

where E(U) denotes the expectation of the random variable U , and the infimum istaken over all random variables X and Y such that X has distribution µ1 and Yhas distribution µ2. In other words, the infimum is over all possible couplings of therandom variables X and Y . Convergence in the metric d is equivalent to the usualweak convergence plus convergence of second moments. This latter assertion may bedemonstrated by appealing to the representation (10) and applying the well-knownSkorohod theorem from probability theory (see Theorem 29.6 of [1]). We omit thedetails.

The variational problem (8) is an example of a Monge–Kantorovich mass transfer-ence problem with the particular cost function c(x, y) = |x−y|2 [21]. In that context,an infimizer p∗ ∈ P(µ1, µ2) is referred to as an optimal transference plan. When µ1

and µ2 have finite second moments, the existence of such a p∗ for (8) is readily verifiedby a simple adaptation of our arguments in section 4. For a probabilistic proof thatthe infimum in (8) is attained when µ1 and µ2 have finite second moments, see [10].Brenier [2] has established the existence of a one-to-one optimal transference planin the case that the measures µ1 and µ2 have bounded support and are absolutelycontinuous with respect to Lebesgue measure. Caffarelli [3] and Gangbo and McCann[8, 9] have recently extended Brenier’s results to more general cost functions c and tocases in which the measures do not have bounded support.

If the measures µ1 and µ2 are absolutely continuous with respect to the Lebesguemeasure, with densities ρ1 and ρ2, respectively, we will write P(ρ1, ρ2) for the set ofprobability measures having first marginal µ1 and second marginal µ2. Correspond-ingly, we will denote by d(ρ1, ρ2) the Wasserstein distance between µ1 and µ2. Thisis the situation that we will be concerned with in what follows.

4. The discrete scheme. We shall now construct a time-discrete scheme that isdesigned to converge in an appropriate sense (to be made precise in the next section)to a solution of the Fokker–Planck equation. The scheme that we shall describewas motivated by a similar scheme developed by Otto in an investigation of patternformation in magnetic fluids [19]. We shall make the following assumptions concerningthe potential Ψ introduced in section 2:

Ψ ∈ C∞(Rn);

Ψ(x) ≥ 0 for all x ∈ Rn ;(11)

|∇Ψ(x)| ≤ C (Ψ(x) + 1) for all x ∈ Rn(12)

for some constant C <∞. Notice that our assumptions on Ψ allow for cases in whichRn exp(−βΨ) dx is not defined, so the stationary density ρs given by (4) does not

exist. These assumptions allow us to treat a wide class of Fokker–Planck equations.In particular, the classical diffusion equation ∂ρ

∂t = β−1∆ρ, for which Ψ ≡ const., fallsinto this category. We also introduce the set K of admissible probability densities:

K :=

ρ: Rn → [0,∞) measurable

Rnρ(x) dx = 1 ,M(ρ) <∞

,

where

M(ρ) =

Rn|x|2 ρ(x) dx .

Page 6: THE VARIATIONAL FORMULATION OF THE FOKKER ...THE VARIATIONAL FORMULATION OF THE FOKKER–PLANCK EQUATION∗ RICHARD JORDAN†, DAVID KINDERLEHRER‡, AND FELIX OTTO SIAM J. MA T H.AN

6 R. JORDAN, D. KINDERLEHRER, AND F. OTTO

With these conventions in hand, we now formulate the iterative discrete scheme:

Determine ρ(k) that minimizes

12 d(ρ

(k−1), ρ)2 + hF (ρ)

over all ρ ∈ K .

(13)

Here we use the notation ρ(0) = ρ0. The scheme (13) is the obvious generalization ofthe scheme (1) set forth in the Introduction for the diffusion equation. We shall nowestablish existence and uniqueness of the solution to (13).

PROPOSITION 4.1. Given ρ0 ∈ K, there exists a unique solution of the scheme(13).

Proof. Let us first demonstrate that S is well defined as a functional on K withvalues in (−∞,+∞] and that, in addition, there exist α < 1 and C < ∞ dependingonly on n such that

S(ρ) ≥ −C (M(ρ) + 1)α for all ρ ∈ K .(14)

Actually, we shall show that (14) is valid for any α ∈ ( nn+2 , 1). For future reference,

we prove a somewhat finer estimate. Namely, we demonstrate that there exists aC < ∞, depending only on n and α, such that for all R ≥ 0, and for each ρ ∈ K,there holds

Rn−BR|minρ log ρ, 0| dx ≤ C

1

R2 + 1

( 2 +n ) α−n2

(M(ρ) + 1)α ,(15)

where BR denotes the ball of radius R centered at the origin in Rn. Indeed, for α < 1there holds

|minz log z, 0| ≤ C zα for all z ≥ 0.

Hence by Holder’s inequality, we obtain

Rn−BR|minρ log ρ, 0| dx

≤ C

Rn−BRρα dx

≤ C

Rn−BR

1

|x|2 + 1

α1−α

dx

1−α

(M(ρ) + 1)α .

On the other hand, for α1−α > n

2 , we have

Rn−BR

1

|x|2 + 1

α1−α

dx ≤ C

1

R2 + 1

α1−α−

n2

.

Let us now prove that for given ρ(k−1) ∈ K, there exists a minimizer ρ ∈ K ofthe functional

K ρ → 12 d(ρ

(k−1), ρ)2 + hF (ρ) .(16)

Page 7: THE VARIATIONAL FORMULATION OF THE FOKKER ...THE VARIATIONAL FORMULATION OF THE FOKKER–PLANCK EQUATION∗ RICHARD JORDAN†, DAVID KINDERLEHRER‡, AND FELIX OTTO SIAM J. MA T H.AN

VARIATIONAL FORMULATION OF THE FOKKER–PLANCK EQUATION 7

Observe that S is not bounded below on K and hence F is not bounded below on Keither. Nevertheless, using the inequality

M(ρ1) ≤ 2M(ρ0) + 2 d(ρ0, ρ1)2 for all ρ0, ρ1 ∈ K(17)

(which immediately follows from the inequality |y|2 ≤ 2|x|2 + 2|x− y|2 and from thedefinition of d) together with (14) we obtain

12 d(ρ

(k−1), ρ)2 + hF (ρ)

(17)≥ 1

4 M(ρ) − 12 M(ρ(k−1)) + hS(ρ)(18)

(14)≥ 1

4 M(ρ) − C (M(ρ) + 1)α − 12 M(ρ(k−1)) for all ρ ∈ K ,

which ensures that (16) is bounded below. Now, let ρν be a minimizing sequencefor (16). Obviously, we have that

S(ρν)ν is bounded above ,(19)

and according to (18),

M(ρν)ν is bounded .(20)

The latter result, together with (15), implies that

Rn|minρν log ρν , 0| dx

ν

is bounded ,

which combined with (19) yields that

Rnmaxρν log ρν , 0 dx

ν

is bounded .

As z → maxz log z, 0, z ∈ [0,∞), has superlinear growth, this result, in conjunctionwith (20), guarantees the existence of a ρ(k) ∈ K such that (at least for a subsequence)

ρνw ρ(k) in L1(Rn) .(21)

Let us now show that

S(ρ(k)) ≤ lim infν↑∞

S(ρν) .(22)

As [0,∞) z → z log z is convex and [0,∞) z → maxz log z, 0 is convex andnonnegative, (21) implies that for any R <∞,

BR

ρ(k) log ρ(k) dx ≤ lim infν↑∞

BR

ρν log ρν dx ,(23)

Rn−BRmaxρ(k) log ρ(k), 0 dx ≤ lim inf

ν↑∞

Rn−BRmaxρν log ρν , 0 dx.(24)

On the other hand we have according to (15) and (20)

limR↑∞

supν∈N

Rn−BR|minρν log ρν , 0| dx = 0 .(25)

Page 8: THE VARIATIONAL FORMULATION OF THE FOKKER ...THE VARIATIONAL FORMULATION OF THE FOKKER–PLANCK EQUATION∗ RICHARD JORDAN†, DAVID KINDERLEHRER‡, AND FELIX OTTO SIAM J. MA T H.AN

8 R. JORDAN, D. KINDERLEHRER, AND F. OTTO

Now observe that for any R <∞, there holds

S(ρ(k)) ≤

BR

ρ(k) log ρ(k) dx +

Rn−BRmaxρ(k) log ρ(k), 0 dx ,

which together with (23), (24), and (25) yields (22).It remains for us to show that

E(ρ(k)) ≤ lim infν↑∞

E(ρν) ,(26)

d(ρ(k−1), ρ(k))2 ≤ lim infν↑∞

d(ρ(k−1), ρν)2 .(27)

Equation (26) follows immediately from (21) and Fatou’s lemma. As for (27), wechoose pν ∈ P(ρ(k−1), ρν) satisfying

Rn×Rn|x− y|2 pν(dx dy) ≤ d(ρ(k−1), ρν)

2 + 1ν .

By (20) the sequence of probability measures ρν dxν↑∞ is tight, or relatively com-pact with respect to the usual weak convergence in the space of probability measureson Rn (i.e., convergence tested against bounded continuous functions) [1]. This, to-gether with the fact that the density ρ(k−1) has finite second moment, guaranteesthat the sequence pνν↑∞ of probability measures on Rn×Rn is tight. Hence, thereis a subsequence of pνν↑∞ that converges weakly to some probability measure p.From (21) we deduce that p ∈ P(ρ(k−1), ρ(k)). We now could invoke the Skorohodtheorem [1] and Fatou’s lemma to infer (27) from this weak convergence, but we preferhere to give a more analytic proof. For R < ∞ let us select a continuous functionηR: Rn → [0, 1] such that

ηR = 1 inside of BR and ηR = 0 outside of B2R .

We then have

Rn×RnηR(x) ηR(y) |x− y|2 p(dx dy)

= limν↑∞

Rn×RnηR(x) ηR(y) |x− y|2 pν(dx dy)

≤ lim infν↑∞

d(ρ(k−1), ρν)2

(28)

for each fixed R <∞. On the other hand, using the monotone convergence theorem,we deduce that

d(ρ(k−1), ρ(k))2 ≤

Rn×Rn|x− y|2 p(dx dy)

= limR↑∞

Rn×RnηR(x) ηR(y) |x− y|2 p(dx dy) ,

which combined with (28) yields (27).To conclude the proof of the proposition we establish that the functional (16) has

at most one minimizer. This follows from the convexity of K and the strict convexityof (16). The strict convexity of (16) follows from the strict convexity of S, the linearityof E, and the (obvious) convexity over K of the functional ρ → d(ρ(k−1), ρ)2.

Page 9: THE VARIATIONAL FORMULATION OF THE FOKKER ...THE VARIATIONAL FORMULATION OF THE FOKKER–PLANCK EQUATION∗ RICHARD JORDAN†, DAVID KINDERLEHRER‡, AND FELIX OTTO SIAM J. MA T H.AN

VARIATIONAL FORMULATION OF THE FOKKER–PLANCK EQUATION 9

Remark. One of the referees has communicated to us the following simple estimatethat could be used in place of (14)–(15) in the previous and subsequent analysis: forany Ω ⊂ Rn (in particular, for Ω = Rn −BR) and for all ρ ∈ K there holds

Ω|minρ log ρ, 0| dx ≤ C

Ωe−

|x|2 dx+ M(ρ) +

1

4

Ωρ dx(29)

for any > 0. To obtain the inequality (29), select C > 0 such that for all z ∈ [0, 1],we have z| log z| ≤ C

√z. Then, defining the sets Ω0 = Ω ∩ ρ ≤ exp(−|x|) and

Ω1 = Ω ∩ exp(−|x|) < ρ ≤ 1, we have

Ω|minρ log ρ, 0| dx =

Ω0

ρ|(log ρ)−| dx+

Ω1

ρ|(log ρ)−| dx

≤ C

Ωe−

|x|2 dx+

Ω|x|ρ dx .

The desired result (29) then follows from the inequality |x| ≤ |x|2 + 1/(4) for > 0.

5. Convergence to the solution of the Fokker–Planck equation. We comenow to our main result. We shall demonstrate that an appropriate interpolation ofthe solution to the scheme (13) converges to the unique solution of the Fokker–Planckequation. Specifically, the convergence result that we will prove here is as follows.

THEOREM 5.1. Let ρ0 ∈ K satisfy F (ρ0) <∞, and for given h > 0, let ρ(k)h k∈N

be the solution of (13). Define the interpolation ρh: (0,∞)×Rn → [0,∞) by

ρh(t) = ρ(k)h for t ∈ [k h, (k+ 1)h) and k ∈ N ∪ 0.

Then as h ↓ 0,

ρh(t) ρ(t) weakly in L1(Rn) for all t ∈ (0,∞) ,(30)

where ρ ∈ C∞((0,∞)×Rn) is the unique solution of

∂ρ

∂t= div(ρ∇Ψ) + β−1∆ρ ,(31)

with initial condition

ρ(t) → ρ0 strongly in L1(Rn) for t ↓ 0(32)

and

M(ρ), E(ρ) ∈ L∞((0, T )) for all T <∞ .(33)

Remark. A finer analysis reveals that

ρh → ρ strongly in L1((0, T )×Rn) for all T <∞ .

Proof. The proof basically follows along the lines of [19, Proposition 2, Theorem3]. The crucial step is to recognize that the first variation of (16) with respect to theindependent variables indeed yields a time-discrete scheme for (31), as will now bedemonstrated. For notational convenience only, we shall set β ≡ 1 from here on in.As will be evident from the ensuing arguments, our proof works for any positive β. In

Page 10: THE VARIATIONAL FORMULATION OF THE FOKKER ...THE VARIATIONAL FORMULATION OF THE FOKKER–PLANCK EQUATION∗ RICHARD JORDAN†, DAVID KINDERLEHRER‡, AND FELIX OTTO SIAM J. MA T H.AN

10 R. JORDAN, D. KINDERLEHRER, AND F. OTTO

fact, it is not difficult to see that, with appropriate modifications to the scheme (13),we can establish an analogous convergence result for time-dependent β.

Let a smooth vector field with bounded support, ξ ∈ C∞0 (Rn,Rn), be given, anddefine the corresponding flux Φττ∈R by

∂τ Φτ = ξ Φτ for all τ ∈ R and Φ0 = id .

For any τ ∈ R, let the measure ρτ (y) dy be the push forward of ρ(k)(y) dy under Φτ .This means that

Rnρτ (y) ζ(y) dy =

Rnρ(k)(y) ζ(Φτ (y)) dy for all ζ ∈ C0

0 (Rn) .(34)

As Φτ is invertible, (34) is equivalent to the following relation for the densities:

det∇Φτ ρτ Φτ = ρ(k) .(35)

By (16), we have for each τ > 0

12 d(ρ

(k−1), ρτ )2 + hF (ρτ )

12 d(ρ

(k−1), ρ(k))2 + hF (ρ(k))

≥ 0,(36)

which we now investigate in the limit τ ↓ 0. Because Ψ is nonnegative, equation (34)also holds for ζ = Ψ, i.e.,

Rnρτ (y) Ψ(y) dy =

Rnρ(k)(y) Ψ(Φτ (y)) dy .

This yields

E(ρτ )− E(ρ(k))

=

Rn

1τ (Ψ(Φτ (y))−Ψ(y)) ρ(k)(y) dy .

Observe that the difference quotient under the integral converges uniformly to∇Ψ(y)·ξ(y),hence implying that

dd τ [E(ρτ )]τ=0 =

Rn∇Ψ(y)·ξ(y) ρ(k)(y) dy .(37)

Next, we calculate dd τ [S(ρτ )]τ=0. Invoking an appropriate approximation argument

(for instance approximating log by some function that is bounded below), we obtain

Rnρτ (y) log(ρτ (y)) dy

(34)=

Rnρ(k)(y) log(ρτ (Φτ (y))) dy

(35)=

Rnρ(k)(y) log

ρ(k)(y)

det∇Φτ (y)

dy .

Therefore, we have

S(ρτ )− S(ρ(k))

= −

Rnρ(k)(y) 1

τ log(det∇Φτ (y)) dy .

Page 11: THE VARIATIONAL FORMULATION OF THE FOKKER ...THE VARIATIONAL FORMULATION OF THE FOKKER–PLANCK EQUATION∗ RICHARD JORDAN†, DAVID KINDERLEHRER‡, AND FELIX OTTO SIAM J. MA T H.AN

VARIATIONAL FORMULATION OF THE FOKKER–PLANCK EQUATION 11

Now using

ddτ [det∇Φτ (y)]τ=0 = divξ(y) ,

together with the fact that Φ0 = id, we see that the difference quotient under theintegral converges uniformly to divξ, hence implying that

dd τ [S(ρτ )]τ=0 = −

Rnρ(k) divξ dy .(38)

Now, let p be optimal in the definition of d(ρ(k−1), ρ(k))2 (see section 3). The formula

Rn×Rnζ(x, y) pτ (dx dy) =

Rn×Rnζ(x,Φτ (y)) p(dx dy) , ζ ∈ C0

0 (Rn×Rn)

then defines a pτ ∈ P(ρ(k−1), ρτ ). Consequently, there holds

12 d(ρ

(k−1), ρτ )2 − 1

2 d(ρ(k−1), ρ(k))2

Rn×Rn1τ

12 |Φτ (y)− x|2 − 1

2 |y − x|2p(dx dy) ,

which implies that

lim supτ↓0

12 d(ρ

(k−1), ρτ )2 − 1

2 d(ρ(k−1), ρ(k))2

Rn×Rn(y − x)·ξ(y) p(dx dy) .(39)

We now infer from (36), (37), (38), and (39) (and the symmetry in ξ → −ξ) that

Rn×Rn(y − x)·ξ(y) p(dx dy) + h

Rn(∇Ψ·ξ − divξ) ρ(k) dy = 0

for all ξ ∈ C∞0 (Rn,Rn) .

(40)

Observe that because p ∈ P(ρ(k−1), ρ(k)), there holds

Rn(ρ(k) − ρ(k−1)) ζ dy −

Rn×Rn(y − x)·∇ζ(y) p(dx dy)

=

Rn×Rn(ζ(y)− ζ(x) + (x− y)·∇ζ(y)) p(dx dy)

≤ 12 sup

Rn|∇2ζ|

Rn×Rn|y − x|2 p(dx dy)

= 12 sup

Rn|∇2ζ| d(ρ(k−1), ρ(k))2

for all ζ ∈ C∞0 (Rn). Choosing ξ = ∇ζ in (40) then gives

Rn

1h (ρ(k) − ρ(k−1)) ζ + (∇Ψ·∇ζ −∆ζ) ρ(k)

dy

≤ 1

2 supRn

|∇2ζ| 1h d(ρ

(k−1), ρ(k))2 for all ζ ∈ C∞0 (Rn) .(41)

Page 12: THE VARIATIONAL FORMULATION OF THE FOKKER ...THE VARIATIONAL FORMULATION OF THE FOKKER–PLANCK EQUATION∗ RICHARD JORDAN†, DAVID KINDERLEHRER‡, AND FELIX OTTO SIAM J. MA T H.AN

12 R. JORDAN, D. KINDERLEHRER, AND F. OTTO

We wish now to pass to the limit h ↓ 0. In order to do so we will first establishthe following a priori estimates: for any T <∞, there exists a constant C <∞ suchthat for all N ∈ N and all h ∈ [0, 1] with N h ≤ T , there holds

M(ρ(N)h ) ≤ C ,(42)

Rnmaxρ(N)

h log ρ(N)h , 0 dx ≤ C ,(43)

E(ρ(N)h ) ≤ C ,(44)

N

k=1

d(ρ(k−1)h , ρ(k)

h )2 ≤ C h .(45)

Let us verify that the estimate (42) holds. Since ρ(k−1)h is admissible in the variational

principle (13), we have that

12 d(ρ

(k−1)h , ρ(k)

h )2 + hF (ρ(k)h ) ≤ hF (ρ(k−1)

h ) ,

which may be summed over k to give

N

k=1

12h d(ρ

(k−1)h , ρ(k)

h )2 + F (ρ(N)h ) ≤ F (ρ0) .(46)

As in Proposition 4.1, we must confront the technical difficulty that F is not boundedbelow. The inequality (42) is established via the following calculations:

M(ρ(N)h )

(17)≤ 2 d(ρ0, ρ(N)

h )2 + 2M(ρ0)

≤ 2NN

k=1

d(ρ(k−1)h , ρ(k)

h )2 + 2M(ρ0)

(46)≤ 4hN

F (ρ0)− F (ρ(N)

h )

+ 2M(ρ0)

(14)≤ 4T

F (ρ0) + C (M(ρ(N)

h ) + 1)α

+ 2M(ρ0) ,

which clearly gives (42). To obtain the second line of the above display, we have madeuse of the triangle inequality for the Wasserstein metric (see equation (9)) and theCauchy–Schwarz inequality. The estimates (43), (44), and (45) now follow readilyfrom the bounds (14) and (15), the estimate (42), and the inequality (46), as follows:

Rnmaxρ(N)

h log ρ(N)h , 0 dx ≤ S(ρ(N)

h ) +

Rn|minρ(N)

h log ρ(N)h , 0| dx

(15)≤ S(ρ(N)

h ) + C (M(ρ(N)h ) + 1)α

≤ F (ρ(N)h ) + C (M(ρ(N)

h ) + 1)α

(46)≤ F (ρ0) + C (M(ρ(N)

h ) + 1)α ;

E(ρ(N)h ) = F (ρ(N)

h ) − S(ρ(N)h )

(14)≤ F (ρ(N)

h ) + C (M(ρ(N)h ) + 1)α

(46)≤ F (ρ0) + C (M(ρ(N)

h ) + 1)α ;

Page 13: THE VARIATIONAL FORMULATION OF THE FOKKER ...THE VARIATIONAL FORMULATION OF THE FOKKER–PLANCK EQUATION∗ RICHARD JORDAN†, DAVID KINDERLEHRER‡, AND FELIX OTTO SIAM J. MA T H.AN

VARIATIONAL FORMULATION OF THE FOKKER–PLANCK EQUATION 13

N

k=1

d(ρ(k−1)h , ρ(k)

h )2(46)≤ 2h

F (ρ0) − F (ρ(N)

h )

(14)≤ 2h

F (ρ0) + C (M(ρ(N)

h ) + 1)α.

Now, owing to the estimates (42) and (43), we may conclude that there exists ameasurable ρ(t, x) such that, after extraction of a subsequence,

ρh ρ weakly in L1((0, T )×Rn) for all T <∞ .(47)

A straightforward analysis reveals that (42), (43), and (44) guarantee that

ρ(t) ∈ K for a.e. t ∈ (0,∞) ,

M(ρ), E(ρ) ∈ L∞((0, T )) for all T <∞ .(48)

Let us now improve upon the convergence in (47) by showing that (30) holds. Fora given finite time horizon T < ∞, there exists a constant C < ∞ such that for allN,N ∈ N and all h ∈ [0, 1] with N h ≤ T , and N h ≤ T , we have

d(ρ(N )h , ρ(N)

h )2 ≤ C |N h−N h| .

This result is obtained from (45) by use of the triangle inequality (9) for d andthe Cauchy–Schwarz inequality. Furthermore, for all ρ, ρ ∈ K , p ∈ P(ρ, ρ), andζ ∈ C∞0 (Rn), there holds

Rnζ ρ dx −

Rnζ ρ dx

=

Rn×Rn(ζ(x) − ζ(y)) p(dx dy)

≤ supRn

|∇ζ|

Rn×Rn|x− y| p(dx dy)

≤ supRn

|∇ζ|

Rn×Rn|x− y|2 p(dx dy)

1

2

,

so from the definition of d we obtain

Rnζ ρ dx −

Rnζ ρ dx

≤ supRn

|∇ζ| d(ρ, ρ) for ρ, ρ ∈ K and ζ ∈ C∞0 (Rn) .

Hence, it follows that

Rnζ ρh(t) dx −

Rnζ ρh(t) dx

≤ C supRn

|∇ζ| (|t − t| + h)1

2

for all t, t ∈ (0, T ), and ζ ∈ C∞0 (Rn) .

(49)

Let t ∈ (0, T ) and ζ ∈ C∞0 (Rn) be given, and notice that for any δ > 0, we have

Rnζ ρh(t) dx −

Rnζ ρ(t) dx

Rnζ ρh(t) dx − 1

2 δ

t+δ

t−δ

Rnζ ρh(τ) dx dτ

+

12 δ

t+δ

t−δ

Rnζ ρh(τ) dx dτ − 1

2 δ

t+δ

t−δ

Rnζ ρ(τ) dx dτ

+

12 δ

t+δ

t−δ

Rnζ ρ(τ) dx dτ −

Rnζ ρ(t) dx

.

Page 14: THE VARIATIONAL FORMULATION OF THE FOKKER ...THE VARIATIONAL FORMULATION OF THE FOKKER–PLANCK EQUATION∗ RICHARD JORDAN†, DAVID KINDERLEHRER‡, AND FELIX OTTO SIAM J. MA T H.AN

14 R. JORDAN, D. KINDERLEHRER, AND F. OTTO

According to (49), the first term on the right-hand side of this equation is boundedby

C supRn

|∇ζ| (δ + h)1

2 ,

and owing to (47), the second term converges to zero as h ↓ 0 for any fixed δ > 0. Atthis point, let us remark that from the result (47) we may deduce that ρ is smooth on(0,∞)×Rn. This is the conclusion of assertion (a) below, which will be proved later.From this smoothness property, we ascertain that the final term on the right-handside of the above display converges to zero as δ ↓ 0. Therefore, we have establishedthat

Rnζ ρh(t) dx →

Rnζ ρ(t) dx for all ζ ∈ C∞0 (Rn) .(50)

However, the estimate (42) guarantees that M(ρh(t)) is bounded for h ↓ 0. Conse-quently, (50) holds for any ζ ∈ L∞(Rn), and therefore, the convergence result (30)does indeed hold.

It now follows immediately from (41), (45), and (47) that ρ satisfies

(0,∞)×Rnρ (∂tζ −∇Ψ·∇ζ + ∆ζ) dx dt =

Rnρ0 ζ(0) dx ,

for all ζ ∈ C∞0 (R×Rn) .

(51)

In addition, we know that ρ satisfies (33). We now show that(a) any solution of (51) is smooth on (0,∞)×Rn and satisfies equation (31);(b) any solution of (51) for which (33) holds satisfies the initial condition (32);(c) there is at most one smooth solution of (31) which satisfies (32) and (33).

The corresponding arguments are, for the most part, fairly classical.Let us sketch the proof of the regularity part (a). First observe that (51) implies

Rnρ(t1) ζ(t1) dx−

(t0 ,t1 )×Rnρ (∂tζ −∇Ψ·∇ζ + ∆ζ) dx dt

=

Rnρ(t0) ζ(t0) dx(52)

for all ζ ∈ C∞0 (R×Rn) and a.e. 0 ≤ t0 < t1 .

We fix a function η ∈ C∞0 (Rn) to serve as a cutoff in the spatial variables. Itthen follows directly from (52) that for each ζ ∈ C∞0 (R×Rn) and for almost every0 ≤ t0 < t1, there holds

Rnη ρ(t1) ζ(t1) dx −

(t0 ,t1 )×Rnη ρ (∂tζ + ∆ζ) dx dt

=

(t0 ,t1 )×Rnρ (∆η −∇Ψ·∇η) ζ dx dt

+

(t0 ,t1 )×Rnρ (2∇η − η∇Ψ) ·∇ζ dx dt

+

Rnη ρ(t0) ζ(t0) dx .

(53)

Notice that for fixed (t1, x1) ∈ (0,∞)×Rn and for each δ > 0, the function

ζδ(t, x) = G(t1 + δ − t, x− x1)

Page 15: THE VARIATIONAL FORMULATION OF THE FOKKER ...THE VARIATIONAL FORMULATION OF THE FOKKER–PLANCK EQUATION∗ RICHARD JORDAN†, DAVID KINDERLEHRER‡, AND FELIX OTTO SIAM J. MA T H.AN

VARIATIONAL FORMULATION OF THE FOKKER–PLANCK EQUATION 15

is an admissible test function in (53). Here G is the heat kernel

G(t, x) = t−n2 g(t−

1

2 x) with g(x) = (2π)−n2 exp(− 1

2 |x|2).(54)

Inserting ζδ into (53) and taking the limit δ ↓ 0, we obtain the equation

(ρ η)(t1) =

t1

t0

[ρ(t) (∆η −∇Ψ·∇η)] ∗G(t1 − t) dt

+

t1

t0

[ρ(t) (2∇η − η∇Ψ)] ∗ ∇G(t1 − t) dt

+ (ρ η)(t0) ∗G(t1 − t0) for a.e. 0 ≤ t0 < t1 ,

(55)

where ∗ denotes convolution in the x-variables. From (55), we extract the followingestimate in the Lp-norm:

(ρ η)(t1)Lp =

t1

t0

ρ(t) (∆η −∇Ψ·∇η)L1 G(t1 − t)Lp dt

+

t1

t0

ρ(t) (2∇η − η∇Ψ)L1 ∇G(t1 − t)Lp dt

+ (ρ η)(t0)L1 G(t1 − t0)Lp for a.e. 0 ≤ t0 < t1 .

Now observe that

G(t)Lp = t(1

p−1) n2 gLp ,

∇G(t)Lp , = t1

pn2−n + 1

2 ∇gLp ,

which leads to

(ρ η)(t1)Lp

= ess supt∈(t0 ,t1 )

ρ(t) (∆η −∇Ψ·∇η)L1

t1−t0

0t(

1

p−1) n2 gLp dt

+ ess supt∈(t0 ,t1 )

ρ(t) (2∇η − η∇Ψ)L1

t1−t0

0t

1

pn2−n + 1

2 ∇gLp dt

+ (ρ η)(t0)L1 G(t1 − t0)Lp for a.e. 0 ≤ t0 < t1 .

For p < nn−1 the t-integrals are finite, from which we deduce that

ρ ∈ Lploc((0,∞)×Rn) .

We now appeal to the Lp-estimates [18, section 3, (3.1), and (3.2)] for the potentialsin (55) to conclude by the usual bootstrap arguments that any derivative of ρ is inLploc((0,∞)×Rn), from which we obtain the stated regularity condition (a).

We now prove assertion (b). Using (55) with t0 = 0, and proceeding as above, weobtain

(ρ η)(t1) − (ρ0 η) ∗G(t1)L1

= ess supt∈(0,t1 )

ρ(t) (∆η −∇Ψ·∇η)L1

t1

0gL1 dt

+ ess supt∈(0,t1 )

ρ(t) (2∇η − η∇Ψ)L1

t1

0t−

1

2 ∇gL1 dt for all t1 > 0

Page 16: THE VARIATIONAL FORMULATION OF THE FOKKER ...THE VARIATIONAL FORMULATION OF THE FOKKER–PLANCK EQUATION∗ RICHARD JORDAN†, DAVID KINDERLEHRER‡, AND FELIX OTTO SIAM J. MA T H.AN

16 R. JORDAN, D. KINDERLEHRER, AND F. OTTO

and therefore,

(ρ η)(t) − (ρ0 η) ∗G(t) → 0 in L1(Rn) for t ↓ 0 .

On the other hand, we have

(ρ0 η) ∗G(t) → ρ0 η in L1(Rn) for t ↓ 0 ,

which leads to

(ρ η)(t) → ρ0 η in L1(Rn) for t ↓ 0 .

From this result, together with the boundedness of M(ρ(t))t↓0, we infer that (32)is satisfied.

Finally, we prove the uniqueness result (c) using a well-known method from thetheory of elliptic–parabolic equations (see, for instance, [20]). Let ρ1, ρ2 be solutionsof (32) which are smooth on (0,∞)×Rn and satisfy (32), (33). Their difference ρsatisfies the equation

∂ρ

∂t− div [ρ∇Ψ +∇ρ] = 0.

We multiply this equation for ρ by φδ(ρ), where the family φδδ↓0 is a convex andsmooth approximation to the modulus function. For example, we could take

φδ(z) = (z2 + δ2)1

2 .

This procedure yields the inequality

∂t[φδ(ρ)] − div [φδ(ρ)∇Ψ +∇[φδ(ρ)]]

= −φδ (ρ) |∇ρ|2 + (φδ(ρ) ρ− φδ(ρ)) ∆Ψ

≤ (φδ(ρ) ρ− φδ(ρ)) ∆Ψ,

which we then multiply by a nonnegative spatial cutoff function η ∈ C∞0 (Rn) andintegrate over Rn to obtain

d

dt

Rnφδ(ρ(t)) η dx

+

Rnφδ(ρ(t)) (∇Ψ·∇η −∆η) dx

Rn(φδ(ρ) ρ− φδ(ρ)) ∆Ψ η dx .

Integrating over (0, t) for given t ∈ (0,∞), we obtain with help of (32)

Rnφδ(ρ(t)) η dx +

(0,t)×Rnφδ(ρ(t)) (∇Ψ·∇η −∆η) dx dt

(0,t)×Rn(φδ(ρ) ρ− φδ(ρ)) ∆Ψ η dx dt .

Letting δ tend to zero yields

Rn|ρ(t)| η dx +

(0,t)×Rn|ρ(t)| (∇Ψ·∇η −∆η) dx dt ≤ 0 .(56)

According to (12) and (33), ρ and ρ∇Ψ are integrable on the entire Rn. Hence, if wereplace η in (56) by a function ηR satisfying

ηR(x) = η1

xR

, where η1(x) = 1 for |x| ≤ 1 , η1(x) = 0 for |x| ≥ 2 ,

and let R tend to infinity, we obtainRn |ρ(t)| dx = 0. This produces the desired

uniqueness result.

Page 17: THE VARIATIONAL FORMULATION OF THE FOKKER ...THE VARIATIONAL FORMULATION OF THE FOKKER–PLANCK EQUATION∗ RICHARD JORDAN†, DAVID KINDERLEHRER‡, AND FELIX OTTO SIAM J. MA T H.AN

VARIATIONAL FORMULATION OF THE FOKKER–PLANCK EQUATION 17

Acknowledgments. The authors would like to thank the anonymous refereesfor valuable suggestions and comments.

REFERENCES

[1] P. BILLINGSLEY, Probability and Measure, John Wiley, New York, 1986.[2] Y. BRENIER, Polar factorization and monotone rearrangement of vector-valued functions,

Comm. Pure Appl. Math., 44 (1991), pp. 375–417.[3] L. A. CAFFARELLI, Allocation maps with general cost functions, in Partial Differential Equa-

tions and Applications, P. Marcellini, G. G. Talenti, and E. Vesintini, eds., Lecture Notesin Pure and Applied Mathematics 177, Marcel Dekker, New York, NY, 1996, pp. 29–35.

[4] S. CHANDRASEKHAR, Stochastic problems in physics and astronomy, Rev. Mod. Phys., 15(1942), pp. 1–89.

[5] R. COURANT, K. FRIEDRICHS, AND H. LEWY, Uber die partiellen Differenzgleichungen dermathematischen Physik, Math. Ann., 100 (1928), pp. 1–74.

[6] S. DEMOULINI, Young measure solutions for a nonlinear parabolic equation of forward–backward type, SIAM J. Math. Anal., 27 (1996), pp. 376–403.

[7] C. W. GARDINER, Handbook of stochastic methods, 2nd ed., Springer-Verlag, Berlin, Heidel-berg, 1985.

[8] W. GANGBO AND R. J. MCCANN, Optimal maps in Monge’s mass transport problems, C. R.Acad. Sci. Paris, 321 (1995), pp. 1653–1658.

[9] W. GANGBO AND R. J. MCCANN, The geometry of optimal transportation, Acta Math., 177(1996), pp. 113–161.

[10] C. R. GIVENS AND R. M. SHORTT, A class of Wasserstein metrics for probability distributions,Michigan Math. J., 31 (1984), pp. 231–240.

[11] R. JORDAN, A statistical equilibrium model of coherent structures in magnetohydrodynamics,Nonlinearity, 8 (1995), pp. 585–613.

[12] R. JORDAN AND D. KINDERLEHRER, An extended variational principle, in Partial DifferentialEquations and Applications, P. Marcellini, G. G. Talenti, and E. Vesintini, eds., LectureNotes in Pure and Applied Mathematics 177, Marcel Dekker, New York, NY, 1996, pp. 187–200.

[13] R. JORDAN, D. KINDERLEHRER, AND F. OTTO, Free energy and the Fokker–Planck equation,Physica D., to appear.

[14] R. JORDAN, D. KINDERLEHRER, AND F. OTTO, The route to stability through the Fokker–Planck dynamics, Proc. First U.S.–China Conference on Differential Equations and Appli-cations, to appear.

[15] R. JORDAN AND B. TURKINGTON, Ideal magnetofluid turbulence in two dimensions, J. Stat.Phys., 87 (1997), pp. 661–695.

[16] D. KINDERLEHRER AND P. PEDREGAL, Weak convergence of integrands and the Young mea-sure representation, SIAM J. Math. Anal., 23 (1992), pp. 1–19.

[17] H. A. KRAMERS, Brownian motion in a field of force and the diffusion model of chemicalreactions, Physica, 7 (1940), pp. 284–304.

[18] O. A. LADYZENSKAJA, V. A. SOLONNIKOV, AND N. N. URAL’CEVA, Linear and Quasi–LinearEquations of Parabolic Type, American Mathematical Society, Providence, RI, 1968.

[19] F. OTTO, Dynamics of labyrinthine pattern formation in magnetic fluids: A mean–field theory,Archive Rat. Mech. Anal., to appear.

[20] F. OTTO, L1–contraction and uniqueness for quasilinear elliptic–parabolic equations, J. Differ-ential Equations, 131 (1996), pp. 20–38.

[21] S. T. RACHEV, Probability metrics and the stability of stochastic models, John Wiley, NewYork, 1991.

[22] H. RISKEN, The Fokker-Planck equation: Methods of solution and applications, 2nd ed.,Springer-Verlag, Berlin, Heidelberg, 1989.

[23] Z. SCHUSS, Singular perturbation methods in stochastic differential equations of mathematicalphysics, SIAM Rev., 22 (1980), pp. 119–155.

[24] J. C. STRIKWERDA, Finite Difference Schemes and Partial Differential Equations, Wardsworth& Brooks/Cole, New York, 1989.