
Two cases of stochastic maximum principle in the optimal control of SPDEs

Marco Fuhrman

Politecnico di Milano

Ying Hu

Université de Rennes 1

Gianmario Tessitore

Università di Milano-Bicocca

Rennes 24th of May 2013

Structure of the talk

We prove the Pontryagin maximum principle (necessary conditions for optimality) for a controlled stochastic PDE in the following situations:

1. Part I: Infinite dimensional (white) noise - Special case (no second variation needed)

• Stochastic parabolic equations in an interval [0,1]

• driven by space-time white noise (cylindrical Wiener process) $(W_t)$

• with convex set of controls (control affects noise)

• non-linearities are Nemytskii operators $F(x) = f(\cdot, x(\cdot))$, $x \in L^p(O)$

2. Part II: Finite dimensional noise, General case

• Stochastic parabolic equations in a domain $O \subset \mathbb{R}^d$

• driven by a finite dimensional Wiener process $W_t = (\beta^1_t, \dots, \beta^m_t)$

• with non convex set of controls (control affects noise)

• non-linearities are Nemytskii operators

Very incomplete history of SMP in ∞ dimensions

• Bensoussan, J. Frank. Inst. (1983) and Hu-Peng, Stochastics (1990): Special case, state ∞-dim., noise has trace class covariance.

• Peng, SICON (1990): General case, state and noise fin. dim.

• Zhou, SICON (1993): State ∞-dim., noise fin. dim., linear state equation and cost.

• Tang-Li, LNPAM (1994): General case, state ∞-dim., noise fin. dim., noise can have jumps, second derivatives of the coefficients are Hilbert-Schmidt.

• Fuhrman-Hu-T., CRAS 2012, AMO 2012 (electronic): General case, state ∞-dim., noise fin. dim., specific framework to cover stochastic parabolic PDEs.

• Lu-Zhang, Preprint 2012: General case, state ∞-dim., noise fin. dim., $P_t$ characterized as "transposition solution" of a BSEE. Nonlinearities regular in functional spaces.

• Du-Meng, Preprints 2012: General case, state ∞-dim., noise either fin. dim. or trace class, leading operator $A$ can depend on $t$; unbounded linear term affecting noise. Some regularity required for the nonlinearities.

• Fuhrman-Hu-T.: Special case, state ∞-dim., noise ∞-dim. and cylindrical.

PART I: INFINITE DIMENSIONAL - (WHITE) NOISE - Special Case

Formulation of the optimal control problem

Let $(W(t,x))$, $t \ge 0$, $x \in [0,1]$, be a space-time white noise; $(\mathcal{F}_t)_{t \ge 0}$ denotes its natural (completed) filtration.

The set of admissible control actions $U$ is a convex subset of $L^\infty([0,1])$. A control $u$ is a (progressive) process with values in $U$.

The controlled state equation is the following SPDE: for $t \in [0,T]$, $x \in [0,1]$,

$$
\begin{cases}
dX_t(x) = \dfrac{\partial^2}{\partial x^2} X_t(x)\,dt + b(x, X_t(x), u_t(x))\,dt + \sigma(x, X_t(x), u_t(x))\,dW(t,x), \\
X_t(0) = X_t(1) = 0, \quad t \in [0,T], \\
X_0(x) = x_0(x), \quad x \in [0,1],
\end{cases}
$$

where $b(x,r,u), \sigma(x,r,u) : [0,1] \times \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ are given. We assume they are $C^1$ and Lipschitz with respect to $r$ and $u$; for fixed $r$ and $u$ we suppose $b(\cdot,r,u) \in L^2([0,1])$ and $\sigma(\cdot,r,u) \in L^\infty([0,1])$, bounded.

We also introduce the cost functional:

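$$
J(u) = \mathbb{E}\int_0^T\!\!\int_O l(x, X_t(x), u_t(x))\,dx\,dt + \mathbb{E}\int_O h(x, X_T(x))\,dx,
$$

where $l(x,r,u) : O \times \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ and $h(x,r) : O \times \mathbb{R} \to \mathbb{R}$ are given bounded functions (here $O = [0,1]$); we assume that they are $C^1$ with bounded derivatives with respect to $r$ and $u$.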

Abstract reformulation

The noise is reformulated as an $L^2([0,1])$-valued cylindrical Wiener process $(W_t)$:

$$
\mathbb{E}\,\langle W_t, x\rangle_{L^2}\,\langle W_s, y\rangle_{L^2} = (t \wedge s)\,\langle x, y\rangle_{L^2}, \qquad \forall x, y \in H = L^2([0,1]).
$$

$A$ is the realization of the second derivative operator in $H$ with Dirichlet boundary conditions. So it is an unbounded operator with domain $H^2_0([0,1]) \subset H = L^2([0,1])$.

For all $X, V \in H$, $x \in [0,1]$, the non-linearities are defined by

$$
F(X,u)(x) = b(x, X(x), u(x)), \qquad [G(X,u)V](x) = \sigma(x, X(x), u(x))\,V(x),
$$

$$
L(X,u) = \int_O l(x, X(x), u(x))\,dx, \qquad \Phi(X) = \int_O h(x, X(x))\,dx.
$$

The state equation written in abstract form becomes

$$
dX_t = AX_t\,dt + F(X_t,u_t)\,dt + G(X_t,u_t)\,dW_t, \qquad X_0 = x_0,
$$

where $x_0 \in H$ and the solution will evolve in $H$.

The cost becomes

$$
J(x,u) = \mathbb{E}\int_0^T L(X_s,u_s)\,ds + \mathbb{E}\,\Phi(X_T).
$$
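
To make the abstract reformulation concrete, here is a minimal sketch of how the Nemytskii operators act once functions on $[0,1]$ are sampled on a grid. The coefficients `b` and `sigma` below are hypothetical stand-ins chosen only for illustration (they are not the ones of the talk); the point is that $F$ acts pointwise and that $G(X,u)$ is a multiplication operator, hence linear in $V$.

```python
import math

# Hypothetical coefficients, chosen only for illustration (not from the talk).
def b(x, r, u):
    return math.sin(r) + u

def sigma(x, r, u):
    return 1.0 / (1.0 + r * r) + 0.5 * u

def F(X, u, grid):
    """Nemytskii operator: F(X,u)(x) = b(x, X(x), u(x)), evaluated on a grid."""
    return [b(x, Xx, ux) for x, Xx, ux in zip(grid, X, u)]

def G(X, u, V, grid):
    """Multiplication operator: [G(X,u)V](x) = sigma(x, X(x), u(x)) * V(x)."""
    return [sigma(x, Xx, ux) * Vx for x, Xx, ux, Vx in zip(grid, X, u, V)]

n = 5
grid = [i / (n - 1) for i in range(n)]
X = [math.sin(math.pi * x) for x in grid]   # sampled state
u = [0.1] * n                               # sampled control
V = [x * (1 - x) for x in grid]             # first test direction
W = [1.0] * n                               # second test direction

FX = F(X, u, grid)
GV = G(X, u, V, grid)
GW = G(X, u, W, grid)
# Linearity in the last argument: G(X,u)(2V + 3W) = 2 G(X,u)V + 3 G(X,u)W
G_lin = G(X, u, [2 * v + 3 * w for v, w in zip(V, W)], grid)
```

Note that $G(X,u)$ is bounded on $L^2$ when $\sigma$ is bounded, but a multiplication operator is in general not Hilbert-Schmidt, which is why the framework below only asks for $e^{sA}G$ to be Hilbert-Schmidt.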

Standing Framework

(i) $A$ is the generator of a $C_0$ semigroup $e^{tA}$, $t \ge 0$, in $H$. Moreover, for all $s > 0$,

$$
e^{sA} \in L_2(H) \quad \text{with} \quad |e^{sA}|_{L_2(H)} \le L\,s^{-\gamma}
$$

for suitable $L > 0$, $\gamma \in [0,1/2)$, where $L_2(H)$ is the (Hilbert) space of Hilbert-Schmidt operators in $H$.

(ii) $U$ is a bounded convex subset of a separable Banach space $U_0$.

(iii) $F : H \times U \to H$ is Lipschitz in both variables.

(iv) $G : H \times U \to L(H)$ verifies, for all $s > 0$, $t \in [0,T]$, $X, Y \in H$, $u, v \in U$,

$$
|e^{sA}G(t,0,u)|_{L_2(H)} \le L\,s^{-\gamma}, \qquad
|e^{sA}G(t,X,u) - e^{sA}G(t,Y,v)|_{L_2(H)} \le L\,s^{-\gamma}\,(|X - Y| + |u - v|), \tag{1}
$$

for some constants $L > 0$ and $\gamma \in [0,1/2)$.

(v) $F(\cdot,\cdot)$ is Gâteaux differentiable $H \times U \to H$; for all $s > 0$, $e^{sA}G(\cdot,\cdot)$ is Gâteaux differentiable $H \times U \to L_2(H)$.

(vi) $L(\cdot,\cdot)$ and $\Phi(\cdot)$ are bounded, Lipschitz and differentiable.

(vii) For all $\Xi \in H$ the map $u \mapsto G(X,u)\Xi$ is Gâteaux differentiable and

$$
|\nabla_u G(X,u)\Xi|_{L(U_0,H)} \le \text{const}\,|\Xi|_H \qquad (\text{recall } U_0 \subset L^\infty).
$$
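
As a sanity check of assumption (i) in the concrete case of Part I, the sketch below uses the standard fact that the Dirichlet second derivative on $[0,1]$ has eigenvalues $-\pi^2 k^2$, $k = 1, 2, \dots$, so that $|e^{sA}|^2_{L_2(H)} = \sum_k e^{-2\pi^2 k^2 s}$. Comparing the sum with an integral gives the bound $(8\pi s)^{-1/4}$, i.e. the exponent $\gamma = 1/4$ in this example. The truncation level `kmax` and the sample values of `s` are arbitrary choices made for illustration.

```python
import math

def hs_norm_sq(s, kmax=2000):
    """|e^{sA}|_{L_2(H)}^2 = sum_k exp(-2 pi^2 k^2 s) for the Dirichlet Laplacian on [0,1].

    The sum is truncated at kmax; the neglected tail is positive, so the
    truncated value is a lower bound of the full sum.
    """
    return sum(math.exp(-2.0 * math.pi ** 2 * k * k * s) for k in range(1, kmax + 1))

def bound_sq(s):
    """Integral comparison for a decreasing summand:
    sum_{k>=1} exp(-2 pi^2 k^2 s) <= int_0^inf exp(-2 pi^2 x^2 s) dx = 1 / (2 sqrt(2 pi s))."""
    return 1.0 / (2.0 * math.sqrt(2.0 * math.pi * s))

for s in (1e-4, 1e-3, 1e-2, 1e-1, 1.0):
    hs = math.sqrt(hs_norm_sq(s))
    l_s_gamma = math.sqrt(bound_sq(s))   # equals (8 pi s)^(-1/4), i.e. gamma = 1/4
    print(f"s={s:g}  |e^(sA)|_HS={hs:.4f}  bound={l_s_gamma:.4f}")
```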

Under the above assumptions the state equation (formulated in mild sense)

$$
X_t = e^{tA}x_0 + \int_0^t e^{(t-s)A}F(X_s,u_s)\,ds + \int_0^t e^{(t-s)A}G(X_s,u_s)\,dW_s
$$

admits a unique solution $X \in L^p_W(\Omega, C([0,T],H))$, see [Da Prato, Zabczyk '92].

Remark: If we perturb the control by a spike variation, that is, we consider the solution of

$$
X^\epsilon_t = e^{tA}x_0 + \int_0^t e^{(t-s)A}F(X^\epsilon_s,u^\epsilon_s)\,ds + \int_0^t e^{(t-s)A}G(X^\epsilon_s,u^\epsilon_s)\,dW_s,
$$

where $u^\epsilon_s = u_s I_{[t_0,t_0+\epsilon]^c}(s) + v_0 I_{[t_0,t_0+\epsilon]}(s)$ for fixed $t_0 \in [0,T]$, $v_0 \in U$, then

$$
|X^\epsilon(t_0+\delta) - X(t_0+\delta)|_{L^2(\Omega,\mathbb{P},H)} \approx \delta^{(1/2-\gamma)}.
$$

First Variation Equation

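Let $(X,u)$ be an optimal pair, and fix any other bounded $U$-valued progressive control $v$.

Let $u^\epsilon_t = u_t + \epsilon(v_t - u_t)$ and let $X^\epsilon_t$ be the corresponding solution of the state equation.

Finally denote $(\delta u)_t = v_t - u_t$.

Since we are not considering spike variations, things are easy at this level:

$$
X^\epsilon_t = X_t + \epsilon Y_t + o(\epsilon),
$$

$$
\begin{cases}
dY_t = \big[AY_t + \nabla_X F(X_t,u_t)Y_t + \nabla_u F(X_t,u_t)(\delta u)_t\big]\,dt + \nabla_X G(X_t,u_t)Y_t\,dW_t + \nabla_u G(X_t,u_t)(\delta u)_t\,dW_t, \\
Y_0 = 0.
\end{cases}
$$

By [Da Prato, Zabczyk] the above equation admits a unique mild solution with

$$
\mathbb{E}\Big(\sup_{t \in [0,T]} |Y_t|^2\Big) < +\infty.
$$

Moreover

$$
J(x,u^\epsilon) = J(x,u) + \epsilon I(v) + o(\epsilon)
$$

with

$$
I(v) = \mathbb{E}\int_0^T \big[\langle \nabla_X L(X_t,u_t), Y_t\rangle + \langle \nabla_u L(X_t,u_t), (\delta u)_t\rangle\big]\,dt + \mathbb{E}\,\langle \nabla_X \Phi(X_T), Y_T\rangle.
$$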

We fix a basis $(e_i)_{i \in \mathbb{N}}$ of $H$ and assume that for all $i \in \mathbb{N}$ the map $X \mapsto G(X,u)e_i$ is Gâteaux differentiable $H \to H$.

We notice that in our concrete case, for all $V \in H$,

$$
[\nabla_X(G(X,u)e_i)V](\xi) = \frac{\partial}{\partial X}\sigma(\xi, X(\xi), u(\xi))\,e_i(\xi)\,V(\xi).
$$

So it is enough to choose $e_i \in L^\infty([0,1])$.

We denote $\nabla_X(G(X_t,u_t)e_i)V = C_i(t)V$.

Recall that gradients $\nabla_X$ are with respect to variables $X \in H = L^2([0,1])$.

For simplicity we let $F = 0$ from now on.

The equation for the first variation becomes

$$
\begin{cases}
dY_t = AY_t\,dt + \sum_{i=1}^\infty C_i(t)Y_t\,d\beta^i_t + \nabla_u G(X_t,u_t)(\delta u)_t\,dW_t, \\
Y_0 = 0,
\end{cases}
$$

where $\beta^i_t = \langle e_i, W_t\rangle$ and we have:

• $|C_i(t)|_{L(H)} \le c$, $\mathbb{P}$-a.s. for all $t \in [0,T]$

• $\sum_{i=1}^\infty |e^{tA}C_i(s)v|^2 \le c\,t^{-2\gamma}|v|^2_H$ for all $t > 0$, $s \ge 0$ ($\gamma < 1/2$)

• $\sum_{i=1}^\infty |e^{tA}e_i|^2 \le c\,t^{-2\gamma}$ for all $t > 0$ ($\gamma < 1/2$)

We also take into account that $A$ and $C_i$ are self-adjoint (although not essential).

Adjoint equation

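The adjoint equation is (at least formally)

$$
\begin{cases}
-dp_t = \big[Ap_t + \nabla_X L(X_t,u_t) + \sum_{i=1}^\infty C_i(t)Q_t e_i\big]\,dt + Q_t\,dW_t, \\
p_T = \nabla_X \Phi(X_T).
\end{cases}
$$

We expect a solution with $p_t \in H$ and $Q_t \in L_2(H)$, but we notice that the term $\sum_{i=1}^\infty C_i(t)Q_t e_i$ does not converge for $Q_t \in L_2(H)$.

We can rewrite the above equation in the mild form

$$
p_t = e^{(T-t)A}\nabla_X \Phi(X_T) + \int_t^T e^{(s-t)A}\nabla_X L(X_s,u_s)\,ds + \int_t^T \sum_{i=1}^\infty e^{(s-t)A}C_i(s)Q_s e_i\,ds + \int_t^T e^{(s-t)A}Q_s\,dW_s,
$$

but still $\sum_{i=1}^\infty e^{(s-t)A}C_i(s)Qe_i$ does not converge if $Q \in L_2(H)$. Indeed, if $V \in H$, the Cauchy-Schwarz inequality only gives

$$
\sum_{i=1}^\infty \langle e^{(s-t)A}C_i(s)Qe_i, V\rangle = \sum_{i=1}^\infty \langle Qe_i, C_i(s)e^{(s-t)A}V\rangle \le |Q|_{L_2(H)} \Big(\sum_{i=1}^\infty |C_i(s)e^{(s-t)A}V|^2\Big)^{1/2},
$$

and the finiteness of the last factor is unclear (?).

On the contrary,

$$
\sum_{i=1}^\infty \langle e^{(s-t)A}C_i(s)Qe_i, V\rangle \le \Big(\sum_{i=1}^\infty |Qe_i|\Big)\,\sup_{i \in \mathbb{N}} |C_i(s)e^{(s-t)A}V|.
$$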

Easy Facts on Schatten - von Neumann classes

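We denote by $L_2(H)$ the Hilbert space of Hilbert-Schmidt operators $H \to H$ endowed with the scalar product $\langle L, M\rangle_2 = \sum_{i=1}^\infty \langle Le_i, Me_i\rangle_H$.

Given $L \in L_2(H)$ there exist a sequence $(a^L_n)_{n \in \mathbb{N}} \in \ell^2$ and a couple of orthonormal bases $(e^L_n)_{n \in \mathbb{N}}$, $(f^L_n)_{n \in \mathbb{N}}$ in $H$ such that

$$
L = \sum_{n=1}^\infty a^L_n f^L_n \langle e^L_n, \cdot\rangle \quad \text{and} \quad |L|_2^2 = \sum_n (a^L_n)^2.
$$

If $t \mapsto L_t$ is an $L_2(H)$-valued process then the above objects can be selected with the same measurability properties as $L$.

Define $L_1(H) = \{L \in L_2(H) : |L|_1 < \infty\}$ where

$$
|L|_1 := \sup\{\langle B, L\rangle_2 : B \in L_2(H),\ |B|_{L(H)} \le 1\}.
$$

• If $B \in L(H)$ and $L \in L_1(H)$ then $LB$, $BL$ are in $L_1(H)$; moreover

$$
|LB|_1 \le |L|_1 |B|_{L(H)}, \qquad |BL|_1 \le |L|_1 |B|_{L(H)}.
$$

• If $L \in L_1(H)$ the trace $\mathrm{Tr}(L) := \sum_{i=1}^\infty \langle e_i, Le_i\rangle$ converges absolutely and its value is independent of the choice of the basis $(e_i)_{i \in \mathbb{N}}$.

• $|L|_1 = \sum_{n=1}^\infty |a^L_n|$ and $\mathrm{Tr}(L) = \sum_{n=1}^\infty a^L_n \langle e^L_n, f^L_n\rangle$ (which reduces to $\sum_{n=1}^\infty a^L_n$ when $f^L_n = e^L_n$); consequently $|\mathrm{Tr}(L)| \le |L|_1$.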
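
The facts above can be checked by hand in dimension two. The sketch below builds $L = \sum_n a_n f_n \langle e_n, \cdot\rangle$ from two rotated orthonormal bases of $\mathbb{R}^2$ and verifies that the trace does not depend on the basis used to compute it, that $|\mathrm{Tr}(L)| \le \sum_n |a_n|$, and that the Hilbert-Schmidt norm squared is $\sum_n a_n^2$. The coefficients and rotation angles are hypothetical values chosen for illustration; note that with two different bases the trace picks up the factors $\langle e_n, f_n\rangle$.

```python
import math

def rot(theta):
    """Orthonormal basis of R^2 obtained by rotating the standard basis."""
    return [(math.cos(theta), math.sin(theta)), (-math.sin(theta), math.cos(theta))]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def apply(a, e, f, v):
    """L v = sum_n a_n f_n <e_n, v>."""
    out = [0.0, 0.0]
    for an, en, fn in zip(a, e, f):
        c = an * dot(en, v)
        out = [o + c * fk for o, fk in zip(out, fn)]
    return out

def trace(a, e, f, basis):
    """Tr(L) = sum_i <b_i, L b_i>, computed in the given orthonormal basis."""
    return sum(dot(bi, apply(a, e, f, bi)) for bi in basis)

def hs_norm_sq(a, e, f, basis):
    """|L|_2^2 = sum_i |L b_i|^2."""
    return sum(dot(apply(a, e, f, bi), apply(a, e, f, bi)) for bi in basis)

a = [3.0, -1.0]                    # coefficients a_n (hypothetical values)
e, f = rot(0.3), rot(1.1)          # two different orthonormal bases
nuclear = sum(abs(x) for x in a)   # |L|_1 = sum_n |a_n|
tr_std = trace(a, e, f, rot(0.0))  # trace in the standard basis
tr_rot = trace(a, e, f, rot(0.7))  # trace in another orthonormal basis
hs_sq = hs_norm_sq(a, e, f, rot(0.0))
print(tr_std, tr_rot, nuclear, hs_sq)
```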

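Coming back to our bad term: if $Q \in L_1(H)$ there exist two orthonormal bases such that $Q = \sum_j a_j f_j \langle \cdot, g_j\rangle$, and

$$
\begin{aligned}
\sum_i \big|e^{(s-t)A}C_i(s)Qe_i\big| &= \sum_i \Big|\sum_j e^{(s-t)A}C_i(s)\,a_j f_j \langle e_i, g_j\rangle\Big| \\
&\le \sum_j |a_j| \sum_i |e^{(s-t)A}C_i(s)f_j|\,|\langle e_i, g_j\rangle| \\
&\le \sum_j |a_j|\,c\,(s-t)^{-\gamma} = c\,|Q|_{L_1}(s-t)^{-\gamma},
\end{aligned}
$$

where the last inequality follows from the Cauchy-Schwarz inequality in $i$, using $\sum_i |e^{(s-t)A}C_i(s)f_j|^2 \le c\,(s-t)^{-2\gamma}$ and $\sum_i |\langle e_i, g_j\rangle|^2 = 1$.

Moreover, if we formally compute $d_t\langle Y_t, p_t\rangle$ we get

$$
\mathbb{E}\,\langle Y_T, \nabla_X \Phi(X_T)\rangle + \mathbb{E}\int_0^T \langle Y_s, \nabla_X L(X_s,u_s)\rangle\,ds = \mathbb{E}\int_0^T \mathrm{Tr}\big[(\nabla_u G(X_s,u_s)(\delta u)_s)\,Q_s\big]\,ds.
$$

• The multiplication operator $\nabla_u G(X_s,u_s)(\delta u)_s$ is at most bounded in $H$, thus $\mathrm{Tr}\big[(\nabla_u G(X_s,u_s)(\delta u)_s)\,Q_s\big]$ is not well defined for $Q \in L_2(H)$ but is well defined for $Q \in L_1(H)$.

• We cannot bypass the above term since it will remain in the final formulation of the maximum principle.

Conclusion: The non-Hilbertian space $L_1(H)$ has something to do here.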

Existence of a solution by approximations

Let us denote $\eta := \nabla_X \Phi(X_T) \in L^2(\Omega, \mathcal{F}_T, \mathbb{P}, H)$ and $f := \nabla_X L(X,u) \in L^2_W(\Omega \times [0,T], H)$.

Consider the approximating BSDEs (in mild form)

$$
p^N_t = e^{(T-t)A}\eta + \int_t^T e^{(s-t)A}f_s\,ds + \int_t^T \sum_{i=1}^N e^{(s-t)A}C_i(s)Q^N_s e_i\,ds + \int_t^T e^{(s-t)A}Q^N_s\,dW_s.
$$

By the standard theory (see [Hu-Peng 91]) there exists a unique solution with

$$
\sup_{t \in [0,T]} \mathbb{E}|p^N_t|^2_H + \mathbb{E}\int_0^T |Q^N_t|^2_{L_2(H)}\,dt \le c\Big(\mathbb{E}|\eta|^2 + \mathbb{E}\int_0^T |f_t|^2\,dt\Big).
$$

The idea is to exploit the duality relation with a forward equation in order to obtain good estimates in the $L_1$ norm.

First we show that the same duality relation easily implies weak convergence of the sequence of solutions $(p^N, Q^N)$.

The perturbed forward equation

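Consider the perturbed (forward) equation

$$
\begin{cases}
dY^{\Gamma,\xi}_t = AY^{\Gamma,\xi}_t\,dt + \big[\sum_{i=1}^\infty C_i(t)Y^{\Gamma,\xi}_t + \Gamma_t\big]\,dW_t, \\
Y^{\Gamma,\xi}_s = \xi \in L^2(\Omega, \mathcal{F}_s, \mathbb{P}, H).
\end{cases}
$$

Proposition 1 Given $\Gamma \in L^2_W(\Omega \times [0,T], L_2(H))$ the above equation admits a unique mild solution that verifies

$$
\mathbb{E}|Y^{\Gamma,\xi}_t|^2 \le c\,\mathbb{E}\int_s^t |\Gamma_\ell|^2_{L_2(H)}\,d\ell + \mathbb{E}|\xi|^2,
$$

$$
\mathbb{E}|Y^{\Gamma,\xi}_t|^2 \le c\,\mathbb{E}\int_s^t (t-\ell)^{-2\gamma}|\Gamma_\ell|^2_{L(H)}\,d\ell + \mathbb{E}|\xi|^2.
$$

The same estimates hold (with a constant independent of $N$) for the solutions $Y^{\Gamma,\xi,N}$ of the approximating equations

$$
\begin{cases}
dY^{\Gamma,\xi,N}_t = AY^{\Gamma,\xi,N}_t\,dt + \big[\sum_{i=1}^N C_i(t)Y^{\Gamma,\xi,N}_t + \Gamma_t\big]\,dW_t, \\
Y^{\Gamma,\xi,N}_s = \xi \in L^2(\Omega, \mathcal{F}_s, \mathbb{P}, H).
\end{cases}
$$

Moreover $\mathbb{E}\int_s^T |Y^{\Gamma,\xi,N}_t - Y^{\Gamma,\xi}_t|^2\,dt \to 0$ and $\mathbb{E}|Y^{\Gamma,\xi,N}_t - Y^{\Gamma,\xi}_t|^2 \to 0$ for all $t \in [s,T]$.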

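If we compute (by introducing Yosida approximations of $A$) $d_t\langle Y^{\Gamma,\xi,N}_t, p^N_t\rangle$ we obtain

$$
\mathbb{E}\int_s^T \langle \Gamma_t, Q^N_t\rangle_2\,dt + \mathbb{E}\,\langle \xi, p^N_s\rangle_H = \mathbb{E}\int_s^T \langle f_t, Y^{\Gamma,\xi,N}_t\rangle\,dt + \mathbb{E}\,\langle \eta, Y^{\Gamma,\xi,N}_T\rangle_H
$$

and, taking into account the convergence of $Y^{\Gamma,\xi,N}$ towards $Y^{\Gamma,\xi}$, we deduce the following.

Corollary 2 There exists a couple of adapted processes $Q \in L^2(\Omega \times [0,T], L_2(H))$, $p \in C([0,T], L^2(\Omega, H))$, such that

$$
Q^N \rightharpoonup Q \ \text{ in } L^2(\Omega \times [0,T], L_2(H)), \qquad p^N_t \rightharpoonup p_t \ \text{ in } L^2(\Omega, H) \quad \forall t \in [0,T].
$$

Moreover, since for all $t \in [0,T]$ the stochastic integral $\int_t^T e^{(s-t)A}Q^N_s\,dW_s$ converges weakly to $\int_t^T e^{(s-t)A}Q_s\,dW_s$, we immediately deduce, by difference, that for all $t \in [0,T]$ there exists $\Xi_t$ in $L^2(\Omega, \mathcal{F}_t, H)$ such that

$$
\sum_{i=1}^N \int_t^T e^{(s-t)A}C_i(s)Q^N_s e_i\,ds \rightharpoonup \Xi_t \quad \text{weakly in } L^2(\Omega, \mathcal{F}_T, H).
$$

The mild BSDE for the couple $(p,Q)$ at this point reads:

$$
p_t = \Xi_t + e^{(T-t)A}\eta + \int_t^T e^{(s-t)A}f_s\,ds + \int_t^T e^{(s-t)A}Q_s\,dW_s.
$$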

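The above is not satisfactory, in the sense that we want at least to obtain a mild BSDE. To start with, we notice that the convergence of the bad term holds for the limit process $Q$ itself, namely:

Proposition 3

$$
\sum_{i=1}^N \int_t^T e^{(s-t)A}C_i(s)Q_s e_i\,ds \rightharpoonup \Xi_t \quad \text{in } L^2(\Omega, \mathcal{F}_T, \mathbb{P}, H).
$$

Proof: Given $\xi \in L^2(\Omega, \mathcal{F}_T, \mathbb{P}, H)$,

$$
\begin{aligned}
\mathbb{E}\Big\langle \sum_{i=1}^N \int_t^T e^{(s-t)A}C_i(s)Q_s e_i\,ds, \xi\Big\rangle
&= \sum_{i=1}^N \mathbb{E}\int_t^T \langle Q_s e_i, C_i(s)e^{(s-t)A}\mathbb{E}(\xi|\mathcal{F}_s)\rangle\,ds \\
&= \lim_{M \to \infty} \sum_{i=1}^N \mathbb{E}\int_t^T \langle Q^M_s e_i, C_i(s)e^{(s-t)A}\mathbb{E}(\xi|\mathcal{F}_s)\rangle\,ds.
\end{aligned}
$$

If $\gamma_i(s) = C_i(s)e^{(s-t)A}\mathbb{E}(\xi|\mathcal{F}_s)$ and $Y^{M,N}$ is the solution of the forward mild SDE

$$
Y^{M,N}_\zeta = \sum_{i=1}^M \int_t^\zeta e^{(\zeta-s)A}C_i(s)Y^{M,N}_s\,d\beta^i_s + \sum_{i=1}^N \int_t^\zeta e^{(\zeta-s)A}\gamma_i(s)\,d\beta^i_s,
$$

then

$$
\mathbb{E}\Big\langle \sum_{i=1}^N \int_t^T e^{(s-t)A}C_i(s)Q^M_s e_i\,ds, \xi\Big\rangle = \mathbb{E}\,\langle \eta, Y^{M,N}_T\rangle + \mathbb{E}\int_t^T \langle f_\zeta, Y^{M,N}_\zeta\rangle\,d\zeta.
$$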

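Noticing that for all $\rho > t$

$$
\sum_{i=1}^\infty \mathbb{E}\int_t^\rho |e^{(\rho-s)A}\gamma_i(s)|^2\,ds = \sum_{i=1}^\infty \mathbb{E}\int_t^\rho |e^{(\rho-s)A}C_i(s)e^{(s-t)A}\mathbb{E}(\xi|\mathcal{F}_s)|^2\,ds \le c\,\mathbb{E}|\xi|^2,
$$

we can show that, for all fixed $t \in [0,T]$, as $N, M \to \infty$,

$$
\mathbb{E}|Y^{M,N}_\zeta - Y^\infty_\zeta|^2 \to 0, \qquad \int_t^T \mathbb{E}|Y^{M,N}_\zeta - Y^\infty_\zeta|^2\,d\zeta \to 0,
$$

where $Y^\infty$ is the mild solution of the forward SDE

$$
\begin{cases}
dY^\infty_\zeta = AY^\infty_\zeta\,d\zeta + \sum_{i=1}^\infty C_i(\zeta)Y^\infty_\zeta\,d\beta^i_\zeta + \sum_{i=1}^\infty \gamma_i(\zeta)\,d\beta^i_\zeta, \\
Y^\infty_t = 0.
\end{cases}
$$

In conclusion

$$
\mathbb{E}\Big\langle \sum_{i=1}^N \int_t^T e^{(s-t)A}C_i(s)Q_s e_i\,ds, \xi\Big\rangle \to \mathbb{E}\,\langle \eta, Y^\infty_T\rangle + \mathbb{E}\int_t^T \langle f_\zeta, Y^\infty_\zeta\rangle\,d\zeta.
$$

In an identical way we can show that

$$
\mathbb{E}\Big\langle \sum_{i=1}^N \int_t^T e^{(s-t)A}C_i(s)Q^N_s e_i\,ds, \xi\Big\rangle \to \mathbb{E}\,\langle \eta, Y^\infty_T\rangle + \mathbb{E}\int_t^T \langle f_\zeta, Y^\infty_\zeta\rangle\,d\zeta
$$

and this concludes the proof.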

Estimates of Q in the L1 norm

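Proposition 4

$$
\mathbb{E}\int_0^T (T-s)^{2\gamma}|Q_s|^2_{L_1(H)}\,ds \le c\,\mathbb{E}|\eta|^2 + c\,\mathbb{E}\int_0^T |f_s|^2\,ds.
$$

Remembering the representation $Q_t = \sum_{n=1}^\infty a_n(t)f_n(t)\langle e_n(t), \cdot\rangle$, we choose

$$
\Gamma^N_t := \alpha(t)\sum_{n=1}^N \mathrm{sgn}(a_n(t))\,f_n(t)\langle e_n(t), \cdot\rangle \quad \text{with } \alpha : [0,T] \to \mathbb{R}.
$$

We notice that $|\Gamma^N_t|_{L(H)} \le \alpha(t)$ and that $\langle Q_t, \Gamma^N_t\rangle_{L_2(H)} = \alpha(t)\sum_{n=1}^N |a_n(t)|$, so that

$$
\mathbb{E}\int_0^T |Q_t|_{L_1(H)}\,\alpha(t)\,dt = \mathbb{E}\int_0^T \sup_N \langle Q_t, \Gamma^N_t\rangle_2\,dt \le \sup_N \Big[\mathbb{E}\,\langle \eta, Y^{\Gamma^N}_T\rangle_H + \mathbb{E}\int_0^T \langle f_s, Y^{\Gamma^N}_s\rangle_H\,ds\Big],
$$

where

$$
\begin{cases}
dY^\Gamma_t = AY^\Gamma_t\,dt + \big[\sum_{i=1}^\infty C_i(t)Y^\Gamma_t + \Gamma_t\big]\,dW_t, \\
Y^\Gamma_0 = 0.
\end{cases}
$$

Recalling the estimate of $Y^\Gamma$ with respect to the $L(H)$ norm of $\Gamma$ we get

$$
\mathbb{E}\int_0^T |Q_t|_1\,\alpha(t)\,dt \le c_{\eta,f}\Big(\int_0^T (T-s)^{-2\gamma}\alpha^2(s)\,ds\Big)^{1/2}
$$

and the proof follows letting $\tilde\alpha(s) = (T-s)^{-\gamma}\alpha(s)$ and rewriting the above as

$$
\mathbb{E}\int_0^T |Q_t|_1\,(T-t)^{\gamma}\,\tilde\alpha(t)\,dt \le c_{\eta,f}\,|\tilde\alpha|_{L^2[0,T]}.
$$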

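Corollary 5 The series $\sum_{i=1}^\infty \int_t^T e^{(s-t)A}C_i(s)Q_s e_i\,ds$ converges in $L^1(\Omega, \mathbb{P}, H)$ and the BSDE is satisfied in the proper sense, that is, for all $t \in [0,T]$ it holds $\mathbb{P}$-a.s.

$$
p_t = e^{(T-t)A}\eta + \int_t^T e^{(s-t)A}f_s\,ds + \sum_{i=1}^\infty \int_t^T e^{(s-t)A}C_i(s)Q_s e_i\,ds + \int_t^T e^{(s-t)A}Q_s\,dW_s.
$$

Proof: Recall the estimate

$$
\sum_{i=1}^\infty \big|e^{(s-t)A}C_i(s)Qe_i\big| \le C\,|Q|_{L_1}(s-t)^{-\gamma}.
$$

Then for all $N$

$$
\begin{aligned}
\mathbb{E}\sum_{i=1}^N \Big|\int_t^T e^{(s-t)A}C_i(s)Q_s e_i\,ds\Big| &\le c\,\mathbb{E}\int_t^T |Q_s|_{L_1}(s-t)^{-\gamma}\,ds \\
&\le c\,\Big(\mathbb{E}\int_t^T |Q_s|^2_{L_1}(T-s)^{2\gamma}\,ds\Big)^{1/2}\Big(\int_t^T (T-s)^{-2\gamma}(s-t)^{-2\gamma}\,ds\Big)^{1/2}
\end{aligned}
$$

and the claim follows since this last integral converges.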
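
The convergence of the last integral can be made explicit: the substitution $s = t + (T-t)r$ turns it into $(T-t)^{1-4\gamma}B(1-2\gamma, 1-2\gamma)$, which is finite exactly because $\gamma < 1/2$. The sketch below compares this closed form with a simple midpoint quadrature; the sample values of $\gamma$, $t$, $T$ and the number of quadrature nodes are arbitrary choices, and the quadrature is only meant for $\gamma < 1/4$, where the substituted integrand is continuous.

```python
import math

def beta_closed_form(gamma, t, T):
    """(T-t)^(1-4*gamma) * B(1-2*gamma, 1-2*gamma), valid for gamma < 1/2."""
    a = 1.0 - 2.0 * gamma
    return (T - t) ** (1.0 - 4.0 * gamma) * math.gamma(a) ** 2 / math.gamma(2.0 * a)

def quadrature(gamma, t, T, n=20000):
    """Midpoint rule for int_t^T (T-s)^(-2 gamma) (s-t)^(-2 gamma) ds.

    After s = t + (T-t) sin^2(theta) the integral becomes
    2 (T-t)^(1-4 gamma) int_0^{pi/2} (sin(theta) cos(theta))^(1-4 gamma) d theta,
    whose integrand is continuous when gamma < 1/4.
    """
    p = 1.0 - 4.0 * gamma
    h = (math.pi / 2.0) / n
    total = 0.0
    for k in range(n):
        th = (k + 0.5) * h
        total += (math.sin(th) * math.cos(th)) ** p
    return 2.0 * (T - t) ** (1.0 - 4.0 * gamma) * total * h

print(quadrature(0.2, 0.25, 1.0), beta_closed_form(0.2, 0.25, 1.0))
```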

Conclusion

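Passing to the limit in the duality relation holding for $(p^N, Q^N)$ we get (recalling the expansion of the cost)

$$
\begin{aligned}
J(x,u^\epsilon) - J(x,u) ={}& \epsilon\,\mathbb{E}\int_0^T \langle (\delta u)_s, [\nabla_u F(X_s,u_s)]^* p_s\rangle\,ds + \epsilon\,\mathbb{E}\int_0^T \langle \nabla_u L(X_s,u_s), (\delta u)_s\rangle\,ds \\
&+ \epsilon\,\mathbb{E}\int_0^T \mathrm{Tr}\big[(\nabla_u G(X_s,u_s)(\delta u)_s)\,Q_s\big]\,ds + o(\epsilon).
\end{aligned}
$$

And we now know that all the terms in the above formula are well defined.

Recall that we are assuming that $|\nabla_u G(X_s,u_s)v_s|_{L(H)} \le \text{const}$ and we have just proved that $Q \in L_1(H)$, $\mathbb{P} \otimes dt$-a.s.

So we can conclude (by the usual localization - Lebesgue differentiation procedure) that for all $v \in U$ it holds $\mathbb{P} \otimes dt$-a.s.

$$
\langle \nabla_u L(X_s,u_s), v - u_s\rangle + \mathrm{Tr}\big[(\nabla_u G(X_s,u_s)(v - u_s))\,Q_s\big] \ge 0.
$$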

Uniqueness of the mild BSDE ?


PART II: FINITE DIMENSIONAL NOISE

Formulation of the optimal control problem

Let $(\beta^1_t, \dots, \beta^m_t)$, $t \ge 0$, be a standard $m$-dimensional Wiener process.

$(\mathcal{F}_t)_{t \ge 0}$ denotes its natural (completed) filtration.

The set of control actions $U$ is a separable metric space, not necessarily convex. A control $u$ is a process in $U$.

$O \subset \mathbb{R}^n$ is a bounded open set with regular boundary. The controlled state equation is an SPDE of the following semi-abstract form: for $t \in [0,T]$, $x \in O$,

$$
\begin{cases}
dX_t(x) = AX_t(x)\,dt + b(x, X_t(x), u_t)\,dt + \sum_{j=1}^m \sigma_j(x, X_t(x), u_t)\,d\beta^j_t, \\
X_0(x) = x_0(x),
\end{cases}
$$

where

$b(x,r,u), \sigma_j(x,r,u) : O \times \mathbb{R} \times U \to \mathbb{R}$ are given (all difficulties are already present if $b$ and $\sigma_j$ are very regular in $r$ and independent of $x$).

$H = L^2(O)$ is the state space, with usual scalar product $\langle \cdot, \cdot\rangle$. We assume $x_0 \in H$. The solution $X_t$, $t \in [0,T]$, will be a process in $H$.

$A$ is the realization of a partial differential operator, with appropriate boundary conditions.

Standing assumptions

1) Regular coefficients

The functions $b(x,r,u)$, $\sigma_j(x,r,u)$, $l(x,r,u)$, $h(x,r)$ are measurable and

a) continuous in $u$;

b) of class $C^2$ in $r \in \mathbb{R}$;

c) bounded together with their first and second derivatives w.r.t. $r$.

2) $L^p$-boundedness of the semigroup

$A$ is a generator of a strongly continuous semigroup $e^{tA}$, $t \ge 0$, in $H = L^2(O)$. Moreover, for every $p \in [2,\infty)$ and $t \in [0,T]$,

$$
e^{tA}(L^p(O)) \subset L^p(O), \qquad \|e^{tA}f\|_{L^p(O)} \le C_p\|f\|_{L^p(O)}
$$

for some constants $C_p$ independent of $t$ and $f$.

3) Compactness in $L^4$ of the semigroup

The restriction of $e^{tA}$, $t \ge 0$, to $L^4(O)$ is an analytic semigroup with domain of the infinitesimal generator compactly embedded in $L^4(O)$.

Statement of the stochastic maximum principle

For $u \in U$ and $X, p, q_1, \dots, q_m \in H = L^2(O)$ denote

$$
\mathcal{H}(u, X, p, q_1, \dots, q_m) = \int_O \big[l(x, X(x), u) + b(x, X(x), u)\,p(x) + \sigma_j(x, X(x), u)\,q_j(x)\big]\,dx
$$

(summation over the repeated index $j = 1, \dots, m$ is understood).

Theorem. Let $(X_t, u_t)$ be an optimal pair. Then there are (suitably characterized):

1) $(m+1)$ $L^2(O)$-valued adapted processes $p_t, q_{1t}, \dots, q_{mt}$, $t \in [0,T]$,

2) one operator-valued process $P_t$, $t \in [0,T]$;

for which the following inequality holds $\mathbb{P}$-a.s. for a.e. $t \in [0,T]$: for every $v \in U$,

$$
\begin{aligned}
&\mathcal{H}(v, X_t, p_t, q_{1t}, \dots, q_{mt}) - \mathcal{H}(u_t, X_t, p_t, q_{1t}, \dots, q_{mt}) \\
&\quad + \frac{1}{2}\big\langle P_t[\sigma_j(\cdot, X_t(\cdot), v) - \sigma_j(\cdot, X_t(\cdot), u_t)],\ \sigma_j(\cdot, X_t(\cdot), v) - \sigma_j(\cdot, X_t(\cdot), u_t)\big\rangle \ge 0.
\end{aligned}
$$

The first adjoint processes $p_t$, $q_{jt}$ are characterized as the unique solutions in $H$ of an appropriate BSPDE and satisfy

$$
\sup_{t \in [0,T]} \mathbb{E}\|p_t\|^2_H + \mathbb{E}\int_0^T \|q_{jt}\|^2_H\,dt < \infty.
$$

The second adjoint process $P_t$ takes values in the space of linear bounded operators $L^4(O) \to L^4(O)^* = L^{4/3}(O)$ and also admits a suitable unique characterization.
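
To fix ideas on how the inequality of the theorem is evaluated, here is a minimal discretized sketch with $m = 1$ and $O = [0,1]$. Everything in it is an assumption made for illustration: the coefficients `l_cost`, `b`, `sigma`, the sampled processes `X`, `p`, `q`, and in particular the diagonal stand-in `P_diag` for the operator $P_t$ (the true second adjoint is an operator-valued process characterized later in the talk, not a multiplication operator).

```python
import math

# Hypothetical coefficients with m = 1 (one Brownian motion); not from the talk.
def l_cost(x, r, u):
    return 0.5 * (r * r + u * u)

def b(x, r, u):
    return -r + u

def sigma(x, r, u):
    return 0.3 * math.cos(u) * r

def hamiltonian(u, X, p, q, grid):
    """Riemann-sum approximation of int_O [l + b p + sigma q] dx on O = [0,1]."""
    dx = grid[1] - grid[0]
    return sum((l_cost(x, Xx, u) + b(x, Xx, u) * px + sigma(x, Xx, u) * qx) * dx
               for x, Xx, px, qx in zip(grid, X, p, q))

def smp_gap(v, u, X, p, q, P_diag, grid):
    """H(v) - H(u) + 1/2 <P dsigma, dsigma>, with P replaced by the
    diagonal (multiplication) stand-in P_diag."""
    dx = grid[1] - grid[0]
    dsig = [sigma(x, Xx, v) - sigma(x, Xx, u) for x, Xx in zip(grid, X)]
    second = 0.5 * sum(Px * d * d * dx for Px, d in zip(P_diag, dsig))
    return hamiltonian(v, X, p, q, grid) - hamiltonian(u, X, p, q, grid) + second

n = 101
grid = [i / (n - 1) for i in range(n)]
X = [math.sin(math.pi * x) for x in grid]
p = [x * (1.0 - x) for x in grid]
q = [0.1] * n
P_diag = [1.0] * n
u, v = 0.2, 0.9
print(smp_gap(v, u, X, p, q, P_diag, grid))
```

At an optimal control the quantity returned by `smp_gap` would have to be nonnegative for every admissible `v`; the sketch only shows how the three ingredients (Hamiltonian increment, jump of $\sigma$, second adjoint) enter, and that the expression vanishes at `v = u`.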

Preliminaries to the proof of the maximum principle

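Let $(X,u)$ be an optimal pair. We introduce the spike variation: we fix an arbitrary interval $[t_0, t_0+\epsilon] \subset (0,T)$ and an arbitrary $v \in U$ and define

$$
u^\epsilon_t = \begin{cases} u_t & \text{if } t \notin [t_0, t_0+\epsilon], \\ v & \text{if } t \in [t_0, t_0+\epsilon]. \end{cases}
$$

Let (the prime denotes the derivative with respect to $r$)

$$
\begin{aligned}
\delta l_t(x) &= l(x, X_t(x), u^\epsilon_t) - l(x, X_t(x), u_t), \\
\delta b_t(x) &= b(x, X_t(x), u^\epsilon_t) - b(x, X_t(x), u_t), \\
\delta\sigma_{jt}(x) &= \sigma_j(x, X_t(x), u^\epsilon_t) - \sigma_j(x, X_t(x), u_t), \\
\delta b'_t(x) &= b'(x, X_t(x), u^\epsilon_t) - b'(x, X_t(x), u_t), \\
\delta\sigma'_{jt}(x) &= \sigma'_j(x, X_t(x), u^\epsilon_t) - \sigma'_j(x, X_t(x), u_t).
\end{aligned}
$$

Let $X^\epsilon_t$ be the solution corresponding to the spike variation $u^\epsilon_t$:

$$
\begin{cases}
dX^\epsilon_t(x) = AX^\epsilon_t(x)\,dt + b(x, X^\epsilon_t(x), u^\epsilon_t)\,dt + \sigma_j(x, X^\epsilon_t(x), u^\epsilon_t)\,d\beta^j_t, \\
X^\epsilon_0(x) = x_0(x).
\end{cases}
$$

We wish to represent it in the form

$$
X^\epsilon_t = X_t + Y^\epsilon_t + Z^\epsilon_t + \text{remainder term},
$$

where the remainder has to be $o(\epsilon)$.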

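Equation for $Y^\epsilon_t$ (to be understood in a mild sense):

$$
\begin{cases}
dY^\epsilon_t(x) = \big[AY^\epsilon_t(x) + b'(x, X_t(x), u_t)\,Y^\epsilon_t(x)\big]\,dt + \sigma'_j(x, X_t(x), u_t)\,Y^\epsilon_t(x)\,d\beta^j_t + \delta b_t(x)\,dt + \delta\sigma_{jt}(x)\,d\beta^j_t, \\
Y^\epsilon_0(x) = 0.
\end{cases}
$$

Equation for $Z^\epsilon_t$ (to be understood in a mild sense):

$$
\begin{aligned}
dZ^\epsilon_t(x) ={}& \big[AZ^\epsilon_t(x) + b'(x, X_t(x), u_t)\,Z^\epsilon_t(x)\big]\,dt + \sigma'_j(x, X_t(x), u_t)\,Z^\epsilon_t(x)\,d\beta^j_t \\
&+ \big[\tfrac{1}{2}b''(x, X_t(x), u_t)\,Y^\epsilon_t(x)^2 + \delta b'_t(x)\,Y^\epsilon_t(x)\big]\,dt \\
&+ \big[\tfrac{1}{2}\sigma''_j(x, X_t(x), u_t)\,Y^\epsilon_t(x)^2 + \delta\sigma'_{jt}(x)\,Y^\epsilon_t(x)\big]\,d\beta^j_t, \\
Z^\epsilon_0(x) ={}& 0.
\end{aligned}
$$

Proposition. For all $p \ge 2$,

$$
\sup_{t \in [0,T]} \big(\mathbb{E}\|Y^\epsilon_t\|^p_{L^p(O)}\big)^{1/p} = \sup_{t \in [0,T]} \Big(\mathbb{E}\int_O |Y^\epsilon_t(x)|^p\,dx\Big)^{1/p} \le C_p\sqrt{\epsilon},
$$

$$
\sup_{t \in [0,T]} \big(\mathbb{E}\|Z^\epsilon_t\|^p_{L^p(O)}\big)^{1/p} = \sup_{t \in [0,T]} \Big(\mathbb{E}\int_O |Z^\epsilon_t(x)|^p\,dx\Big)^{1/p} \le C_p\,\epsilon,
$$

$$
\sup_{t \in [0,T]} \big(\mathbb{E}\|X^\epsilon_t - X_t - Y^\epsilon_t - Z^\epsilon_t\|^2_H\big)^{1/2} = \sup_{t \in [0,T]} \Big(\mathbb{E}\int_O |X^\epsilon_t(x) - X_t(x) - Y^\epsilon_t(x) - Z^\epsilon_t(x)|^2\,dx\Big)^{1/2} = o(\epsilon).
$$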
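
The $\sqrt{\epsilon}$ rate for $Y^\epsilon$ can be seen by hand in a scalar toy analogue, which is an assumption made purely for illustration: no space variable, $A = 0$, $b' = 0$, $\delta b = 0$, one Brownian motion, and constant $\sigma'$ and $\delta\sigma$. Then $dY = \sigma' Y\,d\beta + \delta\sigma\,1_{[t_0,t_0+\epsilon]}\,d\beta$ and, by Itô's formula (using $\mathbb{E}Y_t = 0$), $m(t) = \mathbb{E}|Y_t|^2$ solves $m' = (\sigma')^2 m + (\delta\sigma)^2 1_{[t_0,t_0+\epsilon]}$, which can be integrated in closed form. All parameter values below are hypothetical.

```python
import math

def sup_second_moment(eps, t0=0.3, T=1.0, sigma_prime=0.8, delta_sigma=0.5):
    """sup_t E|Y_t|^2 for dY = sigma' Y dbeta + delta_sigma 1_[t0,t0+eps] dbeta, Y_0 = 0.

    m(t) = E|Y_t|^2 solves m' = sigma'^2 m + delta_sigma^2 1_[t0,t0+eps](t); m is
    nondecreasing, so the supremum over [0,T] is attained at t = T.
    Requires t0 + eps <= T.
    """
    k = sigma_prime ** 2
    m_end_of_spike = delta_sigma ** 2 * math.expm1(k * eps) / k
    return m_end_of_spike * math.exp(k * (T - t0 - eps))

for eps in (1e-1, 1e-2, 1e-3, 1e-4):
    l2_norm = math.sqrt(sup_second_moment(eps))
    print(f"eps={eps:g}  sup_t (E|Y_t|^2)^(1/2)={l2_norm:.5f}  "
          f"ratio to sqrt(eps)={l2_norm / math.sqrt(eps):.4f}")
```

The printed ratio stays bounded as `eps` decreases, which is the scalar counterpart of the first estimate of the proposition with $p = 2$.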

Expansion of the cost functional

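Let $(X,u)$ be an optimal pair for the cost

$$
J(u) = \mathbb{E}\int_0^T\!\!\int_O l(x, X_t(x), u_t)\,dx\,dt + \mathbb{E}\int_O h(x, X_T(x))\,dx.
$$

Let $u^\epsilon_t$ be the spike variation, and $J(u^\epsilon)$ the corresponding cost. Then clearly

$$
J(u^\epsilon) - J(u) \ge 0.
$$

Recall

$$
\delta l_t(x) = l(x, X_t(x), u^\epsilon_t) - l(x, X_t(x), u_t).
$$

Proposition. We have

$$
0 \le J(u^\epsilon) - J(u) = \mathbb{E}\int_0^T\!\!\int_O \delta l_t(x)\,dx\,dt + \Delta^\epsilon_1 + \Delta^\epsilon_2 + o(\epsilon),
$$

where

$$
\Delta^\epsilon_1 = \mathbb{E}\int_0^T\!\!\int_O l'(x, X_t(x), u_t)\,(Y^\epsilon_t(x) + Z^\epsilon_t(x))\,dx\,dt + \mathbb{E}\int_O h'(x, X_T(x))\,(Y^\epsilon_T(x) + Z^\epsilon_T(x))\,dx,
$$

$$
\Delta^\epsilon_2 = \frac{1}{2}\,\mathbb{E}\int_0^T\!\!\int_O l''(x, X_t(x), u_t)\,Y^\epsilon_t(x)^2\,dx\,dt + \frac{1}{2}\,\mathbb{E}\int_O h''(x, X_T(x))\,Y^\epsilon_T(x)^2\,dx.
$$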
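
A sketch of where the three terms come from, assuming only the boundedness of $l'$, $l''$ and the orders $Y^\epsilon = O(\sqrt{\epsilon})$, $Z^\epsilon = O(\epsilon)$, remainder $o(\epsilon)$ from the estimates on $Y^\epsilon$ and $Z^\epsilon$. The control of the Taylor remainder is not shown here, and the terminal cost $h$ is treated in the same way without the $\delta$ terms.

```latex
\begin{aligned}
l(x, X^\epsilon_t, u^\epsilon_t) - l(x, X_t, u_t)
  &= \delta l_t + \big[\, l(x, X^\epsilon_t, u^\epsilon_t) - l(x, X_t, u^\epsilon_t) \,\big] \\
  &= \delta l_t + \big( l'(x, X_t, u_t) + \delta l'_t \big)(X^\epsilon_t - X_t)
     + \tfrac12\, l''(x, X_t, u^\epsilon_t)\,(X^\epsilon_t - X_t)^2 + \dots \\
  &= \delta l_t + l'(x, X_t, u_t)\,(Y^\epsilon_t + Z^\epsilon_t)
     + \tfrac12\, l''(x, X_t, u_t)\,(Y^\epsilon_t)^2 + \text{(lower order terms)} .
\end{aligned}
% Here \delta l'_t := l'(x, X_t, u^\epsilon_t) - l'(x, X_t, u_t).
% Lower order terms, after taking expectation and integrating in (t, x):
%   \delta l'_t \,(Y^\epsilon_t + Z^\epsilon_t): supported on [t_0, t_0+\epsilon], size O(\sqrt{\epsilon}),
%       hence O(\epsilon \cdot \sqrt{\epsilon}) = o(\epsilon);
%   Y^\epsilon_t Z^\epsilon_t = O(\epsilon^{3/2}), \quad (Z^\epsilon_t)^2 = O(\epsilon^2);
%   replacing u^\epsilon_t by u_t inside l'' only changes the integrand on the spike interval.
```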

The first adjoint processes

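Consider the backward SPDE

$$
\begin{cases}
-dp_t(x) = -q_{jt}(x)\,d\beta^j_t + \big[A^*p_t(x) + b'(x, X_t(x), u_t)\,p_t(x) + \sigma'_j(x, X_t(x), u_t)\,q_{jt}(x) + l'(x, X_t(x), u_t)\big]\,dt, \\
p_T(x) = h'(x, X_T(x)).
\end{cases}
$$

By Hu-Peng, Stoch. Anal. Appl. ('91), there exists a unique $(m+1)$-tuple of adapted processes $(p, q_1, \dots, q_m)$ solving the above in a mild sense and verifying

$$
\sup_{t \in [0,T]} \mathbb{E}\int_O |p_t(x)|^2\,dx + \mathbb{E}\int_0^T\!\!\int_O |q_{jt}(x)|^2\,dx\,dt < \infty.
$$

Computing $d\int_O Y^\epsilon_t(x)p_t(x)\,dx$ and $d\int_O Z^\epsilon_t(x)p_t(x)\,dx$, and joining what one obtains with the expression for $\Delta^\epsilon_2$, we get

$$
0 \le J(u^\epsilon) - J(u) = \mathbb{E}\int_0^T\!\!\int_O \big[\delta l_t(x) + \delta b_t(x)\,p_t(x) + \delta\sigma_{jt}(x)\,q_{jt}(x)\big]\,dx\,dt + \frac{1}{2}\Delta^\epsilon_3 + o(\epsilon),
$$

where

$$
\Delta^\epsilon_3 = \mathbb{E}\int_0^T\!\!\int_O \big[l''(x, X_t(x), u_t) + b''(x, X_t(x), u_t)\,p_t(x) + \sigma''_j(x, X_t(x), u_t)\,q_{jt}(x)\big]\,Y^\epsilon_t(x)^2\,dx\,dt + \mathbb{E}\int_O h''(x, X_T(x))\,Y^\epsilon_T(x)^2\,dx.
$$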

The second adjoint processes

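Consider again

$$
\Delta^\epsilon_3 = \mathbb{E}\int_0^T\!\!\int_O H_t(x)\,Y^\epsilon_t(x)^2\,dx\,dt + \mathbb{E}\int_O h(x)\,Y^\epsilon_T(x)^2\,dx = \mathbb{E}\int_0^T \langle H_t Y^\epsilon_t, Y^\epsilon_t\rangle\,dt + \mathbb{E}\,\langle hY^\epsilon_T, Y^\epsilon_T\rangle,
$$

where

$$
H_t(x) = l''(x, X_t(x), u_t) + b''(x, X_t(x), u_t)\,p_t(x) + \sigma''_j(x, X_t(x), u_t)\,q_{jt}(x), \qquad h(x) = h''(x, X_T(x)).
$$

Here and below, by $H_t$ and $h$ we denote the multiplication operators by $H_t(\cdot)$ and $h(\cdot)$, acting on $H$:

$$
H_t : f(\cdot) \mapsto H_t(\cdot)f(\cdot), \qquad h : f(\cdot) \mapsto h(\cdot)f(\cdot), \qquad f \in H = L^2(O).
$$

Note that

$$
|h(x)| \le C := \sup|h''| < \infty, \qquad \mathbb{E}\int_0^T\!\!\int_O |H_t(x)|^2\,dx\,dt < \infty,
$$

the latter being only a square-integrability bound due to the occurrence of $q_{jt}(x)$, so

$$
h(\cdot) \in L^\infty(O) \ \ \mathbb{P}\text{-a.s.}, \qquad H_t(\cdot) \in L^2(O), \ \ \mathbb{P} \times dt\text{-a.e.}
$$

In particular, $h$ is bounded but $H_t$ is not a (bounded) linear operator on $H$.

To finish our argument we have to compute $\lim_{\epsilon \to 0} \epsilon^{-1}\Delta^\epsilon_3$.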

Characterization of P

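For fixed $t \in [0,T]$ and $f \in L^4$, we consider the equation

$$
\begin{cases}
d\mathcal{Y}^{t,f}_s(x) = A\mathcal{Y}^{t,f}_s(x)\,ds + b'(x, X_s(x), u_s)\,\mathcal{Y}^{t,f}_s(x)\,ds + \sigma'_j(x, X_s(x), u_s)\,\mathcal{Y}^{t,f}_s(x)\,d\beta^j_s, \\
\mathcal{Y}^{t,f}_t(x) = f(x).
\end{cases}
$$

We define a progressive process $(P_t)_{t \in [0,T]}$ with values in the space of bounded linear operators $L^4 \to (L^4)^* = L^{4/3}$ setting, for $t \in [0,T]$, $f, g \in L^4$,

$$
\langle P_t f, g\rangle = \mathbb{E}^{\mathcal{F}_t}\int_t^T \langle H_s\mathcal{Y}^{t,f}_s, \mathcal{Y}^{t,g}_s\rangle\,ds + \mathbb{E}^{\mathcal{F}_t}\langle h\mathcal{Y}^{t,f}_T, \mathcal{Y}^{t,g}_T\rangle, \qquad \mathbb{P}\text{-a.s.}
$$

The process $(P_t)_{t \in [0,T]}$ enjoys the following properties:

Boundedness: $\sup_{t \in [0,T]} \mathbb{E}\|P_t\|^2_L < \infty$.

Continuity: $\mathbb{E}|\langle (P_{t+\epsilon} - P_t)f, g\rangle| \to 0$ as $\epsilon \to 0$, for $f, g \in L^4(O)$.

Regularization: For every $\eta \in (0,1/4)$ there exists a constant $C_\eta$ such that

$$
\mathbb{E}\sup_{f,g} |\langle P_t(-A)^\eta f, (-A)^\eta g\rangle|^2 \le C_\eta (T-t)^{-4\eta}\,\mathbb{E}\Big[\int_0^T \|H_s\|^2_{L^2(O)}\,ds + \|h\|^2_{L^2(O)}\Big],
$$

where $D(-A)^\eta$ is the domain of the fractional power $(-A)^\eta$ in $L^4(O)$ and the sup is taken over all $f, g \in D(-A)^\eta$ with $\|f\|_{L^4(O)} \le 1$, $\|g\|_{L^4(O)} \le 1$.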

Conclusion of the proof

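By the Markov property and suitable estimates (recalling that, for all $p \ge 1$, $\mathbb{E}\|Y^\epsilon_t\|^{2p}_{L^p(O)} \le C_p\,\epsilon^p$ for all $t \in [0,T]$),

$$
\begin{aligned}
\Delta^\epsilon_3 &= \mathbb{E}\int_0^T \langle H_s Y^\epsilon_s, Y^\epsilon_s\rangle\,ds + \mathbb{E}\,\langle hY^\epsilon_T, Y^\epsilon_T\rangle = \mathbb{E}\int_{t_0}^T \langle H_s Y^\epsilon_s, Y^\epsilon_s\rangle\,ds + \mathbb{E}\,\langle hY^\epsilon_T, Y^\epsilon_T\rangle \\
&= o(\epsilon) + \mathbb{E}\int_{t_0+\epsilon}^T \big\langle H_s\,\mathcal{Y}^{t_0+\epsilon,\,Y^\epsilon_{t_0+\epsilon}}_s,\ \mathcal{Y}^{t_0+\epsilon,\,Y^\epsilon_{t_0+\epsilon}}_s\big\rangle\,ds + \mathbb{E}\,\big\langle h\,\mathcal{Y}^{t_0+\epsilon,\,Y^\epsilon_{t_0+\epsilon}}_T,\ \mathcal{Y}^{t_0+\epsilon,\,Y^\epsilon_{t_0+\epsilon}}_T\big\rangle \\
&= o(\epsilon) + \mathbb{E}\,\langle P_{t_0+\epsilon}Y^\epsilon_{t_0+\epsilon}, Y^\epsilon_{t_0+\epsilon}\rangle.
\end{aligned}
$$

The argument is then concluded by proving the following two relations:

$$
\mathbb{E}\,\langle (P_{t_0+\epsilon} - P_{t_0})Y^\epsilon_{t_0+\epsilon}, Y^\epsilon_{t_0+\epsilon}\rangle = o(\epsilon),
$$

$$
\mathbb{E}\,\langle P_{t_0}Y^\epsilon_{t_0+\epsilon}, Y^\epsilon_{t_0+\epsilon}\rangle = \mathbb{E}\int_{t_0}^{t_0+\epsilon} \langle P_s\,\delta^\epsilon\sigma_j(s,\cdot), \delta^\epsilon\sigma_j(s,\cdot)\rangle\,ds + o(\epsilon),
$$

since in that case we obtain

$$
\Delta^\epsilon_3 = \mathbb{E}\int_{t_0}^{t_0+\epsilon} \langle P_s\,\delta^\epsilon\sigma_j(s,\cdot), \delta^\epsilon\sigma_j(s,\cdot)\rangle\,ds + o(\epsilon).
$$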