
UNIVERSIDADE DE MACAU

Leader-Follower Stochastic Differential Game with Asymmetric Information and Applications

Jie Xiong

Department of Mathematics, University of Macau

Macau

Based on joint works with Shi and Wang

2015 Barrett Lecture, Knoxville, 14/05/2015


Outline

1 Motivation

2 Problem formulation

3 LQ Leader-Follower Stochastic Differential Game with Asymmetric Information

4 Follower’s optimization problem

5 Complete Information LQ Stochastic Optimal Control Problem of the Leader


1. A motivation

Example: Continuous Time Newsvendor Problems

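$D(\cdot)$ is the demand rate:

$$
\begin{cases}
dD(t) = a\big(\mu - kR(t) - D(t)\big)\,dt + \sigma\,dW(t) + \tilde\sigma\,d\tilde W(t),\\
D(0) = d_0 \in \mathbb{R},
\end{cases}
$$

where $R(t)$ is the retail price.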
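To get a feel for the demand dynamics, here is a minimal simulation sketch using the Euler-Maruyama scheme. The function name `simulate_demand`, the step count, and all parameter values in the usage are illustrative assumptions, not part of the lecture; the price path `price(t)` is taken as a given deterministic function.

```python
import math
import random


def simulate_demand(d0, a, mu, k, sigma, sigma_tilde, price,
                    T=1.0, n_steps=1000, seed=0):
    """Euler-Maruyama path of
    dD = a*(mu - k*R(t) - D) dt + sigma dW + sigma_tilde dW~
    with two independent Brownian motions W and W~ (hypothetical helper)."""
    rng = random.Random(seed)
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    path = [d0]
    for i in range(n_steps):
        t = i * dt
        d = path[-1]
        # Independent Gaussian increments for W and W~.
        dw = rng.gauss(0.0, sqrt_dt)
        dw_tilde = rng.gauss(0.0, sqrt_dt)
        drift = a * (mu - k * price(t) - d)
        path.append(d + drift * dt + sigma * dw + sigma_tilde * dw_tilde)
    return path


# Usage: constant retail price R = 5; demand mean-reverts toward mu - k*R.
demand_path = simulate_demand(d0=10.0, a=2.0, mu=20.0, k=1.0,
                              sigma=0.5, sigma_tilde=0.5,
                              price=lambda t: 5.0)
```

With both noise coefficients set to zero the scheme reduces to the deterministic mean-reversion toward $\mu - kR$, which gives a quick sanity check.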


The retailer obtains an expected profit

$$
J_1\big(q(\cdot), R(\cdot), w(\cdot)\big) = E\int_0^T \Big[(R(t) - S)\min[D(t), q(t)] - (w(t) - S)\,q(t)\Big]\,dt.
$$

Here $q(t)$ is the order rate and $w(t)$ is the wholesale price; unsold items are salvaged at unit price $S \ge 0$.


The manufacturer has a fixed production cost per unit $M \ge 0$ and an expected profit

$$
J_2\big(q(\cdot), R(\cdot), w(\cdot)\big) = E\int_0^T (w(t) - M)\,q(t)\,dt.
$$


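The information available at time $t$ to the retailer and to the manufacturer is $\mathcal{G}^1_t$ and $\mathcal{G}^2_t$, respectively, with $\mathcal{G}^1_t \neq \mathcal{G}^2_t$.

For any $w(\cdot)$, the retailer selects a $\mathcal{G}^1_t$-adapted pair of processes $(q^*(\cdot), R^*(\cdot))$ such that

$$
J_1\big(q^*(\cdot), R^*(\cdot), w(\cdot)\big) \equiv J_1\big(q^*(\cdot; w(\cdot)), R^*(\cdot; w(\cdot)), w(\cdot)\big) = \max_{q(\cdot),\,R(\cdot)} J_1\big(q(\cdot), R(\cdot), w(\cdot)\big).
$$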


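The manufacturer selects a $\mathcal{G}^2_t$-adapted process $w^*(\cdot)$ such that

$$
J_2\big(q^*(\cdot), R^*(\cdot), w^*(\cdot)\big) \equiv J_2\big(q^*(\cdot; w^*(\cdot)), R^*(\cdot; w^*(\cdot)), w^*(\cdot)\big) = \max_{w(\cdot)} J_2\big(q^*(\cdot; w(\cdot)), R^*(\cdot; w(\cdot)), w(\cdot)\big).
$$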


Example: Cooperative Advertising and Pricing Problems

$$
\begin{cases}
dx(t) = \Big[\rho\,u(t)\sqrt{1 - x(t)} - \delta\,x(t)\Big]dt + \sigma(x(t))\,dW(t) + \tilde\sigma(x(t))\,d\tilde W(t),\\
x(0) = x_0 \in [0, 1],
\end{cases}
$$

where $u(t)$ is the advertisement effort rate, $x(t)$ the awareness share, $\rho$ the response constant, and $\delta$ the rate at which potential consumers are lost.

The manufacturer decides the wholesale price $w(t)$ and the cooperative participation rate $\theta(t)$.
The retailer decides the effort $u(t)$ and the retail price $p(t)$.


Given $w(t)$ and $\theta(t)$, the retailer solves an optimization problem to maximize his expected profit

$$
J_1\big(w(\cdot), \theta(\cdot), u(\cdot), p(\cdot)\big) = E\int_0^T e^{-rt}\Big[(p(t) - w(t))\,D(p(t))\,x(t) - (1 - \theta(t))\,u^2(t)\Big]dt,
$$

where $D$ is the demand function.


The manufacturer’s optimization problem is to maximize his expected profit

$$
J_2\big(w(\cdot), \theta(\cdot), u(\cdot), p(\cdot)\big) = E\int_0^T e^{-rt}\Big[(w(t) - c)\,D(p(t))\,x(t) - \theta(t)\,u^2(t)\Big]dt,
$$

where $c \ge 0$ is the constant unit production cost.


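The information available at time $t$ to the retailer and to the manufacturer is $\mathcal{G}^1_t$ and $\mathcal{G}^2_t$, respectively, with $\mathcal{G}^1_t \neq \mathcal{G}^2_t$.

Given $w$ and $\theta$, the retailer chooses a $\mathcal{G}^1_t$-adapted pair $(u^*(\cdot), p^*(\cdot))$ such that

$$
J_1\big(w(\cdot), \theta(\cdot), u^*(\cdot), p^*(\cdot)\big) \equiv J_1\big(w(\cdot), \theta(\cdot), u^*(\cdot; w(\cdot), \theta(\cdot)), p^*(\cdot; w(\cdot), \theta(\cdot))\big) = \max_{u(\cdot),\,p(\cdot) \ge 0} J_1\big(w(\cdot), \theta(\cdot), u(\cdot), p(\cdot)\big).
$$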


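The manufacturer selects a $\mathcal{G}^2_t$-adapted pair $(w^*(\cdot), \theta^*(\cdot))$ such that

$$
\begin{aligned}
J_2\big(w^*(\cdot), \theta^*(\cdot), u^*(\cdot), p^*(\cdot)\big)
&\equiv J_2\big(w^*(\cdot), \theta^*(\cdot), u^*(\cdot; w^*(\cdot), \theta^*(\cdot)), p^*(\cdot; w^*(\cdot), \theta^*(\cdot))\big)\\
&= \max_{w(\cdot),\,0 \le \theta(\cdot) \le 1} J_2\big(w(\cdot), \theta(\cdot), u^*(\cdot; w(\cdot), \theta(\cdot)), p^*(\cdot; w(\cdot), \theta(\cdot))\big).
\end{aligned}
$$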


2. Problem formulation

State equation:

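$$
\begin{cases}
\begin{aligned}
dx^{u_1,u_2}(t) &= b\big(t, x^{u_1,u_2}(t), u_1(t), u_2(t)\big)\,dt + \sigma\big(t, x^{u_1,u_2}(t), u_1(t), u_2(t)\big)\,dW(t)\\
&\quad + \tilde\sigma\big(t, x^{u_1,u_2}(t), u_1(t), u_2(t)\big)\,d\tilde W(t),
\end{aligned}\\
x^{u_1,u_2}(0) = x_0.
\end{cases}
\tag{2.1}
$$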


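$\mathcal{G}^1_t \neq \mathcal{G}^2_t \subseteq \mathcal{F}_t$.

The admissible control set:

$$
\mathcal{U}_1 := \Big\{ u_1 \,\Big|\, u_1 : \Omega \times [0, T] \to U_1 \text{ is } \mathcal{G}^1_t\text{-adapted and } \sup_{0 \le t \le T} E|u_1(t)|^i \ \cdots
$$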


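Problem of the follower. For any $u_2(\cdot) \in \mathcal{U}_2$ chosen by the leader, choose a $\mathcal{G}^1_t$-adapted control $u_1^*(\cdot) = u_1^*(\cdot; u_2(\cdot)) \in \mathcal{U}_1$ such that

$$
J_1\big(u_1^*(\cdot), u_2(\cdot)\big) \equiv J_1\big(u_1^*(\cdot; u_2(\cdot)), u_2(\cdot)\big) = \inf_{u_1 \in \mathcal{U}_1} J_1\big(u_1(\cdot), u_2(\cdot)\big). \tag{2.4}
$$

Such a $u_1^*(\cdot) = u_1^*(\cdot; u_2(\cdot))$ is called a (partial information) optimal control.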


The leader would like to choose a $\mathcal{G}^2_t$-adapted control $u_2^*(\cdot)$ to minimize

$$
J_2\big(u_1^*(\cdot), u_2(\cdot)\big) = E\Big[\int_0^T g_2\big(t, x^{u_1^*,u_2}(t), u_1^*(t; u_2(t)), u_2(t)\big)\,dt + G_2\big(x^{u_1^*,u_2}(T)\big)\Big]. \tag{2.5}
$$


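Problem of the leader. Find a $\mathcal{G}^2_t$-adapted control $u_2^*(\cdot) \in \mathcal{U}_2$ such that

$$
J_2\big(u_1^*(\cdot), u_2^*(\cdot)\big) = J_2\big(u_1^*(\cdot; u_2^*(\cdot)), u_2^*(\cdot)\big) = \inf_{u_2 \in \mathcal{U}_2} J_2\big(u_1^*(\cdot; u_2(\cdot)), u_2(\cdot)\big). \tag{2.6}
$$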


Isaacs (1954-5): differential games.
Basar and Olsder (1982): summarizing monograph.
Stackelberg (1934): leader-follower game introduced.
Yong (2002): LQ, complete information.
Bensoussan et al. (2012): stochastic maximum principle.


3. LQ Leader-Follower Stochastic Differential Game with Asymmetric Information; Stochastic maximum principle under partial information.

The state equation

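$$
\begin{cases}
\begin{aligned}
dx^{u_1,u_2}(t) &= \big[A(t)x^{u_1,u_2}(t) + B_1(t)u_1(t) + B_2(t)u_2(t)\big]dt\\
&\quad + \big[C(t)x^{u_1,u_2}(t) + D_1(t)u_1(t) + D_2(t)u_2(t)\big]dW(t)\\
&\quad + \big[\tilde C(t)x^{u_1,u_2}(t) + \tilde D_1(t)u_1(t) + \tilde D_2(t)u_2(t)\big]d\tilde W(t),
\end{aligned}\\
x^{u_1,u_2}(0) = x_0.
\end{cases}
\tag{3.7}
$$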


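Given $u_2 \in \mathcal{U}_2$, the follower minimizes his cost functional

$$
J_1\big(u_1(\cdot), u_2(\cdot)\big) = \frac{1}{2} E\Big[\int_0^T \Big(\big\langle Q_1(t)x^{u_1,u_2}(t), x^{u_1,u_2}(t)\big\rangle + \big\langle N_1(t)u_1(t), u_1(t)\big\rangle\Big)dt + \big\langle G_1 x^{u_1,u_2}(T), x^{u_1,u_2}(T)\big\rangle\Big]. \tag{3.8}
$$

Information: $\mathcal{G}^1_t = \sigma\{\tilde W(s);\ 0 \le s \le t\}$.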


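Stochastic maximum principle under partial information.

Probability space $(\Omega, \mathcal{F}, \hat P)$.

State equation (an FBSDE):

$$
\begin{cases}
dx = b(t, x, v)\,dt + \sigma(t, x, v)\,dW + \tilde\sigma(t, x, v)\,d\tilde W,\\
-dy = g(t, x, y, z, \tilde z, v)\,dt - z\,dW - \tilde z\,dY,\\
x(0) = x_0, \quad y(1) = f(x(1)).
\end{cases}
\tag{3.9}
$$

Observation:

$$
\begin{cases}
dY = h(t, x)\,dt + d\tilde W,\\
Y(0) = 0.
\end{cases}
\tag{3.10}
$$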


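Admissible controls: $v \in \mathcal{A}$, where $v$ is $\mathcal{G}_t$-adapted and $\mathcal{G}_t = \mathcal{F}^Y_t$.

Control problem: find $u \in \mathcal{A}$ such that

$$
J(u) = \min_{v \in \mathcal{A}} J(v),
$$

where

$$
J(v) = \hat E\Big(\int_0^1 \ell(t, x^v, y^v, z^v, \tilde z^v, v)\,dt + \phi(x^v(1)) + \gamma(y^v(0))\Big).
$$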


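Hypothesis (H1):

The coefficients are continuously differentiable with bounded first-order partial derivatives.

$\tilde\sigma$ and $h$ are bounded.

Hypothesis (H2):

$\ell, \phi, \gamma$ are continuously differentiable.

$$
\hat E\Big(\int_0^1 |\ell(t, x^v, y^v, z^v, \tilde z^v, v)|\,dt + |\phi(x^v(1))| + |\gamma(y^v(0))|\Big)
$$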


When $b = \sigma = \tilde\sigma = 0$: studied by Huang-Wang-Xiong (SICON 2009).
General form: Wang-Wu-Xiong (SICON 2013).
Details on the LQ case: Wang-Wu-Xiong (IEEE TAC 2015).


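Replace $b$ and $\tilde W$ in the state equation by $\tilde b$ and $Y$, respectively, where

$$
\tilde b = b - \tilde\sigma h.
$$

State equation:

$$
\begin{cases}
dx = \tilde b(t, x, v)\,dt + \sigma(t, x, v)\,dW + \tilde\sigma(t, x, v)\,dY,\\
-dy = g(t, x, y, z, \tilde z, v)\,dt - z\,dW - \tilde z\,dY,\\
x(0) = x_0, \quad y(1) = f(x(1)).
\end{cases}
\tag{3.11}
$$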


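Let

$$
Z^v(t) = \exp\Big\{\int_0^t h(s, x^v)\,dY - \frac{1}{2}\int_0^t h^2(s, x^v)\,ds\Big\}.
$$

So

$$
dZ^v = Z^v h(t, x^v)\,dY. \tag{3.12}
$$

Let $P \sim \hat P$ be such that $d\hat P = Z^v(1)\,dP$.

Then, under $P$, $(W, Y)$ is a Brownian motion, and

$$
J(v) = E\Big(\int_0^1 Z^v\,\ell(t, x^v, y^v, z^v, \tilde z^v, v)\,dt + Z^v(1)\,\phi(x^v(1)) + \gamma(y^v(0))\Big).
$$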
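A short worked step behind (3.12) and the rewritten cost, sketched under two assumptions that are not stated on the slide: $Z^v$ is a true $P$-martingale (plausible here because $h$ is bounded by (H1)), and $y^v(0)$ is deterministic. The auxiliary process $X$ is introduced only for this illustration. Since $Y$ is a Brownian motion under $P$, Itô's formula applied to $Z^v = e^{X}$ gives:

```latex
\begin{aligned}
X(t) &:= \int_0^t h(s, x^v)\,dY - \tfrac{1}{2}\int_0^t h^2(s, x^v)\,ds,
\qquad d\langle X\rangle_t = h^2(t, x^v)\,dt,\\
dZ^v &= Z^v\,dX + \tfrac{1}{2} Z^v\,d\langle X\rangle
      = Z^v\big(h\,dY - \tfrac{1}{2}h^2\,dt\big) + \tfrac{1}{2} Z^v h^2\,dt
      = Z^v h\,dY,\\
\hat E\big[\ell(t)\big] &= E\big[Z^v(1)\,\ell(t)\big]
      = E\big[E[Z^v(1)\mid\mathcal{F}_t]\,\ell(t)\big]
      = E\big[Z^v(t)\,\ell(t)\big],\\
\hat E\big[\gamma(y^v(0))\big] &= \gamma(y^v(0))\,E\big[Z^v(1)\big] = \gamma(y^v(0)).
\end{aligned}
```

The third line is why $Z^v(t)$ rather than $Z^v(1)$ multiplies the running cost, and the last line is why no weight appears in front of $\gamma$.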


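Hypothesis (H3):

For $\xi$ bounded and $\mathcal{G}_t$-measurable, if

$$
v(s) \equiv \xi\,1_{[t, t+\tau)}(s),
$$

then $v \in \mathcal{A}$.

For any $u \in \mathcal{A}$ and any bounded $\mathcal{G}_s$-adapted $v$, there exists $\delta > 0$ such that $u + \epsilon v \in \mathcal{A}$ whenever $|\epsilon| < \delta$.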


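Adjoint equations:

$$
\begin{cases}
\begin{aligned}
dp &= \big(g_y(t, x, y, z, \tilde z, u)\,p - \ell_y(t, x, y, z, \tilde z, u)\big)\,dt + \big(g_z(t, x, y, z, \tilde z, u)\,p - \ell_z(t, x, y, z, \tilde z, u)\big)\,dW\\
&\quad + \big(g_{\tilde z}(t, x, y, z, \tilde z, u)\,p - h(t, x)\big)\,d\tilde W,\\
-dq &= \Big\{\big(b_x(t, x, u) - \tilde\sigma(t, x, u)\,h_x(t, x)\big)q + \sigma_x(t, x, u)\,k + \tilde\sigma_x(t, x, u)\,\tilde k\\
&\qquad - g_x(t, x, y, z, \tilde z, u)\,p - \ell_x(t, x, y, z, \tilde z, u)\Big\}\,dt - k\,dW - \tilde k\,d\tilde W,
\end{aligned}\\
p(0) = -\gamma_y(y(0)), \quad q(1) = -f_x(x(1))\,p(1) + \phi_x(x(1)),
\end{cases}
\tag{3.13}
$$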


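and

$$
\begin{cases}
-dP = \ell(t, x, y, z, \tilde z, u)\,dt - Q\,dW - \tilde Q\,d\tilde W,\\
P(1) = \phi(x(1)).
\end{cases}
\tag{3.14}
$$

Hamiltonian:

$$
H(t, x, y, z, \tilde z, v; p, q, k, \tilde k, \tilde Q) = bq + \sigma k + \tilde\sigma\tilde k + h\tilde Q + \ell - (g - h\tilde z)\,p.
$$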

Theorem

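If $u$ is a local minimum for $J(\cdot)$, then

$$
E\Big(H_u(t, x, y, z, \tilde z, v; p, q, k, \tilde k, \tilde Q)\,\Big|\,\mathcal{G}_t\Big) = 0.
$$

Idea of proof: set

$$
\frac{d}{d\epsilon} J(u + \epsilon v)\Big|_{\epsilon = 0} = 0.
$$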


Remark

(3.14) is for $Z^v(t)$. If $h = 0$, this equation is not needed.

The case $h = b = \sigma = \tilde\sigma = 0$ is proved in Huang-Wang-Xiong (2008, SICON).


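4. Follower’s optimization problem

Step 1. (Optimal control)

Hamiltonian function:

$$
\begin{aligned}
H_1(t, x, u_1, u_2; q, k, \tilde k)
&= \big\langle q, A(t)x + B_1(t)u_1 + B_2(t)u_2\big\rangle + \big\langle k, C(t)x + D_1(t)u_1 + D_2(t)u_2\big\rangle\\
&\quad + \big\langle \tilde k, \tilde C(t)x + \tilde D_1(t)u_1 + \tilde D_2(t)u_2\big\rangle - \frac{1}{2}\big\langle Q_1(t)x, x\big\rangle - \frac{1}{2}\big\langle N_1(t)u_1, u_1\big\rangle.
\end{aligned}
\tag{4.15}
$$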


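Then,

$$
0 = N_1(t)u_1^*(t) - B_1^\top(t)\,E\big[q(t)\,\big|\,\mathcal{G}^1_t\big] - D_1^\top(t)\,E\big[k(t)\,\big|\,\mathcal{G}^1_t\big] - \tilde D_1^\top(t)\,E\big[\tilde k(t)\,\big|\,\mathcal{G}^1_t\big], \quad \text{a.e. } t \in [0, T], \tag{4.16}
$$

where the $\mathcal{F}_t$-adapted triple $(q(\cdot), k(\cdot), \tilde k(\cdot))$ satisfies the BSDE

$$
\begin{cases}
-dq(t) = \big[A^\top(t)q(t) + C^\top(t)k(t) + \tilde C^\top(t)\tilde k(t) - Q_1(t)x^{u_1^*,u_2}(t)\big]dt - k(t)\,dW(t) - \tilde k(t)\,d\tilde W(t),\\
q(T) = -G_1 x^{u_1^*,u_2}(T).
\end{cases}
\tag{4.17}
$$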
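As a sketch of how (4.16) follows from the Hamiltonian (4.15), assuming $N_1(t)$ is symmetric (not stated explicitly on the slide): differentiate $H_1$ in $u_1$ and apply the partial-information stationarity condition, using that $u_1^*(t)$ is itself $\mathcal{G}^1_t$-measurable.

```latex
\begin{aligned}
\frac{\partial H_1}{\partial u_1}
  &= B_1^\top(t)\,q + D_1^\top(t)\,k + \tilde D_1^\top(t)\,\tilde k - N_1(t)\,u_1,\\
0 &= E\Big[\frac{\partial H_1}{\partial u_1}\Big|_{u_1 = u_1^*(t)}\,\Big|\,\mathcal{G}^1_t\Big]
   = B_1^\top(t)\,E\big[q(t)\mid\mathcal{G}^1_t\big] + D_1^\top(t)\,E\big[k(t)\mid\mathcal{G}^1_t\big]
     + \tilde D_1^\top(t)\,E\big[\tilde k(t)\mid\mathcal{G}^1_t\big] - N_1(t)\,u_1^*(t).
\end{aligned}
```

Multiplying by $-1$ gives exactly (4.16).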


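Step 2. (Optimal filtering)

Let

$$
\hat f(t) := E\big[f(t)\,\big|\,\mathcal{G}^1_t\big], \quad f = q, k, \tilde k.
$$

How to derive the filtering equation? Deriving it directly from (4.17) is impossible, so solve (4.17) first. Guess:

$$
q(t) = -P_1(t)\,x^{u_1^*,u_2}(t) - \varphi(t), \quad t \in [0, T], \tag{4.18}
$$

and set

$$
\begin{cases}
d\varphi(t) = \alpha(t)\,dt + \beta(t)\,d\tilde W(t),\\
\varphi(T) = 0.
\end{cases}
\tag{4.19}
$$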


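Applying Itô's formula to $q(t)$ in (4.18) and comparing with (4.17), we get

$$
k(t) = -P_1(t)\big[C(t)x^{u_1^*,u_2}(t) + D_1(t)u_1^*(t) + D_2(t)u_2(t)\big], \tag{4.20}
$$

$$
\tilde k(t) = -P_1(t)\big[\tilde C(t)x^{u_1^*,u_2}(t) + \tilde D_1(t)u_1^*(t) + \tilde D_2(t)u_2(t)\big] - \beta(t), \tag{4.21}
$$

$$
\begin{aligned}
\alpha(t) &= \big[-\dot P_1(t) - P_1(t)A(t) - A^\top(t)P_1(t) - Q_1(t)\big]x^{u_1^*,u_2}(t) - P_1(t)B_1(t)u_1^*(t)\\
&\quad - P_1(t)B_2(t)u_2(t) - A^\top(t)\varphi(t) + C^\top(t)k(t) + \tilde C^\top(t)\tilde k(t).
\end{aligned}
\tag{4.22}
$$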
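The comparison step can be sketched as follows, assuming $P_1(\cdot)$ is deterministic and differentiable and writing $x = x^{u_1^*,u_2}$ for brevity. Differentiating the guess (4.18) with the state equation (3.7) and (4.19) gives

```latex
\begin{aligned}
dq &= -\dot P_1 x\,dt - P_1\,dx - d\varphi\\
   &= \big[-\dot P_1 x - P_1\big(Ax + B_1u_1^* + B_2u_2\big) - \alpha\big]\,dt
      - P_1\big(Cx + D_1u_1^* + D_2u_2\big)\,dW\\
   &\quad - \big[P_1\big(\tilde Cx + \tilde D_1u_1^* + \tilde D_2u_2\big) + \beta\big]\,d\tilde W,\\
\text{while (4.17) reads}\quad
dq &= -\big[A^\top q + C^\top k + \tilde C^\top\tilde k - Q_1x\big]\,dt + k\,dW + \tilde k\,d\tilde W.
\end{aligned}
```

Matching the $dW$ and $d\tilde W$ coefficients yields (4.20) and (4.21); matching the $dt$ coefficients and inserting $q = -P_1x - \varphi$ yields (4.22).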


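Hence,

$$
\hat q(t) = -P_1(t)\,\hat x^{u_1^*,\hat u_2}(t) - \varphi(t), \tag{4.23}
$$

$$
\hat k(t) = -P_1(t)\big[C(t)\hat x^{u_1^*,\hat u_2}(t) + D_1(t)u_1^*(t) + D_2(t)\hat u_2(t)\big], \tag{4.24}
$$

$$
\hat{\tilde k}(t) = -P_1(t)\big[\tilde C(t)\hat x^{u_1^*,\hat u_2}(t) + \tilde D_1(t)u_1^*(t) + \tilde D_2(t)\hat u_2(t)\big] - \beta(t), \tag{4.25}
$$

with $\hat u_2(t) := E\big[u_2(t)\,\big|\,\mathcal{G}^1_t\big]$.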


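Using a filtering technique (Lemma 5.4 in Xiong (2008)), we get

$$
\begin{cases}
\begin{aligned}
d\hat x^{u_1^*,\hat u_2}(t) &= \big[A(t)\hat x^{u_1^*,\hat u_2}(t) + B_1(t)u_1^*(t) + B_2(t)\hat u_2(t)\big]dt\\
&\quad + \big[\tilde C(t)\hat x^{u_1^*,\hat u_2}(t) + \tilde D_1(t)u_1^*(t) + \tilde D_2(t)\hat u_2(t)\big]d\tilde W(t),
\end{aligned}\\
\hat x^{u_1^*,\hat u_2}(0) = x_0,
\end{cases}
\tag{4.26}
$$

and

$$
\begin{cases}
-d\hat q(t) = \big\{A^\top(t)\hat q(t) + C^\top(t)\hat k(t) + \tilde C^\top(t)\hat{\tilde k}(t) - Q_1(t)\hat x^{u_1^*,\hat u_2}(t)\big\}dt - \hat{\tilde k}(t)\,d\tilde W(t),\\
\hat q(T) = -G_1\hat x^{u_1^*,\hat u_2}(T).
\end{cases}
\tag{4.27}
$$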


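Step 3. (Optimal state feedback control)

By (4.16) we arrive at

$$
\begin{aligned}
u_1^*(t) = -\tilde N_1(t)^{-1}\Big[&\big(B_1^\top(t)P_1(t) + D_1^\top(t)P_1(t)C(t) + \tilde D_1^\top(t)P_1(t)\tilde C(t)\big)\hat x^{u_1^*,\hat u_2}(t)\\
&+ \big(D_1^\top(t)P_1(t)D_2(t) + \tilde D_1^\top(t)P_1(t)\tilde D_2(t)\big)\hat u_2(t) + B_1^\top(t)\varphi(t) + \tilde D_1^\top(t)\beta(t)\Big],
\end{aligned}
\tag{4.28}
$$

where

$$
\tilde N_1(t) := N_1(t) + D_1^\top(t)P_1(t)D_1(t) + \tilde D_1^\top(t)P_1(t)\tilde D_1(t). \tag{4.29}
$$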
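The algebra behind (4.28) can be sketched as follows, assuming $\tilde N_1(t)$ is invertible and writing $\hat x = \hat x^{u_1^*,\hat u_2}$ with the time argument suppressed. Substituting the filtered quantities (4.23)-(4.25) into (4.16):

```latex
\begin{aligned}
0 &= N_1u_1^* + B_1^\top\big(P_1\hat x + \varphi\big)
     + D_1^\top P_1\big(C\hat x + D_1u_1^* + D_2\hat u_2\big)
     + \tilde D_1^\top\big[P_1\big(\tilde C\hat x + \tilde D_1u_1^* + \tilde D_2\hat u_2\big) + \beta\big]\\
  &= \underbrace{\big(N_1 + D_1^\top P_1D_1 + \tilde D_1^\top P_1\tilde D_1\big)}_{\tilde N_1}u_1^*
     + \big(B_1^\top P_1 + D_1^\top P_1C + \tilde D_1^\top P_1\tilde C\big)\hat x\\
  &\quad + \big(D_1^\top P_1D_2 + \tilde D_1^\top P_1\tilde D_2\big)\hat u_2
     + B_1^\top\varphi + \tilde D_1^\top\beta .
\end{aligned}
```

Solving for $u_1^*$ gives the feedback form (4.28) with $\tilde N_1$ as in (4.29).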


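Substituting (4.28) into (4.22), we obtain the Riccati equation

$$
\begin{cases}
\begin{aligned}
&\dot P_1(t) + P_1(t)A(t) + A^\top(t)P_1(t) + C^\top(t)P_1(t)C(t) + \tilde C^\top(t)P_1(t)\tilde C(t) + Q_1(t)\\
&\quad - \big(P_1(t)B_1(t) + C^\top(t)P_1(t)D_1(t) + \tilde C^\top(t)P_1(t)\tilde D_1(t)\big)\tilde N_1^{-1}(t)\\
&\qquad \times\big(B_1^\top(t)P_1(t) + D_1^\top(t)P_1(t)C(t) + \tilde D_1^\top(t)P_1(t)\tilde C(t)\big) = 0,
\end{aligned}\\
P_1(T) = G_1.
\end{cases}
\tag{4.30}
$$

Suppose it admits a unique differentiable solution $P_1(\cdot)$.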
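For intuition, the scalar, constant-coefficient version of the Riccati equation (4.30) can be integrated numerically. The sketch below is an illustration only: the function name `solve_riccati_scalar`, the restriction to scalars, and the choice of a classical Runge-Kutta step are assumptions, and it presumes $\tilde N_1$ from (4.29) stays nonzero along the path.

```python
import math


def solve_riccati_scalar(A, B1, C, C_t, D1, D1_t, Q1, N1, G1,
                         T=1.0, n_steps=1000):
    """Integrate the scalar form of (4.30) backward from P1(T) = G1 with RK4.

    C_t and D1_t stand for the tilde coefficients. Returns P1 on the grid
    t_i = i*T/n_steps, i = 0..n_steps (hypothetical helper)."""

    def rhs(p):
        n_tilde = N1 + (D1 ** 2 + D1_t ** 2) * p       # scalar (4.29)
        s1 = p * (B1 + C * D1 + C_t * D1_t)             # P1*B1 + C*P1*D1 + C~*P1*D~1
        # (4.30) solved for dP1/dt.
        return -(2.0 * A * p + (C ** 2 + C_t ** 2) * p + Q1 - s1 * s1 / n_tilde)

    h = T / n_steps
    p = G1
    values = [p]
    for _ in range(n_steps):
        # One RK4 step of size -h, i.e. from time t to time t - h.
        k1 = rhs(p)
        k2 = rhs(p - 0.5 * h * k1)
        k3 = rhs(p - 0.5 * h * k2)
        k4 = rhs(p - h * k3)
        p = p - (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        values.append(p)
    values.reverse()  # values[0] is P1(0), values[-1] is P1(T)
    return values


# Usage: with A = C = C~ = D1 = D~1 = 0 and B1 = Q1 = N1 = 1, G1 = 0,
# the equation reduces to dP1/dt = -(1 - P1^2), whose solution is tanh(T - t).
p1 = solve_riccati_scalar(A=0.0, B1=1.0, C=0.0, C_t=0.0, D1=0.0, D1_t=0.0,
                          Q1=1.0, N1=1.0, G1=0.0)
```

The closed-form special case in the usage gives a convenient check of the sign conventions in (4.30).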


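$$
\alpha(t) = -\Big\{\big[A^\top(t) - \tilde S_1(t)\tilde N_1(t)^{-1}B_1^\top(t)\big]\varphi(t) + \big[\tilde C^\top(t) - \tilde S_1(t)\tilde N_1(t)^{-1}\tilde D_1^\top(t)\big]\beta(t) + \big[-\tilde S_1(t)\tilde N_1(t)^{-1}\tilde S(t) + \tilde S_2(t)\big]\hat u_2(t)\Big\},
\tag{4.31}
$$

which is $\mathcal{G}^1_t$-adapted, where

$$
\begin{aligned}
\tilde S(t) &:= D_1^\top(t)P_1(t)D_2(t) + \tilde D_1^\top(t)P_1(t)\tilde D_2(t),\\
\tilde S_1(t) &:= P_1(t)B_1(t) + C^\top(t)P_1(t)D_1(t) + \tilde C^\top(t)P_1(t)\tilde D_1(t),\\
\tilde S_2(t) &:= P_1(t)B_2(t) + C^\top(t)P_1(t)D_2(t) + \tilde C^\top(t)P_1(t)\tilde D_2(t), \quad t \in [0, T].
\end{aligned}
$$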


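Then,

$$
\begin{cases}
\begin{aligned}
-d\varphi(t) &= \Big\{\big[A^\top(t) - \tilde S_1(t)\tilde N_1(t)^{-1}B_1^\top(t)\big]\varphi(t) + \big[\tilde C^\top(t) - \tilde S_1(t)\tilde N_1(t)^{-1}\tilde D_1^\top(t)\big]\beta(t)\\
&\qquad + \big[-\tilde S_1(t)\tilde N_1(t)^{-1}\tilde S(t) + \tilde S_2(t)\big]\hat u_2(t)\Big\}dt - \beta(t)\,d\tilde W(t),
\end{aligned}\\
\varphi(T) = 0,
\end{cases}
\tag{4.32}
$$

which admits a unique $\mathcal{G}^1_t$-adapted solution $(\varphi(\cdot), \beta(\cdot))$.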


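Finally, $\hat x^{u_1^*,\hat u_2}$ solves

$$
\begin{cases}
\begin{aligned}
d\hat x^{u_1^*,\hat u_2}(t) &= \big[\tilde A(t)\hat x^{u_1^*,\hat u_2}(t) + \tilde F_1(t)\varphi(t) + \tilde B_1(t)\beta(t) + \tilde B_2(t)\hat u_2(t)\big]dt\\
&\quad + \big[\tilde{\tilde C}(t)\hat x^{u_1^*,\hat u_2}(t) + \tilde B_1^\top(t)\varphi(t) + \tilde F_3(t)\beta(t) + \tilde{\tilde D}_2(t)\hat u_2(t)\big]d\tilde W(t),
\end{aligned}\\
\hat x^{u_1^*,\hat u_2}(0) = x_0,
\end{cases}
\tag{4.33}
$$

where

$$
\begin{aligned}
\tilde A(t) &:= A(t) - B_1(t)\tilde N_1(t)^{-1}\tilde S_1^\top(t),\\
\tilde B_2(t) &:= B_2(t) - B_1(t)\tilde N_1(t)^{-1}\tilde S(t),\\
\tilde F_1(t) &:= -B_1(t)\tilde N_1(t)^{-1}B_1^\top(t),\\
\tilde B_1(t) &:= -B_1(t)\tilde N_1(t)^{-1}\tilde D_1^\top(t),\\
\tilde{\tilde C}(t) &:= \tilde C(t) - \tilde D_1(t)\tilde N_1(t)^{-1}\tilde S_1^\top(t),\\
\tilde F_3(t) &:= -\tilde D_1(t)\tilde N_1(t)^{-1}\tilde D_1^\top(t),\\
\tilde{\tilde D}_2(t) &:= \tilde D_2(t) - \tilde D_1(t)\tilde N_1(t)^{-1}\tilde S(t),\\
\tilde F_4(t) &:= -\tilde S_1(t)\tilde N_1(t)^{-1}\tilde S(t) + \tilde S_2(t).
\end{aligned}
$$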


Theorem 3.1 Let the Riccati equation (4.30) admit a differentiable solution $P_1(\cdot)$ and let the matrix $\tilde N_1(t)$ in (4.29) be invertible. For any $u_2(\cdot)$ chosen by the leader, the Problem of the follower admits an optimal control $u_1^*(\cdot)$ in the state feedback form (4.28), where the triple of processes $(\hat x^{u_1^*,\hat u_2}(\cdot), \varphi(\cdot), \beta(\cdot))$ is the unique $\mathcal{G}^1_t$-adapted solution to the FBSDE (4.32)-(4.33).


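5. Complete Information LQ Stochastic Optimal Control Problem of the Leader

The leader knows the complete information $\mathcal{F}_t$ at any time $t$. The state equations faced by the leader consist of the BSDE (4.32) and the SDE (3.7). By (4.28), they can be written as

$$
\begin{cases}
\begin{aligned}
dx^{u_2}(t) &= \big[A(t)x^{u_2}(t) + \big(\tilde A(t) - A(t)\big)\hat x^{\hat u_2}(t) + \tilde F_1(t)\varphi(t) + \tilde B_1(t)\beta(t) + B_2(t)u_2(t) + \big(\tilde B_2(t) - B_2(t)\big)\hat u_2(t)\big]dt\\
&\quad + \big[C(t)x^{u_2}(t) + \tilde F_5(t)\hat x^{\hat u_2}(t) + \tilde{\tilde B}_1(t)^\top\varphi(t) + \tilde{\tilde D}_1(t)\beta(t) + D_2(t)u_2(t) + \tilde F_2(t)\hat u_2(t)\big]dW(t)\\
&\quad + \big[\tilde C(t)x^{u_2}(t) + \big(\tilde{\tilde C}(t) - \tilde C(t)\big)\hat x^{\hat u_2}(t) + \tilde B_1(t)^\top\varphi(t) + \tilde F_3(t)\beta(t) + \tilde D_2(t)u_2(t) + \big(\tilde{\tilde D}_2(t) - \tilde D_2(t)\big)\hat u_2(t)\big]d\tilde W(t),\\
-d\varphi(t) &= \big[\tilde A^\top(t)\varphi(t) + \tilde{\tilde C}^\top(t)\beta(t) + \tilde F_4(t)\hat u_2(t)\big]dt - \beta(t)\,d\tilde W(t),
\end{aligned}\\
x^{u_2}(0) = x_0, \quad \varphi(T) = 0.
\end{cases}
\tag{5.34}
$$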


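The leader’s cost functional:

$$
\tilde J_2(u_2(\cdot)) := J_2\big(u_1^*(\cdot), u_2(\cdot)\big) = \frac{1}{2} E\Big[\int_0^T \Big(\big\langle Q_2(t)x^{u_1^*,u_2}(t), x^{u_1^*,u_2}(t)\big\rangle + \big\langle N_2(t)u_2(t), u_2(t)\big\rangle\Big)dt + \big\langle G_2 x^{u_1^*,u_2}(T), x^{u_1^*,u_2}(T)\big\rangle\Big]. \tag{5.35}
$$

This complete-information conditional mean-field LQ problem with state (5.34) and cost (5.35) can be solved explicitly using a general result.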


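General conditional mean-field SMP. State:

$$
\begin{cases}
\begin{aligned}
dx^{u_2}(t) &= b(t)\,dt + \sigma(t)\,dW(t) + \tilde\sigma(t)\,d\tilde W(t),\\
-dq(t) &= \Big\{b_x(t)q(t) + \sum_{j=1}^{d_1}\sigma^j_x(t)k_j(t) + \sum_{j=1}^{d_2}\tilde\sigma^j_x(t)\tilde k_j(t) - g_{1x}(t)\Big\}dt - k(t)\,dW(t) - \tilde k(t)\,d\tilde W(t),
\end{aligned}\\
x^{u_2}(0) = x_0, \quad q(T) = -G_{1x}\big(x^{u_2}(T)\big),
\end{cases}
\tag{5.36}
$$

where $b$ is a function of $(t, x, \hat x, u_2, \hat u_2, q, k, \tilde k, \hat k, \hat{\tilde k})$.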


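Cost:

$$
J_2(u_2(\cdot)) = E\Big[\int_0^T g_2(t)\,dt + G_2\big(x^{u_2}(T)\big)\Big].
$$

Problem of the leader. Find a $\mathcal{G}^2_t$-adapted control $u_2^*(\cdot) \in \mathcal{U}_2$ such that

$$
J_2(u_2^*(\cdot)) = \inf_{u_2 \in \mathcal{U}_2} J_2(u_2(\cdot)), \tag{5.37}
$$

subject to (5.36).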


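Define the Hamiltonian

$$
\begin{aligned}
H_2(t, x, u_2, \hat x, \hat u_2, q, k, \tilde k; y, z, \tilde z, p)
&= \big\langle y, b(t)\big\rangle + \operatorname{tr}\big\{z^\top\sigma(t)\big\} + \operatorname{tr}\big\{\tilde z^\top\tilde\sigma(t)\big\} + g_2(t)\\
&\quad - \Big\langle p,\ b_x(t)q + \sum_{j=1}^{d_1}\sigma^j_x(t)k_j + \sum_{j=1}^{d_2}\tilde\sigma^j_x(t)\tilde k_j - g_{1x}(t)\Big\rangle.
\end{aligned}
\tag{5.38}
$$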


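Let $(y(\cdot), z(\cdot), \tilde z(\cdot), p(\cdot))$ be the unique $\mathcal{F}_t$-adapted solution to the adjoint conditional mean-field FBSDE of the leader

$$
\begin{aligned}
dp(t) &= \Big\{b_x^*(t)p(t) + E\big[b_{\hat x}^*(t)p(t)\,\big|\,\mathcal{G}^1_t\big]\Big\}dt
+ \sum_{j=1}^{d_1}\Big\{\sigma_x^{*j}(t)p(t) + E\big[\sigma_{\hat x}^{*j}(t)p(t)\,\big|\,\mathcal{G}^1_t\big]\Big\}dW^j(t)\\
&\quad + \sum_{j=1}^{d_2}\Big\{\tilde\sigma_x^{*j}(t)p(t) + E\big[\tilde\sigma_{\hat x}^{*j}(t)p(t)\,\big|\,\mathcal{G}^1_t\big]\Big\}d\tilde W^j(t),
\end{aligned}
\tag{5.39}
$$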


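$$
\begin{aligned}
-dy &= \Big\{b_x y + E\big[b_{\hat x}y\,\big|\,\mathcal{G}^1_t\big]
+ \sum_{j=1}^{d_1}\Big[\sigma^j_x z^j + E\big[\sigma^j_{\hat x}z^j\,\big|\,\mathcal{G}^1_t\big]\Big]
+ \sum_{j=1}^{d_2}\Big[\tilde\sigma^j_x(t)\tilde z^j(t) + E\big[\tilde\sigma^j_{\hat x}(t)\tilde z^j(t)\,\big|\,\mathcal{G}^1_t\big]\Big]\\
&\qquad - \sum_{i=1}^{n}\Big\{\frac{\partial b_x}{\partial x_i}(t)q(t)p_i(t) + E\Big[\frac{\partial b_x}{\partial\hat x_i}(t)q(t)p_i(t)\,\Big|\,\mathcal{G}^1_t\Big]\Big\}\\
&\qquad - \sum_{i=1}^{n}\Big\{\frac{\partial}{\partial x_i}\Big(\sum_{j=1}^{d_1}\sigma^j_x k_j\Big)p_i + E\Big[\frac{\partial}{\partial\hat x_i}\Big(\sum_{j=1}^{d_1}\sigma^j_x k_j\Big)p_i\,\Big|\,\mathcal{G}^1_t\Big]\Big\}\\
&\qquad - \sum_{i=1}^{n}\Big\{\frac{\partial}{\partial x_i}\Big(\sum_{j=1}^{d_2}\tilde\sigma^j_x\tilde k_j\Big)p_i + E\Big[\frac{\partial}{\partial\hat x_i}\Big(\sum_{j=1}^{d_2}\tilde\sigma^j_x\tilde k_j\Big)p_i\,\Big|\,\mathcal{G}^1_t\Big]\Big\}\\
&\qquad + g_{1xx}p + E\big[g_{1x\hat x}p\,\big|\,\mathcal{G}^1_t\big] + g_{2x} + E\big[g_{2\hat x}\,\big|\,\mathcal{G}^1_t\big]\Big\}dt - z\,dW - \tilde z\,d\tilde W,
\end{aligned}
\tag{5.40}
$$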


with boundary conditions

$$
p(0) = 0, \quad y(T) = G_{1xx}\big(x^*(T)\big)p(T) + G_{2x}\big(x^*(T)\big), \tag{5.41}
$$

where we have used $\phi \equiv \phi\big(t, x^*(t), \hat x^*(t), u_2^*(t), \hat u_2^*(t)\big)$ for $\phi = b, \sigma, \tilde\sigma, g_1, g_2$ and all their derivatives.

Related work: Andersson and Djehiche (2011), Li (2012), Yong (2013).


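Proposition (Maximum principle of conditional mean-field FBSDE with partial information)

Let $u_2^*(\cdot) \in \mathcal{U}_2$ be the optimal control for the Problem of the leader and let $(x^*(\cdot), q^*(\cdot), k^*(\cdot), \tilde k^*(\cdot))$ be the corresponding optimal state, which is the unique $\mathcal{F}_t$-adapted solution to (5.36). Let $(y(\cdot), z(\cdot), \tilde z(\cdot), p(\cdot))$ be the unique $\mathcal{F}_t$-adapted solution to the adjoint equation. Then

$$
E\Big\{\frac{\partial H_2}{\partial u_2} + E\Big[\frac{\partial H_2}{\partial\hat u_2}\,\Big|\,\mathcal{G}^1_t\Big]\Big|_{u_2 = u_2^*(t)}\,\Big|\,\mathcal{G}^2_t\Big\} = 0, \quad \text{a.e. } t \in [0, T]. \tag{5.42}
$$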


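Now we come back to the LQ problem. Define the Hamiltonian

$$
\begin{aligned}
H_2(t, x^{u_2}, u_2, \varphi, \beta; y, z, \tilde z, p)
&= \big\langle y,\ A(t)x^{u_2} + \big(\tilde A(t) - A(t)\big)\hat x^{\hat u_2} + \tilde F_1(t)\varphi + \tilde B_1(t)\beta + B_2(t)u_2 + \big(\tilde B_2(t) - B_2(t)\big)\hat u_2\big\rangle\\
&\quad + \big\langle z,\ C(t)x^{u_2} + \tilde F_5(t)\hat x^{\hat u_2} + \tilde{\tilde B}_1^\top(t)\varphi + \tilde{\tilde D}_1(t)\beta + D_2(t)u_2 + \tilde F_2(t)\hat u_2\big\rangle\\
&\quad + \big\langle \tilde z,\ \tilde C(t)x^{u_2} + \big(\tilde{\tilde C}(t) - \tilde C(t)\big)\hat x^{\hat u_2} + \tilde B_1^\top(t)\varphi + \tilde F_3(t)\beta + \tilde D_2(t)u_2 + \big(\tilde{\tilde D}_2(t) - \tilde D_2(t)\big)\hat u_2\big\rangle\\
&\quad + \big\langle p,\ \tilde A^\top\varphi + \tilde{\tilde C}^\top\beta + \tilde F_4\hat u_2\big\rangle + \frac{1}{2}\Big[\big\langle Q_2x^{u_2}, x^{u_2}\big\rangle + \big\langle N_2u_2, u_2\big\rangle\Big].
\end{aligned}
\tag{5.43}
$$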


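Applying the general theorem, if there exists an $\mathcal{F}_t$-adapted optimal control $u_2^*(\cdot) \in \mathcal{U}_2$ for the leader, then

$$
\begin{aligned}
0 &= N_2(t)u_2^*(t) + \tilde F_4^\top(t)\hat p(t) + B_2^\top(t)y(t) + \big(\tilde B_2(t) - B_2(t)\big)^\top\hat y(t) + D_2^\top(t)z(t)\\
&\quad + \tilde F_2^\top(t)\hat z(t) + \tilde D_2^\top(t)\tilde z(t) + \big(\tilde{\tilde D}_2(t) - \tilde D_2(t)\big)^\top\hat{\tilde z}(t), \quad \text{a.e. } t \in [0, T],
\end{aligned}
\tag{5.44}
$$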

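where the $\mathcal{F}_t$-adapted quadruple of processes $(p(\cdot), y(\cdot), z(\cdot), \tilde z(\cdot))$ satisfies the conditional mean-field FBSDE

$$
\begin{cases}
\begin{aligned}
dp(t) &= \big[\tilde A(t)p(t) + \tilde F_1^\top(t)y(t) + \tilde{\tilde B}_1(t)z(t) + \tilde B_1(t)\tilde z(t)\big]dt\\
&\quad + \big[\tilde{\tilde C}(t)p(t) + \tilde B_1^\top(t)y(t) + \tilde{\tilde D}_1^\top(t)z(t) + \tilde F_3^\top(t)\tilde z(t)\big]d\tilde W(t),\\
-dy(t) &= \big[A^\top(t)y(t) + \big(\tilde A(t) - A(t)\big)^\top\hat y(t) + C^\top(t)z(t) + \tilde F_5^\top(t)\hat z(t) + \tilde C^\top(t)\tilde z(t)\\
&\qquad + \big(\tilde{\tilde C}(t) - \tilde C(t)\big)^\top\hat{\tilde z}(t) + Q_2(t)x^*(t)\big]dt - z(t)\,dW(t) - \tilde z(t)\,d\tilde W(t),
\end{aligned}\\
p(0) = 0, \quad y(T) = G_2x^*(T).
\end{cases}
\tag{5.45}
$$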


——————

Thanks!

——————
