UNIVERSIDADE DE MACAU
Leader-Follower Stochastic Differential Game with Asymmetric Information and Applications
Jie Xiong
Department of Mathematics
University of Macau
Macau
Based on joint works with Shi and Wang
2015 Barrett Lecture, Knoxville, 14/05/2015
Outline
1 Motivation
2 Problem formulation
3 LQ Leader-Follower Stochastic Differential Game with Asymmetric Information
4 Follower’s optimization problem
5 Complete Information LQ Stochastic Optimal Control Problem of the Leader
1. A motivation
Example: Continuous Time Newsvendor Problems
Let $D(\cdot)$ be the demand rate and $R(t)$ the retail price:
$$dD(t) = a\big(\mu - kR(t) - D(t)\big)\,dt + \sigma\,dW(t) + \tilde\sigma\,d\tilde W(t), \qquad D(0) = d_0 \in \mathbb{R}.$$
The retailer obtains the expected profit
$$J_1(q(\cdot), R(\cdot), w(\cdot)) = E\int_0^T \Big[(R(t) - S)\min[D(t), q(t)] - (w(t) - S)\,q(t)\Big]\,dt,$$
where $q(t)$ is the order rate, $w(t)$ is the wholesale price, and unsold items are salvaged at unit price $S \ge 0$.
The manufacturer has a fixed production cost per unit $M \ge 0$ and obtains the expected profit
$$J_2(q(\cdot), R(\cdot), w(\cdot)) = E\int_0^T (w(t) - M)\,q(t)\,dt.$$
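The two profit functionals of the newsvendor example can be estimated numerically. The following is a minimal sketch, not the method of the talk: it assumes constant policies $q$, $R$, $w$ (the talk allows adapted processes), uses an Euler-Maruyama discretization of the demand equation, and all parameter values and the function name `simulate_newsvendor` are hypothetical.

```python
import numpy as np

def simulate_newsvendor(q, R, w, *, a=1.0, mu=10.0, k=0.5, sigma=0.3,
                        sigma_tilde=0.2, d0=5.0, S=1.0, M=2.0,
                        T=1.0, n_steps=200, n_paths=20000, seed=0):
    """Monte Carlo estimate of (J1, J2) for constant q, R, w (an assumption)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    D = np.full(n_paths, d0, dtype=float)   # demand rate on each path
    profit1 = np.zeros(n_paths)             # retailer's accumulated profit
    for _ in range(n_steps):
        # Running profit of the retailer, left-endpoint rule in time.
        profit1 += ((R - S) * np.minimum(D, q) - (w - S) * q) * dt
        # Euler-Maruyama step for dD = a(mu - kR - D)dt + sigma dW + sigma~ dW~.
        dW = rng.normal(0.0, np.sqrt(dt), n_paths)
        dW_tilde = rng.normal(0.0, np.sqrt(dt), n_paths)
        D += a * (mu - k * R - D) * dt + sigma * dW + sigma_tilde * dW_tilde
    J1 = profit1.mean()
    J2 = (w - M) * q * T   # no randomness enters J2 when w and q are constant
    return J1, J2

if __name__ == "__main__":
    print(simulate_newsvendor(q=5.0, R=6.0, w=4.0))
```

With both noise intensities set to zero and $d_0 = \mu - kR$, demand stays constant, which gives a simple sanity check of the discretization.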
Let $\mathcal{G}^1_t$ and $\mathcal{G}^2_t$ denote the information available at time $t$ to the retailer and the manufacturer, respectively, with $\mathcal{G}^1_t \neq \mathcal{G}^2_t$.
For any $w(\cdot)$, the retailer selects a $\mathcal{G}^1_t$-adapted pair $(q^*(\cdot), R^*(\cdot))$ such that
$$J_1(q^*(\cdot), R^*(\cdot), w(\cdot)) \equiv J_1\big(q^*(\cdot; w(\cdot)), R^*(\cdot; w(\cdot)), w(\cdot)\big) = \max_{q(\cdot), R(\cdot)} J_1(q(\cdot), R(\cdot), w(\cdot)).$$
The manufacturer selects a $\mathcal{G}^2_t$-adapted process $w^*(\cdot)$ such that
$$J_2(q^*(\cdot), R^*(\cdot), w^*(\cdot)) \equiv J_2\big(q^*(\cdot; w^*(\cdot)), R^*(\cdot; w^*(\cdot)), w^*(\cdot)\big) = \max_{w(\cdot)} J_2\big(q^*(\cdot; w(\cdot)), R^*(\cdot; w(\cdot)), w(\cdot)\big).$$
Example: Cooperative Advertising and Pricing Problems
$$dx(t) = \big[\rho u(t)\sqrt{1 - x(t)} - \delta x(t)\big]\,dt + \sigma(x(t))\,dW(t) + \tilde\sigma(x(t))\,d\tilde W(t), \qquad x(0) = x_0 \in [0, 1],$$
where $u(t)$ is the advertising effort rate, $x(t)$ the awareness share, $\rho$ the response constant, and $\delta$ the rate at which potential consumers are lost.
The manufacturer decides the wholesale price $w(t)$ and the cooperative participation rate $\theta(t)$.
The retailer decides the advertising effort $u(t)$ and the retail price $p(t)$.
Given $w(t)$ and $\theta(t)$, the retailer solves an optimization problem to maximize his expected profit
$$J_1(w(\cdot), \theta(\cdot), u(\cdot), p(\cdot)) = E\int_0^T e^{-rt}\Big[(p(t) - w(t))\,D(p(t))\,x(t) - (1 - \theta(t))\,u^2(t)\Big]\,dt,$$
where $D$ is the demand function.
The manufacturer’s optimization problem is to maximize his expected profit
$$J_2(w(\cdot), \theta(\cdot), u(\cdot), p(\cdot)) = E\int_0^T e^{-rt}\Big[(w(t) - c)\,D(p(t))\,x(t) - \theta(t)\,u^2(t)\Big]\,dt,$$
where $c \ge 0$ is the constant unit production cost.
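Let $\mathcal{G}^1_t$ and $\mathcal{G}^2_t$ denote the information available at time $t$ to the retailer and the manufacturer, respectively, with $\mathcal{G}^1_t \neq \mathcal{G}^2_t$.
Given $w$ and $\theta$, the retailer chooses a $\mathcal{G}^1_t$-adapted pair $(u^*(\cdot), p^*(\cdot))$ such that
$$J_1(w(\cdot), \theta(\cdot), u^*(\cdot), p^*(\cdot)) \equiv J_1\big(w(\cdot), \theta(\cdot), u^*(\cdot; w(\cdot), \theta(\cdot)), p^*(\cdot; w(\cdot), \theta(\cdot))\big) = \max_{u(\cdot),\, p(\cdot) \ge 0} J_1(w(\cdot), \theta(\cdot), u(\cdot), p(\cdot)).$$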
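The manufacturer selects a $\mathcal{G}^2_t$-adapted pair $(w^*(\cdot), \theta^*(\cdot))$ such that
$$J_2(w^*(\cdot), \theta^*(\cdot), u^*(\cdot), p^*(\cdot)) \equiv J_2\big(w^*(\cdot), \theta^*(\cdot), u^*(\cdot; w^*(\cdot), \theta^*(\cdot)), p^*(\cdot; w^*(\cdot), \theta^*(\cdot))\big) = \max_{w(\cdot),\, 0 \le \theta(\cdot) \le 1} J_2\big(w(\cdot), \theta(\cdot), u^*(\cdot; w(\cdot), \theta(\cdot)), p^*(\cdot; w(\cdot), \theta(\cdot))\big).$$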
2. Problem formulation
State equation:
$$\begin{cases}
dx^{u_1,u_2}(t) = b\big(t, x^{u_1,u_2}(t), u_1(t), u_2(t)\big)\,dt + \sigma\big(t, x^{u_1,u_2}(t), u_1(t), u_2(t)\big)\,dW(t) + \tilde\sigma\big(t, x^{u_1,u_2}(t), u_1(t), u_2(t)\big)\,d\tilde W(t), \\
x^{u_1,u_2}(0) = x_0.
\end{cases} \tag{2.1}$$
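$\mathcal{G}^1_t \neq \mathcal{G}^2_t \subseteq \mathcal{F}_t$.
The admissible control set:
$$\mathcal{U}_1 := \Big\{ u_1 \;\Big|\; u_1 : \Omega \times [0, T] \to U_1 \text{ is } \mathcal{G}^1_t\text{-adapted and } \sup_{0 \le t \le T} E|u_1(t)|^i \ \dots$$
(The remainder of this definition is truncated in the source.)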
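Problem of the follower. For any $u_2(\cdot) \in \mathcal{U}_2$ chosen by the leader, choose a $\mathcal{G}^1_t$-adapted control $u_1^*(\cdot) = u_1^*(\cdot; u_2(\cdot)) \in \mathcal{U}_1$ such that
$$J_1(u_1^*(\cdot), u_2(\cdot)) \equiv J_1\big(u_1^*(\cdot; u_2(\cdot)), u_2(\cdot)\big) = \inf_{u_1 \in \mathcal{U}_1} J_1(u_1(\cdot), u_2(\cdot)). \tag{2.4}$$
Such a $u_1^*(\cdot) = u_1^*(\cdot; u_2(\cdot))$ is called a (partial information) optimal control.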
The leader would like to choose a $\mathcal{G}^2_t$-adapted control $u_2^*(\cdot)$ to minimize
$$J_2(u_1^*(\cdot), u_2(\cdot)) = E\Big[\int_0^T g_2\big(t, x^{u_1^*,u_2}(t), u_1^*(t; u_2(t)), u_2(t)\big)\,dt + G_2\big(x^{u_1^*,u_2}(T)\big)\Big]. \tag{2.5}$$
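Problem of the leader. Find a $\mathcal{G}^2_t$-adapted control $u_2^*(\cdot) \in \mathcal{U}_2$ such that
$$J_2(u_1^*(\cdot), u_2^*(\cdot)) = J_2\big(u_1^*(\cdot; u_2^*(\cdot)), u_2^*(\cdot)\big) = \inf_{u_2 \in \mathcal{U}_2} J_2\big(u_1^*(\cdot; u_2(\cdot)), u_2(\cdot)\big). \tag{2.6}$$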
Isaacs (1954-5): differential games.
Basar and Olsder (1982): monograph summarizing the theory.
Stackelberg (1934): leader-follower game introduced.
Yong (2002): LQ case, complete information.
Bensoussan et al. (2012): stochastic maximum principle.
3. LQ Leader-Follower Stochastic Differential Game with Asymmetric Information; Stochastic maximum principle under partial information.
The state equation:
$$\begin{cases}
dx^{u_1,u_2}(t) = \big[A(t)x^{u_1,u_2}(t) + B_1(t)u_1(t) + B_2(t)u_2(t)\big]\,dt \\
\qquad\qquad\quad + \big[C(t)x^{u_1,u_2}(t) + D_1(t)u_1(t) + D_2(t)u_2(t)\big]\,dW(t) \\
\qquad\qquad\quad + \big[\tilde C(t)x^{u_1,u_2}(t) + \tilde D_1(t)u_1(t) + \tilde D_2(t)u_2(t)\big]\,d\tilde W(t), \\
x^{u_1,u_2}(0) = x_0.
\end{cases} \tag{3.7}$$
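Given $u_2 \in \mathcal{U}_2$, the follower minimizes his cost functional
$$J_1(u_1(\cdot), u_2(\cdot)) = \frac{1}{2} E\Big[\int_0^T \Big(\big\langle Q_1(t)x^{u_1,u_2}(t), x^{u_1,u_2}(t)\big\rangle + \big\langle N_1(t)u_1(t), u_1(t)\big\rangle\Big)\,dt + \big\langle G_1 x^{u_1,u_2}(T), x^{u_1,u_2}(T)\big\rangle\Big]. \tag{3.8}$$
Information available to the follower: $\mathcal{G}^1_t = \sigma\{\tilde W(s);\ 0 \le s \le t\}$.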
Stochastic maximum principle under partial information.
Probability space $(\Omega, \mathcal{F}, \hat P)$.
State equation (an FBSDE):
$$\begin{cases}
dx = b(t, x, v)\,dt + \sigma(t, x, v)\,dW + \tilde\sigma(t, x, v)\,d\tilde W, \\
-dy = g(t, x, y, z, \tilde z, v)\,dt - z\,dW - \tilde z\,dY, \\
x(0) = x_0, \qquad y(1) = f(x(1)).
\end{cases} \tag{3.9}$$
Observation:
$$\begin{cases}
dY = h(t, x)\,dt + d\tilde W, \\
Y(0) = 0.
\end{cases} \tag{3.10}$$
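Admissible controls: $v \in \mathcal{A}$ if $v$ is $\mathcal{G}_t$-adapted, where $\mathcal{G}_t = \mathcal{F}^Y_t$.
Control problem: find $u \in \mathcal{A}$ such that
$$J(u) = \min_{v \in \mathcal{A}} J(v),$$
where
$$J(v) = \hat E\Big(\int_0^1 \ell(t, x^v, y^v, z^v, \tilde z^v, v)\,dt + \phi(x^v(1)) + \gamma(y^v(0))\Big).$$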
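Hypothesis (H1):
The coefficients are continuously differentiable with bounded first-order partial derivatives.
$\tilde\sigma$ and $h$ are bounded.
Hypothesis (H2):
$\ell$, $\phi$, $\gamma$ are continuously differentiable.
$$\hat E\Big(\int_0^1 |\ell(t, x^v, y^v, z^v, \tilde z^v, v)|\,dt + |\phi(x^v(1))| + |\gamma(y^v(0))|\Big) \ \dots$$
(The bound on this expectation is truncated in the source.)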
The case $b = \sigma = \tilde\sigma = 0$ was studied by Huang-Wang-Xiong (SICON 2009).
The general form is due to Wang-Wu-Xiong (SICON 2013).
Details on the LQ case are in Wang-Wu-Xiong (IEEE TAC 2015).
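Replace $b$ and $\tilde W$ in the state equation by $\tilde b$ and $Y$, respectively, where
$$\tilde b = b - \tilde\sigma h.$$
State equation:
$$\begin{cases}
dx = \tilde b(t, x, v)\,dt + \sigma(t, x, v)\,dW + \tilde\sigma(t, x, v)\,dY, \\
-dy = g(t, x, y, z, \tilde z, v)\,dt - z\,dW - \tilde z\,dY, \\
x(0) = x_0, \qquad y(1) = f(x(1)).
\end{cases} \tag{3.11}$$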
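Let
$$Z^v(t) = \exp\Big\{\int_0^t h(s, x^v)\,dY - \frac{1}{2}\int_0^t h^2(s, x^v)\,ds\Big\},$$
so that
$$dZ^v = Z^v h(t, x^v)\,dY. \tag{3.12}$$
Let $P \sim \hat P$ be such that $d\hat P = Z^v(1)\,dP$.
Then, under $P$, $(W, Y)$ is a Brownian motion, and
$$J(v) = E\Big(\int_0^1 Z^v \ell(t, x^v, y^v, z^v, \tilde z^v, v)\,dt + Z^v(1)\phi(x^v(1)) + \gamma(y^v(0))\Big).$$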
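The identity (3.12) and the rewritten cost follow from a short computation. The sketch below assumes that $Z^v$ is a true martingale under $P$ (plausible here because $h$ is bounded by (H1)) and that $\mathcal{F}_0$ is trivial, so that $y^v(0)$ is deterministic; neither assumption is spelled out on the slide, and $\ell(t)$ abbreviates $\ell(t, x^v, y^v, z^v, \tilde z^v, v)$.

```latex
\begin{aligned}
d\log Z^v &= h\,dY - \tfrac12 h^2\,dt, \\
dZ^v &= Z^v\,d\log Z^v + \tfrac12 Z^v\,d\langle \log Z^v\rangle
      = Z^v\big(h\,dY - \tfrac12 h^2\,dt\big) + \tfrac12 Z^v h^2\,dt
      = Z^v h\,dY, \\
\hat E[\ell(t)] &= E[Z^v(1)\,\ell(t)]
      = E\big[E[Z^v(1)\mid\mathcal F_t]\,\ell(t)\big]
      = E[Z^v(t)\,\ell(t)], \\
\hat E[\phi(x^v(1))] &= E[Z^v(1)\,\phi(x^v(1))], \qquad
\hat E[\gamma(y^v(0))] = \gamma(y^v(0)).
\end{aligned}
```

The third line uses that $\ell(t)$ is $\mathcal{F}_t$-measurable, which is why $Z^v(t)$ rather than $Z^v(1)$ appears inside the time integral.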
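Hypothesis (H3):
For $\xi$ bounded and $\mathcal{G}_t$-measurable, let
$$v(s) \equiv \xi 1_{[t, t+\tau)}(s);$$
then $v \in \mathcal{A}$.
For any $u \in \mathcal{A}$ and any bounded $\mathcal{G}_s$-adapted $v$, there exists $\delta > 0$ such that $u + \epsilon v \in \mathcal{A}$ whenever $|\epsilon| < \delta$.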
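Adjoint equations:
$$\begin{cases}
dp = \big(g_y(t, x, y, z, \tilde z, u)\,p - \ell_y(t, x, y, z, \tilde z, u)\big)\,dt + \big(g_z(t, x, y, z, \tilde z, u)\,p - \ell_z(t, x, y, z, \tilde z, u)\big)\,dW \\
\qquad + \big(g_{\tilde z}(t, x, y, z, \tilde z, u)\,p - h(t, x)\big)\,d\tilde W, \\
-dq = \big\{(b_x(t, x, u) - \tilde\sigma(t, x, u)h_x(t, x))\,q + \sigma_x(t, x, u)\,k + \tilde\sigma_x(t, x, u)\,\tilde k \\
\qquad - g_x(t, x, y, z, \tilde z, u)\,p - \ell_x(t, x, y, z, \tilde z, u)\big\}\,dt - k\,dW - \tilde k\,d\tilde W, \\
p(0) = -\gamma_y(y(0)), \qquad q(1) = -f_x(x(1))\,p(1) + \phi_x(x(1)),
\end{cases} \tag{3.13}$$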
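and
$$\begin{cases}
-dP = \ell(t, x, y, z, \tilde z, u)\,dt - Q\,dW - \tilde Q\,d\tilde W, \\
P(1) = \phi(x(1)).
\end{cases} \tag{3.14}$$
Hamiltonian:
$$H(t, x, y, z, \tilde z, v; p, q, k, \tilde k, \tilde Q) = bq + \sigma k + \tilde\sigma\tilde k + h\tilde Q + \ell - (g - h\tilde z)p.$$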
Theorem
If $u$ is a local minimum for $J(\cdot)$, then
$$E\Big(H_u(t, x, y, z, \tilde z, v; p, q, k, \tilde k, \tilde Q)\,\Big|\,\mathcal{G}_t\Big) = 0.$$
Idea of proof: set
$$\frac{d}{d\epsilon} J(u + \epsilon v)\Big|_{\epsilon = 0} = 0.$$
Remark
Equation (3.14) is needed because of $Z^v(t)$. If $h = 0$, this equation is not needed.
The case $h = b = \sigma = \tilde\sigma = 0$ was proved in Huang-Wang-Xiong (SICON 2009).
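4. Follower’s optimization problem
Step 1. (Optimal control)
Hamiltonian function:
$$\begin{aligned}
H_1(t, x, u_1, u_2; q, k, \tilde k) ={}& \big\langle q, A(t)x + B_1(t)u_1 + B_2(t)u_2\big\rangle + \big\langle k, C(t)x + D_1(t)u_1 + D_2(t)u_2\big\rangle \\
& + \big\langle \tilde k, \tilde C(t)x + \tilde D_1(t)u_1 + \tilde D_2(t)u_2\big\rangle - \frac{1}{2}\big\langle Q_1(t)x, x\big\rangle - \frac{1}{2}\big\langle N_1(t)u_1, u_1\big\rangle.
\end{aligned} \tag{4.15}$$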
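Then,
$$0 = N_1(t)u_1^*(t) - B_1^\top(t)E\big[q(t)\,\big|\,\mathcal{G}^1_t\big] - D_1^\top(t)E\big[k(t)\,\big|\,\mathcal{G}^1_t\big] - \tilde D_1^\top(t)E\big[\tilde k(t)\,\big|\,\mathcal{G}^1_t\big], \quad \text{a.e. } t \in [0, T], \tag{4.16}$$
where the $\mathcal{F}_t$-adapted triple $(q(\cdot), k(\cdot), \tilde k(\cdot))$ satisfies the BSDE
$$\begin{cases}
-dq(t) = \big[A^\top(t)q(t) + C^\top(t)k(t) + \tilde C^\top(t)\tilde k(t) - Q_1(t)x^{u_1^*,u_2}(t)\big]\,dt - k(t)\,dW(t) - \tilde k(t)\,d\tilde W(t), \\
q(T) = -G_1 x^{u_1^*,u_2}(T).
\end{cases} \tag{4.17}$$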
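Step 2. (Optimal filtering)
Let
$$\hat f(t) := E\big[f(t)\,\big|\,\mathcal{G}^1_t\big], \qquad f = q, k, \tilde k.$$
How can the filtering equations be derived? Deriving them directly from (4.17) is impossible, so we solve (4.17) first.
Guess:
$$q(t) = -P_1(t)x^{u_1^*,u_2}(t) - \varphi(t), \quad t \in [0, T], \tag{4.18}$$
and set
$$\begin{cases}
d\varphi(t) = \alpha(t)\,dt + \beta(t)\,d\tilde W(t), \\
\varphi(T) = 0.
\end{cases} \tag{4.19}$$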
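Applying Itô's formula to $q(t)$ in (4.18) and comparing with (4.17), we get
$$k(t) = -P_1(t)\big[C(t)x^{u_1^*,u_2}(t) + D_1(t)u_1^*(t) + D_2(t)u_2(t)\big], \tag{4.20}$$
$$\tilde k(t) = -P_1(t)\big[\tilde C(t)x^{u_1^*,u_2}(t) + \tilde D_1(t)u_1^*(t) + \tilde D_2(t)u_2(t)\big] - \beta(t), \tag{4.21}$$
$$\begin{aligned}
\alpha(t) ={}& \big[-\dot P_1(t) - P_1(t)A(t) - A^\top(t)P_1(t) - Q_1(t)\big]x^{u_1^*,u_2}(t) - P_1(t)B_1(t)u_1^*(t) \\
& - P_1(t)B_2(t)u_2(t) - A^\top(t)\varphi(t) + C^\top(t)k(t) + \tilde C^\top(t)\tilde k(t).
\end{aligned} \tag{4.22}$$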
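The comparison step behind (4.20)-(4.22) can be sketched as follows. This is a reconstruction under the ansatz (4.18)-(4.19), assuming $P_1(\cdot)$ is deterministic and differentiable (so no Itô correction term arises); it writes $x$ for $x^{u_1^*,u_2}$ and suppresses the time argument.

```latex
\begin{aligned}
dq &= -\dot P_1 x\,dt - P_1\,dx - d\varphi \\
   &= -\big[\dot P_1 x + P_1(Ax + B_1u_1^* + B_2u_2) + \alpha\big]\,dt
      - P_1\big[Cx + D_1u_1^* + D_2u_2\big]\,dW \\
   &\quad - \Big(P_1\big[\tilde Cx + \tilde D_1u_1^* + \tilde D_2u_2\big] + \beta\Big)\,d\tilde W, \\
dq &= -\big[A^\top q + C^\top k + \tilde C^\top\tilde k - Q_1x\big]\,dt + k\,dW + \tilde k\,d\tilde W
   \qquad \text{(from (4.17))}.
\end{aligned}
```

Matching the $dW$ and $d\tilde W$ coefficients gives (4.20) and (4.21). Matching the $dt$ coefficients and substituting $q = -P_1x - \varphi$ in $A^\top q$ gives (4.22).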
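Hence,
$$\hat q(t) = -P_1(t)\hat x^{u_1^*,\hat u_2}(t) - \varphi(t), \tag{4.23}$$
$$\hat k(t) = -P_1(t)\big[C(t)\hat x^{u_1^*,\hat u_2}(t) + D_1(t)u_1^*(t) + D_2(t)\hat u_2(t)\big], \tag{4.24}$$
$$\hat{\tilde k}(t) = -P_1(t)\big[\tilde C(t)\hat x^{u_1^*,\hat u_2}(t) + \tilde D_1(t)u_1^*(t) + \tilde D_2(t)\hat u_2(t)\big] - \beta(t), \tag{4.25}$$
with $\hat u_2(t) := E\big[u_2(t)\,\big|\,\mathcal{G}^1_t\big]$.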
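Using a filtering technique (Lemma 5.4 in Xiong (2008)), we get
$$\begin{cases}
d\hat x^{u_1^*,\hat u_2}(t) = \big[A(t)\hat x^{u_1^*,\hat u_2}(t) + B_1(t)u_1^*(t) + B_2(t)\hat u_2(t)\big]\,dt \\
\qquad\qquad\quad + \big[\tilde C(t)\hat x^{u_1^*,\hat u_2}(t) + \tilde D_1(t)u_1^*(t) + \tilde D_2(t)\hat u_2(t)\big]\,d\tilde W(t), \\
\hat x^{u_1^*,\hat u_2}(0) = x_0,
\end{cases} \tag{4.26}$$
and
$$\begin{cases}
-d\hat q(t) = \big\{A^\top(t)\hat q(t) + C^\top(t)\hat k(t) + \tilde C^\top(t)\hat{\tilde k}(t) - Q_1(t)\hat x^{u_1^*,\hat u_2}(t)\big\}\,dt - \hat{\tilde k}(t)\,d\tilde W(t), \\
\hat q(T) = -G_1\hat x^{u_1^*,\hat u_2}(T).
\end{cases} \tag{4.27}$$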
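Step 3. (Optimal state feedback control)
By (4.16) we arrive at
$$\begin{aligned}
u_1^*(t) = -\tilde N_1(t)^{-1}\Big[&\big(B_1^\top(t)P_1(t) + D_1^\top(t)P_1(t)C(t) + \tilde D_1^\top(t)P_1(t)\tilde C(t)\big)\hat x^{u_1^*,\hat u_2}(t) \\
& + \big(D_1^\top(t)P_1(t)D_2(t) + \tilde D_1^\top(t)P_1(t)\tilde D_2(t)\big)\hat u_2(t) + B_1^\top(t)\varphi(t) + \tilde D_1^\top(t)\beta(t)\Big],
\end{aligned} \tag{4.28}$$
where
$$\tilde N_1(t) := N_1(t) + D_1^\top(t)P_1(t)D_1(t) + \tilde D_1^\top(t)P_1(t)\tilde D_1(t). \tag{4.29}$$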
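The feedback law (4.28)-(4.29) is a linear map of the filtered state and the adjoint terms, so at a fixed time it reduces to a few matrix products. The following is a minimal sketch that evaluates it once, assuming scalar Brownian motions (so $C$, $\tilde C$ are $n \times n$ and $\beta$ is an $n$-vector) and that all matrices are supplied as NumPy arrays; the function name `follower_feedback` and the argument names are hypothetical.

```python
import numpy as np

def follower_feedback(P1, B1, C, D1, D2, Ct, D1t, D2t, N1,
                      x_hat, u2_hat, phi, beta):
    """Evaluate u1* from (4.28) at a single time point.

    Names ending in 't' stand for the tilde quantities (Ct = C-tilde, etc.).
    """
    # N-tilde from (4.29).
    N_tilde = N1 + D1.T @ P1 @ D1 + D1t.T @ P1 @ D1t
    # Gain on the filtered state x_hat.
    gain_x = B1.T @ P1 + D1.T @ P1 @ C + D1t.T @ P1 @ Ct
    # Gain on the filtered leader control u2_hat.
    gain_u2 = D1.T @ P1 @ D2 + D1t.T @ P1 @ D2t
    rhs = gain_x @ x_hat + gain_u2 @ u2_hat + B1.T @ phi + D1t.T @ beta
    # Solve N_tilde u = -rhs instead of forming the inverse explicitly.
    return -np.linalg.solve(N_tilde, rhs)

if __name__ == "__main__":
    one, zero = np.array([[1.0]]), np.array([[0.0]])
    u = follower_feedback(P1=np.array([[2.0]]), B1=one, C=zero, D1=one, D2=zero,
                          Ct=zero, D1t=zero, D2t=zero, N1=one,
                          x_hat=np.array([3.0]), u2_hat=np.array([0.0]),
                          phi=np.array([0.0]), beta=np.array([0.0]))
    print(u)
```

Using `np.linalg.solve` rather than an explicit inverse is a numerical design choice; it requires $\tilde N_1(t)$ to be invertible.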
Substituting (4.28) into (4.22), we obtain the Riccati equation
$$\begin{cases}
\dot P_1(t) + P_1(t)A(t) + A^\top(t)P_1(t) + C^\top(t)P_1(t)C(t) + \tilde C^\top(t)P_1(t)\tilde C(t) + Q_1(t) \\
\quad - \big(P_1(t)B_1(t) + C^\top(t)P_1(t)D_1(t) + \tilde C^\top(t)P_1(t)\tilde D_1(t)\big)\tilde N_1(t)^{-1}\big(B_1^\top(t)P_1(t) + D_1^\top(t)P_1(t)C(t) + \tilde D_1^\top(t)P_1(t)\tilde C(t)\big) = 0, \\
P_1(T) = G_1.
\end{cases} \tag{4.30}$$
Suppose it admits a unique differentiable solution $P_1(\cdot)$.
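When no closed form is available, the Riccati equation (4.30) can be integrated backward from its terminal condition. The following is a minimal sketch for the scalar case with constant coefficients (both are simplifying assumptions, as the talk allows time-dependent matrices), using a classical fourth-order Runge-Kutta step; the function name `solve_scalar_riccati` is hypothetical.

```python
import numpy as np

def solve_scalar_riccati(A, B1, C, D1, Ct, D1t, Q1, N1, G1, T=1.0, n_steps=2000):
    """Integrate the scalar, constant-coefficient version of (4.30) backward in time.

    Ct and D1t stand for C-tilde and D1-tilde. Returns the time grid and P1 on it.
    """
    def rhs(P):
        # dP/dt obtained by solving (4.30) for the derivative.
        N_tilde = N1 + D1**2 * P + D1t**2 * P
        S1 = P * B1 + C * P * D1 + Ct * P * D1t
        return -(2.0 * A * P + C**2 * P + Ct**2 * P + Q1 - S1**2 / N_tilde)

    h = T / n_steps
    P = np.empty(n_steps + 1)
    P[-1] = G1                      # terminal condition P1(T) = G1
    for i in range(n_steps, 0, -1):
        # Classical RK4 with step -h: compute P(t - h) from P(t).
        k1 = rhs(P[i])
        k2 = rhs(P[i] - 0.5 * h * k1)
        k3 = rhs(P[i] - 0.5 * h * k2)
        k4 = rhs(P[i] - h * k3)
        P[i - 1] = P[i] - h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return np.linspace(0.0, T, n_steps + 1), P

if __name__ == "__main__":
    t, P = solve_scalar_riccati(A=0.0, B1=1.0, C=0.0, D1=0.0, Ct=0.0, D1t=0.0,
                                Q1=1.0, N1=1.0, G1=0.0)
    print(P[0], np.tanh(1.0))
```

In the test case above the equation reduces to $\dot P_1 = P_1^2 - 1$ with $P_1(T) = 0$, whose solution is $P_1(t) = \tanh(T - t)$, so the numerical value at $t = 0$ can be compared with $\tanh(1)$.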
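With (4.28) and (4.30), and consistently with (4.19) and (4.32), the drift $\alpha$ in (4.22) reduces to
$$\begin{aligned}
\alpha(t) ={}& -\big[A^\top(t) - \tilde S_1(t)\tilde N_1(t)^{-1}B_1^\top(t)\big]\varphi(t) - \big[\tilde C^\top(t) - \tilde S_1(t)\tilde N_1(t)^{-1}\tilde D_1^\top(t)\big]\beta(t) \\
& - \big[-\tilde S_1(t)\tilde N_1(t)^{-1}\tilde S(t) + \tilde S_2(t)\big]\hat u_2(t),
\end{aligned} \tag{4.31}$$
which is $\mathcal{G}^1_t$-adapted, where
$$\begin{aligned}
\tilde S(t) &:= D_1^\top(t)P_1(t)D_2(t) + \tilde D_1^\top(t)P_1(t)\tilde D_2(t), \\
\tilde S_1(t) &:= P_1(t)B_1(t) + C^\top(t)P_1(t)D_1(t) + \tilde C^\top(t)P_1(t)\tilde D_1(t), \\
\tilde S_2(t) &:= P_1(t)B_2(t) + C^\top(t)P_1(t)D_2(t) + \tilde C^\top(t)P_1(t)\tilde D_2(t), \quad t \in [0, T].
\end{aligned}$$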
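Then,
$$\begin{cases}
-d\varphi(t) = \Big\{\big[A^\top(t) - \tilde S_1(t)\tilde N_1(t)^{-1}B_1^\top(t)\big]\varphi(t) + \big[\tilde C^\top(t) - \tilde S_1(t)\tilde N_1(t)^{-1}\tilde D_1^\top(t)\big]\beta(t) \\
\qquad\qquad + \big[-\tilde S_1(t)\tilde N_1(t)^{-1}\tilde S(t) + \tilde S_2(t)\big]\hat u_2(t)\Big\}\,dt - \beta(t)\,d\tilde W(t), \\
\varphi(T) = 0,
\end{cases} \tag{4.32}$$
which admits a unique $\mathcal{G}^1_t$-adapted solution $(\varphi(\cdot), \beta(\cdot))$.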
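Finally, $\hat x^{u_1^*,\hat u_2}$ solves
$$\begin{cases}
d\hat x^{u_1^*,\hat u_2}(t) = \big[\tilde A(t)\hat x^{u_1^*,\hat u_2}(t) + \tilde F_1(t)\varphi(t) + \tilde B_1(t)\beta(t) + \tilde B_2(t)\hat u_2(t)\big]\,dt \\
\qquad\qquad\quad + \big[\tilde{\tilde C}(t)\hat x^{u_1^*,\hat u_2}(t) + \tilde B_1^\top(t)\varphi(t) + \tilde F_3(t)\beta(t) + \tilde{\tilde D}_2(t)\hat u_2(t)\big]\,d\tilde W(t), \\
\hat x^{u_1^*,\hat u_2}(0) = x_0,
\end{cases} \tag{4.33}$$
where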
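$$\begin{aligned}
\tilde A(t) &:= A(t) - B_1(t)\tilde N_1(t)^{-1}\tilde S_1^\top(t), \\
\tilde B_2(t) &:= B_2(t) - B_1(t)\tilde N_1(t)^{-1}\tilde S(t), \\
\tilde F_1(t) &:= -B_1(t)\tilde N_1(t)^{-1}B_1^\top(t), \\
\tilde B_1(t) &:= -B_1(t)\tilde N_1(t)^{-1}\tilde D_1^\top(t), \\
\tilde{\tilde C}(t) &:= \tilde C(t) - \tilde D_1(t)\tilde N_1(t)^{-1}\tilde S_1^\top(t), \\
\tilde F_3(t) &:= -\tilde D_1(t)\tilde N_1(t)^{-1}\tilde D_1^\top(t), \\
\tilde{\tilde D}_2(t) &:= \tilde D_2(t) - \tilde D_1(t)\tilde N_1(t)^{-1}\tilde S(t), \\
\tilde F_4(t) &:= -\tilde S_1(t)\tilde N_1(t)^{-1}\tilde S(t) + \tilde S_2(t).
\end{aligned}$$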
Theorem 3.1 Let the Riccati equation (4.30) admit a differentiable solution $P_1(\cdot)$ and let the matrix $\tilde N_1(t)$ in (4.29) be invertible. For any $u_2(\cdot)$ chosen by the leader, the Problem of the follower admits an optimal control $u_1^*(\cdot)$ of the state feedback form (4.28), where the triple $(\hat x^{u_1^*,\hat u_2}(\cdot), \varphi(\cdot), \beta(\cdot))$ is the unique $\mathcal{G}^1_t$-adapted solution to the FBSDE (4.32)-(4.33).
5. Complete Information LQ Stochastic Optimal Control Problem of the Leader
The leader knows the complete information $\mathcal{F}_t$ at any time $t$.
The state equations faced by the leader consist of the BSDE (4.32) and the SDE (3.7). By (4.28), the system can be written as
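$$\begin{cases}
dx^{u_2}(t) = \big[A(t)x^{u_2}(t) + (\tilde A(t) - A(t))\hat x^{\hat u_2}(t) + \tilde F_1(t)\varphi(t) + \tilde B_1(t)\beta(t) + B_2(t)u_2(t) + (\tilde B_2(t) - B_2(t))\hat u_2(t)\big]\,dt \\
\qquad + \big[C(t)x^{u_2}(t) + \tilde F_5(t)\hat x^{\hat u_2}(t) + \tilde{\tilde B}_1^\top(t)\varphi(t) + \tilde{\tilde D}_1(t)\beta(t) + D_2(t)u_2(t) + \tilde F_2(t)\hat u_2(t)\big]\,dW(t) \\
\qquad + \big[\tilde C(t)x^{u_2}(t) + (\tilde{\tilde C}(t) - \tilde C(t))\hat x^{\hat u_2}(t) + \tilde B_1^\top(t)\varphi(t) + \tilde F_3(t)\beta(t) + \tilde D_2(t)u_2(t) + (\tilde{\tilde D}_2(t) - \tilde D_2(t))\hat u_2(t)\big]\,d\tilde W(t), \\
-d\varphi(t) = \big[\tilde A^\top(t)\varphi(t) + \tilde{\tilde C}^\top(t)\beta(t) + \tilde F_4(t)\hat u_2(t)\big]\,dt - \beta(t)\,d\tilde W(t), \\
x^{u_2}(0) = x_0, \qquad \varphi(T) = 0.
\end{cases} \tag{5.34}$$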
The leader’s cost functional:
$$\tilde J_2(u_2(\cdot)) := J_2(u_1^*(\cdot), u_2(\cdot)) = \frac{1}{2}E\Big[\int_0^T \Big(\big\langle Q_2(t)x^{u_1^*,u_2}(t), x^{u_1^*,u_2}(t)\big\rangle + \big\langle N_2(t)u_2(t), u_2(t)\big\rangle\Big)\,dt + \big\langle G_2 x^{u_1^*,u_2}(T), x^{u_1^*,u_2}(T)\big\rangle\Big]. \tag{5.35}$$
This complete-information conditional mean-field LQ problem with state (5.34) and cost (5.35) can be solved explicitly using a general result.
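General conditional mean-field SMP. State:
$$\begin{cases}
dx^{u_2}(t) = b(t)\,dt + \sigma(t)\,dW(t) + \tilde\sigma(t)\,d\tilde W(t), \\
-dq(t) = \Big\{b_x(t)q(t) + \sum_{j=1}^{d_1}\sigma^j_x(t)k_j(t) + \sum_{j=1}^{d_2}\tilde\sigma^j_x(t)\tilde k_j(t) - g_{1x}(t)\Big\}\,dt - k(t)\,dW(t) - \tilde k(t)\,d\tilde W(t), \\
x^{u_2}(0) = x_0, \qquad q(T) = -G_{1x}(x^{u_2}(T)),
\end{cases} \tag{5.36}$$
where $b$ is a function of $(t, x, \hat x, u_2, \hat u_2, q, k, \tilde k, \hat k, \hat{\tilde k})$.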
Cost:
$$J_2(u_2(\cdot)) = E\Big[\int_0^T g_2(t)\,dt + G_2(x^{u_2}(T))\Big].$$
Problem of the leader. Find a $\mathcal{G}^2_t$-adapted control $u_2^*(\cdot) \in \mathcal{U}_2$ such that
$$J_2(u_2^*(\cdot)) = \inf_{u_2 \in \mathcal{U}_2} J_2(u_2(\cdot)), \tag{5.37}$$
subject to (5.36).
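Define the Hamiltonian
$$\begin{aligned}
H_2(t, x, u_2, \hat x, \hat u_2, q, k, \tilde k; y, z, \tilde z, p) ={}& \big\langle y, b(t)\big\rangle + \mathrm{tr}\big\{z^\top\sigma(t)\big\} + \mathrm{tr}\big\{\tilde z^\top\tilde\sigma(t)\big\} + g_2(t) \\
& - \Big\langle p,\ b_x(t)q + \sum_{j=1}^{d_1}\sigma^j_x(t)k_j + \sum_{j=1}^{d_2}\tilde\sigma^j_x(t)\tilde k_j - g_{1x}(t)\Big\rangle.
\end{aligned} \tag{5.38}$$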
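Let $(y(\cdot), z(\cdot), \tilde z(\cdot), p(\cdot))$ be the unique $\mathcal{F}_t$-adapted solution to the adjoint conditional mean-field FBSDE of the leader
$$\begin{aligned}
dp(t) ={}& \Big\{b_x^*(t)p(t) + E\big[b_{\hat x}^*(t)p(t)\,\big|\,\mathcal{G}^1_t\big]\Big\}\,dt + \sum_{j=1}^{d_1}\Big\{\sigma^{*j}_x(t)p(t) + E\big[\sigma^{*j}_{\hat x}(t)p(t)\,\big|\,\mathcal{G}^1_t\big]\Big\}\,dW^j(t) \\
& + \sum_{j=1}^{d_2}\Big\{\tilde\sigma^{*j}_x(t)p(t) + E\big[\tilde\sigma^{*j}_{\hat x}(t)p(t)\,\big|\,\mathcal{G}^1_t\big]\Big\}\,d\tilde W^j(t),
\end{aligned} \tag{5.39}$$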
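$$\begin{aligned}
-dy ={}& \Big\{b_x y + E\big[b_{\hat x}y\,\big|\,\mathcal{G}^1_t\big] + \sum_{j=1}^{d_1}\Big[\sigma^j_x z^j + E\big[\sigma^j_{\hat x}z^j\,\big|\,\mathcal{G}^1_t\big]\Big] + \sum_{j=1}^{d_2}\Big[\tilde\sigma^j_x(t)\tilde z^j(t) + E\big[\tilde\sigma^j_{\hat x}(t)\tilde z^j(t)\,\big|\,\mathcal{G}^1_t\big]\Big] \\
& - \sum_{i=1}^{n}\Big\{\frac{\partial b_x}{\partial x_i}(t)q(t)p_i(t) + E\Big[\frac{\partial b_x}{\partial \hat x_i}(t)q(t)p_i(t)\,\Big|\,\mathcal{G}^1_t\Big]\Big\} \\
& - \sum_{i=1}^{n}\Big\{\frac{\partial}{\partial x_i}\Big(\sum_{j=1}^{d_1}\sigma^j_x k_j\Big)p_i + E\Big[\frac{\partial}{\partial \hat x_i}\Big(\sum_{j=1}^{d_1}\sigma^j_x k_j\Big)p_i\,\Big|\,\mathcal{G}^1_t\Big]\Big\} \\
& - \sum_{i=1}^{n}\Big\{\frac{\partial}{\partial x_i}\Big(\sum_{j=1}^{d_2}\tilde\sigma^j_x\tilde k_j\Big)p_i + E\Big[\frac{\partial}{\partial \hat x_i}\Big(\sum_{j=1}^{d_2}\tilde\sigma^j_x\tilde k_j\Big)p_i\,\Big|\,\mathcal{G}^1_t\Big]\Big\} \\
& + g_{1xx}p + E\big[g_{1x\hat x}p\,\big|\,\mathcal{G}^1_t\big] + g_{2x} + E\big[g_{2\hat x}\,\big|\,\mathcal{G}^1_t\big]\Big\}\,dt - z\,dW - \tilde z\,d\tilde W,
\end{aligned} \tag{5.40}$$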
with boundary conditions
$$p(0) = 0, \qquad y(T) = G_{1xx}(x^*(T))\,p(T) + G_{2x}(x^*(T)), \tag{5.41}$$
where we have used $\phi \equiv \phi(t, x^*(t), \hat x^*(t), u_2^*(t), \hat u_2^*(t))$ for $\phi = b, \sigma, \tilde\sigma, g_1, g_2$ and all their derivatives.
Related work: Andersson and Djehiche (2011), Li (2012), Yong (2013).
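Proposition (Maximum principle of conditional mean-field FBSDE with partial information)
Let $u_2^*(\cdot) \in \mathcal{U}_2$ be the optimal control for the Problem of the leader and let $(x^*(\cdot), q^*(\cdot), k^*(\cdot), \tilde k^*(\cdot))$ be the corresponding optimal state, which is the unique $\mathcal{F}_t$-adapted solution to (5.36).
Let $(y(\cdot), z(\cdot), \tilde z(\cdot), p(\cdot))$ be the unique $\mathcal{F}_t$-adapted solution to the adjoint equation. Then
$$E\Big\{\Big(\frac{\partial H_2}{\partial u_2} + E\Big[\frac{\partial H_2}{\partial \hat u_2}\,\Big|\,\mathcal{G}^1_t\Big]\Big)\Big|_{u_2 = u_2^*(t)}\ \Big|\ \mathcal{G}^2_t\Big\} = 0, \quad \text{a.e. } t \in [0, T]. \tag{5.42}$$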
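Now we come back to the LQ problem. Define the Hamiltonian
$$\begin{aligned}
H_2(t, x^{u_2}, u_2, \varphi, \beta; y, z, \tilde z, p) ={}& \big\langle y,\ A(t)x^{u_2} + (\tilde A(t) - A(t))\hat x^{\hat u_2} + \tilde F_1(t)\varphi + \tilde B_1(t)\beta + B_2(t)u_2 + (\tilde B_2(t) - B_2(t))\hat u_2\big\rangle \\
& + \big\langle z,\ C(t)x^{u_2} + \tilde F_5(t)\hat x^{\hat u_2} + \tilde{\tilde B}_1^\top(t)\varphi + \tilde{\tilde D}_1(t)\beta + D_2(t)u_2 + \tilde F_2(t)\hat u_2\big\rangle \\
& + \big\langle \tilde z,\ \tilde C(t)x^{u_2} + (\tilde{\tilde C}(t) - \tilde C(t))\hat x^{\hat u_2} + \tilde B_1^\top(t)\varphi + \tilde F_3(t)\beta + \tilde D_2(t)u_2 + (\tilde{\tilde D}_2(t) - \tilde D_2(t))\hat u_2\big\rangle \\
& + \big\langle p,\ \tilde A^\top\varphi + \tilde{\tilde C}^\top\beta + \tilde F_4\hat u_2\big\rangle + \frac{1}{2}\Big[\big\langle Q_2x^{u_2}, x^{u_2}\big\rangle + \big\langle N_2u_2, u_2\big\rangle\Big].
\end{aligned} \tag{5.43}$$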
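Applying the general theorem, if there exists an $\mathcal{F}_t$-adapted optimal control $u_2^*(\cdot) \in \mathcal{U}_2$ for the leader, then
$$\begin{aligned}
0 ={}& N_2(t)u_2^*(t) + \tilde F_4^\top(t)\hat p(t) + B_2^\top(t)y(t) + (\tilde B_2(t) - B_2(t))^\top\hat y(t) + D_2^\top(t)z(t) \\
& + \tilde F_2^\top(t)\hat z(t) + \tilde D_2^\top(t)\tilde z(t) + (\tilde{\tilde D}_2(t) - \tilde D_2(t))^\top\hat{\tilde z}(t), \quad \text{a.e. } t \in [0, T],
\end{aligned} \tag{5.44}$$
where the $\mathcal{F}_t$-adapted quadruple $(p(\cdot), y(\cdot), z(\cdot), \tilde z(\cdot))$ satisfies the conditional mean-field FBSDE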
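$$\begin{cases}
dp(t) = \big[\tilde A(t)p(t) + \tilde F_1^\top(t)y(t) + \tilde{\tilde B}_1(t)z(t) + \tilde B_1(t)\tilde z(t)\big]\,dt \\
\qquad\quad + \big[\tilde{\tilde C}(t)p(t) + \tilde B_1^\top(t)y(t) + \tilde{\tilde D}_1^\top(t)z(t) + \tilde F_3^\top(t)\tilde z(t)\big]\,d\tilde W(t), \\
-dy(t) = \big[A^\top(t)y(t) + (\tilde A(t) - A(t))^\top\hat y(t) + C^\top(t)z(t) + \tilde F_5^\top(t)\hat z(t) + \tilde C^\top(t)\tilde z(t) \\
\qquad\quad + (\tilde{\tilde C}(t) - \tilde C(t))^\top\hat{\tilde z}(t) + Q_2(t)x^*(t)\big]\,dt - z(t)\,dW(t) - \tilde z(t)\,d\tilde W(t), \\
p(0) = 0, \qquad y(T) = G_2x^*(T).
\end{cases} \tag{5.45}$$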
——————
Thanks!
——————