MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity...
Transcript of MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity...
![Page 1: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/1.jpg)
MC2019 Part 2Optimal control for stochastic systems.
Hirokazu Tanaka
![Page 2: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/2.jpg)
Optimal control for stochastic systems.
In this lecture, we will learn:• Feedforward and feedback control• Trial-by-trial variability• Signal-dependent noise• Dynamic programming• Bellman’s optimality equation• Linear-quadratic-Gaussian (LQG) control• Optimal feedback control (OFC) model• Human psychophysics
![Page 3: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/3.jpg)
Straight paths and bell-shaped velocity in reaching.
Morasso (1981) Exp Brain Res
12
3 4 5
6
joint angles
angular velocity
angular accel.
hand speed
T1->T4 T3->T5 T2->T5 T1->T5
![Page 4: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/4.jpg)
Power laws for curved movements.
Lacquaniti et al. (1983) Acta Psychologica
23θ κ∝
13v κ
−∝
: angular velocity : h: curvature
and speedvθ
κ
or
curvature
angu
lar v
eloc
ity
![Page 5: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/5.jpg)
Fitts’ law for movement duration in rapid pointing movements.
Fitts (1954) J Exp Psychol
22logf
Dt a bW
= +
![Page 6: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/6.jpg)
“Main sequence” for saccadic movements
Bahill et al. (1975) Math Biosci
![Page 7: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/7.jpg)
Optimality principle: how a unique trajectory is chosen.
http://en.wikipedia.org/wiki/Snell%27s_law
Snell’s law
Fermat’s principle of least time
1 1 1
2 2 2
sin 1/sin 1/
v nv n
θθ
= =
( ) ( ) ( )Q Q
0 P P
1T dsT s dt n s dsv s c
⋅ = = = ∫ ∫ ∫
( ) ( )Q
P
1 0T s n s dsc
δ δ⋅ = = ∫
![Page 8: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/8.jpg)
Feedforward and feedback control.
Feedforward control as a function of time step:
1k k k+ = +x Ax Bu
( )k k k=u u
Feedback control as a function of state:
( )k k k=u u x
For a deterministic system, feedforward and feedback control are equivalent. On the other hand, for a stochastic system, feedforward and feedback control are NOT equivalent.
![Page 9: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/9.jpg)
Signal-dependent noise: trial-by-trial variability.
van Beers et al. (2004) J Neurophysiol;Schmidt et al. (1979) Psychol Rev; Jones et al. (2002) J Neurophysiol
Variability in reaching endpoints Variability in force production
( ) ( )noise 0, 11 ,ξξ= +u u
noise noise TE , Cov = = u u u uuSignal dependent noise
![Page 10: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/10.jpg)
Minimum-variance control predicts smooth trajectory.
Harris & Wolpert (1998) Nature
( )1 1n n n nξ+ = + +x Ax Bu
( )( ) ( )
( )
1 1 1
22 2 2 1 1
1
00
1
1
1 1
1n
nk
k
n n n n
n n n n n
n kk
ξ
ξ ξ
ξ
− − −
− −
−
=
− − −
− −
= + +
= + + + +
=
= + +∑
x Ax Bu
A x ABu Bu
A x A Bu
[ ] 0
1
0
1E n kn
nn
kk
− −−
=
= +∑x A x A Bu
[ ] ( )T1 T1
0
T 1Cov n k n kn k k
n
k
−
=
− − − −= ∑x A Bu u B A
![Page 11: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/11.jpg)
Minimum-variance control predicts smooth trajectory.
Harris & Wolpert (1998) Nature
[ ] ( )0
11
0E
nn
kn k
nk
f ft t T−
=
− −= + = ≤ ≤ +∑x A x A Bu x
[ ] ( )T1 T T1
11110
1Covf f
f f
Tn k n
t t T n
kt n t k
n kt
k+
− − −
=
−+ −
= =
= ∑ ∑ ∑x A Bu u B A
Minimize the variance of final position (quadratic with respect to u)
under constraints (T+1 constraints with respect to u)
This problem can be solved with quadratic programming (quadprogcommand).
![Page 12: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/12.jpg)
Minimum-variance control predicts smooth trajectory.
Harris & Wolpert (1998) Nature
Saccade velocity
Reaching and drawing
![Page 13: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/13.jpg)
Matlab code for the minimum-variance model.
Harris & Wolpert (1998) Nature
t1 = 224/1000; % time const of eye dynamics t2 = 13/1000; % another time const of eye dynamicstm = 10/1000;
dt = 1/1000; % simulation time step
tf = 50/1000; % movement durationtp = 20/1000; % post-movement durationK = tf/dt;L = tp/dt;
x0 = [0; 0; 0]; % initial statexf = [10; 0; 0]; % final state
Ac = [0 1 0; ...0 0 1; ...-1/(t1*t2*tm) -1/(t1*t2)-1/(t1*tm)-1/(t2*tm) -1/t1-1/t2-
1/tm];Bc = [0; 0; 1];A = expm(Ac*dt);B = inv(Ac)*(eye(3)-expm(Ac*dt))*Bc;
Q = zeros(K+L);
% calculation of Qfor ell=0:K+L-1
if ell<Kfor k=K:K+L-1
tmpQ = A^(k-ell-1)*B*B'*A'^(k-ell-1);Q(ell+1, ell+1) = Q(ell+1, ell+1) + tmpQ(1,1);
endelse
for k=ell+1:K+L-1tmpQ = A^(k-ell-1)*B*B'*A'^(k-ell-1);Q(ell+1, ell+1) = Q(ell, ell) + tmpQ(1,1);
endend
End
% calculation of CC = [];for p=1:L+1
Ctmp = [];for q=1:K+L
if K-1-(q-1)+(p-1)>0Ctmp = [Ctmp A^(K-1-(q-1)+(p-1))*B];
elseif K-1-(q-1)+(p-1)==0Ctmp = [Ctmp B];
elseCtmp = [Ctmp zeros(3,1)];
endendC = [C; Ctmp];
end
% calculation of dd = [];for ell=0:L
d = [d; xf-A^(K+ell)*x0];end
% solution by quadratic programmingu = quadprog(Q, [], [], [], C, d);
% forward solutionx = zeros(3, K+L);x(:,1) = x0;for k=1:K+L-1
x(:,k+1) = A*x(:,k) + B*u(k);end
figure(1);subplot(211); hold on; plot(x(1,:), 'k', 'linewidth', 2);set(gca, 'plotboxaspectratio', [1.5 1 1]);xlabel('time (ms)'); ylabel('eye position (deg)');subplot(212); hold on; plot(x(2,:), 'k', 'linewidth', 2)set(gca, 'plotboxaspectratio', [1.5 1 1]);xlabel('time (ms)'); ylabel('eye velocity (deg/s)');
![Page 14: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/14.jpg)
Matlab code for the minimum-variance model.
Harris & Wolpert (1998) Nature
![Page 15: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/15.jpg)
Minimum-time control predicts Fitts’ law and main sequence.
Tanaka et al. (2006) J Neurophysiol
Faster, but less accurate
300ms 700ms
More precise, but slower
500ms
Unique movement time can be determined
[ ]
[ ]
MT ,
T
1 Var (t
( ) E ( )
{ }
)t tf p
f
t tf pf
f
p
t f
t f
f
V d
dt t
tt
S t
t
t
λ+
+
=
+
−
+
− ∫
∫ μ x x
u
x
minimization of movement time
final variance constraint
final position constraint
![Page 16: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/16.jpg)
Minimum-time control predicts Fitts’ law and main sequence.
Tanaka et al. (2006) J Neurophysiol
Fitts’ law
Main sequence
Model Experiment
![Page 17: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/17.jpg)
Optimal control: Bellman’s optimality equation.
( ) ( )1
0 1 1 00
1 1, , , ;2 2
NT T T
N k k k k k N N Nk
J−
−=
= + +∑u u u x x Q x u Ru x Q x
1k k k+ = +x Ax Bu
Error during movement
Control cost Endpoint error
Find control signals {u0, u1, …, uN-1} that minimize the cost function.
Deterministic (i.e., noiseless) dynamics:
Cost function:
![Page 18: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/18.jpg)
Optimal control: Pontryagin’s minimum principle.
State x
Time stepNN-1N-2N-30 1 2 3
u0
u1 u2u3
uN-3
uN-2uN-1
x0
x1
x2 x3
xN-3 xN-2
xN-1
xN
( ) ( )1
0 1 1 00
1 1, , , ;2 2
NT T T
N k k k k k N N Nk
J−
−=
= + +∑u u u x x Q x u Ru x Q x
The control signals {u0, u1, …, uN-1} are optimized as a whole.
![Page 19: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/19.jpg)
Optimal control: Bellman’s optimality equation.
State x
Time stepNN-1N-2N-30 1 2 3
u0
u1 u2u3
x0
x1
x2 x3
n n+1 n+2
(cost at state xn at step n) = (cost of moving xn to xn+1) + (cost at state xn+1 at step 𝑛𝑛 + 1)
( )1 1n nV + +x( )n nV x
![Page 20: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/20.jpg)
Optimal control: Bellman’s optimality equation.
( ){ }
( )1 1
1
, , ,
1 12 2min
n n N
NT T T
n n k k k k k N N Nk n
V+ −
−
=
= + + ∑
u u ux x Q x u Ru x Q x
( ){ }
( )
( ){ }
( )1
1 1
1
1
,
,
,
1
,
,
1
1 12 2
1 1 1 2 2 2
min
min min
n N
nn N
n
NT T T
n n k k k k k N N Nk n
NT T T T Tn n n n n k k k k k N N N
nk
V
+ −
+ −
−
=
+
−
=
= + +
= + + + +
∑
∑
u u u
u u u
x x Q x u Ru x Q x
x Q x u Ru x Q x u Ru x Q x
( ) ( )1 112min
n
T Tn n n n n n nV + +
= + + ux Q x u Ru x
( )1 1n nV + +x
( ) ( ) ( )1 112min
n
T Tn n n n n n n n nV V + +
= + + ux x Q x u Ru x
![Page 21: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/21.jpg)
Solvable example: Linear-Quadratic-Regulator control.
( ) ( )
( ) ( ) ( )
1 1 1
1
1 12 21 1 2 2
min
min
T T Tk k k k k k k k k
u
TT Tk k k k k k k k k k
u
V + + +
+
= + + = + + + +
x x Q x u Ru x S x
x Q x u Ru Ax Bu S Ax Bu
( )
( )1 1
1 1
102
T T T T T T Tk k k k k k k k k k
T Tk k k k
+ +
+ +
∂ = + + + ∂
= + +
u Ru u B ABu x A S Bu u B S Axu
R B S B u B S Ax
( ) 1
1 1T T
k k k k k k
−
+ += − + = −u R B S B B S Ax L x
Assume that the cost-to-go function has a quadratic form:
( )1 1 1 112
Tk k k kV + + + +=x x S x
Substituting this into the Bellman equation gives
Then, by minimizing with respect to u,
A feedback control law is obtained:
![Page 22: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/22.jpg)
Solvable example: Linear-Quadratic-Regulator control.
( ) 1
1 1T T
k k k k k k
−
+ += − + = −u R B S B B S Ax L x
( ) ( ) 11 11
1 12 2
TT T
k k k k k k k kk
V−− −
+ = = + +
x x S x x Q A S BR B A x
( ) 11 11
T Tk k k
−− −+= + +S Q A S BR B A
Substituting the control law
into the Bellman equation gives
Therefore a backward recursive equation for the matrix S is obtained.
0
S0
L0
1
S1
L1
2
S2
k-1
Sk-1
Lk-1
k
Sk
Lk
k+1
Sk+1
Lk-2
N-1
SN-1
LN-1
N
SN
LN-2
![Page 23: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/23.jpg)
Solvable example: Linear-Quadratic-Regulator control.
k k k= −u L x
( ) 1
1 1T T
k k k
−
+ += +L R B S B B S A
( ) 11 11
T Tk k k
−− −+= + +S Q A S BR B A
N N=S Q
Backward recursion equation for the matrix S:
Feedback control law:
where
with the terminal condition
Therefore, once the matrices {S0, …, SN} and {L0, …, LN-1} are obtained, the optimal feedback control signals {u0, …, uN-1} can be computed.
![Page 24: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/24.jpg)
Deterministic and stochastic optimal control.
( ) ( )1
0 1 1 00
1 1, , , ;2 2
NT T T
N k k k k k N N Nk
J−
−=
= + +∑u u u x x Q x u Ru x Q x
( ) ( )1
0 1 10 ,
0 01 1ˆ, , , ; E,2 2
NT T T
N k k k k k N N Nk
J−
−=
Σ = + + ∑
w v
u u u x x Q x u Ru x Q x
1k k k
k k
+ = + =
x Ax Buy Cx
1k k k
k kk
k+ = + ++ =
wv
x Ax Buy Cx
Deterministic Optimal Control
Stochastic Optimal Control
![Page 25: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/25.jpg)
Stochastic optimal control combines Kalman filter and feedback control.
ˆk k k= −u L x
( )1ˆ ˆ ˆk k k k k k+ = + + −x Ax Bu K y Cx
( ) ( )1
0 1 10 ,
0 01 1ˆ, , , ; E,2 2
NT T T
N k k k k k N N Nk
J−
−=
Σ = + + ∑
w v
u u u x x Q x u Ru x Q x
1k k k
k kk
k+ = + ++ =
wv
x Ax Buy Cx
Stochastic Optimal Control
Optimal state estimation (Kalman Filter)
Feedback controller
![Page 26: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/26.jpg)
Optimal feedback control as a model of motor control.
Todorov & Jordan (2002) Nature Neurosci
( ) ( )1
0 1 10 ,
0 01 1ˆ, , , ; , E2 2
NT T T
N k k k k k N N Nk
Jε
−
−=
Σ = + + ∑
ω
u u u x x Q x u Ru x Q x
1 k
k
k k k k
k k
ε+ = + += +
x Ax Bu Cy Hx ω
u
![Page 27: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/27.jpg)
Optimal feedback control as a model of motor control.
Todorov & Jordan (2002) Nature Neurosci; Todorov (2005) Neural Comput
Kalman filter:( )1ˆ ˆ ˆk k k k k k+ = + + −x Ax Bu K y Hx
( )( )
( ) ( )
1T T
T T T1 1 1
1ˆ T ˆ ˆ
1
;
ˆ ˆ;
k k k
k k k k k k
T Tk k k k k k
−
+
+
= +
= − + =
= + − − =
Σ Σ
Σ Σ Σ Σ Σ
Σ Σ Σ Σ
e
x x x
e ω
e e x e
e
K A H H
A K H A CL L C
K H A A BL A BL x
Ω
x
H
Feedback control:ˆk k k= −u L x
( )( )( )( ) ( )
1T T T1 1 1 1
T1
TT1 1
;
;
k
N
N
k k k k
k k k k N
k k k k k k
Q
−
+ + + +
+
+ +
= + + +
= + − =
= + − − =
e
e e e
x x x
x x x
x
L B S B R C S S C B S A
S Q A S A BL S
S A S BL A K H S A K H S 0
![Page 28: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/28.jpg)
小脳内部順モデル予測
運動野フィードバック制御
頭頂葉カルマンフィルタ
身体・環境
体性感覚野感覚処理
大脳基底核評価関数の設定
predictionx̂
filterx̂
yfilterx̂filterˆ= −u Lx
u
Computational neuroanatomy for motor control.
Shadmehr & Krakauer (2008); Haar & Donchin (2019)
![Page 29: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/29.jpg)
Matlab code: optimal feedback control model.
Todorov & Jordan (2002) Nature Neurosci; Todorov (2005) Neural Comput
for iter = 1:MaxIter
% initialize covariancesSiE = S1;SiX = X1*X1';SiXE = zeros(szX,szX);
% forward pass - recompute Kalman filter for k = 1:N-1
% compute Kalman gaintemp = SiE + SiX + SiXE + SiXE';if size(D,2)==1,
DSiD = diag(diag(temp).*D.^2);else
DSiD = zeros(szY,szY);for i=1:szD
DSiD = DSiD + D(:,:,i)*temp*D(:,:,i)';end;
end;K(:,:,k) = A*SiE*H'*pinv(H*SiE*H'+D0*D0'+DSiD);
% compute new SiEnewE = E0*E0' + C0*C0' + (A-K(:,:,k)*H)*SiE*A';LSiL = L(:,:,k)*SiX*L(:,:,k)';if size(C,2)==1,
newE = newE + B*diag(diag(LSiL).*C.^2)*B';else
for i=1:szCnewE = newE + B*C(:,:,i)*LSiL*C(:,:,i)'*B';
end;end;
% update SiX, SiE, SiXESiX = E0*E0' + K(:,:,k)*H*SiE*A' + (A-B*L(:,:,k))*SiX*(A-B*L(:,:,k))' + ...
(A-B*L(:,:,k))*SiXE*H'*K(:,:,k)' + K(:,:,k)*H*SiXE'*(A-B*L(:,:,k))';SiE = newE;SiXE = (A-B*L(:,:,k))*SiXE*(A-K(:,:,k)*H)' - E0*E0';
end;
% first pass initializationif iter==1,
if Init==0, % open loopK = zeros(szX,szY,N-1);
elseif Init==2, % randomK = randn(szX,szY,N-1);
end;end;
% initialize optimal cost-to-go functionSx = Q(:,:,N);Se = zeros(szX,szX);Cost(iter) = 0;
% backward pass - recompute control policyfor k=N-1:-1:1
% update CostCost(iter) = Cost(iter) + trace(Sx*C0*C0') + ...
trace(Se*(K(:,:,k)*D0*D0'*K(:,:,k)' + E0*E0' + C0*C0'));
% Controllertemp = R + B'*Sx*B;BSxeB = B'*(Sx+Se)*B;if size(C,2)==1,
temp = temp + diag(diag(BSxeB).*C.^2);else
for i=1:size(C,3)temp = temp + C(:,:,i)'*BSxeB*C(:,:,i);
end;end;L(:,:,k) = pinv(temp)*B'*Sx*A;
% compute new SenewE = A'*Sx*B*L(:,:,k) + (A-K(:,:,k)*H)'*Se*(A-K(:,:,k)*H);
% update Sx and SeSx = Q(:,:,k) + A'*Sx*(A-B*L(:,:,k));KSeK = K(:,:,k)'*Se*K(:,:,k);if size(D,2)==1,
Sx = Sx + diag(diag(KSeK).*D.^2);else
for i=1:szDSx = Sx + D(:,:,i)'*KSeK*D(:,:,i);
end;end; Se = newE;
end;
% adjust costCost(iter) = Cost(iter) + X1'*Sx*X1 + trace((Se+Sx)*S1);
% progress barif ~rem(iter,10),
fprintf('.');end;
% check convergence of Costif (Niter>0 & iter>=Niter) | ...
(Niter==0 & iter>1 & abs(Cost(iter-1)-Cost(iter))<Eps) | ...(Niter==0 & iter>20 & sum(diff(dist(iter-10:iter))>0)>3),break;
end;end;
![Page 30: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/30.jpg)
Matlab code: optimal feedback control model.
Todorov & Jordan (2002) Nature Neurosci; Todorov (2005) Neural Comput
![Page 31: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/31.jpg)
Infinite-horizon optimal feedback control.
Phillis (1985) IEEE Trans Auto Contr; Qian et al. (2013) Neural Comput
( ) ,d dd dt dd dt d
β γ ωξ
= +
+
+ + +
=
x Yu Gy Cx
x Bu xD
A F
( )T T T1 2 0
1lim E E limt
t tJ J J dt
t→∞ →∞
= + = + + ∫x Ux x Qx u Ru
Stochastic dynamics with signal- and state-dependent noises:
Infinite-horizon cost:
( ) ( ).
ˆ ,ˆˆˆd dt d dt= + + −
= −
x Ax Bu yu
xL
Cx
K
In infinite-horizon optimal feedback control, the Kalman gain (K) and the feedback gain (L) are time invariant.
![Page 32: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/32.jpg)
Bimanual coordination explained by OFC model.
Dietrichsen (2007) Curr Biol
Two cursor condition One cursor condition
Model
Experiment
![Page 33: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/33.jpg)
Motor adaptation as reoptimization.
Izawa et al. (2008) J Neurosci
Sigmoidal pathsModel Experiment
Field uncertaintyModel Experiment
![Page 34: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/34.jpg)
Movement as a real-time decision making process.
Nashed et al. (2014) J Neurosci
![Page 35: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/35.jpg)
Learning optimal control policy without knowing dynamics.
Jiang & Jiang (2014) Biol Cybern
ii
id dt dt η= + + ∑x Ax Bu B C u
( )0
T TJ dt∞
= +∫ x Qx u Ru
= −u Kx
10 T −= + − +A P PA Q PBR BP1 T−=K R B P
Algebraic Riccatti equation:
The Riccatti equation for the optimal feedback gain requires explicit knowledge of the forward model (the matrices A and B)!
![Page 36: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/36.jpg)
Learning optimal control policy without knowing dynamics.
Jiang & Jiang (2014) Biol Cybern
10 T −= + + −A P PA Q PBR BP
( ) ( )0 Tk k k k k k= − + − + −A BK P P A BK Q K RK
11
Tk k
−+ =K R B P
Algebraic Riccatti equation:
Iterative solution to the Riccatti equation:
1) A − BK is Hurwitz.2) 𝑃𝑃∗ ≤ 𝑃𝑃𝑘𝑘+1 ≤ 𝑃𝑃𝑘𝑘3) 𝐾𝐾𝑘𝑘 → 𝐾𝐾∗ and 𝑃𝑃𝑘𝑘 → 𝑃𝑃∗
![Page 37: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/37.jpg)
Learning optimal control policy without knowing dynamics.
Jiang & Jiang (2014) Biol Cybern
10 T −= + − +A P PA Q PBR BP
( ) ( )0 Tk k k k k k= − + − − +A BK P P A BK Q K RK
11
Tk k
−+ =K R B P
Algebraic Riccatti equation:
Iterative solution to the Riccatti equation:
1) A − BK is Hurwitz.2) 𝑃𝑃∗ ≤ 𝑃𝑃𝑘𝑘+1 ≤ 𝑃𝑃𝑘𝑘3) 𝐾𝐾𝑘𝑘 → 𝐾𝐾∗ and 𝑃𝑃𝑘𝑘 → 𝑃𝑃∗
![Page 38: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/38.jpg)
Learning optimal control policy without knowing dynamics.
Jiang & Jiang (2014) Biol Cybern
10 T −= + + −A P PA Q PBR BP
Problem: In determining optimal policy, the system matrices need to be known:
How can optimal policy be derived without assuming the direct knowledge of the system matrices?
Solution: Note that
( )
10 T T
T T T
dt
d dt
− = + + − = + +
x A P PA Q PBR x
x Px x Q K RK x
BP
Therefore, the knowledge of 𝐀𝐀,𝐁𝐁 is replaced with the knowledge of 𝑑𝑑𝐱𝐱.
![Page 39: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/39.jpg)
Learning optimal control policy without knowing dynamics.
Jiang & Jiang (2014) Biol Cybern
By integrating over an infinitesimal interval:
( )0 T T Td dt = + + x Px x Q K RK x
( ) ( ) ( ) ( )tT T t T
t
Tt t t tt dttδ
δ δ+
+ + − = − + ∫x Px x Px x Q K RK x
By integrating over an infinitesimal interval:
Therefore, without knowing the system dynamics, the matrices Pand K are determined.
![Page 40: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/40.jpg)
Learning optimal control policy without knowing dynamics.
Jiang & Jiang (2014) Biol Cybern
![Page 41: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/41.jpg)
Learning optimal control policy without knowing dynamics.
Jiang & Jiang (2014) Biol Cybern
![Page 42: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/42.jpg)
Learning optimal control policy without knowing dynamics.
Jiang & Jiang (2014) Biol Cybern
![Page 43: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/43.jpg)
Summary
• Feedforward and feedback control differ in stochastic dynamics.
• Signal-dependent control noise contributes to trial-by-trial motor variability.
• Movement accuracy under signal-dependent noise models human movements.
• Optimal feedback control (OFC) integrates state estimation and motor control in a single computational framework.
• A number of psychophysical experiments can be explained by the OFC model.
![Page 44: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/44.jpg)
References
• Schmidt, R. A., Zelaznik, H., Hawkins, B., Frank, J. S., & Quinn Jr, J. T. (1979). Motor-output variability: a theory for the accuracy of rapid motor acts. Psychological review, 86(5), 415.
• van Beers, R. J., Haggard, P., & Wolpert, D. M. (2004). The role of execution noise in movement variability. Journal of Neurophysiology, 91(2), 1050-1063.
• Harris, C. M., & Wolpert, D. M. (1998). Signal-dependent noise determines motor planning. Nature, 394(6695), 780-784.
• Tanaka, H., Krakauer, J. W., & Qian, N. (2006). An optimization principle for determining movement duration. Journal of neurophysiology, 95(6), 3875-3886.
• Todorov, E., & Jordan, M. I. (2002). Optimal feedback control as a theory of motor coordination. Nature neuroscience, 5(11), 1226-1235.
• Todorov, E. (2005). Stochastic optimal control and estimation methods adapted to the noise characteristics of the sensorimotor system. Neural computation, 17(5), 1084-1108.
• Athans, M. (1967). The matrix minimum principle. Information and control, 11(5), 592-606.
• Phillis, Y. A. (1985). Controller design of systems with multiplicative noise. Automatic Control, IEEE Transactions on, 30(10), 1017-1019.
• Qian, N., Jiang, Y., Jiang, Z. P., & Mazzoni, P. (2013). Movement duration, fitts's law, and an infinite-horizon optimal feedback control model for biological motor systems. Neural computation, 25(3), 697-724.
• Diedrichsen, J. (2007). Optimal task-dependent changes of bimanual feedback control and adaptation. Current Biology, 17(19), 1675-1679.
• Izawa, J., Rane, T., Donchin, O., & Shadmehr, R. (2008). Motor adaptation as a process of reoptimization. The Journal of Neuroscience, 28(11), 2883-2891.
• Nashed, J. Y., Crevecoeur, F., & Scott, S. H. (2014). Rapid online selection between multiple motor plans. The Journal of Neuroscience, 34(5), 1769-1780.
• Jiang, Y., & Jiang, Z. P. (2014). Adaptive dynamic programming as a theory of sensorimotor control. Biological cybernetics, 108(4), 459-473.
![Page 45: MC2019 Part 2 Optimal control for stochastic systems. · Straight paths and bell -shaped velocity in reaching. Morasso (1981) Exp Brain Res 2. 1. 3. 4. 5. 6. joint angles. angular](https://reader033.fdocuments.in/reader033/viewer/2022043017/5f3994dccd96163edf5b61fb/html5/thumbnails/45.jpg)
Exercise
• Write a Matlab code of the minimum-variance model for a trajectory with given initial and final positions.
• Write a Matlab code of the optimal feedback control for a trajectory with given initial and final positions.
• Investigate how the minimum-variance model and the optimal feedback model differ.