8/16/2019 Ca07 GPforSDEs Talk
http://slidepdf.com/reader/full/ca07-gpforsdes-talk 1/18
Gaussian Process Approximations
of Stochastic Differential Equations
Cédric Archambeau
Centre for Computational Statistics and Machine Learning
University College London
c.archambeau@cs.ucl.ac.uk
CSML 2007 Reading Group on SDEs
Joint work with Manfred Opper (TU Berlin), John Shawe-Taylor (UCL) and Dan Cornford (Aston).
Context: numerical weather prediction (data assimilation)
• Model equations for the atmosphere are ordinary differential equations
(deterministic)
• Stochasticity arises from:
• Dynamics simplifications
• Discretisation of continuous process
• Physics (unknown parameters)
• Observations are satellite radiances (indirect, possibly non-linear)
• Measurement error
• Decision making must take the uncertainty into account!
Issues with the current methods
[Figure: double-well potential; ensemble Kalman smoother (Eyink et al., 2002)]
Motivation
Data Assimilation:
• Improve current prediction tools
• Current methods cannot exploit large data sets
• Current methods fail on (highly) non-linear data
• Parameters are unknown (model, noise, observation operator)
Machine Learning:
• Deal with uncertainty in a principled way
• Discover optimal data representation:
• find a sparse solution
• learn the kernel
• …
• Deal with non-linear data
• Deal with huge amount of data
• Deal with high dimensional data
• Deal with large noise
Outline
• Diffusion processes
• Gaussian approximation of the prior process
• Approximate posterior process after observing the data
• Smoothing algorithm (E-step)
• Parameter estimation (M-step)
• Conclusion
Continuous-time continuous-state stochastic processes
• Stochastic differential equation for a diffusion process:
• To be interpreted as an (Itô) stochastic integral:
Wiener Process:
• Almost surely continuous
• Almost surely non-differentiable
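The equations on this slide did not survive extraction; the standard form of a diffusion SDE and its Itô-integral reading, in the notation (drift f, diffusion matrix D, Wiener process W) used in the rest of the talk, would be:

```latex
% SDE for a diffusion process (drift f, diffusion matrix D):
dx_t = f(x_t)\,dt + D^{1/2}\,dW_t
% Interpreted as the Ito stochastic integral:
x_t = x_0 + \int_0^t f(x_s)\,ds + \int_0^t D^{1/2}\,dW_s
```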
Describing diffusion processes
Fokker-Planck equation:
• Time evolution of the transition density
• Drift: instantaneous rate of change of the mean
• Diffusion: instantaneous rate of change of squared fluctuations
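The Fokker-Planck equation itself is missing from the transcript; for the diffusion above it reads (standard form, with drift f and constant diffusion matrix D):

```latex
\frac{\partial p(x,t)}{\partial t}
  = -\sum_i \frac{\partial}{\partial x_i}\bigl[f_i(x)\,p(x,t)\bigr]
    + \frac{1}{2}\sum_{i,j} D_{ij}\,
      \frac{\partial^2 p(x,t)}{\partial x_i\,\partial x_j}
```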
Stochastic differential equation:
• Depends on the same drift and diffusion terms
Alternative representation of the transition density:
• A (non-)linear SDE leads to a (non-)Gaussian transition density
Kernel representation:
• Captures correlations between data
• Particular SDE induces a specific (two-time) kernel
A time-varying approximation of an SDE is equivalent to learning the kernel (hot topic!)
Gaussian approximation of the non-Gaussian process
• Time varying linear approximation of the drift:
• Gaussian marginal density:
• Time evolution of the means and the covariances (consistency):
(see for example second CSML reading group on SDEs)
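The approximation equations did not survive extraction; in the published version of this work (Archambeau et al., 2007) the time-varying linear drift and the induced moment ODEs are:

```latex
% Linear approximation of the drift:
f_L(x, t) = -A(t)\,x + b(t)
% Gaussian marginal: q_t(x) = N(x;\, m(t),\, S(t))
% Consistency ODEs for the mean and covariance:
\frac{dm(t)}{dt} = -A(t)\,m(t) + b(t)
\qquad
\frac{dS(t)}{dt} = -A(t)\,S(t) - S(t)\,A(t)^{\top} + D
```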
Optimal (Gaussian) approximation of the prior process
• Euler-Maruyama discrete approximation:
• Probability density of a discrete-time sample path:
• Optimal prior process:
… path integral Kullback-Leibler divergence!
Same stochastic noise!
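A hedged reconstruction of the missing formulas: with step size Δt, the Euler-Maruyama scheme, the resulting density of a discrete-time sample path, and the continuous-time limit of the KL divergence between the approximating process q (drift f_L) and the prior process p (drift f) are

```latex
x_{k+1} = x_k + f(x_k)\,\Delta t + D^{1/2}\sqrt{\Delta t}\,\epsilon_k,
\qquad \epsilon_k \sim N(0, I)
% Density of a discrete-time sample path:
p(x_{1:K} \mid x_0) = \prod_{k=0}^{K-1}
  N\!\bigl(x_{k+1};\; x_k + f(x_k)\,\Delta t,\; D\,\Delta t\bigr)
% Path-integral KL divergence between the two processes:
KL[q \,\|\, p] = \frac{1}{2}\int_0^T
  \mathbb{E}_{q_t}\!\bigl[(f_L(x,t) - f(x))^{\top} D^{-1}
                          (f_L(x,t) - f(x))\bigr]\,dt
```

The divergence stays finite only because both processes share the same diffusion matrix D, which is the "same stochastic noise" remark above.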
Comment on the Kullback-Leibler divergence
Is this measure a good criterion?
Support…
Including the observations
• Posterior process:
• Gaussian likelihood with linear observation operator (for simplicity):
• EM-type training algorithm:
where the lower bound is defined as
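The lower bound itself is not in the transcript; in the standard variational-EM form, with latent path X, observations Y and parameters θ, it would read

```latex
\ln p(Y \mid \theta) \;\ge\; \mathcal{F}(q, \theta)
  = \mathbb{E}_{q}\bigl[\ln p(Y \mid X, \theta)\bigr]
    - KL\bigl[q(X) \,\|\, p(X \mid \theta)\bigr]
% Gaussian likelihood with linear observation operator H:
y_n = H\,x(t_n) + \eta_n, \qquad \eta_n \sim N(0, R)
```

The E-step maximizes the bound over q, the M-step over θ.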
E-step: estimating the optimal (latent) state path
• Find optimal variational functions
• Constrained optimization problem:
• ODE for the marginal means
• ODE for the marginal covariances
• Integrating the Lagrangian by parts and differentiating leads to:
• ODEs for the Lagrange multipliers
• Gradient for the variational functions
Smoothing algorithm:
Fix initial conditions.
Repeat until convergence:
• Forward sweep:
Propagate means and covariances forward in time:
• Backward sweep:
Propagate Lagrange multipliers backward in time (adjoint operation):
Apply jump conditions at observation times:
• Update variational functions:
(scaled gradient step)
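The slide's update equations are missing; as a minimal 1-D sketch of the forward sweep (hypothetical helper `forward_sweep`, piecewise-constant variational functions A, b, and the moment ODEs from slide 8), propagating the marginal mean and variance forward in time could look like:

```python
import numpy as np

def forward_sweep(A, b, m0, S0, D, dt):
    """Euler-integrate the moment ODEs of the Gaussian approximation
    for the 1-D time-varying linear drift f_L(x, t) = -A[k] x + b[k]:
        dm/dt = -A m + b,   dS/dt = -2 A S + D."""
    K = len(A)
    m = np.empty(K + 1)
    S = np.empty(K + 1)
    m[0], S[0] = m0, S0
    for k in range(K):
        m[k + 1] = m[k] + (-A[k] * m[k] + b[k]) * dt
        S[k + 1] = S[k] + (-2.0 * A[k] * S[k] + D) * dt
    return m, S

# Constant A > 0 and b = 0 give an Ornstein-Uhlenbeck process whose
# variance relaxes to the stationary value D / (2 A) = 0.25.
A = np.full(2000, 2.0)
b = np.zeros(2000)
m, S = forward_sweep(A, b, m0=1.0, S0=0.0, D=1.0, dt=0.005)
```

The backward sweep integrates the Lagrange-multiplier ODEs in reverse time over the same grid, applying the jump conditions at observation times.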
Double-well example
[Figure: Gaussian process regression vs. variational Gaussian approximation on double-well data]
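For context, the double-well diffusion can be simulated directly with the Euler-Maruyama scheme from slide 9; a minimal sketch, where the drift f(x) = 4x(1 − x²) and the noise level are illustrative choices rather than values taken from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

def euler_maruyama(f, x0, D, dt, n_steps, rng):
    """Simulate dx = f(x) dt + sqrt(D) dW with the Euler-Maruyama scheme."""
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        x[k + 1] = x[k] + f(x[k]) * dt + np.sqrt(D * dt) * rng.standard_normal()
    return x

# Double-well drift f(x) = 4 x (1 - x^2): stable equilibria at x = +/-1.
# Sample paths linger near one well and occasionally switch wells,
# which is exactly the bimodal behaviour a Gaussian smoother struggles with.
path = euler_maruyama(lambda x: 4.0 * x * (1.0 - x**2), x0=-1.0,
                      D=0.5, dt=0.01, n_steps=5000, rng=rng)
```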
Comparison to MCMC simulation
Hybrid Monte Carlo approach:
• Reference solution
• Generate sample paths from posterior
• Modify scheme in order to increase acceptance rate (molecular dynamics)
• Still requires generating ~100,000 sample paths for good results
• Hard to check convergence
(Y. Shen, Aston)
M-step: learning the parameters by maximum likelihood
• Linear transformation
• Observation noise
• Stochastic noise?
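The update equations are missing from the transcript; for the linear-Gaussian likelihood y_n = H x(t_n) + η_n, η_n ~ N(0, R), the standard maximum-likelihood M-step updates given the approximate marginals q(x(t_n)) = N(m_n, S_n) would be (a hedged reconstruction, not copied from the slide):

```latex
H \leftarrow \Bigl(\sum_n y_n\, m_n^{\top}\Bigr)
             \Bigl(\sum_n \bigl(S_n + m_n m_n^{\top}\bigr)\Bigr)^{-1}
\qquad
R \leftarrow \frac{1}{N}\sum_n
  \mathbb{E}_{q}\bigl[(y_n - H\,x_n)(y_n - H\,x_n)^{\top}\bigr]
```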
When things go wrong…
Conclusion
• Machine Learning for Data Assimilation
• Modeling the uncertainty is essential
• Learning the kernel is a challenging new concern
• Bunch of suboptimal/intermediate good solutions?
• Promising results: quality of the solution & potential extensions
Future work includes:
• High(er) dimensional data
• Check gradient approach (cf. stochastic noise)
• Simplifications when the force (drift) derives from a potential
• Investigate a full variational Bayesian treatment
• Combine the variational approach with MCMC?
Paper available from: www.cs.ucl.ac.uk/staff/C.Archambeau
Research project: VISDEM (multi-site, EPSRC funded)