8/16/2019 Ca07 GPforSDEs Talk
http://slidepdf.com/reader/full/ca07-gpforsdes-talk 1/18
Gaussian Process Approximations
of Stochastic Differential Equations
Cédric Archambeau
Centre for Computational Statistics and Machine Learning
University College London
c.archambeau@cs.ucl.ac.uk
CSML 2007 Reading Group on SDEs
Joint work with Manfred Opper (TU Berlin), John Shawe-Taylor (UCL) and Dan Cornford (Aston).
Context: numerical weather prediction (data assimilation)
• Model equations for the atmosphere are ordinary differential equations
(deterministic)
• Stochasticity arises from:
• Dynamics simplifications
• Discretisation of continuous process
• Physics (unknown parameters)
• Observations are satellite radiances (indirect, possibly non-linear)
• Measurement error
• Decision making must take the uncertainty into account!
Issues with the current methods
[Figure: double-well potential; ensemble Kalman smoother (Eyink et al., 2002)]
Motivation
Data Assimilation:
• Improve current prediction tools
• Current methods cannot exploit large data sets
• Current methods fail on (highly) non-linear data
• Parameters are unknown (model, noise, observation operator)
Machine Learning:
• Deal with uncertainty in a principled way
• Discover optimal data representation:
• find a sparse solution
• learn the kernel
• …
• Deal with non-linear data
• Deal with huge amount of data
• Deal with high dimensional data
• Deal with large noise
Outline
• Diffusion processes
• Gaussian approximation of the prior process
• Approximate posterior process after observing the data
• Smoothing algorithm (E-step)
• Parameter estimation (M-step)
• Conclusion
Continuous-time continuous-state stochastic processes
• Stochastic differential equation for a diffusion process:
• To be interpreted as an (Itô) stochastic integral:
Wiener Process:
• Almost surely continuous
• Almost surely non-differentiable
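The equations on this slide did not survive extraction; the standard form of a diffusion SDE and its Itô-integral reading, in the notation (drift f, diffusion matrix D, Wiener process W) used in the rest of the talk, would be:

```latex
% SDE for a diffusion process (drift f, diffusion matrix D):
dx_t = f(x_t)\,dt + D^{1/2}\,dW_t
% Interpreted as the Ito stochastic integral:
x_t = x_0 + \int_0^t f(x_s)\,ds + \int_0^t D^{1/2}\,dW_s
```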
Describing diffusion processes
Fokker-Planck equation:
• Time evolution of the transition density
• Drift: instantaneous rate of change of the mean
• Diffusion: instantaneous rate of change of squared fluctuations
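The Fokker-Planck equation itself is missing from the transcript; for the diffusion above it reads (standard form, with drift f and constant diffusion matrix D):

```latex
\frac{\partial p(x,t)}{\partial t}
  = -\sum_i \frac{\partial}{\partial x_i}\bigl[f_i(x)\,p(x,t)\bigr]
    + \frac{1}{2}\sum_{i,j} D_{ij}\,
      \frac{\partial^2 p(x,t)}{\partial x_i\,\partial x_j}
```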
Stochastic differential equation:
• Depends on the same drift and diffusion terms
Alternative representation of the transition density:
• A (non-)linear SDE leads to a (non-)Gaussian transition density
Kernel representation:
• Captures correlations between data
• Particular SDE induces a specific (two-time) kernel
A time-varying approximation of an SDE is equivalent to learning the kernel (hot topic!)
Gaussian approximation of the non-Gaussian process
• Time varying linear approximation of the drift:
• Gaussian marginal density:
• Time evolution of the means and the covariances (consistency):
(see for example second CSML reading group on SDEs)
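The approximation equations did not survive extraction; in the published version of this work (Archambeau et al., 2007) the time-varying linear drift and the induced moment ODEs are:

```latex
% Linear approximation of the drift:
f_L(x, t) = -A(t)\,x + b(t)
% Gaussian marginal: q_t(x) = N(x;\, m(t),\, S(t))
% Consistency ODEs for the mean and covariance:
\frac{dm(t)}{dt} = -A(t)\,m(t) + b(t)
\qquad
\frac{dS(t)}{dt} = -A(t)\,S(t) - S(t)\,A(t)^{\top} + D
```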
Optimal (Gaussian) approximation of the prior process
• Euler-Maruyama discrete approximation:
• Probability density of a discrete-time sample path:
• Optimal prior process:
… path integral Kullback-Leibler divergence!
Same stochastic noise!
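A hedged reconstruction of the missing formulas: with step size Δt, the Euler-Maruyama scheme, the resulting density of a discrete-time sample path, and the continuous-time limit of the KL divergence between the approximating process q (drift f_L) and the prior process p (drift f) are

```latex
x_{k+1} = x_k + f(x_k)\,\Delta t + D^{1/2}\sqrt{\Delta t}\,\epsilon_k,
\qquad \epsilon_k \sim N(0, I)
% Density of a discrete-time sample path:
p(x_{1:K} \mid x_0) = \prod_{k=0}^{K-1}
  N\!\bigl(x_{k+1};\; x_k + f(x_k)\,\Delta t,\; D\,\Delta t\bigr)
% Path-integral KL divergence between the two processes:
KL[q \,\|\, p] = \frac{1}{2}\int_0^T
  \mathbb{E}_{q_t}\!\bigl[(f_L(x,t) - f(x))^{\top} D^{-1}
                          (f_L(x,t) - f(x))\bigr]\,dt
```

The divergence stays finite only because both processes share the same diffusion matrix D, which is the "same stochastic noise" remark above.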
Comment on the Kullback-Leibler divergence
Is this measure a good criterion?
Support…
Including the observations
• Posterior process:
• Gaussian likelihood with linear observation operator (for simplicity):
• EM-type training algorithm:
where the lower bound is defined as
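The lower bound itself is not in the transcript; in the standard variational-EM form, with latent path X, observations Y and parameters θ, it would read

```latex
\ln p(Y \mid \theta) \;\ge\; \mathcal{F}(q, \theta)
  = \mathbb{E}_{q}\bigl[\ln p(Y \mid X, \theta)\bigr]
    - KL\bigl[q(X) \,\|\, p(X \mid \theta)\bigr]
% Gaussian likelihood with linear observation operator H:
y_n = H\,x(t_n) + \eta_n, \qquad \eta_n \sim N(0, R)
```

The E-step maximizes the bound over q, the M-step over θ.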
E-step: estimating the optimal (latent) state path
• Find optimal variational functions
• Constrained optimization problem:
• ODE for the marginal means
• ODE for the marginal covariances
• Integrating the Lagrangian by parts and differentiating leads to:
• ODEs for the Lagrange multipliers
• Gradient for the variational functions
Smoothing algorithm:
Fix initial conditions.
Repeat until convergence:
• Forward sweep:
Propagate means and covariances forward in time:
• Backward sweep:
Propagate Lagrange multipliers backward in time (adjoint operation):
Apply jump conditions at observation times:
• Update variational functions:
(scaled gradient step)
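The slide's update equations are missing; as a minimal 1-D sketch of the forward sweep (hypothetical helper `forward_sweep`, piecewise-constant variational functions A, b, and the moment ODEs from slide 8), propagating the marginal mean and variance forward in time could look like:

```python
import numpy as np

def forward_sweep(A, b, m0, S0, D, dt):
    """Euler-integrate the moment ODEs of the Gaussian approximation
    for the 1-D time-varying linear drift f_L(x, t) = -A[k] x + b[k]:
        dm/dt = -A m + b,   dS/dt = -2 A S + D."""
    K = len(A)
    m = np.empty(K + 1)
    S = np.empty(K + 1)
    m[0], S[0] = m0, S0
    for k in range(K):
        m[k + 1] = m[k] + (-A[k] * m[k] + b[k]) * dt
        S[k + 1] = S[k] + (-2.0 * A[k] * S[k] + D) * dt
    return m, S

# Constant A > 0 and b = 0 give an Ornstein-Uhlenbeck process whose
# variance relaxes to the stationary value D / (2 A) = 0.25.
A = np.full(2000, 2.0)
b = np.zeros(2000)
m, S = forward_sweep(A, b, m0=1.0, S0=0.0, D=1.0, dt=0.005)
```

The backward sweep integrates the Lagrange-multiplier ODEs in reverse time over the same grid, applying the jump conditions at observation times.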
Double-well example
[Figure: Gaussian process regression vs. variational Gaussian approximation on double-well data]
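For context, the double-well diffusion can be simulated directly with the Euler-Maruyama scheme from slide 9; a minimal sketch, where the drift f(x) = 4x(1 − x²) and the noise level are illustrative choices rather than values taken from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

def euler_maruyama(f, x0, D, dt, n_steps, rng):
    """Simulate dx = f(x) dt + sqrt(D) dW with the Euler-Maruyama scheme."""
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        x[k + 1] = x[k] + f(x[k]) * dt + np.sqrt(D * dt) * rng.standard_normal()
    return x

# Double-well drift f(x) = 4 x (1 - x^2): stable equilibria at x = +/-1.
# Sample paths linger near one well and occasionally switch wells,
# which is exactly the bimodal behaviour a Gaussian smoother struggles with.
path = euler_maruyama(lambda x: 4.0 * x * (1.0 - x**2), x0=-1.0,
                      D=0.5, dt=0.01, n_steps=5000, rng=rng)
```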
Comparison to MCMC simulation
Hybrid Monte Carlo approach:
• Reference solution
• Generate sample paths from posterior
• Modify scheme in order to increase acceptance rate (molecular dynamics)
• Still requires generating ~100,000 sample paths for good results
• Hard to check convergence
(Y. Shen, Aston)
M-step: learning the parameters by maximum likelihood
• Linear transformation
• Observation noise
• Stochastic noise?
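The update equations are missing from the transcript; for the linear-Gaussian likelihood y_n = H x(t_n) + η_n, η_n ~ N(0, R), the standard maximum-likelihood M-step updates given the approximate marginals q(x(t_n)) = N(m_n, S_n) would be (a hedged reconstruction, not copied from the slide):

```latex
H \leftarrow \Bigl(\sum_n y_n\, m_n^{\top}\Bigr)
             \Bigl(\sum_n \bigl(S_n + m_n m_n^{\top}\bigr)\Bigr)^{-1}
\qquad
R \leftarrow \frac{1}{N}\sum_n
  \mathbb{E}_{q}\bigl[(y_n - H\,x_n)(y_n - H\,x_n)^{\top}\bigr]
```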
When things go wrong…
Conclusion
• Machine Learning for Data Assimilation
• Modeling the uncertainty is essential
• Learning the kernel is a challenging new concern
• Bunch of suboptimal/intermediate good solutions?
• Promising results: quality of the solution & potential extensions
Future work includes:
• High(er) dimensional data
• Check gradient approach (cf. stochastic noise)
• Simplifications when the force (drift) derives from a potential
• Investigate a full variational Bayesian treatment
• Combine the variational approach with MCMC?
Paper available from: www.cs.ucl.ac.uk/staff/C.Archambeau
Research project: VISDEM (multi-site, EPSRC funded)