Project leader’s report MUCM Advisory Panel Meeting, November 2006.
Project leader’s report
MUCM Advisory Panel Meeting, November 2006
www.mucm.group.shef.ac.uk
Outline
Background: uncertainty in models
MUCM overview
Putting the structures in place
Specific progress
Background: Uncertainty in Models
Computer models
In almost all fields of science, technology, industry and policy making, people use mechanistic models to describe complex real-world processes
  For understanding, prediction, control
There is a growing realisation of the importance of uncertainty in model predictions
  Can we trust them?
  Without any quantification of output uncertainty, it’s easy to dismiss them
Sources of uncertainty
A computer model takes inputs x and produces outputs y = f(x)
How might y differ from the true real-world value z that the model is supposed to predict?
  Error in inputs x
    Initial values, forcing inputs, model parameters
  Error in model structure or solution
    Wrong, inaccurate or incomplete science
    Bugs, solution errors
Quantifying uncertainty
The ideal is to provide a probability distribution p(z) for the true real-world value
  The centre of the distribution is a best estimate
  Its spread shows how much uncertainty about z is induced by uncertainties on the last slide
How do we get this?
  Input uncertainty: characterise p(x), propagate through to p(y)
  Structural uncertainty: characterise p(z-y)
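
The two steps above can be sketched by simple sampling. This is only an illustration under stated assumptions: the simulator f, the input distribution p(x) and the structural-error distribution p(z-y) below are all hypothetical stand-ins, and the structural error is assumed independent of x.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Hypothetical toy simulator standing in for a real computer model
    return np.exp(-x) + 0.5 * x

n = 10_000
x = rng.normal(loc=1.0, scale=0.2, size=n)   # assumed input uncertainty p(x)
y = f(x)                                     # propagated samples from p(y)
d = rng.normal(loc=0.0, scale=0.05, size=n)  # assumed structural error p(z - y)
z = y + d                                    # samples from p(z)

best_estimate = z.mean()   # centre of p(z)
spread = z.std(ddof=1)     # spread of p(z)
print(f"best estimate {best_estimate:.3f}, std deviation {spread:.3f}")
```

In this toy setting the spread of z is wider than the spread of y alone, because the structural error adds to the propagated input uncertainty.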
Example: UK carbon flux in 2000
Vegetation model predicts carbon exchange from each of 700 pixels over England & Wales
  Principal output is Net Biosphere Production
Accounting for uncertainty in inputs
  Soil properties
  Properties of different types of vegetation
Aggregated to England & Wales total
  Allowing for correlations
  Estimate 7.55 Mt C
  Std deviation 0.57 Mt C
Maps
Sensitivity analysis
Map shows proportion of overall uncertainty in each pixel that is due to uncertainty in the vegetation parameters
  As opposed to soil parameters
Contribution of vegetation uncertainty is largest in grasslands/moorlands
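
One common way to estimate such a proportion is a "pick-freeze" Monte Carlo estimate of a variance-based sensitivity index; whether the map was produced this way is not stated in the slides. The sketch below assumes a hypothetical toy model with one vegetation input and one soil input, both independent standard normal.

```python
import numpy as np

rng = np.random.default_rng(1)

def f(veg, soil):
    # Hypothetical toy model; not the vegetation model from the slides
    return 2.0 * veg + 1.0 * soil

n = 200_000
veg = rng.normal(size=n)      # assumed uncertainty in vegetation input
soil_a = rng.normal(size=n)   # assumed uncertainty in soil input
soil_b = rng.normal(size=n)   # independent redraw of the soil input

y_a = f(veg, soil_a)
y_b = f(veg, soil_b)          # same vegetation values, fresh soil values

# Share of output variance explained by the vegetation input alone
share_veg = np.cov(y_a, y_b)[0, 1] / y_a.var(ddof=1)
print(f"proportion due to vegetation input: {share_veg:.2f}")
```

For this toy model the exact share is 4/5, since the vegetation term contributes variance 4 out of a total of 5.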
England & Wales aggregate
PFT           Plug-in estimate (Mt C)   Mean (Mt C)   Variance ((Mt C)²)
Grass         5.28                      4.64          0.269
Crop          0.85                      0.45          0.034
Deciduous     2.13                      1.68          0.013
Evergreen     0.80                      0.78          0.001
Covariances                                           0.001
Total         9.06                      7.55          0.321
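
The totals in the table can be checked with a few lines of arithmetic. The sketch below uses only the figures printed in the table; the variable names are illustrative. The total variance is taken as reported (0.321) rather than re-summed, since the printed components are rounded.

```python
import math

# Figures copied from the table (Mt C)
mean_mt_c = {"Grass": 4.64, "Crop": 0.45, "Deciduous": 1.68, "Evergreen": 0.78}
plug_in_mt_c = {"Grass": 5.28, "Crop": 0.85, "Deciduous": 2.13, "Evergreen": 0.80}
total_variance = 0.321  # reported total variance, covariance term included

total_mean = sum(mean_mt_c.values())
total_plug_in = sum(plug_in_mt_c.values())
std_dev = math.sqrt(total_variance)

print(f"total mean    {total_mean:.2f} Mt C")
print(f"total plug-in {total_plug_in:.2f} Mt C")
print(f"std deviation {std_dev:.2f} Mt C")
```

The square root of the total variance reproduces the 0.57 Mt C standard deviation quoted for the England & Wales estimate of 7.55 Mt C.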
Reducing uncertainty
To reduce uncertainty, get more information!
Informal – more/better science
  Tighten p(x) through improved understanding
  Tighten p(z-y) through improved modelling or programming
Formal – using real-world data
  Calibration – learn about model parameters
  Data assimilation – learn about the state variables
  Learn about structural error z-y
  Validation
Example: Nuclear accident
Radiation was released after an accident at the Tomsk-7 chemical plant in 1993
Data comprise measurements of the deposition of ruthenium 106 at 695 locations obtained by aerial survey after the release
The computer code is a simple Gaussian plume model for atmospheric dispersion
Two calibration parameters
  Total release of 106Ru (source term)
  Deposition velocity
Data
Calibration
A small sample (N=10 to 25) of the 695 data points was used to calibrate the model
Then the remaining observations were predicted and RMS prediction error computed
On a log scale, error of 0.7 corresponds to a factor of 2

Sample size N          10     15     20     25
Best fit calibration   0.82   0.79   0.76   0.66
Bayesian calibration   0.49   0.41   0.37   0.38
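
To read the table as multiplicative errors, the log-scale values can be exponentiated. The sketch below assumes natural logarithms, which is what "0.7 corresponds to a factor of 2" implies (e^0.7 ≈ 2.01); the dictionary layout is illustrative only.

```python
import math

sample_sizes = [10, 15, 20, 25]
rms_log_error = {
    "Best fit calibration": [0.82, 0.79, 0.76, 0.66],
    "Bayesian calibration": [0.49, 0.41, 0.37, 0.38],
}

# Convert each log-scale RMS error into a typical multiplicative factor
factors = {
    method: [math.exp(e) for e in errors]
    for method, errors in rms_log_error.items()
}

for method, values in factors.items():
    for n, factor in zip(sample_sizes, values):
        print(f"{method}, N={n}: error factor about {factor:.2f}")
```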
So far, so good, but
In principle, all this is straightforward
In practice, there are many technical difficulties
  Formulating uncertainty on inputs
    Elicitation of expert judgements
  Propagating input uncertainty
  Modelling structural error
  Anything involving observational data!
    The last two are intricately linked
  And computation
The problem of big models
Tasks like uncertainty propagation and calibration require us to run the model many times
Uncertainty propagation
  Implicitly, we need to run f(x) at all possible x
  Monte Carlo works by taking a sample of x from p(x)
  Typically needs thousands of model runs
Calibration
  Traditionally this is done by searching the x space for good fits to the data
This is impractical if the model takes more than a few seconds to run
  We need a more efficient technique
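
A rough sketch of why Monte Carlo needs so many runs: the standard error of a Monte Carlo mean shrinks only as 1/sqrt(N), so each extra digit of accuracy costs about 100 times more model runs. The simulator and p(x) below are hypothetical stand-ins chosen only to make the example runnable.

```python
import numpy as np

rng = np.random.default_rng(2)

def f(x):
    # Hypothetical cheap stand-in for an expensive simulator
    return np.sin(3.0 * x) + x**2

def mc_mean(n_runs):
    """Monte Carlo estimate of E[f(x)] and its standard error."""
    x = rng.normal(loc=0.5, scale=0.3, size=n_runs)  # assumed p(x)
    y = f(x)
    return y.mean(), y.std(ddof=1) / np.sqrt(n_runs)

est_100, se_100 = mc_mean(100)
est_10k, se_10k = mc_mean(10_000)
print(f"N=100:    mean {est_100:.3f} +/- {se_100:.3f}")
print(f"N=10000:  mean {est_10k:.3f} +/- {se_10k:.3f}")
```

If each run took minutes rather than microseconds, the second estimate would already be out of reach, which is the motivation for a more efficient technique.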
Gaussian process representation
More efficient approach
  First work in early 1980s
Consider the code as an unknown function
  f(.) becomes a random process
  We represent it as a Gaussian process (GP)
Training runs
  Run model for sample of x values
  Condition GP on observed data
  Typically requires many fewer runs than MC
    And x values don’t need to be chosen randomly
Emulation
Analysis is completed by prior distributions for, and posterior estimation of, hyperparameters
The posterior distribution is known as an emulator of the computer code
  Posterior mean estimates what the code would produce for any untried x (prediction)
  With uncertainty about that prediction given by posterior variance
  Correctly reproduces training data
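
A minimal sketch of these properties, under simplifying assumptions: a zero-mean GP with a squared-exponential covariance whose hyperparameters are fixed by hand rather than given priors and estimated, and a hypothetical one-input toy code f. It is not the project's implementation.

```python
import numpy as np

def sq_exp_kernel(a, b, length=0.3, variance=1.0):
    # Squared-exponential covariance; length and variance are assumed, not estimated
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length) ** 2)

def emulate(x_train, y_train, x_new, jitter=1e-10):
    """Posterior mean and variance of a zero-mean GP conditioned on training runs."""
    K = sq_exp_kernel(x_train, x_train) + jitter * np.eye(len(x_train))
    K_s = sq_exp_kernel(x_new, x_train)
    K_ss = sq_exp_kernel(x_new, x_new)
    mean = K_s @ np.linalg.solve(K, y_train)
    cov = K_ss - K_s @ np.linalg.solve(K, K_s.T)
    return mean, np.clip(np.diag(cov), 0.0, None)

def f(x):
    # Hypothetical toy "computer code"
    return np.sin(2 * np.pi * x)

x_train = np.linspace(0.0, 1.0, 5)       # five training runs
y_train = f(x_train)
x_new = np.array([0.0, 0.125, 0.25])     # two training inputs and one untried input
mean, var = emulate(x_train, y_train, x_new)
print(mean, var)
```

At the two training inputs the posterior mean matches the code output and the posterior variance is essentially zero; at the untried input 0.125 the variance is positive.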
2 code runs
Consider one input and one output
Emulator estimate interpolates data
Emulator uncertainty grows between data points
3 code runs
Adding another point changes estimate and reduces uncertainty
5 code runs
And so on
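
The progression from 2 to 3 to 5 code runs can be mimicked numerically. The sketch below assumes a zero-mean GP with unit prior variance and a squared-exponential covariance with a hand-picked length scale; the design points are illustrative, not those used in the figures. With fixed hyperparameters the posterior variance depends only on where the code was run, not on the outputs.

```python
import numpy as np

def kernel(a, b, length=0.3):
    # Squared-exponential covariance with unit variance; length is an assumed value
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)

def max_posterior_sd(x_train, x_grid, jitter=1e-10):
    """Largest posterior standard deviation over a grid of untried inputs."""
    K = kernel(x_train, x_train) + jitter * np.eye(len(x_train))
    K_s = kernel(x_grid, x_train)
    # Diagonal of K_s K^{-1} K_s^T, subtracted from the prior variance of 1
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s.T).T, axis=1)
    return float(np.sqrt(np.clip(var, 0.0, None).max()))

x_grid = np.linspace(0.0, 1.0, 201)
designs = {
    2: np.array([0.0, 1.0]),
    3: np.array([0.0, 0.5, 1.0]),
    5: np.array([0.0, 0.25, 0.5, 0.75, 1.0]),
}
max_sd = {n: max_posterior_sd(x, x_grid) for n, x in designs.items()}
for n, sd in max_sd.items():
    print(f"{n} code runs: largest emulator sd = {sd:.3f}")
```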
Then what?
Given enough training data points we can emulate any model accurately
  So that posterior variance is small “everywhere”
  Typically, feasible with orders of magnitude fewer model runs than traditional methods
Use the emulator to make inference about other things of interest
  Uncertainty analysis, sensitivity analysis, calibration, data assimilation, optimisation, …
  Conceptually very straightforward in the Bayesian framework
  But of course can be technically hard
Research directions
Models with heterogeneous local behaviour
  Regions of input space with rapid response, jumps
High dimensional models
  Many inputs, outputs, data points
Dynamic models
  Data assimilation
Stochastic models
Relationship between models and reality
  Model/emulator validation
  Multiple models
Design of experiments
  Sequential design
MUCM Overview
MUCM in a nutshell
Managing Uncertainty in Complex Models
Four-year research grant
  7 postdoctoral research assistants
  4 PhD studentships
  Started in June 2006
  Based in Sheffield and 4 other UK universities
Objective: To develop Bayesian model uncertainty methods into a robust technology …
  toolkit, UML specifications
that is widely applicable across the spectrum of modelling applications
  case studies
Theme 1: High Dimensionality
Tackling problems associated with dimensionality of inputs, outputs, parameters, and data
WP 1.1 – Screening (PS)
  Identifying most important inputs/outputs
WP 1.2 – Sparsity and Projection (RA)
  Dimension reduction using modern computational techniques
WP 1.3 – Multiscale models (RA)
  Linking models and data at different resolutions
Theme leader: Dan Cornford
Theme 2: Using Observational Data
Tackling problems associated with model structural error to link models to field data
WP 2.1 – Linking Models to Reality (RA)
  Modelling structural error
WP 2.2 – Diagnostics and Validation (PS)
  Criticising our statistical representations
WP 2.3 – Calibration & Data Assimilation (RA)
  Extending calibration techniques, particularly to dynamic models
Theme leader: Michael Goldstein
Theme 3: Realising the Potential
Turning theory into reliable, widely applicable techniques across a wide range of models
WP 3.1 – Experimental Design (RA + PS)
  Designing input sets for running models, and planning observational studies
WP 3.2 – The Toolkit (RA + PS)
  Distilling experience with methods into robust tools, relaxing constraints
WP 3.3 – Case Studies (RA)
  Three substantial case studies
Theme leader: Peter Challenor
Organisation overview
Organisation by theme
1. Cornford   2. Goldstein   3. Challenor

1.1   Boukouvalas      Cornford (Challenor)
1.2   Maniyar          Cornford (Wynn)
1.3   Cumming          Goldstein (Rougier)
2.1   House            Goldstein (O’Hagan)
2.2   Bastos           O’Hagan (Rougier)
2.3   Bhattacharya     Oakley (Cornford)
3.1   Maruri-Aguilar   Wynn (Goldstein)
3.1   Youssef          Wynn (Oakley)
3.2   Gattiker         Challenor (O’Hagan, Cornford)
3.2   Stephenson       Challenor (Oakley)
3.3   Gosling          O’Hagan (Challenor)
O’Hagan
Organisation by committee
The whole Team meets twice a year
  Presentations, reports and planning
The Project Management Board meets four times a year
  Formal decision making, budgeting, personnel matters
The Advisory Panel meets with the investigators twice a year
  Providing external support and advice
The Team
Investigators
  Challenor, Cornford, Goldstein, Oakley, O’Hagan, Rougier, Wynn
Project manager
  Green
RAs
  Bhattacharya, Cumming, Gattiker, Gosling, House, Maniyar, Maruri-Aguilar
PSs
  Bastos, Boukouvalas, Stephenson, Youssef
The Board
Project Management Board is the primary project management body
  Tony O’Hagan (Sheffield, Chair)
  Dan Cornford (Aston)
  Peter Challenor (Southampton)
  Michael Goldstein (Durham)
  Henry Wynn (LSE)
Non-voting
  Jeremy Oakley (Sheffield)
  Jonty Rougier (Durham)
  Jo Green (Sheffield)
The Panel
Advisory Panel comprises modellers, model users and model uncertainty experts from a wide range of fields
Industry
  Bob Parish, Hilmi Kurt-Elli, Clive Bowman
Academia
  Ron Akehurst, Martin Dove, Keith Beven, Douglas Kell, Ian Woodward
Research institutions
  Richard Haylock, Andrea Saltelli, Andy Hart, David Higdon, Mat Collins
The Mentor
Peter Green (Bristol)
  Appointed by EPSRC
  Liaise between project team and EPSRC
  Advise team
Putting the Structures in Place
General
All RAs, PSs and Project Manager recruited
  Started at various times from 1 June to 1 October
  Need to replace Bhattacharya
Website, wiki, email lists, logo, templates created
  Reading list, glossary under development
Monthly reporting established
RAs set up reading club
Links established with related projects
  Particularly with SAMSI programme in USA
Project planning
First draft of rolling workplans
  Descriptions and objectives
  Detailed plans and milestones for 12 months ahead
    With month-by-month detail for 6 months
  Outline plans and milestones for remainder of project
Will be updated quarterly
Milestones and deliverables carefully monitored
Panel will receive plans from previous Board
Financial management
Handled at quarterly Board meetings
Phased budget plan created for each institution
RAs appointed initially for 3 years
  Fourth year funds retained in reserve
Contacts with Panel members
Introductory meetings held with most members
An RA has been assigned to each
  To develop understanding of the models and the modelling area
  To act as link between other team members and Panel member
Beginning to explore use of models
  Some models also sourced from other contacts
Specific progress 1
Emulator fitting
  Study of methods to estimate roughness parameters
  Acquisition of existing packages
Multiscale models
  Multiscale version of Daisyworld model created
Non-homogeneous models
  Voronoi tessellation method improved
  Paper in preparation
Specific progress 2
Design
  Study of aberration and relationship to kernel
  Paper in preparation
Dynamic models
  Basic theory of dynamic emulation developed
  Toy dynamic model created and emulated
  Paper in preparation
  Hydrological model acquired