Assessing Model Discrepancy Using a Multi-Model Ensemble

Jonathan Rougier¹, Michael Goldstein², Leanna House³

¹ Department of Mathematics, University of Bristol, UK

² Department of Mathematical Sciences, Durham University, UK

³ Department of Statistics, Virginia Tech, Blacksburg VA, USA

Technical report available at http://www.maths.bris.ac.uk/~mazjcr/mme1.pdf

The current state of the art (AR4 WG1, ch 10)

Thinking about the discrepancy

The Best Input approach

Actual climate = f(x̃) ⊕ discrepancy

Observations = historic climate ⊕ measurement error

where x̃ is the best input, and f(x̃) is the climate model evaluated at its best input; ‘⊕’ means ‘plus independent’.

• The best input x̃ is uncertain: the model’s standard parameterisation is an estimate of x̃.

• The discrepancy is uncertain: we describe it in terms of a mean vector and a variance matrix.

Ignoring the discrepancy is equivalent to treating it as identically equal to zero. This does not reflect our judgement that the model is imperfect. Inferences made with a zero discrepancy must be suspect.
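The effect of a zero discrepancy can be illustrated with a minimal simulation. This is only a sketch under assumptions that are not in the talk: a scalar quantity, Gaussian draws, a zero-mean discrepancy, and made-up standard deviations (`disc_sd`, `err_sd`). It standardises the residual "observation minus f(x̃)" twice, once with the discrepancy variance left out and once with it included.

```python
import numpy as np

def simulate_residuals(n, disc_sd, err_sd, seed=0):
    """Draw n values of (observation - f(x~)) under the best-input structure.

    Both terms are drawn independently, matching 'plus independent'.
    The Gaussian form and zero means are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    discrepancy = rng.normal(0.0, disc_sd, size=n)  # climate - f(x~)
    meas_error = rng.normal(0.0, err_sd, size=n)    # observation - climate
    return discrepancy + meas_error

# Hypothetical values, in the same units as the observations.
disc_sd, err_sd = 2.0, 0.5
resid = simulate_residuals(100_000, disc_sd, err_sd)

# Standardise without and with the discrepancy variance.
z_ignoring = resid / err_sd
z_including = resid / np.sqrt(disc_sd**2 + err_sd**2)

print(f"sd of standardised residuals, discrepancy ignored:  {z_ignoring.std():.2f}")
print(f"sd of standardised residuals, discrepancy included: {z_including.std():.2f}")
```

If the discrepancy is ignored, the standardised residuals are far more dispersed than a unit-variance reference would allow, which is one way the overconfidence shows up.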

Our multi-model ensemble (MME)

Surface temperature, 1995−1999

• A collection of evaluations of different models (e.g. models from different research groups). Each model has its own parameters and is evaluated at its own standard parameterisation:

$$\mathrm{MME} = \{ f^{(1)}(\tilde{x}^{(1)}), \dots, f^{(m)}(\tilde{x}^{(m)}) \} = \{ \tilde{f}^{(1)}, \dots, \tilde{f}^{(m)} \}$$

and each model has its own discrepancy, $\{ d^{(1)}, \dots, d^{(m)} \}$.

• It might seem natural to treat the mean from the MME as a central estimate of actual climate, and the variance as a measure of uncertainty, but this makes no allowance for common sources of uncertainty in the discrepancy, i.e. that $\mathrm{Cov}(d^{(i)}, d^{(j)}) \neq 0$.

Examples: common sub-modules (including common code), similar solver resolution, similar parameter values (peer pressure).
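The point about common sources of uncertainty can be checked numerically. The sketch below is an illustration, not the authors' analysis: it assumes scalar discrepancies made of one shared Gaussian term plus independent model-specific Gaussian terms, with hypothetical standard deviations. It compares the actual error of the MME mean with the "naive" standard error obtained from the ensemble spread alone.

```python
import numpy as np

def mme_mean_error(m, common_sd, indiv_sd, n_rep=20_000, seed=1):
    """Compare the true error of the MME mean with the naive standard error.

    Each replicate draws m discrepancies that share one common term.
    Returns (sd of the MME-mean error, average naive standard error).
    """
    rng = np.random.default_rng(seed)
    common = rng.normal(0.0, common_sd, size=(n_rep, 1))  # shared by all models
    indiv = rng.normal(0.0, indiv_sd, size=(n_rep, m))    # model-specific
    d = common + indiv                                    # discrepancies d(1..m)
    mean_error = d.mean(axis=1)                           # error of MME mean (up to sign)
    naive_se = d.std(axis=1, ddof=1) / np.sqrt(m)         # spread-based standard error
    return mean_error.std(), naive_se.mean()

actual_sd, naive_se = mme_mean_error(m=14, common_sd=1.0, indiv_sd=1.0)
print(f"actual sd of MME-mean error: {actual_sd:.2f}")
print(f"naive standard error:        {naive_se:.2f}")
```

Averaging over models shrinks only the model-specific part; the shared part survives in the MME mean and is invisible in the ensemble spread.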

Second-order exchangeability of the MME

SOE implements the following principle:

All of the models in the MME are equally informative about actual climate, in the sense that if we had to pick any pair of models for inference about actual climate then we would be indifferent between all possible pairs.

This is a qualitative judgement based on the model meta-information, i.e. made before a detailed inspection of the model-evaluations. It excludes:

1. Models that are outliers, either too good or too bad;

2. Models that are duplicates of other models.

• For our MME, we exclude BCC-CM1, from the Beijing Climate Center, and GFDL-CM2.1, GISS-EH, GISS-ER, CCS-M3, UKMO-HadGEM1 (duplicates), which leaves us with 14 models.

Statistical modelling implications

• If our MME is judged to be SOE, then we can write the evaluations as

$$\tilde{f}^{(j)} = M(f) \oplus R_j(f), \qquad j = 1, \dots, m$$

where $M(f)$ can be thought of as the representative model, and $R_j(f)$ is an uncorrelated-with-everything residual.

• This implies a similar representation for the discrepancies:

$$d^{(j)} = M(d) \oplus R_j(d), \qquad j = 1, \dots, m$$

and the two crucial quantities that we must specify are the two variances, $\mathrm{Var}(M(d))$ and $\mathrm{Var}(R_j(d))$ (which is the same for all $j$).

• Unsurprisingly (?!) we can use the ensemble variance as an estimate of $\mathrm{Var}(R_j(d))$.
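One way to see why the ensemble variance is the natural estimate is the following short derivation. It is a sketch under assumptions not spelled out on the slides: $y$ denotes actual climate, each discrepancy is taken to be $d^{(j)} = y - \tilde{f}^{(j)}$, and $\bar{f}$ and $\bar{R}(f)$ denote ensemble averages. The technical report should be consulted for the authors' actual treatment.

```latex
\begin{align*}
d^{(j)} &= y - \tilde{f}^{(j)}
         = \underbrace{\bigl(y - M(f)\bigr)}_{M(d)}
           \;+\; \underbrace{\bigl(-R_j(f)\bigr)}_{R_j(d)},
  &&\text{so } \mathrm{Var}\bigl(R_j(d)\bigr) = \mathrm{Var}\bigl(R_j(f)\bigr). \\[4pt]
\tilde{f}^{(j)} - \bar{f} &= R_j(f) - \bar{R}(f),
  &&\text{so } M(f) \text{ cancels from the deviations.} \\[4pt]
S &= \frac{1}{m-1} \sum_{j=1}^{m}
     \bigl(\tilde{f}^{(j)} - \bar{f}\bigr)\bigl(\tilde{f}^{(j)} - \bar{f}\bigr)^{T},
  &&\mathrm{E}(S) = \mathrm{Var}\bigl(R_j(f)\bigr) = \mathrm{Var}\bigl(R_j(d)\bigr).
\end{align*}
```

The last equality uses only that the residuals have zero mean, are mutually uncorrelated, and share a common variance. The ensemble spread therefore carries no information about $\mathrm{Var}(M(d))$, which has to be specified by judgement.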

Statistical modelling implications (cont)

With our SOE framework,

$$\mathrm{Cov}(d^{(i)}, d^{(j)}) \neq 0 \iff \mathrm{Var}(M(d)) \neq 0.$$

We can adopt the following principle for $\mathrm{Var}(M(d))$:

Model disagreement

When the individual models disagree on some component, then the models taken together are judged less accurate about that component.

This implies, crudely, that $\mathrm{Var}(M(d)) \propto \mathrm{Var}(R_j(d))$, where we get to choose the constant(s) of proportionality.

Small Print: Actually, in both cases, i.e. both $\mathrm{Var}(R_j(d))$ and $\mathrm{Var}(M(d))$, we have to be a bit more subtle than this, and perhaps the devil is in the details! The details are in the paper.
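The crude version of this specification can be sketched in a few lines. The code below is an illustration only: the single scalar constant `kappa`, the function name, and the synthetic 14-by-3 ensemble are all hypothetical, and the paper's actual specification is described as more subtle. It takes the ensemble variance as the estimate of Var(R_j(d)), scales it to get Var(M(d)), and reports the implied cross-model correlation for each component.

```python
import numpy as np

def discrepancy_variances(evals, kappa):
    """Crude SOE variance specification from an ensemble.

    evals : (m, n) array, one row per model, one column per component.
    kappa : hypothetical constant of proportionality (a judgement).
    Returns (Var(M(d)), Var(R_j(d))) as (n, n) matrices.
    """
    var_r = np.cov(evals, rowvar=False)  # ensemble variance, estimates Var(R_j(d))
    var_m = kappa * var_r                # crude Var(M(d)) proportional to Var(R_j(d))
    return var_m, var_r

# Synthetic ensemble: 14 models, 3 components with differing disagreement.
rng = np.random.default_rng(2)
evals = rng.normal(size=(14, 3)) * np.array([0.5, 1.0, 3.0])

var_m, var_r = discrepancy_variances(evals, kappa=0.5)

# Since Cov(d(i), d(j)) = Var(M(d)) for i != j, and
# Var(d(j)) = Var(M(d)) + Var(R_j(d)), the per-component
# cross-model correlation is the ratio of the two diagonals.
cross_corr = np.diag(var_m) / np.diag(var_m + var_r)
print("Var(M(d)) diagonal:", np.round(np.diag(var_m), 3))
print("cross-model correlation:", np.round(cross_corr, 3))
```

With a single scalar constant the cross-model correlation is kappa / (1 + kappa) for every component; allowing several constants, as the slide suggests, lets it vary across components.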

Observations, and modelling validation

DJF surface temperature, 1995–1999, aggregated to 5° grid cells.

[Figure: four map panels, each with a horizontal axis running from −180 to 180.
(1) Observations, including missing values (deg Celsius), scale −50 to 50.
(2) Standardised marginal prediction errors, scale −20 to 20 with breaks at ±1, ±2, ±3.
(3) Standardised joint prediction errors, scale −20 to 20 with breaks at ±1, ±2, ±3.
(4) Cross-model correlations, scale 0 to 0.6.]

Discrepancy: Adjusted mean and variance

[Figure: three map panels, each with a horizontal axis running from −180 to 180.
(1) Adjusted discrepancy mean (deg Celsius), scale −10 to 10.
(2) Initial discrepancy standard deviation (deg Celsius), scale 0 to 11.
(3) Adjusted discrepancy standard deviation (deg Celsius), scale 0 to 11.]

[Figure: four eigenvector panels.
First eigenvector (14% of total trace); second eigenvector (9% of total trace); third eigenvector (6% of total trace); fourth eigenvector (5% of total trace).]

Summary

• The discrepancy between model-output and actual climate is an uncertain quantity that can never be completely observed, and so all model-based climate inference ought to be statistical inference.

• Multi-model ensembles (MMEs) and actual climate observations both contain relevant information about the discrepancy. But there is no ‘automatic’ method for combining them into an estimate: judgements will be required!

• Statistical models provide a framework within which our judgements about the discrepancy can be validated and improved. ‘Independence’ in the MME is not a tenable statistical model, but Second Order Exchangeability (SOE) allows us to incorporate shared sources of model-error.

• The Bayes linear framework within which SOE is implemented is invariant to the number of members of the MME and scales well (polynomial, O(n^3)) with the number of components in the discrepancy vector.
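For readers unfamiliar with Bayes linear methods, the standard adjustment formulas can be sketched as follows. This is a generic illustration, not the authors' code: the function and argument names are hypothetical, and which quantities play the roles of B and D in the talk is specified in the technical report. The adjusted expectation and variance of B by data D are E(B) + Cov(B,D) Var(D)^{-1} (D − E(D)) and Var(B) − Cov(B,D) Var(D)^{-1} Cov(D,B). The linear solve against Var(D) is cubic in the length of D, which is presumably the source of the O(n^3) cost.

```python
import numpy as np

def bayes_linear_adjust(e_b, var_b, e_d, var_d, cov_bd, d_obs):
    """Bayes linear adjusted expectation and variance of B given D = d_obs.

    e_b (n_b,), var_b (n_b, n_b): prior mean and variance of B.
    e_d (n_d,), var_d (n_d, n_d): prior mean and variance of D.
    cov_bd (n_b, n_d): prior covariance between B and D.
    """
    # Solve Var(D) X = Cov(D, B) instead of forming an explicit inverse.
    solved = np.linalg.solve(var_d, cov_bd.T)   # shape (n_d, n_b)
    adj_mean = e_b + solved.T @ (d_obs - e_d)   # adjusted expectation
    adj_var = var_b - cov_bd @ solved           # adjusted variance
    return adj_mean, adj_var

# Toy scalar case: D = B + noise, with Var(B) = 4 and noise variance 1,
# so Var(D) = 5 and Cov(B, D) = 4.
adj_mean, adj_var = bayes_linear_adjust(
    e_b=np.array([0.0]), var_b=np.array([[4.0]]),
    e_d=np.array([0.0]), var_d=np.array([[5.0]]),
    cov_bd=np.array([[4.0]]), d_obs=np.array([2.5]),
)
print(adj_mean, adj_var)
```

Only means, variances and covariances are needed, so no full probability distributions have to be specified.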
