
Statistics of Multiple Constraints:

Analysis of Global Change and the Carbon Cycle

Ian G. Enting

MASCOS

The University of Melbourne

1

Acknowledgments

The ARC funds the Center of Excellence for

Mathematics and Statistics (MASCOS).

My fellowship at MASCOS is supported by CSIRO

through a sponsorship agreement.

Collaborators:

Cathy Trudinger and YingPing Wang of CSIRO Marine

and Atmospheric Research.

Roger Francey, Denis O’Brien, Peter Rayner, formerly

of CSIRO Atmospheric Research.

Workshop on Data Assimilation for the Carbon Cycle

2

Summary

Theme: reviewing time series analysis for:

• improvement of modelling

• understanding issues of spatial analysis

Topics:

• interpreting the carbon cycle

• inversions as statistics

• digital filtering — resolution

• possible problems — beyond Gaussian


3

Data for Carbon Cycle Studies

• Air sampling networks interpreted by inverse

modelling;

• Satellite data, for quantities such as leaf-area index

and phenology

• Terrestrial biosphere models;

• Convective boundary layer measurements;

• Stand-level flux networks;

• Ecosystem experiments;

• Small cuvettes.

Also, potentially satellite concentration data.

From Canadell et al. 2000.

4

Key characteristics of statistics

• magnitude;

• degree of correlation between components;

• temporal correlation structure;

• spatial correlation structure;

• distribution;

• mismatches in averaging;

• contribution from model representativeness error.

From Raupach et al. 2005


5

Interpretation

Interpretation is an inverse problem, working

backwards from results to causes.

Two main inverse problems: calibration and

data assimilation (deconvolution).

$$C(t) = C(0) + \int_0^t R(t - t')\,S(t')\,dt'$$

$$C(t) = C(0) + \int_0^t R(t'')\,S(t - t'')\,dt''$$

The problems of deducing the model response, R(t), and the forcing term, S(t), are formally equivalent, but in practice they differ in the characteristics of the statistics.
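As a rough illustration of the convolution relation above, the sketch below discretises it on a uniform time grid and then inverts it for the forcing when the response is taken as known. The exponential response, the sinusoidal forcing, the step `dt` and the function names are all hypothetical choices for illustration; none of them come from the talk, and no observational noise is included.

```python
import numpy as np

def forward_response(c0, response, source, dt):
    """Discrete form of C(t) = C(0) + integral_0^t R(t - t') S(t') dt'."""
    n = len(source)
    # np.convolve returns the full convolution; the first n terms are the causal part.
    return c0 + dt * np.convolve(response, source)[:n]

def deconvolve_source(conc, c0, response, dt):
    """Recover S from C for known R by solving the lower-triangular system."""
    n = len(conc)
    a = np.zeros((n, n))
    for i in range(n):
        # Row i holds R[i - j] * dt for j = 0..i.
        a[i, : i + 1] = response[i::-1] * dt
    return np.linalg.solve(a, conc - c0)

dt = 1.0
t = np.arange(20) * dt
response = np.exp(-t / 10.0)           # hypothetical response function
source = 1.0 + 0.5 * np.sin(0.3 * t)   # hypothetical forcing
conc = forward_response(280.0, response, source, dt)
recovered = deconvolve_source(conc, 280.0, response, dt)
```

With noise-free synthetic data the forcing is recovered to rounding error. The statistical questions raised in the talk only appear once `conc` carries noise.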


6

Inverse Problems as Statistics

• Any uncertainty analysis needs to be based

on statistics.

• Any statistical analysis is assuming (either

implicitly or explicitly) some statistical

model.

• any variability that cannot be modelled

deterministically . . . must be . . . modelled

statistically (Enting, 2002).


7

Aims of statistical analysis

• Compact characterisation of variability

• Design of techniques for data processing:

statistical efficiency, robustness etc

• Formalism for propagating uncertainties

through chain of calculations

• Design of new experiments:

How much is uncertainty reduced?

• Testing statistical assumptions underlying

data analysis techniques


8

Origins of Uncertainty

For empirical quantities: ‘model’ and ‘data’

• Statistical variation

• Variability

• Inherent randomness

• Subjective judgement

• Linguistic imprecision

• Disagreement

• Approximation

From Morgan and Henrion (my split), Uncertainty (CUP, 1990).


9

Combining information

Bayesian: $\Pr(x|z) \propto \Pr(z|x)\,\Pr_0(x)$

Multiple constraints:

$$\Pr(x|z_1, z_2) \propto \Pr(z_2|x)\,\Pr(z_1|x)\,\Pr_0(x) \propto \Pr(z_2|x)\,\Pr_1(x)$$

For linear relations with multivariate normal distributions, constraints $z_j = G_j x + e_j$, with inverse covariances $X_j$, combine to give the inverse covariance, $W$, as

$$W = W_{\mathrm{prior}} + \sum_j G_j^T X_j G_j$$

for the estimate

$$\hat{x} = \Big[W_{\mathrm{prior}} + \sum_j G_j^T X_j G_j\Big]^{-1} \Big[W_{\mathrm{prior}}\,x_0 + \sum_j G_j^T X_j z_j\Big]$$
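A minimal sketch of this combination rule for the linear, multivariate normal case follows. It also checks numerically that taking two constraints together gives the same answer as using the posterior from the first as the prior for the second. The function name `combine_constraints` and all numerical values are hypothetical, chosen only to make the example runnable.

```python
import numpy as np

def combine_constraints(w_prior, x0, constraints):
    """Return (x_hat, W) for constraints z_j = G_j x + e_j.

    constraints is a list of (G_j, X_j, z_j), where X_j is the inverse
    covariance of e_j; w_prior is the inverse covariance of the prior x0.
    """
    w = w_prior.copy()
    rhs = w_prior @ x0
    for g, x_inv, z in constraints:
        w = w + g.T @ x_inv @ g
        rhs = rhs + g.T @ x_inv @ z
    return np.linalg.solve(w, rhs), w

# Hypothetical two-component state with a weak prior and two data sets.
w_prior = np.eye(2) * 0.01
x0 = np.zeros(2)
g1, x1, z1 = np.array([[1.0, 0.0]]), np.array([[4.0]]), np.array([2.0])
g2, x2, z2 = np.array([[1.0, 1.0]]), np.array([[1.0]]), np.array([5.0])

# Both constraints at once.
x_both, w_both = combine_constraints(w_prior, x0, [(g1, x1, z1), (g2, x2, z2)])

# Sequentially: the posterior from z1 acts as the prior for z2.
x_1, w_1 = combine_constraints(w_prior, x0, [(g1, x1, z1)])
x_seq, w_seq = combine_constraints(w_1, x_1, [(g2, x2, z2)])
```

The agreement of the two routes depends on the linear-Gaussian assumptions and on the constraints having independent errors.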


10

Digital filtering

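Model is $z_k = s_k + n_k$, where the signal has power spectrum $f_s(\theta)$ and the noise has spectrum $f_n(\theta)$.

Estimate signal as $\hat{s}_k = \sum_j \Phi_j z_{k-j}$

Mean square error of estimate, $E[(\hat{s}_k - s_k)^2]$ = bias + variance:

$$\mathrm{MSE} = \int_{-\pi}^{\pi} \left[\,|1 - \phi(\theta)|^2 f_s(\theta) + |\phi(\theta)|^2 f_n(\theta)\,\right] d\theta$$

Optimal filter: $\phi(\theta) = f_s(\theta)/[f_n(\theta) + f_s(\theta)]$, for which the integrand of the MSE becomes

$$\frac{f_s(\theta)\,f_n(\theta)}{f_s(\theta) + f_n(\theta)} = \left[\frac{1}{f_s(\theta)} + \frac{1}{f_n(\theta)}\right]^{-1}$$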
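The following sketch evaluates these expressions numerically on a frequency grid. The red signal spectrum and white noise level are hypothetical, and the integral is approximated by a simple sum. It compares the MSE of the optimal filter with that of no filtering and of a slightly detuned filter.

```python
import numpy as np

theta = np.linspace(-np.pi, np.pi, 2001)
d_theta = theta[1] - theta[0]

# Hypothetical spectra: a red signal and white noise.
f_s = 1.0 / (1.0 + (theta / 0.5) ** 2)
f_n = np.full_like(theta, 0.1)

def mse(phi):
    """Approximate the MSE integral for a filter with frequency response phi."""
    integrand = np.abs(1.0 - phi) ** 2 * f_s + np.abs(phi) ** 2 * f_n
    return np.sum(integrand) * d_theta

phi_opt = f_s / (f_s + f_n)
mse_opt = mse(phi_opt)

# The same quantity from the closed-form integrand.
mse_closed = np.sum(f_s * f_n / (f_s + f_n)) * d_theta

mse_unfiltered = mse(np.ones_like(theta))   # phi = 1: all noise passed through
mse_detuned = mse(0.9 * phi_opt)            # a slightly sub-optimal filter
```

The closed-form integrand is the harmonic-type combination of the two spectra, so the error at each frequency is controlled by whichever of signal or noise is smaller.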


11

Characterising resolution

Use characteristic numbers:

Nobs: How many observations?

Ndata: How many effectively independent observations?

Kcomp: How many components used in the calculations?

Msignal: How many components needed to specify signal?

Ksynth: How many components used to fit the signal?

Ms:n: How many signal components exceed the noise level?

Ktarget: How many signal components is one trying to estimate?

Expanded from Enting (2002: Section 8.3), including Msignal and distinguishing Nobs from Ndata.

12

More data: Ms:n < Ndata

[Figure: fsig(θ), fnoise(θ), φopt(θ) and the integrand of the MSE, in three sets of panels with horizontal axes spanning 0 to 2, 0 to 6 and 0 to 12.]

MSE ∼ 1/Ndata due to reduced aliasing in noise.

13

Correlated data: Ndata < Nobs

[Figure: fsig(θ), fnoise(θ), φopt(θ) and the integrand of the MSE, in three sets of panels with horizontal axes spanning 0 to 2, 0 to 6 and 0 to 12.]

No change in MSE unless increasing Nobs increases Ndata.

14

Aliasing: Ndata < Msignal

[Figure: fsig(θ), fnoise(θ), φopt(θ) and the integrand of the MSE, in three sets of panels with horizontal axes spanning 0 to 2, 0 to 6 and 0 to 12.]

MSE too low, due to ignoring aliasing (truncation error).

15

Aliasing: Ndata < Msignal

[Figure: fsig(θ), fnoise(θ), φopt(θ) and the integrand of the MSE, in three sets of panels with horizontal axes spanning 0 to 12, 0 to 6 and 0 to 2.]

Treating aliased signal as an error contribution.

16

Aliasing from Truncation error

(i) objective, (ii) risk, (iii) hope, (iv) Wunsch, (v)

strong priors, (vi) correction.

[Figure: six schematic panels on axes a and b, with the labels ‘uniform basis’, ‘shaped basis’ and ‘Constraint from data’. Panel titles:]

(i) desired solution

(ii) solution from biased data

(iii) solution from unbiased data

(iv) project from full space

(v) work in target space

(vi) apply truncation error to data


17

Smoothing splines

Fit a set of data, $z_n$, with a smooth curve, $f(t)$, chosen to minimise

$$\Theta = \sum_j [z_j - f(t_j)]^2 + \lambda \int_{t_1}^{t_N} [f''(t)]^2\,dt$$

Spline acts as approximate digital filter with

$$\phi(\theta) = 1/[1 + (\theta/\theta_{0.5})^4]$$

where 50% attenuation occurs at:

$$\theta_{0.5} = 2\pi/T_{0.5} = [\lambda \Delta t]^{-1/4}$$

Fit is linear in data, so data uncertainties can be linearly propagated through calculations.
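To illustrate the linearity argument, the sketch below uses a discrete analogue of the smoothing spline rather than the spline itself: it penalises squared second differences on a uniform grid, so the fit can be written as f = A z and the data covariance propagates as A C Aᵀ. The penalty weight `lam`, the synthetic record and the 5-unit data standard deviation are hypothetical.

```python
import numpy as np

def smoother_matrix(n, lam):
    """Matrix A with f = A z minimising sum (z - f)^2 + lam * sum (2nd differences of f)^2."""
    d = np.diff(np.eye(n), n=2, axis=0)   # (n - 2) x n second-difference operator
    return np.linalg.inv(np.eye(n) + lam * d.T @ d)

n = 50
a = smoother_matrix(n, lam=10.0)

rng = np.random.default_rng(0)
t = np.arange(n, dtype=float)
z = 700.0 + 0.5 * t + rng.normal(0.0, 5.0, n)   # hypothetical noisy record
f = a @ z                                        # smoothed fit, linear in the data

# Linear propagation of independent data errors (s.d. 5) through the fit.
data_cov = np.eye(n) * 5.0 ** 2
fit_cov = a @ data_cov @ a.T
fit_sd = np.sqrt(np.diag(fit_cov))
```

As noted for the Law Dome example, `fit_sd` is the uncertainty of the smooth curve only. It says nothing about variability that the smoother has removed.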


18

Spline example: Law Dome CH4

Spline fit, $f(t)$, with ±2 s.d. data uncertainty.

Growth rate, $\dot{f}(t)$, and source estimate, $\dot{f}(t) + f(t)/\tau$, also ±2 s.d.

Lower pre-1500 data density implies smoother spline.

Uncertainties are the uncertainty in the spline (smooth part of source), not the uncertainty in the complete source function.

Have chosen Ktarget !!

[Figure: three panels against year (1000 to 1800): CH4 (ppb), 600 to 900; CH4 increase (ppb/yr), -5 to 10; CH4 input (ppb/yr), 60 to 90.]


19

Kalman filter paradigm

Mixed deterministic-stochastic model assumes evolution of a state, x, by:

$$x(n+1) = F(n)\,x(n) + u(n) + w(n)$$

with indirect noisy observations:

$$z(n) = H(n)\,x(n) + e(n)$$

where F, H and u are taken as known, and w and e are zero-mean multivariate normal with known covariances Q and R.

Kalman filter formalism gives the optimal estimates, $\hat{x}(n)$, of the state, given z(1) to z(n).

Combines multivariate normal distributions of observations, z(n), and projection $\hat{x}(n|z_1 \ldots z_{n-1})$.
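One predict-and-update cycle of this scheme can be sketched as below, in the standard textbook form; the function name `kalman_step` and the test data are hypothetical. As a check, a scalar state with F = 1, Q = 0 and a very diffuse prior is run over five observations, where the filter should reduce to a running mean.

```python
import numpy as np

def kalman_step(x_est, p_est, z, f, h, q, r, u=None):
    """One cycle for x(n+1) = F x(n) + u + w, z = H x + e (w ~ N(0, Q), e ~ N(0, R))."""
    if u is None:
        u = np.zeros_like(x_est)
    # Projection of the state and its covariance to the next step.
    x_pred = f @ x_est + u
    p_pred = f @ p_est @ f.T + q
    # Combine the projection with the new observation.
    s = h @ p_pred @ h.T + r
    gain = p_pred @ h.T @ np.linalg.inv(s)
    x_new = x_pred + gain @ (z - h @ x_pred)
    p_new = (np.eye(len(x_est)) - gain @ h) @ p_pred
    return x_new, p_new

# Hypothetical scalar example: constant state, unit observation variance.
z_obs = np.array([2.0, 4.0, 3.0, 5.0, 1.0])
f = np.eye(1)
h = np.eye(1)
q = np.zeros((1, 1))
r = np.eye(1)
x_est = np.zeros(1)
p_est = np.eye(1) * 1e6   # diffuse prior

for z in z_obs:
    x_est, p_est = kalman_step(x_est, p_est, np.array([z]), f, h, q, r)
```

With Q > 0 the filter progressively discounts older data. This is the mechanism behind the frequency-dependent weighting of prior and observations.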


20

Modelling for Kalman filter

State-space model for methane from ice cores:

x1 is methane concentration, x2 is source.

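$$F = \begin{pmatrix} 1 - \Delta t/\tau & a\,\Delta t \\ 0 & 1 \end{pmatrix} \qquad Q = \begin{pmatrix} 0 & 0 \\ 0 & Q \end{pmatrix}$$

H = [1, 0] (or [0, 0] if no data).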

Data uncertainty R, unit conversion factor, a.

Thus, source is modelled as a ‘random walk’.

Simplified from Trudinger et al, 2002.
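The matrices of this two-component model can be written out as in the sketch below. The values of the time step, lifetime, conversion factor and source variance are hypothetical placeholders, not those used by Trudinger et al. The loop runs the deterministic part only, to show that a constant source x2 drives the concentration x1 towards a τ x2.

```python
import numpy as np

dt, tau, a = 2.0, 10.0, 1.0   # hypothetical time step (yr), lifetime (yr), unit conversion
q_var = 1.0                   # hypothetical variance of the random-walk source increments

f = np.array([[1.0 - dt / tau, a * dt],
              [0.0, 1.0]])
q = np.array([[0.0, 0.0],
              [0.0, q_var]])
h_data = np.array([[1.0, 0.0]])   # a concentration measurement is available
h_none = np.array([[0.0, 0.0]])   # no data at this time step

# Deterministic evolution with a constant source (no stochastic forcing).
x = np.array([0.0, 70.0])
for _ in range(200):
    x = f @ x

equilibrium = a * tau * x[1]
```

Because only the source component receives stochastic forcing, the concentration changes smoothly while the source is free to drift, which is the ‘random walk’ assumption.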


21

Kalman filter response

Frequency domain characterisation of stationary

case of Kalman filter

E.g. noise as white noise, fn = R/2π.

Random walk model of forcing: fs ∝ 1/(1 − cos θ).

[Figure: frequency response (0.0 to 1.2) against frequency (0.00 to 0.25 y⁻¹) and period (20.0 to 4.0 y).]

R/Q ratio (9/81, 25/81, 49/36 for ∆t = 2, and 49/1 for ∆t = 20, right to left) changes the relative weights of ‘prior’ and observations in a frequency-dependent manner.

22

Kalman filter on CO2

Combined model, including

CO2 and 13CO2, using

concentration values

corrected for firn diffusion.

Case on left uses published

δ13C uncertainties.

Case on right multiplies

these by four.

Again, scale of uncertainty

affects the optimal

smoothing.

(From Trudinger et al., 2002).

[Figure: two sets of panels, a) to c) and d) to f), against year (1850 to 2000): a), d) CO2 (ppm); b), e) δ13C (permil); c), f) deduced sources (Gt C/yr) for ocean and biosphere.]


23

Time-dependent CO2 inversion

As with simple Kalman filter case, statistical

assumptions about structure of errors in time

can greatly influence what is estimated:

• Synthesis in terms of independent monthly

pulses (Rayner et al 1999), effectively

assumes no long-term systematic error.

• Representing prior information as

‘mean-plus-anomaly’ avoids artificially low

uncertainty on posterior mean.


24

The hard stuff: spatial statistics

• Two-dimensional, (or 2 + 1 when time

involved)

– more modes to consider

– more complex statistics are possible

• Non-stationary (c.f. Kalman filter)

• Response (atmospheric transport) involved

for concentration data (c.f. Kalman filter)


25

Spatial statistics:

In more than one

dimension, more complex

behaviour can occur.

Simple local conditional

dependence can lead to

long-range (fractal)

behaviour.

Closed-form expressions

for model statistics are

seldom known.

Identifying appropriate

statistical models is

correspondingly hard.

Example from statistical physics, showing multiple length-scales.


26

Terrestrial distribution

In process inversions, biome-specific distributions will

be modulated by climatic variations.

Map data from Matthews 1983.

27

Grassland footprint

Distribution of grasslands reflects a 13C signal due to

C3-C4 differences.


28

Number of modes

In an ill-conditioned inverse problem, only a

limited number of modes will be resolved by

the data.

• these may not be the modes that you want to know about

• you don’t get to choose which modes are resolved

• c.f. biased estimates by Fan et al. from putting a fine source discretisation in a region (North America) without a correspondingly fine data set.


29

Semi-quantitative

An exploratory signal-to-noise analysis doesn’t

need to be as precise as actual inversion.

$$\hat{x} = \sum_j a_j z_j$$

typically with lots of near-cancellations in an ill-conditioned problem.

$$E[(\hat{x} - x)^2] = \sum_j |a_j|^2 R_j$$

and so the variance calculation is much less

sensitive to errors in the aj (i.e. model error).
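A small numerical sketch of this point, with entirely hypothetical weights, observations and error variances: the weights nearly cancel in the estimate, so a 2% error in each weight changes the estimate drastically, while the variance, which sums squared weights, barely moves.

```python
import numpy as np

a = np.array([50.0, -49.0, 30.0, -30.5])   # hypothetical weights with near-cancellation
z = np.array([1.00, 1.01, 2.00, 1.98])     # hypothetical observations
r = np.full(4, 0.01)                        # hypothetical observation error variances R_j

def estimate(weights):
    """Linear estimate: sum_j a_j z_j."""
    return np.sum(weights * z)

def variance(weights):
    """Error variance of the estimate: sum_j |a_j|^2 R_j."""
    return np.sum(np.abs(weights) ** 2 * r)

# A 2% error in each weight, with alternating sign, standing in for model error.
a_err = a * np.array([1.02, 0.98, 1.02, 0.98])

rel_change_estimate = abs(estimate(a_err) - estimate(a)) / abs(estimate(a))
rel_change_variance = abs(variance(a_err) - variance(a)) / variance(a)
```

The perturbation pattern here is chosen to make the contrast obvious. Other patterns give different numbers, but the squared weights cannot cancel, so the variance stays comparatively insensitive.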


30

Toy model

• Model atmospheric transport as pure

anisotropic diffusion

• Eigenmodes are spherical harmonics.

Amplification factor for relating surface

sources to surface concentration response is:

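$\gamma_{nmk} \tanh(\gamma_{nmk})$ where

$$\frac{\gamma_{nmk}^2}{(p_0 - p_1)^2} = \frac{\kappa_\theta\,[n(n+1) + \alpha m^2]}{\kappa_p R_{\mathrm{earth}}^2} + \frac{2\pi i k}{\kappa_p T}$$

Vertical, N-S and E-W diffusion: $\kappa_p$, $\kappa_\theta$, $\alpha\kappa_\theta$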
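The amplification factor can be explored numerically with the sketch below, which assumes the reading of the formula in which γ² is a sum of a horizontal-diffusion term and an imaginary time-dependence term. The two dimensionless groups `ratio_h` and `ratio_t`, and the value of `alpha`, are hypothetical and set to one purely for illustration.

```python
import numpy as np

def amplification(n, m, k, ratio_h=1.0, alpha=1.0, ratio_t=1.0):
    """gamma * tanh(gamma) for the pure-diffusion toy model.

    ratio_h stands for kappa_theta (p0 - p1)^2 / (kappa_p R_earth^2) and
    ratio_t for 2 pi (p0 - p1)^2 / (kappa_p T); both values are hypothetical.
    """
    gamma_sq = ratio_h * (n * (n + 1) + alpha * m ** 2) + 1j * ratio_t * k
    gamma = np.sqrt(gamma_sq)
    return gamma * np.tanh(gamma)

# Zonal-mean (m = 0), time-independent (k = 0) modes at two wavenumbers.
amp_10 = abs(amplification(10, 0, 0))
amp_20 = abs(amplification(20, 0, 0))
growth = amp_20 / amp_10
```

For large γ the tanh factor is close to one, so the amplification grows roughly in proportion to n. Under these assumptions, doubling n roughly doubles the factor by which source estimates amplify concentration errors.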


31

Reality check

• Amplitude of responses for full 3-D model

are approximated well by the 1/n response

to latitudinal variation

• Inversion factor ∝ n, comparable to numerical differentiation

• Suggests that the latitudinally-integrated flux could have much less correlation than the actual flux estimates; confirmed by results from actual inversions


32

Generalities

• Satellite data don’t have the $1/\sqrt{n(n+1) + \alpha m^2}$ attenuation factor (but vertical averages are still attenuated relative to surface distribution).

Low precision can give valuable constraints.

• For comparable spatial resolution, E-W

sampling density possibly needs to be

greater than N-S density (however, consider

re-visiting this, with ‘toy model’ including

‘solid-body rotation’, for advective term).


33

From point to globe

Observations: fluxes, forcings and proxies

↓

Local parametric model

↓ ← Global distribution of forcing and/or proxy

Global model

Contributions to uncertainty in global model are:

global distribution * parametric uncertainty +

parametric sensitivity * uncertainty in distribution


34

What if it ain’t so?

If Pr(z|x) ∝ exp(−λ|z − h(x)|) and not

Pr(z|x) ∝ exp(−λ|z − h(x)|²) then

• 90% of the ‘nice’ mathematics goes away

• Results of inversions are terrible

It’s not the things you don’t know that get you into

trouble — it’s the things you know for sure that just

ain’t so. (Mark Twain)


35

Central value

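For $\Pr(z|\theta) = \exp(-|z - \theta|^2/\sigma^2)$ the maximum likelihood estimate is

$$\hat{\theta}_N = \frac{1}{N} \sum_n z_n \quad \text{(batch mode)}$$

$$\hat{\theta}_{n+1} = \hat{\theta}_n + \frac{1}{n+1}\,[z_{n+1} - \hat{\theta}_n] \quad \text{(recursive)}$$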

Each estimate θn contains all information about

all previous data values needed for optimal

estimator for the next step (given new data).

For Pr(z|θ) = exp(−|z − θ|/σ) the maximum

likelihood estimate θn is median of all previous

zm ( m ≤ n) — no compact stepwise algorithm.
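The contrast between the two cases can be sketched as follows, using hypothetical synthetic data. The recursive update reproduces the batch mean while storing only the current estimate and a count. For the absolute-value case, the code checks that the median minimises the sum of absolute deviations, and computing the median needs the whole data set.

```python
import numpy as np

rng = np.random.default_rng(1)
z = rng.normal(3.0, 1.0, 200)   # hypothetical data

# Gaussian case: recursive update, keeping only the current estimate.
theta = 0.0
for n, z_new in enumerate(z):
    # n values have been used so far; fold in the next one.
    theta = theta + (z_new - theta) / (n + 1)

batch_mean = z.mean()

# Double-exponential case: the maximum likelihood estimate minimises
# the sum of absolute deviations, which the median achieves.
def abs_deviation_cost(theta_trial, data):
    return np.sum(np.abs(data - theta_trial))

median = np.median(z)
cost_at_median = abs_deviation_cost(median, z)
cost_at_mean = abs_deviation_cost(batch_mean, z)
```

The initial value of `theta` is irrelevant because the first update replaces it entirely with the first datum.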


36

Implications for inversions

If Pr(z|x) ∝ exp(−λ|z − h(x)|) then

1: 90% of the ‘nice’ mathematics goes away

• you don’t have compact ‘sufficient

statistics’

• multi-variate cases are really messy

• automatic differentiation isn’t a lot of use

2: Results of inversions are terrible

• major features depend on how much you

believe Samoa


37

Emerging future directions

Inversion methodology

Shrinkage estimators :

Estimating variances :

Better statistics

Errors-in-variables : e.g. using proxy source

data.

Hidden Markov models :

Financial time series models : e.g.

Stochastic volatility.


38

Concluding thoughts:

Take-home questions:

• What are your calculations really

estimating?

• Can what you want really be estimated

given limits to resolution imposed by

model, data, ill-conditioning and

signal-to-noise ratio?

• Are the results, including residuals,

consistent with the statistical assumptions?


39

Further Information

• I. Enting: Characterising the Temporal Variability of

the Global Carbon Cycle. CSIRO Atmospheric

Research, Technical paper 40.

• I. Enting: Inverse Problems in Atmospheric

Constituent Transport. 2002, CUP.

• C. Rodenbeck: Estimating CO2 sources and sinks

. . . . MPI-BGC Technical Report 6.

See also:

• Trudinger et al (2002a,b): Kalman filter analysis of

ice-core data: 1 and 2. JGR

• Enting, Trudinger and Etheridge: Propagating data

uncertainty through smoothing spline fits. Tellus:

(in press, 2006).


40
