Introduction to ensemble forecasting Eric J. Kostelich SCHOOL OF MATHEMATICS AND STATISTICS MSRI Climate Change Summer School July 21, 2008

Introduction to ensemble forecasting

Eric J. Kostelich

SCHOOL OF MATHEMATICS AND STATISTICS

MSRI Climate Change Summer School

July 21, 2008

Introduction Data Mathematical Framework LETKF

Co-workers:

Istvan Szunyogh, Brian Hunt, Edward Ott,

Eugenia Kalnay, Jim Yorke

and many others!

Thanks to: Dave Kuhl

Papers, preprints, and codes:

http://www.weatherchaos.umd.edu

http://math.asu.edu/~eric

MSRI Lecture #2 E. Kostelich MATHEMATICS AND STATISTICS 2 / 32

Principal papers

Preprints: www.weatherchaos.umd.edu

Initial papers:
- E. Ott et al., Tellus A 56 (2004), 415–428.
- I. Szunyogh et al., Tellus A 57 (2005), 528–545.

Refined mathematical implementation: B. R. Hunt, E. K., I. Szunyogh, Physica D 230 (2007), 112–126.

Results with real data: I. Szunyogh, E. K. et al., Tellus A 60 (2008), 113–130.

Recap from last time

- In a chaotic process, every point is sensitive
- Uncertainties in initial conditions grow exponentially (at least for a while)
- The weather is chaotic (as far as anyone can tell)
- The uncertainty in the global weather vector roughly doubles every 2 days
- Forecast horizon: about 2 weeks
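As a rough illustration of the doubling rule, the sketch below assumes the uncertainty doubles exactly every 2 days with no saturation, which is an idealization. The helper name `error_growth_factor` is hypothetical.

```python
def error_growth_factor(days, doubling_time_days=2.0):
    """Amplification of an initial error after `days`, assuming pure exponential doubling."""
    return 2.0 ** (days / doubling_time_days)

for days in (2, 6, 14):
    factor = error_growth_factor(days)
    print(f"{days:2d} days: initial error grows by a factor of {factor:.0f}")
```

Under this idealization a two-week forecast spans seven doublings, which is one way to see why the forecast horizon is limited to about 2 weeks.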

Relevant U. S. organizations

- The National Oceanic and Atmospheric Administration (NOAA) is a division of the Department of Commerce
- The National Centers for Environmental Prediction (NCEP) is the division of NOAA responsible for developing and maintaining weather forecast models
- Spectrum of models: Global Forecast System (GFS), Regional Spectral Model (RSM), etc.
- Model data is distributed to local Weather Service offices, which generate public forecast products

Other important modeling efforts

- NASA develops and maintains its own forecast models
- International agreements to share forecasts and observations (NCEP, UK Met Office, ECMWF, Canada, Japan, Brazil, etc.)
- Research community: Weather Research and Forecasting model (WRF)
- NOAA and the U. S. Navy develop and maintain ocean models
- Private sector efforts: AccuWeather, airlines, etc.

What do we want to predict?

- The best long-term forecast is climatology (the mean is the maximum likelihood estimate)
- Prior to the mid-1960s, the initial condition was climatology
- The U. S. Weather Service defines “normal” as the 1971–2000 average
- Example: in Phoenix, Arizona, tomorrow’s weather will be sunny with 96% probability
- Exceptional weather often is of greatest interest

What is data assimilation?

- The process by which empirical measurements are incorporated into a forecast model to refine an estimate of the initial condition
- The distinction between variables and parameters is a matter of definition
- Operational weather forecast centers perform data assimilation steps 4 times per day (0Z, 6Z, 12Z, 18Z)
- Real-time constraints: NCEP allows 20 minutes

Measures of forecast quality

One objective measure of goodness:

〈forecast−observations〉

- A 72-hour forecast today is as accurate as a 36-hour forecast in 1985
- “Holy grail”: 7-day forecasts that are as accurate as 3-day forecasts are now
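The angle brackets in the measure above denote an average of the forecast-minus-observation difference. The slide does not say which norm is averaged; the sketch below assumes the root-mean-square difference, a common choice. The function name and station values are made up for illustration.

```python
import math

def rms_error(forecast, observations):
    """Root-mean-square difference between paired forecast and observed values."""
    if len(forecast) != len(observations):
        raise ValueError("forecast and observations must have the same length")
    squared = [(f - o) ** 2 for f, o in zip(forecast, observations)]
    return math.sqrt(sum(squared) / len(squared))

# Made-up surface temperatures (deg C) at four stations, for illustration only.
forecast = [21.0, 18.5, 25.0, 30.0]
observed = [20.0, 19.5, 25.0, 28.0]
print(f"RMS forecast error: {rms_error(forecast, observed):.3f} deg C")
```

Tracking a score like this at fixed lead times (36 hours, 72 hours, and so on) is what allows forecast skill to be compared across decades.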

Many applications besides weather

- Controls (e.g., airplane autopilots)
- Ocean and climate models (obviously)
- Biological models (e.g., Tim Sauer & Steve Schiff)
- Parameter estimation

Some fundamental problems

- Naive approach: direct insertion
  - Difficulty: there are usually many more grid points than available measurements
  - Does not account for errors in the measurement
  - Does not exploit correlations between nearby grid points
- The variables in the model are not necessarily the ones that can be easily measured

Example: Global Forecast System

Principal variables in the GFS:
- natural logarithm of surface pressure
- virtual temperature
- divergence and vorticity of the wind field

Principal measurements:
- barometric pressure
- sensible temperature
- relative humidity
- wind speed and direction
- satellite radiances (complicated!)

Typical 6-hour land surface dataset: 31,310 locations

Typical 6-hour surface marine dataset: 2,642 locations

Typical 6-hour satellite dataset: 53,842 locations

The observation space

- For these reasons, data assimilation is done in the observation space
- Given a vector of observations $y$, interpolate the model state $x$ to the same locations
- The interpolation operator is denoted $H$
- The innovation is $y - H(x)$
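To make the observation operator concrete, here is a minimal sketch on a one-dimensional grid, assuming H is plain linear interpolation. Operational operators are far more elaborate, especially for satellite radiances. The function name, grid, and values are hypothetical.

```python
def interpolate_to_obs(grid_x, grid_values, obs_x):
    """Linear interpolation H(x): model values on a 1-D grid -> values at observation locations."""
    result = []
    for loc in obs_x:
        if not grid_x[0] <= loc <= grid_x[-1]:
            raise ValueError(f"observation location {loc} lies outside the grid")
        # Find the grid interval [grid_x[i], grid_x[i + 1]] that contains loc.
        i = 0
        while i < len(grid_x) - 2 and grid_x[i + 1] < loc:
            i += 1
        w = (loc - grid_x[i]) / (grid_x[i + 1] - grid_x[i])
        result.append((1.0 - w) * grid_values[i] + w * grid_values[i + 1])
    return result

grid_x = [0.0, 1.0, 2.0, 3.0]       # grid-point coordinates (hypothetical)
x_model = [10.0, 12.0, 16.0, 18.0]  # model state x on the grid
obs_x = [0.5, 2.25]                 # observation locations
y = [11.4, 16.0]                    # observed values

hx = interpolate_to_obs(grid_x, x_model, obs_x)
innovation = [yi - hi for yi, hi in zip(y, hx)]
print("H(x) =", hx)
print("innovation y - H(x) =", innovation)
```

The comparison happens at the observation locations rather than on the grid, so the sparse observations never have to be spread onto every grid point.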

Basic idea: Weighted least squares

Observations: $y \in \mathbb{R}^p$, $y = H(x_t) + \varepsilon$

Observation errors: $E(\varepsilon) = 0$, $E(\varepsilon\varepsilon^T) = R$

Model forecast (“background”): $x \in \mathbb{R}^n$, $x_b = x_t + \eta$

Model errors: $E(\eta) = 0$, $E(\eta\eta^T) = P_b$

Goal: minimize the objective function

$$J(x) = [y - H(x)]^T R^{-1} [y - H(x)] + (x - x_b)^T P_b^{-1} (x - x_b)$$

Minimization produces an analysis $x_a$ with associated covariance $P_a$
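When the observation operator is linear, H(x) = Hx, setting the gradient of J to zero gives a closed-form minimizer. The sketch below works a tiny example (n = 3, p = 2) with made-up matrices and values. It uses a direct solve, which is feasible only for small problems and is not how an operational system proceeds.

```python
import numpy as np

# Illustrative values only: 3 model variables, 2 observations.
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.5, 0.5]])          # linear observation operator
R = np.diag([0.25, 0.5])                 # observation error covariance
P_b = np.array([[1.0, 0.5, 0.0],
                [0.5, 1.0, 0.5],
                [0.0, 0.5, 1.0]])        # background error covariance
x_b = np.array([1.0, 2.0, 3.0])          # background (forecast) state
y = np.array([1.4, 2.2])                 # observations

def cost(x):
    """Objective J(x): weighted misfit to observations plus weighted misfit to background."""
    d = y - H @ x
    dx = x - x_b
    return d @ np.linalg.solve(R, d) + dx @ np.linalg.solve(P_b, dx)

# Setting the gradient of J to zero gives
#   (P_b^{-1} + H^T R^{-1} H) x_a = P_b^{-1} x_b + H^T R^{-1} y.
R_inv = np.linalg.inv(R)
P_b_inv = np.linalg.inv(P_b)
A = P_b_inv + H.T @ R_inv @ H
x_a = np.linalg.solve(A, P_b_inv @ x_b + H.T @ R_inv @ y)
P_a = np.linalg.inv(A)                   # analysis error covariance

print("analysis x_a:", x_a)
print("J(x_b) =", cost(x_b), " J(x_a) =", cost(x_a))
```

The analysis pulls the background toward the observations, weighting each by the inverse of its error covariance; the third variable is adjusted even though it is observed only in combination, because P_b correlates it with its neighbour.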

Simplest assumptions

- The observation errors $\varepsilon$ are normally distributed with mean 0 and covariance $R$
- Model errors similarly: $N(0, P_b)$
- When the underlying model is linear, it can be shown that the minimizer $x_a$ of $J$ is unique, unbiased, and has minimum variance among all linear estimators
- Weather models are “linear enough” over 6-hour intervals, but there is no guarantee of optimality

The dimensionality problem

Must evaluate

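$$J(x) = [y - H(x)]^T R^{-1} [y - H(x)] + (x - x_b)^T P_b^{-1} (x - x_b)$$

where $y \in \mathbb{R}^p$, $x \in \mathbb{R}^n$

- Current NCEP operations: $p \sim 1.75$ million and $n \sim 3$ billion
- We need $R^{-1}$ ($p \times p$) and $P_b^{-1}$ ($n \times n$)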
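A back-of-the-envelope calculation shows why these matrices cannot simply be stored, let alone inverted. The sketch assumes dense storage in 8-byte double precision, which is an assumption made here for illustration; the values of n and p are the ones quoted above.

```python
n = 3_000_000_000      # model variables (n ~ 3 billion)
p = 1_750_000          # observations (p ~ 1.75 million)
BYTES_PER_ENTRY = 8    # assumption: dense storage in double precision

def dense_matrix_bytes(size):
    """Bytes needed to store a dense size-by-size matrix."""
    return size * size * BYTES_PER_ENTRY

print(f"R   ({p} x {p}): {dense_matrix_bytes(p) / 1e12:.1f} TB")
print(f"P_b ({n} x {n}): {dense_matrix_bytes(n) / 1e18:.0f} EB")
```

Even before any arithmetic is done, a dense background covariance at this resolution is far beyond the memory of any computer, which motivates the reduction strategies discussed next.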

The computational complexity problem

- Inversion of a $k \times k$ matrix is an $O(k^3)$ algorithm
- If a $100 \times 100$ matrix takes $\sim 1$ sec to invert, then a $10^9 \times 10^9$ matrix takes $\sim 10^{21}$ sec
- $R$ is nearly diagonal if observation errors are mostly uncorrelated
- $P_b$ is not diagonal
- Computing $P_b(t + \Delta t)$ from $P_a(t)$ requires integration of the tangent linear model
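The cubic scaling argument can be checked with a few lines of arithmetic. The sketch assumes pure O(k^3) scaling from the 100 × 100 reference case, ignoring memory limits and parallelism; the function name is hypothetical.

```python
def scaled_inversion_time(k, k_ref=100, t_ref_sec=1.0):
    """Estimated time to invert a k-by-k matrix, assuming pure O(k^3) scaling
    from a reference k_ref-by-k_ref inversion that takes t_ref_sec seconds."""
    return t_ref_sec * (k / k_ref) ** 3

SECONDS_PER_YEAR = 365.25 * 24 * 3600

t = scaled_inversion_time(1e9)
print(f"{t:.1e} seconds, or about {t / SECONDS_PER_YEAR:.1e} years")
```

Increasing the matrix size by a factor of 10^7 multiplies the cost by (10^7)^3 = 10^21, so direct inversion at full dimension is out of the question.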

Complexity reduction strategies

- Localization: Try to do the minimization over smaller regions of the globe
- Estimate and precompute $P_b^{-1}$: Assume that the forecast uncertainty is approximately constant from one day to the next. (Used in all current operational DA systems)
- Thin the observations and use only the “most important” ones

Each strategy has drawbacks

- Assuming $P_b \approx$ constant ignores the “errors of the day”
  - Generally regarded as one of the key impediments to better forecasts
- The result of sequential assimilation of observations depends on the order of processing
- Must assure continuity at the boundaries of the smaller regions

The Local Ensemble Transform Kalman Filter (LETKF)

- Addresses many of these problems
- Exploits the “geometry of uncertainty” in chaotic processes to lower the dimension but still account for errors of the day
- Assimilates all the data at once
- Uses localization and sets of observations that vary slowly in space to help assure continuity
- Permits efficient implementation on massively parallel computers

The geometry of forecast uncertainty

- The size of a typical high- or low-pressure system is about 1000 km × 1000 km (≈ Texas)
- The GFS, when run at medium (T62) resolution, contains about 3000 grid-point variables in Texas-sized regions
- Suppose we run $k$ statistically equivalent forecasts
- What are the singular values of the resulting $3000 \times k$ forecast matrix $X_F$?

Correlation and dimensionality

- Over most Texas-sized regions, one solution looks much like another
- The columns of $X_F$ tend to be highly correlated, so the SVD of $X_F$ yields a good rank-$r$ approximation even when $r \ll k$
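The effect of correlated columns on the singular values can be seen with synthetic data. The sketch below builds a stand-in "forecast matrix" whose columns are mixtures of a few shared patterns plus small noise. The sizes, noise level, and names are arbitrary choices for illustration, not values from any forecast model.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, r_true = 300, 50, 5    # illustrative sizes, much smaller than 3000 x k

# Synthetic "forecast matrix": k columns that are mixtures of r_true shared
# patterns plus small independent noise, so the columns are highly correlated.
patterns = rng.standard_normal((n, r_true))
weights = rng.standard_normal((r_true, k))
X_F = patterns @ weights + 0.01 * rng.standard_normal((n, k))

U, s, Vt = np.linalg.svd(X_F, full_matrices=False)

def rank_r_approx(r):
    """Best rank-r approximation of X_F in the Frobenius norm (truncated SVD)."""
    return (U[:, :r] * s[:r]) @ Vt[:r, :]

for r in (1, 3, 5, 10):
    rel_err = np.linalg.norm(X_F - rank_r_approx(r)) / np.linalg.norm(X_F)
    print(f"r = {r:2d}: relative error {rel_err:.4f}")
```

In this construction the relative error drops sharply once r reaches the number of shared patterns, which is the behaviour the slide describes for real ensembles over synoptic-scale regions.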

Key empirical finding

This was a key finding by D. J. Patil et al., Phys. Rev. Lett. 86 (2001), 5878–5881.

- GFS at T62 resolution: $\sim 3000$ grid variables over typical Texas-sized region
- Typical ensemble of $100 \le k \le 200$ forecasts generates a $3000 \times k$ forecast matrix $X_F$ whose first $r$ singular vectors, $40 \le r \le 80$, yield an excellent approximation of the forecast uncertainty

The ensemble dimension

The ensemble dimension (E-dimension) of an $n \times k$ matrix with singular values $s_1, s_2, \ldots, s_k$ is

$$E \equiv \frac{(s_1 + s_2 + \cdots + s_k)^2}{s_1^2 + s_2^2 + \cdots + s_k^2}$$

- Measures the eccentricity of the “ellipse” of forecast uncertainty
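The definition translates directly into code. The sketch below uses a hypothetical helper `e_dimension` and evaluates it on the singular-value pairs quoted in the examples that follow; because those values are rounded to two decimals, the computed results may differ from the quoted E-dimensions in the last digit.

```python
def e_dimension(singular_values):
    """Ensemble dimension: (sum of s_i)^2 / (sum of s_i^2)."""
    total = sum(singular_values)
    total_sq = sum(s * s for s in singular_values)
    if total_sq == 0:
        raise ValueError("at least one singular value must be nonzero")
    return total * total / total_sq

# Singular-value pairs quoted on the example slides.
for s in [(3.78, 3.60), (19.24, 4.35), (83.65, 4.33)]:
    print(s, f"E-dimension = {e_dimension(s):.2f}")
```

The two limiting cases are worth keeping in mind: E equals 1 when only one singular value is nonzero (all the uncertainty lies along a line), and E equals k when all k singular values are equal (the uncertainty is spread evenly over k directions).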

Example: s1 = 3.78, s2 = 3.60, Edim = 1.99

[Figure: plot for this example; both axes range from −1 to 1.]

Example: s1 = 19.24, s2 = 4.35, Edim = 1.43

[Figure: plot for this example; horizontal axis ranges from −5 to 5, vertical axis from −4 to 4.]

Example: s1 = 83.65, s2 = 4.33, Edim = 1.10

[Figure: plot for this example; horizontal axis ranges from −15 to 15, vertical axis from −10 to 10.]

The key idea behind the LETKF

- If the E-dimension is much less than the dimension of the overall space, then the distribution is “flat”
- The ensemble forecast uncertainty over a typical synoptic region resembles a “pancake” (at least for short intervals)
- Reduce the dimensionality of the problem by changing coordinates to the $r$-dimensional subspace containing most of the forecast uncertainty
- The dynamics reduces the uncertainty in the remaining directions
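The change of coordinates can be sketched with a synthetic ensemble. This is not the LETKF itself, whose details are deferred to the next lecture; it only illustrates, under made-up sizes and spreads, how an SVD of the ensemble perturbations yields an r-dimensional coordinate system that captures most of the spread.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k, r = 200, 40, 4         # illustrative sizes only

# Synthetic ensemble: k forecasts whose spread lies mostly in r directions.
basis_true, _ = np.linalg.qr(rng.standard_normal((n, r)))
spread = np.array([5.0, 3.0, 2.0, 1.0])
ensemble = (basis_true * spread) @ rng.standard_normal((r, k))
ensemble += 0.05 * rng.standard_normal((n, k))

# Perturbations about the ensemble mean.
mean = ensemble.mean(axis=1, keepdims=True)
X = ensemble - mean

# The leading r left singular vectors span the subspace of largest uncertainty.
U, s, _ = np.linalg.svd(X, full_matrices=False)
basis = U[:, :r]                       # n x r, orthonormal columns

coords = basis.T @ X                   # r x k: ensemble in reduced coordinates
captured = np.sum(s[:r] ** 2) / np.sum(s ** 2)
print(f"fraction of ensemble variance in the {r}-dimensional subspace: {captured:.4f}")
```

Working with the r × k array `coords` instead of the n × k perturbation matrix is the "pancake" picture in practice: the thin directions are dropped and the analysis is carried out in the few directions that matter.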

Next lecture

- Outline of the Kalman filter
- Mathematical details of how we accomplish the dimension reduction
- Results with operational models and real observations
