
© Crown copyright Met Office

Research Issues in Data Assimilation for Operational NWP

Andrew Lorenc. Banff International Research Station for Mathematical Innovation and Discovery. Mathematical Advancement in Geophysical Data Assimilation 3-8 February 2008.


Contents

This presentation covers the following areas

• Advanced DA methods: use good model, weight information, predict errors.

• Gaussian assumption: needed for practical methods for very-large models. 4D-Var & Ensemble KF.

• Non-linearities break Gaussian assumption, give butterfly effect and chaotic attractor. Can be addressed sub-optimally.

• Convective-scale NWP is becoming feasible & presents revised challenges.



Data Assimilation is the process of absorbing and incorporating observed information into a prognostic model.

OED "assimilate, v. t. … II: to absorb and incorporate."

ASSIMILATION MODEL

OBSERVATIONS

This is normally done by integrating the model forward in time, adding observations.

• The model state summarises in an organised way the information from earlier observations.

• It is modified to incorporate new observations, by combining new & old information in a statistically optimal way.
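As a minimal sketch of what "statistically optimal" means in the simplest case, the snippet below combines one background value with one observation of the same quantity, weighting each by the inverse of its error variance. It assumes unbiased, uncorrelated, Gaussian errors; the function name `combine` and the numbers are illustrative only and are not taken from the presentation.

```python
def combine(x_b, var_b, y_o, var_o):
    """Minimum-variance combination of a background value and one observation.

    Assumes unbiased, mutually uncorrelated errors with known variances.
    """
    k = var_b / (var_b + var_o)      # weight given to the observation
    x_a = x_b + k * (y_o - x_b)      # analysis: background plus weighted innovation
    var_a = (1.0 - k) * var_b        # analysis error variance (never larger than var_b)
    return x_a, var_a


# Equal error variances: the analysis is the average and the variance halves.
x_a, var_a = combine(x_b=10.0, var_b=4.0, y_o=12.0, var_o=4.0)
print(x_a, var_a)
```

The analysis variance is smaller than either input variance, which is why information from earlier observations (carried by the model) and new observations are both worth keeping.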


Data Assimilation, to be good,

1. Needs a good model:

• to carry information from past observations to current time;

• to diagnose unobserved quantities via physical modelling relationships.

2. Needs statistical-dynamical combination of information.

• Forecasts are generally more informative than latest observations, yet all the information from each observation should be extracted.

• Observation networks are incomplete. Information on unobserved variables must be inferred (e.g. from satellite radiances).

• It is impossible to properly sample error distributions – physical insight is needed to give:

  • a good model of observational variances and biases.

  • a good model of the structure of forecast errors.

3. Advanced Data Assimilation methods also use models to predict the evolution of forecast errors.



Performance Improvements

RMS surface pressure error over the NE Atlantic

“Improved by about a day per decade”

N. Hemisphere T+72 RMSE – MSLP. Verification vs. analysis

S. Hemisphere T+72 RMSE – MSLP. Verification vs. analysis


UK Index Improvement: skill scores vs UK SYNOPS for T, wind, ppn, cloud, visibility

“Improved by about 6 hours every 2.5 years - about a day per decade”


Importance of forecast model

• A large part of the increase in assimilation accuracy comes from improvements to the model

• A large part of the increase in model accuracy comes from improvements in resolution

• The resolution has been limited by computer power, so the increase in skill is related to Moore’s Law.

• Still true today – a larger part of planned increases in computer power will be spent on increased resolution than on improved algorithms


Met Office plans on new computer (~×8) in 2009

• Global 25km model

  • Incremental 4D-Var

  • 60km 24-member ETKF ensemble

• Regional NAE 12km model

  • Incremental 4D-Var

  • 16km 24-member ETKF ensemble

• UK 1.5km model (stretched)

  • 3D-Var + nudging of ppn & cloud

  • Ensemble driven by NAE perturbations (experimental)

  • Small domain 4D-Var RUC (experimental)


J. Charney, M. Halem, and R. Jastrow (1969), J. Atmos. Sci. 26, 1160-1163.

Use of incomplete historical data to infer the present state of the atmosphere

OSSE using Mintz-Arakawa model : 9° x 7° x 2 levels.

Satellite sounders could become a major part of the global OS.

Direct insertion of satellite temperature retrievals is a viable DA method.


ECMWF

Evolution of the r.m.s. day-one 500hPa height forecast error 1981-2001

sonde Z500 ob. error ~10m!

Simmons & Hollingsworth, 2002


S.Hem. Z500 T+24 rms v analyses

N.Hem. Z500 T+24 rms v analyses

[Figures: time series for each hemisphere, annotated with system changes: 3D-Var + ATOVS; ATOVS; Model + Cov; Radiances + Cov; FGAT + Cov; NOAA16 + AMSU-B; 2nd ATOVS; New stats; 4D-Var; 12hr 4D-Var; Higher res.]


Bayes’ Theorem

Observation: $y^o = (x_1 + x_2)/2$


Fokker-Planck Equation

Ensemble methods attempt to sample entire PDF.



Gaussian Probability Distribution Functions

• Easier to fit to sampled errors.

• Quadratic optimisation problems, with linear solution methods – much more efficient.

• The Kalman filter is optimal for linear models, but

  • it is not affordable for expensive models (despite the “easy” quadratic problem)

  • it is not optimal for nonlinear models.

• Advanced methods based on the Kalman filter can be made affordable:

• Ensemble Kalman filter (EnKF, ETKF, ...)

• Four-dimensional variational assimilation (4D-Var)
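To make the "quadratic problem with a linear solution" concrete, here is a minimal sketch of the Kalman filter analysis step for a two-variable state, using the observation operator from the Bayes' Theorem slide, where the observation is the average of the two state variables. The background values, covariances and the helper name `kalman_analysis` are assumptions chosen for illustration.

```python
import numpy as np


def kalman_analysis(x_b, B, y_o, H, R):
    """Linear, Gaussian analysis step: returns the analysis and its covariance."""
    S = H @ B @ H.T + R                      # innovation covariance
    K = B @ H.T @ np.linalg.inv(S)           # Kalman gain
    x_a = x_b + K @ (y_o - H @ x_b)          # analysis state
    A = (np.eye(len(x_b)) - K @ H) @ B       # analysis error covariance
    return x_a, A


x_b = np.array([0.0, 0.0])                   # background (illustrative)
B = np.eye(2)                                # background error covariance (illustrative)
H = np.array([[0.5, 0.5]])                   # observes (x1 + x2) / 2
y_o = np.array([1.0])
R = np.array([[0.5]])                        # observation error variance (illustrative)

x_a, A = kalman_analysis(x_b, B, y_o, H, R)
print(x_a)
print(A)
```

Neither variable is observed directly, yet both are updated, and the analysis errors become negatively correlated: the observation constrains the sum but not the difference. For an operational model the matrices are far too large to form explicitly, which is the affordability problem noted above.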


Ensemble Kalman filter

Fit Gaussian to forecast ensemble.



EnKF

[Figure: EnKF schematic.]

Errors in sampled EnKF covariances

[Figure: covariance against distance (0 to 3000 km), sampled from an ensemble of size N=100.]


Localisation: The Schur or Hadamard Product

[Figure: covariance against distance (0 to 3000 km) for n=100, multiplied by a compact-support function.]
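The effect of sampling error and of Schur-product localisation can be sketched numerically. The example below draws 100 members from an assumed Gaussian-shaped "true" covariance on a 1-D line of points, forms the sample covariance, and multiplies it element by element with a compact-support function. The Gaspari-Cohn fifth-order function is used here as one common choice; the presentation does not say which function was used, and the length scales (300 km and 750 km) are illustrative assumptions.

```python
import numpy as np


def gaspari_cohn(d, c):
    """Gaspari-Cohn compactly supported correlation function; zero beyond 2*c."""
    r = np.abs(d) / c
    rho = np.zeros_like(r)
    m1 = r <= 1.0
    m2 = (r > 1.0) & (r < 2.0)
    r1, r2 = r[m1], r[m2]
    rho[m1] = -0.25 * r1**5 + 0.5 * r1**4 + 0.625 * r1**3 - (5.0 / 3.0) * r1**2 + 1.0
    rho[m2] = (r2**5 / 12.0 - 0.5 * r2**4 + 0.625 * r2**3 + (5.0 / 3.0) * r2**2
               - 5.0 * r2 + 4.0 - 2.0 / (3.0 * r2))
    return rho


rng = np.random.default_rng(0)
x = np.arange(0.0, 3001.0, 50.0)                 # positions along a line (km)
d = np.abs(x[:, None] - x[None, :])              # pairwise distances
B_true = np.exp(-0.5 * (d / 300.0) ** 2)         # assumed true covariance

# Draw N members with covariance B_true via a symmetric square root.
w, V = np.linalg.eigh(B_true)
sqrt_B = V * np.sqrt(np.clip(w, 0.0, None))
N = 100
members = sqrt_B @ rng.standard_normal((x.size, N))
pert = members - members.mean(axis=1, keepdims=True)
P_sample = pert @ pert.T / (N - 1)               # noisy sample covariance

P_local = gaspari_cohn(d, 750.0) * P_sample      # Schur (element-wise) product

far = d > 1500.0                                 # beyond the localisation support
far_raw = np.sqrt(np.mean((P_sample[far] - B_true[far]) ** 2))
far_loc = np.sqrt(np.mean((P_local[far] - B_true[far]) ** 2))
print(far_raw, far_loc)
```

At long range the true covariance is essentially zero, so the sampled values there are pure noise and localisation removes them. At short range localisation also damps real signal, so the choice of length scale is a compromise.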


Flavours of Ensemble Kalman Filter

• EnKF: closest to KF. Allows Schur product localisation. Uses perturbed observations to get correct spread. (e.g. Houtekamer & Mitchell Canada)

• SQRT filters: Allows Schur product localisation. Deterministic equation gives correct spread. Efficient with serial processing of obs. (e.g. Tippett, Anderson, Bishop, Hamill, Whitaker)

• ETKF: Localised by data selection. Deterministic equation gives correct spread. Efficient because matrices are order ensemble size. (e.g. Bowler Met Office, Kalnay, Ott, Hunt et al. Univ Maryland, Miyoshi Japan)



Deterministic 4D-Var

Initial PDF is approximated by a Gaussian.

Descent algorithm only explores a small part of the PDF, on the way to a local minimum.


Simple 4D-Var, as a least-squares best fit of a deterministic model trajectory to observations



Assumptions in deriving deterministic 4D-Var

Bayes Theorem - posterior PDF:

where the obs likelihood function is given by:

Impossible to evaluate the integrals necessary to find “best”.

Instead assume the best x maximises the PDF, and so minimises -ln(PDF):

(Purser 1984, Lorenc 1986)


The deterministic 4D-Var equations

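Bayesian posterior pdf:

$$P(\mathbf{x} \mid \mathbf{y}^o) \propto P(\mathbf{x})\, P(\mathbf{y}^o \mid \mathbf{x})$$

Assume Gaussians:

$$P(\mathbf{x}) \propto \exp\left(-\tfrac{1}{2}(\mathbf{x}-\mathbf{x}^b)^T \mathbf{B}^{-1} (\mathbf{x}-\mathbf{x}^b)\right)$$

$$P(\mathbf{y}^o \mid \mathbf{x}) = P(\mathbf{y}^o \mid \mathbf{y}) \propto \exp\left(-\tfrac{1}{2}(\mathbf{y}-\mathbf{y}^o)^T \mathbf{R}^{-1} (\mathbf{y}-\mathbf{y}^o)\right)$$

But the nonlinear model makes the pdf non-Gaussian: the full pdf is too complicated to be allowed for.

$$\mathbf{y} = H(M(\mathbf{x}))$$

So seek the mode of the pdf by finding the minimum of the penalty function:

$$J(\mathbf{x}) = \tfrac{1}{2}(\mathbf{x}-\mathbf{x}^b)^T \mathbf{B}^{-1} (\mathbf{x}-\mathbf{x}^b) + \tfrac{1}{2}(\mathbf{y}-\mathbf{y}^o)^T \mathbf{R}^{-1} (\mathbf{y}-\mathbf{y}^o)$$

$$\nabla_{\mathbf{x}} J(\mathbf{x}) = \mathbf{B}^{-1}(\mathbf{x}-\mathbf{x}^b) + \mathbf{M}^{*}\mathbf{H}^{*}\mathbf{R}^{-1}(\mathbf{y}-\mathbf{y}^o)$$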
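The penalty function and its adjoint-based gradient can be sketched for a toy problem. The example below assumes a linear two-variable model applied once per time step, so the adjoint is simply the matrix transpose, with observations of the first variable at each of ten steps. All matrices and values are illustrative; a real system replaces `M.T` and `H.T` with adjoint code for the forecast model and observation operators.

```python
import numpy as np


def cost_and_grad(x0, x_b, B_inv, M, H, R_inv, obs):
    """4D-Var penalty J(x0) and its gradient for a linear model x_{k+1} = M x_k."""
    dxb = x0 - x_b
    J = 0.5 * dxb @ B_inv @ dxb
    # Forward sweep: run the model and store the departures y - y^o.
    x = x0.copy()
    departures = []
    for y_o in obs:
        x = M @ x
        dep = H @ x - y_o
        departures.append(dep)
        J += 0.5 * dep @ R_inv @ dep
    # Backward sweep: apply the adjoint model to the accumulated forcing.
    adj = np.zeros_like(x0)
    for dep in reversed(departures):
        adj = M.T @ (adj + H.T @ R_inv @ dep)
    return J, B_inv @ dxb + adj


def fd_grad(f, x0, eps=1e-6):
    """Central finite-difference gradient, used only to check the adjoint."""
    g = np.zeros_like(x0)
    for i in range(x0.size):
        e = np.zeros_like(x0)
        e[i] = eps
        g[i] = (f(x0 + e) - f(x0 - e)) / (2.0 * eps)
    return g


M = np.array([[1.0, 0.1], [-0.1, 1.0]])      # toy linear model (assumption)
H = np.array([[1.0, 0.0]])                   # observe the first variable only
B_inv = np.eye(2)
R_inv = np.array([[100.0]])
x_true = np.array([1.0, 0.0])

obs, x = [], x_true.copy()
for _ in range(10):                          # perfect observations of the truth
    x = M @ x
    obs.append(H @ x)

x_b = np.array([0.5, 0.5])
J, grad = cost_and_grad(x_b, x_b, B_inv, M, H, R_inv, obs)
grad_fd = fd_grad(lambda v: cost_and_grad(v, x_b, B_inv, M, H, R_inv, obs)[0], x_b)
print(J, grad, grad_fd)
```

Comparing the adjoint gradient with finite differences is a standard correctness check. The adjoint sweep costs about as much as one extra model run, whatever the size of the state, which is what makes a descent algorithm practical.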



Limits to deterministic 4D-Var with perfect, chaotic model.

Cross section of penalty function for 4D-Var in the Lorenz 3-variable model, for different length of time-window.

If the 4D-Var time-window > chaotic timescale, then the penalty function will be multi-modal.

Pires et al. (1996)
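A cross-section of this kind can be reproduced approximately with a short script. The sketch below uses the Lorenz three-variable model with its usual parameters (10, 28, 8/3), perfect observations of all three variables, and a perturbation applied only to the first variable of the initial state. The observation spacing, window lengths and helper names are assumptions made for illustration, not the settings of Pires et al.

```python
import numpy as np


def lorenz63(s, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Lorenz three-variable tendencies; works on arrays of shape (..., 3)."""
    x, y, z = s[..., 0], s[..., 1], s[..., 2]
    return np.stack([sigma * (y - x), x * (rho - z) - y, x * y - beta * z], axis=-1)


def rk4_step(s, dt):
    k1 = lorenz63(s)
    k2 = lorenz63(s + 0.5 * dt * k1)
    k3 = lorenz63(s + 0.5 * dt * k2)
    k4 = lorenz63(s + dt * k3)
    return s + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def cost_cross_section(s0, deltas, window, dt=0.01, obs_every=10):
    """Observation penalty along a line x0 + delta through the true initial state."""
    truth = s0[None, :].copy()
    trial = np.tile(s0, (deltas.size, 1))
    trial[:, 0] += deltas
    J = np.zeros(deltas.size)
    for step in range(1, int(round(window / dt)) + 1):
        truth = rk4_step(truth, dt)
        trial = rk4_step(trial, dt)
        if step % obs_every == 0:            # perfect observations, R = I
            J += 0.5 * np.sum((trial - truth) ** 2, axis=1)
    return J


def count_local_minima(J):
    return int(np.sum((J[1:-1] < J[:-2]) & (J[1:-1] < J[2:])))


s0 = np.array([1.0, 1.0, 1.0])
for _ in range(1000):                        # spin up onto the attractor
    s0 = rk4_step(s0, 0.01)

deltas = np.linspace(-2.0, 2.0, 81)
J_short = cost_cross_section(s0, deltas, window=0.5)
J_long = cost_cross_section(s0, deltas, window=8.0)
print(count_local_minima(J_short), count_local_minima(J_long))
```

With a short window the cross-section should look close to a parabola. As the window is lengthened beyond the chaotic timescale, additional local minima are expected to appear, and a descent algorithm started away from the truth can then be trapped in one of them.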


Limits to deterministic 4D-Var with turbulence model

Tanguay and Gauthier (1995) showed deterministic 4D-Var does not work for a wide range of scales.


Statistical, incremental 4D-Var

Statistical 4D-Var approximates entire PDF by a Gaussian.


Statistical 4D-Var - equations

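Independent, Gaussian background and model errors ⇒ non-Gaussian pdf for general y:

$$
\begin{aligned}
P(\delta\mathbf{x}, \delta\boldsymbol{\eta} \mid \mathbf{y}^o) \propto{} & \exp\left(-\tfrac{1}{2}\left(\delta\mathbf{x}-(\mathbf{x}^b-\mathbf{x}^g)\right)^T \mathbf{B}^{-1} \left(\delta\mathbf{x}-(\mathbf{x}^b-\mathbf{x}^g)\right)\right) \\
& \times \exp\left(-\tfrac{1}{2}\left(\delta\boldsymbol{\eta}+\boldsymbol{\eta}^g\right)^T \mathbf{Q}^{-1} \left(\delta\boldsymbol{\eta}+\boldsymbol{\eta}^g\right)\right) \\
& \times \exp\left(-\tfrac{1}{2}\left(\mathbf{y}-\mathbf{y}^o\right)^T \mathbf{R}^{-1} \left(\mathbf{y}-\mathbf{y}^o\right)\right)
\end{aligned}
$$

Incremental linear approximations in forecasting model predictions of observed values convert this to an approximate Gaussian pdf:

$$\mathbf{y} = \tilde{\mathbf{H}}\tilde{\mathbf{M}}(\delta\mathbf{x}, \delta\boldsymbol{\eta}) + H\left(M(\mathbf{x}^g, \boldsymbol{\eta}^g)\right)$$

The mean of this approximate pdf is identical to the mode, so it can be found by minimising:

$$
\begin{aligned}
J(\delta\mathbf{x}, \delta\boldsymbol{\eta}) ={} & \tfrac{1}{2}\left(\delta\mathbf{x}-(\mathbf{x}^b-\mathbf{x}^g)\right)^T \mathbf{B}^{-1} \left(\delta\mathbf{x}-(\mathbf{x}^b-\mathbf{x}^g)\right) \\
& + \tfrac{1}{2}\left(\delta\boldsymbol{\eta}+\boldsymbol{\eta}^g\right)^T \mathbf{Q}^{-1} \left(\delta\boldsymbol{\eta}+\boldsymbol{\eta}^g\right) \\
& + \tfrac{1}{2}\left(\mathbf{y}-\mathbf{y}^o\right)^T \mathbf{R}^{-1} \left(\mathbf{y}-\mathbf{y}^o\right)
\end{aligned}
$$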


Incremental 4D-Var with Outer Loop

[Schematic with components: background x^b and guess x^g; FULL FORECAST MODEL producing y (outer, full-resolution iteration); PERTURBATION FORECAST MODEL, ADJOINT OF P.F. MODEL and DESCENT ALGORITHM, with transforms U and U^T (inner, low-resolution incremental variational iteration); increments δx (+ δη) added to the guess (+ η). The model error terms are optional.]


Perturbation Forecast model for Incremental 4D-Var

• Minimise:

$$I = E\left\{ \left\| M(\mathbf{x}+\delta\mathbf{x}) - M(\mathbf{x}) - \tilde{\mathbf{M}}\,\delta\mathbf{x} \right\| \right\}$$

• Designed to give best fit for finite perturbations

• Not Tangent-Linear

• Requires physical insight – not just automatic differentiation

• Filters unpredictable scales and rounds IF tests

[Figure: cloud fraction against (RHtotal-1)/(1-RHcrit). Tim Payne]
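The difference between a tangent-linear slope and a best fit for finite perturbations can be illustrated with a single IF-test. The sketch below uses a hypothetical ramp for cloud fraction as a function of s = (RHtotal-1)/(1-RHcrit): zero below s = -1, rising linearly to one at s = 0. This ramp is an assumption for illustration, not the scheme used in the model. The best-fit slope minimises the expected squared misfit over Gaussian perturbations of an assumed standard deviation.

```python
import numpy as np


def cloud_fraction(s):
    """Hypothetical ramp: no cloud below s = -1, full cover at s = 0."""
    return np.clip(1.0 + s, 0.0, 1.0)


def tangent_linear_slope(s0, eps=1e-6):
    """Slope for infinitesimal perturbations about s0."""
    return float((cloud_fraction(s0 + eps) - cloud_fraction(s0 - eps)) / (2.0 * eps))


def best_fit_slope(s0, sigma, n=20001):
    """Slope m minimising E[(c(s0 + d) - c(s0) - m d)^2] for d ~ N(0, sigma^2)."""
    delta = np.linspace(-4.0 * sigma, 4.0 * sigma, n)
    w = np.exp(-0.5 * (delta / sigma) ** 2)          # Gaussian weights
    change = cloud_fraction(s0 + delta) - cloud_fraction(s0)
    return float(np.sum(w * delta * change) / np.sum(w * delta**2))


s0 = -1.1          # just on the cloud-free side of the threshold
sigma = 0.3        # assumed size of typical perturbations
tl = tangent_linear_slope(s0)
fit = best_fit_slope(s0, sigma)
print(tl, fit)
```

Just below the threshold the tangent-linear slope is exactly zero, so a tangent-linear model would predict no cloud response at all, while the finite-perturbation fit gives a positive slope that reflects the chance of crossing the threshold. This is one sense in which a perturbation forecast model "rounds" an IF-test.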



What spread to assume in regularisation?

• If guess=background, need to approximate whole of PDF$^f$

• In final outer-loop, only need to approximate PDF$^a$

Observation: $y^o = (x_1 + x_2)/2$


Relative scores 2003-5 + dates of 4D-Var implementation

RMS errors with mean intra-annual variability removed

[Figure: monthly relative scores from Oct-03 to Sep-05, on a scale from -30% to 40%, with a trend line and curves for UK, ECMWF, USA, France, Germany, Japan and Canada; dates of 4D-Var implementation are marked.]

Verification against own analysis: weighted mean of RMS error / Met Office RMS error


Unique Selling Points, for NWP

4D-Var

Implicitly uses a complete 4-dimensional PDF, with time-evolution as accurate as perturbation model. ⇒Can make good use of time-distributed high-density incomplete observations such as satellite soundings.

Ensemble Kalman Filter

Gives the best available sample of background errors, at low cost if short-period ensemble forecasts are needed anyway.

My current understanding of algorithms:

Ensemble Kalman Filter

“Localisation” does not correctly handle evolution. If the detail in observations can only be fitted because of localisation, then information in the evolution of the detail is not extracted.

4D-Var

3D-Var covariances are only evolved for the time-window (usually 6-12 hours: longer time windows are not fully tested and will be expensive). Longer-evolved “Errors of The Day” are not represented.


4D-Var or Ensemble KF?

• Ensemble KF is easier to build.

• I expect 4D-Var to be better at extracting dynamical information from a time-sequence of dense incomplete observations (satellite or radar). To date, 4D-Var has a better track record in good quality NWP systems.

• Ensembles are better at sampling the various sources of forecast error over time, to give an estimate of the PDF.

⇒ I am planning to implement a hybrid.

• Ultimately, the ability to cope with nonlinear effects (the attractor, spin-up, …) will be important.



Ensemble Transform Kalman Filter (ETKF)

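$$\mathbf{E} = \left(\mathbf{R}^{-1/2}\mathbf{H}\mathbf{X}^f\right)^T \mathbf{R}^{-1/2}\mathbf{H}\mathbf{X}^f$$

E is formed from the forecast ensemble perturbations in normalised observation space, $\mathbf{R}^{-1/2}\mathbf{H}\mathbf{X}^f$.

C is the matrix of eigenvectors of E with non-zero eigenvalues, and Γ is the diagonal matrix of those eigenvalues.

T is a transform matrix telling how to mix perturbations from different members:

$$\mathbf{T} = \mathbf{C}\left(\boldsymbol{\Gamma}+\mathbf{I}\right)^{-1/2}\mathbf{C}^T$$

Analysis perturbations are then

$$\mathbf{X}^a = \mathbf{X}^f \mathbf{T}$$

The size of the matrices manipulated is determined by the ensemble size k ⇒ very cheap.

Neill Bowler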
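The transform can be checked numerically on a small example. The sketch below builds random forecast perturbations, computes T from an eigen-decomposition of the k-by-k matrix E, and confirms that the resulting analysis perturbations reproduce the Kalman filter analysis covariance. One assumption to note: all k eigenvectors are kept here, including those with zero eigenvalue, for which the factor is one. The dimensions, the diagonal R and the random values are illustrative.

```python
import numpy as np


def etkf_transform(Xf, H, r_var):
    """Transform T such that Xa = Xf @ T has the Kalman filter analysis covariance.

    Xf: (n, k) forecast perturbations, scaled so that Pf = Xf @ Xf.T.
    r_var: (p,) observation error variances (diagonal R assumed).
    """
    S = (H @ Xf) / np.sqrt(r_var)[:, None]       # R^{-1/2} H X^f, shape (p, k)
    E = S.T @ S                                  # (k, k): cost set by ensemble size
    gamma, C = np.linalg.eigh(E)
    gamma = np.clip(gamma, 0.0, None)            # remove tiny negative round-off
    return (C * (gamma + 1.0) ** -0.5) @ C.T     # C (Gamma + I)^{-1/2} C^T


rng = np.random.default_rng(1)
n, k, p = 4, 3, 2                                # state, ensemble, observation sizes
members = rng.standard_normal((n, k))
Xf = (members - members.mean(axis=1, keepdims=True)) / np.sqrt(k - 1)
H = rng.standard_normal((p, n))
r_var = np.array([0.5, 2.0])

T = etkf_transform(Xf, H, r_var)
Xa = Xf @ T

# Reference: Kalman filter analysis covariance with Pf taken from the ensemble.
Pf = Xf @ Xf.T
K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + np.diag(r_var))
Pa_kf = (np.eye(n) - K @ H) @ Pf
print(np.max(np.abs(Xa @ Xa.T - Pa_kf)))
```

Only a k-by-k symmetric eigenproblem is solved, which is why the method is cheap for the ensemble sizes quoted earlier. Because this T is symmetric and leaves the vector of ones unchanged, the analysis perturbations still sum to zero across members.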


Ensemble Transform Kalman Filter (ETKF)

[Schematic: for each member, (T+12 perturbed forecast - T+12 ensemble mean forecast) gives a perturbation; the transform matrix mixes the perturbations, e.g. 0.9 Pert 1 - 0.1 Pert 2 - 0.1 Pert 3 - 0.1 Pert 4 - 0.1 Pert 5; the result is added to the 4D-Var analysis to give the perturbed analysis.]

Neill Bowler


Local ETKF

• Calculate transform matrix using observations local to a limited set of points, approximately evenly distributed around globe

• Interpolate transform matrix to intermediate grid points

• Use observations within 5000 km of each localisation centre

• Overlap ensures some consistency between transform matrix found for neighbouring centres, as does spherical simplex form of ETKF

Neill Bowler


Ensemble Ranked Probability Skill Score – Z500 over NH

[Figure: Ranked Probability Skill Score (0 to 1) against forecast step (0 to 16 days) for the ECMWF, Met Office, NCEP and JMA ensembles; 10 categories, cases 20070328-20070528, N. Hemisphere, Z at 500hPa.]

Tim Palmer, ECMWF


Ways to use Ensembles in VAR, reducing sampling error

1. Time average, to give mean covariances (Fisher ECMWF)

2. Use smoothed Errors Of The Day variances

3. Use EOTD scales, smoothed locally in wavelet covariances (Pannekoucke, Météo-France)

4. Use EOTD modes, localised using Schur product, in VAR α-control variable (Barker & Lorenc, Met Office & NCAR)


Possible future system using Ensemble Perturbations

Assume traditional transform or ensemble perturbations both model covariances:

⇒ traditional (v) & new (vα) control variables can both represent most perturbations;

⇒ Use both appropriately weighted:


Non-linearities

• 4D-Var and the EnKF extend the Kalman filter to weakly nonlinear models. (PDFs should stay nearly Gaussian.)

• The atmosphere is chaotic, with a wide range of interacting scales and an attractor:

⇒ The Butterfly Effect;

⇒ Initialisation & “spin-up” problems.


Error growth v scale

Growth of errors initially confined to smallest scales, according to a theoretical model (Lorenz 1984). Horizontal scales are on the bottom, and the upper curve is the full atmospheric motion spectrum (from Tribbia & Baumhefner 2004).


Statistical 4D-Var for a “butterfly” model with chaotic small scales

• Full model should predict the best estimate (which tends to climatology by damping uncertain scales)

• PF model should not give large linear growths (needs to filter chaotic scales)

$$I = E\left\{ \left\| M(\mathbf{x}+\delta\mathbf{x}) - M(\mathbf{x}) - \tilde{\mathbf{M}}\,\delta\mathbf{x} \right\| \right\}$$


DA for a chaotic model

• Methods like 4D-Var and the extended Kalman filter, which use a linearised chaotic model, need everything to be “observed”: they cannot cope with a chaotic assimilation best estimate.

• Not a problem for a good, synoptic-scale atmospheric model & normal observation network.

• Becomes a problem as resolution is increased: “butterfly-scales” are not observed.

• Incremental 4D-Var avoids problem by filtering.

• Not a problem for ensemble KF (but the ensemble mean is similarly filtered)



Effect of the “attractor”

• Differences in detail between largely similar states can make some much less probable.

• Can use balance relationships to help identify likely states, but this is imperfect. Humans recognise coherent structures like inversions, fronts, cyclones & convection.

• The effect is significant if the PDF mean (i.e. the best estimate) is not actually a likely state.

• Important for diagnostic fluxes, and moisture budgets.


Frequency of layer cloud in 11304 UK observations & global model forecasts


Tephigram of sounding (black), global model background (blue) and analysis (red), for the mean of 136 UK soundings with layer cloud top diagnosed at level 5 in the background.




Mean b-o, classified by cloud top in b


Tephigram of sounding (black), global model background (blue) and analysis (red), for the mean of 140 soundings with stratocumulus tops diagnosed at level 5 in the sonde.



Tephigram of sounding, global model background and analysis, for the mean of 140 UK soundings with layer cloud top diagnosed at level 5 in the observed soundings.


Current Met Office Research Projects which might help

• Improve vertical resolution

• Improve linear Gaussian aspects:

  • Vertical coordinate transform related to potential temperature

  • EOTD modes sampled from ensemble

  • Improved PF parametrization of boundary layer

• Addressing nonlinear problem:

  • Long-window outer-loop 4D-Var, adding a constraint that only states which the model can generate/retain are acceptable

  • Nonlinear “Holm” humidity transform to address non-Gaussianity near 100%

• Other “fixes”:

  • When satellite observations detect layer cloud, generate “bogus” observations specifying inversion structure

  • Continue and improve “nudging” direct insertion


Allowing for the attractor

• Theoretical DA methods can be devised, under the assumption that the model’s attractor matches the atmosphere’s.

• If the model does not regularly predict it, the DA cannot analyse it. Very difficult to correct for model errors.

• Impossible to retain the computational efficiency of 4D-Var or EnKF.

• Can engineer aspects of design to help, e.g.

  • Nonlinear normal-mode initialisation.

  • Incremental 4D-Var with digital-filter Jc & outer-loop long enough to “spin up” fine-scale adjustments to inner-loop increments.


Extra Problems for Convective-Scale DA

• Less experience with high-density observations

• More nonlinear

• Little diagnostic balance

• Expensive model & quick delivery needed

• >2km grid cannot resolve most convection

• Wide range of scales all significant

• Lateral boundary conditions important

• Downscaling of synoptic-scale often useful



Radar Network Coverage – 2007

1km resolution

2km

5km


Comparison of synoptic-scale & convective-scale ensembles

Atmospheric Predictability at Synoptic Versus Cloud-Resolving Scales. Cathy Hohenegger and Christoph Schär. BAMS 2007


Boscastle flood. 16 August 2004




Radar data animation. 16 August 2004


Better resolution: effect on the Boscastle forecast

Forecast rainfall accumulations for 1200-1800UTC 16/8/2004

[Figure panels: 60km forecast from 00UTC; 12km forecast from 00UTC; 4km forecast from 00UTC; 1km forecast from 00UTC; 5km radar actual.]


Summary

• Advanced DA methods

• Use accurate model to carry information

• Allow for forecast and observation errors

• Use a model to predict forecast errors.

• Gaussian assumption gives efficient methods for very large models (e.g. 4D-Var & EnKF); essential in practice. Technical issues are important in design.

• Nonlinearities break the Gaussian assumption:

• 4D-Var & EnKF cope with weakly nonlinear models

• Butterfly effect is avoided by filtering

• Attractor effects can be addressed (sub-optimally).

• Convective-scale NWP is becoming affordable and poses revised challenges. Expect to see as much progress in next 30 years as synoptic-scale made in last!


Questions and discussion
