Hybrid Methods Daryl Kleist 1 JCSDA Summer Colloquium on Data Assimilation Fort Collins, CO 27 July...

Daryl Kleist

1

JCSDA Summer Colloquium on Data AssimilationFort Collins, CO

27 July – 7 August 2015

Univ. of Maryland-College Park, Dept. of Atmos. & Oceanic Science

Thanks to Kayo Ide (UMD), Jeff Whitaker (NOAA/ESRL), David Parrish (NOAA/NCEP), Steve Penny (UMD), and Matt Kretschmer (UMD) for

inspiration, collaboration, and several slides.

Outline

2

I. Motivation for HybridII. Covariance Hybrid and EnVar Formulation

I. Single ObservationsII. Toy Model ResultsIII. Extension to 4D

III. Real World Application and Examples from NCEP GFSIV. Hybrid Alternatives

I. Gain-HybridII. Ensemble-Based Hybrid (CaLETKF)

V. Summary

Perspectives of Data Assimilation

• Two main perspectives of practical data assimilation & hybrid approach• What are some of the pros/cons for the ML and MV approaches?

3

Variational Approach: Least square estimation[maximum likelihood]– Variational

Sequential (KF) Approach: Minimum Variance estimate[least uncertainty]

– EnKF (or RRKF)

p(x)p(x)

3DVar vs EnKFSingle Observation GSI Example

4Single 850mb Tv observation (1K O-F, 1K error)

xl position

3DVAR

EnKF

B determines the quality of Δxa

5

Kalman Filter from Var Perspective

Forecast Step

Analysis

• Recast the problem in terms of variational framework (cost function)

E/En KF

• BKF: Time evolving background error covariance• AKF: Inverse Hessian of JKF(x’)

Use of Be in Var

• Or, substitute ensemble estimate of error covariance instead

• This is in the full physical space, which we can work around by introducing a new control variable:

• Where is the local weight for the ensemble members• L is the localization on the extended control variable• xe are the ensemble perturbations that represent Be (as in EnKF)

Be and Bc

Hybridization

• We have already demonstrated that Be is powerful for providing flow-dependent estimates of the background error covariance (and multivariate correlations)– However, suffers from severe rank deficiency

• Alternatively, Bc is full-rank (in the space of the entire model state)– However, typically taken to be static in time, derived from climatological (and usually averaged in

time) statistics– Does not often represent multivariate correlations well (i.e. linking humidity to wind)

• So, why not try to combine them (Hamill and Snyder, 2000)

Single ObservationHamill and Snyder

8100% Bc 90% Be ,10% Bc

Toy Model Demo (L96) 3DVAR Result, 50% obs

9

• Left shows “NMC” derived Bc , Right is a snapshot of truth, background, analysis, and observations at cycle 500 for 3DVAR configuration

• 3DVAR does well in observed regions, but struggles in unobserved do to Bc

Toy Model Demo (L96) ETKF Result, 50% obs,M=20

10

• Left shows snapshot of ETKF derived Be , Right is a snapshot of truth, background, analysis, and observations at cycle 500 for

LETKF configuration• ETKF does well everywhere, time evolving B

11

• Left shows snapshot of Bh , Right is a snapshot of truth, background, analysis, and observations at cycle 500 for hybrid

configuration• Hybrid does much better than 3DVAR, comparable to LETKF

Toy Model Demo (L96) Hybrid Result, 50% obs,M=20

12

0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9

3DVAR 12.08

Hybrid 3.321 3.764 4.074 4.633 5.060 5.770 7.044 8.218 9.595

ETKF 3.871

Analysis RMSE (x10) over 1800 cases

3DVAR

= 0.3

= 0.7

ETKF

– Hybrid (small beta) as good as ETKF

– Hybrid (larger beta) in between 3DVAR and ETKF

– Small beta hybrid better than LETKF

Toy Model Demo (L96) Hybrid Result, 50% obs, M=20

13

Analysis RMSE (x10) over 1800 cases

3DVAR

= 0.3

= 0.7

ETKF

LETKF

– Hybrid can help mitigate small ensemble size (like localization)

– Not shown: Hybrid with localization would be even better yet

Toy Model Demo (L96) Hybrid Result, 50% obs, M=10

0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9

3DVAR 12.08

Hybrid 23.65 10.64 7.077 7.270 7.930 7.434 7.426 8.826 9.809

ETKF 42.37

LETKF 3.481

Hybrid EnVar

14

• Starting from the EnVar cost function, how could we combine the static and ensemble components?

• Solution: Add a second background term (one for ensemble, and one for static). Here, we’ll drop the k subscript to help differentiate between climatological (c) and ensemble (e) contributions

Hybrid EnVarLorenc (2003), Buehner (2005), Wang et al.(2007)

c & e: weighting coefficients for clim. (var) and ensemble covariance respectively

xt’: (total increment) sum of increment from fixed/static B (xc’) and ensemble B

ak: extended control variable; :ensemble perturbations

- analogous to the weights in the LETKF formulation

L: correlation matrix [effectively the localization of ensemble perturbations]

Preconditioning Sidebar

16

For the double Conjugate Gradient (GSI default), inverses of B and L not need and the solution is pre-conditioned by full B.

This formulation differs from the UKMO and Canadians, who use a square root formulation. Also, the weights can be applied to the increments themselves:

Hybrid

f-1=0.5

Single Temperature Observation

17f

-1=0.0

EnVar

3DVAR

f-1=0.5

Single Observation TC Example

Single 850mb zonal wind observation (3 m/s O-F, 1m/s error) in Hurricane Ike circulation

HybridEnVar

3DVAR

19

Why Hybrid?

VAR (3D, 4D)

EnKF Hybrid References

Benefit from use of flow dependent ensemble covariance instead of static B

x x Hamill and Snyder 2000; Wang et al. 2007b,2008ab, 2009b, Wang 2011; Buehner et al. 2010ab

Robust for small ensemble x Wang et al. 2007b, 2009b; Buehner et al. 2010b

Better localization (physical space) for integrated measure, e.g. satellite radiance

* x Campbell et al. 2009

Easy framework to add various constraints

x x Kleist and Ide 2015

Use of various existing capabilities in VAR

x x Kleist and Ide 2015

So what’s the catch?

20

• Most configurations of hybrid DA systems require the development and maintenance of two DA systems• EnKF + Var

• Still need to deal with localization and other sampling-related issues (though somewhat mitigated by use of full rank Bc)

• Even more parameters to explore• Trade off between ensemble size, resolution, hybrid

weights, etc.

21

NCEP operational 3D EnVar Hybrid

• Cycling 80 member EnKF (T574 ~40km) provides an ensemble-based estimate of Jb term in 3DVar.

• 3DVar with ensemble Jb updates a T1534 (~13 km) control forecast. The EnKF analysis ensemble is re-centered around the high-res analysis.

• A combination of multiplicative inflation and stochastic physics is used to represent missing sources of uncertainty in the EnKF ensemble.

Operational Configuration

22

• Full B preconditioned double conjugate gradient minimization

• Spectral filter for horizontal part of L, level-dependent decorrelation distances

• Recursive filter used for vertical• 0.5 scale heights

• Same localization used in Hybrid (L) and EnSRF• Applied using GC compact

functions

• TLNMC (Kleist et al. 2009) applied to total analysis increment*

22

23

EnKFmember update

member 2 analysis

high resforecast

high resanalysis

member 1 analysis

member 2 forecast

member 1 forecast

recenter analysis ensemble

Dual-Res Coupled HybridVar/EnKF Cycling

member N forecast

member N analysis

T574

L64

T153

4L64

Generate new ensemble perturbations given the

latest set of observations and first-guess ensemble

Ensemble contribution to background error

covariance

Replace the EnKF ensemble mean analysis

and inflate

Previous Cycle Current Update Cycle

GSI Hybrid EnVar

Hybrid Impact in Pre-implementation Tests

24

Figure 01: Percent change in root mean square error from the experimental GFS minus the operational GFS for the period covering 01 February 2012 through 15 May 2012 in the northern hemisphere (green), southern hemisphere (blue), and tropics (red) for selected variables as a function of forecast lead time. The forecast variables include 1000 hPa geopotential height (a, b), 500 hPa geopotential eight (c, d), 200 hPa vector wind (e, f, h), and 850 vector wind (g). All verification is performed using self-analysis. The error bars represent the 95% confidence threshold for a significance test.

Hybrid Impact in Pre-implementation Tests

25

Figure 02: Mean tropical cyclone track errors (nautical miles) covering the 2010 and 2011 hurricane seasons for the operational GFS (black) and experimental GFS including hybrid data assimilation (red) for the a) Atlantic basin, b) eastern Pacific basin, and c) western Pacific basin. The number of cases is specified by the blue numbers along the abscissa. Error bars indicate the 5th and 95th percentiles of a resampled block bootstrap distribution.

26

4D Ensemble Var (Liu et al, 2008)GSI - Hybrid 4D-EnVar

Wang and Lei (2014); Kleist and Ide (2015)The Hybrid EnVar cost function can be easily extended to 4D and include a

static contribution (ignore preconditioning)

Where the 4D increment is prescribed through linear combinations of the 4D ensemble perturbations plus static contribution, i.e. it is not itself a model trajectory

Here, static contribution is time invariant. C represents TLNMC balance operator. No TL/AD in Jo term (M and MT)

Jo term divided into observation “bins” as in 4DVAR

Ensemble-Var methods: nomenclatureLorenc (2013)

• En-4DVar: Propagate ensemble Pb from one assimilation window to the next (updated using EnKF for example), replace static Pb with ensemble estimate of Pb at start of 4DVar window, Pb propagated with tangent linear model within window.

• 4D-EnVar: Pb at every time in the assimilation window comes from ensemble estimate (TLM no longer used).

• As above, with hybrid in name: Pb is a linear combination of static and ensemble components.

• 3D-EnVar: same as 4D ensemble Var, but Pb is assumed to be constant through the assimilation window (current NCEP implementation). 27

28

GSI – Hybrid En-4DVarWang and Lei (2014); Kleist and Ide (2015)

The traditional 4DVar cost function can be manipulated to use an ensemble to help prescribe the error covariance at the beginning of the window

Here, the hybrid error covariance is applied at the beginning of the window, and the TL/AD propagate within observation window (M and MT) in Jo term

Jo term divided into observation “bins” as in 4DVAR

4DVAR

29

4D analysis increment is a trajectory of the PF model.

Lorenc & Payne 2007

4D EnVar

30

4D analysis is a (localised) linear combination of nonlinear trajectories. It is not itself a trajectory.

Courtesy: Andrew Lorenc

In the alpha control variable method one uses the ensemble perturbations to estimate Pb only at the start of the 4DVar assimilation window: the evolution of Pb inside the window is due to the tangent linear dynamics (Pb(t) ≈ MPbMT)

In 4D-EnVar Pb is sampled from ensemble trajectories throughout the assimilation window (nonlinear dynamics):

from: D. Barker

4D Hybrids

Single Observation (-3h) Examplefor 4D Variants

32

4DVAR

H-4DVAR_ADbf

-1=0.25H-4DEnVar

bf-1=0.25

4DEnVar

Time Evolution of Increment

33

t=-3h

t=0h

t=+3h

H-4DVAR_AD H-4DEnVar

Solution at beginning of window same to within round-off (because observation is taken at that time, and same weighting parameters used)

Evolution of increment qualitatively similar between dynamic and ensemble specification

4D Hybrid Summary

• 4D EnVar analysis is localized, linear combination of ensemble perturbations (similar to EnFK/LETKF)

• Traditional 4DVar (and hybrid 4DVars) requires sequential (and repeated) runs of the TL/AD. Ensemble trajectories can be pre-computed in parallel (but stored: IO/memory)

• Developing/maintaining TL/AD is demanding

• Still unsolved issues: Is ensemble sampling of nonlinear dynamics better than TL evolution? Other ensemble-related issues in EnVar…

34

4D Hybrid at Major NWP Centers

• Hybrid 4D EnVar– Implemented at CMC (Canada)

• Replaced 4DVAR– To be implemented at NCEP (early 2016)

• Hybrid En-4DVAR (Operational or in Testing)– UKMO– ECMWF*– Meteo-France*– US Navgem– JMA

35

(Planned) Implementation of 4D EnVarat NCEP

36

• Hybrid 4D EnVar to become operational for GFS/GDAS by January 2016 (tentative)– Tests at low resolution helped design configuration (results not show,

publication in prep)– Real time and retrospectives already underway, operational package quasi-

frozen

• Package Configuration– T1534 deterministic GFS with 80 member T574L46 ensemble with fully coupled

(two-way) EnKF update (87.5% ensemble & 12.5% static), same localization as current operations

– Incremental normal mode initialization (TLNMC) on total increment– Multiplicative inflation and stochastic physics for EnKF perturbations– Full field digital filter– All-sky radiance assimilation, aircraft temperature bias correction– Minor model changes

Full Resolution (T1534/T574) Trials:500 hPa AC

500 hPa AC for the Operational GFS (Black, 3D Hybrid) and Test 4D configuration (Red) for the period covering 02-01-2015 through 04-29-2015.

Full Resolution (T1534/T574) Trials:Summary Scorecard

(02-01 through 04-29 2015)

Note on Hybrid at ECMWF and Meteo-France

• ECMWF and Meteo-France utilized ensembles of data assimilation to estimate background errors

• They do not employ the extended control variable methods, but instead look to prescribe aspects of the B from the flow-dependent ensemble– This includes significant efforts on filtering of raw statistics since they

use a 25 member ensemble (next slide).– Filtered Variances Used since 2012– Filtered Correlations (in their wavelet Jb) used since 2013

39

Course 2014 - EDA

Raw EDA StDev Vorticity 500 hPa

From Bonavita

Filtered EDA StDev Vorticity 500 hPa c

ECMWF EDA

Flow dependent Jb:Correlations Lengths of vorticity errors, ~500mb

Static Wavelet Jb

Online Wavelet Jb:2012011000

Courtesy ECMWF

Alternate Implementation:“Gain Hybrid”

42

Ensemble

Climo/Static

• Combining in a specific way to arrive at:

• Which is the basis for the Hybrid-LETKF of Penny (2013)

LETKF

3DVAR

HYBRID

L96 ResultsOb Count v Ens Size

HYBRID (0.2)LETKF

• Hybrid really helps for small ensembles, comparable in skill (slightly worse) for regimes that are well observed and have very large ensemble sizes From Penny (2014)

Hybrid Gain Testing at ECMWF

44

NHem

Trop

SHem

EnKF4DVar Hybrid

Ensemble-Based HybridsKretschmer et al. (2015)

• Climatologically Augmented LETKF (caLETKF)– Supplement dynamic ensemble with orthogonal eigenvectors derived

from Bc (i.e. they do not require model integration!)

45

Summary• Hybrid methods attempt to combine the best aspects of

variational and ensemble based DA solvers– They have shown to be robust for small ensemble sizes and in the

presence of model error– One draw back is that it requires the maintenance of two DA schemes

• There has been some work on “filter-free” or more cost effective alternatives

• EnVar schemes have a 4D extension that does not require the TL/AD– Though, while competitive, it has yet to be shown that hybrid 4D

EnVar can truly be better than hybrid En-4DVAR (yet).

• There are alternate hybrid formulations out there such as the hybrid gain and caLETKF

46

Summary (cont.)• Hybrid methods have become a popular scheme for NWP, and

are now being extended to other earth system models– How will they fare for coupled DA schemes?

• These methods are still fairly “new”, and there is still a lot of work that can be done to improve them– Estimation of weighting parameters/localization, building solvers that

can update the ensemble and mean simultaneously, etc.

47

Sampling of References• Buehner, M., 2005: Ensemble-derived stationary and flow-dependent background-error covariances: Evaluation in a quasi-

operational setting. Quart. J. Roy. Meteor. Soc., 131, 1013-1043.

• Buehner, M., J. Morneau, and C. Charette, 2013: Four-dimensional ensemble-variational data assimilation for global deterministic weather prediction. Nonlin. Processes Geophys., 20, 669-682, doi:10.5194/npg-20-669-2013, 2013.

• Clayton A. M, A. C. Lorenc and D. M. Barker, 2013: Operational implementation of a hybrid ensemble/4D-Var global data assimilation system at the Met Office. Quart. J. Roy. Meteor. Soc., 139, 1445-1461.

• Hamill, T. H., and C. Snyder., 2000. A Hybrid Ensemble Kalman Filter-3D Variational Analysis Scheme. Mon. Wea. Rev. 128, 2905–2919.

• Kleist, D. T., and K. Ide, 2015: An OSSE-based Evaluation of Hybrid Variational-Ensemble Data Assimilation for the NCEP GFS, Part I: System Description and 3D-Hybrid Results, Mon. Wea. Rev.,143, 433-451.

• Kleist, D. T., and K. Ide, 2015: An OSSE-based Evaluation of Hybrid Variational-Ensemble Data Assimilation for the NCEP GFS, Part II: 4D EnVar and Hybrid Variants, Mon. Wea. Rev., 143, 452-470.

• Kretschmer, M., B. R. Hunt, and E. Ott, 2015: Data assimilation using climatologically augmented local ensemble transform Kalman Filter, Tellus, 67, 9 pp.

• Liu C., Q. Xiao, and B. Wang, 2008: An ensemble-based four dimensional variational data assimilation scheme. Part I: technique formulation and preliminary test. Mon. Wea. Rev., 136, 3363-3373.

• Lorenc, A. C., 2003: The potential of the ensemble Kalman filter for NWP – a comparison with 4D-VAR. Quart. J. Roy. Meteor. Soc., 129, 3183-3203.

• Lorenc, A.C. 2013: Recommended nomenclature for EnVar data assimilation methods. Research Activities in Atmospheric and Oceanic Modeling. WGNE, 2pp, URL: http://www.wcrp-climate.org/WGNE/BlueBook/2013/individual-articles/01_Lorenc_Andrew_EnVar_nomenclature.pdf )

• Penny, S., 2014: A hybrid local ensemble transform Kalman filter. Mon. Wea. Rev. 142, 2139-2149.. 48

http://www.wcrp-climate.org/WGNE/BlueBook/2013/individual-articles/01_Lorenc_Andrew_EnVar_nomenclature.pdf

Sampling of References• Wang, X., C. Snyder, and T. M. Hamill, 2007a: On the theoretical equivalence of differently proposed ensemble/3D-Var

hybrid analysis schemes. Mon. Wea. Rev., 135, 222-227.

• Wang, X., 2010: Incorporating ensemble covariance in the Gridpoint Statistical Interpolation (GSI) variational minimization: a mathematical framework. Mon. Wea. Rev., 138, 2990-2995.

• Zhang F., M. Zhang, and J. Poterjoy, 2013: E3DVar: Coupling an ensemble Kalman filter with three-dimensional variational data assimilation in a limited area weather prediction model and comparison to E4DVar. Mon. Wea. Rev., 141, 900–917.

• Zhang, M. and F. Zhang, 2012: E4DVar: Coupling an ensemble Kalman filter with four-dimensional variational data assimilation in a limited-area weather prediction model. Mon. Wea. Rev., 140, 587–600.

49

Hybrid Methods Daryl Kleist 1 JCSDA Summer Colloquium on Data Assimilation Fort Collins, CO 27 July...

Documents

Transcript of Hybrid Methods Daryl Kleist 1 JCSDA Summer Colloquium on Data Assimilation Fort Collins, CO 27 July...