Parameter estimation using EnKF


Juan Ruiz, in collaboration with Manuel Pulido and Takemasa Miyoshi.


Centro de Investigaciones del Mar y la Atmósfera, CONICET-UBA

Department of Atmospheric and Oceanographic Sciences, University of Buenos Aires

Numerical models have a dynamic core and parameterizations of the "physics".

Both components contain parameters that affect model performance.

Most of these parameters arise from the formulation of the numerical schemes and from the simplifying assumptions made in the parameterizations. They are therefore intrinsically unknown and have to be tuned to obtain good performance.

$$\frac{dx}{dt} = \underbrace{f(x, t, p_1)}_{\text{Dynamic core}} + \underbrace{\mathrm{par}(x, t, p_2)}_{\text{Parameterizations (model physics)}}$$

Apart from the state variable vector (x), we can define a parameter vector (p1, p2) which contains the values of the different model parameters.

Applications of parameter estimation in numerical models.

• To optimize model performance in operational weather forecasting. Optimal parameters should depend on the region, season, etc. Flexible model optimization. Fast optimization to take advantage of frequent model updates.

• Estimate parameter uncertainty in order to introduce a better representation of model error associated with the uncertain parameters in the forecast (i.e. ensemble forecasts with perturbations in the parameters, stochastic parameterizations, etc).

• Optimization of climate models for an improved representation of the current climatology. Climate is independent of initial conditions, so parameters play an important role.

• Climate change detection and attribution. (A workshop on this topic will be held in Buenos Aires between October 15th-18th)

Efficient methods for parameter estimation

A state-of-the-art numerical model of the atmosphere or the ocean can have a large number of parameters. The number is substantially larger if we consider parameters that can have 2D structures (for example surface fluxes, Kang 2009, Kang et al. 2011).

Traditional optimization techniques require several model evaluations to describe the model sensitivity to the parameters and to optimize their values. They are too expensive for optimizing time and space varying parameters (particularly when the parameterizations are fully coupled with the model).

Changes in physical parameterizations, model resolution or the model dynamic core require the estimation of a new set of parameters, so an efficient method of parameter estimation could be a very useful tool.

Data assimilation methods can provide an efficient way to estimate these model parameters, taking into account their temporal and spatial dependence.

Data assimilation and parameter estimation:

Simultaneous state and parameter estimation using the information provided by the observations. (Kang 2009, Fertig et al. 2009, Miyoshi 2005, Koyama and Watanabe 2010, among many others)

$$x'^{a} = x'^{b} + K^{g}\left(y^{o} - H(x'^{b})\right)$$

$$K^{g} = P^{b} H^{T}\left(H P^{b} H^{T} + R\right)^{-1}$$

• Pb contains the covariances between the state variables and the parameters. The parameters usually are not directly observed.

• Covariances between state variables and parameters are usually strongly flow dependent. Parameter estimation should be based on DA methods that take into account the state dependence of the augmented state error covariance matrix (EnKF, 4DVAR or PF).

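State vector:

$$X = \begin{bmatrix} X_1 \\ \vdots \\ X_i \\ \vdots \\ X_N \end{bmatrix}$$

Augmented state vector:

$$X' = \begin{bmatrix} X_1 \\ \vdots \\ X_N \\ p_1 \\ \vdots \\ p_M \end{bmatrix}$$

Of the two equations given before the bullet points above, the first is the update of the augmented state and the second is the Kalman gain for the augmented state.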
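The augmented-state update can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: a single unobserved parameter p, a toy state x that responds linearly to p, a perturbed-observation EnKF, and arbitrary numerical values. It is not the SPEEDY/LETKF configuration used in the talk; it only shows how the state-parameter covariance in the background covariance matrix transfers information from an observation of x to the unobserved p.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 200                       # ensemble size (illustrative)
p_true = 2.0                  # hypothetical "true" parameter
obs_err = 0.05                # observation error standard deviation

# Background ensemble: the parameter is never observed directly.
p_b = rng.normal(1.0, 0.5, K)                  # parameter members
x_b = 3.0 * p_b + rng.normal(0.0, 0.1, K)      # toy state responding to p

X_aug = np.vstack([x_b, p_b])                  # augmented ensemble, shape (2, K)
H = np.array([[1.0, 0.0]])                     # only x is observed
R = np.array([[obs_err**2]])
y_o = np.array([3.0 * p_true])                 # observation of x from the "truth"

# Sample covariance of the augmented state (includes cov(x, p)).
A = X_aug - X_aug.mean(axis=1, keepdims=True)
P_b = A @ A.T / (K - 1)

# Kalman gain for the augmented state.
gain = P_b @ H.T @ np.linalg.inv(H @ P_b @ H.T + R)

# Perturbed-observation update of every member.
y_pert = y_o[:, None] + rng.normal(0.0, obs_err, (1, K))
X_a = X_aug + gain @ (y_pert - H @ X_aug)
p_a = X_a[1]

print("background parameter mean:", p_b.mean())
print("analysis parameter mean:  ", p_a.mean())
```

The second row of the gain is nonzero only because the ensemble carries a covariance between x and p, which is why the parameter moves toward the value used to generate the observation.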

Convective scheme parameters optimization for the SPEEDY model using LETKF

The model has to be only slightly modified to include the parameters as an input. Adding the parameters to an existing EnKF is relatively simple and, depending on the localization strategy, the estimation of the parameters can be computationally very cheap.

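LETKF equations for the simultaneous estimation of global parameters.

$$x^{b}_{k} = M\left(x'^{a,(n-1)}_{k}\right) = M\left(x^{a,(n-1)}_{k},\, p^{a,(n-1)}_{k}\right)$$

$$X^{b} = \left[\, x^{b}_{1} - \bar{x}^{b} \;|\; \dots \;|\; x^{b}_{K} - \bar{x}^{b} \,\right]$$

$$y^{b}_{i} = H(x^{b}_{i}), \qquad Y^{b} = \left[\, y^{b}_{1} - \bar{y}^{b} \;|\; \dots \;|\; y^{b}_{K} - \bar{y}^{b} \,\right]$$

$$\tilde{P}^{a} = \left[(K-1) I + Y^{bT} R^{-1} Y^{b}\right]^{-1}$$

$$W^{a} = \left[(K-1)\tilde{P}^{a}\right]^{1/2}, \qquad w^{a} = \tilde{P}^{a} Y^{bT} R^{-1}\left(y^{o} - \bar{y}^{b}\right)$$

$$\bar{x}^{a} = \bar{x}^{b} + X^{b} w^{a}, \qquad X^{a} = X^{b} W^{a}$$

$$\bar{p}^{a} = \bar{p}^{b} + p^{b} w^{a}_{p}, \qquad p^{a} = p^{b} W^{a}_{p}$$

Here K is the ensemble size, overbars denote ensemble means, and in the last line p^b and p^a multiplying the weights are the parameter ensemble perturbations, with w^a_p and W^a_p the weights applied to the parameters.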

Miyoshi (2005), Fertig et al. (2009), Kang (2009), Kang et al (2011)
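The equations on this slide can be sketched in NumPy for the case without localization. This is a hedged illustration, not the authors' code: the function name `letkf_global_update`, the linear observation operator, the symmetric square root computed with an eigendecomposition, and the use of the same weights for state and parameters are all assumptions. The toy data at the bottom are arbitrary.

```python
import numpy as np

def letkf_global_update(Xb, pb, yo, H, R):
    """Ensemble transform update without localization (illustrative).

    Xb: (n, K) background state ensemble
    pb: (m, K) background parameter ensemble
    yo: (q,) observations; H: (q, n) linear operator; R: (q, q)
    """
    K = Xb.shape[1]
    xb_mean = Xb.mean(axis=1, keepdims=True)
    pb_mean = pb.mean(axis=1, keepdims=True)
    Xb_pert = Xb - xb_mean                    # state perturbations X^b
    pb_pert = pb - pb_mean                    # parameter perturbations
    Yb = H @ Xb                               # y_i^b = H(x_i^b)
    yb_mean = Yb.mean(axis=1, keepdims=True)
    Yb_pert = Yb - yb_mean                    # Y^b
    Rinv = np.linalg.inv(R)

    # Analysis covariance in ensemble space.
    Pa_tilde = np.linalg.inv((K - 1) * np.eye(K) + Yb_pert.T @ Rinv @ Yb_pert)

    # Symmetric square root W^a = [(K-1) Pa_tilde]^(1/2).
    S = (K - 1) * Pa_tilde
    evals, evecs = np.linalg.eigh(0.5 * (S + S.T))
    Wa = evecs @ np.diag(np.sqrt(evals)) @ evecs.T

    # Mean weights w^a.
    wa = Pa_tilde @ Yb_pert.T @ Rinv @ (yo[:, None] - yb_mean)

    # Same weights applied to state and parameters (global case).
    Xa = xb_mean + Xb_pert @ wa + Xb_pert @ Wa
    pa = pb_mean + pb_pert @ wa + pb_pert @ Wa
    return Xa, pa

# Toy usage: one parameter, one observed state variable (hypothetical values).
rng = np.random.default_rng(1)
K = 20
pb = rng.normal(1.0, 0.5, (1, K))
Xb = 3.0 * pb + rng.normal(0.0, 0.1, (1, K))
H = np.array([[1.0]])
R = np.array([[0.05**2]])
yo = np.array([6.0])            # consistent with a "true" parameter of 2.0
Xa, pa = letkf_global_update(Xb, pb, yo, H, R)
print("parameter mean before/after:", pb.mean(), pa.mean())
```

Because the weights live in the K-dimensional ensemble space, updating the parameters only costs one extra small matrix product once the weights are available.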

SPEEDY convective scheme and associated parameters (Annan et al. 2005, Koyama and Watanabe 2010)

[Figure: convective scheme schematic. Labels: entrainment (ENTMAX); mass flux at cloud base (TRCNV, RHBL); detrainment at cloud top.]

Convective initiation is controlled by the vertical profile of the moist static energy and by the moisture in the PBL (RHBL).

TRCNV is the convective adjustment rate.

Two more parameters are associated with the convective scheme, but the model performance is nearly insensitive to their values.

The three most important parameters are RHBL, ENTMAX and TRCNV.

Model sensitivity to the convective scheme parameters

The success of parameter estimation depends on the sensitivity of the model to the estimated parameters. If the model is not sensitive to changes in these parameters, parameter estimation may fail.

Failure of EnKF and 4DVAR based parameter estimation may also occur if the model response to changes in the parameter is strongly non-linear.

• RMSE of the 6-hour forecast, normalized by the forecast error obtained with the standard parameters.

• The larger the response, the smaller the uncertainty in the estimation of the parameter.

[Figure panels: TRCNV, RHBL, ENTMAX.]

Impact of errors in the initial conditions and in the observations.

Even in the perfect model scenario, the optimal parameter value may change slightly in time due to errors in the initial conditions and in the observations. The amplitude of this error is inversely proportional to the model sensitivity to the parameter.

[Figure: optimal parameter value. Panels: TRCNV, RHBL, ENTMAX.]

Spatial distribution of the model sensitivity to the parameters.

[Figure panels: zonally averaged RMSE (temp); zonally averaged RMSE (V). Axes: latitude, vertical levels.]

Covariance between the parameter and the state variables

Example of the covariance between TRCNV and both the temperature at mid levels and the wind at low levels.

These are global parameters, so a parameter can be correlated with state variables that are far away from each other. The covariance structure is strongly flow dependent because convection is intermittent in space and time.

Because these global parameters can be correlated with variables that are far away from each other, no localization is applied in the estimation of the parameters.

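$$\tilde{P}^{a} = \left[(K-1) I + Y^{bT} R^{-1} Y^{b}\right]^{-1}$$

$$w^{a}_{p} = \tilde{P}^{a} Y^{bT} R^{-1}\left(y^{o} - \bar{y}^{b}\right), \qquad W^{a}_{p} = \left[(K-1)\tilde{P}^{a}\right]^{1/2}$$

$$\bar{p}^{a} = \bar{p}^{b} + p^{b} w^{a}_{p}, \qquad p^{a} = p^{b} W^{a}_{p}$$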

All the observations are then used to compute the weights for the parameters.

These weights are used only in the update of the parameters.

Localization strategy for global parameters Koyama and Watanabe (2010), Fertig et al. (2009), Aksoy et al. (2006), Hu et al. (2010), Miyoshi (2005).

Nature run. A three-month run with the SPEEDY model using the original set of parameters is used as the nature run. Observations are generated by taking the nature run values at every other grid point, at all vertical levels, every six hours.

Ensemble initialization. Initial conditions for the state variables are generated as a random selection of possible model states. The parameter ensemble is initialized as random perturbations around an initial value that is relatively far from the one used in the nature run.

Data assimilation. Observations are assimilated every six hours. The model used in the assimilation experiment can differ from the true model only in the values of the convective scheme parameters.
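The structure of such a twin experiment can be sketched with a toy problem. The model below (a damped scalar state forced through one parameter), the ensemble size, the error levels and the perturbed-observation filter are all hypothetical stand-ins chosen so the example runs in a fraction of a second. Only the workflow mirrors the description above: nature run with the true parameter, synthetic observations, parameter ensemble started far from the truth, and persistence of the parameters during the forecast.

```python
import numpy as np

rng = np.random.default_rng(42)
K, n_cycles, obs_err = 50, 60, 0.1
p_true = 2.0                                  # parameter used in the nature run

def step(x, p, n):
    """Toy model (hypothetical): damped state forced through parameter p."""
    return 0.5 * x + p * (1.5 + np.sin(0.3 * n))

x_t = 0.0                                     # nature run state
x_ens = rng.normal(0.0, 1.0, K)               # state ensemble
p_ens = rng.normal(1.0, 0.5, K)               # parameter ensemble, far from p_true
p_history = []

for n in range(n_cycles):
    # Nature run and synthetic observation of the state only.
    x_t = step(x_t, p_true, n)
    y_o = x_t + rng.normal(0.0, obs_err)

    # Forecast step: each member uses its own parameter (persistence for p).
    x_ens = step(x_ens, p_ens, n)

    # Augmented-state update; the observation operator selects x (row 0).
    Z = np.vstack([x_ens, p_ens])
    A = Z - Z.mean(axis=1, keepdims=True)
    P_b = A @ A.T / (K - 1)
    gain = P_b[:, 0] / (P_b[0, 0] + obs_err**2)
    y_pert = y_o + rng.normal(0.0, obs_err, K)
    Z = Z + np.outer(gain, y_pert - Z[0])
    x_ens, p_ens = Z[0], Z[1]
    p_history.append(p_ens.mean())

print("first and last parameter estimates:", p_history[0], p_history[-1])
```

No inflation is applied here, so the parameter spread shrinks at every cycle; in this short linear example that does not prevent convergence.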

Twin experiments with the SPEEDY model: Miyoshi (2005), Fertig et al. (2009), Kang (2009)

Parameter estimation: Time independent parameters

Convective scheme parameters are accurately estimated, and the spin-up time is around 15 days (including the spin-up of the initial conditions).

[Figure: estimated parameters as a function of time, with parameter uncertainty (gray shaded). Panels: RHBL, ENTMAX, TRCNV.]

Parameter estimation has a strong impact upon the analysis error. In this twin experiment, the analysis obtained with parameter estimation is as good as the one obtained with the perfect model.

Impact upon analysis RMSE

$$\mathrm{RMSE} = (x^{a} - x^{t})\, R^{-1}\, (x^{a} - x^{t})^{T}$$

[Figure: analysis RMSE as a function of time (cycles). Curves: imperfect parameters (no estimation), parameter estimation, perfect model.]

Ensemble forecast experiments: 20-member ensemble forecasts up to 15 days lead time.

The positive impact of parameter estimation is also present in the medium-range forecast. The strongest impact of parameter estimation comes through the improvement of the initial conditions.

[Figure: forecast RMSE. Curves: imperfect model, perfect model, parameter estimation, perfect parameters for the forecast.]

Ensemble forecast experiments: spread-error correlation.

[Figure curves: perfect model, estimated parameters, imperfect model (no parameter estimation), perfect parameters for the forecast.]

Parameter estimation reduces model error and has a positive impact on the spread-error relationship.

Parameter estimation: Time dependent parameters

Kang et al. (2012), Koyama and Watanabe (2010).

Time dependent parameters are included in the nature run.

[Figure panels: RHBL, ENTMAX, TRCNV.]

Time dependent parameters are well estimated. However, a small lag is present in the estimated parameters. This lag is due to the persistence model used for the parameters in the estimation.

Parameter uncertainty

Parameter uncertainty is not known a priori. As the parameters are assumed to be constant during the model integration, the spread of the parameter ensemble does not grow during the forecast, while it is reduced at every assimilation step. As a result, the parameter spread can approach zero after a few assimilation cycles, after which the estimated value remains almost constant in time. When no localization is used the problem is even worse: because of the large number of observations assimilated, the spread of the analysis can be some orders of magnitude smaller than the spread of the guess.

Fixed parameter spread

Kang (2009), Aksoy et al. (2006), Tong and Xue (2008).

After the assimilation, the spread of the parameter is inflated back to a value defined a priori.

[Figure panels: adequate parameter spread; parameter spread too small.]

The parameter spread has to be tuned manually, which can be computationally expensive if the number of estimated parameters is large.
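A minimal sketch of the fixed-spread idea, assuming the spread is measured by the ensemble standard deviation and restored by rescaling the perturbations around an unchanged mean. The function name `reset_parameter_spread` and the numbers are hypothetical; the cited studies may differ in detail.

```python
import numpy as np

def reset_parameter_spread(p_a, target_std):
    """Rescale analysis parameter perturbations to a prescribed spread.

    p_a: (K,) or (m, K) analysis parameter ensemble (members on the last axis)
    target_std: a-priori spread, scalar or array broadcastable to (m, 1)
    """
    p_a = np.asarray(p_a, dtype=float)
    mean = p_a.mean(axis=-1, keepdims=True)
    pert = p_a - mean
    current_std = pert.std(axis=-1, ddof=1, keepdims=True)
    # The ensemble mean (the estimate) is untouched; only the spread changes.
    return mean + pert * (target_std / current_std)

# Usage with a collapsed ensemble (illustrative values).
p_a = np.array([0.499, 0.500, 0.501, 0.500])
p_inflated = reset_parameter_spread(p_a, target_std=0.05)
print(p_inflated.std(ddof=1))
```

The target spread is the quantity that has to be tuned by hand, one value per estimated parameter.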

Estimated parameter ensemble spread

Inspired by previous approaches, Ruiz et al. (2012, submitted).

Parameter inflation is computed in ensemble space (and applied only to the parameters).

• Inflate the ensemble spread so that it is similar to the background ensemble spread, while keeping the structure of the analysis error covariance matrix.

• This nearly preserves the relative magnitude between the parameter ensemble spread and the spread in the state variables.

• The computation is done in ensemble space, so the computational cost is low (almost none).

$$\lambda = \frac{K}{(K-1)\,\mathrm{tr}(\tilde{P}^{a})}, \qquad p^{a} = p^{b}\, \lambda\, W^{a}_{p}$$

K is the ensemble size.

Other approaches: fixed multiplicative inflation (Fertig et al. 2009, Koyama and Watanabe 2010), additive inflation (Kang et al. 2012).

Comparison of both techniques. Ruiz et al (2012, submitted)

The ESTIMATED spread approach gives the best results, independently of the initial parameter ensemble spread. In the ESTIMATED approach the parameter ensemble spread always converges to the same value (not shown), so the parameter ensemble spread does not need to be tuned offline.

[Figure: constant parameters. Curves: FIXED, ESTIMATED.]

The ESTIMATED spread approach can capture changes in the uncertainty of the parameter.

Estimated parameter spread for time independent parameters

Larger parameter value, weaker model sensitivity to the parameter, larger parameter spread.

Lower parameter value, stronger sensitivity, small spread

As has been shown by previous studies, EnKF (LETKF) is an efficient and accurate method for estimating different types of parameters, both those that affect model skill (Kang 2009, Kang et al. 2011, Ruiz et al. 2012) and those associated with biases in the model and in the observations (Miyoshi 2005, Fertig et al. 2009).

Both the analysis and the forecast are improved. The improvement in the forecast seems to be mostly associated with the improvement of the initial conditions.

The optimal parameter spread can be estimated at no additional computational cost.

Parameter estimation in the presence of other sources of model error.

In the experiments presented so far, the model was (almost) perfect. All model error is due to the uncertainty in the convective scheme parameters. What happens when those are not the only source of errors and when the model error cannot be completely corrected by tuning some model parameters (as in real applications)? How can parameter estimation be combined with other methods that include a representation of model error in the ensemble Kalman filter?

Imperfect model experiments: other sources of model imperfection are introduced. To simulate other sources of model imperfection, the values of some parameters (numerical diffusion, surface exchange coefficients) are modified. The convective scheme parameters are the only ones estimated in the parameter estimation experiments. All other settings are as in the perfect model experiment.

Changes in model sensitivity

[Figure panels: U at levels 1, 3 and 5, for the perfect model experiment and for the imperfect model experiment.]

In the imperfect model experiment the optimal value depends on the level (for the same variable). The estimated parameters may therefore depend on the available observations.

In the presence of other sources of model error, the estimated parameters do not converge to the true parameter values, and they show an increased variability in time.

Parameter estimation in the presence of other model error.

[Figure panels: RHBL, ENTMAX, TRCNV.]

Impact upon the analysis

[Figure: analysis RMSE. Curves: imperfect model with perfect convective scheme parameters, imperfect model with parameter estimation, perfect model.]

Estimated parameters produced a lower analysis RMSE than the true convective scheme parameters.

Parameter estimation can improve the analysis even in the presence of other sources of model error.

Combine parameter estimation with adaptive inflation and additive inflation.

• Additive inflation (applied to the state variables only):

Random samples of “true” model error are used as perturbations for the additive inflation approach (Li et al. 2009). This can compensate for the low ensemble spread associated with model error and can modify the structure of the initial condition perturbations.

• Adaptive multiplicative inflation (applied to the state variables only):

Adaptive multiplicative inflation uses the observations to find the optimal inflation level (Li et al. 2009, Miyoshi 2011). This can compensate for low ensemble spread due to error sources not considered in the ensemble formulation.

Kang (2009) showed that multiplicative inflation applied to the state variables has a positive impact upon the estimation of the parameters.
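The two inflation steps can be sketched as follows, applied to the state ensemble only. This is an illustration under stated assumptions: the multiplicative factor `rho` is taken as given (the adaptive estimation of its value from the observations is not shown), it is assumed to act on the covariance so that perturbations are scaled by its square root, and the additive perturbations are drawn from a hypothetical archive `model_error_samples` and re-centred so the ensemble mean is unchanged.

```python
import numpy as np

def inflate_state(Xb, rho, model_error_samples, rng):
    """Multiplicative plus additive inflation of a state ensemble (sketch).

    Xb: (n, K) state ensemble
    rho: covariance inflation factor (>= 1), assumed already estimated
    model_error_samples: (n, S) archive of model error samples, with S >= K
    """
    K = Xb.shape[1]
    mean = Xb.mean(axis=1, keepdims=True)
    # Multiplicative inflation: scale perturbations by sqrt(rho).
    pert = (Xb - mean) * np.sqrt(rho)
    # Additive inflation: one distinct archived sample per member.
    idx = rng.choice(model_error_samples.shape[1], size=K, replace=False)
    add = model_error_samples[:, idx]
    add = add - add.mean(axis=1, keepdims=True)   # keep the ensemble mean fixed
    return mean + pert + add

# Usage with synthetic numbers (all values are illustrative).
rng = np.random.default_rng(0)
Xb = rng.normal(size=(3, 10))
samples = rng.normal(scale=0.1, size=(3, 100))
X_inf = inflate_state(Xb, rho=1.2, model_error_samples=samples, rng=rng)
print(Xb.std(axis=1, ddof=1), X_inf.std(axis=1, ddof=1))
```

The parameter rows of an augmented ensemble would simply be left out of `Xb`, so their spread is handled separately.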

Impact upon the analysis

[Figure: analysis RMSE. Curves: imperfect model with perfect convective scheme parameters, perfect model, additive + adaptive inflation (no parameter estimation), additive + adaptive inflation (with parameter estimation).]

Additive and adaptive inflation produce a larger impact than parameter estimation alone.

Combining parameter estimation with these techniques produces a small improvement of the analysis.

Impact upon the short to medium range forecast: Imperfect model.

Even though the estimated parameters do not converge to the true parameters, they improve the forecast skill.

[Figure curves: imperfect model, estimated parameters, additive + adaptive inflation, additive + adaptive inflation + parameter estimation.]

Parameter estimation is a very promising tool for treating model error in ensemble based data assimilation.

Parameter estimation can partially correct model error.

In more realistic experiments, parameter estimation finds a set of parameters close to the optimal ones, in the sense that the analysis error is reduced. This is related to the idea of a flexible optimization in which the optimal parameters can be a function of the model resolution and of the interaction with other parameterizations.

As seen in the experiments, optimal parameters in the presence of model error can be time dependent. The time scales of these fluctuations may be related to the impact that the optimal parameters can produce in the forecast.

In real applications there are no “true” parameters, just optimal parameters.

Surface exchange coefficients estimation with LETKF-WRF

Motivation: Surface fluxes of moisture and momentum have a great impact upon tropical storms.

The exchange rates of these quantities are more uncertain under strong wind regimes.

Previous studies suggest that the exchange coefficients can be estimated using data assimilation techniques (Ito et al. 2010), or that even the flux itself can be estimated (Kang et al. 2012).

In WRF similarity theory is used to compute the surface heat, moisture and momentum flux coefficients over the ocean. The exchange coefficients are functions of the near surface static stability, wind intensity and the temperature difference between the lowest model level and the sea surface temperature.

Over the land these coefficients are computed in the land surface model.

Parameter estimation implementation: simple approach.

A set of correction factors (α1, α2, α3) is introduced in the equations. These factors are a function of latitude and longitude, and they multiply the fluxes computed by the WRF model.

$$F_{H} = \alpha_{1} F_{H}^{WRF}, \qquad F_{q} = \alpha_{2} F_{q}^{WRF}, \qquad F_{M} = \alpha_{3} F_{M}^{WRF}$$

$$\alpha_{i} = \alpha_{i}(\mathrm{lon}, \mathrm{lat})$$

As the parameters are local, spatial localization is applied to estimate them.

The same localization that is used for surface variables (i.e., ps) is applied to the parameter.

This approach is similar to the one used by Kang et al. 2009, but without the variable localization.
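The correction-factor approach can be sketched with plain arrays. This is not WRF code: the function names, the array layout `(3, nlat, nlon)` and the Gaussian form of the localization weight are assumptions for illustration. The Gaussian is a common choice in LETKF implementations, but the talk does not state which function was used.

```python
import numpy as np

def apply_flux_corrections(F_H, F_q, F_M, alpha):
    """Multiply model fluxes by 2D correction factors (sketch).

    F_H, F_q, F_M: (nlat, nlon) heat, moisture and momentum fluxes
    alpha: (3, nlat, nlon) correction factors, one field per flux
    """
    return alpha[0] * F_H, alpha[1] * F_q, alpha[2] * F_M

def localization_weight(dist_km, scale_km):
    """Gaussian weight for an observation at distance dist_km (assumed form)."""
    return np.exp(-0.5 * (np.asarray(dist_km) / scale_km) ** 2)

# Usage with synthetic fields (all values are illustrative).
nlat, nlon = 4, 5
F_H = np.full((nlat, nlon), 100.0)
F_q = np.full((nlat, nlon), 200.0)
F_M = np.full((nlat, nlon), 0.3)
alpha = np.ones((3, nlat, nlon))
alpha[1, 2, 3] = 1.1                      # locally enhanced moisture flux
F_H_c, F_q_c, F_M_c = apply_flux_corrections(F_H, F_q, F_M, alpha)

print(F_q_c[2, 3])
print(localization_weight([0.0, 300.0, 900.0], scale_km=300.0))
```

In an augmented-state filter each grid point value of the factor fields would be an extra element of the state, updated with the same localized observations as the surface variables at that point.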

Sensitivity experiments: homogeneous parameter perturbation. Track sensitivity.

[Figure panels: perturbed heat flux, perturbed moisture flux, perturbed momentum flux, perturbed SST.]

The heat exchange coefficient perturbation produces the weakest impact upon the cyclone track, and the moisture flux perturbation produces the largest.

Sensitivity experiments: homogeneous parameter perturbation. Storm intensity.

[Figure panels: perturbed heat flux, perturbed moisture flux, perturbed momentum flux, perturbed SST.]
