2016 L26 MEA716 4 19 DA1 - Nc State University · 2016. 5. 2. · (Panasonic Weather Services) GL:...

Tue 4/19/2016

Data Assimilation

• Notes and optional practice assignment

Reminders/announcements:

• MP experiment assignment (due Thursday)

• Final presentations: 28 April, 1-4 pm (final exam period)

• Schedule optional meetings with me if feedback or assistance is needed

• Be sure to emphasize the analysis aspect!

• Handout/assignment provides additional guidance for content and evaluation

• Extra credit option: YouTube presentation of your project! CamStudio (free)

https://www.youtube.com/watch?v=7k9rE3PfT0k&nohtml5=False

• Read short “scenario” for Thursday (will discuss in class)

Data Assimilation

Objectives:

- Outline basic approach building from simple example

- Introduce central concepts and terminology

- Clarify differences between main methods: 3DVAR, 4DVAR, Kalman filter, ensemble Kalman filter

- Increase appreciation for our reliance upon DA in modeling and observational work

Why study DA?

• Everyone using gridded analyses for any study or for model IC/LBC should know where they came from

• Quality of (re)analyses determined by quality of DA system used to create them (Reanalysis 1, 2, CFSR, MERRA, ERA, NARR, GFS, NAM, etc.)

• Those who have DA skills will have no problem finding a job (see next slides)!

• Offerings in MEAS are too limited in this area

Recent DA job opportunity

Job perspective from colleague Neil Jacobs (Panasonic Weather Services)

GL: My sense is that there is a shortage of expertise in this area…

An extreme shortage. The problems are 1) even people with met backgrounds are lacking some of the more intensive math knowledge, and 2) you just can’t find people who know Fortran. We are always looking…

Anyway, there are literally 6-7 people I know in the US who know this stuff well… Kleist, Derber, Whitaker, Hamill, etc. They are all largely self taught. There are those who are great at the theory like Eugenia, but you need someone with the complete package, and that includes both math theory, as well as a software engineer’s level of Fortran knowledge.

Starting salaries for someone with the complete package just out of grad school (Ph.D.) would easily go 150k in NC and 300k+ on the west coast. Granted, there are probably only 15 or so job opportunities out there, but 60% are unfilled.

Job perspective from colleague Neil Jacobs (Panasonic Weather Services)

GL: Also, what advice would you offer to a group of MS and PhD atmospheric scientists who will be entering the job market with regard to modeling, programming, and DA skills?

Learn as much Fortran as possible from a software engineering side versus science side. Science based Fortran does solve problems, but running operational code needs someone who can write very logical *efficient* code, and not just code that gets the right answer.

Having experience with HPC is a big deal. Parallel processing will always be used over single threaded Fortran jobs.

Don’t be afraid to look outside of just meteorology/DA positions. Those type of math and programming skills are needed everywhere from landing probes on Mars to power trading in spot markets. My point is if you know the math and can write efficient code, you’ll be assured a very well-paying job. There is a general lack of qualified people in many different markets, so having a good foundation can set you up perfectly to go in many directions.

Job perspective from colleague Peter Neilley (IBM/The Weather Company)

Also, what advice would you offer…

I share your intuition that there is not a proper balance in qualified DA talent vs. NWP talent in our field to meet the needs of our science. It's hard to quantify that imbalance, but something that indirectly speaks to it is the percentage of computer time that major centers spend on DA vs the forward models. I believe ECMWF is roughly 50-50 on DA vs Model while I think NOAA is closer to 30-70, as is the UKMO. These numbers speak to the importance of DA, which indirectly speaks to the need for scientific talent to support the system.

That said, there are two general classes of NWP jobs, but generally just 1 type of DA job. NWP jobs are generally either model development or model use (e.g. for research/diagnostic/phenomena study purposes). The latter doesn't need to know or in some cases well understand the model. But on the DA side, there generally is just the development kind of job. There isn't (as much of) an equivalent of the diagnostic NWP position for DA.

Job perspective from colleague Peter Neilley (IBM/The Weather Company)

Also, what advice would you offer…

Maybe if I'm in Raleigh one day I could give the students a brown bag on my perspective of success in the enterprise, and in particular the private sector. But my fundamental recommendation to students these days is to develop a hybrid of talents. The quote I like to use is "find your second passion", the first passion being meteorology of course. A second passion is what will make a student stand out in a subdiscipline and if they are strong there it will make them very employable.

Examples of second passions are software development, societal applications (e.g. financial markets, risk management, communication, renewables, etc.), teaching, etc. Perhaps NWP/DA development could be considered a second passion too. The point is being just a meteorologist is generally not sufficient anymore.

Updated graph: Courtesy Dr. Adrian Simmons, ECMWF (July 2010)

ECMWF 500-mb anomaly correlation

Data Assimilation

1.) Introduction

2.) Older Empirical Techniques

a.) Successive correction methods (SCM)

b.) Nudging

3.) Least Squares methods

• Optimum Interpolation (OI)

• Variational (cost function) approaches

– 3DVar

– 4DVar

• Kalman Filtering

4.) Dynamical balancing of initial conditions

5.) Observational data QC

Data Assimilation (DA)

• Manual gridding of data used in initial NWP efforts

• Richardson (1922) and Charney et al. (1950) undertook time-consuming manual interpolation procedures

• Immediately recognized as not acceptable

Data Assimilation (DA)

• Talagrand (1997) defined DA as

– “Using all the available information to determine as accurately as possible the state of the atmospheric (or oceanic) flow.”

• “Available information” includes much more than observations: dynamical relations between variables, error statistics, climatology, etc.

• Consider also the value of information at different times and places from analysis domain

• Info from one field is related to other fields (e.g., wind, geopotential height)

Data Assimilation

• There are not enough observations at a given time to initialize a primitive equation model

• Data density is non-uniform

• Many observed variables are not dependent variables in PE model (e.g., satellite radiance, radar reflectivity)

• Clear from start that “first guess” needed in model initialization (aka “background” or “prior”)

• Initially, climatology, or combination of climo and short-term forecast used. Now, short-term forecast used

Data Assimilation

• Even in Richardson’s first NWP forecast, raw observations alone could not be used to initialize a successful numerical forecast (as Lynch demonstrated)

• Must modify data in dynamically consistent manner to provide a valid initial condition

• Traditionally, “Data Assimilation” involves two facets:

– (i) Objective analysis (OA) – transfer of irregular observations to a grid, with quality control (use 1st guess or background field)

– (ii) Data initialization – objective analysis contains noise that would result in large, spurious gravity waves, must filter, balance

• Modern DA essentially combines these steps

DA flow diagram (after Kalnay Fig. 5.1.2a)

Observations (+/- 3 h) and background or first guess → Global analysis (statistical interpolation and balancing) → Initial conditions → Global forecast model → Operational forecasts

The 6-h forecast from the global forecast model becomes the background or first guess for the next analysis

Model is mechanism that propagates info in time & space

E.g., model can transmit info from data rich to data poor areas

4DDA

Over oceans, analysis consists largely of:

Asynoptic data (ship/buoy reports from all hours, satellite data, aircraft data)

Earlier model short-term forecasts

These sources are difficult to build into objective analyses

Major operational centers have combined OA and initialization into a continuous cycle of data assimilation (e.g., 4DVAR at ECMWF)

3.) Least Squares Methods

Results from the Least Squares method carry over to more complex methods, so introduce first

Start with simplest example possible:

Two independent temperature observations, T1 & T2; assume instruments unbiased (errors are random, not systematic)

(whiteboard, starting with equation #1 thru #11)

3.) Least Squares - variational

For optimum interpolation method, minimized least-square error with respect to weights

For variational methods: All information, weighted by their statistical error characteristics, used to derive cost function

Cost function is minimized to yield analysis that is most likely estimate of the true atmospheric state at a given time

3. Least Squares: variational

Cost function (from Kalnay 2003): 2 temperature obs, T1 & T2, from different, independent data sources with normally distributed errors & standard deviations $\sigma_1$ and $\sigma_2$

We can define the cost function J as

$$J(T) = \frac{1}{2}\left[\frac{(T - T_1)^2}{\sigma_1^2} + \frac{(T - T_2)^2}{\sigma_2^2}\right] \qquad (12)$$

The cost function is clearly related to the square of difference between the actual temperature T and the 2 observations

How can we minimize J(T)?

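Variational approach

Gaussian statistics for normal distributions: Minimizing J(T) yields an expression for maximum likelihood of T in terms of observations and their error statistics

Set the derivative of J with respect to T to zero, let $T_a$ be the “maximum likelihood” value, and solve:

$$\frac{\partial J}{\partial T} = \frac{\partial}{\partial T}\,\frac{1}{2}\left[\frac{(T - T_1)^2}{\sigma_1^2} + \frac{(T - T_2)^2}{\sigma_2^2}\right] = 0$$

$$T_a = \frac{\sigma_2^2}{\sigma_1^2 + \sigma_2^2}\,T_1 + \frac{\sigma_1^2}{\sigma_1^2 + \sigma_2^2}\,T_2$$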

Least squares: Variational

Does this make sense, physically?

$$T_a = \frac{\sigma_2^2}{\sigma_1^2 + \sigma_2^2}\,T_1 + \frac{\sigma_1^2}{\sigma_1^2 + \sigma_2^2}\,T_2$$

Yes, an observation with a small error variance (reliable) is weighted more heavily than one with a large error variance

In this way, construct the best possible analysis using all available data and knowledge of data error characteristics
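As a minimal sketch of the two-observation result above, the following Python computes the weighted estimate and evaluates the cost function (12) around it. The function names and the sample temperatures and standard deviations are illustrative assumptions, not values from the lecture.

```python
def combine_two_obs(t1, sigma1, t2, sigma2):
    """Least-squares estimate T_a from two unbiased, independent observations."""
    var1, var2 = sigma1 ** 2, sigma2 ** 2
    w1 = var2 / (var1 + var2)  # weight on T1 grows as T2 becomes less reliable
    w2 = var1 / (var1 + var2)  # weight on T2 grows as T1 becomes less reliable
    return w1 * t1 + w2 * t2


def cost(t, t1, sigma1, t2, sigma2):
    """Cost function J(T) of eq. (12)."""
    return 0.5 * ((t - t1) ** 2 / sigma1 ** 2 + (t - t2) ** 2 / sigma2 ** 2)


# Assumed sample values: T1 = 20.0 (sigma 1.0), T2 = 22.0 (sigma 2.0)
t_a = combine_two_obs(20.0, 1.0, 22.0, 2.0)
print(t_a)  # estimate lies closer to the more reliable observation T1
```

With these assumed inputs the weights are 0.8 on T1 and 0.2 on T2, so the estimate sits much nearer to 20.0 than to 22.0, and J evaluated slightly to either side of `t_a` is larger than at `t_a`.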

3.) Least Squares Methods

Operational systems use a “background” or “first guess” field, in addition to observations

Analogous development using 1 observation and 1 background (first guess) value yields:

$$T_a = T_b + W\,(T_{obs} - T_b) \qquad (13)$$

“The analysis is derived by adding innovation to 1st guess, weighted by optimal weight”

$$W = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_{obs}^2} \qquad (14)$$

“The optimal weight is the background error variance divided by the total error variance”

$$\frac{1}{\sigma_a^2} = \frac{1}{\sigma_b^2} + \frac{1}{\sigma_{obs}^2} \qquad (15)$$

“The analysis precision is the sum of the background and observational precision”

$$\sigma_a^2 = (1 - W)\,\sigma_b^2 \qquad (16)$$

“The analysis error variance is equal to the background error variance weighted by (1 – optimal weight)”

These equations were developed here for an extremely simple example, but they have exactly the same form as in OI, 3DVar, Kalman Filtering
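A short sketch of the scalar analysis step, assuming one background value and one observation with known error variances. The function name and the numbers are hypothetical; the arithmetic follows eqs. (13), (14), and (16), and eq. (15) is used only as a consistency check.

```python
def analysis_update(t_b, var_b, t_obs, var_obs):
    """One scalar analysis step: returns (T_a, analysis error variance, W)."""
    w = var_b / (var_b + var_obs)   # eq. (14): optimal weight
    t_a = t_b + w * (t_obs - t_b)   # eq. (13): background plus weighted innovation
    var_a = (1.0 - w) * var_b       # eq. (16): analysis error variance
    return t_a, var_a, w


# Assumed sample values: background 15.0 (variance 4.0), observation 17.0 (variance 1.0)
t_a, var_a, w = analysis_update(t_b=15.0, var_b=4.0, t_obs=17.0, var_obs=1.0)

# Consistency check against eq. (15): precisions add
precision_sum = 1.0 / 4.0 + 1.0 / 1.0
print(t_a, var_a, w, abs(1.0 / var_a - precision_sum) < 1e-9)
```

Because the assumed background is four times less certain than the observation, the weight is 0.8 and the analysis moves most of the way toward the observation, while the analysis variance ends up smaller than either input variance.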

Analysis cycling

We can “cycle” eq. (13) – (16) if the background field is a model forecast

Suppose we have analysis at time t = $t_i$ (e.g., 00 UTC), and we want subsequent analysis for t = $t_{i+1}$ (e.g., 03 UTC)

Two phases in cycle:

1.) Forecast phase to update $T_b$ and $\sigma_b^2$

2.) Analysis phase to update $T_a$ and $\sigma_a^2$ (using $T_{obs}$, $\sigma_{obs}^2$)

Extension to DA

In forecast phase of analysis cycle, we compute a background field:

$$T_b(t_{i+1}) = M\left[T_a(t_i)\right] \qquad (17)$$

In (17), M is a “Model”, which does not have to be a dynamical model but is, in practice

Must also estimate $\sigma_b^2$ for next analysis time

Analysis cycling methods

We must also estimate $\sigma_b^2$ for the future time:

(a) Optimum Interpolation approach:

$$\sigma_b^2(t_{i+1}) = a\,\sigma_a^2(t_i) \qquad (18)$$

In (18), “a” is > 1 but < 2, a simple assumption about error growth in model M

Then, compute new weight W using (14)

(b) Kalman Filtering approach:

Still use (17), but instead of a simple assumption as in (18), compute background error variance from model

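Kalman Filter method

$$T_t(t_{i+1}) = M\left[T_t(t_i)\right] - \varepsilon_m \qquad (19)$$

In (19), $T_t(t_{i+1})$ is the future “true” temperature, $M[T_t(t_i)]$ is the model forecast from a “perfect” analysis, and $\varepsilon_m$ is the model error

E.g., use linearized version of model, “tangent linear model”, or TLM, to isolate error growth (several methods)

$$\sigma_b^2(t_{i+1}) = M^2\,\sigma_{a,i}^2 + \sigma_m^2 \qquad (20)$$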

Analysis cycling

1.) Get observations for new time, model forecast for background

2.) Determine $\sigma_b^2$ at new time from (18 - OI) or (20 - KF)

3.) Compute analysis, determine analysis error variance for use in next cycle ($\sigma_a^2$)
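The three steps above can be sketched as a loop. This is only an illustration under stated assumptions: `toy_model` is a hypothetical linear "model" that relaxes temperature toward 20, its derivative 0.9 stands in for the tangent linear model, and the growth factor, model error variance, and observation values are made up.

```python
def analyze(t_b, var_b, t_obs, var_obs):
    """Analysis phase, eqs. (13), (14), (16)."""
    w = var_b / (var_b + var_obs)
    return t_b + w * (t_obs - t_b), (1.0 - w) * var_b


def toy_model(t):
    """Hypothetical model M: relax temperature toward 20 (assumption)."""
    return 20.0 + 0.9 * (t - 20.0)


TLM = 0.9  # derivative dM/dT of the toy model, used as its tangent linear model


def forecast_oi(t_a, var_a, growth=1.5):
    """Forecast phase with the OI error-growth assumption, eqs. (17) and (18)."""
    return toy_model(t_a), growth * var_a


def forecast_kf(t_a, var_a, var_m=0.2):
    """Forecast phase with the Kalman filter variance forecast, eqs. (17) and (20)."""
    return toy_model(t_a), TLM ** 2 * var_a + var_m


def cycle(t_a, var_a, observations, var_obs, forecast):
    """Alternate forecast and analysis phases, one observation per cycle."""
    history = []
    for t_obs in observations:
        t_b, var_b = forecast(t_a, var_a)                  # steps 1 and 2
        t_a, var_a = analyze(t_b, var_b, t_obs, var_obs)   # step 3
        history.append((t_a, var_a))
    return history


obs = [18.2, 18.9, 19.4, 19.3]  # assumed observation values
oi_run = cycle(15.0, 4.0, obs, 1.0, forecast_oi)
kf_run = cycle(15.0, 4.0, obs, 1.0, forecast_kf)
for (t_oi, v_oi), (t_kf, v_kf) in zip(oi_run, kf_run):
    print(round(t_oi, 3), round(v_oi, 3), round(t_kf, 3), round(v_kf, 3))
```

The only difference between the two runs is how the background error variance is obtained before each analysis, which mirrors the distinction between (18) and (20).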

Comments

In general, we do not and cannot directly observe the variables we are analyzing

- Variables, times, or locations can differ between obs, analysis

- E.g., radars measure reflectivity or radial velocity, satellites measure radiance; these are not dependent variables in model

- These quantities are related to what we are analyzing, however

Must use an “observation operator” or “observation forward operator”, H, to project background information onto observation space, $H(T_b)$, in order to compute statistics

H includes vertical and horizontal interpolations, and transformations based on physical laws (e.g., radiation laws to convert background T or q to radiances)
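To make the role of H concrete, here is a deliberately simple sketch in which H is only a vertical linear interpolation from model levels to the height of an observation. Real operators also include horizontal interpolation and physical transformations such as radiative transfer; the level heights, temperatures, and observation used here are hypothetical.

```python
def interp_operator(heights, values, z_obs):
    """Toy H: linearly interpolate a background profile to the observation height."""
    for k in range(len(heights) - 1):
        z0, z1 = heights[k], heights[k + 1]
        if z0 <= z_obs <= z1:
            frac = (z_obs - z0) / (z1 - z0)
            return (1.0 - frac) * values[k] + frac * values[k + 1]
    raise ValueError("observation height outside the model column")


model_heights = [0.0, 1000.0, 2000.0]  # assumed model levels (m)
background_t = [288.0, 281.5, 275.0]   # assumed background temperatures (K)

t_obs = 283.0                          # assumed observation at 500 m
h_of_tb = interp_operator(model_heights, background_t, 500.0)
innovation = t_obs - h_of_tb           # observation minus background in observation space
print(h_of_tb, innovation)
```

The innovation is computed in observation space, so the observation itself is never moved to a model level.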

Comments

In 1980s and early 1990s, DA systems used “retrieved” analysis variables

e.g., Satellite sounder data were processed to produce profiles of T, q, that “looked like” rawinsonde data

This proved to be inferior relative to utilizing the raw measurements directly in observational forward model

Why?

Retrieval technique doesn’t have all the other data or background available; the other data help to make the retrieval more accurate and maximize usefulness of data

It is extremely difficult to know the error covariance of the retrieved profiles; e.g., the radiance error covariance is much better known because it is due to instrument error (more likely to be unbiased)

Extension to 4-D: Notation and Terminology

Generalize to complete NWP model problem:

(i) Analysis for a field of model variables, $\mathbf{x}_a$

(ii) A background field, $\mathbf{x}_b$, available at grid points

(iii) A set of “p” observations, $\mathbf{y}_o$, at irregular points $\mathbf{r}_i$

Can be 2D or 3D analysis

Combine all model variables into large arrays of length n, where n = i × j × k × # of model variables

H is observation operator that transforms background field to observation space; $\mathbf{H}^T$, transpose of H, converts back to model space

Extension to 4-D: Notation and Terminology

Also:

(i) “error variance” becomes “error covariance matrix”

(ii) “optimal weight” becomes “optimal gain matrix”

Note that $\mathbf{y}_o$ is used to denote the observations:

- The observations are in different spatial (and temporal) locations from the model grid points

- Observed quantities are often not same physical parameters that are analyzed (as discussed before)

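Multivariate OI Problem

First, we tackled the OI problem, which is similar in form to 3DVar and Kalman Filtering, but with more approximations and other practical disadvantages

For OI, (13) becomes:

$$\mathbf{x}_t - \mathbf{x}_b = \mathbf{W}\left[\mathbf{y}_{obs} - H(\mathbf{x}_b)\right] - \boldsymbol{\varepsilon}_a \qquad (21)$$

In (21), $\boldsymbol{\varepsilon}_a = \mathbf{x}_a - \mathbf{x}_t$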

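Multivariate OI Problem

Equivalently, we could write

$$\mathbf{x}_a = \mathbf{x}_b + \mathbf{W}\left[\mathbf{y}_{obs} - H(\mathbf{x}_b)\right]$$

We can also write (21) as:

$$\mathbf{x}_a = \mathbf{x}_b + \mathbf{W}\,\mathbf{d}$$

where

$$\mathbf{d} = \mathbf{y}_{obs} - H(\mathbf{x}_b) \qquad (22)$$

$\mathbf{d}$ is the “innovation” or “observational increments” vector

“B” is the “background error covariance” matrix

“R” is the observation error covariance matrix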

Multivariate OI Problem

$$\mathbf{x}_a = \mathbf{x}_b + \mathbf{W}\left[\mathbf{y}_{obs} - H(\mathbf{x}_b)\right] = \mathbf{x}_b + \mathbf{W}\,\mathbf{d} \qquad (23)$$

“Analysis obtained by adding background to the innovation, weighted by optimal weight matrix. The 1st guess of the observations is obtained by applying H to background”

$$\mathbf{W} = \mathbf{B}\mathbf{H}^T\left(\mathbf{R} + \mathbf{H}\mathbf{B}\mathbf{H}^T\right)^{-1} \qquad (24)$$

“The optimal weight is the background error covariance in observation space ($\mathbf{B}\mathbf{H}^T$) divided by total error variance”

(24) is obtained by minimizing the least squares equation with respect to the weights to find optimum (not shown)

$$\mathbf{P}_a = \left(\mathbf{I} - \mathbf{W}\mathbf{H}\right)\mathbf{B} \qquad (25)$$

“The analysis error variance is equal to the background error covariance weighted by the identity matrix I minus the optimal weight matrix”

If you understood the earlier expressions for the simple problem, then you can understand these, because the meanings are exactly analogous

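OI and simple example

We can see that the following are analogous:

$$\mathbf{x}_a = \mathbf{x}_b + \mathbf{W}\left[\mathbf{y}_{obs} - H(\mathbf{x}_b)\right] \qquad (23)$$

$$T_a = T_b + W\,(T_{obs} - T_b) \qquad (13)$$

And the optimal weight equations are as well:

$$\mathbf{W} = \mathbf{B}\mathbf{H}^T\left(\mathbf{H}\mathbf{B}\mathbf{H}^T + \mathbf{R}\right)^{-1} \qquad (24)$$

$$W = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_{obs}^2} \qquad (14)$$

B = background error covariance, R = observation error covariance

$$\mathbf{P}_a = \left(\mathbf{I} - \mathbf{W}\mathbf{H}\right)\mathbf{B} \qquad (25)$$

$$\sigma_a^2 = (1 - W)\,\sigma_b^2 \qquad (16)$$

“The analysis error variance is equal to the background error covariance weighted by the identity matrix I minus the optimal weight matrix”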
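The analogy can be checked numerically on a very small problem. The sketch below assumes a two-point state and a single observation, so the matrix inverse in (24) reduces to a scalar division; the function name, covariance values, and observation are all hypothetical.

```python
def oi_single_obs(x_b, B, h, y_obs, r):
    """OI analysis for an n-point state and ONE observation (assumed linear H = row h)."""
    n = len(x_b)
    bht = [sum(B[i][j] * h[j] for j in range(n)) for i in range(n)]       # B H^T
    hbht = sum(h[i] * bht[i] for i in range(n))                            # H B H^T (scalar)
    w = [bht[i] / (hbht + r) for i in range(n)]                            # eq. (24)
    d = y_obs - sum(h[j] * x_b[j] for j in range(n))                       # eq. (22)
    x_a = [x_b[i] + w[i] * d for i in range(n)]                            # eq. (23)
    hb = [sum(h[k] * B[k][j] for k in range(n)) for j in range(n)]         # H B
    p_a = [[B[i][j] - w[i] * hb[j] for j in range(n)] for i in range(n)]   # eq. (25)
    return x_a, p_a, w


x_b = [10.0, 12.0]              # assumed background at two grid points
B = [[1.0, 0.5], [0.5, 1.0]]    # assumed background error covariance
h = [1.0, 0.0]                  # the observation samples grid point 1 only
x_a, p_a, w = oi_single_obs(x_b, B, h, y_obs=11.5, r=0.5)
print(x_a, w)
```

Even though only the first grid point is observed, the off-diagonal term in B spreads half of the correction to the second point, which is why the determination of B is so important in practice.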

Comments on OI Method

Observation errors are due to several sources:

(i) Instrument error (random)

(ii) Error of representativeness

(iii) Errors in transformation between obs, model space (H)

$$\mathbf{R} = \mathbf{R}_{inst} + \mathbf{R}_{rep} + \mathbf{R}_{H} \qquad (26)$$

In practice, high-density observation clusters are merged into “superobservations” that combine information from individual observations before assimilation

Most critical aspect of problem is determining B, because R is generally diagonal (or can be made so) if observations are independent

As before, form of Kalman Filtering equations is the same as for OI, except for determination of background error covariance (forecasted from model)

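3DVar

3DVar case:

Parallel to earlier example, we can arrive at (23) by minimizing a cost function with respect to analysis variables:

$$J(\mathbf{x}) = \frac{1}{2}\left\{(\mathbf{x} - \mathbf{x}_b)^T \mathbf{B}^{-1} (\mathbf{x} - \mathbf{x}_b) + \left[\mathbf{y}_o - H(\mathbf{x})\right]^T \mathbf{R}^{-1} \left[\mathbf{y}_o - H(\mathbf{x})\right]\right\} \qquad (27)$$

$$\nabla_{\mathbf{x}} J(\mathbf{x}_a) = 0$$

$$\mathbf{x}_a = \mathbf{x}_b + \left(\mathbf{B}^{-1} + \mathbf{H}^T \mathbf{R}^{-1} \mathbf{H}\right)^{-1} \mathbf{H}^T \mathbf{R}^{-1} \left[\mathbf{y}_o - H(\mathbf{x}_b)\right] \qquad (28)$$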
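A scalar sketch of the same idea: minimize the one-variable form of (27) iteratively and compare with the closed-form weight of (14). Plain gradient descent is used here purely for illustration and is an assumption; operational 3DVar systems use more sophisticated minimization algorithms. The step size and sample values are also assumptions.

```python
def cost_gradient(t, t_b, var_b, t_obs, var_obs):
    """dJ/dT for the scalar cost J = 0.5*[(T-Tb)^2/var_b + (Tobs-T)^2/var_obs], with H = identity."""
    return (t - t_b) / var_b - (t_obs - t) / var_obs


def minimize_3dvar(t_b, var_b, t_obs, var_obs, step=0.1, iterations=500):
    """Find the analysis by descending the cost function, starting from the background."""
    t = t_b
    for _ in range(iterations):
        t -= step * cost_gradient(t, t_b, var_b, t_obs, var_obs)
    return t


# Assumed sample values
t_b, var_b, t_obs, var_obs = 15.0, 4.0, 17.0, 1.0
t_var = minimize_3dvar(t_b, var_b, t_obs, var_obs)

# Closed-form OI answer from eqs. (13) and (14) for comparison
t_oi = t_b + var_b / (var_b + var_obs) * (t_obs - t_b)
print(t_var, t_oi)
```

Both routes arrive at the same analysis value, which is the scalar counterpart of the statement that (28) and (23) are equivalent.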

How does 3DVar Differ from OI?

3DVar and OI are similar in form, but in practice 3DVar has several major advantages

OI makes several approximations that are absent in variational methods:

Method of solution is local (grid point by grid point) and sequential over variables

The background error covariance is also locally approximated

In 3DVar, the cost function can be minimized globally (simultaneously everywhere) with all data used simultaneously

In 3DVar, can easily add new constraints to cost function (e.g., a balance condition, or direct QC procedures)

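3DVar

From our notes:

$$J(\mathbf{x}) = \frac{1}{2}\left\{(\mathbf{x} - \mathbf{x}_b)^T \mathbf{B}^{-1} (\mathbf{x} - \mathbf{x}_b) + \left[\mathbf{y}_o - H(\mathbf{x})\right]^T \mathbf{R}^{-1} \left[\mathbf{y}_o - H(\mathbf{x})\right]\right\} \qquad (27)$$

Zapotocny et al. (2000), eq. (3):

Difference is use of balance constraint in operational cost function, as discussed in article on page 609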

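4DVar

4DVar is used at ECMWF, available in MM5 (Zou et al. 1997) and now WRF

$$J\left[\mathbf{x}(t_0)\right] = \frac{1}{2}\left[\mathbf{x}(t_0) - \mathbf{x}_b(t_0)\right]^T \mathbf{B}_0^{-1} \left[\mathbf{x}(t_0) - \mathbf{x}_b(t_0)\right] + \frac{1}{2}\sum_{i=0}^{N}\left[H(\mathbf{x}_i) - \mathbf{y}_i^o\right]^T \mathbf{R}_i^{-1} \left[H(\mathbf{x}_i) - \mathbf{y}_i^o\right] \qquad (29)$$

Similar in form to 3DVar, but better incorporation of observations from times that differ from $t_a$

Find analysis that minimizes difference between model solution, observations over some time interval

Model assumed perfect in this process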

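4DVar

3DVar and 4DVar:

$$J\left[\mathbf{x}(t_0)\right] = \frac{1}{2}\left[\mathbf{x}(t_0) - \mathbf{x}_b(t_0)\right]^T \mathbf{B}_0^{-1} \left[\mathbf{x}(t_0) - \mathbf{x}_b(t_0)\right] + \frac{1}{2}\sum_{i=0}^{N}\left[H(\mathbf{x}_i) - \mathbf{y}_i^o\right]^T \mathbf{R}_i^{-1} \left[H(\mathbf{x}_i) - \mathbf{y}_i^o\right] \qquad (29)$$

$$J(\mathbf{x}) = \frac{1}{2}\left\{(\mathbf{x} - \mathbf{x}_b)^T \mathbf{B}^{-1} (\mathbf{x} - \mathbf{x}_b) + \left[\mathbf{y}_o - H(\mathbf{x})\right]^T \mathbf{R}^{-1} \left[\mathbf{y}_o - H(\mathbf{x})\right]\right\} \qquad (27)$$

In (29), background integrated to same time as observations using model

4DVar cost function includes summation over time of each observational increment computed with respect to the model integrated to the time of the observation

Cost function minimized wrt initial state of model, but analysis at end of time interval is given by model integration – so analysis must be a solution of model equations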
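A toy scalar illustration of the 4DVar cost function, under strong assumptions: the "model" is a hypothetical linear map x(i+1) = m·x(i), H is the identity, and every observation has the same error variance. With a linear model the cost is quadratic in the initial state, so the minimizer can be written in closed form instead of being found iteratively.

```python
def cost_4dvar(x0, x_b, var_b, obs, var_o, m):
    """Scalar form of eq. (29): background term plus observation terms over the window."""
    j = 0.5 * (x0 - x_b) ** 2 / var_b
    x = x0
    for y in obs:
        j += 0.5 * (x - y) ** 2 / var_o
        x = m * x  # integrate the toy model to the next observation time
    return j


def solve_4dvar(x_b, var_b, obs, var_o, m):
    """Initial state minimizing cost_4dvar (closed form, valid only for this linear toy model)."""
    num = x_b / var_b
    den = 1.0 / var_b
    for i, y in enumerate(obs):
        num += (m ** i) * y / var_o
        den += (m ** (2 * i)) / var_o
    return num / den


m = 0.9                                     # assumed toy model coefficient
truth0 = 10.0                               # assumed true initial state
obs = [truth0 * m ** i for i in range(4)]   # error-free observations along the true trajectory
x0_a = solve_4dvar(x_b=8.0, var_b=4.0, obs=obs, var_o=1.0, m=m)
x_end = x0_a * m ** (len(obs) - 1)          # analysis at end of window is a model solution
print(x0_a, x_end)
```

The control variable is the initial state only; the state at the end of the window follows from integrating the model, which is the sense in which the analysis must be a solution of the model equations.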

4DVar

Used at ECMWF, available in MM5, WRF (Zou et al. 1997)

Some centers have skipped straight from earlier techniques to 4DVar (e.g., ECMWF)

Why has NCEP invested so heavily in 3DVar development? Insights from MM5/WRF Team:

Stronger reliance on model itself in 4DVar, can be a disadvantage

4DVar is computationally very expensive, and many systems cannot utilize all observations in time to get operational IC to model

Not clear whether benefit from 4DVar greater than could be derived from better use of high-density obs in 3DVar (Kalnay 2003)
