La puissance de calcul et les techniques de prévision pour...

24
Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition La puissance de calcul et les techniques de pr´ evision pour les Geosciences en grande dimension Support from RTRA-STAE network Serge Gratton, with E. Simon, Ph.L. Toint, L.N. Vicente, Z. Zhang S. Gurol, P. Laloyau, M. Pontaud, A. Weaver December, 2016 1 / 19 La puissance de calcul et les techniques de pr´ evision pour les Geosciences en grande dimension

Transcript of La puissance de calcul et les techniques de prévision pour...

Page 1: La puissance de calcul et les techniques de prévision pour ...bonnaillie/Modelisation2016/GRATTON_Serge.pdfLa puissance de calcul et les techniques de pr evision pour les Geosciences

Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition

La puissance de calcul et les techniques deprevision pour les Geosciences en grande

dimensionSupport from RTRA-STAE network

Serge Gratton,with E. Simon, Ph.L. Toint, L.N. Vicente, Z. Zhang

S. Gurol, P. Laloyau, M. Pontaud, A. Weaver

December, 2016

1 / 19

La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension

Page 2: La puissance de calcul et les techniques de prévision pour ...bonnaillie/Modelisation2016/GRATTON_Serge.pdfLa puissance de calcul et les techniques de pr evision pour les Geosciences

Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition

Outline

1 Generalities

2 Some perspectives in meteorology : a general view

3 Enhancing parallel performance by decoupling in time

4 Using hierachy of observations

5 Subspace reduction

1 / 19

La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension

Page 3: La puissance de calcul et les techniques de prévision pour ...bonnaillie/Modelisation2016/GRATTON_Serge.pdfLa puissance de calcul et les techniques de pr evision pour les Geosciences

Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition

The art of forecasting...

A (dynamical) system is characterized by state variables, e.g.

velocity components

pressure

density

temperature

gravitational potential

Goal : predict or understand the system behaviour at a future time from

dynamical integration model

observational data

Applications : climate, meteorology, oceanography, neutronics, finance, ...−→ forecasting problems

2 / 19

La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension

Page 4: La puissance de calcul et les techniques de prévision pour ...bonnaillie/Modelisation2016/GRATTON_Serge.pdfLa puissance de calcul et les techniques de pr evision pour les Geosciences

Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition

The dynamical integration model predicts the state of the system giventhe state at an earlier time.−→ integrating may lead to very large prediction errors−→ (inexact physics, discretization errors, approximated parameters)

Observational data are used to improve accuracy of the forecasts.−→ but the data are inaccurate (measurement noise, under-sampling−→ In atmosphere/ocean science millions of observations and variables

are processed every day : large scale inverse problem

3 / 19

La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension

Page 5: La puissance de calcul et les techniques de prévision pour ...bonnaillie/Modelisation2016/GRATTON_Serge.pdfLa puissance de calcul et les techniques de pr evision pour les Geosciences

Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition

The dynamical integration model predicts the state of the system giventhe state at an earlier time.−→ integrating may lead to very large prediction errors−→ (inexact physics, discretization errors, approximated parameters)

Observational data are used to improve accuracy of the forecasts.−→ but the data are inaccurate (measurement noise, under-sampling−→ In atmosphere/ocean science millions of observations and variables

are processed every day : large scale inverse problem

3 / 19

La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension

Page 6: La puissance de calcul et les techniques de prévision pour ...bonnaillie/Modelisation2016/GRATTON_Serge.pdfLa puissance de calcul et les techniques de pr evision pour les Geosciences

Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition

Data assimilation chart

−2 −1 0 1−1

0

1

starting from a priori knowledge on the state, the forward model is run :expensive in computer time, not always very parallel

data are are processed : screened, agglomerated

adjustment with of model with respect to observations : best value to befound by some form of optimization

predictions are then issued

4 / 19

La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension

Page 7: La puissance de calcul et les techniques de prévision pour ...bonnaillie/Modelisation2016/GRATTON_Serge.pdfLa puissance de calcul et les techniques de pr evision pour les Geosciences

Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition

Data assimilation chart

−2 −1 0 1−1

0

1

starting from a priori knowledge on the state, the forward model is run :expensive in computer time, not always very parallel

data are are processed : screened, agglomerated

adjustment with of model with respect to observations : best value to befound by some form of optimization

predictions are then issued

4 / 19

La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension

Page 8: La puissance de calcul et les techniques de prévision pour ...bonnaillie/Modelisation2016/GRATTON_Serge.pdfLa puissance de calcul et les techniques de pr evision pour les Geosciences

Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition

Data assimilation chart

−2 −1 0 1−1

0

1

starting from a priori knowledge on the state, the forward model is run :expensive in computer time, not always very parallel

data are are processed : screened, agglomerated

adjustment with of model with respect to observations : best value to befound by some form of optimization

predictions are then issued

4 / 19

La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension

Page 9: La puissance de calcul et les techniques de prévision pour ...bonnaillie/Modelisation2016/GRATTON_Serge.pdfLa puissance de calcul et les techniques de pr evision pour les Geosciences

Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition

Data assimilation chart

−2 −1 0 1−1

0

1

starting from a priori knowledge on the state, the forward model is run :expensive in computer time, not always very parallel

data are are processed : screened, agglomerated

adjustment with of model with respect to observations : best value to befound by some form of optimization

predictions are then issued

4 / 19

La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension

Page 10: La puissance de calcul et les techniques de prévision pour ...bonnaillie/Modelisation2016/GRATTON_Serge.pdfLa puissance de calcul et les techniques de pr evision pour les Geosciences

One possibility is to approximately solve a large-scale non-linear weightedleast-squares problem :

minx∈Rn

1

2‖x − xb‖2

B−1 +1

2

N∑j=0

∥∥Hj

(Mj(x)

)− yj

∥∥2

R−1j

where

x ≡ x(t0) is the control variable in Rn, n ∼ 106.

Mj are model operators : x(tj) =Mj(x(t0))

Hj are observation operators : yj ≈ Hj(x(tj))

the obervations yj and the background xb are noisy

B and Rj are covariance matrices

Model integration and minimization are crucial mathematical steps tooptimize to achieve good performance

Intricate combination of physics, applied mathematics and computerscience

Steered by application challenges

Page 11: La puissance de calcul et les techniques de prévision pour ...bonnaillie/Modelisation2016/GRATTON_Serge.pdfLa puissance de calcul et les techniques de pr evision pour les Geosciences

Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition

Improving the forecasting capabilities of thesystem : case of Meteorology

The main goal is to obtain fast, accurate algorithms that yields a goodrepresentation of reality.Raises challenges :

Physical modelling. Use finer grids, incorporate as many observations aspossible and constraints

Statistical modelling. Try to learn from earlier forecast exercises. Estimaterather than impose errors on a priori knowledge and dynamical model

Algorithmic. Solve large-scale systems, large-scale optimization problems

Machine. Fast machines based on sustainable models in terms of cost,consumption, with balanced performance

6 / 19

La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension

Page 12: La puissance de calcul et les techniques de prévision pour ...bonnaillie/Modelisation2016/GRATTON_Serge.pdfLa puissance de calcul et les techniques de pr evision pour les Geosciences

Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition

Some challenges on the model side

At Meteo-France, essentially 2 models : global and limited area/regional.A ratio of 3-5 between resolution of both.

Want to increase resolution of limited area to :

take into account topography, soil moisture for fog forecastIncrease resolution also needed for cevenol events

Want to add new variables to temperature, pressure, etc. For instance dustconcentration, aerosols. Want to add also chemistry with new applicationsexpected in air quality. This would results in an improved physics

All these improvements ask for ever larger computer ressources.

Huge socio-economical impact

7 / 19

La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension

Page 13: La puissance de calcul et les techniques de prévision pour ...bonnaillie/Modelisation2016/GRATTON_Serge.pdfLa puissance de calcul et les techniques de pr evision pour les Geosciences

Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition

Challenges on the optimization side (I) :weak-constrained variational approach

minx∈Rn

1

2‖x0 − xb‖2

B−1 +1

2

N∑j=0

∥∥Hj

(xj)− yj

∥∥2

R−1j

+1

2

N∑j=1

‖xj −Mj(xj−1)︸ ︷︷ ︸qj

−q‖2

Q−1j

x =

x0

x1

...xN

∈ Rn is the control variable (with xj = x(tj))

Notation same as before but :Matrices Qj are the covariance matrices for model error. They should beestimatedDramatically increase in the number of degrees of freedomSolution computational cost does not scale linearly with problem sizeunless efficient acceleration techniques are found

We are facing a large-scale weighted nonlinear least-squares problem

8 / 19

La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension

Page 14: La puissance de calcul et les techniques de prévision pour ...bonnaillie/Modelisation2016/GRATTON_Serge.pdfLa puissance de calcul et les techniques de pr evision pour les Geosciences

Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition

Parallelism in time : saddle Point Approach

Let us consider weak-constraint 4D-Var as a constrained problem :

min(δp,δw)

1

2‖δp− b‖2

D−1 +1

2‖δw − d‖2

R−1

subject to δp = Lδx and δw = Hδx

We can write the Lagrangian function for this problem as

L(δw, δp,λ,µ) =1

2‖δp− b‖2

D−1 +1

2‖δw − d‖2

R−1

+ λT (δp− Lδx) + µT (δw −Hδx)

The stationary point of L satisfies the following equations :

D−1(Lδx− b) + λ = 0 (1)

R−1(Hδx− d) + µ = 0 (2)

LTλ + HTµ = 0 (3)

9 / 19

La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension

Page 15: La puissance de calcul et les techniques de prévision pour ...bonnaillie/Modelisation2016/GRATTON_Serge.pdfLa puissance de calcul et les techniques de pr evision pour les Geosciences

Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition

Preconditioning Saddle Point Formulation of4D-Var

A =

D 0 L0 R H

LT HT 0

=

(A BT

B 0

)

B is the most computationally expensive block and calculations involvingA are relatively cheap

The inexact constraint preconditioner proposed by (Bergamaschi et. al.2005) is promising for our application. The preconditioner can be chosenas :

P =

(A BT

B 0

)=

D 0 L0 R 0

LT 0 0

,

where

L is an approximation to the matrix LB = [LT 0] is a full row rank approximation of the matrix B ∈ Rn×(m+n)

10 / 19

La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension

Page 16: La puissance de calcul et les techniques de prévision pour ...bonnaillie/Modelisation2016/GRATTON_Serge.pdfLa puissance de calcul et les techniques de pr evision pour les Geosciences

Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition

Numerical Results with qg-model

0 100 200 300 400 500 600Sequential sub-window integrations

102

103

104

105

Nonlin

ear

Cost

Funct

ion v

alu

e

Forcing

Precond_Ck

Precond_Fk

Figure – Nonlinear cost function values along sequential subwindow integrations

→ At each iteration the forcing formulation requires one application of L−1,followed by one application of L−T (16 sequential subwindow integrations)

11 / 19

La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension

Page 17: La puissance de calcul et les techniques de prévision pour ...bonnaillie/Modelisation2016/GRATTON_Serge.pdfLa puissance de calcul et les techniques de pr evision pour les Geosciences

Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition

Multilevel algorithm in the dual space

Motivations

”Huge” amount of data (even if the system is under sampled).

B Assimilation computationally expensive.

Heterogeneous spatial distribution of the observation.

B Numerous observations in some areas VS few observations in some others.

Do we need to assimilate all the observations to reach a target accuracy ?

Fine and coarse subproblems

The fine observation grid data assimilation problem :

minδxf ∈Rn

1

2‖x + δxf − xb‖2

B−1 +1

2‖Hf δxf − df ‖2

R−1f

(4)

B (δxf , λf ) s.t.

{(R−1

f Hf BHTf + Imf )λf = R−1

f (df − Hf (xb − x))δxf = xb − x + BHT

f λf

12 / 19

La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension

Page 18: La puissance de calcul et les techniques de prévision pour ...bonnaillie/Modelisation2016/GRATTON_Serge.pdfLa puissance de calcul et les techniques de pr evision pour les Geosciences

Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition

An example of observation sets

Coarse observation set Oc Auxiliary observation set Of

Fine observation set Of13 / 19

La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension

Page 19: La puissance de calcul et les techniques de prévision pour ...bonnaillie/Modelisation2016/GRATTON_Serge.pdfLa puissance de calcul et les techniques de pr evision pour les Geosciences

Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition

Example : the Lorenz-96 system

Configuration of the experiment

Model

B u is a vector of N-equally spaced entriesaround a circle of constant latitude.

B Chaotic behavior for F > 5 and N > 11.

∀j ∈ NN , θ ∈ NΘ,duj+θdt

=

1

κ(−uj+θ−2uj+θ−1 + uj+θ−1uj+θ+1 − uj+θ + F )

uN = u0; u−1 = uN−1; uN+1 = u1

with N = 40, F = 8, κ = 120 and Θ = 10,T = 120 and ∆t = 1

80

Background and observations

B Normal distributed additive noise :N (0, σ2

b/o) with σb = 0.2, σo = 0.1.

B B = σ2bIn and R = σ2

o Ip .

Coordinate system

Dynamical system (space and time)

14 / 19

La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension

Page 20: La puissance de calcul et les techniques de prévision pour ...bonnaillie/Modelisation2016/GRATTON_Serge.pdfLa puissance de calcul et les techniques de pr evision pour les Geosciences

Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition

Example : Cost function and RMS errorC

ost

f.V

Sn

bo

bs

RM

Ser

ror

VS

tim

e

15 / 19

La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension

Page 21: La puissance de calcul et les techniques de prévision pour ...bonnaillie/Modelisation2016/GRATTON_Serge.pdfLa puissance de calcul et les techniques de pr evision pour les Geosciences

Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition

Hybrid approach : Envar

An ensemble method enables to compute a subspace that contains thedirection that explain most of the variability of the system

The variational problem is solved on the linear space spanned byensembles. Not very different from model reduction techniques : needsubspace selection

No need to compute derivatives

Embarrassingly parallel !

Shallow water equations∂u∂t

+ u ∂u∂x

+ v ∂u∂y

− fv + g ∂z∂x

= ν∆u

∂v∂t

+ v ∂v∂x

+ v ∂v∂y

+ fu + g ∂z∂y

= ν∆v

∂z∂t

+ ∂uz∂x

+ ∂vz∂y

= ν∆z

16 / 19

La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension

Page 22: La puissance de calcul et les techniques de prévision pour ...bonnaillie/Modelisation2016/GRATTON_Serge.pdfLa puissance de calcul et les techniques de pr evision pour les Geosciences

Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition

Uncertainty quantification

Ask more to the prediction system :

Want to estimate errors in initial conditions, in models, in boundaryconditions and their impact

Associate a measure of fidelity/probability to the forecast

Run an ensemble of predictions.Several proposal by the community(Ensembles of 4D-Var or 4D-Envar) :

A set of model trajectories is computed

A filter is run on a subspace span by the ensembles using the observationand model likelihoods, in a small dimensional space

Raises new opportunity in terms of error spread analysis

17 / 19

La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension

Page 23: La puissance de calcul et les techniques de prévision pour ...bonnaillie/Modelisation2016/GRATTON_Serge.pdfLa puissance de calcul et les techniques de pr evision pour les Geosciences

Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition

Ensemble/particule filters

Good news :

are based on a sophisticate sampling of the probability densities involvedin the inverse problem

are naturally embarrassingly parallel.

are based on a derivative free variants of the Kalman filter. Do not need asophisticated tangent or adjoint code

Not so good news :

suffer from the curse of dimensionality

need techniques to alleviate the sampling error : localization, inflation,colapse of weights

are based on a derivative free variant of the Kalman filter. Do not need asophisticated tangent or adjoint code

Raise a number of important issues and stimulate new ideas : criticalsets, quasi-static approaches

18 / 19

La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension

Page 24: La puissance de calcul et les techniques de prévision pour ...bonnaillie/Modelisation2016/GRATTON_Serge.pdfLa puissance de calcul et les techniques de pr evision pour les Geosciences

Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition

Conclusion/Remarks

New applications, like uncertainty quantification call for a new generationof algorithms

Most applications are really large scale and use of parallel computers ismandatory. Issues like performance evaluation, fault tolerance, energyconsumption should be considered at first place

Completly new (at least for the domain) algorithms have to be designedto face these challenges. parallel in time, multigrid, saddle points,uncertainty quantification, global optimization

As usual, available computers also favour particular algorithms accordingto their hardware particularities

Exploiting the partially separable nature in ensemble methods may beuseful

Hybridization of particule filters and variational approaches

19 / 19

La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension