La puissance de calcul et les techniques de prévision pour...
Transcript of La puissance de calcul et les techniques de prévision pour...
Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition
La puissance de calcul et les techniques deprevision pour les Geosciences en grande
dimensionSupport from RTRA-STAE network
Serge Gratton,with E. Simon, Ph.L. Toint, L.N. Vicente, Z. Zhang
S. Gurol, P. Laloyau, M. Pontaud, A. Weaver
December, 2016
1 / 19
La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension
Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition
Outline
1 Generalities
2 Some perspectives in meteorology : a general view
3 Enhancing parallel performance by decoupling in time
4 Using hierachy of observations
5 Subspace reduction
1 / 19
La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension
Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition
The art of forecasting...
A (dynamical) system is characterized by state variables, e.g.
velocity components
pressure
density
temperature
gravitational potential
Goal : predict or understand the system behaviour at a future time from
dynamical integration model
observational data
Applications : climate, meteorology, oceanography, neutronics, finance, ...−→ forecasting problems
2 / 19
La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension
Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition
The dynamical integration model predicts the state of the system giventhe state at an earlier time.−→ integrating may lead to very large prediction errors−→ (inexact physics, discretization errors, approximated parameters)
Observational data are used to improve accuracy of the forecasts.−→ but the data are inaccurate (measurement noise, under-sampling−→ In atmosphere/ocean science millions of observations and variables
are processed every day : large scale inverse problem
3 / 19
La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension
Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition
The dynamical integration model predicts the state of the system giventhe state at an earlier time.−→ integrating may lead to very large prediction errors−→ (inexact physics, discretization errors, approximated parameters)
Observational data are used to improve accuracy of the forecasts.−→ but the data are inaccurate (measurement noise, under-sampling−→ In atmosphere/ocean science millions of observations and variables
are processed every day : large scale inverse problem
3 / 19
La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension
Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition
Data assimilation chart
−2 −1 0 1−1
0
1
starting from a priori knowledge on the state, the forward model is run :expensive in computer time, not always very parallel
data are are processed : screened, agglomerated
adjustment with of model with respect to observations : best value to befound by some form of optimization
predictions are then issued
4 / 19
La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension
Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition
Data assimilation chart
−2 −1 0 1−1
0
1
starting from a priori knowledge on the state, the forward model is run :expensive in computer time, not always very parallel
data are are processed : screened, agglomerated
adjustment with of model with respect to observations : best value to befound by some form of optimization
predictions are then issued
4 / 19
La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension
Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition
Data assimilation chart
−2 −1 0 1−1
0
1
starting from a priori knowledge on the state, the forward model is run :expensive in computer time, not always very parallel
data are are processed : screened, agglomerated
adjustment with of model with respect to observations : best value to befound by some form of optimization
predictions are then issued
4 / 19
La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension
Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition
Data assimilation chart
−2 −1 0 1−1
0
1
starting from a priori knowledge on the state, the forward model is run :expensive in computer time, not always very parallel
data are are processed : screened, agglomerated
adjustment with of model with respect to observations : best value to befound by some form of optimization
predictions are then issued
4 / 19
La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension
One possibility is to approximately solve a large-scale non-linear weightedleast-squares problem :
minx∈Rn
1
2‖x − xb‖2
B−1 +1
2
N∑j=0
∥∥Hj
(Mj(x)
)− yj
∥∥2
R−1j
where
x ≡ x(t0) is the control variable in Rn, n ∼ 106.
Mj are model operators : x(tj) =Mj(x(t0))
Hj are observation operators : yj ≈ Hj(x(tj))
the obervations yj and the background xb are noisy
B and Rj are covariance matrices
Model integration and minimization are crucial mathematical steps tooptimize to achieve good performance
Intricate combination of physics, applied mathematics and computerscience
Steered by application challenges
Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition
Improving the forecasting capabilities of thesystem : case of Meteorology
The main goal is to obtain fast, accurate algorithms that yields a goodrepresentation of reality.Raises challenges :
Physical modelling. Use finer grids, incorporate as many observations aspossible and constraints
Statistical modelling. Try to learn from earlier forecast exercises. Estimaterather than impose errors on a priori knowledge and dynamical model
Algorithmic. Solve large-scale systems, large-scale optimization problems
Machine. Fast machines based on sustainable models in terms of cost,consumption, with balanced performance
6 / 19
La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension
Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition
Some challenges on the model side
At Meteo-France, essentially 2 models : global and limited area/regional.A ratio of 3-5 between resolution of both.
Want to increase resolution of limited area to :
take into account topography, soil moisture for fog forecastIncrease resolution also needed for cevenol events
Want to add new variables to temperature, pressure, etc. For instance dustconcentration, aerosols. Want to add also chemistry with new applicationsexpected in air quality. This would results in an improved physics
All these improvements ask for ever larger computer ressources.
Huge socio-economical impact
7 / 19
La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension
Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition
Challenges on the optimization side (I) :weak-constrained variational approach
minx∈Rn
1
2‖x0 − xb‖2
B−1 +1
2
N∑j=0
∥∥Hj
(xj)− yj
∥∥2
R−1j
+1
2
N∑j=1
‖xj −Mj(xj−1)︸ ︷︷ ︸qj
−q‖2
Q−1j
x =
x0
x1
...xN
∈ Rn is the control variable (with xj = x(tj))
Notation same as before but :Matrices Qj are the covariance matrices for model error. They should beestimatedDramatically increase in the number of degrees of freedomSolution computational cost does not scale linearly with problem sizeunless efficient acceleration techniques are found
We are facing a large-scale weighted nonlinear least-squares problem
8 / 19
La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension
Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition
Parallelism in time : saddle Point Approach
Let us consider weak-constraint 4D-Var as a constrained problem :
min(δp,δw)
1
2‖δp− b‖2
D−1 +1
2‖δw − d‖2
R−1
subject to δp = Lδx and δw = Hδx
We can write the Lagrangian function for this problem as
L(δw, δp,λ,µ) =1
2‖δp− b‖2
D−1 +1
2‖δw − d‖2
R−1
+ λT (δp− Lδx) + µT (δw −Hδx)
The stationary point of L satisfies the following equations :
D−1(Lδx− b) + λ = 0 (1)
R−1(Hδx− d) + µ = 0 (2)
LTλ + HTµ = 0 (3)
9 / 19
La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension
Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition
Preconditioning Saddle Point Formulation of4D-Var
A =
D 0 L0 R H
LT HT 0
=
(A BT
B 0
)
B is the most computationally expensive block and calculations involvingA are relatively cheap
The inexact constraint preconditioner proposed by (Bergamaschi et. al.2005) is promising for our application. The preconditioner can be chosenas :
P =
(A BT
B 0
)=
D 0 L0 R 0
LT 0 0
,
where
L is an approximation to the matrix LB = [LT 0] is a full row rank approximation of the matrix B ∈ Rn×(m+n)
10 / 19
La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension
Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition
Numerical Results with qg-model
0 100 200 300 400 500 600Sequential sub-window integrations
102
103
104
105
Nonlin
ear
Cost
Funct
ion v
alu
e
Forcing
Precond_Ck
Precond_Fk
Figure – Nonlinear cost function values along sequential subwindow integrations
→ At each iteration the forcing formulation requires one application of L−1,followed by one application of L−T (16 sequential subwindow integrations)
11 / 19
La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension
Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition
Multilevel algorithm in the dual space
Motivations
”Huge” amount of data (even if the system is under sampled).
B Assimilation computationally expensive.
Heterogeneous spatial distribution of the observation.
B Numerous observations in some areas VS few observations in some others.
Do we need to assimilate all the observations to reach a target accuracy ?
Fine and coarse subproblems
The fine observation grid data assimilation problem :
minδxf ∈Rn
1
2‖x + δxf − xb‖2
B−1 +1
2‖Hf δxf − df ‖2
R−1f
(4)
B (δxf , λf ) s.t.
{(R−1
f Hf BHTf + Imf )λf = R−1
f (df − Hf (xb − x))δxf = xb − x + BHT
f λf
12 / 19
La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension
Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition
An example of observation sets
Coarse observation set Oc Auxiliary observation set Of
Fine observation set Of13 / 19
La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension
Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition
Example : the Lorenz-96 system
Configuration of the experiment
Model
B u is a vector of N-equally spaced entriesaround a circle of constant latitude.
B Chaotic behavior for F > 5 and N > 11.
∀j ∈ NN , θ ∈ NΘ,duj+θdt
=
1
κ(−uj+θ−2uj+θ−1 + uj+θ−1uj+θ+1 − uj+θ + F )
uN = u0; u−1 = uN−1; uN+1 = u1
with N = 40, F = 8, κ = 120 and Θ = 10,T = 120 and ∆t = 1
80
Background and observations
B Normal distributed additive noise :N (0, σ2
b/o) with σb = 0.2, σo = 0.1.
B B = σ2bIn and R = σ2
o Ip .
Coordinate system
Dynamical system (space and time)
14 / 19
La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension
Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition
Example : Cost function and RMS errorC
ost
f.V
Sn
bo
bs
RM
Ser
ror
VS
tim
e
15 / 19
La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension
Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition
Hybrid approach : Envar
An ensemble method enables to compute a subspace that contains thedirection that explain most of the variability of the system
The variational problem is solved on the linear space spanned byensembles. Not very different from model reduction techniques : needsubspace selection
No need to compute derivatives
Embarrassingly parallel !
Shallow water equations∂u∂t
+ u ∂u∂x
+ v ∂u∂y
− fv + g ∂z∂x
= ν∆u
∂v∂t
+ v ∂v∂x
+ v ∂v∂y
+ fu + g ∂z∂y
= ν∆v
∂z∂t
+ ∂uz∂x
+ ∂vz∂y
= ν∆z
16 / 19
La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension
Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition
Uncertainty quantification
Ask more to the prediction system :
Want to estimate errors in initial conditions, in models, in boundaryconditions and their impact
Associate a measure of fidelity/probability to the forecast
Run an ensemble of predictions.Several proposal by the community(Ensembles of 4D-Var or 4D-Envar) :
A set of model trajectories is computed
A filter is run on a subspace span by the ensembles using the observationand model likelihoods, in a small dimensional space
Raises new opportunity in terms of error spread analysis
17 / 19
La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension
Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition
Ensemble/particule filters
Good news :
are based on a sophisticate sampling of the probability densities involvedin the inverse problem
are naturally embarrassingly parallel.
are based on a derivative free variants of the Kalman filter. Do not need asophisticated tangent or adjoint code
Not so good news :
suffer from the curse of dimensionality
need techniques to alleviate the sampling error : localization, inflation,colapse of weights
are based on a derivative free variant of the Kalman filter. Do not need asophisticated tangent or adjoint code
Raise a number of important issues and stimulate new ideas : criticalsets, quasi-static approaches
18 / 19
La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension
Generalities Meteorology Time parallel Multi-level in observation Subspace decomposition
Conclusion/Remarks
New applications, like uncertainty quantification call for a new generationof algorithms
Most applications are really large scale and use of parallel computers ismandatory. Issues like performance evaluation, fault tolerance, energyconsumption should be considered at first place
Completly new (at least for the domain) algorithms have to be designedto face these challenges. parallel in time, multigrid, saddle points,uncertainty quantification, global optimization
As usual, available computers also favour particular algorithms accordingto their hardware particularities
Exploiting the partially separable nature in ensemble methods may beuseful
Hybridization of particule filters and variational approaches
19 / 19
La puissance de calcul et les techniques de prevision pour les Geosciences en grande dimension