Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.
Recent Algorithmic (and Practical) Developments in ML
Chris Siefert
Ray Tuminaro, Jonathan Hu, Pavel Bochev, Erik Boman and Karen Devine
Sandia National Laboratories
SAND # 2008-2180C
Outline
Introduction to ML.
Solving Maxwell’s Equations w/ RefMaxwell.
Repartitioning w/ Zoltan and Hypergraphs.
Conclusions & Future Work.
Recent Algorithmic (and Practical) Developments in ML – p.2/28
Trilinos Summary
Discretizations: Rythmos, Intrepid, phdMesh
Methods: Moertel, Sacado
Core: Thyra, Isorropia, RTOp, ForTrilinos, Zoltan, Teuchos, Epetra, EpetraExt
Solvers: ML, AztecOO, Ifpack, Amesos, NOX, Anasazi, Belos, Meros, LOCA, Moocho
ML Features (1)
ML provides scalable multilevel/multigrid preconditioners.
Method types
  Smoothed Aggregation (SA) - symmetric or nearly symmetric problems.
  Non-symmetric SA - non-symmetric problems.
  MatrixFree - matrix-free SA.
  DD / DD-ML - domain decomposition.
  Maxwell - Maxwell’s equations.
  RefMaxwell - new method for Maxwell’s equations.
ML Features (2)
Simple Trilinos interface.
Teuchos::ParameterList driven options.
  Has sensible defaults (override what you don’t like).
Parameter validation for accuracy.
MATLAB interface for some features (MLMEX).
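Using ML
#include "AztecOO.h"
#include "Epetra_LinearProblem.h"
#include "Teuchos_ParameterList.hpp"
#include "ml_MultiLevelPreconditioner.h"

// Start with a problem & build solver
// (A is an Epetra_RowMatrix; LHS and RHS are Epetra_Vectors)
Epetra_LinearProblem Problem(&A, &LHS, &RHS);
AztecOO solver(Problem);
// Override any defaults
Teuchos::ParameterList List;
List.set("smoother: sweeps", 2);
// Build the preconditioner
ML_Epetra::MultiLevelPreconditioner Prec(A, List);
solver.SetPrecOperator(&Prec);  // AztecOO expects a pointer to the operator
solver.Iterate(100, 1e-12);     // Solve: at most 100 iterations, tolerance 1e-12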
Outline
Introduction to ML.
Solving Maxwell’s Equations w/ RefMaxwell.
Repartitioning w/ Zoltan and Hypergraphs.
Conclusions & Future Work.
Target Applications
Electromagnetic phenomena modeled by Maxwell’s equations occur in many Sandia applications.
HEDP: Wire arrays and liners for Z machine simulations.
Magnetic Launch: Coil & rail guns (ONR).
Code: ALEGRA & Trilinos/ML (SNL).
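Maxwell’s Equations
\[
\begin{aligned}
\nabla \times \frac{1}{\mu} \nabla \times E + \sigma E &= 0 && \text{in } \Omega \\
n \times E &= 0 && \text{on } \Gamma \\
n \times \frac{1}{\mu} \nabla \times E &= 0 && \text{on } \Gamma^*
\end{aligned}
\]
∇×∇φ = 0 ⇒ large null space complicates discretization + solver.
Large jumps in σ.
Significant mesh stretching.
Large problems & repeated solves → Scalable linear solvers are critical.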
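To make the null-space remark concrete, here is a short calculation that uses only the identity ∇×∇φ = 0 quoted above. The symbol φ stands for an arbitrary smooth scalar potential; it is introduced here for illustration. Applying the operator to a pure gradient field gives

\[
\nabla \times \frac{1}{\mu} \nabla \times (\nabla \phi) + \sigma \nabla \phi = \sigma \nabla \phi .
\]

The curl-curl term contributes nothing on gradient fields, so only the σ term acts there. Where σ is small relative to 1/µ, the operator is close to singular on that large subspace, which is why a solver has to treat gradients separately.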
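Continuous/Discrete Relationship
[Figure: complex relating the continuous spaces H(Grad), H(Curl), H(Div), L2 to node, edge, face and element unknowns. Labels in the diagram: continuous operators ∇, ∇×, ∇· and −∇·, ∇×, −∇; discrete operators D0, D1, D2 and D0*, D1*, D2*; mass matrices M0, M1, M2, M3; material parameters γ, λ, σ, µ^{-1}; Null(Curl) and Null(Div).]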
Hodge Laplacian
[Figure: the Continuous/Discrete Relationship diagram, repeated.]
∆ = ∇×∇× + ∇∇·
\[ L_1 = D_1^* D_1 + D_0 D_0^* \]
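Discrete Hodge Decomposition
\[ (M_1 D_1^* D_1 + M_1)\, e = b \]
Consider $e = a + D_0 p$, where $D_0^* a = 0$.
This gives us the block 2 × 2 system
\[
\begin{bmatrix}
M_1 D_1^* D_1 + M_1 + M_1 D_0 D_0^* & M_1 D_0 \\
M_0 D_0^* & M_0 D_0^* D_0
\end{bmatrix}
\begin{bmatrix} a \\ p \end{bmatrix}
=
\begin{bmatrix} b \\ D_0^T b \end{bmatrix} .
\]
We can add $D_0 D_0^*$ w/o changing answer!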
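The block system can be checked row by row. The sketch below rests on three assumptions that the slides do not state explicitly: the discrete analogue of ∇×∇φ = 0, namely $D_1 D_0 = 0$, and the mass-weighted adjoints $D_0^* = M_0^{-1} D_0^T M_1$ and $D_1^* = M_1^{-1} D_1^T M_2$.

\[
\begin{aligned}
\text{Row 1:}\quad & (M_1 D_1^* D_1 + M_1)(a + D_0 p) = b
  \;\Rightarrow\; (M_1 D_1^* D_1 + M_1)\, a + M_1 D_0\, p = b
  && \text{(using } D_1 D_0 = 0\text{)} \\
& M_1 D_0 D_0^* a = 0
  \;\Rightarrow\; (M_1 D_1^* D_1 + M_1 + M_1 D_0 D_0^*)\, a + M_1 D_0\, p = b
  && \text{(using } D_0^* a = 0\text{)} \\
\text{Row 2:}\quad & D_0^T (M_1 D_1^* D_1 + M_1)(a + D_0 p) = D_0^T b \\
& D_0^T M_1 D_1^* D_1 = (D_1 D_0)^T M_2 D_1 = 0, \qquad D_0^T M_1 = M_0 D_0^* \\
& \Rightarrow\; M_0 D_0^* a + M_0 D_0^* D_0\, p = D_0^T b
\end{aligned}
\]

Under these assumptions both rows match the system above, and the extra term $M_1 D_0 D_0^*$ in the (1,1) block multiplies a vector that is zero by the gauge condition, so the solution is unchanged.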
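Preconditioning
\[
\begin{bmatrix}
M_1 D_1^* D_1 + M_1 D_0 D_0^* + M_1 & M_1 D_0 \\
M_0 D_0^* & M_0 D_0^* D_0
\end{bmatrix}
\begin{bmatrix} a \\ p \end{bmatrix}
=
\begin{bmatrix} b \\ D_0^T b \end{bmatrix} .
\]
Use preconditioner:
\[
P^{-1} =
\begin{bmatrix} I & D_0 \end{bmatrix}
\begin{bmatrix} \mathrm{AMG}_{11} & \\ & \mathrm{AMG}_{22} \end{bmatrix}
\begin{bmatrix} I \\ D_0^T \end{bmatrix}
\]
This preconditioner is implemented in ml/src/RefMaxwell in Trilinos 8.0.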
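Multiplying out the three factors gives z = AMG11(r) + D0 AMG22(D0^T r): one approximate solve on the edge block, plus one on the nodal block reached through D0. The following is a minimal sketch of that application only, not the code in ml/src/RefMaxwell. The names `Vec`, `Mat`, `Op`, `multiply`, `multiplyTranspose` and `applyRefMaxwellSketch` are hypothetical, matrices are dense for brevity, and each AMG cycle is represented by an arbitrary callable.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;                // dense, row-major; illustration only
using Op  = std::function<Vec(const Vec&)>;  // stand-in for one AMG cycle

// y = A x
Vec multiply(const Mat& A, const Vec& x) {
  Vec y(A.size(), 0.0);
  for (std::size_t i = 0; i < A.size(); ++i) {
    assert(A[i].size() == x.size());
    for (std::size_t j = 0; j < x.size(); ++j) y[i] += A[i][j] * x[j];
  }
  return y;
}

// y = A^T x
Vec multiplyTranspose(const Mat& A, const Vec& x) {
  assert(A.size() == x.size());
  const std::size_t n = A.empty() ? 0 : A[0].size();
  Vec y(n, 0.0);
  for (std::size_t i = 0; i < A.size(); ++i) {
    assert(A[i].size() == n);
    for (std::size_t j = 0; j < n; ++j) y[j] += A[i][j] * x[i];
  }
  return y;
}

// z = [I D0] * diag(AMG11, AMG22) * [I; D0^T] * r
//   = AMG11(r) + D0 * AMG22(D0^T r)
// D0 maps nodal vectors to edge vectors (rows = edges, columns = nodes).
Vec applyRefMaxwellSketch(const Mat& D0, const Op& amg11, const Op& amg22,
                          const Vec& r) {
  Vec z = amg11(r);                                 // edge-block correction
  const Vec p = amg22(multiplyTranspose(D0, r));    // nodal-block correction
  const Vec d0p = multiply(D0, p);                  // map back to edges
  assert(z.size() == d0p.size());
  for (std::size_t i = 0; i < z.size(); ++i) z[i] += d0p[i];
  return z;
}
```

In practice the two callables would be multigrid cycles on the (1,1) and (2,2) blocks; passing identity functions is a quick way to check the algebra on a small gradient matrix.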
3D Weak Scaling
Problem Code: ALEGRA (SNL).
Problems: LinerF, Sphere.
Material Parameters: 1e6 jump in conductivity.
Geometry: Regular meshes.
Compare Maxwell vs. RefMaxwell
Scaling: Old vs. New
[Figure: four panels for the LinerF and Sphere problems. Two panels plot field solve time against number of processors and two plot iterations against number of processors, each on a logarithmic processor axis from 10^0 to 10^3. Every panel compares Old Solver and New Solver.]
Jumbo Scaling: New
[Figure: four panels for the LinerF and Sphere problems. Two panels plot field solve time against number of processors and two plot iterations against number of processors, with the processor axis extending to the 10^4 range. Only New Solver is shown.]
Outline
Introduction to ML.
Solving Maxwell’s Equations w/ RefMaxwell.
Repartitioning w/ Zoltan and Hypergraphs.
Conclusions & Future Work.
Why Repartitioning?
[Figure: two regimes, labeled Computation Dominated and Communication Dominated.]
Coarse grids ⇒ less work per proc ⇒ poor performance.
Solution: Move data to leave some procs idle.
Why Repartitioning?
Goals
  Move to a subset of processors ⇒ Increase computation to communication ratio.
  Rebalance load.
Method
  ML currently supports RCB via Zoltan.
  What about irregular meshes?
  What about consistency between partitions at different levels?
  New Feature: Hypergraph Repartitioning. ⇒ To be released in Trilinos 9.0.
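The first goal, moving a coarse level onto fewer processors, can be illustrated with a simple sizing rule. This is a sketch under stated assumptions and not ML's actual heuristic: the function `activeProcessors` and its threshold `minRowsPerProc` are hypothetical names chosen for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical rule: keep only as many processors active as can each
// receive at least minRowsPerProc rows, but never fewer than one and
// never more than are available.
std::size_t activeProcessors(std::size_t globalRows,
                             std::size_t totalProcs,
                             std::size_t minRowsPerProc) {
  assert(totalProcs > 0);
  assert(minRowsPerProc > 0);
  const std::size_t wanted = globalRows / minRowsPerProc;
  return std::max<std::size_t>(1, std::min(wanted, totalProcs));
}
```

With a threshold of 512 rows, a fine level with a million rows keeps all 64 processors busy, while a 4096-row coarse level is gathered onto 8 of them, so each active processor does more arithmetic per message it sends.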
What is a Hypergraph?
[Figure: example with four vertices, numbered 1 to 4, and a sparse pattern of twelve nonzeros marked X.]
Why Hypergraphs?
For (compressed row) matrix:
  Vertices == rows.
  Hyperedges == columns.
  Weights == Communication volume for that edge ⇒ w_e · (# processors in edge − 1).
Why Hypergraphs?
  Models structurally non-symmetric systems.
  Models communication costs, esp. important for non-homogeneous meshes.
  Minimizes combined cost of application communication and data migration.
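The weight formula above can be evaluated directly from a compressed-row sparsity pattern and a row-to-processor assignment. The sketch below does only that bookkeeping and is not Zoltan's partitioner. The struct `CrsPattern`, the function `communicationVolume`, and the arguments `rowOwner` and `edgeWeight` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Sparsity pattern in compressed-row form: row i holds the column indices
// colIdx[rowPtr[i]] .. colIdx[rowPtr[i + 1] - 1].
struct CrsPattern {
  std::size_t numCols;
  std::vector<std::size_t> rowPtr;
  std::vector<std::size_t> colIdx;
};

// Sum over hyperedges (columns) of w_e * (number of processors in edge - 1),
// where a processor is "in" an edge if it owns a row with a nonzero in
// that column.
double communicationVolume(const CrsPattern& A,
                           const std::vector<int>& rowOwner,
                           const std::vector<double>& edgeWeight) {
  assert(!A.rowPtr.empty());
  const std::size_t numRows = A.rowPtr.size() - 1;
  assert(rowOwner.size() == numRows);
  assert(edgeWeight.size() == A.numCols);

  // Collect the distinct owners touching each column.
  std::vector<std::set<int>> procsInEdge(A.numCols);
  for (std::size_t row = 0; row < numRows; ++row) {
    for (std::size_t k = A.rowPtr[row]; k < A.rowPtr[row + 1]; ++k) {
      assert(A.colIdx[k] < A.numCols);
      procsInEdge[A.colIdx[k]].insert(rowOwner[row]);
    }
  }

  double volume = 0.0;
  for (std::size_t e = 0; e < A.numCols; ++e) {
    if (procsInEdge[e].empty()) continue;  // empty column: nothing to send
    volume += edgeWeight[e] * static_cast<double>(procsInEdge[e].size() - 1);
  }
  return volume;
}
```

For a 4 × 4 tridiagonal pattern with unit weights, assigning rows {0, 1} and {2, 3} to two processors cuts two columns, while an interleaved assignment cuts all four. That difference is what a hypergraph partitioner tries to minimize.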
Zoltan at Work
680k rows, 2.3M nonzeros
Outline
Introduction to ML.
Solving Maxwell’s Equations w/ RefMaxwell.
Repartitioning w/ Zoltan and Hypergraphs.
Conclusions & Future Work.
Conclusions
RefMaxwell
  Handles jumpy coefficients well.
  Utilizes existing technology.
  Scalability up to 24.5k procs!
Zoltan & Hypergraph Repartitioning
  Accurately models application and data migration costs.
  Good results on test problems.
  Future: ML’s unstructured mesh applications.
Future Directions
TOPS-II: Interface w/ PETSc.
Extreme Scalability.
Adaptive methods for circuit problems.
Improved multimaterial algorithms.