ML: Multilevel Preconditioning Package Trilinos User’s Group Meeting Wednesday, October 15, 2003...
-
Upload
norman-wheeler -
Category
Documents
-
view
217 -
download
2
Transcript of ML: Multilevel Preconditioning Package Trilinos User’s Group Meeting Wednesday, October 15, 2003...
ML: Multilevel Preconditioning Package
Trilinos User’s Group MeetingTrilinos User’s Group Meeting
Wednesday, October 15, 2003
Jonathan Hu
Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company,for the United States Department of Energy under contract DE-AC04-94AL85000.
Outline
• What is ML?• Basic multigrid concepts• Configuring and building ML• Interoperability with other packages• User options available within ML• Example: Using ML as a preconditioner• What to do if something goes wrong• Sandia apps that use ML• Future Plans• Documentation, mailing lists, and getting help
ML Package
• C package that provides multigrid preconditioning for linear solver methods
• Developers: Ray Tuminaro, Jonathan Hu, Charles Tong (LLNL)
• Main methods– Geometric
• Grid refinement hierarchy• 2-level FE basis function domain decomposition
– AMG (algebraic)• Smoothed aggregation• Edge-element AMG for Maxwell’s equations• Classical AMG
Geometric Multigrid
A4 u4 = f4
Solve
Approximate PDE on (user supplied) grid hierarchy
A3 u3 = f3
A2 u2 = f2
A1 u1 = f1
Develop grid transfer operators: restriction Rk and interpolation Ik.
R3I3
R2 I2
R1 I1(S1)
(S4)
Develop smoothers Sk (approximate solve on a level) Jacobi, Gauss-Seidel, CG, etc.
(S3)
(S2)
Use coarse solves (k<4) to accelerate convergence for A4
Algebraic Multigrid (AMG)
• Build MG operators (Ak, Ik, Rk, Sk’s) automatically to define a hierarchy:
Akuk=fk, k=1,…,L
• Main difference: In AMG, interpolation operators Ik’s built automatically– also main difficulty
• Once Ik’s are defined, the rest follows “easily”:
Same Goal: Solve ALuL=fL.
Rk = IkT (usually)
Ak-1 = Rk-1 Ak Ik-1 (triple matrix product)
Smoother (iterative method) Sk
+ Gauss-Seidel, polynomial, conjugate gradient, etc.
MG(f, u, k){
if (k == 1) u1 = (A1)-1f1
else {
Sk(Ak, fk,uk) //pre-smooth
rk = fk – Ak uk
fk-1 = Rk-1 rk; uk-1 = 0 //restrict
MG(fk-1, uk-1, k-1)
uk = uk + Ik-1 uk-1 //interpolate & correct
Sk(Ak, fk,uk) //post-smooth }
}
Ak, Rk, Ik, Sk required on all levels
Sk: smoothers
Rk: restriction operators
Ik: interpolation operators
Recursive MG Algorithm (V Cycle)
V Cycle to solve A4u4=f4
k=4
k=1
W Cycle FMV Cycle
• Build tentative tIk to interpolate null space
where tIk(i,j) =
Algebraic Construction of Ik: Coefficients
Finding Ik
Smoothed aggregation
+ Improves tIk with Jacobi’s method: Ik = (I - diag(Ak) -1 Ak) tIk
+ Ik emphasizes what is not smoothed by Jacobi
{1 if ith point within jth aggregate
0 otherwise
5
5 5 2 2
2 7
7 72
ML Capabilities
• MG cycling: V, W, full V, full W
• Grid Transfers– Several automatic coarse grid generators– Several automatic grid transfer operators
• Smoothers– Jacobi, Gauss-Seidel, Hiptmair, LU, Aztec methods,
sparse approximate inverses, polynomial
• Kernels: matrix/matrix multiply, etc.
Configuring and Building ML
• ML builds by default when Trilinos is configured/built• Configure help
– See ml/README file, orml/configure --help
• Enabling direct solver support (default is off)configure --with-ml_superlu \--with-incdirs=“-I/usr/local/superlu/include” \--with-ldflags=“/usr/local/superlu/libsuperlu.a”
• Performance monitoring--enable-ml_timing--enable_ml_flops
• Example suite builds by defaultYour-build-location/packages/ml/examples
ML Interoperability with Trilinos Packages
ML
Epetra
Accepts user data as Epetra objects
Can be wrapped as Epetra_Operator
TSF
TSF interface exists
Othermatvecs
Othersolvers
Accepts other solversand MatVecs
AmesosAmesos interfacecoming very soon
Aztecoo
Meros
Via Epetra & TSF
Common Decisions for ML Users
• What smoother to use– # of smoothing steps
• Coarsening strategy• Cycling strategy
• Jacobi– Simplest, cheapest, usually least effective.
– Damping parameter () needed• Point Gauss-Seidel
– Equation satisfied one unknown at a time– Better than Jacobi– May need damping– Can be problematic in parallel
• processor-based (stale off-proc values)
ML Smoother Choices
)( )(1)()1( iii AxbDxx
ML Smoother Choices
• Block Gauss-Seidel– Satisfy several equations simultaneously by modifying several
DOFs (inverting subblock associated with DOFs).– Blocks can correspond to aggregates
• Aztec smoothers– Any Aztec preconditioner can be used as a smoother. – Probably most interesting is ILU & ILUT methods: may be more
robust than Gauss-Seidel.• Sparse LU• Hiptmair
– Specialized 2-stage smoother for Maxwell’s Eqns.• MLS
– Approximation to inverse based on Chebyshev polynomials of smoothed operator.
– Competitive with true Gauss-Seidel in serial.– Doesn’t degrade with # processors (unlike processor-based GS)
ML Cycling Choices
• V is default (usually works best).
• W more expensive, may be more robust.
• Full MG (V cycle) more expensive– Less conventional within preconditioners
These choices decide how frequently coarse grids are visited compared to fine grid.
ML Aggregation Choices
• MIS– Expensive, usually best with many processors
• Uncoupled (UC)– Cheap, usually works well.
• Hybrid = UC + MIS– Uncoupled on fine grids, MIS on coarser grids
A Simple Example: Trilinos/Packages/aztecoo/example/MLAztecOO
LinearSolver
ML: multi-grid pre-
cond.
AztecOO(Epetra)
ML
CG
AMG
Solution component
Example methods
Packagesused
Solve Ax=b:A: linear elasticity problemLinear solver: Conjugate GradientPrecond.: AMG smoothed aggregation
Simple Example
#include “ml_include.h” #include “ml_epetra_operator.h” #include “ml_epetra_utils.h” Epetra_CrsMatrix A; Epetra_Vector x,bb; ... //matrix, vectors loaded here // Construct Epetra Linear Problem Epetra_LinearProblem problem(&A, &x, &bb); // Construct a solver object AztecOO solver(problem); solver.SetAztecOption(AZ_solver, AZ_cg);
// Create and set an ML multilevel preconditioner int N_levels = 10; // max # of multigrid levels possible ML_Set_PrintLevel(3); // how much ML info is output to screen ML *ml_handle; ML_Create(&ml_handle,N_levels);
// Make linear operator A accessible to ML EpetraMatrix2MLMatrix(ml_handle, 0, &A);
Simple Example (contd.)
// Create multigrid hierarchy ML_Aggregate *agg_object; ML_Aggregate_Create(&agg_object); ML_Aggregate_Set_CoarsenScheme_Uncoupled(agg_object); N_levels = ML_Gen_MGHierarchy_UsingAggregation(ml_handle, 0, ML_INCREASING, agg_object); // Set symmetric Gauss-Seidel smoother for MG method int nits = 1; double dampingfactor = ML_DDEFAULT; ML_Gen_Smoother_SymGaussSeidel(ml_handle, ML_ALL_LEVELS, ML_BOTH, nits, dampingfactor); ML_Gen_Solver(ml_handle, ML_MGV, 0, N_levels-1);
// Set preconditioner within Epetra Epetra_ML_Operator MLop(ml_handle,comm,map,map); solver.SetPrecOperator(&MLop);
// Set some Aztec solver options and iterate solver.SetAztecParam(AZ_rthresh, 1.4); solver.SetAztecParam(AZ_athresh, 10.0); int Niters = 500; solver.Iterate(Niters, 1e-12);
Multigrid: Issues to be aware of
• Severely stretched grids or anisotropies• Loss of diagonal dominance• Atypical stencils• Jumps in material properties• Non-symmetric matrices• Boundary conditions• Systems of PDEs• Non-trivial null space (Maxwell’s equations,
elasticity)
Things to Try if Multigrid Isn’t Working
• Smoothers– Vary number of smoothing steps.– More `robust’ smoothers: block Gauss-Seidel, ILU, MLS
polynomial (especially if degradation in parallel).– Vary damping parameters (smaller is more conservative).
• Try different aggregation schemes• Try fewer levels• Try drop tolerances, if …
– high complexity (printed out by ML).– Severely stretched grids– Anisotropic problems– Variable regions
• Reduce prolongator damping parameter– Concerned about operators properties (e.g. highly non-
symmetric).
ALEGRA/NEVADA Zpinch Results
procs # Elements its
16 155,904 41
50 492, 912 42
128 1,259,976 47
CG preconditioned with V(1,1) AMG cycle
||r||2/||r0||2 < 10-8
2-stage Hiptmair smoother
(4th order polys in each stage)
Edge element interpolation
Flow and Transport Solution Algorithms for Chemical Attack in an Airport Terminal: 3D prototype
Preconditioner Problem Size
(# Unknowns)
# Newton Steps
# GMRES Its.
Total Time (sec.)
16 1-GHz P3 Procs.
64 procs
Cplant
512 procs Cplant
1 – Level Additive Schwarz: DD - ILU
118,560 6 213 57
872,000 7 826 1801
2 - Level Additive Schwarz:
Fine mesh - DD ILU
Coarse problem 17,360
unknowns – SuperLU
118,560 6 97 178
872,000 6 124 795
6,679,680 6 1396 -------- 5709
~50,000,000
Algorithm; Steady State: Newton – Krylov (GMRES)
Future Improvements
• Amesos interface– More direct solvers
• Self-correcting capabilities– Better analysis tools
– Improved coarse grids
– “On-the-fly” adjustments
• “Re-partitioning”
Getting Help
• See Trilinos/packages/ml/examples• See guide:
– ML User’s Guide, ver. 2.0 (in ml/doc)– ML User’s Guide, ver. 3.0 (under construction)
• Mailing lists– [email protected]– [email protected]
• Bug reporting, enhancement requests via bugzilla:– http://software.sandia.gov/bugzilla
• Email us directly– [email protected]– [email protected]