Trilinos 102: Advanced Concepts November 7, 2007 8:30-9:30 a.m. Mike Heroux Jim Willenbring.


Trilinos 102: Advanced Concepts

November 7, 2007

8:30-9:30 a.m.

Mike Heroux, Jim Willenbring


Overview

• How to Create a Trilinos (Compatible) Package
• Adding Files to the Build System and Tarball
• Adding Configure Options
• Using Makefile.export for Tests and Examples
• 2D Objects
• Parallel Data Redistribution


Outline

• Creating Objects.
• 2D Objects.
• Teuchos tidbits.
• Performance Optimizations.


How to Create a Trilinos (Compatible) Package

Two primary cases:
• Using Autotools with an existing package
• Starting a new package using Autotools
Both cases are similar.

In either case, the package might be:
• Stand alone
• Used via Trilinos/packages/external
• Added to Trilinos


How to Create a Trilinos (Compatible) Package

Look at the new_package package.

Customize the following files for your package:
• configure.ac
• Makefile.am
• src/Makefile.am
• test/Makefile.am
• example/Makefile.am
• Makefile.export.<package>.in

(Some of the necessary changes can be made using scripts supplied by new_package.)

Additional instructions are supplied with new_package.


Adding Files to the Build System and Tarball

Library source files:

    CORE = \
      $(srcdir)/Epetra_BLAS.cpp \
      ...
      $(srcdir)/Epetra_Object.cpp

    CORE_H = \
      $(srcdir)/Epetra_BLAS.h \
      …
      $(srcdir)/Epetra_ConfigDefs.h

• Conditionally compiled files are listed with the ‘EXTRA_’ prefix.
• Don’t forget to list header files!


Adding Files to the Build System and Tarball

Makefile.am/.in:
• Add the directory the new files are in to ‘SUBDIRS’ in the Makefile.am one level up:
    SUBDIRS = DIR1 DIR2
• Add the Makefile that will be generated to ‘AC_CONFIG_FILES’ in configure.ac:
    AC_CONFIG_FILES([Makefile … src/Makefile …])
• Don’t forget to ‘cvs add’ both files.
• Run ./bootstrap

Other types of files (scripts, plain text, etc.):
• Add the name of the file to EXTRA_DIST:
    EXTRA_DIST = script1 README …
• Run ./bootstrap


Adding Configure Options

TAC_ARG_ENABLE_CAN_USE_PACKAGE(epetra, teuchos, …)
• ‘#ifdef HAVE_EPETRA_TEUCHOS’ in source code

TAC_ARG_ENABLE_FEATURE_SUB(epetra, abc, …)
• ‘#ifdef HAVE_EPETRA_ARRAY_BOUNDS_CHECK’ in source code


Adding Configure Options

TAC_ARG_WITH_PACKAGE(zoltan, [Enable Zoltan interface support], ZOLTAN, no)

AM_CONDITIONAL(HAVE_ZOLTAN, [test "X$ac_cv_use_zoltan" != "Xno"])
• ‘if HAVE_ZOLTAN’ in Makefile.am

AC_SEARCH_LIBS(pow,[m],,AC_MSG_ERROR(Cannot find math library))


Using Makefile.export for Tests / Examples

Makefile.am:

include $(top_builddir)/Makefile.export.epetra

EXEEXT = .exe

noinst_PROGRAMS = CrsMatrix_test

CrsMatrix_test_SOURCES = $(srcdir)/cxx_main.cpp

CrsMatrix_test_DEPENDENCIES = $(top_builddir)/src/libepetra.a

CrsMatrix_test_CXXFLAGS = $(EPETRA_INCLUDES)

CrsMatrix_test_LDADD = $(EPETRA_LIBS)


LAL Foundation: Petra

• Petra provides a “common language” for distributed linear algebra objects (operator, matrix, vector).
• Petra provides distributed matrix and vector services.
• Has 3 implementations under development.


Petra Object Model

Perform redistribution of distributed objects:
• Parallel permutations.
• “Ghosting” of values for local computations.
• Collection of partial results from remote processors.

Abstract Interface to Parallel Machine:
• Shameless mimic of MPI interface.
• Keeps MPI dependence to a single class (through all of Trilinos!).
• Allows trivial serial implementation.
• Opens door to novel parallel libraries (shmem, UPC, etc.).

Abstract Interface for Sparse All-to-All Communication:
• Supports construction of pre-recorded “plan” for data-driven communications.
• Examples:
  – Gather/scatter of off-processor x/y values when computing y = Ax.
  – Gathering overlap rows for Overlapping Schwarz.
  – Redistribution of matrices, vectors, etc.

Describes layout of distributed objects:
• Vectors: Number of vector entries on each processor and global ID.
• Matrices/graphs: Rows/Columns managed by a processor.
• Called “Maps” in Epetra.

Dense Distributed Vector and Matrices:
• Simple local data structure.
• BLAS-able, LAPACK-able.
• Ghostable, redistributable.
• RTOp-able.

Base Class for All Distributed Objects:
• Performs all communication.
• Requires Check, Pack, Unpack methods from derived class.

Graph class for structure-only computations:
• Reusable matrix structure.
• Pattern-based preconditioners.
• Pattern-based load balancing tools.

Basic sparse matrix class:
• Flexible construction process.
• Arbitrary entry placement on parallel machine.


Petra Implementations

Three versions under development:

Epetra (Essential Petra):
• Current production version.
• Restricted to real, double precision arithmetic.
• Uses stable core subset of C++ (circa 2000).
• Interfaces accessible to C and Fortran users.

Tpetra (Templated Petra):
• Next generation C++ version.
• Templated scalar and ordinal fields.
• Uses namespaces and STL: improved usability/efficiency.

Jpetra (Java Petra):
• Pure Java. Portable to any JVM.
• Interfaces to Java versions of MPI, LAPACK and BLAS via interfaces.


Details about Epetra Maps

Note: Focus on Maps (not BlockMaps). Getting beyond standard use case…


1-to-1 Maps

1-to-1 map (defn): A map is 1-to-1 if each GID appears only once in the map (and is therefore associated with only a single processor).

Certain operations in parallel data repartitioning require 1-to-1 maps. Specifically:
• The source map of an import must be 1-to-1.
• The target map of an export must be 1-to-1.
• The domain map of a 2D object must be 1-to-1.
• The range map of a 2D object must be 1-to-1.
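The 1-to-1 definition can be checked mechanically. The following is a minimal sketch in plain C++, not Epetra code: the type `MapLayout` and the function `isOneToOne` are hypothetical names introduced here, and a map is modeled simply as one list of GIDs per processor.

```cpp
#include <set>
#include <vector>

// Hypothetical model of a map: entry p holds the GIDs listed on processor p.
using MapLayout = std::vector<std::vector<int>>;

// Returns true if every GID appears exactly once across all processors.
bool isOneToOne(const MapLayout& layout) {
    std::set<int> seen;
    for (const auto& gids : layout) {
        for (int gid : gids) {
            // insert() reports whether the GID was new; a repeat means
            // the GID is listed twice, so the map is not 1-to-1.
            if (!seen.insert(gid).second) {
                return false;
            }
        }
    }
    return true;
}
```

For instance, the layout {0, 1} on PE 0 and {2} on PE 1 passes this check, while {0, 1, 2} on PE 0 and {1, 2} on PE 1 does not, because GIDs 1 and 2 are replicated.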


2D Objects: Four Maps

Epetra 2D objects:
• CrsMatrix, FECrsMatrix
• CrsGraph
• VbrMatrix, FEVbrMatrix

Have four maps:
• RowMap: On each processor, the GIDs of the rows that processor will “manage”. Typically a 1-to-1 map.
• ColMap: On each processor, the GIDs of the columns that processor will “manage”. Typically NOT a 1-to-1 map.
• DomainMap: The layout of domain objects (the x vector/multivector in y=Ax). Must be a 1-to-1 map!
• RangeMap: The layout of range objects (the y vector/multivector in y=Ax). Must be a 1-to-1 map!


Sample Problem

\[
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}
=
\begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix},
\qquad y = A x
\]


Case 1: Standard Approach

Original problem: y = A x, with A = tridiag(-1, 2, -1) of order 3 (see Sample Problem). GIDs in the maps are 0-based, so GID 0 corresponds to x_1 and y_1.

First 2 rows of A, elements of y and elements of x, kept on PE 0. Last row of A, element of y and element of x, kept on PE 1.

PE 0 Contents:

\[
y = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix},\quad
A = \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \end{bmatrix},\quad
x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
\]

RowMap = {0, 1}, ColMap = {0, 1, 2}, DomainMap = {0, 1}, RangeMap = {0, 1}

PE 1 Contents:

\[
y = \begin{bmatrix} y_3 \end{bmatrix},\quad
A = \begin{bmatrix} 0 & -1 & 2 \end{bmatrix},\quad
x = \begin{bmatrix} x_3 \end{bmatrix}
\]

RowMap = {2}, ColMap = {1, 2}, DomainMap = {2}, RangeMap = {2}

Notes:
• Rows are wholly owned.
• RowMap = DomainMap = RangeMap (all 1-to-1).
• ColMap is NOT 1-to-1.
• Call to FillComplete:
    A.FillComplete(); // Assumes …


Case 2: Twist 1

Original problem: y = A x, with A = tridiag(-1, 2, -1) of order 3 (see Sample Problem).

First 2 rows of A, first element of y and last 2 elements of x, kept on PE 0. Last row of A, last 2 elements of y and first element of x, kept on PE 1.

PE 0 Contents:

\[
y = \begin{bmatrix} y_1 \end{bmatrix},\quad
A = \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \end{bmatrix},\quad
x = \begin{bmatrix} x_2 \\ x_3 \end{bmatrix}
\]

RowMap = {0, 1}, ColMap = {0, 1, 2}, DomainMap = {1, 2}, RangeMap = {0}

PE 1 Contents:

\[
y = \begin{bmatrix} y_2 \\ y_3 \end{bmatrix},\quad
A = \begin{bmatrix} 0 & -1 & 2 \end{bmatrix},\quad
x = \begin{bmatrix} x_1 \end{bmatrix}
\]

RowMap = {2}, ColMap = {1, 2}, DomainMap = {0}, RangeMap = {1, 2}

Notes:
• Rows are wholly owned.
• RowMap ≠ DomainMap ≠ RangeMap (all 1-to-1).
• ColMap is NOT 1-to-1.
• Call to FillComplete:
    A.FillComplete(DomainMap, RangeMap);


Case 2: Twist 2

Original problem: y = A x, with A = tridiag(-1, 2, -1) of order 3 (see Sample Problem).

First row of A, part of second row of A, first element of y and last 2 elements of x, kept on PE 0.

Last row of A, part of second row of A, last 2 elements of y and first element of x, kept on PE 1.

PE 0 Contents:

\[
y = \begin{bmatrix} y_1 \end{bmatrix},\quad
A = \begin{bmatrix} 2 & -1 & 0 \\ -1 & 1 & 0 \end{bmatrix},\quad
x = \begin{bmatrix} x_2 \\ x_3 \end{bmatrix}
\]

RowMap = {0, 1}, ColMap = {0, 1}, DomainMap = {1, 2}, RangeMap = {0}

PE 1 Contents:

\[
y = \begin{bmatrix} y_2 \\ y_3 \end{bmatrix},\quad
A = \begin{bmatrix} 0 & 1 & -1 \\ 0 & -1 & 2 \end{bmatrix},\quad
x = \begin{bmatrix} x_1 \end{bmatrix}
\]

RowMap = {1, 2}, ColMap = {1, 2}, DomainMap = {0}, RangeMap = {1, 2}

Notes:
• Rows are NOT wholly owned: the two partial copies of row 2 sum to the original row, [-1 1 0] + [0 1 -1] = [-1 2 -1].
• RowMap ≠ DomainMap ≠ RangeMap (DomainMap and RangeMap are 1-to-1).
• RowMap and ColMap are NOT 1-to-1.
• Call to FillComplete:
    A.FillComplete(DomainMap, RangeMap);


What does FillComplete Do?

A bunch of stuff. One task is to create (if needed) import/export objects to support distributed matrix-vector multiplication:
• If ColMap ≠ DomainMap, create Import object.
• If RowMap ≠ RangeMap, create Export object.

A few rules:
• Rectangular matrices will always require:
    A.FillComplete(DomainMap,RangeMap);
• DomainMap and RangeMap must be 1-to-1.
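The import/export decision can be illustrated with a small stand-alone sketch. This is not how Epetra implements FillComplete; `MapLayout`, `CommPlan` and `planMatVec` are hypothetical names, and map comparison is simplified to comparing per-processor GID lists.

```cpp
#include <vector>

// Hypothetical model of a map: entry p holds the GIDs listed on processor p.
using MapLayout = std::vector<std::vector<int>>;

// Which communication objects a distributed y = Ax would need.
struct CommPlan {
    bool needsImport;  // gather off-processor x values before the local multiply
    bool needsExport;  // combine partial y values after the local multiply
};

// Applies the two rules: ColMap != DomainMap -> Import,
// RowMap != RangeMap -> Export.
CommPlan planMatVec(const MapLayout& rowMap, const MapLayout& colMap,
                    const MapLayout& domainMap, const MapLayout& rangeMap) {
    CommPlan plan;
    plan.needsImport = (colMap != domainMap);
    plan.needsExport = (rowMap != rangeMap);
    return plan;
}
```

With the maps of Case 1 (RowMap = DomainMap = RangeMap, ColMap replicated) only an import is needed; with the maps of Case 2, Twist 2, both an import and an export are needed because row 2 is shared.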


Parallel Data Redistribution

Epetra vectors, multivectors, graphs and matrices are distributed via one of the map objects.

A map is basically a partitioning of a list of global IDs:
• IDs are simply labels, no need to use contiguous values (Directory class handles details for general ID lists).
• No a priori restriction on replicated IDs.

If we are given:
• A source map, and
• A set of vectors, multivectors, graphs and matrices (or other distributable objects) based on the source map.

Redistribution is performed by:
1. Specifying a target map with a new distribution of the global IDs.
2. Creating an Import or Export object using the source and target maps.
3. Creating vectors, multivectors, graphs and matrices that are redistributed (to the target map layout) using the Import/Export object.
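The effect of an export with additive combining can be simulated without MPI. The sketch below is plain C++, not Epetra: `DistVector`, `OwnerOf` and `exportAdd` are hypothetical names, each processor's data is a GID-to-value table, and the target map is assumed to be 1-to-1 so that every GID has a single owner.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical model: entry p is the GID -> value table held by processor p.
using DistVector = std::vector<std::map<int, double>>;
// Target layout: the single owning processor of each GID (1-to-1 target map).
using OwnerOf = std::map<int, int>;

// Sends every source entry to the owner of its GID in the target layout and
// sums contributions that land on the same GID ("Add" combine mode).
DistVector exportAdd(const DistVector& source, const OwnerOf& targetOwner,
                     int numProcs) {
    DistVector target(numProcs);
    for (const auto& local : source) {
        for (const auto& entry : local) {
            const int gid = entry.first;
            auto it = targetOwner.find(gid);
            assert(it != targetOwner.end());                   // GID must have an owner
            assert(it->second >= 0 && it->second < numProcs);  // owner must be a valid PE
            target[it->second][gid] += entry.second;           // new entries start at 0.0
        }
    }
    return target;
}
```

If PE 0 holds the value 10 for GIDs 0, 1, 2 and PE 1 holds 20 for GIDs 1, 2, 3, exporting to a target layout in which PE 0 owns all four GIDs gives 10, 30, 30, 20 on PE 0 and nothing on PE 1.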


Example: epetra/ex9.cpp

    #include <cstdlib>
    #include <iostream>
    #include "mpi.h"
    #include "Epetra_MpiComm.h"
    #include "Epetra_Map.h"
    #include "Epetra_IntSerialDenseVector.h"
    #include "Epetra_Vector.h"
    #include "Epetra_Export.h"
    using namespace std;

    // Intended to be run on 2 processes.
    int main(int argc, char *argv[]) {
      MPI_Init(&argc, &argv);
      Epetra_MpiComm Comm(MPI_COMM_WORLD);
      int NumGlobalElements = 4; // global dimension of the problem
      int NumMyElements;         // local nodes
      Epetra_IntSerialDenseVector MyGlobalElements;

      if( Comm.MyPID() == 0 ) {
        NumMyElements = 3;
        MyGlobalElements.Size(NumMyElements);
        MyGlobalElements[0] = 0;
        MyGlobalElements[1] = 1;
        MyGlobalElements[2] = 2;
      } else {
        NumMyElements = 3;
        MyGlobalElements.Size(NumMyElements);
        MyGlobalElements[0] = 1;
        MyGlobalElements[1] = 2;
        MyGlobalElements[2] = 3;
      }

      // create a map (GIDs 1 and 2 are replicated, so it is not 1-to-1)
      Epetra_Map Map(-1,MyGlobalElements.Length(),
                     MyGlobalElements.Values(),0, Comm);

      // create a vector based on map
      Epetra_Vector xxx(Map);
      for( int i=0 ; i<NumMyElements ; ++i )
        xxx[i] = 10*( Comm.MyPID()+1 );
      if( Comm.MyPID() == 0 ){
        double val = 12;
        int pos = 3; // GID 3 is not in PE 0's part of Map
        xxx.SumIntoGlobalValues(1,0,&val,&pos);
      }
      cout << xxx;

      // create a target map, in which all elements are on proc 0
      int NumMyElements_target;
      if( Comm.MyPID() == 0 )
        NumMyElements_target = NumGlobalElements;
      else
        NumMyElements_target = 0;
      Epetra_Map TargetMap(-1,NumMyElements_target,0,Comm);
      Epetra_Export Exporter(Map,TargetMap);

      // work on vectors
      Epetra_Vector yyy(TargetMap);
      yyy.Export(xxx,Exporter,Add);
      cout << yyy;

      MPI_Finalize();
      return( EXIT_SUCCESS );
    }


Output: epetra/ex9.cpp

    > mpirun -np 2 ./ex9.exe
    Epetra::Vector
         MyPID      GID    Value
             0        0       10
             0        1       10
             0        2       10
    Epetra::Vector
             1        1       20
             1        2       20
             1        3       20
    Epetra::Vector
         MyPID      GID    Value
             0        0       10
             0        1       30
             0        2       30
             0        3       20
    Epetra::Vector

Before Export:
• PE 0: xxx(0)=10, xxx(1)=10, xxx(2)=10
• PE 1: xxx(1)=20, xxx(2)=20, xxx(3)=20

After Export (Export/Add):
• PE 0: yyy(0)=10, yyy(1)=30, yyy(2)=30, yyy(3)=20
• PE 1: (no entries)


Import vs. Export

Import (Export) means calling processor knows what it wants to receive (send).

Distinction between Import/Export is important to user, almost identical in implementation.

Import (Export) objects can be used to do an Export (Import) as a reverse operation.

When mapping is bijective (1-to-1 and onto), either Import or Export is appropriate.


Example: 1D Matrix Assembly

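Problem: -u_xx = f on the interval [a, b], with u(a) = 0 and u(b) = 1.

Grid points x1, x2, x3 lie between a and b; the domain is split between PE 0 and PE 1, with x2 on the boundary between them.

• 3 Equations: Find u at x1, x2 and x3.
• Equation for u at x2 gets a contribution from PE 0 and PE 1.
• Would like to compute partial contributions independently.
• Then combine partial results.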


Two Maps

We need two maps:

Assembly map:
• PE 0: { 1, 2 }.
• PE 1: { 2, 3 }.

Solver map:
• PE 0: { 1, 2 } (we arbitrate ownership of 2).
• PE 1: { 3 }.


End of Assembly Phase

At the end of assembly phase we have AssemblyMatrix:

On PE 0:
    Equation 1: [  2  -1   0 ]
    Equation 2: [ -1   1   0 ]

On PE 1:
    Equation 2: [  0   1  -1 ]
    Equation 3: [  0  -1   2 ]

Row 2 is shared.

Want to assign all of Equation 2 to PE 0 for use with solver.

NOTE: For a class of Neumann-Neumann preconditioners, the above layout is exactly what we want.


Export Assembly Matrix to Solver Matrix

Epetra_Export Exporter(AssemblyMap, SolverMap);

Epetra_CrsMatrix SolverMatrix (Copy, SolverMap, 0);

SolverMatrix.Export(AssemblyMatrix, Exporter, Add);

SolverMatrix.FillComplete();
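The effect of exporting the shared row with the Add combine mode can be reproduced in a few lines of plain C++. This is a serial simulation under simplifying assumptions, not Epetra code: `Row`, `DistMatrix` and `exportRowsAdd` are hypothetical names, and rows are stored densely for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

using Row = std::vector<double>;                     // dense row, for brevity
using DistMatrix = std::vector<std::map<int, Row>>;  // per PE: row GID -> coefficients

// Moves each assembled row to its owner in the solver layout and sums
// partial rows that share a GID, mimicking an export with "Add".
DistMatrix exportRowsAdd(const DistMatrix& assembly,
                         const std::map<int, int>& solverOwner,
                         int numProcs) {
    DistMatrix solver(numProcs);
    for (const auto& localRows : assembly) {
        for (const auto& entry : localRows) {
            const Row& partial = entry.second;
            const int owner = solverOwner.at(entry.first);  // throws if GID has no owner
            assert(owner >= 0 && owner < numProcs);
            Row& dest = solver[owner][entry.first];
            if (dest.empty()) {
                dest.assign(partial.size(), 0.0);  // first contribution to this row
            }
            assert(dest.size() == partial.size());
            for (std::size_t j = 0; j < partial.size(); ++j) {
                dest[j] += partial[j];
            }
        }
    }
    return solver;
}
```

Feeding in the assembly layout from the 1D example (row 2 present on both PEs as [-1 1 0] and [0 1 -1]) with a solver layout that assigns rows 1 and 2 to PE 0 and row 3 to PE 1 yields the complete row [-1 2 -1] on PE 0.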


Matrix Export

Before Export:

PE 0:
    Equation 1: [  2  -1   0 ]
    Equation 2: [ -1   1   0 ]

PE 1:
    Equation 2: [  0   1  -1 ]
    Equation 3: [  0  -1   2 ]

After Export (Export/Add):

PE 0:
    Equation 1: [  2  -1   0 ]
    Equation 2: [ -1   2  -1 ]

PE 1:
    Equation 3: [  0  -1   2 ]


Example: epetraext/ex2.cpp

int main(int argc, char *argv[]) {

MPI_Init(&argc,&argv);

Epetra_MpiComm Comm (MPI_COMM_WORLD);

int MyPID = Comm.MyPID();

int n=4;

// Generate Laplacian2d gallery matrix

Trilinos_Util::CrsMatrixGallery G("laplace_2d", Comm);

G.Set("problem_size", n*n);

G.Set("map_type", "linear"); // Linear map initially

// Get the LinearProblem.

Epetra_LinearProblem *Prob = G.GetLinearProblem();

// Get the exact solution.

Epetra_MultiVector *sol = G.GetExactSolution();

// Get the rhs (b) and lhs (x)

Epetra_MultiVector *b = Prob->GetRHS();

Epetra_MultiVector *x = Prob->GetLHS();

// Repartition graph using Zoltan

EpetraExt::Zoltan_CrsGraph * ZoltanTrans = new EpetraExt::Zoltan_CrsGraph();

EpetraExt::LinearProblem_GraphTrans * ZoltanLPTrans =

new EpetraExt::LinearProblem_GraphTrans( *(dynamic_cast<EpetraExt::StructuralSameTypeTransform<Epetra_CrsGraph>*>(ZoltanTrans)) );

cout << "Creating Load Balanced Linear Problem\n";

Epetra_LinearProblem &BalancedProb = (*ZoltanLPTrans)(*Prob);

// Get the rhs (b) and lhs (x)

Epetra_MultiVector *Balancedb = BalancedProb.GetRHS();

Epetra_MultiVector *Balancedx = BalancedProb.GetLHS();

cout << "Balanced b: " << *Balancedb << endl;

cout << "Balanced x: " << *Balancedx << endl;

MPI_Finalize() ;

return 0 ;

}


Need for Import/Export

Solvers for complex engineering applications need expressive, easy-to-use parallel data redistribution:
• Allows better scaling for non-uniform overlapping Schwarz.
• Necessary for robust solution of multiphysics problems.

We have found import and export facilities to be a very natural and powerful technique to address these issues.


Extending Capabilities: Preconditioners, Operators, Matrices

Illustrated using AztecOO as example


Epetra User Class Categories

Sparse Matrices: RowMatrix, (CrsMatrix, VbrMatrix, FECrsMatrix, FEVbrMatrix)

Linear Operator: Operator: (AztecOO, ML, Ifpack)

Dense Matrices: DenseMatrix, DenseVector, BLAS, LAPACK,SerialDenseSolver

Vectors: Vector, MultiVector

Graphs: CrsGraph

Data Layout: Map, BlockMap, LocalMap

Redistribution: Import, Export, LbGraph, LbMatrix

Aggregates: LinearProblem

Parallel Machine: Comm, (SerialComm, MpiComm, MpiSmpComm)

Utilities: Time, Flops


LinearProblem Class

A linear problem is defined by:
• Matrix A: an Epetra_RowMatrix or Epetra_Operator object (often a CrsMatrix or VbrMatrix object).
• Vectors x, b: Vector objects.

To call AztecOO, first define a LinearProblem:
• Constructed from A, x and b.
• Once defined, can:
  – Scale the problem (explicit preconditioning).
  – Precondition it (implicitly).
  – Change x and b.


AztecOO

Aztec is the previous workhorse solver at Sandia:
• Extracted from the MPSalsa reacting flow code.
• Installed in dozens of Sandia apps.

AztecOO leverages the investment in Aztec:
• Uses Aztec iterative methods and preconditioners.

AztecOO improves on Aztec by:
• Using Epetra objects for defining matrix and RHS.
• Providing more preconditioners/scalings.
• Using C++ class design to enable more sophisticated use.

The AztecOO interface allows:
• Continued use of Aztec for functionality.
• Introduction of new solver capabilities outside of Aztec.

Belos is coming along as an alternative:
• AztecOO will not go away.
• Will encourage new efforts and refactorings to use Belos.


A Simple Epetra/AztecOO Program

    // Header files omitted…
    int main(int argc, char *argv[]) {
      MPI_Init(&argc,&argv); // Initialize MPI, MpiComm
      Epetra_MpiComm Comm( MPI_COMM_WORLD );

      // ***** Map puts same number of equations on each pe *****
      int NumMyElements = 1000 ;
      Epetra_Map Map(-1, NumMyElements, 0, Comm);
      int NumGlobalElements = Map.NumGlobalElements();

      // ***** Create an Epetra_Matrix tridiag(-1,2,-1) *****
      Epetra_CrsMatrix A(Copy, Map, 3);
      double negOne = -1.0;
      double posTwo = 2.0;

      for (int i=0; i<NumMyElements; i++) {
        int GlobalRow = A.GRID(i);
        int RowLess1 = GlobalRow - 1;
        int RowPlus1 = GlobalRow + 1;
        if (RowLess1!=-1)
          A.InsertGlobalValues(GlobalRow, 1, &negOne, &RowLess1);
        if (RowPlus1!=NumGlobalElements)
          A.InsertGlobalValues(GlobalRow, 1, &negOne, &RowPlus1);
        A.InsertGlobalValues(GlobalRow, 1, &posTwo, &GlobalRow);
      }
      A.FillComplete(); // Transform from GIDs to LIDs

      // ***** Create x and b vectors *****
      Epetra_Vector x(Map);
      Epetra_Vector b(Map);
      b.Random(); // Fill RHS with random #s

      // ***** Create Linear Problem *****
      Epetra_LinearProblem problem(&A, &x, &b);

      // ***** Create/define AztecOO instance, solve *****
      AztecOO solver(problem);
      solver.SetAztecOption(AZ_precond, AZ_Jacobi);
      solver.Iterate(1000, 1.0E-8);

      // ***** Report results, finish ***********************
      cout << "Solver performed " << solver.NumIters()
           << " iterations." << endl
           << "Norm of true residual = "
           << solver.TrueResidual() << endl;

      MPI_Finalize() ;
      return 0;
    }


AztecOO Extensibility

AztecOO is designed to accept externally defined:

Operators (both A and M):
• The linear operator A is accessed as an Epetra_Operator.
• Users can register a preconstructed preconditioner as an Epetra_Operator.

RowMatrix:
• If A is registered as a RowMatrix, Aztec’s preconditioners are accessible.
• Alternatively M can be registered separately as an Epetra_RowMatrix, and Aztec’s preconditioners are accessible.

StatusTests:
• Aztec’s standard stopping criteria are accessible.
• Can override these mechanisms by registering a StatusTest Object.


AztecOO understands Epetra_Operator

[Figure: Epetra_Operator Methods Documentation]

AztecOO is designed to accept externally defined:
• Operators (both A and M).
• RowMatrix (facilitates use of AztecOO preconditioners with external A).
• StatusTests (externally-defined stopping criteria).


AztecOO UserOp/UserMat Recursive Call Example

Trilinos/packages/aztecoo/example/AztecOO_RecursiveCall

    Poisson2dOperator A(nx, ny, comm); // Generate nx by ny Poisson operator
    Epetra_CrsMatrix * precMatrix = A.GeneratePrecMatrix(); // Build tridiagonal approximate Poisson

    Epetra_Vector xx(A.OperatorDomainMap()); // Generate vectors (xx will be used to generate RHS b)
    Epetra_Vector x(A.OperatorDomainMap());
    Epetra_Vector b(A.OperatorRangeMap());

    xx.Random(); // Generate exact x and then rhs b
    A.Apply(xx, b);

    // Build AztecOO solver that will be used as a preconditioner
    Epetra_LinearProblem precProblem;
    precProblem.SetOperator(precMatrix);
    AztecOO precSolver(precProblem);
    precSolver.SetAztecOption(AZ_precond, AZ_ls);
    precSolver.SetAztecOption(AZ_output, AZ_none);
    precSolver.SetAztecOption(AZ_solver, AZ_cg);
    AztecOO_Operator precOperator(&precSolver, 20);

    Epetra_LinearProblem problem(&A, &x, &b); // Construct linear problem
    AztecOO solver(problem); // Construct solver

    solver.SetPrecOperator(&precOperator); // Register Preconditioner operator

    solver.SetAztecOption(AZ_solver, AZ_cg);
    solver.Iterate(Niters, 1.0E-12);


Ifpack/AztecOO Example Trilinos/packages/aztecoo/example/IfpackAztecOO

     1. // Assume A, x, b are defined, LevelFill and Overlap are specified
     2. Ifpack_IlukGraph IlukGraph(A.Graph(), LevelFill, Overlap);
     3. IlukGraph.ConstructFilledGraph();
     4. Ifpack_CrsRiluk ILUK (IlukGraph);
     5. ILUK.InitValues(A);
     6. assert(ILUK.Factor()==0); // Note: All Epetra/Ifpack/AztecOO methods return int err codes
     7. double Condest;
     8. ILUK.Condest(false, Condest); // Get condition estimate
     9. if (Condest > tooBig) {
    10.   ILUK.SetAbsoluteThreshold(Athresh);
    11.   ILUK.SetRelativeThreshold(Rthresh);
    12.   // Go back to line 4 and try again
    13. }
    14. Epetra_LinearProblem problem(&A, &x, &b); // Construct linear problem
    15. AztecOO solver(problem); // Construct solver
    16. solver.SetPrecOperator(&ILUK); // Register Preconditioner operator
    17. solver.SetAztecOption(AZ_solver, AZ_cg);
    18. solver.Iterate(Niters, 1.0E-12);
    19. // Once this linear solution completes and the next nonlinear step is advanced,
    20. // we will return to the solver, but only need to execute steps 5 on down…


Multiple Stopping Criteria

Possible scenario for stopping an iterative solver:
• Test 1: Make sure residual is decreased by 6 orders of magnitude.
And
• Test 2: Make sure that the inf-norm of true residual is no more than 1.0E-8.
But
• Test 3: Do no more than 200 iterations.

Note: Test 1 is cheap. Do it before Test 2.
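The logic of this scenario can be sketched independently of AztecOO. In the plain C++ below, `IterState`, `Test`, `seq`, `either` and `makeStoppingTest` are hypothetical names, not AztecOO classes; the expensive true-residual norm is modeled as a callable so that it is only evaluated when the cheap test has already passed.

```cpp
#include <functional>

// Hypothetical snapshot of solver state handed to a stopping test.
struct IterState {
    int iteration;                              // iterations done so far
    double relResidual;                         // cheap: implicit residual / initial residual
    std::function<double()> trueResidualInfNorm;  // expensive: computed on demand
};

using Test = std::function<bool(const IterState&)>;

// Sequential combination: 'second' is evaluated only once 'first' passes.
Test seq(Test first, Test second) {
    return [=](const IterState& s) { return first(s) && second(s); };
}

// Either test passing stops the solver; 'a' is checked first.
Test either(Test a, Test b) {
    return [=](const IterState& s) { return a(s) || b(s); };
}

// Test 3 OR (Test 1 SEQ Test 2), with the thresholds from the scenario.
Test makeStoppingTest(int maxIters) {
    Test test1 = [](const IterState& s) { return s.relResidual <= 1.0e-6; };
    Test test2 = [](const IterState& s) { return s.trueResidualInfNorm() <= 1.0e-8; };
    Test test3 = [maxIters](const IterState& s) { return s.iteration >= maxIters; };
    return either(test3, seq(test1, test2));
}
```

Because `&&` and `||` short-circuit, the true residual is never computed while the cheap relative-residual test is still failing, which is the reason for ordering Test 1 before Test 2.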


AztecOO StatusTest classes

• AztecOO_StatusTest: Abstract base class for defining stopping criteria.
• Combo class: OR, AND, SEQ

[Figure: AztecOO_StatusTest Methods]


AztecOO/StatusTest Example Trilinos/packages/aztecoo/example/AztecOO

    // Assume A, x, b are defined

    Epetra_LinearProblem problem(&A, &x, &b); // Construct linear problem
    AztecOO solver(problem); // Construct solver

    AztecOO_StatusTestResNorm restest1(A, x, b, 1.0E-6);
    restest1.DefineResForm(AztecOO_StatusTestResNorm::Implicit, AztecOO_StatusTestResNorm::TwoNorm);
    restest1.DefineScaleForm(AztecOO_StatusTestResNorm::NormOfInitRes, AztecOO_StatusTestResNorm::TwoNorm);

    AztecOO_StatusTestResNorm restest2(A, x, b, 1.0E-8);
    restest2.DefineResForm(AztecOO_StatusTestResNorm::Explicit, AztecOO_StatusTestResNorm::InfNorm);
    restest2.DefineScaleForm(AztecOO_StatusTestResNorm::NormOfRHS, AztecOO_StatusTestResNorm::InfNorm);

    AztecOO_StatusTestCombo comboTest1(AztecOO_StatusTestCombo::SEQ, restest1, restest2);

    AztecOO_StatusTestMaxIters maxItersTest(200);
    AztecOO_StatusTestCombo comboTest2(AztecOO_StatusTestCombo::OR, maxItersTest, comboTest1);
    solver.SetStatusTest(&comboTest2);

    solver.SetAztecOption(AZ_solver, AZ_cg);
    solver.Iterate(Niters, 1.0E-12);


Summary: Extending Capabilities

Trilinos packages are designed to interoperate.

All packages (ML, IFPACK, AztecOO, …) that can provide linear operators:
• Implement the Epetra_Operator interface.
• Are available to any package that can use a linear operator.

All packages (ML, AztecOO, NOX, Belos, Anasazi, …) that can use linear operators:
• Accept linear operator via Epetra_Operator interface.
• Support easy user extensions.

All packages (ML, IFPACK, AztecOO, …) that need matrix coefficient data:
• Can access that data from Epetra_RowMatrix interface.
• Can use any concrete Epetra matrix class, or any user-provided adapter.


Summary: Extending Capabilities

AztecOO is one example. Flexibility comes from abstract base classes:

• Epetra_Operator:
  – All Epetra matrix classes implement.
  – Best way to define A and M when coefficient info not needed.

• Epetra_RowMatrix:
  – All Epetra matrix classes implement.
  – Best way to define A and M when coefficient info is needed.

• AztecOO_StatusTest:
  – A suite of parametrized status tests.
  – An abstract interface for users to define their own.
  – Ability to combine tests for sophisticated control of stopping.