PETSc

41
Material adapted from a tutorial by: PETSc Satish Balay Bill Gropp Lois Curfman McInnes Barry Smith www-fp.mcs.anl.gov/petsc/docs/ tutorials/index.htm Mathematics and Computer Science Division Argonne National Laboratory http://www.mcs.anl.gov/petsc

description

PETSc. Material adapted from a tutorial by:. Satish Balay Bill Gropp Lois Curfman McInnes Barry Smith www-fp.mcs.anl.gov/petsc/docs/tutorials/index.htm Mathematics and Computer Science Division Argonne National Laboratory http://www.mcs.anl.gov/petsc. PETSc Philosophy. - PowerPoint PPT Presentation

Transcript of PETSc

Page 1: PETSc

Material adapted from a tutorial by:

PETSc

Satish BalayBill Gropp

Lois Curfman McInnesBarry Smith

www-fp.mcs.anl.gov/petsc/docs/tutorials/index.htm

Mathematics and Computer Science DivisionArgonne National Laboratoryhttp://www.mcs.anl.gov/petsc

Page 2: PETSc

PETSc Philosophy

• Writing hand-parallelized (message-passing or distributed shared memory) application codes from scratch is extremely difficult and time consuming.

• Scalable parallelizing compilers for real application codes are very far in the future.

• We can ease the development of parallel application codes by developing general-purpose, parallel numerical PDE libraries.

Page 3: PETSc

PETSc Concepts in the Tutorial

• How to specify the mathematics of the problem– Data objects

• vectors, matrices

• How to solve the problem– Solvers

• linear, nonlinear, and timestepping (ODE) solvers

• Parallel computing complications– Parallel data layout

• structured and unstructured meshes

Page 4: PETSc

Tutorial Outline• Getting started

– sample results– programming paradigm

• Data objects– vectors (e.g., field variables)– matrices (e.g., sparse

Jacobians) • Viewers

– object information – visualization

• Solvers– linear– nonlinear– timestepping (and ODEs)

• Data layout and ghost values– structured mesh problems – unstructured mesh problems– partitioning and coloring

• Putting it all together– a complete example

• Debugging and error handling

• Profiling and performance tuning– -log_summary– stages and preloading– user-defined events

• Extensibility issues

Page 5: PETSc

• Advanced– user-defined customization of

algorithms and data structures

• Developer– advanced customizations,

intended primarily for use by library developers

Tutorial Approach

• Beginner – basic functionality, intended

for use by most programmers

• Intermediate– selecting options, performance

evaluation and tuning

From the perspective of an application programmer:

beginner

1

intermediate

2

advanced

3

developer

4

Page 6: PETSc

Incremental Application Improvement

• Beginner– Get the application “up and walking”

• Intermediate– Experiment with options– Determine opportunities for improvement

• Advanced– Extend algorithms and/or data structures as needed

• Developer– Consider interface and efficiency issues for integration

and interoperability of multiple toolkits

Page 7: PETSc

The PETSc Programming Model• Goals

– Portable, runs everywhere– Performance– Scalable parallelism

• Approach– Distributed memory, “shared-nothing”

• Requires only a compiler (single node or processor)• Access to data on remote machines through MPI

– Can still exploit “compiler discovered” parallelism on each node (e.g., SMP)

– Hide within parallel objects the details of the communication– User orchestrates communication at a higher abstract level than

message passingtutorial introduction

Page 8: PETSc

Collectivity• MPI communicators (MPI_Comm) specify collectivity

(processors involved in a computation)• All PETSc creation routines for solver and data objects are

collective with respect to a communicator, e.g., VecCreate(MPI_Comm comm, int m, int M, Vec *x)

• Some operations are collective, while others are not, e.g., – collective: VecNorm( )– not collective: VecGetLocalSize()

• If a sequence of collective routines is used, they must be called in the same order on each processor

tutorial introduction

Page 9: PETSc

Computation and Communication KernelsMPI, MPI-IO, BLAS, LAPACK

Profiling Interface

PETSc PDE Application Codes

Object-OrientedMatrices, Vectors, Indices

GridManagement

Linear SolversPreconditioners + Krylov Methods

Nonlinear Solvers,Unconstrained Minimization

ODE Integrators Visualization

Interface

PDE Application Codes

tutorial introduction

Page 10: PETSc

CompressedSparse Row

(AIJ)

Blocked CompressedSparse Row

(BAIJ)

BlockDiagonal(BDIAG)

Dense Other

Indices Block Indices Stride Other

Index SetsVectors

Line Search Trust Region

Newton-based MethodsOther

Nonlinear Solvers

AdditiveSchwartz

BlockJacobi Jacobi ILU ICC LU

(Sequential only) Others

Preconditioners

Euler BackwardEuler

Pseudo TimeStepping Other

Time Steppers

GMRES CG CGS Bi-CG-STAB TFQMR Richardson Chebychev OtherKrylov Subspace Methods

Matrices

PETSc Numerical Components

tutorial introduction

Page 11: PETSc

PETSc codeUser code

ApplicationInitialization

FunctionEvaluation

JacobianEvaluation

Post-Processing

PC KSPPETSc

Main Routine

Linear Solvers (SLES)

Nonlinear Solvers (SNES)

Timestepping Solvers (TS)

Flow of Control for PDE Solution

tutorial introduction

Page 12: PETSc

True Portability• Tightly coupled systems

– Cray T3D/T3E– SGI/Origin– IBM SP– Convex Exemplar

• Loosely coupled systems, e.g., networks of workstations– Sun OS, Solaris 2.5– IBM AIX– DEC Alpha– HP tutorial

introduction

– Linux– Freebsd– Windows NT/95

Page 13: PETSc

Data Objects

• Object creation• Object assembly • Setting options• Viewing• User-defined customizations

• Vectors (Vec)– focus: field data arising in nonlinear PDEs

• Matrices (Mat)– focus: linear operators arising in nonlinear PDEs (i.e., Jacobians)

tutorial outline: data objects

beginner

beginner

intermediate

advanced

intermediate

Page 14: PETSc

Vectors

• Fundamental objects for storing field solutions, right-hand sides, etc.

• VecCreateMPI(...,Vec *)– MPI_Comm - processors that share the

vector– number of elements local to this processor– total number of elements

• Each process locally owns a subvector of contiguously numbered global indices

data objects: vectorsbeginner

proc 3

proc 2

proc 0

proc 4

proc 1

Page 15: PETSc

Vectors: Example#include "vec.h"int main(int argc,char **argv){ Vec x,y,w; /* vectors */ Vec *z; /* array of vectors */ double norm,v,v1,v2;int n = 20,ierr; Scalar one = 1.0,two = 2.0,three = 3.0,dots[3],dot;

PetscInitialize(&argc,&argv,(char*)0,help); ierr = OptionsGetInt(PETSC_NULL,"n",&n,PETSC_NULL); CHKERRA(ierr); ierr =VecCreate(PETSC_COMM_WORLD,PETSC_DECIDE,n,&x); CHKERRA(ierr); ierr = VecSetFromOptions(x);CHKERRA(ierr); ierr = VecDuplicate(x,&y);CHKERRA(ierr); ierr = VecDuplicate(x,&w);CHKERRA(ierr); ierr = VecSet(&one,x);CHKERRA(ierr); ierr = VecSet(&two,y);CHKERRA(ierr); ierr = VecSet(&one,z[0]);CHKERRA(ierr); ierr = VecSet(&two,z[1]);CHKERRA(ierr); ierr = VecSet(&three,z[2]);CHKERRA(ierr); ierr = VecDot(x,x,&dot);CHKERRA(ierr); ierr = VecMDot(3,x,z,dots);CHKERRA(ierr);

Page 16: PETSc

Vectors: Example ierr = PetscPrintf(PETSC_COMM_WORLD,"Vector length %d\n",(int)dot); CHKERRA(ierr); ierr = VecScale(&two,x);CHKERRA(ierr); ierr = VecNorm(x,NORM_2,&norm);CHKERRA(ierr); v = norm-2.0*sqrt((double)n); if (v > -1.e-10 && v < 1.e-10) v = 0.0; ierr = PetscPrintf(PETSC_COMM_WORLD,"VecScale %g\n",v); CHKERRA(ierr); ierr = VecDestroy(x);CHKERRA(ierr); ierr = VecDestroy(y);CHKERRA(ierr); ierr = VecDestroy(w);CHKERRA(ierr); ierr = VecDestroyVecs(z,3);CHKERRA(ierr); PetscFinalize(); return 0;}

Page 17: PETSc

Vector Assembly

• VecSetValues(Vec,…)– number of entries to insert/add– indices of entries– values to add– mode: [INSERT_VALUES,ADD_VALUES]

• VecAssemblyBegin(Vec)• VecAssemblyEnd(Vec)

data objects: vectorsbeginner

Page 18: PETSc

Parallel Matrix and Vector Assembly

• Processors may generate any entries in vectors and matrices

• Entries need not be generated on the processor on which they ultimately will be stored

• PETSc automatically moves data during the assembly process if necessary

data objects: vectors and matricesbeginner

Page 19: PETSc

Selected Vector Operations

Function Name Operation

VecAXPY(Scalar *a, Vec x, Vec y) y = y + a*xVecAYPX(Scalar *a, Vec x, Vec y) y = x + a*yVecWAXPY(Scalar *a, Vec x, Vec y, Vec w) w = a*x + yVecScale(Scalar *a, Vec x) x = a*xVecCopy(Vec x, Vec y) y = xVecPointwiseMult(Vec x, Vec y, Vec w) w_i = x_i *y_iVecMax(Vec x, int *idx, double *r) r = max x_iVecShift(Scalar *s, Vec x) x_i = s+x_iVecAbs(Vec x) x_i = |x_i |VecNorm(Vec x, NormType type , double *r) r = ||x||

Page 20: PETSc

Sparse Matrices

• Fundamental objects for storing linear operators (e.g., Jacobians)

• MatCreateMPIAIJ(…,Mat *)– MPI_Comm - processors that share the matrix– number of local rows and columns– number of global rows and columns– optional storage pre-allocation information

data objects: matricesbeginner

Page 21: PETSc

Parallel Matrix Distribution

MatGetOwnershipRange(Mat A, int *rstart, int *rend)– rstart: first locally owned row of global matrix– rend -1: last locally owned row of global matrix

Each process locally owns a submatrix of contiguously numbered global rows.

proc 0

} proc 3: locally owned rowsproc 3proc 2proc 1

proc 4

data objects: matricesbeginner

Page 22: PETSc

Matrix Assembly

• MatSetValues(Mat,…)– number of rows to insert/add– indices of rows and columns– number of columns to insert/add– values to add– mode: [INSERT_VALUES,ADD_VALUES]

• MatAssemblyBegin(Mat)• MatAssemblyEnd(Mat)

data objects: matricesbeginner

Page 23: PETSc

Viewers

• Printing information about solver and data objects

• Visualization of field and matrix data

• Binary output of vector and matrix data

tutorial outline: viewers

beginner

beginner

intermediate

Page 24: PETSc

Viewer Concepts• Information about PETSc objects

– runtime choices for solvers, nonzero info for matrices, etc.• Data for later use in restarts or external tools

– vector fields, matrix contents – various formats (ASCII, binary)

• Visualization– simple x-window graphics

• vector fields• matrix sparsity structure

beginner viewers

Page 25: PETSc

Viewing Vector Fields• VecView(Vec x,Viewer v);• Default viewers

– ASCII (sequential):VIEWER_STDOUT_SELF

– ASCII (parallel):VIEWER_STDOUT_WORLD

– X-windows:VIEWER_DRAW_WORLD

• Default ASCII formats– VIEWER_FORMAT_ASCII_DEFAULT– VIEWER_FORMAT_ASCII_MATLAB– VIEWER_FORMAT_ASCII_COMMON– VIEWER_FORMAT_ASCII_INFO– etc.

viewersbeginner

Solution components, using runtime option -snes_vecmonitor

velocity: u velocity: v

temperature: Tvorticity:

Page 26: PETSc

Viewing Matrix Data• MatView(Mat A, Viewer

v);• Runtime options available

after matrix assembly– -mat_view_info

• info about matrix assembly– -mat_view_draw

• sparsity structure– -mat_view

• data in ASCII – etc.

viewersbeginner

Page 27: PETSc

Linear Solvers

Goal: Support the solution of linear systems, Ax=b,particularly for sparse, parallel problems arisingwithin PDE-based models

User provides:– Code to evaluate A, b

solvers:linearbeginner

Page 28: PETSc

PETSc

ApplicationInitialization Evaluation of A and b Post-

Processing

SolveAx = b PC KSP

Linear Solvers (SLES)

PETSc codeUser code

Linear PDE SolutionMain Routine

solvers:linearbeginner

Page 29: PETSc

Linear Solvers (SLES)

• Application code interface• Choosing the solver • Setting algorithmic options• Viewing the solver• Determining and monitoring convergence• Providing a different preconditioner matrix• Matrix-free solvers• User-defined customizations

SLES: Scalable Linear Equations Solvers

tutorial outline: solvers: linear

beginner

beginner

intermediate

intermediate

intermediate

intermediate

advanced

advanced

Page 30: PETSc

Linear Solvers in PETSc 2.0

• Conjugate Gradient• GMRES• CG-Squared• Bi-CG-stab• Transpose-free QMR• etc.

• Block Jacobi• Overlapping Additive

Schwarz• ICC, ILU via

BlockSolve95• ILU(k), LU (sequential

only)• etc.

Krylov Methods (KSP) Preconditioners (PC)

solvers:linearbeginner

Page 31: PETSc

Basic Linear Solver Code (C/C++)SLES sles; /* linear solver context */Mat A; /* matrix */Vec x, b; /* solution, RHS vectors */int n, its; /* problem dimension, number of iterations */

MatCreate(MPI_COMM_WORLD,n,n,&A); /* assemble matrix */VecCreate(MPI_COMM_WORLD,n,&x); VecDuplicate(x,&b); /* assemble RHS vector */

SLESCreate(MPI_COMM_WORLD,&sles); SLESSetOperators(sles,A,A,DIFFERENT_NONZERO_PATTERN);SLESSetFromOptions(sles);SLESSolve(sles,b,x,&its);

solvers:linearbeginner

Page 32: PETSc

Basic Linear Solver Code (Fortran)SLES sles Mat AVec x, binteger n, its, ierr

call MatCreate(MPI_COMM_WORLD,n,n,A,ierr) call VecCreate(MPI_COMM_WORLD,n,x,ierr)call VecDuplicate(x,b,ierr)

call SLESCreate(MPI_COMM_WORLD,sles,ierr)call SLESSetOperators(sles,A,A,DIFFERENT_NONZERO_PATTERN,ierr)call SLESSetFromOptions(sles,ierr)call SLESSolve(sles,b,x,its,ierr)

C then assemble matrix and right-hand-side vector

solvers:linearbeginner

Page 33: PETSc

Customization Options

• Procedural Interface– Provides a great deal of control on a usage-by-usage

basis inside a single code– Gives full flexibility inside an application

• Command Line Interface– Applies same rule to all queries via a database– Enables the user to have complete control at runtime,

with no extra coding

solvers:linearintermediate

Page 34: PETSc

Setting Solver Options within Code• SLESGetKSP(SLES sles,KSP *ksp)

– KSPSetType(KSP ksp,KSPType type)– KSPSetTolerances(KSP ksp,double

rtol,double atol,double dtol, int maxits)– ...

• SLESGetPC(SLES sles,PC *pc)– PCSetType(PC pc,PCType)– PCASMSetOverlap(PC pc,int overlap)– ...

solvers:linearintermediate

Page 35: PETSc

• -ksp_type [cg,gmres,bcgs,tfqmr,…]• -pc_type [lu,ilu,jacobi,sor,asm,…]

• -ksp_max_it <max_iters>• -ksp_gmres_restart <restart>• -pc_asm_overlap <overlap>• -pc_asm_type

[basic,restrict,interpolate,none]• etc ...

Setting Solver Options at Runtime

solvers:linearbeginner

1

intermediate

2

1

2

Page 36: PETSc

SLES: Review of Basic Usage• SLESCreate( ) - Create SLES context• SLESSetOperators( ) - Set linear operators• SLESSetFromOptions( ) - Set runtime solver

options for [SLES, KSP,PC]• SLESSolve( ) - Run linear solver• SLESView( ) - View solver options actually

used at runtime (alternative: -sles_view)• SLESDestroy( ) - Destroy solver

beginnersolvers:linear

Page 37: PETSc

SLES: Review of Selected Preconditioner Options

Functionality Procedural Interface Runtime Option

Set preconditioner type PCSetType( ) -pc_type [lu,ilu,jacobi, sor,asm,…]

Set level of fill for ILU PCILULevels( ) -pc_ilu_levels <levels>Set SOR iterations PCSORSetIterations( ) -pc_sor_its <its>Set SOR parameter PCSORSetOmega( ) -pc_sor_omega <omega>Set additive Schwarz variant

PCASMSetType( ) -pc_asm_type [basic, restrict,interpolate,none]

Set subdomain solver options

PCGetSubSLES( ) -sub_pc_type <pctype> -sub_ksp_type <ksptype> -sub_ksp_rtol <rtol>

And many more options...solvers: linear: preconditionersbeginner

1

intermediate

2

1

2

Page 38: PETSc

SLES: Review of Selected Krylov Method Options

solvers: linear: Krylov methodsbeginner

1

intermediate

2And many more options...

Functionality Procedural Interface Runtime Option

Set Krylov method KSPSetType( ) -ksp_type [cg,gmres,bcgs, tfqmr,cgs,…]

Set monitoring routine

KSPSetMonitor() -ksp_monitor, –ksp_xmonitor, -ksp_truemonitor, -ksp_xtruemonitor

Set convergence tolerances

KSPSetTolerances( ) -ksp_rtol <rt> -ksp_atol <at> -ksp_max_its <its>

Set GMRES restart parameter

KSPGMRESSetRestart( ) -ksp_gmres_restart <restart>

Set orthogonalization routine for GMRES

KSPGMRESSetOrthogon alization( )

-ksp_unmodifiedgramschmidt -ksp_irorthog

1

2

Page 39: PETSc

SLES: Runtime Script Example

solvers:linearintermediate

Page 40: PETSc

Viewing SLES Runtime Options

solvers:linearintermediate

Page 41: PETSc

SLES: Example Programs

• ex1.c, ex1f.F - basic uniprocessor codes • ex2.c, ex2f.F - basic parallel codes • ex11.c - using complex numbers

• ex4.c - using different linear system and preconditioner matrices• ex9.c - repeatedly solving different linear systems

• ex15.c - setting a user-defined preconditioner

And many more examples ...

Location: www.mcs.anl.gov/petsc/src/sles/examples/tutorials/

solvers:linear

1

2

beginner

1

intermediate

2

advanced

3

3