Introduction to scientific computing using PETSc and Trilinos Václav Hapla David Horák Michal Merta PRACE Spring School, Cracow 2012

Page 1: Introduction to scientific computing using PETSc and · PDF file · 2012-06-05Introduction to scientific computing using PETSc and Trilinos Václav Hapla David Horák Michal Merta

Introduction to scientific computing using PETSc and Trilinos

Václav Hapla, David Horák, Michal Merta

PRACE Spring School, Cracow 2012

Frameworks for scientific computing – why?

Should we reinvent the wheel?

many complex but well-known and often-used algorithms (LU, CG, matrix-vector multiply, …) have already been implemented and tested, and are ready to use

a software framework is software providing generic functionality that can be selectively changed by user code, thus providing application-specific software (wikipedia.org)

motivation: programmers should focus on new, original algorithms that add value

Both PETSc and Trilinos…

are parallelized on the data level (vectors & matrices) using MPI

use BLAS and LAPACK – de facto standard for dense LA

have their own implementation of sparse BLAS

include robust preconditioners, linear solvers (direct and iterative) and nonlinear solvers

can cooperate with many other external solvers and libraries (e.g. MATLAB, MUMPS, UMFPACK, …)

already support CUDA and hybrid parallelization

are licensed as open-source

PETSc

"essential object orientation"

for programmers used to procedural programming but seeking modular code

recommended for C and FORTRAN users

Trilinos

"pure object orientation"

for programmers who are not scared of OOP, appreciate good SW design and have some experience with C++

extensibility and reusability

Potential users

PETSc Library Václav Hapla

David Horák

PETSc project

PETSc programming primitives

Objects in PETSc

Vectors, index sets and matrices in PETSc

Linear solvers

Debugging & profiling

Outline of PETSc tutorial

PETSc project Václav Hapla

PETSc = Portable, Extensible Toolkit for Scientific Computation

developed by Argonne National Laboratory since 1991

data structures and routines for the scalable parallel solution of scientific applications modeled by PDEs

coded primarily in C, with good FORTRAN support; can also be called from C++ and Python codes

homepage: www.mcs.anl.gov/petsc

current stable version is 3.2

PETSc project (1)

petsc-dev (development branch) is evolving intensively

code and mailing lists open to anybody

portable to any parallel system supporting MPI

tightly coupled systems (Cray XT5, BG/P, Earth Simulator, Sun Blade, SGI Altix)

loosely coupled systems, such as networks of workstations (Linux, Windows, IBM, Mac, Sun)

iPhone support

PETSc project (2)

Developing parallel, nontrivial PDE solvers that deliver high performance is still difficult and requires months (or even years) of concentrated effort. PETSc is a toolkit that can ease these difficulties and reduce the development time, but it is not a black-box PDE solver, nor a silver bullet.

Barry Smith

(PETSc Team)

Role of PETSc

„We will continually add new features and enhanced functionality in upcoming releases; small changes in usage and calling sequences of PETSc routines will continue to occur. Although keeping one's code accordingly up-to-date can be annoying, all PETSc users will be rewarded in the long run with a cleaner, better designed, and easier-to-use interface.“

Changes

Documentation

all documentation is available at http://www.mcs.anl.gov/petsc/documentation/index.html

PETSc users manual – PDF (fully searchable, hypertext)

help topics – general topics such as "error handling", "multigrid", "shared memory"

manual pages – individual routines, split into 4 categories:

Beginner - basic usage

Intermediate - setting options for algorithms and data structures

Advanced - setting more advanced options and customization

Developer - interfaces intended primarily for library developers

PETSc is layered on top of MPI

MPI provides low-level tools to exchange data primitives between processes

PETSc provides medium-level tools such as

inserting a matrix element at an arbitrary location

parallel matrix-vector product

you do not need to know much about MPI

but you can call any MPI routine directly if needed

same code for sequential and parallel runs

Parallelism in PETSc

PETSc cooperates with... (1)

Python: petsc4py

Documentation utilities: Sowing, lgrind, c2html

MPI: MPICH, MPE, Open MPI

Dense LA: BLAS, LAPACK, BLACS, ScaLAPACK, PLAPACK

Graphs & load balancing: ParMetis, Chaco, Jostle, Party, Scotch, Zoltan

Direct linear solvers: MUMPS, Spooles, SuperLU, SuperLU_Dist, UMFPack

PETSc cooperates with... (2)

Iterative linear solvers: PaStiX, HYPRE

Multigrid: Trilinos ML

Eigenvalue solvers: BLOPEX

FFT: FFTW

Time-stepping: Sundials

Meshing: Triangle, TetGen, FIAT, FFC, Generator

Data exchange: HDF5

Boost

TAO - Toolkit for Advanced Optimization

SLEPc - Scalable Library for Eigenvalue Problems

fluidity - a finite element/volume fluids code

Prometheus - scalable unstructured finite element solver

freeCFD - general purpose CFD solver

OpenFVM - finite volume based CFD solver

OOFEM - object oriented finite element library

libMesh - adaptive finite element library

Packages that use/extend PETSc (1)

MOOSE - Multiphysics Object-Oriented Simulation Environment, developed at INL, built on top of libMesh, which is built on top of PETSc

DEAL.II - sophisticated C++ based finite element simulation package

PHAML - The Parallel Hierarchical Adaptive MultiLevel Project

Chaste - Cancer, Heart and Soft Tissue Environment

Packages that use/extend PETSc (2)

PETSc has been used for modeling in all of these areas: Acoustics, Aerodynamics, Air Pollution, Arterial Flow, Bone Fractures, Brain Surgery, Cancer Surgery, Cancer Treatment, Carbon Sequestration, Cardiology, Cells, CFD, Combustion, Concrete, Corrosion, Data Mining, Dentistry, Earthquakes...

Applications (1)

Applications (2)

Fracture mechanics

Mechanics - elasticity

Real-time surgery

Magma dynamics

PETSc installation in a nutshell

Václav Hapla

stable releases of PETSc can be downloaded via HTTP as a tarball

petsc-3.2-p7.tar.gz - full distribution (including all current patches) with documentation

petsc-lite-3.2-p7.tar.gz - smaller version with no documentation (all documentation may be accessed online)

Download - tarball

stable releases as well as the current development release can be downloaded using the Mercurial version control system

caution – the build system has its own separate repository!

stable:

hg clone http://petsc.cs.iit.edu/petsc/releases/petsc-3.2

hg clone http://petsc.cs.iit.edu/petsc/releases/BuildSystem-3.2 petsc-3.2/config/BuildSystem

dev:

hg clone http://petsc.cs.iit.edu/petsc/petsc-dev

hg clone http://petsc.cs.iit.edu/petsc/BuildSystem petsc-dev/config/BuildSystem

Download - Mercurial

./configure script written in Python

realizes PETSc auto-tuning capabilities

sets many internal variables and macros depending on the machine

generates makefile

--help – prints all options

see www.mcs.anl.gov/petsc/documentation/installation.html

Configuration

PETSC_DIR and PETSC_ARCH are variables that control the configuration and build process of PETSc

These variables can be set as environment variables or specified on the command line.

PETSC_DIR points to the location of the PETSc installation that is used.

Multiple PETSc versions can coexist on the same file-system. By changing PETSC_DIR value, one can switch between these installed versions of PETSc.

PETSC_DIR

PETSC_ARCH variable gives a name to a configuration and build.

configure uses this value to store the generated makefiles in ${PETSC_DIR}/${PETSC_ARCH}/conf.

make uses this value to determine the location of these makefiles.

program libraries (.a or .so) of PETSc and of downloaded external packages are stored in ${PETSC_DIR}/${PETSC_ARCH}/lib

Thus one can install multiple variants of the PETSc libraries by providing a different PETSC_ARCH value to each configure build.

One can then switch between these variants by changing the PETSC_ARCH value used.

PETSC_ARCH

PETSc supports dozens of external packages

[pkg] = mumps, superlu, parmetis, sprng, netcdf, ...

download and compile automatically:

--download-[pkg] - downloads and installs a package for you in ${PETSC_DIR}/${PETSC_ARCH}/lib

use an existing installation:

--with-[pkg]=<bool> - test for [pkg]

--with-[pkg]-dir=<dir> - the root directory of the [pkg] installation

--with-[pkg]-include=<dirs>

--with-[pkg]-lib=<libraries: e.g.[/Users/..../libboost.a,...]>

External packages

./configure --with-batch

for machines with a batch system

configure generates a special executable binary, conftest

run conftest on one computing node (e.g. by submitting a batch script)

it will generate a new ./reconfigure-$PETSC_ARCH script with machine-specific variables set (cache size etc.)

run ./reconfigure-$PETSC_ARCH to complete the configuration stage

Batch mode

after the configuration stage has completed successfully, you get a message like this:

Configure stage complete. Now build PETSc libraries with (cmake build):

make PETSC_DIR=/home/vhapla/devel/petsc-dev PETSC_ARCH=debug-so-mpich2-gnu all

you can copy and paste the make command

it will compile the source files and build the program library

it can make use of CMake if installed

significant speedup of compilation

shows progress percentage

Compilation

PETSc programming primitives Václav Hapla

#include "petsc.h"

#undef __FUNCT__

#define __FUNCT__ "main"

int main(int argc,char **argv)

Declare the name of each routine by redefining __FUNCT__ macro to get more useful tracebacks on error

Program header in C

program init

implicit none

#include "finclude/petsc.h"

FORTRAN has more limited error handling, one cannot use __FUNCT__ macro

If you are familiar with C, please use C.

We will focus on PETSc C interface.

Program header in F

You can include all PETSc headers at once:

#include "petsc.h" //includes all PETSc headers

Or you can include specific headers:

#include "petscsys.h" //framework routines

#include "petscvec.h" //vectors

#include "petscmat.h" //matrices

Higher level headers include all lower level headers needed

#include "petscksp.h" //includes vec,mat,dm,pc

What headers to include?

Initialize & Finalize (1)

static char help[] = "Empty program.\n\n";

#include <petscsys.h>

int main(int argc,char **argv)

{

PetscErrorCode ierr; /* holds the return code checked by CHKERRQ */

ierr = PetscInitialize(&argc,&argv,(char *)0,help);CHKERRQ(ierr);

ierr = PetscFinalize();CHKERRQ(ierr);

return 0;

}

Every PETSc program begins with a call to PetscInitialize()

and ends with a call to PetscFinalize()

these call MPI_Init() and MPI_Finalize() (unless MPI has already been initialized by the program)

Initialize & Finalize (2)

static char help[] = "Empty program.\n\n";

#include <petscsys.h>

int main(int argc,char **argv)

{

PetscErrorCode ierr; /* holds the return code checked by CHKERRQ */

ierr = PetscInitialize(&argc,&argv,(char *)0,help);CHKERRQ(ierr);

ierr = PetscFinalize();CHKERRQ(ierr);

return 0;

}

argc,argv - propagate command line arguments to PETSc and MPI

help - additional help messages to print when the executable is invoked with the cmd-line-arg -help (will be discussed later)

PETSc is written in C

C has no exceptions

instead of throwing an exception, every routine returns an integer error code (PetscErrorCode type)

the error code is "caught" by the CHKERRQ(ierr) macro, which on a nonzero code reports the error and returns the code to the caller

PetscErrorCode ierr;

ierr = SomePetscRoutine();CHKERRQ(ierr);

Error handling (1)
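The return-code pattern above can be sketched in plain C, without PETSc itself. This is only an illustration under stated assumptions: the names SketchErrorCode, SKETCH_FUNCT, SKETCH_CHK, sketch_divide and sketch_average_ratio are hypothetical, and the real CHKERRQ goes through PETSc's error handlers to print a full traceback.

```c
#include <stdio.h>

typedef int SketchErrorCode;   /* 0 means success, as with PetscErrorCode */

/* Name of the current routine; redefined before each function,
   in the spirit of PETSc's __FUNCT__ macro. */
#define SKETCH_FUNCT "unknown"

/* On a nonzero code: report routine and line, then return the code
   to the caller so the error propagates up the call stack. */
#define SKETCH_CHK(ierr)                                      \
  do {                                                        \
    if ((ierr) != 0) {                                        \
      fprintf(stderr, "error %d in %s() at line %d\n",        \
              (ierr), SKETCH_FUNCT, __LINE__);                \
      return (ierr);                                          \
    }                                                         \
  } while (0)

#undef SKETCH_FUNCT
#define SKETCH_FUNCT "sketch_divide"
SketchErrorCode sketch_divide(double a, double b, double *result)
{
  if (b == 0.0) return 1;      /* hypothetical "division by zero" code */
  *result = a / b;
  return 0;
}

#undef SKETCH_FUNCT
#define SKETCH_FUNCT "sketch_average_ratio"
SketchErrorCode sketch_average_ratio(double a, double b, double c,
                                     double *result)
{
  SketchErrorCode ierr;
  double          r1, r2;

  ierr = sketch_divide(a, c, &r1); SKETCH_CHK(ierr);
  ierr = sketch_divide(b, c, &r2); SKETCH_CHK(ierr);
  *result = 0.5 * (r1 + r2);
  return 0;
}
```

Calling sketch_average_ratio(2.0, 4.0, 0.0, &r) returns the nonzero code from the first failing call and leaves r untouched, which is the behaviour the ierr/CHKERRQ idiom relies on.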

#include <petscsys.h>

int main(int argc,char **argv)

{

PetscErrorCode ierr; /* holds the return code checked by CHKERRQ */

ierr = PetscFinalize(); CHKERRQ(ierr);

return 0;

}

This code fails with the error: PetscInitialize() must be called before PetscFinalize()

(+ stacktrace)

Error handling (2)

Communicators

communicator = an opaque object of MPI_Comm type that defines process group and synchronization channel

PETSc built-in communicators: PETSC_COMM_SELF – just this process – for serial objects

PETSC_COMM_WORLD – all processes – for parallel objects

MPI can split communicators, spawn processes on new communicators – PETSc does not deal with it

Function Collectiveness

1. Not Collective – neither communication nor synchronization, e.g. VecGetLocalSize(), MatSetValues()

2. Logically Collective – checked when running in debug mode, e.g. KSPSetType(), PCMGSetCycleType()

3. Neighbor-wise Collective – point-to-point communication between two processes, e.g. VecScatterBegin(), MatMult()

4. Collective – global communication, synchronous, e.g. VecNorm(), MatAssemblyBegin(), KSPCreate()

PETSc provides many useful utilities

prefixed by Petsc

parallel flow control: Barrier, SequentialPhaseBegin/End

memory management and checking: Malloc,Free,MallocValidate,MallocDump

Utility routines (1)

logging: PetscLogEventRegister/Begin/End

string handling: Strcat/cmp/cpy/len/tolower/replace/ToArray

MATLAB engine interface: MatlabEngineCreate/Destroy/Evaluate

and many more

Utility routines (2)

PetscInt n = 20;

PetscScalar v = -3.5, w = 3.1e9;

PetscReal x = 2.55, y = 1e-9;

PETSc has its own typedefs for numeric data types

It is better to use them instead of built-in C types

Better portability and easier switching between

real and complex numbers

32-bit and 64-bit numbers

Primitive datatypes
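Why library-level typedefs ease switching between widths can be sketched in plain C. The names SketchInt, SketchReal, sketch_dot and the two SKETCH_USE_* switches are hypothetical and only stand in for the kind of build-time choice PETSc makes at configure time.

```c
#include <stdint.h>

/* Hypothetical build-time switch for the index width. */
#if defined(SKETCH_USE_64BIT_INDICES)
typedef int64_t SketchInt;
#else
typedef int32_t SketchInt;
#endif

/* Hypothetical build-time switch for the floating-point precision. */
#if defined(SKETCH_USE_SINGLE_PRECISION)
typedef float SketchReal;
#else
typedef double SketchReal;
#endif

/* Written once against the typedefs: recompiling with a different
   switch changes the widths without touching this routine. */
SketchReal sketch_dot(SketchInt n, const SketchReal *x, const SketchReal *y)
{
  SketchReal sum = 0;
  SketchInt  i;

  for (i = 0; i < n; i++) sum += x[i] * y[i];
  return sum;
}
```

Code written with the built-in int and double instead would have to be edited by hand for every such change.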

PETSc provides routines for managing the options database

in your program, you can call routines PetscOptionsGetInt,

PetscOptionsGetString,

PetscOptionsGetReal, etc. to obtain the values

Options (1)

Example:

on the command line:

./yourapp -myint 10 -myreal 1e3

in program yourapp:

PetscReal myreal; PetscInt myint;

PetscOptionsGetInt(PETSC_NULL,"-myint",&myint,PETSC_NULL);

PetscOptionsGetReal(PETSC_NULL,"-myreal",&myreal,PETSC_NULL);

Options (2)
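What such a lookup does conceptually can be sketched in plain C as a scan of the argument list for a "-name value" pair. The helper sketch_options_get_int is hypothetical and is not how PETSc implements its options database, which also merges files, environment variables and prefixes.

```c
#include <stdlib.h>
#include <string.h>

/* Look for "-name value" among the arguments.
   Found and valid: store the value, set *found to 1, return 0.
   Not found: leave *value untouched, set *found to 0, return 0.
   Found but not an integer: return 1. */
int sketch_options_get_int(int argc, char **argv, const char *name,
                           long *value, int *found)
{
  int i;

  *found = 0;
  for (i = 1; i + 1 < argc; i++) {
    if (strcmp(argv[i], name) == 0) {
      char *end;
      long  v = strtol(argv[i + 1], &end, 10);

      if (end == argv[i + 1] || *end != '\0') return 1;
      *value = v;
      *found = 1;
      return 0;
    }
  }
  return 0;
}
```

Leaving the value untouched when the option is absent lets the caller assign a default before the call, which is also how the PetscInt n = 20 style of default fits together with an optional command-line override.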

-help command-line argument prints essential info about the PETSc-based program:

program description (the last argument of PetscInitialize())

options specific for the program

general built-in options

built-in options relevant for the program

PETSc version

command-line help

trainee@pss2012vm:~/petsc-tutorial$ ./ex2 -help

Solves a linear system in parallel with KSP.

Input parameters include:

-random_exact_sol : use a random exact solution vector

-view_exact_sol : write exact solution vector to stdout

-m <mesh_x> : number of mesh points in x-direction

-n <mesh_n> : number of mesh points in y-direction

-----------------------------------------------------------

Petsc Release Version 3.2.0, Patch 7, Thu Mar 15 09:30:51 CDT 2012

...

-----------------------------------------------------------

Options for all PETSc programs:

-help: prints help method for each option

-on_error_abort: cause an abort when an error is detected. Useful

only when run in the debugger

...

command-line help - example

command line

filename in the third argument of PetscInitialize()

~/.petscrc

$PWD/.petscrc

$PWD/petscrc

PetscOptionsInsertFile()

PetscOptionsInsertString()

PETSC_OPTIONS environment variable

command line option -options_file [file]

Ways to set options

C: PetscErrorCode PetscPrintf(MPI_Comm comm,const char format[],...)

prints to standard output

only from the first processor in the communicator comm

F: PetscPrintf(MPI_Comm, character(*), PetscErrorCode)

limited support in FORTRAN

only a single character string can be passed

Print to standard output
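The "first processor only" behaviour can be sketched in plain C as a printf wrapper gated on the rank. The function sketch_printf_rank0 is hypothetical, and the rank is passed in explicitly here, whereas PETSc obtains it from the communicator.

```c
#include <stdarg.h>
#include <stdio.h>

/* Print only when rank is 0; all other ranks return silently.
   Returns the number of characters written (0 on silent ranks). */
int sketch_printf_rank0(int rank, const char *format, ...)
{
  va_list args;
  int     written;

  if (rank != 0) return 0;
  va_start(args, format);
  written = vprintf(format, args);   /* forward the variable arguments */
  va_end(args);
  return written;
}
```

Every process makes the same call, but only one line appears on standard output, so the calling code needs no rank tests of its own.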

static char help[] = "Hello world program.\n\n";

#include <petscsys.h>

int main(int argc,char **argv)

{

PetscErrorCode ierr;

PetscMPIInt rank;

PetscInitialize(&argc,&argv,(char *)0,help);

MPI_Comm_rank(PETSC_COMM_WORLD,&rank);

PetscPrintf(PETSC_COMM_SELF,"Hello World from %d\n",rank);

PetscFinalize();

return 0;

}

PETSc Hello world in C

program main

integer ierr, rank

#include "include/finclude/petsc.h"

call PetscInitialize(PETSC_NULL_CHARACTER, ierr)

call MPI_Comm_rank(PETSC_COMM_WORLD, rank, ierr)

if (rank .eq. 0) then

print *, 'Hello World from ', rank

endif

call PetscFinalize(ierr)

end

PETSc Hello world in F

static char help[] = "Hello world program.\n\n";

#include <petscsys.h>

int main(int argc,char **argv)

{

PetscErrorCode ierr;

PetscMPIInt rank;

ierr = PetscInitialize(&argc,&argv,(char *)0,help);CHKERRQ(ierr);

ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr);

ierr = PetscPrintf(PETSC_COMM_SELF,"Hello World from %d\n",

rank);CHKERRQ(ierr);

ierr = PetscFinalize();

return 0;

}

PETSc Hello world in C - with error checking

To obtain output of the first processor followed by that of the second, etc., one can call:

PetscSynchronizedPrintf(PETSC_COMM_WORLD,

"Hello World from %d\n",rank);

PetscSynchronizedFlush(PETSC_COMM_WORLD);

Output: Hello World from 0

Hello World from 1

Hello World from 2

Synchronized print

Objects in PETSc Václav Hapla

Hierarchy of components (diagram; labels: level of abstraction, parallelization, user)

Application

PETSc:

Nonlinear solvers (SNES)

Time Steppers (TS)

Linear solvers (KSP)

Preconditioners (PC)

Matrices (Mat)

Vectors (Vec)

Index Sets (IS)

MPI, BLAS, LAPACK

every object in PETSc belongs to some communicator

MPI_Comm is the first argument of every object's constructor

two objects can only interact if they belong to the same communicator

Objects and communicators

PETSc uses specific and limited inheritance

every object in PETSc is an instance of a class: Vec, Mat, KSP, SNES, …

functions called on objects (= methods in C++) are prefixed by a class name: MatMult(Mat,…)

class is specified when the object is created using the proper Create function (= constructor in C++):

Mat A;

MatCreate(comm, &A);

PETSc object oriented design: classes

PETSc object oriented design: types

classes are further subdivided into types: seqaij, mpidense, composite, … (= sequential sparse, parallel dense, implicit matrix addition/multiplication)

the type of an object is specified during the object's lifetime:

Mat A;

MatCreate(comm, &A);

MatSetType(A, MATSEQAIJ);

Mat A,B; Vec x; KSP solver; are opaque objects

you don't access inner fields directly

in include/petscmat.h you can find typedef struct _p_Mat* Mat;

so B = A only copies pointer, not data

prevents unwanted data copying

makes pointer handling easier

allows hiding implementation from public interface → polymorphism

PETSc object oriented design: opaque objects
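The opaque-handle idea can be sketched in plain C. SketchVec and its four routines are hypothetical names used only for illustration; in a real library the struct definition would live in a private implementation file rather than next to the typedef.

```c
#include <stdlib.h>

/* Public part: users only ever see the pointer type. */
typedef struct SketchVecImpl *SketchVec;

/* Private part: hidden from user code in a real library. */
struct SketchVecImpl {
  int     n;
  double *data;
};

int SketchVecCreate(int n, SketchVec *v)
{
  SketchVec w = malloc(sizeof(*w));

  if (!w) return 1;
  w->n    = n;
  w->data = calloc((size_t)n, sizeof(double));
  if (!w->data) { free(w); return 1; }
  *v = w;
  return 0;
}

int SketchVecSetValue(SketchVec v, int i, double value)
{
  if (i < 0 || i >= v->n) return 1;   /* index out of range */
  v->data[i] = value;
  return 0;
}

int SketchVecGetValue(SketchVec v, int i, double *value)
{
  if (i < 0 || i >= v->n) return 1;   /* index out of range */
  *value = v->data[i];
  return 0;
}

int SketchVecDestroy(SketchVec *v)
{
  if (!*v) return 0;
  free((*v)->data);
  free(*v);
  *v = NULL;                          /* nullify the caller's handle */
  return 0;
}
```

After SketchVec b = a; both handles refer to the same storage, so a value set through b is read back through a. No data is copied by the assignment.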

Polymorphism

MatMult(Mat A,Vec x,Vec y); //y = A*x

public interface

uniform for all types of matrices: sequential, parallel, dense, sparse, …

documented

calls private implementation based on type: MatMult_SeqDense(Mat A,Vec x,Vec y)

hidden, specific for each matrix type
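One common way to get this kind of dispatch in C is a table of function pointers chosen when the type is set. The sketch below assumes that approach. All names (SketchMat, SketchMatOps, SKETCH_DENSE, SKETCH_DIAGONAL and the SketchMatMult routines) are hypothetical and do not reproduce PETSc's internal structures.

```c
typedef struct SketchMatImpl *SketchMat;

/* Table of type-specific implementations. */
struct SketchMatOps {
  int (*mult)(SketchMat A, const double *x, double *y);
};

struct SketchMatImpl {
  const struct SketchMatOps *ops;     /* selected when the type is set */
  int                        n;       /* square n-by-n matrix */
  const double              *values;  /* layout depends on the type */
};

/* Private: dense row-major storage, n*n entries. */
static int SketchMatMult_Dense(SketchMat A, const double *x, double *y)
{
  int i, j;

  for (i = 0; i < A->n; i++) {
    y[i] = 0.0;
    for (j = 0; j < A->n; j++) y[i] += A->values[i * A->n + j] * x[j];
  }
  return 0;
}

/* Private: diagonal storage, n entries. */
static int SketchMatMult_Diagonal(SketchMat A, const double *x, double *y)
{
  int i;

  for (i = 0; i < A->n; i++) y[i] = A->values[i] * x[i];
  return 0;
}

const struct SketchMatOps SKETCH_DENSE    = { SketchMatMult_Dense };
const struct SketchMatOps SKETCH_DIAGONAL = { SketchMatMult_Diagonal };

/* Public: one entry point for every type, y = A*x. */
int SketchMatMult(SketchMat A, const double *x, double *y)
{
  if (!A || !A->ops || !A->ops->mult) return 1;
  return A->ops->mult(A, x, y);
}
```

User code calls only SketchMatMult. Adding a new storage format means adding one private routine and one table entry, without changing any caller.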

PetscObject (1)

Every PETSc object can be cast to PetscObject:

Mat A;

PetscObject obj;

obj = (PetscObject) A;

PetscObject provides general methods such as:

Get/SetName() – name the object (used for printing, MATLAB interface, etc.)

GetType() – the type of the object

GetComm() – the communicator the object belongs to

PetscObject (2)

Mat A;

const char *type;

MPI_Comm comm;

PetscObjectGetComm((PetscObject)A,&comm);

PetscObjectGetType((PetscObject)A,&type);

//is the same as

MatGetType(A,&type);
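A cast to a common base type is legal in C when every object struct starts with the same header, because a pointer to a struct may be converted to a pointer to its first member. The sketch below illustrates that technique with hypothetical names (SketchHeader, SketchObject, SketchMatrix, SketchVector). It is an assumption-laden illustration, not a copy of PETSc's headers.

```c
/* Fields shared by every class; must be the FIRST member of each object. */
struct SketchHeader {
  const char *class_name;
  const char *type_name;
};
typedef struct SketchHeader *SketchObject;

typedef struct SketchMatrixImpl {
  struct SketchHeader hdr;
  int                 rows, cols;
} *SketchMatrix;

typedef struct SketchVectorImpl {
  struct SketchHeader hdr;
  int                 size;
} *SketchVector;

/* Generic routine: works for any object that starts with SketchHeader. */
int SketchObjectGetType(SketchObject obj, const char **type)
{
  if (!obj) return 1;
  *type = obj->type_name;
  return 0;
}

/* Class-specific wrapper that simply forwards to the generic routine. */
int SketchMatrixGetType(SketchMatrix A, const char **type)
{
  return SketchObjectGetType((SketchObject)A, type);
}
```

Both calls return the same string for a matrix, mirroring the way the generic and the class-specific getters on the slide give the same answer.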

PETSc inheritance (diagram: classes and their types)

once again: method names must be prefixed by the class name: Vec, Mat, KSP, etc.

all PETSc built-in classes support the following methods

Create() - create the object

Get/SetType() - set the implementation type

Common methods (1)

SetFromOptions() - set all options of the object from the options database

Get/SetOptionsPrefix() - set a specific option prefix for the given object

SetUp() - prepare the object inner state for computation

View() - print object info to specified output

Destroy() - deallocate the memory used by the object

Common methods (2)

The Destroy method uses simple reference counting.

Each call decrements the reference counter and nullifies the caller's pointer.

If the counter is still > 0, nothing else happens.

If the reference count reaches 0:

call the type-specific private destroy routine

deallocate the whole object

So PETSc uses the "destroy always" paradigm

unlike smart pointers in the new C++ standard, Boost or Trilinos RCP, which use the "destroy never" paradigm

Destroy
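The "destroy always" rule can be sketched in plain C with an explicit reference counter. SketchObj, SketchObjReference and the sketch_live_objects counter are hypothetical names. The counter exists only so the effect can be observed.

```c
#include <stdlib.h>

typedef struct SketchObjImpl *SketchObj;

struct SketchObjImpl {
  int refct;                  /* number of handles sharing the object */
};

int sketch_live_objects = 0;  /* bookkeeping for the demonstration only */

int SketchObjCreate(SketchObj *obj)
{
  SketchObj o = malloc(sizeof(*o));

  if (!o) return 1;
  o->refct = 1;
  sketch_live_objects++;
  *obj = o;
  return 0;
}

/* Register one more handle to the same object. */
int SketchObjReference(SketchObj obj)
{
  obj->refct++;
  return 0;
}

/* Called by every owner: drops one reference, nullifies the caller's
   handle, and frees the object when the last reference is gone. */
int SketchObjDestroy(SketchObj *obj)
{
  if (!*obj) return 0;
  if (--(*obj)->refct == 0) {
    free(*obj);
    sketch_live_objects--;
  }
  *obj = NULL;
  return 0;
}
```

With two handles a and b to one object, destroying a only nullifies a. The memory is released when b is destroyed as well, so each owner can call Destroy unconditionally.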

PETSc contains special PetscViewer class for printing to stdout, files (several text and binary formats), strings or even socket connection

basic usage: PetscViewer viewer;

PetscViewerCreate(comm, &viewer);

PetscViewerSetType(viewer, PETSCVIEWERASCII);

PetscViewerDestroy(&viewer);

prints only from the first processor of comm

Viewers (1)

predefined viewers: PETSC_VIEWER_STDOUT_WORLD, PETSC_VIEWER_BINARY_SELF, ...

every PETSc object can be viewed by the viewer:

PetscViewer v; Mat A; Vec x;

...

MatView(A,v);

VecView(x,v);

Viewers (2)

#include <petscviewer.h>

int main(int argc,char **args)

{

PetscViewer viewer;

PetscInt i;

PetscInitialize(&argc,&args,(char *)0,(char *)0);

PetscViewerCreate(PETSC_COMM_WORLD, &viewer);

PetscViewerSetType(viewer, PETSCVIEWERASCII);

PetscViewerFileSetMode(viewer, FILE_MODE_APPEND);

PetscViewerFileSetName(viewer, "test.txt");

for(i = 0; i <= 5; i++) {

PetscViewerASCIIPrintf(viewer, "test line %d\n", i);

}

PetscViewerDestroy(&viewer);

PetscFinalize();

return 0;

}

PetscViewer Example (1)

This program will append the following text to the file test.txt:

test line 0

test line 1

test line 2

test line 3

test line 4

test line 5

PetscViewer Example (2)

Vectors, index sets and matrices in PETSc

David Horák

Vec v;

VecCreate(comm,&v); /* comm is an MPI_Comm */

VecDestroy(&v);

a vector is an array of PetscScalars

the vector object is not completely created in one call, you must at least set sizes: VecSetSizes(Vec v, PetscInt m, PetscInt M); /* m = local size, M = global size */

Create another vector with the same type and layout: VecDuplicate(Vec v,Vec *w);

Vec: Vectors

Create a vector from an existing array

Create vector from user provided array:

VecCreateSeqWithArray(MPI_Comm comm,

PetscInt n, const PetscScalar array[],

Vec *v)

VecCreateMPIWithArray(MPI_Comm comm,

PetscInt n, PetscInt N,

const PetscScalar array[], Vec *vv)

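Global size can be left to PETSc (pass PETSC_DETERMINE): it is computed as the sum of the local sizes.

Local size can be left to PETSc (pass PETSC_DECIDE): the global size is split among the processes. The two cannot both be left open at once.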

Vector parallel layout

Query vector layout:

VecGetOwnershipRange(Vec x, PetscInt *low,

PetscInt *high)

Create general layout:

PetscSplitOwnership(MPI_Comm comm,PetscInt *n,

PetscInt *N)

Ownership Range
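A minimal sketch of how a global size can be split into local sizes and ownership ranges, in plain C without MPI. It assumes the common convention of an even split with the remainder going to the lowest ranks; the function names `local_size` and `ownership_range` are hypothetical. As with VecGetOwnershipRange, `high` is one more than the last owned index.

```c
#include <assert.h>

/* Local size of `rank` when N entries are split over `size` processes:
   the first N % size ranks receive one extra entry. */
long local_size(long N, int size, int rank) {
    assert(size > 0 && rank >= 0 && rank < size);
    return N / size + ((N % size) > rank ? 1 : 0);
}

/* Half-open ownership range [*low, *high) of `rank`. */
void ownership_range(long N, int size, int rank, long *low, long *high) {
    long start = 0;
    for (int r = 0; r < rank; r++) start += local_size(N, size, r);
    *low = start;
    *high = start + local_size(N, size, rank);
}
```

For N = 10 on 3 processes this gives local sizes 4, 3, 3 and ranges [0,4), [4,7), [7,10).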

Vec x;

Set all entries of vector to constant value: VecSet(Vec,PetscScalar)

VecSet(x,1.0);

Set individual elements (global indexing !): VecSetValues(Vec,PetscInt,PetscInt*,

PetscScalar*,InsertMode);

i = 1; v = 3.14;

VecSetValues(x,1,&i,&v,INSERT_VALUES);

/* equivalent to: */

VecSetValue(x,i,v,INSERT_VALUES);

Setting vector values (1)

Setting vector values (2)

Set more entries at once: ii[0]=1; ii[1]=2; vv[0]=2.7; vv[1]=3.1;

VecSetValues(x,2,ii,vv,INSERT_VALUES);

The last argument can be INSERT_VALUES - replace original value

ADD_VALUES - add to original value

VecSetValues is not collective, values are cached

after setting all values, you must call assembly routine to exchange values between processors: VecAssemblyBegin(Vec x);

VecAssemblyEnd(Vec x);

get a copy of entries of x with indices ix to an array y:

VecGetValues(Vec x, PetscInt ni, const PetscInt ix[],

PetscScalar y[])

user must provide an allocated array y

get the pointer to the internal array:

Vec x; PetscScalar *a;

VecGetArray(Vec x,PetscScalar *a[]);

/* do something with the array */

VecRestoreArray(Vec x,PetscScalar *a[]);

gives access to the local entries only; see VecScatter for the general case

Getting values

PetscInt localsize,first,i;

PetscScalar *a;

VecGetLocalSize(x,&localsize);

VecGetOwnershipRange(x,&first,PETSC_NULL);

VecGetArray(x,&a);

for (i=0; i<localsize; i++)

printf("Vector element %d : %e\n",

(int)(first+i),(double)PetscRealPart(a[i]));

VecRestoreArray(x,&a);

Getting values example

VecAXPY(Vec y,PetscScalar a,Vec x); /* y = y + a*x */

VecAYPX(Vec y,PetscScalar a,Vec x); /* y = a*y + x */

VecScale(Vec x, PetscScalar a);

VecDot(Vec x, Vec y, PetscScalar *r); /* several variants */

VecMDot(Vec x,PetscInt n,Vec y[],PetscScalar *r);

VecNorm(Vec x,NormType type,PetscReal *r);

VecSum(Vec x, PetscScalar *r);

VecCopy(Vec x, Vec y);

VecSwap(Vec x, Vec y);

Basic operations (1)
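What the AXPY-style operations above compute can be written out in plain C on ordinary arrays. This is a sequential sketch only; the function names `axpy`, `aypx` and `dot` are chosen here for illustration and are not PETSc calls.

```c
#include <assert.h>
#include <stddef.h>

/* y = y + a*x, the operation VecAXPY performs */
void axpy(size_t n, double *y, double a, const double *x) {
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; i++) y[i] += a * x[i];
}

/* y = a*y + x, the operation VecAYPX performs */
void aypx(size_t n, double *y, double a, const double *x) {
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; i++) y[i] = a * y[i] + x[i];
}

/* Real dot product of x and y */
double dot(size_t n, const double *x, const double *y) {
    double s = 0.0;
    for (size_t i = 0; i < n; i++) s += x[i] * y[i];
    return s;
}
```

In a parallel vector each process would run such a loop over its local part; a dot product additionally needs a global reduction.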

VecPointwiseMult(Vec w,Vec x,Vec y);

VecPointwiseDivide(Vec w,Vec x,Vec y);

VecMAXPY(Vec y,PetscInt n, PetscScalar *a, Vec x[]);

VecMax(Vec x, PetscInt *idx, PetscReal *r);

VecMin(Vec x, PetscInt *idx, PetscReal *r);

VecAbs(Vec x);

VecReciprocal(Vec x);

VecShift(Vec x,PetscScalar s);

Basic operations (2)

Index Set is a set of indices

generalization of an integer array

can be distributed (if comm has more than one process)

general IS: IS is; PetscInt indices[]={1,3,7}; PetscInt n=3;

ISCreateGeneral(comm,n,indices,PETSC_COPY_VALUES,&is);

/* indices can now be freed */

ISCreateGeneral(comm,n,indices,PETSC_OWN_POINTER,&is);

/* is takes ownership of indices (which must then have been allocated

with PetscMalloc) and frees them when ISDestroy(&is) is called */

IS: Index Sets (1)

IS: Index Sets (2)

stride IS

in MATLAB: is = 0:2:2*(n-1) (n indices)

in PETSc:

ISCreateStride(comm,n,0,2,&is); /* n = number of indices */

ISDestroy(&is);

Various manipulations: ISSum, ISDifference, ISInvertPermutation

To get the values given by isx from x and put them at positions

determined by isy into y:

VecScatterCreate(Vec x,IS isx,Vec y,IS isy,VecScatter*)

VecScatterBegin(VecScatter,Vec x,Vec y,InsertMode,

ScatterMode)

VecScatterEnd(VecScatter,Vec x,Vec y,InsertMode,

ScatterMode)

VecScatterDestroy(VecScatter*)

IS & VecScatters

Creating a vector and a scatter context that copies all values of MPI

vector vin to each processor into Seq. vector vout :

VecScatterCreateToAll(Vec vin,VecScatter *ctx,Vec *vout)

Creating an output vector and a scatter context used to copy all

values of MPI vector vin into the seq. vector vout on the zeroth process

VecScatterCreateToZero(Vec vin,VecScatter *ctx,Vec *vout)

Standard sequence follows: VecScatterBegin(), VecScatterEnd(),

VecScatterDestroy()

Other VecScatters

The usual create/destroy calls:

MatCreate(MPI_Comm comm,Mat *A);

MatDestroy(Mat *A);

Several more aspects to creation:

MatSetType(A,MATSEQAIJ); /*or MATMPIAIJ,MATAIJ */

MatSetSizes(Mat A,PetscInt m,PetscInt n,PetscInt M,

PetscInt N);

MatSeqAIJSetPreallocation(Mat B, PetscInt nz,

const PetscInt nnz[]);

Local or global size can be PETSC_DECIDE.

Mat: Matrices

MatCreateSeqAIJ(MPI_Comm comm, PetscInt m, PetscInt n,

PetscInt nz, const PetscInt nnz[],Mat *A);

nz - expected number of nonzeros per row (or slight overestimate)

nnz - array of expected row lengths (or slight overestimates)

considerable savings over dynamic allocation!

Matrix creation all in one

MatCreateMPIAIJ(MPI_Comm comm,PetscInt m,

PetscInt n,PetscInt M,PetscInt N,

PetscInt d_nz,const PetscInt d_nnz[],

PetscInt o_nz,const PetscInt o_nnz[],

Mat *A);

d_nz - # of nonzeros per row in diagonal part

o_nz - # of nonzeros per row in off-diagonal part

d_nnz - array of # of nonzeros per row in diagonal part

o_nnz - array of # of nonzeros per row in off-diagonal part

Matrix creation all in one

Basic matrix types

MATAIJ, MATSEQAIJ, MATMPIAIJ

basic sparse format, known as compressed row format, CRS, Yale

MATAIJ is identical to MATSEQAIJ when constructed with a single process communicator, and MATMPIAIJ otherwise.

MATBAIJ, MATSEQBAIJ, MATMPIBAIJ

extensions of the AIJ formats described above

store matrix elements by fixed-sized dense blocks

intended especially for use with multicomponent PDEs

multiple DOFs per mesh node

MATDENSE, MATSEQDENSE, MATMPIDENSE

dense matrices

MatGetSize(Mat mat, PetscInt *M, PetscInt* N);

MatGetLocalSize(Mat mat, PetscInt *m, PetscInt* n);

MatGetOwnershipRange(Mat A, PetscInt *first_row,

PetscInt *last_row); /* last_row is one past the last local row */

Querying parallel structure

MatGetVecs(Mat mat,Vec *right,Vec *left)

right - vector that the matrix can be multiplied against

left - vector that the matrix vector product can be stored in

both can be PETSC_IGNORE

Compatible vectors

PETSc matrix creation is very flexible

no sparsity pattern has to be given in advance

any processor can set any element => potential for lots of malloc calls

malloc is very expensive

remedy: tell PETSc the matrix's sparsity structure beforehand (do the construction loop twice: once counting, once inserting)

MatSeqAIJSetPreallocation(Mat B,

PetscInt nz, const PetscInt nnz[]);

Matrix Preallocation
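The "loop twice" idea can be sketched in plain C for the n x n tridiagonal matrix tridiag(-1, 2, -1). The function names are hypothetical; the array `nnz` computed in the first pass corresponds to the row-length array that the preallocation call above expects.

```c
#include <assert.h>

/* Pass 1: count the nonzeros of each row without storing anything. */
void count_row_lengths(int n, int *nnz) {
    assert(n > 0);
    for (int i = 0; i < n; i++)
        nnz[i] = 1 + (i > 0) + (i < n - 1);
}

/* Turn row lengths into row offsets; the return value is the total
   number of nonzeros, i.e. the size of the single allocation needed. */
int row_offsets(int n, const int *nnz, int *row_ptr) {
    row_ptr[0] = 0;
    for (int i = 0; i < n; i++) row_ptr[i + 1] = row_ptr[i] + nnz[i];
    return row_ptr[n];
}

/* Pass 2: fill the preallocated arrays; nothing is reallocated. */
void fill_tridiag(int n, const int *row_ptr, int *col, double *val) {
    for (int i = 0; i < n; i++) {
        int k = row_ptr[i];
        if (i > 0)     { col[k] = i - 1; val[k++] = -1.0; }
        col[k] = i; val[k++] = 2.0;
        if (i < n - 1) { col[k] = i + 1; val[k++] = -1.0; }
        assert(k == row_ptr[i + 1]); /* row used exactly its share */
    }
}
```

The cost of the extra counting pass is usually small compared with repeated reallocation during insertion.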

Set one value:

MatSetValue(Mat v, PetscInt i,PetscInt j,

PetscScalar va,InsertMode mode);

where insert mode is INSERT_VALUES, ADD_VALUES

Set logically 2-D array of values:

MatSetValues(Mat A,

PetscInt m, const PetscInt idxm[],

PetscInt n, const PetscInt idxn[],

const PetscScalar values[], InsertMode mode);

Setting values

MatSetValues is not collective, values are cached

MatAssemblyBegin(Mat A,MAT_FINAL_ASSEMBLY);

MatAssemblyEnd(Mat A,MAT_FINAL_ASSEMBLY);

INSERT_VALUES and ADD_VALUES cannot be mixed

an assembly is needed in between (MAT_FLUSH_ASSEMBLY is sufficient for that)

Assembling the matrix

MatGetValues(Mat mat, PetscInt m, const PetscInt

idxm[], PetscInt n, const PetscInt idxn[],

PetscScalar v[])

Gets a block of values given by idxm and idxn from a matrix, only returns a local block

mat - the matrix

v - a logically two-dimensional array for storing the values

m, idxm - the number of rows and their global indices

n, idxn - the number of columns and their global indices

The user must allocate space (m*n PetscScalars) for the values v which are then returned in a row-oriented format, analogous to that used by default in MatSetValues()

Getting Values

Values are often not needed: many matrix operations supported

Matrix elements can only be obtained locally

PetscErrorCode MatGetRow(Mat mat,PetscInt row,

PetscInt *ncols,const PetscInt *cols[],

const PetscScalar *vals[]);

PetscErrorCode MatRestoreRow(/*same parameters*/);

Getting values in array

Extract one parallel submatrix:

MatGetSubMatrix(Mat mat, IS isrow, IS iscol,

MatReuse cll, Mat *newmat)

Extract multiple single-processor matrices:

MatGetSubMatrices(Mat mat, PetscInt n,

const IS irow[], const IS icol[],

MatReuse scall, Mat *submat[])

Collective call, but different index sets per processor

Submatrices

MatTranspose(Mat A, MatReuse reuse, Mat *B)

computes an out-of-place transpose B of a matrix A if reuse=MAT_INITIAL_MATRIX or

an in-place transpose of a matrix A if reuse=MAT_REUSE_MATRIX and B=A

MatMultTranspose()

MatMultTransposeAdd()

MatIsTranspose()

Matrix Transpose

matrix-vector

MatMult(Mat A,Vec in,Vec out);

MatMultAdd

MatMultTranspose

MatMultTransposeAdd

simple operations on matrices

MatNorm

MatScale

MatDiagonalScale

Matrix operations

Implicit matrices

some of the matrix types in PETSc are not stored by elements but they behave like normal matrices in some operations

nomenclature: matrix-free, implicit, not assembled, not formed, not stored ...

the most important operation is a matrix-vector product (MatMult) which can be considered an application of a linear operator

when using an iterative solver, this operation suffices to solve a linear system

matrix type MATTRANSPOSEMAT

implicit transpose of a matrix

maintains pointer to the original matrix

its MatMult just calls MatMultTranspose of an underlying matrix and vice versa

MatTranspose (1)

Mat A, Ati, Ate;

Vec x, yi, ye;

//assemble somehow matrix A and vector x

MatCreateTranspose(A, &Ati);

MatTranspose(A, MAT_INITIAL_MATRIX,&Ate);

MatGetVecs(Ati,PETSC_NULL,&yi); /* only yi is created; the assembled x is kept */

VecDuplicate(yi, &ye);

MatMult(Ati,x,yi);

MatMult(Ate,x,ye);

//norm(yi-ye) is close to 0

MatTranspose (2)

MatComposite

Mat F,G;

Mat arr[3] = {C, B, A}; // reverse order!

// F = A*B*C (implicitly)

MatCreateComposite(comm, 3, arr, &F);

MatCompositeSetType(F,

MAT_COMPOSITE_MULTIPLICATIVE);

// G = A+B+C (implicitly)

MatCreateComposite(comm, 3, arr, &G);

MatCompositeSetType(G, MAT_COMPOSITE_ADDITIVE);

matrix type MATCOMPOSITE

implicit matrix sum or product

matrix type MATSHELL

no predefined operation

arbitrary size

any operations can be defined by the user (C function pointers) using MatShellSetOperation function

can have a context with additional data

MatShellSetContext(Mat mat,void *ctx);

MatShellGetContext(Mat mat,void **ctx);

Shell matrices

#undef __FUNCT__

#define __FUNCT__ "mymatmult"

/* user-defined matrix-vector multiply */

PetscErrorCode mymatmult(Mat mat,Vec in,Vec out) {

MyType *matData;

PetscFunctionBegin;

MatShellGetContext(mat,(void**)&matData);

/* compute out from in, using matData */

PetscFunctionReturn(0);

}

Shell matrix example (1)

Shell matrix example (2)

Mat A;

PetscInt m,n,M,N;

MyType Adata;

...

MatCreate(comm,&A);

MatSetSizes(A,m,n,M,N);

MatSetType(A,MATSHELL);

MatShellSetOperation(A,MATOP_MULT,

(void(*)(void)) mymatmult);

MatShellSetContext(A,(void*)&Adata);

...

Linear solvers David Horák

Solving a linear system Ax = b with Gaussian elimination can take a lot of time and memory.

alternative: iterative solvers use successive approximations of the solution:

convergence not always guaranteed

possibly much faster / less memory

basic operation: y = Ax executed once per iteration

convergence can be accelerated by a preconditioner B ≈ A^{-1}

KSP & PC: Iterative solvers
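As an illustration of an iterative solver whose only contact with the matrix is y = Ax, here is a minimal unpreconditioned conjugate gradient sketch in plain C. It is not PETSc code: `cg`, `ApplyFn` and `laplace1d` are hypothetical names, the operator is assumed symmetric positive definite, and the stopping rule (residual norm relative to the norm of b) is one simple choice among several.

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*ApplyFn)(int n, const double *x, double *y); /* y = A x */

/* Matrix-free 1D Laplacian tridiag(-1, 2, -1). */
void laplace1d(int n, const double *x, double *y) {
    for (int i = 0; i < n; i++) {
        y[i] = 2.0 * x[i];
        if (i > 0)     y[i] -= x[i - 1];
        if (i < n - 1) y[i] -= x[i + 1];
    }
}

static double dotp(int n, const double *a, const double *b) {
    double s = 0.0;
    for (int i = 0; i < n; i++) s += a[i] * b[i];
    return s;
}

/* Solves A x = b; x holds the initial guess on entry. Returns the number
   of iterations, or -1 if ||r|| <= rtol*||b|| was not reached in maxit. */
int cg(ApplyFn A, int n, const double *b, double *x, double rtol, int maxit) {
    double *r = malloc(3 * (size_t)n * sizeof *r);
    assert(r != NULL);
    double *p = r + n, *Ap = p + n;
    A(n, x, Ap);
    for (int i = 0; i < n; i++) { r[i] = b[i] - Ap[i]; p[i] = r[i]; }
    double rr = dotp(n, r, r);
    const double stop = rtol * rtol * dotp(n, b, b); /* squared norms */
    int it = 0;
    while (rr > stop && it < maxit) {
        A(n, p, Ap);                      /* the one product per iteration */
        double alpha = rr / dotp(n, p, Ap);
        for (int i = 0; i < n; i++) { x[i] += alpha * p[i]; r[i] -= alpha * Ap[i]; }
        double rr_new = dotp(n, r, r);
        double beta = rr_new / rr;
        for (int i = 0; i < n; i++) p[i] = r[i] + beta * p[i];
        rr = rr_new;
        it++;
    }
    free(r);
    return rr <= stop ? it : -1;
}
```

Each pass through the loop calls the operator exactly once, and the return value distinguishes convergence from hitting the iteration limit.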

All KSP solvers in PETSc are iterative

direct solvers - one iteration with perfect preconditioning (LU, Cholesky)

Object oriented: solvers only need matrix action, so can handle shell matrices

Preconditioners

Far-reaching control through command-line options

Tolerances

Convergence and divergence reason

Custom monitors and convergence tests

Basic concepts

KSPCreate(comm,&solver);

// general:

KSPSetOperators(solver,A,B,DIFFERENT_NONZERO_PATTERN);

// common:

KSPSetOperators(solver,A,A,DIFFERENT_NONZERO_PATTERN);

// also SAME_NONZERO_PATTERN and SAME_PRECONDITIONER

/* optional */ KSPSetUp(solver);

KSPSolve(solver,rhs,sol);

KSPDestroy(&solver);

Iterative solver basics

KSPSetType(solver,KSPGMRES);

KSP can be controlled from the commandline:

KSPSetFromOptions(solver);

/* right before KSPSolve or KSPSetUp */

then options -ksp_... are parsed:

-ksp_type gmres

-ksp_gmres_restart 20

-ksp_view

Solver type

Iterative solvers can fail

the KSPSolve() call itself gives no feedback about convergence

solution may be completely wrong

KSPGetConvergedReason(solver,&reason)

positive for convergence, negative for divergence

KSPGetIterationNumber(solver,&nits) after how many iterations did the method stop?

Convergence

KSPSetTolerances(solver,rtol,atol,dtol,maxit);

Monitors can also be set in code, but easier:

-ksp_monitor

-ksp_monitor_true_residual

Monitors and convergence tests
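How the four tolerances (rtol, atol, dtol, maxit) can be combined into one stopping test is sketched below in plain C. The exact rule and the numeric reason codes are assumptions made for this illustration; only the sign convention (positive for convergence, negative for divergence) is taken from the slides.

```c
#include <assert.h>

/* Hypothetical reason codes: positive = converged, negative = diverged. */
typedef enum {
    ITERATING      =  0,
    CONVERGED_RTOL =  1,
    CONVERGED_ATOL =  2,
    DIVERGED_ITS   = -1,
    DIVERGED_DTOL  = -2
} Reason;

/* rnorm: current residual norm, rnorm0: initial residual norm. */
Reason check_convergence(double rnorm, double rnorm0, int it,
                         double rtol, double atol, double dtol, int maxit) {
    assert(rnorm >= 0.0 && rnorm0 >= 0.0);
    if (rnorm <= atol)          return CONVERGED_ATOL; /* absolutely small */
    if (rnorm <= rtol * rnorm0) return CONVERGED_RTOL; /* relative decrease */
    if (rnorm >= dtol * rnorm0) return DIVERGED_DTOL;  /* residual blew up */
    if (it >= maxit)            return DIVERGED_ITS;   /* out of iterations */
    return ITERATING;
}
```

A solver loop would call such a test once per iteration and stop as soon as the result is nonzero.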

Many options for the (mathematically) sophisticated user, some specific to one method

KSPSetInitialGuessNonzero

KSPGMRESSetRestart

KSPSetPreconditionerSide

KSPSetNormType

Advanced options

MatNullSpace sp;

MatNullSpaceCreate /* constant vector */

(PETSC_COMM_WORLD,PETSC_TRUE,0,PETSC_NULL,&sp);

MatNullSpaceCreate /* general vectors */

(PETSC_COMM_WORLD,PETSC_FALSE,5,vecs,&sp);

KSPSetNullSpace(ksp,sp);

The solver will now properly remove the null space at each iteration.

Null spaces

PC usually created as part of KSP: separate create and destroy calls exist, but are (almost) never needed

KSP solver; PC precon;

KSPCreate(comm,&solver);

KSPGetPC(solver,&precon);

PCSetType(precon,PCJACOBI);

PCILU, PCJACOBI, PCASM, PCBJACOBI, PCMG, etc.

Controllable through commandline options:

-pc_type ilu -pc_factor_levels 3

PC basics

Iterative method with direct solver as preconditioner would converge in one step

Direct methods in PETSc are implemented as a special iterative method: KSPPREONLY only applies the preconditioner and skips stopping criteria etc.

Direct methods are selected through a factorization preconditioner type, e.g. PCLU:

myprog -pc_type lu -ksp_type preonly \

-pc_factor_mat_solver_package mumps

KSP direct methods

IS isr, isc; MatFactorInfo info;

MatGetOrdering(A,MATORDERING_NATURAL,&isr,&isc);

MatLUFactor(A,isr,isc,&info);

// MatLUFactorSymbolic(), MatLUFactorNumeric()

// MatCholeskyFactor(A, isr, &info);

MatSolve(A,b,x);

MatSolves(Mat A,Vecs bs,Vecs xs)

// Solves A x = b, given a factored matrix, for a collection of vectors

MatMatSolve(Mat A,Mat B,Mat X)

//Solves A X = B, given a factored matrix

Low-level direct methods

Krylov Subspace Methods

Using PETSc linear algebra, just add:

KSPSetOperators(KSP ksp, Mat A, Mat M,

MatStructure flag);

KSPSolve(KSP ksp, Vec b, Vec x);

Can access subobjects

KSPGetPC(KSP ksp, PC *pc)

Preconditioners must obey PETSc interface

Basically just the KSP interface

Can change solver dynamically from the command line

-ksp_type bicgstab

Linear solvers - summary

Newton and Picard Methods

Using PETSc linear algebra, just add:

SNESSetFunction(SNES snes,Vec r,residualFunc,

void *ctx);

SNESSetJacobian(SNES snes, Mat A, Mat M,

jacFunc,void *ctx);

SNESSolve(SNES snes, Vec b, Vec x);

Can access subobjects

SNESGetKSP(SNES snes, KSP *ksp)

Can customize subobjects from the cmd line

Set the subdomain preconditioner to ILU with -sub_pc_type ilu

Nonlinear solvers - summary

1 Sequential LU

ILUDT (SPARSEKIT2, Yousef Saad, U of MN)

EUCLID & PILUT (Hypre, David Hysom, LLNL)

ESSL (IBM)

SuperLU (Jim Demmel and Sherry Li, LBNL)

Matlab

UMFPACK (Tim Davis, U. of Florida)

LUSOL (MINOS, Michael Saunders, Stanford)

2 Parallel LU

MUMPS (Patrick Amestoy, IRIT)

SPOOLES (Cleve Ashcraft, Boeing)

SuperLU_Dist (Jim Demmel and Sherry Li, LBNL)

3 Parallel Cholesky

DSCPACK (Padma Raghavan, Penn. State)

MUMPS (Patrick Amestoy, Toulouse)

CHOLMOD (Tim Davis, Florida)

3rd party direct solvers in PETSc

1 Parallel ICC

BlockSolve95 (Mark Jones and Paul Plassmann, ANL)

2 Parallel ILU

PaStiX (Mathieu Faverge, INRIA)

3 Parallel Sparse Approximate Inverse

Parasails (Hypre, Edmund Chow, LLNL)

SPAI 3.0 (Marcus Grote and Barnard, NYU)

4 Sequential Algebraic Multigrid

RAMG (John Ruge and Klaus Stüben, GMD)

SAMG (Klaus Stüben, GMD)

5 Parallel Algebraic Multigrid

Prometheus (Mark Adams, PPPL)

BoomerAMG (Hypre, LLNL)

ML (Trilinos, Ray Tuminaro and Jonathan Hu, SNL)

3rd party preconditioners in PETSc

DM: Data management and grid manipulation

SNES: Nonlinear solvers

TS: Time stepping

PETSc components not covered in detail

Debugging & profiling

Launch the debugger

-start_in_debugger [gdb,dbx,noxterm]

-on_error_attach_debugger [gdb,dbx,noxterm]

Attach debugger only to some parallel processes: -debugger_nodes 0,1

Put a breakpoint in PetscError() to catch errors as they occur

Debugging - stepping

PETSc tracks memory overwrites at both ends of arrays

the CHKMEMQ macro causes a check of all allocated memory

locate memory overwrites by bracketing the suspect code with CHKMEMQ

PETSc checks for leaked memory

use PetscMalloc() and PetscFree() for all allocation

print unfreed memory on PetscFinalize() with -malloc_dump

Simply the best tool today is valgrind (http://www.valgrind.org)

it checks memory access, cache performance, memory usage...

needs --trace-children=yes when running under MPI

Debugging - memory checking

PETSc has integrated profiling (timing, flops, memory usage, MPI messages)

Option -log_summary prints a report on PetscFinalize()

PETSc allows user-defined events

PetscLogEventRegister(), PetscLogEventBegin/End()

to create and to manage events reporting time, calls, flops, communication, etc.

Memory usage is tracked by object

Events may also be nested and will aggregate in a nested fashion

Profiling is separated into stages

PetscLogStageRegister(), PetscLogStagePush/Pop()

to create and to manage stages identified by an integer handle

Stages may be nested, but will not aggregate in a nested fashion

Profiling

output of -log_summary:

Example profiling

References

Introduction to PETSc, TACC, Jan 17, 2012 (Victor Eijkhout). Slides

Short Course at the Graduate University, Chinese Academy of Sciences, Beijing, China, July 2010 (Matthew Knepley). Slides

Tutorial at ICES, UT Austin, TX September 2011 (Matthew Knepley). Slides

PETSc homepage, http://www.mcs.anl.gov/petsc/

PETSc Users Manual, http://www.mcs.anl.gov/petsc/petsc-current/docs/manual.pdf

PETSc Developer Guide, http://www.mcs.anl.gov/petsc/developers/developers.pdf

Thank you for your attention!