
Transcript of Intro MPI JvanHunen


MPI: An introduction
by Jeroen van Hunen

What is MPI and why should we use it?

Simple example + some basic MPI functions

Other frequently used MPI functions

Compiling and running code with MPI

Domain decomposition

    Stokes solver

    Tracers/markers

    Performance

    Documentation


    What is MPI?

Mainly a data communication tool: the Message-Passing Interface.
Allows parallel calculation on distributed-memory machines.
Usually the Single-Program-Multiple-Data principle is used:
all processors have similar tasks (e.g. in domain decomposition).

Alternative: OpenMP, for shared-memory machines.

    Why should we use MPI?

    If sequential calculations take too long

    If sequential calculations use too much memory


Simple MPI example

Code:
include the MPI header: contains definitions, macros, function prototypes
initialize MPI
ask processor rank
ask # processors p
stop MPI
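A minimal C sketch consistent with these annotations (the slide's own listing is an image; the printed message is an assumption):

    #include <stdio.h>
    #include <mpi.h>                 /* definitions, macros, function prototypes */

    int main(int argc, char **argv)
    {
        int rank, p;
        MPI_Init(&argc, &argv);               /* initialize MPI */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* ask processor rank */
        MPI_Comm_size(MPI_COMM_WORLD, &p);    /* ask # processors p */
        printf("Hello from processor %d of %d\n", rank, p);
        MPI_Finalize();                       /* stop MPI */
        return 0;
    }

With this sketch, the output for 4 processors would be four lines of the form "Hello from processor 2 of 4", in no guaranteed order.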


    MPI calls for sending/receiving data
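The basic point-to-point calls are MPI_Send and MPI_Recv (syntax on the next slide). A minimal sketch, passing one value from rank 0 to rank 1; the buffer contents and tag are illustrative:

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank;
        double x = 0.0;
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            x = 3.14;
            MPI_Send(&x, 1, MPI_DOUBLE, 1, 99, MPI_COMM_WORLD);          /* to rank 1, tag 99 */
        } else if (rank == 1) {
            MPI_Recv(&x, 1, MPI_DOUBLE, 0, 99, MPI_COMM_WORLD, &status); /* from rank 0 */
            printf("rank 1 received %f\n", x);
        }

        MPI_Finalize();
        return 0;
    }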


MPI_SEND and MPI_RECV syntax
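The listings on this slide are images; the standard signatures are:

in C:

    int MPI_Send(void *buf, int count, MPI_Datatype datatype,
                 int dest, int tag, MPI_Comm comm);
    int MPI_Recv(void *buf, int count, MPI_Datatype datatype,
                 int source, int tag, MPI_Comm comm, MPI_Status *status);

in Fortran:

    MPI_SEND(BUF, COUNT, DATATYPE, DEST, TAG, COMM, IERROR)
    MPI_RECV(BUF, COUNT, DATATYPE, SOURCE, TAG, COMM, STATUS, IERROR)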


    MPI data types

Correspondence between C and Fortran type names:
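The slide's table is an image; a representative subset of the standard correspondence:

    in C:           in Fortran:
    MPI_CHAR        MPI_CHARACTER
    MPI_INT         MPI_INTEGER
    MPI_FLOAT       MPI_REAL
    MPI_DOUBLE      MPI_DOUBLE_PRECISION
    MPI_BYTE        MPI_BYTE
    MPI_PACKED      MPI_PACKED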


Other frequently used MPI calls

Sending and receiving at the same time: no risk of deadlock.

Or overwrite the send buffer with the received info.
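The calls meant here are presumably MPI_Sendrecv and, for the overwrite variant, MPI_Sendrecv_replace; their standard C signatures:

    int MPI_Sendrecv(void *sendbuf, int sendcount, MPI_Datatype sendtype,
                     int dest, int sendtag,
                     void *recvbuf, int recvcount, MPI_Datatype recvtype,
                     int source, int recvtag,
                     MPI_Comm comm, MPI_Status *status);

    int MPI_Sendrecv_replace(void *buf, int count, MPI_Datatype datatype,
                             int dest, int sendtag, int source, int recvtag,
                             MPI_Comm comm, MPI_Status *status);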


Other frequently used MPI calls

Synchronizing the processors: wait for each other at the barrier.

Broadcasting a message from one processor to all the others: both sending and receiving processors use the same call to MPI_BCAST.
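The standard C signatures, plus a minimal call-site sketch (array size and root rank are illustrative):

    int MPI_Barrier(MPI_Comm comm);
    int MPI_Bcast(void *buffer, int count, MPI_Datatype datatype,
                  int root, MPI_Comm comm);

    /* every processor waits here until all have arrived */
    MPI_Barrier(MPI_COMM_WORLD);

    /* rank 0 fills buf; afterwards every rank holds the same 100 doubles */
    double buf[100];
    MPI_Bcast(buf, 100, MPI_DOUBLE, 0, MPI_COMM_WORLD);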


Other frequently used MPI calls

Reducing (combining) data from all processors: add, find maximum/minimum, etc.

OP can be one of the following (the standard reduction operations): MPI_MAX, MPI_MIN, MPI_SUM, MPI_PROD, MPI_LAND, MPI_LOR, MPI_BAND, MPI_BOR, MPI_MAXLOC, MPI_MINLOC.

For the results to be available at all processors, use MPI_Allreduce.
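A minimal sketch, summing one value per processor (the local values are illustrative):

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank;
        double local, total;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        local = (double)rank;            /* each processor's contribution */

        /* result ends up on rank 0 only */
        MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0) printf("sum = %f\n", total);

        /* result ends up on every rank */
        MPI_Allreduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

        MPI_Finalize();
        return 0;
    }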


Additional comments:

Wildcards are allowed in MPI calls for:
source: MPI_ANY_SOURCE
tag: MPI_ANY_TAG

MPI_SEND and MPI_RECV are blocking: they wait until the job is done.
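A sketch of a wildcard receive; the actual source and tag can then be read back from the status object:

    MPI_Status status;
    double buf[10];
    MPI_Recv(buf, 10, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG,
             MPI_COMM_WORLD, &status);
    int src = status.MPI_SOURCE;   /* who actually sent it */
    int tag = status.MPI_TAG;      /* with which tag */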


Deadlocks

The slide's figure contrasts three exchange patterns: deadlock, depending on buffer, and safe (see the sketch below).

Don't let a processor send a message to itself; in this case use MPI_SENDRECV.

Non-matching send/receive calls may block the code.
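A sketch of the three patterns for an exchange between ranks 0 and 1 (buffer names and sizes are illustrative):

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, n = 8;
        double a[8], b[8];
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        for (int i = 0; i < n; i++) a[i] = rank;

        /* Deadlock: both ranks call MPI_Recv first; nobody ever sends.
           Depending on buffer: both ranks call MPI_Send first; this works
           only while the messages fit in system buffers.
           Safe (below): the call orders are matched. */
        if (rank == 0) {
            MPI_Send(a, n, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(b, n, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD, &status);
        } else if (rank == 1) {
            MPI_Recv(b, n, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &status);
            MPI_Send(a, n, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
        }

        MPI_Finalize();
        return 0;
    }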


Compiling and running code with MPI

Compiling:

Fortran:
    mpif77 -o binary code.f
    mpif90 -o binary code.f
C:
    mpicc -o binary code.c

Running in general, no queueing system:
    mpirun -np 4 binary
    mpirun -np 4 -nolocal -machinefile mach binary

Running on Gonzales, with queueing system:
    bsub -n 4 -W 8:00 prun binary


Domain decomposition

[Figure: 3-D computational domain, with x, y, z axes, split into blocks]

Total computational domain divided into equal-size blocks.
Each processor only deals with its own block.
At block boundaries some information exchange is necessary.
Block division matters:
surface/volume ratio
number of processor boundaries
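The slide shows no code; one standard way to set up such a block division is MPI's Cartesian topology helpers (a sketch, not necessarily what the original code used):

    int p, rank;
    MPI_Comm_size(MPI_COMM_WORLD, &p);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int dims[3] = {0, 0, 0}, periods[3] = {0, 0, 0}, coords[3];
    MPI_Comm cart;
    MPI_Dims_create(p, 3, dims);            /* factor p into dims[0]*dims[1]*dims[2] blocks */
    MPI_Cart_create(MPI_COMM_WORLD, 3, dims, periods, 1, &cart);
    MPI_Cart_coords(cart, rank, 3, coords); /* this processor's block indices along x, y, z */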


Stokes equation: Jacobi iterative solver

In the block interior: no MPI needed. Each point M is updated from its four neighbours N, S, E, W:

    M = 0.25*(N + S + E + W)

At a block boundary: MPI needed. The boundary point is duplicated as M1 (on the left processor, with neighbours N1, S1, W) and M2 (on the right processor, with neighbours N2, S2, E). Each processor computes its partial update:

    M1 = 0.25*(N1 + S1 + W)
    M2 = 0.25*(E)

then the full value M = M1 + M2 is formed (using MPI_SENDRECV) and copied back to both sides: M1 = M2 = M.

A Gauss-Seidel solver performs better, but is also slightly more difficult to implement.
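A minimal 1-D analogue of this boundary exchange, assuming a simple left/right block division (grid size, iteration count, and initial data are illustrative):

    #include <mpi.h>

    #define N 16                       /* local grid points per processor */

    int main(int argc, char **argv)
    {
        int rank, p;
        double u[N+2], unew[N+2];      /* one ghost point on each side */
        MPI_Status st;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &p);

        for (int i = 0; i < N+2; i++) u[i] = (double)rank;   /* dummy data */

        int left  = (rank > 0)   ? rank-1 : MPI_PROC_NULL;
        int right = (rank < p-1) ? rank+1 : MPI_PROC_NULL;

        for (int it = 0; it < 100; it++) {
            /* exchange ghost points with neighbours: no deadlock risk */
            MPI_Sendrecv(&u[N],   1, MPI_DOUBLE, right, 0,
                         &u[0],   1, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &st);
            MPI_Sendrecv(&u[1],   1, MPI_DOUBLE, left,  1,
                         &u[N+1], 1, MPI_DOUBLE, right, 1, MPI_COMM_WORLD, &st);

            /* Jacobi update in the block interior: no MPI needed */
            for (int i = 1; i <= N; i++)
                unew[i] = 0.5*(u[i-1] + u[i+1]);  /* 1-D analogue of 0.25*(N+S+E+W) */
            for (int i = 1; i <= N; i++) u[i] = unew[i];
        }

        MPI_Finalize();
        return 0;
    }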


Tracers/Markers

[Figure: tracer path crossing from proc n to proc n+1, with RK steps k1 and k2]

2nd-order Runge-Kutta scheme:

    k1 = dt * v(t, x(t))
    k2 = dt * v(t + dt/2, x(t) + k1/2)
    x(t+dt) = x(t) + k2

Procedure:
Calculate the midpoint x(t) + k1/2. If it is in proc n+1: proc n sends the tracer coordinates to proc n+1, and proc n+1 reports the tracer velocity back to proc n.
Calculate x(t+dt). If it is in proc n+1: proc n sends the tracer coordinates + function values permanently to proc n+1.
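A serial sketch of the Runge-Kutta step, with a hypothetical velocity field v(t, x); in the parallel code, the midpoint evaluation is where the cross-processor messages of the procedure above occur:

    #include <stdio.h>

    /* hypothetical velocity field, for illustration only */
    static double v(double t, double x) { return -x + t; }

    int main(void)
    {
        double t = 0.0, dt = 0.01, x = 1.0;
        for (int step = 0; step < 100; step++) {
            double k1 = dt * v(t, x);                /* k1 = dt v(t, x(t)) */
            double k2 = dt * v(t + dt/2, x + k1/2);  /* midpoint evaluation;
                in the MPI version, if x + k1/2 lies in the neighbour's block,
                the coordinate is sent there and the velocity sent back */
            x += k2;                                 /* x(t+dt) = x(t) + k2 */
            t += dt;
        }
        printf("x(%g) = %g\n", t, x);
        return 0;
    }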


Performance

For jobs that are too small, communication quickly becomes the bottleneck.

Benchmark problem: Rayleigh-Bénard convection (Ra = 10^6):
2-D: 64x64 finite elements, 10^4 steps
3-D: 64x64x64 finite elements, 100 steps
Calculations run on Gonzales.


    Documentation

    PDF: www.hpc.unimelb.edu.au/software/mpi-docs/mpi-book.pdf

    Books: