
Transcript of Intro MPI JvanHunen


MPI: An introduction
by Jeroen van Hunen

What is MPI and why should we use it?

Simple example + some basic MPI functions

Other frequently used MPI functions

Compiling and running code with MPI

Domain decomposition

    Stokes solver

    Tracers/markers

    Performance

    Documentation


    What is MPI?

Mainly a data communication tool: the Message-Passing Interface.
Allows parallel calculation on distributed-memory machines.
Usually the Single-Program-Multiple-Data principle is used:
all processors have similar tasks (e.g. in domain decomposition).

Alternative: OpenMP, for shared-memory machines.

    Why should we use MPI?

    If sequential calculations take too long

    If sequential calculations use too much memory


Simple MPI example

Code:
include the MPI header: contains definitions, macros, function prototypes
initialize MPI
ask processor rank
ask # processors p
stop MPI
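A minimal C sketch consistent with these annotations (the slide's own listing is an image; the printed message is an assumption):

    #include <stdio.h>
    #include <mpi.h>                 /* definitions, macros, function prototypes */

    int main(int argc, char **argv)
    {
        int rank, p;
        MPI_Init(&argc, &argv);               /* initialize MPI */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* ask processor rank */
        MPI_Comm_size(MPI_COMM_WORLD, &p);    /* ask # processors p */
        printf("Hello from processor %d of %d\n", rank, p);
        MPI_Finalize();                       /* stop MPI */
        return 0;
    }

With this sketch, the output for 4 processors would be four lines of the form "Hello from processor 2 of 4", in no guaranteed order.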


    MPI calls for sending/receiving data
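The basic point-to-point calls are MPI_Send and MPI_Recv (syntax on the next slide). A minimal sketch, passing one value from rank 0 to rank 1; the buffer contents and tag are illustrative:

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank;
        double x = 0.0;
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            x = 3.14;
            MPI_Send(&x, 1, MPI_DOUBLE, 1, 99, MPI_COMM_WORLD);          /* to rank 1, tag 99 */
        } else if (rank == 1) {
            MPI_Recv(&x, 1, MPI_DOUBLE, 0, 99, MPI_COMM_WORLD, &status); /* from rank 0 */
            printf("rank 1 received %f\n", x);
        }

        MPI_Finalize();
        return 0;
    }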


MPI_SEND and MPI_RECV syntax
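The listings on this slide are images; the standard signatures are:

in C:

    int MPI_Send(void *buf, int count, MPI_Datatype datatype,
                 int dest, int tag, MPI_Comm comm);
    int MPI_Recv(void *buf, int count, MPI_Datatype datatype,
                 int source, int tag, MPI_Comm comm, MPI_Status *status);

in Fortran:

    MPI_SEND(BUF, COUNT, DATATYPE, DEST, TAG, COMM, IERROR)
    MPI_RECV(BUF, COUNT, DATATYPE, SOURCE, TAG, COMM, STATUS, IERROR)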


    MPI data types

Correspondence between C and Fortran type names:
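The slide's table is an image; a representative subset of the standard correspondence:

    in C:           in Fortran:
    MPI_CHAR        MPI_CHARACTER
    MPI_INT         MPI_INTEGER
    MPI_FLOAT       MPI_REAL
    MPI_DOUBLE      MPI_DOUBLE_PRECISION
    MPI_BYTE        MPI_BYTE
    MPI_PACKED      MPI_PACKED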


Other frequently used MPI calls

Sending and receiving at the same time: no risk of deadlock.

Or overwrite the send buffer with the received info.
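The calls meant here are presumably MPI_Sendrecv and, for the overwrite variant, MPI_Sendrecv_replace; their standard C signatures:

    int MPI_Sendrecv(void *sendbuf, int sendcount, MPI_Datatype sendtype,
                     int dest, int sendtag,
                     void *recvbuf, int recvcount, MPI_Datatype recvtype,
                     int source, int recvtag,
                     MPI_Comm comm, MPI_Status *status);

    int MPI_Sendrecv_replace(void *buf, int count, MPI_Datatype datatype,
                             int dest, int sendtag, int source, int recvtag,
                             MPI_Comm comm, MPI_Status *status);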


Other frequently used MPI calls

Synchronizing the processors: wait for each other at the barrier.

Broadcasting a message from one processor to all the others: both sending and receiving processors use the same call to MPI_BCAST.
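The standard C signatures, plus a minimal call-site sketch (array size and root rank are illustrative):

    int MPI_Barrier(MPI_Comm comm);
    int MPI_Bcast(void *buffer, int count, MPI_Datatype datatype,
                  int root, MPI_Comm comm);

    /* every processor waits here until all have arrived */
    MPI_Barrier(MPI_COMM_WORLD);

    /* rank 0 fills buf; afterwards every rank holds the same 100 doubles */
    double buf[100];
    MPI_Bcast(buf, 100, MPI_DOUBLE, 0, MPI_COMM_WORLD);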


Other frequently used MPI calls

Reducing (combining) data from all processors: add, find maximum/minimum, etc.

OP can be one of the following (the standard reduction operations): MPI_MAX, MPI_MIN, MPI_SUM, MPI_PROD, MPI_LAND, MPI_LOR, MPI_BAND, MPI_BOR, MPI_MAXLOC, MPI_MINLOC.

For the results to be available at all processors, use MPI_Allreduce.
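A minimal sketch, summing one value per processor (the local values are illustrative):

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank;
        double local, total;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        local = (double)rank;            /* each processor's contribution */

        /* result ends up on rank 0 only */
        MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0) printf("sum = %f\n", total);

        /* result ends up on every rank */
        MPI_Allreduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

        MPI_Finalize();
        return 0;
    }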


Additional comments:

Wildcards are allowed in MPI calls for:
source: MPI_ANY_SOURCE
tag: MPI_ANY_TAG

MPI_SEND and MPI_RECV are blocking: they wait until the job is done.
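A sketch of a wildcard receive; the actual source and tag can then be read back from the status object:

    MPI_Status status;
    double buf[10];
    MPI_Recv(buf, 10, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG,
             MPI_COMM_WORLD, &status);
    int src = status.MPI_SOURCE;   /* who actually sent it */
    int tag = status.MPI_TAG;      /* with which tag */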


Deadlocks

The slide's figure contrasts three exchange patterns: deadlock, depending on buffer, and safe (see the sketch below).

Don't let a processor send a message to itself; in this case use MPI_SENDRECV.

Non-matching send/receive calls may block the code.
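A sketch of the three patterns for an exchange between ranks 0 and 1 (buffer names and sizes are illustrative):

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, n = 8;
        double a[8], b[8];
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        for (int i = 0; i < n; i++) a[i] = rank;

        /* Deadlock: both ranks call MPI_Recv first; nobody ever sends.
           Depending on buffer: both ranks call MPI_Send first; this works
           only while the messages fit in system buffers.
           Safe (below): the call orders are matched. */
        if (rank == 0) {
            MPI_Send(a, n, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(b, n, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD, &status);
        } else if (rank == 1) {
            MPI_Recv(b, n, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &status);
            MPI_Send(a, n, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
        }

        MPI_Finalize();
        return 0;
    }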


Compiling and running code with MPI

Compiling:

Fortran:
    mpif77 -o binary code.f
    mpif90 -o binary code.f
C:
    mpicc -o binary code.c

Running in general, no queueing system:
    mpirun -np 4 binary
    mpirun -np 4 -nolocal -machinefile mach binary

Running on Gonzales, with queueing system:
    bsub -n 4 -W 8:00 prun binary


Domain decomposition

[Figure: 3-D computational domain, with x, y, z axes, split into blocks]

Total computational domain divided into equal-size blocks.
Each processor only deals with its own block.
At block boundaries some information exchange is necessary.
Block division matters:
surface/volume ratio
number of processor boundaries
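The slide shows no code; one standard way to set up such a block division is MPI's Cartesian topology helpers (a sketch, not necessarily what the original code used):

    int p, rank;
    MPI_Comm_size(MPI_COMM_WORLD, &p);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int dims[3] = {0, 0, 0}, periods[3] = {0, 0, 0}, coords[3];
    MPI_Comm cart;
    MPI_Dims_create(p, 3, dims);            /* factor p into dims[0]*dims[1]*dims[2] blocks */
    MPI_Cart_create(MPI_COMM_WORLD, 3, dims, periods, 1, &cart);
    MPI_Cart_coords(cart, rank, 3, coords); /* this processor's block indices along x, y, z */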


Stokes equation: Jacobi iterative solver

In the block interior: no MPI needed. Each point M is updated from its four neighbours N, S, E, W:

    M = 0.25*(N + S + E + W)

At a block boundary: MPI needed. The boundary point is duplicated as M1 (on the left processor, with neighbours N1, S1, W) and M2 (on the right processor, with neighbours N2, S2, E). Each processor computes its partial update:

    M1 = 0.25*(N1 + S1 + W)
    M2 = 0.25*(E)

then the full value M = M1 + M2 is formed (using MPI_SENDRECV) and copied back to both sides: M1 = M2 = M.

A Gauss-Seidel solver performs better, but is also slightly more difficult to implement.
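A minimal 1-D analogue of this boundary exchange, assuming a simple left/right block division (grid size, iteration count, and initial data are illustrative):

    #include <mpi.h>

    #define N 16                       /* local grid points per processor */

    int main(int argc, char **argv)
    {
        int rank, p;
        double u[N+2], unew[N+2];      /* one ghost point on each side */
        MPI_Status st;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &p);

        for (int i = 0; i < N+2; i++) u[i] = (double)rank;   /* dummy data */

        int left  = (rank > 0)   ? rank-1 : MPI_PROC_NULL;
        int right = (rank < p-1) ? rank+1 : MPI_PROC_NULL;

        for (int it = 0; it < 100; it++) {
            /* exchange ghost points with neighbours: no deadlock risk */
            MPI_Sendrecv(&u[N],   1, MPI_DOUBLE, right, 0,
                         &u[0],   1, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &st);
            MPI_Sendrecv(&u[1],   1, MPI_DOUBLE, left,  1,
                         &u[N+1], 1, MPI_DOUBLE, right, 1, MPI_COMM_WORLD, &st);

            /* Jacobi update in the block interior: no MPI needed */
            for (int i = 1; i <= N; i++)
                unew[i] = 0.5*(u[i-1] + u[i+1]);  /* 1-D analogue of 0.25*(N+S+E+W) */
            for (int i = 1; i <= N; i++) u[i] = unew[i];
        }

        MPI_Finalize();
        return 0;
    }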


Tracers/Markers

[Figure: tracer path crossing from proc n to proc n+1, with RK steps k1 and k2]

2nd-order Runge-Kutta scheme:

    k1 = dt * v(t, x(t))
    k2 = dt * v(t + dt/2, x(t) + k1/2)
    x(t+dt) = x(t) + k2

Procedure:
Calculate the midpoint x(t) + k1/2. If it is in proc n+1: proc n sends the tracer coordinates to proc n+1, and proc n+1 reports the tracer velocity back to proc n.
Calculate x(t+dt). If it is in proc n+1: proc n sends the tracer coordinates + function values permanently to proc n+1.
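A serial sketch of the Runge-Kutta step, with a hypothetical velocity field v(t, x); in the parallel code, the midpoint evaluation is where the cross-processor messages of the procedure above occur:

    #include <stdio.h>

    /* hypothetical velocity field, for illustration only */
    static double v(double t, double x) { return -x + t; }

    int main(void)
    {
        double t = 0.0, dt = 0.01, x = 1.0;
        for (int step = 0; step < 100; step++) {
            double k1 = dt * v(t, x);                /* k1 = dt v(t, x(t)) */
            double k2 = dt * v(t + dt/2, x + k1/2);  /* midpoint evaluation;
                in the MPI version, if x + k1/2 lies in the neighbour's block,
                the coordinate is sent there and the velocity sent back */
            x += k2;                                 /* x(t+dt) = x(t) + k2 */
            t += dt;
        }
        printf("x(%g) = %g\n", t, x);
        return 0;
    }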


Performance

For jobs that are too small, communication quickly becomes the bottleneck.

Benchmark problem: Rayleigh-Bénard convection (Ra = 10^6):
2-D: 64x64 finite elements, 10^4 steps
3-D: 64x64x64 finite elements, 100 steps
Calculations run on Gonzales.


    Documentation

    PDF: www.hpc.unimelb.edu.au/software/mpi-docs/mpi-book.pdf

    Books: