Page 1:

Vivek Sarkar

Department of Computer Science, Rice University

[email protected]

Advanced MPI

COMP 422, Lecture 18, 18 March 2008

Page 2:

Acknowledgement for today’s lecture

• Supercomputing 2007 tutorial on “Advanced MPI” by William Gropp, Ewing (Rusty) Lusk, Robert Ross, Rajeev Thakur
  – http://sc07.supercomputing.org/schedule/event_detail.php?evid=11040

Page 3:

The MPI Standard (1 & 2)

Page 4:

MPI-1

• MPI is a message-passing library interface standard.
  – Specification, not implementation
  – Library, not a language
  – Classical message-passing programming model

• MPI was defined (1994) by a broadly-based group of parallel computer vendors, computer scientists, and applications developers.
  – 2-year intensive process

• Implementations appeared quickly, and now MPI is taken for granted as vendor-supported software on any parallel machine.

• Free, portable implementations exist for clusters and other environments (MPICH2, Open MPI)

Page 5:

MPI-2

• Same process of definition by MPI Forum
• MPI-2 is an extension of MPI
  – Extends the message-passing model
    • Parallel I/O
    • Remote memory operations (one-sided)
    • Dynamic process management
  – Adds other functionality
    • C++ and Fortran 90 bindings
      – similar to original C and Fortran-77 bindings
    • External interfaces
    • Language interoperability
    • MPI interaction with threads

Page 6:

MPI-2 Implementation Status

• Most parallel computer vendors now support MPI-2 on their machines
  – Except in some cases for the dynamic process management functions, which require interaction with other system software

• Cluster MPIs, such as MPICH2 and Open MPI, support most of MPI-2, including dynamic process management

• Our examples here have all been run on MPICH2

Page 7:

Outline

• MPI code example: Conway’s Game of Life

• MPI and Threads

• Remote Memory Access

Page 8:

Conway’s Game of Life

• A cellular automaton
  – Described in the 1970 Scientific American
  – Many interesting behaviors; see:
    • http://www.ibiblio.org/lifepatterns/october1970.html

• Program issues are very similar to those for codes that use regular meshes, such as PDE solvers
  – Allows us to concentrate on the MPI issues

Page 9:

Rules for Life

• Matrix values A(i,j) initialized to 1 (live) or 0 (dead)
• In each iteration, A(i,j) is set to
  – 1 (live) if either
    • the sum of the values of its 8 neighbors is 3, or
    • the value was already 1 and the sum of its 8 neighbors is 2 or 3
  – 0 (dead) otherwise

[Figure: cell (i,j) and its 8 neighbors in rows i-1, i, i+1 and columns j-1, j, j+1.]
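For concreteness, the update rule can be written as a small C function. This is a sketch, not code from the course materials; it assumes 1 = live and 0 = dead:

    /* Sketch of the Life update rule: the new state depends only on the
       current state and the sum of the 8 neighbor values. */
    int next_state(int current, int neighbor_sum)
    {
        if (neighbor_sum == 3)
            return 1;                             /* live: sum of neighbors is 3 */
        if (current == 1 && neighbor_sum == 2)
            return 1;                             /* live: already live with sum 2 */
        return 0;                                 /* dead otherwise */
    }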

Page 10:

Implementing Life

• For the non-parallel version, we:
  – Allocate a 2D matrix to hold state
    • Actually two matrices, and we will swap them between steps
  – Initialize the matrix
    • Force boundaries to be “dead”
    • Randomly generate states inside
  – At each time step:
    • Calculate each new cell state based on previous cell states (including neighbors)
    • Store new states in second matrix
    • Swap new and old matrices
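A minimal serial sketch of one such time step (hypothetical names, not the course code; it assumes matrices of size (rows+2) × (cols+2) whose outer border is kept dead):

    /* Compute the next generation from 'cur' into 'nxt'; interior cells are
       rows 1..rows and columns 1..cols; the border stays dead. */
    void life_step(int rows, int cols, int **cur, int **nxt)
    {
        for (int i = 1; i <= rows; i++) {
            for (int j = 1; j <= cols; j++) {
                int sum = cur[i-1][j-1] + cur[i-1][j] + cur[i-1][j+1]
                        + cur[i][j-1]                 + cur[i][j+1]
                        + cur[i+1][j-1] + cur[i+1][j] + cur[i+1][j+1];
                nxt[i][j] = (sum == 3 || (cur[i][j] == 1 && sum == 2)) ? 1 : 0;
            }
        }
    }

    /* Time loop: compute into the spare matrix, then swap old and new. */
    void life_run(int rows, int cols, int **a, int **b, int nsteps)
    {
        for (int t = 0; t < nsteps; t++) {
            life_step(rows, cols, a, b);
            int **tmp = a; a = b; b = tmp;    /* swap new and old matrices */
        }
    }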

Page 11:

Steps in Designing the Parallel Version

• Start with the “global” array as the main object
  – Natural for output – the result we’re computing
• Describe decomposition in terms of the global array
• Describe communication of data, still in terms of the global array
• Define the “local” arrays and the communication between them by referring to the global array

Page 12:

Step 1: Description of Decomposition

• By rows (1D or row-block)
  – Each process gets a group of adjacent rows

• Later we’ll show a 2D decomposition

[Figure: global array of rows × columns divided into horizontal blocks of adjacent rows, one block per process.]
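As a sketch of what “each process gets a group of adjacent rows” means, here is a hypothetical helper (not from mlife.c) that computes a rank’s block of global rows, giving any leftover rows to the lowest ranks:

    /* Block distribution of nrows global rows over nprocs ranks. */
    void block_rows(int nrows, int nprocs, int rank, int *first, int *count)
    {
        int base = nrows / nprocs;      /* rows every rank gets */
        int rem  = nrows % nprocs;      /* leftover rows go to ranks 0..rem-1 */
        *count = base + (rank < rem ? 1 : 0);
        *first = rank * base + (rank < rem ? rank : rem);
    }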

Page 13:

Step 2: Communication

• “Stencil” requires read access to data from neighbor cells

• We allocate extra space on each process to store neighbor cells

• Use send/recv or RMA to update prior to computation

Page 14:

Step 3: Define the Local Arrays

• Correspondence between the local and global array
• “Global” array is an abstraction; there is no one global array allocated anywhere
• Instead, we compute parts of it (the local arrays) on each process
• Provide ways to output the global array by combining the values on each process (parallel I/O!)

Page 15:

Boundary Regions

• In order to calculate the next state of cells in edge rows, need data from adjacent rows
• Need to communicate these regions at each step
  – First cut: use isend and irecv
  – Revisit with RMA later
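A minimal sketch of such a ghost-row exchange with non-blocking calls (hypothetical names, not the mlife code; `above` and `below` are the neighbor ranks, or MPI_PROC_NULL at the top/bottom of the mesh):

    #include <mpi.h>

    /* matrix[0] and matrix[nlocal+1] are ghost rows; matrix[1] and
       matrix[nlocal] are the first and last owned rows; each row holds
       cols+2 ints (including the dead boundary columns). */
    void exchange_ghost_rows(int **matrix, int nlocal, int cols,
                             int above, int below, MPI_Comm comm)
    {
        MPI_Request reqs[4];

        MPI_Irecv(matrix[0],          cols + 2, MPI_INT, above, 0, comm, &reqs[0]);
        MPI_Irecv(matrix[nlocal + 1], cols + 2, MPI_INT, below, 1, comm, &reqs[1]);
        MPI_Isend(matrix[nlocal],     cols + 2, MPI_INT, below, 0, comm, &reqs[2]);
        MPI_Isend(matrix[1],          cols + 2, MPI_INT, above, 1, comm, &reqs[3]);

        MPI_Waitall(4, reqs, MPI_STATUSES_IGNORE);   /* complete all four transfers */
    }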

Page 16:

Life Point-to-Point Code Walkthrough

• Points to observe in the code:
  – Handling of command-line arguments
  – Allocation of local arrays
  – Use of a routine to implement halo exchange
    • Hides details of exchange

[Figure: matrix is an array of row pointers into the contiguous mdata block, which allows us to use matrix[row][col] to address elements.]

See mlife.c pp. 1-8 for code example.
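The allocation idea behind the mdata/matrix picture, sketched for illustration (this is not the actual mlife.c code): one contiguous block plus an array of row pointers, so elements can be addressed as matrix[row][col].

    #include <stdlib.h>

    /* Allocate rows*cols ints contiguously (mdata) and an array of row
       pointers (matrix) into that block.  Error checking omitted; free
       with free(matrix) and free(mdata). */
    int **alloc_matrix(int rows, int cols, int **mdata_out)
    {
        int  *mdata  = malloc((size_t)rows * cols * sizeof(int));
        int **matrix = malloc((size_t)rows * sizeof(int *));
        for (int i = 0; i < rows; i++)
            matrix[i] = mdata + (size_t)i * cols;   /* row i starts here */
        *mdata_out = mdata;
        return matrix;
    }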

Page 17:

Note: Parsing Arguments

• MPI standard does not guarantee that command-line arguments will be passed to all processes.
  – Process arguments on rank 0
  – Broadcast options to others
• Derived types allow one bcast to handle most args
  – Two ways to deal with strings
    • Big, fixed-size buffers
    • Two-step approach: size first, data second (what we do in the code)

See mlife.c pp. 9-10 for code example.
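A sketch of the two-step string broadcast described above (hypothetical helper, not the mlife.c code; rank 0 is assumed to have parsed the string already): broadcast the length first, then the characters.

    #include <mpi.h>
    #include <stdlib.h>
    #include <string.h>

    /* On rank 0, *str points to the parsed option; on other ranks it is
       allocated here.  The length (including the trailing '\0') goes first,
       the data second. */
    void bcast_string(char **str, int rank, MPI_Comm comm)
    {
        int len = 0;
        if (rank == 0)
            len = (int)strlen(*str) + 1;

        MPI_Bcast(&len, 1, MPI_INT, 0, comm);       /* step 1: size */

        if (rank != 0)
            *str = malloc(len);
        MPI_Bcast(*str, len, MPI_CHAR, 0, comm);    /* step 2: data */
    }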

Page 18:

Point-to-Point Exchange

• Duplicate the communicator to ensure communications do not conflict
  – This is good practice when developing MPI codes, but is not required in this code
  – If this code were made into a component for use in other codes, the duplicate communicator would be required

• Non-blocking sends and receives allow the implementation greater flexibility in passing messages

See mlife-pt2pt.c pp. 1-3 for code example.
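A sketch of that communicator duplication (assumed names): the exchange code gets its own communicator, so its messages can never match traffic posted by the caller on the parent communicator.

    #include <mpi.h>

    /* Collective over 'parent'; the caller releases the result with
       MPI_Comm_free when the exchange component is no longer needed. */
    MPI_Comm make_exchange_comm(MPI_Comm parent)
    {
        MPI_Comm exch_comm;
        MPI_Comm_dup(parent, &exch_comm);
        return exch_comm;
    }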

Page 19:

Outline

• MPI code example: Conway’s Game of Life

• MPI and Threads

• Remote Memory Access

Page 20:

MPI and Threads

• MPI describes parallelism between processes
• Thread parallelism provides a shared-memory model within a process
• OpenMP and pthreads are common
  – OpenMP provides convenient features for loop-level parallelism

Page 21:

MPI and Threads (contd.)

• MPI-2 defines four levels of thread safety

– MPI_THREAD_SINGLE: only one thread

– MPI_THREAD_FUNNELED: only one thread that makes MPI calls

– MPI_THREAD_SERIALIZED: only one thread at a time makes MPI calls

– MPI_THREAD_MULTIPLE: any thread can make MPI calls at any time

• User calls MPI_Init_thread to indicate the level of thread support required; the implementation returns the level supported
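A minimal sketch of that handshake: the program requests a level and checks what the implementation actually provides.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int provided;

        /* Request MPI_THREAD_MULTIPLE; 'provided' reports the supported level. */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);

        if (provided < MPI_THREAD_MULTIPLE) {
            /* Fall back, e.g., restrict MPI calls to one thread (FUNNELED). */
            printf("requested MPI_THREAD_MULTIPLE, got level %d\n", provided);
        }

        /* ... create threads and communicate according to 'provided' ... */

        MPI_Finalize();
        return 0;
    }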

Page 22:

Threads and MPI in MPI-2

• An implementation is not required to support levels higher than MPI_THREAD_SINGLE; that is, an implementation is not required to be thread safe

• A fully thread-compliant implementation will support MPI_THREAD_MULTIPLE

• A portable program that does not call MPI_Init_thread should assume that only MPI_THREAD_SINGLE is supported

Page 23:

For MPI_THREAD_MULTIPLE

• When multiple threads make MPI calls concurrently, the outcome will be as if the calls executed sequentially in some (any) order

• Blocking MPI calls will block only the calling thread and will not prevent other threads from running or executing MPI functions

• It is the user's responsibility to prevent races when threads in the same application post conflicting MPI calls

• User must ensure that collective operations on the same communicator, window, or file handle are correctly ordered among threads

Page 24:

Efficient Support for MPI_THREAD_MULTIPLE

• MPI-2 allows users to write multithreaded programs and call MPI functions from multiple threads (MPI_THREAD_MULTIPLE)
• Thread safety does not come for free, however
• Implementation must protect certain data structures or parts of code with mutexes or critical sections
• To measure the performance impact, we ran tests to measure communication performance when using multiple threads versus multiple processes
  – For details, see Rajeev Thakur and William Gropp, “Test Suite for Evaluating Performance of MPI Implementations That Support MPI_THREAD_MULTIPLE,” in Proc. of the 14th European PVM/MPI Users’ Group Meeting (Euro PVM/MPI 2007), September 2007, pp. 46-55.
    • http://www-unix.mcs.anl.gov/~thakur/papers/thread-tests.pdf

Page 25:

Tests with Multiple Threads versus Processes

[Figure: concurrent communication between two nodes using multiple threads per process (T) versus multiple single-threaded processes (P).]

Page 26:

Concurrent Latency Test on Sun and IBM SMPs

Page 27:

Concurrent Bandwidth Test on Sun and IBM SMPs

Page 28:

Concurrent Latency Test on Linux Cluster

Page 29:

Concurrent Bandwidth Test on Linux Cluster

MPICH2 version 1.0.5, Open MPI version 1.2.1

Page 30:

Outline

• MPI code example: Conway’s Game of Life

• MPI and Threads

• Remote Memory Access

Page 31:

Revisiting Mesh Communication

• Recall the Game of Life parallel implementation
  – Determine source and destination data
• Do not need the full generality of send/receive
  – Each process can completely define what data needs to be moved to itself, relative to each process’s local mesh
    • Each process can “get” data from its neighbors
  – Alternately, each can define what data is needed by the neighbor processes
    • Each process can “put” data to its neighbors

Page 32:

Remote Memory Access

• Separates data transfer from indication of completion (synchronization)

• In message-passing, they are combined

[Figure: message passing — Proc 0: store, send; Proc 1: receive, load. One-sided — Proc 0: fence, put, fence while Proc 1: fence, fence, load; or Proc 1: store, fence, fence while Proc 0: fence, get, load.]

Page 33:

Remote Memory Access in MPI-2 (also called One-Sided Operations)

• Goals of MPI-2 RMA design
  – Balancing efficiency and portability across a wide class of architectures
    • shared-memory multiprocessors
    • NUMA architectures
    • distributed-memory MPPs, clusters
    • workstation networks
  – Retaining the “look and feel” of MPI-1
  – Dealing with subtle memory behavior issues: cache coherence, sequential consistency

Page 34:

Remote Memory Access Windows and Window Objects

[Figure: four processes (Process 0-3); each exposes a region of its address space as a window, and together the windows form the window object. Put and Get move data between one process’s local memory and another process’s window.]

Page 35:

Basic RMA Functions for Communication

• MPI_Win_create exposes local memory to RMA operations by other processes in a communicator
  – Collective operation
  – Creates window object

• MPI_Win_free deallocates window object

• MPI_Put moves data from local memory to remote memory
• MPI_Get retrieves data from remote memory into local memory
• MPI_Accumulate updates remote memory using local values
• Data movement operations are non-blocking
• Subsequent synchronization on window object needed to ensure operation is complete
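Putting those calls together, a minimal sketch (assumes at least two processes; the fence synchronization used here is covered later in this lecture):

    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank, buf = 0, value = 42;
        MPI_Win win;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Collective: each process exposes one int for RMA */
        MPI_Win_create(&buf, sizeof(int), sizeof(int), MPI_INFO_NULL,
                       MPI_COMM_WORLD, &win);

        MPI_Win_fence(0, win);
        if (rank == 0)                    /* rank 0 stores into rank 1's window */
            MPI_Put(&value, 1, MPI_INT, 1, 0 /* target offset */, 1, MPI_INT, win);
        MPI_Win_fence(0, win);            /* put is complete after this fence */

        /* buf is now 42 on rank 1 */

        MPI_Win_free(&win);
        MPI_Finalize();
        return 0;
    }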

Page 36:

Why Use RMA?

[Figure: “Halo Performance on Sun” — time in microseconds versus message size in bytes (0-1200) for sendrecv-8, psendrecv-8, putall-8, putpscwalloc-8, putlockshared-8, and putlocksharednb-8.]

• Potentially higher performance on some platforms, e.g., SMPs
• Productivity benefits of shared address space

Page 37:

Advantages of RMA Operations

• Can do multiple data transfers with a single synchronization operation
  – like BSP model

• Bypass tag matching
  – effectively precomputed as part of remote offset

• Some irregular communication patterns can be more economically expressed

• Can be significantly faster than send/receive on systems with hardware support for remote memory access, such as shared memory systems

Page 38:

Irregular Communication Patterns with RMA

• If the communication pattern is not known a priori, the send-recv model requires an extra step to determine how many sends/recvs to issue

• RMA, however, can handle it easily because only the origin or target process needs to issue the put or get call

• This makes dynamic communication easier to code in RMA

Page 39:

RMA Communication Calls

• MPI_Put - stores into remote memory

• MPI_Get - reads from remote memory

• MPI_Accumulate - updates remote memory

• All are non-blocking: data transfer is described, maybe even initiated, but may continue after call returns

• Subsequent synchronization on window object is needed to ensure operations are complete

Page 40:

Put, Get, and Accumulate

• MPI_Put(origin_addr, origin_count, origin_datatype, target_rank, target_offset, target_count, target_datatype, window)

• MPI_Get( ... )

• MPI_Accumulate( ..., op, ... )

• op is as in MPI_Reduce, but no user-defined operations are allowed
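For example, a sketch that sums contributions into an int exposed at offset 0 of rank 0’s window (`win` is assumed to have been created earlier with MPI_Win_create; the fence synchronization is described on the following slides):

    #include <mpi.h>

    /* Every caller adds its contribution to the counter on rank 0. */
    void add_to_counter(int contribution, MPI_Win win)
    {
        MPI_Win_fence(0, win);
        MPI_Accumulate(&contribution, 1, MPI_INT,
                       0 /* target rank */, 0 /* target offset */,
                       1, MPI_INT, MPI_SUM, win);
        MPI_Win_fence(0, win);    /* all accumulates complete here */
    }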

Page 41:

The Synchronization Issue

Issue: Which value is retrieved?
– Some form of synchronization is required between local load/stores and remote get/put/accumulates
– MPI provides multiple forms

[Figure: one process performs local stores to its window while another process issues MPI_Get on the same location; without synchronization, it is unclear which value the get retrieves.]

Page 42:

Synchronization with Fence

Simplest methods for synchronizing on window objects:
• MPI_Win_fence - like OpenMP flush

Process 0                      Process 1
MPI_Win_fence(win)             MPI_Win_fence(win)
MPI_Put
MPI_Put
MPI_Win_fence(win)             MPI_Win_fence(win)

Page 43:

More on Fence

• MPI_Win_fence is collective over the group of the window object

• MPI_Win_fence is used to separate, not just complete, RMA and local memory operations
  – That is why there are two fence calls

• Why?
  – MPI RMA is designed to be portable to a wide variety of machines, including those without cache-coherent hardware (including some of the fastest machines made)
  – See performance tuning for more info

Page 44:

Mesh Exchange Using MPI RMA

• Define the windows
  – Why – safety, options for performance (later)
• Define the data to move
• Mark the points where RMA can start and where it must complete (e.g., fence/put/put/fence)

Page 45:

Outline of 1D RMA Exchange

• Create window object
• Compute target offsets
• Exchange operation

Page 46:

Computing the Offsets

• Offset to top ghost row
  – 1
• Offset to bottom ghost row
  – 1 + (# cells in a row) * (# of rows – 1)
  – = 1 + (cols + 2) * (e – s + 2)

[Figure: local array with owned rows s through e and cols interior columns; a(1,s) and a(1,e) mark the first interior elements of the first and last owned rows.]
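A sketch of an exchange that uses these offsets (an illustration, not the mlife-fence.c code; it assumes every process owns the same number of rows nlocal = e - s + 1, that the window covers the whole local patch stored row-major starting at the top ghost row, and that the displacement unit is one cell):

    #include <mpi.h>

    void rma_exchange(int *patch, int nlocal, int cols,
                      int above, int below, MPI_Win win)
    {
        int rowlen = cols + 2;                              /* cells in a row */
        MPI_Aint top_ghost    = 1;                          /* offset to top ghost row */
        MPI_Aint bottom_ghost = 1 + (MPI_Aint)rowlen * (nlocal + 1);
                                                            /* 1 + (cols+2)*(e-s+2) */
        MPI_Win_fence(0, win);

        if (above != MPI_PROC_NULL)   /* my first owned row -> neighbor's bottom ghost row */
            MPI_Put(&patch[rowlen * 1 + 1], cols, MPI_INT,
                    above, bottom_ghost, cols, MPI_INT, win);

        if (below != MPI_PROC_NULL)   /* my last owned row -> neighbor's top ghost row */
            MPI_Put(&patch[rowlen * nlocal + 1], cols, MPI_INT,
                    below, top_ghost, cols, MPI_INT, win);

        MPI_Win_fence(0, win);
    }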

Page 47:

Fence Life Exchange Code Walkthrough

• Points to observe
  – MPI_Win_fence is used to separate RMA accesses from non-RMA accesses
    • Both starts and ends a data movement phase
  – Any memory may be used
    • No special malloc or restrictions on arrays
  – Uses same exchange interface as the point-to-point version
  – Two MPI_Win objects are used, one for the current patch, one for the next (and last) iteration’s patch
    • For this example, we could use a single MPI_Win for the entire patch
    • To simplify the evolution of this code to a two-dimensional decomposition, we use two MPI_Win objects

See mlife-fence.c pp. 1-3 for code example.