Message Passing and MPI Collective Operations and Buffering
Transcript of Message Passing and MPI Collective Operations and Buffering
Laxmikant Kale
CS 320
2
Example: Jacobi relaxation
Pseudocode:
  A, Anew: N x N 2D arrays of floating-point (FP) numbers
  loop (how many times? until convergence, or a fixed count)
    for each I = 1, N
      for each J = 1, N
        Anew[I,J] = average of A[I,J] and its 4 neighbors
    swap Anew and A
  end loop
Red and blue boundaries are held at fixed values (say, temperatures).
Discretization: divide the space into a grid of cells.
For all cells except those on the boundary: iteratively compute each cell's temperature as the average of its neighboring cells' temperatures.
3
How to parallelize?
• Decide how to decompose the data:
  – What options are there? (e.g., 16 processors)
    • Vertically
    • Horizontally
    • In square chunks
  – Pros and cons
• Identify the communication needed
  – Let us assume we will run for a fixed number of iterations
  – What data do I need from others?
  – From whom specifically?
  – Reverse the question: who needs my data?
  – Express this with sends and recvs.
4
Ghost cells: a common apparition
• The data I need from neighbors
  – But that I don't modify (and therefore "don't own")
• Can be stored in my own data structures
  – So that my inner loops don't have to know about communication at all
  – They can be written as if they were sequential code
5
Convergence Test
• Notice that all processors must report their convergence
  – The program has converged only if all processors have converged
  – Send data to one processor (say #0)
  – What if you are running on 1000 processors?
    • Too much overhead on that one processor (serialization)
  – Use a spanning tree:
    • A simple one: processor P's parent is (P-1)/2
      – Children: 2P+1 and 2P+2
• Is that the best spanning tree?
  – It depends on the machine!
  – MPI supports a single interface
    • Implemented differently on different machines
6
MPI_Reduce
• Reduce data across processors, and use the result on the root.

MPI_Reduce(data, result, size, MPI_Datatype, MPI_Op, root, communicator)
MPI_Allreduce(data, result, size, MPI_Datatype, MPI_Op, communicator)
  (Allreduce has no root argument: every processor receives the result.)
7
Other collective ops
• Barriers, Gather, Scatter

MPI_Barrier(MPI_Comm)
MPI_Gather(sendBuf, sendSize, sendType, recvBuf, recvSize, recvType, root, comm)
MPI_Scatter(…)
MPI_Allgather(… no root …)
MPI_Alltoall(…)   (the "all" counterpart: every process scatters to, and gathers from, every other)
8
Collective calls
• Message passing is often, but not always, used for the SPMD style of programming:
  – SPMD: Single Program, Multiple Data
  – All processors execute essentially the same program, and the same steps, but not in lockstep
• All communication is almost in lockstep
• Collective calls:
  – global reductions (such as max or sum)
  – syncBroadcast (often just called broadcast):
    • syncBroadcast(whoAmI, dataSize, dataBuffer);
      – whoAmI: sender or receiver
9
Other Operations
• Collective Operations
– Broadcast
– Reduction
– Scan
– All-to-All
– Gather/Scatter
• Support for Topologies
• Buffering issues: optimizing message passing
• Data-type support