Page 1:

Distributed Memory Programming

with Message-Passing

Pacheco. Chapter 3

T. Yang, CS140 2014

Parts of these slides are from the textbook and from B. Gropp

Page 2:

Copyright © 2010, Elsevier Inc. All rights reserved.

Outline

• An overview of MPI programming

Six MPI functions and hello sample

How to compile/run

• More on send/receive communication

• Parallelizing numerical integration with MPI


Page 3:

Mainly for distributed-memory systems

Not targeted at shared-memory machines, but it can also run on them

Page 4:

Message Passing Libraries

• MPI, the Message Passing Interface, is now the industry standard, with bindings for C/C++ and other languages
• An MPI program runs as a set of processes; there are no shared variables
• All communication and synchronization require subroutine calls:

  Enquiries
  – How many processes? Which one am I? Any messages waiting?

  Communication
  – Point-to-point: send and receive
  – Collectives such as broadcast

  Synchronization
  – Barrier

Page 5:

MPI Implementations & References

• The Standard itself (MPI-2, MPI-3):
  at http://www.mpi-forum.org
• Implementations for Linux/Windows:
  Vendor-specific implementations
  MPICH
  Open MPI
• Other information on the Web:
  http://en.wikipedia.org/wiki/Message_Passing_Interface
  http://www.mcs.anl.gov/mpi – MPI talks and tutorials, a FAQ, other MPI pages

Page 6:

MPI is Simple

• Many parallel programs can be written using just these six functions, only two of which are non-trivial:

  MPI_INIT
  MPI_FINALIZE
  MPI_COMM_SIZE
  MPI_COMM_RANK
  MPI_SEND
  MPI_RECV

• To measure time: MPI_Wtime()

Slide source: Bill Gropp, ANL
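As a sketch (not from the slides) of how far the six functions go, the following minimal C program uses all of them to bounce one integer from rank 0 to rank 1; it assumes at least two processes when the send/receive branch is exercised.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size, value = 42;
    MPI_Status status;

    MPI_Init(&argc, &argv);                  /* set up MPI */
    MPI_Comm_size(MPI_COMM_WORLD, &size);    /* how many processes? */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* which one am I? */

    if (rank == 0 && size > 1) {
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        printf("rank 1 received %d\n", value);
    }

    MPI_Finalize();                          /* clean up */
    return 0;
}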

Page 7:

Hello World!


(a classic)

Page 8:

mpi_hello (C)

#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("I am %d of %d!\n", rank, size);
    MPI_Finalize();
    return 0;
}

Slide source: Bill Gropp, ANL

Page 9:

Compilation

mpicc -g -Wall -o mpi_hello mpi_hello.c

  mpicc        – wrapper script to compile
  -g           – produce debugging information
  -Wall        – turn on all warnings
  -o mpi_hello – create this executable file name (as opposed to the default a.out)
  mpi_hello.c  – source file

Page 10:

Execution with mpirun or mpiexec

mpirun -n <number of processes> <executable>

  mpirun -n 1 ./mpi_hello    (run with 1 process)
  mpirun -n 4 ./mpi_hello    (run with 4 processes)

Page 11:

Execution with mpirun at TSCC

mpirun -machinefile <filename> -n <number of processes> <executable>

  mpirun -machinefile $PBS_NODEFILE -n 4 ./mpi_hello    (run with 4 processes on the allocated machines)

Page 12:

Execution

mpirun -n 1 ./mpi_hello
  I am 0 of 1!

mpirun -n 4 ./mpi_hello
  I am 0 of 4!
  I am 1 of 4!
  I am 2 of 4!
  I am 3 of 4!

Page 13:

MPI Programs

• Written in C/C++.
  Has main.
  Uses stdio.h, string.h, etc.
• Need to add the mpi.h header file.
• Identifiers defined by MPI start with "MPI_".
• The first letter following the underscore is uppercase.
  For function names and MPI-defined types.
  Helps to avoid confusion.
• MPI functions return an error code; MPI_SUCCESS indicates success.

Page 14:

MPI Components

• MPI_Init
  Tells MPI to do all the necessary setup.

• MPI_Finalize
  Tells MPI we're done, so clean up anything allocated for this program.

Page 15:

Basic Outline

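A minimal sketch of the standard outline every MPI program follows:

#include <mpi.h>
/* ... other includes ... */

int main(int argc, char *argv[])
{
    /* no MPI calls before this line */
    MPI_Init(&argc, &argv);

    /* ... compute and communicate ... */

    MPI_Finalize();
    /* no MPI calls after this line */
    return 0;
}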

Page 16:

Basic Concepts: Communicator

• Processes can be collected into groups: a communicator
  Each message is sent and received in the same communicator
• A process is identified by its rank in the group associated with a communicator
• There is a default communicator whose group contains all the initial processes, called MPI_COMM_WORLD

Slide source: Bill Gropp, ANL

Page 17:

Communicators

MPI_Comm_size returns the number of processes in the communicator; MPI_Comm_rank returns my rank (the rank of the process making this call).
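Their C prototypes (argument names here follow the textbook's style):

int MPI_Comm_size(
    MPI_Comm comm,       /* in:  communicator                        */
    int*     comm_sz_p   /* out: number of processes in comm         */
);

int MPI_Comm_rank(
    MPI_Comm comm,       /* in:  communicator                        */
    int*     my_rank_p   /* out: rank of the calling process         */
);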

Page 18:

Basic Send

• Things specified:
  How will "data" be described?
  How will processes be identified?
  How will the receiver recognize/screen messages?
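A sketch of the MPI_Send prototype, annotated with which of those questions each argument answers (argument names are illustrative, in the textbook's style):

int MPI_Send(
    void*        msg_buf_p,    /* in: data – start of the message            */
    int          msg_size,     /* in: data – number of elements              */
    MPI_Datatype msg_type,     /* in: data – type of each element            */
    int          dest,         /* in: identification – rank of the receiver  */
    int          tag,          /* in: screening – nonnegative message label  */
    MPI_Comm     communicator  /* in: screening – communicator to use        */
);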

Page 19:

Data types

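The slide's table pairs predefined MPI datatypes with C types; a partial reconstruction of the common entries:

  MPI datatype    C datatype
  MPI_CHAR        signed char
  MPI_INT         signed int
  MPI_LONG        signed long int
  MPI_FLOAT       float
  MPI_DOUBLE      double
  MPI_BYTE        raw bytes (no C equivalent)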

Page 20:

Basic Receive: block until a matching message is received

• Things that need specifying:
  Where to receive the data
  How will the receiver recognize/screen messages?
  What was the actual message received?
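A sketch of the MPI_Recv prototype, annotated the same way (argument names again follow the textbook's style):

int MPI_Recv(
    void*        msg_buf_p,    /* out: where to put the received data      */
    int          buf_size,     /* in:  capacity of the buffer, in elements */
    MPI_Datatype buf_type,     /* in:  type of each element                */
    int          source,       /* in:  screening – expected sender's rank  */
    int          tag,          /* in:  screening – expected tag            */
    MPI_Comm     communicator, /* in:  screening – communicator            */
    MPI_Status*  status_p      /* out: actual sender, tag, and length      */
);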

Page 21:

Message matching

For a send and a receive to match: MPI_Send's dest must be the rank of the receiving process, MPI_Recv's src must be the rank of the sending process (or MPI_ANY_SOURCE), the tags must match (or the receiver uses MPI_ANY_TAG), and both calls must use the same communicator.

Page 22:

Receiving messages without knowing

• A receiver can get a message without knowing:

the amount of data in the message,

the sender of the message,

– Specify the source as MPI_ANY_SOURCE

or the tag of the message.

– Specify the tag as MPI_ANY_TAG


Page 23:

Status argument: who sent the message, and what was its tag?

The MPI_Status* argument returned by MPI_Recv records:
  – who sent the message (MPI_SOURCE)
  – what the tag was (MPI_TAG)
  – an error code (MPI_ERROR)
  – the actual message length (retrieved with MPI_Get_count)

Page 24:

Retrieving Further Information from the status argument in C

• Status is a data structure allocated in the user's program.
• In C:

  int recvd_tag, recvd_from, recvd_count;
  MPI_Status status;

  MPI_Recv(..., MPI_ANY_SOURCE, MPI_ANY_TAG, ..., &status);
  recvd_tag  = status.MPI_TAG;
  recvd_from = status.MPI_SOURCE;
  MPI_Get_count(&status, datatype, &recvd_count);

Slide source: Bill Gropp, ANL

Page 25:

Retrieving Further Information in C++

• Status is a data structure allocated in the user's program.
• In C++ (note that the C++ bindings were deprecated in MPI-2.2 and removed in MPI-3):

  int recvd_tag, recvd_from, recvd_count;
  MPI::Status status;

  Comm.Recv(..., MPI::ANY_SOURCE, MPI::ANY_TAG, ..., status);
  recvd_tag   = status.Get_tag();
  recvd_from  = status.Get_source();
  recvd_count = status.Get_count(datatype);

Slide source: Bill Gropp, ANL

Page 26:

MPI Example: Simple send/receive

#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, buf;
    MPI_Status status;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Process 0 sends and process 1 receives */
    if (rank == 0) {
        buf = 123456;
        MPI_Send(&buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    }
    else if (rank == 1) {
        MPI_Recv(&buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        printf("Received %d\n", buf);
    }

    MPI_Finalize();
    return 0;
}

Slide source: Bill Gropp, ANL

[Diagram: process 0 sends the value 123456 to process 1]

Page 27:

MPI Send/Receive Example in C++

#include "mpi.h"
#include <iostream>

int main(int argc, char *argv[])
{
    int rank, buf;
    MPI::Init(argc, argv);
    rank = MPI::COMM_WORLD.Get_rank();

    // Process 0 sends and process 1 receives
    if (rank == 0) {
        buf = 123456;
        MPI::COMM_WORLD.Send(&buf, 1, MPI::INT, 1, 0);
    }
    else if (rank == 1) {
        MPI::COMM_WORLD.Recv(&buf, 1, MPI::INT, 0, 0);
        std::cout << "Received " << buf << "\n";
    }

    MPI::Finalize();
    return 0;
}

Slide source: Bill Gropp, ANL

[Diagram: process 0 sends the value 123456 to process 1]

Page 28:

MPI_Wtime()

• Returns the current wall-clock time as a double.
• To time a program segment:

  start_time = MPI_Wtime();
  /* ... code being timed ... */
  end_time = MPI_Wtime();

  The time spent is end_time - start_time.

Page 29:

Example of using MPI_Wtime()

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int size, node;
    double start;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &node);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    start = MPI_Wtime();
    if (node == 0) {
        printf("Hello From Master. Time = %lf\n", MPI_Wtime() - start);
    }
    else {
        printf("Hello From Slave #%d %lf\n", node, MPI_Wtime() - start);
    }
    MPI_Finalize();
    return 0;
}

Page 30:

MPI Example: Numerical Integration with the Trapezoidal Rule

Textbook pp. 94-101

Page 31:

Numerical Integration with the Trapezoidal Rule

Page 32:

Approximation of Numerical Integration

[Figure: a curve and the area under it between 0 and 1]

Two ideas:
1) Use a simple function to approximate the integral area.
2) Divide the integral into small segments.

Page 33:

Trapezoid Rule

• Straight-line approximation: replace f(x) by the line L(x) through the endpoints:

  \int_a^b f(x)\,dx \approx \sum_{i=0}^{1} c_i f(x_i) = c_0 f(x_0) + c_1 f(x_1) = \frac{h}{2}\left[f(x_0) + f(x_1)\right]

[Figure: f(x) and its straight-line approximation L(x) between x_0 and x_1]

Page 34:

Example: Trapezoid Rule

Evaluate the integral I = \int_0^4 x e^{2x}\,dx.

• Exact solution:
  I = \left[\frac{e^{2x}}{4}(2x - 1)\right]_0^4 = \frac{7e^8 + 1}{4} = 5216.926477

• Trapezoidal rule (one segment, h = 4):
  I \approx \frac{h}{2}\left[f(0) + f(4)\right] = \frac{4 - 0}{2}\left(0 + 4e^8\right) = 23847.66

  Relative error: \frac{23847.66 - 5216.93}{5216.93} \times 100\% = 357.12\%

Page 35:

Apply the trapezoid rule to multiple segments over the integration limits

[Figure: the same integrand approximated with two, three, four, and many segments; the fit improves as the number of segments grows]

Page 36:

Composite Trapezoid Rule

With h = \frac{b-a}{n} and x_i = a + ih:

\int_a^b f(x)\,dx = \int_{x_0}^{x_1} f(x)\,dx + \int_{x_1}^{x_2} f(x)\,dx + \cdots + \int_{x_{n-1}}^{x_n} f(x)\,dx
                  = \frac{h}{2}\left[f(x_0) + f(x_1)\right] + \frac{h}{2}\left[f(x_1) + f(x_2)\right] + \cdots + \frac{h}{2}\left[f(x_{n-1}) + f(x_n)\right]
                  = \frac{h}{2}\left[f(x_0) + 2f(x_1) + 2f(x_2) + \cdots + 2f(x_{n-1}) + f(x_n)\right]

[Figure: f(x) sampled at x_0, x_1, x_2, x_3, x_4, each subinterval of width h]

Page 37:

Composite Trapezoid Rule: Example

Evaluate the integral I = \int_0^4 x e^{2x}\,dx (exact value 5216.93):

  n = 1,  h = 4:    I \approx \frac{h}{2}[f(0) + f(4)] = 23847.66  (error 357.12%)
  n = 2,  h = 2:    I \approx \frac{h}{2}[f(0) + 2f(2) + f(4)] = 12142.23  (error 132.75%)
  n = 4,  h = 1:    I \approx \frac{h}{2}[f(0) + 2f(1) + 2f(2) + 2f(3) + f(4)] = 7288.79  (error 39.71%)
  n = 8,  h = 0.5:  I \approx \frac{h}{2}[f(0) + 2f(0.5) + 2f(1) + \cdots + 2f(3.5) + f(4)] = 5764.76  (error 10.50%)
  n = 16, h = 0.25: I \approx \frac{h}{2}[f(0) + 2f(0.25) + 2f(0.5) + \cdots + 2f(3.75) + f(4)] = 5355.95  (error 2.66%)

Page 38:

Implementing Composite Trapezoidal Rule

[Figure: f(x) sampled at x_0 … x_4 with spacing h = \frac{b-a}{n}]

Page 39:

Pseudo-code for a serial program

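A minimal serial sketch following the composite rule above; the integrand f and the values of a, b, and n here are illustrative assumptions:

#include <stdio.h>

/* illustrative integrand; replace with the function being integrated */
double f(double x) { return x * x; }

int main(void)
{
    double a = 0.0, b = 4.0;              /* integration limits       */
    int    n = 1024;                      /* number of trapezoids     */
    double h = (b - a) / n;               /* width of each trapezoid  */
    double approx = (f(a) + f(b)) / 2.0;  /* endpoints counted once   */
    int i;

    for (i = 1; i <= n - 1; i++)          /* interior points twice    */
        approx += f(a + i * h);
    approx *= h;

    printf("With n = %d trapezoids, estimate = %f\n", n, approx);
    return 0;
}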

Page 40:

Parallelizing the Trapezoidal Rule

1. Partition the problem solution into tasks.
2. Identify the communication channels between tasks.
3. Aggregate tasks into composite tasks.
4. Map composite tasks to cores.

[Figure: the interval x_0 … x_4 split into four pieces of width h; Task 0 → Core 0, Task 1 → Core 1, Task 2 → Core 2, Task 3 → Core 3]

Page 41:

Parallel pseudo-code

Each process computes its local area; the local values are then summed into the final result.

Page 42:

Tasks and communications for the Trapezoidal Rule

Page 43:

First version (1)

Use send/receive to sum

Page 44:

First version (2)


Use send/receive to sum
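A sketch of that summation, assuming the textbook-style variable names (my_rank, comm_sz, local_a, local_b, local_n, and a local-area function Trap): every nonzero rank sends its local integral to process 0, which receives and accumulates them.

double local_int, total_int;
int    source;

local_int = Trap(local_a, local_b, local_n, h);   /* this process's area */

if (my_rank != 0) {
    /* workers: send the local result to process 0 */
    MPI_Send(&local_int, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
} else {
    /* process 0: receive and add everyone else's result */
    total_int = local_int;
    for (source = 1; source < comm_sz; source++) {
        MPI_Recv(&local_int, 1, MPI_DOUBLE, source, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        total_int += local_int;
    }
}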

Page 45:

First version: Trapezoidal Rule for the local area
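A sketch of the local-area function: the same composite trapezoid rule, applied only to this process's subinterval [left_endpt, right_endpt] with trap_count trapezoids of width base_len (names follow the textbook's style):

double Trap(double left_endpt, double right_endpt,
            int trap_count, double base_len)
{
    double estimate = (f(left_endpt) + f(right_endpt)) / 2.0;
    int    i;

    for (i = 1; i <= trap_count - 1; i++)     /* interior points twice */
        estimate += f(left_endpt + i * base_len);

    return estimate * base_len;               /* multiply the sum by h */
}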

Page 46:

I/O handling in the trapezoidal program

• Most MPI implementations only allow process 0 in MPI_COMM_WORLD to access stdin.
• Process 0 must read the data (scanf) and send it to the other processes.

Page 47:

Function for reading user input

Process 0 inputs the parameters and then broadcasts them to the other processes.
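A sketch of such an input function, assuming a collective broadcast is used to distribute the parameters (an alternative is a send/receive loop from process 0):

void Get_input(int my_rank, double* a_p, double* b_p, int* n_p)
{
    if (my_rank == 0) {                  /* process 0 inputs parameters */
        printf("Enter a, b, and n\n");
        scanf("%lf %lf %d", a_p, b_p, n_p);
    }
    /* broadcast the parameters from process 0 to all other processes */
    MPI_Bcast(a_p, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    MPI_Bcast(b_p, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    MPI_Bcast(n_p, 1, MPI_INT,    0, MPI_COMM_WORLD);
}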