Parallel Programming, MPI
Lecture 2

Ehsan Nedaaee Oskoee
Department of Physics, IASBS

IPM Grid and HPC Workshop IV, 2011
Outline

1. Introduction and Review
   - The Von Neumann Computer
   - Kinds of Parallel Machine
     - Distributed memory parallel machines
     - Shared memory parallel machines
2. Methodical Design
   - Partitioning
     - Domain Decomposition
     - Functional Decomposition
3. An Introduction to MPI
4. Point-to-Point Communication
   - Blocking PTP Communication
The Von Neumann Computer
Figure: The Von Neumann Computer
Different types of parallel platforms: Shared Memory

Figure: A typical representation of a shared memory parallel machine
Different types of parallel platforms: Distributed Memory

Figure: A typical representation of a distributed memory parallel machine
Methodical Design

Partitioning: Domain Decomposition

Figure: Domain Decomposition [1]

[1] Designing and Building Parallel Programs (on-line book), by Ian Foster
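Not part of the original slide, but as a concrete illustration of domain decomposition: a minimal sketch, assuming a simple block distribution of N grid cells over the ranks of MPI_COMM_WORLD. The value of N and the variable names are illustrative only.

#include <mpi.h>
#include <stdio.h>

/* Block-distribute N cells: every rank gets N/npes cells, and the first
 * N%npes ranks get one extra cell each.                                 */
int main(int argc, char **argv)
{
    const int N = 64;     /* total number of grid cells (example value) */
    int npes, myrank;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &npes);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

    int base  = N / npes;
    int rest  = N % npes;
    int local = base + (myrank < rest ? 1 : 0);
    int first = myrank * base + (myrank < rest ? myrank : rest);

    printf("rank %d owns cells [%d, %d)\n", myrank, first, first + local);

    MPI_Finalize();
    return 0;
}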
Methodical Design

Partitioning: Functional Decomposition

Figure: Functional Decomposition [2]

[2] Designing and Building Parallel Programs (on-line book), by Ian Foster
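Likewise not on the original slide: a minimal sketch of functional decomposition, assuming the climate-model style example (separate atmosphere, ocean, and land components) used in Foster's book. The function names are placeholders; a real code would compute rather than print.

#include <mpi.h>
#include <stdio.h>

/* Placeholder component functions: each stands for one piece of the
 * overall computation.                                               */
static void atmosphere_model(void) { printf("atmosphere step\n"); }
static void ocean_model(void)      { printf("ocean step\n"); }
static void land_model(void)       { printf("land step\n"); }

int main(int argc, char **argv)
{
    int myrank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

    /* Each rank is assigned a function of the problem, not a block of data. */
    switch (myrank % 3) {
        case 0:  atmosphere_model(); break;
        case 1:  ocean_model();      break;
        default: land_model();       break;
    }

    MPI_Finalize();
    return 0;
}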
Methodical Design [3]

Communication

- Number of tasks: 8 × 8 = 64; number of communications: 4 × 64 = 256.
- Number of tasks: 1 × 4 = 4; number of communications: 4 × 4 = 16.

[3] Designing and Building Parallel Programs (on-line book), by Ian Foster
Methodical Design [4]

Mapping

[4] Designing and Building Parallel Programs (on-line book), by Ian Foster
An Introduction to MPI

Applications:
- Scalable Parallel Computers (SPCs) with distributed memory
- Networks Of Workstations (NOWs)

Some Goals of MPI:
- Design an application programming interface
- Allow efficient communication
- Allow for implementations that can be used in a heterogeneous environment
- Allow convenient C and Fortran 77 bindings for the interface
- Provide a reliable communication interface
- Define an interface not too different from current practice, such as PVM, NX, etc.
An Introduction to MPI

What is Included in MPI:
- Point-to-point communication
- Collective operations
- Process groups
- Communication domains
- Process topologies
- Environmental management and inquiry
- Profiling interface
- Bindings for Fortran 77 and C (also for C++ and Fortran 90 in MPI-2)
- I/O functions (in MPI-2)

Versions of MPI:
- Version 1.0 (June 1994)
- Version 1.1 (June 1995)
- Version 2
An Introduction to MPI

Procedure Specification (illustrated in the sketch after this list):
- The call uses but does not update an argument marked IN.
- The call may update an argument marked OUT.
- The call both uses and updates an argument marked INOUT.

Types of MPI Calls:
- Local
- Non-local
- Blocking
- Non-blocking
- Opaque Objects
- Language Binding
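Not on the original slide: a small sketch, using the C binding, of how the IN/OUT/INOUT classification reads on a familiar call; the comments mark which arguments are used and which are updated.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;                               /* target of an OUT argument          */
    MPI_Comm_rank(MPI_COMM_WORLD,           /* comm is IN: used but never updated */
                  &rank);                   /* rank is OUT: written by the call   */

    /* An INOUT argument (for example the request array of MPI_Waitall)
     * is both used and updated by the call.                             */
    printf("my rank is %d\n", rank);

    MPI_Finalize();
    return 0;
}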
Point-to-Point Communication

The Simplest Example

main.for

      Program main
      implicit none
      include 'mpif.h'
      integer ierr, rc
      call MPI_INIT(ierr)
      print*, 'HI There'
      call MPI_FINALIZE(rc)
      End

main.cpp

#include <iostream>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    std::cout << "HI There" << std::endl;
    MPI_Finalize();
    return 0;
}
Point-to-Point Communication

Compiling a Program

> more hostfile
192.168.189.11 1
Naft 2
Oil 1

> more hostfile
HPCLAB
ws01
ws02
ws03

lamboot -v hostfile
mpicc  code_name.c   -o code_exe_name
mpiCC  code_name.cpp -o code_exe_name
mpif77 code_name.for -o code_exe_name
mpif90 code_name.f90 -o code_exe_name
mpirun -v -np 9 code_exe_name
mpirun N code_exe_name
Point-to-Point Communication

A More Complex Program (C++)

#include <iostream>
#include <mpi.h>

int main(int argc, char **argv)
{
    int npes, myrank;
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &npes);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    std::cout << "HI There, I am node " << myrank
              << " and the total number of workers which you are using now is: "
              << npes << std::endl;
    MPI_Finalize();
    return 0;
}
Point-to-Point Communication

A More Complex Program (Fortran)

      Program size_rank
      implicit none
      include 'mpif.h'
      integer ierr, npes, myrank
      call MPI_INIT(ierr)
      call MPI_COMM_SIZE( MPI_COMM_WORLD, npes, ierr )
      call MPI_COMM_RANK( MPI_COMM_WORLD, myrank, ierr )
      print*, "HI There, I am node ", myrank, " and the total",
     *        " number of workers which you are using now is: ", npes
      call MPI_FINALIZE(ierr)
      End
Point-to-Point Communication
Blocking Send Operation
MPI_SEND(buf, count, datatype, dest, tag, comm)

IN  buf       initial address of send buffer
IN  count     number of entries to send
IN  datatype  datatype of each entry
IN  dest      rank of destination
IN  tag       message tag
IN  comm      communicator

C version
int MPI_Send(void* buf, int count, MPI_Datatype datatype,
             int dest, int tag, MPI_Comm comm)
Fortran version
MPI_SEND(BUF, COUNT, DATATYPE, DEST, TAG, COMM, IERROR)
    <type> BUF(*)
    INTEGER COUNT, DATATYPE, DEST, TAG, COMM, IERROR
Point-to-Point Communication
Blocking Receive Operation
MPI_RECV(buf, count, datatype, source, tag, comm, status)

OUT buf       initial address of receive buffer
IN  count     number of entries to receive
IN  datatype  datatype of each entry
IN  source    rank of source
IN  tag       message tag
IN  comm      communicator
OUT status    return status

C version
int MPI_Recv(void* buf, int count, MPI_Datatype datatype,
             int source, int tag, MPI_Comm comm,
             MPI_Status *status)
Fortran version
MPI_RECV(BUF, COUNT, DATATYPE, SOURCE, TAG, COMM, STATUS, IERROR)
    <type> BUF(*)
    INTEGER COUNT, DATATYPE, SOURCE, TAG, COMM,
            STATUS(MPI_STATUS_SIZE), IERROR
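Not on the original slides: a minimal sketch pairing the two calls above between ranks 0 and 1, using the C binding. The buffer length and tag value are illustrative.

#include <mpi.h>
#include <stdio.h>

/* Rank 0 sends 10 integers to rank 1 with tag 7; rank 1 receives them. */
int main(int argc, char **argv)
{
    int myrank, i, data[10];
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

    if (myrank == 0) {
        for (i = 0; i < 10; i++) data[i] = i;
        MPI_Send(data, 10, MPI_INT, 1, 7, MPI_COMM_WORLD);
    } else if (myrank == 1) {
        MPI_Recv(data, 10, MPI_INT, 0, 7, MPI_COMM_WORLD, &status);
        printf("rank 1 received, last entry = %d\n", data[9]);
    }

    MPI_Finalize();
    return 0;
}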
Point-to-Point Communication

Data Types (Fortran)

MPI Data Type            Fortran Data Type
MPI_INTEGER              INTEGER
MPI_REAL                 REAL
MPI_DOUBLE_PRECISION     DOUBLE PRECISION
MPI_COMPLEX              COMPLEX
MPI_LOGICAL              LOGICAL
MPI_CHARACTER            CHARACTER(1)
MPI_BYTE
MPI_PACKED
Point-to-Point Communication

Data Types (C)

MPI Data Type            C Data Type
MPI_CHAR                 signed char
MPI_SHORT                signed short int
MPI_INT                  signed int
MPI_LONG                 signed long int
MPI_UNSIGNED_CHAR        unsigned char
MPI_UNSIGNED_SHORT       unsigned short int
MPI_UNSIGNED             unsigned int
MPI_UNSIGNED_LONG        unsigned long int
MPI_FLOAT                float
MPI_DOUBLE               double
MPI_LONG_DOUBLE          long double
MPI_BYTE
MPI_PACKED
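Not on the original slide: a short sketch of how the C type of the buffer is matched by the MPI datatype argument (double and MPI_DOUBLE here), with count given in elements rather than bytes. The array length, tag, and peer ranks are illustrative, and MPI_Init is assumed to have been called already.

#include <mpi.h>

void send_or_receive_doubles(int myrank)
{
    double x[100];
    MPI_Status status;
    int i;

    if (myrank == 0) {
        for (i = 0; i < 100; i++) x[i] = 1.0 * i;
        MPI_Send(x, 100, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);  /* 100 doubles to rank 1 */
    } else if (myrank == 1) {
        MPI_Recv(x, 100, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &status);
    }
}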
Point-to-Point Communication
A useful command
MPI_GET_PROCESSOR_NAME(name, resultlen)

OUT name       a unique specifier for the current physical node
OUT resultlen  length (in printable characters) of the result returned in name

C version
int MPI_Get_processor_name(char* name, int* resultlen)

Fortran version
MPI_GET_PROCESSOR_NAME(NAME, RESULTLEN, IERROR)
    CHARACTER*(*) NAME
    INTEGER RESULTLEN, IERROR
Point-to-Point Communication

Blocking Send & Receive (Bsend_recd_1.cpp)

#include <mpi.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    int npes, myrank, namelen, i;
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    char greeting[MPI_MAX_PROCESSOR_NAME + 80];
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &npes);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    MPI_Get_processor_name(processor_name, &namelen);

    sprintf(greeting, "Hello, World, From process %d of %d on %s",
            myrank, npes, processor_name);

    if (myrank == 0) {
        printf("%s\n", greeting);
        for (i = 1; i < npes; i++) {
            MPI_Recv(greeting, sizeof(greeting), MPI_CHAR, i, 1,
                     MPI_COMM_WORLD, &status);
            printf("%s\n", greeting);
        }
    } else {
        MPI_Send(greeting, strlen(greeting) + 1, MPI_CHAR, 0, 1,
                 MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}
Point-to-Point Communication

Blocking Send & Receive (Bsend_Recd_1.for)

      Program Bsend_Recd_1
      implicit none
      include 'mpif.h'
      INTEGER ierr, npes, myrank, namelen, i
      INTEGER stat(MPI_STATUS_SIZE)
      CHARACTER*(MPI_MAX_PROCESSOR_NAME) processor_name
      CHARACTER(MPI_MAX_PROCESSOR_NAME + 80) greeting
      CHARACTER(1) numb(0:9)

      numb(0)="0" ; numb(1)="1" ; numb(2)="2" ; numb(3)="3"
      numb(4)="4" ; numb(5)="5" ; numb(6)="6" ; numb(7)="7"
      numb(8)="8" ; numb(9)="9"

      call MPI_INIT(ierr)
      call MPI_COMM_SIZE(MPI_COMM_WORLD, npes, ierr)
      call MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
      call MPI_GET_PROCESSOR_NAME(processor_name, namelen, ierr)

      greeting = 'Hello World, From process '//numb(myrank)//
     *           ' of '//numb(npes)//' on '//processor_name

      IF (myrank.EQ.0) THEN
        print*, greeting
        DO i = 1, npes-1
          call MPI_RECV(greeting, len(greeting), MPI_CHARACTER, i, 1,
     *                  MPI_COMM_WORLD, stat, ierr)
          print*, greeting
        END DO
      ELSE
        call MPI_SEND(greeting, len(greeting), MPI_CHARACTER, 0, 1,
     *                MPI_COMM_WORLD, ierr)
      ENDIF

      call MPI_FINALIZE(ierr)
      End
Point-to-Point Communication

Safety

Wildcards in a receive: source = MPI_ANY_SOURCE matches a message from any sender, and tag = MPI_ANY_TAG matches any tag.
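Not on the original slides (which illustrate this with figures): a minimal sketch of why blocking calls need a safe ordering. If both partners post MPI_Send first, each send can block waiting for a matching receive once the messages exceed MPI's internal buffering, and the program deadlocks; ordering the calls so that one side receives first avoids this, which is what the even/odd ordering in the example programs below does. The message length and tag are illustrative.

#include <mpi.h>

#define N 100000   /* large enough that MPI may not buffer the whole message */

/* Unsafe: both ranks send first. With large messages both MPI_Send calls
 * can block waiting for a matching receive, and the exchange deadlocks.  */
void unsafe_exchange(int peer, double *out, double *in)
{
    MPI_Status st;
    MPI_Send(out, N, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD);
    MPI_Recv(in,  N, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, &st);
}

/* Safe: the two ranks agree on an order, so every blocking send meets a
 * receive posted on the other side.                                      */
void safe_exchange(int myrank, int peer, double *out, double *in)
{
    MPI_Status st;
    if (myrank % 2 == 0) {
        MPI_Send(out, N, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD);
        MPI_Recv(in,  N, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, &st);
    } else {
        MPI_Recv(in,  N, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, &st);
        MPI_Send(out, N, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD);
    }
}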
Point-to-Point Communication
Blocking Send & Receive, example 2: Domain Decomposition
Point-to-Point Communication

Blocking Send & Receive (Bsend_recd_2.cpp), excerpt

/* Each worker rank fills its (np+2) x (np+2) block, including ghost rows,
   then exchanges boundary rows with its left and right neighbours.
   Rank 0 does not take part; workers 1..npes-1 wrap around periodically. */
for (j = 0; j <= np+1; j++) {
    for (i = 0; i <= np+1; i++)
        a[i][j] = myrank*100 + 10*j + i;
}

left  = myrank - 1;
right = myrank + 1;
if (myrank == 1)        left  = npes - 1;
if (myrank == (npes-1)) right = 1;

if (myrank != 0) {
    if (myrank % 2 == 0) {
        /* Even workers send first, then receive. */
        MPI_Send(&a[1][0],    np+2, MPI_INT, left,  1, MPI_COMM_WORLD);
        MPI_Send(&a[np][0],   np+2, MPI_INT, right, 1, MPI_COMM_WORLD);
        MPI_Recv(&a[0][0],    np+2, MPI_INT, left,  1, MPI_COMM_WORLD, &stat);
        MPI_Recv(&a[np+1][0], np+2, MPI_INT, right, 1, MPI_COMM_WORLD, &stat);
    } else {
        /* Odd workers receive first, then send, so every blocking send
           has a matching receive and the exchange cannot deadlock.     */
        MPI_Recv(&a[np+1][0], np+2, MPI_INT, right, 1, MPI_COMM_WORLD, &stat);
        MPI_Recv(&a[0][0],    np+2, MPI_INT, left,  1, MPI_COMM_WORLD, &stat);
        MPI_Send(&a[np][0],   np+2, MPI_INT, right, 1, MPI_COMM_WORLD);
        MPI_Send(&a[1][0],    np+2, MPI_INT, left,  1, MPI_COMM_WORLD);
    }
}
Point-to-Point Communication

Blocking Send & Receive (Bsend_Recd_2.for), excerpt

C     Boundary exchange among the worker ranks (1..npes-1):
C     even workers send first, odd workers receive first.
      IF (myrank.NE.0) THEN
        IF (MOD(myrank,2).EQ.0) THEN
          call MPI_SEND(a(0,1), np+2, MPI_INTEGER, left, 1,
     *                  MPI_COMM_WORLD, ierr)
          call MPI_SEND(a(0,np), np+2, MPI_INTEGER, right, 1,
     *                  MPI_COMM_WORLD, ierr)
          call MPI_RECV(a(0,0), np+2, MPI_INTEGER, left, 1,
     *                  MPI_COMM_WORLD, stat, ierr)
          call MPI_RECV(a(0,np+1), np+2, MPI_INTEGER, right, 1,
     *                  MPI_COMM_WORLD, stat, ierr)
        ELSE
          call MPI_RECV(a(0,np+1), np+2, MPI_INTEGER, right, 1,
     *                  MPI_COMM_WORLD, stat, ierr)
          call MPI_RECV(a(0,0), np+2, MPI_INTEGER, left, 1,
     *                  MPI_COMM_WORLD, stat, ierr)
          call MPI_SEND(a(0,np), np+2, MPI_INTEGER, right, 1,
     *                  MPI_COMM_WORLD, ierr)
          call MPI_SEND(a(0,1), np+2, MPI_INTEGER, left, 1,
     *                  MPI_COMM_WORLD, ierr)
        ENDIF
      ENDIF
The End
That’s All Folks