GridKA School 2009: MPI on Grids (September 3rd, 2009)

High Performance Computing Centre Stuttgart
Kiril Dichev
Overview of using MPI on Grids
Advantages: a large pool of computing and storage resources is available through Grid computing:
“The EGEE Grid consists of over 36,000 CPU available to users 24 hours a day, 7 days a week, in addition to about 5 PB disk (5 million Gigabytes) + tape MSS of storage, and maintains 30,000 concurrent jobs on average”
Disadvantages:
(for admins) MPI must be configured at the site level (MPI libraries and supporting software on the clusters, publishing of MPI support to the Information System)
(for users) The Grid middleware has no built-in support for MPI jobs
advanced setups are more difficult
There is a tension between high-performance computing and Grid computing:
If you do high-performance computing, you want to configure the following yourself:
Resource reservation
The runtime environment
If you do Grid computing, you (normally) use the Grid simply as a set of abstract, available resources
Additional tuning is possible through scripts, but it is harder to pass through the middleware
Certain aspects (such as resource manager options) cannot be controlled at all
MPI status in EGEE (report from 2008):
Of the 331 EGEE sites tested, only 36 sites accept parallel jobs
22 of those 36 sites can actually run MPI jobs
The main problems: no MPI installation, misconfigured Grid middleware advertising of MPI, broken MPI installations, incorrect startup of MPI jobs, etc.
Euforia infrastructure:
In an HPC environment:
Specify the batch job configuration file (see the sketch below)
Very flexible control of node reservation
Very flexible control of runtime options
Submit
In a Grid environment:
Log into a UI computer
Specify the resource requirements in a JDL file
Limited configurability of reservation/runtime
Submit
[Diagram comparing the two submission workflows]
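On the HPC side, the batch job configuration file is a script for the local resource manager. A minimal sketch, assuming a PBS/Torque cluster with Open MPI available on the nodes; the job name, node counts, walltime and queue are illustrative assumptions, not taken from the slides:

#!/bin/bash
#PBS -N ring_c                 # job name
#PBS -l nodes=4:ppn=2          # reserve 4 nodes with 2 processes per node
#PBS -l walltime=00:10:00      # wall-clock limit
#PBS -q workq                  # target queue (site-specific)
cd $PBS_O_WORKDIR              # run from the submission directory
mpirun -np 8 ./ring_c          # start 8 MPI processes

On the Grid side, the equivalent information goes into the JDL file and the middleware chooses the site; a JDL sketch follows the submission steps below.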
MPI-Start
MPI-Start is a set of shell scripts to support MPI applications
The goal is to "Do The Right Thing" when running MPI-parallel jobs on different sites with different configurations
Detection mechanisms for the local batch scheduler and the installed MPI implementation
High configurability (by both admins and users)
Flexibility (component-based architecture supporting all modern HPC architectures/interconnects)
Responsive mailing list
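In practice a user drives MPI-Start from a small wrapper script executed on the worker node, which sets MPI-Start's environment variables and then hands over control. A minimal sketch, assuming the I2G_MPI_* variables of MPI-Start; the binary name ring_c and the Open MPI flavour are illustrative:

#!/bin/bash
# Tell MPI-Start which binary to run and which MPI flavour to use.
export I2G_MPI_APPLICATION=ring_c
export I2G_MPI_APPLICATION_ARGS=""
export I2G_MPI_TYPE=openmpi
# Optional hook scripts (see the hook sketch at the end of this section):
# export I2G_MPI_PRE_RUN_HOOK=mpi-hooks.sh
# export I2G_MPI_POST_RUN_HOOK=mpi-hooks.sh
# Hand over to MPI-Start, which detects the scheduler and the MPI installation.
$I2G_MPI_START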
MPI on Grids
For EGEE:
edg-job-* tools (LCG-based)
glite-wms-job-* tools (gLite-based)
The typical command-line tools for jobs on Euforia are the i2g-job-* tools, which are modified edg-job-* tools
Log into the Euforia UI iwrui2.fzk.de
Compile your MPI application:
mpicc -o ring_c ring_c.c
Create a temporary proxy associated with your certificate:
voms-proxy-init --voms itut
i2g-job-submit <jdl-file>
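The JDL file passed to i2g-job-submit describes the parallel job to the middleware. A minimal sketch for the ring_c example, assuming the Parallel job type with the SubJobType and NodeNumber attributes used by the i2g middleware; the node count, file names and sandbox contents are illustrative assumptions:

# ring_c.jdl - illustrative MPI job description
JobType       = "Parallel";
SubJobType    = "openmpi";
NodeNumber    = 4;
Executable    = "ring_c";
StdOutput     = "ring_c.out";
StdError      = "ring_c.err";
InputSandbox  = {"ring_c"};
OutputSandbox = {"ring_c.out", "ring_c.err"};

The middleware then matches the request against sites that advertise MPI support, and MPI-Start takes care of the site-local details listed below.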
Further MPI-Start features:
File distribution for different types of clusters
Handling large input/output files (transfer from/to an SE)
Forwarding MPI runtime options to mpirun/mpiexec
Support for MPI tools can be added easily:
MPI Performance measurement tools
MPI correctness checking tools
The key to most advanced features is to use shell scripts (e.g. MPI-Start extensions) before, during, and after the program execution (see the hook sketch below). The scripts allow interaction with:
The Grid middleware
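A minimal sketch of such an extension, assuming MPI-Start's pre_run_hook/post_run_hook convention; the file name, the recompile step and the output listing are illustrative assumptions (a performance-measurement or correctness-checking tool could be wired in the same way):

#!/bin/bash
# mpi-hooks.sh - illustrative MPI-Start hook file
# Called by MPI-Start on the execution site before mpirun/mpiexec starts.
pre_run_hook () {
  echo "Recompiling ${I2G_MPI_APPLICATION} on the execution site"
  mpicc -o "${I2G_MPI_APPLICATION}" "${I2G_MPI_APPLICATION}.c" || return 1
  return 0
}
# Called after the MPI application has finished, e.g. to collect results.
post_run_hook () {
  echo "Job finished, collecting results"
  ls -l
  return 0
}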