Configuring sites for MPI

Page 1: Configuring sites for MPI

EGEE-II INFSO-RI-031688

Enabling Grids for E-sciencE

www.eu-egee.org

EGEE and gLite are registered trademarks

Configuring sites for MPI

Stephen Childs

Trinity College Dublin

Page 2: Configuring sites for MPI

MPI applications course 2

Overview

• Site configuration issues
• Resource broker and WMS
• YAIM
• Quattor

Page 3: Configuring sites for MPI

Why do we care?

• Why should users care about site configuration?
– The more sites that are configured correctly, the more places you can run your MPI code
– Helpful to have an idea of what is required before talking to site admins
– Some small fixes by sites can greatly improve user experience

Page 4: Configuring sites for MPI

Site configuration

• The recommended configuration
– Shared home filesystem between WNs (and possibly CE)
– Use the “pbs” jobmanager, not “lcgpbs”
– Install mpi-start RPM on WNs
– Install required MPI flavours on WNs
– Publish mpi-start availability and MPI versions in GLUE RTE
– Set environment variables on WN describing MPI flavours

• Modules exist for Quattor and YAIM to do this

• Workarounds
– Install “dummy” mpirun (may break older usage though)
– Edit GLUE information to publish “pbs” not “torque”
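
As a rough illustration of the last two items of the recommended configuration, the sketch below derives GLUE run-time-environment tags and worker-node environment variables from a list of installed MPI flavours. The naming patterns used here (MPI-START, FLAVOUR-version, MPI_FLAVOUR_PATH, MPI_FLAVOUR_VERSION), the versions and the install paths are assumptions made for illustration and are not taken from this talk; check what your middleware release actually expects.

```python
# Hypothetical inventory of MPI flavours installed on the worker nodes:
# flavour name -> (version, install prefix). All values are illustrative.
INSTALLED_MPIS = {
    "mpich": ("1.2.7", "/opt/mpich-1.2.7"),
    "openmpi": ("1.1", "/opt/openmpi-1.1"),
}


def glue_rte_tags(mpis, have_mpi_start=True):
    """Return the run-time-environment tags a CE could publish (assumed naming)."""
    tags = ["MPI-START"] if have_mpi_start else []
    for flavour, (version, _prefix) in sorted(mpis.items()):
        name = flavour.upper()
        tags.append(name)                 # the flavour is available
        tags.append(f"{name}-{version}")  # this specific version is available
    return tags


def wn_environment(mpis):
    """Return environment variables describing each flavour on a WN (assumed naming)."""
    env = {}
    for flavour, (version, prefix) in sorted(mpis.items()):
        name = flavour.upper()
        env[f"MPI_{name}_PATH"] = prefix
        env[f"MPI_{name}_VERSION"] = version
    return env


if __name__ == "__main__":
    for tag in glue_rte_tags(INSTALLED_MPIS):
        print(tag)
    for key, value in wn_environment(INSTALLED_MPIS).items():
        print(f"export {key}={value}")
```

Keeping both outputs driven by one inventory means the information published on the CE and the variables set on the WNs cannot drift apart.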

Page 5: Configuring sites for MPI

RB/WMS configuration

• Patched version of LCG RB available
– Deployed in Grid-Ireland for >1 year
– Allows multi-node Normal jobs
– Worth considering if your VO still uses LCG RB?

• gLite WMS allows jobwrappers to be edited
– Can remove hard-coded “mpirun” invocation
– Needs to be done for each supported LRMS

• Newer WMS should allow for “Normal” jobs with multiple nodes
– No hard-coded “mpirun”
– Remove check for “pbs” or “lsf” jobmanagers
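
To make the idea of a “Normal” job with multiple nodes more concrete, the sketch below writes out the kind of JDL description that a patched RB or newer WMS would need to accept. JobType, NodeNumber, InputSandbox and the Member(...) requirement on GlueHostApplicationSoftwareRunTimeEnvironment are JDL and GLUE names; the wrapper script, the application name and the MPI-START / OPENMPI tags are hypothetical values chosen for illustration.

```python
def mpi_job_jdl(executable, arguments, nodes, rte_tags, sandbox):
    """Build JDL text for a multi-node job submitted as JobType "Normal"."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    rte = "other.GlueHostApplicationSoftwareRunTimeEnvironment"
    files = ", ".join(f'"{name}"' for name in sandbox)
    lines = [
        'JobType = "Normal";',      # no special MPI job type, so no forced mpirun
        f"NodeNumber = {nodes};",
        f'Executable = "{executable}";',
        f'Arguments = "{arguments}";',
        f"InputSandbox = {{{files}}};",
    ]
    if rte_tags:
        # Only match sites that publish every requested tag.
        requirements = " && ".join(f'Member("{tag}", {rte})' for tag in rte_tags)
        lines.append(f"Requirements = {requirements};")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical wrapper script and application names.
    print(mpi_job_jdl(
        executable="mpi-start-wrapper.sh",
        arguments="my_app OPENMPI",
        nodes=4,
        rte_tags=["MPI-START", "OPENMPI"],
        sandbox=["mpi-start-wrapper.sh", "my_app.c"],
    ))
```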

Page 6: Configuring sites for MPI

YAIM

• First version of org.glite.yaim.mpi committed to CVS
– Built for “modular” YAIM (v. 4.0.0)

• Module has dual aims:
– Configure Grid for cluster where MPI is already configured
    · Sysadmin tells YAIM details of installed MPIs
    · YAIM sets up Grid env. variables (WN) and GLUE (CE)
– Add baseline MPI functionality in non-MPI cluster
    · Sysadmin just sets ENABLE_MPI
    · Install standard MPIs (mpich, mpich2, openmpi, mpiexec)
    · Set up Grid env. variables (WN) and GLUE (CE)

• Ready for testing! (ask me for the RPM)
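
The two aims of the module can be read as a simple decision, sketched below. This is not the module's code: the function, the dictionary standing in for the site settings and the INSTALLED_MPIS key are hypothetical; only ENABLE_MPI and the package list come from the slide.

```python
# Packages the slide lists for the baseline setup.
BASELINE_PACKAGES = ["mpich", "mpich2", "openmpi", "mpiexec"]


def plan_mpi_setup(site_config):
    """Return (packages_to_install, configure_grid) for a site.

    site_config is a hypothetical dict standing in for the site's settings:
      "INSTALLED_MPIS": flavours the sysadmin already installed (made-up key)
      "ENABLE_MPI":     "yes" to request the baseline setup (name from the slide)
    """
    installed = site_config.get("INSTALLED_MPIS", [])
    if installed:
        # Aim 1: the cluster already has MPI, so only the Grid-side settings
        # (WN environment variables, GLUE on the CE) need to be set up.
        return [], True
    if site_config.get("ENABLE_MPI", "no").lower() == "yes":
        # Aim 2: non-MPI cluster, so install the baseline packages first.
        return list(BASELINE_PACKAGES), True
    # MPI not requested: nothing to install or publish.
    return [], False


if __name__ == "__main__":
    print(plan_mpi_setup({"INSTALLED_MPIS": ["openmpi"]}))
    print(plan_mpi_setup({"ENABLE_MPI": "yes"}))
    print(plan_mpi_setup({}))
```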

Page 7: Configuring sites for MPI

YAIM meta-RPM

• For installation of baseline MPI setup on WNs
– mpi-start
– mpich
– mpich2
– openmpi
– mpiexec (OSC)
– …

• Will hopefully be integrated into standard gLite release

Page 8: Configuring sites for MPI

Quattor

• Recommendations for MPI configuration are fully implemented in QWG templates