CS 591 x I/O in MPI. MPI exists as many different implementations MPI implementations are based on...


Page 1: CS 591 x I/O in MPI. MPI exists as many different implementations MPI implementations are based on MPI standards MPI standards are developed and maintained.

CS 591 x

I/O in MPI

Page 2:

I/O in MPI

- MPI exists as many different implementations
- MPI implementations are based on MPI standards
- MPI standards are developed and maintained by the MPI Forum

Page 3:

I/O in MPI

- MPI implementations conform well to MPI standards
- The MPI-1 standard avoids the issue of I/O
- This is a problem, since it is rare that a useful program does no I/O
- How to handle I/O is left to the individual implementations

Page 4:

I/O in MPI

- To use C I/O functions: which processes have access to stdin, stdout, and stderr?
- This is undefined in MPI
- Sometimes all processes have access to stdout
- In some implementations only one process has access to stdout

Page 5:

I/O in MPI

- Sometimes stdout is only available to rank 0 in MPI_COMM_WORLD
- The same is true of stdin
- Some implementations provide no access to stdin

Page 6:

I/O in MPI

- So how do you create portable programs?
  - Make some assumptions
  - Do some checking

Page 7:

I/O in MPI

- Recall in our MPI implementation:
  - MPI running under PBS puts stdout in a file (*.oxxxxx)
  - No direct access to stdin

Page 8:

stdin in PBS/Torque

- The -I option means interactive
- It can be given on the qsub command line or in the script
- The job still starts under the control of the scheduler
- When the job starts, PBS/MPI will provide you with an interactive shell
- Not terribly obvious

Page 9:

I/O in MPI

- Two ways to deal with I/O in MPI
  - define a specific approach in your program
  - use a specialized parallel I/O system
- I/O in parallel systems is a hot topic in high performance computing research

Page 10:

I/O in MPI

- Learn or define a single process that can do input (stdin) and output (stdout)
- Usually this will be rank 0 in MPI_COMM_WORLD
- Write the program so that the IO process manages all user IO (user input, reports, prompts, etc.)

Page 11:

I/O in MPI

- Attribute caching
  - recall that topologies are attributes associated with (attached to) communicators
- There are other attributes attached to communicators…
- … and you can assign your own
  - for example, designate a process to handle IO

Page 12:

Attribute Caching

- Duplicate the communicator
    MPI_Comm_dup(old_comm, &io_comm);
- Define a key value (index) for the new attribute
    MPI_Keyval_create(MPI_DUP_FN, MPI_NULL_DELETE_FN, &IO_KEY, extra_arg);

Page 13:

Attribute caching

- Define a value for the attribute: the rank of the designated IO process
    *io_rank = 0;
- Assign the attribute to the communicator
    MPI_Attr_put(io_comm, IO_KEY, io_rank);
- To retrieve an attribute
    MPI_Attr_get(io_comm, IO_KEY, &io_rank_att, &flag);

Page 14:

Attribute Caching

- Attribute caching functions are local
  - you may need to share attribute values with other processes in the comm

Page 15:

I/O Process

- Even though no IO mechanism is defined in MPI…
- MPI implementations should have several predefined attributes for MPI_COMM_WORLD
- One of these is MPI_IO
- It defines which process in the comm is supposed to be able to do IO

Page 16:

I/O process

- If no process can do IO
  - MPI_IO = MPI_PROC_NULL
- If every process in the comm can do IO
  - MPI_IO = MPI_ANY_SOURCE
- If some can and some cannot
  - process that can: MPI_IO = myrank
  - process that cannot: MPI_IO = rank that can

Page 17:

I/O Process

- MPI_IO really means which process can do output
- That process still may not have access to stdin

Page 18:

MPI-IO: stdin, stdout, stderr

- for stdout:
  - create an IO communicator
  - identify an IO process in the communicator
  - or create an IO process in the communicator
  - IO process gathers results from compute processes
  - IO process outputs results

Page 19:

MPI-IO: stdin

- Recall that all processes may have access to stdin, only one process may have access to stdin, or no process may have access to stdin
- How will we know?

Page 20:

Testing stdin in MPI

    #include <stdio.h>
    #include "mpi.h"

    /* Every rank prompts and reads, so the output shows which ranks can see stdin. */
    int main(int argc, char** argv) {
        int size, rank, numb = 0;

        MPI_Init(&argc, &argv);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        printf("enter an integer ");
        fflush(stdout);
        scanf(" %d", &numb);

        printf("Hello world! I'm %d of %d - numb = %d\n", rank, size, numb);

        MPI_Finalize();
        return 0;
    }

Page 21:

Testing stdin in MPI

    #include <stdio.h>
    #include "mpi.h"

    /* Only rank 0 prompts and reads; the value is then broadcast to every rank. */
    int main(int argc, char** argv) {
        int size, rank, numb = 0;

        MPI_Init(&argc, &argv);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            printf("enter an integer ");
            fflush(stdout);
            scanf(" %d", &numb);
        }
        MPI_Bcast(&numb, 1, MPI_INT, 0, MPI_COMM_WORLD);

        printf("Hello world! I'm %d of %d - numb = %d\n", rank, size, numb);

        MPI_Finalize();
        return 0;
    }

Page 22:

stdin: what to do?

- If all processes have access to stdin:
  - designate one process as the IO process
  - have that process read from stdin
  - distribute input to other processes
- If only one process has access to stdin:
  - identify which process has access to stdin
  - have the IO process read from stdin
  - distribute data to other processes

Page 23:

stdin: What to do?

- If no process has access to stdin:
  - pass data as command line arguments
  - read input data from files
  - create include files with data values (a nuisance)
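As a sketch of the command line option, the helper below parses one integer argument with error checking. The name int_from_args is hypothetical. Whether every rank receives the same argv is implementation-dependent, so a cautious program might parse on one rank and broadcast the value.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse argv[index] as a decimal int.
   Returns 0 and stores the value in *out on success; returns -1 if the
   argument is missing, is not a number, or does not fit in an int. */
int int_from_args(int argc, char **argv, int index, int *out)
{
    char *end;
    long v;

    if (index < 1 || index >= argc)
        return -1;

    errno = 0;
    v = strtol(argv[index], &end, 10);
    if (errno != 0 || end == argv[index] || *end != '\0')
        return -1;                 /* overflow, empty, or trailing junk */
    if (v < INT_MIN || v > INT_MAX)
        return -1;

    *out = (int)v;
    return 0;
}
```

In the stdin test program this would replace the scanf call: the value of numb would come from int_from_args(argc, argv, 1, &numb) after MPI_Init.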

Page 24:

File IO in MPI

- File IO can be a major bottleneck in the performance of a parallel application
- Parallel applications can have large (enormous) data sets
- We often think of file IO as a side effect, at least in terms of performance; this is not true in parallel applications
- “One half hour of IO for every 2 hours of computation”

Page 25:

MPI File IO types of Applications

- Large grids and meshes
  - storing grid point results for post-processing
  - distributing data for input
- Checkpointing
  - periodically saving the state of a job
  - how much work can you afford to lose?
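A sketch of per-process checkpointing with standard C file IO. The struct job_state layout, the file naming scheme "<prefix>.<rank>.ckpt" and both function names are assumptions made for illustration. The write goes to a temporary file that is then renamed, which assumes POSIX rename semantics (replacing an existing file), so that a crash during the write does not destroy the previous checkpoint.

```c
#include <stdio.h>

/* Hypothetical job state; a real application would store its own data. */
struct job_state {
    int step;
    double values[4];
};

/* Save this rank's state. Returns 0 on success, -1 on failure. */
int save_checkpoint(const char *prefix, int rank, const struct job_state *s)
{
    char path[256], tmp[260];
    FILE *f;
    int len = snprintf(path, sizeof path, "%s.%d.ckpt", prefix, rank);

    if (len < 0 || len >= (int)sizeof path)
        return -1;
    snprintf(tmp, sizeof tmp, "%s.tmp", path);

    f = fopen(tmp, "wb");
    if (f == NULL)
        return -1;
    if (fwrite(s, sizeof *s, 1, f) != 1) {
        fclose(f);
        remove(tmp);
        return -1;
    }
    if (fclose(f) != 0) {
        remove(tmp);
        return -1;
    }
    /* Replace the old checkpoint only after the new one is complete. */
    return rename(tmp, path) == 0 ? 0 : -1;
}

/* Restore this rank's state. Returns 0 on success, -1 on failure. */
int load_checkpoint(const char *prefix, int rank, struct job_state *s)
{
    char path[256];
    FILE *f;
    size_t got;
    int len = snprintf(path, sizeof path, "%s.%d.ckpt", prefix, rank);

    if (len < 0 || len >= (int)sizeof path)
        return -1;
    f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    got = fread(s, sizeof *s, 1, f);
    fclose(f);
    return got == 1 ? 0 : -1;
}
```

How often to call save_checkpoint is the trade-off the slide asks about: a shorter interval loses less work after a failure but spends more time in file IO.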

Page 26:

MPI File IO types of applications

- Disk caching
  - data too large for local memories
- Data mining
  - small compute load but a lot of file IO
  - combing through large datasets
- ex. CFD

Page 27:

File IO in MPI

- Recall that the use of stdin, stdout, and stderr generally assumes a single channel for each of these
- This is not true with respect to file IO, sort of
- Gathering to an IO node may not be the most efficient strategy

Page 28:

File IO in MPI

- In parallel systems you have multiple processors running concurrently
  - each may have the ability to do file IO concurrently
- Know your architecture
  - network shared disk storage
  - diskless compute nodes
  - directories shared across nodes

Page 29:

Directories on Energy

- /home/user is shared and the same on all nodes (r/w)
- /usr/local/packages/ is shared and the same on all nodes (ro)
- all other directories on any node are local to each node
- Implications?

Page 30:

IO example

- staging data for input
  - dividing data before input to job
  - distribute data pieces to local compute node disk drives
  - each compute node reads local files to get its piece of the data
  - as opposed to “read and scatter”
  - uses standard file IO calls
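The staging steps can be sketched with plain C file IO. All names here (piece_bounds, stage_piece, read_piece) and the "<prefix>.<rank>" file naming are hypothetical, and the even block split is one possible way to divide the data. Copying each piece file to the right node's local disk is left to whatever tool the site provides.

```c
#include <stdio.h>

/* Split n items as evenly as possible over nprocs ranks.
   Rank `rank` owns items [*start, *start + *count). */
void piece_bounds(long n, int nprocs, int rank, long *start, long *count)
{
    long base = n / nprocs, extra = n % nprocs;

    *count = base + (rank < extra ? 1 : 0);
    *start = rank * base + (rank < extra ? rank : extra);
}

/* Before the job: write one rank's piece of data[] to "<prefix>.<rank>".
   Returns the number of items written, or -1 on failure. */
long stage_piece(const char *prefix, const double *data, long n,
                 int nprocs, int rank)
{
    char path[256];
    FILE *f;
    long start, count;
    int len = snprintf(path, sizeof path, "%s.%d", prefix, rank);

    if (len < 0 || len >= (int)sizeof path)
        return -1;
    piece_bounds(n, nprocs, rank, &start, &count);

    f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    if (fwrite(data + start, sizeof *data, (size_t)count, f) != (size_t)count) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? count : -1;
}

/* During the job: a rank reads its own local piece, up to max items.
   Returns the number of items read, or -1 if the file cannot be opened. */
long read_piece(const char *prefix, int rank, double *buf, long max)
{
    char path[256];
    FILE *f;
    size_t got;
    int len = snprintf(path, sizeof path, "%s.%d", prefix, rank);

    if (len < 0 || len >= (int)sizeof path)
        return -1;
    f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    got = fread(buf, sizeof *buf, (size_t)max, f);
    fclose(f);
    return (long)got;
}
```

Compared with read and scatter, no process ever holds the whole dataset and no input data crosses the network during the run.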

Page 31:

IO Example

- Dump and collect
- In some cases large results datasets do not need to be gathered to an IO node
  - compute node writes data to file on local disk drive
  - postprocess program “visits” compute nodes and collects locally stored data
  - postprocessor stores integrated data set
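A sketch of dump and collect under a simplifying assumption: the postprocessor can open every rank's file by path. On a cluster with node-local disks, the "visit" step would first have to copy or remotely read those files. The names dump_local and collect_all and the "<prefix>.<rank>" naming are hypothetical.

```c
#include <stdio.h>

/* Compute process: write n local results to "<prefix>.<rank>".
   Returns 0 on success, -1 on failure. */
int dump_local(const char *prefix, int rank, const double *vals, size_t n)
{
    char path[256];
    FILE *f;
    int len = snprintf(path, sizeof path, "%s.%d", prefix, rank);

    if (len < 0 || len >= (int)sizeof path)
        return -1;
    f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    if (fwrite(vals, sizeof *vals, n, f) != n) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? 0 : -1;
}

/* Postprocessor: append the files of ranks 0..nprocs-1, in rank order,
   to out_path. Returns the total number of items copied, or -1. */
long collect_all(const char *prefix, int nprocs, const char *out_path)
{
    double buf[1024];
    char path[256];
    FILE *in, *out;
    size_t got;
    long total = 0;
    int rank, len;

    out = fopen(out_path, "wb");
    if (out == NULL)
        return -1;

    for (rank = 0; rank < nprocs; rank++) {
        len = snprintf(path, sizeof path, "%s.%d", prefix, rank);
        in = (len < 0 || len >= (int)sizeof path) ? NULL : fopen(path, "rb");
        if (in == NULL) {
            fclose(out);
            return -1;
        }
        /* Copy this rank's file in fixed-size chunks. */
        while ((got = fread(buf, sizeof buf[0], 1024, in)) > 0) {
            if (fwrite(buf, sizeof buf[0], got, out) != got) {
                fclose(in);
                fclose(out);
                return -1;
            }
            total += (long)got;
        }
        fclose(in);
    }
    return fclose(out) == 0 ? total : -1;
}
```

The compute processes never communicate to produce the output; the integration cost is paid once, after the parallel job has finished.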

Page 32:

File IO strategy

- IO Process/Scatter-Gather vs. Local IO/distribute-collect
- Depends on:
  - use of input/output
  - size of dataset
  - file IO capacity of compute nodes
    - available disk space
    - disk IO performance
