ATmospheric, Meteorological, and Environmental Technologies RAMS Parallel Processing Techniques.

18
ATmospheric, Meteorological, and Environmental Technologies RAMS Parallel Processing Techniques

Transcript of ATmospheric, Meteorological, and Environmental Technologies RAMS Parallel Processing Techniques.

Page 1: ATmospheric, Meteorological, and Environmental Technologies RAMS Parallel Processing Techniques.

ATmospheric, Meteorological, and Environmental Technologies

RAMS Parallel Processing Techniques

Page 2: ATmospheric, Meteorological, and Environmental Technologies RAMS Parallel Processing Techniques.

ATmospheric, Meteorological, and Environmental Technologies

RAMS Parallel Design Considerations

• Concentrating on larger problems (only 3-dimensional)

• No compromises in physics and numerics

• Target distributed memory, MIMD architectures• workstation cluster first platform

• Single code version for single processor and parallel platforms

• No performance degradation on single processor platforms

• Dynamic load balancing needed

Page 3: ATmospheric, Meteorological, and Environmental Technologies RAMS Parallel Processing Techniques.

ATmospheric, Meteorological, and Environmental Technologies

RAMS Parallel Components and Structure

• Main structure• Memory structure• Variable tables• Domain decomposition• Communication procedures • Concurrent computation/communication • Bookkeeping arrays• File output• Nested grid considerations• Dynamic load balancing

Page 4: ATmospheric, Meteorological, and Environmental Technologies RAMS Parallel Processing Techniques.

ATmospheric, Meteorological, and Environmental Technologies

RAMS Parallel Structure

For parallel execution, serial (single-processor) code divided into two types:

• Master process

• Initialization

• All input/output functions

• Node (“compute”) processes

• All computation

Page 5: ATmospheric, Meteorological, and Environmental Technologies RAMS Parallel Processing Techniques.

ATmospheric, Meteorological, and Environmental Technologies

RAMS Memory Structure

• Many options / array space / nests with variable numbers of grid points

• Dynamically-allocate “A” array

• “1-D” array with all memory

• “Pointers” to beginning of each field (integer indices in COMMON)

• Computational routines dynamically-dimension arrays

• Each node/master process allocates its own memory

Page 6: ATmospheric, Meteorological, and Environmental Technologies RAMS Parallel Processing Techniques.

ATmospheric, Meteorological, and Environmental Technologies

RAMS Variable Tables

Data file defines characteristics of each model array

• Existence (whether on node or master processes)

• Output

• Communication

Page 7: ATmospheric, Meteorological, and Environmental Technologies RAMS Parallel Processing Techniques.

ATmospheric, Meteorological, and Environmental Technologies

RAMS VTABLE File

• # Velocity, perturbation Exner function• UP : 1:3: 0:110000:vard:hist:anal:lite:mpti:mpt3:mpt2• UC : 2:3: 0:110000:vard:hist:mpti:mpt3• VP : 3:3: 0:110000:vard:hist:anal:lite:mpti:mpt3:mpt2• VC : 4:3: 0:110000:vard:hist:mpti:mpt3• WP : 5:3: 0:110000:vard:hist:anal:lite:mpti:mpt3:mpt2• WC : 6:3: 0:110000:vard:hist:mpti:mpt3• PP : 7:3: 0:110000:vard:hist:anal:lite:mpti:mpt3:mpt2• PC : 8:3: 0:110000:vard:hist:mpti:mpt3

Page 8: ATmospheric, Meteorological, and Environmental Technologies RAMS Parallel Processing Techniques.

ATmospheric, Meteorological, and Environmental Technologies

Domain Decomposition

Nodes exchange “overlap” rows (halo regions) with adjacent nodes

Page 9: ATmospheric, Meteorological, and Environmental Technologies RAMS Parallel Processing Techniques.

ATmospheric, Meteorological, and Environmental Technologies

Domain decomposition

• Each column assigned a “work” factor

• Factor based on boundary, physics, machine performance, etc.

• Domain horizontally decomposed across processors

• “2-D” decomposition (3 or more nodes)

Page 10: ATmospheric, Meteorological, and Environmental Technologies RAMS Parallel Processing Techniques.

ATmospheric, Meteorological, and Environmental Technologies

Timestep flow - single grid - serial

Start Compute long timestep

End1 2 3

Start

Small or acoustictimesteps

Page 11: ATmospheric, Meteorological, and Environmental Technologies RAMS Parallel Processing Techniques.

ATmospheric, Meteorological, and Environmental Technologies

Timestep flow - single grid - parallel

Start End1

Node 1:

Node 2:

Start End1Compute Compute

Compute Compute 1

1

1

1

Page 12: ATmospheric, Meteorological, and Environmental Technologies RAMS Parallel Processing Techniques.

ATmospheric, Meteorological, and Environmental Technologies

Concurrent communication/computation

• Overlap rows sent first thing in timestep flow.

• Model processes that can be computed without overlap rows aredone first.

• Sub-domain interior is computed for some terms

• Messages are received

• Sub-domain boundary terms are computed

Page 13: ATmospheric, Meteorological, and Environmental Technologies RAMS Parallel Processing Techniques.

ATmospheric, Meteorological, and Environmental Technologies

Bookkeeping arrays

Keep track of communication “paths” source node / destination node

ipaths(info,itype,ngrids,nproc,nproc)

• message bounds• machine id

• “type” of message

• grid number

• number of processor

iget_paths(itype,ngrids,nproc,nproc)

Page 14: ATmospheric, Meteorological, and Environmental Technologies RAMS Parallel Processing Techniques.

ATmospheric, Meteorological, and Environmental Technologies

Communication “types”

Initialization / re-decomposition• Master to node; Full sub-domains; Most variables

Long timestep overlap region• Node to node; 1 row; Prognostic variables

Small timestep overlap region (2 events)• Node to node; 1 row; Selected variables

File output• Node to master; Full sub-domains; Prognostic variables

All node to node types package variables in a single message before sending.

Page 15: ATmospheric, Meteorological, and Environmental Technologies RAMS Parallel Processing Techniques.

ATmospheric, Meteorological, and Environmental Technologies

File output

• Nodes transfer necessary data to master process

• Transfer done with message passing

• No local disk space or NFS needed for file output

• Nodes can continue computing

• Data available on master for other activities

• Parallel I/O not needed

Page 16: ATmospheric, Meteorological, and Environmental Technologies RAMS Parallel Processing Techniques.

ATmospheric, Meteorological, and Environmental Technologies

Nested grid considerations

• all grids decomposed independently

• 2-way interactive communication

• interpolation of nested grid boundaries performed on fine gridnodes, coarse grid information sent to fine grid

• feedback (averaging) of fine grid information performed on finegrid node, averaged values sent to coarse grid

• minimizes data needed to be communicated

Page 17: ATmospheric, Meteorological, and Environmental Technologies RAMS Parallel Processing Techniques.

ATmospheric, Meteorological, and Environmental Technologies

Dynamic load balancing

• Regions of domain can require additional physics as model progresses (microphysics, radiation)

• Nodes send timing information back to master at end of timestep

• At opportune times (coinciding with file output times), master process will:

• re-compute “work” factor• re-decomposes the domain • re-initializes the nodes

Page 18: ATmospheric, Meteorological, and Environmental Technologies RAMS Parallel Processing Techniques.

ATmospheric, Meteorological, and Environmental Technologies

Performance

• Performance varies, dependent on architecture, communication hardware, model configuration

• Speedups and efficiencies

• Results• 60-110% efficiency

Efficiency:

E = S / # processors

Speed up: Time to run serial code S = Time to run parallel code