Parallel NetCDF Library Development
Formerly “Sensor Cloud Integration”
Kelsey Weingartner
NetCDF and MASS
NetCDF
• Machine-independent format for representing scientific data
• Files store data arranged in variables
• Each variable holds an array of data
MASS
• Library for running a simulation in parallel
• Eases the complexity of creating and running 2D and 3D spatial simulations
• A simulation is a grid of “Places” that may or may not have “Agents” on them
Purpose
Within MASS
• Make NetCDF file use simple and feasible for MASS
• Maintain the benefits of a distributed environment running in parallel
Real-World Applications
• Climate change analysis
Artifacts
Summer 2012
• Sequential write with NetCDF
• Worst-case parallel performance
Fall 2013
• Best-case parallel performance
• File creator
• File creator with parallel write
• File creator with parallel write & read
Winter 2013
• Single instance per processor file creator and parallel writer
• Final product
Sequential
• Each save requires the file to be opened only once
• callAll() gathers agent information from each Place
• Master node then handles writing to the NetCDF file
Parallel - Worst-Case
• On each save, the file is opened by every Place object
• Master triggers the save with callAll()
• Each Place gathers its Agents’ information and writes it to the file
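The difference between the sequential and worst-case designs comes down to how many times the file is opened per save. The sketch below is a toy model in plain Python, not MASS or NetCDF code: `FakeFile`, `make_grid`, `sequential_save`, and `worst_case_save` are hypothetical names, and the grid is a simple dictionary standing in for Places and their Agents.

```python
# Toy model: count how often a shared file is opened during one save.
# FakeFile and the grid are simulated stand-ins, not the MASS or NetCDF API.

class FakeFile:
    """Stands in for a NetCDF file; only tracks opens and written rows."""

    def __init__(self):
        self.opens = 0
        self.rows = []

    def open(self):
        self.opens += 1
        return self

    def write(self, row):
        self.rows.append(row)


def make_grid(width, height, agents_per_place=1):
    """Map each (x, y) Place to the list of agent ids sitting on it."""
    grid = {}
    next_id = 0
    for x in range(width):
        for y in range(height):
            grid[(x, y)] = list(range(next_id, next_id + agents_per_place))
            next_id += agents_per_place
    return grid


def sequential_save(grid, f):
    """Master gathers every Place's agent info, then opens the file once."""
    gathered = sorted(grid.items())
    handle = f.open()
    for row in gathered:
        handle.write(row)


def worst_case_save(grid, f):
    """Every Place opens the file itself and writes its own agents."""
    for row in sorted(grid.items()):
        handle = f.open()
        handle.write(row)


grid = make_grid(50, 50)
seq_file, worst_file = FakeFile(), FakeFile()
sequential_save(grid, seq_file)
worst_case_save(grid, worst_file)
print("sequential opens:", seq_file.opens)
print("worst-case opens:", worst_file.opens)
```

Both functions write the same rows; only the number of opens differs, which is one plausible reason the worst-case design scales poorly with grid size.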
JavaMPI Parallel Best-Case
• Select a NetCDF file to copy
• Master node creates a new file with the same dimensions
• Send an equal portion of data from the chosen file to each node
• Each node writes its received array to the newly created NetCDF file
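The copy steps above can be sketched without MPI by computing which slice of the data each node would receive. This is a minimal illustration under assumptions: `split_bounds` and `parallel_copy` are hypothetical helpers, the file is modelled as a flat Python list, and when the data does not divide evenly the first few nodes take one extra element (the slides do not say how the original handled that case).

```python
def split_bounds(total, num_nodes):
    """Return (start, end) index pairs giving each node a near-equal share.

    Assumption: any remainder is spread over the lowest-ranked nodes.
    """
    base, extra = divmod(total, num_nodes)
    bounds = []
    start = 0
    for rank in range(num_nodes):
        size = base + (1 if rank < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


def parallel_copy(source, num_nodes):
    """Simulate the copy: master allocates the target, nodes fill their slices."""
    target = [None] * len(source)           # new "file" with the same dimensions
    for start, end in split_bounds(len(source), num_nodes):
        portion = source[start:end]          # data "sent" to this node
        target[start:end] = portion          # node writes its received array
    return target


source = list(range(50 * 50))                # a 50x50 grid flattened to 1D
for nodes in (4, 6):
    sizes = [end - start for start, end in split_bounds(len(source), nodes)]
    copied = parallel_copy(source, nodes)
    print(nodes, "nodes, portion sizes:", sizes, "copy ok:", copied == source)
```

In a real MPI program the loop body would run on separate processes, with the slices delivered by a scatter-style call; the index arithmetic stays the same.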
Final Product
Single instance per processor file creator and parallel reader/writer
• Extends MASS Place
• Creates a file for the simulation if none exists
• Stores file contents in a buffer to increase read/write speed
• Each processor holds the portion of the file relevant to it
• A file is only opened by the first writer Place in each partition
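One way to picture the buffering and the "first writer opens the file" rule is a per-partition object that holds its slice in memory and records an open only on the first write. The sketch below is a guess at the shape of that design, not the author's implementation: `PartitionFile` and its methods are hypothetical, and the file is modelled as a list of integers.

```python
class PartitionFile:
    """One instance per processor: buffers that partition's slice of the file."""

    def __init__(self, start, end):
        self.start = start
        self.end = end
        self.buffer = [0] * (end - start)    # in-memory copy of the slice
        self.opens = 0
        self.is_open = False

    def write(self, index, value):
        """Write through the buffer; only the first writer triggers an open."""
        if not self.is_open:
            self.opens += 1
            self.is_open = True
        self.buffer[index - self.start] = value

    def read(self, index):
        """Reads are served from the buffer and never touch the file."""
        return self.buffer[index - self.start]


# Two partitions of 100 cells each; every cell plays the role of one Place.
partitions = [PartitionFile(0, 100), PartitionFile(100, 200)]
for part in partitions:
    for index in range(part.start, part.end):
        part.write(index, index * 2)

print([part.opens for part in partitions])
print(partitions[1].read(150))
```

Compared with every Place opening the file on every save, this keeps the number of opens proportional to the number of partitions rather than the number of Places.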
Results
Sequential write (1 processor):
• 100x100, 1,000 agents, 1,000 cycles = 225,712.8 msec
Worst-case parallel write (1 processor):
• 50x50, 500 agents, 100 cycles = 957,590.5 msec
MPInetCDF results on a 50x50 file:
• On 4 processors: 22,114.4 msec / 246,444 bytes = 0.0897 msec/B
• On 6 processors: 16,470.2 msec / 246,444 bytes = 0.0668 msec/B
RandomWalk using parallel NetCDF (1 processor):
• 100x100, 1,000 agents, 1,000 cycles = 204,997.2 msec / 472,484 bytes = 0.434 msec/B
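The ratios listed above divide elapsed time by file size, so they work out to milliseconds per byte; throughput in bytes per millisecond is the reciprocal. The short script below recomputes both from the reported figures, plus the speedup from 4 to 6 processors. The dictionary keys and function names are illustrative labels only.

```python
# Reported (elapsed msec, file size in bytes) pairs from the results above.
runs = {
    "MPInetCDF, 4 processors": (22114.4, 246444),
    "MPInetCDF, 6 processors": (16470.2, 246444),
    "RandomWalk, 1 processor": (204997.2, 472484),
}


def msec_per_byte(msec, size_bytes):
    """Cost: time spent per byte written (lower is better)."""
    return msec / size_bytes


def bytes_per_msec(msec, size_bytes):
    """Throughput: bytes written per millisecond (higher is better)."""
    return size_bytes / msec


for name, (msec, size) in runs.items():
    print(f"{name}: {msec_per_byte(msec, size):.4f} msec/B, "
          f"{bytes_per_msec(msec, size):.2f} B/msec")

# Speedup when going from 4 to 6 processors on the same 50x50 file.
speedup = runs["MPInetCDF, 4 processors"][0] / runs["MPInetCDF, 6 processors"][0]
print(f"4 -> 6 processors speedup: {speedup:.2f}x")
```

A 1.5x increase in processors yields a smaller gain in elapsed time, so the 50x50 copy does not scale linearly between these two runs.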
Final Product Results
RandomWalk
• 100 x 100 grid, 1,000 agents, 100 cycles: 7,843.7 msec
RandomWalk with NetCDF
• 100 x 100 grid, 1,000 agents, 100 cycles, writing to file every 20 cycles: 204,997.2 msec
• Previous settings, but writing to file only once: 69,480.7 msec
Wave2DMASS
• 100 x 100 grid, 1,000 cycles: 16,913.5 msec
Wave2DMASS with NetCDF
• 100 x 100 grid, 1,000 cycles, writing to file every 50 cycles: 50,422.6 msec
• Previous settings, but writing to file only once: 22,923.9 msec
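To put these timings in perspective, each NetCDF run can be expressed as a multiple of its baseline run without file output. The script below does that arithmetic using only the figures reported above; the variable and function names are illustrative.

```python
# Elapsed times in msec, taken from the final product results above.
baselines = {
    "RandomWalk": 7843.7,
    "Wave2DMASS": 16913.5,
}
with_netcdf = {
    "RandomWalk": {"periodic writes": 204997.2, "single write": 69480.7},
    "Wave2DMASS": {"periodic writes": 50422.6, "single write": 22923.9},
}


def slowdown(baseline_msec, netcdf_msec):
    """How many times longer the run takes once NetCDF output is enabled."""
    return netcdf_msec / baseline_msec


ratios = {}
for app, base in baselines.items():
    for mode, msec in with_netcdf[app].items():
        ratios[(app, mode)] = slowdown(base, msec)
        extra = msec - base
        print(f"{app}, {mode}: {ratios[(app, mode)]:.2f}x baseline "
              f"(+{extra:,.1f} msec)")
```

The ratios make it easier to see that write frequency dominates the cost in both applications, and that RandomWalk pays proportionally far more for file output than Wave2DMASS.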
Future Work
On Parallel_NetCDF
• D0 array support
• Object datatype support
• Allow a whole variable to be read/written
• Smaller buffer
After Parallel_NetCDF
• Conference paper for IEEE PacRim Conference
Key Lessons
• Working with external libraries
• Working with limited documentation
• Creating and meeting deadlines
• Experience with parallel and distributed systems
Intermediate Products
File Creators
FileCreator
• Create uniform 2D or 3D grids
• Can create NetCDF files with an unlimited dimension
FileManipulator 1.0
• Create uniform 2D or 3D grids
• Write 1D or 2D arrays of integers
FileManipulator 2.0
• Create uniform 2D or 3D grids
• Read or write whole variable or single value
• 8 datatypes supported
Single Instance Iterations
Single instance per processor reader
• Create uniform 2D or 3D grids
• Read or write whole variables
• 8 datatypes supported