Access Patterns, Metadata, and Performance. Alok Choudhary (choudhar@ece.nwu.edu) and Wei-Keng Liao, Department of ECE, Northwestern University. Collaboration with ANL. SDM kickoff meeting, July 10-11, 2001.

Access Patterns, Metadata, and Performance

Alok Choudhary and Wei-Keng Liao

Department of ECE, Northwestern University

Collaboration with ANL

SDM kickoff meeting

July 10-11, 2001

Virtuous Cycle

Problem setup (mesh, domain decomposition)

Simulation (execute app, generate data)

Manage, visualize, analyze

Measure results, learn, archive

Data Access Sequence Dependency

• Temporal dependency
  – Access the same data set at different time stamps

• Spatial dependency
  – Access different data sets at the same time stamp

• Resolution dependency
  – Access the same data set at different resolutions

• The access sequence is useful for improving I/O performance, e.g., pre-fetch, pre-stage, storage continuity
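The three dependencies above can be illustrated with a small sketch that classifies two consecutive accesses and, where the pattern allows, guesses the next access so it could be pre-fetched. The `Access` record, its field names, and both functions are hypothetical; the slides do not specify how the sequence is recorded or used.

```python
from collections import namedtuple

# Hypothetical record of one access: data set name, time stamp, and
# resolution level. These fields are assumptions for illustration.
Access = namedtuple("Access", ["dataset", "timestep", "resolution"])


def classify(prev, last):
    """Name the dependency between two consecutive accesses, or None."""
    same_set = prev.dataset == last.dataset
    same_time = prev.timestep == last.timestep
    same_res = prev.resolution == last.resolution
    if same_set and same_res and not same_time:
        return "temporal"    # same data set, different time stamp
    if not same_set and same_time:
        return "spatial"     # different data sets, same time stamp
    if same_set and same_time and not same_res:
        return "resolution"  # same data set, different resolution
    return None


def predict_next(history):
    """Extrapolate the next access from the last two, or return None."""
    if len(history) < 2:
        return None
    prev, last = history[-2], history[-1]
    kind = classify(prev, last)
    if kind == "temporal":
        step = last.timestep - prev.timestep
        return Access(last.dataset, last.timestep + step, last.resolution)
    if kind == "resolution":
        step = last.resolution - prev.resolution
        return Access(last.dataset, last.timestep, last.resolution + step)
    # Spatial order depends on the application, so no guess is made here.
    return None
```

A caller would feed the prediction to whatever pre-fetch or pre-stage mechanism the storage system offers; that part is outside this sketch.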

Spatial Data Access Patterns

• Parallel partition patterns
  – Regular, irregular
  – Static, dynamic during simulation

• Access sequence
  – Spatial, temporal, resolution

• Access frequency
  – Once only, multiple times (overwrite for restart)

• Access amount
  – Large, medium, small chunks
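As one concrete case of a regular, static partition pattern, the sketch below computes the contiguous block of a one-dimensional data set owned by each process. The function name and the choice of a 1D block distribution are assumptions for illustration; irregular and dynamic partitions are not covered.

```python
def block_range(n, nprocs, rank):
    """Return (start, count) of the contiguous block owned by `rank`.

    The n elements are split as evenly as possible; the first
    n % nprocs ranks receive one extra element each.
    """
    base, extra = divmod(n, nprocs)
    count = base + (1 if rank < extra else 0)
    start = rank * base + min(rank, extra)
    return start, count
```

Because each rank can compute its own offset and length without communication, a pattern like this is cheap to record as metadata and to replay when the data is read back.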

Access Patterns for Visualization/Analysis

• Generated from real data during simulation or in a post-simulation process

• Smaller in size than the real data
  – Type conversion, e.g., float to unsigned char
  – Reduce/increase resolution
  – Projection from 3D to 2D

• Three types of data generation and display sequence
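The three size reductions listed above can be sketched in plain Python. All function names are hypothetical. The linear scaling to 0..255 and the use of a maximum along the first axis for the projection are assumptions, since the slides do not say which conversion or projection is used.

```python
def to_unsigned_char(values):
    """Scale floats linearly to 0..255 and pack them one byte each."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return bytes(len(values))  # constant field maps to all zeros
    return bytes(round((v - lo) / span * 255) for v in values)


def downsample(grid, factor):
    """Reduce resolution of grid[y][x] by keeping every factor-th sample."""
    return [row[::factor] for row in grid[::factor]]


def project_3d_to_2d(volume):
    """Collapse the first axis of volume[z][y][x] by taking the maximum."""
    return [[max(column) for column in zip(*rows)] for rows in zip(*volume)]
```

With a 4-byte float source, the type conversion alone gives a fourfold reduction, and downsampling a 2D grid by a factor of 2 in each direction gives another factor of four.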

Architecture

[Architecture diagram; recovered labels]

• Components
  – User Applications: Simulation, Data Analysis, Visualization
  – MDMS: Metadata (access pattern, history)
  – Storage Systems (I/O Interface): MPI-IO (other interfaces...)

• Flow labels
  – Query input, metadata, hints, directives
  – Associations, OIDs, parameters for I/O
  – Schedule, prefetch, cache, hints (coll I/O)
  – Performance input, system metadata
  – I/O func (best_I/O (for these param)), hint
  – Data

Approach

• Manage metadata using an OR-DBMS
  – Collect and organize metadata in relational tables
  – Design a metadata query interface using SQL

• Access to HSS
  – Obtain current storage layout and configuration
  – Native I/O interfaces or MPI-IO

• I/O optimization
  – Determine optimal I/O calls
  – Overlap I/O with computation, communication, and I/O
  – Pre-fetch, pre-stage, migrate, purge in HSS
  – Sub-filing for large files, file containers for small files
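A minimal sketch of the first and third bullets follows: metadata kept in relational tables and an SQL query that picks an I/O method for a data set. SQLite stands in for the OR-DBMS here. The table names, column names, and bandwidth figures are invented placeholders for illustration, not the MDMS schema or measured results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dataset (
    name TEXT PRIMARY KEY,
    data_type TEXT,
    partition_pattern TEXT      -- 'regular' or 'irregular'
);
CREATE TABLE io_performance (
    partition_pattern TEXT,
    io_method TEXT,             -- 'collective' or 'non-collective'
    bandwidth_mb_s REAL         -- placeholder values, not measurements
);
""")
conn.execute("INSERT INTO dataset VALUES ('temperature', 'float', 'regular')")
conn.executemany(
    "INSERT INTO io_performance VALUES (?, ?, ?)",
    [
        ("regular", "collective", 80.0),
        ("regular", "non-collective", 30.0),
        ("irregular", "collective", 25.0),
    ],
)


def best_io_method(conn, dataset_name):
    """Return the I/O method with the highest recorded bandwidth, or None."""
    row = conn.execute(
        """
        SELECT p.io_method
        FROM dataset AS d
        JOIN io_performance AS p
          ON p.partition_pattern = d.partition_pattern
        WHERE d.name = ?
        ORDER BY p.bandwidth_mb_s DESC
        LIMIT 1
        """,
        (dataset_name,),
    ).fetchone()
    return row[0] if row else None
```

An application could call `best_io_method(conn, "temperature")` before opening the file and translate the answer into the corresponding native or MPI-IO call.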

Metadata

• Application level
  – Algorithms, compiling, execution environments
  – Time stamps, parameters, result summary

• Programming level
  – Data types, structures, association of datasets, partition patterns

• Storage system level
  – File locations, file structure, I/O modes, host names, device types, path names, storage hierarchy

• Performance level
  – I/O bandwidth of HSS for local and remote access
  – Data access sequence, frequency, other access hints
  – Collective or non-collective I/O

Applications

• Astro3D -- studies the highly turbulent convective layers of late-type stars
  – Write only
  – Regular partition on all data sets

• ENZO -- simulates the formation of a cluster of galaxies consisting of gas and stars
  – Both read and write
  – Both regular and irregular partitions
  – Adaptive Mesh Refinement, dynamic load balancing

• Common features
  – Checkpoint / restart
  – Post-simulation data analysis
  – Visualizing the process of the computation in the form of a movie

Interface

Run Application

Dataset and Access Pattern Table

Data Analysis

Integrating Analysis

Problem setup (mesh, domain decomposition)

Simulation (execute app, generate data)

Manage, visualize, analyze

Measure results, learn, archive

On-line analysis and mining