
Petascale I/O Impacts on Visualization

Hank Childs

Lawrence Berkeley National Laboratory

& UC Davis

March 24, 2010

27B element Rayleigh-Taylor Instability (MIRANDA, BG/L)

2 trillion element mesh

2006

2012

How does the {peta-, exa-} scale affect visualization?

Large # of time steps

Large ensembles

High-res meshes

Large # of variables

Your mileage may vary
- Are you running full machine?
- How much data do you output?

The soon-to-be “good ole days” … how visualization is done right now

[Diagram: a parallel simulation code writes pieces of data (P0 through P9) to disk. In the parallelized visualization data flow network, each processor (Processor 0, Processor 1, Processor 2, ...) runs Read, Process, Render on its pieces.]

This technique is called “pure parallelism”
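A minimal sketch of the pure parallelism pattern, under stated assumptions: the round-robin assignment policy and the `read`, `process`, and `render` stand-ins are hypothetical and do not come from the slides or from any real tool. The point is only that every piece is read at full resolution before any processing happens.

```python
# Hypothetical sketch of pure parallelism; all names are illustrative.

def assign_pieces(num_pieces, num_procs):
    """Round-robin assignment of piece ids to processors (an assumed policy)."""
    return [list(range(rank, num_pieces, num_procs)) for rank in range(num_procs)]

def read(piece_id):
    # Stand-in for reading one full-resolution piece from disk.
    return [float(piece_id)] * 4

def process(data):
    # Stand-in for a visualization operation (here: keep values >= 1.0).
    return [v for v in data if v >= 1.0]

def render(data):
    # Stand-in for rendering: report how many values would be drawn.
    return len(data)

def run_processor(piece_ids):
    """Read -> Process -> Render for every piece owned by one processor."""
    return sum(render(process(read(p))) for p in piece_ids)

if __name__ == "__main__":
    for rank, pieces in enumerate(assign_pieces(10, 3)):
        drawn = run_processor(pieces)
        print(f"Processor {rank}: pieces {pieces}, rendered {drawn} values")
```

In a real run each processor would execute `run_processor` concurrently; the loop here is serial only to keep the sketch self-contained.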

Pure parallelism performance is based on # bytes to process and I/O rates.

Vis is almost always >50% I/O and sometimes 98% I/O

Amount of data to visualize is typically O(total mem)

[Chart: FLOPs, memory, and I/O for a terascale machine vs. a “petascale machine”]

Two big factors:
① how much data you have to read
② how fast you can read it

Relative I/O (ratio of total memory and I/O) is key

Anecdotal evidence: relative I/O really is getting slower.

Machine name       Main memory  I/O rate  Time to write memory to disk
ASC Purple         49.0TB       140GB/s   5.8min
BGL-init           32.0TB       24GB/s    22.2min
BGL-cur            69.0TB       30GB/s    38.3min
Petascale machine  ??           ??        >40min
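The time column can be reproduced by dividing main memory by the I/O rate. The sketch below assumes decimal units (1 TB = 1000 GB); the function name is illustrative.

```python
TB_IN_GB = 1000.0  # assumption: decimal units, 1 TB = 1000 GB

def minutes_to_write_memory(memory_tb, io_rate_gb_per_s):
    """Time to write all of main memory to disk, in minutes."""
    seconds = memory_tb * TB_IN_GB / io_rate_gb_per_s
    return seconds / 60.0

# (machine name, main memory in TB, I/O rate in GB/s), taken from the table
machines = [
    ("ASC Purple", 49.0, 140.0),
    ("BGL-init", 32.0, 24.0),
    ("BGL-cur", 69.0, 30.0),
]

if __name__ == "__main__":
    for name, memory_tb, rate in machines:
        print(f"{name}: {minutes_to_write_memory(memory_tb, rate):.1f} min")
```

Under that unit assumption the three computed values round to 5.8, 22.2, and 38.3 minutes, matching the table.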

Why is relative I/O getting slower?

• “I/O doesn’t pay the bills”
—And I/O is becoming a dominant cost in the overall supercomputer procurement.
• Simulation codes aren’t as exposed.
—And will be less exposed with proposed future architectures.

Recent runs of trillion cell data sets provide further evidence that I/O dominates

● Weak scaling study: ~62.5M cells/core

Machine    Type       Problem Size  #cores
Purple     AIX        0.5TZ         8K
Ranger     Sun Linux  1TZ           16K
Juno       Linux      1TZ           16K
JaguarPF   Cray XT5   2TZ           32K
Dawn       BG/P       4TZ           64K
Franklin   Cray XT4   1TZ, 2TZ      16K, 32K

[Images: 2T cells, 32K procs on Jaguar; 2T cells, 32K procs on Franklin]

- Approx I/O time: 2-5 minutes
- Approx processing time: 10 seconds
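A quick arithmetic check of these numbers, as a sketch: the I/O share of total time for the reported ranges, and the problem size implied by ~62.5M cells/core. Treating "8K" as 8000 cores is an assumption; the function names are illustrative.

```python
def io_fraction(io_seconds, processing_seconds):
    """Share of total run time spent in I/O."""
    return io_seconds / (io_seconds + processing_seconds)

def total_cells(cores, cells_per_core=62.5e6):
    """Problem size under weak scaling at a fixed number of cells per core."""
    return cores * cells_per_core

if __name__ == "__main__":
    low = io_fraction(2 * 60, 10)   # 2 minutes of I/O, 10 seconds of processing
    high = io_fraction(5 * 60, 10)  # 5 minutes of I/O, 10 seconds of processing
    print(f"I/O share of run time: {low:.1%} to {high:.1%}")
    print(f"8000 cores -> {total_cells(8000):.2e} cells")
```

With 2 to 5 minutes of I/O against 10 seconds of processing, I/O accounts for roughly 92% to 97% of the run time.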

Summary: what are the challenges?

• Scale
—We can’t read all of the data at full resolution any more…
• Q: What can we do?
• A: We need algorithmic change.
• Insight
—How are we going to understand it?
• (There is a lot more data than pixels!)

Multi-resolution techniques use coarse representations then refine.

[Diagram: the pure parallelism data flow (pieces P0 through P9 on disk; Read, Process, Render on Processors 0 to 2), with only pieces P2 and P4 highlighted.]

Multi-resolution: pros and cons

• Pros
—Drastically reduce I/O & memory requirements
• Cons
—Is it meaningful to process a simplified version of the data?
—How do we generate hierarchical representations? What costs do they incur?
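One way to picture a multi-resolution hierarchy and its storage cost, as a sketch only: the factor-of-two averaging scheme below is an assumption chosen for illustration, not the method described in the slides, and it expects an input whose length is a power of two.

```python
def coarsen(values):
    """Average adjacent pairs (assumes an even number of values)."""
    return [(values[i] + values[i + 1]) / 2.0 for i in range(0, len(values), 2)]

def build_hierarchy(values):
    """Return levels ordered coarsest first, full resolution last."""
    levels = [list(values)]
    while len(levels[-1]) > 1:
        levels.append(coarsen(levels[-1]))
    return levels[::-1]

def storage_overhead(dim, num_coarse_levels):
    """Extra storage for the coarse levels, as a fraction of the full-res data.

    Assumes each coarser level halves the resolution along every axis,
    so it holds 1 / 2**dim as many cells as the level below it.
    """
    ratio = 1.0 / (2 ** dim)
    return sum(ratio ** k for k in range(1, num_coarse_levels + 1))

if __name__ == "__main__":
    for level in build_hierarchy([1, 3, 5, 7]):
        print(level)  # a reader would start at the top and refine as needed
    print(f"2D overhead: {storage_overhead(2, 20):.1%}")
    print(f"3D overhead: {storage_overhead(3, 20):.1%}")
```

Under the halving assumption the coarse levels add about one third more storage in 2D and about one seventh in 3D; other hierarchy designs would give different costs.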

In situ processing does visualization as part of the simulation.

[Diagram: the parallel simulation code and the parallelized visualization data flow network run together. On each processor, the Read stage is replaced by GetAccessToData, followed by Process and Render.]

In situ: pros and cons

• Pros:
—No I/O!
—Lots of compute power available
• Cons:
—Very memory constrained
—Many operations not possible
• Once the simulation has advanced, you cannot go back and analyze it
—User must know what to look for a priori
• Expensive resource to hold hostage!

Now we know the tools … what problem are we trying to solve?

• Three primary use cases:
—Exploration
Examples: Scientific discovery, Debugging
—Confirmation
Examples: Data analysis, Images / movies, Comparison
—Communication
Examples: Data analysis, Images / movies

Multi-res

In situ

Will still need pure parallelism and possibly other techniques such as data subsetting and streaming.

Prepare for difficult conversations in the future.

• Multi-resolution:
—Do you understand what a multi-resolution hierarchy should look like for your data?
—Who do you trust to generate it?
—Are you comfortable with your I/O routines generating these hierarchies while they write?
—How much overhead are you willing to tolerate on your dumps? 33+%?
—Willing to accept that your visualizations are not the “real” data?

Prepare for difficult conversations in the future.

• In situ:
—How much memory are you willing to give up for visualization?
—Will you be angry if the vis algorithms crash?
—Do you know what you want to generate a priori?
• Can you re-run simulations if necessary?

Visualization on BlueWaters: Two Scenarios

1) Pure parallelism continues (no SW change):
—Visualization and analysis will be done using large portions of BW, users charging against science allocations
• Lots of time spent doing I/O
• This increases overall I/O contention on BW
—Vis & analysis is slow, so people do less (insights will be lost (?))

2) Smart techniques deployed:
—Allocations are used for simulation, not vis & analysis
—Less artificial I/O contention introduced
—Ability to explore / interact with data


VisIt is a richly featured, turnkey application

• VisIt is an open source, end user visualization and analysis tool for simulated and experimental data

– Used by: physicists, engineers, code developers, vis experts

– >100K downloads on web

• R&D 100 award in 2005
• Used “heavily to exclusively” on 8 of world’s top 12 supercomputers

[Figure: 217 pin reactor cooling simulation. Run on ¼ of Argonne BG/P. 1 billion grid points.]

Terribly Named!! Intended for more than just visualization!

Data Exploration
Visual Debugging
Quantitative Analysis
Presentations
Comparative Analysis

VisIt has a rich feature set that can impact many science areas.

• Meshes: rectilinear, curvilinear, unstructured, point, AMR
• Data: scalar, vector, tensor, material, species
• Dimension: 1D, 2D, 3D, time varying
• Rendering (~15): pseudocolor, volume rendering, hedgehogs, glyphs, mesh lines, etc…
• Data manipulation (~40): slicing, contouring, clipping, thresholding, restrict to box, reflect, project, revolve, …
• File formats (~85)
• Derived quantities: >100 interoperable building blocks
+,-,*,/, gradient, mesh quality, if-then-else, and, or, not
• Many general features: position lights, make movie, etc
• Queries (~50): ways to pull out quantitative information, debugging, comparative analysis

VisIt employs a parallelized client-server architecture.

• Client-server observations:
—Good for remote visualization
—Leverages available resources
—Scales well
—No need to move data

Additional design considerations:
• Plugins
• Multiple UIs: GUI (Qt), CLI (Python), more…

[Diagram: localhost (Linux, Windows, Mac) with graphics hardware, connected to a remote machine with parallel vis resources and user data]

The VisIt team focuses on making a robust, usable product for end users.

• Manuals
—300 page user manual
—200 page command line interface manual
—“Getting your data into VisIt” manual
• Wiki for users (and developers)
• Revision control, nightly regression testing, etc
• Executables for all major platforms
• Day long class, complete with exercises

[Figure: Slides from the VisIt class]

VisIt is a vibrant project with many participants.

• Over 50 person-years of effort
• Over one million lines of code
• Partnership between: Department of Energy’s Office of Nuclear Energy, Office of Science, and National Nuclear Security Administration, among others
—NSF XD centers both expected to make large contributions.

2000: Project started
2003: LLNL user community transitioned to VisIt
2004-6: User community grows, including AWE & ASC Alliance schools
2005: 2005 R&D100
Fall ‘06: VACET is funded
2007: SciDAC Outreach Center enables public SW repo
2007: Saudi Aramco funds LLNL to support VisIt
Spring ‘07: GNEP funds LLNL to support GNEP codes at Argonne
Summer ‘07: Developers from LLNL, LBL, & ORNL start dev in repo
‘07-’08: UC Davis & UUtah research done in VisIt repo
‘07-’08: Partnership with CEA is developed
Spring ‘08: AWE enters repo
2008: Institutional support leverages effort from many labs
Spring ‘09: More developers entering repo all the time

VisIt: What’s the Big Deal?

• Everything works at scale
• Robust, usable tool
• Vis to code development to scientific insight

VisIt and the smart data techniques

• Full pure parallelism implementation
• Data subsetting well integrated
—(if you set up your data properly)
• In situ: yes, but … it’s a memory hog
• Multi-res: emerging effort in this space

Three Ways To Get Data Into VisIt

• (1) Write to a known output format
• (2) Write a plugin file format reader
• (3) Integrate VisIt “in situ”
—“lib-VisIt” is linked into simulation code
• (Note: Memory footprint issues with implementation!)
—Use model:
• simulation code advances
• at some time interval (e.g. end of cycle), hands control to lib-VisIt.
• lib-VisIt performs vis & analysis tasks, then hands control back to simulation code
• repeat
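The in situ use model above can be sketched as a control loop. Everything here is hypothetical: `ToySimulation`, `in_situ_vis`, and `run` are illustrative stand-ins and are not the lib-VisIt interface, which the slides do not show.

```python
class ToySimulation:
    """Hypothetical stand-in for a simulation code; not a real VisIt API."""

    def __init__(self):
        self.cycle = 0
        self.field = [0.0] * 8

    def advance(self):
        # One simulation cycle: bump the cycle counter and update the data.
        self.cycle += 1
        self.field = [v + 1.0 for v in self.field]

def in_situ_vis(sim):
    """Hypothetical stand-in for the library routine that receives control.

    It works directly on the simulation's in-memory data, so nothing is
    written to or read from disk.
    """
    return (sim.cycle, max(sim.field))

def run(num_cycles, vis_interval):
    """Advance the simulation, handing control to vis every vis_interval cycles."""
    sim = ToySimulation()
    results = []
    for _ in range(num_cycles):
        sim.advance()                         # simulation code advances
        if sim.cycle % vis_interval == 0:     # at some time interval
            results.append(in_situ_vis(sim))  # vis runs, then control returns
    return results

if __name__ == "__main__":
    print(run(num_cycles=6, vis_interval=2))
```

The simulation is blocked while the vis routine runs and both share the same memory, which is where the memory footprint concern noted above comes from.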


Summary

• Massive data will force algorithmic change in visualization tools.

• The VisIt project is making progress on bringing these algorithmic changes to the user community.

• Contact info:
—Hank Childs, LBL & UC Davis
—[email protected] / [email protected]
—http://vis.lbl.gov/~hrchilds

Questions??
