Desktop Techniques for the Exploration of Terascale Sized Turbulence Data Sets
SC05November, [email protected]
Supercomputing • Communications • Data
NCAR Scientific Computing Division
John Clyne
Scientific Computing Division
National Center for Atmospheric Research
Boulder, CO USA
[Numerical] models that can currently be run on typical supercomputing platforms produce data in amounts that make storage expensive, movement cumbersome, visualization difficult, and detailed analysis impossible. The result is a significantly reduced scientific return from the nation's largest computational efforts.
We can now compute more data than we know how to analyze!!!
A sampling of various technology performance curves
• Not all technologies advance at same rate!!!
[Figure: Performance gains from 1980 to present. Vertical axis: improvement factor, log scale from 1 to 100,000. Series: disk drive internal data rate, disk drive interface data rate, Ethernet network bandwidth, Intel microprocessor clock speed, drive capacity.]
Estimated Sustained GFLOPs at NCAR (Production Systems)
[Figure: estimated sustained GFLOPs, 0 to 900, from Jan-97 to Jan-06. Systems in legend: IBM p5-575/HPS (bluevista), IBM Opteron/Linux (pegasus), IBM Opteron/Linux (lightning), IBM POWER4/Federation (thunder), IBM POWER4/Colony (bluesky), IBM POWER4 (bluedawn), SGI Origin3800/128, IBM POWER3 (blackforest), IBM POWER3 (babyblue), Compaq ES40/32 (prospect), SGI Origin2000/128 (ute), HP SPP-2000/64 (sioux), CRI Cray C90/16 (antero), CRI Cray J90 series. Annotations: ARCS Phase 1 through Phase 4; blackforest (WH-1); blackforest (WH-2/NH-2).]
NCAR MSS - Data Holdings
[Figure: NCAR MSS data holdings in terabytes, 0 to 3000, from Jan-97 to Jan-06. Series: Total, Unique.]
• 40 years for the first PetaByte (Nov '02)
• 20 months for the second PetaByte (Jul '04)
Growth of individual NCAR simulation data sets
[Figure: Approximate Simulation Data Set Sizes. Vertical axis: GBs, log scale from 0.1 to 100,000. Horizontal axis: 1989, 1995, 1998, 2000, 2004, 2006. Series: Climate, Turbulence, Weather.]
Representative data sets from climate, turbulence, and weather disciplines
Climate simulation grid resolutions
Example: Compressible plume dynamics
• 504x504x2048
• 5 variables (u,v,w,rho,temp)
• ~500 time steps saved
• 9TBs storage (4GBs/variable/timestep)
• Six months compute time required on 112 IBM SP RS/6000 processors
• Three months for post-processing
• Data may be analyzed for several years
M. Rast, 2004. Images courtesy of Joseph Mendoza, NCAR/SCD
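As a rough check on the storage figures for this example, the sketch below recomputes the data set size from the grid dimensions. The 8-byte value size is an assumption: the slide does not state the precision, and 8 bytes is what reproduces roughly 4 GB per variable per time step. The function name is illustrative only.

```python
def dataset_bytes(nx, ny, nz, n_vars, n_steps, bytes_per_value=8):
    """Return (bytes per variable per time step, total bytes)."""
    # bytes_per_value=8 is an assumed precision, not stated on the slide.
    per_variable = nx * ny * nz * bytes_per_value
    return per_variable, per_variable * n_vars * n_steps


per_var, total = dataset_bytes(504, 504, 2048, n_vars=5, n_steps=500)
print(f"{per_var / 2**30:.1f} GiB per variable per time step")
print(f"{total / 2**40:.1f} TiB total")
```

Under that assumption the result is close to 4 GB per variable per time step and a little over 9 TB in total, in line with the rounded figures on the slide.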
Workflow in computational science
[Diagram: workflow stages Simulation, Storage, Post-Processing, Storage, Analysis & Visualization. Stage labels: Batch; Batch & Interactive; Interactive. The annotation "Bandwidth requirements?" appears twice.]
What is meant by interactive computing?
Definition: A system is interactive if the time between a user event and the response to that event is short enough to maintain my full attention
If the response time is…
1-5 seconds : I’m engaged
5-60 seconds : I’m reading email
1-3 minutes : I’ve forgotten what I was trying to do
> 3 minutes : I’ve given up!
IO wait times for high resolution simulations
• Assumptions
– Single precision
– 100 MB/sec bandwidth
– No contention

Resolution   MBs per variable   Scalar variable wait time (s)   Vector variable wait time (s)
128^3        8                  0.1                             0.3
256^3        67                 0.7                             2.1
512^3        537                5.0                             15.0
1024^3       4295               43.0                            130.0

Row annotations: Interactive! Reading mail!!
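The wait times in this table follow directly from the stated assumptions, and a short sketch makes the arithmetic explicit. It assumes 4-byte values, three components per vector variable, and decimal megabytes; the function names and the mapping onto attention bands (taken from the interactive-computing definition earlier in the talk) are illustrative. Results may differ slightly from the rounded values on the slide.

```python
def io_wait_seconds(n, components=1, bytes_per_value=4, bandwidth_mb_s=100.0):
    """Seconds to read one n^3 variable at the given bandwidth, no contention."""
    megabytes = components * n**3 * bytes_per_value / 1e6
    return megabytes / bandwidth_mb_s


def attention_level(seconds):
    """Map a response time onto the attention bands used in the talk."""
    if seconds <= 5:
        return "engaged"
    if seconds <= 60:
        return "reading email"
    if seconds <= 180:
        return "forgotten the task"
    return "given up"


for n in (128, 256, 512, 1024):
    scalar = io_wait_seconds(n)
    vector = io_wait_seconds(n, components=3)
    print(f"{n}^3: scalar {scalar:.1f} s, vector {vector:.1f} s, "
          f"scalar read: {attention_level(scalar)}")
```

With these assumptions the scalar reads at 128^3 and 256^3 stay in the "engaged" band, while 512^3 and 1024^3 fall into the "reading email" band.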
Visualization and Analysis Platform for oceanic, atmospheric, and solar Research (VAPoR)
Key components
1. Domain specific application focus: numerically simulated turbulence
2. Quantitative capabilities to support scientific data analysis
3. Integrate visualization into analysis process, interactively steering the analysis while enhancing data understanding
4. Employ multiresolution data representation as a data reduction technique
This work is funded in part through a U.S. National Science Foundation, Information Technology Research program grant
The combination of visualization with multiresolution data representation provides sufficient data reduction to enable interactive work
Enabling speed/quality tradeoffs with multiresolution data representation
• Multiple copies of data at varying power-of-two resolutions
• Storage costs:
$$\sum_{l=0}^{L} \frac{1}{2^{dl}} = 1 + \frac{1}{2^{d}} + \frac{1}{2^{2d}} + \frac{1}{2^{3d}} + \cdots$$
• 2D example: texture MIP mapping (levels labeled 1, 1/2, 1/4, 1/8)
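The storage cost of keeping multiple power-of-two copies can be evaluated numerically. The sketch below assumes that d is the number of spatial dimensions and L the number of coarsened copies kept alongside the original, which is an interpretation of the slide's formula rather than something it states; the function name is illustrative.

```python
def pyramid_storage(d, levels):
    """Total storage relative to the finest level when `levels` coarser
    copies are kept, each halved along all d axes: sum of 1 / 2**(d*l)."""
    return sum(1.0 / 2 ** (d * l) for l in range(levels + 1))


for d in (2, 3):
    limit = 1.0 / (1.0 - 2.0 ** -d)  # geometric series limit as L grows
    print(f"d={d}: 3 extra levels cost {pyramid_storage(d, 3):.4f}x, "
          f"limit {limit:.4f}x")
```

Under that reading, a 2D MIP map costs at most one third extra storage, and a 3D hierarchy at most one seventh extra.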
Wavelet Transforms for 3D Multiresolution data representation
• Permit hierarchical data representation
• Invertible and lossless (subject to floating point round off errors)
• Numerically efficient – forward and inverse transform
• No additional storage cost!!!
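A small 1D example illustrates why a wavelet hierarchy needs no extra storage and remains invertible. The sketch uses the unnormalized Haar wavelet purely for illustration; the slides do not say which wavelet is used, and all function names here are hypothetical.

```python
def haar_forward(signal):
    """One Haar level: n samples -> n/2 averages followed by n/2 details."""
    half = len(signal) // 2
    averages = [(signal[2 * i] + signal[2 * i + 1]) / 2.0 for i in range(half)]
    details = [(signal[2 * i] - signal[2 * i + 1]) / 2.0 for i in range(half)]
    return averages + details


def haar_inverse(coeffs):
    """Undo one Haar level: averages and details -> original samples."""
    half = len(coeffs) // 2
    signal = []
    for avg, det in zip(coeffs[:half], coeffs[half:]):
        signal.extend([avg + det, avg - det])
    return signal


def haar_multilevel(signal):
    """Transform the averages repeatedly (length must be a power of two)."""
    coeffs = list(signal)
    n = len(coeffs)
    while n > 1:
        coeffs[:n] = haar_forward(coeffs[:n])
        n //= 2
    return coeffs


def haar_multilevel_inverse(coeffs):
    """Rebuild the signal from coarsest to finest level."""
    coeffs = list(coeffs)
    n = 2
    while n <= len(coeffs):
        coeffs[:n] = haar_inverse(coeffs[:n])
        n *= 2
    return coeffs


data = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
coeffs = haar_multilevel(data)
restored = haar_multilevel_inverse(coeffs)
coarse = haar_inverse(coeffs[:2])  # two-sample approximation of the signal
print(len(coeffs) == len(data), restored == data, coarse)
```

The transformed array has exactly as many values as the input, and reading only its leading coefficients yields a coarser approximation without touching the rest.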
Compressible Convection
128^3 and 512^3 (M. Rast, 2002)
Compressible plume
Resolution:     504x504x2048   252x252x1024   126x126x512   63x63x256
Problem size:   Full           1/8            1/64          1/512
Compressible plume data set shown at native and progressively coarser resolutions
Rendering timings
[Figure: two log-scale plots of rendering time in seconds versus resolution (Full, 1/2, 1/4, 1/8), with an "Interactive" level marked. Panel titles: 512^3 Compressible Convection; 504^2x2048 Compressible Plume. One plot (0.1 to 1000 seconds) shows series Mdb and Vtk; the other (0.01 to 10 seconds) shows Mdb only. Platforms: SGI Octane2, 1x600MHz R14k; SGI Origin, 10x600MHz R14k.]
Reduced resolution affords responsive interaction while preserving all but finest features
Derived quantities
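p: pressure
ρ: density
T: temperature
χ: ionization potential
N_A: Avogadro’s number
m_e: electron mass
k: Boltzmann’s constant
h: Planck’s constant
(1) $p = \rho T$
(2) $\frac{y^2}{1-y} = \frac{1}{\rho N_A} \left( \frac{2 \pi m_e k T}{h^2} \right)^{3/2} e^{-\chi / kT}$
(3) $\omega^2 = |\nabla \times \mathbf{u}|^2$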
Derived quantities produced from the simulation’s field variables as a post-process
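Assuming equation (2) has the usual Saha form y^2 / (1 - y) = A, where A stands for the right-hand side evaluated from density and temperature, the ionization fraction y is the positive root of y^2 + A y - A = 0. The sketch below shows only that step. The symbol A and the function name are introduced here for illustration, and evaluating A itself is left out because the slides do not give the unit system.

```python
import math


def ionization_fraction(a):
    """Positive root of y**2 + a*y - a = 0, i.e. y**2 / (1 - y) = a, a >= 0."""
    if a < 0:
        raise ValueError("a must be non-negative")
    if a == 0:
        return 0.0
    # Algebraically equal to (-a + sqrt(a*a + 4*a)) / 2, but written to
    # avoid subtracting two nearly equal numbers when a is large.
    return 2.0 * a / (a + math.sqrt(a * a + 4.0 * a))


for a in (0.01, 1.0, 100.0):
    y = ionization_fraction(a)
    print(f"A={a}: y={y:.4f}")
```

Because the root is computed independently at each grid point, a derived field of this kind scales linearly with the number of samples, which is why reduced resolution shortens the calculation so directly.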
Calculation timings for derived quantities
[Figure: log-scale plot of calculation time in seconds (0.01 to 10000) versus resolution (Full, 1/2, 1/4, 1/8). Series: pressure (eq 1), ionization (eq 2), enstrophy (eq 3). Platform: SGI Origin, 10x600MHz R14k.]
Note: 1/2 resolution is 1/8 problem size, etc.
Deriving new quantities on interactive time scales only possible with data reduction
Integrated visualization and analysis on interactively selected subdomains:
[Figure: plots of force-balance terms on the selected subdomain]
Vertical vorticity of the flow. Full domain seen from above. Subdomain from side.
Mach number of the vertical velocity. Full domain seen from above. Subdomain from side.
Efficient analysis requires rapid calculation and visualization of unanticipated derived quantities. This can be facilitated by a combination of subdomain selection and resolution reduction.
A test of multiresolution analysis: Force balance in supersonic downflows
Sites of supersonic downflow are also those of very high vertical vorticity. The cores of the vortex tubes are evacuated, with centripetal acceleration balancing that due to the inward directed pressure gradient. Buoyancy forces are maximum on the tube periphery due to mass flux convergence.
The same interpretation results from analysis at half resolution.
[Figure: force-balance term plots compared at Full and Half resolution]
Subdomain selection and reduced resolution together yield data reduction by a factor of 128!!!
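The way subdomain selection and reduced resolution multiply can be sketched as follows. The grid and subdomain extents are hypothetical, chosen only so that the combined factor comes out at 128; the slides do not state the actual subdomain size, and the function name is illustrative.

```python
def reduction_factor(full_dims, sub_dims, level):
    """Ratio of full-domain, full-resolution samples to the samples in a
    subdomain read after `level` halvings along each axis."""
    full = 1
    for n in full_dims:
        full *= n
    reduced = 1
    for n in sub_dims:
        reduced *= max(1, n >> level)
    return full / reduced


# Hypothetical case: a subdomain covering 1/16 of a 512^3 volume,
# read at half resolution (1/8 of the samples).
factor = reduction_factor((512, 512, 512), (128, 128, 512), level=1)
print(factor)
```

In this hypothetical case the subdomain contributes a factor of 16 and the half-resolution read a factor of 8.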
Future???
[Figure: Original versus 20:1 Lossy Compression]
• Live VAPOR demonstrations, SGI Theatre (booth # 602):
– Wednesday, 11:30am
– Thursday, 3:30pm
• VAPOR URL:
– http://www.scd.ucar.edu/hss/dasg/software/vapor
Questions???
• Inadequate IO bandwidth is but one impediment to interactive analysis and visualization.
• Other impediments include:
– Insufficient capacity of high-speed storage
– Reliance on un-optimized, serial applications
– Mismatch between simulation and analysis computing resources
NCAR Science
Space Weather • Turbulence • Atmospheric Chemistry • Climate • Weather • The Sun
More than just the atmosphere… from the earth’s oceans to the solar interior