Page 1

Impact of TeraGrid on Science and Engineering Research

April 15, 2008

Ralph Roskies,
Scientific Director, Pittsburgh Supercomputing Center
roskies@psc.edu

Page 2

Overview

• Present brief descriptions of a sampling of recent achievements, extracted from the Annual Report
• Point out TeraGrid value added
  – much more than providing well-managed, reliable platforms for the research
  – heterogeneity of the resources, so that each application can do its work on the most appropriate platform; many applications use several platforms, each with different strengths
  – explicit help in optimization, data analysis and visualization, and facilitated data motion
• Comment briefly on the transformational implications of this work

Page 3

Storm Prediction
Ming Xue*, U. of Oklahoma

• Better alerts for thunderstorms, especially supercells that spawn tornadoes, could save millions of dollars and many lives.
• Unprecedented experiment, run every day from April 15 to June 8 (tornado season), to test the ability of storm-scale ensemble prediction under real forecasting conditions for the US east of the Rockies.
• First time for
  – ensemble forecasting at storm scale (had been used for larger-scale models)
  – real-time forecasting in a simulated operational environment
• Successful predictions of the overall pattern and evolution of many of the convective-scale features, sometimes out to the second day, and good ability to capture storm-scale uncertainties.

* Indicates Advanced Support.

Figure: top, prediction 21 hours ahead of time for May 24, 2007; bottom, observed.

Page 4

Storm Modeling

• 10-member ensembles (4 km resolution, 6.5 to 9.5 hours each day), each using 66 Cray XT3 processors at PSC, plus one 600-processor high-resolution model (2 km resolution, 9 hours).
• >100× more computing daily than the most sophisticated National Weather Service operational forecasts; points the way to the future.
• Transferred 2.6 TB of data daily to Norman, Oklahoma.
• PSC optimized I/O, and modified the reservation and job-processing logic of its job-scheduling software to autoschedule 760 jobs/day (see the sketch below).
• Triggered on-demand forecasts in regions where storms were likely to develop, using the TeraGrid LEAD Gateway (ran on NCSA Tungsten).
• LEAD also sponsored its WxChallenge competition for colleges and universities to predict maximum sustained wind speed for selected U.S. cities, allowing them to configure their own model runs using the LEAD technology (supported by Indiana Big Red and Data Capacitor, NCSA Tungsten).
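For concreteness, here is a minimal, hypothetical sketch of a daily driver for this workload. The member count, processor counts, and resolutions are taken from the slide; the submit_job() helper and its arguments are illustrative assumptions, not the actual PSC reservation or scheduling interface.

```python
# Hypothetical sketch of a daily driver for the storm ensemble described above.
# Member count, processor counts, and resolutions come from the slide; the
# submit_job() helper is an illustrative stand-in for the real batch system.

def submit_job(name, processors, resolution_km, forecast_hours):
    """Stand-in for a call into the site's batch/reservation system."""
    print(f"submit {name}: {processors} procs, {resolution_km} km grid, "
          f"{forecast_hours} h forecast")

def submit_daily_runs():
    # Ten 4-km ensemble members, 66 Cray XT3 processors each
    # (forecast length 6.5-9.5 h per the slide; upper bound used here).
    for member in range(10):
        submit_job(f"ensemble-{member:02d}", processors=66,
                   resolution_km=4, forecast_hours=9.5)
    # One 600-processor high-resolution run (2 km, 9 h).
    submit_job("high-res", processors=600, resolution_km=2, forecast_hours=9)

if __name__ == "__main__":
    submit_daily_runs()
```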

Page 5

Cosmology
Mike Norman*, UCSD

• Small (1 part in 10^5) spatial inhomogeneities 380,000 years after the Big Bang, as revealed by WMAP satellite data, get transformed by gravitation into the pattern of severe inhomogeneities (galaxies, stars, voids, etc.) that we see today.
• Enormously demanding computations that will clearly use petascale computing when available.
• Uniform meshes won't do; must zoom in on dense regions to capture the key physical processes: gravitation (including dark matter), shock heating, and radiative cooling of gas. So an adaptive mesh refinement scheme is needed (they use 7 levels of mesh refinement; see the worked example below).

The filamentary structure in this simulation, in a cube 1.5 billion light years on a side, is also seen in real-life observations such as the Sloan Digital Sky Survey.
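To put the 7 levels of mesh refinement in perspective: assuming a factor-of-2 refinement per level (an assumption for illustration; the slide does not state the per-level factor), the finest local grid spacing relative to the root grid is

```latex
\Delta x_{\min} \;=\; \frac{\Delta x_{\mathrm{root}}}{2^{7}} \;=\; \frac{\Delta x_{\mathrm{root}}}{128}
```

i.e. dense regions are resolved roughly two orders of magnitude more finely than the base mesh, without paying the cost of refining the entire 1.5-billion-light-year volume uniformly (which would need about 128^3 ≈ 2 × 10^6 times as many cells).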

Page 6

Cosmology

• Benefits from large-memory capabilities (the NCSA Altix, with 1.5 TB of shared memory, and 2 TB of distributed memory on the IBM DataStar at SDSC).
  – Adaptive mesh refinement is very hard to load-balance on distributed-memory machines.
• SDSC helped make major improvements in the scaling and efficiency of the code (ENZO).
• Used an NCSA-developed tool (Amore) for high-quality visualizations; also used the HDF format, developed at NCSA, to facilitate the handling of 8 TB of data, which was stored at SDSC (a minimal example follows).
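For readers unfamiliar with HDF, here is a minimal sketch of writing hierarchical, multi-level simulation output with HDF5 via the h5py Python binding; the file name, group layout, and array sizes are illustrative assumptions, not ENZO's actual on-disk format.

```python
# Minimal sketch of writing hierarchical simulation output with HDF5 (h5py).
# File name, group layout, and array sizes are illustrative assumptions,
# not ENZO's actual on-disk format.
import numpy as np
import h5py

with h5py.File("amr_snapshot.h5", "w") as f:
    f.attrs["redshift"] = 0.0                      # snapshot-level metadata
    for level in range(3):                         # a few refinement levels
        grp = f.create_group(f"level_{level}")
        n = 32 * 2**level                          # finer grids at deeper levels
        grp.create_dataset("density", data=np.random.rand(n, n, n),
                           compression="gzip")     # chunked, compressed dataset
```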

Page 7

Earthquake Analysis
SCEC and Quake teams

• Goal: improve seismic hazard estimates, safer structural designs, and better building codes.
• SCEC (Southern California Earthquake Center) and Quake (CMU) teams create realistic 3-D models of earthquakes in the Los Angeles basin, using empirical information about the inhomogeneous basin properties.
• SCEC uses uniform meshes; Quake uses adaptive meshing to study higher-frequency effects.
• Computationally very demanding to find the 'high frequency' (≥1 Hz) properties, because these involve shorter wavelengths and thus finer meshes, but they are critical for building shaking (see the worked example below).
• SCEC and Quake have carried out the largest and most realistic simulations of a magnitude 7.8 earthquake along the San Andreas fault.
• SCEC discovered that the basin and mountain structure produced unexpectedly large focusing of energy into the Los Angeles area.
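A back-of-the-envelope worked example of why ≥1 Hz is so much harder (the shear-wave speed and points-per-wavelength values are illustrative assumptions, not figures from the slide): the shortest wavelength to resolve is λ = v/f, and the grid must sample it with several points, so

```latex
\lambda \;=\; \frac{v}{f} \;\approx\; \frac{500\ \mathrm{m/s}}{1\ \mathrm{Hz}} \;=\; 500\ \mathrm{m},
\qquad
h \;\lesssim\; \frac{\lambda}{10} \;\approx\; 50\ \mathrm{m}.
```

Doubling the target frequency halves the required grid spacing in all three dimensions and halves the stable time step, so the cost of an explicit wave-propagation code grows roughly as f^4.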

Page 8

SDSC-SCEC Effort

• SDSC helped optimize the SCEC code on SDSC (DataStar) and TACC (Lonestar) machines.
• SDSC produced the visualization that led to the scientific insights about focusing.
• This run produced over 3 TB of data, stored in the SCEC Digital Library, hosted at SDSC.
• The SCEC Digital Library contains 168 TB of data (3.5M files) and provides publicly available data accessed through the Earthworks Gateway maintained at USC.

Visualizations of the magnitude 7.8 earthquake simulations are a vital tool, helping scientists understand the "big picture" of their results and guide the simulations. The figure shows peak ground velocity, reaching more than 3.5 meters per second in the Los Angeles area, with great potential for structural damage. Warmer colors indicate higher ground velocity.

Page 9

PSC Quake Effort

• PSC helped optimize the code and developed the ability to stream results to remote sites, enabling the researchers to interact with the calculations in real time, changing what is being visualized.
• PSC also developed visualization tools to compare Quake results (adaptive meshes) with those of SCEC (uniform meshes) to validate results.

Statistical comparison: the top two planes of this display show a "phase mismatch" and an "envelope mismatch" between two simulations, while the third plane represents the difference between the first two planes.

Page 10

Protein Structure
David Baker, U. of Washington

• How is the 3-D structure of a protein determined by its sequence of amino acids?
• David Baker's Rosetta code has proved the best at predicting protein structure from sequence in the biennial CASP competitions (Critical Assessment of Structure Prediction).
• Enzymes can then be designed to accomplish particular tasks by investigating the folding patterns of alternative sequences.

Protein structure prediction by the Rosetta code, showing the predicted structure (blue), the X-ray structure (red, unknown when the prediction was calculated), and a low-resolution NMR structure (green).

Page 11

Protein Structure

• Used 1.3 M hours on NCSA Tungsten to identify promising targets (coarse resolution), then refined 22 promising targets using 730,000 hours on the SDSC Blue Gene.
• SDSC helped improve scaling to run on the 40,960-processor Blue Gene at IBM, which reduced the running time for a single prediction to 3 hours, instead of weeks on a typical 1000-processor cluster (Blue Gene is well suited to compute-intensive, small-memory tasks); see the arithmetic below.
• The Robetta web portal allows researchers to run Rosetta jobs without programming. NCSA worked with Baker's team and the Condor group at U. of Wisconsin to integrate Tungsten's Condor workload management system, which supports Robetta, with NCSA computing resources.
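Rough arithmetic from the figures above (reading "weeks" as about two weeks, an assumption for illustration): the total work per prediction is of the same order either way; the gain is mainly wall-clock turnaround.

```latex
40{,}960 \times 3\ \mathrm{h} \;\approx\; 1.2 \times 10^{5}\ \mathrm{CPU\text{-}hours},
\qquad
1000 \times 336\ \mathrm{h} \;\approx\; 3.4 \times 10^{5}\ \mathrm{CPU\text{-}hours}
```

A few-fold difference in total CPU-hours, but a turnaround of 3 hours instead of two weeks, which is what makes large design studies over many sequences practical.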

Page 12

Zeolite Databases
Michael Deem*, Rice U.
David Earl, U. of Pittsburgh

• Zeolites are powerful mineral catalysts with a porous, Swiss-cheese-like structure.
• Used to make gasoline, asphalt, laundry detergent, and aquarium filters.
• The catalog of naturally occurring zeolites – about 50 – has grown to approximately 180 with the addition of synthetic varieties.
• Deem and Earl used the TeraGrid (TACC, Purdue, Argonne, NCSA, SDSC) to identify potentially new zeolites by searching for hypothetically stable structures. Their database now contains over 3.5 million structures.
• Scientists can use the catalog to find structures that are more efficient, either in terms of energy inputs or in waste byproducts.

Page 13

Zeolite Databases

• Large number of independent jobs.
• TACC developed tools like MyCluster, harnessing the distributed, heterogeneous resources available on the TeraGrid network into a single virtual environment for the management and execution of their simulation runs.
• At Purdue, the application was run within a Condor pool of more than 7000 processors, using standard Linux tools for job management; 4M+ processor hours were used in 22 months. A performance engineer supported application revisions and job-management scripts for large-scale production to
  – track execution time to detect runaways and terminate them gracefully,
  – add application self-checkpointing to recover from job and system failures, and
  – control the number of jobs in the queue to a practical maximum with respect to system capabilities, dynamically adjusting the cap to hold the percentage of jobs executing at 80-90% of the total in the queue (the number executing varied from <100 to a peak of ~2000); a sketch of such a throttle follows this list.
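A minimal sketch of the queue-throttling idea in the last point above; count_running(), count_queued(), and submit_batch() are hypothetical helpers standing in for the real job-management scripts (e.g. something that parses condor_q output), not actual Condor APIs.

```python
# Minimal sketch of a queue-throttling loop: keep the fraction of queued jobs
# that are actually executing at roughly 80-90%.
# count_running(), count_queued(), and submit_batch() are hypothetical helpers,
# not actual Condor APIs.
import time

TARGET_LOW, TARGET_HIGH = 0.80, 0.90   # desired executing/queued fraction

def throttle(pending_work, count_running, count_queued, submit_batch,
             poll_seconds=300):
    while pending_work:
        running, queued = count_running(), count_queued()
        frac = running / queued if queued else 1.0
        if frac > TARGET_HIGH:
            # Idle backlog is thin: top the queue up so free slots stay busy.
            batch = pending_work[:100]
            del pending_work[:100]
            submit_batch(batch)
        elif frac < TARGET_LOW:
            # Too many idle jobs already queued: hold off submitting more.
            pass
        time.sleep(poll_seconds)
```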

Page 14

New Communities

Page 15

Urban Water
Jim Uber, U. of Cincinnati
K. Mahinthakumar, R. Ranjithan & D. Brill, NCSU

• Developed iterative methods to locate the source of contaminants in urban water systems (hundreds of miles of pipe), and approaches to limiting their impact (for security, public health, and the regional economy); a generic sketch of such an iterative search is shown below.
• Have run serious simulations with Cincinnati Water Works. The team is learning to cope with problem situations, for example engaging different or new sensors when some within the network malfunction.
• Early simulations modeled a few hundred sensors; runs have now grown to 11,000; a city network could have 300,000.
• Used the DTF Grid (NCSA, SDSC, Argonne) with Argonne algorithms for optimizing which runs get sent where. SDSC helped with Grid software (e.g. MPICH-G2).
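The iterative source-location idea can be sketched generically as a simulate-and-compare loop. This is an illustration of the general simulation-optimization pattern, not the actual NCSU/Cincinnati algorithm; simulate_sensors() and the candidate representation are assumptions.

```python
# Generic simulate-and-compare sketch of contaminant source location:
# propose candidate source locations, run the water-quality model for each,
# and keep the candidates whose predicted sensor readings best match
# observations. NOT the actual NCSU/Cincinnati algorithm; simulate_sensors()
# and the candidate representation are assumptions.

def locate_source(candidates, observed, simulate_sensors, n_iterations=10):
    for _ in range(n_iterations):
        # Score every candidate by misfit between simulated and observed readings.
        scored = []
        for cand in candidates:
            predicted = simulate_sensors(cand)          # expensive model run
            misfit = sum((p - o) ** 2 for p, o in zip(predicted, observed))
            scored.append((misfit, cand))
        scored.sort(key=lambda t: t[0])
        # Keep the best-matching half of the candidates for the next pass.
        candidates = [cand for _, cand in scored[: max(1, len(scored) // 2)]]
    return candidates[0]
```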

Page 16

Social Relationships
Maria Esteva, UT Austin

Studying work relationships in an organization by analyzing electronic texts in the organization's digital archive.

• The TACC visualization group helped researchers understand social relationships by developing visual metaphors for the social matrix, for limited data.
• Will apply this to 600,000 Enron emails from 150 individuals, which will require TeraGrid computational power.
• Work is now beginning in several places to use TeraGrid computational power to further analyze social interactions (e.g. of Facebook communities).

Page 17

City Planning
Mete Sozen, Nicoletta Adamo-Villani, Purdue U.

• Istanbul is at high risk for another devastating earthquake. Researchers at Purdue University and in Istanbul created a model of a proposed satellite city to provide immediate refuge.
• Produced a 5-minute movie flythrough of the proposed city for The Metropolitan Municipality of Istanbul.
• Technical information (see the arithmetic below):
  – 9,000 frames (720x480)
  – >30,000 frames
  – >10,000 CPU hours on Purdue's TeraDRE (TeraGrid Distributed Rendering Environment)
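Taking the rendering figures above at face value (>30,000 frames and >10,000 CPU-hours), the average cost per frame is roughly

```latex
\frac{10{,}000\ \mathrm{CPU\text{-}hours}}{30{,}000\ \mathrm{frames}} \;\approx\; 20\ \mathrm{CPU\text{-}minutes\ per\ frame}
```

so rendering on a single CPU would take well over a year of continuous computing, while farming the independent frames out across TeraDRE's pool brings the turnaround down to something practical.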

Page 18

Transformative Aspects

• Many of the scientific breakthroughs are potentially transformative.
  – Large-scale computations are intrinsic to the enormous strides cosmology has taken in the past two decades.
  – Understanding the detailed atomic-level mechanisms behind biological systems leads to major opportunities in the design of improvements.
  – Similarly, atomic-level understanding will transform the design of new materials (zeolites, magnetic storage, alloys, ...).
  – Computation is transforming how we can deal with disaster warning and mitigation (storms, earthquakes, attacks on water systems).
  – New communities are being introduced to the possibilities that TeraGrid affords.

Page 19

Transformative Aspects

• There is also a substantial transformation in how we do the science.
  – Faster turnaround leads to greater researcher productivity and changes the questions we ask in all disciplines.
    • It also allows things like interactive steering, ensemble forecasts, and better uncertainty analysis.
  – Visualization aids understanding.
  – High-speed networks allow a much greater span of collaborative activities, and better use of distributed, heterogeneous resources.