Introduction to Parallel Computing on the TeraGrid Part 1: the TeraGrid and Parallel Computing...

Introduction to Parallel Computing on the TeraGrid Part 1: the TeraGrid and Parallel Computing concepts Craig Stewart, stewart@iu.edu Associate Dean, Research Technologies & Chief Operating Officer, Pervasive Technology Labs; IU Chair, Coalition for Academic Scientific Computing IU TeraGrid Resource Partner PI & Maytal Dahan, [email protected] Software Developer, Distributed & Grid Computing, TACC 9 June 2008


Page 1: Introduction to Parallel Computing on the TeraGrid Part 1: the TeraGrid and Parallel Computing concepts Craig Stewart, stewart@iu.edu Associate Dean, Research.

Introduction to Parallel Computing on the TeraGrid

Part 1: the TeraGrid and Parallel Computing concepts

Craig Stewart, stewart@iu.edu
Associate Dean, Research Technologies & Chief Operating Officer, Pervasive Technology Labs; IU
Chair, Coalition for Academic Scientific Computing
IU TeraGrid Resource Partner PI
&
Maytal Dahan, [email protected]
Software Developer, Distributed & Grid Computing, TACC

9 June 2008


License terms
• Please cite as: Stewart, C.A., and M. Dahan. 2008. Introduction to Parallel Computing on the TeraGrid. Part 1: the TeraGrid and Parallel Computing concepts. Tutorial presentation. Presented at TeraGrid08 Conference, June 9-13, Las Vegas, NV. http://hdl.handle.net/2022/13990
• Some figures shown here are taken from the web, under an interpretation of fair use that seemed reasonable at the time and within reasonable readings of copyright interpretations. Such diagrams are indicated here with a source url. In several cases these web sites are no longer available, so the diagrams are included here for historical value. Except where otherwise noted, by inclusion of a source url or some other note, the contents of this presentation are © by the Trustees of Indiana University. This content is released under the Creative Commons Attribution 3.0 Unported license (http://creativecommons.org/licenses/by/3.0/). This license includes the following terms: You are free to share – to copy, distribute and transmit the work and to remix – to adapt the work under the following conditions: attribution – you must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work). For any reuse or distribution, you must make clear to others the license terms of this work.


Outline

• Why this tutorial may be valuable to you
  – (Time consuming computations on the critical path of your research? Need more storage? Do you provide scientific services/resources over the Web?)
• What is cyberinfrastructure?
• Examples of TeraGrid uses
• More detailed info about the TeraGrid
• How can you get going using the TeraGrid?
  – Resources are available for use
  – Help using the system is available
• Introduction to parallel computing concepts
• NB: ‘Tufte was here’


What is cyberinfrastructure?
• “Cyberinfrastructure consists of computing systems, data storage systems, advanced instruments and data repositories, visualization environments, and people, all linked together by software and high performance networks to improve research productivity and enable breakthroughs not otherwise possible.” (This and other information in Wikipedia definition of cyberinfrastructure)
• Some basic terms
  – TFLOPS - Trillions of FLOating Point operations per Second (mathematical operations) (10^12)
  – Processor hour - one hour of processor (CPU) utilization
  – TB - terabyte; PB - petabyte
  – Parallel programming
  – MPI - Message Passing Interface
  – WSRF - Web Services Resource Framework

©Trustees of Indiana University. May be reused provided TeraGrid logo remains and any modifications to original are noted. Courtesy Craig A. Stewart, IU


What is the TeraGrid?
• “TeraGrid is an open scientific discovery infrastructure combining leadership class resources at eleven partner sites to create an integrated, persistent computational resource.”
• An instrument (cyberinfrastructure) that delivers high-end IT resources – storage, computation, visualization, and data/service hosting – almost all of which are UNIX-based under the covers; some hidden by web interfaces
  – A data storage and management facility: over 20 petabytes of storage (disk and tape), over 100 scientific data collections
  – A computational facility: over 870 TFLOPS in parallel computing systems and growing
  – (Sometimes) an intuitive way to do very complex tasks, via Science Gateways, or get data via data services
• A service: help desk and consulting, Advanced Support for TeraGrid Applications (ASTA), education and training events and resources
• The largest individual cyberinfrastructure facility funded by the NSF, which supports the national science and engineering research community
• Something you can use without financial cost – allocated via peer review (and without double jeopardy)


Examples of what you can do with the TeraGrid: Simulation of cell membrane processes
• Simulation of TonB-dependent transporter (TBDT)
• Used 400,000 processor (CPU) hours on systems at National Center for Supercomputing Applications, IU, Pittsburgh Supercomputing Center [45 years with one processor]
• Modeled mechanisms for allowing transport of molecules through cell membrane
• Experimental analysis not possible!
• Work by Emad Tajkhorshid and James Gumbart, of University of Illinois Urbana-Champaign. Mechanics of Force Propagation in TonB-Dependent Outer Membrane Transport. Biophysical Journal 93:496-504 (2007).
• Results of the simulation may be seen at www.life.uiuc.edu/emad/TonB-BtuB/btub-2.5Ans.mpg
Image courtesy of Emad Tajkhorshid, UIUC
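The bracketed "45 years with one processor" figure on this slide can be checked with a little arithmetic. The sketch below converts processor hours into wall-clock years. It assumes a 365-day year, a processor that is busy around the clock, and perfect scaling across processors; none of these assumptions is stated on the slide, and the function name is hypothetical.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: non-leap year, processor busy 24 hours a day


def cpu_hours_to_years(cpu_hours: float, processors: int = 1) -> float:
    """Wall-clock years needed to deliver `cpu_hours` of work on `processors`
    CPUs, assuming perfect (ideal) parallel scaling."""
    return cpu_hours / (processors * HOURS_PER_YEAR)


if __name__ == "__main__":
    one_cpu = cpu_hours_to_years(400_000)
    many_cpus_days = cpu_hours_to_years(400_000, processors=1000) * 365
    print(f"One processor: about {one_cpu:.1f} years")
    print(f"1000 processors (ideal scaling): about {many_cpus_days:.1f} days")
```

Under these assumptions the one-processor result lands between 45 and 46 years, in line with the slide. The 1000-processor case is only an idealized illustration of why parallel systems matter for work of this size.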


Predicting storms
• Hurricanes and tornadoes cause massive loss of life and damage to property
• TeraGrid supported spring 2007 NOAA and University of Oklahoma Hazardous Weather Testbed
  – Major Goal: assess how well ensemble forecasting predicts thunderstorms, supercells tornadoes
  – Nightly reservation at PSC
  – Delivers “better than real time” prediction
  – Used 675,000 CPU hours for the season
  – Used 312 TB on HPSS storage at PSC
  – Used >100× more computing daily than NWS operational forecasts

Slide courtesy of Dennis Gannon, IU, and LEAD Collaboration


Solve any Rubik’s Cube in 26 moves?

• Rubik's Cube is perhaps the most famous combinatorial puzzle of its time

• > 43 quintillion states (4.3x10^19)

• Gene Cooperman and Dan Kunkle of Northeastern Univ. proved any state can be solved in 26 moves

• 7TB of distributed storage on TeraGrid allowed them to develop the proof

Source: http://www.physorg.com/news99843195.html
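The "> 43 quintillion states" figure follows from a standard counting argument, which is not given on the slide: the 8 corner pieces can be permuted freely and 7 of them twisted independently, the 12 edge pieces can be permuted freely and 11 of them flipped independently, and only half of the combined permutations are reachable by legal moves. A minimal sketch of that count (the function name is hypothetical):

```python
from math import factorial


def rubiks_cube_states() -> int:
    """Number of reachable states of a standard 3x3x3 Rubik's Cube."""
    corner_arrangements = factorial(8) * 3**7    # 8 corners; last twist is forced
    edge_arrangements = factorial(12) * 2**11    # 12 edges; last flip is forced
    # Corner and edge permutations must have the same parity, halving the total.
    return corner_arrangements * edge_arrangements // 2


if __name__ == "__main__":
    states = rubiks_cube_states()
    print(f"{states:,} states (about {states:.1e})")
```

A search space of this size is why the proof relied on terabytes of distributed storage rather than a single machine's memory.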


• Resources for many disciplines
• Resource availability growing at unprecedented rates
• These data for first quarter of calendar 2008
[Bar chart: NUs used (axis 0 to 400,000,000) by discipline: Molecular Biosciences; Chemistry; Astronomical Sciences; Physics; Materials Research; Chemical, Thermal Systems; Earth Sciences; All 18 Others; Advanced Scientific Computing; Atmospheric Sciences]


The TeraGrid Map
[Map legend: Resource Provider (RP); Software Integration Partner; Grid Infrastructure Group (UChicago); Network Hub]
[Sites shown: SDSC, TACC, UC/ANL, NCSA, ORNL, PU, IU, PSC, NCAR, Caltech, USC/ISI, UNC/RENCI, UW, Tennessee, LONI/LSU]
www.teragrid.org


But the map doesn’t matter - TeraGrid Architecture
[Architecture diagram; labels: POPS (for now); Science Gateways; User Portal; Command Line; TeraGrid Infrastructure (Accounting, Network, Authorization, …); RP 1, RP 2, RP 3; Compute Service; Viz Service; Data Service; Network, Accounting, …]
©University of Chicago, Courtesy Dane Skow, Director, TeraGrid Grid Infrastructure Group. Used with Permission and modified substantially from original by Craig A. Stewart



TeraGrid High Performance Computing Systems 2007-8
[Figure: Computational Resources (size approximate - not to scale) at SDSC, TACC, UC/ANL, NCSA, ORNL, PU, IU, PSC, NCAR, Tennessee, LONI/LSU; 2007 (504 TF), 2008 (~1 PF)]
NB: Ranger soon to be at 580 TFLOPS
Slide Courtesy and © Tommy Minyard, TACC


RANGER
• @ Texas Advanced Computing Center’s Ranger
  – Biggest open supercomputer in world
  – 504 TFLOPS Sun Constellation (soon to be 580)
  – 15,744 AMD Quad-core “Barcelona” processors
  – Disk subsystem - 1.7 petabytes
  – First “Track II” system online
Ranger info courtesy and © Tommy Minyard, TACC
Ranger – Sun Constellation Linux Cluster
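The TFLOPS figures quoted for Ranger are theoretical peaks, which are usually computed as cores × clock rate × floating-point operations per cycle. The sketch below applies that formula to the processor count on this slide. The clock rates (2.0 GHz and 2.3 GHz) and the 4 double-precision operations per core per cycle are assumptions not stated in the slides; they are used here because they reproduce the 504 and roughly 580 TFLOPS figures. The function name is hypothetical.

```python
def peak_tflops(processors: int, cores_per_processor: int,
                clock_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak in TFLOPS: cores x GHz x flops/cycle gives GFLOPS."""
    cores = processors * cores_per_processor
    return cores * clock_ghz * flops_per_cycle / 1000.0


if __name__ == "__main__":
    # 15,744 quad-core processors is taken from the slide.
    # Clock rates and flops/cycle below are assumptions, not slide content.
    for clock_ghz in (2.0, 2.3):
        tf = peak_tflops(15_744, 4, clock_ghz, flops_per_cycle=4)
        print(f"{clock_ghz} GHz: about {tf:.0f} TFLOPS peak")
```

Peak figures of this kind are an upper bound; the performance a real application sustains is typically well below it.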


NICS Systems

• Cray “Baker” system 2009
  – ~1 PetaFLOPs
  – Opteron multi-core processors
  – 100 TB of memory
  – 2.3 PB of disk space
• Initial Delivery: July 2008
  – 4,512 Opteron quad-core processors
  – 170 TeraFLOPs

© Oak Ridge National Labs


Quick summary of highlights of other Resource Partner services later in talk…


Data storage and management: Disk
• All RP sites provide local working storage
• Some RP sites provide storage as an allocatable resource
  – GPFS-WAN (General Parallel File System Wide Area Network) ~ 1 petabyte
    • Home at San Diego Supercomputer Center
    • May be accessed as if it were a local file system from NCAR, NCSA, IU, UC/ANL
  – Lustre-WAN
    • Production availability summer 2008 @ IU; direct mount to PSC in testing now
    • Several other RPs to experiment with Lustre-WAN this year
  – Long term disk storage allocations
    • IU, NCSA, SDSC


Data storage and management: Tape
• Many RPs provide short term use of tape archives in support of computation
• TeraGrid provides persistent (up to Feb 2010+) storage on disk and tape
• Could you benefit from having a spare copy of your data stored someplace removed from your home location?
• Allocatable tape-based storage systems:
  – IU (Indiana University) – geographically distributed
  – NCAR (National Center for Atmospheric Research) – also supports dual copy
  – NCSA (National Center for Supercomputing Applications)
  – SDSC (San Diego Supercomputer Center)
  – Note: most sites have massive data storage systems that provide storage in support of computation


TGUP (TeraGrid User Portal) motivation

• Aggregates and simplifies access to TG information & services

• Control panel for active TG users with accounts
  – Daily resource for TG users – check allocations, view system information, submit consulting ticket, view documentation etc.
• Make using TG simple, like a financial portal
• Increase productivity of TeraGrid researchers – do more science!


TGUP (TeraGrid User Portal) vision

• The TeraGrid User Portal will continue to integrate important user capabilities in one place:
  – Comprehensive TeraGrid allocation and account management
  – TeraGrid user documentation, consulting, training info, knowledge base
  – Comprehensive RP resource information services
  – Simple access to GIG and RP interactive grid usage – logging in, remote viz, etc.
  – Potentially, all user services and interactions (e.g., surveys, online training, real-time consulting, interactive data mining, remote visualization, etc.)


Accessing TeraGrid User Portal

• How do I access the TeraGrid User Portal? http://portal.teragrid.org
  – When you get your accounts packet you will have a portal account (username and password) listed
  – The portal account is also your TG-Wide login and can be used to access any of your TG systems
  – You can change your portal password by logging in to the user portal, visiting the MyTeraGrid tab, and going to the ‘Change Portal Password’ link


Current Features
• Account Management Services
  – Detailed projects and allocation usage
    • Log in to the user portal to view your projects, allocations, usage, PI, Grant #
    • PIs have expanded information including users that belong to their project
  – View system accounts
    • Complete list of system accounts with ability to log in directly to any TG system
  – Seamless Login to TG systems
    • No need to remember usernames or passwords
    • SSH directly in to a TG system
  – User Profile listing and update
    • View & Update your user profile
  – Change portal password
  – Distinguished name listing tool
  – Add/remove users
    • PIs can request users be added/removed from an allocation
  – Request community account
    • Gateway projects can request a community account


Current Features

• Resource Services
  – System Monitor
    • View comprehensive list of TG resources
    • View detailed job information for each resource
    • View static resources attributes
    • Can sort by column
  – Batch queue prediction service
    • Wait time prediction
    • Deadline prediction
  – View & access TG science gateways
  – View & access TG data collections


Current Features

• Interactive Services
  – Interactive Remote Visualization
    • To Maverick
  – GSI-SSH login to TG systems
    • SSH window to any TG system you have an account on
    • No need to install any software, know your local account info, or authenticate
  – File Manager (coming soon)


Current Features

• Allocation Services
  – Info about how to apply for allocations
  – Lists resources!
  – Sample proposals, proposal questions
  – Link to POPS Allocation request/renewal
• Training Services
  – Calendar of training courses
  – Comprehensive listing of online training modules
• Documentation Services
  – Knowledge Base interface
    • View and search TG related documentation
  – User Info documentation
    • Pulled from user information on TG web site
  – Automatic population of user forms
    • Consulting form, feedback form, add/remove user form, etc.
• Consulting Services
  – Help desk form, submit consulting tickets


File Manager Service

• Portal Interface for drag and drop file management across TeraGrid
• Transfer between local machine <-> TG System, TG system <-> TG System
• Set Notifications, View Xfer History
• Secure, High Performance, No additional authentication required
• Status: Friendly user testing phase
• Release: Early Summer 2008


Automated Password Reset
• Automatically reset portal/TG-Wide password via user portal
• No need to contact help desk or have original TG paperwork
• Status: Implementation phase
• Release: Summer 2008


Planned Features
• Enhanced authentication/POPS integration
  – Seamless POPS integration
    • Automatic authentication to POPS for TeraGrid Users
  – Vetted and un-vetted user portal accounts
    • Replace POPS accounts with un-vetted user accounts
    • When user gets an allocation the account becomes vetted
• User Documentation Services
  – Software listing
    • View CTSS and 3rd party software on TG systems
  – User news integration
  – Search feature – TGUP, website
• Customization and personalization
  – Domain specific views
    • Customized portal views for users based on scientific domains
  – Personalize portal interface
    • View only resources you are interested in, etc.


TeraGrid User Portal @ TG08

• Want to learn more about the TeraGrid User Portal?
  – TeraGrid User Portal Birds of a Feather
    • Tuesday, June 10th
    • 5:30 – 6:30pm
  – TeraGrid User Portal Paper Presentation
    • "Increasing TeraGrid User Productivity through Integration of Information and Interactive Services"
    • Wednesday, June 11th
    • 2:00 - 2:30pm


Science Gateways

• A Science Gateway is a domain-specific computing environment, typically accessed via the Web, that provides a scientific community with end-to-end support for a particular scientific workflow

• Science Gateways are distinguished from web portals (http://en.wikipedia.org/wiki/Web_portal) in that portals “present information from diverse sources in a unified way.”

• Hides complexity (pay no attention to the grid behind the curtain…)

©Trustees of Indiana University. May be reused provided TeraGrid logo remains and any modifications to original are noted. Courtesy Craig A. Stewart, IU


LEAD (portal.leadproject.org)

• Simple enough an undergraduate can use it!
• National Center for Supercomputing Applications (NCSA) and IU teamed up to support WxChallenge weather forecast competition. 64 teams, 1000 students, ~16,000 CPU hours on Big Red

• XBaya is available from http://www.collab-ogce.org/


Purdue’s NanoHUB (www.nanohub.org)


U. Chicago SIDGrid (sidgrid.ci.uchicago.edu)


IU Render Portal
• Supports scientific visualization
• Supports education in visualization, graphics, and new media
Image by Chris Matusek; image by Ralf Frieser

©Trustees of Indiana University. May be reused provided TeraGrid logo remains and any modifications to original are noted. Courtesy Craig A. Stewart, IU


TeraGrid Science Gateways
Accessible at http://www.teragrid.org/programs/sci_gateways/

Title | Discipline
Open Science Grid (OSG) | Advanced Scientific Computing
Special PRiority and Urgent Computing Environment (SPRUCE) | Advanced Scientific Computing
Massive Pulsar Surveys using the Arecibo L-band Feed Array (ALFA) | Astronomical Sciences
National Virtual Observatory (NVO) | Astronomical Sciences
High Resolution Daily Temperature and Precipitation Data for the Northeast United States | Atmospheric Sciences
Linked Environments for Atmospheric Discovery (LEAD) | Atmospheric Sciences
Computational Chemistry Grid (GridChem) | Chemistry
Computational Science and Engineering Online (CSE-Online) | Chemistry
Network for Earthquake Engineering Simulation (NEES) | Earthquake Hazard Mitigation
GEON (GEOsciences Network) | Earth Sciences
NanoHUB | Nanotechnology
TeraGrid Geographic Information Science Gateway (GISolve) | Geography


TeraGrid Science Gateways
Accessible at http://www.teragrid.org/programs/sci_gateways/

Title | Discipline
CIG Science Gateway for the Geodynamics Community | Geophysics
QuakeSim (QuakeSim) | Geophysics
The Earth System Grid (ESG) | Global Atmospheric Research
National Biomedical Computation Resource (NBCR) | Integrative Biology and Neuroscience
Developing Social Informatics Data Grid (SIDGrid) | Language, Cognition, and Social Behavior
Neutron Science TeraGrid Gateway (NSTG) | Materials Research
Biology and Biomedicine Science Gateway | Molecular Biosciences
Open Life Sciences Gateway (OLSG) | Molecular Biosciences
The Telescience Project | Neuroscience Biology
Grid Analysis Environment (GAE) | Physics
SCEC Earthworks Project | Seismology
TeraGrid Visualization Gateway | Visualization, Image Processing


Hosting services

• Remember that old Waffle House commercial?
• If you have a data set or a data resource that serves a national community (or even a community that extends beyond your home institution… or a community you would like to extend beyond your home institution) …

• Hosting of your service is available via the TeraGrid

©Trustees of Indiana University. May be reused provided TeraGrid logo remains and any modifications to original are noted. Courtesy Craig A. Stewart, IU


MutDB (www.mutdb.org)

http://www.chembiogrid.org/

©Trustees of Indiana University. May be reused provided TeraGrid logo remains and any modifications to original are noted. Courtesy Craig A. Stewart, IU


Getting an account and allocation
• Get a POPS (Partnership Online Proposal System) account
• Apply for a DAC allocation (Development Allocation Committee): < 5 TB disk, < 25 TB tape storage, and/or < 30,000 Standard Units (SUs - related to CPU hours - in general an SU on one of the newer TeraGrid systems is about 0.5 CPU hours)
• Wait a month (although any RP can help you shorten that!)
• Read the introductory documentation
• Use the TeraGrid KB if you need it
• Ask for help ([email protected])
• Go discover!
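To get a feel for the DAC limits, the rough conversion quoted on this slide (one SU is about 0.5 CPU hours on the newer systems) can be applied directly. The sketch below is illustrative only: the function names are hypothetical, the conversion factor is the slide's approximation and varies by system, and the limit check simply restates the three "less than" thresholds.

```python
CPU_HOURS_PER_SU = 0.5  # approximate figure from the slide; varies by system


def sus_to_cpu_hours(sus: float, cpu_hours_per_su: float = CPU_HOURS_PER_SU) -> float:
    """Convert Standard Units to approximate CPU hours."""
    return sus * cpu_hours_per_su


def fits_dac_limits(disk_tb: float, tape_tb: float, sus: float) -> bool:
    """True if a request is under the DAC thresholds listed on the slide:
    < 5 TB disk, < 25 TB tape, < 30,000 SUs."""
    return disk_tb < 5 and tape_tb < 25 and sus < 30_000


if __name__ == "__main__":
    print(f"30,000 SUs is roughly {sus_to_cpu_hours(30_000):,.0f} CPU hours")
    print("Request of 1 TB disk, 10 TB tape, 20,000 SUs fits DAC:",
          fits_dac_limits(disk_tb=1, tape_tb=10, sus=20_000))
```

A request above any of these thresholds would not qualify as a DAC (start-up) request.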


Go to the POPS page - https://pops-submit.teragrid.org/


Create a POPS Login


Indicate that you are “New” to the Teragrid


Indicate that this is a “Start-up” Request


Select DAC-TG (nonintuitive)


Fill out PI information


Skip Co-PIs probably (unless Co-PI has current funding and you don’t)


Fill out info on your project


Fill out info on your funding


Make reasonable estimates about your computing


Upload your CV and Submit! (when ready)


Highlights of facilities at several RPs



Texas Advanced Computing Center
• Ranger – Sun Constellation Linux Cluster: 580 TF, 3,936 AMD Blades (62,976 cores), 123 TB Memory, 1.73 PB Storage, InfiniBand Interconnect. First Track2 deployment!
• Lonestar – Dell Linux Cluster: 62 TF, 1,460 PowerEdge Blades (5,840 cores), 10.4 TB Memory, 105 TB Parallel File System, InfiniBand Interconnect
• Ranch – Storage Facility: Sun STK SL8500 Tape System, 5 PB Capacity, 20 TB Disk Cache
• Maverick – Remote Visualization System: Sun E25K, 128 SPARC4 Cores, 500 GB Memory, 16 Frame Buffers
• EnVision – Web-based Remote Visualization, TeraGrid Funded: http://envision.tacc.utexas.edu
© TACC


Pittsburgh Supercomputing Center
www.psc.edu © PSC

Systems:
• BigBen – Cray XT3; 4132c, 21.5Tf, 4TB
• Pople – SGI Altix 4700; 1.5TB shared, 768c
• Rachel, Jonas – HP GS1280; each 512GB, 128p
• Salk – SGI Altix 4700; 288GB, 144c
• Golem – SGI Altix 450
• Storage Silos – 2 PB
• Storage Cache Nodes – 100 TB
• Scratch Storage – 200 TB

• The Pittsburgh Supercomputing Center, established 1986, is a joint effort of Carnegie Mellon University and the University of Pittsburgh together with Westinghouse Electric Company.
• PSC is a Resource Provider in the NSF TeraGrid program, providing capability-class and high-productivity resources and extensive computational science support to researchers nationwide.
• PSC contributes to the TeraGrid’s coordinating Grid Infrastructure Group (GIG) with leadership roles in user support, security, accounting, education, outreach, and training.
• Wide-ranging contributions to high-performance computing, communications, storage, outreach, and science, such as:
  – Advanced networking: Web100, National LambdaRail, 3ROX(SM), …
  – National Resource for Biomedical Supercomputing (NIH)
  – SuperComputing Science Consortium ((SC)^2; DOE/Regional)
  – Zest high-performance snapshot service

High-resolution CFD simulation of collateral blood flow in the Circle of Willis, performed at PSC and TACC by Leopold Grinberg (Brown Univ.) and visualized by Greg Foss (PSC).


NCAR TeraGrid Overview – TG’08
• TeraGrid Resources at NCAR
  – Computing Resources
    • 5.7 TFLOPS, 2048 processor IBM Blue Gene/L with 1 TB of memory
    • 20 TB of attached disk storage
  – Data and Visualization Resources
    • 150 TB Online SAN Disk
  – Visualization Resources
    • Sun Ultra 40
• Focus Areas
  – Domain specific computing for the atmospheric and related sciences
  – Large dataset visualization
    • http://www.vapor.ucar.edu
  – Science Gateways
    • Earth System Grid
    • Asteroseismology Gateway
  – Urgent/On-demand Compute Access
  – Lustre and GPFS-wan testing and deployment
  – EOT Internship and visitor programs:
    • http://www.cisl.ucar.edu/siparcs
    • http://www.cisl.ucar.edu/rsvp
© NCAR


SDSC TeraGrid Overview – TG’08
• Resources
  – Compute
    • DataStar – IBM Power4+
    • BlueGene Data – IBM Blue Gene
    • TeraGrid Cluster – Intel IA-64
  – Data
    • 2.5 PB Online SAN Disk
    • 25 PB Archive
    • 100+ Data Collections
• Focus Areas
  – Advanced Support for TeraGrid Applications (ASTA)
  – Co/Meta-Scheduling and Advanced Reservations
  – On-demand Compute Access
  – Global File Systems
  – Dual-site Archival Storage
  – Education, Outreach, and Training


Indiana University
• IU foci:

– Science Gateways and gateway hosting (Quarry)

– Lustre-WAN (production this summer) (Data Capacitor)

– Data collection hosting, massive data storage

– Big Red – WRF, Molecular dynamics


NCSA Abe
• The NCSA Intel 64 Linux Cluster Abe is intended to provide a capability resource for computationally challenging problems
• Production jobs should typically use at least 1000 cores
• Requests for extended access for large scale runs are encouraged
• Specs
– Linux cluster
– 89.47 TFLOPS
– 9600 CPUs

• http://www.ncsa.uiuc.edu/UserInfo/Resources/Hardware/Intel64Cluster/


TeraGrid RP Resource Highlights

© Purdue University

Steele – A cluster with 812 dual quad-core nodes, each with 16-32GB memory. GigE/InfiniBand. ~60TFlops.

Brutus - SGI 450 with 4 FPGAs for development of FPGA accelerated applications and services. Condor schedulable.

TeraDRE – High-throughput visualization resource built on Condor Pools. A 48-node subcluster featuring Nvidia GeForce 6600 GT GPUs. Supports Maya, POV-Ray, Blender and Gelato. http://www.purdue.edu.teragrid/teradre.

Data Collections – SRB managed datasets (incl. real-time satellite images, NEXRAD radar streams, remote sensing data) and application services (OpenDAP, THREDDS).

Condor – Over 14000 CPUs with a total of over 60 TFlops. Linux, Windows. Condor is designed for high-throughput computing, and is excellent for parameter sweeps, Monte Carlo simulation, or most any serial application, and some classes of parallel jobs such as master-worker applications.

Storage – 200 TB BlueArc Titan storage for user home directories and scratch space; 1.3 PB tape archival storage.


A conceptual introduction to parallel programming

• How many people can effectively build a house?
– Two people perhaps in half the time it takes one, perhaps less than that
– Four in ~ half the time of two
– If you had 1,024 people…
– And what if one worker is not very effective?
• And some things you simply cannot do in parallel


Some key definitions and ideas

• Parallel computing is a form of computation in which many instructions are carried out simultaneously, operating on the principle that large problems can often be divided into smaller ones, which are then solved concurrently (“in parallel”) [Wikipedia]
• Nicely parallel (sometimes called ‘trivially parallel’)
• High Throughput Computing (4 color theorem, Folding@home, particle physics)
• High Performance Computing, Supercomputing
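As an illustration of what "nicely parallel" means, here is a minimal Python sketch, not taken from the tutorial: a Monte Carlo estimate of pi in which every task draws its own random samples and never talks to the other tasks. The function names `count_hits` and `estimate_pi`, the seeds, and the sample counts are all assumptions made for this example.

```python
import random
from concurrent.futures import ProcessPoolExecutor


def count_hits(task):
    """Count random points in the unit square that fall inside the quarter circle.

    `task` is a (seed, samples) pair; each task owns its generator, so no
    state is shared between tasks (this is what makes the work nicely parallel).
    """
    seed, samples = task
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(total_samples, tasks, workers=None):
    """Split the sampling into independent tasks and combine the counts once."""
    per_task = total_samples // tasks
    work = [(seed, per_task) for seed in range(tasks)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # The only communication is handing out tasks and summing the results.
        hits = sum(pool.map(count_hits, work))
    return 4.0 * hits / (per_task * tasks)
```

To try it, call something like `estimate_pi(400_000, tasks=8)` from under an `if __name__ == "__main__":` guard, which process pools need on some platforms. The same pattern (many independent tasks, one cheap reduction at the end) is what high-throughput systems exploit for parameter sweeps.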


Amdahl’s Law
Speedup = 1 / ((1 - P) + P/N)
P = parallel fraction of program
N = number of processors

Formula and graph from wikipedia.org
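A short Python sketch may help make Amdahl's Law concrete. It evaluates the standard form of the law, speedup = 1 / ((1 - P) + P/N), with P the parallel fraction and N the number of processors as defined on this slide. The function names and the 95% parallel fraction are assumptions chosen for illustration, not values from the tutorial.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Speedup predicted by Amdahl's Law for a fixed-size problem."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


def amdahl_limit(parallel_fraction):
    """Upper bound on speedup as the processor count grows without limit."""
    serial_fraction = 1.0 - parallel_fraction
    return float("inf") if serial_fraction == 0.0 else 1.0 / serial_fraction


if __name__ == "__main__":
    # Hypothetical program that is 95% parallel: the serial 5% caps the speedup.
    for n in (2, 4, 1024):
        print(n, "processors:", round(amdahl_speedup(0.95, n), 2))
    print("limit:", round(amdahl_limit(0.95), 2))
```

With a 5% serial part the speedup can never exceed about 20, however many processors are added, which is the point of the house-building question about 1,024 workers.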


Gustafson’s Law and types of scaling


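S(P) = P – α(P – 1), where P is the number of processors and α is the non-parallelizable (serial) fraction of the work. Note that P here is not the parallel fraction used in Amdahl’s Law.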

• A driving metaphor (from Wikipedia)
– Suppose a car is traveling between two cities 60 miles apart, and has already spent one hour traveling half the distance at 30 mph.
– Amdahl's Law approximately suggests: “No matter how fast you drive the last half, it is impossible to achieve 90 mph average before reaching the second city. Since it has already taken you 1 hour and you only have a distance of 60 miles total; going infinitely fast you would only achieve 60 mph.”
– Gustafson's Law approximately states: “Given enough time and distance to travel, the car's average speed can always eventually reach 90 mph, no matter how long or how slowly it has already traveled. For example, in the two-cities case this could be achieved by driving at 150 mph for an additional hour.”
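The arithmetic behind the driving metaphor, and the scaled-speedup formula S(P) = P – α(P – 1), can be checked with a few lines of Python. This is a sketch for illustration only: the function names are made up here, and the 1,024-processor, 5% serial case is an assumed example rather than a figure from the talk.

```python
def gustafson_speedup(processors, serial_fraction):
    """Scaled speedup S(P) = P - alpha * (P - 1)."""
    return processors - serial_fraction * (processors - 1)


def average_speed(legs):
    """Average speed over a trip given as (speed_mph, hours) legs."""
    distance = sum(speed * hours for speed, hours in legs)
    hours = sum(hours for _, hours in legs)
    return distance / hours


def fixed_distance_average(total_miles, miles_done, hours_so_far, remaining_speed):
    """Average speed when the trip length is fixed (the Amdahl view)."""
    remaining_hours = (total_miles - miles_done) / remaining_speed
    return total_miles / (hours_so_far + remaining_hours)


if __name__ == "__main__":
    # Amdahl view: 60 miles total, 30 already driven in 1 hour.
    # Even an absurdly fast second half leaves the average just under 60 mph.
    print(fixed_distance_average(60, 30, 1, 1e9))
    # Gustafson view: keep driving; one more hour at 150 mph averages 90 mph.
    print(average_speed([(30, 1), (150, 1)]))
    # Assumed example: 1,024 processors with a 5% serial fraction.
    print(gustafson_speedup(1024, 0.05))
```

The difference between the two views is what stays fixed: the distance (problem size) under Amdahl's Law, or the willingness to travel farther (a larger problem) under Gustafson's Law.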


Gustafson’s Law and types of scaling


S(P) = P – α(P-1)

• Strong scaling: measure time to solution with a fixed problem size while varying the number of processors

• Weak scaling: measure time to solution with the problem size growing in proportion to the processor count (fixed problem size per processor).

• In the end scientific progress is a function of hardware, software, insight, and patience
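Strong and weak scaling are usually reported as parallel efficiencies. The Python sketch below uses the common definitions (strong: T1 / (N × TN); weak: T1 / TN), which are not stated on the slide and are assumed here. The timings are invented placeholders, not measurements from any TeraGrid system.

```python
def strong_scaling_efficiency(t1, tn, n):
    """Fixed total problem size: ideal time on n processors is t1 / n."""
    return t1 / (n * tn)


def weak_scaling_efficiency(t1, tn):
    """Fixed problem size per processor: ideal time stays equal to t1."""
    return t1 / tn


if __name__ == "__main__":
    # Hypothetical wall-clock times in seconds, keyed by processor count.
    strong_times = {1: 100.0, 2: 52.0, 4: 28.0, 8: 16.0}
    weak_times = {1: 100.0, 2: 102.0, 4: 105.0, 8: 110.0}

    t1 = strong_times[1]
    for n, tn in strong_times.items():
        print("strong", n, round(strong_scaling_efficiency(t1, tn, n), 3))

    t1 = weak_times[1]
    for n, tn in weak_times.items():
        print("weak", n, round(weak_scaling_efficiency(t1, tn), 3))
```

An efficiency near 1.0 means the code is scaling well in that regime; strong-scaling efficiency typically falls off first, as Amdahl's Law predicts.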


Shared vs Distributed Memory


Problems today and tomorrow (continuing in part with house analogy)
• What if getting the 2x4s to where they go were a bigger problem than nailing them in place?
• What if part of the time you wanted to give the wood you’re working on over to a mitre saw and then have it handed back to you?
• Multicore – not MPI on a chip (Tom Sterling)
• Speed of light – not changing anytime soon
• Real time response


Additional info

• Getting started guide - includes examples of good proposals: http://www.teragrid.org/userinfo/getting_started.php

• Review criteria: http://www.teragrid.org/userinfo/access/allocationspolicy.php

• When you’re in a foreign country there is nothing like a guide. If you need help with the application process submit a help request via the TeraGrid ([email protected])


Acknowledgements
• IU’s involvement as a TeraGrid Resource Partner is supported in part by the National Science Foundation under Grants No. ACI-0338618l, OCI-0451237, OCI-0535258, and OCI-0504075. The IU Data Capacitor is supported in part by the National Science Foundation under Grant No. CNS-0521433. IU research presented here is supported in part by the Pervasive Technology Labs and the Indiana METACyt Initiative; both of these IU initiatives are supported by the Lilly Endowment, Inc., as well as Shared University Research grants from IBM, Inc. to IU. The LEAD portal is developed under the leadership of IU Professors Dr. Dennis Gannon and Dr. Beth Plale, and supported by NSF grant 331480. Marcus Christie and Suresh Marru of the Extreme! Computing Lab contributed the LEAD graphics. The ChemBioGrid Portal is developed under the leadership of IU Professor Dr. Geoffrey C. Fox and Dr. Marlon Pierce and funded via the Pervasive Technology Labs (supported by the Lilly Endowment, Inc.) and the National Institutes of Health grant P20 HG003894-01. Many of the ideas presented in this talk were developed under a Fulbright Senior Scholar’s award to Stewart, funded by the US Department of State and the Technische Universitaet Dresden.

• The Grid Infrastructure Group is funded by NSF grant 0503697.

• Purdue’s involvement as a TeraGrid Resource Partner is supported in part by the National Science Foundation under Grant No. OCI-050399.

• Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation (NSF), National Institutes of Health (NIH), Lilly Endowment, Inc., or any other funding agency.

• This work is made possible by many staff throughout the US who are striving to make the TeraGrid a critical asset for the US in scientific discovery and global competitiveness.

Thank you! Any questions?