Grid Projects In The US July 2008

Ian Foster Computation Institute Argonne National Lab & University of Chicago Grid Projects in the US (an inevitably incomplete view)

A talk given at the HPC 2008 meeting in Cetraro, Italy

Page 1: Grid Projects In The US July 2008

Ian Foster

Computation Institute

Argonne National Lab & University of Chicago

Grid Projects in the US (an inevitably incomplete view)

Page 2: Grid Projects In The US July 2008

Grid Projects in the US

Resources

Resource Provider

Resource Provider

Resource Provider

Page 3: Grid Projects In The US July 2008

Grid Projects in the US

Service Provider

Service Provider

Service Provider

Services

Resources

Resource Provider

Page 4: Grid Projects In The US July 2008

Grid Projects in the US

Community

Community

Community

Service Provider

Content

Services

Resources

Resource Provider

Software Providers

Page 5: Grid Projects In The US July 2008

Grid Projects in the US

Community

Service Provider

Content

Services

Resources

Software Providers

Resource Provider

Page 6: Grid Projects In The US July 2008

Resource Providers

Campus and regional grids: Purdue, Wisc, UCLA, …, … TIGRE, UC system, …

Open Science Grid: 43,000 CPUs, 6 PB disk, 15,000 CPU days/day. Allocations on basis of MOUs.

TeraGrid: ~1.2 Pflop/s. National Allocation Committee.

Amazon, Microsoft, IBM, etc.: ?? CPUs, ?? storage. Fee for service.

Page 7: Grid Projects In The US July 2008

Open Science Grid Sites (5/4/08)

+3 in Brazil; 2 in Mexico; 2 in Taiwan; 1 in the UK. Grows by 10-20 per year.

Page 8: Grid Projects In The US July 2008

Use by Community

[Chart: weekly usage by community. Series: CMS, ATLAS, CDF, D0, and local usage & bugs (unmapped to VO). Axis marks at 1,000,000 a week and 2,000,000 a week.]

Page 9: Grid Projects In The US July 2008

TeraGrid Participants

Page 10: Grid Projects In The US July 2008

Growing User Community

[Chart: TeraGrid Users, Dec-03 to Dec-07. Series: Current Accounts, Active Users, New Accounts, Gateway Users, Target. Vertical axis: 0 to 4,500. Labeled values: 4,277; 3,702; 1,807; 575.]

Source: TeraGrid Central Database

Page 11: Grid Projects In The US July 2008

Growing Usage

Source: TeraGrid Central Database

3.95B NUs delivered in CY2007

Page 12: Grid Projects In The US July 2008

CY2007 Usage by Discipline

3.95B NUs delivered in CY2007

Molecular Biosciences: 31%
Chemistry: 17%
Physics: 17%
Astronomical Sciences: 12%
Materials Research: 6%
Earth Sciences: 3%
All 19 Others: 4%
Advanced Scientific Computing: 2%
Atmospheric Sciences: 3%
Chemical, Thermal Systems: 5%
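To make the pie-chart shares on this slide concrete, the sketch below converts each discipline's percentage into an approximate number of NUs, using the 3.95B total quoted on the slide. It is only an illustration: the variable and function names are hypothetical, and because the published percentages are rounded to whole numbers, the per-discipline figures are approximate.

```python
# Approximate NUs per discipline, derived from the slide's rounded percentages.
# All names below are illustrative; only the numbers come from the slide.

TOTAL_NUS = 3.95e9  # "3.95B NUs delivered in CY2007"

SHARES_PERCENT = {
    "Molecular Biosciences": 31,
    "Chemistry": 17,
    "Physics": 17,
    "Astronomical Sciences": 12,
    "Materials Research": 6,
    "Chemical, Thermal Systems": 5,
    "All 19 Others": 4,
    "Earth Sciences": 3,
    "Atmospheric Sciences": 3,
    "Advanced Scientific Computing": 2,
}


def nus_for_share(share_percent: float, total: float = TOTAL_NUS) -> float:
    """Return the NUs corresponding to a percentage share of the total."""
    return total * share_percent / 100


# Sanity check: the rounded shares on the slide add up to 100%.
assert sum(SHARES_PERCENT.values()) == 100

usage = {name: nus_for_share(pct) for name, pct in SHARES_PERCENT.items()}

# Print disciplines from largest to smallest share.
for name, nus in sorted(usage.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:32s} {SHARES_PERCENT[name]:3d}%  ~{nus / 1e6:,.0f}M NUs")
```

Read this way, the three largest categories (Molecular Biosciences, Chemistry, Physics) together account for 65% of the delivered NUs.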

Page 13: Grid Projects In The US July 2008

Grid Projects in the US

Community

Service Provider

Content

Services

Resources

Software Providers

Resource Provider

Service Provider

For example: Build and test service (Wisc); Certificate Authorities; Cancer Biomedical Informatics Grid (caBIG); LIGO Data Grid

Page 14: Grid Projects In The US July 2008

caBIG: sharing of infrastructure, applications, and data.

Data Integration!

Services & Cancer Biology

Globus

Page 15: Grid Projects In The US July 2008

caBIG Under the Covers

[Diagram. Labels: NCICB; Research Center (×2); Microarray; Gene Database; Protein Database; Image; caArray; Tool 1 to Tool 4; Grid Data Service; Analytical Service; Grid-Enabled Client; Grid Portal; Grid Services Infrastructure (Metadata, Registry, Query, Invocation, Security, etc.)]

Globus

Page 16: Grid Projects In The US July 2008

LIGO Data Grid

LIGO Gravitational Wave Observatory

Replicating >1 Terabyte/day to 8 sites

770 TB replicated to date: >120 million replicas

MTBF = 1 month

[Map labels: Birmingham, Cardiff, AEI/Golm]

Ann Chervenak et al., ISI; Scott Koranda et al., LIGO

Globus

Page 17: Grid Projects In The US July 2008

Grid Projects in the US

Community

Service Provider

Content

Services

Resources

Software Providers

Resource Provider

Community

For example: Earth System Grid; Children’s Oncology Grid; Southern California Earthquake Center (SCEC); Science gateways

Page 18: Grid Projects In The US July 2008

Earth System Grid

Main ESG Portal
198 TB of data at four locations
1,150 datasets
1,032,000 files
Includes the past 6 years of joint DOE/NSF climate modeling experiments
8,000 registered users
Downloads to date: 49 TB, 176,000 files

CMIP3 (IPCC AR4) ESG Portal
35 TB of data at one location
74,700 files
Generated by a modeling campaign coordinated by the Intergovernmental Panel on Climate Change
Data from 13 countries, representing 25 models
1,900 registered projects
Downloads to date: 387 TB, 1,300,000 files, 500 GB/day (average)
400 scientific papers published to date based on analysis of CMIP3 (IPCC AR4) data

[Figures: ESG usage: over 500 sites worldwide; ESG monthly download volumes]

Globus

Page 19: Grid Projects In The US July 2008

SCEC Community Modeling Environment

A collaboratory for system-level earthquake science

[Diagram. Components:]

Knowledge Base: Ontologies (curated taxonomies, relations & constraints); Pathway Models (pathway templates, models of simulation codes)

Knowledge Representation & Reasoning: Knowledge Server (knowledge base access, inference); Translation Services (syntactic & semantic translation)

Knowledge Acquisition: Acquisition Interfaces (dialog planning, pathway construction strategies); Pathway Assembly (template instantiation, resource selection, constraint checking)

Digital Libraries: Navigation & Queries (versioning, replication); Mediated Collections (federated access)

Grid: Pathway Execution (policy, data ingest, repository access); Grid Services (compute & storage management, security)

Other labels: Pathway Instantiations; Code Repositories; Data & Simulation Products; Data Collections; FSM; RDM; AWM; SRM; Storage; Computing; Users

Globus

Page 20: Grid Projects In The US July 2008

Seismic Hazard Analysis

Definition: Maximum intensity of shaking expected at a site during a fixed time interval

Example: National seismic hazard maps (http://geohazards.cr.usgs.gov/eq/)

Intensity measure: peak ground acceleration

Interval: 50 yrs

Probability of exceedance: 2%

Globus
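The "2% probability of exceedance in 50 years" on this slide can be restated as an annual rate and a return period. The sketch below does that conversion under one explicit assumption that the slide does not state: exceedances are modeled as a Poisson process, so that P = 1 - exp(-rate × t). The function name is illustrative only.

```python
import math


def annual_exceedance_rate(probability: float, interval_years: float) -> float:
    """Annual rate r such that 1 - exp(-r * interval_years) == probability.

    Assumes exceedances follow a Poisson process (an assumption made here
    for illustration; the slide does not specify the model).
    """
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    if interval_years <= 0.0:
        raise ValueError("interval_years must be positive")
    return -math.log(1.0 - probability) / interval_years


# Values from the slide: 2% probability of exceedance over a 50-year interval.
rate = annual_exceedance_rate(0.02, 50.0)
return_period = 1.0 / rate

print(f"annual exceedance rate: {rate:.6f} per year")
print(f"return period: {return_period:.0f} years")
```

Under that assumption, the map shows ground motion with a return period of roughly 2,475 years, which is why such maps describe rare but severe shaking rather than typical events.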

Page 21: Grid Projects In The US July 2008

SCEC Computations & Grid

[Diagram. Sites: SDSC, USC, SCEC, PSC, TeraGrid, ISI. Resource sizes shown: 12 CPUs, 1,700 CPUs, 1,200 CPUs, 1 CPU, 4 CPUs.]

• Prepare input to Pathway2 wave propagation code
• Pathway2PGV converts output into hazard map
• Map is visualized

Globus

Page 22: Grid Projects In The US July 2008

Children’s Oncology Grid and MEDICUS

Globus

Page 23: Grid Projects In The US July 2008

Grid Projects in the US

Community

Service Provider

Content

Services

Resources

Resource Provider

Software Providers

Page 24: Grid Projects In The US July 2008

Software Providers

Globus [GT4.2 released July 2, 2008]: GRAM, GridFTP, MDS, RLS, DRS, …; GSI, GridShib, MyProxy, …; GridWay (Spain), OGSA-DAI (UK), Introduce, …

Condor

MPI-G, Swift, Pegasus, Taverna (UK), Kepler

caBIG: e.g., Introduce

Virtual Data Toolkit (includes VOMS [Italy], …)

SRB, iRODS, MyCluster, …

…

Globus

Page 25: Grid Projects In The US July 2008

Virtual Data Toolkit (VDT) Software Release Process

VDT components over time: built for 15 Linux Versions

Development & testing

Globus

Page 26: Grid Projects In The US July 2008

Creating Services: Introduce and gRAVI

Introduce: Define service; Create skeleton; Discover types; Add operations; Configure security

gRAVI (Grid Remote Application Virtualization Infrastructure): Wrap executables

[Diagram. Labels: Appln Service; Create; Store; Repository Service; Advertise; Index service; Discover; Invoke, get results; Introduce; Container; Transfer GAR; Deploy]

Ohio State University and Argonne/U.Chicago

Globus

Page 27: Grid Projects In The US July 2008

Composing Services

Globus

Page 28: Grid Projects In The US July 2008

Service Discovery: Registries

Globus

Page 29: Grid Projects In The US July 2008

Challenges

Community

Community

Community

Service Provider

Content

Services

Resources

Resource Provider

Software Providers

Conflicting Missions

Sustainability

Discipline science pull

Page 30: Grid Projects In The US July 2008

The Future

NSF eXtreme Digital (XD) solicitation, aka “TeraGrid III”

DOE, NIH, etc.: what do they want?

International cooperation