Post on 10-May-2015
Ian Foster
Computation Institute
Argonne National Lab & University of Chicago
Grid Projects in the US (an inevitably incomplete view)
2
Grid Projects in the US
Resources
Resource Provider
3
Grid Projects in the US
Service Provider
Services
Resources
Resource Provider
4
Grid Projects in the US
Community
Service Provider
Content
Services
Resources
Resource Provider
Software Providers
5
Grid Projects in the US
Community
Service Provider
Content
Services
Resources
Software Providers
Resource Provider
6
Resource Providers
Campus and regional grids: Purdue, Wisc, UCLA, …, …; TIGRE, UC system, …
Open Science Grid: 43,000 CPUs, 6 PB disk, 15,000 CPU days/day; allocations on basis of MOUs
TeraGrid: ~1.2 Pflop/s; National Allocation Committee
Amazon, Microsoft, IBM, etc.: ?? CPUs, ?? storage; fee for service
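To put the Open Science Grid figures in proportion, the short sketch below relates the delivered CPU days per day to the CPU count. It assumes that one "CPU day" means one CPU busy for 24 hours and that all 43,000 CPUs are available all day; neither assumption is stated on the slide, and the function name is illustrative.

```python
# Figures quoted on the slide for Open Science Grid.
total_cpus = 43_000
cpu_days_delivered_per_day = 15_000


def average_utilization(delivered_cpu_days_per_day: float, cpu_count: int) -> float:
    """Fraction of the theoretical daily capacity that was delivered.

    Assumes (hypothetically) that every CPU could deliver one CPU day per day.
    """
    return delivered_cpu_days_per_day / cpu_count


utilization = average_utilization(cpu_days_delivered_per_day, total_cpus)
print(f"Average utilization under these assumptions: {utilization:.1%}")
```

Under these assumptions roughly a third of the nominal capacity is delivered to grid users; the remainder may well be local use that the slide does not count.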
7
Open Science Grid Sites (5/4/08)
+3 in Brazil; 2 in Mexico; 2 in Taiwan; 1 in the UK. Grows by 10-20 per year.
8
Use by Community
[Chart: weekly usage by community: CMS, ATLAS, CDF, D0, and local usage & bugs (unmapped to VO); axis marks at 1,000,000 and 2,000,000 a week]
9
TeraGrid Participants
10
Growing User Community
[Chart: TeraGrid users, Dec-03 to Dec-07. Series: Current Accounts, Active Users, New Accounts, Gateway Users, Target. Labeled values: 4,277; 3,702; 1,807; 575]
Source: TeraGrid Central Database
11
Growing Usage
Source: TeraGrid Central Database
3.95B NUs delivered in CY2007
12
CY2007 Usage by Discipline
3.95B NUs delivered in CY2007
Molecular Biosciences: 31%
Chemistry: 17%
Physics: 17%
Astronomical Sciences: 12%
Materials Research: 6%
Earth Sciences: 3%
All 19 Others: 4%
Advanced Scientific Computing: 2%
Atmospheric Sciences: 3%
Chemical, Thermal Systems: 5%
13
Grid Projects in the US
Community
Service Provider
Content
Services
Resources
Software Providers
Resource Provider
Service Provider
For example:
Build and test service (Wisc)
Certificate Authorities
cancer Biomedical Informatics Grid (caBIG)
LIGO Data Grid
14
caBIG: sharing of infrastructure, applications, and data.
Data Integration!
Services & Cancer Biology
Globus
15
caBIG Under the Covers
[Diagram labels: NCICB; Research Centers; Grid-Enabled Client; Grid Portal; Grid Data Service; Analytical Service; caArray; Microarray; Gene Database; Protein Database; Image; Tools 1-4; Grid Services Infrastructure (Metadata, Registry, Query, Invocation, Security, etc.)]
Globus
16
LIGO Data Grid
LIGO Gravitational Wave Observatory
[Map labels: Birmingham, Cardiff, AEI/Golm]
Replicating >1 Terabyte/day to 8 sites
770 TB replicated to date: >120 million replicas
MTBF = 1 month
Ann Chervenak et al., ISI; Scott Koranda et al., LIGO
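The replication figures can be turned into more familiar units with a little arithmetic. The sketch below is illustrative only: it assumes decimal terabytes (10^12 bytes), treats the ">1 Terabyte/day" and ">120 million replicas" lower bounds as exact values, and does not try to decide whether the daily volume is a total or a per-site figure, since the slide does not say.

```python
TB = 10**12              # decimal terabyte in bytes (an assumption)
SECONDS_PER_DAY = 86_400


def sustained_mbit_per_s(bytes_per_day: float) -> float:
    """Average line rate needed to move the given daily volume."""
    return bytes_per_day * 8 / SECONDS_PER_DAY / 1e6


def mean_replica_size_mb(total_bytes: float, replica_count: int) -> float:
    """Average size of one replica in decimal megabytes."""
    return total_bytes / replica_count / 1e6


rate = sustained_mbit_per_s(1 * TB)                    # 1 TB/day as a line rate
size = mean_replica_size_mb(770 * TB, 120_000_000)     # 770 TB over 120M replicas

print(f"1 TB/day is about {rate:.1f} Mbit/s sustained")
print(f"Mean replica size is about {size:.1f} MB")
```

Because both slide figures are lower bounds, the outputs should be read as rough orders of magnitude rather than measurements.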
Globus
17
Grid Projects in the US
Community
Service Provider
Content
Services
Resources
Software Providers
Resource Provider
Community
For example:
Earth System Grid
Children’s Oncology Grid
Southern California Earthquake Center (SCEC)
Science gateways
18
Earth System Grid
Main ESG Portal
198 TB of data at four locations
1,150 datasets
1,032,000 files
Includes the past 6 years of joint DOE/NSF climate modeling experiments
8,000 registered users
Downloads to date: 49 TB, 176,000 files
CMIP3 (IPCC AR4) ESG Portal
35 TB of data at one location
74,700 files
Generated by a modeling campaign coordinated by the Intergovernmental Panel on Climate Change
Data from 13 countries, representing 25 models
1,900 registered projects
Downloads to date: 387 TB, 1,300,000 files, 500 GB/day (average)
400 scientific papers published to date based on analysis of CMIP3 (IPCC AR4) data
ESG usage: over 500 sites worldwide
ESG monthly download volumes
Globus
19
SCEC Community Modeling Environment
A collaboratory for system-level earthquake science
[Diagram labels:]
Knowledge Representation & Reasoning
Knowledge Server: knowledge base access, inference
Translation Services: syntactic & semantic translation
Knowledge Base
Ontologies: curated taxonomies, relations & constraints
Pathway Models: pathway templates, models of simulation codes
Knowledge Acquisition
Acquisition Interfaces: dialog planning, pathway construction strategies
Pathway Assembly: template instantiation, resource selection, constraint checking
Pathway Instantiations
Digital Libraries
Navigation & Queries; Versioning, Replication
Mediated Collections: federated access
Grid
Pathway Execution: policy, data ingest, repository access
Grid Services: compute & storage management, security
Code Repositories
FSM, RDM, AWM, SRM
Data & Simulation Products; Data Collections
Computing
Storage
Users
Globus
20
Seismic Hazard Analysis
Definition: maximum intensity of shaking expected at a site during a fixed time interval
Example: National seismic hazard maps (http://geohazards.cr.usgs.gov/eq/)
Intensity measure: peak ground acceleration
Interval: 50 yrs
Probability of exceedance: 2%
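The "2% in 50 years" hazard level can be restated as an annual exceedance rate and a return period. The sketch below assumes exceedances follow a Poisson process, which is a common convention in probabilistic seismic hazard analysis but is not stated on the slide; the function names are illustrative.

```python
import math


def annual_exceedance_rate(prob: float, interval_years: float) -> float:
    """Annual rate r such that P(at least one exceedance in T years) = prob.

    Under the Poisson assumption, prob = 1 - exp(-r * T).
    """
    return -math.log(1.0 - prob) / interval_years


def return_period(prob: float, interval_years: float) -> float:
    """Mean time between exceedances, in years."""
    return 1.0 / annual_exceedance_rate(prob, interval_years)


rate = annual_exceedance_rate(0.02, 50.0)
period = return_period(0.02, 50.0)

print(f"Annual exceedance rate: {rate:.6f} per year")
print(f"Return period: {period:.0f} years")
```

Under this assumption, the map's ground-motion level is one that is exceeded on average about once every 2,475 years.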
Globus
21
SCEC Computations & Grid
[Diagram labels: SDSC, USC, SCEC, PSC, TeraGrid, ISI; 12 CPUs, 1,700 CPUs, 1,200 CPUs, 1 CPU, 4 CPUs]
• Prepare input to Pathway2 wave propagation code
• Pathway2PGV converts output into hazard map
• Map is visualized
Globus
22
Children’s Oncology Grid and MEDICUS
Globus
23
Grid Projects in the US
Community
Service Provider
Content
Services
Resources
Resource Provider
Software Providers
24
Software Providers
Globus [GT4.2 released July 2, 2008]
GRAM, GridFTP, MDS, RLS, DRS, …
GSI, GridShib, MyProxy, …
GridWay (Spain), OGSA-DAI (UK), Introduce, …
Condor
MPI-G, Swift, Pegasus, Taverna (UK), Kepler
caBIG: e.g., Introduce
Virtual Data Toolkit (includes VOMS [Italy], …)
SRB, iRODS, MyCluster, …
…
Globus
25
Virtual Data Toolkit (VDT) Software Release Process
VDT components over time: built for 15 Linux Versions
Development & testing
Globus
26
Creating Services: Introduce and gRAVI
Introduce: define service, create skeleton, discover types, add operations, configure security
Grid Remote Application Virtualization Infrastructure: wrap executables
[Diagram labels: Appln Service; Create; Store; Repository Service; Advertise; Index service; Discover; Invoke, get results; Introduce; Container; Transfer GAR; Deploy]
Ohio State University and Argonne/U.Chicago
Globus
27
Composing Services
Globus
28
Service Discovery: Registries
Globus
29
Challenges
Community
Service Provider
Content
Services
Resources
Resource Provider
Software Providers
Conflicting Missions
Sustainability
Discipline science pull
30
The Future
NSF eXtreme Digital (XD) solicitation, aka “TeraGrid III”
DOE, NIH, etc.: what do they want?
International cooperation