High Energy & Nuclear Physics Experiments and Advanced Cyberinfrastructure

Paul Avery
University of Florida
[email protected]

Internet2 Meeting
San Diego, CA
October 11, 2007
www.opensciencegrid.org
Context: Open Science Grid
• Consortium of many organizations (multiple disciplines)
• Production grid cyberinfrastructure
• 75+ sites, 30,000+ CPUs: US, UK, Brazil, Taiwan
OSG Science Drivers
• Experiments at Large Hadron Collider: new fundamental particles and forces; 100s of petabytes; 2008 - ?
• High Energy & Nuclear Physics expts: top quark, nuclear matter at extreme density; ~10 petabytes; 1997 - present
• LIGO (gravity wave search): search for gravitational waves; ~few petabytes; 2002 - present

[Chart: data growth and community growth, 2001-2009]

Future Grid resources
• Massive CPU (PetaOps)
• Large distributed datasets (>100 PB)
• Global communities (1000s)
• International optical networks
OSG History in Context
Primary drivers: LHC and LIGO

[Timeline, 1999-2009: PPDG (DOE), GriPhyN (NSF), and iVDGL (NSF) combine as Trillium, then Grid3, then OSG (DOE+NSF); in parallel: European Grid + Worldwide LHC Computing Grid, and campus/regional grids; LHC construction, preparation, and commissioning leading to LHC Ops; LIGO preparation leading to LIGO operation]
LHC Experiments at CERN
• 27 km tunnel in Switzerland & France
• Experiments: ATLAS, CMS, ALICE, LHCb, TOTEM
• Search for: origin of mass, new fundamental forces, supersymmetry, other new particles
• 2008 - ?
Collisions at LHC (2008?)
• Particles: proton-proton
• 2835 bunches/beam, 10^11 protons/bunch
• Beam energy: 7 TeV x 7 TeV
• Luminosity: 10^34 cm^-2 s^-1
• Crossing rate: every 25 nsec (~20 collisions/crossing)
• Collision rate: ~10^9 Hz
• New physics rate: ~10^-5 Hz
• Selection: 1 in 10^14

[Diagram: protons (partons: quarks, gluons) colliding in a bunch crossing, producing jets and leptons; e.g. Higgs -> Z0 Z0 -> e+ e- e+ e-; SUSY, ...]
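As a sanity check on these numbers, a minimal Python sketch of the slide's arithmetic (rounded inputs; not an official experiment calculation):

```python
# Back-of-the-envelope check of the collision rates quoted above.
crossing_interval_s = 25e-9       # bunch crossing every 25 nsec
collisions_per_crossing = 20      # ~20 collisions per crossing
selection_fraction = 1e-14        # "1 in 10^14" selectivity

crossing_rate_hz = 1.0 / crossing_interval_s                    # 4.0e7 (40 MHz)
collision_rate_hz = crossing_rate_hz * collisions_per_crossing  # ~8e8, i.e. ~10^9 Hz
new_physics_rate_hz = collision_rate_hz * selection_fraction    # ~10^-5 Hz

print(f"crossing rate:    {crossing_rate_hz:.1e} Hz")
print(f"collision rate:   {collision_rate_hz:.1e} Hz")
print(f"new physics rate: {new_physics_rate_hz:.1e} Hz "
      f"(one event every {1 / new_physics_rate_hz / 86400:.1f} days)")
```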
LHC Data and CPU Requirements (ATLAS, CMS, LHCb)
• Storage: raw recording rate 0.2 - 1.5 GB/s; large Monte Carlo data samples; 100 PB by ~2012; 1000 PB later in decade?
• Processing: PetaOps (>600,000 3 GHz cores)
• Users: 100s of institutes, 1000s of researchers
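To see the scale behind these figures, a rough Python sketch; the 10^7 live seconds/year and one operation/cycle figures are illustrative assumptions, not from the slide:

```python
# Rough scale of the storage and CPU numbers above.
raw_rate_gbs = 1.5           # upper end of the 0.2-1.5 GB/s recording rate
live_seconds_per_year = 1e7  # assumed accelerator "live" time per year

raw_pb_per_year = raw_rate_gbs * live_seconds_per_year / 1e6  # GB -> PB
print(f"raw data: ~{raw_pb_per_year:.0f} PB/year per experiment")  # ~15 PB/yr

cores, clock_hz, ops_per_cycle = 600_000, 3e9, 1  # ~1 op/cycle assumed
total_ops = cores * clock_hz * ops_per_cycle
print(f"aggregate CPU: ~{total_ops / 1e15:.1f} PetaOps")  # ~1.8 PetaOps
```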
LHC Global Collaborations (ATLAS, CMS)
• 2000 - 3000 physicists per experiment
• USA is 20 - 31% of total
LHC Global Grid (CMS experiment)

[Diagram: Online System feeds Tier 0 (CERN Computer Center) at 200 - 1500 MB/s; Tier 0 connects to Tier 1 national centers (FermiLab, Korea, Russia, UK) over >10 Gb/s and 10 - 40 Gb/s links; Tier 1 connects to Tier 2 university centers (Maryland, Iowa, UCSD, Caltech, U Florida) at 2.5 - 10 Gb/s; below them sit Tier 3 sites (e.g. FIU) and Tier 4 physics caches and PCs; the US tiers run on OSG]

• 5000 physicists, 60 countries
• 10s of petabytes/yr by 2009
• CERN / outside = 10 - 20%
(Transfer-time arithmetic sketched below.)
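The link speeds matter because replicating tier-scale datasets is slow even on fat pipes. A minimal sketch; the 100 TB dataset size and 70% link efficiency are illustrative assumptions:

```python
# Time to replicate a dataset between tiers at various link speeds.
def transfer_days(dataset_tb, link_gbps, efficiency=0.7):
    """Days to move dataset_tb terabytes over a link_gbps link."""
    bits = dataset_tb * 1e12 * 8
    return bits / (link_gbps * 1e9 * efficiency) / 86400

for gbps in (2.5, 10, 40):
    print(f"100 TB over {gbps:>4} Gb/s: {transfer_days(100, gbps):.1f} days")
# ~5.3 days at 2.5 Gb/s, ~1.3 days at 10 Gb/s, ~0.3 days at 40 Gb/s
```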
LHC Global Grid
• 11 Tier-1 sites
• 112 Tier-2 sites (growing)
• 100s of universities

[World map of WLCG sites; credit: J. Knobloch]
LHC Cyberinfrastructure Growth: CPU

[Chart: projected CPU capacity, 2007-2010, in MSI2000 (scale 0-350), stacked by experiment (ALICE, ATLAS, CMS, LHCb) and by tier (CERN, Tier-1, Tier-2); ~100,000 cores by 2010]

• Multi-core boxes
• AC & power challenges
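For reference, converting the chart's MSI2000 (million SPECint2000) units to core counts, using the per-core rating implied by the slide's own endpoints; the per-core figure is an inference, not a stated number:

```python
# Rough MSI2000 -> core-count conversion implied by the chart.
total_msi2k = 350.0     # approximate 2010 total from the chart (assumed)
cores_quoted = 100_000  # "~100,000 cores" from the slide
si2k_per_core = total_msi2k * 1e6 / cores_quoted
print(f"implied rating: ~{si2k_per_core / 1e3:.1f} kSI2000 per core")  # ~3.5
```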
LHC Cyberinfrastructure Growth: Disk

[Chart: projected disk capacity, 2007-2010, in PB (scale 0-160), stacked by experiment (ALICE, ATLAS, CMS, LHCb) and by tier (CERN, Tier-1, Tier-2); ~100 petabytes of disk by 2010]
LHC Cyberinfrastructure Growth: Tape

[Chart: projected tape capacity, 2007-2010, in PB (scale 0-160), stacked by experiment (ALICE, ATLAS, CMS, LHCb) at CERN and Tier-1; ~100 petabytes of tape by 2010]
HENP Bandwidth Roadmap for Major Links (in Gbps)

Year | Production          | Experimental          | Remarks
2001 | 0.155               | 0.622-2.5             | SONET/SDH
2002 | 0.622               | 2.5                   | SONET/SDH; DWDM; GigE integration
2003 | 2.5                 | 10                    | DWDM; 1 + 10 GigE integration
2005 | 10                  | 2-4 x 10              | Lambda switch; lambda provisioning
2007 | 3 x 10              | ~10 x 10; 40 Gbps     | 1st gen. lambda grids
2009 | ~8 x 10 or 2 x 40   | ~5 x 40 or ~20 x 10   | 40 Gbps lambda switching
2012 | ~5 x 40 or ~20 x 10 | ~25 x 40 or ~100 x 10 | 2nd gen. lambda grids; terabit networks
2015 | ~Terabit            | ~Multi-Tbps           | ~Fill one fiber

Paralleled by the ESnet roadmap.
HENP Collaboration with Internet2 (www.internet2.edu)
• HENP SIG
HENP Collaboration with NLR (www.nlr.net)
• UltraLight and other networking initiatives
• Spawning state-wide and regional networks (FLR, SURA, LONI, ...)
US LHCNet, ESnet Plan 2007-2010: 30-80 Gbps US-CERN
• US-LHCNet data network (NY-CHI-GVA-AMS), 2007-10: 30, 40, 60, 80 Gbps (3 to 8 x 10 Gbps US-CERN)
• ESnet4 SDN core: 30-50 Gbps Science Data Network, 40-60 Gbps circuit transport
• ESnet IP core: >=10 Gbps production IP, enterprise IP traffic
• ESnet MANs to FNAL & BNL; dark fiber to FNAL; peering with GEANT
• NSF/IRNC circuit; GVA-AMS connection via SURFnet or GEANT2

[Map: ESnet hubs (SEA, SNV, SDG, DEN, ELP, ALB, ATL, NYC, CHI, DC) and new hubs, with metropolitan area rings; major DOE Office of Science sites; high-speed cross-connects with Internet2/Abilene; lab-supplied and major international links to Europe (CERN, GEANT2, SURFnet, IN2P3), Japan, Australia, and Asia-Pacific; 10 Gb/s, 2 x 10 Gb/s, and 30 Gb/s segments]
Tier-1 - Tier-2 Data Transfers: 2006-07

[Chart: CMS Tier-1 to Tier-2 transfer rates from Sep. 2006 through Mar. 2007 to Sep. 2007, reaching 1 GB/sec; CSA06 period marked]
US: FNAL Transfer Rates to Tier-2 Universities

[Chart: FNAL to Tier-2 transfer rates, June 2007, peaking at 1 GB/s to Nebraska; from "Computing, Offline and CSA07"]

• One well-configured site today, but ~10 such sites in the near future: a network challenge (see the scaling sketch below)
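The scaling behind the "network challenge" is a one-line extrapolation of the slide's numbers:

```python
# ~10 Tier-2 sites each pulling ~1 GB/s from a single Tier-1 (FNAL).
sites, per_site_gb_s = 10, 1.0
aggregate_gbps = sites * per_site_gb_s * 8  # GB/s -> Gb/s
print(f"aggregate demand: {aggregate_gbps:.0f} Gb/s")  # 80 Gb/s from one Tier-1
```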
Current Data Transfer Experience
• Transfers are generally much slower than expected, or stop altogether
• Potential causes are difficult to diagnose: configuration problem? loading? queuing? database errors, experiment s/w errors, grid s/w errors? end-host problem? network problem? application failure?
• Complicated recovery: insufficient information; too slow to diagnose and correlate at the time the error occurs
• Result: lower transfer rates, longer troubleshooting times
• Need intelligent services and smart end-host systems
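One form a "smart end-host system" could take: instrument each transfer and capture diagnostic context at the moment it fails or stalls, rather than reconstructing it afterward. A hypothetical sketch; monitored_transfer and the threshold are illustrative, not an OSG interface:

```python
# Sketch: wrap a transfer command and log failure context immediately.
import subprocess
import time

MIN_MBPS = 50.0  # below this, treat the transfer as effectively stalled

def monitored_transfer(cmd, size_mb):
    """Run a transfer command; log rate, exit code, and stderr on trouble."""
    start = time.time()
    result = subprocess.run(cmd, capture_output=True, text=True)
    elapsed = time.time() - start
    mbps = size_mb * 8 / elapsed if elapsed > 0 else 0.0
    if result.returncode != 0 or mbps < MIN_MBPS:
        # Capture context *now*: timestamp, exit code, rate, stderr tail.
        print(f"[{time.ctime(start)}] FAILED/SLOW: rc={result.returncode}, "
              f"{mbps:.1f} Mb/s, stderr: {result.stderr[-200:]}")
    else:
        print(f"OK: {mbps:.1f} Mb/s")

# e.g. monitored_transfer(["globus-url-copy", src_url, dst_url], size_mb=1000)
```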
UltraLight: Integrating Advanced Networking in Applications (http://www.ultralight.org)
• 10 Gb/s+ network
• Caltech, UF, FIU, UM, MIT
• SLAC, FNAL
• Int'l partners
• Level(3), Cisco, NLR
• Funded by NSF
UltraLight Testbed (www.ultralight.org)

[Map: UltraLight testbed; funded by NSF]
Many Near-Term Challenges
• Network: bandwidth, bandwidth, bandwidth
• Need for intelligent services and automation
• More efficient utilization of the network (protocols, NICs, s/w clients, pervasive monitoring)
• Better collaborative tools
• Distributed authentication?
• Scalable services: automation
• Scalable support
END
Extra Slides
The Open Science Grid Consortium

[Diagram: Open Science Grid at the center, connecting U.S. grid projects, LHC experiments, laboratory centers, education communities, science projects & communities, technologists (network, HPC, ...), computer science, university facilities, multi-disciplinary facilities, and regional and campus grids]
CMS: "Compact" Muon Solenoid

[Photo: CMS detector, with "inconsequential humans" for scale]
Collision Complexity: CPU + Storage

[Event display: all charged tracks with pt > 2 GeV vs. reconstructed tracks with pt > 25 GeV (+30 minimum bias events)]

• 10^9 collisions/sec; selectivity: 1 in 10^13
LHC Data Rates: Detector to Storage
• Physics filtering of ~TBytes/sec from the detector
• Level 1 trigger (special hardware): 40 MHz -> 75 kHz (75 GB/sec)
• Level 2 trigger (commodity CPUs): 75 kHz -> 5 kHz (5 GB/sec)
• Level 3 trigger (commodity CPUs): 5 kHz -> 100 Hz (0.15 - 1.5 GB/sec)
• Raw data to storage (+ simulated data)
(Rate-reduction arithmetic sketched below.)
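The cascade implies roughly megabyte events and an overall rate reduction of ~400,000x; a quick Python check of that arithmetic:

```python
# Per-level rejection factors and implied event sizes from the rates above.
levels = [            # (name, output rate in Hz, output data rate in GB/s)
    ("L1 (hardware)",  75_000, 75.0),
    ("L2 (commodity)",  5_000,  5.0),
    ("L3 (commodity)",    100,  0.15),
]
input_rate_hz = 40e6  # 40 MHz crossing rate into Level 1

prev = input_rate_hz
for name, rate_hz, gb_s in levels:
    event_mb = gb_s * 1e9 / rate_hz / 1e6   # bytes per event -> MB
    print(f"{name}: reject {prev / rate_hz:>6.0f}x, ~{event_mb:.1f} MB/event")
    prev = rate_hz
print(f"overall reduction: {input_rate_hz / levels[-1][1]:,.0f}x")  # 400,000x
```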
LIGO: Search for Gravity Waves
• LIGO Grid: 6 US sites, 3 EU sites (UK & Germany)
• LHO, LLO: LIGO observatory sites; LSC: LIGO Scientific Collaboration

[Map: LIGO Grid sites, including Birmingham, Cardiff, and AEI/Golm in Europe]
The Technology Hype Cycle Applied to HEP Grids
Is HEP approaching the productivity plateau?

[Chart: Gartner Group hype cycle (expectations vs. time) annotated with CHEP conferences: Padova 2000, Beijing 2001, San Diego 2003, Interlaken 2004, Mumbai 2006, Victoria 2007; from Les Robertson]
Challenges from Diversity and Growth
• Management of an increasingly diverse enterprise: sci/eng projects, organizations, and disciplines as distinct cultures; accommodating new member communities (expectations?)
• Interoperation with other grids: TeraGrid, international partners (EGEE, NorduGrid, etc.), multiple campus and regional grids
• Education, outreach and training: training for researchers and students, but also project PIs and program officers
• Operating a rapidly growing cyberinfrastructure: 25K -> 100K CPUs, 4 -> 10 PB disk; management of and access to rapidly increasing data stores (slide); monitoring, accounting, achieving high utilization; scalability of support model (slide)
Collaborative Tools: EVO Videoconferencing
• End-to-end self-managed infrastructure
REDDnet: National Networked Storage
• NSF-funded project (Vanderbilt)
• 8 initial sites, multiple disciplines: satellite imagery, HENP, Terascale Supernova Initiative, structural biology, bioinformatics
• Storage: 500 TB disk, 200 TB tape
• Brazil?
OSG Operations Model
• Distributed model: scalability!
• VOs, sites, providers
• Rigorous problem tracking & routing
• Security, provisioning, monitoring, reporting
• Partners with EGEE operations