Distributed IT Infrastructure for U.S. ATLAS
Rob Gardner
Indiana University
DOE/NSF Review of U.S. ATLAS and CMS Computing Projects
Brookhaven National Laboratory
November 14-17, 2000
Outline
Requirements
Approach
Organization
Resource Requirements
Schedule
Fallback Issues
Distributed IT Infrastructure
A wide-area computational infrastructure for U.S. ATLAS
  A network of distributed computing devices
  A network of distributed data caches & stores
  Connectivity:
    Physicists with data (laptop-scale sources: LOD)
    Computers with data (at all scales)
    Physicists with each other (collaboration)
  Distributed information, portals

Efforts
  Data Grid R&D
  Strategic "remote" sites (Tier 2s)
  Distributed IT support at the Tier 1 center
Requirements
Access
  Efficient access to resources at the Tier 1 facility
  Data distribution to remote computing devices
Information
  A secure infrastructure to locate, monitor, and manage collections of distributed resources
  Analysis planning framework
  Resource estimation
  "Matchmaker" tools to optimally connect physicist + CPU + data + etc.
Scalable
  Add arbitrarily large numbers of computing devices as they become available
  Add arbitrarily large numbers of data sources as they become available
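The "matchmaker" requirement above can be illustrated with a minimal sketch, in the spirit of Condor ClassAd matchmaking: pair an analysis request with a site that holds the needed dataset and has the most free CPU. All site names, fields, and numbers here are invented for illustration, not part of the plan.

```python
# Hypothetical matchmaking sketch (site names and capacities invented):
# choose the site that holds the requested dataset and has the most free CPU.

sites = [
    {"name": "tier1",  "datasets": {"RAW", "ESD", "AOD"}, "free_si95": 40_000},
    {"name": "tier2a", "datasets": {"AOD"},               "free_si95": 20_000},
    {"name": "tier2b", "datasets": {"ESD", "AOD"},        "free_si95": 30_000},
]

def match(dataset, sites):
    """Pick the site holding `dataset` with the most free CPU, or None."""
    candidates = (s for s in sites if dataset in s["datasets"])
    return max(candidates, key=lambda s: s["free_si95"], default=None)

print(match("ESD", sites)["name"])  # -> tier1 (most free CPU among ESD holders)
print(match("GEN", sites))          # -> None (no site holds GEN data)
```

A real matchmaker would rank on richer expressions (network proximity to the data, queue length, user priority); the point is simply connecting physicist, CPU, and data through one query.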
Approach
~5 strategic remote sites (Tier 2s)
Scale of each facility: MONARC estimates
  ATLAS NCB/WWC (World Wide Computing Group)
  National Tier 1 facility: 209K SpecInt95, 365 TB online disk, 2 PB tertiary storage
  Tier 2 = Tier 1 * 20%
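The 20% rule of thumb can be checked against the Tier 1 figures quoted on these slides (209K SpecInt95, 365 TB disk). The dictionary keys below are my own labels, not slide terminology.

```python
# Back-of-envelope Tier 2 sizing from the quoted Tier 1 scale,
# using the "Tier 2 = Tier 1 * 20%" rule of thumb.
TIER1 = {"cpu_si95": 209_000, "disk_tb": 365}
FRACTION = 0.20

tier2 = {resource: value * FRACTION for resource, value in TIER1.items()}
print(tier2)  # -> roughly 41800 SpecInt95 and 73 TB of disk per Tier 2
```

The per-site figures quoted later in the deck (50K SpecInt95, 70 TB disk) are in the same ballpark as this 20% scaling.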
Role of Tier 2 Centers
User Analysis
  Standard configuration optimized for analysis at the AOD level
  ESD objects required for some analyses
Primary resource for Monte Carlo simulation
  "Spontaneous" production-level ESD skims (autonomy)
Data distribution caches
Remote data stores
  HSM serves to archive AODs
  MC data of all types (GEN, RAW, ESD, AOD, LOD) from all Tier 2's & users
Typical Tier 2
CPU: 50K SpecInt95 (Tier 1: 209K)
  Commodity Pentium/Linux
  Estimated 144 dual-processor nodes (Tier 1: 640)
Online storage: 70 TB disk (Tier 1: 365 TB)
  High-performance storage area network
  Baseline: Fibre Channel RAID array
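The node-count estimate above follows from simple arithmetic. The per-processor rating used here (~174 SpecInt95) is inferred to make the slide's figures consistent; it is not stated in the deck.

```python
import math

# Rough cross-check of the slide's figures: 50K SpecInt95 delivered by
# commodity dual-processor Pentium/Linux nodes.
total_si95 = 50_000
si95_per_cpu = 174   # assumed per-processor rating, inferred from the slide
cpus_per_node = 2

nodes = math.ceil(total_si95 / (si95_per_cpu * cpus_per_node))
print(nodes)  # -> 144, matching the "estimated 144 dual-processor nodes"
```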
‘Remote’ Data Stores
Exploit existing infrastructure
  Mass store infrastructure at 2 of the 5 Tier 2 centers
  Assume existing HPSS or equivalent license, tape silo, robot
  Augment with drives, media, mover nodes, and disk cache
Each site contributes a 0.3-0.5 PB store
  AOD archival, MC ESD+AOD archival
Organization
Facilities Subproject 2.3.2

WBS Number  Description
2.3.2       Distributed IT Infrastructure
2.3.2.1     Specify ATLAS requirements
2.3.2.2     Design and model Grid architecture
2.3.2.3     Integration of Grid software
2.3.2.4     Grid testbeds
2.3.2.5     Wide area network integration
2.3.2.6     Collaborative tools for Tier 2
2.3.2.7     Tier 2 Regional Center at Location A
2.3.2.8     Tier 2 Regional Center at Location B
2.3.2.9     Tier 2 Regional Center at Location C
2.3.2.10    Tier 2 Regional Center at Location D
2.3.2.11    Tier 2 Regional Center at Location E
2.3.2.12    Tertiary Storage at Tier 2 Regional Centers
Personnel
MANPOWER ESTIMATE SUMMARY IN FTEs
WBS No: 2    Funding Type: Infrastructure    (11/13/00 8:08:38 PM)
Description: US ATLAS Computing    Institutions: All    Funding Source: All

              FY 01  FY 02  FY 03  FY 04  FY 05  FY 06  Total
IT I            1.0    4.0    6.0   10.0   10.0    7.0   38.0
IT II           0.0    1.0    2.0    2.0    5.0    5.0   15.0
Physicist       1.0    1.0    1.0    1.0    1.0    0.0    5.0
TOTAL LABOR     2.0    6.0    9.0   13.0   16.0   12.0   58.0
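The FTE table's totals can be verified mechanically; this small sketch sums the rows and columns as given on the slide.

```python
# Cross-check of the FTE table: per-category totals, per-year totals,
# and the TOTAL LABOR bottom line.
fte = {
    "IT I":      [1.0, 4.0, 6.0, 10.0, 10.0, 7.0],
    "IT II":     [0.0, 1.0, 2.0, 2.0, 5.0, 5.0],
    "Physicist": [1.0, 1.0, 1.0, 1.0, 1.0, 0.0],
}

row_totals = {category: sum(years) for category, years in fte.items()}
yearly_totals = [sum(col) for col in zip(*fte.values())]

print(row_totals)        # {'IT I': 38.0, 'IT II': 15.0, 'Physicist': 5.0}
print(yearly_totals)     # [2.0, 6.0, 9.0, 13.0, 16.0, 12.0]
print(sum(yearly_totals))  # 58.0
```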
Tier 2 Costs (all figures in k$)

WBS Number  Description                                  FY 01  FY 02  FY 03  FY 04  FY 05  FY 06  Total
2.3.2.7     Tier 2 Regional Center at Location A           412    462    436    439    691   1007   3448
2.3.2.7.1     Tier 2 Facility Hardware                     248    244    213    207    459    784   2155
2.3.2.7.2     Tier 2 Facility Software                     140    186    186    186    186    186   1069
2.3.2.7.3     Tier 2 Facility Administration                24     33     37     46     46     37    224
2.3.2.8     Tier 2 Regional Center at Location B             0    500    498    403    721   1008   3131
2.3.2.8.1     Tier 2 Facility Hardware                       0    336    275    171    489    785   2057
2.3.2.8.2     Tier 2 Facility Software                       0    140    186    186    186    186    883
2.3.2.8.3     Tier 2 Facility Administration                 0     24     37     46     46     37    191
2.3.2.9     Tier 2 Regional Center at Location C             0      0      0    315    745   1032   2093
2.3.2.9.1     Tier 2 Facility Hardware                       0      0      0     89    411    698   1198
2.3.2.9.2     Tier 2 Facility Software                       0      0      0     75     75     75    224
2.3.2.9.3     Tier 2 Facility Administration                 0      0      0     21     37     37     95
2.3.2.10    Tier 2 Regional Center at Location D             0      0      0    315    745   1032   2093
2.3.2.10.1    Tier 2 Facility Hardware                       0      0      0     89    411    698   1198
2.3.2.10.2    Tier 2 Facility Software                       0      0      0     75     75     75    224
2.3.2.10.3    Tier 2 Facility Administration                 0      0      0     21     37     37     95
2.3.2.11    Tier 2 Regional Center at Location E             0      0      0    315    745   1032   2093
2.3.2.11.1    Tier 2 Facility Hardware                       0      0      0     89    411    698   1198
2.3.2.11.2    Tier 2 Facility Software                       0      0      0     75     75     75    224
2.3.2.11.3    Tier 2 Facility Administration                 0      0      0     21     37     37     95
2.3.2.12    Tertiary Storage at Tier 2 Regional Centers      0      0    258    343    387    757   1745
2.3.2.12.1    Tertiary Storage at Regional Center            0      0    258    151    176    401    987
2.3.2.12.2    Tertiary Storage at Regional Center            0      0      0    192    210    355    758

Grand total: 14603
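The 14603 k$ bottom line is the sum of the six top-level WBS totals in the cost table; a quick sketch confirms the arithmetic.

```python
# Cross-check of the cost table's grand total: the six top-level
# WBS line totals (k$) should sum to 14603 k$.
wbs_totals = {
    "2.3.2.7":  3448,   # Tier 2 Regional Center at Location A
    "2.3.2.8":  3131,   # Tier 2 Regional Center at Location B
    "2.3.2.9":  2093,   # Tier 2 Regional Center at Location C
    "2.3.2.10": 2093,   # Tier 2 Regional Center at Location D
    "2.3.2.11": 2093,   # Tier 2 Regional Center at Location E
    "2.3.2.12": 1745,   # Tertiary Storage at Tier 2 Regional Centers
}

grand_total = sum(wbs_totals.values())
print(grand_total)  # -> 14603
```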
Schedule
R&D Tier 2's – FY '01 & FY '02
  Initial development & test, 1% to 2% scale
  Start Grid testbed: ATLAS-GriPhyN
Data Challenges – FY '03 & FY '04
Production Tier 2's – FY '04 & FY '05
Operation – FY '05, FY '06 & beyond
  Full-scale system operation, 20% ('05) to 100% ('06) (as for Tier 1)
Testbed '01
  [Map of the FY '01 Grid testbed: Brookhaven National Laboratory, Argonne National Laboratory, UC Berkeley / LBNL-NERSC, Indiana University, Boston University, U Michigan, and the University of Texas at Arlington (HPSS sites marked), interconnected via the ESnet, Abilene, MREN, CalREN, NTON, and NPACI networks]
Fallback Issues
Impact of limited support for a planned distributed infrastructure?
  Several scenarios, of course, are possible
  US ATLAS will face a serious shortfall in analysis capability
  Shortfall in simulation capacity
  Analysis groups will have less autonomy
  University groups would likely augment their facilities through supplemental requests and large-scale proposals to establish multidisciplinary "centers"
  We could end up with 1 Tier 1 and 32 "Tier 2" centers
  An incoherent, messy infrastructure, difficult to manage
  Not the best way to optimize physics discovery