Experiment Requirements for Global Infostructure
Irwin Gaines, FNAL/DOE
7-Feb-03, 3rd NSF/DOE meeting on partnerships for global infostructure
Outline
• Recall partnership principles
• LHC computing model
• CMS and ATLAS grid prototyping
• Categories of work packages
• Contributors
Agreement on 5 principles:
1. The cost and complexity of 21st Century Science requires the creation of an advanced and coherent global Infostructure (information infrastructure).
2. The construction of a coherent Global Infostructure for Science requires definition and drivers from Global Applications (that will also communicate with each other).
3. Further, forefront Information Technology must be incorporated into this Global Infostructure for the Applications to reach their full potential for changing the way science is done.
4. LHC is a near-term Global Application requiring advanced and un-invented Infostructure, and is ahead in planning compared to many others.
5. U.S. agencies must work together for effective U.S. participation in Global-scale infostructure, and for the successful execution of the LHC program in a 4-way agency partnership, with international cooperation in view.
LHC as exemplar of global science
• Project already involves scientists (and funding agencies) from all over the world
• High-visibility science
• Experiments already making good use of prototype grids
• Sociological (as well as technical) reasons for decentralized computing systems
• Recognized challenge of accumulating sufficient resources
LHC Global Science
• LHC is the most exciting, challenging, and relevant science
• Challenges: scientifically, technically, culturally, managerially
• Collaboration
  – open and fair access and sharing of data, tools, ideas
  – unique opportunities for discovery for small and remote groups
• Data and Information
  – vast data beyond the technical capabilities of any single organization
  – revolutionary new applications of new information technology tools
• Globalization
  – building truly global (science) communities
  – acquiring data centrally, analyzing data globally, like a large corporation
• Opportunities to advance Information Technology
• Relevant to Science at Large
Centres taking part in the LCG-1: around the world, around the clock
LHC Computing Model
• Distributed model from the start (distributed resources + coherent global access to data)
• Must support:
  – Production (reconstruction, simulation)
    • Scheduled, predictable, batch
    • Run by experiment or physics group
    • Highly compute-intensive; accesses predictable data sets
  – Data Analysis (including calibration and monitoring)
    • Random, chaotic, often interactive
    • Run by individuals and small groups
    • Mostly data-intensive; accesses random data
    • Highly collaborative
  – Code development and testing
    • Highly interactive
    • Highly collaborative
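The three workload categories above differ mainly in how they are scheduled. As a minimal sketch (the class, queue, and job names are invented for illustration, not part of any LHC software), a dispatcher might route jobs by category:

```python
from dataclasses import dataclass

# Hypothetical labels mirroring the slide's three workload types.
PRODUCTION = "production"      # scheduled, predictable, batch
ANALYSIS = "analysis"          # random, chaotic, often interactive
DEVELOPMENT = "development"    # highly interactive code test runs

@dataclass
class Job:
    owner: str
    category: str
    cpu_hours: float

def route(job: Job) -> str:
    """Pick a queue: high-throughput batch for production, low-latency otherwise."""
    if job.category == PRODUCTION:
        return "batch"          # scheduled, compute-intensive
    if job.category == ANALYSIS:
        return "short"          # chaotic, data-intensive
    return "interactive"        # edit-compile-test cycles

jobs = [Job("cms-prod", PRODUCTION, 5000.0),
        Job("alice", ANALYSIS, 2.0),
        Job("bob", DEVELOPMENT, 0.1)]
queues = [route(j) for j in jobs]
```

The point of the sketch is only that the model must serve very different access patterns from one shared resource pool.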
LHC Computing Facilities Model
Adoption of Grids by LHC Experiments
• Already some major successes:
  – CMS and ATLAS production running
  – Good collaborations with computer scientists: iVDGL, GriPhyN, PPDG, EDG…
  – LHC Computing Grid Project (LCG)
• We now have a scientific community that understands the components and value of grid computing.
ATLAS Grid Testbed Sites
US-ATLAS testbed launched February 2001
[Map of testbed sites:]
• Lawrence Berkeley National Laboratory
• Brookhaven National Laboratory
• Indiana University
• Boston University
• Argonne National Laboratory
• University of Michigan
• University of Texas at Arlington
• Oklahoma University
US-CMS Development Grid Testbed
• Fermilab – 1+5 PIII dual 0.700 GHz processor machines
• Caltech – 1+3 AMD dual 1.6 GHz processor machines
• San Diego – 1+3 PIV single 1.7 GHz processor machines
• Florida – 1+5 PIII dual 1 GHz processor machines
• Rice – 1+? machines
• Wisconsin – 5 PIII single 1 GHz processor machines
Total: ~40 1 GHz dedicated processors
[Map of testbed sites: Fermilab, Caltech, UCSD, Florida, Wisconsin, Rice]
US-CMS Integration Grid Testbed
• Fermilab (Tier 1) – 40 dual 0.750 GHz processor machines
• Caltech (Tier 2) – 20 dual 0.800 GHz + 20 dual 2.4 GHz processor machines
• San Diego (Tier 2) – 20 dual 0.800 GHz + 20 dual 2.4 GHz processor machines
• Florida (Tier 2) – 40 dual 1 GHz processor machines
• CERN (LCG Tier 0 site) – 36 dual 2.4 GHz processor machines
Total: 240 × ~0.85 GHz processors (Red Hat 6); 152 × 2.4 GHz processors (Red Hat 7)
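The processor totals quoted for the Integration Grid Testbed can be checked directly from the per-site machine counts (each dual-processor machine contributes two CPUs):

```python
# Integration Grid Testbed inventory from the slide:
# (site, machines, CPUs per machine) per speed class.
slower = [  # 0.750-1 GHz nodes (Red Hat 6)
    ("Fermilab", 40, 2),   # 40 dual 0.750 GHz
    ("Caltech", 20, 2),    # 20 dual 0.800 GHz
    ("San Diego", 20, 2),  # 20 dual 0.800 GHz
    ("Florida", 40, 2),    # 40 dual 1 GHz
]
faster = [  # 2.4 GHz nodes (Red Hat 7)
    ("Caltech", 20, 2),
    ("San Diego", 20, 2),
    ("CERN", 36, 2),
]
slow_cpus = sum(m * c for _, m, c in slower)   # 80+40+40+80 = 240
fast_cpus = sum(m * c for _, m, c in faster)   # 40+40+72 = 152
```

Both sums match the totals on the slide (240 processors of roughly 0.85 GHz average, plus 152 at 2.4 GHz).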
“Work packages” for LHC computing:
• Hardware infrastructure
• Distributed computing infrastructure
• Grid services
• Experiment software
• Collaboration tools
• Support services
Hardware Infrastructure
• Tier 0 at CERN
  – Compute elements
  – Storage elements
  – Mass storage
• Tier 1 national regional centers
• Tier 2 regional centers
• Local computing resources
Distributed computing infrastructure
• Networking
  – Intercontinental
  – Regional wide area
  – Local “end user” connections
• Servers for distributed computing
  – Metadata servers
  – Resource brokers
  – Monitoring centers
Grid Services
• Low-level middleware (the casual user doesn’t see this layer)
• Application-specific middleware (services built on top of low-level MW, with flexible user interfaces and higher-level functionality)
• Modeling and monitoring
• Troubleshooting and fault tolerance
• Distributed Data Analysis Environment
• Grid hardware for:
  – Research and development of tools
  – Deployment and integration
  – Production
CMS Approach to R&D, Integration, Deployment: prototyping, early roll-out, strong QC/QA and documentation, tracking of external “practices”
Experiment Software
• Core SW
• Detector-specific applications
• Physics analysis support
• Analysis group support
Some of this software suitable for common development
Collaboration tools
• Phone conferencing
• Video conferencing
• Remote informal interaction (virtual coffee break)
• Document sharing
• Collaborative software development
• Collaborative data analysis
• Telepresence
• Remote control of experiment
Support Services
• Training and documentation
• Information services
• User support (help desk): 24x7
Grid Middleware I
1. User Management
• 1.1 Registration of users as members of a virtual organization (VO) (including subgroup credentials within the VO)
• 1.2 Authentication of users
• 1.3 Authorization of users for particular tasks
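Items 1.1–1.3 form a register/authenticate/authorize chain. A minimal sketch of the logic (the user, subgroup, and task names are invented for illustration; real grid middleware of the era used X.509 certificates rather than simple tables):

```python
# Hypothetical VO membership table: user -> subgroup credentials held (1.1).
vo_members = {
    "alice": {"atlas", "atlas/higgs"},
    "bob": {"cms"},
}

# Which subgroup credential each task requires (invented examples).
task_rules = {
    "run_production": "atlas",
    "write_higgs_dataset": "atlas/higgs",
}

def authenticate(user: str) -> bool:
    """1.2: is this user a registered member of the VO at all?"""
    return user in vo_members

def authorize(user: str, task: str) -> bool:
    """1.3: does the user hold the subgroup credential the task needs?"""
    if not authenticate(user):
        return False
    required = task_rules.get(task)
    return required in vo_members[user]
```

The separation matters: authentication establishes identity once, while authorization is checked per task against subgroup credentials within the VO.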
Grid Middleware II
2. Resource Management
• 2.1 Resource declaration (making resources available to the Grid)
• 2.2 Resource discovery
• 2.3 Resource assignment tools (e.g., these CPUs are only available for experiment A, only for physicists from country B, only for physics group C, etc.)
• 2.4 Prioritization tools
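Declaration, discovery, and assignment (2.1–2.3) amount to publishing resources with usage constraints and matching jobs against them. A toy sketch (the experiment and country labels come from the slide's own examples; the site names and functions are hypothetical):

```python
# 2.1 Resource declaration: each entry publishes CPUs plus assignment
# constraints; None means "open to anyone" for that attribute.
resources = [
    {"site": "site1", "cpus": 64, "experiment": "A", "country": None},
    {"site": "site2", "cpus": 32, "experiment": None, "country": "B"},
    {"site": "site3", "cpus": 16, "experiment": None, "country": None},
]

def discover(job_experiment: str, job_country: str) -> list:
    """2.2/2.3: find resources whose constraints this job satisfies."""
    matches = []
    for r in resources:
        if r["experiment"] not in (None, job_experiment):
            continue  # reserved for a different experiment
        if r["country"] not in (None, job_country):
            continue  # reserved for a different country's physicists
        matches.append(r["site"])
    return matches

def prioritize(sites: list) -> list:
    """2.4: a trivial priority rule -- prefer the largest CPU pool."""
    size = {r["site"]: r["cpus"] for r in resources}
    return sorted(sites, key=lambda s: size[s], reverse=True)
```

A job from experiment A in country B matches all three sites; an unconstrained outsider only matches the open pool.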
Grid Middleware III
3. Job Management
• 3.1 Job submission
• 3.2 Job monitoring
4. Data Management
• 4.1 Data replication
• 4.2 Data access
• 4.3 Data set management
• 4.4 Data movement / job movement / data re-creation decisions
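Item 4.4's three-way choice (move the data to the job, move the job to the data, or re-create the data, e.g. by re-running a simulation) can be framed as a cost comparison. A toy sketch with made-up cost figures:

```python
def placement_choice(transfer_hours: float,
                     remote_queue_hours: float,
                     recreate_hours: float) -> str:
    """4.4: pick the cheapest option in wall-clock hours (illustrative only)."""
    costs = {
        "move_data": transfer_hours,        # ship the dataset to the job's site
        "move_job": remote_queue_hours,     # send the job to where the data is
        "recreate_data": recreate_hours,    # regenerate the data locally
    }
    return min(costs, key=costs.get)

# Example: a 1 TB dataset over a 10 MB/s link takes roughly 28 hours to
# transfer, so sending the job to the data (2 h remote queue wait) wins.
choice = placement_choice(transfer_hours=28.0,
                          remote_queue_hours=2.0,
                          recreate_hours=100.0)
```

In a real broker the cost terms would come from monitoring data (link bandwidth, queue lengths, CPU availability) rather than constants.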
Production Grids
• Middleware support
• Error recovery
• Robustness
• 24x7 operation
• Monitoring and system usage optimization
• Strategy and policy for resource allocation
• Authentication and authorization
• Simulation of grid operations
• Tools for optimizing distributed systems
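At their simplest, the error-recovery and monitoring requirements above reduce to resubmitting failed jobs while recording every outcome for the operators. A minimal retry sketch (the failure model and function names are invented):

```python
def run_with_retries(task, max_attempts: int = 3):
    """Resubmit a failing task, logging each attempt for monitoring."""
    log = []
    for attempt in range(1, max_attempts + 1):
        try:
            result = task(attempt)
            log.append((attempt, "ok"))
            return result, log
        except RuntimeError as err:
            log.append((attempt, f"failed: {err}"))
    return None, log  # exhausted: escalate to operators

# Toy task that fails on its first attempt, then succeeds.
def flaky(attempt: int) -> str:
    if attempt < 2:
        raise RuntimeError("worker node lost")
    return "done"

result, log = run_with_retries(flaky)
```

The attempt log is what feeds the monitoring and usage-optimization tools the slide calls for; 24x7 operation means no failure may require a human in the loop by default.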
Coordination (synergy) Matrix (Blue Ribbon Panel on Cyberinfrastructure, National Science Foundation)
Columns (activities):
• Research in technologies, systems, and applications
• Development or acquisition
• Operations in support of end users
Rows (layers):
• Applications of information technology to science and engineering research
• Cyberinfrastructure in support of applications
• Core technologies incorporated into cyberinfrastructure
Contributors
• Funding Agencies: Base Program
• Funding Agencies: LHC Research Program (LHC Software & Computing Projects)
• US Funding Agencies: networks and infrastructure
• CERN
  – Tier 0/1 facilities at CERN
  – Networking and infrastructure
  – LCG Project
• Other collaborating countries’ funding agencies
• DOE/NSF Computational Science Research Program
Who contributes where?
Rows (work packages):
• Hardware Infrastructure
• Distributed Computing Infrastructure
• Grid Services
• Experiment Software
• Collaboration Tools
• Support Services
Columns (contributors):
• US Base Program
• US Research Program
• DOE/NSF Networking
• CERN
• Other countries
• DOE/NSF CS
[Matrix cells were marked with “*” on the original slide; the cell-by-cell placement is not recoverable from this transcript]
Proposal for further action
• Form a small working group (representatives from the experiments, both agencies, physics and CS sides) to flesh out workplans and “sign up” for tasks (Road Map to Global Infostructure); report back in < 1 month
• Meeting soon in Europe