12 January 2005 Collaboration Board Tony Doyle - University of Glasgow
GridPP: Executive Summary
Tony Doyle
Contents
• What was GridPP1?
• What is GridPP2?
• Vision
• Challenges
• LCG
  – Data Challenges
  – Issues
• Deployment Status (9/1/05) – Tier-1/A, Tier-2, NGS
• M/S/N
• EGEE Middleware
• Applications
• Dissemination
• What lies ahead?
• Beyond GridPP2
• Grid and e-Science Support in 2008
• Executive Summary
What was GridPP1?
• A team that built a working prototype grid of significant scale
> 2,000 (9,000) CPUs
> 1,000 (5,000) TB of available storage
> 1,000 (6,000) simultaneous jobs
• A complex project where 88% of the milestones were completed and all metrics were within specification
[Project map: the GridPP1 plan as a grid of numbered milestones and metrics (status date 1-Jan-04), colour-coded by status — metric OK / not OK, task complete / overdue / due within 60 days / not due soon / not active — across the project's areas: CERN LCG creation, DataGrid (WP1–WP8, Fabric and Technology), Applications (ATLAS, ATLAS/LHCb, CMS, BaBar, CDF/D0, UKQCD, other), Infrastructure (Tier-1/A, Tier-2, Testbed, Rollout, Data Challenges), Interoperability (Int. Standards, Open Source, Worldwide and UK Integration, Monitoring), Dissemination (Developing Engagement, Participation) and Resources. GridPP Goal: to develop and deploy a large scale science Grid in the UK for the use of the Particle Physics community.]
A Success: "The achievement of something desired, planned, or attempted"
What is GridPP2?
[Project map: the GridPP2 plan — "0. Production Grid" — shown as a grid of numbered goals across the project's areas: Management (planning, knowledge transfer, engagement, dissemination), Grid Deployment (Tier-A/Tier-1, Tier-2, grid operations), Grid Technology / M/S/N (metadata, workload, security, information, network management & monitoring, data & storage, computing fabric, middleware support), LHC Apps (ATLAS, LHCb, CMS, Ganga), Non-LHC Apps (BaBar, SAMGrid, CDF, D0, UKQCD, PhenoGrid) and External (LCG, interoperability, experiment support, LHC deployment portal), with navigation links to goals. GridPP2 Goal: to develop and deploy a large scale production quality grid in the UK for the use of the Particle Physics community.]
Structures agreed and in place (except LCG phase-2)
• 253 milestones and 112 monitoring metrics at present.
• Must deliver a "Production Grid": a robust, reliable, resilient, secure, stable service delivered to end-user applications.
• The Collaboration aims to develop, deploy and operate a very large Production Grid in the UK for use by the worldwide particle physics community.
Vision
1. SCALE: GridPP will deliver Grid middleware and hardware infrastructure to enable the construction of a UK Production Grid for the LHC of significant scale.
2. INTEGRATION: The GridPP project is designed to integrate with the existing Particle Physics programme within the UK, thus enabling full use of Grid technology and efficient use of shared resources.
3. DISSEMINATION: The project will disseminate the GridPP deliverables in the multi-disciplinary e-Science environment and will seek to build collaborations with emerging non-PPARC Grid activities both nationally and internationally.
4. UK LHC COMPUTING: The main aim is to provide a computing environment for the UK Particle Physics Community capable of meeting the challenges posed by the unprecedented data, processing and analysis requirements of the LHC experiments.
5. OTHER UK PARTICLE PHYSICS COMPUTING: The process of creating and testing the computing environment for the LHC will naturally support the current and next generation of highly data intensive Particle Physics experiments.
6. EGEE: Grid technology is the framework used to develop the required capability: key components will be developed as part of the EGEE project and elsewhere.
7. LCG: The collaboration builds on the strong computing traditions of the UK at CERN. GridPP will make a strong contribution to the LCG deployment and operations programme.
8. INTEROPERABILITY: The project is integrated with national and international developments from other Grid projects and the GGF in order to ensure a common set of principles, protocols and standards that can support a wide range of applications.
9. INFRASTRUCTURE: Provision is made for a Tier-1 facility at RAL and four Regional Tier-2s, encompassing the collaborating Institutes.
10. OTHER FUNDING: The Tier-1 and Tier-2s will provide a focus for dissemination to the academic and commercial sector and will attract additional funds such that the full programme can be realised.
What are the Grid challenges?
Must:
• share data between thousands of scientists with multiple interests
• link major (Tier-0 [Tier-1]) and minor (Tier-1 [Tier-2]) computer centres
• ensure all data accessible anywhere, anytime
• grow rapidly, yet remain reliable for more than a decade
• cope with different management policies of different centres
• ensure data security
• be up and running routinely by 2007
What are the Grid challenges?
Data Management, Security and Sharing
1. Software process
2. Software efficiency
3. Deployment planning
4. Link centres
5. Share data
6. Manage data
7. Install software
8. Analyse data
9. Accounting
10. Policies
LCG Overview
By 2007:
- 100,000 CPUs
- more than 100 institutes worldwide
- building on complex middleware being developed in advanced Grid technology projects, both in Europe (gLite) and in the USA (VDT)
- prototype went live in September 2003 in 12 countries
- extensively tested by the LHC experiments during summer 2004
Data Challenges
• Ongoing…
• Grid and non-Grid production
• Grid now significant
• CMS – 75M events and 150 TB: first of this year's Grid data challenges
• ALICE – 35 CPU years: Phase 1 done, Phase 2 ongoing
LCG
Entering Grid Production Phase..
Data Challenge
[Pie chart: ATLAS DC2 on LCG, September — job share by site across ~30 LCG sites (slices from <1% up to 14%), including ch.cern, de.fzk, fr.in2p3, it.infn.cnaf, nl.nikhef and the UK sites uk.bham, uk.ic, uk.lancs, uk.man and uk.rl.]
ATLAS DC2 - CPU usage
LCG 41%, NorduGrid 30%, Grid3 29%
Total:
~1,350 kSI2k·months
~95,000 jobs
~7.7 million events fully simulated (Geant4)
~22 TB
• 7.7M Geant4 events and 22 TB
• UK ~20% of LCG
• Ongoing…
• Grid production (across the three Grids)
• ~150 CPU years so far
• Largest total computing requirement
• Small fraction of what ATLAS needs…
Entering Grid Production Phase..
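As a rough cross-check of the DC2 totals above, the implied per-job and per-event averages can be computed directly. This is a back-of-envelope sketch using only the figures quoted on the slide; the derived averages are illustrative estimates, not official DC2 numbers.

```python
# Back-of-envelope averages from the ATLAS DC2 totals quoted above.
jobs = 95_000          # ~95,000 jobs
events = 7.7e6         # ~7.7 million fully simulated (Geant4) events
data_tb = 22           # ~22 TB written

events_per_job = events / jobs         # average events per job
mb_per_event = data_tb * 1e6 / events  # average output per event, in MB

print(f"~{events_per_job:.0f} events/job, ~{mb_per_event:.1f} MB/event")
```

That is, each job simulated of order 80 events, at roughly 3 MB of output per event.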
LHCb Data Challenge
424 CPU years (4,000 kSI2k·months), 186M events
• UK's input significant (>1/4 of total)
• LCG(UK) resource:
  – Tier-1: 7.7%
  – Tier-2 sites: London 3.9%, South 2.3%, North 1.4%
• DIRAC:
  – Imperial 2.0%, Liverpool 3.1%, Oxford 0.1%, ScotGrid 5.1%
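The ">1/4 of total" claim can be verified by summing the quoted UK shares (a quick sketch using only the percentages given on the slide):

```python
# UK site shares of the LHCb data challenge, as quoted above (percent).
uk_shares = {
    "Tier-1": 7.7,
    "London": 3.9, "South": 2.3, "North": 1.4,   # LCG(UK) Tier-2
    "Imperial": 2.0, "Liverpool": 3.1,           # DIRAC
    "Oxford": 0.1, "ScotGrid": 5.1,
}
total = sum(uk_shares.values())
print(f"UK total: {total:.1f}%")  # just over a quarter of the total
```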
[Plot: cumulative produced events over time — DIRAC alone at first; LCG in action at 1.8×10⁶ events/day; LCG paused, then restarted at 3–5×10⁶ events/day; Phase 1 completed with 186M produced events.]
Entering Grid Production Phase..
Paradigm Shift: Transition to Grid…
[Pie charts: monthly DC'04 production split — May: 89%:11% (11% of DC'04); Jun: 80%:20% (25% of DC'04); Jul: 77%:23% (22% of DC'04); Aug: 27%:73% (42% of DC'04) — showing the shift toward Grid production. 424 CPU·years in total.]
Issues
https://edms.cern.ch/file/495809/2.2/LCG2-Limitations_and_Requirements.pdf
First large-scale Grid production problems being addressed… at all levels:
"LCG-2 Middleware Problems and Requirements for LHC Experiment Data Challenges"
Is GridPP a Grid?
1. Coordinates resources that are not subject to centralized control
2. … using standard, open, general-purpose protocols and interfaces
3. … to deliver nontrivial qualities of service
1. YES. This is why development and maintenance of LCG is important.
2. YES. VDT (Globus/Condor-G) + EDG/EGEE (gLite) ~meet this requirement.
3. YES. LHC experiments data challenges over the summer of 2004.
http://www-fp.mcs.anl.gov/~foster/Articles/WhatIsTheGrid.pdf
http://agenda.cern.ch/fullAgenda.php?ida=a042133
GridPP Deployment Status (9/1/05)
Three Grids on a global scale in HEP (similar functionality):

                 sites     CPUs
• LCG (GridPP)   90 (16)   9,000 (2,029)
• Grid3 [USA]    29        2,800
• NorduGrid      30        3,200
GridPP deployment is part of LCG (currently the largest Grid in the world). The future Grid in the UK is dependent upon LCG releases.
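The UK's share of LCG follows directly from the site and CPU counts above (a quick sketch; it is also consistent with the "UK ~20% of LCG" figure quoted in the ATLAS data challenge slide):

```python
# UK (GridPP) share of LCG, from the deployment figures quoted above.
lcg_cpus, gridpp_cpus = 9000, 2029
lcg_sites, gridpp_sites = 90, 16

print(f"UK CPUs:  {gridpp_cpus / lcg_cpus:.1%} of LCG")
print(f"UK sites: {gridpp_sites / lcg_sites:.1%} of LCG")
```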
        totalCPU  freeCPU  runJob  waitJob  seAvail(TB)  seUsed(TB)  maxCPU  avgCPU
Total   2029      1402     95      480      8.69         4.55        2549    1994
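A few derived figures from the status table above. Note an assumption: "freeCPU" is taken to count idle job slots, and "seAvail"/"seUsed" as remaining and used storage respectively — the slide does not define these fields, so the semantics are inferred.

```python
# Derived utilisation from the GridPP deployment status table (9/1/05).
total_cpu, free_cpu = 2029, 1402
run_jobs, wait_jobs = 95, 480
se_avail_tb, se_used_tb = 8.69, 4.55

busy_fraction = (total_cpu - free_cpu) / total_cpu      # fraction of slots busy
queue_ratio = wait_jobs / run_jobs                      # waiting per running job
storage_used = se_used_tb / (se_used_tb + se_avail_tb)  # fraction of SE capacity used

print(f"busy: {busy_fraction:.0%}, waiting/running: {queue_ratio:.1f}, "
      f"storage used: {storage_used:.0%}")
```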
UK Tier-1/A Centre, Rutherford Appleton Laboratory
• High quality data services
• National and international role
• UK focus for international Grid development
• 1,000 CPUs, 200 TB disk, 60 TB tape (capacity 1 PB)
Grid Resource Discovery Time = 8 hours
[Charts: 2004 CPU utilisation and 2004 disk use.]
UK Tier-2 Centres
ScotGrid: Durham, Edinburgh, Glasgow
NorthGrid: Daresbury, Lancaster, Liverpool, Manchester, Sheffield
SouthGrid: Birmingham, Bristol, Cambridge, Oxford, RAL PPD, Warwick
LondonGrid: Brunel, Imperial, QMUL, RHUL, UCL
Level-2 Grid
In future will include services to facilitate collaborative (grid) computing:
• Authentication (PKI X.509)
• Job submission/batch service
• Resource brokering
• Authorisation
• Virtual Organisation management
• Certificate management
• Information service
• Data access/integration (SRB/OGSA-DAI/DQPS)
• National Registry (of registries)
• Data replication
• Data caching
• Grid monitoring
• Accounting
[Map: Level-2 Grid sites — Leeds, Manchester, Oxford, RAL, DL.]
Middleware Development
Configuration Management
Storage Interfaces
Network Monitoring
Security
Information Services
Grid Data Management
Application Development
ATLAS, LHCb, CMS, BaBar (SLAC), SAMGrid (FermiLab), QCDGrid, PhenoGrid
More Applications
ZEUS uses LCG:
• needs the Grid to respond to increasing demand for MC production
• 5 million Geant events on the Grid since August 2004

QCDGrid:
• for UKQCD
• currently a 4-site data grid
• key technologies used: Globus Toolkit 2.4, European DataGrid, eXist XML database
• managing a few hundred gigabytes of data
Dissemination
Much has happened… and more people are reading about it:
LHCb-UK members get up to speed with the Grid — Wed 5 Jan 2005
GridPP in Pittsburgh — Thu 9 Dec 2004
GridPP website busier than ever — Mon 6 Dec 2004
Optorsim 2.0 released — Wed 24 Nov 2004
ZEUS produces 5 million Grid events — Mon 15 Nov 2004
CERN 50th anniversary reception — Tue 26 Oct 2004
GridPP at CHEP'04 — Mon 18 Oct 2004
LHCb data challenge first phase a success for LCG and UK — Mon 4 Oct 2004
Networking in Nottingham: GLIF launch meeting — Mon 4 Oct 2004
GridPP going for Gold: website award at AHM — Mon 6 Sep 2004
GridPP at the All Hands Meeting — Wed 1 Sep 2004
R-GMA included in latest LCG release — Wed 18 Aug 2004
LCG2 administrators learn tips and tricks in Oxford — Tue 27 Jul 2004
Take me to your (project) leader — Fri 2 Jul 2004
ScotGrid's 2nd birthday: ScotGrid clocks up 1 million CPU hours — Fri 25 Jun 2004
Meet your production manager — Fri 18 Jun 2004
GridPP10 report and photographs — Wed 9 Jun 2004
CERN recognizes UK's outstanding contribution to Grid computing — Wed 2 Jun 2004
UK particle physics Grid takes shape — Wed 19 May 2004
A new monitoring map for GridPP — Mon 10 May 2004
Press reaction to EGEE launch — Tue 4 May 2004
GridPP at the EGEE launch conference — Tue 27 Apr 2004
LCG2 released — Thu 8 Apr 2004
University of Warwick joins GridPP — Thu 8 Apr 2004
Grid computing steps up a gear: the start of EGEE — Thu 1 Apr 2004
EDG gets glowing final review — Mon 22 Mar 2004
Grids and Web Services meeting, 23 April, London — Tue 16 Mar 2004
EU DataGrid Software License approved by OSI — Fri 27 Feb 2004
GridPP Middleware workshop, March 4-5 2004, UCL — Fri 20 Feb 2004
Version 1.0 of the Optorsim grid simulation tool released by EU DataGrid — Tue 17 Feb 2004
Summary and photographs of the 9th GridPP Collaboration Meeting — Thu 12 Feb 2004

138,976 hits in December
What lies ahead? Some mountain climbing..
• Annual data storage: 12–14 PetaBytes per year
• 100 Million SPECint2000 ≈ 100,000 PCs (3 GHz Pentium 4)
• [Scale comparison: a CD stack holding 1 year of LHC data (~20 km) next to Concorde (15 km); "We are here" at base camp (1 km)]
Quantitatively, we're ~9% of the way there in terms of CPU (9,000 of 100,000) and disk (3 of 12–14 × 3 years)…
In production terms, we've made base camp.
Importance of step-by-step planning… Pre-plan your trip, carry an ice axe and crampons and arrange for a guide…
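The "~9% of the way there" estimate can be made explicit. One assumption here: the disk figure uses the midpoint of the quoted 12–14 PB/year range over three years of LHC running, against ~3 PB currently available.

```python
# The "~9% of the way there" estimate, spelled out.
cpu_fraction = 9_000 / 100_000   # current CPUs vs the 2007 target
disk_fraction = 3 / (13 * 3)     # ~3 PB now vs 13 PB/yr (midpoint) x 3 years

print(f"CPU: {cpu_fraction:.0%}, disk: {disk_fraction:.0%}")
```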
Executive Summary (GRIDPP-PMB-40-EXEC)
• Introduction — the Grid is a reality
• Project Management — a project was/is needed
• Resources — under control
• LCG — LCG2 support: SC case being written
• Deployment (Tier-1/A production + Tier-2 resources) — 16 UK sites are on the Grid; MoUs, planning, deployment and monitoring each underway as part of GridPP2
• M/S/N — developments established, R-GMA deployed
• EGEE — gLite designed, incl. web services
• Applications — interfaces developed, testing phase
• Dissemination — area transformed
• (Beyond GridPP2) — initial ideas; consultation required