gLite adoption and opportunities for collaboration with industry
Tony Doyle
Distributed Computing Workshop
Westminster, 21 May 2008
Introduction
Context – PIPSS Projects
Who are GridPP?
Why do we need a Grid?
What is our Grid?
What do we offer?
PIPSS Projects
• David Sinclair and Chris Town (Cambridge Ontology Ltd) and Andy Parker (Cambridge e-Science Centre)
– Mini-PIPSS to develop a Content Based Image Retrieval (CBIR) platform powered by gLite
– On completion of the Mini-PIPSS project Cambridge Ontology received £535k private equity investment, changed its name to Imense, and is now doing a PIPSS project with Andy Parker
• Oleg Soloviev (Econophysica) and Steve Lloyd (QMUL)
– Mini-PIPSS to develop a Grid based automated trading platform for the financial industry
• Constellation Technologies Ltd and Neil Geddes (RAL)
– PIPSS to develop a commercial version of gLite middleware
• DiGS and George Beckett (Edinburgh, EPCC)
– PIPSS to develop a Data Grid for Cell Biology, sharing biological images between researchers (an example of inter-disciplinary use of software)
• Other EGEE-wide Projects
– Total Oil testbed studies (Aberdeen)
– EU-wide biomed docking studies (anti-malarial and bird-flu drug development)
Who are GridPP?
UK’s contribution to LHC computing: 19 UK universities, STFC and CERN
GridPP1 (2001–2004) £17m: “From Web to Grid”
GridPP2 (2004–2008) £16m: “From Prototype to Production”
GridPP3 (2008–2011) £25m: “From Production to Exploitation”
4 Large Experiments
CERN LHC
The world’s most powerful particle accelerator
Why do particle physicists need the Grid?
Example from LHC: starting from this event
We are looking for this “signature”
Selectivity: 1 in 10^13
Like looking for 1 person in a thousand world populations
Or for a needle in 20 million haystacks!
• ~100,000,000 electronic channels
• 800,000,000 proton-proton interactions per second
• 0.0002 Higgs per second
• 10 PBytes of data a year
• (10 Million GBytes = 14 Million CDs)
Concorde (15 km)
Mt. Blanc (4.8 km)
One year’s data from LHC would fill a stack of CDs 20km high
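The figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below is only a back-of-envelope check: the CD capacity (700 MB), the disc thickness (1.2 mm) and the use of decimal units are assumptions that do not appear on the slide.

```python
# Back-of-envelope check of the slide's data-volume figures.
# Assumptions (not from the slide): decimal units, 700 MB per CD, 1.2 mm per disc.
PB = 10**15
GB = 10**9
MB = 10**6

data_per_year_bytes = 10 * PB      # "10 PBytes of data a year"
cd_capacity_bytes = 700 * MB       # assumed CD capacity
cd_thickness_mm = 1.2              # assumed thickness of a bare disc

gbytes = data_per_year_bytes / GB
cds = data_per_year_bytes / cd_capacity_bytes
stack_km = cds * cd_thickness_mm / 1e6   # mm -> km

# Ratio of proton-proton interactions to Higgs particles, from the rates quoted.
interactions_per_s = 800_000_000
higgs_per_s = 0.0002
interactions_per_higgs = interactions_per_s / higgs_per_s

print(f"{gbytes:.0f} GBytes, {cds / 1e6:.1f} million CDs, stack of about {stack_km:.0f} km")
print(f"about {interactions_per_higgs:.0e} interactions per Higgs")
```

Under these assumptions the 10 million GBytes and 14 million CDs are reproduced, and the bare-disc stack comes out a little under the quoted 20 km. The interaction-to-Higgs ratio is a few times 10^12, the same order as the quoted selectivity, which may include further factors not stated on the slide.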
A question of scale
• Share more than information
• Efficient use of resources at many institutes
• Leverage over other sources of funding
• Data, computing power, applications
• Join local communities
Challenges:
• share data between thousands of scientists with multiple interests
• link major and minor computer centres
• ensure all data accessible anywhere, anytime
• grow rapidly, yet remain reliable for more than a decade
• cope with different management policies of different centres
• ensure data security
• be up and running routinely in 2008
Solution – Build a Grid
[Diagram: a Single PC compared with a Grid.
Single PC: PROGRAMS (Word/Excel, Email/Web, Your Program, Games) on the OPERATING SYSTEM, on the hardware (Disks, CPU etc).
Grid: Your Program on the MIDDLEWARE (User Interface Machine, Resource Broker, Information Service, Replica Catalogue, Bookkeeping Service), on CPU Clusters and a Disk Server.]
Middleware is the Operating System of a distributed computing system
Middleware is the Key
Something like this…
[Diagram: job flow through the middleware.
Components: gridui (JDL), VOMS, WLMS, JS, RB, LFC, BDII, Logging & Bookkeeping, Submitter, and four sets of Grid Enabled Resources (CPU Nodes, Storage).
Numbered steps 0 to 11; the labelled steps are: 0 voms-proxy-init, 1 Job Submission, 2 Job Status?, 11 Job Retrieval.]
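In the flow above, the job submitted from the gridui is described in JDL. As a rough illustration of what such a description looks like, the sketch below renders a small set of attributes as `Name = value;` lines. The attribute names (Executable, StdOutput, StdError, OutputSandbox) are commonly used JDL attributes; the executable and file names are purely illustrative, and the helper `render_jdl` is hypothetical, not part of gLite.

```python
def render_jdl(attrs):
    """Render a dict as JDL-style text: one `Name = value;` line per attribute.

    Hypothetical helper for illustration only. Strings are quoted, lists become
    {...} lists; no escaping of embedded quotes is attempted.
    """
    def fmt(value):
        if isinstance(value, (list, tuple)):
            return "{" + ", ".join(fmt(v) for v in value) + "}"
        return '"' + str(value) + '"'

    return "\n".join(f"{name} = {fmt(value)};" for name, value in attrs.items())


# Illustrative job: run a trivial executable and bring back its output files.
job = {
    "Executable": "/bin/hostname",
    "StdOutput": "std.out",
    "StdError": "std.err",
    "OutputSandbox": ["std.out", "std.err"],
}

jdl_text = render_jdl(job)
print(jdl_text)
```

The resulting text would be saved to a file and handed to the submission step (step 1) after a proxy has been created with voms-proxy-init (step 0).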
Grid Infrastructure
[Diagram: tiered structure.
Online system, Offline farm
Tier 0: CERN computer centre
Tier 1: National centres (RAL, UK; France; Italy; Germany; Spain)
Tier 2: Regional groups (ScotGrid, NorthGrid, SouthGrid, London)
Institutes (Glasgow, Edinburgh, Durham)
Workstations]
Structure chosen for particle physics. Different for others.
11 T1 centres
[Diagram: release process with four stages: Unit Test, Build, Certification, Production.
• WPs add unit-tested code to the repository; the Build System runs the nightly build and automatic tests
• Development Testbed (~15 CPU): individual WP tests; Integration Team: integration and overall release tests; tagged packages and tagged releases (release candidates)
• Tagged release selected for certification; Test Group: Grid certification on the Certification Testbed (~40 CPU); certified releases (release candidates)
• Certified release selected for deployment; Apps. Representatives: application certification on the Application Testbed (~1000 CPU); certified public release for use by apps.
• Production: 24x7, Users
• Problem reports; fix problems]
Middleware Validation: From Testbed to Production
Process to test:
• frameworks
• support
• policies
• documentation
• platforms/compilers
Status: March 2007 and March 2008
Status in 2007: 177 sites, 32,412 CPUs, ~13 PB storage
Status in 2008: 250 sites, 50 countries, 55,094 CPUs, ~20 PB storage
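To put the two status snapshots side by side, the short sketch below computes the percentage growth in each quantity. It uses only the numbers quoted for 2007 and 2008; the storage figures are approximate on the slide, so the storage growth is approximate too.

```python
# Status figures as quoted for 2007 and 2008 (storage values are approximate).
status = {
    2007: {"sites": 177, "cpus": 32412, "storage_pb": 13},
    2008: {"sites": 250, "cpus": 55094, "storage_pb": 20},
}


def growth_percent(before, after):
    """Percentage change for every key present in `before`, rounded to 1 decimal."""
    return {k: round(100 * (after[k] - before[k]) / before[k], 1) for k in before}


growth = growth_percent(status[2007], status[2008])
for name, pct in growth.items():
    print(f"{name}: +{pct}%")
```

On these figures the number of sites grew by roughly 41%, CPUs by roughly 70% and storage by roughly 54% over the year.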
GridPP & Industry
What Do We Offer?
• Middleware Expertise
• Our Grid (for test purposes)
Examples:
• Adaptable User Interface (GANGA)
• Security tools (GridSite)
• Accounting tools (R-GMA & APEL)
Security
Network Monitoring
Information Services
Grid Data Management
Storage Interfaces
Workload Management
Middleware Expertise
• The UK Grid (via the individual research sites) has been used to test applications for other areas, e.g.
– biomedical research
– financial modelling
– device modelling
– oil exploration
– image processing
Our Grid
Adaptable User Interface
[Screenshot: Ganga GUI, with panels labelled Job details, Logical Folders, Job Monitoring, Log window, Job builder and Scriptor.]
Grid Security for the Web
Web platforms for Grids
• Digital Certificates
• Certification Authority
• GridSite identifies users to websites with the digital certificates
• GridSiteWiki is an extension to the tool
• GridSite is open source (http://www.gridsite.org/)
Security Tools
• Relational Grid Monitoring Architecture (R-GMA)
– An information and monitoring system for static and dynamic information about grid resources, applications and networks
• Accounting Processor for Event Logs (APEL)
– Provides a summary of the resources consumed based on attributes such as CPU time, Wall Clock Time, Memory and grid user identity
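The kind of summary described for accounting can be illustrated with a small aggregation over job records. This is a minimal sketch only: the record layout, field names and user identities are hypothetical and are not APEL's actual schema or interface.

```python
from collections import defaultdict

# Hypothetical job records; field names and identities are illustrative only.
records = [
    {"user": "/CN=alice", "cpu_s": 3600, "wall_s": 4000, "mem_mb": 512},
    {"user": "/CN=bob", "cpu_s": 1200, "wall_s": 1500, "mem_mb": 256},
    {"user": "/CN=alice", "cpu_s": 1800, "wall_s": 2000, "mem_mb": 1024},
]


def summarise(job_records):
    """Sum CPU and wall clock time per user; track job count and peak memory."""
    totals = defaultdict(lambda: {"jobs": 0, "cpu_s": 0, "wall_s": 0, "max_mem_mb": 0})
    for rec in job_records:
        entry = totals[rec["user"]]
        entry["jobs"] += 1
        entry["cpu_s"] += rec["cpu_s"]
        entry["wall_s"] += rec["wall_s"]
        entry["max_mem_mb"] = max(entry["max_mem_mb"], rec["mem_mb"])
    return dict(totals)


summary = summarise(records)
for user, entry in summary.items():
    print(user, entry)
```

Grouping by grid user identity is one choice; the same loop could group by site or by virtual organisation if the records carried those fields.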
Accounting tools
[Diagram: Knowledge Exchange between the Business Community and the Research Community, with Dissemination.
Topics: Accounting, Standards, Applications, Portability, Trust, Security, Business Models, Quality of Service, Open Source Support, Software Licence Management.]
Productise software for your business
Sustain software on behalf of all users
“an essential component within the innovation cycle of any knowledge driven economy”
Knowledge Exchange
Summary
1. Opportunity for knowledge creation through improved IT skills and an enhanced research base
2. GridPP supports locally-led activities (based upon an international core of expertise and ongoing examples of collaboration)
3. GridPP will work with companies to examine different methods of technology transfer and identify the activities that can be used for industry and business