The GRID – Hype or Reality?
Roger Barlow
Manchester and Stanford University
Symposium in celebration of John Vincent Atanasoff
Iowa State University, 31st October 2003
What is the Grid?
Problems and Tools that solve them
Progress
Prospects
2
What is the GRID? 1: The Press
The new Grid computing technologies are set to revolutionise the way scientists use the world’s computing resources.
The technology now being deployed for particle physics will ultimately change the way that science and business are undertaken in the years to come. This will have a profound effect on the way society uses information technology, much as the World Wide Web did.
Grid technology will extend to fields like bioinformatics, digital archive and biodiversity informatics.
“We are very excited to be able to participate in such a revolutionary global collaboration.”
3
What is the GRID? 2: For Politicians
Computing coming out of a wall socket
Power: 110 V AC, 60 Hz
Computing: ? ?
4
What is the GRID? 3: For Machine vendors
Another name for a CPU cluster
A very limited and unimaginative use. But it sells boxes!
5
What is the GRID? Here is a (transport) grid that works
Get from anywhere* to anywhere* (home, school, office, theater, park, mall..)
By any* means (car, truck, bus…)
For any* purpose (work, pleasure, shopping…)
A user’s activities vary.
* within reasonable limits
6
Aside: What is the WEB?
Computers talk to computers
They say: “Give me that file.”
7
What is the GRID? Computers talk to computers
They say anything
Find data like this…
Add this data to that data
Run this program on those data
Collect results and plot them
8
600+ physicists
9 countries
71 institutes
Moving from a CENTRALISED to a DISPERSED computing model
9
Particle Physics and building the Grid
The Grid needs Particle Physics as trail-blazers…
• Demanding users with demanding problems
• Amateurs: Knowledgeable but non-expert
• Write their own programs. Badly.
• Lots of resources – different universities/countries
• COTS components - heterogeneous
• Software not mission-critical
Particle Physics needs the Grid…
Locate data – which may be in different sites
Select events and process (take-the-job-to-the-data beats take-the-data-to-the-job)
Collate results
10
Warning: Cyberspace is dangerous!
Rogues and vagabonds: evil people and stupid people
How can you trust anyone? How can anyone trust you?
How can we build a grid when all the doors and gates are locked?
11
Old Solution: Passwords
• Not really secure
• Multiple passwords are a pain for users
• Does not scale for millions of users (N), tens of thousands of sites (M)
(Diagram labels: SYSTEM, Gatekeeper, Gate)
Eureka! RSA Encryption
Rivest, Shamir, Adleman
Trivial Example
(123 + 5) * 2 = 256
(256 - 10) * 0.5 = 123
Simple Example
PUBLIC key (3233, 17)
PRIVATE key (3233, 2753)
123^17 mod 3233 = 855 (anyone can do this using the public key)
855^2753 mod 3233 = 123 (only the holder of the private key can do this)
Can deduce private key from public key in principle but not in practice
3233 = 53 * 61
52 * 60 = 3120 = (2753 * 17 - 1) / 15
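The numbers in the simple example can be checked in a few lines. This is a minimal sketch using only Python's built-in `pow` (the three-argument form, and the `-1` exponent for a modular inverse, which needs Python 3.8 or later). The variable names are illustrative, and a modulus this small is of course not secure.

```python
# Toy RSA with the numbers from the slide. Illustration only: far too small to be secure.
p, q = 53, 61
n = p * q                      # 3233, the modulus shared by both keys
phi = (p - 1) * (q - 1)        # 52 * 60 = 3120
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent: inverse of e modulo phi

message = 123
cipher = pow(message, e, n)    # anyone can do this with the public key (n, e)
recovered = pow(cipher, d, n)  # only the holder of the private key (n, d) can do this

print(n, phi, d)               # 3233 3120 2753
print(cipher, recovered)       # 855 123
```

Finding `d` here was easy only because the factors 53 and 61 were known. For a real key the modulus is hundreds of digits long and factoring it is not feasible, which is the "in principle but not in practice" point.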
13
What you can do with it
• Suppose I trust Charles Adams
• Charles Adams has his own private key
• I have his public key (so do lots of people)
• If I get a coded message that makes sense when I decode it, I know CA must have written it
• If you bring me such a message saying ‘This guy is John Doe of Iowa State…’ then I believe it because I believe CA.
• This is a CERTIFICATE. Your certificate. ‘Signed’ by CA
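The signing idea above can be sketched with the same toy key pair (3233, 17) and (3233, 2753). This is an assumption-laden illustration, not how real certificates are built: the helper names `digest`, `sign` and `verify` are made up here, and reducing a SHA-256 hash modulo 3233 is only a way to get a number small enough for the toy key.

```python
import hashlib

N, E, D = 3233, 17, 2753   # toy key pair from the RSA slide; not secure


def digest(text: str) -> int:
    """Reduce a SHA-256 hash of the text to a number below N (toy-sized)."""
    return int.from_bytes(hashlib.sha256(text.encode()).digest(), "big") % N


def sign(text: str) -> int:
    """Done by the CA: needs the private exponent D."""
    return pow(digest(text), D, N)


def verify(text: str, signature: int) -> bool:
    """Done by anyone: needs only the public key (N, E)."""
    return pow(signature, E, N) == digest(text)


certificate = "This guy is John Doe of Iowa State"
signature = sign(certificate)

print(verify(certificate, signature))            # True
print(verify(certificate, (signature + 1) % N))  # False: a forged signature fails
```

With a modulus this small, two different texts can easily share a digest, so the sketch shows only the mechanics. Real certificate authorities use much larger keys and standard certificate formats.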
14
Authentication Solved
• There are several (not many) Certificate Authorities (DOE, UK, France, Germany…)
• Each CA should have a rigorous procedure for issuing certificates, generally involving personal contact though maybe multilevel
• A site chooses which it will recognise, on the basis of trustworthiness
• When I approach a site with my certificate, they know I am who I say I am
• ~N+M transactions
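A rough count shows why ~N+M matters. The sketch below assumes that the password model needs one account per user per site (N × M), which is an interpretation of "does not scale" rather than a figure from the talk. The values of N and M are illustrative stand-ins for "millions of users" and "tens of thousands of sites".

```python
def password_accounts(users: int, sites: int) -> int:
    """Assumed password model: one account (and password) per user per site."""
    return users * sites


def certificate_transactions(users: int, sites: int) -> int:
    """Each user obtains one certificate; each site decides once which CAs to trust."""
    return users + sites


N = 1_000_000   # illustrative: "millions of users"
M = 10_000      # illustrative: "tens of thousands of sites"

print(password_accounts(N, M))         # 10000000000
print(certificate_transactions(N, M))  # 1010000
```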
15
Next Hurdle: Authorisation
• Your ID gets you into the bar – but it doesn’t buy you a beer
• ‘Virtual Organisation’ – membership allows you to use resources
• People join a VO. Or several VOs (N transactions)
• Sites give resources to VO (M transactions)
• ‘Generic accounts’ make it easy
“Please present your ticket and photo ID…”
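The split between authentication and authorisation can be sketched as two lookups: which VOs a person belongs to, and which VOs a site has given resources to. Everything below is hypothetical (the member names, the second VO `othervo`, and the function `authorised`); it only illustrates that users and sites each register once with the VO instead of with each other.

```python
# Hypothetical data: people join VOs (N transactions) ...
vo_members = {
    "babar": {"John Doe", "Jane Roe"},
    "othervo": {"Jane Roe"},
}

# ... and sites give resources to VOs (M transactions).
site_grants = {
    "MAN": {"babar"},
    "RAL": {"babar", "othervo"},
}


def authorised(person: str, site: str) -> bool:
    """A person may use a site if they belong to any VO that the site supports."""
    return any(
        person in vo_members.get(vo, set())
        for vo in site_grants.get(site, set())
    )


print(authorised("John Doe", "MAN"))     # True
print(authorised("Richard Roe", "RAL"))  # False: valid ID, but no VO membership
```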
16
The BaBar VO
Want to join?
Just put your certificate name into a file .cert-id on your SLAC BaBar account
cron jobs do the rest
Simple AND secure
17
Sandboxes
• You ship a whole load of files to the remote site. (May include the executable.)
• It runs and generates bytes to gigabytes of output files
• You get all these back
Your job operates only within the input and output sandboxes, not in the rest of the remote system. So everyone is safe.
Issues:
• How do you know which files to send?
• How much can you assume standard libraries etc. are present? (“As little as possible” is the right answer.)
• How do the files come back? Pull (inconvenience) or push (risky).
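The ship, run, collect cycle can be mimicked locally. This is a minimal sketch, assuming Python 3.9 or later; the function `run_in_sandbox` and the file names are invented for illustration. It reproduces only the file layout of a sandbox (a fresh directory holding the inputs, from which new files are collected afterwards) and does not enforce any real isolation from the rest of the system.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_sandbox(input_files: dict[str, str], command: list[str]) -> dict[str, str]:
    """Write the inputs into a fresh directory, run the command there,
    and return every file the job created (the output sandbox)."""
    with tempfile.TemporaryDirectory() as sandbox:
        root = Path(sandbox)
        for name, text in input_files.items():
            (root / name).write_text(text)
        subprocess.run(command, cwd=root, check=True)
        return {
            path.name: path.read_text()
            for path in root.iterdir()
            if path.is_file() and path.name not in input_files
        }


# A tiny "analysis job" shipped as part of the input sandbox.
job = (
    "from pathlib import Path\n"
    "events = Path('events.txt').read_text().split()\n"
    "Path('result.txt').write_text(str(len(events)))\n"
)

outputs = run_in_sandbox(
    {"job.py": job, "events.txt": "e1 e2 e3"},
    [sys.executable, "job.py"],
)
print(outputs)   # {'result.txt': '3'}
```

The job here reads its input by relative path, so it depends on nothing outside the shipped files except the interpreter, in the spirit of assuming "as little as possible" about the remote site.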
18
AFS solves all our sandbox problems
Run/test jobs locally in a directory which is part of an AFS cell: /afs/<mysite>/<user>/<dirname>
Files needed are in this directory or linked from it (except for the big raw data files.)
Output files created in this directory
At start of remote job, klog to /afs/<mysite>/<user>/<dirname>
All the I/O (except raw data input) is done there
Uses gsiklog (certificate gives token).
Transportable
19
Stuff that would be nice
• Resource Brokering
• Job monitoring
• Fun-to-use GUIs
• SRB for storage
• But we don’t want to hang around till it’s ready. Well, I don’t. (And 50% of software projects fail.)
20
BaBarGrid
(Sorry, no demo today. Just screenshots)
Some machines already in the system
21
Start by finding some data
Use experiment-specific jargon to specify what sort of data you want
Say where you want to run (Here MAN and RAL)
Get a whole bunch of input files
22
One command runs the jobs
Manchester
Rutherford Lab
‘job’ is standard user (badly) written analysis code: nothing special required
23
And get the answer
Plot obtained by merging all the job outputs from all the sites
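Merging works because histograms from separate jobs simply add, bin by bin. The sketch below assumes each site returns a mapping from bin number to event count; the function name `merge_histograms` and all the counts are invented for illustration and are not BaBar results.

```python
from collections import Counter


def merge_histograms(per_site: dict[str, dict[int, int]]) -> dict[int, int]:
    """Add up the bin contents from every site's job output."""
    total = Counter()
    for histogram in per_site.values():
        total.update(histogram)   # Counter.update adds counts bin by bin
    return dict(sorted(total.items()))


# Invented example counts, keyed by bin number.
site_outputs = {
    "MAN": {0: 4, 1: 7, 2: 1},
    "RAL": {1: 3, 2: 5, 3: 2},
}

merged = merge_histograms(site_outputs)
print(merged)   # {0: 4, 1: 10, 2: 6, 3: 2}
```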
24
Coming Next: the LHC
• Large Hadron Collider – CERN. 14 TeV proton-proton collisions
• 4 Experiments – each an order of magnitude bigger than BaBar
• Massive data analysis requirements
• Switch on 2007
• Grid computing built in from the start
25
LCG now rolling out
First LHC Computing Grid – ~4,000 CPUs
26
The Grid
• It’s real – despite the hype
• Being test-driven by Particle Physicists
  – Especially in Europe
• You will be using it sooner or later – opening possibilities you never thought possible
• Get your certificate and join the fun