Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.


Page 1: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

Grid Computing

from a solid past to a bright future?

David Groep, NIKHEF

2002-08-28

Page 2: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

The Grid: a vision?

Imagine that you could plug your computer into the wall and have direct access to huge computing resources immediately, just as you plug in a lamp to get instant light. …

Far from being science-fiction, this is the idea the XXXXXX project is about to make into reality.…

from a project brochure in 2001

Page 3: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

The Need for Grids: LHC

Physics @ CERN
• LHC particle accelerator
• operational in 2007
• 5-10 Petabyte per year
• 150 countries
• > 10000 users
• lifetime ~ 20 years

Trigger and data flow:
40 MHz (40 TB/sec) → level 1 - special hardware → 75 kHz (75 GB/sec) → level 2 - embedded → 5 kHz (5 GB/sec) → level 3 - PCs → 100 Hz (100 MB/sec) → data recording & offline analysis

http://www.cern.ch/
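The rates and bandwidths quoted for the LHC trigger levels can be checked with a little arithmetic. The sketch below assumes the four figures on the slide describe successive stages of one chain (before level 1, then after levels 1, 2 and 3); the stage labels are ours, not the slide's. It derives the event size implied by each rate/bandwidth pair and the rate reduction between stages.

```python
# (stage label, events per second, bytes per second), figures as quoted on the slide.
# The labels are an assumption about how the figures map onto the trigger chain.
STAGES = [
    ("before level 1", 40e6, 40e12),
    ("after level 1", 75e3, 75e9),
    ("after level 2", 5e3, 5e9),
    ("after level 3", 100.0, 100e6),
]


def event_size_bytes(rate_hz, bandwidth_bytes_per_s):
    """Average event size implied by a rate and a bandwidth."""
    return bandwidth_bytes_per_s / rate_hz


def reduction_factors(stages):
    """Rate reduction between each pair of consecutive stages."""
    return [
        (prev[0], cur[0], prev[1] / cur[1])
        for prev, cur in zip(stages, stages[1:])
    ]


for name, rate, bandwidth in STAGES:
    size_mb = event_size_bytes(rate, bandwidth) / 1e6
    print(f"{name:15s} {rate:12.0f} Hz  {size_mb:.1f} MB/event")

for src, dst, factor in reduction_factors(STAGES):
    print(f"{src} -> {dst}: rate reduced by a factor of about {factor:.0f}")
```

Every pair works out to the same event size, so the quoted bandwidths simply track the event rate.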

Page 4: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

CPU & Data Requirements

[Figure: Estimated CPU capacity required at CERN, 1998-2010; y-axis in K SI95 (0 to 5,000); series: LHC experiments, other experiments]

• Moore’s law – some measure of the capacity technology advances provide for a constant number of processors or investment
• Jan 2000: 3.5K SI95
• < 50% of the main analysis capacity will be at CERN

http://www.cern.ch/
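As a rough illustration of the Moore's-law reference line, the sketch below projects capacity forward from the Jan 2000 figure of 3.5K SI95 for a constant investment. The 18-month doubling time is an assumption chosen for illustration; the slide does not state which doubling time was used.

```python
def projected_capacity(base_ksi95, base_year, year, doubling_years=1.5):
    """Capacity in K SI95 if capacity doubles every `doubling_years` years.

    The default doubling time of 1.5 years is an assumption, not a figure
    taken from the slide.
    """
    return base_ksi95 * 2 ** ((year - base_year) / doubling_years)


for year in range(2000, 2011, 2):
    capacity = projected_capacity(3.5, 2000, year)
    print(f"{year}: {capacity:8.1f} K SI95")
```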

Page 5: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

More Reasons Why

ENVISAT
• 3500 MEuro programme cost
• 10 instruments on board
• 200 Mbps data rate to ground
• 400 Tbytes data archived/year
• ~100 ‘standard’ products
• 10+ dedicated facilities in Europe
• ~700 approved science user projects

http://www.esa.int/

Page 6: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

And More …

Bio-informatics

• For access to data
  – Large network bandwidth to access computing centers
  – Support of data bank replicas (easier and faster mirroring)
  – Distributed data banks
• For interpretation of data
  – GRID-enabled algorithms: BLAST on distributed data banks, distributed data mining

Page 7: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

And even more …

• financial services, life sciences, strategy evaluation, …

• instant immersive teleconferencing

• remote experimentation

• pre-surgical planning and simulation

Page 8: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

Why is the Grid successful?

• Applications need large amounts of data or computation

• Ever larger, distributed user community
• Network grows faster than compute power/storage

Page 9: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

Inter-domain communication

[Figure: Average IETF attendance, 1986-2000; y-axis 0 to 2500]

• The Internet community spawned 3360 RFCs (as of August 2nd, 2002)
• Myriad of different protocols and APIs
• Be strict in what you send, be liberal in what you accept
• Inter-domain by nature
• Increasing focus on security
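The "strict in what you send, liberal in what you accept" rule can be made concrete with a small sketch. The header format below is a generic `Name: value` line chosen for illustration, not any particular protocol from the talk: the parser tolerates odd casing and stray whitespace, while the formatter always emits a single canonical form.

```python
def parse_header(line):
    """Liberal: accept any casing, extra whitespace and either LF or CRLF."""
    name, sep, value = line.partition(":")
    if not sep or not name.strip():
        raise ValueError(f"not a header line: {line!r}")
    # Normalise the name to lower case and collapse runs of whitespace.
    return name.strip().lower(), " ".join(value.split())


def format_header(name, value):
    """Strict: always emit 'Canonical-Name: value' terminated by CRLF."""
    canonical = "-".join(part.capitalize() for part in name.strip().split("-"))
    return f"{canonical}: {' '.join(value.split())}\r\n"


name, value = parse_header("  content-TYPE :   text/plain  \n")
print(repr(format_header(name, value)))
```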

Page 10: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

Intra-domain tools

• RPC proved hugely successful within domains
  – YP
  – Network File System
  – Typical client-server stuff…
• CORBA
  – Extension of RPC to OO design model
  – Diversification

• Latest trend: web services

Page 11: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

The beginnings of the Grid

• Grown out of distributed computing
• Gigabit network test beds & meta-computing
• Supercomputer sharing (I-WAY)
• Condor ‘flocking’

• Focus shifts to inter-domain operations

GUSTO meta-computing test bed in 1999

Page 12: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

The Grid

Ian Foster and Carl Kesselman, editors, “The Grid: Blueprint for a New Computing Infrastructure,” Morgan Kaufmann, 1999

Page 13: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

The One-Liner

• Resource sharing and coordinated problem solving in dynamic multi-institutional virtual organisations

Page 14: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

Standards Requirements

• Standards are key to inter-domain operations
• GGF established in 2001
• Approx. 40 working & research groups

[Figure: (G)GF attendance per meeting, 1999-2002; y-axis 0 to 1200]

http://www.gridforum.org/

Page 15: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

Protocol Layers & Bodies

[Figure: three protocol stacks side by side, each listed here from the bottom up]

• OSI layers: Physical, Data Link, Network, Transport, Session, Presentation, Application
• Standard bodies, from the lowest layers upward: IEEE; IETF; GGF and W3C
• Internet protocol architecture: Link, Internet, Transport, Application
• Grid architecture: Fabric, Connectivity, Resource, Collective, Application

Page 16: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

Grid Architecture (v1)

Grid layers, from the bottom up:
• Fabric: “Controlling things locally”: access to, & control of, resources
• Connectivity: “Talking to things”: communication (Internet protocols) & security
• Resource: “Sharing single resources”: negotiating access, controlling use
• Collective: “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services
• Application

[Figure: the Grid layers set against the Internet protocol architecture (Link, Internet, Transport, Application)]

Page 17: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

What should the Grid provide?

• Dependable, consistent and pervasive access

• Interoperation among organisations

• Challenges:
  – Complete transparency for the user
  – Uniform access methods for computing, data and information
  – Secure, trustworthy environment for providers
  – Accounting (and billing)
  – Management-free ‘Virtual Organizations’

Page 18: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

Grid Middleware

• Globus Project started 1997
• Current de-facto standard
• Reference implementation of Global Grid Forum standards
• Toolkit ‘bag-of-services’ approach
• Several middleware projects:
  – EU DataGrid
  – CrossGrid, DataTAG, PPDG, GriPhyN
  – In NL: ICES/KIS Virtual Lab, VL-E

http://www.globus.org/

Page 19: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

Condor

• Scavenging cycles off idle work stations
• Leading themes:
  – Make a job feel ‘at home’
  – Don’t ever bother the resource owner!
• Bypass: redirect data to process
• ClassAds: matchmaking concept
• DAGMan: dependent jobs
• Kangaroo: file staging & hopping
• NeST: allocated ‘storage lots’
• PFS: Pluggable File System
• Condor-G: reliable job control for the Grid

http://www.cs.wisc.edu/condor/
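The ClassAds matchmaking idea can be sketched in a few lines. This is a simplified illustration using plain Python dictionaries and functions, not the real ClassAd language: jobs and machines both advertise attributes plus a requirements expression, a match needs both sides to accept each other, and the job's rank picks among the matches. All attribute names and the 15-minute idle threshold are made up for the example.

```python
def matches(a, b):
    """Symmetric match: each ad's requirements must accept the other ad."""
    return a["requirements"](a, b) and b["requirements"](b, a)


def best_match(job, machines):
    """Among the machines that match, return the one the job ranks highest."""
    candidates = [m for m in machines if matches(job, m)]
    return max(candidates, key=lambda m: job["rank"](job, m), default=None)


def owner_is_away(me, job):
    # "Don't ever bother the resource owner": only accept jobs when idle.
    return me["idle_minutes"] >= 15


machines = [
    {"name": "ws1", "memory_mb": 128, "idle_minutes": 60, "requirements": owner_is_away},
    {"name": "ws2", "memory_mb": 512, "idle_minutes": 5, "requirements": owner_is_away},
    {"name": "ws3", "memory_mb": 256, "idle_minutes": 30, "requirements": owner_is_away},
    {"name": "ws4", "memory_mb": 1024, "idle_minutes": 120, "requirements": owner_is_away},
]

job = {
    "owner": "alice",
    "needs_mb": 200,
    "requirements": lambda me, machine: machine["memory_mb"] >= me["needs_mb"],
    "rank": lambda me, machine: machine["memory_mb"],  # prefer more memory
}

chosen = best_match(job, machines)
print(chosen["name"] if chosen else "no match")
```

Here ws1 is rejected by the job (too little memory) and ws2 rejects the job (its owner is active), which shows why the match has to be evaluated from both sides.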

Page 20: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

Application Toolkits

Collect and abstract services in an orderly fashion

• Cactus: plug-n-play numeric simulations
• Numeric Propulsion System Simulation (NPSS)
• Commodity Grid Toolkits (CoGs): Java, CORBA, …
• NIMROD-G: parameter sweeping simulations
• Condor: high-throughput computing
• GENIUS, VLAM-G, …: (web) portals to the Grid

Page 21: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

Grids Today

Page 22: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

Grid Protocols Today

• Based on the popular protocols on the ’Net
• Use common Grid Security Infrastructure:
  – Extensions to TLS for delegation (single sign-on)
  – Uses GSS-API standard where possible
• GRAM (resource allocation): attrib/value pairs over HTTP
• GridFTP (bulk file transfer): FTP with GSI and high-throughput extras (striping)
• MDS (monitoring and discovery service): LDAP + schemas
• …

Page 23: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

Getting People Together: Virtual Organisations

• The user community ‘out there’ is huge & highly dynamic
• Applying at each individual resource does not scale
• Users get together to form Virtual Organisations:
  – Temporary alliance of stakeholders (users and/or resources)
  – Various groups and roles
  – Managed out-of-band by (legal) contracts
• Authentication, Authorization, Accounting (AAA)

Page 24: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

Grid Security Infrastructure

• Requirements:
  – Strong authentication and accountability
  – Trace-ability
  – “Secure”!
  – Single sign-on
  – Dynamic VOs: “proxying”, “delegation”
  – Work everywhere (“easyEverything”, airport kiosk, handheld)
  – Multiple roles for each user
  – Easy!

Page 25: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

Authentication & PKI

[Figure: certificate request flow between Alice and the CA]

• Alice generates a key pair, public key (e,n) and private key (d,n), and sends the public key to the CA in a certificate request (CommonName=‘Alice’, Organization=‘KNMI’)
• The CA will check the identifier in the request against the identity of the requestor
• The CA operator signs the request with the CA key (the CA holds its private key and a self-signed certificate)
• The CA ships the new certificate to Alice

• EU DataGrid PKI: 1 PMA, 13 Certification Authorities
• Automatic policy evaluation tools
• Largest Grid-PKI in the world (and growing)
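The certificate request flow can be illustrated with textbook RSA, using the (e,n) and (d,n) notation from the slide. This is a toy sketch only: the primes are tiny, there is no padding, and the request encoding is made up, so it must not be used for anything real. It shows Alice building a request that contains her public key, the CA signing it with its private key, and anyone verifying the result with the CA's public key.

```python
import hashlib


def make_keypair(p, q, e=65537):
    """Textbook RSA from two primes: returns public (e, n) and private (d, n)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse; needs Python 3.8+
    return (e, n), (d, n)


def digest(message, n):
    """Hash the message and reduce it into the range of the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(message, private_key):
    d, n = private_key
    return pow(digest(message, n), d, n)


def verify(message, signature, public_key):
    e, n = public_key
    return pow(signature, e, n) == digest(message, n)


# Toy primes (small Mersenne primes), far too small for real use.
ca_public, ca_private = make_keypair(2**31 - 1, 2**61 - 1)
alice_public, alice_private = make_keypair(2**17 - 1, 2**19 - 1)

# Alice's request carries her identity and public key; the format is hypothetical.
request = (
    f"CommonName=Alice,Organization=KNMI,e={alice_public[0]},n={alice_public[1]}"
).encode()

# After checking Alice's identity out of band, the CA signs the request.
certificate = (request, sign(request, ca_private))

print(verify(certificate[0], certificate[1], ca_public))
```

Alice's private key never leaves her machine; only the public half travels to the CA.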

Page 26: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

GSI in Action
“Create Processes at A and B that Communicate & Access Files at C”

[Figure: a user and three sites: Site A (Kerberos), Site B (Unix), Site C (Kerberos)]

• User: single sign-on via “grid-id” & generation of proxy credential; or: retrieval of proxy credential from online repository
• User proxy (holding the proxy credential) sends remote process creation requests* to the GSI-enabled GRAM servers at Site A and Site B
• GRAM server at Site A: authorize, map to local id, create process, generate credentials (Kerberos ticket, restricted proxy)
• GRAM server at Site B: ditto (restricted proxy)
• The processes on the computers at Site A and Site B communicate*
• Remote file access request* to the GSI-enabled FTP server at Site C: authorize, map to local id, access file on the storage system

* With mutual authentication
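The bookkeeping behind proxying and delegation can be sketched as a chain of credentials, each issued by the one before it. The model below is a deliberately simplified assumption: it omits all cryptographic signatures and only checks naming and lifetimes, and the `/CN=proxy` suffix and the distinguished names are illustrative rather than a statement of the exact GSI format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Credential:
    subject: str
    issuer: str
    not_after: float  # expiry time, in seconds on an arbitrary clock


def delegate(parent, lifetime, now):
    """Issue a proxy named after its parent that never outlives it."""
    return Credential(
        subject=parent.subject + "/CN=proxy",
        issuer=parent.subject,
        not_after=min(parent.not_after, now + lifetime),
    )


def chain_is_valid(chain, trusted_issuer, now):
    """chain[0] is the user's certificate; later entries are proxies."""
    if not chain or chain[0].issuer != trusted_issuer:
        return False
    for parent, child in zip(chain, chain[1:]):
        if child.issuer != parent.subject:
            return False
        if child.subject != parent.subject + "/CN=proxy":
            return False
        if child.not_after > parent.not_after:
            return False
    return all(cred.not_after > now for cred in chain)


CA = "/O=example/CN=CA"  # hypothetical CA name
user = Credential("/O=example/CN=Alice", CA, not_after=1000.0)
proxy = delegate(user, lifetime=100, now=0)         # created at sign-on
restricted = delegate(proxy, lifetime=500, now=10)  # handed to a remote process

print(chain_is_valid([user, proxy, restricted], CA, now=50))
```

Short proxy lifetimes limit the damage if a delegated credential leaks, which is why the remote process's credential here expires together with the proxy it came from.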

Page 27: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

Authorization

• Authorization poses main scaling problem
• Conflict between accountability and ease-of-use / ease-of-management
• Getting rid of the “local user” concept eases support for large, dynamic VOs:
  – Temporary account leasing: pool accounts à la DHCP
  – Grid ID-based file operations: slashgrid
  – Sandbox-ing applications

Direction of EU DataGrid and PPDG
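Account leasing "à la DHCP" can be sketched as a pool of local accounts handed out to grid identities for a limited time. The class below is a hypothetical illustration, not the DataGrid implementation: account names, identities and the lease length are made up, and a real system would also have to clean up files and processes before an account is reused.

```python
import time


class AccountPool:
    """Lease local pool accounts to grid identities, DHCP-style."""

    def __init__(self, accounts, lease_seconds):
        self.free = list(accounts)
        self.lease_seconds = lease_seconds
        self.leases = {}  # grid identity -> (local account, expiry time)

    def _expire(self, now):
        # Return accounts whose lease has run out to the free list.
        for identity, (account, expiry) in list(self.leases.items()):
            if expiry <= now:
                del self.leases[identity]
                self.free.append(account)

    def lease(self, identity, now=None):
        """Return the local account for this identity, renewing its lease."""
        now = time.time() if now is None else now
        self._expire(now)
        if identity in self.leases:
            account, _ = self.leases[identity]
        elif self.free:
            account = self.free.pop(0)
        else:
            raise RuntimeError("no free pool accounts")
        self.leases[identity] = (account, now + self.lease_seconds)
        return account


pool = AccountPool(["pool001", "pool002"], lease_seconds=3600)
print(pool.lease("/O=example/CN=Alice", now=0))
print(pool.lease("/O=example/CN=Bob", now=10))
```

The mapping from grid identity to local account is recorded while the lease lasts, which keeps the accountability that the slide sets against ease of management.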

Page 28: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

Locating a Replica

• Grid Data Mirror Package
• Moves data across sites
• Replicates both files and individual objects
• Catalogue used by Broker
• Replica Location Service (Giggle)
• Read-only copies “owned” by the Replica Manager.

http://cmsdoc.cern.ch/cms/grid
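A replica catalogue is at heart a mapping from a logical file name to the physical copies registered for it, which a broker can consult to pick a convenient copy. The sketch below is a minimal in-memory illustration; the class and function names, the `lfn:` prefix, the host names and the per-site cost table are all assumptions, not the GDMP or Replica Location Service interfaces.

```python
from urllib.parse import urlsplit


class ReplicaCatalogue:
    """Map logical file names (LFNs) to physical file names (PFNs)."""

    def __init__(self):
        self._replicas = {}  # LFN -> set of PFN URLs

    def register(self, lfn, pfn):
        self._replicas.setdefault(lfn, set()).add(pfn)

    def locate(self, lfn):
        return sorted(self._replicas.get(lfn, ()))


def pick_replica(catalogue, lfn, site_cost):
    """Broker step: choose the replica at the site with the lowest cost."""
    replicas = catalogue.locate(lfn)
    if not replicas:
        raise LookupError(f"no replica registered for {lfn}")
    return min(
        replicas,
        key=lambda pfn: site_cost.get(urlsplit(pfn).hostname, float("inf")),
    )


catalogue = ReplicaCatalogue()
catalogue.register("lfn:run42.dat", "gsiftp://se.site-a.example/data/run42.dat")
catalogue.register("lfn:run42.dat", "gsiftp://se.site-b.example/store/run42.dat")

# Hypothetical cost of reaching each site, e.g. derived from network monitoring.
costs = {"se.site-a.example": 5, "se.site-b.example": 1}
print(pick_replica(catalogue, "lfn:run42.dat", costs))
```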

Page 29: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

Mass Data Transport

• Need for efficient, high-speed protocol: GridFTP
• All storage elements share common interface: disk caches, tape robots, …
• Also supports GSI & single sign-on
• Optimize for high-speed networks (>1 Gbit/s)
• Data source striping through parallel streams
• Ongoing work on “better TCP”
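The idea of moving one file over several parallel streams can be sketched by splitting it into contiguous byte ranges and copying each range independently. This is an in-memory illustration using threads, with made-up function names; it does not speak the GridFTP protocol and says nothing about how GridFTP itself assigns blocks to streams.

```python
from concurrent.futures import ThreadPoolExecutor


def split_ranges(size, streams):
    """Split [0, size) into `streams` contiguous ranges of near-equal length."""
    base, extra = divmod(size, streams)
    ranges, start = [], 0
    for i in range(streams):
        length = base + (1 if i < extra else 0)
        ranges.append((start, start + length))
        start += length
    return ranges


def parallel_copy(source, streams):
    """Copy `source` (bytes) using one worker per byte range."""
    target = bytearray(len(source))

    def copy_range(bounds):
        start, end = bounds
        # Each worker writes only its own, non-overlapping slice.
        target[start:end] = source[start:end]

    with ThreadPoolExecutor(max_workers=streams) as pool:
        list(pool.map(copy_range, split_ranges(len(source), streams)))
    return bytes(target)


data = bytes(range(256)) * 4
print(split_ranges(len(data), 4))
print(parallel_copy(data, 4) == data)
```

Because every range carries its own offsets, the pieces can arrive in any order and still be written to the right place, which is what makes several streams usable at once.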

Page 30: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

Grid Data Bases ?!

• Database Access and Integration (DAI)-WG
  – OGSA-DAI integration project
  – Data Virtualisation Services
  – Standard Data Source Services
• Early emerging standards:
  – Grid Data Service specification (GDS)
  – Grid Data Service Factory (GDSF)

Largely spin-off from the UK e-Science effort & DataGrid

Page 31: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

Grid Access to Databases

• SpitFire (standard data source services): uniform access to persistent storage on the Grid
• Multiple roles support
• Compatible with GSI (single sign-on) through CoG
• Uses standard stuff: JDBC, SOAP, XML
• Supports various back-end data bases

http://hep-proj-spitfire.web.cern.ch/hep-proj-spitfire/

Page 32: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

Spitfire security model

Standard access to DBs

• GSI SOAP protocol
• Strong authentication
• Supports single sign-on
• Local role repository
• Connection pool to multiple backend DBs

Version 1.0 out, WebServices version in alpha

Page 33: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

A Bright Future?

Page 34: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

OGSA: new directions

Open Grid Services Architecture … cleaning up the protocol mess

• Concept from the ‘web services’ world
• Based on common standards:
  – SOAP, WSDL, UDDI
  – Running over “upgraded” Grid Security Infra (GSI)
• Adds Transient Services:
  – State of distributed activities
  – Workflow, multi-media, distributed data analysis

Page 35: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

OGSA Roadmap

• Introduced at GGF4 (Toronto, March 2002)
• New services already web-services based (Spitfire 2, etc.)
• Alpha-version of Globus Toolkit v3: expected December 2002.

• Huge industrial commitment

Page 36: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

EU DataGrid

• Middleware research project (2001-2003)
• Driving applications:
  – HE Physics
  – Earth Observation
  – Biomedicine
• Operational testbed
  – 21 sites
  – 6 VOs
  – ~200 users, growing by ~100/month!

http://www.eu-datagrid.org/

Page 37: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

EU DataGrid Test Bed 1

• DataGrid TB1:
  – 14 countries
  – 21 major sites
  – CrossGrid: 40 more sites
  – Growing rapidly…
• Submitting Jobs:
  – Login only once, run everywhere
  – Cross administrative boundaries in a secure and trusted way
  – Mutual authorization

http://marianne.in2p3.fr/

Page 38: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

DutchGrid Platform

[Figure: map of the Netherlands showing Amsterdam, Utrecht, KNMI, Delft, Leiden, Nijmegen, Enschede, ASTRON]

• DutchGrid:
  – Test bed coordination
  – PKI security
  – Support
• Participation by
  – NIKHEF, KNMI, SARA
  – DAS-2 (ASCI): TUDelft, Leiden, VU, UvA, Utrecht
  – Telematics Institute
  – FOM, NWO/NCF
  – Min. EZ, ICES/KIS
  – IBM, KPN, …

www.dutchgrid.nl

Page 39: Grid Computing from a solid past to a bright future? David Groep NIKHEF 2002-08-28.

A Bright Future!

You could plug your computer into the wall and have direct access to huge computing resources almost immediately (with a little help from toolkits and portals)…

It may still be science – although not fiction – but we are about to make this into reality!