Page 1: Introduction to GRID Computing Bebo White bebo@slac.stanford.edu New Directions in Information Technology Series Contra Costa College Fall 2005.

Introduction to GRID Computing

Bebo White

bebo@slac.stanford.edu

New Directions in Information Technology Series

Contra Costa College

Fall 2005

Today’s Goals

To provide an introduction to key Grid computing and Web services issues, techniques, and technologies

To provide a substantial background and vocabulary to support future studies in Grid computing and Web services

To describe some of the current applications of Grid computing

To describe some of the current Grid computing initiatives

Grid Hype

The Power Grid: On-Demand Access to Electricity

[Figure axes: Time; Quality, economies of scale]

Decouple production & consumption, enabling

On-demand access

Economies of scale

Consumer flexibility

New devices

Energy Internet

The Shape of Grids to Come?

A Grid Checklist (#1)

A system that coordinates resources that are not subject to centralized control

Integrates and coordinates resources and users that live within different control domains – for example, the user’s desktop vs. central computing; different administrative units of the same company; or different companies; and addresses the issues of security, policy, payment, membership, and so forth that arise in these settings.

Otherwise we are dealing with a local management system

(Ian Foster)

A Grid Checklist (#2)

A system that uses standard, open, general-purpose protocols and interfaces

Is built from multi-purpose protocols and interfaces that address such fundamental issues as authentication, authorization, resource discovery, and resource access.

It is important that these protocols and interfaces be standard and open.

Otherwise, we are dealing with an application-specific system.

(Ian Foster)

A Grid Checklist (#3)

A system that delivers nontrivial qualities of service.

Allows its constituent resources to be used in a coordinated fashion to deliver various qualities of service, relating, for example, to response time, throughput, availability, and security, and/or co-allocation of multiple resource types to meet complex user demands, so that the utility of the combined system is significantly greater than the sum of its parts.

(Ian Foster)

What is Grid Computing?

Coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations [I. Foster]

A virtual organization (VO) is a collection of users sharing similar needs and requirements in their access to processing, data and distributed resources and pursuing similar goals.

Key concept: Ability to negotiate resource-sharing arrangements among a set of participating parties (providers and consumers) and then to use the resulting resource pool for some purpose [I. Foster]

The Grid Problem

Flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources

From “The Anatomy of the Grid: Enabling Scalable Virtual Organizations”

Enable communities (“virtual organizations”) to share geographically distributed resources as they pursue common goals -- assuming the absence of…

central location,

central control,

omniscience,

existing trust relationships.

Elements of the Problem

Resource sharing

Computers, storage, sensors, networks, …

Sharing always conditional: issues of trust, policy, negotiation, payment, …

Coordinated problem solving

Beyond client-server: distributed data analysis, computation, collaboration, …

Dynamic, multi-institutional virtual orgs

Community overlays on classic org structures

Large or small, static or dynamic

The Grid Information Problem

There is a need for different views of the information depending upon VO membership

Security constraints

Intended purpose

Etc.

Why Grids?

Scale of the problems/applications

Solving problems that are bigger than any one data center can hold

Size of user communities

Leading research in many different fields today requires collaborations that span research centers and countries (i.e. multi-domain access to distributed resources)

Need to provide access to large data processing power and huge data storage

What Kinds of Applications?

Computation intensive

Interactive simulation (climate modeling)

Large-scale simulation (galaxy formation, gravity waves, battlefield simulation)

Engineering (parameter studies, linked models)

Data intensive

Experimental data analysis (high energy physics)

Image, sensor analysis (astronomy, climate)

Distributed collaboration

Online instruments (microscopes, x-ray devices)

Remote visualization (climate studies, biology)

Engineering (structural testing, chemical)

Online Access to Scientific Instruments

[Figure labels: Advanced Photon Source; real-time collection; tomographic reconstruction; archival storage; wide-area dissemination; desktop & VR clients with shared controls]

DOE X-ray grand challenge: ANL, USC/ISI, NIST, U.Chicago

Mathematicians Solve NUG30

Looking for the solution to the NUG30 quadratic assignment problem

The problem involves assigning 30 facilities to 30 fixed locations so as to minimize the total cost of transferring material between the facilities.

An informal collaboration of mathematicians and computer scientists

Condor-G delivered 3.46E8 CPU seconds in 7 days (peak 1009 processors) in U.S. and Italy (8 sites)

14,5,28,24,1,3,16,15,10,9,21,2,4,29,25,22,13,26,17,30,6,20,19,8,18,7,27,12,11,23

MetaNEOS: Argonne, Iowa, Northwestern, Wisconsin
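
The objective behind a quadratic assignment problem such as NUG30 can be sketched on a toy instance. This is an illustration only: the 3x3 `flow` and `dist` matrices are invented, the function names are hypothetical, and exhaustive enumeration is used purely because it is easy to read (30! assignments cannot be enumerated, which is why the real run needed a grid). The last lines check the slide's own figures for the average number of processors in use.

```python
from itertools import permutations

def qap_cost(perm, flow, dist):
    """Total transfer cost when facility i is placed at location perm[i]."""
    n = len(perm)
    return sum(flow[i][j] * dist[perm[i]][perm[j]]
               for i in range(n) for j in range(n))

def solve_qap_brute_force(flow, dist):
    """Try every assignment; feasible only for very small n."""
    n = len(flow)
    return min(permutations(range(n)), key=lambda p: qap_cost(p, flow, dist))

# Hypothetical 3-facility instance (NOT NUG30 data).
flow = [[0, 5, 2],
        [5, 0, 3],
        [2, 3, 0]]   # material moved between facilities
dist = [[0, 1, 4],
        [1, 0, 2],
        [4, 2, 0]]   # distance between locations

best = solve_qap_brute_force(flow, dist)
best_cost = qap_cost(best, flow, dist)

# Figures quoted on the slide: 3.46E8 CPU seconds delivered in 7 days.
wall_clock_seconds = 7 * 24 * 3600
avg_processors = 3.46e8 / wall_clock_seconds   # about 572, against a peak of 1009
```

The cost function counts each pair of facilities in both directions, which is the usual convention when the matrices are symmetric.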

Home Computers Evaluate AIDS Drugs

Community = 1000s of home computer users

Philanthropic computing vendor (Entropia)

Research group (Scripps)

Common goal = advance AIDS research

Network for Earthquake Engineering Simulation

NEESgrid: national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, & each other

On-demand access to experiments, data streams, computing, archives, collaboration

NEESgrid: Argonne, Michigan, NCSA, UIUC, USC

The LHC Detectors

CMS, ATLAS, LHCb

~6-8 PetaBytes/year

~10^8 events/year

~10^3 batch and interactive users

Federico Carminati, EU review presentation

High Energy Physics

Data Grids for High Energy Physics

Image courtesy Harvey Newman, Caltech

[Figure: tiered computing model for LHC data]

Online System

Offline Processor Farm (~20 TIPS)

Tier 0: CERN Computer Centre

Tier 1: FermiLab (~4 TIPS), France Regional Centre, Italy Regional Centre, Germany Regional Centre

Tier 2: Caltech (~1 TIPS), Tier2 Centres (~1 TIPS each)

Institutes (~0.25 TIPS)

Physics data cache

Tier 4: Physicist workstations

Link rates shown in the figure: ~PBytes/sec, ~100 MBytes/sec, ~622 Mbits/sec (or Air Freight, deprecated), ~1 MBytes/sec

There is a “bunch crossing” every 25 nsecs.

There are 100 “triggers” per second.

Each triggered event is ~1 MByte in size.

Physicists work on analysis “channels”.

Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server.

1 TIPS is approximately 25,000 SpecInt95 equivalents
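
The rates quoted on this slide can be cross-checked with a few lines of arithmetic. The sketch below only restates the slide's numbers; the variable names are illustrative, and the link conversion assumes 8 bits per byte with no protocol overhead.

```python
# Numbers taken from the slide.
BUNCH_CROSSING_INTERVAL_S = 25e-9   # one bunch crossing every 25 ns
TRIGGERS_PER_SECOND = 100           # events kept per second
EVENT_SIZE_MBYTES = 1.0             # size of one triggered event
LINK_MBITS_PER_S = 622              # wide-area link rate

# How often the beams cross (about 40 million times per second).
crossing_rate_hz = 1 / BUNCH_CROSSING_INTERVAL_S

# Data rate of the triggered stream, matching the ~100 MBytes/sec label.
recorded_mbytes_per_s = TRIGGERS_PER_SECOND * EVENT_SIZE_MBYTES

# Fraction of crossings that survive the trigger.
kept_fraction = TRIGGERS_PER_SECOND / crossing_rate_hz

# A 622 Mbit/s link expressed in MBytes/sec (assumes 8 bits per byte).
link_mbytes_per_s = LINK_MBITS_PER_S / 8
```

Under these assumptions a single 622 Mbit/s link carries a little under 78 MBytes/sec, less than the full triggered stream.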

Solving Large Problems – Pre-Grid

Once upon a time…

[Figure labels: mainframe, minicomputer, microcomputer, cluster]

(by Christophe Jacquet)

The Grid Distributed Computing Idea

(by Christophe Jacquet)

…and today

Differences Between Grids and Distributed Applications

Huge distributed applications already exist, but they tend to be specialized systems intended for a single purpose or user group

e.g., SETI@Home, FightAIDS@Home

Grids go further and take into account:

Different kinds of resources

Not always the same hardware, data and applications

No parallelization required

Different kinds of interactions

User groups or applications want to interact with Grids in different ways

Dynamic nature

Resources and users added/removed/changed frequently

The Grid Vision

The Grid: networked data processing centers and “middleware” software as the “glue” of resources.

Researchers perform their activities regardless of geographical location, interact with colleagues, share and access data

Scientific instruments and experiments provide huge amounts of data

Broader Context

“Grid Computing” has much in common with major industrial thrusts

Business-to-business, Peer-to-peer, Application Service Providers, Storage Service Providers, Distributed Computing, Internet Computing…

Sharing issues not adequately addressed by existing technologies

Complicated requirements: “run program X at site Y subject to community policy P, providing access to data at Z according to policy Q”

High performance: unique demands of advanced and high-performance systems

Grid Types - Physical

Cluster Grid

Enterprise Grid

Global Grid

Grid Types - Logical

Data Grid responds to requests for computers and data stores; similar to (but more secure and auditable than) today's research grids

Information Grid responds to requests for computational processes, that may require several data sources and processing stages to deliver a desired result

Knowledge Grid responds to high-level questions and finds the appropriate processes to deliver answers in the required form

The Classical (early) Grid

Focused on applications where data was stored in files

Little support for transactions, relational database access or distributed query processing

Exploits a range of protocols such as:

LDAP for directory services and file store queries,

GridFTP for large-scale reliable data transfer

SSL for security

Why Now?

Moore’s law improvements in computing produce highly functional end systems

The Internet and burgeoning wired and wireless networks provide universal connectivity

Changing modes of working and problem solving emphasize teamwork, computation

Network exponentials produce dramatic changes in geometry and geography

Network Exponentials

Network vs. computer performance

Computer speed doubles every 18 months

Network speed doubles every 9 months

Difference = order of magnitude per 5 years

1986 to 2000

Computers: x 500

Networks: x 340,000

2001 to 2010

Computers: x 60

Networks: x 4000

Moore’s Law vs. storage improvements vs. optical improvements. Graph from Scientific American (Jan-2001) by Cleo Vilett, source Vinod Khosla, Kleiner Perkins Caufield & Byers.
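
The "order of magnitude per 5 years" claim follows directly from the two doubling times. A minimal sketch, assuming pure exponential growth at the stated rates (the function name is illustrative):

```python
def growth_factor(months, doubling_months):
    """Multiplicative improvement after `months` at a fixed doubling time."""
    return 2 ** (months / doubling_months)

FIVE_YEARS = 60  # months

computers_5y = growth_factor(FIVE_YEARS, 18)   # roughly 10x
networks_5y = growth_factor(FIVE_YEARS, 9)     # roughly 100x
gap_5y = networks_5y / computers_5y            # roughly 10x: one order of magnitude

# 2001 to 2010 is 9 years = 108 months.
computers_2001_2010 = growth_factor(108, 18)   # 2**6  = 64   (slide quotes x 60)
networks_2001_2010 = growth_factor(108, 9)     # 2**12 = 4096 (slide quotes x 4000)
```

Because the network doubling time is exactly half the computer doubling time, the gap over any period equals the computer growth factor for that period.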

The 13.6 TF TeraGrid: Computing at 40 Gb/s

[Figure: four sites, each with site resources, HPSS or UniTree storage, and links to external networks]

NCSA/PACI: 8 TF, 240 TB

SDSC: 4.1 TF, 225 TB

Caltech

Argonne

TeraGrid/DTF: NCSA, SDSC, Caltech, Argonne www.teragrid.org

iVDGL: International Virtual Data Grid Laboratory

[Figure legend: Tier0/1 facility, Tier2 facility, Tier3 facility, 10 Gbps link, 2.5 Gbps link, 622 Mbps link, other link]

U.S. PIs: Avery, Foster, Gardner, Newman, Szalay www.ivdgl.org

Main Services of a Grid Architecture

Service providers

Publish the availability of their services via information systems

Such services may come and go or change dynamically

E.g. a testbed site that offers x CPUs and y GB of storage

Service brokers

Register and categorize published services and provide search capabilities

E.g. 1) SLAC Resource Broker selects the best site for a “job”

2) Catalogues of data held at each testbed site

Service requesters

Single sign-on: log into the Grid once

Use brokering services to find a needed service and employ it

E.g. CMS physicists submit a simulation job that needs 12 CPUs for 6 hours and 15 GB, which gets scheduled, via the Resource Broker, on the CERN testbed site
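
The provider / broker / requester roles can be sketched in a few lines. This is a toy model under stated assumptions: the class and method names are hypothetical, the site capacities are invented, and the ranking rule (prefer the most free CPUs) is a placeholder for whatever policy a real resource broker applies.

```python
from dataclasses import dataclass

@dataclass
class Site:
    name: str
    free_cpus: int
    free_storage_gb: int

@dataclass
class Job:
    cpus: int
    hours: int
    storage_gb: int

class ResourceBroker:
    """Toy broker: providers publish, requesters ask for a matching site."""

    def __init__(self):
        self._sites = {}

    def publish(self, site):
        # Providers may re-publish at any time as availability changes.
        self._sites[site.name] = site

    def withdraw(self, name):
        # Services may come and go dynamically.
        self._sites.pop(name, None)

    def select_site(self, job):
        candidates = [s for s in self._sites.values()
                      if s.free_cpus >= job.cpus
                      and s.free_storage_gb >= job.storage_gb]
        if not candidates:
            return None
        # Hypothetical ranking rule: prefer the site with the most free CPUs.
        return max(candidates, key=lambda s: s.free_cpus)

broker = ResourceBroker()
broker.publish(Site("CERN", free_cpus=64, free_storage_gb=500))    # invented capacity
broker.publish(Site("SiteB", free_cpus=8, free_storage_gb=1000))   # invented site

# The slide's example job: 12 CPUs for 6 hours and 15 GB.
job = Job(cpus=12, hours=6, storage_gb=15)
chosen = broker.select_site(job)
```

The requester never names a site; it describes what it needs and lets the broker match that against what is currently published.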

Grid Security

Resource providers are essentially “opening themselves up” to itinerant users

Secure access to resources is required

X.509 Public Key Infrastructure

User’s identity has to be certified by (mutually recognized) national Certification Authorities (CAs)

Resources (node machines) have to be certified by CAs

Temporary delegation from users to processes to be executed “in user’s name” (proxy certificates)

Commonly agreed policies for accessing resources and handling users’ rights across different domains within VOs
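
The delegation idea behind proxy certificates can be illustrated with a toy model. The sketch below performs no cryptography at all: it only models the chain of issuers and the short lifetime of a proxy. Real grids verify X.509 signatures at every link. The CA name, subject names, 12-hour lifetime, and function names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

@dataclass(frozen=True)
class Credential:
    subject: str
    issuer: str
    not_after: datetime
    parent: Optional["Credential"] = None   # set only for proxies

TRUSTED_CAS = {"Example National CA"}        # hypothetical CA name

def issue_proxy(user_cred, now, lifetime=timedelta(hours=12)):
    """The user delegates to a process via a short-lived proxy credential."""
    expiry = min(now + lifetime, user_cred.not_after)
    return Credential(subject=user_cred.subject + "/proxy",
                      issuer=user_cred.subject,
                      not_after=expiry,
                      parent=user_cred)

def is_acceptable(cred, now):
    """Walk the chain: every link unexpired, root issued by a trusted CA."""
    while cred is not None:
        if now > cred.not_after:
            return False
        if cred.parent is None:
            return cred.issuer in TRUSTED_CAS
        if cred.issuer != cred.parent.subject:
            return False
        cred = cred.parent
    return False

now = datetime(2005, 10, 1, 9, 0)
user = Credential("CN=Some User", "Example National CA", now + timedelta(days=365))
proxy = issue_proxy(user, now)
```

The short lifetime is the point of the design: a process can act in the user's name for a few hours without ever holding the user's long-lived credential.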

The Globus Project™

Making Grid computing a reality

Close collaboration with real Grid projects in science and industry

Development and promotion of standard Grid protocols to enable interoperability and shared infrastructure

Development and promotion of standard Grid software APIs and SDKs to enable portability and code sharing

The Globus Toolkit™: Open source, reference software base for building grid infrastructure and applications

Global Grid Forum: Development of standard protocols and APIs for Grid computing

Selected Major Grid Projects

Name (URL; sponsors): Focus

Access Grid (www.mcs.anl.gov/FL/accessgrid; DOE, NSF): Create & deploy group collaboration systems using commodity technologies

BlueGrid (IBM): Grid testbed linking IBM laboratories

DISCOM (www.cs.sandia.gov/discom; DOE Defense Programs): Create operational Grid providing access to resources at three U.S. DOE weapons laboratories

DOE Science Grid (sciencegrid.org; DOE Office of Science): Create operational Grid providing access to resources & applications at U.S. DOE science laboratories & partner universities

Earth System Grid (ESG) (earthsystemgrid.org; DOE Office of Science): Delivery and analysis of large climate model datasets for the climate research community

European Union (EU) DataGrid (eu-datagrid.org; European Union): Create & apply an operational grid for applications in high energy physics, environmental science, bioinformatics

Selected Major Grid Projects (cont)

Name (URL; sponsor): Focus

EuroGrid, Grid Interoperability (GRIP) (eurogrid.org; European Union): Create tech for remote access to supercomp resources & simulation codes; in GRIP, integrate with Globus Toolkit™

Fusion Collaboratory (fusiongrid.org; DOE Off. Science): Create a national computational collaboratory for fusion research

Globus Project™ (globus.org; DARPA, DOE, NSF, NASA, Msoft): Research on Grid technologies; development and support of Globus Toolkit™; application and deployment

GridLab (gridlab.org; European Union): Grid technologies and applications

GridPP (gridpp.ac.uk; U.K. eScience): Create & apply an operational grid within the U.K. for particle physics research

Grid Research Integration Dev. & Support Center (grids-center.org; NSF): Integration, deployment, support of the NSF Middleware Infrastructure for research & education

Selected Major Grid Projects (cont)

Name (URL; sponsor): Focus

Grid Application Dev. Software (hipersoft.rice.edu/grads; NSF): Research into program development technologies for Grid applications

Grid Physics Network (griphyn.org; NSF): Technology R&D for data analysis in physics expts: ATLAS, CMS, LIGO, SDSS

Information Power Grid (ipg.nasa.gov; NASA): Create and apply a production Grid for aerosciences and other NASA missions

International Virtual Data Grid Laboratory (ivdgl.org; NSF): Create international Data Grid to enable large-scale experimentation on Grid technologies & applications

Network for Earthquake Eng. Simulation Grid (neesgrid.org; NSF): Create and apply a production Grid for earthquake engineering

Particle Physics Data Grid (ppdg.net; DOE Science): Create and apply production Grids for data analysis in high energy and nuclear physics experiments

Selected Major Grid Projects (cont)

Name (URL; sponsor): Focus

TeraGrid (teragrid.org; NSF): U.S. science infrastructure linking four major resource sites at 40 Gb/s

UK Grid Support Center (grid-support.ac.uk; U.K. eScience): Support center for Grid projects within the U.K.

Unicore (BMBFT): Technologies for remote access to supercomputers

Also many technology R&D projects: e.g., Condor, NetSolve, Ninf, NWS

See also www.gridforum.org

Where is Development of the Grid Going?

[Figure: Grid and Web technology lines converging on WSRF]

Grid: GT1, GT2, OGSI

Web: HTTP, WSDL, WS-*, WSDL 2, WSDM

Started far apart in apps & tech

Have been converging

The definition of WSRF means that Grid and Web communities can move forward on a common base

Standards

Grid and Web Services are merging

Grid is an aggressive use case of Web Services

WSRF completes common infrastructure

Web Services standards landscape is in flux

Uncertain status of security and policy standards continues to be a big source of concern

Grid services standards landscape heating up

Agreement, management, data access, …

Open source software important for adoption

Standards (cont)

Open, standard protocols

Enable interoperability

Avoid product/vendor lock-in

Enable innovation/competition on end points

Enable ubiquity

In Grid space, must address how to

Describe, discover, and access resources

Monitor, manage, and coordinate resources

Account and charge for resources

For many different types of resource

Standards (cont)

SSL/TLS v1 (from OpenSSL) (IETF)

LDAP v3 (from OpenLDAP) (IETF)

X.509 Proxy Certificates (IETF)

GridFTP v1.0 (GGF)

WSDL 1.1, XML, SOAP (W3C)

WS-Security (OASIS)

OGSI v1.0 (GGF)

And others on the road to standardization

WSRF (OASIS), DAIS (GGF), WS-Agreement (GGF), WSDL 2.0, WSDM, SAML, XACML

WSRF Specifications

List is still changing, but basically includes:

Core:

WS-Resource Framework (WSRF)

WS-ResourceProperties (WSRF-RP)

WS-ResourceLifetime (WSRF-RL)

WS-ServiceGroup (WSRF-SG)

WS-BaseFaults (WSRF-BF)

Related:

WS-Notification

WS-Addressing

WSRF

WSRF is a framework consisting of a number of specifications.

WS-ResourceProperties

WS-ResourceLifetime

WS-ServiceGroup

WS-Notification

WS-BaseFaults

WS-RenewableReferences (unpublished)

Other WS specifications such as:

WS-Addressing

How WSRF Fits in With Other Standards, Specifications and Protocols.

[Figure: layered stack, top to bottom]

Grid stuff: Globus (GRAM, MDS)

WSRF

Web services: WSDL, SOAP

Internet protocols: HTTP, TCP/IP

Describing Web Services

Web Services Description Language (WSDL) 2.0

Status: W3C Last Call Working Draft

http://www.w3.org/TR/wsdl

WSDL is for describing Web Services

Defines XML-based grammar for describing network services as a set of endpoints

Describes their methods, arguments, return values and how to use them

Approach: Service Oriented Architecture (SOA)

Service-Provider:

Develop a Web Service and publish its description as WSDL

Publish a link to it in a Service-Registry

Service-Consumer:

Service discovery, i.e. find WSDL, e.g. via Service-Registry

Use endpoint definition (WSDL) to communicate with service

Web Services Addressing

URIs (Uniform Resource Identifiers). Look like URLs:

http://webservices.mysite.com/weather/us/WeatherService

When you have a Web Service URI, you will usually need to give that URI to a program

If you typed a Web Service URI into your web browser, you would probably get an error message or some unintelligible code

Some services include a polite response page
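
What "giving the URI to a program" looks like can be sketched with the Python standard library. The example builds a SOAP 1.1 request for the URI shown above but deliberately does not send it. The operation name `GetTemperature`, its namespace `http://example.org/weather`, and the `city` parameter are invented for illustration; a real service defines these in its WSDL.

```python
import urllib.request
import xml.etree.ElementTree as ET

SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/"
SERVICE_URI = "http://webservices.mysite.com/weather/us/WeatherService"

def build_soap_request(uri, operation, operation_ns, params):
    """Build (but do not send) a SOAP 1.1 POST for the given operation."""
    envelope = ET.Element(f"{{{SOAP_ENV_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV_NS}}}Body")
    call = ET.SubElement(body, f"{{{operation_ns}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    payload = ET.tostring(envelope, encoding="utf-8")
    return urllib.request.Request(
        uri,
        data=payload,
        headers={"Content-Type": "text/xml; charset=utf-8",
                 "SOAPAction": '""'},
        method="POST",
    )

# Hypothetical operation and parameter; only the URI comes from the slide.
request = build_soap_request(SERVICE_URI, "GetTemperature",
                             "http://example.org/weather",
                             {"city": "Richmond"})
```

A browser would issue a plain GET to the same URI, which is why it tends to get an error or raw XML back: the service expects a POSTed envelope like the one built here.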

Service-Oriented Architecture

[Figure: the Service Provider publishes its endpoint definition to the Registry (Service Broker); the Service Consumer discovers the service through the Registry and binds to the Service Provider]

Web Services Architecture

WSDL: Core element of the Web Service Architecture stack (Endpoint definition language)

[Figure labels: Web Service, Listener, Responder]

Simplified Web Service Stack (WS-I Basic Profile 1.0 compliant)

XML 1.0 + Namespaces (messaging)

SOAP (messaging)

XSD (service description)

WSDL (service description)

UDDI (service discovery)

Abstract Endpoint Type

Possibly part of a WSDL specification

Message

Operation

PortType (Abstract Endpoint Type): set of message flows (operations) expected by a particular endpoint type; no details relating to transport, encoding, or location

[Figure: an Abstract Endpoint Type (PortType) with four kinds of operation and their messages: one-way, request-response, notification, solicit-response]

Concrete Endpoint Type

Binding (Concrete Endpoint Type): defines transport and encoding particulars for a portType

[Figure labels: Concrete Endpoint Type (Binding), PortType, Transport & Encoding, operation, Messages for operation, URI]

Shift to Service Definition

Port (Endpoint Instance): network address of an endpoint and the binding it adheres to

Note: not necessarily a TCP port

Service: a collection of related endpoint instances

[Figure labels: Service, Host, Endpoint Instance (Port), Concrete Endpoint Type (Binding)]

Describing Web Services

All WSDL Elements belong to the WSDL namespace: http://schemas.xmlsoap.org/wsdl/

Namespaces for WSDL Binding

SOAP Binding: http://schemas.xmlsoap.org/wsdl/soap/

HTTP GET and POST Binding: http://schemas.xmlsoap.org/wsdl/http/

WSDL MIME binding: http://schemas.xmlsoap.org/wsdl/mime/

More to come…
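
To show how the pieces named on the preceding slides (message, portType, binding, port, service) fit together, here is a minimal WSDL 1.1 document and a short parser that lists its operations, bindings, and ports. The weather service, its `http://example.org/weather` namespace, and every name inside the document are invented for illustration; only the two `schemas.xmlsoap.org/wsdl` namespaces and the endpoint URI come from the slides.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"
SOAP_NS = "http://schemas.xmlsoap.org/wsdl/soap/"

# Hypothetical service description (illustrative names throughout).
WSDL_DOC = """<definitions name="Weather"
    targetNamespace="http://example.org/weather"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:tns="http://example.org/weather"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <message name="GetTemperatureRequest">
    <part name="city" type="xsd:string"/>
  </message>
  <message name="GetTemperatureResponse">
    <part name="celsius" type="xsd:float"/>
  </message>
  <portType name="WeatherPortType">
    <operation name="GetTemperature">
      <input message="tns:GetTemperatureRequest"/>
      <output message="tns:GetTemperatureResponse"/>
    </operation>
  </portType>
  <binding name="WeatherSoapBinding" type="tns:WeatherPortType">
    <soap:binding style="rpc" transport="http://schemas.xmlsoap.org/soap/http"/>
    <operation name="GetTemperature">
      <soap:operation soapAction=""/>
      <input><soap:body use="literal" namespace="http://example.org/weather"/></input>
      <output><soap:body use="literal" namespace="http://example.org/weather"/></output>
    </operation>
  </binding>
  <service name="WeatherService">
    <port name="WeatherPort" binding="tns:WeatherSoapBinding">
      <soap:address location="http://webservices.mysite.com/weather/us/WeatherService"/>
    </port>
  </service>
</definitions>
"""

def describe(wsdl_text):
    """Summarize abstract operations, concrete bindings, and endpoint ports."""
    root = ET.fromstring(wsdl_text)
    ns = {"wsdl": WSDL_NS, "soap": SOAP_NS}
    summary = {"operations": [], "bindings": [], "ports": []}
    for port_type in root.findall("wsdl:portType", ns):
        for op in port_type.findall("wsdl:operation", ns):
            summary["operations"].append((port_type.get("name"), op.get("name")))
    for binding in root.findall("wsdl:binding", ns):
        soap_binding = binding.find("soap:binding", ns)
        summary["bindings"].append((binding.get("name"), soap_binding.get("transport")))
    for service in root.findall("wsdl:service", ns):
        for port in service.findall("wsdl:port", ns):
            address = port.find("soap:address", ns)
            summary["ports"].append((port.get("name"), address.get("location")))
    return summary

summary = describe(WSDL_DOC)
```

Reading the document top to bottom mirrors the slides: messages and the portType are abstract, the binding adds transport and encoding, and the port inside the service finally supplies a network address.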

State Management

Core communication model of the Web (HTTP) is stateless

Application requires state when a user traverses the multiple endpoints of a Web application/service

Web Service: Stateless

Web Service: Stateful

Web Service Invocation - Stateful

Web Service + WSRF = Stateful Resources = WS-Resource

A stateful resource is something that exists even when you're not interacting with it.

E.g. database backend service

Stateful resources have properties that define state

These properties are how you interact with them

Properties have values

Add/remove/change properties and values dynamically

WSRF Specification: a WS-Resource is the combination of a Web service and a stateful resource on which it acts.

WS-Resource Approach to State

Typical approach:

Put the state in the Web service (thus making it stateful, which is generally regarded as a bad thing)

WSRF approach:

Store state in a separate entity called a resource

Each resource has a unique key

A Web service can have multiple resources

To connect to service: URI + WS-Addressing Std

WS-Resources

Web services often provide access to state

Job submissions, databases

A WS-Resource is a standard way of representing that state.

In this tutorial, we will be using ‘counter’ resources which are simple accumulators.
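
The counter idea can be sketched without any Web services machinery. In this illustrative model the service object holds no state of its own; every counter lives in a separate store and is addressed by a unique key, which is the role an endpoint reference plays under WS-Addressing. The class name, method names, and the dictionary standing in for a resource home are all hypothetical and are not the Globus Toolkit API.

```python
import uuid

class CounterService:
    """Stateless front end; all state lives in keyed resources."""

    def __init__(self, store=None):
        # The store stands in for a database or other resource home.
        self._store = {} if store is None else store

    def create(self):
        key = str(uuid.uuid4())          # unique resource key
        self._store[key] = {"value": 0}  # resource properties
        return key

    def add(self, key, amount):
        resource = self._store[key]
        resource["value"] += amount
        return resource["value"]

    def get_property(self, key, name):
        return self._store[key][name]

    def destroy(self, key):
        # Lifetime management: remove the resource explicitly.
        del self._store[key]

shared_store = {}
service_a = CounterService(shared_store)
service_b = CounterService(shared_store)  # a second instance with no state of its own

k1 = service_a.create()
k2 = service_a.create()                   # one service, multiple resources
service_a.add(k1, 5)
service_b.add(k1, 2)                      # any instance can act on the same resource
```

Because the key travels with each request, either service instance can handle it, which is the practical benefit of keeping state out of the service.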

WS-Resources

WSRF specifications provide:

XML-based Resource Properties

Lifetime management (creation/destruction) of resources

Service groups, which group together WS-Resources

Notification (for example of changes in resource properties)

Faults

Renewable References

Examples of WS-Resources

Files on a file server

Rows in a database

Jobs in a job submission system

Accounts in a bank

Session Design

Session – Defines a context in which a user communicates with a Web Application in a defined time period

One Session per user

Assigns application state to multiple requests from one user

Design Decision / Rules of thumb:

Use a database to persist state

UUID to identify a session/user

Physical Design: Session identifier exchange

Cookie, hidden variable, or encoded into the URL
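
The two rules of thumb (persist state in a database, identify the session with a UUID) might look like the following minimal sketch. The table name `session_state` and the helper functions are hypothetical, and an in-memory SQLite database stands in for a real one.

```python
import sqlite3
import uuid

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE session_state ("
    "session_id TEXT, name TEXT, value TEXT, "
    "PRIMARY KEY (session_id, name))"
)


def new_session():
    """Return a fresh session identifier (sent to the client, e.g. in a cookie)."""
    return str(uuid.uuid4())


def put(session_id, name, value):
    """Store or overwrite one piece of state for a session."""
    with db:  # commits on success
        db.execute(
            "INSERT OR REPLACE INTO session_state VALUES (?, ?, ?)",
            (session_id, name, value),
        )


def get(session_id, name):
    """Fetch state for a session, or None if nothing was stored."""
    row = db.execute(
        "SELECT value FROM session_state WHERE session_id = ? AND name = ?",
        (session_id, name),
    ).fetchone()
    return row[0] if row else None
```

Every request that carries the same identifier sees the same state, no matter which server process handles it.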

Transactions

Transaction – A unit of work that should either succeed or fail as a whole. A series of operations that behaves according to the ACID rules.

Series: BEGIN_TRANSACTION, Op1, …, OpN, COMMIT_TRANSACTION

ACID Rules define Atomicity, Consistency, Isolation, and Durability

Characteristics regarding Web Applications: Long Running, Nested

Atomicity And Consistency

Atomicity: Transaction executes exactly once and is atomic

All the work is done or none of it

Consistency: Transaction preserves the consistency of data

Transforms one consistent state of data into another consistent state

Isolation And Durability

Isolation: Transaction is a unit of isolation

Concurrent transactions behave as though each was the only transaction running in the system

Durability: Transaction is a unit of recovery

If a transaction commits, the system guarantees that its updates will persist, immediately after the commit.
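
Atomicity ("all the work is done or none of it") can be seen in a small sketch using Python's built-in `sqlite3` module. The `accounts` table and the `transfer` function are assumptions for illustration; the connection's context manager commits on success and rolls back if an exception escapes.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one unit of work."""
    try:
        with conn:  # COMMIT on success, ROLLBACK on exception
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                # Consistency rule violated: abort the whole transaction.
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


def balances(conn):
    """Return the current balances as a dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 0)])
conn.commit()
```

A failed transfer leaves both rows exactly as they were, even though the two UPDATE statements had already run inside the transaction.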

Aspects of DSA

Driven by communication aspects

Performance issues: Protocol overhead; Bandwidth; Quality of Service; Delays; Proxy, Cache and Mirrors

Other Issues: Security, availability, etc.; Operational aspects

Simple Web Service Chain

Web Service WS 1 provides functionality using WS 2, WS 2 provides…

Like a chain: The weakest element influences the overall behavior

Hops - Represents the number of network nodes involved from the source WS to the destination WS. Example shows 2 Hops, 4 Web Services

[Figure: chain of Web Services WS 1 → WS 2 → WS 3 → … → WS n]
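
A rough way to quantify the "chain" effect, assuming each service is called in sequence and fails independently of the others (both are simplifying assumptions, and the function names are made up for this sketch):

```python
from math import prod


def chain_availability(availabilities):
    """Probability that every service in the chain is up at the same time."""
    return prod(availabilities)


def chain_latency(latencies_ms):
    """End-to-end delay when each service waits for the next one."""
    return sum(latencies_ms)


availabilities = [0.99, 0.99, 0.90]  # illustrative values only
overall = chain_availability(availabilities)
```

Under these assumptions the chain can never be more available than its weakest element, and every extra hop adds its full delay to the total.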

Considering Scalability

Scale Up: More “power” added to the machine

Scale Out: The application logic unit is cloned across a set of identical servers

[Figure: Scale Up vs. Scale Out]

Scale-Out and Partition

Scale out Web Servers and scale up Database

Scale Up Database

Partition Database

Partition Database

Functional – Each functional area of a site gets its own database

Dedicated hardware for certain functions; class of hardware per function

Tables - Huge scale opportunity for large tables

Some modern database management systems provide special support for this

Read-only Databases: Data changes do not occur often; use of Replicated Databases
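
Both partitioning styles come down to a routing decision made before a query is sent. The sketch below assumes hypothetical database names and uses a hash of the row key to pick a table partition; real database systems offer their own partitioning features, which this does not model.

```python
import hashlib

# Functional partitioning: one database per functional area (names are made up).
FUNCTIONAL_DATABASES = {
    "catalog": "db-catalog",
    "orders": "db-orders",
    "accounts": "db-accounts",
}


def database_for_function(area):
    """Pick the database that owns a functional area of the site."""
    return FUNCTIONAL_DATABASES[area]


def shard_for_key(key, shard_count):
    """Table partitioning: map a row key to one of shard_count partitions."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count
```

A stable hash is used instead of Python's built-in `hash()` so that the same key always lands on the same partition across processes.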

Dynamic WS Discovery

A Web Service calls another Web Service, mediated by a Broker (or a P2P network)

Criteria may be quality, context, price, etc.

Requires classification system or metadata

Broker could use UDDI automatically on request

P2P discovery by content-based routing (e.g. for WSDL)

[Figure: WS 1 reaches WS x and WS y through a Broker / P2P-Network]
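
Broker-mediated discovery can be pictured as a lookup over service metadata. The following sketch is not UDDI; the `ServiceEntry` fields (category, price, quality) and the selection rule are assumptions chosen to mirror the criteria listed on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceEntry:
    name: str
    category: str   # classification used for matching
    price: float    # cost per call, arbitrary units
    quality: float  # higher is better
    endpoint: str


class Broker:
    """Keeps service metadata and selects a provider on request."""

    def __init__(self):
        self._entries = []

    def register(self, entry):
        self._entries.append(entry)

    def find(self, category, max_price):
        """Return the best-quality service in a category within budget, or None."""
        candidates = [
            e for e in self._entries
            if e.category == category and e.price <= max_price
        ]
        return max(candidates, key=lambda e: e.quality, default=None)


broker = Broker()
broker.register(ServiceEntry("WS x", "quote", 0.05, 0.9, "http://example.org/x"))
broker.register(ServiceEntry("WS y", "quote", 0.01, 0.7, "http://example.org/y"))
```

The caller never hard-codes a provider; changing the criteria changes which endpoint is returned.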

Integrating Endpoints

Typical Problems:

No standard Way to expose Functionality

Integration is expensive and error-prone

Not designed for Partnership Scenarios

Why? The Semantics of content get lost on the way to presentation

Need for Semantics

Integrating Application Logic

Goal: Federating Web Applications (or their Logical Units)

Globalize the Component-based View

Next Generation Web Applications will work together

Extend processes with external (potentially unknown) partners

Federation Approach

[Figure: four Web Applications federated over the Internet]

Federation Scenarios

Distributed Computing / Web Services in use for:

Mobile Virtual Enterprise Market-place, Supply Chain, Grid Computing (Grid of Web Services)

Portals providing uniform Access to distributed Information Spaces

Examples of Business Relationships:

B2B: Business-to-Business

B2C: Business-to-Consumer

C2C: Consumer-to-Consumer

B2A: Business-to-Administration

A2C: Administration-to-Consumer

A2A: Administration-to-Administration

Accessing Objects

SOAP Version 1.2: W3C Recommendation, 24 June 2003

Part 0: Primer (tutorial): http://www.w3.org/TR/soap12-part0/

Part 1: Defines Messaging Framework

Part 2: Adjuncts (may be used in messages)

SOAP provides a simple and lightweight Mechanism for exchanging structured and typed Information between Peers in a decentralized, distributed Environment

Formerly known as Simple Object Access Protocol

Does not itself define any Application Semantics, e.g. Programming Model

SOAP

SOAP consists of three Parts:

SOAP envelope - Defines what is in a message, who should deal with it, and whether it is optional or mandatory

SOAP encoding rules - Define a serialization mechanism for application-defined data types.

SOAP RPC representation - Defines a convention that can be used to represent remote procedure calls and responses.

General Web Service Model

[Figure labels: Consumer; Web Service (Provider); Requestor; Listener; Parser; Responder; Process-Logic; Transport (e.g. HTTP(S), SMTP, FTP); SOAP Message]

SOAP Message

[Figure: a SOAP Message is a SOAP Envelope that contains a SOAP Header and a SOAP Body]

SOAP Protocol Layering (top to bottom):

SOAP

Application Protocol (HTTP, SMTP, etc.)

Transport Protocol (TCP/IP, IPX/SPX, etc.)

Physical Protocol (Ethernet, ATM, etc.)

SOAP and Client/Server…

In order for SOAP to work, the client must have code running that is responsible for building the SOAP request.

In response, a server must be responsible for understanding the SOAP request, invoking the specified method, building the response message, and returning it to the client.

These details are up to you: your Web application
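
The division of labour between client and server can be sketched with Python's standard XML library. This is a toy, not a SOAP toolkit: the `Add` method, its parameters `n1` and `n2`, the `http://tempuri.org/` application namespace, and the `AddResult` element name are assumptions for the example, and no HTTP transport is involved.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 envelope namespace
APP_NS = "http://tempuri.org/"                       # hypothetical application namespace

# Methods the server chooses to expose (hypothetical).
METHODS = {"Add": lambda n1, n2: int(n1) + int(n2)}


def build_request(method, params):
    """Client side: wrap a method call in Envelope/Body."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def handle_request(xml_text):
    """Server side: understand the request, invoke the method, build the response."""
    envelope = ET.fromstring(xml_text)
    call = envelope.find(f"{{{SOAP_NS}}}Body")[0]
    method = call.tag.split("}", 1)[1]
    args = {child.tag.split("}", 1)[1]: child.text for child in call}
    result = METHODS[method](**args)

    response = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(response, f"{{{SOAP_NS}}}Body")
    reply = ET.SubElement(body, f"{{{APP_NS}}}{method}Response")
    ET.SubElement(reply, f"{{{APP_NS}}}{method}Result").text = str(result)
    return ET.tostring(response, encoding="unicode")


def parse_result(xml_text, method):
    """Client side: pull the result text out of the response."""
    envelope = ET.fromstring(xml_text)
    return envelope.find(f".//{{{APP_NS}}}{method}Result").text


request_xml = build_request("Add", {"n1": 12, "n2": 10})
response_xml = handle_request(request_xml)
```

In practice a toolkit generates this plumbing; the sketch only shows which side is responsible for which step.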

The HTTP Aspect

A SOAP request is sent via an HTTP POST request:

POST /WebCalculator/Calculator.asmx HTTP/1.1
Content-Type: text/xml
...
SOAPAction: "http://tempuri.org/Add"
Content-Length: 386

<?xml version="1.0"?>
<soap:Envelope ...>
  ...
</soap:Envelope>

Message Structure

[Figure: nested structure of the complete SOAP message]

The complete SOAP message: Protocol binding headers, followed by the SOAP Envelope

<Envelope> encloses payload

<Header> encloses headers

Individual headers

<Body> contains SOAP message name

XML-encoded SOAP message name and data

SOAP Message Example

An XML document using the SOAP schema:

<?xml version="1.0"?>
<soap:Envelope ...>
  <soap:Header ...>
    ...
  </soap:Header>
  <soap:Body>
    <MyQuery xmlns="http://tempuri.org/">
      <n1>12</n1>
      <n2>10</n2>
    </MyQuery>
  </soap:Body>
</soap:Envelope>

Encoding Complex Data

Data structures are serialized as XML:

<soap:Envelope ...>
  <soap:Body>
    <MyQueryResult xmlns="http://tempuri.org/">
      <result>
        <Description>Plastic Novelties Ltd</Description>
        <Price>129</Price>
        <Ticker>PLAS</Ticker>
      </result>
    </MyQueryResult>
  </soap:Body>
</soap:Envelope>

Example of a SOAP Request

SOAP message over HTTP-POST:

POST /StockQuote HTTP/1.1
Host: www.stockquoteserver.com
Content-Type: text/xml; charset="utf-8"
Content-Length: nnnn
SOAPAction: "Some-URI"

<soap:Envelope xmlns:soap="http://www.w3.org/2001/09/soap-envelope">
  <soap:Body>
    <m:GetLastTradePrice xmlns:m="Some-URI">
      <symbol>DIS</symbol>
    </m:GetLastTradePrice>
  </soap:Body>
</soap:Envelope>

A SOAP Response

SOAP response over HTTP:

HTTP/1.1 200 OK
Content-Type: text/xml; charset="utf-8"
Content-Length: nnnn

<soap:Envelope xmlns:soap="http://www.w3.org/2001/09/soap-envelope">
  <soap:Body>
    <m:GetLastTradePriceResponse xmlns:m="Some-URI">
      <Price>34.5</Price>
    </m:GetLastTradePriceResponse>
  </soap:Body>
</soap:Envelope>

Example of a SOAP Error

SOAP response over HTTP:

HTTP/1.1 500 Internal Server Error
Content-Type: text/xml; charset="utf-8"
Content-Length: nnnn

<soap:Envelope xmlns:soap="http://www.w3.org/2001/09/soap-envelope">
  <soap:Body>
    <soap:Fault>
      <faultcode>soap:MustUnderstand</faultcode>
      <faultstring>SOAP Must Understand Error</faultstring>
    </soap:Fault>
  </soap:Body>
</soap:Envelope>
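
A client has to tell a normal response from a fault. The sketch below checks the Body for a Fault element, using the envelope namespace and element names that appear in the slide examples; the function name `read_response` and the tuple it returns are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2001/09/soap-envelope"  # namespace used in the slide examples

OK_XML = """<soap:Envelope xmlns:soap="http://www.w3.org/2001/09/soap-envelope">
  <soap:Body>
    <m:GetLastTradePriceResponse xmlns:m="Some-URI">
      <Price>34.5</Price>
    </m:GetLastTradePriceResponse>
  </soap:Body>
</soap:Envelope>"""

FAULT_XML = """<soap:Envelope xmlns:soap="http://www.w3.org/2001/09/soap-envelope">
  <soap:Body>
    <soap:Fault>
      <faultcode>soap:MustUnderstand</faultcode>
      <faultstring>SOAP Must Understand Error</faultstring>
    </soap:Fault>
  </soap:Body>
</soap:Envelope>"""


def read_response(xml_text):
    """Return ("fault", code, message) or ("ok", element_tag, fields)."""
    body = ET.fromstring(xml_text).find(f"{{{SOAP_NS}}}Body")
    fault = body.find(f"{{{SOAP_NS}}}Fault")
    if fault is not None:
        return ("fault", fault.findtext("faultcode"), fault.findtext("faultstring"))
    payload = body[0]
    return ("ok", payload.tag, {child.tag: child.text for child in payload})
```

The HTTP status (200 vs. 500) gives a first hint, but the Fault element inside the Body carries the actual error detail.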

Security and Features

In context of HTTP – builds on existing security: HTTPS, X.509 certificates

Developers explicitly choose which methods to expose

Extensibility - the major strength of SOAP

E.g. check the WS-* specifications: http://msdn.microsoft.com/webservices

Cf. WS-Security Roadmap

WS-Security Roadmap

[Figure: WS-Security Roadmap building blocks]

Security

Security Policy

Secure Conversation

Trust

Federation

Privacy

Authorization

SOAP Messaging

Discovering Web Services

Universal Description, Discovery, and Integration (UDDI) – Specifies what the API for a Web-based Registry looks like.

All about the “Yellow, White & Green Pages”

Defines how to run and operate Registry Sites on the Web

Defines how to pay for its Operation – encourages basic lookup services for free

Further Information at http://uddi.org

Registry Operation

Peer nodes (websites)

Companies register with any node

Registrations replicated on a daily basis

Complete set of “registered” records available at all nodes

Common set of SOAP APIs supported by all nodes

Compliance enforced by business contract

[Figure: queries go to UDDI.org peer nodes operated by IBM, Microsoft, Ariba, and others]

Why a DNS-like Model?

Enforces cross-platform compatibility across competitor platforms

Demonstration of trust and openness

Avoids tacit endorsement of any one vendor’s platform

May migrate to a third party

UDDI provides information…

Who – Business Information

What – Find the right Type of Business

Where – To Access a Service

How – Describes how a given Interface functions

Information provided at http://uddi.microsoft.com

UDDI – A Publisher View

UDDI and Web Services

[Figure: interactions between a Web Service Consumer, UDDI, and a Web Service Provider]

Find a Service: http://www.uddi.org, returns Link to discovery document

Discovery: http://yourservice.com, returns HTML with link to WSDL

How do we talk? (WSDL): http://yourservice.com/?WSDL, returns service descriptions (XML)

Let me talk to you (SOAP): http://yourservice.com/svc1, returns service response (XML)

UDDI and SOAP

[Figure: a User sends a UDDI SOAP Request to a UDDI Registry Node (HTTP Server, SOAP Processor, UDDI Registry Service, B2B Directory) and receives a UDDI SOAP Response]

Create, View, Update, and Delete registrations

Implementation-neutral

Registry APIs (SOAP)

Inquiry API

Find things: find_business, find_service, find_binding, find_tModel

Get Details about things: get_businessDetail, get_serviceDetail, get_bindingDetail, get_tModelDetail

Publishers API

Save things: save_business, save_service, save_binding, save_tModel

Delete things: delete_business, delete_service, delete_binding, delete_tModel

Security: get_authToken, discard_authToken
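
The split between an open Inquiry API and a token-protected Publishers API can be modelled with a toy in-memory registry. Only the operation names are borrowed from the list above; the Python signatures, the `ToyRegistry` class, and the data layout are assumptions, and the real operations are SOAP messages rather than method calls.

```python
import uuid


class ToyRegistry:
    """In-memory stand-in for a registry node (illustration only)."""

    def __init__(self, credentials):
        self._credentials = credentials  # user -> password
        self._tokens = set()
        self._businesses = {}            # business key -> record

    # --- security ---
    def get_authToken(self, user, password):
        if self._credentials.get(user) != password:
            raise PermissionError("unknown user or wrong password")
        token = str(uuid.uuid4())
        self._tokens.add(token)
        return token

    def discard_authToken(self, token):
        self._tokens.discard(token)

    def _require(self, token):
        if token not in self._tokens:
            raise PermissionError("invalid or discarded token")

    # --- publishers API: needs a token ---
    def save_business(self, token, name, description):
        self._require(token)
        key = str(uuid.uuid4())
        self._businesses[key] = {"name": name, "description": description}
        return key

    def delete_business(self, token, key):
        self._require(token)
        del self._businesses[key]

    # --- inquiry API: open to everyone ---
    def find_business(self, name_prefix):
        prefix = name_prefix.lower()
        return [k for k, b in self._businesses.items()
                if b["name"].lower().startswith(prefix)]

    def get_businessDetail(self, key):
        return dict(self._businesses[key])
```

A typical publisher session is: get a token, save or delete entries, discard the token. Inquiries need no token at all.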

Web Services Makes Sense For Grid Computing

[Figure: a Client requesting a Grid Service sends a SOAP Message over HTTP Transport, across a VO Boundary or Network, to the Grid Service Provider; the interface is described in WSDL]

Why Should HPC Folks Care About the Grid ?

1) Grid is a disruptive technology [Vision]

It ushers in a virtualized, collaborative, distributed world that our applications will use

2) Grid addresses pain points now [Reality]

Grids are built not bought, and are delivering real benefits

The computational demands of our applications are not going to get simpler

3) An open Grid is to our advantage [Future]

Standards are being defined now that will determine the future of this technology

The Globus Project™: Making Grid computing a reality

Close collaboration with real Grid projects in science and industry

Development and promotion of standard Grid protocols to enable interoperability and shared infrastructure

Development and promotion of standard Grid software APIs and SDKs to enable portability and code sharing

The Globus Toolkit™: Open source, reference software base for building grid infrastructure and applications

Global Grid Forum: Development of standard protocols and APIs for Grid computing

Some Important Definitions

Resource

Network protocol

Network enabled service

Application Programmer Interface (API)

Software Development Kit (SDK)

Syntax

Not discussed, but important: policies

Resource

An entity that is to be shared

E.g., computers, storage, data, software

Does not have to be a physical entity

E.g., Condor pool, distributed file system, …

Defined in terms of interfaces, not devices

E.g., schedulers such as LSF and PBS define a compute resource

Open/close/read/write define access to a distributed file system, e.g. NFS, AFS, DFS

Network Protocol

A formal description of message formats and a set of rules for message exchange

Rules may define sequence of message exchanges

Protocol may define state-change in endpoint, e.g., file system state change

Good protocols are designed to do one thing

Protocols can be layered

Examples of protocols: IP, TCP, TLS (was SSL), HTTP, Kerberos

Network Enabled Services

Implementation of a protocol that defines a set of capabilities

Protocol defines interaction with service

All services require protocols

Not all protocols are used to provide services (e.g. IP, TLS)

Examples: FTP and Web servers

[Figure: protocol stacks of two services]

Web Server: IP Protocol, TCP Protocol, TLS Protocol, HTTP Protocol

FTP Server: IP Protocol, TCP Protocol, FTP Protocol, Telnet Protocol

Application Programming Interface

A specification for a set of routines to facilitate application development

Refers to definition, not implementation

E.g., there are many implementations of MPI

Spec often language-specific (or IDL)

Routine name, number, order and type of arguments; mapping to language constructs

Behavior or function of routine

Examples: GSS API (security), MPI (message passing)

Software Development Kit

A particular instantiation of an API

SDK consists of libraries and tools

Provides implementation of API specification

Can have multiple SDKs for an API

Examples of SDKs: MPICH, Motif Widgets
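
The API/SDK distinction maps naturally onto an abstract interface with several implementations. In this sketch the interface `KeyValueAPI` and the two implementations are invented for illustration; they are not related to MPI, GSS API, or any real SDK.

```python
from abc import ABC, abstractmethod


class KeyValueAPI(ABC):
    """The API: routine names, arguments, and behavior, with no implementation."""

    @abstractmethod
    def put(self, key, value):
        """Store value under key, replacing any earlier value."""

    @abstractmethod
    def get(self, key):
        """Return the value stored under key."""


class DictSDK(KeyValueAPI):
    """One SDK: implements the API with a hash table."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data[key]


class ListSDK(KeyValueAPI):
    """Another SDK: same API, different internals (a list of pairs)."""

    def __init__(self):
        self._pairs = []

    def put(self, key, value):
        self._pairs = [(k, v) for k, v in self._pairs if k != key]
        self._pairs.append((key, value))

    def get(self, key):
        for k, v in self._pairs:
            if k == key:
                return v
        raise KeyError(key)


def application(store):
    """Application code is written against the API only."""
    store.put("job", "queued")
    store.put("job", "running")
    return store.get("job")
```

The application runs unchanged on either SDK, which is the portability that a standard API buys.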

Syntax

Rules for encoding information, e.g. XML, Condor ClassAds, Globus RSL

X.509 certificate format (RFC 2459)

Cryptographic Message Syntax (RFC 2630)

Distinct from protocols

One syntax may be used by many protocols (e.g., XML) and be useful for other purposes

Syntaxes may be layered

E.g., Condor ClassAds -> XML -> ASCII

Important to understand layerings when comparing or evaluating syntaxes
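
Layered syntaxes can be illustrated by encoding a small attribute list as XML and then as ASCII bytes. The XML element names `ad` and `attr` are made up for this sketch and are not the actual Condor ClassAd XML format.

```python
import xml.etree.ElementTree as ET


def ad_to_xml(ad):
    """Layer 1 -> 2: attribute/value pairs expressed in XML syntax."""
    root = ET.Element("ad")
    for name, value in ad.items():
        ET.SubElement(root, "attr", name=name).text = str(value)
    return ET.tostring(root, encoding="unicode")


def xml_to_ascii(xml_text):
    """Layer 2 -> 3: XML text expressed as ASCII bytes."""
    return xml_text.encode("ascii", errors="xmlcharrefreplace")


def ascii_to_ad(data):
    """Undo both layers to recover the attribute/value pairs."""
    root = ET.fromstring(data.decode("ascii"))
    return {attr.get("name"): attr.text for attr in root}


ad = {"OpSys": "LINUX", "Memory": "512"}
wire = xml_to_ascii(ad_to_xml(ad))
```

Comparing two syntaxes only makes sense at the same layer: the byte encoding and the attribute model answer different questions.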

A Protocol can have Multiple APIs

TCP/IP APIs include BSD sockets, Winsock, System V streams, …

The protocol provides interoperability: programs using different APIs can exchange information

I don’t need to know remote user’s API

[Figure: two Applications, one using the WinSock API and one using the Berkeley Sockets API, communicate over the TCP/IP Protocol (reliable byte streams)]

An API can have Multiple Protocols

MPI provides portability: any correct program compiles & runs on a platform

Does not provide interoperability: all processes must link against same SDK

E.g., MPICH and LAM versions of MPI

[Figure: two Applications both use the MPI API; one runs on the LAM SDK and LAM protocol over TCP/IP, the other on the MPICH-P4 SDK and MPICH-P4 protocol over TCP/IP. The two protocols use different message formats, exchange sequences, etc.]
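
Why a shared API does not guarantee interoperability can be shown with two toy SDKs that expose the same `encode`/`decode` calls but use different wire formats. Both classes and their formats are invented for this sketch and have nothing to do with the real LAM or MPICH protocols.

```python
import json
import struct


class JsonSDK:
    """Same API as BinarySDK, but messages travel as JSON text."""

    def encode(self, rank, payload):
        return json.dumps({"rank": rank, "payload": payload}).encode("utf-8")

    def decode(self, data):
        message = json.loads(data.decode("utf-8"))
        return message["rank"], message["payload"]


class BinarySDK:
    """Same API as JsonSDK, but messages travel as a packed binary header plus body."""

    def encode(self, rank, payload):
        body = payload.encode("utf-8")
        return struct.pack("!II", rank, len(body)) + body

    def decode(self, data):
        rank, length = struct.unpack("!II", data[:8])
        return rank, data[8:8 + length].decode("utf-8")


def interoperable(sender, receiver):
    """True if the receiver recovers exactly what the sender sent."""
    try:
        return receiver.decode(sender.encode(3, "hello")) == (3, "hello")
    except (ValueError, KeyError, struct.error):
        return False
```

Application code that calls only `encode` and `decode` is portable across both SDKs, yet two processes can exchange messages only if they were built on the same one.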

APIs and Protocols are Both Important

Standard APIs/SDKs are important

They enable application portability

But w/o standard protocols, interoperability is hard (every SDK speaks every protocol?)

Standard protocols are important

Enable cross-site interoperability

Enable shared infrastructure

But w/o standard APIs/SDKs, application portability is hard (different platforms access protocols in different ways)

Why Discuss Architecture?

Descriptive

Provide a common vocabulary for use when describing Grid systems

Guidance

Identify key areas in which services are required

Prescriptive

Define standard “Intergrid” protocols and APIs to facilitate creation of interoperable Grid systems and portable applications

One View of Requirements

Identity & authentication

Authorization & policy

Resource discovery

Resource characterization

Resource allocation

(Co-)reservation, workflow

Distributed algorithms

Remote data access

High-speed data transfer

Performance guarantees

Monitoring

Adaptation

Intrusion detection

Resource management

Accounting & payment

Fault management

System evolution

Etc.

Another View: “Three Obstacles to Making Grid Computing Routine”

1) New approaches to problem solving

Data Grids, distributed computing, peer-to-peer, collaboration grids, …

2) Structuring and writing programs

Abstractions, tools

3) Enabling resource sharing across distinct institutions

Resource discovery, access, reservation, allocation; authentication, authorization, policy; communication; fault detection and notification; …

The Systems Problem: Resource Sharing Mechanisms That …

Address security and policy concerns of resource owners and users

Are flexible enough to deal with many resource types and sharing modalities

Scale to large number of resources, many participants, many program components

Operate efficiently when dealing with large amounts of data & computation

Aspects of the Systems Problem

1) Need for interoperability when different groups want to share resources

Diverse components, policies, mechanisms

E.g., standard notions of identity, means of communication, resource descriptions

2) Need for shared infrastructure services to avoid repeated development, installation

E.g., one port/service/protocol for remote access to computing, not one per tool/application

E.g., Certificate Authorities: expensive to run

A common need for protocols & services

A Protocol-Oriented View of Grid Architecture That Emphasizes …

Development of Grid protocols & services

Protocol-mediated access to remote resources

New services: e.g., resource brokering

“On the Grid” = speak Intergrid protocols

Mostly (extensions to) existing protocols

Development of Grid APIs & SDKs

Interfaces to Grid protocols & services

Facilitate application development by supplying higher-level abstractions

The (hugely successful) model is the Internet

Layered Grid Architecture (By Analogy to Internet Architecture)

Grid layers, top to bottom:

Application

Collective (“Coordinating multiple resources”): ubiquitous infrastructure services, app-specific distributed services

Resource (“Sharing single resources”): negotiating access, controlling use

Connectivity (“Talking to things”): communication (Internet protocols) & security

Fabric (“Controlling things locally”): Access to, & control of, resources

Internet Protocol Architecture layers, top to bottom: Application, Transport, Internet, Link

Protocols, Services, and APIs Occur at Each Level

[Figure: layer stack, top to bottom]

Applications

Languages/Frameworks

Collective Service APIs and SDKs

Collective Service Protocols

Collective Services

Resource APIs and SDKs

Resource Service Protocols

Resource Services

Connectivity APIs

Connectivity Protocols

Local Access APIs and Protocols

Fabric Layer

Important Points

Built on Internet protocols & services

Communication, routing, name resolution, etc.

“Layering” here is conceptual, does not imply constraints on who can call what

Protocols/services/APIs/SDKs will, ideally, be largely self-contained

Some things are fundamental: e.g., communication and security

But, advantageous for higher-level functions to use common lower-level functions

The Hourglass Model

Focus on architecture issues

Propose set of core services as basic infrastructure

Use to construct high-level, domain-specific solutions

Design principles

Keep participation cost low

Enable local control

Support for adaptation

“IP hourglass” model

[Figure: hourglass with Applications and diverse global services at the top, core services at the neck, and local OS at the bottom]

Where Are We With Architecture?

No “official” standards exist

But: Globus Toolkit™ has emerged as the de facto standard for several important Connectivity, Resource, and Collective protocols

GGF has an architecture working group

Technical specifications are being developed for architecture elements: e.g., security, data, resource management, information

Internet drafts submitted in security area

Fabric Layer: Protocols & Services

Just what you would expect: the diverse mix of resources that may be shared

Individual computers, Condor pools, file systems, archives, metadata catalogs, networks, sensors, etc., etc.

Few constraints on low-level technology: connectivity and resource level protocols form the “neck in the hourglass”

Defined by interfaces not physical characteristics

Connectivity Layer: Protocols & Services

Communication

Internet protocols: IP, DNS, routing, etc.

Security: Grid Security Infrastructure (GSI)

Uniform authentication, authorization, and message protection mechanisms in multi-institutional setting

Single sign-on, delegation, identity mapping

Public key technology, SSL, X.509, GSS-API

Supporting infrastructure: Certificate Authorities, certificate & key management, …

GSI: www.gridforum.org/security

Resource Layer: Protocols & Services

Grid Resource Allocation Mgmt (GRAM)

Remote allocation, reservation, monitoring, control of compute resources

GridFTP protocol (FTP extensions)

High-performance data access & transport

Grid Resource Information Service (GRIS)

Access to structure & state information

Network reservation, monitoring, control

All built on connectivity layer: GSI & IP

GridFTP: www.gridforum.org

GRAM, GRIS: www.globus.org

Collective Layer: Protocols & Services

Index servers aka metadirectory services

Custom views on dynamic resource collections assembled by a community

Resource brokers (e.g., Condor Matchmaker)

Resource discovery and allocation

Replica catalogs

Replication services

Co-reservation and co-allocation services

Workflow management services

Etc.
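
What a resource broker does can be sketched as matchmaking between a job's requirements and resource descriptions. This is a simplified illustration, not the Condor Matchmaker: the attribute names (`os`, `memory_mb`, `state`, `mips`) and the ranking rule are assumptions.

```python
def matches(job, resource):
    """A resource qualifies if it is idle and meets the job's requirements."""
    return (
        resource["state"] == "idle"
        and resource["os"] == job["os"]
        and resource["memory_mb"] >= job["min_memory_mb"]
    )


def broker(job, resources):
    """Discovery and allocation: pick the fastest qualifying resource, or None."""
    candidates = [r for r in resources if matches(job, r)]
    return max(candidates, key=lambda r: r["mips"], default=None)


resources = [
    {"name": "node-a", "os": "LINUX", "memory_mb": 512, "state": "idle", "mips": 900},
    {"name": "node-b", "os": "LINUX", "memory_mb": 2048, "state": "idle", "mips": 1500},
    {"name": "node-c", "os": "LINUX", "memory_mb": 4096, "state": "busy", "mips": 3000},
]
job = {"os": "LINUX", "min_memory_mb": 1024}
```

The broker sits in the Collective layer because it reasons over many resources at once, while the actual job submission would go through a Resource-layer protocol.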

Example: High-Throughput Computing System

App: High Throughput Computing System

Collective (App): Dynamic checkpoint, job management, failover, staging

Collective (Generic): Brokering, certificate authorities

Resource: Access to data, access to computers, access to network performance data

Connect: Communication, service discovery (DNS), authentication, authorization, delegation

Fabric: Storage systems, schedulers

[Figure: Compute Resource (SDK, API, Access Protocol) and Checkpoint Repository (SDK, API, C-point Protocol)]

Example: Data Grid Architecture

App: Discipline-Specific Data Grid Application

Collective (App): Coherency control, replica selection, task management, virtual data catalog, virtual data code catalog, …

Collective (Generic): Replica catalog, replica management, co-allocation, certificate authorities, metadata catalogs

Resource: Access to data, access to computers, access to network performance data, …

Connect: Communication, service discovery (DNS), authentication, authorization, delegation

Fabric: Storage systems, clusters, networks, network caches, …

The Programming Problem

But how do I develop robust, secure, long-lived, well-performing applications for dynamic, heterogeneous Grids?

I need, presumably:

Abstractions and models to add to speed/robustness/etc. of development

Tools to ease application development and diagnose common problems

Code/tool sharing to allow reuse of code components developed by others

Grid Programming Technologies

“Grid applications” are incredibly diverse (data, collaboration, computing, sensors, …)

Seems unlikely there is one solution

Most applications have been written “from scratch,” with or without Grid services

Application-specific libraries have been shown to provide significant benefits

No new language, programming model, etc., has yet emerged that transforms things

But certainly still quite possible

Examples of Grid Programming Technologies

MPICH-G2: Grid-enabled message passing

CoG Kits, GridPort: Portal construction, based on N-tier architectures

GDMP, Data Grid Tools, SRB: replica management, collection management

Condor-G: workflow management

Legion: object models for Grid computing

Cactus: Grid-aware numerical solver framework

Note tremendous variety, application focus

MPICH-G2: A Grid-Enabled MPI

A complete implementation of the Message Passing Interface (MPI) for heterogeneous, wide area environments

Based on the Argonne MPICH implementation of MPI (Gropp and Lusk)

Requires services for authentication, resource allocation, executable staging, output, etc.

Programs run in wide area without change

See also: MetaMPI, PACX, STAMPI, MAGPIE

www.globus.org/mpi

Cactus (Allen, Dramlitsch, Seidel, Shalf, Radke)

Modular, portable framework for parallel, multidimensional simulations

Construct codes by linking:

Small core (flesh): management services

Selected modules (thorns): numerical methods, grids & domain decomps, visualization and steering, etc.

Custom linking/configuration tools

Developed for astrophysics, but not astrophysics-specific

[Figure: the Cactus “flesh” with thorns attached]

www.cactuscode.org

High-Throughput Computing and Condor

High-throughput computing: CPU cycles/day (week, month, year?) under non-ideal circumstances

“How many times can I run simulation X in a month using all available machines?”

Condor converts collections of distributively owned workstations and dedicated clusters into a distributed high-throughput computing facility

Emphasis on policy management and reliability

Object-Based Approaches

Grid-enabled CORBA (NASA Lewis, Rutgers, ANL, others)

CORBA wrappers for Grid protocols

Some initial successes

Legion (U. Virginia)

Object models for Grid components (e.g., “vault”=storage, “host”=computer)

Portals

N-tier architectures enabling thin clients, with middle tiers using Grid functions

Thin clients = Web browsers

Middle tier = e.g. Java Server Pages, with Java CoG Kit, GPDK, GridPort utilities

Bottom tier = various Grid resources

Numerous applications and projects, e.g. Unicore, Gateway, Discover, Mississippi Computational Web Portal, NPACI Grid Port, Lattice Portal, Nimrod-G, Cactus, NASA IPG Launchpad, Grid Resource Broker, …

Common Toolkit Underneath

Each of these programming environments should not have to implement the protocols and services from scratch!

Rather, want to share common code that…

Implements core functionality

Software Development Kits (SDKs) that can be used to construct a large variety of services and clients

Standard services that can be easily deployed

Is robust, well-architected, self-consistent

Is open source, with broad input

General Approach

Define Grid protocols & APIs

Protocol-mediated access to remote resources

Integrate and extend existing standards

“On the Grid” = speak “Intergrid” protocols

Develop a reference implementation

Client and server SDKs, services, tools, etc.

Grid-enable wide variety of tools

Learn through deployment and applications

Globus Toolkit™

A software toolkit addressing key technical problems in the development of Grid enabled tools, services, and applications

Offer a modular “bag of technologies”

Enable incremental development of grid-enabled tools and applications

Implement standard Grid protocols and APIs

Make available under liberal open source license

Current version is 4.0, commonly referred to as GT4

Key Concepts for GT4

OGSA, WSRF, and GT4: these are the basic architecture components for GT4

Open Grid Services Architecture (OGSA)

Web Services:

OGSA, WSRF, and GT4 are based on standard Web Services technologies such as SOAP and WSDL.

Need to be familiar with the Web Services architecture and languages.

The Web Services Resource Framework: WSRF is the core of GT4.

Key Concepts for GT4 (cont)

The GT4 Architecture: Based on WS-Resources and Web Services, and grid computing

Java & XML: to use GT4, you need to be able to program in Java and to understand basic XML.

OGSA Key Requirements

Interoperability and Support for Dynamic and Heterogeneous Environments

Resource Sharing Across Organizations

Optimization

Quality of Service (QoS) Assurance

Job Execution

Data Services

Security

Administrative Cost Reduction

Scalability

Availability

Ease of Use and Extensibility

OGSA Defines Basic Capabilities

Infrastructure Services

Execution Management Services

Data Services

Resource Management Services

Security Services

Self-Management Services

Information Services

Security Considerations

OGSA, WSRF, and GT4

GT4 Roadmap

History and Motivation

Do we want standard APIs? E.g., MPI (Message Passing Interface)

But on the grid, we actually want standard wire protocols

The API can be different on each system

History and Motivation (cont)

Open Grid Services Infrastructure (OGSI)

Global Grid Forum (GGF) standard

Identified a number of common ‘building blocks’ used in grid protocols

Inspecting state, creating and removing state, detecting changes in state, naming state

Defined standard ways to do these things, based on Web services (defined a thing called a Grid Service)

History and Motivation (cont)

But then…

Realized that this was useful for Web services in general, not just for the grid.

Moved out of GGF, into OASIS

Split the single OGSI specification into a number of other specifications called WSRF.

Globus Toolkit

Grid infrastructure software

Four key protocols:

Security/Authentication (GSI)

Resource Management/Scheduling (GRAM)

Resource description (GRIS/GIIS)

Data/File transfer (GASS, GridFTP)

Grid Security Infrastructure (GSI)

Security Terminology

Authentication: Establishing identity

Authorization: Establishing rights

Message protection:

Message integrity

Message confidentiality

Non-repudiation

Digital signature

Accounting

Certificate Authority (CA)

Why Grid Security is Hard

Resources being used may be valuable & the problems being solved sensitive

Resources are often located in distinct administrative domains

Each resource has own policies & procedures

Set of resources used by a single computation may be large, dynamic, and unpredictable

Not just client/server, requires delegation

It must be broadly available & applicable

Standard, well-tested, well-understood protocols; integrated with wide variety of tools

GSI in Action: “Create Processes at A and B that Communicate & Access Files at C”

[Figure: GSI in action across Site A (Kerberos), Site B (Unix), and Site C (Kerberos).]

User: single sign-on via “grid-id” and generation of a proxy credential, or retrieval of a proxy credential from an online repository

User proxy (holding the proxy credential): sends remote process creation requests* to the GSI-enabled GRAM servers at Sites A and B

GSI-enabled GRAM server at Site A: authorize, map to local id, create process, generate credentials. The process runs on a computer under a local id, with a Kerberos ticket and a restricted proxy

GSI-enabled GRAM server at Site B: ditto. The process runs on a computer under a local id, with a restricted proxy

The two processes communicate* with each other

Remote file access request* to the GSI-enabled FTP server at Site C (storage system): authorize, map to local id, access file

* With mutual authentication

Grid Security Requirements

User View

1) Easy to use

2) Single sign-on

3) Run applications: ftp, ssh, MPI, Condor, Web, …

4) User based trust model

5) Proxies/agents (delegation)

Resource Owner View

1) Specify local access control

2) Auditing, accounting, etc.

3) Integration w/ local system: Kerberos, AFS, license mgr.

4) Protection from compromised resources

Developer View

API/SDK with authentication, flexible message protection, flexible communication, delegation, ...

Direct calls to various security functions (e.g. GSS-API)

Or security integrated into higher-level SDKs: e.g. GlobusIO, Condor-G, MPICH-G2, HDF5, etc.

Candidate Standards

Kerberos 5: fails to meet requirements:

Integration with various local security solutions

User based trust model

Transport Layer Security (TLS/SSL): fails to meet requirements:

Single sign-on

Delegation

Grid Security Infrastructure (GSI)

Extensions to standard protocols & APIs

Standards: SSL/TLS, X.509 & CA, GSS-API

Extensions for single sign-on and delegation

Globus Toolkit reference implementation of GSI

SSLeay/OpenSSL + GSS-API + SSO/delegation

Tools and services to interface to local security

Simple ACLs; SSLK5/PKINIT for access to K5, AFS; …

Tools for credential management

Login, logout, etc.

Smartcards

MyProxy: Web portal login and delegation

K5cert: Automatic X.509 certificate creation

Review of Public Key Cryptography

Asymmetric keys

A private key is used to encrypt data.

A public key can decrypt data encrypted with the private key.

An X.509 certificate includes…

Someone’s subject name (user ID)

Their public key

A “signature” from a Certificate Authority (CA) that:

Proves that the certificate came from the CA

Vouches for the subject name

Vouches for the binding of the public key to the subject

Public Key Based Authentication

User sends certificate over the wire.

Other end sends user a challenge string.

User encodes the challenge string with private key

Possession of private key means you can authenticate as subject in certificate

Public key is used to decode the challenge.

If you can decode it, you know the subject

Treat your private key carefully!!

Private key is stored only in well-guarded places, and only in encrypted form
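
The challenge/response exchange above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: it uses a toy "textbook" RSA key pair built from two small Mersenne primes, with no padding, and the function names are hypothetical. It is not how GSI or any real SSL/TLS library implements authentication.

```python
import hashlib
import secrets

# Toy RSA key pair built from two well-known Mersenne primes.
# Far too weak and too simplistic (no padding) for real use.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q                             # public modulus
E = 65537                             # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))     # private exponent (needs Python 3.8+)


def _digest(data: bytes) -> int:
    """Hash the data and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def encode_challenge(challenge: bytes) -> int:
    """User side: transform the challenge with the private key."""
    return pow(_digest(challenge), D, N)


def check_response(challenge: bytes, response: int) -> bool:
    """Verifier side: undo the transform with the public key from the certificate."""
    return pow(response, E, N) == _digest(challenge)


challenge = secrets.token_bytes(16)          # sent by the other end
response = encode_challenge(challenge)       # computed by the user
print(check_response(challenge, response))   # True
print(check_response(b"a different challenge", response))
```

Only the holder of the private exponent can produce a response that checks out against the public key, which is why possession of the private key is treated as proof of identity.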

X.509 Proxy Certificate

Defines how a short term, restricted credential can be created from a normal, long-term X.509 credential

A “proxy certificate” is a special type of X.509 certificate that is signed by the normal end entity cert, or by another proxy

Supports single sign-on & delegation through “impersonation”

Currently an IETF draft

User Proxies

Minimize exposure of user’s private key

A temporary, X.509 proxy credential for use by our computations

We call this a user proxy certificate

Allows process to act on behalf of user

User-signed user proxy cert stored in local file

Created via “grid-proxy-init” command

Proxy’s private key is not encrypted

Rely on file system security; proxy certificate file must be readable only by the owner
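
Because the proxy's private key is stored unencrypted, the file permission rule matters in practice. The following sketch shows one way a script might check it on a POSIX system. The file name is a hypothetical stand-in created in a temporary directory; it is not the location that grid-proxy-init actually uses.

```python
import os
import stat
import tempfile


def readable_only_by_owner(path: str) -> bool:
    """Return True if the path is a regular file with no group/other permission bits."""
    mode = os.stat(path).st_mode
    return stat.S_ISREG(mode) and not mode & (stat.S_IRWXG | stat.S_IRWXO)


def demo():
    """Create a stand-in credential file and check it under two permission settings."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "proxy_credential.pem")  # hypothetical name
        with open(path, "w") as handle:
            handle.write("placeholder, not a real credential\n")
        os.chmod(path, 0o600)                 # owner read/write only
        private = readable_only_by_owner(path)
        os.chmod(path, 0o644)                 # group and others may read
        shared = readable_only_by_owner(path)
    return private, shared


print(demo())
```

A tool that loads a proxy credential could refuse to proceed when such a check fails, rather than use a key that other local users may have copied.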

Delegation

Remote creation of a user proxy

Results in a new private key and X.509 proxy certificate, signed by the original key

Allows remote process to act on behalf of the user

Avoids sending passwords or private keys across the network
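
Delegation produces a chain: the user's long-term certificate, then a local proxy, then a remotely created proxy. The sketch below models only the bookkeeping of such a chain. It checks no signatures, and the rules it encodes (each proxy names its parent as issuer, extends the parent's subject name by one component, and expires no later than the parent) are simplifying assumptions for illustration. All names and lifetimes are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Credential:
    subject: str
    issuer: str
    not_after: float        # expiry time, seconds since some reference point
    is_proxy: bool = False


def chain_is_plausible(chain, now):
    """Check the structural rules of a delegation chain (no cryptography here)."""
    if not chain or chain[0].is_proxy:
        return False                     # must start from an end-entity cert
    for parent, child in zip(chain, chain[1:]):
        if not child.is_proxy:
            return False
        if child.issuer != parent.subject:
            return False                 # proxy must be issued by its parent
        if not child.subject.startswith(parent.subject + "/"):
            return False                 # proxy name must extend the parent's name
        if child.not_after > parent.not_after:
            return False                 # proxy may not outlive its parent
    return all(cred.not_after > now for cred in chain)


user = Credential("/O=Grid/CN=Jane Doe", "/O=Grid/CN=Example CA", 1_000_000)
local = Credential(user.subject + "/CN=proxy", user.subject, 43_200, True)
remote = Credential(local.subject + "/CN=proxy", local.subject, 3_600, True)
forged = Credential("/O=Grid/CN=Mallory/CN=proxy", user.subject, 3_600, True)

print(chain_is_plausible([user, local, remote], now=0))      # True
print(chain_is_plausible([user, local, remote], now=4_000))  # False, remote proxy expired
print(chain_is_plausible([user, forged], now=0))             # False, name does not extend parent
```

Note that the remote process only ever holds the short-lived key at the end of the chain, which is the point of delegation: the user's long-term private key never crosses the network.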

Globus Security APIs

Generic Security Service (GSS) API

IETF standard

Provides functions for authentication, delegation, message protection

Decoupled from any particular communication method

But GSS-API is somewhat complicated, so we also provide the easier-to-use globus_gss_assist API.

GSI-enabled SASL is also provided

Results

GSI adopted by 100s of sites, 1000s of users

Globus CA has issued >3000 certs (user & host), >1500 currently active; other CAs active

Rollouts are currently underway all over:

NSF Teragrid, NASA Information Power Grid, DOE Science Grid, European Data Grid, etc.

Integrated in research & commercial apps

GrADS testbed, Earth Systems Grid, European Data Grid, GriPhyN, NEESgrid, etc.

Standardization begun in Global Grid Forum, IETF

GSI Applications

Globus Toolkit™ uses GSI for authentication

Many Grid tools, directly or indirectly, e.g. Condor-G, SRB, MPICH-G2, Cactus, GDMP, …

Commercial and open source tools, e.g. ssh, ftp, cvs, OpenLDAP, OpenAFS

SecureCRT (Win32 ssh client)

And since we use standard X.509 certificates, they can also be used for Web access, LDAP server access, etc.

Ongoing and Future GSI Work

Protection against compromised resources

Restricted delegation, smartcards

Standardization

Scalability in numbers of users & resources

Credential management

Online credential repositories (“MyProxy”)

Account management

Authorization

Policy languages

Community authorization

Proxy Certificate Standards Work

“Internet Public Key Infrastructure X.509 Proxy Certificate Profile”

draft-ietf-pkix-proxy-01.txt

Draft being considered by IETF PKIX working group, and by GGF GSI working group

Defines proxy certificate format, including restricted rights and delegation tracing

Demonstrated a prototype of restricted proxies at HPDC (August 2001) as part of CAS demo

GSS-API Extensions Work

4 years of GSS-API experience, while on the whole quite positive, has shed light on various deficiencies of GSS-API

“GSS-API Extensions” draft-ggf-gss-extensions-04.txt

Draft being considered by GGF GSI working group. Not yet submitted to IETF.

Defines extensions to the GSS-API to better support Grid security

GSS-API Extensions

Credential export/import

Allows delegated credentials to be externalized

Used for checkpointing a service

Delegation at any time, in either direction

Richer options for use of delegation

Restricted delegation handling

Add proxy restrictions to delegated cred

Inspect auth cert for restrictions

Allow better mapping of GSS to TLS

Support TLS framing of messages

Community Authorization Service

Question: How does a large community grant its users access to a large set of resources?

Should minimize burden on both the users and resource providers

Community Authorization Service (CAS)

Community negotiates access to resources

Resource outsources fine-grain authorization to CAS

Resource only knows about “CAS user” credential

CAS handles user registration, group membership…

User who wants access to resource asks CAS for a capability credential

Restricted proxy of the “CAS user” cred., checked by resource
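
The division of labor described above can be sketched as a two-part check on the resource side: local policy decides what the community as a whole may do, and the capability decides what this particular user was granted by the CAS. The data model below (class name, fields, community and resource names) is entirely hypothetical and omits all cryptographic verification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    community: str        # the "CAS user" identity the resource knows about
    holder: str           # the user the CAS issued this capability to
    rights: frozenset     # (resource name, operation) pairs granted by the CAS


def resource_decision(capability, requester, resource, operation, local_policy):
    """Resource-side check: community allowed locally AND request within the capability."""
    if capability.holder != requester:
        return False
    community_rights = local_policy.get(capability.community, frozenset())
    if (resource, operation) not in community_rights:
        return False          # "Is this request authorized for the CAS?"
    return (resource, operation) in capability.rights   # "...by the capability?"


# The resource's local policy mentions only the community, never individual users.
local_policy = {"physics-community": frozenset({("dataset1", "read"), ("dataset1", "write")})}
capability = Capability("physics-community", "jdoe", frozenset({("dataset1", "read")}))

print(resource_decision(capability, "jdoe", "dataset1", "read", local_policy))   # True
print(resource_decision(capability, "jdoe", "dataset1", "write", local_policy))  # False
print(resource_decision(capability, "eve", "dataset1", "read", local_policy))    # False
```

The resource never needs a per-user entry, which is how the scheme keeps the burden low for resource providers as the community grows.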

Community Authorization (Prototype shown August 2001)

[Figure: message flow between User, CAS, and Resource.]

1. CAS request, with resource names and operations (User to CAS)

CAS: Does the collective policy authorize this request for this user? Uses user/group membership, resource/collective membership, and collective policy information

2. CAS reply, with capability and resource CA info (CAS to User)

3. Resource request, authenticated with capability (User to Resource)

Resource: Is this request authorized for the CAS? Is this request authorized by the capability? Uses local policy information

4. Resource reply (Resource to User)

Community Authorization Service

CAS provides user community with information needed to authenticate resources

Sent with capability credential, used on connection with resource

Resource identity (DN), CA

This allows new resources/users (and their CAs) to be made available to a community through the CAS without action on the other user’s/resource’s part

Authorization API

Service providers need to perform authorization policy evaluation on:

Local policies

Policies contained in restricted proxies

We are working on 2 API layers:

Low level GAA-API implementation for evaluation of policies

High level, very simple authorization API that can easily be embedded into services

Still in early prototyping stage

Passport Online CA & MyProxy

Requiring users to manage their own certs and keys is annoying and error prone

A solution: Leverage Passport global authentication to obtain a proxy credential

Passport provides:

Globally unique user name (email address)

Method of verifying ownership of the name (authentication)

Re-issuance (e.g. forgotten password)

Passport credentials can be presented to an online CA or credential repository

Creates and issues new (restricted) proxy certificate to the user on demand

Other Future Security Work

Ease-of-use

Improved error messages, online CA, etc.

Improved online credential repositories

See MyProxy paper at HPDC

Support for multiple user credentials

Multi-factor authentication

Subordinate certificate authorities for domains

Ease issuance of host certs for domains

Independent Data Unit Support

Security Summary

GSI successfully addresses wide variety of Grid security issues

Broad acceptance, deployment, integration with tools

Standardization on-going in IETF & GGF

Ongoing R&D to address next set of issues

For more information:

www.globus.org/research/papers.html

“A Security Architecture for Computational Grids”

“Design and Deployment of a National-Scale Authentication Infrastructure”

www.gridforum.org/security

Grid Resource Allocation Management (GRAM)

The Challenge

Enabling secure, controlled remote access to heterogeneous computational resources and management of remote computation

Authentication and authorization

Resource discovery & characterization

Reservation and allocation

Computation monitoring and control

Addressed by new protocols & services

GRAM protocol as a basic building block

Resource brokering & co-allocation services

GSI for security, MDS for discovery

Resource Management

The Grid Resource Allocation Management (GRAM) protocol and client API allow programs to be started on remote resources, despite local heterogeneity

Resource Specification Language (RSL) is used to communicate requirements

A layered architecture allows application-specific resource brokers and co-allocators to be defined in terms of GRAM services

Integrated with Condor, PBS, MPICH-G2, …

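Resource Management Architecture

[Figure: resource management architecture. The application passes RSL to a broker, which performs RSL specialization using queries to, and info from, the information service. The resulting ground RSL goes to a co-allocator, which sends simple ground RSL to GRAM instances, each layered on a local resource manager (LSF, Condor, NQE).]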

Resource Specification Language

Common notation for exchange of information between components

Syntax similar to MDS/LDAP filters

RSL provides two types of information:

Resource requirements: machine type, number of nodes, memory, etc.

Job configuration: directory, executable, args, environment

Globus Toolkit provides an API/SDK for manipulating RSL

RSL Syntax

Elementary form: parenthesized clauses

(attribute op value [ value … ])

Operators supported: <, <=, =, >=, >, !=

Some supported attributes:

executable, arguments, environment, stdin, stdout, stderr, resourceManagerContact, resourceManagerName

Unknown attributes are passed through

May be handled by subsequent tools

Constraints: “&”

For example:

& (count>=5) (count<=10)

(max_time=240) (memory>=64)

(executable=myprog)

“Create 5-10 instances of myprog, each on a machine with at least 64 MB memory that is available to me for 4 hours”
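
To make the clause syntax concrete, here is a small Python sketch that assembles the conjunction above from individual relations and splits a flat conjunction back into its parts. The helper names are hypothetical and this is not the Globus RSL API; it handles only simple, unquoted, non-nested clauses.

```python
import re

# Relational operators listed on the RSL syntax slide.
OPERATORS = ("<=", ">=", "!=", "<", ">", "=")

# One flat clause such as (count>=5); two-character operators are tried first.
CLAUSE = re.compile(r"\(\s*(\w+)\s*(<=|>=|!=|<|>|=)\s*([^()\s]+)\s*\)")


def relation(attribute, op, value):
    """Format one parenthesized RSL clause such as (count>=5)."""
    if op not in OPERATORS:
        raise ValueError(f"unsupported operator: {op}")
    return f"({attribute}{op}{value})"


def conjunction(*clauses):
    """Join clauses under the '&' (all constraints must hold) operator."""
    return "&" + "".join(clauses)


def parse_flat(rsl):
    """Split a flat conjunction back into (attribute, operator, value) triples."""
    return CLAUSE.findall(rsl)


request = conjunction(
    relation("count", ">=", 5),
    relation("count", "<=", 10),
    relation("max_time", "=", 240),
    relation("memory", ">=", 64),
    relation("executable", "=", "myprog"),
)
print(request)  # &(count>=5)(count<=10)(max_time=240)(memory>=64)(executable=myprog)
print(parse_flat(request)[0])  # ('count', '>=', '5')
```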

Disjunction: “|”

For example:

& (executable=myprog)

( | (&(count=5)(memory>=64))

(&(count=10)(memory>=32)))

Create 5 instances of myprog on a machine that has at least 64MB of memory, or 10 instances on a machine with at least 32MB of memory
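
One way to picture how a broker might test such a request against a candidate machine is to hold the expression as a nested tree and evaluate it recursively. The sketch below does that for the example above. The tree encoding and the machine description are assumptions made for illustration; attributes that do not describe the machine (count, executable) are treated as satisfied here, loosely echoing the "unknown attributes are passed through" rule.

```python
import operator

OPS = {"<": operator.lt, "<=": operator.le, "=": operator.eq,
       ">=": operator.ge, ">": operator.gt, "!=": operator.ne}


def satisfied(node, machine):
    """Evaluate an RSL-like tree against a dict of machine attributes."""
    kind = node[0]
    if kind == "&":                       # conjunction: every child must hold
        return all(satisfied(child, machine) for child in node[1:])
    if kind == "|":                       # disjunction: any child may hold
        return any(satisfied(child, machine) for child in node[1:])
    _, attribute, op, value = node        # leaf: ("rel", attribute, op, value)
    if attribute not in machine:
        return True                       # not a machine property: pass through
    return OPS[op](machine[attribute], value)


request = ("&",
           ("rel", "executable", "=", "myprog"),
           ("|",
            ("&", ("rel", "count", "=", 5), ("rel", "memory", ">=", 64)),
            ("&", ("rel", "count", "=", 10), ("rel", "memory", ">=", 32))))

print(satisfied(request, {"memory": 128}))  # True, either alternative fits
print(satisfied(request, {"memory": 48}))   # True, only the 10-instance alternative fits
print(satisfied(request, {"memory": 16}))   # False, neither alternative fits
```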

GRAM Protocol Evolution

GRAM-1: Simple HTTP-based RPC

Job request

Returns a “job contact”: opaque string that can be passed between clients, for access to job

Job cancel, status, signal

Event notification (callbacks) for state changes

Pending, active, done, failed, suspended

GRAM-1.5 (U Wisconsin contribution)

Add reliability improvements

Once-and-only-once submission

Recoverable job manager service

Reliable termination detection

GRAM-2: Moving to Web Services (SOAP)…
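
The GRAM-1 idea of a job contact plus callbacks on state changes can be sketched as a small observer pattern. The five states come from the list above; the class, method names, and contact string are hypothetical and do not reproduce the GRAM client API.

```python
from enum import Enum


class JobState(Enum):
    PENDING = "pending"
    ACTIVE = "active"
    DONE = "done"
    FAILED = "failed"
    SUSPENDED = "suspended"


class Job:
    """A job identified by an opaque contact string, notifying listeners of state changes."""

    def __init__(self, contact):
        self.contact = contact
        self.state = JobState.PENDING
        self._callbacks = []

    def register_callback(self, callback):
        self._callbacks.append(callback)

    def set_state(self, new_state):
        if new_state is self.state:
            return                        # no change, so no notification
        old_state, self.state = self.state, new_state
        for callback in self._callbacks:
            callback(self.contact, old_state, new_state)


events = []
job = Job("https://gram.example.org:2119/12345/")   # hypothetical contact string
job.register_callback(lambda contact, old, new: events.append((old.value, new.value)))
job.set_state(JobState.ACTIVE)
job.set_state(JobState.ACTIVE)    # duplicate report, ignored
job.set_state(JobState.DONE)
print(events)  # [('pending', 'active'), ('active', 'done')]
```

Because the contact is just a string, any client that receives it could in principle register interest in the same job, which is what "can be passed between clients" implies.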

Globus Toolkit Implementation

Gatekeeper

Single point of entry

Authenticates user, maps to local security environment, runs service

In essence, a “secure inetd”

Job manager

A gatekeeper service

Layers on top of local resource management system (e.g., PBS, LSF, etc.)

Handles remote interaction with the job
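
The gatekeeper's "maps to local security environment" step amounts to looking up the authenticated certificate subject in a site-maintained table of local accounts. The sketch below assumes a simple text format with a quoted distinguished name followed by a local user name on each line; the entries and function names are invented for illustration and the format details are an assumption, not a specification.

```python
import shlex

# Hypothetical mapping table: quoted certificate subject, then local account.
GRIDMAP_TEXT = '''
# certificate subject                      local account
"/O=Grid/OU=Example/CN=Jane Doe" jdoe
"/O=Grid/OU=Example/CN=John Smith" jsmith
'''


def parse_gridmap(text):
    """Return a dict from certificate subject to local account name."""
    mapping = {}
    for line in text.splitlines():
        fields = shlex.split(line, comments=True)   # honors quotes, drops # comments
        if len(fields) >= 2:
            mapping[fields[0]] = fields[1]
    return mapping


def local_account(subject, mapping):
    """Return the local account for an authenticated subject, or None to refuse."""
    return mapping.get(subject)


gridmap = parse_gridmap(GRIDMAP_TEXT)
print(local_account("/O=Grid/OU=Example/CN=Jane Doe", gridmap))   # jdoe
print(local_account("/O=Grid/OU=Example/CN=Mallory", gridmap))    # None
```

Keeping this table local is what lets each site retain control over who may run jobs, even though authentication itself is global.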

GRAM Components

[Figure: GRAM components and their interactions across a site boundary.]

Client: MDS client API calls to locate resources (MDS: Grid Index Info Server)

Client: MDS client API calls to get resource info (MDS: Grid Resource Info Server)

Client: GRAM client API calls to request resource allocation and process creation (to the Gatekeeper); GRAM client API state change callbacks (from the Job Manager)

Gatekeeper: uses the Grid Security Infrastructure; creates the Job Manager

Job Manager: parses the request with the RSL Library; asks the Local Resource Manager to allocate & create processes; monitors & controls the processes

MDS: Grid Resource Info Server: queries current status of resource

Co-allocation

Simultaneous allocation of a resource set

Handled via optimistic co-allocation based on free nodes or queue prediction

In the future, advance reservations will also be supported (already in prototype)

Globus APIs/SDKs support the co-allocation of specific multi-requests

Uses a Globus component called the Dynamically Updated Request Online Co-allocator (DUROC)
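
The all-or-nothing flavor of co-allocation can be illustrated with a toy allocator: try each resource in turn and release everything if any one request cannot be met. This is only a sketch of the general idea under simplified assumptions (a static table of free nodes, hypothetical site names); it is not DUROC's actual algorithm or interface.

```python
def co_allocate(requests, free_nodes):
    """Grant every request or none. Both arguments map resource name to node count."""
    granted = {}
    for resource, count in requests.items():
        if free_nodes.get(resource, 0) < count:
            # One subrequest cannot be met: give back everything taken so far.
            for name, taken in granted.items():
                free_nodes[name] += taken
            return None
        free_nodes[resource] -= count
        granted[resource] = count
    return granted


free = {"site_a": 8, "site_b": 2}
first = co_allocate({"site_a": 5, "site_b": 2}, free)
second = co_allocate({"site_a": 3, "site_b": 1}, free)
print(first)   # {'site_a': 5, 'site_b': 2}
print(second)  # None, site_b has no free nodes left
print(free)    # {'site_a': 3, 'site_b': 0}
```

The approach is "optimistic" in that it acts on the current view of free nodes, which may be stale by the time the requests reach the sites; advance reservations are the remedy the slide mentions.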

Multirequest: “+”

A multirequest allows us to specify multiple resource needs, for example

+ (& (count=5)(memory>=64)

(executable=p1))

(&(network=atm) (executable=p2))

Execute 5 instances of p1 on a machine with at least 64 MB of memory

Execute p2 on a machine with an ATM connection

Multirequests are central to co-allocation
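
A co-allocator has to break a multirequest into its subrequests before it can hand each one to a resource manager. The sketch below does this by tracking parenthesis depth. The function name is hypothetical, and the sketch deliberately ignores quoted strings, so values containing parentheses would confuse it.

```python
def split_multirequest(rsl):
    """Split '+(...)(...)' into the text of each top-level subrequest."""
    text = rsl.strip()
    if not text.startswith("+"):
        raise ValueError("not a multirequest")
    parts, depth, start = [], 0, None
    for index, char in enumerate(text[1:], start=1):
        if char == "(":
            if depth == 0:
                start = index             # opening of a top-level group
            depth += 1
        elif char == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced parentheses")
            if depth == 0:
                parts.append(text[start + 1:index].strip())
    if depth != 0:
        raise ValueError("unbalanced parentheses")
    return parts


multirequest = "+(&(count=5)(memory>=64)(executable=p1))(&(network=atm)(executable=p2))"
for subrequest in split_multirequest(multirequest):
    print(subrequest)
# &(count=5)(memory>=64)(executable=p1)
# &(network=atm)(executable=p2)
```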

A Co-allocation Multirequest

+( & (resourceManagerContact="flash.isi.edu:754:/C=US/…/CN=flash.isi.edu-fork")
     (count=1)
     (label="subjob A")
     (executable=my_app1)
 )
 ( & (resourceManagerContact="sp139.sdsc.edu:8711:/C=US/…/CN=sp097.sdsc.edu-lsf")
     (count=2)
     (label="subjob B")
     (executable=my_app2)
 )

Different resource managers

Different counts

Different executables


Job Submission Interfaces

Globus Toolkit includes several command line programs for job submission

globus-job-run: Interactive jobs

globus-job-submit: Batch/offline jobs

globusrun: Flexible scripting infrastructure

Others are building better interfaces

General purpose: Condor-G, PBS, GRD, Hotpage, etc.

Application specific: ECCE', Cactus, Web portals


globus-job-run

For running of interactive jobs

Additional functionality beyond rsh

Ex: Run a 2-process job with executable staging

globus-job-run -: host -np 2 -s myprog arg1 arg2

Ex: Run 5 processes across 2 hosts

globus-job-run \
  -: host1 -np 2 -s myprog.linux arg1 \
  -: host2 -np 3 -s myprog.aix arg2

For a list of arguments, run:

globus-job-run -help
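When the host list comes from a script, the command line can be generated rather than typed. This is a small sketch, assuming only the `-:`, `-np`, and `-s` arguments shown on this slide; the helper `job_run_argv` is hypothetical, and the command is printed, not executed.

```python
import shlex


def job_run_argv(subjobs):
    """Build a globus-job-run argument list from (host, nprocs, program, args) tuples."""
    argv = ["globus-job-run"]
    for host, nprocs, program, args in subjobs:
        # Each subjob starts with the "-:" separator, as on the slide.
        argv += ["-:", host, "-np", str(nprocs), "-s", program, *args]
    return argv


argv = job_run_argv([
    ("host1", 2, "myprog.linux", ["arg1"]),
    ("host2", 3, "myprog.aix", ["arg2"]),
])
print(shlex.join(argv))
```

Building a list instead of a single string avoids quoting mistakes if the list is later passed to `subprocess.run`.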


globus-job-submit

For running batch/offline jobs

globus-job-submit: Submit job (same interface as globus-job-run; returns immediately)

globus-job-status: Check job status

globus-job-cancel: Cancel job

globus-job-get-output: Get job stdout/stderr

globus-job-clean: Clean up after job


globusrun

Flexible job submission for scripting

Uses an RSL string to specify the job request

Contains an embedded globus-gass-server

Defines the GASS URL prefix in an RSL substitution variable:

(stdout=$(GLOBUSRUN_GASS_URL)/stdout)

Supports both interactive and offline jobs

Complex to use

Must write RSL by hand

Must understand its esoteric features

Generally you should use the globus-job-* commands instead
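To make the substitution-variable idea concrete, here is a rough Python sketch of how a `$(NAME)` reference in an RSL string could be expanded. It illustrates the concept only and is not globusrun's implementation; the `substitute` helper and the example URL are assumptions.

```python
import re


def substitute(rsl, variables):
    """Replace each $(NAME) reference in an RSL string with its value."""
    def lookup(match):
        return variables[match.group(1)]
    return re.sub(r"\$\(([A-Za-z_][A-Za-z0-9_]*)\)", lookup, rsl)


rsl = "&(executable=myprog)(stdout=$(GLOBUSRUN_GASS_URL)/stdout)"
# Hypothetical URL of the embedded GASS server on the submitting machine.
expanded = substitute(rsl, {"GLOBUSRUN_GASS_URL": "https://client.example.org:20000"})
print(expanded)
```

After expansion, the job's standard output is directed to a URL served by the submitting client, which is how output can flow back without a shared file system.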


Resource Management APIs

The globus_gram_client API provides access to all of the core job submission and management capabilities, including callback capabilities for monitoring job status.

The globus_rsl API provides convenience functions for manipulating and constructing RSL strings.

The globus_gram_myjob API allows multi-process jobs to self-organize and to communicate with each other.

The globus_duroc_control and globus_duroc_runtime APIs provide access to multirequest (co-allocation) capabilities.


Advance Reservation and Other Generalizations

General-purpose Architecture for Reservation and Allocation (GARA)

2nd generation resource management services

Broadens GRAM on two axes

Generalizes to support various resource types: CPU, storage, network, devices, etc.

Adds advance reservation of resources, in addition to allocation

Currently a research prototype


GARA: The Big Picture

[Diagram components: a Co-Reservation Agent, the MDS Info Service, and four Gatekeepers, each paired with a resource manager (RM): Scheduler RM, Diffserv RM, DSRT RM, GRIO RM.]


Grid Information Services

System information is critical to operation of the grid and construction of applications

What resources are available? Resource discovery

What is the “state” of the grid? Resource selection

How to optimize resource use? Application configuration and adaptation

We need a general information infrastructure to answer these questions


Examples of Useful Information

Characteristics of a compute resource: IP address, software available, system administrator, networks connected to, OS version, load

Characteristics of a network: bandwidth and latency, protocols, logical topology

Characteristics of the Globus infrastructure: hosts, resource managers


Grid Information: Facts of Life

Information is always old

Time of flight, changing system state

Need to provide quality metrics

Distributed state is hard to obtain

Complexity of global snapshot

Components will fail

Scalability and overhead

Many different usage scenarios

Heterogeneous policy, different information organizations, etc.


Grid Information Service

Provide access to static and dynamic information regarding system components

A basis for configuration and adaptation in heterogeneous, dynamic environments

Requirements and characteristics:

Uniform, flexible access to information

Scalable, efficient access to dynamic data

Access to multiple information sources

Decentralized maintenance


Two Classes Of Information Servers

Resource Description Services

Supply information about a specific resource (e.g. Globus 1.1.3 GRIS)

Aggregate Directory Services

Supply a collection of information gathered from multiple GRIS servers (e.g. Globus 1.1.3 GIIS)

Customized naming and indexing


Information Protocols

Grid Resource Registration Protocol

Supports information/resource discovery

Designed to support machine/network failure

Grid Resource Inquiry Protocol

Query resource description server for information

Query aggregate server for information

LDAP V3.0 in Globus 1.1.3


GIS Architecture

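[Diagram: users, customized aggregate directories (A), and standard resource description services (R). Labeled links: the registration protocol between resource description services and aggregate directories, and the enquiry protocol used to query them.]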


Metacomputing Directory Service

Uses LDAP as the inquiry protocol

Access information in a distributed directory

Directory represented by a collection of LDAP servers

Each server optimized for a particular function

Directory can be updated by:

Information providers and tools

Applications (i.e., users)

Backend tools which generate info on demand

Information dynamically available to tools and applications


Two Classes Of MDS Servers

Grid Resource Information Service (GRIS)

Supplies information about a specific resource

Configurable to support multiple information providers

LDAP as inquiry protocol

Grid Index Information Service (GIIS)

Supplies a collection of information gathered from multiple GRIS servers

Supports efficient queries against information which is spread across multiple GRIS servers

LDAP as inquiry protocol
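The division of labour between the two server classes can be sketched in a few lines of Python. This is a toy model under stated assumptions, not MDS code: the class names `Gris` and `Giis`, the `ttl_seconds` cache lifetime, and the attribute `cpucount` are all hypothetical, and real servers speak LDAP where this sketch uses method calls.

```python
import time


class Gris:
    """Describes one resource; information providers are called on demand."""

    def __init__(self, host, providers):
        self.host = host
        self.providers = providers  # attribute name -> zero-argument callable

    def query(self):
        entry = {"hn": self.host}
        for name, provider in self.providers.items():
            entry[name] = provider()
        return entry


class Giis:
    """Aggregates entries from registered GRIS servers and caches them."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.registered = []
        self.cache = {}  # host -> (timestamp, entry)

    def register(self, gris):
        self.registered.append(gris)

    def search(self, predicate):
        now = self.clock()
        results = []
        for gris in self.registered:
            cached = self.cache.get(gris.host)
            if cached is None or now - cached[0] > self.ttl:
                cached = (now, gris.query())  # refresh a stale or missing entry
                self.cache[gris.host] = cached
            if predicate(cached[1]):
                results.append(cached[1])
        return results


index = Giis(ttl_seconds=300)
index.register(Gris("a.example.org", {"cpucount": lambda: 32}))
index.register(Gris("b.example.org", {"cpucount": lambda: 8}))
big = index.search(lambda entry: entry["cpucount"] > 16)
print([entry["hn"] for entry in big])
```

The cache is what makes an index server suited to relatively static information: a query is answered from aggregated copies, at the cost of the data being somewhat old.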


LDAP Details

Lightweight Directory Access Protocol

IETF standard

Stripped-down version of the X.500 DAP protocol

Supports distributed storage/access (referrals)

Supports authentication and access control

Defines:

Network protocol for accessing directory contents

Information model defining form of information

Namespace defining how information is referenced and organized
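The namespace is hierarchical: each entry has a distinguished name (DN) whose components run from most specific to least specific. A minimal Python sketch of that idea follows. It is a simplification that assumes no escaped commas or multi-valued components, and the DN used is a made-up example built on the `o=Grid` base that appears later in these slides.

```python
def parse_dn(dn):
    """Split a DN into (attribute, value) pairs, most specific first.

    Simplification: assumes no escaped commas and no multi-valued components.
    """
    pairs = []
    for component in dn.split(","):
        attr, _, value = component.strip().partition("=")
        pairs.append((attr, value))
    return pairs


def is_under(dn, base):
    """Return True if `dn` lies at or below `base` in the directory tree."""
    entry, suffix = parse_dn(dn), parse_dn(base)
    return len(entry) >= len(suffix) and entry[len(entry) - len(suffix):] == suffix


print(parse_dn("hn=host.example.org, o=Grid"))
print(is_under("hn=host.example.org, o=Grid", "o=Grid"))
```

A subtree search starting at a base DN returns exactly the entries for which a check like `is_under` holds.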


MDS Components

LDAP 3.0 protocol engine

Based on OpenLDAP with a custom backend

Integrated caching

Information providers

Deliver resource information to the backend

APIs for accessing & updating MDS contents

C, Java, Perl (LDAP API, JNDI)

Various tools for manipulating MDS contents

Command line tools, shell scripts & GUIs


GRIS/GIIS


Grid Resource Information Service

Server which runs on each resource

Given the resource DNS name, you can find the GRIS server (well-known port = 2135)

Provides resource-specific information

Much of this information may be dynamic: load, process information, storage information, etc.

GRIS gathers this information on demand

"White pages" lookup of resource information

Ex: How much memory does the machine have?

"Yellow pages" lookup of resource options

Ex: Which queues on the machine allow large jobs?


Grid Index Information Service

GIIS describes a class of servers

Gathers information from multiple GRIS servers

Each GIIS is optimized for particular queries

Ex1: Which Alliance machines are >16 process SGIs?

Ex2: Which Alliance storage servers have >100Mbps bandwidth to host X?

Akin to web search engines

Organization GIIS

The Globus Toolkit ships with one GIIS

Caches GRIS info with long update frequency

Useful for queries across an organization that rely on relatively static information (Ex1 above)

Can be merged into GRIS


Information Services API

RFC 1823 (an informational IETF RFC) defines a client API for accessing LDAP databases

Connect to server

Pose a query, which returns data structures containing sets of object classes and attributes

Functions to walk these data structures

Globus does not provide an LDAP API. We recommend the use of OpenLDAP, an open source implementation of RFC 1823.


Searching an LDAP Directory

grid-info-search [options] filter [attributes]

Default grid-info-search options

-h mds.globus.org    MDS server
-p 389               MDS port
-b "o=Grid"          search start point
-T 30                LDAP query timeout
-s sub               scope = subtree; alternatives:
     base : look up this entry
     one  : look up immediate children


Searching a GRIS Server

grid-info-host-search [options] filter [attributes]

Exactly like grid-info-search, except defaults:

-h localhost    GRIS server
-p 2135         GRIS port

Example:

grid-info-host-search -h pitcairn "dn=*" dn


Filtering

Filters allow selection of objects based on relational operators (=, ~=, <=, >=)

grid-info-search "cputype=*"

Compound filters can be constructed with Boolean operations (&, |, !)

grid-info-search "(&(cputype=*)(cpuload1<=1.0))"

grid-info-search "(&(hn~=sdsc.edu)(latency<=10))"

Hints: white space is significant

use -L for LDIF format

required
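To show how such filters select entries, here is a small Python sketch that parses a parenthesised filter and evaluates it against dictionaries standing in for directory entries. It is an illustration under assumptions, not an LDAP implementation: it handles only the operators listed on this slide, treats `~=` as a case-insensitive substring match, compares `<=` and `>=` numerically, and uses invented entries.

```python
def parse(text, pos=0):
    """Parse one parenthesised filter at text[pos]; return (node, next_pos)."""
    assert text[pos] == "(", "filter must start with '('"
    pos += 1
    if text[pos] in "&|":
        op, pos, children = text[pos], pos + 1, []
        while text[pos] == "(":
            child, pos = parse(text, pos)
            children.append(child)
        return (op, children), pos + 1  # skip the closing ')'
    if text[pos] == "!":
        child, pos = parse(text, pos + 1)
        return ("!", [child]), pos + 1
    end = text.index(")", pos)
    item = text[pos:end]
    for rel in ("~=", "<=", ">=", "="):
        if rel in item:
            attr, value = item.split(rel, 1)
            return (rel, attr, value), end + 1
    raise ValueError(f"no relational operator in {item!r}")


def matches(node, entry):
    """Evaluate a parsed filter against one entry (a dict of attributes)."""
    op = node[0]
    if op == "&":
        return all(matches(child, entry) for child in node[1])
    if op == "|":
        return any(matches(child, entry) for child in node[1])
    if op == "!":
        return not matches(node[1][0], entry)
    _, attr, value = node
    if attr not in entry:
        return False
    actual = entry[attr]
    if op == "=":
        return value == "*" or str(actual) == value  # "*" tests for presence
    if op == "~=":
        return value.lower() in str(actual).lower()  # assumed approximation
    if op == "<=":
        return float(actual) <= float(value)
    return float(actual) >= float(value)


entries = [
    {"hn": "node1.sdsc.edu", "cputype": "sparc", "cpuload1": 0.4},
    {"hn": "node2.example.org", "cputype": "mips", "cpuload1": 2.5},
    {"hn": "node3.example.org", "cpuload1": 0.1},
]
tree, _ = parse("(&(cputype=*)(cpuload1<=1.0))")
hits = [entry["hn"] for entry in entries if matches(tree, entry)]
print(hits)
```

The parser does not strip spaces, which mirrors the hint that white space is significant in a filter.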


Data Grid Problem

“Enable a geographically distributed community [of thousands] to pool their resources in order to perform sophisticated, computationally intensive analyses on Petabytes of data”

Note that this problem:

Is common to many areas of science

Overlaps strongly with other Grid problems


Data Intensive Issues Include …

Harness [potentially large numbers of] data, storage, network resources located in distinct administrative domains

Respect local and global policies governing what can be used for what

Schedule resources efficiently, again subject to local and global constraints

Achieve high performance, with respect to both speed and reliability

Catalog software and virtual data


Data Intensive Computing and Grids

The term "Data Grid" is often used

Unfortunate, as it implies a distinct infrastructure, which it isn't; but easy to say

Data-intensive computing shares numerous requirements with collaboration, instrumentation, computation, …

Security, resource mgt, info services, etc.

Important to exploit commonalities, as it is very unlikely that multiple infrastructures can be maintained

Fortunately this seems easy to do!


Examples of Desired Data Grid Functionality

High-speed, reliable access to remote data

Automated discovery of “best” copy of data

Manage replication to improve performance

Co-schedule compute, storage, network

"Transparency" with respect to delivered performance

Enforce access control on data

Allow representation of “global” resource allocation policies
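Automated discovery of the "best" copy can be illustrated with a short Python sketch. It assumes a deliberately simple cost model (start-up latency plus size divided by predicted bandwidth); the `Replica` type, the prediction figures, and the URLs are invented for illustration and do not come from any Globus component.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    url: str
    bandwidth_mbps: float  # predicted throughput to this client, megabits/s
    latency_s: float       # predicted start-up delay, e.g. a tape mount


def estimated_seconds(replica, size_mb):
    """Estimated transfer time for a file of size_mb megabytes."""
    return replica.latency_s + (size_mb * 8) / replica.bandwidth_mbps


def best_replica(replicas, size_mb):
    """Pick the replica with the lowest estimated transfer time."""
    return min(replicas, key=lambda r: estimated_seconds(r, size_mb))


replicas = [
    Replica("gsiftp://tape.example.org/data/run42", bandwidth_mbps=400, latency_s=120),
    Replica("gsiftp://disk.example.org/data/run42", bandwidth_mbps=100, latency_s=1),
]
small = best_replica(replicas, size_mb=1_000)
large = best_replica(replicas, size_mb=100_000)
print(small.url)
print(large.url)
```

With these numbers the disk copy wins for the smaller file and the tape copy wins for the larger one, which is why the choice has to be made per request from current performance information.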


A Model Architecture for Data Grids

[Diagram: an Application sends an attribute specification to the Metadata Catalog and receives a logical collection and logical file name; the Replica Catalog returns multiple locations; Replica Selection uses performance information and predictions from MDS and NWS to choose the selected replica; the GridFTP control channel and GridFTP data channel connect to Replica Location 1, Replica Location 2, and Replica Location 3, which are backed by a tape library, disk caches, and a disk array.]


Globus Toolkit Components

Two major Data Grid components:

1. Data Transport and Access

Common protocol

Secure, efficient, flexible, extensible data movement

Family of tools supporting this protocol

2. Replica Management Architecture

Simple scheme for managing multiple copies of files and collections of files


Motivation for a Common Data Access Protocol

Existing distributed data storage systems

DPSS, HPSS: focus on high-performance access, utilize parallel data transfer, striping

DFS: focus on high-volume usage, dataset replication, local caching

SRB: connects heterogeneous data collections, uniform client interface, metadata queries

Problems

Incompatible (and proprietary) protocols

Each requires a custom client

Partitions available data sets and storage devices

Each protocol has a subset of the desired functionality


A Common, Secure, Efficient Data Access Protocol

Common, extensible transfer protocol

Common protocol means all can interoperate

Decouple low-level data transfer mechanisms from the storage service

Advantages:

New, specialized storage systems are automatically compatible with existing systems

Existing systems have richer data transfer functionality

Interface to many storage systems

HPSS, DPSS, file systems

Plan for SRB integration


Access/Transport Protocol Requirements

Suite of communication libraries and related tools that support:

GSI, Kerberos security

Third-party transfers

Parameter set/negotiate

Partial file access

Reliability/restart

Large file support

Data channel reuse

Integrated instrumentation

Logging/audit trail

Parallel transfers

Striping (cf. DPSS)

Policy-based access control

Server-side computation

Proxies (firewall, load balancing)

All based on a standard, widely deployed protocol


GridFTP and Global Access to Secondary Storage (GASS)


GridFTP

Why FTP?

Ubiquity enables interoperation with many commodity tools

Already supports many desired features, easily extended to support others

Well understood and supported

We use the term GridFTP to refer to:

Transfer protocol which meets requirements

Family of tools which implement the protocol

Note GridFTP > FTP

Note that despite name, GridFTP is not restricted to file transfer!


GridFTP: Basic Approach

FTP protocol is defined by several IETF RFCs

Start with the most commonly used subset

Standard FTP: get/put etc., 3rd-party transfer

Implement standard but often unused features

GSS binding, extended directory listing, simple restart

Extend in various ways, while preserving interoperability with existing servers

Striped/parallel data channels, partial file, automatic & manual TCP buffer setting, progress monitoring, extended restart
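Parallel data channels and partial file access both rest on addressing a file by byte range. The Python sketch below shows one way a file could be divided among streams; it is an illustration of the idea only and says nothing about the GridFTP wire protocol. The helper name `split_ranges` is hypothetical.

```python
def split_ranges(file_size, streams):
    """Divide [0, file_size) into up to `streams` contiguous (offset, length) blocks."""
    base, extra = divmod(file_size, streams)
    ranges, offset = [], 0
    for i in range(streams):
        # The first `extra` streams carry one additional byte each.
        length = base + (1 if i < extra else 0)
        if length:
            ranges.append((offset, length))
        offset += length
    return ranges


print(split_ranges(10, 4))
```

Each (offset, length) pair could be handed to a separate data channel, and a restart after failure only needs to re-request the ranges that did not complete.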


GridFTP Protocol Specifications

Existing standards

RFC 959: File Transfer Protocol

RFC 2228: FTP Security Extensions

RFC 2389: Feature Negotiation for the File Transfer Protocol

Draft: FTP Extensions

New drafts

GridFTP: Protocol Extensions to FTP for the Grid

Grid Forum Data Working Group


GridFTP vs. WebDAV

WebDAV extends HTTP for remote data access

Combines control and data over a single channel

FTP splits control and data

Supports multiple, user-selectable data channel protocols

Advantages of split channels

Third-party transfers handled cleanly

Can (cleanly) define new data channel protocols, e.g. parallel/striped transfer, automatic TCP buffer/window negotiation, non-TCP based protocols, etc.

Amenable to high-performance proxies, e.g. for firewalls, load balancing, etc.


The GridFTP Family of Tools

Patches to existing FTP code

GSI-enabled versions of existing FTP client and server, for high-quality production code

Custom-developed libraries

Implement full GridFTP protocol, targeting custom use, high-performance

Custom-developed tools

Servers and clients with specialized functionality and performance


A Word on GASS

The Globus Toolkit provides services for file and executable staging and I/O redirection that work well with GRAM. This is known as Global Access to Secondary Storage (GASS).

GASS uses GSI-enabled HTTP as the protocol for data transfer, and a caching algorithm for copying data when necessary.

The globus_gass, globus_gass_transfer, and globus_gass_cache APIs provide programmer access to these capabilities, which are already integrated with the GRAM job submission tools.


Future Directions

Continued enhancement & standardization of protocol

Globus Toolkit libraries provide reference implementation

Continue building on libraries

Striped server w/ server side processing

Reliable replica/copy management service

Proxies for firewalls & load balancing

Work with more application communities


The Virtual Data Concept

“[a virtual data grid enables] the definition and delivery of a potentially unlimited virtual space of data products derived from other data. In this virtual space, requests can be satisfied via direct retrieval of materialized products and/or computation, with local and global resource management, policy, and security constraints determining the strategy used.”
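The core of the idea, satisfying a request either by retrieving a materialized product or by computing it from other data, can be sketched in Python. This is a toy model under stated assumptions: the `VirtualDataCatalog` class is hypothetical, everything runs in one process, and the policy and security constraints mentioned in the quotation are left out.

```python
class VirtualDataCatalog:
    """Toy catalog: a product is either already materialized or derivable."""

    def __init__(self):
        self.materialized = {}  # product name -> data
        self.derivations = {}   # product name -> (function, input product names)

    def define(self, name, function, inputs):
        """Record how `name` can be derived from other products."""
        self.derivations[name] = (function, inputs)

    def request(self, name):
        """Return a product, computing and caching it if it is not materialized."""
        if name in self.materialized:
            return self.materialized[name]
        function, inputs = self.derivations[name]
        result = function(*(self.request(i) for i in inputs))
        self.materialized[name] = result
        return result


catalog = VirtualDataCatalog()
catalog.materialized["raw"] = [3, 1, 2]
catalog.define("sorted", sorted, ["raw"])
catalog.define("largest", lambda values: values[-1], ["sorted"])
answer = catalog.request("largest")
print(answer)
```

After the request, the intermediate product `sorted` is materialized as well, so a later request for it becomes a direct retrieval.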


Virtual Data in Action

Data request may:

Access local data

Compute locally

Compute remotely

Access remote data

Scheduling subject to local & global policies

Local autonomy

[Diagram: a data request routed among major archive facilities, network caches & regional centers, and local sites.]


Condor

High-throughput scheduler

Non-dedicated resources

Job checkpoint and migration

Remote system calls


What Is Condor-G?

Enhanced version of Condor that uses Globus Toolkit™ to manage Grid jobs

Excellent example of applying the general-purpose Globus Toolkit to solve a particular problem (i.e., high-throughput computing) on the Grid


What is Condor-G? (cont)

Merging of Globus and Condor technologies

Globus

Protocols for secure inter-domain communications

Standardized access to remote batch systems

Condor

Job submission and allocation

Error recovery

Creation of an execution environment


Condor Technologies in Grid Middleware

[Diagram: layered middleware stack with the labels Application, problem solver…; Condor-G; Globus Toolkit; Condor; Processing, storage…. Side annotations: job submission; resource discovery, authentication…; job execution.]


Condor-G

Condor agent that speaks GRAM

[Diagram: an Agent, two queues (Q), and six resources (R).]

Jobs go to queues and not resources directly


Why Use Condor-G

Condor: designed to run jobs within a single administrative domain

Globus Toolkit: designed to run jobs across many administrative domains

Condor-G: combines the strengths of both


Globus Universe

Advantages of using Condor-G to manage your Grid jobs:

Full-featured queuing service

Credential Management

Fault-tolerance


Full-Featured Queue

Persistent queue

Many queue-manipulation tools

Set up job dependencies

E-mail notification of events

Log files
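Job dependencies amount to running jobs in an order that respects a dependency graph. The sketch below shows that idea with Python's standard `graphlib` module (Python 3.9 or later); it is a generic illustration with invented job names and does not use Condor's own tools or file formats.

```python
from graphlib import TopologicalSorter

# Each job maps to the set of jobs that must finish before it can start.
dependencies = {
    "prepare": set(),
    "simulate": {"prepare"},
    "analyze": {"simulate"},
    "plot": {"analyze"},
}

order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

A queue that stores both the jobs and this graph persistently can resume a half-finished workflow at the first job whose prerequisites are complete.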


Credential Management

Authentication in the Globus Toolkit is done with limited-lifetime X.509 proxies

Proxy may expire before jobs finish executing

Condor-G can put jobs on hold and e-mail user to refresh proxy

Condor-G can forward new proxy to execution sites
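The hold-and-notify behaviour can be pictured as a periodic check of the proxy's remaining lifetime. The following Python sketch makes that concrete under assumptions: the function `jobs_to_hold`, the 30-minute safety margin, and the job states are hypothetical and are not taken from Condor-G.

```python
from datetime import datetime, timedelta


def jobs_to_hold(jobs, proxy_expires, now, margin=timedelta(minutes=30)):
    """Return the unfinished jobs to put on hold if the proxy is about to expire."""
    if proxy_expires - now > margin:
        return []  # proxy still has enough lifetime left
    return [name for name, state in jobs.items() if state != "done"]


now = datetime(2005, 10, 1, 12, 0)
jobs = {"a": "running", "b": "done", "c": "idle"}
held = jobs_to_hold(jobs, proxy_expires=now + timedelta(minutes=10), now=now)
safe = jobs_to_hold(jobs, proxy_expires=now + timedelta(hours=12), now=now)
print(held)
print(safe)
```

Once the user refreshes the proxy, the held jobs can be released and the new proxy forwarded to the execution sites.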


Fault Tolerance

Local crash

Queue state stored on disk

Reconnect to execute machines

Network failure

Wait until connectivity returns

Reconnect to execute machines


Fault Tolerance

Remote crash, job still in queue

Job state stored on disk

Start new jobmanager to monitor job

Remote crash, job lost

Resubmit job
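Both fault-tolerance slides depend on state that survives a crash because it is on disk. A rough Python sketch of that pattern follows. The file layout, the state names, and the helpers `save_queue` and `recovery_plan` are assumptions made for illustration, not Condor-G internals.

```python
import json
import os
import tempfile


def save_queue(path, jobs):
    """Write the queue atomically so a crash never leaves a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(jobs, handle)
    os.replace(tmp_path, path)  # atomic rename over the old file


def recovery_plan(path):
    """After a restart, decide what to do with each job recorded on disk."""
    with open(path) as handle:
        jobs = json.load(handle)
    plan = {}
    for job_id, state in jobs.items():
        if state == "lost":
            plan[job_id] = "resubmit"
        elif state == "done":
            plan[job_id] = "nothing"
        else:
            plan[job_id] = "reconnect"
    return plan


with tempfile.TemporaryDirectory() as workdir:
    queue_file = os.path.join(workdir, "queue.json")
    save_queue(queue_file, {"job1": "running", "job2": "lost", "job3": "done"})
    plan = recovery_plan(queue_file)
print(plan)
```

Writing to a temporary file and renaming it means the file on disk is always either the old queue or the new one, never a mixture.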


The Future: All Software is Network-Centric

We don’t build or buy “computers” anymore, we borrow or lease required resources

When I walk into a room, need to solve a problem, need to communicate

A “computer” is a dynamically, often collaboratively constructed collection of processors, data sources, sensors, networks

Similar observations apply for software


And Thus …

Reduced barriers to access mean that we do much more computing, and more interesting computing, than today => Many more components (& services); massive parallelism

All resources are owned by others => Sharing (for fun or profit) is fundamental; trust, policy, negotiation, payment

All computing is performed on unfamiliar systems => Dynamic behaviors, discovery, adaptivity, failure


Summary

The Grid problem: Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations

Grid architecture: Emphasize protocol and service definition to enable interoperability and resource sharing

Globus Toolkit™: a source of protocol and API definitions and reference implementations

See: www.globus.org, www.gridforum.org


Suppose...

The Grid could use Semantic Web technology:

At the infrastructure level, for:

schema integration, workflow descriptions

typing data and service inputs and outputs

problem solving selection and intelligent portals

At the application level, for:

annotating results, workflows, database entries and parameters of analyses with personal notes, provenance data, derivation paths of information, explanations or claims

linking experimental components: literature, notes, code, databases, intermediate results, sketches, images, workflows, the person doing the experiment, the lab they are in, the final paper

describing people, labs, literature, tools and scientific knowledge


Grid + Semantic Web = The Semantic Grid

Grid is metadata based middleware

To support the full richness of the Grid computing vision we need to apply Semantic Web technologies to Grid middleware and applications; i.e. the Semantic Grid 

Semantic Web base services -> Grid Base Services

Semantic Web metadata and ontologies -> Grid metadata Infrastructure, driving the machinery of the Grid

Base and high level services

Grid applications, representing the knowledge and operational know-how of the application domain

A “Knowledge Grid”


Semantic Grid Definition

The Semantic Grid is an extension of the current Grid in which information and services are given well-defined meaning, better enabling computers and people to work in cooperation


Semantics in and on the Grid

[Diagram labels: Grid Computing, Web Services, the Semantic Web, the Semantic Grid.]


Summary - The Current Landscape

[Diagram: technologies plotted against two axes, data complexity and computational complexity: internet, Web, Dynamic Web, Semantic Web, Web Services, "Globus" Grid, OGSA Grid, Semantic Grid.]


Thank You

Questions, Comments?

bebo@slac.stanford.edu