Services and the Semantic Grid

SKG2005, Beijing, China, November 28 2005. Geoffrey Fox, Computer Science, Informatics, Physics, Pervasive Technology Laboratories, Indiana University, Bloomington IN 47401. [email protected] http://www.infomall.org


Page 1: Services and the Semantic Grid


Services and the Semantic Grid

SKG2005 Beijing China November 28 2005

Geoffrey Fox

Computer Science, Informatics, Physics

Pervasive Technology Laboratories

Indiana University Bloomington IN 47401

[email protected]

http://www.infomall.org

Page 2: Services and the Semantic Grid

Data Deluged Science

In the past, we worried about data in the form of parallel I/O or MPI-IO, but we didn’t consider it as an enabler of new science and new ways of computing
Data assimilation was not central to HPCC
DoE ASCI was set up because they didn’t want test data!
Now particle physics will get 100 petabytes from CERN
• Nuclear physics (Jefferson Lab) in same situation
• Use around 30,000 CPUs simultaneously 24x7
Weather, climate, solid earth (EarthScope)
Bioinformatics curated databases (Biocomplexity only 1000’s of data points at present)
Virtual Observatory and SkyServer in Astronomy
Environmental Sensor nets

Page 3: Services and the Semantic Grid

Information/Knowledge Grids

Distributed data sources (10’s to 1000’s): instruments, file systems, curated databases …
Data Deluge: 1 (now) to 100’s petabytes/year (2012)
• Moore’s law for Sensors
Possible filters assigned dynamically (on-demand)
• Run image processing algorithm on telescope image
• Run Gene sequencing algorithm on compiled data
Needs decision support front end with “what-if” simulations
Metadata (provenance) critical to annotate data
Integrate across experiments as in multi-wavelength astronomy
Data Deluge comes from pixels/year available

Page 4: Services and the Semantic Grid

Semantically Rich Services with a Semantically Rich Distributed Operating Environment

[Figure: a Database, many Sensor Services (SS), Filter Services (FS), Other Services (OS) and MetaData stores (MD), plus a Portal]

Page 5: Services and the Semantic Grid

Semantic Grid and Services

Implications of SOA (Service Oriented Architectures) for SG (Semantic Grid)
• Build services to implement SG
Implications of SG for SOA
• Build metadata rich systems of services using SG
Services receive data in SOAP messages, manipulate it and produce transformed data as further messages
Meta-data is carried in SOAP messages
Meta-data controls processing and transport of SOAP Messages
Knowledge is created from data by services
The Grid enhances Web services with semantically rich system and application specific management
One must exploit and work around the different approaches to meta-data and their manipulation in Web Services

Page 6: Services and the Semantic Grid

Structure of SOAP Messages

SOAP Messages have System information in the header including WS-Policy based meta-data defining processing options
• Processed by Handlers
Application data and meta-data are in the body (controversies here!)
• Processed by the Service itself
Some meta-data like WS-RF is logically “only in messages”
Other meta-data, like that in WS-Context or the SRB, is stored in the logical equivalent of XML databases
We only need to preserve semantic structure (XML/SOAP Infoset) so transport in fast XML and store in efficient relational databases

[Figure: a message with header blocks H1 to H4 and a Body passes through Container Handlers F1 to F4 (the Container Workflow) before reaching the Service]
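The header/body split described on this slide can be sketched in a few lines. This is an illustration only and does not use any real SOAP toolkit: the `Message` class, `run_container`, and both handlers are hypothetical names, and a plain dict stands in for the SOAP header.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Message:
    """Toy stand-in for a SOAP message: header entries plus a body."""
    header: Dict[str, str] = field(default_factory=dict)
    body: str = ""


Handler = Callable[[Message], Message]


def security_handler(msg: Message) -> Message:
    # Hypothetical handler: reject messages lacking a "token" header entry.
    if "token" not in msg.header:
        raise PermissionError("missing security token")
    return msg


def reliability_handler(msg: Message) -> Message:
    # Hypothetical handler: record which sequence number was acknowledged.
    msg.header["acked"] = msg.header.get("seq", "0")
    return msg


def uppercase_service(body: str) -> str:
    # The service itself only ever sees the body.
    return body.upper()


def run_container(msg: Message, handlers: List[Handler],
                  service: Callable[[str], str]) -> Message:
    """Run each handler over the header in order, then pass the body to the service."""
    for handler in handlers:
        msg = handler(msg)
    return Message(header=dict(msg.header), body=service(msg.body))


reply = run_container(
    Message(header={"token": "abc", "seq": "7"}, body="raw data"),
    [security_handler, reliability_handler],
    uppercase_service,
)
```

The handlers never touch the body and the service never touches the header, which mirrors the division of labor the slide describes.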

Page 7: Services and the Semantic Grid

What Type of Services are there?

There is a horde of support services supplying security, collaboration, database access, user interfaces
The support services are either associated with system or application
• We will study the WS-* and GS-* which implicitly or explicitly define many support services
There are generalized filter services which are applications that accept messages and produce new messages with some data derived from that in input
• Simulations (including PDE’s and reactive systems)
• Data-mining
• Transformations
• Agents
• Reasoning
are all termed filters here
There are services like “author ontology”, “parse RDF” or “attach provenance” that directly support Semantic Grid
But all services and their interactions are bathed in a sea of meta-data and so implicitly need and support the Semantic Grid

Page 8: Services and the Semantic Grid

It’s a Composite Hierarchical World

Filters can be a workflow which means they are “just collections of other simpler services”
• One needs meta-data to control the workflow
Services are programs that accept messages and produce messages
Grids are a distributed collection of services supporting managed shared resources
• Management requires meta-data
Grids are distributed systems that accept distributed messages and produce distributed result messages
• Can always talk about Grids and view a service or a workflow as a special case of a Grid
It just requires meta-data to send a message to a Grid and have it routed to the “correct computer” holding the “requested service”
• Meta-data allows mapping of virtual to real addresses
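A small sketch may make the composite view concrete: a service is a function from message to message, a workflow of services is again a service, and a grid routes a message to the requested service by looking up a virtual name. All names here (`workflow`, `Grid`, `calibrate`, `summarize`, the `"to"` field) are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List

Service = Callable[[dict], dict]  # a service maps a message to a message


def workflow(steps: List[Service]) -> Service:
    """Compose simpler services; the result is itself a service."""
    def composite(message: dict) -> dict:
        for step in steps:
            message = step(message)
        return message
    return composite


class Grid:
    """Toy grid: meta-data (a registry) maps virtual names to real services."""

    def __init__(self) -> None:
        self.registry: Dict[str, Service] = {}

    def register(self, virtual_name: str, service: Service) -> None:
        self.registry[virtual_name] = service

    def __call__(self, message: dict) -> dict:
        # The "to" field is meta-data used only for routing.
        target = self.registry[message["to"]]
        return target(message)


def calibrate(message: dict) -> dict:
    return {**message, "data": [x * 2 for x in message["data"]]}


def summarize(message: dict) -> dict:
    return {**message, "data": sum(message["data"])}


grid = Grid()
grid.register("analysis", workflow([calibrate, summarize]))
result = grid({"to": "analysis", "data": [1, 2, 3]})
```

Because `Grid` is itself callable with the same signature as a service, it could be registered inside another `Grid`, which is the sense in which a service or workflow is a special case of a Grid.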

Page 9: Services and the Semantic Grid

Semantically Rich Services with a Semantically Rich Distributed Operating Environment

[Figure: a Database, Sensor Services (SS), Filter Services (FS), Other Services (OS), MetaData stores (MD) and a Portal linked by SOAP Message Streams. Labels along the flow: Raw Data, Data, Information, Knowledge, Wisdom, Decisions. External links: Another Service, Another Grid. Captions: “Grids of Grids Architecture”; “is same as outward facing application service”]

Page 10: Services and the Semantic Grid

The Grid and Web Service Institutional Hierarchy

1: Container and Run Time (Hosting) Environment
2: System Services and Features: Handlers like WS-RM, Security, Programming Models like BPEL or Registries like UDDI
3: Generally Useful Services and Features such as “Access a Database” or “Submit a Job” or “Semantic Grid” or “Support a Portal” or “Collaborative Visualization”
4: Application or Community of Interest Specific Services such as “Run BLAST” or “Look at Houses for sale”

[Figure labels: OGSA and other GGF/W3C/ ………; WS-* from OASIS/W3C/Industry; Apache Axis, .NET etc.]

Page 11: Services and the Semantic Grid

The WS-* Infrastructure

Core Grid Services build on and/or extend the 60 or so WS-* Infrastructure specifications which define
• 1. Container Model, XML, WSDL …
• 2. Service Internet ((Reliable) Messaging, Addressing) including extensions for high performance transport and representation. This is natural basis for streaming applications
• 3. Notification
• 4. Workflow and Transactions
• 5. Security
• 6. Service Discovery
• 7. Metadata and State including lifetime
• 8. Management (service interactions)
• 9. Policy, Agreements
• 10. Portals and User Interfaces
These categories are directly connected to metadata

Page 12: Services and the Semantic Grid

A List of Web Services 6

6) Service Discovery

• UDDI (Broadly Supported OASIS Standard) V3 August 2003

• WS-Discovery Web services Dynamic Discovery (Microsoft, BEA, Intel …) February 2004

• WS-IL Web Services Inspection Language, (IBM, Microsoft) November 2001

• Note WS-Context as a metadata catalog and WS-Management Catalog are examples of related services

• There are many UDDI extensions such as Grimoires from UK OMII which often are essentially providing semantic enrichment

Discovery is just accessing part of meta-data defining a Grid
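The remark that discovery is just a query over part of the meta-data can be sketched as an attribute lookup. This is not the UDDI or WS-Discovery API; the `registry` entries, the `discover` function, and the example.org endpoints are all made up for illustration.

```python
from typing import Dict, List

# Hypothetical meta-data entries describing the services in a grid.
registry: List[Dict[str, str]] = [
    {"name": "blast-runner", "category": "bioinformatics",
     "endpoint": "http://example.org/blast"},
    {"name": "map-server", "category": "gis",
     "endpoint": "http://example.org/maps"},
    {"name": "feature-store", "category": "gis",
     "endpoint": "http://example.org/features"},
]


def discover(**required: str) -> List[str]:
    """Return endpoints of all entries whose meta-data matches every requirement."""
    return [
        entry["endpoint"]
        for entry in registry
        if all(entry.get(key) == value for key, value in required.items())
    ]


gis_endpoints = discover(category="gis")
```

Semantic enrichment of the kind mentioned for the UDDI extensions would replace the exact-match test with richer reasoning over the same meta-data.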

Page 13: Services and the Semantic Grid

A List of Web Services 7

7) Metadata and State
• RDF Resource Description Framework (W3C) Set of recommendations expanded from original February 1999 standard
• DAML+OIL combining DAML (Darpa Agent Markup Language) and OIL (Ontology Inference Layer) (W3C) Note December 2001
• OWL Web Ontology Language (W3C) Recommendation February 2004
• WS-MetadataExchange Web Services Metadata Exchange (BEA, IBM, Microsoft, SAP, Sun …) September 2004
• ASAP Asynchronous Service Access Protocol (OASIS) with V1.0 working draft 2B December 11 2004
• WS-GAF Web Service Grid Application Framework (Arjuna, Newcastle University) August 2003
• WBEM Web-Based Enterprise Management including CIM (Common Information Model) from DMTF (Distributed Management Task Force) 2004-2005

Page 14: Services and the Semantic Grid

A List of Web Services 7

7) Metadata and State: Resource Framework
• WS-RF Web Services Resource Framework (OASIS) including
• WS-Resource Framework Web Services Resource 1.2 (OASIS) Public Review Draft 01, 10 June 2005
• WS-ResourceProperties Web Services Resource Properties V1.2 Public Review Draft 01, 10 June 2005
• WS-ResourceLifetime Web Services Resource Lifetime V1.2 Public Review Draft 01, 13 June 2005
• WS-ServiceGroup Web Services Service Group V1.2 Public Review Draft 01, 10 June 2005
• WS-BaseFaults Web Services Base Faults V1.2 Public Review Draft 01, June 13, 2005
These WS-* define syntax of Meta-data (RDF OWL CIM) and how to use it in system (WS-MetadataExchange) – especially headers (WS-RF)

Page 15: Services and the Semantic Grid

Metadata and Service Context

Consider a collection of services working together
• Workflow tells you how to specify service interaction but more basically there is shared information or context specifying/controlling collection
WS-RF and WS-GAF have different approaches to contextualization – supplying a common “context” which at its simplest is a token to represent state
More generally core shared information includes dynamic service metadata and the equivalent of configuration information.
Two services linked by a stream are perhaps simplest example of a collection of services needing context
Note that there is a tension between storing metadata in messages and services.
• This is shared versus distributed memory debate in parallel computing

Page 16: Services and the Semantic Grid

Stateful Interactions

There are (at least) four approaches to specifying state
• OGSI uses factories to generate separate services for each session in standard distributed object fashion
• Globus GT-4 and WSRF use metadata of a resource to identify state associated with particular session
• WS-GAF uses WS-Context to provide abstract context defining state. Has strength and weakness that reveals less about nature of session
• WS-I+ “Pure Web Service” leaves state specification to the application – e.g. put a context in the SOAP body
I think we should smile and write a great metadata (semantic) service hiding all these different models for state and metadata

Page 17: Services and the Semantic Grid

Role of WS-Context

There are many WS-* specifications addressing meta-data and both many approaches and many trade-offs
We hear about Distributed Hash Tables (Chord) to achieve scalability in large scale networks
Managed dynamic workflows as in sensor integration and collaboration require
• Fault-tolerance and ability to support dynamic changes with few millisecond delay
• But only a modest number of involved services (up to 1000’s in a session)
• Need Session NOT Service/Resource meta-data so don’t use WS-RF
We are building a WS-Context compliant metadata catalog supporting distributed or central paradigms – see later talk by Mehmet Aktas
Use for OGC Web catalog service with UDDI for slowly varying meta-data
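The idea of session meta-data held in a catalog and referenced by a context token can be sketched as below. This is not the WS-Context interface or the catalog the talk refers to; `SessionContextStore`, its methods, and the stream URL are hypothetical, and an in-memory dict replaces any distributed or fault-tolerant storage.

```python
import uuid
from typing import Any, Dict, Optional


class SessionContextStore:
    """Toy in-memory catalog of session meta-data keyed by a context token."""

    def __init__(self) -> None:
        self._sessions: Dict[str, Dict[str, Any]] = {}

    def begin_session(self) -> str:
        # The token is all that services need to pass around in messages.
        token = uuid.uuid4().hex
        self._sessions[token] = {}
        return token

    def put(self, token: str, key: str, value: Any) -> None:
        self._sessions[token][key] = value

    def get(self, token: str, key: str, default: Optional[Any] = None) -> Any:
        return self._sessions[token].get(key, default)

    def is_active(self, token: str) -> bool:
        return token in self._sessions

    def end_session(self, token: str) -> None:
        # Session meta-data is dynamic: it disappears with the session.
        self._sessions.pop(token, None)


store = SessionContextStore()
token = store.begin_session()
store.put(token, "stream_url", "tcp://sensor.example.org:9000")  # written by one service
seen_by_peer = store.get(token, "stream_url")                    # read by another service
store.end_session(token)
```

Keeping the state in the store and only the token in messages is one side of the messages-versus-services tension noted earlier in the talk; the other side would carry the whole context in each message.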

Page 18: Services and the Semantic Grid

A List of Web Services 8

8) Management

• WS-DistributedManagement Web Services Distributed Management Framework with MUWS and MOWS below (OASIS)

• WSDM-MUWS Web Services Distributed Management: Management Using Web Services (OASIS) OASIS Standard March 9 2005

• WSDM-MOWS Web Services Distributed Management: Management of Web Services (OASIS) OASIS Standard March 9 2005

Page 19: Services and the Semantic Grid

A List of Web Services 8 - Contd

8) Management: Microsoft Stack
• WS-Management Web Services for Management (Microsoft, Intel, Sun …) August 2005
• WS-Management Catalog The WS-Management Catalog (Microsoft, Intel, Sun …) August 2005
• WS-Transfer Web Service Transfer (Microsoft, BEA, Sonic Software etc.) September 2004
• WS-Enumeration Web Service Enumeration (Microsoft, BEA, Sonic Software etc.) September 2004
These WS-* define exchange of data and meta-data between services

Page 20: Services and the Semantic Grid

A List of Web Services 9

9) General Service Characteristics
• WS-PolicyFramework Web Services Policy Framework (BEA, IBM, Microsoft, SAP …) September 2004
• WS-PolicyAttachment Web Services Policy Attachment (BEA, IBM, Microsoft, SAP …) September 2004
• WS-PolicyAssertions Web Services Policy Assertions Language (BEA, IBM, Microsoft, SAP) 18 December 2002 (Superseded by WS-PolicyFramework)
• WS-Agreement Web Services Agreement Specification (GGF under development) 9 August 2004
These WS-* define syntax of Meta-data defining structure of distributed System
Grids are managed (meta-data enhanced) distributed collections of Internet Scale services

Page 21: Services and the Semantic Grid

Activities in Global Grid Forum Working Groups

GGF Area: Standards Activities
1: Architecture: High Level Resource/Service Naming (level 2 of fig. 1), Integrated Grid Architecture
2: Applications: Software Interfaces to Grid, Grid Remote Procedure Call, Checkpointing and Recovery, Interoperability to Job Submittal services, Information Retrieval
3: Compute: Job Submission, Basic Execution Services, Service Level Agreements for Resource use and reservation, Distributed Scheduling
4: Data: Database and File Grid access, Grid FTP, Storage Management, Data replication, Binary data specification and interface, High-level publish/subscribe, Transaction management
5: Infrastructure: Network measurements, Role of IPv6 and high performance networking, Data transport
6: Management: Resource/Service configuration, deployment and lifetime, Usage records and access, Grid economy model
7: Security: Authorization, P2P and Firewall Issues, Trusted Computing

Use the sea of meta-data supported by Semantic Grid

Page 22: Services and the Semantic Grid

Two-level Programming I

• The Web Service (Grid) paradigm implicitly assumes a two-level Programming Model
• We make a Service (same as a “distributed object” or “computer program” running on a remote computer) using conventional technologies
– C++ Java or Fortran Monte Carlo module
– Data streaming from a sensor or Satellite
– Specialized (JDBC) database access
• Such services accept and produce data from users, files and databases
• The Grid is built by coordinating such services assuming we have solved problem of programming the service

[Figure: a Service exchanging Data]

Page 23: Services and the Semantic Grid

Two-level Programming II

The Grid is discussing the composition of distributed services with the runtime interfaces to Grid in analogy to UNIX pipes/data streams
Familiar from use of UNIX Shell, PERL or Python scripts to produce real applications from core programs
Such interpretative environments are the single processor analog of Grid Programming
Some projects like GrADS from Rice University are looking at integration between service and composition levels but dominant effort looks at each level separately

[Figure: Service1, Service2, Service3 and Service4 connected to one another]
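The UNIX-pipe analogy can be shown on a single processor with Python generators, where each stage plays the role of a service and the composition line plays the role of the Grid-level program. The stage names (`sensor`, `calibrate`, `threshold`) and the numbers are invented for this sketch.

```python
from typing import Iterable, Iterator


def sensor(readings: Iterable[float]) -> Iterator[float]:
    """Source stage: emits raw readings one at a time."""
    for reading in readings:
        yield reading


def calibrate(stream: Iterable[float], offset: float) -> Iterator[float]:
    """Transforming stage: applies a fixed correction to every value."""
    for value in stream:
        yield value + offset


def threshold(stream: Iterable[float], limit: float) -> Iterator[float]:
    """Filtering stage: passes on only values above the limit."""
    for value in stream:
        if value > limit:
            yield value


# Composition level: equivalent in spirit to `sensor | calibrate | threshold`.
pipeline = threshold(calibrate(sensor([1.0, 5.0, 9.0]), offset=0.5), limit=5.0)
alerts = list(pipeline)
```

Each stage is written with no knowledge of its neighbors, which is the separation between programming the service and composing services that the slide describes.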

Page 24: Services and the Semantic Grid

3 Layer Programming Model

Level 1 Programming inside services: Application expressed in Java, Fortran, C++, MPI etc.
Level 2 Programming choosing services by virtualization: Application Semantics (Metadata, Ontology), Semantic Grid
Level 3 Grid Programming composing multiple services: Service Workflow, Transactions, Mediation

[Figure: Web Service 1, WS 2, …, WS N-1, Web Service N on top of the WS-* Infrastructure]
Substantial work in UK e-Science program, international semantic web community

Page 25: Services and the Semantic Grid


Information Architecture and Semantic Grid

WS-* provides key low level capability but deliberately does not define an information (data) architecture and leaves this to domain specific specification activities such as CellML/SBML for biology, WFS/GML for GIS and XGSP for Collaboration

WS-* does define a primitive service discovery (UDDI) and meta-data capabilities including WS-Context, WS-RF, RDF and WS-MetadataExchange already discussed.

GGF defines Grid data capabilities including info-D (publish/subscribe) and OGSA-DAI for data repositories

Semantic Grid uses WS-* and GS-* extending meta-data and service discovery with data-mining and reasoning

Page 26: Services and the Semantic Grid

3 XML Databases of Importance

WS-Context controlling a workflow
(Extended) UDDI supporting semantic service discovery
WFS or ASFS (see later) providing an application specific data/meta-data repository
These have different performance, scalability and data unit size requirements
In our implementation, each is currently “just an Oracle/MySQL” database front ended by filters that convert between XML (GML for WFS) and object-relational Schema
• Example of Semantics (XML) versus representation (SQL) difference
OGSA-DAI offers Grid interface to databases – we could use but don’t as we only need to expose WFS and not MySQL to Grid
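The semantics-versus-representation point can be illustrated with a relational table fronted by two small filters that convert to and from XML. This is a sketch under loud assumptions: SQLite stands in for Oracle/MySQL, the `<feature>` element is an invented toy format rather than GML, and `store_feature`/`fetch_feature` are hypothetical names.

```python
import sqlite3
import xml.etree.ElementTree as ET

# In-memory relational store; the outside world never sees this schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE feature (name TEXT, lat REAL, lon REAL)")


def store_feature(xml_text: str) -> None:
    """Inbound filter: parse an XML feature and insert it as a relational row."""
    elem = ET.fromstring(xml_text)
    conn.execute(
        "INSERT INTO feature VALUES (?, ?, ?)",
        (elem.get("name"), float(elem.findtext("lat")), float(elem.findtext("lon"))),
    )


def fetch_feature(name: str) -> str:
    """Outbound filter: read the row back and re-express it as XML."""
    row = conn.execute(
        "SELECT name, lat, lon FROM feature WHERE name = ?", (name,)
    ).fetchone()
    elem = ET.Element("feature", name=row[0])
    ET.SubElement(elem, "lat").text = str(row[1])
    ET.SubElement(elem, "lon").text = str(row[2])
    return ET.tostring(elem, encoding="unicode")


store_feature('<feature name="station-1"><lat>39.17</lat><lon>-86.52</lon></feature>')
xml_out = fetch_feature("station-1")
```

Clients exchange only XML, so the storage engine or table layout could change without affecting them, which is why only the XML-facing service needs to be exposed.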

Page 27: Services and the Semantic Grid

Information Management/Processing

SOAP messages transport information expressed in a semantically rich fashion between sources and services that enhance and transform information so that complete system provides Data → Information → Knowledge transformation
• Semantic Web technologies like RDF and OWL help us have rich expressivity
We build application specific information management/transformation systems ASIS for each application domain
One special domain is the system itself where the metadata associated with services, sessions, Grids, messages, streams and workflow is itself managed and supported by an SIIS

Page 28: Services and the Semantic Grid

Generalizing a GIS

Geographical Information Systems (GIS) have been hugely successful in all fields that study the earth and related worlds
• They define Geography Syntax (GML) and ways to store, access, query, manipulate and display geographical features
• In SOA, GIS corresponds to a domain specific XML language and a suite of services for different functions above
However such a universal information model has not been developed in other areas even though there are many fields in which it appears possible
• BIS Biological Information System
• MIS Military Information System
• IRIS Information Retrieval Information System
• PAIS Physics Analysis Information System
• SIIS Service Infrastructure Information System

Page 29: Services and the Semantic Grid

ASIS Application Specific Information System I

a) Discovery capabilities that are best done using WS-* standards
b) Domain specific metadata and data including search/store/access interface (cf WFS). Let’s call the generalization ASFS (Application Specific Feature Service)
• Language to express domain specific features (cf GML). Let’s call this ASL (Application Specific Language)
• Tools to manipulate information expressed in language and key data of application (cf coordinate transformations). Let’s call this ASTT (Application Specific Tools and Transformations)
• ASL must support Data sources such as sensors (cf OGC metadata and data sensor standards) and repositories. Sensors need (common across applications) support of streams of data
• Queries need to support archived (find all relevant data in past) and streaming (find all data in future with given properties)
• Note all AS Services behave like Sensors and all sensors are wrapped as services
• Any domain will have “raw data” (binary) and that which has been filtered to ASL. Let’s call the raw data ASBD (Application Specific Binary Data)
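The two query styles named on this slide, archived (data already stored) and streaming (data that arrives later), can be sketched with one small class. `FeatureService`, its methods, and the earthquake-like records are hypothetical; a real ASFS would sit behind service interfaces rather than direct method calls.

```python
from typing import Callable, Dict, List, Tuple

Feature = Dict[str, float]
Predicate = Callable[[Feature], bool]
Callback = Callable[[Feature], None]


class FeatureService:
    """Toy ASFS-like store that answers archived and streaming queries."""

    def __init__(self) -> None:
        self._archive: List[Feature] = []
        self._subscriptions: List[Tuple[Predicate, Callback]] = []

    def query_archive(self, predicate: Predicate) -> List[Feature]:
        # Archived query: find all relevant data in the past.
        return [f for f in self._archive if predicate(f)]

    def subscribe(self, predicate: Predicate, callback: Callback) -> None:
        # Streaming query: be told about future data with given properties.
        self._subscriptions.append((predicate, callback))

    def publish(self, feature: Feature) -> None:
        self._archive.append(feature)
        for predicate, callback in self._subscriptions:
            if predicate(feature):
                callback(feature)


service = FeatureService()
service.publish({"id": 1, "magnitude": 4.2})

received: List[Feature] = []
service.subscribe(lambda f: f["magnitude"] >= 5.0, received.append)
service.publish({"id": 2, "magnitude": 6.1})
service.publish({"id": 3, "magnitude": 3.0})

past = service.query_archive(lambda f: f["magnitude"] < 5.0)
```

The same predicate type serves both query styles, so a client can ask one question about the past and the future without changing how it is expressed.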

Page 30: Services and the Semantic Grid

ASIS Application Specific Information System II

Let’s call this ASVS (Application Specific Visualization Services) generalizing WMS for GIS
The ASVS should both visualize information and provide a way of navigating (cf GetFeatureInfo) database (the ASFS)
The ASVS can itself be federated and presents an ASFS output interface
d) There should be application service interface for ASIS from which all ASIS services inherit
e) There will be other user services interfacing to ASIS
All user and system services will input and output data in ASL using filters to cope with ASBD

[Figure: AS Tool (generic), AS “Sensor”, AS Repository, AS Service (user defined) and ASVS Display exchange Messages using ASL; processing steps: Filter, Transformation, Reasoning, Data-mining, Analysis]

Page 31: Services and the Semantic Grid

[Figure: Military Information Management System. Labels: “Everything is a Service or a message/Information Nugget”; Directly GS-* WS-*; ASVS; Filters/ASTT]

Page 32: Services and the Semantic Grid

[Figure: MIO or Military Information Object, the Unit of Managed Information expressed in ASL. Labels: OGSA-DAI and Sensor Standards; Info-D, WS-Notification, WS-Eventing; ASFS]

Page 33: Services and the Semantic Grid

IS = Information Service (Sensor, Service or Repository)
• Information Resource
• Ports: Receive Request/Select, Get Status, ASL Data Get

BFS = Basic Filter Service
• Filter Resource
• Ports: Receive Request/Select, Get Status, ASL Data Get; Issue Request/Select, Request Status, ASL Data Put

Filters either transform or aggregate Information

Page 34: Services and the Semantic Grid

[Figure: FS = a network of six linked BFS boxes]

A Filter Service is a general workflow (the microscopic workflow) of Basic Filter Services
A transport link supports asynchronous publish/subscribe semantics and Web Service Reliable messaging fault tolerance
Transport links can be multicast to support collaboration (typically for last link before or after Presentation Service) or replication for fault tolerance.
The output of a Filter Service is indistinguishable from that of an IS
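One way to picture why a Filter Service's output is indistinguishable from that of an IS is to give both the same interface. The sketch below assumes a pull-style interface with `get_status` and `get_data`, loosely echoing the Get Status and Data Get labels; the class names, the temperature records, and the two operations are hypothetical, and transport, publish/subscribe, and reliability are left out.

```python
from typing import Callable, Dict, List, Sequence

Records = List[Dict[str, float]]


class InformationService:
    """Toy IS (sensor, service or repository) exposing status and data."""

    def __init__(self, records: Records) -> None:
        self._records = list(records)

    def get_status(self) -> str:
        return "ready"

    def get_data(self) -> Records:
        return list(self._records)


class BasicFilterService(InformationService):
    """Toy BFS: pulls from upstream services, then transforms or aggregates."""

    def __init__(self, sources: Sequence[InformationService],
                 operation: Callable[[Records], Records]) -> None:
        self._sources = list(sources)
        self._operation = operation

    def get_status(self) -> str:
        ready = all(s.get_status() == "ready" for s in self._sources)
        return "ready" if ready else "waiting"

    def get_data(self) -> Records:
        merged = [r for s in self._sources for r in s.get_data()]
        return self._operation(merged)


def to_celsius(records: Records) -> Records:
    # Transform: convert each Fahrenheit reading to Celsius.
    return [{"temp": round((r["temp"] - 32) * 5 / 9, 1)} for r in records]


def mean(records: Records) -> Records:
    # Aggregate: collapse all readings into one average.
    return [{"temp": sum(r["temp"] for r in records) / len(records)}]


sensor_a = InformationService([{"temp": 50.0}])
sensor_b = InformationService([{"temp": 68.0}])
converted = BasicFilterService([sensor_a, sensor_b], to_celsius)
filter_service = BasicFilterService([converted], mean)
summary = filter_service.get_data()
```

Since `filter_service` is itself an `InformationService`, it can feed further filters without the downstream side knowing whether it is reading a sensor or a workflow.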

Page 35: Services and the Semantic Grid

[Figure: Gridlet = a group of Filter Services (FS) and Information Services (IS)]

Top IS could be produced by a Filter Service
The basic unit (Gridlet) transforms and aggregates application specific information
Gridlets are composed using Grid of Grids concept

Page 36: Services and the Semantic Grid

[Figure: many IS Gridlets feeding upper-level services. Labels: Search Planning; Construction Management; Portal; Presentation; Federation; Macroscopic Workflow; Session Management; ASVS]

General System Services
• Messaging/Data transport
• Notification
• Security
• Fault Tolerance
• Metadata
• Directory
• Collaboration
• Replica Management

Data → Information → Knowledge as messages flow from original sources to top of Filter Grid

Page 37: Services and the Semantic Grid

Semantically Rich Services with a Semantically Rich Distributed Operating Environment

[Figure: a Database, Sensor Services (SS), Filter Services (FS), Other Services (OS), MetaData stores (MD) and a Portal linked by SOAP Message Streams. Labels along the flow: Raw Data, Data, Information, Knowledge, Wisdom, Decisions. External links: Another Service, Another Grid. Captions: “Grids of Grids Architecture”; “is same as outward facing application service”]

Page 38: Services and the Semantic Grid

Summary

Virtualization everywhere
Focus on semantics not representation to get performance combined with expressivity for transport and data access
All this enabled by powerful meta-data services
Grids add management to rich but potentially chaotic set of Web Services;
• management and coherence enabled by meta-data
Can define general information architectures (ASIS, GIS, SIIS) for both applications and system
Knowledge from filters that span simulations, data-mining, reasoning and agents
A service is just a special case of a Grid
Build systems from SubGrids (Gridlets)