17 March 2008 Standards for Interoperable Grids 1 Data Management Standards for Interoperable Grids: Experience from NextGRID and OMII-Europe Clive Davenhall National e-Science Centre, University of Edinburgh


17 March 2008 Standards for Interoperable Grids 1

Data Management

Standards for Interoperable Grids: Experience from NextGRID and OMII-Europe

Clive Davenhall

National e-Science Centre, University of Edinburgh




Data Management: Overview

Manipulation and management of data, typically including:

Processing: job execution, BES, JSDL

Transfer

Storage

Access


OGSA Standards

There are a number of OGSA data management standards:

DMI: data transfer.

ByteIO: data access (file-like), data transfer.

WS-DAI: data access (database-like).

Can be used individually or in concert with other OGSA standards.


OGSA-DMI

DMI: Data Management Interface.

Not yet a specification; still a draft, currently receiving public comments. Completion is imminent.

A standard mechanism for moving data between locations: from a source of data, to a sink (or destination) of data.


OGSA-DMI Architecture

A standard structure or interface that various resources can use to interoperate.

Supports a variety of protocols for the actual data transfer: GridFTP, file access, OGSA-ByteIO, SRB.

Supports ‘third party’ transfers: a superintending process initiates a transfer from a remote source to a remote sink.

Only concerned with moving bytes from the source to the sink: not concerned with the semantics or structure of the data, though future versions might be.


Port Types

DMI: a mechanism for scheduling and managing data transfers. Provides two port types. Uses the factory pattern.

DTF: Data Transfer Factory. Client invokes a DTF to create a DTI.

DTI: Data Transfer Instance. Service created to perform a specific transfer.


DTI Operations

A DTI (Data Transfer Instance) will support the following operations:

Start

Activate

Stop

Resume

Suspend

GetState

GetInstanceAttributeDocument
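The factory pattern and the DTI operations can be pictured with a small in-memory sketch. This is only an illustration, not the OGSA-DMI interface itself: the class names, the `create_instance` method and the state names are all hypothetical, and the Activate and GetInstanceAttributeDocument operations are left out because the slides do not describe their behaviour.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import count


class State(Enum):
    # Hypothetical state names, chosen for illustration only.
    CREATED = "Created"
    TRANSFERRING = "Transferring"
    SUSPENDED = "Suspended"
    STOPPED = "Stopped"


@dataclass
class DataTransferInstance:
    """Toy DTI: one object per transfer, driven by the client."""
    instance_id: int
    source: str
    sink: str
    state: State = State.CREATED

    def start(self):
        self.state = State.TRANSFERRING

    def suspend(self):
        # Only a running transfer can be suspended in this sketch.
        if self.state is State.TRANSFERRING:
            self.state = State.SUSPENDED

    def resume(self):
        # Only a suspended transfer can be resumed in this sketch.
        if self.state is State.SUSPENDED:
            self.state = State.TRANSFERRING

    def stop(self):
        self.state = State.STOPPED

    def get_state(self):
        return self.state


class DataTransferFactory:
    """Toy DTF: the client asks it to create a DTI for one transfer."""

    def __init__(self):
        self._ids = count(1)
        self.instances = {}

    def create_instance(self, source, sink):
        dti = DataTransferInstance(next(self._ids), source, sink)
        self.instances[dti.instance_id] = dti
        return dti
```

A client would call `create_instance` once per transfer and then drive the returned object with `start`, `suspend`, `resume` and `stop`, polling `get_state` as needed.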


Sources and Sinks

Source: emits an ordered sequence of bytes.

Sink: receives an ordered sequence of bytes.

For a resource to act as a source or sink in a DMI transfer it must:

Provide suitable services to send or receive data.

Furnish a list of protocols that it can use.

Information about how data are to be sent or received is encapsulated in a DEPR (Data Endpoint Reference).


DEPR

DEPR: Data Endpoint Reference.

Encapsulates all the information about:

How data in a source are to be accessed.

How data sent to a sink are to be received.

Includes all the transport protocols supported by a source or sink.

Contains endpoint references to access the data.

In future versions these endpoint references will use WS-Addressing.
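As a rough illustration of why each endpoint lists its protocols, the sketch below models a DEPR as a small record and picks a protocol that both ends support. The `DataEndpoint` class, the `choose_protocol` function, the `urn:example:` addresses and the "first protocol the source prefers" policy are all assumptions made for this example; the slides do not say how a protocol is selected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataEndpoint:
    """Hypothetical stand-in for a DEPR: where the data live and how to reach them."""
    address: str      # placeholder for the endpoint reference
    protocols: tuple  # transport protocols, in order of preference


def choose_protocol(source, sink):
    """Return the first protocol the source lists that the sink also supports, or None."""
    for protocol in source.protocols:
        if protocol in sink.protocols:
            return protocol
    return None


source = DataEndpoint("urn:example:source", ("GridFTP", "OGSA-ByteIO"))
sink = DataEndpoint("urn:example:sink", ("SRB", "OGSA-ByteIO"))
print(choose_protocol(source, sink))  # OGSA-ByteIO
```

Here the two endpoints share only OGSA-ByteIO, so that is the protocol chosen; with no protocol in common the function returns `None` and no transfer could be set up.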


NextGRID Recommendations

Resources should be modelled as WS-resources.

Transfers must be implemented as ‘Logical Data Transfers’ (the most flexible of several options available).

Prescribes a mechanism to query the protocols available to a source or sink.

OGSA-ByteIO must be one of the protocols available to both the source and sink.


OGSA Data Management Standards

DMI: data transfer.

ByteIO: data access (file-like), data transfer.

WS-DAI: data access (database-like).


OGSA ByteIO

POSIX-like access to remote resources.

The remote resource can be any source of data: files, sensors, live data streams, etc.

Aims to provide access transparency.


Mapping to Web Services

Core OGSA ByteIO Specification: independent of any basic profile.

ByteIO OGSA WSRF Basic Profile Rendering: mapping to WSRF Basic Profile.

Currently WSRF is the only mapping.

Others are anticipated.


ByteIO Access Methods

Two access methods. Implemented as port types. Each is optional.

RandomByteIO: direct random access to a portion of a data resource. The portion to access is specified as an offset from the start of the resource.

StreamableByteIO: streamed access to a data resource. Each access is relative to the previous access.


RandomByteIO

read(startOffset: unsignedLong, bytesPerBlock: unsignedInt, numBlocks: unsignedInt, stride: long): byte[]

write(startOffset: unsignedLong, bytesPerBlock: unsignedInt, stride: long, data: byte[]): void

append(data: byte[]): void

truncAppend(offset: unsignedLong, data: byte[]): void
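The block and stride parameters are easier to follow with a local, in-memory sketch. The version below assumes that `stride` is the distance in bytes between the starts of successive blocks, that offsets are non-negative, and that `truncAppend` truncates the resource at `offset` before appending; the slides do not spell these semantics out, so treat them as assumptions. The class name `RandomByteStore` is hypothetical and no web service is involved.

```python
class RandomByteStore:
    """In-memory stand-in for a resource exposed through RandomByteIO."""

    def __init__(self, data=b""):
        self._data = bytearray(data)

    def read(self, start_offset, bytes_per_block, num_blocks, stride):
        # Gather num_blocks blocks; block i starts at start_offset + i * stride.
        out = bytearray()
        for i in range(num_blocks):
            begin = start_offset + i * stride
            out += self._data[begin:begin + bytes_per_block]
        return bytes(out)

    def write(self, start_offset, bytes_per_block, stride, data):
        # Scatter data in blocks of bytes_per_block, one block every stride bytes.
        for i in range(0, len(data), bytes_per_block):
            block = data[i:i + bytes_per_block]
            begin = start_offset + (i // bytes_per_block) * stride
            end = begin + len(block)
            if end > len(self._data):
                # Pad with zero bytes so the block fits.
                self._data.extend(b"\x00" * (end - len(self._data)))
            self._data[begin:end] = block

    def append(self, data):
        self._data += data

    def trunc_append(self, offset, data):
        # Assumed semantics: discard everything from offset onwards, then append.
        del self._data[offset:]
        self._data += data
```

For example, on a resource holding the bytes 0 to 19, `read(2, 3, 2, 5)` would return bytes 2, 3, 4 followed by bytes 7, 8, 9: two blocks of three bytes whose starts are five bytes apart.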


RandomByteIO read as XML

<rbyteio:read>
  <rbyteio:start-offset>xsd:unsignedLong</rbyteio:start-offset>
  <rbyteio:bytes-per-block>xsd:unsignedInt</rbyteio:bytes-per-block>
  <rbyteio:num-blocks>xsd:unsignedInt</rbyteio:num-blocks>
  <rbyteio:stride>xsd:long</rbyteio:stride>
  <rbyteio:transfer-information transfer-mechanism="xsd:anyURI">
    byteio:transfer-information-type
  </rbyteio:transfer-information>
</rbyteio:read>
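A request body of this shape could be assembled with Python's standard `xml.etree.ElementTree` module, as in the sketch below. The namespace URI and the transfer-mechanism URI are placeholders invented for the example (the slide gives only the `rbyteio` prefix), and `build_read_request` is a hypothetical helper, not part of any ByteIO toolkit.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace: the real rbyteio namespace URI is not given on the slide.
RBYTEIO_NS = "urn:example:rbyteio"
ET.register_namespace("rbyteio", RBYTEIO_NS)


def build_read_request(start_offset, bytes_per_block, num_blocks, stride,
                       transfer_mechanism):
    """Return the XML text of a read request with the given parameters."""
    def tag(name):
        return f"{{{RBYTEIO_NS}}}{name}"

    root = ET.Element(tag("read"))
    ET.SubElement(root, tag("start-offset")).text = str(start_offset)
    ET.SubElement(root, tag("bytes-per-block")).text = str(bytes_per_block)
    ET.SubElement(root, tag("num-blocks")).text = str(num_blocks)
    ET.SubElement(root, tag("stride")).text = str(stride)
    # The element content (transfer-information-type) is left empty in this sketch.
    ET.SubElement(root, tag("transfer-information"),
                  {"transfer-mechanism": transfer_mechanism})
    return ET.tostring(root, encoding="unicode")


# Placeholder transfer-mechanism URI, for illustration only.
print(build_read_request(0, 512, 4, 1024, "urn:example:transfer:simple"))
```

In a real deployment this fragment would travel inside a SOAP message addressed to the ByteIO resource; that wrapping is omitted here.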


StreamableByteIO

seekRead(offset: long, seekOrigin: URI, bytesToRead: unsignedInt): byte[]

seekWrite(offset: long, seekOrigin: URI, data: byte[]): void
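The idea that each streamed access is relative to a seek origin can be sketched locally as follows. The three origin constants are placeholder URIs invented for this example, `StreamableByteStore` is a hypothetical class, and the behaviour (a position pointer that advances after every read or write, zero padding when writing past the end) is an assumption modelled on POSIX file access rather than taken from the specification.

```python
# Placeholder seek-origin URIs; the real values are defined by the ByteIO specification.
SEEK_BEGINNING = "urn:example:seek-origin:beginning"
SEEK_CURRENT = "urn:example:seek-origin:current"
SEEK_END = "urn:example:seek-origin:end"


class StreamableByteStore:
    """In-memory stand-in for a resource exposed through StreamableByteIO."""

    def __init__(self, data=b""):
        self._data = bytearray(data)
        self._position = 0

    def _seek(self, offset, seek_origin):
        bases = {
            SEEK_BEGINNING: 0,
            SEEK_CURRENT: self._position,
            SEEK_END: len(self._data),
        }
        if seek_origin not in bases:
            raise ValueError(f"unknown seek origin: {seek_origin}")
        position = bases[seek_origin] + offset
        if position < 0:
            raise ValueError("seek before start of resource")
        self._position = position

    def seek_read(self, offset, seek_origin, bytes_to_read):
        self._seek(offset, seek_origin)
        chunk = bytes(self._data[self._position:self._position + bytes_to_read])
        self._position += len(chunk)  # the next access continues from here
        return chunk

    def seek_write(self, offset, seek_origin, data):
        self._seek(offset, seek_origin)
        if self._position > len(self._data):
            # Pad with zero bytes up to the write position.
            self._data.extend(b"\x00" * (self._position - len(self._data)))
        end = self._position + len(data)
        self._data[self._position:end] = data
        self._position = end
```

Reading three bytes from the beginning and then two more with an offset of zero from the current position returns consecutive chunks, which is the sense in which each access is relative to the previous one.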


NextGRID Recommendations

Must conform to the WSRF rendering.

Must support RandomByteIO.

Restrictions on naming.


OGSA Data Management Standards

DMI: data transfer.

ByteIO: data access (file-like), data transfer.

WS-DAI: data access (database-like).


OGSA WS-DAI

WS-DAI: Web Service Data Access and Integration.

Access to remote data resources.

Modelled on access to databases of various sorts.


WS-DAI Data Resource Models

The core WS-DAI specification: independent of data model. Implemented as a model-dependent realisation.

WS-DAIR: modelled on access to relational databases. Queries in SQL.

WS-DAIX: modelled on access to XML databases. Queries in XPath, XQuery and XUpdate.

It is anticipated that additional realisations will be developed, e.g. for RDF and object databases.


Properties

A WS-DAI resource has a number of properties which a client can interrogate to determine the resource’s characteristics:

DataResourceAbstractName
ParentDataResource
DataResourceManagement
DatasetMap
ConfigurationMap
LanguageMap
DataResourceDescription
Readable
Writeable
ConcurrentAccess
TransactionInitiation
TransactionIsolation
ChildSensitiveToParent


Data Resources

Externally managed resources: data stored using a pre-existing DBMS which has its own existence apart from WS-DAI. WS-DAI gives access to this resource.

Service managed resources: no independent existence. WS-DAI exists to manage the resource. For example, the results of a previous query could be made available as a service-managed resource.


Direct and Indirect Access

Patterns for obtaining the results of queries to a resource.

Direct Access: the results are simply returned in response to the query.

Indirect Access: effectively implements the ‘factory pattern’. The results are not returned in the response to the query. Rather, they are made available as a data resource in their own right.
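The difference between the two patterns can be shown with a toy relational service built on Python's standard `sqlite3` module. Everything here is invented for illustration: the `ToyRelationalService` class, its method names, the `result-N` naming scheme and the sample table are assumptions, and no WS-DAIR messages are exchanged.

```python
import sqlite3
from itertools import count


class ToyRelationalService:
    """Illustrative service offering direct and indirect access to query results."""

    def __init__(self):
        # An in-memory database stands in for the underlying data resource.
        self._db = sqlite3.connect(":memory:")
        self._db.execute("CREATE TABLE stars (name TEXT, magnitude REAL)")
        self._db.executemany(
            "INSERT INTO stars VALUES (?, ?)",
            [("Sirius", -1.46), ("Vega", 0.03), ("Polaris", 1.98)],  # sample rows
        )
        self._results = {}  # service-managed result resources, keyed by name
        self._ids = count(1)

    def query_direct(self, sql):
        """Direct access: the rows come back in the response itself."""
        return self._db.execute(sql).fetchall()

    def query_indirect(self, sql):
        """Indirect access: store the rows as a new resource and return its name."""
        name = f"result-{next(self._ids)}"
        self._results[name] = self._db.execute(sql).fetchall()
        return name

    def get_rows(self, resource_name, start, number):
        """Fetch a slice of rows from a previously created result resource."""
        return self._results[resource_name][start:start + number]


service = ToyRelationalService()
sql = "SELECT name FROM stars WHERE magnitude < 1 ORDER BY magnitude"
print(service.query_direct(sql))        # rows returned immediately
reference = service.query_indirect(sql)
print(reference)                        # only a name for the new resource
print(service.get_rows(reference, 0, 1))
```

With indirect access the client, or a third party given the name, can retrieve the rows later and in pieces, which is where the resemblance to the factory pattern comes from: the query creates a new resource rather than returning data.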


NextGRID Recommendations

WS-DAI access is optional for NextGRID.

Resources should be modelled as WS-resources.

Restrictions on naming.