The OGSA-DAI Project Databases and the Grid
http://www.ogsadai.org.uk
The OGSA-DAI Project
Databases and the Grid
Neil Chue Hong, Project Manager
EPCC, Edinburgh
What is OGSA-DAI?
It is a project:
– OGSA Data Access and Integration: funded by the UK e-Science Grid Core Programme
It is a vision:
– From simple database access to truly virtualised data resources
It is a standard:
– The GridDataService Specification from the Data Access and Integration Working Group (DAIS-WG) of the Global Grid Forum (GGF)
It is software that you can use:
– Current version is R2.5
OGSA-DAI Objective
To define:
– open standards and
– open source based
– uniform service interfaces
– for accessing heterogeneous data sources
– within the Open Grid Services Architecture (OGSA) framework
Why?
– Because we increasingly want to integrate data sources from different organisations
– The Grid, and OGSA, appear to provide a framework for producing software to do this
Who are we?
£3 million, 18 months, started February 2002
Funded by the Grid Core Programme
EPCC & NeSC, IBM UK, IBM USA, Manchester e-SC, Newcastle e-SC, Oracle
373 man months
Contributing to the global grid computing community
[Map labels: IBM USA, Oxford, Glasgow, Cardiff, Southampton, London, Belfast, Daresbury Lab, RAL, EPCC & NeSC, Newcastle, IBM Hursley, Oracle, Manchester, Cambridge, Hinxton]
What are we doing?
[Diagram of Grid infrastructure and the people who work with it]
Diagram labels: Grid Plumbing & Security Infrastructure; Scheduling; Accounting; Authorisation; Monitoring; Diagnosis; Logging; Data Access; Data Integration; Structured Data; Distributed; Data & Storage Resources; Data Intensive Applications; Scientific Data Mining & Integration Technology
People: Operations Team; App. Developers; Owners; Data Intensive Application Scientists; Data Providers; Data Curators; Tech. Developers
DAIS WG
GridDatabaseService Specification
– DAIS WG of the GGF
– Aim to produce a V1.0 specification by early 2004
– Defines an interface for a GridDatabaseService
– Many contributors, not just the OGSA-DAI Project
– OGSA-DAI (the software) seeks to be a reference implementation of this standard
• But does not necessarily track it exactly just now
– Requirements and Overview Informational documents also published
The OGSA-DAI Approach
Reuse existing technologies and standards
– OGSA, Query languages, Java, transport
Three key services:
– GridDataService
– GridDataServiceFactory
– DAIServiceGroupRegistry
Benefits:
– Location independence
– Hides heterogeneity
– Scalable
– Flexible
– Dynamic
OGSA-DAI Positioning - Today
[Diagram labels: OGSA; OGSA-DAI Basic Services (GDS, GDSF, DAISGR); OGSA-DAI Distributed Query; Location; Meta Data; Notification; Lifetime; Drivers; Query (Create, Retrieve, Update, Delete); Data Format; Delivery; Database, Communication, OS… Technology]
OGSA-DAI To Date
Assuming that OGSA becomes the standard framework
– Have adopted the OGSA approach
Have first concentrated on data access
– Released software has only limited data integration so far
– Distributed query processor prototype due in July
Implementation provides focus on basic functionality first
– But architecturally we have tried to answer many pertinent questions
– Functionality will increase over subsequent releases
GDS in action
[Diagram: an Analyst client, a Registry (DAISGR), a Factory (GDSF), a Grid Data Service (GDS), a Database (Xindice, MySQL, Oracle, DB2) and a Consumer. Legend: SOAP/HTTP, service creation, API interactions]
1a. Request to Registry for sources of data about “x”
1b. Registry responds with Factory handle
2a. Request to Factory for access to database
2b. Factory creates GridDataService to manage access
2c. Factory returns handle of GDS to client
3a. Client queries GDS with SQL, XPath, XQuery etc.
3b. GDS interacts with database
3c. Results of query returned to client as XML
OR
3d. Results of query delivered to consumer as XML
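The registry, factory and service steps above can be sketched in a few lines of Python. This is only an in-process illustration of the interaction pattern, not OGSA-DAI code: the class names `Registry`, `Factory` and `GridDataService`, the topic string `"contacts"` and the sample row are all assumptions made for the example, and an in-memory SQLite database stands in for the real database and the SOAP/HTTP transport.

```python
import sqlite3
from xml.sax.saxutils import escape


class GridDataService:
    """Hypothetical stand-in for a GDS: manages access to one database."""

    def __init__(self, connection):
        self._connection = connection

    def query(self, sql, params=()):
        # Steps 3a/3b: the client sends a query and the service runs it.
        cursor = self._connection.execute(sql, params)
        columns = [description[0] for description in cursor.description]
        # Step 3c: results go back to the client as XML.
        rows = []
        for record in cursor.fetchall():
            cells = "".join(
                f"<{name}>{escape(str(value))}</{name}>"
                for name, value in zip(columns, record)
            )
            rows.append(f"<row>{cells}</row>")
        return "<results>" + "".join(rows) + "</results>"


class Factory:
    """Hypothetical stand-in for a GDSF: creates services on request."""

    def __init__(self, connection):
        self._connection = connection

    def create_service(self):
        # Steps 2b/2c: create a service and hand it back to the client.
        return GridDataService(self._connection)


class Registry:
    """Hypothetical stand-in for a DAISGR: maps topics to factories."""

    def __init__(self):
        self._factories = {}

    def register(self, topic, factory):
        self._factories[topic] = factory

    def find(self, topic):
        # Steps 1a/1b: look up a factory for data about the topic.
        return self._factories[topic]


def demo():
    database = sqlite3.connect(":memory:")
    database.execute("CREATE TABLE littleblackbook (id INTEGER, name TEXT)")
    database.execute("INSERT INTO littleblackbook VALUES (10, 'Ada')")

    registry = Registry()
    registry.register("contacts", Factory(database))

    factory = registry.find("contacts")   # 1a, 1b
    service = factory.create_service()    # 2a, 2b, 2c
    return service.query(                 # 3a, 3b, 3c
        "SELECT id, name FROM littleblackbook WHERE id = ?", (10,)
    )


if __name__ == "__main__":
    print(demo())
```

In the real system each arrow is a remote call and the client holds handles rather than object references; the sketch only shows who asks whom, and in what order.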
Activities
OGSA-DAI is structured around the concept of activities
This framework allows new functionality to be added easily
Three types of activity at present:
– statement (e.g. SQLQuery, XUpdate)
– transformation (e.g. XSL translation, compression)
– delivery (e.g. GridFTP)
OGSA-DAI provides implementations of common functionality; others can extend them
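One way to picture the three activity types is as stages in a small pipeline. The sketch below is an assumption-laden illustration in Python, not the OGSA-DAI activity framework: the class names, the `process` method and the `run_activities` helper are invented here, SQLite stands in for the database, and delivery appends to a list instead of using GridFTP.

```python
import sqlite3
import zlib


class SQLQueryActivity:
    """Statement activity (hypothetical): runs a query, emits rows as bytes."""

    def __init__(self, connection, sql):
        self.connection = connection
        self.sql = sql

    def process(self, _data):
        rows = self.connection.execute(self.sql).fetchall()
        text = "\n".join(",".join(str(value) for value in row) for row in rows)
        return text.encode("utf-8")


class CompressionActivity:
    """Transformation activity (hypothetical): compresses its input."""

    def process(self, data):
        return zlib.compress(data)


class DeliverToListActivity:
    """Delivery activity (hypothetical): a list stands in for a remote sink."""

    def __init__(self, sink):
        self.sink = sink

    def process(self, data):
        self.sink.append(data)
        return data


def run_activities(activities, data=None):
    # Feed the output of each activity into the next one.
    for activity in activities:
        data = activity.process(data)
    return data


def demo():
    database = sqlite3.connect(":memory:")
    database.execute("CREATE TABLE genes (name TEXT, level INTEGER)")
    database.executemany(
        "INSERT INTO genes VALUES (?, ?)", [("abc1", 3), ("xyz2", 7)]
    )
    delivered = []
    run_activities(
        [
            SQLQueryActivity(database, "SELECT name, level FROM genes ORDER BY name"),
            CompressionActivity(),
            DeliverToListActivity(delivered),
        ]
    )
    # Decompress what was delivered to show the round trip.
    return zlib.decompress(delivered[0]).decode("utf-8")


if __name__ == "__main__":
    print(demo())
```

Adding a new capability in this sketch means writing one more class with a `process` method, which is the kind of extensibility the slide describes.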
Documents
Accessing a Grid Data Resource is done using Documents
– caveat: this may change
A document allows you to:
– define parameters
– execute activities
– deliver results
Written in XML, normally used by a client.
<gridDataServicePerform>
  <request name="myRequest">
    <parameter name="idname">
      <value name="idvalue">10</value>
    </parameter>
    <sqlQueryStatement name="myStatement">
      <sqlParameter position="1" from="idvalue"/>
      <expression>SELECT * FROM littleblackbook WHERE id=?</expression>
      <webRowSetStream name="statementresult"/>
    </sqlQueryStatement>
    <deliverToResponse name="d1">
      <fromLocal from="statementresult"/>
    </deliverToResponse>
  </request>
</gridDataServicePerform>
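To make the three roles of a document concrete (define parameters, execute activities, deliver results), here is a toy interpreter for the example document, written in Python. It is a sketch under stated assumptions, not how OGSA-DAI processes documents: the function `run_perform_document`, the table contents and the dictionary-shaped response are invented for illustration, and SQLite stands in for the database.

```python
import sqlite3
import xml.etree.ElementTree as ET

PERFORM_DOCUMENT = """
<gridDataServicePerform>
  <request name="myRequest">
    <parameter name="idname">
      <value name="idvalue">10</value>
    </parameter>
    <sqlQueryStatement name="myStatement">
      <sqlParameter position="1" from="idvalue"/>
      <expression>SELECT * FROM littleblackbook WHERE id=?</expression>
      <webRowSetStream name="statementresult"/>
    </sqlQueryStatement>
    <deliverToResponse name="d1">
      <fromLocal from="statementresult"/>
    </deliverToResponse>
  </request>
</gridDataServicePerform>
"""


def run_perform_document(document, connection):
    """Toy interpreter (hypothetical) for the example document above."""
    request = ET.fromstring(document).find("request")

    # Define parameters: collect named values.
    values = {value.get("name"): value.text for value in request.iter("value")}

    # Execute activities: bind parameters by position and run each statement.
    outputs = {}
    for statement in request.findall("sqlQueryStatement"):
        parameters = sorted(
            statement.findall("sqlParameter"),
            key=lambda parameter: int(parameter.get("position")),
        )
        bound = [values[parameter.get("from")] for parameter in parameters]
        sql = statement.find("expression").text
        stream = statement.find("webRowSetStream").get("name")
        outputs[stream] = connection.execute(sql, bound).fetchall()

    # Deliver results: map each delivery name to the stream it reads from.
    response = {}
    for delivery in request.findall("deliverToResponse"):
        source = delivery.find("fromLocal").get("from")
        response[delivery.get("name")] = outputs[source]
    return response


def demo():
    database = sqlite3.connect(":memory:")
    database.execute("CREATE TABLE littleblackbook (id INTEGER, name TEXT)")
    database.executemany(
        "INSERT INTO littleblackbook VALUES (?, ?)", [(10, "Ada"), (11, "Grace")]
    )
    return run_perform_document(PERFORM_DOCUMENT, database)


if __name__ == "__main__":
    print(demo())
```

The names in the document do the wiring: `idvalue` links the parameter to the statement, and `statementresult` links the statement output to the delivery.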
OGSA-DAI Core Services
OGSA-DAI Release 2.5 – out now
– Java, Tomcat, Globus Toolkit 3 Beta
– Supports MySQL, DB2, Xindice; SQL92, XPath, XUpdate
OGSA-DAI Release 3 – end July
– Java, Tomcat, Globus Toolkit 3.0
– Supports MySQL, DB2, Oracle, Xindice; SQL92, XPath, XUpdate
– Adds Notification, Internationalisation, Transactions, Caching
Continue to track Globus Toolkit 3 releases
– Experimental, then production, GT3 grids will help
Asynchronous Delivery
[Two diagrams: “Asynchronous delivery – Pull” and “Asynchronous delivery – Push”. Components in each: Client, Consumer, GDS, GDS Instance, GDT, DB, with numbered steps 1 to 3]
[Pull message labels: Q, Ra, Rs, DT, GSH/R + data id, D + GDH]
[Push message labels: Q + D + GSH/R, Ra, Rs, DT, GSH/R]
GDS Composition
[Diagram: five numbered configurations (1 to 5) built from Client, GDS, Operation and DB components]
Distributed Query Service
A higher level service:
– Extension of Polar* query processor, partitions and schedules queries
– Sits on top of OGSA and OGSA-DAI
Defines new portTypes and services
– GridDistributedQuery (GDQ) PortType
– GridDistributedQueryService (GDQS) – wraps Polar*
– GridQueryEvaluatorService (GQES) – performs subqueries
Currently based on OGSA-DAI Release 1.5
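The idea of splitting a query into subqueries that run on separate evaluators can be illustrated with a hand-partitioned join. This Python sketch is not Polar* and does no automatic partitioning or scheduling: the class `QueryEvaluator`, the function `distributed_join`, the table names and the sample rows are all assumptions, and two in-memory SQLite databases stand in for two remote data resources.

```python
import sqlite3


class QueryEvaluator:
    """Hypothetical evaluator: runs one subquery against one database."""

    def __init__(self, connection):
        self.connection = connection

    def evaluate(self, sql):
        return self.connection.execute(sql).fetchall()


def distributed_join(left, left_sql, right, right_sql):
    """Run one subquery per evaluator, then hash-join on the first column."""
    # Build a hash index over the left subquery result.
    index = {}
    for key, *rest in left.evaluate(left_sql):
        index.setdefault(key, []).append(tuple(rest))

    # Probe the index with each row of the right subquery result.
    joined = []
    for key, *rest in right.evaluate(right_sql):
        for left_rest in index.get(key, []):
            joined.append((key, *left_rest, *rest))
    return sorted(joined)


def demo():
    genes = sqlite3.connect(":memory:")
    genes.execute("CREATE TABLE genes (gene TEXT, organism TEXT)")
    genes.executemany(
        "INSERT INTO genes VALUES (?, ?)", [("g1", "mouse"), ("g2", "human")]
    )

    levels = sqlite3.connect(":memory:")
    levels.execute("CREATE TABLE expression (gene TEXT, level INTEGER)")
    levels.executemany(
        "INSERT INTO expression VALUES (?, ?)", [("g1", 5), ("g2", 9), ("g3", 1)]
    )

    return distributed_join(
        QueryEvaluator(genes), "SELECT gene, organism FROM genes",
        QueryEvaluator(levels), "SELECT gene, level FROM expression",
    )


if __name__ == "__main__":
    print(demo())
```

In a real distributed query processor the coordinator would derive the subqueries from a single user query and choose where each runs; here both decisions are made by the caller.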
DQS Architecture
DQP in action
DQS: the future
The GridDistributedQueryService
– is an example of a higher level data integration service which utilises OGSA-DAI core services
– Assumes that GDSF, GDQS Factory and client live in different containers
– Really requires a well-defined meta-model for the physical schema of a database
• Being partially addressed in DAIS WG
– Shows how a GDS can be both client and service
• Service hierarchy and composition
DAIT (proposed follow-on to OGSA-DAI) would produce a robust reference implementation of the DQP components
Projects using OGSA-DAI
Industry:
– FirstDIG: business process analysis (with First Transport Group)
• OGSA-DAI with datamining
Collaborative:
– Bridges: database integration over six geographically distributed genomics research sites (with IBM UK)
• OGSA-DAI with DiscoveryLink
– eDIKT: porting OGSA-DAI to other platforms
• OGSA-DAI with performance
– DEISA: linking Europe’s HPC centres
• OGSA-DAI with distributed accounting
– MS .Net Grid: porting OGSA-DAI to the .Net framework (with Microsoft Research UK)
• OGSA-DAI with .Net
ODD Genes
OGSA-DAI used to query gene expression data resources at GTI and HGU
– One data resource: low spatial resolution, high gene resolution
– Other resource: high spatial resolution, low gene resolution
– Query one database and use data to find correct data resource to run more detailed query and produce visualisation
– Simple example of data integration at work
[Diagram labels: Client, Query, Query, Render, GDS (GTI), GDS, EPCC, HGU]
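The two-step pattern described above (query one resource, then use the answer to pick the resource for the detailed query) might look like the following in Python. This is a deliberately simplified sketch: the function `two_step_query`, the table names `gene_index` and `expression`, the resource names `site_a` and `site_b`, and every sample row are made up for illustration and do not reflect the actual GTI or HGU schemas.

```python
import sqlite3


def make_database(create_sql, insert_sql, rows):
    """Build an in-memory SQLite database standing in for a data resource."""
    database = sqlite3.connect(":memory:")
    database.execute(create_sql)
    database.executemany(insert_sql, rows)
    return database


def two_step_query(index_db, detailed_resources, gene):
    # Step 1: ask the first resource which resource holds detailed data.
    row = index_db.execute(
        "SELECT resource FROM gene_index WHERE gene = ?", (gene,)
    ).fetchone()
    if row is None:
        return []

    # Step 2: run the more detailed query against the resource it names.
    detail_db = detailed_resources[row[0]]
    return detail_db.execute(
        "SELECT region, level FROM expression WHERE gene = ? ORDER BY region",
        (gene,),
    ).fetchall()


def demo(gene):
    index_db = make_database(
        "CREATE TABLE gene_index (gene TEXT, resource TEXT)",
        "INSERT INTO gene_index VALUES (?, ?)",
        [("g1", "site_a"), ("g2", "site_b")],
    )
    detailed_resources = {
        "site_a": make_database(
            "CREATE TABLE expression (gene TEXT, region TEXT, level INTEGER)",
            "INSERT INTO expression VALUES (?, ?, ?)",
            [("g1", "head", 4), ("g1", "tail", 1)],
        ),
        "site_b": make_database(
            "CREATE TABLE expression (gene TEXT, region TEXT, level INTEGER)",
            "INSERT INTO expression VALUES (?, ?, ?)",
            [("g2", "head", 2)],
        ),
    }
    return two_step_query(index_db, detailed_resources, gene)


if __name__ == "__main__":
    print(demo("g1"))
```

The rows returned by the second query are what a client would pass on to a rendering step to produce the visualisation.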
Project Timeline
[Timeline figure. Axis marks: Feb ’02, May ’02, Jul ’02, Sep ’02, Dec ’02, Feb ’03, May ’03, Sep ’03, with a “today” marker]
Phases: Phase 1 Starts; Phase 2 Starts
Releases: Release 1 (Jan 15th 2003); Release 1.5 (Feb 28th 2003); Release 2; Release 2.5; Release 3
Prototypes: RDB + GT2 / OGSA Prototypes Available; XML + OGSA Prototype Available; XML + OGSA Prototypes for Early Adopters
GGF: Design Documents & Demos for DAIS WG @ GGF5; GGF6 WG Papers & Prototypes; Tutorial @ GGF7
Events at NeSC: Early Adopters Workshop @ NeSC; OGSADAI Tutorial @ NeSC; Tutorial @ NeSC
Support: WS + GSI UK support (> 100 downloads)
Globus Toolkit releases tracked: TP4, TP5, GT3 A1, GT3 A2, GT3 A3, GT3 A4, GT3 Beta, GT3 Final
A DAIT for the Future
DAIT (Data Access and Integration Two)
– follow on project from OGSA-DAI, funded for two years
– continue to research, prototype and productise
– release every six months, R4 in December 2003
– R4:
• support for SQL Server and structured filesystems
• extended DBMS management functionality (e.g. archive)
• bulk load operations (where supported)
• support for DFDL file access
• triggers exposed through notification
– R5:
• Distributed Query Processing, Distributed Transactions
• Virtualised views across databases
Further information
The OGSA-DAI Project Site:
– http://www.ogsadai.org.uk
The DAIS-WG site:
– http://cs.man.ac.uk/grid-db
OGSA-DAI Users Mailing list
– [email protected]
– General discussion on grid data access and integration
Formal support for OGSA-DAI releases
– http://www.ogsadai.org.uk/support + [email protected]
OGSA-DAI training courses
– http://www.ogsadai.org.uk/courses/