WIS V-GISC – Simdat

WIS V-GISC – Simdat WMO REGIONAL SEMINAR OUAGADOUGOU BURKINA FASO 12-13 February 2007 Jacques Roumilhac

Transcript of WIS V-GISC – Simdat

Page 1: WIS V-GISC – Simdat

WIS V-GISC – Simdat

WMO REGIONAL SEMINAR

OUAGADOUGOU BURKINA FASO 12-13 February 2007

Jacques Roumilhac

Page 2: WIS V-GISC – Simdat

SIMDAT SIMDAT

WIS

WMO intends to renew its Information System

FWIS: Future WMO Information System

Now referred to as the WIS (the 'F' has been dropped)

Based on Core Metadata in XML format to describe all the data

GTS renewal is part of the concept

Nodes: GISC, DCPC and NC

GISC: Global Information System Centre

VGISC: Virtual Global Information System Centre (MetOffice, Météo France, DWD, Eumetsat, ECMWF)

Page 3: WIS V-GISC – Simdat

WIS Functional Requirements

– Support a variety of data types (common to all WMO Programmes)

– Support archive and real-time datasets

– Provide a catalogue of all the meteorological data for exchange to support WMO Programmes

– Support ad-hoc requests for data and products: Pull model

– Support routine dissemination of all observed data and products, both real-time and non-real-time: Push model

– Support network security

– Support different user profiles and data policies

– Use different types of communication links (GTS, satellite, dedicated links)

Page 4: WIS V-GISC – Simdat

SIMDAT

European project on Grid technology

Decided in 2004 to build a V-GISC demonstrator on the SIMDAT backbone

SIMDAT focuses on 4 application areas:

– product design in automotive and aerospace

– process design in pharmacology

– service provision in meteorology

Project phases:

– Phase 1: Connectivity

– Phase 2: Interoperability

– Phase 3: Knowledge

Activities shown across the phases:

– Deployment of infrastructure with particular attention to data transport and management

– Distributed data access

– Virtual Data Repository

– Introduction of Grid technologies research

– Introduction of VO

– Integration of analysis services, workflows, discovery and data mining

Page 5: WIS V-GISC – Simdat

Meteorology Application : Project Aims

Service-oriented framework targeting meteorology, hydrology, climate and environment and offering transparent access to distributed resources

– Discovery service, cataloguing service, subscription service, …

Some key elements of the project are:

– A single view of meteorological information which is distributed amongst the meteorological partners

– Improve visibility and access to meteorological data through a comprehensive discovery service

– Offer a variety of reliable services for collection and sharing of data and for routine dissemination (future)

– Provide a global access control policy managed by the partners and integrated into their existing security infrastructure (future)

Page 6: WIS V-GISC – Simdat

Architecture

Three main components build the virtual database: Data Repository, Catalogue Node and Portal

– Installed on each partner site and interconnected through a dedicated secure connection channel

Data Repository

– Interface to the partners' databases

– Offers metadata information to describe, search and locate data

– Offers an interface to retrieve data from the associated local databases

Catalogue Node

– Maintains the registry and ensures synchronisation

– Harvests metadata and requests data from the Data Repository

– Ingests data and maintains the cache of the real-time data

– Serves clients: Portal or other Nodes

– Monitors the execution of the requests

Distributed Portal

– Offers an interface to search/browse the catalogue

Page 7: WIS V-GISC – Simdat

Architecture (cont.)

Page 8: WIS V-GISC – Simdat

Support variety of data types

Interface to the existing meteorological databases

– Provides access to any kind of database (RDBMS, bespoke, flat files)

Metadata provider

– Provides metadata to discover, locate and describe data, in accordance with a defined XML metadata format

– Answers Catalogue Node metadata harvesting messages

Data provider

– Provides an interface to asynchronously request data from the associated existing database

– Transforms the XML data request into the request understood by the underlying database

– Offers a data channel (HTTP, FTP, …) to send the retrieved data to the Catalogue Node
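
The transformation step of the Data provider can be illustrated with a minimal sketch. It assumes a relational back end; the XML element names, the table names and the example station value are hypothetical and are not taken from the SIMDAT request format.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML data request; element names are assumptions.
REQUEST_XML = """
<dataRequest>
  <dataset>synop</dataset>
  <station>03772</station>
  <from>2005-10-12T00:00:00</from>
  <to>2005-10-12T23:59:59</to>
</dataRequest>
"""

# Hypothetical mapping from dataset names to local tables. Table names
# cannot be bound as SQL parameters, so they come from a fixed whitelist.
TABLES = {"synop": "synop_reports", "metar": "metar_reports"}


def xml_to_sql(request_xml):
    """Translate an XML data request into a parameterised SQL query."""
    root = ET.fromstring(request_xml.strip())
    dataset = root.findtext("dataset")
    if dataset not in TABLES:
        raise ValueError(f"unknown dataset: {dataset!r}")
    # Values supplied by the requester are passed as bound parameters.
    params = {
        "station": root.findtext("station"),
        "start": root.findtext("from"),
        "end": root.findtext("to"),
    }
    sql = (
        f"SELECT obs_time, report FROM {TABLES[dataset]} "
        "WHERE station = :station AND obs_time BETWEEN :start AND :end"
    )
    return sql, params


sql, params = xml_to_sql(REQUEST_XML)
print(sql)
print(params)
```

A bespoke or flat-file back end would replace the SQL string with its own request syntax; only the XML parsing step would stay the same.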

Page 9: WIS V-GISC – Simdat

Support variety of data types

[Diagram: partner databases connected to the Community Portal and Catalogue. Data types shown: satellite data, ERA40 data, TIGGE data, climate time series, aviation data (TAF, METAR), lightning data, model output, real-time GTS data, observations, wave observations, oceanographic data (BATHY, SHIP)]

More than 27,000 datasets discoverable

Page 10: WIS V-GISC – Simdat

Catalogue of all available products

The Catalogue is built using the metadata harvested from the Data Repositories

The Catalogue is synchronized and replicated on each Catalogue Node

The Catalogue Node offers discovery services accessible to the user through the distributed portal

The Catalogue contains the information necessary to retrieve and sub-select the data

Page 11: WIS V-GISC – Simdat

WMO Core metadata standard - Challenges

WMO Core Profile: a profile of ISO 19115 on geo-referenced data

Scalability

– Records are large and contain redundant information, slowing down the database hosting the catalogue

– The same information is repeated in all metadata records, so unnecessary information circulates over the network

– Some documents are orders of magnitude larger than the data itself

– Cannot represent very large archives with small granularity

Cannot fulfil all requirements to build the V-GISC

– Information on how to retrieve data from local databases

– Information to create a directory (taxonomy of documents)

– Information to sub-select data from a dataset

Page 12: WIS V-GISC – Simdat

WMO Core metadata standard - Solutions

Add specific extensions to define all the information needed to implement the system that is not defined by the WMO Core

– Internal unique ID

– Hierarchy relationship

– Physical location (which node holds the data)

– Information used to generate a valid request to retrieve data from the end system

– Information used to create web interface for the end user

Work with WMO ET to integrate extensions in future releases of standards

Split XML documents into fragments to solve the scalability issue

– WMO Core metadata is structured

– Some parts are shared amongst many documents

[Diagram: an example record split into fragments: Core = WMO, Owner = UKMO, Data type = Synop, Location = Heathrow, Date = 2005-10-12]
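
The idea of splitting records into shared fragments can be sketched as follows. This is only an illustration under stated assumptions: the fragment names follow the labels of the example above, a content hash is used as the fragment key, and the second record (with a different date) is invented for the demonstration. The slides do not say how SIMDAT stores or keys its fragments.

```python
import hashlib


def split_into_fragments(record, store):
    """Store each part of a record once; return references to the parts."""
    refs = {}
    for part, content in record.items():
        # Assumption: fragments are keyed by a hash of their content,
        # so identical parts of different records share one entry.
        key = hashlib.sha256(content.encode("utf-8")).hexdigest()
        store.setdefault(key, content)
        refs[part] = key
    return refs


def rebuild(refs, store):
    """Reassemble a full record from its fragment references."""
    return {part: store[key] for part, key in refs.items()}


store = {}
rec1 = {"core": "WMO", "owner": "UKMO", "data_type": "Synop",
        "location": "Heathrow", "date": "2005-10-12"}
# Hypothetical second record differing only in its date.
rec2 = dict(rec1, date="2005-10-13")

refs1 = split_into_fragments(rec1, store)
refs2 = split_into_fragments(rec2, store)

# Ten parts were submitted, but only six distinct fragments are stored.
print(len(store))
```

The two records share four of their five fragments, which is the saving the split is meant to achieve for catalogues with many near-identical records.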

Page 13: WIS V-GISC – Simdat

Metadata Synchronization

New observation has been received by one site

Page 14: WIS V-GISC – Simdat

Metadata Synchronization (cont.)

The associated metadata are generated and published in the Data Repository

Page 15: WIS V-GISC – Simdat

Metadata Synchronization (cont.)

Catalogue Node harvests the new metadata and stores it in its Catalogue

Page 16: WIS V-GISC – Simdat

Metadata Synchronization (cont.)

The Catalogue of the other Nodes is synchronized and the dataset is searchable from any sites
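
The four synchronization steps can be walked through with a small in-memory sketch. The class and method names are hypothetical, and the real system exchanges messages between sites over a secure channel rather than calling objects directly.

```python
class DataRepository:
    """Holds the metadata published by one site (hypothetical model)."""

    def __init__(self):
        self.published = {}

    def publish(self, dataset_id, metadata):
        self.published[dataset_id] = metadata


class CatalogueNode:
    """Harvests the local repository and synchronises with peer nodes."""

    def __init__(self, name, repository):
        self.name = name
        self.repository = repository
        self.catalogue = {}
        self.peers = []

    def harvest(self):
        """Copy metadata not yet in the catalogue; return the new records."""
        new = {k: v for k, v in self.repository.published.items()
               if k not in self.catalogue}
        self.catalogue.update(new)
        return new

    def synchronise(self, new_records):
        """Push newly harvested records to every peer catalogue."""
        for peer in self.peers:
            peer.catalogue.update(new_records)

    def search(self, text):
        """Return the ids of records whose metadata contains the text."""
        return sorted(k for k, v in self.catalogue.items()
                      if text.lower() in v.lower())


site_a = CatalogueNode("A", DataRepository())
site_b = CatalogueNode("B", DataRepository())
site_a.peers.append(site_b)

# Steps 1 and 2: a new observation arrives and its metadata is published.
site_a.repository.publish("obs-001", "Synop observation, Heathrow")
# Step 3: the Catalogue Node harvests the new metadata.
new_records = site_a.harvest()
# Step 4: the other nodes are synchronised; the dataset is searchable anywhere.
site_a.synchronise(new_records)
print(site_b.search("heathrow"))
```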

Page 17: WIS V-GISC – Simdat

Support Archive and Real-time Data

A GTS Data Repository has been developed

– Interfaced with the GTS (through an MSS)

– It publishes GTS collections in the Cache

– Currently, no data replication over the SIMDAT infrastructure

For Phase III: several sources plugged into SIMDAT

– Strategy to uniquely identify the datasets (using MD5 hash codes)

– Real-time data replication using the metadata synchronization mechanism

– Generic solution that can be used by all the partners
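
Identifying datasets by MD5 hash can be sketched as below. The slide does not say what exactly is hashed; this example assumes the raw payload of a dataset is the input, and the cache class is hypothetical.

```python
import hashlib


def dataset_id(payload: bytes) -> str:
    """Derive an identifier from the payload (assumed hash input)."""
    return hashlib.md5(payload).hexdigest()


class Cache:
    """Hypothetical cache that ignores datasets it has already received."""

    def __init__(self):
        self.entries = {}

    def ingest(self, payload: bytes):
        """Store the payload; return (id, True) if new, (id, False) if known."""
        key = dataset_id(payload)
        if key in self.entries:
            return key, False
        self.entries[key] = payload
        return key, True


cache = Cache()
payload = b"example bulletin payload"  # placeholder content
first_id, first_new = cache.ingest(payload)
# The same data arriving from a second source maps to the same identifier.
second_id, second_new = cache.ingest(payload)
print(first_id == second_id, first_new, second_new)
```

With several sources plugged into the same infrastructure, an identifier derived from the content lets every node recognise a dataset it already holds without coordinating with the sender.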

Page 18: WIS V-GISC – Simdat

Support Pull model

A Portal is deployed on each site and offers a unique view of all the datasets available

Portal offers discovery mechanisms to the users

– Full text, temporal and geographical search (Google-like)

– Directory browsing (Yahoo-like browsing)

Portal provides request handling mechanisms to the users

– Submitted requests can be asynchronous to manage long-lived requests

– Users can manage requests (check status, delete them …)

– Users retrieve the associated data when the request is complete
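
The request handling described above (submit, check status, delete, retrieve when complete) can be sketched with a thread pool. The `RequestManager` class and the `fetch` callable are hypothetical stand-ins; the slides do not describe the Portal's actual implementation.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


class RequestManager:
    """Runs data requests in the background so that users can poll them."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable that performs the retrieval
        self._pool = ThreadPoolExecutor(max_workers=2)
        self._futures = {}

    def submit(self, query):
        """Queue a request and return its identifier immediately."""
        request_id = str(uuid.uuid4())
        self._futures[request_id] = self._pool.submit(self._fetch, query)
        return request_id

    def status(self, request_id):
        future = self._futures[request_id]
        if future.cancelled():
            return "cancelled"
        if future.done():
            return "failed" if future.exception() else "complete"
        return "running" if future.running() else "queued"

    def delete(self, request_id):
        """Forget a request; cancel it if it has not started yet."""
        self._futures.pop(request_id).cancel()

    def result(self, request_id, timeout=None):
        """Block until the data is available, then return it."""
        return self._futures[request_id].result(timeout=timeout)


manager = RequestManager(lambda query: f"data for {query}")
request_id = manager.submit("synop Heathrow 2005-10-12")
data = manager.result(request_id, timeout=5)
print(manager.status(request_id), data)
```

Returning an identifier at submission time is what lets a long-lived request outlast the user's session: the user can come back later, check the status and collect the data.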

Page 19: WIS V-GISC – Simdat

Support of different user profiles and data policies

VO Domain

– Group of organisations that share a common data access policy (e.g. the RA-VI V-GISC)

– Access to protected resources occurs on a domain basis

Authentication (AuthN)

– Users register with a node

– Users are known to all the nodes in the same domain

– Any node within the domain should be able to authenticate a user of the domain

Authorisation (AuthZ)

– AuthZ is performed at the node level to allow/deny access to the data

– Data Access policy is expressed within the metadata
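
The split between domain-wide authentication and node-level authorisation can be sketched as follows. All names are hypothetical, credential checking is omitted, and the rule that a record without a policy is open to every authenticated user is an assumption made for the example.

```python
class Domain:
    """A VO Domain: its user registry is visible to all member nodes."""

    def __init__(self, name):
        self.name = name
        self.users = {}  # username -> set of roles


class Node:
    def __init__(self, name, domain):
        self.name = name
        self.domain = domain

    def register(self, username, roles):
        """A user registers with one node; the whole domain then knows them."""
        self.domain.users[username] = set(roles)

    def authenticate(self, username):
        """AuthN: any node of the domain recognises a user of the domain."""
        return username in self.domain.users

    def authorise(self, username, metadata):
        """AuthZ: decided at the node, from the policy held in the metadata."""
        if not self.authenticate(username):
            return False
        allowed = set(metadata.get("access_policy", []))
        if not allowed:
            # Assumption: no policy means no restriction within the domain.
            return True
        return bool(allowed & self.domain.users[username])


domain = Domain("RA-VI V-GISC")
node_a = Node("A", domain)
node_b = Node("B", domain)

node_a.register("alice", ["research"])
restricted = {"title": "Model output", "access_policy": ["operational"]}
shared = {"title": "Synop", "access_policy": ["research"]}

print(node_b.authenticate("alice"))
print(node_b.authorise("alice", shared), node_b.authorise("alice", restricted))
```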

Implementation : first release March 2007

[Diagram: nodes A, B, C, D, E and F and their VO Domains]