D4science-II Codata

D4Science: An e-Infrastructure for Facilitating Fisheries and Aquaculture Resource Management. Pasquale Pagano, National Research Council of Italy. 22nd International CODATA, 24-27 October 2010, Cape Town (South Africa). www.d4science.eu

  • D4Science: An e-Infrastructure for Facilitating Fisheries and Aquaculture Resource Management

    Pasquale Pagano, National Research Council of Italy
    22nd International CODATA, 24-27 October 2010, Cape Town (South Africa)
    www.d4science.eu

    www.d4science.eu, D4Science, 22nd International CODATA, Cape Town, 24-27 October 2010

    Assumptions

    Consolidated facts:
      • Very rich applications and data collections are currently maintained by a multitude of authoritative providers
      • Different problems require different execution paradigms (batch, map-reduce, synchronous call, message-queue, among others)
      • Key distributed computation technologies exist: grid (gLite and Globus), distributed resource management (Condor), clusters (Hadoop), among others
      • Several standards are adopted in the same domain

    Societal observations: a rich variety of protocols, models, and formats
      • creates barriers to the usage of resources
      • dramatically delays new exploitation patterns

    Technical observations:
      • Heterogeneity of protocols, models, and formats increases load
      • Load increases failures

    D4Science Vision

    D4Science objectives:
      • hide heterogeneity, i.e. abstract over differences in location, protocol, and model;
      • embrace heterogeneity, i.e. allow for multiple locations, protocols, and models.

    Technical goals:
      • no bottlenecks: scale no less than the interfaced resources
      • no outages: keep failures partial and temporary
      • autonomicity: the system reacts and recovers
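
As an illustration of "hide heterogeneity" and "keep failures partial", the following is a minimal Python sketch, not gCube code. The names `Record`, `Source`, `CsvFileSource`, `KeyValueServiceSource`, and `Registry` are all hypothetical: two sources with different native formats are wrapped behind one interface, and a miss in one source does not prevent an answer from another.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Record:
    """Common model exposed to callers, whatever the underlying source."""
    identifier: str
    title: str


class Source(Protocol):
    """Hypothetical uniform interface (an assumption, not a gCube API)."""

    def fetch(self, identifier: str) -> Record: ...


class CsvFileSource:
    """Stand-in for a local source whose rows are 'id,title' strings."""

    def __init__(self, rows: list[str]) -> None:
        self._rows = dict(row.split(",", 1) for row in rows)

    def fetch(self, identifier: str) -> Record:
        return Record(identifier, self._rows[identifier])


class KeyValueServiceSource:
    """Stand-in for a remote source returning dictionaries with other keys."""

    def __init__(self, payloads: dict[str, dict[str, str]]) -> None:
        self._payloads = payloads

    def fetch(self, identifier: str) -> Record:
        return Record(identifier, self._payloads[identifier]["dc:title"])


class Registry:
    """Routes a request to whichever registered source holds the item."""

    def __init__(self) -> None:
        self._sources: list[Source] = []

    def register(self, source: Source) -> None:
        self._sources.append(source)

    def fetch(self, identifier: str) -> Record:
        for source in self._sources:
            try:
                return source.fetch(identifier)
            except KeyError:
                continue  # a miss in one source is partial, not fatal
        raise KeyError(identifier)
```

Callers only ever see `Record`, so adding a third source with yet another native format would not change client code.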

    From a testbed to a production ecosystem

    [Timeline figure: functionality growing from gLite to gCube; dates shown: Oct. 04, Nov. 07, Jan. 08, Dec. 09, Oct. 09, Sept. 11]

    Infrastructure Exploitation

    Nodes
      • 30 nodes: CNR, NKUA, ESA, FAO, UNIBASEL
      • 29 nodes: CNR, NKUA, FAO, UNIBASEL

    Collections
      • 25 data: EEA, MERIS, AATSR
      • 69 metadata: es, ISO19115, eiDB
      • 15 data: AquaMaps, Fact Sheets, Country Maps
      • 28 metadata: FARM_dc, aquamaps

    Functionality
      • Integration with gPod
      • Geographical and text search
      • Search by metadata
      • Personal workspace
      • Objects annotation
      • Report generation
      • Maps generation
      • Time series management

    Production: more than 500 autonomic Web Services

    gCube as a Digital Library System

    A Digital Library System is a possibly distributed system that collects, manages and preserves for the long term rich digital content, and offers to its user communities specialised functionality on that content, of measurable quality and according to codified policies. [The Digital Library Reference Model]

    The gCube data infrastructure enabling framework provides DL functionality by:
      • federating existing digital content maintained in a variety of tailored repository systems
      • supporting the generation of new digital content by exploiting heterogeneous computational platforms
      • providing discovery and access capabilities on diversely described and modeled digital content

    gCube as an e-Infrastructure ecosystem enabling framework

    By bridging a number of well-established systems and standards from various domains, including high-energy physics, biodiversity, and fishery and aquaculture resources management, gCube realises an e-Infrastructure ecosystem.

    How does it work?

    Why is sharing through VREs key?

    Through the VRE, groups of users have controlled access to distributed data and services integrated under a personalised interface.

    Why is sharing through VREs key?

    A Virtual Research Environment (VRE) supports cooperative activities:
      • Metadata cleaning, enrichment, and transformation by exploiting mapping schemas, controlled vocabularies, thesauri, and ontologies
      • Process refinement and showcase implementation (restricted to a set of users)
      • Data assessment (required to make data publicly exploitable by VO members)
      • Validation by expert users of products generated through data elaboration or simulation

    Why is sharing through VREs key?

    The integrated environment of a VRE provides a set of functionality to support and perform research activities:
      • integrating heterogeneous data and services
      • processing information on demand and ingesting the results
      • sharing data and processes with other users
      • customizing collections of information
      • storing user actions and exploiting them for further use
      • aggregating relevant information into ad-hoc information sources and keeping them updated

    Building Virtual Research Environments

    VRE Facilities

      • Workspace: a virtual desktop to organize the working environment
      • Species Maps Generation and Time Series Management: tools supporting specific tasks
      • Report Management: a virtual live document to describe research results

    [Diagram labels repeated under the facilities: Search, Annotation, Visualisation, Transformation, Storage]

    Workspace

    A collaboration-oriented suite providing for:
      • seamless access and organisation facilities on a rich array of objects (e.g. Information Objects, Queries, Files, Templates)
      • mediation between external world objects, systems and infrastructures (import/export/publishing)
      • common file manager support (drag & drop, contextual menu)
      • an effective rich object sharing facility

    Species Distribution Maps Generation

    AquaMaps is an application* tailored to predict global distributions of marine species. Initially designed for marine mammals and subsequently generalised to marine species, it generates color-coded species range maps using a grid of half-degree latitude and longitude blocks, by interfacing several databases and repository providers.

    * Algorithm by Kaschner et al. 2006

    Species Distribution Maps Generation

    AquaMaps execution is based on the gCube Ecological Niche Modelling Suite, which allows the extrapolation of known species occurrences
      • to determine environmental envelopes (species tolerances)
      • to predict future distributions by matching species tolerances against local environmental conditions (e.g. climate change and sea pollution)

    Very large volume of input and output data: HSPEC native range 56,468,301; HSPEC suitable range 114,989,360

    Very large number of computations: one multispecies map computed on 6,188 half-degree cells (over 170k) and 2,540 species requires 125 million computations (Eli E. Agbayani, FishBase Project/INCOFISH WP1, WorldFish Center)
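
To make "matching species tolerances against local environmental conditions" concrete, here is a small Python sketch. The slides do not give the response function, so the trapezoidal envelope (zero outside the tolerated range, one inside the preferred range, linear in between) and the product across parameters are assumptions based on how envelope models of this kind are commonly described. The names `Envelope` and `cell_suitability` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    """Assumed species tolerance for one environmental parameter."""
    minimum: float
    preferred_min: float
    preferred_max: float
    maximum: float

    def suitability(self, value: float) -> float:
        """Return a score in [0, 1] for one local condition."""
        if value < self.minimum or value > self.maximum:
            return 0.0  # outside the tolerated range
        if value < self.preferred_min:
            # rising edge of the trapezoid
            return (value - self.minimum) / (self.preferred_min - self.minimum)
        if value > self.preferred_max:
            # falling edge of the trapezoid
            return (self.maximum - value) / (self.maximum - self.preferred_max)
        return 1.0  # inside the preferred range


def cell_suitability(envelopes: dict[str, Envelope],
                     conditions: dict[str, float]) -> float:
    """Multiply per-parameter scores for one half-degree cell."""
    score = 1.0
    for name, envelope in envelopes.items():
        score *= envelope.suitability(conditions[name])
    return score
```

One such evaluation is needed for every species and cell pair, which is why the number of computations grows so quickly with the size of the map.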

    Time Series Management

    Offers a set of tools to manage capture statistics:
      • Supports the complete TS lifecycle
      • Supports validation, curation, and analysis
      • Provides support for data reallocation
      • Produces uniform data sets

    Time Series

    Offers a set of tools to operate on capture statistics:
      • Multiple key families support
      • Filtering, grouping, and aggregation
      • Union
      • Mining

    Automatically produces provenance information
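
The operations listed above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `TimeSeries` class, its method names, and the column names used in the example are hypothetical and do not reflect the gCube Time Series API. Each operation returns a new series and appends a step to a provenance trail.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class TimeSeries:
    """Hypothetical in-memory series: rows plus the steps that produced them."""
    rows: list[dict]
    provenance: list[str] = field(default_factory=list)

    def filter(self, key: str, value) -> "TimeSeries":
        """Keep only rows whose `key` equals `value`."""
        kept = [row for row in self.rows if row[key] == value]
        return TimeSeries(kept, self.provenance + [f"filter({key}={value})"])

    def aggregate(self, by: str, measure: str) -> "TimeSeries":
        """Group rows on `by` and sum `measure` within each group."""
        totals: dict = defaultdict(float)
        for row in self.rows:
            totals[row[by]] += row[measure]
        rows = [{by: k, measure: v} for k, v in sorted(totals.items())]
        step = f"aggregate(sum {measure} by {by})"
        return TimeSeries(rows, self.provenance + [step])

    def union(self, other: "TimeSeries") -> "TimeSeries":
        """Concatenate two series and merge their provenance trails."""
        steps = self.provenance + other.provenance + ["union"]
        return TimeSeries(self.rows + other.rows, steps)
```

For example, `a.union(b).filter("species", "tuna").aggregate("year", "catch_t")` yields yearly totals together with a three-step provenance trail recording how they were derived.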

    Report Management

    A collaboration-oriented suite providing for:
      • template-oriented, feature-rich and flexible document format definition
      • effective and infrastructure-integrated report compilation (drag & drop workspace items)
      • collaborative and distributed editing (workspace based)
      • standard-based report materialisation (HTML, OpenXML)

    VREs, Workspaces and Report in Action

    gCube and Humanities: the gMan case

    JISC - King's College London

    Look at new ways of integrating existing data resources for Classics and add services so that research work based on integrated resources can be published.

    Data sources
      • The Heidelberger Gesamtverzeichnis (HGV) der griechischen Papyrusurkunden Aegyptens, a collection of metadata records for 55,000 Greek papyri from Egypt.
      • Projet Volterra, a database of Roman legal texts, and associated metadata, from various sources (epigraphic, papyrological, or literary), currently in the low tens of thousands but very much in progress.
      • The Inscriptions of Aphrodisias (InsAph), a corpus of about 2,000 ancient Greek inscriptions from the Roman city of Aphrodisias in Asia Minor, including transcribed texts and metadata marked up using EpiDoc TEI, as well as images of the physical objects.

    Main functionality
      • cross-collection search
      • workspace
      • annotation
      • report creation

    Early results in AHM 2009, Phil. Trans. A special issue

    VRE Summary

    D4Science approach:
      • Heterogeneous resources are accessible in a common ecosystem of resources, despite their locations, technologies, and protocols
      • Different communities have access to different views, according to the conditions under which the sharing can occur
      • Each community can define its own virtual research environment to satisfy specific needs, for a limited timeframe and at no cost for the providers of the resource
      • Several virtual research environments can coexist without interfering with each other, even when competing for the same resources
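
The points above can be read as a small data model. The Python sketch below is an assumption made for illustration only; the `VRE` class and its fields are hypothetical and are not the gCube representation. It shows a VRE as a named view over shared resources, limited to a set of members and a timeframe, with two VREs coexisting over the same resource.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class VRE:
    """Hypothetical view over shared resources for one community."""
    name: str
    members: frozenset[str]
    resources: frozenset[str]
    valid_from: date
    valid_until: date

    def can_access(self, user: str, resource: str, on: date) -> bool:
        """Access requires membership, an included resource, and a valid date."""
        return (
            user in self.members
            and resource in self.resources
            and self.valid_from <= on <= self.valid_until
        )


# Two environments defined over the same underlying resource.
fisheries = VRE("fisheries", frozenset({"alice"}), frozenset({"aquamaps"}),
                date(2010, 1, 1), date(2010, 12, 31))
environment = VRE("environment", frozenset({"bob"}),
                  frozenset({"aquamaps", "meris"}),
                  date(2010, 6, 1), date(2011, 5, 31))
```

Because each VRE only references resources rather than owning them, defining or retiring one does not affect the other.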

    Conclusions

    Facts:
      • Very rich services and data collections are currently maintained by a multitude of authoritative providers
      • Several standards are adopted in the same domain

    Interoperability approaches are key to exploiting such richness

    D4Science offers a variety of patterns, tools, and solutions to interconnect
      • heterogeneous digital content
      • heterogeneous repository systems
      • heterogeneous computation platforms
    with a rich set of free-to-use tailored services
      • to decrease the cost of adoption
      • to reduce the time to market of new ideas
      • to deal with a plethora of standards

    Supported Standards

      • WS-*, WSRF, WS-BPEL
      • JDL, JSDL, Glue Schema (part)
      • X-*, DC, TEI, ISO etc.
      • JSR (several)
      • GSI-Security, XACML, SAML
      • OpenSearch
      • OGC related

    Comply with: OAI-PMH, OAI-ORE

    Supported Standards

      • WSRF Specifications: WS-ResourceProperties (WSRF-RP), WS-ResourceLifetime (WSRF-RL), WS-ServiceGroup (WSRF-SG), WS-BaseFaults (WSRF-BF)
      • JSR: 168 (Simple Portlets), 286 (168 update), 160 (JMX)
      • WSN Specifications: WS-BaseNotification, WS-Topics, (WS-BrokeredNotification)
      • WS-* Standards: SOAP, WSDL, WS-Addressing
      • ISO: ISO 3166 (countries), ISO 4217 (currencies), ISO 19115 (geo-location)
      • X-*: XML, XSD, XSL, XSLT, XPath, XQuery
      • OGC: Web Coverage Processing Service, Web Coverage Service, Web Feature Service, Web Map Context, Web Map Service, Web Map Tile Service, Web Processing Service, Web Service Common
      • OGF Standard: Glue Schema (2)

    Comply with: OAI-PMH, OAI-ORE
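
As a concrete example of what OAI-PMH compliance means for a harvester, the Python sketch below builds a `ListRecords` request. The verb and the `metadataPrefix` and `set` arguments are defined by the OAI-PMH protocol, and `oai_dc` is the metadata format every compliant repository must support. The base URL and the helper name `list_records_url` are placeholders chosen for this example, not D4Science endpoints.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str,
                     metadata_prefix: str = "oai_dc",
                     set_spec: Optional[str] = None) -> str:
    """Build an OAI-PMH ListRecords request URL for a repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec  # optional selective harvesting
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint, used only to show the shape of the request.
print(list_records_url("https://repository.example.org/oai"))
```

The response to such a request is an XML document that a harvester would parse and, for large result sets, continue with a `resumptionToken`.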

    Find us

    www.gcube-system.org
    www.d4science.eu

    Donatella Castelli, D4Science-II Project [email protected]

    Pasquale Pagano, D4Science-II Technical [email protected]

    Thank You For Your Attention

    Other: WSRP

    OpenGIS: KML

    OGF Standard: Glue Schema (2)

    eXtensible Access Control Markup Language (XACML) is an XML-based specification for writing access control policies and for how to interpret them.

    Security Assertion Markup Language (SAML) is an XML specification defining the syntax and processing semantics of security assertions.
