An ecosystem to support FAIR data
Uploaded by: blue-bridge
Transcript of An ecosystem to support FAIR data
FAIR DATA PRINCIPLES
Findable:
F1. (meta)data are assigned a globally unique and persistent identifier;
F2. data are described with rich metadata;
F3. metadata clearly and explicitly include the identifier of the data it describes;
F4. (meta)data are registered or indexed in a searchable resource;
Accessible:
A1. (meta)data are retrievable by their identifier using a standardized communications protocol;
A1.1. the protocol is open, free, and universally implementable;
A1.2. the protocol allows for an authentication and authorization procedure, where necessary;
A2. metadata are accessible, even when the data are no longer available;
Interoperable:
I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation;
I2. (meta)data use vocabularies that follow FAIR principles;
I3. (meta)data include qualified references to other (meta)data;
Reusable:
R1. (meta)data are richly described with a plurality of accurate and relevant attributes;
R1.1. (meta)data are released with a clear and accessible data usage license;
R1.2. (meta)data are associated with detailed provenance;
R1.3. (meta)data meet domain-relevant community standards.
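Several of these principles can be checked mechanically on a metadata record. The sketch below is illustrative only: the field names (`identifier`, `describes`, `license`, `provenance`) and the `check_record` helper are assumptions, not part of the FAIR principles, and passing these checks does not by itself make a resource FAIR.

```python
# Hypothetical field names; the FAIR principles do not prescribe a schema.
CHECKS = {
    "F1": lambda r: bool(r.get("identifier")),     # an identifier is present (uniqueness/persistence not verified)
    "F3": lambda r: bool(r.get("describes")),      # metadata names the data it describes
    "R1.1": lambda r: bool(r.get("license")),      # a usage license is present
    "R1.2": lambda r: bool(r.get("provenance")),   # provenance is present
}


def check_record(record):
    """Return the labels of the principles whose check fails for this record."""
    return [label for label, test in CHECKS.items() if not test(record)]


# Made-up example record with no provenance field.
record = {
    "identifier": "https://example.org/metadata/1",
    "describes": "https://example.org/dataset/1",
    "license": "https://creativecommons.org/licenses/by/4.0/",
}
missing = check_record(record)
print(missing)
```

Only presence is tested here; judging whether an identifier is really persistent or a license really clear still needs human review.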
FAIR DATA PRINCIPLES - METADATA
Findable:
F1. metadata are assigned a globally unique and persistent identifier;
F2. data are described with rich metadata;
F3. metadata clearly and explicitly include the identifier of the data it describes;
F4. (meta)data are registered or indexed in a searchable resource;
Accessible:
A1. metadata are retrievable by their identifier using a standardized communications protocol;
A1.1. the protocol is open, free, and universally implementable;
A1.2. the protocol allows for an authentication and authorization procedure, where necessary;
A2. metadata are accessible, even when the data are no longer available;
Interoperable:
I1. metadata use a formal, accessible, shared, and broadly applicable language for knowledge representation;
I2. metadata use vocabularies that follow FAIR principles;
I3. metadata include qualified references to other (meta)data;
Reusable:
R1. metadata are richly described with a plurality of accurate and relevant attributes;
R1.1. metadata are released with a clear and accessible data usage license;
R1.2. metadata are associated with detailed provenance;
R1.3. metadata meet domain-relevant community standards.
FAIR DATA PRINCIPLES - DATA
Findable:
F1. data are assigned a globally unique and persistent identifier;
F2. data are described with rich metadata;
F3. metadata clearly and explicitly include the identifier of the data it describes;
F4. (meta)data are registered or indexed in a searchable resource;
Accessible:
A1. data are retrievable by their identifier using a standardized communications protocol;
A1.1. the protocol is open, free, and universally implementable;
A1.2. the protocol allows for an authentication and authorization procedure, where necessary;
A2. metadata are accessible, even when the data are no longer available;
Interoperable:
I1. data use a formal, accessible, shared, and broadly applicable language for knowledge representation;
I2. data use vocabularies that follow FAIR principles;
I3. data include qualified references to other (meta)data;
Reusable:
R1. data are richly described with a plurality of accurate and relevant attributes;
R1.1. data are released with a clear and accessible data usage license;
R1.2. data are associated with detailed provenance;
R1.3. data meet domain-relevant community standards.
FAIR DATA PRINCIPLES - SUPPORTING INFRASTRUCTURE
Findable:
F1. (meta)data are assigned a globally unique and persistent identifier;
F2. data are described with rich metadata;
F3. metadata clearly and explicitly include the identifier of the data it describes;
F4. (meta)data are registered or indexed in a searchable resource;
Accessible:
A1. (meta)data are retrievable by their identifier using a standardized communications protocol;
A1.1. the protocol is open, free, and universally implementable;
A1.2. the protocol allows for an authentication and authorization procedure, where necessary;
A2. metadata are accessible, even when the data are no longer available;
Interoperable:
I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation;
I2. (meta)data use vocabularies that follow FAIR principles;
I3. (meta)data include qualified references to other (meta)data;
Reusable:
R1. (meta)data are richly described with a plurality of accurate and relevant attributes;
R1.1. (meta)data are released with a clear and accessible data usage license;
R1.2. (meta)data are associated with detailed provenance;
R1.3. (meta)data meet domain-relevant community standards.
FAIR DATA ECOSYSTEM (DTL)
Create, Publish, Annotate, Find
BYOD FAIR Hackathon
BRING YOUR OWN DATA - BYOD
■ Goals:
  ■ Learn how to make data linkable “hands-on” with experts
  ■ Create a “telling story” to demonstrate its use
  ■ Make FAIR Data at the source
■ Composition:
  ■ Data owners – specialists on given datasets
  ■ Data interoperability experts
  ■ Domain experts
Source: Marcos Roos
BYOD Planning
Preparation
Identify Plan
Datasets
Attendees' profile
Output data access
Tentative dates
Tentative venue
Costs
Funds
Coordination
Set date
Invite attendees
Set venue
Catering
Lodging
Financial planning
Publicity
Working document
Preparatory calls
Data hosting
Software hosting
Documentation hosting
BYOD Planning
Execution
Day One
Introduction
SW, LD, Ontology intro
Use case intro
Workgroups division
Working sessions
WWW/TTTALA
Day Two
Progress report
Working sessions
Groups reports
WWW/TTTALA
Day Three
Data integration
Answer driving question
Explore data
Demo improvement
Final report
WWW/TTTALA
BYOD Planning
Follow-Up
D+15
Report difficulties
Clarifications
Next steps
D+45
Report difficulties
Clarifications
Next steps
Implementation
Expand FAIRification
Implement solution
Scale-up solution
Deploy
FAIRIFICATION PROCESS
■ Retrieve original data
■ Dataset identification and analysis
■ Definition of the semantic model
■ Data transformation
■ License assignment
■ Metadata definition
■ FAIR Data Resource (data, metadata, license) deployment
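The steps above can be read as a pipeline that ends in a resource bundling data, metadata, and license. A minimal sketch, assuming a CSV source and a column-to-property mapping standing in for the semantic model; every name here (`fairify`, `SEMANTIC_MODEL`, the example URIs) is hypothetical and not taken from the FAIRifier.

```python
import csv
import io

# Hypothetical semantic model: maps source column names to property URIs.
SEMANTIC_MODEL = {
    "gene": "http://example.org/vocab/gene",
    "disease": "http://example.org/vocab/disease",
}


def fairify(raw_csv, model, license_uri, title):
    """Turn raw CSV text into a dict bundling data, metadata, and license."""
    # Retrieve and analyse the original data.
    rows = list(csv.DictReader(io.StringIO(raw_csv)))
    # Transform: rename mapped columns to property URIs, drop unmapped ones.
    data = [
        {model[col]: val for col, val in row.items() if col in model}
        for row in rows
    ]
    # Define metadata describing the transformed dataset.
    metadata = {
        "title": title,
        "records": len(data),
        "properties": sorted(model.values()),
    }
    # Bundle data, metadata, and the assigned license for deployment.
    return {"data": data, "metadata": metadata, "license": license_uri}


raw = "gene,disease\nBRCA1,breast cancer\nHTT,Huntington disease\n"
resource = fairify(
    raw,
    SEMANTIC_MODEL,
    "https://creativecommons.org/licenses/by/4.0/",
    "Example gene-disease associations",
)
print(resource["metadata"]["records"])
```

A real transformation would emit RDF or another formal representation; the dictionary output only keeps the example dependency-free.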
FAIRIFIER
■ Transform non-FAIR datasets into FAIR Data Resources (dataset in FAIR format, license and metadata)
■ Data munging
■ Semantic modeling
■ License definition
■ Metadata definition and extraction
■ Data publication
FAIRIFICATION - NEW DATASET TYPE
[Diagram. Components: FAIR Data Resource; FAIR Data Model Registry; Semantic Model & Non-FAIR-to-FAIR mapping. Arrow labels: submit, generate, store.]
FAIRIFICATION - RECURRING DATASET TYPE
[Diagram. Components: FAIR Data Resource; FAIR Data Model Registry; Semantic Model & Non-FAIR-to-FAIR mapping. Arrow labels: submit, generate, query, retrieve.]
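Read this way, the new and recurring cases differ only in whether a semantic model and mapping for the dataset type are already in the registry. A minimal in-memory sketch of that store/query/retrieve flow; `ModelRegistry`, `get_model`, and the dataset-type key are hypothetical names, and a real registry would be a shared service rather than a dictionary.

```python
class ModelRegistry:
    """In-memory stand-in for a FAIR data model registry (hypothetical)."""

    def __init__(self):
        self._models = {}

    def store(self, dataset_type, model):
        self._models[dataset_type] = model

    def query(self, dataset_type):
        """Return the stored model for this dataset type, or None."""
        return self._models.get(dataset_type)


def get_model(registry, dataset_type, build_model):
    """Reuse a registered model if one exists; otherwise build and store it."""
    model = registry.query(dataset_type)
    if model is None:
        # New dataset type: do the modelling work once and register it.
        model = build_model()
        registry.store(dataset_type, model)
    # Recurring dataset type: the retrieved model is returned unchanged.
    return model


registry = ModelRegistry()
calls = []


def build():
    calls.append(1)  # record each time modelling work is actually done
    return {"gene": "http://example.org/vocab/gene"}


first = get_model(registry, "gene-table", build)
second = get_model(registry, "gene-table", build)
print(len(calls), first is second)
```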
FAIR DATA POINT
A particular class of FAIR Data System that provides access to datasets in a FAIR manner. The datasets can be external or internal to the FAIR Data Point. The source data can be either a non-FAIR dataset or a FAIR Data Resource. If the source data is non-FAIR, the FAIR Data Point must make the necessary FAIR transformations on the fly.
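The on-the-fly case can be sketched as a dispatch on whether the source is already FAIR. The names below (`serve`, `to_fair`, the `fair` flag, the example URI) are hypothetical and only illustrate transforming at request time; they are not the FAIR Data Point API.

```python
def to_fair(rows, mapping):
    """Rename source columns to property URIs (stand-in for a real transformation)."""
    return [{mapping[k]: v for k, v in row.items() if k in mapping} for row in rows]


def serve(source, mapping):
    """Return FAIR records, transforming only when the source is non-FAIR."""
    if source["fair"]:
        return source["rows"]                  # already a FAIR Data Resource
    return to_fair(source["rows"], mapping)    # non-FAIR: transform per request


mapping = {"gene": "http://example.org/vocab/gene"}
non_fair = {"fair": False, "rows": [{"gene": "HTT", "note": "n/a"}]}
fair = {"fair": True, "rows": [{"http://example.org/vocab/gene": "HTT"}]}
print(serve(non_fair, mapping) == serve(fair, mapping))
```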
FAIR Data Point metadata
  Catalog 1 metadata
    Dataset metadata
      Title
      Publisher
      License
      Theme(s)
      Version
      …
Vocabulary: DCAT/HCLS
FAIR Data Point metadata
  Catalog 1 metadata
    Dataset 1 metadata
      Distribution metadata
        Title
        Media type
        Download/access URL
        License
        …
Vocabulary: DCAT
FAIR Data Point metadata
  Catalog metadata
    Dataset metadata
      Distribution metadata
        Data record metadata
          Type
          Domain
          Range
          …
Vocabulary: RML
FAIR Data Point metadata
  Catalog 2 metadata
  Catalog 1 metadata
    Dataset 1 metadata
      Distribution 1.a metadata
        Data record metadata
      Distribution 1.b metadata
    Dataset 2 metadata
      Distribution 2.a metadata
        Data record metadata
      Distribution 2.b metadata
    Dataset 3 metadata
      Distribution 3.a metadata
        Data record metadata
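The nesting of these metadata layers can be sketched as a small set of nested records. The class and field names below are illustrative assumptions loosely following the slide labels; they are not the FAIR Data Point schema, and the example values are made up.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataRecord:
    type: str  # e.g. the class of one record


@dataclass
class Distribution:
    title: str
    media_type: str
    access_url: str
    data_record: Optional[DataRecord] = None  # not every distribution has one


@dataclass
class Dataset:
    title: str
    publisher: str
    license: str
    distributions: List[Distribution] = field(default_factory=list)


@dataclass
class Catalog:
    title: str
    datasets: List[Dataset] = field(default_factory=list)


@dataclass
class FairDataPoint:
    title: str
    catalogs: List[Catalog] = field(default_factory=list)


def walk(fdp):
    """Yield (depth, label) pairs for every metadata layer, top-down."""
    yield 0, fdp.title
    for cat in fdp.catalogs:
        yield 1, cat.title
        for ds in cat.datasets:
            yield 2, ds.title
            for dist in ds.distributions:
                yield 3, dist.title
                if dist.data_record:
                    yield 4, dist.data_record.type


fdp = FairDataPoint("Example FDP", [Catalog("Catalog 1", [
    Dataset("Dataset 1", "Example publisher", "CC-BY-4.0", [
        Distribution("Distribution 1.a", "text/turtle",
                     "https://example.org/d1.ttl", DataRecord("ExampleRecord")),
        Distribution("Distribution 1.b", "text/csv",
                     "https://example.org/d1.csv"),
    ]),
])])
layers = list(walk(fdp))
for depth, label in layers:
    print("  " * depth + label)
```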
FAIR DATA POINT - GUI
[Screenshot: FAIR Data Point GUI, with callouts for repository metadata, catalog metadata summary, dataset/distribution metadata summary, and catalog metadata.]
FAIR HACKATHON - GOALS
■ Align solutions with FAIR Data Point specifications:
  ■ Metadata content
  ■ API
  ■ Data
FAIR HACKATHON OUTCOME
■ FAIR data model for solutions content;
■ Architecture of the required adjustments/extensions;
■ Technical specification of the adjustments/extensions;
■ Proof-of-concept of the adjusted solution;
OPEN RDF KNOWLEDGE ANNOTATOR (ORKA)
■ Allow third-party annotation on existing knowledge bases
■ Capture the provenance of the annotator and the original statement
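One way to picture such an annotation is as a record that leaves the original statement untouched and attaches the annotator's provenance to the comment about it. A minimal sketch with hypothetical names (`Statement`, `Annotation`, `annotate`) and placeholder URIs; it does not reflect ORKA's actual data model or API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Statement:
    """An original subject-predicate-object statement from a knowledge base."""
    subject: str
    predicate: str
    obj: str
    source: str  # where the statement was published


@dataclass(frozen=True)
class Annotation:
    target: Statement  # the original statement, unchanged
    body: str          # what the annotator says about it
    annotator: str     # who made the annotation
    created: str       # when, as an ISO 8601 UTC timestamp


def annotate(statement, body, annotator):
    """Create an annotation that records who annotated and when."""
    now = datetime.now(timezone.utc).isoformat()
    return Annotation(statement, body, annotator, now)


original = Statement(
    "http://example.org/gene/HTT",
    "http://example.org/vocab/associatedWith",
    "http://example.org/disease/HD",
    "http://example.org/kb",
)
# The ORCID below is a placeholder, not a real identifier.
note = annotate(original, "Supported by a later study.",
                "https://orcid.org/0000-0000-0000-0000")
print(note.annotator, note.target.source)
```

Freezing both records keeps the original statement and its provenance from being altered after the annotation is made.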
■ A particular class of FAIR Data System to provide support for data interoperability;
■ Supports publication of and access to FAIR data;
■ Fosters an ecosystem of applications and services;
■ Federated architecture: different FAIRports (and other FAIR Data Systems) are interconnectable;
■ Supports citations of datasets and data items;
■ Provides metrics for data usage and citation.
METADATA LAYERS
Data Repository (FDP)
(Dataset) Catalog(s)
Dataset
Distribution
Data Record
DCAT/HCLS
RML
METADATA LAYERS’ EXTENSIONS - VOCABULARIES
Data Repository (FDP)
(Dataset) Catalog(s)
Dataset
Distribution
Data Record
METADATA LAYERS’ EXTENSIONS - VOCABULARIES
DCAT: dcat:publisher
biosch:organization
{
  "@context": "http://schema.org",
  "@type": ["NGO", "Organization", "not-for-profit"],
  "address": {
    "@type": "PostalAddress",
    "addressLocality": "Utrecht, The Netherlands",
    "postalCode": "3511 GC",
    "streetAddress": "Catharijnesingel 54"
  },
  "email": "info(at)dtls.nl",
  "name": "Dutch Techcentre for Life Sciences",
  "telephone": "(+31) 85 30 30 711"
}
METADATA LAYERS’ EXTENSIONS - EXTENDED MODEL
Data Repository (FDP)
(Dataset) Catalog(s)
Dataset
Distribution
Data Record
DatA Tag Suite (DATS)
PROV