ELIXIR’s Human Data Communities IMI FAIRplus project · Text mining, Structured metadata,...



Page 1

www.elixir-europe.org

@ELIXIREurope

ELIXIR-EXCELERATE is funded by the European Commission within the Research Infrastructures programme of Horizon 2020, grant agreement number 676559.

ELIXIR’s Human Data Communities IMI FAIRplus project

Jen Harrow, ELIXIR Tools Platform Coordinator

Page 2

agriculture

medicine

bioindustries

environment

ELIXIR’s mission is to operate a sustainable European infrastructure for biological information, supporting life-science research and its translation to society, the bio-industries, environment and medicine

ELIXIR’s strategy is to connect national bioinformatics centres and EMBL-EBI into a distributed infrastructure built from coordinated national and international data resources, tools and services

Page 3

Development of ELIXIR

EOSC-Life, EJP-RD, FAIRplus

Page 4

ELIXIR in numbers

• 22 Members and 1 Observer

• ~180 institutes involved

• 700+ staff

• 18 Core Data Resources

• 21 Implementation Studies ongoing or soon to start

• 27 papers in ELIXIR F1000 channel

• 300 live events in TeSS

• 400 companies attended the Innovation and SME programme

Page 5

Overview of ELIXIR Platforms:

Page 6

Data Platform: Supporting the whole ELIXIR ecosystem of data resources

ELIXIR Core Data Resources

“of fundamental importance to all research in the life sciences”

ELIXIR Data Resources

Prioritised by national programmes; reviewed by ELIXIR SAB

Huge importance to specific research communities

Page 7

Working towards the launch of Global BioData Coalition

• Global BioData Coalition is now being formed

• Coalition of funders - worldwide - to coordinate and sustain global biodata landscape

• HIRO support, WT, NIH and AMED (Japan) leading

• 2 year project to develop coalition

• Initial funding from WT, NIH - approaching additional funders globally

• Project plan agreed, start Q4 2018

• ELIXIR: Lead consultation on indicators and procedure with global stakeholders

• Based on ELIXIR Methodology

Page 8

Interoperability Platform

Findable, Accessible, Interoperable, Reusable

Tech

Services

Standards

Adoption

Interoperability services and practices to support FAIR data and interoperability activities

Use case focused

Data providers

Data integrators

With international initiatives, from community grassroots to government programmes.

Capacity building

Page 9

The 2018 Recommendations of RIRs

Resource: Description

FAIRsharing (UK): Registry of curated metadata of DBs, Policies, Standards

g:Profiler (EE): Gene-centric data integrator - Web UI, and API services

Identifiers.org (EBI): Identification and resolution system for life science, provider of compact identifiers and URIs

InterMine (UK): Framework to integrate life sciences data based on an extensible data model, providing web interface and RESTful web services.

ISA Framework (UK): ISA (Investigation > Study > Assay) helps researchers to provide rich description of experimental metadata so that the resulting data and discoveries are reproducible and reusable.

Ontology Lookup Service (EBI): Repository for biomedical ontologies that aims to provide a single point of access to the latest ontology versions through web UI or RESTful API

3DBIONOTES API* (ES): A reusable platform-independent API call component for protein metadata alignment, annotation, and integration across major protein data resources.

BridgeDb (NL): A combination of a software framework and an API for mapping identifiers for related objects in life sciences

DisGeNET API* (ES): API and SPARQL endpoint for genetic variants (human disease data)

MOLGENIS (NL): A software package to help researchers set up an online database application that supports data queries and allows data sharing.
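
As a rough illustration of the "compact identifiers and URIs" listed for Identifiers.org above, the sketch below splits a `prefix:accession` string and builds a resolver URL from it. The helper names `split_compact_id` and `to_resolver_url`, the example identifier, and the URL pattern `https://identifiers.org/<prefix>:<accession>` are assumptions made for this example and do not come from the slides; no network request is made.

```python
# Minimal sketch of handling a compact identifier ("prefix:accession").
# RESOLVER is an assumed base URL; nothing here contacts the service.
RESOLVER = "https://identifiers.org/"


def split_compact_id(compact_id):
    """Split 'prefix:accession' at the first colon and validate both parts."""
    prefix, sep, accession = compact_id.partition(":")
    if not sep or not prefix or not accession:
        raise ValueError(f"not a compact identifier: {compact_id!r}")
    return prefix, accession


def to_resolver_url(compact_id):
    """Build the URI a resolver would be asked to redirect from."""
    prefix, accession = split_compact_id(compact_id)
    return f"{RESOLVER}{prefix}:{accession}"


if __name__ == "__main__":
    # Hypothetical example value: NCBI Taxonomy entry for Homo sapiens.
    print(to_resolver_url("taxonomy:9606"))
```

The point of the design is that a data record only has to store the short `prefix:accession` form; the full URI can be derived whenever a link is needed.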

Page 10

The Tools Platform

• Raise software quality and sustainability, by producing and promoting software best practices and developing training activities

• Bio.tools, a discovery portal for bioinformatics software information, providing curated description of tools and data services

• OpenEBench, an infrastructure providing services for hosting scientific benchmark activities and technical monitoring of bioinformatics tools and services

• To support efforts around software packaging & containers, e.g. BioConda/BioContainers, and support sustainable integration into bio.tools and OpenEBench

• To drive the development of execution platforms (e.g. Galaxy) and ensure integration with bio.tools, OpenEBench and workflows using CWL

• Tools Interoperability, guidelines and resources for guaranteeing platforms integration at the ELIXIR Tools platform ecosystem, with other platforms at ELIXIR and beyond.

Page 11

Compute Platform: Access, exchange and storage

• Mission: Develop distributed solutions for cloud, compute, storage services including user authentication and access control

• Coordination of dependencies with e-infrastructures (esp. GÉANT, EGI, EUDAT) in collaboration with biological and medical research infrastructures (CORBEL)

Tommi Nyrönen, FI; Luděk Matyska, CZ; Steven Newhouse, EBI

Page 12

ELIXIR Authorisation and Authentication Infrastructure

Enables life science researchers to use their institutional IDs to access services and data:

• Reduced bureaucracy and costs

• Improved vetting: federated identities provide greater confidence to the service and data providers

• Regular updates: as researchers join or leave institutions, their affiliation information is maintained regularly

• Improved access to usage metrics: consistent use of accounts allows service providers to better analyse the use of their services

• Applicable to other research infrastructures (CORBEL)

Page 13

Training Platform: TeSS Portal (Training eSupport System)

• Platform to disseminate, discover & package training resources, training materials and events – led by ELIXIR UK

• Aggregating information from ELIXIR nodes and various 3rd-party content providers

http://tess.elixir-uk.org

Page 14

ELIXIR Communities connect infrastructure with life-science research experts across Europe

• ELIXIR Communities are formed around domain experts in our Nodes

• Include non-ELIXIR partners

• ELIXIR Communities provide a mechanism for long-term collaborations with other ESFRI and large-scale initiatives

• ELIXIR Communities will drive the service developments in the ELIXIR Platforms and provide a framework to develop and maintain community standards

Page 15

Partnerships and community formation

ELIXIR Human Data Communities

• Federated Human Data

• Rare Diseases

• Human Copy Number Variation

Page 16


ELIXIR Human Genomics & Translational Data: Tools developed within ELIXIR

Data Discoverability: Federating lightweight discoverability of data, and datasets across ELIXIR

Data Archival: Utilising the ELIXIR Deposition Databases to ensure secure, long-term, efficient archival of data

Federated Data Access: Coordinating a collection of interoperable EGA-like resources to ensure secure management of sensitive data across the ELIXIR Nodes

Data Analysis: Bringing ‘analysis to data’ via common workflow languages, workflows, containers, and tools

ELIXIR Beacon - GA4GH Driver Project

ELIXIR Federated Human Data Community - htsget/htsref

bio.tools

Serena Scollen and Gary Saunders

Page 17

Developing standards and tools: ELIXIR and GA4GH to develop strategic partnership

Simplify the way people search for and request access to potentially identifiable data in international and national genomic data resources

Working towards GA4GH standards, APIs and toolkits to be used throughout

8 GA4GH Workstreams

8 new driver projects announced in Feb

15/22 ELIXIR Nodes involved

Page 18

Public data discovery web-service: Beacon Driver Project

Do you have information about the allele “C at position 32,936,732 on chromosome 13?”

Yes / No (+ optional metadata about the allele)

Beacon X: Yes, Beacon Y: No, Beacon Z: No…

https://beacon-network.org www.elixir-europe.org/beacons

9 Nodes have lit Beacons
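
The yes/no exchange on this slide can be sketched in a few lines. The example below is a toy in-memory model only, not the GA4GH Beacon API: the class names `AlleleQuery` and `Beacon`, the function `query_network`, and the allele sets held by each beacon are hypothetical and chosen to mirror the "Beacon X / Y / Z" answers above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AlleleQuery:
    """The question asked of a beacon: is this allele present at this locus?"""
    chromosome: str
    position: int
    allele: str


@dataclass
class Beacon:
    """A data holder that reveals only presence/absence, never the records."""
    name: str
    known_alleles: set = field(default_factory=set)

    def has_allele(self, query):
        key = (query.chromosome, query.position, query.allele)
        return key in self.known_alleles


def query_network(beacons, query):
    """Send the same question to every beacon and collect the answers."""
    return {beacon.name: beacon.has_allele(query) for beacon in beacons}


# Hypothetical holdings, invented for this example.
beacons = [
    Beacon("Beacon X", {("13", 32936732, "C")}),
    Beacon("Beacon Y", {("13", 32936732, "T")}),
    Beacon("Beacon Z"),
]
query = AlleleQuery(chromosome="13", position=32936732, allele="C")
answers = query_network(beacons, query)

for name, found in answers.items():
    print(f"{name}: {'Yes' if found else 'No'}")
```

Because each beacon returns only a boolean, a data holder can advertise what it has without exposing individual-level records; access to the underlying data is then requested separately.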

Page 19

Federation of human genome data

• Many national datasets from human research participants need to be stored locally

• ELIXIR developing a federation with shared metadata (FAIR) and local data store (secure)

• Linking local EGA to

• national clouds

• international access (ELIXIR-AAI - Authentication and Authorisation Infrastructure)

Page 20

Sharing genomic data across borders

EU declaration - 2018

Currently signed by:

Austria, Bulgaria, Croatia, Cyprus, Czech Republic, Estonia, Finland, Greece, Italy, Latvia, Lithuania, Luxembourg, Malta, Portugal, Slovenia, Spain, Sweden, Netherlands and the UK

ELIXIR members

Page 21

“Leveraging European infrastructures to access one million human genomes by 2022”

• Coordinated, secure, federated environment will enable population scale genomic, phenotypic, and biomolecular data to be accessible across international borders

• Lessons learned & solutions developed should be taken from existing infrastructures, and ongoing data sharing efforts in cancer, population genetics & rare disease areas

• Need to empower data scientists with knowledge and tools

Saunders G et al., pre-submission acceptance to Nature Reviews Genetics

Page 22

New ELIXIR initiative: FAIRplus - to develop tools and guidelines for making life science data FAIR

ELIXIR - Project Coordinator

Janssen - Project Leader

22 participants: 12 academic, 7 EFPIA, 3 SME

€8.23M budget: €4M H2020 EC funding + €4.23M EFPIA in-kind

42 months (Jan 2019 - June 2022)


Page 23

FAIRplus: Aims

• Establish a value-based process for prioritisation and selection of Innovative Medicines Initiative (IMI) project databases

• Develop FAIRification toolkit, e.g. develop guidelines, tools and metrics - FAIR Cookbook

• Apply this toolkit to FAIRify datasets from selected IMI projects (>20 selected using a value-based selection process) and EFPIA companies

• Deliver training for data handlers (academia, SMEs and pharmaceuticals) to change and sustain the data management culture, e.g. Fellowship scheme

• Foster an innovation ecosystem on FAIR open data to power future reuse, knowledge generation and societal benefit, e.g. FAIR innovation and SME events


Page 24

Our consortia


Page 25

Concept


Page 26

Tools available to FAIRplus

OmicsDI

Identifiers.org

Bioschemas

ELIXIR-LU Data catalogue

Containerisation tools

F A I R cross capability

Page 27

CMMI (Capability Maturity Model Integration) for processes and datasets

1. Initial

2. Repeatable

3. Defined

4. Managed

5. Optimizing

Using a “design, use and refine” cycle we will iterate through the processes and products

Page 28

Use Cases: Strong links to past Innovative Medicines Initiative (IMI) projects

STAGE 1 CONSORTIUM PARTICIPANTS:

ADAPT-SMART: LYG

ADVANCE: Synapse

AETIONOMY*: UL

AMYPAD: Synapse

APPROACH: LYG, ITTM

BEAT-DKD: SIB

BioVacSafe: CDISC

DDMoRE: EMBL-EBI, LYG

DRIVE AB: Synapse

EBiSC: EMBL-EBI, Fraunhofer

EPAD: Synapse

Ebola+: HYVE

EHR4CR: EMBL-EBI, CDISC

ELF: LYG

EMIF: EMBL-EBI, IMIM, HYVE, ITTM, Synapse

EMTRAIN: EMBL-EBI

e-Tox*: EMBL-EBI, BSC, IMIM, Synapse

eTRANSAFE: ELIXIR Hub, EMBL-EBI, BSC, IMIM

eTRIKS*: UOXF, UL, ICL, HYVE, OntoForce, CDISC, ITTM

EU-AIMs: EMBL-EBI

HARMONY: Synapse

IMPRiND: UOXF

IMIDIA: SIB

iPiE: IMIM, Synapse

K4DD: Fraunhofer

ND4BB*TRANSLOCATION: HYVE, Fraunhofer

Open PHACTS*: EMBL-EBI, UNIMAN, BSC, IMIM, HWU, UM, PHACTS, OntoForce

RADAR-CNS: LYG, HYVE

RESCEU: Synapse

RHAPSODY: SIB

ROADMAP: Synapse

SAFE-T: ITTM

TransQST: EMBL-EBI, IMIM, UM, Synapse

BigData@Heart: HYVE

EFPIA PARTICIPANTS (selected examples with highest relevance to this topic):

JANSSEN: OncoTrack, OpenPHACTS, eTRIKS, DO->IT, HARMONY, eTOX, eTRANSAFE, ELF, K4DD

AZ: OncoTrack, OpenPHACTS, eTRIKS, eTOX, eTRANSAFE, ELF, K4DD

LILLY: OncoTrack, OpenPHACTS, eTRIKS, DO->IT

GSK: OpenPHACTS, eTRIKS, DO->IT, eTOX, K4DD

NOVARTIS: OpenPHACTS, DO->IT, HARMONY, eTOX, eTRANSAFE

BAYER: OncoTrack, eTRIKS, DO->IT, HARMONY, eTOX, eTRANSAFE, ELF, K4DD

BI: OncoTrack, DO->IT, eTOX, eTRANSAFE

Plus ReSOLUTE

Page 29

Communications


FAIRplus Website: www.fairplus-project.eu

FAIRplus Twitter: https://twitter.com/FAIRplus_eu

• Tweet using #FAIRplus

IMI forum LinkedIn: https://www.linkedin.com/company/innovative-medicine-initiative-joint-undertaking/

Contact us: [email protected]

CONFIDENTIALITY: Please respect that this project is EC funded. If you would like to disseminate or use any of the information from the work plan or outcomes, please contact the Coordinators [email protected]

Page 30

18th-22nd Nov 2019 Paris

Themes can include: Text mining, Structured metadata, Identifiers, Data distribution, Data integration, Data Validation, Tools, Containers, Tools discovery, and Training materials

Opportunities for companies to submit hacking topics

Contact: [email protected]

Page 31


Thank you!