European open Science cloud Nederland gidsland data ...
EUROPEAN OPEN SCIENCE CLOUD & FAIR DATA
THE FORECAST FOR SAN DIEGO
PARTLY FAIR / PARTLY CLOUDY
November 3, 2016
The Data Tsunami
Datarrhoeia
Standards?
Needle Transport
Do It Yourself Data
THE BIG DATA PROBLEM
BUT WE ALL DID TRY, DIDN’T WE?
1992 Data Stewardship Plan
1997 Improved Data Stewardship Plan
SO…
Most data do not TALK to each other
Data are lost and/or hard to find
Inhibits fully effective health care, research & innovation
Research data malpractice (Life Science example):
Only 12% of NIH-funded datasets are demonstrably deposited in recognized repositories, so over 200,000 ‘invisible’ public datasets cannot be re-used effectively.
Approximately 50% of funded research is not reproducible
Inhibits scaling of effective knowledge discovery
FAIR DATA PRINCIPLES: INVENTED IN THE NETHERLANDS
Leiden 2014
Initiated by
EVOLVED RAPIDLY INTO A GLOBAL MOVEMENT
Rapid acceptance and endorsement process
The conference
The Website
Research Data Alliance endorsement
DTL flagship project
FORCE11 international partner
Articles accepted in NATURE
NIH accepts FAIR compliance in Life Sciences Commons
DTL director Prof. Barend Mons Chair High Level Expert Group EC
The Personal Health Train Initiative started
EC announces European Open Science Cloud with FAIR as leading principle
World 2016
AND RECENTLY EVEN THE G20 WANTS FAIR!
“We support appropriate efforts to promote open science and facilitate appropriate access to publicly funded research results on findable, accessible, interoperable and reusable (FAIR) principles.” (Statement 12)
http://europa.eu/rapid/press-release_STATEMENT-16-2967_en.htm
EC TAKES ACTION: THE EUROPEAN OPEN SCIENCE CLOUD
Europe acknowledged the problem
Moved for a solution: EOSC
Data stewardship for better discovery; mandatory!
Cost eligible for funding
Internet of Data based upon FAIR data principles
Training of 500,000 data experts
Financing
€6.7B for initial phase EOSC
EC + MS : €400 billion for research annually
@ 5% equals €20B for Data Stewardship
EOSC inspired by and based on Dutch FAIR initiative
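The financing figures above can be checked with a few lines of arithmetic. This is a minimal sketch: the variable names are illustrative, and the 5% share and €400 billion total are taken directly from the slide.

```python
# Figures from the slide; variable names are illustrative assumptions.
annual_research_budget_eur = 400_000_000_000  # EC + member states, per year
stewardship_share_percent = 5                 # share reserved for data stewardship

# Integer arithmetic avoids floating-point rounding on large amounts.
stewardship_budget_eur = annual_research_budget_eur * stewardship_share_percent // 100

print(f"€{stewardship_budget_eur // 10**9}B per year for data stewardship")
```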
FAIR ADOPTED BY USA (NIH) AS WELL
The NIH Commons initiative
The Commons Data
Vouchers for mandatory data stewardship
WHAT IS FAIR DATA?
Findable - (meta)data is uniquely and persistently identifiable. Should have basic machine-readable descriptive metadata.
Accessible - data is reachable and accessible by humans and machines using standard formats and protocols.
Interoperable - (meta)data is machine-readable and annotated with resolvable vocabularies/ontologies.
Reusable - (meta)data is sufficiently well-described to allow (semi)automated integration with other compatible data sources.
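The four definitions above can be read as a checklist on a dataset's metadata record. The following is a rough sketch of that idea, not an official FAIR schema or metric: the field names (`identifier`, `access_url`, `vocabulary`, and so on), the `fair_gaps` helper, and all `example.org` URLs are hypothetical and chosen only for illustration.

```python
# Hypothetical mapping from each FAIR principle to the metadata fields
# a record would need; these field names are assumptions, not a standard.
REQUIRED_FIELDS = {
    "Findable": ["identifier", "title"],
    "Accessible": ["access_url", "media_type"],
    "Interoperable": ["vocabulary"],
    "Reusable": ["license", "provenance"],
}


def fair_gaps(record):
    """Return {principle: [missing fields]} for fields that are absent or empty."""
    gaps = {}
    for principle, fields in REQUIRED_FIELDS.items():
        missing = [name for name in fields if not record.get(name)]
        if missing:
            gaps[principle] = missing
    return gaps


# Placeholder record; every value below is invented for the example.
example_record = {
    "identifier": "https://example.org/dataset/123",  # persistent identifier
    "title": "Example dataset",
    "access_url": "https://example.org/data/123.csv",  # standard protocol (HTTPS)
    "media_type": "text/csv",                          # standard format
    "vocabulary": "https://example.org/vocab/terms",   # resolvable vocabulary
    "license": "https://example.org/licenses/open",
    "provenance": "Collected for illustration only",
}

print(fair_gaps(example_record))
```

A record that lacks, say, a `license` entry would be reported under "Reusable", which gives a data steward a short list of what to add before deposit.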
We want a large ecosystem of apps that use FAIR Data
SO the FAIR solution between them must be THIN!
We want to support a wide range of source providers
THE HOUR GLASS CONCEPT
http://www.nature.com/articles/sdata201618
THE FAIR GUIDING PRINCIPLES
THE FAIR DATA ECOSYSTEM
BENEFITS
Public data:
Improved sharing and re-use of research and personal health data
Improved results from research and health data analytics
Improved involvement of citizens/patients (digital control of own data)
Improved economic benefits for the health care sector, including prevention
Private data:
Interoperable with public domain data
Effective hypothesis validation
Improved decision support for clinical development
Improved effectiveness of the discovery process
Leadership
San Diego can lead the implementation of the FAIR Data Principles and beyond
Assets
San Diego Internet of Data
Open network of FAIR Data
Backbone infrastructure for both
OPPORTUNITIES SAN DIEGO
THE IMPLEMENTATION NODE APPROACH
SAN DIEGO
THE PERSONAL HEALTH TRAIN PROJECT
The Personal Health Train
Health data ‘stations’
Citizen ‘in control’
Public and Private sector
PHT video:
http://www.dtls.nl/fair-data/personal-health-train/