Time to Wrap-up - EUDAT...EUDAT Training … Metadata Persistent Identifiers, Handles, Type...
Time to Wrap-up
Plenary I: Addressing the new data challenges – the
case for cross-disciplinary science and services
Richard Frackowiak
Director, Department of
Clinical Neuroscience,
Head of Service of
Neurology, CHUV
University Hospital,
Lausanne
Kostas Glinos
European Commission,
Head of e-Infrastructure
unit - Directorate
General for
Communications
Networks, Content and
Technology
Kimmo Koski
Managing Director CSC
- IT Center for Science &
EUDAT Project
Coordinator
Plenary II: Life and Earth Sciences at a cross-road:
community driven flagship initiatives in the EU and USA
William (Bill) Michener
Professor and Director
of e-Science Initiatives
for University Libraries,
University of New
Mexico & DataONE
Principal Investigator
Ewan Birney
Associate Director of
the EMBL-European
Bioinformatics Institute
(EMBL-EBI)
Maryline Lengert
Senior Advisor, ESRIN,
ESA & Helix Nebula
Plenary III: Towards Global Data Infrastructure
Components
John Wood
RDA Council Chair &
Secretary General,
Association of
Commonwealth Universities
Raphael Ritz,
Head of Data
Science Group
at RZG, Max
Planck Society
William (Bill)
Michener
Professor and
Director of e-Science
Initiatives for University
Libraries, University of
New Mexico & DataONE
Principal Investigator
Tobias Weigel
German Climate
Computing Center (DKRZ)
/ University of Hamburg
Daan Broeder
Max Planck Institute For
Psycholinguistics,
Nijmegen
Rainer Stotzka
Karlsruhe Institute
of Technology
(KIT), Institute for Data
Processing & Electronics,
Software Methods
Group Head
EUDAT 2nd Conference – 28-30 October 2013, Rome, Italy - www.eudat.eu/2nd-conference
TRACK 1 - EUDAT SERVICES
Johannes Reetz, RZG / MPG
TRACK 2 - INTEROPERABILITIES
Morris Riedel, Juelich Supercomputing Centre, University of Iceland
TRACK 3 - POLICY AND SUSTAINABILITY ISSUES
Rob Baxter, EPCC, University of Edinburgh
TRACK 4 – NEW SERVICES
Mark van de Sanden, SURFSara
1. EUDAT Services
2. Interoperabilities
• Session I – Federated AAI
– US works on federated AAI as well, linking different IDs
– A EUDAT & TERENA/eduGAIN pilot prototype exists and is available for use
– ‘FIM4R group’ started different pilots (partly with industry)
• Session II – Private-Public Partnerships
– EUDAT CDI offers unique capabilities & know-how for industry
– Collaborations with industry need focus and broader representation
• Session III – Identifiers
– US and Handle System: moving to a sustainable foundation (DONA)
– Working with PIDs and DOIs is stable, with increased usage
– New approaches for identifiers for ‘open time series’
3. Policy & Sustainability Issues
• Session I: Access and Reuse policies
– Data policies: increasingly long-term & open
– Harmonising across EUDAT centres is essential
– Licensing: compatibility across repos is key to enabling intelligent re-use
• CC 4.0 could give us useful sets of open licences
• Session II: Data Management Plans and Certifications
– DMP: increasingly required, and now in H2020
– We need to find carrots as well as sticks: research benefits as well as compliance
– DMPonline: v4 out in beta, highly customizable
– EUDAT plans to adopt & adapt in collaboration with DCC
– DSA: increasingly needed to build trust
– How best to adopt in a distributed infrastructure?
• Session III: Cost and Funding Models
– Funding viewpoint: focus on funding & charging
– It’s not about costs but about who pays (and in what “currency”)
Track 4
Discover New Services
• Understand community and user requirements
• Regular surveys
• Working groups
Semantic Annotation
• LifeWatch/LTER use case
• EUON: European Ontology Network, bringing experts together
• Proposal: EUDAT Semantic Annotation service
• How does this integrate with EUDAT services: PID, B2SAFE, B2SHARE?
• It is not the holy grail; scientists are the real experts
• But it helps, and should be more automated, with auto corrections
Workflows
• ENES and CLARIN use cases
• EUDAT Generic Execution Framework (GEF)
• HTTP/REST API to be integrated with workflows
• Bottom-up approach to arrive at the best workflows to support
• Information about workflows should be easy to find; in repositories?
Dynamic Data
• EPOS: how to refer to streaming, changing, incomplete (sensor) data
• CLARIN: how to manage crowdsourced data and the crowd
• Bi-temporal approach: OBSERVATION and STATE time to select the data set version at a certain time
• How to engage with the crowd, and how to manage intellectual property and data privacy with fast-growing data volumes
Close collaboration between communities and IT people is essential
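The bi-temporal idea from the Dynamic Data discussion can be illustrated with a minimal sketch. The slides do not define the two time axes, so the reading used here is an assumption: OBSERVATION time is when a reading was taken, and STATE time is when that value was stored or corrected. The `Record` class, the `snapshot` function and the sample values are hypothetical and are not part of any EUDAT service.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Record:
    observation_time: datetime  # assumed meaning: when the reading was taken
    state_time: datetime        # assumed meaning: when this value was stored or corrected
    value: float


def snapshot(records: Iterable[Record], as_of: datetime,
             obs_start: Optional[datetime] = None,
             obs_end: Optional[datetime] = None) -> List[Record]:
    """Return the data set version that was visible at state time `as_of`.

    For every observation time, keep only the most recently stored record
    whose state time is not later than `as_of`. The optional bounds select
    a half-open observation window [obs_start, obs_end).
    """
    latest: Dict[datetime, Record] = {}
    for rec in records:
        if rec.state_time > as_of:
            continue  # stored after the requested version, so not visible yet
        if obs_start is not None and rec.observation_time < obs_start:
            continue
        if obs_end is not None and rec.observation_time >= obs_end:
            continue
        current = latest.get(rec.observation_time)
        if current is None or rec.state_time > current.state_time:
            latest[rec.observation_time] = rec
    return sorted(latest.values(), key=lambda r: r.observation_time)


# Hypothetical sample data: two readings, one of which is corrected later.
RECORDS = [
    Record(datetime(2013, 10, 1, 12), datetime(2013, 10, 1, 13), 1.0),
    Record(datetime(2013, 10, 2, 12), datetime(2013, 10, 2, 13), 2.0),
    Record(datetime(2013, 10, 1, 12), datetime(2013, 10, 5, 9), 1.5),  # correction
]

if __name__ == "__main__":
    for day in (3, 6):
        version = snapshot(RECORDS, as_of=datetime(2013, 10, day))
        print(day, [r.value for r in version])
```

Under these assumptions, a citation that records the state time stays reproducible even though the underlying stream keeps changing: asking for the version as of 3 October returns the original reading, while asking as of 6 October returns the corrected one.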
EUDAT Training …
• Metadata
• Persistent Identifiers, Handles, Type Registries, EPIC
• Implementation of Staging, Replication and Storage: Services and Tools
• Data Staging, Replication and Storage: Integrating with EUDAT’s Building Blocks
215 participants from…
Participants by Country
THE TOP 5 …
Country Participants
Italy 45
Germany 30
United Kingdom 28
Netherlands 24
Finland 14
Gender balance …
Male 78%
Female 22%
Organisation Type …
Academia & Research 87%
Industry 6%
International Body 3%
Media 1%
Public Admin 3%
Communities count ...
EUDAT user communities 20%
EUDAT 15%
Other communities 65%
Multi-disciplinary …
Biodiversity & Ecological Research 9%
Bioinformatics 4%
Earth Sciences & Climate research 13%
Humanities and Social Sciences 23%
Medical sciences 1%
Physics 1%
Research data management & policy 31%
Research Infrastructures & Technologies 18%
Summary
• 94 respondents, from 23 countries
Respondent type (bar chart): Researcher/Scientist, Funder, Policy maker, Other
Respondents by country:
Germany 19 %
Italy 17 %
The Netherlands 13 %
United Kingdom 11 %
Finland 9 %
Spain 5 %
Sweden 4 %
Norway 3 %
Other 19 %
Data and Metadata Services
Joint Metadata Catalogue
Real Time Data Handling
Memento Service
EUDAT Box
Simple Store
Database Replication
Light Replication
Safe Replication
Interest levels: Very much interested, Somewhat interested, Not much interested
Lifecycle and Processing Services
Semantic Referencing
Workflow Support
Conversion services
Crowd Sourcing Support
XLS/WORD Support
R Support
Data Lifecycle Support
Data Staging
Interest levels: Very much interested, Somewhat interested, Not much interested
Registries, PIDs & AAI
Site registry
Service Registry
Data Type Registry
Policy Registry
AAI Support
Stable PID Support
Interest levels: Very much interested, Somewhat interested, Not much interested
Working Groups
Data access and re-use policies
Dynamic Data
Workflows
Semantic Services
Interest levels: Interested, Not interested
… and the winner is…
Thank You!