Imac 090924

Post on 20-Aug-2015


Project 3TU.Datacentrum

Im@c, September 24th 2009

Jeroen Rombouts, MSc

Project manager 3TU.Datacentrum

Presentation outline

Why care about research data?

What do data producers have to say?

Why care? 1/3

Research

Manuscript Publication

Data Metadata

Repository Library

Why care? 2/3

• Physical decay of storage media;

• Loss of descriptive (meta)data;

• Loss of ‘rendering’ capabilities (contemporary applications for viewing and analysing data).

Risks of current research data management

Reasons for long-term preservation and access

• Data value (cost intensive, valorisation, continuous datasets);

• Research quality (verification, knowledge transfer, sharing).

Why care? 3/3

• Plan of National Science Foundation regarding preservation of digital scientific output (2006);

• OAIS reference model (2002 by CCSDS) becomes ISO standard (2009);

• KNAW starts Dutch data repository for humanities and social sciences: DANS (Data Archiving and Networked Services) (2005);

• No initiatives for engineering and science in the Netherlands.

Project setting

The 3TU.Datacentrum 1/8

• Builds on two previous projects;
  – E-Archiving – digital depot
  – Darelux – Data Archiving River Environment Luxemburg

• Time frame of 3 years, 2008 - 2010;
  – Financed mainly by 3TU.Federation
  – Datasets from TUD, TU/e and UT, later other science data

• Goal: long-term access to research data.

Project description

The 3TU.Datacentrum 2/8

Tasks

• Implement and run ‘data-archive’ (facilitate data producers);
  - Collect, preserve, publish and provide access to data
  - Beta (β): drietu2.3tu.nl/repository/collection:all/view/html

• Data management consultancy;
  - Select and develop formats, metadata, tools, etc.

Collaboration

With DANS, SURF, Koninklijke Bibliotheek and others:

• “DRIVER-II” (EU-7FP), demonstrator for Enhanced Publications;

• “Waardevolle Data & Diensten” (SURFshare), identify added value of data repository for data producers;

• Partner in DataCite consortium with TIB Hannover, ETH Zurich, INIST (France), British Library, DTU Copenhagen, NRC-CISTI (Canada), California Digital Library.

The 3TU.Datacentrum 3/8

• Data of ‘enhanced publications’ (underlying data and visualisations linked to publications). Increase publication value (stronger basis, more citations, …);

• Data generated by ‘hard to repeat’ processes. E.g. high cost, (environmental) observations, complex or continuous experiments, …;

• Data collected with public funding. Conditions by funding organisations or publishers like Nature Publishing Group, NWO, governmental organisations, universities, …;

• Preferably open access data with potential for reuse (verification, new research, …). Increase visibility, efficiency and quality of research efforts.

Which data to preserve? And why?

• Technical infrastructure (server, platform, websites, formats & models)

• Dataset Darelux (2.0) http://drietu2.3tu.nl/repository/resource:study-CITG/view/html

• Dataset Flame (BagIt) http://drietu2.3tu.nl/datasets/flame/

• Dataset Wind speed/Solar radiation http://drietu2.3tu.nl/datasets/windzon/

• Datasets ‘on the way’: NNV Survey ‘job market physicists’, Enhanced Publication ‘combustion’, Waterlab, Biotechnology, Remote sensing, ‘Tire noise’
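The Flame dataset above is listed as packaged with BagIt. As a rough illustration only, and not the 3TU.Datacentrum implementation, the sketch below writes a minimal bag (a data/ directory, a bagit.txt declaration and a checksum manifest) and then re-checks the checksums, which is one way a repository could detect damaged files on ingest or after long-term storage. It assumes the layout described in RFC 8493 (BagIt 1.0), which postdates this talk; the helper names make_bag and verify_bag and the example file name are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def make_bag(bag_dir: Path, payload: dict) -> None:
    """Write payload files under data/ plus a BagIt declaration and manifest.

    `payload` maps plain file names (hypothetical examples) to their bytes.
    """
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    manifest_lines = []
    for name, content in sorted(payload.items()):
        (data_dir / name).write_bytes(content)
        digest = hashlib.sha256(content).hexdigest()
        # Manifest line: checksum, whitespace, path relative to the bag root.
        manifest_lines.append(f"{digest}  data/{name}")
    # Bag declaration; version 1.0 is an assumption (RFC 8493 layout).
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    (bag_dir / "manifest-sha256.txt").write_text(
        "\n".join(manifest_lines) + "\n", encoding="utf-8"
    )


def verify_bag(bag_dir: Path) -> list:
    """Return payload paths that are missing or whose checksum has changed."""
    damaged = []
    manifest = (bag_dir / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        expected, rel_path = line.split(maxsplit=1)
        file_path = bag_dir / rel_path
        if not file_path.is_file():
            damaged.append(rel_path)
            continue
        actual = hashlib.sha256(file_path.read_bytes()).hexdigest()
        if actual != expected:
            damaged.append(rel_path)
    return damaged


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        bag = Path(tmp) / "flame-bag"
        make_bag(bag, {"measurements.csv": b"t,temperature\n0,300\n"})
        print("damaged after ingest:", verify_bag(bag))
        # Simulate silent corruption of a stored file.
        (bag / "data" / "measurements.csv").write_bytes(b"garbage")
        print("damaged after corruption:", verify_bag(bag))
```

Keeping the checksums in a plain-text manifest next to the data means the check can still be repeated years later without the software that created the bag.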

The 3TU.Datacentrum 4/8

• Partner in DataCite consortium with TIB Hannover, ETH Zurich, INIST (France), British Library, DTU Copenhagen, NRC-CISTI (Canada), California Digital Library. Aim: “to support researchers by providing methods for them to locate, identify, and cite research datasets with confidence”;

• Founding member COAR: Confederation of Open Access Repositories (October);

• Provide input for “Nota Wetenschappelijke informatievoorziening” (OC&W), “Toekomst voor ons digitaal geheugen” (NCDD);

• Partner in “Nationale Coalitie Digitale Duurzaamheid” (www.ncdd.nl);

• Coordinating “Forum onderzoeksdata”.

Related ‘results’ 5/8

The 3TU.Datacentrum 6/8

The 3TU.Datacentrum 8/8

The benefits for data producers and data consumers

• Increased visibility of research output (metadata in repository networks, assigning DOIs, facilitating increased citation rates for ‘enhanced publications’, ...);

• Improved quality of datasets (quality assurance for multi-user setup, checks on ingest, …);

• Provides (long-term) preservation of, and access to, valuable research data;

• Distribution of research data for reuse, including administration and usage statistics;

• Provides advice on data management, rights, formats, metadata, etc.

Nobody needs my data

Data transfer not needed, every PhD does own project

Our datasets are confidential

Interesting but not for me

Only for long term continuous data

Datasets are stored by publisher

No time!

Our research is once only

What do data producers say? 1/2

Surprising our university had no facility for data preservation

Transfer of data between PhDs can be improved

Would like to publish data

Good opportunity to share datasets we bought

Very useful, essential metadata often missing

Much to improve in reuse of data

When can I store my datasets?

What do data producers say? 2/2

Questions? Suggestions?

Nature News Special on Data Sharing (September 2009) www.nature.com/news/specials/datasharing/index.html

Toekomst voor ons digitaal geheugen http://www.ncdd.nl/documents/NCDDToekomst2009_000.pdf

Resources

• The 3TU.Datacentrum project www.datacentrum.3tu.nl

• “Unavailability of online supplementary scientific information from articles published in major journals” doi:10.1096/fj.05-4784lsf

• “Going, Going, Gone: Lost Internet References” doi:10.1126/science.1088234

• “Sharing Detailed Research Data Is Associated with Increased Citation Rate” doi:10.1371/journal.pone.0000308

• “To share or not to share” www.rin.ac.uk/data-publication

• “NSF’s Cyberinfrastructure Vision for 21st century Discovery” www.nsf.gov/od/oci/ci_v5.pdf

• “SURF Direct” Digitale rechten – onderzoeksdata (Dutch) www.surf.nl/surfdirect

• Nature News Special on Data Sharing (September 2009) www.nature.com/news/specials/datasharing/index.html

• Toekomst voor ons digitaal geheugen http://www.ncdd.nl/documents/NCDDToekomst2009_000.pdf