My data, your data, our data - increasing data value through reuse (Eurocris2014 keynote)

My keynote talk for Eurocris2014, Rome. I make the case for reuse of research data, discuss the barriers and look at ways we are trying to overcome them.

Transcript of My data, your data, our data - increasing data value through reuse (Eurocris2014 keynote)

  • My Data, Our Data, Your Data: data reuse through data management. Kevin Ashley, Digital Curation Centre, www.dcc.ac.uk, @kevingashley, Kevin.ashley@ed.ac.uk. Reusable with attribution: CC-BY. The DCC is supported by Jisc.
  • A summary: Why data reuse? What stops us? How data management helps. Harmonising the goals of research administration and research. Barriers, again. The case for reuse, again. 2014-05-14 Kevin Ashley Eurocris2014 - CC-BY
  • My home: the DCC. Mission: to increase capability and capacity for research data services in UK institutions. Not just a UK problem: an international one. Training, shared services, guidance, policy, standards, futures.
  • What is data curation? Maintaining, preserving and adding value to research data throughout its lifecycle. More than preservation: active management, dealing with change. Less than preservation: lifecycle sometimes involves destruction.
  • DCC guidance
  • SWEDEN, DENMARK, CANADA
  • Data reuse stories: the palaeontologist who saved years of work with archaeological data.
  • What a palaeontologist looks at. [Timeline: now back to 100 million years ago, marked at 1m, 25m, 50m and 75m years]
  • What a palaeontologist looks at. [Timelines: now back to 100 million years ago, marked at 1m, 25m, 50m and 75m years; zoomed view of now back to 1 million years, marked at 100,000, 500,000 and 750,000 years]
  • What an archaeologist looks at. [Timelines: now back to 1 million years, marked at 100,000, 500,000 and 750,000 years; zoomed view of now back to 100,000 years ago, marked at 25,000, 50,000 and 75,000 years]
  • Data reuse stories: the palaeontologist who saved years of work with archaeological data; the 19th-century ships' logs that help us model climate change.
  • The Old Weather project. Data for research, not from research.
  • Data reuse stories: the palaeontologist who saved years of work with archaeological data; the 19th-century ships' logs that help us model climate change; the noise from research radar that mapped dust from Eyjafjallajökull.
  • Data reuse: messages. Often your data tells stories that your publications do not. Not all data comes from other researchers. One person's noise is another person's signal. Discipline-bounded data discovery doesn't give us all we need or want.
  • Why care? Data is expensive: an investment. Reuse: more research; teaching & learning; planning. Impact, with or without publication. Accountability. Legal & regulatory requirements.
  • Why does this matter? Research quality: how close can we get to the truth? Research speed: how quickly can we get to the truth? Research finance: how much does the truth cost? Improving one or more of these is of interest to all actors: researchers as data creators; researchers as data reusers; research institutions; funders, hence government and society.
  • G8UK: endorses OA. Open Data Charter, policy paper, 18 June 2013.
  • Funder requirements: UK; USA (NSF, NEH, NIH); Europe. Most place the burden on the researcher, some on the institution. http://www.epsrc.ac.uk/about/standards/researchdata/Pages/policyframework.aspx
  • RCUK policy: the 1-minute version. Research data are a public good: make openly available in a timely & responsible way. Have policies & plans. Data with long-term value should be preserved & usable. Metadata for discovery & reuse. Link publications & data. Sometimes law, ethics get in the way. We understand. Limited embargos OK. Recognition is important: always cite data sources. OK to use public money to do this. Do it efficiently.
  • EPSRC policy points: awareness of regulatory environment; data access statement; policies and processes; data storage; structured metadata descriptions; DOIs for data; securely preserved for a minimum of 10 years from last use. Compliance expected by 2015.
  • DCC Policy Summary http://www.dcc.ac.uk/resources/policy-and-legal
  • Findable, citable data has value. Important to link publications to data (and vice versa): increases citations of data & publication; increases reuse (hence value). But effects exist even without publication, if data is archived, citable and discoverable. MORAL: build a data registry.
  • What stops data reuse: loss, destruction, pride, gluttony, ineptitude, concealment, bureaucracy, complexity, procrastination, lack of potential.
  • Departments don't have guidelines or norms for personal back-up, and researcher procedure, knowledge and diligence varies tremendously. Many have experienced moderate to catastrophic data loss. Incremental Project Report, June 2010. http://www.flickr.com/photos/mattimattila/3003324844/
  • What stops data reuse: loss, destruction, pride, gluttony, ineptitude, concealment, bureaucracy, complexity, procrastination, lack of potential.
  • How people talk about data: "I put my data in figshare and I got a DOI for it." Not our data; the university's data; my funder's data; the data; the people's data; your data.
  • Data ownership: it's messy. You need ownership to make data free. Governments may assert this. Industrial collaborators: understanding role of public funding. Research admin tracks the rules.
  • ON METADATA
  • Disciplines: current state. Typically specialised. Focussed on discipline-specific concerns. Frequently embedded, hence processing required to expose independently. Historic failure to express generic concepts generically: place, time.
  • Understanding Data Requirements http://www.dcc.ac.uk/
  • Data centres are good value! See Jisc reports on ADS, BADC, UKDA: returns on investment between 400% and 1200%.
  • Integrity. Not everyone publishes here. Almost all fraud connected to unavailable data. People suffer & die due to research fraud. When your research is reproducible it gets cited.
  • Integrity: not without data. Cyril Burt: twin studies on intelligence. Questioned 1976; now discredited. Duke case: data hiding leads to wasted treatments, clinical trials, probable death & huge lawsuits. Dutch cases: Stapel, 55 publications, fictitious data; Poldermans, fabricated data or negligence? The ca