Library Data Management Services

Library Data Management Services: A Strategic Framework for Development and Implementation. NN/LM MAR Research Data Management Symposium, 28 April 2014. Keith Webster, Dean of University Libraries


Library Data Management Services: A Strategic Framework for Development and Implementation

NN/LM MAR Research Data Management Symposium, 28 April 2014

Keith Webster, Dean of University Libraries

Our professional future
• The genealogy of the contemporary research library
• The risk of invisibility
• The emergence of open science
• Our core professional skills

The data management imperative
• An overview of data management
• The policy context
• Data management activities

Putting it into practice
• The UQ experience
• DM service philosophy
• Partnerships
• Skills development
• The Australian model
• Emerging services at CMU

Our professional future

The genealogy of the contemporary research library

Where do library clients go?

Where do students start a search? (percent)

Search engine          83
Wikipedia               7
SNS                     2
Email                   1
Online database         1
Virtual reference       0
Library website         0

Perceptions of libraries 2010, OCLC

Where do academics begin research?

[Chart comparing 2003 and 2012: Specific e-resource, General search engine, Library catalogue, Library building]

Faculty study 2012: key insights for libraries and publishers, Ithaka

Our professional future

The risk of invisibility

What is happening in the world is bypassing university libraries

Peter Murray-Rust, The scientist’s view, JISC Libraries of the future debate, April 2009

“…contact with librarians and information professionals is rare”

“…researchers are generally confident in their [self-taught] abilities.., librarians see them as..relatively unsophisticated”

“…librarians see it as a problem that they are not reaching all researchers with formal training, whereas most researchers don’t think they need it”

• The part that academic librarians should play remains unclear

• Raise awareness of eResearch amongst library staff

• Provide advice on data management to eResearchers

• Data curation is vast, complex and requires subject input

• “The bad news is that I’m not sure they understand what goes on in the library other than taking out books.”

Benton Foundation, 1996

• “User perceptions negatively affect the ability of librarians to meet information needs simply because a profession cannot serve those who do not understand its purpose and expertise.”

Durrance, 1988

The worst thing about the stereotype is that it impacts on the psyche of librarians who really begin to believe that they don't deserve the kingpin role

US Congress, 2001

Our professional future

The emergence of open science

• It is likely that the way that researchers publish, assess impact, communicate, and collaborate will change more within the next 20 years than it did in the past 200 years.

http://book.openingscience.org/

Modes of knowledge production

Mode 1

• Driven by academic discipline
• Knowledge framed by disciplinary norms
• Deeply institutionalised
• Accountability to peers
• Scientist is expert
• Quality control by peer review and contribution to discipline

Mode 2

• Driven by end-users
• Interdisciplinary knowledge
• Collaborative across sectors
• Transitory research teams
• Accountability (social and economic) to range of stakeholders
• Quality control (academic merit, cost effectiveness, economic and social relevance)

(Gibbons [et al], 1994)

Funding structures and requirements

• External funding
• Diverse source of funding
• Government
• Not-for-profit
• Industry

• Economic outcomes
– increase wealth creation & prosperity
– improve nation’s health, environment & quality of life

• Innovation
• Improved competitiveness
• “Commercialisation” of research
• Less “curiosity-driven” activity

Aims of research funders

• Fund the best research to meet the needs of the country

• Develop leaders and researchers who can meet national and global priorities

• Foster public engagement with research

• Funding international collaboration

Open access, open data, open science

Increasingly, the “private” nature of academic science is being displaced by a culture of openness - ideas, approaches and observations are shared at the earliest opportunity with colleagues - and sometimes the world at large.

Whilst the ‘version of record’ approach to journal article creation retains validity, this is increasingly seen as a compliance matter - required to meet career objectives and funder/government requirements.

Traditional enquiry-driven research has been supplanted by reflexive research, driven by the increasingly necessary flow of external research funding into universities. Largely, this comes from government agencies, but charities (such as the Wellcome Trust) and industry are also powerful sponsors of high-quality activity.

This state has led to the notion of the triple-helix of research - academe, industry and government.

In turn, these inter-relationships have spawned a major industry around assessing and evaluating the impact of research. Initially, the aim was to drive up standards; this is now shifting to a culture of openness, and a desire to foster public engagement.

Useful knowledge

Sharable knowledge

http://michaelnielsen.org/blog/the-future-of-science-2/

About 35 percent of scientists are using things like blogs to consume and produce content. There is an explosion of online tools and platforms available to scientists, ranging from Web 2.0 tools modified or created for the scientific world to Web sites that are doing amazing things with video, lab notebooks, and social networking.

The next generation of PIs is already establishing new behaviors. They feel comfortable blogging, using social media tools, and using wikis to advance their research. It will take the big institutions to support open-access journals, for example. And it will take technological innovation in the form of software that is purpose-built for this unique community and its set of challenges.

We’re talking about something as fundamental and important as modernizing the architecture of science.

Adam Bly

http://seedmagazine.com/content/article/science_2.0_pioneers/

There are a billion people connected to the Web. At least one of them has a smarter idea about what to do with your data than you do.

James Boyle

Our professional future

Our core professional skills

How do we add value?

• British Library adds £419m of value to the economy each year

http://www.bl.uk/aboutus/stratpolprog/increasingvalue/britishlibrary_economicevaluation.pdf

Making a difference

Adverse event avoided             Percent
Hospital admission                11.5
Hospital acquired infection        8.2
Surgery                           21.2
Additional tests/procedures       49.0
Additional out-patient visits     26.4
Patient mortality                 19.2

Marshall (1994) The impact of information services on decision making

Collection-centric - 1st generation

Client-focused - 2nd generation

Experience-centered - 3rd generation

Connected Learning Experiences and Information Specialists in the Research Process - 4th generation

Current priorities in academic libraries

1. Continue and complete migration from print to electronic and realign service operations

2. Retire legacy collections

3. Continue to repurpose library as primary learning space

4. Reposition library expertise and resources to be more closely embedded in research and teaching enterprise outside library

5. Extend focus of collection development from external purchase to local curation

Lewis (2007); Webster (2010, 2012)

CORE SCHEMA (CILIP, 2004)


The data management imperative

An overview of data management

Why Data Management Services?

"The Board believes that timely attention to digital research data sharing and management is fundamental to supporting U.S. science and engineering in the twenty-first century.

...strong and sustainable data sharing and management policies [are] a critical national need."

Digital Research Data Sharing and Management, December 2011

Task Force on Data Policies, Committee on Strategy and Budget

National Science Board

More data will be created in the next five years than has been collected in the whole of human history. Properly managed, this data will form a major resource for Australian researchers.

"Create a comprehensive framework...that provide[s] reliable, effective access to the full spectrum of public digital scientific data."

2009

Research collaboration is associated with high academic and wider impact

International collaboration is associated with high academic impact

Data can be shared easily across borders

Sharing data?

• Create opportunities
– For re-analysis and re-use
– To facilitate collaboration

• Solve problems
– Waste of money, people and effort
– Loss of irretrievable data
– Inability to verify research

• Issues and challenges
– Patient confidentiality
– IP and discovery protection

• Promote curation rather than sharing?

The data management imperative

The policy context

• The rapid development in computing technology and the Internet have opened up new applications for the basic sources of research — the base material of research data — which has given a major impetus to scientific work in recent years.

• Access to research data increases the returns from public investment in this area; reinforces open scientific inquiry; encourages diversity of studies and opinion; promotes new areas of work and enables the exploration of topics not envisioned by the initial investigators.

• The value of data lies in their use. Full and open access to scientific data should be adopted as the international norm for the exchange of scientific data derived from publicly funded research.

• Builds upon work in Fort Lauderdale biological data sharing principles

http://www.nature.com/nature/journal/v461/n7261/pdf/461168a.pdf


Key points

• Publicly funded research data are a public good, produced in the public interest, which should be made openly available with as few restrictions as possible in a timely and responsible manner that does not harm intellectual property.

• To ensure that the research process is not damaged by inappropriate release of data, research organisation policies and practices should ensure that these are considered at all stages in the research process.


Institutions are to retain research data, provide secure data storage, identify ownership, and ensure security and confidentiality of research data.

Researchers are to retain research data and primary materials, manage storage of research data and primary materials, maintain confidentiality of research data and primary materials.

Australian requirements

1. Intellectual property
2. Data management, including:
◦ Storage
◦ Retention
◦ Disposal
◦ Access, publication, description
3. Conflict of interest: do all parties have the same understanding about the use of the data?
4. Collaboration and contractual agreements
5. Ethics and privacy
6. Compliance


“The Holdren Memo”

To achieve the Administration’s commitment to increase access to federally funded published research and digital scientific data, Federal agencies investing in research and development must have clear and coordinated policies for increasing such access.

Memo on Increasing Access to the Results of Federally Funded Scientific Research

White House Office of Science and Technology Policy

February 22, 2013

Data Management and Sharing

Soon (late 2014?)

The data management imperative

Data management activities


What do we mean by RDM?

Data Retention Policy

Repository Data Policy

Data Visualization

Data Management Planning

File Formatting

Metadata

Discovery

Grant Writing

Registry

Intellectual Property Issues


Research Data Lifecycle

[Cycle diagram: Conceptualize Project, Collect Data, Data Analysis, Publication, Data Archiving, Data Reuse]

Compliance-Side Economics

Pre-Award Compliance: Data Management Planning

Post-Project Compliance: 3+ Years Data & Institutional Repositories

Data Services Program

Data Management Planning

Data & Institutional Repositories

Operational DMP & Compliance Checklist

Check Up Visits

Compliance Assessment

Data Management Training

Data Consult. & Staging

3+ Years


Putting it into place

Data management service philosophy

What might our service offer?

• Teaching or doing?

• Compliance or support?

• Storage or registering?

• Policy advice vs policy development

• Institution-wide or in response to requests?

• Advising on data re-use (sources, analysis etc)

Data curation lifecycle

Putting it into place

Partnerships

Likely partners

• Office of Research

• Ethics/privacy/legal experts

• Computing specialists

• High performance computing

Other sources of help

• National data services

• Data archives

• Research funding agencies

• Other libraries

• Growing number of books and reports

• Specialist advice

Putting it into place

Skills development

Collections grid

Axes: stewardship (high to low) and uniqueness (low to high)

• High stewardship, low uniqueness: Books, Journals, Newspapers, Gov. docs, CD, DVD, Maps, Scores

• High stewardship, high uniqueness: Special collections, Rare books, Local/Historical newspapers, Local history materials, Archives & Manuscripts, Theses & dissertations

• Low stewardship, high uniqueness: Research, learning and administrative materials (ePrints/tech reports, Learning objects, Courseware, E-portfolios, Research data, Institutional records, Reports, newsletters, etc)

• Low stewardship, low uniqueness: Freely-accessible web resources, Open source software, Newsgroup archives

http://www.slideshare.net/lisld/collections-grid

Librarians’ competencies profile for RDM

Key roles

• Providing access to data
– Identification of data sets; discovery and analytic tools; advice on informatics

• Advocacy and support for managing data
– Policy development; articulating benefits; promoting data sharing and reuse; education and training; data audits

• Managing data collections
– Preparing for data deposit; appraisal; selection; ingestion; curation; preservation; storage and backup

Based on ARL draft distributed at CNI conference, St Louis, April 2014

Librarians’ competencies profile for RDM

Core competencies

• Providing access to data
– Data centres and repositories; organization and structure of data; licensing and IP; manipulation and analysis

• Advocacy and support for managing data
– Research funder mandates; DMP; research workflows; disciplinary norms; journal requirements; data audit and assessment tools

• Managing data collections
– Metadata; discovery tools and indexing; database design; data linking; forensic procedures in data curation

LIS2975 @ Pitt iSchool

• The Data Landscape
• Universities and Data
• Data Requirements and Capability
• RDM Roadmaps, Strategy and Planning
• Data Management Plans
• Disciplinary Data 1
• Legal and Ethical Data Issues
• Disciplinary Data 2
• Data Centres
• Data Advocacy, Skills and Training
• Data Sustainability and Cost


Putting it into place

The Australian model

Research infrastructure

Teaching and Learning Services

Social Sciences and Humanities

Life Sciences

Engineering and Applied Science


Research Infrastructure

• Institutional repository
• Research data catalogue
• Research publications reporting and evaluation
• Digitisation

Early progress

• Lead institution in APSR
• Development of eSpace
• ANDS at UQ
• Seminars and workshops from 2007 onwards
• Partnering with eScience and HPC institutes
• Strong involvement across all disciplines

Service model

• Data management interview and planning

• Consultancy

• Legal advice

• Pointers to other resources - eg for storage

• Data description and publication

• Long-term preservation

• Feeds to Research Data Australia
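As an illustration only, the "data description and publication" step in the service model above could be sketched as a small record that is checked and serialised before it is handed to a catalogue. The class name, field names and JSON layout are hypothetical; they are not the format used by UQ or by Research Data Australia.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import List


@dataclass
class DatasetDescription:
    """Hypothetical minimal description of one research dataset."""
    title: str
    creators: List[str]
    description: str
    storage_location: str            # pointer to where the data lives
    access_rights: str = "open"      # assumed default, for illustration
    keywords: List[str] = field(default_factory=list)


def to_feed_entry(record: DatasetDescription) -> str:
    """Check the descriptive fields and return the record as a JSON string."""
    required = ("title", "creators", "description")
    missing = [name for name in required if not getattr(record, name)]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    return json.dumps(asdict(record), sort_keys=True)
```

A record with an empty title or no creators is rejected before it reaches the catalogue, which mirrors the idea that description comes before publication.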


Putting it into place

Emerging services at CMU

CMU Faculty Senate

WHEREAS

• Researchers in all disciplines are faced with a range of data management needs as research becomes more collaborative, data-intensive, and computational,

• And the Office of Science and Technology Policy directive issued February 22, 2013, requires federal agencies that fund research to mandate public access and re-use rights to peer-reviewed publications and digital data arising from that funding,

• And the federal Open Data Policy issued May 9, 2013, stipulates the requirements for sharing and enabling re-use of digital data,

• And data sharing and re-use increase the accountability, verification, impact, and return on investment in research,

• And technical expertise and support services are required to meet researcher needs, funding imperatives, and public policy goals,

• And an institutional commitment to effective data management is required for faculty to participate

THEREFORE BE IT RESOLVED THAT CARNEGIE MELLON UNIVERSITY

• Charge the University Libraries, Office of Sponsored Programs, Office of Research Integrity and Compliance, and Computing Services to collaborate and provide the community with core services and tools for managing data throughout the data life cycle.

• Promote these services and tools and encourage faculty to use them to manage and share their data.

• Study means by which faculty can participate effectively.

• Establish incentives and community norms for effective data management and sharing.

• Provide ongoing financial support to the units providing services and tools, including support for the infrastructure, personnel, education and training needed to sustain long-term data management and curation.

• Develop a research data management policy, establishing the University’s commitment to long term data management, and aligned with federal agency requirements and open data initiatives. This policy and progress towards its implementation will be posted on relevant web pages.

A. Research Data must be created, maintained, protected, and shared in accordance with contractual, legislative, regulatory, ethical and other relevant requirements.

B. Where permitted, management and sharing of Research Data should be supported through the allocation of the funding that supported the research.

C. Rights assigned to Research Data should not unnecessarily restrict its management, sharing, or reuse.

D. A Data Management Plan (DMP) should be documented for all research projects that will produce Research Data, with exceptions noted.

E. Following completion of a research project, the Research Data to be shared should be deposited in one or more Trusted Data Repositories for access and preservation.

F. Research Data shared by University Researchers should be registered with the University Libraries, regardless of whether access to the Data is hosted by the University or a third party.

G. Shared Research Data should be made available for access and reuse in a timely manner, in compliance with funding or other requirements.

H. Shared Research Data should be curated and preserved in sufficient detail for the full Period of Retention, in conformance with this Policy or with legislative, regulatory, or contractual obligations.

I. Shared Research Data produced or used during research should be cited in all research outputs following accepted or emerging data citation practices.
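A minimal sketch of how principles D, E, F and H above might be turned into a per-dataset checklist. This is an assumption-laden illustration, not CMU's implementation: the class, its fields and the wording of each action are hypothetical, and the policy text does not say how the Period of Retention is recorded.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class SharedDataset:
    """Hypothetical tracking record for one shared research dataset."""
    title: str
    has_dmp: bool = False                                   # principle D
    repositories: List[str] = field(default_factory=list)   # principle E
    registered_with_libraries: bool = False                 # principle F
    retention_until: Optional[date] = None                  # principle H (assumed field)


def outstanding_actions(record: SharedDataset) -> List[str]:
    """List the policy principles the record does not yet satisfy."""
    actions = []
    if not record.has_dmp:
        actions.append("D: document a Data Management Plan")
    if not record.repositories:
        actions.append("E: deposit in a Trusted Data Repository")
    if not record.registered_with_libraries:
        actions.append("F: register with the University Libraries")
    if record.retention_until is None:
        actions.append("H: record the Period of Retention")
    return actions
```

A checklist of this kind only covers the principles that can be expressed as yes/no facts about a dataset; principles such as A and C still need human judgment.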


Core Steering Support Collaboration
