Scientists’ Data and Information Practices and Needs

Scientists’ Data and Information Practices and Needs Carol Tenopir, University of Tennessee and Mike Frame, USGS June 15, 2011 UC3 Summer Webinar Series


Transcript of Scientists’ Data and Information Practices and Needs

Page 1: Scientists’ Data and Information Practices and  Needs

Scientists’ Data and Information Practices and Needs

Carol Tenopir, University of Tennessee and

Mike Frame, USGS

June 15, 2011

UC3 Summer Webinar Series

Page 2: Scientists’ Data and Information Practices and  Needs

Scientists’ Data and Information Practices and Needs:

A Baseline Assessment & Implications for Libraries

Carol Tenopir, University of Tennessee and

Mike Frame, USGS

Co-Leaders of the DataONE Usability & Assessment Working Group

Page 3: Scientists’ Data and Information Practices and  Needs

Provide universal access to data about life on earth and the environment that sustains it

1. Build on existing cyberinfrastructure
2. Create new cyberinfrastructure
3. Support new communities of practice

Page 4: Scientists’ Data and Information Practices and  Needs

Scientists

Data Managers

Public Officials

Citizen-scientists

Libraries & Librarians

Students & Teachers

Assessment-stakeholders

Publishers

Page 5: Scientists’ Data and Information Practices and  Needs


Collect

Assure

Describe

Deposit

Preserve

Discover

Integrate

Analyze

Data Life Cycle

Assessment

Page 6: Scientists’ Data and Information Practices and  Needs

Baseline Assessment of Scientists (2010)

n=1329, n=1317

Primary Discipline

social sciences: 15%
computer science/engineering: 9%
physical sciences: 12%
environmental sciences & ecology: 36%
atmospheric science: 4%
biology: 14%
medicine: 2%
other: 7%

Primary Work Sector

academic: 80%
government: 13%
others: 8%

Page 7: Scientists’ Data and Information Practices and  Needs

Meet the Scientists: Joe & Mabel


Joe is a biodiversity scientist employed by a government agency. He acts as a program manager and consultant. Joe oversees collection of new data in the field and also manages historical data from other providers. Joe has data from a variety of different projects conducted over the years.

Mabel is an academic environmental scientist. She collects and records data in the field on a variety of specimen variables and environmental impacts. Mabel has a data set related to her personal research interests, as well as data collected for a university museum collection.

Page 8: Scientists’ Data and Information Practices and  Needs

Lessons Learned


Page 9: Scientists’ Data and Information Practices and  Needs

1. Scientists need a variety of data types and many scientists are interested in sharing data.


Page 10: Scientists’ Data and Information Practices and  Needs

Data Types

experiment: 54%
observational: 48%
data models: 38%
biotic survey: 34%
abiotic survey: 33%
remote-sensed abiotic: 27%
remote-sensed biotic: 20%
social science survey: 19%
interviews: 15%
Other: 6%

Page 11: Scientists’ Data and Information Practices and  Needs

Current Sharing Practices

share my data with others: 75%
place at least some of my data into a central data repository: 78%
place all of my data into a central data repository: 41%

Page 12: Scientists’ Data and Information Practices and  Needs

Many are interested in sharing data (percent agree)

Willing to place all of my data into a central data repository with no restrictions: 41%
Appropriate to create new datasets from shared data: 76%
Willing to place at least some of my data into a central data repository with no restrictions: 78%
Willing to share data across a broad group of researchers: 81%

Page 13: Scientists’ Data and Information Practices and  Needs

Joe & Mabel: About Sharing Data


“If NBII required anyone who extracted data through the portal to also share data with the portal, then a resounding yes.”

“I’m interested in having data available to researchers interested in larger questions, particularly climate change questions.”

“We are torn between putting it out there for everyone and worry about suffering the risk of something bad happening with it. Saddest thing would be if the data loses its use, where it isn’t shared.”

“I don’t think I would be opposed to it. It would not be a decision I would make personally; we would have to have permission to share.”

Page 14: Scientists’ Data and Information Practices and  Needs

2. There are many barriers to sharing data and conditions that must be met.


Page 15: Scientists’ Data and Information Practices and  Needs

Gap Between Willingness to Share and Accessibility

place at least some of my data into a central data repository: 78%
place all of my data into a central data repository: 41%
Others can access my data easily: 36%

Page 16: Scientists’ Data and Information Practices and  Needs

Interest in Data Sharing

use other researchers' datasets if their datasets were easily accessible: 84%
willing to share data across a broad group of researchers: 81%
it is appropriate to create new datasets from shared data: 76%

Page 17: Scientists’ Data and Information Practices and  Needs

Conditions on data sharing (percent agree)

Reprints of articles: 70%
Reciprocal sharing agreement: 72%
Opportunity to collaborate: 81%
Acknowledge provider/funder: 93%
Formally cite provider/funder: 95%

Page 18: Scientists’ Data and Information Practices and  Needs

More challenges .. (percent agree)

Lack of funding: 40%
Insufficient time to make data available: 54%
No place to put data: 24%
Don't have the rights to make the data public: 24%

Page 19: Scientists’ Data and Information Practices and  Needs

More challenges .. (percent agree; two data series, unlabeled in the transcript)

Lack of funding: 43%, 40%
Insufficient time to make data available: 62%, 54%
No place to put data: 24%, 24%
Don't have the rights to make the data public: 18%, 24%

Page 20: Scientists’ Data and Information Practices and  Needs

Joe & Mabel: About Restrictions & Conditions to Sharing Data


“We want to make sure that those of us who have been involved in gathering the data get appropriate recognition for it.”

“If someone were to ask about rare or endangered plants, I would limit that information to appropriate people: natural heritage, universities and federal agencies.”

“We will share it with people who want to use the data for restoration or research. If a consultant wants data to make money, then we are hesitant to hand it out.”

“Is there a mechanism by which we can know when our data is being used? Knowing how valuable we are to the general public comes from the use of our data.”

Page 21: Scientists’ Data and Information Practices and  Needs

3. There are different needs, attitudes, and practices between scientists who work in government agencies and those who work in academia.


Page 22: Scientists’ Data and Information Practices and  Needs

“I am satisfied with …” (percent agree/strongly agree; Government vs. Academic)

the process for cataloging/describing data
the tools for preparing my documentation
tools and technical support for data management during the life of the project
formal established process to store data beyond the project

Chart values: 62%, 46%, 40%, 35%, 48%, 34%, 52%, 53%

Page 23: Scientists’ Data and Information Practices and  Needs

• Academic respondents are more likely to have sole responsibility for approving access to some or all of their datasets.
– Academic 83%, Government 63%

Responsibilities for Data

Page 24: Scientists’ Data and Information Practices and  Needs

• Government respondents are more likely to agree their organization was involved in:
– “managing data during the life of the project”: Government 52%, Academic 39%
– “storing data beyond the life of the project”: Government 53%, Academic 46%

Organizational Involvement

Page 25: Scientists’ Data and Information Practices and  Needs


“If other people are using my data then I somehow need to report that. I need to know how it’s being used and if any publications result.”

“I don’t have anything I’m keeping private. I’m willing to put it all out there.”

“I don’t have the authority to make decisions about data sharing.”

“Our data sharing policy makes it difficult for us to withhold parts of the datasets we receive. As a result, some data contributors only share sub-sets of their data.”

Joe & Mabel: The View from Government & Academic Organizations

Page 26: Scientists’ Data and Information Practices and  Needs

4. The skill level of scientists and use and access to appropriate tools varies across the data life cycle.


Page 27: Scientists’ Data and Information Practices and  Needs

What metadata standard do you currently use? (number of respondents)

DIF: 12
DwC: 21
DC: 26
EML: 95
FGDC: 95
Open GIS: 96
ISO: 97
My Lab: 266
none: 676

Page 28: Scientists’ Data and Information Practices and  Needs


“We are currently redoing all of our collection databases at the museum. We are building an in-house system. We looked at available standards and decided to write our own.”

“For my research, very little metadata has been created. For metadata associated with the museum collection, Darwin Core has been used.”

“For contemporary sets, the person who submits the data also submits a metadata record. We create another record representing what we think it is. We have one version of the data, submitter may have a version they keep on their website. We want to be able to show that these are two different things.”

“We write FGDC records.”

Joe & Mabel: About Metadata

Page 29: Scientists’ Data and Information Practices and  Needs

5. Scientists need assistance across the data life cycle.


Page 30: Scientists’ Data and Information Practices and  Needs

My organization provides…

                                                             % Government   % Academic
Training on best practices                                        23            21
Funds for data management long-term                               27            20
Funds for data management short-term                              34            29
Tools and technical support for data management long-term         39            34
Tools and technical support for data management short-term        48            43

Page 31: Scientists’ Data and Information Practices and  Needs

More challenges .. (percent agree)

Lack of funding: 40%
Insufficient time to make data available: 54%
No place to put data: 24%
Don't have the rights to make the data public: 24%

Page 32: Scientists’ Data and Information Practices and  Needs

Joe & Mabel: Looking for Assistance


“It is cumbersome to put those data sets together, but only because it is important. If there were ways to automate some of that information collection out of the data sets, it would help.”

“Maximum utility of the data would require geo-referencing of the data. We would need help geo-referencing the part of the collection that isn’t geo-referenced.”

“Ideally, we would like for our research results to be disseminated in a way that’s accessible and digestible to not just academics but to everybody.”

“Manpower. We need more people to handle these sorts of things.”

Page 33: Scientists’ Data and Information Practices and  Needs

Data Life Cycle Scientist Challenges

Collect, Assure, Describe, Deposit, Preserve, Discover, Integrate, Analyze

Are there standards?
How do I preserve my data?
What tools do I use?
Will I get credit for my work?
How much will it cost?
What is a data management plan?
Who can help me?
What is metadata?
Where do I preserve my data?

Page 34: Scientists’ Data and Information Practices and  Needs

Year 1 Year 2 Year 3 Year 4 Year 5

Scientists: BL

Future Assessments

Scientists: FU

Librarians: BL Librarians: FU

Policy Makers: BL Policy Makers: FU

Educators: BL Educators: FU

Library Policies: BL Library Policies: FU

Page 35: Scientists’ Data and Information Practices and  Needs

Library and Librarian Surveys

• Library (1 per library): current practices
• Librarian (individuals): attitudes and perceptions
• Started with ARL libraries (spring and summer 2011; 38 library responses and 223 librarians so far)
• Will expand to other North American academic libraries and librarians

Page 36: Scientists’ Data and Information Practices and  Needs

Librarian & Library Assessment

Collect, Assure, Describe, Deposit, Preserve, Discover, Integrate, Analyze

Stewardship role (select & deselect)?
Are RDS priority?
Role in partnering with researcher?
Level of knowledge & skills?
Is there an agency repository that accepts data?
Level of participation with data?
Role of librarian discovering data?
Level of involvement with metadata?
Role of the librarian to help preservation?

Page 37: Scientists’ Data and Information Practices and  Needs

Library Survey: Research Data Services (RDS)

- Research data reference/consultation services to researchers are provided by individual discipline librarians (33%) or dedicated data librarians (17%) or a combination of both (50%).

- Almost half of the libraries (45%) do not have policies and/or procedures associated with research data services.

Page 38: Scientists’ Data and Information Practices and  Needs

Library Survey: Collaboration for RDS

n=18

Page 39: Scientists’ Data and Information Practices and  Needs

Library Survey: Staffing issues

n=28

Page 40: Scientists’ Data and Information Practices and  Needs

Library Survey: Opportunities for Staff for RDS

n=25

Page 41: Scientists’ Data and Information Practices and  Needs

Librarian Survey

– Distributed to 950 librarians
– Science, data, metadata, scholarly communication, digital collection, electronic resources librarians
– 223 people replied to at least one question

Page 42: Scientists’ Data and Information Practices and  Needs

Librarian Survey

• Interact with faculty, students, or staff in support of RDS 28% Yes-integral part, 41% Yes-occasionally, 32% No (n=221)

• With faculty or staff consultation on

n=192

n=194

n=193

Page 43: Scientists’ Data and Information Practices and  Needs

Frequency of research data services performed by the librarian

n=167

n=167

n=167

n=166

Page 44: Scientists’ Data and Information Practices and  Needs

Librarian Survey

• Outreach and collaboration w/ other RDS
– Off campus: 61% never, 34% few times a year (n=157)
– On campus: 51% never, 34% few times a year (n=157)

• Participation in … about RDS

                                     daily   once a week   once a month   few times a year   never     n
informal discussion groups             2%        6%            20%              49%            24%     158
working groups/professional groups     3%        8%            12%              40%            39%     158
policy development                     4%        4%             9%              34%            50%     158
strategic planning                     3%        4%            11%              40%            42%     156

Page 45: Scientists’ Data and Information Practices and  Needs

Librarian Survey: Skills & Expertise

48%, 57%, 31%, 51%

n=157, n=156, n=156, n=157

Page 46: Scientists’ Data and Information Practices and  Needs

Librarian Survey

Most important motivation to be involved in RDS

RDS are important to subject discipline I support: 25%
RDS is primary responsibility: 23%
personal interest in RDS: 16%
My job includes facilitating data contributions to our institutional repository: 14%
My job includes metadata creation, training, and/or management: 13%
Other: 9%
My research includes RDS: 2%

Page 47: Scientists’ Data and Information Practices and  Needs

Next steps

• Follow-up to ARL libraries and librarians
• Expand scope to other academic libraries
• Federal libraries/librarians
• Data Managers
• Other Working Groups looking at citizen scientists and UG educators