Data and Donuts: The Impact of Data Management

The Impact of Data Management C. Tobin Magle, PhD Sept. 29, 2016 9:00-10:00 a.m. Morgan Library Computer Classroom 173

Transcript of Data and Donuts: The Impact of Data Management

Page 1: Data and Donuts: The Impact of Data Management

The Impact of Data Management

C. Tobin Magle, PhD

Sept. 29, 2016, 9:00-10:00 a.m.

Morgan Library Computer Classroom 173

Page 2: Data and Donuts: The Impact of Data Management

data management != data sharing

but the same principles apply to both

Page 3: Data and Donuts: The Impact of Data Management

Why should I care about data management?

Rinehart, AK. “Getting emotional about data” College & Research Libraries News September 2015 vol. 76 no. 8 437-440

Page 4: Data and Donuts: The Impact of Data Management

Everything* is digital

• Needs new skills

• Data are ephemeral

• Facilitates sharing

*OK, not everything, but most things

Page 5: Data and Donuts: The Impact of Data Management

More researchers

https://www.nsf.gov/statistics/2016/nsf16300/digest/nsf16300.pdf

Page 6: Data and Donuts: The Impact of Data Management

See arXiv:1402.4578 for details

Page 7: Data and Donuts: The Impact of Data Management

[Figure panels: Working email; Response (if email working); Status of data (if response); Data are extant (if status known)]

doi:10.1016/j.cub.2013.11.014

Page 8: Data and Donuts: The Impact of Data Management

We are losing vast amounts of data


Page 9: Data and Donuts: The Impact of Data Management

Research funding is tight

http://www.bu.edu/research/articles/funding-for-scientific-research/

Page 10: Data and Donuts: The Impact of Data Management

Funders want to do more with less

http://figshare.com/blog/2015_The_year_of_open_data_mandates/143

Page 11: Data and Donuts: The Impact of Data Management

White House’s 2013 OSTP memorandum

“The Obama Administration is committed to the proposition that citizens deserve easy access to the results of research their tax dollars have paid for. That’s why, in a policy memorandum released today, OSTP Director John Holdren has directed Federal agencies with more than $100M in R&D expenditures to develop plans to make the results of federally funded research freely available to the public—generally within one year of publication.”

http://www.whitehouse.gov/blog/2013/02/22/expanding-public-access-results-federally-funded-research

Page 12: Data and Donuts: The Impact of Data Management

NSF post-award requirements

“Investigators are expected to share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants. Grantees are expected to encourage and facilitate such sharing.”

http://www.nsf.gov/pubs/policydocs/pappguide/nsf11001/aag_6.jsp#VID4

Page 13: Data and Donuts: The Impact of Data Management

In other words…

Page 14: Data and Donuts: The Impact of Data Management

It’s good for science

• Improves research reproducibility

• Improves efficiency

• Spurs innovation

Page 15: Data and Donuts: The Impact of Data Management

It’s good for you

• You are the future data user

• Your data get used (and cited)

• Exposure to collaborators

• More competitive grants

Page 16: Data and Donuts: The Impact of Data Management

But wait…

Barriers to data sharing

Page 17: Data and Donuts: The Impact of Data Management

“But it’s mine, I don’t want to share!”

• Usually funded by public money
  • See White House statement

• If you work for CSU, the university actually owns your data
  • You are the steward
  • CSU promotes open data

Page 18: Data and Donuts: The Impact of Data Management

“But my data are too small to be useful”

Page 19: Data and Donuts: The Impact of Data Management

“But I work with sensitive/private data”

• CAN share deidentified data

• CAN share summary data
  • https://clinicaltrials.gov/

• Controlled access
  • See dbGaP @ NCBI re: NIH genomic data sharing policy

• Release metadata so people know the data exist and can ask for them

• Identifying personal genomes by surname inference
  • https://www.ncbi.nlm.nih.gov/pubmed/23329047
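As one possible illustration of the "deidentified data" bullet above, the sketch below drops direct-identifier columns from a small table before sharing. The column names and the sample record are hypothetical, chosen only for this example. Removing such columns is a first step rather than a guarantee: as the surname-inference reference in the list suggests, the remaining fields may still allow re-identification.

```python
import csv
import io

# Hypothetical column names treated as direct identifiers (an assumption
# made for this sketch, not a list taken from the slides).
DIRECT_IDENTIFIERS = {"name", "email", "date_of_birth"}


def drop_direct_identifiers(rows, identifiers=DIRECT_IDENTIFIERS):
    """Return copies of the rows without the listed identifier columns."""
    return [
        {key: value for key, value in row.items() if key not in identifiers}
        for row in rows
    ]


# Hypothetical raw record, inlined so the example is self-contained.
raw = io.StringIO(
    "name,email,date_of_birth,age_group,outcome\n"
    "Ada Example,ada@example.org,1980-01-01,30-39,improved\n"
)
records = list(csv.DictReader(raw))

# The original records are left untouched; only the copy is shared.
shareable = drop_direct_identifiers(records)
print(shareable)
```

Which columns count as identifying depends on the study and on the applicable policy, so the set above would need to be reviewed for each dataset.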

Page 20: Data and Donuts: The Impact of Data Management

“But I’m planning to apply for a patent!”

• OK, data sharing isn’t right for you

• But good data management practices have benefits even if you don’t share!

• Can share later

Page 21: Data and Donuts: The Impact of Data Management

What is data management?

The policies, practices and procedures needed to manage the storage, access and preservation of data produced from a research project

Page 22: Data and Donuts: The Impact of Data Management

Where does data management fit into research?

Throughout the whole research cycle

Page 23: Data and Donuts: The Impact of Data Management

Hypothesis

The research cycle

Page 24: Data and Donuts: The Impact of Data Management

Hypothesis

Experimental design

The research cycle

Page 25: Data and Donuts: The Impact of Data Management

Hypothesis

Data

Experimental design

The research cycle

Page 26: Data and Donuts: The Impact of Data Management

Hypothesis

Data

Experimental design

Results

The research cycle

Page 27: Data and Donuts: The Impact of Data Management

Hypothesis

Data

Experimental design

Results

Article

The research cycle

Page 29: Data and Donuts: The Impact of Data Management

Hypothesis

Data

Experimental design

Results

Article

Data Management Plans

The research cycle

Page 30: Data and Donuts: The Impact of Data Management

Hypothesis

Raw data

Experimental design

Tidy Data

Results

Article

Data Management Plans

Cleaning

Analysis

The research cycle
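
The cleaning step between raw data and tidy data can be sketched as follows. This is a minimal illustration, assuming that "tidy" here means one observation per row and one variable per column. The sample table, the column names, and the missing-value markers are hypothetical.

```python
# Hypothetical raw table: one row per sample, one column per time point.
raw_rows = [
    {"sample": "A", "day1": "1.2", "day2": "", "day3": "1.9"},
    {"sample": "B", "day1": "0.8", "day2": "1.1", "day3": "NA"},
]

# Markers treated as missing values (an assumption for this sketch).
MISSING = {"", "NA"}


def tidy(rows, id_column="sample"):
    """Reshape wide rows into one record per (sample, timepoint) observation."""
    records = []
    for row in rows:
        for column, value in row.items():
            # Skip the identifier column itself and any missing measurements.
            if column == id_column or value in MISSING:
                continue
            records.append(
                {id_column: row[id_column], "timepoint": column, "value": float(value)}
            )
    return records


# The raw rows are not modified; the tidy table is a new list.
tidy_rows = tidy(raw_rows)
for record in tidy_rows:
    print(record)
```

Keeping the raw rows unchanged and writing the cleaning as code means the step can be rerun and inspected later. Dropping missing values, rather than keeping them as explicit gaps, is a choice made only for this example.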

Page 31: Data and Donuts: The Impact of Data Management

Hypothesis

Raw data

Experimental design

Tidy Data

Results

Article

Data Management Plans

Cleaning

Sharing

Analysis

Open Data

Closed Data

Archiving

The research cycle

Page 32: Data and Donuts: The Impact of Data Management

Hypothesis

Raw data

Experimental design

Tidy Data

Results

Article

Data Management Plans

Cleaning

Sharing

Analysis

Open Data

Code Reproducible Research

Closed Data

Archiving

The research cycle

Page 33: Data and Donuts: The Impact of Data Management

Hypothesis

Raw data

Experimental design

Tidy Data

Results

Article

Data Management Plans

Cleaning

Sharing

Analysis

Open Data

Code Reproducible Research

Reuse

Closed Data

Archiving

The research cycle
