RDAP13 Elizabeth Moss: The impact of data reuse

Viable Data Citation: Expanding the Impact of Social Science Research. RDAP13 Panel on Data Citation and Altmetrics, April 5, 2013. Elizabeth Moss, ICPSR

description

Kathleen Fear, ICPSR, University of Michigan “The impact of data reuse: a pilot study of 5 measures” Panel: Data citation and altmetrics Research Data Access & Preservation Summit 2013 Baltimore, MD April 4, 2013 #rdap13

Transcript of RDAP13 Elizabeth Moss: The impact of data reuse

Page 1: RDAP13 Elizabeth Moss: The impact of data reuse

Viable Data Citation: Expanding the Impact of Social Science Research

RDAP13 Panel on Data Citation and Altmetrics, April 5, 2013
Elizabeth Moss, ICPSR

Page 2: RDAP13 Elizabeth Moss: The impact of data reuse

At ICPSR

• Providing opportunities for tracking and measuring impact
• Linking data to the literature, and the challenges involved
• Aiding the cultural shift to viable citing practice (impact can be better measured if data use is readily discernable)

Page 4: RDAP13 Elizabeth Moss: The impact of data reuse

Top 10 Data Downloads in the Previous Six Months (non-anonymous, distinct users downloading one or more files)

ICPSR Study Title | # Downloads
National Longitudinal Study of Adolescent Health (Add Health), 1994-2008 | 1817
National Survey on Drug Use and Health, 2010 | 1109
Chinese Household Income Project, 2002 | 648
General Social Survey, 1972-2010 [Cumulative File] | 643
National Survey on Drug Use and Health, 2011 | 603
Collaborative Psychiatric Epidemiology Surveys (CPES), 2001-2003 [United States] | 527
Health Behavior in School-Aged Children (HBSC), 2005-2006 | 509
American National Election Study, 2008: Pre- and Post-Election Survey | 427
India Human Development Survey (IHDS), 2005 | 395
School Survey on Crime and Safety (SSOCS), 2006 | 339

Page 5: RDAP13 Elizabeth Moss: The impact of data reuse

Who uses these shared data?

With what impact?

Page 10: RDAP13 Elizabeth Moss: The impact of data reuse

The ICPSR Bibliography of Data-related Literature

Link research data to scholarly literature about it

• Increase likelihood of discovery and re-use
• Aid students, instructors, researchers, and funders

Page 11: RDAP13 Elizabeth Moss: The impact of data reuse

It’s really a searchable database . . .

. . . containing 65,000 citations of known published and unpublished works resulting from analyses of data archived at ICPSR

. . . that resides in Oracle, with an internal UI for database management

. . . that can generate study bibliographies linking each study with the literature about it, and out to the full text
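The idea of a study bibliography can be sketched as a link table joining studies to citations. This is only an illustration: the slide says the real database resides in Oracle, whereas the sketch below uses Python's built-in sqlite3 as a stand-in, and the table names, column names, study numbers, and citation records are all hypothetical placeholders.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE citation (
            id INTEGER PRIMARY KEY,
            reference TEXT NOT NULL,
            full_text_url TEXT
        );
        CREATE TABLE study_citation (
            study_no TEXT NOT NULL,
            citation_id INTEGER NOT NULL REFERENCES citation(id)
        );
        """
    )
    # Placeholder records, invented for this example.
    conn.executemany(
        "INSERT INTO citation VALUES (?, ?, ?)",
        [
            (1, "Placeholder article A", "https://example.org/a"),
            (2, "Placeholder report B", None),
            (3, "Placeholder thesis C", "https://example.org/c"),
        ],
    )
    # One citation may be linked to several studies, and vice versa.
    conn.executemany(
        "INSERT INTO study_citation VALUES (?, ?)",
        [("S1", 1), ("S1", 2), ("S2", 2), ("S2", 3)],
    )
    return conn


def study_bibliography(conn, study_no):
    """Return (reference, full_text_url) rows for one study, sorted by reference."""
    cursor = conn.execute(
        "SELECT c.reference, c.full_text_url "
        "FROM citation AS c "
        "JOIN study_citation AS sc ON sc.citation_id = c.id "
        "WHERE sc.study_no = ? "
        "ORDER BY c.reference",
        (study_no,),
    )
    return cursor.fetchall()


conn = build_demo_db()
for reference, url in study_bibliography(conn, "S1"):
    print(reference, url or "(no full-text link)")
```

A separate link table is used here because the same publication can analyze more than one study.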

Page 21: RDAP13 Elizabeth Moss: The impact of data reuse

It’s useful to all stakeholders

• Instructors direct students to begin data-related research projects by reading some of the major works based on the data
• Advanced researchers also use it to conduct a focused literature review before deciding to use a dataset
• Reporters and policymakers looking for processed statistics look for reports explaining studies
• Principal investigators and funding agencies want to track how data are used after they are deposited

Page 22: RDAP13 Elizabeth Moss: The impact of data reuse

But challenging to provide

Page 23: RDAP13 Elizabeth Moss: The impact of data reuse

The state of data citation in the social science literature

Page 24: RDAP13 Elizabeth Moss: The impact of data reuse

Data “Sighting” (implicit) vs. Data Citing (explicit)

Abstract? Acknowledgements? Charts and Tables? Appendices? Discussion? Footnotes? Sample? Methods?

References!

Page 25: RDAP13 Elizabeth Moss: The impact of data reuse

Typical “sightings”

• Sample described, not named, no author information, no access information, only a publication cited
• Data named in text, with some attribution, but no access information
• Cited in reference section, but with no permanent, unique identifier, so difficult for indexing scripts to find to automate tracking

Page 26: RDAP13 Elizabeth Moss: The impact of data reuse

ICPSR advocates the use of DOIs

• ICPSR has been providing citations to its data since 1990 and started assigning DOIs in 2008
• DOIs apply at the study or collection level (a study can have multiple datasets) and resolve to the study home page with richest metadata
• DOIs are of the form: doi:10.3886/ICPSR04549
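Because the DOI has a fixed, recognizable form, a script can pick it out of reference text with little ambiguity. The following is a minimal sketch, not ICPSR's actual tooling: it assumes the suffix is always "ICPSR" followed by digits, the function names are hypothetical, and the sample reference text is invented. Links are built through the public resolver at doi.org.

```python
import re

# Pattern for DOIs of the form shown on the slide: doi:10.3886/ICPSR04549.
# Assumption: the suffix is "ICPSR" followed by digits only.
ICPSR_DOI = re.compile(r"10\.3886/ICPSR(\d+)", re.IGNORECASE)


def find_icpsr_dois(text):
    """Return distinct ICPSR DOIs found in text, in order of first appearance."""
    found = []
    for match in ICPSR_DOI.finditer(text):
        doi = "10.3886/ICPSR" + match.group(1)  # normalize letter case
        if doi not in found:
            found.append(doi)
    return found


def resolver_url(doi):
    """Build a link through the public DOI resolver at doi.org."""
    return "https://doi.org/" + doi


# Hypothetical reference-list text, invented for this example.
sample = (
    "Example Author. Example Study Title [data file]. "
    "doi:10.3886/ICPSR21240. Also cited as doi:10.3886/icpsr21240 "
    "and http://dx.doi.org/10.3886/ICPSR04549"
)
for doi in find_icpsr_dois(sample):
    print(doi, "->", resolver_url(doi))
```

The same study cited twice with different letter case is reported once, which is the kind of normalization a tracking script would need before counting citations.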

Page 27: RDAP13 Elizabeth Moss: The impact of data reuse

Atypical “citing”: in the references, with the DOI

doi:10.3886/ICPSR21240

Page 28: RDAP13 Elizabeth Moss: The impact of data reuse

Challenges in database search infrastructure

• Journal databases fielded for journal article discovery are not ideal for finding data “sightation”
• No field searching on methods sections
• Full-text search brings back too many bad hits
• Limiting to abstract misses too many good hits
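The last two bullets describe a precision versus recall trade-off: full-text search finds the real data uses but buries them in bad hits, while abstract-only search returns clean hits but misses most uses. A minimal sketch with made-up toy data follows; the document ids and hit sets are hypothetical and are not ICPSR figures.

```python
def precision_recall(hits, relevant):
    """Return (precision, recall) for a set of search hits.

    precision = share of hits that are real data uses
    recall    = share of real data uses that the search found
    """
    hits, relevant = set(hits), set(relevant)
    true_hits = hits & relevant
    precision = len(true_hits) / len(hits) if hits else 0.0
    recall = len(true_hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical toy data: four publications that really use the dataset.
relevant = {"a1", "a2", "a3", "a4"}

# Full-text search: finds all four, plus six publications that only mention it.
full_text_hits = {"a1", "a2", "a3", "a4", "b1", "b2", "b3", "b4", "b5", "b6"}

# Abstract-only search: no bad hits, but only one real use is found.
abstract_hits = {"a1"}

print("full text:", precision_recall(full_text_hits, relevant))
print("abstract: ", precision_recall(abstract_hits, relevant))
```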

Page 29: RDAP13 Elizabeth Moss: The impact of data reuse

Challenges in tracking many studies

• Tension between highly curating a manageable collection and minimally maintaining a broad collection
• Too many publications for efficient collection by humans, so we must make it easy for scripts to do it reliably

Page 30: RDAP13 Elizabeth Moss: The impact of data reuse

Challenges of completeness

• Data use that is too difficult/costly to find cannot be counted

• A selective sample, difficult to draw accurate conclusions in broad analyses of re-use

Page 31: RDAP13 Elizabeth Moss: The impact of data reuse

Challenges in publishing practice, and lack of data management planning

• Publishing sequence prevents citation creation before publication
• Potential for change by educating the PI/mentor
• Consciousness raising starting to occur due to funders’ requirements

Page 37: RDAP13 Elizabeth Moss: The impact of data reuse

Poorly described and cited data
+ Excessive human search effort
= Too costly, too questionable for confident measure of impact

Page 41: RDAP13 Elizabeth Moss: The impact of data reuse

Citing data with a DOI
+ Minimal human search effort
= High hit accuracy for the cost, and better confidence of impact measures

Page 43: RDAP13 Elizabeth Moss: The impact of data reuse

Finding data with simple search fields

Integration with Web of Knowledge All Databases: Research data is equal to research literature

Page 44: RDAP13 Elizabeth Moss: The impact of data reuse

Converting journal search infrastructure to meet the needs of data, but synching metadata still a work in progress.

Articles linked to underlying data.
Increased data discovery.
Reward for data citation.
Potential for automated tracking.

Page 45: RDAP13 Elizabeth Moss: The impact of data reuse

Building a culture of viable data citation to improve measures of impact

Page 46: RDAP13 Elizabeth Moss: The impact of data reuse

Provide PIs and users with citations and DOIs for all study-level data

Page 47: RDAP13 Elizabeth Moss: The impact of data reuse

Join groups advocating viable data citing practice

Page 50: RDAP13 Elizabeth Moss: The impact of data reuse

Work with partner repositories to change publishing practice

Page 52: RDAP13 Elizabeth Moss: The impact of data reuse

Three meetings: Journal editors, domain repositories, and funders

• Establish consistent data citation in social science journals
• Encourage transparency in research
• Optimize editorial workflows: sequencing
• Develop common standards for repositories
• Find long-term funding models for repository sustainability

Page 54: RDAP13 Elizabeth Moss: The impact of data reuse

Thank you

Elizabeth Moss