Data Citation: Key to Discovery, Reuse, and Tracking Impact Curating and Managing Research Data for...

Data Citation: Key to Discovery, Reuse, and Tracking Impact Curating and Managing Research Data for Reuse ICPSR Summer Program August 2, 2013 Elizabeth Moss, MSLIS [email protected]


Data Citation: Key to Discovery, Reuse, and Tracking Impact

Curating and Managing Research Data for Reuse
ICPSR Summer Program, August 2, 2013

Elizabeth Moss, [email protected]


Today’s talk

1. A tour of the ICPSR Bibliography of Data-related Literature
2. The challenges of tracking data reuse (you have to be able to discern data use before you can track data reuse)
3. Efforts to improve citing standards and practices, leading to sharing and impact


Top 10 Data Downloads (first half of the year)

Title | Archive | Number of downloads
National Longitudinal Study of Adolescent Health (Add Health), 1994-2008 | DSDR | 2,062
National Survey on Drug Use and Health, 2011 | SAMHDA | 1,216
Chinese Household Income Project, 2002 | DSDR | 720
Health Behavior in School-Aged Children (HBSC), 2005-2006 | SAMHDA | 555
National Survey on Drug Use and Health, 2010 | SAMHDA | 541
General Social Survey, 1972-2010 [Cumulative File] | ICPSR | 524
American National Election Study, 2008: Pre- and Post-Election Survey | ICPSR | 480
Collaborative Psychiatric Epidemiology Surveys (CPES), 2001-2003 [United States] | CPES | 472
India Human Development Survey (IHDS), 2005 | DSDR | 453
Historical, Demographic, Economic, and Social Data: The United States, 1790-2002 | ICPSR | 359


Top 10 Series Data Downloads (January through July 2013)

Title | Archive | Number of downloads
National Survey on Drug Use and Health (NSDUH) Series | SAMHDA | 4,012
National Longitudinal Study of Adolescent Health (Add Health), Restricted Data Series | DSDR | 3,701
Uniform Crime Reporting Program Data Series | NACJD | 2,034
Midlife Development in the United States (MIDUS) Series | NACDA | 1,762
ABC News/Washington Post Poll Series | ICPSR | 1,744
National Crime Victimization Survey (NCVS) Series | NACJD | 1,603
American National Election Study (ANES) Series | ICPSR | 1,563
Chinese Household Income Project Series | DSDR | 1,500
Current Population Survey Series | ICPSR | 1,382
National Health Interview Series | NACDA | 1,316


Who uses these shared data?

How are they used?

With what impact?


The ICPSR Bibliography of Data-related Literature

Link research data to scholarly literature about it

• Increase likelihood of discovery and reuse
• Aid students, instructors, researchers, and funders


It’s really a searchable database . . .

. . . containing over 65,000 citations of known published and unpublished works resulting from analyses of data archived at ICPSR

. . . that resides in Oracle, with an internal UI for database management

. . . that can generate study bibliographies linking each study with the literature about it, and out to the full text


It’s useful to all stakeholders

• Instructors direct students to begin data-related research projects by reading some of the major works based on the data
• Advanced researchers also use it to conduct a focused literature review before deciding to use a dataset
• Reporters and policymakers looking for processed statistics look for reports explaining studies
• Principal investigators and funding agencies want to track how data are used after they are deposited


But challenging to provide


Provide PIs and data users with citations (since 1990) and DOIs (since 2008) for all study-level data


Explicit citation, in the references, with the DOI

doi:10.3886/ICPSR21240

“The use of DOI names for the citing of data sets would make their provenance trackable and citable and therefore allow interoperability with existing reference services like Thomson Reuters “Web of Science . . .”

From: http://www.codata.org/taskgroups/TGdatacitation/index.html
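
As an illustration of what makes a DOI trackable, here is a minimal Python sketch that turns the DOI shown on this slide into a resolvable link. It assumes the public resolver at https://doi.org/; the function name doi_to_url and the validation pattern are illustrative choices, not part of any ICPSR tool.

```python
import re

# Public DOI resolver (assumption: links are built against doi.org).
DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Turn 'doi:10.3886/ICPSR21240' (or a bare DOI) into a resolvable URL."""
    cleaned = doi.strip()
    # Drop a leading 'doi:' label or an existing resolver prefix.
    cleaned = re.sub(
        r"^(doi:\s*|https?://(dx\.)?doi\.org/)", "", cleaned, flags=re.IGNORECASE
    )
    # Loose shape check: '10.', a registrant code, '/', then a suffix.
    if not re.match(r"^10\.\d{4,9}/\S+$", cleaned):
        raise ValueError(f"not a DOI: {doi!r}")
    return DOI_RESOLVER + cleaned


print(doi_to_url("doi:10.3886/ICPSR21240"))
# prints https://doi.org/10.3886/ICPSR21240
```

Because the identifier is unique and has a predictable shape, the same string works for a human reader, a reference manager, and an indexing script.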


The state of data citation in the social science literature


Data “Sighting” (implicit)

vs.

Data Citing (explicit)


Typical “sightings”

• Sample described, not named, no author information, no access information, only a publication cited

• Data named in text, with some attribution, but no access information

• Cited in reference section, but with no permanent, unique identifier, so difficult for indexing scripts to find to automate tracking


Challenges in database search infrastructure

• Journal databases fielded for journal article discovery are not ideal for finding data “sightation”
• No field searching on methods sections
• Full-text search brings back too many bad hits
• Limiting to abstract misses too many good hits


Challenges in tracking many studies

• Tension between highly curating a manageable collection and minimally maintaining a broad collection

• Too many publications for efficient collection by humans, so we must make it easy for scripts to do it reliably


Challenges of completeness

• Data use that is too difficult or costly to find cannot be counted

• The result is a selective sample, which makes it difficult to draw accurate conclusions in broad analyses of reuse


Challenges in lack of data management planning

• Publishing sequence prevents citation creation before publication
• Potential for change by educating the PI/mentor; graduate directors; liaison librarians
• Consciousness raising starting to occur due to funders’ requirements


Poorly described and cited data
+ Excessive human search effort
= Too costly, too questionable for confident measure of impact


Citing data with a DOI
+ Minimal human search effort
= High hit accuracy for the cost, and better confidence of impact measures
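
To make the contrast with implicit “sightings” concrete, the following Python sketch shows how little a script needs once a DOI appears in the text. It assumes ICPSR DOIs follow the pattern of the example doi:10.3886/ICPSR21240; the function name find_icpsr_dois and both sample strings are hypothetical, written only for illustration.

```python
import re
from typing import List

# Assumption: ICPSR study DOIs look like 10.3886/ICPSR<number>,
# following the example doi:10.3886/ICPSR21240.
ICPSR_DOI = re.compile(r"10\.3886/ICPSR\d+", re.IGNORECASE)


def find_icpsr_dois(text: str) -> List[str]:
    """Return distinct ICPSR DOIs found in text, in order of first appearance."""
    found: List[str] = []
    for match in ICPSR_DOI.findall(text):
        doi = match.upper()  # normalize case so duplicates collapse
        if doi not in found:
            found.append(doi)
    return found


# Hypothetical reference entry that cites the data explicitly.
explicit = "Example Author. Example Study Title. doi:10.3886/ICPSR21240"
# Hypothetical methods sentence that only describes the sample.
implicit = "We analyzed a nationally representative sample of adolescents."

print(find_icpsr_dois(explicit))  # the DOI is found
print(find_icpsr_dois(implicit))  # nothing to match: an empty list
```

The explicit citation yields an exact hit with no human review, while the implicit description gives a script nothing to match on.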


Building a culture of viable data citation to improve measures of impact


From: CODATA Data Citation Standards and Practices Task Group. 2012. Task Group Data Citation and Attribution Bibliography. http://www.codata.org/taskgroups/TGdatacitation/docs/CODATA_DDCTG_BestPracticesBib_FINAL_17June2012.pdf


http://www.datacite.org/


http://odin-project.eu/

The tool enables users to search the DataCite Metadata Store for their works, and subsequently to add (or claim) those research outputs – including datasets, software, and other types – to their ORCID profile. This should increase the visibility of these research outputs, and will make it easier to use these data citations in applications that connect to the ORCID Registry – ImpactStory is one of several services already doing this.

http://odin-project.eu/2013/05/13/new-orcid-integrated-data-citation-tool/
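
For readers who want to try a metadata search themselves, here is a hedged Python sketch that builds a query URL for DataCite's public REST API. This is not how the ODIN tool is implemented; the endpoint https://api.datacite.org/dois and its query and page[size] parameters are assumptions about the present-day DataCite service, and the helper name datacite_search_url is hypothetical. The sketch only constructs the URL and makes no network request.

```python
from urllib.parse import urlencode

# Assumption: DataCite's public REST API exposes DOI metadata search here.
DATACITE_DOIS = "https://api.datacite.org/dois"


def datacite_search_url(query: str, page_size: int = 25) -> str:
    """Build a search URL for DataCite DOI metadata (no request is sent)."""
    params = {"query": query, "page[size]": page_size}
    return f"{DATACITE_DOIS}?{urlencode(params)}"


print(datacite_search_url("ICPSR", page_size=5))
```

Fetching that URL with any HTTP client should return JSON metadata records, which a claiming tool could then present to an author.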


Finding data with simple search fields

Integration with Web of Knowledge All Databases: Research data is equal to research literature



Converting journal search infrastructure to meet the needs of data, but synching metadata still a work in progress.


• Articles linked to underlying data
• Increased data discovery
• Reward for data citation
• Potential for automated tracking


What audience does this have? Anecdotally, no large group of adopters yet.



http://iassistdata.org/


http://www.codata.org/taskgroups/TGdatacitation/index.html

“CODATA, the Committee on Data for Science and Technology, is an interdisciplinary Scientific Committee of the International Council for Science (ICSU), was established 40 years ago. CODATA works to improve the quality, reliability, management and accessibility of data of importance to all fields of science and technology.”

From: http://www.codata.org/about/who.html


“The move to encourage wider access to the results of publicly-funded research will have limited impact without the associated tools, networks and standards that are needed for sharing and mining of data. The Research Data Alliance aims to provide them.”

https://rd-alliance.org/


Data-PASS partners work to change publishing practice


Altmetrics are an attempt to augment or replace the inadequate ways we now use to determine relevant and significant sources of knowledge:

1. peer review
2. citation counting
3. journal impact factors

Altmetrics.org/manifesto


ImpactStory

• Example user http://impactstory.org/CarlBoettiger

• Needs: more aggregator and repository data exposed for harvesting metrics


ICPSR will create APIs for others to query for usage statistics.


Other altmetrics resources

ASIS&T Bulletin Special Section: Altmetrics: What, Why and Where? April/May 2013. http://www.asis.org/Bulletin/Apr-13/

Piwowar, Heather. “Data Citation and Altmetrics Panel: Tools that Work Today to Reveal Dataset Use,” April 5, 2013. RDAP 13. Baltimore, MD. http://www.slideshare.net/asist_org/rdap13-piwowar-tools-that-work-today-to-reveal-dataset-use


Thank you.

Elizabeth [email protected]