RDAP13 Kathleen Fear: The impact of data reuse: a pilot study of 5 measures


Kathleen Fear, University of Michigan. “The impact of data reuse: a pilot study of 5 measures.” Research Data Access & Preservation Summit 2013, Baltimore, MD, April 4, 2013. #rdap13 Panel: Data citation and altmetrics


https://www.asis.org/rdap/


The impact of data reuse: a pilot study of five measures

Kathleen Fear

April 5, 2013

What is reuse impact?

• Scholarly contribution through producing data
  – Recognized and rewarded through publications and publication metrics

• Scholarly contribution through sharing data
  – Recognized and rewarded through ?

What can we do to measure and communicate the scholarly contribution a data producer makes when their data is reused?

Pilot study of 5 measures

• Identify a set of social science datasets

• Find out how much and in what contexts they have been reused

• Demonstrate a variety of measures
  – Do they all come out the same?
  – Or do different measures highlight different data?

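Sample set: 273 studies

[Chart: Author Type. Categories: Single author; Two or more authors; Government; Non-governmental institution; Media organization. Shares shown: 38%, 45%, 11%, 3%, 3%.]

[Chart: Processed vs. unprocessed. Categories: Processed; Unprocessed. Shares shown: 81%, 19%.]

[Chart: Release Date. Categories: 2000; 2001; 2002. Shares shown: 38%, 32%, 30%.]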

Reuse citations

• How many times has the data been reused?

• ICPSR Bibliography of Data-Related Literature
  – Excluded: publications by study authors and research team members, literature reviews, commentary
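The counting rule on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (`study_id`, `authors`, `type`), the type labels, and the sample records are all hypothetical and are not the ICPSR bibliography's actual schema or data.

```python
from collections import Counter

# Publication types left out of the reuse count, per the slide.
# The exact label strings are an assumption for this sketch.
EXCLUDED_TYPES = {"literature review", "commentary"}


def count_reuse(citations, study_team):
    """Count reuse publications per study ID.

    citations:  iterable of dicts with keys 'study_id', 'authors', 'type'
                (hypothetical record layout).
    study_team: dict mapping study_id -> set of names of study authors
                and research team members.
    """
    counts = Counter()
    for c in citations:
        if c["type"] in EXCLUDED_TYPES:
            continue  # literature reviews and commentary are not reuse
        team = study_team.get(c["study_id"], set())
        if team & set(c["authors"]):
            continue  # publication by the data producers themselves
        counts[c["study_id"]] += 1
    return counts


# Made-up records with made-up study IDs, for illustration only.
citations = [
    {"study_id": 1, "authors": ["A. Producer"], "type": "journal article"},
    {"study_id": 1, "authors": ["B. Reuser"], "type": "journal article"},
    {"study_id": 1, "authors": ["C. Reviewer"], "type": "literature review"},
    {"study_id": 2, "authors": ["B. Reuser", "D. Other"], "type": "report"},
]
study_team = {1: {"A. Producer"}, 2: {"E. Producer"}}
reuse_counts = count_reuse(citations, study_team)
```

Matching authors by exact name is the weakest part of this sketch; a real bibliography would likely need name disambiguation.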

Some data is reused a lot

Lots of data is reused a little

Even more is reused not at all

Study ID | Study Name | Reuse count
6693 | National Comorbidity Survey: Baseline (NCS-1), 1990-1992 | 175
2884 | National Treatment Improvement Evaluation Study (NTIES), 1992-1997 | 32
3160 | Project on Policing Neighborhoods in Indianapolis, Indiana, and St. Petersburg, Florida, 1996-1997 | 34
2851 | Hispanic Established Populations for the Epidemiologic Studies of the Elderly, 1993-1994: [Arizona, California, Colorado, New Mexico, and Texas] | 24
2258 | Drug Abuse Treatment Outcome Study (DATOS), 1991-1994: [United States] | 19

How high-quality are the data’s reuse publications?

Study ID | Study Name | Secondary Impact
6693 | National Comorbidity Survey: Baseline (NCS-1), 1990-1992 | 83
2258 | Drug Abuse Treatment Outcome Study (DATOS), 1991-1994: [United States] | 21
2851 | Hispanic Established Populations for the Epidemiologic Studies of the Elderly, 1993-1994: [Arizona, California, Colorado, New Mexico, and Texas] | 20
2884 | National Treatment Improvement Evaluation Study (NTIES), 1992-1997 | 19
2778 | Gambling Impact and Behavior Study, 1997-1999: [United States] | 18

How broadly or narrowly is the data reused?

Diversity

• Variety, balance, disparity among reuse publications + disparity between citing disciplines and data discipline

$$\delta = \sum_{i,j} p_i p_j d_{ij} + \sum_{k} p_k d_k$$
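A minimal Python sketch of a two-term diversity score of this shape follows. The slide does not define its symbols, so the reading used here is an assumption: `p` holds the share of reuse publications in each citing discipline, `d` holds the disparity between two citing disciplines, and `d_data` holds the disparity between a citing discipline and the discipline of the data. The discipline names and numbers are invented for illustration.

```python
def pair_disparity(d, i, j):
    """Disparity between disciplines i and j; zero for identical disciplines.

    d is keyed by frozenset({i, j}) so the lookup is symmetric
    (a storage choice made for this sketch).
    """
    if i == j:
        return 0.0
    return d[frozenset((i, j))]


def diversity(p, d, d_data):
    """Sum of p_i * p_j * d_ij over discipline pairs, plus sum of p_k * d_k.

    p:      dict discipline -> share of reuse publications (assumed meaning)
    d:      dict frozenset({i, j}) -> disparity between citing disciplines
    d_data: dict discipline -> disparity from the data's own discipline
    """
    among_citers = sum(
        p[i] * p[j] * pair_disparity(d, i, j) for i in p for j in p
    )
    to_data = sum(p[k] * d_data[k] for k in p)
    return among_citers + to_data


# Invented example: reuse split evenly between two disciplines,
# with the data originating in sociology.
p = {"sociology": 0.5, "criminology": 0.5}
d = {frozenset(("sociology", "criminology")): 0.4}
d_data = {"sociology": 0.0, "criminology": 0.4}
score = diversity(p, d, d_data)
```

The first sum runs over ordered pairs, so each unordered pair contributes twice; whether the original measure counts pairs once or twice is not recoverable from the slide.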

DataID | Study Title | Diversity
3190 | National Organizations Survey (NOS), 1996-1997 | 2.5000
3337 | Evaluation of the Gang Resistance Education and Training (GREAT) Program in the United States, 1995-1999 | 2.2230
2976 | Police Stress and Domestic Violence in Police Families in Baltimore, Maryland, 1997-1999 | 2.1794
3334 | Aging, Status, and Sense of Control (ASOC), 1995, 1998, 2001 [United States] | 2.0313
2993 | Reintegrative Shaming Experiments (RISE) in Australia, 1995-1999 | 2.0000

How large is the publication network stemming from the data?

Downloaders

• How many individuals download the data?

• Unique users identified by email address or IP address
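Counting unique downloaders per study might look like the sketch below. The slide says only that users are identified by email address or IP address; preferring the email and falling back to the IP when no email is recorded is an assumption, as are the log field names and the sample records.

```python
def count_downloaders(download_log):
    """Return a dict study_id -> number of unique downloaders.

    download_log: iterable of dicts with keys 'study_id', 'email', 'ip'
                  (hypothetical log layout).
    A user is keyed by email when present, otherwise by IP address
    (assumed precedence; the slide does not specify one).
    """
    users_by_study = {}
    for rec in download_log:
        user = rec.get("email") or rec.get("ip")
        if user is None:
            continue  # nothing to identify this download by
        users_by_study.setdefault(rec["study_id"], set()).add(user.lower())
    return {sid: len(users) for sid, users in users_by_study.items()}


# Made-up log entries using documentation-reserved addresses.
log = [
    {"study_id": 1, "email": "a@example.org", "ip": "192.0.2.1"},
    {"study_id": 1, "email": "A@example.org", "ip": "192.0.2.7"},  # same user
    {"study_id": 1, "email": None, "ip": "192.0.2.9"},
    {"study_id": 2, "email": None, "ip": "192.0.2.9"},
]
downloaders = count_downloaders(log)
```

Under this rule a person who downloads once while logged in and once anonymously is counted twice, so the figure is better read as an upper bound on distinct individuals.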

Study ID | Study Name | Downloaders
6693 | National Comorbidity Survey: Baseline (NCS-1), 1990-1992 | 3787
2790 | World Values Surveys and European Values Surveys, 1981-1984, 1990-1993, and 1995-1997 | 3393
2778 | Gambling Impact and Behavior Study, 1997-1999: [United States] | 2637
3088 | Alcohol and Drug Services Study (ADSS), 1996-1999: [United States] | 2478
3355 | Recidivism of Prisoners Released in 1994 | 2209

Comparing metrics

Reuse Citations | Sec. Impact | Diversity | Downloaders
6693 | 6693 | 3190 | 6693
3160 | 2258 | 3337 | 2778
2884 | 2851 | 2976 | 3088
2851 | 2884 | 3334 | 2833
2258 | 2778 | 3052 | 3334
2976 | 3385 | 2993 | 3337
2833 | 3160 | 3323 | 2258
2778 | 3337 | 2778 | 2976
3337 | 2833 | 3163 | 3212
3385 | 3023 | 3002 | 3002
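One way to make the comparison concrete is to count how many study IDs each pair of top-ten lists shares. The sketch below uses the study IDs from the comparison table above; the overlap count itself is an illustrative choice and is not a measure used in the talk.

```python
from itertools import combinations

# Study IDs transcribed from the comparison table, in listed order.
TOP10 = {
    "Reuse citations": [6693, 3160, 2884, 2851, 2258, 2976, 2833, 2778, 3337, 3385],
    "Sec. impact": [6693, 2258, 2851, 2884, 2778, 3385, 3160, 3337, 2833, 3023],
    "Diversity": [3190, 3337, 2976, 3334, 3052, 2993, 3323, 2778, 3163, 3002],
    "Downloaders": [6693, 2778, 3088, 2833, 3334, 3337, 2258, 2976, 3212, 3002],
}


def pairwise_overlap(rankings):
    """Number of study IDs shared by each pair of lists."""
    return {
        (a, b): len(set(rankings[a]) & set(rankings[b]))
        for a, b in combinations(rankings, 2)
    }


def in_every_list(rankings):
    """Study IDs that appear in all of the lists."""
    return set.intersection(*(set(ids) for ids in rankings.values()))


overlaps = pairwise_overlap(TOP10)
common = in_every_list(TOP10)

for (a, b), n in overlaps.items():
    print(f"{a} vs {b}: {n} of 10 shared")
print("In all four lists:", sorted(common))
```

Overlap ignores position within each list; a rank correlation would capture ordering as well, but it needs ranks for the same set of studies under every measure, which the table does not provide.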

[Figure comparing the four measures: Reuse count, Secondary impact, Diversity, Downloaders]

Study ID | Study Name | Diversity | Reuse Count | Sec. Impact | Downloaders
3052 | Risk Factors for Violent Victimization of Women in a Major Northeastern City, 1990-1991 and 1996-1997 | 5 | 18 | 26 | 27
3355 | Recidivism of Prisoners Released in 1994 | 29 | 18 | 18 | 4
3450 | Pennsylvania Sentencing Data, 1998 | 30 | 14 | 6 | 29


Thank you!

Questions?

Kathleen Fear
kfear@umich.edu