RDM PRACTICES AT THE CSIR: results of two surveys

Louise Patterton, February 2016



PRESENTATION OUTLINE

• Survey background

• Survey 1: experienced researchers

• Survey 2: emerging researchers

• Survey comparisons

• Main RDM findings

• Conclusion and the way forward


Survey background

New position at the CSIR (data librarian)

RDM is a relatively new concept

Background knowledge had to be obtained: what are researchers doing?

The CSIR survey was a natural progression from there

Most institutions conduct RDM surveys

Also known as: data audit, needs assessment, state of the art, scoping studies/interviews


Survey 1: RDM practices of experienced researchers

• Survey was conducted and completed in 2014

• Target population: Research Group Leaders (RGLs)

• Population: 100 RGLs

• 36 RGLs took part in survey

• 9 out of 10 research units involved

• One-on-one interviews, 23 open-ended questions

• Based on the questionnaire used at Oxford University in 2008 by Luis Martinez-Uribe, also used at UP in 2011

• Additional questions added

• Recorded interviews transcribed and analysed


Survey 2: RDM practices of emerging researchers

• Emerging researcher: full-time CSIR employee, 35 years or younger, has a PhD or is busy with a PhD

• Population: 179 emerging researchers

• 48 completed the online survey; eSurv was the survey tool

• Study conducted in 2015

• Online questionnaire; 31 questions; multitude of question formats
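The response rates implied by the two surveys follow directly from the figures on the slides (36 of 100 RGLs; 48 of 179 emerging researchers). A minimal Python sketch of that arithmetic is shown here; the function name and the dictionary labels are illustrative and are not part of the original presentation.

```python
def response_rate(respondents: int, population: int) -> float:
    """Return the response rate as a percentage, rounded to one decimal place."""
    if population <= 0:
        raise ValueError("population must be positive")
    return round(100 * respondents / population, 1)


# (respondents, population) as reported on the two survey slides.
surveys = {
    "Survey 1 (RGLs, 2014)": (36, 100),
    "Survey 2 (emerging researchers, 2015)": (48, 179),
}

for name, (respondents, population) in surveys.items():
    print(f"{name}: {response_rate(respondents, population)}%")
```

This works out to 36% for Survey 1 and roughly 26.8% for Survey 2, which is worth keeping in mind when percentages from the two groups are compared.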


Survey comparisons (not RDM findings)

DIFFERENCES:

• RGLs vs emerging researchers

• 2014 vs 2015

• Unit involvement

• One-on-one interview vs online survey

• Anonymity differences

• Open-ended questions vs multitude of question formats

• 23 questions vs 31 questions

• RDM practices investigated


Survey comparisons (not RDM findings)

SIMILARITIES:

• CSIR employees

• Units, disciplines

• RDM question similarities (format, backup, metadata)

• Both surveys: questionnaire

• Most questions had option of open-ended elaboration

• Anonymity when published

• 2014 vs 2015


RDM findings: RDM awareness

• This question was only included in the RGL survey

• Most common answer: ‘have heard of it, not applying it’


RDM findings: familiarity with concept

Responses varied…

– No

– Yes…sort of….not really

– Yes, I have heard of it, but I do not really know what it is

– Please explain to me what is a data management plan? I could have done one without knowing about it.

– I understand the words, but I don’t…..no!

– I am vaguely familiar with data management plans

– RDM is not required in this field

– We are probably not as familiar as we should be, but yes we have definitely heard of it


RDM findings: familiarity with concept

– We most definitely develop a research plan that contains elements of data management…there is never a, not that I can recall, a separate data management plan, as such.

– Yes, it is a specific requirement of a Water Research Commission Project. In fact, in the final report that you produce for Water Research Commission project, they request a data management chapter at the end of that report.


RDM findings: data management training

• Most researchers had not received any RDM training


RDM findings: development of a DMP

Yes: 11%

No: 76%

I don't know: 13%

• This question was only included in the emerging researcher survey

• The majority had not heard of a DMP


RDM findings: research data formats

Most popular formats used:

RGLs: spreadsheets, images, text (Word/PDF), video

Emerging: text, spreadsheets, images

19 vs 15 formats

Wide range of formats: Google Maps, audio, GIS…


RDM findings: research data volume

• Direct comparison not possible

• Typical dataset size vs current research project

• The biggest dataset size category was the least frequently used

• Subset of researchers not aware of volumes created


RDM findings: software tools used

• RGLs: Microsoft Office, Matlab

• Emerging researchers: Microsoft Excel, Microsoft Word, and Matlab

• 31 different software tools indicated in the 2015 survey


RDM findings: data storage (RGLs)

Data storage media     Prevalence
PC/laptop              61%
I-drive                47%
External hard drive    28%
Lab computers          11%
Server                 11%
EB*                    8%
Project server         8%


RDM findings: data storage (emerging)

[Bar chart. x-axis: storage location; y-axis: storage location prevalence (scale 0 to 50)]


RDM findings: data backups (RGLs)

• 89% of RGLs do data backups

• 11% admit to not backing up data


RDM findings: data backups (emerging)

• All emerging researcher data is backed up

• Most frequent response: …ad hoc!


RDM findings: data backup locations (RGLs)

• Most common: CSIR drive, server, external hard-drive


RDM findings: data backup locations (emerging)

[Bar chart. x-axis: backup location (external hard drive, CSIR drive, cloud (e.g. Dropbox), USB device, server managed by CSIR ICT, NAS, server in unit, CD/DVD, don't know, CHPC, university backup server, n/a); y-axis: backup location prevalence (scale 0 to 40)]

• Most common: external hard-drive, CSIR drive, cloud, USB

• Cloud usage more prominent with 2015 survey respondents



RDM findings: creating metadata (RGLs)


RDM findings: creating metadata (emerging)


RDM findings: metadata practices summary

• No metadata added by 33% of RGLs and 33% of emerging researchers

• 17% of RGLs always make use of a metadata standard

• Only 2% of emerging researchers always make use of a metadata standard


RDM findings: data sharing

• Data most often shared with researchers who helped create data, research group, or supervisors

• Low levels of sharing with funders, journal publishers, the public or others in the same discipline

• 54% had not shared any data during the last 5 years

• Only 4% had shared more than 10 times

• Data sharing methods: email, USB stick, FTP, web portal for download

• Extremely low usage of curated digital data repository

• Secondary data requests: most had requested such data

• Secondary data requests: most common answer was 2-5 times


RDM findings: data recovery


RDM findings: data management training needs

1. Developing a DMP

2. Data storage

3. Data documentation, Copyright, Creating metadata


RDM findings: challenges and obstacles

IT-related: systems, internet speed, management of drives, lack of trust, storage space, shareable workspaces, remote access


RDM findings: challenges and obstacles

Financial: software, servers, licenses are costly

Software issues: open source compatibility, software/media expiry, outdated technology

RDM practices: lack of experience, integrity of research, data accessibility

Data security issues: data loss, encryption, data corruption, data theft

Data sharing/data confidentiality


Conclusion and the way forward

• Need for RDM training

• RDM policy and procedure to be put in place

• The way forward:

– RDM awareness session: April 2016

– RDM roadshow: meetings with research units

– RDM to form part of research process

– Dataset indexing: Oracle Workflow, TOdB, Researchspace

– Data storage (CSIR ICT, DIRISA, Researchspace)

– DMP tool (DIRISA)

– DOI (Wim Hugo)

– Awareness and marketing

– Training

– Information specialists to be involved

– …


Thank you