Data Management Crash Course


description

This presentation is a crash course on practical data management. It is actually a portion of this talk (http://www.slideshare.net/kbriney/responsible-conduct-of-research-data-management) on data management and management plans, but I think the slides are useful enough to stand on their own.

Transcript of Data Management Crash Course

Page 1: Data Management Crash Course

Do You Still Have Your Data?

• What if your hard drive crashes?
• What if you are accused of fraud?
• What if your collaborator abruptly quits?
• What if the building burns down?
• What if you need to use your old data?
• What if your backup fails?
• What if your computer gets stolen?
• What if…

Page 2: Data Management Crash Course

Why Data Management?

• Don’t lose data
• Find data more easily
  – Especially if you need older data
• Easier to analyze organized, documented data
• Avoid accusations of fraud & misconduct
• Get credit for your data
• Don’t drown in irrelevant data

Page 3: Data Management Crash Course

For each minute of planning at the beginning of a project, you will save 10 minutes of headache later.

Page 4: Data Management Crash Course

What Are Data?

http://www.flickr.com/photos/dia-a-dia/7046151669/ (CC BY-NC-SA)

Page 5: Data Management Crash Course

What Are Data?

• “Research data is defined as the recorded factual material commonly accepted in the scientific community as necessary to validate research findings”
  – OMB Circular A-110

http://www.whitehouse.gov/omb/circulars_a110

Page 6: Data Management Crash Course

What Are Data?

• Observational
  – Sensor data, telemetry, survey data, sample data, images
• Experimental
  – Gene sequences, chromatograms, toroid magnetic field data
• Simulation
  – Climate models, economic models
• Derived or compiled
  – Text and data mining, compiled database, 3D models, data gathered from public documents

Page 7: Data Management Crash Course

A Crash Course in PRACTICAL DATA MANAGEMENT

Page 8: Data Management Crash Course

Storage and Backups

http://www.flickr.com/photos/9246159@N06/599820538/ (CC BY-ND)

Page 9: Data Management Crash Course

Storage and Backups

• Library motto: Lots of Copies Keeps Stuff Safe!
• Rule of 3: 2 onsite, 1 offsite
• Any backup is better than none
• Automatic backup is better than manual
• Your research is only as safe as your backup plan
  – Periodically test restore from backup!
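One way to act on "periodically test restore from backup" is to restore the backup into a scratch folder and compare checksums against the working copy. The following is a minimal sketch, not part of the original slides; the function names and the idea of comparing two folders are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_to_backup(original_dir: Path, restored_dir: Path) -> list[str]:
    """List files that are missing from, or differ in, the restored copy.

    Hypothetical helper: both arguments are folders you choose yourself,
    for example your data folder and a test restore of its backup.
    """
    problems = []
    for source in sorted(original_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(original_dir)
        restored = restored_dir / relative
        if not restored.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif file_sha256(source) != file_sha256(restored):
            problems.append(f"differs: {relative.as_posix()}")
    return problems
```

An empty list means every file in the working folder has an identical copy in the restored folder. The check only runs in one direction, so extra files in the backup are not reported.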

Page 10: Data Management Crash Course

Example

• I keep my data
  – On my computer
  – Backed up manually on shared drive
    • I set a weekly reminder to do this
  – Backed up automatically via SpiderOak cloud storage
• A note on cloud storage…

Page 11: Data Management Crash Course

Consistency

http://www.flickr.com/photos/mactucket/361798299/ (CC-BY-ND)

Page 12: Data Management Crash Course

Consistency

• Consistent file naming
  – Make it easier to find files
  – Avoid many duplicates
  – Make it easier to wrap up a project
• Names descriptive but short (<25 characters)
• Avoid " / \ : * ? ' < > [ ] & $ and spaces
• Date convention: YYYY-MM-DD
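The naming rules above are simple enough to check automatically. The sketch below is an illustration only, with hypothetical function names. It assumes the 25-character limit applies to the name without its extension, which is a reading of the slide rather than something it states.

```python
import datetime
import os

# Characters the slide says to avoid, plus the space character.
FORBIDDEN = set('"/\\:*?\'<>[]&$ ')
MAX_STEM_LENGTH = 24  # "fewer than 25 characters", extension excluded (assumption)


def check_filename(name: str) -> list[str]:
    """Return a list of convention problems; an empty list means the name passes."""
    problems = []
    stem, _extension = os.path.splitext(name)
    if len(stem) > MAX_STEM_LENGTH:
        problems.append(f"name is longer than {MAX_STEM_LENGTH} characters")
    bad = sorted(set(name) & FORBIDDEN)
    if bad:
        problems.append("forbidden characters: " + " ".join(repr(c) for c in bad))
    return problems


def dated_filename(day: datetime.date, description: str, extension: str) -> str:
    """Build a name that starts with the date in YYYY-MM-DD form."""
    return f"{day.isoformat()}_{description}.{extension}"
```

Putting the date first in YYYY-MM-DD form means an alphabetical sort of the folder is also a chronological sort.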

Page 13: Data Management Crash Course

Examples

• DataManagement_v6.pptx
• 20090923_spctrm_trans_03.csv
• SLAposter_FINAL.ai
• BlogPost-2011-11-12.docx

• Find a system that works for you

Page 14: Data Management Crash Course

Consistency

• Consistent documentation
  – Record all necessary information
  – Keep information in one place
  – Easier to search and use later
• Take 5 minutes before starting a project
• Create a list of information to record
  – Don’t forget to record the units!

Page 15: Data Management Crash Course

Example

• For my experiment, I need to collect:
  – Date
  – Experiment
  – Scan number
  – Powers
  – Wavelengths
  – Concentration (or sample weight)
  – Calibration factors, like timing and beam size
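A checklist like this can be kept honest by logging each run to a single file that rejects incomplete entries. The sketch below is one possible approach, not the presenter's method. The column names, and the units embedded in them (mW, nm, mM), are hypothetical and should be replaced with whatever your experiment actually uses.

```python
import csv
from pathlib import Path

# Hypothetical columns based on the example list; units live in the names
# so they are recorded with every value.
FIELDS = [
    "date",
    "experiment",
    "scan_number",
    "power_mW",
    "wavelength_nm",
    "concentration_mM",
    "calibration_notes",
]


def append_record(log_path: Path, record: dict) -> None:
    """Append one run to the log, refusing entries with missing fields."""
    missing = [name for name in FIELDS if record.get(name) in (None, "")]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    is_new = not log_path.exists()
    with log_path.open("a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()  # write the column names once
        writer.writerow(record)
```

Keeping every run in one CSV file follows the "keep information in one place" advice and makes the log easy to search later.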

Page 16: Data Management Crash Course

Recording Your Conventions

http://www.flickr.com/photos/jjpacres/3293117576/ (CC BY-NC-ND)

Page 17: Data Management Crash Course

Recording Your Conventions

• What if someone needs to find your data?
• You will eventually hand off your data to your PI
• Record your naming conventions
• Record your documentation schemes
• Record overall project information
  – Contact info, grant #, project summary, etc.

Page 18: Data Management Crash Course

Examples

• Print out near computer/experiment area
  – Document conventions
• In front of research/lab notebook
  – Page 1: Project information
  – Page 2: Conventions and abbreviations
  – Page 3-X: Index of experiments
• README.txt in data folder
  – Top-level folder: project information
  – Lower-level folder: what’s in this folder?
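For the README.txt option, a small script can stamp out the same layout in every data folder so that no folder is left undocumented. This is a minimal sketch with a hypothetical function name and layout; the headings you pass in are up to you.

```python
from pathlib import Path


def write_readme(folder: Path, title: str, sections: dict[str, str]) -> Path:
    """Create README.txt in a folder from a title and heading/text pairs.

    Hypothetical helper. It refuses to overwrite an existing README so that
    hand-written notes are never lost.
    """
    lines = [title, "=" * len(title), ""]
    for heading, text in sections.items():
        lines.append(f"{heading}: {text}")
    readme = folder / "README.txt"
    # Mode "x" raises FileExistsError if the file is already there.
    with readme.open("x", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")
    return readme
```

For a top-level folder the sections might be contact info, grant number, and project summary; for a lower-level folder, a single entry describing what the folder contains.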

Page 19: Data Management Crash Course

Planning for the Future

http://www.flickr.com/photos/bonedaddy/2791636546/ (CC BY-SA)

Page 20: Data Management Crash Course

Planning for the Future

• Get help for sensitive data!
  – HIPAA, FERPA, FISMA, IRB, etc.
• UWM Information Security Office
  – Visit: www.uwm.edu/itsecurity/
• Policy pages
  – www.uwm.edu/legal/hipaa/index.cfm
  – www.uwm.edu/academics/ferpa.cfm

Page 21: Data Management Crash Course

Planning for the Future

• We can’t open files from 10 years ago
• Proprietary file types
  – Convert to open file format
    • .doc → .txt
    • .xls → .csv
    • .jpg → .tif
  – Preserve software if no open file format
• Periodically move data to new media
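Before converting anything, it helps to know which files in a project folder need attention. The sketch below only lists candidates and the target format suggested on the slide; it does not perform any conversion. The function name is hypothetical, and the mapping is limited to the three pairs the slide names.

```python
from pathlib import Path

# Source extension -> suggested open format, taken from the slide's examples.
OPEN_ALTERNATIVES = {".doc": ".txt", ".xls": ".csv", ".jpg": ".tif"}


def conversion_candidates(folder: Path) -> list[tuple[str, str]]:
    """Return (relative path, suggested extension) for files worth converting."""
    candidates = []
    for path in sorted(folder.rglob("*")):
        target = OPEN_ALTERNATIVES.get(path.suffix.lower())
        if path.is_file() and target:
            candidates.append((path.relative_to(folder).as_posix(), target))
    return candidates
```

Running this at the end of a project gives a short to-do list of files to export in an open format alongside the originals.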

Page 22: Data Management Crash Course

Goal: Don’t Stress Over Data

http://www.flickr.com/photos/72775875@N06/7729764370/ (CC BY-NC-SA)

Page 23: Data Management Crash Course

More Information

• Data Services
  – www.uwm.edu/libraries/dataservices/
• Data Management Plans
  – dataplan.uwm.edu
• Kristin Briney, Data Services Librarian
  – Contact me!

Page 24: Data Management Crash Course

Thank You

• The content of this presentation is licensed under a Creative Commons Attribution 3.0 Unported License (CC BY)
  – Image licenses as marked