Presented by Andrade Zafiro and Chavira Laura
Transparency Group, Directorate of Economics, Instituto de Salud Pública,
Cuernavaca, México
6 September 2017
Data Management Plan
What is DATA MANAGEMENT?
To begin …
• Imagine that you wrote and published an article for a project you’d worked on for two and a half years. Your article was cited often and your results are now well known. Three years after publication, an investigator accuses you of having falsified information in the article.
– Do you think you could verify all of the work?
– What would you need to prove the information was not falsified?
– What things should you have done during the investigation to enable you to prove the information described in the article now, three years later?
Data Management
Data Management: refers to all aspects of collecting, saving, sharing, converting, backing up and storing data.
REQUIRES
– Organizational skills
• Individuals
• In teams
– Understanding study methodology
• Different phases of a study
• Design, implementation, analysis, publication
– Understanding ethical implications
• Before, during and after completing a study
Data Management
BENEFITS OF DATA MANAGEMENT
What is a Data Management Plan (DMP)?
• A checklist, template or document that provides a detailed description of the management of information in your study.
– What kind of information will you collect?
– How are you going to manage ethical issues, for example confidentiality?
– What information will be saved and where?
– Who will be responsible for handling the data?
– What are the restrictions around data sharing? (e.g., audio or text files with names)
ONLINE TOOLS
• We used the following tools to support the development of today’s presentation.
– Checklist for a DMP:
• http://www.dcc.ac.uk/sites/default/files/documents/resource/DMP_Checklist_2013.pdf
– DMP Assistant (Portage Network by the University of Alberta)
• https://assistant.portagenetwork.ca/?locale=en
– DMPonline (developed by the Digital Curation Centre (DCC) and the University of California Curation Center (UC3))
• https://dmponline.dcc.ac.uk/
– DMPTOOL (developed by the University of California Curation Center (UC3))
• https://dmptool.org/
KEY CONCEPTS IN DATA MANAGEMENT
Information Security and Backups
• Consider cost (e.g., USB data sticks, hard disks, computers)
• Who will be in charge?
• Options for saving your information:
– Server: recommended, one secure location
– Personal computers and laptops: should not be used as the only way to save your master files; risk of being lost or stolen.
– External storage devices (e.g., hard disks, USBs, CDs, DVDs): not recommended long term; risk of being lost or stolen.
– If you do use this option, it is important to use high-quality hardware and ensure the information is encrypted.
Data Encryption
• Passwords
– Do not use your name
– Do not write the password on a post-it
– Establish a security procedure to recover passwords.
– Have at least one other person who knows the password.
• University of Edinburgh: http://www.ed.ac.uk/infosec/how-to-protect/lock-your-devices/passwords/choosing-strong-passwords
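As one illustration of avoiding a guessable password such as your name, a random password can be generated instead. The following is a minimal sketch using Python's standard `secrets` module; the function name `generate_password`, the default length of 16 and the minimum of 12 are assumptions made for this example, not recommendations taken from the presentation.

```python
import secrets
import string


def generate_password(length: int = 16) -> str:
    """Return a random password drawn from letters, digits and punctuation."""
    if length < 12:
        # Assumed lower bound for this sketch; adjust to your institution's policy.
        raise ValueError("use at least 12 characters")
    alphabet = string.ascii_letters + string.digits + string.punctuation
    # secrets.choice draws from a cryptographically secure random source.
    return "".join(secrets.choice(alphabet) for _ in range(length))


if __name__ == "__main__":
    print(generate_password())
```

A generated password is harder to remember than a name, which is one more reason to set up the recovery procedure listed above.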
Data Backups
• The most important component of data management
• Have at least 3 copies of your information on two different media (e.g., laptop + hard disk, desktop computer + laptop)
• Keep the copies in different locations
• Regularly check your storage devices to ensure they are still functioning properly.
• Create a data backup strategy
– Who?
– Where?
– What? (All the information? Only some?)
– How frequently?
– How much space do you need?
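Checking that a backup copy is still readable and identical to the master files can be partly automated. The sketch below compares SHA-256 checksums between a master folder and one backup folder. It assumes the backup mirrors the master's folder layout; the function names `sha256_of` and `verify_backup` are hypothetical and chosen only for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large data files do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(master: Path, backup: Path) -> list[str]:
    """Return relative paths that are missing from, or differ in, the backup."""
    problems = []
    for source in sorted(master.rglob("*")):
        if not source.is_file():
            continue  # skip directories
        relative = source.relative_to(master)
        copy = backup / relative
        if not copy.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif sha256_of(copy) != sha256_of(source):
            problems.append(f"differs: {relative.as_posix()}")
    return problems
```

Running `verify_backup` once per backup location, at the frequency set in your backup strategy, gives a simple written record of which copies are still intact.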
Organizing the Information
• Important because it allows you to identify and use information in an efficient and effective way
• "Investing a little time in the beginning saves a lot of time at the end"
• 3 key points
– File structure
– File names
– Versions
Naming Files
• Two key points
– Be consistent
– Be descriptive
• Consistent: Develop a strategy and ensure rules are followed systematically
– YYMMDD is not the same as YYYYMMDD or DDMMYYYY
• Descriptive: You and your team members should be able to rapidly identify the files you are looking for.
– Is it a master file? Are they results of the study? Is it version 1, 2 or 3?
Advice for developing a strategy for naming files.
• Take into account the type of information
– PDF
– Qualitative information
• Audio, photographs, transcripts, files for analysis in MAXQDA, Atlas.ti, etc.
– Quantitative information
• Databases, do files, etc.
• Basic considerations
– Names should be short (25-32 characters) and signify something important.
– Do not use special characters (#$%/&)
– Don’t use spaces; always use _ or run all the words together
– If you include numbers, always use two digits, not just one (01, 02, 03, 04)
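The basic considerations above can be turned into an automatic check. This is a minimal sketch, not a definitive rule set: the function name `check_file_name` is hypothetical, the 32-character limit is assumed to apply to the name without its extension, and only unaccented letters, digits, `_`, `-` and `.` are treated as non-special characters.

```python
import re
from pathlib import Path

# Upper end of the 25-32 character guideline (assumed to exclude the extension).
MAX_STEM_LENGTH = 32


def check_file_name(file_name: str) -> list[str]:
    """Return a list of rule violations; an empty list means the name passes."""
    stem = Path(file_name).stem  # name without the final extension
    problems = []
    if len(stem) > MAX_STEM_LENGTH:
        problems.append("too long")
    if " " in stem:
        problems.append("contains spaces")
    # Anything other than letters, digits, _ . - (spaces are reported separately).
    if re.search(r"[^A-Za-z0-9_.\- ]", stem):
        problems.append("contains special characters")
    # A digit with no digit on either side, e.g. "1" instead of "01".
    if re.search(r"(?<!\d)\d(?!\d)", stem):
        problems.append("single-digit number")
    return problems
```

For example, `check_file_name("interview_01_V02.docx")` returns an empty list, while `check_file_name("interview 1 #draft.docx")` reports the spaces, the special character and the single-digit number.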
PDF Example
• First last name of the first author
• Year of publication
• Journal
• Short description
Eaton_2015_AmJPH_StigmaMistrustInHealCarEngagmtMSM
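The four-part PDF naming pattern can be sketched as a small helper that joins the parts with underscores. The function name `pdf_file_name` and the added `.pdf` extension are assumptions for illustration; the helper also strips every character that is not an unaccented letter or digit, so accented names would lose characters.

```python
import re


def pdf_file_name(first_author: str, year: int, journal: str, description: str) -> str:
    """Build Author_Year_Journal_Description.pdf without spaces or special characters."""

    def clean(text: str) -> str:
        # Keep only unaccented letters and digits.
        return re.sub(r"[^A-Za-z0-9]", "", text)

    parts = [clean(first_author), str(year), clean(journal), clean(description)]
    return "_".join(parts) + ".pdf"


if __name__ == "__main__":
    print(pdf_file_name("Eaton", 2015, "AmJPH", "Stigma Mistrust In HealCar Engagmt MSM"))
```

With the inputs shown, the helper reproduces the example name from the slide followed by `.pdf`.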
EXAMPLES OF STRATEGIES AND FILE NAMES
VERSIONS
Major changes
– Use numbers and letters: V01, V02
– Or only numbers: 01, 02
• Which versions should stay?
– V01, … V07, … V20
Minor changes
– Decimals: V01.01
– People’s initials: za_lb_lc_ja
Never name a file “final”, because this will happen:
Final.dta
NuevoFinal.dta (“NewFinal”)
EstesieselFinal.dta (“ThisOneReallyIsTheFinal”)
Elfinalquesiusamosparaelarticulo.dta (“TheFinalWeActuallyUsedForTheArticle”)
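One way to avoid the “final” trap is to compute the next version tag rather than invent a new name. The sketch below follows the V01 (major) and V01.01 (minor) scheme from the slide; the function name `next_version`, the `_` separator before the tag, and the two-digit limit are assumptions made for this example.

```python
import re

# Matches a trailing _VNN or _VNN.NN tag (two digits each, as on the slide).
VERSION_PATTERN = re.compile(r"_V(\d{2})(?:\.(\d{2}))?$")


def next_version(stem: str, major: bool = False) -> str:
    """Return the file stem with its version tag advanced by one step."""
    match = VERSION_PATTERN.search(stem)
    if match is None:
        return f"{stem}_V01"  # no tag yet: start at the first major version
    base = stem[: match.start()]
    major_number = int(match.group(1))
    minor_number = int(match.group(2) or 0)
    if major:
        # A major change resets the minor part.
        return f"{base}_V{major_number + 1:02d}"
    return f"{base}_V{major_number:02d}.{minor_number + 1:02d}"
```

For example, `next_version("survey_V01")` gives `survey_V01.01`, and `next_version("survey_V01.02", major=True)` gives `survey_V02`.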
DMP TEMPLATE WITH EXAMPLES
ADMINISTRATIVE INFORMATION
Name of project and ID
– If the project has different names in different IRBs, include all of them.
– Include the name given for proposals, subsidies, etc.
– Include the short name used among team members.
Brief description of the project
Funding body: CONACYT
Principal Investigator (name, ID, and e-mail): Sergio Bautista-Arredondo
[email protected] / skype: sergio_a_bautista
Research Coordinator: Zafiro Andrade-Romo
[email protected] / skype: zafiroandrade
Administrative Coordinator: Tania Aramburo
[email protected] / skype: tania.aramburo
DMP Responsible: Zafiro Andrade-Romo
[email protected] / skype: zafiroandrade
DATA COLLECTION
Type of data collected, created, registered, linked
File format of the data to be registered
Forms and procedures to be used for file structure, naming and version control
STORAGE AND BACKUP
Expected storage requirements (storage space in MB, GB, TB, etc., and the time it will be stored)
We need at least 5 GB of space.
Backup plan (storage location, who will be in charge of the backup, how often you plan to back up)
1. We will keep a copy of all the study files in a Dropbox folder and on a hard drive.
2. The hard drive will be kept in the office of the principal investigator.
3. Zafiro Andrade will be responsible for the backup, which will be carried out at least once every three months.
DATA SHARING
Data sharing and anonymization procedures
Sharing conditions (among the members of the development team)
What data will be shared and in what form? (Example: processed, parsed, final)
File format of the data that will be analyzed
Responsibilities and resources
Person responsible for DMP application: Zafiro Andrade
Ethical and legal compliance
Person who has data use privileges: Sergio Bautista-Arredondo
People who are allowed access to the files:
– Audio files: Zafiro Andrade, Laura Chavira, Luis Barraza
– Transcriptions and MAXQDA files for qualitative analysis: the people who will be completing the analysis and the principal investigator
– Word, Excel, PPT: all of the team
– STATA files and other files for quantitative analysis: the people who will be completing the analysis and the principal investigator
The principal investigator should be consulted before sharing any files with people outside of the team.
Any person outside of the team who is given access to the files should sign a confidentiality form.
Ethics and legal compliance
Cases in which the data will be made public
When we publish the data in PLOS ONE, all the data will become public.
The PI should be consulted before making any data public.
Bibliography
• Anne Thompson. (2007, Sep 8, 2015). Naming Conventions. The University of Edinburgh. Retrieved from http://www.ed.ac.uk/records-management/records-management/staff-guidance/electronic-records/naming-conventions
• Canadian Association of Research Libraries. (n.d.). Portage Network. Retrieved from https://portagenetwork.ca/
• Carleton University. (n.d.). Research Data Management. Retrieved from https://library.carleton.ca/services/research-data-management
• Christine Malinowski. (2016). Data Management: File Organization Power Point Presentation. MIT Libraries. Retrieved from https://libraries.mit.edu/data-management/files/2014/05/FileOrg_20160121.pdf
• DCC. (2013). Checklist for a Data Management Plan, v4.0. Digital Curation Centre. Retrieved from http://www.dcc.ac.uk/sites/default/files/documents/resource/DMP/DMP_Checklist_2013.pdf
• EDINA and Data Library, & University of Edinburgh. (2017, 28 April 2017). Research Data MANTRA [online course]. Retrieved from http://datalib.edina.ac.uk/mantra/
• Stanford University. (n.d.). Best practices for file naming. Stanford University. Retrieved from https://library.stanford.edu/research/data-management-services/data-best-practices/best-practices-file-naming
• The University of Edinburgh. (2016, Dec 6, 2016). Benefits of writing a DMP. Retrieved from http://www.ed.ac.uk/information-services/research-support/research-data-service/planning-your-data/benefits-of-writing-a-dmp
• University of Cambridge. (2007). Organising your data. University of Cambridge. Retrieved from https://www.data.cam.ac.uk/data-management-guide/organising-your-data
THANK YOU