What is DATA MANAGEMENT? · Data Management Data Management: Refers to all aspects of collecting,...

Post on 22-Jul-2020


Presented by Andrade Zafiro and Chavira Laura

Grupo de transparencia en la dirección de Economía Instituto de Salud Pública,

Cuernavaca, México

06 September 2017

Data Management Plan

What is DATA MANAGEMENT?

To begin …

• Imagine that you wrote and published an article for a project you’d worked on for two and a half years. Your article was cited often and your results are now well known. Three years after publication an investigator accuses you of having falsified information in the article.

– Do you think you could verify all of the work?

– What would you need to prove the information was not falsified?

– What things should you have done during the investigation to enable you to prove the information described in the article now, three years later?

Data Management

Data Management: Refers to all aspects of collecting, saving, sharing, converting, backing up and storing data.

REQUIRES

– Organizational skills
  • Individuals
  • In teams

– Understanding study methodology
  • Different phases of a study
  • Design, implementation, analysis, publication.

– Understanding ethical implications
  • Before, during and after completing a study.

Data Management

BENEFITS OF DATA MANAGEMENT

What is a Data Management Plan (DMP)?

• A checklist, template or document that provides a detailed description of the management of information in your study.

– What kind of information will you collect?

– How are you going to manage ethical issues, for example confidentiality?

– What information will be saved and where?

– Who will be responsible for handling the data?

– What are the restrictions around data sharing? (e.g., audio or text files with names, etc.)

ONLINE TOOLS

• We used the following tools to support the development of today’s presentation.

– Checklist for a DMP:
  • http://www.dcc.ac.uk/sites/default/files/documents/resource/DMP_Checklist_2013.pdf

– DMP Assistant (Portage Network by the University of Alberta)
  • https://assistant.portagenetwork.ca/?locale=en

– DMPonline (developed by the Digital Curation Centre (DCC) and the University of California Curation Center (UC3))
  • https://dmponline.dcc.ac.uk/

– DMPTOOL (developed by the University of California Curation Center (UC3))
  • https://dmptool.org/

KEY CONCEPTS IN DATA MANAGEMENT

Information Security and Backups

• Consider cost (e.g., USB data sticks, hard disks, computers)
• Who will be in charge?
• Options for saving your information:
  – Server: recommended, one secure location
  – Personal computers and laptops: should not be used as the only way to save your master files, risk of being lost or stolen.
  – External storage devices (e.g., hard disks, USBs, CDs, DVDs): not recommended long term, risk of being lost or stolen.
  – If you do use this option it is important to use high quality hardware and ensure information is encrypted.

Data Encryption

• Passwords
  – Do not use your name
  – Do not write your password on a post-it
  – Establish a security procedure to recover passwords.
  – Have at least one other person who knows the password.

• University of Edinburgh: http://www.ed.ac.uk/infosec/how-to-protect/lock-your-devices/passwords/choosing-strong-passwords

Data Backups

• The most important component of data management
• Have at least 3 copies of your information in two different media (e.g., laptop + hard disk, desktop computer + laptop)
• Keep the copies in different locations
• Regularly check your storage devices to ensure they are still functioning properly.
• Create a data backup strategy
  – Who?
  – Where?
  – What? (All the information? Only some?)
  – How frequently?
  – How much space do you need?
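The copy-count rule above can be checked mechanically once a backup strategy is written down. The following is a minimal sketch, not part of the original presentation: the `BackupCopy` record and the `check_backup_plan` helper are hypothetical names, and "different locations" is interpreted here as at least two distinct locations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    """One copy of the study data (hypothetical record for illustration)."""
    medium: str    # e.g. "laptop", "external_disk", "server"
    location: str  # e.g. "home", "PI office", "institute"


def check_backup_plan(copies):
    """Return a list of problems; an empty list means the plan passes."""
    problems = []
    if len(copies) < 3:
        problems.append("fewer than 3 copies")
    if len({c.medium for c in copies}) < 2:
        problems.append("fewer than 2 different media")
    # Assumption: "different locations" means at least two distinct places.
    if len({c.location for c in copies}) < 2:
        problems.append("all copies are in the same location")
    return problems


plan = [
    BackupCopy("laptop", "home"),
    BackupCopy("external_disk", "PI office"),
    BackupCopy("server", "institute"),
]
weak_plan = [BackupCopy("laptop", "home"), BackupCopy("laptop", "home")]

print(check_backup_plan(plan))       # no problems reported
print(check_backup_plan(weak_plan))  # all three rules are violated
```

A check like this only covers the counting rules; who is responsible, how often backups run, and how much space is needed still have to be decided by the team.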

Organizing the Information

• Important because it allows you to identify and use information in an efficient and effective way

• “Investing a little time in the beginning saves a lot of time at the end”

• 3 key points

– File structure

– File names

– Versions

Naming Files

• Two key points
  – Be consistent
  – Be descriptive

• Consistent: Develop a strategy and ensure rules are followed systematically
  – YYMMDD is not the same as YYYYMMDD or DDMMYYYY

• Descriptive: You and your team members should be able to rapidly identify the files you are looking for.
  – Is it a master file? Are they results of the study? Is it version 1, 2 or 3?

Advice for developing a strategy for naming files.

• Take into account the type of information
  – PDF
  – Qualitative information
    • Audio, photographs, transcripts, files for analysis in MAXQDA, Atlas.ti, etc.
  – Quantitative information
    • Databases, do files, etc.

• Basic considerations
  – Names should be short (25-32 characters) and signify something important.
  – Do not use special characters (#$%/&)
  – Don’t use spaces, always _ or all the words together
  – If you include numbers, always put two digits, not just one (01, 02, 03, 04)
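The basic considerations above lend themselves to an automated check before files are shared with the team. Below is a minimal sketch under stated assumptions: `check_file_name` is a hypothetical helper, the limit is set to 32 characters (the upper end of the guideline), and letters, digits, `_`, `-` and `.` are treated as the only allowed characters, which is stricter than the slide's list of examples.

```python
import re

MAX_LENGTH = 32  # assumption: upper end of the 25-32 character guideline


def check_file_name(stem):
    """Return a list of rule violations for a file name without its extension."""
    problems = []
    if len(stem) > MAX_LENGTH:
        problems.append("longer than 32 characters")
    if " " in stem:
        problems.append("contains spaces")
    # Assumption: anything other than letters, digits, _ - . counts as special.
    # The space is excluded here because it is reported separately above.
    if re.search(r"[^A-Za-z0-9_\-. ]", stem):
        problems.append("contains special characters")
    # A digit with no digit on either side is a single-digit number.
    if re.search(r"(?<!\d)\d(?!\d)", stem):
        problems.append("contains a single-digit number")
    return problems


print(check_file_name("interview_01_V02"))    # passes all checks
print(check_file_name("interview 1 #final"))  # spaces, special character, single digit
```

Returning a list of problems rather than a single true/false value makes it easier to tell a colleague exactly which rule a file name breaks.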


PDF Example

• First last name of the first author

• Year of publication

• Journal

• Short description

Eaton_2015_AmJPH_StigmaMistrustInHealCarEngagmtMSM
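The four-part pattern for PDFs (first author's last name, year, journal, short description) can be assembled by a small function so that every team member produces the same result. This is an illustrative sketch only: `pdf_file_name` is a hypothetical helper, and appending the `.pdf` extension is an assumption not shown on the slide.

```python
import re


def pdf_file_name(first_author_last_name, year, journal_abbrev, short_description):
    """Join the four parts with underscores, dropping spaces and punctuation inside each part."""
    parts = [first_author_last_name, str(year), journal_abbrev, short_description]
    cleaned = [re.sub(r"[^A-Za-z0-9]", "", part) for part in parts]
    # Assumption: the file keeps a .pdf extension after the descriptive name.
    return "_".join(cleaned) + ".pdf"


print(pdf_file_name("Eaton", 2015, "AmJPH", "StigmaMistrustInHealCarEngagmtMSM"))
```

Note that the name in the slide's example is longer than the 25-32 characters suggested earlier, so a team may want to agree on shorter descriptions.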

EXAMPLES OF STRATEGIES AND FILE NAMES

VERSIONS

Major changes

– Use numbers and letters: V01, V02

– Only numbers: 01, 02

• Which versions should stay?
  – V01, … V07, … V20

Minor changes

– Decimals: V01.01

– People’s initials: za_lb_lc_ja

Never name a file “final” because this will happen:

Final.dta

NuevoFinal.dta (“new final”)

EstesieselFinal.dta (“this one really is the final”)

Elfinalquesiusamosparaelarticulo.dta (“the final one we actually used for the article”)
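The version scheme above (two-digit major versions, decimal minor versions, optional initials) can be generated rather than typed by hand, which avoids ad hoc names such as “final”. A minimal sketch, assuming hypothetical helpers `version_label` and `versioned_name`, a made-up file stem `survey`, and Stata's `.dta` extension as the default:

```python
def version_label(major, minor=None):
    """Format a version as V01 (major change) or V01.01 (minor change)."""
    label = f"V{major:02d}"
    if minor is not None:
        label += f".{minor:02d}"
    return label


def versioned_name(stem, major, minor=None, initials=(), extension=".dta"):
    """Build a file name such as survey_V02.01_za_lb.dta."""
    parts = [stem, version_label(major, minor)]
    if initials:
        # Initials of the people who worked on this version, joined by underscores.
        parts.append("_".join(initials))
    return "_".join(parts) + extension


print(versioned_name("survey", 2))
print(versioned_name("survey", 2, 1, initials=("za", "lb")))
```

Because the version number is always zero-padded to two digits, the files sort in the correct order in a file browser.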

DMP TEMPLATE WITH EXAMPLES

ADMINISTRATIVE INFORMATION

Administrative Information

Name of project and ID

– If the project has different names in different IRBs, include all of them.
– Include the name given for proposals, subsidies, etc.
– Include the short name used among team members.

Brief description of the project

Funding body: CONACYT

ADMINISTRATIVE INFORMATION

Administrative information

Principal Investigator (name, ID, and e-mail): Sergio Bautista-Arredondo, sbautista@insp.mx / skype: sergio_a_bautista

Research Coordinator: Zafiro Andrade-Romo, zafiro.andrade@insp.mx, zafiroandrade@hotmail.com / skype: zafiroandrade

Administrative Coordinator: Tania Aramburo, tania.aramburo@insp.mx / skype: tania.aramburo

DMP Responsible: Zafiro Andrade-Romo, zafiro.andrade@insp.mx, zafiroandrade@hotmail.com / skype: zafiroandrade

DATA COLLECTION

Data Collection

Type of data collected, created, registered, linked

File format of the data to be registered

DATA COLLECTION

Data Collection

Forms and procedures to be used for the structure, naming and version control of the files

STORAGE AND BACKUP

Storage and Backup

Expected storage requirements (storage space in MB, GB, TB, etc., and the time it will be stored)

We need at least 5 GB of space.

Backup Plan (storage location, who will be in charge of the backup, how often do you plan on backing up)

1. We will keep a copy of all the study files in the folder in Dropbox and on a hard drive.

2. The hard drive will be kept in the office of the principal investigator.

3. Zafiro Andrade will be responsible for the backup, which will be carried out at least once every three months.
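A scheduled backup is only useful if the copy still matches the master files. One way to check this at each backup, sketched here under assumptions, is to compare SHA-256 checksums of every file in the master folder against the backup folder. The function names `sha256_of` and `compare_backup` and the folder layout are hypothetical; the demonstration uses temporary folders rather than the project's real Dropbox folder or hard drive.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_backup(master_dir, backup_dir):
    """Return (relative path, problem) pairs for files missing from or differing in the backup."""
    master_dir, backup_dir = Path(master_dir), Path(backup_dir)
    problems = []
    for source in sorted(master_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(master_dir)
        copy = backup_dir / relative
        if not copy.is_file():
            problems.append((str(relative), "missing"))
        elif sha256_of(copy) != sha256_of(source):
            problems.append((str(relative), "differs"))
    return problems


# Demonstration with throwaway folders (stand-ins for the real storage locations).
with tempfile.TemporaryDirectory() as tmp:
    master, backup = Path(tmp) / "master", Path(tmp) / "backup"
    master.mkdir()
    backup.mkdir()
    (master / "survey_V01.dta").write_text("data")
    print(compare_backup(master, backup))  # the file has not been backed up yet
```

This check only reports files that are in the master folder; files that exist only in the backup are ignored.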

DATA SHARING

Data sharing and anonymization procedures

Sharing conditions (among the members of the development team)

What data will be shared and in what form? (Example: Processed, parsed, final)

File format of the data that will be analyzed

DATA SHARING

Responsibilities and resources

Person responsible for DMP application: Zafiro Andrade

Ethical and legal compliance

Person who has data use privileges: Sergio Bautista-Arredondo

People who are allowed access to the files

– Audio files: Zafiro Andrade, Laura Chavira, Luis Barraza
– Transcriptions and files for qualitative analysis (MAXQDA): people who will be completing the analysis and the principal investigator.
– Word, Excel, PPT: all of the team
– Stata files and other files for quantitative analysis: people who will be completing the analysis and the principal investigator.

The principal investigator should be consulted before sharing any files with people outside of the team.

Any person outside of the team who is given access to the files should sign a confidentiality form.

DATA SHARING

Ethics and legal compliance

Cases in which the data will be made public

When we publish the data in PLOS ONE, all the data will become public.

The PI should be consulted before making any data public.

Bibliography

• Anne Thompson. (2007, Sep 8, 2015). Naming Conventions. The University of Edinburgh. Retrieved from http://www.ed.ac.uk/records-management/records-management/staff-guidance/electronic-records/naming-conventions

• Canadian Association of Research Libraries. (n.d.). Portage Network. Retrieved from https://portagenetwork.ca/

• Carleton University. (n.d.). Research Data Management. Retrieved from https://library.carleton.ca/services/research-data-management

• Christine Malinowski. (2016). Data Management: File Organization [PowerPoint presentation]. MIT Libraries. Retrieved from https://libraries.mit.edu/data-management/files/2014/05/FileOrg_20160121.pdf

• DCC. (2013). Checklist for a Data Management Plan, v4.0. Digital Curation Centre. Retrieved from http://www.dcc.ac.uk/sites/default/files/documents/resource/DMP/DMP_Checklist_2013.pdf

• EDINA and Data Library, University of Edinburgh. (2017, 28 April 2017). Research Data MANTRA [online course]. Retrieved from http://datalib.edina.ac.uk/mantra/

• Stanford University. (n.d.). Best practices for file naming. Stanford University. Retrieved from https://library.stanford.edu/research/data-management-services/data-best-practices/best-practices-file-naming

• The University of Edinburgh. (2016, Dec 6, 2016). Benefits of writing a DMP. Retrieved from http://www.ed.ac.uk/information-services/research-support/research-data-service/planning-your-data/benefits-of-writing-a-dmp

• University of Cambridge. (2007). Organising your data. University of Cambridge. Retrieved from https://www.data.cam.ac.uk/data-management-guide/organising-your-data

THANK YOU