Data Management Plan - caregiversprommd-project.eu · 1.5 26-04-2016 Final Cristian Barrué (UPC)...
“This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 690211”
Deliverable Number: D7.3, version: 1.5
Data Management Plan
CAREGIVERSPRO-MMD PROJECT
Ref. Ares(2016)2070840 - 30/04/2016
D7.3 Data Management Plan
CAREGIVERSPRO-MMD
Document information
Project Number: 690211
Acronym: CAREGIVERSPRO-MMD
Full title: Self-management interventions and mutual assistance community services, helping patients with dementia and caregivers connect with others for evaluation, support and inspiration to improve the care experience
Project coordinator: Universitat Politècnica de Catalunya-BarcelonaTech, Prof. Ulises Cortés, [email protected]
Project URL: http://www.caregiversprommd-project.eu
Deliverable: Number D7.3, Title: Data Management Plan - first version
Work package: Number WP7, Title: Dissemination, Communication, Exploitation and Business Planning
Date of delivery: Contractual 01/05/2016, Actual 30/04/2016
Nature: Report [x], Demonstrator [ ], Other [ ]
Dissemination Level: Public [x], Consortium [ ]
Keywords:
Authors (Partner): Atia Cortés (UPC), Cristian Barrué (UPC), Ulises Cortés (UPC)
Responsible Author: Cristian Barrué, Email: [email protected]
Partner: UPC, Phone: +34934134011
Document Version History
Version | Date | Status | Author | Description
0.1 | 06-03-2016 | Draft | Atia Cortés (UPC) | Start of the document, TOC, review and first structure
1.0 | 15-03-2016 | Draft | Atia Cortés (UPC) | Contribution to different sections
1.1 | 27-03-2016 | Draft | Cristian Barrué (UPC) | Contribution to different sections
1.2 | 25-03-2016 | Draft | Cristian Barrué, Gabriel Verdejo (UPC) | Contribution to section 6
1.3 | 19-03-2016 | Draft | Kevin Paulson (Hull), Atia Cortés (UPC), Dimitrios Daskalakis (QPL), Anastasia Matonaki (QPL) | Review of the document
1.4 | 20-04-2016 | Draft | Ulises Cortés (UPC), Rafa de Bofarull (MDD) | Review of the document
1.5 | 26-04-2016 | Final | Cristian Barrué (UPC) | Integration of review inputs, final contributions
Executive summary
This is a live document that describes the different processes regarding data management, storage and exploitation that have to be agreed and adopted by every member of the CAREGIVERSPRO-MMD Consortium. Over the course of the project this document will be reviewed and updated. Additional information on the data structure or the methodology, a change in responsibility for a task or in the budget, may be included in future versions of the Data Management Plan.
List of Acronyms
Acronym | Title
CERIF | Common European Research Information Format
C-MMD | CAREGIVERSPRO-MMD
DMP | Data Management Plan
DoA | Description of Action
HONCode | Health On the Net Code
QA | Quality Assurance
QC | Quality Control
List of Tables
Table 1 Project Fact Sheet
Table 2 Personal Dataset
Table 3 Screening Dataset
Table 4 Treatment Dataset
Table 5 Intervention Dataset
Table 6 Dissemination Dataset
Table 7 Dataset Summary
Table of contents
1 INTRODUCTION
2 PROJECT INFORMATION
3 DATA, MATERIALS, RESOURCES COLLECTION INFORMATION
3.1 DESCRIPTION OF THE DATA
3.1.1 PERSONAL DATASET
3.1.2 SCREENING DATASET
3.1.3 TREATMENT DATASET
3.1.4 INTERVENTION DATASET
3.1.5 DISSEMINATION DATASET
3.1.6 DATASET SUMMARY
3.2 QUALITY ASSURANCE PROCESS
4 ETHICS, INTELLECTUAL PROPERTY, CITATION
4.1 ETHICS
4.2 INTELLECTUAL PROPERTY
4.3 CITATION
5 ACCESS AND USE OF INFORMATION
6 STORAGE AND BACKUP OF DATA
6.1 BEST PRACTICES FOR FILE FORMATS
6.1.1 PROPRIETARY VS OPEN FORMATS
6.1.2 GUIDELINES FOR CHOOSING FORMATS
6.1.3 SOME PREFERRED FILE FORMATS
7 ARCHIVING AND FUTURE PROOFING OF INFORMATION
8 RESOURCING OF DATA MANAGEMENT
8.1 ROLES IN DATA MANAGEMENT
8.2 FINANCIAL DATA MANAGEMENT PROCESS
9 REVIEW OF DATA MANAGEMENT PROCESS
10 STATEMENTS AND PERSONNEL DETAILS
10.1 STATEMENT OF AGREEMENT
1 Introduction
This document presents the first version of the Data Management Plan (DMP) for the CAREGIVERSPRO-MMD project. Projects funded under the Horizon 2020 Open Research Data Pilot are required to develop several versions of a Data Management Plan (DMP), in which they specify, among other things, what data will be kept for the longer term. In the case of CAREGIVERSPRO-MMD, which is not participating in the Open Research Data Pilot, the DMP is presented on a voluntary basis as a tool that can improve pilot preparation and result analysis. The Consortium will follow the guidelines described in the OpenAIRE platform [1] and the document “Guidelines on Data Management in Horizon 2020” [2].
A DMP describes the data management life cycle for all datasets to be collected, processed or generated by a research project. It must cover:
• the handling of research data during & after the project;
• what data will be collected, processed or generated;
• what methodology & standards will be applied;
• whether data will be shared/made open access & how;
• how data will be curated & preserved.
The Data Management Plan will be updated - if appropriate - during the project lifetime (in the form of deliverables D7.7 and D7.8). New versions of the DMP could also be created whenever significant changes arise in the project, such as:
• new datasets;
• changes in consortium policies;
• external factors.
2 Project Information
In this section we provide a brief fact sheet of the project details and associated data management requirements.
Table 1 Project Fact Sheet
Project Title: CAREGIVERSPRO-MMD
Project Duration: 36 months (01/01/16 - 31/12/18)
Partners:
• Universitat Politècnica de Catalunya (UPC, Spain)
• Mobile Dynamics (MDD, Spain)
• University of Hull (HUL, UK)
• Q-PLAN International LTD (QPL, Greece)
• Cooperativa Sociale COOSS Marche (COO, Italy)
• Fundació-Universitat del Bages (FUB, Spain)
• Centre Hospitalier Universitaire de Rouen (CHU, France)
• Center for Research and Technology Hellas (CERTH, Greece)
Brief Description: Self-management interventions and mutual assistance community services, helping patients with dementia and caregivers connect with others for evaluation, support and inspiration to improve the care experience
University Requirements for Data Management: UPC is responsible for storing the data in a safe environment, maintaining back-ups and processing the data generated.
Funding Body: European Commission (Horizon 2020 PHC-25-2105)
Grant Number: 690211
Budget: 4.087.198,75 €
Funding Body Requirements for Data Management: For Open Data projects, the ones specified in Guidelines on Data Management in Horizon 2020 [2].
[1] https://www.openaire.eu/opendatapilot-dmp
[2] https://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
3 Data, Materials, Resources Collection Information
The purpose of this section is to provide a full description of the data that will be generated and stored during this project. The information provided here might be adapted or updated in further versions of this document.
3.1 Description of the data
All data will be generated through the use of the CAREGIVERSPRO-MMD online platform by several categories of users, i.e. health professionals, caregivers and patients. Each category of user will have access to specified content and will be able to generate different types of information according to the permissions granted.
For each user of the platform, the different datasets described in this section may be generated. Additional datasets may be generated in the future. The data will be collected before and after the pilot phase of the project.
The platform will also provide means to assess and store data not directly produced by users, i.e. the interaction among users and the evolution of their activity in the social network, which will also be subject to further analysis.
3.1.1 Personal Dataset
Table 2 Personal Dataset
Dataset reference and name
C-MMD-Personal
Dataset description
This dataset contains all the personal data captured through the registration tools integrated in the C-MMD platform for the dyad (patient and caregiver) and the health professionals. The registration tool collects standard personal information, i.e. as described in the EU Data Protection Directive (95/46/EC) [3]:
"Personal data" shall mean any information relating to an identified or identifiable natural person ('Data Subject'); an identifiable person is one who can be identified, directly or indirectly, in particular by reference to an identification number or to one or more factors specific to his physical, physiological, mental, economic, cultural or social identity.
Therefore, the nature of the data corresponds to the values used to represent such concepts (e.g. text, integers). At this moment the registration tool has not been implemented in its final version; further details will be given in future versions.
Standards and metadata
Data will be stored each time a user (be it patient, caregiver or health professional) registers on the platform or modifies their profile. Although at this moment the registration tool and the profile management tool have not been defined yet, it is expected that data will be stored in a MySQL database, using a NoSQL database for complementary purposes. Records will also be related to (and identified with) other datasets and the date when the data was recorded.
[3] http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=CELEX:31995L0046:en:HTML
Metadata will include information about the profile creation time, range of possible values, etc. This metadata will be associated with each table and will follow the Common European Research Information Format (CERIF) metadata standard [4].
Data sharing
This dataset will not be shared outside of the Consortium boundaries for ethical and security reasons. Each dataset record belongs to the user and to the Consortium partner responsible for the user. Only the user, people authorised by him/her (e.g. caregiver) and authorised personnel of the Consortium partner responsible for the user can access the record. Data will be available to users and people authorised by them through the C-MMD platform. Authorised personnel of the pilot partner generating the data will be able to access aggregated data in periodic reports and will also be able to access raw data dumped from the database in CSV files or through a web service. Each access will be identifiable and traceable.
Dataset records will be shared, in anonymised form, among defined Consortium partners for research purposes in order to be used for the tasks of the project. Anonymisation is the standard procedure followed to preserve the confidentiality of participants.
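The plan does not yet specify how records will be anonymised. As a hedged illustration only, the Python sketch below removes direct identifiers and replaces the user id with a keyed hash, so that records of the same user can still be linked across datasets. Strictly speaking this is pseudonymisation, which gives a weaker guarantee than full anonymisation. The field names, the key and the function name are assumptions made for the sketch.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would be stored apart from the data.
SECRET_KEY = b"replace-with-a-project-secret"

# Hypothetical field names; the real registration schema is not yet defined.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def pseudonymise(record: dict, key: bytes = SECRET_KEY) -> dict:
    """Return a copy of record without direct identifiers.

    The user id is replaced by a keyed SHA-256 hash (HMAC), so the same
    user always maps to the same pseudonym without exposing the id.
    """
    user_id = str(record["user_id"]).encode("utf-8")
    token = hmac.new(key, user_id, hashlib.sha256).hexdigest()
    cleaned = {
        field: value
        for field, value in record.items()
        if field not in DIRECT_IDENTIFIERS and field != "user_id"
    }
    cleaned["pseudonym"] = token
    return cleaned
```

Because the hash is keyed, someone who obtains the shared records but not the key cannot recompute the mapping from user ids to pseudonyms.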
Each participant (e.g. patient, caregiver, doctor) will sign an informed consent form at the recruitment phase authorising access to all his/her data (raw, aggregated, anonymised). Users will agree to the anonymised and aggregated data being used for research and possibly commercial exploitation.
The data repository will be in the C-MMD host in the UPC premises (more details are given in section 6).
Archiving and preservation (including storage and backup)
See sections 6 and 7.
3.1.2 Screening Dataset
Table 3 Screening Dataset
Dataset reference and name
C-MMD-Screening
Dataset description
This dataset contains all the clinical and social data captured through the screening tools integrated in the C-MMD platform for the dyad (patient and caregiver). The screening tools implement standard evaluation scales for different conditions (physical, psychosocial, neurological, functional, etc.). Therefore, the nature of the data corresponds to the values used to evaluate such scales. At this moment the screening tool has not been implemented yet; further details will be given in future versions.
[4] http://www.eurocris.org/cerif/main-features-cerif
Standards and metadata
The data will be stored following the standard numeric scales defined by each screening tool each time that a user (be it patient, caregiver or health professional) uses one of the screening tools. Although at this moment the screening tool has not been defined, it is expected that data will be stored in a MySQL database, using a NoSQL database for complementary purposes. Records will also be related to (and identified with) the user to whom the recorded data belong and the date when the data was recorded.
Metadata will include information about the scale recorded, range of possible values, etc. This metadata will be associated with each table and will follow the Common European Research Information Format (CERIF) metadata standard [5].
Data sharing
This dataset will not be shared outside of the Consortium boundaries for ethical and security reasons. Each dataset record belongs to the user and to the Consortium partner responsible for the user. Only the user, people authorised by him/her (e.g. caregiver) and authorised personnel of the Consortium partner responsible for the user can access the record. Data will be available to users and people authorised by them through the C-MMD platform. Authorised personnel of the pilot partner generating the data will be able to access aggregated data in periodic reports and will also be able to access raw data dumped from the database in CSV files or through a web service. Each access will be identifiable and traceable.
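The CSV export and the requirement that each access be identifiable and traceable can be illustrated together. The following is a minimal Python sketch under stated assumptions: the logger name, the function name and the column names are hypothetical, and the actual export tooling of the platform has not been defined.

```python
import csv
import io
import logging
from datetime import datetime, timezone

# Hypothetical logger name; the plan only requires that accesses are traceable.
access_log = logging.getLogger("cmmd.access")


def export_csv(rows, fieldnames, requester_id, dataset="C-MMD-Screening"):
    """Dump rows (a list of dicts) to CSV text and log who requested it."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    # One log entry per export: dataset, requester, row count, UTC timestamp.
    access_log.info(
        "export dataset=%s requester=%s rows=%d at=%s",
        dataset,
        requester_id,
        len(rows),
        datetime.now(timezone.utc).isoformat(),
    )
    return buffer.getvalue()
```

Logging inside the export function, rather than in the caller, means no export path can skip the trace entry.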
Dataset records will be shared, in anonymised form, among the Consortium partners for research purposes in order to be used in the tasks of the project. Anonymisation is the standard procedure followed to preserve the confidentiality of participants.
Each participant will sign an informed consent form at the recruitment phase authorising access to all his/her data (raw, aggregated, anonymised). Users will agree to the anonymised and aggregated data being used for research and possibly commercial exploitation.
The data repository will be located in the C-MMD host in the UPC premises (more details in section 6).
Archiving and preservation (including storage and backup)
See sections 6 and 7.
[5] http://www.eurocris.org/cerif/main-features-cerif
3.1.3 Treatment Dataset
Table 4 Treatment Dataset
Dataset reference and name
C-MMD-Treatment
Dataset description
This dataset contains all the treatment information for each dyad. The treatment information will come from: (1) a specific toolset integrated in the platform for that purpose, and (2) the API that connects with national healthcare systems, where possible. The nature of the data corresponds to medication descriptions, doses, schedules and follow-up of the adherence. At this moment the data-capturing tool has not been implemented; further details will be given in future versions.
Standards and metadata
The data will be stored following the numeric/text standards each time that a user (be it patient, caregiver or health professional) uses the treatment management interface to introduce or modify information about the pharmacological treatment being followed and the adherence regime to the treatment. Although at this moment the treatment management tool has not been defined, it is expected that data will be stored in a MySQL database, using a NoSQL database for complementary purposes. Records will also be related to (and identified with) the user to whom the recorded data belong and the date when the data was recorded.
Metadata will include information about the data recorded, range of possible values, etc. This metadata will be associated with each table and will follow the Common European Research Information Format (CERIF) metadata standard [6].
Data sharing
This dataset will not be shared outside of the Consortium boundaries for ethical and security reasons. Each dataset record belongs to the user and to the Consortium partner responsible for the user. Only the user, people authorised by him/her (e.g. the caregiver) and authorised personnel of the Consortium partner responsible for the user can access the record. Data will be available to users and people authorised by them through the C-MMD platform. Authorised personnel of the pilot partner generating the data will be able to access aggregated data in periodic reports and will also be able to access raw data dumped from the database in CSV files or through a web service. Each access will be identifiable and traceable.
Dataset records will be shared, in anonymised form, among the Consortium partners for research purposes in order to carry out the tasks of WP6. Anonymisation is the standard procedure followed to preserve the confidentiality of participants.
All described accesses to data (raw, aggregated, anonymised) will be authorised through an informed consent form signed by the participant at the recruitment phase. Users will agree to the anonymised and aggregated data being used for research and possibly commercial exploitation.
The data repository will be located in the C-MMD host in the UPC premises (more details in section 6).
Archiving and preservation (including storage and backup)
See sections 6 and 7.
[6] http://www.eurocris.org/cerif/main-features-cerif
3.1.4 Intervention Dataset
Table 5 Intervention Dataset
Dataset reference and name
C-MMD-Intervention
Dataset description
This dataset contains all the intervention contents created by the consortium members during the lifetime of the project. These intervention contents include posts, articles, tips, multimedia, tutorials, webinars and any kind of educational content produced to support the caregiving process and the healthy ageing lifestyle. These intervention contents will be introduced in the platform through specific tools designed for that purpose (e.g. the ones available in WordPress to edit blog posts). Standards for the storage of multimedia and text posts will be followed. At this moment the editor tools have not been implemented; further details will be given in future versions.
Standards and metadata
The data will be stored in standard text/media formats, following best practices for data management (see section 6). Although at this moment the editing tool has not been defined, it is expected that data will be stored in a MySQL database, using a NoSQL database for complementary purposes. Records will also be related to (and identified with) the user authoring the contents and the date when the data was recorded.
As explained in section 5.1 of the DoA and later in this document in section 4, all contents created will follow the HONCode.
Metadata will include information about the intervention recorded and a list of tags or keywords that relate the content to the specific symptoms, conditions or problems that the content refers to (e.g. a video about Alzheimer could have the tags Alzheimer, dementia, cognitive decline, etc.). This metadata will be associated with each table and will follow the Common European Research Information Format (CERIF) metadata standard [7].
[7] http://www.eurocris.org/cerif/main-features-cerif
Data sharing
Each dataset record belongs to the Consortium partner responsible for creating it. All the Consortium and suitable users [8] are authorised to access the recorded contents. Data will be available to users and people authorised by them through the C-MMD platform. Aggregated data about the amount of contents generated and specific metadata (e.g. tags) will be available, as well as access to raw data dumped from the database in files, to selected Consortium members.
Dataset records, particularly aggregated data, will be shared among the Consortium partners for research purposes in order to be used in the tasks of the project.
Users will agree to the anonymised and aggregated data being used for research and possibly commercial exploitation.
The data repository will be located in the C-MMD host in the UPC premises (more details in section 6).
Archiving and preservation (including storage and backup)
See sections 6 and 7.
3.1.5 Dissemination Dataset
Table 6 Dissemination Dataset
Dataset reference and name
C-MMD-Dissemination
Dataset description
This dataset contains all the dissemination contents created by the consortium members during the lifetime of the project. These dissemination contents include scientific papers, newsletters, multimedia, press articles, conferences and any kind of dissemination content produced to support the communication activities of the project and the dissemination of results. These contents, created from different sources, will be stored in a database/file system.
[8] In the case of patients or caregivers, contents should be available depending on their specific needs
Standards and metadata
The data will be stored in standard text/media formats, following best practices for data management (see section 6). Records will also be related to (and identified with) the user authoring the contents and the date when the data was recorded.
Metadata will include information about the dissemination data recorded, the target audience, identifier (e.g. DOI, URI), authors, title of the publication, time of publication, related event (e.g. conference, forum, etc.) and a list of tags or keywords that relate the content to specific topics or results. This metadata will be associated with each table and will follow the Common European Research Information Format (CERIF) metadata standard [9].
Data sharing
Each dataset record belongs to the Consortium partner(s) responsible for creating it. These contents are open for access.
The data repository will be located in the C-MMD host in the UPC premises (more details in section 6).
Archiving and preservation (including storage and backup)
See sections 6 and 7.
[9] http://www.eurocris.org/cerif/main-features-cerif
3.1.6 Dataset Summary
Table 7 Dataset Summary
Dataset | Who | Ownership | Access
Personal Dataset | User | Yes | Yes, full
Personal Dataset | Partner (recruiting) | Yes | Yes, full to authorised personnel
Personal Dataset | Rest of Consortium | No | Yes, only anonymised and aggregated data
Personal Dataset | World | No | No
Screening Dataset | User | Yes | Yes, full
Screening Dataset | Partner (recruiting) | Yes | Yes, full to authorised personnel
Screening Dataset | Rest of Consortium | No | Yes, only anonymised and aggregated data
Screening Dataset | World | No | No
Treatment Dataset | User | Yes | Yes, full
Treatment Dataset | Partner (recruiting) | Yes | Yes, full to authorised personnel
Treatment Dataset | Rest of Consortium | No | Yes, only anonymised and aggregated data
Treatment Dataset | World | No | No
Intervention Dataset | User | No | Yes, depending on their needs
Intervention Dataset | Partner (authoring) | Yes | Yes, full to authorised personnel
Intervention Dataset | Rest of Consortium | No | Yes, full to authorised personnel
Intervention Dataset | World | No | Limited and depending on project needs and exploitation policies
Dissemination Dataset | User | No | Yes
Dissemination Dataset | Partner (authoring) | Yes | Yes
Dissemination Dataset | Rest of Consortium | No | Yes
Dissemination Dataset | World | No | Yes
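The summary in Table 7 can also be expressed as data, which makes the access rules easy to check programmatically. The Python sketch below is only an illustration: the short level labels ("full", "anonymised", "limited", "none") and the function names are assumptions introduced here, not terms defined by the plan.

```python
# Table 7 transcribed as data: dataset -> actor -> (owns, access level).
# The level labels are shorthand chosen for this sketch.
RESTRICTED = {
    "User": (True, "full"),
    "Partner": (True, "full"),  # full to authorised personnel
    "Rest of Consortium": (False, "anonymised"),
    "World": (False, "none"),
}

TABLE_7 = {
    "Personal": RESTRICTED,
    "Screening": RESTRICTED,
    "Treatment": RESTRICTED,
    "Intervention": {
        "User": (False, "limited"),  # depending on their needs
        "Partner": (True, "full"),
        "Rest of Consortium": (False, "full"),
        "World": (False, "limited"),
    },
    "Dissemination": {
        "User": (False, "full"),
        "Partner": (True, "full"),
        "Rest of Consortium": (False, "full"),
        "World": (False, "full"),
    },
}


def access_level(dataset: str, actor: str) -> str:
    """Return the access level that Table 7 grants to actor for dataset."""
    return TABLE_7[dataset][actor][1]


def may_read_raw(dataset: str, actor: str) -> bool:
    """True only when the actor has full access to the dataset."""
    return access_level(dataset, actor) == "full"
```

For example, `may_read_raw("Screening", "Rest of Consortium")` is false because the rest of the Consortium only sees anonymised and aggregated data.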
3.2 Quality Assurance Process
Every data gathering process is susceptible to contamination in the absence of adequate preventive measures. Data contamination results from a process or phenomenon, other than the one of interest, which can affect the variable values. Data contamination results in erroneous values in the dataset. In general, there are two types of errors that can occur in a dataset. Firstly, errors of commission, which are the result of incorrect or inaccurate data being included in the dataset. This may happen because of a malfunctioning instrument that produces faulty results, data that are mistyped during entry, or other problems.
Errors of omission are the second type of errors. These result from data or metadata being omitted. Situations that result in omission errors occur when data are inadequately documented, when there are human errors during data collection or entry, or when there are anomalies in the field that affect the data.
Quality assurance/quality control (QA/QC) activities should be an integral part of any inventory development process, as they improve transparency, consistency, comparability, completeness and accuracy.
Quality control (QC) is defined as a system of checks to assess and maintain the quality of the data inventory being compiled. Quality control procedures are designed to provide routine technical checks to measure and control the data consistency, integrity, correctness and completeness, and to identify and address errors and omissions. Quality control checks should cover everything from data acquisition and handling to the application of approved procedures and methods and the documentation. Examples of general quality control checks include:
• checking for transcription errors in data input;
• checking that scale measures are within the range of acceptable values;
• checking that proper conversion factors are used.
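The first two checks listed above lend themselves to automation. The sketch below is a minimal Python illustration, not the QC protocol of the project (which is to be defined in future versions); the scale name, its range and the record layout are hypothetical.

```python
def check_scale_values(records, scale_ranges):
    """Return (index, scale, value) for every suspicious record.

    A record is flagged when its value is not numeric (a likely
    transcription error) or falls outside the allowed range of its scale.
    """
    problems = []
    for index, record in enumerate(records):
        low, high = scale_ranges[record["scale"]]
        value = record["value"]
        if not isinstance(value, (int, float)) or not low <= value <= high:
            problems.append((index, record["scale"], value))
    return problems
```

A usage example with a hypothetical scale `scale_a` that accepts values from 0 to 30:

```python
ranges = {"scale_a": (0, 30)}
records = [
    {"scale": "scale_a", "value": 12},    # valid
    {"scale": "scale_a", "value": 45},    # out of range
    {"scale": "scale_a", "value": "l2"},  # mistyped, not a number
]
flagged = check_scale_values(records, ranges)
```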
In future versions of this document we will provide more details on the QC protocols to be adopted during the project lifetime.
Quality assurance (QA) is a planned system of review procedures conducted outside the actual inventory compilation by personnel not directly involved in the inventory development process. It is a non-biased, independent review of methods and/or data summaries that ensures that the inventory continues to incorporate correctly the scientific knowledge and data generated. Quality assurance procedures may include expert peer reviews of data summaries and audits to assess the quality of the inventory and to identify where improvements could be made. If deemed necessary, selected members of the Advisory Board may perform this task in the course of the project life cycle.
4 Ethics, Intellectual Property, Citation
4.1 Ethics
The lack of standardisation of ethical principles at international level may potentially lead to the abuse of data collection, use and storage by exploiting differences between societies with regard to established ethical standards. The ethics of data collection, use and storage in medical applications is of growing importance since the quality and quantity of medical data usage is growing quickly both in Europe and worldwide. Great concerns are raised about data protection and privacy issues in the area of biometric and health applications, with growing markets that might be affected by insufficiently protected sensitive information.
The healthcare providers that are involved in the project follow strict ethical codes. All ethical, legal and regulatory issues will be studied in detail in T8.6 and presented in the incremental versions of D8.3. The most relevant findings will be included in the final version of this document.
4.2 Intellectual Property
With regard to property and ownership of medical data and records, there are two distinct views. From the standpoint of practitioners (i.e. healthcare providers, hospitals), patient medical records are their property because they are the ones who write, compile and produce the records (data producers). At the same time, patients tend to believe that medical records belong to them as they provide the relevant information.
Nevertheless, the project will produce data assets that do not correspond to medical records. For instance:
• Intervention contents and guidelines;
• Gamification reports;
• Treatment adherence reports;
• Aggregated medical data reports; and
• Reports and statistics of platform usage.
The ownership and IPR of these assets will be detailed in future versions of this document. The resulting agreements will be compliant with the corresponding legislation (e.g. Data Protection Act, Copyright, Freedom of Information Act, etc.).
4.3 Citation
An article, paper or presentation that refers to, or draws information from, a dataset should cite the dataset, just as it would cite other sources such as books and articles. A citation gives appropriate credit to the dataset creator(s), and allows interested readers to find the dataset so they can confirm the data is being correctly represented, or can use it in their own work. There is no universal standard for formatting a dataset citation.
There are many different styles for formatting citations, such as APA and the Chicago Manual of Style. In addition, most scientific publications have their own style, either unique to themselves or based on an existing style. A few of these styles, such as APA 6th edition, specify how to cite datasets. However, most citation style manuals do not currently cover citing datasets. Consequently, the general format of a style can be adapted to the needs of datasets.
At this early stage, the information used to cite C-MMD datasets could be:
• Author(s) (the principal investigator can be used as the “author” of a dataset)
• Title
• Year of Publication
• Publisher (partner producing the dataset)
• Version
• Access information (DOI or URL)
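The fields listed above can be combined into a citation string. Since the plan does not fix a citation style, the Python sketch below shows one possible layout loosely adapted from the general author-date pattern; the function name, the ordering of the elements and the example values are assumptions.

```python
def format_dataset_citation(authors, title, year, publisher, version, access):
    """Build a dataset citation from the six fields proposed in the plan.

    The layout is one possible adaptation of an author-date style,
    not a format prescribed by the project.
    """
    author_text = "; ".join(authors)
    return (
        f"{author_text} ({year}). {title} (Version {version}) [Data set]. "
        f"{publisher}. {access}"
    )
```

With placeholder values, `format_dataset_citation(["Surname, A.", "Surname, B."], "C-MMD-Screening", 2018, "Example Partner", "1.0", "https://example.org/dataset")` yields `Surname, A.; Surname, B. (2018). C-MMD-Screening (Version 1.0) [Data set]. Example Partner. https://example.org/dataset`.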
5 Access and Use of Information
One of the objectives of the CAREGIVERSPRO-MMD project is to develop the solution into a commercial product. This is the main reason why the Consortium has decided that potentially publishable data will not be available for open access until the end of the project, once the exploitation paths have been defined.
However, results of the pilot execution and platform evaluation will be made publicly available through the deliverables D6.1 – Mid-Pilot preliminary analysis report, D6.2 – Final Pilot analysis report and D6.3 – User feedback and usability report.
More details on specific dataset access regimes are defined in section 3.1.
6 Storage and Backup of Data
In order to safeguard the appropriate preservation of the data, a portion of the budget has been allocated to data storage and backups during the lifespan of the project and for at least the following two years.
The data will be stored in databases installed on the same server that holds the CAREGIVERSPRO-MMD platform. These databases are only accessible locally (i.e. only available to the server itself) in order to prevent any connection from outside. The system and server configuration have been arranged to support local data encryption as a safeguard against physical access to the hard disk drive. This measure would prevent access to the data if the physical storage was stolen or accessed directly.
The server has a local firewall that only allows secure web connections to the Internet and verified IP addresses for development/updates of the C-MMD application. A local log file records every access to the server.
The server is located in the UPC campus Data Center. This data center is a dedicated 250 m² facility with controlled access, personal ID cards for authorised staff and video surveillance 24x7. The server has dedicated bandwidth and a backup power system in order to guarantee availability.
A daily backup procedure has been designed in order to ensure data integrity and recovery. This backup has two main subsystems, complemented by a log-handling step:
1. File system backup: A daily copy of every file in the file system is stored in compressed format.
2. Database backup: A daily dump of every database/table is stored in a single file.
3. Daily encryption and compression of log files.
Optionally, this backup can be physically moved to a safe location outside the UPC Data Center if the personal data requires this level of protection. A specific budget is reserved for this task.
A 30-day window backup system has been programmed and enough disk space has been reserved for a month of operation.
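The file system part of the backup and the 30-day window can be sketched as follows. This is a hedged Python illustration, not the procedure deployed at UPC: the function names, the archive naming scheme and the directory layout are assumptions, and the database dump and log encryption steps are not covered.

```python
import tarfile
import time
from pathlib import Path
from typing import List, Optional

RETENTION_DAYS = 30  # the 30-day window described in the plan


def backup_directory(source: Path, backup_dir: Path, stamp: str) -> Path:
    """Store a compressed copy of source as backup_dir/files-<stamp>.tar.gz."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / f"files-{stamp}.tar.gz"
    with tarfile.open(target, "w:gz") as archive:
        archive.add(source, arcname=source.name)
    return target


def prune_old_backups(backup_dir: Path, now: Optional[float] = None) -> List[Path]:
    """Delete archives older than the retention window and return them."""
    current = time.time() if now is None else now
    cutoff = current - RETENTION_DAYS * 86400
    removed = []
    for path in sorted(backup_dir.glob("*.tar.gz")):
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

Running `backup_directory` once a day followed by `prune_old_backups` keeps roughly one month of archives on disk, which matches the reserved space mentioned above.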
6.1 Best Practices for File Formats
The file formats used have a direct impact on the ability to open those files at a later date and on the ability of other people to access those data.
6.1.1 Proprietary vs Open Formats
Data should be saved in a non-proprietary (open) file format when possible. If conversion to an open data format would result in some data loss from the files, consider saving the data in both the proprietary format and an open format. Having at least some of the information available in the future is better than having none.
When it is necessary to save files in a proprietary format, a readme.txt file will be included that documents the name and version of the software used to generate the file, as well as the company that made the software.
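The readme.txt convention can be automated so that the note is never forgotten. The Python sketch below is only an illustration; the function name, the field labels and the example values are assumptions rather than part of the plan.

```python
from pathlib import Path


def write_format_readme(data_file: Path, software: str, version: str, vendor: str) -> Path:
    """Write a readme.txt next to data_file describing the software that produced it."""
    readme = data_file.parent / "readme.txt"
    lines = [
        f"File: {data_file.name}",
        f"Software: {software}",
        f"Software version: {version}",
        f"Vendor: {vendor}",
    ]
    readme.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return readme
```

For instance, `write_format_readme(Path("export/results.sav"), "ExampleStats", "1.0", "Example Corp")` would create `export/readme.txt`, assuming the `export` directory exists. Note that this simple version overwrites an existing readme.txt in the same directory.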
6.1.2 Guidelines for Choosing Formats
When selecting file formats for archiving, the formats should ideally be:
• Non-proprietary;
• Unencrypted; [10]
• Uncompressed;
• In common usage by the research community;
• Adherent to an open, documented standard:
  o Interoperable among diverse platforms and applications
  o Fully published and available royalty-free
  o Fully and independently implementable by multiple software providers on multiple platforms without any intellectual property restrictions for necessary technology
  o Developed and maintained by an open standards organization with a well-defined inclusive process for evolution of the standard
6.1.3 Some Preferred File Formats [11] [12]
• Containers: TAR, GZIP, ZIP
• Databases: XML, CSV
• Geospatial: SHP, DBF, GeoTIFF, NetCDF
• Moving images: MOV, MPEG, AVI, MXF
• Sounds: WAVE, AIFF, MP3, MXF
• Statistics: ASCII, DTA, POR, SAS, SAV
• Still images: TIFF, JPEG 2000, PDF, PNG, GIF, BMP
• Tabular data: CSV
• Text: XML, PDF/A, HTML, ASCII, UTF-8
• Web archive: WARC
[10] Data will be encrypted in the UPC server for security reasons
[11] http://www.digitalpreservation.gov/formats/
[12] http://www.loc.gov/preservation/resources/rfs/data.html
7 Archiving and Future Proofing of Information
The national legislation (European compliant) of the server site (Spain) compels UPC to preserve all data and access records for two years after the project completion. The server will remain in the same safe location in order to preserve physical and logical access. Consequently, the data will be kept in the server and will be accessible under the same terms that will be agreed among partners during the project lifespan.
All public project deliverables will be available on the project portal for at least five years after the project completion.
Selected datasets, databases, standalone documents, and even software may be made public or open for exploitation at the end of the project. These resources may prove useless without explanatory notes (metadata) accompanying them. Metadata will be clearly linked to the materials so that they can adequately inform any future user about the material. For example, a published dataset will typically be accompanied by a metadata document that explains the various fields and their usefulness, and summarises the purpose of the dataset in general. These documents will be stored along with the dataset and made accessible in the same manner as the dataset (e.g. online, or download). Contact information will be provided accordingly in case the future user needs further clarification.
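A metadata document of the kind described above could be generated alongside each published dataset. The Python sketch below produces a plain JSON document; it is not a CERIF serialisation, and the key names, the function name and the example values are assumptions made for illustration.

```python
import json


def build_metadata_document(dataset, purpose, fields, contact):
    """Return a JSON document describing a published dataset.

    fields maps each field name to a short explanation of its meaning.
    """
    document = {
        "dataset": dataset,
        "purpose": purpose,
        "fields": [
            {"name": name, "description": text} for name, text in fields.items()
        ],
        "contact": contact,
    }
    return json.dumps(document, indent=2, ensure_ascii=False)
```

Storing the output next to the dataset file, for example as `C-MMD-Dissemination.metadata.json` (a hypothetical name), keeps the explanation and the data accessible in the same manner.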
8 Resourcing of Data Management
This section outlines the staffing and financial details of the data management within the CAREGIVERSPRO-MMD project. The former aspect provides information about the role and responsibilities of the partners that generate the data and those who control it. The latter aspect describes the financing process for data management and data storage.
8.1 Roles in Data Management
Each pilot partner (HUL, COO, FUB, CHU) is responsible, as data producer, for the data generated in its own pilot by the different stakeholders of the platform. Each pilot partner will assign a responsible person from its institution for this task; this person will be designated in the next version of this document.
The UPC is responsible, as data processor, for all the aspects related to data storage and backup.
MDD and CERTH, as the main developers of the C-MMD platform, will be responsible, as data processors and service providers, for all the aspects related to data gathering, data integrity, access logging, etc.
As specified in section 5.1.3 of the DoA, specific agreements will be signed among partners in order to grant access to the different datasets for the different uses (data storage, data processing, service provision).
8.2 Financial Data Management Process
As mentioned before, the Consortium has reserved a portion of the project budget for data hosting and backup.
9 Review of Data Management Process
The follow-up of this plan will be reported in future versions of this document, where detailed protocols and measures will be described to ensure compliance with the plan, along with preliminary results on the observed evolution. UPC, as main contributor to this plan, supported by the roles described in section 8.1, will perform the follow-up.
External reviewers of the Consortium as well as selected members of the Advisory Board will support the peer-review process.
10 Statements and Personnel Details
10.1 Statement of Agreement
The Consortium agrees to the specific elements of the plan as outlined. [13]
Project Coordinator
Title
Designation
Name
Date
Signature
Project Manager
Title
Designation
Name
Date
Signature
[13] To be signed in the next version of this document
Management Board (one table for each member)
Title
Designation
Name
Date
Signature