Moving Towards FAIR data sharing with The Dataverse Project · The sharing and use of research data...


Page 1: Moving Towards FAIR data sharing with The Dataverse Project · The sharing and use of research data shall be based on Creative Commons license CC:BY:NC, where others may use data

Moving Towards FAIR data sharing with The Dataverse Project

Open source research data repository software

“It is only by sharing data—and “data” in the broad sense of everything that went into a conclusion—that the ideal of scientific knowledge being testable can be achieved. It goes back to the basic idea of science that people learn in high school or perhaps kindergarten, ...that its results flow from “publicly available, reproducible, everybody-can-stand-around-and-look-at-it data.””

Anthony Cox, chief science officer of Next Health Technologies, 2016

The State of Open Data

The State of Open Data 2019 Report, curated by FigShare in the Digital Science Report, found that

● 79% of 2019 respondents are supportive of a national mandate for making primary research openly available

● 67% of respondents think that funders should withhold funding from, or penalise, researchers who do not share their data if mandated

● 69% of respondents think that funders should make the sharing of research data part of their requirements

● 36% of respondents expressed the concern that their data may be misused

● 42% of researchers would be encouraged to share their data if it resulted in a co-authorship

While “open data” clearly has more recognition in the community (by virtue of the high response rate to our survey), “FAIR principles” are relatively unknown to the community, with 52% of respondents who are frequent data-sharers never having heard of them

This report has been published since 2016 and examines the trends in open data sharing

Why share data? The benefits of data sharing

The Research LifeCycle, DMP, and RDM

FAIR data guiding principles

How Dataverse Supports FAIR standards and data sharing

Nanyang Technological University Research Data Repository

Guidance for Data Management and Curation Best Practices

The Benefits of Data Sharing

1. Advances from reproducible research

2. Allows verification of results

3. Improves Science

4. Enhanced visibility

5. Enhanced discoverability

6. More citations

7. Co-authorships, and more...

If you shared your data, have you been cited?

Why share data? The benefits of data sharing

The Research LifeCycle, DMP, and RDM

FAIR data guiding principles

How Dataverse Supports FAIR standards and data sharing

Nanyang Technological University Research Data Repository

Guidance for Data Management and Curation Best Practices

Research Life Cycle

If you have shared your data, have you received co-authorship?

https://www.uzh.ch/blog/hbz/2018/11/15/data-management-plan-in-a-nutshell/?lang=en

What will you deposit with the Repository?

Everything needed to validate the results presented in your publication

Data

Metadata

The “other” data described in your DMP

Tools: documentation, scripts, software, statistical analysis

Is it understandable? ReadMe text files

Questions to consider for your DMP

● What is the project/aim of research?

● Who are the collaborators, funders, and principal investigators of the project?

Documentation and Organization

● What information will be needed for the data to be read and interpreted in the future?

● How will the data be labelled and organized? And how will you communicate that?

Storage and Backup

● How and where will data be stored and backed up (how often?) during research?

● What file formats of data will be produced? Do they enable sharing and long term access (open source)?

Access and Security

● Is there sensitive data and if so, how will you manage access and security?

● Is there data that needs to be retained/destroyed for contractual, legal, regulatory purposes?

Selection and Preservation

● Where and how long will the data be stored (backups too)?

● Which data will be retained, shared, and/or preserved?

Sharing

● Who will you share the data with? How?

● Questions to continue asking yourself:

○ Is the metadata still available and understandable?

○ Are the formats still usable?

○ Is the software still available?

○ Is the data still in the correct location?

○ Are my backups working as I expect?
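
One way to answer the last question in practice is to compare checksums of the working files against their backup copies. The sketch below is only an illustration under assumptions: the function names and the directory layout (a backup folder that mirrors the source folder) are hypothetical and are not part of any NTU or Dataverse tooling.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_backup_problems(source_dir: Path, backup_dir: Path) -> dict[str, str]:
    """Map each problematic relative path to 'missing' or 'differs'.

    Assumes (hypothetically) that backup_dir mirrors the layout of source_dir.
    """
    problems = {}
    for source_file in sorted(source_dir.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(source_dir)
        backup_file = backup_dir / relative
        if not backup_file.is_file():
            problems[relative.as_posix()] = "missing"
        elif sha256_of(source_file) != sha256_of(backup_file):
            problems[relative.as_posix()] = "differs"
    return problems
```

An empty result means every source file has a byte-identical copy in the backup; it says nothing about whether the backup medium itself will remain readable.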

At what stage of the research lifecycle are you now?

http://www.dcc.ac.uk/resources/curation-lifecycle-model

What is Research Data Management and Digital Curation

Research Data Management is the care and maintenance of the data that is produced during the course of a research cycle.

It is an integral part of the research process.

It helps to ensure that your data is properly organized, described, preserved, and shared.

Digital curation involves maintaining, preserving and adding value to digital research data throughout its lifecycle.

Digital Data Curation LifeCycle

● Concerned with the “changing” aspects of data

● Curation of the Data

● Capturing the collected data

● Storage

● Transformation/migration

● Appraisal/Disposal

So why the urgent need to implement RDM?

● Data Deluge

○ There is lots of data being generated!

● Funders’ mandates

○ Required DMPs with submission of funding requests

● National Policies

○ Data protection, as well as data access

Angus Whyte (DCC) and Jonathan Tedds (University of Leicester), 2011

“Effective management is providing institutions with new ways to find synergies across research groups, producing new knowledge by engaging a broader range of stakeholders, and enabling wider reuse of data in teaching and learning, commercial exploitation and policy development.”

“The objective of these policies is no different from that of research itself; to benefit science, scholarship and provide wider social and economic impacts.”

“A strong case for managing and curating research data can be made, as a means to assure research integrity and to provide improvements in research efficiency and in the effectiveness of institutional support.”

What are the challenges in RDM?

● Librarians do not tend to be researchers

○ The gap is not just in the RDM process but also in understanding the perspective of the researcher

● Librarians may not know enough about what researchers do, the research process, the disciplines

● Why should librarians learn RDM?

● How do we decide on the appropriate collection of skills to teach Librarians?

https://scholar.harvard.edu/files/mercecrosas/files/rdm-harvard-crosas.pdf

What is your top Data Management Planning question?

Why share data? The benefits of data sharing

The Research LifeCycle, DMP, and RDM

FAIR data guiding principles

How Dataverse Supports FAIR standards and data sharing

Nanyang Technological University Research Data Repository

Guidance for Data Management and Curation Best Practices

Are you familiar with the FAIR data sharing standards?

https://commons.wikimedia.org/wiki/File:FAIR_data_principles.jpg

“The FAIR Principles put specific emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals.”

“The Principles define characteristics that contemporary data resources, tools, vocabularies and infrastructures should exhibit to assist discovery and reuse by third-parties. By minimally defining each guiding principle, the barrier-to-entry for data producers, publishers and stewards who wish to make their data holdings FAIR is purposely maintained as low as possible ... They act as a guide to data publishers and stewards to assist them in evaluating whether their particular implementation choices are rendering their digital research artifacts Findable, Accessible, Interoperable, and Reusable.”

Wilkinson et al. 2016. The FAIR Guiding Principles for scientific data management and stewardship.

SCIENTIFIC DATA | 3:160018 | DOI: 10.1038/sdata.2016.18

FAIR Data Principles

To be Findable:

● (meta)data are assigned a globally unique and persistent identifier

● data are described with rich metadata

● metadata clearly and explicitly include the identifier of the data it describes

● (meta)data are registered or indexed in a searchable resource

To be Accessible:

● (meta)data are retrievable by their identifier using a standardized communications protocol

● the protocol is open, free, and universally implementable

● the protocol allows for an authentication and authorization procedure, where necessary

● metadata are accessible, even when the data are no longer available

To be Interoperable:

● (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation

● (meta)data use vocabularies that follow FAIR principles

● (meta)data include qualified references to other (meta)data

To be Reusable:

● (meta)data are richly described with a plurality of accurate and relevant attributes

● (meta)data are released with a clear and accessible data usage license

● (meta)data are associated with detailed provenance

● (meta)data meet domain relevant community standards
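
A few of these principles can be turned into a rough self-check on a metadata record before deposit. The sketch below is only an illustration: the dictionary keys (`identifier`, `description`, `keywords`, `metadata_format`, `license`, `provenance`) are hypothetical names chosen for this example, and passing the check does not by itself make a dataset FAIR.

```python
def fair_gaps(record: dict) -> list[str]:
    """Return a list of gaps for a metadata record with hypothetical keys."""
    gaps = []
    identifier = str(record.get("identifier", ""))
    # Findable: a globally unique, persistent identifier such as a DOI or handle.
    if not identifier.startswith(("doi:", "hdl:", "https://doi.org/")):
        gaps.append("Findable: no persistent identifier")
    # Findable: rich metadata, approximated here by description plus keywords.
    if not record.get("description") or not record.get("keywords"):
        gaps.append("Findable: description or keywords missing")
    # Interoperable: metadata expressed in a named, shared standard.
    if not record.get("metadata_format"):
        gaps.append("Interoperable: no metadata standard named")
    # Reusable: clear usage license and provenance.
    if not record.get("license"):
        gaps.append("Reusable: no usage license")
    if not record.get("provenance"):
        gaps.append("Reusable: no provenance")
    return gaps


example = {
    "identifier": "doi:10.21979/N9/JHHBXB",
    "description": "Replication data for a published article",
    "keywords": ["fNIRS", "criticism"],
    "metadata_format": "DDI",
    "license": "CC BY-NC",
}
print(fair_gaps(example))  # reports the missing provenance entry
```

Accessibility is deliberately left out of the sketch, because it depends on the repository's protocol rather than on fields in the record.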

Why share data? The benefits of data sharing

The Research LifeCycle, DMP, and RDM

FAIR data guiding principles

How Dataverse Supports FAIR standards and data sharing

Nanyang Technological University Research Data Repository

Guidance for Data Management and Curation Best Practices

Roche DG, Lanfear R, Binning SA, Haff TM, Schwanz LE, et al. (2014) [CC BY 4.0 (https://creativecommons.org/licenses/by/4.0)]

João Batista Neto - Data types - pt br.svg

The Dataverse Project

● Open source web application to share, preserve, cite, explore, and analyze research data.

● Allows you to create Data Repositories called “Dataverses,” to support research data sharing.

● The central insight behind Dataverse is to automate much of the job of the professional archivist.

● To provide services for and to distribute credit to the data creator.

NTU Dataverse Repository

The Dataverse Project

Personal repository spaces (dataverses)

Individual datasets

● Self-curated or managed data sharing

● Personalized data spaces

● Dataset creation

● Data file support

● Data management

● Publishing workflows

● And much more...

Dataverse Creation Page

Discoverable Metadata starts here

Custom metadata blocks

More standards as they become available

Searchable Facets match metadata options

Metadata: Find and Reuse Data

At Multiple Levels:

● Citation metadata

● Discipline-based metadata

● Custom metadata

● File metadata

● Variable-level metadata

With Multiple Standards:

● Data Documentation Initiative (DDI)

● Dublin Core

● Schema.org JSON-LD

● DDI Codebook

● JSON

● OpenAIRE

Download metadata in multiple formats
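
Besides the download button, published metadata can usually be fetched programmatically. The sketch below only builds the request URL for the Dataverse native API export endpoint (`/api/datasets/export`); the server address is an assumption for illustration, and the set of exporter names (for example `ddi` or `schema.org`) depends on the Dataverse version an installation runs, so check the API guide of the target installation before relying on it.

```python
from urllib.parse import urlencode


def metadata_export_url(server: str, persistent_id: str, exporter: str) -> str:
    """Build the URL for exporting a published dataset's metadata.

    server        base URL of a Dataverse installation (assumed by the caller)
    persistent_id dataset identifier such as "doi:10.21979/N9/JHHBXB"
    exporter      metadata format name, e.g. "ddi" or "schema.org"
    """
    query = urlencode({"exporter": exporter, "persistentId": persistent_id})
    return f"{server.rstrip('/')}/api/datasets/export?{query}"


# The host below is assumed to be the DR-NTU (Data) address; substitute your own.
url = metadata_export_url(
    "https://researchdata.ntu.edu.sg", "doi:10.21979/N9/JHHBXB", "schema.org"
)
print(url)
```

Fetching the URL (for instance with `urllib.request.urlopen`) should return the metadata document for a published dataset; draft datasets normally require authentication.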

Dataset landing page

Citation for entire dataset: DOI, with URL, and metadata registered to DataCite (FINDABLE & ACCESSIBLE)

Make Data Count (REUSABLE)

Citation and discoverable metadata using DataCite, schema.org, Dublin Core, DDI standards (FINDABLE, ACCESSIBLE, REUSABLE)

More metadata, including domain-specific (REUSABLE)

Terms with usage license or Data Use Agreements (REUSABLE)

PROV metadata (REUSABLE) - forthcoming

File Hierarchy structure

DataTags for sensitive data support - forthcoming

Data file landing page

Make Data Count (REUSABLE METRIC)

Citation for data file, with a DOI and URL for each file (FINDABLE & ACCESSIBLE)

Variable metadata for tabular data file using DDI standards (INTEROPERABLE & REUSABLE)

Metadata are accessible, even when the data are no longer available

Summary of Dataverse Data Sharing Support

Data Management

Data Findability, Accessibility, Interoperability, and Reuse

Data access and management workflows: permissions, publishing, data access control, guestbook

Data analysis and visualization

External tool integrations

Why share data? The benefits of data sharing

The Research LifeCycle, DMP, and RDM

FAIR data guiding principles

How Dataverse Supports FAIR standards and data sharing

Nanyang Technological University Research Data Repository

Guidance for Data Management and Curation Best Practices

Nanyang Technological University Research Data Repository

NTU Research Data Policy

● NTU recognizes that research data is an important part of all research and scholarly work

● Data Management Plans are required for all research proposals

● The University owns all research data produced by research projects conducted at or under the auspices of NTU regardless of funding source

● The University assigns automatic rights to the PI and his/her designated researchers to use and publish all research data arising from their project for non-commercial purposes only

● The final research data from projects carried out at NTU shall be made available for sharing (via the NTU Data Repository)

● The sharing and use of research data shall be based on Creative Commons license CC:BY:NC, where others may use data for non-commercial applications only and must correctly attribute the data source in NTU

For the researcher

● Prepare a data management plan using either the NTU DMP template or that provided by the funding agency and submit it online onto the NTU platform specified by the University within three months upon approval of the project grant.

● The PI shall provide an updated version whenever there are substantive changes to the research project.

● Submit the final research data to the NTU Data Repository or external open access repository no later than the first online publication of the article.

Examples of well deposited content in NTU data repository

Replication data for

Keywords

● Related publication

● Active link to article DOI

● Replication data for

● Keywords

● Related publication

● Software

ORCID ID

All file types are supported

Preview of image and PDF file

Additional functionality for tabular files

● Versioning

Why share data? The benefits of data sharing

The Research LifeCycle, DMP, and RDM

FAIR data guiding principles

How Dataverse Supports FAIR standards and data sharing

Nanyang Technological University Research Data Repository

Guidance for Data Management and Curation Best Practices

Guidance for Data Management and Curation Best Practices

1. NTU Policies

2. Upload your project in DRAFT format to manage your project with your team.

a. The data and metadata will only be visible to the research team and the NTU Dataverse data management team. A DOI is assigned to the dataset at this time

b. Private URL feature allows you to share your unpublished dataset with others

3. Multiple roles available to support management of your research space.

4. Once PUBLISHED, metadata are always visible for discoverability and to meet DataCite standards for data sharing, and the DOI assigned at draft is now discoverable.

5. Restricted data files will be visible but inaccessible without permission from the project administrator.

6. Data files can be restricted for access and Terms can be modified to meet the needs of the PI

7. Embargo on data restrictions can be added manually while this feature is in development

Responsible Data Sharing

● Copyright, credit to funders, funders as “producers”

● Sensitive content

● Terms of Access

● Data Sharing contracts

Best Practices: Draft Datasets

● Manage your project in DRAFT, unpublished format

○ Send DOI to journal! DOI will not change once published!

● Use “PRIVATE URL” feature to share content with reviewers

● Use “permissions” to give team members access to DRAFT

● Remember to “PUBLISH” your dataset when your Publication is published!

○ ADD your bidirectional link to “related publication” field first!

Best Practices: Citation Metadata

Dataverse provides standards

Default metadata provided:

● Title should be descriptive/same as related article

● Who is the “producer” “author”---copyright!

● Replication data for “checkbox”

● Digital Object Identifier (DOI)

● Repository = DR-NTU (Data)

● Version #

● UNF for tabular files

Esposito, Gianluca; Michelle, Neoh, 2019, "Disapproval from romantic partners, friends and parents: source of criticism regulates prefrontal cortex activity", https://doi.org/10.21979/N9/JHHBXB, DR-NTU (Data), V1, UNF:6:gxfI0mU1ePO1nk9GiGeP6A== [fileUNF]
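
The example citation above is assembled from the default metadata fields in a fixed order. The sketch below reproduces that layout to make the structure explicit; the function and parameter names are hypothetical, and in practice Dataverse generates the citation automatically, so this is an illustration rather than the repository's own code.

```python
def format_data_citation(authors, year, title, doi, repository, version, unf=None):
    """Assemble a data citation in the layout of the DR-NTU (Data) example.

    authors is a list of "Family, Given" strings; unf is the optional
    Universal Numerical Fingerprint reported for tabular files.
    """
    parts = [
        "; ".join(authors),
        str(year),
        f'"{title}"',
        f"https://doi.org/{doi}",
        repository,
        f"V{version}",
    ]
    citation = ", ".join(parts)
    if unf:
        citation += f", {unf} [fileUNF]"
    return citation


citation = format_data_citation(
    authors=["Esposito, Gianluca", "Michelle, Neoh"],
    year=2019,
    title=(
        "Disapproval from romantic partners, friends and parents: "
        "source of criticism regulates prefrontal cortex activity"
    ),
    doi="10.21979/N9/JHHBXB",
    repository="DR-NTU (Data)",
    version=1,
    unf="UNF:6:gxfI0mU1ePO1nk9GiGeP6A==",
)
print(citation)
```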

Joint Declaration of Data Citation Principles

Data Citation: Credit for Sharing Data

● A formal data citation automatically generated

● Attribution to data creators and data providers

● Persistent identifier (e.g., DOI) resolves to dataset landing page

● Version in citation

● Universal Numerical Fingerprint (UNF): a checksum independent of file format, for tabular data files

● Compliant with the Joint Declaration of Data Citation Principles
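
The idea of a checksum that is independent of file format can be illustrated by hashing normalized cell values instead of file bytes. The sketch below is a simplified stand-in and is not the UNF algorithm: the normalization rule (numbers rendered with 7 significant digits, text stripped of surrounding whitespace) and all function names are assumptions made for this example.

```python
import csv
import hashlib
import io
import json


def normalize(cell) -> str:
    """Render numbers in one canonical form; leave other text as stripped strings."""
    text = str(cell).strip()
    try:
        return format(float(text), ".7g")
    except ValueError:
        return text


def content_fingerprint(rows) -> str:
    """Hash the normalized values row by row, ignoring how the file was encoded."""
    digest = hashlib.sha256()
    for row in rows:
        canonical = [normalize(cell) for cell in row]
        digest.update(json.dumps(canonical).encode("utf-8"))
    return digest.hexdigest()


def rows_from(text: str, delimiter: str):
    """Parse delimited text into a list of rows."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


csv_text = "name,score\nDoe,1.50\nRoe,2\n"
tsv_text = "name\tscore\nDoe\t1.5\nRoe\t2.0\n"

same = content_fingerprint(rows_from(csv_text, ",")) == content_fingerprint(
    rows_from(tsv_text, "\t")
)
print(same)
```

The two inputs differ byte for byte, so an ordinary file checksum would differ, while the content fingerprint matches because the values are the same. A real UNF follows a published specification with its own rounding and encoding rules.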

BEST PRACTICES: DATA FILES

● Consistent file system

● Keep raw files separate from analyzed data files

● Descriptive file names.

○ “DOLInterview_DoeJane_20061207” rather than “myData”.

● Use capital letters or underscores between words

● Do not rely on the directory hierarchy

● Convert data to open, stable formats (ascii, txt, csv, pdf) instead of proprietary formats (xls, doc, psd)

● “Replace” files in existing dataset, instead of “Delete” *versioning
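
The naming and format advice above can be sketched as two small helpers. This is only an illustration: the function names are hypothetical, the name pattern (project, subject, date as YYYYMMDD) is inferred from the single example on the slide, and the format lists contain just the extensions the slide mentions.

```python
import re
from datetime import date

OPEN_FORMATS = {"txt", "csv", "pdf"}          # open, stable formats named on the slide
PROPRIETARY_FORMATS = {"xls", "doc", "psd"}   # proprietary formats named on the slide


def descriptive_name(project: str, subject: str, when: date, extension: str) -> str:
    """Join the parts with underscores after dropping spaces and punctuation."""
    parts = [project, subject, when.strftime("%Y%m%d")]
    cleaned = [re.sub(r"[^A-Za-z0-9]", "", part) for part in parts]
    return "_".join(cleaned) + "." + extension.lstrip(".").lower()


def format_advice(filename: str) -> str:
    """Classify a file by its extension using the two lists above."""
    extension = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    if extension in PROPRIETARY_FORMATS:
        return "convert to an open format"
    if extension in OPEN_FORMATS:
        return "ok"
    return "check manually"


name = descriptive_name("DOL Interview", "Doe, Jane", date(2006, 12, 7), "txt")
print(name, format_advice(name))
print(format_advice("results.xls"))
```

Because the name carries the project, subject and date, the file stays identifiable even when it is moved out of its original folder, which is the point of not relying on the directory hierarchy.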

Page 64: Moving Towards FAIR data sharing with The Dataverse Project · The sharing and use of research data shall be based on Creative Com m ons license CC:BY:NC , where others m ay use data

Tabular Data Functionality


File Hierarchy

Table or Tree view

Folder preservation


Geospatial File Handling


Best Practices: Metadata

Use established metadata standards where possible

NTU Dataverse provides custom fields as well as standard ones:

● Citation Metadata (Required)
● Geospatial Metadata
● Social Science and Humanities Metadata
● Astronomy and Astrophysics Metadata
● Life Sciences Metadata
● Journal Metadata


Metadata: Find and Reuse Data

At Multiple Levels:
● Citation metadata
● Discipline-based metadata
● Custom metadata
● File metadata
● Variable-level metadata

With Multiple Standards:
● Data Documentation Initiative (DDI)
● Dublin Core
● Schema.org JSON-LD
● DDI Codebook
● JSON
● OpenAIRE

Download metadata in multiple formats
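As a rough illustration of downloading metadata in multiple formats, the sketch below builds a request URL for the dataset export endpoint of the Dataverse native API. The endpoint path and the exporter names are taken from the Dataverse API guide as I understand it and should be treated as assumptions; check them against the version your installation runs. The function name is hypothetical, and no network request is made.

```python
from urllib.parse import urlencode

# Exporter names assumed from the Dataverse native API guide; an
# installation may offer more, fewer, or differently named exporters.
EXPORTERS = {"ddi", "oai_ddi", "dcterms", "oai_dc", "schema.org", "dataverse_json"}


def export_url(base_url: str, persistent_id: str, exporter: str) -> str:
    """Build the metadata export URL for one dataset and one format."""
    if exporter not in EXPORTERS:
        raise ValueError(f"unknown exporter: {exporter}")
    query = urlencode({"exporter": exporter, "persistentId": persistent_id})
    return f"{base_url.rstrip('/')}/api/datasets/export?{query}"


# The DOI below is a placeholder; it would need to exist on the chosen server.
for fmt in ("ddi", "oai_dc", "schema.org"):
    print(export_url("https://demo.dataverse.org", "doi:10.5072/FK2/EXAMPLE", fmt))
```

Fetching each URL with any HTTP client would then return the same dataset described in a different standard.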


Best Practices: Terms Of Access

● The best option for open data sharing is CC0
● Instances of ethics/confidentiality must be adhered to!
● Custom Terms of Access:
○ Licensed data
○ Applications for use of access
○ “Request access” for restricted content


Best Practices: Versioning

● Versioning allows you to keep your original DOI throughout dataset changes
● Metadata changes result in a minor change to your citation version #
○ V1 ---> V1.1 *not changed in visible citation
● Data file deletions/additions/replacements result in a major version change
○ V1 ---> V2 *changed in visible citation
● Some disciplines do not use “versioning”
● Option to completely overwrite an existing version, with no versioning record**
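The minor/major rule above can be written down as a short sketch. It only models what the slide states: metadata changes bump the minor number, file changes bump the major number, and the visible citation shows the major number only. The function names and the `"metadata"` / `"files"` labels are hypothetical.

```python
from typing import Tuple


def next_version(major: int, minor: int, change: str) -> Tuple[int, int]:
    """Return the version that results from one published change."""
    if change == "metadata":
        # Minor change: V1 -> V1.1
        return major, minor + 1
    if change == "files":
        # Major change (files added, deleted or replaced): V1 -> V2
        return major + 1, 0
    raise ValueError(f"unknown change type: {change}")


def citation_label(major: int, minor: int) -> str:
    """Version as shown in the visible citation: major number only."""
    return f"V{major}"


v = next_version(1, 0, "metadata")
print(v, citation_label(*v))   # (1, 1) V1
v = next_version(*v, "files")
print(v, citation_label(*v))   # (2, 0) V2
```

In both cases the DOI stays the same; only the version recorded alongside it changes.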


Best Practices: Deaccessioning

Deaccessioning has rules and regulations

● When a dataset is destroyed
● When a dataset is moved to another repository
● When a dataset is shared by mistake (not for curation mistakes - use versioning!)
● Leave a note on why dataset was deaccessioned
○ new repository/DOI?
○ Time restricted sharing?
○ Appraisal review by RDM team?


The Dataverse Project: dataverse.org

Harvard Dataverse: dataverse.harvard.edu

Test the features for Dataverse at: demo.dataverse.org

Questions or support: NTU University contact

Thank You!


Data Sharing and Preservation Features Provided

1. Support for FAIR Data Principles
   a. Findable, Accessible, Interoperable, Reusable.
2. Data citation for datasets and files
   a. EndNote XML, RIS Format, or BibTeX Format.
3. OAI-PMH (Harvesting)
   a. Gather and expose metadata from and to other systems using standardized metadata formats: Dublin Core, Data Documentation Initiative (DDI), OpenAIRE, etc.
4. APIs for interoperability and custom integrations
   a. Search API, Data Deposit API, Data Access API, Metrics API, etc.
5. Login via Shibboleth
   a. Single Sign On (SSO) using your institution's credentials.
6. Login via ORCID, Google, or GitHub
   a. Login using popular OAuth2 providers.

1. DataCite integration
   a. When datasets are published, their metadata is sent to DataCite.
2. Usage statistics and metrics
   a. Download counters, support for Make Data Count.
3. Schema.org JSON-LD
   a. Used by Google Dataset Search and other services for discoverability.
4. Preview and analysis of tabular files
   a. Data Explorer allows for searching, charting and cross tabulation analysis.
5. External Tools
   a. Enable additional features not built in to Dataverse.
6. Fixity checks for files
   a. MD5, SHA-1, SHA-256, SHA-512, UNF.
7. Publishing workflow support
   a. Datasets start as drafts and can be submitted for review before publication.
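To illustrate the fixity checks named above, here is a minimal sketch that recomputes a file checksum after download and compares it with the value published by the repository. It covers the standard hash algorithms only (MD5, SHA-1, SHA-256, SHA-512 via Python's `hashlib`); UNF is a separate algorithm for tabular data and is not implemented here. The function names are hypothetical.

```python
import hashlib


def file_checksum(path: str, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected: str, algorithm: str = "sha256") -> bool:
    """True when the local file matches the checksum published with the dataset."""
    return file_checksum(path, algorithm) == expected.strip().lower()
```

A mismatch means the local copy differs from the deposited file, for example because of a truncated download.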


Data Sharing and Preservation Features Provided

1. File download in R and TSV format
   a. Proprietary tabular formats are converted into RData and TSV.
2. Versioning
   a. History of changes to datasets and files is preserved.
3. Custom Terms of Use
   a. CC0 waiver by default, custom terms of use.
4. Guestbook
   a. Optionally collect data about who is downloading the files from your datasets.
5. Faceted search
   a. Facets are customizable and data driven.
6. File hierarchy
   a. Users are able to control dataset file hierarchy and directory structure.

1. Restricted files
   a. Control who can download files and choose whether or not to enable a "Request Access" button.
2. Customization of dataverses
   a. Your personal or organizational dataverse can be customized and branded.
3. Dropbox integration
   a. Upload files stored on Dropbox.
4. Notifications
   a. In-app and email notifications for access requests, requests for review, etc.
5. Widgets
   a. Embed data outside of the application.
6. User management
   a. Dashboard for common user-related tasks.
7. Backend storage on S3 or Swift
   a. Choose between filesystem or object storage.
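Since tabular files can be downloaded as TSV, a downloaded file can be read with nothing more than the Python standard library. The sketch below is a minimal example under that assumption; the function name and the column names in the sample text are made up for illustration.

```python
import csv
import io
from typing import Dict, List


def read_tsv(text: str) -> List[Dict[str, str]]:
    """Parse tab-separated text into one dict per row, keyed by column name."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)


# Hypothetical content standing in for a TSV file downloaded from a dataset.
sample = "participant\tscore\nP01\t0.5\nP02\t0.7\n"
for row in read_tsv(sample):
    print(row["participant"], row["score"])
```

All values come back as strings, so numeric columns still need to be converted before analysis.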


Data Sharing and Preservation Features Provided

1. Mapping of geospatial files
   a. Create maps from shapefiles and tabular files with geospatial data.
2. Handling large data
   a. File transfer using rsync (experimental).
3. Open Science Framework Integration
   a. https://help.osf.io/hc/en-us/articles/360019737314-Connect-Dataverse-to-a-Project
4. Pull header metadata from Astronomy (FITS) files
   a. Dataset metadata prepopulated from FITS file metadata.
5. Dataverse/RSpace Integration
   a. https://dataverse.org/blog/dataverse-rspace-integration
      i. Dataverse and Research Space are pleased to announce an integration of the RSpace electronic lab notebook with Dataverse. This integration, described in the following brief video, enables researchers to deposit datasets directly from RSpace to any Dataverse.


References

"3 The Benefits of Data Sharing." National Academies of Sciences, Engineering, and Medicine. 2016. Principles and Obstacles for Sharing Data from Environmental Health Research: Workshop Summary. Washington, DC: The National Academies Press. doi: 10.17226/21703.

Sardanelli, F., Alì, M., Hunink, M.G. et al. Eur Radiol (2018) 28: 2328. https://doi.org/10.1007/s00330-017-5165-5

https://dataverse.org/

https://dataverse.harvard.edu/dataverse/harvard

https://github.com/IQSS/dataverse

https://www.nature.com/articles/sdata201618.pdf

https://scholar.harvard.edu/files/mercecrosas/files/fairdata-dataverse-mercecrosas.pdf

https://dataverse.org/files/dataverseorg/files/fair-datasharing-archeologysymposium.pdf

https://www.force11.org/group/fairgroup/fairprinciples

https://icon-library.net/icon/repository-icon-9.html

https://figshare.com/articles/State_of_Open_Data_2019/10011788

https://knowledge.figshare.com/articles/item/state-of-open-data-2019