Open Science Monitor Methodological Note April 2019 · 2020-01-28 · 1 OPEN SCIENCE MONITOR...
OPEN SCIENCE MONITOR
UPDATED METHODOLOGICAL NOTE
Brussels, 4th April 2019
Consortium partners: The Lisbon Council, ESADE Business School,
Centre for Science and Technology Studies (CWTS) at Leiden University
Subcontractor: Elsevier
Contents
1 Introduction
1.1 Objectives
1.2 Scope
2 Indicators and data sources
2.1 Open access to publications
2.2 Open research data
2.3 Open collaboration
2.3.1 Open code
2.3.2 Open scientific hardware
2.3.3 Citizen science
2.3.4 Altmetrics
3 Next steps
Annex 1: Methodological note on the implementation of Unpaywall data into the Open Access labelling of the Open Science Monitor
    Introduction
    References
Annex 2: Answer to comments
    Open access
    Open research data
    Open collaboration
Annex 3: Methodological approach for the case studies
    Research Overview
    Data Collection
    Data Analysis
    Status: Overview of case studies performed
Annex 4: Methodological approach for the survey
Annex 5: Participants to the experts’ workshop and members of the advisory group
1 Introduction

This is the revised version of the methodology of the Open Science Monitor, based on the comments received online and the discussion in the experts’ workshop.

Open science has recently emerged as a powerful trend in research policy. To be clear, openness has always been a core value of science, but it meant publishing the results of research in a journal article. Today, there is consensus that, by ensuring the widest possible access to and reuse of publications, data, code and other intermediate outputs, scientific productivity grows, scientific misconduct becomes rarer, and discoveries are accelerated. Yet it is also clear that progress towards open science is slow, because it has to fit in a system that provides appropriate incentives to all parties. Of course, Dr. Rossi can advance his research faster by having access to Dr. Svensson’s data, but what is the rationale for Dr. Svensson to share her data if no one includes data citation metrics in the career assessment criteria?

The European Commission has recognised this challenge and moved forward with strong initiatives from the initial 2012 recommendation on scientific information (C(2012)4890), such as the Open Science Policy Platform and the European Open Science Cloud. Open access and open data are now the default option for grantees of H2020.

The Open Science Monitor (OSM) aims to provide the data and insight needed to support the implementation of these policies. It gathers the best available evidence on the evolution of Open Science, its drivers and impacts, drawing on multiple indicators as well as on a rich set of case studies. [1]

This monitoring exercise is challenging. Open science is a fast-evolving, multidimensional phenomenon. According to the OECD (2015), “open science encompasses unhindered access to scientific articles, access to data from public research, and collaborative research enabled by ICT tools and incentives”. This very definition confirms the relative fuzziness of the concept and the need for a clear definition of the "trends" that compose open science.

Precisely because of the fast evolution and novelty of these trends, in many cases it is not possible to find consolidated, widely recognized indicators. For more established trends, such as open access to publications, robust indicators are available through bibliometric analysis. For most others, such as open code and open hardware, there are no standardised metrics or data gathering techniques, and there is the need to identify the best available indicator that allows one to capture the evolution and show the importance of the trend.

The present document illustrates the methodology behind the selected indicators for each trend. The purpose of the document is to ensure transparency and to gather feedback in order to improve the selected indicators, the data sources and the overall analysis.

The initial launch of the OSM contains a limited number of indicators, mainly updating the existing indicators from the previous Monitor (2017). New trends and new indicators will be added in the course of the OSM project, also based on the feedback to the present document.
[1] The OSM was published in 2017 as a pilot and re-launched by the European Commission in 2018 through a contract with a consortium composed of the Lisbon Council, ESADE Business School and CWTS of Leiden University (plus Elsevier as subcontractor). See https://ec.europa.eu/research/openscience/index.cfm?pg=home&section=monitor
1.1 Objectives
The OSM covers four tasks:
1. To provide metrics on the open science trends and their development.
2. To assess the drivers (and barriers) to open science adoption.
3. To identify the impacts (both positive and negative) of open science.
4. To support evidence-based policy actions.

The indicators presented here focus mainly on the first two tasks: mapping the trends, and understanding the drivers (and barriers) for open science implementation. The chart below provides an overview of the underlying conceptual model.

Figure 1: A conceptual model: an intervention logic approach

The central aspect of the model refers to the analysis of the open science trends and is articulated along three dimensions: supply, uptake and reuse of scientific outputs. In the OSM framework, supply refers to the emergence of services such as data repositories. The number of data repositories (one of the existing indicators) is a supply indicator of the development of Open Science. On the demand side, indicators include, for example, the amount of data stored in the repositories and the percentage of scientists sharing data. Finally, because of the nature of Open Science, the analysis will go beyond usage, since the reuse dimension is particularly important. In this case, relevant indicators include the number of scientists reusing data published by other scientists, or the number of papers using these data.

On the left side of the chart, the model identifies the key factors influencing the trends, both positively and negatively (i.e. drivers and barriers). Both drivers and barriers are particularly relevant for policy-makers, as this is the area where an action can make the greatest difference, and are therefore strongly related to policy recommendations. These include “policy drivers”, such as funders’ mandates. It is important to assess not only policy drivers dedicated to open science, but also more general policy drivers that could have an impact on the uptake of open science. For instance, the increasing reliance on performance-based funding or the emphasis on market exploitation of research are general policy drivers that could actually slow down the uptake of open science.

The right side of the chart illustrates the impacts of open science: on research, or the scientific process itself; on industry, or the capacity to translate research into marketable products and services; and on society, or the capacity to address societal challenges.
1.2 Scope
By definition, open science concerns the entire cycle of the scientific process, not only open access to publications. Hence the macro-trends covered by the study include: open access to publications, open research data and open collaboration. While the first two are self-explanatory, open scientific collaboration is an umbrella concept to include forms of collaboration in the course of the scientific process that do not fit under open data and open publications.

Table 1: Articulation of the trends to be monitored

Categories | Trends
Open access to publications | Open access policies (funders and journals); green and gold open access adoption (bibliometrics) [2]
Open research data | Open data policies (funders and journals); open data repositories; open data adoption and researchers’ attitudes
Open collaboration | Open code; altmetrics; open hardware; citizen science

New trends within the open science framework will be identified through interaction with the stakeholder community, by monitoring discussion groups, associations (such as the Research Data Alliance, RDA), mailing lists, and conferences such as those organised by Force11 (www.force11.org).
[2] According to the EC, “‘Gold open access’ means that open access is provided immediately via the publisher when an article is published, i.e. where it is published in open access journals or in ‘hybrid’ journals combining subscription access and open access to individual articles. In gold open access, the payment of publication costs (‘article processing charges’) is shifted from readers’ subscriptions to (generally one-off) payments by the author. […] ‘Green open access’ means that the published article or the final peer-reviewed manuscript is archived by the researcher (or a representative) in an online repository.” (Source: H2020 Model Grant Agreement)
The study covers all research disciplines, and aims to identify the differences in open science adoption and dynamics between diverse disciplines. Current evidence shows diversity in open science practices in different research fields, particularly in data-intensive research domains (e.g. life sciences) compared to others (e.g. humanities).

The geographic coverage of the study is the 28 Member States (MS) and the G8 countries, including the main international partners, with different degrees of granularity for the different variables. As far as possible, data has to be presented at country level.

Finally, the analysis focuses on the factors at play for different stakeholders, as mapped in the table below (Table 2). For each stakeholder category, the OSM will deliberately consider both traditional players (e.g. Thomson Reuters) and new players in research (e.g. F1000).

Table 2: Stakeholder types

Researchers | Professional and citizen researchers
Research institutions | Universities, other publicly funded research institutions, and informal groups
Publishers | Traditional publishers; new OA online players
Service providers | Bibliometrics and new players
Policy makers | At supranational, national and local level
Research funders | Private and public funding agencies
2 Indicators and data sources

Because of the fast-evolving and multidimensional nature of open science, a wide variety of indicators have been used, depending on data availability:

- Bibliometrics: this is the case for open access to publications indicators, and partially for open data and altmetrics.
- Online repositories: there are many repositories dedicated to providing a wide coverage of the trends, such as policies by funders and journals, APIs and open hardware.
- Surveys: surveys of researchers shed light on usage and drivers. Preference is given to multi-year surveys.
- Ad hoc analysis in scientific articles or reports: for instance, reviews of journals’ policies with regard to open data and open code.
- Data from specific services: open science services often offer data on their uptake, as for Sci-starter or Mendeley. In this case, data offer limited representativeness about the trend in general, but can still be useful to detect differences (e.g. by country or discipline). Where possible, in this case, we present data from multiple services.
2.1 Open access to publications
This trend received a lot of attention from commenters, mainly because of the exclusive reliance on the Scopus database. The consortium has not received evidence to dispute that Scopus data allow for the necessary data quality, especially since the “open access” tagging is exclusively performed by the consortium partners. In addition, to improve robustness, we updated the methodology by adding Unpaywall data to provide the best possible coverage, and by adding a dedicated analysis that will check the effects on the data of using alternative databases such as Web of Science. More details are provided in the updated Annex 1. Additionally, data from Scopus can be made available to individual academic researchers to assess or replicate the OSM methodology, under the standing policy of Elsevier to permit academic research access to Scopus data.

Besides the long list of indicators below, the detailed methodology for calculating the percentage of OA publications is presented in Annex 1.

Indicator | Source
Number of funders with open access policies (with the caveat that it is skewed towards western countries) | Sherpa Juliet [3]
Number of journals with open access policies (with the caveat that it is skewed towards western countries) | Sherpa Romeo [4]
Number of publishers/journals that have adopted the TOP Guidelines (including the level of adoption and actual implementation, where possible) | Cos.io
P - # Scopus publications that enter in the analysis | Scopus, Unpaywall
P(oa) - # Scopus publications that are Open Access (CWTS method for OA identification) | Scopus, Unpaywall
P(green oa) - # Scopus publications that are Green OA | Scopus, Unpaywall
P(gold oa) - # Scopus publications that are Gold OA | Scopus, Unpaywall
PP(oa) - Percentage OA publications of total publications | Scopus, Unpaywall
PP(green oa) - Percentage green OA publications of total publications | Scopus, Unpaywall
PP(gold oa) - Percentage gold OA publications of total publications | Scopus, Unpaywall
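
The percentage indicators in the table are simple ratios of the count indicators to P. As a minimal sketch of that arithmetic (the function name `oa_shares` and the example counts are hypothetical and used for illustration only; they are not OSM results or consortium code):

```python
def oa_shares(p, p_oa, p_green, p_gold):
    """Return PP(oa), PP(green oa) and PP(gold oa) as percentages of P."""
    if p <= 0:
        raise ValueError("P must be positive")
    for name, count in (("P(oa)", p_oa), ("P(green oa)", p_green), ("P(gold oa)", p_gold)):
        if not 0 <= count <= p:
            raise ValueError(f"{name} must lie between 0 and P")
    # Green and Gold publications are subsets of all OA publications.
    if max(p_green, p_gold) > p_oa:
        raise ValueError("Green and Gold counts cannot exceed P(oa)")
    return {
        "PP(oa)": 100.0 * p_oa / p,
        "PP(green oa)": 100.0 * p_green / p,
        "PP(gold oa)": 100.0 * p_gold / p,
    }


# Made-up counts, for illustration only.
example = oa_shares(p=200, p_oa=80, p_green=60, p_gold=30)
print(example)
```

Note that PP(green oa) and PP(gold oa) need not add up to PP(oa), because a publication can carry both labels and because other OA types also count towards P(oa).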
[3] http://v2.sherpa.ac.uk/juliet/
[4] http://www.sherpa.ac.uk/romeo/index.php?la=en&fIDnum=|&mode=simple
2.2 Open research data
Several of the comments received were useful to identify new data sources to measure open data publication, and these have been added. There were several criticisms of using Elsevier to gather data through the survey, but no valid alternatives of comparable quality and cost/efficiency were proposed. Moreover, data from the Elsevier survey will be openly released, as last year. The detailed methodology for developing the Elsevier survey is presented in Annex 4.

Several comments pointed to the need for measuring new additional aspects, such as “number of papers based on openly available raw data”. However, no concrete proposals were made about sources. We will follow up with those commenting to obtain further detail.

Indicator | Source
Number of funders with policies on data sharing (with the caveat that it is skewed towards western countries) | Sherpa Juliet
Number of journals with policies on data sharing | Vasilevsky et al, 2017 [5]
Number of open data repositories | Re3data
% of papers published with data | Bibliometrics: Datacite
Citations of data journals | Bibliometrics: Datacite
Attitude of researchers on data sharing | 2016 and 2018 survey by Elsevier, follow-up of the 2017 report [6]
Sharing of research data: % of researchers that have directly shared research data from their last project, by recipient | 2016 and 2018 survey by Elsevier
Benefits of sharing research data: % of researchers per benefit | 2016 and 2018 survey by Elsevier
Consequences of sharing data: contact made with researchers outside their research team after sharing data, % of researchers by type of organisation | 2018 survey by Elsevier
Making research data available: effort required to make research data reusable by others, % of researchers per amount of effort | 2016 and 2018 survey by Elsevier
Management of research data: % of researchers that take steps to manage their research data and/or archive it for potential reuse by themselves or others | 2016 and 2018 survey by Elsevier
Attitudes of researchers: % of researchers that agree with statement | 2016 and 2018 survey by Elsevier
Number and/or total size of CC-0 datasets | Base-search.net
Number of OAI-compliant repositories | Base-search.net
Number of repositories with an open data (https://opendefinition.org/) policy for metadata | OpenDOAR, "commercial" in metadata reuse policy. https://opendefinition.org/

[5] Vasilevsky, Nicole A., Jessica Minnier, Melissa A. Haendel, and Robin E. Champieux. “Reproducible and Reusable Research: Are Journal Data Sharing Policies Meeting the Mark?” PeerJ 5 (April 25, 2017): e3208. doi:10.7717/peerj.3208.
[6] Berghmans, Stephane, Helena Cousijn, Gemma Deakin, Ingeborg Meijer, Adrian Mulligan, Andrew Plume, Sarah de Rijcke, et al. “Open Data: The Researcher Perspective,” 2017, 48 p. doi:10.17632/bwrnfb4bvh.1.
2.3 Open collaboration
Indicator | Source
Membership of social networks on science (Mendeley, ResearchGate, F1000) | Scientific social networks
2.3.1 Open code
Several comments addressed this issue, mainly by suggesting new data sources to be used. They are tentatively included here for discussion. Several suggestions did not include sources and are not listed here for the time being, pending additional analysis.

Indicator | Source
Number of code projects with DOI | Mozilla Codemeta
Number of scientific APIs | Programmableweb
% of journals with open code policy | Stodden 2013 [7]
Software citations in DataCite | Datacite
Number of code projects in Zenodo | Zenodo
Number of software deposits under an OSI-approved license | Base
Number of software papers in software journals | e.g. JORS (https://openresearchsoftware.metajnl.com/) and others
Number of users in reproducibility platforms such as CodeOcean | CodeOcean

[7] Stodden, V., Guo, P. and Ma, Z. (2013), “Toward reproducible computational research: an empirical analysis of data and code policy adoption”, PLoS One, Vol. 8 No. 6, p. e67111. doi:10.1371/journal.pone.0067111.
2.3.2 Open scientific hardware
The few comments received here pointed to the limited importance of open hardware licenses, because of the fragmentation across the EU. That indicator has therefore been removed.

Indicator | Source
Number of projects on open hardware repository | Open Hardware repository [8]
2.3.3 Citizen science
The very few comments received did not include additional sources and are therefore not included for the time being.

Indicator | Source
Number of projects in Zooniverse and Scistarter | Zooniverse and Scistarter
Number of participants in Zooniverse and Scistarter | Zooniverse and Scistarter
2.3.4 Altmetrics
The feedback received in this case was highly critical of the dependence on Plum Analytics and Mendeley. Based on the feedback received, the consortium will keep the indicators as such, but will perform additional checks and analysis using alternatives to Plum Analytics, such as Altmetric.com, as suggested by the comments. As for Mendeley, it is the only source currently available providing open data about readership and will therefore continue to be used. The indicators will be reassessed once the data become available.

Indicator | Source
P(tracked) - # Scopus publications that can be tracked by the different sources (e.g. typically only publications with a DOI, PMID, Scopus id, etc. can be tracked) | Scopus & Plum Analytics
P(mendeley) - # Scopus publications with readership activity in Mendeley | Scopus, Mendeley & Plum Analytics
PP(mendeley) - Proportion of publications covered on Mendeley. P(mendeley)/P(tracked) | Scopus, Mendeley & Plum Analytics
TRS - Total Readership Score of Scopus publications. Sum of all Mendeley readership received by all P(tracked) | Scopus, Mendeley & Plum Analytics
TRS(academics) - Total Readership Score of Scopus publications from Mendeley academic users (PhDs, professors, postdocs, researchers, etc.) | Scopus, Mendeley & Plum Analytics
TRS(students) - Total Readership Score of Scopus publications from Mendeley student users (Master and Bachelor students) | Scopus, Mendeley & Plum Analytics
TRS(professionals) - Total Readership Score of Scopus publications from Mendeley professional users (librarians, other professionals, etc.) | Scopus, Mendeley & Plum Analytics
MRS - Mean Readership Score. TRS/P(tracked) | Scopus & Plum Analytics
MRS(academics) - TRS(academics)/P(tracked) | Scopus & Plum Analytics
MRS(students) - TRS(students)/P(tracked) | Scopus & Plum Analytics
MRS(professionals) - TRS(professionals)/P(tracked) | Scopus & Plum Analytics
P(twitter) - # Scopus publications that have been mentioned in at least one (re)tweet | Scopus & Plum Analytics
PP(twitter) - Proportion of publications mentioned on Twitter. P(twitter)/P(tracked) | Scopus & Plum Analytics
TTWS - Total Twitter Score. Sum of all tweet mentions received by all P(tracked) | Scopus & Plum Analytics
MTWS - Mean Twitter Score. TTWS/P(tracked) | Scopus & Plum Analytics

[8] https://www.ohwr.org
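
The ratios in the table can be illustrated with a short sketch. The input format (one pair of Mendeley reader count and tweet count per tracked publication), the function name `altmetric_indicators` and the sample values are assumptions made for this illustration; they do not reflect the Plum Analytics or Mendeley data formats.

```python
def altmetric_indicators(tracked):
    """Compute the readership and Twitter indicators defined in the table.

    `tracked` holds one (mendeley_readers, tweet_mentions) pair per tracked
    publication, so len(tracked) is P(tracked).
    """
    p_tracked = len(tracked)
    if p_tracked == 0:
        raise ValueError("no tracked publications")
    p_mendeley = sum(1 for readers, _ in tracked if readers > 0)
    p_twitter = sum(1 for _, tweets in tracked if tweets > 0)
    trs = sum(readers for readers, _ in tracked)   # Total Readership Score
    ttws = sum(tweets for _, tweets in tracked)    # Total Twitter Score
    return {
        "P(tracked)": p_tracked,
        "P(mendeley)": p_mendeley,
        "PP(mendeley)": p_mendeley / p_tracked,
        "TRS": trs,
        "MRS": trs / p_tracked,
        "P(twitter)": p_twitter,
        "PP(twitter)": p_twitter / p_tracked,
        "TTWS": ttws,
        "MTWS": ttws / p_tracked,
    }


# Made-up values, for illustration only.
sample = [(10, 0), (0, 3), (5, 1), (0, 0)]
print(altmetric_indicators(sample))
```

The group-specific scores (academics, students, professionals) would follow the same pattern, summing only the readership from the relevant Mendeley user group and dividing by the same P(tracked).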
3 Next steps

The consortium will deliver a new set of case studies by the end of July 2019.

The consortium will continue revising the methodology with the community, through an open LinkedIn group.
Annex 1: Methodological note on the implementation of Unpaywall data into the Open Access labelling of the Open Science Monitor

Thed van Leeuwen & Rodrigo Costas
Centre for Science and Technology Studies (CWTS), Leiden University, the Netherlands
Introduction
This document presents the methodological approach for the identification and creation of the Open Access (OA) labels for the Open Science Monitor. CWTS has been working on OA evidence over the last three years, and this work was reported at the Paris 2017 STI Conference (van Leeuwen et al, 2017). In this method we strived for a high degree of reproducibility of our results, based upon data carrying OA labels that follow from the methodology we developed, which relies on free and open data sources (DOAJ, ROAD, CrossRef, PMCentral and OpenAIRE). This was applied to Elsevier’s Scopus publication data.

From mid-2018 onwards, another source for OA tagging became prominent, namely the Unpaywall database (https://unpaywall.org/). CWTS is (still) working on integrating this data source into the current analysis. This involves basic research on the implementation, in which we compare the previously developed methodology with the new one, e.g. comparing the numbers of publications labelled with OA tags under our methodology with the Unpaywall data (see also Martín-Martín et al, 2018). The inclusion of Unpaywall in the methodology requires us to conduct research to better understand what OA evidence data Unpaywall provides, whether all types of OA evidence align with our criteria for building OA evidence, and whether there are any potential conceptual issues related to some typologies of OA provided by Unpaywall (for example, we wonder whether the ‘Bronze’ OA typology disclosed by Unpaywall can be considered a sustainable form of OA, cf. Martín-Martín et al, 2018).

The methodological approach that we propose mainly focuses on adding different OA labels to the publications covered in the Scopus database, using Unpaywall to establish the OA status of scientific publications. It is important to highlight that two basic principles for this OA label are sustainability and legality. By sustainability we mean that it should, in principle, be possible to reproduce the OA labelling from the various sources used, repeatedly, in an open fashion, with a relatively limited risk of the sources used disappearing behind a pay-wall, and particularly of publications reported as OA changing their status to closed. The second aspect (legality) relates to the usage of data sources that represent legal OA evidence for publications, excluding rogue or illegal OA publications (i.e. we do not consider OA publications made freely available on platforms such as ResearchGate or Sci-hub as legal, and probably also not as sustainable). While the former criterion is mainly oriented to a scientific requirement, namely that of reproducibility and perdurability over time, the latter criterion is particularly important for science policy, indicating that OA publishing aligns with policies and mandates.

In the re-loading of the publication counts in the Open Science Monitor, Unpaywall data form the source for tagging Scopus publications with labels on Open Access availability. This is a change from the previous OA tagging method when it comes to the variety of OA tags and the procedure, while the criteria developed for the tagging of publications with OA labels remain intact.

In the implementation of Unpaywall data, we were expecting to be capable of distinguishing, next to Gold and Green, which we have used so far, also Hybrid OA as a further form of compliant OA publishing. However, we identified that in the current Unpaywall category of ‘Hybrid’ some mixing up of Gold with true Hybrid (which means APC-based publishing in an otherwise toll access journal) occurs, which blurs the perspective on the Hybrid category. Consequently, the category Gold will also be affected, as the figures presented here are likely to be a lower estimate of the real situation. We are currently working on sorting this out, in order to create better defined categories of OA types, distinguishing Gold, Green and Hybrid.

A fourth OA category in Unpaywall is ‘Bronze’, which is basically a form of openly available publishing initiated by the publishers, in which the copyright status is not clear. As under our criteria this is not a sustainable form of OA, we opt for not considering it as a separate OA category, although it will be included in the overall consideration of OA publications.

In this data delivery, and hence the re-loading of the OSM website, we take four different analytical approaches: overall (all publications), countries, fields, and fields & countries combined. We do this in two ways, one for the full period, the other for the trend analysis from 2009-2017.

The following OA indicators are calculated:
- Total number of publications: this is the overall number of publications, which is used as the denominator for the calculation of shares of OA.
- Total (and share of) OA: the overall number (and share) of OA available publications (covering all types of OA recorded by Unpaywall, namely Gold, Green, Hybrid and Bronze). With this we intend to be fully in line with the Unpaywall data in the disclosure of OA availability.
- Green OA and Gold OA: in this data delivery we report publication counts (and shares) of Green and Gold OA publications separately. This means that we do not apply any preference approach (e.g. giving priority to Gold over Green when a publication can be labelled as both). The rationale behind this choice is that different types of OA are of different interest to different stakeholders (e.g. readers, authors, academic institutions, funders, etc.), and at this stage we opt for leaving them in their most original form, so that both types of OA can be fully reported.
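
The counting rules in the list above can be sketched as follows. This is an illustrative sketch only: the label strings, the identifiers and the function name `oa_label_counts` are hypothetical and do not represent the Unpaywall schema or the CWTS implementation. It shows that every OA type (including Hybrid and Bronze) counts towards the overall OA total, while Green and Gold are each counted without any priority rule, so a publication labelled as both is counted in both.

```python
OA_TYPES = frozenset({"gold", "green", "hybrid", "bronze"})


def oa_label_counts(labels_by_pub):
    """Count overall, Green and Gold OA publications and their shares.

    `labels_by_pub` maps a publication identifier to the set of OA labels
    attached to it (an empty set means no OA evidence was found).
    """
    total = len(labels_by_pub)
    counts = {"oa": 0, "green": 0, "gold": 0}
    for labels in labels_by_pub.values():
        oa_labels = OA_TYPES & set(labels)
        if oa_labels:                 # any OA type counts towards overall OA
            counts["oa"] += 1
        if "green" in oa_labels:      # no priority rule: Green and Gold are
            counts["green"] += 1      # counted independently of each other
        if "gold" in oa_labels:
            counts["gold"] += 1
    shares = {k: (v / total if total else 0.0) for k, v in counts.items()}
    return total, counts, shares


# Made-up labels, for illustration only.
sample = {
    "pub-a": {"gold", "green"},
    "pub-b": {"bronze"},
    "pub-c": set(),
    "pub-d": {"green"},
}
print(oa_label_counts(sample))
```

In this example Bronze contributes to the overall OA count but is not reported as a category of its own, which mirrors the choice described above.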

Finally, regarding the underlying publication data, we have restricted the analysis to only those publications having a DOI in Scopus, since currently Unpaywall only provides OA labels for publications with DOIs. The inclusion of all publications, with and without DOIs, could lead to an underestimation of OA prevalence, since those publications without DOIs would increase the denominator, while they cannot be tracked for OA. Future developments will also be oriented towards providing OA evidence for those publications without DOIs, thus broadening the OA analytical landscape. Furthermore, we have only worked with articles and reviews, and future developments will also consider the incorporation of other document types.
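
The effect of the DOI restriction on the denominator can be shown with a small worked example. The numbers and the function name `oa_share` are made up for illustration and are not OSM figures; the sketch assumes that OA evidence can only be found for publications with a DOI.

```python
def oa_share(n_oa, n_with_doi, n_without_doi=0):
    """Share of OA publications for a given choice of denominator.

    OA evidence is assumed to exist only for publications with a DOI, so
    n_oa can never exceed n_with_doi.
    """
    denominator = n_with_doi + n_without_doi
    if denominator <= 0:
        raise ValueError("the denominator must be positive")
    if not 0 <= n_oa <= n_with_doi:
        raise ValueError("n_oa must lie between 0 and n_with_doi")
    return n_oa / denominator


# Made-up example: 800 publications with a DOI (320 of them OA)
# and 200 publications without a DOI that cannot be checked for OA.
doi_only = oa_share(320, 800)        # denominator restricted to DOIs
all_pubs = oa_share(320, 800, 200)   # denominator includes untrackable items
print(f"DOI-only share: {doi_only:.2f}, all-publication share: {all_pubs:.2f}")
```

With these illustrative numbers the share drops from 0.40 to 0.32 once untrackable publications are added to the denominator, which is the underestimation the DOI restriction is meant to avoid.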
References
van Leeuwen, T.N., Meijer, I., Yegros-Yegros, A., & Costas, R. (2017). Developing indicators on Open Access by combining evidence from diverse data sources. Proceedings of the 2017 STI Conference, 6-8 September, Paris, France (https://sti2017.paris/) (https://arxiv.org/abs/1802.02827)

Martín-Martín, A., Costas, R., van Leeuwen, T., & Delgado López-Cózar, E. (2018). Evidence of Open Access of scientific publications in Google Scholar: a large-scale analysis. SocArXiv papers. DOI: 10.17605/OSF.IO/K54UV

Olensky, M., Schmidt, M., & Van Eck, N.J. (2016). Evaluation of the Citation Matching Algorithms of CWTS and iFQ in Comparison to the Web of Science. Journal of the Association for Information Science and Technology, 67(10), 2550-2564. doi:10.1002/asi.23590.
Annex 2: Answer to comments

Below, the comments received online are grouped under headings, based on their content. At the end of each answer, the relevant comment ids are listed in parentheses. The full comments with ids are available online.
Open access
Comment: Only open sources should be used, not proprietary data, since open data sources already exist.
Answer: There are today no open data sources that offer the richness of metadata provided by proprietary sources. Crossref in particular lacks several fields that are crucial to the work of the Open Science Monitor. The full explanation of the differences and the necessity to use proprietary data is provided in the slides presented in the workshop.
(1431,1428,1289,1288,1280,1281,1305,1315,1340,1346,1379,1381,1479,1480,1487,1265,1272,1341,1309,1342,1426,1455,1268,1343,1290,1468,1472,1422,1424,1449)

Comment: Scopus is biased and has a conflict of interest because it’s owned by Elsevier.
Answer: It is the consortium that develops the indicators, while Elsevier only provides underlying data for some indicators. In particular, it is CWTS that attributes the open access tag. Scopus has biases, as all other sources have, and they are known and treated transparently by the consortium, but it remains a fundamental and high-quality instrument for bibliometric analysis. The role of the consortium is precisely to develop robust indicators taking into account the limitations of the different sources.
(1344,1382,1456,1345)

Comment: Data are not accessible for replication because they are based on a proprietary database.
Answer: Scopus can be made available to individual academic researchers to assess or replicate the OSM methodology, under the standing policy of Elsevier to permit academic research access to Scopus data. Requests outlining data requirements and research scope should be submitted through the project email ([email protected]).
(1430,1397,1440)

Comment: Multiple sources should be used to ensure robustness.
Answer: To address this comment, the consortium will carry out and publish an ad hoc additional analysis, repeating the same analysis based on Web of Science. In addition, the consortium will use Unpaywall data alongside Scopus data in the database used to create the headline indicators.
(1266,1311,1396,1466,1484,1492,1267,1467,1269,1275,1325)

Comment: Sources have insufficient coverage (in terms of journals, disciplines, countries, monographs).
Answer: With regard to Scopus, to widen the scope and capture, the consortium has obtained access to Unpaywall data, which has a larger footprint and will be integrated in the analysis in addition to Scopus.
(1392,1429,1438,1439,1442,1454,1469,1483,1306,1432,1470,1481,1308,1471,1352,1444,1445,1459,1489,1490)
With regard to Sherpa, unfortunately this limitation is unavoidable. The information about the biased coverage will be included in the presentation of the indicators.
(1420)

Comment: Indicators should not take into account impact factor and related issues.
Answer: The consortium agrees. The indicators related to “highly cited” journals have been removed.
(1326,1347,1457,1493,1348,1349,1441,1286,1287,1312,1316,1328,1350,1401,1458,1473)

Comment: New indicators and sources.
Answer: The consortium received many useful proposals, but only a few of them are immediately actionable. Most proposals need additional effort, and some are not deemed relevant. To enable this effort as well as additional collaboration on any indicator, the consortium will set up additional collaboration spaces, beyond the one-off consultation about the methodology.
(1327,1298,1329,1515,1545,1546,1548,1549,1282,1283,1297,1330,1355,1402,1403,1406,1463,1474,1390,1465)
Some comments were out of scope, based on the tender requirements.
(1351,1443,1303,1357,1359,1398,1446,1495,1499,1509,1510,1513,1400,1291,1399,1496,1497,1498,1299,1285,1360,1384)
Open research data
Alternativeprovider toElsevier for thesurveybecauseofnegativeperceptionandconflictofinterestTherewereseveralcriticismsofusingElsevier togatherdatathroughthesurvey,butnovalidalternativesofcomparablequalityandcost/efficiencywereproposed.Inthiscasetoo,theconsortiumisresponsible for thedefinitionof thesurveyandtheconstructionof theindicator. The survey was already carried out in 2017, with positive reception by thecommunity,andcontinuityisavalueaddedoftheanalysis.Moreover,fullanonymiseddatafromthesurveywillbeopenlyreleased,justasin2017.(1270,1314,1356,1361,1378,1417,1425,1523)UsealternativesurveyssuchasFigshare’sTheconsortiumalreadyincludestheresultsofothersurveys,suchasFigshare’s2017survey,inthedashboard.Whenavailable,newdatawillbeadded.(1356,1523)
Sources have insufficient coverage
With regard to Sherpa, unfortunately this limitation is unavoidable. The information about the biased coverage will be included in the presentation of the indicators.
(1421)
New indicators and sources
The consortium received many useful proposals, but only a few of them included immediately usable data sources.
(1292, 1300, 1301, 1211, 1296)
Most proposals need additional effort. To enable this effort as well as additional collaboration on any indicator, the consortium will set up additional collaboration spaces, beyond the one-off consultation about the methodology.
(1318, 1460, 1504, 1505, 1506, 1507, 1508, 1332, 1333, 1522, 1358, 1385, 1414, 1518)
Some suggestions were relevant for other sections.
(1388, 1415, 1517)
Other suggestions were deemed out of scope or not relevant enough.
(1503, 1270, 1314, 1361, 1378, 1417, 1425, 1302, 1331, 1304)
Open collaboration
Alternative provider to Elsevier for the survey because of conflict of interest
The answer is similar to that given to the previous comments. It is the consortium that is responsible for processing the data and building the indicators. Sources are assessed purely on merit. In particular, Plum offers high-value data that are needed for the monitor.
(1310, 1364, 1393, 1365, 1408, 1370, 1371, 1372, 1373, 1374, 1278, 1279, 1313, 1319, 1323, 1375, 1416, 1488)

Need to avoid using proprietary data
Proprietary data are used where no open data are available, and there are no open data available on altmetrics. The alternative would be using Altmetric.com, which is also proprietary. On a different note, Mendeley provides reading statistics as open data, which are useful to elaborate indicators, although obviously limited in scope. Appropriate disclaimers will be included in the dashboard.
(1215, 1380, 1383, 1389, 1411, 1257, 1258)

Use multiple sources
With regard to altmetrics, the obvious alternative is altmetric.com, which requires a license. The consortium will investigate the feasibility of the license, in order to carry out ad hoc "robustness checks" for the analysis. With regard to readership data, Mendeley is the only provider of open data on this.
(1394, 1407, 1435, 1256, 1337, 1447, 1461, 1464, 1476, 1485, 1338, 1263, 1262, 1410, 1255)

Remove some indicators because not valid
The consortium agrees to remove some indicators of limited validity, in particular the indicators related to open code, since GitHub and other repositories do not provide a way to define coding projects related to science.
(1334, 1254, 1386, 1320)
With regard to readership and social media, the indicators are considered important and useful. They will be reassessed at the time of the analysis.
(1409, 1448, 1486, 1259, 1367, 1260, 1277, 1336, 1368, 1261, 1335, 1369, 1529)
New indicators and sources
Some comments included new indicators and sources, which will be included in the methodology.
(1213, 1294, 1387)
Other comments had interesting proposals but no feasible sources. Ongoing collaboration will take place to better define the indicators and the sources.
(1434, 1433, 1321, 1322, 1363, 1524, 1527, 1395, 1528, 1228, 1339, 1362, 1391, 1437, 1477, 1520, 1530, 1521, 1295, 1212, 1239, 1225, 1376, 1462)
Finally, some comments were deemed out of scope or contained suggestions for indicators not relevant enough.
(1257, 1258, 1404, 1214, 1293, 1405, 1436, 1451, 1475, 1525, 1526, 1264, 1377, 1412, 1450, 1452, 1453, 1511, 1512, 1531, 1532, 1533, 1534, 1535, 1536, 1537, 1538, 1539, 1540, 1541)
Annex 3: Methodological approach for the case studies
Research Overview
The study employs a multiple case study of a thematic sampling of 30 research projects across different disciplines and countries worldwide to support the assessment of the drivers, barriers and impact of open science (Eisenhardt, 1989). The projects are selected across the open science trends: open access, open data, open peer review, open science hardware, open code, reproducible science, citizen science, and open collaboration. The selection has emphasised open science trends where there is a lack of bibliometric data or where the quantitative data available is anecdotal.

In order to make the selection, the study has dynamically generated a database of open science projects, that is, research projects that have adopted at least one of the trends or projects dedicated to supporting the development of open science. The notion of a research project is recognised as a unit of analysis by researchers, institutions and funders alike. The database is continuously enriched by the study members based on desk research, community members' recommendations and by mining the data of open science platforms. The selection of cases has been made in three sequential phases (M1, M6, M12) to provide the case analysis to the Open Science Monitor in cascade, while in parallel the project database is being completed. The project database includes descriptive data about the research projects: types of trend, discipline, country, duration, and others. The list is dynamically used for identifying the case studies to be carried out.

From the preliminary list of cases in the project database, we have selected a cross-section of 13 instrumental cases (Stake 1995) along the open science trends that facilitated our understanding of the drivers, barriers and impacts (i.e., to science, industry, and society) of open science. Three major types of cases have been performed:
i. Policy cases, which refer to case studies devoted to studying open science policies at different governmental levels (i.e., national, regional and funder policies) across Europe.
ii. In-depth case studies, which are exploratory studies of open science projects that have combined multiple data collection methods, including secondary data analysis, semi-structured interviews and/or study visits (observations). The average length of the analysis provided is around 7000 words.
iii. Informative case studies, which are descriptive case studies about open science projects that have been carried out through desk research by analyzing available secondary data. The average length of the analysis provided is around 3000 words.
Data Collection
The data collection process focused on a diverse set of primary and secondary data. For the in-depth case studies, primary data collected up to January 2019 comprised 13 semi-structured interviews and direct observation from one study visit for one of the case studies (i.e., White Rabbit).
Interviewees for the cases were chosen on the initial recommendation of the leader of the open science project appearing in the public sources, with subsequent recommendations from the interviewees. The goal was, in general for the in-depth cases, to interview a representative cross-section of the project team. Secondary sources included information retrieved from publications, project repositories, wikis, websites, blogs and other social media information referring to the project, amongst others.

Table 1 Number of interview respondents per case
Case study title                  Nº interviews
White Rabbit                      3
Open Targets                      3
Pistoia Alliance                  3
UK policy                         1
Finland policy                    1
UK Research Software Engineers    1
DataCite                          1
NL policy                         0
Comparing WoS and Scopus          0
Total                             13
Data Analysis
The study has employed an embedded design (Yin, 2009), focusing on each open science project at three levels: (1) management team in the project; (2) project characteristics; (3) organization characteristics and policies. The analysis of cases includes:
1) An analysis per case, as an entity in itself, in which researchers have provided: background information about the case, which includes a literature review related to the open science trend where the project is located and some contextual information about the project itself; drivers, which uncover the motivations and main driving forces making the project possible; barriers, which describe the major bottlenecks and challenges encountered by the project team in the course of the project; and impact, that is, the direct effects of the project on the scientific community and the business ecosystem, and its social benefits at large.
2) An aggregated analysis of cases (i.e., cross-analysis). The multiple-case design allows a replication logic, treating the cases as a series of experiments and focusing on identifying the unique patterns of each case and finding patterns across different cases. Identifying similarities and differences between cases helps us to categorise the different mechanisms that operate in the different projects and relationships. While the analysis per case has been done by a single research team of one of the organizations involved in the Open Science Monitor (i.e., CWTS, ESADE, and LC), the cross-analysis has been performed by a mixed research team combining members of the three organizations. The tasks for the cross-analysis were distributed among the mixed research team, and different group discussions supported the analysis, which has been performed in several iterations.
Two cross-analyses have been envisaged: the first one in January 2019 (included in the present report), which includes 13 cases; and the last one scheduled for M24, which will include all 30 case studies. The preliminary results of the cross-analysis will be shared with the Open Science Monitor Advisory Board, composed of 15 members from across the open science community. The research team will revise the analysis according to the feedback provided by the experts in the advisory group.
Status: Overview of case studies performed
At this stage of the Open Science Monitor (January 2019), the following case studies have been performed:

Table 2 Overview of case studies (up until January 2019)
Trends (table columns): Open access, Open data, Open collaboration, Open code, Open hardware, Open review

Type of case        Cases
Policy case         Netherlands; Finland; United Kingdom; Research Software Engineers (UK)
In-depth case       Open Targets; White Rabbit; Pistoia Alliance; DataCite; Web of Science & Scopus
Informative case    F1000 (appears under two trends); Reana; Yoda
Annex 4: Methodological approach for the survey
Study design
The study is cross-sectional: an online survey delivered by email to respondents using the Confirmit survey platform. The survey provides a snapshot of the current scientific environment and of researchers' attitudes towards open data and its reuse in research. As we had previously collected data on this topic in 2016, we were able to compare the results with the earlier data.
An e-mail was sent to active researchers requesting them to participate in a survey on open data. Researchers were selected from the Scopus database of published researchers. The survey invitation was sent out on 27th September 2018, with a reminder sent a week later.

Participants
Researchers were randomly selected from the Scopus database of published researchers, with the sample profiled so that country and subject area speciality were tracked to ensure sufficient responses. These were measured against the known distribution of researchers according to the OECD and UNESCO. The respondents were able to change their answers at any time before submitting the filled-out questionnaire, but not after. Emails sent to the participants contained unique links to access the survey.
Study size
The survey was sent to just over 40,000 researchers, and the reminders were sent to the non-respondents. A total of 1,029 responses were received (2.5% response rate).

Informed consent and ethics approval
Participants were informed in the invitation letter and in the survey description about the purpose of the study, the research team behind the survey, and the median time (15 minutes) needed to complete the survey (based on our survey pilot data). They were also informed that the data would be made publicly available, that no identifying information would be shared, and that by filling out the questionnaire they would be giving their consent to participate in the research.
Incentives
No incentives, except the option to be alerted of the study results, were offered to the participants.
Data sources/measurement
Respondents were required to answer all the survey questions or choose the 'do not know/not applicable' option.
Respondents could contact the investigators by email if they encountered any technical or other difficulties (the contact address was the reply email address and was also shown within the invitation).
Storage
All data was stored on the Confirmit platform during data collection. The anonymised survey data is available on Elsevier's data storage platform (Mendeley Data) and on our project's data repository site. Responses are confidential and stored in a secure environment.

Bias
As with any large anonymous online survey, response bias is possible, i.e. respondents' opinions may differ systematically from those of non-respondents because respondents are more interested in the topic of the survey. Survey responses were weighted to be representative of the researcher population (UNESCO counts of researchers). All results in the report are weighted; base sizes are unweighted.

Statistical methods
The variables will be presented as absolute numbers and percentages with (exact) 95% confidence intervals (CI).
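
As an illustration only, the weighting and the exact confidence interval described above could be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure actually used for the survey: the stratum labels, the counts and all function names are hypothetical, the weights are simple post-stratification weights (population share divided by sample share), and "exact" is read here as the Clopper-Pearson interval for an unweighted proportion.

```python
import math


def poststratification_weights(sample_counts, population_counts):
    """Weight per stratum = population share / sample share (assumed scheme)."""
    n_sample = sum(sample_counts.values())
    n_population = sum(population_counts.values())
    return {
        stratum: (population_counts[stratum] / n_population)
        / (sample_counts[stratum] / n_sample)
        for stratum in sample_counts
    }


def weighted_share(responses, weights):
    """Weighted share of 'yes' answers; responses is a list of (stratum, bool)."""
    total = sum(weights[stratum] for stratum, _ in responses)
    yes = sum(weights[stratum] for stratum, answer in responses if answer)
    return yes / total


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with terms computed in log space."""
    if k < 0:
        return 0.0
    if k >= n or p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 0.0
    total = 0.0
    for i in range(k + 1):
        log_term = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p)
        )
        total += math.exp(log_term)
    return min(total, 1.0)


def exact_ci(k, n, level=0.95):
    """Clopper-Pearson interval for k successes out of n, found by bisection."""
    alpha = 1.0 - level

    def solve(cdf_at, target):
        # cdf_at(p) decreases as p grows, so bisect on p in [0, 1].
        low, high = 0.0, 1.0
        for _ in range(100):
            mid = (low + high) / 2.0
            if cdf_at(mid) > target:
                low = mid
            else:
                high = mid
        return (low + high) / 2.0

    lower = 0.0 if k == 0 else solve(lambda p: binom_cdf(k - 1, n, p), 1.0 - alpha / 2.0)
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p), alpha / 2.0)
    return lower, upper


if __name__ == "__main__":
    # Hypothetical figures: two strata, and 600 'yes' answers out of 1029.
    weights = poststratification_weights({"A": 80, "B": 20}, {"A": 50, "B": 50})
    print(weights)
    print(exact_ci(600, 1029))
```

The standard library is used throughout so that the sketch has no dependencies; a statistics package would normally provide the interval directly.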
Annex 5: Participants to the experts' workshop and members of the advisory group
Andreas Pester, Researcher, Carinthia University of Applied Sciences
Barend Mons, Scientific Director, GoFair
Beeta Balali Mood, Consultant, Pistoia Alliance
David Cameron Neylon, Senior Scientist, Science and Technology Facilities Council Didcot
Emma Lazzeri, National Open Access Desk, The Italian National Research Council - Institute of Information Science and Technologies (CNR-ISTI)
George Papastefanatos, ESOCS, Research Associate, Management of Information Systems, Research Center "Athena"
Heather A. Piwowar, Cofounder, ImpactStory/Unpaywall
Jason Priem, Cofounder, ImpactStory/Unpaywall
Marin Dacos, Open Science Advisor to the Director-General for Research and Innovation, French Ministry of Higher Education, Research and Innovation
Michael Robert Taylor, Head of Metrics Development, Digital Science & Research Solutions Limited
Paolo Manghi, Technical Manager, Institute of the National Research Council of Italy
Paul Wouters, Professor of Scientometrics, Director, Centre for Science and Technology Studies, Leiden University
Rebecca Lawrence, Managing Director, F1000Research open for science
Roberta Dale Robertson, Open science policy senior analyst, JISC
Žiga Turk, Professor, University of Ljubljana