3 AquaRES toolsodnature.naturalsciences.be/downloads/aquares/AquaRES... · 2016-09-05 ·...

Post on 08-May-2020


AquaRES tools for taxonomic editors & external users

Leen Vandepitte

On behalf of the WoRMS DMT

• AquaCache – tool for comparing taxonomic checklists

• Improved data services in the framework of AquaRES

– Taxon match services

– Occurrence checking services

– General quality control and data format checking services

• Hands-on demo of data services

• Data exchange and tools targeting international initiatives

AquaCache

• From the project description:

– We will design and build a central data cache linking the three databases FADA, WoRMS and RAMS.

– This data cache will be hosted at VLIZ and is primarily meant to act as an internal system for running common web services in terms of taxon matching and data cleaning & refinement.

– Each of the databases will retain its import, update mechanism and quality control.

• In reality:

– The goal of the AquaCache is to serve as an internal data management tool, helping the involved editors to search through and compare lists and identify possible overlaps and discrepancies between lists available in more than one species register, e.g. on the level of higher classification or the status of a name.

– Lists can be uploaded into the AquaCache as Darwin Core Taxon files and need to include the Species Profile extension, indicating whether a taxon is marine, fresh, brackish or terrestrial.

– The functionalities of the AquaCache will be broadened in the future, depending on the needs of the involved editors.

– After the AquaRES project (2013-2016), the AquaCache will continue under the LifeWatch project, where its functionalities and applications can be further developed.

• AquaCache = management tool

• Search & compare the involved systems ("search")

• Taxonomy:

– Green: exact match (taxon name + higher classification)

– Yellow: taxon match identifies inconsistencies between the names (e.g. spelling variation)

– Red: higher classification is different

• Environment:

– !: flag missing in WoRMS

– X: flags differ between WoRMS & FADA

– V: flags correspond between WoRMS & FADA
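The colour and flag codes above can be read as a small decision rule. The following is a minimal sketch of that rule, not the AquaCache implementation: the `TaxonRecord` class and both function names are hypothetical, and giving "red" precedence when both the name and the classification differ is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class TaxonRecord:
    """Hypothetical record holding one name as stored in one register."""
    name: str
    classification: Tuple[str, ...]   # higher classification, e.g. ("Animalia", "Mollusca")
    is_marine: Optional[bool] = None  # None means the environment flag is missing


def taxonomy_colour(a: TaxonRecord, b: TaxonRecord) -> str:
    # Assumption: a different higher classification outranks a name difference.
    if a.classification != b.classification:
        return "red"
    if a.name != b.name:
        return "yellow"  # e.g. a spelling variation found by the taxon match
    return "green"       # name and higher classification are identical


def environment_flag(worms_marine: Optional[bool], fada_marine: Optional[bool]) -> str:
    if worms_marine is None:
        return "!"       # flag missing in WoRMS
    return "V" if worms_marine == fada_marine else "X"
```

For example, two records with the same classification but the names "Gammarus pulex" and "Gammarus pulax" would come out as "yellow".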

Work in progress…

Demonstration of AquaCache: http://aquacache.lifewatch.be/

Improved data services in the framework of AquaRES: an overview, including some demonstrations

A. Taxon match services

B. Occurrence checking services

C. General quality control and data format checking services

Taxon match services

• This service allows users to match their taxonomic list to available online standards, including

– WoRMS

– Catalogue of Life (CoL)

– Pan-European Species directories Infrastructure (PESI)

– Integrated Taxonomic Information System (ITIS)

– Index Fungorum (IF)

– Paleo-DB

– Global Names Index (GNI)

– International Plant Names Index (IPNI)

– FADA

– Interim Register of Marine and Non-marine Genera (IRMNG)

– RAMS

– FishBase

• Option to search all of the listed taxonomic standards or just a selection

(Slide legend: Available / In development)

• Taxon match tool => available through LifeWatch: www.lifewatch.be/data-services
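To illustrate what matching a list against a standard involves, here is a minimal local sketch. It does not call the LifeWatch or WoRMS services; it uses Python's standard `difflib` module as a stand-in for their matching logic. The function name `match_names` and the 0.85 similarity cutoff are assumptions.

```python
import difflib


def match_names(names, reference, cutoff=0.85):
    """Match each name against a reference list: exact, near, or no match."""
    ref_lookup = {r.lower(): r for r in reference}
    results = {}
    for name in names:
        key = name.strip().lower()
        if key in ref_lookup:
            results[name] = ("exact", ref_lookup[key])
            continue
        # Near match: closest reference name above the similarity cutoff.
        close = difflib.get_close_matches(key, list(ref_lookup), n=1, cutoff=cutoff)
        if close:
            results[name] = ("near", ref_lookup[close[0]])
        else:
            results[name] = ("none", None)
    return results


reference = ["Mytilus edulis", "Carcinus maenas"]
result = match_names(["Mytilus edulis", "Mytilus edulus", "Homo sapiens"], reference)
```

A near match such as "Mytilus edulus" is the kind of spelling variation a user would then review by hand before accepting the suggested name.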

Occurrence checking services

Plotting sampling locations on a map:

• Enables a quick visual quality check of the data

• Users are able to detect possible errors in the coordinates

Common mistakes:

– Switching of latitude and longitude

– Lack of a minus sign to indicate West or South

=> These kinds of flaws can easily be fixed by the user, improving the overall quality of the data.
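The two common mistakes above can also be screened for automatically when the expected sampling area is known. The sketch below is only an illustration under that assumption; `BBox` and `diagnose` are hypothetical names and the bounding box in the example is made up.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """Expected sampling area in decimal degrees (hypothetical helper)."""
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float

    def contains(self, lat, lon):
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon)


def diagnose(lat, lon, area):
    """Suggest the likely mistake for a point outside the expected area."""
    if area.contains(lat, lon):
        return "ok"
    # Try the typical repairs and report the first one that lands inside the area.
    candidates = {
        "latitude and longitude switched": (lon, lat),
        "longitude sign looks wrong (e.g. missing minus for West)": (lat, -lon),
        "latitude sign looks wrong (e.g. missing minus for South)": (-lat, lon),
    }
    for label, (fixed_lat, fixed_lon) in candidates.items():
        if area.contains(fixed_lat, fixed_lon):
            return label
    return "outside expected area"


# Example area west of Greenwich in the northern hemisphere (made up).
area = BBox(min_lat=40, max_lat=60, min_lon=-30, max_lon=0)
```

The function only suggests a cause; the user still decides whether the coordinates should be corrected.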

Comparing your own occurrences with documented distributions

• Taking this further, users can also compare their occurrences with the documented distributions in the taxonomic databases (WoRMS, RAMS, FADA).

• Detection of possible errors or gaps can go both ways:

– Gaps in your own data <=> gaps in the taxonomic database

– Errors in your own data <=> errors in the taxonomic database

• DEMO: Show on map tool
=> available through LifeWatch: www.lifewatch.be/data-services
=> Usable for marine & non-marine locations

• DEMO: Compare own occurrences to documented distributions
=> soon available through LifeWatch: www.lifewatch.be/data-services
=> "only as good as the available data"

=> Still under development!
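The two-way comparison of occurrences and documented distributions can be sketched with plain set operations. This is an illustration only, not the LifeWatch service: the function name and the representation of both inputs as (taxon, area) pairs are assumptions.

```python
def compare_distributions(own, documented):
    """Compare (taxon, area) pairs from own data with documented distributions."""
    own_set, doc_set = set(own), set(documented)
    return {
        # Present in both sources.
        "confirmed": sorted(own_set & doc_set),
        # Own record without documented distribution: possible error in the
        # own data, or a gap in the taxonomic database.
        "not_documented": sorted(own_set - doc_set),
        # Documented distribution without own record: possible gap in the own data.
        "not_in_own_data": sorted(doc_set - own_set),
    }


own = [("Mytilus edulis", "North Sea"), ("Mytilus edulis", "Black Sea")]
documented = [("Mytilus edulis", "North Sea"), ("Mytilus edulis", "Baltic Sea")]
report = compare_distributions(own, documented)
```

Neither difference list says which source is wrong, which is why the result is only as good as the available data.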

General quality control & data format checking services

From the project description:

• These services include e.g. mapping of the uploaded field names with a standard set of fields, highlighting non-matches or missing required fields, and checking of the data format of e.g. the date-related fields.

• These quality control steps are primarily targeting data providers to allow them to easily check the format and content of their data before submission.

• Such quality control services were specifically being developed for data that will contribute to (Eur)OBIS (cf. now largely replaced by IPT).

• Within this project, the different data formats used for WoRMS, FADA, SCAR-MarBIN, AntaBIF and BioFresh will be compared and where possible mapped to a common standard (e.g. Darwin Core) in order to build more generic web services for checking the quality and format of these data.
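The field-name mapping described above can be sketched as follows. The target names are real Darwin Core terms, but the choice of required fields, the synonym table and the function name `map_fields` are assumptions made for illustration.

```python
import re

# A few Darwin Core terms used as the standard set in this sketch.
STANDARD_FIELDS = ["occurrenceID", "basisOfRecord", "scientificName",
                   "eventDate", "decimalLatitude", "decimalLongitude"]
# Assumption: all of them are treated as required here.
REQUIRED = set(STANDARD_FIELDS)
# Hypothetical synonyms a data provider might use.
SYNONYMS = {"lat": "decimalLatitude", "latitude": "decimalLatitude",
            "lon": "decimalLongitude", "longitude": "decimalLongitude",
            "species": "scientificName", "date": "eventDate"}


def _norm(name):
    """Lower-case a field name and drop spaces, underscores and punctuation."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def map_fields(uploaded):
    """Return (mapping, unmatched fields, missing required fields)."""
    standard_by_key = {_norm(f): f for f in STANDARD_FIELDS}
    mapping, unmatched = {}, []
    for field in uploaded:
        key = _norm(field)
        target = standard_by_key.get(key) or SYNONYMS.get(key)
        if target:
            mapping[field] = target
        else:
            unmatched.append(field)
    missing = sorted(REQUIRED - set(mapping.values()))
    return mapping, unmatched, missing


mapping, unmatched, missing = map_fields(
    ["Scientific_Name", "Lat", "Lon", "event date", "collector_notes"])
```

In this example `collector_notes` is reported as a non-match, and `basisOfRecord` and `occurrenceID` are reported as missing required fields.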

What has been done? => What is already out there?

• IPT – Integrated Publishing Toolkit (by GBIF)

# inherent checks when uploading your file(s)

• Check whether occurrenceIDs are provided & unique (core-extension)

• Check whether 'basis of record' is provided

• Check data format:

– EventDate as ISO standard

– Lat-lon as decimal degrees, with correct separator

– Character encoding of file can be indicated (preferred: UTF-8)

– IndividualCount field as 'integer'

• Darwin Core Archive Validator

– Checks the Darwin Core Archives: inspects files & compares the mapped concepts to GBIF extensions

– Specific focus on unique identifiers that link the different extensions to the core table

– http://tools.gbif.org/dwca-reports/148-7656490821008157004.html

• LifeWatch web services

– Data format validation:

• Latitude & longitude <> 0,0

• Latitude & longitude within acceptable boundaries (-90/90 & -180/180)

• EventDate in correct format

=> DEMO

• A lot of tools already exist… no use re-inventing the wheel…
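The checks listed above can be combined into one pass over the records. The following is a minimal sketch and not the code of the IPT or the LifeWatch web services: the function name and messages are hypothetical, and only plain calendar dates are accepted for `eventDate`, which is a simplification of ISO 8601.

```python
from datetime import date


def check_records(records):
    """Return a list of (row_index, message) pairs for failed checks."""
    problems = []
    seen_ids = set()
    for i, rec in enumerate(records):
        occ_id = rec.get("occurrenceID")
        if not occ_id:
            problems.append((i, "occurrenceID missing"))
        elif occ_id in seen_ids:
            problems.append((i, "occurrenceID not unique"))
        else:
            seen_ids.add(occ_id)

        if not rec.get("basisOfRecord"):
            problems.append((i, "basisOfRecord missing"))

        # Simplification: calendar dates such as 2016-09-05 only;
        # date-times and intervals are not handled in this sketch.
        try:
            date.fromisoformat(str(rec.get("eventDate") or ""))
        except ValueError:
            problems.append((i, "eventDate not a valid ISO date"))

        # float() rejects a decimal comma, i.e. the wrong separator.
        try:
            lat = float(rec.get("decimalLatitude"))
            lon = float(rec.get("decimalLongitude"))
        except (TypeError, ValueError):
            problems.append((i, "coordinates not in decimal degrees"))
        else:
            if lat == 0 and lon == 0:
                problems.append((i, "coordinates are 0,0"))
            elif not (-90 <= lat <= 90 and -180 <= lon <= 180):
                problems.append((i, "coordinates out of range"))

        count = rec.get("individualCount")
        if count not in (None, ""):
            try:
                int(str(count))
            except ValueError:
                problems.append((i, "individualCount not an integer"))
    return problems


records = [
    {"occurrenceID": "a1", "basisOfRecord": "HumanObservation",
     "eventDate": "2016-09-05", "decimalLatitude": "51.2",
     "decimalLongitude": "2.9", "individualCount": "3"},
    {"occurrenceID": "a1", "basisOfRecord": "", "eventDate": "05/09/2016",
     "decimalLatitude": "0", "decimalLongitude": "0", "individualCount": "2.5"},
]
problems = check_records(records)
```

The first record passes every check, while the second fails all five.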

Data exchange with international initiatives

A. General overview

B. Catalogue of Life

C. Encyclopedia of Life

D. GBIF

Data exchange

• Largely automated

[Figure: LifeWatch Taxonomic Backbone (diagram; surviving label: "GENETICS")]

WoRMS – provider to Catalogue of Life

• Memorandum of Understanding - 2009

– WoRMS as contributor to CoL, through its editorial network & global species databases

– Data will be displayed in original form, without editing

– Data are shared freely, but IPR remains with original custodians (= editors)

• Yearly updates of # Global Species Databases (2009-2012)

• Monthly automated updates to CoL, in defined exchange format (since 2013)

Brachiopoda, Cumacea, Ophiuroidea, Phoronida, Porifera, Proseriata - Kalyptorhyncha, Acanthocephala, Acoelomorpha, Cephalochordata, Foraminifera, Kinorhyncha, Merostomata, Octocorallia, Orthonectida, Rhombozoa, Asteroidea, Bochusacea, Brachypoda, Bryozoa, Isopoda, Mystacocarida, Nemertea, Oligochaeta, Polychaeta, Remipedia, Tantulocarida, Thermosbaenacea, Cestoda, Chaetognatha, Gastrotricha, Gnathostomulida, Mollusca, Monogenea, Myxozoa, Placozoa, Priapulida, Trematoda

113,764 species names across 23 (sub)phyla

• 46 Global Species Databases delivered to CoL, with monthly updates

Brachyura, Echinoidea, Holothuroidea, Hydrozoa, Leptostraca, Scaphopoda, Tanaidacea, Xenoturbellida, Polycystina


1 species

35,030 species

• Encyclopedia of Life (EoL)

– EoL gets access to all the WoRMS content (<=> Catalogue of Life)

– MoU between WoRMS & EoL

– Selected information:

• Accepted taxon names

• Higher classification

• Distributions

• Selection of notes

– Data transfer based on monthly exports from WoRMS

http://eol.org/