Nothing in biology makes sense except in the light of evolution. – Theodosius Dobzhansky
Nothing in taxonomy makes sense except in the light of Open Access
![Page 1: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/1.jpg)
Donat Agosti, Plazi, http://plazi.org
Systematics Association, Oxford, 28 August 2015
Nothing in taxonomy makes sense except in the light of Open Access
![Page 2: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/2.jpg)
![Page 3: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/3.jpg)
I want to be able to access, mine and analyse a significant body of published and digitized taxonomic knowledge at any time, anywhere.
I want to build the catalogue of life by machine.
I hope taxonomic communication arrives in the 21st century.
Vision and hope
![Page 4: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/4.jpg)
1. The demand
Before antbase.org, Harvard's Museum of Comparative Zoology could claim to be the only location with a complete set of ant systematics publications from 1758 to the present.
Through antbase.org's digital library, access to this body of literature is worldwide, and it is actively used (>10,000 visits in a single month).
2004
![Page 5: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/5.jpg)
2. The corpus of taxonomic literature
![Page 6: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/6.jpg)
Build and establish a TreatmentBank, such as Plazi, as a basis for content mining of, and linking to, the taxonomic literature
3. The core corpus of taxonomic knowledge: Treatments
![Page 7: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/7.jpg)
4. Make use of the semantic linked WWW
Avoid all the wasteful publishing practised today!
• Publish structured data
• Publish open access
• Make taxonomic literature first-class literature by minting DOIs and making digital copies accessible
• Add links to names, treatments, articles, DNA sequences, digital objects
• Help by building your own public corpus of citable data
Pensoft journals (e.g. Biodiversity Data Journal, Zookeys, Phytokeys) are the gold standard.
![Page 8: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/8.jpg)
Surfing or the seduction of science (for a young kid)
![Page 9: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/9.jpg)
Surfing or the seduction of science (for a young kid)
![Page 10: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/10.jpg)
Surfing or the seduction of science (for a young kid)
![Page 11: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/11.jpg)
Surfing or the seduction of science (for an adult)
![Page 12: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/12.jpg)
Get a copy of the Cyclothone paper
Surfing or the seduction of science (for an adult)
![Page 13: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/13.jpg)
Surfing or the imperative for science
![Page 14: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/14.jpg)
Surfing or the imperative for science
![Page 15: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/15.jpg)
Linking treatments and data with external resources
NCBI
Surfing or the imperative for science
![Page 16: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/16.jpg)
Establish Plazi as, or use Plazi to build, TreatmentBank as a source for content mining of the taxonomic literature
TreatmentBank
![Page 17: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/17.jpg)
What are the species in Amazonia?
TreatmentBank
![Page 18: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/18.jpg)
Countries (Region): Australia (Queensland)
Export species materials citations (DwC)
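The export step can be pictured roughly as follows. This is a minimal sketch, not Plazi's actual export: it assumes materials citations are already available as simple records and writes them as CSV with Darwin Core term names (`scientificName`, `country`, `stateProvince`) as column headers. The species name is a made-up placeholder; only the country and region are taken from the slide.

```python
import csv
import io

# Hypothetical materials citations. Only the country and region come from
# the slide; the species name is a placeholder for illustration.
materials_citations = [
    {
        "scientificName": "Examplea hypothetica",
        "country": "Australia",
        "stateProvince": "Queensland",
    },
]


def to_dwc_csv(records):
    """Serialise records as CSV using Darwin Core term names as the header."""
    fields = ["scientificName", "country", "stateProvince"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


dwc_csv = to_dwc_csv(materials_citations)
print(dwc_csv)
```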
![Page 19: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/19.jpg)
Text mining tools: Visualization of treatment content
Summary of the content of 37 Zootaxa spider publications and 8 Biodiversity Data Journal publications (Miller et al., 2015).
![Page 20: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/20.jpg)
Pseudomyrmex ants and Vachellia ant-acacias are a classic example of mutualism in biology.
[Figure: transbiotic link network. Associated species are linked through references in taxonomic treatments; treatments are linked through citations. Columns: Ants, Plants.]
Node labels (species epithets): allenii, melanoceras, ruddiae, chiapensis, collinsii, cookii, cornigera, globulifera, hindsii, janzenii, mayana, sphaerocephala, boopis, flavicornis, hesperius, ita, janzeni, kuenckeli, mixtecus, nigrocinctus, nigropilosus, opaciceps, particeps, peperi, reconditus, satanicus, simulans, spinicola, subtilissimus, veneficus, ferrugineus, gentlei, gracilis
Example: acacia-ant species Pseudomyrmex gracilis (treatment: redescription); associated ant-acacia: Acacia gentlei
Photo credits: Alex Wild
Text mining tools: Visualization of treatment content
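A link network of this kind can be sketched as an undirected graph. The example below assumes the ant-plant associations have already been mined from treatments as simple pairs; the single pair used here is the one named on the slide, and the function name is only illustrative.

```python
from collections import defaultdict

# (ant species, associated plant) pairs as they might be extracted from
# treatments. Only this one pair is taken from the slide.
links = [("Pseudomyrmex gracilis", "Acacia gentlei")]


def build_network(pairs):
    """Return an undirected adjacency map: each species maps to its partners."""
    network = defaultdict(set)
    for ant, plant in pairs:
        network[ant].add(plant)
        network[plant].add(ant)
    return network


network = build_network(links)
for species, partners in sorted(network.items()):
    print(species, "<->", ", ".join(sorted(partners)))
```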
![Page 21: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/21.jpg)
What does this mean?
The Linking Open Data cloud diagram
Linked Open Data Cloud
![Page 22: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/22.jpg)
The demand: scientists and citizen scientists
Before antbase.org, Harvard's Museum of Comparative Zoology could claim to be the only location with a complete set of ant systematics publications from 1758 to the present.
Through antbase.org's digital library, access to this body of literature is worldwide, and it is actively used (>10,000 visits in a single month).
Online catalogue
Open access
Online library
2004
![Page 23: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/23.jpg)
Online catalogue
The interest of big science
2004
2005
![Page 24: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/24.jpg)
The demand: scientists and citizen scientists
![Page 25: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/25.jpg)
The scientific challenge: Bridging the gap
1 tnntttccca cgaataaata atataagatt ttgattatta cctccttctt taattttatt
61 attatcaaga agattagttt ataaaggagt aggaacagga tgaactgttt atcctccttt
121 atctaataat ttatatcata atggattttc aactgattta gcaatttttt ctttacatat
181 tgcaggaata tcatcaatta taggagcaat taattttatt tcaacaattt taaatataca
241 tcataaaaat ttatcattag ataaaattcc attgttagtt tgatcaattt taattacagc
301 tattttatta ttattatctt tacctgtatt agcaggtgca attactatat tattaactga
361 tcgaaatcta aatacaactt tttttgatcc ttcgggtgga ggagatccaa ttttatatca
421 acatttattt
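To make a numbered sequence listing like this machine-usable, the position numbers and spacing have to be stripped first. The sketch below does this for the first two lines of the sequence shown on the slide and counts the bases; the function name is illustrative, and no assumption is made about which gene or species the sequence belongs to.

```python
import re
from collections import Counter

# First two lines of the sequence shown on the slide.
flat_text = """
1 tnntttccca cgaataaata atataagatt ttgattatta cctccttctt taattttatt
61 attatcaaga agattagttt ataaaggagt aggaacagga tgaactgttt atcctccttt
"""


def parse_flat_sequence(text):
    """Drop position numbers and whitespace, returning one contiguous string."""
    return re.sub(r"[\d\s]+", "", text).lower()


sequence = parse_flat_sequence(flat_text)
composition = Counter(sequence)  # counts per base, including ambiguous 'n'
print(len(sequence), dict(composition))
```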
![Page 26: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/26.jpg)
Where do we stand?
![Page 27: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/27.jpg)
![Page 28: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/28.jpg)
The bristlemouths are a rapacious family of deep-sea fishes that include the wildly successful genus Cyclothone
In contrast, ichthyologists put the likely figure for bristlemouths at hundreds of trillions — and perhaps quadrillions, or thousands of trillions.
![Page 29: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/29.jpg)
The bristlemouths are a rapacious family of deep-sea fishes that include the wildly successful genus Cyclothone
![Page 30: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/30.jpg)
![Page 31: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/31.jpg)
Taxonomy? Source?
![Page 32: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/32.jpg)
![Page 33: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/33.jpg)
Issue: USD 266.00
Article: USD 48.00
![Page 34: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/34.jpg)
Get a copy of the Cyclothone paper
Our contribution for a better understanding of biodiversity
![Page 35: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/35.jpg)
Access to ant taxonomic publications through antbase.org / Smithsonian Institution, currently including the entire body of non-copyrighted publications since 1758 (>4,000 publications or 85,000 pages). Source: Agosti (2005)
Access
![Page 36: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/36.jpg)
• Limited access (copyright)
• Limited discoverability of content
• Research results cannot be cited
• Data mining does not work
Issues of access
![Page 37: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/37.jpg)
Provide an open access, linked corpus of taxonomic literature
A solution
![Page 38: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/38.jpg)
Surfing at breakfast table
![Page 39: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/39.jpg)
article
treatment
cites (HTTP URI)
cites (DOI)
Scientific name
https://www.wikidata.org/wiki/Property:P1992
Feed Wikipedia with taxonomic data
![Page 40: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/40.jpg)
Surfing or the imperative for science
![Page 41: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/41.jpg)
Surfing or the imperative for science
![Page 42: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/42.jpg)
Surfing or the imperative for science
![Page 43: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/43.jpg)
LOD, PDF
HNS
HNS
Surfing or the imperative for science: Use of name services
![Page 44: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/44.jpg)
The goal
![Page 45: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/45.jpg)
Create a citable open corpus of taxonomic publications
![Page 46: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/46.jpg)
![Page 47: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/47.jpg)
Biodiversity Literature Repository: Record
![Page 48: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/48.jpg)
Biodiversity Literature Repository: RecordTreatment
Illustration
![Page 49: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/49.jpg)
http://plazi.org/wiki/Blue_List
Patterson et al., 2014: http://dx.doi.org/10.1186/1756-0500-7-79
Legal issues
![Page 50: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/50.jpg)
Workflow
Plazi SRS
find → scan → OCR → markup → store + access
![Page 51: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/51.jpg)
Text
<tax:treatment>
<tax:nomenclature>
<tax:name>
<tax:xid source="HNS" identifier="193329"/>
<tax:xmldata>
<dc:Genus>Mystrium</dc:Genus>
<dc:Species>leonie</dc:Species>
</tax:xmldata>
Mystrium leonie
</tax:name>
<tax:status>n. sp.</tax:status>
Fig 1 D - F
</tax:nomenclature>
<tax:div type="description">
<tax:p>HOLOTYPE WORKER: TL 3.95, HL 1.02, HW 0.95, CI 93, SL
1.30, SI 137, PW 0.73, ML 0.38. Mandible outer margin
to a sharp apical tooth, the apex parallel to the anterior
(Holotype with material in mandibles, so mandibles and
$ described below from paratypes.) Median clypeus
....
</tax:p>
</tax:div>
</tax:treatment>
Semantically enhanced text (TaxonX)
… alternatives: From human to machine readable text
RDF
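Once a treatment is marked up like this, a machine can read the name parts directly. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`. The slide does not show the namespace declarations, so the two namespace URIs here are placeholders (assumptions), and the sample is a shortened, well-formed version of the markup above.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumptions): the slide omits the declarations.
NS = {"tax": "http://example.org/taxonx", "dc": "http://example.org/dc"}

sample = """
<tax:treatment xmlns:tax="http://example.org/taxonx"
               xmlns:dc="http://example.org/dc">
  <tax:nomenclature>
    <tax:name>
      <tax:xid source="HNS" identifier="193329"/>
      <tax:xmldata>
        <dc:Genus>Mystrium</dc:Genus>
        <dc:Species>leonie</dc:Species>
      </tax:xmldata>
      Mystrium leonie
    </tax:name>
    <tax:status>n. sp.</tax:status>
  </tax:nomenclature>
</tax:treatment>
"""


def extract_name(xml_text):
    """Pull genus, species, status and external identifier from a treatment."""
    root = ET.fromstring(xml_text.strip())
    name = root.find("tax:nomenclature/tax:name", NS)
    xid = name.find("tax:xid", NS)
    return {
        "genus": name.findtext("tax:xmldata/dc:Genus", namespaces=NS),
        "species": name.findtext("tax:xmldata/dc:Species", namespaces=NS),
        "status": root.findtext("tax:nomenclature/tax:status", namespaces=NS),
        "xid": (xid.get("source"), xid.get("identifier")),
    }


record = extract_name(sample)
print(record)
```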
![Page 52: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/52.jpg)
Plazi tools: table extraction
Treatment
Scientific species name
Distribution record
Bibliographic records
Cataglyphis tartessica workers
Variable: mean ± SD
Head length: 11.23 ± 0.12
Head width: 11.15 ± 0.12
Scape length: 11.47 ± 0.12
Mesosoma length: 11.94 ± 0.16
Femur length: 12.03 ± 0.14
Cephalic index: 0 93.60 ± 3.940
Scape index: 128.10 ± 7.660
![Page 53: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/53.jpg)
Plazi tools: discovering of scientific names
![Page 54: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/54.jpg)
Plazi tools: discovering and parsing of bibliographic references
![Page 55: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/55.jpg)
Plazi tools: discovering and parsing of observation data
![Page 56: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/56.jpg)
Plazi tools: discovering of treatments
![Page 57: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/57.jpg)
Treatment: a well defined part of an article that defines the particular usage of a scientific name by an authority at a given time (a page(s) in a publication).
Treatment
The special case of taxonomic literature: the cited elements are treatments, not articles
Formica obsoleta Linnaeus, 1758: 580
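A treatment citation such as the one above has a regular shape: name, authority, year and page. As a rough sketch, a regular expression can split it into those parts. The pattern is an assumption fitted to this single example; real citations vary far more (multiple authors, parentheses, page ranges), and the function name is illustrative.

```python
import re

# Pattern fitted to the slide's example only: "Genus epithet Author, year: page".
CITATION = re.compile(
    r"^(?P<name>[A-Z][a-z]+(?: [a-z]+)+) "
    r"(?P<author>[^,]+), (?P<year>\d{4}): (?P<page>\d+)$"
)


def parse_treatment_citation(text):
    """Return the citation parts as a dict, or None if the text does not fit."""
    match = CITATION.match(text.strip())
    return match.groupdict() if match else None


parsed = parse_treatment_citation("Formica obsoleta Linnaeus, 1758: 580")
print(parsed)
```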
![Page 58: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/58.jpg)
Treatment
![Page 59: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/59.jpg)
Original combinations
Reference to an original combination
Subsequent usages of names cite the referenced treatment
What is a treatment?
![Page 60: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/60.jpg)
Treatment and treatment reference and citation
Treatment citation
Treatment references
![Page 61: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/61.jpg)
Treatment
Citing of treatments or linking of treatments to treatments
By minting persistent HTTP URIs for treatments, treatments can be cited like a bibliographic reference
http://treatment.plazi.org/id/A9FFD1FC-4629-FFB4-968F-AD38386521BA
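Because the identifier is a plain HTTP URI, software can recognise it and pull out the treatment ID. The sketch below assumes every treatment URI has the same shape as the single example on this slide (a hyphenated, upper-case hexadecimal ID after `/id/`); that shape is inferred, not a documented rule, and the function name is illustrative.

```python
import re

# Shape inferred from the one URI shown on the slide (an assumption).
TREATMENT_URI = re.compile(
    r"^http://treatment\.plazi\.org/id/"
    r"(?P<id>[0-9A-F]{8}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{12})$"
)


def treatment_id(uri):
    """Return the treatment ID if the URI matches the assumed shape, else None."""
    match = TREATMENT_URI.match(uri)
    return match.group("id") if match else None


uri = "http://treatment.plazi.org/id/A9FFD1FC-4629-FFB4-968F-AD38386521BA"
print(treatment_id(uri))
```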
![Page 62: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/62.jpg)
Status quo
• 50,000+ treatments live, growing daily
• RDF in beta version
• GoldenGate Imagine (PDF and text mining tool) in beta version
• Provider for data for NCBI, Wikidata, GBIF, EOL, antweb
• Biodiversity Literature Repository functional
![Page 63: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/63.jpg)
Next steps
• Collaborate with ContentMine to extract >50 treatments/day
![Page 64: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/64.jpg)
Next steps
Planned collaboration with ContentMine to extract treatments on a daily basis
http://www.slideshare.net/petermurrayrust/?
BioDiv
![Page 65: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/65.jpg)
Next steps
• Collaborate with ContentMine to extract 50 treatments/day
• 1 million treatments live
• RDF version accessible
• GoldenGate Imagine (text mining tool)
• Provider of data for NCBI, GBIF, EOL, antweb
• Biodiversity Literature Repository with 100,000 bibliographic references and digital copies (PDF, images, etc.)
![Page 66: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/66.jpg)
Next steps
BUT
![Page 67: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/67.jpg)
Next steps
Avoid all this waste (our next generation will have to clean up)!
• Publish structured data
• Publish open access
• Publish in journals with DOI
• Add links to names, treatments, articles, DNA sequences, digital objects
• Help build your own corpus of citable data
Pensoft journals (e.g. Biodiversity Data Journal, Zookeys, Phytokeys) are the gold standard.
![Page 68: Nothing in taxonomy makes sense except in the light of Open Access](https://reader034.fdocuments.in/reader034/viewer/2022042908/58f14e011a28ab30068b45db/html5/thumbnails/68.jpg)
Thanks!
Donat Agosti
Acknowledgment: Pensoft, Zenodo/CERN, NCBI, Wikidata, ContentMine