Building the Biodiversity Knowledge Graph

Slides from the 4th Global Online Biodiversity Informatics Seminar: https://plus.google.com/events/clvk6nd14d9fhh7e4a6oe5mt9s0

Transcript of Building the Biodiversity Knowledge Graph

Page 1: Building the Biodiversity Knowledge Graph

Building the Biodiversity Knowledge Graph

@rdmpage

http://iphylo.blogspot.com

Page 2: Building the Biodiversity Knowledge Graph

• There are known knowns, things we know that we know

• There are known unknowns, things we now know we don’t know

• But there are also unknown unknowns, things we do not know we don’t know

Page 3: Building the Biodiversity Knowledge Graph

known

unknown

knowns

unknowns

Page 4: Building the Biodiversity Knowledge Graph

Things we don’t know that we know

Page 5: Building the Biodiversity Knowledge Graph

Melissotarsus insularis

Page 6: Building the Biodiversity Knowledge Graph

Melissotarsus insularis: no hit

CASENT0107663-D01, DQ176312

Melissotarsus sp. BLF m1, DQ176312

CASENT0107663-D01, Melissotarsus insularis

Melissotarsus insularis = Melissotarsus sp. BLF m1
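
The linking step on this slide can be sketched in a few lines of Python. This is only an illustration: the record layout and the field names (`specimen`, `voucher`, `accession`, `name`) are assumptions, and the only values taken from the slide are the specimen code, the accession number, and the two names.

```python
# Hypothetical records; the field names are assumptions made for this sketch.
specimen_records = [
    {"specimen": "CASENT0107663-D01", "name": "Melissotarsus insularis"},
]
sequence_records = [
    {
        "accession": "DQ176312",
        "name": "Melissotarsus sp. BLF m1",
        "voucher": "CASENT0107663-D01",
    },
]


def link_names_by_specimen(specimens, sequences):
    """Pair the names that two sources attach to the same specimen code."""
    name_by_specimen = {r["specimen"]: r["name"] for r in specimens}
    links = []
    for seq in sequences:
        specimen_name = name_by_specimen.get(seq["voucher"])
        if specimen_name is not None:
            # The shared specimen code is what joins the two names.
            links.append((specimen_name, seq["name"], seq["accession"]))
    return links


matches = link_names_by_specimen(specimen_records, sequence_records)
for specimen_name, sequence_name, accession in matches:
    print(f"{specimen_name} = {sequence_name} (via {accession})")
```

A search on either name alone finds nothing in the other source; the match only appears once both records carry the same specimen identifier.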

Page 7: Building the Biodiversity Knowledge Graph

We have a vast amount of “old stuff”

Page 8: Building the Biodiversity Knowledge Graph

Numbers of new animal names

1923

WWI

WWII

Page 9: Building the Biodiversity Knowledge Graph

We are learning new stuff

Page 10: Building the Biodiversity Knowledge Graph
Page 11: Building the Biodiversity Knowledge Graph

“New” and “old” are disconnected

Page 12: Building the Biodiversity Knowledge Graph

Dark taxa

http://iphylo.blogspot.co.uk/2011/04/dark-taxa-genbank-in-post-taxonomic.html

Page 13: Building the Biodiversity Knowledge Graph

Mammals in GenBank

Proper Linnaean names

Aus sp.

Page 14: Building the Biodiversity Knowledge Graph

Mammals

Proper Linnaean names

Aus sp.

Page 15: Building the Biodiversity Knowledge Graph

“Invertebrates”

BOLD

Page 16: Building the Biodiversity Knowledge Graph

Challenge: linking things together

(sticky data)

Page 17: Building the Biodiversity Knowledge Graph

Data is good

Page 18: Building the Biodiversity Knowledge Graph

More data is better…

Page 19: Building the Biodiversity Knowledge Graph

…but this data is not sticky

Page 20: Building the Biodiversity Knowledge Graph

Location

Page 21: Building the Biodiversity Knowledge Graph
Page 22: Building the Biodiversity Knowledge Graph
Page 23: Building the Biodiversity Knowledge Graph

name

name

Tags

Page 24: Building the Biodiversity Knowledge Graph

Name ≠ name

Page 25: Building the Biodiversity Knowledge Graph

Identifiers

Page 26: Building the Biodiversity Knowledge Graph
Page 27: Building the Biodiversity Knowledge Graph

Shared identifiers are sticky

Page 28: Building the Biodiversity Knowledge Graph

Identifiers

• Globally unique

• Resolvable (for humans and machines)

• Use other people’s identifiers to link things together

Page 29: Building the Biodiversity Knowledge Graph

Human and machine readable

machine

human

Page 30: Building the Biodiversity Knowledge Graph
Page 31: Building the Biodiversity Knowledge Graph

{"author": [ { "family": "Page", "given": "Roderic D.M." } ], "container-title": "PeerJ", "reference-count": 60, "page": "e190", "deposited": { "date-parts": [ [ 2013, 11, 18 ] ], "timestamp": 1384732800000 }, "title": "BioNames: linking taxonomy, texts, and trees", "type": "journal-article", "DOI": "10.7717/peerj.190", "ISSN": [ "2167-8359" ], "URL": "http://dx.doi.org/10.7717/peerj.190"}
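
To show why a machine-readable record like this is useful, here is a minimal Python sketch that parses a trimmed copy of the JSON above and turns it into a one-line citation for a human. The `summarize` helper and the citation layout are assumptions made for illustration, not part of any existing tool.

```python
import json

# A trimmed copy of the record shown on the slide.
record_json = """
{"author": [{"family": "Page", "given": "Roderic D.M."}],
 "container-title": "PeerJ",
 "page": "e190",
 "title": "BioNames: linking taxonomy, texts, and trees",
 "type": "journal-article",
 "DOI": "10.7717/peerj.190",
 "ISSN": ["2167-8359"]}
"""


def summarize(record):
    """Build a short human-readable citation from the parsed record."""
    authors = ", ".join(f'{a["given"]} {a["family"]}' for a in record["author"])
    return (
        f'{authors}. {record["title"]}. '
        f'{record["container-title"]} {record["page"]}. doi:{record["DOI"]}'
    )


record = json.loads(record_json)
citation = summarize(record)
print(citation)
```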

Page 32: Building the Biodiversity Knowledge Graph

Using other people’s identifiers is hard work and scary

• Hard work - you have to find their identifiers

• Scary - what happens if the other person breaks their identifiers?

• Solution: make it easy to find them, and make them robust (e.g., CrossRef and DOIs)

Page 33: Building the Biodiversity Knowledge Graph

http://dx.doi.org/10.7717/peerj.190

DOI (Digital Object Identifier)
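
One way to ask for the machine-readable view of a DOI is HTTP content negotiation. The Python sketch below only builds the request and does not send it. The helper names are hypothetical, and the `Accept` media type is the CSL JSON type that, to my knowledge, the DOI resolver honours for CrossRef DOIs; treat that as an assumption to check against the resolver's documentation.

```python
from urllib.request import Request

# Assumed media type for citation metadata as CSL JSON.
CSL_JSON = "application/vnd.citationstyles.csl+json"


def doi_to_url(doi, resolver="http://dx.doi.org/"):
    """Turn a bare DOI into the resolvable URL form shown on the slide."""
    return resolver + doi


def machine_request(doi):
    """Build (but do not send) a request asking for metadata instead of HTML."""
    return Request(doi_to_url(doi), headers={"Accept": CSL_JSON})


req = machine_request("10.7717/peerj.190")
print(req.full_url)
print(req.get_header("Accept"))
```

Passing `req` to `urllib.request.urlopen` would perform the lookup; a browser visiting the same URL without the header is redirected to the article page for a human reader.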

Page 34: Building the Biodiversity Knowledge Graph
Page 35: Building the Biodiversity Knowledge Graph
Page 36: Building the Biodiversity Knowledge Graph

Biodiversity Knowledge Graph (linking things together)

Page 37: Building the Biodiversity Knowledge Graph
Page 38: Building the Biodiversity Knowledge Graph

Our questions are “paths” in this network
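
The idea of a question as a path can be sketched with a breadth-first search over a tiny graph. The edges below are an assumption built from the identifiers on the Melissotarsus slide; a real graph would be far larger and would use typed links.

```python
from collections import deque

# Assumed undirected links between identifiers from the Melissotarsus slide.
edges = [
    ("Melissotarsus insularis", "CASENT0107663-D01"),
    ("CASENT0107663-D01", "DQ176312"),
    ("DQ176312", "Melissotarsus sp. BLF m1"),
]


def build_graph(edge_list):
    """Store each link in both directions so paths can run either way."""
    graph = {}
    for a, b in edge_list:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns a list of nodes, or None if unconnected."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in sorted(graph.get(node, ())):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


graph = build_graph(edges)
path = shortest_path(graph, "Melissotarsus insularis", "Melissotarsus sp. BLF m1")
print(" -> ".join(path))
```

Here the question "is there sequence data for this name?" is answered by the path from the name through the specimen to the sequence record.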

Page 39: Building the Biodiversity Knowledge Graph
Page 40: Building the Biodiversity Knowledge Graph

Phylogeography

Page 41: Building the Biodiversity Knowledge Graph

Taxonomy

Page 42: Building the Biodiversity Knowledge Graph
Page 43: Building the Biodiversity Knowledge Graph

GenBank records from Spain

Page 44: Building the Biodiversity Knowledge Graph

MESH term

Page 45: Building the Biodiversity Knowledge Graph

PMID:948206

Page 46: Building the Biodiversity Knowledge Graph

http://biostor.org/reference/102054

Page 47: Building the Biodiversity Knowledge Graph
Page 48: Building the Biodiversity Knowledge Graph

http://data.gbif.org/occurrences/215921922/

Page 49: Building the Biodiversity Knowledge Graph

BHL and GBIF as biomedical databases

http://iphylo.blogspot.co.uk/2012/03/bhl-and-gbif-as-biomedical-databases.html

Page 50: Building the Biodiversity Knowledge Graph

Metrics (counting links in the knowledge graph)
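
Counting links is simple once they exist as data. The sketch below tallies how many papers point at each collection; the paper and collection labels are hypothetical placeholders, not real records.

```python
from collections import Counter

# Hypothetical (paper, collection) links, used only to show the counting step.
citations = [
    ("paper-1", "collection-A"),
    ("paper-2", "collection-A"),
    ("paper-2", "collection-B"),
    ("paper-3", "collection-A"),
]


def count_incoming_links(links):
    """Count how many links point at each target node."""
    return Counter(target for _source, target in links)


ranking = count_incoming_links(citations).most_common()
for collection, total in ranking:
    print(collection, total)
```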

Page 51: Building the Biodiversity Knowledge Graph

In an attempt to live up to that increasing demand for documentation, the leadership of the Natural History Museum of Denmark has issued an order to its curatorial staff - The staff members are requested to document which publications from 2011, written entirely by external scientists, that in one way or another are based on material in the collections of the Museum.

http://markmail.org/message/opv2we7fkmro2nen @TAXACOM

Page 52: Building the Biodiversity Knowledge Graph

https://twitter.com/#!/search/10.1371%252Fjournal.pone.0036881

Page 53: Building the Biodiversity Knowledge Graph

https://twitter.com/edwbaker/status/205595933159858176

Page 54: Building the Biodiversity Knowledge Graph

http://www.museum-analytics.org/

Page 55: Building the Biodiversity Knowledge Graph

Cited, linkable specimens

NMNH Vertebrate Zoology Herpetology Collections: 11194

CAS Herpetology Collection Catalog: 9619

MCZ Herpetology Collection: 6720

Herpetology Collection (University of Kansas Biodiversity Research Center): 5818

http://iphylo.blogspot.co.uk/2012/02/gbif-specimens-in-biostor-who-are-top.html

Page 56: Building the Biodiversity Knowledge Graph

Annotation (everyone can make the knowledge graph)

Page 57: Building the Biodiversity Knowledge Graph
Page 58: Building the Biodiversity Knowledge Graph
Page 59: Building the Biodiversity Knowledge Graph

http://bionames.org/labs/bookmarklet/

Page 60: Building the Biodiversity Knowledge Graph
Page 61: Building the Biodiversity Knowledge Graph
Page 62: Building the Biodiversity Knowledge Graph
Page 63: Building the Biodiversity Knowledge Graph
Page 64: Building the Biodiversity Knowledge Graph
Page 65: Building the Biodiversity Knowledge Graph
Page 66: Building the Biodiversity Knowledge Graph

How many people view annotation

Data

Fix me!

Page 67: Building the Biodiversity Knowledge Graph

Annotation as fixing errors

Page 68: Building the Biodiversity Knowledge Graph

Annotation as building the knowledge graph

[Diagram: nodes labeled paper, sequence, taxonomic name, and specimen; links labeled cites, publishes, and has voucher]
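
Annotation in this sense can be sketched as adding typed links (triples) to a shared set. The link labels come from the slide, but which node types each label connects is an assumption here, and apart from the DOI and the accession number seen earlier, the identifiers are hypothetical placeholders.

```python
def annotate(graph, subject, predicate, obj):
    """Add one link to the graph; return True only if the link is new."""
    triple = (subject, predicate, obj)
    is_new = triple not in graph
    graph.add(triple)
    return is_new


def objects_of(graph, subject, predicate):
    """List everything a subject is linked to by a given predicate."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


knowledge_graph = set()

# Assumed pairings of link labels and node types; the specimen id is made up.
first = annotate(knowledge_graph, "doi:10.7717/peerj.190", "cites", "specimen:EXAMPLE-1")
annotate(knowledge_graph, "doi:10.7717/peerj.190", "publishes", "sequence:DQ176312")
annotate(knowledge_graph, "sequence:DQ176312", "has voucher", "specimen:EXAMPLE-1")

# Two people making the same annotation still yield a single link.
repeat = annotate(knowledge_graph, "doi:10.7717/peerj.190", "cites", "specimen:EXAMPLE-1")

print(len(knowledge_graph))
print(objects_of(knowledge_graph, "sequence:DQ176312", "has voucher"))
```

Storing links in a set means every annotation either adds a new edge or confirms an existing one, so annotating grows the graph as well as fixing errors in it.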

Page 69: Building the Biodiversity Knowledge Graph
Page 70: Building the Biodiversity Knowledge Graph

OK, but if the biodiversity knowledge graph is so cool, why haven’t we made it already?

Page 71: Building the Biodiversity Knowledge Graph
Page 72: Building the Biodiversity Knowledge Graph
Page 73: Building the Biodiversity Knowledge Graph
Page 74: Building the Biodiversity Knowledge Graph
Page 75: Building the Biodiversity Knowledge Graph

Open question:

Who will build the biodiversity knowledge graph?