Capturing the context: one small(ish) step for modellers, one giant leap for mankind



One small(ish) step for modellers, one giant leap for mankind

Capturing the context

Mihai Glonț

Reproducible and Citable Data and Models, Warnemuende, September 2015

A simple(?) question

How easy is it to find reusable models? Reusable should entail, at least:

– Reproducible

– Friendly licence

– Understandable

Is this understandable?

Problems

How do we recognise concepts? Is adenosine5PrimePhosphate a better variable name than a? Do all modellers know the same amount of information about ATP?

How can we uniquely identify the concepts involved in a modelling exercise?

A brief (and biased) history of the Web

Web 1.0 - basic HTML pages (personal web sites on GeoCities)

Web 2.0

● Prevalence of content generators

● Social media

● Rich user interfaces

● Folksonomies

● Software as a service

Web 3.0

● Semantic Web

● “The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries” (W3C)

● Machines understand the data on the web and can reason about it

● Implicit knowledge is captured in a machine-processable manner

● What holiday options are there for a family of four for 10 days, somewhere sunny and close to the sea, with good food and a budget of EUR 3000?

Semantic web overview

● Taxonomies and ontologies define concepts (resources) and the relationships between them

● Identification through URIs

● Data is exchanged as RDF

Ontologies

● Define concepts, instances, attributes and relationships

● Workshop is a kind of Thing

● Workshop hasA location
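
As a rough illustration of the two example statements above, an ontology fragment can be held as a set of "is a kind of" links plus attribute declarations. This is only a sketch: the intermediate concept `Event` and the names `IS_A`, `HAS_A` and `is_kind_of` are assumptions made here; only Workshop, Thing and "hasA location" come from the slide.

```python
# Minimal sketch of an ontology fragment.
# "Event" is a hypothetical intermediate concept, added only to show
# that "is a kind of" can be followed over more than one step.
IS_A = {
    "Workshop": "Event",  # assumption, not on the slide
    "Event": "Thing",     # assumption, not on the slide
}

HAS_A = {
    "Workshop": {"location"},  # from the slide: Workshop hasA location
}


def is_kind_of(concept: str, ancestor: str) -> bool:
    """Follow is-a links upwards until the ancestor is found or the chain ends."""
    seen = set()
    current = concept
    while current in IS_A and current not in seen:
        seen.add(current)
        current = IS_A[current]
        if current == ancestor:
            return True
    return False


print(is_kind_of("Workshop", "Thing"))   # True
print(is_kind_of("Thing", "Workshop"))   # False
print("location" in HAS_A["Workshop"])   # True
```

The point of the sketch is that "Workshop is a kind of Thing" is never stated directly in the data, yet a machine can derive it by following the links.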

Linking ontologies

http://lod-cloud.net/

RDF Primer

● Resource Description Framework

● Documents consist of a series of statements

● Statements (triples) have the following form:

● Subject - Predicate - Object

Subject: https://sems.uni-rostock.de/reproducible-and-citable-data-and-models/

Predicate: http://example.com/someOntology/hasLocation

Object: https://en.wikipedia.org/wiki/Warnemunde
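
The example statement above can be written down as a single triple. The sketch below uses only the Python standard library and serialises the triple as one N-Triples line; the names `Triple` and `to_ntriples` are made up for this illustration, and the helper only handles the case where all three parts are URIs.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement; in this sketch all three parts are URIs."""
    subject: str
    predicate: str
    obj: str


def to_ntriples(t: Triple) -> str:
    """Serialise a URI-only triple as an N-Triples line: <s> <p> <o> ."""
    return f"<{t.subject}> <{t.predicate}> <{t.obj}> ."


# The workshop "has location" Warnemunde, using the URIs from the slide.
workshop = Triple(
    "https://sems.uni-rostock.de/reproducible-and-citable-data-and-models/",
    "http://example.com/someOntology/hasLocation",
    "https://en.wikipedia.org/wiki/Warnemunde",
)

print(to_ntriples(workshop))
```

A full RDF library would also cover literals, blank nodes and escaping, which this sketch deliberately leaves out.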

A selection of ontologies for life scientists

● ChEBI: http://www.ebi.ac.uk/chebi/

● GO: http://geneontology.org/

● BRENDA Tissue Ontology: http://www.brenda-enzymes.org/

● FMA: http://bioportal.bioontology.org/ontologies/FMA

● Human disease ontology: http://disease-ontology.org/

● TEDDY: http://purl.bioontology.org/ontology/TEDDY/

● KiSAO: http://co.mbine.org/standards/kisao

● SBO: http://www.ebi.ac.uk/sbo/

https://www.ebi.ac.uk/ontology-lookup/

Identifiers, identifiers, identifiers

● Is http://purl.uniprot.org/taxonomy/9606

the same as http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id=9606

or http://taxonomy.bio2rdf.org/describe/?url=http://bio2rdf.org/taxonomy:9606 ?

● What if the URIs change?

● What if the URIs don't point to anything?

Introducing identifiers.org

● The aim of the identifiers.org project is to provide unique, stable, resolvable and location-independent URIs to identify and to locate scientific data

● Community-driven

● Free to use

Registry

500+ curated data collections


Creating unique URIs

• Homo sapiens in Taxonomy (9606)

http://identifiers.org/taxonomy/9606

[Data collection]: taxonomy

[Entity identifier]: 9606
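
The URI pattern above (data collection, then entity identifier) is simple enough to sketch in a few lines. The function names below are hypothetical helpers written for this illustration, not part of any identifiers.org client.

```python
from typing import Tuple

PREFIX = "http://identifiers.org/"


def identifiers_org_uri(collection: str, entity_id: str) -> str:
    """Compose a URI of the form http://identifiers.org/<collection>/<id>."""
    return f"{PREFIX}{collection}/{entity_id}"


def split_identifiers_org_uri(uri: str) -> Tuple[str, str]:
    """Recover (data collection, entity identifier) from such a URI."""
    if not uri.startswith(PREFIX):
        raise ValueError(f"not an identifiers.org URI: {uri}")
    # Split on the first "/" only, so identifiers may contain further characters.
    collection, _, entity_id = uri[len(PREFIX):].partition("/")
    return collection, entity_id


uri = identifiers_org_uri("taxonomy", "9606")
print(uri)                             # http://identifiers.org/taxonomy/9606
print(split_identifiers_org_uri(uri))  # ('taxonomy', '9606')
```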

Creating resolvable URIs

http://identifiers.org/taxonomy/9606

• URI to identify the entity 'Homo sapiens' in the data collection Taxonomy

http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id=9606

http://www.uniprot.org/taxonomy/9606

http://www.ebi.ac.uk/ena/data/view/Taxon:9606

[Diagram labels: Resource, Resource, Reference; Primary]

http://info.identifiers.org/taxonomy/9606
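
One way to picture resolution is a lookup from the data collection to the URL patterns of the resources that serve it. The table below is hard-coded from the three Taxonomy URLs on this slide purely for illustration; it is an assumption-laden sketch, not the Registry's data model or API, and `LOCATIONS` and `resolve` are names invented here.

```python
# Illustrative only: URL patterns copied from the slide, with the entity
# identifier replaced by an {id} placeholder. The real Registry is curated
# and much larger.
LOCATIONS = {
    "taxonomy": [
        "http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id={id}",
        "http://www.uniprot.org/taxonomy/{id}",
        "http://www.ebi.ac.uk/ena/data/view/Taxon:{id}",
    ],
}

PREFIX = "http://identifiers.org/"


def resolve(uri: str) -> list:
    """Map an identifiers.org URI to the physical locations known for its collection."""
    if not uri.startswith(PREFIX):
        raise ValueError(f"not an identifiers.org URI: {uri}")
    collection, _, entity_id = uri[len(PREFIX):].partition("/")
    if collection not in LOCATIONS:
        raise KeyError(f"unknown data collection: {collection}")
    return [template.format(id=entity_id) for template in LOCATIONS[collection]]


for location in resolve("http://identifiers.org/taxonomy/9606"):
    print(location)
```

Because the identifier stays fixed while the table can change, a resource moving to a new address only requires updating one pattern rather than every model that cites it.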

Inter-conversion of identifier schemes

• Registry records different identifier schemes

• Web service for inter-conversion between identifier schemes

http://purl.obolibrary.org/obo/GO_0005886

http://bio2rdf.org/go:0005886

http://identifiers.org/go/GO:0005886
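
The three GO URIs above differ only in the text around the accession number, so inter-conversion can be sketched as "extract the accession, then re-insert it into another pattern". This is a local illustration under that assumption, not the identifiers.org web service; `TEMPLATES`, `parse_go_uri` and `convert_go_uri` are hypothetical names.

```python
# URI patterns taken from the three example URIs; {acc} is the numeric
# part of the GO accession (e.g. 0005886).
TEMPLATES = {
    "obo": "http://purl.obolibrary.org/obo/GO_{acc}",
    "bio2rdf": "http://bio2rdf.org/go:{acc}",
    "identifiers.org": "http://identifiers.org/go/GO:{acc}",
}


def parse_go_uri(uri: str):
    """Return (scheme name, accession digits) for a URI in a known scheme."""
    for scheme, template in TEMPLATES.items():
        prefix = template.format(acc="")
        rest = uri[len(prefix):]
        if uri.startswith(prefix) and rest.isdigit():
            return scheme, rest
    raise ValueError(f"unrecognised GO URI: {uri}")


def convert_go_uri(uri: str, target: str) -> str:
    """Rewrite a GO URI from whichever known scheme it uses into the target scheme."""
    _, acc = parse_go_uri(uri)
    return TEMPLATES[target].format(acc=acc)


print(convert_go_uri("http://purl.obolibrary.org/obo/GO_0005886", "identifiers.org"))
print(convert_go_uri("http://bio2rdf.org/go:0005886", "obo"))
```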

Support for different formats

[Diagram: Taxonomy, with the formats offered by its resources: html, html, RDF, json]

• The Registry records the formats provided by the various data resources

BioModels

http://www.ebi.ac.uk/biomodels/

http://biomodels.caltech.edu/

Model workflow within BioModels

BioModels model display


BioModels model classification

Quo vadis?

● Model curation is hard

● Model annotation is laborious

● We have moved from a lack of methods to scalability and usability issues

● Towards semi-automated annotation based on model clustering

● User-friendly tools for annotating models