Post on 24-Feb-2016
Collaborative ontology development by scientists
Melissa Haendel
Setting the stage
1. Who are we and what do we need?
2. What are our bottlenecks?
   - Getting info from the domain experts
   - Ontology tools
   - Synchronizing ontologies
3. Modularizing anatomy ontologies
4. Ideas for collaborative ontology editing
Who are we? What do we want?
Domain Experts: Anatomists, comparative morphologists, developmental biologists, immunologists, neuroscientists, etc.
Ontologists: Biologists-gone-informatics, computer scientists and logicians
Engineers: Our tool builders
Ontologies and tools to develop them
Domain experts: want to query for gene expression and phenotypes across species
Ontologists: have to be able to interpret and represent domain knowledge computationally
Engineers: have to build tools that can consume ontologies and give the Domain Experts the right results
Anatomy and phenotype ontologies have to work hard for us
Ontologies must be intelligible to:
- Humans
- Machines
They must:
- Enable comparison of structures across different organisms
- Standardization of vocabulary among communities
- Integration across databases
- Query across large amounts of data
- Automatic reasoning to infer related classes
- Error checking
- Annotation consistency
Ontology development workflow and bottlenecks
[Figure, built up over several slides. Labels: Term needed for annotation; Term requested; Term discussed by community; reconcile; ontologies shown: GO, CL, CARO, TAO, AAO, XAO, ZFA, MA, MP, UBERON; Synchronize?]
Three bottlenecks
1) Extracting domain knowledge into an ontology efficiently
2) Multiple ontology editing tools, each with pros and cons, none easily used by domain experts
3) Synchronization across interoperable ontologies
How can we increase the efficiency of extracting knowledge from domain experts?
An example of what has worked well so far:
[Image: Christian Schussele, 1862]
Familiar tooling: Google Docs, Phenote, Excel
Visualization: Cmap, Vue, GraphViz
Need to merge different sources of information
Need a way to get this information into a computable form
Two ontology editors (and viewers) commonly used by the biomedical community
OBO-Edit: OBO ontology editor and viewer, http://oboedit.org/
Protégé: OWL ontology editor and viewer, http://protege.stanford.edu/
Both tools are non-trivial to learn to use.
Neither offers many bulk operations, imports or exports different formats easily, or deals with synchronization readily.
There is a barrier for domain experts to contribute knowledge, and a bottleneck for editors to get this knowledge into ontologies efficiently.
More biologist-friendly (thank you John!)
Tool used by broader community
How to synchronize ontologies
Three approaches:
- Mapping (bioportal set, ..)
- Direct reconciliation (TAO and ZFA)
- Synchronization using imports
Ontology mappings are often not useful
Example pairs:
- FMA (human) tibia <-> FBbt (fruitfly) tibia
- FMA extensor retinaculum of wrist <-> MA retina
- GAZ (geography) Colon <-> FMA (human) Colon
- ZFA (zebrafish) aortic arch <-> MA (mouse) arch of aorta
- GAZ (geography) Serpentine <-> CHEBI (chemistry) serpentine
- Dictyostelium giant cell <-> FMA giant cell
- ZFA (zebrafish) blastoderm <-> FBbt blastoderm stage
- PATO (quality) male <-> CHEBI (chemical) maleate 2(-)
(For anatomy, you may want to remove the mappings that NCBO BioPortal creates for your ontology and/or ask that mapping not be allowed)
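To illustrate why purely lexical mappings misfire, here is a minimal Python sketch of a naive label matcher. The labels are taken from the examples above; the dictionary layout and the function names `normalize` and `lexical_mappings` are assumptions made for illustration and do not describe how BioPortal actually computes its mappings.

```python
import re

# Toy label sets. The labels come from the slide; the data layout is an assumption.
ONTOLOGIES = {
    "FMA": ["tibia", "colon", "giant cell"],
    "FBbt": ["tibia", "blastoderm stage"],
    "GAZ": ["Colon", "Serpentine"],
    "CHEBI": ["serpentine"],
    "ZFA": ["blastoderm"],
}


def normalize(label):
    """Lowercase and collapse whitespace, as a naive string matcher would."""
    return re.sub(r"\s+", " ", label.strip().lower())


def lexical_mappings(ontologies):
    """Return sorted (label, ontology_a, ontology_b) triples for shared labels."""
    seen = {}
    for ont, labels in ontologies.items():
        for label in labels:
            seen.setdefault(normalize(label), set()).add(ont)
    pairs = []
    for label, onts in seen.items():
        ordered = sorted(onts)
        for i, first in enumerate(ordered):
            for second in ordered[i + 1:]:
                pairs.append((label, first, second))
    return sorted(pairs)


MAPPINGS = lexical_mappings(ONTOLOGIES)
for label, first, second in MAPPINGS:
    print(f"{first} '{label}' <-> {second} '{label}'")
```

Every pair it reports matches on the string alone: a human bone with a fly leg segment, an organ with a place name, a mineral with a chemical. None of them says anything about whether the two classes mean the same thing.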
Reconciliation and linking between TAO and ZFA
Zebrafish terms are is_a subtypes of teleost terms
[Figure: Zebrafish Anatomy is_a Teleost Anatomy Ontology]
- Logic implemented via Xrefs: difficult to keep synchronized
- Xrefs logic can be less clear and more difficult to use
Synchronization by import across ontologies
- One can import a whole ontology or just portions of another ontology
- MIREOT: Minimum information to reference an external ontology term
This strategy requires better facilities while editing
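As a rough sketch of what a MIREOT-style reference carries, the Python below emits a Turtle stub for one external term. It assumes the minimum set is the term IRI, the source ontology IRI, and the parent class in the importing ontology, with the label as an optional extra. All `example.org` IRIs are hypothetical, and the "imported from" property IRI is given as I recall it from IAO, so verify it before relying on it.

```python
# "Imported from" annotation property; believed to be IAO_0000412, verify before use.
IMPORTED_FROM = "http://purl.obolibrary.org/obo/IAO_0000412"
OWL_CLASS = "http://www.w3.org/2002/07/owl#Class"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def escape(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def mireot_stub(term_iri, source_ontology_iri, parent_iri, label=None):
    """Return Turtle declaring an external term under a local parent class."""
    lines = [f"<{term_iri}> a <{OWL_CLASS}> ;"]
    if label is not None:
        lines.append(f'    <{RDFS}label> "{escape(label)}" ;')
    lines.append(f"    <{RDFS}subClassOf> <{parent_iri}> ;")
    lines.append(f"    <{IMPORTED_FROM}> <{source_ontology_iri}> .")
    return "\n".join(lines)


# Hypothetical IRIs, for illustration only.
STUB = mireot_stub(
    term_iri="http://example.org/upper-anatomy#tibia",
    source_ontology_iri="http://example.org/upper-anatomy",
    parent_iri="http://example.org/my-anatomy#bone",
    label="tibia",
)
print(STUB)
```

The stub keeps the external term's own IRI, so the reference stays tied to its source rather than becoming a local copy that can drift.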
[Figure: Present TAO; Modularized ontology. Labels shown: CARO, VAO]
OntoFox: a Web Server for MIREOTing
http://ontofox.hegroup.org
Good things:
- Based on MIREOT principle
- Web-based data input and output
- Output OWL file can be directly imported in your ontology
- No programming needed
- Programmatically accessible
Improvements:
- Integration into ontology editing tools
- More customizable
We need synchronization solutions that are integrated within ontology editing tools
What IS the anatomy ontology landscape?
How can we efficiently build our anatomy ontologies to be most interoperable?
We could have built:
- A single ontology for ontology editors and consumers
- Different editors have editing rights to different ontology partitions
  - by taxon
  - by domain (e.g. neuroscience, skeletal anatomy)
- No taxon-specific subtypes: use structure, function etc. as differentia
- Dynamic views according to user needs
Ontology landscape model view
[Figure: a small sample of linked terms (cell, tissue, muscle tissue, mesonephros, limb, antenna, weberian ossicle, mammary gland, nervous system, mollusc foot, tentacle, mantle, pupal DN3 period neuron, mushroom body, brachial lobe, pons, vertebra, vertebral column, circulatory system, appendage, mesoderm, gut, tibia, gland, bone, skeletal tissue, parietal bone, fin, gonad, trachea, respiratory airway, tibiafibula, larva, metencephalon, ventral nerve cord) with user/editor views marked: neuro view, skeletal view, mammalian view, mollusc view]
Proposed model moving forward
- Maintain series of ontologies at different taxonomic levels: euk, plant, metazoan, vertebrate, mollusc, arthropod, insect, mammal, human, drosophila
- Each ontology imports/MIREOTs relevant subset of ontology “above” it; this is recursive
- Subtypes are only introduced as needed
- Work together on commonalities at appropriate level above your ontology
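The recursive import idea can be sketched in a few lines of Python. The chain of levels and the placement of terms below are illustrative assumptions, and for simplicity each level inherits everything from the level above rather than a selected subset as the proposal describes.

```python
# Hypothetical chain of taxonomic levels; None marks the top of the chain.
PARENT = {
    "zebrafish": "teleost",
    "teleost": "vertebrata",
    "vertebrata": "metazoa",
    "metazoa": "all",
    "all": None,
}

# Terms declared at each level (illustrative placement, not authoritative).
OWN_TERMS = {
    "all": {"cell", "tissue"},
    "metazoa": {"muscle tissue"},
    "vertebrata": {"limb", "mesonephros"},
    "teleost": {"weberian ossicle"},
    "zebrafish": set(),
}


def visible_terms(level):
    """Own terms plus everything imported recursively from the levels above."""
    terms = set(OWN_TERMS[level])
    parent = PARENT[level]
    if parent is not None:
        terms |= visible_terms(parent)
    return terms


def defining_level(term, level):
    """Walk up the chain and return the level that declares the term, or None."""
    while level is not None:
        if term in OWN_TERMS[level]:
            return level
        level = PARENT[level]
    return None


print(sorted(visible_terms("zebrafish")))
print(defining_level("limb", "zebrafish"))
```

A term is declared once, at the most general level where it applies, and every ontology below sees it through the import chain. Lower-level terms never leak upward.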
Model view
[Figure: ontologies at taxonomic levels (caro / uberon / all, metazoa, vertebrata, arthropoda, teleost, mammalia, mollusca, cephalopod, drosophila, amphibia, zebrafish, mouse, human) connected by import, with a sample of cross-ontology links. Sample terms: cell, tissue, muscle tissue, mesonephros, limb, antenna, weberian ossicle, mammary gland, nervous system, foot, tentacle, mantle, neuron types XYZ, mushroom body, brachial lobe, NO pons, vertebra, vertebral column, circulatory system, appendage, mesoderm, gut, tibia, gland, bone, skeletal tissue, parietal bone, fin, gonad, trachea, respiratory airway, tibiafibula, larva, shell, cuticle, skeleton]
Idealized protocol for new AOs
1. Collect draft list of terms
2. Subdivide roughly into applicability at taxonomic levels
3. Request new terms from existing AOs above you
4. Is a new mid-level AO required?
   - yes: collaborate and create, go to 1.
5. Import pre-reasoned subset from next AO above
6. Build your ontology (David will take it from here in his talk later today)
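Steps 1 to 3 and 5 of the protocol amount to a triage of the draft term list. The Python sketch below shows one possible reading. The draft list, the level assigned to each term, the contents of the existing AOs, and the function name `triage` are all hypothetical.

```python
# Hypothetical draft list: term -> taxonomic level at which it applies.
DRAFT = {
    "cell": "all",
    "bone": "vertebrata",
    "fin": "vertebrata",
    "weberian ossicle": "teleost",
    "larva": "zebrafish",
}

# Hypothetical contents of the AOs that already exist above the new one.
EXISTING = {
    "all": {"cell", "tissue"},
    "vertebrata": {"bone"},
    "teleost": set(),
}


def triage(draft, existing, own_level):
    """Split draft terms into those to import, request from above, or build locally."""
    plan = {"import": [], "request": [], "build": []}
    for term, level in sorted(draft.items()):
        if level == own_level:
            plan["build"].append(term)
        elif term in existing.get(level, set()):
            plan["import"].append(term)
        else:
            plan["request"].append((term, level))
    return plan


PLAN = triage(DRAFT, EXISTING, own_level="zebrafish")
print(PLAN)
```

Terms in the "request" group go to the maintainers of the AO at the named level. If no AO exists at that level, that is the signal for step 4.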
Modularizing ontologies: positive reinforcement
- Identify key points of integration between ontologies
- Modularize based on domain or taxon
- Import and reuse rather than cross-referencing or “aligning”
- Let the reasoner help do the work
- Work together to distribute work
Modularizing ontologies – We need:
• To get the imports working well
• To have distributed social responsibility assigned
• Design patterns to ensure we are all doing the same thing
• To check for consistency and errors across multiple ontologies using reasoners to get correct results for all users
  - These ontologies are supposed to be orthogonal but aren’t always
• Visualization tools that can aid non-ontology experts in identifying errors across multiple ontologies
Returning to the bottlenecks in our process… Looking for solutions
- Need easy-to-use tools for information capture
  - Ideally based on existing familiar tools
  - Auto-populated from/to ontologies
  - Social management: who is responsible for what
- Need better import/export functionality:
  - into/out of ontology editors from simple collection tools
  - from a myriad of ontology sources
- Need better interoperability between editors/formats
- Need enhanced bulk operations
- Need to know specific requirements for building tools and user feedback
- Need money and opportunities to interact (like this one!)
Existing tools for collaborative ontology editing don’t quite get us there
Google Refine has nice features for manipulating data, including RDF exports, but isn’t collaborative
Mapping Master for Protégé enables generation of OWL from spreadsheets, but is not collaborative and requires ontology knowledge
Web Protégé isn’t fully-fledged and is not useful for non-technical contribution
Ideas for collaborative ontology editing
Example: File extracted from ontology for this meeting:
- Extracted from ontology with Perl script
- Needs to be edited by domain experts, and then converted back into OWL
- Needs to be merged with existing OWL file
There is a better way…..
Ideas for using Google Docs
Enable creation of Google spreadsheets that curators and domain experts can edit with the following features:
- Tell Google spreadsheet which columns are which from ontology input file: labels, parents, URIs, xref, class, etc.
- Live-updated with latest external ontology versions using SPARQL
- Export OBO/RDF/OWL serialization
- Enable search on external ontologies via autocomplete
- Track changes
This will solve some of the sync problems because the queries are executed whenever the doc is open or updated
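As a minimal sketch of the export feature, the Python below turns spreadsheet rows (here a CSV export) into OBO `[Term]` stanzas. The column names, the `EX:` and `OTHER:` identifiers, and the function name `rows_to_obo` are assumptions made for illustration. A real export would also need an OBO header and ID management.

```python
import csv
import io

# Hypothetical sheet export; column names and IDs are placeholders.
SHEET_CSV = """id,label,parent,xref
EX:0000001,bone,,
EX:0000002,tibia,EX:0000001,OTHER:0000001
"""


def rows_to_obo(csv_text):
    """Convert rows with id/label/parent/xref columns into OBO [Term] stanzas."""
    stanzas = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        lines = ["[Term]", f"id: {row['id']}", f"name: {row['label']}"]
        if row["xref"]:
            lines.append(f"xref: {row['xref']}")
        if row["parent"]:
            lines.append(f"is_a: {row['parent']}")
        stanzas.append("\n".join(lines))
    return "\n\n".join(stanzas) + "\n"


OBO = rows_to_obo(SHEET_CSV)
print(OBO)
```

Mapping columns to tags explicitly, as in the first feature of the list, is what keeps the spreadsheet editable by domain experts while the output stays machine-readable.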
Ideas for using Google Docs
Enable creation of Google Drawings that curators and domain experts can edit with the following features:
- Import of external ontologies
- Have relations and classes exported out from Google Drawing
- Export OBO/RDF/OWL serialization
- Linked to Google Spreadsheet
- Track changes
Ontology editor dreams
A truly collaborative web-based editing platform (a la Web Protégé) compatible with OWL and OBO
Supporting:
- Import and export of customizable spreadsheets from Google Docs
- Creation of “live templates” (spreadsheet in sync with SPARQL endpoints)
- MIREOT import
- User roles and permissions
- Web-based versioning
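One way to picture a "live template" is a spreadsheet that periodically asks a SPARQL endpoint for the current labels of the terms it references and flags rows that have drifted. The Python sketch below builds such a query and compares the results. It makes no network call, and the IRIs, the fetched labels, and the function names are hypothetical.

```python
RDFS_PREFIX = "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>"


def label_query(iris):
    """Build a SPARQL query that fetches rdfs:label for each given term IRI."""
    values = " ".join(f"<{iri}>" for iri in iris)
    return "\n".join([
        RDFS_PREFIX,
        "SELECT ?term ?label WHERE {",
        f"  VALUES ?term {{ {values} }}",
        "  ?term rdfs:label ?label .",
        "}",
    ])


def stale_rows(sheet_labels, endpoint_labels):
    """Return IRIs whose sheet label differs from, or is missing at, the endpoint."""
    return sorted(
        iri for iri, label in sheet_labels.items()
        if endpoint_labels.get(iri) != label
    )


# Hypothetical sheet contents and endpoint results.
SHEET = {
    "http://example.org/a#tibia": "tibia",
    "http://example.org/a#fin": "fin",
}
FETCHED = {
    "http://example.org/a#tibia": "tibia",
    "http://example.org/a#fin": "pectoral fin",
}

QUERY = label_query(sorted(SHEET))
STALE = stale_rows(SHEET, FETCHED)
print(QUERY)
print(STALE)
```

Running the comparison whenever the document is opened or edited would surface upstream changes to curators without requiring them to open an ontology editor.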