My ontology is better than yours! Building and evaluating ontologies for integrative research
Introduction Biomedical ontology Use case: pharmacogenomics Outlook
My ontology is better than yours! Building and evaluating ontologies for integrative research
Robert Hoehndorf
Department of Genetics, University of Cambridge
Bio-Ontology SIG
Translational research
National Cancer Institute:
Translational research transforms scientific discoveries arising from laboratory, clinical, or population studies into clinical applications to reduce [disease] incidence, morbidity, and mortality.
slide by Robert Stevens
Biomedical ontologies
Gruber (1993):
An ontology is the explicit specification of a conceptualization of a domain.
controlled vocabularies
hierarchically organized
facilitate data integration
Biomedical ontologies
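[Figure: overview of biomedical ontologies. Entity labels: Population, Individual, Body, Organ, Tissue, Cell, Organelle, Molecule, Gene, Transcript. Category labels: Physical object, Quality, Function, Process. Ontologies shown: Gene Ontology, Cell type, Sequence Ontology, GO-CC, ChEBI Ontology, Anatomy Ontology, Phenotype Ontology.]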
Biomedical ontologies
How can we find the “best” ontology?
How can we develop the “best” ontology?
Biomedical ontologies: Ontology evaluation
Biomedical ontologies: Evaluation criteria
ontology design principles rooted in: best practices, philosophy, logic, ontology engineering, linguistics
community agreement
community requests
peer review
Ontology: Ontology evaluation
definitions
singular nouns
common relations
single is-a hierarchy
orthogonality
realism
...
Biomedical ontologies
Most ontology evaluation criteria are intrinsic criteria and evaluate what ontologies are.
How can we evaluate what ontologies do?
Biomedical ontologies: A functional perspective
Biomedical ontologies: Evaluation criteria
criteria from software engineering, etc.: user study, unit tests, complexity, ...
Biomedical ontologies: Evaluation criteria
criteria from biology: experiments, statistics (p-values), comparison to gold/silver standard, ...
Pharmacogenomics: Pharmacogenomics databases
Research questions
drug discovery
drug repurposing
drug response
drug pathways
disease pathways
causal mutations
Traditional approaches to drug repurposing
drug target identification
models of drug binding
experiment design and execution (e.g., binding assays)
analysis and interpretation of experiment results
Integrative approaches to drug repurposing
SIDER: text mining of drug labels; side-effect similarity; UMLS
PREDICT: disease–disease similarity; drug–drug similarity; disease phenotypes, gene functions, side effects, chemical structure, protein interactions, text mining; HPO, MeSH, GO
OFFSIDES: adverse event reports; ATC, UMLS
Pharmacogenomics
Can we get some novel information about drug indications (and causal mutations) by analyzing experimental data from animal models?
Approach
Relevant ontologies
Mammalian Phenotype Ontology: 9,161 classes; manually developed; annotation of animal models; formal (EQ) definitions
Human Phenotype Ontology: 9,796 classes; manually developed; annotation of diseases; formal (EQ) definitions
Challenges
1. comparison of human and mouse phenotypes: cross-species integration; how do we represent phenotypes?
2. computation of similarity: semantic similarity based on ontology taxonomy; which ontology do we use for computing similarity?
Cross-species phenotype integration
representation of MP and HPO phenotypes: PATO-based formal definitions; GO; homologous and analogous anatomical structures (UBERON)
aim: cross-species integration of phenotypes
What are phenotypes and how do we represent them (for cross-species integration)?
Abnormal appendix: E=Appendix, Q=Abnormal
representation:
appendix with quality Abnormal
quality Abnormal of some appendix
organism with appendix that has quality Abnormal
...
inheritance of phenotypes across parthood
Abnormality of tip of appendix subclass of Abnormality of appendix?
absence of appendix
Semantic similarity
Semantic similarity results depend on
the number of distinctions made by ontology developers
the kind of distinctions made by ontology developers
the data that is analyzed
the similarity measure
Semantic similarity
Should we compute phenotypic similarity based on the Human or the Mammalian Phenotype Ontology (or both)? How can we compare the results?
Ontology design decisions can be resolved empirically!
no a priori “right” way to represent phenotypes
focus on scientific results, not representation
evaluation: empirical, objective, quantitative, external
Ontology design decisions can be resolved empirically!
finish the analysis
use known gene–disease associations as gold standard
use FDA-approved drug indications as gold standard
compare analysis results against gold standard
Semantic similarity over phenotype ontologies measures phenotypic similarity
semantic similarity
pairwise comparison of disease and animal phenotypes
\[ \mathrm{sim}(P, D) = \frac{\sum_{x \in Cl(P) \cap Cl(D)} IC(x)}{\sum_{y \in Cl(P) \cup Cl(D)} IC(y)} \]
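The measure sim(P, D) on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the taxonomy in `PARENTS` is a hypothetical toy hierarchy (not MP or HPO), Cl(X) is taken to be the set of classes in X plus all their superclasses, and IC(x) is assumed to be the negative log of the fraction of annotated entities whose closure contains x. The slides do not define IC, so that choice is an assumption.

```python
import math

# Hypothetical toy is-a taxonomy (child -> parents); not taken from MP or HPO.
PARENTS = {
    "abnormal appendix size": ["abnormal appendix"],
    "abnormal appendix": ["abnormal gut"],
    "abnormal colon": ["abnormal gut"],
    "abnormal gut": ["phenotype"],
    "phenotype": [],
}

def closure(classes):
    """Cl(X): the given classes plus all of their superclasses."""
    seen = set()
    stack = list(classes)
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(PARENTS[c])
    return seen

def information_content(annotations):
    """Assumed definition: IC(x) = -log(fraction of annotation sets
    whose closure contains x). Classes never observed get no entry."""
    closures = [closure(a) for a in annotations]
    ic = {}
    for c in PARENTS:
        count = sum(1 for cl in closures if c in cl)
        if count:
            ic[c] = -math.log(count / len(closures))
    return ic

def sim(p, d, ic):
    """Sum of IC over the shared closure divided by sum of IC over the union."""
    cl_p, cl_d = closure(p), closure(d)
    denom = sum(ic[y] for y in cl_p | cl_d)
    if denom == 0:
        return 0.0  # only uninformative classes involved
    return sum(ic[x] for x in cl_p & cl_d) / denom

# Hypothetical annotation corpus: one phenotype set per annotated entity.
corpus = [
    {"abnormal appendix size"},
    {"abnormal appendix"},
    {"abnormal colon"},
    {"abnormal gut"},
]
ic = information_content(corpus)
print(sim({"abnormal appendix size"}, {"abnormal appendix"}, ic))
print(sim({"abnormal appendix size"}, {"abnormal colon"}, ic))
```

In this toy corpus the two appendix phenotypes share an informative superclass and get a nonzero score, while the appendix and colon phenotypes share only classes that every entity has, so their score is zero. Changing the taxonomy changes the scores, which is the dependence on ontology design that the talk points out.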
PhenomeNET compares phenotypes across species
ranking of genes for each disease
candidate genes for disease
Statistical testing to rank drug–disease pairs
one-sided Wilcoxon signed rank test
result: ranking of drugs for each disease based on p-value
low p-value: mutations in mouse genes associated with a drug result in phenotypes that are very similar to a disease phenotype
high p-value: genes uniformly distributed across ranks
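A rough sketch of how such a test could be computed is given here. The slide does not say exactly what the test is applied to, so the setup is an assumption: for one drug–disease pair, take the positions of the drug's associated genes in the disease's gene ranking and test whether they lie above the middle of the ranking (that is, have smaller rank numbers). The function name `wilcoxon_signed_rank_less` is hypothetical, and the p-value uses a plain normal approximation without tie or continuity correction.

```python
import math

def wilcoxon_signed_rank_less(ranks, n_genes):
    """One-sided Wilcoxon signed-rank test, normal approximation.

    Tests whether `ranks` tend to lie below the midpoint of a ranking
    of `n_genes` items. Returns an approximate p-value.
    """
    midpoint = (n_genes + 1) / 2
    # Differences from the midpoint; zero differences are dropped.
    diffs = [r - midpoint for r in ranks if r != midpoint]
    n = len(diffs)
    if n == 0:
        return 1.0
    # Rank the absolute differences, averaging ranks over ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    abs_rank = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            abs_rank[order[k]] = avg
        i = j + 1
    # W+ is the sum of ranks belonging to positive differences.
    w_plus = sum(r for r, d in zip(abs_rank, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))  # P(Z <= z)

# Hypothetical example: positions of a drug's genes among 100 ranked genes.
near_top = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
spread_out = [5, 15, 25, 35, 45, 55, 65, 75, 85, 95]
print(wilcoxon_signed_rank_less(near_top, 100))
print(wilcoxon_signed_rank_less(spread_out, 100))
```

Genes concentrated near the top of the ranking give a small p-value, and genes spread evenly across the ranking give a p-value near 0.5. Drugs can then be sorted by p-value for each disease.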
Receiver Operating Characteristic
Source: Wikipedia
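As a minimal sketch of how the area under a ROC curve (AUC) can be computed from scored predictions and a gold standard, the function below uses the standard pairwise interpretation: AUC is the fraction of (positive, negative) pairs in which the positive has the higher score, with ties counted as one half. The function name `roc_auc` and the example data are hypothetical and are not the talk's evaluation data.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a randomly chosen gold-standard
    positive is scored above a randomly chosen negative."""
    pos = [s for s, is_pos in zip(scores, labels) if is_pos]
    neg = [s for s, is_pos in zip(scores, labels) if not is_pos]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))

# Hypothetical similarity scores and gold-standard labels (1 = known association).
scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]
print(roc_auc(scores, labels))  # 0.75
```

An AUC of 0.5 corresponds to a random ranking and 1.0 to a ranking that places every known association above every other candidate.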
Gene-disease associations
[Figure: ROC curve, True Positive Rate vs. False Positive Rate. Title: PhenomeNET initial. AUC (original): 0.68]
Gene-disease associations
[Figure: ROC curves, True Positive Rate vs. False Positive Rate. Title: PhenomeNET improved. Curves: original, latest. AUC (original): 0.68; AUC (latest): 0.89]
Gene-drug associations
[Figure: ROC curve, True Positive Rate vs. False Positive Rate. Title: PhenomeDrug initial. AUC (original): 0.61]
Gene-drug associations
[Figure: ROC curves, True Positive Rate vs. False Positive Rate. Title: PhenomeDrug improved. Curves: original, latest. AUC (original): 0.61; AUC (latest): 0.67]
Representation of phenotypes for cross-species integration
'Abnormality of appendix' EquivalentTo: has-part some (part-of some (Appendix and has-quality some Quality))
organism-centric approach (has-part some)
transitivity over parthood (part-of some)
Quality used as indicator of abnormality
use of OWL EL
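The intended effect of transitivity over parthood can be mimicked with a small Python sketch. It does not perform OWL reasoning and is not the talk's implementation. It only reproduces the inference the slide aims for, namely that an abnormality of a part counts as an abnormality of every structure the part belongs to. The `PART_OF` assertions and both function names are hypothetical and purely illustrative.

```python
# Hypothetical part-of assertions (part -> wholes); illustrative only.
PART_OF = {
    "tip of appendix": ["appendix"],
    "appendix": ["intestine"],
    "intestine": [],
}

def wholes(entity):
    """The entity itself plus everything it is transitively part of."""
    seen = set()
    stack = [entity]
    while stack:
        e = stack.pop()
        if e not in seen:
            seen.add(e)
            stack.extend(PART_OF.get(e, []))
    return seen

def abnormality_subsumes(general, specific):
    """True if 'Abnormality of <specific>' should be treated as a
    subclass of 'Abnormality of <general>' under the part-of pattern."""
    return general in wholes(specific)

print(abnormality_subsumes("appendix", "tip of appendix"))  # True
print(abnormality_subsumes("tip of appendix", "appendix"))  # False
```

With this closure, an annotation with an abnormality of the tip of the appendix also counts toward abnormality of the appendix, which is what allows phenotypes described at different anatomical granularity to be compared.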
Representation of phenotypes for cross-species integration
'Large appendix' EquivalentTo: has-part some (Appendix and has-quality some 'Increased size')
organism-centric approach (has-part some)
no transitivity over parthood
use of OWL EL
Absence
'Absence of appendix' EquivalentTo: has-part some (Appendix and has-quality some Absent)
subclass of Abnormality of appendix
use of OWL EL
Semantic similarity
Should we compute phenotypic similarity based on the Human or the Mammalian Phenotype Ontology (or both)? How can we compare the results?
Semantic similarity
Computation of semantic similarity using the Mammalian Phenotype Ontology improves the analysis results.
problem specific
depending on mouse data
depending on the approach
depending on similarity measure
depending on gold standard dataset
Conclusion
Quantitative, external evaluation can improve ontologies and ontology-based analysis methods.
Annotation
Definitions:
intrinsic: having definitions; Aristotelian definitions
external: having definitions that are easily understandable; having definitions that improve annotation consistency
criteria: measure annotation consistency; user study
Dolan, M. E., et al. A procedure for assessing GO annotation consistency. Bioinformatics 21, i136–i143 (2005).
Annotation
Labels:
intrinsic: singular nouns; reference to universals
external: use of common, widely used terms; use of unambiguous terms
criteria: measure annotation consistency; user study; recall in text
Yao, L., et al. Benchmarking Ontologies: Bigger or Better? PLoS Comput Biol 7, e1001055 (Jan. 2011).
Knowledge bases and querying
Queries:
intrinsic: use of OWL; use of specific relations; use of upper level ontology; consistency
external: retrieve correct answers; retrieve relevant answers
criteria: user study (to evaluate query answers); test set; comparison to gold standard
Boeker, M., et al. Unintended consequences of existential quantifications in biomedical ontologies. BMC Bioinformatics 12, 456 (2011).
Conclusions
My ontology is better than yours.
My ontology can do some things better than your ontology.
Conclusions: Quantitative criteria
Empirical, objective, quantitative, application-based evaluation will allow us to systematically improve ontologies for science.
Thank you for your attention
Semantic similarity
Semantic similarity: 112
Semantic similarity: 412