Post on 12-Jan-2016
1
Manchester Medical Informatics Group OpenGALEN
Linking Formal Ontologies: Scale, Granularity and Context
Alan Rector
Medical Informatics Group, University of Manchester
www.cs.man.ac.uk/mig
www.opengalen.org
img.cs.man.ac.uk
rector@cs.man.ac.uk
2
Why use Logic-based Ontologies?
because Knowledge is Fractal! & Changeable!
3
Four Roles of Terminology/Ontologies
• Content of Databases and Patient Records
– Structural linkage within EPR/EHR & messages
– Content of EPR/EHR & messages
• Capturing information - the user interface
• Linkage between domains
– Health and Bio Sciences
– Macro, Micro, and Molecular scales
– Contexts: Normal / abnormal; species; stage of development
– Healthcare delivery and Clinical research
– Patient Records and Decision Support
• Indexing Information
– Metadata and the semantic web
• www.semanticweb.org www.w3c.org
4
Logic based ontologies
• The descendants of frame systems and object hierarchies via KL-ONE
• “is-kind-of” = “implies”
– “Dog is a kind of wolf” means “All dogs are wolves”
– Therefore logically computable
• Modern examples: OIL, DAML+OIL (“OWL”?)
– Underpinned by the FaCT family of Description Logic Reasoners
• Others: LOOM, CLASSIC, BACK, GRAIL, ...
• www.ontoknowledge.org/oil www.semanticweb.org
5
Logic-based Ontologies: Conceptual Lego
hand, extremity, body, acute, chronic, abnormal, normal, ischaemic, deletion, bacterial, polymorphism, cell, protein, gene, infection, inflammation, Lung, expression
6
Logic-based Ontologies: Conceptual Lego
“SNPolymorphism of CFTRGene causing Defect in MembraneTransport of ChlorideIon causing Increase in Viscosity of Mucus in CysticFibrosis…”
“Hand which is anatomically normal”
7
What’s in a “Logic based ontology”?
• Primitive concepts - in a hierarchy
– Described but not defined
• Properties - relations between concepts
– Also in a hierarchy
• Descriptors - property-concept pairs
– qualified by “some”, “only”, “at least”, “at most”
• Defined concepts
– Made from primitive concepts and descriptors
• Axioms
– disjointness, further description of defined concepts
• A Reasoner
– to organise it for you
8
Logic Based Ontologies: A crash course
Thing
  Structure
    Heart
    MitralValve
    Encrustation
  Feature
    pathological
    red
MitralValve* ALWAYS partOf: Heart
Encrustation* ALWAYS feature: pathological
Encrustation + involves: MitralValve
  + (feature: pathological)
Thing + feature: pathological
Structure + feature: pathological + involves: Heart
red + partOf: Heart
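The classification this slide illustrates can be sketched in a few lines of Python. This is a toy structural check, not how FaCT or GRAIL work internally. The names `PARENT`, `ALWAYS`, `expand` and `subsumes` are made up for illustration, and the rule that "involves a part" implies "involves the whole" is an assumption read off the diagram rather than something the slide states.

```python
# Primitive is-kind-of hierarchy from the slide (child -> parent).
PARENT = {
    "Structure": "Thing", "Feature": "Thing",
    "Heart": "Structure", "MitralValve": "Structure", "Encrustation": "Structure",
    "pathological": "Feature", "red": "Feature",
}

# "ALWAYS" axioms: descriptors that every instance of a concept carries.
ALWAYS = {
    "MitralValve": {("partOf", "Heart")},
    "Encrustation": {("feature", "pathological")},
}

def ancestors(name):
    """Return the concept followed by all of its is-kind-of ancestors."""
    out = [name]
    while name in PARENT:
        name = PARENT[name]
        out.append(name)
    return out

def expand(base, descriptors):
    """Add descriptors implied by the ALWAYS axioms.

    Assumption (hypothetical rule): 'involves X' plus 'X ALWAYS partOf Y'
    yields 'involves Y'.
    """
    result = set(descriptors)
    for concept in ancestors(base):
        result |= ALWAYS.get(concept, set())
    changed = True
    while changed:
        changed = False
        for prop, filler in list(result):
            if prop != "involves":
                continue
            for prop2, whole in ALWAYS.get(filler, set()):
                if prop2 == "partOf" and ("involves", whole) not in result:
                    result.add(("involves", whole))
                    changed = True
    return result

def subsumes(general, specific):
    """True if every instance of `specific` must also be a `general`."""
    g_base, g_desc = general
    s_base, s_desc = specific
    if g_base not in ancestors(s_base):
        return False
    have = expand(s_base, s_desc)
    return all(
        any(p == hp and f in ancestors(hf) for hp, hf in have)
        for p, f in g_desc
    )

mitral_encrustation = ("Encrustation", {("involves", "MitralValve")})
pathological_thing = ("Thing", {("feature", "pathological")})
pathological_heart_structure = (
    "Structure", {("feature", "pathological"), ("involves", "Heart")}
)
```

Under these assumptions, `subsumes(pathological_thing, mitral_encrustation)` and `subsumes(pathological_heart_structure, mitral_encrustation)` both hold, which is the placement the reasoner is expected to find without being told.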
9
Bridging Bio and Health Informatics
• Define concepts with ‘pieces’ from different scales and disciplines
– “Polymorphism which causes defect which causes disease”
• Define concepts which make context explicit
– “‘Hand which is anatomically normal’ has five fingers”
• Separate properties for different contexts/views
– “Abnormalities of clinical parts of the heart”
• includes pericardium
10
Bridging Scales and context with Ontologies
Genes, Species, Protein, Function, Disease
Gene in Species
Protein coded by gene in species
Function of Protein coded by gene in species
Disease caused by abnormality in Function of Protein coded by gene in species
11
Representing context and views by variant properties
Organ: Heart, Pericardium
OrganPart: CardiacValve
Disease of (is_part_of) Heart
Disease of Pericardium
is_part_of
  is_structurally_part_of
  is_clinically_part_of
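A minimal sketch of how a property hierarchy can carry different views. The two facts below are assumptions chosen to match the slide's example (pericardium counted as a clinical part of the heart, a cardiac valve as a structural part). `SUBPROPERTY_OF`, `FACTS`, `holds` and `parts` are illustrative names only.

```python
# Property hierarchy from the slide: two specialisations of is_part_of.
SUBPROPERTY_OF = {
    "is_structurally_part_of": "is_part_of",
    "is_clinically_part_of": "is_part_of",
}

# Asserted facts (assumed for illustration): (part, property, whole).
FACTS = {
    ("CardiacValve", "is_structurally_part_of", "Heart"),
    ("Pericardium", "is_clinically_part_of", "Heart"),
}

def holds(part, prop, whole):
    """True if the fact is asserted with `prop` or any of its sub-properties."""
    for s, p, o in FACTS:
        if s != part or o != whole:
            continue
        while p is not None:
            if p == prop:
                return True
            p = SUBPROPERTY_OF.get(p)
    return False

def parts(whole, prop):
    """All parts of `whole` under the view selected by `prop`."""
    return sorted({s for s, _, o in FACTS if o == whole and holds(s, prop, whole)})

print(parts("Heart", "is_part_of"))
print(parts("Heart", "is_structurally_part_of"))
```

Querying with the general `is_part_of` returns both the valve and the pericardium, so "Disease of (is_part_of) Heart" covers pericardial disease, while the structural view leaves the pericardium out.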
12
The cost: Ontologies are not Thesauri
organ
  heart (kind)
    heart valve (part)
      aortic valve (kind)
        aortic valve cusp (part)
A Mixed Hierarchy
Works for navigation by humans
Works for “Disease of…” and “Procedure on…”
Fails for “Surface of…”
How can the computer know the difference?
13
From a thesaurus to a logic-based ontology
Untangle part-whole and is-kind-of in anatomic ontology
Link Clinical Ontology with Anatomical ontology
Add rule that “Disorder of part is a disorder of whole”
Reasoner can then create automatically:
disorder of organ
  disorder of heart
    disorder of valve in heart
      disorder of aortic valve in heart
        disorder of cusp in aortic valve in heart
A logic-based is-kind-of (subsumption) hierarchy
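The effect of keeping part-whole and is-kind-of separate can be shown with a small sketch. The concept names follow the mixed hierarchy from the previous slide. `KIND_OF`, `PART_OF` and the helper functions are hypothetical, and real reasoners derive this from definitions rather than from a hand-written traversal.

```python
# Untangled anatomy: is-kind-of and part-whole kept as separate relations.
KIND_OF = {"Heart": "Organ", "AorticValve": "HeartValve"}
PART_OF = {"HeartValve": "Heart", "AorticValveCusp": "AorticValve"}

def generalisations(concept, follow_parts):
    """Concepts reachable upward via is-kind-of, and via part-of if allowed."""
    seen, todo = set(), [concept]
    while todo:
        current = todo.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in KIND_OF:
            todo.append(KIND_OF[current])
        if follow_parts and current in PART_OF:
            todo.append(PART_OF[current])
    return seen

def disorder_subsumed_by(a, b):
    """'Disorder of a' is a kind of 'Disorder of b': part-whole is followed."""
    return b in generalisations(a, follow_parts=True)

def surface_subsumed_by(a, b):
    """'Surface of a' is a kind of 'Surface of b': only is-kind-of is followed."""
    return b in generalisations(a, follow_parts=False)

print(sorted(generalisations("AorticValveCusp", follow_parts=True)))
```

With the relations separated, the computer can tell the difference: a disorder of the aortic valve cusp is classified as a disorder of the heart, but the surface of the aortic valve is not a surface of the heart.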
14
Examples common in Bio Ontologies
Is part of: Golgi membrane, Integral protein
Is part of: Plasma membrane, Apical plasma membrane
15
The Cost: Normalising (untangling) Ontologies
[Figure: Structure, Function and Part-whole hierarchies, shown tangled and then separated]
16
The Cost: Normalising (untangling) Ontologies
Making each meaning explicit and separate
… ActionRole PhysiologicRole HormoneRole CatalystRole …
… Substance BodySubstance Protein Steroid …
PhysSubstance
  Protein
    ProteinHormone
      Insulin
    Enzyme
  Steroid
    SteroidHormone
  Hormone
    ProteinHormone^
      Insulin^
    SteroidHormone^
  Catalyst
    Enzyme^
Hormone = Substance & playsRole-HormoneRole
ProteinHormone = Protein & playsRole-HormoneRole
SteroidHormone = Steroid & playsRole-HormoneRole
Catalyst = Substance & playsRole-CatalystRole
Enzyme ?=? Protein & playsRole-CatalystRole
...and helping keep argument rational and meetings short!
PhysSubstance
  Protein
    ‘ProteinHormone’
      Insulin
    ‘Enzyme’
  Steroid
    ‘SteroidHormone’
  ‘Hormone’
    ‘ProteinHormone’
      Insulin^
    ‘SteroidHormone’
  ‘Catalyst’
    ‘Enzyme’
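The definitions above can be turned into a toy classifier to show how the tangled hierarchy is recovered by inference rather than asserted by hand. This is a sketch under assumptions: the substance hierarchy is simplified, the fact that Insulin plays HormoneRole is asserted here for illustration, and the open question about Enzyme is deliberately left out. `DEFINED`, `PLAYS` and `classify` are made-up names.

```python
# Simplified primitive substance hierarchy (child -> parent).
PARENT = {"Protein": "Substance", "Steroid": "Substance", "Insulin": "Protein"}

# Roles asserted on primitives (assumption for illustration).
PLAYS = {"Insulin": {"HormoneRole"}}

# Defined concepts from the slide: name -> (base substance, role played).
DEFINED = {
    "Hormone": ("Substance", "HormoneRole"),
    "ProteinHormone": ("Protein", "HormoneRole"),
    "SteroidHormone": ("Steroid", "HormoneRole"),
    "Catalyst": ("Substance", "CatalystRole"),
}

def kinds(concept):
    """The concept plus all of its is-kind-of ancestors."""
    out = [concept]
    while concept in PARENT:
        concept = PARENT[concept]
        out.append(concept)
    return out

def roles(concept):
    """Roles played by the concept, including those inherited from ancestors."""
    played = set()
    for k in kinds(concept):
        played |= PLAYS.get(k, set())
    return played

def classify(concept):
    """Defined concepts whose definition the concept satisfies."""
    return sorted(
        name for name, (base, role) in DEFINED.items()
        if base in kinds(concept) and role in roles(concept)
    )

print(classify("Insulin"))
```

Insulin is only ever described as a protein that plays a hormone role. Its placement under both ProteinHormone and Hormone falls out of the definitions, which is why the repeated `^` entries in the tangled tree no longer need to be maintained manually.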
17
The Cost
• You can’t say everything you want to
– Expressiveness costs computational complexity
• More inference takes more time
– Scaling for complex tasks still being investigated
• Many other kinds of reasoning needed
It doesn’t make the coffee!
18
Other benefits
• Limit combinatorial explosions
From “phrase book” to “dictionary + grammar”
Avoid the “exploding bicycle”
– 1980 - ICD-9 (E826): 8
– 1990 - READ-2 (T30..): 81
– 1995 - READ-3: 87
– 1996 - ICD-10 (V10-19): 587
• V31.22 Occupant of three-wheeled motor vehicle injured in collision with pedal cycle, person on outside of vehicle, nontraffic accident, while working for income
– and meanwhile elsewhere in ICD-10
• W65.40 Drowning and submersion while in bath-tub, street and highway, while engaged in sports activity
• X35.44 Victim of volcanic eruption, street and highway, while resting, sleeping, eating or engaging in other vital activities
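The difference between a "phrase book" and a "dictionary + grammar" is a difference between a product and a sum, which a short sketch makes concrete. The axes and values below are hypothetical, loosely modelled on the rubrics quoted above. They are not the actual ICD-10 axes and the counts are not the ICD-10 counts.

```python
from itertools import product

# Hypothetical axes of variation for a transport-accident code.
AXES = {
    "victim": ["pedal cyclist", "motorcycle rider",
               "occupant of three-wheeled motor vehicle", "car occupant"],
    "counterpart": ["pedestrian", "pedal cycle", "car", "fixed object"],
    "position": ["driver", "passenger", "person on outside of vehicle"],
    "setting": ["traffic accident", "nontraffic accident"],
    "activity": ["while working for income",
                 "while engaged in sports activity", "while resting"],
}

def phrase_book(axes):
    """Enumerate every pre-coordinated phrase: one entry per combination."""
    return [", ".join(combo) for combo in product(*axes.values())]

def dictionary_size(axes):
    """A compositional scheme only stores the individual terms."""
    return sum(len(values) for values in axes.values())

print(len(phrase_book(AXES)), "phrases versus", dictionary_size(AXES), "terms")
```

Adding one more value to any axis adds a single dictionary term but multiplies into the phrase book, which is how the bicycle "explodes" between coding-scheme releases.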
19
Other benefits
• Index and assemble information
[Figure: Hypertension and Idiopathic Hypertension indexed against “In our company’s studies”, “Study a” and “Phase 2”]
20
Summary: Logic based ontologies because
Knowledge is Fractal
• Link “Conceptual Lego”
– at all levels
• indefinitely
– Spanning scales, genotype, phenotype, etc.
• Model context and views
– Express differences explicitly
• Manage combinatorial explosion
• Index information efficiently
Next step: Larger scale demonstrations in Genotype to Phenotype