VR. Formal Principles for Biomedical Ontologies Barry Smith .

Formal Principles for Biomedical Ontologies

Barry Smith

http://ifomis.de

Three levels of ontology

1) formal (top-level) ontology dealing with categories employed in every domain:

object, event, whole, part, instance, class

2) domain ontology, applies top-level system to a particular domain

cell, gene, drug, disease, therapy

3) terminology-based ontology

large, lower-level system

Dupuytren’s disease of palm, nodules with no contracture

Compare:

1) pure mathematics (re-usable theories of structures such as order, set, function, mapping)

2) applied mathematics, applications of these theories = re-using the same definitions, theorems, proofs in new application domains

3) physical chemistry, biophysics, etc. = adding detail

Three levels of biomedical ontology

1) formal (top-level) ontology = medical ontology has nothing like the technology of re-usable definitions, theorems and proofs provided by pure mathematics

2) domain ontology = UMLS Semantic Network, GALEN CORE

3) terminology-based ontology = UMLS, SNOMED-CT, GALEN, FMA

?????

Description Logic, Protégé, and other tools for supporting automatic reasoning do not fill this gap

they do not provide theories of classes, functions, processes, etc.

rather: successful coding in a DL-framework presupposes that such theories have already been applied in the very construction of the terminology-based ontology

IFOMIS

Institute for Formal Ontology and Medical Information Science,

mission:

use basic principles of philosophical ontology, traditional theories of classification and definition for quality assurance and alignment of biomedical ontologies

Strategy

Part 1: Survey of GO

Part 2: Provide principles for building biomedical ontologies derived from formal (top-level) ontology, and illustrate how they can help in quality assurance of terminology-based ontologies like GO

Part 3: Show how it can be done right

Part One

Survey of GO

GO is three ontologies

cellular components
molecular functions
biological processes

December 16, 2003:
1372 component terms
7271 function terms
8069 process terms

GO an impressive achievement

used by over 20 genome databases and many other groups in academia and industry

successful methodology, much imitated

now part of OBO (open biological ontologies) consortium

Here I focus on problems / errors

GO here is just an example

Primary aim of GO

not rigorous definition and principled classification

but rather: providing a practically useful framework for keeping track of the biological annotations that are applied to gene products

Each of GO’s ontologies

is organized in a graph-theoretical structure involving two sorts of links or edges:

is-a

(epithelial cell differentiation is-a cell differentiation)

part-of

(axonemal microtubule part-of axoneme)
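A minimal sketch of such a two-edge graph, using only the two example links quoted above. The class and method names are hypothetical and do not reflect GO's actual file format or tooling.

```python
from collections import defaultdict

# Edge kinds taken from the slide; everything else here is illustrative.
IS_A = "is-a"
PART_OF = "part-of"


class TermGraph:
    """Directed graph whose edges are labelled is-a or part-of."""

    def __init__(self):
        # (child term, edge kind) -> set of parent terms
        self._edges = defaultdict(set)

    def add(self, child, kind, parent):
        if kind not in (IS_A, PART_OF):
            raise ValueError(f"unknown edge kind: {kind}")
        self._edges[(child, kind)].add(parent)

    def parents(self, child, kind):
        # .get avoids creating empty entries on lookup
        return sorted(self._edges.get((child, kind), ()))


graph = TermGraph()
graph.add("epithelial cell differentiation", IS_A, "cell differentiation")
graph.add("axonemal microtubule", PART_OF, "axoneme")
```

Keeping the edge kind in the lookup key means an is-a parent can never be returned by mistake when a part-of parent is asked for.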

This graph-theoretic architecture

is designed to help humans, who can use the graphs to locate the features and attributes they are addressing in their work and thus to determine the designated terms for these features and attributes within GO’s ‘controlled vocabulary.’

GO’s three ontologies

When a gene is identified, three important types of questions need to be addressed: Where is it located in the cell? What functions does it have on the molecular level? And to what biological processes do these functions contribute?

GO’s three ontologies

molecular functions

cellular constituents

biological processes

The Cellular Component Ontology (counterpart of anatomy)

consists of terms such as flagellum, chromosome, ferritin, extracellular matrix and virion

Cellular components are physical and measurable entities. They are, in the terminology of philosophical ontology, objects or things (independent continuants). They endure self-identically through time while undergoing changes of various sorts.

Cellular component embraces also the extracellular environment of cells and cells themselves

No organisms

GO does not include terms for specific organisms, not even for single-celled organisms

The Molecular Function Ontology

molecular function = the action characteristic of a gene product.

Actions such as ice nucleation or protein stabilization do not endure but rather occur.

The Molecular Function Ontology

Originally included terms such as anti-coagulant (defined as: ‘a substance that retards or prevents coagulation’) and enzyme (defined as: ‘a substance … that catalyzes’)

These refer neither to functions nor to actions but rather to components.

The Molecular Activity Ontology

Confusion remedied to a degree by policy change of March 2003: ‘All GO molecular function term names [with the exception of the parent term molecular function and of the whole node binding] are to be appended with the word “activity”.’

‘Function’ = ‘Activity’

Thus the term ‘structural molecule,’ which is defined as meaning: ‘the action of a molecule that contributes to structural integrity,’ is amended to ‘structural molecule activity’

Still problems with GO Molecular Function definitions

anti-coagulant activity (defined as: “a substance that retards or prevents coagulation”)

enzyme activity (defined as: “a substance that catalyzes”)

… and there are still problems with Molecular Function terms

GO:0005199:

structural constituent of cell wall

structural constituent of cell wall

Definition: The action of a molecule that contributes to the structural integrity of a cell wall.

confuses actions, which GO includes in its function ontology, with constituents, which GO includes in its cellular component ontology

extracellular matrix structural constituent
puparial glue (sensu Diptera)
structural constituent of bone
structural constituent of chorion (sensu Insecta)
structural constituent of chromatin
structural constituent of cuticle
structural constituent of cytoskeleton
structural constituent of epidermis
structural constituent of eye lens
structural constituent of muscle
structural constituent of myelin sheath
structural constituent of nuclear pore
structural constituent of peritrophic membrane (sensu Insecta)
structural constituent of ribosome
structural constituent of tooth enamel
structural constituent of vitelline membrane (sensu Insecta)

The Biological Process Ontology

biological process: ‘A phenomenon marked by changes that lead to a particular result, mediated by one or more gene products.’

Examples: glycolysis, death, adult walking behavior, response to blue light

Occurrents

Both molecular activity and biological process terms refer to what philosophical ontologists call occurrents

= entities which do not endure through time but rather unfold themselves in successive temporal phases. Occurrents can be segmented into parts along the temporal dimension.

Continuants exist in toto in every instant at which they exist at all.

Molecular functions and biological processes are closely interrelated

E.g. the process anti-apoptosis involves the molecular function apoptosis inhibitor activity.

How can GO express such relations?

Are they a matter of granularity?

‘A biological process is accomplished via one or more ordered assemblies of molecular functions.’

??? Molecular activities = building blocks of biological processes ???

So: Functions are parts of processes

But no:

GO’s three ontologies are separate

No links or edges defined between them

molecular functions

cellular constituents

biological processes

Question:

How are we to understand granularity, if not in terms of parthood?

Molecular functions

renamed ‘activities’, because ‘activity’, unlike ‘process’, connotes agency?

but molecules are not agents

hypothesis: the term ‘function’ was used for the molecular function ontology because the activities in question are functional in relation to the pertinent organism.

Functions

A function is functional

= beneficial to the organism

If an organism-part has a function, this is because the functioning of this organism-part is beneficial to the organism

The function of the heart is to pump blood

Not: the function of the hip is to financially support hip-replacement surgeons

Some processes are functionings

E.g. pumping blood

Two sorts of processes

1. Functionings (realizations of functions = beneficial to the organism)

2. Other processes (e.g. the result of external interventions)

Cf. difference between physiology and pathology

GO not clear about this distinction

transport: The directed movement of substances (such as … ions) into, out of, or within a cell

cell growth and/or maintenance: Any process required for the survival and growth of a cell

Synonym: cell physiology

transport is-a cell growth and/or maintenance

but (GO:0019060) viral intracellular protein transport

is-a transport
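The difficulty can be made concrete with a small sketch. It assumes that is-a is read transitively (the usual reading, though the slides do not state it) and uses only the two links quoted above. The function and variable names are hypothetical.

```python
# Only the two is-a links quoted on the slide; this is not GO's data model.
IS_A_LINKS = {
    "viral intracellular protein transport": ["transport"],
    "transport": ["cell growth and/or maintenance"],
}


def ancestors(term, links):
    """Return every term reachable from `term` by following is-a links."""
    seen = set()
    stack = [term]
    while stack:
        for parent in links.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


inherited = ancestors("viral intracellular protein transport", IS_A_LINKS)
```

Under that reading, `inherited` contains cell growth and/or maintenance, so a viral process ends up classified as part of the cell's own physiology.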

Why do these problems arise?

GO has no clear understanding of the role of temporal relations in organizing an ontology

(thus also no clear understanding of the difference between a function and the activity which is the realization of a function)

GO excludes organisms from its scope (they are of the wrong granularity)

Yet each process or function requires some bearer or bearers which it is the process or function of.

Processes are dependent on their bearers

(Theory of dependence vs. independence part of formal ontology)

(Theory of continuants vs. occurrents part of formal ontology)

Some formal ontology

Components are independent continuants

Functions are dependent continuants

(the function of an object exists continuously in time, just like the object which has the function;

and it exists even when it is not being exercised)

Processes are (dependent) occurrents

More generally:

Continuants can be divided into independent (objects, things, components) and dependent (features, attributes, conditions, functions, roles, qualities …)

All occurrents are dependent entities. Every occurrent is dependent for its existence on one or more continuants. A change is always a change in some continuant object.

Part Two

Principles of Biomedical Ontologies and their use in quality assurance of terminology-based ontologies

Principle of Temporal Coherence

An ontology should rigorously distinguish continuants from occurrents.

(Anatomy is a science of continuants)

Principle of Dependence

If an ontology recognizes a dependent entity then it (or a linked ontology) should recognize also the relevant class of bearers

Part of our aim here is to lay down principles which can support such linkability
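One way to picture the principle as an automatic check is sketched below. The mini-ontology is invented for illustration: the pairing of each dependent entity with a bearer class, and the set of recognized terms, are assumptions and not data taken from GO.

```python
# Hypothetical mini-ontology: each dependent entity names the class of its bearers.
BEARER_OF = {
    "pumping blood": "heart",
    "adult walking behavior": "adult",
}

# Terms that this ontology, or an ontology linked to it, recognizes (assumed).
RECOGNIZED = {"pumping blood", "adult walking behavior", "heart"}


def missing_bearers(bearer_of, recognized):
    """List recognized dependent entities whose bearer class is not recognized."""
    return sorted(
        dependent
        for dependent, bearer in bearer_of.items()
        if dependent in recognized and bearer not in recognized
    )


unsupported = missing_bearers(BEARER_OF, RECOGNIZED)
```

Here `unsupported` flags adult walking behavior, because no class of adults is available to bear it.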

Linking to external ontologies

can also help to link together GO’s own three separate parts

GO’s three ontologies

[Diagram: molecular functions, cellular constituents, biological processes; labelled ‘dependent’ and ‘independent’]

GO’s three ontologies

[Diagram: molecular functions, cellular constituents, organism-level biological processes, cellular processes]

‘part-of’; ‘is dependent on’

[Diagram: molecular functions, molecule complexes, cellular processes, cellular constituents, organism-level biological processes, organisms]

[Diagram: molecule complexes, cellular constituents, organisms; molecular functions, cellular functions, organism-level biological functions; molecular processes, cellular processes, organism-level biological processes; three links labelled ‘functionings’]

GO must be linked with other, neighboring ontologies

GO has: adult walking behavior but not adult

GO has: eye pigmentation but not eye

GO has: response to blue light but not light (or blue)

94% of words used in GO terms are not GO terms

Part of the solution: “Medical FactNet” (NLM, 10am tomorrow)
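A measure of this kind could be computed roughly as follows. The vocabulary is a toy built from the examples above plus one single-word term; the 94% figure was computed by the author over all of GO and is not reproduced here. The function name is hypothetical.

```python
# Toy vocabulary; the real computation would run over every GO term name.
TERMS = [
    "adult walking behavior",
    "eye pigmentation",
    "response to blue light",
    "behavior",
]


def words_not_terms(terms):
    """Return the words used inside term names that are not themselves terms."""
    term_set = set(terms)
    words = {word for term in terms for word in term.split()}
    return sorted(words - term_set)


orphans = words_not_terms(TERMS)
all_words = {word for term in TERMS for word in term.split()}
share = len(orphans) / len(all_words)
```

In this toy case only ‘behavior’ is both a word and a term, so every other word appears in `orphans`.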

GO taking steps in this direction

Linking to a good external ontology of organism types (to solve some of the problems with sensu)

It needs to link further to a good external ontology of anatomy, to solve the location problem

and to a good external ontology of coarse-grained reality, to solve the adult walking behavior problem

Human beings know what ‘walking’ means

Human beings know that adults are older than embryos

GO needs to be linked to ontology of development

and in general to resources for reasoning about time and change

but such linkages are possible

only if GO itself has a coherent formal architecture

Principle of Univocity

univocity: terms should have the same meanings (and thus point to the same referents) on every occasion of use

UMLS-Semantic Network:

‘organization’ = body plan (anatomy)

‘organization’ = social organization

Polysemy of GO’s part-of

– membrane part-of cell, intended to mean “a membrane is a part-of any cell”

– flagellum part-of cell, intended to mean “a flagellum is part-of some cells”

– replication fork part-of nucleoplasm, intended to mean: “a replication fork is part-of the nucleoplasm only during certain times of the cell cycle”

Three meanings of ‘part-of’

‘part-of’ = ‘can be part of’ (flagellum part-of cell)

‘part-of’ = ‘is sometimes part of’ (replication fork part-of the nucleoplasm)

‘part-of’ = ‘is included as a sublist in’
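A univocal alternative would give each reading its own relation. The sketch below does this with an enumeration whose member names are hypothetical. The two example links are the ones given above. The slide gives no example for the third reading, so none is supplied here.

```python
from enum import Enum


class PartOf(Enum):
    """One relation per reading listed on the slide (member names are illustrative)."""

    CAN_BE = "can be part of"
    SOMETIMES = "is sometimes part of"
    SUBLIST = "is included as a sublist in"


LINKS = [
    ("flagellum", PartOf.CAN_BE, "cell"),
    ("replication fork", PartOf.SOMETIMES, "nucleoplasm"),
]


def render(child, relation, parent):
    """Spell out a link so that its intended reading is explicit."""
    return f"{child} {relation.value} {parent}"


statements = [render(*link) for link in LINKS]
```

Because the reading travels with the link, a program no longer has to guess which sense of ‘part-of’ a curator had in mind.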

THE GOAL IS:

not to impose basic principles of classification and definition on biologists

– All the principles presented here should be conceived not as iron requirements but rather as rules of thumb –

deviation from which is often marked by characteristic families of coding errors

example

[GO:0030430] host cell cytoplasm, defined as: The cytoplasm of a host cell

[GO:0018995] host, defined as: Any organism in which another organism, especially a parasite or symbiont, spends part or all of its life cycle and from which it obtains nourishment and/or protection

Why is this an error?

because organisms do not fall within the scope of GO

An organism is not a cellular component, and it is not a molecular function, and not a biological process, either

host cell cytoplasm part-of host

breaks GO’s own granularity constraints

Why univocity?

1. humans are good at disambiguating ambiguous expressions, machines not

2. quality assurance and ontology maintenance

3. GO, SNOMED, etc., are designed to constitute ‘controlled vocabularies’

Quality assurance and ontology maintenance must be automated

As GO increases in size and scope it will “be increasingly difficult to maintain the semantic consistency we desire without software tools that perform consistency checks and controlled updates”.

The addition of each new term will require the curator to understand the entire structure of GO in order to avoid redundancy and to ensure that all appropriate linkages are made with other terms.

The purpose of a ‘controlled vocabulary’

= to ensure that the same terms are used by different research groups with the same meanings

this has implications also for the syntax of GO terms (= the way terms are compounded together out of other terms)

Univocity and syntax

The story of ‘/’

/

GO:0008608 microtubule/kinetochore interaction

=df Physical interaction between microtubules and chromatin via proteins making up the kinetochore complex

/

GO:0001539 ciliary/flagellar motility

=df Locomotion due to movement of cilia or flagella.

/

GO:0045798 negative regulation of chromatin assembly/disassembly

=df Any process that stops, prevents or reduces the rate of chromatin assembly and/or disassembly

/

GO:0000082 G1/S transition of mitotic cell cycle

defined as: Progression from G1 phase to S phase of the standard mitotic cell cycle.

/

GO:0001559 interpretation of nuclear/cytoplasmic to regulate cell growth

=df The process where the size of the nucleus with respect to its cytoplasm signals the cell to grow or stop growing.

/

GO:0015539 hexuronate (glucuronate/galacturonate) porter activity

=df Catalysis of the reaction: hexuronate(out) + cation(out) = hexuronate(in) + cation(in)

Problems with GO’s compositionality

/ (slash) 286

: (semi-colon) 177

, (comma) 1206

and 180
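Counts like these can be gathered with a simple scan over the term names. The sketch below runs on four term names quoted elsewhere in this talk, not on the full GO term list, so its numbers are unrelated to the figures above. It also assumes the count is of terms containing the operator rather than of individual occurrences.

```python
from collections import Counter

# A handful of term names quoted in this talk; not the full GO vocabulary.
TERM_NAMES = [
    "microtubule/kinetochore interaction",
    "ciliary/flagellar motility",
    "cytokinesis, site selection",
    "cell growth and/or maintenance",
]

# Leading space on " and" avoids matching words that merely contain "and".
OPERATORS = ["/", ",", " and"]


def operator_counts(names, operators):
    """Count how many term names contain each operator at least once."""
    counts = Counter()
    for name in names:
        for op in operators:
            if op in name:
                counts[op.strip()] += 1
    return counts


counts = operator_counts(TERM_NAMES, OPERATORS)
```

Note that ‘and/or’ is counted under both the slash and ‘and’, which is itself a small illustration of how these operators overlap.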

comma

cytokinesis, site selection

plurals

biological process

physiological processes

cellular process

cell growth and/or maintenance

specification 39
formation; forming 142
determination; determinacy 56
with 54
from 141
in 51
via 164
complex 563
regulator; regulatory; regulated; regulation 1326
acting on 146
constituting 35
constituent; constitutive 29
dependent 182
sensu 469

Questions regarding operators

How does ‘constituent’ relate to ‘component’?

If A within B then is A part-of B or included-in-the-interior-of B?

Does via mean by means of or along the path of?

How is ‘un-’ related to ‘not’ (how is ‘unlocalized’ related to ‘not localized’)?

‘involved in’

term-forming operator (reflection of GO’s limited resources for expressing relations):

hydrolase activity, acting on acid anhydrides, involved in cellular and subcellular movement

asymmetric protein localization involved in cell fate commitment

cell-cell signaling involved in cell fate commitment

protein secretion involved in cell fate commitment

‘involved in’

hydrolase activity, acting on acid anhydrides, involved in cellular and subcellular movement

This is a term because GO does not have the resources to express ‘is-involved-in’ as a relation between terms

note problems with commas

‘involved in’

hydrolase activity,

acting on acid anhydrides,

involved in cellular and subcellular movement

is-a hydrolase activity, acting on acid anhydrides

‘involved in’

hydrolase activity, acting on acid anhydrides, involved in cellular and subcellular movement is-a hydrolase activity, acting on acid anhydrides

is ok: hydrolase activity, acting on acid anhydrides can but need not be involved in cellular and subcellular movement

‘involved in’

asymmetric protein localization involved in cell fate commitment is-a cell fate commitment

should be a part-of relation

(compare: breathing involved in running is-a running)

‘involved in’

cell-cell signaling involved in cell fate commitment is-a cell fate commitment

ditto: should be a part-of relation

these, though, are good:

asymmetric protein localization involved in cell fate commitment is-a asymmetric protein localization

cell-cell signaling involved in cell fate commitment is-a cell-cell signaling
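The pattern argued for on the preceding slides (an ‘X involved in Y’ term is-a X and should be part-of Y) can be stated as a small rule. This is a sketch of that reading only. The function name is hypothetical, and it does not handle the comma problems noted earlier, such as the trailing comma in the hydrolase example.

```python
def expand_involved_in(term):
    """Split 'X involved in Y' into an is-a link to X and a part-of link to Y."""
    marker = " involved in "
    if marker not in term:
        return []
    activity, process = term.split(marker, 1)
    return [(term, "is-a", activity), (term, "part-of", process)]


links = expand_involved_in(
    "asymmetric protein localization involved in cell fate commitment"
)
```

Expressing ‘involved in’ as a relation in this way removes the need to mint a new compound term for every combination.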

‘involved in’

protein secretion involved in cell fate commitment synonym of protein secretion

are there instances of protein secretion not involved in cell fate commitment?

… Problems with GO’s peculiar use of ‘synonym’

Consequences of inconsistent and/or indeterminate use of operators

29.42% of the distinct terms within GO contain one or more polysemous operators, but these terms receive only 13.96% of the annotations present within GO

Hypothesis: This lower percentage of annotations reflects the fact that poorly defined operators are not well understood by annotators, who thus avoid the corresponding terms

Principle of Compositionality

The meanings of compound terms should be determined

1. by the meanings of constituent terms

together with

2. the rules governing the syntactic operators

Principle of Objectivity

which classes exist is not a function of our biological knowledge.

(Terms such as ‘unclassified’ or ‘unknown ligand’ or ‘not otherwise classified as peptides’ do not designate biological natural kinds.)

GO:0008372 cellular component unknown

cellular component unknown is-a cellular component

unlocalized is-a cellular component

Holliday junction helicase complex is-a unlocalized

GO’s excuse

‘unlocalized’ is used as a placeholder only

but automatic information retrieval systems cannot distinguish it from other, genuine class names

formal tools exist which can deal with the addition of knowledge into a classification system without the need to create fake classes

(Theory of Granular Partitions)

Principle of Positivity

Class names should be positive. Logical complements of classes are not themselves classes.

(Terms such as ‘non-mammal’ or ‘non-membrane’ or ‘invertebrate’ do not designate natural kinds.)

Terms such as

‘Veterinary proprietary drug AND/OR biological’ *

do not designate natural kinds. (Which biological classes exist is not a matter of logic.)

*has 2532 children in SNOMED-CT

Principle of Explicitness

if a link between two classes holds only under certain specific restrictions, then this restriction should be made explicit in the statement of the corresponding link-axiom

cf. GO’s sensu

GO can in practice be used only by trained biologists (with know-how)

whether a GO term truly stands in the is-a relation depends e.g. on the type of organism involved

glycosome is part-of cytoplasm

only for Kinetoplastidae

Computers have no counterpart of such context-dependent know-how
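Making the restriction explicit could be as simple as attaching it to the link itself. The sketch below assumes a hypothetical `only_for` field. It is an illustration of the principle, not a description of how GO or any existing tool records sensu restrictions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Link:
    child: str
    relation: str
    parent: str
    only_for: Optional[str] = None  # hypothetical field: organism-type restriction

    def holds_for(self, taxon):
        """A link with no restriction holds everywhere; otherwise only for its taxon."""
        return self.only_for is None or self.only_for == taxon


glycosome = Link("glycosome", "part-of", "cytoplasm", only_for="Kinetoplastidae")
```

With the restriction stored on the link, a program can answer the question that otherwise requires a biologist's background knowledge.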

Principle of Single Inheritance

no class in a classificatory hierarchy should have more than one parent on the immediate higher level

Principle of Taxonomic Levels

the terms in a classificatory hierarchy should be divided into predetermined levels (analogous to the levels of kingdom, phylum, class, order, etc., in traditional biology).

depth in GO’s hierarchies not determinate because of multiple inheritance
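Why depth becomes indeterminate can be seen in a toy hierarchy. The node names and links below are invented for illustration. The function returns every path length from a term up to a root, so a term with a single well-defined level yields exactly one value.

```python
def depths(term, parents):
    """Return the set of path lengths from `term` up to a root."""
    ups = parents.get(term, [])
    if not ups:
        return {0}
    return {d + 1 for up in ups for d in depths(up, parents)}


# Hypothetical hierarchy: D has two parents that sit at different depths.
PARENTS = {
    "A": ["root"],
    "B": ["A"],
    "D": ["A", "B"],
}

single = depths("B", PARENTS)    # one parent chain
multiple = depths("D", PARENTS)  # two parents, two different depths
```

For `D` the result has two members, so there is no single level to which the term could be assigned.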

Principle of Partonomic Levels

Terms in a partonomic hierarchy should be divided into predetermined granularity levels (for example: organism, organ, cell, molecule, etc.)

(GO is about to break physiological process into 'cell physiological process' and 'organism physiological process'.)

= take granularity seriously

Principle of Exhaustiveness

the classes on any given level should exhaust the domain of the classificatory hierarchy.

Page 103: VR. Formal Principles for Biomedical Ontologies Barry Smith .

Single Inheritance + Exhaustiveness = JEPD

for: Jointly Exhaustive and Pairwise Disjoint

Exhaustiveness is often difficult to satisfy in the realm of biological phenomena; but its acceptance as an ideal is presupposed by every scientist.

Single inheritance is accepted in all traditional (species-genus) classifications; it is now under threat because multiple inheritance is a computationally useful device (it allows one to avoid certain kinds of combinatory explosion).
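
The JEPD condition can be stated as a simple check. The sketch below makes a simplifying assumption that is not in the slides: each class on a given level is represented extensionally, by the set of its instances, and the domain is a finite set. The name `is_jepd` and the sample data are hypothetical.

```python
from itertools import combinations
from typing import Dict, Set


def is_jepd(domain: Set[str], classes: Dict[str, Set[str]]) -> bool:
    """True if the classes jointly exhaust the domain and are pairwise disjoint."""
    jointly_exhaustive = set().union(*classes.values()) == domain
    pairwise_disjoint = all(
        a.isdisjoint(b) for a, b in combinations(classes.values(), 2)
    )
    return jointly_exhaustive and pairwise_disjoint


# Hypothetical domain of three instances and three candidate levels.
domain = {"x1", "x2", "x3"}
partition = {"P": {"x1"}, "Q": {"x2", "x3"}}        # exhaustive and disjoint
overlapping = {"P": {"x1", "x2"}, "Q": {"x2", "x3"}}  # x2 falls under two classes
incomplete = {"P": {"x1"}, "Q": {"x2"}}             # x3 falls under no class
```

Only `partition` passes the check; `overlapping` fails pairwise disjointness and `incomplete` fails exhaustiveness.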

Page 104: VR. Formal Principles for Biomedical Ontologies Barry Smith .

Problems with multiple inheritance

[Diagram: class A has two parents, B (via is-a1) and C (via is-a2)]

‘is-a’ no longer univocal

Page 105: VR. Formal Principles for Biomedical Ontologies Barry Smith .

GO’s ‘is-a’ is pressed into service to mean a variety of different things

the resulting ambiguities make the rules for correct coding difficult to communicate to human curators in terms of generally intelligible principles

they also serve as obstacles to integration with neighboring ontologies

Page 106: VR. Formal Principles for Biomedical Ontologies Barry Smith .

Problems with multiple inheritance

[Diagram: B and C on the top level; A and E on the middle level; D on the bottom level; links labelled is-a1 and is-a2]

‘sibling’ is no longer determinate

Principle of levels is violated
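
Why 'sibling' stops being determinate can be shown with a short sketch. The hierarchy below is a hypothetical example with invented class names; it does not reproduce the diagram on the slide, and the `siblings` function is an illustrative helper only.

```python
from typing import Dict, Set


def siblings(cls: str, parents: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map each parent of cls to the other classes sharing that parent."""
    return {
        p: {other for other, ps in parents.items() if p in ps and other != cls}
        for p in parents.get(cls, set())
    }


# Hypothetical hierarchy: X has two parents, P and Q.
sibling_example = {
    "X": {"P", "Q"},
    "Y": {"P"},
    "Z": {"Q"},
}
```

For `X` the answer depends on which parent is chosen: `Y` is a sibling relative to `P`, `Z` relative to `Q`. With single inheritance the result would always contain exactly one sibling set.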

Page 109: VR. Formal Principles for Biomedical Ontologies Barry Smith .

A storage vacuole is not a special kind of vacuole

a box used for storage is not a special kind of box

Page 111: VR. Formal Principles for Biomedical Ontologies Barry Smith .

Another term-forming operator

lytic vacuole within a protein storage vacuole

lytic vacuole within a protein storage vacuole is-a protein storage vacuole

time-out within a baseball game is-a baseball game

embryo within a uterus is-a uterus

Page 112: VR. Formal Principles for Biomedical Ontologies Barry Smith .

Problems with Location

is-located-at / is-located-in and similar relations need to be expressed in GO via some combination of ‘is-a’ and ‘part-of’

… is-a unlocalized

is-a site of

… within …

… in …

Page 113: VR. Formal Principles for Biomedical Ontologies Barry Smith .

Problems with location

‘extrinsic to membrane’ part-of ‘membrane’

‘extrinsic to plasma membrane’ part-of ‘plasma membrane’

‘extrinsic to vacuolar membrane’ part-of ‘vacuolar membrane’

Page 114: VR. Formal Principles for Biomedical Ontologies Barry Smith .

Differentiation and Development

[Diagram: cell differentiation shown beneath two parents, development and cellular process]

Page 115: VR. Formal Principles for Biomedical Ontologies Barry Smith .

Cell differentiation is-a development

But according to GO’s own definitions the agent or subject of differentiation is the cell, while the agent or subject of development is the whole organism

(again: GO has problems in keeping track of entities on different levels of granularity)

Page 116: VR. Formal Principles for Biomedical Ontologies Barry Smith .

cell differentiation is-a development

but:

hemocyte differentiation part-of hemocyte development

Page 117: VR. Formal Principles for Biomedical Ontologies Barry Smith .

GO:0007514: garland cell differentiation

Definition: Development of garland cells, a small group of nephrocytes which take up waste materials from the hemolymph by endocytosis.

(Illustrates GO’s problems with definitions)

Page 119: VR. Formal Principles for Biomedical Ontologies Barry Smith .

Part Three

How to do things right

so far only scratched the surface:

sensu

synonyms

GO’s definitions

GO’s ‘logical relationships’

Page 120: VR. Formal Principles for Biomedical Ontologies Barry Smith .

Principles for GO terms

Temporal coherence

Dependence

Univocity

Compositionality

Objectivity

Positivity

Explicitness

Taxonomic Levels

Partonomic Levels

Single Inheritance

Exhaustiveness

Page 121: VR. Formal Principles for Biomedical Ontologies Barry Smith .

Should these principles be satisfied?

Michael Ashburner:

GO’s philosophy from the beginning was ‘just in time’ - that is, we made no great attempt to ‘complete’ the ontologies …. If you try and ‘complete’ an ontology, or worse: try and ‘get it right,’ then you will fail …

Page 122: VR. Formal Principles for Biomedical Ontologies Barry Smith .

Can these principles be satisfied?

Compare GO with Foundational Model of Anatomy (FMA)

Page 123: VR. Formal Principles for Biomedical Ontologies Barry Smith .

Principle            GO    FMA

Temporal coherence   No    N/A

Dependence           No    N/A

Univocity            No    Yes

Compositionality     No    Yes

Objectivity          No    Yes

Positivity           No    Yes

Explicitness         No    N/A

Taxonomic Levels     No    Yes

Partonomic Levels    No    Yes

Single Inheritance   No    Yes

Exhaustiveness       No    No

Page 124: VR. Formal Principles for Biomedical Ontologies Barry Smith .

The End

Page 125: VR. Formal Principles for Biomedical Ontologies Barry Smith .

Is GO an ontology?

GO is a controlled vocabulary

= (ramshackle) syntactic regimentation

but because is-a and part-of are not given uniform readings, this does NOT mean the sort of semantic regimentation which would amount to an ontology in the proper sense of the word

Page 126: VR. Formal Principles for Biomedical Ontologies Barry Smith .

rules for definitions

intelligibility: the terms used in a definition should be simpler (more intelligible) than the term to be defined

definitions: do not confuse definitions with the communication of new knowledge

Page 127: VR. Formal Principles for Biomedical Ontologies Barry Smith .

Principle of Substitutability

in all so-called extensional contexts a defined term should be substitutable by its definition in such a way that the result is both grammatically correct and has the same truth-value as the sentence with which we begin

GO:0015070: toxin activity

Definition: Acts as to cause injury to other living organisms.

Page 128: VR. Formal Principles for Biomedical Ontologies Barry Smith .

substitutability

There is toxin activity here

There is acts as to cause injury to other living organisms here

Page 129: VR. Formal Principles for Biomedical Ontologies Barry Smith .

Defining is-a

A is-a B = every instance of A is an instance of B

A is-a B = A and B are natural kinds and every instance of A is an instance of B

A is-a B = A and B are natural kinds and every instance of A is as a matter of necessity an instance of B
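
The first two readings can be sketched in code, under the assumption that each class is represented by a finite set of instance identifiers and that the natural kinds are simply listed. The function names, the instance identifiers and the choice of which classes count as natural kinds are all illustrative assumptions. The third reading adds a modal condition (necessity) that a single snapshot of instances cannot establish, so it is not modelled here.

```python
from typing import Dict, Set


def is_a(a: str, b: str, instances: Dict[str, Set[str]]) -> bool:
    """First reading: every instance of a is an instance of b."""
    return instances[a] <= instances[b]


def is_a_natural_kinds(a: str, b: str, instances: Dict[str, Set[str]],
                       natural_kinds: Set[str]) -> bool:
    """Second reading: additionally require that a and b are natural kinds."""
    return a in natural_kinds and b in natural_kinds and is_a(a, b, instances)


# Hypothetical data echoing the earlier box example.
instances = {
    "box": {"b1", "b2"},
    "box used for storage": {"b1"},
}
natural_kinds = {"box"}  # assumption: 'box used for storage' is not a natural kind
```

On this data the first reading accepts 'box used for storage is-a box', because the instance sets are nested, while the second reading rejects it.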

Page 130: VR. Formal Principles for Biomedical Ontologies Barry Smith .

Solutions to these problems

‘part_of’ should mean: part_of