Principles for Building Biomedical Ontologies

Post on 11-Jan-2016


Talk delivered by Jennifer Clark, GO Editorial Office

[Figure with is_a and part_of relations (Clark et al., 2005)]

• formal ontology

• information science

• special reference to the bio-medical domain.

Slides and content by:

Barry Smith

http://ifomis.de

Rama Balakrishnan, David Hill, Jennifer Clark.

http://www.geneontology.org

The Rules

1. Univocity

2. Positivity

3. Objectivity

4. Single Inheritance

5. Intelligibility of Definitions

6. Basis in Reality

classes: GO terms, types, kinds, universals

instances: annotated gene product attributes, tokens, individuals, particulars

1 Univocity:

Terms should have the same meanings on every occasion of use

[Figure: three images, each labelled "= bud initiation"]

The Challenge of Univocity: People use the same words to describe different things

Bud initiation? How is a computer to distinguish?

= bud initiation sensu Metazoa

= bud initiation sensu Saccharomyces

= bud initiation sensu Viridiplantae

Univocity: GO adds “sensu” descriptors to discriminate among organisms

The Challenge of Univocity: People call the same thing by different names

Tactile sense, Taction, Tactition: ?

Tactile sense, Taction, Tactition = perception of touch ; GO:0050975

Univocity: GO uses 1 term and many characterized synonyms
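One way to picture "1 term and many synonyms" is a lookup table that sends every name to a single identifier. The sketch below is only an illustration: the term and its synonyms come from the slide, while the dictionary layout and the `build_index` and `resolve` helpers are hypothetical and are not GO's actual data model.

```python
# Hypothetical in-memory term store: one primary name, many synonyms.
TERMS = {
    "GO:0050975": {
        "name": "perception of touch",
        "synonyms": ["tactile sense", "taction", "tactition"],
    },
}


def build_index(terms):
    """Map every name and synonym (case-insensitive) to its term ID."""
    index = {}
    for term_id, record in terms.items():
        for label in [record["name"], *record["synonyms"]]:
            index[label.lower()] = term_id
    return index


def resolve(label, index):
    """Return the term ID for a label, or None if the label is unknown."""
    return index.get(label.strip().lower())


INDEX = build_index(TERMS)
print(resolve("Taction", INDEX))  # GO:0050975
```

Whatever word an annotator types, the computer sees the same identifier, which is the practical payoff of univocity.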

Univocity in part_of relation

‘is at times part of’: antlers part_of red deer

‘necessarily is_part’: Seed dormancy part_of seed development

‘necessarily has_part’: Plant embryo part_of seed

2 Positivity:

The complements of classes are not themselves classes.

[Figure sequence]

Vertebrates

Vertebrates | non-vertebrates

Vertebrates | non-vertebrates | Set of all things

Vertebrates | Set of all organisms

Vertebrates | Invertebrates | Set of all organisms

Image sources:

http://www.cucco.org/CatPictures/Cat%20Nap.jpg

http://www.digibarn.com/collections/systems/canon-cat/Image53.jpg

http://www.artalyst.com/files/userimages/user70/21088058-O.preview.jpg

membrane-bound organelle ; GO:0043227

vs. ‘not a membrane-bound organelle’

Non-membrane-bound organelle

A centrosome is not a membrane-bound organelle, but it may still be considered an organelle.

3 Objectivity:

The existence of classes is not dependent on our biological knowledge.

‘unlocalised’, ‘unknown’ and ‘unclassified’ do not designate biological natural kinds.

http://news.bbc.co.uk/1/hi/sci/tech/4501152.stm

Task: Annotate molecular function of 10-4, a gene from Drosophila melanogaster

[Figure, two columns]

Molecular function ontology: molecular function unknown is_a molecular function

Annotations: 10-4


4 Single Inheritance:

No class in a classification hierarchy should have more than one is_a parent on the immediate higher level

[Figure with is_a and part_of relations (Clark et al., 2005)]

Rule of Single Inheritance

no diamonds:

[Diagram: A with two is_a parents, B (via is_a1) and C (via is_a2)]

Problems with multiple inheritance

[Same diagram: B and C above A, linked by is_a1 and is_a2]

‘is_a’ no longer univocal

(univocal: having only one meaning)

Is_a diamond in GO Process

behavior

locomotory behavior | larval behavior

larval locomotory behavior

(all links: is_a)

[Labels on the two branches: "behavior of a thing", "descriptive behavior"]

[Final slide: the two parent links of larval locomotory behavior are labelled is_a1 and is_a2]
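A diamond of this kind can be found mechanically. The sketch below is a minimal illustration, not a GO tool: the edge list is copied from the slide, and the function names are hypothetical. It first lists classes with more than one is_a parent, then reports where two of those parents meet again higher up.

```python
# is_a edges as child -> list of parents, copied from the slide.
IS_A = {
    "locomotory behavior": ["behavior"],
    "larval behavior": ["behavior"],
    "larval locomotory behavior": ["locomotory behavior", "larval behavior"],
}


def multiple_inheritance(is_a):
    """Return {child: parents} for every class with more than one is_a parent."""
    return {child: parents for child, parents in is_a.items() if len(parents) > 1}


def ancestors(term, is_a):
    """Collect all classes reachable from term by following is_a upward."""
    seen = set()
    stack = list(is_a.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a.get(parent, []))
    return seen


def diamonds(is_a):
    """Return (child, shared ancestor) pairs where two parents converge again."""
    found = []
    for child, parents in multiple_inheritance(is_a).items():
        for i, first in enumerate(parents):
            for second in parents[i + 1:]:
                shared = (ancestors(first, is_a) | {first}) & (
                    ancestors(second, is_a) | {second}
                )
                found.extend((child, top) for top in sorted(shared))
    return found


print(diamonds(IS_A))  # [('larval locomotory behavior', 'behavior')]
```

Detecting the diamond does not say which parent link to drop; that remains a curation decision.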

5 Intelligibility of Definitions:

The terms used in a definition should be simpler than the term to be defined

cellular process

cell differentiation

cell fate specification | cell development

[Legend: is_a, part_of]

cell differentiation

osteoblast differentiation | neuron differentiation | keratinocyte differentiation | adipocyte differentiation | garland cell differentiation

‘X cell differentiation’

(all links: is_a)

Essence = Genus + Differentiae

Genus: differentiation

Differentiae: a neuron (or x cell)

X cell differentiation

Differentiation of an x cell.

X cell differentiation

The process whereby a relatively unspecialized cell acquires specialized features of an x cell. [List characteristics of x cell.]
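The genus-plus-differentia pattern lends itself to a template. As a rough sketch, the function below fills the slide's wording with a cell type; the function name, the article handling and the optional `characteristics` argument are assumptions made for illustration.

```python
def differentiation_definition(cell_type, characteristics=None):
    """Build a genus-plus-differentia definition for '<cell_type> differentiation'."""
    # Pick "a" or "an" from the first letter; a simplification for illustration.
    article = "an" if cell_type[0].lower() in "aeiou" else "a"
    text = (
        "The process whereby a relatively unspecialized cell "
        f"acquires specialized features of {article} {cell_type}."
    )
    if characteristics:
        # Differentia details, e.g. text taken from a cell type ontology.
        text += " " + characteristics
    return text


print(differentiation_definition("neuron"))
```

Generating every 'X cell differentiation' definition from one template keeps the wording consistent and leaves the cell-specific detail to whatever source describes the cell type.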

Process ontology                  Cell Ontology

cone cell fate commitment         retinal_cone_cell

keratinocyte differentiation      keratinocyte

adipocyte differentiation         fat_cell

dendritic cell activation         dendritic_cell

[Term]
id: GO:0030182
name: neuron differentiation
namespace: biological_process
def: "The process whereby a relatively unspecialized cell acquires specialized features of a neuron." [GO:mah]
is_a: GO:0030154 ! cell differentiation
relationship: part_of GO:0048699 ! neurogenesis

[Term]
id: CL:0000540
name: neuron
def: "The basic cellular unit of nervous tissue. Each neuron consists of a body\, an axon\, and dendrites. Their purpose is to receive\, conduct\, and transmit impulses in the nervous system." [MESH:A.08.663]
xref_analog: FBbt:00005106
xref_analog: FBbt:00005146
is_a: CL:0000393 ! electrically responsive cell
is_a: CL:0000404 ! electrically signaling cell
relationship: develops_from CL:0000031 ! neuroblast

[Term]
id: GO:0030182
name: neuron differentiation
namespace: biological_process
def: "The process whereby a relatively unspecialized cell acquires specialized features of a neuron. The basic cellular unit of nervous tissue. Each neuron consists of a body\, an axon\, and dendrites. Their purpose is to receive\, conduct\, and transmit impulses in the nervous system." [MESH:A.08.663, GO:mah]
is_a: GO:0030154 ! cell differentiation
intersection_of: is_a GO:0030154 ! cell differentiation
intersection_of: has_participant CL:0000540 ! neuron
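Stanzas like these are simple tag-value records, so they are easy to read by machine. The following is a minimal sketch of a reader for a single stanza, assuming one tag per line and a trailing "! comment"; it is not a complete OBO parser, the `def:` line is left out to keep the quoting rules out of scope, and `parse_stanza` is a hypothetical helper.

```python
from collections import defaultdict

# One stanza from the slides, with the def: line omitted for simplicity.
STANZA = """\
[Term]
id: GO:0030182
name: neuron differentiation
namespace: biological_process
is_a: GO:0030154 ! cell differentiation
relationship: part_of GO:0048699 ! neurogenesis
"""


def parse_stanza(text):
    """Parse one stanza into {tag: [values]}, dropping '! comment' tails."""
    tags = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("["):
            continue  # skip blank lines and the [Term] header
        tag, _, value = line.partition(": ")
        value = value.split(" ! ")[0].strip()
        tags[tag].append(value)
    return dict(tags)


term = parse_stanza(STANZA)
print(term["id"], term["is_a"], term["relationship"])
```

Values are kept in lists because tags such as `is_a` and `xref_analog` can occur more than once in a stanza.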

Other Ontologies that can be aligned with GO

Chemical ontologies: 3,4-dihydroxy-2-butanone-4-phosphate synthase activity

Anatomy ontologies: metanephros development

But Eventually…

Building Ontology

Improve

Collaborate and Learn

6 Basis in Reality:

When building or maintaining an ontology, always think carefully about how classes relate to instances in reality

Superman: strength, flight, x-ray vision, leaps over tall buildings in a single bound

Catwoman: strength, speed, agility and ultra-keen senses of a cat.

http://www.uncleodiescollectibles.com/doesnotcompute/2004-10-11/Actor%20Christoper%20Reeve.jpg

http://home.austarnet.com.au/davekimble/catwoman.jpg

cartoon character super power ontology

Ontology (all links: is_a)

super senses: x-ray vision, cat senses

super physical powers: super leaping, super strength

Annotations (first slide): Catwoman, Catwoman, Superman, Superman

Annotations (next slide): Catwoman’s cat senses, Catwoman’s super strength, Superman’s X-ray vision, Superman’s super leaping

[Final slide: the same four annotations, shown against an ontology that contains only super senses and super physical powers]

molecular function

binding

tetrapyrrole binding: chlorophyll binding, heme binding

cofactor binding: coenzyme binding, quinone binding

(all links: is_a)

Annotations (first slide): PSBI, PSBI

Annotations (next slide): PSBI’s quinone binding function, PSBI’s chlorophyll binding function

The Rules

1. Univocity: Terms should have the same meanings on every occasion of use

2. Positivity: Terms such as ‘non-mammal’ or ‘non-membrane’ do not designate genuine classes.

3. Objectivity: Terms such as ‘unknown’ or ‘unclassified’ or ‘unlocalized’ do not designate biological natural kinds.

4. Single Inheritance: No class in a classification hierarchy should have more than one is_a parent on the immediate higher level

5. Intelligibility of Definitions: The terms used in a definition should be simpler (more intelligible) than the term to be defined

6. Basis in Reality: When building or maintaining an ontology, always think carefully about how classes relate to instances in reality
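Rules 2 and 3 name concrete word patterns, so a simple check can point a curator at candidate violations. The sketch below is an assumption-laden illustration: the pattern list is taken from the examples in the rules and is not exhaustive, `flag_suspect_terms` is a hypothetical helper, and a match means "review this name", not "this term is wrong".

```python
import re

# Patterns named in rules 2 and 3; illustrative, not exhaustive.
SUSPECT = re.compile(
    r"\b(non-\w+|not|unknown|unclassified|unlocali[sz]ed)\b", re.IGNORECASE
)


def flag_suspect_terms(names):
    """Return {name: matched word} for term names that need curator review."""
    flagged = {}
    for name in names:
        match = SUSPECT.search(name)
        if match:
            flagged[name] = match.group(0)
    return flagged


names = [
    "membrane-bound organelle",
    "non-membrane bound organelle",
    "molecular function unknown",
    "perception of touch",
]
print(flag_suspect_terms(names))
```

The word-boundary anchors keep names such as "notochord" from matching on "not".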

END

spare slides follow.

How to define A is_a B

A is_a B =def.

1. A and B are names of universals (natural kinds, types) in reality

2. all instances of A are as a matter of biological science also instances of B
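Condition 2 can be read as a subset test over instances. The sketch below checks it on sample data only; the instance labels are invented for illustration, and `is_a_holds` is a hypothetical helper. A finite sample can refute an is_a claim but cannot establish that it holds "as a matter of biological science".

```python
def is_a_holds(instances_of_a, instances_of_b):
    """Check condition 2 on sample data: every instance of A is an instance of B."""
    return set(instances_of_a) <= set(instances_of_b)


# Hypothetical instance labels, invented for illustration only.
neurons = {"neuron_1", "neuron_2"}
cells = {"neuron_1", "neuron_2", "keratinocyte_1"}

print(is_a_holds(neurons, cells))  # True
print(is_a_holds(cells, neurons))  # False
```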

True path violation: What is it?

[Diagram: nucleus, chromosome, Mitochondrial chromosome. Legend: Is_a relationship, Part_of relationship]

True path violation: What is it?

[Diagram: nucleus, chromosome, Nuclear chromosome, Mitochondrial chromosome. Legend: Is_a relationships, Part_of relationship]
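The problem can be made concrete by propagating part_of down is_a links. The sketch below rests on assumptions, because the transcript preserves only the node names and the legend: it assumes the first diagram asserts chromosome part_of nucleus and mitochondrial chromosome is_a chromosome, and that the second diagram moves part_of nucleus onto nuclear chromosome. The function name is hypothetical, and part_of transitivity is not handled.

```python
def inferred_part_of(is_a, part_of):
    """Propagate part_of down is_a: if X is_a Y and Y part_of Z, infer X part_of Z."""
    inferred = set(part_of)
    changed = True
    while changed:
        changed = False
        for child, parent in is_a:
            # Snapshot the wholes of the parent before adding new pairs.
            for whole in [w for p, w in inferred if p == parent]:
                if (child, whole) not in inferred:
                    inferred.add((child, whole))
                    changed = True
    return inferred


# First diagram (edges assumed from the legend).
is_a_v1 = {("mitochondrial chromosome", "chromosome")}
part_of_v1 = {("chromosome", "nucleus")}

# Second diagram (edges assumed): only nuclear chromosome is part_of nucleus.
is_a_v2 = {
    ("nuclear chromosome", "chromosome"),
    ("mitochondrial chromosome", "chromosome"),
}
part_of_v2 = {("nuclear chromosome", "nucleus")}

bad = ("mitochondrial chromosome", "nucleus")
print(bad in inferred_part_of(is_a_v1, part_of_v1))  # True
print(bad in inferred_part_of(is_a_v2, part_of_v2))  # False
```

Under these assumptions the first arrangement implies that a mitochondrial chromosome is part of a nucleus, which is false; the second arrangement no longer yields that inference.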

The Importance of synonyms for utility: How do we represent the function of tRNA?

Molecular_function

Triplet_codon amino acid adaptor activity

GO Definition: Mediates the insertion of an amino acid at the correct point in the sequence of a nascent polypeptide chain during protein synthesis.

Synonym: tRNA

Main obstacle to integration

Current ontologies do not deal well with Time and Space and Instances (particulars)

Our definitions should link the terms in the ontology to instances in spatio-temporal reality

7 Distinguish Universals and Instances

Don’t forget instances when defining relations

part_of as a relation between classes versus part_of as a relation between instances

nucleus part_of cell (classes)

your heart part_of you (instances)
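One common way to connect the two levels is to define the class-level relation in terms of the instance-level one: A part_of B holds when every instance of A is part of some instance of B. The slide does not spell out this reading, so treat the sketch below as one possible interpretation; the instance labels and the `class_part_of` helper are invented for illustration.

```python
def class_part_of(a_class, b_class, instance_of, instance_part_of):
    """Class-level part_of, read as: every instance of A is part of some instance of B."""
    a_instances = [x for x, cls in instance_of.items() if cls == a_class]
    b_instances = {x for x, cls in instance_of.items() if cls == b_class}
    # Note: all() is vacuously True when A has no recorded instances.
    return all(
        any((a, b) in instance_part_of for b in b_instances)
        for a in a_instances
    )


# Hypothetical instances and instance-level part_of facts.
instance_of = {
    "nucleus_1": "nucleus",
    "nucleus_2": "nucleus",
    "cell_1": "cell",
    "cell_2": "cell",
}
instance_part_of = {("nucleus_1", "cell_1"), ("nucleus_2", "cell_2")}

print(class_part_of("nucleus", "cell", instance_of, instance_part_of))  # True
print(class_part_of("cell", "nucleus", instance_of, instance_part_of))  # False
```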
