Species and Classification in Biology Barry Smith

Post on 08-Jan-2018


Species and Classification in Biology

Barry Smith
http://ifomis.org

[Figure: DNA, scale 10^-9 m]

[Figure: levels of granularity from DNA through Protein, Organelle, Cell, Tissue and Organ to Organism; scale markers 10^-9 m, 10^-5 m, 10^-1 m]

New golden age of classification*

~ 30 million species
30,000 genes in human
200,000 proteins
100s of cell types
100,000s of disease types
1,000,000s of biochemical pathways (including disease pathways)

*… legacy of Human Genome Project


FUNCTIONAL GENOMICS

proteomics, reactomics, metabonomics,

phenomics, behaviouromics,

toxicopharmacogenomics…


The incompatibilities between different scientific cultures and terminologies

immunology
genetics
cell biology

have resurrected the problem of the unity of science in a new guise:

The logical positivist solution to this problem addressed a world in which sciences are associated with printed texts.

What happens when sciences are associated with databases?

… when each (chemical, pathological, immunological, toxicological) information system uses its own classifications

how can we overcome the incompatibilities which become apparent when data from distinct sources are combined?


Answer:

“Ontology”

= building software artefacts (standardized classification systems / controlled vocabularies)

so that data from one source is expressed in a language which makes it compatible with data from every other source
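The idea of a controlled vocabulary can be sketched in a few lines. This is only an illustration under stated assumptions: the term identifiers (T001, T002), the synonym lists and the two toy "databases" are invented for the example and do not come from any real ontology.

```python
# Hypothetical controlled vocabulary: each preferred term has an identifier
# and the local synonyms used by individual information systems.
CONTROLLED_VOCABULARY = {
    "T001": {"label": "myocardial infarction",
             "synonyms": {"heart attack", "mi", "myocardial infarct"}},
    "T002": {"label": "hepatocyte",
             "synonyms": {"liver cell", "hepatic cell"}},
}


def build_lookup(vocabulary):
    """Map every label and synonym (lower-cased) to its term identifier."""
    lookup = {}
    for term_id, entry in vocabulary.items():
        lookup[entry["label"].lower()] = term_id
        for synonym in entry["synonyms"]:
            lookup[synonym.lower()] = term_id
    return lookup


def normalize(records, lookup):
    """Replace each record's local term with the shared identifier (None if unknown)."""
    return [(record_id, lookup.get(term.strip().lower()))
            for record_id, term in records]


LOOKUP = build_lookup(CONTROLLED_VOCABULARY)

# Two invented sources that name the same types differently.
pathology_db = [("p1", "Heart attack"), ("p2", "Liver cell")]
immunology_db = [("i1", "MI"), ("i2", "hepatocyte"), ("i3", "unmapped term")]

merged = normalize(pathology_db, LOOKUP) + normalize(immunology_db, LOOKUP)
```

Once both sources are expressed in the shared identifiers, records p1 and i1 can be recognized as being about the same type even though the local terms differ; terms with no mapping stay visible as None rather than being silently merged.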

Google hits (in millions) 25.4.06

ontology                          52.4
ontology + philosophy              2.7
ontology + information science     6.0
ontology + database                7.8

A Linnaean Species Hierarchy

(Small) Disease Hierarchy

Combining hierarchies

Organisms   Diseases

via Dependence Relations

Organisms   Diseases

A Window on Reality

Organisms

Diseases

How to understand species (aka types, universals, kinds)

Species are something like invariants in reality which can be studied by science

Species have instances: this mouse, this cell, this cell membrane ...


Entity =def

anything which exists, including things and processes, functions and qualities, beliefs and actions, documents and software


Domain =def

a portion of reality that forms the subject-matter of a single science or technology or mode of study;

proteomics
radiology
viral infections in mouse

Representation =def

an image, idea, map, picture, name or description ... of some entity or entities.

Analogue representations

Representational units =def

terms, icons, photographs, identifiers ... which refer, or are intended to refer, to entities


Composite representation =def

a representation (1) built out of representational units which (2) form a structure that mirrors, or is intended to mirror, the entities in some domain

The Periodic Table

Ontologies are here

Ontologies are representational artifacts


What do ontologies represent?

A   515287   DC3300 Dust Collector Fan
B   521683   Gilmer Belt
C   521682   Motor Drive Belt

instances

types


Two kinds of composite representational artifacts

Databases, inventories: represent what is particular in reality = instances

Ontologies, terminologies, catalogs: represent what is general in reality = types

What do ontologies represent?

Ontologies do not represent concepts in people’s heads

Ontology is a tool of science

Scientists do not describe the concepts in scientists’ heads

They describe the types in reality, as a step towards finding ways to reason about (and treat) instances of these types


The biologist has a cognitive representation which involves theoretical knowledge derived from textbooks

An ontology is like a scientific text; it is a representation of types in reality

Two kinds of composite representational artifacts

Databases represent instances
Ontologies represent types

Instances stand in similarity relations

Frank and Bill are similar as humans, mammals, animals, etc.

Human, mammal and animal are types at different levels of granularity

[Figure: a tree of types (substance, organism, animal, mammal, cat, siamese, frog) with instances beneath]

science needs to find uniform ways of representing types

ontology =def a representational artifact whose representational units (which may be drawn from a natural or from some formalized language) are intended to represent
1. types in reality
2. those relations between these types which obtain universally (= for all instances)

lung is_a anatomical structure
lobe of lung part_of lung

is_a

A is_a B =def

For all x, if x instance_of A then x instance_of B

cell division is_a biological process
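The definition of is_a quantifies over instances, so it can be tested (though never proven) against a table of recorded instances. The following is a minimal sketch; the instance names such as division_17 are hypothetical and stand in for particular processes.

```python
# Hypothetical instance data: which types each particular instantiates.
INSTANCE_OF = {
    "division_17": {"cell division", "biological process"},
    "division_18": {"cell division", "biological process"},
    "digestion_3": {"biological process"},
}


def instances(type_name, instance_of=INSTANCE_OF):
    """All recorded particulars that instantiate the given type."""
    return {x for x, types in instance_of.items() if type_name in types}


def is_a(a, b, instance_of=INSTANCE_OF):
    """A is_a B: for all x, if x instance_of A then x instance_of B."""
    return all(b in types for types in instance_of.values() if a in types)
```

With this data, is_a("cell division", "biological process") holds, while the converse fails because digestion_3 is a biological process that is not a cell division. A check over recorded instances can refute an is_a claim, but it cannot by itself establish one.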


Entities

universals (species, types, taxa, …)

particulars (individuals, tokens, instances)

Canonical instances within the realm of individuals

= those individuals which
1. instantiate universals (entering into biological laws)
2. are prototypical

Canonical Anatomy: no Siamese twins, no six-fingered giants, no amputation stumps, …

Entities

[Figure: universals; instances; junk]

example of junk particulars: desk-mountain

Entities

[Figure: the particular Jane linked by inst to the universal human]

Ontologies are More than Just Taxonomies

The Gene Ontology

7 million Google hits

a cross-species controlled vocabulary for annotations of genes and gene products

deeper than Darwinianism


When a gene is identified

three important types of questions need to be addressed:

1. Where is it located in the cell?
2. What functions does it have on the molecular level?
3. To what biological processes do these functions contribute?

GO has three ontologies

molecular functions

cellular components

biological processes

GO astonishingly influential

used by all major species genome projects
used by all major pharmacological research groups
used by all major bioinformatics research groups

GO part of the Open Biological Ontologies consortium

Fungal Ontology
Plant Ontology
Yeast Ontology
Disease Ontology
Mouse Anatomy Ontology
Cell Ontology
Sequence Ontology
Relations Ontology

Each of GO’s ontologies

is organized in a graph-theoretical structure involving two sorts of links or edges:

is-a (= is a subtype of) (copulation is-a biological process)

part-of (cell wall part-of cell)
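A graph with two sorts of edges can be sketched as a list of labelled triples. This is an illustration only, not GO's actual data format: the two edges quoted on the slide are used as given, and the third edge (cell is-a cellular component) is an assumption added so that the traversal has something to follow.

```python
from collections import deque

# Toy edge list of (source, relation, target) triples.
EDGES = [
    ("copulation", "is-a", "biological process"),  # from the slide
    ("cell wall", "part-of", "cell"),              # from the slide
    ("cell", "is-a", "cellular component"),        # illustrative assumption
]


def parents(term, relation=None):
    """Direct (relation, target) links leaving a term, optionally filtered by relation."""
    return [(rel, target) for source, rel, target in EDGES
            if source == term and (relation is None or rel == relation)]


def ancestors(term, relation=None):
    """All terms reachable by following edges upward, in breadth-first order."""
    seen, queue = [], deque([term])
    while queue:
        current = queue.popleft()
        for _, target in parents(current, relation):
            if target not in seen:
                seen.append(target)
                queue.append(target)
    return seen
```

Keeping the relation as an explicit label matters: following every edge from "cell wall" reaches "cell" and "cellular component", whereas following only is-a edges from "cell wall" reaches nothing, because a cell wall is part of a cell and not a subtype of one.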


The Gene Ontology

a ‘controlled vocabulary’
designed to standardize annotation of genes and gene products
used by over 20 genome databases and many other groups in academia and industry
and methodology much imitated

The Methodology of Annotations

Scientific curators use experimental observations reported in the biomedical literature to link gene products with GO terms in annotations.

The gene annotations taken together yield a slowly growing computer-interpretable map of biological reality.

The process of annotating literature also leads to improvements and extensions of the ontology, which institutes a virtuous cycle of improvement in the quality and reach of future annotations and of the ontology itself.

The Gene Ontology as Cartoon

cellular components     1372 component terms
molecular functions     7271 function terms
biological processes    8069 process terms

The Cellular Component Ontology (counterpart of anatomy)

membrane
nucleus

The Molecular Function Ontology

protein stabilization

The Molecular Function ontology is (roughly) an ontology of actions on the molecular level of granularity


Biological Process Ontology

death

An ontology of occurrents on the level of granularity of cells, organs and whole organisms

GO serves here as an example

a. of the sorts of problems confronting life science data integration

b. of the degree to which formal methods are relevant to the solution of these problems

Each of GO’s ontologies

is organized in a graph-theoretical data structure involving two sorts of links or edges:

is-a (= is a subtype of) (copulation is-a biological process)

part-of (cell wall part-of cell)

Linnaeus


Entities

universals (kinds, types, taxa, …)

particulars (individuals, tokens, instances …)

Axiom: Nothing is both a universal and a particular


Entities

universals*

*natural, biological, kinds


Entities

universals

instances

universals are natural kinds

Instances are natural exemplars of natural kinds (problem of non-standard instances)

Not all individuals are instances of universals

Entities

[Figure: universals; instances; penumbra of borderline cases]

Entities

[Figure: universals; instances; junk]

example of junk: beachball-desk

Primitive relations: inst and part

inst(Jane, human being)
part(Jane’s heart, Jane’s body)

A universal is anything that is instantiated
An instance is anything (any individual) that instantiates some universal


[Figure: A is_a B, where B is marked genus(B), A is marked species(A), and instances lie below]

is-a

D3* e is_a f =def universal(e) & universal(f) & ∀x (inst(x, e) → inst(x, f)).

genus(A) =def universal(A) & ∃B (B is_a A & B ≠ A)

species(A) =def universal(A) & ∃B (A is_a B & B ≠ A)
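The definitions of genus and species can be tried out on a small taxonomy. The sketch below makes one simplifying assumption that differs from D3*: is_a is computed from a hand-written list of asserted (child, parent) pairs, closed under reflexivity and transitivity, rather than from instances. The taxonomy itself is hypothetical.

```python
# Toy is_a assertions as (child, parent) pairs; for illustration only.
IS_A = {("cat", "mammal"), ("mammal", "animal"), ("frog", "animal")}
UNIVERSALS = {u for pair in IS_A for u in pair}


def is_a(a, b):
    """Reflexive, transitive closure of the asserted pairs (assumes no cycles)."""
    if a == b:
        return a in UNIVERSALS
    return any(child == a and is_a(parent, b) for child, parent in IS_A)


def genus(a):
    """genus(A): A is a universal and some B distinct from A satisfies B is_a A."""
    return a in UNIVERSALS and any(is_a(b, a) for b in UNIVERSALS if b != a)


def species(a):
    """species(A): A is a universal and some B distinct from A satisfies A is_a B."""
    return a in UNIVERSALS and any(is_a(a, b) for b in UNIVERSALS if b != a)
```

In this toy taxonomy "mammal" is both a genus (of cat) and a species (of animal), "animal" is a genus only, and "cat" and "frog" are species only, which matches the reading of genus and species as relative rather than absolute ranks.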


To solve the problem of false positives, insist that

A is_a B

holds always, as a matter of scientific law

nearest species

nearestspecies(A, B) =def A is_a B & ∀C ((A is_a C & C is_a B) → (C = A or C = B))

[Figure: B with A directly below it]

Definitions

highest genus

lowest species

instances


Lowest Species and Highest Genus

lowestspecies(A) =def species(A) & not-genus(A)

highestgenus(A) =def genus(A) & not-species(A)

Theorem: universal(A) → (genus(A) or lowestspecies(A))

Axioms

Every universal has at least one instance

Distinct lowest species never share instances

SINGLE INHERITANCE: Every species is the nearest species to exactly one genus
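These three axioms can be read as integrity checks on a taxonomy. The following is a sketch under assumptions: the taxonomy is stored as a child-to-parent mapping, instances are recorded only for lowest species, and all names (Tibbles, Rex, Kermit) are invented.

```python
# Hypothetical toy taxonomy: each child mapped to its one parent,
# plus recorded instances of the lowest species only.
PARENT = {"cat": "mammal", "dog": "mammal", "mammal": "animal", "frog": "animal"}
LEAF_INSTANCES = {"cat": {"Tibbles"}, "dog": {"Rex"}, "frog": {"Kermit"}}

UNIVERSALS = set(PARENT) | set(PARENT.values())


def children(a):
    """Nearest species of A."""
    return {c for c, p in PARENT.items() if p == a}


def lowest_species():
    """Universals that have no children."""
    return {u for u in UNIVERSALS if not children(u)}


def instances(a):
    """Instances of A: its own recorded instances plus those of its descendants."""
    found = set(LEAF_INSTANCES.get(a, set()))
    for child in children(a):
        found |= instances(child)
    return found


def every_universal_instantiated():
    """Axiom: every universal has at least one instance."""
    return all(instances(u) for u in UNIVERSALS)


def lowest_species_disjoint():
    """Axiom: distinct lowest species never share instances."""
    leaves = sorted(lowest_species())
    return all(not (instances(a) & instances(b))
               for i, a in enumerate(leaves) for b in leaves[i + 1:])


def single_inheritance(edges):
    """True if no child occurs with two different parents among (child, parent) pairs."""
    seen = {}
    return all(seen.setdefault(child, parent) == parent for child, parent in edges)
```

A child-to-parent dictionary enforces single inheritance by construction, so single_inheritance is written over a list of pairs; it rejects, for example, a list in which "bat" is placed under both "mammal" and "bird".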

Axioms governing inst

(genus(A) & inst(x, A)) → ∃B (nearestspecies(B, A) & inst(x, B))
EVERY GENUS HAS AN INSTANTIATED SPECIES

nearestspecies(A, B) → A’s instances are properly included in B’s instances
EACH SPECIES HAS A SMALLER CLASS OF INSTANCES THAN ITS GENUS

Axioms

nearestspecies(B, A) → ∃C (nearestspecies(C, A) & B ≠ C)
EVERY GENUS HAS AT LEAST TWO CHILDREN

(nearestspecies(B, A) & nearestspecies(C, A) & B ≠ C) → not-∃x (inst(x, B) & inst(x, C))
SPECIES OF A COMMON GENUS NEVER SHARE INSTANCES

Theorems

(genus(A) & inst(x, A)) → ∃B (lowestspecies(B) & B is_a A & inst(x, B))
EVERY INSTANCE IS ALSO AN INSTANCE OF SOME LOWEST SPECIES

(genus(A) & lowestspecies(B) & ∃x (inst(x, A) & inst(x, B))) → B is_a A
IF AN INSTANCE OF A LOWEST SPECIES IS AN INSTANCE OF A GENUS THEN THE LOWEST SPECIES IS A CHILD OF THE GENUS

Theorems

(universal(A) & universal(B)) → (A = B or A is_a B or B is_a A or not-∃x (inst(x, A) & inst(x, B)))

DISTINCT UNIVERSALS EITHER STAND IN A PARENT-CHILD RELATIONSHIP OR THEY HAVE NO INSTANCES IN COMMON

Theorems

(A is_a B & A is_a C) → (B = C or B is_a C or C is_a B)

UNIVERSALS WHICH SHARE A CHILD IN COMMON ARE EITHER IDENTICAL OR ONE IS SUBORDINATED TO THE OTHER

Theorems

(genus(A) & genus(B) & ∃x (inst(x, A) & inst(x, B))) → ∃C (C is_a A & C is_a B)

IF TWO GENERA HAVE A COMMON INSTANCE THEN THEY HAVE A COMMON CHILD

Expanding the theory

Sexually reproducing organisms
Organisms in general

To take account of development (child, adult; larva, butterfly)

Biological processes
Biological functions

-- at different levels of granularity

How to understand species (aka types, universals, kinds)

Species are something like invariants in reality which can be studied by science

Species have instances: this mouse, this cell, this cell membrane ...

Universals, Classes, Sets

A class is the extension of a universal


Class =def

a maximal collection of particulars determined by a general term (‘cell’, ‘mouse’, ‘Saarländer’)

the class A = the collection of all particulars x for which ‘x is A’ is true

Universals and Classes vs. Sums

The former are marked by granularity: they divide up the domain into whole units, whose interior parts are traced over. The universal human being is instantiated only by human beings as single, whole units.

A mereological sum is not granular in this sense (molecules are parts of the mereological sum of human beings)

A bad solution

Identify both universals and classes with sets in the mathematical sense

Problem of false positives

adult                                  child
lion in Leipzig                        lion
animal owned by the Emperor            mammal
mammal weighing less than 200 kg       animal
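The problem of false positives can be made concrete. If is_a is read as nothing more than set inclusion, any accidental inclusion counts as a subtype relation. The extensions below are invented for illustration.

```python
# Hypothetical extensions; the individuals are invented.
EXTENSION = {
    "lion": {"Leo", "Elsa", "Simba"},
    "lion in Leipzig": {"Leo"},
    "mammal": {"Leo", "Elsa", "Simba", "Rex"},
    "animal owned by the Emperor": {"Rex", "Simba"},
}


def subset_is_a(a, b, extension=EXTENSION):
    """Naive set-theoretic reading: A is_a B iff every member of A is a member of B."""
    return extension[a] <= extension[b]
```

On this reading "lion in Leipzig" comes out as a subtype of "lion", and "animal owned by the Emperor" as a subtype of "mammal", simply because of which individuals happen to exist. Neither inclusion is a relation between universals that holds as a matter of scientific law.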

Sets in the mathematical sense are marked by granularity

Granularity = each class or set is laid across reality like a grid consisting (1) of a number of slots or pigeonholes each (2) occupied by some member.

Each set is (1) associated with a specific number of slots, each of which (2) must be occupied by some specific member.

A class survives the turnover in its instances: both (1) the number of slots and (2) the individuals occupying these slots may vary with time

But sets are timeless

A set is an abstract structure, existing outside time and space. The set of human beings existing at t1 is (timelessly) a different entity from the set of human beings existing at t2 because of births and deaths.

Biological classes exist in time

Darwin: because the universals of which they are extensions exist in time
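The contrast between a timeless set and a class that survives turnover in its members can be sketched as follows. The names and lifespans are hypothetical; a frozenset stands in for a set in the mathematical sense, and a function of time stands in for the class.

```python
# Hypothetical lifespans: (birth year, death year or None if still alive).
LIFESPANS = {"Ada": (1900, 1980), "Ben": (1950, None), "Cy": (1990, None)}


def extension_at(year, lifespans=LIFESPANS):
    """The immutable set of individuals existing at the given time."""
    return frozenset(name for name, (born, died) in lifespans.items()
                     if born <= year and (died is None or year < died))


set_1960 = extension_at(1960)
set_2000 = extension_at(2000)
```

Here set_1960 and set_2000 are two different sets, since a set is identified by its members and cannot change. What persists across births and deaths is not either set but the rule extension_at, which plays the role of the class determined by the universal.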