Species and Classification in Biology Barry Smith

Transcript of Species and Classification in Biology Barry Smith

Page 1: Species and Classification in Biology Barry Smith

Species and Classification in Biology

Barry Smith

http://ifomis.org

Page 2: Species and Classification in Biology Barry Smith

Page 3: Species and Classification in Biology Barry Smith

DNA

10^-9 m

Page 4: Species and Classification in Biology Barry Smith

DNA

Protein

Organelle

Cell

Tissue

Organ

Organism

10^-5 m

10^-1 m

10^-9 m

Page 5: Species and Classification in Biology Barry Smith

New golden age of classification*

~ 30 million species
30,000 genes in human
200,000 proteins
100s of cell types
100,000s of disease types
1,000,000s of biochemical pathways (including disease pathways)

*… legacy of Human Genome Project

Page 6: Species and Classification in Biology Barry Smith

DNA

Protein

Organelle

Cell

Tissue

Organ

Organism

10^-5 m

10^-1 m

10^-9 m

Page 7: Species and Classification in Biology Barry Smith

FUNCTIONAL GENOMICS

proteomics, reactomics, metabonomics,

phenomics, behaviouromics,

toxicopharmacogenomics…

Page 8: Species and Classification in Biology Barry Smith

The incompatibilities between different scientific cultures and terminologies

immunology

genetics

cell biology

Page 9: Species and Classification in Biology Barry Smith

have resurrected the problem of the unity of science in a new guise:

The logical positivist solution to this problem addressed a world in which sciences are associated with printed texts.

What happens when sciences are associated with databases?

Page 10: Species and Classification in Biology Barry Smith

… when each (chemical, pathological, immunological, toxicological) information system uses its own classifications

how can we overcome the incompatibilities which become apparent when data from distinct sources are combined?

Page 11: Species and Classification in Biology Barry Smith

Answer:

“Ontology”

Page 12: Species and Classification in Biology Barry Smith

= building software artefacts

standardized classification systems / controlled vocabularies

so that data from one source should be expressed in a language which makes it compatible with data from every other source

Page 13: Species and Classification in Biology Barry Smith

Google hits (in millions) 25.4.06

ontology: 52.4
ontology + philosophy: 2.7
ontology + information science: 6.0
ontology + database: 7.8

Page 14: Species and Classification in Biology Barry Smith

A Linnaean Species Hierarchy

Page 15: Species and Classification in Biology Barry Smith

(Small) Disease Hierarchy

Page 16: Species and Classification in Biology Barry Smith

Combining hierarchies

Organisms

Diseases

Page 17: Species and Classification in Biology Barry Smith

via Dependence Relations

Organisms Diseases

Page 18: Species and Classification in Biology Barry Smith

A Window on Reality

Page 19: Species and Classification in Biology Barry Smith

A Window on Reality

Organisms

Diseases

Page 20: Species and Classification in Biology Barry Smith

A Window on Reality

Page 21: Species and Classification in Biology Barry Smith

How to understand species (aka types, universals, kinds)

Species are something like invariants in reality which can be studied by science

Species have instances: this mouse, this cell, this cell membrane ...

Page 22: Species and Classification in Biology Barry Smith

Entity =def

anything which exists, including things and processes, functions and qualities, beliefs and actions, documents and software

Page 23: Species and Classification in Biology Barry Smith

Domain =def

a portion of reality that forms the subject-matter of a single science or technology or mode of study;

proteomics
radiology
viral infections in mouse

Page 24: Species and Classification in Biology Barry Smith

Representation =def

an image, idea, map, picture, name or description ... of some entity or entities.

Page 25: Species and Classification in Biology Barry Smith

Analogue representations

Page 26: Species and Classification in Biology Barry Smith

Representational units =def

terms, icons, photographs, identifiers ... which refer, or are intended to refer, to entities

Page 27: Species and Classification in Biology Barry Smith

Composite representation =def

representation (1) built out of representational units which (2) form a structure that mirrors, or is intended to mirror, the entities in some domain

Page 28: Species and Classification in Biology Barry Smith

The Periodic Table

Page 29: Species and Classification in Biology Barry Smith

Ontologies are here

Page 30: Species and Classification in Biology Barry Smith

Ontologies are representational artifacts

Page 31: Species and Classification in Biology Barry Smith

What do ontologies represent?

Page 32: Species and Classification in Biology Barry Smith

A 515287 DC3300 Dust Collector Fan
B 521683 Gilmer Belt
C 521682 Motor Drive Belt

Page 33: Species and Classification in Biology Barry Smith

A 515287 DC3300 Dust Collector Fan
B 521683 Gilmer Belt
C 521682 Motor Drive Belt

instances

types

Page 34: Species and Classification in Biology Barry Smith

Two kinds of composite representational artifacts

Databases, inventories: represent what is particular in reality = instances

Ontologies, terminologies, catalogs: represent what is general in reality = types
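
The distinction can be sketched in a few lines of Python, using the part numbers from the preceding slide as the catalog. This is only an illustration: the serial numbers and the `instances_of` helper are hypothetical and do not come from the slides.

```python
# Catalog (ontology-like): represents types. Part numbers are taken from the slide.
CATALOG = {
    "515287": "DC3300 Dust Collector Fan",
    "521683": "Gilmer Belt",
    "521682": "Motor Drive Belt",
}

# Inventory (database-like): represents instances. Serial numbers are made up.
INVENTORY = [
    {"serial": "S-001", "part_no": "521683"},
    {"serial": "S-002", "part_no": "521683"},
    {"serial": "S-003", "part_no": "515287"},
]


def instances_of(part_no, inventory=INVENTORY):
    """Return the serial numbers of all particular items of the given type."""
    return [item["serial"] for item in inventory if item["part_no"] == part_no]


print(instances_of("521683"))
```

The catalog stays the same when items are sold or restocked; only the inventory changes.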

Page 35: Species and Classification in Biology Barry Smith

What do ontologies represent?

Page 36: Species and Classification in Biology Barry Smith

Ontologies do not represent concepts in people’s heads

Page 37: Species and Classification in Biology Barry Smith

Ontology is a tool of science

Scientists do not describe the concepts in scientists’ heads

They describe the types in reality, as a step towards finding ways to reason about (and treat) instances of these types

Page 38: Species and Classification in Biology Barry Smith

The biologist has a cognitive representation which involves theoretical knowledge

derived from textbooks

Page 39: Species and Classification in Biology Barry Smith

An ontology is like a scientific text; it is a representation of types in reality

Page 40: Species and Classification in Biology Barry Smith

Two kinds of composite representational artifacts

Databases represent instances

Ontologies represent types

Page 41: Species and Classification in Biology Barry Smith

Instances stand in similarity relations

Frank and Bill are similar as humans, mammals, animals, etc.

Human, mammal and animal are types at different levels of granularity

Page 42: Species and Classification in Biology Barry Smith

[Figure: hierarchy of types (substance, organism, animal, mammal, cat, siamese, frog) with instances beneath]

Page 43: Species and Classification in Biology Barry Smith

science needs to find uniform ways of representing types

ontology =def a representational artifact whose representational units (which may be drawn from a natural or from some formalized language) are intended to represent
1. types in reality
2. those relations between these types which obtain universally (= for all instances)

lung is_a anatomical structure
lobe of lung part_of lung

Page 44: Species and Classification in Biology Barry Smith

is_a

A is_a B =def

For all x, if x instance_of A then x instance_of B

cell division is_a biological process
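
A minimal Python sketch of this definition, assuming a small, hypothetical table of instances (the individual names are invented). Note that a finite table can only approximate the definition, which quantifies over all instances.

```python
# Hypothetical toy data: which individuals instantiate which types.
INSTANCES = {
    "cell division": {"division_1", "division_2"},
    "biological process": {"division_1", "division_2", "digestion_1"},
}


def is_a(a, b, instances=INSTANCES):
    """A is_a B: every x that is instance_of A is also instance_of B."""
    return all(x in instances[b] for x in instances[a])


print(is_a("cell division", "biological process"))
print(is_a("biological process", "cell division"))
```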

Page 45: Species and Classification in Biology Barry Smith

Entities

Page 46: Species and Classification in Biology Barry Smith

Entities

universals (species, types, taxa, …)

particulars (individuals, tokens, instances)

Page 47: Species and Classification in Biology Barry Smith

Canonical instances within the realm of individuals

= those individuals which
1. instantiate universals (entering into biological laws)
2. are prototypical

Canonical Anatomy: no Siamese twins, no six-fingered giants, no amputation stumps, …

Page 48: Species and Classification in Biology Barry Smith

Entities

universals

instances

junk

example of junk particulars: desk-mountain

Page 49: Species and Classification in Biology Barry Smith

Entities

human

Jane

inst

Page 50: Species and Classification in Biology Barry Smith

Ontologies are More than Just Taxonomies

Page 51: Species and Classification in Biology Barry Smith

The Gene Ontology

7 million Google hits

a cross-species controlled vocabulary for annotations of genes and gene products

deeper than Darwinianism

Page 52: Species and Classification in Biology Barry Smith

When a gene is identified

three important types of questions need to be addressed:

1. Where is it located in the cell?
2. What functions does it have on the molecular level?
3. To what biological processes do these functions contribute?

Page 53: Species and Classification in Biology Barry Smith

GO has three ontologies

molecular functions

cellular components

biological processes

Page 54: Species and Classification in Biology Barry Smith

GO astonishingly influential

used by all major species genome projects
used by all major pharmacological research groups
used by all major bioinformatics research groups

Page 55: Species and Classification in Biology Barry Smith

GO part of the Open Biological Ontologies consortium

Fungal Ontology
Plant Ontology
Yeast Ontology
Disease Ontology
Mouse Anatomy Ontology
Cell Ontology
Sequence Ontology
Relations Ontology

Page 56: Species and Classification in Biology Barry Smith

Each of GO’s ontologies

is organized in a graph-theoretical structure involving two sorts of links or edges:

is-a (= is a subtype of) (copulation is-a biological process)

part-of (cell wall part-of cell)
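
Such a structure can be sketched as a labelled graph. The following Python is a minimal illustration, assuming a simple edge-list encoding; it is not GO's actual file format or API, and the helper names are hypothetical.

```python
from collections import defaultdict

# Edges are (child, relation, parent); the two examples come from the slide.
EDGES = [
    ("copulation", "is_a", "biological process"),
    ("cell wall", "part_of", "cell"),
]


def build_graph(edges):
    """Index the labelled edges by their child term."""
    graph = defaultdict(list)
    for child, relation, parent in edges:
        graph[child].append((relation, parent))
    return graph


def parents(graph, term, relation):
    """Return the terms reached from `term` along edges of the given kind."""
    return [p for r, p in graph.get(term, []) if r == relation]


graph = build_graph(EDGES)
print(parents(graph, "copulation", "is_a"))
print(parents(graph, "cell wall", "part_of"))
```

Keeping the relation as an edge label lets one graph carry both kinds of link while still allowing queries that follow only one of them.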

Page 57: Species and Classification in Biology Barry Smith

Page 58: Species and Classification in Biology Barry Smith

The Gene Ontology

a ‘controlled vocabulary’
designed to standardize annotation of genes and gene products
used by over 20 genome databases and many other groups in academia and industry
and a methodology much imitated

Page 59: Species and Classification in Biology Barry Smith

The Methodology of Annotations

Scientific curators use experimental observations reported in the biomedical literature to link gene products with GO terms in annotations.

The gene annotations taken together yield a slowly growing computer-interpretable map of biological reality.

The process of annotating literature also leads to improvements and extensions of the ontology, which institutes a virtuous cycle of improvement in the quality and reach of future annotations and of the ontology itself.

The Gene Ontology as Cartoon
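
As a rough illustration of what an annotation links together, here is a minimal Python sketch. The record layout, the gene product identifiers and the source labels are all hypothetical; only the term names are taken from the slides, and this is not GO's actual annotation format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    gene_product: str  # hypothetical identifier
    go_term: str       # a term name from one of GO's three ontologies
    ontology: str      # which of the three ontologies the term belongs to
    source: str        # placeholder for the publication reporting the observation


def terms_by_gene_product(annotations):
    """Collect, for each gene product, the set of terms it is annotated with."""
    index = defaultdict(set)
    for annotation in annotations:
        index[annotation.gene_product].add(annotation.go_term)
    return dict(index)


annotations = [
    Annotation("geneX-product", "nucleus", "cellular component", "paper-1"),
    Annotation("geneX-product", "protein stabilization", "molecular function", "paper-2"),
    Annotation("geneY-product", "membrane", "cellular component", "paper-3"),
]

print(terms_by_gene_product(annotations))
```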

Page 60: Species and Classification in Biology Barry Smith

cellular components: 1372 component terms
molecular functions: 7271 function terms
biological processes: 8069 process terms

Page 61: Species and Classification in Biology Barry Smith

The Cellular Component Ontology (counterpart of anatomy)

membrane

nucleus

Page 62: Species and Classification in Biology Barry Smith

The Molecular Function Ontology

protein stabilization

The Molecular Function ontology is (roughly) an ontology of actions on the molecular level of granularity

Page 63: Species and Classification in Biology Barry Smith

Biological Process Ontology

death

An ontology of occurrents on the level of granularity of cells, organs and whole organisms

Page 64: Species and Classification in Biology Barry Smith

GO here an example

a. of the sorts of problems confronting life science data integration

b. of the degree to which formal methods are relevant to the solution of these problems

Page 65: Species and Classification in Biology Barry Smith

Each of GO’s ontologies

is organized in a graph-theoretical data structure involving two sorts of links or edges:

is-a (= is a subtype of) (copulation is-a biological process)

part-of (cell wall part-of cell)

Page 66: Species and Classification in Biology Barry Smith

Linnaeus

Page 67: Species and Classification in Biology Barry Smith

Page 68: Species and Classification in Biology Barry Smith

Entities

Page 69: Species and Classification in Biology Barry Smith

Entities

universals (kinds, types, taxa, …)

particulars (individuals, tokens, instances …)

Axiom: Nothing is both a universal and a particular

Page 70: Species and Classification in Biology Barry Smith

Entities

universals*

*natural, biological, kinds

Page 71: Species and Classification in Biology Barry Smith

Entities

universals

instances

Page 72: Species and Classification in Biology Barry Smith

universals are natural kinds

Instances are natural exemplars of natural kinds (problem of non-standard instances)

Not all individuals are instances of universals

Page 73: Species and Classification in Biology Barry Smith

Entities

universals

instances

penumbra of borderline cases

Page 74: Species and Classification in Biology Barry Smith

Entities

universals

instances

junk

example of junk: beachball-desk

Page 75: Species and Classification in Biology Barry Smith

Primitive relations: inst and part

inst(Jane, human being)
part(Jane’s heart, Jane’s body)

A universal is anything that is instantiated
An instance is anything (any individual) that instantiates some universal

Page 76: Species and Classification in Biology Barry Smith

Entities

human

Jane

inst

Page 77: Species and Classification in Biology Barry Smith

A is_a B genus(B)

species(A)

instances

Page 78: Species and Classification in Biology Barry Smith

is-a

D3* e is_a f =def universal(e) & universal(f) & ∀x (inst(x, e) → inst(x, f)).

genus(A) =def universal(A) & ∃B (B is_a A & B ≠ A)

species(A) =def universal(A) & ∃B (A is_a B & B ≠ A)

Page 79: Species and Classification in Biology Barry Smith

solve problem of false positives

insist that

A is_a B

holds always as a matter of scientific law

Page 80: Species and Classification in Biology Barry Smith

nearest species

nearestspecies(A, B) =def A is_a B & ∀C ((A is_a C & C is_a B) → (C = A or C = B))

B

A
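
The nearest-species definition can be checked mechanically on a small taxonomy. The Python below is a sketch under stated assumptions: the `PARENT` table is a hypothetical toy hierarchy, each type is given a single parent, and is_a is read as reflexive and transitive. Read literally, the definition is also satisfied when A = B; the sketch does not add a distinctness condition.

```python
# Hypothetical toy hierarchy: each type mapped to its immediate parent type.
PARENT = {"siamese": "cat", "cat": "mammal", "mammal": "animal", "frog": "animal"}


def ancestors_or_self(a, parent=PARENT):
    """All B such that a is_a B, reading is_a as reflexive and transitive."""
    chain = [a]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return set(chain)


def is_a(a, b, parent=PARENT):
    return b in ancestors_or_self(a, parent)


def nearest_species(a, b, parent=PARENT):
    """A is_a B, and no C distinct from both lies between A and B."""
    universals = set(parent) | set(parent.values())
    return is_a(a, b, parent) and all(
        c in (a, b)
        for c in universals
        if is_a(a, c, parent) and is_a(c, b, parent)
    )


print(nearest_species("cat", "mammal"))
print(nearest_species("cat", "animal"))
```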

Page 81: Species and Classification in Biology Barry Smith

Definitions

highest genus

lowest species

instances

Page 82: Species and Classification in Biology Barry Smith

Lowest Species and Highest Genus

lowestspecies(A) =def species(A) & not-genus(A)

highestgenus(A) =def genus(A) & not-species(A)

Theorem: universal(A) → (genus(A) or lowestspecies(A))
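
A short Python sketch of these definitions, assuming a hypothetical toy hierarchy in which every type has at most one parent. Here genus and species are read off the proper parent-child links, and the final check only confirms the theorem for this one example; it is not a proof.

```python
# Hypothetical toy hierarchy: each type mapped to its immediate parent type.
PARENT = {"siamese": "cat", "cat": "mammal", "mammal": "animal", "frog": "animal"}
UNIVERSALS = set(PARENT) | set(PARENT.values())


def genus(a):
    """A has some proper subtype."""
    return a in PARENT.values()


def species(a):
    """A has some proper supertype."""
    return a in PARENT


def lowest_species(a):
    return species(a) and not genus(a)


def highest_genus(a):
    return genus(a) and not species(a)


# Every universal in the toy hierarchy is a genus or a lowest species.
theorem_holds = all(genus(a) or lowest_species(a) for a in UNIVERSALS)
print(theorem_holds)
```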

Page 83: Species and Classification in Biology Barry Smith

Axioms

Every universal has at least one instance

Distinct lowest species never share instances

SINGLE INHERITANCE: Every species is the nearest species to

exactly one genus
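
Two of these axioms are easy to test on a concrete taxonomy. The sketch below assumes a hypothetical toy taxonomy given as (species, nearest genus) pairs plus an invented table of instances; the function names are illustrative only.

```python
from collections import Counter

# Hypothetical toy taxonomy: (species, nearest genus) pairs.
EDGES = [("cat", "mammal"), ("dog", "mammal"), ("mammal", "animal"), ("frog", "animal")]

# Hypothetical instances of each universal.
INSTANCES = {
    "cat": {"Tibbles"},
    "dog": {"Rex"},
    "frog": {"Kermit"},
    "mammal": {"Tibbles", "Rex"},
    "animal": {"Tibbles", "Rex", "Kermit"},
}


def every_universal_instantiated(instances):
    """Every universal has at least one instance."""
    return all(len(members) > 0 for members in instances.values())


def single_inheritance(edges):
    """Every species is the nearest species to exactly one genus."""
    genera_per_species = Counter(child for child, _ in edges)
    return all(count == 1 for count in genera_per_species.values())


print(every_universal_instantiated(INSTANCES))
print(single_inheritance(EDGES))
```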

Page 84: Species and Classification in Biology Barry Smith

Axioms governing inst

(genus(A) & inst(x, A)) → ∃B (nearestspecies(B, A) & inst(x, B))
EVERY GENUS HAS AN INSTANTIATED SPECIES

nearestspecies(A, B) → A’s instances are properly included in B’s instances
EACH SPECIES HAS A SMALLER CLASS OF INSTANCES THAN ITS GENUS

Page 85: Species and Classification in Biology Barry Smith

Axioms

nearestspecies(B, A) → ∃C (nearestspecies(C, A) & B ≠ C)
EVERY GENUS HAS AT LEAST TWO CHILDREN

(nearestspecies(B, A) & nearestspecies(C, A) & B ≠ C) → not-∃x (inst(x, B) & inst(x, C))
SPECIES OF A COMMON GENUS NEVER SHARE INSTANCES
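
Both axioms can be verified on a finite example. This Python sketch assumes a hypothetical taxonomy (a mapping from each genus to its nearest species) and an invented table of instances; it checks the axioms for that data only.

```python
from itertools import combinations

# Hypothetical toy taxonomy: each genus mapped to its nearest species.
CHILDREN = {"animal": ["mammal", "frog"], "mammal": ["cat", "dog"]}

# Hypothetical instances of each universal.
INSTANCES = {
    "cat": {"Tibbles"},
    "dog": {"Rex"},
    "frog": {"Kermit"},
    "mammal": {"Tibbles", "Rex"},
    "animal": {"Tibbles", "Rex", "Kermit"},
}


def every_genus_has_two_children(children):
    return all(len(kids) >= 2 for kids in children.values())


def siblings_never_share_instances(children, instances):
    """Species of a common genus have pairwise disjoint sets of instances."""
    return all(
        instances[b].isdisjoint(instances[c])
        for kids in children.values()
        for b, c in combinations(kids, 2)
    )


print(every_genus_has_two_children(CHILDREN))
print(siblings_never_share_instances(CHILDREN, INSTANCES))
```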

Page 86: Species and Classification in Biology Barry Smith

Theorems

(genus(A) & inst(x, A)) → ∃B (lowestspecies(B) & B is_a A & inst(x, B))
EVERY INSTANCE IS ALSO AN INSTANCE OF SOME LOWEST SPECIES

(genus(A) & lowestspecies(B) & ∃x (inst(x, A) & inst(x, B))) → B is_a A
IF AN INSTANCE OF A LOWEST SPECIES IS AN INSTANCE OF A GENUS THEN THE LOWEST SPECIES IS A CHILD OF THE GENUS

Page 87: Species and Classification in Biology Barry Smith

Theorems

(universal(A) & universal(B)) → (A = B or A is_a B or B is_a A or not-∃x (inst(x, A) & inst(x, B)))

DISTINCT UNIVERSALS EITHER STAND IN A PARENT-CHILD RELATIONSHIP OR THEY HAVE NO INSTANCES IN COMMON

Page 88: Species and Classification in Biology Barry Smith

Theorems

(A is_a B & A is_a C) → (B = C or B is_a C or C is_a B)

UNIVERSALS WHICH SHARE A CHILD IN COMMON ARE EITHER IDENTICAL OR ONE IS SUBORDINATED TO THE OTHER

Page 89: Species and Classification in Biology Barry Smith

Theorems

(genus(A) & genus(B) & ∃x (inst(x, A) & inst(x, B))) → ∃C (C is_a A & C is_a B)

IF TWO GENERA HAVE A COMMON INSTANCE THEN THEY HAVE A COMMON CHILD

Page 90: Species and Classification in Biology Barry Smith

Expanding the theory

Sexually reproducing organisms

Organisms in general

To take account of development (child, adult; larva, butterfly)

Biological processes

Biological functions

-- at different levels of granularity

Page 91: Species and Classification in Biology Barry Smith

How to understand species (aka types, universals, kinds)

Species are something like invariants in reality which can be studied by science

Species have instances: this mouse, this cell, this cell membrane ...

Page 92: Species and Classification in Biology Barry Smith

Universals, Classes, Sets

A class is the extension of a universal

Page 93: Species and Classification in Biology Barry Smith

Class =def

a maximal collection of particulars determined by a general term (‘cell’, ‘mouse’, ‘Saarländer’)

the class A = the collection of all particulars x for which ‘x is A’ is true
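
Read computationally, the class is what you get by filtering all particulars with the general term. A minimal Python sketch, assuming a hypothetical list of particulars tagged with the term that applies to them (the individual names are invented):

```python
# Hypothetical particulars, each paired with the general term that is true of it.
PARTICULARS = [("Mickey", "mouse"), ("Jerry", "mouse"), ("cell_17", "cell")]


def extension(term, particulars=PARTICULARS):
    """The class A: the collection of all particulars x for which 'x is A' is true."""
    return {name for name, kind in particulars if kind == term}


print(extension("mouse"))
print(extension("cell"))
```

The collection is maximal in the sense that no particular satisfying the term is left out of it.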

Page 94: Species and Classification in Biology Barry Smith

Universals and Classes vs. Sums

The former are marked by granularity: they divide up the domain into whole units, whose interior parts are traced over. The universal human being is instantiated only by human beings as single, whole units.

A mereological sum is not granular in this sense (molecules are parts of the mereological sum of human beings)

Page 95: Species and Classification in Biology Barry Smith

A bad solution

Identify both universals and classes with sets in the mathematical sense

Problem of false positives

adult is_a child
lion in Leipzig is_a lion
animal owned by the Emperor is_a mammal
mammal weighing less than 200 kg is_a animal

Page 96: Species and Classification in Biology Barry Smith

Sets in the mathematical sense are marked by granularity

Granularity = each class or set is laid across reality like a grid consisting (1) of a number of slots or pigeonholes each (2) occupied by some member.

Each set is (1) associated with a specific number of slots, each of which (2) must be occupied by some specific member.

A class survives the turnover in its instances: both (1) the number of slots and (2) the individuals occupying these slots may vary with time

Page 97: Species and Classification in Biology Barry Smith

But sets are timeless

A set is an abstract structure, existing outside time and space. The set of human beings existing at t1 is (timelessly) a different entity from the set of human beings existing at t2, because of births and deaths.

Biological classes exist in time

Darwin: because the universals of which they are extensions exist in time
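
The contrast can be illustrated in Python, where a `frozenset` is identified by its members while an object keeps its identity as its contents change. This is only a sketch: the `BiologicalClass` type, the time labels and the individuals are hypothetical.

```python
class BiologicalClass:
    """A class that persists while its extension changes over time."""

    def __init__(self, name):
        self.name = name
        self._extension_at = {}  # time label -> frozenset of members

    def record(self, t, members):
        self._extension_at[t] = frozenset(members)

    def extension(self, t):
        return self._extension_at[t]


humans = BiologicalClass("human being")
humans.record("t1", {"Frank", "Bill"})
humans.record("t2", {"Bill", "Jane"})  # one death and one birth in between

# The two sets are different entities, yet both belong to one and the same class.
print(humans.extension("t1") == humans.extension("t2"))
```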