1 Introduction to Bio-Ontologies Barry Smith .

88
1 Introduction to Bio- Ontologies Barry Smith http:// ontology.buffalo.edu/smith

Transcript of 1 Introduction to Bio-Ontologies Barry Smith .

Page 1: 1 Introduction to Bio-Ontologies Barry Smith .

1

Introduction to Bio-Ontologies

Barry Smithhttp://ontology.buffalo.edu/smith

Page 2: 1 Introduction to Bio-Ontologies Barry Smith .

Outline1. Who am I?

2. How to find your data

3. How to do biology across the genome

4. How to extend the GO methodology to clinical and translational medicine

5. Anatomy Ontologies: An OBO Foundry success story

6. The Infectious Disease Ontology

7. The Environment Ontology

Page 3: 1 Introduction to Bio-Ontologies Barry Smith .

1.1. Who am I?Who am I?

2. How to find your data

3. How to do biology across the genome

4. How to extend the GO methodology to clinical and translational medicine

5. Anatomy Ontologies: An OBO Foundry success story

6. The Infectious Disease Ontology

7. The Environment Ontology

Page 4: 1 Introduction to Bio-Ontologies Barry Smith .

Who am I?

Foundational Model of Anatomy Ontology (FMA)

Common Anatomy Reference Ontology (CARO)

Protein Ontology (PRO)

Infectious Disease Ontology (IDO)

Ontology for General Medical Science (OGMS)

Plant Ontology (PO)

Biometrics Upper Ontology

4

Page 5: 1 Introduction to Bio-Ontologies Barry Smith .

NCBO: National Center for Biomedical Ontology (NIH Roadmap Center)

5

− Stanford Biomedical Informatics Research− The Mayo Clinic− University at Buffalo Department of

Philosophy

http://bioportal.bioontology.org

Page 6: 1 Introduction to Bio-Ontologies Barry Smith .

6

Page 7: 1 Introduction to Bio-Ontologies Barry Smith .

National Cancer Institute Thesaurus

7

Preferred Name (Preferred_Name): Wood

Definitions (DEFINITION)The hard, fibrous substance composing most of the stem and branches of a tree or shrub, and lying beneath the bark; the xylem.

Full Id:http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#Wood

Alt Definition: Fibrous plant material under the bark that is created by lateral cell division from the vascular cambium. Noted for high content cellulose, hemicellulose, and lignin in the cell walls.FDA

Page 8: 1 Introduction to Bio-Ontologies Barry Smith .

1. Who am I?

2.2. How to find your dataHow to find your data

3. How to do biology across the genome

4. How to extend the GO methodology to clinical and translational medicine

5. Anatomy Ontologies: An OBO Foundry success story

6. The Infectious Disease Ontology

7. The Environment Ontology

Page 9: 1 Introduction to Bio-Ontologies Barry Smith .

9

Page 10: 1 Introduction to Bio-Ontologies Barry Smith .

10

Page 11: 1 Introduction to Bio-Ontologies Barry Smith .

11

Page 12: 1 Introduction to Bio-Ontologies Barry Smith .

The Infinite Monkey (Fortuitous Interoperability)

strategy to resolve data silos

Page 13: 1 Introduction to Bio-Ontologies Barry Smith .

How to find your data?

How to find and integrate other people’s data?

How to reason with data when you find it?

How to understand the significance of the data you collected 3 years earlier?

Part of the solution must involve consensus-based, standardized terminologies and coding schemes

13

Page 14: 1 Introduction to Bio-Ontologies Barry Smith .

14

NIH Mandates for Sharing of Research Data

Investigators submitting an NIH application seeking $500,000 or more in any single year are expected to include a plan for data sharing

(http://grants.nih.gov/grants/policy/data_sharing)

Page 15: 1 Introduction to Bio-Ontologies Barry Smith .

Making data (re-)usable through standards

• Standards provide– common structure and terminology– single data source for review (less redundant

data)

• Standards allow– use of common tools and techniques– common training– single validation of data

15

Page 16: 1 Introduction to Bio-Ontologies Barry Smith .

16

Problems with standards

• Standards involve considerable costs of re-tooling, maintenance, training, ...

• Not all standards are of equal quality

• Bad standards create lasting problems

Page 17: 1 Introduction to Bio-Ontologies Barry Smith .

Ontology success stories, and some reasons for failure

Linked Open Data in the Semantic Web 17

Page 18: 1 Introduction to Bio-Ontologies Barry Smith .

18etc.

Page 19: 1 Introduction to Bio-Ontologies Barry Smith .

The more ontology is successful, the more it fails

• As ontologies (controlled vocabularies) become easier to create, and to use

• more and more ontologies are constructed

• thereby recreating the very silo problems ontologies were designed to solve

How to solve this problem?

19

Page 20: 1 Introduction to Bio-Ontologies Barry Smith .

1. Who am I?

2. How to find your data

3.3. How to do biology across the genomeHow to do biology across the genome

4. How to extend the GO methodology to clinical and translational medicine

5. Anatomy Ontologies: An OBO Foundry success story

6. The Infectious Disease Ontology

7. The Environment Ontology

Page 21: 1 Introduction to Bio-Ontologies Barry Smith .

MKVSDRRKFEKANFDEFESALNNKNDLVHCPSITLFESIPTEVRSFYEDEKSGLIKVVKFRTGAMDRKRSFEKVVISVMVGKNVKKFLTFVEDEPDFQGGPISKYLIPKKINLMVYTLFQVHTLKFNRKDYDTLSLFYLNRGYYNELSFRVLERCHEIASARPNDSSTMRTFTDFVSGAPIVRSLQKSTIRKYGYNLAPYMFLLLHVDELSIFSAYQASLPGEKKVDTERLKRDLCPRKPIEIKYFSQICNDMMNKKDRLGDILHIILRACALNFGAGPRGGAGDEEDRSITNEEPIIPSVDEHGLKVCKLRSPNTPRRLRKTLDAVKALLVSSCACTARDLDIFDDNNGVAMWKWIKILYHEVAQETTLKDSYRITLVPSSDGISLLAFAGPQRNVYVDDTTRRIQLYTDYNKNGSSEPRLKTLDGLTSDYVFYFVTVLRQMQICALGNSYDAFNHDPWMDVVGFEDPNQVTNRDISRIVLYSYMFLNTAKGCLVEYATFRQYMRELPKNAPQKLNFREMRQGLIALGRHCVGSRFETDLYESATSELMANHSVQTGRNIYGVDFSLTSVSGTTATLLQERASERWIQWLGLESDYHCSFSSTRNAEDV

How to do biology across the genome?

Page 22: 1 Introduction to Bio-Ontologies Barry Smith .

MKVSDRRKFEKANFDEFESALNNKNDLVHCPSITLFESIPTEVRSFYEDEKSGLIKVVKFRTGAMDRKRSFEKVVISVMVGKNVKKFLTFVEDEPDFQGGPIPSKYLIPKKINLMVYTLFQVHTLKFNRKDYDTLSLFYLNRGYYNELSFRVLERCHEIASARPNDSSTMRTFTDFVSGAPIVRSLQKSTIRKYGYNLAPYMFLLLHVDELSIFSAYQASLPGEKKVDTERLKRDLCPRKPIEIKYFSQICNDMMNKKDRLGDILHIILRACALNFGAGPRGGAGDEEDRSITNEEPIIPSVDEHGLKVCKLRSPNTPRRLRKTLDAVKALLVSSCACTARDLDIFDDNNGVAMWKWIKILYHEVAQETTLKDSYRITLVPSSDGISLLAFAGPQRNVYVDDTTRRIQLYTDYNKNGSSEPRLKTLDGLTSDYVFYFVTVLRQMQICALGNSYDAFNHDPWMDVVGFEDPNQVTNRDISRIVLYSYMFLNTAKGCLVEYATFRQYMRELPKNAPQKLNFREMRQGLIALGRHCVGSRFETDLYESATSELMANHSVQTGRNIYGVDSFSLTSVSGTTATLLQERASERWIQWLGLESDYHCSFSSTRNAEDVVAGEAASSNHHQKISRVTRKRPREPKSTNDILVAGQKLFGSSFEFRDLHQLRLCYEIYMADTPSVAVQAPPGYGKTELFHLPLIALASKGDVEYVSFLFVPYTVLLANCMIRLGRRGCLNVAPVRNFIEEGYDGVTDLYVGIYDDLASTNFTDRIAAWENIVECTFRTNNVKLGYLIVDEFHNFETEVYRQSQFGGITNLDFDAFEKAIFLSGTAPEAVADAALQRIGLTGLAKKSMDINELKRSEDLSRGLSSYPTRMFNLIKEKSEVPLGHVHKIRKKVESQPEEALKLLLALFESEPESKAIVVASTTNEVEELACSWRKYFRVVWIHGKLGAAEKVSRTKEFVTDGSMQVLIGTKLVTEGIDIKQLMMVIMLDNRLNIIELIQGVGRLRDGGLCYLLSRKNSWAARNRKGELPPKEGCITEQVREFYGLESKKGKKGQHVGCCGSRTDLSADTVELIERMDRLAEKQATASMSIVALPSSFQESNSSDRYRKYCSSDEDSNTCIHGSANASTNASTNAITTASTNVRTNATTNASTNATTNASTNASTNATTNASTNATTNSSTNATTTASTNVRTSATTTASINVRTSATTTESTNSSTNATTTESTNSSTNATTTESTNSNTSATTTASINVRTSATTTESTNSSTSATTTASINVRTSATTTKSINSSTNATTTESTNSNTNATTTESTNSSTNATTTESTNSSTNATTTESTNSNTSAATTESTNSNTSATTTESTNASAKEDANKDGNAEDNRFHPVTDINKESYKRKGSQMVLLERKKLKAQFPNTSENMNVLQFLGFRSDEIKHLFLYGIDIYFCPEGVFTQYGLCKGCQKMFELCVCWAGQKVSYRRIAWEALAVERMLRNDEEYKEYLEDIEPYHGDPVGYLKYFSVKRREIYSQIQRNYAWYLAITRRRETISVLDSTRGKQGSQVFRMSGRQIKELYFKVWSNLRESKTEVLQYFLNWDEKKCQEEWEAKDDTVVVEALEKGGVFQRLRSMTSAGLQGPQYVKLQFSRHHRQLRSRYELSLGMHLRDQIALGVTPSKVPHWTAFLSMLIGLFYNKTFRQKLEYLLEQISEVWLLPHWLDLANVEVLAADDTRVPLYMLMVAVHKELDSDDVPDGRFDILLCRDSSREVGE

22

Page 23: 1 Introduction to Bio-Ontologies Barry Smith .

Biomedical Ontology in PubMed

Page 24: 1 Introduction to Bio-Ontologies Barry Smith .

By far the most successful: GO (Gene Ontology)

24

Page 25: 1 Introduction to Bio-Ontologies Barry Smith .

Clark et al., 2005

part_of

is_a

25

Page 26: 1 Introduction to Bio-Ontologies Barry Smith .

26

Page 27: 1 Introduction to Bio-Ontologies Barry Smith .

The Gene Ontology

27

Page 28: 1 Introduction to Bio-Ontologies Barry Smith .

28

what cellular component?

what molecular function?

what biological process?

the GO works through annotation of data

Page 29: 1 Introduction to Bio-Ontologies Barry Smith .

29

what cellular component?

what molecular function?

what biological process?

three types of data

Page 30: 1 Introduction to Bio-Ontologies Barry Smith .

WormBase

Gramene

FlyBase

Rat Genome Database

DictyBase

Mouse Genome Database

The Arabidopsis Information Resource

The Zebrafish Information Network

Berkeley Drosophila Genome Project

Saccharomyces Genome Database

...

30

Gene Ontology Consortium

Page 31: 1 Introduction to Bio-Ontologies Barry Smith .

Benefits of GO

1. rooted in basic experimental biology

2. links people to data and to literature

3. links data to data

• across species (human, mouse, yeast, fly ...)

• across granularities (molecule, cell, organ, organism, population)

4. links medicine to biological science

5. cumulation of scientific knowledge in algorithmically tractable form

31

Page 32: 1 Introduction to Bio-Ontologies Barry Smith .

A strategy for translational medicine

Sjöblöm T, et al. analyzed 13,023 genes in 11 breast and 11 colorectal cancers

using functional information captured by GO identified 189 genes as being mutated at significant frequency and thus as providing targets for diagnostic and therapeutic intervention.

Science. 2006 Oct 13;314(5797):268-74.

32

Page 33: 1 Introduction to Bio-Ontologies Barry Smith .

1. Who am I?

2. How to find your data

3. How to do biology across the genome

4.4. How to extend the GO methodology to clinical and How to extend the GO methodology to clinical and translational medicine: Open Biomedical Ontologiestranslational medicine: Open Biomedical Ontologies

5. Anatomy Ontologies: An OBO Foundry success story

6. The Infectious Disease Ontology

7. The Environment Ontology

Page 34: 1 Introduction to Bio-Ontologies Barry Smith .

34

Page 35: 1 Introduction to Bio-Ontologies Barry Smith .

35

Ontology Scope URL Custodians

Cell Ontology (CL)

cell types from prokaryotes to mammals

obo.sourceforge.net/cgi-

bin/detail.cgi?cell

Jonathan Bard, Michael Ashburner, Oliver Hofman

Chemical Entities of Bio-

logical Interest (ChEBI)

molecular entities ebi.ac.uk/chebiPaula Dematos,Rafael Alcantara

Common Anatomy Refer-

ence Ontology (CARO)

anatomical structures in human and model

organisms(under development)

Melissa Haendel, Terry Hayamizu, Cornelius

Rosse, David Sutherland,

Foundational Model of Anatomy (FMA)

structure of the human body

fma.biostr.washington.

edu

JLV Mejino Jr.,Cornelius Rosse

Functional Genomics Investigation

Ontology (FuGO)

design, protocol, data instrumentation, and

analysisfugo.sf.net FuGO Working Group

Gene Ontology (GO)

cellular components, molecular functions, biological processes

www.geneontology.org

Gene Ontology Consortium

Phenotypic Quality Ontology

(PaTO)

qualities of anatomical structures

obo.sourceforge.net/cgi

-bin/ detail.cgi?attribute_and_value

Michael Ashburner, Suzanna

Lewis, Georgios Gkoutos

Protein Ontology (PrO)

protein types and modifications

(under development)Protein Ontology

Consortium

Relation Ontology (RO)

relationsobo.sf.net/

relationshipBarry Smith, Chris

Mungall

RNA Ontology(RnaO)

three-dimensional RNA structures

(under development) RNA Ontology Consortium

Sequence Ontology(SO)

properties and features of nucleic sequences

song.sf.net Karen Eilbeck

Page 36: 1 Introduction to Bio-Ontologies Barry Smith .

36

RELATION TO TIME

GRANULARITY

CONTINUANT OCCURRENT

INDEPENDENT DEPENDENT

ORGAN ANDORGANISM

Organism(NCBI

Taxonomy)

Anatomical Entity(FMA, CARO)

OrganFunction

(FMP, CPRO) Phenotypic

Quality(PaTO)

Biological Process

(GO)CELL AND CELLULAR

COMPONENT

Cell(CL)

Cellular Compone

nt(FMA, GO)

Cellular Function

(GO)

MOLECULEMolecule

(ChEBI, SO,RnaO, PrO)

Molecular Function(GO)

Molecular Process

(GO)

http://obofoundry.org

Page 37: 1 Introduction to Bio-Ontologies Barry Smith .

Community / Population Ontology

37

− family, clan− ethnicity− religion − diet− social networking− education (literacy ...)− healthcare (economics ...)− household forms− demography− public health−...

Page 38: 1 Introduction to Bio-Ontologies Barry Smith .

38

RELATION TO TIME

GRANULARITY

CONTINUANT OCCURRENT

INDEPENDENT DEPENDENT

ORGAN ANDORGANISM

Organism(NCBI

Taxonomy)

Anatomical Entity(FMA, CARO)

OrganFunction

(FMP, CPRO) Phenotypic

Quality(PaTO)

Biological Process

(GO)CELL AND CELLULAR

COMPONENT

Cell(CL)

Cellular Compone

nt(FMA, GO)

Cellular Function

(GO)

MOLECULEMolecule

(ChEBI, SO,RnaO, PrO)

Molecular Function(GO)

Molecular Process

(GO)

http://obofoundry.org

Page 39: 1 Introduction to Bio-Ontologies Barry Smith .

39

RELATION TO TIME

GRANULARITY

CONTINUANT OCCURRENT

INDEPENDENT DEPENDENT

ORGAN ANDORGANISM

Family, Community, Deme, Population

OrganFunction

(FMP, CPRO) Phenotypic

Quality(PaTO)

Biological Process

(GO)

Organism(NCBI

Taxonomy)

Anatomical Entity(FMA, CARO)

CELL AND CELLULAR

COMPONENT

Cell(CL)

Cellular Componen

t(FMA, GO)

Cellular Function

(GO)

MOLECULEMolecule

(ChEBI, SO,RnaO, PrO)

Molecular Function(GO)

Molecular Process

(GO)

http://obofoundry.org

Page 40: 1 Introduction to Bio-Ontologies Barry Smith .

40

RELATION TO TIME

GRANULARITY

CONTINUANT OCCURRENT

INDEPENDENT DEPENDENT

COMPLEX OFORGANISMS

Family, Community, Deme, Population

OrganFunction

(FMP, CPRO)

Population Phenotype

PopulationProcess

ORGAN ANDORGANISM

Organism(NCBI

Taxonomy)

Anatomical Entity(FMA, CARO) Phenotypic

Quality(PaTO)

Biological Process

(GO)CELL AND CELLULAR

COMPONENT

Cell(CL)

Cellular Componen

t(FMA, GO)

Cellular Function

(GO)

MOLECULEMolecule

(ChEBI, SO,RnaO, PrO)

Molecular Function(GO)

Molecular Process

(GO)

http://obofoundry.org

Page 41: 1 Introduction to Bio-Ontologies Barry Smith .

41

RELATION TO TIME

GRANULARITY

CONTINUANT OCCURRENT

INDEPENDENT DEPENDENT

COMPLEX OF ORGANISMS

Family, Community,

Deme, Population OrganFunction

(FMP, CPRO)

Population

Phenotype

Population Process

ORGAN ANDORGANISM

Organism(NCBI

Taxonomy)

(FMA, CARO)

Phenotypic Quality(PaTO)

Biological Process

(GO)CELL AND CELLULAR

COMPONENT

Cell(CL)

Cell Com-

ponent(FMA, GO)

Cellular Function

(GO)

MOLECULEMolecule

(ChEBI, SO,RnaO, PrO)

Molecular Function(GO)

Molecular Process

(GO)http://obofoundry.org

E N

V I R

O N

M E

N T

Page 42: 1 Introduction to Bio-Ontologies Barry Smith .

42

RELATION TO TIME

GRANULARITY

CONTINUANT

INDEPENDENT

COMPLEX OF ORGANISMS

Family, Community,

Deme, Population

Environment of population

ORGAN ANDORGANISM

Organism(NCBI

Taxonomy)

(FMA, CARO)

Environment of single organism

CELL AND CELLULAR

COMPONENT

Cell(CL)

Cell Com-

ponent(FMA, GO)

Environment of cell

MOLECULEMolecule

(ChEBI, SO,RnaO, PrO)

Molecular environment

http://obofoundry.org

E N

V I R

O N

M E

N T

Page 43: 1 Introduction to Bio-Ontologies Barry Smith .

43

RELATION TO TIME

GRANULARITY

CONTINUANT

INDEPENDENT

COMPLEX OF ORGANISMS

Family, Community, Deme, Population

Environment of population

ORGAN ANDORGANISM

Organism(NCBI

Taxonomy)

(FMA, CARO)

Environment of single organism*

CELL AND CELLULAR

COMPONENT

Cell(CL)

Cell Com-

ponent(FMA, GO)

Environment of cell

MOLECULEMolecule

(ChEBI, SO,RnaO, PrO)

Molecular environment

E N

V I R

O N

M E

N T

* The sum total of the conditions and elements that make up the surroundings and influence the development and actions of an individual.

Page 44: 1 Introduction to Bio-Ontologies Barry Smith .

44

RELATION TO TIME

GRANULARITY

CONTINUANT OCCURRENT

INDEPENDENT DEPENDENT

COMPLEX OF ORGANISMS

Family, Community,

Deme, Population OrganFunction

(FMP, CPRO)

Population

Phenotype

Population

Process

ORGAN ANDORGANISM

Organism(NCBI

Taxonomy)

(FMA, CARO)

Phenotypic Quality(PaTO)

Biological

Process

(GO)CELL AND CELLULAR

COMPONENT

Cell(CL)

Cell Com-

ponent(FMA, GO)

Cellular Function

(GO)

MOLECULEMolecule

(ChEBI, SO,RnaO, PrO)

Molecular Function(GO)

Molecular Process

(GO)http://obofoundry.org

Pla

nt

An

ato

my

Plant G

rowth and

Developm

ental Stage

Page 45: 1 Introduction to Bio-Ontologies Barry Smith .

Goal of the OBO Foundry

all biomedical research data should cumulate to form a single, algorithmically processable, whole

Smith, et al. Nature Biotechnology, Nov 2007

45

Page 46: 1 Introduction to Bio-Ontologies Barry Smith .

46

CRITERIA

The ontology is open and available to be used by all.

The ontology is instantiated in, a common formal language and shares a common formal architecture

The developers of the ontology agree in advance to collaborate with developers of other OBO Foundry ontology where domains overlap.

OBO FOUNDRY CRITERIA

Page 47: 1 Introduction to Bio-Ontologies Barry Smith .

47

CRITERIA The developers of each ontology commit to its

maintenance in light of scientific advance, and to soliciting community feedback for its improvement.

They commit to working with other Foundry members to ensure that, for any particular domain, there is community convergence on a single controlled vocabulary.

Page 48: 1 Introduction to Bio-Ontologies Barry Smith .

48

Page 49: 1 Introduction to Bio-Ontologies Barry Smith .

49

Current OBO Foundry Ontologies

• Biological process (GO)• Cellular component (GO)• Chemical entities of biological interest• Molecular function (GO)• Phenotypic quality• PRotein Ontology (PRO)• Xenopus Anatomy and Development• Zebrafish Anatomy and Development

Page 50: 1 Introduction to Bio-Ontologies Barry Smith .

50

Foundry ontologies under review

Cell Ontology (CL)Infectious Disease Ontology (IDO)Ontology for Biomedical Investigations (OBI)Plant Ontology (PO)

Page 51: 1 Introduction to Bio-Ontologies Barry Smith .

51

Ontologies under construction

Allergy OntologyEnvironment Ontology (EnvO) Immunology Ontology (IDO)Mental Functioning Ontology (MFO)

Emotion Ontology (MFO-EM)Pain Ontology

Mental Disease Ontology (MDO)Neurological Disease Ontology (ND)Vaccine Ontology

Page 52: 1 Introduction to Bio-Ontologies Barry Smith .

1. Who am I?

2. How to find your data

3. How to do biology across the genome

4. How to extend the GO methodology to clinical and translational medicine

5.5. An OBO Foundry success storyAn OBO Foundry success story

6. The Infectious Disease Ontology

7. The Environment Ontology

Page 53: 1 Introduction to Bio-Ontologies Barry Smith .

Anatomy Ontologies

Fish Multi-Species Anatomy Ontology (NSF funding received)Ixodidae and Argasidae (Tick) Anatomy Ontology Mosquito Anatomy Ontology (MAO) Spider Anatomy Ontology (SPD)Xenopus Anatomy Ontology (XAO)

undergoing reform: Drosophila and Zebrafish Anatomy Ontologies

53

Page 54: 1 Introduction to Bio-Ontologies Barry Smith .

Ontologies facilitate grouping of annotations

brain 20 hindbrain 15 rhombomere 10

Query brain without ontology 20Query brain with ontology 45

54

Page 55: 1 Introduction to Bio-Ontologies Barry Smith .

Pleural Cavity

Pleural Cavity

Interlobar recess

Interlobar recess

Mesothelium of Pleura

Mesothelium of Pleura

Pleura(Wall of Sac)

Pleura(Wall of Sac)

VisceralPleura

VisceralPleura

Pleural SacPleural Sac

Parietal Pleura

Parietal Pleura

Anatomical SpaceAnatomical Space

OrganCavityOrganCavity

Serous SacCavity

Serous SacCavity

AnatomicalStructure

AnatomicalStructure

OrganOrgan

Serous SacSerous Sac

MediastinalPleura

MediastinalPleura

TissueTissue

Organ PartOrgan Part

Organ Subdivision

Organ Subdivision

Organ Component

Organ Component

Organ CavitySubdivision

Organ CavitySubdivision

Serous SacCavity

Subdivision

Serous SacCavity

Subdivision

part

_of

is_a

Page 56: 1 Introduction to Bio-Ontologies Barry Smith .

Basic Formal Ontology (Top Level)

http://www.ifomis.org/bfo/

Continuant Occurrent

IndependentContinuant

DependentContinuant

Anatomical Structure

ProcessStage

Quality

56

Page 57: 1 Introduction to Bio-Ontologies Barry Smith .

Anatomical Entity

Physical Anatomical Entity

Material Physical Anatomical Entity

-is a-

Non-material Physical Anatomical Entity

ConceptualAnatomical Entity

AnatomicalStructure

BodySubstance

BodyPart

HumanBody

OrganSystem

OrganCell

OrganPart

AnatomicalSpace

Anatomical Relationship

CellPart

Biological Macromolecule

Tissue

Non-Physical

Anatomical Entity

Physical Anatomical Entity

Material Physical Anatomical Entity

-is a-

Non-material Physical Anatomical Entity

ConceptualAnatomical Entity

AnatomicalStructure

BodySubstance

BodyPart

HumanBody

OrganSystem

OrganCell

OrganPart

AnatomicalSpace

Anatomical Relationship

CellPart

Biological Macromolecule

Tissue

Anatomical Entity

Physical Anatomical Entity

Material Physical Anatomical Entity

-is a-

Non-material Physical Anatomical Entity

ConceptualAnatomical Entity

AnatomicalStructure

BodySubstance

BodyPart

HumanBody

OrganSystem

OrganCell

OrganPart

AnatomicalSpace

Anatomical Relationship

CellPart

Biological Macromolecule

Tissue

Non-Physical

Independent Continuant

Page 58: 1 Introduction to Bio-Ontologies Barry Smith .

OBO Foundry organized in terms of Basic Formal Ontology

through the methodology of downward population

Each Foundry ontology can be seen as an extension of a single upper level ontology (BFO)

58

Page 59: 1 Introduction to Bio-Ontologies Barry Smith .

Example: The Cell Ontology

Page 60: 1 Introduction to Bio-Ontologies Barry Smith .

Continuant

IndependentContinuant

DependentContinuant

..... .....Quality Disposition

60

Page 61: 1 Introduction to Bio-Ontologies Barry Smith .

depends_on

Continuant Occurrent

process, eventIndependentContinuant

thing

DependentContinuant

quality

.... ..... .......temperature dependson bearer

61

Page 62: 1 Introduction to Bio-Ontologies Barry Smith .

this particular case of redness (of a particular fly eye)

the universal red

instantiates

an instance of eye (in a particular fly)

the universal eye

instantiates

depends_on

62Phenotype Ontology (PATO)

Page 63: 1 Introduction to Bio-Ontologies Barry Smith .

the particular case of redness (of a particular fly eye)

red

instantiates

an instance of an eye (in a particular fly)

eye

instantiates

depends on

color anatomical structure

is_a is_a

63

Page 64: 1 Introduction to Bio-Ontologies Barry Smith .

portion of water

this portion of H20

64

portion of ice

portion of liquid water

portion of gas

instantiates at t1

instantiates at t2

instantiates at t3

Phase transitions

Page 65: 1 Introduction to Bio-Ontologies Barry Smith .

plant

this plant

65

zygote embryo seed

instantiates at t1

instantiates at t2

instantiates at t3

Phase transitions

Page 66: 1 Introduction to Bio-Ontologies Barry Smith .

human

John (exists continuously)

66

embryo fetus adultneonate infant child

instantiates at t1

instantiates at t2

instantiates at t3

instantiates at t4

instantiates at t5

instantiates at t6

in nature, no sharp boundaries here

Page 67: 1 Introduction to Bio-Ontologies Barry Smith .

temperature

John’s temperature (exists continuously)

67

37ºC 37.1ºC 37.5ºC37.2ºC 37.3ºC 37.4ºC

instantiates at t1

instantiates at t2

instantiates at t3

instantiates at t4

instantiates at t5

instantiates at t6

in nature, no sharp boundaries here

in nature, no sharp boundaries here

Page 68: 1 Introduction to Bio-Ontologies Barry Smith .

coronary heart disease

John’s coronary heart disease (exists continuously)

68

asymptomatic (‘silent’)

infarction

early lesions and small

fibrous plaques

stable angina

surface disruption of

plaque

unstable angina

instantiates at t1

instantiates at t2

instantiates at t3

instantiates at t4

instantiates at t5

time

Page 69: 1 Introduction to Bio-Ontologies Barry Smith .

1. Who am I?

2. How to find your data

3. How to do biology across the genome

4. How to extend the GO methodology to clinical and translational medicine

5. Anatomy Ontologies: An OBO Foundry success story

6.6. IDO: The Infectious Disease OntologyIDO: The Infectious Disease Ontology

7. The Environment Ontology

Page 70: 1 Introduction to Bio-Ontologies Barry Smith .

We have data

TBDB: Tuberculosis Database, including Microarray data

VFDB: Virulence Factor DB

TropNetEurop Dengue Case Data

ISD: Influenza Sequence Database at LANL

PathPort: Pathogen Portal Project

...

70

Page 71: 1 Introduction to Bio-Ontologies Barry Smith .

We need to annotate this data

to allow retrieval and integration of– sequence and protein data for pathogens– case report data for patients– clinical trial data for drugs, vaccines– epidemiological data for surveillance, prevention– ...

Goal: to make data deriving from different sources comparable and computable

71

Page 72: 1 Introduction to Bio-Ontologies Barry Smith .

IDO needs to work with

Disease Ontology (DO) + SNOMED CT

Gene Ontology Immunology Branch

Phenotypic Quality Ontology (PATO)

Protein Ontology (PRO)

Sequence Ontology (SO)

...

72

Page 73: 1 Introduction to Bio-Ontologies Barry Smith .

We need common controlled vocabularies to describe these data in ways that will assure

comparability and cumulation

What content is needed to adequately cover the infectious domain?– Host-related terms (e.g. carrier, susceptibility)– Pathogen-related terms (e.g. virulence)– Vector-related terms (e.g. reservoir, – Terms for the biology of disease pathogenesis (e.g.

evasion of host defense)– Population-level terms (e.g. epidemic, endemic,

pandemic, )

73

Page 74: 1 Introduction to Bio-Ontologies Barry Smith .

IDO Processes

74

Page 75: 1 Introduction to Bio-Ontologies Barry Smith .

IDOQualities

75

Page 76: 1 Introduction to Bio-Ontologies Barry Smith .

IDO Roles

76

Page 77: 1 Introduction to Bio-Ontologies Barry Smith .

IDO provides a common template

IDO contains terms (like ‘pathogen’, ‘vector’, ‘host’) which apply to organisms of all species involved in infectious disease and its transmission

Disease- and organism-specific ontologies built as refinements of the IDO core

77

Page 78: 1 Introduction to Bio-Ontologies Barry Smith .

Disease-specific IDO test projects

MITRE, Mount Sinai, UTSouthwestern – Influenza– Stuart Sealfon, Joanne Luciano,

IMBB/VectorBase – Vector borne diseases (A. gambiae, A. aegypti, I. scapularis, C. pipiens, P. humanus)– Kristos Louis

Colorado State University – Dengue Fever– Saul Lozano-Fuentes

Duke – Tuberculosis– Carol Dukes-Hamilton

Cleveland Clinic – Infective Endocarditis– Sivaram Arabandi

University of Michigan – Brucilosis– Yongqun He

78

Page 79: 1 Introduction to Bio-Ontologies Barry Smith .

1. Who am I?

2. How to find your data

3. How to do biology across the genome

4. How to extend the GO methodology to clinical and translational medicine

5. Anatomy Ontologies: An OBO Foundry success story

6. The Infectious Disease Ontology

7.7. The Environment OntologyThe Environment Ontology

Page 80: 1 Introduction to Bio-Ontologies Barry Smith .

80

RELATION TO TIME

GRANULARITY

CONTINUANT

INDEPENDENT

COMPLEX OFORGANISMS

biome / biotope, territory, habitat, neighborhood, ...

work environment, home environment;host/symbiont environment; ...

ORGAN ANDORGANISM

CELL AND CELLULAR

COMPONENT

extracellular matrix; chemokine gradient; ...

MOLECULE

hydrophobic surface; virus localized to cellular substructure; active site on

protein; pharmacophore ...

http://obofoundry.orgE

N V

I R O

N M

E N

T

Page 81: 1 Introduction to Bio-Ontologies Barry Smith .

The Environment Ontology

81

OBO FoundryGenomic Standards ConsortiumNational Environment Research Council (UK)USDA, Gramene, J. Craig Venter Institute ...

Page 82: 1 Introduction to Bio-Ontologies Barry Smith .

82

Applications of EnvO in biology

Page 83: 1 Introduction to Bio-Ontologies Barry Smith .

How EnvO currently works for information retrieval

Retrieve all experiments on organisms obtained from:– deep-sea thermal vents– arctic ice cores– rainforest canopy– alpine melt zone

Retrieve all data on organisms sampled from:– hot and dry environments– cold and wet environments– a height above 5,000 meters

Retrieve all the omic data from soil organisms subject to:– moderate heavy metal contamination

83

Page 84: 1 Introduction to Bio-Ontologies Barry Smith .

extending EnvO to clinical and translational research

• we have public heath, community and population data

• we need to make this data available for search and algorithmic processing

• we create a consensus-based ontology which can interoperate with ontologies for neighboring domains of medicine and basic biology

84

Page 85: 1 Introduction to Bio-Ontologies Barry Smith .

Environment = totality of circumstances external to a living organism or group of organisms

– pH– evapotranspiration– turbidity– available light– predominant vegetation– predatory pressure– nutrient limitation …

85

Page 86: 1 Introduction to Bio-Ontologies Barry Smith .

extend EnvO to the clinical domain– dietary patterns (Food Ontology: FAO, USDA) ...

allergies– neighborhood patterns

• built environment, living conditions• climate• social networking• crime, transport• education, religion, work• health, hygiene

– disease patterns• bio-environment (bacteriological, ...)• patterns of disease transmission (links to IDO)

86

Page 87: 1 Introduction to Bio-Ontologies Barry Smith .

87

Page 88: 1 Introduction to Bio-Ontologies Barry Smith .

with thanks toBFO: Fabian Neuhaus (NIST), Melissa Haendel (Oregon),

David Sutherland (Flybase)

EnvO: Dawn Field, Norman Morrison (NERC)

FMA: Cornelius Rosse, J. L. E. Mejino (Seattle)

IDO: Lindsay Cowell, Albert Goldfain (Dallas)

OBO Foundry: Michael Ashburner, Suzanna Lewis, Chris Mungall (Flybase, GO), Alan Ruttenberg (Buffalo, Neurocommons)

NCBO: NIH RFA-RM-04-022

PRO: NIH R01 GM080646-01

PO: The Plant Ontology Consortium

88