Ontologies GO Workshop 3-6 August 2010

Page 1: Ontologies GO Workshop 3-6 August 2010. Ontologies  What are ontologies?  Why use ontologies?  Open Biological Ontologies (OBO), National Center for.

Ontologies

GO Workshop 3-6 August 2010

Page 2:

Ontologies
- What are ontologies?
- Why use ontologies?
- Open Biological Ontologies (OBO), National Center for Biomedical Ontology (NCBO)
- Some useful ontologies…

Page 3:

What Are Ontologies?

“An ontology is an explicit specification of some topic. For our purposes, it is a formal and declarative representation which includes the vocabulary (or names) for referring to the terms in that subject area and the logical statements that describe what the terms are and how they are related to each other…

“Ontologies therefore provide a vocabulary for representing and communicating knowledge about some topic and a set of relationships that hold among the terms in that vocabulary”

(From the Stanford Knowledge Systems Lab).

Page 4:

What Are Ontologies?

“An ontology is a controlled vocabulary of well defined terms with specified relationships between those terms, capable of interpretation by both humans and computers.”

Bio-ontologies can be used to provide structured annotation.

Biocurators are biologists who are trained to catalogue biological data (using database structures, bio-ontologies, etc).

Page 5:

Why use ontologies?

New sequencing technologies are increasing the rate at which DNA is sequenced:
- Jan 2009: 20 billion bases (or letters) of high-quality human DNA sequence, seven times the length of a human genome, in 10 days. Computer analysis of the genome took another 10 days.

The complexity of data is also increasing.

How do we manage the data?
- data sharing
- from data to knowledge

Page 6:

Why use ontologies?

Bio-ontologies are used to capture biological information in a way that can be read by both humans and computers. They:
- annotate data in a consistent way
- allow data sharing across databases
- allow computational analysis of high-throughput “omics” datasets

Objects in an ontology (e.g. genes, cell types, tissue types, stages of development) are well defined.

The ontology shows how the objects relate to each other.

Page 7:

Ontologies
- digital identifier (computers)
- description (humans)
- relationships between terms
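
A minimal sketch of the three parts listed above, written in Python. The `Term` class and `describe` helper are hypothetical names chosen for illustration and are not part of the workshop material. The definition text is a placeholder rather than the official GO definition, and the accessions are believed to be the GO identifiers for lyase activity and catalytic activity but should be checked against a current GO release.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Term:
    """One ontology term (hypothetical structure, for illustration only)."""
    term_id: str       # digital identifier, read by computers
    name: str          # short human-readable label
    definition: str    # description, read by humans
    # Each relationship is a (relation, target term_id) pair.
    relationships: List[Tuple[str, str]] = field(default_factory=list)


lyase = Term(
    term_id="GO:0016829",
    name="lyase activity",
    definition="Placeholder text, not the official GO definition.",
    relationships=[("is_a", "GO:0003824")],  # GO:0003824 = catalytic activity
)


def describe(term: Term) -> str:
    """Summarise a term as 'id | name | relationships'."""
    links = ", ".join(f"{rel} {target}" for rel, target in term.relationships)
    return f"{term.term_id} | {term.name} | {links}"


print(describe(lyase))
```

The identifier is what software compares and joins on, while the name and definition are what a curator reads; keeping them in separate fields means a label can be reworded without breaking links between terms.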

Page 8:

Ontology Relationships
- ontologies link terms using relationships
- relations between terms are also categorized and defined

GO:
- is a (e.g. lyase activity is a catalytic activity)
- part of (e.g. replication fork is part of chromosome)
- regulates
- negatively regulates
- positively regulates

PO:
- is a
- part of
- develops from

http://www.geneontology.org/GO.ontology.relations.shtml
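
Because the relations themselves are a defined, closed set, software can reject an edge that uses a relation its ontology does not define. The sketch below assumes Python; `ALLOWED_RELATIONS` and `make_edge` are hypothetical names, and the relation sets are simply the ones listed on this slide, written with underscores.

```python
# Relation types per ontology, as listed on the slide (underscored spelling is an assumption).
ALLOWED_RELATIONS = {
    "GO": {"is_a", "part_of", "regulates",
           "negatively_regulates", "positively_regulates"},
    "PO": {"is_a", "part_of", "develops_from"},
}


def make_edge(ontology, child, relation, parent):
    """Return a (child, relation, parent) edge, or raise if the relation is not defined."""
    if relation not in ALLOWED_RELATIONS[ontology]:
        raise ValueError(f"{relation!r} is not a defined {ontology} relation")
    return (child, relation, parent)


# The two GO examples from the slide.
edges = [
    make_edge("GO", "lyase activity", "is_a", "catalytic activity"),
    make_edge("GO", "replication fork", "part_of", "chromosome"),
]
print(edges)
```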

Page 9:
Page 10:
Page 11:

Relationships: the True Path Rule

Why are relationships between terms important?
- TRUE PATH RULE: all attributes of children must hold for all parents
- so if a protein is annotated to a term, the annotation must also be true for all the parent terms
- this enables us to move up the ontology structure from a granular term to a broader term

Premise of many GO analysis tools
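
The True Path Rule can be sketched as a walk up the ontology: every direct annotation implies an annotation to each ancestor term. This is an illustrative Python sketch, not the method of any particular GO tool. `PARENTS`, `ancestors`, `propagate` and the gene product `protein_X` are hypothetical, and only "is a" links are followed here.

```python
# Child term -> list of parent terms ("is a" links only, a deliberately tiny example).
PARENTS = {
    "lyase activity": ["catalytic activity"],
    "catalytic activity": ["molecular function"],
    "molecular function": [],
}


def ancestors(term, parents):
    """Collect every term reachable by repeatedly following parent links."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def propagate(annotations, parents):
    """Expand direct annotations (gene product -> set of terms) to include all ancestors."""
    full = {}
    for gene_product, terms in annotations.items():
        implied = set(terms)
        for term in terms:
            implied |= ancestors(term, parents)
        full[gene_product] = implied
    return full


direct = {"protein_X": {"lyase activity"}}  # hypothetical direct annotation
print(propagate(direct, PARENTS))
```

Moving from a granular term to a broader one is then a set lookup: after propagation, `protein_X` is found under "catalytic activity" even though it was only annotated to "lyase activity".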

Page 12:

Bio-ontology requirements

1. Ontology development
- continual process as new terms are added to support more detailed data

2. Annotate data to the ontology
- computational annotation (breadth - quick)
- manual biocuration (depth - slow)

3. Tools that use the ontology data
- browsing and searching the ontology and its associated data
- analysis of data annotated to the ontology

Page 13:

Resources for biocuration
- bio-ontologies (Open Biomedical Ontologies)
- computational pipelines (‘breadth’)
  - for computational annotations
  - useful for gene products without published information
- manual biocuration (‘depth’)
  - requires trained biocurators
  - community annotation efforts
  - each species has its own body of literature
- biocuration co-ordination
  - MODs? Consortium? Community?
  - biocuration prioritization
  - co-ordination with existing DBs, annotation, nomenclature initiatives
  - data updates

Page 14:

Current bio-ontology limitations
- ontology development
- annotation strategies to match increasing amount of biological data
  - computational pipelines & biocomputing
  - community annotation/prioritization strategies
  - biocurators
- tools for dataset analysis (data complexity)
  - cross-ontology data mining
  - data visualization

Page 15:

http://obo.sourceforge.net/

Open Biomedical Ontologies (OBO) is an initiative to develop bio-ontologies using common rules/principles and resources
- aim to develop interoperable ontologies
  - common relationships
  - common evidence codes
  - standardize file sharing
- develop links between ontologies?
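
One way file sharing is standardized in practice is a shared plain-text layout for terms. The sketch below assumes the OBO flat-file style of `[Term]` stanzas with `tag: value` lines, which the slide itself does not name. `STANZA` is a hand-written example and `parse_term` is a hypothetical helper that handles only the tags shown, not the full format.

```python
# A hand-written stanza in OBO flat-file style (illustrative, not copied from a GO release).
STANZA = """\
[Term]
id: GO:0016829
name: lyase activity
namespace: molecular_function
is_a: GO:0003824 ! catalytic activity
"""


def parse_term(stanza):
    """Parse one [Term] stanza into a dict; 'is_a' is collected as a list."""
    term = {"is_a": []}
    for line in stanza.splitlines():
        line = line.split(" ! ")[0].strip()  # drop a trailing "! comment"
        if not line or line.startswith("["):
            continue  # skip blank lines and the stanza header
        tag, _, value = line.partition(": ")
        if tag == "is_a":
            term["is_a"].append(value)
        else:
            term[tag] = value
    return term


print(parse_term(STANZA))
```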

Page 16:

http://obo.sourceforge.net/
- Gene Ontology
- Plant Ontology
- Sequence Ontology
- Trait Ontology
- Expression/Tissue Ontologies
- Infectious Disease Ontology
- Cell Ontology

Page 17:

Genomic Annotation

Genome annotation is the process of attaching biological information to genomic sequences. It consists of two main steps:

1. identifying functional elements in the genome: “structural annotation”

2. attaching biological information to these elements: “functional annotation”

Biologists often use the term “annotation” when they are referring only to structural annotation.

Page 18:

Structural & Functional Annotation

Structural Annotation:
- Open reading frames (ORFs) predicted during genome assembly
- predicted ORFs require experimental confirmation
- Sequence Ontology Project (SO): provides a structured controlled vocabulary for the description of primary annotations of nucleic acid sequence

Functional Annotation:
- Gene Ontology (GO): annotation of gene product function
- initially, predicted ORFs have no functional literature and GO annotation relies on computational methods (rapid)
- functional literature exists for many genes/proteins prior to genome sequencing

Page 19:

Genomic Annotation
- Structural Annotation, including Sequence Ontology
- Functional annotation using Gene Ontology
- Nomenclature (species’ genome nomenclature committees)
- Other annotations using other bio-ontologies, e.g. Anatomy Ontology

Page 20:

Gene Ontology (GO)
- Not about genes!
  - Gene products: genes, transcripts, ncRNA, proteins
- The GO describes gene product function
- Not a single ontology
  - Biological Process (BP or P)
  - Molecular Function (MF or F)
  - Cellular Component (CC or C)
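
Since GO is three ontologies rather than one, annotations are usually handled per aspect. A minimal Python sketch using the one-letter codes given above follows. `ASPECTS` and `group_by_aspect` are hypothetical names, and the annotations for `protein_X` are invented for illustration.

```python
from collections import defaultdict

# One-letter aspect codes from the slide.
ASPECTS = {
    "P": "Biological Process",
    "F": "Molecular Function",
    "C": "Cellular Component",
}


def group_by_aspect(annotations):
    """Group (gene_product, term, aspect_code) triples by full aspect name."""
    grouped = defaultdict(list)
    for gene_product, term, code in annotations:
        grouped[ASPECTS[code]].append((gene_product, term))
    return dict(grouped)


# Invented annotations for a hypothetical gene product.
example = [
    ("protein_X", "lyase activity", "F"),
    ("protein_X", "replication fork", "C"),
]
print(group_by_aspect(example))
```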

Page 21:

Gene Ontology (GO)
- de facto method for functional annotation
- Widely used for functional genomics (high throughput)
- Many tools available for gene expression analysis using GO
- The GO Consortium homepage: http://www.geneontology.org

Page 22:

Plant Ontology (PO)
- describes plant structures and growth and developmental stages
- Currently used for Arabidopsis, maize, rice; more being added (soybean, tomato, cotton, etc)
- Plant Structure: describes morphological and anatomical structures representing organ, tissue and cell types
- Growth and developmental stages: describes (i) whole plant growth stages and (ii) plant structure developmental stages
- The PO Consortium homepage: http://www.plantontology.org/

Page 23:
Page 24:

PO Browser: based on the GO Consortium browser, AmiGO

Page 25:
Page 26:
Page 27:

http://www.ebi.ac.uk/ontology-lookup/