Introduction to the Gene Ontology and GO Annotation Resources

EBI is an Outstation of the European Molecular Biology Laboratory. EBI Bioinformatics Roadshow 15 March 2011 Düsseldorf, Germany Rebecca Foulger Introduction to the Gene Ontology and GO Annotation Resources


Transcript of Introduction to the Gene Ontology and GO Annotation Resources

Page 1: Introduction to the Gene Ontology and GO Annotation Resources

EBI is an Outstation of the European Molecular Biology Laboratory.

EBI Bioinformatics Roadshow, 15 March 2011

Düsseldorf, Germany

Rebecca Foulger

Introduction to the Gene Ontology and GO Annotation Resources

Page 2: Introduction to the Gene Ontology and GO Annotation Resources

OUTLINE OF TUTORIAL:

PART I: Ontologies and the Gene Ontology (GO)

PART II: GO Annotations
• How to access GO annotations
• How scientists use GO annotations

GO and GO Annotation, EBI Bioinformatics Roadshow. Düsseldorf. March 2011

Page 3: Introduction to the Gene Ontology and GO Annotation Resources

PART I: Gene Ontology


Page 4: Introduction to the Gene Ontology and GO Annotation Resources

What’s in a name...?


Page 5: Introduction to the Gene Ontology and GO Annotation Resources

Q: What is a cell?

A: It really depends who you ask!


Page 6: Introduction to the Gene Ontology and GO Annotation Resources

Different things can be described by the same name

Page 7: Introduction to the Gene Ontology and GO Annotation Resources

The same thing can be described by different names:

• Glucose synthesis
• Glucose biosynthesis
• Glucose formation
• Glucose anabolism
• Gluconeogenesis

Page 8: Introduction to the Gene Ontology and GO Annotation Resources

Inconsistency in naming of biological concepts

• Same name for different concepts

• Different names for the same concept

Comparison is difficult – in particular across species or across databases

Just one reason why the Gene Ontology (GO) is needed…

Page 9: Introduction to the Gene Ontology and GO Annotation Resources

Why do we need GO?

• Large datasets need to be interpreted quickly

• Inconsistency in naming of biological concepts

• Increasing amounts of biological data available

• Increasing amounts of biological data to come


Page 10: Introduction to the Gene Ontology and GO Annotation Resources

Increasing amounts of biological data available

Search on mesoderm development… and you get 9441 results!

Expansion of sequence information

Page 11: Introduction to the Gene Ontology and GO Annotation Resources

What is an ontology?

• Dictionary:
  • A branch of metaphysics concerned with the nature and relations of being (philosophy)
  • A formal representation of knowledge as a set of concepts within a domain and the relationships between those concepts (computer science)

• Barry Smith:
  • The science of what is, of the kinds and structures of objects, properties, events, processes and relations in every area of reality.

[Timeline labels on slide: 1606, 1700s]

Page 12: Introduction to the Gene Ontology and GO Annotation Resources

What is an ontology?

• More usefully:
  • An ontology is the representation of something we know about.

Ontologies consist of a representation of things that are detectable or directly observable, and the relationships between those things.

Page 13: Introduction to the Gene Ontology and GO Annotation Resources

What is an ontology?

• An ontology is more than just a list of terms (a controlled vocabulary)
  • A vocabulary of terms
  • Definitions for those terms
  • *** Defined logical relationships between the terms ***

Page 14: Introduction to the Gene Ontology and GO Annotation Resources

What’s in an Ontology?


Page 15: Introduction to the Gene Ontology and GO Annotation Resources

What is the Gene Ontology (GO)?

A way to capture biological knowledge in a written and computable form

Describes attributes of gene products (RNA and protein)

Page 16: Introduction to the Gene Ontology and GO Annotation Resources

The scope of GO

What information might we want to capture about a gene product?

• What does the gene product do?
• Where does it act?
• How does it act?

Page 17: Introduction to the Gene Ontology and GO Annotation Resources

Biological Process: what does a gene product do?

• cell division
• transcription

A commonly recognised series of events

Page 18: Introduction to the Gene Ontology and GO Annotation Resources

Cellular Component: where is a gene product located?

• plasma membrane
• mitochondrion
  • mitochondrial membrane
  • mitochondrial matrix
  • mitochondrial lumen
• ribosome
  • large ribosomal subunit
  • small ribosomal subunit

Page 19: Introduction to the Gene Ontology and GO Annotation Resources

Molecular Function: how does a gene product act?

• insulin binding
• insulin receptor activity
• glucose-6-phosphate isomerase activity

Page 20: Introduction to the Gene Ontology and GO Annotation Resources

Three separate ontologies or one large one?

• GO was originally three completely independent hierarchies, with no relationships between them

• As of 2009, the GO Consortium has started making relationships between biological process and molecular function in the live ontology

Page 21: Introduction to the Gene Ontology and GO Annotation Resources

[Figure labels: Function, Function, Process]

Page 22: Introduction to the Gene Ontology and GO Annotation Resources

• GO IS:
  • species independent
  • covers normal processes

• GO is NOT:
  • NO pathological/disease processes
  • NO experimental conditions
  • NO evolutionary relationships
  • NO gene products
  • NOT a nomenclature system

Page 23: Introduction to the Gene Ontology and GO Annotation Resources

Aims of the GO project

• Compile the ontologies

• Annotate gene products using ontology terms

• Provide a public resource of data and tools


Page 24: Introduction to the Gene Ontology and GO Annotation Resources

Anatomy of a GO term

• Unique identifier
• Term name
• Definition
• Synonyms
• Cross-references

Page 25: Introduction to the Gene Ontology and GO Annotation Resources

Ontology structure

• GO is structured as a hierarchical directed acyclic graph (DAG)

• Terms can have more than one parent and zero, one or more children

• Terms are linked by relationships, which add to the meaning of the term

• Nodes = terms in the ontology

• Edges = relationships between the concepts

[Figure: nodes connected by edges]
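A minimal sketch of how such a graph could be held in memory, assuming a plain edge list. The edges below are simplified illustrations that use term names in place of GO IDs, and `build_graph` is a hypothetical helper, not part of any GO tool.

```python
from collections import defaultdict

# Illustrative edges only: (child, relationship, parent). Term names stand in
# for GO IDs, and the edges are simplified rather than copied from GO.
EDGES = [
    ("mitochondrial membrane", "part_of", "mitochondrion"),
    ("mitochondrial membrane", "is_a", "organelle membrane"),
    ("mitochondrial matrix", "part_of", "mitochondrion"),
]


def build_graph(edges):
    """Index edges by child and by parent so both directions can be walked."""
    parents = defaultdict(list)
    children = defaultdict(list)
    for child, relationship, parent in edges:
        parents[child].append((relationship, parent))
        children[parent].append((relationship, child))
    return parents, children


parents, children = build_graph(EDGES)
# A term may have several parents, each reached by a typed edge.
print(parents["mitochondrial membrane"])
```

Keeping the relationship type on every edge matters because the later slides attach different reasoning rules to is_a, part_of, regulates and has_part.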


Page 26: Introduction to the Gene Ontology and GO Annotation Resources


Page 27: Introduction to the Gene Ontology and GO Annotation Resources

Relationships between GO terms

• is_a

• part_of

• regulates
  • positively regulates
  • negatively regulates

• has_part

Page 28: Introduction to the Gene Ontology and GO Annotation Resources

is_a

• If A is a B, then A is a subtype of B
  • mitotic cell cycle is a cell cycle
  • lyase activity is a catalytic activity

• Transitive relationship: can infer up the graph
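Transitivity can be sketched as a walk up the is_a edges. The parent table below is a simplified, assumed fragment written with term names, and `ancestors` is a hypothetical helper.

```python
# Simplified is_a fragment (child -> list of parents); assumed for illustration.
IS_A = {
    "mitotic cell cycle": ["cell cycle"],
    "cell cycle": ["cellular process"],
    "cellular process": ["biological_process"],
    "biological_process": [],
}


def ancestors(term, is_a):
    """Return every term reachable from `term` by following is_a upwards."""
    seen = set()
    stack = list(is_a.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(is_a.get(current, []))
    return seen


# Because is_a is transitive, a mitotic cell cycle is also a cellular process.
print(sorted(ancestors("mitotic cell cycle", IS_A)))
```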


Page 29: Introduction to the Gene Ontology and GO Annotation Resources

part_of

• Necessarily part of

• Wherever B exists, it is as part of A. But not every A necessarily has a B as a part.

• Transitive relationship (can infer up the graph)

[Figure: B part_of A]

Page 30: Introduction to the Gene Ontology and GO Annotation Resources

regulates

• One process directly affects another process or quality

• Necessarily regulates: if both A and B are present, B always regulates A, but A may not always be regulated by B

[Figure: B regulates A]

Page 31: Introduction to the Gene Ontology and GO Annotation Resources

has_part

• The relationship runs in the opposite direction to is_a and part_of

• Necessarily has part: if A has_part B, every A has a B as a part, but B can exist without A

Page 32: Introduction to the Gene Ontology and GO Annotation Resources

is_a complete

• For all terms in the ontology, you have to be able to reach the root through a complete path of is_a relationships:
  • we call this being is_a complete
  • important for reasoning over the ontology, and ontology development
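One way to check this property, sketched under the assumption that the ontology is available as a child-to-parents table of is_a edges. The term names and the `terms_without_is_a_path` helper are invented for illustration.

```python
def terms_without_is_a_path(is_a, roots):
    """Return the terms that cannot reach any root using is_a edges alone."""

    def reaches_root(term, visiting):
        if term in roots:
            return True
        if term in visiting:  # guard against accidental cycles
            return False
        visiting = visiting | {term}
        return any(reaches_root(p, visiting) for p in is_a.get(term, []))

    return {term for term in is_a if not reaches_root(term, frozenset())}


# Toy data: "orphan term" has no is_a parent, so the ontology is not is_a complete.
TOY_IS_A = {
    "root": [],
    "child term": ["root"],
    "grandchild term": ["child term"],
    "orphan term": [],
}
print(terms_without_is_a_path(TOY_IS_A, roots={"root"}))
```

An empty result would mean the toy ontology is is_a complete.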


Page 33: Introduction to the Gene Ontology and GO Annotation Resources

True path rule

• Child terms inherit the meaning of all their parent terms.
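A practical consequence is that an annotation to a child term also holds for every parent term. The sketch below assumes a simplified parent table written with term names; the gene product name and the `propagate` helper are hypothetical.

```python
# Simplified parent table (any relationship that supports inference upwards).
PARENTS = {
    "mitochondrial matrix": ["mitochondrion"],
    "mitochondrion": ["intracellular organelle"],
    "intracellular organelle": ["cellular_component"],
    "cellular_component": [],
}

# Direct annotations as a curator might record them (hypothetical gene product).
DIRECT = {"geneA": {"mitochondrial matrix"}}


def propagate(direct, parents):
    """Expand each gene product's direct terms to include all parent terms."""
    expanded = {}
    for gene_product, terms in direct.items():
        closure = set()
        stack = list(terms)
        while stack:
            term = stack.pop()
            if term not in closure:
                closure.add(term)
                stack.extend(parents.get(term, []))
        expanded[gene_product] = closure
    return expanded


print(sorted(propagate(DIRECT, PARENTS)["geneA"]))
```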


Page 34: Introduction to the Gene Ontology and GO Annotation Resources

How is GO maintained?

• GO editors and annotators work with experts to remodel specific areas of the ontology
  • Signaling
  • Kidney development
  • Transcription
  • Pathogenesis
  • Cell cycle

• Deal with requests from the community
  • database curators, researchers, software developers
  • Some simple requests can be dealt with automatically

• GO Consortium meetings for large changes

• Mailing lists, conference calls, content workshops

Page 35: Introduction to the Gene Ontology and GO Annotation Resources

Requesting changes to the ontology

• Public SourceForge (SF) tracker for term-related issues

https://sourceforge.net/projects/geneontology/

Page 36: Introduction to the Gene Ontology and GO Annotation Resources

Why modify the GO?

• GO reflects current knowledge of biology

• Information from new organisms can make existing terms and arrangements incorrect

• Not everything was perfect from the outset
  • Improving definitions
  • Adding in synonyms and extra relationships

Page 37: Introduction to the Gene Ontology and GO Annotation Resources

Ensuring Stability in a Dynamic Ontology

• Terms become obsolete when they are removed or redefined

• GO IDs are never deleted

• For each term, a comment is added to explain why the term is now obsolete

• Alternative GO terms are suggested to replace an obsoleted term
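Because IDs are kept, software can follow an obsolete term to its suggested replacement. This is a sketch only: the `EX:` identifiers are invented, and the field names mirror the `is_obsolete`, `comment` and `replaced_by` tags of the OBO format rather than any particular library.

```python
# Hypothetical term records; the EX: identifiers are invented for illustration.
TERMS = {
    "EX:0000001": {
        "name": "old term",
        "is_obsolete": True,
        "comment": "Obsoleted because it was redefined.",
        "replaced_by": "EX:0000002",
    },
    "EX:0000002": {"name": "current term", "is_obsolete": False},
    "EX:0000003": {
        "name": "retired term",
        "is_obsolete": True,
        "comment": "Obsoleted; no single replacement exists.",
    },
}


def resolve(term_id, terms):
    """Follow replaced_by links to a live term; return None if there is none."""
    seen = set()
    while terms[term_id].get("is_obsolete"):
        replacement = terms[term_id].get("replaced_by")
        if replacement is None or replacement in seen:
            return None
        seen.add(term_id)
        term_id = replacement
    return term_id


print(resolve("EX:0000001", TERMS))
```

When `resolve` returns None, the comment on the obsolete term is the place to look for guidance.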


Page 38: Introduction to the Gene Ontology and GO Annotation Resources

Searching for GO terms

http://www.ebi.ac.uk/QuickGO/

http://amigo.geneontology.org

… there are more browsers available on the GO Tools page: http://www.geneontology.org/GO.tools.browsers.shtml

The latest OBO Gene Ontology file can be downloaded from: http://www.geneontology.org/ontology/gene_ontology.obo
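The OBO file is plain text made of stanzas of `tag: value` lines, so a small reader is easy to sketch. The stanza below is abridged and written for illustration rather than copied from the real file, and `parse_obo_terms` is a hypothetical helper that ignores escaping and other details of the format.

```python
# Abridged, illustrative OBO text; not copied from the real gene_ontology.obo.
SAMPLE_OBO = """\
format-version: 1.2

[Term]
id: GO:0004069
name: aspartate transaminase activity
namespace: molecular_function
is_a: GO:0008483 ! transaminase activity

[Typedef]
id: part_of
name: part_of
"""


def parse_obo_terms(text):
    """Collect [Term] stanzas as dicts mapping each tag to a list of values."""
    terms = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # Only [Term] stanzas are kept; other stanza types are skipped.
            current = {} if line == "[Term]" else None
            if current is not None:
                terms.append(current)
            continue
        if not line or current is None or ":" not in line:
            continue
        tag, _, value = line.partition(":")
        value = value.split(" ! ")[0].strip()  # drop the trailing "! comment"
        current.setdefault(tag.strip(), []).append(value)
    return terms


for term in parse_obo_terms(SAMPLE_OBO):
    print(term["id"][0], term["name"][0], term.get("is_a", []))
```

For real analyses an established OBO parser is likely a safer choice than a hand-written one.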


Page 39: Introduction to the Gene Ontology and GO Annotation Resources

Exercise

Browsing the Gene Ontology using QuickGO

• Exercise 1


Page 40: Introduction to the Gene Ontology and GO Annotation Resources

PART II: GO Annotation


Page 41: Introduction to the Gene Ontology and GO Annotation Resources

http://www.geneontology.org

Reactome

E. coli hub

Page 42: Introduction to the Gene Ontology and GO Annotation Resources

A GO annotation is…

A statement that a gene product:

1. has a particular molecular function
   or is involved in a particular biological process
   or is located within a certain cellular component

2. as determined by a particular method

3. as described in a particular reference

Accession   Name   GO ID        GO term name                      Reference      Evidence code
P00505      GOT2   GO:0004069   Aspartate transaminase activity   PMID:2731362   IDA
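The example row can be modelled as a small record, which makes the three parts of the statement explicit. The `GOAnnotation` class and `describe` function are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GOAnnotation:
    accession: str      # which gene product
    symbol: str
    go_id: str          # 1. what is asserted about it
    go_term: str
    reference: str      # 3. where the evidence is described
    evidence_code: str  # 2. how it was determined


def describe(annotation):
    """Render an annotation as a readable statement."""
    return (
        f"{annotation.symbol} ({annotation.accession}) is annotated to "
        f"{annotation.go_term} ({annotation.go_id}) with evidence "
        f"{annotation.evidence_code}, from {annotation.reference}"
    )


example = GOAnnotation(
    accession="P00505",
    symbol="GOT2",
    go_id="GO:0004069",
    go_term="aspartate transaminase activity",
    reference="PMID:2731362",
    evidence_code="IDA",
)
print(describe(example))
```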


Page 43: Introduction to the Gene Ontology and GO Annotation Resources

Evidence codes

IDA: enzyme assay

IPI: e.g. Y2H

http://www.geneontology.org/GO.evidence.shtml

review papers

subcategories of ISS

BLASTs, orthology comparison, HMMs

Page 44: Introduction to the Gene Ontology and GO Annotation Resources

GO evidence code decision tree


Page 45: Introduction to the Gene Ontology and GO Annotation Resources

Gene Ontology Annotation (GOA)

The GOA database at the EBI is:

• The largest open-source contributor of annotations to GO

• Member of the GO Consortium since 2001

• Provides annotation for 321,998 species (February 2011 release)

• GOA’s priority is to annotate the human proteome

• GOA is responsible for human, chicken and bovine annotations in the GO Consortium

Page 46: Introduction to the Gene Ontology and GO Annotation Resources

GOA makes annotations using two methods

• Electronic
  • Quick way of producing large numbers of annotations
  • Annotations are less detailed

• Manual
  • Time-consuming process producing lower numbers of annotations
  • Annotations are very detailed and accurate

Page 47: Introduction to the Gene Ontology and GO Annotation Resources

Electronic annotation by GOA

1. Mapping of external concepts to GO terms
  • InterPro2GO (protein domains)
  • SPKW2GO (UniProt/Swiss-Prot keywords)
  • HAMAP2GO (microbial protein annotation)
  • EC2GO (Enzyme Commission numbers)
  • SPSL2GO (Swiss-Prot subcellular locations)

2. Automatic transfer of annotations to orthologs (Ensembl Compara)

[Figure: species shown include Macaque, Chimpanzee, Guinea Pig, Rat, Mouse, Cow, Dog and Chicken]

Page 48: Introduction to the Gene Ontology and GO Annotation Resources

Mappings of concepts from UniProtKB files

Aspartate transaminase activity; GO:0004069

lipid transport; GO:0006869

Page 49: Introduction to the Gene Ontology and GO Annotation Resources

Automatic transfer of annotations to orthologs

Ensembl Compara

• Homologies between different species are calculated

• GO terms are projected from MANUAL annotations only (IDA, IEP, IGI, IMP, IPI)

• Only one-to-one orthologies are used

[Figure: species in the orthology projection diagram: Human, Mouse, Rat, Zebrafish, Xenopus, Macaque, Chimpanzee, Guinea Pig, Dog, Chicken, Cow, Tetraodon and Fugu]

Currently provides 479,961 GO annotations for 60,515 proteins from 49 species (February 2011 release)

Page 50: Introduction to the Gene Ontology and GO Annotation Resources

Manual annotation by GOA

• High-quality, specific annotations using:

• Peer-reviewed papers

• A range of evidence codes to categorize the types of evidence found in a paper

www.ebi.ac.uk/GOA

Page 51: Introduction to the Gene Ontology and GO Annotation Resources

Finding annotations in a paper

In this study, we report the isolation and molecular characterization of the B. napus PERK1 cDNA, that is predicted to encode a novel receptor-like kinase. We have shown that like other plant RLKs, the kinase domain of PERK1 has serine/threonine kinase activity. In addition, the location of a PERK1-GFP fusion protein to the plasma membrane supports the prediction that PERK1 is an integral membrane protein…these kinases have been implicated in early stages of wound response…

Highlighted text and the resulting annotations:

• "wound response" → Process: response to wounding GO:0009611
• "serine/threonine kinase activity" → Function: protein serine/threonine kinase activity GO:0004674
• "integral membrane protein" → Component: integral to plasma membrane GO:0005887

…for B. napus PERK1 protein (Q9ARH1)

PubMed ID: 12374299

Page 52: Introduction to the Gene Ontology and GO Annotation Resources

Additional information

• Qualifiers: modify the interpretation of an annotation
  • NOT (protein is not associated with the GO term)
  • colocalizes_with (protein associates with complex but is not a bona fide member)
  • contributes_to (describes action of a complex of proteins)

• 'With' column: can include further information on the method being referenced, e.g. the protein accession of an interacting protein

Page 53: Introduction to the Gene Ontology and GO Annotation Resources

The NOT qualifier

• NOT is used to make an explicit note that the gene product is not associated with the GO term

• Also used to document conflicting claims in the literature

• NOT can be used with ALL three gene ontologies
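Since a NOT annotation reverses the meaning of the statement, analysis code usually has to treat it separately. The sketch below assumes qualifiers are stored as a pipe-separated string, as in gene association files; the gene products and records are invented.

```python
# Invented records; qualifiers are assumed to be pipe-separated strings.
ANNOTATIONS = [
    {"gene_product": "geneA", "qualifier": "", "go_term": "nucleus"},
    {"gene_product": "geneA", "qualifier": "NOT", "go_term": "nucleolus"},
    {"gene_product": "geneB", "qualifier": "NOT|colocalizes_with", "go_term": "ribosome"},
    {"gene_product": "geneB", "qualifier": "contributes_to", "go_term": "transferase activity"},
]


def is_negated(annotation):
    """True when NOT is one of the annotation's qualifiers."""
    return "NOT" in annotation["qualifier"].split("|")


positive = [a for a in ANNOTATIONS if not is_negated(a)]
negated = [a for a in ANNOTATIONS if is_negated(a)]
print(len(positive), "positive;", len(negated), "negated")
```

Splitting on the separator avoids matching NOT inside some other qualifier name by accident.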


Page 54: Introduction to the Gene Ontology and GO Annotation Resources

In these cells, SIPP1 was mainly present in the nucleus, where it displayed a non-uniform, speckled distribution and appeared to be excluded from the nucleoli.

Highlighted: excluded from the nucleoli

Page 55: Introduction to the Gene Ontology and GO Annotation Resources

The colocalizes_with qualifier

ONLY used with GO component ontology

Gene products that are transiently or peripherally associated with an organelle or complex

Page 56: Introduction to the Gene Ontology and GO Annotation Resources

Immunoblot analysis with anti-PSI polyclonal antibodies of U1 snRNP particles affinity purified from Drosophila embryonic nuclear extracts showed that PSI is physically associated with U1 snRNP (Figure 1A, top panel).

Association of U1 snRNP with GST-PSI was detected by ethidium bromide staining of the selected snRNAs and was confirmed by blot hybridization with an antisense U1 snRNA riboprobe (Figure 1C, lane 4).

PSI is physically associated with U1 snRNP

Association of U1 snRNP with GST-PSI


Page 57: Introduction to the Gene Ontology and GO Annotation Resources


Page 58: Introduction to the Gene Ontology and GO Annotation Resources

The contributes_to qualifier

• Where an individual gene product that is part of a complex can be annotated to terms that describe the action (function or process) of the whole complex

• contributes_to is not needed to annotate a catalytic subunit. Furthermore, contributes_to may be used for any non-catalytic subunit, whether the subunit is essential for the activity of the complex or not

• Annotations to contributes_to often use the IC evidence code, but others may also be used.

ONLY used with GO function ontology


Page 59: Introduction to the Gene Ontology and GO Annotation Resources

..we next examined whether a complex of four proteins can be formed…. As shown in Figure 4, FLAG-tagged PIG-C was precipitated efficiently with anti- FLAG beads in four combinations with other proteins (Figure 4A, lanes 1–4)….. These results strongly suggest that all four proteins form a complex.


Page 60: Introduction to the Gene Ontology and GO Annotation Resources

.. To test whether the protein complex consisting of PIG-A, PIG-H, PIG-C and hGPI1 has GlcNAc transferase activity in vitro….

…incubation of the radiolabeled donor of GlcNAc, UDP-[6-3H]GlcNAc, with lysates of JY5 cells transfected with GST-tagged PIG-A resulted in synthesis of GlcNAc-PI and its subsequent deacetylation to glucosaminyl phosphatidylinositol (GlcN-PI)

Highlighted:

• whether the protein complex … has GlcNAc transferase activity
• resulted in synthesis of GlcNAc-PI and its subsequent deacetylation to glucosaminyl phosphatidylinositol (GlcN-PI)

Page 61: Introduction to the Gene Ontology and GO Annotation Resources

Unknown vs. Unannotated

• When there is no existing data to support an annotation, the gene is annotated to the ROOT (top level) term

• This is NOT the same as having no annotation at all
  • No annotation means that no one has looked yet

Page 62: Introduction to the Gene Ontology and GO Annotation Resources

WITH column

• The WITH column provides supporting evidence for ISS, IPI, IGI and IC evidence codes
  • ISS: the accession of the aligned protein/ortholog
  • IPI: the accession of the interacting protein
  • IGI: the accession of the interacting gene
  • IC: the GO ID for the inferred_from term

Page 63: Introduction to the Gene Ontology and GO Annotation Resources

How to access GO annotation data


Page 64: Introduction to the Gene Ontology and GO Annotation Resources

Where can you find annotations?

UniProtKB

Ensembl

Entrez gene

Page 65: Introduction to the Gene Ontology and GO Annotation Resources

Gene Association Files

• 17 column files containing all information for each annotation

GO Consortium website

GOA website
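A tab-separated 17-column line can be read with a few lines of code. The column names below follow the GAF 2.0 layout as best understood here and should be checked against the format documentation; the date and assigning group in the sample line are invented, and `parse_gaf_line` is a hypothetical helper.

```python
# Column names assumed from the GAF 2.0 layout; verify against the format docs.
GAF_COLUMNS = [
    "db", "db_object_id", "db_object_symbol", "qualifier", "go_id",
    "reference", "evidence_code", "with_from", "aspect", "db_object_name",
    "db_object_synonym", "db_object_type", "taxon", "date", "assigned_by",
    "annotation_extension", "gene_product_form_id",
]


def parse_gaf_line(line):
    """Return a dict for one annotation line, or None for '!' comment lines."""
    if line.startswith("!"):
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(GAF_COLUMNS):
        raise ValueError(f"expected {len(GAF_COLUMNS)} columns, got {len(fields)}")
    return dict(zip(GAF_COLUMNS, fields))


# Sample built from the GOT2 example; date and assigned_by are made up.
SAMPLE_LINE = "\t".join([
    "UniProtKB", "P00505", "GOT2", "", "GO:0004069", "PMID:2731362", "IDA",
    "", "F", "Aspartate aminotransferase, mitochondrial", "", "protein",
    "taxon:9606", "20110215", "UniProtKB", "", "",
]) + "\n"

record = parse_gaf_line(SAMPLE_LINE)
print(record["db_object_symbol"], record["go_id"], record["evidence_code"])
```

The aspect column uses single letters: F for molecular function, P for biological process and C for cellular component.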

Page 66: Introduction to the Gene Ontology and GO Annotation Resources

GO browsers

Page 67: Introduction to the Gene Ontology and GO Annotation Resources

QuickGO browser

Search GO terms or proteins

Find sets of GO annotations


Page 68: Introduction to the Gene Ontology and GO Annotation Resources

Exercise

Searching for GO annotations in QuickGO

• Exercise 2
• Exercise 3

Page 69: Introduction to the Gene Ontology and GO Annotation Resources

Exercise

Using QuickGO to create a tailored set of annotations

• Exercise 4: Filtering
• Exercise 5: Statistics

Page 70: Introduction to the Gene Ontology and GO Annotation Resources

How scientists use the GO, and the tools they use for analysis


Page 71: Introduction to the Gene Ontology and GO Annotation Resources

Using GO annotations

• If you wanted to find out the role of a gene product manually, you’d have to read an awful lot of papers

• But by using GO annotations, this work has already been done for you!

GO:0006915 : apoptosis


Page 72: Introduction to the Gene Ontology and GO Annotation Resources

How scientists use the GO

• Access gene product functional information

• Analyse high-throughput genomic or proteomic datasets

• Validation of experimental techniques

• Get a broad overview of a proteome

• Obtain functional information for novel gene products

Some examples…

Page 73: Introduction to the Gene Ontology and GO Annotation Resources

MicroArray data analysis

[Figure: clustered gene tree of microarray expression data (gene list: all genes, 14,010) comparing attacked and control samples over time. Cluster labels: Puparial adhesion; Molting cycle; Hemocyanin; Defense response; Immune response; Response to stimulus; Toll regulated genes; JAK-STAT regulated genes; Amino acid catabolism; Lipid metabolism; Peptidase activity; Protein catabolism]

Bregje Wertheim at the Centre for Evolutionary Genomics, Department of Biology, UCL and Eugene Schuster Group, EBI.

Page 74: Introduction to the Gene Ontology and GO Annotation Resources

Validation of experimental techniques

Rat liver plasma membrane isolation (Cao et al., Journal of Proteome Research 2006)

Page 75: Introduction to the Gene Ontology and GO Annotation Resources

Analysis of high-throughput proteomic datasets

Characterisation of proteins interacting with ribosomal protein S19 (Orrù et al., Molecular and Cellular Proteomics 2007)

Page 76: Introduction to the Gene Ontology and GO Annotation Resources

Obtain functional information for novel gene products

MPYVSQSQHIDRVRGAIEGRLPAPGNSSRLVSSWQRSYEQYRLDPGSVIGPRVLTSSELR DVQGKEEAFLRASGQCLARLHDMIRMADYCVMLTDAHGVTIDYRIDRDRRGDFKHAGLYI GSCWSEREEGTCGIASVLTDLAPITVHKTDHFRAAFTTLTCSASPIFAPTGELIGVLDAS AVQSPDNRDSQRLVFQLVRQSAALIEDGYFLNQTAQHWMIFGHASRNFVEAQPEVLIAFD ECGNIAASNRKAQECIAGLNGPRHVDEIFDTSAVHLHDVARTDTIMPLRLRATGAVLYAR IRAPLKRVSRSACAVSPSHSGQGTHDAHNDTNLDAISRFLHSRDSRIARNAEVALRIAGK HLPILILGETGVGKEVFAQALHASGARRAKPFVAVNCGAIPDSLIESELFGYAPGAFTGA RSRGARGKIAQAHGGTLFLDEIGDMPLNLQTRLLRVLAEGEVLPLGGDAPVRVDIDVICA THRDLARMVEEGTFREDLYYRLSGATLHMPPLRERADILDVVHAVFDEEAQSAGHVLTLD GRLAERLARFSWPGNIRQLRNVLRYACAVCDSTRVELRHVSPDVAALLAPDEAALRPALA LENDERARIVDALTRHHWRPNAAAEALGM

InterProScan


Page 77: Introduction to the Gene Ontology and GO Annotation Resources

Annotating novel sequences

• Can use BLAST queries to find similar sequences with GO annotation which can be transferred to the new sequence

• Two tools currently available:
  • AmiGO BLAST (from GO Consortium) http://amigo.geneontology.org/cgi-bin/amigo/blast.cgi
    • searches the GO Consortium database
  • BLAST2GO (from Babelomics) http://www.blast2go.org/
    • searches the NCBI database

Page 78: Introduction to the Gene Ontology and GO Annotation Resources

AmiGO BLAST

Exportin-T from Pongo abelii (Sumatran orangutan)


Page 79: Introduction to the Gene Ontology and GO Annotation Resources

Numerous Third Party Tools

• Many tools exist that use GO to find common biological functions from a list of genes:

http://www.geneontology.org/GO.tools.microarray.shtml


Page 80: Introduction to the Gene Ontology and GO Annotation Resources

GO tools: enrichment analysis

• Most of these tools work in a similar way:

• input a gene list and a subset of ‘interesting’ genes

• tool shows which GO categories have most interesting genes associated with them i.e. which categories are ‘enriched’ for interesting genes

• tool provides a statistical measure to determine whether enrichment is significant

• Try exercise 7 at home
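The statistical measure differs between tools; a one-sided hypergeometric test is one common choice and is used here purely as an illustration, not as the method of any specific tool. The function name and the numbers in the example are invented.

```python
from math import comb


def enrichment_p_value(population, annotated, selected, selected_annotated):
    """P(X >= selected_annotated) under the hypergeometric distribution.

    population         -- genes in the full list
    annotated          -- genes in the full list annotated to the GO category
    selected           -- 'interesting' genes
    selected_annotated -- interesting genes annotated to the GO category
    """
    upper = min(annotated, selected)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, selected - i)
        for i in range(selected_annotated, upper + 1)
    )
    return favourable / comb(population, selected)


# Invented example: 40 of 1000 genes carry the term; 8 of 50 interesting genes do.
p = enrichment_p_value(1000, 40, 50, 8)
print(f"p = {p:.3g}")
```

Real tools test many GO categories at once, so they normally apply a multiple-testing correction on top of a per-category value like this one.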


Page 81: Introduction to the Gene Ontology and GO Annotation Resources

GO slims

• Many GO analysis tools use GO slims to give a broad overview of the dataset

• GO slims are cut-down versions of the GO and contain a subset of the terms in the whole GO

• GO slims usually contain less-specialised GO terms


Page 82: Introduction to the Gene Ontology and GO Annotation Resources

Slimming the GO using the ‘true path rule’

Many gene products are associated with a large number of descriptive, leaf GO nodes:

Page 83: Introduction to the Gene Ontology and GO Annotation Resources

Slimming the GO using the ‘true path rule’

…however annotations can be mapped up to a smaller set of parent GO terms:
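Mapping up can be sketched as collecting, for each annotated term, the slim terms found among the term itself and its parents. The parent table, the slim set and the `map_to_slim` helper are all invented for illustration; real slimming tools may keep only the most specific slim term, whereas this sketch returns every match.

```python
# Invented parent table and slim, written with term names for readability.
PARENTS = {
    "mitochondrial matrix": ["mitochondrion"],
    "large ribosomal subunit": ["ribosome"],
    "mitochondrion": ["cytoplasm"],
    "ribosome": ["cytoplasm"],
    "cytoplasm": ["cellular_component"],
    "cellular_component": [],
}
SLIM = {"mitochondrion", "ribosome", "cytoplasm"}


def map_to_slim(term, parents, slim):
    """Return all slim terms that are `term` itself or one of its ancestors."""
    hits = set()
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in slim:
            hits.add(current)
        stack.extend(parents.get(current, []))
    return hits


print(sorted(map_to_slim("mitochondrial matrix", PARENTS, SLIM)))
```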


Page 84: Introduction to the Gene Ontology and GO Annotation Resources

GO slims

• Custom slims are available for download:

http://www.geneontology.org/GO.slims.shtml

• Or you can make your own using:
  • QuickGO: http://www.ebi.ac.uk/QuickGO
  • AmiGO's GO slimmer: http://amigo.geneontology.org/cgi-bin/amigo/slimmer

Page 85: Introduction to the Gene Ontology and GO Annotation Resources

Slimming with QuickGO

www.ebi.ac.uk/QuickGO

Map-up annotations with GO slims

Search GO terms or proteins

Find sets of GO annotations


Page 86: Introduction to the Gene Ontology and GO Annotation Resources

Exercise

Map-up annotation using a GO slim

• Exercise 6


Page 87: Introduction to the Gene Ontology and GO Annotation Resources

Just some things to be aware of…

• The GO is continually changing
  • New terms created
  • Existing terms obsoleted
  • Re-structured
  • New annotations being created

• ALWAYS use a current version of the ontology and annotations

• If publishing your analyses, please report the versions/dates you use: http://www.geneontology.org/GO.cite.shtml

• Differences in representation of GO terms may be due to biological phenomena, but may also be due to annotation bias or experimental assays

• It is often better to remove the ‘NOT’ annotations before doing any large-scale analysis, as they can skew the results

Page 88: Introduction to the Gene Ontology and GO Annotation Resources
Page 89: Introduction to the Gene Ontology and GO Annotation Resources


Thank you