Eurie L. Hong, Ph.D. Department of Genetics • Stanford University School of Medicine


Page 1: Eurie L. Hong, Ph.D. Department of Genetics • Stanford University School of Medicine

Eurie L. Hong, Ph.D.

Department of Genetics • Stanford University School of Medicine

Manually curated and computationally predicted GO annotations at the Saccharomyces Genome Database

http://www.yeastgenome.org/

Page 2

Scientific community

Integrated data

Analysis tools

Data from high-throughput experiments

Data from traditional experiments

Genome sequence

Page 3

CHS6/YJL099W Locus Summary Page

Curated data from published literature

Links to other databases

Summary of published data

Links to SGD tools and other databases

Data from high-throughput experiments

Sequence Information

Nomenclature

Page 4

Accessing the data via files

ftp://ftp.yeastgenome.org/yeast/
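As an illustration of working with downloaded annotation files, the sketch below tallies annotations by ontology aspect and evidence code. It assumes the tab-delimited GO gene association layout (evidence code in column 7, aspect in column 9, comment lines starting with "!"). The slide does not name a specific file, so the rows here are made-up placeholders built in memory, and `make_row` and `count_by_aspect_and_evidence` are hypothetical helper names.

```python
import csv
import io
from collections import Counter

ASPECTS = {
    "F": "Molecular Function",
    "P": "Biological Process",
    "C": "Cellular Component",
}


def make_row(gene, go_id, evidence, aspect):
    """Build one placeholder row with the 15 tab-separated columns."""
    cols = ["SGD", "ID_" + gene, gene, "", go_id, "REF:placeholder",
            evidence, "", aspect, "", "", "gene", "taxon:4932",
            "20060823", "SGD"]
    return "\t".join(cols)


# Placeholder content, not real annotations.
SAMPLE = "\n".join([
    "!placeholder header comment",
    make_row("GENE1", "GO:PLACEHOLDER1", "IDA", "F"),
    make_row("GENE1", "GO:PLACEHOLDER2", "IEA", "P"),
    make_row("GENE2", "GO:PLACEHOLDER3", "IDA", "C"),
])


def count_by_aspect_and_evidence(handle):
    """Count annotation rows per (aspect name, evidence code)."""
    counts = Counter()
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Skip blank lines and "!" comment lines.
        if not row or row[0].startswith("!"):
            continue
        evidence, aspect = row[6], row[8]
        counts[(ASPECTS.get(aspect, aspect), evidence)] += 1
    return counts


COUNTS = count_by_aspect_and_evidence(io.StringIO(SAMPLE))
for (aspect, evidence), n in sorted(COUNTS.items()):
    print(aspect, evidence, n)
```

To use it on a real download, pass an opened file handle in place of the `io.StringIO` wrapper.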

Page 5

Display of GO Annotations

Page 6

Status of GO Annotations at SGD

All protein and RNA gene products have been annotated with GO terms

All GO annotations are manually curated from literature (no IEA)

Genes without published characterization data (from Genome Snapshot 8/23/2006):

Cellular Component: 864 genes (13.7% of all genes)

Biological Process: 1448 genes (23.0% of all genes)

Molecular Function: 2112 genes (33.6% of all genes)

Page 7

Sources of Computationally Predicted GO Annotations

1. InterPro domain matches in S. cerevisiae proteins (source: GOA project)

2. Integrated analysis of multiple datasets (source: publications, external databases)
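The first source above can be pictured as a lookup: each protein's domain matches are translated into GO terms through a domain-to-term mapping. The sketch below is only an illustration of that idea, not the GOA project's pipeline. The function name `predict_annotations` and all identifiers in the sample data are hypothetical placeholders.

```python
def predict_annotations(domain_hits, domain_to_go):
    """Map each protein's domain matches to a sorted list of GO terms.

    domain_hits:  protein name -> list of domain identifiers
    domain_to_go: domain identifier -> list of GO term identifiers
    Proteins whose domains map to no term are left out of the result.
    """
    predictions = {}
    for protein, domains in domain_hits.items():
        terms = set()
        for domain in domains:
            terms.update(domain_to_go.get(domain, ()))
        if terms:
            predictions[protein] = sorted(terms)
    return predictions


# Placeholder data, not real domain matches or mappings.
DOMAIN_HITS = {
    "PROT_A": ["DOMAIN_1", "DOMAIN_2"],
    "PROT_B": ["DOMAIN_3"],   # this domain has no GO mapping
    "PROT_C": ["DOMAIN_2"],
}
DOMAIN_TO_GO = {
    "DOMAIN_1": ["GO:TERM_X"],
    "DOMAIN_2": ["GO:TERM_Y", "GO:TERM_X"],
}

PREDICTED = predict_annotations(DOMAIN_HITS, DOMAIN_TO_GO)
for protein, terms in sorted(PREDICTED.items()):
    print(protein, terms)
```

Annotations derived this way would carry the IEA evidence code, which is what distinguishes them from the manually curated set.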

Page 8

CHS6/YJL099W Locus Summary Page

Page 9

Identifying Types of GO Annotations

Page 10

CHS6/YJL099W GO Annotation Page

Core GO Annotations

GO Annotations from Large Scale Experiments

Computationally Predicted GO Annotations

Page 11
Page 12

Changes to GO Term Finder

Current functionality

Specify background set

Refine annotations used by annotation source or evidence codes
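To show why the background set and the evidence-code filter matter, here is a minimal sketch of a term-enrichment calculation. Tools of this kind commonly score a term with a hypergeometric test. The slide does not state the statistic, so that choice is an assumption, and the function names and sample annotations are hypothetical.

```python
from math import comb


def genes_with_term(annotations, term, allowed_evidence=None):
    """Return genes annotated to `term`, optionally restricted by evidence code.

    annotations: iterable of (gene, term, evidence_code) tuples.
    """
    return {
        gene
        for gene, annotated_term, evidence in annotations
        if annotated_term == term
        and (allowed_evidence is None or evidence in allowed_evidence)
    }


def enrichment_p_value(background_size, background_with_term,
                       sample_size, sample_with_term):
    """Hypergeometric upper tail: P(at least `sample_with_term` hits)
    when `sample_size` genes are drawn without replacement."""
    upper = min(sample_size, background_with_term)
    total = comb(background_size, sample_size)
    tail = sum(
        comb(background_with_term, k)
        * comb(background_size - background_with_term, sample_size - k)
        for k in range(sample_with_term, upper + 1)
    )
    return tail / total


# Placeholder annotations, not real data.
ANNOTATIONS = [
    ("G1", "TERM_X", "IDA"),
    ("G2", "TERM_X", "IEA"),
    ("G3", "TERM_X", "IMP"),
]
ALL_SOURCES = genes_with_term(ANNOTATIONS, "TERM_X")
MANUAL_ONLY = genes_with_term(ANNOTATIONS, "TERM_X", {"IDA", "IMP"})

# Same query (2 of 2 genes hit a term held by 5 background genes),
# scored against backgrounds of 10 and 20 genes.
P_SMALL_BACKGROUND = enrichment_p_value(10, 5, 2, 2)
P_LARGE_BACKGROUND = enrichment_p_value(20, 5, 2, 2)
print(P_SMALL_BACKGROUND, P_LARGE_BACKGROUND)
```

The same query list scores as more significant against the larger background, which is why the choice of background set changes the result.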

Page 13

Improving GO Annotations

Computationally predicted GO annotations

Manually curated GO annotations

1. Computational predictions may indicate publications that were overlooked

2. Review inconsistencies between computationally predicted and manually curated GO annotations to improve mappings and manually curated annotations

3. Review inconsistencies between computationally predicted and manually curated GO annotations to improve ontology

Page 14

Additional Annotations Using Interpro2GO

from gene_association.goa_uniprot 7/2006

Molecular Function 468 genes

Biological Process 316 genes

Cellular Component 207 genes

Information added to genes with no published characterization data

Page 15

Preliminary Comparison: Cellular Component Annotations

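Figure (pie chart): comparison of 5946 IEA annotations with 9059 annotations carrying the evidence codes IC+IDA+IEP+IGI+IMP+IPI+ISS+NAS+RCA+TAS.

Categories: Interpro2go annotation matches curated annotation; Interpro2go annotation is ancestor of curated annotation; Interpro2go annotation for an unknown; Shared parent is root term; Shared parent is child of root term; Other shared parent term.

Values as extracted from the chart (the pairing of values with categories is not recoverable): Other 38%, 43%, 15%, 18%, 2%, 18%, 4%.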
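The comparison categories on this slide can be sketched as a classification over the ontology graph. The example below uses a made-up toy ontology and hypothetical function names. It is one possible reading of the category labels, not the procedure used at SGD. The "for an unknown" category is omitted because its label is incomplete in the transcript.

```python
ROOT = "root"

# Toy ontology (child -> set of parents); placeholder terms only.
PARENTS = {
    ROOT: set(),
    "A": {ROOT},
    "B": {ROOT},
    "A1": {"A"},
    "A2": {"A"},
    "A1x": {"A1"},
    "A1y": {"A1"},
    "B1": {"B"},
}


def ancestors(term):
    """All proper ancestors of `term`, following every parent link."""
    seen = set()
    stack = list(PARENTS[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def classify(predicted, curated):
    """Relate a predicted term to a curated term."""
    if predicted == curated:
        return "predicted matches curated"
    if predicted in ancestors(curated):
        return "predicted is ancestor of curated"
    # Terms shared by both lineages (each term counts as part of its own lineage).
    shared = (ancestors(predicted) | {predicted}) & (ancestors(curated) | {curated})
    # Keep only the most specific shared terms, i.e. those that are not
    # an ancestor of another shared term.
    most_specific = {
        t for t in shared
        if not any(t in ancestors(other) for other in shared if other != t)
    }
    if most_specific == {ROOT}:
        return "shared parent is root term"
    if all(ROOT in PARENTS[t] for t in most_specific):
        return "shared parent is child of root term"
    return "other shared parent term"


for pair in [("A1", "A1"), ("A", "A1x"), ("A1", "B1"),
             ("A1", "A2"), ("A1x", "A1y")]:
    print(pair, classify(*pair))
```

The case where the curated term is an ancestor of the predicted term has no label on the slide. In this sketch it falls into one of the shared-parent categories.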

Page 16

Summary

1. Currently, all GO annotations for S. cerevisiae gene products are manually curated from literature

2. SGD will incorporate computationally predicted GO annotations that will provide additional information for a gene product’s role in biology

3. Computationally predicted GO annotations will be used to refine and improve manually curated GO annotations at SGD