Eurie L. Hong, Ph.D. Department of Genetics • Stanford University School of Medicine
Eurie L. Hong, Ph.D.
Department of Genetics • Stanford University School of Medicine
Manually curated and computationally predicted GO annotations at the
Saccharomyces Genome Database
http://www.yeastgenome.org/
Scientific community
Integrated data
Analysis tools
Data from high-throughput experiments
Data from traditional experiments
Genome sequence
CHS6/YJL099W Locus Summary Page
Curated data from published literature
Links to other databases
Summary of published data
Links to SGD tools and other databases
Data from high-throughput experiments
Sequence Information
Nomenclature
Accessing the data via files
ftp://ftp.yeastgenome.org/yeast/
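GO annotations are typically exchanged as tab-delimited gene association files, the same format as the gene_association.goa_uniprot file cited later in this talk. The sketch below assumes that standard column layout (column 3 gene symbol, column 5 GO ID, column 7 evidence code, column 9 aspect) and tallies annotations by whether they are computational (IEA) or manually curated. The sample lines, gene names, and the helper names make_row, parse_annotations, and count_by_source are hypothetical and are not real SGD records or tools.

```python
from collections import Counter


def make_row(symbol, go_id, evidence, aspect):
    """Build one illustrative 15-column annotation line (placeholder values)."""
    cols = ["SGD", "ID_" + symbol, symbol, "", go_id, "REF:placeholder",
            evidence, "", aspect, "", "", "gene", "taxon:4932", "20060823", "SGD"]
    return "\t".join(cols)


# Hypothetical sample data; a real file would be downloaded and read from disk.
SAMPLE = "\n".join([
    "!comment lines start with an exclamation mark",
    make_row("GENE1", "GO:0000001", "IDA", "C"),
    make_row("GENE1", "GO:0000002", "TAS", "P"),
    make_row("GENE2", "GO:0000003", "IEA", "F"),
])


def parse_annotations(text):
    """Yield (symbol, go_id, evidence, aspect) for each annotation line."""
    for line in text.splitlines():
        if not line or line.startswith("!"):
            continue  # skip blank lines and header comments
        cols = line.split("\t")
        if len(cols) < 9:
            continue  # skip malformed lines
        yield cols[2], cols[4], cols[6], cols[8]


def count_by_source(text):
    """Count annotations as 'computational' (IEA) or 'manual' (any other code)."""
    counts = Counter()
    for _symbol, _go_id, evidence, _aspect in parse_annotations(text):
        counts["computational" if evidence == "IEA" else "manual"] += 1
    return counts


if __name__ == "__main__":
    print(count_by_source(SAMPLE))
```

Splitting on the evidence code alone separates IEA from everything else; distinguishing large-scale experiments from core annotations would need additional information, such as the reference column.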
Display of GO Annotations
Status of GO Annotations at SGD
All protein and RNA gene products have been annotated with GO terms
All GO annotations are manually curated from literature (no IEA)
Cellular Component: 864 genes (13.7% of all genes)
Biological Process: 1448 genes (23.0% of all genes)
Molecular Function: 2112 genes (33.6% of all genes)
from Genome Snapshot 8/23/2006
Genes without published characterization data
Sources of Computationally Predicted GO Annotations
1. InterPro domain matches in S. cerevisiae proteins (source: GOA project)
2. Integrated analysis of multiple datasets (source: publications, external databases)
CHS6/YJL099W Locus Summary Page
Identifying Types of GO Annotations
CHS6/YJL099W GO Annotation Page
Core GO Annotations
GO Annotations from Large Scale Experiments
Computationally Predicted GO Annotations
Changes to GO Term Finder
Current functionality
Specify background set
Refine annotations used by annotation source or evidence codes
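The two options above (a user-specified background set, and restricting which annotations count by evidence code) can be sketched as follows. This assumes enrichment is scored with a hypergeometric test, the usual statistic for this kind of tool; it is not SGD's implementation, and the function names, gene names, and term label are hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the term."""
    upper = min(n, K)
    total = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return total / comb(N, n)


def term_enrichment(query, background, annotations, term, allowed_evidence=None):
    """Score one GO term for a query gene list.

    annotations: iterable of (gene, go_term, evidence_code) tuples.
    allowed_evidence: set of evidence codes to keep, or None to keep all.
    Returns (number of query genes with the term, p-value).
    """
    background = set(background)
    query = set(query) & background  # ignore query genes outside the background
    annotated = {
        gene
        for gene, go_term, evidence in annotations
        if go_term == term
        and gene in background
        and (allowed_evidence is None or evidence in allowed_evidence)
    }
    k = len(query & annotated)
    return k, hypergeom_pvalue(k, len(query), len(annotated), len(background))


if __name__ == "__main__":
    # Toy data: ten background genes, one term, one IEA-only annotation.
    genes = ["g%d" % i for i in range(1, 11)]
    annots = [("g1", "T", "IDA"), ("g2", "T", "IDA"),
              ("g3", "T", "IDA"), ("g4", "T", "IEA")]
    print(term_enrichment(["g1", "g2"], genes, annots, "T"))
    print(term_enrichment(["g1", "g2"], genes, annots, "T", {"IDA"}))
```

Dropping the IEA annotation shrinks the set of genes carrying the term, which changes the p-value even though the query list is the same.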
Improving GO Annotations
Computationally predicted GO annotations
Manually curated GO annotations
1. Computational predictions may indicate publications that were overlooked
2. Review inconsistencies between computationally predicted and manually curated GO annotations to improve mappings and manually curated annotations
3. Review inconsistencies between computationally predicted and manually curated GO annotations to improve ontology
Additional Annotations Using Interpro2GO
from gene_association.goa_uniprot 7/2006
Molecular Function 468 genes
Biological Process 316 genes
Cellular Component 207 genes
Information added to genes with no published characterization data
Preliminary Comparison: Cellular Component Annotations
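[Pie chart comparing 5946 IEA annotations with 9059 manually curated annotations (IC+IDA+IEP+IGI+IMP+IPI+ISS+NAS+RCA+TAS)]
Chart categories:
Interpro2go annotation matches curated annotation
Interpro2go annotation is ancestor of curated annotation
Interpro2go annotation for an unknown
Shared parent is root term
Shared parent is child of root term
Other shared parent term
Other: 38%
Remaining slice values (pairing with categories not recoverable from the transcript): 43%, 15%, 18%, 2%, 18%, 4%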
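The categories in this comparison describe how a predicted term relates to a curated term within the ontology graph. A minimal sketch of such a classification follows, assuming each term lists its direct parents and that the "shared parent" is the most specific term the two annotations have in common. The toy ontology, term names, and function names are hypothetical, the "unknown" category is omitted, and this is not the procedure used to produce the chart.

```python
# Toy ontology: each term maps to its direct parents (hypothetical names).
PARENTS = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "A1": ["A"],
    "A2": ["A"],
    "A1a": ["A1"],
    "A1b": ["A1"],
}


def ancestors(term, parents):
    """Return every term reachable by following parent links from term."""
    found = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(parents[current])
    return found


def depth(term, parents):
    """Shortest distance from term to a term with no parents (the root)."""
    if not parents[term]:
        return 0
    return 1 + min(depth(p, parents) for p in parents[term])


def classify(predicted, curated, parents):
    """Describe how a predicted annotation relates to a curated one."""
    if predicted == curated:
        return "matches curated annotation"
    if predicted in ancestors(curated, parents):
        return "ancestor of curated annotation"
    # Terms on both lineages; each term counts as part of its own lineage.
    shared = (ancestors(predicted, parents) | {predicted}) & (
        ancestors(curated, parents) | {curated}
    )
    deepest = max(depth(t, parents) for t in shared)
    if deepest == 0:
        return "shared parent is root term"
    if deepest == 1:
        return "shared parent is child of root term"
    return "other shared parent term"


if __name__ == "__main__":
    for pair in [("A1a", "A1a"), ("A", "A1a"), ("A", "B"),
                 ("A1", "A2"), ("A1a", "A1b")]:
        print(pair, "->", classify(pair[0], pair[1], PARENTS))
```

A deeper shared parent indicates that the prediction and the curated annotation agree more closely, which helps prioritize which inconsistencies to review first.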
Summary
1. Currently, all GO annotations for S. cerevisiae gene products are manually curated from literature
2. SGD will incorporate computationally predicted GO annotations that will provide additional information for a gene product’s role in biology
3. Computationally predicted GO annotations will be used to refine and improve manually curated GO annotations at SGD