SAB 2008
LITERATURE CURATION
Overview & Integrated Phenotype Curation
Literature Curation - Data Flow for First Pass
[Diagram: Paper collection -> First Pass (web interface; data flagged with comments for 32 different data types) -> Postgres DB (some data types stored for future curation) -> Active Curation: Data Extraction -> Database Input Files -> Local Databases (Caltech DB, Sanger DB, St. Louis DB) -> Complete DB]
First-Pass Curation Fields (based on 5787 Papers)
[Bar chart, vertical axis 0 to 1600. Bar values in descending order: 1546, 1291, 958, 917, 877, 811, 794, 789, 697, 493, 472, 320, 293, 281, 279, 194, 172, 166, 147, 140, 93, 81, 57, 41, 34, 20, 18, 18, 12]
Fields, in the order labeled on the chart: Expression data; RNAi; Transgene; Gene product interactions; Mutant phenotype; Gene function; Sequence change; Gene-gene interactions; Antibody; Gene-seq, gene name, synonym; Gene regulation; Structure correction; Site of action analysis; Overexpression; New allele; Sequence features; Protein functions in vitro; Mapping data; Structural info; Cell (name, function, ablation); Microarray; Mosaic analysis; Covalent modification; SNPs; Mass-Spec; RNAi (large-scale); Functional complementation; Chemicals; Human diseases
Data type                                Objects in WS170  Objects in WS190/91  % Change  % Complete*
Gene Identity and Function:
  Mutant Phenotype (total alleles)       2736              4675                 71%       20%
  RNAi (large and small scale)           64461             74427                15%       53%
  Overexpression                         5                 9                    80%       < 1%
  Nomenclature Data                      -                 -                    -         100%
Interactions:
  Genetic interactions                   4920              6795                 38%       -
  Gene Product Interaction (Y2H)         11573             11573                -         -
Cell Data:
  Cell Function (ablation and mosaics)   0                 183                  NA        15%
Gene Expression and Function:
  Expression Data                        6355              9744                 53%       100%
  Gene Regulation on Expression Level    642               2044                 218%      100%
  Microarray                             40                53**                 33%       71%
Sequence Data:
  Feature Data***                        Start-up phase
  Sequence Change                        -                 -                    -         100%
Reagents:
  Transgene                              4062              5151                 27%       100%
  C. elegans Antibodies                  1084              1324                 22%       100%
Concise Description:†
  Total Descriptions                     4398              5335                 21%       -
  Genes w/ > 5 references                -                 -                    7%        85%
  Genes w/ > 1 reference                 -                 -                    5%        53%
Gene Ontology:†
  Total GO annotations                   75,065            141,937              47%       -
  Total non-IEA GO annotations           25,634            33,045               22%       -
Data from 78 papers since WS170
Sanger Request Tracker - 67 since WS170
Data from 97 papers since WS170
* Based on first pass papers completed unless otherwise noted
† Outside of first pass
** Includes one tiling array
*** Data from Sanger RT, not only first pass
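For the rows checked here, the % Change column matches the relative increase in object counts from WS170 to WS190/91. The following is a minimal sketch of that arithmetic using counts copied from the table; the function name `percent_change` and the rounding to a whole percent are assumptions made for illustration, not a description of how the slide's figures were actually produced.

```python
def percent_change(old: int, new: int) -> int:
    """Relative increase from old to new, rounded to a whole percent."""
    if old == 0:
        # Matches the "NA" entry for a row that starts from 0 objects.
        raise ValueError("percent change is undefined when the starting count is 0")
    return round((new - old) / old * 100)


# (WS170 count, WS190/91 count) pairs copied from the table.
counts = {
    "Mutant Phenotype (total alleles)": (2736, 4675),
    "RNAi (Large and small scale)": (64461, 74427),
    "Gene Regulation on Expression Level": (642, 2044),
    "Transgene": (4062, 5151),
}

changes = {name: percent_change(old, new) for name, (old, new) in counts.items()}

for name, pct in changes.items():
    print(f"{name}: {pct}%")
```

A row that starts at zero objects, such as Cell Function, has no defined percent change, which is why the sketch raises an error instead of returning a number.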
Phenotype Ontology
Provides a controlled vocabulary for phenotypic descriptions, organized hierarchically
Can annotate phenotypes to a very granular level, preserving associations with more general terms
Many Data Types Include a Phenotype Assignment
Phenotype annotations: consistency, efficiency
Gene Identity and Function: Mutant Phenotype (total alleles); RNAi (large and small scale); Overexpression; Nomenclature Data
Interactions: Genetic interactions; Gene Product Interaction (Y2H)
Cell Data: Cell Function (ablation and mosaics)
Gene Expression and Function: Expression Data; Gene Regulation on Expression Level; Microarray
Sequence Data: Feature Data; Sequence Change
Reagents: Transgene; C. elegans Antibodies
Concise Description: Total Descriptions; Genes w/ > 5 references; Genes w/ > 1 reference
Gene Ontology: Total GO annotations; Total non-IEA GO annotations
The WormBase Phenotype Ontology is Hierarchical
Annotate to a very granular level, preserving associations with more general terms
(OBO-Edit view)
Term name: multivulva
Synonyms: Muv
Definition (w/ references): Multiple vulva-like protrusions are present along the ventral side of the animal. This is usually a result of all six vulval precursor cells adopting vulval (1° or 2°) fates.
Terms shown in the hierarchy:
vulva_development_abnormal
vulva_cell_fate_specification_abnormal
vulval_cell_induction_abnormal
vulval_cell_induction_increased
multivulva
reproductive_system_development_abnormal
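The idea that a granular annotation keeps its association with more general terms can be sketched as a walk up the hierarchy. The term names below come from the slide, but the parent/child links are an assumption made for illustration (the slide does not spell out the edges), and a real ontology may give a term more than one parent. The function names are hypothetical.

```python
# Assumed parent links between the terms named on the slide.
# The edges are a guess for illustration, not taken from the ontology file.
PARENT = {
    "multivulva": "vulval_cell_induction_increased",
    "vulval_cell_induction_increased": "vulval_cell_induction_abnormal",
    "vulval_cell_induction_abnormal": "vulva_cell_fate_specification_abnormal",
    "vulva_cell_fate_specification_abnormal": "vulva_development_abnormal",
    "vulva_development_abnormal": "reproductive_system_development_abnormal",
}


def ancestors(term: str) -> list:
    """Return the more general terms above `term`, nearest first."""
    chain = []
    while term in PARENT:
        term = PARENT[term]
        chain.append(term)
    return chain


def implied_terms(term: str) -> list:
    """An annotation to `term` also counts for every ancestor of `term`."""
    return [term] + ancestors(term)


for name in implied_terms("multivulva"):
    print(name)
```

Under these assumed links, an object annotated once to multivulva is also found by a search for any of the broader vulva or reproductive-system terms.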
Using the Ontological Structure for Data Retrieval
Query by Name, WB ID or Synonym
Also for: Gene Ontology, Anatomy Ontology
Output: Showing children of parent term and annotations
See individual annotations with references
Vulva development
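A rough sketch of this kind of retrieval: look a term up by name, ID or synonym, then gather annotations from the term and everything beneath it. The `Term` class, the `EX:` identifiers and the child links are hypothetical placeholders, not WormBase identifiers or the WormBase implementation. The single annotation reflects the smo-1 RNAi result from Poulin et al. quoted elsewhere in these slides.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    term_id: str                                      # placeholder ID, not a real WB ID
    name: str
    synonyms: list = field(default_factory=list)
    children: list = field(default_factory=list)      # names of child terms
    annotations: list = field(default_factory=list)   # (object, reference) pairs


# A tiny, assumed slice of the hierarchy.
TERMS = {
    t.name: t
    for t in [
        Term("EX:1", "vulval_cell_induction_abnormal",
             children=["vulval_cell_induction_increased"]),
        Term("EX:2", "vulval_cell_induction_increased",
             children=["multivulva"]),
        Term("EX:3", "multivulva", synonyms=["Muv"],
             annotations=[("smo-1(RNAi)", "Poulin et al. 2005")]),
    ]
}


def find_term(query: str):
    """Match a query against term ID, name or synonym, ignoring case."""
    q = query.strip().lower()
    for term in TERMS.values():
        keys = [term.term_id, term.name, *term.synonyms]
        if q in (k.lower() for k in keys):
            return term
    return None


def collect_annotations(term: Term) -> list:
    """Annotations on the term itself plus those on all descendant terms."""
    found = list(term.annotations)
    for child_name in term.children:
        found.extend(collect_annotations(TERMS[child_name]))
    return found


parent = find_term("vulval_cell_induction_abnormal")
print(parent.children)
print(collect_annotations(parent))
```

Querying the parent term returns the annotation made to multivulva two levels down, which is the behaviour the output slide describes: children of the parent term shown together with their annotations.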
Phenotype Ontology Overview:
Development:
Maintained with OBO-Edit and registered with the OBO Foundry (NCBO).
OBO-Edit is developed by the Berkeley Bioinformatics and Ontologies Project, and is funded by the Gene Ontology Consortium.
Release                          Phenotype Terms  Defined terms  Percent Defined  Terms used (%)
WS160 - Jul 2006 (prior to PO)   119              0              0                ---
WS170 - Feb 2007                 1394             237            17%              40%
Current                          1677             708            42%              60%
Refined by usage: We will continue development in parallel with curation, which reflects the developing complexity with which terms are described in the literature. (Currently there are 4,675 alleles curated with 10,468 phenotype associations, ~125% WS170.)
Community input: Seek input from experts in certain fields to develop the ontology.
The embryonic_lethal branch - Fabio Piano and Kris Gunsalus
Expert input leads to granularity that reflects term usage
Annotations: 355
Integrated Phenotype Curation
Initial paradigm - one curator = one data type
Paper: Chromatin regulation and sumoylation in the inhibition of Ras-induced vulval development in Caenorhabditis elegans. Poulin et al - EMBO J. 2005 Jul 20;24(14):2613-23.
First Pass flags for smo-1: RNAi; Interactions (genetic); Gene Regulation (change in expression)
RNAi Phenotype: “RNAi of smo-1 on its own induces a low percentage of Muv animals”.
RNAi based Interaction (Synthetic): “smo-1 displays synMuv activity in both class A and class B backgrounds”
RNAi based Interaction (Enhancement): “let-60(n2021) increase in the percentage of Muv animals compared to smo-1(RNAi) alone”
RNAi based Gene Regulation (Ectopic): “RNAi of the sumoylation pathway gene smo-1 leads to ectopic lag-2 expression”
Need for curation integration: RNAi curation as an example
First Pass: Interactions, RNAi, Gene Regulation
RNAi based Interactions
- RNAi curation form has functionality to generate interaction objects
- Enter number of interacting genes
- Keep track in Postgres database - avoids redundant curation
- Currently there are 2493 RNAi-based interactions in WormBase
Example: “let-60(n2021) increase in the percentage of Muv animals compared to smo-1(RNAi) alone”
Form entries for this example: Enhancement; Muv; let-60(n2021); smo-1(RNAi)
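One way to picture an interaction object generated from the RNAi form, together with the "avoid redundant curation" check, is sketched below. The `Interaction` class and `record_interaction` function are hypothetical, and a Python set stands in for the Postgres tracking mentioned on the slide. The field values are the ones shown for the let-60(n2021) / smo-1(RNAi) example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    """Hypothetical shape of an RNAi-based interaction record."""
    interaction_type: str
    phenotype: str
    interactors: tuple   # one entry per interacting gene or allele
    paper: str


def record_interaction(curated: set, interaction: Interaction) -> bool:
    """Store the interaction unless an identical one is already tracked."""
    if interaction in curated:
        return False
    curated.add(interaction)
    return True


curated = set()   # stand-in for the tracking table
enhancement = Interaction(
    interaction_type="Enhancement",
    phenotype="Muv",
    interactors=("let-60(n2021)", "smo-1(RNAi)"),
    paper="Poulin et al. 2005",
)

first = record_interaction(curated, enhancement)
second = record_interaction(curated, enhancement)
print(first, second, len(curated))
```

Because the record is frozen and compared by value, a second curator entering the same interaction from the same paper is detected rather than stored twice.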
Coordination of RNAi based Gene Regulation
· If an RNAi object is created first: I enter information here so that Xiaodong can create a gene regulation object for the RNAi object
· If a gene regulation object is created first: Xiaodong creates a gene regulation object and I input the object name here
· Currently there are 365 RNAi-based gene regulations (46% from WS170)
· Need to set up a tracking system in Postgres
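The tracking need described here amounts to knowing, for each paper, which of the two linked objects still has to be created. A minimal sketch follows; the `RegulationLink` class, the paper labels and the object names are all hypothetical placeholders, and an in-memory list stands in for the Postgres table that the slide says still needs to be set up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RegulationLink:
    """Pairs an RNAi object with its gene regulation object for one paper."""
    paper: str
    rnai_object: Optional[str] = None
    gene_regulation_object: Optional[str] = None

    def pending(self) -> Optional[str]:
        """Name the object that still has to be created, if any."""
        if self.rnai_object is None:
            return "RNAi"
        if self.gene_regulation_object is None:
            return "gene regulation"
        return None


# Placeholder records covering both orders of creation.
links = [
    RegulationLink("paper-A", rnai_object="rnai-obj-1"),
    RegulationLink("paper-B", gene_regulation_object="grn-obj-1"),
    RegulationLink("paper-C", rnai_object="rnai-obj-2",
                   gene_regulation_object="grn-obj-2"),
]

todo = {link.paper: link.pending() for link in links if link.pending()}
print(todo)
```

Either curator can consult the pending list, so the same record serves both the RNAi-first and the gene-regulation-first case.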
Towards Integrating RNAi and Allele Curation
[Diagram labels: First pass; Postgres; RNAi Checkout (RNAi); Allele Checkout (Alleles)]