PIR Bio-defense Related Pathogen Data Mining
NIAID
Literature Mining of Pathogenesis-Related Proteins
Biodefense Proteomics Resource Center at PIR
Dengue Virus E Proteins Bioinformatics Analysis
US Army
November 19, 2007
2
Literature Mining of Pathogenesis-Related Proteins
Objective:
  To develop a text mining system for pathogenesis-related proteins in pathogens of military and biodefense relevance
  To integrate the pathogenesis-related proteins into integrated protein databases for functional analysis
Priority pathogenic organisms:
  Francisella tularensis – Gammaproteobacteria
  Dengue virus – (+)ssRNA virus
  Brucella – Alphaproteobacteria
  Trypanosoma cruzi – Kinetoplastida
Integrated information for pathogenic proteins:
  UniProtKB
  iProClass
  Other pathway databases
3
Literature Mining of Pathogenesis-Related Proteins
[Diagram labels: Functional pathway analysis; Data integration; iProClass]
4
Literature Mining of Pathogenesis-Related Proteins
[Workflow diagram, iterations labeled 1, 2, 3, n-1, n]
Input: Priority list of pathogens…
Steps:
  Document retrieval (Prioritizing)
  Name recognition
  Passage highlighting
  System adjustment
Output: Pathogenesis-related papers
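The workflow steps on this slide (document retrieval with prioritizing, name recognition, passage highlighting) can be illustrated with a small sketch. It is not the system described in the talk: the cue words, the protein-name dictionary, the scoring rule, and the sample abstracts are all hypothetical, chosen only to show how the three steps could chain together. The "System adjustment" step would correspond to revising these lists between iterations and is not modeled.

```python
import re

# Hypothetical cue words; a real system would use a curated vocabulary.
CUE_WORDS = {"virulence", "pathogenesis", "invasion", "toxin"}

# Hypothetical dictionary mapping surface names to one canonical label.
PROTEIN_NAMES = {"IglC": "IglC", "envelope protein": "E protein", "E protein": "E protein"}

# Invented sample abstracts, for illustration only.
DOCUMENTS = {
    "doc1": "Francisella tularensis IglC is required for virulence. The study used mice.",
    "doc2": "Yeast growth was measured.",
    "doc3": "Dengue virus envelope protein mediates entry.",
}
PATHOGENS = ["Francisella tularensis", "Dengue virus"]


def score_document(text, pathogens):
    """Score a paper by pathogen mentions plus cue-word mentions."""
    lowered = text.lower()
    pathogen_hits = sum(lowered.count(p.lower()) for p in pathogens)
    cue_hits = sum(lowered.count(w) for w in CUE_WORDS)
    # Papers that never mention a priority pathogen score zero.
    return pathogen_hits + cue_hits if pathogen_hits else 0


def prioritize(documents, pathogens):
    """Return (score, doc_id) pairs with score > 0, highest score first."""
    scored = [(score_document(text, pathogens), doc_id) for doc_id, text in documents.items()]
    return sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))


def recognize_names(text):
    """Return the sorted canonical labels of dictionary names found in the text."""
    found = set()
    for name, label in PROTEIN_NAMES.items():
        if re.search(r"\b" + re.escape(name) + r"\b", text, flags=re.IGNORECASE):
            found.add(label)
    return sorted(found)


def highlight_passages(text):
    """Return the sentences that contain both a protein name and a cue word."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences
            if recognize_names(s) and any(w in s.lower() for w in CUE_WORDS)]
```

Calling `prioritize(DOCUMENTS, PATHOGENS)` ranks the sample abstracts, and `highlight_passages` can then be applied to each retrieved text to pull out the sentences worth reading first.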
5
http://pir.georgetown.edu/iprolink/rlimsp/
RLIMS-P: Rule-based Literature Mining System for Protein Phosphorylation
6
BioThesaurus: Gene/protein name searches - synonyms, ambiguous names…
http://pir.georgetown.edu/iprolink/biothesaurus/
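To make the idea of synonym and ambiguous-name searches concrete, here is a minimal sketch of a name thesaurus held in two in-memory indexes. It does not use or reproduce the BioThesaurus service or its data format: the entry identifiers are placeholders, the record layout is an assumption, and only the fact that "CAP" is shared by several unrelated proteins is taken from general knowledge.

```python
from collections import defaultdict

# Hypothetical (entry ID, name) records; the IDs are placeholders, not real accessions.
RECORDS = [
    ("ENTRY_A", "CAP"),
    ("ENTRY_A", "Catabolite activator protein"),
    ("ENTRY_B", "CAP"),
    ("ENTRY_B", "Adenylyl cyclase-associated protein"),
]


def normalize(name):
    """Lowercase and collapse whitespace so trivial spelling variants match."""
    return " ".join(name.lower().split())


def build_indexes(records):
    """Build name -> entry IDs and entry ID -> names lookups."""
    name_to_ids = defaultdict(set)
    id_to_names = defaultdict(set)
    for entry_id, name in records:
        name_to_ids[normalize(name)].add(entry_id)
        id_to_names[entry_id].add(name)
    return name_to_ids, id_to_names


def lookup(name, name_to_ids):
    """Return the sorted entry IDs that carry this name."""
    return sorted(name_to_ids.get(normalize(name), set()))


def is_ambiguous(name, name_to_ids):
    """A name is ambiguous when it maps to more than one entry."""
    return len(lookup(name, name_to_ids)) > 1


def synonyms(name, name_to_ids, id_to_names):
    """Return the other names attached to any entry that carries this name."""
    query = normalize(name)
    others = set()
    for entry_id in lookup(name, name_to_ids):
        others.update(n for n in id_to_names[entry_id] if normalize(n) != query)
    return sorted(others)


NAME_TO_IDS, ID_TO_NAMES = build_indexes(RECORDS)
```

With these toy records, `is_ambiguous("CAP", NAME_TO_IDS)` is true because the name resolves to two entries, while each full name resolves to exactly one.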
7
[Diagram: Exp. Data, Information, Knowledge (steps 1, 2, 3)]
Gene ID, Protein ID, Peptide seq.
Function, Pathway, Family, ……
Categorize, Statistics, Cross-dataset, Association
UniProtKB AC/ID
8
iProXpress – Pathway Profiling
Protein information matrix: extensive annotations including protein name, family classification, function, protein-protein interaction, pathway…
Functional profiling: iterative categorization, sorting, cross-dataset comparison, coupled with manual examination.
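The categorization and cross-dataset comparison described above can be sketched as follows. This is an illustrative toy, not the iProXpress implementation: the protein IDs and the matrix layout are hypothetical, and the dataset names "ER" and "Mit" are borrowed from the organelle example on this slide. The two pathway names are standard KEGG pathway titles used only as sample annotation values.

```python
from collections import Counter

# Hypothetical protein information matrix: protein ID -> annotation fields.
INFO_MATRIX = {
    "p1": {"pathway": "Oxidative phosphorylation"},
    "p2": {"pathway": "Oxidative phosphorylation"},
    "p3": {"pathway": "Protein processing in endoplasmic reticulum"},
    "p4": {},  # no pathway annotation available
}

# Hypothetical data sets, each a list of identified protein IDs.
DATASETS = {"ER": ["p3", "p4"], "Mit": ["p1", "p2", "p4"]}


def categorize(protein_ids, matrix, field):
    """Count proteins per value of one annotation field."""
    return Counter(matrix.get(pid, {}).get(field, "unannotated") for pid in protein_ids)


def cross_dataset_profile(datasets, matrix, field):
    """Return (category, {dataset: count}) rows, largest total first."""
    profiles = {name: categorize(ids, matrix, field) for name, ids in datasets.items()}
    categories = set().union(*profiles.values())
    rows = [(cat, {name: profiles[name][cat] for name in datasets}) for cat in categories]
    # Sort by total count across data sets, then alphabetically for stable output.
    rows.sort(key=lambda row: (-sum(row[1].values()), row[0]))
    return rows


PROFILE = cross_dataset_profile(DATASETS, INFO_MATRIX, "pathway")
```

Each row of `PROFILE` gives one category with its count in every data set, which is the kind of table a curator would then sort and examine manually.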
[Figure: KEGG pathway profile of organelle proteome data sets (ER, Mit)]
9
Gene Ontology: Molecular Process
IP-MS Data from E2-treated breast cancer cells
Transcriptional regulation, chromatin interaction, histone regulation
10
1. Albert Einstein College of Medicine: T. gondii, C. parvum
2. Caprion Pharmaceuticals: B. abortus
3. Harvard Institute of Proteomics: V. cholerae, B. anthracis
4. Myriad Genetics: B. anthracis, Y. pestis, F. tularensis, Vaccinia, Variola
5. Pacific Northwest National Laboratory: S. typhimurium, S. typhi, Vaccinia, Monkeypox
6. Scripps: SARS CoV, Influenza
7. University of Michigan: B. anthracis
[Diagram labels: Scripps, Caprion, Myriad, Harvard, U of Michigan, Albert Einstein, PNNL; Resource Center (SSS, PIR, VBI); DATA; NIAID]
11
www.proteomicsresource.org
Mouse proteins detected in B. anthracis and S. typhimurium infected macrophages
13
Integrated Analysis: Selection Pressure, Entropy; Epitope
Dengue
[Figure labels: DENV1, DENV3, DENV2, DENV4]
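The slide names entropy as one of the per-site measures but does not define it. A common choice for sequence variability is the Shannon entropy of each alignment column, and the sketch below assumes that definition. The aligned fragments are invented toy strings, not Dengue virus E protein sequences, and skipping gap characters is a simplification chosen for this example.

```python
import math
from collections import Counter

# Toy aligned fragments (invented); "-" would mark a gap.
TOY_ALIGNMENT = ["MKT", "MRT", "MKT", "MRS"]


def column_entropy(column):
    """Shannon entropy (bits) of one alignment column, ignoring gaps."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    # H = sum over residues of p * log2(1 / p)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def alignment_entropy(sequences):
    """Return one entropy value per column of an equal-length alignment."""
    if len({len(s) for s in sequences}) > 1:
        raise ValueError("all aligned sequences must have the same length")
    return [column_entropy(col) for col in zip(*sequences)]


ENTROPIES = alignment_entropy(TOY_ALIGNMENT)
```

A fully conserved column scores 0 bits, and the score rises as more residues share the site more evenly, so high-entropy sites are the variable ones.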
14
Additional Structure Analysis
[Figure label: Exposed]
Result: identification of diagnostic and vaccine targets
Dengue
[Table columns: aa site; Interacting residues; variant; exposed]