PIR Bio-defense Related Pathogen Data Mining


NIAID. PIR Bio-defense Related Pathogen Data Mining. November 19, 2007. Literature Mining of Pathogenesis-Related Proteins. Biodefense Proteomics Resource Center at PIR. Dengue Virus E Proteins Bioinformatics Analysis. US Army.

Transcript of PIR Bio-defense Related Pathogen Data Mining

Page 1: PIR  Bio-defense Related Pathogen Data Mining

PIR Bio-defense Related Pathogen Data Mining

NIAID

Literature Mining of Pathogenesis-Related Proteins

Biodefense Proteomics Resource Center at PIR

Dengue Virus E Proteins Bioinformatics Analysis

US Army

November 19, 2007

Page 2: PIR  Bio-defense Related Pathogen Data Mining


Literature Mining of Pathogenesis-Related Proteins

Objective:

To develop a text mining system for pathogenesis-related proteins in pathogens of military and biodefense relevance
To integrate the pathogenesis-related proteins into integrated protein databases for functional analysis

Priority pathogenic organisms:

Francisella tularensis – Gammaproteobacteria
Dengue virus – (+)ssRNA virus
Brucella – Alphaproteobacteria
Trypanosoma cruzi – Kinetoplastida

Integrated information for pathogenic proteins:

UniProtKB
iProClass
Other pathway databases

Page 3: PIR  Bio-defense Related Pathogen Data Mining


Literature Mining of Pathogenesis-Related Proteins

[Diagram labels: Data integration; iProClass; Functional pathway analysis]

Page 4: PIR  Bio-defense Related Pathogen Data Mining


Literature Mining of Pathogenesis-Related Proteins

[Workflow diagram. Inputs: Priority list of pathogens…; Pathogenesis-related papers. Steps: Document retrieval (Prioritizing); Name recognition; Passage highlighting; System adjustment. Item labels: 1, 2, 3, n-1, n]
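The steps named on this slide (document retrieval with prioritizing, name recognition, passage highlighting) can be illustrated with a toy sketch. It is not the PIR system: the cue-word list, the scoring rule, the dictionary-based name matching, the `[[ ]]` markers, and every function name below are assumptions made up for the example, and the matching assumes the protein names do not overlap in the text.

```python
import re

# Hypothetical cue words; the slide does not give the real system's criteria.
PATHOGENESIS_CUES = ("virulence", "pathogenesis", "pathogenicity", "invasion")


def score_document(text, pathogens):
    """Toy relevance score: pathogen mentions weighted by cue-word mentions."""
    lowered = text.lower()
    pathogen_hits = sum(lowered.count(p.lower()) for p in pathogens)
    cue_hits = sum(lowered.count(c) for c in PATHOGENESIS_CUES)
    # A document with no pathogen mention scores zero.
    return pathogen_hits * (1 + cue_hits)


def prioritize(documents, pathogens):
    """Document retrieval (prioritizing): rank documents, drop zero scores."""
    scored = [(score_document(d, pathogens), d) for d in documents]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [d for s, d in ranked if s > 0]


def recognize_names(text, protein_names):
    """Name recognition by dictionary lookup: (start, end, name) spans."""
    spans = []
    for name in protein_names:
        pattern = r"\b" + re.escape(name) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            spans.append((m.start(), m.end(), name))
    return sorted(spans)


def highlight_passages(text, protein_names):
    """Passage highlighting: keep sentences with a hit, mark hits with [[ ]]."""
    passages = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        spans = recognize_names(sentence, protein_names)
        if not spans:
            continue
        # Insert markers from the right so earlier offsets stay valid.
        for start, end, _ in sorted(spans, reverse=True):
            sentence = (sentence[:start] + "[[" + sentence[start:end]
                        + "]]" + sentence[end:])
        passages.append(sentence)
    return passages
```

The "System adjustment" step is left out because the slide gives no detail on what is adjusted.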

Page 5: PIR  Bio-defense Related Pathogen Data Mining


http://pir.georgetown.edu/iprolink/rlimsp/

RLIMS-P: Rule-based Literature Mining System for Protein Phosphorylation

Page 6: PIR  Bio-defense Related Pathogen Data Mining


BioThesaurus: Gene/protein name searches - synonyms, ambiguous names…

http://pir.georgetown.edu/iprolink/biothesaurus/
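A name search that returns synonyms and flags ambiguous names can be sketched as a reverse index from names to entries. This is only an illustration of the idea, not the BioThesaurus implementation or its API: the data layout (entry ID mapped to a list of names), the function names, and the rule that a name is "ambiguous" when it maps to more than one entry are all assumptions.

```python
from collections import defaultdict


def build_name_index(entries):
    """Map each normalized name to the set of entry IDs that use it."""
    index = defaultdict(set)
    for entry_id, names in entries.items():
        for name in names:
            index[name.strip().lower()].add(entry_id)
    return index


def lookup(index, entries, query):
    """Return matching entry IDs, their synonyms, and an ambiguity flag."""
    ids = sorted(index.get(query.strip().lower(), set()))
    synonyms = {entry_id: sorted(entries[entry_id]) for entry_id in ids}
    # Assumption: a name shared by more than one entry counts as ambiguous.
    return {"ids": ids, "synonyms": synonyms, "ambiguous": len(ids) > 1}
```

Matching here is case-insensitive exact matching only; a real thesaurus would likely need more normalization than this.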

Page 7: PIR  Bio-defense Related Pathogen Data Mining


[Diagram with three numbered stages: Exp. Data; Information; Knowledge. Labels: Gene ID, Protein ID, Peptide seq.; UniProtKB AC/ID; Function, Pathway, Family, ……; Categorize, Statistics, Cross-dataset, Association]

Page 8: PIR  Bio-defense Related Pathogen Data Mining


iProXpress – Pathway Profiling

Protein information matrix: extensive annotations including protein name, family classification, function, protein-protein interaction, pathway…

Functional profiling: iterative categorization, sorting, cross-dataset comparison, coupled with manual examination.
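The categorization and cross-dataset comparison described above can be sketched as counting proteins per annotation category in each data set. This is a simplified illustration, not iProXpress itself: the layout of the annotation matrix (protein ID mapped to a dict of annotation fields), the `"unannotated"` bucket, and the function names are assumptions.

```python
from collections import Counter


def profile(datasets, annotations, field):
    """Count, per data set, how many proteins fall into each category of `field`."""
    table = {}
    for name, protein_ids in datasets.items():
        counts = Counter()
        for pid in protein_ids:
            # Proteins without this annotation go into an "unannotated" bucket.
            categories = annotations.get(pid, {}).get(field, ["unannotated"])
            for category in categories:
                counts[category] += 1
        table[name] = counts
    return table


def cross_dataset(table):
    """Arrange counts as category rows by data-set columns for comparison."""
    categories = sorted({c for counts in table.values() for c in counts})
    return {c: {name: table[name][c] for name in table} for c in categories}
```

Calling `profile` with a different `field` (for example a family or function field, if present in the matrix) gives another cut of the same data, which is one way to read "iterative categorization"; the manual examination step has no counterpart in this sketch.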

[Figure: KEGG pathway profiles of organelle proteome data sets; labels: ER, Mit]

Page 9: PIR  Bio-defense Related Pathogen Data Mining


Gene Ontology: Molecular Process

IP-MS Data from E2-treated breast cancer cells

[Figure labels: Transcriptional regulation; chromatin interaction; histone regulation]

Page 10: PIR  Bio-defense Related Pathogen Data Mining


1. Albert Einstein College of Medicine: T. gondii, C. parvum

2. Caprion Pharmaceuticals: B. abortus

3. Harvard Institute of Proteomics: V. cholerae, B. anthracis

4. Myriad Genetics: B. anthracis, Y. pestis, F. tularensis, Vaccinia, Variola

5. Pacific Northwest National Laboratory: S. typhimurium, S. typhi, Vaccinia, Monkeypox

6. Scripps: SARS CoV, Influenza

7. University of Michigan: B. anthracis

[Diagram labels: Scripps, Caprion, Myriad, Harvard, U of Michigan, Albert Einstein, PNNL; Resource Center (SSS, PIR, VBI); DATA; NIAID]

Page 11: PIR  Bio-defense Related Pathogen Data Mining

www.proteomicsresource.org

Page 12: PIR  Bio-defense Related Pathogen Data Mining

Mouse proteins detected in B. anthracis and S. typhimurium infected macrophages

Page 13: PIR  Bio-defense Related Pathogen Data Mining


Dengue

Integrated Analysis: Selection Pressure, Entropy; Epitope

[Figure panels: DENV1, DENV3, DENV2, DENV4]
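The slide does not say how entropy was computed. A common choice for measuring per-site variability in aligned protein sequences is Shannon entropy, and the sketch below assumes that definition. The gap handling (gaps ignored) and the function names are also assumptions, not the authors' method.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of the residues in one alignment column."""
    residues = [r for r in column if r != "-"]  # assumption: ignore gaps
    if not residues:
        return 0.0
    total = len(residues)
    probs = [n / total for n in Counter(residues).values()]
    return sum(-p * math.log2(p) for p in probs)


def alignment_entropy(sequences):
    """Per-site entropy for a list of equal-length aligned sequences."""
    if len({len(s) for s in sequences}) > 1:
        raise ValueError("aligned sequences must have equal length")
    return [column_entropy(col) for col in zip(*sequences)]
```

A fully conserved site scores 0 bits, and a site split evenly between two residues scores 1 bit, so higher values mark more variable positions.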

Page 14: PIR  Bio-defense Related Pathogen Data Mining


Additional Structure Analysis

[Structure figure label: Exposed]

Result: identification of diagnostic and vaccine targets

Dengue

[Table columns: aa site; Interacting residues; variant; exposed]