Pathogen Profiling Pipeline
June 27, 2009 | Pathogen Profiling Pipeline | M3 SIG – ISMB/ECCB 2009
Pathogen Profiling Pipeline
Tom Matthews
National Microbiology Laboratory
Public Health Agency of Canada
A Metagenomics Tool for Rapid Identification of Pathogens from Clinical Specimens
Introduction
● With novel or emerging diseases, classical pathogen identification may not always produce results
● Advances in next-gen sequencing technology
● Characterize samples at the genomic level
● Pathogen Profiling Pipeline
● Bioinformatics pipeline
● Analysis of host and microbial nucleic acids
Features
● Nucleotide and protein BLAST analysis
● Unbiased analysis of input reads
● Clustered execution
● Web front-end
● Custom analysis pipelines
● Easily viewed results
Filtering Overview
● BLAST analysis performed against reference sequence database
● Assigns hits according to cut-off criteria
● Calculates equivalent hits
● Clustered BLAST and filtering
Last Common Ancestor Estimation
● Uses equivalent hits for LCA calculation
● User specifies equivalent hit percentage cutoff
● NCBI taxonomy database for ancestor lookup
● Walks up taxonomy tree to find lowest intersection of all leaf nodes
● Unbiased approach
[Figure: taxonomy tree with leaf nodes Vaccinia, Camelpox, Taterapox, and Variola joined at their common ancestor, Orthopoxvirus]
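The walk up the taxonomy tree can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline's own code: `PARENT` is a small, simplified hand-written stand-in for the NCBI taxonomy database, and the function names `lineage` and `last_common_ancestor` are hypothetical.

```python
# Toy child -> parent map standing in for the NCBI taxonomy database.
# Assumption: simplified lineage, written by hand for illustration only.
PARENT = {
    "Vaccinia": "Orthopoxvirus",
    "Camelpox": "Orthopoxvirus",
    "Taterapox": "Orthopoxvirus",
    "Variola": "Orthopoxvirus",
    "Orthopoxvirus": "Poxviridae",
    "Poxviridae": "Viruses",
    "Viruses": "root",
}


def lineage(taxon):
    """Return the path from a taxon up to the root, starting at the taxon."""
    path = [taxon]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def last_common_ancestor(taxa):
    """Return the lowest node shared by the lineages of all given taxa."""
    if not taxa:
        raise ValueError("at least one taxon is required")
    lineages = [lineage(t) for t in taxa]
    shared = set(lineages[0]).intersection(*lineages[1:])
    # Walking up from the first leaf, the first shared node is the lowest one.
    for node in lineages[0]:
        if node in shared:
            return node
    return None
```

Called with the four leaf nodes from the figure, `last_common_ancestor(["Vaccinia", "Camelpox", "Taterapox", "Variola"])` returns `"Orthopoxvirus"`.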
Filtering Outputs
● Hits – High scoring reads passing filtering values
● Equivalent Hits – BLAST hits matching to within an assigned percentage of the top hit's bitscore
● Last Common Ancestors – Calculated (estimated) LCA of all the equivalent hits
● Unassigned – Passed to the next pipeline step
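One way to read the equivalent-hit rule is shown below. It is a sketch under an assumption the slides do not spell out: "within an assigned percentage" is taken to mean a bitscore no lower than the top bitscore reduced by that percentage. The function name and the `(subject_id, bitscore)` tuple layout are hypothetical.

```python
def equivalent_hits(hits, percent_cutoff):
    """Keep hits whose bitscore is within percent_cutoff percent of the top one.

    hits is a list of (subject_id, bitscore) tuples for a single read.
    Assumption: "within N percent" means bitscore >= top * (1 - N / 100).
    """
    if not hits:
        return []
    top = max(score for _, score in hits)
    threshold = top * (1 - percent_cutoff / 100)
    return [(subject, score) for subject, score in hits if score >= threshold]
```

For example, with bitscores of 200, 195, and 150 and a 5 percent cutoff, the threshold is 190, so the first two hits are treated as equivalent and would feed the last common ancestor estimate.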
Example Analysis Method
● BLAST reads against host database
● Remove host reads
● BLAST unassigned against reference database
● Filter hits vs. unassigned
● Repeat...
● Post analysis
[Figure: analysis flow diagram. Labels: Sample reads; BLAST and Filtering against the Host genome, Viral genome, Bacterial genome, Protozoan genome, and Fungal genome; Non-host reads; Pool results; Unique organisms in sample]
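The steps above can be sketched as a loop that hands the unassigned reads from one database to the next. This is an illustration only: `run_pipeline` and `toy_blast_and_filter` are hypothetical names, the toy function replaces the real BLAST and filtering step with an exact dictionary lookup, and running the reference databases one after another is an assumption based on the bullet list rather than on the diagram.

```python
def run_pipeline(reads, host_db, reference_dbs, blast_and_filter):
    """Remove host reads, then try each reference database in turn.

    blast_and_filter(reads, db) must return (hits, unassigned), where hits
    maps read -> organism. It is a placeholder for the real BLAST step.
    """
    # Host hits are discarded; only the non-host reads move on.
    _, remaining = blast_and_filter(reads, host_db)
    pooled = {}
    for db in reference_dbs:
        hits, remaining = blast_and_filter(remaining, db)
        pooled.update(hits)
    return pooled, remaining


def toy_blast_and_filter(reads, db):
    """Toy stand-in: a 'database' is a dict mapping a read to an organism."""
    hits = {read: db[read] for read in reads if read in db}
    unassigned = [read for read in reads if read not in db]
    return hits, unassigned
```

The pooled dictionary gives the unique organisms in the sample via `sorted(set(pooled.values()))`, and whatever is left in `remaining` matched none of the databases.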
Pipeline Construction
Pipeline Execution
● Custom execution manager
● Computes dependencies and monitors running jobs
● Distributes jobs across a Linux cluster
● Facilitates unattended clustered executions
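Dependency computation of this kind is commonly done with a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later) and does not reflect how the custom execution manager is actually written; the function name and the job names in the example are hypothetical.

```python
from graphlib import TopologicalSorter


def execution_batches(dependencies):
    """Group jobs into batches that could run in parallel on a cluster.

    dependencies maps each job name to the set of jobs it must wait for.
    Every job in a batch has all of its prerequisites in earlier batches.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)  # mark the whole batch finished
    return batches


# Hypothetical job graph: split the reads, BLAST each chunk, then filter.
jobs = {
    "split_reads": set(),
    "blast_chunk_1": {"split_reads"},
    "blast_chunk_2": {"split_reads"},
    "filter": {"blast_chunk_1", "blast_chunk_2"},
}
batches = execution_batches(jobs)
```

Here the two BLAST chunks land in the same batch, which is where a cluster scheduler could run them side by side. A real manager would also submit the jobs and poll their status rather than marking each batch done immediately.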
Reports
Drill Down Reports
Abundance View
● Displays abundance of taxonomic hits
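An abundance table like this can be derived by counting reads per assigned taxon. The snippet below is a hypothetical sketch: the slides do not show how the view is computed, and the `assignments` layout (read identifier mapped to taxon name) is an assumption.

```python
from collections import Counter


def abundance(assignments):
    """Return (taxon, read_count, fraction) rows, most abundant first.

    assignments maps a read identifier to the taxon it was assigned to.
    """
    counts = Counter(assignments.values())
    total = sum(counts.values())
    return [(taxon, n, n / total) for taxon, n in counts.most_common()]
```

With three reads assigned to one taxon and one read to another, the rows come back with fractions 0.75 and 0.25.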
Example Run
● Mouth swab input samples
● Two pools:
  ● Samples spiked with Vaccinia and Influenza A
  ● Background reference sample
Wrap-up
● Unbiased analysis of input reads
● Custom analysis pipelines
● Last common ancestor calculation
● Clustered execution
● Multiple report views
● Exportable results
Acknowledgements
● Gary Van Domselaar
● Morag Graham
● Shaun Tyler
● Heather Kent
● Kim Melnychuk
● Christine Bonner
● Geoff Peters
● Philip Mabon