Easier Workflows & Tool comparison with oqtans+raetschlab.org/lectures/GCCoqtans2012UIC.pdfEasier...
Transcript of Easier Workflows & Tool comparison with oqtans+raetschlab.org/lectures/GCCoqtans2012UIC.pdfEasier...
![Page 1: Easier Workflows & Tool comparison with oqtans+raetschlab.org/lectures/GCCoqtans2012UIC.pdfEasier Workflows & Tool comparison with oqtans+ Vipin T. Sreedharan](https://reader036.fdocuments.in/reader036/viewer/2022071216/6047cf35fb27940305064153/html5/thumbnails/1.jpg)
Easier Workflows & Tool comparison
with oqtans+
Vipin T. Sreedharan <[email protected]> Rätsch Lab cBio, Memorial Sloan-Kettering Cancer Center, New York, USA.
MLB AG-Rätsch, FML of the Max Planck Society, Tübingen, Germany. Center for Bioinformatics, University of Tübingen, Germany.
Galaxy Community Conference, 26 July 2012, Chicago
![Page 2: Easier Workflows & Tool comparison with oqtans+raetschlab.org/lectures/GCCoqtans2012UIC.pdfEasier Workflows & Tool comparison with oqtans+ Vipin T. Sreedharan](https://reader036.fdocuments.in/reader036/viewer/2022071216/6047cf35fb27940305064153/html5/thumbnails/2.jpg)
Deep sequencing data
‣ Specificities:
‣ Rapidity of production
‣ Low cost
‣ Small fragments (reads)
![Page 3: Easier Workflows & Tool comparison with oqtans+raetschlab.org/lectures/GCCoqtans2012UIC.pdfEasier Workflows & Tool comparison with oqtans+ Vipin T. Sreedharan](https://reader036.fdocuments.in/reader036/viewer/2022071216/6047cf35fb27940305064153/html5/thumbnails/3.jpg)
Applied to many scientific contexts:
Deep sequencing data
oqtans
![Page 4: Easier Workflows & Tool comparison with oqtans+raetschlab.org/lectures/GCCoqtans2012UIC.pdfEasier Workflows & Tool comparison with oqtans+ Vipin T. Sreedharan](https://reader036.fdocuments.in/reader036/viewer/2022071216/6047cf35fb27940305064153/html5/thumbnails/4.jpg)
RNA-Seq Analysis
Transcript Prediction Read Mapping Sequencing Quantification
‣ Common analysis tasks
‣ compare two samples (wild type, mutant)
‣ identify new transcripts Lee et al. 2011 Nucleic Acids Res Yamashita et al. 2011 Genome Res Daines et al. 2011 Genome Res Ramani et al. 2010 Genome Res
Tang et al. 2011 Nature Methods Grabherr et al. 2011 Nature Biotech Li et al. 2011 Science Gerstein et al. 2010 Science
![Page 5: Easier Workflows & Tool comparison with oqtans+raetschlab.org/lectures/GCCoqtans2012UIC.pdfEasier Workflows & Tool comparison with oqtans+ Vipin T. Sreedharan](https://reader036.fdocuments.in/reader036/viewer/2022071216/6047cf35fb27940305064153/html5/thumbnails/5.jpg)
Transcript Prediction Read Mapping Sequencing Quantification
‣ Common issues
‣ Reproducibility
‣ Tool availability
‣ Scalability
RNA-Seq Analysis
![Page 6: Easier Workflows & Tool comparison with oqtans+raetschlab.org/lectures/GCCoqtans2012UIC.pdfEasier Workflows & Tool comparison with oqtans+ Vipin T. Sreedharan](https://reader036.fdocuments.in/reader036/viewer/2022071216/6047cf35fb27940305064153/html5/thumbnails/6.jpg)
oqtans+ Galaxy Tools
Transcript Prediction Read Mapping Sequencing Quantification
PALMapper*
BWA
TopHat
* developed by Rätsch Lab, at cBio MSKCC
rQuant*
rDiff*
Cufflinks/Cuffdiff
DESeq
Genesetter*
topGo
mTIM*
ASP*
Trinity
Cufflinks
Scripture
Schultheiss et al. 2011 BMC Bioinformatics
RNA-geeq*
![Page 7: Easier Workflows & Tool comparison with oqtans+raetschlab.org/lectures/GCCoqtans2012UIC.pdfEasier Workflows & Tool comparison with oqtans+ Vipin T. Sreedharan](https://reader036.fdocuments.in/reader036/viewer/2022071216/6047cf35fb27940305064153/html5/thumbnails/7.jpg)
oqtans+ Transcriptome analysis toolsuite
Transcript Prediction Read Mapping Sequencing Quantification
‣ PALMapper: highly accurate short-read mapper using base quality and splice site predictions
G. Jean et al. 2010 Curr Protoc Bioinformatics
![Page 8: Easier Workflows & Tool comparison with oqtans+raetschlab.org/lectures/GCCoqtans2012UIC.pdfEasier Workflows & Tool comparison with oqtans+ Vipin T. Sreedharan](https://reader036.fdocuments.in/reader036/viewer/2022071216/6047cf35fb27940305064153/html5/thumbnails/8.jpg)
Transcript Prediction Read Mapping Sequencing Quantification
‣ mTIM: reconstructs exon-intron structure from alignments and splice site predictions
J. Behr et al. 2011 NIPS Machine learning in Comp Bio workshop, G. Zeller et al. 2012 i.p.
oqtans+ Transcriptome analysis toolsuite
![Page 9: Easier Workflows & Tool comparison with oqtans+raetschlab.org/lectures/GCCoqtans2012UIC.pdfEasier Workflows & Tool comparison with oqtans+ Vipin T. Sreedharan](https://reader036.fdocuments.in/reader036/viewer/2022071216/6047cf35fb27940305064153/html5/thumbnails/9.jpg)
Transcript Prediction Read Mapping Sequencing Quantification
‣ rQuant: estimates biases in library prep, sequencing, and read mapping; accurately determines the abundances of transcripts
R. Bohnert & G. Rätsch 2010 Nucleic Acids Res
oqtans+ Transcriptome analysis toolsuite
![Page 10: Easier Workflows & Tool comparison with oqtans+raetschlab.org/lectures/GCCoqtans2012UIC.pdfEasier Workflows & Tool comparison with oqtans+ Vipin T. Sreedharan](https://reader036.fdocuments.in/reader036/viewer/2022071216/6047cf35fb27940305064153/html5/thumbnails/10.jpg)
Transcript Prediction Read Mapping Sequencing Quantification
‣ rDiff/DESeq: determines significant differences in transcript/gene expression between experiments using statistical tests
O. Stegle et al. 2010 Nature Preceedings S. Anders and W. Huber 2010 Genome Biology
oqtans+ Transcriptome analysis toolsuite
![Page 11: Easier Workflows & Tool comparison with oqtans+raetschlab.org/lectures/GCCoqtans2012UIC.pdfEasier Workflows & Tool comparison with oqtans+ Vipin T. Sreedharan](https://reader036.fdocuments.in/reader036/viewer/2022071216/6047cf35fb27940305064153/html5/thumbnails/11.jpg)
Accuracy of read alignments
PALMapper
TopHat
Sequencing
‣ C. elegans
‣ 75 nt RNA-seq reads (24 million)
![Page 12: Easier Workflows & Tool comparison with oqtans+raetschlab.org/lectures/GCCoqtans2012UIC.pdfEasier Workflows & Tool comparison with oqtans+ Vipin T. Sreedharan](https://reader036.fdocuments.in/reader036/viewer/2022071216/6047cf35fb27940305064153/html5/thumbnails/12.jpg)
Intron and transcript accuracy evaluation
PALMapper
TopHat
Sequencing
mTIM
Cufflinks
‣ C. elegans
‣ 75 nt RNA-seq reads (24 million)
![Page 13: Easier Workflows & Tool comparison with oqtans+raetschlab.org/lectures/GCCoqtans2012UIC.pdfEasier Workflows & Tool comparison with oqtans+ Vipin T. Sreedharan](https://reader036.fdocuments.in/reader036/viewer/2022071216/6047cf35fb27940305064153/html5/thumbnails/13.jpg)
genesetter
oqtans+ Workflow
PALMapper Sequencing DESeq
‣ Illumina, 78 nt RNA-seq reads
‣ Columbia accession (Col-0) (1.2 million)
‣ Canary Island accession (Can-0) (4.9 million) Gan et al. 2011 Nature
‣ Multiple reference genomes and transcriptomes for A. thaliana
![Page 14: Easier Workflows & Tool comparison with oqtans+raetschlab.org/lectures/GCCoqtans2012UIC.pdfEasier Workflows & Tool comparison with oqtans+ Vipin T. Sreedharan](https://reader036.fdocuments.in/reader036/viewer/2022071216/6047cf35fb27940305064153/html5/thumbnails/14.jpg)
‣ Compute resources: 2 xxlarge
‣ Time:
‣ Alignments: 20 minutes
‣ Quantitative analysis: 10 minutes
‣ Cost on Amazon EC2: approx. $2.82
PALMapper Sequencing DESeq
genesetter
oqtans+ on AWS cloud
![Page 15: Easier Workflows & Tool comparison with oqtans+raetschlab.org/lectures/GCCoqtans2012UIC.pdfEasier Workflows & Tool comparison with oqtans+ Vipin T. Sreedharan](https://reader036.fdocuments.in/reader036/viewer/2022071216/6047cf35fb27940305064153/html5/thumbnails/15.jpg)
oqtans+: Package contents
Read Mapping Version PALMapper 0.4
BWA 0.5.7
TopHat 1.5.0
Transcript Prediction
Version
mTIM 0.2 Cufflinks 1.3.0 Trinity r2012-06-08 Scripture Beta-2 ASP 0.3
Quantification Version rQuant 2.2 rDiff 0.2 Cufflinks/Cuffdiff 1.3.0 DESeq 1.6.1 topGO 0.1 Genesetter 0.1
Read Alignment Filtering
Version
SAFT 0.2 Multi-Mapper Resolution 0.1
![Page 16: Easier Workflows & Tool comparison with oqtans+raetschlab.org/lectures/GCCoqtans2012UIC.pdfEasier Workflows & Tool comparison with oqtans+ Vipin T. Sreedharan](https://reader036.fdocuments.in/reader036/viewer/2022071216/6047cf35fb27940305064153/html5/thumbnails/16.jpg)
oqtans+ Automated Tool Deployment fabric script
How to resolve the requirements of OS specific packages ?
![Page 17: Easier Workflows & Tool comparison with oqtans+raetschlab.org/lectures/GCCoqtans2012UIC.pdfEasier Workflows & Tool comparison with oqtans+ Vipin T. Sreedharan](https://reader036.fdocuments.in/reader036/viewer/2022071216/6047cf35fb27940305064153/html5/thumbnails/17.jpg)
‣ Public compute cluster
‣ 12 nodes, 112 CPUs
‣ our Galaxy test instance
‣ All tools described here, and more!
‣ http://bioweb.me/mlb-galaxy
oqtans+ Availability: Our Server
![Page 18: Easier Workflows & Tool comparison with oqtans+raetschlab.org/lectures/GCCoqtans2012UIC.pdfEasier Workflows & Tool comparison with oqtans+ Vipin T. Sreedharan](https://reader036.fdocuments.in/reader036/viewer/2022071216/6047cf35fb27940305064153/html5/thumbnails/18.jpg)
‣ Free, open-source packages of our own tools
‣ Including Galaxy Tool Wrappers
‣ http://oqtans.org
‣ Fabric scripts to install on any Galaxy instance
‣ Community Tool Shed http://toolshed.g2.bx.psu.edu
oqtans+ Availability: Source Code
![Page 19: Easier Workflows & Tool comparison with oqtans+raetschlab.org/lectures/GCCoqtans2012UIC.pdfEasier Workflows & Tool comparison with oqtans+ Vipin T. Sreedharan](https://reader036.fdocuments.in/reader036/viewer/2022071216/6047cf35fb27940305064153/html5/thumbnails/19.jpg)
‣ Demo cloud instance with all oqtans+ tools
‣ http://cloud.oqtans.org
‣ AMI at Amazon Web Services for EC2
‣ Cloudman to launch any number of instances as a compute cluster
oqtans+ Availability: Cloud Computing
![Page 20: Easier Workflows & Tool comparison with oqtans+raetschlab.org/lectures/GCCoqtans2012UIC.pdfEasier Workflows & Tool comparison with oqtans+ Vipin T. Sreedharan](https://reader036.fdocuments.in/reader036/viewer/2022071216/6047cf35fb27940305064153/html5/thumbnails/20.jpg)
http://oqtans.org
Jonas Behr, Regina Bohnert, Philipp Drewe, Nico Görnitz, Géraldine Jean
André Kahles, Pramod Mudra, Sebastian Schultheiss, Georg Zeller, Gunnar Rätsch