26 nov2013seminar

Genome Informatics – Cold Spring Harbor, Oct 30th – Nov 2nd

description

Genome Informatics 2013

Transcript of 26 nov2013seminar

Page 1: 26 nov2013seminar

Genome Informatics – Cold Spring Harbor, Oct 30th – Nov 2nd

Page 2: 26 nov2013seminar

#GI2013

163 posters / 46 talks / 402 participants

ENCODE

poster

Talk

Others

http://www.gersteinlab.org/

Page 3: 26 nov2013seminar
Page 4: 26 nov2013seminar

Databases, Data Mining, Visualization and Curation
Transcriptomics, Alternative Splicing and Gene Prediction
Sequencing Pipeline and Assembly
Comparative and Evolutionary Genomics
Epigenomics and Non-coding Genome
Population and Personal Genomics

Highlights

Page 5: 26 nov2013seminar

MedSavant: an integrated solution for storage, annotation, filtration, prioritization and visual inspection of variants.
◦ 1000 Genomes Project
◦ FORGE consortium: 425 individuals

1. Databases…

Mendel App

Finding rare disease genes

Page 6: 26 nov2013seminar

The Genome Institute at Washington University – Genome Modeling System (GMS).
◦ TCGA
◦ ICGC (International Cancer Genome Consortium)
◦ 1000 Genomes Project
◦ PCGP (Pediatric Cancer Genome Project)

Databases…

https://github.com/genome/gms

Page 7: 26 nov2013seminar

RNA-seq (RNA sequencing), also called "Whole Transcriptome Shotgun Sequencing" (WTSS), is a technology that uses next-generation sequencing to reveal a snapshot of which RNAs are present, and in what quantity, at a given moment in time.

RNAseq

Page 8: 26 nov2013seminar

Mario Stanke at the Institute of Mathematics and Computer Science, Germany
◦ Synteny-based approach to gene prediction

Method for isolating ribosome-bound mRNA
◦ Tufts University, Boston, MA.
◦ Can tell specifically which mRNAs are actually translated.

2. Transcriptomics, Alternative splicing and Gene Prediction

Page 9: 26 nov2013seminar

Data Analysis and Coordination Center (DACC).
◦ For HMP (Human Microbiome Project) data analysis
◦ CloVR virtual machine
◦ http://www.Hmpdacc.org/tools_protocols/tools_protocols.php

In silico genome subtraction tool
◦ MGC (Microbial Genome Comparison tool) – a stand-alone Java package

2. Transcriptomics, Alternative splicing and Gene Prediction

Page 10: 26 nov2013seminar

Studying co-expression networks with microarrays and RNAseq
◦ 3320 microarray experiments vs 813 RNAseq samples with 8 datasets
◦ Major biological difference
◦ Hundreds and thousands of samples are necessary to build a robust RNAseq dataset
◦ RNAseq co-expression analysis is necessary.
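As an illustration of what a co-expression network is, here is a minimal sketch, assuming Pearson correlation as the similarity measure and a fixed cutoff for drawing edges. The talk's actual method is not stated in the slides; the gene names, expression values and the 0.9 threshold are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / (sd_x * sd_y)


def coexpression_edges(expression, threshold=0.9):
    """Return (gene1, gene2, r) for every gene pair with |r| >= threshold.

    `expression` maps a gene name to its expression values across samples.
    The threshold is an illustrative assumption, not a recommended value.
    """
    edges = []
    for gene1, gene2 in combinations(sorted(expression), 2):
        r = pearson(expression[gene1], expression[gene2])
        if abs(r) >= threshold:
            edges.append((gene1, gene2, round(r, 3)))
    return edges


# Hypothetical toy data: three genes measured in four samples.
toy_expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 1.0, 3.0, 2.0],
}
edges = coexpression_edges(toy_expression)
print(edges)
```

With only four samples a correlation is very unstable, which is one way to read the slide's point that many samples are needed before an RNAseq co-expression network becomes robust.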

2. Transcriptomics, Alternative splicing and Gene Prediction

Page 11: 26 nov2013seminar

RNAseq analysis pipeline using Ion Torrent data.
◦ TopHat-Cufflinks
◦ STAR-Cufflinks
◦ TMAP-Cufflinks
Expression levels are zero for the STAR and TMAP aligners

2. Transcriptomics, Alternative splicing and Gene Prediction

• Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TR: STAR: ultrafast universal RNA-seq aligner. Bioinformatics 2013, 29:15-21.

• TMAP is a short read aligner specifically tuned for data from the Ion Torrent PGM

Page 12: 26 nov2013seminar

• FPKM vs RSEM and Cufflinks
• Lior Pachter (@lpachter, 1 Nov), to @konrad_jk @tuuliel @joe_pickrell @yarbsalocin: "I'm sure RSEM, Sailfish, and many other tools can handle it as well."
• http://liorpachter.wordpress.com/seq/

RNAseq

Mortazavi, A., Williams, B.A., McCue, K., Schaeffer, L. & Wold, B. Mapping and quantifying mammalian transcriptomes by RNA-seq. Nat. Methods 5, 621–628 (2008).

ICEseq

Page 13: 26 nov2013seminar

Trapnell C, Hendrickson DG, Sauvageau M, Goff L, Rinn JL, Pachter L. Differential analysis of gene regulation at transcript resolution with RNA-seq. Nat Biotechnol. 2013 Jan;31(1):46-53. doi: 10.1038/nbt.2450. Epub 2012 Dec 9. PubMed PMID: 23222703.

Trapnell, C. et al. Transcript assembly and quantification by RNA-seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat. Biotechnol. 28, 511–515 (2010).

Dissects the difference between a gene and a transcript; emphasizes differential expression of isoforms in a given condition.

RPKM vs FPKM
RPKM = (# of mapped reads) / (length of transcript in kilobases) / (millions of mapped reads)
FPKM = (# of fragments) / (length of transcript in kilobases) / (millions of mapped fragments)
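The two normalizations can be worked through with a small sketch. It assumes the standard definitions (counts per kilobase of transcript per million mapped reads or fragments); the function names and example numbers are hypothetical.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads Per Kilobase of transcript per Million mapped reads."""
    kilobases = transcript_length_bp / 1_000
    millions = total_mapped_reads / 1_000_000
    return read_count / kilobases / millions


def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments.

    Same arithmetic as RPKM, but a paired-end fragment (two mates) is
    counted once instead of twice.
    """
    kilobases = transcript_length_bp / 1_000
    millions = total_mapped_fragments / 1_000_000
    return fragment_count / kilobases / millions


# Hypothetical single-end example: 500 reads on a 2,000 bp transcript,
# 10 million mapped reads in the library.
single_end = rpkm(500, 2_000, 10_000_000)

# The same library counted as paired-end fragments: each pair of mates is
# one fragment, so both the numerator and the library size are halved.
paired_end = fpkm(250, 2_000, 5_000_000)

print(single_end, paired_end)  # 25.0 25.0
```

In this idealized case, where every fragment contributes exactly two mapped reads, the two values coincide; they diverge when only one mate of a pair maps.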

Cufflinks

Page 14: 26 nov2013seminar

• rsem-prepare-reference
• rsem-calculate-expression
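The two RSEM steps named on this slide are typically run in sequence: build a reference once, then quantify each sample against it. The sketch below only assembles the command lines; all file names, the reference name and the thread count are hypothetical placeholders, and the options shown (`--gtf`, `--paired-end`, `-p`) should be checked against the installed RSEM version.

```python
import subprocess  # only needed if the commands are actually executed


def prepare_reference_cmd(genome_fasta, annotation_gtf, reference_name):
    """Command for step 1: extract transcripts and build the RSEM reference."""
    return ["rsem-prepare-reference", "--gtf", annotation_gtf,
            genome_fasta, reference_name]


def calculate_expression_cmd(reads_1, reads_2, reference_name, sample_name,
                             threads=4):
    """Command for step 2: estimate expression from paired-end reads."""
    return ["rsem-calculate-expression", "--paired-end", "-p", str(threads),
            reads_1, reads_2, reference_name, sample_name]


# Hypothetical inputs; replace with real paths before running.
step1 = prepare_reference_cmd("genome.fa", "annotation.gtf", "ref/example")
step2 = calculate_expression_cmd("sample_1.fastq", "sample_2.fastq",
                                 "ref/example", "sample")

for cmd in (step1, step2):
    print(" ".join(cmd))
    # Uncomment to execute (requires RSEM and an aligner on the PATH):
    # subprocess.run(cmd, check=True)
```

Building the argument lists separately from running them makes it easy to inspect or log the exact commands before committing compute time.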

Page 15: 26 nov2013seminar

SAILFISH

RSEM or eXpress

Sailfish: Alignment-free Isoform Quantification from RNA-seq Reads using Lightweight Algorithms

bio.math.berkeley.edu/eXpress/simdata

Page 16: 26 nov2013seminar

A -> I editing -> inosine is sequenced as guanine
◦ Mapping to the genome may correct this, but how can an edited site be distinguished from a SNP?

Adapted ICE (Inosine Chemical Erasing) using Cyanoethylation.

Computational pipeline to analyze this.
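The logic of such a pipeline can be sketched as follows. Cyanoethylation modifies inosine so that edited transcripts are no longer read through as G, so the A-to-G signal at a true editing site is "erased" in the treated sample, whereas an A-to-G signal that is not due to inosine persists. This is a minimal sketch under that assumption; the thresholds, function names and read counts are hypothetical and are not taken from the talk.

```python
def g_fraction(a_reads, g_reads):
    """Fraction of reads showing G at a site whose genomic reference is A."""
    total = a_reads + g_reads
    return g_reads / total if total else 0.0


def classify_site(untreated, treated, min_g=0.10, erase_ratio=0.5):
    """Classify one site from (A reads, G reads) counts in each sample.

    untreated: counts without cyanoethylation
    treated:   counts after cyanoethylation
    min_g and erase_ratio are illustrative thresholds (assumptions).
    """
    g_untreated = g_fraction(*untreated)
    g_treated = g_fraction(*treated)
    if g_untreated < min_g:
        return "no A-to-G signal"
    if g_treated <= g_untreated * erase_ratio:
        # The G signal disappears after treatment: consistent with inosine.
        return "candidate A-to-I editing site"
    # The G signal survives treatment: not explained by inosine.
    return "possible SNP or other A-to-G mismatch"


# Hypothetical read counts as (A reads, G reads).
print(classify_site(untreated=(60, 40), treated=(95, 5)))
print(classify_site(untreated=(50, 50), treated=(48, 52)))
print(classify_site(untreated=(99, 1), treated=(100, 0)))
```

A real pipeline would also need coverage filters and a statistical test rather than fixed cutoffs.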

ICEseq

Page 17: 26 nov2013seminar

www.omicsmaps.com

IICB features in Omics map…

Page 18: 26 nov2013seminar
Page 19: 26 nov2013seminar
Page 20: 26 nov2013seminar

Thanks

We created a virtual machine…

Akash Gupta

Arijit Panda

Madhu C.

Deeksha Singh

Arpita Ghorai

Subhadeep Das

Neha Sanghi

http://front.math.ucdavis.edu/arXiv