Case Study: RNASeq Sequencing
Chirag Nepal
23/03/2012
Outline
Introduction
Different kinds of experiments for transcriptome studies
RNA-Seq
ChIP-Seq
CAGE-Seq
Case study: Hox-c cluster
Conclusion
High throughput sequencing (HTS)
HTS parallelizes the sequencing process, producing millions of sequences at once.
Low cost, high throughput
Various sequencing machines
454 pyrosequencing
SOLiD sequencing
Illumina (Solexa) sequencing
• Applications
What is RNA-seq
RNA-seq is a revolutionary tool for transcriptomics: it uses high-throughput sequencing technologies to sequence cDNA, giving information about a sample's RNA content.
Major findings from RNA-Seq data
• Gene expression levels
• Alternative splicing/isoforms
• Intron/exon boundaries
• De novo gene predictions
Library preparation for RNA-seq
Nature Methods 5, 621-628 (2008)
A typical pipeline for RNASeq analysis
Map reads to genome
Map splice reads
Construct gene model
Visualization
Differential expression
Multiple tools are available for each analysis
Use of multiple samples
Input sequence
• One experiment can produce tens of millions of such reads.
• Output from Illumina sequencing
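To make the input concrete, here is a minimal sketch of reading sequencer output in Python. It assumes the reads are stored in FASTQ format (four lines per read: header, sequence, separator, quality), which the slides do not state explicitly; the function name `read_fastq` and the two example reads are hypothetical.

```python
from io import StringIO

def read_fastq(handle):
    """Yield (read_id, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            break  # end of file
        seq = handle.readline().rstrip()
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip()
        yield header[1:], seq, qual  # drop the leading '@' from the header

# Two made-up reads standing in for a real sequencing run
example = StringIO("@read1\nACGTACGT\n+\nIIIIIIII\n@read2\nTTGCA\n+\nIIIII\n")
records = list(read_fastq(example))
```

A real run would pass an open file handle instead of the in-memory example, and the generator keeps memory use flat even with tens of millions of reads.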
Mapping of reads back to the genome
Nature Methods 5, 621-628 (2008)
• Challenges
• Short reads (36-76 bp)
• Multi-mapping reads
• Reads spanning exon boundaries
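The mapping challenges can be illustrated with a toy example. The sketch below uses exact string matching on a made-up genome, which is far simpler than what a real aligner does; all sequences and the helper `find_all` are hypothetical.

```python
def find_all(reference, read):
    """Return every start position where read matches reference exactly."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        start = reference.find(read, start + 1)
    return positions

# Toy gene: two exons separated by an intron, plus a second copy of exon 1
exon1 = "ACGTTGCA"
intron = "GTAAGTTTTCAG"
exon2 = "CCATGGAT"
genome = exon1 + intron + exon2 + "NNNN" + exon1
transcript = exon1 + exon2  # spliced mRNA, intron removed

unique_read = "CCATGG"      # lies inside exon 2 only
multi_read = "CGTTGC"       # lies inside exon 1, which occurs twice
junction_read = "TGCACCAT"  # end of exon 1 joined to start of exon 2

unique_hits = find_all(genome, unique_read)
multi_hits = find_all(genome, multi_read)
junction_genome_hits = find_all(genome, junction_read)
junction_transcript_hits = find_all(transcript, junction_read)
```

The multi-mapping read matches two genomic positions, and the junction read matches the spliced transcript but has no contiguous match in the genome, which is why spliced reads need a dedicated mapping step.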
UCSC genome browser
http://genome.ucsc.edu/
• Step 1: Choose your species of interest
• Step 2: Click custom tracks to add new tracks
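A custom track is plain text that can be pasted or uploaded on the custom tracks page. As a minimal sketch, assuming coverage is supplied in the bedGraph format (a `track` header line followed by tab-separated chrom, start, end, value lines), the text could be generated as follows; the function `to_bedgraph` and the coverage values are hypothetical.

```python
def to_bedgraph(track_name, intervals):
    """Format (chrom, start, end, value) tuples as bedGraph custom-track text."""
    lines = [f'track type=bedGraph name="{track_name}"']
    for chrom, start, end, value in intervals:
        # bedGraph coordinates are 0-based, with the end position excluded
        lines.append(f"{chrom}\t{start}\t{end}\t{value}")
    return "\n".join(lines) + "\n"

# Made-up coverage values for illustration only
intervals = [
    ("chr1", 208425000, 208425100, 12),
    ("chr1", 208425100, 208425200, 30),
]
track_text = to_bedgraph("Ad-lib Rep-1", intervals)
```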
Read coverage
[Figure: UCSC genome browser view of chr1:208,425,000-208,430,000 (2 kb scale bar) showing RefSeq Genes and Non-Rat RefSeq Genes (Mus Malat1, Homo MALAT1) with read coverage tracks for Ad-lib Rep-1 (y-axis 1 to 918) and Vehicle Rep-1 (y-axis 1 to 191)]
• Read counts give an indication of gene expression level
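Read coverage is simply the number of mapped reads overlapping each base. A minimal sketch, assuming reads are given as half-open (start, end) intervals on a single chromosome; the function `coverage` and the three reads are hypothetical.

```python
def coverage(reads, region_start, region_end):
    """Count how many reads overlap each base of [region_start, region_end)."""
    depth = [0] * (region_end - region_start)
    for start, end in reads:
        # Clip each read to the region before counting
        for pos in range(max(start, region_start), min(end, region_end)):
            depth[pos - region_start] += 1
    return depth

# Three made-up mapped reads as (start, end) intervals
reads = [(0, 4), (2, 6), (3, 7)]
depth = coverage(reads, 0, 8)
peak = max(depth)
```

The peak of such a profile is what the y-axis of a browser coverage track reports, so comparing the same region between two samples gives a first impression of relative expression.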
Differential read count
Spliced reads
• Accounting for spliced reads gives more accurate intron/exon boundaries
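One way to see how spliced reads pin down boundaries: in SAM-format alignments, the CIGAR operator `N` marks a stretch of reference skipped by the read, which for RNA-Seq corresponds to an intron. The sketch below assumes alignments are available as a 0-based start position plus CIGAR string; the function `introns_from_cigar` and the example alignments are hypothetical.

```python
import re

def introns_from_cigar(start, cigar):
    """Return (intron_start, intron_end) pairs implied by N operations."""
    introns = []
    pos = start
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        length = int(length)
        if op == "N":
            introns.append((pos, pos + length))
        if op in "MDN=X":
            pos += length  # only these operations advance along the reference
    return introns

# A 36 bp read split 20/16 across a 100 bp intron, and an unspliced read
spliced = introns_from_cigar(1000, "20M100N16M")
unspliced = introns_from_cigar(1000, "36M")
```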
Merge the transcripts
[Figure: UCSC genome browser view of chr1:47,180,000-47,200,000 (10 kb scale bar) showing merged transcripts TCONS_00002768 to TCONS_00002770, TCONS_00001997 to TCONS_00001999, TCONS_00002152 and TCONS_00002153. Tracks: UCSC Known Genes (based on UniProt, RefSeq, and GenBank mRNA), Gene model, Gene model - Vehicle, Gene model - Adlib]
• Differentially spliced
• Alternative transcription start site
PMID: 22383036
Have a look at the data in the browser.
RNASeq data between different replicates reveal a lot of dynamics
PMID: 22383036
Differential expression between samples
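As a rough illustration of comparing expression between two samples, the sketch below normalizes read counts to counts per million (CPM) and takes a log2 ratio. This is only a simple fold-change calculation under assumed inputs, not the statistical testing that dedicated differential expression tools perform; the function names, the pseudocount, and the counts are hypothetical.

```python
import math

def cpm(count, library_size):
    """Scale a raw read count to counts per million mapped reads."""
    return count * 1_000_000 / library_size

def log2_fold_change(count_a, lib_a, count_b, lib_b, pseudocount=1.0):
    """log2 ratio of normalized expression in sample A over sample B."""
    a = cpm(count_a, lib_a) + pseudocount  # pseudocount avoids division by zero
    b = cpm(count_b, lib_b) + pseudocount
    return math.log2(a / b)

# Made-up counts for one gene in two samples with equal library sizes
lfc = log2_fold_change(400, 10_000_000, 100, 10_000_000)
no_change = log2_fold_change(250, 5_000_000, 500, 10_000_000)
```

Normalizing by library size matters because a sample sequenced twice as deeply yields twice the raw counts without any change in expression, as the `no_change` case shows.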
Gene of your interest
Questions?