Transcript of Transcriptomics, Jim Noonan, GENE 760.
- Slide 1
- Transcriptomics. Jim Noonan, GENE 760
- Slide 2
- Slide 3
- Transcriptomics
- Slide 4
- Introduction to RNA-seq
- Slide 5
- RNA-seq workflow. Martin and Wang, Nat Rev Genet 12:671 (2011);
Wang et al., Nat Rev Genet 10:57 (2009)
- Slide 6
- Illumina RNA-seq library preparation. Capture poly-A RNA with
poly-T oligo-attached beads (100 ng total) (2x). RNA quality must be
high: degradation produces 3' bias. Non-poly-A RNAs are not
recovered. Fragment mRNA. Synthesize ds cDNA. Ligate adapters.
Amplify. Generate clusters and sequence.
- Slide 7
- Ribosomal RNA subtraction: RiboMinus
- Slide 8
- Use existing gene annotation: align to genome plus annotated
splices. Depends on high-quality gene annotation. Which annotation
to use: RefSeq, GENCODE, UCSC? Isoform quantification? Identifying
novel transcripts? Differential expression. De novo transcript
assembly: assemble transcripts directly from reads. Allows
transcriptome analyses of species without reference genomes.
Quantifying relative expression levels in RNA-seq.
- Slide 9
- Mapping RNA-seq reads
- Slide 10
- Reads per kilobase of feature length per million mapped reads
(RPKM). Fragments per kilobase per million mapped reads (FPKM), for
paired-end reads. Transcripts per million (TPM). Counts per million
(CPM). Quantifying relative expression levels in RNA-seq: What is a
feature? What about genomes with poor annotation? What about
species with no sequenced genome? For a detailed comparison of
normalization methods, see: Bullard et al., BMC Bioinformatics 11:94
(2010); Robinson and Oshlack, Genome Biol 11:R25 (2010).
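
The four measures on this slide differ only in what they divide by. A minimal Python sketch, assuming a "feature" is anything with a read count and a length in bp, and assuming the number of mapped reads equals the sum of the counts passed in (both are simplifications, and the toy numbers are hypothetical). FPKM uses the same arithmetic as RPKM with fragments counted instead of reads.

```python
def cpm(counts):
    """Counts per million: scale each count by library size only."""
    total = sum(counts)
    return [c * 1e6 / total for c in counts]

def rpkm(counts, lengths_bp):
    """Reads per kilobase of feature length per million mapped reads."""
    total = sum(counts)
    # 1e9 = 1e3 (bp -> kb) * 1e6 (reads -> millions of reads)
    return [c * 1e9 / (total * length) for c, length in zip(counts, lengths_bp)]

def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then rescale to 1e6."""
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    rate_sum = sum(rates)
    return [r * 1e6 / rate_sum for r in rates]

# Hypothetical toy data: three features whose counts scale with length.
counts = [100, 200, 700]
lengths_bp = [1000, 2000, 7000]

print("CPM :", cpm(counts))
print("RPKM:", rpkm(counts, lengths_bp))
print("TPM :", tpm(counts, lengths_bp))
```

Because the toy counts are proportional to feature length, CPM differs between the three features while RPKM and TPM are identical across them. TPM values always sum to one million within a sample; RPKM values do not.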
- Slide 11
- Map reads to genome. Map remaining reads to known splice
junctions. Composite gene models. Requires good gene models.
Isoforms are ignored.
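
One common way to build a composite gene model is to take the union of the exons of every annotated isoform, which is why isoform differences disappear. The Python sketch below assumes that interpretation, uses half-open (start, end) coordinates, and runs on made-up exon coordinates; the function names are illustrative.

```python
def composite_exons(isoforms):
    """Merge the exons of all isoforms into non-overlapping intervals.

    isoforms: list of isoforms, each a list of (start, end) exons,
    half-open coordinates on the same strand and chromosome.
    """
    intervals = sorted(exon for isoform in isoforms for exon in isoform)
    merged = []
    for start, end in intervals:
        if merged and start <= merged[-1][1]:
            # Overlapping or adjacent exon: extend the previous interval.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def composite_length(isoforms):
    """Total exonic length of the composite model, in bp."""
    return sum(end - start for start, end in composite_exons(isoforms))

# Hypothetical gene with two isoforms.
isoform_a = [(100, 200), (300, 400)]
isoform_b = [(150, 250), (300, 350), (500, 600)]

print(composite_exons([isoform_a, isoform_b]))
print(composite_length([isoform_a, isoform_b]))
```

The composite length is what would serve as the feature length in an RPKM-style calculation for the gene as a whole.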
- Slide 12
- Which gene annotation to use?
- Slide 13
- Martin and Wang, Nat Rev Genet 12:671 (2011). Splice-aware short
read aligners
- Slide 14
- The Tuxedo suite. Trapnell et al., Nature Protocols 7:562
(2012)
- Slide 15
- Cufflinks: ab initio transcript assembly. Trapnell et al., Nat.
Biotechnology 28:511 (2010). Step 1: map reads to reference
genome.
- Slide 16
- Trapnell et al., Nat. Biotechnology 28:511 (2010). Isoform
abundances estimated by maximum likelihood. Cufflinks: ab initio
transcript assembly
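
To make "estimated by maximum likelihood" concrete, here is a generic expectation-maximization sketch in Python for splitting reads among isoforms. It is not Cufflinks' actual model, which also accounts for fragment length distributions and other effects; it assumes each read is simply marked compatible or incompatible with each isoform and that reads fall uniformly along an isoform's effective length. All names and numbers are hypothetical.

```python
def em_isoform_fractions(compat, eff_lengths, n_iter=200):
    """Estimate the fraction of reads originating from each isoform.

    compat: one set per read, holding the indices of compatible isoforms.
    eff_lengths: effective length of each isoform, in bp.
    """
    k = len(eff_lengths)
    theta = [1.0 / k] * k  # start from equal read fractions
    for _ in range(n_iter):
        expected = [0.0] * k
        # E-step: assign each read fractionally to its compatible isoforms.
        for iso_set in compat:
            weights = {i: theta[i] / eff_lengths[i] for i in iso_set}
            norm = sum(weights.values())
            for i, w in weights.items():
                expected[i] += w / norm
        # M-step: new read fractions are the normalized expected counts.
        total = sum(expected)
        theta = [e / total for e in expected]
    return theta

# Hypothetical gene: two isoforms of equal effective length,
# 30 reads unique to isoform 0, 10 unique to isoform 1, 60 shared.
compat = [{0}] * 30 + [{1}] * 10 + [{0, 1}] * 60
theta = em_isoform_fractions(compat, eff_lengths=[1000, 1000])
print(theta)
```

The shared reads end up divided in the same 3:1 ratio as the uniquely assignable reads, so the estimate converges towards 0.75 and 0.25.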
- Slide 17
- Differential expression. Garber et al., Nat Methods 8:469
(2011)
- Slide 18
- Differential expression. Garber et al., Nat Methods 8:469 (2011).
Popular methods: edgeR, DESeq, Cuffdiff. Require count data. Assume
negative binomial or Poisson distribution.
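
The choice between the two distributions comes down to the variance: under a Poisson model the variance equals the mean, while the negative binomial adds a dispersion term, variance = mean + dispersion * mean^2. The Python sketch below estimates dispersion for one gene by a simple method of moments. That estimator is an assumption made for illustration; edgeR and DESeq use more elaborate estimators that share information across genes. The replicate counts are hypothetical and assumed to be already comparable across libraries.

```python
from statistics import mean, variance

def nb_variance(mu, dispersion):
    """Negative binomial variance; dispersion = 0 recovers the Poisson case."""
    return mu + dispersion * mu * mu

def moment_dispersion(counts):
    """Method-of-moments dispersion estimate from replicate counts."""
    mu = mean(counts)
    if mu == 0:
        return 0.0
    var = variance(counts)  # sample variance (n - 1 denominator)
    # Excess variance beyond the Poisson expectation, floored at zero.
    return max(0.0, (var - mu) / (mu * mu))

# Hypothetical counts for one gene in three biological replicates.
replicates = [50, 100, 150]
disp = moment_dispersion(replicates)
print("dispersion:", disp)
print("NB variance at the observed mean:", nb_variance(mean(replicates), disp))
```

A dispersion near zero means the replicates vary about as much as Poisson sampling alone would predict; larger values indicate extra biological or technical variability.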
- Slide 19
- Wang et al., Nat Rev Genet 10:57 (2009). What depth of sequencing
is required to characterize a transcriptome?
- Slide 20
- Considerations. Gene length: long genes are detected before
short genes. Expression level: high expressors are detected before
low expressors. Complexity of the transcriptome: tissues with many
cell types require more sequencing. Feature type: composite gene
models, common isoforms, rare isoforms. Detection vs.
quantification: obtaining confident expression level estimates
(e.g., stable RPKMs) requires greater coverage.
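
The length and expression effects can be illustrated with a simple sampling model. The Python sketch below assumes the number of reads from a transcript is Poisson distributed with mean depth * fraction, where fraction is the share of all sequenced fragments that come from that transcript (roughly proportional to abundance times length). The Poisson assumption and all numbers are illustrative, not taken from the slides.

```python
import math

def detection_probability(depth, fraction, min_reads=1):
    """Probability of sampling at least min_reads reads from a transcript.

    depth: total number of mapped reads.
    fraction: share of all fragments that derive from the transcript.
    """
    lam = depth * fraction  # expected read count
    p_below = sum(
        math.exp(-lam) * lam ** j / math.factorial(j) for j in range(min_reads)
    )
    return 1.0 - p_below

# Hypothetical rare transcript contributing one fragment per million.
rare = 1e-6
for depth in (1_000_000, 10_000_000, 50_000_000):
    detect = detection_probability(depth, rare, min_reads=1)
    quantify = detection_probability(depth, rare, min_reads=20)
    print(depth, round(detect, 3), round(quantify, 3))
```

Requiring 20 reads instead of 1 stands in for the detection versus quantification distinction: at any given depth the second probability is lower, so a stable estimate needs deeper sequencing than mere detection.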
- Slide 21
- Applications of RNA-seq: characterizing transcriptome complexity;
alternative splicing; differential expression analysis; gene- and
isoform-level expression comparisons; novel RNA species; lincRNAs;
pervasive transcription; allele-specific expression; effect of
genetic variation on gene expression; imprinting; RNA editing; novel
events.
- Slide 22
- Wang et al., Nature 456:470 (2008). Alternative isoform regulation
in human tissue transcriptomes
- Slide 23
- Wang et al., Nature 456:470 (2008). Diversity of alternative
splicing events in human tissues
- Slide 24
- Novel RNA species: annotating lincRNAs. Guttman et al., Nat
Biotechnol 28:503 (2010)
- Slide 25
- Small RNA sequencing. Rother and Meister, Biochimie 93:1905
(2011)
- Slide 26
- Small RNA sequencing. Rother and Meister, Biochimie 93:1905
(2011). microRNAs: ~22 nt; piRNAs: ~25-30 nt
- Slide 27
- Small RNA sequencing: Illumina protocol. microRNAs: ~22 nt;
piRNAs: ~25-30 nt
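
The size ranges above suggest a quick first-pass summary of a small RNA library: bin adapter-trimmed reads by length. In the Python sketch below, the 25-30 nt piRNA window follows the slide, while the 20-24 nt window around ~22 nt for microRNAs is an assumption, and the reads are made-up sequences. Length alone does not establish that a read is a functional small RNA.

```python
from collections import Counter

def size_class(read, mirna_range=(20, 24), pirna_range=(25, 30)):
    """Label a trimmed read by length; ranges are inclusive, in nt."""
    n = len(read)
    if mirna_range[0] <= n <= mirna_range[1]:
        return "miRNA-sized"
    if pirna_range[0] <= n <= pirna_range[1]:
        return "piRNA-sized"
    return "other"

# Hypothetical adapter-trimmed reads of 22, 28 and 40 nt.
reads = ["ACGT" * 5 + "AC", "ACGT" * 7, "ACGT" * 10]

length_profile = Counter(size_class(r) for r in reads)
print(length_profile)
```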
- Slide 28
- Distinguishing functional small RNAs from noise. Structural
similarity to known small RNAs: miRDeep, miRCat. Binding to small
RNA processing proteins. Genetic requirements for processing.
Friedländer et al., Nat Biotechnology 26:407 (2008)
- Slide 29
- Measuring translation by ribosome footprinting. Ingolia, Nat Rev
Genet 15:205 (2014)
- Slide 30
- Measuring translation by ribosome footprinting. Ingolia et al.,
Science 324:218 (2009)
- Slide 31
- Measuring translation by ribosome footprinting. Ingolia et al.,
Science 324:218 (2009)
- Slide 32
- Some lincRNAs are translated in mouse ES cells. Ingolia et al.,
Cell 147:789 (2011)
- Slide 33
- Detecting RNA-protein interactions: CLIP. Rother and Meister,
Biochimie 93:1905 (2011)
- Slide 34
- Enhancer-associated RNAs (eRNAs). Ren B., Nature 465:173
(2010)
- Slide 35
- Enhancer-associated RNAs (eRNAs). Kim et al., Nature 465:182
(2010)
- Slide 36
- How much of the genome is transcribed? Kellis et al., Proc.
Natl. Acad. Sci. USA 111:6131 (2014). Estimates from ENCODE.