Single Cell RNA Sequencing - GitHub Pages · Single cell RNA-seq This semester I Learned about and...
Transcript of Single Cell RNA Sequencing - GitHub Pages · Single cell RNA-seq This semester I Learned about and...
Single Cell RNA SequencingAnalysis and Applications
Jacob M. Maronge
Department of StatisticsUniversity of Wisconsin–Madison
https://jmmaronge.github.io/
� jmmaronge
7 @jmmaronge
Why RNA sequencing?Central dogma of molecular biology
DNA → mRNA → Proteins (→ Traits)
I Hard to measure proteinsI Measure mRNA as an attempt to get at traitsI 2 ways to measure mRNA gene expression
abundance: Bulk and Single Cell
RNA sequencingDifference between bulk and single cell
From: http://web.stanford.edu/class/cs262/presentations/lecture12.pdf
Single cell RNA-seqData structure
cell1,1 cell2,1 ... cellj1,1 cell1,2 cell2,2 ... cellj2,2
gene1 n1,1,1 n1,2,1 . . . n1,j1,1 n1,1,2 n1,2,2 . . . n1,j2,2
gene2 n2,1,1 n2,2,1 . . . n2,j1,1 n2,1,2 n2,2,2 . . . n2,j2,2...
......
......
......
......
genei ni,1,1 ni,2,1 . . . ni,j1,1 ni,1,2 ni,2,2 . . . ni,j2,2
I ni,j,k refers to the count of abundance of gene
expression in gene i , cell j , and condition k .
Single cell RNA-seqThis semester
I Learned about and found a problem in astatistical analysis technique for single cellRNA sequencing (scDD)
I Learned about the problem of normalization insingle cell RNA sequencing.
I Performed a literature review for the problem ofidentifying cell subpopulations.
Questions for Statistical AnalysisDifferential Expression
I Scientific collaborator comes in - wants to know ifthere is a difference in gene expression for theirfavorite gene across conditions.
AnalysisChallenges for Single Cell Data
I Single cell data present more challenges (lots ofzeros, multiple peaks, etc.)
Single Cell Differential Distribution (scDD)
From Korthauer et al. (2016)
I scDD uses 2 stage approach to sort genes into categories: 1.)run permutation test to see if gene is different acrossconditions; 2.) use Dirichlet Process Mixture Model to sortinto one of these 4 categories.
Potential Problem?Classified as DM – is it really?
I As a result, a min.size parameter was implementedinto the scDD R package to put a threshold on therequired number of obs. to be called a cluster.
Further analysisDigging deeper
I What if we look at the same genes in groups of cellsand try to classify subpopulations of these cells?
I Ex. pluripotent stem cells: These cells have the abilityto turn into blood cells, lung tissue etc.
I Goal: Figure out which pluripotent cells are morelikely to become blood, lung, etc.
I Turns out there’s many methods implemented forsolving similar problems.
Possible Further Direction?ATAC-Seq
I Short for: Assay for Transposase-AccessibleChromatin with high throughput sequencing(Buenrostro et al. (2016)).
I Chromatin is found in cells: consists of DNA, protein,and RNA.
I ATAC-Seq looks at where chromatin is accessible tobe transcribed.
I Related to epigenetics - basic idea: actions affectwhich genes are expressed.
Thank you!Special Thanks
Kendziorski LabI Rhonda BacherI Jeea ChoiI Prof. Christina KendziorskiI Ziyue Wang
References
I Buenrostro J, Wu B, Chang H, Greenleaf W. ATAC-seq: AMethod for Assaying Chromatin Accessibility Genome-Wide.Current protocols in molecular biology / edited by Frederick MAusubel . [et al]. 2015;109:21.29.1-21.29.9.doi:10.1002/0471142727.mb2129s109.
I Korthauer KD, Chu LF, Newton MA, Li Y, Thomson J, Stewart R,Kendziorski C. A statistical approach for identifying differentialdistributions in single-cell RNA-seq experiments. GenomeBiology. 2016 Oct 25;17(1):222.