Biases in RNA- Seq data October 30, 2013 NBIC Advanced RNA- Seq course
Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed...
Transcript of Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed...
![Page 1: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/1.jpg)
Single Cell RNA-seq Data Integration
Ahmed MahfouzLeiden Computational Biology Center, LUMC
Delft Bioinformtaics Lab, TU Delft
![Page 2: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/2.jpg)
Why integrate?
Donor 1Donor 2Donor 3Donor 4
Same tissue from different donors
SickHealthy
Cell Type 1Cell Type 2Cell Type 3
Cross condition comparisons
2
![Page 3: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/3.jpg)
Building a cell atlas8 maps of the human pancreas
3
![Page 4: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/4.jpg)
Building a cell atlas8 maps of the human pancreas
4
![Page 5: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/5.jpg)
Confounders and batch effects
1. Technical variability• Changes in sample quality/processing
• Library prep or sequencing technology
• ‘Experimental reality’
Technical ‘batch effects’ confound downstream analysis
2. Biological variability• Patient differences
• Environmental/genetic perturbation
• Evolution! (cross-species analysis)
Biological ‘batch effects’ confound comparisons of scRNA-seq data
5Shaham et al. (https://doi.org/10.1093/bioinformatics/btx196)
![Page 6: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/6.jpg)
Confounders and batch effects
6Hicks et al. (https://doi.org/10.1093/biostatistics/kxx053)
Don’t design your experiment like this!!!
Confounded design Not confounded design
Good experimental design does not remove batch effects, it prevents them from biasing your results.
![Page 7: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/7.jpg)
Outline
• Single cell batch Correction methods:
• Performance assessment
• Sample multiplexing
• Simultaneous mRNA and protein profiling: REAP-seq and CITE-seq
7
![Page 8: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/8.jpg)
Batch correction methods
• Many good options have been developed for bulk RNA-seq data:• RUVseq() or svaseq()
• Linear models with e.g. removeBatchEffect() in limma or scater
• ComBat() in sva
• …
• But bulk RNA-seq methods make modelling assumptions that are likely to be violated in scRNAseq data (do they?)
8
![Page 9: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/9.jpg)
Batch correction methods
• MNNcorrect (https://doi.org/10.1038/nbt.4091)
• CCA + anchors (Seurat v3) (https://doi.org/10.1101/460147)
• CCA + dynamic time warping (Seurat v2) (https://doi.org/10.1038/nbt.4096)
• LIGER (https://doi.org/10.1101/459891)
• Harmony (https://doi.org/10.1101/461954)
• Conos (https://doi.org/10.1101/460246)
• Scanorama (https://doi.org/10.1101/371179)
• scMerge (https://doi.org/10.1073/pnas.1820006116)
• …
9
Two broad strategies:• Joint dimension reduction• Graph-based joint clustering
![Page 10: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/10.jpg)
Batch correction methods
• MNNcorrect (https://doi.org/10.1038/nbt.4091)
• CCA + anchors (Seurat v3) (https://doi.org/10.1101/460147)
• CCA + dynamic time warping (Seurat v2) (https://doi.org/10.1038/nbt.4096)
• LIGER (https://doi.org/10.1101/459891)
• Harmony (https://doi.org/10.1101/461954)
• Conos (https://doi.org/10.1101/460246)
• Scanorama (https://doi.org/10.1101/371179)
• scMerge (https://doi.org/10.1073/pnas.1820006116)
• …
10
Two broad strategies:• Joint dimension reduction• Graph-based joint clustering
![Page 11: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/11.jpg)
Mutual Nearest Neighbors (MNN)
11Haghverdi et al. (https://doi.org/10.1038/nbt.4091)
![Page 12: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/12.jpg)
Mutual Nearest Neighbors (MNN)
12
![Page 13: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/13.jpg)
Mutual Nearest Neighbors (MNN)
13
![Page 14: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/14.jpg)
Mutual Nearest Neighbors (MNN)
14
![Page 15: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/15.jpg)
Mutual Nearest Neighbors (MNN)
15
![Page 16: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/16.jpg)
Mutual Nearest Neighbors (MNN)
16
![Page 17: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/17.jpg)
Mutual Nearest Neighbors (MNN)
17
MEPs: megakaryocyte–erythrocyte progenitorsGMPs: granulocyte–monocyte progenitorsCMPs: common myeloid progenitors
![Page 18: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/18.jpg)
Mutual Nearest Neighbors (MNN)
• Pooling experiments -> increased statistical power
18
![Page 19: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/19.jpg)
CCA + anchors (Seurat v3)
1. Find corresponding cells across datasets
2. Compute a data adjustment based on correspondences between cells
3. Apply the adjustment
Stuart et al. (https://doi.org/10.1101/460147)
Query Reference
19
![Page 20: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/20.jpg)
Principle component analysis
Werner et al. (https://doi.org/10.1371/journal.pone.0113083)20
![Page 21: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/21.jpg)
Finding corresponding cellsCanonical correlation analysis and normalization
PC1
PC
2
CC1C
C2
Batch 1Batch 2
CCA captures correlated sources of variation between two datasets 21
![Page 22: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/22.jpg)
Finding corresponding cellsCanonical correlation analysis and normalization
L2-normalization corrects for differences in scale
Seurat(v3)::FindIntegrationAnchors
22
![Page 23: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/23.jpg)
Finding corresponding cellsAnchors: mutual nearest neighbors
Seurat(v3)::FindIntegrationAnchors
23
![Page 24: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/24.jpg)
Finding corresponding cellsData integration
1. Calculate the matrix 𝐵, where each column represents the difference between the two expression vectors for every pair of anchor cells 𝑎
2. Construct a weight matrix 𝑊 that defines the strength of association between each query cell 𝑐, and each anchor 𝑖
3. Calculate a transformation matrix 𝐶 using the previously computed weights matrix and the integration matrix as
4. Subtract the transformation matrix 𝐶 from the original expression matrix 𝑌to produce the integrated expression matrix 𝑌
Seurat(v3)::IntegrateData
24
![Page 25: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/25.jpg)
CCA + anchors (Seurat v3)
Retinal bipolar datasets: 51K cells, 6 technologies, 2 Species 25
![Page 26: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/26.jpg)
Label transfer (classification)
26
Weighted vote classifierWhat is the classification of each cells nearest anchors?
![Page 27: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/27.jpg)
Integration across modalities
27
![Page 28: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/28.jpg)
LIGER Linked Inference of Genomic Experimental Relationships
1) Integrative non-negative matrix factorization (iNMF) to learn a shared low-dimensional space
2) Perform joint clustering on the shared factor neighborhood graph
• Factors are interpretable due to non-negative constraint
• Finds set of dataset-specific factors and a set of shared factors
28Welch et al. (https://doi.org/10.1101/459891)
![Page 29: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/29.jpg)
LIGER Linked Inference of Genomic Experimental Relationships
• Joint clustering of gene expression and DNA methylation data
29
![Page 30: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/30.jpg)
Performance assessment
• Qualitative (visualization)
• Quantitative:• Silhouette score
• kBET: k-nearest-neighbor batch-effect test
• …
30
![Page 31: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/31.jpg)
Silhouette score
A score for each cell that assesses the separation of cell types, with a high score suggesting that cells of the same cell type are close together and far from other cells of a different type.
𝑎 𝑖 is the average distance of cell 𝑖 to all other cells within 𝑖’s cluster.
𝑏 𝑖 is the average distance of 𝑖 to all cells in the nearest cluster to which 𝑖 does not belong.
Silhouette score: 31
𝑠 𝑖 =𝑏 𝑖 − 𝑎(𝑖)
max(𝑎 𝑖 , 𝑏(𝑖))
𝑎 𝑖 =1
𝐶𝑖
∀𝑗
𝑑(𝑥𝑖 , 𝑥𝑗)
𝑏 𝑖 = min∀𝑗,𝑗∉𝐶𝑖
𝑑(𝑥𝑖 , 𝑥𝑗)
𝑆 =1
𝑁𝑠(𝑖)
![Page 32: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/32.jpg)
kBET: k-nearest-neighbor batch-effect test
Büttner et al. (https://doi.org/10.1038/s41592-018-0254-1)32
![Page 33: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/33.jpg)
kBET is more responsive than other batch tests on simulated data
33
![Page 34: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/34.jpg)
kBET assesses data-integration quality
34
![Page 35: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/35.jpg)
Summary (so far)
• Integration can allow us to improve the interpretation of single-cell data, and build a multi-modal view of the tissue
• Numerous methods now available for integration, mainly using joint dimension reduction, or joint clustering, or a combination of both
• Joint dimension reduction can yield interpretable factors and aid in the identification of equivalent states, but is computationally expensive
• Graph-based methods alone can be extremely fast, but may struggle when technical differences are on a similar scale to biological differences
35
![Page 36: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/36.jpg)
Sample multiplexing
• To simultaneously measure cells combined from different samples/conditions/…• Pool many cells together in the same run
• Mitigates technical effects
• Able to identify conditions from output data
36
![Page 37: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/37.jpg)
Multiplexing solves few problems
Cost Doublets Batch effects
37
![Page 38: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/38.jpg)
DemuxletNatural SNPs
Kang et al. (https://doi.org/10.1038/nbt.4042)38
![Page 39: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/39.jpg)
Cell hashing with barcoded antibodies
Stoeckius et al. (https://doi.org/10.1186/s13059-018-1603-1)39
![Page 40: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/40.jpg)
DNA-barcoded antibodies
Stoeckius et al. (https://doi.org/10.1038/nmeth.4380)40
![Page 41: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/41.jpg)
Equal concentration mixing of PBMCs from eight human donors A-H
Cell hashing with barcoded antibodies
41
![Page 42: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/42.jpg)
Validation by comparison to Demuxlet
● Using genetic variations (SNPs) to determine the sources of cells (individuals)
Cell hashing with barcoded antibodies
42
![Page 43: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/43.jpg)
Cell hashing with barcoded antibodies
Commercialized by BioLegend (TotalSeq™)
● Human: hashtags are made of two antibodies, CD298 and β2 microglobulin
● Mouse: hashtags are made of two antibodies, CD45 and H-2 MHC class I
43
![Page 44: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/44.jpg)
REAP-seq CITE-seq
44
![Page 45: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/45.jpg)
REAP-seqRNA expression and protein sequencing assay
45
![Page 46: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/46.jpg)
CITE-seqCellular Indexing of Transcriptomes and Epitopes by Sequencing
Stoeckius et al. (https://doi.org/10.1038/nmeth.4380)46
![Page 47: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/47.jpg)
CITE-seqCellular Indexing of Transcriptomes and Epitopes by Sequencing
Stoeckius et al. (https://doi.org/10.1038/nmeth.4380)47
![Page 48: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/48.jpg)
Single cell Multi-omicsSame cell
48Macaulay et al. (https://doi.org/10.1016/j.tig.2016.12.003)
![Page 49: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/49.jpg)
Single cell isoform RNA sequencing ScISOr-seq
49Gupta*, Collier* et al. (https://doi.org/10.1038/nbt.4259)
![Page 50: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/50.jpg)
Multi-omics factor analysis
50Argelaguet et al. (https://doi.org/10.15252/msb.20178124)
![Page 51: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/51.jpg)
Summary
• Batch effects sometimes not avoidable
• Many batch correction/integration methods available, mainly using joint dimension reduction, or joint clustering, or a combination of both
• Performance assessment is challenging
• Sample multiplexing can help alleviate batch effects
• Simultaneous mRNA and protein profiling: REAP-seq and CITE-seq
• Several single cell multi-omics technologies
51
![Page 52: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/52.jpg)
Data integration practical
• MNN correction
• Seurat v3
• Four pancreatic datasets
52
![Page 53: Single cell RNA-seq Data Integration - GitHub Pages · Single Cell RNA-seq Data Integration Ahmed Mahfouz Leiden Computational Biology Center, LUMC Delft Bioinformtaics Lab, TU Delft](https://reader036.fdocuments.in/reader036/viewer/2022062603/5f0254117e708231d403baae/html5/thumbnails/53.jpg)
Resources
• Stuart et al. “Comprehensive integration of single cell data”
https://www.biorxiv.org/content/10.1101/460147v1
• Haghverdi et al. “Batch effects in single-cell RNA-sequencing data are corrected by matching mutual nearest neighbors”
https://doi.org/10.1038/nbt.4091
• Tim Stuart “Integration and harmonization of single-cell data” (Satija Lab single cell genomics day 2019)
https://satijalab.org/scgd/
• Andrew Butler “Batch Correction and Data Integration for Single Cell Transcriptomics” (Satija Lab single cell genomics day 2018)
https://satijalab.org/scgd18/
• Orchestrating Single-Cell Analysis with Bioconductor
https://osca.bioconductor.org/
• Hemberg’s group course: Analysis of single cell RNA-seq data
https://scrnaseq-course.cog.sanger.ac.uk/website/index.html
• Seurat Integration and Label Transfer tutorial
https://satijalab.org/seurat/v3.0/pancreas_integration_label_transfer.html 53