Improving Transcriptome Profiling for Single Cell and Low ...

1
Keerthana Krishnan 1 , Yanxia Bei 1 , Janine G. Borgaro 1 , Shengxi Guan 1 , Vaishnavi Panchapakesa 1 , Karen Duggan 1 , Lynne Apone 1 , Timur Shtatland 1 , Bradley W. Langhorst 1 , Melissa Arn 1 , Jonathan Sanford 1 , Christine Sumner 1 , Diwakar R Pattabiraman 2 , Thomas C. Evans, Jr. 1 , Eileen Dimalanta 1 , Nicole M. Nichols 1 and Theodore Davis 1 1 New England Biolabs, Ipswich, MA 01938 2 Norris Cotton Cancer Center, Geisel School of Medicine at Dartmouth, Lebanon, NH 03756 Improving Transcriptome Profiling for Single Cell and Low Input RNA RNA sequencing has been widely used to determine gene expression profiles of diverse tissues, cell types, developmental stages and diseases. Most of these studies are based on population analyses using thousands of cells. Such studies, however, disguise the potentially significant biological variations among individual cells. To overcome this limitation, single-cell RNA-seq is emerging as a powerful approach to characterize gene expression heterogeneity within phenotypically identical or complex cell populations and in rare cell types. We developed a simple and robust single-cell, low input RNA-seq workflow to generate full-length cDNAs that can easily be converted into sequencing-ready Illumina libraries when combined with the NEBNext® Ultra™ II FS DNA Library Preparation Kit which utilizes enzymatic fragmentation. Using this approach, we generated libraries from a variety of input material including Universal Human Reference (UHR) RNA (2 pg – 200 ng), single cells from cultured cell lines and mouse primary cells, and sequenced on the Illumina ® NextSeq ® 500. High quality sequencing data was obtained from all samples. We observe excellent gene body coverage and high sensitivity as demonstrated by detection of a high number of transcripts and expected number of RNA spike-ins (ERCC), even at single copy numbers. The data showed strong gene expression correlation (Pearson r>0.9) between RNA inputs that span over five orders of magnitude and in cultured cells (single vs. hundreds). From the analysis of the primary cells, we could successfully distinguish two types of cells from 8-week old mouse mammary glands and were able to trace them back to the basal and luminal developmental lineages, highlighting the high sensitivity of the protocol. We have developed a highly robust and sensitive method that consistently generates high quality sequencing data from single cell or low input RNA. It is streamlined and amenable to large-scale and high throughput automation. We envision this method facilitating novel discoveries in the area of low-input transcriptome applications and across various platforms. INTRODUCTION CONCLUSIONS 1. https://academic.oup.com/nar/article/44/W1/W3/2499339 2. Patro et al., (2015). Salmon: Accurate, Versatile and Ultrafast Quantification from RNA- seq Data using Lightweight-Alignment. bioRxiv, 21592. 3. Kim D, Langmead B, and Salzberg SL (2015). HISAT: a fast spliced aligner with low memory requirements, Nature methods. 4. http://broadinstitute.github.io/picard 5. http://bowtie-bio.sourceforge.net/bowtie2/manual.shtml Authors would like to acknowledge the technical assistance provided by Laurie Mazzola, Danielle Fuchs, and Joanna Bybee at the New England Biolabs’ Sequencing Core Facility. METHODS RESULTS (A) cDNA libraries are prepared from intact cells or total RNA in a single-tube reaction. cDNA is synthesized by template-switching-mediated reverse transcription followed by amplification by PCR; (B) The full length cDNA is enzymatically fragmented, end repaired, dA-tailed, adaptors ligated and PCR amplified to generate final libraries to be sequenced on Illumina platforms; (C) Flowchart illustrating a streamlined Single Cell/Low Input RNA library prep workflow incorporating cDNA synthesis and library preparation with hands-on time of ~30 mins. STEP II: Library Generation STEP I: cDNA Synthesis & Amplification A B Overview of Workflow C cDNA library Single cell Input • cDNA Primer Mix ADD ADD ADD ADD ADD 1 2 • TSO • RT Enzyme Mix • RT Buffer • cDNA Primer • PCR Master Mix Total RNA Reverse transcription & non-templated addition Cell lysis Template switching mRNA cDNA Primer Adaptor AAAAAA TTTTTT AAAAAA TTTTTT XXX cDNA amplication AAAAAA TTTTTT XXX XXX Template-switching oligo (TSO) XXX cDNA library cleanup Transfer TTTTTT XXX TTTTTT XXX TTTTTT XXX XXX AAAAAA TTTTTT XXX XXX AAAAAA TTTTTT XXX XXX AAAAAA XXX AAAAAA • Beads • TRIS/H 2 0 OR Sensitive and Consistent Performance across different input amounts: Input: 2 pg – 200 ng UHR RNA A: Human Transcriptome and ERCC Expression Correlation B: ERCC Expression Correlation 0 5 10 15 20 25 30 35 40 45 50 2pg 5pg 10pg No. of ERCC Transcripts RNA Input Amount Expected Observed C: Gene Body Coverage D: ERCC Detection Sensitivity Illumina libraries made from Total UHR RNA in the range of 2 pg to 200 ng were sequenced using 2X75 cycles on a NextSeq 500 and data analysed using Galaxy (1). Transcript abundance was quantified using Salmon v0.6 (2) on the GRCh38 transcriptome reference. Reads were mapped to hg19 Human Reference Genome by HISAT2 (3) and gene body coverage was calculated using Picard tools (4). Consistent performance was observed across libraries made using various inputs. Results are shown as (A) transcript expression correlation between replicates (200 ng vs. 200 ng) and across different RNA inputs, 200 ng, 10 ng, 1 ng, 100 pg, 10 pg, 2 pg; (B) ERCC transcript expression correlation plots from the same libraries described in A; (C) gene body coverage for libraries made from 2 pg– 200 ng of UHR RNA; (D) ”Expected” vs. “Detected” number of ERCC RNA species in low input RNA samples (2 pg -10 pg). Expected: number of ERCC RNA species with at least 1 copy number in the RNA samples; Detected: number of ERCC RNA species with TPM (Transcripts per million) 1. Libraries show excellent correlation across all inputs and consistent gene body coverage over the entire transcript length Input: HeLa Cells and HeLa Total RNA Illumina libraries made from Hela cells (single cell, 10 cells and 100 cells) and Hela Total RNA were sequenced using 2X75 cycles on a NextSeq 500. Results show (A) Consistent correlation of transcripts across 100 HeLa cells, 10 cells, and a single cell. Similar correlation is also seen with 10 ng, 1 ng, 100 pg of Total HeLa RNA; (B) Consistent gene body coverage across different inputs A: Transcriptome Expression Correlation B: Gene Body Coverage 200ng Total RNA TPM 200ng Total RNA TPM 10ng Total RNA TPM 1ng Total RNA TPM 100pg Total RNA TPM 10pg Total RNA TPM 2pg Total RNA TPM R 2 =0.999 R 2 =0.987 R 2 =0.999 R 2 =0.994 R 2 =0.999 R 2 =0.981 200ng Total RNA TPM 200ng Total RNA TPM 10ng Total RNA TPM 1ng Total RNA TPM 100pg Total RNA TPM 10pg Total RNA TPM 2pg Total RNA TPM R 2 =0.999 R 2 =0.966 R 2 =0.942 R 2 =0.942 R 2 =0.945 R 2 =0.925 Illumina libraries were made with NEBNext Single Cell/Low Input RNA Library Prep Kit using HeLa, Jurkat or mouse M1 single cells or 10 pg UHR RNA. Using the same inputs, libraries were also made using Clontech SMART-Seq ® v4 Ultra ® Low Input RNA Kit followed by Illumina Nextera ® XT kit. All libraries were sequenced using 2X75 cycles on a NextSeq 500 and data analysed using Galaxy (1) as described previously. Results from these analyses are shown as (A) cDNA library yield comparison; (B) final library yield comparison; (C) number of transcripts with TPM1 (Transcripts per million) from each library; (D) Gene body coverage comparison for libraries generated with kits from NEBNext or Clontech. Across all metrics the NEBNext workflow generated libraries that show superior performance with higher cDNA and library yields, detection of higher number of transcripts and better coverage across the transcript length. C: Transcripts Identified A: cDNA Yield D: Gene Body Coverage 0 200 400 600 800 1000 1200 10 pg Hela Single Cell Jurkat Single Cell M1 Single Cell Total Yield (ng) B: Illumina Library Yield Robust and Superior Performance across Different Sample Types: Input: 10pg UHR RNA, HeLa, Jurkat, M1 single cell 0.000 0.200 0.400 0.600 0.800 1.000 1.200 1.400 1 5 9 13 17 21 25 29 33 37 41 45 49 53 57 61 65 69 73 77 81 85 89 93 97 101 Normalized Transcript Coverage Normalized Distance Along Transcript NEB Jurkat Single Cell NEB Hela Single Cell NEB M1 Single Cell NEB 10 pg Total RNA Clontech Jurkat Single Cell Clontech Hela Single Cell Clontech M1 Single Cell Clontech 10 pg Total RNA 0 5 10 15 20 25 30 10 pg Hela Single Cell Jurkat Single Cell M1 Single Cell Total cDNA Yield (ng) NEBNext Clontech, Nextera XT Sensitive and Superior Determination of Gene Expression: Input: Jurkat Cells Illumina libraries were made with NEBNext Single Cell/Low Input RNA Library Prep Kit using Jurkat single cells. For comparison, Jurkat single cells were used to make libraries with Clontech SMART-Seq v4 Ultra Low Input RNA Kit followed by Illumina Nextera XT kit. All libraries were sequenced using 2X75 cycles on a NextSeq 500 and data analysed using Galaxy (1) as described previously. Figures A-D show the number of transcripts detected per Jurkat single cell (6 replicates) using different methods (NEBNext vs. Clontech) and different ranges of expression (grouped into 1-5, 5-10, 10-50 and > 50 TPM). TPM = Transcripts per Kilobase Million. The box plot shows the median, first and third quartiles per method, and range of expression. Libraries generated using the NEBNext workflow detect more transcripts, especially low abundance transcripts. (A) Number of transcripts detected within TPM 1-5; (B) Number of transcripts detected within TPM 5-10; (C) Number of transcripts detected within TPM 10-50; (D) Number of transcripts detected >50 TPM; For the overlap, 5 replicates of Jurkat single cell libraries were chosen. (E) Overlapping transcripts detected with TPM 1 using the NEBNext Single Cell/Low Input RNA Library Prep Kit for Illumina; (F) Overlapping transcripts detected with TPM 1 using the SMART-Seq v4 Ultra Low Input RNA Kit followed by the Nextera XT DNA Library Prep Kit. NEBNext libraries consistently detect higher number and more overlapping transcripts across single Jurkat cells. Sensitive Determination of Gene Expression Signature to Identify Subtypes: Input: Primary Mouse Mammary Epithelial Cells Illumina libraries were generated from Mouse primary single cells (basal or luminal mammary cells) using the NEBNext Single Cell/Low Input RNA Library Prep, sequenced using 2X75 cycles on a NextSeq 500, and data analysed using Galaxy (1) as described previously. 10X Libraries were generated using the Chromium TM Single Cell 3’ Reagent Kit and sequenced on an Illumina HiSeq ® 2500. Data was aligned using Bowtie2 (5) and analysed on Loupe Cell Browser. ( A) number of transcripts with TPM0.1 and TPM1 from each subtype of primary cells identified using the NEBNext workflow; ( B) Clustering of >1800 cells using a 10X Chromium Single cell 3’ Solution identifies the basal subtype, luminal progenitor or mature luminal subpopulation based on marker gene expression; (C) Similar signature of marker genes can be identified in single basal primary cells (17 cells) and two subpopulations of luminal progenitor or mature luminal can be identified in the single luminal primary cells (30 cells). NEBNext libraries detect expression of gene markers to identify subtypes of mouse mammary epithelial cells. 0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 TPM>0.1 TPM>1 Basal Mouse Primary Cells Luminal Mouse Primary Cells A B C Mouse Primary Single Cells Acta2 Krt14 Krt5 Cited1 Gpx3 Tmem15 Aldh1a3 Csn3 Ø The NEBNext Single Cell/Low Input RNA-seq workflow provides a streamlined and easy to use solution for NGS library preparation from single cells or a wide range of total RNA inputs from 2 pg up to 200 ng Ø Minimal hands-on time and few handling steps to reduce errors, leading to high consistency and reproducibility Ø High yields of cDNA and libraries are generated with uniform 5’-3’ transcript coverage Ø High transcript expression correlation between high and low input libraries is observed Ø Sensitive detection of low abundance transcripts and consistent detection of transcripts across all inputs and different cell types REFERENCES AND ACKNOWLEDGEMENT

Transcript of Improving Transcriptome Profiling for Single Cell and Low ...

Keerthana Krishnan1, Yanxia Bei1, Janine G. Borgaro1, Shengxi Guan1, Vaishnavi Panchapakesa1, Karen Duggan1, Lynne Apone1, Timur Shtatland1, Bradley W. Langhorst1, Melissa Arn1, Jonathan Sanford1, Christine Sumner1, Diwakar R Pattabiraman2, Thomas C. Evans, Jr.1, Eileen Dimalanta1, Nicole M. Nichols1 and Theodore Davis1

1New England Biolabs, Ipswich, MA 019382Norris Cotton Cancer Center, Geisel School of Medicine at Dartmouth, Lebanon, NH 03756

Improving Transcriptome Profiling for Single Cell and Low Input RNA

RNA sequencing has been widely used to determine gene expression profiles ofdiverse tissues, cell types, developmental stages and diseases. Most of thesestudies are based on population analyses using thousands of cells. Such studies,however, disguise the potentially significant biological variations among individualcells. To overcome this limitation, single-cell RNA-seq is emerging as a powerfulapproach to characterize gene expression heterogeneity within phenotypicallyidentical or complex cell populations and in rare cell types.

We developed a simple and robust single-cell, low input RNA-seq workflow togenerate full-length cDNAs that can easily be converted into sequencing-readyIllumina libraries when combined with the NEBNext® Ultra™ II FS DNA LibraryPreparation Kit which utilizes enzymatic fragmentation. Using this approach, wegenerated libraries from a variety of input material including Universal HumanReference (UHR) RNA (2 pg – 200 ng), single cells from cultured cell lines andmouse primary cells, and sequenced on the Illumina® NextSeq® 500.

High quality sequencing data was obtained from all samples. We observeexcellent gene body coverage and high sensitivity as demonstrated by detection of ahigh number of transcripts and expected number of RNA spike-ins (ERCC), even atsingle copy numbers. The data showed strong gene expression correlation (Pearsonr>0.9) between RNA inputs that span over five orders of magnitude and in culturedcells (single vs. hundreds). From the analysis of the primary cells, we couldsuccessfully distinguish two types of cells from 8-week old mouse mammary glandsand were able to trace them back to the basal and luminal developmental lineages,highlighting the high sensitivity of the protocol.

We have developed a highly robust and sensitive method that consistentlygenerates high quality sequencing data from single cell or low input RNA. It isstreamlined and amenable to large-scale and high throughput automation. Weenvision this method facilitating novel discoveries in the area of low-inputtranscriptome applications and across various platforms.

INTRODUCTION

CONCLUSIONS

1. https://academic.oup.com/nar/article/44/W1/W3/24993392. Patro et al., (2015). Salmon: Accurate, Versatile and Ultrafast Quantification from RNA-

seq Data using Lightweight-Alignment. bioRxiv, 21592.3. Kim D, Langmead B, and Salzberg SL (2015). HISAT: a fast spliced aligner with low

memory requirements, Nature methods.4. http://broadinstitute.github.io/picard5. http://bowtie-bio.sourceforge.net/bowtie2/manual.shtmlAuthors would like to acknowledge the technical assistance provided by Laurie Mazzola,Danielle Fuchs, and Joanna Bybee at the New England Biolabs’ Sequencing Core Facility.

METHODS

RESULTS

REFERENCES & ACKNOWLEDGEMENTS

(A) cDNA libraries are prepared fromintact cells or total RNA in a single-tubereaction. cDNA is synthesized bytemplate-switching-mediated reversetranscription followed by amplification byPCR; (B) The full length cDNA isenzymatically fragmented, end repaired,dA-tailed, adaptors ligated and PCRamplified to generate final libraries to besequenced on Illumina platforms; (C)Flowchart illustrating a streamlinedSingle Cell/Low Input RNA library prepworkflow incorporating cDNA synthesisand library preparation with hands-ontime of ~30 mins.

STEP II: Library GenerationSTEP I: cDNA Synthesis & AmplificationA B

Overview of WorkflowC

cDNAlibrary

Single cell

Input

• cDNA Primer MixADD

ADD

ADD

ADD

ADD

1

2

• TSO• RT Enzyme Mix• RT Buffer

• cDNA Primer• PCR Master Mix

Total RNA

Reverse transcription &non-templated addition

Cell lysis

Template switching

mRNA

cDNA

Primer

Adaptor

5´ 3´5´3´

AAAAAATTTTTT

5´ 3´5´3´

AAAAAATTTTTT

XXX

cDNA amplification

3´5´3´

AAAAAATTTTTTXXX

XXX

Template-switchingoligo (TSO)

XXX

5´ 3´

cDNA library cleanup

Transfer

5´3´ TTTTTTXXX

5´ 3´5´3´ TTTTTTXXX

5´ 3´5´3´ TTTTTTXXX

XXX AAAAAA

5´ 3´5´3´ TTTTTTXXX

XXX AAAAAA

5´ 3´5´3´ TTTTTTXXX

XXX AAAAAA

3´ 5´5´ 3´XXX AAAAAA

• Beads

• TRIS/H20

OR

Sensitive and Consistent Performance across different input amounts:

Input: 2 pg – 200 ng UHR RNAA: Human Transcriptome and ERCC Expression Correlation

B: ERCC Expression Correlation

0"

5"

10"

15"

20"

25"

30"

35"

40"

45"

50"

2pg" 5pg" 10pg"

No.$of$E

RCC$Tran

scrip

ts$

RNA$Input$Amount$

Expected"

Observed"

C: Gene Body Coverage

D: ERCC Detection Sensitivity

Illumina libraries made from Total UHR RNA in the range of 2 pg to 200 ng were sequenced using 2X75 cycles on a NextSeq 500and data analysed using Galaxy (1). Transcript abundance was quantified using Salmon v0.6 (2) on the GRCh38 transcriptomereference. Reads were mapped to hg19 Human Reference Genome by HISAT2 (3) and gene body coverage was calculated usingPicard tools (4). Consistent performance was observed across libraries made using various inputs. Results are shown as (A)transcript expression correlation between replicates (200 ng vs. 200 ng) and across different RNA inputs, 200 ng, 10 ng, 1 ng, 100pg, 10 pg, 2 pg; (B) ERCC transcript expression correlation plots from the same libraries described in A; (C) gene body coverage forlibraries made from 2 pg– 200 ng of UHR RNA; (D) ”Expected” vs. “Detected” number of ERCC RNA species in low input RNAsamples (2 pg -10 pg). Expected: number of ERCC RNA species with at least 1 copy number in the RNA samples; Detected:number of ERCC RNA species with TPM (Transcripts per million) ≥1. Libraries show excellent correlation across all inputs andconsistent gene body coverage over the entire transcript length

Input: HeLa Cells and HeLa Total RNA

Illumina libraries made from Hela cells (single cell, 10 cells and 100 cells) and Hela Total RNA were sequenced using 2X75 cycles ona NextSeq 500. Results show (A) Consistent correlation of transcripts across 100 HeLa cells, 10 cells, and a single cell. Similarcorrelation is also seen with 10 ng, 1 ng, 100 pg of Total HeLa RNA; (B) Consistent gene body coverage across different inputs

A: Transcriptome Expression Correlation B: Gene Body Coverage

200ng Total RNA TPM

200n

g To

tal R

NA

TPM

10ng Total RNA TPM 1ng Total RNA TPM 100pg Total RNA TPM 10pg Total RNA TPM 2pg Total RNA TPM

R2=0.999 R2=0.987 R2=0.999 R2=0.994 R2=0.999 R2=0.981

200ng Total RNA TPM

200n

g To

tal R

NA

TPM

10ng Total RNA TPM 1ng Total RNA TPM 100pg Total RNA TPM 10pg Total RNA TPM 2pg Total RNA TPM

R2=0.999 R2=0.966 R2=0.942 R2=0.942 R2=0.945 R2=0.925

Illumina libraries were made with NEBNext Single Cell/Low Input RNA Library Prep Kitusing HeLa, Jurkat or mouse M1 single cells or 10 pg UHR RNA. Using the same inputs,libraries were also made using Clontech SMART-Seq® v4 Ultra® Low Input RNA Kitfollowed by Illumina Nextera® XT kit. All libraries were sequenced using 2X75 cycles on aNextSeq 500 and data analysed using Galaxy (1) as described previously. Results fromthese analyses are shown as (A) cDNA library yield comparison; (B) final library yieldcomparison; (C) number of transcripts with TPM≥1 (Transcripts per million) from eachlibrary; (D) Gene body coverage comparison for libraries generated with kits fromNEBNext or Clontech. Across all metrics the NEBNext workflow generated libraries thatshow superior performance with higher cDNA and library yields, detection of highernumber of transcripts and better coverage across the transcript length.

C: Transcripts IdentifiedA: cDNA Yield

D: Gene Body Coverage

0

200

400

600

800

1000

1200

10 pg Hela Single Cell

Jurkat Single Cell

M1 Single Cell

Tota

l Yie

ld (n

g)

Illumina Library Yield (ng)

NEBClontech, Nextera

B: Illumina Library Yield

Robust and Superior Performance across Different Sample Types:Input: 10pg UHR RNA, HeLa, Jurkat, M1 single cell

0.000#

0.200#

0.400#

0.600#

0.800#

1.000#

1.200#

1.400#

1# 5# 9# 13# 17# 21# 25# 29# 33# 37# 41# 45# 49# 53# 57# 61# 65# 69# 73# 77# 81# 85# 89# 93# 97#101#

Normalized

+Transcript+C

overage+

Normalized+Distance+Along+Transcript++

5'93'+Coverage+

NEB#Jurkat#Single#Cell#

NEB#Hela#Single#Cell#

NEB#M1#Single#Cell#

NEB#10#pg#Total#RNA#

Clontech#Jurkat#Single#Cell#

Clontech#Hela#Single#Cell#

Clontech#M1#Single#Cell#

Clontech#10#pg#Total#RNA#

0

5

10

15

20

25

30

10 pg Hela Single Cell

Jurkat Single Cell

M1 Single Cell

Tota

l cD

NA

Yiel

d (n

g)

cDNA Yield (ng)

NEBClontech, Nextera

NEBNextClontech, Nextera XT

Sensitive and Superior Determination of Gene Expression: Input: Jurkat Cells

Illumina libraries were made with NEBNext Single Cell/Low Input RNA Library Prep Kit using Jurkat single cells. For comparison, Jurkatsingle cells were used to make libraries with Clontech SMART-Seq v4 Ultra Low Input RNA Kit followed by Illumina Nextera XT kit. Alllibraries were sequenced using 2X75 cycles on a NextSeq 500 and data analysed using Galaxy (1) as described previously. Figures A-Dshow the number of transcripts detected per Jurkat single cell (6 replicates) using different methods (NEBNext vs. Clontech) and differentranges of expression (grouped into 1-5, 5-10, 10-50 and > 50 TPM). TPM = Transcripts per Kilobase Million. The box plot shows themedian, first and third quartiles per method, and range of expression. Libraries generated using the NEBNext workflow detect moretranscripts, especially low abundance transcripts. (A) Number of transcripts detected within TPM 1-5; (B) Number of transcripts detectedwithin TPM 5-10; (C) Number of transcripts detected within TPM 10-50; (D) Number of transcripts detected >50 TPM; For the overlap, 5replicates of Jurkat single cell libraries were chosen. (E) Overlapping transcripts detected with TPM ≥1 using the NEBNext Single Cell/LowInput RNA Library Prep Kit for Illumina; (F) Overlapping transcripts detected with TPM ≥1 using the SMART-Seq v4 Ultra Low Input RNAKit followed by the Nextera XT DNA Library Prep Kit. NEBNext libraries consistently detect higher number and more overlappingtranscripts across single Jurkat cells.

Sensitive Determination of Gene Expression Signature to Identify Subtypes:

Input: Primary Mouse Mammary Epithelial Cells

Illumina libraries were generated from Mouse primary single cells (basal or luminal mammary cells) using the NEBNext SingleCell/Low Input RNA Library Prep, sequenced using 2X75 cycles on a NextSeq 500, and data analysed using Galaxy (1) asdescribed previously. 10X Libraries were generated using the ChromiumTM Single Cell 3’ Reagent Kit and sequenced on anIllumina HiSeq® 2500. Data was aligned using Bowtie2 (5) and analysed on Loupe Cell Browser. (A) number of transcripts withTPM≥0.1 and TPM≥1 from each subtype of primary cells identified using the NEBNext workflow; (B) Clustering of >1800 cellsusing a 10X Chromium Single cell 3’ Solution identifies the basal subtype, luminal progenitor or mature luminal subpopulationbased on marker gene expression; (C) Similar signature of marker genes can be identified in single basal primary cells (17 cells)and two subpopulations of luminal progenitor or mature luminal can be identified in the single luminal primary cells (30 cells).NEBNext libraries detect expression of gene markers to identify subtypes of mouse mammary epithelial cells.

0"

1000"

2000"

3000"

4000"

5000"

6000"

7000"

8000"

9000"

10000"

TPM>0.1" TPM>1"

Basal"Mouse"Primary"Cells"

Luminal"Mouse"Primary"Cells"

A B C

Mou

se P

rimar

y Sin

gle

Cells

Acta2 Krt14 Krt5 Cited1 Gpx3 Tmem15 Aldh1a3 Csn3

Ø The NEBNext Single Cell/Low Input RNA-seq workflow provides a streamlined and easy to use solution for NGS library preparation from single cells or a wide range of total RNA inputs from 2 pg up to 200 ng

Ø Minimal hands-on time and few handling steps to reduce errors, leading to high consistency and reproducibility

Ø High yields of cDNA and libraries are generated with uniform 5’-3’ transcript coverage Ø High transcript expression correlation between high and low input libraries is observedØ Sensitive detection of low abundance transcripts and consistent detection of transcripts

across all inputs and different cell types

REFERENCES AND ACKNOWLEDGEMENT