
Global Gene Expression Profiles from Breast Tumor Samples using the Ion AmpliSeq™ Transcriptome Human Gene Expression Kit

Kelli Bramlett*¹, James L. Wittliff*², Varun Bagai¹, Jeff Schageman¹, Jose Cienfuegos¹, Kristi Lea¹, and Jian Gu¹

* Contributed equally to this work

¹ Life Sciences Solutions Group, Thermo Fisher Scientific, 2130 Woodward St, Austin, TX, 78744, USA

² Dept of Biochemistry & Molecular Biology, Institute for Molecular Diversity & Drug Design, University of Louisville, Louisville, KY 40292

FIGURE 2. Differential Gene Expression Correlations to Orthogonal Gene Expression Measurement Systems

ABSTRACT

Thousands of genes are expressed in a controlled fashion in each eukaryotic cell, determining what a cell can do and dictating normal tissue function. Measuring the entire gene expression pattern of a sample is critical to understanding the natural homeostatic state of a healthy tissue, and it provides useful information when a system is altered by environmental cues or a disease state. Many technologies have been used to measure the entire gene expression profile of an RNA test sample. DNA microarrays have become a key method for acquiring a comparative snapshot of the gene expression profile from test samples in a high-throughput manner. Quantitative PCR and newer sequencing techniques are popular alternatives that offer highly accurate gene expression measurements, but they are limited by cost and complex analysis needs. To address the challenges of current sequencing-based methods of global gene expression profiling, and to take advantage of the simple analysis that comes with defined expression profiling content from technologies such as microarrays, we tested the Ion AmpliSeq™ Transcriptome Human Gene Expression Kit using RNA isolated from invasive ductal tumor samples. This novel approach profiles the global mRNA expression of human RNA in a highly multiplexed fashion on the Ion Torrent sequencing platform.

The results show detection of more genes than popular microarray platforms, with differential gene expression measurements comparable to quantitative PCR (r = 0.96) and RNA-Seq methods (r = 0.94). The data presented here demonstrate high on-target mapping (>91% of reads) for all human breast carcinoma libraries. Gene expression values correlated with R>0.99 for all technical replicates. More than 64% of the over 22,800 genes in the single-pool panel were detected in all libraries. The most highly expressed genes include genes expected to be overexpressed in breast tumor samples. The Ion AmpliSeq™ Transcriptome Human Gene Expression Kit is a novel method for measuring global gene expression profiles from human RNA samples in a timely, cost-effective, and high-throughput manner, resulting in sensitive and accurate gene expression measurements.

MATERIALS AND METHODS

Tissue Preparation & Laser Capture Microdissection: Under an IRB-approved study, frozen serial tissue sections containing 55 +/- 23% tumor were prepared and stained with H & E using established protocols (1,2), and tumor cells (~14,000 LCM pulses) were procured from an adjacent section using a Pixcell® IIe (Arcturus®/Thermo Fisher Scientific) instrument.

Sample Preparation: Technology development experiments were performed with well-characterized control RNAs, Universal Human Reference RNA (UHR; Agilent P/N 740000) and First Choice Human Brain Reference RNA (HBR; Thermo Fisher P/N AM6050). For invasive ductal tumor samples, RNA was extracted from intact tissue sections using mirVana™ miRNA Isolation kits (Thermo Fisher P/N AM 1650). Ion AmpliSeq™ Transcriptome libraries were generated from 4 different invasive ductal tumor samples using the Human Gene Expression Panel (Thermo Fisher P/N A26325), which targets ~21,000 well-annotated human genes, from 10 ng of total RNA (effective input range of 1-100 ng). Triplicate libraries were generated from each sample.

Next Generation Sequencing: Ion AmpliSeq™ Transcriptome libraries were barcoded, templated, and sequenced on the Proton Sequencing System. Multiplexed libraries were templated using the Ion PI™ Template OT2 200 kit v3 and sequenced using the Ion PI™ Sequencing 200 kit v3 on Ion PI™ v2 chips (Thermo Fisher Scientific P/N 4488318, 4488315, 4482321) as one eight-plex library pool (UHR/HBR experiments) or two six-plex library pools (human breast tumor experiments). Data analysis was performed on the Torrent Server using the free ampliSeqRNA plug-in, which provides simple QC, visualization, and normalized counts per gene that correspond to gene expression information.

RESULTS – CONTROL RNA

CONCLUSIONS

• Gene expression information can be obtained from very small amounts of RNA from breast tissue samples in a technically reproducible manner.

• Principal component analysis (PCA) of the technical replicates showed that the libraries from each breast carcinoma clustered together and were widely separated from those of the other three specimens.

• A subset of genes identified as overexpressed by NGS in breast tumor tissue was found to be significant by Cox regression for association with disease-free survival and overall survival.

• The Ion AmpliSeq™ Transcriptome Human Gene Expression Kit is an effective method for gene expression measurement with high correlation to orthogonal measurement systems.

REFERENCES

1. Andres, S.A., Wittliff, J.L. Relationships of ESR1 and XBP1 Expression in Human Breast Carcinoma and Stromal Cells Isolated by Laser Capture Microdissection Compared to Intact Breast Cancer Tissue. Endocrine 40:212-221, 2011.

2. Andres, S.A., Brock, G.N., Wittliff, J.L. Interrogating Differences in Expression of Targeted Gene Sets to Predict Breast Cancer Outcome. BMC Cancer 13:326, 2013.

3. Wu, D., et al. An alternative splicing isoform of eukaryotic initiation factor 4H promotes tumorigenesis in vivo and is a potential therapeutic target for human cancer. Int J Cancer, Mar 1, 2011.


Thermo Fisher Scientific • 5791 Van Allen Way • Carlsbad, CA 92008 • www.lifetechnologies.com For Research Use Only. Not for use in diagnostic procedures. ©2015 Thermo Fisher Scientific Inc. All rights reserved. All trademarks are the property of Thermo Fisher Scientific and its subsidiaries unless otherwise specified.

FIGURE 1. The Ion AmpliSeq Transcriptome Human Gene Expression Kit detects 15,972 genes with control RNAs (green circle), while the Affy Hu133Aplus2 microarray detects 9,715 genes for the same input RNA (blue circle). More UHR/HBR DEGs are detected with the AmpliSeq Transcriptome platform than are reported in the MAQC data from the Affy Hu133Aplus2 microarray platform.

FIGURE 2. Differential gene expression (DEG) correlations between UHR and HBR, comparing the AmpliSeq Transcriptome Human Gene Expression Kit to TaqMan (R = 0.96, panel A) and RNA-Seq (R = 0.94, panel B).
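The Figure 2 correlations compare UHR-versus-HBR differential expression as measured on two platforms. A minimal sketch of that kind of comparison is shown here, assuming each platform has already been reduced to one linear-scale expression value per gene and RNA sample. The pseudocount, the function names, and the toy values are illustrative assumptions, not the authors' pipeline or data.

```python
import math


def log2_fold_change(uhr, hbr, pseudocount=1.0):
    """log2(UHR / HBR); the pseudocount is an assumption that avoids division by zero."""
    return math.log2((uhr + pseudocount) / (hbr + pseudocount))


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def platform_correlation(platform_a, platform_b):
    """Correlate log2 fold changes over the genes measured on both platforms.

    Each argument maps gene -> (UHR value, HBR value).
    Returns (Pearson r, number of shared genes).
    """
    shared = sorted(set(platform_a) & set(platform_b))
    fc_a = [log2_fold_change(*platform_a[g]) for g in shared]
    fc_b = [log2_fold_change(*platform_b[g]) for g in shared]
    return pearson_r(fc_a, fc_b), len(shared)


# Hypothetical toy values, not data from the poster.
ampliseq = {
    "GENE_A": (800.0, 100.0),
    "GENE_B": (50.0, 400.0),
    "GENE_C": (300.0, 310.0),
    "GENE_D": (10.0, 10.0),  # not measured on the second platform
}
orthogonal = {
    "GENE_A": (900.0, 120.0),
    "GENE_B": (40.0, 350.0),
    "GENE_C": (280.0, 300.0),
}

r, n_shared = platform_correlation(ampliseq, orthogonal)
print(f"r = {r:.3f} over {n_shared} shared genes")
```

Restricting the comparison to genes present on both platforms matters because the panels differ in content; a gene missing from one platform carries no information about agreement.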

FIGURE 4. Gene Expression Profiles

FIGURE 4. Gene expression profiles driving sample-level differences in the four independent invasive ductal tumor samples. On the left, all genes with mean expression ≥ 200 counts (N = 804) drive the hierarchical clustering. On the right, a subset of 174 genes with the greatest expression differences between the four samples. AmpliSeqRNA plugin normalized counts are used for the hierarchical clustering shown here.
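The gene selection described for Figure 4 can be sketched as a two-step filter: keep genes whose mean normalized count is at least 200, then rank the survivors by how much they differ between samples. The poster does not say how "greatest expression differences" was scored, so the range of per-sample mean log2 counts used here is an assumption, as are the function names and the toy data.

```python
import math


def mean(values):
    return sum(values) / len(values)


def select_genes(counts, sample_of, min_mean=200.0, top_n=174):
    """Filter and rank genes for clustering.

    counts:    gene -> {library name: normalized count}
    sample_of: library name -> sample name (replicates share a sample)
    Returns (genes passing the mean-count filter, top_n most variable genes).
    """
    expressed = {
        gene: libs
        for gene, libs in counts.items()
        if mean(list(libs.values())) >= min_mean
    }
    spread = {}
    for gene, libs in expressed.items():
        per_sample = {}
        for library, value in libs.items():
            per_sample.setdefault(sample_of[library], []).append(math.log2(value + 1.0))
        sample_means = [mean(values) for values in per_sample.values()]
        # Assumed score: largest minus smallest per-sample mean on the log2 scale.
        spread[gene] = max(sample_means) - min(sample_means)
    ranked = sorted(spread, key=lambda gene: (-spread[gene], gene))
    return sorted(expressed), ranked[:top_n]


# Hypothetical toy data: two samples, two replicate libraries each.
sample_of = {"s1_r1": "s1", "s1_r2": "s1", "s2_r1": "s2", "s2_r2": "s2"}
counts = {
    "HIGH_VAR": {"s1_r1": 1000, "s1_r2": 1100, "s2_r1": 200, "s2_r2": 220},
    "HIGH_FLAT": {"s1_r1": 500, "s1_r2": 510, "s2_r1": 505, "s2_r2": 495},
    "LOW": {"s1_r1": 10, "s1_r2": 12, "s2_r1": 300, "s2_r2": 280},
}

expressed_genes, most_variable = select_genes(counts, sample_of, top_n=1)
print(expressed_genes, most_variable)
```

Averaging replicates within a sample before scoring keeps technical noise between replicate libraries from being counted as a between-sample difference.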

FIGURE 5. Kaplan-Meier Plots of Selected Genes Showing Association with Disease Free Survival and Overall Survival

FIGURE 5. From the Ion AmpliSeq™ Transcriptome data, the most highly expressed genes across all four samples tested in replicate were identified. This list of genes overexpressed by NGS in breast tumor tissue sections was compared to gene expression data from microarrays of LCM-procured neoplastic cells available in the IRB-approved biorepository. Forty genes in the original list from the NGS analysis were also significant by Cox regression analysis for disease-free survival, overall survival, or both. A subset of these 40 genes, identified as significant in breast tumor samples by both NGS analysis and microarray analysis, was chosen for the Kaplan-Meier analysis above: GLUL, XBP1, EIF4H, and H3F3A.


RESULTS – INVASIVE DUCTAL BREAST TUMOR

FIGURE 3. A) Representative correlations from Ion AmpliSeq™ Transcriptome technical replicate libraries. Three replicate libraries are shown from two different tissue samples. Technical replicate correlations range from 0.981 to 0.996, as shown in the table. B) PCA (principal component analysis) plot of the 12 libraries generated from 4 independent tissue samples. The technical replicates from each sample cluster together and separately from the other samples.


FIGURE 3. Technical Replicate Performance

Ion AmpliSeq™ Transcriptome Technical Replicate Correlations
Sample 1: 0.993-0.996
Sample 2: 0.995
Sample 3: 0.981-0.994
Sample 4: 0.991-0.994

Plot labels recovered from Figure 3: Sample type; Log2 norm counts (both axes).
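The per-sample correlation ranges reported for Figure 3 summarize all pairwise comparisons among a sample's replicate libraries. A minimal sketch of that calculation is given here. It assumes reads-per-million scaling followed by a log2(x + 1) transform; the poster only states that normalized counts from the ampliSeqRNA plug-in were plotted on a log2 scale, so the scaling, the function names, and the toy counts are assumptions.

```python
import math
from itertools import combinations


def reads_per_million(raw_counts):
    """Scale raw per-gene read counts so that each library sums to one million."""
    total = sum(raw_counts.values())
    return {gene: count * 1e6 / total for gene, count in raw_counts.items()}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def replicate_correlation_range(libraries):
    """Return (min r, max r) over every pair of replicate libraries.

    libraries: library name -> {gene: raw read count}
    """
    genes = sorted(set.intersection(*(set(lib) for lib in libraries.values())))
    profiles = {}
    for name, raw in libraries.items():
        normalized = reads_per_million(raw)
        profiles[name] = [math.log2(normalized[gene] + 1.0) for gene in genes]
    correlations = [
        pearson_r(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    ]
    return min(correlations), max(correlations)


# Hypothetical toy counts for three replicate libraries of one sample.
replicates = {
    "rep1": {"A": 100, "B": 900, "C": 4000, "D": 5000},
    "rep2": {"A": 200, "B": 1800, "C": 8000, "D": 10000},  # same profile, deeper sequencing
    "rep3": {"A": 120, "B": 850, "C": 4100, "D": 4930},
}

lowest, highest = replicate_correlation_range(replicates)
print(f"technical replicate correlations: {lowest:.3f}-{highest:.3f}")
```

Three replicates give three pairwise comparisons, which is why a sample is reported as a range rather than a single value.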

Figure 5 panels: Kaplan-Meier survival proportions for patients below versus above median gene expression (n = 247 in each panel). Horizontal axis: OS or DFS (mo), 0 to 200. Vertical axis: percent survival, 0 to 100.

GLUL: OS, P value < 0.0001; DFS, P value 0.0001
XBP1: OS, P value 0.0280; DFS, P value 0.1237
EIF4H: OS, P value 0.0028; DFS, P value 0.0018
H3F3A: OS, P value 0.0780; DFS, P value 0.0019
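The Figure 5 panels compare patients below and above the median expression of a gene. A minimal sketch of the two steps behind such a panel follows: splitting patients at the median and computing a Kaplan-Meier survival curve for a group. How patients exactly at the median were assigned is not stated in the poster, so placing them in the "below" group is an assumption, as are the function names and toy values. The sketch does not compute the P values shown in the panels.

```python
from statistics import median


def split_at_median(expression):
    """Split patients into (below, above) groups by median expression.

    expression: patient id -> expression value.
    Patients exactly at the median go to the 'below' group (an assumption).
    """
    cutoff = median(expression.values())
    below = sorted(p for p, value in expression.items() if value <= cutoff)
    above = sorted(p for p, value in expression.items() if value > cutoff)
    return below, above


def kaplan_meier(times, events):
    """Kaplan-Meier estimate.

    times:  follow-up time for each patient (e.g. months)
    events: True/1 if the event occurred at that time, False/0 if censored
    Returns [(time, survival probability)] at each time with at least one event.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        n_events = sum(1 for ti, e in zip(times, events) if ti == t and e)
        n_leaving = sum(1 for ti in times if ti == t)
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        # Patients with an event or censored at t leave the risk set afterwards.
        at_risk -= n_leaving
    return curve


# Hypothetical toy cohort, not data from the poster.
expression = {"p1": 1.0, "p2": 2.0, "p3": 3.0, "p4": 4.0}
below_group, above_group = split_at_median(expression)

curve = kaplan_meier(times=[1, 2, 3, 4], events=[1, 1, 0, 1])
print(below_group, above_group)
print(curve)
```

In the toy cohort the patient censored at time 3 lowers the number at risk without lowering the survival estimate, which is the property that distinguishes this estimator from a simple fraction of patients still alive.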