
Identification of ten serum microRNAs from a genome-wide serum microRNA expression profile as novel noninvasive biomarkers for nonsmall cell lung cancer diagnosis

Xi Chen1*, Zhibin Hu2*, Wenjing Wang3*, Yi Ba4*, Lijia Ma5,6*, Chunni Zhang7*, Cheng Wang7, Zhiji Ren1, Yang Zhao2, Sijia Wu1, Rui Zhuang1, Yixin Zhang8, Heng Hu3, Chazhen Liu3, Lin Xu9, Jun Wang5,6, Hongbing Shen2, Junfeng Zhang1, Ke Zen1 and Chen-Yu Zhang1

1 Jiangsu Engineering Research Center for microRNA Biology and Biotechnology, State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, 22 Hankou Road, Nanjing, Jiangsu, China
2 Department of Epidemiology and Biostatistics, Cancer Center, Nanjing Medical University, 140 Hanzhong Road, Nanjing, Jiangsu, China
3 Shanghai Municipal Center for Disease Control and Prevention, Shanghai, China
4 Tianjin Medical University Cancer Institute and Hospital, Huanhuxi Road, Tiyuanbei, Tianjin, China
5 Beijing Genomics Institute, Yantian, Shenzhen, China
6 Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
7 Department of Biochemistry, Jinling Hospital, Clinical School of Medical College, Nanjing University, 305 East Zhongshan Road, Nanjing, Jiangsu, China
8 Department of Hepatobiliary Surgery, Nantong Tumor Hospital, Nantong, Jiangsu, China
9 Department of Thoracic Surgery, Cancer Hospital of Jiangsu Province, Nanjing, Jiangsu, China

The detection of nonsmall cell lung cancer (NSCLC) at an early stage presents a daunting challenge due to the lack of a specific noninvasive marker. The discovery of microRNAs (miRNAs), particularly those found in serum, has opened a new avenue for tumor diagnosis. To determine whether the expression profile of serum miRNAs can serve as an NSCLC fingerprint, we performed a TaqMan probe-based quantitative RT-PCR assay to select differentially expressed serum miRNAs from a sample set of 400 NSCLC cases and 220 controls, and used risk score analysis to evaluate the diagnostic value of the serum miRNA profiling system. After a two-phase selection and validation process, 10 miRNAs were found to have significantly different expression levels in NSCLC serum samples compared with the control serum samples. Risk score analysis showed that this panel of miRNAs was able to distinguish NSCLC cases from controls with high sensitivity and specificity. The areas under the ROC curves (AUC) for tumor identification in the training set and validation set were 0.966 and 0.972, respectively. Furthermore, the expression profile of the 10 serum miRNAs was correlated with the stage of NSCLC patients, especially in younger patients and patients with current smoking habits. More importantly, the use of the serum miRNA-based biomarker for early NSCLC detection was supported by a retrospective analysis in which the 10-serum miRNA profile could accurately classify serum samples collected up to 33 months ahead of the clinical NSCLC diagnosis. Taken together, we demonstrate that the profiling of 10 serum miRNAs provides a novel noninvasive biomarker for NSCLC diagnosis.

Key words: serum microRNA, nonsmall cell lung cancer, early diagnosis, noninvasive biomarker

Additional Supporting Information may be found in the online version of this article

Grant sponsor: The National Natural Science Foundation of China; Grant numbers: 90813035, 30890044, 30771036, 30772484, 30725008, 30890032, 31071232, 90608010; Grant sponsor: The National Basic Research Program of China (973 Program); Grant numbers: 2006CB503909, 2007CB815701, 2007CB815703, 2007CB815705, 2007CB815804; Grant sponsor: The National Basic Research Program of China (863 Program); Grant numbers: 2006AA02Z177, 2006AA10A121; Grant sponsor: Natural Science Foundation of Jiangsu Province; Grant number: BK2008021

*X.C., Z.H., W.W., Y.B., L.M. and C.Z. contributed equally to this work.

DOI: 10.1002/ijc.26177

History: Received 21 Aug 2010; Accepted 28 Apr 2011; Online 9 May 2011

Correspondence to: Chen-Yu Zhang (or) Ke Zen (or) Junfeng Zhang, School of Life Sciences, Nanjing University, 22 Hankou Road, Nanjing, Jiangsu 210093, China. Tel.: +86-25-83686234, Fax: +86-25-83686234, E-mail: [email protected] (Chen-Yu Zhang), [email protected]. cn (Ke Zen), (or) [email protected] (Junfeng Zhang)

Early Detection and Diagnosis

Int. J. Cancer: 130, 1620–1628 (2012) © 2011 UICC

International Journal of Cancer

Lung cancer is the most common cancer in the world and the leading cause of cancer-related deaths in developed countries. Nonsmall cell lung cancer (NSCLC) accounts for 75 to 80% of lung cancer cases.1–3 So far, the most effective treatment for NSCLC is surgical resection, which is limited by the fact that 65% of patients have advanced disease at the time of diagnosis.4,5 Most NSCLC cases, particularly at Stages I and II, rarely show symptoms and are difficult to detect.4–6 The 5-year survival rate following surgical resection is approximately 70% for patients with Stage I NSCLC, but that rate drops to only 30% in patients with Stage III disease.7 Thus, earlier detection of NSCLC would greatly facilitate more effective management of the disease.

To date, the reference gold standard in diagnosing NSCLC is pathologic evidence of malignant cells, which typically requires an invasive strategy, such as bronchoscopy, transthoracic needle aspiration, or thoracotomy. Chest X-ray and computed tomography screening can detect some lung cancers at an early stage,8,9 but the diagnostic procedures and the hazards of the associated radiation may outweigh the potential benefits.10 Other methodologies, such as sputum cytology and bronchoalveolar lavage, have not been proven to be effective screening tools. These concerns have led researchers to seek novel biomarkers and new diagnostic assays to noninvasively assess tumors. Several currently available serum/plasma biomarkers offer the promise of comprehensively analyzing tumors without the need to carry out a biopsy or a surgical procedure. There are, however, considerable barriers to the broad implementation of these biomarkers in the clinic. These tumor markers are primarily proteins, such as CYFRA21-1, CEA, NSE, TPS, chromogranin A, CA125 and CA19-9.11 The major concern with these markers is their limited sensitivity and specificity. Therefore, it is important to develop new methods and novel diagnostic biomarkers for the detection of early events of NSCLC.

A new class of RNA regulatory genes known as microRNAs (miRNAs) has been found to introduce a whole new layer of gene regulation in eukaryotes.12,13 miRNAs are endogenous noncoding RNAs of 19 to 24 nucleotides in length.12,13 They play an important role in regulating gene expression by base-pairing to the complementary sites on the target mRNAs, thus blocking the translation or triggering the degradation of the target mRNAs.12,13 Altered expression of tissue miRNAs has been associated with many diseases, particularly cancer.14,15 The use of tissue miRNA expression profiles as diagnostic or prognostic biomarkers in cancer has been demonstrated by several studies.14,15 In a recent study, we found that human serum contained a large amount of stable miRNAs, and that the expression pattern of serum miRNAs was altered in various disease conditions, including lung cancer, colorectal cancer, and diabetes.16 In particular, through initial screening by Solexa sequencing using pooled serum samples, we identified nearly 100 miRNAs that were differentially expressed in the sera from NSCLC patients when compared with those from age- and gender-matched cancer-free controls.16 In this study, we have validated our initial Solexa screening results at the individual level by using a stem-loop quantitative reverse transcription polymerase chain reaction (qRT-PCR) assay. We tested the serum samples from 400 NSCLC cases and 220 controls and identified ten serum miRNAs from a genome-wide serum miRNA expression profile as novel noninvasive biomarkers for NSCLC diagnosis.

Material and Methods

Centers and investigators

The current study was conducted in two academic centers (Nanjing University, Nanjing, China; Nanjing Medical University, Nanjing, China) and three nonacademic centers (Nantong Cancer Hospital, Jiangsu, China; Cancer Hospital of Jiangsu Province, Nanjing, China; Jinling Hospital, Nanjing, China). Each center is either a tertiary cancer hospital and research institute or a tertiary general hospital. All the hospitals have teams of medical and radiation oncologists, thoracic surgeons, interventional physicians and pathologists who are experienced in implementing complex, multicenter clinical studies. All the protocols, including the diagnostic procedure and the serum collection method, are identical in these hospitals. Written informed consent was obtained from all patients and volunteers before the study, and the study was approved by the ethics committee of each participating institution.

Patients and control subjects

This study comprised 400 patients who received a diagnosis of NSCLC at the Nantong Cancer Hospital (200 patients) and the Cancer Hospital of Jiangsu Province (200 patients) between 2005 and 2009. Most patients were referred to these tertiary centers for diagnostic procedures to investigate a lung mass. Patients were eligible if they were 18 years of age or older and had a pathological diagnosis of NSCLC that met histological or cytological criteria, obtained either through a biopsy procedure or surgical resection. Other eligibility criteria included: (a) the absence of previous lung cancer and other cancers; (b) the absence of previous chemotherapy or radiotherapy; and (c) the absence of synchronous multiple cancers. Histological typing of the tumors was performed according to the World Health Organization criteria. Staging was done according to the Sixth Edition of the American Joint Committee on Cancer tumor-node-metastasis (TNM) staging system. The demographic and clinical features of the patients are summarized in Table 1.

The recruitment of subjects to the parallel control group was conducted in the Healthy Physical Examination Center of the Jinling Hospital. Routine laboratory and imaging tests included complete blood cell counts, baseline electrolytes, chemistry profile, tumor markers (CEA and AFP), inflammatory markers (C-reactive protein), a type-B ultrasound for the abdomen and pelvis, and a chest X-ray. Subjects who showed no abnormalities during the medical checkup were enrolled as cancer-free controls. These healthy individuals did not have any history or examination findings that suggested either a pulmonary pathology or any constitutional symptoms. All chest radiographs were unremarkable. The control subjects were frequency matched to the cases by age (±5 years), sex and residential area (urban or rural). After written informed consent was obtained, face-to-face interviews were conducted by trained interviewers to obtain demographic data (e.g., age and sex) and exposure information (e.g., smoking status), and a 5 ml venous blood sample was collected from each participant. For both patients and healthy controls, individuals who had smoked one cigarette per day for >1 year were defined as ever smokers; otherwise they were considered never smokers. Smokers who had quit for >1 year were considered former smokers.

Prediagnosis serum samples from seven NSCLC cases were obtained from a pool of more than 20,000 individuals who participated in a community-based screening program for noninfectious diseases conducted in Jiangsu province during 2004 and 2005. As part of the screening program, a history was taken and a physical examination was conducted. Subjects who had no history of cancer and no examination finding that suggested any type of cancer at recruitment were submitted to follow-up. The follow-up was conducted in July of 2008. Among the 20,000 individuals, seven subjects were reported to have primary NSCLC that manifested between 2005 and 2008, as established by radiographic diagnosis and pathological confirmation. A retrospective analysis was conducted in July of 2009. For this analysis, fifty subjects who had no cancer during the recruitment and the follow-up were selected as controls. Serum miRNA expression levels in these seven prediagnosis subjects and 50 control subjects were assessed.

RNA isolation and serum miRNA qRT-PCR assay

RNA isolation and serum miRNA qRT-PCR assay were conducted as previously described.16–18 Briefly, venous blood samples (~5 ml) were collected from each donor and placed in a serum separator tube. Samples were processed within 1 hr. Separation of the serum was accomplished by centrifugation at 800 g for 10 min at room temperature, followed by a 15 min high-speed centrifugation at 10,000 g at room temperature to completely remove the cell debris. The supernatant serum was recovered and stored at −80°C until analysis.

Table 1. Demographic and clinical features of NSCLC patients and healthy subjects

| Variable | Category | Training set: NSCLC (n = 200), No. (%) | Training set: Control (n = 110), No. (%) | p-value (NSCLC vs. control) | Validation set: NSCLC (n = 200), No. (%) | Validation set: Control (n = 110), No. (%) | p-value (NSCLC vs. control) | p-value (NSCLC in training vs. validation) |
|---|---|---|---|---|---|---|---|---|
| Average age (years) | | 59.6 ± 10.5 | 57.9 ± 7.9 | 0.146 | 59.5 ± 10.3 | 58.7 ± 8.2 | 0.488 | 0.977 |
| Age (years) | ≤60 | 97 (48.5) | 66 (60) | 0.069 | 100 (50) | 57 (51.8) | 0.851 | 0.842 |
| | >60 | 103 (51.5) | 44 (40) | | 100 (50) | 53 (48.2) | | |
| Female | | 68 (34) | 34 (30.9) | | 43 (21.5) | 31 (28.2) | | |
| Smoking status | Current | 95 (47.5) | 46 (41.8) | 0.014 | 116 (58) | 56 (50.9) | 0.011 | 0.027 |
| | Ever | 28 (14) | 6 (5.5) | | 32 (16) | 9 (8.2) | | |
| | Never | 77 (38.5) | 58 (52.7) | | 52 (26) | 45 (40.9) | | |
| Histological types | Adenocarcinoma | 124 (62) | | | 108 (54) | | | 0.247 |
| | Squamous cell carcinoma | 60 (30) | | | 75 (37.5) | | | |
| | Large cell carcinoma | 16 (8) | | | 17 (8.5) | | | |
| Stage | I | 39 (19.5) | | | 60 (30) | | | < 0.001 |
| | II | 33 (16.5) | | | 28 (14) | | | |
| | III | 76 (38) | | | 82 (41) | | | |
| | IV | 52 (26) | | | 30 (15) | | | |
| Other diseases | None | 188 (94) | 107 (97.3) | 0.199 | 190 (95) | 107 (97.3) | 0.423 | 0.745 |
| | Cardiac dysfunction | 5 (2.5) | 0 (0) | | 4 (2) | 0 (0) | | |
| | Active infection | 4 (2) | 3 (2.7) | | 5 (2.5) | 3 (2.7) | | |
| | Neurologic disorders | 3 (1.5) | 0 (0) | | 1 (0.5) | 0 (0) | | |

Statistical comparison was performed by using Student's t-test or two-sided χ² test.

Total RNA was extracted from 250 μl of serum using the Trizol LS Reagent (Invitrogen, Carlsbad, CA) according to the manufacturer's instructions. Typically, after extracting total RNA from 250 μl serum using Trizol LS Reagent, the RNA concentrations in the yield were in the range of ~100–200 ng. qRT-PCR was carried out using a TaqMan miRNA PCR kit (Applied Biosystems, Foster City, CA) according to the manufacturer's instructions. Briefly, 5 μl (~5–10 ng/μl) of total RNA was reverse-transcribed to cDNA using AMV reverse transcriptase (TaKaRa, Dalian, China) and the stem-loop RT primers (Applied Biosystems). Real-time PCR was performed using TaqMan miRNA probes (Applied Biosystems) on the Applied Biosystems 7300 Sequence Detection System (Applied Biosystems). All reactions, including the no-template controls, were run in triplicate. After the reactions, the CT values were determined using the fixed threshold settings. To calculate the absolute expression levels of the target miRNAs, a series of synthetic miRNA oligonucleotides (dissolved in water) of known concentrations (from 1 fM to 10^5 fM) were also reverse-transcribed and amplified. The absolute amount of each miRNA was then calculated by referring to the standard curve. Since U6 and 5S rRNA are degraded in serum samples and there is no current consensus on housekeeping miRNAs for qRT-PCR analysis of serum miRNAs, the expression levels of miRNAs were directly normalized to serum volume in our study.
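The standard-curve step can be illustrated with a minimal Python sketch. It assumes the usual log-linear relationship between CT and the concentration of the synthetic standards; the CT values below are invented for illustration and are not taken from the study, and the function names are hypothetical.

```python
import math

def fit_standard_curve(concentrations_fm, ct_values):
    """Least-squares line CT = slope * log10(concentration) + intercept."""
    xs = [math.log10(c) for c in concentrations_fm]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def ct_to_concentration(ct, slope, intercept):
    """Invert the fitted line to obtain a concentration in fM."""
    return 10 ** ((ct - intercept) / slope)

# Dilution series from 1 fM to 10^5 fM, as in the text; the CT values are hypothetical.
standards_fm = [1e0, 1e1, 1e2, 1e3, 1e4, 1e5]
standards_ct = [38.0, 34.7, 31.4, 28.1, 24.8, 21.5]

slope, intercept = fit_standard_curve(standards_fm, standards_ct)
sample_fm = ct_to_concentration(31.4, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, sample={sample_fm:.1f} fM")
```

Because the result is read directly off the curve in fM, no housekeeping gene is needed, which is consistent with the normalization to serum volume described above.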

Statistical analysis

Statistical comparisons of the demographic features between the NSCLC cases and control samples, or between the NSCLC cases from the training set and the validation set, were performed by using Student's t-test or two-sided χ² test. The differences were considered statistically significant at p < 0.05.

Risk score analysis was performed to evaluate the associations between NSCLC and the expression levels of the serum miRNAs. The risk score of each miRNA in the training set, denoted as s, was set as 1 if the expression level was greater than the upper 95% reference interval for the corresponding miRNA level in controls and as 0 if otherwise.
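A minimal Python sketch of this binarization follows. The text does not say how the upper 95% reference limit was computed, so the sketch assumes a parametric limit (control mean + 1.96 × SD); a percentile-based limit would be an equally plausible reading. The control values and function names are hypothetical.

```python
import statistics

def upper_reference_limit(control_levels):
    """Upper bound of a parametric 95% reference interval.

    Assumption: mean + 1.96 * sample SD of the control values.
    """
    return statistics.mean(control_levels) + 1.96 * statistics.stdev(control_levels)

def risk_score(level, limit):
    """Return 1 if the expression level exceeds the control limit, else 0."""
    return 1 if level > limit else 0

# Hypothetical control concentrations (fM) for a single miRNA.
control_fm = [20.0, 25.0, 30.0, 35.0, 40.0]
limit = upper_reference_limit(control_fm)

# Hypothetical test samples (fM) scored against the control-derived limit.
scores = [risk_score(x, limit) for x in [28.0, 60.0, 450.0]]
print(round(limit, 2), scores)
```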

When taking into account the correlation of each miRNA with NSCLC risk, each patient was assigned a risk score function (RSF) according to a linear combination of the expression level of the miRNA. The RSF for sample i using the information from the ten miRNAs was:

$$\mathrm{RSF}_i = \sum_{j=1}^{10} W_j \cdot s_{ij}$$

In the above equation, s_ij is the risk score for miRNA j on sample i, and W_j is the weight of the risk score of miRNA j. To determine the Ws, ten univariate logistic regression models were fitted using the disease status with each of the risk scores. The regression coefficient of each risk score was used as the weight to indicate the contribution of each miRNA to the RSF. The frequency table and ROC curves were then used to evaluate the diagnostic effects of the profiling and to find the appropriate cutoff point. Verification of the procedure and the cutoffs was performed in the validation sample set. All the statistical analyses were performed with Statistical Analysis System software (v.9.1.3; SAS Institute, Cary, NC).
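The weighting scheme can be sketched in Python without a regression library. For a single 0/1 predictor, the maximum-likelihood coefficient of a univariate logistic regression equals the log odds ratio of the 2 × 2 table, so the sketch uses that closed form. This is an illustration rather than the SAS procedure used in the study; the toy data use two miRNAs instead of ten, and all names are hypothetical.

```python
import math

def log_odds_ratio_weight(scores, is_case):
    """Weight W_j for one miRNA: ln(a*d / (b*c)) from the 2x2 table.

    a, b = cases with score 1 / 0; c, d = controls with score 1 / 0.
    """
    pairs = list(zip(scores, is_case))
    a = sum(1 for s, y in pairs if s == 1 and y == 1)
    b = sum(1 for s, y in pairs if s == 0 and y == 1)
    c = sum(1 for s, y in pairs if s == 1 and y == 0)
    d = sum(1 for s, y in pairs if s == 0 and y == 0)
    if 0 in (a, b, c, d):
        raise ValueError("empty cell: the log odds ratio is not finite")
    return math.log((a * d) / (b * c))

def risk_score_function(sample_scores, weights):
    """RSF_i = sum over j of W_j * s_ij."""
    return sum(w * s for w, s in zip(weights, sample_scores))

# Hypothetical toy data: rows are samples, columns are two miRNAs.
is_case = [1, 1, 1, 1, 0, 0, 0, 0]
score_matrix = [
    [1, 1], [1, 0], [1, 1], [0, 1],   # cases
    [0, 0], [1, 0], [0, 0], [0, 1],   # controls
]

weights = [
    log_odds_ratio_weight([row[j] for row in score_matrix], is_case)
    for j in range(2)
]
rsf = [risk_score_function(row, weights) for row in score_matrix]
print(weights, rsf)
```

Samples are then ranked by their RSF value and split at a cutoff chosen on the training set.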

For cluster analysis, we used hierarchical clustering in Cluster 3.0 with the complete linkage method.

Results

Patient description

A two-phase, case–control test was designed to identify serum miRNAs as a surrogate marker for NSCLC (Fig. 1). In the initial biomarker selection stage, serum samples from 200 NSCLC cases (Nantong Cancer Hospital, Jiangsu, China) and 110 matched controls (Jinling Hospital, Nanjing, China) were subjected to qRT-PCR assay. The significantly altered miRNAs were selected and validated in an additional 200 NSCLC patients (Cancer Hospital of Jiangsu Province, Nanjing, China) and 110 controls (Jinling Hospital, Nanjing, China). All 400 patients enrolled in the present study had a clinical and pathological diagnosis of NSCLC. There was no significant difference in the distribution of age and gender between the cancer patients and the normal subjects, while the NSCLC group had more smokers than the control group. In general, NSCLC patients and control subjects had no other diseases, including significant cardiac dysfunction, active infection (hepatitis, tuberculosis, etc.), and neurological or psychiatric disorders, at the time when blood was drawn.

Figure 1. Overview of the design strategy. In the initial biomarker selection stage, we screened the expression levels of 91 miRNAs in a training sample set consisting of 200 NSCLC cases (Nantong Cancer Hospital) and 110 controls (Jinling Hospital) using the method of stem-loop qRT-PCR. Subsequently, the significantly altered miRNAs were validated in another independent sample set consisting of 200 NSCLC cases (Cancer Hospital of Jiangsu Province) and 110 controls (Jinling Hospital). Finally, the refined panel of serum miRNAs selected as the NSCLC signature was tested in the prediagnosis serum samples to determine whether this biomarker set might be useful for early lung cancer detection.

Biomarker selection and validation phase

We first evaluated the reliability and reproducibility of our qRT-PCR assay for measuring serum miRNAs. Our results suggest that miRNA can be efficiently extracted and amplified from serum, and the qRT-PCR results of serum miRNAs can be reliably compared across multiple samples (Supporting Information Fig. S1).

To identify the profile of serum miRNAs as an NSCLC signature, we first examined the expression levels of 91 miRNAs in a set of serum samples including 200 NSCLC cases and 110 controls (training set). In this phase, only those miRNAs with a mean fold change ≥ 2 and p-value < 0.05 were selected. Among the 91 miRNAs checked, 63 could be readily detected in serum, whereas 21 miRNAs were undetectable and 7 miRNAs had PCR amplification occurring in a nonlinear manner (Supporting Information Table S1). The expression levels of the 63 serum miRNAs in 200 NSCLC cases and 110 controls were visualized in a clustered heatmap (Supporting Information Fig. S2). Most miRNAs were unchanged between the cases and controls (fold change < 2 and p-value > 0.05), while 10 miRNAs were differentially expressed in NSCLC samples compared with normal samples (Table 2 and Supporting Information Table S1). All of these miRNAs were elevated in the serum samples from NSCLC cases compared with the controls (Table 2). The expression levels of these miRNAs in the control sera were extremely low, ranging from 18.57 fM to 83.06 fM. In contrast, their concentrations in the NSCLC sera were dramatically elevated, ranging from 0.43 pM to 1.74 pM (Table 2).

To verify the accuracy and specificity of these 10 miRNAs as an NSCLC signature, we further assessed them in another independent sample set consisting of 200 NSCLC cases and 110 cancer-free controls (validation set). As shown in Table 2, the trend of miRNA expression alteration was generally concordant between the training set and the validation set, and all of the 10 miRNAs were shown to be significantly up-regulated by a factor greater than two-fold. Through this two-phase test and analysis, a profile of 10 serum miRNAs was generated and carried forward as a potential biomarker for NSCLC in the subsequent analyses.
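The selection rule used in the training phase (mean fold change ≥ 2 and p < 0.05) can be checked against the training-set means and p-values reported in Table 2. The Python sketch below does only that; it does not recompute p-values, which would require the per-sample data. Fold changes recomputed from the rounded means may differ from the tabulated values in the last digit. The function name is hypothetical.

```python
# Training-set values from Table 2: (control mean fM, NSCLC mean fM, p-value).
training_set = {
    "miR-20a": (28.08, 1737.65, 1.86e-6),
    "miR-24": (38.09, 435.84, 8.81e-5),
    "miR-25": (30.63, 433.62, 7.07e-5),
    "miR-145": (18.57, 1557.1, 0.0015),
    "miR-152": (49.36, 1561.15, 0.029),
    "miR-199a-5p": (29.57, 462.51, 0.00091),
    "miR-221": (22.21, 1445.99, 0.019),
    "miR-222": (35.23, 648.25, 0.00044),
    "miR-223": (49.11, 1262.86, 0.0048),
    "miR-320": (83.06, 1726.01, 1.5e-7),
}

def passes_selection(control_mean, case_mean, p_value, min_fold=2.0, alpha=0.05):
    """Apply the two-part rule: fold change >= min_fold and p < alpha."""
    return case_mean / control_mean >= min_fold and p_value < alpha

fold_change = {name: case / ctrl for name, (ctrl, case, _) in training_set.items()}
selected = [name for name, values in training_set.items() if passes_selection(*values)]

for name in selected:
    ctrl, case, _ = training_set[name]
    # 1 pM = 1000 fM, which is how the text reports the NSCLC range.
    print(f"{name}: fold change {fold_change[name]:.2f}, NSCLC mean {case / 1000:.2f} pM")
```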

Prediction of NSCLC cases and control subjects by risk score analysis

Table 2. Serum miRNAs differentially expressed in NSCLC cases compared with control subjects

| miRNA | Training set: Control | Training set: NSCLC | Fold change | p-value | Validation set: Control | Validation set: NSCLC | Fold change | p-value |
|---|---|---|---|---|---|---|---|---|
| miR-20a | 28.08 ± 5.35 | 1737.65 ± 261.24 | 61.89 | 1.86 × 10⁻⁶ | 30.36 ± 5.53 | 606.44 ± 94.67 | 19.97 | 8.89 × 10⁻⁶ |
| miR-24 | 38.09 ± 6.75 | 435.84 ± 74.27 | 11.44 | 8.81 × 10⁻⁵ | 41.32 ± 6.83 | 398.56 ± 61.86 | 9.65 | 2.47 × 10⁻⁵ |
| miR-25 | 30.63 ± 5.23 | 433.62 ± 74.26 | 14.15 | 7.07 × 10⁻⁵ | 31.49 ± 5.76 | 553.42 ± 117.95 | 17.57 | 0.0011 |
| miR-145 | 18.57 ± 3.71 | 1557.1 ± 357.33 | 83.85 | 0.0015 | 21.86 ± 3.46 | 200.79 ± 22.06 | 9.18 | 5.41 × 10⁻⁹ |
| miR-152 | 49.36 ± 10.59 | 1561.15 ± 513.15 | 31.63 | 0.029 | 43.57 ± 9.04 | 732.23 ± 123.55 | 16.81 | 4.53 × 10⁻⁵ |
| miR-199a-5p | 29.57 ± 5.16 | 462.51 ± 96.02 | 15.64 | 0.00091 | 22.99 ± 5.21 | 505.76 ± 96.89 | 22 | 0.00025 |
| miR-221 | 22.21 ± 8.42 | 1445.99 ± 447.73 | 65.09 | 0.019 | 22.05 ± 6.79 | 117.41 ± 26.8 | 5.33 | 0.0093 |
| miR-222 | 35.23 ± 8.18 | 648.25 ± 128.1 | 18.4 | 0.00044 | 38.76 ± 7.05 | 1027.9 ± 75.72 | 26.52 | 1.45 × 10⁻¹⁹ |
| miR-223 | 49.11 ± 13.62 | 1262.86 ± 317.63 | 25.71 | 0.0048 | 56.19 ± 18.67 | 700.49 ± 103.95 | 12.47 | 6.68 × 10⁻⁶ |
| miR-320 | 83.06 ± 20.95 | 1726.01 ± 226.71 | 20.78 | 1.5 × 10⁻⁷ | 69.9 ± 18.82 | 1106.6 ± 112.51 | 15.83 | 4.9 × 10⁻¹¹ |

The concentration of miRNAs is presented as mean ± SE (fM).

To further evaluate the diagnostic value of this 10-miRNA profiling system, we performed a risk score analysis on the data set and employed this risk scoring method to predict NSCLC cases and control subjects. First, the risk score formula was used to calculate the risk scores for all samples in the training set. Samples were ranked according to their risk scores and then divided into a high-risk group, representing the predicted NSCLC cases, or a low-risk group, representing the predicted control subjects, using the optimal cutoff value of 5.006. At this cutoff, the sensitivity was 0.93 and the specificity was 0.90, with the value of sensitivity + specificity considered to be maximal. As shown in Supporting Information Table S2, only 11 of the 110 controls had a risk score > 5.006, while 186 out of the 200 NSCLC samples had a risk score > 5.006. Second, the risk score formula, using the same cutoff point, was used to calculate the risk score for samples from the validation set. The sensitivity for the validation set was 0.925 and the specificity was 0.90. Out of 200 NSCLC cases and 110 controls from the validation set, only 12 controls and 14 NSCLC cases were incorrectly predicted by this risk score method.
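The training-set sensitivity and specificity follow directly from the reported counts (186 of 200 cases and 11 of 110 controls above the cutoff). The Python sketch below reproduces that arithmetic and also shows one simple way to pick a cutoff that maximizes sensitivity + specificity. The cutoff search runs on invented toy scores, and the function names are hypothetical; it is not the procedure the authors ran in SAS.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)

# Training-set counts from the text at the cutoff of 5.006.
sens, spec = sensitivity_specificity(tp=186, fn=14, tn=110 - 11, fp=11)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")

def best_cutoff(risk_scores, is_case):
    """Scan observed scores; a sample is called a case if its score > cutoff."""
    best = None
    pairs = list(zip(risk_scores, is_case))
    for cutoff in sorted(set(risk_scores)):
        tp = sum(1 for r, y in pairs if y == 1 and r > cutoff)
        fn = sum(1 for r, y in pairs if y == 1 and r <= cutoff)
        tn = sum(1 for r, y in pairs if y == 0 and r <= cutoff)
        fp = sum(1 for r, y in pairs if y == 0 and r > cutoff)
        se, sp = sensitivity_specificity(tp, fn, tn, fp)
        if best is None or se + sp > best[1]:
            best = (cutoff, se + sp)
    return best

# Hypothetical toy risk scores: three controls and three cases.
toy_scores = [1.0, 2.0, 6.0, 7.0, 8.0, 3.0]
toy_labels = [0, 0, 1, 1, 1, 0]
toy_best = best_cutoff(toy_scores, toy_labels)
print(toy_best)
```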

Figure 2. ROC curve analysis for the discrimination between NSCLC serum samples and control samples by the 10-serum miRNA profile. (a and b) ROC curves for the 10-serum miRNA profile to distinguish NSCLC serum samples from control samples in the training set (a) and validation set (b). (c and d) The contribution of individual serum miRNAs to the AUC of the ROC curve. The absolute expression levels of the 10 serum miRNAs in samples from the training set (c) and validation set (d) were measured by the TaqMan probe-based qRT-PCR. ROC curves were established to evaluate the diagnostic value of each miRNA for differentiating between NSCLC cases and controls. The 10 miRNAs were added one-by-one according to their individual AUC (from high to low). The AUC of the ROC curve improved incrementally with each addition of the next best miRNA.

We also constructed receiver operating characteristic (ROC) curves for continuous predictors using these risk score functions to estimate the sensitivity and specificity of the miRNA-based biomarkers. The areas under the curve (AUC) were 0.966 and 0.972 for the training set and validation set, respectively (Figs. 2a and 2b). The results indicate that the profile of the 10 serum miRNAs is an accurate biomarker for NSCLC diagnosis. To illustrate the contribution of individual serum miRNAs to the AUC of the ROC curve, we established ROC curves to evaluate the diagnostic value of each miRNA for differentiating between NSCLC cases and controls. We found that the best miRNA alone can yield an area under the ROC curve of more than 80%, and that the subsequent addition of each of the nine miRNAs incrementally improved the sensitivity and specificity of the miRNA-based biomarker in discriminating NSCLC cases from cancer-free controls (Figs. 2c and 2d). These results clearly indicate that, although one particular miRNA in serum may help distinguish NSCLC cases from cancer-free controls, a combination of a panel of miRNAs has great potential to offer much more sensitive and specific diagnostic tests.
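For readers unfamiliar with the AUC, the Python sketch below computes it in its rank-based form: the probability that a randomly chosen case has a higher risk score than a randomly chosen control, with ties counted as one half. This equals the area under the empirical ROC curve. The scores are invented for illustration and are unrelated to the study data; the function name is hypothetical.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the fraction of case/control pairs ranked correctly (ties = 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical risk scores for four cases and four controls.
cases = [6.2, 7.1, 4.0, 8.3]
controls = [1.0, 4.0, 2.5, 5.5]
auc = roc_auc(cases, controls)
print(auc)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which puts the reported values of 0.966 and 0.972 in context.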

The differential expression of miRNAs between the NSCLC and control serum samples was also analyzed by an unsupervised clustering that was blind to the clinical annotations. The dendrogram generated by the cluster analysis showed a clear separation of the NSCLC samples from the control samples based on the 10-serum miRNA profile (Supporting Information Figs. S3A and S3B). In the training set, 2 of 110 control samples and 53 of 200 NSCLC samples were classified incorrectly (Supporting Information Fig. S3A). In the validation set, 200 NSCLC cases and 110 controls were also clearly separated into two main classes, with 22 NSCLC cases and 3 control samples classified incorrectly (Supporting Information Fig. S3B).

Correlation of serum miRNA expression with demographic and clinical factors

Diagnosis of NSCLC by traditional assays is generally affected by the tumor's TNM stage, histological subset and smoking history of the patients. To determine whether our serum miRNA-based tumor marker was affected by such clinical features, we explored the correlation of serum miRNA expression with demographic and clinical factors using Student's t-test or one-way ANOVA. In this analysis, the samples from the training set and validation set were combined and used to conduct the calculation. No obvious difference was observed when NSCLC cases were stratified by gender, age, histological subsets of tumors or smoking history of the patients, whereas the expression level of serum miRNA was correlated with the stage of NSCLC patients. As shown in Supporting Information Figures S4A–S4C, values of risk score, which could differentiate cancer cases from normal controls, were differentially distributed in cancers with different tumor stage. In particular, mean risk score (Supporting Information Fig. S4A) and high risk score rate (Supporting Information Figs. S4B and S4C) progressively increased from Stage I to Stage IV NSCLC cases. In the opposite direction, the false positive rate progressively decreased (Supporting Information Fig. S4D). We also noted that risk score showed a correlation with tumor stage in younger patients (≤ 60 years) (Supporting Information Fig. S4E) but not in older patients (> 60 years). Moreover, if only patients with current smoking habits were enrolled, risk score would show a significant correlation with tumor stage (Supporting Information Fig. S4F). In sum, these results show that the high risk score or high level of serum miRNAs in NSCLC patients is associated with advanced clinical stages of this tumor.

The 10-serum miRNA profile as a potential marker for early diagnosis of NSCLC

We further evaluated the potential of using these biomarkers as diagnostic markers for early lung cancers by using a retrospective analysis. In this analysis, serum samples were collected from 20,000 individuals who participated in a community-based screening program for noninfectious diseases conducted in the Jiangsu province during 2004 and 2005. Subsequently, subjects who had no history of cancer were followed up in July of 2008. Seven individuals from this pool of 20,000 had been diagnosed with NSCLC. The prediagnosis serum samples of these seven lung cancer cases, as well as 50 randomly selected normal controls, were obtained and assessed by the qRT-PCR assay. As shown in Table 3, six out of these seven subjects actually had a risk score ≥ 5.006 and were subsequently classified as lung cancer cases by the 10-serum miRNA signature. Since these seven subjects were "cancer free" at recruitment but later clinically classified as lung cancer patients, this result strongly suggests that our serum miRNA-based biomarker might be capable of detecting lung cancer at its early stages. In this case, the longest leading time of tumor identification by the 10-serum miRNA biomarker was 33.37 months. Interestingly, one of the seven subjects, sample Pre-6, was classified as cancer-free according to the risk score based on the 10-serum miRNA biomarker. Considering that the serum from this subject was collected 23.17 months ahead of the clinical diagnosis of only Stage I NSCLC, it is likely that this subject did not have lung malignancy at the time when the blood was drawn. None of the randomly selected 50 tumor-free individuals had a high-risk signature (risk score ≥ 5.006) (Supporting Information Table S3). These results significantly strengthen the clinical applicability of the 10-serum miRNA signature as an early diagnosis marker of NSCLC.

Table 3. Risk scores of the prediagnosis samples

Sample labels Pre-1 Pre-2 Pre-3 Pre-4 Pre-5 Pre-6 Pre-7

Age (years) 76 59 42 70 53 52 71

Sex Female Female Male Male Male Male Male

Smoking status Never Never Ever Ever Ever Ever Ever

Leading time (month)¹ 33.37 26.47 8.23 2.03 9.4 23.17 0.7

Stage IV III II I

Survival status Death of disease Death of disease Death of disease Death of disease Alive Alive Death of disease

Survival time (month)² 5.93 11.63 0.6 4.07 38.07 12.73 5.17

Risk score 28.018 10.2069 6.6879 5.1904 6.4625 2.8433 13.0151

¹Leading time = diagnostic time − blood drawing time. ²Survival time = death time (follow-up time) − diagnostic time.
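The classification of the prediagnosis samples can be reproduced from the risk scores and leading times in Table 3. The Python sketch below assumes that a sample is called high risk when its score is at or above the 5.006 cutoff; the variable names are hypothetical.

```python
CUTOFF = 5.006  # risk score cutoff quoted in the text

# Leading time (months) and risk score for each prediagnosis sample, from Table 3.
SAMPLES = {
    "Pre-1": {"lead_months": 33.37, "risk_score": 28.018},
    "Pre-2": {"lead_months": 26.47, "risk_score": 10.2069},
    "Pre-3": {"lead_months": 8.23, "risk_score": 6.6879},
    "Pre-4": {"lead_months": 2.03, "risk_score": 5.1904},
    "Pre-5": {"lead_months": 9.4, "risk_score": 6.4625},
    "Pre-6": {"lead_months": 23.17, "risk_score": 2.8433},
    "Pre-7": {"lead_months": 0.7, "risk_score": 13.0151},
}

# Assumption: scores at or above the cutoff are classified as NSCLC.
flagged = [name for name, s in SAMPLES.items() if s["risk_score"] >= CUTOFF]
missed = [name for name in SAMPLES if name not in flagged]
longest_lead = max(SAMPLES[name]["lead_months"] for name in flagged)

print(f"{len(flagged)} of {len(SAMPLES)} samples classified as NSCLC")
print("classified as cancer-free:", missed)
print("longest leading time among flagged samples (months):", longest_lead)
```

The sketch recovers the six flagged samples, the single cancer-free call (Pre-6) and the 33.37-month leading time described in the text.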

Discussion

Recent studies by our group¹⁶ and others¹⁹⁻²³ have indicated that the unique patterns of serum miRNA may serve as novel noninvasive biomarkers for various diseases, including cancers. There are several advantages to using serum miRNA in clinical diagnosis or screening of NSCLC: (i) a serum-based biomarker would allow the comprehensive analysis of tumors without the requirement of invasive procedures, such as a biopsy or surgery; (ii) a serum miRNA-based test has a low cost and easy sample management, including sample collection and processing; and (iii) a serum miRNA-based test recruits a panel of miRNAs as the marker for NSCLC detection instead of an individual miRNA. Because a panel of miRNAs reflects various aspects of tumorigenesis, the combination of these miRNAs forms a more complete indicator for tumor detection than the conventional single protein or carbohydrate molecule-based biomarkers. As shown in Figure 2, AUC, sensitivity, and specificity for NSCLC detection by our 10-miRNA biomarker are 0.97, 0.93 and 0.90, respectively, which are significantly higher than those of any single-factor index, such as CYFRA 21-1 (AUC ≈ 0.84, sensitivity ≈ 0.5, specificity ≈ 0.95), TPS (AUC ≈ 0.74, sensitivity ≈ 0.34, specificity ≈ 0.95), and CEA (AUC ≈ 0.8, sensitivity ≈ 0.53, specificity ≈ 0.95).²⁴,²⁵

Our results also demonstrated that a combination of multiple serum miRNAs is more reliable than the single miRNA-based assays previously proposed for diagnosis of cancers. The early studies on serum or plasma miRNAs as disease fingerprints mainly focused on single or only a few tumor-specific miRNAs.²²,²³ Although this kind of approach is simple and straightforward, the specificity based on individual miRNAs is generally poor. The diverse, complex molecular events involved in the initiation and development of a malignancy severely limit the utility of individual miRNAs as a tumor biomarker. Although in our ROC analysis of individual serum miRNAs each serum miRNA alone can provide a specific and sensitive test for certain patients (AUC = 0.7–0.8), the ROC curves obtained by sequentially adding miRNAs are better at addressing the issues of tumor and population heterogeneity (Figs. 2c and 2d). Interestingly, it seems that a signature of 10 or 9 miRNAs might have similar efficacy (Figs. 2c and 2d). To find the optimized risk score function (optimized number of miRNAs included in the model), we randomly split the training set into two sets of equal sample size. Subsequently, the weight of each marker of the risk model was derived using the first set, and the effect of the number of miRNAs in the model on AUC was evaluated using the second set. The procedure was repeated 1000 times. The result showed that the average AUC of the 10-miRNA model is greater than that of the other models (Supporting Information Table S4). Therefore, for both practical and technical considerations, we chose 10 miRNAs as the signature of NSCLC.
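The repeated split-half procedure can be sketched in Python as follows. Several choices here are assumptions for illustration rather than the method used in this study: the data are synthetic, cases and controls are halved separately, each marker's weight is the difference between the case mean and the control mean in the first half, and markers enter the model in a fixed order. The function name `mean_auc_by_panel_size` is hypothetical, and the default number of repeats is kept below 1000 only to keep the example fast.

```python
import random
from statistics import mean

def auc(case_scores, control_scores):
    """Probability that a random case outscores a random control (ties = 0.5)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            wins += 1.0 if c > k else 0.5 if c == k else 0.0
    return wins / (len(case_scores) * len(control_scores))

def weighted_score(sample, weights, k):
    """Weighted sum over the first k markers."""
    return sum(weights[j] * sample[j] for j in range(k))

def mean_auc_by_panel_size(cases, controls, n_repeats=200, seed=0):
    """Average held-out AUC for models using the first 1..n markers."""
    rng = random.Random(seed)
    n_markers = len(cases[0])
    totals = [0.0] * n_markers
    for _ in range(n_repeats):
        ca, co = cases[:], controls[:]
        rng.shuffle(ca)
        rng.shuffle(co)
        fit_ca, test_ca = ca[:len(ca) // 2], ca[len(ca) // 2:]
        fit_co, test_co = co[:len(co) // 2], co[len(co) // 2:]
        # Assumed weighting: case mean minus control mean in the first half.
        weights = [mean(s[j] for s in fit_ca) - mean(s[j] for s in fit_co)
                   for j in range(n_markers)]
        # Evaluate every panel size on the second half.
        for k in range(1, n_markers + 1):
            totals[k - 1] += auc(
                [weighted_score(s, weights, k) for s in test_ca],
                [weighted_score(s, weights, k) for s in test_co],
            )
    return [t / n_repeats for t in totals]

# Synthetic data: six markers, each shifted upward in cases.
data_rng = random.Random(42)
shifts = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5]
cases = [[data_rng.gauss(s, 1.0) for s in shifts] for _ in range(60)]
controls = [[data_rng.gauss(0.0, 1.0) for _ in shifts] for _ in range(40)]

mean_aucs = mean_auc_by_panel_size(cases, controls)
for k, value in enumerate(mean_aucs, start=1):
    print(f"{k}-marker model: mean held-out AUC = {value:.3f}")
```

Comparing the averaged AUC across panel sizes is what allows the number of markers to be chosen on data that were not used to derive the weights.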

The potential of this 10-serum miRNA-based biomarker in the diagnosis of early NSCLC is particularly intriguing. Although the number of tested NSCLC cases (7 from 20,000) was relatively small, the accuracy rate of this serum miRNA assay was astonishingly high. We identified 6 out of 7 prediagnosis serum samples that can be classified as NSCLC, whereas none of the 50 randomly picked control samples had a risk score higher than the threshold. Since patients with early stage NSCLC can undergo complete resection of tumors, our data strongly suggest that using the 10-serum miRNA profile as the biomarker for defining the early events of NSCLC is an effective way to change the outcomes and improve the prognosis.

A comparison of the miRNA expression patterns between serum and tissues/cells may provide additional evidence supporting the use of serum miRNAs as reliable diagnostic biomarkers. Among the 10 serum miRNAs used for NSCLC diagnosis, many are known to be associated with lung cancer. For example, increased levels of miR-221, miR-223, miR-199a-5p, miR-20a, miR-25 and miR-24 were seen in tissue samples from lung cancer patients.²⁶ Likewise, miR-221 and miR-222 were reported to be the most upregulated miRNAs in TRAIL-resistant nonsmall cell lung cancer cells.²⁷ All of the 10 selected miRNAs are associated with genes linked to tumorigenesis (Supporting Information Table S5). The concordance between the expression of serum miRNAs and tissue miRNAs previously identified in the same type of tumor suggests that these serum miRNAs could be derived from tumor cells or tissues/cells affected by tumors.

In this study, we identified a correlation between serum miRNA expression and tumor stage, suggesting that serum miRNA may have an application in the precise clinical description of NSCLC. However, since our initial screening used NSCLC serum samples without tumor stage separation, the 10-serum miRNA profile cannot absolutely discriminate Stage I/II NSCLC from more advanced NSCLC. Future studies may be necessary to directly compare the sera from various stages of NSCLC at the miRNA screening phase. Further studies will also be required to test other types of tumors to determine whether the 10-serum miRNA profile is capable of discriminating NSCLC from other tumors. Nevertheless, given that the TNM staging system is presently the most important tool used by clinical oncologists to make estimates of tumor burden, to predict prognosis and survival, and to choose the best combination of treatment modalities such as surgery, radiation and chemotherapy,²⁸⁻³⁰ it is intriguing to speculate that serum miRNAs might have an important prognostic value and could serve as an important predictive parameter in the decision on adjuvant treatment protocols.

In conclusion, we have demonstrated that the expression profile of 10-serum miRNAs can serve as a noninvasive biomarker for NSCLC detection. The results also show a potential of this 10-serum miRNA signature in diagnosis of NSCLC at its very early stage. The future application of this serum miRNA-based tumor biomarker may initiate a revolution in clinical management.

Acknowledgements

The authors thank Dr. Thomas M. Roberts at Harvard Medical School for his generous support during the course of this study.

References

1. Jemal A, Siegel R, Ward E, Murray T, Xu J, Thun MJ. Cancer statistics, 2007. CA Cancer J Clin 2007;57:43–66.

2. Parkin DM, Bray F, Ferlay J, Pisani P. Estimating the world cancer burden: Globocan 2000. Int J Cancer 2001;94:153–6.

3. Ganti AK, Mulshine JL. Lung cancer screening. Oncologist 2006;11:481–7.

4. Patz EF, Jr., Goodman PC, Bepler G. Screening for lung cancer. N Engl J Med 2000;343:1627–33.

5. Brambilla C, Fievet F, Jeanmart M, de Fraipont F, Lantuejoul S, Frappat V, Ferretti G, Brichon PY, Moro-Sibilot D. Early detection of lung cancer: role of biomarkers. Eur Respir J Suppl 2003;39:36s–44s.

6. Rossi A, Maione P, Colantuoni G, Gaizo FD, Guerriero C, Nicolella D, Ferrara C, Gridelli C. Screening for lung cancer: New horizons? Crit Rev Oncol Hematol 2005;56:311–20.

7. Dominioni L, Imperatori A, Rovera F, Ochetti A, Torrigiotti G, Paolucci M. Stage I nonsmall cell lung carcinoma: analysis of survival and implications for screening. Cancer 2000;89:2334–44.

8. Oken MM, Marcus PM, Hu P, Beck TM, Hocking W, Kvale PA, Cordes J, Riley TL, Winslow SD, Peace S, Levin DL, Prorok PC, et al. Baseline chest radiograph for lung cancer detection in the randomized Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial. J Natl Cancer Inst 2005;97:1832–9.

9. Henschke CI, Yankelevitz DF, Libby DM, Pasmantier MW, Smith JP, Miettinen OS. Survival of patients with stage I lung cancer detected on CT screening. N Engl J Med 2006;355:1763–71.

10. Mascalchi M, Belli G, Zappa M, Picozzi G, Falchini M, Della Nave R, Allescia G, Masi A, Pegna AL, Villari N, Paci E. Risk-benefit analysis of X-ray exposure associated with lung cancer screening in the Italung-CT trial. AJR Am J Roentgenol 2006;187:421–9.

11. Tarro G, Perna A, Esposito C. Early diagnosis of lung cancer by detection of tumor liberated protein. J Cell Physiol 2005;203:1–5.

12. He L, Hannon GJ. MicroRNAs: small RNAs with a big role in gene regulation. Nat Rev Genet 2004;5:522–31.

13. Bartel DP. MicroRNAs: genomics, biogenesis, mechanism, and function. Cell 2004;116:281–97.

14. Esquela-Kerscher A, Slack FJ. Oncomirs - microRNAs with a role in cancer. Nat Rev Cancer 2006;6:259–69.

15. Calin GA, Croce CM. MicroRNA signatures in human cancers. Nat Rev Cancer 2006;6:857–66.

16. Chen X, Ba Y, Ma L, Cai X, Yin Y, Wang K, Guo J, Zhang Y, Chen J, Guo X, Li Q, Li X, et al. Characterization of microRNAs in serum: a novel class of biomarkers for diagnosis of cancer and other diseases. Cell Res 2008;18:997–1006.

17. Chen C, Ridzon DA, Broomer AJ, Zhou Z, Lee DH, Nguyen JT, Barbisin M, Xu NL, Mahuvakar VR, Andersen MR, Lao KQ, Livak KJ, et al. Real-time quantification of microRNAs by stem-loop RT-PCR. Nucleic Acids Res 2005;33:e179.

18. Tang F, Hajkova P, Barton SC, Lao K, Surani MA. MicroRNA expression profiling of single whole embryonic stem cells. Nucleic Acids Res 2006;34:e9.

19. Mitchell PS, Parkin RK, Kroh EM, Fritz BR, Wyman SK, Pogosova-Agadjanyan EL, Peterson A, Noteboom J, O'Briant KC, Allen A, Lin DW, Urban N, et al. Circulating microRNAs as stable blood-based markers for cancer detection. Proc Natl Acad Sci USA 2008;105:10513–8.

20. Resnick KE, Alder H, Hagan JP, Richardson DL, Croce CM, Cohn DE. The detection of differentially expressed microRNAs from the serum of ovarian cancer patients using a novel real-time PCR platform. Gynecol Oncol 2009;112:55–9.

21. Gilad S, Meiri E, Yogev Y, Benjamin S, Lebanony D, Yerushalmi N, Benjamin H, Kushnir M, Cholakh H, Melamed N, Bentwich Z, Hod M, et al. Serum microRNAs are promising novel biomarkers. PLoS One 2008;3:e3148.

22. Ng EK, Chong WW, Jin H, Lam EK, Shin VY, Yu J, Poon TC, Ng SS, Sung JJ. Differential expression of microRNAs in plasma of patients with colorectal cancer: a potential marker for colorectal cancer screening. Gut 2009;58:1375–81.

23. Lawrie CH, Gal S, Dunlop HM, Pushkaran B, Liggins AP, Pulford K, Banham AH, Pezzella F, Boultwood J, Wainscoat JS, Hatton CS, Harris AL. Detection of elevated levels of tumour-associated microRNAs in serum of patients with diffuse large B-cell lymphoma. Br J Haematol 2008;141:672–5.

24. Wieskopf B, Demangeat C, Purohit A, Stenger R, Gries P, Kreisman H, Quoix E. Cyfra 21–1 as a biologic marker of non-small cell lung cancer. Evaluation of sensitivity, specificity, and prognostic role. Chest 1995;108:163–9.

25. Nisman B, Lafair J, Heching N, Lyass O, Baras M, Peretz T, Barak V. Evaluation of tissue polypeptide specific antigen, CYFRA 21-1, and carcinoembryonic antigen in nonsmall cell lung carcinoma: does the combined use of cytokeratin markers give any additional information? Cancer 1998;82:1850–9.

26. Volinia S, Calin GA, Liu CG, Ambs S, Cimmino A, Petrocca F, Visone R, Iorio M, Roldo C, Ferracin M, Prueitt RL, Yanaihara N, et al. A microRNA expression signature of human solid tumors defines cancer gene targets. Proc Natl Acad Sci USA 2006;103:2257–61.

27. Garofalo M, Quintavalle C, Di Leva G, Zanca C, Romano G, Taccioli C, Liu CG, Croce CM, Condorelli G. MicroRNA signatures of TRAIL resistance in human non-small cell lung cancer. Oncogene 2008;27:3845–55.

28. Hermanek P, Sobin L, eds. TNM classification of malignant tumors, 4th edn. Berlin: Springer-Verlag 1987:69–73.

29. Mountain CF. The new International Staging System for Lung Cancer. Surg Clin North Am 1987;67:925–35.

30. Naruke T, Goya T, Tsuchiya R, Suemasu K. Prognosis and survival in resected lung carcinoma based on the new international staging system. J Thorac Cardiovasc Surg 1988;96:440–7.
