

High-resolution mapping of DNA hypermethylation and hypomethylation in lung cancer

Tibor A. Rauch*, Xueyan Zhong*, Xiwei Wu†, Melody Wang*, Kemp H. Kernstine‡, Zunde Wang*, Arthur D. Riggs*§, and Gerd P. Pfeifer*§

Divisions of *Biology, †Information Sciences, and ‡Surgery, Beckman Research Institute of the City of Hope, Duarte, CA 91010

Contributed by Arthur D. Riggs, November 12, 2007 (sent for review September 5, 2007)

Changes in DNA methylation patterns are an important characteristic of human cancer. Tumors have reduced levels of genomic DNA methylation and contain hypermethylated CpG islands, but the full extent and sequence context of DNA hypomethylation and hypermethylation is unknown. Here, we used methylated CpG island recovery assay-assisted high-resolution genomic tiling and CpG island arrays to analyze methylation patterns in lung squamous cell carcinomas and matched normal lung tissue. Normal tissues from different individuals showed overall very similar DNA methylation patterns. Each tumor contained several hundred hypermethylated CpG islands. We identified and confirmed 11 CpG islands that were methylated in 80–100% of the SCC tumors, and many hold promise as effective biomarkers for early detection of lung cancer. In addition, we find that extensive DNA hypomethylation in tumors occurs specifically at repetitive sequences, including short and long interspersed nuclear elements and LTR elements, segmental duplications, and subtelomeric regions, but single-copy sequences rarely become demethylated. The results are consistent with a specific defect in methylation of repetitive DNA sequences in human cancer.

DNA methylation | tiling arrays | CpG islands

Changes in DNA methylation patterns are frequent events in human tumors (1). DNA hypomethylation in cancer tissue was first observed more than two decades ago (2–6) and may be mechanistically linked to tumorigenesis (7). In the 1990s, researchers reported hypermethylation of CpG islands of several known and putative tumor suppressor genes and other genes involved in important genome defense pathways, such as DNA repair (1, 8–12). Today, many reports have documented methylation of CpG islands associated with a large number of different genes in almost every type of human cancer. In lung cancer, CpG islands known to be methylated include those associated with CDKN2A, RASSF1A, RARbeta, MGMT, GSTP1, CDH13, APC, DAPK, TIMP3, and several others (13–17). The methylation frequency (i.e., the percentage of tumors analyzed that carry methylated alleles) ranges from approximately 10% to approximately 80%, but these numbers differ widely depending on the tumor histology, the study population, and/or the methodology used to assess methylation. Detection of methylated CpG islands in easily accessible biological materials such as serum or sputum has the potential to be useful for the early diagnosis of lung cancer and other malignancies (18–20).

Repetitive DNA elements, such as short and long interspersed nuclear elements (SINEs and LINEs, respectively) and simple repeat sequences, are often found hypomethylated in tumors (21–26). Although it seems clear that methylation-induced silencing of tumor suppressor genes can be an important event in tumorigenesis, the magnitude, exact sequence specificity, and biological significance of tumor-associated DNA hypomethylation is much less understood (21, 26). In particular, the extent and sequence context of single-copy gene and general genome hypomethylation is not known, and it is commonly assumed that all genomic sequences are hypomethylated in tumors.

Current research approaches are geared toward the characterization of the full complement of DNA methylation changes in cancer. Several techniques, including differentiation of methylated and unmethylated sequences by use of restriction enzymes or by precipitation with an anti-5-methylcytosine antibody, have been introduced (27). We recently developed a methylation detection method, the methylated-CpG island recovery assay (MIRA), that does not depend on the use of sodium bisulfite but has similar sensitivity and specificity as bisulfite-based approaches (28). The MIRA method is based on the high affinity of the MBD2b/MBD3L1 protein complex for methylated CpG dinucleotides. Methylated double-stranded DNA sequences are enriched and are used to make probes for use with microarrays. For efficient pull down of methylated DNA by this method, two or more methylated CpG sites in a fragment of 50 or fewer base pairs are required (29). In the present study, we have used the MIRA method in combination with CpG island and genomic tiling arrays to characterize at high resolution the DNA methylation changes that occur in the genome of lung squamous cell carcinomas (SCCs). Tumor-specific CpG island DNA methylation markers are identified as well as a specific defect in methylation of repetitive DNA elements.
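The pull-down requirement stated above (two or more methylated CpG sites within a fragment of 50 or fewer base pairs) can be illustrated with a minimal sketch. The function name, the coordinate convention (one integer position per methylated CpG), and the treatment of the window as a strict span test are assumptions made for illustration only; they are not part of the MIRA protocol.

```python
def meets_mira_criterion(methylated_cpg_positions, window_bp=50, min_sites=2):
    """Return True if at least `min_sites` methylated CpGs fit in `window_bp`.

    Positions are hypothetical base-pair coordinates of methylated CpGs on
    one DNA fragment. Two sites at positions p and q (p < q) are taken to fit
    in a window of `window_bp` base pairs when q - p < window_bp.
    """
    positions = sorted(methylated_cpg_positions)
    for i in range(len(positions) - min_sites + 1):
        # Distance between the first and last site of a run of `min_sites` sites.
        if positions[i + min_sites - 1] - positions[i] < window_bp:
            return True
    return False


# Illustrative fragments (made-up coordinates).
dense_fragment = [10, 40, 300]   # two methylated CpGs 30 bp apart
sparse_fragment = [10, 100, 400]  # no two methylated CpGs within 50 bp
```

Under these assumptions, `meets_mira_criterion(dense_fragment)` is true and `meets_mira_criterion(sparse_fragment)` is false, which mirrors why isolated methylated CpGs are not expected to be enriched efficiently.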

Results

Chromosomal DNA Methylation Patterns. To analyze tumor-associated DNA methylation changes at the chromosomal level, we compared two stage I, one stage II, and one stage III lung SCCs with normal matched lung tissues. We used the MIRA-assisted microarray method (29) with genome tiling arrays for high-resolution DNA methylation analysis. MIRA-enriched and input fractions were cohybridized to NimbleGen arrays covering genomic regions at a resolution of 100 bp (Fig. 1). These MIRA microarrays provide complete and high-resolution mapping of DNA methylation patterns along chromosomes. We used arrays covering the entire short arm of chromosome 7 [57 megabases (Mb)], the entire short and part of the long arm of chromosome 8 (65 Mb), and regions of the long arm of chromosomes 6 (5 Mb) and 7 (12.7 Mb). DNA methylation profiles of normal lung and matched pairs of SCC samples were compared between different patients. We observed a striking conservation of overall chromosomal DNA methylation patterns [Fig. 1 and supporting information (SI) Fig. 5], in particular when comparing normal lung tissue from different individuals.

Hypermethylation in Tumors at CpG Islands. Regions near the centromeres and telomeres were more densely methylated than other loci in normal and tumor samples (Fig. 2A). Upon closer examination of the high-resolution methylation data, we detected 16 cancer-specifically hypermethylated regions on the short arm of chromosome 8 in one stage I SCC (tumor 2). All of them were CpG islands or CpG-rich regions, often overlapping or located in close proximity to promoter regions (SI Table 3). We analyzed the DNA methylation status of 11 of the 16 hypermethylated loci by bisulfite sequencing (examples are shown in Fig. 2B) and combined bisulfite restriction analysis (COBRA) assays (data not shown) and found this data to be entirely consistent with the array data. In stage II and stage III lung tumors, we detected a similar number of hypermethylated regions (SI Table 3). Hypermethylated targets from different patients were partially overlapping. Importantly, we found that, other than CpG islands, no other sequences were found to be hypermethylated in tumors relative to normal tissue.

Author contributions: T.A.R., K.H.K., A.D.R., and G.P.P. designed research; T.A.R., X.Z., M.W., and Z.W. performed research; T.A.R., X.W., Z.W., A.D.R., and G.P.P. analyzed data; and T.A.R., A.D.R., and G.P.P. wrote the paper.

The authors declare no conflict of interest.

Data deposition: The data reported in this paper have been deposited in the Gene Expression Omnibus (GEO) database, www.ncbi.nlm.nih.gov/geo (accession no. GSE9622).

§To whom correspondence may be addressed. E-mail: [email protected] or [email protected].

This article contains supporting information online at www.pnas.org/cgi/content/full/0710735105/DC1.

© 2007 by The National Academy of Sciences of the USA

252–257 | PNAS | January 8, 2008 | vol. 105 | no. 1    www.pnas.org/cgi/doi/10.1073/pnas.0710735105

Hypomethylation in Tumors at Repetitive DNA Sequences. In addition to hypermethylation, the tiling arrays provided information on the extent and sequence specificity of DNA hypomethylation. SINEs and LINEs, together with human endogenous retroviruses (HERVs), make up ~45% of the human genome (30). Transposable elements are highly methylated and mostly silenced in normal cells (25, 31). Although repetitive sequences are not directly represented as probes on the tiling arrays, we obtained information on the methylation status of SINE elements due to hybridization of flanking single copy DNA to adjacent probes after MseI digestion. In the MIRA technique, the highly methylated elements are captured by the MBD2b/MBD3L1 protein complex (29). After comparing the DNA methylation profiles of normal lung tissues and the matched SCC samples, we detected several thousand tumor-associated demethylation events of genomic regions carrying SINE elements (examples are shown in Fig. 2C, Fig. 3, and SI Fig. 6). The methylation status of several arbitrarily chosen SINE elements was verified by bisulfite sequencing and COBRA assays. Primers for bisulfite sequencing were complementary to the flanking unique sequences, and the sequencing data reflect the methylation status of the repetitive element itself. The sequencing data confirmed the MIRA-assisted tiling array methylation profiles for SINE elements and their extensive hypomethylation in tumors. The cancer-specific hypomethylation of SINE elements was not well conserved between individual tumors; this reflects a degree of randomness for targeting individual SINE sequences for demethylation in cancer.

Next, we surveyed all of the CpG islands on chromosome 8p in tumor SCC2 and its corresponding normal tissue. As expected, ~98% (159/162) of the promoter-associated CpG islands were unmethylated in normal lung. In addition, there were 78 unmethylated intragenic and intergenic CpG islands. Further, we found 159 mostly short (<0.6 kb) methylated CpG islands in normal lung. Sixty-four of these methylated CpG islands were intragenic, and they generally did not become hypomethylated in the tumor. However, the majority of the methylated islands (a total of 95) were located between 0 and 2 Mb away from the chromosome end, overlapping the subtelomeric region, and these were not associated with a known gene. Almost all of the methylated subtelomeric CpG islands were composed of short direct or indirect repeat sequences. Fifty-four of the 95 subtelomeric methylated islands underwent demethylation in the tumor. Their demethylation is consistent with a specific defect of repetitive DNA methylation in cancer tissue. The repeat-rich subtelomeric region of chromosome 8, even outside of CpG islands, was substantially hypomethylated in the tumor (example shown in SI Fig. 7). Importantly, however, nonsubtelomeric single-sequence genes and intergenic regions were not demethylated in tumors. Within 157 Mb of DNA sequence analyzed, we could detect only one unique-sequence CpG-rich sequence that was cancer-specifically demethylated. This hypomethylated sequence is located at the 3′ end of an uncharacterized gene, C8orf72 (SI Fig. 8).
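The CpG island counts reported for chromosome 8p can be cross-checked with a few lines of arithmetic. The sketch below only restates the numbers given in the text; the variable names and the `percent` helper are illustrative choices.

```python
# Counts for chromosome 8p in normal lung and tumor SCC2, as given in the text.
promoter_islands = 162
promoter_unmethylated = 159
methylated_in_normal = 159
methylated_intragenic = 64
methylated_subtelomeric = 95
subtelomeric_demethylated = 54


def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# The intragenic and subtelomeric groups together account for all
# methylated islands found in normal lung.
groups_add_up = (
    methylated_intragenic + methylated_subtelomeric == methylated_in_normal
)

promoter_unmethylated_pct = percent(promoter_unmethylated, promoter_islands)
subtelomeric_demethylated_pct = percent(
    subtelomeric_demethylated, methylated_subtelomeric
)
```

With these numbers, about 98.1% of promoter-associated islands are unmethylated in normal lung, and about 56.8% of the methylated subtelomeric islands lose methylation in the tumor.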

A particularly interesting example is the UNC5D gene, because cancer-specific hyper- and hypomethylation events occurred in the same gene. Its promoter was hypermethylated, whereas SINE sequences downstream in the intragenic region were all hypomethylated (Fig. 3). The UNC5D gene is frequently deleted in gastric cancer (32), suggesting a possible link between SINE-specific hypomethylation and chromosomal instability leading to loss of heterozygosity in this region. However, hypermethylation of a promoter was not always associated with demethylation of multiple elements in the gene body. The promoter-methylated ZNF703 and FGF17 genes, for example, are only 3.0 and 8.5 kb in length, respectively, and contain either no or just a single copy of intragenic SINE elements.

To get a more complete picture of the DNA methylation changes in other repetitive sequences, we next extended our analysis to LINE- and HERV-containing loci. LINEs, multiple-copy SINEs, and HERV-containing regions were not adequately covered by the microarray analysis because of their size and a lack of specific probes on the arrays. Therefore, we used a modified COBRA method (33) to explore methylation changes in LINE and HERV elements. Although this approach, unlike the MIRA-assisted tiling arrays, cannot provide information on a defined region, it can give an estimate for the global changes in methylation status of these elements. We analyzed 20 normal lung tissue and matching SCC samples (SI Fig. 9). We observed strong hypomethylation of LINEs in SCC samples. HERV promoter demethylation was not as pronounced as LINE demethylation but was still significant.

Another class of repeat sequences are segmental duplications that can be several kilobases in size. Chromosome 8p23 contains an area of a direct genomic duplication (30.5 kb direct repeat) that is also found on several other chromosomes. Although it is not possible to tell which particular chromosomal segment hybridizes to the probes on the array, it is clear that these duplicated sequences underwent extensive demethylation in the tumor sample (SI Fig. 10).

Fig. 1. Conservation of global chromosomal DNA methylation patterns. MIRA-assisted microarrays were conducted for pairs of normal lung tissue (N, green) and corresponding SCCs (T, red). The profile of methylated DNA sequences is displayed as the ratio of MIRA-enriched DNA signal versus input DNA signal. The top two tumor samples are stage I SCCs, the third sample is stage II, and the bottom sample is a stage III tumor. Representative tiling array data are shown demonstrating the general conservation of chromosomal methylation profiles between different individuals and, at this level of resolution, also between normal and tumor tissue. A segment of the short arm of chromosome 7 is shown.

A Complete Set of Hypermethylated CpG Islands and Discovery of Highly Sensitive and Specific Lung Cancer DNA Methylation Biomarkers. In addition to the chromosome tiling arrays, we used Agilent CpG island arrays, which cover most of the CpG islands in the human genome, to comprehensively analyze CpG island methylation. Five stage I SCCs were initially analyzed on these arrays. The number of methylated CpG islands ranged from 216 to 848 in the five individual tumors (Table 1). All methylated CpG islands are listed in SI Table 4. We identified 57 CpG islands that were methylated in five of five SCC tumors (SI Table 5). A large fraction of these methylated CpG islands were mapped to homeobox genes. The CpG island sequences and flanking 1-kb regions of the 15 most frequently methylated genes were analyzed for potential consensus DNA sequences, but we could not identify any significant consensus motifs. Because the most frequently methylated loci had excellent potential to be specific and sensitive methylation biomarkers, we analyzed 12 of these markers in a larger series of 20 SCCs (Fig. 4). The methylation frequency ranged from 14 of 20 (70%) to 20 of 20 (100%) of the tumors (Table 2). The OTX1- and NR2E1-associated CpG islands were methylated in all of the SCC tumors tested (100%). Several of these markers were highly specific for tumor-associated methylation, i.e., little or no methylation was observed in tumor-adjacent normal lung tissue. These markers included the CpG islands of the OTX1, BARHL2, MEIS1, OC2, PAX6, IRX2, TFAP2A, and EVX2 genes (Fig. 4). Most of these commonly methylated CpG islands were not methylated to a significant extent in noncancerous lung or leukocyte DNA (SI Fig. 11). The specificity and high methylation frequency of these genes make them excellent candidates for future diagnostic applications developed for early detection of lung cancer.

Discussion

We have used the MIRA method in combination with genome tiling arrays for a comprehensive analysis of DNA methylation patterns in lung cancer genomes. The advantages of MIRA over antibody-based precipitation methods are its higher sensitivity (unpublished data) and high affinity to double-stranded methylated CpG targets. Its preference for binding to at least two methylated CpGs within ≤50 base pairs prevents overestimation of methylated DNA molecules. For the NimbleGen tiling arrays, hypermethylation of CpG islands could easily be detected and verified by bisulfite sequencing when 50–60% of the CpG in a particular region were methylated (Figs. 2 and 3). Given the intensity of the peaks in the tumors, it is likely that much lower methylation frequencies (20%) can readily be detected, but we have not found such areas. For hypomethylation on tiling arrays, we can easily pick up demethylation of 60–90% of a region's CpG sites (examples in Figs. 2, 3, and SI Fig. 8). The peak intensity differences suggest that we should be able to detect more moderate differences, probably ~30% hypomethylation, but it is unlikely that the method would readily detect small amounts of hypomethylation, i.e., 5–25%, because such a region still is highly methylated and will be pulled down in the MIRA procedure. For the Agilent CpG island arrays, we were able to generally confirm hypermethylated CpG islands in tumors by bisulfite-based approaches when the fold-difference factor was >2 for multiple closely spaced probes. We also confirmed known SCC hypermethylated CpG islands such as CDKN2 and TCF21. Some methylated genes, however, may go undetected with this approach because the associated MseI fragments may be too large and may fail to amplify.

Fig. 2. DNA methylation patterns along the short arm of chromosome 8. (A) Schematic display of a chromosome 8 region and the methylation signals (MIRA-enriched DNA versus input DNA) in normal lung tissue and a matching stage I SCC tumor (tumor 2). (B) Examples of hypermethylated CpG islands. The methylation array profiles are shown on the top, and the confirmation of tumor-associated methylation by bisulfite sequencing is shown near the bottom. Open circles, unmethylated CpG sites; closed circles, methylated CpG sites. The black bars indicate the regions analyzed by bisulfite sequencing. (C) Examples of hypomethylated SINE sequences in lung tumors. The array signals are shown on the top. Blue bars denote the location of SINE elements. Confirmation of the SINE methylation status by bisulfite sequencing is shown at the bottom.

One of the most significant findings of this study is that cancer-specifically demethylated chromosomal regions were almost exclusively mapped to repetitive DNA-containing sequences (LINE, SINE, LTR elements, and segmental duplications) and subtelomeric regions. Subtelomeric DNA is composed of subtelomeric repeat sequences and segmental duplications and is densely methylated (34, 35). The terminal 2 Mb of chromosome 8p were substantially hypomethylated in the tumors. However, our analysis of chromosomes 7 and 8 suggests that demethylation of single copy, nonsubtelomeric DNA sequences is a very rare event. Where demethylation of single copy genes does occur, it may be mechanistically connected to demethylation of nearby repetitive DNA (although this formally remains to be tested). The limitations of the study are the number of samples and chromosomes analyzed. For example, there may be other single copy genes subject to hypomethylation in tumors, such as the MAGE genes (located on chromosomes 15 and X) (36) and the maspin gene (located on chromosome 18) (37, 38).

The specificity of hypomethylation for repetitive elements is consistent with a defect of a molecular mechanism that maintains methylation of repetitive DNA. The nature of this defect is currently unknown. One possibility is that repetitive DNA is actively demethylated in cancer cells, perhaps through reactivation of a DNA demethylase gene that normally is expressed only in early development. To date, the nature of the mammalian DNA demethylase has remained obscure (39). Alternatively, the maintenance process of methylation of repetitive DNA may be defective in cancer cells. The DNA methyltransferase DNMT1 is responsible for the reliable copying of existing DNA methylation patterns, but evidence for a role of altered DNMT1 function in cancer-associated DNA hypomethylation is not available. DNMT3A and DNMT3B are de novo DNA methyltransferases. DNMT3L, devoid of methyltransferase activity by itself, is capable of stimulating the activity of DNMT3A and DNMT3B (40). DNMT3L was shown to positively regulate DNA methylation at imprinted sequences and at repeat sequences in mouse germ cells (41). However, the role of these proteins in maintaining methylation of repeat sequences in somatic cells is not clear. Instead of invoking a defect in DNA methyltransferases themselves, another possibility is that their access to repetitive DNA in cancer cells may be impeded. Defects in two chromatin-associated DNA helicases, LSH and ATRX, have been associated with DNA hypomethylation in gene-targeted mice (42, 43). Although it was initially thought that LSH is required for total genomic DNA methylation (43), this defect may be most important for repetitive DNA (44). Deficiencies in either ATRX or LSH (also known as PASG, SMARCA6, or HELLS) in human tumors have been reported only rarely (45, 46). A double-stranded RNA-based mechanism that guides methylation of repetitive DNA through heterochromatin formation is also a formal possibility that warrants further investigation.

Fig. 3. Promoter hypermethylation and intragenic SINE hypomethylation in the UNC5D gene. (A) This gene on chromosome 8 shows hypermethylation of the promoter-associated CpG island (blue circles) and hypomethylation of multiple intragenic SINE elements (red circles). (B) Bisulfite sequencing confirms the methylation status of the promoter and its proximal SINE element. The purple bars indicate the regions analyzed by bisulfite sequencing. Black boxes indicate exons, and the arrow shows the transcription start site.

Table 1. Number of methylated CpG islands in stage I lung SCC

Sample    Methylated CpG islands
SCC1      632
SCC2      248
SCC3      848
SCC4      743
SCC5      216

The second important aspect of this paper is the comprehensiveanalysis of the CpG islands in human lung cancer by using microar-rays. We were able to measure the methylation levels at �27,000CpG islands and found that between 216 and 848 of these islands

were methylated in our lung SCC samples. These numbers arecompatible with earlier estimates derived from analysis of a subsetof CpG islands methylated in cancer (47). We found that CpGislands with different CpG densities can become hypermethylatedin tumors (SI Table 3). It is clear that not all of these methylatedgenes can be tumor suppressor genes. For example, consistent withearlier observations, a substantial subset of the methylated genes(20–40%, depending on the tumor) were homeobox genes (48).Homeobox gene-associated CpG islands were among the best stageI disease DNA methylation markers identified in this study. Wefound that the CpG islands of the OTX1, PAX6, IRX2, OC2,TFAP2A, and EVX2 genes are tumor-specifically methylated withvery little methylation found in normal lung tissue or in blood DNA.Methylation of the OTX1, IRX2, OC2, and EVX2 genes has not yetbeen reported in human cancers. Also, importantly, the methylationfrequency of these markers (80–100% of the tumors were meth-ylated for 11 of 12 markers tested; 70% for OC2; Table 2) is muchhigher than methylation frequencies of other lung cancer DNAmethylation markers reported previously. For example, we find that

Fig. 4. Frequently methylated CpG islands in lungSCCs. Verification of CpG island DNA methylationmarkers in normal lung tissue and matching stage I SCCsamples was conducted by bisulfite-based COBRA as-says. Methylation differences between SCCs (T) andmatching normal tissues (N) were analyzed by COBRAassays of the indicated gene targets. �, control diges-tion with no BstUI; �, BstUI-digested samples. Diges-tion by BstUI indicates methylation of the sequencethat was tested.

Table 2. Frequency of methylation of 12 DNA methylation biomarkers in 20 lung SCCs

SCC No. Stage MSX1 OTX1 BARHL2 PAX6 MEIS1 OC2 TFAP2A OSR1 ZNF577 EVX2 IRX2 NR2E1

1 I � � � � � � � � � � � �

2 I � � � � � � � � � � � �

3 I � � � � � � � � � � � �

4 I � � � � � � � � � � � �

5 I � � � � � � � � � � � �

6 I � � � � � � � � � � � �

7 I � � � � � � � � � � � �

8 I � � � � � � � � � � � �

9 I � � � � � � � � � � � �

10 I � � � � � � � � � � � �

11 I � � � � � � � � � � � �

12 II � � � � � � � � � � � �

13 II � � � � � � � � � � � �

14 II � � � � � � � � � � � �

15 II � � � � � � � � � � � �

16 II � � � � � � � � � � � �

17 III � � � � � � � � � � � �

18 III � � � � � � � � � � � �

19 III � � � � � � � � � � � �

20 III � � � � � � � � � � � �

Frequency 19/20 20/20 17/20 17/20 17/20 14/20 19/20 20/20 18/20 16/20 19/20 20/20

+, methylated CpG island; −, unmethylated CpG island as determined by COBRA assay.

256 | www.pnas.org/cgi/doi/10.1073/pnas.0710735105 Rauch et al.

OTX1 was tumor-specifically methylated in 20 of 20 tumors (100%). Such markers are excellent candidates for clinical or diagnostic applications aimed at either detection of early disease in body fluids such as blood or sputum or at disease management and follow-up by using molecular diagnostic testing.
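The frequencies quoted above follow directly from the Frequency row of Table 2. The short sketch below recomputes them as percentages and counts how many markers reach the 80% level; the counts are copied from Table 2, while the variable and function names are illustrative and not part of the study.

```python
# Methylated-tumor counts per marker, copied from the Frequency row of Table 2.
N_TUMORS = 20
METHYLATED_COUNTS = {
    "MSX1": 19, "OTX1": 20, "BARHL2": 17, "PAX6": 17, "MEIS1": 17, "OC2": 14,
    "TFAP2A": 19, "OSR1": 20, "ZNF577": 18, "EVX2": 16, "IRX2": 19, "NR2E1": 20,
}


def methylation_percentages(counts, n_tumors):
    """Return {marker: percentage of tumors in which the marker was methylated}."""
    return {marker: 100.0 * k / n_tumors for marker, k in counts.items()}


percentages = methylation_percentages(METHYLATED_COUNTS, N_TUMORS)

# Markers at or above the 80% level quoted in the text.
high_frequency = [m for m, pct in percentages.items() if pct >= 80.0]

if __name__ == "__main__":
    for marker, pct in percentages.items():
        print(f"{marker:7s} {pct:5.1f}%")
    print(len(high_frequency), "of", len(percentages), "markers at >= 80%")
```

Under these counts, OC2 is the only marker that falls below 80%, in agreement with the "11 of 12 markers" statement in the text.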

Materials and Methods

Genomic DNA Preparation. Lung SCC samples and matching normal tissues removed with surgery were obtained from the frozen tumor bank of the City of Hope National Medical Center. Genomic DNA was purified from tissues by standard procedures by using phenol-chloroform extraction and ethanol precipitation.

MIRA-Assisted Tiling Array Analysis. MIRA was performed as described in ref. 48. NimbleGen whole-chromosomal tiling arrays covering the entire short arms of human chromosomes 7 (HG18Tiling_Set17) and 8 (HG18Tiling_Set19) were used in the DNA methylation profile analysis. MIRA-enriched DNA fractions were compared with input DNA. The labeling of dsDNA, microarray hybridization, and scanning were performed by the NimbleGen Service Group (Reykjavik, Iceland). Data were extracted from scanned images by using NimbleScan 2.3 extraction software (NimbleGen Systems). Human CpG island microarrays, which contain 237,000 oligonucleotide probes covering 27,800 CpG islands, were purchased from Agilent Technologies. Two micrograms each of the amplicons from MIRA-enriched tumor DNA and normal control samples were labeled with a BioPrime Array CGH Genomic Labeling kit (Invitrogen) with either Cy5-dCTP (tumor) or Cy3-dCTP (control) in 87.5-μl reactions (both Cy3- and Cy5-dCTP were obtained from GE Healthcare). The purified labeled samples were then mixed, and microarray hybridization was performed according to the Agilent ChIP-on-chip protocol (v.9.0). The hybridized arrays were scanned on an Axon 4000B microarray scanner, and the images were analyzed with Axon GenePix software v.5.1. Image and data analyses were performed as described in ref. 29. Individual CpG islands were considered methylation-positive when at least two adjacent probes (allowing a one-probe gap) within the CpG island scored a fold-difference factor of >2.0 when comparing tumor and normal tissue DNA.
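The island-calling rule in the last sentence can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: it assumes that each CpG island is represented by its probes' tumor/normal fold differences in genomic order, that "allowing a one-probe gap" means two scoring probes may be separated by at most one non-scoring probe, and that the 2.0 cutoff is strict. The function name and parameters are hypothetical.

```python
from typing import Sequence


def island_is_methylation_positive(
    fold_differences: Sequence[float],
    threshold: float = 2.0,
    max_gap: int = 1,
) -> bool:
    """Call an island positive if two scoring probes lie within one probe gap.

    fold_differences: tumor/normal fold difference per probe, in genomic order.
    threshold: probes must exceed this value to score (strictness is assumed).
    max_gap: number of non-scoring probes tolerated between two scoring probes.
    """
    # Indices of probes whose fold difference exceeds the cutoff.
    hits = [i for i, fd in enumerate(fold_differences) if fd > threshold]
    # Two consecutive hits are close enough when at most max_gap probes separate them.
    return any(b - a <= max_gap + 1 for a, b in zip(hits, hits[1:]))


if __name__ == "__main__":
    print(island_is_methylation_positive([2.5, 3.0, 1.0]))       # adjacent hits
    print(island_is_methylation_positive([2.5, 1.0, 3.0]))       # one-probe gap
    print(island_is_methylation_positive([2.5, 1.0, 1.0, 3.0]))  # gap too wide
```

Requiring two nearby probes rather than one guards against calling an island on the basis of a single noisy probe.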

DNA Methylation Analysis Using COBRA and Bisulfite Sequencing. The COBRA assays were performed according to the method of Xiong and Laird (49) by using digestion with BstUI for analysis of single-copy genes, HinfI for analysis of LINE elements, or TaqI for analysis of HERV sequences. DNA was treated and purified with the EpiTect bisulfite kit (Qiagen). The PCR primers used to amplify HERV sequences after bisulfite conversion were 5′-TTTATAGGTGTGTAGGGGTAATTTATTTT and 5′-AATAAAAAACATATTTACTTTTAATTTTAC. LINE elements were analyzed by using consensus primers as described by Yang et al. (33). Other PCR primers used for amplification of specific targets in bisulfite-treated DNA are available upon request. For sequence analysis, the PCR products obtained after bisulfite conversion were cloned into the pDrive PCR cloning vector (Qiagen), and five individual clones were sequenced.
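Why BstUI digestion reports methylation can be illustrated with a small simulation. The sketch assumes the standard behavior of bisulfite treatment followed by PCR (unmethylated cytosines read as thymine, methylated cytosines remain cytosine) and the BstUI recognition sequence CGCG. The amplicon sequence and all function names are hypothetical and are not taken from the study.

```python
from typing import Iterable

BSTUI_SITE = "CGCG"  # BstUI recognition sequence


def bisulfite_convert(seq: str, methylated: Iterable[int] = ()) -> str:
    """Return the top-strand sequence as read after bisulfite treatment and PCR.

    Cytosines at the 0-based positions in `methylated` are protected and stay C;
    every other cytosine is read as T.
    """
    protected = set(methylated)
    return "".join(
        "T" if base == "C" and i not in protected else base
        for i, base in enumerate(seq.upper())
    )


def bstui_digests(converted: str) -> bool:
    """True if the converted amplicon still contains a BstUI site."""
    return BSTUI_SITE in converted


# Hypothetical amplicon fragment with cytosines at positions 2 and 4.
AMPLICON = "ATCGCGTA"
fully_methylated = bisulfite_convert(AMPLICON, methylated=[2, 4])
partially_methylated = bisulfite_convert(AMPLICON, methylated=[2])
unmethylated = bisulfite_convert(AMPLICON)

if __name__ == "__main__":
    for label, converted in [
        ("fully methylated", fully_methylated),
        ("partially methylated", partially_methylated),
        ("unmethylated", unmethylated),
    ]:
        print(label, converted, "cut" if bstui_digests(converted) else "uncut")
```

In this model the site survives conversion only when both CpGs within it are methylated, so a cut product indicates methylation at the tested sequence.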

ACKNOWLEDGMENTS. This work was supported by National Institutes of Health Grants CA104967 and CA128495 (to G.P.P.).

1. Jones PA, Baylin SB (2007) The epigenomics of cancer. Cell 128:683–692.

2. Lapeyre JN, Becker FF (1979) 5-Methylcytosine content of nuclear DNA during chemical hepatocarcinogenesis and in carcinomas which result. Biochem Biophys Res Commun 87:698–705.

3. Romanov GA, Vanyushin BF (1981) Methylation of reiterated sequences in mammalian DNAs. Effects of the tissue type, age, malignancy and hormonal induction. Biochim Biophys Acta 653:204–218.

4. Feinberg AP, Vogelstein B (1983) Hypomethylation distinguishes genes of some human cancers from their normal counterparts. Nature 301:89–92.

5. Gama-Sosa MA, et al. (1983) Tissue-specific differences in DNA methylation in various mammals. Biochim Biophys Acta 740:212–219.

6. Riggs AD, Jones PA (1983) 5-Methylcytosine, gene regulation, and cancer. Adv Cancer Res 40:1–30.

7. Gaudet F, et al. (2003) Induction of tumors in mice by genomic hypomethylation. Science 300:489–492.

8. Gonzalez-Zulueta M, et al. (1995) Methylation of the 5′ CpG island of the p16/CDKN2 tumor suppressor gene in normal and transformed human tissues correlates with gene silencing. Cancer Res 55:4531–4535.

9. Herman JG, et al. (1995) Inactivation of the CDKN2/p16/MTS1 gene is frequently associated with aberrant DNA methylation in all common human cancers. Cancer Res 55:4525–4530.

10. Merlo A, et al. (1995) 5′ CpG island methylation is associated with transcriptional silencing of the tumour suppressor p16/CDKN2/MTS1 in human cancers. Nat Med 1:686–692.

11. Kane MF, et al. (1997) Methylation of the hMLH1 promoter correlates with lack of expression of hMLH1 in sporadic colon tumors and mismatch repair-defective human tumor cell lines. Cancer Res 57:808–811.

12. Esteller M, Corn PG, Baylin SB, Herman JG (2001) A gene hypermethylation profile of human cancer. Cancer Res 61:3225–3229.

13. Dammann R, et al. (2000) Epigenetic inactivation of a RAS association domain family protein from the lung tumour suppressor locus 3p21.3. Nat Genet 25:315–319.

14. Zochbauer-Muller S, et al. (2001) Aberrant promoter methylation of multiple genes in non-small cell lung cancers. Cancer Res 61:249–255.

15. Yanagawa N, et al. (2003) Promoter hypermethylation of tumor suppressor and tumor-related genes in non-small cell lung cancers. Cancer Sci 94:589–592.

16. Topaloglu O, et al. (2004) Detection of promoter hypermethylation of multiple genes in the tumor and bronchoalveolar lavage of patients with lung cancer. Clin Cancer Res 10:2284–2288.

17. Dammann R, et al. (2005) CpG island methylation and expression of tumour-associated genes in lung carcinoma. Eur J Cancer 41:1223–1236.

18. Laird PW (2003) The power and the promise of DNA methylation markers. Nat Rev Cancer 3:253–266.

19. Belinsky SA (2004) Gene-promoter hypermethylation as a biomarker in lung cancer. Nat Rev Cancer 4:707–717.

20. Ushijima T (2005) Detection and interpretation of altered methylation patterns in cancer cells. Nat Rev Cancer 5:223–231.

21. Ehrlich M (2002) DNA methylation in cancer: too much, but also too little. Oncogene 21:5400–5413.

22. Weisenberger DJ, et al. (2005) Analysis of repetitive element DNA methylation by MethyLight. Nucleic Acids Res 33:6823–6836.

23. Cadieux B, Ching TT, Vandenberg SR, Costello JF (2006) Genome-wide hypomethylation in human glioblastomas associated with specific copy number alteration, methylenetetrahydrofolate reductase allele status, and increased proliferation. Cancer Res 66:8469–8476.

24. Rodriguez J, et al. (2006) Chromosomal instability correlates with genome-wide DNA demethylation in human primary colorectal cancers. Cancer Res 66:8462–9468.

25. Estecio MR, et al. (2007) LINE-1 hypomethylation in cancer is highly variable and inversely correlated with microsatellite instability. PLoS ONE 2:e399.

26. Wilson AS, Power BE, Molloy PL (2007) DNA hypomethylation and human diseases. Biochim Biophys Acta 1775:138–162.

27. Esteller M (2007) Cancer epigenomics: DNA methylomes and histone-modification maps. Nat Rev Genet 8:286–298.

28. Rauch T, Pfeifer GP (2005) Methylated-CpG island recovery assay: a new technique for the rapid detection of methylated-CpG islands in cancer. Lab Invest 85:1172–1180.

29. Rauch T, Li H, Wu X, Pfeifer GP (2006) MIRA-assisted microarray analysis, a new technology for the determination of DNA methylation patterns, identifies frequent methylation of homeodomain-containing genes in lung cancer cells. Cancer Res 66:7939–7947.

30. Lander ES, et al. (2001) Initial sequencing and analysis of the human genome. Nature 409:860–921.

31. Yoder JA, Walsh CP, Bestor TH (1997) Cytosine methylation and the ecology of intragenomic parasites. Trends Genet 13:335–340.

32. Koed K, et al. (2005) High-density single nucleotide polymorphism array defines novel stage and location-dependent allelic imbalances in human bladder tumors. Cancer Res 65:34–45.

33. Yang AS, et al. (2004) A simple method for estimating global DNA methylation using bisulfite PCR of repetitive DNA elements. Nucleic Acids Res 32:e38.

34. Brock GJ, Charlton J, Bird A (1999) Densely methylated sequences that are preferentially localized at telomere-proximal regions of human chromosomes. Gene 240:269–277.

35. Blasco MA (2007) The epigenetic regulation of mammalian telomeres. Nat Rev Genet 8:299–309.

36. De Smet C, Loriot A, Boon T (2004) Promoter-dependent mechanism leading to selective hypomethylation within the 5′ region of gene MAGE-A1 in tumor cells. Mol Cell Biol 24:4781–4790.

37. Ogasawara S, et al. (2004) Disruption of cell-type-specific methylation at the Maspin gene promoter is frequently involved in undifferentiated thyroid cancers. Oncogene 23:1117–1124.

38. Sato N, Fukushima N, Matsubayashi H, Goggins M (2004) Identification of maspin and S100P as novel hypomethylation targets in pancreatic cancer using global gene expression profiling. Oncogene 23:1531–1538.

39. Chen ZX, Riggs AD (2005) Maintenance and regulation of DNA methylation patterns in mammals. Biochem Cell Biol 83:438–448.

40. Chen ZX, Mann JR, Hsieh CL, Riggs AD, Chedin F (2005) Physical and functional interactions between the human DNMT3L protein and members of the de novo methyltransferase family. J Cell Biochem 95:902–917.

41. Bourc’his D, Bestor TH (2004) Meiotic catastrophe and retrotransposon reactivation in male germ cells lacking Dnmt3L. Nature 431:96–99.

42. Gibbons RJ, et al. (2000) Mutations in ATRX, encoding a SWI/SNF-like protein, cause diverse changes in the pattern of DNA methylation. Nat Genet 24:368–371.

43. Dennis K, Fan T, Geiman T, Yan Q, Muegge K (2001) Lsh, a member of the SNF2 family, is required for genome-wide methylation. Genes Dev 15:2940–2944.

44. Huang J, et al. (2004) Lsh, an epigenetic guardian of repetitive elements. Nucleic Acids Res 32:5019–5028.

45. Lee DW, et al. (2000) Proliferation-associated SNF2-like gene (PASG): a SNF2 family member altered in leukemia. Cancer Res 60:3612–3622.

46. Yano M, et al. (2004) Tumor-specific exon creation of the HELLS/SMARCA6 gene in non-small cell lung cancer. Int J Cancer 112:8–13.

47. Costello JF, et al. (2000) Aberrant CpG-island methylation has non-random and tumour-type-specific patterns. Nat Genet 24:132–138.

48. Rauch T, et al. (2007) Homeobox gene methylation in lung cancer studied by genome-wide analysis with a microarray-based methylated CpG island recovery assay. Proc Natl Acad Sci USA 104:5527–5532.

49. Xiong Z, Laird PW (1997) COBRA: a sensitive and quantitative DNA methylation assay. Nucleic Acids Res 25:2532–2534.

Rauch et al. PNAS | January 8, 2008 | vol. 105 | no. 1 | 257
