Cloning ELL, MLLin t(11;19)(q23;pl3.1) - PNAS · (15). Thet(11;19)(q23;p13.3) breakpointis commonin...

5
Proc. Natl. Acad. Sci. USA Vol. 91, pp. 12110-12114, December 1994 Medical Sciences Cloning of ELL, a gene that fuses to MLL in a t(11;19)(q23;pl3.1) in acute myeloid leukemia MICHAEL J. THIRMAN*, DENISE A. LEVITAN*, HIRo KOBAYASHI*, M. CELESTE SIMON*tt, AND JANET D. ROWLEY*t Departments of *Medicine, Section of Hematology/Oncology, and tMolecular Genetics and Cell Biology, tThe Howard Hughes Medical Institute, University of Chicago Medical Center, Chicago, IL 60637 Contributed by Janet D. Rowley, August 24, 1994 ABSTRACT To characterize the functions of MLL fusion transcripts, we cloned the gene that fuses to MLL in the translocation t(11;19)(q23;p13.1). This translocation is distinct from another type of 11;19 translocation with a 19p13.3 breakpoint that results in the fusion of MLL to the ENL gene. By PCR screening of a cDNA library prepared from a patient's leukemia cells with this translocation, we obtained a fusion transcript containing exon 7 of MLL and sequence of an unknown gene. The sequence of this gene was amplified and used as a probe to screen a fetal brain cDNA library. On Northern blot analysis, this cDNA detected a 4.4-kb transcript that was abundant in peripheral blood leukocytes, skeletal muscle, placenta, and testis and expressed at lower levels in spleen, thymus, heart, brain, lung, kidney, liver, and ovary. In addition, a 2.8-kb transcript was present in peripheral blood, testis, and placenta. On "zoo blots," this gene was shown to be evolutionarily conserved in 10 mammalian species as well as in chicken, frog, and fish. We have named this gene ELL (for eleven-nineteen lysine-rich leukemia gene). A highly basic, lysine-rich motif of the predicted ELL protein is homologous to similar regions of several proteins, including the DNA-binding domain of poly(ADP-ribose) polymerase. The characterization of the normal functions of ELL as well as its altered function when fused to MLL will be critical to further our understanding of the mechanisms of leukemogenesis. Aberrations in chromosome 11 band q23 are frequent cyto- genetic abnormalities in hematologic malignancies, espe- cially in acute leukemias. Previously, we identified a yeast artificial chromosome that contained the breakpoint region in leukemias with several common 11q23 translocations (1). Subsequently, we cloned a gene named MLL (for myeloid- lymphoid leukemia or mixed-lineage leukemia) that spans this breakpoint (2). Other groups have also cloned this gene and have called it Htrx, ALL-I, and HRX (3-5). A 0.74-kb BamHI fragment isolated from MLL detects gene rearrange- ments on Southern blot analysis in all patients with the five most common types of 11q23 translocations as well as a large number of other, less common 11q23 abnormalities (6). All breakpoints in MLL occur within an 8.3-kb breakpoint cluster region. The MLL gene encodes an A-T hook domain at the N terminus and several regions of homology to the Drosophila trithorax gene, including a series of zinc fingers in the region immediately 3' to the breakpoint region and 210 C-terminal amino acids that do not contain a known motif. Mapping of functional domains of MLL with yeast GAL4 hybrids in transient transfections has revealed that MLL contains a strong activation domain 3' of the breakpoint and a repres- sion domain 5' of the breakpoint region (7). Six genes at 11q23 partner chromosomal breakpoints have been cloned. These include AF4 in t(4;11)(q21;q23) (4, 8), ENL in t(11;19)(q23;p13.3) (5), AF9 in t(9;11)(p22;q23) (9), AF6 in t(6;11)(q27;q23) (10), AFlp in t(1;11)(p32;q23) (11), and AFX in t(X;11)(ql3;q23) (12). The functions of these fusion partner genes have not been determined. Transloca- tions at 11q23 result in the formation of two derivative chromosomes that encode chimeric transcripts. The der(11) transcript contains 5' MLL sequences fused to 3' sequences of the gene located on the partner chromosome, whereas the other derivative chromosome contains the 5' sequence of the partner gene potentially fused to 3' sequence of MLL. However, in 25% of patients, translocations are associated with deletions of MLL sequence that is 3' to the breakpoint (13). Thus, in these cases, a fusion transcript from the other derivative chromosome cannot be formed. In addition, anal- ysis of complex 11q23 translocations reveals that the der(11) junction is always conserved (14). These data indicate that the fusion transcript encoded by the der(11) must be critical to leukemogenesis. Although earlier cytogenetic studies did not discriminate between different breakpoints within band 19p13 in cases with a t(11;19)(q23;p13), it has become clear with more precise cytogenetic techniques that two breakpoints exist within band 19p13 (Fig. 1). These two breakpoints are easily distinguishable by fluorescence in situ hybridization (FISH) (15). The t(11;19)(q23;p13.3) breakpoint is common in acute lymphoblastic leukemia, primarily in infants and children. The translocation t(11;19)(q23;p13.1) is a recurring abnor- mality in both de novo and therapy-related acute myeloid leukemia and it has been observed in infants, children, and adults. The gene in t(11;19)(q23;p13.3) has already been cloned and named ENL (for eleven-nineteen leukemia). To characterize the functions of MLL fusion transcripts further, we have cloned the gene that fuses to MLL in t(11;19)(q23;p13.1). The open reading frame of this gene codes for a hydrophilic, highly basic protein. We have named this gene ELL (for eleven-nineteen lysine-rich leukemia gene).§ The ELL gene has a broad tissue distribution and is conserved in all mammalian species as well as in chicken, fish, and frog. MATERIALS AND METHODS Cytogenetic Analysis and FISH. Cytogenetic analysis was performed with a trypsin-Giemsa banding technique. Chro- mosomal abnormalities are described according to the Inter- national System for Human Cytogenetic Nomenclature. FISH was performed with biotin-labeled probes (16). Construction and Screening of cDNA Libraries. Poly(A)+ RNA was isolated with the FastTrack kit (Invitrogen) from cryopreserved bone marrow from a 41-year-old woman with Abbreviations: FISH, fluorescence in situ hybridization; 5' RACE, rapid amplification of cDNA 5' ends. §The sequence reported in this paper has been reported in the GenBank data base (accession no. U16282). 12110 The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact. Downloaded by guest on September 15, 2020

Transcript of Cloning ELL, MLLin t(11;19)(q23;pl3.1) - PNAS · (15). Thet(11;19)(q23;p13.3) breakpointis commonin...

Page 1: Cloning ELL, MLLin t(11;19)(q23;pl3.1) - PNAS · (15). Thet(11;19)(q23;p13.3) breakpointis commonin acute lymphoblastic leukemia, primarily in infants and children. The translocation

Proc. Natl. Acad. Sci. USAVol. 91, pp. 12110-12114, December 1994Medical Sciences

Cloning of ELL, a gene that fuses to MLL in a t(11;19)(q23;pl3.1) inacute myeloid leukemiaMICHAEL J. THIRMAN*, DENISE A. LEVITAN*, HIRo KOBAYASHI*, M. CELESTE SIMON*tt,AND JANET D. ROWLEY*tDepartments of *Medicine, Section of Hematology/Oncology, and tMolecular Genetics and Cell Biology, tThe Howard Hughes Medical Institute, Universityof Chicago Medical Center, Chicago, IL 60637

Contributed by Janet D. Rowley, August 24, 1994

ABSTRACT To characterize the functions of MLL fusiontranscripts, we cloned the gene that fuses to MLL in thetranslocation t(11;19)(q23;p13.1). This translocation is distinctfrom another type of 11;19 translocation with a 19p13.3breakpoint that results in the fusion ofMLL to the ENL gene.By PCR screening of a cDNA library prepared from a patient'sleukemia cells with this translocation, we obtained a fusiontranscript containing exon 7 of MLL and sequence of anunknown gene. The sequence of this gene was amplified andused as a probe to screen a fetal brain cDNA library. OnNorthern blot analysis, this cDNA detected a 4.4-kb transcriptthat was abundant in peripheral blood leukocytes, skeletalmuscle, placenta, and testis and expressed at lower levels inspleen, thymus, heart, brain, lung, kidney, liver, and ovary. Inaddition, a 2.8-kb transcript was present in peripheral blood,testis, and placenta. On "zoo blots," this gene was shown to beevolutionarily conserved in 10 mammalian species as well as inchicken, frog, and fish. We have named this gene ELL (foreleven-nineteen lysine-rich leukemia gene). A highly basic,lysine-rich motif of the predicted ELL protein is homologous tosimilar regions of several proteins, including the DNA-bindingdomain of poly(ADP-ribose) polymerase. The characterizationof the normal functions of ELL as well as its altered functionwhen fused toMLL will be critical to further our understandingof the mechanisms of leukemogenesis.

Aberrations in chromosome 11 band q23 are frequent cyto-genetic abnormalities in hematologic malignancies, espe-cially in acute leukemias. Previously, we identified a yeastartificial chromosome that contained the breakpoint region inleukemias with several common 11q23 translocations (1).Subsequently, we cloned a gene named MLL (for myeloid-lymphoid leukemia or mixed-lineage leukemia) that spansthis breakpoint (2). Other groups have also cloned this geneand have called it Htrx, ALL-I, and HRX (3-5). A 0.74-kbBamHI fragment isolated from MLL detects gene rearrange-ments on Southern blot analysis in all patients with the fivemost common types of 11q23 translocations as well as a largenumber of other, less common 11q23 abnormalities (6). Allbreakpoints inMLL occur within an 8.3-kb breakpoint clusterregion. The MLL gene encodes an A-T hook domain at the Nterminus and several regions of homology to the Drosophilatrithorax gene, including a series of zinc fingers in the regionimmediately 3' to the breakpoint region and 210 C-terminalamino acids that do not contain a known motif. Mapping offunctional domains of MLL with yeast GAL4 hybrids intransient transfections has revealed that MLL contains astrong activation domain 3' of the breakpoint and a repres-sion domain 5' of the breakpoint region (7).

Six genes at 11q23 partner chromosomal breakpoints havebeen cloned. These include AF4 in t(4;11)(q21;q23) (4, 8),

ENL in t(11;19)(q23;p13.3) (5), AF9 in t(9;11)(p22;q23) (9),AF6 in t(6;11)(q27;q23) (10), AFlp in t(1;11)(p32;q23) (11),and AFX in t(X;11)(ql3;q23) (12). The functions of thesefusion partner genes have not been determined. Transloca-tions at 11q23 result in the formation of two derivativechromosomes that encode chimeric transcripts. The der(11)transcript contains 5' MLL sequences fused to 3' sequencesof the gene located on the partner chromosome, whereas theother derivative chromosome contains the 5' sequence of thepartner gene potentially fused to 3' sequence of MLL.However, in 25% of patients, translocations are associatedwith deletions of MLL sequence that is 3' to the breakpoint(13). Thus, in these cases, a fusion transcript from the otherderivative chromosome cannot be formed. In addition, anal-ysis of complex 11q23 translocations reveals that the der(11)junction is always conserved (14). These data indicate thatthe fusion transcript encoded by the der(11) must be criticalto leukemogenesis.Although earlier cytogenetic studies did not discriminate

between different breakpoints within band 19p13 in caseswith a t(11;19)(q23;p13), it has become clear with moreprecise cytogenetic techniques that two breakpoints existwithin band 19p13 (Fig. 1). These two breakpoints are easilydistinguishable by fluorescence in situ hybridization (FISH)(15). The t(11;19)(q23;p13.3) breakpoint is common in acutelymphoblastic leukemia, primarily in infants and children.The translocation t(11;19)(q23;p13.1) is a recurring abnor-mality in both de novo and therapy-related acute myeloidleukemia and it has been observed in infants, children, andadults. The gene in t(11;19)(q23;p13.3) has already beencloned and named ENL (for eleven-nineteen leukemia).To characterize the functions of MLL fusion transcripts

further, we have cloned the gene that fuses to MLL int(11;19)(q23;p13.1). The open reading frame of this genecodes for a hydrophilic, highly basic protein. We have namedthis gene ELL (for eleven-nineteen lysine-rich leukemiagene).§ The ELL gene has a broad tissue distribution and isconserved in all mammalian species as well as in chicken,fish, and frog.

MATERIALS AND METHODSCytogenetic Analysis and FISH. Cytogenetic analysis was

performed with a trypsin-Giemsa banding technique. Chro-mosomal abnormalities are described according to the Inter-national System for Human Cytogenetic Nomenclature.FISH was performed with biotin-labeled probes (16).

Construction and Screening of cDNA Libraries. Poly(A)+RNA was isolated with the FastTrack kit (Invitrogen) fromcryopreserved bone marrow from a 41-year-old woman with

Abbreviations: FISH, fluorescence in situ hybridization; 5' RACE,rapid amplification of cDNA 5' ends.§The sequence reported in this paper has been reported in theGenBank data base (accession no. U16282).

12110

The publication costs of this article were defrayed in part by page chargepayment. This article must therefore be hereby marked "advertisement"in accordance with 18 U.S.C. §1734 solely to indicate this fact.

Dow

nloa

ded

by g

uest

on

Sep

tem

ber

15, 2

020

Page 2: Cloning ELL, MLLin t(11;19)(q23;pl3.1) - PNAS · (15). Thet(11;19)(q23;p13.3) breakpointis commonin acute lymphoblastic leukemia, primarily in infants and children. The translocation

Proc. Natl. Acad. Sci. USA 91 (1994) 12111

A

:

It.

I

11

,_

19

FIG. 1. Partial karyotype of chromosome pairs 11 and 19 fromtwo cells from patients with a t(11;19) with a 19p13.1 breakpoint (A)and with 19p13.3 breakpoint (B). The normal chromosome is on theleft in each pair. Arrowheads indicate the breakpoints in the trans-location chromosome on the right. Note that there is a much largeramount of pale material at the end of the long arm of the der(11)chromosome inA than in B, and conversely, a much smaller segmentof pale material at the end of the short arm of the der(19) chromo-some. There is an additional normal chromosome 19 in B.

acute myeloid leukemia (FAB-M4) with a karyotype of46,XX,t(11;19)(q23;pl3.1)(100%o). First-strand cDNA syn-thesis was catalyzed with Moloney murine leukemia virusreverse transcriptase using random hexamers (Pharmacia).The second strand was synthesized with RNase H and DNApolymerase I (Pharmacia). EcoRI/Not I adaptors were blunt-end ligated to the cDNA with T4 DNA ligase (TimeSavercDNA library kit; Pharmacia) and then ligated into the EcoRIsite of the AZAP II vector (Stratagene). The library wasinitially screened with the 0.74-kb BamHI cDNA fragment ofMLL. Subsequently, the library was screened by PCR withnested primers from MLL exon 7 and the vector. PCRproducts were ligated into the pT7-Blue T vector (Novagen).A commercially prepared human fetal brain cDNA library(Stratagene) was screened with probes amplified from theMLL/ELL fusion transcript.

Nucleotide Sequencing. cDNA clones were sequenced en-tirely on both strands with Sequenase (United States Bio-chemical) using doubled-stranded templates. The DNA se-quence and the predicted open reading frame were comparedwith GenBank data bases by using the BLAST and TFASTAprograms.

Southern and Northern Blot Analysis. DNA was extractedand digested by standard techniques. RNA was isolated fromcell lines with TriReagent (Molecular Research Center, Cin-cinnati, OH). The multiple-tissue Northern blots (Clontech),the somatic cell hybrid panel (Oncor), and the "zoo blots"(Bios, New Haven, CT) were used according to the manu-facturers' suggestions.Rapid Amplification Of cDNA 5' Ends (5' RACE). First-

strand cDNA was synthesized from 2 ,ug of placentalpoly(A)+ RNA with avian myeloblastosis virus reverse tran-scriptase (Clontech) and an antisense primer 150 bp down-stream of the beginning of the 2.8-kb clone isolated from thefetal brain cDNA library. An anchor oligonucleotide wasligated to the 3' end of the cDNA by T4 RNA ligase(Clontech). After 30 cycles of amplification using the anchorprimer (Clontech) and a primer 130 bp from the beginning ofthe 2.8-kb cDNA clone, a second round of 30 cycles of PCRwas performed with a 17-mer oligonucleotide from the inter-nal portion of the anchor primer and two different internalprimers from the 2.8-kb clone (110 and 90 bp from thebeginning of the clone).

RESULTSCloning of t(11;19)(q23;p13.1) Fusion Transcript. The

0.74-kb BamHI cDNA fragment from the MLL gene detectedtwo rearranged bands in the BamHI digest of the DNA from

the patient with the translocation t(11;19)(q23,p13.1) (Fig.2A). The cDNA library prepared from the patient's leukemiacells was initially screened with the 0.74-kb BamHI cDNAfragment ofMLL used to detect gene rearrangements. Threeclones were isolated from the library but none contained afusion transcript. To facilitate the cloning of this fusiontranscript, we devised a PCR screening method using nestedMLL-specific and vector primers. Detailed mapping of thegenomic breakpoint within MLL in this patient's leukemiarevealed a break between exons 7 and 9. Thus the MLL-specific primers were chosen approximately 50 and 30 bpupstream of the end of exon 7. After two rounds of amplifi-cation, several PCR products were obtained. Sequence anal-ysis revealed normal MLL transcripts as well as a clone witha 270-bp insert that contained 30 bp ofexon 7 and 240 bp thatdiverged from MLL sequences. The 240 bp that divergedfrom MLL were amplified by PCR and subsequently used asa probe to screen a fetal brain cDNA library. Three cloneswere isolated and sequenced. The largest of these clones (2.8kb) contained the entire 240 bp found in the fusion transcript.The 2.2-kb clone contained a similar sequence (16 of 17 bp)to part of the 240-bp probe but was otherwise completelydifferent from the rest of the probe or from any sequences inthe 2.8-kb fragment. The 1.9-kb clone contained 48 bp thatwere present in the 2.8-kb clone and in the 240-bp fragmentbut diverged both 5' and 3' to this sequence. Preliminarysequence analysis revealed Alu sequences within this cloneas well as consensus splice acceptor and splice donor sitesflanking this 48-bp sequence, suggesting that this 1.9-kb clonerepresents a partially processed transcript or a genomicclone. The 1.9-kb and 2.2-kb clones were not characterizedfurther.

5' RACE. An additional 23 nt that are upstream of thebeginning of the 2.8-kb cDNA fragment were obtained withthe 5' RACE strategy. These sequences contained a potentialstart codon flanked by an adenine in the -3 position and aguanine in the +4 position, thus following Kozak's consensussequence for an initiation codon (17). Twelve base pairs of 5'untranslated sequence was obtained from the 5' RACE, butwe cannot rule out the possibility that the 5' untranslatedregion may be longer.Chromosomal Loalization. The 240-bp fragment that di-

verged from MLL in the fusion transcript was found tohybridize to chromosome 19 in a somatic cell hybrid panel(data not shown). The entire 2.8-kb clone was labeled forFISH and it hybridized exclusively to chromosome 19, bandp13.1 (Fig. 3).Northern Blot Analysis. Multiple-tissue Northern blots

were hybridized with a probe containing the first 500 bp ofthe

A BamHI

23 kb - a.,.[1~

60-8 -Hi.t..6 - .:

4 -

3 -

0

1 2 3

B BamHI Hindlil

23 kb-

8-6-4- H

3-

1 2 3 1 2 3

FIG. 2. (A) Southern blot of DNA digested with BamHI andprobed with the 0.7-kb fragment of the MLL gene. Lanes: 1, control;2, leukemia cell DNA with the t( 1;19)(q23;pl3.1) used to prepare thecDNA library; 3, DNA from a patient with a t(11;19)(q23.p13.3).Both patients exhibit two rearranged bands with this probe. (B)Southern blot ofDNA digested with BamHI or HindIIl and probedwith the first 500 bp of the 2.8-kb ELL clone. This fragment containscDNA sequences from both sides of the breakpoint. Lanes 1-3 areas in A. No rearrangements were detected with these digests.

Medical Sciences: Tifirman et al.

D

Dow

nloa

ded

by g

uest

on

Sep

tem

ber

15, 2

020

Page 3: Cloning ELL, MLLin t(11;19)(q23;pl3.1) - PNAS · (15). Thet(11;19)(q23;p13.3) breakpointis commonin acute lymphoblastic leukemia, primarily in infants and children. The translocation

12112 Medical Sciences: Thirman et al.

A EcoRi

..

0 Eqp? :.

o o= D:' i :

-4k _

FIG. 3. FISH in normal metaphase cells with the 2.8-kb fragmentofELL labeled with biotin and detected with fluorescein-conjugatedavidin. Images were captured with a CCD (charge-coupled device)camera. Chromosome 19 band p13.1 is identified with an arrowhead.

2.8-kb cDNA (Fig. 4 A and B). A 4.4-kb transcript appearedto be expressed ubiquitously, with the highest level ofexpression in skeletal muscle, placenta, testis, and peripheralblood leukocytes. A 2.8-kb transcript was also present inplacenta, peripheral blood, and testis. On prolonged expo-sures of the autoradiographs, this 2.8-kb transcript was alsovisualized in other tissues. Hybridization of this same probeto a Northern blot of hematopoietic cell lines revealedexpression of both transcripts in T-cell, B-cell, and myeloidlineages (Fig. 4C).

Evolutionary Conservation. To determine whether thisgene was evolutionarily conserved, blots containing DNAfrom multiple mammalian and other species were hybridizedwith a probe containing the first 500 bp of the 2.8-kb cDNA(Fig. 5). The equivalent intensity of the bands in all mam-malian species suggests a high degree of conservation. Inaddition to mammalian species, this gene appears to bepresent in chicken, fish, and frog.

A

7.5kb-4.4 kb-

2.4 kb -

2 kb -

I-actin

CU

0I

.B B .

. ~

CUC0 L.

-~~-_. ... _a_C

,0,

B C

IXa = %°1 I E -6 B

O- 0 0

7.5 kb - 9Ki4.4 kb- *

2.4 kb - .sf

2 kb -

;-actina m.:

0-

4.4 kb

2 kb M-actin

FIG. 4. Northern blots of RNA from multiple human tissues (Aand B) and hematopoietic cell lines (C) hybridized with the first 500bp of the 2.8-kb ELL clone. All blots were reprobed with j3-actincDNA. pbl, Peripheral blood leukocytes.

2 !4 5

2 A3 4 5 r" .7

FIG. 5. Zoo blots showing EcoRI-digested DNA probed with thefirst 500 bp of the 2.8-kb ELL clone.

Southern Blot Analysis. To determine whether DNA rear-rangements could also be observed in patient samples byprobes from the partner gene on 19p13.1, the blot that wasused to detect MLL rearrangements was rehybridized with aprobe containing the first 500 bp of the 2.8-kb cDNA (Fig.2B). This probe contains the sequences that are immediately5' and 3' to the breakpoint within the ELL cDNA and couldthus potentially detect both the der(11) and the der(19).However, no gene rearrangements could be identified, sug-gesting that the exons ofELL contained within the probe maybe far from the genomic translocation breakpoint junction onchromosome 19.Sequence Analysis. The entire 2.8-kb cDNA fragment was

sequenced on both strands. The sequence that diverged fromMLL in the fusion transcript begins at nt 148 of the 2.8-kbcDNA (Fig. 6A). The chimeric transcript generated by thistranslocation consists of 5' MLL through exon 7 fused inframe to 2747 nt ofELL (Fig. 6B). The ELL sequence in thefusion transcript contains a predicted open reading frame of576 aa followed by 930 nt of 3' untranslated sequence. Inaddition to the sequence present in the fusion transcript, the2.8-kb cDNA contains 124 nt 5' of the breakpoint on chro-mosome 19. When combined with the 23 nt obtained from 5'RACE, the sequence 5' ofthe breakpoint codes for 45 aa. Theopen reading frame of the unrearranged ELL gene codes fora predicted protein of 621 aa.Homology Searches. There are no significant nucleotide or

amino acid homologies to the other MLL partner genes thathave been cloned. However, ELL contains several regionsthat are rich in basic amino acids as has been noted in AF4,ENL, and AF9. The highest area of homology exists in ahighly basic, lysine-rich region beginning at amino acid 442 ofELL. This region is homologous to the basic region of theDNA-binding domain of poly(ADP-ribose) polymerase (Fig.7) (18-20). In addition, this region is also homologous to alysine-rich region of arginine-rich nuclear protein (21). Database searches also revealed a long stretch of limited homol-ogy to the Engrailed gene family, primarily upstream of thehomeobox domain (22). A stretch of amino acids in thislysine-rich region is identical to a repeated element in theknob-associated histidine-rich protein of Plasmodium falci-parum (23, 24). However, the adjacent amino acids are nothighly conserved. In addition, the bovine leukemia virusreceptor has a similar basic region near its N terminus (25).

DISCUSSIONAlthough MLL encodes DNA-binding motifs suggestive of atranscription factor, the normal functions of MLL have not

B EcoRiIlQ

c * r-I~

LI.V (AI

i

Proc. Natl. Acad Sci. USA 91 (1994)

Dow

nloa

ded

by g

uest

on

Sep

tem

ber

15, 2

020

Page 4: Cloning ELL, MLLin t(11;19)(q23;pl3.1) - PNAS · (15). Thet(11;19)(q23;p13.3) breakpointis commonin acute lymphoblastic leukemia, primarily in infants and children. The translocation

Proc. Natl. Acad. Sci. USA 91 (1994) 12113

A GATGGTCGCAAGATG GCG GCG CTG AAG GAG GAT AGG AGC TAC GGG CTG TCGM A A L K E D R S Y G L S

TGC GGG CGG GTT AGC GAC GGC AGC AAG GTG TCG GTG TTC CAC GTG AAGC G R V S D G S K V S V F H V K

CTC ACC GAC AGT GCC CTG AGG 6CC TTC GAG AGC TAC CGC GCC AGA CAGL T D S A L R A F E S Y R A R Q

V GAT TCT GTT TCA CTG AGG CCA TCT ATC CGA mTT CAA GGA AGC CAA GGGD S V S L R P S I R F Q G S Q G

CAC ATC TCC ATC CCC CAG CCT GAC TGC CCC GCA GAG GCG CGG ACG TTCH I S I P Q P D C P A E A R T F

TCC TTC TAC CTC TCC AAC ATC GGC CGC GAC AAC CCC CAG GGC AGC TTCS F Y L S N I G R D N P Q G S F

GAC TGC ATC CAG CAG TAT GTC TCC AGT CAT GGG GAA GTT CAC CTG GACD C I Q Q Y VS S H G E V H L D

TGC CTG GGC AGC ATA CAG GAC AAG ATC ACG GTG TGT GCC ACC GAC GACC L G S I Q D K I T V C A T D D

TCC TAC CGAG MG GCG CGG CAG AGC ATG GCC GAG CCG GAG GAG GAG ACGS Y Q K A R Q SM A Q A E E E T

CGG AGC CGA AGT GCC ATT GTC ATC AAG GCT GGA GGC CGC TAC CTG GGCR S R S A I V I K A G G R Y L G

AAG AAG GCT CAG TM CGG AAA CCA GCC CCA GGT GCA ACA GAC GCG CTGK K V Q F R K P A P G A T D A V

CCC TCC CGG AAG CGG GCA ACC CCC ATC AAC TTG GCG AGT GCC ATC AGGP S R K R A T P I N L A S A I R

AAG AGT GGT GCC ACT GCC GTG AGT GGG GGC AGC GGG GTG TCC CAG AGGK S G A S A V S G G S G V S Q R

CCC TIC CGT GAC CGA GTG CTG CAC CTC CTG GCA CTA CGG CCC TAC CGCP F R D R V L H L L A L R P Y R

AAG GCT GAG CTG CTG CTG CGA CTG CAG AAG GAC GGC CTG ACG CAG GCGK A E L L L R L Q K D G L T Q A

GAC AAG GAC GCG CTG GAT GGC CTC CTC CAG CAG GTG GCC AAC ATG AGTD K D A L D GL L Q Q V A N M S

GCT AAG GAC GGC ACG TGT ACA CTG CAG GAC TGC ATG TAC AAG GAT GTGA K D G T C T L Q D C M Y K D VCAG AAG GAC TGG CCT GGC TAC TCG GAG GCG GAC CAG CAG CTG CTG AAGQ K D W P G Y S E G D Q Q L L KCGG GTG CTC GTC CGG AAG CTG TGC CAG CCA CAG AGC ACT GGC AGC CTCR V L V R K L C Q P Q S T G S L

CTT GGA GAC CCT GCT GCC TCC AGC CCC CCA GGC GAG CGT GGG CGC TCGL G D P A A S S P G E R G R S

GCC TCG CCC CCA CAG AAG CGG CTG CAG CCT CCT GAT TTC ATC GAC CCCA S P P Q K R L Q P P D F I D PCTA GCC AAC AAG AAA CCC CGG ATA TCG CAC TTC ACT CAG AGA GCT CAGL A N K K P R I S H F T Q R A Q

CCT GCC GTC AAC GGG AAG CTG GGC GTC CCC AAT GGC C6T GAG GCC TTGP A V N G K L G V P N G R E A L

CTG CCC ACC CCG GGC CCA CCA GCC AGC ACG GAC ACC CTC AGC TCC AGCL P T P G P P A S T D T L S S S

ACT CAC CTG CCC CCG CGG CTG GAG CCC CCG AGG GCC CAC GAC CCC CTGT H L P P R L E P P R A H D P LGCC GAT GTC AGC AAT GAC CT6 GGC CAC AGC GGC CGA GAC TGT GAG CACA D V S N D L G H S G R D C E HGGA GAG GCG GCT GCC CCA GCC CCC ACT GTG CGC CTC GGC CTG CCC CTGG E A A A P A P T V R L G L P L

CTG ACG GAC TGT GCC CAG CCC AGC AGG CCA CAC GGC AGC CCC TCG CGCL T D C A Q P S R P H G S P S R

AGC AAG CCC AAG AAG AAG TCC MG AAG CAC AAA GAC AAG GAG AGG GCGS- K P K K K S K K H K D K E R A

GCT. GAG GAC AAG CCC CGG GCC CAG CTT CCA GAC TGT GCA CCT GCC ACC* E D K P R A Q L P D C A P A TCAT GCC ACC CCC GGA GCC CCA GCA GAC ACC CCA GGT TTA AAC GGA ACCH A T P G A P A D T P G L N G T

TGC AGC GTT TCC AGT GCT CCC ACG TCC ACG TCG GAG ACG CCT GAC TACC- ST V S S V P T S T S E T P D Y

TTi C AAG TAC GCA GCC ATC TCC TCT TCG GAG CAG CGC CAG AGC TACL LL K Y A A I S S S E Q R Q S Y

MCG AAC GAC TTC AAT GCC GAG TAC AGC GAG TAC CGC GAC CTG CAC GCCK N S F N A E Y S E Y R D L H ACGCAT GAG CGC ATC ACG CCG CGG TTC ACC CAG CTC GAC GCC CAG CTCt r E R I T R R F T Q L D A Q L

CGC CAM CTC TCC CAG GGC TCC GAG GAG TAT GAG ACT ACT CGA GGG CAGR Q L S Q G S E E Y E T T R G Q

AT! TI CAG GAA TAT CGA AAA ATC AAA AAG ACC AAC ACC AAC TAC AGCI L Q E Y R K I K K T N T N Y SCAGCGAG MG CAC CGC TCC GAG TAC CTG CAC AGC AAG CTG GCC CAC ATCQ E K H R C E Y L H S K L A H I

AAG AGG CTC ATC GCC GAG TAC GAC CAG CGG CAG CTG CAG GCT TGG CCC TAGK R L I A E Y D Q R Q L Q A W P *

CCGCCCTCCCCGATGGCGGGGATCTGGGAGGGTCGGCGGAGCAAAAGGCGGTGAGAGAGGATITTrAMAAAAT

TACMMGAGAAGCAATCTrrMCaACCTGTGCCTCCCTCaCGGAACCGCAGGGTaAGCCGCTGCCCTCGGTCCTGCCTGCAGTATTACAGAGTCGTCGMAAGGTGCAGCTGCGTTCTGAGGGCGTGGAATGGGCAGGTGGCCTCTGCTGGTCTCTGGCTCTACGTAGGCACCCC1CCCCAGCCTCTC CTCCTTGGGCAGGTCTGCTCCAGCAAGCACAGGGTGGCCCAGGTGGCTATMGAGAMCTCCAGCTGGTGCCCCCAGACAGCTGCTCAGAGCCAACGGGGGCAGAGGGCTTTCAGCGCCCCCAGGCCTGCCCTGCTATITCAGGCCCTCAGCTGTCGGGGGCCACTGTGTIICTGTGCTCCAAGTGAGACTCGGCCGCAGCGGCGTCAGACTMCTTCGCGATGTCCTCGGTTMCCCATITGTIGCTGCTGCTGCTCAT1CCACACTGTTGAGACCTTGTGGTCTCGATGCTGCTGGCCTCCCTCCGTCCCTCTGTCCACTTGTGGGTCCTGGGGTC

BMEL ENN 7 E| FRCM NUMLE DE 148

CA A MC AGA GE AAG GNT GITICA AM CXAH R I R V D F K D S V S L R P S

FIG. 6. (A) Nucleotide sequence and predicted open readingframe of the 2.8-kb ELL fragment. The breakpoint is indicated by anarrow. The termination codon is indicated by an asterisk. (B)Sequence of the fusion transcript.

yet been determined. Moreover, little is known about thealtered function of MLL that is a consequence of 11q23translocations. The creation of in-frame chimeric transcriptsas a result of these translocations suggests that the sequences

contributed by the partner genes are also critical to thisprocess. Because of the large number of 11q23 aberrationsthat involve MLL, we reasoned that comparison of MLLpartner genes might define common domains that are essen-tial for the leukemogenic function of the fusion transcript. Tofacilitate the characterization of the nature of the chimerictranscripts that form in these translocations, we have clonedthe gene (ELL) that fuses to MLL in t(11;19)(q23;p13.1), arecurring cytogenetic abnormality in acute myeloid leuke-mia. The ELL gene is highly conserved and expressedubiquitously as a 4.4-kb message and, in certain tissues, alsoas a 2.8-kb transcript. Previous studies of deletions that occurin the formation of 11q23 translocations have shown that theder(11) encodes the fusion transcript that is critical to leu-kemogenesis. Thus, in t(11;19)(q23;p13.1), this transcriptcontains 4.2 kb of 5' MLL sequencejuxtaposed to at least 2.7kb of 3' ELL sequence and codes for a predicted fusionprotein of 2015 aa. Additional 3' untranslated ELL sequencemay also be present.Data base searches of the nucleotide sequence and the

predicted protein of ELL reveal that it is a novel gene. Theregion with the most significant homologies to previouslyidentified genes lies in a highly basic, lysine-rich region nearthe C terminus of the ELL protein. Utilizing the Chou-Fasman (26) algorithm, computer modeling of the three-dimensional structure of ELL places this domain at a helicalportion of the predicted protein. The best-characterized genewith a strong homology to ELL is poly(ADP-ribose) poly-merase, a protein that ADP-ribosylates various nuclear pro-teins, including histones, DNA topoisomerases, and DNApolymerases. Poly(ADP-ribose) polymerase is involved incellular proliferation and is critical to the repair of double-strand DNA breaks and single-strand nicks (27). The C-ter-minal portion of the DNA-binding domain of poly(ADP-ribose) polymerase contains a highly basic, lysine-rich motifthat is homologous to ELL, but the N-terminal portioncontains two zinc fingers that are not present in ELL. Thezinc fingers are required for recognition of double-strandbreaks and single-strand nicks. Whereas deletion or mutationofboth zinc fingers eliminates recognition of strand breaks ornicks, binding to intactDNA does not require the zinc fingers(28). Moreover, studies of the C-terminal portion of theDNA-binding domain of poly(ADP-ribose) polymerase bySouthwestern blot analysis have revealed that DNA bindingis preserved in the absence of the N-terminal portion of theDNA-binding domain (29). In addition, a bipartite nuclearlocalization signal has been identified in the C-terminalportion of the DNA-binding domain of poly(ADP-ribose)polymerase (30). The second portion of this sequence(KKKSKK) is identical to a segment of the lysine-rich motifof ELL, but the first portion of the bipartite signal is notpresent in ELL. Functional studies will be necessary todetermine whether this region ofELL possesses the capacityfor either DNA binding or nuclear localization.Homologies to other genes with highly basic, lysine-rich

motifs have been identified. Arginine-rich nuclear proteinwas cloned from an expression library screened with anantibody to nuclear proteins. The precise function of arg-inine-rich nuclear protein has not been determined, but itcontains a region of homology to the U1 protein of smallnuclear ribonucleoprotein particles and is thus thought to beinvolved with RNA processing. The homology to the en-grailed family ofgenes spans a region in the EH-4 domain andthe beginning of the homeodomain. Taken together, thesehomologies suggest that ELL may also be a nuclear proteinthat potentially has DNA binding properties. A similar lysine-rich motif is also present in the bovine leukemia virusreceptor and the knob-associated histidine-rich protein ofPlasmodiumfalciparum. The significance of the homologiesto these two genes is not clear.

Medical Sciences: Thirman et al.

Dow

nloa

ded

by g

uest

on

Sep

tem

ber

15, 2

020

Page 5: Cloning ELL, MLLin t(11;19)(q23;pl3.1) - PNAS · (15). Thet(11;19)(q23;p13.3) breakpointis commonin acute lymphoblastic leukemia, primarily in infants and children. The translocation

12114 Medical Sciences: Thirman et al.

0)C.Q

ELL SPSRSKPKKKSKKHKDKERAAEDKPRAQ

Human Poly (ADP-ribose) polymerase dgvdevaKKKSKKeKDKdsklEkalkAQ

Human Arginine-rich nuclear protein

Knob-associated histidine-rich protein(Plasmodium falciparum)Bovine leukemia virus receptor

Human En-..

SPSprrhKKekKKdKDKERsrdereRst

hkdhdgeKKKSKKHKDnEdAesvKskkh

ekkskKPKKKeKKHKeKER--ekkKkeve

gPrsrKPKKK-npnKedkRprt aftaeQ

FIG. 7. Amino acid homologies. Identical amino acids shown as uppercase letters and divergent amino acids as lowercase letters. Identicaland conserved amino acids are on a shaded background.

None of the six MLL fusion partner genes cloned to datehad previously been identified. The potential normal func-tions of these genes have not been characterized and thuscomparisons of these genes can be based only on theirnucleotide sequence and predicted proteins. The proteinsencoded by AF4, AF9, and ENL contain nuclear targetingsequences and regions rich in serine and proline residues,suggesting that these proteins may act as transcription fac-tors. These three proteins also have segments that are rich inbasic amino acids, but none of these proteins contains aknown DNA-binding domain. AF6 and AFlp do not appearto be similar to AF4, AF9, and ENL. AF6 encodes shortstretches rich in prolines, charged amino acids, serines, andglutamines and does not have a nuclear localization signal.AF6 also encodes a GLGF motif found in proteins thought tobe involved in signal transduction. AFlp encodes three acidicdomains, a glutamine-rich region, and 15 copies of a DPFtripeptide repeat of unknown significance. AFlp is highlyhomologous to the murine eps 15 gene that encodes acytoplasmic phosphoprotein. Only partial sequence of theAFX gene has been published. ELL is not homologous to anyof the partner genes that have been cloned. Like AF4, AF9,and ENL, ELL encodes segments that are rich in basic aminoacids, although this is a more prominent feature ofELL thanof the other translocation partners. In addition, ELL alsoencodes a potential nuclear localization sequence.

Further work will be necessary to elucidate the normalfunctions of ELL as well as the contribution of its 3' se-quences to leukemogenesis. The homology of its highly basic,lysine-rich region to a DNA-binding domain suggests thatELL may also be a nuclear protein with DNA-bindingproperties. One potential consequence ofthe formation oftheMLL/ELL fusion transcript may be an altered DNA-bindingspecificity that results from the substitution of the DNA-binding domain of ELL for the loss of the zinc fingersencoded by 5' MLL. The characterization of ELL shouldprovide insights into its function and into the pathogenesis ofacute leukemia.

We thank Lisa Gottschalk for the preparation of figures. Thisresearch was supported in part by grants from the National Institutesof Health (CA42557 to J.D.R. and HL52094 to M.C.S.), from theDepartment of Energy (DE-FG02-86ER60408 to J.D.R.), and fromthe Spastic Paralysis Research Foundation, Illinois-Eastern IowaDistrict of Kiwanis International (to J.D.R.). M.J.T. is a recipient ofa Postdoctoral Research Fellowship for Physicians from the HowardHughes Medical Institute.

1. Rowley, J. D., Diaz, M. O., Espinosa, R., III, Patel, Y. D., van Melle,E., Ziemin-van der Poel, S., Taillon-Miller, P., Lichter, P., Evans, G. A.,Kersey, J. H., Ward, D. C., Domer, P. H. & Le Beau, M. M. (1990)Proc. Natl. Acad. Sci. USA 87, 9358-9362.

2. Ziemin-van der Poel, S., McCabe, N. R., Gill, H. J., Espinosa, R., III,

Patel, Y. D., Harden, A. M., Le Beau, M. M., Smith, S. B., Rowley,J. D. & Diaz, M. 0. (1991) Proc. Natl. Acad. Sci. USA 88, 10735-10739.

3. Djabali, M., Selleri, L., Parry, P., Bower, M., Young, B. D. & Evans,G. A. (1992) Nat. Genet. 2, 113-118.

4. Gu, Y., Nakamura, T., Alder, H., Prasad, R., Canaani, O., Cimino, G.,Croce, C. M. & Canaani, E. (1992) Cell 71, 701-708.

5. Tkachuk, D. C., Kohler, S. & Cleary, M. L. (1992) Cell 71, 691-700.6. Thirman, M. J., Gill, H. J., Burnett, R. C., Mbangkollo, D., McCabe,

N. R., Kobayashi, H., Ziemin-van der Poel, S., Kaneko, Y., Morgan, R.,Sandberg, A. A., Chaganti, R. S. K., Larson, R. A., Le Beau, M. M.,Diaz, M. 0. & Rowley, J. D. (1993) N. Engl. J. Med. 329, 909-914.

7. Zeleznik-Le, N. J., Harden, A. M. & Rowley, J. D. (1994) Proc. Nat!.Acad. Sci. USA 91, 10610-10614.

8. Domer, P. H., Fakharzadeh, S. S., Chen, C.-S., Jockel, J., Johansen, L.,Solverman, G. A., Kersey, J. H. & Korsmeyer, S. J. (1993) Proc. Natl.Acad. Sci. USA 90, 7884-7888.

9. Nakamura, T., Alder, H., Gu, Y., Prasad, R., Canaani, O., Kamada, N.,Gale, R. P., Lange, B., Crist, W. M., Nowell, P. C., Croce, C. M. &Canaani, E. (1993) Proc. Nat!. Acad. Sci. USA 90, 4631-4635.

10. Prasad, R., Gu, Y., Alder, H., Nakamura, T., Canaani, O., Saito, H.,Huebner, K., Gale, R. P., Nowell, P. C., Kuriyama, K., Miyazaki, Y.,Croce, C. M. & Canaani, E. (1993) Cancer Res. 53, 5624-5628.

11. Bernard, 0. A., Mauchauffe, M., Mecucci, C., Van Den Berghe, H. &Berger, R. (1994) Oncogene 9, 1039-1045.

12. Corral, J., Forster, A., Thompson, S., Lampert, F., Kaneko, Y., Slater,R., Kroes, W. G., van der Schoot, C. E., Ludwig, W. D., Karpas, A.,Pocock, C. & Cotter, F. (1993) Proc. Natl. Acad. Sci. USA 90, 8538-8542.

13. Thirman, M. J., Mbangkollo, D., Kobayashi, H., McCabe, N. R., Gill,H. J., Rowley, J. D. & Diaz, M. 0. (1993) Proc. Am. Assoc. CancerRes.34, 495 (abstr.).

14. Rowley, J. D. (1992) Gene Chromosome Cancer 5, 264-266.15. Huret, J. L., Brizard, A., Slater, R., Charrin, C., Bertheas, M. F.,

Guilhot, F., Hahlen, K., Kroes, W., van Leeuwen, E., Schoot, E. V. D.,Beishuizen, A., Tanzer, J. & Hagemeier, A. (1993) Leukemia 7,152-160.

16. Mitelman, F., ed. (1991) ISCN Guideline for Cancer Cytogenetics(Krager, Basel).

17. Kozak, M. (1986) Cell 44, 283-292.18. Cherney, B. W., McBride, 0. W., Chen, D., Alkhatib, H., Bhatia, K.,

Hensley, P. & Smulson, M. E. (1987) Proc. Natl. Acad. Sci. USA 84,8370-8374.

19. Uchida, K., Morita, T., Sato, T., Ogura, T., Yamashita, R., Noguchi, S.,Suzuki, H., Nyunoya, H., Miwa, M. & Sugimura, T. (1987) Biochem.Biophys. Res. Commun. 148, 617-622.

20. Kameshita, I., Matsuda, Z., Taniguchi, T. & Shizuta, Y. (1984) J. Biol.Chem. 259, 4770-4776.

21. Chaudhary, N., McMahon, C. & Blobel, G. (1991) Proc. Natl. Acad. Sci.USA 88, 8189-8193.

22. Logan, C., Hanks, M. C., Noble-Topham, S., Nallainathan, D., Provart,N. J. & Joyner, A. L. (1992) Dev. Genet. 13, 345-358.

23. Triglia, T., Stahl, H. D., Crewther, P. E., Scanlon, D., Brown, G. V.,Anders, R. F. & Kemp, D. J. (1987) EMBO J. 6, 1413-1419.

24. Ardeshir, F., Fling, J. E., Yoshitsugu, M., Aikawa, M., Reese, R. T. &Stanley, H. (1987) EMBO J. 6, 1421-1427.

25. Ban, J., Portetelie, D., Altaner, C., Horion, B., Milan, D., Krchnak, V.,Burny, A. & Kettmann, R. (1993) J. Virol. 67, 1050-1057.

26. Chou, P. Y. & Fasman, G. D. (1978) Adv. Enzymol. 47, 45-148.27. de Murcia, G., Menissier-de Murcia, J. & Schreiber, V. (1991) BioEssays

13, 455-461.28. Ikejima, M., Noguchi, S., Yamashita, R., Ogura, T., Sugimura, T., Gill,

D. M. & Miwa, M. (1990) J. Biol. Chem. 265, 21907-21913.29. Thibodeau, J., Potvin, F., Kirkland, J. B. & Poirier, G. (1993) Biochim.

Biophys. Acta 1163, 49-53.30. Schreiber, V., Molinete, M., Boeuf, H., de Murcia, G. & Menissier-de

Murcia, J. (1992) EMBO J. 9, 3263-3269.

Proc. Natl. Acad. Sci. USA 91 (1994)

Dow

nloa

ded

by g

uest

on

Sep

tem

ber

15, 2

020