
Transcription by RNA polymerase II: initiator-directed formation of transcription-competent complexes

LISA WEIS AND DANNY REINBERG²

Department of Biochemistry, Robert Wood Johnson Medical School, University of Medicine and Dentistry of New Jersey, Piscataway, New Jersey 08854-5635

0892-6638/92/0006-3300/$01.50. © FASEB

ABSTRACT Studies of transcription by RNA polymerase II have revealed two promoter elements, the TATA motif and the initiator (Inr), capable of directing specific transcription initiation. Although binding to the TATA motif by one of the components of the transcription machinery has been shown to be the initial recognition step in transcription complex formation, many promoters that lack a traditional TATA motif have recently been described. In such TATA-less promoters, the Inr element is critical in positioning RNA polymerase II. Various Inr elements have been described and classified according to sequence homology. These Inr elements are recognized specifically by Inr-binding proteins. Interaction between these Inr-binding proteins and components of the basal transcription machinery provides a means through which a transcription-competent complex can be formed.

Weis, L.; Reinberg, D. Transcription by RNA polymerase II: initiator-directed formation of transcription-competent complexes. FASEB J. 6: 3300-3309; 1992.

Key Words: RNA polymerase II; transcription; gene regulation

ONE OF THE MOST IMPORTANT WAYS in which gene expression is regulated is through transcription initiation. The mechanism by which an RNA molecule is accurately synthesized is highly ordered in both prokaryotes and eukaryotes. In prokaryotes, all RNA synthesis is accomplished by a single DNA-dependent RNA polymerase. Specific promoter recognition is achieved through the association of RNA polymerase with various sigma factors. Consistent with the complexity of the genome, in eukaryotes the process of transcription is mediated by three different DNA-dependent RNA polymerases: I (RNAP I),³ II (RNAP II), and III (RNAP III). Each eukaryotic RNA polymerase is specific for transcribing unique sets of genes: RNAP I synthesizes ribosomal pre-RNA; RNAP II transcribes protein-coding (class II) genes and most small nuclear RNA (snRNA); and RNAP III synthesizes small RNA species such as tRNA and ribosomal 5S RNA.

Although the bacterial RNA polymerase, in cooperation with sigma factors, can recognize the promoter, all eukaryotic RNA polymerases require accessory proteins to acquire specificity. In the case of RNAP II, the accessory proteins required for recognition of promoters of protein-encoding genes are known as the general transcription factors (GTFs). To date, seven have been identified (reviewed in ref 1). Six of these (TFIIB, TFIID, TFIIE, TFIIF, TFIIH, and TFIIJ) are required for transcription, and one, TFIIA, contains a stimulatory activity (1). Of these, only TFIID has been shown to have specific promoter binding activity. In higher eukaryotes, TFIID is a multisubunit complex consisting of a polypeptide that binds to the TATA element (TATA binding protein or TBP)⁴ present in most promoters (see below) and additional TBP-associated factors (TAFs). TAFs include molecules that are considered to be involved in modulating transcription activation (1). In addition to RNAP II and the GTFs, transcription of protein-encoding genes involves site-specific DNA binding proteins. These factors (specific transcription factors) appear to regulate levels of expression, and in some instances confer the response to defined regulatory processes. Apart from specific regulatory elements present in promoters, there are two DNA elements common to many promoters of protein-encoding genes. These are the TATA motif, which in promoters of higher eukaryotes is located approximately 30 nucleotides upstream (-30) of the transcription start site (+1), and the Initiator (Inr) element, which encompasses the transcription start site. Although it was initially believed that most protein-encoding genes contained a TATA motif, many promoters transcribed by RNAP II, especially those of housekeeping genes, lack a TATA element. These are referred to as TATA-less (reviewed in ref 2).

Studies using the TATA-less promoter of the terminal deoxynucleotidyl transferase (TdT) gene have led to the identification of a 17-base pair element encompassing the transcription start site (Inr element) that contains all the information necessary for determining specific initiation of transcription both in vivo and in vitro (3). Upon further inspection, it was found that other promoters, such as the adenovirus major late (ML) and IVa2 promoters, also contained Inr elements homologous to that of the TdT gene (3, 4). It now appears that most promoters contain an Inr, although the nucleotide sequence of this element is not conserved. Similarities do exist, and on the basis of sequence homology various Inr elements have been grouped into families (5). The TdT, ML, and IVa2 Inr elements belong to one family (5). Other Inr elements identified include those of the porphobilinogen deaminase (PBGD) (6), dihydrofolate reductase (DHFR) (7), ribosomal protein gene (8), and the adeno-associated virus p5 promoters (9). Each Inr seems to belong to its own family, although exactly how a transcriptionally competent complex assembles on each is not well understood. The mechanism by which specific transcription is accomplished by TATA-less promoters is the subject of this review.

¹From the Symposium “Mechanisms of Transcription” presented at the American Society for Biochemistry and Molecular Biology/Biophysical Society Joint Meeting, Houston, Texas, February 12, 1992.

²To whom correspondence should be addressed, at: Department of Biochemistry, University of Medicine and Dentistry of New Jersey, 675 Hoes Lane, Piscataway, NJ 08854-5635, USA.

³Abbreviations: Inr, initiator; RNAP I, RNA polymerase I; snRNA, small nuclear RNA; GTFs, general transcription factors; TAFs, TBP-associated factors; TdT, terminal deoxynucleotidyl transferase; ML, major late; PBGD, porphobilinogen deaminase; DHFR, dihydrofolate reductase; hTBP, human TBP; yTBP, yeast TBP; dTBP, Drosophila TBP; MLP, ML promoter; EMSA, electrophoretic mobility shift assay; CBF, cap binding factor; HIP-1, housekeeping initiator protein 1; CTD, COOH-terminal domain; CAD, carbamoyl phosphate synthetase-aspartate transcarbamylase-dihydroorotase; rp, ribosomal protein; MMLV, Moloney murine leukemia virus; UCRBP, upstream conserved region binding protein; β-pol, DNA polymerase β; ITF, initiation transcription factor; LBP-1, leader binding protein 1; ITF, Inr binding protein; USF, upstream stimulating factor.

⁴TBP will refer to the recombinant TATA binding protein and TFIID to the multisubunit complex isolated from human cells.

The first step during formation of a transcription-competent complex on a TATA-containing promoter is the association of TBP (or TFIID) with the TATA motif (10, 11). Binding of TBP provides a site of nucleation through which RNAP II and the rest of the GTFs can sequentially associate to form a transcriptionally competent complex (see Fig. 1) (1, 10, 11). Because none of the GTFs except TBP is able to bind promoter sequences specifically, how do complexes form on TATA-less promoters? Moreover, is TBP required for transcription from these promoters? Studies using the TdT-Inr provided the first clues to understanding the mechanism (or mechanisms) underlying the transcription from TATA-less promoters.
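The sequential assembly pathway summarized in Fig. 1 can be written down as an ordered list, which makes the dependency of each factor on the preceding ones explicit. The following is only an illustrative sketch: the data structure and the function names (`required_before`, `complex_formed_by`) are assumptions introduced here, and the complex names are those given in the figure legend. Steps for which the legend gives no complex name are recorded as `None`.

```python
# Ordered entry of factors on a TATA-containing promoter (after Fig. 1).
# Each entry: (factors entering together, name of the resulting complex).
# None means the figure legend does not name the intermediate.
ASSEMBLY_STEPS = [
    (("TFIID",), None),                 # binds the TATA motif first
    (("TFIIA",), None),                 # stimulates TFIID binding
    (("TFIIB",), "DAB"),
    (("RNAP II", "TFIIF"), "DABPolF"),  # polymerase enters together with TFIIF
    (("TFIIE",), "DABPolFE"),
    (("TFIIH",), "DABPolFEH"),
    (("TFIIJ",), "complete preinitiation complex"),
]


def required_before(factor):
    """Return the factors already bound when `factor` enters the complex."""
    bound = []
    for factors, _name in ASSEMBLY_STEPS:
        if factor in factors:
            return bound
        bound.extend(factors)
    raise KeyError(f"unknown factor: {factor}")


def complex_formed_by(factor):
    """Return the name of the complex formed when `factor` enters, if named."""
    for factors, name in ASSEMBLY_STEPS:
        if factor in factors:
            return name
    raise KeyError(f"unknown factor: {factor}")


if __name__ == "__main__":
    print(required_before("TFIIE"))
    print(complex_formed_by("TFIIH"))
```

The list encodes only the order of association on a TATA-containing promoter; it says nothing about how assembly proceeds on TATA-less promoters, which is the question taken up in the text.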

Figure 1. The first step in the formation of a preinitiation complex on a TATA-containing promoter is the binding of TFIID to the TATA motif (10, 11). In higher eukaryotes, TFIID is a multisubunit protein consisting of TATA binding protein (TBP), which in humans has a molecular mass of 38 kDa, and TBP-associated factors (TAFs) (60-62). TFIID has a molecular mass of >100 kDa (26). Binding of TFIID is stimulated by the association of TFIID with TFIIA (10, 11). Human TFIIA appears to be a heterotrimer with polypeptides of 34, 19, and 14 kDa (upper right-hand panel) (23). Binding of TFIID to the TATA motif permits TFIIB, a single 33-kDa polypeptide (63, 64), to enter the complex, forming the DAB complex (lower right-hand panel) (10, 11). The DAB complex is recognized by the nonphosphorylated (IIA) form of RNAP II, in association with TFIIF (44). TFIIF is a tetramer consisting of two polypeptides of 74 and 30 kDa (according to mobility in PAGE-SDS) in a configuration of α2β2 (65). Binding of RNAP II and TFIIF to the DAB complex results in formation of the DABPolF complex (lower left-hand panel). The DABPolF complex is recognized by TFIIE, a tetramer consisting of two polypeptides that have molecular masses of 56 and 34 kDa, forming the DABPolFE complex (66). The DABPolFE complex is recognized by TFIIH, whose activity coelutes with five polypeptides having molecular masses of 92, 62, 40, 43, and 35 kDa. Association of TFIIH with the DABPolFE complex results in formation of the DABPolFEH complex (67). TFIIJ, an activity not yet well characterized, is able to recognize and bind to this complex, resulting in a complete preinitiation transcription complex (23, 67, upper left-hand panel).

The TdT-Inr family

Expression of the TdT gene is highly regulated during B and T lymphocyte differentiation. The TdT promoter does not contain a TATA motif or binding sites for common specific transcription factors such as Sp1, CAAT factors, or octamer binding protein. Yet it does contain an element, YYCAYYYYY (-3 to +6),⁵ which has a weak homology with sequences found in the region encompassing +1 of many TATA-containing promoters. Unlike many TATA-less promoters, transcription from the TdT promoter initiates from a single nucleotide, not from a cluster of sites surrounding +1. Deletion analysis of this promoter identified a 17-base pair element, the Inr, capable of specifically positioning transcription initiation in vitro and in vivo (ref 3 and references therein). Further mutational analysis narrowed the TdT-Inr element to nine nucleotides with a consensus sequence of CTCANTCT (-3 to +5) (12). Transcription directed by the TdT-Inr is capable of activation by either a traditional TATA box or the SV40 21 base pair repeats, which contain six binding sites for the transcription factor Sp1 (3, 12). Sp1 binds to a GC-rich element present in the promoters of many genes, especially those of housekeeping genes. With the discovery of the Inr as a DNA element capable of directing basal transcription as well as Sp1-activated transcription, it was important to analyze the role of the GTFs, including TBP, in Inr-directed transcription.
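A consensus such as CTCANTCT (-3 to +5) can be checked against a sequence mechanically. The sketch below is an illustration only, not a method used in the studies cited: it translates the degenerate symbols (Y = C or T, N = any base) into a regular expression and reports each match in promoter coordinates, which skip zero (-1 is followed directly by +1). The function names are assumptions made here, and the test sequence is the adenovirus ML Inr as quoted later in this review (-6 to +11).

```python
import re

# Degenerate nucleotide symbols used in the text (Y = pyrimidine, N = any base).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "[CT]", "N": "[ACGT]"}


def consensus_to_regex(consensus):
    """Translate a consensus such as 'CTCANTCT' into a regex pattern string."""
    return "".join(IUPAC[base] for base in consensus)


def promoter_coord(index, first_pos):
    """Convert a 0-based index into a promoter coordinate (there is no position 0)."""
    pos = first_pos + index
    if first_pos < 0 and pos >= 0:
        pos += 1  # skip the nonexistent position 0
    return pos


def find_inr(seq, first_pos, consensus="CTCANTCT"):
    """Return (start, end) promoter coordinates of every consensus match in seq."""
    pattern = "(?=" + consensus_to_regex(consensus) + ")"  # lookahead: allow overlaps
    hits = []
    for match in re.finditer(pattern, seq):
        start = match.start()
        end = start + len(consensus) - 1
        hits.append((promoter_coord(start, first_pos), promoter_coord(end, first_pos)))
    return hits


# Adenovirus ML Inr, positions -6 to +11 (sequence as quoted in this review).
ML_INR = "GTCCTCACTCTCTTCCG"

if __name__ == "__main__":
    print(find_inr(ML_INR, first_pos=-6))
```

A match says only that the sequence fits the consensus; whether a given element functions as an Inr is an experimental question, as the HIV-1 example discussed later in the review shows.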

To determine whether the TATA binding protein was necessary for transcription from a TATA-less promoter, the ability of heat-treated HeLa cell nuclear extract to transcribe two synthetic promoters, one containing only a TATA motif and one containing only the TdT-Inr element, was assessed (13). The placement of multiple Sp1 sites upstream from both core elements provided a means to analyze the role of TFIID on activated and basal transcription from TATA-containing and TATA-less promoters. Heating nuclear extract selectively inactivates TFIID (14). Neither the TATA-containing nor the TATA-less promoter could be transcribed using heat-inactivated extract. Although both human TBP (hTBP) and yeast TBP (yTBP) were capable of restoring basal transcriptional levels to the TATA-containing promoter, only hTBP was able to restore activated transcriptional levels to this promoter under these conditions. Addition of either recombinant hTBP or yTBP had no effect on transcription from the TATA-less promoter (12, 13). Even though these results may indicate that transcription from a TATA-less promoter occurs independently of the TATA-binding protein, further research has demonstrated that this is not the case. The observations pertaining to both the TATA-containing and TATA-less promoters result from the complexity of TFIID. The human and yeast TBP proteins are highly conserved, with 80% identity in 181 amino acids at the COOH-terminus (reviewed in ref 15). However, the human protein contains a longer and unique amino terminus, which appears to be the site of interactions with other polypeptides known as TBP-associated factors (TAFs). TAFs are tightly associated with TBP, and consequently can only be dissociated from TBP in the presence of denaturing agents. These TAFs appear to be required for activation, and in some instances for transcription from the TATA-less promoter (ref 15 and references therein).

As mentioned, heat-inactivated nuclear extract is incapable of transcribing a TATA-less promoter, indicating that a heat-labile component, possibly TBP, is required for transcription. Accordingly, Smale et al. (12) found that TFIID, not yTBP, was able to restore transcription from the TdT-Inr in heat-inactivated nuclear extract. That TBP is indeed required for Inr-driven transcription was demonstrated by two independent observations: 1) Carcamo et al. (5), using the Ad-MLP in which the TATA motif was eliminated, found that transcription in vitro was dependent on TBP; and 2) Pugh and Tjian (16) showed that the addition of polyclonal antibodies recognizing human TBP to nuclear extracts inhibited transcription from both TATA-containing and TATA-less promoters. Addition of antibodies against the divergent NH2-terminus of Drosophila TBP was without effect.

As neither recombinant yTBP nor hTBP is able to restore transcription in heat-inactivated nuclear extract from the aforementioned synthetic TATA-less promoter, it was postulated that Sp1 activation of a TATA-less promoter involves an additional heat-labile, coactivator-like activity called the tethering factor (13, 16). The proposed function of the tethering factor is to bridge Sp1 with the basal transcription machinery by interacting with the TFIID complex. To date, a functional tethering factor has not been isolated, perhaps because it is an extremely labile component of the TFIID complex. Results consistent with the presence of a tethering factor were obtained by immunoprecipitation of the TFIID complex (16). TBP-immunodepleted extract is transcriptionally inactive on both the TATA-containing and TATA-less synthetic promoters. Although addition of TBP to the immunodepleted extract does not restore either basal or Sp1-activated levels of transcription to the TATA-less promoter, immunoprecipitated TFIID does restore activated transcription levels (16). Thus, endogenous TFIID is required for transcription from the TdT-Inr.

Additional evidence that the mechanisms through which activated transcription occurs from a TATA-less promoter and a TATA-containing promoter differ was obtained by in vivo experiments performed by Colgan and Manley (17). The effect of the intracellular concentration of TBP on various promoters containing different Inr elements was examined. Overexpression of Drosophila TBP (dTBP) in Drosophila cells increased transcription from cotransfected TATA-containing promoters, but had a small or a negative effect on transcription from cotransfected TATA-less promoters. Furthermore, competition with excess TATA sequences resulted in repressed transcription from both TATA-containing and TATA-lacking promoters. Overexpression of dTBP was only able to overcome repression on the TATA-containing promoter (17). These experiments indicated that a factor that is not TBP, but able to associate with TBP, is rate-limiting for TATA-less promoters and that TBP itself is rate-limiting for TATA-containing promoters.

On the basis of these results, it appears that TBP is required for transcription from TATA-less promoters containing the TdT-Inr. It also seems that factors other than those necessary for transcription from a TATA-containing promoter may be involved, or that the same factors may function differently in transcription from a TATA-less promoter. How TFIID enters a transcriptionally competent complex and how such a complex is formed on this family of TATA-less promoters remained unclear. To elucidate which GTFs and associated proteins are required and how they form a transcriptionally competent complex on the TdT-Inr, two other promoters containing similar Inrs were studied, namely, the adenovirus ML and IVa2 promoters.

The start site for transcription from the adenovirus IVa2 promoter is located 210 base pairs upstream from the transcription start site of the ML promoter (MLP), and transcription from these promoters occurs on opposite strands (see Fig. 2). Mutation of the TATA motif in the MLP negatively affects transcription of this promoter in vivo and in vitro and results in the appearance of heterogeneous start sites clustered around +1 (5). Initially the IVa2 promoter was believed to be TATA-less because it lacked a consensus TATA motif at -30. Surprisingly, the IVa2 promoter contains a TATA-like (5'-TATAGAAA-3') element approximately 20 nucleotides downstream from the transcription initiation site that binds TBP (4). Binding of TBP to this downstream element may aid in positioning the preinitiation complex, resulting in increased levels of accurately initiated transcription. Mutations in this element that abolish TBP binding result in decreased levels of accurately initiated transcription in vitro. Perhaps the strongest evidence that the binding of TBP to this element is important is that substitution with a consensus TATA element results in higher levels of TBP binding, and consequently in increased levels of accurately initiated transcription, than the wild-type IVa2 promoter (4). Because TATA elements of the ML and IVa2 promoters are oriented in the same direction, but in the case of the IVa2 promoter downstream of the transcription start site, it is interesting to consider why transcription from these two promoters occurs from opposite strands.

Figure 2. The transcription start site for the adenovirus IVa2 promoter is located 210 nucleotides upstream from the start site of the adenovirus ML promoter. Transcription from these promoters occurs on opposite strands (direction of transcription is indicated by arrows). The IVa2 promoter contains a TATA-like sequence downstream from the IVa2 transcription start site in the same orientation as the TATA motif of the MLP. Although both promoters contain Inr elements homologous to that of the TdT (CTCA, -3 to +1), these elements are oriented in accordance with the directionality of transcription (A). TBP has been shown to bind both the canonical TATA motif of the MLP and the TATA-like element of the IVa2 promoter (4). Binding of TBP (and, consequently, TFIID) is critical for optimal transcription from both promoters (B). Although binding of TFIID has been shown to be the first step in complex formation on TATA-containing promoters (see Fig. 1), because the TATA motif of the IVa2 promoter is located downstream, this binding alone cannot direct functional complex formation from this promoter. Instead, it may serve to stabilize binding of RNAP II (pol), which together with TBP, TFIIB, and TFIIF has been shown to recognize the CTCA motif (C) (5).

⁵Underlined nucleotides indicate major transcription initiation sites.

Upon further inspection of the ML and IVa2 promoters, it was found that their Inr elements (GTCCTCACTCTCTTCCG, -6 to +11, and GTCTCAGAGTGGTCCG, -5 to +11, respectively), homologous to that of the TdT, are oriented in opposite directions (Fig. 2). It was proposed that RNAP II was able to recognize the Inr and bind to DNA in an orientation-dependent fashion, permitting transcription to proceed in a 5'-3' direction (4). To test this model, the ability of purified RNAP II (in the absence of the GTFs) to recognize and initiate transcription from the ML and IVa2 promoters was analyzed in vitro. Start sites used by purified RNAP II were mapped by primer extension analysis and by the S1 nuclease protection assay. RNAP II (in the absence of GTFs) was found to initiate transcription at many sites, although preferred sites were observed. One of the preferred initiation sites mapped to +1. More important, mutations in the conserved CTCA motif abolished initiation by purified RNAP II within the Inr. When TBP and highly purified preparations of the remaining GTFs were included in the reaction, transcription initiated specifically at +1 (5). Thus, the GTFs confer specificity to RNAP II. Mutations in the CTCA (-3 to +1) element of the IVa2 Inr resulted in almost undetectable levels of transcription at +1 (4). Insertion of a CTCA motif 20 nucleotides downstream from +1 resulted in a new transcription start site within the insertion (CTCA), as detected by in vitro primer extension analysis (J. Carcamo and D. Reinberg, unpublished observations). Further evidence that RNAP II and the GTFs are capable of recognizing the Inr was obtained by forming DNA-protein complexes on the Inr, followed by electrophoretic separation of DNA-protein complexes on native polyacrylamide gels [electrophoretic mobility shift assay (EMSA)]. Using EMSA, it was shown that highly purified RNAP II, with recombinant TBP and TFIIB, and highly purified TFIIF were able to form a stable complex on the Inr of the adenovirus ML and IVa2 promoters in the absence of a TATA motif (5). A complex could not be formed when any one of the four protein factors (RNAP II, TBP, TFIIB, or TFIIF) was omitted. The transcription experiments suggest that RNAP II is able to loosely recognize the TdT family Inr (CTCA, -3 to +1). The EMSA assay indicates that the GTFs stabilize binding of RNAP II via protein-protein interactions specifically to the Inr. Specificity in the EMSA was demonstrated through oligonucleotide competition experiments. This analysis demonstrated that a wild-type TdT-Inr, but not an Inr containing mutations in the CTCA motif, was capable of inhibiting the formation of a preinitiation complex. The involvement of TBP in complex formation on the Inr was also demonstrated using oligonucleotide competition experiments. It was found that a wild-type TATA motif, but not a mutated TATA motif, could compete the DNA-protein complex formed on the Inr (5). These studies, together with those of Pugh and Tjian (13, 16) using the TdT-Inr element, show that the same GTFs (including TBP) required for transcription from a TATA-containing promoter are also required for transcription from those TATA-less promoters that contain TdT family Inrs (5, 13, 16).
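The idea that an Inr oriented on the opposite strand directs transcription in the opposite direction can be illustrated by scanning one strand of a sequence for CTCA and for its reverse complement, TGAG. This is a minimal sketch under stated assumptions: the function names are introduced here, the test sequence is hypothetical (it is not the adenovirus sequence), and the A of CTCA is taken as +1, following the CTCA (-3 to +1) notation in the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_start_sites(top, motif="CTCA", plus_one_offset=3):
    """Locate `motif` on either strand of a duplex given its top strand.

    Returns (index, strand) tuples, where index is the 0-based top-strand
    position of the base pair at +1, and strand is 'top' (transcription would
    proceed left to right) or 'bottom' (right to left).
    """
    rc_motif = reverse_complement(motif)
    hits = []
    for i in range(len(top) - len(motif) + 1):
        window = top[i:i + len(motif)]
        if window == motif:
            hits.append((i + plus_one_offset, "top"))
        if window == rc_motif:
            # On the top strand the motif appears reversed, so +1 sits at the
            # mirrored offset within the window.
            hits.append((i + len(motif) - 1 - plus_one_offset, "bottom"))
    return hits


# Hypothetical sequence with one CTCA on each strand, oriented divergently.
EXAMPLE = "AATGAGCCCCCCTCAGG"

if __name__ == "__main__":
    print(find_start_sites(EXAMPLE))
```

A four-nucleotide motif occurs by chance far too often to predict start sites on its own, which fits the observation above that purified RNAP II alone initiates at many sites and needs the GTFs for specificity.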

Although it seemed apparent that the orientation of the Inr of the ML and IVa2 promoters is critical to the directionality of transcription, the question of whether the orientation of the Inr plays a universal role in directing transcription remained. To further explore the involvement of the Inr in determining directionality of transcription, a series of synthetic promoters containing the adenovirus TATA element, TdT-Inr, and multiple Sp1 binding sites in various combinations and orientations were analyzed for quantity and directionality of the transcript produced. These experiments showed that the TATA motif exerts a dominant effect over the Inr element with respect to directionality of transcription (18). This seems to contradict the observations by Carcamo et al. (5), where the Inr orients transcription from the IVa2 promoter. In this case, however, the spacing between the Inr and downstream TATA motif is different from those analyzed using synthetic promoters. Furthermore, when a binding site for a specific transcription factor and either an Inr element or TATA motif are both present, transcription always proceeds such that the specific transcription factor binding site is upstream from the transcriptional start site (18). The studies by Carcamo et al. (5) used a portion of the IVa2 promoter containing the downstream TATA element, but extending upstream to -204. It is now known that there are two upstream elements in the IVa2 promoter that may, in addition to the Inr element, direct transcription away from the upstream element. In vivo studies using a different upstream element, the octamer binding site, and TATA elements from a variety of promoters (including the adenovirus ML and the β-globin promoters) confirmed the importance of the upstream elements and the TATA motif in orienting transcription (19).

So far we have seen how various promoter elements play different roles in TATA-containing vs. non-TATA promoters. Two different models have been presented for assembly of the transcriptional machinery on these two types of promoters. In the case of a TATA-containing promoter, binding of TBP to the TATA motif provides the nucleation site for RNAP II and the remaining GTFs. In the absence of a TATA motif, RNAP II (which alone has a low affinity for the Inr) through interaction with TBP, TFIIB, and TFIIF recognizes the Inr and forms a stable, specific transcription-competent complex. An alternative model can also be postulated if an Inr-binding protein that can interact with a component of the general transcription machinery exists. Two such proteins have been identified for the MLP: cap binding factor (CBF) and TFII-I (20, 21).

The binding site of CBF was mapped to +1 to +23 on the MLP using DNase I protection analysis and EMSA (20). Mutations in the CBF binding site resulted in decreased levels of transcription in vivo and in vitro (20). Also, association of CBF with the downstream element appeared to stabilize binding of TFIID to the TATA motif (22). Renaturation experiments of a partially purified CBF fraction separated by PAGE-SDS suggest that CBF is a 90-kDa polypeptide (20). The exact role CBF plays in transcription is not well understood at this time.

TFII-I, a 120-kDa polypeptide, also binds to the MLP-Inr element; however, it also recognizes the TdT and human immunodeficiency virus-1 (HIV-1) Inr elements (21). The HIV-1 Inr (-3CTGGGTCTCT+7) is similar to that of the MLP (-3CTCACTCTCT+7), with the exception that alone it cannot function as a promoter (3). Surprisingly, TFII-I is immunologically related to the helix-loop-helix protein upstream stimulating factor (USF). Also, TFII-I can recognize the USF binding site on the MLP (-60CACGTGA-54) with greater affinity than USF. Similarly, USF can recognize the MLP-Inr with lower affinity than TFII-I. TFII-I bound to the USF site cannot be competed by an oligonucleotide containing the MLP-Inr site nor can TFII-I bound to the Inr be competed by an oligonucleotide containing the MLP-USF site, which suggests that TFII-I may contain multiple DNA binding domains (21). This factor appears to be distinct from CBF because of differences in molecular mass (120 vs. 90 kDa), binding sites (-1 to +7 vs. +1 to +23 on the MLP), and most important, affinity for the USF element (TFII-I recognizes this element, CBF does not) (20, 21, 22, 24). The possibility that proteolysis of TFII-I results in a 90-kDa protein (CBF) with a unique specificity remains.
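The binding-site coordinates compared above follow the usual promoter numbering, in which positions are counted relative to the start site (+1) and there is no position 0. The following Python sketch, offered only as an illustration under that assumed convention, computes how many nucleotides a footprint spans; the function name `span_length` is hypothetical and not part of any cited work.

```python
def span_length(start: int, end: int) -> int:
    """Number of nucleotides from `start` to `end`, inclusive.

    Assumes start-site-relative numbering with no position 0,
    so -1 is immediately followed by +1.
    """
    if start == 0 or end == 0:
        raise ValueError("position 0 does not exist in this numbering")
    if start > end:
        raise ValueError("start must not exceed end")
    length = end - start + 1
    # A span that crosses the start site would otherwise count the
    # nonexistent position 0, so remove it.
    if start < 0 < end:
        length -= 1
    return length


if __name__ == "__main__":
    # Footprints quoted in the text for the MLP.
    print("TFII-I site (-1 to +7):", span_length(-1, 7), "nt")
    print("CBF site (+1 to +23):", span_length(1, 23), "nt")
```

Under this convention the TFII-I site covers 8 nucleotides and the CBF site 23, which makes the difference in footprint size between the two factors explicit.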

TFII-I may mediate a complex formation pathway on the MLP alternative to that shown in Fig. 1, as it has been suggested that TFII-I is required for transcription from the MLP in the absence of TFIIA (20). This observation is puzzling because human TBP was used as the source of TFIID in these experiments, and it has been shown using highly purified factors that TFIIA is not required for basal transcription when TBP is used (23). Only when a reconstituted system consisting of factors purified to homogeneity or from exogenous sources such as Escherichia coli is available will such discrepancies be resolved. Perhaps an alternate preinitiation complex assembly pathway can exist where binding of TFII-I to the Inr provides the nucleation site for the general transcription machinery. The model proposing alternative pathways of complex formation has been challenged by the finding that the HIV-Inr, which binds TFII-I, is alone unable to direct transcription (3, 18). Perhaps the TFII-I binding site, together with other DNA elements such as the CTCA motif, present in the TdT family Inr, and the TATA motif can increase the formation of a transcription-competent complex. In support of this are findings demonstrating that TFII-I and TBP bind to the Inr and TATA elements in a cooperative manner (24).

The concept that one set of GTFs can participate in multiple complex formation pathways, first shown by EMSA experiments using the MLP and IVa2 Inr elements (see preceding text) (5), is further supported by mechanistic studies performed by Zenzie-Gregory et al. (25). Time course experiments had previously demonstrated that a lag period preceded transcription initiation from the adenovirus MLP (26). This was attributed to binding of TFIID to the TATA element. Similar time course experiments using a synthetic promoter containing the TdT-Inr in the presence or absence of a TATA element were performed using nuclear extract. Transcription from the TATA-less promoter proceeded linearly, without a lag period (25). The absence of a lag period may be attributed to the lack of TFIID binding to a TATA motif. Additional evidence that the initial steps of transcription preinitiation complex formation differ in TATA vs. non-TATA promoters was obtained by titrating nuclear extract on the same promoters already described. The amount of transcription from the TATA-containing promoter was directly proportional to the amount of extract used. This is in contrast to the TATA-less promoter, where a sharp increase in the amount of transcription was observed when a saturating concentration of nuclear extract was used (25). The TATA and TATA-less promoters behaved similarly in assays that analyze later stages in preinitiation complex formation, reinforcing the concept that although initial recognition steps in complex formation differ in these two types of promoters, formation of the complete preinitiation complexes is similar (25).

Another aspect of transcription that should be considered is how an open complex is formed on TATA-containing and TATA-less promoters. It is clear that on TATA-containing promoters, binding of TBP to the TATA element initiates the formation of a transcription-competent complex containing all the GTFs and RNAP II. Upon the addition of ATP (or dATP), this complex undergoes a transition to form an open complex. Hydrolysis of the β-γ bond of ATP appears to be critical, as transcription initiation fails to occur in the presence of nonhydrolyzable ATP analogs. The exact mechanism for conversion to an open complex is not clear, but it is most likely that certain GTFs are released, possibly after one (or a combination of GTFs) catalyzes the hydrolysis of ATP and the melting of the DNA double helix occurs, perhaps utilizing the energy released from the β-γ bond of ATP. Currently, the mechanism of complex formation on TATA-less promoters has not been examined. Exactly what role, if any, an Inr-binding protein plays during the ATP-dependent step is not clear. Because the Inr element must be melted for transcription initiation to occur, it will be interesting to determine when an Inr-binding protein is liberated from the transcription complex. Alternatively, the Inr-binding protein could remain bound to the nontranscribed DNA strand, similar to the RNAP III transcription factor TFIIIA (27). Also, the process by which transcription is reinitiated on TATA-less promoters is not known. Further experiments must be done to examine these phenomena.

A TdT-family Inr has also been identified in the downstream region (+447 to +457) of the eIF-2α promoter (28). This Inr is oriented in the opposite direction of transcription from eIF-2α. Consequently, it has been suggested that the function of this Inr is to regulate transcription from the eIF-2α promoter by producing antisense RNA that can form a duplex with the mRNA and inhibit translation. Although such an antisense message has not been detected in vivo, evidence supporting such a model exists. Mutations of the downstream Inr element increase in vivo levels of transcription from the eIF-2α promoter five- to eightfold (28). Antisense transcripts initiating within the downstream Inr have been mapped in vitro by primer extension analysis (28). In vivo, the wild-type downstream Inr (but not a mutant Inr) is able to direct transcription of the luciferase gene in transiently transfected cells, indicating that this Inr can direct formation of a transcription-competent complex (M. Noguchi, T. A. Silverman, and B. Safer, personal communication). A site immediately upstream (with respect to the antisense transcript) of the downstream Inr element (+457ACTTTGCTFTTTCCA+474) has been mapped and shown to bind a 45-kDa factor (IBP) (M. Noguchi, T. A. Silverman, and B. Safer, personal communication). The relationship between this protein and TFII-I and the related USF is unknown. Future mutational analysis of the IBP binding site and characterization of IBP will undoubtedly clarify this relationship.

The PBGD-Inr family

Like previously described Inr elements, the Inr element of the TATA-less promoter of the erythroid-specific human PBGD gene has been shown to be sufficient for accurate in vitro transcription initiation (6). This Inr element is composed of two domains: one from -1 to +1 (CA), which encompasses the initiation site, and a second that extends from +5 to +14 (TCC1tG’ITAC). This downstream element has some homology with the downstream region of the TdT-Inr (+2TTCTGGAGAC+11) (6). Although mutations in both elements have dramatic deleterious effects on transcription in vitro, only mutations in the -1/+1 element impair in vivo transcription. A more thorough examination of this domain reveals a sequence identical to that found to be essential in the TdT family Inr, -3CTCA+1. Hence, these results are in agreement with those seen for the MLP. Mutations in the -3CTCA+1 element from either the ML or PBGD promoters have drastic deleterious effects in vivo. The downstream element binds a ubiquitous nuclear factor. Even though this Inr is homologous to the TdT-Inr, the PBGD Inr-binding protein does not bind the TdT-Inr, implying that this protein is not TFII-I (6). At the present time, details concerning this protein are not known.

Although the PBGD promoter resembles many other TATA-less promoters, preliminary data using heat-inactivated nuclear extract suggest that this promoter does not require TFIID (6). Further experiments using either TBP-immunodepleted extract or a reconstituted system are necessary to confirm this observation. If this promoter does not require TBP, it would be very surprising because TBP is required for transcription, not only from promoters transcribed by RNAP II but also by those transcribed by RNAP I and RNAP III (29-32). Recent studies, however, have shown that the GTF TFIIE is not required for transcription from the IgH promoter, indicating that the general transcription factors perhaps exhibit some promoter specificity (33). This may be the case for the lack of a requirement for TFIID in transcription from the PBGD promoter.

The DHFR-Inr family

DHFR is an enzyme required for the de novo synthesis of glycine, purines, and thymidylate. Although it is expressed constitutively at low levels in all cells at all stages of development, the gene is cell cycle regulated and a transient increase in transcription from the DHFR promoter has been observed at the G1/S boundary (ref 7 and references therein). Similar to many housekeeping genes, the DHFR promoter is TATA-less, having instead four 48-base pair repeats. Each of these contains a GC element that binds Sp1. Although downstream elements are required for optimal transcription, deletion analysis of the murine DHFR promoter has shown that the minimal promoter elements required for in vitro transcription reside within nucleotides -65 to +15 (ref 7 and references therein). Two protein binding sites have been identified in this region. One of these, located upstream from the initiation site, binds Sp1. The other site is the Inr element (-11ATTTCGCGCCAAACTT+5). Mutations in either of these elements affect specific transcription initiation (7, 34). The Inr appears to play a dominant role in positioning transcription initiation, as 10-, 14-, and 28-base pair insertions between the Sp1 and Inr elements result in transcription initiation at the expected major and minor start sites. Thus, transcription initiates within the Inr element regardless of the distance between the Inr and upstream Sp1 binding sites. The usage of these sites, however, differed from that observed in the wild-type promoter because the major transcription initiation site became the minor initiation site and the minor initiation site became the major one (7). This observation suggests that the upstream Sp1 sites participate in positioning the preinitiation complex.

The first Inr binding proteins identified recognize the DHFR-Inr element. Two laboratories independently identified DHFR-Inr element binding proteins. In 1989, Blake and Azizkhan (35) showed that a 54-kDa cellular transcription factor, E2F, originally identified for its role in activating transcription from the adenovirus E2 promoter (36), was able to bind sequences encompassing the hamster DHFR initiation site (+2TTCG’CGCCAAA+13). Mutations that abolished E2F binding also decreased DHFR promoter transcription in vivo and in vitro (35). In 1990, Means and Farnham (7) identified another factor, called housekeeping initiator protein-1 (HIP-1), which bound to the murine DHFR-Inr (-9TTCGCGCCA-1). It now appears that HIP-1 and E2F are the same protein (P. Farnham, personal communication), although they differ from TFII-I and any Inr binding protein mentioned previously. Other housekeeping promoters with Inr elements similar to the E2F binding site include hypoxanthine phosphoribosyltransferase, KiRas, 3-phosphoglycerate kinase, osteonectin, interferon regulatory factor 1, and SURF-1 (ref 7 and references therein). A homologous element is also located at the Inr of the SV40 major late promoter (SV40-MLP) (37). Mutations in the SV40-MLP-Inr result in decreased levels of in vitro transcription and a change in the transcription start site (37).

In addition to properly positioning transcription initiation, the DHFR-Inr appears to be required for increased levels of transcription from this promoter that occur at the G1/S boundary (ref 38 and references therein). A DHFR promoter lacking the E2F binding site failed to exhibit a marked increase in transcription at the G1/S boundary (38). It has been shown that E2F associates with and dissociates from the DHFR promoter rapidly (less than 15 s). This could be one mechanism by which the activity of E2F responds to changes at the G1/S boundary (38). E2F has been shown to associate with factors known to be involved in cell growth, such as cyclin A and the retinoblastoma gene product (ref 38 and references therein). Perhaps dissociation of E2F from these factors enables E2F to associate with other proteins, allowing it to activate transcription from the DHFR promoter. Association of E2F with an adenovirus-encoded, 19-kDa polypeptide is required for E2F activation of the adenovirus E2 promoter (ref 39 and references therein). Consequently, this may represent a general mechanism of E2F action. It is possible that E2F association with a cell cycle-specific factor permits stable binding of the protein to the Inr. Further investigation of the regulation of E2F will elucidate the exact mechanism by which it responds to changes that occur in the cell cycle.

For transcription to occur from the DHFR promoter, the large subunit of RNAP II must contain the COOH-terminal domain (CTD) (40). The binding of E2F to the DHFR-Inr has also been implicated in explaining the requirement the DHFR promoter exhibits for the CTD (41). Human RNAP II contains at least 10 subunits, the largest having an unusual CTD consisting of a heptapeptide repeat (YSPTSPS) not found in RNAP I, RNAP III, or bacterial RNA polymerase (for review, see refs 42 and 43). The length of the repeat seems to be related to the complexity of the organism, ranging from 26 or 27 copies in Saccharomyces cerevisiae to 52 copies in mice and humans (42, 43). Three forms of RNAP II have been identified, differing only in the state of the CTD (ref 1 and references therein). RNAP IIA (210 kDa) contains a nonphosphorylated CTD; RNAP IIO (240 kDa) contains an extensively phosphorylated CTD; and RNAP IIB (180 kDa) lacks all or most of the CTD due to proteolysis and is believed to be an in vitro artifact of purification (42). The IIA form preferentially associates with the transcription preinitiation complex and is subsequently converted into the IIO form by a cellular protein kinase (refs 1, 44, 45, and references therein). The IIO form of the polymerase is primarily responsible for elongation. Many kinases capable of phosphorylating the CTD have been identified, yet it is unclear which are important for CTD phosphorylation in vivo. Probably one of the most interesting kinases isolated is the kinase activity inherent in transcription factor TFIIH. This kinase is active only when it is part of the preinitiation complex and when promoter sequences are present (46).
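To make the repeat arithmetic above concrete, the following Python sketch builds an idealized CTD from the consensus heptapeptide and reports its length for the copy numbers quoted in the text. It is a simplification that assumes every repeat matches the consensus YSPTSPS exactly, which is not claimed by the source; the function names `idealized_ctd` and `count_repeats` are hypothetical.

```python
# Consensus heptapeptide of the RNAP II large-subunit CTD, as given in the text.
HEPTAD = "YSPTSPS"


def idealized_ctd(copies: int) -> str:
    """Return a CTD made of `copies` perfect consensus repeats.

    Assumption for illustration only: real CTDs may contain repeats
    that deviate from the consensus.
    """
    if copies < 0:
        raise ValueError("copies must be non-negative")
    return HEPTAD * copies


def count_repeats(sequence: str) -> int:
    """Count non-overlapping consensus heptads in a protein sequence."""
    return sequence.count(HEPTAD)


if __name__ == "__main__":
    for organism, copies in [("S. cerevisiae", 26), ("mouse/human", 52)]:
        ctd = idealized_ctd(copies)
        print(f"{organism}: {count_repeats(ctd)} repeats, {len(ctd)} residues")
```

With these inputs an idealized yeast CTD is 182 residues long and an idealized mammalian CTD is 364 residues long.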

Although deleting more than half of the heptapeptide repeats in mouse, Drosophila, and S. cerevisiae is lethal, studies in vitro have indicated that the CTD is not required for all promoters. Using the actin 5C promoter, Zehring et al. (47) show that RNAP II treated with chymotrypsin to remove the CTD can accurately initiate transcription in vitro. Similar results were obtained with the adenovirus MLP. In addition, the REP, interferon regulatory factor 1, cytomegalovirus major immediate-early, c-myc, and β-actin promoters have been shown to be CTD-independent (ref 38 and references therein). Other promoters, such as those of the DHFR, RAF-1, carbamoyl phosphate synthetase-aspartate transcarbamylase-dihydroorotase (CAD), and histone H2B genes, have been shown to require the CTD for transcription (ref 41 and references therein).

Not all CTD-independent promoters contain TATA elements and not all CTD-dependent promoters are TATA-less. Nuclear extract, depleted of endogenous RNAP II activity using monoclonal antibodies that recognize the CTD, is transcriptionally inactive. RNAP IIB cannot restore transcriptional activity when the template used is the DHFR promoter, but it is able to restore activity when the template contains the TATA-less REP or the TATA-containing adenovirus ML promoters (ref 41 and references therein). Deletion analysis of the DHFR and REP promoters suggests that minimal promoter sequences, consisting only of one Sp1 binding site and the Inr element, confer the requirement for the CTD (41). Because these promoters seem to differ only in the Inr elements, it is proposed that promoters with elements similar to the DHFR-Inr require the CTD. Further observations show that E2F binding sites are present in other CTD-dependent promoters, although at least one CTD-dependent promoter, the CAD promoter, does not bind E2F (41). In light of recent evidence that the CTD can directly interact with TBP (48) and that TBP is required for transcription from most TATA-less promoters (5, 13, 15), a model for how TBP enters the preinitiation complex on CTD-dependent TATA-less promoters has been suggested (48). It has already been shown that RNAP II has a low affinity for Inr elements similar to the adenovirus MLP-Inr (5). Perhaps in these cases, when TBP cannot enter the complex through the TATA motif, TBP joins the complex through an interaction with the polymerase. Another possibility is that an Inr binding protein may anchor the transcription complex. If this protein can interact with TFIID, the CTD would not be required for transcription. Alternatively, if this protein is unable to interact with TFIID, the CTD would be required to recruit TFIID to the complex. This may be the case for E2F, although it has not yet been shown whether or not this protein can interact with TFIID.

The ribosomal protein Inr family

Like the TdT and DHFR promoters, mammalian ribosomal protein (rp) promoters initiate transcription specifically in the absence of a TATA box (49). The Inr elements of the rp promoters contain a unique polypyrimidine tract. Mutations in the Inr of one of the rp promoters, rpS16 (.+CTTCCCTTTICC+s), illustrated the importance of this Inr in transcription (8). Although one to three purine substitutions in the polypyrimidine tract increased transcription two- to sevenfold in vitro, in transient transfection experiments transcription was not increased. In vivo, however, the start site was altered (8, 50). This was surprising in the case of a single purine substitution at +1 because this Inr element resembles a consensus Inr found in many TATA-containing promoters (51). When a consensus TATA motif was placed at -30, transcription in vitro increased 10-fold, but wild-type levels were observed in vivo (8, 50). A mutation that made the rpS16 promoter resemble a typical TATA-containing promoter (TATA at -30, and -1CA+1) behaved like a strong promoter in vitro, but again initiated at a pyrimidine and was transcribed at wild-type rpS16 levels in vivo (8, 50). These differences observed in vitro and in vivo are reminiscent of the previously mentioned observations by Carcamo et al. (5) regarding mutations in the TATA motif of the MLP: mutations in the TATA motif directed transcription that initiated at +1 in vitro and in vivo (in addition to a cluster of sites surrounding +1); however, levels of transcription were much lower in vitro than in vivo.

Although the rpS16 promoter is TATA-less, a protein binding element has been identified near -30 (-31TGAAAAA-25) (8). The factor recognizing this element appears to be different from TFIID because an oligonucleotide containing the TATA motif is unable to inhibit binding of this factor to the rpS16 -30 element (8). Three other elements are required for transcription from this rp promoter, including one Sp1 binding site (50).

Transcription initiation from two additional rp promoters, rpL30 and rpL32, is currently under investigation. These appear to be transcribed in vivo with efficiencies equal to that of the rpS16 promoter (49). These promoters have polypyrimidine Inr elements similar, but not identical, to that of the rpS16 promoter. Four different protein binding sites have been identified (α, β, γ, δ in rpL30 and β, γ, and two δ in rpL32). Mutations in these regions decrease transcription efficiency from these promoters in vivo (49).

The cDNA encoding the δ protein has been cloned from a mouse expression library (52). This cDNA was also isolated in a search for a murine protein that binds to the upstream conserved region of the Moloney murine leukemia virus (MMLV) LTR (upstream conserved region binding protein, UCRBP) (53). Two laboratories independently isolated cDNAs encoding the human homolog while studying proteins that bind to the adeno-associated virus p5 promoter (YY1) and the immunoglobulin light-chain 3′ enhancer (NF-E1) (9, 54). The protein was given different names by the groups and has also been shown to be the same as CF-1, which binds the c-myc (-260 region: GCGCGCGAGAAGAGAAAKIGGT) and skeletal actin (-90 region: CACCCAAATATGGCGA) promoters (55). The protein has also been shown to bind within a repression element in the Epstein-Barr virus BZLF1 promoter as well as to the immunoglobulin heavy-chain enhancer μE1 site (9, 54). For the purposes of this review, this protein (δ, NF-E1, CF-1, UCRBP, YY1) will be referred to as YY1.

The murine and human sequences are highly conserved at the levels of both nucleic acid sequence and amino acid sequence. The cDNA encodes a protein of 414 amino acids with a predicted molecular weight of approximately 45 kDa. On PAGE-SDS, it migrates with an apparent molecular weight of approximately 60-68 kDa. YY1 contains several interesting motifs: four zinc fingers (C2H2) in the COOH-terminal portion; a highly acidic NH2-terminal domain (40% of the first 53 amino acids are acidic, including 11 consecutive acidic residues between amino acids 43 and 53); 11 consecutive histidine residues between amino acids 70 and 80; and an alanine/glycine-rich region (58%) between amino acids 154 and 198 (9, 52, 53, 54). This structure is interesting because it contains domains that have been implicated in activation (such as the acidic region) and repression (the contiguous histidine domain and the glycine/alanine domain).
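The predicted mass quoted above can be checked with a back-of-the-envelope calculation. The Python sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that does not come from the source; the constant and function names are hypothetical, and an exact value would require the actual amino acid composition.

```python
# Assumed rule-of-thumb average mass per amino acid residue (not from the source).
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimated_mass_kda(n_residues: int,
                       residue_mass_da: float = AVERAGE_RESIDUE_MASS_DA) -> float:
    """Rough protein mass in kDa from residue count alone."""
    if n_residues < 0:
        raise ValueError("n_residues must be non-negative")
    return n_residues * residue_mass_da / 1000.0


if __name__ == "__main__":
    predicted = estimated_mass_kda(414)  # YY1 length given in the text
    print(f"Estimated mass for 414 residues: {predicted:.1f} kDa")
    # Apparent masses on PAGE-SDS reported in the text, for comparison.
    for apparent in (60.0, 68.0):
        print(f"apparent {apparent:.0f} kDa is {apparent / predicted:.2f}x the estimate")
```

For 414 residues the estimate is about 45.5 kDa, consistent with the predicted value of approximately 45 kDa and clearly below the apparent 60-68 kDa observed on PAGE-SDS.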

YY1 has been shown to act as both an activator and repressor (hence the name "yin and yang 1"), depending on the promoter and the intracellular environment (9, 52-55). In the case of the immunoglobulin light-chain enhancer (κE3′) and the MMLV LTR, it appears to act as a repressor because promoters with mutations in the binding site exhibit increased levels of transcription (54). Different effects are observed when the YY1 binding sites in the rpL32 promoter are mutated. Mutations in either binding site of this promoter result in decreased promoter activity (52). Further evidence that this protein can activate transcription was obtained by Riggs et al. (55). When four copies of the YY1 binding site were placed upstream from the herpes simplex virus thymidine kinase promoter, transcription was stimulated 2.5- to 7-fold in vivo depending on the type of cell analyzed. Perhaps these seemingly contradictory results may be explained by observations obtained by using the adeno-associated virus p5 promoter and the adenovirus E1A gene product that stimulates transcription from both viral and cellular genes. Binding of YY1 to an upstream element on the p5 promoter repressed transcription. In the presence of both YY1 and E1A, however, this repression could be overcome, and activated levels of transcription were observed (9). Because the p5 promoter contains a second YY1 binding site encompassing the Inr, this promoter will be placed in a distinct Inr family.

The adeno-associated virus p5-Inr family

Two YY1 binding sites have been identified on the p5 promoter: one located between -50 and -70 (p5-60; CGACATTTT) and the other at the p5-Inr element (p5+1; CTCCATTTT) (9). Additionally, a USF site is found upstream of the p5-60 element and a TATA motif is present at -30. The p5 TATA element is unable to direct transcription in the absence of the p5-Inr element in vitro. When either of the YY1 binding sites is placed upstream of a promoter containing the adenovirus MLP TATA motif and the TdT-Inr, transcription is repressed (9). The adenovirus E1A gene gives rise to two transcripts, 12S and 13S, as a result of alternate splicing. Either the 12S or 13S E1A gene product is able to relieve YY1 repression and the 13S gene product is able to activate transcription. A fusion protein that contains the yeast activator GAL4 DNA binding domain fused to the full-length YY1 cDNA clone was also able to repress a promoter containing GAL4 upstream elements in vivo (9). In the presence of E1A, repression was relieved and activation was observed (9). A similar fusion protein containing the GAL4 DNA binding domain fused to a YY1 derivative lacking 83 amino acids from the COOH terminus was unable to repress transcription from the same promoter. Thus the last 83 amino acids, which include part of the zinc finger domain, are important for repression. Deletion mutagenesis of the murine homolog has shown that the zinc finger domains are important for DNA binding (52). Unfortunately, the effect of these domains with respect to activation, and specifically, E1A-mediated activation, has not yet been addressed.

The p5+1 element has been shown to act as an Inr element. When sequences from -12 to +11 are present in the absence of upstream elements, specific transcription is observed in vitro (56). This element is activated by the presence of either a TATA element or multiple Sp1 binding sites. HeLa cell nuclear extract depleted of YY1 by passage through an oligonucleotide affinity column, or Drosophila embryo extracts, which lack endogenous YY1, were used to assess the transcriptional function of YY1 at the p5+1 element in vitro. Although these two systems were unable to transcribe the p5+1 minimal promoter, addition of YY1 purified from HeLa cells restored transcription (56). In the presence of anti-YY1 antibodies, transcription could not be restored by the addition of YY1 purified from HeLa cells. Recombinant YY1 protein purified from E. coli is reported to be transcriptionally inactive (56).

A homologous Inr element has been found in the TATA-less promoter of the human DNA polymerase β (β-pol) gene (L. Weis, J. Perez, and D. Reinberg, unpublished observations). This promoter contains two contiguous distal Sp1 binding sites (centered at -60 and -70), one proximal Sp1 binding element (centered at -20), and an inverted decanucleotide repeat (centered at -41), with homology to those recognized by factors of the ATF family of proteins (reviewed in ref 57). Mutational analysis has identified two elements required for basal levels of transcription from this promoter. One, encompassing the transcriptional start site (...2CCAT1GTT+6), has been shown to bind a cellular factor called ITF (initiator transcription factor). Mutations in this element that abolish ITF binding also exhibit lower transcriptional activity in vitro and in vivo. Although this element is similar to the p5-Inr, the relationship between ITF and YY1 is not clear because ITF has not been purified to homogeneity. The second element, located downstream (+25CTGGGTTGC+33), appears to bind leader binding protein 1 (LBP-1) (L. Weis, J. Perez, and D. Reinberg, unpublished observations). LBP-1 was originally identified as a cellular protein that was able to bind to the HIV-1 promoter at a high-affinity site (site I: -16 to +27) and a low-affinity site (site II: -38 to -17). Binding of LBP-1 to the high-affinity site apparently has a positive effect on transcription, as mutations in site I that abolish binding also decrease transcription in vivo (58). On the other hand, binding of LBP-1 to site II appears to inhibit transcription by preventing TFIID from binding to the TATA element (59). On the β-pol promoter, LBP-1 appears to have a positive effect on transcription because mutations in this element result in decreased transcriptional levels. In vitro, the presence of a TATA motif at -30 in the β-pol promoter can negate the deleterious effect of mutations in either the Inr or LBP-1 elements, indicating that these elements function in basal transcription of the β-pol promoter. The placement of a TATA motif at -30 on a construct with mutations in both the β-pol Inr and LBP-1 site restored wild-type levels of transcription in vitro but was without effect in vivo. Another factor that appears to be required for basal transcription from this promoter is TBP, as nuclear extract immunodepleted of TBP failed to transcribe the β-pol promoter. Transcription was restored by the addition of yTBP. It has been proposed that binding of ITF and/or LBP-1 to this promoter provides a means for TBP and the rest of the basal transcription machinery to form a transcription-competent complex (L. Weis, J. Perez, and D. Reinberg, unpublished observations).

Our understanding of the mechanism of initiation on TATA-less promoters is limited. It is clear that in these, as well as in TATA-containing promoters, the Inr plays a dominant role in vivo. Also, TFIID is an integral component of the transcription machinery from almost all TATA-less promoters studied. Although the binding of TFIID to the TATA motif provides a nucleation site for transcription complex formation on TATA-containing promoters, this is not the case for TATA-less promoters. In TATA-less promoters, the Inr element plays an essential role in complex formation. Recognition of the Inr by an Inr binding protein (ITF) provides a means by which a transcription-competent complex can assemble. The pathway by which such a complex is formed is dependent on the ITF. Some ITFs may facilitate complex formation via interaction with TBP (e.g., TFII-I and CBF) or one of the GTFs, whereas others may function by interacting with the CTD of RNAP II. Only by delineating each step in the formation of a transcription complex in each Inr element family will a true understanding of what factors are involved and how they interact to specifically initiate transcription be attained.

We wish to thank our colleagues in the transcription field for providing information prior to publication. We extend sincere apologies to those who, due to the enormous amount of material that had to be condensed, we failed to acknowledge. We thank Drs. Fred Mermelstein, Leigh Zawel, and Edio Maldonado for comments on the manuscript. L. W. is supported by a fellowship from the New Jersey Commission on Cancer. D. R. is supported by NIH (grants GM37120 and GM47484) and the American Cancer Society (grant NP66724) and was the recipient of an American Cancer Society Faculty Research Award.

REFERENCES

1. Zawel, L., Reinberg, D. (1992) Initiation of transcription by RNA polymerase II: a multi-step process. Prog. Nucleic Acids Res. Mol. Biol. In press.

2. Dynan, W. S. (1983) Promoters for housekeeping genes. Trends Genet. 2, 196-197

3. Smale, S. T., Baltimore, D. (1989) The initiator as a transcriptional control element. Cell 57, 103-113

4. Carcamo, J., Maldonado, E., Ahn, M., Ha, I., Kasai, Y., Flint, J., Reinberg, D. (1990) A TATA-like sequence located downstream of the transcription initiation site is required for expression of an RNA polymerase II transcribed gene. Genes & Dev. 4, 1611-1622

5. Carcamo, J., Buckbinder, L., Reinberg, D. (1991) The initiator directs the assembly of a transcription factor IID-dependent transcription complex. Proc. Natl. Acad. Sci. USA 88, 8052-8056

6. Beaupain, D., Eleouet, J. F., Romeo, P. H. (1990) Initiation of transcription of the erythroid promoter of the porphobilinogen deaminase gene is regulated by a cis-acting sequence around the cap site. Nucleic Acids Res. 18, 6509-6515

7. Means, A., Farnham, P. (1990) Transcription initiation from the dihydrofolate reductase promoter is positioned by HIP1 binding at the initiation site. Mol. Cell. Biol. 10, 653-661

8. Hariharan, N., Perry, R. P. (1990) Functional dissection of a mouse ribosomal protein promoter: significance of the polypyrimidine initiator and an element in the TATA-box region. Proc. Natl. Acad. Sci. USA 87, 1526-1530

9. Shi, Y., Seto, E., Chang, L. S., Shenk, T. (1991) Transcriptional repression by YY1, a human GLI-Kruppel-related protein, and relief of repression by adenovirus E1A protein. Cell 67, 377-388

10. Buratowski, S., Hahn, S., Guarente, L., Sharp, P. A. (1989) Five intermediate complexes in transcription initiation by RNA polymerase II. Cell 56, 549-561

11. Maldonado, E., Ha, I., Cortes, P., Weis, L., Reinberg, D. (1990) Factors involved in specific transcription by mammalian RNA polymerase II: role of transcription factors IIA, IID, and IIB during formation of a transcription-competent complex. Mol. Cell. Biol. 10, 6335-6347

12. Smale, S. T., Schmidt, M. C., Berk, A. J., Baltimore, D. (1990) Transcriptional activation by Sp1 as directed through TATA or initiator: specific requirement for mammalian transcription factor IID. Proc. Natl. Acad. Sci. USA 87, 4509-4513

13. Pugh, B. F., Tjian, R. (1990) Mechanism of transcription activation by Sp1: evidence for coactivators. Cell 61, 1187-1197

14. Nakajima, N., Horikoshi, M., Roeder, R. G. (1988) Factors involved in specific transcription by mammalian RNA polymerase II: purification, genetic specificity, and TATA box-promoter interactions of TFIID. Mol. Cell. Biol. 8, 4028-4040

15. Pugh, B. F., Tjian, R. (1992) Diverse transcriptional functions of the multisubunit eukaryotic TFIID complex. J. Biol. Chem. 267, 679-682

16. Pugh, B. F., Tjian, R. (1991) Transcription from a TATA-less promoter requires a multisubunit TFIID complex. Genes & Dev. 5, 1935-1945

17. Colgan, J., Manley, J. L. (1992) TFIID can be rate limiting in vivo for TATA-containing, but not TATA-lacking, RNA polymerase II promoters. Genes & Dev. 6, 304-315

18. O'Shea-Greenfield, A., Smale, S. T. (1992) Roles of TATA and initiator elements in determining the start site location and direction of RNA polymerase II transcription. J. Biol. Chem. 267, 1391-1402

19. Licen, X., Thali, M., Schaffner, W. (1991) Upstream box/TATA box order is the major determinant of the direction of transcription. Nucleic Acids Res. 19, 6699-6704

20. Garfinkel, S., Thompson, J. A., Jacob, W. F., Cohen, R., Safer, B. (1990) Identification and characterization of an adenovirus 2 major late promoter CAP sequence DNA-binding protein. J. Biol. Chem. 265, 10309-10319

21. Roy, A. L., Meisterernst, M., Pognonec, P., Roeder, R. G. (1991) Cooperative interaction of an initiator-binding transcription initiation factor and the helix-loop-helix activator USF. Nature (London) 354, 245-248

22. Safer, B., Reinberg, D., Jacob, W. F., Maldonado, E., Carcamo, J., Garfinkel, S., Cohen, R. (1991) Interaction of CAP sequence site binding factor and transcription factor IID preceding and following binding to the adenovirus major late promoter. J. Biol. Chem. 266, 10989-10994

23. Cortes, P., Flores, O., Reinberg, D. (1992) Factors involved in specific transcription by mammalian RNA polymerase II: purification and analysis of transcription factor IIA and identification of transcription factor IIJ. Mol. Cell. Biol. 12, 413-421

24. Roeder, R. G. (1991) The complexities of eukaryotic transcription initiation: regulation of preinitiation complex. Trends Biochem. Sci. 16, 402-408

25. Zenzie-Gregory, B., O'Shea-Greenfield, A., Smale, S. T. (1992) Similar mechanisms for transcription initiation mediated through a TATA box or an initiator element. J. Biol. Chem. 267, 2823-2830

26. Reinberg, D., Horikoshi, M., Roeder, R. G. (1987) Factors involved in specific transcription in mammalian RNA polymerase II: functional analysis of initiation factors IIA and IID and identification of a new factor operating at sequences downstream of the initiation site. J. Biol. Chem. 262, 3322-3330

27. Lassar, A. B., Martin, P. L., Roeder, R. G. (1983) Transcription of class III genes: formation of preinitiation complexes. Science 222, 740-748

28. Silverman, T. A., Noguchi, M., Safer, B. (1992) Role of sequences within the first intron in the regulation of expression of eukaryotic initiation factor 2α. J. Biol. Chem. 267, 9738-9742

29. Comai, L., Tanese, N., Tjian, R. (1992) The TATA-binding protein and associated factors are integral components of the RNA polymerase I transcription factor, SL1. Cell 68, 965-976

30. Cormack, B. P., Struhl, K. (1992) The TATA-binding protein is required for transcription by all three nuclear RNA polymerases in yeast cells. Cell 69, 685-696

31. Schultz, M. C., Reeder, R. H., Hahn, S. (1992) Variants of the TATA-binding protein can distinguish subsets of RNA polymerase I, II, and III promoters. Cell 69, 697-702

32. White, R. J., Jackson, S. P., Rigby, P. W. J. (1992) A role for the TATA-box-binding protein component of the transcription factor IID complex as a general RNA polymerase III transcription factor. Proc. Natl. Acad. Sci. USA 89, 1949-1953

33. Parvin, J. D., Timmers, H. T. M., Sharp, P. (1992) Promoter specificity of basal transcription factors. Cell 68, 1135-1144

34. Blake, M. C., Jambou, R. C., Swick, A. G., Kahn, J. W., Azizkhan, J. C. (1990) Transcriptional initiation is controlled by upstream GC-box interactions in a TATAA-less promoter. Mol. Cell. Biol. 10, 6632-6641

35. Blake, M. C., Azizkhan, J. C. (1989) Transcription factor E2F is required for efficient expression of the hamster dihydrofolate reductase gene in vitro and in vivo. Mol. Cell. Biol. 9, 4994-5002

36. Kovesdi, I., Reichel, R., Nevins, J. (1986) Identification of a cellular transcription factor involved in E1A trans-activation. Cell 45, 219-228

37. Ayer, D. E., Dynan, W. S. (1988) Simian virus 40 major late promoter: a novel tripartite structure that includes intragenic sequences. Mol. Cell. Biol. 8, 2021-2033

38. Means, A., Slansky, J. E., McMahon, S. L., Knuth, M. W., Farnham, P. J. (1992) The HIP1 binding site is required for growth regulation of the dihydrofolate reductase gene promoter. Mol. Cell. Biol. 12, 1054-1063

39. Raychaudhuri, P., Bagchi, S., Neill, S., Nevins, J. R. (1990) Activation of the E2F transcription factor in adenovirus-infected cells involves an E1A-dependent stimulation of DNA binding activity and induction of cooperative binding mediated by an E4 gene product. J. Virol. 64, 2702-2710

40. Thompson, N. E., Steinberg, T. H., Aronson, D. B., Burgess, R. R. (1989) Inhibition of in vivo and in vitro transcription by monoclonal antibodies prepared against wheat germ RNA polymerase II that react with the heptapeptide repeat of eukaryotic RNA polymerase II. J. Biol. Chem. 264, 11511-11520

41. Buermeyer, A. B., Thompson, N. E., Strasheim, L. A., Burgess, R. R., Farnham, P. J. (1992) The HIP1 initiator element plays a role in determining the in vitro requirement of the dihydrofolate reductase gene promoter for the C-terminal domain of RNA polymerase II. Mol. Cell. Biol. 12, 2250-2259

42. Corden, J. L. (1990) Tails of RNA polymerase II. Trends Biochem. Sci. 15, 383-387

43. Young, R. A. (1991) RNA polymerase II. Annu. Rev. Biochem. 60, 689-715

44. Lu, H., Flores, O., Weinmann, R., Reinberg, D. (1991) The non-phosphorylated form of RNA polymerase II preferentially associates with the preinitiation complex. Proc. Natl. Acad. Sci. USA 88, 10004-10008

45. Laybourn, P. J., Dahmus, M. E. (1990) Phosphorylation of RNA polymerase IIA occurs subsequent to interaction with the promoter and before the initiation of transcription. J. Biol. Chem. 265, 13165-13173

46. Lu, H., Zawel, L., Fisher, L., Egly, J.-M., Reinberg, D. (1992) Human general transcription factor IIH phosphorylates the CTD-tail of RNA polymerase II. Nature (London) 358, 641-645

47. Zehring, W. A., Lee, J. M., Weeks, J. R., Jokerst, R. S., Greenleaf, A. L. (1988) The C-terminal repeat domain of RNA polymerase II largest subunit is essential in vivo but is not required for accurate transcription initiation in vitro. Proc. Natl. Acad. Sci. USA 85, 3698-3702

48. Usheva, A., Maldonado, E., Goldring, A., Lu, H., Houbavi, C., Reinberg, D., Aloni, Y. (1992) Specific interaction between the non-phosphorylated form of RNA polymerase II and the TATA-binding protein. Cell 69, 1-20

49. Hariharan, N., Kelley, D., Perry, R. P. (1989) Equipotent mouse ribosomal protein promoters have a similar architecture that includes internal sequence elements. Genes & Dev. 3, 1789-1800

50. Chung, S., Perry, R. P. (1991) Cell-free transcription of a mouse ribosomal-protein encoding gene: the effects of promoter mutations. Gene 100, 173-180

51. Breathnach, R., Chambon, P. (1981) Organization and expression of eukaryotic split genes coding for proteins. Annu. Rev. Biochem. 50, 349-403

52. Hariharan, N., Kelley, D. E., Perry, R. P. (1991) δ, a transcription factor that binds to downstream elements in several polymerase II promoters, is a functionally versatile zinc finger protein. Proc. Natl. Acad. Sci. USA 88, 9799-9803

53. Flanagan, J. R., Becker, K. G., Ennist, K. L., Gleason, S. L., Driggers, P. H., Levi, B. Z., Appella, E., Ozato, K. (1992) Cloning of a negative transcription factor that binds to the upstream conserved region of Moloney murine leukemia virus. Mol. Cell. Biol. 12, 38-44

54. Park, K., Atchison, M. L. (1991) Isolation of a candidate repressor/activator, NF-E1 (YY1, δ), that binds to the immunoglobulin κ 3' enhancer and the immunoglobulin heavy-chain μE1 site. Proc. Natl. Acad. Sci. USA 88, 9804-9808

55. Riggs, K. J., Merrell, K. T., Wilson, G., Calame, K. (1991) Common factor 1 is a transcriptional activator which binds in the c-myc promoter, the skeletal α-actin promoter, and the immunoglobulin heavy-chain enhancer. Mol. Cell. Biol. 11, 1765-1769

56. Seto, E., Shi, Y., Shenk, T. (1991) YY1 is an initiator sequence-binding protein that directs and activates transcription in vitro. Nature (London) 354, 241-245

57. Wilson, S. H. (1989) Gene regulation and structure-function studies of mammalian β-polymerase. In The Eukaryotic Nucleus: Molecular Structure and Macromolecular Assemblies (Strauss, P., and Wilson, S., eds) pp. 198-233, The Telford Press, Caldwell, NJ

58. Jones, K. A., Luciw, P. A., Duchange, N. (1988) Structural arrangements of transcription control domains within the 5'-untranslated leader regions of the HIV-1 and HIV-2 promoters. Genes & Dev. 2, 1101-1114

59. Kato, H., Horikoshi, M., Roeder, R. G. (1991) Repression of HIV-1 transcription by a cellular protein. Science 251, 1476-1479

60. Peterson, M. G., Tanese, N., Pugh, B. F., Tjian, R. (1990) Functional domains and upstream activation properties of cloned human TATA binding protein. Science 248, 1625-1630

61. Hoffman, A., Sinn, E., Yamamoto, T., Wang, J., Roy, A., Horikoshi, M., Roeder, R. G. (1990) Highly conserved core domain and unique N terminus with presumptive regulatory motifs in a human TATA factor (TFIID). Nature (London) 346, 387-390

62. Kao, C. C., Lieberman, P. M., Schmidt, M. C., Zhou, Q., Pei, R., Berk, A. J. (1990) Cloning of a transcriptionally active human TATA binding factor. Science 248, 1646-1650

63. Ha, I., Lane, W. S., Reinberg, D. (1991) Cloning of a human gene encoding the general transcription initiation factor IIB. Nature (London) 352, 689-695

64. Malik, S., Hisatake, K., Sumimoto, H., Roeder, R. G. (1991) Sequence of general transcription factor TFIIB and relationships to other initiation factors. Proc. Natl. Acad. Sci. USA 88, 9553-9557

65. Flores, O., Ha, I., Reinberg, D. (1990) Factors involved in specific transcription initiation by mammalian RNA polymerase II: purification and subunit composition of transcription factor IIF. J. Biol. Chem. 265, 5629-5634

66. Inostroza, J., Flores, O., Reinberg, D. (1991) Factors involved in specific transcription by mammalian RNA polymerase II: purification and functional analysis of general transcription factor IIE. J. Biol. Chem. 266, 9304-9308

67. Flores, O., Lu, H., Reinberg, D. (1992) Factors involved in specific transcription by mammalian RNA polymerase II: identification and characterization of factor IIH. J. Biol. Chem. 267, 2786-2793