
Biochimica et Biophysica Acta 1769 (2007) 316–329
www.elsevier.com/locate/bbaexp

Review

Plant SET domain-containing proteins: Structure, function and regulation

Danny W-K Ng, Tao Wang, Mahesh B. Chandrasekharan 1, Rodolfo Aramayo, Sunee Kertbundit 2, Timothy C. Hall ⁎

Institute of Developmental and Molecular Biology and Department of Biology, Texas A&M University, College Station, TX 77843-3155, USA

Received 27 October 2006; received in revised form 3 April 2007; accepted 4 April 2007
Available online 12 April 2007

Abstract

Modification of the histone proteins that form the core around which chromosomal DNA is looped has profound epigenetic effects on the accessibility of the associated DNA for transcription, replication and repair. The SET domain is now recognized as generally having methyltransferase activity targeted to specific lysine residues of histone H3 or H4. There is considerable sequence conservation within the SET domain and within its flanking regions. Previous reviews have shown that SET proteins from Arabidopsis and maize fall into five classes according to their sequence and domain architectures. These classes generally reflect specificity for a particular substrate. SET proteins from rice were found to fall into similar groupings, strengthening the merit of the approach taken. Two additional classes, VI and VII, were established that include proteins with truncated/interrupted SET domains. Diverse mechanisms are involved in shaping the function and regulation of SET proteins. These include protein–protein interactions through both intra- and inter-molecular associations that are important in plant developmental processes, such as flowering time control and embryogenesis. Alternative splicing that can result in the generation of two to several different transcript isoforms is now known to be widespread. An exciting and tantalizing question is whether, or how, this alternative splicing affects gene function. For example, it is conceivable that one isoform may debilitate methyltransferase function whereas the other may enhance it, providing an opportunity for differential regulation. The review concludes with the speculation that modulation of SET protein function is mediated by antisense or sense–antisense RNA.
© 2007 Elsevier B.V. All rights reserved.

Keywords: Arabidopsis SET genes; Alternative splicing; Epigenetics; Histone methylation; Rice SET genes; SET domain classes

1. Introduction

In eukaryotes, chromosomal DNA is organized as chromatin, in which ∼146 bp form almost two left-handed coils around an octamer of basic proteins comprised of a histone H3–H4 tetramer and two H2A–H2B dimers [1–3]. These fundamental units, nucleosomes, are linked together by ∼50 bp of spacer DNA that is associated with histone H1 to yield a characteristic ‘beads-on-a-string’ structure that folds further to yield a very compact state in the nucleus [4,5]. Recently, it has become evident that fast, long range, reversible conformational fluctuations in nucleosomal structure, together with specific chemical modifications of the histones, play vital roles in eukaryotic gene regulation [6–8]. Covalent modifications of the N-terminal histone tails include acetylation, phosphorylation, methylation, ubiquitination and ADP-ribosylation [9–11]. These modifications form the basis of the histone code (or, possibly, codes) that regulate gene expression epigenetically through various mechanisms [9,12–14]. For example, methylation of histone H3K9 provides an epigenetic mark for the binding of the chromodomain (protein domain structure that binds methylated lysine) containing protein, HP1 (heterochromatin protein 1), that leads to heterochromatin formation and gene repression [15]. In contrast, acetylation of histones tends to decrease interaction between histones and DNA, and facilitates binding of bromodomain (protein domain structure that binds acetylated lysine) containing proteins, thereby promoting an open chromatin conformation suitable for transcriptional activation [16–18].

⁎ Corresponding author. Tel.: +1 979 845 7728; fax: +1 979 862 4098.
E-mail address: [email protected] (T.C. Hall).
1 Current address: Department of Biochemistry, Vanderbilt University School of Medicine, Nashville, TN 37232, USA.
2 Current address: Department of Plant Science, Faculty of Science, Mahidol University, Rama 6, Bangkok 10400, Thailand.

0167-4781/$ - see front matter © 2007 Elsevier B.V. All rights reserved.
doi:10.1016/j.bbaexp.2007.04.003


Histone methylation is linked to multiple developmental processes including heterochromatin formation, cell cycle regulation, transcriptional silencing and transcriptional activation [19–25]. At least six lysine residues on histone H3 (K4, K9, K27, K36, K79) and H4 (K20) are targeted by histone lysine methyltransferases (HKMTs) [13,24]. Except for H3K79 [26], SET domain-containing HKMTs have the ability to transfer one or multiple methyl groups to the ε-nitrogen of specific lysine residues on histones [10,27]. Adding to the complexity of the histone code, each lysine residue can be mono-, di- or tri-methylated [27,28]. Moreover, unlike histone lysine acetylation, which is generally associated with gene activation [29,30], histone methylation at specific lysine residues can lead to either gene activation or repression [10].
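To give a rough sense of the combinatorial scale implied by these methylation states, the following sketch enumerates patterns over the six lysines named above. It assumes, purely for illustration, that each site varies independently and that the unmethylated state counts as a fourth option; neither assumption is taken from the cited studies, and all names in the code are hypothetical.

```python
from itertools import product

# Lysines named in the text as HKMT targets (five on H3, one on H4).
SITES = ["H3K4", "H3K9", "H3K27", "H3K36", "H3K79", "H4K20"]

# Methylation states per lysine; "me0" (unmethylated) is an added assumption.
STATES = ["me0", "me1", "me2", "me3"]


def enumerate_patterns(sites, states):
    """Yield every assignment of one state per site as a dict."""
    for combo in product(states, repeat=len(sites)):
        yield dict(zip(sites, combo))


# Count the patterns without holding them all in memory.
n_patterns = sum(1 for _ in enumerate_patterns(SITES, STATES))
print(n_patterns)  # 4**6 = 4096 under the independence assumption
```

In a cell, many of these combinations may never occur, so the count is best read as an upper bound on the patterns this simplified model allows.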

Insight into the nature and regulation of enzymes responsible for modifications of specific amino acid residues in the nucleosomal core histones is essential towards deciphering the histone code. Whilst mammalian SUV39H1 was the first enzyme to be shown to possess HKMT activity towards H3K9 [19], its homolog, Su(var)3–9, in Drosophila was the first to be identified in a genetic screen for a suppressor of position effect variegation (PEV) [31]. The phenomenon of PEV was discovered by H. J. Muller in 1930 when describing the rearrangement of the white color eye gene from euchromatic to heterochromatic chromosomal regions. These conformational changes result in silencing and cell-to-cell variation of gene expression that lead to the mosaic eye color phenotype in Drosophila. Subsequent identification of HP1, that binds methylated H3K9, provided a bridge connecting H3K9 methylation to heterochromatin formation and hence gene repression [15,32,33]. In Drosophila, transcriptional repression or activation of homeotic gene expression is mediated by two distinct and antagonistic groups of protein complexes. Repression of homeotic gene expression is mediated by a Polycomb group (PcG) protein complex, containing Enhancer of zeste [E(Z)] and extra sex combs (ESC), that targets H3K27 and H3K9 methylation [34,35]. In contrast, the trithorax group (trxG) proteins, ASH1 [36] and Trx [37], function in targeting H3K4 methylation [38–40] and hence in maintaining an active state of homeotic genes.

Identification of the SET domain followed from the observation [41] that the Drosophila E(Z) PcG protein includes a region with high sequence similarity to two previously identified trxG proteins: Drosophila Trx [37] and human acute lymphoblastic leukemia 1 (ALL-1) [42,43]. The presence of this conserved region in these two groups of proteins with opposing functions (maintaining gene repression by PcG and activation by trxG proteins) led to the proposition [41] that it may comprise a domain that interacts with common nucleic acid or protein targets for gene regulation. Later, Su(var)3–9 was found to contain the C-terminal domain that is also shared by E(Z) and Trx [31]. The SET domain is named after these three founding Drosophila proteins: Suppressor of variegation 3–9 [Su(var)3–9], Enhancer of zeste [E(Z)], and Trithorax (Trx) [31]. The discovery of the evolutionarily conserved SET domain, present in the above-mentioned HKMTs, was an exciting step towards a comprehensive understanding of epigenetic regulation of gene expression through histone methylation. Here, we summarize current knowledge concerning plant SET proteins and compare the known functions of orthologous SET proteins of plants and other organisms. In addition to the existing classification of SET proteins in Arabidopsis and maize [22,44], we have included 9 additional Arabidopsis SET proteins and 34 rice SET proteins from the Plant Chromatin Database (ChromDB; http://www.chromdb.org). We also comment on recent evidence that alternative splicing may be relatively common for plant SET genes and consider the possibility that it has a functional role in regulating histone methylation. In addition, the possible role of antisense RNA in expression of Arabidopsis SET genes will be discussed.

2. Classification and functions of plant SET proteins

Proteins containing the conserved SET domain can be found in organisms ranging from viruses to all three domains of life (Bacteria, Archaea, and Eukaryota). At the time of writing, 1026 entries of SET proteins are cataloged in the Pfam sequence alignment database [45]. According to annotation from both Pfam [45] and ChromDB, at least 47, 37 and 35 SET proteins are present in Arabidopsis, rice and maize, respectively. In plants, Arabidopsis SET (AtSET) proteins are the best annotated and characterized. Baumbusch et al. [44] were the first to classify 37 putative Arabidopsis SET genes into four distinct classes: (1) Enhancer of zeste [E(z)] homologs; (2) Ash1 homologs and related; (3) trithorax (trx) homologs and related; and (4) Suppressor of variegation [Su(var)] homologs and related. Subsequently, Springer et al. [22] classified 32 Arabidopsis and 22 maize SET proteins into 19 orthology groups distributed among five classes according to their phylogenetic relationships and domain organization.

We have added SET genes from rice to those of Arabidopsis and maize based on their sequence homology and phylogenetic relationships (Table 1). Except for the addition of two classes (VI and VII), this analysis is consistent with the classification by Springer et al. [22]. Although much remains to be experimentally verified, in general, it appears that the resulting groupings reflect the substrate specificities of the members.

2.1. Class I: enhancer of zeste [E(Z)] homologs (H3K27)

In Drosophila, homeotic gene regulation (via H3K27 methylation) is mediated by the polycomb repressive complex 2 (PRC2) that contains the PcG proteins, E(Z), ESC, P55 and Su(z)12 [34,35,46–48]. As reviewed in Gutton and Berger [49], homologs corresponding to these PcG proteins are also present in Arabidopsis PcG complexes that function in mediating various aspects of plant development. In Arabidopsis, CURLY LEAF (CLF), SWINGER (SWN) and MEDEA (MEA) constitute the E(Z) homologs of SET domain-containing PcG proteins (Table 1). CLF was the first identified plant SET domain-containing protein within this class [50]. In Arabidopsis, CLF controls leaf and flower morphology as well as flowering time through repression of the floral homeotic genes AGAMOUS (AG) and SHOOTMERISTEMLESS (STM) [51,52]. Its close relative, SWN, acts redundantly in mediating tri-methylation of histone H3K27 at both AG and STM loci [52,53]. MEA plays a role in maintaining its own imprinting during endosperm development and methylation of H3K27 is associated with MEA silencing during vegetative development and male gametogenesis [54–56]. These recent findings provide further evidence that the E(Z) group of SET domain-containing proteins is primarily associated with H3K27 methylation [22] and that tri-methylation of H3K27 is present at specific repressed loci at different stages of plant development. In Drosophila, an additional polycomb repressive complex, PRC1, was found to impede chromatin remodeling mediated by hSWI/SNF [57,58]. Although no complex corresponding to PRC1 has been identified in plants, the presence of a family of multiple PcG members in Arabidopsis suggested that they may constitute alternative functional PRC2 complexes [49,53]. For example, Makarevich et al. [59] recently showed that at least two different MEA, CLF or SWN-containing PcG complexes are involved in histone H3K27 tri-methylation at PHERES1 (PHE1) or FUSCA3 (FUS3) loci during seed development. In addition, vegetative expression of PHE1 and FUS3 was upregulated only in the swn clf double mutant, suggesting their redundant role in constituting PRC2, and hence their participation in different PRC2 complexes, thereby repressing PHE1 [59].

A conserved role of SET proteins in plant development is reinforced by the finding that overexpression of the SET domain of the rice CLF homolog, OsSET1 (SDG711), affects shoot development in transgenic Arabidopsis [60]. In addition to the C-terminally located SET domain, alignment of Arabidopsis and maize E(Z) homologs (MEZ1, MEZ2 and MEZ3) revealed several domains that are unique to this class of proteins (Fig. 1). These include two domains of unknown function (EZD1 and EZD2), a SANT (SWI3, ADA2, N-CoR and TFIIIB′′ DNA-binding domains) domain and a cysteine-rich region [61]. These domains can also be found in rice OsSET1 and OsiEZ1 (SDG718) [62]. Therefore, it is likely that these domains facilitate the ability of the SET domain to modify histones, perhaps by contributing to the formation of the PcG complex that is functionally important for this class of proteins.

2.2. Class II: the ASH1 homologs (H3K36)

All members within this class have a SET domain that is invariably preceded by an AWS (associated with SET) domain and followed by a cysteine-rich post-SET domain (Fig. 1). As reported by Springer et al. [22], this class of proteins can be grouped into several orthology groups, based on the position of the SET domain. As shown in Table 1, the rice homologs can be classified into corresponding subgroups.

Arabidopsis ASHH1, maize SDG102 and rice SDG708 contain an N-terminally located SET domain while ASHH3, ASHH4, SDG110 and SDG724 possess a more centrally located SET domain. Compared to other proteins within this class, ASHH2 (1759-aar) is significantly larger and it, like rice SDG725 (2133-aar), can be placed in a distinct orthology group. In addition to the presence of canonical domains (AWS, SET, post-SET), ASHH2 contains a CW (cysteine and tryptophan conserved) domain (as predicted by Pfam-A) that can be found in at least five other protein families in higher plants [63]. As for the orthology group that contains ASHR3 and SDG736, an additional PHD (plant homeodomain) element is located upstream of the AWS domain (Fig. 1). Therefore, it is possible that during the evolution of these SET domain-containing proteins, an additional domain was acquired to enable additional functions.

Although biochemical characterization of proteins within this class is not yet available, two lines of evidence suggest that this class of proteins is involved in H3K36 methylation. ASHH1 is the Arabidopsis homolog of yeast Set2 [44], which has been shown to exhibit H3K36 methyltransferase activity [64]. In addition, genetic studies revealed that ashh2 mutants show an early flowering phenotype that is accompanied by a decrease in H3K36 methylation at the FLOWERING LOCUS C (FLC) [65].

2.3. Class III: the trithorax homologs and related proteins (H3K4)

In contrast to PcG proteins, trxG proteins are positive regulators of homeotic gene expression [66,67]. In Drosophila, Trx functions through the trithorax acetyltransferase complex (TAC1) in mediating homeotic gene expression [68]. H3K4 methyltransferase activity of Trx was only recently demonstrated experimentally [40]. In contrast, Set1, the Trx homolog in yeast, was the first and only H3K4 methyltransferase identified in yeast [69] and it is important for normal cell growth and silencing of rDNA [70]. Similar to Trx, Set1 functions in a large protein complex, COMPASS (Complex Proteins Associated with Set1), in mediating histone methylation [71,72]. In Arabidopsis, it has been demonstrated that H3K4 methylation is associated with a permissive state of gene expression during plant development [73,74]. ATX1 (subgroup III-1; Table 1) was the first protein confirmed to have H3K4 methyltransferase activity in plants and it is involved in floral development through activating flower homeotic genes [75,76].

Chromatin immunoprecipitation experiments showed that tri-methylation of H3K4 is associated with active transcription from the seed-specific promoter, phaseolin (phas), through the activity of a family of B3 domain-containing transcription factors, including FUS3 (D.W-K Ng and T.C. Hall, unpublished data) and PvALF (Phaseolus vulgaris ABI3-like factor) [77–79]. In addition, an increase in H3K4me3 at the phas chromatin was detected within 1 h upon phas activation, illustrating the dynamic aspects of chromatin remodeling in mediating the activation of this seed-specific promoter [79]. Moreover, LEC1 (LEAFY COTYLEDON1; [80]) and FUS3 (transcription factors involved in embryo development) are up-regulated in the clf swn mutant [59]. Therefore, further studies on the roles of the various classes of SET proteins in modulating epigenetic regulation of gene expression during seed development are likely to provide new insight into developmental processes.

Table 1
SET domain-containing proteins in plants

| Class a | Gp b | Arabidopsis thaliana c | Zea mays d | Oryza sativa ssp. japonica e |
|---|---|---|---|---|
| I | 1 | MEA (SDG5; At1g02580; 689aar) | – | – |
| I | 2 | CLF (SDG1; At2g23380; 902aar) | MEZ1 (AF443596; 931aar) | SDG711 (Os06g16390; 896aar) |
| I | 3 | SWN (SDG10; At4g02020; 856aar) | MEZ2 (AF443597; 894aar), MEZ3 (AF443598; 895aar) | SDG718 (Os03g19480; 895aar) |
| II | 1 | ASHH3 (SDG7; At2g44150; 363aar), ASHH4 (SDG24; At3g59960; 352aar) | SDG110 (AF545814; 342aar) | SDG724 (Os09g13740; 340aar) |
| II | 2 | ASHR3 (SDG4; At4g30860; 497aar) | – | SDG736 (Os02g39800; 492aar) |
| II | 3 | ASHH1 (SDG26; At1g76710; 492aar) | SDG102 (AY122273; 513aar) | SDG708 (Os04g34980; 518aar) |
| II | 4 | ASHH2 (SDG8; At1g77300; 1759aar) | – | SDG725 (Os02g34850; 2133aar) |
| III | 1 | ATX1 (SDG27; At2g31650; 1062aar), ATX2 (SDG30; At1g05830; 1193aar) | SDG106 (DR826986; 385aar)*, SDG128 (DV534215; 296aar)* | SDG723 (Os09g04890; 1022aar) |
| III | 2 | ATX3 (SDG14; At3g61740; 902aar), ATX4 (SDG16; At4g27910; 912aar), ATX5 (SDG29; At5g53430; 1043aar) | SDG115 (DN233805; 147aar)* | SDG721 (Os01g11950; 991aar), SDG705 (Os01g46700; 1057aar) |
| III | 3 | ATXR3 (SDG2; At4g15180; 2351aar) | SDG108 (DV025562; 614aar)* | SDG701 (Os08g08210; 2257aar) |
| III | 4 | ATXR7 (SDG25; At5g42400; 1421aar) | SDG127 (DN204179; 141aar)* | SDG717 (Os12g41900; 1212aar) |
| IV |  | ATXR5 (SDG15; At5g09790; 352aar), ATXR6 (SDG34; At5g24330; 349aar) | SDG139 (DV172958; 231aar)*, SDG129 (BM736459; 109aar)* | SDG720 (Os01g73460; 385aar), SDG730 (Os02g03030; 361aar) |
| V | 1 | SUVH1 (SDG32; At5g04940; 670aar), SUVH3 (SDG19; At1g73100; 669aar), SUVH7 (SDG17; At1g17770; 693aar), SUVH8 (SDG21; At2g24740; 755aar) | SDG101 (DV540845; 795aar), SDG105 (AY093419; 678aar), SDG111 (AY187718; 486aar), SDG113 (AF545813; 766aar) | SDG704 (Os11g38900; 813aar), SDG713 (Os03g20430; 627aar), SDG728 (Os05g41170; 672aar), SDG709 (Os01g59620; 736aar), SDG733 (Os11g03700; 663aar), SDG734 (Os12g03460; 663aar) |
| V | 2 | SUVH4 (SDG33; At5g13960; 624aar), SUVH6 (SDG23; At2g22740; 790aar) | SDG118 (AY122271; 696aar) | SDG714 (Os01g70220; 663aar), SDG727 (Os09g19830; 921aar) |
| V | 3 | SUVH2 (SDG3; At2g33290; 651aar), SUVH9 (SDG22; At4g13460; 650aar) | SDG136 (DN217108; 662aar), SDG135 (CF049431; 449aar)*, SDG137 (DN232551; 685aar) | SDG726 (Os07g25450; 684aar), SDG715 (Os08g45130; 594aar) |
| V | 4 | SUVR3 (SDG20; At3g03750; 338aar) | SDG116 (CO520549; 128aar)* | SDG729 (Os01g56540; 338aar) |
| V | 5 | SUVH5 (SDG9; At2g35160; 794aar) | SDG103 (DN214510; 654aar)*, SDG104 (AY122272; 886aar), SDG119 (DV031465; 508aar)* | SDG703 (Os04g34990; 842aar), SDG710 (Os08g30910; 1173aar) |
| V | 6 | SUVR1 (SDG13; At1g04050; 630aar), SUVR2 (SDG18; At5g43990; 717aar), SUVR4 (SDG31; At3g04380; 492aar) | SDG107 (DV544186; 148aar)* | SDG712 (Os02g40770; 741aar) |
| V | 7 | SUVR5 (SDG6; At2g23740; 203aar) | SDG117 (AY187719; 1198aar), SDG131 (DN228281; 131aar)* | SDG706 (Os02g47900; 1136aar) |
| VI |  | ASHR1 (SDG37; At2g17900; 480aar), ASHR2 (SDG39; At2g19640; 398aar), ATXR1 (SDG35; At1g26760; 969aar), ATXR2 (SDG36; At3g21820; 473aar), ATXR4 (SDG38; At5g06620; 258aar) | SDG130 (AAL75997; 410aar), SDG122 (DY238779; 534aar), SDG140 (DV505249; 217aar)*, SDG123 (AY172976; 303aar) | SDG716 (Os03g49730; 403aar), SDG740 (Os08g10470; 392aar), SDG739 (Os03g07260; 536aar), SDG722 (Os04g53700; 517aar), SDG741 (Os10g27060; 298aar) |
| VII |  | SDG40 (At5g17240; 491aar), Putative RuBisCo SSMT (At3g07670; 504aar), At2g18850 (543aar), At5g14260 (514aar), RuBisCo LSMT (At1g14030; 482aar), At1g01920 (572aar), At1g24610 (476aar), At3g55080 (463aar), At3g56570 (531aar), At4g20130 (483aar) |  | SDG731 (Os07g28840; 479aar) |

Notes to Table 1:
a SET proteins from Arabidopsis, maize and rice were classified into seven classes. Classification of Arabidopsis and maize Class I to V was reported in Springer et al. [22]. Class VI and VII include SET proteins with truncated or interrupted SET domains.
b Different orthology groups of Class I to V SET proteins defined by Springer et al. [22].
c Arabidopsis SET proteins listed in Pfam database, common name, ChromDB name, locus number and amino acid length are indicated.
d Maize SET proteins listed in ChromDB. Asterisk denotes length of partial protein sequence and corresponding accession number of EST sequence.
e Rice SET proteins from ChromDB were queried against the Arabidopsis protein database using NCBI BLAST and their domain architectures were determined using the NCBI conserved domains database. Phylogenetic relationships between homologous proteins were determined by Vector NTI software (Invitrogen) and the proteins are grouped into different orthology groups corresponding to Arabidopsis and maize SET proteins based on both sequence homology and domain architecture. Locus number and amino acid length are indicated. Rice SDG701, SDG732 and SDG738 were not grouped.

Fig. 1. Domain architecture of plant SET proteins. Representative schematic diagram (right) showing overall domain composition of various orthology groups of SET proteins from different classes (left). Diagrams for Arabidopsis and maize SET proteins are described in detail by Springer et al. [22]. Abbreviations: EZD, E(Z) domain; SANT, SWI3, ADA2, N-CoR and TFIIIB′′ DNA-binding domain; CXC, cysteine-rich region; AWS, associated with SET domain; PHD, plant homeodomain; Zf-CW, a zinc finger with conserved Cys and Trp residues; PWWP, domain named after a conserved Pro–Trp–Trp–Pro motif; FYRN, F/Y-rich N-terminus; FYRC, F/Y-rich C-terminus; YDG, conserved Tyr–Asp–Gly motif.

As shown in Fig. 1, in addition to the SET domain, Class III proteins bear several highly conserved protein domains (PWWP, FYRN, FYRC) and the PHD domain (often found in chromatin-associated proteins [81,82]). While all these conserved domains can be located in proteins within subgroup III-1 (ATX1 and ATX2), subgroup III-2 proteins (ATX3, ATX4 and ATX5) lack the FYRN and FYRC domains. In contrast to other subgroups of proteins within the same class, ATXR3, ATXR7 (subgroups III-3 and -4) and their corresponding rice homologs (SDG701 and SDG717) lack all of the highly conserved domains, and they possess only a C-terminally located SET domain and a post-SET domain. Although biochemical characterization is lacking for the other members of the plant Trx class, the presence of various highly conserved domains within this class of proteins may suggest diverse functions for these SET domain-containing proteins. In fact, Alvarez-Venegas et al. [83] showed that ATX1 interacts with phosphatidylinositol 5-phosphate (PI5P) through its PHD finger. PI5P is a component of the cell lipid pool and acts as a secondary messenger mediating several plant processes, such as hyperosmotic stress response and membrane trafficking, via lipid signal transduction [84]. Such interaction between PHD finger and PI5P has also been demonstrated in animals [85]. These findings support the idea that SET domain proteins may act as transducers that bind signaling molecules such as PI5P, modifying enzymatic activity either by complete deactivation or by changing substrate specificity.

2.4. Class IV: proteins with a SET and a PHD domain

This class of proteins is characterized based on the presence of a PHD finger and a C-terminally located SET domain (Fig. 1). Corresponding rice homologs, SDG720 and SDG730, can be located in the ChromDB in addition to the Arabidopsis (ATXR5 and ATXR6) and maize (SDG139 and SDG129) members of this class. In contrast to Class III proteins, Springer et al. [22] suggested that Arabidopsis ATXR5 and ATXR6 arose from an independent duplication of Arabidopsis SET genes and that certain conserved amino acid residues within the SET domain important for HKMT activity are different. Nevertheless, the presence of the PHD finger and recent findings that this domain can bind methylated H3K4 [86,87] suggested that this class of proteins, like those in Class III, is closely related to the trithorax class of SET domain-containing proteins. Although little is known about the function of this class of proteins, recent data suggested that ATXR5 and ATXR6 may contribute to cell cycle regulation or DNA replication through interactions with proliferating cell nuclear antigen (PCNA) [88].

2.5. Class V: suppressor of variegation [Su(var)] homologs and relatives (H3K9)

This large class of proteins is characterized by the presence of pre-SET, SET and post-SET domains (Fig. 1). In addition to the already classified Arabidopsis and maize Su(var) proteins, corresponding rice homologs are sorted in Table 1 into the various orthologous groups. In all the Su(var) homologs (subgroups V-1, -2 and -5), an additional conserved YDG (conserved Tyr–Asp–Gly motif) domain is present upstream of the pre-SET domain.

Members of this class are likely to be involved in heterochromatin formation mediated by H3K9 methylation [33,89,90]. In Arabidopsis, SUVH4 or KRYPTONITE (KYP), the best studied member of this class, represses expression of the floral homeotic gene, SUPERMAN (SUP), through its H3K9 methyltransferase activity [90]. Naumann et al. [91] reported that the N-terminus and YDG domain in SUVH2 are involved in directing DNA methylation and subsequent histone methylation at its target locus so that heterochromatic methylation marks are affected in suvh2 mutant plants. Recently, Ebbs et al. [92,93] showed that there is a hierarchical and locus-specific control of H3K9 di-methylation as well as non-CG DNA methylation by the SUVH4, SUVH5 and SUVH6 HMTs in Arabidopsis. The involvement of this class of proteins in heterochromatin formation was demonstrated for NtSET1 in tobacco [94], and reinforced by the subsequent finding that NtSET1 interacts with LHP1 (Arabidopsis homolog of HP1) [95]. In addition, Yu et al. [95] showed that NtSET1 contains both H3K9 and H3K27 methylation activities in vitro and that overexpression of NtSET1 leads to an increase in H3K9 di-methylation in tobacco BY2 cells. Overall, the highly conserved, plant-specific distribution of domains (YDG, pre-SET and SET) found in subgroups V-1, -2 and -5 (Fig. 1) of proteins suggests that they share a recent, plant-specific common ancestor [61].

Unlike the Su(var) homologs, all the Su(var)-related homologs (subgroups V-4, -6 and -7) lack the YDG domain (Fig. 1). However, Thorstensen et al. [96] have identified a novel conserved plant-specific N-terminal domain, WIYLD, that is present in Arabidopsis SUVR1, SUVR2 and SUVR4 as well as in SUVR homologs of other plant species. Interestingly, the authors also showed that recombinant SUVR4 acts as a di-methyltransferase with preference for mono-methylated H3K9 as substrate. This finding suggests that the SUVH and SUVR proteins may act in concert to achieve the various functional H3K9 methylation states that eventually lead to DNA methylation in a locus-specific manner [97].
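The domain-based distinction drawn above between Su(var) homologs and Su(var)-related proteins can be expressed as a simple rule. The following is a minimal sketch only: the function name, the domain labels used as strings, and the choice to require just pre-SET and SET as the Class V core are illustrative assumptions, not a published classification procedure.

```python
from typing import Set


def classify_class_v(domains: Set[str]) -> str:
    """Sort a SET protein into a Class V subgroup type from its domain content.

    Assumption: a protein is treated as Class V if it carries both a
    pre-SET and a SET domain; the YDG domain then separates Su(var)
    homologs (SUVH-like) from Su(var)-related proteins (SUVR-like).
    """
    core = {"pre-SET", "SET"}
    if not core <= domains:
        return "not Class V"
    if "YDG" in domains:
        return "Su(var) homolog (SUVH-like)"
    return "Su(var)-related (SUVR-like)"


# Hypothetical domain sets written out by hand for illustration.
suvh_like = {"YDG", "pre-SET", "SET", "post-SET"}
suvr_like = {"WIYLD", "pre-SET", "SET", "post-SET"}
print(classify_class_v(suvh_like))
print(classify_class_v(suvr_like))
```

In practice the domain sets would come from a domain-annotation search rather than being typed in, and the WIYLD domain is not used as a criterion here because the text reports it only for some SUVR proteins.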

2.6. Class VI: proteins with an interrupted SET domain

Proteins from both Classes VI and VII lack the diverse domains that can be found in other classes of SET proteins and the SET domain is either truncated or interrupted in the SET-I region (Fig. 1). Class VI consists of two ASH-related proteins (ASHR1 and ASHR2) and three Trx-related (ATXR1, ATXR2 and ATXR4) proteins. Homologous SET genes can be found in maize (SDG122, SDG123, SDG130 and SDG140) and rice (SDG716, SDG740, SDG739, SDG722 and SDG741). This suggests that this class of proteins existed before the divergence of monocot and dicot plants. The SET domain of ASHR1 is interrupted by a Zf-MYND [zinc finger MYND (Myeloid, Nervy, and DEAF-1)] domain. Brown et al. [98] recently reported that Smyd2, the human homolog of Arabidopsis ASHR1, restricts cell-cycle progression through di-methylation of H3K36. Although not yet established in plants, it is likely that, like its human homolog, ASHR1 will be shown to have H3K36 methyltransferase activity.

2.7. Class VII: RBCMT and other SET-related proteins

While the majority of SET domain-containing proteins possess histone-methylating activity, Class VII SET proteins methylate non-histone proteins (Table 1). Rubisco large (RBLSMT) and small (RBSSMT) subunit methyltransferases (RBCMTs) exemplify SET-related proteins that methylate non-histone targets [99–101]. While the N- and C-termini of these proteins bear high similarity to those of other SET domain proteins (Fig. 1), the SET-I region has only weak similarity to, and is longer than, that of canonical SET proteins. Nevertheless, the structural features within the SET-I region essential for substrate-specific methylation are retained [100,102]. We include Arabidopsis SDG40, two unnamed proteins (corresponding to At2g18850 and At5g14260) and rice SDG731 in this class as they show some similarity to RBSSMT. The remaining five uncharacterized Arabidopsis proteins were classified into this class as they contain domain architecture resembling that of RBLSMT.

3. Regulation of plant SET proteins

3.1. Regulation of SET protein functions through protein–protein interactions

At least some SET proteins are capable of self-association [103–105]. This property suggests that the lysine-methyltransferase activity, including the degree of methylation (mono-, di- or tri-methylation), substrate specificity and localization, can be subject to regulation by intra- and inter-molecular interactions with other proteins. Indeed, modulation of the SET domain-mediated lysine-methyltransferase activity by interacting proteins has been demonstrated for yeast SET1-COMPASS [106,107] and for the human MLL1 trithorax complex [108].

Regulation of histone methylation by SET domain-containing methyltransferases via their interaction with other proteins has been demonstrated in yeast and mammalian systems [48,72,109]. In plants, the regulation of SET protein-mediated histone methylation through protein–protein interactions is poorly understood. However, the involvement of E(Z) class proteins in PcG complexes that contain various non-SET proteins provides an avenue to investigate the role, if any, of


322 D.W-K Ng et al. / Biochimica et Biophysica Acta 1769 (2007) 316–329

interacting proteins in regulating the methyltransferase activity. In Arabidopsis, the involvement of MEA, CLF and SWN in developmental processes may be linked to their interactions with other non-SET proteins, as mutations in the interacting partners display phenotypes indicative of affecting similar developmental pathways. During seed development, MEA is part of a PcG complex that includes several other regulators of early seed development such as FERTILIZATION-INDEPENDENT ENDOSPERM (FIE) and FERTILIZATION-INDEPENDENT SEED2 (FIS2) [110–112]. FIE is a WD40 domain-containing protein that shows similarity to the Drosophila EXTRA SEX COMB (ESC). FIS2 is a Zn-finger protein similar to Su(Z)12 [35,47]. Physical interactions between MEA and FIE [113,114], and between MEA (or SWN) and FIS2 [53,115] have been demonstrated using yeast two-hybrid assays. Therefore, based on the similarities in the mutant phenotypes and their requirement for the repression of PHE1 during endosperm development [59], complex formation is a prerequisite for MEA or SWN to function as HMTs in planta.

Recently, Kohler et al. [116] showed that Arabidopsis MULTICOPY SUPPRESSOR OF IRA 1 (MSI1) interacts with FIE and is required for seed development. Like FIE, MSI1 is a WD40 domain-containing protein that shares similarity with Drosophila NURF55 [35]. This group further demonstrated that MSI1 is a component of a 600-kDa protein complex that includes MEA and FIE. The high degree of sequence similarity and the functional association of various relevant subunits strongly suggest that the MEA–FIE–FIS2–MSI1 complex may be the plant homolog of the fly and mammalian ESC-E(Z) complexes. Interestingly, FIE also physically interacts with CLF [51]. Based on the similarity of mutant phenotypes, EMF2 (a zinc-finger protein) has been proposed to be an additional component of a putative complex containing FIE and CLF [53]. Recently, Wood et al. [117] showed that VERNALIZATION2 (VRN2), a Zn-finger domain-containing protein, and VERNALIZATION INSENSITIVE 3 (VIN3), a PHD domain-containing protein, also interact with the FIE–CLF–SWN complex, providing a mechanism for the repression of FLC expression during vernalization. This finding further supports the idea that the functionality of SET proteins is diversified through protein–protein interactions.

In addition to its methyltransferase activity, the substrate specificity of a particular SET protein may be mediated by protein interactions. For example, the WDR5 (a WD40 domain-containing protein) component of the MLL1 trithorax complex has been shown to confer substrate specificity by interacting with di-methylated H3K4 [118,119]. By analogy, the importance of the WD40 domain-containing protein, FIE, to overall plant development can be explained by its association with MEA and CLF in the regulation of sporophytic and gametophytic stages, respectively [51,120]. It can be speculated that FIE may regulate HMT activity by maintaining complex integrity and (or) by providing substrate specificity similar to that observed for WDR5 in the human MLL1 complex. The presence of an additional WD40 domain protein, MSI1, in the MEA–FIE–FIS2–MSI1 complex may confer the ability to recognize two different methylated lysine residues on histones.

3.2. Alternative splicing of Arabidopsis SET genes

The presence of introns in eukaryotic RNAs confers multiple degrees of flexibility in gene expression via various molecular mechanisms, such as developmental and cell-specific regulation of transcription [121], generation of non-coding RNAs [122], and alternative splicing [123]. Among these mechanisms, alternative splicing is of special interest in that it can produce two or more forms of mature mRNA from the same precursor transcript and hence generate transcripts bearing diverse information.

In general, alternative splicing in non-translated regions can have an impact on protein expression levels [124]. Conversely, in coding regions, it may alter protein structure and function. In extreme cases, alternative splicing can lead to the production of two proteins with antagonistic functions [125]. It is estimated that 60–80% of human genes undergo alternative splicing [126], contributing to human genome complexity. A smaller percentage of plant genes, ∼22% in Arabidopsis and 10% in rice [127], also experience alternative splicing. It is possible that these numbers will increase in the future, as more plant cDNAs or ESTs are characterized. Any statement about alternative splicing events should take into consideration the fact that they may be the result of experimental artifacts such as sequencing errors, splicing errors, and RNA degradation events [127,128]. However, despite these possibilities, some alternative splicing events, especially those that are evolutionarily conserved, may be biologically important and contribute to organismal fitness [129]. Expression of alternatively spliced transcripts can be tissue specific [130] and can be differentially regulated by physical [131] or chemical [132] stimuli, arguing for the involvement of multiple regulatory systems in the expression of individual isoforms.

Alternative splicing in the coding region of Arabidopsis SET genes may generate proteins with distinct functions. Such effects have been reported for Drosophila Su(var)3–9, which can express two distinct transcripts (2.4 kb and 2.0 kb), encoding two proteins that share similarity only for the first 80 amino acid residues (aar). Whereas the 2.4 kb transcript is translated to yield a SET domain-containing HMT, the 2.0 kb transcript encodes the γ subunit of eukaryotic translation initiation factor 2 [133]. Conversely, protein isoforms from alternative splicing in coding regions may have redundant functions if the conserved domains are not affected. An example is the zebrafish SET gene, SmyD1. Inactivation of either isoform of this gene results in no morphological phenotype whereas debilitation of both isoforms has severe effects on myofibril organization [134]. The possibility also exists that SET protein isoforms resulting from alternative splicing assemble into higher order protein complexes. Homodimerization or heterodimerization has been reported for the virus SET protein vSET [135], human G9a and GLP [136], and Drosophila ASH1 and Trx [105]. If plant SET proteins form dimers, alternative splicing of SET genes could generate proteins forming various homo- and hetero-dimers having distinct biological activities.

Scanning all 47 Arabidopsis SET genes against the Alternative-Splicing-in-Arabidopsis database (ASIP) [127], the


Fig. 2. Alternative splicing patterns. Schematic diagram showing normal and five major alternative splicing patterns for a representative gene containing three exons (black boxes). Shaded boxes denote alternative exonic regions: AltA, alternative acceptor (3′ side of introns); AltD, alternative donor (5′ side of introns); AltP, alternative positions (both 5′ and 3′ side of introns); IntronR, intron retention; ExonS, exon skipping. Angled lines denote the intronic regions. Modified from: Wang and Brendel [127].


UniProt consortium [137], and published literature [96], led to the identification of 18 (38%) Arabidopsis SET genes with alternative splicing (Table 2). This is a substantially higher percentage than the overall percentage (22%) of alternative splicing for Arabidopsis genes [127], and suggests that alternative splicing is important in generating SET protein complexity and functionality. Alternative splicing in SET genes occurs through various mechanisms (Fig. 2) including alternative donor (AltD), alternative acceptor (AltA), alternative position (AltP), intron retention (IntronR), and exon skipping (ExonS) [127]. A search in the ASIP database revealed two pairs of SET genes from Arabidopsis and rice (At2g19640 and Os08g10470; At5g17240 and Os07g28840) for which the same alternative splicing mechanism (intron retention) is employed to produce mature transcripts encoding the orthologous proteins. This provides a specific example of evolutionary conservation, presumably as a result of some functional advantages.
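The comparison of 18 of 47 SET genes with the 22% genome-wide rate can be made concrete with a short calculation. The sketch below recomputes the percentage and, as an added illustration that is not part of the original analysis, estimates how often 18 or more alternatively spliced genes would be expected among 47 if each gene were spliced independently at the background rate. The independence assumption and the use of a one-sided binomial tail are simplifications; EST coverage differs among genes.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_set_genes = 47        # Arabidopsis SET genes scanned (from the text)
n_spliced = 18          # SET genes with alternative splicing (Table 2)
background_rate = 0.22  # genome-wide Arabidopsis estimate cited in the text

fraction = n_spliced / n_set_genes
expected = n_set_genes * background_rate
p_value = binomial_tail(n_spliced, n_set_genes, background_rate)

print(f"Observed fraction: {fraction:.0%}")
print(f"Expected count at background rate: {expected:.1f}")
print(f"P(X >= {n_spliced}) under the binomial assumption: {p_value:.4f}")
```

A small tail probability would be consistent with the enrichment noted above, but it should be read with the caveats on experimental artifacts raised earlier in this section.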

Alternative splicing of Arabidopsis SET genes could have important consequences in regard to protein functions. For 3 (ATX2, SUVH1, At5g14260) of the 18 alternatively spliced SET genes, examination of ESTs in the ASIP database revealed that splicing differences occurred in the 3′UTRs (Table 2). The 3′ (or 5′) UTRs are known to contain elements that can control protein stability [138], protein translation efficiency [139], and spatial [140] or subcellular distribution of proteins [141]. Casas-Mollano et al. [142] showed that an intron located in the 5′UTR of the Arabidopsis SUVH3 regulates both constitutive and high levels of expression. Therefore, alternative splicing events observed in ATX2, SUVH1 and At5g14260 may affect the localization or abundance of transcripts or encoded proteins, rather than causing a change in the structure of encoded proteins.

Table 2
Arabidopsis SET genes that undergo alternative splicing and/or with naturally occurring antisense transcripts

Class a   SET gene (locus) b     Alternative splicing patterns c   Possible effects on proteins d    Antisense e
II-2      ASHR3 (At4g30860)      IntronR                           Use a different stop codon        –
II-3      ASHH1 (At1g76710)      AltA                              No effect on domain structure     –
II-4      ASHH2 (At1g77300)      AltA; ExonS                       No effect on domain structure     Y
III-1     ATX1 (At2g31650)       AltD                              No effect on domain structure     –
III-1     ATX2 (At1g05830)       AltA                              Affect 3′UTR                      –
III-2     ATX3 (At3g61740)       IntronR                           Use a different stop codon        –
III-2     ATX5 (At5g53430)       –                                 –                                 Y
IV        ATXR5 (At5g09790)      AltA; ExonS                       Affect SET domain                 Y
V-1       SUVH1 (At5g04940)      IntronR                           Affect 3′UTR                      Y
V-1       SUVH3 (At1g73110)      –                                 –                                 Y
V-1       SUVH8 (At2g24740)      –                                 –                                 Y
V-2       SUVH6 (At2g22740)      –                                 –                                 Y
V-4       SUVR3 (At3g03750)      AltP                              Affect SET domain                 –
V-5       SUVH5 (At2g35160)      AltA                              Use a different start codon       –
V-6       SUVR1 (At1g04050)      IntronR                           Use a different start codon       –
V-6       SUVR2 (At5g43990)      AltA; AltD                        Use a different start codon       –
V-6       SUVR4 (At3g04380)      IntronR                           No effect on domain structure     –
VI        ASHR2 (At2g19640)      IntronR                           No effect on domain structure     –
VI        ATXR1 (At1g26760)      –                                 –                                 Y
VII       SDG40 (At5g17240)      IntronR                           Affect SET domain                 –
VII       At5g14260              AltD; IntronR                     Affect 3′UTR                      –
VII       At1g01920              AltA; IntronR                     Affect SET domain                 –
VII       At1g24610              –                                 –                                 Y
VII       At3g55080              AltA; ExonS                       Affect SET domain                 –

a Orthology groups of Arabidopsis SET proteins listed in Table 1.
b Arabidopsis SET genes, common name and/or locus number are indicated.
c Abbreviations for alternative splicing mechanisms described in Fig. 2.
d The possible effects of alternative splicing on proteins according to the conceptual translation of alternatively spliced transcripts.
e “Y” denotes SET gene to which naturally occurring antisense transcripts can be located in ASIP [127] and Arabidopsis Cis-NAT Pairs Database [148].

Alternative splicing for the other 15 Arabidopsis SET genes is predicted by conceptual translation to occur at the protein-encoding region of transcripts (Table 2). For 10 SET genes, alternative splicing events appear to affect the start/stop codon or a region of the protein that lies outside of a conserved domain of sequence similarity. However, the other 5 SET genes (At1g01920, At3g55080, At5g17240, ATXR5 and SUVR3) have alternative splicing affecting the SET domain. In the case of At1g01920, ∼40 aar are added/removed from the SET-N motif of the encoded protein. The splicing event of At3g55080 seems to directly regenerate or eliminate the SET domain coding region. In At5g17240, a 79-nt intron is retained in one isoform, resulting in a frameshift near the beginning of the SET domain coding region. Similarly, a 59-nt exon deletion from one isoform of ATXR5 is likely to cause a frameshift at the SET domain coding region. In the case of SUVR3, the consequences resulting from alternative splicing may be dramatic. We have found that RNAi-mediated debilitation of SUVR3 in transgenic Arabidopsis results in aberrant flower formation and other abnormalities (Wang, T. et al., unpublished data). During analysis of these plants, we found that two different SUVR3 transcripts were present. Sequences identical to those of the transcripts are present in cDNA and EST libraries [143,144], each comprising two exons and one intron (Fig. 3). It became evident that these RNAs encoded SUVR3 isoforms, with the total coding sequence length of isoform SUVR3α being 48 nt shorter than that in SUVR3β. This results in a 16-aar difference between the two isoforms, with a loss of 3 invariant and 3 conserved residues from SUVR3α, suggesting that the SUVR3 isoforms may differ in their ability to methylate histone H3. If so, alternative splicing could be seen to have a significant regulatory role in regard to chromatin remodeling.
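The reading-frame consequences described for At5g17240, ATXR5 and SUVR3 follow from whether the length of the retained or removed segment is a multiple of three. The short sketch below illustrates that arithmetic with the segment lengths quoted in the text. The function name is illustrative, and the sketch assumes that the whole segment lies within the coding sequence and that no in-frame stop codon is introduced, neither of which is verified here.

```python
def splicing_effect(length_change_nt: int) -> str:
    """Describe the reading-frame consequence of adding or removing a
    segment of the given length (in nucleotides) within a coding sequence."""
    size = abs(length_change_nt)
    remainder = size % 3
    if remainder == 0:
        return f"in-frame change of {size // 3} codons"
    return f"frameshift (length leaves remainder {remainder} when divided by 3)"


# Segment lengths taken from the text; positive = retained, negative = removed.
events = {
    "At5g17240, retained intron": 79,
    "ATXR5, deleted exon": -59,
    "SUVR3 alpha relative to beta": -48,
}

for name, delta in events.items():
    print(f"{name}: {splicing_effect(delta)}")
```

The 48-nt difference in SUVR3 is divisible by three, which is why the two isoforms differ by 16 aar without a frameshift, whereas the 79-nt and 59-nt events are not.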

3.3. Anti-sense RNA expression of Arabidopsis SET genes

Our understanding of anti-sense transcription in all organisms is very limited. In mouse, genome-wide analysis of transcripts suggests that antisense transcription is widespread and not restricted to imprinted genes, as previously postulated [145]. In Arabidopsis, the ratio of expression of the sense (S) to the anti-sense (AS) transcripts fluctuates markedly among different tissues during development [146]. In addition, sense–antisense transcripts (SAT) tend to be poly(A)-negative and nuclear localized in Arabidopsis, as in mouse [147].

Fig. 3. Structures of SUVR3α and SUVR3β isoforms. Panel A: the red, green, orange and […] boxes denote 5′ UTR, exon, intron and 3′ UTR regions, respectively. Their positions in SUVR3α and SUVR3β are indicated. The 48-nt fragment represented by the black bar, which belongs to the exonic regions in SUVR3β but belongs to the intronic region in SUVR3α, and its corresponding 16 aar are shown in blue letters. Panel B: sequence alignment of the SET domain of three Arabidopsis SET proteins: SUVR3α, SUVR3β, Arabidopsis KRYPTONITE (At_KYP, AAK28969); human SET domain mariner transposase (Hs_SETMAR, AAH11635); N. crassa (Nc) DIM-5 (CAF06044) and S. pombe (Sp) CLR4 (NP_595186). The aar highlighted are invariant (white on black) or conserved (white on gray) among all six SET proteins.

Naturally occurring AS transcripts (NATs) corresponding to nine Arabidopsis SET genes were found in the ASIP [127] and Arabidopsis Cis-NAT Pairs Databases [148] (Table 2). This opens the possibility that the expression of the sense transcript for some of these genes might be under the control of the anti-sense message. Sense–antisense association would result in double-stranded RNA (dsRNA), a natural substrate for a Dicer ribonuclease [149]. The small RNAs produced by this reaction could direct the formation of a silent chromatin state, thus silencing the expression of the homologous genomic region.

NATs can be found in both prokaryotes and eukaryotes. In eukaryotic cells, such RNAs are known to be mostly involved in genomic imprinting [150] and X chromosome inactivation [151]. The discovery that AS transcription is more widespread than previously thought challenges this view and suggests that dsRNA may play a more general role in regulation of gene expression. Therefore, it would be quite informative to determine if NATs affect the expression and chromatin state of one or more AtSET genes and then to elucidate the mechanism by which NATs act at these loci. NATs could facilitate a sensitive, rapid and dynamic control over AtSET expression under different environmental cues. Determination of the spatial and temporal expression pattern of these NATs and their corresponding genes will provide some clues to the role of AtSET NATs in regulating plant development. Alternatively, NATs could assist in the permanent imprinting of AtSET genes. Except for MEA, a self-regulated imprinted SET gene [54–56] devoid of antisense transcript, no AtSET has been reported to be imprinted. Therefore, the nine AtSET genes for which AS transcript information is listed in Table 2 could be additional targets for genomic imprinting.
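The sense–antisense pairs considered in this section are cis-NATs, that is, transcripts arising from opposite strands of the same locus whose genomic spans overlap. The following sketch shows one simple way such a pair could be recognized from transcript coordinates. It is not the procedure used by ASIP or the Cis-NAT Pairs Database; the class, function names, minimum-overlap parameter and all coordinates are hypothetical and serve only to illustrate the definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    name: str
    chrom: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def overlap_length(a: Transcript, b: Transcript) -> int:
    """Number of genomic positions shared by two transcript spans."""
    if a.chrom != b.chrom:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def is_cis_nat_pair(a: Transcript, b: Transcript, min_overlap: int = 1) -> bool:
    """True if the transcripts lie on opposite strands and overlap."""
    return a.strand != b.strand and overlap_length(a, b) >= min_overlap


# Hypothetical coordinates, not taken from any annotation.
sense = Transcript("sense_example", "chr1", 1000, 3000, "+")
antisense = Transcript("antisense_example", "chr1", 2500, 3500, "-")
neighbour = Transcript("same_strand_example", "chr1", 2500, 3500, "+")

print(overlap_length(sense, antisense))
print(is_cis_nat_pair(sense, antisense))
print(is_cis_nat_pair(sense, neighbour))
```

Only the overlapping portion of such a pair could form the dsRNA discussed above, so the overlap length is a natural quantity to report alongside the pair itself.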

4. Future directions

The extensive genome duplication observed in plants appears to provide them with the plasticity necessary to successfully adapt to the various environmental conditions they encounter as sessile organisms. This is best exemplified by the diversification of genes involved in plant secondary metabolism through gene duplication [152,153]. Such gene duplication events offer a great opportunity for studying gene evolution [154,155]. In particular, studying the diverse functions of families of paralogous plant SET genes in plant development is likely to provide new insight into the molecular and functional evolution of proteins containing the SET and other auxiliary domains. In addition, although considerable attention has been paid to the SET domain proteins as histone methyltransferases, their activity as protein methyltransferases capable of modifying non-histone substrates and modulating their functions is still poorly understood. The paucity of knowledge is not restricted to plant biology, as only a few instances have been reported even in yeast and mammalian systems. For example, methylation of p53 by mammalian SET9 has been shown to cause the retention of p53 in the nucleus and increase the stability of an otherwise highly unstable protein and, consequently, regulate p53-dependent gene expression [156]. SET9 has also been demonstrated to methylate TAF10, a general transcription factor, thereby increasing its affinity to RNA polymerase II [157]. The methylation of p53 and TAF10 shows how SET domain proteins can play a role in transcription by methylating non-histone proteins. Furthermore, the SET domain-containing methyltransferases can contribute to cellular processes other than transcription. For instance, Set1, a histone H3 Lys4 methyltransferase, has recently been shown to mediate methylation of Dam1, a kinetochore protein [158]. Therefore, it is possible that Set1 plays an as yet unknown role during mitosis. Determining the histone and non-histone protein substrates for the plant SET domain-containing methyltransferases is an exciting area of research that will continue to provide novel insight into the contribution of SET proteins to plant development.

Acknowledgements

This work was supported in part by NSF grants MCB 0110477 and MCB 0346681 to TCH and U.S. Public Health Service Grant GM58770 to R.A. We thank Xiangyu Shi for help and discussion.

References

[1] R.D. Kornberg, Chromatin structure: a repeating unit of histones and DNA, Science 184 (1974) 868–871.

[2] K. Luger, A.W. Mader, R.K. Richmond, D.F. Sargent, T.J. Richmond, Crystal structure of the nucleosome core particle at 2.8 Å resolution, Nature 389 (1997) 251–260.

[3] A.P. Wolffe, D. Guschin, Review: chromatin structural features and targets that regulate transcription, J. Struct. Biol. 129 (2000) 102–122.

[4] J.T. Finch, M. Noll, R.D. Kornberg, Electron microscopy of defined lengths of chromatin, Proc. Natl. Acad. Sci. U. S. A. 72 (1975) 3320–3322.

[5] P. Oudet, M. Gross-Bellard, P. Chambon, Electron microscopic and biochemical evidence that chromatin structure is a repeating unit, Cell 4 (1975) 281–300.

[6] R.D. Kornberg, Y. Lorch, Twenty-five years of the nucleosome, fundamental particle of the eukaryote chromosome, Cell 98 (1999) 285–294.

[7] A. Nemeth, G. Langst, Chromatin higher order structure: opening up chromatin for transcription, Brief. Funct. Genomics Proteomics 2 (2004) 334–343.

[8] M. Tomschik, H. Zheng, K. van Holde, J. Zlatanova, S.H. Leuba, Fast, long-range, reversible conformational fluctuations in nucleosomes revealed by single-pair fluorescence resonance energy transfer, Proc. Natl. Acad. Sci. U. S. A. 102 (2005) 3278–3283.

[9] B.D. Strahl, C.D. Allis, The language of covalent histone modifications, Nature 403 (2000) 41–45.

[10] Y. Zhang, D. Reinberg, Transcription regulation by histone methylation: interplay between different covalent modifications of the core histone tails, Genes Dev. 15 (2001) 2343–2360.

[11] M. Iizuka, M.M. Smith, Functional consequences of histone modifications, Curr. Opin. Genet. Dev. 13 (2003) 154–160.

[12] K. Luger, T.J. Richmond, The histone tails of the nucleosome, Curr. Opin. Genet. Dev. 8 (1998) 140–146.

[13] B.M. Turner, Histone acetylation and an epigenetic code, Bioessays 22 (2000) 836–845.

[14] T. Jenuwein, C.D. Allis, Translating the histone code, Science 293 (2001) 1074–1080.

[15] A.J. Bannister, P. Zegerman, J.F. Partridge, E.A. Miska, J.O. Thomas, R.C. Allshire, T. Kouzarides, Selective recognition of methylated lysine 9 on histone H3 by the HP1 chromo domain, Nature 410 (2001) 120–124.

[16] S.R. Haynes, C. Dollard, F. Winston, S. Beck, J. Trowsdale, I.B. Dawid, The bromodomain: a conserved sequence found in human, Drosophila and yeast proteins, Nucleic Acids Res. 20 (1992) 2603.

[17] J.W. Tamkun, R. Deuring, M.P. Scott, M. Kissinger, A.M. Pattatucci, T.C. Kaufman, J.A. Kennison, brahma: a regulator of Drosophila homeotic genes structurally related to the yeast transcriptional activator SNF2/SWI2, Cell 68 (1992) 561–572.

[18] L. Zeng, M.M. Zhou, Bromodomain: an acetyl-lysine binding domain, FEBS Lett. 513 (2002) 124–128.

[19] S. Rea, F. Eisenhaber, D. O'Carroll, B.D. Strahl, Z.W. Sun, M. Schmid, S. Opravil, K. Mechtler, C.P. Ponting, C.D. Allis, T. Jenuwein, Regulation of chromatin structure by site-specific histone H3 methyltransferases, Nature 406 (2000) 593–599.

[20] T. Kouzarides, Histone methylation in transcriptional control, Curr. Opin. Genet. Dev. 12 (2002) 198–209.

[21] R.J. Sims III, K. Nishioka, D. Reinberg, Histone lysine methylation: a signature for chromatin function, Trends Genet. 19 (2003) 629–639.

[22] N.M. Springer, C.A. Napoli, D.A. Selinger, R. Pandey, K.C. Cone, V.L. Chandler, H.F. Kaeppler, S.M. Kaeppler, Comparative analysis of SET domain proteins in maize and Arabidopsis reveals multiple duplications preceding the divergence of monocots and dicots, Plant Physiol. 132 (2003) 907–925.

[23] R. Margueron, P. Trojer, D. Reinberg, The key to development: interpreting the histone code? Curr. Opin. Genet. Dev. 15 (2005) 163–176.

[24] C. Martin, Y. Zhang, The diverse functions of histone lysine methylation, Nat. Rev., Mol. Cell Biol. 6 (2005) 838–849.

[25] P. Volkel, P.O. Angrand, The control of histone lysine methylation in epigenetic regulation, Biochimie 89 (2007) 1–20.

[26] Q. Feng, H. Wang, H.H. Ng, H. Erdjument-Bromage, P. Tempst, K. Struhl, Y. Zhang, Methylation of H3-lysine 79 is mediated by a new family of HMTases without a SET domain, Curr. Biol. 12 (2002) 1052–1058.

[27] M. Lachner, T. Jenuwein, The many faces of histone lysine methylation,Curr. Opin. Cell Biol. 14 (2002) 286–298.

[28] A.J. Bannister, T. Kouzarides, Histone methylation: recognizing themethyl mark, Methods Enzymol. 376 (2004) 269–288.

[29] P.A. Wade, D. Pruss, A.P. Wolffe, Histone acetylation: chromatin inaction, Trends Biochem. Sci. 22 (1997) 128–132.

[30] A. Lusser, D. Kolle, P. Loidl, Histone acetylation: lessons from the plantkingdom, Trends Plant Sci. 6 (2001) 59–65.

[31] B. Tschiersch, A. Hofmann, V. Krauss, R. Dorn, G. Korge, G. Reuter, Theprotein encoded by the Drosophila position-effect variegation suppressorgene Su(var)3–9 combines domains of antagonistic regulators ofhomeotic gene complexes, EMBO J. 13 (1994) 3822–3831.

[32] J.C. Eissenberg, T.C. James, D.M. Foster-Hartnett, T. Hartnett, V. Ngan,S.C. Elgin, Mutation in a heterochromatin-specific chromosomal proteinis associated with suppression of position-effect variegation inDrosophilamelanogaster, Proc. Natl. Acad. Sci. U. S. A. 87 (1990) 9923–9927.

[33] G. Schotta, A. Ebert, V. Krauss, A. Fischer, J. Hoffmann, S. Rea, T.Jenuwein, R. Dorn, G. Reuter, Central role of Drosophila SU(VAR)3–9in histone H3–K9 methylation and heterochromatic gene silencing,EMBO J. 21 (2002) 1121–1131.

[34] R. Cao, L. Wang, H. Wang, L. Xia, H. Erdjument-Bromage, P. Tempst,R.S. Jones, Y. Zhang, Role of histone H3 lysine 27 methylation inPolycomb-group silencing, Science 298 (2002) 1039–1043.

[35] J. Muller, C.M. Hart, N.J. Francis, M.L. Vargas, A. Sengupta, B. Wild,E.L. Miller, M.B. O'Connor, R.E. Kingston, J.A. Simon, Histonemethyltransferase activity of a Drosophila Polycomb group repressorcomplex, Cell 111 (2002) 197–208.

[36] N.A. Tripoulas, E. Hersperger, D. La Jeunesse, A. Shearn, Moleculargenetic analysis of the Drosophila melanogaster gene absent, small orhomeotic discs1 (ash1), Genetics 137 (1994) 1027–1038.

[37] A.M. Mazo, D.H. Huang, B.A. Mozer, I.B. Dawid, The trithorax gene, atrans-acting regulator of the bithorax complex in Drosophila, encodes aprotein with zinc-binding domains, Proc. Natl. Acad. Sci. U. S. A. 87(1990) 2112–2116.

[38] C. Beisel, A. Imhof, J. Greene, E. Kremmer, F. Sauer, Histonemethylation by the Drosophila epigenetic transcriptional regulatorAsh1, Nature 419 (2002) 857–862.

[39] K.N. Byrd, A. Shearn, ASH1, a Drosophila trithorax group protein, isrequired for methylation of lysine 4 residues on histone H3, Proc. Natl.Acad. Sci. U. S. A. 100 (2003) 11535–11540.

[40] S.T. Smith, S. Petruk, Y. Sedkov, E. Cho, S. Tillib, E. Canaani, A. Mazo,Modulation of heat shock gene expression by the TAC1 chromatin-modifying complex, Nat. Cell Biol. 6 (2004) 162–167.

[41] R.S. Jones, W.M. Gelbart, The Drosophila Polycomb-group geneEnhancer of zeste contains a region with sequence similarity totrithorax, Mol. Cell. Biol. 13 (1993) 6357–6366.

[42] Y. Gu, T. Nakamura, H. Alder, R. Prasad, O. Canaani, G. Cimino, C.M.Croce, E. Canaani, The t(4;11) chromosome translocation of human acuteleukemias fuses the ALL-1 gene, related to Drosophila trithorax, to theAF-4 gene, Cell 71 (1992) 701–708.

[43] D.C. Tkachuk, S. Kohler, M.L. Cleary, Involvement of a homolog ofDrosophila trithorax by 11q23 chromosomal translocations in acuteleukemias, Cell 71 (1992) 691–700.

[44] L.O. Baumbusch, T. Thorstensen, V. Krauss, A. Fischer, K. Naumann,R. Assalkhou, I. Schulz, G. Reuter, R.B. Aalen, The Arabidopsisthaliana genome contains at least 29 active genes encoding SETdomain proteins that can be assigned to four evolutionarily conservedclasses, Nucleic Acids Res. 29 (2001) 4319–4333.

[45] A. Bateman, L. Coin, R. Durbin, R.D. Finn, V. Hollich, S. Griffiths-Jones, A. Khanna, M. Marshall, S. Moxon, E.L. Sonnhammer, D.J. Studholme, C. Yeats, S.R. Eddy, The Pfam protein families database, Nucleic Acids Res. 32 (2004) D138–D141.

[46] J. Ng, C.M. Hart, K. Morgan, J.A. Simon, A Drosophila ESC-E(Z) protein complex is distinct from other polycomb group complexes and contains covalently modified ESC, Mol. Cell. Biol. 20 (2000) 3069–3078.

[47] F. Tie, T. Furuyama, J. Prasad-Sinha, E. Jane, P.J. Harte, The Drosophila Polycomb Group proteins ESC and E(Z) are present in a complex containing the histone-binding protein p55 and the histone deacetylase RPD3, Development 128 (2001) 275–286.

[48] B. Czermin, R. Melfi, D. McCabe, V. Seitz, A. Imhof, V. Pirrotta, Drosophila enhancer of Zeste/ESC complexes have a histone H3 methyltransferase activity that marks chromosomal Polycomb sites, Cell 111 (2002) 185–196.

[49] A.E. Guitton, F. Berger, Control of reproduction by Polycomb Group complexes in animals and plants, Int. J. Dev. Biol. 49 (2005) 707–716.

[50] J. Goodrich, P. Puangsomlee, M. Martin, D. Long, E.M. Meyerowitz, G. Coupland, A Polycomb-group gene regulates homeotic gene expression in Arabidopsis, Nature 386 (1997) 44–51.

[51] A. Katz, M. Oliva, A. Mosquna, O. Hakim, N. Ohad, FIE and CURLY LEAF polycomb proteins interact in the regulation of homeobox gene expression during sporophyte development, Plant J. 37 (2004) 707–719.

[52] D. Schubert, L. Primavesi, A. Bishopp, G. Roberts, J. Doonan, T. Jenuwein, J. Goodrich, Silencing by plant Polycomb-group genes requires dispersed trimethylation of histone H3 at lysine 27, EMBO J. 25 (2006) 4638–4649.

[53] Y. Chanvivattana, A. Bishopp, D. Schubert, C. Stock, Y.H. Moon, Z.R. Sung, J. Goodrich, Interaction of Polycomb-group proteins controlling flowering in Arabidopsis, Development 131 (2004) 5263–5276.

[54] C. Baroux, V. Gagliardini, D.R. Page, U. Grossniklaus, Dynamic regulatory interactions of Polycomb group genes: MEDEA autoregulation is required for imprinted gene expression in Arabidopsis, Genes Dev. 20 (2006) 1081–1086.

[55] M. Gehring, J.H. Huh, T.F. Hsieh, J. Penterman, Y. Choi, J.J. Harada, R.B. Goldberg, R.L. Fischer, DEMETER DNA glycosylase establishes MEDEA polycomb gene self-imprinting by allele-specific demethylation, Cell 124 (2006) 495–506.

[56] P.E. Jullien, A. Katz, M. Oliva, N. Ohad, F. Berger, Polycomb group complexes self-regulate imprinting of the Polycomb group gene MEDEA in Arabidopsis, Curr. Biol. 16 (2006) 486–492.

[57] Z. Shao, F. Raible, R. Mollaaghababa, J.R. Guyon, C.T. Wu, W. Bender, R.E. Kingston, Stabilization of chromatin structure by PRC1, a Polycomb complex, Cell 98 (1999) 37–46.

[58] N.J. Francis, A.J. Saurin, Z. Shao, R.E. Kingston, Reconstitution of a functional core polycomb repressive complex, Mol. Cell 8 (2001) 545–556.

[59] G. Makarevich, O. Leroy, U. Akinci, D. Schubert, O. Clarenz, J. Goodrich, U. Grossniklaus, C. Kohler, Different Polycomb group complexes regulate common target genes in Arabidopsis, EMBO Rep. 7 (2006) 947–952.

[60] Y.K. Liang, Y. Wang, Y. Zhang, S.G. Li, X.C. Lu, H. Li, C. Zou, Z.H. Xu, S.N. Bai, OsSET1, a novel SET-domain-containing gene from rice, J. Exp. Bot. 54 (2003) 1995–1996.

[61] N.M. Springer, O.N. Danilevskaya, P. Hermon, T.G. Helentjaris, R.L. Phillips, H.F. Kaeppler, S.M. Kaeppler, Sequence relationships, conserved domains, and expression patterns for maize homologs of the polycomb group genes E(z), esc, and E(Pc), Plant Physiol. 128 (2002) 1332–1345.

[62] J.K. Thakur, M.R. Malik, V. Bhatt, M.K. Reddy, S.K. Sopory, A.K. Tyagi, J.P. Khurana, A POLYCOMB group gene of rice (Oryza sativa L. subspecies indica), OsiEZ1, codes for a nuclear-localized protein expressed preferentially in young seedlings and during reproductive development, Gene 314 (2003) 1–13.

[63] J. Perry, Y. Zhao, The CW domain, a structural module shared amongst vertebrates, vertebrate-infecting parasites and higher plants, Trends Biochem. Sci. 28 (2003) 576–580.

[64] B.D. Strahl, P.A. Grant, S.D. Briggs, Z.W. Sun, J.R. Bone, J.A. Caldwell, S. Mollah, R.G. Cook, J. Shabanowitz, D.F. Hunt, C.D. Allis, Set2 is a nucleosomal histone H3-selective methyltransferase that mediates transcriptional repression, Mol. Cell. Biol. 22 (2002) 1298–1306.

[65] Z. Zhao, Y. Yu, D. Meyer, C. Wu, W.H. Shen, Prevention of early flowering by expression of FLOWERING LOCUS C requires methylation of histone H3 K36, Nat. Cell Biol. 7 (2005) 1256–1260.

[66] L. Ringrose, R. Paro, Epigenetic regulation of cellular memory by the Polycomb and Trithorax group proteins, Annu. Rev. Genet. 38 (2004) 413–443.

[67] C. Grimaud, N. Negre, G. Cavalli, From genetics to epigenetics: the tale of Polycomb group and trithorax group genes, Chromosome Res. 14 (2006) 363–375.

[68] S. Petruk, Y. Sedkov, S. Smith, S. Tillib, V. Kraevski, T. Nakamura, E. Canaani, C.M. Croce, A. Mazo, Trithorax and dCBP acting in a complex to maintain expression of a homeotic gene, Science 294 (2001) 1331–1334.

[69] A. Roguev, D. Schaft, A. Shevchenko, W.W. Pijnappel, M. Wilm, R. Aasland, A.F. Stewart, The Saccharomyces cerevisiae Set1 complex includes an Ash2 homologue and methylates histone 3 lysine 4, EMBO J. 20 (2001) 7137–7148.

[70] S.D. Briggs, M. Bryk, B.D. Strahl, W.L. Cheung, J.K. Davie, S.Y. Dent, F. Winston, C.D. Allis, Histone H3 lysine 4 methylation is mediated by Set1 and required for cell growth and rDNA silencing in Saccharomyces cerevisiae, Genes Dev. 15 (2001) 3286–3295.

[71] T. Miller, N.J. Krogan, J. Dover, H. Erdjument-Bromage, P. Tempst, M. Johnston, J.F. Greenblatt, A. Shilatifard, COMPASS: a complex of proteins associated with a trithorax-related SET domain protein, Proc. Natl. Acad. Sci. U. S. A. 98 (2001) 12902–12907.

[72] P.L. Nagy, J. Griesenbeck, R.D. Kornberg, M.L. Cleary, A trithorax-group complex purified from Saccharomyces cerevisiae is required for methylation of histone H3, Proc. Natl. Acad. Sci. U. S. A. 99 (2002) 90–94.

[73] H. Santos-Rosa, R. Schneider, A.J. Bannister, J. Sherriff, B.E. Bernstein, N.C. Emre, S.L. Schreiber, J. Mellor, T. Kouzarides, Active genes are tri-methylated at K4 of histone H3, Nature 419 (2002) 407–411.

[74] R. Bouveret, N. Schonrock, W. Gruissem, L. Hennig, Regulation of flowering time by Arabidopsis MSI1, Development 133 (2006) 1693–1702.

[75] R. Alvarez-Venegas, S. Pien, M. Sadder, X. Witmer, U. Grossniklaus, Z. Avramova, ATX-1, an Arabidopsis homolog of trithorax, activates flower homeotic genes, Curr. Biol. 13 (2003) 627–637.

[76] R. Alvarez-Venegas, Z. Avramova, Methylation patterns of histone H3 Lys 4, Lys 9 and Lys 27 in transcriptionally active and inactive Arabidopsis genes and in atx1 mutants, Nucleic Acids Res. 33 (2005) 5199–5207.

[77] A.J. Bobb, H.G. Eiben, M.M. Bustos, PvAlf, an embryo-specific acidic transcriptional activator enhances gene expression from phaseolin and phytohemagglutinin promoters, Plant J. 8 (1995) 331–343.

[78] G. Li, K.J. Bishop, M.B. Chandrasekharan, T.C. Hall, β-phaseolin gene activation is a two-step process: PvALF-facilitated chromatin modification followed by abscisic acid-mediated gene activation, Proc. Natl. Acad. Sci. U. S. A. 96 (1999) 7104–7109.

[79] D.W. Ng, M.B. Chandrasekharan, T.C. Hall, Ordered histone modifications are associated with transcriptional poising and activation of the phaseolin promoter, Plant Cell 18 (2006) 119–132.

[80] M. West, K.M. Yee, J. Danao, J.L. Zimmerman, R.L. Fischer, R.B. Goldberg, J.J. Harada, LEAFY COTYLEDON1 is an essential regulator of late embryogenesis and cotyledon identity in Arabidopsis, Plant Cell 6 (1994) 1731–1745.

[81] R. Aasland, T.J. Gibson, A.F. Stewart, The PHD finger: implications for chromatin-mediated transcriptional regulation, Trends Biochem. Sci. 20 (1995) 56–59.

[82] M. Bienz, The PHD finger, a nuclear protein-interaction domain, Trends Biochem. Sci. 31 (2006) 35–40.

[83] R. Alvarez-Venegas, M. Sadder, A. Hlavacka, F. Baluska, Y. Xia, G. Lu, A. Firsov, G. Sarath, H. Moriyama, J.G. Dubrovsky, Z. Avramova, The Arabidopsis homolog of trithorax, ATX1, binds phosphatidylinositol 5-phosphate, and the two regulate a common set of target genes, Proc. Natl. Acad. Sci. U. S. A. 103 (2006) 6049–6054.

[84] X. Wang, Lipid signaling, Curr. Opin. Plant Biol. 7 (2004) 329–336.

[85] O. Gozani, P. Karuman, D.R. Jones, D. Ivanov, J. Cha, A.A. Lugovskoy, C.L. Baird, H. Zhu, S.J. Field, S.L. Lessnick, J. Villasenor, B. Mehrotra, J. Chen, V.R. Rao, J.S. Brugge, C.G. Ferguson, B. Payrastre, D.G. Myszka, L.C. Cantley, G. Wagner, N. Divecha, G.D. Prestwich, J. Yuan, The PHD finger of the chromatin-associated protein ING2 functions as a nuclear phosphoinositide receptor, Cell 114 (2003) 99–111.

[86] X. Shi, T. Hong, K.L. Walter, M. Ewalt, E. Michishita, T. Hung, D. Carney, P. Pena, F. Lan, M.R. Kaadige, N. Lacoste, C. Cayrou, F. Davrazou, A. Saha, B.R. Cairns, D.E. Ayer, T.G. Kutateladze, Y. Shi, J. Cote, K.F. Chua, O. Gozani, ING2 PHD domain links histone H3 lysine 4 methylation to active gene repression, Nature 442 (2006) 96–99.

[87] J. Wysocka, T. Swigut, H. Xiao, T.A. Milne, S.Y. Kwon, J. Landry, M. Kauer, A.J. Tackett, B.T. Chait, P. Badenhorst, C. Wu, C.D. Allis, A PHD finger of NURF couples histone H3 lysine 4 trimethylation with chromatin remodelling, Nature 442 (2006) 86–90.

[88] C. Raynaud, R. Sozzani, N. Glab, S. Domenichini, C. Perennes, R. Cella, E. Kondorosi, C. Bergounioux, Two cell-cycle regulated SET-domain proteins interact with proliferating cell nuclear antigen (PCNA) in Arabidopsis, Plant J. 47 (2006) 395–407.

[89] J. Nakayama, J.C. Rice, B.D. Strahl, C.D. Allis, S.I. Grewal, Role of histone H3 lysine 9 methylation in epigenetic control of heterochromatin assembly, Science 292 (2001) 110–113.

[90] J.P. Jackson, A.M. Lindroth, X. Cao, S.E. Jacobsen, Control of CpNpG DNA methylation by the KRYPTONITE histone H3 methyltransferase, Nature 416 (2002) 556–560.

[91] K. Naumann, A. Fischer, I. Hofmann, V. Krauss, S. Phalke, K. Irmler, G. Hause, A.C. Aurich, R. Dorn, T. Jenuwein, G. Reuter, Pivotal role of AtSUVH2 in heterochromatic histone methylation and gene silencing in Arabidopsis, EMBO J. 24 (2005) 1418–1429.

[92] M.L. Ebbs, L. Bartee, J. Bender, H3 lysine 9 methylation is maintained on a transcribed inverted repeat by combined action of SUVH6 and SUVH4 methyltransferases, Mol. Cell. Biol. 25 (2005) 10507–10515.

[93] M.L. Ebbs, J. Bender, Locus-specific control of DNA methylation by the Arabidopsis SUVH5 histone methyltransferase, Plant Cell 18 (2006) 1166–1176.

[94] W.H. Shen, NtSET1, a member of a newly identified subgroup of plant SET-domain-containing proteins, is chromatin-associated and its ectopic overexpression inhibits tobacco plant growth, Plant J. 28 (2001) 371–383.

[95] Y. Yu, A. Dong, W.H. Shen, Molecular characterization of the tobacco SET domain protein NtSET1 unravels its role in histone methylation, chromatin binding, and segregation, Plant J. 40 (2004) 699–711.

[96] T. Thorstensen, A. Fischer, S.V. Sandvik, S.S. Johnsen, P.E. Grini, G. Reuter, R.B. Aalen, The Arabidopsis SUVR4 protein is a nucleolar histone methyltransferase with preference for monomethylated H3K9, Nucleic Acids Res. 34 (2006) 5461–5470.

[97] V. Mutskov, G. Felsenfeld, Silencing of transgene transcription precedes methylation of promoter DNA and histone H3 lysine 9, EMBO J. 23 (2004) 138–149.

[98] M.A. Brown, R.J. Sims III, P.D. Gottlieb, P.W. Tucker, Identification and characterization of Smyd2: a split SET/MYND domain-containing histone H3 lysine 36-specific methyltransferase that interacts with the Sin3 histone deacetylase complex, Mol. Cancer 5 (2006) 26–36.

[99] Z. Ying, R.M. Mulligan, N. Janney, R.L. Houtz, Rubisco small and large subunit N-methyltransferases. Bi- and mono-functional methyltransferases that methylate the small and large subunits of Rubisco, J. Biol. Chem. 274 (1999) 36750–36756.

[100] R.C. Trievel, B.M. Beach, L.M. Dirk, R.L. Houtz, J.H. Hurley, Structure and catalytic mechanism of a SET domain protein methyltransferase, Cell 111 (2002) 91–103.

[101] R.C. Trievel, E.M. Flynn, R.L. Houtz, J.H. Hurley, Mechanism of multiple lysine methylation by the SET domain enzyme Rubisco LSMT, Nat. Struct. Biol. 10 (2003) 545–552.

[102] R. Marmorstein, Structure of SET domain proteins: a new twist on histone methylation, Trends Biochem. Sci. 28 (2003) 59–62.

[103] X. Cui, I. De Vivo, R. Slany, A. Miyamoto, R. Firestein, M.L. Cleary, Association of SET domain and myotubularin-related proteins modulates growth control, Nat. Genet. 18 (1998) 331–337.

[104] O. Rozenblatt-Rosen, T. Rozovskaia, D. Burakov, Y. Sedkov, S. Tillib, J. Blechman, T. Nakamura, C.M. Croce, A. Mazo, E. Canaani, The C-terminal SET domains of ALL-1 and TRITHORAX interact with the INI1 and SNR1 proteins, components of the SWI/SNF complex, Proc. Natl. Acad. Sci. U. S. A. 95 (1998) 4152–4157.

[105] T. Rozovskaia, O. Rozenblatt-Rosen, Y. Sedkov, D. Burakov, T. Yano, T. Nakamura, S. Petruck, L. Ben-Simchon, C.M. Croce, A. Mazo, E. Canaani, Self-association of the SET domains of human ALL-1 and of Drosophila TRITHORAX and ASH1 proteins, Oncogene 19 (2000) 351–357.

[106] A. Morillon, N. Karabetsou, A. Nair, J. Mellor, Dynamic lysine methylation on histone H3 defines the regulatory phase of gene transcription, Mol. Cell 18 (2005) 723–734.

[107] J.E. Mueller, M. Canze, M. Bryk, The requirements for COMPASS and Paf1 in transcriptional silencing and methylation of histone H3 in Saccharomyces cerevisiae, Genetics 173 (2006) 557–567.

[108] Y. Dou, T.A. Milne, A.J. Ruthenburg, S. Lee, J.W. Lee, G.L. Verdine, C.D. Allis, R.G. Roeder, Regulation of MLL1 H3K4 methyltransferase activity by its core components, Nat. Struct. Mol. Biol. 13 (2006) 713–719.

[109] J. Simon, D. Bornemann, K. Lunde, C. Schwartz, The extra sex combs product contains WD40 repeats and its time of action implies a role distinct from other Polycomb group products, Mech. Dev. 53 (1995) 197–208.

[110] U. Grossniklaus, J.P. Vielle-Calzada, M.A. Hoeppner, W.B. Gagliano, Maternal control of embryogenesis by MEDEA, a polycomb group gene in Arabidopsis, Science 280 (1998) 446–450.

[111] M. Luo, P. Bilodeau, A. Koltunow, E.S. Dennis, W.J. Peacock, A.M. Chaudhury, Genes controlling fertilization-independent seed development in Arabidopsis thaliana, Proc. Natl. Acad. Sci. U. S. A. 96 (1999) 296–301.

[112] N. Ohad, R. Yadegari, L. Margossian, M. Hannon, D. Michaeli, J.J. Harada, R.B. Goldberg, R.L. Fischer, Mutations in FIE, a WD polycomb group gene, allow endosperm development without fertilization, Plant Cell 11 (1999) 407–416.

[113] M. Luo, P. Bilodeau, E.S. Dennis, W.J. Peacock, A. Chaudhury, Expression and parent-of-origin effects for FIS2, MEA, and FIE in the endosperm and embryo of developing Arabidopsis seeds, Proc. Natl. Acad. Sci. U. S. A. 97 (2000) 10637–10642.

[114] C. Spillane, C. MacDougall, C. Stock, C. Kohler, J.P. Vielle-Calzada, S.M. Nunes, U. Grossniklaus, J. Goodrich, Interaction of the Arabidopsis polycomb group proteins FIE and MEA mediates their common phenotypes, Curr. Biol. 10 (2000) 1535–1538.

[115] D. Wang, M.D. Tyson, S.S. Jackson, R. Yadegari, Partially redundant functions of two SET-domain polycomb-group proteins in controlling initiation of seed development in Arabidopsis, Proc. Natl. Acad. Sci. U. S. A. 103 (2006) 13244–13249.

[116] C. Kohler, L. Hennig, R. Bouveret, J. Gheyselinck, U. Grossniklaus, W. Gruissem, Arabidopsis MSI1 is a component of the MEA/FIE Polycomb group complex and required for seed development, EMBO J. 22 (2003) 4804–4814.

[117] C.C. Wood, M. Robertson, G. Tanner, W.J. Peacock, E.S. Dennis, C.A. Helliwell, The Arabidopsis thaliana vernalization response requires a polycomb-like protein complex that also includes VERNALIZATION INSENSITIVE 3, Proc. Natl. Acad. Sci. U. S. A. 103 (2006) 14631–14636.

[118] J. Wysocka, T. Swigut, T.A. Milne, Y. Dou, X. Zhang, A.L. Burlingame, R.G. Roeder, A.H. Brivanlou, C.D. Allis, WDR5 associates with histone H3 methylated at K4 and is essential for H3 K4 methylation and vertebrate development, Cell 121 (2005) 859–872.

[119] A. Schuetz, A. Allali-Hassani, F. Martin, P. Loppnau, M. Vedadi, A. Bochkarev, A.N. Plotnikov, C.H. Arrowsmith, J. Min, Structural basis for molecular recognition and presentation of histone H3 by WDR5, EMBO J. 25 (2006) 4245–4252.

[120] D. Preuss, Chromatin silencing and Arabidopsis development: A role for polycomb proteins, Plant Cell 11 (1999) 765–768.

[121] V.S. Reddy, A.S. Reddy, Developmental and cell-specific expression of ZWICHEL is regulated by the intron and exon sequences of its gene, Plant Mol. Biol. 54 (2004) 273–293.

[122] F. Hube, J. Guo, S. Chooniedass-Kothari, C. Cooper, M.K. Hamedani, A.A. Dibrov, A.A. Blanchard, X. Wang, G. Deng, Y. Myal, E. Leygue, Alternative splicing of the first intron of the steroid receptor RNA activator (SRA) participates in the generation of coding and noncoding RNA isoforms in breast cancer cell lines, DNA Cell Biol. 25 (2006) 418–428.

[123] E. Castells, P. Puigdomenech, J.M. Casacuberta, Regulation of the kinase activity of the MIK GCK-like MAP4K by alternative splicing, Plant Mol. Biol. 61 (2006) 747–756.

[124] T.A. Hughes, Regulation of gene expression by alternative untranslated regions, Trends Genet. 22 (2006) 119–122.

[125] D. Mumberg, F.C. Lucibello, M. Schuermann, R. Muller, Alternative splicing of fosB transcripts results in differentially expressed mRNAs encoding functionally antagonistic proteins, Genes Dev. 5 (1991) 1212–1223.

[126] C. Lee, Q. Wang, Bioinformatics analysis of alternative splicing, Brief. Bioinform. 6 (2005) 23–33.

[127] B.B. Wang, V. Brendel, Genomewide comparative analysis of alternative splicing in plants, Proc. Natl. Acad. Sci. U. S. A. 103 (2006) 7175–7180.

[128] A.D. Neverov, I.I. Artamonova, R.N. Nurtdinov, D. Frishman, M.S. Gelfand, A.A. Mironov, Alternative splicing and protein function, BMC Bioinformatics 6 (2005) 266.

[129] G.W. Yeo, E. Van Nostrand, D. Holste, T. Poggio, C.B. Burge, Identification and analysis of alternative splicing events conserved in human and mouse, Proc. Natl. Acad. Sci. U. S. A. 102 (2005) 2850–2855.

[130] Q. Xu, B. Modrek, C. Lee, Genome-wide detection of tissue-specific alternative splicing in the human transcriptome, Nucleic Acids Res. 30 (2002) 3754–3766.

[131] H.V. Colot, J.J. Loros, J.C. Dunlap, Temperature-modulated alternative splicing and promoter use in the Circadian clock gene frequency, Mol. Biol. Cell 16 (2005) 5563–5571.

[132] S.M. Bagnasco, T. Peng, Y. Nakayama, J.M. Sands, Differential expression of individual UT-A urea transporter isoforms in rat kidney, J. Am. Soc. Nephrol. 11 (2000) 1980–1986.

[133] V. Krauss, G. Reuter, Two genes become one: the genes encoding heterochromatin protein Su(var)3–9 and translation initiation factor subunit eIF-2gamma are joined to a dicistronic unit in holometabolic insects, Genetics 156 (2000) 1157–1167.

[134] X. Tan, J. Rotllant, H. Li, P. De Deyne, S.J. Du, SmyD1, a histone methyltransferase, is required for myofibril organization and muscle contraction in zebrafish embryos, Proc. Natl. Acad. Sci. U. S. A. 103 (2006) 2713–2718.

[135] K.L. Manzur, A. Farooq, L. Zeng, O. Plotnikova, A.W. Koch, M.M. Sachchidanand, A dimeric viral SET domain methyltransferase specific to Lys27 of histone H3, Nat. Struct. Biol. 10 (2003) 187–196.

[136] J. Ueda, M. Tachibana, T. Ikura, Y. Shinkai, Zinc finger protein Wiz links G9a/GLP histone methyltransferases to the co-repressor molecule CtBP, J. Biol. Chem. 281 (2006) 20120–20128.

[137] C.H. Wu, R. Apweiler, A. Bairoch, D.A. Natale, W.C. Barker, B. Boeckmann, S. Ferro, E. Gasteiger, H. Huang, R. Lopez, M. Magrane, M.J. Martin, R. Mazumder, C. O'Donovan, N. Redaschi, B. Suzek, The Universal Protein Resource (UniProt): an expanding universe of protein information, Nucleic Acids Res. 34 (2006) D187–D191.

[138] C. Touriol, A. Morillon, M.C. Gensac, H. Prats, A.C. Prats, Expression of human fibroblast growth factor 2 mRNA is post-transcriptionally controlled by a unique destabilizing element present in the 3′-untranslated region between alternative polyadenylation sites, J. Biol. Chem. 274 (1999) 21402–21408.

[139] L. Knirsch, L.B. Clerch, A region in the 3′ UTR of MnSOD RNA enhances translation of a heterologous RNA, Biochem. Biophys. Res. Commun. 272 (2000) 164–168.

[140] D.W. Ng, M.B. Chandrasekharan, T.C. Hall, The 5′ UTR negatively regulates quantitative and spatial expression from the ABI3 promoter, Plant Mol. Biol. 54 (2004) 25–38.

[141] E.H. Kislauskis, X. Zhu, R.H. Singer, Sequences responsible for intracellular localization of beta-actin messenger RNA also affect cell phenotype, J. Cell Biol. 127 (1994) 441–451.

[142] J.A. Casas-Mollano, N.T. Lao, T.A. Kavanagh, Intron-regulated expression of SUVH3, an Arabidopsis Su(var)3–9 homologue, J. Exp. Bot. 57 (2006) 3301–3311.

[143] E. Asamizu, Y. Nakamura, S. Sato, S. Tabata, A large scale analysis of cDNA in Arabidopsis thaliana: generation of 12,028 non-redundant expressed sequence tags from normalized and size-selected cDNA libraries, DNA Res. 7 (2000) 175–180.

[144] K. Yamada, J. Lim, J.M. Dale, H. Chen, P. Shinn, C.J. Palm, A.M. Southwick, H.C. Wu, C. Kim, M. Nguyen, P. Pham, R. Cheuk, G. Karlin-Newmann, S.X. Liu, B. Lam, H. Sakano, T. Wu, G. Yu, M. Miranda, H.L. Quach, M. Tripp, C.H. Chang, J.M. Lee, M. Toriumi, M.M. Chan, C.C. Tang, C.S. Onodera, J.M. Deng, K. Akiyama, Y. Ansari, T. Arakawa, J. Banh, F. Banno, L. Bowser, S. Brooks, P. Carninci, Q. Chao, N. Choy, A. Enju, A.D. Goldsmith, M. Gurjal, N.F. Hansen, Y. Hayashizaki, C. Johnson-Hopson, V.W. Hsuan, K. Iida, M. Karnes, S. Khan, E. Koesema, J. Ishida, P.X. Jiang, T. Jones, J. Kawai, A. Kamiya, C. Meyers, M. Nakajima, M. Narusaka, M. Seki, T. Sakurai, M. Satou, R. Tamse, M. Vaysberg, E.K. Wallender, C. Wong, Y. Yamamura, S. Yuan, K. Shinozaki, R.W. Davis, A. Theologis, J.R. Ecker, Empirical analysis of transcriptional activity in the Arabidopsis genome, Science 302 (2003) 842–846.

[145] S. Katayama, Y. Tomaru, T. Kasukawa, K. Waki, M. Nakanishi, M. Nakamura, H. Nishida, C.C. Yap, M. Suzuki, J. Kawai, H. Suzuki, P. Carninci, Y. Hayashizaki, C. Wells, M. Frith, T. Ravasi, K.C. Pang, J. Hallinan, J. Mattick, D.A. Hume, L. Lipovich, S. Batalov, P.G. Engstrom, Y. Mizuno, M.A. Faghihi, A. Sandelin, A.M. Chalk, S. Mottagui-Tabar, Z. Liang, B. Lenhard, C. Wahlestedt, Antisense transcription in the mammalian transcriptome, Science 309 (2005) 1564–1566.

[146] C. Llave, K.D. Kasschau, M.A. Rector, J.C. Carrington, Endogenous and silencing-associated small RNAs in plants, Plant Cell 14 (2002) 1605–1619.

[147] H. Kiyosawa, N. Mise, S. Iwase, Y. Hayashizaki, K. Abe, Disclosing hidden transcripts: mouse natural sense–antisense transcripts tend to be poly(A) negative and nuclear localized, Genome Res. 15 (2005) 463–474.

[148] X.J. Wang, T. Gaasterland, N.H. Chua, Genome-wide prediction and identification of cis-natural antisense transcripts in Arabidopsis thaliana, Genome Biol. 6 (2005) R30.

[149] Y. Kurihara, Y. Watanabe, Arabidopsis micro-RNA biogenesis through Dicer-like 1 protein functions, Proc. Natl. Acad. Sci. U. S. A. 101 (2004) 12753–12758.

[150] T. Moore, M. Constancia, M. Zubair, B. Bailleul, R. Feil, H. Sasaki, W. Reik, Multiple imprinted sense and antisense transcripts, differential methylation and tandem repeats in a putative imprinting control region upstream of mouse Igf2, Proc. Natl. Acad. Sci. U. S. A. 94 (1997) 12509–12514.

[151] J.T. Lee, L.S. Davidow, D. Warshawsky, Tsix, a gene antisense to Xist at the X-inactivation centre, Nat. Genet. 21 (1999) 400–404.

[152] D.J. Kliebenstein, V.M. Lambrix, M. Reichelt, J. Gershenzon, T. Mitchell-Olds, Gene duplication in the diversification of secondary metabolism: tandem 2-oxoglutarate-dependent dioxygenases control glucosinolate biosynthesis in Arabidopsis, Plant Cell 13 (2001) 681–693.

[153] D. Ober, Seeing double: gene duplication and diversification in plant secondary metabolism, Trends Plant Sci. 10 (2005) 444–449.

[154] E. Pichersky, D.R. Gang, Genetics and biochemistry of secondary metabolites in plants: an evolutionary perspective, Trends Plant Sci. 5 (2000) 439–445.

[155] M. Wink, Evolution of secondary metabolites from an ecological and molecular phylogenetic perspective, Phytochemistry 64 (2003) 3–19.

[156] S. Chuikov, J.K. Kurash, J.R. Wilson, B. Xiao, N. Justin, G.S. Ivanov, K. McKinney, P. Tempst, C. Prives, S.J. Gamblin, N.A. Barlev, D. Reinberg, Regulation of p53 activity through lysine methylation, Nature 432 (2004) 353–360.

[157] A. Kouskouti, E. Scheer, A. Staub, L. Tora, I. Talianidis, Gene-specific modulation of TAF10 function by SET9-mediated methylation, Mol. Cell 14 (2004) 175–182.

[158] K. Zhang, W. Lin, J.A. Latham, G.M. Riefler, J.M. Schumacher, C. Chan, K. Tatchell, D.H. Hawke, R. Kobayashi, S.Y. Dent, The Set1 methyltransferase opposes Ipl1 aurora kinase functions in chromosome segregation, Cell 122 (2005) 723–734.