Differences in DNA-binding specificity of floral homeotic ... · Differences in DNA-binding...
Transcript of Differences in DNA-binding specificity of floral homeotic ... · Differences in DNA-binding...
1
LARGE-SCALE BIOLOGY ARTICLE 1
2
Differences in DNA-binding specificity of floral homeotic protein 3
complexes predict organ-specific target genes 4
5
Cezary Smaczniak1,2,3,#
, Jose M. Muiño3,4,#
, Dijun Chen2,3
, Gerco C. Angenent1,5
, Kerstin 6
Kaufmann2,3,*
7
8 1
Laboratory of Molecular Biology Wageningen University, Wageningen, 6708PB, The 9
Netherlands 10 2 Institute for Biochemistry and Biology, Potsdam University, Potsdam, 14476, Germany 11
3 Present Address: Institute of Biology, Humboldt-Universität zu Berlin, Berlin, 10115, Germany 12
4 Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, 14195, 13
Germany 14 5 Bioscience, Wageningen Plant Research, Wageningen, 6708PB, The Netherlands 15
16 # These authors contributed equally to this work 17
18 * To whom correspondence should be addressed: [email protected] 19
20
Short Title: Specificities of MADS-domain homeotic complexes 21
22
One-Sentence Summary: DNA-binding specificity of floral MADS-domain protein complexes 23
depends on protein complex composition and provides clues to the mechanisms underlying target 24
gene specificity. 25
26
The author responsible for distribution of materials integral to the findings presented in this 27
article in accordance with the policy described in the Instructions for Authors 28
(www.plantcell.org) is: Kerstin Kaufmann ([email protected]). 29
30
31
ABSTRACT 32
Floral organ identities in plants are specified by the combinatorial action of homeotic master 33
regulatory transcription factors. How these factors achieve their regulatory specificities is 34
however still largely unclear. Genome-wide in vivo DNA binding data show that homeotic 35
MADS-domain proteins recognize partly distinct genomic regions, suggesting that DNA binding 36
specificity contributes to functional differences of homeotic protein complexes. We used in vitro 37
systematic evolution of ligands by exponential enrichment followed by high throughput DNA 38
sequencing (SELEX-seq) on several floral MADS-domain protein homo- and heterodimers to 39
measure their DNA-binding specificities. We show that specification of reproductive organs is 40
associated with distinct binding preferences of a complex formed by SEPALLATA3 (SEP3) and 41
AGAMOUS (AG). Binding specificity is further modulated by different binding site spacing 42
preferences. Combination of SELEX-seq and genome-wide DNA binding data allows to 43
differentiate between targets in specification of reproductive versus perianth organs in the 44
flower. We validate the importance of DNA-binding specificity for organ-specific gene 45
regulation by modulating promoter activity through targeted mutagenesis. Our study shows that 46
Plant Cell Advance Publication. Published on July 21, 2017, doi:10.1105/tpc.17.00145
©2017 American Society of Plant Biologists. All Rights Reserved
2
intrafamily protein interactions affect DNA-binding specificity of floral MADS-domain proteins. 47
Differential DNA-binding of MADS-domain protein complexes plays a role in the specificity of 48
target gene regulation. 49
50
INTRODUCTION 51
The exact molecular mechanisms of how most transcription factors (TFs) achieve their DNA-52
binding specificity are largely unknown. Most intriguing are the questions how closely related 53
TFs control distinct biological processes, and how heteromeric complex formation affects 54
functional specificity. DNA binding specificity of proteins stems from primary DNA sequence 55
and its structural properties (Rohs et al., 2009). Another aspect that contributes to functional 56
specificity comes from the ability of TFs to form higher-order protein complexes. The protein 57
interactions potentially modify the DNA-binding affinity of individual members of the complex. 58
For example, interactions of TFs from the same family with a common cofactor can evoke latent 59
differences in DNA-binding specificities (Slattery et al., 2011). Moreover, the formation of 60
heterodimeric complexes between TFs of the same or different families results in the recognition 61
of novel, composite DNA TF binding sites (TFBSs) (Jolma et al., 2013; Jolma et al., 2015; 62
Bemer et al., 2017). The interplay of these molecular mechanisms could influence the DNA-63
binding specificity of plant MADS-domain TFs that are known to form a complex intra-family 64
protein interaction network. 65
MADS-domain TFs are present in all major eukaryotic lineages. Especially in plants, they 66
form a large family, e.g. of more than 100 members in the flowering plant Arabidopsis thaliana 67
(Parenicova et al., 2003). MADS-domain TFs have important roles in the regulation of many 68
developmental processes (Smaczniak et al., 2012a). Remarkably, some MADS-domain proteins 69
acquired several distinct functions in different organs related with their ability to interact with 70
other family members in a combinatorial manner. To explain the variety of regulatory functions 71
of MADS-domain proteins, we need to understand how the formation of heteromeric protein 72
complexes affect their DNA-binding specificity and target gene regulation. Previous studies 73
showed that MADS-domain proteins bind DNA sequence elements called CArG-boxes 74
(consensus CC[A/T]6GG) (Pollock and Treisman, 1990; Schwarz-Sommer et al., 1992; Huang et 75
al., 1993; Riechmann et al., 1996b; Riechmann et al., 1996a) by means of their highly conserved, 76
56 amino acid N-terminal DNA binding MADS-domain (Schwarz-Sommer et al., 1990). 77
Thousands of CArG-boxes are present in the genome of Arabidopsis thaliana, many of 78
3
which seem not to be bound by MADS-domain TFs (de Folter and Angenent, 2006; Kaufmann et 79
al., 2009; Muino et al., 2014). Besides that, a large fraction of genomic regions bound in vivo do 80
not contain consensus CArG-box sequences (Kaufmann et al., 2009; Zheng et al., 2009; 81
Kaufmann et al., 2010; Deng et al., 2011). A good example how combinatorial protein 82
interactions determine regulatory specificity of plant MADS-domain proteins is flower 83
development. According to the ‘floral quartet’ model, each type of floral organ is specified by a 84
distinct combination of MADS-domain proteins that form quaternary protein complexes and 85
bind two CArG-boxes (a protein dimer contacts a single TFBS in such tetrameric complex) in 86
the regulatory regions of target genes (Theissen and Saedler, 2001). The interactions of 87
Arabidopsis MADS-domain proteins suggested in the ‘floral quartet’ model as well as 88
interactions with other TFs and chromatin-associated proteins were characterized in vitro and in 89
vivo (Honma and Goto, 2001; Melzer and Theissen, 2009; Melzer et al., 2009; Smaczniak et al., 90
2012b). Therefore, part of the functional specificity may come from the ability of MADS-91
domain TFs to form homo- and heteromeric protein complexes, which has been proposed before 92
(Krizek and Meyerowitz, 1996; Riechmann et al., 1996b). The functional analysis of chimeric 93
proteins, where MADS domain swaps between plant and human MADS-domain TFs are able to 94
rescue floral homeotic mutants, suggests, somewhat controversially, that their functional 95
specificity is independent of DNA-binding specificity (Riechmann and Meyerowitz, 1997; 96
Krizek et al., 1999). However, recent genome-wide in vivo DNA binding for homeotic MADS-97
domain proteins (Wuest et al., 2012; Ó’Maoiléidigh et al., 2013; Pajoro et al., 2014) show 98
between 30% and 60% non-overlapping TFBSs (Yan et al., 2016). Also, the DNA looping may 99
affect binding preferences, as shown in vitro for SEPALLATA homotetrameric complexes (Jetha 100
et al., 2014). Together, these results indicate a complex recognition mechanism that depends on 101
the DNA-binding specificities of MADS-domain TF dimers and their higher-order interactions. 102
Using systematic evolution of ligands by exponential enrichment followed by high-103
throughput sequencing (SELEX-seq), we determined DNA-binding specificities of selected 104
MADS-domain protein complexes in vitro. Here we show that MADS-domain protein 105
homodimers SEPALLATA3 (SEP3)–SEP3, AGAMOUS (AG)–AG, APETALA1 (AP1)–AP1 106
and organ-specific heterodimers SEP3–AG (reproductive organs), SEP3–AP1 (perianth) bind 107
DNA sequences with different specificities and affinities. In particular, the identity specification 108
of reproductive organs is linked with unique DNA-binding preferences of SEP3–AG dimers, 109
4
resulting in a set of target genes not bound by other MADS-domain proteins. More generally, we 110
show that differences in DNA-binding specificities can discriminate between complex-specific in 111
vivo DNA TFBSs (as identified by ChIP-seq experiments). This approach allows to modify the 112
specificity of the native DNA TFBSs of MADS-domain proteins. As a proof of concept, we 113
modified TFBSs in the APETALA3 (AP3) promoter to modulate organ-specific gene expression 114
in vivo. 115
116
RESULTS 117
DNA-Binding Specificities of Floral MADS-Domain TF Complexes 118
To determine DNA-binding specificities of individual MADS-domain TF complexes we 119
followed a SELEX-seq approach (Figure 1), in which we made use of in vitro 120
transcribed/translated MADS-domain protein complexes and double-stranded DNA (dsDNA) 121
libraries containing a 20 bp region of randomized nucleotides (Jolma et al., 2010). Starting with 122
this random pool of dsDNA, we isolated DNA sequences bound by MADS-domain TF 123
complexes by immunoprecipitation with immobilized, protein-specific antibodies (Figure 1A). 124
For each MADS-domain TF dimer combination we performed at least three rounds of SELEX 125
and characterized the evolved pools of sequences after each round by high-throughput DNA 126
sequencing (Supplemental Table 1) and electrophoretic mobility shift assays (EMSAs) (Figure 127
1B). Heteromeric complexes were not separated from homomeric complexes during our SELEX 128
experiment. However, our earlier data showed that SEP3 and AG or AP1 proteins, when 129
incubated with the DNA, form predominantly heteromeric rather than homomeric complexes 130
(Smaczniak et al., 2012b). This was confirmed by EMSA experiments using the SELEX-131
enriched DNA and AP1 or AG protein complexes with a truncated SEP3 protein (Supplemental 132
Figure 1). The predominant formation of heteromeric complexes with specific DNA-binding 133
preferences is also confirmed here by high reproducibility of the SELEX-seq-derived affinities of 134
the SEP3–AG complex when we used either the SEP3 or the AG antibody (Supplemental Figure 135
2). Furthermore, we observed a specific enrichment pattern of sequences for SEP3–AG 136
heterodimers that is neither strong for the SEP3 homodimer nor for the AG homodimer (Figure 137
2). 138
To estimate affinities of MADS-domain TF dimers to the DNA fragments we used 139
sequencing data of the initial randomized dsDNA libraries (Round 0, R0) and calculated the 140
5
relative enrichments between R1–R3 of SELEX and R0 (Slattery et al., 2011). High-throughput 141
sequencing of the first three rounds of SELEX (R1–R3) for the various MADS-domain protein 142
combinations showed enrichment of the generic MADS-domain TFBS consensus in the evolved 143
pools of sequences (Figure 1C). The enrichment of a generic CArG-box in the SELEX-seq for 144
the AP1 homodimer was lower compared to any other studied MADS-domain protein complex 145
(Figure 1C). This suggests that either the AP1 homodimer binds more weakly to DNA compared 146
to other MADS-domain protein complexes, as was observed before in EMSA experiments on 147
individual DNA probes (Smaczniak et al., 2012b), or AP1 homodimers are formed with lower 148
efficiency. In comparison, we did not see any enrichment of randomly permutated CArG-box 149
sequence fragments in the enriched DNA sequence pools, confirming DNA-binding specificity 150
of MADS-domain proteins (Figure 1D). 151
152
SELEX-seq Analyses Reveal DNA-Binding Preferences of Floral Homeotic Protein 153
Complexes 154
To determine the DNA-binding preferences for MADS-domain TF complexes independently of 155
the presumed consensus CArG-box, we follow the methodology proposed by Slattery et al. 156
(Slattery et al., 2011). We estimated the optimal length of the k-mer (subsequence of a length k) 157
from which we can accurately predict the relative DNA-binding affinities in the SELEX-seq 158
data. We based our analysis on k-mers and not on the full length 20 bp sequences, because, in 159
theory, there are more than 1012
possible combinations of unique 20 bp sequences, and our 160
sequencing libraries contain on average 2.2 M reads (Supplemental Table 1). Depending on the 161
library, the optimal length of the k-mer varied between 9 and 12 bp (Supplemental Figure 3). For 162
further analyses, we chose k-mer length of 12 bp that had the highest information gain score in 163
most of the libraries. Then, we calculated the relative DNA-binding affinities using the ratio 164
between frequencies of k-mers in R3 and R0 of SELEX. Owing to a large diversity of sequences 165
in the randomized initial libraries (R0), their k-mer frequencies are low and, therefore, difficult to 166
estimate directly from the sequencing results. For this reason, we used a Markov Model to 167
estimate the best frequency of R0. This workflow allowed us to calculate relative DNA-binding 168
affinities of MADS-domain protein complexes to k-mers of length 12. We clustered those 169
relative affinities in a heatmap, highlighting diverse and overlapping binding preferences (Figure 170
2A). This analysis could differentiate the complexes based on their DNA-binding preferences. 171
6
Sequences in cluster 1 were specific for SEP3–AG, sequences in cluster 2 were specific for AG 172
alone, while sequences in cluster 3 had mixed specificity mainly for SEP3, SEP3–AP1 and AP1 173
that could be further divided into three additional clusters 3a–3c (Supplemental Figure 4). In 174
detail, sequences in cluster 3a were not enriched for SEP3; sequences in cluster 3b were not 175
enriched for SEP3–AG while sequences in cluster 3c were specific to SEP3 alone. Remarkably, 176
more k-mers were specifically enriched for SEP3–AG or for AG than for the other proteins and 177
protein combinations. Also the SEP3–AG complex seemed to bind a wide range of sequences 178
including, but not limited to, the ones that were also bound by SEP3–AP1 or SEP3 complexes. 179
This shows that MADS-domain protein complexes bind overlapping sets of TFBSs and suggests 180
a presence of active competition for those common sites when multiple proteins are expressed in 181
the same tissues. However, SEP3–AG and AG complexes showed additional specific DNA-182
binding capacities, which may suggest that reproductive organ specification by AG and SEP3 183
involves in vivo binding to specific target genes not bound by other MADS-domain proteins. 184
To validate SELEX-seq derived relative affinities for 12-mer sequences represented in the 185
heatmap, we performed quantitative multiple fluorescence relative affinity (quMFRA) assays 186
(Man and Stormo, 2001). For these experiments, we used 20 bp fragments (plus barcodes and 187
flanking regions with labelled primers) obtained from sequenced SELEX libraries that contained 188
a particular 12-mer sequence (Supplemental Tables 2 and 3). Our SELEX and quMFRA 189
approaches gave very similar results with a correlation coefficient (R2) between 0.6–0.8 190
(Supplemental Figure 5 and Supplemental Data Set 1). Additionally, for five selected 20 bp 191
SELEX fragments and three protein complexes, we performed absolute protein-DNA binding 192
affinity estimation by EMSA. We found that four out of five measured dissociation constants 193
(Kd) were following relative affinity values observed in the SELEX-seq: high SELEX-seq 194
relative affinity should relate to low Kd (Supplemental Figure 6 and Supplemental Data Set 2). 195
Differences could be caused by the fact that SELEX-seq derived relative affinities are based on 196
the 12-mer sequences, while quMFRA and Kd estimations by EMSA were performed with the 197
full length 20 bp fragments containing a particular 12-mer and its flanking regions. 198
To identify consensus TFBSs of MADS-domain protein complexes in each cluster, we 199
extracted all full-length 20 bp fragments from the R3 sequencing data that contained specific 12-200
mer sequences and performed DNA motif discovery using the GADEM algorithm (Li, 2009). 201
Comparison between different motifs revealed differences in nucleotide composition and number 202
7
of consecutive nucleotides between different MADS-domain TFBSs mainly in the central, A/T-203
rich region (Figure 2B). We observed that the motif for AG has the A/T-rich region length 204
between 3-5 bp (cluster 2), while the A/T-rich region for SEP3, AP1 and SEP3-AP1 is longer, 6–205
8 bp (cluster 3). Especially long A/T-rich regions were found for sequences in the cluster 3c, 206
which are specific for SEP3 alone (Supplemental Figure 4). The SEP3–AG dimer showed a 207
binding preference (cluster 1) intermediate between SEP3–SEP3 and AG–AG dimers. The 208
analysis of the sequence enrichment for all 64 possible variations of the consensus CArG-box 209
allowing one nucleotide change each in the A/T-rich region of the CC[A/T]6GG sequence, 210
confirmed that different MADS-domain protein complexes have different binding preferences, 211
even in the case of the generic CArG box (Supplemental Figure 7). 212
Presence of additional nucleotides flanking the CArG-box region in the consensus motifs also 213
constitutes important information for TFBS characterization. TTN- and -NAA nucleotide 214
sequences were the prevalent flanking sites of the CArG-box motif. Moreover, the analysis of the 215
12-mer fragment enrichment revealed that MADS-domain protein complexes bound non-CArG-216
box-like sequences. Examples of these sequences are listed in Supplemental Table 2 (confirmed 217
by EMSA-based, quMFRA experiments provided in Supplemental Data Set 1) and detailed 218
tables with relative affinities to all k-mers are available from the Gene Expression Omnibus 219
database. Sequences with high relative affinity to AG complexes showed the strongest deviation 220
from the ‘generic’ CArG-box consensus, e.g. seq1 for AG, or seq5 for the SEP3–AG complex. 221
On the other hand, sequences with high relative affinity to AP1 complexes, e.g. seq15 for SEP3–222
AP1 or seq12 for AP1–AP1, showed similarity to the generic CArG-box. This shows that TFBSs 223
for AG and its heteromeric complexes differ strongly from the consensus CArG-box. 224
Because we found previously an association of DNA binding affinity of MADS-domain TFs 225
with structural parameters (e.g. narrower minor DNA groove width) (Muino et al., 2014), we 226
used the DNAshape tool to predict DNA structural features (Zhou et al., 2013) for our SELEX-227
seq data. This allowed us, for example, to compare DNA minor groove width (MGW) between 228
sequence clusters bound by different protein complexes (Figure 2C). We observed narrowing of 229
the DNA minor groove for sequences more specific to SEP3 or SEP3–AP1 complexes (cluster 230
3), while the MGW for sequences specific for AG alone was the widest. Moreover the roll and 231
helix twist are the two DNA shape properties that mainly differed between studied sequences in 232
clusters 1–3 (Supplemental Figure 8). Sequences with the longer A/T-rich region specific for 233
8
SEP3 and SEP3-AP1 complexes have narrower minor groove and higher roll values comparing 234
to sequences with the shorter A/T-rich region specific for AG (p < 0.05; t-test). These parameters 235
determine intrinsic bending of the DNA (Dickerson, 1998; Rohs et al., 2009), which supports the 236
idea that the bend level of the DNA might play a role in the DNA sequence selection by MADS-237
domain TFs. This phenomenon was observed before for other types of TFs (Nelson and 238
Laughon, 1990; Kneidl et al., 1995; Stella et al., 2010). The increased degree of bending of the 239
pre-bent DNA sequence towards the minor groove was observed in the crystal structure of the 240
mammalian MADS-domain protein SRF bound to DNA (Pellegrini et al., 1995) and in in vitro 241
experiments on Arabidopsis MADS-domain TFs (Riechmann et al., 1996a; West et al., 1997; 242
West et al., 1998). Our results suggest that these parameters not only influence DNA recognition 243
by plant MADS-domain TFs in general, but also contribute to DNA-binding specificity. 244
245
Combining SELEX-seq and ChIP-seq to Predict Organ-Specific Targets of Floral 246
Homeotic Protein Complexes 247
Since SELEX-seq data can separate the DNA-binding specificities of different MADS-domain 248
protein complexes, we used this information to classify SEP3 binding events obtained by ChIP-249
seq (flower development stage 4–5 for SEP3, 4 days after induction) (Pajoro et al., 2014) 250
depending on whether they were bound by SEP3-AP1 or SEP3-AG complexes (Figure 3). We 251
chose these two dimers because they distinguish regulatory programs controlling the formation 252
of perianth (sepals, and in combination with AP3/PI petals) and reproductive organs (carpels, 253
and in combination with AP3/PI stamens), respectively. Among sequences bound by SEP3–AG 254
and SEP3–AP1 complexes in SELEX-seq we could distinguish the ones that are either more 255
SEP3–AG or SEP3–AP1 specific (Figure 3A) based on the identified consensus sequences from 256
Figure 2B. This prompted us to build a classifier to identify SEP3–AG or SEP3–AP1 specific 257
SEP3 in vivo DNA TFBSs. 258
To classify SEP3 binding events obtained by ChIP-seq depending on their specificity, we 259
defined a score function (see Methods) to annotate positions in the Arabidopsis genome based on 260
the SELEX-seq-inferred DNA-binding specificities. Comparison of available ChIP-seq data of 261
AP1 and AG (flower development stages 4–5 for AP1 and 5–6 for AG, 4 days and 5 days after 262
induction, respectively) (Ó’Maoiléidigh et al., 2013; Pajoro et al., 2014) within 1500 most 263
enriched SEP3 ChIP-seq peaks revealed that the score ratio for SELEX-seq derived peaks within 264
9
those SEP3 binding regions predicts complex specific TFBSs (Figure 3B and Supplemental Data 265
Set 3). In this manner, we were able to discriminate TFBSs that were either more SEP3–AP1 or 266
more SEP3–AG specifically bound in the Arabidopsis genome. We also showed that the SEP3–267
AG specific motif of cluster 1 (Figure 2B) overlaps very well with the AG-specific ChIP-seq 268
regions (Figure 3C). In addition, SEP3–AP1 highly bound sequences from cluster 3 and had 269
good overlap with AP1-specific ChIP-seq bound regions. Moreover, the AG-specific motif from 270
cluster 2 overlaps very well with the AG-specific ChIP-seq TFBSs. In general, the number of 271
specifically bound TFBSs is higher for AG than for AP1, which is linked to the presence of 272
specific DNA-binding motifs (Figure 3C). This analysis showed that in vivo DNA-binding 273
specificities of reproductive organ-specific complexes can be derived from the SELEX-seq data. 274
Inferring protein complexes that bind to specific genomic regions is difficult based on ChIP-275
seq data alone, since standard ChIP pulldowns usually detect all genomic TFBSs of a TF, 276
independent of its interaction partners. Combining ChIP-seq experiments of different TFs can 277
identify which TFs bind to the same regions, but not necessarily together in a complex. However, 278
SELEX-seq data allowed us to identify genomic regions that are directly bound by either SEP3–279
AG or by SEP3–AP1. To test whether protein complex-specific TFBSs drive expression in 280
specific floral whorls, we integrated the information from SELEX-seq data for SEP3–AG and 281
SEP3–AP1 with whorl-specific expression data (TRAP-seq) (Jiao and Meyerowitz, 2010). We 282
used genome aligned SEP3–AP1 and SEP3–AG SELEX-seq data to classify the 1500 most 283
enriched SEP3 ChIP-seq peaks. This analysis allows us to predict genomic regions that are 284
bound directly by a specific TF complex. As a result, the SELEX-based classification of SEP3–285
bound genomic regions correlated better with whorl-specific expression data, than a 286
classification based on SEP3, AP1 and AG ChIP-seq data (Figure 3D, Supplemental Figure 9). 287
To validate TFBSs obtained with SELEX-seq we focused on specific examples of regulatory 288
regions that are bound by MADS-domain proteins. The upstream promoter regions of SEP3 and 289
AP3 were predicted by our SELEX-seq approach as binding regions of several MADS-domain 290
proteins (Figure 4). The position of the SELEX-seq peaks is in very good agreement with the 291
position of ChIP-seq peaks for all MADS-domain homeotic TFs within those two regulatory 292
regions. One of these genomic regions, positioned 4.1 kb upstream in the promoter of SEP3, has 293
two TFBSs in close proximity that show differential affinity (represented by the SELEX peak 294
height) and binding specificity (not all heteromeric complexes bound to both TFBSs). These 295
10
results are in agreement with our previous in vitro EMSA studies of the same regulatory 296
fragment (Smaczniak et al., 2012b) where these two TFBSs showed variable binding efficiencies 297
for different MADS-domain protein complexes with one site being superior in importance to the 298
other. Also in the AP3 promoter, two previously characterized TFBSs (Tilly et al., 1998; Jetha et 299
al., 2014), positioned close to the transcriptional start site showed differential binding affinity. In 300
the same region, a third potential CArG-box has been identified (Tilly et al., 1998), but it is not 301
predicted to be bound by MADS-domain complexes in SELEX-seq, in agreement with other in 302
vitro data (Tilly et al., 1998; Jetha et al., 2014). 303
The SELEX-seq data mapped to the genome (Figure 4) allowed us to predict TFBSs at much 304
higher resolution compared to the ChIP-seq data alone. Because of the high resolution we 305
estimated a preferred distance between located TFBSs. For all analyzed complexes we showed 306
that the preferred distance was around 62 bp (6 helical turns, visualized as the first peak from the 307
center and a red vertical line) (Figure 5). Moreover, we could see that for SEP3–AG this was the 308
only distance observed, while for SEP3–SEP3 complex TFBSs were also separated by longer 309
DNA stretch of around 210 bp. For SEP3–AP1, no strong preferences between more distal 310
TFBSs were clearly defined. This suggests that SEP3–AG and SEP3–AP1 complexes prefer to 311
act in different DNA loop size configurations, with one complex being stricter in the loop size 312
than the other is. 313
314
Activity of the AP3 Promoter with Altered DNA TFBSs 315
Combining information from SELEX-seq and ChIP-seq experiments to predict complex-specific 316
TFBSs in the Arabidopsis genome suggested that TFBSs could be potentially engineered in vivo 317
to modulate gene regulation by MADS-domain factors. To test this hypothesis we altered the 318
AP3 promoter (pAP3_wt) in a reporter construct that contained two CArG-boxes (Figure 6). We 319
chose the AP3 promoter for our in vivo and in vitro experiments as it is expressed in in a floral 320
whorl-specific manner (whorl 2 and whorl 3). The AP3 promoter is relatively short (0.9 kb) and 321
well characterized (Tilly et al., 1998; Jetha et al., 2014), with two identified MADS TFBSs. The 322
genomic region harboring the two functional CArG boxes is bound by MADS-domain proteins 323
in vivo (Figure 4). These properties of the AP3 promoter make it ideal for mutagenesis studies 324
because the compensation effects coming from other potential TFBSs would be minimized. The 325
activity of the AP3 promoter, when the two CArG-boxes were completely mutated (pAP3_mut), 326
11
did not respond to the presence of the effectors (SEP3–AP1 or SEP3–AG protein complex). 327
Modifying the two CArG-boxes in the AP3 promoter based on the SELEX-seq data as having 328
more affinity towards SEP3–AG (pAP3_(SEP3-AG)) protein complex moderately altered the 329
activity of the AP3 promoter in response to the effectors (Figure 6A and 6B). The activity of the 330
pAP3_(SEP3-AG) promoter comparing to the pAP3_wt promoter increased when SEP3-AG 331
effector was used and decreased when SEP3–AP1 effector was present. The in vitro binding 332
strength of corresponding MADS-domain protein complexes to modified AP3 promoters was 333
stronger comparing to wt AP3 promoter in case of the SEP3–AG complex and weaker in case of 334
the SEP3–AP1 complex (Figure 6C). 335
The importance of these two TFBSs in the AP3 promoter was visualized by dual luciferase 336
activity assays in protoplasts and in in planta fluorescent reporter assays. These two approaches 337
showed that the expression of the mutated AP3 promoter without any functional CArG-box is 338
higher (Figure 6D) and not restricted to whorl 2 and 3 – it is detected throughout early stages of 339
flower development and the inflorescence meristem (Figure 6E). This is in agreement with the 340
previously reported GUS expression patterns (Tilly et al., 1998), and suggests that the CArG-341
boxes have also repressive roles, restricting AP3 expression to the developing flower. Thus, 342
those CArG-boxes are likely bound by other MADS-domain TF complexes as well, such as 343
those that are expressed in the inflorescence meristem. Indeed, studies have shown that SOC1 344
and SVP (Gregis et al., 2009; Immink et al., 2012) directly repress AP3 expression in the 345
inflorescence meristem and early floral meristem, providing an explanation of these results. 346
Moreover, the activity of the promoter with TFBSs more specific to the SEP3–AG complex 347
showed a spatial enhancement of the fluorescent signal in whorl 4 of the flower (Figure 6E). 348
Taken together, these results indicate that MADS-domain protein heteromeric complexes have 349
partly different DNA-binding specificities and consequently, this affects their target gene 350
specificity. 351
352
DISCUSSION 353
The variety of functions of MADS-domain TFs in the life cycle of Arabidopsis thaliana suggests 354
that they may regulate different sets of target genes. How exactly MADS-domain TFs achieve 355
their functional specificity is not yet fully understood. Here, we showed that part of the 356
functional specificity can relate to DNA binding. By making use of the SELEX-seq approach, we 357
12
were able to distinguish common and specific TFBSs for several key floral homeotic MADS-358
domain TF complexes. Moreover, we presented here that differential binding of MADS-domain 359
protein complexes to their TFBSs plays a role in specificity of target gene regulation. 360
361
DNA-Binding Specificities of Floral Homeotic Proteins Predict Organ-Specific Targets 362
Recent high throughput approaches applied to large number of plant TFs aimed to identify their 363
in vitro DNA-binding specificities by using protein binding microarrays (Franco-Zorrilla et al., 364
2014) or a modification of the SELEX, DNA affinity purification sequencing (O'Malley et al., 365
2016), did not focus on MADS-domain proteins. Moreover, these studies did not take into 366
consideration the impact of TF heterodimerization on DNA TFBS selection. SELEX-seq allows 367
the systematic and sensitive characterization of DNA-binding properties of TFs. Previously, 368
‘classical’ SELEX followed by Sanger sequencing was used to study DNA-binding properties of 369
several plant MADS-domain TFs (Huang et al., 1993; Huang et al., 1995; Huang et al., 1996; 370
Tang and Perry, 2003). Some features of the consensus sequences found in those studies are 371
common to our SELEX-seq derived consensus. For example, for most MADS-domain TFs the 372
consensus motif resembles the putative CArG-box similar to our SELEX-seq motifs, and they 373
also contain specific nucleotides in the flanking regions. Moreover, we identified additional 374
TFBSs that strongly deviate from the consensus sequence. Although position weight matrices or 375
consensus sequences give substantial information on the DNA sequence characteristics that 376
determine TF binding, they fail, for example, to visualize the dependencies between nucleotide 377
positions or, most importantly, the affinities to particular DNA structures. Our analysis shows 378
that there are differences in DNA binding between selected MADS-domain homo- and hetero-379
complexes and support the concept that different MADS-domain protein complexes could bind 380
overlapping sets of TFBSs, although with different affinities, which would allow for active 381
competition between different MADS-domain protein complexes for the same target genes in 382
vivo. Floral homeotic complexes containing the AG protein, responsible for stamen and carpel 383
specification, have distinct DNA-binding preferences that result in the regulation of a specific set 384
of target genes. On the other hand, other target genes are commonly bound by different MADS-385
domain proteins but may be antagonistically regulated in perianth and reproductive organs in the 386
flower (Yan et al., 2016). 387
388
13
DNA Structure Affects MADS TF Binding Specificity 389
MADS-domain TFs bind DNA through interactions of the N-terminal part of the MADS domain 390
with the CArG-box A/T-rich region of the minor groove (Pellegrini et al., 1995) causing 391
substantial bending of the DNA. Although full crystal structures for plant MADS-domains are 392
not available, it was shown that plant MADS-domain proteins – similar to MADS proteins from 393
animals and yeast – bend the DNA (Riechmann et al., 1996a; Melzer et al., 2009). Additionally, 394
it was reported that DNA-bending by MADS-domain complexes could be DNA-sequence 395
specific (West et al., 1997; West et al., 1998), which supports the importance of the DNA 396
sequence in the regulation of a protein-DNA complex structure. The determinants of this 397
characteristic DNA-binding are not well understood. Muiño et al. showed that there is an 398
enrichment of a particular DNA structural pattern in regions bound by MADS-domain TFs 399
(Muino et al., 2014). Recently, large scale DNA structure predictions for human and plant 400
MADS-domain ChIP-seq TF binding data showed that combining DNA sequence information 401
with DNA shape features improves prediction of TF-bound genomic regions in vivo (Mathelier 402
et al., 2016). In their analysis, authors captured the importance of the MGW and Propeller Twist 403
(ProT) in selection of TFBSs of MADS-domain proteins. Our DNA-shape predictions of the 404
SELEX-seq sequences show that two ProT maxima are present at the both ends of the CArG-box 405
motif corresponding to CC and GG flanks. 406
Intrinsic ‘shape’ properties of the DNA can influence DNA binding, especially of TFs that 407
mainly recognize the DNA via interactions with the minor groove (Rohs et al., 2009). The A/T-408
rich region of the CArG-box can be considered as the “A-tract” (Muino et al., 2014). It was 409
previously reported that A-tracts facilitate MADS-domain TF DNA binding when located inside 410
the CArG-box motif and periodically distributed around it (Muino et al., 2014). Based on our 411
SELEX-seq sequence motifs we can infer that differences in the length of the A/T-rich regions 412
influence TF binding specificity. For example, AG alone usually prefers to bind sequences with 413
shorter A/T-rich region while the SEP3 alone binds sequences with a long A/T-region. The 414
SEP3-AG heterodimer binds sequences with an intermediate A/T-region. This, and the predicted 415
structural characteristics of the DNA-binding motifs of MADS-domain TF dimers, suggests 416
DNA structure, especially the width of the CArG-box minor groove, plays a role in determining 417
DNA-binding specificity. Since DNA bending (Haran and Mohanty, 2009) induced by the TF 418
14
dynamically changes conformation of promoters and juxtaposition of regulatory regions, this can 419
play a prominent role in MADS-DNA binding and gene regulation. 420
421
The Role of TF Multimerization in MADS DNA-Binding Specificity 422
According to the floral quartet model (Theissen and Saedler, 2001), quaternary MADS-domain 423
TF complexes bend the DNA in order to bind simultaneously two different TFBSs. The exact 424
nature of this binding and a possible regulation by formation of DNA regulatory loops by TFs is 425
not fully explored. Recent in vitro studies on homotetrameric complexes by SEP1, SEP2, SEP3 426
or SEP4 proteins from Arabidopsis shows their ability to bind cooperatively to DNA as tetramers 427
and loop the DNA, with different preferences for TFBS spacing (Jetha et al., 2014). The role of 428
TFBS spacing in DNA-binding specificity of MADS protein complexes is supported by the 429
distribution of TFBSs predicted by SELEX-seq within genomic regions bound in vivo. We 430
observed that spacing between SEP3/AG-specific TFBSs is more defined compared to that of 431
SEP3-AP1 TFBSs. This might suggest that some of the functional specificity comes from the 432
coordinated, simultaneous interaction of MADS-domain proteins with more than one cis-433
regulatory element in the genome. Since MADS-domain protein complexes can induce bending 434
to a different degree (Pellegrini et al., 1995; Riechmann et al., 1996a; West et al., 1997; Huang et 435
al., 2000) it also might be that stronger bending induced by SEP3-AG evokes interaction 436
between two TFBSs at shorter distances when the floral quartet is formed while a potentially 437
mild bending induced by the SEP3-AP1 protein complex does not impose such a restriction. In 438
summary, the commonly observed presence of more than one TFBS identified by SELEX-seq 439
within a single ChIP-seq peak suggest a role of these TFBSs in recruitment of specific higher-440
order complexes and robust regulation of the target gene expression. 441
442
Towards Understanding the Specificity of Floral Homeotic Protein Complexes 443
Since different MADS-domain proteins bind overlapping but not identical sets of genomic 444
regions, so part of the functional specificity of MADS-domain TFs is attributed to DNA-binding 445
specificity (Yan et al., 2016). High throughput in vivo DNA binding experiments showed that 446
MADS-domain proteins bind the DNA in places that lack canonical CArG-box (or CArG-box-447
like) sequences (Kaufmann et al., 2009; Zheng et al., 2009; Kaufmann et al., 2010; Deng et al., 448
2011). AG and AG-SEP3 bind DNA-sequence elements that resemble – if at all – CArG boxes 449
15
with very short A/T rich regions in the center. Comparing SELEX-seq with ChIP-seq binding 450
profiles, we were able to unravel part of the cis-regulatory code for specification of perianth vs. 451
reproductive organs. We showed that AG and AG–SEP3 complexes have unique DNA-binding 452
preferences that can be linked to a specific set of genomic regions bound in vivo, for example: 453
HECATE 2 (HEC2), which plays an important role in the gynoecium development (Gremski et 454
al., 2007) or CAPRICE (CPC) (Schellmann et al., 2002) and GLABRA3 (GL3) (Payne et al., 455
2000) that regulate trichome development, a process that is repressed by AG in carpels 456
(Ó’Maoiléidigh et al., 2013). In line with these specific functions, only 12.4% of the AG-bound 457
genomic regions are also bound by MADS-domain TFs that act during vegetative development 458
or floral transition (FLC, SVP, SOC1), while ~17% of the AP1-bound genomic regions show an 459
overlap with these factors (D. Chen, personal communications). 460
Targeted in vivo immunoprecipitation experiments of several MADS-domain TFs and other 461
studies revealed that MADS-domain proteins can form larger complexes with other 462
transcriptional regulators (Brambilla et al., 2007; Simonini et al., 2012; Smaczniak et al., 2012b), 463
and as such could bind the DNA. Whether the presence of other cofactors or other TFs that 464
interact with the MADS-domain TFs can modulate DNA-binding specificity remains an 465
important question to be resolved in future studies. Similarly, the chromatin status may have an 466
impact on recruitment of specific complexes. 467
Our in vivo AP3 promoter studies suggest that the expression of MADS-domain TF target 468
genes could be modulated by single nucleotide changes in their regulatory regions. In agreement 469
with these results, it was shown that differences between the spatio-temporal level of expression 470
of AP1 and CAULIFLOWER (CAL), two recent Arabidopsis duplicated genes, was determined 471
by the presence or absence of individual, functionally important TFBSs in regulatory regions (Ye 472
et al., 2016). Together, the findings suggest that changes in individual MADS-domain TFBSs 473
contribute to regulating spatio-temporal levels of target gene expression, but do not provide a 474
strict on/off regulation of target gene expression. 475
476
METHODS 477
SELEX-seq 478
The dsDNA libraries were made from the ssDNA sequences by single-cycle PCR amplification 479
with a complementary primer essentially as described before (Jolma et al., 2010). The dsDNA 480
16
libraries contained 20 random nucleotide fragment flanked by specific barcodes that allowed for 481
later characterization when multiplexed in high-throughput sequencing. The dsDNA libraries 482
contained all necessary features required for direct sequencing with an Illumina platform (Jolma 483
et al., 2010). 484
Protein dimers were synthesized using TNT SP6 Quick Coupled Transcription/Translation 485
System (Promega) following the manufacturer’s instructions in a total volume of 20 µl and 486
equimolar expression plasmid concentrations. The binding reaction mix was prepared essentially 487
as described previously for EMSA experiments (Egea-Cortines et al., 1999; Smaczniak et al., 488
2012b) and contained 20 µl of in vitro-synthesized proteins and 50–100 ng of dsDNA library in a 489
total volume of 120 µl. The binding reaction was incubated on ice for 1 h followed by 1 h 490
immunoprecipitation with protein specific antibodies (affinity-purified, polyclonal peptide 491
antibodies, Eurogentec) coupled to magnetic beads (MyOne, Invitrogen) in a thermomixer at 4 492
°C with constant mixing at 700 rpm. Antibodies were raised against the protein-specific C-493
terminal domain (the C-domain) or last part of the K-domain of MADS-domain proteins. These 494
domain parts are not responsible for protein-DNA (the M-domain is) or protein-protein 495
interactions (the I- and first part of K-domain are), rather they are assumed to stabilize higher-496
order protein complex formation (Kaufmann et al., 2005). Therefore we expected no influence of 497
antibody-protein interaction on studied protein-DNA interactions. The following synthetic 498
peptides were used for immunizations: LNPNQEEVDHYGRHH for SEP3; 499
EQWDQQNQGHNMPPPLPPQQ for AP1; PPQTQSQPFDSRNYFC and 500
QPNNHHYSSAGRQDQT for AG. The SEP3 antibodies were previously used in ChIP-seq 501
experiments (Kaufmann et al., 2009; Pajoro et al., 2014). Magnetic beads with attached 502
antibodies where prepared in advance according to the manufacturer’s instructions (MyOne, 503
Invitrogen) with purified antibodies resuspended in 1X PBS (~1 mg/ml); 20 µg of antibodies and 504
0.5 mg of beads were used for a single binding reaction. After immunoprecipitation, beads were 505
washed 5 times with 150 µl of binding buffer without salmon-sperm DNA and bound DNA was 506
eluted with 50 µl 1X TE in a thermomixer at 90 °C with full mixing speed. Afterwards, magnetic 507
beads were immobilized and the supernatant was transferred to a 1.5-ml tube. DNA fragments 508
were amplified with 10 to 15 cycles of PCR with SELEX round-specific primers (Jolma et al., 509
2010) and the total amplicon was used in the subsequent SELEX round. The amplification 510
efficiency was checked on the agarose gel by comparing to a known concentration of a standard 511
17
probe. Samples for sequencing, after amplification, were cut out from agarose gel and purified 512
using MinElute Gel Extraction Kit (Qiagen). Different libraries were multiplexed by mixing in 513
equimolar amounts with 40% PhiX Control (Illumina) in the Elution Buffer (Qiagen) and 514
sequencing was performed on the HiSeq 2000 or GAII sequencers (Illumina). 515
516
Obtaining Relative Affinities from the SELEX-seq Data. 517
Sequence reads that did not pass the filter quality of CASAVA 1.8 or mapped with no 518
mismatches to the phix174 genome were eliminated. The remaining sequences in fasta format 519
were extracted and grouped according to library specific barcodes, allowing no mismatches. 520
Barcodes were removed leading to 20 bp sequence libraries used in the data analysis. The 20 bp 521
sequences that were present in libraries in an unexpected high number (>1000) were eliminated, 522
as well as 20 bp reads containing the sequence “TCGTATGCCG” which is part of the Illumina 523
adapter sequence used for sequencing. 524
Data analysis was essentially performed as described before (Slattery et al., 2011). Because 525
the number of sequenced reads in each SELEX round (Supplemental Table 1) is much smaller 526
than the 1012
possible different 20-mer sequences, we calculated which k-mer length should be 527
used to obtain the maximum gain in information. For this, we computed the information gain 528
(the Kullback–Leibler divergence of Round 3 relative to Round 0) for each k-mer length. For 529
most of libraries the length 12 bp was the most optimal (Supplemental Figure 3), therefore, we 530
based the further analysis on the 12-mer sequences. 531
Frequencies of 12-mer sequences in each round except Round 0 was calculated directly from 532
the data using the function oligonucleotideFrequency from the Bioconductor R package: 533
Biostrings. The 12-mer sequences that were present in libraries in an unexpected high number 534
(>1,000) were eliminated at this step. 535
Sequences in Round 0 represent a set of randomly synthetized oligonucleotides and their 536
complexity did not allow for the direct calculation of 12-mer frequencies. Therefore, the 537
sequence frequency in Round 0 was estimated by the sixth-order Monte Carlo model, as 538
proposed before (Slattery et al., 2011). We chose the sixth-order Monte Carlo model, because 539
when the model was trained using 75% of the sequencing data, it resulted in the highest 540
prediction value as measured by the Pearson correlation coefficient between the predicted and 541
observed frequencies in the other 25% of the sequencing data. 542
18
Relative affinity for each possible 12-mer was calculated as the ratio between the frequencies 543
of 12-mers in Round 3 to Round 0, and normalized to 1 by dividing for the highest affinity-544
predicted 12-mer. These affinities can be download in Excel file format from the GEO omnibus 545
submission GSE95730. 546
547
Combining SEP3 ChIP-seq and SELEX-seq Data to Predict MADS-domain Dimer TFBSs 548
To in silico predict genomic regions bound by a given MADS-domain dimer based on our 549
SELEX-seq experiments, we obtained the affinity value for each k-mer of length 13 bp. For each 550
studied MADS-domain protein complex, we estimated the average affinity value to a particular 551
13-mer sequence and its reverse complementary sequence. We chose 13-mers instead of 12-mers 552
for technical reasons: the mapping software ‘soapv2’ (Li et al., 2009) was only able to map 553
sequences with a minimum length of 13 bp. We created fasta files for each library containing 13-554
mer sequences in a number equal to their estimated relative affinity multiplied by 100 and 555
rounded up to the closest integer (e.g. a sequence with the relative affinity of 0.98 was present in 556
98 copies in the fasta file). Next, we mapped these fasta files to the TAIR10 genome with 557
‘soapv2’ (Li et al., 2009) allowing no mismatches. Later, the function mappedReads2Nhits from 558
the peak caller CSAR (Muino et al., 2011) was used to identify enriched regions and their 559
associated score, extending the 13 bp reads and using no control since the relative affinities were 560
already corrected by the Round 0 enrichment. This resulted in a genomic SELEX-seq score 561
proportional to the affinity of the sum of 13-mers located on the particular genomic region. 562
To relate ChIP-seq scores with the SELEX-seq scores obtained by CSAR, we reanalyzed the 563
SEP3 (4 day after induction, stage 4-5 of flower development) (Pajoro et al., 2014), AP1 (4 day 564
after induction, stage 4-5 of flower development) (Pajoro et al., 2014), and AG (5 day after 565
induction, stage 5–6 of flower development) (Ó’Maoiléidigh et al., 2013) publically available 566
ChIP-seq data using CSAR (default parameters expect the backg that was set to 3). Next, we 567
linked the ChIP-seq and SELEX-seq peak to the SEP3 ChIP-seq when their distance was shorter 568
than 500 bp. Only the top 1500 SEP3 ChIP-seq peaks were considered. Quantile normalization 569
was used to normalize the SELEX-seq peaks scores, and independently to normalize the ChIP-570
seq peak scores aligned to the SEP3 ChIP-seq peaks. When it was needed, a TFBS was 571
associated to a gene when this was located 3 kb upstream or 1 kb downstream of a gene. The 572
19
method to classify SEP3 TFBSs showed good precision/recall ratios compared to the random 573
classifier (Supplemental Figure 10). 574
575
Other Bioinformatic Analyses 576
Position weight matrices were calculated based on the extracted 20N sequences containing 12-577
mers analyzed in the heatmap with the GADEM algorithm (Li, 2009) and DNA sequence logos 578
were built with the ‘seqLogo’ R script. 579
We used the Find Individual Motif Occurrences tool (‘FIMO’; version 4.10.2) (Grant et al., 580
2011) to scan the promoter region (TSS +3000 bp ~ -1000 bp) of genes for TFBSs using the 581
binding motifs generated from SELEX-seq data (with a p-value threshold of 10e-5 and defaults 582
for other parameters). A potential interaction was assigned if there was at least one TFBS in the 583
promoter of a gene. Significant enrichment of specific and common ChIP-seq target genes for 584
AP1 (Pajoro et al., 2014) and AG (Ó’Maoiléidigh et al., 2013) that overlapped with the different 585
SELEX-seq clusters was evaluated by a hypergeometric test (using the “phyper” function in R). 586
587
EMSA, Kd Estimation and QuMFRA 588
SELEX-derived sequences for EMSAs were produced by PCR with biotin-labeled or IR-589
fluorophore-labeled primers and purified from 2% agarose gel. Electrophoretic mobility shift 590
assays were performed essentially as described before (Smaczniak et al., 2012b). The detection 591
was performed with the Odyssey Scanner (LI-COR) for IR-labeled fragments or with the 592
Chemiluminescent Nucleic Acid Detection Module Kit (Pierce) for biotin-labeled fragments. 593
Absolute dissociation constants (Kd) for various protein complex-DNA interactions was 594
performed with the EMSA-based method, essentially as described before (Riechmann et al., 595
1996a). The constant amount of the protein complex was used with the various, known 596
concentrations of the IR-fluorophore-labeled DNA. 597
QuMFRA between selected 20 bp SELEX-seq-derived DNA fragments was performed as in 598
(Muino et al., 2014). DNA fragments were labeled with different fluorophores (Cy3/Cy5 or 599
Dy682/Dy782) and binding quantification was recorder with the Molecular Imager (Bio-Rad) 600
and Odyssey Scanner (Li-Cor). 601
602
Plant Materials and Growth Conditions 603
20
In our in vivo assays for AP3 promoter activity by fluorescent protein reporter, we used 991 bp 604
AP3 promoter region and we modified the known MADS-domain TFBSs according to the 605
SELEX-seq-inferred DNA sequences. Modifications were done following the highest relative 606
affinity towards SEP3-AG protein complex (pAP3_(SEP3-AG)) and by mutating the CArG-607
boxes altogether (pAP3_mut). Modifications to the AP3_wt promoter were introduced by PCR 608
and the promoter constructs were cloned into the Gateway entry vector pCR8/GW/TOPO 609
(Invitrogen) followed by a sub-cloning via Gateway LR reaction into the destination vector 610
pGREEN:GW:NLS-GFP (Horstman et al., 2015). The pAP3_x:GFP constructs were transformed 611
into Col-0 plants using floral dip method (Clough and Bent, 1998). Plants were grown under 612
long day conditions (16 h of light/8 h of dark at 20-22°C) on soil/vermiculite mix (3:1; vol:vol) 613
in a greenhouse. Plant inflorescences were studied by confocal microscopy (Zeiss LSM 510). 614
615
AP3 Promoter Activity Assays 616
In our in vivo AP3 promoter activity experiments by dual-LUC assays (Dual-Luciferase Reporter 617
Assay System from Promega), the same promoter region of the AP3 gene was used and the same 618
modifications were studied as in the fluorescent protein reporter assays. Promoters were inserted 619
in the front of the firefly luciferase CDS and together with the control Renilla luciferase CDS 620
and corresponding effector protein complex CDS, transfected into Arabidopsis thaliana 621
protoplasts cells (Diaz-Trivino et al., 2017). Protoplasts were isolated from young leaves 622
essentially as reported before (Yoo et al., 2007). Relative value of the signal between firefly 623
LUC and Renilla LUC was calculated and normalized to the relative signal of the sample without 624
the effector protein complex. The resulting value corresponds to the activity of the promoter in 625
the Arabidopsis protoplasts cells in arbitrary units. 626
627
Accession Numbers 628
Nucleotide sequence data are available from the NCBI/GenBank data library under the following 629
accession numbers: NM_102272 (SEP3), NM_105581 (AP1), NM_118013 (AG), NM_115294 630
(AP3), NM_122031 (PI). Accession numbers used in the Supplemental Data Sets are from the 631
TAIR10 annotation of the Arabidopsis genome (www.arabidopsis.org). SELEX-seq raw 632
sequencing data are available from the Gene Expression Omnibus database under accession 633
number GSE95730. 634
21
635
Supplemental Data 636
Supplemental Figure 1. EMSA analysis of homo- and heteromeric complexes bound to the 637
SELEX DNA from round 5. 638
Supplemental Figure 2. SELEX-seq reproducibility. 639
Supplemental Figure 3. Optimal lengths of randomized fragments that sufficiently predict the 640
specificity of bound MADS-domain complexes. 641
Supplemental Figure 4. DNA-binding specificities of MADS-domain TF complexes in cluster 642
3. 643
Supplemental Figure 5. Comparison of relative binding affinities for MADS-domain protein 644
complexes obtained by different methods. 645
Supplemental Figure 6. Comparison of the absolute binding affinity (dissociation constants, 646
Kd) by EMSA with the relative binding affinity by SELEX-seq. 647
Supplemental Figure 7. Enrichment of individual consensus CArG-boxes by MADS-domain 648
TF complexes. 649
Supplemental Figure 8. DNA structure predictions of the motifs from clusters 1-3 and 650
subclusters 3a-c depicting an average value of the four DNA structural properties at each 651
dinucleotide step in the sequences. 652
Supplemental Figure 9. Comparison of the SELEX-seq and ChIP-seq data with the whorl 653
specific expression data. 654
Supplemental Figure 10. Evaluation of the AP1/AG classifier. 655
Supplemental Table 1. Summary of the high-throughput sequencing analysis. 656
Supplemental Table 2. Sequences of selected 12-mers and their corresponding 20N most 657
representative dsDNA SELEX library sequences. 658
Supplemental Table 3. The experimental DNA sequence setup and the number of observations 659
for quMFRA experiments. 660
Supplemental Table 4. Wt and modified sequences of the AP3 promoter used in EMSA and 661
quMFRA. 662
Supplemental Data Set 1. Results of the EMSA-based quMRFA experiments. 663
Supplemental Data Set 2. Measurement of DNA-binding affinity of selected protein complexes 664
for selected SELEX-seq DNA probes. 665
22
Supplemental Data Set 3. Comparison of the top 1500 SEP3 ChIP-seq TFBSs compared to the 666
AG and AP1 ChIP-seq data and the SELEX-seq data for SEP3-AG vs. SEP3-AP1 dimer. 667
Supplemental Data Set 4. The output of the ‘FIMO’ and ‘phyper’ analysis for the prediction of 668
specific and common ChIP-seq target genes for AG and AP1, based on the SELEX-seq complex-669
specific motifs. 670
671
AUTHOR CONTRIBUTIONS 672
CS, JMM, GCA, and KK designed the research. CS performed the experiments. JMM and DC 673
analyzed the data. GCA and KK supervised the project. CS, JMM, and KK wrote the manuscript. 674
All authors discussed the results and commented on the manuscript. 675
676
ACKNOWLEDGMENTS 677
We would like to thank Arttu Jolma and Jussi Taipale for providing barcoded ssDNA libraries 678
and sharing details on the dsDNA library preparation and amplification; Markus Schmid, and 679
Elio Schijlen for generating the Illumina GAII and HiSeq data; Anneke Horstman for providing 680
pGREEN:GW:NLS-GFP reporter vector; and Ikram Blilou for providing dual-LUC reporter 681
vectors. This work was supported by an NWO-VIDI grant to KK. JMM was supported by a DFG 682
grant (TRR175). 683
684
REFERENCES 685
Bemer, M., van Dijk, A.D., Immink, R.G., and Angenent, G.C. (2017). Cross-Family 686
Transcription Factor Interactions: An Additional Layer of Gene Regulation. Trends Plant Sci 22, 687
66-80.688
Brambilla, V., Battaglia, R., Colombo, M., Masiero, S., Bencivenga, S., Kater, M.M., and 689
Colombo, L. (2007). Genetic and molecular interactions between BELL1 and MADS box factors 690
support ovule development in Arabidopsis. Plant Cell 19, 2544-2556. 691
Clough, S.J., and Bent, A.F. (1998). Floral dip: a simplified method for Agrobacterium-mediated 692
transformation of Arabidopsis thaliana. Plant J 16, 735-743. 693
de Folter, S., and Angenent, G.C. (2006). trans meets cis in MADS science. Trends Plant Sci 11, 694
224-231.695
23
Deng, W., Ying, H., Helliwell, C.A., Taylor, J.M., Peacock, W.J., and Dennis, E.S. (2011). 696
FLOWERING LOCUS C (FLC) regulates development pathways throughout the life cycle of 697
Arabidopsis. Proc Natl Acad Sci USA 108, 6680-6685. 698
Diaz-Trivino, S., Long, Y., Scheres, B., and Blilou, I. (2017). Analysis of a Plant Transcriptional 699
Regulatory Network Using Transient Expression Systems. Methods Mol Biol 1629, 83-103. 700
Dickerson, R.E. (1998). DNA bending: the prevalence of kinkiness and the virtues of normality. 701
Nucleic Acids Res 26, 1906-1926. 702
Egea-Cortines, M., Saedler, H., and Sommer, H. (1999). Ternary complex formation between the 703
MADS-box proteins SQUAMOSA, DEFICIENS and GLOBOSA is involved in the control of 704
floral architecture in Antirrhinum majus. EMBO J 18, 5370-5379. 705
Franco-Zorrilla, J.M., Lopez-Vidriero, I., Carrasco, J.L., Godoy, M., Vera, P., and Solano, R. 706
(2014). DNA-binding specificities of plant transcription factors and their potential to define 707
target genes. Proc Natl Acad Sci USA 111, 2367-2372. 708
Grant, C.E., Bailey, T.L., and Noble, W.S. (2011). FIMO: scanning for occurrences of a given 709
motif. Bioinformatics 27, 1017-1018. 710
Gregis, V., Sessa, A., Dorca-Fornell, C., and Kater, M.M. (2009). The Arabidopsis floral 711
meristem identity genes AP1, AGL24 and SVP directly repress class B and C floral homeotic 712
genes. The Plant Journal 60, 626-637. 713
Gremski, K., Ditta, G., and Yanofsky, M.F. (2007). The HECATE genes regulate female 714
reproductive tract development in Arabidopsis thaliana. Development 134, 3593-3601. 715
Haran, T.E., and Mohanty, U. (2009). The unique structure of A-tracts and intrinsic DNA 716
bending. Q Rev Biophys 42, 41-81. 717
Honma, T., and Goto, K. (2001). Complexes of MADS-box proteins are sufficient to convert 718
leaves into floral organs. Nature 409, 525-529. 719
Horstman, A., Fukuoka, H., Muino, J.M., Nitsch, L., Guo, C., Passarinho, P., Sanchez-Perez, G., 720
Immink, R., Angenent, G., and Boutilier, K. (2015). AIL and HDG proteins act antagonistically 721
to control cell proliferation. Development 142, 454-464. 722
Huang, H., Mizukami, Y., Hu, Y., and Ma, H. (1993). Isolation and characterization of the 723
binding sequences for the product of the Arabidopsis floral homeotic gene AGAMOUS. Nucleic 724
Acids Res 21, 4769-4776. 725
24
Huang, H., Tudor, M., Weiss, C.A., Hu, Y., and Ma, H. (1995). The Arabidopsis MADS-box 726
gene AGL3 is widely expressed and encodes a sequence-specific DNA-binding protein. Plant 727
Mol Biol 28, 549-567. 728
Huang, H., Tudor, M., Su, T., Zhang, Y., Hu, Y., and Ma, H. (1996). DNA binding properties of 729
two Arabidopsis MADS domain proteins: binding consensus and dimer formation. The Plant cell 730
8, 81-94. 731
Huang, K., Louis, J.M., Donaldson, L., Lim, F.L., Sharrocks, A.D., and Clore, G.M. (2000). 732
Solution structure of the MEF2A-DNA complex: structural basis for the modulation of DNA 733
bending and specificity by MADS-box transcription factors. EMBO J 19, 2615-2628. 734
Immink, R.G., Pose, D., Ferrario, S., Ott, F., Kaufmann, K., Valentim, F.L., de Folter, S., van der 735
Wal, F., van Dijk, A.D., Schmid, M., and Angenent, G.C. (2012). Characterization of SOC1's 736
central role in flowering by the identification of its upstream and downstream regulators. Plant 737
Physiol 160, 433-449. 738
Jetha, K., Theissen, G., and Melzer, R. (2014). Arabidopsis SEPALLATA proteins differ in 739
cooperative DNA-binding during the formation of floral quartet-like complexes. Nucleic Acids 740
Res 42, 10927-10942. 741
Jiao, Y., and Meyerowitz, E.M. (2010). Cell-type specific analysis of translating RNAs in 742
developing flowers reveals new levels of control. Mol Syst Biol 6, 419. 743
Jolma, A., Yin, Y., Nitta, K.R., Dave, K., Popov, A., Taipale, M., Enge, M., Kivioja, T., 744
Morgunova, E., and Taipale, J. (2015). DNA-dependent formation of transcription factor pairs 745
alters their binding specificity. Nature 527, 384-388. 746
Jolma, A., Kivioja, T., Toivonen, J., Cheng, L., Wei, G., Enge, M., Taipale, M., Vaquerizas, 747
J.M., Yan, J., Sillanpaa, M.J., Bonke, M., Palin, K., Talukder, S., Hughes, T.R., Luscombe, 748
N.M., Ukkonen, E., and Taipale, J. (2010). Multiplexed massively parallel SELEX for 749
characterization of human transcription factor binding specificities. Genome Res 20, 861-873. 750
Jolma, A., Yan, J., Whitington, T., Toivonen, J., Nitta, K.R., Rastas, P., Morgunova, E., Enge, 751
M., Taipale, M., Wei, G., Palin, K., Vaquerizas, J.M., Vincentelli, R., Luscombe, N.M., Hughes, 752
T.R., Lemaire, P., Ukkonen, E., Kivioja, T., and Taipale, J. (2013). DNA-binding specificities of 753
human transcription factors. Cell 152, 327-339. 754
25
Kaufmann, K., Melzer, R., and Theissen, G. (2005). MIKC-type MADS-domain proteins: 755
structural modularity, protein interactions and network evolution in land plants. Gene 347, 183-756
198. 757
Kaufmann, K., Muino, J.M., Jauregui, R., Airoldi, C.A., Smaczniak, C., Krajewski, P., and 758
Angenent, G.C. (2009). Target genes of the MADS transcription factor SEPALLATA3: 759
integration of developmental and hormonal pathways in the Arabidopsis flower. PLoS Biol 7, 760
e1000090. 761
Kaufmann, K., Wellmer, F., Muino, J.M., Ferrier, T., Wuest, S.E., Kumar, V., Serrano-Mislata, 762
A., Madueno, F., Krajewski, P., Meyerowitz, E.M., Angenent, G.C., and Riechmann, J.L. (2010). 763
Orchestration of floral initiation by APETALA1. Science 328, 85-89. 764
Kneidl, C., Dinkl, E., and Grummt, F. (1995). An intrinsically bent region upstream of the 765
transcription start site of the rRNA genes of Arabidopsis thaliana interacts with an HMG-related 766
protein. Plant Mol Biol 27, 705-713. 767
Krizek, B.A., and Meyerowitz, E.M. (1996). Mapping the protein regions responsible for the 768
functional specificities of the Arabidopsis MADS domain organ-identity proteins. Proc Natl 769
Acad Sci USA 93, 4063-4070. 770
Krizek, B.A., Riechmann, J.L., and Meyerowitz, E.M. (1999). Use of the APETALA1 promoter 771
to assay the in vivo function of chimeric MADS box genes. Sex Plant Reprod 12, 14-26. 772
Li, L. (2009). GADEM: a genetic algorithm guided formation of spaced dyads coupled with an 773
EM algorithm for motif discovery. J Comput Biol 16, 317-329. 774
Li, R., Yu, C., Li, Y., Lam, T.W., Yiu, S.M., Kristiansen, K., and Wang, J. (2009). SOAP2: an 775
improved ultrafast tool for short read alignment. Bioinformatics 25, 1966-1967. 776
Man, T.K., and Stormo, G.D. (2001). Non-independence of Mnt repressor-operator interaction 777
determined by a new quantitative multiple fluorescence relative affinity (QuMFRA) assay. 778
Nucleic Acids Res 29, 2471-2478. 779
Mathelier, A., Xin, B., Chiu, T.P., Yang, L., Rohs, R., and Wasserman, W.W. (2016). DNA 780
Shape Features Improve Transcription Factor Binding Site Predictions In Vivo. Cell Syst 3, 278-781
286 e274. 782
Melzer, R., and Theissen, G. (2009). Reconstitution of 'floral quartets' in vitro involving class B 783
and class E floral homeotic proteins. Nucleic Acids Res 37, 2723-2736. 784
26
Melzer, R., Verelst, W., and Theissen, G. (2009). The class E floral homeotic protein 785
SEPALLATA3 is sufficient to loop DNA in 'floral quartet'-like complexes in vitro. Nucleic 786
Acids Res 37, 144-157. 787
Muino, J.M., Kaufmann, K., van Ham, R.C.H.J., Angenent, G.C., and Krajewski, P. (2011). 788
ChIP-seq Analysis in R (CSAR): An R package for the statistical detection of protein-bound 789
genomic regions. Plant Methods 7. 790
Muino, J.M., Smaczniak, C., Angenent, G.C., Kaufmann, K., and van Dijk, A.D. (2014). 791
Structural determinants of DNA recognition by plant MADS-domain transcription factors. 792
Nucleic Acids Res 42, 2138-2146. 793
Nelson, H.B., and Laughon, A. (1990). The DNA binding specificity of the Drosophila fushi 794
tarazu protein: a possible role for DNA bending in homeodomain recognition. New Biol 2, 171-795
178. 796
O'Malley, R.C., Huang, S.S., Song, L., Lewsey, M.G., Bartlett, A., Nery, J.R., Galli, M., 797
Gallavotti, A., and Ecker, J.R. (2016). Cistrome and Epicistrome Features Shape the Regulatory 798
DNA Landscape. Cell 165, 1280-1292. 799
Ó’Maoiléidigh, D.S., Wuest, S.E., Rae, L., Raganelli, A., Ryan, P.T., Kwasniewska, K., Das, P., 800
Lohan, A.J., Loftus, B., Graciet, E., and Wellmer, F. (2013). Control of reproductive floral organ 801
identity specification in Arabidopsis by the C function regulator AGAMOUS. Plant Cell 25, 802
2482-2503. 803
Pajoro, A., Madrigal, P., Muino, J.M., Matus, J.T., Jin, J., Mecchia, M.A., Debernardi, J.M., 804
Palatnik, J.F., Balazadeh, S., Arif, M., O'Maoileidigh, D.S., Wellmer, F., Krajewski, P., 805
Riechmann, J.L., Angenent, G.C., and Kaufmann, K. (2014). Dynamics of chromatin 806
accessibility and gene regulation by MADS-domain transcription factors in flower development. 807
Genome Biol 15, R41. 808
Parenicova, L., de Folter, S., Kieffer, M., Horner, D.S., Favalli, C., Busscher, J., Cook, H.E., 809
Ingram, R.M., Kater, M.M., Davies, B., Angenent, G.C., and Colombo, L. (2003). Molecular and 810
phylogenetic analyses of the complete MADS-box transcription factor family in Arabidopsis: 811
new openings to the MADS world. The Plant cell 15, 1538-1551. 812
Payne, C.T., Zhang, F., and Lloyd, A.M. (2000). GL3 encodes a bHLH protein that regulates 813
trichome development in arabidopsis through interaction with GL1 and TTG1. Genetics 156, 814
1349-1362. 815
27
Pellegrini, L., Tan, S., and Richmond, T.J. (1995). Structure of serum response factor core bound 816
to DNA. Nature 376, 490-498. 817
Pollock, R., and Treisman, R. (1990). A sensitive method for the determination of protein-DNA 818
binding specificities. Nucleic Acids Res 18, 6197-6204. 819
Riechmann, J.L., and Meyerowitz, E.M. (1997). Determination of floral organ identity by 820
Arabidopsis MADS domain homeotic proteins AP1, AP3, PI, and AG is independent of their 821
DNA-binding specificity. Mol Biol Cell 8, 1243-1259. 822
Riechmann, J.L., Wang, M., and Meyerowitz, E.M. (1996a). DNA-binding properties of 823
Arabidopsis MADS domain homeotic proteins APETALA1, APETALA3, PISTILLATA and 824
AGAMOUS. Nucleic Acids Res 24, 3134-3141. 825
Riechmann, J.L., Krizek, B.A., and Meyerowitz, E.M. (1996b). Dimerization specificity of 826
Arabidopsis MADS domain homeotic proteins APETALA1, APETALA3, PISTILLATA, and 827
AGAMOUS. Proc Natl Acad Sci USA 93, 4793-4798. 828
Rohs, R., West, S.M., Sosinsky, A., Liu, P., Mann, R.S., and Honig, B. (2009). The role of DNA 829
shape in protein-DNA recognition. Nature 461, 1248-1253. 830
Schellmann, S., Schnittger, A., Kirik, V., Wada, T., Okada, K., Beermann, A., Thumfahrt, J., 831
Jurgens, G., and Hulskamp, M. (2002). TRIPTYCHON and CAPRICE mediate lateral inhibition 832
during trichome and root hair patterning in Arabidopsis. EMBO J 21, 5036-5046. 833
Schwarz-Sommer, Z., Huijser, P., Nacken, W., Saedler, H., and Sommer, H. (1990). Genetic 834
Control of Flower Development by Homeotic Genes in Antirrhinum majus. Science 250, 931-835
936. 836
Schwarz-Sommer, Z., Hue, I., Huijser, P., Flor, P.J., Hansen, R., Tetens, F., Lonnig, W.E., 837
Saedler, H., and Sommer, H. (1992). Characterization of the Antirrhinum floral homeotic 838
MADS-box gene deficiens: evidence for DNA binding and autoregulation of its persistent 839
expression throughout flower development. EMBO J 11, 251-263. 840
Simonini, S., Roig-Villanova, I., Gregis, V., Colombo, B., Colombo, L., and Kater, M.M. (2012). 841
Basic pentacysteine proteins mediate MADS domain complex binding to the DNA for tissue-842
specific expression of target genes in Arabidopsis. Plant Cell 24, 4163-4172. 843
Slattery, M., Riley, T., Liu, P., Abe, N., Gomez-Alcala, P., Dror, I., Zhou, T., Rohs, R., Honig, 844
B., Bussemaker, H.J., and Mann, R.S. (2011). Cofactor binding evokes latent differences in 845
DNA binding specificity between Hox proteins. Cell 147, 1270-1282. 846
28
Smaczniak, C., Immink, R.G., Angenent, G.C., and Kaufmann, K. (2012a). Developmental and 847
evolutionary diversity of plant MADS-domain factors: insights from recent studies. 848
Development 139, 3081-3098. 849
Smaczniak, C., Immink, R.G., Muino, J.M., Blanvillain, R., Busscher, M., Busscher-Lange, J., 850
Dinh, Q.D., Liu, S., Westphal, A.H., Boeren, S., Parcy, F., Xu, L., Carles, C.C., Angenent, G.C., 851
and Kaufmann, K. (2012b). Characterization of MADS-domain transcription factor complexes in 852
Arabidopsis flower development. Proc Natl Acad Sci USA 109, 1560-1565. 853
Stella, S., Cascio, D., and Johnson, R.C. (2010). The shape of the DNA minor groove directs 854
binding by the DNA-bending protein Fis. Genes Dev 24, 814-826. 855
Tang, W., and Perry, S.E. (2003). Binding site selection for the plant MADS domain protein 856
AGL15: an in vitro and in vivo study. J Biol Chem 278, 28154-28159. 857
Theissen, G., and Saedler, H. (2001). Plant biology. Floral quartets. Nature 409, 469-471. 858
Tilly, J.J., Allen, D.W., and Jack, T. (1998). The CArG boxes in the promoter of the Arabidopsis 859
floral organ identity gene APETALA3 mediate diverse regulatory effects. Development 125, 860
1647-1657. 861
West, A.G., Shore, P., and Sharrocks, A.D. (1997). DNA binding by MADS-box transcription 862
factors: a molecular mechanism for differential DNA bending. Mol Cell Biol 17, 2876-2887. 863
West, A.G., Causier, B.E., Davies, B., and Sharrocks, A.D. (1998). DNA binding and 864
dimerisation determinants of Antirrhinum majus MADS-box transcription factors. Nucleic Acids 865
Res 26, 5277-5287. 866
Wuest, S.E., O'Maoileidigh, D.S., Rae, L., Kwasniewska, K., Raganelli, A., Hanczaryk, K., 867
Lohan, A.J., Loftus, B., Graciet, E., and Wellmer, F. (2012). Molecular basis for the 868
specification of floral organs by APETALA3 and PISTILLATA. Proc Natl Acad Sci USA 109, 869
13452-13457. 870
Yan, W., Chen, D., and Kaufmann, K. (2016). Molecular mechanisms of floral organ 871
specification by MADS domain proteins. Curr Opin Plant Biol 29, 154-162. 872
Ye, L., Wang, B., Zhang, W., Shan, H., and Kong, H. (2016). Gains and Losses of Cis-regulatory 873
Elements Led to Divergence of the Arabidopsis APETALA1 and CAULIFLOWER Duplicate 874
Genes in the Time, Space, and Level of Expression and Regulation of One Paralog by the Other. 875
Plant Physiol 171, 1055-1069. 876
29
Yoo, S.D., Cho, Y.H., and Sheen, J. (2007). Arabidopsis mesophyll protoplasts: a versatile cell 877
system for transient gene expression analysis. Nat Protoc 2, 1565-1572. 878
Zheng, Y., Ren, N., Wang, H., Stromberg, A.J., and Perry, S.E. (2009). Global identification of 879
targets of the Arabidopsis MADS domain protein AGAMOUS-Like15. The Plant cell 21, 2563-880
2577. 881
Zhou, T., Yang, L., Lu, Y., Dror, I., Dantas Machado, A.C., Ghane, T., Di Felice, R., and Rohs, 882
R. (2013). DNAshape: a method for the high-throughput prediction of DNA structural features 883
on a genomic scale. Nucleic Acids Res 41, W56-62. 884
885
FIGURE LEGENDS 886
Figure 1. SELEX-seq for MADS-domain protein complexes. 887
(A) Overview of the experimental setup for the SELEX-seq approach performed in this study. 888
(B) EMSA analysis of the DNA libraries obtained in different rounds of SELEX for the SEP3–889
AG complex. 890
(C) Enrichment of the putative CArG-box consensus sequence (CC[A/T]6GG) in the SELEX-seq 891
rounds (log scale). Frequencies (in %) at Round 3 are: 14.2%, 14.1%, 6.4%, 4.5%, 3.6%, and 892
0.3% for SEP3–AG e1, SEP3–AG e2, SEP3–AP1, SEP3–SEP3, AG–AG and AP1–AP1 893
respectively. 894
(D) Enrichment of randomly permutated CArG-box sequences in the SELEX-seq rounds (log 895
scale). SEP3–AG e1 and SEP3–AG e2 indicate two independent experiments where different 896
antibodies were used for immunoprecipitation, SEP3 antibodies (e1) and AG antibodies (e2). 897
898
Figure 2. DNA-binding specificities of MADS-domain TF complexes. 899
(A) Relative affinity heatmap based on 12-mer sequences enriched in the 3rd
round of SELEX for 900
all studied MADS-domain TF complexes. Each line in the heatmap corresponds to a single 12-901
mer DNA fragment. High relative affinities for a particular sequence are marked in yellow, low 902
relative affinities are in blue. 903
(B) Sequence logos corresponding to the three main clusters of sequences in the heatmap built 904
from the position weight matrices for all 20N sequences containing group specific 12-mers. 905
(C) Minimal DNA minor groove width predictions of the sequences from clusters 1–3. The mean 906
value differences of the minor groove width are significant with p < 0.05 (t-test) for all pairwise 907
30
comparisons. Number of sequences used in predictions (sample size) are 1298340, 416077 and 908
705199 for cluster 1, 2 and 3 respectively. 909
910
Figure 3. Comparison between in vitro SELEX-seq and in vivo ChIP-seq. 911
(A) DNA specificity plots comparing the relative binding affinities of SELEX-seq sequences 912
selected by SEP3–AG (y-axis) and SEP3–AP1 (x-axis). Each point represents a unique sequence 913
that contains a color-coded motif. Black points represent all sequences. 914
(B) Association between SELEX-seq normalized score ratios (SEP3–AG/SEP3–AP1) and ChIP-915
seq normalized score ratios (AG/AP1) for TFBSs within the top 1500 SEP3 ChIP-seq peaks. The 916
plot represents a moving average of overlapping windows of size 1 over the SELEX-seq log2 917
score ratio. Windows with less than 5 elements were not considered. 918
(C) Prediction of specific and common ChIP-seq target genes for AG and AP1 based on the 919
SELEX-seq complex-specific motifs. The heatmap shows significance (hypergeometric test) of 920
the enrichment of SELEX-seq cluster motifs obtained in Figure 2C in specific and common 921
ChIP-seq binding regions compared previously by Yan et al. (Yan et al., 2016); numerical values 922
represent gene numbers. For the raw data of this figure see Supplemental Data Set 4. 923
(D) Comparison of the TRAP-seq expression data (Jiao and Meyerowitz, 2010) with the SELEX-924
seq and the ChIP-seq data (Ó’Maoiléidigh et al., 2013; Pajoro et al., 2014). The violin plot 925
visualizes the distribution of AP1-/AG-domain expression ratios of genes containing TFBSs with 926
a certain SEP3–AG/SEP3–AP1 SELEX-seq score ratio (orange) or with a certain AG/AP1 ChIP-927
seq score ratio (blue). 928
929
Figure 4. Examples of SELEX-seq TFBSs and ChIP-seq profiles mapped to the genome of 930
Arabidopsis. (Top) AP3 genomic locus. (Bottom) SEP3 genomic locus. 931
932
Figure 5. Characteristics of SELEX-seq peaks within ChIP-seq peaks. Distribution of complex 933
specific SELEX-seq TFBSs within ChIP-seq TFBSs of SEP3 (top), AP1 (middle) and AG 934
(bottom). Shown is the frequency of distances normalized by the background frequency 935
distances outside of the ChIP-seq peaks, plotted based on the calculated p-values using the 936
hypergeometric test. A distance of zero results when the position of compared TFBSs is the 937
same. 938
31
939
Figure 6. Comparison of activities and protein-DNA binding between various MADS-domain 940
protein complexes to modified AP3 promoters. 941
(A) and (B) Promoter activity quantification in protoplasts using dual luciferase reporter assay 942
with specific protein effector complexes SEP3–AP1 (A) and SEP3–AG (B). Error bars represent 943
standard deviation. Numbers above the boxes represent t-test p-value of the difference between 944
wt and modified promoters. 945
(C) Protein–DNA relative binding intensities of various MADS-domain protein complexes 946
between modified and wt AP3 promoters studied by quMFRA. The quMFRA was performed 947
using two sets of infrared (IR)-labelled DNA sequences (Dy682 and Dy782) where IR 948
fluorophores were reciprocally exchanged between AP3 promoter sequences. Error bars 949
represent standard deviation. Sequences used in EMSA are indicated in Supplemental Table 4. 950
(D) Basal promoter activity quantification in protoplasts using dual luciferase reporter assay. 951
Error bars represent standard deviation. 952
(E) Confocal pictures of the fluorescent reporter expression patterns of the wt and modified AP3 953
promoters. Red arrows indicate the most pronounced changes in spatial GFP expression in 954
modified promoter signal compared to the wt signal. Scale bars = 50 µm. 955
washelution
amplification
binding
immunoprecipitation
re-cycle
ssDNA library
protein-DNA mix
target protein complex
anti-targetantibodies
unbound DNA
eluted DNA
immunoprecipitatedprotein-DNA complexes
SELEXrounds
evolved DNAlibrary
sequencing dsDNA library generation
A
DC
0 1 2 3
0.001
0.01
0.1
Round of SELEX
Freq
uenc
y of
read
s w
ith C
C[A
/T] 6
GG
CAr
G−b
ox
SEP3−SEP3SEP3−AG e1SEP3−AP1SEP3−AG e2AG−AGAP1−AP1
B Protein target: SEP3-AG
protein-DNA
complex
DNAalone
0 1 2 3
0.0002
0.0004
0.0006
0.0008
Med
ian
frequ
ency
of r
eads
with
pe
rmut
ated
Car
G−b
ox
SEP3−SEP3SEP3−AG e1SEP3−AP1SEP3−AG e2AG−AGAP1−AP1
Round of SELEX
0 1 2 3
Round of SELEX
4 5 6
Figure 1. SELEX-seq for MADS-domain protein complexes. (A) Overview of the experimental setup for the SELEX-seq approach performed in this study. (B) EMSA analysis of the DNA libraries obtained in different rounds of SELEX for the SEP3-AG complex. (C) Enrichment of the putative CArG-box consensus sequence (CC[A/T]6GG) in the SELEX-seq rounds (log scale). Frequencies (in %) at Round 3 are: 14.2%, 14.1%, 6.4%, 4.5%, 3.6%, and 0.3% for SEP3–AG e1, SEP3–AG e2, SEP3–AP1, SEP3–SEP3, AG–AG and AP1–AP1 respectively. (D) Enrichment of randomly permutated CArG-box sequences in the SELEX-seq rounds (log scale). SEP3–AG e1 and SEP3–AG e2 indicate two independent experiments where different antibodies were used for immunoprecipitation, SEP3 antibodies (e1) and AG antibodies (e2).
AG−A
G
SEP3
−SEP
3
AP1−AP
1
SEP3
−AP1
SEP3
−AG
e2
SEP3
−AG
e1
0 0.5 10
500
Color Keyand Histogram
Cou
nt
1
2
3
A
Relative affinity
Position
012
012
1
2
3
1 2 3
3.03.54.04.55.05.56.0
Min
or g
roov
e w
idth
(Å)
B
CPosition
012
Info
rmat
ion
cont
ent
250
Info
rmat
ion
cont
ent
Info
rmat
ion
cont
ent
1 3 5 7 9 11 13 152 4 6 8 10 12 14
1 3 5 7 9 11 132 4 6 8 10 12 14
1 3 5 7 9 11 13 152 4 6 8 10 12 14
Position
Cluster
Cluster
Figure 2. DNA-binding specificities of MADS-domain TF complexes. (A) Relative affinity heatmap based on 12-mer sequences enriched in the 3rd round of SELEX for all studied MADS-domain TF complexes. Each line in the heatmap corresponds to a single 12-mer DNA fragment. High relative affinities for a particular sequence are marked in yellow, low relative affinities are in blue. (B) Sequence logos corresponding to the three main clusters of sequences in the heatmap built from the position weight matrices for all 20N sequences containing group specific 12-mers. (C) Minimal DNA minor groove width predictions of the sequences from clusters 1–3. The mean value differences of the minor groove width are significant with p < 0.05 (t-test) for all pairwise comparisons. Number of sequences used in predictions (sample size) are 1298340, 416077 and 705199 for cluster 1, 2 and 3 respectively.
A
0.2 0.4 0.6 0.8 1.0
0.2
0.4
0.6
0.8
1.0
SEP3−AP1 relative affinity
SEP3
−AG
rela
tive
affin
ity
SEP3−AP1 specific (CyATAAATrG)SEP3−AG specific (CyAwwTnrGG)
B
Clu
ster
1
Clu
ster
2
Clu
ster
3
AP1−specific (555)
common (548)
AG−specific (914)3
10
30
100
-log(p-value)
SELEX-seq
ChIP-seqtarget genes
152
249
421
79
158
282
129
157
225
C D
−2
0
2
≤ -2
-2 < &
< 2 ≥ 2
log 2
Exp
ress
ion
ratio
(AP1
-dom
ain/
AG-d
omai
n)
≤ -2
-2 < &
< 2 ≥ 2
SELEX-seq (SEP3-AG/SEP3-AP1)
ChIP-seq (AG/AP1)
p < 0.02
p < 0.07
p < 0.33
p < 0.05
log2 Score ratio
−4 −2 0 2 4
−0.6
−0.4
−0.2
0.0
0.2
0.4
0.6
log2 SELEX-seq score ratio(SEP3−AG/SEP3−AP1)
Aver
age
log 2
ChI
P-se
q sc
ore
ratio
(AG
/AP1
)
SEP3-AGSEP3-AP1
AP1
AG
SELEX-seq score
ChI
P-se
q sc
ore
Figure 3. Comparison between in vitro SELEX-seq and in vivo ChIP-seq. (A) DNA specificity plots comparing the relative binding affinities of SELEX-seq sequences selected by SEP3–AG (y-axis) and SEP3–AP1 (x-axis). Each point represents a unique sequence that contains a color-coded motif. Black points represent all sequences. (B) Association between SELEX-seq normalized score ratios (SEP3–AG/SEP3–AP1) and ChIP-seq normalized score ratios (AG/AP1) for TFBSs within the top 1500 SEP3 ChIP-seq peaks. The plot represents a moving average of overlapping windows of size 1 over the SELEX-seq log2 score ratio. Windows with less than 5 elements were not considered. (C) Prediction of specific and common ChIP-seq target genes for AG and AP1 based on the SELEX-seq complex-specific motifs. The heatmap shows significance (hypergeometric test) of the enrichment of SELEX-seq cluster motifs obtained in Figure 2C in specific and common ChIP-seq binding regions compared previously by Yan et al. (Yan et al., 2016); numerical values represent gene numbers. For the raw data of this figure see Supplemental Data Set 4. (D) Comparison of the TRAP-seq expression data (Jiao and Meyerowitz, 2010) with the SELEX-seq and the ChIP-seq data (Ó’Maoiléidigh et al., 2013; Pajoro et al., 2014). The violin plot visualizes the distribution of AP1-/AG-domain expression ratios of genes containing TFBSs with a certain SEP3–AG/SEP3–AP1 SELEX-seq score ratio (orange) or with a certain AG/AP1 ChIP-seq score ratio (blue).
AP3
SEP3
ChIP SEP3
ChIP AP1
ChIP AG
SELEX SEP3-SEP3
SELEX SEP3-AG
SELEX AG-AG
SELEX SEP3-AP1
SELEX AP1-AP1
TAIR10 mRNA (+)
Coordinates
TAIR10 mRNA (−)
ChIP SEP3
ChIP AP1
ChIP AG
SELEX SEP3-SEP3
SELEX SEP3-AG
SELEX AG-AG
SELEX SEP3-AP1
SELEX AP1-AP1
TAIR10 mRNA (+)
Coordinates
TAIR10 mRNA (−)
Figure 4. Examples of SELEX-seq TFBSs and ChIP-seq profiles mapped to the genome of Arabidopsis. (Top) AP3 genomic locus. (Bottom) SEP3 genomic locus.
−600 −400 −200 0 200 400 6000
2
4
6
8SEP3-AP1
Distance
−log
p-v
alue
−600 −400 −200 0 200 400 60002468
12SEP3-SEP3
Distance−l
og p
-val
ue 10
−600 −400 −200 0 200 400 600
05
10
20
SEP3-AG
Distance
−log
p-v
alue
15
25
Figure 5. Characteristics of SELEX-seq peaks within ChIP-seq peaks. Distribution of complex specific SELEX-seq TFBSs within ChIP-seq TFBSs of SEP3 (top), AP1 (middle) and AG (bottom). Shown is the frequency of distances normalized by the background frequency distances outside of the ChIP-seq peaks, plotted based on the calculated p-values using the hypergeometric test. A distance of zero results when the position of compared TFBSs is the same.
pAP3_(SEP3−AG)
pAP3_mut
pAP3_wt
02
04
06
08
0
+ SEP3-AP1
Re
sp
on
se
to
th
e e
ffe
cto
r
(fo
ld e
nri
ch
me
nt o
f F
/R L
UC
sig
na
l)
1.53e−11
8.00e−02
SEP
3-SEP
3
AG-A
G
AP1-
AP1
SEP
3-AG
SEP
3-AP1
Dy682/Dy782 Dy782/Dy682
Re
lative
bin
din
g in
ten
sity
pAP3_(SEP3-AG)/pAP3_wt
05
10
15
20
25
30
+ SEP3-AG
Re
sp
on
se
to
th
e e
ffe
cto
r
(fo
ld e
nri
ch
me
nt o
f F
/R L
UC
sig
na
l)
1.26e−07
4.26e−04
pAP3_(SEP3−AG)
pAP3_mut
pAP3_wt
0
0.05
0.10
0.15
0.20
0.25
0.30
Pro
mo
ter
activity a
s L
UC
sig
na
l
with
ou
t th
e e
ffe
cto
r [A
U]
pAP3_wt
pAP3_mut
pAP3_(SEP3−AG)
pAP3_wt:GFP pAP3_(SEP3-AG):GFPpAP3_mut:GFP
A B
C D
E
0.03125
0.0625
0.125
0.25
0.5
1
2
4
8
Figure 6. Comparison of activities and protein-DNA binding between various MADS-domain protein complexes to modified AP3 promoters. (A) and (B) Promoter activity quantification in protoplasts using dual luciferase reporter assay with specific protein effector complexes SEP3–AP1 (A) and SEP3–AG (B). Error bars represent standard deviation. Numbers above the boxes represent t-test p-value of the difference between wt and modified promoters. (C) Protein–DNA relative binding intensities of various MADS-domain protein complexes between modified and wt AP3 promoters studied by quMFRA. The quMFRA was performed using two sets of infrared (IR)-labelled DNA sequences (Dy682 and Dy782) where IR fluorophores were reciprocally exchanged between AP3 promoter sequences. Error bars represent standard deviation. Sequences used in EMSA are indicated in Supplemental Table 4. (D) Basal promoter activity quantification in protoplasts using dual luciferase reporter assay. Error bars represent standard deviation. (E) Confocal pictures of the fluorescent reporter expression patterns of the wt and modified AP3 promoters. Red arrows indicate the most pronounced changes in spatial GFP expression in modified promoter signal compared to the wt signal. Scale bars = 50 µm.
Parsed CitationsBemer, M., van Dijk, A.D., Immink, R.G., and Angenent, G.C. (2017). Cross-Family Transcription Factor Interactions: An AdditionalLayer of Gene Regulation. Trends Plant Sci 22, 66-80.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Brambilla, V., Battaglia, R., Colombo, M., Masiero, S., Bencivenga, S., Kater, M.M., and Colombo, L. (2007). Genetic and molecularinteractions between BELL1 and MADS box factors support ovule development in Arabidopsis. Plant Cell 19, 2544-2556.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Clough, S.J., and Bent, A.F. (1998). Floral dip: a simplified method for Agrobacterium-mediated transformation of Arabidopsisthaliana. Plant J 16, 735-743.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
de Folter, S., and Angenent, G.C. (2006). trans meets cis in MADS science. Trends Plant Sci 11, 224-231.Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Deng, W., Ying, H., Helliwell, C.A., Taylor, J.M., Peacock, W.J., and Dennis, E.S. (2011). FLOWERING LOCUS C (FLC) regulatesdevelopment pathways throughout the life cycle of Arabidopsis. Proc Natl Acad Sci USA 108, 6680-6685.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Diaz-Trivino, S., Long, Y., Scheres, B., and Blilou, I. (2017). Analysis of a Plant Transcriptional Regulatory Network Using TransientExpression Systems. Methods Mol Biol 1629, 83-103.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Dickerson, R.E. (1998). DNA bending: the prevalence of kinkiness and the virtues of normality. Nucleic Acids Res 26, 1906-1926.Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Egea-Cortines, M., Saedler, H., and Sommer, H. (1999). Ternary complex formation between the MADS-box proteins SQUAMOSA,DEFICIENS and GLOBOSA is involved in the control of floral architecture in Antirrhinum majus. EMBO J 18, 5370-5379.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Franco-Zorrilla, J.M., Lopez-Vidriero, I., Carrasco, J.L., Godoy, M., Vera, P., and Solano, R. (2014). DNA-binding specificities of planttranscription factors and their potential to define target genes. Proc Natl Acad Sci USA 111, 2367-2372.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Grant, C.E., Bailey, T.L., and Noble, W.S. (2011). FIMO: scanning for occurrences of a given motif. Bioinformatics 27, 1017-1018.Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Gregis, V., Sessa, A., Dorca-Fornell, C., and Kater, M.M. (2009). The Arabidopsis floral meristem identity genes AP1, AGL24 and SVPdirectly repress class B and C floral homeotic genes. The Plant Journal 60, 626-637.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Gremski, K., Ditta, G., and Yanofsky, M.F. (2007). The HECATE genes regulate female reproductive tract development inArabidopsis thaliana. Development 134, 3593-3601.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Haran, T.E., and Mohanty, U. (2009). The unique structure of A-tracts and intrinsic DNA bending. Q Rev Biophys 42, 41-81.Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Honma, T., and Goto, K. (2001). Complexes of MADS-box proteins are sufficient to convert leaves into floral organs. Nature 409,525-529.
Pubmed: Author and TitleCrossRef: Author and Title
Google Scholar: Author Only Title Only Author and Title
Horstman, A., Fukuoka, H., Muino, J.M., Nitsch, L., Guo, C., Passarinho, P., Sanchez-Perez, G., Immink, R., Angenent, G., andBoutilier, K. (2015). AIL and HDG proteins act antagonistically to control cell proliferation. Development 142, 454-464.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Huang, H., Mizukami, Y., Hu, Y., and Ma, H. (1993). Isolation and characterization of the binding sequences for the product of theArabidopsis floral homeotic gene AGAMOUS. Nucleic Acids Res 21, 4769-4776.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Huang, H., Tudor, M., Weiss, C.A., Hu, Y., and Ma, H. (1995). The Arabidopsis MADS-box gene AGL3 is widely expressed andencodes a sequence-specific DNA-binding protein. Plant Mol Biol 28, 549-567.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Huang, H., Tudor, M., Su, T., Zhang, Y., Hu, Y., and Ma, H. (1996). DNA binding properties of two Arabidopsis MADS domainproteins: binding consensus and dimer formation. The Plant cell 8, 81-94.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Huang, K., Louis, J.M., Donaldson, L., Lim, F.L., Sharrocks, A.D., and Clore, G.M. (2000). Solution structure of the MEF2A-DNAcomplex: structural basis for the modulation of DNA bending and specificity by MADS-box transcription factors. EMBO J 19, 2615-2628.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Immink, R.G., Pose, D., Ferrario, S., Ott, F., Kaufmann, K., Valentim, F.L., de Folter, S., van der Wal, F., van Dijk, A.D., Schmid, M.,and Angenent, G.C. (2012). Characterization of SOC1's central role in flowering by the identification of its upstream anddownstream regulators. Plant Physiol 160, 433-449.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Jetha, K., Theissen, G., and Melzer, R. (2014). Arabidopsis SEPALLATA proteins differ in cooperative DNA-binding during theformation of floral quartet-like complexes. Nucleic Acids Res 42, 10927-10942.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Jiao, Y., and Meyerowitz, E.M. (2010). Cell-type specific analysis of translating RNAs in developing flowers reveals new levels ofcontrol. Mol Syst Biol 6, 419.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Jolma, A., Yin, Y., Nitta, K.R., Dave, K., Popov, A., Taipale, M., Enge, M., Kivioja, T., Morgunova, E., and Taipale, J. (2015). DNA-dependent formation of transcription factor pairs alters their binding specificity. Nature 527, 384-388.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Jolma, A., Kivioja, T., Toivonen, J., Cheng, L., Wei, G., Enge, M., Taipale, M., Vaquerizas, J.M., Yan, J., Sillanpaa, M.J., Bonke, M.,Palin, K., Talukder, S., Hughes, T.R., Luscombe, N.M., Ukkonen, E., and Taipale, J. (2010). Multiplexed massively parallel SELEX forcharacterization of human transcription factor binding specificities. Genome Res 20, 861-873.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Jolma, A., Yan, J., Whitington, T., Toivonen, J., Nitta, K.R., Rastas, P., Morgunova, E., Enge, M., Taipale, M., Wei, G., Palin, K.,Vaquerizas, J.M., Vincentelli, R., Luscombe, N.M., Hughes, T.R., Lemaire, P., Ukkonen, E., Kivioja, T., and Taipale, J. (2013). DNA-binding specificities of human transcription factors. Cell 152, 327-339.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Kaufmann, K., Melzer, R., and Theissen, G. (2005). MIKC-type MADS-domain proteins: structural modularity, protein interactionsand network evolution in land plants. Gene 347, 183-198.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Kaufmann, K., Muino, J.M., Jauregui, R., Airoldi, C.A., Smaczniak, C., Krajewski, P., and Angenent, G.C. (2009). Target genes of theMADS transcription factor SEPALLATA3: integration of developmental and hormonal pathways in the Arabidopsis flower. PLoS Biol
7, e1000090.Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Kaufmann, K., Wellmer, F., Muino, J.M., Ferrier, T., Wuest, S.E., Kumar, V., Serrano-Mislata, A., Madueno, F., Krajewski, P.,Meyerowitz, E.M., Angenent, G.C., and Riechmann, J.L. (2010). Orchestration of floral initiation by APETALA1. Science 328, 85-89.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Kneidl, C., Dinkl, E., and Grummt, F. (1995). An intrinsically bent region upstream of the transcription start site of the rRNA genesof Arabidopsis thaliana interacts with an HMG-related protein. Plant Mol Biol 27, 705-713.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Krizek, B.A., and Meyerowitz, E.M. (1996). Mapping the protein regions responsible for the functional specificities of theArabidopsis MADS domain organ-identity proteins. Proc Natl Acad Sci USA 93, 4063-4070.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Krizek, B.A., Riechmann, J.L., and Meyerowitz, E.M. (1999). Use of the APETALA1 promoter to assay the in vivo function of chimericMADS box genes. Sex Plant Reprod 12, 14-26.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Li, L. (2009). GADEM: a genetic algorithm guided formation of spaced dyads coupled with an EM algorithm for motif discovery. JComput Biol 16, 317-329.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Li, R., Yu, C., Li, Y., Lam, T.W., Yiu, S.M., Kristiansen, K., and Wang, J. (2009). SOAP2: an improved ultrafast tool for short readalignment. Bioinformatics 25, 1966-1967.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Man, T.K., and Stormo, G.D. (2001). Non-independence of Mnt repressor-operator interaction determined by a new quantitativemultiple fluorescence relative affinity (QuMFRA) assay. Nucleic Acids Res 29, 2471-2478.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Mathelier, A., Xin, B., Chiu, T.P., Yang, L., Rohs, R., and Wasserman, W.W. (2016). DNA Shape Features Improve TranscriptionFactor Binding Site Predictions In Vivo. Cell Syst 3, 278-286 e274.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Melzer, R., and Theissen, G. (2009). Reconstitution of 'floral quartets' in vitro involving class B and class E floral homeoticproteins. Nucleic Acids Res 37, 2723-2736.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Melzer, R., Verelst, W., and Theissen, G. (2009). The class E floral homeotic protein SEPALLATA3 is sufficient to loop DNA in 'floralquartet'-like complexes in vitro. Nucleic Acids Res 37, 144-157.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Muino, J.M., Kaufmann, K., van Ham, R.C.H.J., Angenent, G.C., and Krajewski, P. (2011). ChIP-seq Analysis in R (CSAR): An Rpackage for the statistical detection of protein-bound genomic regions. Plant Methods 7.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Muino, J.M., Smaczniak, C., Angenent, G.C., Kaufmann, K., and van Dijk, A.D. (2014). Structural determinants of DNA recognition byplant MADS-domain transcription factors. Nucleic Acids Res 42, 2138-2146.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Nelson, H.B., and Laughon, A. (1990). The DNA binding specificity of the Drosophila fushi tarazu protein: a possible role for DNAbending in homeodomain recognition. New Biol 2, 171-178.
Pubmed: Author and Title
CrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
O'Malley, R.C., Huang, S.S., Song, L., Lewsey, M.G., Bartlett, A., Nery, J.R., Galli, M., Gallavotti, A., and Ecker, J.R. (2016). Cistromeand Epicistrome Features Shape the Regulatory DNA Landscape. Cell 165, 1280-1292.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Ó'Maoiléidigh, D.S., Wuest, S.E., Rae, L., Raganelli, A., Ryan, P.T., Kwasniewska, K., Das, P., Lohan, A.J., Loftus, B., Graciet, E., andWellmer, F. (2013). Control of reproductive floral organ identity specification in Arabidopsis by the C function regulator AGAMOUS.Plant Cell 25, 2482-2503.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Pajoro, A., Madrigal, P., Muino, J.M., Matus, J.T., Jin, J., Mecchia, M.A., Debernardi, J.M., Palatnik, J.F., Balazadeh, S., Arif, M.,O'Maoileidigh, D.S., Wellmer, F., Krajewski, P., Riechmann, J.L., Angenent, G.C., and Kaufmann, K. (2014). Dynamics of chromatinaccessibility and gene regulation by MADS-domain transcription factors in flower development. Genome Biol 15, R41.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Parenicova, L., de Folter, S., Kieffer, M., Horner, D.S., Favalli, C., Busscher, J., Cook, H.E., Ingram, R.M., Kater, M.M., Davies, B.,Angenent, G.C., and Colombo, L. (2003). Molecular and phylogenetic analyses of the complete MADS-box transcription factor familyin Arabidopsis: new openings to the MADS world. The Plant cell 15, 1538-1551.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Payne, C.T., Zhang, F., and Lloyd, A.M. (2000). GL3 encodes a bHLH protein that regulates trichome development in arabidopsisthrough interaction with GL1 and TTG1. Genetics 156, 1349-1362.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Pellegrini, L., Tan, S., and Richmond, T.J. (1995). Structure of serum response factor core bound to DNA. Nature 376, 490-498.Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Pollock, R., and Treisman, R. (1990). A sensitive method for the determination of protein-DNA binding specificities. Nucleic AcidsRes 18, 6197-6204.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Riechmann, J.L., and Meyerowitz, E.M. (1997). Determination of floral organ identity by Arabidopsis MADS domain homeoticproteins AP1, AP3, PI, and AG is independent of their DNA-binding specificity. Mol Biol Cell 8, 1243-1259.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Riechmann, J.L., Wang, M., and Meyerowitz, E.M. (1996a). DNA-binding properties of Arabidopsis MADS domain homeotic proteinsAPETALA1, APETALA3, PISTILLATA and AGAMOUS. Nucleic Acids Res 24, 3134-3141.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Riechmann, J.L., Krizek, B.A., and Meyerowitz, E.M. (1996b). Dimerization specificity of Arabidopsis MADS domain homeoticproteins APETALA1, APETALA3, PISTILLATA, and AGAMOUS. Proc Natl Acad Sci USA 93, 4793-4798.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Rohs, R., West, S.M., Sosinsky, A., Liu, P., Mann, R.S., and Honig, B. (2009). The role of DNA shape in protein-DNA recognition.Nature 461, 1248-1253.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Schellmann, S., Schnittger, A., Kirik, V., Wada, T., Okada, K., Beermann, A., Thumfahrt, J., Jurgens, G., and Hulskamp, M. (2002).TRIPTYCHON and CAPRICE mediate lateral inhibition during trichome and root hair patterning in Arabidopsis. EMBO J 21, 5036-5046.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Schwarz-Sommer, Z., Huijser, P., Nacken, W., Saedler, H., and Sommer, H. (1990). Genetic Control of Flower Development byHomeotic Genes in Antirrhinum majus. Science 250, 931-936.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Schwarz-Sommer, Z., Hue, I., Huijser, P., Flor, P.J., Hansen, R., Tetens, F., Lonnig, W.E., Saedler, H., and Sommer, H. (1992).Characterization of the Antirrhinum floral homeotic MADS-box gene deficiens: evidence for DNA binding and autoregulation of itspersistent expression throughout flower development. EMBO J 11, 251-263.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Simonini, S., Roig-Villanova, I., Gregis, V., Colombo, B., Colombo, L., and Kater, M.M. (2012). Basic pentacysteine proteins mediateMADS domain complex binding to the DNA for tissue-specific expression of target genes in Arabidopsis. Plant Cell 24, 4163-4172.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Slattery, M., Riley, T., Liu, P., Abe, N., Gomez-Alcala, P., Dror, I., Zhou, T., Rohs, R., Honig, B., Bussemaker, H.J., and Mann, R.S.(2011). Cofactor binding evokes latent differences in DNA binding specificity between Hox proteins. Cell 147, 1270-1282.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Smaczniak, C., Immink, R.G., Angenent, G.C., and Kaufmann, K. (2012a). Developmental and evolutionary diversity of plant MADS-domain factors: insights from recent studies. Development 139, 3081-3098.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Smaczniak, C., Immink, R.G., Muino, J.M., Blanvillain, R., Busscher, M., Busscher-Lange, J., Dinh, Q.D., Liu, S., Westphal, A.H.,Boeren, S., Parcy, F., Xu, L., Carles, C.C., Angenent, G.C., and Kaufmann, K. (2012b). Characterization of MADS-domaintranscription factor complexes in Arabidopsis flower development. Proc Natl Acad Sci USA 109, 1560-1565.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Stella, S., Cascio, D., and Johnson, R.C. (2010). The shape of the DNA minor groove directs binding by the DNA-bending proteinFis. Genes Dev 24, 814-826.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Tang, W., and Perry, S.E. (2003). Binding site selection for the plant MADS domain protein AGL15: an in vitro and in vivo study. JBiol Chem 278, 28154-28159.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Theissen, G., and Saedler, H. (2001). Plant biology. Floral quartets. Nature 409, 469-471.Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Tilly, J.J., Allen, D.W., and Jack, T. (1998). The CArG boxes in the promoter of the Arabidopsis floral organ identity gene APETALA3mediate diverse regulatory effects. Development 125, 1647-1657.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
West, A.G., Shore, P., and Sharrocks, A.D. (1997). DNA binding by MADS-box transcription factors: a molecular mechanism fordifferential DNA bending. Mol Cell Biol 17, 2876-2887.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
West, A.G., Causier, B.E., Davies, B., and Sharrocks, A.D. (1998). DNA binding and dimerisation determinants of Antirrhinum majusMADS-box transcription factors. Nucleic Acids Res 26, 5277-5287.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Wuest, S.E., O'Maoileidigh, D.S., Rae, L., Kwasniewska, K., Raganelli, A., Hanczaryk, K., Lohan, A.J., Loftus, B., Graciet, E., andWellmer, F. (2012). Molecular basis for the specification of floral organs by APETALA3 and PISTILLATA. Proc Natl Acad Sci USA 109,13452-13457.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Yan, W., Chen, D., and Kaufmann, K. (2016). Molecular mechanisms of floral organ specification by MADS domain proteins. CurrOpin Plant Biol 29, 154-162.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Ye, L., Wang, B., Zhang, W., Shan, H., and Kong, H. (2016). Gains and Losses of Cis-regulatory Elements Led to Divergence of theArabidopsis APETALA1 and CAULIFLOWER Duplicate Genes in the Time, Space, and Level of Expression and Regulation of OneParalog by the Other. Plant Physiol 171, 1055-1069.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Yoo, S.D., Cho, Y.H., and Sheen, J. (2007). Arabidopsis mesophyll protoplasts: a versatile cell system for transient gene expressionanalysis. Nat Protoc 2, 1565-1572.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Zheng, Y., Ren, N., Wang, H., Stromberg, A.J., and Perry, S.E. (2009). Global identification of targets of the Arabidopsis MADSdomain protein AGAMOUS-Like15. The Plant cell 21, 2563-2577.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Zhou, T., Yang, L., Lu, Y., Dror, I., Dantas Machado, A.C., Ghane, T., Di Felice, R., and Rohs, R. (2013). DNAshape: a method for thehigh-throughput prediction of DNA structural features on a genomic scale. Nucleic Acids Res 41, W56-62.
Pubmed: Author and TitleCrossRef: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
DOI 10.1105/tpc.17.00145; originally published online July 21, 2017;Plant Cell
Cezary Smaczniak, Jose M. Muiño, Dijun Chen, Gerco C. Angenent and Kerstin Kaufmanntarget genes
Differences in DNA-binding specificity of floral homeotic protein complexes predict organ-specific
This information is current as of June 8, 2020
Supplemental Data /content/suppl/2017/12/11/tpc.17.00145.DC2.html /content/suppl/2017/07/21/tpc.17.00145.DC1.html
Permissions https://www.copyright.com/ccc/openurl.do?sid=pd_hw1532298X&issn=1532298X&WT.mc_id=pd_hw1532298X
eTOCs http://www.plantcell.org/cgi/alerts/ctmain
Sign up for eTOCs at:
CiteTrack Alerts http://www.plantcell.org/cgi/alerts/ctmain
Sign up for CiteTrack Alerts at:
Subscription Information http://www.aspb.org/publications/subscriptions.cfm
is available at:Plant Physiology and The Plant CellSubscription Information for
ADVANCING THE SCIENCE OF PLANT BIOLOGY © American Society of Plant Biologists