Supplemental Material

A conserved role for Snail as a potentiator of active transcription

Martina Rembold, Lucia Ciglar, J. Omar Yáñez-Cuna, Robert P. Zinzen, Charles Girardot, Ankit Jain, Michael A. Welte, Alexander Stark, Maria Leptin, and Eileen E. M. Furlong

Supplemental Figures

Figure S1: Snail occupancy at loci of mesoderm-specific genes at stages when they are expressed

Figure S2: Snail represses Twist and Dorsal-mediated activation of the rho NEE enhancer

Figure S3: Expression of Mef2 I-D[L] and Mef2 I-D[L] ∆Sna1,3 at consecutive stages of development.

Figure S4: A new enhancer within the Mef2 locus requires Snail for its mesodermal activity

Figure S5: Subtle differences in Twist and Snail motifs between activated and repressed CRMs

Figure S6: In vivo activity of CycE_401 and CycE_401 ∆Tll in neuroblasts

Figure S7: The enhancer CG14688_400 requires the Tll-like motifs for robust activation

Figure S8: dTCF does not mediate activation through the Tll-like motif

Supplemental Tables

Table S1: Expression profiling of stage 5 embryos, showing data for all probe sets that passed the IQR filter

Table S2: Expression profiling of stage 7 embryos, showing data for all probe sets that passed the IQR filter

Table S3: Genes differentially regulated in sna mutants (‘sna-pool’).

Table S4: All (unfiltered) TileMap regions for Snail at 2-4 hours of development.

Table S5: All (unfiltered) TileMap regions for Twist at 2-4 hours of development.

Table S6: All defined ChIP-CRMs with maximum ChIP binding signal for Snail and Twist.

Table S7: List of ChIP-CRMs found in the final ChIP-CRM sets. Related to Methods “Definition of ChIP-CRM sets”.

Table S8: Snail PWM found de novo in the repressed co-bound ChIP-CRMs and the Twist PWM from Zinzen et al. (2009).

Table S9: List of differentially enriched motifs and SVM-selected motifs including their IUPAC sequence and original source (Schroeder et al. 2004; Down et al. 2007; Stark et al. 2007), related to Figure 5A and 5D

Supplemental Methods

Supplemental References

Supplemental Figures

Figure S1: Snail occupancy at loci of mesoderm-specific genes at stages when they are expressed

ChIP signal (log2 mean IP/mock) of Snail (red) and Twist (blue) binding to known mesodermal enhancers (highlighted in green, A) or putative novel mesodermal enhancers (B-F). The gene model is shown underneath the ChIP signal, where thick lines indicate exons, thin lines indicate introns, and arrows indicate the direction of transcription for the gene of interest (black) and surrounding genes (grey). The chromosome arm and genome coordinates are indicated along the dashed line.

Figure S2: Alternative CAGGTA motif mediates repression of the rho NEE enhancer in Kc cells

(A) Luciferase assay on the rho NEE enhancer in Kc cells. The X-axis indicates the amount of DNA transfected (ng); the Y-axis shows the average fold luciferase activity across replicates (n = 3), normalized to Renilla. Twist and Dorsal cooperatively activate the rho NEE enhancer 42-fold, compared to <3-fold when either factor is transfected alone (dark grey bars). Co-transfection of 10 ng of Snail dramatically reduces this effect, repressing Twist- and Dorsal-mediated enhancer activation to only 1.7-fold. Snail efficiently represses enhancer activity via CAGGTA motifs in place of CAGGTG motifs (Rho NEE ∆sna1,2,3 CAGGTA, light grey bars). Twist and Dorsal activate the modified enhancer 35.7-fold. Importantly, co-transfection of only 1 ng of Snail causes an approximately 4-fold reduction in enhancer activity using the CAGGTA motif (light grey bars; from 35.7-fold to 8.7-fold), while only a 2-fold reduction is seen using the CAGGTG motif (dark grey bars; from 42-fold without Snail to 21.7-fold with Snail).

(B) The core E-box of three CAGGTG motifs (bold green letters) in the enhancer was replaced with CAGGTA (red bold letters).
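The fold reductions quoted in the legend of Figure S2 can be recomputed from the fold-activation values it reports. The short Python sketch below does only that arithmetic; the dictionary layout and the function name are illustrative and are not part of the original analysis.

```python
# Fold-activation values quoted in the Figure S2 legend
# (Twist + Dorsal, without Snail and with 1 ng of Snail).
fold_activation = {
    "CAGGTG": {"no_snail": 42.0, "snail_1ng": 21.7},  # wild-type rho NEE
    "CAGGTA": {"no_snail": 35.7, "snail_1ng": 8.7},   # rho NEE with CAGGTA motifs
}


def fold_reduction(no_snail: float, with_snail: float) -> float:
    """Ratio of enhancer activation without Snail to activation with Snail."""
    return no_snail / with_snail


reductions = {
    motif: fold_reduction(values["no_snail"], values["snail_1ng"])
    for motif, values in fold_activation.items()
}

for motif, reduction in reductions.items():
    print(f"{motif}: {reduction:.1f}-fold reduction")
```

Rounded to one decimal, this gives about 1.9-fold for CAGGTG and about 4.1-fold for CAGGTA, in line with the "2-fold" and "approximately 4-fold" statements in the legend.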

Figure S3: Expression of Mef2 I-D[L] and Mef2 I-D[L] ∆Sna1,3 at consecutive stages of development.

Triple in situ hybridization showing enhancer activity (reporter gene lacZ; green), Mef2 (red) and sna or sim (blue). (A) At the onset of gastrulation (stage 6), the activity of Mef2 I-D[L] coincides with the expression of the Mef2 gene in the trunk mesoderm of wildtype embryos. Enhancer activity and expression of the Mef2 gene are almost absent in sna18 but not snaV2 embryos, although a slight reduction of lacZ signal intensity is visible. Mutation of the Snail binding sites in Mef2 I-D[L] ∆Sna1,3 reduces activity significantly. (B) During germband extension (stage 8), enhancer activity and Mef2 gene expression are completely absent in sna18 mutant embryos, but not in snaV2 embryos. The mutant enhancer Mef2 I-D[L] ∆Sna1,3 is inactive, except for a subset of cells in the posterior trunk mesoderm.

All embryos are oriented anterior to the left. Single confocal planes are shown. LacZ expression in the head-fold is caused by the eve minimal promoter used in the reporter vector.

Figure S4: A new enhancer within the Mef2 locus requires Snail for its mesodermal activity

(A) ChIP signal (log2 mean IP/mock) of Snail (red) and Twist (blue) binding to a region within an intron of Mef2. The gene model is shown underneath, where thick lines indicate exons, thin lines indicate introns, and the arrow indicates the direction of transcription for the gene of interest (black). The chromosome arm and genome coordinates are indicated along the dashed line. (B) In vivo activity of the Mef2_401 enhancer. In situ hybridization of the reporter gene lacZ (green) and the endogenous Mef2 gene (red). Upper panel, wildtype embryo; the Mef2_401 enhancer is active in a striped pattern in the mesoderm (arrow) as well as the dorsal ectoderm. Lower panels, sna mutant embryos (middle panel: Df(sna), lower panel: sna18); mesodermal lacZ expression is strongly reduced (arrow) and Mef2 expression is also downregulated. Maximum intensity projections, magnification 20x. Genome coordinates of the enhancer element: chr2R:5821681-5821282.

Figure S5: Subtle differences in Twist and Snail motifs between activated and repressed co-bound CRMs

(A) Distribution of the Jaspar Snail motif: PWM scores for the CAGGTG (MA0086.1) Snail motif (left graph) and cumulated match scores for CAGGTG do not differ significantly among the four CRM groups. (B) The base-pair distance between Twist and CAGGTG Snail motifs is greater in co-bound activated CRMs than in co-bound repressed CRMs. Upper panel: in activated CRMs, Twist motifs are preferentially enriched at a distance of 76 to 83 bp. Lower panel: in repressed CRMs, Twist motifs cluster around Snail motifs at a distance of 20 to 26 bp. No enrichment of Twist motifs around CAGGTG Snail motifs is seen in Snail-only CRMs, as expected. Red stars indicate where the signal deviates from random.

Figure S6: In vivo activity of CycE_401 and CycE_401 ∆Tll in neuroblasts

Triple in situ hybridization showing enhancer activity (reporter gene lacZ; green), CycE (red) and sna or sim (blue). Ectopic expression of sim in the mesoderm was used to identify sna mutant embryos. At stage 10, overlapping expression of lacZ, CycE and sna is observed in neuroblasts. Neither enhancer activity nor CycE expression is affected in either of the mutants, sna18 and snaV2. Similarly, the absence of the Tll-like motif in CycE_401 ∆Tll has no effect on enhancer activity in neuroblasts, indicating that both CycE expression and enhancer activity are independent of Snail and of the factor binding the Tll-like motif in these cells. All embryos are oriented anterior to the left. Single confocal planes are shown.

Figure S7: The enhancer CG14688_400 requires the Tll-like motifs for robust activation

(A) A region upstream of the CG14688 gene is co-bound by Twist and Snail (blue and red ChIP signal (log2 mean IP/mock), respectively). Three Tll-like motifs in the CG14688_400 enhancer (-456 to -855 upstream of the TSS) are indicated. Bold red letters mark mutated nucleotides. The gene model is shown underneath the ChIP signal, where thick lines indicate exons, thin lines indicate introns, and arrows indicate the direction of transcription for the gene of interest (black) and surrounding genes (grey). The chromosome arm and genome coordinates are indicated along the dashed line.

(B) In vivo activity of the CG14688_400 and CG14688_400 ∆Tll1,2,3 enhancers. In situ hybridization of the reporter gene lacZ (green) and the endogenous CG14688 gene (red) in wildtype embryos at two successive stages of development. The wildtype CG14688_400 enhancer drives lacZ expression in the trunk mesoderm (white arrows) and the anterior midgut (AMG) anlage, partially recapitulating CG14688 expression at both stages of development. Mutation of all three Tll-like motifs in CG14688_400 ∆Tll1,2,3 reduces enhancer activity in the trunk mesoderm (arrow) but not in the AMG anlage. This indicates that the Tll-like motifs in this enhancer are required for efficient activation within the mesoderm. Tll is expressed in the embryonic termini, and mutation of the motifs causes ectopic activity of the enhancer in a posterior ventral domain (hindgut anlage, arrowhead). This indicates that the wildtype enhancer is normally repressed, presumably by Tll, at the posterior pole and activated by an unknown factor that binds the same motifs in the trunk mesoderm. Lateral-ventral (upper panels) or ventral (lower panels) views of embryos oriented anterior to the left, single confocal planes. LacZ expression in the head-fold is caused by the eve minimal promoter used in the reporter vector. Genome coordinates of the enhancer element: chr3R:6507103-6507502.

Figure S8: dTCF does not mediate activation through the Tll-like motif

(A) The Tll-like motif in the CycE_401 enhancer (red underlined bold letters) is shown in relation to other motifs. Red and blue boxes: overlapping Snail and Twist motifs. Green box: Ftz motif; light blue box: dTCF motif. (B) Double in situ hybridization showing expression of CycE_401 (lacZ reporter, red) and wingless (wg, grey) in a wildtype embryo at stage 8 (lateral view). Inset: 40x magnification of the region indicated by the dashed box in B. LacZ-positive cells are adjacent to and partially overlapping every second wg stripe. (C) Over-expression of dominant-negative TCF (TCF-DN) using a maternal Gal4 (mG4, middle panel) or mesoderm-specific twist-Gal4 (twiG4, right panel). Left panel: control embryos containing mG4 and CycE_401-lacZ were stained for lacZ (green) and CycE (red). Middle panel: embryos carrying mG4, CycE_401-lacZ and UAS-TCF-DN were stained for lacZ (green) and CycE (red). Right panel: embryos containing twiG4, CycE_401-lacZ and UAS-TCF-DN were stained for lacZ (green) and TCF (red) to visualize the ectopic expression of TCF-DN. Lateral views, stage 8. Images are maximum intensity projections of several confocal planes through the mesoderm at 20x magnification.

Supplemental Methods

Oligos used to subclone the tested enhancers

>tin B-374 (chr3R:17205671-17206053)
tinB_374_F_BglII: TGAGATCTCTCGAGGCTTTGACAAATCATCG
tinB_374_R_KpnI: TAGGTACCGCGGGAAAGCAGGAAAATG

>Mef2_I-D[L] (chr2R:5819019-5819498)
Mef2_I-D[L]_F_KpnI: GCGGTACCCTGTAAAAATCACGCATAACCG
Mef2_I-D[L]_R_BglII: CGAGATCTCCTGAAGAAACCCCTGCCAAG

>Mef2_401 (chr2R:5821681-5821282)
Mef2_400_F_BglII: GCGAGATCTAGGCAAATATTTACACTCAATGG
Mef2_400_R_KpnI: AGGTACCAACTGCAGCGACTGCTGTT

>CycE_401 (chr2L:15733198-15733599)
CycE_400_F_BglII: TGAGATCTGTGATTCCATAACGCTTGACC
CycE_400_R_KpnI: AGGTACCCGAGCAGAACTCCCCTCC

>CG14688_400 (chr3R:6507103-6507502)
CG14688_400_F_BglII: GACAGATCTTGAGAGAAAAATGTAGTATGCATCA
CG14688_400_R_KpnI: AGGTACCGCCATTGACTTTGATATATAAGAATTAC

>rho NEE-300 (chr3L:1461807-1462134)
rhoNEE_F_Bgl2: GACAGATCTCTTGGGCAGGATGGAAAAATG
rhoNEE-R_Kpn1: TAGGTACCAGCTCGAATTCAGGTAAC

Oligos used for site-directed mutagenesis (mutated bases are bold and underlined)

Mef2_∆Sna1: GCGACGTACGGTTGATGCTGAGTATTGCATGCACTCATCACATG
Mef2_∆Sna3: CTGCATGTTGCATGCACTCAACACATGTGCAATACTCGGCATCTGCGGCAGTAGC
CycE_∆Tll: CCAGATGCAATGTAATTAAAGCTGAAGAGTGCAATGGCCTAAGAAGCC
CG14688 ∆Tll 1: CTATCGATAGGTACCGCCATAGATTTTGATATATAAGAATTACTTTAG
CG14688 ∆Tll 2: GGTCTGGGTCTTCCACATGATATCATTGTCTTTATGTTC
CG14688 ∆Tll 3: GGGTCTTCCACATGATATCATCTTCTTTATGTTCATGCATATG
CAGGTA 1: ACATCGCGAAACATTTGGCGCAGGTACGGAAGACAAGTGCG
CAGGTA 2: GGGAAGCGGAAAAAGGACAGGTACTGTGCGGCGGG
CAGGTA 3: TGTGCGGCGGGAACGTACCTGGCGGGCGGAATTT
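As a quick consistency check on the oligo lists, the Python sketch below tests that each subcloning primer carries the recognition site of the enzyme named in its label (BglII: AGATCT, KpnI: GGTACC) and that each CAGGTA mutagenesis oligo contains CAGGTA on one of the two strands. The primer sequences are copied from the lists above; the dictionary layout and helper names are illustrative assumptions, not part of the original methods.

```python
# Recognition sequences of the two restriction enzymes named in the primer labels.
SITES = {"BglII": "AGATCT", "KpnI": "GGTACC"}

# Subcloning primers from the list above: label -> (enzyme, sequence).
PRIMERS = {
    "tinB_374_F_BglII": ("BglII", "TGAGATCTCTCGAGGCTTTGACAAATCATCG"),
    "tinB_374_R_KpnI": ("KpnI", "TAGGTACCGCGGGAAAGCAGGAAAATG"),
    "Mef2_I-D[L]_F_KpnI": ("KpnI", "GCGGTACCCTGTAAAAATCACGCATAACCG"),
    "Mef2_I-D[L]_R_BglII": ("BglII", "CGAGATCTCCTGAAGAAACCCCTGCCAAG"),
    "Mef2_400_F_BglII": ("BglII", "GCGAGATCTAGGCAAATATTTACACTCAATGG"),
    "Mef2_400_R_KpnI": ("KpnI", "AGGTACCAACTGCAGCGACTGCTGTT"),
    "CycE_400_F_BglII": ("BglII", "TGAGATCTGTGATTCCATAACGCTTGACC"),
    "CycE_400_R_KpnI": ("KpnI", "AGGTACCCGAGCAGAACTCCCCTCC"),
    "CG14688_400_F_BglII": ("BglII", "GACAGATCTTGAGAGAAAAATGTAGTATGCATCA"),
    "CG14688_400_R_KpnI": ("KpnI", "AGGTACCGCCATTGACTTTGATATATAAGAATTAC"),
    "rhoNEE_F_Bgl2": ("BglII", "GACAGATCTCTTGGGCAGGATGGAAAAATG"),
    "rhoNEE-R_Kpn1": ("KpnI", "TAGGTACCAGCTCGAATTCAGGTAAC"),
}

# CAGGTA mutagenesis oligos from the list above.
CAGGTA_OLIGOS = {
    "CAGGTA 1": "ACATCGCGAAACATTTGGCGCAGGTACGGAAGACAAGTGCG",
    "CAGGTA 2": "GGGAAGCGGAAAAAGGACAGGTACTGTGCGGCGGG",
    "CAGGTA 3": "TGTGCGGCGGGAACGTACCTGGCGGGCGGAATTT",
}


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def on_either_strand(seq: str, motif: str) -> bool:
    """True if the motif occurs on the given strand or on its reverse complement."""
    return motif in seq or reverse_complement(motif) in seq


# Primers whose sequence lacks the site of the enzyme named in the label.
primers_missing_site = [
    label for label, (enzyme, seq) in PRIMERS.items() if SITES[enzyme] not in seq
]
# Mutagenesis oligos that do not contain CAGGTA on either strand.
oligos_missing_motif = [
    label for label, seq in CAGGTA_OLIGOS.items() if not on_either_strand(seq, "CAGGTA")
]

print(primers_missing_site, oligos_missing_motif)
```

Both lists come back empty for the sequences as transcribed here; in the CAGGTA 3 oligo the motif is present on the reverse strand (TACCTG).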

Luciferase reporter assays

Luciferase assays were performed according to standard procedures using the dual-luciferase reporter assay system (Promega). Briefly, Kc cells were transiently transfected with Cellfectin (Invitrogen) to introduce plasmids carrying (i) an enhancer sequence linked to an hsp70 minimal promoter and luciferase (pGL3-Hsp70), (ii) full-length TFs under constitutive control (pAc5.1), and (iii) a Renilla construct for normalization. In each transfection, the total amount of transfected DNA was kept constant by adjusting with empty pAc5.1 vector.

The following ESTs were used to generate the TF constructs: AT15089 (twi), RE35237 (sna), and RE58537 (dl). Successful transfection and expression of all TFs were assessed by Western blot. Levels of Luciferase and Renilla were measured 48 hours after transfection with a PerkinElmer 1420 Luminescence Counter. For each transfection, three biological replicates were performed, each done in triplicate.
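A minimal Python sketch of the usual dual-luciferase normalization is given below for orientation. It assumes that fold activity is the mean firefly/Renilla ratio of a sample divided by the mean ratio of a baseline transfection (for example, reporter plus empty vector); the choice of baseline, the function names, and the readings are hypothetical and are not taken from the original data.

```python
from statistics import mean


def normalized_ratio(firefly: float, renilla: float) -> float:
    """Firefly luciferase signal normalized to the co-transfected Renilla control."""
    return firefly / renilla


def fold_activity(sample_wells, baseline_wells) -> float:
    """Mean normalized ratio of the sample relative to the mean baseline ratio.

    Each argument is a list of (firefly, renilla) readings, one pair per well.
    """
    baseline = mean(normalized_ratio(f, r) for f, r in baseline_wells)
    sample = mean(normalized_ratio(f, r) for f, r in sample_wells)
    return sample / baseline


# Hypothetical luminescence counts (arbitrary units), three wells each.
baseline_wells = [(1000.0, 500.0), (1100.0, 550.0), (900.0, 450.0)]
sample_wells = [(8000.0, 400.0), (8400.0, 420.0), (7600.0, 380.0)]

print(fold_activity(sample_wells, baseline_wells))
```

With these made-up readings every baseline well has a ratio of 2 and every sample well a ratio of 20, so the sketch reports a 10-fold activity.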

Cloning and mutagenesis of enhancer elements

400 (401) bp CRMs were PCR-amplified from genomic DNA and subcloned into pGL3-hsp70 and pDuo2n-attB via BglII and KpnI. Site-directed mutagenesis was performed using the QuikChange Lightning Multi Site-Directed Mutagenesis Kit (Agilent Technologies). All primer sequences are provided in Supplemental Methods.

Transgenic reporter assays

Enhancers were cloned into pDuo2n-attB (Zinzen et al. 2009) directionally using BglII and KpnI to preserve their native orientation relative to the lacZ reporter. All integrase constructs were injected using standard methods to produce stably integrated insertions. In situ hybridization of homozygous embryos was performed to monitor expression of lacZ, the corresponding gene of interest, and either sna (mesodermal marker) or sim (to identify sna mutant embryos). Antisense RNA probes were labeled with Fluorescein (FITC), Digoxigenin (DIG), or Biotin (BIO) and visualized fluorescently via hapten-specific antibodies covalently linked to a peroxidase, followed by tyramide signal amplification (TSA-Plus System, PerkinElmer). Images were taken on a Zeiss 510META or Olympus FluoView FV1000 confocal microscope and analyzed using ImageJ and Adobe Photoshop.

ChIP data analysis and definition of ChIP-CRMs

All bioinformatics analyses were performed using D. melanogaster genome version 5 (UCSC dm3) and the Flybase 5.7 genome annotation release. The analysis procedure is described in (Zinzen et al. 2009). Briefly, TF-bound regions were called with TileMap version 2.0; the mapping of the Affymetrix GeneChip Drosophila Tiling 1.0R probes to the genome was obtained from the MAT website (http://liulab.dfci.harvard.edu/MAT/). For each TF, two biological ChIP replicates were analyzed against two stage-matched mock control hybridizations (Supplemental Tables S4 and S5). High-confidence bound regions were defined as those having a -log10(1-PP) value, where PP is the probe-wise maximum TileMap posterior probability, greater than or equal to 5.3 for Snail (PP=0.999995, FDR~11%, 1770 regions) and 5.5 for Twist (PP=0.999997, FDR~11%, 1212 regions). Similar to the procedure described in (Zinzen et al. 2009), threshold effects were limited using a rescue strategy: Snail (and Twist) regions below the cutoffs that had (1) a FDR > 25% and (2) an overlap of at least 50% with an above-threshold Twist (or Snail, respectively) region were rescued and included in the final ‘high-confidence’ region set. This procedure rescued 185 Snail-bound regions and 69 Twist-bound regions. Summits were localized in bound regions as described previously (Zinzen et al. 2009), leading to the identification of 2021 and 1540 summits, or peaks, for Snail and Twist, respectively. Finally, 300 bp regions centered on these peaks were defined, and overlapping regions (all Twist and Snail peaks taken together) were merged, resulting in potential ChIP-CRMs (Supplemental Table S6).
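Two steps of this paragraph lend themselves to a short illustration: the -log10(1-PP) transform behind the 5.3 and 5.5 cutoffs, and the merging of overlapping 300 bp windows into candidate ChIP-CRMs. The Python sketch below is a simplified reading of the text, not the pipeline of Zinzen et al. (2009); it treats a single chromosome arm, uses half-open intervals, and the summit positions are hypothetical.

```python
import math


def tilemap_score(posterior_probability: float) -> float:
    """-log10(1 - PP), the quantity thresholded at 5.3 (Snail) and 5.5 (Twist)."""
    return -math.log10(1.0 - posterior_probability)


def peak_windows(summits, width=300):
    """Half-open [start, end) windows of the given width centered on each summit."""
    half = width // 2
    return [(summit - half, summit + half) for summit in summits]


def merge_overlapping(intervals):
    """Merge overlapping half-open intervals lying on the same chromosome arm."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start < merged[-1][1]:
            # Overlaps the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical summit coordinates on one chromosome arm.
snail_summits = [1000, 1200, 5000]
twist_summits = [1100, 9000]

# Snail and Twist windows are pooled before merging, as described in the text.
candidate_crms = merge_overlapping(peak_windows(snail_summits + twist_summits))

print(round(tilemap_score(0.999995), 1), round(tilemap_score(0.999997), 1))
print(candidate_crms)
```

The transform reproduces the two cutoffs quoted above (5.3 and 5.5, after rounding), and the three clustered summits collapse into one candidate CRM spanning 850 to 1350, alongside two isolated 300 bp CRMs.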

Quantitative ChIP signal was computed as described in (Wilczynski and Furlong 2010). Briefly, the quantitative ChIP signal was calculated for each CRM in each experimental condition (i.e. Twist and Snail) as the maximum average value of probe intensities (i.e. the log2 ratio of IP over mock using the probe mean intensities) over a sliding window of 200 bp (Supplemental Table S6).
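The sliding-window maximum can be sketched in a few lines of Python. This is an illustrative reading of the description, not the implementation of Wilczynski and Furlong (2010): it assumes windows are anchored at each probe position, that each probe is represented by a single genomic coordinate, and the probe values are hypothetical.

```python
from statistics import mean


def quantitative_chip_signal(probes, window=200):
    """Maximum, over sliding windows, of the mean probe log2(IP/mock) ratio.

    `probes` is a non-empty list of (position, log2_ratio) pairs for one CRM.
    A window starts at each probe position and spans `window` bp (half-open).
    """
    probes = sorted(probes)
    best = float("-inf")
    for i, (start, _) in enumerate(probes):
        in_window = [ratio for pos, ratio in probes[i:] if pos < start + window]
        best = max(best, mean(in_window))
    return best


# Hypothetical probes spaced 40 bp apart across one CRM.
probes = [
    (0, 0.2), (40, 0.5), (80, 1.8), (120, 2.4),
    (160, 2.0), (200, 2.2), (240, 0.6), (280, 0.1),
]

print(quantitative_chip_signal(probes))
```

For these made-up values the best window starts at position 80 and averages about 1.8. Windows near the end of the CRM contain fewer probes, which a production implementation may want to handle explicitly.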

Differential Gene Expression Analysis

Sample hybridization was carried out on the Affymetrix Drosophila 2.0 Array platform. Data were analyzed in R/Bioconductor using an updated mapping of probe sets to NCBI Entrez genes (CDF file version 11, downloaded from the Brainarray website, http://brainarray.mbni.med.umich.edu/Brainarray/Database/CustomCDF/CDF_download.asp). Note that NCBI Entrez gene version 11 is similar to Flybase 5.7. Arrays were normalized using the GC-RMA method (as implemented in the R/Bioconductor gcrma package), and probe sets showing little variation (interquartile range, IQR < 0.5) across all conditions were removed (Bioconductor genefilter package) to increase downstream statistical power.
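The IQR filter itself is simple to express. The original analysis used the Bioconductor genefilter package in R; the Python sketch below only illustrates the idea with the standard library, and the probe-set names and expression values are hypothetical.

```python
from statistics import quantiles


def iqr(values):
    """Interquartile range (third quartile minus first quartile)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def filter_low_variance(expression, min_iqr=0.5):
    """Drop probe sets whose IQR across all conditions is below `min_iqr`."""
    return {
        probe_set: values
        for probe_set, values in expression.items()
        if iqr(values) >= min_iqr
    }


# Hypothetical normalized log2 expression values across six arrays.
expression = {
    "probe_set_flat": [7.0, 7.1, 7.2, 7.1, 7.0, 7.2],
    "probe_set_variable": [5.0, 5.2, 8.9, 9.1, 5.1, 9.0],
}

filtered = filter_low_variance(expression)
print(list(filtered))
```

Only the variable probe set survives: its IQR is 3.85, whereas the flat probe set has an IQR of 0.15.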

Genes differentially expressed in mutant conditions compared to the ∆haloAJ reference were called using the Bioconductor limma package (Supplemental Tables S1 and S2). Genes with an absolute log2 ratio > 1 and FDR < 0.05 were considered differentially expressed. The final "sna pooled" differentially expressed gene list is composed of genes that showed differential expression in either of the two snail mutants at either of the two time points. Genes exhibiting contradictory significant differential expression (e.g. a gene significantly up-regulated in one condition and down-regulated in another) were discarded (11 genes). This analysis defined 223 down-regulated genes and 255 up-regulated genes in snail mutants (lists available in Supplemental Table S3).
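The thresholding and pooling logic described in this paragraph can be sketched as follows. The statistics themselves came from limma in the original analysis; this Python sketch only applies the stated cutoffs to precomputed (log2 ratio, FDR) pairs, and the gene names and values are hypothetical.

```python
def call_direction(log2_ratio, fdr, min_abs_log2=1.0, max_fdr=0.05):
    """Return 'up', 'down', or None for one mutant-versus-reference comparison."""
    if abs(log2_ratio) > min_abs_log2 and fdr < max_fdr:
        return "up" if log2_ratio > 0 else "down"
    return None


def pool_calls(results):
    """Pool calls across comparisons (mutant alleles and time points).

    `results` maps each gene to a list of (log2_ratio, fdr) pairs, one per
    comparison. Genes called in opposite directions are set aside.
    """
    pooled = {"up": set(), "down": set(), "contradictory": set()}
    for gene, comparisons in results.items():
        directions = {call_direction(ratio, fdr) for ratio, fdr in comparisons}
        directions.discard(None)
        if len(directions) == 2:
            pooled["contradictory"].add(gene)
        elif directions:
            pooled[directions.pop()].add(gene)
    return pooled


# Hypothetical results: four comparisons per gene.
results = {
    "gene_a": [(-2.1, 0.001), (-0.4, 0.60), (-1.5, 0.01), (-0.2, 0.90)],
    "gene_b": [(1.8, 0.02), (0.3, 0.70), (0.1, 0.95), (0.9, 0.04)],
    "gene_c": [(1.4, 0.01), (-1.6, 0.03), (0.0, 0.99), (0.2, 0.80)],
    "gene_d": [(0.5, 0.001), (1.2, 0.20), (-0.8, 0.04), (0.1, 0.50)],
}

pooled = pool_calls(results)
print(pooled)
```

Here gene_a is pooled as down-regulated, gene_b as up-regulated, gene_c is set aside as contradictory, and gene_d never passes both cutoffs in the same comparison.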

Definition of ChIP-CRM sets

ChIP-defined CRMs found in the vicinity of genes differentially expressed in snail mutants (zone extending from 5 kb upstream to 1 kb downstream of the gene boundaries as defined in Flybase 5.7) were further considered and divided into 4 distinct groups:

(1) 'activated co-bound ChIP-CRMs': ChIP-CRMs co-bound by Twist and Snail found in the vicinity of down-regulated genes in snail mutants, 52 ChIP-CRMs (51 before manual inspection, see below)

(2) 'repressed co-bound ChIP-CRMs': ChIP-CRMs co-bound by Twist and Snail found in the vicinity of up-regulated genes in snail mutants, 50 ChIP-CRMs

(3) 'activated Snail-only ChIP-CRMs': ChIP-CRMs bound by Snail (but not Twist) found in the vicinity of down-regulated genes in snail mutants, 38 ChIP-CRMs (39 before post filtering, see below)

(4) 'repressed Snail-only ChIP-CRMs': ChIP-CRMs bound by Snail (but not Twist) found in the vicinity of up-regulated genes in snail mutants, 40 ChIP-CRMs (45 before post filtering, see below)

In groups (3) and (4), the absence of Twist binding (as defined by TileMap peak calling analysis) was ensured by filtering out CRMs having a Twist ChIP signal (see "Quantitative ChIP Signal") greater than 1.5 (the threshold was derived from the distribution of Twist log-ratios observed on 2635 random regions of size equivalent to the ChIP-defined CRMs, data not shown). Manual inspection of the eliminated sna-only CRMs revealed that a Mef2 enhancer known to be co-bound by Twist and Snail was eliminated by this procedure from group (4), i.e. the CRM was first classified among the 'repressing Sna-only CRMs' and then filtered out due to its high level of Twist binding signal. We confirmed that this Twist binding event could not be accurately modeled by our peak modeling algorithm due to its close proximity to the region boundary as defined by TileMap. We therefore manually added this CRM to the 'repressing co-bound CRMs' group. The ChIP-CRM set composition is available in Supplemental Table S7.
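The automatic part of this grouping (before manual inspection) can be summarized in a small Python sketch. It assumes that "vicinity" means any overlap between the CRM and the extended gene zone and that the upstream side follows the gene's strand; both are assumptions, since the text does not spell them out, and the function names and coordinates are hypothetical.

```python
def vicinity_zone(gene_start, gene_end, strand, upstream=5000, downstream=1000):
    """Gene boundaries extended 5 kb upstream and 1 kb downstream, strand-aware."""
    if strand == "+":
        return gene_start - upstream, gene_end + downstream
    return gene_start - downstream, gene_end + upstream


def overlaps(crm_start, crm_end, zone):
    """True if the CRM overlaps the zone (assumed meaning of 'in the vicinity')."""
    zone_start, zone_end = zone
    return crm_start <= zone_end and crm_end >= zone_start


def classify_crm(twist_peak, twist_signal, gene_change, max_twist_signal=1.5):
    """Assign a Snail-bound CRM to one of the four groups, or None if filtered out.

    twist_peak:   True if a Twist peak was called on the CRM
    twist_signal: quantitative Twist ChIP signal on the CRM
    gene_change:  'down' or 'up', the change of the nearby gene in snail mutants
    """
    effect = {"down": "activated", "up": "repressed"}[gene_change]
    if twist_peak:
        return f"{effect} co-bound"
    if twist_signal > max_twist_signal:
        # No Twist peak called, but too much Twist signal for a Snail-only CRM.
        return None
    return f"{effect} Snail-only"


print(vicinity_zone(10000, 12000, "-"))
print(classify_crm(True, 3.0, "down"))
print(classify_crm(False, 0.4, "up"))
print(classify_crm(False, 2.1, "up"))
```

A CRM without a called Twist peak but with a Twist signal above 1.5 returns None, which corresponds to the post-filtering step that removed CRMs from groups (3) and (4).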

de novo motif discovery

De novo motif discovery was performed using the peak-motifs tool (Thomas-Chollier et al. 2012) from the RSAT analysis suite on 200 bp repeat-masked regions centered on binding summits (see the "ChIP data analysis and definition of ChIP-CRMs" section). The following run parameters were used: "-markov auto -disco oligos,positions -nmotifs 5 -minol 6 -maxol 7 -2str". The results obtained with peak-motifs were confirmed using the XXmotif tool (Luehr et al. 2012) (data not shown).

Enrichment of Snail motifs in ChIP-CRM sets

Transcription factor binding sites (TFBSs) for Snail were predicted in the 4 CRM sets (see above) using Patser (Hertz and Stormo 1999) with a Patser score threshold of 3. The Snail position weight matrix (PWM) used is the one found by de novo discovery (alternative Snail CAGGTA motif shown in Fig. 4A, "Repressed Set" row, available in Supplemental Tables). At this threshold, a similar number of alternative CAGGTA Snail motifs was found in the two Snail-only CRM classes, while a significantly higher number of TFBS predictions was observed in co-bound repressed CRMs than in co-bound activated CRMs (p<1e-4, binomial test; co-bound activated: 90 matches, co-bound repressed: 130 matches, Snail-only activated: 84 matches, Snail-only repressed: 88 matches). As the alternative Snail PWM used here was learned on the co-bound repressed CRM set, it is expected to perform better on this set of CRMs. We used a Patser score threshold of 3 (corresponding to a PWM match prediction p-value of 2e-3) to include low-affinity TFBS predictions, which is required to compute the cumulative match score per CRM (right boxplots of Fig. 4A and Fig. S5A). However, at higher cut-offs (5-6, corresponding to PWM match prediction p-values of 5e-3 and 1e-4, respectively), the difference in match frequency between the co-bound activated and repressed CRM classes is not statistically significant (p=0.1, binomial test). We therefore conclude that there is no robust difference in match frequency between the two classes of CRMs. Similar results were obtained with the Jaspar MA0086.1 Snail PWM (shown in Fig. 4A and used in Fig. S5B) under the same settings (co-bound activated: 233 matches, co-bound repressed: 263 matches, Snail-only activated: 113 matches, Snail-only repressed: 150 matches, at a Patser threshold of 3).
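To make the notions of "match" and "cumulative match score per CRM" concrete, the Python sketch below scans both strands of a sequence with a log-odds matrix, keeps sites scoring at or above a threshold, and sums their scores. It is not Patser, and the matrix is a hypothetical CAGGTA-like toy matrix rather than the PWM of Supplemental Table S8; the uniform background and the summation used for the cumulative score are assumptions as well.

```python
import math

BASES = "ACGT"
CONSENSUS = "CAGGTA"

# Hypothetical probability matrix: 0.85 for the consensus base in each column,
# 0.05 for every other base. This is NOT the PWM from Supplemental Table S8.
PWM = [{base: (0.85 if base == c else 0.05) for base in BASES} for c in CONSENSUS]
BACKGROUND = 0.25  # assumed uniform base composition


def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def log_odds(site):
    """Natural-log odds score of one site against the uniform background."""
    return sum(math.log(col[base] / BACKGROUND) for col, base in zip(PWM, site))


def match_scores(sequence, threshold=3.0):
    """Scores of all sites, on either strand, scoring at or above the threshold."""
    width = len(PWM)
    scores = []
    for i in range(len(sequence) - width + 1):
        window = sequence[i:i + width]
        for site in (window, reverse_complement(window)):
            score = log_odds(site)
            if score >= threshold:
                scores.append(score)
    return scores


def cumulative_match_score(sequence, threshold=3.0):
    """Sum of the scores of all matches in the sequence (e.g. one CRM)."""
    return sum(match_scores(sequence, threshold))


print(len(match_scores("CAGGTA")), len(match_scores("TACCTG")))
print(cumulative_match_score("TTTTTTTT"))
```

With this toy matrix a perfect site scores about 7.3 and a site with one mismatch about 4.5, so a threshold of 3 admits weaker sites, which mirrors the reason given above for choosing a permissive cutoff.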

Enrichment of Twist motifs near Snail motifs

TFBSs for Twist and Snail were predicted in the 4 ChIP-CRM sets (see above) using Patser (Hertz and Stormo 1999), with a p-value threshold of 1e-3 to ensure that TFBS predictions of equivalent quality were used for the different PWMs. The PWMs used were (1) the Twist PWM from (Zinzen et al. 2009) and (2) the Snail PWM found by de novo discovery shown in Fig. 4A (alternative CAGGTA Snail motif, bottom left); both PWMs are available in Supplemental Table S8. The procedure used is described in (Zinzen et al. 2009). Briefly, the background distribution of Snail-Twist distances (distance to the closest Twist TFBS) was derived assuming a uniform distribution over the ChIP-CRM regions (as defined in the section "ChIP data analysis and definition of ChIP-CRMs"). Note that the centers of the TFBSs are used to compute the motif-to-motif distances. Enrichment over background was defined as the ratio of the frequency in the data set to the background frequency and was robustly estimated using a moving average with a 10-bp window. Equi-tailed 95% confidence intervals of the enrichment were estimated by re-sampling the observed distances 1,000 times with replacement. Deviations from random (red stars in Fig. 4C and Fig. S5C) were called where the confidence interval remained above 1 for at least 5 consecutive values. Plots presented in Fig. 4C are based on 63 and 83 distinct Snail-Twist distances (top left and bottom left plots, respectively), with 14 distinct distances from 11 distinct CRMs in the [50,65] interval (Fig. 4C top left), 12 distinct distances from 11 distinct CRMs in the [10,20] interval (Fig. 4C bottom left) and 7 distinct distances from 7 distinct CRMs in the [40,45] interval (Fig. 4C bottom left). Plots presented in Fig. S5C are based on 47 and 56 distinct Snail-Twist distances (top left and bottom left plots, respectively), with 8 distinct distances from 8 distinct CRMs in the [76,83] interval (Fig. S5C top left) and 8 distinct distances from 8 distinct CRMs in the [20,26] interval (Fig. S5C bottom left).
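The enrichment procedure combines four small steps: nearest-motif distances, a 10-bp moving average, a bootstrap confidence interval, and the rule of at least 5 consecutive values above 1. The Python sketch below illustrates them under simplifying assumptions. In particular, the background is passed in as a per-distance frequency list (a flat one in the example), which stands in for the uniform-placement background of Zinzen et al. (2009) rather than reproducing it, and all function names and example distances are hypothetical.

```python
import random


def nearest_distances(snail_centers, twist_centers):
    """Distance from each Snail site center to the closest Twist site center."""
    return [min(abs(s - t) for t in twist_centers) for s in snail_centers]


def smoothed_frequency(distances, max_distance, window=10):
    """Per-bp frequency of distances, smoothed with a moving average."""
    counts = [0] * (max_distance + 1)
    for d in distances:
        if 0 <= d <= max_distance:
            counts[d] += 1
    freqs = []
    for d in range(max_distance + 1):
        lo = max(0, d - window // 2)
        hi = min(max_distance, d + window // 2 - 1)
        span = counts[lo:hi + 1]
        freqs.append(sum(span) / (len(span) * len(distances)))
    return freqs


def bootstrap_enrichment_ci(distances, background, window=10, n_boot=1000, seed=0):
    """Equi-tailed 95% bootstrap interval of observed/background frequency."""
    rng = random.Random(seed)
    max_d = len(background) - 1
    curves = []
    for _ in range(n_boot):
        sample = rng.choices(distances, k=len(distances))  # with replacement
        freq = smoothed_frequency(sample, max_d, window)
        curves.append([f / b for f, b in zip(freq, background)])
    lower, upper = [], []
    for d in range(max_d + 1):
        values = sorted(curve[d] for curve in curves)
        lower.append(values[int(0.025 * n_boot)])
        upper.append(values[int(0.975 * n_boot) - 1])
    return lower, upper


def enriched_positions(lower, min_run=5):
    """Distances in runs of >= min_run consecutive values with lower bound > 1."""
    flagged, run = [], []
    for d, value in enumerate(lower):
        if value > 1:
            run.append(d)
            continue
        if len(run) >= min_run:
            flagged.extend(run)
        run = []
    if len(run) >= min_run:
        flagged.extend(run)
    return flagged


# Degenerate hypothetical example: 30 Snail sites, all 20 bp from a Twist site,
# compared against a flat stand-in background over distances 0..100.
flat_background = [1 / 101] * 101
lower, upper = bootstrap_enrichment_ci([20] * 30, flat_background, n_boot=50)
print(enriched_positions(lower))
```

In this degenerate example every resample is identical, so the interval collapses and the flagged distances are exactly the 10-bp smoothing window around 20 bp (16 to 25). Real data would give wider intervals, and flagged runs only where the lower bound stays above 1.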

Differential motif analysis, SVM predictions, and in silico mutagenesis

Differential motif analysis, SVM predictions, and in silico mutations were performed as described in (Yanez-Cuna et al. 2012). Briefly, occurrences of known and predicted motifs from (Stark et al. 2007) were counted in 401 bp windows centered on each ChIP peak's summit, using a PWM cutoff of 1/1024. Motif counts for Snail-activated and Snail-repressed peaks were used as features for SVM predictions using leave-one-out cross-validation. To identify the most discriminative features, we first clustered motifs by the similarity of their PWMs. Feature selection was performed by backward elimination, where the feature list was cut at the maximum of the prediction accuracy. To control for overfitting, we repeated exactly the same procedure on a control set in which the class assignments were shuffled. For the in silico mutations, we deleted all occurrences of a given motif in the test CRM by setting the respective count in the region's feature vector to zero. A score was calculated by training 100 SVMs on random subsamples of the remaining regions (i.e., choosing 50% of the sites in each class at random, 100 times), and then predicting the wild-type and the in silico mutated test CRM with all 100 SVMs, as in (Yanez-Cuna et al. 2012). A list of all the motifs found to be differentially enriched and discriminative, with their corresponding IUPAC sequences and original sources, can be found in Supplemental Table S9.

Supplemental References

Down TA, Bergman CM, Su J, Hubbard TJ. 2007. Large-scale discovery of promoter motifs in Drosophila melanogaster. PLoS computational biology 3: e7.

Hertz GZ, Stormo GD. 1999. Identifying DNA and protein patterns with statistically significant alignments of multiple sequences. Bioinformatics 15: 563-577.

Luehr S, Hartmann H, Soding J. 2012. The XXmotif web server for eXhaustive, weight matriX-based motif discovery in nucleotide sequences. Nucleic Acids Res 40: W104-109.

Schroeder MD, Pearce M, Fak J, Fan H, Unnerstall U, Emberly E, Rajewsky N, Siggia ED, Gaul U. 2004. Transcriptional control in the segmentation gene network of Drosophila. PLoS biology 2: E271.

Stark A, Kheradpour P, Parts L, Brennecke J, Hodges E, Hannon GJ, Kellis M. 2007. Systematic discovery and characterization of fly microRNAs using 12 Drosophila genomes. Genome Res 17: 1865-1879.

Thomas-Chollier M, Herrmann C, Defrance M, Sand O, Thieffry D, van Helden J. 2012. RSAT peak-motifs: motif analysis in full-size ChIP-seq datasets. Nucleic Acids Res 40: e31.

Wilczynski B, Furlong EE. 2010. Dynamic CRM occupancy reflects a temporal map of developmental progression. Molecular systems biology 6: 383.

Zinzen RP, Girardot C, Gagneur J, Braun M, Furlong EE. 2009. Combinatorial binding predicts spatio-temporal cis-regulatory activity. Nature 462: 65-70.
