Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion,...

44
www.sciencemag.org/cgi/content/full/science.1237973/DC1 Supplementary Materials for The Xist lncRNA Exploits Three-Dimensional Genome Architecture to Spread Across the X Chromosome Jesse M. Engreitz, Amy Pandya-Jones, Patrick McDonel, Alexander Shishkin, Klara Sirokman, Christine Surka, Sabah Kadri, Jeffrey Xing, Alon Goren, Eric S. Lander,* Kathrin Plath,* Mitchell Guttman* *Corresponding author. E-mail: [email protected] (M.G.); [email protected] (K.P.); [email protected] (E.S.L.) Published 4 July 2013 on Science Express DOI: 10.1126/science.1237973 This PDF file includes: Materials and Methods Supplementary Text Figs. S1 to S14 Captions for tables S1 to S6 References Other supplementary material for this manuscript includes the following: Tables S1 to S6 (Excel format)

Transcript of Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion,...

Page 1: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

www.sciencemag.org/cgi/content/full/science.1237973/DC1

Supplementary Materials for

The Xist lncRNA Exploits Three-Dimensional Genome Architecture to Spread

Across the X Chromosome

Jesse M. Engreitz, Amy Pandya-Jones, Patrick McDonel, Alexander Shishkin, Klara Sirokman,

Christine Surka, Sabah Kadri, Jeffrey Xing, Alon Goren, Eric S. Lander,* Kathrin Plath,*

Mitchell Guttman*

*Corresponding author. E-mail: [email protected] (M.G.); [email protected] (K.P.);

[email protected] (E.S.L.)

Published 4 July 2013 on Science Express

DOI: 10.1126/science.1237973

This PDF file includes:

Materials and Methods

Supplementary Text

Figs. S1 to S14

Captions for tables S1 to S6

References

Other supplementary material for this manuscript includes the following:

Tables S1 to S6 (Excel format)

Page 2: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

2

Materials and Methods ES cell lines

Male ES cells expressing Xist from the endogenous locus under control of a tet-inducible promoter (pSM33 ES cell line). We created an ES cell line in which 1,027-bp of the wild-type Xist promoter were replaced with a TetO promoter (Fig. S5). To this end, male mouse ES cells (V6.5 line –derived from blastocysts of a C57BL/6 × 129SV-Jae mouse cross) harboring the reverse tetracycline transactivator (M2rtTA) in the Rosa26 locus were electroporated with a targeting cassette containing a 2.93-Kb 5’ homology arm and a 2.99-Kb 3’ homology arm, encompassing a lox-Hygro-TK-lox selection cassette and a minimal CMV promoter that carries several tet-response elements. Homologous recombinants were selected for Hygromycin resistance after electroporation. Subsequent transient expression of Cre-recombinase led to excision of the selection cassette leaving only the TetO promoter upstream of the Xist gene, replacing the endogenous Xist promoter. Correct targeting and excision of the selection cassette were confirmed by Southern blotting. Upon induction with doxycycline, the majority of ES cells induce Xist expression as detected by FISH (Fig. S6B). Non-expressing cells do not affect the RAP experiments, which will capture Xist RNA-mediated chromatin interactions only when the Xist RNA is present.

Male ES cells carrying a wild-type or ∆A cDNA Xist transgene in the Hprt locus under control of the tet-inducible promoter. These transgenic male mouse ES cell lines harbor the mouse Xist cDNA with and without the A-repeat, respectively (cell lines ∆X, clone F6 and ∆SX, clone C9), under the control of a tet-responsive promoter within the Hprt locus on the X chromosome, and have the reverse tetracycline transactivator (M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in Xist Exon 1 were deleted (Fig. S12). These cell lines are subclones of previously published wild-type and ∆A repeat Xist cDNA lines (26), which were generously provided by Dr. Anton Wutz. We note that 50-60% of cells displayed Xist FISH signal after three hours of induction with doxycycline (Fig. S12). Similar to the pSM33 ES cell line, cells not expressing Xist will not affect RAP. We do not detect expression of the endogenous Xist allele by RAP or FISH in these Hprt transgenic ES cell lines.

Female ES cells (F1 2-1 line). This wild-type female mouse ES cell line is derived from a 129 × castaneous F1 mouse cross and undergoes skewed X-chromosome inactivation, preferentially silencing the 129 X-chromosome (~70%) due to genetic differences between the two chromosomes.

Throughout the text, the term “Xist transcription locus” refers to the site of active Xist transcription in that specific cell type: the Xist locus in pSM33s, MLFs, and F1 1-2s; and the Hprt locus in the Xist transgenic lines.

Cell culture For RAP, mouse lung fibroblasts (MLFs) were cultured in DMEM High Glucose

(Life Technologies) supplemented with 10% fetal calf serum (GlobalStem) and female ES cells and TetO-Xist male pSM33 ES cells were grown on plates coated with 0.2% gelatin in serum-free 2i/LIF media composed as follows: 1:1 mix of DMEM/F-12

Page 3: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

3

(Gibco) and Neurobasal (Gibco) supplemented with 1× N2 (Gibco), 0.5× B-27 (Gibco 17504-044), 2 mg/mL bovine insulin (Gemini BioSciences), 1.37 μg/mL progesterone (Sigma), 5 mg/mL BSA Fraction V (Gibco), 0.1 mM 2-mercaptoethanol (Sigma), 5 ng/mL murine LIF (GlobalStem), 0.1 μM PD0325901 (Axon Medchem 1408) and 0.3 μM CHIR99021 (University of Dundee Division of Signal Transduction Therapy). 2i inhibitors were added fresh with each medium change. Fresh medium was replaced every 24-48 hours depending on culture density, and passaged every 72 hours using StemPro Accutase (Life Technologies), rinsing dissociated cells from the plates with DMEM/F12 containing 0.038% BSA Fraction V. For RAP, the ΔX-F6 and ΔSX-C9 ES cell lines were maintained in knockout DMEM (Life Technologies) – supplemented with 15% FBS (Omega), 2 mM L-glutamine (Life Technologies), 1× NEAA (Life Technologies), 0.1 mM Beta-Mercaptoethanol (Sigma), 1× Penicillin/Streptomycin (Life Technologies), and 1000 U/mL murine LIF – on gelatinized plates covered with irradiated male DR4 feeders. For FISH on undifferentiated ES cells, cells were maintained under the same conditions on glass coverslips. To induce Xist expression in the pSM33, ΔX-F6 and ΔSX-C9 ES cell lines, we added doxycycline to a final concentration of 2 μg/mL at a defined time before harvesting or fixing for RAP or FISH.

Female ES cell differentiation time course For RAP on differentiating female ES cells (F1 2-1s), we cultured these cells in ES

medium containing Knockout DMEM (Life Technologies) supplemented with 15% fetal calf serum (GlobalStem), 1× NEAA (Life Technologies), 2 mM L-glutamine (Life Technologies), and 0.1 mM beta-mercaptoethanol (Sigma) with or without 1000 U/mL LIF (GlobalStem) on a mitotically inactivated male MEF feeder layer. At the start of the experiment, the ES cells were passaged using 0.05% trypsin and pre-plated for 45 minutes in ES medium onto plates treated with 0.3% gelatin to deplete MEFs from the culture. ES cells were then collected and seeded onto plates coated with 0.3% gelatin at a density of 15,000 cells/cm2 in ES medium + LIF. Cells were cultured for 48 hours and all-trans retinoic acid (RA) (Sigma) was added at different time points. At each time point, medium was removed from the culture and replaced with ES medium lacking LIF and supplemented with freshly diluted 1 μM RA. For the 48-hour timepoint, RA was added directly to the plated cells and fresh RA-containing ES medium (no LIF) was replaced after 24 hours. For FISH, differentiating female ESCs (F1 2-1) were passaged with 0.25% trypsin and plated in ES media onto 15-cm plates pre-coated with 0.3% gelatin to deplete feeder cells. After 45 minutes, the ES cell enriched-supernatant was harvested and then plated (25,000 cells/cm2) on glass coverslips pre-coated with 0.3% gelatin and left overnight to adhere. The following morning (t=0 hours), the ES medium was replaced with differentiation media (High glucose DMEM (Life Technologies) supplemented with 10% FBS (Omega), 2 mM L-glutamine (Life Technologies), 1× NEAA (Life Technologies), 0.1 mM beta-mercaptoethanol (Sigma), 1× Penicillin/Streptomycin (Life Technologies)) containing 1 μM RA. The cells were maintained in this differentiation media until fixation at specific times after addition of RA.

Page 4: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

4

Fluorescence In Situ Hybridization (FISH) and immunostaining FISH for Xist and Tsix RNA was performed as previously published (71) using

strand-specific RNA probes generated by in-vitro transcription in the presence of Cy3-UTP (Xist) (Perkin Elmer) or AlexaFluor488-UTP (Tsix) (Life Technologies) using a T3 riboprobe synthesis kit (Promega) from Xist exon 1 and exon 7 templates, respectively. To detect Xist, we labeled its antisense strand and for Tsix detection the Xist sense strand. At each time point after addition of doxycycline or RA, coverslips were transferred to a new culture dish containing 1× PBS with calcium and magnesium. The cells were then fixed in 4% paraformaldehyde (PFA) (EMS) in 1× PBS (lacking calcium and magnesium) for 10 minutes at room temperature (RT) under standard laboratory safety practices. After fixation, the cells were stored in 70% ethanol at -20°C until samples from all time points had been collected. Prior to hybridization with probe, cells were brought back to RT and serially dehydrated by 5-minute incubations in 80%, 95% and 100% ethanol at RT. Coverslips were removed from ethanol and allowed to air dry prior to incubation with probe for 12-18 hours at 37°C. Next day, coverslips were washed three times for 5 minutes in 50% formamide (Fisher) / 2× SSC (Ambion), three times for 5 minutes in Wash Buffer II (10 mM Tris, 0.5 M NaCl, 0.1% Tween-20), prior to a 45 minute incubation with 25 μg/mL RNase A (Life Technologies) in Wash Buffer II at 37°C. After RNase A treatment, coverslips were washed twice for 5 minutes in Wash Buffer II, twice for 5 minutes in 50% formamide / 2× SSC, three times for 5 minutes in 2× SSC and three times for 5 minutes in 1× SSC before briefly absorbing excess SSC with a kimwipe and mounting with prolong Gold antifade media (Life Technologies). All FISH washes were conducted at 42°C.

In cases where immunostaining and FISH were combined, immunostaining preceded FISH. In this case, cells were fixed in 4% PFA in 1× PBS for 10 minutes at RT under standard safety procedures. Cells were permeabilized in 0.5% Triton X-100 (Acros) in 1× PBS for 5 minutes at RT, followed by a 5 minute incubation in 0.2% Tween-20 (Acros) at RT prior to incubation in block (1× PBS, 0.2% Fish Skin Gelatin (Sigma), 0.2% Tween-20, 1/20th (v/v) Goat Serum (Vector Labs), 1 mg/mL yeast tRNA (Sigma) and 0.2 U/mL RNase Inhibitor (Life Technologies)) for 30 minutes at RT. Cells on coverslips were incubated in primary antibody in block for 1 hour at RT, washed 3× for 5 minutes in 0.2% tween-20 / 1× PBS, incubated with an Alexafluor-488 conjugated secondary antibody (Life Technologies) in block for 30 minutes (light-protected) at RT, prior to another round of washes. After the final immuostaining wash, cells were re-fixed for FISH in 4% PFA in 1× PBS, then dehydrated through a 70-85-95-100% ethanol series prior to overnight incubation with probe. At the end, cells were mounted in Prolong Gold antifade mounting media (Life Technologies). Primary antibodies were used at dilutions of 1:1000 (Pol II, Clone CTD4H8, Millipore Cat# 05-623), 1:400 (H3K27me3, Active Motif Cat# 39155), 1:200 (Ezh2, BD Pharmingen Cat# 612667).

We categorized Xist FISH signals as pinpoints, spots, small clouds, and large clouds. The pinpoint Xist signal was defined as a signal smaller than that of Tsix. The Xist Spot signal was larger than that of Tsix. Small Xist clouds displayed a small degree of spreading and within which a bright foci, likely the site of transcription, was easily discernible. The large Xist clouds were characterized by a large uniformly bright Xist RNA signal.

Page 5: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

5

RNA Antisense Purification (RAP) Probe design. To capture target RNAs, we designed sets of 120-nucleotide oligos

tiled every 15 nucleotides across the entire RNA sequence. To avoid off-target hybridization, we excluded sequences that contained a perfect 30 base-pair match or an imperfect (90% identity) 60 base-pair match with another transcript or genomic region. We further excluded all probes that contained more than 30 bases that originated from a repetitive region, regardless of its unique mapping.

Probe generation. For each probe-set (e.g. the set of all probes targeting Xist), we added a unique pair of PCR tags that allowed amplification of the DNA oligo templates (Table S4, Table S5). We employed microarray-based DNA synthesis (Agilent Technologies, Inc.) to synthesize an oligo pool (72) containing the probe-sets for multiple genes. Following DNA synthesis, we resuspended the lyophilized DNA pellet in water and enriched for particular probe-sets by 20-25 cycles of PCR (Phusion High-Fidelity PCR Master Mix (New England Biolabs), 2.5 pmol left and right primers, 10 fmol DNA template). We incorporated a T7 promoter sequence (GGATTCTAATACGACTCACTATAGGG) in a second round of 10-15 cycles of PCR, switching the strand position of the T7 promoter to allow for generation of sense or antisense RNA probes (Table S5). In vitro transcription by T7 RNA polymerase was carried out in the presence of 25% 16-biotin-UTP (Ambion). Following in vitro transcription, we eliminated template DNA with TURBO DNase (Ambion) and purified RNA probes with RNeasy columns (Qiagen). While for this study we generated the probesets using custom microarrays, RAP can also be performed using in vitro transcribed fragments. We recommend in vitro transcription of full-length antisense RNA clones followed by controlled fragmentation (e.g., by heating in the presence of 10 mM ZnCl at 72°C for 10 minutes) to generate ~60-120mer RNA probes.

Crosslinking. We crosslinked adherent cells on plates by first rinsing with room temperature PBS and then adding PBS + 2 mM disuccinimidyl glutarate (DSG, Pierce) for 45 minutes at room temperature. Cells were washed in PBS, then further crosslinked in PBS + 3% formaldehyde at 37°C for 10 minutes. Glycine was added to 250 mM final concentration and incubated at 37°C for an additional 5 minutes. We washed the cells in the plate twice in ice-cold PBS, then added ice-cold PBS + 0.5% BSA Fraction V and harvested cells by scraping. We washed cell pellets in ice-cold PBS + 0.5% BSA Fraction V, spun to remove supernatant, and snap-froze cell pellets in liquid nitrogen.

Lysate Preparation. We lysed batches of ten million cells on ice in 1 mL Lysis Buffer 1 (10 mM HEPES pH 7.5, 20 mM KCl, 1.5 mM MgCl2, 0.5 mM EDTA, 1 mM Tris(2-carboxyethyl)phosphine (TCEP), 0.5 mM PMSF), then spun pellets at 3,300× g for 10 minutes. We resuspended cell pellets in 1 mL Lysis Buffer 1 plus 0.1% NP-40 and dounced 20 times. Following another spin, we lysed nuclei in 550 μL Lysis Buffer 2 (20 mM Tris pH 7.5, 50 mM KCl, 1.5 mM MgCl2, 2 mM TCEP, 0.5 mM PMSF, 2.5% Murine RNase Inhibitor, 0.4% sodium deoxycholate, 1% NP-40, 0.1% N-lauroylsarcosine) for 10 minutes. To solubilize chromatin, we sonicated samples with a Branson Sonifier at 5 watts for 1 minute at 4°C. To obtain DNA fragments of approximately 100-300 bp, we treated samples with 2.5 mM MnCl2, 0.5 mM CaCl2, and 200 U TURBO DNase (Ambion) at 37°C for 10-20 minutes. We halted DNase digestion by addition of 10 mM EDTA and 5 mM EGTA on ice. We diluted the lysate to hybridization conditions by adding 1.4× RAP Hybridization Buffer (1×: 20 mM Tris pH

Page 6: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

6

7.5, 7 mM EDTA, 3 mM EGTA, 150 mM LiCl, 1% NP-40, 0.2% N-laroylsarcosine, 0.1% sodium deoxycholate, 3 M guanidine thiocyanate, 2.5 mM TCEP). We cleared the lysate by spinning samples at 16,000× g for 10 minutes, then snap-froze batches of lysate in liquid nitrogen.

Purification. The following purification describes RAP for 200,000, but we performed purifications on 500,000-5,000,000 cells and scaled the protocol accordingly. This approach provided sufficient yield of the Xist RNA and was amenable to 12-well format, but we note that RNA and DNA yields can be improved without compromising enrichments by increasing the amounts of probe and bead. For each purification, we heated 80 µL of lysate in hybridization buffer to 45°C, then pre-cleared by adding 12 µL streptavidin-coated C1 beads (Invitrogen) and incubating at 45°C for 20 minutes. Approximately 20 ng (350 fmol) of biotin-labeled RNA capture probes were denatured in water for 2 minutes at 75°C, then snap-cooled on ice. We mixed denatured probes with the heated lysate and incubated at 45°C for 2 hours to capture target RNAs. We captured the biotinylated probes by addition of 4 µL Streptavidin C1 beads for 15 minutes. We transferred samples to a magnetic rack and washed six times with RAP Wash Buffer (20 mM Tris pH 7.5, 10 mM EDTA, 1% NP-40, 0.2% N-laroylsarcosine, 0.1% sodium deoxycholate, 3 M guanidine thiocyanate, 2.5mM TCEP) at 45°C for 4-5 minutes. At this point, we used two different elution methods for examining associated DNA or RNA.

Eluting for RNA qPCR and RNA sequencing. We eluted twice in 22 µL RAP Elution Buffer (20 mM Tris pH 7.5, 10 mM EDTA, 2% N-laroylsarcosine, 2.5 mM TCEP) by heating to 94°C for 5 minutes. We pooled the resulting eluates and reversed crosslinks by incubating at 65°C for two hours after addition of 250 mM sodium chloride and 1 mg/mL Proteinase K (New England Biolabs). We purified nucleic acids by isopropanol precipitation onto SILANE beads (Invitrogen) and treated with TURBO DNase (Ambion) before proceeding to RT-qPCR or library preparation for RNA-Seq.

Eluting for DNA sequencing. We eluted captured chromatin complexes and reversed crosslinks by adding 80 µL RAP Elution Buffer plus 250 mM NaCl and 1 mg/mL Proteinase K to the oligo-bead complexes and incubating overnight at 65°C.

RNA RT-qPCR. To quantify RNA yield and enrichment, we first treated eluted samples with TURBO DNase. Next, we converted eluted RNA to cDNA using random primers and AffinityScript Reverse Transcriptase (Agilent). We performed qPCR using LightCycler 480 SYBR Green I Master Mix (Roche). Since the RNA samples used here contain both the target RNA as well as the RNA capture probes, we needed to avoid amplifying the probe sequence when performing qPCR. To accomplish this, we designed qPCR primers with amplicons greater than 160 bp on the target RNA (Table S4). Because any given probe contains only 120 bp of the target RNA sequence, this primer design scheme ensured that only the target cDNA would exponentially amplify during qPCR measurement. In some cases where linear amplification of probe cDNA interfered with accurate qPCR quantification, we used strand-specific reverse-transcription primed with a target-specific primer followed by qPCR with long-amplicon primers.

RAP RNA sequencing. While we observed excellent RNA enrichments by qPCR, we wanted to ensure that RAP did not enrich for nonspecific RNAs through off-target hybridization. To accomplish this, we sequenced all of the RNA purified in Xist RAP in TetO-Xist male ES cells (pSM33) after three hours of induction with doxycycline. However, sequencing the eluted RNA proved challenging because our RNA capture

Page 7: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

7

probes comprised a large proportion of the RNA in the sample. Since the probes are antisense to the RNA of interest and contain unique PCR tags on the ends, it is easy to distinguish probe reads from that of the captured RNA. However, the high ratio of probes to endogenous RNAs would overwhelm our ability to detect other RNAs without a method for reducing the representation of probe sequences in our RNA-seq library. To accomplish this, we used a custom strand-specific library preparation protocol that removed probe-cDNA hybrids immediately after reverse-transcription as follows.

To prepare eluted RNA for RNA-sequencing, we first treated with TURBO DNase at 37°C for 10 minutes. We then dephosphorylated eluted RNA (including probe RNA) with FastAP Thermosensitive Alkaline Phosphatase (Thermo Scientific) at 37°C for 10 minutes, then added 25 mM EDTA and heated at 68°C for 2 minutes. We cleaned the reaction using RNA Clean & Concentrator 5 columns (Zymo Research). We combined dephosphorylated RNA with 20 pmol RiL-19 adaptor (Table S5), denatured at 70°C for 2 minutes, then snap-cooled by transferring to ice. We ligated the adaptor to the RNA using T4 RNA Ligase 1 (New England Biolabs) at 23°C for 75 minutes, shaking frequently. We cleaned the ligation reactions by adding 3× volume Buffer RLT (Qiagen) and 0.58× volume ethanol, precipitating onto SILANE beads, washing twice in 70% ethanol, and eluting in water. Following clean-up, we added 12 pmol AR17 RT primer (Table S5) and synthesized first strand cDNA using AffinityScript Reverse Transcriptase (Agilent), incubating at 55°C for 5 minutes followed by 4°C hold. Here, we omitted the heat-inactivation step in order to preserve RNA-cDNA hybrids. Next, we removed probe RNA and newly-synthesized cDNA by adding Streptavidin C1 beads with 250 mM LiCl and 25 mM EDTA, incubating at 60°C for 15 minutes, then removing the beads and associated probe-cDNA hybrids from the cDNA mixture. For the remaining cDNA sample, we degraded RNA by adding 100 mM sodium hydroxide and incubating at 70°C for 10 minutes. Base was neutralized with addition of 100 mM acetic acid, then cleaned using SILANE beads as described above. We added a second adaptor to the cDNA by adding 40 pmol 3Tr3 DNA adaptor (Table S5) and ligating with T4 RNA Ligase 1 (high-concentration, New England Biolabs) at 23°C overnight. Following clean-up with SILANE beads, we amplified the cDNA library for 10 cycles using barcoded primers (Table S5). We cleaned amplified libraries with AMPure XP beads (Agencourt) and quantified library yield using an Agilent Bioanalyzer. We sequenced these libraries (Illumina MiSeq, 25-bp paired-end) and for Xist RAP obtained ~300,000 reads that mapped to the transcriptome, of which ~70% mapped with the correct strand to the Xist RNA.

RAP DNA sequencing. From eluted DNA, we generated standard Illumina sequencing libraries using the ChIP-Seq Library Prep Master Mix Set (New England Biolabs). We used barcoded adaptors to allow multiplexed sequencing, and amplified RAP libraries with 10-16 cycles of PCR. We sequenced pooled libraries on the Illumina HiSeq to generate >5 million 25-bp paired-end reads per sample (Table S6).

For additional protocols and information, visit http://lncRNA.caltech.edu/RAP/. Additional RAP controls

RAP in non-crosslinked extracts. We performed RAP in non-crosslinked TetO-Xist male ES cells (pSM33) using light sonication and DNase for lysis. Although the RNA was enriched, X-chromosome DNA levels were undetectable after 45 cycles of qPCR.

Page 8: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

8

RAP using random probes. As an additional control for nonspecific interactions with RNA probes, we used probes tiled across a scrambled Tug1 sequence (Table S3). Purifications with this probeset typically purified RNA and DNA amounts that were undetectable by qPCR after 40 cycles (Fig. S1).

RAP of mRNA controls. To confirm that we did not capture Xist RNA nonspecifically with any purified RNA, we designed a probeset targeting the abundant ES mRNA Oct4. Oct4 RAP enriched for the Oct4 RNA but did not enrich for the Xist RNA (Fig. S1).

RAP in cells that do not express Xist. To confirm that we did not pull down X-chromosome DNA nonspecifically, we performed RAP in pSM33 male ES cells prior to induction, when Xist is not expressed. We found that Xist RAP does not enrich X-chromosome DNA except for the Xist genomic locus itself.

DNA sequencing alignment and analysis

Sequencing reads were aligned to the Mus musculus genome (mm9) using BWA version 0.5.9 with default parameters (73). Reads were removed if they had mapping qualities less than 30 or were flagged as PCR duplicates for both read pairs. We calculated alignment statistics for each library using the Picard package (http://picard.sourceforge.net). All calculations were performed using fragment counts, which were defined using the read-pairs.

Defining "unmappable" regions To define regions across the chromosome which were unmappable or showed biased

mapping statistics, we excluded from our analysis regions if they showed >50% “unmappable” bases based on one of two properties. (i) We examined the input sample containing >100 million reads (25-bp paired end) and flagged all 100-bp windows that contained fewer than two reads. (ii) We simulated 100 million random paired end reads using the same fragment-length distribution as our experiment. We then mapped these random reads to the genome, flagging all 100-bp windows that contained fewer than two reads. All bases contained in any of these flagged windows were defined as unmappable. These unmappable regions were removed from subsequent analyses as described below.

Calculating and plotting RAP enrichments To account for differences in input coverage of different genomic regions, we

calculated an enrichment ratio between fragment counts in the RAP sample and in the input. This ratio was defined as (fragment counts in the RAP sample + 0.1) / (fragment counts in the input sample + 0.1), where adding the extra fractional counts prevented division by zero. This normalized enrichment ratio was used in all further computational analysis. We note that an alternative to this approach would be to normalize using the control RAP experiment instead of the input; however, this approach was not practical because the control sequencing libraries had extremely low coverage and library complexity owing to the low amount of DNA captured (Table S6), and thus resulted in noisy enrichment ratios dominated by sampling error.

To plot RAP enrichments (as well as other continuous data tracks), we calculated an enrichment or score in sliding windows across the genome, where each window

Page 9: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

9

overlapped the previous one by 75% (e.g. 100-Kb windows tiled every 25-Kb). We plotted each of these data points for the region of the interest. For windows containing >50% unmappable bases, we performed a linear interpolation from the closest mappable windows and plotted these interpolated enrichments in light gray in all figures.

Defining significantly enriched or depleted sites We used a permutation-based approach to identify regions that show strong

enrichment or depletion relative to the rest of the genome. We scanned across the genome with overlapping windows of various sizes. We excluded all windows with >50% unmappable bases, as defined above. For each window, we counted the number of fragments overlapping the window in both the Xist RAP sample (nXist) and in the input (ninput). We calculated a binomial p-value for each window, asking if the number of reads mapping to that window in the Xist sample (nXist) represents a significantly large or small fraction of the total number of reads mapping to that window (nXist + ninput) given the total number of reads in each library. We compared these p-values to a null distribution. To generate a null distribution of p-values, we first permuted reads in the RAP library to other mappable regions on the same chromosome. We then calculated binomial p-values in windows across the genome as above. We repeated this process for 100 permutations of the reads and combined the permuted ratios to create a null distribution. We called windows as significant if they exceeded the 99.9th percentile of the null distribution.

To call significantly enriched or depleted sites in MLF, we scanned the X-chromosome using a window size of 100 Kb. To call peaks across the genome for homology and de novo motif analysis, we used a window size of 500 bp.

Motif analysis on initiation sites To identify de novo motifs, we tested 500-bp peaks using MEME-ChIP (74)

(http://meme.nbcr.net/meme) with the default parameters and a DREME E-value cutoff of 0.001. We used the 500 peaks with the highest binomial p-values since MEME-ChIP limits the number of input regions. We used this protocol for finding de novo motifs in Xist RAP data from three experiments: MLFs, TetO-Xist male ES cells before induction, and TetO-Xist male ES cells six hours after induction. We considered peaks on autosomes and the X-chromosome separately in order to distinguish artificial motifs on autosomes from potentially real ones on the X-chromosome. In these six runs of MEME-ChIP (three RAP experiments, two sets of peaks for each experiment), we identified no significant motifs.

Sequence homology to probes To determine whether RAP nonspecifically enriched some regions of the genome

due to off-target complementarity between the probes or Xist RNA and genomic DNA, we examined the sequence homology between significantly enriched genomic regions and the probes or Xist RNA sequence. We first identified the most significantly enriched 500-bp regions and computed the enrichment in these regions as described above. We then aligned each of these 500-bp regions to the Xist RNA sequence or to the probe sequences using the Jaligner implementation of the Smith-Waterman algorithm (match score = 1.0, mismatch score = -1.0, open gap penalty = 2.0, extended gap penalty = 1.0).

Page 10: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

10

For each region, we took the best local alignment and calculated an identity score, defined as the percentage of bases in the region that locally align with the probe or Xist RNA sequence. For alignments to the probes, we used the maximum identity score across all probes as the score for each region.

As a control, we permuted these 500-bp regions ten times across the mappable portions of their respective chromosomes as described above. We computed the distribution of identity scores for the real and permuted regions, and found that these distributions were highly similar both on autosomes and on the X-chromosome (Fig. S2). This finding was true for Xist localization in MLFs and in male ES cells one hour after induction. These results demonstrate that regions with high Xist enrichment do not have significantly higher sequence homology to the Xist RNA or probes.

Defining early localization sites

To identify early sites in the time-course experiments, we looked for windows with enrichments that exceeded the local mean. We used a local enrichment statistic so that we could find preferential contacts even on the distant ends of the chromosome; such sites might not be enriched when compared to the entire chromosome, but would deviate significantly from the local background expected from the observed slope away from the Xist locus. To accomplish this, we calculated input-normalized enrichment ratios as described above and compared each window to neighboring windows within 10 Mb on the chromosome. We calculated a z-score based on enrichments to the neighboring windows, and called initiation sites as windows with z > 1.65 (P < 0.05). We used a 100-Kb window size to define initiation sites (Fig. 3D, Fig. S9).

Correlation analysis

We downloaded genome annotation tracks from the UCSC Genome Browser, NCBI GEO, and authors' web sites (3, 48, 52, 75, 76). These annotations included sequence annotations such as repeat elements as well as functional genomics datasets generated in embryonic stem cells, mouse lung fibroblasts, or related cell types such as mouse embryonic fibroblasts. For discrete annotation tracks (e.g., BED files containing positions of LINE elements), we counted the number of annotations contained in a given window on the genome. For continuous annotation tracks (e.g., BEDGraph files containing continuous measurements of lamin binding in a given window), we calculated the sum of the scores for all entries contained in each window. We calculated correlations between these scores and input-normalized Xist enrichment in overlapping windows across the genome at various resolutions, where each window overlapped the previous by 75% (Table S1). For all analyses (except in MLF, where the Xist locus is not a strong outlier), we excluded regions within 10-Mb of the Xist transcription locus from the correlation calculation.

The pattern of Xist enrichment after one hour of induction in TetO-Xist male ES cells is dominated by an upward slope towards the Xist transcription locus. To account for this trend and search for features that might recruit Xist localization, we computed a z-score representing the local enrichment for each window compared to neighboring windows on the chromosome, as described above. We then correlated this z-score with the various genomic features. For correlations with a 1-Mb window size, we compared each window to others within 10-Mb on the chromosome. For correlations with a 10-Kb

Page 11: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

11

window size, we compared each window to others within 1-Mb on the chromosome. All correlations involving the TetO-Xist male ES cells (except for correlations with Hi-C data) compare genomic features to these local enrichment z-scores rather than the standard input-normalized Xist enrichment.

Annotation enrichment analysis To determine the enrichment of features within defined sites, we computed a score

for each region. In the case of discrete features, we computed the density of the features within the region. For continuous features, we computed the sum of the scores of all regions within the window. To assign a normalized score to each window that is comparable across feature classes, we permuted the regions across the chromosome 100 times and calculated scores observed for these random regions. We compared the scores for the real set of regions to those for the permuted set of regions to calculate an average fold enrichment. To compute the significance of the enrichments across the entire set of features, we took all permuted values and all real values and tested for a significant difference between them using the nonparametric Mann-Whitney test.

Analysis of Hi-C data We downloaded Hi-C data from male mouse ES cells (GEO GSE35156) (52). We

used the “HindIII_combined” table containing normalized Hi-C interaction counts between all 40-Kb bins across each chromosome. An interaction count is defined by a pair of reads mapping in two different 40-Kb bins. Because of the sparseness of the data, we further binned it into 1-Mb bins by summing all 40-Kb windows within the 1-Mb region. We examined the interaction counts between the bin containing the Xist transcription locus and all other bins across the X-chromosome. For the correlation analysis, we excluded all bins within 10 Mb on either side of the Xist transcription locus, which would otherwise dominate the correlation calculation due to the strong local peaks in both the Hi-C and RAP datasets. Correlation between escape frequency and Xist localization

To approximately quantify escape gene frequency, we used data from a previous study that calculated the RNA allelic expression ratio between the inactive and active X-chromosomes using RNA sequencing (39). We plotted values for the relative levels of Xi and Xa expression for the 13 known escape genes directly from Table 1 in Yang et al. (39). To compare these ratios to the level of Xist enrichment observed by RAP, we compute Xist coverage over the entire gene (including introns). We note that the frequency that a gene escapes XCI appears to vary somewhat between cell types and studies.

Aggregate gene analysis For Figure 5 and Figure S11, we analyzed previously published RNA-Seq data from

embryonic stem cells using Scripture (3), and defined “active” genes as those expressed with P < 0.001. We averaged Xist enrichments in 1-Kb windows for the 100 Kb upstream and downstream of a gene, the 10 Kb starting at the beginning and end of a gene, and the 20 Kb centered at the middle of a gene. Genes within 5 Mb of the Xist

Page 12: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

12

transcription locus were excluded from the analysis because they represent outliers in terms of average Xist enrichment.

Enrichment of L1 and SINE subfamilies in RAP sequencing data We aligned all sequencing reads to subfamily consensus sequences from RepBase

17.03 (77) and did not filter duplicate reads or reads with low alignment scores. To determine whether specific subfamilies were over-represented in the Xist RAP sequencing data, we performed two normalizations. First, we compared the number of reads mapping to a subfamily in Xist RAP versus input DNA. Second, we accounted for differences in the representation of this subfamily on the X-chromosome versus the rest of the genome. This last normalization is important because L1s are enriched 2-fold on the X-chromosome compared to the rest of the genome, so that if we purified the X-chromosome evenly across its entire sequence we would see a 2-fold enrichment for L1s. To accomplish this, we counted the frequency of each subfamily on the X-chromosome and across the genome using repeats annotated in RepeatMasker (77), and incorporated this information into a final enrichment score for each subfamily:

x = fraction of repeat instances that are located on chrX fobserved = fraction of all reads that map to the subfamily consensus in Xist RAP fInput = fraction of all reads that map to the subfamily consensus in the input RXistRAP = fraction of all uniquely mapping reads that align to chrX in Xist RAP RInput = fraction of all uniquely mapping reads that align to chrX in the input OchrX = enrichment of chrX reads in Xist RAP versus input

= RXistRAP / RInput OnonX = enrichment of non-chrX reads in Xist RAP versus input

= (1 – RXistRAP) / (1 – RInput) fexpected

= fraction of all reads expected to map to chrX based on enrichment for chrX in the purification and representation of repeat subfamily on chrX = fInput × [(x × OchrX) + ((1 – x) × OnonX)]

Enrichment = fobserved / fexpected

Resolution for figures and analyses To visualize dense enrichment data, all figures displaying chromosome-wide

enrichments were plotted at 100-Kb resolution, and all figures showing a subset of the chromosome are plotted at 10-Kb resolution. In the text and in Fig. 4, we reported all correlations at 1-Mb resolution so that these values can be compared directly to correlations with the Hi-C data, which has limited resolution.

Data visualization

We plotted genomic data using Bioconductor and Gviz (78).

Page 13: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

13

Supplementary Text Note S1. RAP: A generalizable method to map lncRNA interactions with chromatin

We set out to develop a generalizable method to map endogenous lncRNA interactions with chromatin. To do this, we needed to (i) specifically purify chromatin associated with a target lncRNA, (ii) achieve high resolution mapping of the associated DNA target sites, and (iii) robustly capture any lncRNA with minimal optimization.

We designed RNA Antisense Purification (RAP) to achieve these three goals. We first fix endogenous lncRNA-chromatin interactions by crosslinking cells. Here, we used a combination of formaldehyde and the protein-protein crosslinker disuccinimidyl glutarate (DSG), although RNA purification with RAP also works with cells fixed with other crosslinking reagents. To enable high resolution mapping of target DNA regions, we lyse crosslinked cells and digest DNA into ~150-bp fragments using DNase I, an enzyme that specifically cleaves DNA and leaves RNA intact. We capture the target RNA by incubating cell lysate with a pool of 120-nucleotide biotinylated antisense RNA probes tiled across the entire RNA sequence, enabling hybridization to any accessible portion of the target RNA. Following the hybridization step, we capture probe-RNA hybrids and associated chromatin using streptavidin beads. Finally, we elute bound RNA-chromatin complexes and sequence the purified genomic DNA. To ensure specificity for lncRNA-associated DNA, we perform the experiment in parallel with a control probeset consisting of “sense” probes from the same strand as the RNA, which should not capture single-stranded RNA but will bind equally well to double-stranded genomic DNA.

RAP allows specific, high-resolution, and robust purification of endogenous lncRNA-chromatin complexes. In the following sections, we describe the design choices we made to achieve these goals, and compare our protocol with previous approaches, ChIRP and CHART (32, 33). These comparisons represent conceptual differences in the design of the RAP method and are not meant as direct comparisons with other methods. When features are shared by either ChIRP or CHART, we highlight how RAP integrates these features into a single approach. Finally, we note that each method may be more suitable for different applications and/or lncRNAs, and a systematic comparison of the three methods will be required to evaluate their strengths in different contexts.

RAP achieves high specificity for the target RNA and associated DNA. The most important feature of a method to map lncRNA target sites is the ability to specifically enrich the target RNA and its associated chromatin. To accomplish this, any method designed to capture RNAs by hybridization must account for the possibility of nonspecific hybridization between the capture probes and off-target nucleic acids. In particular, methods to investigate lncRNA-chromatin interactions must distinguish between false signals that result from nonspecific probe-DNA hybridization and real signals that reflect in vivo RNA-DNA interactions. Indeed, a previous study using ChIRP identified a lncRNA binding motif that matched the sequence of the lncRNA (32). Because direct RNA-DNA hybridization represents a possible mechanism by which lncRNAs bind to chromatin, we wanted to develop a method that could easily distinguish this from potential direct hybridization artifacts.

In designing RAP, we addressed this challenge by using 120-nucleotide capture probes that allow more specific purification of target RNAs compared to the 20-25-

Page 14: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

14

nucleotide probes used previously (32, 33). The stronger interaction between the probes and target RNA allows hybridization and washing in 3 M guanidine thiocyanate at 45°C, drastically increasing the stringency of nucleic acid hybridization (64) and thus reducing the potential for false positives due to direct probe-DNA hybridization. The sense-strand control also helps us to distinguish specific RNA hybridization in vivo from RNA- or probe-mediated hybridization in solution. Using these two approaches, we identify no relationship between Xist enrichment and sequence homology to the probes or captured RNA sequences (Fig. S2), with the exception of the Xist locus itself (Fig. S7). We estimate by qPCR that <5% of the reads in Xist RAP at the Xist locus result from direct probe-DNA interactions, while >95% of the reads result from probe-RNA interactions.

Beyond direct hybridization between the probes and genomic DNA, other potential sources of artifacts include capture of off-target RNAs or protein complexes and their associated target sites through nonspecific interactions with the probes or bead (Fig. S1). We note that this is a well-characterized problem for RNA-protein interactions (65). To account for these possibilities, we used a highly denaturing hybridization buffer with 3 M guanidine thiocyanate, and additionally incorporated a pre-clearing step with streptavidin beads to remove proteins, DNA, or RNA species that might directly interact with the beads. These features allow RAP to achieve high enrichments of the target RNA and associated chromatin while avoiding nonspecific interactions (Fig. 1).

RAP provides high resolution mapping of target sites. A second feature of RAP is the ability to map lncRNA localization at high resolution. For all methods used to map chromatin interactions, resolution is achieved through digestion of genomic DNA prior to purification of specific regions. The resolution of the assay is dependent on the size of these fragments. For example, ChIP experiments are performed after generating ~100-500bp fragments of genomic DNA through the use of either heavy sonication or micrococcal nuclease (MNase) digestion. However, these approaches do not work for capturing RNA because both sonication and MNase digestion also fragment RNA, thus complicating attempts to capture lncRNAs and associated chromatin complexes. To avoid this problem, CHART utilizes light sonication to avoid extensive degradation of the RNA. However, this approach produces DNA fragments that are ~2-3 Kb (33), effectively limiting the resolution of mapped DNA target sites. Conversely, the ChIRP method digests DNA to much smaller sizes through heavy sonication. This enables higher resolution mapping but likely leads to higher levels of RNA degradation.

To achieve high resolution mapping while maintaining high RNA integrity, RAP uses a combination of light sonication and DNase I, an enzyme that specifically cleaves DNA but not RNA, to digest genomic DNA to an average size of ~150 bp (see Methods). After treatment with DNase I, we find that most of the RNA is fully intact (based on examination of the 18S and 28S ribosomal RNAs), enabling robust mapping of the RNA while achieving high resolution to target locations.

Because different regions of the genome have differing levels of sensitivity to DNase digestion, we verified that our use of DNase I did not affect our ability to quantitatively map lncRNA localization across the genome. To accomplish this, we sequenced input samples prepared with sonication alone (DNase-, ChIP input) or with DNase I treatment (DNase+, RAP input). We found that the genomic DNA representation of the ChIP and RAP inputs showed very similar magnitudes and patterns of enrichment and depletion across the entire genome (Fig. S13, see Methods). In both RAP and ChIP

Page 15: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

15

input samples, we observed higher DNA coverage in regions of active chromatin – consistent with the notion that these regions are more amenable to both enzymatic digestion and sonication-induced fragmentation. To control for this modest variation as well as for differences in coverage due to mappability, we normalize all RAP and ChIP data to the relative coverage in the input (see Methods), thus allowing comparisons between regions with differing mappability and DNase sensitivity.

Although RAP was designed to achieve high resolution mapping, this feature was not required in the case of Xist, which we show binds broadly across the X-chromosome rather than at focal sites. We expect that the high resolution provided by the RAP method might be useful for the study of other lncRNAs, which might have more focal localization patterns.

RAP robustly captures a target RNA without the need for probe design optimization. A generalizable and scalable method should enable purification of any RNA with minimal optimization of the capture probeset. In particular, the probe design strategy should be robust to the many features of an endogenous RNA – including secondary structure, RNA-protein interactions, and potentially RNA-DNA interactions – that might obstruct probe hybridization and capture. One solution to this problem, used by CHART, involves mapping accessible regions of a lncRNA and designing a handful of probes targeting these accessible regions (33). However, this approach requires optimization for each lncRNA and does not ensure capture of the entire RNA sequence if the RNA is fragmented during the protocol. Another solution, used by ChIRP, is to intersperse probes along the transcript. While this approach avoids the need for optimization of accessibility, it could be sensitive to partial RNA degradation.

We designed RAP to robustly capture any target RNA without previous knowledge of the regions of accessibility. To do this, RAP utilizes pools of overlapping capture probes that are tiled across the entire lncRNA sequence (see Methods), enabling hybridization to any accessible region within the lncRNA. Because of this tiled approach, RAP does not require knowledge of which domain of the RNA interacts with chromatin and additionally is robust to partial RNA degradation. This approach allows RAP to be applied to any long RNA with minimal optimization: in this study we show that RAP enriches the Xist and Oct4 RNAs (Fig. S1), and we note that RAP has robustly purified >200 lncRNAs, ranging in length from ~400 nucleotides to ~17,000 nucleotides, (unpublished data).

Page 16: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

16

Note S2. The role of LINE1 repeats in Xist localization and spreading Previous studies have found that Xist can spread over and silence autosomal DNA in

the context of an X;autosome translocation or when Xist is expressed from a transgene on an autosome. However, Xist-mediated silencing on autosomes is variable and not as efficient as on the X-chromosome (43, 66), suggesting the existence of “booster” elements that promote Xist spreading specifically on the X-chromosome (67). In 1998, Mary Lyon proposed that the interspersed repetitive elements LINE1s (L1s) may represent these booster elements (68) based on the observation that mammalian X-chromosomes show ~2-fold enrichment for L1s compared to autosomes (68). Indeed, subsequent work showed that genes in L1-poor regions are more likely to escape Xist-mediated silencing (66) and that transcription of a subset of young L1s temporally correlates with Xist RNA coating (69). However, it still remains unclear whether L1s enhance Xist spreading.

We reasoned that the ability to look at Xist localization at high resolution might provide insight into this question. Specifically, if L1s play a role in facilitating Xist spreading, we might expect that Xist localization would be significantly enriched at L1 elements compared to other regions on the X-chromosome. To test this, we explored the correlation between Xist and L1 density across the X-chromosome. In both female MLFs and male ES cells after one hour of Xist induction, Xist localization negatively correlated with L1 density (1-Mb resolution, Pearson’s correlation = -0.34 in MLF and -0.17 in TetO Xist ES cells, Table S1). This relationship reflects the preferential localization of Xist to regions with high gene density since LINE elements are negatively correlated with gene density across the entire genome (70).

While the analysis above included all L1 subfamilies, different subtypes have been suggested to have different functional properties related to XCI (69). Indeed, we found several mammalian-specific L1 subfamilies that showed modest positive correlations with Xist localization (e.g., L1MB7, 1-Mb resolution, Pearson’s correlation = 0.38 in MLF and 0.27 in TetO Xist ES cells, Table S1). However, the relationship of Xist localization with these L1 subfamilies was not as significant as that with gene density or chromosome conformation. Furthermore, at higher resolution we did not observe focal enrichment of Xist over individual uniquely mappable L1s.

To ensure that this observation was not due to the difficulty in uniquely mapping individual L1s, we aligned all sequencing reads to the consensus sequences of L1 subfamilies. As a control, we also aligned all reads directly to the SINE subfamilies (see Methods). After correcting for the skewed representation of various subfamilies on the X-chromosome, we found that most L1 and SINE subfamilies were not enriched in Xist RAP compared to input, with the exception of several mammalian-specific L1 subfamilies (Figure S14). These mammalian-specific L1 subfamilies showed modest enrichment (less than 2-fold) in MLFs but did not show notable enrichment after one hour of induction in the male ES cell model.

Thus, while our data do not exclude the possibility for the involvement of L1s in the spreading of Xist across the X-chromosome, we do not find strong evidence to support it.

Page 17: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

17

Fig. S1. RAP specifically purifies lncRNA-chromatin interactions (A) A schematic of three potential sources of artifacts. RNA, DNA or proteins may interact nonspecifically with (1) the streptavidin bead surface (sphere), (2) the RNA probes (blue), or (3) the captured target RNA (red). (B) RT-qPCR for Xist, Oct4, and 18S ribosomal RNA after capture using probes antisense to Xist and Oct4 (samples) and probes sense to Xist, Oct4, and a randomized sequence (controls). RAP was performed in TetO-Xist male ES cells (pSM33) after 3 hours of induction with doxycycline. Error bars represent 95% confidence intervals for the average of three replicate experiments.

Page 18: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

18

Fig. S2. RAP enriched regions do not show sequence homology to probe sequences The Xist RNA sequence and probe sequence were aligned to the most highly enriched regions defined by RAP (see Methods). For each region, the percent identity of the best alignment between the sequence (probe or RNA) and the window (enriched or permuted) was computed. A quantile-quantile plot is shown for the scores of the best alignment between the Xist RNA sequence (top) or probe sequence (bottom) against Xist-enriched regions (y-axis) compared to randomly permuted regions (x-axis) in TetO-Xist male ES cells (pSM33) after 1 hour of Xist induction (left) or MLFs (right). (A) Xist-enriched regions and permuted regions defined across autosomes. (B) Xist-enriched regions and permuted regions defined across the X-chromosome.

Page 19: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

19

Fig. S3. Xist does not show reproducible ‘peaks’ at 25-bp resolution Fragment counts for Xist RAP and input DNA replicates in MLFs across a representative 42-Kb segment of the X-chromosome (25-bp resolution). Peaks in fragment counts are apparent in both the Xist RAP and input samples but are not the same between replicates, demonstrating that variation in Xist enrichment at this resolution results from sampling noise rather than biological variation.

Page 20: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

20

Fig. S4. Reproducibility of Xist localization at 100-Kb resolution RAP was performed in independent biological replicates for both MLF (left) and TetO-Xist male ES cells after one hour of induction (pSM33, right). Units represent log2 Xist enrichment in 100-Kb windows across the X-chromosome (red) and autosomes (gray). Xist localization is highly reproducible across biological replicates at this resolution.

Page 21: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

21

Fig. S5. Targeted replacement of the Xist promoter 1,027 nucleotides directly upstream of the Xist transcription start site were replaced with a Tetracycline responsive promoter and a loxP-flanked CMV-driven Hygromycin-Thymidine Kinase selection cassette in V6.5 male mouse ES cells that carry the M2rtTA in the Rosa26 locus. The selection cassette was removed from positively targeted clones by treatment with Cre Recombinase.

Page 22: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

22

Fig. S6. Characterization of TetO Xist male ES cells (pSM33 cell line) (A) RT-qPCR quantification of Xist and Oct4 RNA expression in pSM33 cells after dox induction. Fold-change is normalized to the qPCR measurement at zero hours. (B) Quantification of the Xist RNA signal at times after addition of dox (h=hour(s)). Figures 3B and S6J present corresponding representative FISH images. We categorized Xist RNA FISH patterns as pinpoints, spots, small clouds, and large clouds (see Methods).

Page 23: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

23

Representative images of (C) Ezh2 and (D) H3K27me3 enrichment on the X-chromosome and (E, F) RNA Pol II exclusion under the Xist RNA signal after Xist induction. PolII exclusion is discernible after one hour of induction under Xist RNA spots as shown in (E), but is more obvious at later time points under larger clouds as in (F). (G-I) Quantification of Ezh2 and H3K27me3 enrichment and PolII exclusion over the Xist compartment in Xist-RNA positive cells. For the quantification in (G-I) all cells displaying an Xist RNA signal were considered positive irrespective of the size of the signal. (J) Representative FISH images of an Xist RNA (red) pinpoint and spot in relation to the corresponding Tsix signal, which forms a single pinpoint in male ES cells. (K) Quantification of cells expressing the antisense Tsix RNA in the same cells as (B) shows silencing of Tsix over time. n=number of cells counted originating from at least 25 different colonies.

Page 24: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

24

Fig. S7. Xist localization across different cell types and integration sites Xist enrichments across the X-chromosome from all RAP sequencing experiments performed in this study. Light gray enrichments represent interpolations over unmappable regions. For the MLF experiment, Xist RAP represents the experiment with antisense probes, while Control RAP represents the experiment with sense probes. The Control

Page 25: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

25

RAP appears to have “spiky” regions of enrichment due to the very limited amount of DNA recovered from this experiment; because of the low total unique read count, any 100-Kb window with more than five reads appears “enriched” over input. For all other experiments, RAP was performed with antisense probes. For the TetO Xist washout experiment in pSM33 male ES cells, the NoDox time point (before induction with doxycycline) represents the same conditions as the TetO Xist time course 0hr time point; the 0hr Wash time point represents the same conditions as the TetO Xist time course 1hr time point; and the 1hr Wash time point represents cells that are induced for 1 hour with doxycycline, then washed and grown for 1 hour without doxycycline. Xist transgene tracks represent data for Xist wild-type and A-repeat deletion transgenes incorporated into the Hprt locus. Xist enrichments may extend beyond the y-axis maximum in the regions surrounding the Xist transcription locus. We note that in female ES cells, we observe Xist localization even at 0hr (before additional of retinoic acid), consistent with the presence of a pinpoint Xist FISH signal in the majority of cells (Fig. S8). We also observe a more focal Xist localization pattern in female ES cells compared to that observed in the TetO Xist system. This is consistent with our observations by FISH that in female ES cells Xist shows a more defined localization pattern in the nucleus (Fig. S8), perhaps due to more precise control of Xist transcription levels or dynamics during development (20, 79).

Page 26: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

26

Fig. S8. Xist RNA FISH patterns in F1 2-1 female ES cells during differentiation (A) Representative images of Xist and Tsix RNA FISH signals observed during the first 48 hours of differentiation categorized as described above (see Methods). (B) Quantification of the pattern of Xist in cells at indicated times of differentiation. Pinpoint and Spot signals were binned for quantification. For the differentiation time course, ES cells were plated on gelatinized glass coverslips for 12 hours in ES cell media with LIF without feeders (0hr) and subsequently induced to differentiate for the indicated times by the addition of 1uM retinoic acid and concurrent LIF withdrawal. Upon differentiation, the number of cells with small and large Xist clouds increased strongly. We also quantified the Xist pattern in undifferentiated ES cells grown on gelatinized coverslips for 6 hours in ideal ES cell growth conditions (in the presence of LIF and male feeders). We found that in the presence of feeders the pinpoint Xist signal appeared dimmer than the signal from cells in the absence of feeders. n = number of cells counted from at least 25 different colonies. (C) Quantification of mono- and biallelic Tsix RNA signals in ES cells during retinoic acid treatment. Tsix is expressed from both X-chromosomes in female ES cells, but is silenced during differentiation.

Page 27: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

27

Fig. S9. Early localization of Xist across different experiments Early sites defined in TetO-Xist male ES cells (pSM33) one hour after induction of Xist (top) are present after washout of doxycycline for one hour (middle), and are consistent with the early sites observed in female ES cells (bottom) after six hours of differentiation. In the top and bottom enrichment panels, Xist enrichment at the Xist locus extends above the y-axis maximum.

Page 28: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

28

Fig. S10. Early Xist localization correlates with chromosome conformation Contact frequencies (blue) represent normalized Hi-C interaction counts in undifferentiated male ES cells between distal windows on the X-chromosome and the window containing the Xist genomic locus. (A) Correlation with Xist RNA localization (red) after six hours of differentiation in female ES cells. Correlation calculations exclude the shaded gray region (10-Mb on each side). (B) Xist enrichment at and around the Xist locus decreases after removal of doxycycline for one hour. (C) Correlation with Xist RNA localization (red) after one hour of induction and one hour of removal of doxycycline in TetO Xist male ES cells (pSM33). (D) Overlay of early Xist enrichment for male ES cells expressing Xist from its endogenous locus (gray) or from the Hprt locus (black). These data correspond to the enrichments shown in Fig. 4B and Fig. 4C, respectively. Enrichments extend above the y-axis maximum.

Page 29: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

29

Fig. S11. Xist requires its silencing domain to access active gene-rich regions (A) TetO Xist enrichment after 3 hours of induction shows a region that is depleted compared to what we expect based on Hi-C proximity contacts (chrX:69,500,000-72,500,000, indicated with black arrow in Fig. 4B). (B) Xist enrichment after 6 hours of induction negatively correlates with gene expression in undifferentiated ES cells. Each gray point represents the average enrichment over one active gene, including introns, on the X-chromosome. Black line: linear regression. (C) Xist enrichment is averaged over all active genes (red lines) and inactive genes (blue lines) on the X-chromosome, extending 100 Kb upstream and downstream of the gene body. Black line represents Xist

Page 30: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

30

enrichment averaged over randomly permuted regions. Shaded regions represent 95% confidence intervals (CI) for the average enrichment. All panels compare the same sets of active and inactive genes defined in ES cells (see Fig. 5, Methods) to visualize the changing Xist enrichment over time. The TetO-Xist (pSM33) 3hr panel presents the same data as Fig. 5B. TSS: transcription start site. TES: transcription end site. (D) Comparison of Xist enrichments (10-Kb resolution) for TetO Xist male ES cells (Xist expressed from endogenous locus) and wt and Δ A Xist transgenes (Xist cDNA expressed from Hprt locus) after three hours of induction with doxycycline. Similar to Fig. 5D, two representative regions (chrX:4,000,000-14,000,000 and chrX:68,000,000-74,000,000) show that active gene-dense regions are depleted for Xist localization in Δ A versus wt Xist. TetO Xist (endogenous locus) and wt Xist (Hprt locus) have comparable patterns of enrichment over these regions. (E) Fold-change between Δ A and wt Xist enrichment averaged across all active genes (red line), inactive genes (blue line), and randomly permuted regions (black line) across the entire X-chromosome. Shaded regions represent 95% confidence intervals for the average enrichment. Enrichment ratios are normalized to a mean of one across the X-chromosome and are plotted on a log scale.

Page 31: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

31

Fig. S12. Characterization of male transgenic ES cells carrying the Xist cDNA These cells harbor a 15-Kb copy of the mouse Xist cDNA (with or without the A-repeat) under control of a dox-inducible promoter in the Hprt locus on the X chromosome (see Methods). (A) Confirmation by PCR of the presence and absence of the A-repeat in the two cell lines using genomic DNA. Diagram of the A-repeat (red) located within the dox-inducible Xist cDNA construct within the Hprt locus. Genotyping primers are indicated. The upstream primer is located within the TetO promoter, to prevent detection of the endogenous Xist gene. The corresponding EtBr stained gel is shown, confirming absence of the A-repeat in the Δ A Xist line. The smeared appearance of the bands results from the upstream primer binding to distinct repeats within the TetO promoter. (B) Quantification of the Xist RNA patterns at indicated times after induction with doxycycline for the wild-type (WT) and Δ A Xist cell lines. The pattern of Xist RNA accumulation over time was categorized as described above (see Methods). (C,D) Representative IF/RNA FISH images showing Xist RNA and H3K27me3 accumulation on the Xist-coated X-chromosome after 24 hours of induction with doxycycline. (E)

Page 32: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

32

Quantification of cells expressing Xist RNA after 24 hours of induction. (F) Quantification of cells positive for Xist RNA signal that also display a co-localizing accumulation of H3K27me3 with Xist RNA.

Page 33: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

33

Fig. S13. Read coverage variation due to mappability and DNase sensitivity

(A) Mappability, input read coverage (fragment counts), and density of DNase-hypersensitive (DHS) sites across the X-chromosome. Mappability and input coverage tracks represent the log2 ratio of mappable reads in each window compared to the average across the chromosome. Input coverage shows a ~4-fold dynamic range across the chromosome, and mappability shows a ~1.2-fold dynamic range. Regions of the chromosome with high density of DHS sites have high coverage in DNA sequencing libraries from both RAP and ChIP input samples; in contrast, regions with low density of DHS sites have low coverage in the input samples (gray boxes). (B, C) Read coverage normalized for mappability in 100-bp windows for the 4 Kb surrounding transcription start sites (TSSs) and transcription end sites (TESs) of genes on the X-chromosome. Lines show the average across all active genes (red) and inactive genes (blue). Shaded regions represent the 95% confidence interval (CI) for the mean. At this resolution, RAP input samples have lower coverage at the transcription start sites, while ChIP input samples have higher coverage at transcription start sites. Input samples come from MLF ChIP and RAP experiments.

Page 34: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

34

Fig. S14. Enrichment of repeat subfamilies in Xist RAP sequencing data Enrichment of reads mapping to the consensus sequences of L1 and SINE subfamilies in Xist RAP compared to input for (A) TetO Xist male ES cells after one hour of induction and (B) MLFs. Enrichments are corrected for the relative representation of each subfamily on the X-chromosome versus the rest of the genome (see Methods). The subfamilies with the highest enrichments are labeled.

Page 35: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

35

Table S1 (separate file) Correlations between Xist RAP enrichments and various genomic features (see Methods).

Table S2 (separate file) Genes contained in regions of low Xist enrichment in MLFs. RPKM: expression levels, where available, from previously published RNA-sequencing data (3). Novel: genes that are expressed in MLFs and do not lie within 300 Kb of a previously reported escape gene.

Table S3 (separate file) Genome annotation enrichments for early localization sites defined after one hour of induction in TetO Xist male ES cells (pSM33, see Methods).

Table S4 (separate file) Sequences for capture probe design and synthesis.

Table S5 (separate file) Primer sequences for qPCR, probe synthesis, and RNA sequencing.

Table S6 (separate file) DNA-sequencing statistics for all RAP experiments.

Page 36: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

References and Notes

1. T. Derrien, R. Johnson, G. Bussotti, A. Tanzer, S. Djebali, H. Tilgner, G. Guernec, D. Martin,

A. Merkel, D. G. Knowles, J. Lagarde, L. Veeravalli, X. Ruan, Y. Ruan, T. Lassmann, P.

Carninci, J. B. Brown, L. Lipovich, J. M. Gonzalez, M. Thomas, C. A. Davis, R.

Shiekhattar, T. R. Gingeras, T. J. Hubbard, C. Notredame, J. Harrow, R. Guigó, The

GENCODE v7 catalog of human long noncoding RNAs: Analysis of their gene structure,

evolution, and expression. Genome Res. 22, 1775–1789 (2012).

doi:10.1101/gr.132159.111 Medline

2. FANTOM Consortium, RIKEN Genome Exploration Research Group and Genome Science

Group (Genome Network Project Core Group), The transcriptional landscape of the

mammalian genome. Science 309, 1559–1563 (2005). doi:10.1126/science.1112014

3. M. Guttman, M. Garber, J. Z. Levin, J. Donaghey, J. Robinson, X. Adiconis, L. Fan, M. J.

Koziol, A. Gnirke, C. Nusbaum, J. L. Rinn, E. S. Lander, A. Regev, Ab initio

reconstruction of cell type-specific transcriptomes in mouse reveals the conserved multi-

exonic structure of lincRNAs. Nat. Biotechnol. 28, 503–510 (2010).

doi:10.1038/nbt.1633 Medline

4. M. Guttman, I. Amit, M. Garber, C. French, M. F. Lin, D. Feldser, M. Huarte, O. Zuk, B. W.

Carey, J. P. Cassady, M. N. Cabili, R. Jaenisch, T. S. Mikkelsen, T. Jacks, N. Hacohen,

B. E. Bernstein, M. Kellis, A. Regev, J. L. Rinn, E. S. Lander, Chromatin signature

reveals over a thousand highly conserved large non-coding RNAs in mammals. Nature

458, 223–227 (2009). doi:10.1038/nature07672 Medline

5. M. N. Cabili, C. Trapnell, L. Goff, M. Koziol, B. Tazon-Vega, A. Regev, J. L. Rinn,

Integrative annotation of human large intergenic noncoding RNAs reveals global

properties and specific subclasses. Genes Dev. 25, 1915–1927 (2011).

doi:10.1101/gad.17446611 Medline

6. M. Guttman, J. Donaghey, B. W. Carey, M. Garber, J. K. Grenier, G. Munson, G. Young, A.

B. Lucas, R. Ach, L. Bruhn, X. Yang, I. Amit, A. Meissner, A. Regev, J. L. Rinn, D. E.

Root, E. S. Lander, lincRNAs act in the circuitry controlling pluripotency and

differentiation. Nature 477, 295–300 (2011). doi:10.1038/nature10398 Medline

Page 37: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

2

7. I. Ulitsky, A. Shkumatava, C. H. Jan, H. Sive, D. P. Bartel, Conserved function of lincRNAs

in vertebrate embryonic development despite rapid sequence evolution. Cell 147, 1537–

1550 (2011). doi:10.1016/j.cell.2011.11.055 Medline

8. U. A. Ørom, T. Derrien, M. Beringer, K. Gumireddy, A. Gardini, G. Bussotti, F. Lai, M.

Zytnicki, C. Notredame, Q. Huang, R. Guigo, R. Shiekhattar, Long noncoding RNAs

with enhancer-like function in human cells. Cell 143, 46–58 (2010).

doi:10.1016/j.cell.2010.09.001 Medline

9. K. C. Wang, Y. W. Yang, B. Liu, A. Sanyal, R. Corces-Zimmerman, Y. Chen, B. R. Lajoie,

A. Protacio, R. A. Flynn, R. A. Gupta, J. Wysocka, M. Lei, J. Dekker, J. A. Helms, H. Y.

Chang, A long noncoding RNA maintains active chromatin to coordinate homeotic gene

expression. Nature 472, 120–124 (2011). doi:10.1038/nature09819 Medline

10. J. L. Rinn, M. Kertesz, J. K. Wang, S. L. Squazzo, X. Xu, S. A. Brugmann, L. H.

Goodnough, J. A. Helms, P. J. Farnham, E. Segal, H. Y. Chang, Functional demarcation

of active and silent chromatin domains in human HOX loci by noncoding RNAs. Cell

129, 1311–1323 (2007). doi:10.1016/j.cell.2007.05.022 Medline

11. J. T. Lee, Epigenetic regulation by long noncoding RNAs. Science 338, 1435–1439 (2012).

doi:10.1126/science.1231776

12. M. C. Tsai, O. Manor, Y. Wan, N. Mosammaparast, J. K. Wang, F. Lan, Y. Shi, E. Segal, H.

Y. Chang, Long noncoding RNA as modular scaffold of histone modification complexes.

Science 329, 689–693 (2010). doi:10.1126/science.1192002

13. T. Nagano, J. A. Mitchell, L. A. Sanz, F. M. Pauler, A. C. Ferguson-Smith, R. Feil, P. Fraser,

The Air noncoding RNA epigenetically silences transcription by targeting G9a to

chromatin. Science 322, 1717–1720 (2008). doi:10.1126/science.1163802

14. J. Zhao, B. K. Sun, J. A. Erwin, J. J. Song, J. T. Lee, Polycomb proteins targeted by a short

repeat RNA to the mouse X chromosome. Science 322, 750–756 (2008).

doi:10.1126/science.1163045

15. M. Guttman, J. L. Rinn, Modular regulatory principles of large non-coding RNAs. Nature

482, 339–346 (2012). doi:10.1038/nature10887 Medline

Page 38: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

3

16. J. L. Rinn, H. Y. Chang, Genome regulation by long noncoding RNAs. Annu. Rev. Biochem.

81, 145–166 (2012). doi:10.1146/annurev-biochem-051410-092902 Medline

17. A. Kanhere, R. G. Jenner, Noncoding RNA localisation mechanisms in chromatin regulation.

Silence 3, 2 (2012). doi:10.1186/1758-907X-3-2 Medline

18. C. Gontan, I. Jonkers, J. Gribnau, Long noncoding RNAs and X chromosome inactivation.

Prog. Mol. Subcell. Biol. 51, 43–64 (2011). doi:10.1007/978-3-642-16502-3_3 Medline

19. D. Umlauf, P. Fraser, T. Nagano, The role of long non-coding RNAs in chromatin structure

and gene regulation: Variations on a theme. Biol. Chem. 389, 323–331 (2008).

doi:10.1515/BC.2008.047 Medline

20. K. Plath, S. Mlynarczyk-Evans, D. A. Nusinow, B. Panning, Xist RNA and the mechanism

of X chromosome inactivation. Annu. Rev. Genet. 36, 233–278 (2002).

doi:10.1146/annurev.genet.36.042902.092433 Medline

21. P. Avner, E. Heard, X-chromosome inactivation: Counting, choice and initiation. Nat. Rev.

Genet. 2, 59–67 (2001). doi:10.1038/35047580 Medline

22. K. Plath, J. Fang, S. K. Mlynarczyk-Evans, R. Cao, K. A. Worringer, H. Wang, C. C. de la

Cruz, A. P. Otte, B. Panning, Y. Zhang, Role of histone H3 lysine 27 methylation in X

inactivation. Science 300, 131–135 (2003). doi:10.1126/science.1084274

23. J. Silva, W. Mak, I. Zvetkova, R. Appanah, T. B. Nesterova, Z. Webster, A. H. Peters, T.

Jenuwein, A. P. Otte, N. Brockdorff, Establishment of histone h3 methylation on the

inactive X chromosome requires transient recruitment of Eed-Enx1 polycomb group

complexes. Dev. Cell 4, 481–495 (2003). doi:10.1016/S1534-5807(03)00068-6 Medline

24. C. M. Clemson, J. A. McNeil, H. F. Willard, J. B. Lawrence, XIST RNA paints the inactive

X chromosome at interphase: Evidence for a novel RNA involved in

nuclear/chromosome structure. J. Cell Biol. 132, 259–275 (1996).

doi:10.1083/jcb.132.3.259 Medline

25. J. Chaumeil, P. Le Baccon, A. Wutz, E. Heard, A novel role for Xist RNA in the formation

of a repressive nuclear compartment into which genes are recruited when silenced. Genes

Dev. 20, 2223–2237 (2006). doi:10.1101/gad.380906 Medline

Page 39: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

4

26. A. Wutz, T. P. Rasmussen, R. Jaenisch, Chromosomal silencing and localization are

mediated by different domains of Xist RNA. Nat. Genet. 30, 167–174 (2002).

doi:10.1038/ng820 Medline

27. C. E. Senner, T. B. Nesterova, S. Norton, H. Dewchand, J. Godwin, W. Mak, N. Brockdorff,

Disruption of a conserved region of Xist exon 1 impairs Xist RNA localisation and X-

linked gene silencing during random and imprinted X chromosome inactivation.

Development 138, 1541–1550 (2011). doi:10.1242/dev.056812 Medline

28. A. Beletskii, Y. K. Hong, J. Pehrson, M. Egholm, W. M. Strauss, PNA interference mapping

demonstrates functional domains in the noncoding RNA Xist. Proc. Natl. Acad. Sci.

U.S.A. 98, 9215–9220 (2001). doi:10.1073/pnas.161173098 Medline

29. Y. Hasegawa, N. Brockdorff, S. Kawano, K. Tsutui, K. Tsutui, S. Nakagawa, The matrix

protein hnRNP U is required for chromosomal localization of Xist RNA. Dev. Cell 19,

469–476 (2010). doi:10.1016/j.devcel.2010.08.006 Medline

30. D. Pullirsch, R. Härtel, H. Kishimoto, M. Leeb, G. Steiner, A. Wutz, The Trithorax group

protein Ash2l and Saf-A are recruited to the inactive X chromosome at the onset of stable

X inactivation. Development 137, 935–943 (2010). doi:10.1242/dev.035956 Medline

31. R. Agrelo, A. Souabni, M. Novatchkova, C. Haslinger, M. Leeb, V. Komnenovic, H.

Kishimoto, L. Gresh, T. Kohwi-Shigematsu, L. Kenner, A. Wutz, SATB1 defines the

developmental context for gene silencing by Xist in lymphoma and embryonic cells. Dev.

Cell 16, 507–516 (2009). doi:10.1016/j.devcel.2009.03.006 Medline

32. C. Chu, K. Qu, F. L. Zhong, S. E. Artandi, H. Y. Chang, Genomic maps of long noncoding

RNA occupancy reveal principles of RNA-chromatin interactions. Mol. Cell 44, 667–678

(2011). doi:10.1016/j.molcel.2011.08.027 Medline

33. M. D. Simon, C. I. Wang, P. V. Kharchenko, J. A. West, B. A. Chapman, A. A.

Alekseyenko, M. L. Borowsky, M. I. Kuroda, R. E. Kingston, The genomic binding sites

of a noncoding RNA. Proc. Natl. Acad. Sci. U.S.A. 108, 20497–20502 (2011).

doi:10.1073/pnas.1113536108 Medline

34. P. D. Mariner, R. D. Walters, C. A. Espinoza, L. F. Drullinger, S. D. Wagner, J. F. Kugel, J.

A. Goodrich, Human Alu RNA is a modular transacting repressor of mRNA transcription

Page 40: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

5

during heat shock. Mol. Cell 29, 499–509 (2008). doi:10.1016/j.molcel.2007.12.013

Medline

35. See supplementary materials on Science Online.

36. J. B. Berletch, F. Yang, J. Xu, L. Carrel, C. M. Disteche, Genes that escape from X

inactivation. Hum. Genet. 130, 237–245 (2011). doi:10.1007/s00439-011-1011-z Medline

37. S. Dietzel, K. Schiebel, G. Little, P. Edelmann, G. A. Rappold, R. Eils, C. Cremer, T.

Cremer, The 3D positioning of ANT2 and ANT3 genes within female X chromosome

territories correlates with gene activity. Exp. Cell Res. 252, 363–375 (1999).

doi:10.1006/excr.1999.4635 Medline

38. G. N. Filippova, M. K. Cheng, J. M. Moore, J.-P. Truong, Y. J. Hu, D. K. Nguyen, K. D.

Tsuchiya, C. M. Disteche, Boundaries between chromosomal domains of X inactivation

and escape bind CTCF and lack CpG methylation during early development. Dev. Cell 8,

31–42 (2005). doi:10.1016/j.devcel.2004.10.018 Medline

39. F. Yang, T. Babak, J. Shendure, C. M. Disteche, Global survey of escape from X inactivation

by RNA-sequencing in mouse. Genome Res. 20, 614–622 (2010).

doi:10.1101/gr.103200.109 Medline

40. D. Tian, S. Sun, J. T. Lee, The long noncoding RNA, Jpx, is a molecular switch for X

chromosome inactivation. Cell 143, 390–403 (2010). doi:10.1016/j.cell.2010.09.049

Medline

41. C. Chureau, S. Chantalat, A. Romito, A. Galvani, L. Duret, P. Avner, C. Rougeulle, Ftx is a

non-coding RNA which affects Xist expression and chromatin structure within the X-

inactivation center region. Hum. Mol. Genet. 20, 705–718 (2011).

doi:10.1093/hmg/ddq516 Medline

42. R. Song, S. Ro, J. D. Michaels, C. Park, J. R. McCarrey, W. Yan, Many X-linked

microRNAs escape meiotic sex chromosome inactivation. Nat. Genet. 41, 488–493

(2009). doi:10.1038/ng.338 Medline

43. S. M. Duthie, T. B. Nesterova, E. J. Formstone, A. M. Keohane, B. M. Turner, S. M. Zakian,

N. Brockdorff, Xist RNA exhibits a banded localization on the inactive X chromosome

Page 41: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

6

and is excluded from autosomal material in cis. Hum. Mol. Genet. 8, 195–204 (1999).

doi:10.1093/hmg/8.2.195 Medline

44. A. Tattermusch, N. Brockdorff, A scaffold for X chromosome inactivation. Hum. Genet. 130,

247–253 (2011). doi:10.1007/s00439-011-1027-4 Medline

45. S. Arthold, A. Kurowski, A. Wutz, Mechanistic insights into chromosome-wide silencing in

X inactivation. Hum. Genet. 130, 295–305 (2011). doi:10.1007/s00439-011-1002-0

Medline

46. D. H. Kim, Y. Jeon, M. C. Anguera, J. T. Lee, X-chromosome epigenetic reprogramming in

pluripotent stem cells via noncoding genes. Semin. Cell Dev. Biol. 22, 336–342 (2011).

doi:10.1016/j.semcdb.2011.02.025 Medline

47. J. T. Lee, N. Lu, Targeted mutagenesis of Tsix leads to nonrandom X inactivation. Cell 99,

47–57 (1999). doi:10.1016/S0092-8674(00)80061-6 Medline

48. S. F. Pinter, R. I. Sadreyev, E. Yildirim, Y. Jeon, T. K. Ohsumi, M. Borowsky, J. T. Lee,

Spreading of X chromosome inactivation via a hierarchy of defined Polycomb stations.

Genome Res. 22, 1864–1876 (2012). doi:10.1101/gr.133751.111 Medline

49. H. Marks, J. C. Chow, S. Denissov, K. J. Françoijs, N. Brockdorff, E. Heard, H. G.

Stunnenberg, High-resolution analysis of epigenetic changes associated with X

inactivation. Genome Res. 19, 1361–1373 (2009). doi:10.1101/gr.092643.109 Medline

50. Y. Jeon, J. T. Lee, YY1 tethers Xist RNA to the inactive X nucleation center. Cell 146, 119–

133 (2011). doi:10.1016/j.cell.2011.06.026 Medline

51. E. Heard, C. Rougeulle, D. Arnaud, P. Avner, C. D. Allis, D. L. Spector, Methylation of

histone H3 at Lys-9 is an early mark on the X chromosome during X inactivation. Cell

107, 727–738 (2001). doi:10.1016/S0092-8674(01)00598-0 Medline

52. J. R. Dixon, S. Selvaraj, F. Yue, A. Kim, Y. Li, Y. Shen, M. Hu, J. S. Liu, B. Ren,

Topological domains in mammalian genomes identified by analysis of chromatin

interactions. Nature 485, 376–380 (2012). doi:10.1038/nature11082 Medline

53. E. Lieberman-Aiden, N. L. van Berkum, L. Williams, M. Imakaev, T. Ragoczy, A. Telling, I.

Amit, B. R. Lajoie, P. J. Sabo, M. O. Dorschner, R. Sandstrom, B. Bernstein, M. A.

Page 42: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

7

Bender, M. Groudine, A. Gnirke, J. Stamatoyannopoulos, L. A. Mirny, E. S. Lander, J.

Dekker, Comprehensive mapping of long-range interactions reveals folding principles of

the human genome. Science 326, 289–293 (2009). doi:10.1126/science.1181369 Medline

54. J. Dekker, M. A. Marti-Renom, L. A. Mirny, Exploring the three-dimensional organization of

genomes: Interpreting chromatin interaction data. Nat. Rev. Genet. 14, 390–403 (2013).

doi:10.1038/nrg3454 Medline

55. T. Misteli, The concept of self-organization in cellular architecture. J. Cell Biol. 155, 181–

185 (2001). doi:10.1083/jcb.200108110 Medline

56. R. S. Nozawa, K. Nagao, K. T. Igami, S. Shibata, N. Shirai, N. Nozaki, T. Sado, H. Kimura,

C. Obuse, Human inactive X chromosome is compacted through a PRC2-independent

SMCHD1-HBiX1 pathway. Nat. Struct. Mol. Biol. 20, 566–573 (2013).

doi:10.1038/nsmb.2532 Medline

57. A. Rego, P. B. Sinclair, W. Tao, I. Kireev, A. S. Belmont, The facultative heterochromatin of

the inactive X chromosome has a distinctive condensed ultrastructure. J. Cell Sci. 121,

1119–1127 (2008). doi:10.1242/jcs.026104 Medline

58. C. Naughton, D. Sproul, C. Hamilton, N. Gilbert, Analysis of active and inactive X

chromosome architecture reveals the independent organization of 30 nm and large-scale

chromatin structures. Mol. Cell 40, 397–409 (2010). doi:10.1016/j.molcel.2010.10.013

Medline

59. E. Splinter, E. de Wit, E. P. Nora, P. Klous, H. J. van de Werken, Y. Zhu, L. J. Kaaij, W. van

Ijcken, J. Gribnau, E. Heard, W. de Laat, The inactive X chromosome adopts a unique

three-dimensional conformation that is dependent on Xist RNA. Genes Dev. 25, 1371–

1383 (2011). doi:10.1101/gad.633311 Medline

60. P. G. Maass, A. Rump, H. Schulz, S. Stricker, L. Schulze, K. Platzer, A. Aydin, S. Tinschert,

M. B. Goldring, F. C. Luft, S. Bähring, A misplaced lncRNA causes brachydactyly in

humans. J. Clin. Invest. 122, 3990–4002 (2012). doi:10.1172/JCI65508 Medline

61. A. Williams, C. G. Spilianakis, R. A. Flavell, Interchromosomal association and gene

regulation in trans. Trends Genet. 26, 188–197 (2010). doi:10.1016/j.tig.2010.01.007

Medline

Page 43: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

8

62. C. G. Spilianakis, M. D. Lalioti, T. Town, G. R. Lee, R. A. Flavell, Interchromosomal

associations between alternatively expressed loci. Nature 435, 637–645 (2005).

doi:10.1038/nature03574 Medline

63. J. T. Lee, Lessons from X-chromosome inactivation: Long ncRNA as guides and tethers to

the epigenome. Genes Dev. 23, 1831–1842 (2009). doi:10.1101/gad.1811209 Medline

64. J. Thompson, D. Gillespie, Molecular hybridization with RNA probes in concentrated

solutions of guanidine thiocyanate. Anal. Biochem. 163, 281–291 (1987).

doi:10.1016/0003-2697(87)90225-9 Medline

65. S. Mili, J. A. Steitz, Evidence for reassociation of RNA-binding proteins after cell lysis:

Implications for the interpretation of immunoprecipitation analyses. RNA 10, 1692–1694

(2004). doi:10.1261/rna.7151404 Medline

66. Y. A. Tang, D. Huntley, G. Montana, A. Cerase, T. B. Nesterova, N. Brockdorff, Efficiency

of Xist-mediated silencing on autosomes is linked to chromosomal domain organisation.

Epigenetics Chromatin 3, 10 (2010). doi:10.1186/1756-8935-3-10 Medline

67. S. M. Gartler, A. D. Riggs, Mammalian X-chromosome inactivation. Annu. Rev. Genet. 17,

155–190 (1983). doi:10.1146/annurev.ge.17.120183.001103 Medline

68. M. F. Lyon, X-chromosome inactivation: A repeat hypothesis. Cytogenet. Cell Genet. 80,

133–137 (1998). doi:10.1159/000014969 Medline

69. J. C. Chow, C. Ciaudo, M. J. Fazzari, N. Mise, N. Servant, J. L. Glass, M. Attreed, P. Avner,

A. Wutz, E. Barillot, J. M. Greally, O. Voinnet, E. Heard, LINE-1 activity in facultative

heterochromatin formation during X chromosome inactivation. Cell 141, 956–969

(2010). doi:10.1016/j.cell.2010.04.042 Medline

70. R. Versteeg, B. D. van Schaik, M. F. van Batenburg, M. Roos, R. Monajemi, H. Caron, H. J.

Bussemaker, A. H. van Kampen, The human transcriptome map reveals extremes in gene

density, intron length, GC content, and repeat pattern for domains of highly and weakly

expressed genes. Genome Res. 13, 1998–2004 (2003). doi:10.1101/gr.1649303 Medline

71. B. Panning, J. Dausman, R. Jaenisch, X chromosome inactivation is mediated by Xist RNA

stabilization. Cell 90, 907–916 (1997). doi:10.1016/S0092-8674(00)80355-4 Medline

Page 44: Supplementary Materials for...(M2rtTA) integrated into the Rosa26 locus. For the A-repeat deletion, 957 nucleotides between SacII to XhoI in . Xist . Exon 1 were deleted (Fig. S12).

9

72. E. M. LeProust, B. J. Peck, K. Spirin, H. B. McCuen, B. Moore, E. Namsaraev, M. H.

Caruthers, Synthesis of high-quality libraries of long (150mer) oligonucleotides by a

novel depurination controlled process. Nucleic Acids Res. 38, 2522–2540 (2010).

doi:10.1093/nar/gkq163 Medline

73. H. Li, R. Durbin, Fast and accurate short read alignment with Burrows-Wheeler transform.

Bioinformatics 25, 1754–1760 (2009). doi:10.1093/bioinformatics/btp324 Medline

74. P. Machanick, T. L. Bailey, MEME-ChIP: Motif analysis of large DNA datasets.

Bioinformatics 27, 1696–1697 (2011). doi:10.1093/bioinformatics/btr189 Medline

75. E. M. Mendenhall, R. P. Koche, T. Truong, V. W. Zhou, B. Issac, A. S. Chi, M. Ku, B. E.

Bernstein, GC-rich sequence elements recruit PRC2 in mammalian ES cells. PLoS Genet.

6, e1001244 (2010). doi:10.1371/journal.pgen.1001244 Medline

76. ENCODE Project Consortium, A user’s guide to the encyclopedia of DNA elements

(ENCODE). PLoS Biol. 9, e1001046 (2011). doi:10.1371/journal.pbio.1001046 Medline

77. J. Jurka, V. V. Kapitonov, A. Pavlicek, P. Klonowski, O. Kohany, J. Walichiewicz, Repbase

Update, a database of eukaryotic repetitive elements. Cytogenet. Genome Res. 110, 462–

467 (2005). doi:10.1159/000084979 Medline

78. R. Helaers, E. Bareke, B. De Meulder, M. Pierre, S. Depiereux, N. Habra, E. Depiereux,

gViz, a novel tool for the visualization of co-expression networks. BMC Res. Notes 4,

452 (2011). doi:10.1186/1756-0500-4-452 Medline

79. D. B. Pontier, J. Gribnau, Xist regulation and function explored. Hum. Genet. 130, 223–236

(2011). doi:10.1007/s00439-011-1008-7 Medline