Supplemental Appendix RNA sequencing


Supplemental Appendix

Supplemental Methods

RNA sequencing

Frozen Neurospora tissue samples were ground to a fine powder in liquid nitrogen in a prechilled mortar. The tissue powder was resuspended in TRIzol reagent (Ambion) and mixed with 1/5 volume of chloroform. After centrifugation, the upper (aqueous) layer was collected and mixed with 2.5 volumes of ethanol and 0.3 M sodium acetate (pH 5.2). The mixtures were kept at -80 °C overnight, and the RNA was then pelleted by centrifugation. RNA pellets were dissolved in DEPC-treated water and precipitated a second time with lithium chloride (3.75 M final concentration) at -20 °C for 2 h. After centrifugation, the resulting RNA pellets were washed with 70% ethanol and dissolved in DEPC-treated water. The RNA samples were then treated with Turbo DNase (Ambion) and extracted with Acid Phenol:Chloroform (Ambion). Finally, the RNA samples were sent to the Joint Genome Institute (JGI) for library construction and sequencing. Wild-type samples were included in each batch of samples as controls. The total RNA samples of mutant strains and wild-type controls from the screening process were sequenced at JGI with the following procedure. Plate-based RNA sample preparation was performed on the PerkinElmer Sciclone NGS robotic liquid handling system using Illumina's TruSeq Stranded mRNA HT sample prep kit with poly-A selection of mRNA, following the protocol outlined by Illumina in their user guide, with the following conditions: total RNA starting material was 1 µg per sample and 8 cycles of PCR were used for library amplification. The prepared libraries were quantified using KAPA Biosystems' next-generation sequencing library qPCR kit and run on a Roche LightCycler 480 real-time PCR instrument. Sequencing of the flow cell was performed on the Illumina NovaSeq sequencer using NovaSeq XP V1 reagent kits and an S4 flow cell, following a 2x150 indexed run recipe.

Nuclear RNA and the corresponding total RNA from the same tissue samples were also treated with DNase before library construction. After quality control on a Bioanalyzer, total RNA samples were used to make poly(A) RNA sequencing libraries for Illumina sequencing (NEBNext® Ultra™ II Directional RNA Library Prep Kit), whereas nuclear RNA samples were used directly for library construction without poly(A) purification so that nascent RNAs could be sequenced. These libraries were then sequenced on a NextSeq 500 at the UT Southwestern Next Generation Sequencing Core.

Nuclear RNA extraction

For nuclear RNA extraction, 3.5 to 4 g of frozen tissue was ground into powder with 1 g of glass beads in liquid nitrogen as described previously (1). Then 8 ml of cold Buffer A (1 M sorbitol, 7% Ficoll, 20% glycerol, 5 mM Mg(OAc)2, 3 mM CaCl2, 50 mM Tris·HCl, pH 7.5, 3 mM DTT) was added to the sample, which was incubated on ice for 10 min with stirring. The samples were then filtered through two layers of cheesecloth, and the volume was brought to 8 ml with Buffer A. Subsequently, 16 ml of Buffer B (10% glycerol, 5 mM Mg(OAc)2, 25 mM Tris·HCl, pH 7.5) was added slowly with continuous mixing. The sample was then layered onto 10 ml of cold Buffer A/B (2.5:4, v/v) and centrifuged at 3,000 × g at 4 °C for 7 min to pellet cell debris. A 1-ml aliquot of the supernatant was kept as the total fraction, and the rest was layered onto 5 ml of cold Buffer D (1 M sucrose, 10% glycerol, 5 mM Mg(OAc)2, 25 mM Tris·HCl, pH 7.5, 1 mM DTT). Nuclei were pelleted by centrifugation at 9,400 × g at 4 °C for 15 min. A 1-ml aliquot of the supernatant was kept as the cytosolic fraction, and part of the pellet was collected in SDS buffer for western blot analysis. Nuclear RNA was extracted from the rest of the pellet using TRIzol reagent. To compare the nuclear RNA-seq with total RNA-seq, part of each sample was also used for total RNA extraction.
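As a worked example of the 2.5:4 (v/v) Buffer A/B cushion described above, the following minimal sketch computes the volume of each buffer needed for a 10-ml cushion. The function name `layer_volumes` is hypothetical and introduced only for illustration; the ratio and total volume are taken from the text.

```python
def layer_volumes(total_ml, parts_a=2.5, parts_b=4.0):
    """Split a total volume into Buffer A and Buffer B at a parts_a:parts_b (v/v) ratio."""
    total_parts = parts_a + parts_b
    vol_a = total_ml * parts_a / total_parts
    vol_b = total_ml * parts_b / total_parts
    return vol_a, vol_b


# 10 ml of Buffer A/B at 2.5:4 (v/v), as used for the first cushion.
vol_a, vol_b = layer_volumes(10.0)
print(f"Buffer A: {vol_a:.2f} ml, Buffer B: {vol_b:.2f} ml")
# prints "Buffer A: 3.85 ml, Buffer B: 6.15 ml"
```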

RNA-seq data analyses

For mRNA-seq data generated at JGI, the following analyses were performed. Raw FASTQ reads were filtered and trimmed using BBDuk (https://sourceforge.net/projects/bbmap/) to remove artifact sequences by k-mer matching (kmer=25), RNA spike-in reads, PhiX reads and reads containing any Ns. Quality trimming was performed using the Phred trimming method set at Q6. Following trimming, reads under the length threshold were removed (minimum length 25 bases or 1/3 of the original read length, whichever is longer). Filtered reads from each library were aligned to the reference genome using HISAT2 version 2.1.0 (2). Only primary hits assigned to the reverse strand were included in the raw gene counts. Features assigned to the forward strand were also tabulated. Strandedness of each library was estimated by calculating the percentage of reverse-assigned fragments among the total assigned fragments (reverse plus forward hits). Raw gene counts were used to evaluate the level of correlation between biological replicates using Pearson's correlation and to determine which replicates would be used in the differential gene expression analysis. DESeq2 version 1.18.1 (3) was subsequently used to determine which genes were differentially expressed between pairs of conditions. Statistical significance of gene expression differences was defined as a p-value < 0.05.
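The length threshold and the strandedness estimate described above reduce to simple arithmetic. The sketch below illustrates both rules under stated assumptions: the function names are hypothetical, the fragment counts are made up for illustration, and this is not the JGI pipeline code (the actual filtering was done by BBDuk).

```python
def min_read_length(original_length, floor=25):
    """Length threshold: 25 bases or 1/3 of the original read length, whichever is longer."""
    return max(floor, original_length / 3)


def passes_length_filter(trimmed_length, original_length):
    """Reads shorter than the threshold after trimming are removed."""
    return trimmed_length >= min_read_length(original_length)


def strandedness(reverse_fragments, forward_fragments):
    """Percentage of reverse-assigned fragments among all assigned fragments."""
    total = reverse_fragments + forward_fragments
    if total == 0:
        raise ValueError("no assigned fragments")
    return 100.0 * reverse_fragments / total


# For the 2x150 run recipe, 1/3 of the read length (50 bases) exceeds the 25-base floor.
threshold_150 = min_read_length(150)
keep_49 = passes_length_filter(49, 150)
keep_50 = passes_length_filter(50, 150)

# Hypothetical counts, for illustration only.
pct_reverse = strandedness(reverse_fragments=9800, forward_fragments=200)
print(threshold_150, keep_49, keep_50, pct_reverse)
```

For short reads the floor dominates instead: a 60-base read gives 1/3 × 60 = 20 bases, so the 25-base minimum applies.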

To determine the correlation between mRNA levels and CBI, a set of 4,136 genes with moderate to high expression levels was used. Pearson correlation coefficients and the corresponding p values were calculated in R (version 3.6.1). Student's t-test was used to compare the Pearson correlations of mutant strains with those of the corresponding wild-type strains, with 3 replicates for each strain. To identify candidate strains with significantly lower correlation coefficients than observed for the wild-type strain, the p values obtained from the previous step were adjusted, and strains were selected at a 2% false discovery rate. The differentially expressed genes in mutant strains were identified based on three criteria: 1) FPKM > 1, 2) mRNA level fold change > 2 compared to the wild-type strain and 3) a statistically significant difference in mRNA level between the mutant and wild-type strains (p < 0.05). The rest of the genes with FPKM > 1 were classified as unchanged genes. The CBI values for up-regulated, down-regulated and unchanged genes were used to generate violin plots in R (version 3.6.1) with the ggplot2 package. Statistical significance in the violin plots was determined by Student's t-test.
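The two core computations of this paragraph, the Pearson correlation between CBI and mRNA level and the three-criteria gene classification, can be sketched as follows. The original analysis was done in R; this Python version is only an illustration. Two assumptions are not stated in the text: the FPKM > 1 filter is applied here to both the mutant and the wild-type value, and a fold change > 2 in either direction (up or down) counts as differential. All function names and input values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def classify_gene(fpkm_mutant, fpkm_wt, p_value,
                  min_fpkm=1.0, min_fold=2.0, alpha=0.05):
    """Label a gene as 'up', 'down', 'unchanged' or 'excluded'.

    Assumption: the FPKM > 1 criterion must hold in both strains.
    """
    if fpkm_mutant <= min_fpkm or fpkm_wt <= min_fpkm:
        return "excluded"
    fold = fpkm_mutant / fpkm_wt
    if p_value < alpha and fold > min_fold:
        return "up"
    if p_value < alpha and fold < 1.0 / min_fold:
        return "down"
    return "unchanged"


# Hypothetical values for illustration only (not data from this study).
cbi = [0.10, 0.25, 0.40, 0.55]
log2_fpkm = [1.0, 2.5, 4.0, 5.5]
r = pearson_r(cbi, log2_fpkm)

labels = [
    classify_gene(10.0, 4.0, 0.01),   # 2.5-fold higher, significant
    classify_gene(3.0, 9.0, 0.01),    # 3-fold lower, significant
    classify_gene(10.0, 4.0, 0.20),   # not significant
    classify_gene(0.5, 4.0, 0.01),    # fails the FPKM filter
]
print(r, labels)
```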

For nuclear RNA sequencing, raw reads were first aligned to the Neurospora crassa genome using Bowtie2 version 2.3.5 (4) with the “--local” parameter to align both mRNA and pre-mRNA fragments. Raw counts for each gene were then obtained with HTSeq version 0.11.2 (5) and used as the input for DESeq2 version 1.24.0 (3) for differential expression analyses. For total RNA sequencing performed together with nuclear RNA sequencing at the UT Southwestern Next Generation Sequencing Core, STAR version 2.7.2b (6) was used instead of Bowtie2 as a splicing-sensitive aligner. Nuclear RNA sequencing for each strain was repeated 4 times, with 2 replicates accompanied by total RNA sequencing. The comparison of Pearson correlations in Figure 1A used only the 2 replicates with both nuclear and total RNA sequencing data. Because of the relatively low sequencing depth, differentially expressed genes in nuclear RNA-seq were selected based on adjusted p values (less than 0.1) without a fold change cutoff.
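To illustrate selection by adjusted p value alone, the sketch below implements the Benjamini-Hochberg adjustment, which is the default adjustment method in DESeq2, and then applies the 0.1 cutoff with no fold change filter. It covers only the adjustment step and does not reproduce DESeq2's independent filtering or its statistical model. The gene identifiers and p values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical genes and raw p values, for illustration only.
genes = ["gene_a", "gene_b", "gene_c", "gene_d"]
raw_p = [0.01, 0.04, 0.03, 0.20]

padj = benjamini_hochberg(raw_p)
# Select on adjusted p value only (< 0.1), without a fold change cutoff.
selected = [g for g, q in zip(genes, padj) if q < 0.1]
print(selected)
```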

References:

1. C. Luo, J. J. Loros, J. C. Dunlap, Nuclear localization is required for function of the essential clock protein FREQUENCY. EMBO J. 17, 1228-1235 (1998).
2. D. Kim, B. Langmead, S. L. Salzberg, HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12, 357-360 (2015).
3. M. I. Love, W. Huber, S. Anders, Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology 15, 550 (2014).
4. B. Langmead, S. L. Salzberg, Fast gapped-read alignment with Bowtie 2. Nat Methods 9, 357-359 (2012).
5. S. Anders, P. T. Pyl, W. Huber, HTSeq--a Python framework to work with high-throughput sequencing data. Bioinformatics 31, 166-169 (2015).
6. A. Dobin et al., STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15-21 (2013).


SI APPENDIX

FIGURE and TABLE LEGEND

Figure S1. Validation of the purity of nuclear preparation by western blot. Western blot analysis showing the levels of tubulin (left) and histone H3 (right) in the total, cytoplasmic and nuclear fractions.

Figure S2. Correlation between codon usage bias and mRNA synthesis rate or half-life in previous studies. Harigaya et al. summarized the RNA half-life or synthesis rate data from 14 datasets in previous studies and calculated the corresponding synthesis rate or half-life, respectively (33). The Pearson correlations between gene tAI and synthesis rate or half-life are summarized in the table.

Figure S3. Distribution of Pearson correlations between gene codon usage and mRNA levels in the WT strain from 12 mRNA-seq experiments. Cultures of the wild-type strain were grown in constant light at 30 °C for 24 h before harvesting, RNA extraction and RNA sequencing. Pearson correlations between CBI and mRNA levels from 12 independent mRNA-seq samples are shown.

Figure S4. Correlations between CBI and mRNA levels in candidate knockout strains. The correlations between CBI and mRNA levels are shown for the remaining 13 candidate strains not shown in Figure 3. All correlations are statistically significant with P value < 2.2e-16.

Figure S5. Violin plots of CBI distribution for differentially regulated genes in additional candidate strains. **P<0.01, ***P<0.001, ****P<0.0001. Gene numbers for each group are shown at the top.

Figure S6. Correlations between individual codon occurrences and mRNA level in additional candidate strains. Pearson correlation coefficients were plotted as in Figure 4C.


Figure S7. Relative protein and mRNA levels of the reporter genes transformed into the wild-type and set-2KO strains. (A-B) Relative protein (A) or mRNA (B) levels of the WT and OPT luciferase reporters expressed in the wild-type or set-2KO strains. The protein or mRNA levels were normalized to WT-Luc expressed in the wild-type strain. *P<0.05, **P<0.01. (C-D) Relative protein (C) or mRNA (D) levels of WT and OPT I-sceI expressed in the FGSC4200 or set-2KO strains. The protein or mRNA levels were normalized to I-sceI-WT expressed in the FGSC4200 strain. *P<0.05, **P<0.01. All p-values were obtained from Student's t-test with 3 replicates.

Figure S8. Pearson correlations between CBI and mRNA level excluding overlapping differentially regulated genes in the fkh1KO and set-2KO strains. The genes that are up- or down-regulated in both the fkh1KO and set-2KO strains were excluded from the analysis. Two replicates of (A) nuclear RNA-seq and (B) total mRNA-seq were used. The Log2(RPKM) values of the remaining genes were used to perform Pearson correlation analysis against CBI, and the correlation coefficients were plotted.

Table S1. List of the wild-type, candidate and control strains used in this study.

Dataset S1. List of all the mutant strains used in the mRNA-seq based genetic screen. The names of the knockout or knockdown genes and their corresponding IDs are listed.

Dataset S2. FPKM data of the mRNA-seq for all strains used in the genetic screen. In total, 206 mutant strains were sequenced in 6 separate batches with 3 replicates for each strain. Some strains were sequenced more than once. The gene IDs corresponding to each strain can be found in Table S2.

Dataset S3. RPKM data of the nuclear RNA sequencing and corresponding total RNA sequencing controls. Nuclear RNA of the wild-type, set-2KO and fkh1KO strains was sequenced with four replicates in two separate batches. In the second batch, two replicates of total RNA samples for each strain were also sequenced as controls.


Dataset S4. CBI of the genes selected for ChIP-qPCR verification. The CBI values of selected up-regulated and down-regulated genes in the set-2KO strain are listed. These genes were used for ChIP-qPCR verification of increased transcription level as shown in Figure 7. The CBI values of all Neurospora genes are also listed.

Figure S1

[Western blot images. Strains: WT, Δset-2, Δfkh1. Fractions: Total, Cytosol, Nuclei. Antibodies: Anti-tubulin, Anti-H3.]

Figure S2

Data source | Measured | Computed | Pearson correlation between gene tAI and mRNA synthesis rate | Pearson correlation between gene tAI and mRNA half-life
Holstege FC et al., Cell. 1998. | mRNA half-life | Synthesis rate determined by Harigaya & Parker 2016 | 0.70 | -0.10
Wang Y et al., PNAS. 2002. | mRNA half-life | Synthesis rate determined by Harigaya & Parker 2016 | 0.62 | 0.05
Wang Y et al., PNAS. 2002. | mRNA half-life | Synthesis rate determined by Harigaya & Parker 2016 | 0.65 | -0.05
Grigull J et al., Mol Cell Biol. 2004. | mRNA half-life | Synthesis rate determined by Harigaya & Parker 2016 | 0.65 | 0.04
Duttagupta R et al., Mol Cell Biol. 2005. | mRNA half-life | Synthesis rate determined by Harigaya & Parker 2016 | 0.53 | 0.00
Shalem O et al., Mol Syst Biol. 2008. | mRNA half-life | Synthesis rate determined by Harigaya & Parker 2016 | 0.67 | 0.00
Pelechano V et al., Yeast. 2010. | mRNA synthesis rate | Half-life determined by Harigaya & Parker 2016 | 0.50 | 0.18
Miller C et al., Mol Syst Biol. 2011. | mRNA half-life | Synthesis rate determined by Harigaya & Parker 2016 | 0.67 | 0.54
Munchel SE et al., Mol Biol Cell. 2011. | mRNA half-life | Synthesis rate determined by Harigaya & Parker 2016 | 0.44 | 0.17
Sun M et al., Mol Cell. 2013. | mRNA half-life | Synthesis rate determined by Harigaya & Parker 2016 | 0.27 | 0.40
Geisberg JV et al., Cell. 2014. | mRNA half-life | Synthesis rate determined by Harigaya & Parker 2016 | 0.59 | -0.12
Neymotin B et al., RNA. 2014. | mRNA half-life | Synthesis rate determined by Harigaya & Parker 2016 | 0.41 | 0.30
Presnyak V et al., Cell. 2015. | mRNA half-life | Synthesis rate determined by Harigaya & Parker 2016 | 0.58 | 0.32
Presnyak V et al., Cell. 2015. | mRNA half-life | Synthesis rate determined by Harigaya & Parker 2016 | 0.69 | -0.03

Figure S3

[Plot of Pearson R (mRNA Level vs CBI) for the WT strain; y-axis ticks from 0.4 to 0.7.]

Figure S4

[Scatter plots, one panel per strain. x-axis: Log2(Average FPKM), 0 to 16; y-axis: CBI, -0.2 to 1. Pearson R by panel: Δfl, 0.44; ΔNCU07975, 0.48; Δkal-1, 0.45; Δnst-7, 0.44; ΔNCU03897, 0.51; Δcol-24, 0.47; Δada-3, 0.46; Δvad-3, 0.48; Δset-1, 0.46; ΔNCU03043, 0.47; Δmsn-1, 0.47; Δada-2, 0.45; ΔNCU04445, 0.46.]

Figure S5

[Violin plots of CBI (y-axis) for the Unchanged, Down and Up gene groups (x-axis), one panel per comparison: Δfl vs WT, ΔNCU03897 vs WT, ΔNCU07975 vs WT, Δada-3 vs WT, Δkal-1 vs WT, Δcol-24 vs WT, Δmsn-1 vs WT, Δvad-3 vs WT, Δset-1 vs WT, ΔNCU03043 vs WT, ΔNCU04445 vs WT, Δada-2 vs WT. Significance markers and per-group gene numbers are given in the figure.]

Figure S6

[Plots, one panel per strain. y-axis: Pearson R (Codon Freq vs mRNA Level), -0.4 to 0.4; x-axis: individual codons labeled by amino acid (for example, V: GTC). Series: WT Intermediate, WT Optimal and WT Rare, and the Intermediate, Optimal and Rare series of the mutant. Panels: Δada-2, Δada-3, Δada-6, Δcol-24, ΔFKH1, Δkal-1, Δnst-5, Δscp160, Δvad-3, Δfl, Δmsn-1, Δnst-7, Δset-1, ΔNCU04424.]

Figure S7

[Panels A to D. A: relative protein level of Luc-WT and Luc-OPT in WT and Δset-2. B: relative mRNA level of Luc-WT and Luc-OPT in WT and Δset-2. C: relative protein level of I-sceI-WT and I-sceI-OPT in WT and Δset-2. D: relative mRNA level of I-sceI-WT and I-sceI-OPT in WT and Δset-2. Significance markers (*, **, n.s.) are given in the figure.]

Figure S8

[Panels A and B. A: Pearson R (CBI vs nuclear RNA level) for WT, ΔFKH1 and Δset-2. B: Pearson R (CBI vs total RNA level) for WT, ΔFKH1 and Δset-2.]

Table S1

Name | Genotype | Source
wildtype | wildtype, mat a | FGSC 4200
ΔNCU03897 | ΔNCU03897::Hygr mat a | FGSC 21524
Δada-6 | ΔNCU04866::Hygr mat a | FGSC 11022
Δnst-5 | ΔNCU00203::Hygr mat a | FGSC 14806
ΔFKH1 | ΔNCU00019::Hygr mat a | FGSC 11437
Δfl | ΔNCU08726::Hygr mat a | FGSC 11044
Δset-2 | ΔNCU00269::Hygr mat A | FGSC 15505
Δada-3 | ΔNCU02896::Hygr mat a | FGSC 11070
Δnst-7 | ΔNCU07624::Hygr mat a | FGSC 16002
Δcol-24 | ΔNCU05383::Hygr mat a | FGSC 11019
ΔNCU07975 | ΔNCU07975::Hygr mat A | FGSC 11337
Δvad-3 | ΔNCU06407::Hygr mat a | FGSC 11017
Δkal-1 | ΔNCU03593::Hygr mat a | FGSC 11129
Δmsn-1 | ΔNCU02671::Hygr mat a | FGSC 11345
Δset-1 | ΔNCU01206::Hygr mat A | FGSC 15827
ΔNCU04424 | ΔNCU04424::Hygr mat a | FGSC 11752
Δada-2 | ΔNCU02017::Hygr mat a | FGSC 11108
ΔNCU03043 | ΔNCU03043::Hygr mat a | FGSC 11224
ΔNCU04445 | ΔNCU04445::Hygr mat a | FGSC 11754
ΔDhh1 | ΔNCU06149::Hygr mat a | FGSC 13191
ΔDbp2 | ΔNCU07839::Hygr mat a | FGSC 15506
Δhda-2 | ΔNCU02795::Hygr mat A | FGSC 11158
4200-luc-wt | luc-mWT::CsAr, Glufosinater, mat a | This study
4200-luc-opt | luc-opt::CsAr, Glufosinater, mat a | This study
set-2-luc-wt | ΔNCU00269, luc-mWT::Hygr, CsAr, Glufosinater, mat A | This study
set-2-luc-opt | ΔNCU00269, luc-opt::Hygr, CsAr, Glufosinater, mat A | This study
4200-IsceI-wt | IsceI-wt::CsAr, Glufosinater, mat a | This study
4200-IsceI-opt | IsceI-opt::CsAr, Glufosinater, mat a | This study
set-2-IsceI-wt | ΔNCU00269, IsceI-wt::Hygr, CsAr, Glufosinater, mat A | This study
set-2-IsceI-opt | ΔNCU00269, IsceI-opt::Hygr, CsAr, Glufosinater, mat A | This study