
Molecular BioSystems

PAPER

This journal is © The Royal Society of Chemistry 2013. Mol. BioSyst.

Cite this: DOI: 10.1039/c3mb25515h

DNA thermodynamic stability and supercoil dynamics determine the gene expression program during the bacterial growth cycle

Patrick Sobetzko,a Monika Glinkowska,†a Andrew Travers b and Georgi Muskhelishvili*a

a Jacobs University Bremen, School of Engineering and Science, Campus Ring 1, D-28759 Bremen, Germany. E-mail: [email protected]; Fax: +49 421 200 3249; Tel: +49 421 200 3143

b MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 0QH, UK

† Present address: University of Gdansk, 80-308 Gdansk, Poland.

Received 11th November 2012, Accepted 19th February 2013

www.rsc.org/molecularbiosystems

Downloaded by University of Queensland on 28/04/2013 12:57:22. Published on 19 February 2013 on http://pubs.rsc.org | doi:10.1039/C3MB25515H

The chromosomal DNA polymer constituting the cellular genetic material is primarily a device for coding information. Whilst the gene sequences comprise the digital (discontinuous) linear code, physiological alterations of the DNA superhelical density generate in addition analog (continuous) three-dimensional information essential for regulation of both chromosome compaction and gene expression. Insight into the relationship between the DNA analog information and the digital linear code is of fundamental importance for understanding genetic regulation. Our previous study in the model organism Escherichia coli suggested that the chromosomal gene order and a spatiotemporal gradient of DNA superhelicity associated with DNA replication determine the growth phase-dependent gene transcription. In this study we reveal a general gradient of DNA thermodynamic stability correlated with the polarity of chromosomal replication and manifest in the spatiotemporal pattern of gene transcription during the bacterial growth cycle. Furthermore, by integrating the physical and dynamic features of the transcribed sequences with their functional content we identify spatiotemporal domains of gene expression encompassing different functions. We thus provide both an insight into the organisational principle of the bacterial growth program and a novel holistic methodology for exploring chromosomal dynamics.

Introduction

Understanding genetic regulation at a systemic level necessitates that both the linear code and the three-dimensional information determined by the configuration of the chiral DNA polymer be fully compatible, especially since both are intrinsically coupled in the primary sequence organisation of the DNA.1,2 There is ample evidence that physiological variations in the composition of nucleoid-associated proteins (NAPs) and DNA processing enzymes play a determinative role in structural dynamics and compaction of the bacterial nucleoid.3–14 However, it is less obvious how, in response to changing physiological demands, the structural dynamics of the chromosomal DNA optimize gene transcription to facilitate adaptation. Genomic variations of the DNA superhelical density affect directly and distinctly the gene activity.15–18 Yet the organisational principle of genome-wide coupling between the supercoiling-dependent DNA structures and the gene expression patterns remains unclear, although for individual genes it is understood in some detail. This coupling mechanism involves local binding effects of the DNA architectural proteins stabilizing distinct DNA configurations in the gene promoter regions.19,20 Such three-dimensional DNA structures themselves depend on the superhelical density and can mediate selective channeling of the stored supercoil energy into promoter opening and gene transcription.21–24

The rationale of holistic methodology

To explore the genome-wide coupling between the thermodynamic and coding properties of the DNA we carried out next generation sequencing of mRNAs isolated from the E. coli K12 strain CSH50 grown in a fermenter under controlled conditions of temperature and oxygen partial pressure. We generated datasets corresponding to patterns of analog information (genomic distributions of the DNA melting energy and supercoiling) and digital information (distributions of distinct functional classes of genes) in the genomic sequences expressed during the growth cycle. By digital information we mean the DNA sequence written as a unique succession of individual letters. Any gene is defined this way as a discontinuous entity and so belongs to the digital information type. Conversely, the DNA superhelical density is by definition a continuous parameter largely independent of the unique sequence and thus belongs to the analog type of information. For determination of genomic melting energy distributions we calculate the average melting energy of all dinucleotide steps (either in the entire genome or in the expressed sequences), thus yielding analog information. By simultaneous analyses of the digital and analog parameters of the expressed DNA sequence we thus integrate the physical and dynamic properties of the transcribed sequences with their functional content.

In addition, we have monitored the leading strand utilization bias and the density distributions of the activated and repressed genes conceivably reflecting the structural dynamics of the chromosome. We next detected the statistically significant distribution boundaries for each type of analysed information and correlated them among the different datasets to derive the boundaries of chromosomal domains. Using this holistic methodology we approximated the spatiotemporal maps of functionally relevant chromosomal domains and revealed their dynamics during the bacterial growth cycle.
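The averaging of melting energy over dinucleotide steps described in this section can be illustrated with a minimal sketch. It assumes a circular sequence and a lookup table of per-step energies; the function names and the table `TOY_TABLE` are hypothetical, and the table values are placeholders chosen for easy arithmetic, not published thermodynamic parameters and not the values used in this study.

```python
from typing import Dict, List


def dinucleotide_energies(seq: str, table: Dict[str, float]) -> List[float]:
    """Energy of every dinucleotide step of a circular sequence."""
    n = len(seq)
    # The last step wraps around from the final base to the first one.
    return [table[seq[i] + seq[(i + 1) % n]] for i in range(n)]


def sliding_mean_circular(values: List[float], window: int) -> List[float]:
    """Mean of `window` consecutive values starting at each position, with wrap-around."""
    n = len(values)
    if not 0 < window <= n:
        raise ValueError("window must be between 1 and len(values)")
    total = sum(values[:window])
    means = []
    for start in range(n):
        means.append(total / window)
        # Slide by one step: add the entering value, drop the leaving one.
        total += values[(start + window) % n] - values[start]
    return means


# Placeholder energies (assumption): each G or C in a step adds -0.5 to a base of -1.0.
TOY_TABLE = {
    a + b: -1.0 - 0.5 * ((a in "GC") + (b in "GC"))
    for a in "ACGT"
    for b in "ACGT"
}

profile = sliding_mean_circular(dinucleotide_energies("GGCCAATT", TOY_TABLE), window=4)
```

With real data the window would be far larger (the coarse analysis in this paper uses 500 kb), and the same routine could be applied either to the whole genome or only to expressed sequences.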

Results

Temporal distributions of DNA information

A striking characteristic of γ- and α-proteobacterial genomes is that the gene order relative to the origin of replication is highly conserved.25 Notably those genes that are required for rapid vegetative growth are clustered close to the origin while genes that are preferentially expressed during anaerobic stress are predominantly located close to the Ter region. We observed that this organisation is paralleled by the average underlying physical properties of the circular genomic DNA. A coarse resolution analysis (with a sliding window of 500 kb) of the physical properties of the E. coli genome revealed apparent gradients of average negative DNA melting energy extending from the replication origin to the terminus in both replichores, with a high (blue color coded) average negative melting energy in the Ori-proximal region and a low (red color coded) average negative melting energy in the Ter-proximal region (Fig. 1A and B). Analysis of 300 γ-proteobacterial genomes listed in the DoriC database26 further revealed a similar, but more symmetric pattern of gradients (Fig. 1C) and a compact region of low negative melting energy apparent at the origin of replication. The symmetry of this pattern about the Ori–Ter axis argues strongly that the physical properties of the DNA are related to the replication process itself. Not only does the DNA itself exhibit Ori–Ter gradients but so also do the binding sites for DNA gyrase, the major enzyme introducing negative supercoils into the DNA,15 which are preferentially located in the OriC-proximal half of the genome (Fig. 1D; note that the color code is inverted, red for high and blue for low values) suggesting a supercoiling gradient from OriC to Ter.25 Although we have no direct evidence for this gradient, we believe this organisation to be relevant, since mutation of HU, the major supercoil-constraining NAP in E. coli, imposes a spatial gradient of the supercoiling response of transcription closely reflecting the distribution of the gyrase binding sites in the chromosome.10

Fig. 1 Averaged distributions of melting energy in the genomes of E. coli (A and B) and all γ-proteobacteria (C). Scanning window size is indicated. Note that high negative melting energy is coded in blue color. (D) Frequency distribution of the gyrase binding sites in the E. coli genome.15 Note that high abundance of binding sites is coded in red color (red color corresponds to 50 binding sites, blue color to 2 binding sites in a 500 kb window). The origin of replication (OriC) is indicated. The macrodomains31 are indicated by colors: Ori (green), Right (red), Left (dark blue), Ter (light blue), LNS and RNS in black.

Is the observed static asymmetry in DNA thermodynamic stability related to the dynamic gene expression profile during the E. coli growth cycle? We found that after commitment of the cells to growth and until the late exponential phase (around 100 min) the average negative melting energy of the expressed gene sequences showed a continuous increase (Fig. 2A, black curve). This increase correlated with increasing oxygen consumption and proximity to the Ori region. In contrast, with the onset of more anaerobic conditions (after 100 min) the average melting energy decreased while the average distance of the expressed genes to the Ori end increased (yellow curve). Consistent with previous observations25 the expressed Ori-proximal high negative melting energy sequences corresponded to HYP genes (for the gene list see ESI in ref. 17) requiring high negative superhelicity for transcription (blue curve). With further bacterial growth both the negative melting energy and proportion of HYP genes continuously decreased towards the late stationary phase attaining corresponding minima at the minimum oxygen consumption, likely correlating with nutrient exhaustion (400 min). In contrast the REL genes requiring DNA relaxation (for the gene list see ESI in ref. 17) showed a biphasic pattern with a major peak at minimal partial oxygen pressure in the medium (around 200 min; pink curve) and thus substantially later than the peak of HYP genes during the growth cycle. However, the REL genes also decreased with reduction of oxygen consumption and presumed nutrient exhaustion.
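The temporal curves discussed above are expression-weighted averages over genes (melting energy, distance to the origin) that are then rescaled to [0;1] for plotting, as stated in the legend of Fig. 2. A minimal sketch of these two steps follows; the function names and toy numbers are hypothetical and serve only to make the arithmetic concrete.

```python
from typing import List, Sequence


def circular_distance(pos: float, origin: float, genome_len: float) -> float:
    """Shortest distance between a locus and the origin on a circular chromosome."""
    d = abs(pos - origin) % genome_len
    return min(d, genome_len - d)


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Average of per-gene values weighted by per-gene expression."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(v * w for v, w in zip(values, weights)) / total


def normalize_unit(series: Sequence[float]) -> List[float]:
    """Rescale a time series linearly so that its minimum is 0 and its maximum is 1."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]
    return [(x - lo) / (hi - lo) for x in series]


# Toy data (assumption): three genes on a 1000-unit chromosome, three time points.
GENOME_LEN = 1000
ORIGIN = 0
positions = [100, 500, 900]
expression_by_time = [[3, 1, 0], [1, 1, 0], [0, 1, 0]]

distances = [circular_distance(p, ORIGIN, GENOME_LEN) for p in positions]
mean_distance = [weighted_mean(distances, w) for w in expression_by_time]
curve = normalize_unit(mean_distance)
```

In this toy series expression shifts away from the origin over time, so the normalized curve rises from 0 to 1; the same weighting could be applied to per-gene melting energies.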

Temporal distributions of the functional gene classes

We next investigated the patterns of the wide functional classes of anabolic, catabolic, aerobic and anaerobic genes (Regulon DB, see Methods) in our effective transcript profiles. As expected these also showed clear temporal patterns (Fig. 2B). The activation of anabolic genes (blue curve) closely matched that of the Ori-proximal HYP genes, whereas the peak of anaerobic genes (yellow curve) coincided with the major peak of the REL genes, both being observed at minimal partial oxygen pressure. As expected, the aerobic genes were activated earlier (peak around 160 min) than anaerobic genes (peak around 200 min) in the interval when the proportion of the HYP and REL genes was more balanced (between 100 and 200 min). The catabolic genes increased in concert with REL genes, but remained high in the late stationary phase and declined only on commitment to growth (0 min). Overall, the distribution pattern of anabolic and catabolic gene classes among the HYP and REL genes is fully consistent with previous observations17,27,28 and underscores the major role of supercoiling in organising the cellular metabolism both during normal growth and also during experimental evolution.29,30
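The legend of Fig. 2 states that class expression is normalized to the expression of all genes and that the envelopes show the standard deviation at 10% random remapping of expression patterns to genes. The exact procedure is not given in the text, so the sketch below is only one plausible reading: expression values of a random 10% of genes are shuffled among those genes, the class share is recomputed, and the spread over many repetitions is reported. All names and parameters are hypothetical.

```python
import random
import statistics
from typing import List, Sequence


def class_share(expression: Sequence[float], members: Sequence[bool]) -> float:
    """Fraction of total expression contributed by genes of one functional class."""
    total = sum(expression)
    return sum(e for e, m in zip(expression, members) if m) / total


def remapping_std(
    expression: Sequence[float],
    members: Sequence[bool],
    fraction: float = 0.1,
    n_iter: int = 200,
    seed: int = 0,
) -> float:
    """Spread of the class share when a random fraction of genes swap expression values."""
    rng = random.Random(seed)
    n = len(expression)
    k = max(2, round(fraction * n))  # at least two genes are needed for a swap
    shares: List[float] = []
    for _ in range(n_iter):
        chosen = rng.sample(range(n), k)
        targets = chosen[:]
        rng.shuffle(targets)
        remapped = list(expression)
        for src, dst in zip(chosen, targets):
            remapped[dst] = expression[src]
        shares.append(class_share(remapped, members))
    return statistics.pstdev(shares)


# Toy data (assumption): 10 highly expressed class members, 10 weakly expressed others.
expression = [10.0] * 10 + [1.0] * 10
members = [True] * 10 + [False] * 10
share = class_share(expression, members)
envelope = remapping_std(expression, members)
```

A narrow envelope relative to the temporal change of the share would indicate that the pattern is robust to partial misassignment of expression values.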

Mapping the spatiotemporal patterns of analog and digital information

To relate our temporal patterns to chromosomal dynamics we constructed spatiotemporal distribution maps of the expressed gene density in the chromosome represented as an assembly of forty-three genomic wheels (Fig. 3), each corresponding to a sequential time point sampled with 10 min intervals after inoculation (0 min) through the entire bacterial growth cycle (0–430 min). We found that on commitment of the cells to fast growth in rich medium the Ori macrodomain and both the flanking right and left non-structured domains (RNS and LNS31) showed significant up-regulation of genes, whereas the Ter macrodomain was significantly repressed (Fig. 3). However, around 100 min the situation was reversed with the Ter macrodomain showing higher activated gene density than the chromosomal Ori end. Later during the growth cycle (after 200 min) the density changes demonstrated more patchy character. The extended region around the junction of the LNS and Left macrodomains showed little temporal variation throughout the growth cycle compared to RNS and the RNS-proximal half of the Right macrodomain, which demonstrated an alternating pattern of activation and repression. Furthermore, in the late stationary phase the Ter macrodomain showed higher expression density than the Ori macrodomain. This pattern of changing activation and repression in the genome likely reflects sequential changes in the morphology of the chromosome coupled with successive responses to nutrient exhaustion. Indeed, the heterarchical network (HEN) patterns describing the spatiotemporal communications between functionally related genes25 showed strong activation of communications in the Ori macrodomain and concomitant repression of the Ter macrodomain during fast exponential growth (Fig. 4), whereas upon transition to the stationary phase the communications in the Ori-proximal region were repressed and those in the Ter macrodomain activated. This alternation of activated and repressed poles of the chromosome reflected in temporal transcript patterns is fully consistent with previously proposed dynamic changes in the gross morphology of the chromosome during the growth cycle.25
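The genomic wheel assembly described above is, in essence, a matrix with one row per time point and one column per chromosomal segment. A minimal sketch of how such a matrix of expressed gene counts could be built is given below; the binning scheme, function names and toy coordinates are assumptions and do not reproduce the statistical colouring used in Fig. 3.

```python
from typing import List, Sequence


def density_wheel(
    positions: Sequence[float],
    expressed: Sequence[bool],
    genome_len: float,
    n_bins: int,
) -> List[int]:
    """Count expressed genes in equal-sized bins around a circular chromosome."""
    counts = [0] * n_bins
    for pos, on in zip(positions, expressed):
        if on:
            # Guard against rounding at the upper edge of the last bin.
            idx = min(n_bins - 1, int((pos % genome_len) * n_bins // genome_len))
            counts[idx] += 1
    return counts


def wheel_assembly(
    positions: Sequence[float],
    expressed_by_time: Sequence[Sequence[bool]],
    genome_len: float,
    n_bins: int,
) -> List[List[int]]:
    """One density wheel per time point, ordered from first to last sample."""
    return [density_wheel(positions, row, genome_len, n_bins) for row in expressed_by_time]


# Toy data (assumption): four genes, two time points, a 1000-unit chromosome in 4 bins.
positions = [50, 300, 600, 950]
expressed_by_time = [
    [True, True, False, False],   # early: genes near position 0 are expressed
    [False, False, True, True],   # late: expression shifts to the other half
]
wheels = wheel_assembly(positions, expressed_by_time, genome_len=1000, n_bins=4)
```

For the data in this paper the assembly would have forty-three rows, one per 10 min sample.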

To evaluate the role of DNA analog properties in such morphological changes we inspected their distributions on the spatiotemporal genomic wheel assemblies (Fig. 5). We observed that the pattern of high melting energy sequences largely correlated with leading strand utilisation in the chromosome (compare Fig. 5A and B; note that high negative melting energy is coded in blue, whereas high leading strand utilisation is coded in red) whereas the patterns of the HYP and REL genes were largely anti-correlated (compare Fig. 5C and D). We used these patterns to detect statistically significant distribution boundaries for each type of analysed information and correlated the boundaries among the datasets. We could thus detect eleven significant distribution boundaries delimiting eleven chromosomal domains (numbered FD1 to FD11 in Fig. 6).
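The text does not specify how boundaries from the different datasets were correlated. As a hedged illustration only, the sketch below keeps a boundary when at least `min_support` datasets place a boundary within a tolerance `tol` of it on the circular chromosome. The function names, the tolerance and the toy coordinates are all assumptions, and the step of testing each boundary for statistical significance is not shown.

```python
from typing import List, Sequence


def circ_dist(a: float, b: float, genome_len: float) -> float:
    """Shortest distance between two positions on a circular chromosome."""
    d = abs(a - b) % genome_len
    return min(d, genome_len - d)


def consensus_boundaries(
    datasets: Sequence[Sequence[float]],
    genome_len: float,
    tol: float,
    min_support: int = 2,
) -> List[float]:
    """Boundaries supported by at least `min_support` datasets within `tol`."""
    pooled = sorted(b for ds in datasets for b in ds)
    kept: List[float] = []
    for cand in pooled:
        support = sum(
            any(circ_dist(cand, b, genome_len) <= tol for b in ds) for ds in datasets
        )
        # Skip candidates that duplicate a boundary already kept.
        is_new = all(circ_dist(cand, k, genome_len) > tol for k in kept)
        if support >= min_support and is_new:
            kept.append(cand)
    return kept


# Toy boundary positions in kb (assumption), one list per type of information.
melting = [100, 1500, 3000]
hyp = [120, 2200, 3010]
strand = [1480, 4000]
shared = consensus_boundaries([melting, hyp, strand], genome_len=4600, tol=50)
```

On a circular chromosome the number of domains equals the number of retained boundaries, which is why eleven boundaries delimit eleven domains.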

What is the relevance of these domains to the chromosomal function? To answer this question we mapped the functional groups of genes compiled in the Gene Ontology (GO) dataset on the genome (outer ring in Fig. 6). The borders between the distinct functional classes of genes were detected by scoring the degree of change of function along the chromosome and correlated with the borders of analog information. We found a close coincidence of the latter with functional borders (black lines crossing the two outer rings and the macrodomains in Fig. 6) strongly implicating these boundaries in generating transient chromosomal functional domains (FDs) during the growth cycle.
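Scoring the degree of change of function along the chromosome can be pictured as comparing the functional composition of the genes on either side of each position. The scoring function actually used is not described here; the sketch below substitutes a simple total variation distance between the class frequencies of the `w` genes to the left and the `w` genes to the right, with hypothetical names and toy labels.

```python
from collections import Counter
from typing import List, Sequence


def change_score(labels: Sequence[str], i: int, w: int) -> float:
    """Total variation distance between class frequencies left and right of gene i."""
    n = len(labels)
    left = Counter(labels[(i - k) % n] for k in range(1, w + 1))
    right = Counter(labels[(i + k) % n] for k in range(w))
    classes = set(left) | set(right)
    # Counter returns 0 for classes absent from one side.
    return 0.5 * sum(abs(left[c] - right[c]) for c in classes) / w


def change_profile(labels: Sequence[str], w: int) -> List[float]:
    """Score every position of a circular gene order; peaks suggest functional borders."""
    return [change_score(labels, i, w) for i in range(len(labels))]


# Toy gene order (assumption): two blocks of six genes on a circular chromosome.
labels = ["anabolic"] * 6 + ["catabolic"] * 6
scores = change_profile(labels, w=3)
```

The score is 0 inside a homogeneous block and 1 where the two flanks share no class, so the two block junctions of the toy example (positions 0 and 6) stand out as candidate borders.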

Discussion

In this study we introduce a novel holistic methodology employing the transcriptomics data for integrating logically distinct types of DNA information and detecting the organisation and dynamics of structural and functional domains in the bacterial chromosome. The integration of the data is achieved by mapping the spatiotemporal distributions of the physical and dynamic properties of the DNA polymer (such as melting energy and superhelical density) in distinct functional classes of metabolic genes expressed during the bacterial growth cycle.

Fig. 2 Temporal dynamics of DNA information in the transcript profiles. (A) Temporal dynamics of analog parameters. The plot shows the temporal impact of different analog parameters on gene expression. The different curves were normalized to [0;1] to compare them in one plot. Minimum and maximum values are indicated in brackets in the legend. Expression values (HYP, REL) in brackets are normalized to the expression of all genes. Melting energy and distance to origin were averaged over all genes weighted by their expression. (B) Temporal dynamics of digital components. The plot shows the temporal impact of different digital parameters on gene expression. The different curves were normalized to [0;1] to compare them in one plot. The envelopes of the curves indicate the standard deviation at 10% random remapping of the expression patterns to genes. Minimum and maximum values are indicated in brackets in the legend. Expression values (anabolic, catabolic, aerobic, anaerobic) in brackets are normalized to the expression of all genes. In both graphs (A and B) the optical density and partial oxygen pressure are indicated by the dashed blue and green lines, respectively.

Fig. 3 Spatiotemporal genomic wheels showing the density of the expressed genes. The macrodomains are indicated by colors as in Fig. 1. The forty-three temporal rings are assembled from inside to outside and the time flow (10 min to 430 min) is indicated by the dashed white arrow. The reference point (0 min) is the late stationary phase (last ring), such that the pattern is circular. The colors in displayed distributions indicate constant (green), high (red) and low (blue) expressed gene density. The two outer rings correspond, respectively, to the chromosomal macrodomains and the static average melting energy distribution (500 kb window) in the E. coli genome (see Fig. 1).

Fig. 4 HEN communication patterns of the cells harvested during fast exponential growth (stage 4 from Fig. 3, left panel) and on transition to the stationary phase (stage 14 from Fig. 3, right panel). The genomic wheels below the macrodomain rings show the significantly increased (red) or decreased (blue) densities of the expressed genes according to the depicted stages. Significantly increased communications between the chromosomal domains are indicated by red connecting lines, significantly decreased communications by blue connecting lines; black lines indicate static HEN communications absent in the expression profile. The positions of OriC and Ter, as well as the chromosomal macrodomains, are indicated.

A revealing insight into the mechanism underlying the growth phase-dependent gene expression pattern is provided by the observed gradient of melting energy in the bacterial genomes (Fig. 1) coinciding with both the distribution of gyrase binding sites and the directionality of replication, and reflected in the spatiotemporal transcript patterns as well (Fig. 2 and 3). These observations strongly suggest that organisation of the thermodynamic stability of the DNA in the genome is determinative for distinct coupling of DNA topology to the metabolic function of the expressed gene sequences. During the growth cycle this coupling is evident in sequential activation of the HYP and REL genes in concert with sequential activation of anabolic and catabolic genes (Fig. 2). Furthermore, both the gene density distributions and HEN communication patterns (Fig. 3 and 4) suggest an alternating dominance of the Ori and Ter ends during the growth cycle implicating morphological changes of the chromosome associated with changing expression of the anabolic and catabolic genes. Notably, after the early peak the decline of both the HYP and anabolic genes correlates with a steep decline in partial oxygen pressure (Fig. 2), whereas that of REL and anaerobic genes correlates with attaining minimal partial pressure values. This observation strongly supports the model in which the bacterial growth phase-dependent transcription program is determined by high oxygen utilisation generating supercoiling gradients along both replichores from the origin to the terminus of replication.25 Strong expression of anabolic transcriptons is apparently streamlined by the influence of the replication process facilitating the leading strand transcription. Especially the very strong ribosomal RNA operons aggregating in microscopically detectible transcription foci are oriented away from the origin.32 Thus, for the anabolic function the preferential usage of the leading strand33 would both disfavor clashes between the replication and transcription machineries and also favor, due to continuous replication mode, accumulation of higher levels of superhelicity required for transcription initiation in the wake of the translocating replisome.34
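The leading strand preference discussed above can be made concrete with a short sketch. A gene counts as leading-strand encoded when it is transcribed in the same direction as the replication fork that passes it: on the replichore running clockwise from the origin to the terminus this is the forward ("+") orientation, on the other replichore the reverse ("-") orientation. The coordinates, strand symbols and function names below are hypothetical, and the toy positions are not E. coli map positions.

```python
from typing import Sequence, Tuple


def on_leading_strand(
    pos: float, strand: str, ori: float, ter: float, genome_len: float
) -> bool:
    """True if the gene is transcribed co-directionally with its replication fork."""
    # Length of the replichore that runs from ori to ter with increasing coordinate.
    clockwise_len = (ter - ori) % genome_len
    in_clockwise = (pos - ori) % genome_len < clockwise_len
    return strand == "+" if in_clockwise else strand == "-"


def leading_fraction(
    genes: Sequence[Tuple[float, str]], ori: float, ter: float, genome_len: float
) -> float:
    """Fraction of genes encoded on the leading strand."""
    hits = sum(on_leading_strand(p, s, ori, ter, genome_len) for p, s in genes)
    return hits / len(genes)


# Toy chromosome (assumption): length 1000, origin at 800, terminus at 300.
genes = [(900, "+"), (100, "-"), (600, "-"), (400, "+")]
flags = [on_leading_strand(p, s, 800, 300, 1000) for p, s in genes]
bias = leading_fraction(genes, 800, 300, 1000)
```

Weighting each gene by its expression at a given time point, rather than counting genes, would give a leading strand utilisation value per time point of the kind mapped in Fig. 5B.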

In an attempt to relate the physical and mechanical properties of the chromosomal DNA to its functional content we observed spatiotemporal organisation of distinct functional domains (FDs). Importantly, the apparently flexible borders delimiting the FDs are deviant from those of chromosomal macrodomains detected previously by scoring the recombination frequencies between different chromosomal loci,31 or those of functional "core" genes detected by the bioinformatics approach.35 One possible explanation is that both the above-mentioned studies describe the domains as static entities determined primarily by either structural31 or functional35 considerations, whereas our approach determines temporal borders of structural–functional domains derived from the transcriptome of the entire growth cycle. Notwithstanding this difference in the approach used some coincidences are of note. For example, the domains FD10 and FD11 largely overlap with LNS and the B domain of the Mathelier and Carbone35 map. The three adjacent FDs in the Ori end of the chromosome (FD1–FD3) comprise the entire Ori macrodomain in the Valens et al. map,31 whereas FD1–FD4 largely comprise the A domain of Mathelier and Carbone. Furthermore, FD8 and FD9 comprise the C, D and E domains of Mathelier and Carbone, and the five adjacent FDs (FD10, FD11 and FD1–FD3) comprise the rrn domain of the Berger et al. map.10 Importantly, since the borders of the FDs are transient, this implies that they dynamically appear in different, non-random constellations delimiting domains of variable size depending on cellular physiology and growth phase. It is also noteworthy that the domains detected in previous studies are assumed static and furthermore, the chromosomal boundaries defined in all these different studies, each having its limitations, including ours, are overall imprecise. Nevertheless, the coincidence of the structural and functional borders delimiting the FDs strongly supports their relevance.

Fig. 5 Spatiotemporal genomic wheels showing distributions of the analog information in the expression profile. (A) Distribution of average DNA melting energy (blue for high, red for low average negative melting energy). (B) Distribution of leading strand preference (red for high, blue for low preference). (C) Distribution of HYP genes (red for high, blue for low frequency). (D) Distribution of REL genes (red for high, blue for low frequency). Grey regions indicate low numbers of genes insufficient for statistical analysis. The two outer rings correspond to the chromosomal macrodomains and the static average melting energy distribution (500 kb window) in the E. coli genome.

Cellular transcriptional regulators and especially the NAPs are implicated in determining the structural organisation of the chromosome.36–38 Importantly, the regulatory genes are spatially ordered in the chromosome,25 whereas the gene products were shown to be translated in the vicinity of cognate genes,39,40 thus potentially forming concentration gradients. Also the frequency distributions of binding sites for the major regulators of bacterial growth (DNA gyrase, NAPs and RNA polymerase sigma factors) form spatial gradients in the genome interacting with temporally changing gradients of the DNA superhelicity, NAPs and RNA polymerase sigma factors during the growth cycle.10,15,25,41,42 We propose that FDs are organised by dynamic gradients of regulators generating temporally unique intersections of cognate regulons during the growth cycle and thus selecting distinct groups of functionally coherent genes instantly responding to the changing gradient.1,28 An important question directly relevant to this issue is whether the observed pattern is preserved in mutants lacking global regulators and/or cells impaired in maintaining high rates of replication and superhelical density.

Conclusion

In conclusion, we propose that conversion of the DNA analog information into the digital linear code is a basic device coordinating the bacterial growth program. Information conversion is enabled by a peculiarity of the primary sequence organisation integrating the thermodynamic properties of the DNA with the functional "meaning" of the gene sequences. This integration is facilitated by organising genes in a highly conserved order relative to the origin of replication alongside the gradients of DNA thermodynamic stability and superhelicity in both chromosomal replichores. Our results support the view that supercoiling preferences of spatially ordered genes in the chromosome are bound to different regimes of oxygen consumption. The activation of the high negative melting energy, HYP, and anabolic genes at high partial oxygen pressure precedes the activation of the low melting energy, REL, and catabolic genes under hypoxic conditions, and the peak of aerobic genes precedes that of anaerobic genes (Fig. 2). Integration of these variable patterns delimits the borders of FDs, appearing as basic spatiotemporal entities for dynamic assembly of the genetic program during bacterial growth. The detected borders may mark topologically sensitive regions acting as flexible supercoil repositories or hinges that shape the chromosome according to the available superhelical energy.43 The dynamic nature and apparent functional partition of FDs are fully consistent with the view of chromosomes as thermodynamic machines converting available energy into information.1,2 Given that E. coli is a facultative anaerobe, an open question meriting further study is how this conversion is harnessed in both obligate aerobes and obligate anaerobes.

Fig. 6 Organisation of the dynamic functional domains (FDs) in the E. coli chromosome. The eleven detected domains (FD1 to FD11) and the chromosomal positions of their boundaries (in kb) are indicated. The outermost ring represents the distribution of static borders (highlighted in red) of functional domains in the chromosome. The following ring represents the distribution of structural domains with dynamic borders (highlighted in red) delimited by analog information distributions in the chromosome, as approximated from expression profiles obtained during the bacterial growth cycle. The third ring indicates the static borders of chromosomal macrodomains31 of Valens et al. (2004). The fourth ring shows the static borders of the functional domains35 of Mathelier and Carbone (2010). The innermost ring indicates the rrn functional domain10 of Berger et al. (2010). The approximated positions of the boundaries are indicated in kilobases.

Methods

Cell growth conditions and mRNA isolation

The Escherichia coli CSH50 wild type strain was grown in 4 L of double rich medium (dxYT) in a fermenter under constant pH 7.4 and high aeration (500 rpm stirring, 5 L air per min) at 37 °C. The culture was inoculated from 16 h overnight cultures at an initial OD600 of 0.1. Cells were grown for 7 hours and samples for RNA-seq were taken at 1, 2, 3, 5 and 7 hours after inoculation. Each sample was immediately dissolved in ice-cold ethanol–phenol (5% phenol) solution to prevent mRNA degradation. RNA was extracted using the RNeasy Mini kit (Qiagen) and treated with Turbo DNase (Life Technologies). Subsequent rRNA depletion was carried out using the MicrobExpress kit (Life Technologies) and 0.5 mg of enriched mRNA of each sample were sent for RNA-seq (Illumina HiSeq 2000).

Gene expression analysis

The 50 bp reads were mapped on the E. coli MG1655 genome (NCBI). Gene expression was determined by normalizing the CDS reads by the total number of reads as well as the length of the CDS. The expression curve of each gene was interpolated by a natural spline method. Resulting expression curves were verified by fluorescence measurements of yfp-coupled promoters exhibiting different temporal patterns. Differentially expressed genes were determined using a threshold of a 2-fold change of expression per hour.
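The normalization and the differential-expression threshold described above can be illustrated with a minimal Python sketch. Several points are assumptions made only for illustration: the scaling to reads per kilobase per million reads, the function names, and the reading of "2-fold change per hour" as a log2 expression ratio between consecutive samples divided by the elapsed time. The sketch also works on the raw sampling points rather than on the spline-interpolated curves used in the study.

```python
import math

def normalized_expression(cds_reads, total_reads, cds_length_bp):
    """CDS reads scaled by CDS length (per kb) and by sequencing depth
    (per million reads). The per-kb/per-million scaling is an assumption."""
    return cds_reads / (cds_length_bp / 1e3) / (total_reads / 1e6)

def is_differential(times_h, expr, fold=2.0, pseudo=1e-9):
    """Return True if expression changes at least `fold`-fold per hour
    between any two consecutive sampling points (assumed interpretation)."""
    threshold = math.log2(fold)
    samples = list(zip(times_h, expr))
    for (t0, e0), (t1, e1) in zip(samples, samples[1:]):
        # log2 fold change divided by the elapsed time in hours
        rate = abs(math.log2((e1 + pseudo) / (e0 + pseudo))) / (t1 - t0)
        if rate >= threshold:
            return True
    return False
```

For example, with the sampling times used here (1, 2, 3, 5 and 7 h), a gene rising from 10 to 25 units within the first hour would be flagged, whereas a gene drifting from 10 to 30 units over six hours would not.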

For the analysis of patterns in the chromosome we used 500 kb scanning windows shifting by 4 kb. In order to determine the usage of DNA features on the chromosome (HYP, REL, gene density, melting energy and leading strand usage) we determined z-scores. For HYP, REL and leading strand usage the frequency ratio of the feature in the up- and down-regulated genes was compared to the frequency of the same number of randomly chosen up- and down-regulated genes in the window. For gene density the ratio of up- and down-regulated genes was used to generate z-scores. In the case of melting energy, calculated using the parameters of Santa-Lucia,44 the difference in melting energy of up- and down-regulated genes was compared to the difference in melting energy of randomly chosen up- and down-regulated genes. Because melting energy is by convention expressed as a negative ΔG value, high melting energies have a lower (more negative) numerical value.
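A minimal Python sketch of the windowing and of a resampling-based z-score is given below. It is not the code used in the study. The function names are hypothetical, the statistic is the difference (rather than the ratio) of feature frequencies between up- and down-regulated genes, chosen here to avoid division by zero, and the null distribution is built by drawing the same numbers of genes at random from all genes in the window.

```python
import random
import statistics

def circular_windows(genome_len, width=500_000, step=4_000):
    """Yield (start, end) of scanning windows on a circular chromosome.
    `end` is taken modulo the genome length, so windows may wrap the origin."""
    for start in range(0, genome_len, step):
        yield start, (start + width) % genome_len

def in_window(pos, start, end):
    """True if `pos` lies in the (possibly wrapping) window [start, end)."""
    return start <= pos < end if start < end else (pos >= start or pos < end)

def feature_zscore(has_feature, up_idx, down_idx, n_random=1000, rng=None):
    """z-score of the feature usage in up- versus down-regulated genes of one
    window against randomly chosen gene sets of the same sizes.
    `has_feature` holds one boolean per gene in the window."""
    rng = rng or random.Random(0)

    def stat(up, down):
        f_up = sum(has_feature[i] for i in up) / len(up)
        f_down = sum(has_feature[i] for i in down) / len(down)
        return f_up - f_down  # assumed statistic: frequency difference

    observed = stat(up_idx, down_idx)
    genes = list(range(len(has_feature)))
    null = []
    for _ in range(n_random):
        picked = rng.sample(genes, len(up_idx) + len(down_idx))
        null.append(stat(picked[:len(up_idx)], picked[len(up_idx):]))
    sd = statistics.pstdev(null)
    return 0.0 if sd == 0 else (observed - statistics.fmean(null)) / sd
```

The sketch assumes that each window contains at least one up-regulated and one down-regulated gene; windows without such genes would need to be skipped by the caller.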

Temporal expression patterns were determined by first normalizing the gene expression pattern of each gene in the range of [0;1]. For each temporal pattern of gene groups (e.g. HYP or REL) we applied the average normalized expression of all genes in this group. In the case of melting energy and distance to origin we applied the average melting energy/distance to origin of all genes, weighted by their normalized expression. The robustness of the curves was determined by 10% random remapping of expression patterns to genes. Functional gene groups were taken from the respective GO branches provided by the RegulonDB database. Anabolic genes: biosynthesis of macromolecules (GID000000120); catabolic genes: degradation of macromolecules (GID000000057); aerobic genes: aerobic respiration (GID000000068); anaerobic genes: anaerobic respiration (GID000000069).
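The normalization to [0;1] and the expression-weighted averaging can be sketched in a few lines of Python. The function names are hypothetical, and min-max scaling is assumed as the normalization; genes with a flat profile are mapped to zero here, which is an arbitrary choice.

```python
def minmax(profile):
    """Scale one gene's temporal expression profile to the range [0, 1]."""
    lo, hi = min(profile), max(profile)
    if hi == lo:
        return [0.0] * len(profile)  # flat profile: arbitrary convention
    return [(x - lo) / (hi - lo) for x in profile]

def group_average(profiles):
    """Average normalized expression of a gene group (e.g. HYP or REL)
    at each time point."""
    normed = [minmax(p) for p in profiles]
    return [sum(col) / len(col) for col in zip(*normed)]

def weighted_property(profiles, values):
    """Average of a per-gene property (e.g. melting energy or distance to
    origin) at each time point, weighted by normalized expression."""
    normed = [minmax(p) for p in profiles]
    out = []
    for col in zip(*normed):
        total = sum(col)
        weighted = sum(w * v for w, v in zip(col, values))
        out.append(weighted / total if total else float("nan"))
    return out
```

With two genes of opposite temporal profiles and melting energies of -10 and -20 (arbitrary units), the weighted average moves from -20 to -10 as expression shifts from the second gene to the first.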

Determination of structural and functional domains

Structural domains were determined by correlating all pairs of the investigated features in time creating correlation rings for all individual pairs. Each such ring indicates the temporal correlation of the two analysed features within each window along the chromosome. For the correlation we used the Pearson correlation coefficient of feature z-scores. In the next step for all rings we summed up the change in correlation of two neighboring windows in the chromosome. High values indicate coupling of the feature borders on the chromosome. These sums were compared to null model sums where the rings were randomly rotated relative to each other. Deviations of at least two standard deviations (z-score ≥ 2) were considered significant.
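The border detection can be sketched as follows, assuming that each correlation ring is a list with one Pearson coefficient per window. The function names are hypothetical. Two further assumptions are made: the change between neighboring windows is taken as an absolute difference, and the null sums from all window positions and all random rotations are pooled into one distribution.

```python
import random
import statistics

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx and syy else 0.0

def border_signal(rings):
    """At each window, sum over all rings the absolute change in correlation
    between this window and its clockwise neighbor (rings are circular)."""
    n = len(rings[0])
    return [sum(abs(r[(i + 1) % n] - r[i]) for r in rings) for i in range(n)]

def border_zscores(rings, n_random=500, rng=None):
    """z-score of the observed border signal against a null model in which
    every ring is rotated by an independent random offset."""
    rng = rng or random.Random(0)
    observed = border_signal(rings)
    null = []
    for _ in range(n_random):
        rotated = []
        for r in rings:
            k = rng.randrange(len(r))
            rotated.append(r[k:] + r[:k])
        null.extend(border_signal(rotated))  # pooled null (assumption)
    mu, sd = statistics.fmean(null), statistics.pstdev(null)
    return [(v - mu) / sd if sd else 0.0 for v in observed]
```

Windows whose z-score reaches 2 or more would then be reported as candidate domain borders, since a large summed change there means that several feature pairs switch their correlation at the same chromosomal position.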

For the determination of functional domains we investigated the number of genes assigned to GO-groups at level 4 in the metabolic function branch.25 We compared the distribution of genes among GO-groups in two windows at a center distance of 250 kb using the root mean square of the group size difference. Prior to the comparison the number of genes in each group was normalized by the total number of genes assigned to all groups in a window to avoid biases introduced by regions enriched in genes with unknown function.
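The comparison of two windows can be illustrated with the following Python sketch. The function names are hypothetical, and each window is represented simply as a list of GO-group labels, one per assigned gene; genes of unknown function are assumed to have been left out of these lists already.

```python
import math
from collections import Counter

def group_fractions(go_labels, groups):
    """Fraction of the assigned genes of one window falling in each GO group."""
    counts = Counter(go_labels)
    total = sum(counts[g] for g in groups)
    return [counts[g] / total if total else 0.0 for g in groups]

def rms_difference(labels_a, labels_b):
    """Root mean square of the normalized group size differences between
    two windows (e.g. two windows at a center distance of 250 kb)."""
    groups = sorted(set(labels_a) | set(labels_b))
    if not groups:
        return 0.0
    fa = group_fractions(labels_a, groups)
    fb = group_fractions(labels_b, groups)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(fa, fb)) / len(groups))
```

Two windows with the same functional composition give a value of zero regardless of how many genes they contain, while a high value indicates a change of functional content between them.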

References

1 G. Muskhelishvili, P. Sobetzko, M. Geertz and M. Berger, Mol. Biosyst., 2010, 6, 662–676.

2 A. A. Travers, G. Muskhelishvili and J. M. T. Thompson, Philos. Trans. R. Soc. London, Ser. A, 2012, 370, 2960–2986.

3 R. Spurio, M. Durrenberger, M. Falconi, A. La Teana, C. L. Pon and C. O. Gualerzi, Mol. Gen. Genet., 1992, 231, 201–211.

4 R. Stuger, C. L. Woldringh, C. C. van der Weijden, N. O. Vischer, B. M. Bakker, R. J. van Spanning, J. L. Snoep and H. V. Westerhoff, Mol. Biol. Rep., 2002, 29, 79–82.

5 D. Frenkiel-Krispin, I. Ben-Avraham, J. Englander, E. Shimoni, S. G. Wolf and A. Minsky, Mol. Microbiol., 2004, 51, 395–405.

6 R. T. Dame, Mol. Microbiol., 2005, 56, 858–870.

7 A. Travers and G. Muskhelishvili, Nat. Rev. Microbiol., 2005, 3, 157–169.

8 R. L. Ohniwa, K. Morikawa, J. Kim, T. Ohta, A. Ishihama, C. Wada and K. Takeyasu, EMBO J., 2006, 25, 5591–5602.

9 F. Guo and S. Adhya, Proc. Natl. Acad. Sci. U. S. A., 2007, 104, 4309–4314.

10 M. Berger, A. Farcas, M. Geertz, P. Zhelyazkova, K. Brix, A. Travers and G. Muskhelishvili, EMBO Rep., 2010, 11, 59–64.

11 D. F. Browning, D. C. Grainger and S. J. Busby, Curr. Opin. Microbiol., 2010, 13, 773–780.

12 S. C. Dillon and C. J. Dorman, Nat. Rev. Microbiol., 2010, 8, 185–195.

13 V. F. Scolari, B. Bassetti, B. Sclavi and M. C. Lagomarsino, Mol. Biosyst., 2011, 7, 878–888.

14 C. Cagliero and D. J. Jin, Nucleic Acids Res., 2013, 41, 315–326.

15 K. S. Jeong, J. Ahn and A. B. Khodursky, Genome Biol., 2004, 5, R86.

16 B. J. Peter, J. Arsuaga, A. M. Breier, A. B. Khodursky, P. O. Brown and N. R. Cozzarelli, Genome Biol., 2004, 5, R87.


17 N. Blot, R. Mavathur, M. Geertz, A. Travers and G. Muskhelishvili, EMBO Rep., 2006, 7, 710–715.

18 M. J. Ferrandiz, A. J. Martín-Galiano, J. B. Schvartzman and A. G. de la Campa, Nucleic Acids Res., 2010, 38, 3570–3581.

19 G. Muskhelishvili, M. Buckle, H. Heumann, R. Kahmann and A. A. Travers, EMBO J., 1997, 16, 3655–3665.

20 D. Levens and C. J. Benham, Phys. Biol., 2011, 8, 035011.

21 A. Travers and G. Muskhelishvili, J. Mol. Biol., 1998, 279, 1027–1043.

22 I. Pemberton, G. Muskhelishvili, A. Travers and M. Buckle, J. Mol. Biol., 2002, 318, 651–663.

23 M. Rochman, M. Aviv, G. Glaser and G. Muskhelishvili, EMBO Rep., 2002, 3, 355–360.

24 G. Muskhelishvili and A. Travers, Front. Biosci., 2003, 8, 279–285.

25 P. Sobetzko, A. Travers and G. Muskhelishvili, Proc. Natl. Acad. Sci. U. S. A., 2012, 109, E42–E50.

26 F. Gao, H. Luo and C.-T. Zhang, Nucleic Acids Res., 2013, 41, D90–D93.

27 L. S. Hsieh, R. M. Burger and K. Drlica, J. Mol. Biol., 1991, 219, 443–450.

28 M. Geertz, A. Travers, S. Mehandziska, S.-C. Janga, N. Shimamoto and G. Muskhelishvili, mBio, 2011, 2(4), e00034.

29 E. Crozat, N. Philippe, R. E. Lenski, J. Geiselmann and D. Schneider, Genetics, 2005, 169, 523–532.

30 N. Sonnenschein, M. Geertz, G. Muskhelishvili and M.-T. Hutt, BMC Syst. Biol., 2011, 5, 40.

31 M. Valens, S. Penaud, M. Rossignol, F. Cornet and F. Boccard, EMBO J., 2004, 23, 4330–4341.

32 J. E. Cabrera and D. J. Jin, Mol. Microbiol., 2003, 50, 1493–1505.

33 X. Mao, H. Zhang, Y. Yin and Y. Xu, Nucleic Acids Res., 2012, 40, 8210–8218.

34 L. F. Liu and J. C. Wang, Proc. Natl. Acad. Sci. U. S. A., 1987, 84, 7024–7027.

35 A. Mathelier and C. Carbone, Mol. Syst. Biol., 2010, 6, 366.

36 S.-C. Janga, H. Salgado and A. Martínez-Antonio, Nucleic Acids Res., 2009, 37, 3680–3688.

37 S. Rimsky and A. Travers, Curr. Opin. Microbiol., 2011, 14, 136–141.

38 M. Fritsche, L. Songling, D. W. Heermann and P. Wiggins, Nucleic Acids Res., 2012, 40, 972–980.

39 P. M. Llopis, A. F. Jackson, O. Sliusarenko, I. Surovtsev, J. Heinritz, T. Emonet and C. Jacobs-Wagner, Nature, 2010, 466, 77–82.

40 T. Kuhlman and E. C. Cox, Mol. Syst. Biol., 2012, 8, 610.

41 T. A. Azam, A. Iwata, A. Nishimura, S. Ueda and A. Ishihama, J. Bacteriol., 1999, 181, 6361–6370.

42 A. Ishihama, Annu. Rev. Microbiol., 2000, 54, 499–518.

43 A. A. Travers and G. Muskhelishvili, Biochem. Soc. Trans., 2013, 41, 548–553.

44 J. Santa-Lucia, Proc. Natl. Acad. Sci. U. S. A., 1998, 95, 1460–1465.
