
23. Lecture WS 2006/07

Bioinformatics III

V23 Current metabolomics

Review:

(1) Recent work on metabolic networks required revising the picture of separate biochemical pathways into that of a densely woven metabolic network.

(2) The connectivity of substrates in this network follows a power law.

(3) Constraint-based modeling approaches (FBA) were successful in analyzing the capabilities of cellular metabolism, including

- its capacity to predict deletion phenotypes,

- the ability to calculate the relative flux values of metabolic reactions, and

- the capability to identify properties of alternate optimal growth states in a wide range of simulated environmental conditions.

Open questions

- what parts of metabolism are involved in adaptation to environmental conditions?

- is there a central essential metabolic core?

- what role does transcriptional regulation play?


Distribution of fluxes in E. coli

Stoichiometric matrix for E. coli strain MG1655, containing 537 metabolites and 739 reactions, taken from Palsson et al.

Apply flux balance analysis to characterize the solution space (all possible flux states under a given condition).

Nature 427, 839 (2004)

Aim: understand principles that govern

the use of individual reactions under

different growth conditions.

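Steady-state mass balance for the concentration $[A_i]$ of each metabolite i:

$$\frac{d}{dt}[A_i] = \sum_j S_{ij} v_j = 0$$

where $v_j$ is the flux of reaction j and $S_{ij}$ is the stoichiometric coefficient of metabolite i in reaction j.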
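The steady-state condition can be checked numerically. The following is a minimal sketch in Python on a hypothetical three-reaction toy network (uptake of A, conversion of A to B, export of B); the matrix `S`, the flux vector `v` and the helper `net_production` are illustrative assumptions, not part of the E. coli model.

```python
def net_production(S, v):
    """Return sum_j S[i][j] * v[j] for every metabolite i."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]

# Hypothetical toy network: (uptake -> A), (A -> B), (B -> export)
S = [
    [1, -1, 0],  # metabolite A: produced by uptake, consumed by A -> B
    [0, 1, -1],  # metabolite B: produced by A -> B, consumed by export
]
v = [2.0, 2.0, 2.0]  # one flux per reaction (column of S)

balance = net_production(S, v)
is_steady = all(abs(b) < 1e-9 for b in balance)
print(balance, is_steady)
```

A flux vector such as `[2.0, 1.0, 1.0]` would leave a net accumulation of A and therefore lies outside the solution space.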


Optimal states

Using linear programming and adapting constraints for each reaction flux $v_i$ of the form $\beta_i^{\min} \le v_i \le \beta_i^{\max}$, the flux states were calculated that optimize cell growth on various substrates.

Plot the flux distribution for active (non-zero flux) reactions of E.coli grown in a

glutamate- or succinate-rich substrate.

Denote the mass carried by reaction j producing (consuming) metabolite i by

$$\hat{v}_{ij} = |S_{ij}|\, v_j$$

Fluxes vary widely: e.g. the dimensionless flux of the succinyl coenzyme A synthetase reaction is 0.185, whereas the flux of the aspartate oxidase reaction is about 10,000 times smaller, $2.2 \times 10^{-5}$.


Overall flux organization of E.coli metabolic network

a, Flux distribution for optimized biomass production

on succinate (black) and glutamate (red) substrates.

The solid line corresponds to the power-law fit for the probability that a reaction has flux v,

$P(v) \propto (v + v_0)^{-\alpha}$, with $v_0 = 0.0003$ and $\alpha = 1.5$.

d, The distribution of experimentally determined fluxes

from the central metabolism of E. coli shows

power-law behaviour as well, with a best fit to

$P(v) \propto v^{-\alpha}$ with $\alpha = 1$.

Both the computed and the experimental flux distributions show a wide spectrum of fluxes.

Almaas et al., Nature 427, 839 (2004)
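To illustrate how an exponent of such a power-law form can be read off, here is a minimal sketch that regresses log P(v) on log(v + v0) by ordinary least squares. This is not the fitting procedure of the cited paper (the slides do not describe it); the synthetic data and the helper name `fit_power_law_exponent` are assumptions made for illustration.

```python
import math

def fit_power_law_exponent(fluxes, densities, v0=0.0):
    """Least-squares slope of log P versus log(v + v0); returns alpha = -slope."""
    xs = [math.log(v + v0) for v in fluxes]
    ys = [math.log(p) for p in densities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxy / sxx

# Synthetic data generated from P(v) = (v + v0)^(-1.5), so alpha should be recovered.
v0 = 0.0003
fluxes = [10.0 ** e for e in range(-5, 0)]
densities = [(v + v0) ** -1.5 for v in fluxes]

alpha = fit_power_law_exponent(fluxes, densities, v0)
print(round(alpha, 3))
```

On real flux data the histogram binning and the choice of v0 would matter; both are left out of this sketch.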


Response to different environmental conditions

Almaas et al., Nature 427, 839 (2004)

Is the flux distribution independent of environmental conditions?

b, Flux distribution for optimized biomass on succinate (black)

substrate with an additional 10% (red), 50% (green) and 80% (blue)

randomly chosen subsets of the 96 input channels (substrates) turned

on.

The flux distribution was averaged over 5,000 independent random

choices of uptake metabolites.

Result: the flux distribution is independent of the external conditions.

Is the wide flux distribution also present in non-optimal

conditions?

c, Flux distribution from the non-optimized hit-and-run sampling method of the E. coli solution space. The solid line is the best fit, with $v_0 = 0.003$ and $\alpha = 2$. Inset shows the flux distribution in four randomly chosen sample points.

Many individual non-optimal states are consistent with an exponent $\alpha = 1$.


Use scaling behavior to determine local connectivity

The observed flux distribution is compatible with two different potential local flux

structures:

(a) a homogeneous local organization would imply that all reactions producing (consuming) a given metabolite have comparable fluxes

(b) a more delocalized "high-flux backbone" (HFB) is expected if the local flux organization is heterogeneous, such that each metabolite has a dominant source (consuming) reaction.

Schematic illustration of the hypothetical scenario in which

(a) all fluxes have comparable activity, in which case we expect $kY(k) \sim 1$, and

(b) the majority of the flux is carried by a single incoming or outgoing reaction, for which we should have $kY(k) \sim k$.

Almaas et al., Nature 427, 839 (2004)


Measuring the importance of individual reactions

To distinguish between these 2 schemes, for each metabolite i produced (consumed) by k reactions, define

$$Y(k,i) = \sum_{j=1}^{k} \left( \frac{\hat{v}_{ij}}{\sum_{l=1}^{k} \hat{v}_{il}} \right)^{2}$$

where $\hat{v}_{ij}$ is the mass carried by reaction j which produces (consumes) metabolite i.

Almaas et al., Nature 427, 839 (2004)

If all reactions producing (consuming) metabolite i have comparable $\hat{v}_{ij}$ values, Y(k,i) scales as 1/k.

If, however, the activity of a single reaction dominates, we expect $Y(k,i) \sim 1$ (independent of k).
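The two limiting cases of Y(k,i) can be reproduced with a few lines of Python. This is a minimal sketch; the function name `flux_heterogeneity` and the example mass flows are illustrative assumptions.

```python
def flux_heterogeneity(mass_flows):
    """Y(k, i): sum over reactions of the squared share of the total mass flow."""
    total = sum(mass_flows)
    if total == 0:
        raise ValueError("metabolite has no non-zero mass flow")
    return sum((f / total) ** 2 for f in mass_flows)

# k = 4 reactions with equal mass flows: Y equals 1/k
uniform = flux_heterogeneity([1.0, 1.0, 1.0, 1.0])

# k = 4 reactions, one of which dominates: Y approaches 1
dominated = flux_heterogeneity([100.0, 1.0, 1.0, 1.0])

print(uniform, dominated)
```

Multiplying by k gives the quantity kY(k), which stays near 1 in the homogeneous case and grows like k when a single reaction dominates.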


Characterizing the local inhomogeneity of the flux net

a, Measured kY(k) shown as a function of k for incoming and outgoing reactions, averaged over all metabolites, indicates that $Y(k) \sim k^{-0.27}$.

Inset shows non-zero mass flows, $\hat{v}_{ij}$, producing (consuming) FAD on a glutamate-rich substrate.

Result: an intermediate behavior is found between the two extreme cases.

The large-scale inhomogeneity observed in the overall flux distribution is also increasingly valid at the level of the individual metabolites.

The more reactions that consume (produce) a given metabolite, the more likely it is that a single reaction carries most of the flux, see FAD.

Almaas et al., Nature 427, 839 (2004)


Clean up metabolic network

A simple algorithm systematically removes, for each metabolite, all reactions but the one providing the largest incoming (outgoing) flux contribution.

The algorithm uncovers the "high-flux backbone" of the metabolism, a distinct structure of linked reactions that form a giant component with a star-like topology.

Almaas et al., Nature 427, 839 (2004)


Maximal flow networks

Panels: glutamate-rich and succinate-rich substrates.

Directed links: Two metabolites (e.g. A and B) are connected with a directed link pointing

from A to B only if the reaction with maximal flux consuming A is the reaction with maximal

flux producing B.

Shown are all metabolites that have at least one neighbour after completing this procedure.

The background colours denote different known biochemical pathways.

Almaas et al., Nature 427, 839 (2004)
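The directed-link rule can be sketched in a few lines of Python. This is a simplified illustration, not the authors' code: it assumes all stoichiometric coefficients are 1 (so that reaction flux and mass flow coincide), and the reaction names, fluxes and the helper `max_flux_links` are hypothetical.

```python
def max_flux_links(reactions):
    """reactions maps name -> (flux, substrates, products).

    Returns directed links (A, B) such that the maximal-flux reaction
    consuming A is also the maximal-flux reaction producing B.
    """
    top_consumer = {}  # metabolite -> (flux, reaction name)
    top_producer = {}
    for name, (flux, substrates, products) in reactions.items():
        if flux <= 0:
            continue  # inactive reactions carry no mass
        for m in substrates:
            if m not in top_consumer or flux > top_consumer[m][0]:
                top_consumer[m] = (flux, name)
        for m in products:
            if m not in top_producer or flux > top_producer[m][0]:
                top_producer[m] = (flux, name)

    links = set()
    for a, (_, rxn) in top_consumer.items():
        for b in reactions[rxn][2]:
            if top_producer[b][1] == rxn:
                links.add((a, b))
    return links

# Hypothetical toy network
reactions = {
    "r1": (5.0, ["A"], ["B"]),
    "r2": (1.0, ["A"], ["C"]),
    "r3": (2.0, ["D"], ["B"]),
    "r4": (7.0, ["B"], ["E"]),
}
links = max_flux_links(reactions)
print(sorted(links))
```

In this toy example D is left without a neighbour: its only consuming reaction, r3, is not the dominant producer of B, so D would not appear in the plotted backbone.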


FBA-optimized network on glutamate-rich substrate

High-flux backbone for the FBA-optimized metabolic network of E. coli on a glutamate-rich substrate.

Metabolites (vertices) coloured blue have at least one

neighbour in common in glutamate- and succinate-rich

substrates, and those coloured red have none.

Reactions (lines) are coloured blue if they are identical

in glutamate- and succinate-rich substrates, green if a

different reaction connects the same neighbour pair,

and red if this is a new neighbour pair. Black dotted

lines indicate where the disconnected pathways, for

example, folate biosynthesis, would connect to the

cluster through a link that is not part of the HFB. Thus,

the red nodes and links highlight the predicted changes

in the HFB when shifting E. coli from glutamate- to

succinate-rich media. Dashed lines indicate links to the

biomass growth reaction.

Almaas et al., Nature 427, 839 (2004)

Pathway legend:

(1) Pentose Phosphate
(2) Purine Biosynthesis
(3) Aromatic Amino Acids
(4) Folate Biosynthesis
(5) Serine Biosynthesis
(6) Cysteine Biosynthesis
(7) Riboflavin Biosynthesis
(8) Vitamin B6 Biosynthesis
(9) Coenzyme A Biosynthesis
(10) TCA Cycle
(11) Respiration
(12) Glutamate Biosynthesis
(13) NAD Biosynthesis
(14) Threonine, Lysine and Methionine Biosynthesis
(15) Branched Chain Amino Acid Biosynthesis
(16) Spermidine Biosynthesis
(17) Salvage Pathways
(18) Murein Biosynthesis
(19) Cell Envelope Biosynthesis
(20) Histidine Biosynthesis
(21) Pyrimidine Biosynthesis
(22) Membrane Lipid Biosynthesis
(23) Arginine Biosynthesis
(24) Pyruvate Metabolism
(25) Glycolysis


Interpretation

Only a few pathways appear disconnected indicating that although these pathways

are part of the HFB, their end product is only the second-most important source for

another HFB metabolite.

Groups of individual HFB reactions largely overlap with traditional biochemical

partitioning of cellular metabolism.

Almaas et al., Nature 427, 839 (2004)


How sensitive is the HFB to changes in the environment?

Almaas et al., Nature 427, 839 (2004)

b, Fluxes of individual

reactions for glutamate-rich

and succinate-rich conditions.

Reactions with negligible flux

changes follow the diagonal

(solid line).

Some reactions are turned off

in only one of the conditions

(shown close to the

coordinate axes). Reactions

belonging to the HFB are

indicated by black squares,

the rest are indicated by blue

dots. Reactions in which the

direction of the flux is

reversed are coloured green.

Only the reactions in the high-flux territory

undergo noticeable differences!

Type I: reactions turned on in one condition and off in the other (symbols).

Type II: reactions remain active but show an orders-of-magnitude shift in flux under the two different growth conditions.


Flux distributions for individual reactions

Shown is the flux distribution for four selected

E. coli reactions in a 50% random environment.

a triosephosphate isomerase;

b carbon dioxide transport;

c NAD kinase;

d guanosine kinase.

Reactions on the v curve (small fluxes)

have unimodal/Gaussian distributions (a and c). Shifts in growth conditions only lead to small changes of their flux values.

Reactions off this curve have multimodal distributions (b and d), showing several discrete and distinct flux values under diverse growth conditions.

Almaas et al., Nature 427, 839 (2004)


Summary

Metabolic network use is highly uneven (power-law distribution) at the global level and at the level of the individual metabolites.

Whereas most metabolic reactions have low fluxes, the overall activity of the

metabolism is dominated by several reactions with very high fluxes.

E. coli responds to changes in growth conditions by reorganizing the rates of

selected fluxes predominantly within this high-flux backbone.

Apart from minor changes, the use of the other pathways remains unaltered.

These reorganizations result in large, discrete changes in the fluxes of the HFB

reactions.


The same authors as before used flux balance analysis to examine the utilization and relative flux rate of each metabolic reaction in a wide range of simulated environmental conditions for E. coli, H. pylori and S. cerevisiae:

consider in each case 30,000 randomly chosen combinations where each uptake reaction is assigned a random value between 0 and 20 mmol/g/h.

Result: adaptation to different conditions occurs by 2 mechanisms:

(a) flux plasticity: changes in the fluxes of already active reactions. E.g. changing from glucose- to succinate-rich conditions alters the flux of 264 E. coli reactions by more than 20%.

(b) less commonly, adaptation includes structural plasticity: turning on previously zero-flux reactions or switching off active pathways.
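The distinction between the two mechanisms can be made concrete with a small Python sketch that compares the fluxes of two conditions reaction by reaction. The 20% threshold is taken from the example above; measuring the change relative to the first condition, the tolerance for "zero flux", the label "unchanged" and all flux values are assumptions for illustration.

```python
def classify_adaptation(flux_a, flux_b, rel_change=0.2, tol=1e-9):
    """Label each reaction by how it differs between condition A and condition B."""
    labels = {}
    for rxn in flux_a:
        # Only flux magnitudes are compared; direction reversals are ignored here.
        a, b = abs(flux_a[rxn]), abs(flux_b[rxn])
        active_a, active_b = a > tol, b > tol
        if active_a != active_b:
            labels[rxn] = "structural plasticity"  # switched on or off
        elif active_a and abs(a - b) / a > rel_change:
            labels[rxn] = "flux plasticity"        # stays active, flux shifts
        else:
            labels[rxn] = "unchanged"
    return labels

# Hypothetical fluxes under two growth conditions
glucose = {"r1": 1.0, "r2": 0.5, "r3": 0.0, "r4": 2.0}
succinate = {"r1": 1.1, "r2": 1.0, "r3": 0.3, "r4": 2.0}

labels = classify_adaptation(glucose, succinate)
print(labels)
```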


Emergence of the Metabolic Core

The two adaptation mechanisms allow for the possibility of a group of reactions that is not subject to structural plasticity and is active under all environmental conditions.

Assume that active reactions were randomly distributed. If typically a fraction q of the metabolic reactions is active under a specific growth condition, we expect for n distinct conditions an overlap of at least a fraction $q^n$ of the reactions. This converges quickly to 0.
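How quickly the random-overlap estimate q^n vanishes can be seen from a short numerical sketch. The value q = 0.5 is a hypothetical choice for illustration; the slides do not state the actual active fraction.

```python
def random_overlap_fraction(q, n):
    """Expected fraction of reactions active in all n conditions if activity were random."""
    return q ** n

q = 0.5  # hypothetical fraction of reactions active in one condition
for n in (1, 5, 10, 30):
    print(n, random_overlap_fraction(q, n))
```

Already for ten conditions the expected overlap drops below one reaction in a thousand, so an overlap that persists over thousands of conditions cannot be explained by chance.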


However, as the number of conditions increases, the curve converges to a constant denoted by the dashed line, identifying the metabolic core of an organism.

Red line: number of reactions that are always active if activity is randomly distributed in the metabolic network. The fact that it converges to zero indicates that the real core represents a collective network effect, forcing a group of reactions to be active in all conditions.

Figure: Emergence of the Metabolic Core.

(A–C) The average relative size of the number of reactions that are always active as a function of the number of sampled conditions (black line) for (A) H. pylori, (B) E. coli, and (C) S. cerevisiae.

(D and E) The number of metabolic reactions (D) and the number of metabolic core reactions (E) in the three studied organisms.
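The curve of always-active reactions can be computed as a running intersection over the sampled conditions. The sketch below assumes that each sampled condition is available as a mapping from reaction name to flux; the helper `core_size_curve`, the zero-flux tolerance and the three toy samples are illustrative assumptions.

```python
def core_size_curve(flux_samples, tol=1e-9):
    """Return (core, sizes): reactions active in all samples, and the core size after each sample."""
    core = None
    sizes = []
    for fluxes in flux_samples:
        active = {rxn for rxn, v in fluxes.items() if abs(v) > tol}
        core = active if core is None else core & active
        sizes.append(len(core))
    return (core or set()), sizes

# Hypothetical flux states for three sampled environments
samples = [
    {"r1": 1.0, "r2": 0.5, "r3": 0.0, "r4": 2.0},
    {"r1": 0.8, "r2": 0.0, "r3": 1.2, "r4": 1.5},
    {"r1": 1.1, "r2": 0.3, "r3": 0.0, "r4": 0.9},
]
core, sizes = core_size_curve(samples)
print(sorted(core), sizes)
```

Dividing each entry of `sizes` by the total number of reactions gives the relative core size plotted against the number of sampled conditions.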


Metabolic Core of E. coli

The constantly active reactions form a tightly connected cluster!

All reactions that are found to be active in each of the 30,000 investigated external conditions are shown. Metabolites that contribute directly to biomass formation are colored blue, while core reactions (links) catalyzed by essential (or nonessential) enzymes are colored red (or green). Black-colored links denote enzymes with unknown deletion phenotype. Blue dashed lines indicate multiple appearances of a metabolite, while links with arrows denote unidirectional reactions.

Note that 20 out of the 51 metabolites necessary for biomass synthesis are not present in the core, indicating that they are produced (or consumed) in a growth-condition-specific manner.

Blue and brown shading: folate and peptidoglycan biosynthesis pathways. White numbered arrows denote current antibiotic targets inhibited by: (1) sulfonamides, (2) trimethoprim, (3) cycloserine, and (4) fosfomycin. A few reactions appear disconnected since we have omitted the drawing of cofactors.


Metabolic Core Reactions

The metabolic cores contain 2 types of reactions:

(a) reactions that are essential for biomass production under all environmental conditions (81 of 90 in E. coli)

(b) reactions that assure optimal metabolic performance.


Characterizing the Metabolic Cores

(A) The number of overlapping metabolic reactions in the metabolic core of H. pylori, E. coli, and S. cerevisiae. The metabolic cores of simple organisms (H. pylori and E. coli) overlap to a large extent. The largest organism (S. cerevisiae) has a much larger reaction network that allows more flexibility, so the relative size of its metabolic core is much lower.

(B) The fraction of metabolic reactions catalyzed by essential enzymes in the cores (black) and outside the core in E. coli and S. cerevisiae. Result: reactions of the metabolic core are mostly essential ones.

(C) One could assume that the core represents a subset of high-flux reactions. This is apparently not the case: the distributions of average metabolic fluxes for the core and the noncore reactions in E. coli are very similar.


Correlation among E. coli Metabolic Reactions

(A) Pearson correlation calculated from the flux values in 30,000 conditions for each reaction pair, before grouping the reactions according to a hierarchical average-linkage clustering algorithm. The values of the flux-correlation matrix range from -1 (red) through 0 (white) to 1 (blue). The horizontal color bar denotes whether a reaction is a member of the core (green), and the vertical color bar denotes whether the enzymes catalyzing the reaction are essential (red).

Result: the group of highly correlated reactions significantly overlaps with the metabolic core.

(B) Distribution of Pearson correlation in mRNA copy numbers from 41 experiments. The correlations of the core reactions are clearly shifted towards higher values.
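The flux-correlation matrix itself is a plain pairwise Pearson correlation over the sampled conditions. The following minimal sketch computes it for hypothetical flux vectors; the hierarchical clustering step is omitted, and returning 0 for a reaction with constant flux is an assumption made here to avoid a division by zero.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # assumption: constant flux is treated as uncorrelated
    return cov / (sd_x * sd_y)

def flux_correlation_matrix(flux_by_reaction):
    """Return reaction names (sorted) and the matrix of pairwise correlations."""
    names = sorted(flux_by_reaction)
    matrix = [
        [pearson(flux_by_reaction[a], flux_by_reaction[b]) for b in names]
        for a in names
    ]
    return names, matrix

# Hypothetical fluxes of three reactions in four sampled conditions
fluxes = {
    "r1": [1.0, 2.0, 3.0, 4.0],
    "r2": [2.0, 4.0, 6.0, 8.0],
    "r3": [4.0, 3.0, 2.0, 1.0],
}
names, corr = flux_correlation_matrix(fluxes)
print(names, corr[0])
```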


Summary

- Adaptation to environmental conditions occurs via structural plasticity and/or flux plasticity.

- Here: identification of a surprisingly stable metabolic core of reactions that are tightly connected to each other.

- The reactions belonging to this core represent potential targets for antimicrobial intervention.


Integrated Analysis of Metabolic and Regulatory Networks

So far, studies of large-scale cellular networks have focused on their connectivities.

The emerging picture shows a densely woven web where almost everything is connected to everything.

In the cell's metabolic network, hundreds of substrates are interconnected through biochemical reactions.

Although this could in principle lead to the simultaneous flow of substrates in numerous directions, in practice metabolic fluxes pass through specific pathways (the high-flux backbone).

Topological studies so far did not consider how the modulation of this connectivity might also determine network properties.

Therefore it is important to correlate the network topology with the expression of

enzymes in the cell.


Analyze transcriptional control in metabolic networks

Regulatory and metabolic functions of cells are mediated by networks of interacting biochemical components.

Metabolic flux is optimized to maximize metabolic efficiency under different

conditions.

Control of metabolic flow:

- allosteric interactions

- covalent modifications involving enzymatic activity

- transcription (revealed by genome-wide expression studies)

Here: N. Barkai and colleagues analyzed published experimental expression data of Saccharomyces cerevisiae.

Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2004)


Recurrence signature algorithm

Aim: identify transcription "modules" (TMs).

A set of randomly selected genes is unlikely to be identical to the genes of any TM. Yet many such sets do have some overlap with a specific TM.

In particular, sets of genes that are compiled according to existing knowledge of their functional (or regulatory) sequence similarity may have a significant overlap with a transcription module.

The algorithm receives a gene set that partially overlaps a TM and then provides the complete module as output.

Therefore this algorithm is referred to as the "signature algorithm".

Ihmels et al. Nat Genetics 31, 370 (2002)
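The idea of completing a module from a partially overlapping input set can be sketched as follows. The slides do not give the scoring details, so the two-step scheme below (first score conditions by the input genes, then score all genes over the selected conditions) with fixed absolute thresholds is an assumed simplification, and the function `signature_step`, the thresholds and the expression values are all hypothetical.

```python
def signature_step(expr, input_genes, cond_threshold, gene_threshold):
    """Simplified signature-style pass: input gene set in, module gene set out.

    expr maps gene -> list of expression values, one per condition.
    """
    n_cond = len(next(iter(expr.values())))
    # Step 1: score each condition by the mean expression of the input genes.
    cond_score = [
        sum(expr[g][c] for g in input_genes) / len(input_genes)
        for c in range(n_cond)
    ]
    selected = [c for c in range(n_cond) if abs(cond_score[c]) > cond_threshold]
    if not selected:
        return set()
    # Step 2: score every gene by its score-weighted mean over the selected conditions.
    module = set()
    for gene, values in expr.items():
        gene_score = sum(cond_score[c] * values[c] for c in selected) / len(selected)
        if gene_score > gene_threshold:
            module.add(gene)
    return module

# Hypothetical data: g1-g3 are co-expressed in conditions 0 and 1; g4 and g5 are unrelated.
expr = {
    "g1": [2.0, 2.0, 0.0, 0.0],
    "g2": [2.0, 2.0, 0.0, 0.0],
    "g3": [2.0, 2.0, 0.0, 0.0],
    "g4": [0.0, 0.0, 2.0, -2.0],
    "g5": [0.0, 0.0, 0.0, 2.0],
}
module_a = signature_step(expr, ["g1", "g2", "g4"], 1.0, 1.0)
module_b = signature_step(expr, ["g2", "g3", "g5"], 1.0, 1.0)
recurrent = module_a == module_b
print(sorted(module_a), recurrent)
```

Both input sets contain an unrelated gene and miss one module member, yet both passes return the same gene set; this agreement between distinct inputs is the kind of recurrence used as a reliability measure.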


Recurrence signature algorithm

a, The signature algorithm.

b, Recurrence as a reliability measure. The signature algorithm is applied to distinct input

sets containing different subsets of the postulated transcription module. If the different input

sets give rise to the same module, it is considered reliable.

c, General application of the recurrent signature method.

Ihmels et al. Nat Genetics 31, 370 (2002)

Figure labels: normalization of data; identify modules; classify genes into modules.


Correlation between genes of the same metabolic pathway

Distribution of the average correlation

between genes assigned to the same

metabolic pathway in the KEGG database.

The distribution corresponding to random

assignment of genes to metabolic

pathways of the same size is shown for

comparison. Importantly, only genes

coding for enzymes were used in the

random control.

Interpretation: pairs of genes associated

with the same metabolic pathway show a

similar expression pattern.

Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2004)

However, typically only a subset of the genes assigned to a given pathway are coregulated.


Correlation between genes of the same metabolic pathway

Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2004)

Genes of the glycolysis pathway

(according to KEGG) were clustered

and ordered based on the correlation

in their expression profiles.

Shown here is the matrix of their

pair-wise correlations.

The cluster of highly correlated

genes (orange frame) corresponds

to genes that encode the central

glycolysis enzymes.

The linear arrangement of these

genes along the pathway is shown at

right.

Of the 46 genes assigned to the

glycolysis pathway in the KEGG

database, only 24 show a correlated

expression pattern.

In general, the coregulated genes

belong to the central pieces of

pathways.


Coexpressed enzymes often catalyze a linear chain of reactions

Coregulation between enzymes

associated with central metabolic

pathways. Each branch

corresponds to several enzymes.

In the cases shown, only one of the

branches downstream of the

junction point is coregulated with

upstream genes.

Interpretation: coexpressed

enzymes are often arranged in a

linear order, corresponding to a

metabolic flow that is directed in a

particular direction.

Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2004)


Co-regulation at branch points

To examine more systematically whether coregulation enhances the linearity of

metabolic flow, analyze the coregulation of enzymes at metabolic branch-points.

Search KEGG for metabolic compounds that are involved in exactly 3 reactions.

Only consider reactions that exist in S. cerevisiae.

3-junctions can integrate metabolic flow (convergent junction)

or allow the flow to diverge in 2 directions (divergent junction).

In the cases where several reactions are catalyzed by the same enzymes, choose

one representative so that all junctions considered are composed of precisely 3

reactions catalyzed by distinct enzymes.

Each 3-junction is categorized according to the correlation pattern found between

enzymes catalyzing its branches. Correlation coefficients > 0.25 are considered

significant.

Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2004)
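For a divergent 3-junction, the categorization step can be sketched as a simple threshold test on the correlations between the incoming enzyme and each outgoing enzyme. The threshold 0.25 is the one stated above; the function name, the branch names, the correlation values and every label other than "linear" are hypothetical.

```python
def classify_divergent_junction(corr_with_incoming, threshold=0.25):
    """corr_with_incoming maps outgoing branch -> correlation with the incoming enzyme."""
    coregulated = sorted(
        branch for branch, r in corr_with_incoming.items() if r > threshold
    )
    if len(coregulated) == 1:
        label = "linear"  # exactly one outgoing branch follows the incoming one
    elif not coregulated:
        label = "no coregulation"
    else:
        label = "several branches coregulated"
    return label, coregulated

# Hypothetical correlation coefficients for one junction
label, branches = classify_divergent_junction({"out1": 0.61, "out2": 0.05})
print(label, branches)
```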


Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2004)

Coregulation pattern in three-point junctions

In the majority of divergent

junctions, only one of the

emanating branches is significantly

coregulated with the incoming

reaction that synthesizes the

metabolite.

All junctions corresponding to metabolites that participate in exactly 3

reactions (according to KEGG) were identified and the correlations

between the genes associated with each such junction were calculated.

The junctions were grouped according to the directionality of the

reactions, as shown.

Divergent junctions, which allow the flow of metabolites in two

alternative directions, predominantly show a linear coregulation pattern,

where one of the emanating reactions is correlated with the incoming

reaction (linear regulatory pattern) or the two alternative outgoing

reactions are correlated in a context-dependent manner with a distinct

isozyme catalyzing the incoming reaction (linear switch).

By contrast, the linear regulatory pattern is significantly less abundant

in convergent junctions, where the outgoing flow follows a unique

direction, and in conflicting junctions that do not support metabolic flow.

Most of the reversible junctions comply with linear regulatory patterns.

Indeed, similar to divergent junctions, reversible junctions allow

metabolites to flow in two alternative directions. Reactions were

counted as coexpressed if at least two of the associated genes were

significantly correlated (correlation coefficient >0.25).

As a random control, we randomized the identity of all metabolic genes

and repeated the analysis.


Co-regulation at branch points: conclusions

The observed co-regulation patterns correspond to a linear metabolic flow, whose

directionality can be switched in a condition-specific manner.

When analyzing junctions that allow metabolic flow in a larger number of directions, again only a few important branches are coregulated with the incoming branch.

Therefore: transcriptional regulation is used to enhance the linearity of metabolic flow, by biasing the flow toward only a few of the possible routes.

Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2004)


Connectivity of metabolites

The connectivity of a given metabolite is defined as the number of reactions connecting it to other metabolites.

Shown are the distributions of connectivity between metabolites in an unrestricted network and in a network where only correlated reactions are considered.

In accordance with previous results (Jeong et al. 2000), the connectivity distribution between metabolites follows a power law (log-log plot).

In contrast, when coexpression is used as a criterion to distinguish functional links, the connectivity distribution becomes exponential (log-linear plot).

Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2004)
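The two connectivity distributions differ only in which reactions are counted. A minimal sketch, assuming each reaction is given as a tuple of participating metabolites plus a flag saying whether its enzymes are coexpressed; the toy network and the helper names are hypothetical.

```python
from collections import Counter

def connectivity(reactions):
    """Count, for every metabolite, the number of reactions it takes part in."""
    degree = Counter()
    for metabolites in reactions:
        for m in set(metabolites):
            degree[m] += 1
    return degree

def degree_distribution(degree):
    """Map connectivity k -> number of metabolites with that connectivity."""
    return Counter(degree.values())

# Hypothetical network: reaction -> (metabolites, enzymes coexpressed?)
network = {
    "r1": (("A", "B"), True),
    "r2": (("A", "C"), False),
    "r3": (("A", "D"), False),
    "r4": (("B", "E"), True),
}
full = connectivity(mets for mets, _ in network.values())
restricted = connectivity(mets for mets, coexpr in network.values() if coexpr)
print(dict(full), dict(restricted), dict(degree_distribution(full)))
```

In the toy network the hub A loses most of its links once only coexpressed reactions are kept, which is the qualitative effect behind the change from a power-law to an exponential distribution.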


Differential regulation of isozymes

Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003)

Observation: isozymes at junction points are often preferentially coexpressed

with alternative reactions.

Therefore, investigate their role in the metabolic network more systematically.

Two possible functions of isozymes

associated with the same metabolic

reaction.

An isozyme pair could provide redundancy which may be needed for buffering genetic

mutations or for amplifying metabolite production. Redundant isozymes are expected

to be coregulated.

Alternatively, distinct isozymes could be dedicated to separate biochemical

pathways using the associated reaction. Such isozymes are expected to be

differentially expressed with the two alternative processes.

Arrows represent metabolic

pathways composed of a sequence

of enzymes.

Coregulation is indicated with the

same color (e.g., the isozyme

represented by the green arrow is

coregulated with the metabolic

pathway represented by the green

arrow).

Most members of isozyme pairs

are separately coregulated with

alternative processes.

Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003)

Differential regulation of isozymes in central metabolic pathways

Regulatory pattern of all gene pairs

associated with a common metabolic

reaction (according to KEGG).

All such pairs were classified into several

classes:

(1) parallel, where each gene is

correlated with a distinct connected

reaction (a reaction that shares a

metabolite with the reaction catalyzed by

the respective gene pair);

(2) selective, where only one of the

enzymes shows a significant correlation

with a connected reaction; and

(3) converging, where both enzymes

were correlated with the same reaction.

Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003)

Differential regulation of isozymes

Correlation coefficients >0.25 were

considered significant. To be

counted as parallel, rather than

converging, we demanded that the

correlation with the alternative

reaction be <80% of the correlation

with the preferred reaction.
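The classification rules can be written down as a small decision function. The sketch below is one possible reading of the criteria, not the authors' code: it assumes that for each gene of the pair a table of correlations with the connected reactions is available, that the "preferred" reaction of a gene is the one with the highest correlation, and that the "alternative" reaction is the one preferred by the partner gene. The function name and the extra label "none" (neither gene significantly correlated) are hypothetical.

```python
def classify_isozyme_pair(corr_a, corr_b, threshold=0.25, ratio=0.8):
    """Classify a gene pair that catalyzes a common reaction.

    corr_a, corr_b: dict mapping connected reaction -> correlation
    coefficient of gene A (resp. gene B) with that reaction.
    """
    best_a = max(corr_a, key=corr_a.get, default=None)
    best_b = max(corr_b, key=corr_b.get, default=None)
    sig_a = best_a is not None and corr_a[best_a] > threshold
    sig_b = best_b is not None and corr_b[best_b] > threshold

    if not sig_a and not sig_b:
        return "none"        # assumed extra label, not one of the three classes
    if sig_a != sig_b:
        return "selective"   # only one enzyme is significantly correlated
    if best_a == best_b:
        return "converging"  # both enzymes prefer the same reaction

    # Parallel requires that each gene's correlation with the alternative
    # reaction stays below 80% of its correlation with the preferred one.
    a_distinct = corr_a.get(best_b, 0.0) < ratio * corr_a[best_a]
    b_distinct = corr_b.get(best_a, 0.0) < ratio * corr_b[best_b]
    return "parallel" if a_distinct and b_distinct else "converging"
```

For example, a pair with correlations `{"R1": 0.7, "R2": 0.1}` and `{"R1": 0.05, "R2": 0.6}` is classified as parallel, because each gene is clearly tied to a different connected reaction.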

The primary role of isozyme multiplicity is to allow for differential regulation of

reactions that are shared by separate processes.

Dedicating a specific enzyme to each pathway may offer a way of independently

controlling the associated reaction in response to pathway-specific requirements, at

both the transcriptional and the post-transcriptional levels.

Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003)

Differential regulation of isozymes: interpretation

Identify the coregulated subparts of each metabolic pathway and identify relevant

experimental conditions that induce or repress the expression of the pathway

genes.

Also associate additional genes showing similar expression profiles with each

pathway using the signature algorithm.

Input: set of genes, some of which are expected to be coregulated.

Output: coregulated part of the input and additional coregulated genes together

with the set of conditions where the coregulation is realized.
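One pass of such an input/output procedure might look as follows. This is a simplified sketch in the spirit of the signature algorithm, not the published implementation: it assumes the expression values are already normalized, it uses absolute score thresholds (`cond_threshold` and `gene_threshold`, both hypothetical parameters), and it performs a single pass without iteration.

```python
def signature_step(expr, input_genes, cond_threshold=1.0, gene_threshold=1.0):
    """One simplified signature-style pass.

    expr: dict mapping gene -> list of normalized expression values,
          one value per condition (assumed layout).
    Returns (module_genes, selected_conditions).
    """
    present = [g for g in input_genes if g in expr]
    if not present:
        return set(), []
    n_cond = len(expr[present[0]])

    # Step 1: score each condition by the average expression of the input genes
    # and keep the conditions under which the input set is clearly regulated.
    cond_score = [
        sum(expr[g][c] for g in present) / len(present) for c in range(n_cond)
    ]
    conditions = [c for c, s in enumerate(cond_score) if abs(s) > cond_threshold]
    if not conditions:
        return set(), []

    # Step 2: score every gene (not only the input genes) over the selected
    # conditions, weighting each condition by its score.
    gene_score = {
        g: sum(cond_score[c] * row[c] for c in conditions) / len(conditions)
        for g, row in expr.items()
    }
    module = {g for g, s in gene_score.items() if s > gene_threshold}
    return module, conditions
```

Input genes that do not follow the common pattern drop out in step 2, while genes outside the input set that do follow it are picked up. In this way additional genes become associated with a pathway.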

Numerous genes were found that are not directly involved in enzymatic steps:

- transporters

- transcription factors

Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003)

Genes coexpressed with metabolic pathways

Co-expression of transporters

Transporter genes are

co-expressed with the metabolic

pathways that they supply

with metabolites.

Co-expression is marked in green.

Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003)

Co-regulation of transcription factors

Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003)

Transcription factors are often co-regulated with their regulated pathways. Shown

here are transcription factors which were found to be co-regulated in the analysis.

Co-regulation is shown by color-coding such that the transcription factor and the

associated pathways are of the same color.

So far: co-expression analysis revealed a strong tendency toward coordinated

regulation of genes involved in individual metabolic pathways.

Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003)

Hierarchical modularity in the metabolic network

Does transcription regulation also define a higher-order metabolic organization, by

coordinated expression of distinct metabolic pathways?

This question is motivated by the observation that feeder pathways (which synthesize metabolites) are

frequently coexpressed with pathways using the synthesized metabolites.

Feeder-pathways/enzymes

Feeder pathways or genes

co-expressed with the

pathways they fuel. The

feeder pathways (light blue)

provide the main pathway

(dark blue) with metabolites

in order to assist the main

pathway, indicating that co-

expression extends beyond

the level of individual

pathways.

These results can be

interpreted in the following

way: the organism will

produce those enzymes that

are needed.

Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003)

Hierarchical modularity in the metabolic network

Derive hierarchy by applying an iterative

signature algorithm to the metabolic pathways,

and decreasing the resolution parameter

(coregulation stringency) in small steps.

Each box contains a group of coregulated genes

(transcription module). Strongly associated

genes (left) can be associated with a specific

function, whereas moderately correlated

modules (right) are larger and their function is

less coherent.

The merging of 2 branches indicates that the

associated modules are induced by similar

conditions.

All pathways converge to one of 3 low-resolution

modules: amino acid biosynthesis, protein

synthesis, and stress.

Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003)
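The idea of lowering a stringency parameter in small steps and watching modules merge can be illustrated with a much simpler stand-in. The sketch below is not the iterative signature algorithm; it uses plain single-linkage grouping (connected components of a thresholded correlation graph) to show how tight groups at high stringency fuse into larger, less specific groups as the threshold decreases. All names and the `corr` layout are hypothetical.

```python
def modules_at(corr, genes, threshold):
    """Group genes into connected components of the graph that links
    two genes whenever their correlation exceeds the threshold.

    corr maps frozenset({gene1, gene2}) -> correlation coefficient.
    """
    parent = {g: g for g in genes}

    def find(g):
        # Union-find root lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for pair, r in corr.items():
        a, b = tuple(pair)
        if r > threshold and a in parent and b in parent:
            parent[find(a)] = find(b)

    groups = {}
    for g in genes:
        groups.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in groups.values())


def hierarchy(corr, genes, thresholds):
    """Modules at each stringency level, from most to least stringent."""
    return {t: modules_at(corr, genes, t) for t in sorted(thresholds, reverse=True)}
```

Reading the result from the highest to the lowest threshold corresponds to moving from left to right in the figure: two groups that appear as one at the next lower threshold correspond to a merging of two branches.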

Hierarchical modularity in the metabolic network

Although amino acids serve as building blocks for proteins, the expression of genes

mediating these 2 processes is clearly uncoupled!

This may reflect the association of rapid cell growth (which triggers enhanced

protein synthesis) with rich growth conditions, where amino acids are readily

available and do not need to be synthesized.

Amino acid biosynthesis genes are only required when external amino acids are

scarce.

In support of this view, a group of amino acid transporters converged to the protein

synthesis module, together with other pathways required for rapid cell growth

(glucose fermentation, nucleotide synthesis and fatty acid synthesis).

Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003)

Global network properties

Jeong et al. showed that the structural connectivity between metabolites imposes a

hierarchical organization of the metabolic network. That analysis was based on

connectivity between substrates, considering all potential connections.

Here, analysis is based on coexpression of enzymes.

In both approaches, related metabolic pathways were clustered together!

Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003)

There are, however, some differences in the particular groupings (not discussed

here),

and importantly, when including expression data the connectivity pattern of

metabolites changes from a power-law dependence to an exponential one

corresponding to a network structure with a defined scale of connectivity.

This reflects the reduction in the complexity of the network.

Summary

Transcription regulation is prominently involved in shaping the metabolic network of

S. cerevisiae.

(1) Transcription regulation leads the metabolic flow toward linearity.

(2) Individual isozymes are often separately coregulated with distinct processes,

providing a means of reducing crosstalk between pathways using a common

reaction.

(3) Transcription regulation entails a higher-order structure of the metabolic

network.

There exists a hierarchical organization of metabolic pathways into groups of

decreasing expression coherence.

Ihmels, Levy, Barkai, Nat. Biotech 22, 86 (2003)