FBA

Optimization Based Frameworks and Search Methodologies for the Analysis and Redesign of the Escherichia coli Metabolic Network. Thesis defense by: William W. Gikandi Major Professor: Matheos Koffas Additional committee Members: Prof. E. (Manolis) S. Tzanakakis Prof. Sriram Neelamegham

Transcript of FBA

Page 1: FBA

Optimization Based Frameworks and Search Methodologies for the Analysis and Redesign of the Escherichia coli Metabolic Network.

Thesis defense by: William W. Gikandi

Major Professor: Matheos Koffas

Additional committee Members: Prof. E. (Manolis) S. Tzanakakis, Prof. Sriram Neelamegham

Page 2: FBA

Cell Modeling to Improve Naringenin Production in E. coli

Page 3: FBA

Cell Modeling

A variety of methods exist to identify the steady-state fluxes of a cell. The main ones are Flux Balance Analysis (FBA) and MOMA.

Page 4: FBA

Flux Balance Analysis: Procedure

Page 5: FBA

Steady State Assumption

Is it biologically justifiable to assume it?

“The steady state approximation is generally valid because of fast equilibration of metabolite concentrations (seconds) with respect to the time scale of genetic regulation (minutes)” – Segre 2002

Page 6: FBA

Maximization Objective

The cell's objective is to maximize biomass

The maximization objective = the stoichiometric sum of components that constitute biomass
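The steady-state assumption and the biomass objective together make FBA a linear program: maximize the biomass flux subject to S·v = 0 and flux bounds. The sketch below is a minimal illustration on a hypothetical four-reaction toy network (not the thesis model), assuming NumPy and SciPy are available; the names `run_fba`, `S`, `BOUNDS` and `BIOMASS` are made up for this example.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (an assumption, not the thesis model):
#   v0: -> A            uptake, limited to 10 units
#   v1: A -> B
#   v2: A -> byproduct  leaves the system
#   v3: B -> biomass    leaves the system
S = np.array([
    [1.0, -1.0, -1.0,  0.0],   # mass balance on metabolite A
    [0.0,  1.0,  0.0, -1.0],   # mass balance on metabolite B
])
BOUNDS = [(0, 10), (0, None), (0, None), (0, None)]
BIOMASS = 3  # column index of the biomass flux


def run_fba(S, bounds, objective_index):
    """Maximize one flux subject to S v = 0 and the flux bounds."""
    c = np.zeros(S.shape[1])
    c[objective_index] = -1.0  # linprog minimizes, so negate the objective
    result = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds)
    if not result.success:
        raise RuntimeError(result.message)
    return result.x


fluxes = run_fba(S, BOUNDS, BIOMASS)
print("biomass flux:", fluxes[BIOMASS])
```

In this toy case all of the uptake is routed to biomass, because nothing in the constraints rewards the byproduct branch.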

Page 7: FBA

Minimization of Metabolic Adjustment (MOMA)

Do mutant bacteria exhibit optimum metabolic states?

Not subjected to the same evolutionary pressure that shaped the wild type

Therefore knockouts probably do not possess a mechanism for immediate regulation of fluxes toward the optimal growth configuration

Page 8: FBA

MOMA

Hypothesis: knocked out bacteria initially display a suboptimal flux distribution with minimal cell-wide changes in fluxes

MOMA uses quadratic programming to approximate this behavior
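The quadratic program behind this hypothesis can be sketched as: find the mutant flux vector closest (in Euclidean distance) to the wild-type fluxes, subject to the same mass balances with the knocked-out flux forced to zero. The example below is an illustration only, on a hypothetical five-reaction toy network and using SciPy's general-purpose SLSQP solver rather than a dedicated QP solver; `run_moma`, `WILD_TYPE` and the network itself are assumptions.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical toy network (an assumption, not the thesis model):
#   v0: -> A      uptake          v1: A -> B     direct route
#   v2: A -> C    bypass entry    v3: 2 C -> B   lossy bypass
#   v4: B -> biomass
S = np.array([
    [1.0, -1.0, -1.0,  0.0,  0.0],   # metabolite A
    [0.0,  1.0,  0.0,  1.0, -1.0],   # metabolite B
    [0.0,  0.0,  1.0, -2.0,  0.0],   # metabolite C
])
BOUNDS = [(0.0, 10.0)] * 5
# Growth-optimal wild-type fluxes for this toy network (direct route only).
WILD_TYPE = np.array([10.0, 10.0, 0.0, 0.0, 10.0])


def run_moma(S, wild_type, bounds, knockout):
    """Minimize ||v - wild_type||^2 with S v = 0 and v[knockout] = 0."""
    bounds = list(bounds)
    bounds[knockout] = (0.0, 0.0)          # the deleted reaction carries no flux
    start = wild_type.copy()
    start[knockout] = 0.0
    result = minimize(
        lambda v: np.sum((v - wild_type) ** 2),
        start,
        jac=lambda v: 2.0 * (v - wild_type),
        bounds=bounds,
        constraints=[{"type": "eq", "fun": lambda v: S @ v, "jac": lambda v: S}],
        method="SLSQP",
    )
    if not result.success:
        raise RuntimeError(result.message)
    return result.x


moma_fluxes = run_moma(S, WILD_TYPE, BOUNDS, knockout=1)
print("MOMA biomass flux after knocking out v1:", moma_fluxes[4])
```

For this toy network the mutant could reach a biomass flux of 5 by pushing all uptake through the bypass, but the minimal-adjustment solution settles at a lower biomass flux, which is the suboptimal initial state the hypothesis describes.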

Page 9: FBA

FBA and MOMA

MOMA calculates initial flux distribution after perturbation assuming sub-optimal growth.

FBA (incorrectly) assumes perturbed cells behave optimally from the onset.

Regulatory/ Kinetic effects not accounted for.

FBA/ MOMA constraints fluxes

Page 10: FBA

Does Cell Modeling Work?

Qualitatively predict the growth potential of mutant strains

Qualitatively predict media dependent uptake/ secretion of protons in the growth

The average difference between experimental flux measurements and ones predicted by the model was 16%

Quantitatively describe relationship between uptake of a primary carbon source (acetate, malate, succinate), oxygen and maximal cellular growth rate.

Successfully identify triple-knockout gene targets that improved lycopene yield by ~ 40% in E. coli

Page 11: FBA

FBA/ MOMA

Page 12: FBA

Building the Model

[c]akg + ala-L <==> glu-L + pyr

[c]ala-L <==> ala-D

[c]asn-L + h2o --> asp-L + nh4

[c]asp-L + atp + nh4 --> amp + asn-L + h + ppi

[c]asp-L + atp + gln-L + h2o --> amp + asn-L + glu-L + h + ppi

[c]asp-L --> fum + nh4

[c]akg + asp-L <==> glu-L + oaa

[c]3mob + ala-L --> pyr + val-L

[c]ala-D + fad + h2o --> fadh2 + nh4 + pyr

Matrix Creator
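A matrix creator of this kind turns reaction strings into a stoichiometric matrix with one row per metabolite and one column per reaction. The following is a minimal sketch of how the notation shown on this slide could be parsed, not the tool used in the thesis; `parse_reaction` and `build_matrix` are hypothetical names, and the parser assumes terms are separated by " + " and coefficients are written as "(3)".

```python
import re


def parse_reaction(text):
    """Parse one reaction string into ({metabolite: coefficient}, reversible)."""
    text = text.strip()
    compartment = None
    prefix = re.match(r"^\[(\w+)\]\s*(.*)$", text)
    if prefix:  # a leading "[c]" applies to every metabolite in the reaction
        compartment, text = prefix.group(1), prefix.group(2)
    if "<==>" in text:
        lhs, _, rhs = text.partition("<==>")
        reversible = True
    else:
        lhs, _, rhs = text.partition("-->")
        reversible = False
    stoich = {}
    for side, sign in ((lhs, -1.0), (rhs, 1.0)):  # substrates are negative
        for term in side.split(" + "):
            term = term.strip()
            if not term:  # exchange reactions have an empty side
                continue
            coeff = 1.0
            counted = re.match(r"^\((\d+(?:\.\d+)?)\)\s*(.+)$", term)
            if counted:
                coeff, term = float(counted.group(1)), counted.group(2)
            if compartment is not None and not term.endswith("]"):
                term = f"{term}[{compartment}]"
            stoich[term] = stoich.get(term, 0.0) + sign * coeff
    return stoich, reversible


def build_matrix(reaction_texts):
    """Return (sorted metabolite names, matrix as a list of rows)."""
    parsed = [parse_reaction(text)[0] for text in reaction_texts]
    metabolites = sorted({name for stoich in parsed for name in stoich})
    matrix = [[stoich.get(name, 0.0) for stoich in parsed] for name in metabolites]
    return metabolites, matrix


metabolites, matrix = build_matrix([
    "[c]akg + ala-L <==> glu-L + pyr",
    "[c]ala-L <==> ala-D",
    "[c]asn-L + h2o --> asp-L + nh4",
])
for name, row in zip(metabolites, matrix):
    print(f"{name:10s} {row}")
```

Reversibility is returned separately because it affects the flux bounds rather than the matrix entries.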

Page 13: FBA

Current Model

1191 Total Fluxes

932 Reactions

259 Transport & Exchange Fluxes

70 Dead end Metabolites

Glycolysis, the TCA cycle, the pentose phosphate pathway, respiration, anaplerotic reactions, fermentative reactions, amino acid biosynthesis and degradation, nucleotide biosynthesis and interconversions, fatty acid biosynthesis and degradation, phospholipid biosynthesis, cofactor biosynthesis, and metabolite transport

Page 14: FBA

Testing the Model

Obtained in silico exchange fluxes vs. Palsson's iJR904 model

Similar results for Anaerobic-Glucose, Aerobic-Succinate, Aerobic-Acetate substrates

[Figure: exchange flux output (mmol/g DW-hr, axis -20 to 50), Matlab vs. Palsson, for EX_co2(e), EX_h(e), EX_h2o(e), EX_pi(e), EX_nh4(e) and Biomass]

Page 15: FBA

Proton Exchange Flux

Page 16: FBA

Proton Exchange Flux

Limiting exchange of protons across system boundary

[Figure: relative growth rate (axis 0 to 1.2) vs. proton secretion flux (axis -10 to 6) for Acetate, Akg, Glucose-D, L-lactate, D-Lactate, Malate, Pyruvate, Succinate and Glycerol]

Page 17: FBA

Naringenin

Reactions added

Participating Enzyme | Reaction

Coumaric Acid transport | cma[e] <==> cma[c]

4 coumarate:coenzyme A ligase | [c]atp + cma + coa --> amp + ppi + cmcoa

Chalcone Synthase | [c](3) malcoa + cmcoa --> (4) coa + chal + (3) co2

Chalcone Isomerase | [c]chal --> flva

Naringenin exchange flux | [e]flva <==>

Coumaric Acid exchange flux | [e]cma <==>

Naringenin transport | flva[e] <==> flva[c]

Page 18: FBA

Evaluate Scenarios

Page 19: FBA

Gene-Protein Relationships

Page 20: FBA

Gene-Protein Relationships

Page 21: FBA

Gene-Protein Relationships

Page 22: FBA

Gene Map

Page 23: FBA

Overall Process

Page 24: FBA

Standard Search

Combinatorial Explosion

At 2 seconds/ calculation…

Primary Knockouts < 3 hours

Secondary Knockouts ~ 1 day

Tertiary Knockouts ~ 12 days

Quaternary Knockouts ~ 230 days

Limited search space
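To see why the search space has to be limited, it helps to count how many knockout sets an exhaustive search would need. The sketch below uses the 2 seconds per calculation quoted on the slide; the pool of 100 candidate genes is a hypothetical number chosen for illustration, and the output is not meant to reproduce the run times listed above, which come from the restricted search.

```python
from math import comb

SECONDS_PER_CALCULATION = 2   # figure quoted on the slide
SECONDS_PER_DAY = 86_400


def exhaustive_days(n_candidates, knockouts):
    """Days needed to evaluate every knockout set of the given size."""
    return comb(n_candidates, knockouts) * SECONDS_PER_CALCULATION / SECONDS_PER_DAY


N_CANDIDATES = 100  # hypothetical pool size, an assumption for this example
for level, name in enumerate(["primary", "secondary", "tertiary", "quaternary"], start=1):
    sets = comb(N_CANDIDATES, level)
    print(f"{name:10s} {sets:>10,d} sets  {exhaustive_days(N_CANDIDATES, level):10.3f} days")
```

Each extra knockout level multiplies the work by roughly the size of the candidate pool, which is what motivates the alternative search methods on the following slides.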

Page 25: FBA

Problem of large search space

Time taken

Not all search space covered

Other methods possible?

Genetic Algorithm

Page 26: FBA

Genetic Algorithm

Page 27: FBA

Genetic Algorithm

Page 28: FBA

Crossover - Recombination

Crossover combines genetic material from two parents, in order to produce superior offspring.
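As a rough illustration of recombination, the sketch below mixes two parents position by position, taking each gene from one parent or the other at random. It assumes chromosomes are equal-length bit strings like those in the example population; `scattered_crossover` is a hypothetical helper, not the routine used in the thesis.

```python
import random


def scattered_crossover(parent_a, parent_b, rng):
    """Build a child by picking each position from parent_a or parent_b."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    return "".join(
        a if rng.random() < 0.5 else b
        for a, b in zip(parent_a, parent_b)
    )


rng = random.Random(42)  # fixed seed so the example is repeatable
child = scattered_crossover("1010011010", "1111100001", rng)
print(child)
```

Positions where both parents agree are always inherited unchanged; only the positions where they differ can vary between children.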

Page 29: FBA

Mutation

• Mutation introduces randomness into the population.

• The idea of mutation is to reintroduce divergence into a converging population.
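A minimal sketch of mutation on a bit-string chromosome: each position is flipped independently with a small probability. The rate of one over the string length is a common default and is used here as an assumption; `mutate` is a hypothetical helper name.

```python
import random


def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    flipped = {"0": "1", "1": "0"}
    return "".join(
        flipped[bit] if rng.random() < rate else bit
        for bit in chromosome
    )


rng = random.Random(7)  # fixed seed so the example is repeatable
parent = "1010011010"
print(mutate(parent, 1 / len(parent), rng))
```

With a rate of one over the string length, one bit changes per chromosome on average, which keeps diversity in the population without erasing what selection has already found.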

Page 30: FBA

Fitness Function

The Fitness function determines what solutions are better than others.

Fitness is computed for each individual. Fitness = flavonoid production

Page 31: FBA

Example population

No. Chromosome Fitness

1 1010011010 1

2 1111100001 2

3 1011001100 3

4 1010000000 1

5 0000010000 3

6 1001011111 5

7 0101010101 1

8 1011100111 2

Page 32: FBA

Selection

Main idea: better individuals get a higher chance

Chances proportional to fitness

Roulette wheel technique

fitness(A) = 3, share of the wheel 3/6 = 50%

fitness(B) = 1, share of the wheel 1/6 = 17%

fitness(C) = 2, share of the wheel 2/6 = 33%
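The roulette wheel for the three individuals on this slide can be sketched as follows. The function names are hypothetical, and the wheel position is passed in as a number in [0, 1) so the behaviour is easy to check; in practice it would come from a random number generator.

```python
import random


def selection_probabilities(fitness):
    """Share of the wheel owned by each individual."""
    total = sum(fitness.values())
    if total <= 0:
        raise ValueError("total fitness must be positive")
    return {name: value / total for name, value in fitness.items()}


def roulette_select(fitness, spin):
    """Pick the individual whose slice contains `spin` (0 <= spin < 1)."""
    threshold = spin * sum(fitness.values())
    running = 0.0
    for name, value in fitness.items():
        running += value
        if threshold < running:
            return name
    return name  # guards against rounding at the top edge of the wheel


fitness = {"A": 3, "B": 1, "C": 2}   # values from the slide
print(selection_probabilities(fitness))
print(roulette_select(fitness, random.Random(0).random()))
```

Because slices are proportional to fitness, A is chosen about half the time, yet B and C still have a chance of being selected, which preserves diversity.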

Page 33: FBA

Stopping Criteria

The final problem is deciding when to stop the algorithm.

There are two possible approaches:

First approach: stop after a fixed number of generations

Second approach: stop when the improvement in average fitness over two generations is below a threshold
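Both stopping rules can be combined in one check, as in the sketch below. It reads "over two generations" as the change between the two most recent generations, which is one possible interpretation; `should_stop` and its parameter names are hypothetical.

```python
def should_stop(avg_fitness_history, max_generations, min_improvement):
    """Return True when either stopping rule is met.

    avg_fitness_history holds one average-fitness value per completed generation.
    """
    generations_done = len(avg_fitness_history)
    if generations_done >= max_generations:      # first approach: generation cap
        return True
    if generations_done >= 2:                    # second approach: stalled progress
        improvement = avg_fitness_history[-1] - avg_fitness_history[-2]
        return improvement < min_improvement
    return False


history = [1.0, 1.5, 1.505]
print(should_stop(history, max_generations=100, min_improvement=0.01))
```

A stricter variant would require the improvement to stay below the threshold for several consecutive generations before stopping, so that a single flat generation does not end the run early.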

Page 34: FBA

Typical behavior of an EA

Early phase:

quasi-random population distribution

Mid-phase:

population arranged around/on hills

Late phase:

population concentrated on high hills

Phases in optimizing on a 1-dimensional fitness landscape

Page 35: FBA

Advantages of GA’s

Search space not limited to first top 10 knockouts

Supports multi-objective optimization

Can return a family of solutions with similar fluxes

Easy to exploit previous or alternate solutions

May find synergistic knockouts overlooked by standard search

Page 36: FBA

Genetic Algorithm

Page 37: FBA

Parameters of the GA

Representation scheme: Integer, e.g. the bit string [00100111] is stored as [3 6 7 8]

Mutation rate: 1/ string length / locus restricted

Crossover type: scattered (random mix)

Elite children: 2

Stall generations: 50

Population size: 1000

Mutation probability: Simulated Annealing
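The integer representation stores only the positions that are switched on, so the bit string 00100111 becomes the list 3, 6, 7, 8 (positions counted from 1). A small sketch of the conversion in both directions, with hypothetical helper names:

```python
def bits_to_indices(bits):
    """List the 1-based positions of the '1' entries in a bit string."""
    return [position for position, bit in enumerate(bits, start=1) if bit == "1"]


def indices_to_bits(indices, length):
    """Rebuild the bit string of the given length from 1-based positions."""
    chosen = set(indices)
    return "".join("1" if position in chosen else "0" for position in range(1, length + 1))


print(bits_to_indices("00100111"))
print(indices_to_bits([3, 6, 7, 8], 8))
```

For knockout searches this is compact: a chromosome of four integers names four candidate knockouts directly, instead of carrying one bit for every gene in the model.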

Page 38: FBA

Simulated Annealing

Page 39: FBA

Simulated Annealing

[Figure: Change in mutation rate. Mutation rate (axis 0 to 0.6) vs. generation % (0 to 100)]

Page 40: FBA

Results:

Page 41: FBA

Results: Summary

Over 10,000 KO results were stored by the algorithms, out of about 900,000 MOMA calculations performed

Page 42: FBA

Results: Hill Climber VS GA

Results for both methods in agreement

Exhaustive combination of top 10 most frequently suggested KO's yielded no better results

Implications: the search space is not as chaotic as originally assumed

Which is better?

Page 43: FBA

Results: Effect of Gene Mapping

More accurate prediction on reactions affected by disruption of genes

For example, the top yielding candidate for a primary level knockout predicted the loss of two reactions

Page 44: FBA

Results: Primary Level

The top result predicted a flux increase of naringenin from zero with no knockouts performed to 0.6078 mmol/g-DW/hr

Gene: sdhC Reaction:

Knocking out this reaction reduces the amount of fumarate available to the cell. (Other sources available: e.g. glutamate degradation)

Page 45: FBA

Results: Primary Level

Affects ATP availability?

Page 46: FBA

Results: Primary Level

The second-highest result

Gene: tpiA (glycolysis)

Affects ATP availability?

Page 47: FBA

Results: Top 3 in each level

[Figure: Top 3 simulated KOs in each level. Naringenin flux (mmol/g-DW/hr, axis 0 to 1.6) by KO genes: Primary ('sdhC'), ('tpiA'), ('gnd'); Secondary ('gnd' 'sdhC'), ('glyA' 'sdhC'), ('folD' 'sdhC'); Tertiary ('gdhA' 'gnd' 'sdhC'), ('gcd' 'glyA' 'sdhC'), ('mdh' 'glyA' 'sdhC'); Quaternary ('dcuC' 'brnQ' 'gnd' 'sdhC'), ('dcuC' 'brnQ' 'folD' 'sdhC'), ('gdhA' 'pgi' 'brnQ' 'gnd'); GA Quaternary ('gnd' 'dcuC' 'brnQ' 'sdhD'), ('sdhC' 'gdhA' 'aceA' 'gnd'), ('gnd' 'mdh' 'gdhA' 'sdhB')]

Page 48: FBA

Results: Increase over Wild type

[Figure: % increase over predicted naringenin wildtype flux (0.0002 mmol/g-DW/hr), axis 0 to 800000, by KO genes: Primary ('sdhC'), ('tpiA'), ('gnd'); Secondary ('gnd' 'sdhC'), ('glyA' 'sdhC'), ('folD' 'sdhC'); Tertiary ('gdhA' 'gnd' 'sdhC'), ('gcd' 'glyA' 'sdhC'), ('mdh' 'glyA' 'sdhC'); Quaternary ('dcuC' 'brnQ' 'gnd' 'sdhC'), ('dcuC' 'brnQ' 'folD' 'sdhC'), ('gdhA' 'pgi' 'brnQ' 'gnd'); GA Quaternary ('gnd' 'dcuC' 'brnQ' 'sdhD'), ('sdhC' 'gdhA' 'aceA' 'gnd'), ('gnd' 'mdh' 'gdhA' 'sdhB')]
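The very large percentages on this slide follow from the tiny wild-type baseline. As a worked example, the sketch below applies the usual percent-increase formula to two numbers quoted in the slides: the predicted wild-type flux of 0.0002 mmol/g-DW/hr and the top primary knockout flux of 0.6078 mmol/g-DW/hr. The function name is hypothetical.

```python
WILD_TYPE_FLUX = 0.0002     # mmol/g-DW/hr, predicted wild-type naringenin flux
TOP_PRIMARY_FLUX = 0.6078   # mmol/g-DW/hr, top primary knockout (sdhC)


def percent_increase(flux, baseline):
    """Percent increase of `flux` relative to a positive `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (flux - baseline) / baseline * 100.0


increase = percent_increase(TOP_PRIMARY_FLUX, WILD_TYPE_FLUX)
print(f"{increase:,.0f} % increase over wild type")
```

Because the baseline is so close to zero, the percentage is more useful for ranking knockouts against each other than as an absolute measure of improvement.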

Page 49: FBA

Results: Targets

TCA cycle, the pentose phosphate pathway, and other biosynthetic pathways

Page 50: FBA

Results: Rationalization

Precursor Availability

[Figure: flux outputs (mmol/g-DW/hr, axis 0 to 5) of malonyl CoA ACP transacylase and acetyl CoA carboxylase vs. naringenin flux (mmol/g-DW/hr, axis 0 to 1.6)]

Malonyl CoA ACP transacylase: only consumer of malonyl CoA

Acetyl CoA carboxylase: produces malonyl CoA

Page 51: FBA

Naringenin/ Biomass Relationship

[Figure: biomass flux (mmol/g-DW/hr, axis 0 to 0.9) vs. naringenin output (mmol/g-DW/hr, axis 0 to 1.6)]

Competition for precursors

Page 52: FBA

Results: Diminishing Returns

[Figure: % increases of top 3 KO's over previous levels. % increase of naringenin flux (axis 0 to 300) by KO genes: Primary ('sdhC'), ('tpiA'), ('gnd'); Secondary ('gnd' 'sdhC'), ('glyA' 'sdhC'), ('folD' 'sdhC'); Tertiary ('gdhA' 'gnd' 'sdhC'), ('gcd' 'glyA' 'sdhC'), ('mdh' 'glyA' 'sdhC'); Quaternary ('dcuC' 'brnQ' 'gnd' 'sdhC'), ('dcuC' 'brnQ' 'folD' 'sdhC'), ('gdhA' 'pgi' 'brnQ' 'gnd')]

Page 53: FBA

Results: Diminishing Returns

Biomass Threshold

[Figure: Biomass flux of top 3 KO's in each level. Biomass flux (mmol/g-DW/hr, axis 0 to 1) by KO genes: Wild type; Primary ('sdhC'), ('tpiA'), ('gnd'); Secondary ('gnd' 'sdhC'), ('glyA' 'sdhC'), ('folD' 'sdhC'); Tertiary ('gdhA' 'gnd' 'sdhC'), ('gcd' 'glyA' 'sdhC'), ('mdh' 'glyA' 'sdhC'); Quaternary ('dcuC' 'brnQ' 'gnd' 'sdhC'), ('dcuC' 'brnQ' 'folD' 'sdhC'), ('gdhA' 'pgi' 'brnQ' 'gnd'); GA Quaternary ('gnd' 'dcuC' 'brnQ' 'sdhD'), ('sdhC' 'gdhA' 'aceA' 'gnd'), ('gnd' 'mdh' 'gdhA' 'sdhB')]

Page 54: FBA

In Conclusion

Will all knockouts identified show increased productivity?

In-vivo results could provide an opportunity to improve the model.

The approaches used justify some optimism regarding gene targeting for strain improvement

Provide a clearer understanding of the nature of the optimization goal

Page 55: FBA

Questions?