Seminar in bioinformatics
Computation of elementary modes: a unifying framework and the new binary
approach
Elad Gerson, Spring 2006, Technion.
Julien Gagneur and Steffen Klamt
BMC Bioinformatics 2004, 5:175
Agenda
• Quick overview of last week’s lecture.
• Extension of the EP concept.
  – Enter EM.
• General framework for EM computation.
  – Reversible reactions split.
  – Network compression.
  – Post processing.
• Some implementation tweaks.
Last week on bioinformatics seminar!
Given a metabolic network, we wish to find all possible flux distributions that result in a steady state.
Meaning, the net production of every internal metabolite is 0.
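Stoichiometric matrix S (rows: metabolites, columns: reactions; values as they appear in this transcript, minus signs were not preserved):

       v1 v2 v3 v4 v5 v6 b1 b2 b3
A       1  0  0  0  0  0  1  0  0
B       1  2  2  0  0  0  0  0  0
C       0  1  0  0  1  1  0  0  0
D       0  0  1  1  1  0  0  0  0
E       0  0  0  1  0  1  0  1  0
byp     0  1  1  0  0  0  0  0  1
cof     0  0  1  1  1  0  0  0  0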
Last week on bioinformatics seminar!
This is done by describing the pathway as a stoichiometric matrix S and solving the equation

$$S \cdot v = 0, \qquad v = (v_1, v_2, \dots, v_9)^T$$

where S is the 7 × 9 stoichiometric matrix of the network.
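As a minimal sketch of what the steady-state condition means, the snippet below checks S · v = 0 for a small linear chain. The toy network, the matrix and the function names are assumptions made for illustration; this is not the network shown on the slide.

```python
# Hypothetical toy network (NOT the one on the slide):
#   b1: -> A,   v1: A -> B,   v2: B -> C,   b2: C ->
# Rows are the metabolites A, B, C; columns are the reactions b1, v1, v2, b2.
S = [
    [1, -1,  0,  0],   # A: produced by b1, consumed by v1
    [0,  1, -1,  0],   # B: produced by v1, consumed by v2
    [0,  0,  1, -1],   # C: produced by v2, consumed by b2
]

def net_production(S, v):
    """Return S*v, the net production rate of every metabolite."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]

def is_steady_state(S, v):
    """A flux vector v is balanced when every entry of S*v is zero."""
    return all(x == 0 for x in net_production(S, v))

balanced = [2, 2, 2, 2]      # the same flux runs through the whole chain
unbalanced = [2, 1, 1, 1]    # more A enters than leaves, so A accumulates

print(is_steady_state(S, balanced))
print(is_steady_state(S, unbalanced))
print(net_production(S, unbalanced))
```
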
Last week on bioinformatics seminar!
Notice that we are interested only in solutions where $v_i \ge 0$ (the sign indicates the reaction’s direction).
The solution space is spanned by linearly independent vectors.
We look for a spanning set such that every solution can be written as a linear combination of the spanning vectors in which all coefficients are non-negative (Genetically independent).
These spanning vectors are called Extreme Pathways (EP).
They can be found using the Null Space Approach (NSA) algorithm.
Problem
Biology suggests that some reactions are reversible.
Consider the following network, for instance:
R5 can work in both directions (not simultaneously!).
Solution?
Remove the restriction $v_i \ge 0$ and let the sign indicate the direction.
Bad idea:
• Not all reactions are reversible.
• Solutions no longer take the form of a polyhedral cone.
Solution!
Split the reversible reactions (R5 becomes R5a and R5b).
Find Extreme Pathways using the NSA algorithm.
Post-process the EPs found: merge the split reactions (the “opposite direction” is given a negative sign).
The post-processed EPs are now called Elementary Modes (EM).
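The split and merge steps can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the one-metabolite toy network and the helper names `split_reversible` and `merge_split` are made up here, and the EP computation itself is left out.

```python
def split_reversible(S, reversible):
    """For every reversible column j, append a backward column equal to -S[:, j].

    Returns the extended matrix and a list that maps each appended column
    to the original column it reverses.
    """
    backward_of = list(reversible)
    S_split = [row + [-row[j] for j in backward_of] for row in S]
    return S_split, backward_of

def merge_split(v_split, n_original, backward_of):
    """Fold the backward fluxes back in: v[j] = forward flux - backward flux."""
    v = list(v_split[:n_original])
    for k, j in enumerate(backward_of):
        v[j] -= v_split[n_original + k]
    return v

# Hypothetical toy network with one metabolite A and columns (in1, out1, R5):
#   in1: -> A,   out1: A ->,   R5: A <-> (reversible exchange)
S = [[1, -1, -1]]
S_split, backward_of = split_reversible(S, reversible=[2])
print(S_split)                                   # columns: in1, out1, R5a, R5b

# A non-negative pathway of the split network that uses R5 backwards:
print(merge_split([0, 1, 0, 1], 3, backward_of))   # R5 gets a negative sign

# The two-cycle R5a + R5b merges to the zero vector and would be discarded:
print(merge_split([0, 0, 1, 1], 3, backward_of))
```
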
Compressing the network
Removing redundancies
Can be united.
Compressing the network
Removing redundancies
R1 is null in any feasible steady state.
Compressing the network
Removing redundancies
Contradict each other. Can be eliminated.
Compressing the network
Removing redundancies
Active in any steady state.
Compressing the network
Removing redundancies
• Some redundancies can be detected as linearly dependent rows in the kernel matrix.
• Iterative approach: remove redundancies until none are detected.
  – Produces better results.
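A rough sketch of how two kinds of redundancy could be read off a kernel matrix K (one row per reaction, one column per basis vector of the null space of S). The matrix below and the function names are assumptions for illustration. A zero row marks a reaction that carries no flux in any steady state, and two proportional rows mark reactions whose fluxes always keep a fixed ratio, so they are candidates for being united.

```python
def zero_rows(K):
    """Reactions whose kernel row is all zeros: no flux in any steady state."""
    return [i for i, row in enumerate(K) if all(x == 0 for x in row)]

def proportional(r1, r2):
    """True if two non-zero rows are scalar multiples of each other.

    Two rows are proportional exactly when all their 2x2 minors vanish.
    """
    n = len(r1)
    return all(r1[a] * r2[b] == r1[b] * r2[a]
               for a in range(n) for b in range(a + 1, n))

def coupled_pairs(K):
    """Pairs of non-blocked reactions whose fluxes always keep a fixed ratio."""
    blocked = set(zero_rows(K))
    rows = [i for i in range(len(K)) if i not in blocked]
    return [(i, j) for a, i in enumerate(rows) for j in rows[a + 1:]
            if proportional(K[i], K[j])]

# Hypothetical kernel matrix for four reactions and a two-dimensional null space.
K = [
    [1, 0],   # reaction 0
    [2, 0],   # reaction 1: always twice the flux of reaction 0
    [0, 0],   # reaction 2: never carries flux
    [1, 1],   # reaction 3
]
print(zero_rows(K))
print(coupled_pairs(K))
```
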
General framework
• Preprocessing:
  – Split reversible reactions.
  – Compress the network.
• Apply the NSA algorithm.
• Post-process.
One more tweak
• The authors offer an efficient implementation of the NSA and CBA (Canonical Basis Approach, Schuster et al.) algorithms.
  – Using a binary representation for vectors.
    • Fast bit operators.
    • Efficient memory usage (as little as 1.6% of the original!)
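To give a feel for the binary representation, here is a small sketch that packs the support of a flux vector (which reactions are non-zero) into the bits of an integer and compares supports with bit operators. The function names are assumptions for illustration and the paper's actual data layout may differ. Storing one bit instead of a 64-bit floating point number per entry would be a factor of 1/64 ≈ 1.6%, which may be where the figure above comes from.

```python
def support_mask(flux):
    """Pack the support of a flux vector into an int: bit j is set iff flux[j] != 0."""
    mask = 0
    for j, x in enumerate(flux):
        if x != 0:
            mask |= 1 << j
    return mask

def is_subset(a, b):
    """True if every reaction used by mask a is also used by mask b."""
    return a & ~b == 0

m1 = support_mask([1.0, 0.0, 2.0, 0.0])   # uses reactions 0 and 2
m2 = support_mask([0.5, 0.0, 1.0, 3.0])   # uses reactions 0, 2 and 3

print(bin(m1), bin(m2))
print(is_subset(m1, m2))      # the support of m1 is contained in that of m2
print(bin(m1 | m2))           # support of a combination of both vectors
```
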
Seminar in bioinformatics
Minimal cut sets in biochemical reaction networks
Elad Gerson, Spring 2006, Technion.
Steffen Klamt and Ernst Dieter Gilles
Bioinformatics Vol. 20 no. 2 2004, pages 226–234
Abstract
• Motivation
  • Metabolic networks yield deeper insight into an organism's metabolism.
  • Failure mode analysis will provide
    • Identification of crucial parts.
    • Suitable targets for repressing undesired metabolic functions.
• Results
  • The concept of minimal cut sets (MCS) in biochemical networks.
  • An algorithm which computes the MCSs with respect to an objective reaction.
  • Potential applications include
    • Phenotype predictions.
    • Network verification.
    • Structural robustness and fragility assessment.
    • Metabolic flux analysis.
    • Target identification in drug discovery.
Introduction
• Assume we wish to prevent the production of metabolite X.
• I.e., no balanced flux distribution that involves the objective reaction obR should be possible.
• Can be done by gene deletion or enzyme inhibition.
Introduction
Definition - We call a set of reactions a cut set (with
respect to a defined objective reaction) if after the removal
of these reactions from the network no feasible balanced flux
distribution involves the objective reaction.
Introduction
• That’s easy: consider C0 = {obR}.
• One might wish to cut the reaction at the beginning.
• What if there are numerous obRs?
  • Simultaneous failure might be achieved more efficiently.
Introduction
• Take two: remove all reactions except for obR.
• Not efficient.
• Not intelligent.
Introduction
• Consider C1 = {R5, R8}.
• Sufficient.
• Neither the removal of R5 alone nor of R8 alone is sufficient.
  • No proper subset of C1 is a valid cut set → C1 is minimal.
Introduction
Definition - A cut set C (related to a defined objective reaction)
is a minimal cut set (MCS) if no proper subset of C is a
cut set.
Can you spot all the MCS in the network ?
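Both definitions can be tested mechanically once the elementary modes that involve obR are known. The sketch below assumes the three EMs listed in the em_obR table of the running example later in this talk, and the function names are made up for illustration. A set is a cut set if it hits every such EM, and it is minimal if dropping any single reaction breaks that property.

```python
# Elementary modes that involve obR, written as sets of reaction numbers
# (assumed to be the EM2, EM3 and EM4 rows of the em_obR table).
EM_OBR = [{1, 6, 7, 8}, {1, 2, 3, 5}, {1, 4, 5}]

def is_cut_set(c, ems=EM_OBR):
    """C is a cut set if it removes at least one reaction from every EM."""
    return all(c & em for em in ems)

def is_minimal_cut_set(c, ems=EM_OBR):
    """C is minimal if it is a cut set and no proper subset is a cut set.

    Testing the subsets with one reaction removed is enough, because every
    superset of a cut set is again a cut set.
    """
    return is_cut_set(c, ems) and not any(is_cut_set(c - {r}, ems) for r in c)

for candidate in ({5, 8}, {2, 4, 6}, {2, 5, 7}, {1}):
    print(sorted(candidate), is_cut_set(candidate), is_minimal_cut_set(candidate))
```
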
Introduction
Is C2 = {R2, R4, R6} minimal?
Introduction
Is C3 = {R2, R5, R7}?
Introduction
How about C1 = {R1}?
Introduction
• OK, what about graph disconnectivity algorithms?
  • No good: they don’t take the hypergraph nature of metabolic pathways into account.
The algorithm
Initialization
(1) Calculate the EMs in the given network
(2) Define the objective reaction obR
(3) Choose all EMs where reaction obR is non-zero and store them in the binary array em_obR (em_obR[i][j]==1 means that reaction j is involved in EM i)
(4) Initialize arrays mcs and precutsets as follows (each array contains sets of reaction indices): append {j} to mcs if reaction j is essential (em_obR[i][j]==1 for each EM i), otherwise to precutsets
The algorithm
(5) FOR i = 2 TO MAX_CUTSETSIZE
  (5.1) new_precutsets=[ ];
  (5.2) FOR j = 1 TO q (q: number of reactions)
    (5.2.1) Remove all sets from precutsets where reaction j participates
    (5.2.2) Find all sets of reactions in precutsets that do not cover at least one EM in em_obR where reaction j participates; combine each of these sets with reaction j and store the new preliminary cut sets in temp_precutsets
    (5.2.3) Drop all temp_precutsets which are a superset of any of the already determined minimal cut sets stored in mcs
    (5.2.4) Find all retained temp_precutsets which do now cover all EMs and append them to mcs; append all others to new_precutsets
  ENDFOR
  (5.3) IF isempty(new_precutsets)
    (5.3.1) Break
  ELSE
    (5.3.2) precutsets=new_precutsets
  ENDIF
ENDFOR
(6) result: mcs contains the MCSs
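The steps above can be sketched in Python as follows. This is an illustrative transcription under stated assumptions, not the authors' code: EMs are stored as sets of reaction numbers instead of a binary array, the function name `minimal_cut_sets` is made up, and the input is the em_obR table of the running example.

```python
def minimal_cut_sets(em_obr, max_cutset_size=None):
    """Enumerate minimal cut sets from the EMs that involve obR.

    em_obr: list of sets; em_obr[i] holds the reactions taking part in EM i.
    Returns a list of frozensets of reaction numbers.
    """
    reactions = sorted(set().union(*em_obr))
    if max_cutset_size is None:
        max_cutset_size = len(reactions)

    def covers_all(candidate):
        # A candidate is a cut set when it hits every EM at least once.
        return all(candidate & em for em in em_obr)

    # Step (4): essential reactions are MCSs of size one.
    mcs, precutsets = [], []
    for j in reactions:
        single = frozenset({j})
        (mcs if covers_all(single) else precutsets).append(single)

    # Step (5): grow the preliminary cut sets by one reaction per round.
    for _ in range(2, max_cutset_size + 1):
        new_precutsets = []
        for j in reactions:
            # (5.2.1) sets that already contain j are dropped from the pool
            precutsets = [p for p in precutsets if j not in p]
            # (5.2.2) extend the sets that miss some EM in which j takes part
            ems_with_j = [em for em in em_obr if j in em]
            temp = [p | {j} for p in precutsets
                    if any(not (p & em) for em in ems_with_j)]
            # (5.2.3) discard supersets of already known MCSs
            temp = [t for t in temp if not any(m <= t for m in mcs)]
            # (5.2.4) complete cut sets go to mcs, the rest are kept for later
            for t in temp:
                (mcs if covers_all(t) else new_precutsets).append(t)
        if not new_precutsets:
            break
        precutsets = new_precutsets
    return mcs


em_obr = [{1, 6, 7, 8}, {1, 2, 3, 5}, {1, 4, 5}]   # rows EM2, EM3, EM4
for cut in minimal_cut_sets(em_obr):
    print(sorted(cut))
```

Passing `max_cutset_size=2` stops after the first round, which mirrors the MAX_CUTSETSIZE bound in step (5).
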
Running example
Initialization – Calculate EM
We are only interested in EMs containing obR.
Running example
Initialization
mcs = {{1}}, precutsets = {{2},{3},{4},{5},{6},{7},{8}}

em_obR    R1 R2 R3 R4 R5 R6 R7 R8
1 (EM2)    1  0  0  0  0  1  1  1
2 (EM3)    1  1  1  0  1  0  0  0
3 (EM4)    1  0  0  1  1  0  0  0
Running example
i = 2, j = 1
mcs = {{1}}, precutsets = {{2},{3},{4},{5},{6},{7},{8}}
new_precutsets = {}, temp_precutsets = {}
Running example
i = 2, j = 1
mcs = {{1}}, precutsets = {{2},{3},{4},{5},{6},{7},{8}}
new_precutsets = {}, temp_precutsets = {{1 2}}
Running example
i = 2, j = 1
mcs = {{1}}, precutsets = {{2},{3},{4},{5},{6},{7},{8}}
new_precutsets = {}, temp_precutsets = {{1 2}, {1 3}, {1 4}, {1 5}, {1 6}}
Running example
i = 2, j = 1
mcs = {{1}}, precutsets = {{2},{3},{4},{5},{6},{7},{8}}
new_precutsets = {}, temp_precutsets = {{1 2}, {1 3}, {1 4}, {1 5}, {1 6}, {1 7}, {1 8}}
Running example
i = 2, j = 2
mcs = {{1}}, precutsets = {{2},{3},{4},{5},{6},{7},{8}}
new_precutsets = {}, temp_precutsets = {}
Running example
i = 2, j = 2
mcs = {{1}}, precutsets = {{3},{4},{5},{6},{7},{8}}
new_precutsets = {}, temp_precutsets = {}
Running example
i = 2, j = 2
mcs = {{1}}, precutsets = {{3},{4},{5},{6},{7},{8}}
new_precutsets = {}, temp_precutsets = {{2 4}}
Running example
i = 2, j = 2
mcs = {{1}}, precutsets = {{3},{4},{5},{6},{7},{8}}
new_precutsets = {}, temp_precutsets = {{2 4},{2 6},{2 7},{2 8}}
Running example
i = 2, j = 2
mcs = {{1}}, precutsets = {{3},{4},{5},{6},{7},{8}}
new_precutsets = {{2 4}}, temp_precutsets = {{2 6},{2 7},{2 8}}
Running example
i = 2, j = 2
mcs = {{1}}, precutsets = {{3},{4},{5},{6},{7},{8}}
new_precutsets = {{2 4},{2 6},{2 7},{2 8}}, temp_precutsets = {}
Running example
i = 2, j = 5
mcs = {{1}}, precutsets = {{5},{6},{7},{8}}
new_precutsets = {{2 4},{2 6},{2 7},{2 8}, ..}, temp_precutsets = {}
Running example
i = 2, j = 5
mcs = {{1}}, precutsets = {{6},{7},{8}}
new_precutsets = {{2 4},{2 6},{2 7},{2 8}, ..}, temp_precutsets = {{5 6},{5 7},{5 8}}
Running example
i = 2, j = 5
mcs = {{1}, {5 6}}, precutsets = {{6},{7},{8}}
new_precutsets = {{2 4},{2 6},{2 7},{2 8}, ..}, temp_precutsets = {{5 7},{5 8}}
Running example
i = 2, j = 5
mcs = {{1}, {5 6}, {5 7}}, precutsets = {{6},{7},{8}}
new_precutsets = {{2 4},{2 6},{2 7},{2 8}, ..}, temp_precutsets = {{5 8}}
Running example
i = 2, j = 8
mcs = {{1}, {5 6}, {5 7}, {5 8}}, precutsets = {}
new_precutsets = {{2 4},{2 6},{2 7},{2 8}, ..}, temp_precutsets = {}
Running example
i = 3, j = 2
mcs = {{1}, {5 6}, {5 7}, {5 8}}, precutsets = {{2 4},{2 6},{2 7},{2 8},…{4 6},…}
new_precutsets = {}, temp_precutsets = {}
Running example
i = 3, j = 2
mcs = {{1}, {5 6}, {5 7}, {5 8}}, precutsets = {…{4 6},…}
new_precutsets = {}, temp_precutsets = {…}
Running example
i = 3, j = 2
mcs = {{1}, {5 6}, {5 7}, {5 8}}, precutsets = {…{4 6},…}
new_precutsets = {…}, temp_precutsets = {{2 4 6},…}
Running example
i = 3, j = 2
mcs = {{1}, {5 6}, {5 7}, {5 8}, {2 4 6}}, precutsets = {…{4 6},…}
new_precutsets = {…}, temp_precutsets = {…}
Running example
i = 3, j = 8
mcs = {{1}, {5 6}, {5 7}, {5 8}, {2 4 6},…}, precutsets = {}
new_precutsets = {…}, temp_precutsets = {…}
Complexity
• Let q be the number of reactions.
• Assume |EM| << q.
• In the initialization, q singletons are generated and tested.
• In the i-th iteration:
  – The overall number of temp_precutsets generated is
    $$p = C_q^i = \frac{q!}{i!\,(q-i)!}$$
  – O(p) comparisons are made.
• Hence, over all iterations (all subsets of q items):
    $$\sum_{i}^{q} C_q^i \approx 2^q$$
  – Yes, exponential.
• If the maximal MCS size is << q, the number of candidates is bounded by a polynomial in q.
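A quick numeric check of these counts, as a sketch: the function name is made up, and q = 110 is the reaction count quoted for the E. coli network in this talk. Summing the binomial coefficients over all sizes gives every non-empty subset, while capping the cut set size keeps the count polynomial in q.

```python
from math import comb

def candidates_up_to(q, max_size):
    """Upper bound on the candidate sets: sum of C(q, i) for i = 1..max_size."""
    return sum(comb(q, i) for i in range(1, max_size + 1))

q = 110
print(candidates_up_to(q, q) == 2**q - 1)   # every non-empty subset of q reactions
print(candidates_up_to(q, 3))               # sets of size 1, 2 or 3 only
```
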
MCS in central metabolism of E. coli
• MCSs calculated with ‘biomass synthesis’ as the objective reaction (growth).
  – The network comprises 110 reactions and 89 metabolites.
  – The catabolic (material breakdown) part is modeled in detail.
    • Enables excretion of 5 metabolites.
    • Uptake of glucose, acetate, glycerol and succinate.
    • Growth on each substrate was tested separately.
MCS in central metabolism of E. coli
Possible applications
Structural fragility and robustness
• MCSs can be used for “risk assessment” in metabolic pathways.
  – More EMs suggest a more robust and less fragile pathway.
• The number of EMs and the size of the MCSs are strongly correlated (more elements must fail).
• We seek a better criterion.
Glucose is known to be the least fragile growth substrate, having the most EMs and apparently the longest MCSs.
‘Dangerous’ MCSs
Possible applications
Structural fragility and robustness
Definition – The reaction fragility factor $F_i$ is the reciprocal of the average size of all the MCSs in which reaction i participates.

$$F_8 = \frac{1}{\dfrac{3+2+3}{3}} = \frac{3}{8}$$
Possible applications
Structural fragility and robustness
The fragility factor $F_i$ may suggest the reaction’s importance.
Possible applications
Structural fragility and robustness
Is there a correlation between $F_i$ and the number of EMs in which reaction i participates?
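Possible applications
Structural fragility and robustness
Definition – The network fragility F is defined as

$$F = \frac{1}{q}\sum_{i=1}^{q} F_i$$

where q is the number of reactions.

$$F = \frac{1 + \frac{1}{3} + \frac{1}{3} + \frac{1}{3} + \frac{1}{2} + \frac{3}{8} + \frac{3}{8} + \frac{3}{8} + 1}{9} \approx 0.514$$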
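Both fragility measures can be recomputed from a list of MCSs, as in the sketch below. Several things here are assumptions: the function names are made up, the MCS list was derived from the em_obR table of the running example (only the first five sets appear explicitly on the slides), and the trivial cut set {obR} is added so that q = 9, matching the nine terms in the sum above.

```python
from fractions import Fraction

# MCSs derived from the em_obR table, plus the trivial cut set {obR}
# (assumption: obR counts as the ninth reaction).
MCS = [{1}, {5, 6}, {5, 7}, {5, 8},
       {2, 4, 6}, {2, 4, 7}, {2, 4, 8},
       {3, 4, 6}, {3, 4, 7}, {3, 4, 8},
       {"obR"}]
REACTIONS = [1, 2, 3, 4, 5, 6, 7, 8, "obR"]

def fragility_factor(reaction, mcs_list):
    """Reciprocal of the average size of the MCSs that contain the reaction.

    Assumes the reaction takes part in at least one MCS.
    """
    sizes = [len(m) for m in mcs_list if reaction in m]
    return Fraction(len(sizes), sum(sizes))

def network_fragility(reactions, mcs_list):
    """Average of the reaction fragility factors over all reactions."""
    return sum(fragility_factor(r, mcs_list) for r in reactions) / len(reactions)

print(fragility_factor(8, MCS))
print(float(network_fragility(REACTIONS, MCS)))
```
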
Possible applications
Network verification and mutant phenotype predictions.
• Cutting an MCS is predicted to leave a metabolic pathway dysfunctional.
• Apply the algorithm with ‘growth’ as obR.
  – If a set of gene deletions (or mutants) contains an MCS, a non-viable phenotype is expected.
    • A viable phenotype would be a false negative.
      – Proof of an incorrect or incomplete network.
  – Otherwise growth is possible.
    • A non-viable phenotype would be a false positive.
      – May suggest a false assumption in the network structure.
        » One of the reactions in the MCS might be of regulatory nature.
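The prediction rule above amounts to a subset test, sketched here with made-up names. The MCS list is a small excerpt from the running example and stands in for MCSs computed with ‘growth’ as obR; deletions are given as reaction numbers for simplicity.

```python
# Excerpt of the running example's MCSs, used as a stand-in for growth MCSs.
GROWTH_MCS = [{1}, {5, 6}, {5, 7}, {5, 8}, {2, 4, 6}]

def predicts_non_viable(deleted, mcs_list=GROWTH_MCS):
    """A deletion set is predicted lethal if it contains at least one MCS."""
    return any(m <= deleted for m in mcs_list)

print(predicts_non_viable({5, 7, 3}))   # contains the MCS {5, 7}
print(predicts_non_viable({2, 4}))      # no MCS is fully contained
```

Comparing such predictions with observed mutant phenotypes is what would flag the false negatives and false positives described above.
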
Possible applications
Target identification and repressing cellular functions.
MCSs offer a theoretical tool for target identification in drug discovery.
  – An irreducible set of interventions needed for pathway dysfunction.
  – Usually we will look for an MCS of minimal size.
  – Other pathways should be weakly affected.
    • Can be checked easily: look at the set of untouched EMs.
    • MCS 0, 2, 3, 4 will not affect EM1.