Post on 30-May-2018
8/14/2019 Thesis: Approximation of a Cell Cycle Model
Approximation of a cell cycle model
Master Thesis
Chaiyut Thanukaew
M. Sc. Course Automation and Robotics
Technical University of Dortmund
Dept. of Biochemical and Chemical Engineering
Chair of Process Dynamics and Operations
First Supervisor: Prof. Dr.-Ing. Sebastian Engell
Second Supervisor: Dipl.-Ing. Tobias Claus Neymann
Declaration
I hereby declare and confirm that the master thesis
Approximation of a cell cycle model
is entirely the result of my own work except where otherwise indicated.
Dortmund, May 15th, 2009 Signature
Abstract
Models of biological systems are typically large and complex. They are built on hypotheses about molecular mechanisms. The dynamics of the system are described by a set of first-order nonlinear ODEs whose kinetic parameters are tuned to shape the responses, so their solutions can be computed systematically. This makes the genetic algorithm (GA) a natural tool for model reduction. The GA maps each state of the model to a binary bit indicating whether that state is included in or excluded from a candidate reduced model. A predefined number of candidate reduced models is initially generated at random. They are subsequently improved through the evolutionary processes of selection and reproduction, so that better reduced models (measured by a cost value) are expected to be found. These processes are repeated until a desired solution is found. However, there are some difficulties in applying the GA to reduce a biological system; in particular, the resulting reduced model must remain interpretable with respect to physical reality. In this work, for example, the reduced cell cycle model of budding yeast must preserve the mass state; otherwise it is not viable, because a yeast cell without mass does not make sense. It is hoped that the results obtained from applying the GA to reduce the yeast cell cycle model will be useful to interested readers.
Acknowledgements
This work could not have been completed without the full support and valuable comments of my supervisor, Dipl.-Ing. Tobias Neymann. I greatly appreciate his generous help during the working period. I would also like to thank Dipl.-Inform. Tomas Tometski for his good suggestions regarding the GA. I sincerely thank Prof. Dr.-Ing. Sebastian Engell for the opportunity to work with the department.
Many thanks to my friends, cousins, brothers and sisters, and other relatives in my country for their ready support and encouragement. They not only share my joy but also stand beside me when problems arise. Thanks to all my friends in Germany for their kind welcome and warm hospitality from the beginning to the end.
Finally, without the love and dedication of my parents, I could not have reached this point. The footprints they have left behind give me hints for creating my own path. Whenever I am discouraged, they always say: "Where there is a will, there is a way."
Contents
List of Figures
List of Tables
1 Introduction
2 Cell Cycle of Budding Yeast
2.1 Cell Cycle
2.2 Mathematical Model of Budding Yeast
3 Genetic Algorithm (GA)
3.1 Introduction to Genetic Algorithms
3.2 Individual Representation
3.3 Objective Function
3.4 Selection Methods
3.5 Genetic Operations
3.5.1 Crossover
3.5.2 Mutation
3.6 Replacement Strategy
3.7 Advantages and Limitations of the GA
4 Implementation Methods
4.1 Jacobian-based Local Refinement (JLR)
4.2 Constant Input Response (CIR)
4.3 Proposed Methods
4.3.1 Representation of the Given Problem and Objective Function
4.3.2 Implementation Using a Genetic Algorithm
4.4 Application to a Cell Cycle Model of Budding Yeast
4.4.1 Reduced Model: Comparison between Substitution of Eliminated States by Zero- and Mean-valued Constants
4.4.2 Reduced Model with Parameter Estimation
5 Results and Discussion
5.1 Jacobian Index Table
5.2 Constant Input Response (CIR)
5.3 Preconditions for Generating Feasible Candidates in the First Generation
5.3.1 Overview of Initialization of the GA
5.3.2 Preconditions for Initialization of the GA
5.4 Reduced Model with Zero- and Mean-valued Substitution
5.4.1 Overview and Setting Simulation Parameters
5.4.2 Results
5.4.3 Discussion regarding the GA's Viewpoint
5.4.4 Discussion regarding the Cell Cycle Model
5.5 Reduced Model with Parameter Estimation
5.5.1 Overview and Setting Simulation Parameters
5.5.2 Results and Discussion
6 Conclusion and Future Works
Bibliography
Appendix A
Appendix B
Appendix C
List of Figures
2.1: Cell cycle of budding yeast
2.2: Consensus diagram of the budding yeast
3.1: A GA cycle
3.2: The pseudo-code of the conventional genetic algorithm
3.3: Example of individuals with binary encoding
3.4: Example of a map for the TSP where each circle represents a city and all connected transitions represent a feasible route
3.5: Example of individuals with permutation encoding
3.6: Roulette-wheel selection
3.7: Situation before ranking (graph of fitness proportion)
3.8: Situation after ranking (graph of order numbers)
3.9: One-point crossover
3.10: Uniform crossover
3.11: Mutation
4.1: Pseudo-code to find the CIR
4.2: Example of an ODE system with several state variables in which a problem arises when an eliminated state is substituted with zero
4.3: Original index vector and index vector with parameter estimation
5.1: Pseudo-code to find the Jacobian Index Table
5.2: Original mass state and mass states with different constants added to Sic1
5.3: CIR of the full model generated using 28 values of added constant inputs
5.4: CIR of the full model generated using 10 values of added constant inputs obtained from the median-valued substitution
5.5: Results of the GA for 2. Reduced model with mean-valued substitution: Case #1
5.6: Results of the GA for 2. Reduced model with mean-valued substitution: Case #2
5.7: Results of the GA for 2. Reduced model with mean-valued substitution: Case #3
5.8: Cell cycle of the original full model
5.9: Cell cycle of the reduced model S1 from Table 5.8
5.10: Cell cycle of the reduced model S2 from Table 5.8
5.11: (a) Histogram of the parameter Kez (original value = 0.3)
(b) Histogram of the parameter Kez2 (original value = 0.2)
(c) Histogram of the threshold of Esp1 (original value = 0.1)
(d) Histogram of the threshold of ORI, BUD and SPN (original value = 1)
List of Tables
2.1: Equations
5.1: Jacobian Index Table (Excluding the Diagonal Elements) of the Cell Cycle Model of Budding Yeast
5.2: Added constant inputs and median values for each group
5.3: Mean values of all states in the ODE system
5.4: Status of the model when each bit of the index s is reset to 0: (a) with zero substitution, and (b) with mean-valued substitution
5.5: Parameter settings of the GA (substitution of an eliminated state with zero)
5.6: Parameter settings of the GA (substitution of an eliminated state with a mean-valued constant)
5.7: Results of the GA comparing the reduced models with zero- and mean-valued substitution
5.8: Resulting reduced models from several simulations
5.9: Parameter settings of the GA with parameter estimation
5.10: Results of the GA with parameter estimation
Chapter 1
Introduction
To gain deep insight into a complex system, a mathematical model is needed to describe what is going on in the system. The model may be based on first principles, on data (black box), or on a combination of both approaches. The most important objective is to obtain an accurate model that explains the input-output relationship, or the mechanisms by which the system behaves and by which its internal components react to each other and to perturbing signals.
Models, especially of biological systems, can be large and complex, comprising many state variables; examples are a cell cycle model of budding yeast and a model of the mammalian circadian clock. In these models, the responses of a cell to internal and external signals are controlled by networks of proteins. Mathematical modeling based on biochemical rate equations provides a precise and reliable tool for explaining what happens within the cell and the cell network. To make such models easier to handle, a mathematical technique called model reduction is applied. Model reduction offers several advantages, for instance lowering the computational effort, making an otherwise infeasible simulation feasible, and reducing data transfer in online applications.
Biological systems are at work in every moment of daily life, for instance in cells within human bodies, animal cells, plant cells and tiny organisms like yeast. The sequence of events from the growth of a cell until it divides into two daughter cells is called a cell cycle. To study what happens inside a cell and the relationships among its internal components during the cell cycle, many control mechanisms have to be taken into account. Finding an accurate model of the cell cycle has been a challenging task for researchers for decades. By studying a comparable organism in detail, we may come to understand, for example, how cancer cells copy themselves so rapidly. That is why studying the budding yeast cell cycle model plays a significant role and is worth gaining insight into.
Yeast cells, simple single-celled eukaryotes, undergo a cell division cycle similar to that of human cells. There are many yeast species, and some are crucial model organisms in modern cell biology research. They are, moreover, the most thoroughly researched eukaryotic microorganisms. Researchers have used them to gather information about the biology of other eukaryotic cells and ultimately about human biology. Chapter 2 goes into more detail on the cell cycle model of budding yeast.
Nonlinear systems occur in nature more commonly than linear ones. One of the greatest difficulties in solving nonlinear systems is that it is generally not possible to combine known solutions into new solutions. In linear problems, independent solutions can be used to construct general solutions through the superposition principle. For this reason, approaches to solving nonlinear differential equations are extremely diverse, and the methods of solution or analysis are problem dependent.
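The failure of superposition can be checked on a small example. The sketch below is only an illustration and is not part of the cell cycle model: the two test equations, dx/dt = -x^2 and dx/dt = -x, and all names are chosen here as assumptions. It evaluates the ODE residual for the sum of two exact solutions in each case.

```python
import math

def residual_nonlinear(x, dxdt):
    """Residual of dx/dt = -x**2; it is zero when x solves the ODE."""
    return dxdt + x ** 2

def residual_linear(x, dxdt):
    """Residual of dx/dt = -x; it is zero when x solves the ODE."""
    return dxdt + x

t = 1.0

# Two exact solutions of dx/dt = -x**2 have the form x(t) = 1 / (t + c).
x1, dx1 = 1.0 / (t + 1.0), -1.0 / (t + 1.0) ** 2
x2, dx2 = 1.0 / (t + 2.0), -1.0 / (t + 2.0) ** 2

# Two exact solutions of dx/dt = -x have the form y(t) = a * exp(-t).
y1, dy1 = math.exp(-t), -math.exp(-t)
y2, dy2 = 3.0 * math.exp(-t), -3.0 * math.exp(-t)

# The sum of the linear solutions still satisfies its ODE (residual 0),
# while the sum of the nonlinear solutions leaves the residual 2 * x1 * x2.
nonlinear_sum_residual = residual_nonlinear(x1 + x2, dx1 + dx2)
linear_sum_residual = residual_linear(y1 + y2, dy1 + dy2)
```

For the nonlinear equation the residual of the sum equals the cross term 2 * x1 * x2, which is exactly the part that superposition cannot absorb.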
Model reduction, which yields a less complicated model than a given original model, is an important tool in many areas of research such as combustion, chemical plants and biological modeling. The models in those areas are mainly nonlinear. Besides the GA, there are other methods for model reduction, for example proper orthogonal decomposition in model order reduction (MOR), lumping similar state variables together, eliminating states that are insensitive to parametric perturbation, and leaving out redundant state variables. A model reduction technique should be chosen to suit the given problem. The most recent approaches to nonlinear model reduction use mathematical programming techniques in which state variables are removed from the model without seriously degrading its accuracy.
In this work, we apply an optimization technique, the genetic algorithm (GA), to reduce the order of the cell cycle model of budding yeast (model reduction). The algorithm takes its ideas from biological evolution: selection, recombination and mutation are used to breed the next generation, and after breeding we hope to find better and better candidates, which serve as parents for breeding again in further generations. The GA has been known for decades; however, it has become widely used in recent years because of advances in the computing resources available for such calculations (K. F. Man et al., 1999). Model reduction by the GA is suited to problems that require searching a huge number of possible solutions. Another advantage of the GA is that it is easily implemented for parallel computation, so the computational time may decrease significantly. Further details of the GA are discussed in Chapter 3. The GA does not, however, guarantee globally optimal solutions for a given problem. Its advantage, rather, is that it returns satisfactory results within an acceptable time.
In this work, we apply the GA to the cell cycle model of budding yeast, a set of first-order nonlinear ordinary differential equations. The goal is to remove as many states as possible while preserving some key characteristics (normally responses to certain signals) of the original model. Furthermore, a few key restrictions must be preserved in the reduced model, such as retaining the mass state (a cell cycle model without mass does not make sense) and satisfying the viability criteria, which reflect the physical reality of the budding yeast cell cycle. The full implementation is discussed in Chapter 4: Implementation Methods.
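The restriction that the mass state must always be kept can be pictured with a binary index vector, one bit per state. The following is a minimal sketch under assumptions made here for illustration: the function names, the short list of state names and the convention "1 = kept, 0 = eliminated" are hypothetical, and the actual implementation is the one described in Chapter 4.

```python
import random

def random_index_vector(n_states, protected, rng):
    """Draw a random 0/1 index vector and force the protected states to 1 (kept)."""
    bits = [rng.randint(0, 1) for _ in range(n_states)]
    for i in protected:
        bits[i] = 1  # a protected state, e.g. mass, may never be eliminated
    return bits

def kept_states(bits, names):
    """Return the names of the states that remain in the reduced model."""
    return [name for bit, name in zip(bits, names) if bit == 1]

# Only the first few states are listed, purely for illustration.
names = ["mass", "Cln2", "Clb5", "Clb2", "Sic1"]
rng = random.Random(0)
bits = random_index_vector(len(names), protected=[0], rng=rng)
reduced = kept_states(bits, names)
```

Forcing the protected bit at generation time is one possible design choice; rejecting or penalizing candidates that drop the mass state would be another.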
Chapter 2
Cell Cycle of Budding Yeast
In this chapter, the fundamentals of the cell cycle of budding yeast are explained for two reasons: first, to become familiar with the model we are going to work with, and second, to understand the crucial criteria in the model.
2.1 Cell Cycle
Biological systems are at work in every moment of daily life, for instance in cells within human bodies, animals, plants and microorganisms like yeast. The sequence of events from the growth of a newborn cell until it divides into two cells is called a cell cycle. After years of study of budding yeast, a simple single-celled eukaryote, the knowledge obtained has proved beneficial for understanding cell proliferation in multicellular plants and animals.
The molecular machinery of eukaryotic* cell cycle control is known in more detail for budding yeast, Saccharomyces cerevisiae, than for any other organism (K. C. Chen et al., 2000). In eukaryotic cells, the cell cycle or cell-division cycle is the process by which a cell divides to form two new cells. The process comprises not only division but also replication. A growing mother cell replicates all its essential components, e.g. DNA, and later divides them more or less equally between two daughter cells. Each daughter cell thus contains the machinery and information required to repeat the cell-division process again and again.
The eukaryotic cell cycle is divided into four phases: G1, S, G2 and M. These phases are defined as follows:
- G1 or Gap1 phase: the cell grows.
- S or DNA synthesis phase: the cell makes copies of its chromosomes. Each chromosome now consists of two sister chromatids.
- G2 or Gap2 phase: the cell checks the duplicated chromosomes and gets ready to divide.
- M or Mitosis phase: the cell separates the copied chromosomes to form two full sets (mitosis) and divides into two new cells.
In other words, S and M are the two key phases of the cell cycle, separated by the two waiting/checking gaps G1 and G2.
*A eukaryote is an organism whose cells have a visible nucleus, i.e. all cellular organisms except prokaryotes. A prokaryote is a cell whose DNA is not contained within a nucleus, e.g. bacteria and blue-green algae; its cell cycle occurs via a process termed binary fission.
Figure 2.1: Cell cycle of budding yeast (L. Calzone et al., 2009)
The figure shows the cell cycle of budding yeast. After division, a single daughter yeast cell grows in the G1 phase until it becomes big enough at the start position; then the cell starts budding. In the S phase, the most important genetic component, the DNA within the chromosomes, must be accurately replicated. After the replicated chromosomes are checked in the G2 phase, the cell is ready for segregation if there is no mistake. In the M phase, the cell carries out chromosome separation and cell division (cytokinesis). The process finally yields two approximately even daughter cells. However, the cell cycle may not be completed if some conditions fail, e.g. if the cell cannot reach the critical size or there is DNA damage in the G1 phase, or if DNA is damaged, DNA is not replicated, or the chromosomes are not aligned at the M phase.
Mitosis plays a key role in the cell cycle. It is further divided into four stages: prophase, metaphase, anaphase and telophase. Mitosis is the process in cell division by which the nucleus divides, resulting in two new nuclei, each of which contains a complete copy of the parental chromosomes.
The next question is how we know which mechanisms take place at what time in a cell cycle. The answer involves certain elements that act as triggers or reference levels indicating which stage is in operation. In the budding yeast system, cyclins are accepted as the regulatory substances responsible for this role. A cyclin is a protein that is active in regulating the cell cycle. Its concentration typically fluctuates because of synthesis and degradation at specific points during the cell cycle, and it regulates the cycle by binding to its partner, a cyclin-dependent kinase (Cdk). The name cyclin reflects the fact that its concentration varies in a cyclical fashion during the cell cycle.
observed behavior of the chemical reaction system. If a set of rate constants can be found for which the solutions fit the observations, then the mechanism is provisionally confirmed (subject to further experimental investigation). If not, inconsistencies will identify aspects of the mechanism that require revision and further testing. Although a mechanism can be disproved if it is inconsistent with well-established facts, it can never be proved correct, because new observations may force modifications and additions (K. C. Chen et al., 2004).
Thus, the budding yeast cell cycle model of K. C. Chen et al. (2004) was built by trial and error, with manually tuned kinetic parameters, to obtain an appropriate mathematical model. The molecular mechanisms of the budding yeast model are represented by a set of 36 nonlinear ordinary differential equations plus 25 algebraic equations.
The consensus diagram of budding yeast shown below, its explanation, and the subsequent equations in Table 2.1 are taken from K. C. Chen et al., 2004 (the equations, parameters and initial conditions are given in full in Appendix B). Further details can be obtained from the reference. Viability rules are integrated into the model to ensure that it describes the biology and some mutants correctly; they are not intended as a general test for budding yeast models.
Figure 2.2: Consensus model of the budding yeast (K. C. Chen et al., 2004)
The figure shows the consensus model of the cell cycle control mechanism in budding yeast. The diagram should be read from bottom left toward top right. In the wiring diagram, Cln2 stands for Cln1 and 2, Clb5 for Clb5 and 6, and Clb2 for Clb1 and 2; furthermore, the kinase partner of the cyclins, Cdc28, is not shown explicitly. There is an excess of Cdc28, and it combines promptly with cyclins as soon as they are synthesized. Newborn daughter cells must reach a critical size before they have enough Cln3 and Bck2 to activate
the transcription factors MBF and SBF which synthesize two cyclins, Cln2 and Clb5. Cln2 is basically
responsible for bud emergence and Clb5 for initiating DNA synthesis. Clb5-dependent kinase activity is not
immediately active because the G1-phase cell is full of cyclin-dependent kinase inhibitors (CKI; namely, Sic1
and Cdc6). After the CKIs are phosphorylated by Cln2/Cdc28, they are rapidly degraded by SCF, releasing
Clb5/Cdc28 to do its job. A fourth class of mitotic cyclins, denoted Clb2, are out of the picture in G1
because their transcription factor Mcm1 is inactive, their degradation pathway Cdh1/APC is active, and their
stoichiometric inhibitors CKI are abundant. Cln2- and Clb5-dependent kinases remove CKI and inactivate
Cdh1, allowing Clb2 to accumulate, after some delay, as it activates its own transcription factor, Mcm1.
Clb2/Cdc28 turns off SBF and MBF. (Clb5/Cdc28 is probably the other down-regulator of MBF.) As
Clb2/Cdc28 drives the cell into mitosis, it also sets the stage for exit from mitosis by stimulating the synthesis
of Cdc20 and by phosphorylating components of the APC (see text for details). Meanwhile, Cdc20/APC is
kept inactive by the Mad2-dependent checkpoint signal responsive to unattached chromosomes. When the
replicated chromosomes are attached, active Cdc20/APC initiates mitotic exit. First, it degrades Pds1,
releasing Esp1, a protease involved in sister chromatid separation. It also degrades Clb5 and partially Clb2,
lowering their potency on Cdh1 inactivation. In this model, Cdc20/APC promotes degradation of a
phosphatase (PPX) that has been keeping Net1 in its unphosphorylated form, which binds with Cdc14. As the
attached chromosomes are properly aligned on the metaphase spindle, Tem1 is activated, which in turn
activates Cdc15 (the endpoint of the MEN signal-transduction pathway in the model). When Net1 gets
phosphorylated by Cdc15, it releases its hold on Cdc14. Cdc14 (a phosphatase) then does battle against the
cyclin-dependent kinases: activating Cdh1, stabilizing CKIs, and activating Swi5 (the transcription factor for
CKIs). In this manner, Cdc14 returns the cell to G1 phase (no cyclins, abundant CKIs, and active Cdh1).
A cell cycle may not be completed if some mechanisms are abnormal. When the model is simulated with the equations, parameter values and initial conditions in Appendix B (only the ODEs are shown in Table 2.1), the simulated model is considered viable if the following rules are fulfilled:
(A) The model must execute the following events in order; otherwise the model is considered inviable.
1. Origin re-licensing (due to a drop in [Clb2]+[Clb5] below Kez2),
2. Origin activation (due to a subsequent rise in [Clb2]+[Clb5], causing [ORI] to increase above 1),
3. Spindle alignment (due to a rise in [Clb2], causing [SPN] to increase above 1),
4. Esp1 activation ([Esp1] increases above 0.1, due to Pds1 proteolysis at anaphase), and
5. [Clb2] dropping below a threshold Kez to trigger nuclear division.
(B) The model is inviable if division occurs in an "unbudded cell" (i.e. when [BUD] never reaches 1 in the cycle).
(C) The cell cycle should be stable such that the root mean square deviation of all variables is 10.
These viability criteria are crucial since, in later tasks, feasible reduced models must comply with them; otherwise they are considered to have failed. Note that viability criterion (C) is not used in our viability determination, because it serves to assure the stability of the whole model.
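Rules (A) and (B) amount to an ordering check on event times. The sketch below is a hypothetical illustration only: the event labels, the function name and the input format (a dictionary of first-occurrence times taken from one simulated cycle) are assumptions, and detecting the threshold crossings in a trajectory is not shown. Ties between event times are accepted here, which is also an assumption.

```python
# Required order of events for rule (A); the short names are hypothetical labels.
REQUIRED_ORDER = [
    "origin_relicensing",
    "origin_activation",
    "spindle_alignment",
    "esp1_activation",
    "nuclear_division",
]

def is_viable(event_times, bud_reached_one):
    """Return True if rules (A) and (B) hold for one simulated cycle.

    event_times maps each event name to the time at which it first occurred,
    or to None if it never occurred.
    """
    times = [event_times.get(name) for name in REQUIRED_ORDER]
    if any(t is None for t in times):
        return False  # rule (A): an event is missing
    for earlier, later in zip(times, times[1:]):
        if later < earlier:
            return False  # rule (A): events occurred out of order
    return bool(bud_reached_one)  # rule (B): no division in an unbudded cell
```

A candidate reduced model that fails this check would be treated as infeasible regardless of how well it matches the original trajectories.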
------------------------------------------------------------------------------------------------------------------------
Table 2.1: Equations
------------------------------------------------------------------------------------------------------------------------
[Equation layout not recoverable from the extracted text. The table lists one ODE, d[X]/dt = ..., for each of the 36 state variables X, in this order: mass, Cln2, Clb5, Clb2, Sic1, Sic1P, C2, C5, C2P, C5P, Cdc6, Cdc6P, F2, F5, F2P, F5P, [Swi5]T, Swi5, APC-P, [Cdc20]T, Cdc20, [Cdh1]T, Cdh1, Tem1, Cdc15, [Cdc14]T, Cdc14, [Net1]T, Net1, RENT, PPX, Pds1, Esp1, ORI, BUD, SPN. The full equations are given in Appendix B and in K. C. Chen et al., 2004.]
------------------------------------------------------------------------------------------------------------------------
Reset rules: When [Clb2] drops below Kez, we reset [BUD] and [SPN] to zero, and divide the mass between daughter cell and mother cell as follows: mass -> f * mass for the daughter, and mass -> (1 - f) * mass for the mother, with f = exp(-kg * D), where D = (1.026/kg) - 32 is the observed daughter cell cycle time as a function of growth rate. When [Clb2] + [Clb5] drops below Kez2, [ORI] is reset to 0.
Flags: Bud emergence when [BUD] = 1, start of DNA synthesis when [ORI] = 1, chromosome alignment on the spindle completed when [SPN] = 1.
------------------------------------------------------------------------------------------------------------------------
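The mass reset at division can be written out in a few lines. This is a sketch under stated assumptions: the exponent is read as -kg * D (a positive exponent would give a fraction larger than 1), the function names are hypothetical, and the growth rate used in the example, ln(2)/90 per minute (a 90-minute mass-doubling time), is an illustrative value rather than one taken from this chapter.

```python
import math

def daughter_fraction(kg):
    """Fraction f of the mass given to the daughter cell at division."""
    d = 1.026 / kg - 32.0  # observed daughter cell cycle time D
    return math.exp(-kg * d)

def divide_mass(mass, kg):
    """Split the mass at division into (daughter, mother) parts."""
    f = daughter_fraction(kg)
    return f * mass, (1.0 - f) * mass

# Illustrative growth rate only: mass-doubling time of 90 minutes (assumed).
kg = math.log(2.0) / 90.0
f = daughter_fraction(kg)
daughter, mother = divide_mass(2.0, kg)
```

With these assumptions f is below 0.5, so the daughter receives the smaller share and the two parts add up to the mass before division.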
Chapter 3
Genetic Algorithm (GA)
The Genetic Algorithm (GA) is a well-known search and optimization methodology that is useful in many applications across a variety of fields, for instance in engineering and science. The main reason for this success is undoubtedly the advances in solid-state microelectronics that led to the proliferation of widely available, low-cost and fast computers (K. F. Man et al., 1999).
The GA is based on the Darwinian principle of natural selection: survival of the fittest individuals. It was first introduced in 1975 by Holland (M. Srinivas and L. M. Patnaik, 1994) as a computational procedure called the Simple Genetic Algorithm (SGA). The GA is not mathematically oriented; instead, it possesses an intrinsic flexibility and freedom to choose desirable optima with respect to designs and specifications. The GA can also handle problems that are nonlinear, constrained or discrete, or that would otherwise require an almost unbounded computational time.
As the area of genetic algorithms is very wide, it is not possible to cover everything in these pages. The purpose of this part is to give the foundations of the GA in order to appreciate its abilities in problem solving. For a specific problem, the GA may be designed in other ways in order to solve the problem appropriately and efficiently. As normally defined, the term "Genetic Algorithm (GA)" refers to the use of binary bit strings to encode candidate solutions, while other techniques in the family are defined slightly differently: Evolutionary Strategy (ES) is used to optimize real-valued parameters; Evolutionary Programming (EP) differs from the GA and ES in that there is no exchange of genes between a pair of parents, every individual is considered a parent, and only the mutation operator is used; and in Genetic Programming (GP) the individuals are represented by a tree syntax (J. C. G. Narbona, 2006).
3.1 Introduction to Genetic Algorithms
The basic principles of the GA were first proposed by Holland (M. Srinivas and L. M. Patnaik, 1994). A large body of literature and reports has appeared since. The GA is inspired by the mechanism of natural selection, where stronger individuals are likely to be the winners in a competitive environment. Through consecutive generations of genetic evolution, fitter and fitter individuals are found. Over time, the fittest individuals, comparable to the optimal solutions of a given problem, emerge. In other words, the GA keeps improving the individuals until a defined objective value is achieved, a set number of iterations is reached, or no fitter individual has been found for some time.
The GA presumes that a feasible solution of any problem is an individual and can be represented by a set of parameters; the process of encoding a feasible solution is called representation. These parameters are regarded as the genes of a chromosome and can be constructed from binary bit strings. A positive value, generally known as a fitness or cost value, is used to reflect the degree of goodness of an individual for a given problem. It is obtained by evaluating the cost function and, depending on the problem, may be defined so that either a higher or a lower cost value indicates a better individual. The cost value allows each candidate to be evaluated quantitatively.
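A small example may make representation and cost evaluation concrete. Everything in the sketch below is a toy assumption made for illustration: the four state names, the set of "important" states, the error function and the weight 0.1 are hypothetical, and lower cost is taken to be better. It is not the cost function used in this work.

```python
def decode(bits, names):
    """Return the names of the states whose bit is 1 (kept in the candidate)."""
    return [name for bit, name in zip(bits, names) if bit == 1]

# Toy assumption: dropping one of these states costs one unit of error.
IMPORTANT = {0, 2}

def toy_error(bits):
    """Toy stand-in for a model error: count the important states that were dropped."""
    return sum(1.0 for i in IMPORTANT if bits[i] == 0)

def toy_cost(bits):
    """Toy cost (lower is better): error plus a small charge per kept state."""
    return toy_error(bits) + 0.1 * sum(bits)

names = ["s0", "s1", "s2", "s3"]
candidates = [[1, 1, 1, 1], [1, 0, 1, 0], [0, 0, 0, 0]]
costs = [toy_cost(c) for c in candidates]
best = min(candidates, key=toy_cost)
```

In this toy setting the candidate that keeps only the two important states has the lowest cost, which mirrors the trade-off between accuracy and model size.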
In the first generation, a number of individuals are randomly generated as candidate solutions for a given problem. Through the fitness evaluation, each individual can be distinguished from the others by its cost value. To breed new individuals for a subsequent generation, the higher-cost individuals, acting as the parents, should get a greater chance of being selected than the lower-cost ones. In other words, the probability of each individual being chosen is proportional to its cost value. This process is known as selection. However, the GA itself is based on random operations, so there is no guarantee that an individual survives exactly in proportion to its fitness.
A pair of randomly selected individuals, called parents, plays a key role in the breeding
mechanism. To breed, the chosen parents undergo the genetic operations called crossover and
mutation. In crossover, two parents produce two offspring such that each offspring takes some gene
contents from the father and the rest from the mother. When the crossover is finished, random
changes with predetermined probabilities are introduced into the gene contents. These changes are
called mutation.
The expectation of the GA is that the average fitness of the population will increase with each
generation and that, by repeating the processes of selection, crossover and mutation for some
number of generations, (very) good solutions to the problem will be discovered.
The processes applied to the individuals selected from a prior generation improve the newborn
individuals only by chance. However, a number of the newborn individuals should be better than the
previous ones, since they are not arbitrarily generated but created according to the strategies of
selection, crossover and mutation. The algorithm can yield a satisfactory solution in the end if it
keeps producing better and better individuals. Otherwise, the algorithm is aborted if there is no
improvement of the individuals for some period or a predefined number of iterations is reached.
Figure 3.1: a GA cycle (K. F. Man et al., 1999)
The GA has proven itself to be a powerful and successful problem-solving strategy. It has been
applied in a wide variety of fields, e.g. music generation, genetic synthesis, VLSI technology,
strategy planning and machine learning (M. Sriniva and L.M. Paynaik, 1994).
As depicted in Figure 3.1, the algorithm starts at the block Population, where the GA randomly
generates individuals until the set number of individuals is reached. The encoded individuals
(genotypes) are transformed back into the original problem context (phenotypes) in order to
determine their fitness values from the objective function. Then, the selection chooses the
individuals to be the parents according to a predetermined strategy. Each pair of parents undergoes
the genetic operations, crossover and mutation, to yield the required number of offspring. The
fitness values of the offspring are then calculated in turn. Lastly, the population is replaced by
the offspring according to a chosen replacement strategy.
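The cycle described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the implementation used in this work: the function name run_ga, the parameter values, and the toy fitness (the number of 1-bits in the string) are all chosen here for demonstration.

```python
import random

def run_ga(fitness, n_bits, pop_size=20, generations=50,
           p_mut=0.02, n_elite=2, seed=0):
    """Minimal binary GA: evaluation, selection, crossover, mutation, replacement."""
    rng = random.Random(seed)
    # Population block: random bit strings (genotypes)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluate all individuals and sort them, best first
        pop.sort(key=fitness, reverse=True)
        scores = [fitness(ind) for ind in pop]
        # Replacement with elitism: the best individuals are copied unchanged
        new_pop = [ind[:] for ind in pop[:n_elite]]
        while len(new_pop) < pop_size:
            # Fitness-proportionate selection (tiny offset avoids all-zero weights)
            p1, p2 = rng.choices(pop, weights=[s + 1e-9 for s in scores], k=2)
            # One-point crossover producing a single child
            cut = rng.randint(1, n_bits - 1)
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation with a small probability per bit
            child = [b ^ 1 if rng.random() < p_mut else b for b in child]
            new_pop.append(child)
        pop = new_pop
    return max(pop, key=fitness)

# Toy problem (assumption): maximize the number of 1-bits in a 16-bit string
best = run_ga(fitness=sum, n_bits=16)
print(best, sum(best))
```

Because the elite individuals are copied unchanged, the best fitness in the population can never decrease from one generation to the next in this sketch.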
The pseudo-code of the conventional genetic algorithm is shown in Figure 3.2.
Evolutionary Algorithm ()
{
    // start with an initial time
    t := 0;
    // initialize a usually random population of individual
    init_population P(t);
    // evaluate fitness of all initial individuals of population
    evaluate P(t);
    // evolution cycle
    while not terminated do
        // increase the time counter
        t := t + 1;
        // select a sub-population for offspring production
        P'(t) := select_parent P(t) ;
        // recombine the genes of selected parents
        recombine P'(t) ;
        // perturb the mated population stochastically
        mutate P'(t) ;
        // evaluate its new fitness
        evaluate P'(t) ;
        // select the survivors from actual fitness
        P := survive P, P'(t) ;
    od
}
Figure 3.2: The pseudo-code of the conventional genetic algorithm (K. F. Man et al., 1999)
3.2 Individual Representation
The most crucial element of the EA structure is the encoding mechanism used to represent
feasible solutions for a given problem. Objects forming feasible solutions within the original
problem context are called phenotypes; their encodings, the individuals within the GA, are called
genotypes. The representation step specifies the mapping from the phenotypes onto a set of
genotypes.
The problems to be tackled vary from one to another, and the coding of the individuals representing
the candidate solutions varies with the nature of the problem. So far, the representation as a bit
string is the most classic method used by the GA because of its simplicity and traceability. There
are also other representation methods better suited to certain problems than binary bit strings,
e.g. the order-based representation, which is particularly useful for problems in which a sequence
of items has to be found; one of the most well-known problems of this kind is the Traveling
Salesman Problem (TSP).
After a proper representation method is chosen for the problem, the GA operations, crossover and
mutation, also need to be developed on the basis of this representation. For example, consider the
TSP (a salesman needs to minimize the total travelling distance of visiting all n cities). The
conventional crossover method, in which the genetic content of an offspring consists of one part
taken from its mother and another from its father, will not work, because a permutation encoding is
applied for the TSP and the conventional crossover would cause an element to appear more than once
within the offspring.
A few representation methods for the GA and other optimization techniques are given below. There
are many more encoding methods. However, the encoded result must be in a form that a computer can
process, since the GA is a population-based technique for searching for optimal (satisfying)
results in a huge search space and relies heavily on the computer for the computational effort.
Representation methods and examples
1. Encoding the solution as binary strings: Binary encoding is the most common for the GA. A
feasible solution is an array in which each element is a binary number, 0 or 1. One example of
binary encoding is the Knapsack Problem: there are things with given values and sizes, and the
objective is to maximize the value of the things in the knapsack without exceeding the knapsack's
capacity. In the encoding of a feasible solution, each bit indicates whether the corresponding
thing is in the knapsack or not, e.g. Figure 3.3.
Figure 3.3: Example of individuals with binary encoding
2. Encoding the solution as the permutation order: For some specific problems, each position of a
fixed-length array represents some particular aspect of a feasible solution. It might be an integer
or a decimal number. One of the most classic instances in which a feasible solution is encoded as a
permutation order is the Traveling Salesman Problem (TSP). For a given number of cities and the
distances between each departure city and destination city, the salesman needs to visit all of them
while minimizing the total traveling distance.
Thus, a sequence of cities is encoded as a fixed-length array, and different sequences (routes)
return different distances.
The example map of 10 cities for the TSP is depicted in Figure 3.4. Two different scenarios for the
permutation encoding are shown in Figure 3.5. If they represent feasible individuals for the TSP,
Individual A means the salesman travels first from city 1 to city 5 and so on
(1→5→3→2→6→4→7→9→8), whereas Individual B starts from city 8, goes to city 5 and so on
(8→5→6→7→2→3→1→4→9). The two individuals therefore return different distances, indicating which
route is shorter.
Figure 3.4: Example of a map for the TSP, where each circle represents a city and all connected
transitions represent a feasible route
Figure 3.5: Example of individuals with permutation encoding
3. Encoding the solution as an array of real values: Every individual is an array of some values.
Value encoding is very good for some special problems. On the other hand, this encoding often makes
it necessary to develop new crossover and mutation operators specific to the problem. One example
would be finding the weights for a neural network. Assume there is a basic neural network which
consists of one input layer and one output layer, where the output is a function of the input
connected via one hidden layer. The weights to be trained to produce the wanted output can be
encoded as real values.
3.3 Objective Function
An objective function is a measuring mechanism used to evaluate how good a feasible solution is. It
is a very important link between the GA and the system concerned. Each candidate individually goes
through the same evaluation process, and the resulting value varies from one candidate to another.
However, a proper objective function must be written carefully. The objective function must
evaluate all generated individuals by the same standard; furthermore, there must be no conflict
causing any doubt about the quality of the individuals.
To make this point clear, consider an individual that is applied to the problem and for which a
fitness value is calculated in order to reflect its performance. When all individuals have been
tested, the relative ability of this individual can be identified. For example, assume there are 4
individuals in each generation, and the fitness value of individual a is calculated along with
those of the others. Then, all fitness values in this generation are normalized, and the relative
ability of individual a is determined by its normalized portion. When this evaluation process is
applied to the whole population in the GA by the same standard, the ability of individual a can be
identified.
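As a small worked example of the normalization described above, the following Python sketch computes the normalized portions for 4 individuals. The raw fitness values are made-up numbers chosen only for illustration.

```python
def normalize(fitness_values):
    """Divide each fitness value by the sum so that the portions add up to 1."""
    total = sum(fitness_values)
    return [f / total for f in fitness_values]

raw = [8.0, 2.0, 6.0, 4.0]   # made-up fitness values of individuals a, b, c, d
shares = normalize(raw)
print(shares)                # the first entry is the portion of individual a
```

With these numbers, individual a holds 8 out of 20 fitness units, so its relative ability within this generation is 40 %.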
3.4 Selection Methods
To generate good offspring, a good parent selection mechanism is important. Selection is the
process of determining the number of trials in which a particular individual is used in
reproduction. The selection may behave in a deterministic or in a randomized manner, depending on
the algorithm chosen and its application-dependent implementation. Moreover, elitist selection may
be incorporated. The chance of selecting an individual as a parent should be directly proportional
to the number of offspring it is expected to produce. There are many techniques which the EA can
use to select the individuals to be copied into the next generation; some of the most common
methods are listed below.
Selection methods
1. Fitness-proportionate selection or Roulette-wheel selection: Fitter individuals are more likely,
but not certain, to be selected. Conceptually, this can be presented as a game of roulette: each
individual gets a slice of the wheel, but fitter ones get larger slices than less fit ones. The
wheel is then spun, and whichever individual owns the section on which it lands each time is
chosen.
Figure 3.6: Roulette-wheel selection
However, there are some drawbacks to the roulette-wheel selection. If there is an outstanding
individual dominating most of the roulette wheel, the generated offspring
are likely to converge to a certain region of the search space around this individual. In another
circumstance, if the fitness values of all individuals in the wheel are close to each other, the
selection pressure between parents with a high fitness and parents with a low one is low.
Therefore, the generated offspring do not improve much.
2. Rank selection: The individuals in the population are sorted and each is assigned a numerical
rank based on its fitness, and the selection is based on this ranking rather than on the absolute
difference in fitness. The fitness assigned to each individual depends only on its position in the
ranking, not on the actual objective value. The advantage of this method is that it can prevent
very fit individuals from gaining dominance early at the expense of less fit ones. Such early
dominance would reduce the population's genetic diversity and might hinder attempts to find
satisfying solutions in a limited number of generations.
Figure 3.7 shows the situation before ranking for a case with 4 individuals in which Individual#1
is very fit and thus dominates most of the area in the fitness-proportion pie chart. The result
after ranking is depicted in Figure 3.8.
Figure 3.7: Situation before ranking (graph of fitness proportion)
Figure 3.8: Situation after ranking (graph of order numbers)
3. Tournament selection: Subgroups of individuals are chosen from the larger population, and
members of each subgroup compete against each other. Only one individual from each subgroup is
chosen to reproduce.
4. Stochastic universal selection: The individuals are assigned to contiguous sections of a line
exactly as in the roulette-wheel selection. The difference is that a number of pointers are
distributed evenly over the line. The number of individuals to be selected (n) corresponds to the
number of pointers. The distance between the pointers is then 1/n, and the position of the first
pointer is given by a randomly generated number in the range [0, 1/n]. The individuals whose
segments contain the pointers are selected.
5. Elitist selection: A few of the fittest members of each generation are guaranteed to be
selected. Normally, the GA does not use pure elitism; rather, a chosen selection strategy is
combined with elitism to assure convergence of the solution if no better offspring is introduced in
a succeeding generation.
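The selection methods listed above can be sketched in Python as follows. The function names and the example fitness values are assumptions made for illustration; each function returns indices into the fitness list rather than the individuals themselves.

```python
import random

def roulette(fitness, rng):
    """Return the index chosen by one spin of a fitness-proportionate wheel."""
    spin = rng.uniform(0, sum(fitness))
    running = 0.0
    for i, f in enumerate(fitness):
        running += f
        if spin <= running:
            return i
    return len(fitness) - 1  # guard against floating-point round-off

def rank_weights(fitness):
    """Replace raw fitness by rank: the worst gets 1, the best gets len(fitness)."""
    order = sorted(range(len(fitness)), key=lambda i: fitness[i])
    ranks = [0] * len(fitness)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks

def tournament(fitness, k, rng):
    """Draw k distinct contenders at random and return the fittest of them."""
    contenders = rng.sample(range(len(fitness)), k)
    return max(contenders, key=lambda i: fitness[i])

def stochastic_universal(fitness, n, rng):
    """n pointers spaced 1/n apart on the normalized line; first one in [0, 1/n]."""
    total = sum(fitness)
    start = rng.uniform(0, 1 / n)
    picks, running, i = [], fitness[0] / total, 0
    for p in (start + j / n for j in range(n)):
        # Advance to the segment that contains pointer p
        while p > running and i < len(fitness) - 1:
            i += 1
            running += fitness[i] / total
        picks.append(i)
    return picks

rng = random.Random(0)
fit = [70, 10, 10, 10]            # one dominating individual, as in Figure 3.7
print(roulette(fit, rng), roulette(rank_weights(fit), rng))
print(tournament(fit, 2, rng), stochastic_universal(fit, 4, rng))
```

Applying roulette to the output of rank_weights gives rank selection: the dominating individual then owns 4/10 of the wheel instead of 70/100.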
3.5 Genetic Operations
Crossover and mutation are the two basic operators of the GA. The performance of the GA also
depends on appropriately chosen crossover and mutation schemes. The type and implementation of the
genetic operators depend on the encoding method and on the type of problem.
3.5.1 Crossover
The crossover occurs in the GA when two parents are selected for reproduction. It is sometimes
called recombination. The crossover is the process of combining the attributes of two chosen
individuals. The result is two new offspring in which, by chance, the good attributes of the mother
and of the father are combined, so that their fitness values are better than those of their
parents.
There are a few common crossover methods, as follows:
1. Single-point, two-point and multi-point crossover: This is the most common crossover method and
was inspired by biological processes. The crossover point is randomly generated and then applied to
the two selected parents. The first offspring copies the part before the crossover point from
Parent#1 and combines it with the part after the crossover point taken from Parent#2. The second
offspring is created analogously.
[Diagram labels: Parent #1, Parent #2, Offspring #1, Offspring #2, Crossover point]
Figure 3.9: One point crossover
Similar to the single-point crossover, the two-point crossover is done by randomly choosing 2
crossover points. The multi-point crossover can be implemented analogously. The single-point and
two-point crossovers are thus special cases of the multi-point one.
However, the single-point crossover has a major drawback in some situations, as shown in K. F. Man
et al., 1999. For example, assume that there are two high-performance schemes:
S1 = 1 0 1 * * * * 1
S2 = * * * * 1 1 * *
There are two individuals in the population, I1 and I2, matched by S1 and S2 respectively:
I1 = 1 0 1 1 0 0 0 1
I2 = 0 1 1 0 1 1 0 0
If the single-point crossover is performed, it is impossible to obtain an individual that is
matched by the following scheme (S3), as the first scheme (S1) will be destroyed.
S3 = 1 0 1 * 1 1 * 1
A multi-point crossover can overcome this problem, resulting in a greatly improved performance of
offspring generation.
Assuming that the two-point crossover is used instead of the single-point crossover in this case,
it is applied to I1 and I2. The resulting offspring are I3 and I4, of which I3 is matched by S3.
I1 = 1 0 1 1 | 0 0 | 0 1
I2 = 0 1 1 0 | 1 1 | 0 0
I3 = 1 0 1 1 1 1 0 1
I4 = 0 1 1 0 0 0 0 0
2. Uniform crossover: This crossover method generates the offspring based on a randomly generated
crossover mask. The operation is demonstrated in Figure 3.10. A mask bit of 1 means the
corresponding bits of Parent#1 and Parent#2 are swapped; otherwise there is no swapping.
[Diagram labels: Parent #1, Parent #2, Mask, Offspring #1, Offspring #2]
Figure 3.10: Uniform crossover
The resulting offspring contain a mixture of bit contents from each parent. The number of effective
crossing points is not fixed, but it is L/2 on average (where L is the bit length of each
individual).
3.5.2 Mutation
After crossover, the offspring are subjected to mutation. For a binary string, mutation means
flipping the mutated bits to the opposite values, i.e. changing a bit from 0 to 1 or vice versa.
The bits of a binary string are mutated independently. The purpose of mutation is to prevent
candidate solutions from converging to local minima. For example, suppose all the strings in a
population have converged to a 0 at a given position and the optimal solution has a 1 at that
position; then crossover cannot regenerate a 1 at that position, while mutation can. By changing
some elements to new random values according to a mutation probability, mutation raises the genetic
diversity of the candidates in the population. Usually, the mutation rate is small.
Figure 3.11: Mutation
The mutation concept was originally designed only for binary-represented individuals. In K. F. Man
et al., 1999, the concept is adapted and also applied to real-number individuals, e.g. a random
mutation is designed as:
$g = g + \psi(\mu, \sigma)$ (3.1)
where $g$ is the real-number element in the individual; $\psi$ is a random function (e.g. a
Gaussian or normal distribution); and $\mu$, $\sigma$ are respectively the mean and variance of the
random function.
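Both mutation variants can be sketched in Python as follows. The function names and parameters are illustrative assumptions. Note that random.gauss expects a standard deviation as its second argument, so the sigma passed here is a standard deviation, not a variance.

```python
import random

def bit_flip(bits, p_mut, rng):
    """Flip each bit independently with probability p_mut."""
    return [b ^ 1 if rng.random() < p_mut else b for b in bits]

def gaussian_mutation(genes, p_mut, mu, sigma, rng):
    """Add a normally distributed step to each gene selected for mutation.

    sigma is the standard deviation expected by random.gauss.
    """
    return [g + rng.gauss(mu, sigma) if rng.random() < p_mut else g
            for g in genes]

rng = random.Random(0)
print(bit_flip([0, 0, 0, 0, 0, 0, 0, 0], 0.1, rng))
print(gaussian_mutation([0.5, 1.5, -2.0], 0.5, 0.0, 0.1, rng))
```

Keeping p_mut small leaves most genes untouched, which matches the remark that the mutation rate is usually small.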
3.6 Replacement Strategy
According to Figure 3.1, once the generation of the sub-population (offspring) is completed, the
old generation has to be replaced, and several replacement strategies exist. In the case of
generational replacement, all individuals in the present generation are replaced by the new
offspring. Therefore, a population of size N needs N offspring for replacement with this strategy.
However, this strategy might fail to generate offspring better than the best current individual.
So, it is usually combined with an elitist strategy in which one or a few of the best individuals
are copied into the subsequent generation. Elitism may increase the speed at which an outstanding
individual dominates the population, but it appears to improve the performance on average.
In another variant of the generational replacement, not all of the offspring are used for the
replacement. Only some offspring (usually the better ones) are used to replace individuals in the
population.
Since producing a large number of offspring requires a lot of computation in each generational
cycle, another scheme is to generate only a small number of offspring. Normally, the worst
individuals are replaced when new offspring are put into the population. A direct replacement of
the parents by their corresponding offspring may also be adopted. Another option is to replace the
oldest individuals, i.e. those which have remained in the population for a long time. However, this
might discard the best long-lived individual.
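Two of the replacement strategies above can be sketched in Python. The function names, the elite size, and the toy fitness (the number of 1-bits) are assumptions for illustration; higher fitness is treated as better.

```python
def generational_with_elitism(population, offspring, fitness, n_elite=1):
    """Replace the population by offspring, but keep the n_elite best parents."""
    elite = sorted(population, key=fitness, reverse=True)[:n_elite]
    survivors = sorted(offspring, key=fitness,
                       reverse=True)[:len(population) - n_elite]
    return elite + survivors

def replace_worst(population, offspring, fitness):
    """A few offspring replace the same number of worst individuals."""
    kept = sorted(population, key=fitness,
                  reverse=True)[:len(population) - len(offspring)]
    return kept + list(offspring)

pop = [[1, 1, 1], [0, 0, 0], [1, 0, 0]]
off = [[0, 1, 0], [0, 0, 1], [1, 1, 0]]
print(generational_with_elitism(pop, off, sum))
print(replace_worst(pop, [[1, 1, 0]], sum))
```

In the first call the best parent [1, 1, 1] survives even though no offspring is as fit, which is the case the elitist strategy is meant to cover.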
3.7 Advantages and Limitations of the GA
Many features make the GA popular for solving optimization problems.
Firstly, the GA itself is not a mathematically oriented optimization strategy, even though
mathematical formulations are sometimes required, e.g. to determine a cost function. Thus, it is
not hard to comprehend the algorithm and then apply it to a given problem.
Secondly, the candidate solutions of a given problem can easily be encoded as binary bit strings
(the ordinary GA). Therefore, no specific knowledge of the problem is needed.
Since the candidate solutions in the GA are distinct from one another, parallel computation can be
used to evaluate the candidate solutions simultaneously. Hence, the computational time can be
significantly reduced.
Lastly, the GA can be used to solve multi-objective problems. Real-world optimization problems
often cannot be stated in terms of a single value to be minimized or maximized. Instead, they are
expressed in terms of multiple objectives, which normally involve tradeoffs such that one objective
cannot be improved without worsening another. The GA can find good results in the multi-objective
scenario by choosing some feasible solutions in the search space and then developing them by
selection, crossover and mutation in order to achieve better solutions.
On the other hand, the GA has some limitations.
From the representation point of view, the ordinary GA is not well suited to problems whose
solutions are better defined as strings of real values; finding the weight values for the nodes in
a hidden layer of a neural network is a good example (Chapter 3.2). Therefore, other optimization
techniques related to the GA have been introduced, for instance the Evolutionary Strategy (ES),
Evolutionary Programming (EP) and Genetic Programming (GP), which are mentioned earlier in this
chapter.
The most important limitation is that the GA does not guarantee that the optimal solution to the
problem will be found. Instead, it returns a satisfying solution once a predetermined stopping
condition is met, e.g. a set number of iterations is reached, no better solution has been found for
some time, or a result with a predefined cost value is obtained.
Other limitations concern the choice of parameter values and GA mechanisms for a specific problem.
For example, if one individual is much fitter than the others, the offspring generated in the
subsequent generations will be dominated by this individual. The candidate solutions then converge
prematurely and may become stuck in a local minimum.
Chapter 4
Implementation Methods
In this chapter, the Jacobian-based Local Refinement (JLR) and the Constant Input Response (CIR)
are discussed first, since they are later used in the proposed methods. The proposed methods show
how the material of the previous chapters is combined in order to reduce the given model.
4.1 Jacobian-based Local Refinement (JLR)
Mathematically, the Jacobian is shorthand for the Jacobian matrix. The Jacobian matrix is the
matrix of all first-order partial derivatives of a vector-valued function.
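Suppose a system function F is given by m real-valued component functions y1(x1, x2, ..., xn), ...,
ym(x1, x2, ..., xn). The partial derivatives of all these functions (if they exist) can be written
in an m-by-n matrix, the Jacobian matrix J of F, as follows:
$$J = \begin{bmatrix} \dfrac{\partial y_1}{\partial x_1} & \cdots & \dfrac{\partial y_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial y_m}{\partial x_1} & \cdots & \dfrac{\partial y_m}{\partial x_n} \end{bmatrix}$$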
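Example: the Jacobian matrix of the function F(x1, x2, x3, x4) with the components
$$\begin{aligned} \dot{x}_1 &= x_1 \\ \dot{x}_2 &= 7 x_4 \\ \dot{x}_3 &= 4 x_2^2 - 2 x_3 \\ \dot{x}_4 &= x_3 \sin(x_1) + 3 x_4 \end{aligned}$$
is
$$J_F(x_1, x_2, x_3, x_4) = \begin{bmatrix} \dfrac{\partial \dot{x}_1}{\partial x_1} & \dfrac{\partial \dot{x}_1}{\partial x_2} & \dfrac{\partial \dot{x}_1}{\partial x_3} & \dfrac{\partial \dot{x}_1}{\partial x_4} \\ \dfrac{\partial \dot{x}_2}{\partial x_1} & \dfrac{\partial \dot{x}_2}{\partial x_2} & \dfrac{\partial \dot{x}_2}{\partial x_3} & \dfrac{\partial \dot{x}_2}{\partial x_4} \\ \dfrac{\partial \dot{x}_3}{\partial x_1} & \dfrac{\partial \dot{x}_3}{\partial x_2} & \dfrac{\partial \dot{x}_3}{\partial x_3} & \dfrac{\partial \dot{x}_3}{\partial x_4} \\ \dfrac{\partial \dot{x}_4}{\partial x_1} & \dfrac{\partial \dot{x}_4}{\partial x_2} & \dfrac{\partial \dot{x}_4}{\partial x_3} & \dfrac{\partial \dot{x}_4}{\partial x_4} \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 7 \\ 0 & 8 x_2 & -2 & 0 \\ x_3 \cos(x_1) & 0 & \sin(x_1) & 3 \end{bmatrix}$$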
The Jacobian of a vector-valued function also carries information about the relationships among
the states of the system. Thus, we can apply the result in the initial steps of the model reduction
in order to reduce the subsequent computational effort. As this method is taken from S. R. Tayler
et al., 2008, we provisionally call it Jacobian-based Local Refinement (JLR).
The JLR can remove some redundant computations, depending on the user's decision: some states can
be left out of the calculation if it is found that their presence or absence has no effect on the
rest of the system.
In the example above, the Jacobian shows that
- the first state (x1) is solely a function of itself; it is called an input-only state because
its solution is used as an input for other states, and
- the second state (x2) is solely a function of the variable x4; it is called an output-only
state, meaning that this state depends completely on variables taken from other states.
In the model reduction, these two types of state, input-only and output-only, are important. We can
substitute an input-only state by a constant whose value is the solution of that state, and an
output-only state can be discarded from the reduced model since it can be obtained indirectly from
other states. In other words, these states are extraneous from the reduced model's viewpoint.
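The classification described above depends only on which Jacobian entries are non-zero, so it can be sketched on a Boolean structure matrix. The following Python sketch encodes the structure of the four-state example and applies the two definitions as they are worded in this section; the function name classify and the label "regular" for all remaining states are assumptions made here.

```python
# Structural Jacobian of the four-state example: entry [i][j] is True when
# the derivative of state i+1 depends on state j+1.
J = [
    [True,  False, False, False],   # x1' depends on x1 only
    [False, False, False, True],    # x2' depends on x4 only
    [False, True,  True,  False],   # x3' depends on x2 and x3
    [True,  False, True,  True],    # x4' depends on x1, x3 and x4
]

def classify(J):
    """Label each state as input-only, output-only or regular."""
    labels = {}
    for i, row in enumerate(J):
        others = any(dep for j, dep in enumerate(row) if j != i)
        if row[i] and not others:
            labels[i + 1] = "input-only"    # function of itself only
        elif others and not row[i]:
            labels[i + 1] = "output-only"   # function of other states only
        else:
            labels[i + 1] = "regular"
    return labels

print(classify(J))
```

For the example this marks state 1 as input-only and state 2 as output-only, in agreement with the discussion of the Jacobian above.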
4.2 Constant Input Response (CIR)
The basic idea of this task is taken from "Oscillator Model Reduction Preserving the Phase
Response: Application to the Circadian Clock" by F. J. Doyle et al., 2008. In circadian clock
models, the phase response curve (PRC) is widely used as an evaluation criterion for the models
when the light input is changed in its duration and acuteness. F. J. Doyle et al., 2008, employ a
so-called Parametric Impulse Phase Response Curve (pIPRC), which measures the phase shift of a
reduced system relative to the original system when the light parameter is varied. Roughly
speaking, the pIPRC is the phase difference between a feasible reduced system and the original
system when the light input is changed. The pIPRC enters as one of the two major terms in the total
cost function of F. J. Doyle et al., 2008.
Thus, we tried to find a comparable response by perturbing parameters of interest in the cell cycle
model of budding yeast with a pulse-like signal and observing how the whole system reacts.
Unfortunately, initial experiments showed that the cell cycle model of budding yeast does not
possess this phase-difference property at all, since it is not a limit cycle oscillator model. We
therefore introduce a new term called Constant Input Response (CIR) in order to evaluate the
sensitivity of the feasible reduced models in comparison with that of the original model.
A constant input signal is added directly to the state Sic1 (state 5) in the cell cycle model of
budding yeast (K. C. Chen et al., 2004). As Sic1 is a stoichiometric CDK inhibitor, directly adding
a constant signal would physically mean adding some substance in order to boost certain activities
of the cell cycle, e.g. entering the G1-phase faster. How can we observe the response of the
budding yeast system after adding the constant signal to the Sic1 state? The
easiest answer is to measure how the mass reacts to this perturbation. The differential equation of
Sic1 (4.1) below shows how the constant is added to the state.
$$\frac{d[\mathrm{Sic1}]}{dt} = (k'_{s,c1} + k''_{s,c1}[\mathrm{Swi5}]) + (V_{d,b2} + k_{di,b2})[\mathrm{C2}] + (V_{d,b5} + k_{di,b5})[\mathrm{C5}] + k_{pp,c1}[\mathrm{Cdc14}][\mathrm{Sic1P}] - (k_{as,b2}[\mathrm{Clb2}] + k_{as,b5}[\mathrm{Clb5}] + V_{kp,c1})[\mathrm{Sic1}] + K_{added}$$ (4.1)
The constant Kadded is increased from zero up to the level at which the whole model is no longer
oscillatory or fails the viability criteria. The responses of the mass to the different values of
Kadded are used to plot the CIR.
The CIR is therefore defined as follows: for each added constant input, the CIR is the amplitude
and the time at which the mass state becomes stable, provided that the whole model still complies
with the viability criteria.
Since the CIR is a series of time-amplitude pairs in which each point corresponds to one applied
constant input, it can be plotted in two dimensions, with "Recovery Time" on the x-axis and
"Recovery Amplitude" on the y-axis. The result of applying the CIR to the full model can be found
in Chapter 5.2.
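The collection of CIR points can be sketched in Python as a simple sweep over the added constant. This is only an outline under stated assumptions: simulate is a hypothetical callable standing in for a simulation of the model, and toy_simulate is a stand-in without biological meaning that is used only to exercise the loop.

```python
def constant_input_response(simulate, k_values):
    """Collect (recovery time, recovery amplitude) until the model breaks down.

    simulate(k) is a hypothetical callable that runs the model with the
    constant input k and returns (ok, time, amplitude); ok is False once the
    model is no longer oscillatory or viable.
    """
    curve = []
    for k in k_values:
        ok, t, amp = simulate(k)
        if not ok:
            break
        curve.append((t, amp))
    return curve

# Stand-in model used only to exercise the loop; it has no biological meaning.
def toy_simulate(k):
    return (k < 0.3, 100.0 + 50.0 * k, 2.0 - k)

cir = constant_input_response(toy_simulate, [0.0, 0.1, 0.2, 0.3, 0.4])
print(cir)
```

Each tuple in the returned list is one point of the CIR plot, with the recovery time as the x-coordinate and the recovery amplitude as the y-coordinate.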
The pseudo-code to find the CIR of a given stable and viable model is as follows:
Repeat
    Vary the constant-added input
    Measure amplitude and time of the first peak of the mass state becoming stable
Until the tested model is not oscillatory or not viable
Figure 4.1: Pseudo-code to find the CIR
4.3 Proposed Methods
To investigate the mechanisms involved in the cell cycle of budding yeast, we would like to
minimize the number of states in the original model. The reduced model should still have a
closed-form expression as a system of ODEs; moreover, it must retain the biochemical interpretation
of the state vector and respond to specific input signals as the full model does.
The proposed method for model reduction returns an optimized reduced model with a minimal number of
states while preserving a key characteristic, the constant input response (CIR). Thus, the problem
is formulated as the minimization of a cost depending on the number of states and the
CIR-associated error.
4.3.1 Representation of the Given Problem and Objective function
The cell cycle model of budding yeast consists of 36 ODEs and 25 algebraic equations (K. C. Chen
et al., 2004). Considering only the ODE system, the first main objective is to reduce the number of
ODE states. To achieve that objective, an index indicating which states are included in or excluded
from the reduced model is needed. Firstly, consider the full model, which is represented by
$\dot{x} = f(x, p)$ (4.2)
where:
$\dot{x}$ = dynamics (time derivatives) of the state variables of the system,
$x$ = vector of state variables, and
$p$ = vector of parameters.
Meanwhile, the reduced system requires one more variable indicating which states are present. We
rewrite the reduced ODE system as
$\dot{x} = f(x, s, p)$ (4.3)
where $s$ is an index vector in which $s_i$ is 1 if the i-th state is included in the model (and 0
if it is excluded).
The number of states present in the model is $\sum_{i=1}^{N_f} s_i$, where $N_f$ is the number of
states in the full model. The idea for this representation of the reduced system is taken from
S. R. Tayler et al., 2008.
We want to find the reduced model by minimizing the number of states while preserving the shape of
a particular CIR. The cost of a reduced model is calculated from its size and its error (measured
by its ability to reproduce the CIR). The minimal reduced model $\dot{x} = f(x, s, p)$ is defined
such that its solution $x^*$ minimizes the cost, i.e.
$\mathrm{cost}(x^*) = \min_{s \in S} \mathrm{cost}(x)$ (4.4)
where $S$ is the set of all vectors of length $N_f$ whose entries are 0 or 1, and $x$ is the
solution of $\dot{x} = f(x, s, p)$.
The cost function $\Phi$ is defined to be negative if the reduced model fails to be periodic or violates any of the viability constraints, in which case it is regarded as an unusable model. Otherwise, it is non-negative, i.e.

\[ \Phi(x) = \begin{cases} -1, & x \text{ is not oscillatory or inviable,} \\ 1 \cdot \rho + 1 \cdot \mathrm{CIR}, & x \text{ is oscillatory and viable,} \end{cases} \qquad (4.5) \]

where $\rho$ denotes the size term and CIR the error term. We weight $\rho$ and CIR equally (each multiplied by 1), since these terms are independent: obtaining a good value for one term does not imply a good value for the other. For example, a reduced model that retains more states is likely to return a better CIR than one that retains fewer states.
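The structure of the cost in (4.5) can be sketched in a few lines of Python. This is only an illustration: the normalization of the size term as the fraction of retained states is an assumption (it keeps the term in [0, 1]), and the CIR error is taken as a precomputed number rather than derived from a simulation.

```python
def cost(s, cir_error, oscillatory, viable):
    """Cost of a reduced model, following the structure of (4.5).

    Assumption: the size term is the fraction of retained states,
    sum(s) / len(s), so that it lies in [0, 1].
    """
    if not (oscillatory and viable):
        return -1.0  # unusable model
    size_term = sum(s) / len(s)
    # Both terms are weighted by 1, as described in the text.
    return 1.0 * size_term + 1.0 * cir_error
```

For example, `cost([1, 0, 1, 0], 0.25, True, True)` combines a size term of 0.5 with the given CIR error, while any non-oscillatory or inviable candidate receives the negative cost.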
with the system's Jacobian matrix, which eliminates some extraneous states before the GA operates in order to reduce the computational cost.
Each individual in a genetic algorithm is defined by its genome, an array of genes encoding the solution. We map each gene to a decision variable, meaning that there are $N_f$ binary entries in the genome.
The implementation of the GA starts with randomly generating an initial population $P(0)$ of size $N_c$. However, some conditions prevent the initial population from being generated completely at random (the details are given in Section 5.3, Preconditions for Generating Feasible Candidates in the First Generation). Thus, preconditions must be fulfilled before the initial population is created, e.g. the mass state and the other states used to determine the viability criteria must be present.
Then, for subsequent generations, we create $N_c$ new individuals, or offspring (with non-negative cost functions). We use an elitist strategy in which the few best individuals of the previous generation are copied into the current generation; with an elitism of two, the two best individuals are copied unconditionally from the preceding generation. Elitism ensures that the best solution found so far is never lost, which supports convergence towards the optimal solution. The remaining offspring in generation $i$ are created using the genetic operators selection, crossover, and mutation according to the following algorithm:
1. Select parents $p_1$ and $p_2$ from $P(i-1)$.
2. Create offspring $c$ using uniform crossover with $p_1$ and $p_2$.
3. Mutate the genes in offspring $c$.
4. Remove the states of offspring $c$ deemed extraneous by the system Jacobian.
5. Compute the cost $\Phi$ for offspring $c$.
6. If $\Phi \geq 0$, then add offspring $c$ to $P(i)$.
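The six steps, together with the elitism of two, can be sketched as a single Python function. This is a hedged outline rather than the thesis implementation: the operators are passed in as arguments, all names are hypothetical, and the usage example below uses trivial placeholder operators only to make the fragment runnable.

```python
def next_generation(prev, cost_fn, select, crossover, mutate, refine,
                    n_c, n_elite=2):
    """Build generation P(i) from P(i-1).

    prev is a list of (genome, cost) pairs with non-negative cost;
    a lower cost means a better individual.
    """
    ranked = sorted(prev, key=lambda pair: pair[1])
    new_pop = ranked[:n_elite]              # elitism: copy the best
    while len(new_pop) < n_c:
        p1, p2 = select(ranked), select(ranked)   # step 1
        child = crossover(p1, p2)                 # step 2
        child = mutate(child)                     # step 3
        child = refine(child)                     # step 4
        c = cost_fn(child)                        # step 5
        if c >= 0:                                # step 6
            new_pop.append((child, c))
    return new_pop


# Placeholder operators (assumptions, for illustration only).
toy_prev = [([1, 1, 0, 1], 0.9), ([1, 0, 0, 1], 0.6), ([1, 1, 1, 1], 1.0)]
pop = next_generation(
    toy_prev,
    cost_fn=lambda g: sum(g) / len(g),
    select=lambda ranked: ranked[0][0],   # always picks the best genome
    crossover=lambda a, b: list(a),       # copies the first parent
    mutate=lambda g: g,                   # no mutation
    refine=lambda g: g,                   # no local refinement
    n_c=4,
)
```

Note that the loop does not terminate if no offspring with non-negative cost can be produced, so a practical implementation would need a limit on the number of attempts.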
Because the individuals are likely to be close to each other (their fitness values lie theoretically in the range [0, 2]), linear ranking selection seems to be a suitable choice among the selection strategies. Thus, we use a linear ranking selection operator: each parent is chosen with a probability proportional to its fitness rank (i.e. the fittest parent is the most likely to be chosen).
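A minimal Python sketch of rank-proportional selection is given below. It assumes that a lower cost means a fitter individual and that the selection probability is directly proportional to the rank (rank N for the best of N individuals); the function names are illustrative.

```python
import random


def rank_weights(costs):
    """Rank-based weights: the lowest cost gets the highest rank."""
    n = len(costs)
    # Sort indices from worst (highest cost) to best (lowest cost).
    order = sorted(range(n), key=lambda i: costs[i], reverse=True)
    weights = [0] * n
    for rank, idx in enumerate(order, start=1):
        weights[idx] = rank
    return weights


def linear_rank_select(population, costs, rng=random):
    """Pick one parent with probability proportional to its rank."""
    return rng.choices(population, weights=rank_weights(costs), k=1)[0]
```

With costs `[0.9, 0.6, 1.0]`, the weights are `[2, 3, 1]`, so the individual with cost 0.6 is chosen with probability 3/6. Because only ranks are used, the selection pressure stays the same even when the cost values are very close to each other.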
For reproduction, we use uniform crossover: each gene in the genome of offspring $c$ is chosen from parent $p_1$ with probability 0.5, and otherwise from parent $p_2$. Then, we mutate each gene with probability 0.1, where a mutation simply flips the state index, i.e. from inclusion to exclusion of that gene or vice versa.
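The two reproduction operators can be sketched in Python as follows, using the probabilities stated above (0.5 for crossover, 0.1 for mutation). The function names are hypothetical and the genomes are plain lists of 0s and 1s.

```python
import random


def uniform_crossover(p1, p2, rng=random):
    """Each gene is taken from p1 with probability 0.5, else from p2."""
    return [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]


def mutate(genome, rate=0.1, rng=random):
    """Flip each binary gene independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in genome]


rng = random.Random(1)  # fixed seed, only for reproducibility
child = mutate(uniform_crossover([1] * 36, [0] * 36, rng), rng=rng)
```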
Because the configuration of an individual is chosen randomly, some states may be extraneous and can be removed by the Jacobian-based local refinement method (JLR), which reduces the computational effort. The important task in the JLR is to determine which states are crucial to both the full and the reduced model, and whether or not the input-only states or the output-only states can be left out. After these determinations, the index table of the JLR must be calculated before the GA is started.
After offspring $c$ has undergone the local refinement, we determine its fitness by evaluating its cost function. If the cost is negative, the offspring is discarded and does not count toward the required $N_c$. We run the algorithm until the maximal number of generations is reached, which is 25 generations here.
4.4 Application to a Cell Cycle of Budding Yeast
The cell cycle model of budding yeast proposed by K. C. Chen et al., 2004 comprises 36 ODEs and 25 auxiliary algebraic equations, but only the 36 main ODEs need to be reduced. As an array of 36 binary digits is employed to represent each feasible reduced model, the total number of possible reduced models is exactly $2^{36}$ (approximately 69 billion) without any constraint. To cope with such a huge search space, the GA is well suited to finding a satisfying solution.
4.4.1 Reduced Model: Comparison between Substituting the Eliminated State by Zero and by a Mean-valued Constant
The experiment was first designed to find the optimal result, i.e. the one which possesses the lowest number of states while preserving the CIR curve. In the reference literature, Oscillator Model Reduction Preserving the Phase Response by S. R. Taylor et al., 2008, an excluded state is substituted by zero, meaning that the discarded state is completely cut away from the model or, in other words, does not affect anything.
When this is applied to the cell cycle of budding yeast, however, one can argue about which kind of substitution should be used. The substitution method must not introduce more constraints into the model, and the reduced model with a given substitution method should retain the same properties as the original model.
There are two feasible substitution methods: 1. substituting the eliminated state by zero, and 2. substituting the eliminated state by its mean-valued constant. We therefore compare the best solutions taken from the final reduced models. The method that returns the solution with the lower cost is adopted for further experiments.
Without a mathematical proof, one can foresee a case (in 1., substituting the eliminated state by zero) that could cause a problem for model reduction. Assume there is an ODE system consisting of several states. In the system, there is a state $m$ whose dynamics are a function of $m$ itself multiplied by another state variable $n$. Assume further that the state $n$ can be reduced; hence its dynamics as well as its variable are substituted by zeros. As a consequence, the dynamics of the state $m$ also become zero, which is not intended.
\[ \frac{dm}{dt} = m \cdot n, \qquad \frac{dn}{dt} = k \cdot n^2 \]

Figure 4.2: Example of an ODE system containing several state variables in which a problem arises when substitution of an eliminated state by zero is applied
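The effect described in this example can be illustrated with a short Python sketch. It integrates only the state m, whose dynamics are m multiplied by n, while n is held at a constant substitute. The explicit Euler scheme, the step size, the initial value and the assumed mean value of 0.5 are illustrative choices, not values from the model.

```python
def dm_dt(m, n):
    """Toy dynamics from the example: dm/dt = m * n."""
    return m * n


def euler_m(n_substitute, m0=1.0, dt=0.01, steps=100):
    """Integrate m with explicit Euler while n is held at a constant."""
    m = m0
    for _ in range(steps):
        m += dt * dm_dt(m, n_substitute)
    return m


m_zero = euler_m(0.0)  # n replaced by zero: the dynamics of m vanish
m_mean = euler_m(0.5)  # n replaced by an assumed mean value of 0.5
```

With zero substitution, m stays frozen at its initial value, whereas the mean-valued substitution keeps m evolving.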
Chapter 5
Results and Discussion
In this chapter, the simulation results are shown and discussed. First, the Jacobian-based Local Refinement (JLR) is calculated and its result is kept as an index table for later use. Then, the Constant Input Response (CIR) of the full model is shown; it is employed to calculate the total cost value of each feasible solution from the GA. Before the GA is applied, some preconditions are introduced, e.g. some states must be fixed, so that randomly generated individuals for the first generation of the GA can be found. After that, the GA answers the question of which type of substitution of the eliminated states is better. The resulting reduced model is discussed with regard to its numerical solution and the wiring diagram of the cell cycle presented in Chapter 2. Finally, the GA with parameter estimation is simulated.
5.1 Jacobian Index Table
In order to apply the Jacobian to the ODE system, the algebraic equations must first be substituted into the ODEs, because the equations must contain only variables belonging to the system itself.
The Jacobian index table is calculated first, since it facilitates the Jacobian-based Local Refinement (JLR) afterward. To create the table, we pay no attention to the value of each element of the Jacobian; we only care which elements are non-zero and which are zero.
The code for this problem is written in MATLAB; the pseudo-code to find the Jacobian index table of the model is shown below, in Figure 5.1.
// Pseudo-code for Jacobian Index Table
Substitute the algebraic equations into the ODE system
Define the variable vector
Calculate the Jacobian of the ODE system symbolically
Substitute non-zero elements of the Jacobian result matrix with 1, otherwise with 0
Calculate the Jacobian index table by subtraction of all diagonal elements: $\tilde{J} = J - \mathrm{diag}(J)$

Figure 5.1: Pseudo-code to find the Jacobian Index Table

The result, represented as the Jacobian index table, is shown in Table 5.1. As presented in the pseudo-code, an element 1 means that the Jacobian result at that position of the Jacobian matrix is non-zero, whereas an element 0 means that the Jacobian result is zero.
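The last two steps of the pseudo-code can be sketched in Python as follows. This is not the MATLAB code of the thesis: the Jacobian is assumed to be given already as a nested list (the symbolic differentiation is skipped), the helper names are hypothetical, and the 3-by-3 matrix contains placeholder entries only.

```python
def jacobian_index_table(jac):
    """Return the 0/1 pattern of `jac` with the diagonal removed."""
    n = len(jac)
    return [[1 if (i != j and jac[i][j] != 0) else 0 for j in range(n)]
            for i in range(n)]


def zero_rows(table):
    """Indices of rows whose off-diagonal entries are all zero."""
    return [i for i, row in enumerate(table) if not any(row)]


def zero_columns(table):
    """Indices of columns whose off-diagonal entries are all zero."""
    n = len(table)
    return [j for j in range(n) if not any(table[i][j] for i in range(n))]


# Hypothetical 3-state Jacobian (placeholder entries, not model values).
J = [[-1.0, 0.0, 0.0],
     [2.0, -0.5, 0.0],
     [0.0, 3.0, -4.0]]
table = jacobian_index_table(J)
```

Rows and columns that consist only of zeros in such a table are the ones inspected in the text to identify output-only and input-only states.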
Regarding the result, it is clear that the states 34, 35 and 36 are output-only states, since all elements in their respective rows are zeros. They do not affect the model at all because they do not feed any signal back into the model; hence, we can ignore them. However, the viability criteria require that the events triggered by the states 33, 34, 35 and 36 (Esp1, ORI, BUD and SPN, respectively) occur in the correct sequence, so we keep the states 34, 35 and 36 for checking the viability but do not count them toward the size of the reduced model. (The state 33, Esp1, is mentioned here because it is involved in the determination of the viability criteria but, according to Table 5.1, it is not an output-only state.) The ordering of the events in the viability criteria is discussed in Chapter 2.
The states 22, 24, 26 and 28 are clearly input-only states, since all elements in their respective columns are zeros. These input-only states can be substituted by their steady-state solutions. Consequently, if any of the states 22, 26 or 28 remains in the reduced model in the end, we do not count it. Notice that the state 24 is exceptional because it is affected by reset conditions. To make this point clear, consider two groups of equations: a. the equations of the states 22, 26 and 28, and b. the equation of the state 24.
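State 22: \[ \frac{d[\mathrm{Cdh1}]_T}{dt} = k_{s,cdh} - k_{d,cdh} \cdot [\mathrm{Cdh1}]_T \]
State 26: \[ \frac{d[\mathrm{Cdc14}]_T}{dt} = k_{s,14} - k_{d,14} \cdot [\mathrm{Cdc14}]_T \]
State 28: \[ \frac{d[\mathrm{Net1}]_T}{dt} = k_{s,net} - k_{d,net} \cdot [\mathrm{Net1}]_T \]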
For the state 22, the constant $k_{s,cdh}$ is 0.01, $k_{d,cdh}$ is 0.01 and the initial value of $[\mathrm{Cdh1}]_T$ is 1. Thus, its solution equals the constant 1. The states 26 and 28 can be explained analogously; they also return constant results.
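As a short worked check of the statement for the state 22, and assuming the linear synthesis-degradation form of its equation, inserting the given values shows that the derivative vanishes at the initial value, so the state never leaves it:

```latex
\[
\left.\frac{d[\mathrm{Cdh1}]_T}{dt}\right|_{[\mathrm{Cdh1}]_T = 1}
  = k_{s,cdh} - k_{d,cdh} \cdot [\mathrm{Cdh1}]_T
  = 0.01 - 0.01 \cdot 1
  = 0
\]
```

The steady state $k_{s,cdh} / k_{d,cdh} = 1$ therefore coincides with the initial value, which is why the solution is the constant 1.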
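The state 24 is written below:

State 24: \[ \frac{d[\mathrm{Tem1}]}{dt} = \frac{k_{lte1} \cdot ([\mathrm{Tem1}]_T - [\mathrm{Tem1}])}{J_{a,tem} + [\mathrm{Tem1}]_T - [\mathrm{Tem1}]} - \frac{k_{bub2} \cdot [\mathrm{Tem1}]}{J_{i,tem} + [\mathrm{Tem1}]} \]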
Unlike the states in the first group, it is not obvious from its equation that the state 24 is an input-only state, but the result in Table 5.1 confirms it. However, it can neither be substituted by a constant nor be discarded because of the reset conditions: $k_{bub2}$ = 1 (for [ORI] > 1 and [SPN] < 1) or 0.2 (otherwise), and $k_{lte1}$ = 1 (for [SPN] > 1 and [Clb2] > $K_{ez}$) or 0.1 (otherwise). These values can be obtained from Table 2: Parameter values and initial conditions in Appendix B.
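The two switching parameters can be written as small Python functions, which makes visible why the state 24 still depends on other states through the reset conditions. The function and argument names are hypothetical; the threshold passed as `k_ez` is left as an argument because its value is not stated in this section.

```python
def k_bub2(ori, spn):
    """k_bub2 = 1 if [ORI] > 1 and [SPN] < 1, otherwise 0.2."""
    return 1.0 if (ori > 1 and spn < 1) else 0.2


def k_lte1(spn, clb2, k_ez):
    """k_lte1 = 1 if [SPN] > 1 and [Clb2] > k_ez, otherwise 0.1."""
    return 1.0 if (spn > 1 and clb2 > k_ez) else 0.1
```

Because these parameters jump with ORI, SPN and Clb2, the state 24 does not settle to a single constant even though its row or column in the Jacobian index table is empty.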
Which of these input-only states can be left out depends further on the substitution method used. For example, the states 22, 26 and 28 can all be neglected from the reduced model in the case of the mean-valued substitution, because it makes no difference whether they exist in the reduced model (their mean values are the same as their numerical solutions).
However, in the case of substituting the eliminated state by zero, these states cannot all be left out initially. Leaving them out causes problems, e.g. leaving out only the state 24 causes the full system to be unstable (not oscillatory) and inviable (as demonstrated in Section 5.3). Based on this simulation result, we decided to keep all input-only states for the later steps in the scenario of substituting the eliminated state by zero.
The mass state (state 1) is also an input-only state, but it is crucial to the model: firstly, the model does not work properly if the mass is missing and, secondly, the model does not make physical sense if there is no mass in the budding yeast system.
The state 4 must be preserved because it triggers the mass resetting event. As mentioned in the reset rules, the mass is reset when Clb2 falls below $K_{ez}$; thus this state is also crucial and its existence must be ensured.
Finally, the state 5 (Sic1), which is involved in adding the constant input, must be preserved. Otherwise, adding a constant input does not cause any change to the system.
Table 5.1: Jacobian Index Table (Excluding the Diagonal Elements) of the Cell Cycle Model of Budding Yeast
maximal value supported by the full model even if there is a reduced model which could stand a
higher value.
The CIR is one of the crucial terms in the total cost function. A reduced model whose CIR closely resembles that of the full model yields a good value of the total cost function. The lower the CIR term, the better the reduced model (only from the point of view of the CIR). A reduced model with a very low CIR term responds to the different added constant inputs very similarly to the full model.
Figure 5.3: CIR of the full model generated from using 28 values of added constant inputs
Table 5.2: Added constant inputs and median values for each group
Figure 5.4: CIR of the full model generated from using 10 values of added constant inputs obtained from the median-valued substitution
5.3 Preconditions for Generating Feasible Candidates in the First Generation
When we first tried to apply the GA to the cell cycle model, an obstacle occurred: no feasible individual could be found for the first generation. Therefore, we add some preconditions prior to generating the individuals in order to overcome this difficulty. The preconditions, however, differ between the two cases of substitution for eliminated states.
5.3.1 Overview of Initialization of the GA
The feasible candidates are randomly generated in the first generation of the GA. However, in the simulation experiments, we could not find any candidate within several hours of running because not even a single oscillatory candidate occurred. Why does this problem exist? To answer this question, let us make clear which method we employ to generate the candidates in the first generation.
In the first generation, the code generates a feasible candidate represented by a 36-by-1 matrix whose elements are binary numbers; each element comes from rounding a number drawn from a uniform distribution in MATLAB. This means that, on average, half of the generated elements are 1s and the other half are 0s after the rounding process.
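The generation of a candidate by rounding uniform random numbers can be sketched in Python as follows. The thesis code is in MATLAB, so this is only an analogous illustration; the function name and the optional `fixed_on` argument, which forces selected states to be present, are assumptions made for this sketch.

```python
import random


def random_candidate(n_states=36, fixed_on=(), rng=random):
    """Binary index vector obtained by rounding uniform random numbers.

    `fixed_on` lists 1-based state numbers that are forced to 1.
    """
    s = [round(rng.random()) for _ in range(n_states)]
    for state in fixed_on:
        s[state - 1] = 1
    return s


# Example: keep the states 1, 4 and 5 in every candidate.
s = random_candidate(fixed_on=(1, 4, 5), rng=random.Random(0))
```

Since each bit is 1 with probability of about one half, a candidate retains roughly 18 of the 36 states on average, and any particular set of required states is present only rarely unless it is fixed in advance.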
The initial individuals in the first generation are required to be stable and viable. Given the random generation process described above, the question is whether it is appropriate. If there are constraints, e.g. the viability criteria mentioned in Chapter 2, that prevent the generating process from creating a feasible candidate arbitrarily, how can those constraints be found? There must be some state variables without which the model is not oscillatory. In order to reduce the computational effort for the initialization, we fix some states. Which states are fixed, and for what reasons, is explained in the subsequent section.
5.3.2 Preconditions for Initialization of the GA
In order to test for the crucial states, the index vector $s$ representing a feasible candidate is manually applied to the full model with only one state (bit) set to 0 at a time. The experiments are done for two cases: 1. zero substitution, and 2. mean-valued substitution. The results are shown in Figure 5.5. Note that zero substitution means substituting an eliminated state with zero, and mean-valued substitution means substituting an eliminated state with its mean value, which is shown in Table 5.3.
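The one-state-at-a-time test described above can be sketched in Python as follows. The callback `is_oscillatory` is hypothetical: it stands for a simulation of the model with the given index vector and a check whether the result still oscillates, which is not reproduced here.

```python
def single_knockouts(n_states=36):
    """Yield (state_number, s) with exactly one bit of s set to 0."""
    for i in range(n_states):
        s = [1] * n_states
        s[i] = 0
        yield i + 1, s


def crucial_states(is_oscillatory, n_states=36):
    """States whose individual removal makes the model non-oscillatory.

    `is_oscillatory(s)` is a hypothetical callback that simulates the
    model for index vector s and reports whether it still oscillates.
    """
    return [k for k, s in single_knockouts(n_states)
            if not is_oscillatory(s)]
```

Only single removals are covered by this test; combinations of removed states are left to the GA.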
In case 1, zero substitution, the results show that none of the states 1, 4, 20, 21, 24, 25, 26, 27 and 36 can be left out individually; otherwise the system becomes non-oscillatory. However, combinations are not tested because the GA itself should find out which combinations are useful. In the following, all restrictions for generating feasible candidates in the first generation for
the case of zero substituti