A system control framework for the self-fertilization and selection process of breeding


BioSystems, 24 (1991) 291-299. Elsevier Scientific Publishers Ireland Ltd.


Chen Jian, Zheng Weimin and Wang Yongxian

Systems Engineering Laboratory, School of Economics and Management, Tsinghua University, Beijing 100084 (China)

(Received November 1st, 1990)

The self-fertilization and selection process is the main method for speeding up the purification of a hybrid population in breeding. It is a complex combinatorial random process with a large time delay. To raise the control effectiveness and efficiency of the process, we construct in this paper a mathematical model of the process by means of effective factors. We then present a control framework for the process, which breeders can use as a guide in selecting appropriate control actions.

Keywords: Self-fertilizing and selection process of breeding; Modelling; Simulation; Control.

1. Introduction

In crop breeding, one of the most useful methods is crossbreeding. Consider two varieties which differ in the alleles they bear at $N$ loci, $A_i$ - $a_i$, one thus being $A_iA_i$ and the other $a_ia_i$, $i = 1, 2, \ldots, N$. Cross-breeding creates wide segregation in the second filial generation ($F_2$). With self-fertilizing and progeny selection, a new variety which has the better characteristics of both its parents may be obtained. The self-fertilizing and selection process of cross-breeding is a combinatorial random process based on the segregation and recombination of the genes that control characters. Most characters of crops are quantitative; they are controlled by polygenes and strongly affected by environmental agents. Studies on quantitative characters show that there are many types of gene effects, which are related to different population distributions. For lack of a powerful control method, the process is at present controlled by experienced breeders and differs from breeder to breeder. Generally, it takes many years to obtain a new variety.

In this paper, according to the characteristics of the problem, we try to establish a control framework for the self-fertilizing and selection process in breeding. It is based on a distribution model that describes the process of selection, with consideration of cost and effectiveness in every generation. First, we extend the concept of an effective factor (Mather, 1982) so that it can represent not only the additive effect of a gene but also the dominance deviation and epistatic deviation, and the relations between various types of effective factors with different distribution forms. Some parameters are employed to express the various effects of a gene, and a model of the $F_2$ distribution can be built. Then, the model of the self-fertilizing and selection process is presented. Based on this model, we propose a control framework for the breeding process.

0303-2647/91/$03.50 © 1991 Elsevier Scientific Publishers Ireland Ltd. Published and Printed in Ireland

2. Distribution model and parameter identification

The first step in crossbreeding is to cause a wide segregation in the $F_2$ population, that is, to produce sufficient types for selection and so provide a basis for the self-fertilizing and selection process. In this section, we briefly present the distribution model of an $F_2$ population by means of effective factors, together with the method of parameter estimation. First, let us make some assumptions:

(1) An effective factor is the effect of a single gene, which has equal effect at different loci (Mather, 1982).

(2) There is no linkage among loci (Mather, 1982).

(3) The parents are two pure lines, and they are the extreme plus and minus types (Mather, 1982).

(4) There are $N$ effective factors in a character, which can be divided into three groups, with $0 \le K_1, N_2, N_3 \le N$ and $K_1 + N_2 + N_3 = N$. $K_1$ effective factors provide only an additive effect $A_1$. $N_2$ effective factors provide an additive effect $A_2$ and also contribute a dominance deviation $D$; they are decomposed into $K_2$ and $K_3$, the numbers of effective factors having positive and negative dominance, respectively, with $0 \le K_2, K_3 \le N_2$ and $K_2 + K_3 = N_2$. $N_3$ effective factors produce an additive effect $A_3$ and an epistatic deviation $I$; they consist of $2K_4$ and $2K_5$ factors, with $0 \le 2K_4, 2K_5 \le N_3$ and $2K_4 + 2K_5 = N_3$, where $K_4$ and $K_5$ denote the numbers of pairs of effective factors having dominant and recessive epistasis, respectively.

Thus a character is controlled by a set of effective factors that satisfy

$$0 \le K_1, K_2, K_3, K_4, K_5 \le N, \qquad K_1 + K_2 + K_3 + 2K_4 + 2K_5 = N \tag{1}$$

Under the preceding assumptions, the phenotypic value $x$ of a character can be expressed as

$$x = G + E \tag{2}$$

where $G = A + D + I$, $A = A_1 + A_2 + A_3$, $E$ is the environmental deviation with normal distribution $N(0, \sigma^2)$, and $E$, $A$, $D$, $I$ are mutually independent.

Based on the assumptions, we can obtain the probability density function of the $F_2$ population as follows (for details see Chen (1990)):

$$f_{F_2}(\theta, x) = \sum_{i=0}^{M} pr(\theta, X_i)\, \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(x - X_i)^2}{2\sigma^2}\right) \tag{3}$$

where $M = 2K_1 + K_2 + K_3 + 2K_4 + 2K_5$, $X_i = \min(p_1, p_2) + |p_1 - p_2| \times i/M$ is the $i$th phenotypic value without considering environmental effects, $p_1$ and $p_2$ are the phenotypic values of the parents, $\theta = [K_1, K_2, K_3, K_4, K_5]^T$ is the parameter vector and $pr(\theta, X_i)$ can be determined as

$$pr(\theta, X_i) = \sum \frac{(2K_1)!\, K_2!\, K_3!\, K_4!\, K_5!}{i_1!\,(2K_1 - i_1)!\; i_2!\,(K_2 - i_2)!\; i_3!\,(K_3 - i_3)!\; j_1!\, j_2!\,(K_4 - j_1 - j_2)!\; l_1!\, l_2!\,(K_5 - l_1 - l_2)!} \times \Pi(i_1, i_2, i_3, j_1, j_2, l_1, l_2)$$

where the sum runs over all non-negative integers $i_1, \ldots, i_5$, $j_1, j_2$, $l_1, l_2$ with $i_1 + i_2 + i_3 + i_4 + i_5 = i$, $2j_1 + j_2 = i_4$ and $2l_1 + l_2 = i_5$, and $\Pi(\cdot)$ stands for a product of powers of the single-factor segregation probabilities in the same indices (its exponents are not legible in this copy; see Chen (1990)).

By curve fitting, the parameters in model (3), that is, the numbers of the various types of effective factors, can be determined by comparing model (3) with the distribution function obtained from observed data. To this end we define a loss function $V$ (equation (4)) in terms of $f_{F_2}(\theta, x)$ and $\hat{f}_{F_2}(x)$, the theoretical and empirical probability density functions, respectively.

The value of $\theta$ is determined by minimizing $V$ (for details see Chen (1990)).
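The fitting step can be illustrated with a minimal Python sketch. It evaluates the normal-mixture density of equation (3) on the phenotype grid $X_i$ and scores a candidate weight vector against an empirical density. The class weights `pr` are taken as given, the mean-squared-error loss is an assumption standing in for $V$ (whose exact form is not reproduced here), and all numerical values and function names are hypothetical.

```python
import math

def phenotype_grid(p1, p2, M):
    """X_i = min(p1, p2) + |p1 - p2| * i / M for i = 0..M."""
    lo, span = min(p1, p2), abs(p1 - p2)
    return [lo + span * i / M for i in range(M + 1)]

def mixture_pdf(x, pr, X, sigma):
    """Density of the form of equation (3): pr-weighted normal curves centred on X_i."""
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return sum(p * norm * math.exp(-(x - xi) ** 2 / (2.0 * sigma ** 2))
               for p, xi in zip(pr, X))

def squared_error_loss(pr, X, sigma, xs, f_emp):
    """ASSUMED loss: mean squared gap between model and empirical density on xs."""
    return sum((mixture_pdf(x, pr, X, sigma) - fe) ** 2
               for x, fe in zip(xs, f_emp)) / len(xs)

# Illustrative values only (not the paper's data).
X = phenotype_grid(93.0, 135.0, 4)
pr = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]        # hypothetical class weights
sigma = 5.0
xs = [60.0 + 0.5 * k for k in range(221)]            # 60 .. 170 cm
f_emp = [mixture_pdf(x, pr, X, sigma) for x in xs]   # synthetic "observations"
print(squared_error_loss(pr, X, sigma, xs, f_emp))   # zero at the generating weights
```

In practice the loss would be evaluated for each admissible $\theta$ and the minimizer retained; since $\theta$ is a small integer vector, an exhaustive search over the combinations allowed by equation (1) is one plausible way to do this.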

3. Model of self-fertilizing and selection process

Because of dominance and epistatic relations among genes and the influence of environmental agents, a desirable set of individuals selected in $F_2$ may segregate again in their progenies. Many methods are used in crop breeding to purify hybrid populations; the most popular and useful one is self-fertilizing and progeny selection. There are also many approaches to selecting individuals from a hybrid population. For convenience of mathematical expression, we consider only the one called truncation (Jain, 1982), as illustrated in Fig. 1. Truncation selection can be expressed as follows:

$$s(t) = \int_a^b f_t(x)\,dx \tag{5}$$

where $s(t)$ is the proportion of individuals selected from population $t$, $f_t(x)$ is the probability density function of population $t$, and $a$, $b$ are the lower and upper selecting bounds, respectively: in case (a), $a = -\infty$, $b = x_1$; in case (b), $a = x_1$, $b = \infty$; in case (c), $a = x_1$, $b = x_2$.
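For a population whose density is a normal mixture, the integral in equation (5) reduces to differences of normal distribution functions. The sketch below is one way to compute it in Python; the function names and the example numbers are hypothetical, and the mixture form is an assumption carried over from the $F_2$ model.

```python
import math

def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def selected_fraction(pr, X, sigma, a=-math.inf, b=math.inf):
    """s(t): integral of a normal-mixture density between bounds a and b."""
    total = 0.0
    for p, xk in zip(pr, X):
        total += p * (normal_cdf((b - xk) / sigma) - normal_cdf((a - xk) / sigma))
    return total

# Hypothetical three-class population.
pr = [0.25, 0.5, 0.25]
X = [90.0, 100.0, 110.0]
sigma = 5.0
case_a = selected_fraction(pr, X, sigma, b=100.0)           # a = -inf, b = x1
case_b = selected_fraction(pr, X, sigma, a=100.0)           # a = x1,  b = +inf
case_c = selected_fraction(pr, X, sigma, a=95.0, b=105.0)   # a = x1,  b = x2
print(case_a, case_b, case_c)
```

The three calls correspond to cases (a), (b) and (c); leaving a bound at its default reproduces the one-sided cases.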

Fig. 1. Illustrations of truncation selection: cases (a), (b) and (c), with truncation points $x_1$ and $x_2$.

Before deriving the model of the self-fertilizing and selection process, let us make some assumptions and define some notation:

(1) A character is controlled by a set of effective factors $\theta$, which can be estimated by minimizing equation (4).

(2) Selection begins from the $F_2$ population; we consider the $F_i$ population, $i \ge 2$, with $f_{F_i}(\theta, x)$ (abbreviated as $f_i(\theta, x)$) given.

(3) Suppose that there are $M_x$ possible genotypes for a character, and that the frequency $q_j$ of genotype $A_j$ in the $F_i$ population can be calculated from $f_i(\theta, x)$, $j = 1, 2, \ldots, M_x$, with $0 \le q_j \le 1$ and $\sum_{j=1}^{M_x} q_j = 1$.

(4) Without considering the effect of environmental agents, the phenotype $B_k$, which comprises $n_k$ genotypes denoted $A_{k,l}$, $l = 1, 2, \ldots, n_k$, with $\sum_{k=0}^{M} n_k = M_x$, has value $X_k$, and its frequency is denoted $pr^i(X_k)$, $k = 0, 1, 2, \ldots, M$.

(5) The effect of environmental agents does not vary between generations.

Having made the above assumptions, the model of the self-fertilizing and selection process can be deduced as follows. According to the properties of self-fertilization, the segregation of each selected individual can be derived separately; that is, the genotypes among the selected individuals can be considered one by one and their proportions then summed. First, we determine the proportion of phenotype $B_k$ in the set of selected individuals:

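$$pr_s^i(X_k) = \frac{pr^i(X_k)}{S_{F_i}} \int_{a_i}^{b_i} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(x - X_k)^2}{2\sigma^2}\right) dx \tag{6}$$

where $a_i$ and $b_i$ are the lower and upper bounds of selection in the $F_i$ population and $S_{F_i}$ is a unified (normalizing) coefficient

$$S_{F_i} = \sum_{k=0}^{M} pr^i(X_k) \int_{a_i}^{b_i} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(x - X_k)^2}{2\sigma^2}\right) dx \tag{7}$$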
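The proportion of each phenotype class among the selected individuals can be sketched as follows. The code assumes that the unified coefficient $S_{F_i}$ is the sum of the unnormalized class weights, so that the selected proportions add up to one; that reading, the function names and the example numbers are assumptions rather than the authors' stated procedure.

```python
import math

def class_pass_prob(xk, sigma, a, b):
    """Probability that a plant of noise-free value xk is observed inside [a, b]."""
    cdf = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return cdf((b - xk) / sigma) - cdf((a - xk) / sigma)

def selected_class_proportions(pr, X, sigma, a, b):
    """Return (proportions of each class after selection, normalizing coefficient)."""
    weights = [p * class_pass_prob(xk, sigma, a, b) for p, xk in zip(pr, X)]
    s_fi = sum(weights)                      # assumed form of the unified coefficient
    return [w / s_fi for w in weights], s_fi

# Hypothetical three-class population selected in the window [95, 105].
pr = [0.25, 0.5, 0.25]
X = [90.0, 100.0, 110.0]
props, s_fi = selected_class_proportions(pr, X, 5.0, 95.0, 105.0)
print(props, s_fi)
```

Under this reading the normalizing coefficient equals the selected fraction $s(i)$ of equation (5), which gives a quick consistency check.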

Then, the proportion of genotype $A_{k,l}$ in the selected set having phenotype $B_k$ can be computed (equation (8)) from $q_{k,l}$, the proportion of genotype $A_{k,l}$ in the $F_i$ population, and the unified coefficient $d_k$:

$$d_k = \sum_{l=1}^{n_k} q_{k,l} \tag{9}$$

From the assumptions, there is no linkage among genes, so we can treat every type of effective factor in a genotype separately and then combine them by means of convolution. The probability $pr^{i+1}(X_j)$ of phenotypic value $X_j$ ($j = 1, 2, \ldots, M$) in the $F_{i+1}$ population, without considering the effect of environmental agents, is given by equation (10) (the details are omitted here and can be found in Chen (1989)) in terms of $pr^{i+1}_{k,l}(X_j)$, the contribution produced by the individuals having genotype $A_{k,l}$ selected from the $F_i$ population, which can be obtained by convolution:

$$pr^{i+1}_{k,l}(X_j) = \left[ pr_{K_1}(X_{j_1}) \otimes pr_{K_2}(X_{j_2}) \otimes pr_{K_3}(X_{j_3}) \otimes pr_{K_4}(X_{j_4}) \otimes pr_{K_5}(X_{j_5}) \right] q_{k,l} \tag{11}$$

where $pr_{K_1}(\cdot)$, $pr_{K_2}(\cdot)$, $pr_{K_3}(\cdot)$, $pr_{K_4}(\cdot)$ and $pr_{K_5}(\cdot)$ are the probability functions produced by the $K_1$ class, $K_2$ class, $K_3$ class, $K_4$ class and $K_5$ class effective factors, respectively (for details see Chen (1989)). Considering the effect of environmental agents, we have

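$$f_{i+1}(\theta, x) = \sum_{k=0}^{M} pr^{i+1}(\theta, X_k)\, \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(x - X_k)^2}{2\sigma^2}\right) \tag{12}$$

In equation (12), we suppose that the environmental variance is constant across generations. In general, we have

$$f_{i+1}(\theta, x) = \sum_{k=0}^{M} pr^{i+1}(\theta, X_k)\, \frac{1}{\sqrt{2\pi}\,\sigma_{i+1}} \exp\!\left(-\frac{(x - X_k)^2}{2\sigma_{i+1}^2}\right) \tag{13}$$

where $\sigma_{i+1}^2$ is the environmental variance in the $F_{i+1}$ population.

From the derivation, the probability density function of the $F_{i+1}$ population is determined uniquely by the $F_i$ population and the selection action, so that equation (13) can be expressed as

$$f_{i+1}(\theta, x) = g(f_i(\theta, x), c(i)) \tag{14}$$

where $c(i)$ is the selection action or control variable (that is, the upper and lower bounds of selection) for the $F_i$ population.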

We call equation (14) the probability density function transition equation. We conclude this section with some important parameters based on equation (12).
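To make the idea of a transition from $F_i$ to $F_{i+1}$ concrete, the following Python sketch implements one generation of truncation selection followed by self-fertilization for a deliberately simplified case: purely additive, unlinked loci (that is, $K_2 = K_3 = K_4 = K_5 = 0$). It is not the paper's full model, which also covers dominance and epistasis; the state representation, the function names and the choice of three loci are assumptions made for illustration. The parent heights and environmental variance are borrowed from the simulation example only to give realistic magnitudes.

```python
import math
from collections import defaultdict

def pass_prob(x, sigma, lower, upper):
    """Probability that noise-free value x plus normal noise falls in [lower, upper]."""
    cdf = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return cdf((upper - x) / sigma) - cdf((lower - x) / sigma)

def class_value(het, plus, n_loci, p_low, p_high):
    """Noise-free phenotype with `plus` plus-homozygous and `het` heterozygous loci."""
    return p_low + (p_high - p_low) * (2 * plus + het) / (2 * n_loci)

def select(state, n_loci, p_low, p_high, sigma, lower, upper):
    """Truncation selection: reweight each genotype class and renormalize."""
    weights = {}
    for (het, plus), q in state.items():
        x = class_value(het, plus, n_loci, p_low, p_high)
        weights[(het, plus)] = q * pass_prob(x, sigma, lower, upper)
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}

def self_fertilize(state):
    """Each heterozygous locus segregates 1/4 : 1/2 : 1/4 independently."""
    nxt = defaultdict(float)
    for (het, plus), q in state.items():
        for u in range(het + 1):            # loci that become plus homozygotes
            for k in range(het - u + 1):    # loci that stay heterozygous
                d = het - u - k             # loci that become minus homozygotes
                ways = math.factorial(het) // (
                    math.factorial(u) * math.factorial(k) * math.factorial(d))
                nxt[(k, plus + u)] += q * ways * 0.25 ** u * 0.5 ** k * 0.25 ** d
    return dict(nxt)

def next_generation(state, n_loci, p_low, p_high, sigma, lower, upper):
    """One step of the transition: select in F_i, then self to obtain F_(i+1)."""
    return self_fertilize(select(state, n_loci, p_low, p_high, sigma, lower, upper))

def mean_value(state, n_loci, p_low, p_high):
    return sum(q * class_value(h, a, n_loci, p_low, p_high)
               for (h, a), q in state.items())

n_loci, p_low, p_high, sigma = 3, 93.0, 135.0, math.sqrt(58.0)
f1 = {(n_loci, 0): 1.0}                     # F1: heterozygous at every locus
f2 = self_fertilize(f1)
f3 = next_generation(f2, n_loci, p_low, p_high, sigma, 85.0, 120.0)
print(mean_value(f2, n_loci, p_low, p_high), mean_value(f3, n_loci, p_low, p_high))
```

Grouping genotypes by the pair (heterozygous loci, plus-homozygous loci) keeps the state small; a model with dominance or epistasis would need a richer state and the convolution of equation (11).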

(1) Response to selection ($R(i)$): it measures the effect of selection by the following value (Jain, 1982):

$$R(i) = E_{i+1}(x) - E_i(x) = \sum_{k=0}^{M} \left[ pr^{i+1}(X_k) - pr^i(X_k) \right] X_k \tag{15}$$

where $E_i(x)$ is the expectation of the $F_i$ population.

(2) Heritability ($H$): this is an important parameter in breeding, which is defined as (Liu, 1979)

$$H = \frac{V_G}{\mathrm{Var}(x)} \tag{16}$$

$$H_{i+1} = \frac{\sum_{k=0}^{M} pr^{i+1}(\theta, X_k)\,(X_k - E_{i+1}(x))^2}{\sum_{k=0}^{M} pr^{i+1}(\theta, X_k)\,(X_k - E_{i+1}(x))^2 + \sigma_{i+1}^2} \tag{17}$$

where $V_G$ is the variance of the genotypic value $G$ of the $F_{i+1}$ population and $\mathrm{Var}(x)$ is the variance of the $F_{i+1}$ population.
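Both parameters follow directly from the class frequencies. A minimal Python sketch, with hypothetical function names and toy numbers, computes the response to selection as a difference of expectations and the heritability as the share of genotypic variance in the total variance:

```python
def expectation(pr, X):
    """Population mean over the noise-free phenotype classes."""
    return sum(p * x for p, x in zip(pr, X))

def response_to_selection(pr_next, pr_cur, X):
    """R(i): sum over classes of [pr^(i+1)(X_k) - pr^i(X_k)] * X_k."""
    return sum((pn - pc) * x for pn, pc, x in zip(pr_next, pr_cur, X))

def heritability(pr, X, sigma2):
    """H: genotypic variance divided by genotypic plus environmental variance."""
    mean = expectation(pr, X)
    v_g = sum(p * (x - mean) ** 2 for p, x in zip(pr, X))
    return v_g / (v_g + sigma2)

# Toy numbers for illustration only.
X = [0.0, 1.0, 2.0]
pr_cur = [0.25, 0.5, 0.25]
pr_next = [0.5, 0.5, 0.0]
print(response_to_selection(pr_next, pr_cur, X))
print(heritability(pr_cur, X, 0.5))
```

As the population is purified, the genotypic variance shrinks towards zero and so does the heritability, which matches the observation that the total variance approaches the environmental variance in late generations.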

4. Control framework and control method

The control framework of the self-fertilizing and selection process has the structure shown in Fig. 2. It consists of two main parts: one concerns model identification, the other the control algorithm.

Fig. 2. Conceptual framework for self-fertilizing and selection process control.

Similar to equation (14), the transition equation of $pr^i(\theta, X_k)$ has the form

$$Pr(i+1) = h(Pr(i), c(i)) \tag{18}$$

where $Pr(i) = [pr^i(\theta, X_0), \ldots, pr^i(\theta, X_M)] \in P^{M+1}$, $P^{M+1}$ is a subset of $R^{M+1}$ and $0 \le pr^i(\theta, X_k) \le 1$, $k = 0, 1, 2, \ldots, M$. Then, the response to selection can be expressed as

$$R(i) = [Pr(i+1) - Pr(i)]X^T \tag{19}$$

where $X = [X_0, X_1, \ldots, X_M] \in X^{M+1} \subset R^{M+1}$ is a constant vector. The objective function of the process can be defined as (without loss of generality, only case (b) in Fig. 1 is considered here)

$$J = -Pr(T)X^T + \sum_{t=2}^{T-1} L(Pr(t), c(t), t) \tag{20}$$

where $Pr(T)X^T$ represents the final selection effect, $L(\cdot)$ is the cost of selection and $J$ denotes the trade-off between the selection effect and the cost of cultivation.

The control problem can then be described as selecting a control sequence $\pi^* = \{c^*(2), c^*(3), \ldots, c^*(T-1)\}$, $c^*(t) \in U$, where $U$ is the control constraint set, so as to minimize the objective function (20).

The control problem concerns a discrete-time dynamic system, and the objective function is additive over time, so we select dynamic programming (Larson, 1982) to solve the optimal control problem. Suppose that $I[Pr(t), t]$ represents the cost incurred by the controls applied over the time interval $[t, T]$ starting from state $Pr(t)$. The optimal cost from $Pr(t)$ is

$$I^*[Pr(t), t] = \min_{c(t), \ldots, c(T-1)} \left\{ \sum_{j=t}^{T-1} L(Pr(j), c(j), j) - Pr(T)X^T \right\} = \min_{c(t) \in U} \left\{ L(Pr(t), c(t), t) + I^*[Pr(t+1), t+1] \right\} \tag{21}$$

where $t = 2, 3, \ldots, T-1$. The terminal cost incurred at the end of the process is

$$I^*[Pr(T), T] = -Pr(T)X^T \tag{22}$$
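The recursion of equations (21) and (22) can be sketched in Python with a memoized recursive function. The transition and the stage cost below are toy stand-ins for $h$ and $L$ (simple class-level truncation and a cost proportional to the discarded fraction); they are assumptions chosen so that the example is small and self-contained, not the paper's model. The control set, the phenotype classes and the horizon are likewise hypothetical.

```python
from functools import lru_cache

X = (0.0, 1.0, 2.0, 3.0)          # hypothetical phenotype class values
CONTROLS = (0.0, 1.0, 2.0, 3.0)   # hypothetical lower truncation points, case (b)
T = 4                             # controls are applied at t = 2, ..., T - 1

def transition(pr, c):
    """Toy stand-in for h: drop classes below c and renormalize."""
    kept = [p if x >= c else 0.0 for p, x in zip(pr, X)]
    total = sum(kept)
    return tuple(k / total for k in kept)

def stage_cost(pr, c, t):
    """Toy stand-in for L: cost proportional to the fraction discarded."""
    return 0.4 * sum(p for p, x in zip(pr, X) if x < c)

def terminal_cost(pr):
    """Terminal cost: minus the final population mean."""
    return -sum(p * x for p, x in zip(pr, X))

@lru_cache(maxsize=None)
def optimal(pr, t):
    """Return (optimal cost from state pr at time t, optimal control sequence)."""
    if t == T:
        return terminal_cost(pr), ()
    best = None
    for c in CONTROLS:
        tail_cost, tail_plan = optimal(transition(pr, c), t + 1)
        candidate = (stage_cost(pr, c, t) + tail_cost, (c,) + tail_plan)
        if best is None or candidate[0] < best[0]:
            best = candidate
    return best

start = (0.1, 0.4, 0.4, 0.1)      # hypothetical F2 class frequencies
cost, plan = optimal(start, 2)
print(cost, plan)
```

Memoizing on the state tuple avoids recomputing subproblems that are reached by different control sequences. With realistic state spaces the number of distinct states still grows quickly, which is the combinatorial explosion that motivates a heuristic search.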

Equation (21) is a recursive algorithm. We can obtain a sequence of controls $\pi^*$ as a control policy by applying equations (21), (22) and (18) recursively. However, combinatorial explosion may occur when the number of self-fertilizing generations $T$ and the number of characters increase. Many methods have been proposed to deal with this problem, such as Monte Carlo methods and simulated annealing (Kirkpatrick, 1983). We prefer a newly developed method called the rotating orthogonal searching method (ROSM) (Zheng, 1990). ROSM is based on the idea of the orthogonal method for large-scale experiment design. The direction of rotation is chosen according to the difference of the extreme values, which represents the grade of importance of each state in the objective function. The method has several advantages: the complexity of the algorithm is $O(m)$, and it is able to avoid being trapped at local optima (for a detailed discussion see Zheng (1990)).

Fig. 3. The result of simulation (horizontal axis: plant height, cm; one curve per generation).

5. A simulation example

In this section, we give an example to illustrate the control framework. The example concerns rice breeding, and the data come from the Institute of Crop Breeding and Cultivation of the Chinese Academy of Agricultural Science, Beijing. The two parents of the cross-combination are Tijin and Lijianxintuanheigu, and their plant heights are 93 cm and 135 cm, respectively. Using the estimation method described previously, we have $K_1 = K_2 = 1$, $K_3 = 2$, $K_4 = 0$, $K_5 = 3$, $\sigma^2 = 58$. We then simulate the selection process with the control framework presented in this paper. Suppose that the desired plant height is about 100 cm and that the environmental variance is constant across generations. The selecting bounds (cm) of each generation are: $F_2$: [85,120], $F_3$: [85,115], $F_4$: [95,105], $F_5$: [98,102]. The results of the simulation are presented in Fig. 3. Fig. 3 shows that the hybrid population ($F_2$) is purified continually as the self-fertilizing and selection process goes on. The population mean approaches 100 cm gradually and is about 100.8 cm for the $F_6$ population. The variance also decreases step by step, and in $F_6$ it is close to the environmental variance, which means that the genotypes are nearly pure. The results also show that the response to selection is large in early generations, in accordance with breeding practice.

6. Conclusions

In this paper, we build a model of the self-fertilization and selection process in crop breeding by means of effective factors. The model can help us to analyze the effect of selection control and to compare different selection programs; that is, we can simulate the selection process. Furthermore, we present a control framework for the self-fertilization and selection process. Using dynamic programming, a control sequence $\pi^*$ can be obtained which minimizes the objective function. We believe that the control framework may become a guide for breeders and help them to take more desirable control actions, so that the efficiency and effectiveness of the selection process are improved.

Acknowledgement

This work is supported by the National Natural Science Foundation of China, grant No. 6864009.

References

Chen Jian, 1989, The system analysis and control of genetic process in crop breeding, Ph.D. dissertation, Tsinghua University, Beijing.

Chen Jian, Zheng Weimin and Wang Yongxian, 1990, Gene model and its application. Chinese J. Biotechnol. (The English edition) 6(4).

Clarke, D.W., Mohtadi, C. and Tuffs, P.S., 1987, Generalized predictive control. Automatica 23, 137-160.

Jain, J.P., 1982, Statistical Techniques in Quantitative Genetics (Tata McGraw-Hill).

Kirkpatrick, S. et al., 1983, Optimization by simulated annealing. Science 220, 671-680.

Larson, R.E. and Casti, J.L., 1982, Principles of Dynamic Programming (Marcel Dekker, New York).

Mather, K. and Jinks, J.L., 1982, Biometrical Genetics, 3rd edn. (Chapman and Hall, London).

Zheng Weimin, We Fei, 1990, A new algorithm of 0-1 programming - rotating orthogonal method. J. Systems Eng. 5(1), 1-10.