
Estimation of Distribution Algorithm (EDA) / Probabilistic Model Building GA (PMBGA)


    Linear and Combinatorial Optimizations by Estimation of Distribution Algorithms

    by Topon Kumar Paul and Hitoshi Iba

[9th MPS Symposium on Evolutionary Computation, IPSJ, Japan, 2002]


    Abstract

Estimation of Distribution Algorithms (EDAs) are a new area of Evolutionary Computation. In EDAs there is neither a crossover nor a mutation operator. A new population is generated by sampling a probability distribution, which is estimated from a database containing selected individuals of the previous generation. Different approaches have been proposed for the estimation of this probability distribution. In this paper I review different EDA approaches and present three solutions based on the Univariate Marginal Distribution Algorithm (UMDA) for optimization in binary and non-binary search spaces. Three well-known problems, the Subset Sum problem, the OneMax function, and the n-Queen problem, are solved, and experimental results comparing UMDA (with and without local heuristics) against GAs are given. From my experiments it seems that UMDA may perform better for linear problems without dependencies among variables.

    Keywords


    Estimation of Distribution Algorithm, 2-opt local heuristic, Laplace corrections, n-Queen problem, Partially Matched Crossover, Subset Sum problem

    1. Introduction

Genetic Algorithms (GAs) are optimization techniques based on selection and recombination of promising solutions. The collection of candidate solutions is called the population of a Genetic Algorithm, and the candidate solutions are sometimes named individuals, chromosomes, etc. Each individual is an encoded representation of the variables of the problem at hand, and each component (variable) in an individual is termed a gene. Sometimes the components (genes) are independent of one another and sometimes they are correlated. But communication and information exchange among individuals in a population is always maintained through the selection and recombination operators of the Genetic Algorithm. This kind of exchange helps to combine partial solutions into high-quality partial solutions, the building blocks (BBs) (Holland 1975; Goldberg 1989). The behavior of a GA depends on the choice of the genetic operators (selection, crossover, mutation), the probabilities of crossover and mutation, the population size, the rate of generational reproduction, the number of generations, etc. But the problem-specific interactions among the variables are seldom considered. As a result, fixed two-parent recombination and evolution sometimes provide solutions of inferior quality that converge to a local optimum. To avoid the disruption of partial solutions, the two-parent recombination process can be replaced by generating new solutions according to the probability distribution of all promising solutions of the previous generation. This new approach is called the Estimation of Distribution Algorithm (EDA). EDAs were introduced in the field of Evolutionary Computation for the first time by Mühlenbein and Paaß (1996).

    2. Estimation of Distribution Algorithm (EDA)

In EDAs the problem-specific interactions among the variables of individuals are taken into consideration. In Evolutionary Computation the interactions are kept in mind implicitly, whereas in EDAs the interrelations are expressed explicitly through the joint probability distribution associated with the individuals selected at each generation.


The probability distribution is calculated from a database of selected individuals of the previous generation; the selection methods used in Genetic Algorithms may be used here. Sampling this probability distribution then generates offspring. Neither crossover nor mutation is applied in EDAs. However, the estimation of the joint probability distribution associated with the database containing the selected individuals is not an easy task. The following is pseudocode for the EDA approach:

Step 1: $D_0$ <- Generate M individuals (the initial population) at random

Step 2: Repeat Steps 3-5 for l = 1, 2, ... until the stopping criterion is met

Step 3: $D_{l-1}^{Se}$ <- Select N <= M individuals from $D_{l-1}$ according to a selection method

Step 4: Estimate the probability distribution $p_l(\mathbf{x}) = p(\mathbf{x} \mid D_{l-1}^{Se})$ of an individual being among the selected individuals

Step 5: $D_l$ <- Sample M individuals (the new population) from $p_l(\mathbf{x})$
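As a concrete illustration, the following is a minimal Python sketch of this loop for binary strings, assuming truncation selection and the simplest (univariate) probability model; the function and parameter names are mine, not the paper's.

```python
import numpy as np

def eda(fitness, n, M=100, N=50, generations=50, rng=None):
    """Generic EDA loop over binary strings (Steps 1-5 above), using
    truncation selection and a univariate model as the simplest choice."""
    rng = rng or np.random.default_rng()
    pop = rng.integers(0, 2, size=(M, n))           # Step 1: D_0, M random individuals
    for _ in range(generations):                    # Step 2: main loop
        scores = np.array([fitness(ind) for ind in pop])
        selected = pop[np.argsort(scores)[-N:]]     # Step 3: best N of M (maximization)
        p = selected.mean(axis=0)                   # Step 4: estimate p_l(x) from the selected set
        pop = (rng.random((M, n)) < p).astype(int)  # Step 5: sample D_l from p_l(x)
    return max(pop, key=fitness)

# Example: maximize the number of ones in a 30-bit string
best = eda(lambda x: int(x.sum()), n=30)
```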

3. Different EDA approaches

3.1 Independent Variables

The easiest way to estimate the probability distribution is to treat all the variables in a problem as univariate. Then the joint probability distribution becomes the product of the marginal probabilities of the n variables, i.e.,

$$p_l(\mathbf{x}) = \prod_{i=1}^{n} p_l(x_i).$$

The Univariate Marginal Distribution Algorithm (UMDA) (Mühlenbein, 1998), Population Based Incremental Learning (PBIL) (Baluja, 1994), and the Compact Genetic Algorithm (cGA) (Harik et al., 1998) consider no interaction among variables. In UMDA the joint probability distribution is factorized as a product of independent univariate marginal distributions, each estimated from marginal frequencies:

$$p_l(x_i) = \frac{\sum_{j=1}^{N} \delta_j(X_i = x_i \mid D_{l-1}^{Se})}{N}$$

where $\delta_j(X_i = x_i \mid D_{l-1}^{Se}) = 1$ if, in the j-th case of $D_{l-1}^{Se}$, $X_i = x_i$, and 0 otherwise.

In PBIL the population of individuals is represented by a vector of probabilities $p_l(\mathbf{x}) = (p_l(x_1), \ldots, p_l(x_i), \ldots, p_l(x_n))$, where $p_l(x_i)$ refers to the probability of obtaining a 1 in the i-th component of $D_l$, the population of individuals in the l-th generation. At each generation M individuals are generated by sampling $p_l(\mathbf{x})$, and the best N individuals are selected. The selected individuals are used to update the probability vector by a Hebbian-inspired rule:

$$p_l(x_i) = (1 - \alpha)\, p_{l-1}(x_i) + \alpha \frac{1}{N} \sum_{k=1}^{N} x_{i,k}^{l-1:M}$$

where $\alpha \in (0, 1]$ and $x_{i,k}^{l-1:M}$ represents the value of $x_i$ in the k-th selected individual. The update rule shifts the vector towards the best of the generated individuals.
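In code the rule is a single blended average; a minimal sketch (the array layout and the name alpha are my assumptions):

```python
import numpy as np

def pbil_update(p_prev, selected, alpha=0.1):
    """Hebbian-inspired PBIL rule: shift the probability vector towards
    the mean of the N best individuals by a learning rate alpha in (0, 1]."""
    return (1.0 - alpha) * p_prev + alpha * selected.mean(axis=0)
```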

In cGA the vector of probabilities is initialized with a probability of 0.5 for each variable. Two individuals are then generated at random using this vector of probabilities and ranked by evaluating them. The probability vector $p_l(\mathbf{x})$ is then updated towards the better one. This process of adaptation continues until the vector of probabilities converges.
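A minimal sketch of that competition loop follows; the 1/pop_size update step is the usual choice in the cGA literature, assumed here rather than taken from this paper.

```python
import numpy as np

def cga(fitness, n, pop_size=100, rng=None):
    """Compact GA sketch: compete two sampled individuals and move the
    probability vector towards the winner by 1/pop_size per component."""
    rng = rng or np.random.default_rng()
    p = np.full(n, 0.5)                              # each variable starts at 0.5
    while not np.all((p < 1e-9) | (p > 1 - 1e-9)):   # until p converges to 0/1
        a = (rng.random(n) < p).astype(int)
        b = (rng.random(n) < p).astype(int)
        winner, loser = (a, b) if fitness(a) >= fitness(b) else (b, a)
        step = (winner != loser) * (2 * winner - 1) / pop_size
        p = np.clip(p + step, 0.0, 1.0)              # shift where the two differ
    return (p > 0.5).astype(int)
```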

All the above-mentioned algorithms provide good results for problems with no significant interaction among the variables (Mühlenbein, 1998; Harik et al., 1998; Pelikan and Mühlenbein, 1999), but in the presence of higher-order interactions they cannot provide good results.

3.2 Bivariate Dependencies

To solve problems with pairwise interactions among variables, the population-based Mutual Information Maximizing Input Clustering (MIMIC) algorithm (De Bonet et al., 1997), Combining Optimizers with Mutual Information Trees (COMIT) (Baluja and Davies, 1997), and the Bivariate Marginal Distribution Algorithm (BMDA) (Pelikan and Mühlenbein, 1999) were introduced. Where there is at most second-order dependency among variables these provide better results, but that is far from the real world, where multiple interactions occur.

3.3 Multiple Dependencies

The Factorized Distribution Algorithm (FDA) (Mühlenbein et al., 1998), the Extended Compact Genetic Algorithm (ECGA) (Harik, 1999), the Bayesian Optimization Algorithm (BOA) (Pelikan et al., 2000), and the Estimation of Bayesian Network Algorithm (EBNA) (Larrañaga et al., 2000) can capture multiple dependencies among variables.

FDA uses a fixed factorization of the distribution of a uniformly scaled, additively decomposed function to generate new individuals. It efficiently optimizes a class of binary functions which are too difficult for traditional GAs. FDA incrementally computes Boltzmann distributions by using Boltzmann selection, and it converges in polynomial time if the search distribution can be formed so that the number of parameters used is polynomially bounded in n (Mühlenbein and Mahnig, 2002). But the problem with FDA is its requirement of a fixed distribution and an additively decomposed function.

BOA uses techniques from modeling data with Bayesian networks to estimate the joint probability distribution of the selected individuals; the new population is then generated based on this estimate. It uses the BDe (Bayesian Dirichlet equivalence) metric to measure the goodness of each structure. This Bayesian metric has the property that two structures reflecting the same conditional dependencies or independencies receive the same score. In BOA, prior information about the problem can be incorporated to enhance the estimation and improve convergence (Pelikan et al., 2000). In order to reduce the cardinality of the search space, BOA imposes a restriction on the number of parents a node may have; for problems where a node may have more than two parents, the situation becomes complicated.

In ECGA the factorization of the joint probability is calculated as a product of marginal distributions of variable size. These marginal distributions of variable size are related to the variables that are contained in the same group and to the probability distribution associated with them. The grouping is carried out by a greedy forward algorithm that obtains a partition of the n variables. Each group of variables is assumed to be independent of the rest, so the factorization of the joint probability over the n variables is

$$p_l(\mathbf{x}) = \prod_{c \in C_l} p_l(\mathbf{x}_c)$$

where $C_l$ denotes the set of groups in the l-th generation and $p_l(\mathbf{x}_c)$ represents the marginal distribution of the variables $X_c$, that is, the variables that belong to the c-th group in the l-th generation. ECGA uses model complexity and population complexity to measure the quality of the marginal distributions. It can cover any number of interactions among variables, but it does not consider conditional probabilities, which is insufficient for highly overlapping cases.

In EBNA the joint probability distribution encoded by a Bayesian network is learnt from the database containing the selected individuals in each generation. The factorization can be written as

$$p_l(\mathbf{x}) = \prod_{i=1}^{n} p_l(x_i \mid \mathbf{pa}_i^l)$$

where $\mathbf{pa}_i^l$ is the set of parents of the variable $X_i$. Different algorithms can be obtained by varying the structural search method. Two structural search methods are usually considered: score+search and detecting conditional (in)dependencies (EBNA_PC). In particular, two scores are used in the score+search approach: the Bayesian Information Criterion score (EBNA_BIC) and the K2+penalization score (EBNA_K2+pen). In each case the convergence is affected only by the calculation of the parameters $\theta_{ijk}$, where $\theta_{ijk}$ represents the conditional probability of variable $X_i$ being at its k-th value while the set of its parents is at its j-th value. The parameters of the local probability distributions can be calculated for every generation using either their expected values as obtained from the K2 score (Cooper and Herskovits, 1992),

$$\theta_{ijk} = \frac{N_{ijk} + 1}{N_{ij} + r_i},$$

or maximum likelihood estimates,

$$\theta_{ijk} = \frac{N_{ijk}}{N_{ij}},$$

where $N_{ijk}$ denotes the number of cases in the selected individuals in which the variable $X_i$ takes its k-th value and its parents $\mathbf{Pa}_i$ take their j-th value, and $N_{ij} = \sum_{k=1}^{r_i} N_{ijk}$. For the case of expected values, when the selection is elitist, EBNA converges to a population that contains the global optimum, whereas for the maximum likelihood case this is not guaranteed (González et al., 2001).

4. Univariate Marginal Distribution Algorithm (UMDA)


In UMDA it is assumed that there are no interrelations among the variables of the problem. Hence the n-dimensional joint probability distribution is factorized as a product of n univariate and independent probability distributions. That is:

$$p_l(\mathbf{x}) = p(\mathbf{x} \mid D_{l-1}^{Se}) = \prod_{i=1}^{n} p_l(x_i).$$

Each univariate marginal distribution is estimated from marginal frequencies:

$$p_l(x_i) = \frac{\sum_{j=1}^{N} \delta_j(X_i = x_i \mid D_{l-1}^{Se})}{N}$$

where $\delta_j(X_i = x_i \mid D_{l-1}^{Se}) = 1$ if, in the j-th case of $D_{l-1}^{Se}$, $X_i = x_i$, and 0 otherwise.

The pseudocode for UMDA is as follows:

Step 1: $D_0$ <- Generate M individuals (the initial population) at random

Step 2: Repeat Steps 3-5 for l = 1, 2, ... until the stopping criterion is met

Step 3: $D_{l-1}^{Se}$ <- Select N <= M individuals from $D_{l-1}$ according to a selection method

Step 4: Estimate the joint probability distribution $p_l(\mathbf{x}) = \prod_{i=1}^{n} p_l(x_i)$

Step 5: $D_l$ <- Sample M individuals (the new population) from $p_l(\mathbf{x})$
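Plugged into the generic loop sketched in Section 2, the two UMDA-specific pieces are the frequency estimate and the product-distribution sampler. A minimal binary-string sketch (names mine):

```python
import numpy as np

def umda_estimate(selected):
    """Step 4: marginal frequency of a 1 in each position of the selected set,
    i.e. p_l(x_i) = (number of cases with X_i = 1) / N."""
    return selected.mean(axis=0)

def umda_sample(p, M, rng):
    """Step 5: draw M new binary individuals from the product of the marginals."""
    return (rng.random((M, p.shape[0])) < p).astype(int)
```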

4.1 Convergence of UMDA by Laplace correction

González et al. (2001) have shown that an instance of UMDA with $p_l(\mathbf{x}) > 0$ visits populations containing the global optimum infinitely often with probability 1, and that if the selection is elitist, UMDA converges to a population that contains the global optimum.

But the joint probability distribution of UMDA can be zero for some x, for example when the selected individuals at the previous step are such that

$$\delta_j(X_i = x_i \mid D_{l-1}^{Se}) = 0$$

for all j = 1, 2, ..., N. Hence $p_l(\mathbf{x}) = 0$, so UMDA sometimes may not visit a global optimum (González et al., 2001).

To overcome this problem, the way of calculating the probabilities should be changed. One possible solution is to apply the Laplace correction (Cestnik, 1990). Now

$$p_l(x_i \mid D_{l-1}^{Se}) = \frac{\sum_{j=1}^{N} \delta_j(X_i = x_i \mid D_{l-1}^{Se}) + 1}{N + r_i}$$

where $r_i$ is the number of different values that variable $X_i$ may take.
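In code this is a one-line change to the estimate from the previous section; a sketch (the default r_i = 2 assumes binary variables):

```python
import numpy as np

def umda_estimate_laplace(selected, r_i=2):
    """Laplace-corrected marginals: (count + 1) / (N + r_i), so no value's
    probability is ever exactly zero; r_i = number of values X_i may take."""
    N = selected.shape[0]
    return (selected.sum(axis=0) + 1.0) / (N + r_i)
```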

5. Some Solutions by UMDA

Here I have applied the Univariate Marginal Distribution Algorithm, with some local heuristics, to some well-known problems: the Subset Sum problem, the n-Queen problem, and the OneMax function.

5.1 Subset Sum Problem

It is the problem of finding which subset of a list of integers has a given sum. The subset sum problem is an integer relation problem where the relation coefficients are either 0 or 1. If there are n integers, the solution space has size $2^n$, the number of subsets of n elements. For small n exact solutions can be found by backtracking, but for larger n the state space tree grows exponentially. It is an NP-complete problem.

5.1.1 Solving the Subset Sum Problem with UMDA

In UMDA each individual (solution) is represented by an n-tuple (x_1, x_2, x_3, ..., x_n) such that $x_i \in \{0, 1\}$, 1 <= i <= n. Then x_i = 0 if the i-th integer is not selected and x_i = 1 if it is selected. Each individual is evaluated by finding the difference between the expected sum and the sum of the selected integers in the individual; the smaller the difference, the better. The marginal probability distribution is calculated from the best half of the population with the Laplace correction, and new individuals are generated by sampling it. During replacement I have used elitism. The algorithm terminates when the evaluation of the best individual of a population is zero.
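The evaluation described above can be written directly; a sketch (function and argument names are mine):

```python
import numpy as np

def subset_sum_fitness(individual, integers, expected_sum):
    """Absolute difference between the expected sum and the sum of the
    selected integers; 0 means the individual is an exact solution."""
    return abs(expected_sum - int(np.dot(individual, integers)))

# Example: integers [3, 1, 4, 1, 5], target 8; selecting {3, 4, 1} is a solution
assert subset_sum_fitness(np.array([1, 0, 1, 1, 0]), np.array([3, 1, 4, 1, 5]), 8) == 0
```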


5.1.2 Genetic Algorithm for the Subset Sum Problem

Using the same representation, initialization, evaluation, and replacement strategy as for UMDA, I apply a GA to the problem. I use simple one-point crossover and mutation for the generation of offspring; parents are selected randomly for crossover.

5.2 OneMax by UMDA

The OneMax function returns the number of ones in an input string, i.e.,

$$f_{OneMax}(\mathbf{X}) = \sum_{i=1}^{n} x_i$$

where X = {X_1, X_2, ..., X_n} and $x_i \in \{0, 1\}$.

It is a unimodal function with its optimum at X_Opt = {1, 1, ..., 1}. It is a trivial function which is used as a test function to evaluate the performance of Genetic Algorithms and EDAs.

With the Laplace correction it is guaranteed that UMDA will find the optimum value within a small number of generations.

In my experiment I initialized the population randomly, selected the best half for the calculation of the probability distribution, and used elitism for replacement. The results are satisfactory.

5.3 n-Queen Problem

The n-Queen problem is a classic combinatorial problem. The task is to place n queens on an n x n chessboard so that no two queens attack each other; that is, no two queens are on the same row, column, or diagonal. Let us number the rows and columns of the chessboard 1 through n, and likewise the queens. Since each queen must be placed on a different row, we can assume that queen i is to be placed on row i.

Let the solution be X = {X_1, X_2, ..., X_n}, where X_i represents the column position in row i where queen i is placed. As all queens must be in different columns, all X_i's must be different, so the solution is a permutation of the numbers 1 through n, and the solution space is drastically reduced to n!.


We can easily fulfill the first two constraints (no two queens on the same row or column) by allowing a distinct value for each X_i. But how do we test whether two queens at positions (i, j) and (k, l) [i = row, j = column] are on the same diagonal? We can test it with two observations:

Every element on a diagonal that runs from the upper left to the lower right has the same (row - column) value (the main diagonal consists of elements of the form (i, i)).

Every element on a diagonal that runs from the upper right to the lower left has the same (row + column) value.

So two queens are on the same diagonal if i - j = k - l or i + j = k + l. These two equations can be rewritten as j - l = i - k and j - l = k - i. That is, if abs(j - l) = abs(i - k), then the two queens are on the same diagonal.

So the pseudocode for testing the constraints for the queens at positions X_i and X_j is:

if ((X_i = X_j) OR (abs(i - j) = abs(X_i - X_j))) then
    return NOT_FEASIBLE;

But this test requires n(n-1)/2 comparisons to calculate the fitness of an individual.
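A direct implementation of this pairwise test as the fitness function used in the next subsection (number of queens at non-attacking positions) might look like the following sketch:

```python
def queen_fitness(X):
    """Number of queens at non-attacking positions; X[i] is the column of
    the queen in row i. A return value of n means a full solution."""
    n = len(X)
    attacked = [False] * n
    for i in range(n):                 # the n(n-1)/2 pairwise comparisons
        for j in range(i + 1, n):
            if X[i] == X[j] or abs(i - j) == abs(X[i] - X[j]):
                attacked[i] = attacked[j] = True
    return attacked.count(False)
```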

5.3.1 Solving n-Queen using UMDA

With UMDA each individual in the population is represented as {X_1, X_2, ..., X_n}, where each X_i represents the column in row i where the i-th queen is placed. The fitness of an individual is calculated as the number of queens at non-attacking positions.

The initial population of size M is generated randomly with the constraint that all values in an individual are distinct numbers from the set {1, 2, ..., n}. By doing this I have implicitly satisfied the constraints that no two queens are on the same row or column; the fitness calculation then only has to check whether two queens are on the same diagonal.

In each generation the best half of the individuals of the previous generation is selected for the calculation of the joint probability distribution using the marginal frequency


of each X_i. During the calculation of the marginal distribution of each variable, the Laplace correction has been used. By applying the Laplace correction I have ensured that the probability of every value of a variable is greater than zero, and hence that the joint probability of any individual is greater than zero.

Then M new individuals are generated. During the generation of each individual I have applied probabilistic modification to enforce the first constraint of the n-Queen problem. In probabilistic modification, when a variable has been assigned a position, its probabilities for the remaining positions are set to zero and the probabilities of the non-selected variables are increased proportionately. Consider the following example:

                   Before selection of X1        After selection of X1
Position/Variable   1      2      3               1      2      3
X1                  0.7    0.1    0.5             0.7    0      0
X2                  0.1    0.6    0.1             0.1    0.65   0.35
X3                  0.2    0.3    0.4             0.1    0.35   0.65

Table 1: Probabilistic modification example

If X_1 is selected for the first position, then the probabilities of X_1 for the 2nd and 3rd positions are set to zero and the probabilities of X_2 and X_3 increase proportionally. The temporary table should look like the one on the right above. By doing this we can ensure that a distinct value is generated for each component of an individual if Roulette Wheel selection is used.
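One way to realize this sampling (my interpretation of the scheme, with names of my own) is to draw each variable by roulette wheel over its marginals, zeroing a value's column once it is used and renormalizing at draw time:

```python
import numpy as np

def sample_permutation(P, rng):
    """P[i, v] = marginal probability that variable X_{i+1} takes value v+1.
    Sample variables in order, removing each chosen value from the remaining
    rows and renormalizing (probabilistic modification)."""
    n = P.shape[0]
    probs = P.copy()
    individual = []
    for i in range(n):
        row = probs[i] / probs[i].sum()    # renormalize the current row
        v = rng.choice(n, p=row)           # roulette-wheel selection
        individual.append(v + 1)
        probs[:, v] = 0.0                  # value v is no longer available
    return individual
```

With the Laplace correction every entry of P is positive, so each row always retains some probability mass over the unused values and the result is guaranteed to be a permutation.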

For the replacement strategy I have used elitism. The algorithm stops when the fitness of the best individual is n (that is, all queens are at non-attacking positions) or when a certain number of generations has passed.

Fitness-improvement heuristics can be applied to the problem and perform much better than a blind search using UMDA. For combinatorial optimization the 2-opt (Croes,


1992) algorithm or Partially Matched Crossover (PMX) (Goldberg et al., 1987) is widely used. I have used the 2-opt algorithm; a minimal sketch of one 2-opt improvement pass is given below.
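The transcript does not include the 2-opt routine itself; the following sketch assumes the swap-based variant for permutation encodings, where a pair swap always keeps the individual a valid permutation:

```python
def two_opt(X, fitness):
    """One improvement pass: try swapping every pair of columns and keep
    the first swap that strictly improves fitness (a legal move)."""
    n = len(X)
    best = fitness(X)
    for i in range(n):
        for j in range(i + 1, n):
            X[i], X[j] = X[j], X[i]       # tentative swap
            f = fitness(X)
            if f > best:
                return X, f               # keep the improving move
            X[i], X[j] = X[j], X[i]       # undo
    return X, best                        # no legal move found
```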

The resulting algorithm is:

Figure 1. Algorithm for the solution of the n-Queen problem by UMDA with the 2-opt local heuristic

Step 1: Create the initial population of size M randomly, with the constraint that all elements in an individual are distinct

Step 2: Apply the 2-opt algorithm to the initial population

Step 3: Select the best half of the population

Step 4: Estimate the joint probability distribution of an individual being among the selected individuals

Step 5: Sample M new individuals using probabilistic modification and replace with elitism

Step 6: Apply the 2-opt algorithm to the new population; repeat Steps 3-6 until the fitness of the best individual is n or the generation limit is reached

5.3.2 n-Queen problem by Genetic Algorithm

Using the same representation, initial population, and fitness function, I apply a Genetic Algorithm with Partially Matched Crossover and swap mutation.


Replacement strategy is elitism. The genetic algorithm for the problem is:

Figure 2: Algorithm for the solution of the n-Queen problem by GA

Step 1: Create the initial population of size M randomly, with the constraint that all elements in an individual are distinct, and evaluate each

Step 2: Repeat Steps 3-8 until the fitness of the best individual = n

Step 3: Randomly select two parents
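The text names the two variation operators used in the remaining steps of Figure 2: Partially Matched Crossover and swap mutation. A sketch of each follows; this is one common PMX formulation, an assumption rather than the paper's exact procedure:

```python
import random

def pmx(p1, p2, rng=random):
    """Partially Matched Crossover: copy a segment from p1, then map the
    conflicting values through the segment so the child stays a permutation."""
    n = len(p1)
    a, b = sorted(rng.sample(range(n), 2))
    child = [None] * n
    child[a:b + 1] = p1[a:b + 1]                   # copy the matching section
    for i in range(a, b + 1):
        v = p2[i]
        if v not in child[a:b + 1]:
            pos = i
            while a <= pos <= b:                   # follow the mapping chain
                pos = p2.index(p1[pos])
            child[pos] = v
    for i in range(n):                             # fill the rest from p2
        if child[i] is None:
            child[i] = p2[i]
    return child

def swap_mutation(X, rng=random):
    """Swap two randomly chosen positions; the result is still a permutation."""
    i, j = rng.sample(range(len(X)), 2)
    X[i], X[j] = X[j], X[i]
    return X
```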


6. Experimental Results

In this section I present some experimental results obtained by applying UMDA to the Subset Sum problem, the OneMax function, and the n-Queen problem. I have run the programs on a computer with a 902 MHz AMD Athlon processor and 512 MB of RAM, in the Borland C++ Builder 6.0 environment.

6.1 Subset Sum Problem

Here I have generated positive integers randomly. The expected sum has then been generated by randomly selecting some of those integers, so that there is always a solution. For the trivial case I have chosen the expected sum equal to the sum of all generated integers; in this case the solution is {1, 1, 1, ..., 1}. The parameters for the problems are:

Total runs = 50; population size = 1000; crossover rate = 0.7; mutation rate = 0.1; elite = 10%; truncation selection for UMDA.

    The results are shown below:

Figure 3. Average number of generations required for the trivial solution (expected sum = sum of all integers) of the Subset Sum problem

[Chart: average number of generations vs. problem size (0 to 100) for UMDA and GA.]


    Figure 6 . Average time (sec) required for random sum (expected sum


Figure 7. Average time (sec) required until solution for the OneMax function [chart: time (sec) vs. problem size, 0 to 150, for UMDA and GA]

Figure 8. Average number of generations required for the OneMax function [chart: number of generations vs. problem size, 0 to 150, for UMDA and GA]

6.3 n-Queen Problem

For the n-Queen problem the solution space is n!, and only some of these permutations are feasible solutions. For n = 2 and 3 no solutions exist. The following table gives an idea of the number of solutions for different n:

No. of Queens   Total Feasible Solutions   No. of Queens   Total Feasible Solutions
4               2                          10              724
5               10                         11              2680
6               4                          12              14200
7               40                         13              73712
8               92                         14              365596
9               352                        15              2279184



Table 2: Number of feasible solutions for the n-Queen problem

For small n one possible solution can be found quickly by the backtracking method. But for larger n backtracking may not be the right choice, as it may consume a lot of memory for the recursion, nor would it be possible to try all permutations in the solution space. So an EDA can be a choice for this problem in order to get a solution within a reasonable time limit. I apply the simplest one, UMDA, with the 2-opt algorithm as a local heuristic, with the following parameters:

Total runs = 50; population size = 10 x variable size; crossover rate = 0.7; mutation rate = 0.1; elitism = 10%. The results are:

Figure 9. Average time required for solution of the n-Queen problem [chart: time (sec) vs. number of queens, 0 to 40, for UMDA with 2-opt, GA with 2-opt, and GA with PMX]

7. Discussion

7.1 Subset Sum Problem

From the experimental results we find that for medium problem sizes UMDA provides better results with respect to the number of generations when the expected sum is the sum of a proper subset of the set of integers, although the calculation of the probability distribution takes time. For the trivial solution (when the final solution consists of all the integers) UMDA outperforms the GA in both respects. This is due to the fact that the variables in the problem are less interdependent.


7.2 OneMax Function

The OneMax function is a trivial one and its variables are independent of one another, so UMDA provides better results than the GA.

7.3 n-Queen Problem

From the experimental results we see that UMDA with 2-opt as a local heuristic produces solutions more slowly than the two other methods, while GA with Partially Matched Crossover produces results very fast for larger variable sizes. This is due to the fact that the variables, which represent the positions of the queens on a chessboard, are highly correlated. As I have said, UMDA cannot capture the interdependencies among the variables of a problem. For highly correlated problems we may capture the dependencies among the variables by Bayesian networks or other methods, which is my future work. But the number of generations required until a solution is found is relatively low compared to GA with PMX. This is because the algorithm spends much time in the 2-opt procedure, which searches each individual to determine whether a legal move is possible or not.

8. Conclusion

In this paper I have discussed Estimation of Distribution Algorithms. It seems reasonable to apply EDAs in place of GAs. But the estimation of the joint probability distribution associated with the database containing the selected individuals is a bottleneck of this new heuristic; there is no easy method to calculate it. If the distribution is more general we get better results, but the calculation of such a distribution is time-consuming and complicated, and sampling new instances from it is not an easy task.


For simplicity I have applied UMDA. It provides better results for linear problems, but only when there is no dependency among the variables. The n-Queen problem has positional dependencies among its variables; a Bayesian network may be a suitable structure-learning approach for the estimation of their probability distribution, which is my future work.

9. Future Work

• Estimation of the probability distribution by applying the Factorized Distribution Algorithm and the Bayesian Optimization Algorithm

• Applying EDAs to other optimization problems

    10. References

[1] Baluja, S. (1994). Population-based incremental learning: A method for integrating genetic search based function optimization and competitive learning. Technical Report No. CMU-CS-94-163, Carnegie Mellon University, Pittsburgh, Pennsylvania.

[2] Baluja, S. and Davies, S. (1997). Using optimal dependency-trees for combinatorial optimization: Learning the structure of the search space. Technical Report CMU-CS-97-107, Carnegie Mellon University, Pittsburgh, Pennsylvania.

[3] Baluja, S. and Caruana, R. (1995). Removing the genetics from the standard genetic algorithm. In A. Prieditis and S. Russell, editors, Proceedings of the International Conference on Machine Learning, Morgan Kaufmann, 38-46.

[4] Cestnik, B. (1990). Estimating probabilities: A crucial task in machine learning. Proceedings of the European Conference on Artificial Intelligence, 147-149.

[5] Cooper, G.F. and Herskovits, E.A. (1992). A Bayesian method for the induction of probabilistic networks from data. Machine Learning, 9:309-347.

[6] Croes, G.A. (1992). A method for solving traveling salesman problems. Operations Research, 6:791-812.

[7] De Bonet, J.S., Isbell, C.L. and Viola, P. (1997). MIMIC: Finding optima by estimating probability densities. Advances in Neural Information Processing Systems, Vol. 9.

[8] González, C., Lozano, J.A. and Larrañaga, P. (2001). Mathematical modeling of discrete estimation of distribution algorithms. In P. Larrañaga and J.A. Lozano, editors, Estimation of Distribution Algorithms: A New Tool for Evolutionary Optimization. Kluwer Academic Publishers, Boston.

[9] Harik, G. (1999). Linkage learning via probabilistic modeling in the ECGA. IlliGAL Report No. 99010, Illinois Genetic Algorithms Laboratory, University of Illinois, Urbana, Illinois.

[10] Harik, G.R., Lobo, F.G. and Goldberg, D.E. (1998). The compact genetic algorithm. In Proceedings of the IEEE Conference on Evolutionary Computation, 523-528.

[11] Holland, J.H. (1975). Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor, Michigan.

[12] Horowitz, E., Sahni, S. and Rajasekaran, S. (2000). Fundamentals of Computer Algorithms. Galgotia Publications Pvt. Ltd., New Delhi.

[13] Goldberg, D.E. (1989). Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley, Reading, Massachusetts.

[14] Goldberg, D.E. and Lingle, R. (1987). Alleles, loci, and the traveling salesman problem. In John J. Grefenstette, editor, Proceedings of the International Conference on Genetic Algorithms and their Applications, Morgan Kaufmann Publishers, Inc.

[15] Larrañaga, P. and Lozano, J.A. (2001). Estimation of Distribution Algorithms: A New Tool for Evolutionary Optimization. Kluwer Academic Publishers, Boston.

[16] Larrañaga, P., Etxeberria, R., Lozano, J.A. and Peña, J.M. (2000). Combinatorial optimization by learning and simulation of Bayesian networks. In Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence, Stanford, 343-352.

[17] Mühlenbein, H. (1998). The equation for response to selection and its use for prediction. Evolutionary Computation, 5(3):303-346.

[18] Mühlenbein, H. and Mahnig, T. (1999). The Factorized Distribution Algorithm for additively decomposed functions. Proceedings of the 1999 Congress on Evolutionary Computation, IEEE Press, 752-759.

[19] Mühlenbein, H. and Mahnig, T. (2002). Evolutionary optimization and the estimation of search distributions with applications to graph bipartitioning.

[20] Mühlenbein, H. and Paaß, G. (1996). From recombination of genes to the estimation of distributions I. Binary parameters. In Lecture Notes in Computer Science 1411: Parallel Problem Solving from Nature - PPSN IV, 178-187.

[21] Pelikan, M., Goldberg, D.E. and Cantú-Paz, E. (2000). Linkage problem, distribution estimation, and Bayesian networks. Evolutionary Computation, 8(3):311-340.

[22] Pelikan, M. and Mühlenbein, H. (1999). The bivariate marginal distribution algorithm. Advances in Soft Computing - Engineering Design and Manufacturing, 521-535.