
Practical Optimization Using Evolutionary Methods

Kalyanmoy Deb
Kanpur Genetic Algorithms Laboratory (KanGAL)

Department of Mechanical Engineering
Indian Institute of Technology Kanpur

Kanpur, PIN 208016, India
[email protected]

http://www.iitk.ac.in/kangal

KanGAL Report Number 2005008

Abstract

Many real-world problem-solving tasks, including CFD problems, involve posing and solving optimization problems, which are usually non-linear, non-differentiable, multi-dimensional, multi-modal, stochastic, and computationally time-consuming. In this paper, we discuss a number of such practical problems which are, in essence, optimization problems, and review the classical optimization methods to show that they are not adequate for solving such demanding tasks. On the other hand, in the past couple of decades, new yet practical optimization methods, based on natural evolutionary techniques, have increasingly been found useful in meeting these challenges. These methods are population-based, stochastic, and flexible, thereby providing an ideal platform for modifying them to suit most optimization problems. The remainder of the paper illustrates the working principles of such evolutionary optimization methods and presents some results in support of their efficacy. The breadth of their application domain and the ease and efficiency of their working make evolutionary optimization methods promising for taking up the challenges offered by the vagaries of various practical optimization problems.

1 INTRODUCTION

Optimization is an activity which does not belong to any particular discipline and is routinely used in almost all fields of science, engineering and commerce. Chambers dictionary describes optimization as an act of 'making the most or best of anything'. Theoretically speaking, performing an optimization task in a problem means finding the most or best suitable solution of the problem. Mathematical optimization studies spend a great deal of effort in trying to describe the properties of such an ideal solution. Engineering or practical optimization studies, on the other hand, strive to find a solution which is as close to such an ideal solution as possible. Although the ideal optimal solution is desired, restrictions on computing power and time often make practitioners content with an approximate solution.

Serious studies on practical optimization began as early as the Second World War, when the need for efficient deployment and resource allocation of military personnel and accessories became important. Most development in the so-called 'classical' optimization field came from devising step-by-step procedures for solving a particular type of optimization problem. Often, fundamental ideas from geometry and calculus were borrowed to reach the optimum in an iterative manner. Such optimization procedures have


enjoyed a good 50 years of research and applications and are still going strong. However, around the middle of the eighties, completely unorthodox and less mathematical yet intriguing optimization procedures were suggested, mostly by computer scientists. This is not surprising, because these 'non-traditional' optimization methods exploit the fast and distributed computing machines which are becoming increasingly available and affordable, much as slide-rules were in the sixties.

In this paper, we focus on one such non-traditional optimization method which takes the lion's share of all non-traditional optimization methods. This so-called 'evolutionary optimization (EO)' mimics natural evolutionary principles on randomly-picked solutions from the search space of the problem and iteratively progresses towards the optimum point. Nature's ruthless selective advantage to the fittest individuals, and the creation of new and fitter individuals over generations using recombinative and mutative genetic processing, is mimicked artificially in a computer algorithm operating on a search space where good and bad solutions to the underlying problem coexist. The task of an evolutionary optimization algorithm is then to avoid the bad solutions in the search space, take clues from good solutions, and eventually reach close to the best solution, similar to the genetic processing in natural systems.

Just as various natural evolutionary principles apply to lower and higher level species, researchers have developed different kinds of evolutionary plans, resulting in a gamut of evolutionary algorithms (EAs) – some emphasizing the recombination procedure, some emphasizing the mutation operation, some using a niching strategy, and some using a mating restriction strategy. In this article, we only give a description of one popular approach – the genetic algorithm (GA). Other approaches are also well-established and can be found in the EA literature [42, 1, 19, 29, 44, 27].

In the remainder of the paper, we describe an optimization problem and cite a number of commonly-used practical problems, including a number of CFD problems, which are in essence optimization problems. A clear look at the properties of such practical optimization problems and a description of the working principles of classical optimization methods

reveal that completely different optimization procedures are in order for such problem solving. Thereafter, the evolutionary optimization procedure is described and its suitability in meeting the challenges offered by various practical optimization problems is demonstrated. The final section concludes the study.

2 AN OPTIMIZATION PROBLEM AND ITS NOTATIONS

Throughout this paper, we describe procedures for finding the optimum solution of a problem of the following type:

    Minimize    f(x),
    subject to  g_j(x) ≥ 0,
                h_k(x) = 0,
                x^min ≤ x ≤ x^max.        (1)

Here, f(x) is the objective function (of n variables) which is to be minimized. A maximization problem can be converted to a minimization problem by multiplying the function by −1. The inequality constraints g_j and equality constraints h_k demand that a solution x be feasible only if all constraints are satisfied. Many problems also require that the search is restricted within a prescribed hypervolume defined by lower and upper bounds on each variable. This region is called the search space, and the set of all feasible solutions is called the feasible space. Usually, there exists at least one solution x* in the feasible space which corresponds to the minimum objective value. This solution is called the optimum solution.

Thus, the task in an optimization process is to start from one or a few random solutions in the search space and utilize the objective and constraint functions to drive the search towards the feasible region, finally reaching near the optimum solution by exploring as small a set of solutions as possible.
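As a concrete illustration of formulation (1), the following sketch evaluates randomly sampled solutions of a small assumed problem (the objective, the single constraint, and the bounds are invented for illustration) and keeps the best feasible one — the crudest possible version of the search process just described:

```python
import random

# A hypothetical instance of formulation (1): minimize f subject to
# one inequality constraint g(x) >= 0 and variable bounds.
def f(x):
    return (x[0] - 2.0) ** 2 + (x[1] - 1.0) ** 2

def g(x):
    return x[0] + x[1] - 1.0  # feasible when g(x) >= 0

XMIN, XMAX = [0.0, 0.0], [5.0, 5.0]

def is_feasible(x):
    return g(x) >= 0.0 and all(lo <= xi <= hi
                               for xi, lo, hi in zip(x, XMIN, XMAX))

# Sample random solutions in the search space; keep the best feasible one.
random.seed(1)
best = None
for _ in range(1000):
    x = [random.uniform(lo, hi) for lo, hi in zip(XMIN, XMAX)]
    if is_feasible(x) and (best is None or f(x) < f(best)):
        best = x
```

Pure random sampling is, of course, exactly what a good optimizer improves upon: it wastes evaluations on bad regions instead of taking clues from good solutions.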


3 SCOPE OF OPTIMIZATION IN PRACTICE

Many researchers and practitioners may not realize it, but they often either use or are required to use an optimization method. For example, in fitting a linear relationship y = mx + c between an input parameter x and an output parameter y through a set of n data points, we often almost blindly use the following relationships:

    m = (n Σxy − Σx Σy) / (n Σx² − (Σx)²),    (2)

    c = (Σy − m Σx) / n.    (3)

It is interesting to know that the above relationships have been derived by solving an unconstrained quadratic optimization problem of minimizing the overall vertical error between the actual and the predicted output values. In this section, we briefly discuss different practical problem-solving tasks in which an optimization procedure is usually used.
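The closed-form result can be checked numerically: the sketch below applies equations (2) and (3) to a small synthetic data set (the data are assumed, chosen to lie roughly on y = 2x + 1):

```python
# Least-squares fit of y = m*x + c via equations (2) and (3),
# using a small synthetic data set (assumed for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]  # roughly y = 2x + 1
n = len(xs)

sx = sum(xs)                              # Σx
sy = sum(ys)                              # Σy
sxy = sum(x * y for x, y in zip(xs, ys))  # Σxy
sxx = sum(x * x for x in xs)              # Σx²

m = (n * sxy - sx * sy) / (n * sxx - sx * sx)  # equation (2)
c = (sy - m * sx) / n                          # equation (3)
```

For this data set the fit gives m ≈ 1.99 and c ≈ 1.04, close to the generating slope and intercept, as the minimum-vertical-error derivation predicts.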

Optimal design: Instead of arriving at a solution which is simply functionally satisfactory, an optimization technique can be used to find an optimal design which will minimize or maximize a design goal. This activity probably takes the lion's share of all optimization activities and is most routinely used. Here the decision variables can be dimensions, shapes, materials, etc., which describe a design, and the objective function can be the cost of production, energy consumption, drag, lift, reliability, stability margin, etc. For example, in the optimal design of an airfoil shape for minimum drag (the objective function), the shape of the airfoil can be represented by a few control parameters, which are treated as the decision variables. A minimum limit on the lift can be kept as a constraint and the airfoil area can be bounded within certain limits. Such a task will produce an airfoil shape which not only provides the desired lift and area, but also causes as small a drag as possible.

Optimal control: In this activity, control parameters, as functions of time, distance, etc., are the decision variables. The objective functions are usually estimates computed at the end of a process, such as the quality of the end-product, the time-averaged drag or lift, and stability or some other performance measure across the entire process. For example, in the design of a time-varying nozzle shape for achieving minimum average drag from launch to full throttle, parameters describing the shape of the nozzle as a function of time are the decision variables. To evaluate a particular solution, the CFD system must be solved in a time sequence from launch to full throttle, and the drag experienced at every time step must be recorded for the computation of the average drag coefficient over the entire flight. Stability or other performance indicators must also be computed and recorded at every time step to check whether the system is safe and performs satisfactorily over the entire flight.

Modeling: Often in practice, systems involve complex processes which are difficult to express using exact mathematical relationships. However, in such cases, either through pilot case studies or through actual plant operation, numerous input-output data are available. To understand the system better and to improve it, it is necessary to find the relationships between input and output parameters, in a process known as 'modeling'. Such an optimization task involves minimizing the error between the predicted output obtained from the developed model and the actual output. To compute the predicted output, a parameterized mathematical model can be assumed based on some information about the system, such as y = a1 exp(−a2 x), where a1 and a2 are the parameters to be optimized for modeling the exponential relationship between the input x and the output y. If a mathematical relationship is not known at all, a relationship can be found by a sophisticated optimization method (such as the genetic programming method [29]) or by using an artificial neural network.
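As a sketch of such a modeling task (the data and the log-linearization trick are illustrative, not from the paper): since ln y = ln a1 − a2 x is linear in x, the closed-form line-fit relations (2) and (3) can be reused to estimate a1 and a2 from input-output data:

```python
import math

# Fit the exponential model y = a1 * exp(-a2 * x) by minimizing the
# error in log space: z = ln y = ln a1 - a2 * x is linear in x, so
# the line-fit relations (2) and (3) apply. Data are synthetic.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * math.exp(-0.8 * x) for x in xs]  # generated with a1=2, a2=0.8

zs = [math.log(y) for y in ys]
n = len(xs)
sx, sz = sum(xs), sum(zs)
sxz = sum(x * z for x, z in zip(xs, zs))
sxx = sum(x * x for x in xs)

slope = (n * sxz - sx * sz) / (n * sxx - sx * sx)  # = -a2
intercept = (sz - slope * sx) / n                  # = ln a1
a1, a2 = math.exp(intercept), -slope
```

On noise-free data the generating parameters are recovered; with noisy plant data, the same error-minimization view applies, but a general-purpose optimizer would replace the closed form.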


Scheduling: Many practical problems involve finding a permutation which optimizes certain objectives. In these problems, the sequence of operations is of importance, rather than the absolute location of an operation in the permutation. For example, in a machining sequencing problem, it is important to know in what order a job will flow from one machine to the next, instead of knowing exactly when a job has to be made on a particular machine. Such optimization tasks appear in time-tabling, planning, resource allocation and other problems, and are usually known as combinatorial optimization problems.

Prediction and forecasting: Many time-varying real-world problems are periodic and predictable. Past data for such problems can be analyzed to find the nature of the periodicity, so that a better understanding of the problem can be achieved. Using such a model, future forecasts can also be made judiciously. Although such applications are plentiful in the financial domain, periodicities in fluidic systems, such as repeated wake formation of a particular type and the prediction of where and when it will reappear, are also of importance.

Data mining: A major task in any real-world problem solving today is to analyze multi-dimensional data and discover the hidden useful information they carry. This is by no means an easy task, simply because it is not known what kind of information the data carry, and in most problems it is not even known what kind of information one should look for. A major task in such problems is to cluster functionally similar data together and place functionally dissimilar data in separate clusters. Once the entire data set is clustered, useful properties of iso-cluster data can be deciphered for a better understanding of the problem. The clustering task is an optimization problem in which the objective is usually to maximize the inter-cluster distance between any pair of clusters and simultaneously minimize their intra-cluster distances. Another task often required is the identification of smallest-size classifiers which are able to correctly classify most samples of the data set into various classes. Such problems often arise in bio-informatics [15] and can also be prevalent in other data-driven engineering tasks. In a CFD problem, such a classification task should be able to find classifiers by which an investigation of a flow pattern would reveal the type of flow (laminar or turbulent), the type of fluid, or the kind of geometry associated with the problem.

Machine learning: In the age of automation, many real-world systems are equipped with automated and intelligent subsystems which can make decisions and adjust the system optimally, as and when an unforeseen scenario happens. In the design of such intelligent subsystems, machine learning techniques involving different soft-computing and artificial intelligence techniques are often combined. To make such a system operate with a minimum change or with a minimum energy requirement, an optimization procedure is needed. Since such subsystems are to be used on-line, and since on-line optimization is a difficult proposition due to time restrictions, such problems are often posed as offline optimization problems and solved by trying them on a number of synthetic (or real) scenarios [32, 33]. In such optimization tasks, an optimal rule base is learned which works best on the chosen test scenarios. It is then hypothesized that such an optimal rule base will also work well in real scenarios. To make the optimal rule base reliable in real scenarios, the optimization procedure can be repeated with new scenarios and a new optimal rule base can be learned.

3.1 Properties of Practical Optimization Problems

Based on the above discussion, we observe that practical optimization problems usually have the following properties:

1. They are non-smooth problems, whose objectives and constraints are most likely non-differentiable and discontinuous.


2. Often, the decision variables are discrete, making the search space discrete as well.

3. The problems may have mixed types (real, discrete, Boolean, permutation, etc.) of variables.

4. They may have highly non-linear objective and constraint functions, due to complicated relationships and equations which the decision variables must form and satisfy. This makes them non-linear optimization problems.

5. There are uncertainties associated with the decision variables, due to which the true optimum solution may not be of much importance to a practitioner.

6. The objective and constraint functions may also be noisy and non-deterministic.

7. The evaluation of the objective and constraint functions is computationally expensive.

8. The problems give rise to multiple optimal solutions, of which some are globally best and many others are locally optimal.

9. The problems involve multiple conflicting objectives, for which no one solution is best with respect to all chosen objectives.

4 CLASSICAL OPTIMIZATION METHODS

Most classical point-by-point algorithms use a deterministic procedure for approaching the optimum solution. Such algorithms start from a random guess solution. Thereafter, based on a pre-specified transition rule, the algorithm suggests a search direction, which is often arrived at by considering local information. A one-dimensional search is then performed along the search direction to find the best solution. This best solution becomes the new solution and the above procedure is repeated a number of times. Figure 1 illustrates this procedure. Algorithms vary mostly in the way the search directions are defined at each intermediate solution.

[Figure 1: Most classical methods use a point-by-point approach. The figure (not reproduced here) shows an initial guess in the feasible search space, bounded by constraints, progressing towards the optimum.]

Classical search and optimization methods can be classified into two distinct groups, mostly by the way the directions are chosen: direct and gradient-based methods [6, 35, 38]. In direct methods, only the objective function and constraints are used to guide the search, whereas gradient-based methods use the first and/or second-order derivatives of the objective function and/or constraints to guide the search process. Since derivative information is not used, direct search methods are usually slow, requiring many function evaluations for convergence. For the same reason, they can be applied to many problems without a major change of the algorithm. On the other hand, gradient-based methods quickly converge to an optimal solution, but are not efficient on non-differentiable or discontinuous problems. In addition, there are some common difficulties with most of the traditional direct and gradient-based techniques:

• Convergence to an optimal solution depends on the chosen initial solution.

• Most algorithms are prone to getting stuck at a suboptimal solution.

• An algorithm efficient in solving one problem may not be efficient in solving a different problem.

• Algorithms are not efficient in handling problems having discrete variables, or highly non-linear and numerous constraints.

• Algorithms cannot be efficiently used on a parallel computer.
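The point-by-point template — pick a search direction from local information, then perform a one-dimensional search along it — can be sketched as follows; the quadratic objective and the candidate step lengths are assumed for illustration:

```python
# Point-by-point template: from the current point, choose a search
# direction (here the negative gradient), do a crude one-dimensional
# search along it, and repeat. The objective below is assumed.
def f(x):
    return (x[0] - 1.0) ** 2 + 4.0 * (x[1] + 2.0) ** 2

def grad(x):
    return [2.0 * (x[0] - 1.0), 8.0 * (x[1] + 2.0)]

x = [5.0, 5.0]  # initial guess
for _ in range(100):
    d = [-g for g in grad(x)]  # search direction from local information
    # one-dimensional search: try a few step lengths, keep the best point
    steps = [0.5 ** k for k in range(12)]
    x = min(([xi + a * di for xi, di in zip(x, d)] for a in steps),
            key=f)
```

On this smooth problem the iterates home in on the minimizer (1, −2); on a non-differentiable or discontinuous problem, the gradient call itself would be unavailable, which is exactly the limitation noted above.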

5 MOTIVATION FROM NATURE AND EVOLUTIONARY OPTIMIZATION

It is commonly believed that the main driving principle behind the natural evolutionary process is Darwin's survival-of-the-fittest principle [3, 18]. In most scenarios, nature ruthlessly follows two simple principles:

1. If by genetic processing an above-average offspring is created, it usually survives longer than an average individual and thus has more opportunities to produce offspring carrying some of its traits than an average individual.

2. If, on the other hand, a below-average offspring is created, it usually does not survive long and thus gets eliminated quickly from the population.

The principle of emphasizing good solutions and eliminating bad solutions seems to dovetail well with the desired properties of a good optimization algorithm. But one may wonder about the real connection between an optimization procedure and natural evolution! Has the natural evolutionary process tried to maximize a utility function of some sort? Truly speaking, one can imagine a number of such functions which nature may have been striving to maximize: the life span of a species, the quality of life of a species, physical growth, and others. However, each of these functions is non-stationary in nature and largely depends on the evolution of other related species. Thus, in essence, nature has been optimizing a much more complicated objective function by means of natural genetics and natural selection than the search and optimization problems we are interested in solving in practice. It is therefore not surprising that a computerized evolutionary optimization (EO) algorithm is not as complex as the natural genetics and selection procedures; rather, it is an abstraction of the complex natural evolutionary process. Although an EO is a simple abstraction, it is robust and has been found to solve various search and optimization problems of science, engineering, and commerce.

5.1 Evolutionary Optimization

The idea of using evolutionary principles to constitute a computerized optimization algorithm was suggested by a number of researchers located in geographically distant places across the globe. The field now known as 'genetic algorithms' was originated by John Holland of the University of Michigan [24], the contemporary field of 'evolution strategies' was originated by Ingo Rechenberg and Hans-Paul Schwefel of the Technical University of Berlin [37, 41], the field of 'evolutionary programming' was originated by Larry Fogel [20], and so on. Figure 2 provides an overview of computational intelligence and its components; different subfields of evolutionary computing are also shown. In this paper, we mainly discuss the principles of a genetic algorithm and its use in solving different optimization problems.

A genetic algorithm (GA) is an iterative optimization procedure. Instead of working with a single solution in each iteration, a GA works with a number of solutions (collectively known as a population) in each iteration. A flowchart of the working principle of a simple GA is shown in Figure 3. In the absence of any knowledge of the problem domain, a GA begins its search from a random population of solutions. If a termination criterion is not satisfied, different operators – reproduction and the variation operators (crossover, mutation, and others) – are applied to update the population of solutions. One iteration of these operators is known as a generation in the parlance of GAs. Since the representation of a solution in a GA is similar to a natural chromosome and GA operators are similar to genetic operators, the above


[Figure 2: Computational intelligence and evolutionary computation. The diagram (not reproduced here) groups neural networks, fuzzy systems, evolvable hardware, and evolutionary algorithms under computational intelligence, with genetic algorithms, evolution strategies, evolutionary programming, genetic programming, differential evolution, and particle swarm intelligence shown as subfields of evolutionary computing.]

procedure is called a genetic algorithm.
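The generational loop just described can be sketched as follows; the 'onemax' fitness (the count of ones in a binary string) and all parameter values are assumed toy choices, not from the paper:

```python
import random

# A minimal generational GA sketch: initialize a population, then
# repeat reproduction (binary tournament selection) and variation
# (one-point crossover, bitwise mutation). Parameters are assumed.
random.seed(7)
N, L, GENS = 40, 30, 60  # population size, string length, generations

def fitness(s):
    return sum(s)  # "onemax": maximize the number of ones

pop = [[random.randint(0, 1) for _ in range(L)] for _ in range(N)]
for t in range(GENS):
    # reproduction: each slot filled by the winner of a 2-way tournament
    mating = [max(random.sample(pop, 2), key=fitness) for _ in range(N)]
    # variation: one-point crossover, then mutate each bit with prob 1/L
    nxt = []
    for a, b in zip(mating[::2], mating[1::2]):
        cut = random.randrange(1, L)
        for child in (a[:cut] + b[cut:], b[:cut] + a[cut:]):
            nxt.append([bit ^ (random.random() < 1.0 / L)
                        for bit in child])
    pop = nxt
best = max(pop, key=fitness)
```

One pass through the loop body is one generation; selection supplies the "emphasize good solutions" pressure, while crossover and mutation supply the variation.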

5.1.1 Step 1: Representation of a solution

Representation of a solution describing the problem is an important first step in a GA. The solution vector x can be represented as a vector of real numbers, discrete numbers, a permutation of entities, or a combination of these, as suitable to the underlying problem. If a problem demands mixed real and discrete variables, a GA allows such a representation of a solution.

5.1.2 Step 2: Initialization of a population of solutions

Usually, a set of N random solutions is initialized in a pre-defined search space (x_i^(L) ≤ x_i ≤ x_i^(U)). However, it is not necessary that the subsequent GA operations are confined to creating solutions in this range. To make sure that the initial population is well-distributed, any space-filling or Latin-hypercube method can also be used. It is also not necessary that the initial population is created randomly over the search space. If some problem information is available (such as knowledge about good solutions), a biased distribution can be created in

[Figure 3: A flowchart of the working principle of a genetic algorithm. The flowchart (not reproduced here) proceeds as: begin; choose a representation scheme (Step 1); initialize the population, t = 0 (Step 2); evaluation (Step 3); if the termination condition holds, stop; otherwise apply reproduction (Step 4) and variation (Step 5), set t = t + 1, and return to evaluation.]

any suitable portion of the search space. In difficult problems with a known good solution, the population can be created around the good solution by randomly perturbing it [5].
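Step 2 can be sketched as below; the bounds, the population size, and the Gaussian perturbation used for the biased population are assumed for illustration:

```python
import random

# Step 2 sketch: N random solutions within variable bounds
# (bounds and N are assumed for illustration).
random.seed(0)
N = 20
xmin = [0.0, -1.0, 10.0]
xmax = [1.0,  1.0, 50.0]

pop = [[random.uniform(lo, hi) for lo, hi in zip(xmin, xmax)]
       for _ in range(N)]

# If a good solution is known, a biased population can instead be
# created by perturbing it (a small Gaussian perturbation, clipped
# back to the bounds, is one simple choice).
known = [0.5, 0.0, 30.0]
biased = [[min(max(xi + random.gauss(0.0, 0.1 * (hi - lo)), lo), hi)
           for xi, lo, hi in zip(known, xmin, xmax)]
          for _ in range(N)]
```

A Latin-hypercube or other space-filling design would replace the first comprehension when a better-spread initial population is wanted.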

5.1.3 Step 3: Evaluation of a solution

In this step, every solution is evaluated and a fitness value is assigned to it. This is where the solution is checked for its feasibility (by computing and checking the constraint functions g_j and h_k). By simply assigning a suitable fitness measure, a feasible solution can be given more importance than an infeasible one, and a better feasible solution more importance than a worse feasible solution. This is also the place where a GA dealing with a single objective function can be converted into a multi-objective GA in which multiple conflicting objectives are handled. In most practical optimization problems, this is also the most time-consuming step. Hence,


any effort to complete this step quickly (either by using distributed computers or by approximating it) would provide a substantial saving in the overall procedure.
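One common way (a sketch of the general idea, not necessarily the paper's exact rule) to realize such a fitness measure is to rank feasible solutions by objective value and place every infeasible solution below all of them according to its total constraint violation:

```python
# Sketch of a constraint-aware fitness (lower is better): feasible
# solutions get their objective value; infeasible ones get a value
# worse than any feasible solution, ordered by total violation.
def violation(x, gs, hs, eps=1e-6):
    v = sum(max(0.0, -g(x)) for g in gs)                 # g_j(x) >= 0
    v += sum(abs(h(x)) for h in hs if abs(h(x)) > eps)   # h_k(x) = 0
    return v

def fitness(x, f, gs, hs, f_worst_feasible):
    v = violation(x, gs, hs)
    if v == 0.0:
        return f(x)               # better feasible -> lower fitness
    return f_worst_feasible + v   # any infeasible ranks worse

# A hypothetical problem: minimize x0^2 subject to x0 >= 1.
f = lambda x: x[0] ** 2
gs = [lambda x: x[0] - 1.0]
hs = []
```

With this measure, `fitness([2.0], f, gs, hs, 100.0)` simply returns the objective value 4.0, while the infeasible point `[0.0]` is pushed above the assumed worst feasible value.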

The evaluation step first requires deciphering the solution vector from the chosen representation scheme. If the variables are represented directly as real or discrete numbers, they are immediately available. However, if a binary substring is used to represent some discrete variables, the exact variable value must first be computed from the substring. For example, the following is a string representing n variables:

    11010   1001001   010   . . .   0010
     x_1      x_2     x_3            x_n

The i-th problem variable is coded in a binary substring of length ℓ_i, so that the total number of alternatives allowed for that variable is 2^{ℓ_i}. The lower bound x_i^min is represented by the substring (00...0) and the upper bound x_i^max by the substring (11...1). Any other substring s_i decodes to a value x_i as follows:

    x_i = x_i^min + [(x_i^max − x_i^min) / (2^{ℓ_i} − 1)] · DV(s_i),        (4)

where DV(s_i) is the decoded value of the substring s_i. Such a representation of discrete variables (even of real-parameter variables) has both a traditional root and a natural root. A binary string resembles a chromosome comprising a number of genes, each taking one of two values, one or zero. Treating these strings as individuals, we can then mimic crossover and mutation operators similar to their natural counterparts.
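The decoding rule of Eq. (4) is easy to sketch in code. The function name and the most-significant-bit-first ordering below are illustrative assumptions, not prescriptions from the paper:

```python
def decode_binary(substring, x_min, x_max):
    """Decode a binary substring into a variable value as in Eq. (4).

    substring: string of '0'/'1' characters, most significant bit first
    (the bit ordering is an assumption; the paper does not fix it).
    """
    ell = len(substring)
    dv = int(substring, 2)                       # decoded value DV(s_i)
    return x_min + (x_max - x_min) / (2 ** ell - 1) * dv

# (0...0) maps to the lower bound and (1...1) to the upper bound:
print(decode_binary("0000", 0.0, 1.0))   # 0.0
print(decode_binary("1111", 0.0, 1.0))   # 1.0
```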

To represent a permutation of n integers, an innovative coded representation can be adopted so that simple genetic operators can be applied to the coding. In the proposed representation [13], the i-th position from the left of the string holds an integer between zero and (i−1). The string is decoded into a permutation as follows. The i-th position denotes the placement of component i in the permutation. Decoding starts from the left-most position and proceeds serially towards the right. While decoding the i-th position, the first (i−1) components are already placed, thereby providing i place-holders in which to position the i-th component. We illustrate this coding procedure with an example having six components (a to f) and six integers (0 to 5). Say we have the string

(0 0 2 1 3 2)

and would like to decipher the corresponding permutation. The second component (i = 2) has two places where it can be positioned: (i) before component a, or (ii) after component a. The first case is denoted by a 0 and the second by a 1 in the string. Since the place-holder value for i = 2 (component b) is zero in the above string, component b appears before component a. Similarly, with the first two components placed as (b a), there are three place-holders for component c: (i) before b (value 0), (ii) between b and a (value 1), and (iii) after a (value 2). Since the string has the value 2 in the third place, component c is placed after a. Continuing in this fashion, we obtain the permutation

(b d f a e c)

corresponding to the above string representation. The advantage of this coding is that simple crossover and mutation operators (to be discussed later) can be applied to it to create valid permutations.
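The place-holder decoding described above amounts to a sequence of list insertions. A minimal sketch (the function name is ours):

```python
def decode_permutation(string, components):
    """Decode the place-holder representation [13] into a permutation.

    string[i] must lie in 0..i; component i is inserted at that
    position among the components already placed.
    """
    perm = []
    for comp, pos in zip(components, string):
        perm.insert(pos, comp)   # i place-holders exist at step i
    return perm

print(decode_permutation([0, 0, 2, 1, 3, 2], list("abcdef")))
# ['b', 'd', 'f', 'a', 'e', 'c']
```

This reproduces the worked example: the string (0 0 2 1 3 2) yields the permutation (b d f a e c).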

5.1.4 Step 4: Reproduction operator

Reproduction (or selection) is usually the first operator applied to the population. Reproduction selects good strings in a population and forms a mating pool. The essential idea is to emphasize above-average strings in the population. The so-called binary tournament reproduction operator picks two solutions at random from the population, and the better of the two, according to its fitness value, is chosen. Although a deterministic procedure could be used for this purpose (for example, the best 50% of population members could be duplicated), usually a stochastic procedure is adopted in an EA to reduce the chance of getting trapped in a local optimal solution.
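A minimal sketch of binary tournament reproduction, assuming fitness is to be maximized (the function names are ours):

```python
import random

def binary_tournament(population, fitness, pool_size=None):
    """Fill a mating pool: repeatedly pick two distinct members at
    random and copy the fitter of the two into the pool."""
    if pool_size is None:
        pool_size = len(population)
    pool = []
    for _ in range(pool_size):
        a, b = random.sample(population, 2)      # two distinct members
        pool.append(a if fitness(a) >= fitness(b) else b)
    return pool

# With distinct fitnesses, the worst member can never win a tournament,
# so above-average members are emphasized in the mating pool.
pool = binary_tournament(list(range(1, 11)), fitness=lambda x: x)
```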

8

Page 9: Practical Optimization Using Evolutionary Methods · It isnotsurprisingbecausethese‘non-traditional’op-timization methods exploit the fast and distributed computing machines which

5.1.5 Step 5: Variation operators

During the selection operation, good population members are emphasized at the expense of bad ones, but no new solution is created. The purpose of variation operators is to create new and hopefully improved solutions from the mating pool. A number of successive operations can be used here; however, two main operators are mostly used.

In a crossover operator, two solutions are picked from the mating pool at random and an information exchange is made between them to create one or more offspring solutions. In the single-point crossover operator applied to binary strings, both strings are cut at an arbitrary place and the right-side portions of the strings are swapped to create two new strings:

    0 0 | 0 0 0          0 0 | 1 1 1
                   ⇒
    1 1 | 1 1 1          1 1 | 0 0 0

It is true that not every crossover between two solutions from the new population is likely to produce offspring better than both parents, but the chance of creating better solutions is far better than random [21]. This is because the parent strings being crossed are not arbitrary random strings. These strings have survived tournaments played against other solutions during the earlier reproduction phase. Thus, they are expected to contain some good bit combinations in their string representations. Since a single-point crossover on a pair of parent strings can only create ℓ different string pairs (instead of all 2^ℓ − 1 possible string-pairs) with bit combinations from either string, the created offspring are also likely to be good strings. To reduce the chance of losing too many good strings in this process, a pair of solutions usually participates in the crossover operator with a crossover probability pc. Usually, a large value within [0.7, 1] is chosen for this parameter. A value of pc = 0.8 means that 80% of population members participate in crossovers to create new offspring, while the remaining 20% of parents are accepted directly as offspring.
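The operator can be sketched as follows (names are ours; the cut point is drawn strictly inside the string so that some exchange always happens when a crossover occurs):

```python
import random

def single_point_crossover(p1, p2, pc=0.8):
    """With probability 1 - pc return the parents unchanged; otherwise
    cut both binary strings at a random interior point and swap tails."""
    if random.random() >= pc:
        return p1, p2                          # parents copied as offspring
    cut = random.randint(1, len(p1) - 1)       # cut strictly inside the string
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

# The illustration above corresponds to cut = 2:
# "00000", "11111"  ->  "00111", "11000"
```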

The mutation operator perturbs a solution in its vicinity with a small mutation probability, pm. For example, in a binary string of length ℓ, a 1 is changed to a 0 and vice versa, as happens in the fourth bit of the following example:

0 0 1 1 1 ⇒ 0 0 1 0 1

Usually, a small value of pm ≈ 1/ℓ or 1/n (where n is the number of variables) is used. Once again, the mutation operation uses random numbers, but it is not an entirely random process, as from a particular string it is not possible to move to an arbitrary string in one mutation event. Mutation uses a biased distribution so as to move to a solution close to the original one. The basic difference between the crossover and mutation operations is that the former needs more than one parent solution, whereas the latter uses only one solution, directly or indirectly.
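Bitwise mutation can be sketched in a few lines (names are ours; each bit is flipped independently, so moving far from the original string in one event is improbable but not impossible):

```python
import random

def bitwise_mutation(string, pm=None):
    """Flip each bit of a binary string independently with
    probability pm; pm defaults to 1/l as suggested above."""
    if pm is None:
        pm = 1.0 / len(string)
    return "".join(("1" if bit == "0" else "0")
                   if random.random() < pm else bit
                   for bit in string)
```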

After the reproduction and variation operators are applied to the whole population, one generation of a GA is completed. These operators are simple and straightforward. The reproduction operator selects good strings, and the crossover operator recombines good substrings from two good strings to hopefully form a better substring. The mutation operator alters a string locally to hopefully create a better string. Even though none of these outcomes is guaranteed or tested while creating a new population of strings, it is expected that if bad strings are created they will be eliminated by the reproduction operator in the next generation, and if good strings are created, they will be retained and emphasized.

Although the above illustrative discussion of the working of a genetic algorithm may seem to be a computer-savvy, nature-hacker's game-play, there exist rigorous mathematical studies in which the complete processing of a finite population of solutions under GA operators is modeled using Markov chains [45], statistical mechanics approaches [34], and approximate analyses [22]. On different classes of functions, the recent dynamical systems analysis [46], treating a GA as a random heuristic search, reveals the complex dynamical behavior of a binary-coded GA with many meta-stable attractors. An interesting outcome is the effect of population size on the degree of meta-stability. In the statistical mechanics approach, instead of considering microscopic details of the evolving system, several macroscopic variables



describing the system, such as the mean, standard deviation, skewness and kurtosis of the fitness distribution, are modeled. Analyzing different GA implementations with the help of these cumulants provides interesting insights into the complex interactions among GA operators [39]. Rudolph has shown that a simple GA with an elite-preserving operator and a non-zero mutation probability converges to the globally optimal solution of any optimization problem [40]. Leaving aside the details of the theoretical studies (which can be found in the growing EA literature), here we shall illustrate the robustness (combined breadth and efficiency) of a GA-based optimization strategy in solving different types of optimization problems commonly encountered in practice.

5.2 Real-Parameter Genetic Algorithms

When the decision variables of an optimization problem are real-valued, they can be represented directly in a GA, as mentioned above, instead of through a binary representation. Such real-parameter GAs came to light in the nineties, when engineering applications of GAs gained prominence. Although the GA flowchart remains the same, crossover and mutation operators suitable for real-valued parameters needed to be developed.

In the past, most studies used variable-wise crossover and mutation operators. Recently, vector-wise operators have been suggested. A mutation operator perturbs a parent solution in its vicinity using a Gaussian-like probability distribution. Sophisticated crossover operators have also been suggested. We proposed a parent-centric crossover (PCX) operator [9] which uses three or more parents and constructs a probability distribution for creating offspring solutions. An interesting feature of this probability distribution is that it is defined based on the vector-differences of the parent solutions, instead of the parent solutions themselves. Figure 4 shows the distribution of offspring around parents under the PCX operator.

Since offspring solutions have a diversity proportional to that of the parent population, such a crossover operator introduces a self-adaptive feature, which allows a broad search early on and a focused

Figure 4: Offspring created using the PCX operator.

search later in an automatic manner. Such a property is desired in an efficient optimization procedure and is present in other real-parameter EAs, such as evolution strategies (ES) [42], differential evolution [44] and particle swarm optimization [27].

On a 20-variable non-linear unconstrained minimization problem

    f(x) = Σ_{i=1}^{n−1} [ 100 (x_i² − x_{i+1})² + (x_i − 1)² ],        (5)
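Eq. (5) (the generalized Rosenbrock function, with minimum f = 0 at x = (1, ..., 1)) can be transcribed directly; for example:

```python
def rosenbrock(x):
    """Generalized Rosenbrock function of Eq. (5)."""
    return sum(100.0 * (x[i] ** 2 - x[i + 1]) ** 2 + (x[i] - 1.0) ** 2
               for i in range(len(x) - 1))

print(rosenbrock([1.0] * 20))   # 0.0 at the optimum
```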

a GA with the PCX operator [9] is reported to find the optimum solution with an error of 10^−20 in function value in the smallest number of function evaluations compared to a number of other real-parameter EAs ((μ, λ)-ES, CMA-ES [23] and differential evolution (DE)) and the classical quasi-Newton (BFGS) approach (which got stuck at a solution having f = 6.077 × 10^−17):

    Method        Best       Median     Worst
    GA+PCX        16,508     21,452     25,520
    (1,10)-ES     591,400    803,800    997,500
    CMA-ES        29,208     33,048     41,076
    DE            243,800    587,920    942,040
    BFGS          26,000



In the following sections, we discuss a few practical optimization problems and show how their (approximate) solution can be made easier with an evolutionary optimization strategy.

6 NON-SMOOTH OPTIMIZATION PROBLEMS

Many practical problems are non-smooth in nature, introducing non-differentiabilities and discontinuities in the objective and constraint functions. At such points of non-smoothness, exact derivatives do not usually exist. Although numerical gradients can be computed, they can often be meaningless, particularly if the non-smoothness occurs on or near optimum points. Thus, it becomes difficult for gradient-based optimization methods to work well in such scenarios. For a GA, such non-smoothness does not matter, as a GA does not use any gradient information during its search towards the optimum. As shown in Figure 5, as long as descent information is provided by the function values (by way of comparing the function values of two solutions), a GA can work its way towards the optimum.

Figure 5: Discontinuous search space.

Figure 6: Discrete search space.

In many problems, variables take discrete values, thereby providing no meaningful way of computing gradients and of using gradient-based classical methods. Once again, since a GA requires only function-value comparisons to arrive at descent information, it can be used in such discrete search spaces as well (Figure 6).

6.1 MIXED (REAL AND DISCRETE) OPTIMIZATION PROBLEMS

In an EO, decision variables can be coded in binary strings or directly, depending on the type of the variable. A zero-one variable can be coded using a single bit (0 or 1). A discrete variable can be coded either in a binary string or directly (the latter if the total number of permissible choices for the variable is not 2^k for an integer k). A continuous variable can be coded directly. This coding allows a natural way to represent different kinds of variables, as depicted in the following solution representing a complete design of a cantilever beam having four design variables:

((1) 15 23.457 (1011))

The first variable represents the shape of the cross-section of the beam. There are two options: circular (a 1) or square (a 0). Thus, it is a zero-one variable. The second variable represents the diameter of the circular section if the first variable is a 1, or the side of the square if the first variable is a 0. This variable takes only one of a few pre-specified values; thus, it is a discrete variable coded directly. The third variable represents the length of the cantilever beam, which can take any real value; thus, it is a continuous variable. The fourth variable is a discrete variable representing the material of the cantilever beam, which is one of 16 pre-specified materials. Thus, a four-bit substring is required to code this variable; here it represents the 12th material on the pre-specified list. The above string therefore represents a cantilever beam made of the 12th material from a prescribed list of 16 materials, having a circular cross-section with a diameter of 15 mm and a length of 23.457 mm. With this coding, any combination of cross-sectional shape and size, material specification, and length of the cantilever beam can be represented. This flexibility in the representation of a design solution is not possible with traditional optimization methods, and it makes an EO effective in many engineering design problems [11, 16], including CFD problems involving shapes.
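A hypothetical decoder for the mixed string ((1) 15 23.457 (1011)) illustrates how the four variable types coexist; the variable names and field meanings below are our illustrative assumptions, not the paper's:

```python
SECTIONS = {1: "circular", 0: "square"}   # zero-one variable

def decode_design(shape_bit, size, length, material_bits):
    """Decode the mixed cantilever-beam representation (hypothetical)."""
    material_index = int(material_bits, 2) + 1   # '1011' -> the 12th material
    return {
        "section": SECTIONS[shape_bit],   # one bit
        "size_mm": size,                  # discrete, coded directly
        "length_mm": length,              # continuous, coded directly
        "material": material_index,       # 4-bit substring: 16 choices
    }

print(decode_design(1, 15, 23.457, "1011"))
```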



7 CONSTRAINED OPTIMIZATION

Constraints are inevitable in real-world (practical) problems. The classical penalty function approach of penalizing an infeasible solution is sensitive to a penalty parameter associated with the technique and often requires a trial-and-error method. In the penalty function method for handling inequality constraints in minimization problems, the fitness function F(x) is defined as the sum of the objective function f(x) and a penalty term which depends on the constraint violation ⟨g_j(x)⟩:

    F(x) = f(x) + Σ_{j=1}^{J} R_j ⟨g_j(x)⟩²,        (6)

where ⟨·⟩ returns the absolute value of the operand if the operand is negative, and zero otherwise. The parameter R_j is the penalty parameter of the j-th inequality constraint. Its purpose is to make the constraint violation g_j(x) of the same order of magnitude as the objective function value f(x). Equality constraints are usually handled by converting them into inequality constraints as follows:

    g_{k+J}(x) ≡ δ − |h_k(x)| ≥ 0,

where δ is a small positive value.

In order to investigate the effect of the penalty parameter R_j (or R) on the performance of GAs, we consider a well-studied welded beam design problem [38]. The resulting optimization problem has four design variables x = (h, ℓ, t, b) and five inequality constraints:

    Min.  f_w(x) = 1.10471 h²ℓ + 0.04811 t b (14.0 + ℓ),
    s.t.  g1(x) ≡ 13,600 − τ(x) ≥ 0,
          g2(x) ≡ 30,000 − σ(x) ≥ 0,
          g3(x) ≡ b − h ≥ 0,
          g4(x) ≡ Pc(x) − 6,000 ≥ 0,
          g5(x) ≡ 0.25 − δ(x) ≥ 0,
          0.125 ≤ h ≤ 10,   0.1 ≤ ℓ, t, b ≤ 10.        (7)

The terms τ(x), σ(x), Pc(x), and δ(x) are given below:

    τ(x) = √[ (τ′(x))² + (τ″(x))² + ℓ τ′(x) τ″(x) / √(0.25(ℓ² + (h + t)²)) ],

    σ(x) = 504,000 / (t² b),

    Pc(x) = 64,746.022 (1 − 0.0282346 t) t b³,

    δ(x) = 2.1952 / (t³ b),

where

    τ′(x) = 6,000 / (√2 h ℓ),

    τ″(x) = [ 6,000 (14 + 0.5ℓ) √(0.25(ℓ² + (h + t)²)) ] / { 2 [0.707 h ℓ (ℓ²/12 + 0.25(h + t)²)] }.

The optimized solution reported in the literature [38] is h* = 0.2444, ℓ* = 6.2187, t* = 8.2915, and b* = 0.2444, with a function value of f* = 2.38116. Binary GAs were applied to this problem in an earlier study [4], and the solution x = (0.2489, 6.1730, 8.1789, 0.2533) with f = 2.43 (within 2% of the above best solution) was obtained with a population size of 100. However, it was observed that the performance of GAs is largely dependent on the chosen penalty parameter values, as shown in Table 1. With R = 1 (a small value of R), although 12 out of 50 runs found a solution within 150% of the best-known solution, 13 EO runs were unable to find a single feasible solution in 40,080 function evaluations. This happens because with small R there is not much pressure for the solutions to become feasible. With larger penalty parameters, the pressure on solutions to become feasible is greater and all 50 runs found feasible solutions. However, because of this larger emphasis on feasibility, when a particular solution becomes feasible it has a large selective advantage over the other (infeasible) solutions in the population, and the EO is unable to reach near the true optimal solution.
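The penalized fitness of Eq. (6) is easy to sketch; on a toy problem it also shows the sensitivity discussed above: too small an R under-penalizes infeasible solutions. The function names are ours:

```python
def penalized_fitness(f, constraints, R):
    """Return F(x) = f(x) + sum_j R_j <g_j(x)>^2 (minimization),
    where <g> = |g| if g < 0 (a violation) and 0 otherwise,
    for inequality constraints g_j(x) >= 0."""
    def bracket(g):
        return -g if g < 0.0 else 0.0

    def F(x):
        return f(x) + sum(Rj * bracket(gj(x)) ** 2
                          for Rj, gj in zip(R, constraints))
    return F

# Toy problem: minimize x^2 subject to x >= 1.
F = penalized_fitness(lambda x: x[0] ** 2, [lambda x: x[0] - 1.0], R=[100.0])
print(F([2.0]))   # feasible: 4.0, no penalty
print(F([0.0]))   # infeasible: 0.0 + 100 * 1^2 = 100.0
```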

In the recent past, a parameter-less penalty approach was suggested by the author [7], which uses the following fitness function derived from the objective



Table 1: Number of runs (out of 50) converged within ε% of the best-known solution using an EO with different penalty parameter values, and using the proposed approach, on the welded beam design problem.

                   ε                        Optimized f_w(x)
    R         ≤150%   >150%   Infes.   Best     Median    Worst
    10^0      12      25      13       2.413    7.625     483.502
    10^1      12      38      0        3.142    4.335     7.455
    10^3      1       49      0        3.382    5.971     10.659
    10^6      0       50      0        3.729    5.877     9.424
    Prop.     50      0       0        2.381    2.383     2.384

and constraint values:

    F(x) = { f(x),                                                  if x is feasible;
           { f_max + Σ_{j=1}^{J} ⟨g_j(x)⟩ + Σ_{k=1}^{K} |h_k(x)|,   otherwise.        (8)

In a tournament between two feasible solutions, the first clause ensures that the one with the better function value wins. The quantity f_max is the objective function value of the worst feasible solution in the population. Adding this quantity to the constraint violation ensures that a feasible solution is always better than any infeasible solution. Moreover, since this quantity is constant within any generation, between two infeasible solutions the one with the smaller constraint violation is judged better. Since an objective function value is never compared with a constraint violation amount, no penalty parameter is needed with this approach.
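A minimal sketch of the parameter-less fitness of Eq. (8) (function names are ours; f_max of an all-infeasible population is taken as zero here, an assumption the paper does not spell out):

```python
def parameterless_fitness(f, ineqs, eqs, population):
    """Build F(x) per Eq. (8): feasible solutions keep f(x);
    infeasible ones get f_max (worst feasible objective in the
    population) plus their total constraint violation, so any
    feasible solution beats any infeasible one in a tournament."""
    def violation(x):
        return (sum(-g(x) if g(x) < 0.0 else 0.0 for g in ineqs)
                + sum(abs(h(x)) for h in eqs))

    feasible = [x for x in population if violation(x) == 0.0]
    f_max = max(f(x) for x in feasible) if feasible else 0.0

    def F(x):
        v = violation(x)
        return f(x) if v == 0.0 else f_max + v
    return F

# Toy: minimize x subject to x >= 1, over a small population.
F = parameterless_fitness(lambda x: x[0], [lambda x: x[0] - 1.0], [],
                          [[1.0], [2.0], [0.0]])
print(F([1.5]))   # feasible: 1.5
print(F([0.0]))   # infeasible: f_max 2.0 + violation 1.0 = 3.0
```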

When the proposed constraint-handling EO is applied to the welded beam design problem, all 50 simulations (out of 50 runs) are able to find a solution very close to the true optimum, as shown in the last row of Table 1. This means that with the proposed GA, one run is enough to find a satisfactory solution close to the true optimal solution.

8 ROBUST AND RELIABILITY-BASED OPTIMIZATION

For practical optimization studies, robust and reliability-based techniques are commonplace and are becoming increasingly popular. Often in practice the mathematical optimum is not desired, owing to its sensitivity to parameter fluctuations and to inaccuracy in the formulation of the problem. Consider Figure 7: although the global minimum is at B, this solution is very sensitive to parameter fluctuations. A small error in implementing solution B will result in a large deterioration in the function value. On the other hand, solution A is less sensitive and more suitable as the desired solution to the problem. The dashed line is an average of the function around

Figure 7: Although solution B is the global minimum, it is not a robust solution. Solution A is the robust solution.

a small region near a solution. If the function shown by the dashed line is optimized, the robust solution can be obtained [2, 26, 12].

For a canonical deterministic optimization task, the optimum solution usually lies on a constraint surface or at the intersection of more than one constraint surface. However, if the design variables or some system parameters cannot be achieved exactly and



are uncertain with a known probability distribution of variation, the so-called deterministic optimum (lying on one or more constraint surfaces) will fail to remain feasible on many occasions (Figure 8). In such

Figure 8: The constrained minimum is not reliable. With higher desired reliability, the corresponding solutions move inside the feasible region.

scenarios, a stochastic optimization problem is usually formulated and solved, in which the constraints are converted into probabilistic constraints, meaning that the probability of failure (of not remaining a feasible solution) is limited to a pre-specified value (say ε) [36, 17]. The advantage of using an EO is that a globally robust solution can be obtained, and the method can be extended easily for finding multi-objective reliable solutions [28].

9 OPTIMIZATION WITH DISTRIBUTED COMPUTING

One way to beat the difficulties of problems having computationally expensive evaluation procedures is a distributed computing environment. This way, the evaluation procedure can be parallelized and the overall computational time reduced. There is an additional advantage in using an EO. Since an EO deals with a population of solutions in every generation, and their evaluations are independent of each other, each population member can be evaluated on a different processor independently and in parallel. Homogeneous or heterogeneous processors in a master-slave configuration can be used for this purpose as well as any other arrangement. Since the computational time for solution evaluation is typically much larger than the time taken by the genetic operators, fast input-output processing is also not a requirement. Thus, a hand-made Beowulf cluster formed from a number of fast processors can be used efficiently to run an EO. This feature makes an EO more suitable for parallel processing than its classical counterparts.

Ideally, the evaluation of each solution may also be parallelized, if deemed necessary. For this purpose, several clusters can be combined, and a GA population member can be sent to each cluster for its efficient evaluation, as shown in Figure 9.

10 COMPUTATIONALLY DIFFICULT OPTIMIZATION PROBLEMS

Many engineering optimization problems involve objectives and constraints which require enormous computational time. For example, in CFD control problems involving complicated shapes and geometries, every solution requires mesh generation, automated node-numbering, and solution of the finite difference equations. Evaluating a single solution in such problems can easily consume a few minutes to several days, even on a parallel computer, depending on the rigor of the problem. Clearly, optimization of such problems is still out of scope in the traditional sense, and innovations in applying the optimization methodology must often be devised. Here, we suggest two approaches:

1. Approximate models for evaluation instead of



Figure 9: A parallel computing environment for an ideal EO.

exact objective and constraint functions can be used.

2. For time-varying control problems, a fast yet approximate optimization principle can be used.

10.1 Approximate Models

In this approach, an approximate model of the objective and constraint functions is first developed using a handful of exactly evaluated solutions. For this, a response surface methodology (RSM), a kriging methodology, or an ANN approach can be used. Since the model is at best approximate, it must be used only for a while and discarded when solutions begin to crowd around the best region of the approximate model. At this stage, another approximate model can be developed in the reduced search region dictated by the current population. One such coarse-to-fine-grained approximate modeling technique (with an ANN approach) has recently been developed for multi-objective optimization [31], and a number of test problems and two practical problems have been solved with it. A saving of 30 to 80% in computational time has been recorded using the proposed approach.

10.2 Approximate Optimization Principle

In practical optimal control problems involving time-varying decision variables, the objective and constraint functions need to be computed by simulating the whole process over a time period. In CFD, such control problems often arise in which control parameters such as velocities and shapes need to be changed over a time period in order to obtain minimum drag, maximum lift, or a maximally stable system. The computation of such objectives is a non-linear process of solving a system of governing partial differential equations sequentially from an initial time to a final time. When such a series of finite difference equations has been solved for a particular solution describing the system, one evaluation is over. Often, such an exact evaluation may take a few hours to a few days. Even if only a few hundred solutions need to be evaluated to come anywhere close to the optimum, this may take a few months on even a moderately fast parallel computer. For such problems, an approximate optimization procedure can be used, as follows [43].

The optimal control strategy for the entire time span [ti, tf] is divided into K time intervals [tk, tk + Δt], such that t0 = ti and tK = tK−1 + Δt = tf. At the k-th time span, the GA is initialized by mutating the best solution found in the (k−1)-th time span. Thereafter, the GA is run for τ generations and the best solution is recorded. The control strategy corresponding to this best solution is assigned as the overall optimized control strategy for that time span. In this way, the optimized control strategy is built up as the time span increases from t0 to tf. The proposed genetic search procedure is fast (O(K) compared to O(K²)) and provides an approximate way to handle computationally expensive time-varying optimization problems, such as the CFD problem solved elsewhere [43]. Although the procedure is fast, on the flip side it constitutes an approximate search, and the resulting optimized



solution need not be the true optimum of the optimal control problem. But with the computing resources available today, demanding CFD simulations prohibit the use of a standard optimization algorithm in practice. The above approximate optimization procedure provides a viable way to apply optimization algorithms to computationally demanding CFD problems.
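The time-window procedure described above can be outlined as a simple loop. The GA run inside each window is abstracted as a callback, and all names here are ours, not the paper's:

```python
def windowed_control(optimize_window, mutate, K, tau, init):
    """Sketch of the window-by-window procedure of [43]: each of the
    K windows seeds a GA with a mutated copy of the previous window's
    best and runs it for tau generations (via optimize_window)."""
    best = init
    strategy = []
    for k in range(K):
        seed = mutate(best)                    # initialize from previous best
        best = optimize_window(k, seed, tau)   # tau generations of GA search
        strategy.append(best)                  # control for window k
    return strategy
```

With a real GA inside `optimize_window`, the loop performs K fixed-size window optimizations instead of re-optimizing an ever-growing horizon, which is the source of the O(K) versus O(K²) remark above.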

11 MULTI-MODAL OPTIMIZATION

Many practical optimization problems possess a multitude of optima, some global and some local. In such problems, it may be desirable to find as many optima as possible to gain a better understanding of the problem. Due to its population approach, a GA can find multiple optimal solutions in a single simulation run. In one implementation [10], the fitness of a solution is degraded by its niche count, an estimate of the number of neighboring solutions. It has been shown that if the reproduction operator works with the degraded fitness values, stable subpopulations can be maintained at the various optima of the objective function. Figure 10 shows that a niched GA can find all five optima (a mix of local and global optima) in one single simulation run.

12 MULTI-OBJECTIVE EVOLUTIONARY OPTIMIZATION

Most real-world search and optimization problems involve multiple conflicting objectives, among which the user is unable to establish a relative preference. Such considerations give rise to a set of multiple optimal solutions, commonly known as Pareto-optimal solutions [8]. Classical approaches to solving these problems concentrate mainly on developing a single composite objective function from the multiple objectives and on using a single-objective optimizer to find a particular optimum solution [30]. Such procedures are subjective to the user, as the optimized solution

[Figure: plot of f(x) versus x on [0, 1], with subpopulations settled on all five peaks.]

Figure 10: A niched-GA finds multiple optimal solutions.

depends on the chosen scalarization scheme. Once again, due to the population approach of a GA, multiple Pareto-optimal solutions can be found simultaneously in a single simulation run, making it a unique way to handle multi-objective optimization problems.
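The notion of Pareto-optimality rests on a simple dominance test; a minimal sketch for minimization problems:

```python
def dominates(a, b):
    """True if objective vector `a` Pareto-dominates `b` (minimization):
    a is no worse in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

# Toy two-objective vectors: the first two are mutually non-dominated, so
# both belong in the trade-off set; the third is dominated by the first
sols = [(1.0, 5.0), (2.0, 3.0), (1.5, 6.0)]
front = [s for s in sols
         if not any(dominates(t, s) for t in sols if t is not s)]
```

No scalarization is involved: the front retains every solution for which no other solution is better in all objectives at once, which is exactly the set a population-based method can carry along in one run.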

To convert a single-objective GA into a multi-objective optimizer, three aspects are kept in mind: (i) non-dominated solutions are emphasized for progressing towards the optimal front, (ii) less-crowded solutions are emphasized to maintain a diverse set of solutions, and (iii) elite or best solutions in a population are emphasized for a quicker convergence near the true Pareto-optimal front. In one such implementation – the elitist non-dominated sorting GA, or NSGA-II (which received the ESI Web of Science's recognition as the Fast Breaking Paper in Engineering in February 2004) – parent and offspring populations are combined together and non-dominated solutions are hierarchically selected starting from the best solutions. A crowding principle is used to select the remaining solutions from the last front, which cannot be accommodated in full. These operations follow the above three aspects, and the algorithm has been successful in solving a wide variety of problems. A sketch


of the algorithm is shown in Figure 11.

[Figure: parent population Pt and offspring population Qt are combined into Rt; non-dominated sorting partitions Rt into fronts F1, F2, F3, ...; crowding-distance sorting resolves the last front; rejected solutions are discarded to form Pt+1.]

Figure 11: An iteration of the NSGA-II procedure.
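The combine-sort-crowd step of an NSGA-II iteration can be sketched as follows. This is a simplified illustration: the naive repeated front extraction here is less efficient than the bookkeeping used in the actual NSGA-II implementation, but the selection logic is the same:

```python
def dominates(a, b):
    """Minimization: a is no worse everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def nondominated_sort(objs):
    """Partition objective vectors into fronts F1, F2, ... (naive version)."""
    remaining = list(range(len(objs)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(objs[j], objs[i])
                            for j in remaining if j != i)]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts

def crowding_distance(objs, front):
    """Crowding distance of each front member (larger = less crowded)."""
    dist = {i: 0.0 for i in front}
    for k in range(len(objs[front[0]])):
        order = sorted(front, key=lambda i: objs[i][k])
        dist[order[0]] = dist[order[-1]] = float('inf')  # keep boundaries
        span = objs[order[-1]][k] - objs[order[0]][k] or 1.0
        for a, b, c in zip(order, order[1:], order[2:]):
            dist[b] += (objs[c][k] - objs[a][k]) / span
    return dist

def select(objs, n):
    """Environmental selection: fill front by front; the last, partially
    fitting front is resolved by crowding distance."""
    chosen = []
    for front in nondominated_sort(objs):
        if len(chosen) + len(front) <= n:
            chosen += front
        else:
            d = crowding_distance(objs, front)
            chosen += sorted(front, key=lambda i: -d[i])[:n - len(chosen)]
            break
    return chosen

# Example: six objective vectors standing in for a combined parent+offspring set
objs = [(1, 5), (2, 3), (3, 1), (2.5, 2.5), (4, 4), (5, 5)]
survivors = select(objs, 3)   # indices of the three retained solutions
```

Emphasizing earlier fronts implements aspects (i) and (iii) above, while breaking ties on the last front by crowding distance implements aspect (ii).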

On an 18-speed gearbox design problem having 28 mixed decision variables, three objectives, and 101 nonlinear constraints, the NSGA-II is able to find as many as 300 different trade-off solutions, including the individual minimum solutions (Figure 12). It is clear from the figure that if a larger error (ε) in the output shaft speeds from the ideal is permitted, better non-dominated solutions are expected. Because of the mixed nature of the variables and the multiple objectives, a previous study [25] based on classical methods had to divide the problem into two subproblems, and even then it failed to find as wide a spectrum of solutions as that found by NSGA-II. Since the numbers of teeth in the gears are also kept as variables here, a more flexible search is permitted, thereby obtaining a better set of non-dominated solutions.

12.1 Innovization: Understanding the Optimization Problem Better

Besides finding the trade-off optimal solutions, evolutionary multi-objective optimization (EMO) studies are proving useful for another reason. When a set of trade-off solutions is found, it can be analyzed to discover any useful similarities and dissimilarities among the solutions. Such information, particularly any common principles, should provide useful insight to a designer, user, or decision-maker.

[Figure: Volume (cc) versus Power (kW) trade-off fronts for JA (1990), NSGA-II with 10% error, and NSGA-II with 5% error.]

Figure 12: NSGA-II solutions are shown to outperform a classical approach with two limiting errors in output speeds.

For example, for the gearbox design problem described above, when we analyze all 300 solutions, we find that the only way these optimal solutions vary from each other is by having a drastically different value in only one of the variables (the module of the gears), while all other variables (thickness and number of teeth of each gear) remain more or less the same for all solutions. Figure 13 shows how the module (m) changes with one of the objectives (the power delivered (p) by the gearbox). When a curve is fitted through these points, the following mathematical relationship emerges: m = √p. This is useful information for a designer to have. What it means is that if a gearbox is designed today for a p = 1 kW power requirement, and tomorrow a gearbox is needed for another application requiring p = 4 kW, the only change needed in the gearbox is an increase of the module by a factor of two. A two-fold increase in module will increase the diameter of the gears and hence the size of the gearbox. Although a complete re-optimization can be performed from scratch, and there may be other ways the original gearbox can be modified for the new requirement, such a


[Figure: Module (cm) versus Power (kW); the fitted curve lies between the variable's lower and upper limits.]

Figure 13: An optimal relationship between module and power is discovered by the NSGA-II.

simple change in the module alone amounts to an optimal redesign of the gearbox for the new requirement, importantly without requiring any further computation [14]. Such a task brings out a 'recipe' or 'blueprint' for solving the problem optimally. It is not clear how such useful information could be achieved by any other means.
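Taking the fitted relationship m = √p at face value (with p in kW), the redesign recipe reduces to a one-line scaling computation:

```python
import math

# Fitted relationship from Figure 13: module grows as the square root of
# the delivered power, so quadrupling p from 1 kW to 4 kW doubles m
module = lambda p: math.sqrt(p)
scale = module(4.0) / module(1.0)
```

Any pair of power requirements can be compared the same way: the module ratio is √(p2/p1), with the remaining design variables left unchanged.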

13 CONCLUSIONS

In this paper, we have presented a number of commonly-used practical problems which are, in essence, optimization problems. A closer look at these problems has revealed that such optimization problems are complex, non-linear, and multi-dimensional, thereby making the classical optimization procedures inadequate for use with any reliability and confidence. We have also presented the evolutionary optimization procedure – a population-based iterative procedure mimicking natural evolution and genetics – which has demonstrated its suitability and immense applicability in solving various types of optimization problems. A particular implementation – the genetic algorithm (GA) – is different

from classical optimization methods in a number of ways: (i) it does not use gradient information, (ii) it works with a set of solutions instead of one solution in each iteration, (iii) it is a stochastic search and optimization procedure, and (iv) it is highly parallelizable.

Besides GAs, there exist a number of other implementations of the evolutionary idea, such as evolution strategies [42, 1], evolutionary programming [19], genetic programming [29], differential evolution [44], particle swarm optimization [27], and others. Due to the flexibility of their search, these evolutionary algorithms are found to be quite useful as search tools in studies on artificial neural networks, fuzzy logic computing, data mining, and other machine learning activities.

Acknowledgment

The author acknowledges the efforts of all his collaborators during the past 13 years of research and application in developing efficient evolutionary optimization methods for practical problem solving.

References

[1] H.-G. Beyer. The Theory of Evolution Strategies. Berlin, Germany: Springer, 2001.

[2] J. Branke. Creating robust solutions by means of an evolutionary algorithm. In Proceedings of Parallel Problem Solving from Nature (PPSN-V), pages 119–128, 1998.

[3] R. Dawkins. The Selfish Gene. New York: Oxford University Press, 1976.

[4] K. Deb. Optimal design of a welded beam structure via genetic algorithms. AIAA Journal, 29(11):2013–2015, 1991.

[5] K. Deb. Genetic algorithms in optimal optical filter design. In Proceedings of the International Conference on Computing Congress, pages 29–36, 1993.

[6] K. Deb. Optimization for Engineering Design: Algorithms and Examples. New Delhi: Prentice-Hall, 1995.

[7] K. Deb. An efficient constraint handling method for genetic algorithms. Computer Methods in Applied Mechanics and Engineering, 186(2–4):311–338, 2000.

[8] K. Deb. Multi-Objective Optimization Using Evolutionary Algorithms. Chichester, UK: Wiley, 2001.

[9] K. Deb, A. Anand, and D. Joshi. A computationally efficient evolutionary algorithm for real-parameter optimization. Evolutionary Computation Journal, 10(4):371–395, 2002.

[10] K. Deb and D. E. Goldberg. An investigation of niche and species formation in genetic function optimization. In Proceedings of the Third International Conference on Genetic Algorithms, pages 42–50, 1989.

[11] K. Deb and M. Goyal. A robust optimization procedure for mechanical component design based on genetic adaptive search. Transactions of the ASME: Journal of Mechanical Design, 120(2):162–164, 1998.

[12] K. Deb and H. Gupta. Searching for robust Pareto-optimal solutions in multi-objective optimization. In Proceedings of the Third Evolutionary Multi-Criteria Optimization (EMO-05) Conference (also Lecture Notes in Computer Science 3410), pages 150–164, 2005.

[13] K. Deb, P. Jain, N. Gupta, and H. Maji. Multi-objective placement of VLSI components using evolutionary algorithms. To appear in IEEE Transactions on Components and Packaging Technologies; also KanGAL Report No. 2002006, 2002.

[14] K. Deb and S. Jain. Multi-speed gearbox design using multi-objective evolutionary algorithms. ASME Transactions on Mechanical Design, 125(3):609–619, 2003.

[15] K. Deb and A. R. Reddy. Classification of two-class cancer data reliably using evolutionary algorithms. BioSystems, 72(1–2):111–129, 2003.

[16] K. Deb and A. Srinivasan. Innovization: Innovation through optimization. Technical Report KanGAL Report Number 2005007, Kanpur Genetic Algorithms Laboratory, IIT Kanpur, PIN 208016, 2005.

[17] O. Ditlevsen and H. O. Madsen. Structural Reliability Methods. New York: Wiley, 1996.

[18] N. Eldredge. Macro-Evolutionary Dynamics: Species, Niches, and Adaptive Peaks. New York: McGraw-Hill, 1989.

[19] D. B. Fogel. Evolutionary Computation. Piscataway, NJ: IEEE Press, 1995.

[20] L. J. Fogel. Autonomous automata. Industrial Research, 4(1):14–19, 1962.

[21] D. E. Goldberg. Genetic Algorithms in Search, Optimization, and Machine Learning. Reading, MA: Addison-Wesley, 1989.

[22] D. E. Goldberg. The Design of Innovation: Lessons from and for Competent Genetic Algorithms. Kluwer Academic Publishers, 2002.

[23] N. Hansen and A. Ostermeier. Adapting arbitrary normal mutation distributions in evolution strategies: The covariance matrix adaptation. In Proceedings of the IEEE International Conference on Evolutionary Computation, pages 312–317, 1996.

[24] J. H. Holland. Adaptation in Natural and Artificial Systems. Ann Arbor, MI: University of Michigan Press, 1975.

[25] P. Jain and A. M. Agogino. Theory of design: An optimization perspective. Mechanism and Machine Theory, 25(3):287–303, 1990.

[26] Y. Jin and B. Sendhoff. Trade-off between performance and robustness: An evolutionary multiobjective approach. In Proceedings of Evolutionary Multi-Criterion Optimization (EMO-2003), pages 237–251, 2003.

[27] J. Kennedy and R. C. Eberhart. Swarm Intelligence. Morgan Kaufmann, 2001.

[28] R. T. F. King, H. C. S. Rughooputh, and K. Deb. Evolutionary multi-objective environmental/economic dispatch: Stochastic versus deterministic approaches. In Proceedings of the Third International Conference on Evolutionary Multi-Criterion Optimization (EMO-2005), pages 677–691. Lecture Notes in Computer Science 3410, 2005.

[29] J. R. Koza. Genetic Programming: On the Programming of Computers by Means of Natural Selection. Cambridge, MA: MIT Press, 1992.

[30] K. Miettinen. Nonlinear Multiobjective Optimization. Boston: Kluwer, 1999.

[31] P. K. S. Nain and K. Deb. A computationally effective search and optimization procedure using coarse-to-fine approximations. In Proceedings of the Congress on Evolutionary Computation (CEC-2003), pages 2081–2088, 2003.

[32] D. Pratihar, K. Deb, and A. Ghosh. Fuzzy-genetic algorithms and time-optimal obstacle-free path generation for mobile robots. Engineering Optimization, 32:117–142, 1999.

[33] D. K. Pratihar, K. Deb, and A. Ghosh. Optimal path and gait generations simultaneously of a six-legged robot using a GA-fuzzy approach. Robotics and Autonomous Systems, 41:1–21, 2002.

[34] A. Prugel-Bennett and J. L. Shapiro. An analysis of genetic algorithms using statistical mechanics. Physical Review Letters, 72(9):1305–1309, 1994.

[35] S. S. Rao. Optimization: Theory and Applications. New York: Wiley, 1984.

[36] S. S. Rao. Genetic algorithmic approach for multiobjective optimization of structures. In Proceedings of the ASME Annual Winter Meeting on Structures and Controls Optimization, volume 38, pages 29–38, 1993.

[37] I. Rechenberg. Cybernetic solution path of an experimental problem, 1965. Royal Aircraft Establishment, Library Translation Number 1122, Farnborough, UK.

[38] G. V. Reklaitis, A. Ravindran, and K. M. Ragsdell. Engineering Optimization Methods and Applications. New York: Wiley, 1983.

[39] A. Rogers and A. Prugel-Bennett. Modelling the dynamics of steady-state genetic algorithms. In Foundations of Genetic Algorithms 5 (FOGA-5), pages 57–68, 1998.

[40] G. Rudolph. Convergence analysis of canonical genetic algorithms. IEEE Transactions on Neural Networks, 5(1):96–101, 1994.

[41] H.-P. Schwefel. Projekt MHD-Staustrahlrohr: Experimentelle Optimierung einer Zweiphasendüse, Teil I. Technical Report 11.034/68, 35, AEG Forschungsinstitut, Berlin, 1968.

[42] H.-P. Schwefel. Numerical Optimization of Computer Models. Chichester, UK: Wiley, 1981.

[43] T. K. Sengupta, K. Deb, and S. B. Talla. Drag optimization for a circular cylinder at high Reynolds number by rotary oscillation using genetic algorithms. Technical Report KanGAL Report No. 2004018, Mechanical Engineering Department, IIT Kanpur, India, 2004.

[44] R. Storn and K. Price. Differential evolution – a fast and efficient heuristic for global optimization over continuous spaces. Journal of Global Optimization, 11:341–359, 1997.

[45] M. D. Vose. The Simple Genetic Algorithm: Foundations and Theory. Cambridge, MA: MIT Press, 1999.

[46] M. D. Vose and J. E. Rowe. Random heuristic search: Applications to GAs and functions of unitation. Computer Methods in Applied Mechanics and Engineering, 186(2–4):195–220, 2000.