Multimodal Optimization Using a Biobjective Differential Evolution Algorithm Enhanced With Mean...


666 IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, VOL. 17, NO. 5, OCTOBER 2013

Multimodal Optimization Using a Biobjective Differential Evolution Algorithm Enhanced With Mean Distance-Based Selection

Aniruddha Basak, Swagatam Das, Senior Member, IEEE, and Kay Chen Tan, Senior Member, IEEE

Abstract—In contrast to the numerous research works that integrate a niching scheme with an existing single-objective evolutionary algorithm to perform multimodal optimization, a few approaches have recently been taken to recast multimodal optimization as a multiobjective optimization problem to be solved by modified multiobjective evolutionary algorithms. Following this promising avenue of research, we propose a novel biobjective formulation of the multimodal optimization problem and use differential evolution (DE) with nondominated sorting followed by hypervolume measure-based sorting to finally detect a set of solutions corresponding to multiple global and local optima of the function under test. Unlike the two earlier multiobjective approaches (biobjective multipopulation genetic algorithm and niching-based nondominated sorting genetic algorithm II), the proposed multimodal optimization with biobjective DE (MOBiDE) algorithm does not require the actual or estimated gradient of the multimodal function to form its second objective. Performance of MOBiDE is compared with eight state-of-the-art single-objective niching algorithms and two recently developed biobjective niching algorithms using a test suite of 14 basic and 15 composite multimodal problems. Experimental results supported by nonparametric statistical tests suggest that MOBiDE is able to provide better and more consistent performance than the existing well-known multimodal algorithms for the majority of the test problems without incurring any serious computational burden.

Index Terms—Crowding, differential evolution (DE), multimodal optimization, multiobjective optimization, niching, nondominated sorting.

I. Introduction

In practical optimization problems, it is often desirable to simultaneously locate multiple global and local optima of a given objective function. For real-world problems, due to physical (and/or cost) constraints, the best results cannot always be realized. In such a scenario, if multiple solutions (local and global) are known, the implementation can be quickly switched to an alternative solution while still maintaining optimal system performance. Multiple solutions could also be analyzed to discover hidden properties (or relationships) of the functional landscape concerned. Thus, as the name suggests, a multimodal optimization task amounts to finding multiple optimal solutions (both local and global) and not just one single optimum. If a point-by-point classical optimization approach is used for this task, it has to be applied several times, each time hoping to find a different optimal solution.

Manuscript received February 3, 2012; revised June 14, 2012 and September 20, 2012; accepted November 17, 2012. Date of publication December 4, 2012; date of current version September 27, 2013.

A. Basak is with the Electrical and Computer Engineering Department, Carnegie Mellon University, Silicon Valley Campus, Moffett Field, CA 94035 USA (e-mail: [email protected]).

S. Das is with the Electronics and Communication Sciences Unit, Indian Statistical Institute, Kolkata 700108, India (e-mail: [email protected]).

K. C. Tan is with the Department of Electrical and Computer Engineering, National University of Singapore, 117576, Singapore (e-mail: [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TEVC.2012.2231685

Evolutionary algorithms (EAs), due to their population-based approach, provide a natural advantage over classical optimization techniques. They maintain a population of candidate solutions, which are processed every generation, and if several distinct solutions can be preserved over all these generations, then at the termination of the algorithm we will have multiple good solutions, rather than only the best solution. Note that this runs against the natural tendency of EAs, which typically converge to a single solution, whether the best one or a suboptimal one. The challenge of using EAs for multimodal optimization lies in detecting and maintaining multiple solutions. Niching [1]–[5] is a generic term for techniques that find and preserve multiple stable niches, or favorable parts of the solution space possibly around multiple solutions, so as to prevent convergence to a single solution. Research on solving multimodal problems with EAs dates back to the landmark work of Goldberg and Richardson [6], who showed how a niche-preserving technique can be introduced in the framework of a standard genetic algorithm (GA) to obtain multiple optimal solutions. Since that study, many researchers have suggested ways of introducing niche-preserving techniques so that, for each optimum solution, a niche is formed in the population of an EA. Currently, the most popular niching techniques used in conjunction with various EAs include crowding [7], fitness sharing [6], restricted tournament selection [8], and speciation [9]. Besides multimodal problems, niching techniques are also frequently employed for solving multiobjective and dynamic optimization problems [10], [11]. Most existing niching methods, however, have difficulties that need to be overcome before they can be applied successfully to real-world multimodal problems. Identified issues include difficulty in prespecifying some niching parameters, difficulty in maintaining discovered solutions in a run, extra computational overhead, and poor scalability when the dimensionality is high. Current research on evolutionary multimodal optimization attempts to circumvent these problems by devising more efficient algorithms.

1089-778X © 2012 IEEE

Inspired by the two multiobjective approaches for solving multimodal optimization problems, namely the biobjective multipopulation genetic algorithm (BMPGA) of Yao et al. [12], [13] and the niching-based nondominated sorting genetic algorithm (NSGA)-II of Deb and Saha [14], [15], we propose a simple but effective biobjective formulation of the multimodal function optimization problem and solve it using the differential evolution (DE) algorithm [16]–[18] with nondominated sorting and hypervolume measure-based sorting [19]. DE has emerged as a very competitive optimizer for continuous search spaces. In our formulation, the first objective remains the multimodal function under test, while the second objective is the mean Euclidean distance of a solution from all other population members, which is maximized to prevent the entire population from converging to a single optimum. A proper compromise between the two objectives for the entire population is expected to lead the individuals to find all the global optima. The most stable state of this scheme is one in which all the global optima are occupied by different population members, and the algorithm is expected to approach this state gradually. An external archive is maintained to keep track of solutions having the current best fitness values. The archive plays an important role here, as it prevents the generation of new solutions near points already stored in it. The archive also helps to reduce the total number of function evaluations (FEs) required to detect all the global peaks successfully. In what follows, we refer to the proposed algorithm as multimodal optimization with biobjective DE (MOBiDE). MOBiDE does not depend on an actual or even approximated expression of the first- or second-order derivative of the objective function under test. This feature can enable the application of MOBiDE to nondifferentiable and badly behaved functions with large discontinuities. Other salient differences between MOBiDE and both BMPGA and niching-based NSGA-II are outlined in Section II.

We compare the performance of MOBiDE with eight state-of-the-art single-objective evolutionary multimodal optimizers as well as BMPGA and niching-based NSGA-II over a test bed of 14 basic multimodal functions [20] and 15 composite multimodal functions [21]. The comparison reflects the consistent and statistically validated superiority of the proposed approach in terms of the average number of optima found, the success rate, and the mean number of FEs taken over the successful runs. A practical application of MOBiDE to the detection of multiple Nash equilibria of multiplayer noncooperative games is also illustrated.

II. Evolutionary Multimodal Optimization: Related Works

When a single-objective optimization problem has more than one optimal solution, it can be considered a multimodal optimization problem. The objective of locating all the optima in a single run makes it more complicated than locating a single global optimum. Niching methods, devised for extending EAs to multimodal optimization, address this issue by maintaining the diversity of certain properties within the population and thus allow parallel convergence to multiple good solutions in multimodal domains. The concept of niching is inspired by the way organisms evolve in nature. The process involves the formation of subpopulations within a population. Each subpopulation aims to locate one optimal solution, and together the whole population is expected to detect multiple peaks in a single run. Several niching methods have been proposed in the literature. In this section, we review some of the prominent niching techniques.

1) Crowding and restricted tournament selection: De Jong [22] introduced the crowding method, which tries to maintain population diversity by allowing competition for limited resources among similar individuals in the population. Hence, the competition effectively takes place within each niche. Generally, similarity is measured using the Euclidean distance between individuals. The algorithm compares an offspring with some randomly sampled individuals from the current population. The most similar individual is replaced if the offspring is a superior solution. A parameter CF, called the crowding factor, controls the size of the sample. CF is generally set to 2 or 3. Because of such low CF values, replacement error is one of the main problems for crowding. Mahfoud tried to improve the original crowding by proposing deterministic crowding [23], which eliminates CF, reduces the replacement errors, and restores selection pressure.

In a very similar spirit to crowding, the restricted tournament selection [8] method selects a random sample of w (window size) individuals from the population and determines which one is nearest to the offspring, by either the Euclidean (for real variables) or Hamming (for binary variables) distance measure. The nearest member among the w individuals competes with the offspring, and the one with higher fitness survives into the next generation.

2) Sharing methods: One of the most well-known methods for creating subpopulations of like individuals is fitness sharing [6]. It is based on the concept that a point in the search space has limited resources that need to be shared by all individuals with similar genetic representations. Sharing in EAs is implemented by scaling the fitness of an individual based on the number of “similar” individuals present in the population. Derating an individual’s fitness is controlled by two operations: a similarity function, which measures the distance between two individuals in either the genotypic or phenotypic space, and a sharing function. The purpose of the sharing function is to take the distance between two individuals and return the degree to which they can be considered to be of the same species.

3) Clearing: Unlike fitness sharing, clearing [24] determines the dominant individuals of the subpopulations and removes the remaining population members from the mating pool. The algorithm first sorts the population members in descending order of their fitness values. It then picks one individual at a time from the top and removes all the individuals with worse fitness than the selected one within the specified clearing radius. This step is repeated until all the individuals in the population are either selected or removed. Clearing eliminates similar individuals and maintains the diversity among the selected individuals.

4) Speciation: The concept of speciation [25] depends on a radius parameter $r_s$, which measures the Euclidean distance from the center of a species to its boundary. The center of a species is called the species seed. Each species is built around its dominating species seed. All individuals falling within the radius $r_s$ of the species seed are identified as the same species. In this way, the whole population is classified into different groups according to their similarity.
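The clearing procedure in item 3 can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the implementation of [24]: the function name `clearing`, the `radius` argument, and the toy 1-D population are hypothetical, fitness is maximized, distance is Euclidean, and each niche keeps a single winner.

```python
import math

def clearing(population, fitness, radius):
    """Return the indices of the niche winners under a basic clearing scheme.

    Assumptions (not taken from the paper): fitness is maximized, distance
    is Euclidean, and each niche keeps exactly one dominant individual.
    """
    # Visit individuals from best to worst fitness.
    order = sorted(range(len(population)), key=lambda i: fitness[i], reverse=True)
    removed = set()
    winners = []
    for pos, i in enumerate(order):
        if i in removed:
            continue
        winners.append(i)  # the best remaining individual wins its niche
        # Clear every later (not better) individual inside the clearing radius.
        for j in order[pos + 1:]:
            if j not in removed and math.dist(population[i], population[j]) <= radius:
                removed.add(j)
    return winners

# Toy 1-D population clustered around x = 0 and x = 5 (made-up values).
pop = [(0.0,), (0.2,), (5.0,), (5.1,), (0.4,)]
fit = [10.0, 8.0, 9.0, 9.5, 7.0]
print(clearing(pop, fit, radius=1.0))  # [0, 3]
```

Each cluster retains only its fittest member, which is the diversity-preserving effect described above; the outcome depends strongly on the chosen radius.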

Apart from the above, several other methods for multimodal optimization have also been developed over the years, including derating [26], parallelization [27], and clustering [28]. Stoean et al. [29] proposed a novel technique that integrates the conservation of the best successive local individuals with a topological method of separating the subpopulations instead of the conventional radius-triggered manner.

The concept of niching was primarily incorporated into GAs for tackling multimodal optimization problems. Some of the very prominent GA variants with niching can be found in works such as [3] and [30]–[32]. Extensions of evolution strategies (ESs) to solve multimodal problems have been reported in works such as [33], [34]. Shir et al. [35], [36] applied a new concept of adaptive individual niche radius in conjunction with the covariance matrix adaptation evolution strategy (CMA-ES). Some interesting niching techniques integrated with particle swarm optimization (PSO) can be found in [37]–[44]. Li [20] used a simple lbest PSO employing the typical ring topology (particles are assumed to be arranged in a ring according to their indices in the population) to ensure stable niching behavior. Li utilized the ring topology to control the speed of convergence of the PSO population. Woldemariam and Yen [45] proposed a vaccine-enhanced artificial immune system for the optimization of multimodal functions.

DE has been used to devise a number of single-objective niching algorithms for solving multimodal problems. Thomsen integrated the fitness sharing concept with DE to form the sharing DE [7]. Thomsen also proposed to extend DE with a crowding scheme (crowding DE) [7] to allow it to tackle multimodal optimization problems. Crowding DE (CDE), with a crowding factor equal to the population size, has outperformed the sharing DE on standard benchmarks. In CDE, when an offspring is generated, its fitness is compared only with that of the most similar (in terms of Euclidean distance) individual in the current population. The offspring replaces this individual if it has a better fitness value. Zaharie [46] proposed a multiresolution multipopulation DE that divides the population into c equally sized subpopulations. The search space is initially divided into c nonoverlapping subdomains, in which the subpopulations are initialized. While the DE algorithm is iterated independently for each subpopulation, the subpopulations are not restricted to the subdomains used in the initialization. Some other prominent approaches involving DE-based niching algorithms were reported by Hendershot [47] and Rönkkönen and Lampinen [48]. Wong et al. [49] incorporated the principles of spatial and temporal locality in the CDE algorithm for performing multimodal optimization. Qu et al. [50] used the concept of Euclidean neighborhoods to modify the mutation schemes of DE to improve the performance in multimodal optimization. The concept of Euclidean neighborhoods was also independently employed by Roy et al. [51] in a two-stage hybrid multimodal optimizer based on invasive weed optimization and DE for locating and preserving multiple optima of a real-parameter functional landscape.
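The CDE replacement rule described above can be illustrated as follows. This is a hedged sketch rather than Thomsen's code: the function name `cde_replace` and the toy data are hypothetical, maximization is assumed, and the offspring is taken as already generated by the usual DE mutation and crossover.

```python
import math

def cde_replace(population, fitness, offspring, offspring_fitness):
    """Apply one crowding-DE replacement step in place (maximization assumed).

    The offspring competes only with its nearest neighbour in the whole
    population, i.e. the crowding factor equals the population size.
    Returns the challenged index and whether it was replaced.
    """
    nearest = min(range(len(population)),
                  key=lambda i: math.dist(population[i], offspring))
    replaced = offspring_fitness > fitness[nearest]
    if replaced:
        population[nearest] = offspring
        fitness[nearest] = offspring_fitness
    return nearest, replaced

# Made-up 2-D population and offspring.
pop = [(0.0, 0.0), (4.0, 4.0), (-3.0, 2.0)]
fit = [1.0, 5.0, 2.0]
print(cde_replace(pop, fit, (3.5, 4.2), 6.0))  # (1, True)
print(pop[1])                                   # (3.5, 4.2)
```

Because the competitor is always the closest individual, a good offspring near one peak cannot displace a member sitting on a different peak, which is what lets several niches coexist.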

Yao et al. proposed BMPGA, a multipopulation-based GA that uses two different but complementary objective functions for simultaneous detection of multiple peaks over a function landscape [13]. While the first objective remains the multimodal function itself, as the second objective the authors chose the gradient of the function for continuous problems and a numerical estimation of the gradient for discrete problems. Based on these two objectives, all the population members are ranked into two ranking lists. Next, a clustering algorithm called recursive middling is employed to form subpopulations of individuals around potential optima without a priori knowledge of the landscape. The subpopulations have a maximum limit on the number of population members belonging to each of them, and they are allowed to evolve independently toward their potential optima. During the generation of new solutions, the second objective plays an important role in the selection and recombination phases.

As will be evident from the discussion in Section III, our proposed biobjective approach differs significantly from BMPGA. The second objective of MOBiDE is not based on the gradient (or its estimation) of the function under test and is designed to enhance the diversity of the population. This enables MOBiDE to handle nondifferentiable and ill-behaved functions for which estimating the gradient becomes nearly impossible. Unlike BMPGA, MOBiDE does not require any separate clustering mechanism that may incur extra computational overhead. In MOBiDE, the second objective and the sorting procedure together effectively perform a selection based on the crowding of the population members. Moreover, the performance of BMPGA depends on parameters such as the maximum number of individuals in each subpopulation, $N_S$, and the preset threshold $\sigma$. If $N_S$ is varied from 5 to 20 in steps of 5, the performance of the algorithm varies significantly for some functions. MOBiDE does not require such heuristic parameters, to which the performance of an algorithm remains sensitive across functions.

Recently, Deb and Saha [15] proposed another biobjective approach, the niching-based NSGA-II, for solving multimodal optimization problems. In this framework, the second objective is inspired by the gradient, though without using the exact or estimated expression of the gradient. For a particular solution $\vec{X}$, it is a count of the neighboring solutions that are better than $\vec{X}$. Based on these two objectives, the nondominated front members are identified by using the modified domination principle and a clearing operation. These members are allowed to pass to the next generation and generate offspring using the SBX operator. Though MOBiDE and niching-based NSGA-II both use nondominated sorting, there are still several underlying differences. The second objectives of the two algorithms are completely different. Moreover, the computation of the second objective in niching-based NSGA-II requires additional FEs: a certain number of solutions are generated in the neighborhood of $\vec{X}$ to count those that are better than $\vec{X}$ and thus to compute its second objective value. In MOBiDE, by contrast, no extra FE is necessary to compute the second objective of the solutions. Another important difference is that the domination principles differ in the two cases. Niching-based NSGA-II introduces a clearing technique based on the second objective before determining the nondominated front. In MOBiDE, however, clearing of solutions in the archive is done during archive update, on the basis of the objective function values.

III. MOBiDE Algorithm

In this section, we systematically introduce the algorithmic components of MOBiDE and then present the complete algorithm in detail.

A. Two Objective Functions

The aim of using two objectives is to select fitter individuals in the population while maintaining sufficient diversity. Selecting the individuals with better fitness is the basic goal of optimization, but while performing the selection process we need to ensure that the entire population does not converge toward a single global optimum. Diversity in the population requires that the distances among the individuals do not become very small at any point in time.

Let us consider the situation of a 2-D objective space as shown in Fig. 1 and suppose we need to select two individuals from the entire population. Here, the numbers indicate the fitness values of the nearest individuals, and the global peaks have heights of 100. An appropriate selection would let {b, d} survive, as they are closest to the global peaks and also have good fitness. Selection on the basis of fitness values alone, however, would select {a, b}. It seems that {a, d} would be selected if we performed the selection by considering another objective, namely maximizing the population variance. What we need is a proper tradeoff between the fitness value and the diversity of the population.

Fig. 1. Positions and fitness values of four individuals at a particular instant on a sample 2-D landscape. Red triangles denote the global peaks.

MOBiDE includes a novel approach to find the most appropriate tradeoff by using the nondominated sorting technique followed by a dominated hypervolume-based sorting [52]. Before discussing the sorting techniques used in our algorithm, we need one more objective (other than the fitness value) that has a direct relation with the population diversity for each individual. Hence, we define a Euclidean distance-based metric for each individual as follows:

$$\Theta_i = \sum_{j=1}^{N_p} \left\| \vec{X}_i - \vec{X}_j \right\| \tag{1}$$

where $i \in [1, N_p]$, $N_p$ is the population size, $\left\| \vec{X}_i - \vec{X}_j \right\| = \sqrt{\sum_{m=1}^{D} \left( x_i^m - x_j^m \right)^2}$, and $D$ denotes the dimensionality of the search space. $\Theta_i$ equals the sum of the distances of individual $i$ from all other members of the population. A member with a higher $\Theta_i$ is far from the others in the population, whereas a lower value implies that some other members lie near it. This is clear from Fig. 2(a), where the numbers denote the distance metric associated with a particular individual in the solution space. The member at (2.5, −5) has the maximum $\Theta$, as it is located furthest from the rest of the population. On the other hand, the individual at (0, 0), having (0.5, −0.7) very close to it, possesses the smallest distance metric. If all the members of the population are symmetrically located in the solution space, $\Theta$ will be the same for all of them, as is evident from Fig. 2(b).

Now, we have two objectives associated with each individual in the population

$$\text{Maximize } f_1(\vec{X}_i) = f(\vec{X}_i) \tag{2a}$$

$$\text{Maximize } f_2(\vec{X}_i) = \frac{\Theta_i}{N_p} \tag{2b}$$

where $f(\vec{X}_i)$ is the multimodal function to be optimized and $\Theta_i / N_p$ is the average distance of an individual from all other members of the population of size $N_p$. The sorting and selection process should allow those members to survive that have both higher fitness (for maximization) and a higher average distance metric. In fact, a proper tradeoff between the two objectives is required so that the selected members may occupy all the global optima.
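The two objectives in (2a) and (2b) can be computed directly from the population. The sketch below is illustrative only: the function names `distance_metric` and `biobjective_values`, the toy population, and the test function are assumptions made here and do not come from the paper.

```python
import math

def distance_metric(population):
    """Distance metric of (1): sum of Euclidean distances to all members."""
    return [sum(math.dist(x_i, x_j) for x_j in population) for x_i in population]

def biobjective_values(population, f):
    """Return one (f1, f2) pair per individual, following (2a) and (2b).

    f2 needs no extra evaluation of f; it uses only the positions.
    """
    n_p = len(population)
    metric = distance_metric(population)
    return [(f(x), m / n_p) for x, m in zip(population, metric)]

# Made-up population: two close points and one isolated point.
pop = [(0.0, 0.0), (0.5, 0.0), (4.0, 0.0)]
print(distance_metric(pop))  # [4.5, 4.0, 7.5]

# Hypothetical test function (negated sphere, to be maximized).
objectives = biobjective_values(pop, lambda x: -(x[0] ** 2 + x[1] ** 2))
print(objectives[2])  # (-16.0, 2.5)
```

The isolated point receives the largest second objective even though its fitness is the worst, which is the tension that the subsequent sorting has to resolve. For a symmetric population, such as the four corners of a square, every member obtains the same metric value.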

B. Selection Procedure With Nondominated Sorting

Fig. 2. Two toy populations on a 2-D plane. (a) Asymmetrically distributed population. (b) Symmetrically distributed population.

1) Concept of Dominance: To understand the nondominated sorting procedure, we first review a few standard definitions. The concept of dominance may be formally stated in the following way.

Definition 1: Consider, without loss of generality, the following multiobjective optimization problem with $D$ decision variables $x$ (parameters) and $n$ objectives $y$:

$$\text{Maximize } \vec{Y} = f(\vec{X}) = \left( f_1(x_1, \ldots, x_D), \ldots, f_n(x_1, \ldots, x_D) \right) \tag{3}$$

where $\vec{X} = [x_1, \ldots, x_D]^T \in P$ and $\vec{Y} = [y_1, \ldots, y_n]^T \in O$. Here $\vec{X}$ is called the decision (parameter) vector, $P$ is the parameter space, $\vec{Y}$ is the objective vector, and $O$ is the objective space. A decision vector $\vec{A} \in P$ is said to dominate another decision vector $\vec{B} \in P$ (also written as $\vec{A} \succ \vec{B}$ for maximization) if and only if

$$\forall i \in \{1, \ldots, n\} : f_i(\vec{A}) \ge f_i(\vec{B}) \;\wedge\; \exists j \in \{1, \ldots, n\} : f_j(\vec{A}) > f_j(\vec{B}). \tag{4}$$

Based on this convention, we can define the nondominated solutions as follows.

Definition 2: Let $\vec{A} \in P$ be an arbitrary decision vector. The decision vector $\vec{A}$ is said to be nondominated regarding the set $P' \subseteq P$ if and only if there is no vector in $P'$ that dominates $\vec{A}$.
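The dominance test in (4) and the nondominated set of Definition 2 translate almost literally into code. In this minimal sketch the function names `dominates` and `nondominated` are hypothetical, the functions operate on already evaluated objective vectors rather than on decision vectors, and all objectives are assumed to be maximized.

```python
def dominates(a, b):
    """True if objective vector a dominates b under maximization, as in (4)."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better

def nondominated(points):
    """Members of points that no other member dominates (Definition 2)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Made-up objective vectors (f1, f2).
objs = [(3.0, 1.0), (2.0, 2.0), (1.0, 3.0), (1.0, 1.0), (2.0, 1.0)]
print(nondominated(objs))  # [(3.0, 1.0), (2.0, 2.0), (1.0, 3.0)]
```

Note that a vector never dominates itself, because the strict inequality in (4) cannot hold for any component.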

Fig. 3. Schematic representation of nondominated sorting followed by hypervolume measure-based sorting.

2) Nondomination Sorting: Suppose we have an initial population $G_0$ of size $N_p$. A child population $H_0$ is created from the parent population $G_0$ by using the mutation and crossover operators of DE. Initially, a combined population $R_t = G_t + H_t$ of size $2N_p$ is formed. Each solution of the population is assigned a fitness that is equal to its nondomination count [53]. Now all the solutions of $R_t$ are sorted based on their nondomination count. Solutions having the same nondomination count are grouped together and assigned to nondominated sets $F_k$, $k \ge 1$. If the total number of solutions belonging to the best nondominated set $F_1$ is smaller than $N_p$, $F_1$ is completely included in $G_{t+1}$. The remaining members of the population $G_{t+1}$ are chosen from subsequent nondominated fronts in the order of their ranking.
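The front-by-front filling described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that the "nondomination count" corresponds to the usual front rank obtained by repeatedly peeling off nondominated points, and the function names (`dominates`, `nondominated_fronts`, `select_by_fronts`) are hypothetical.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a dominates b under maximization, following (4)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def nondominated_fronts(objs: List[Sequence[float]]) -> List[List[int]]:
    """Group solution indices into fronts F1, F2, ... by peeling off
    the nondominated points of the remaining set (assumed ranking)."""
    remaining = set(range(len(objs)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(objs[j], objs[i])
                       for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


def select_by_fronts(objs: List[Sequence[float]],
                     n_keep: int) -> Tuple[List[int], List[int]]:
    """Include whole fronts while they fit into n_keep slots.

    Returns (kept, overflow_front). The overflow front is the first
    front that does not fit entirely; the remaining
    n_keep - len(kept) slots must be filled from it by a secondary
    criterion (the hypervolume measure in this paper).
    """
    kept: List[int] = []
    for front in nondominated_fronts(objs):
        if len(kept) + len(front) > n_keep:
            return kept, front
        kept.extend(front)
    return kept, []
```

The peeling loop is quadratic per front, which is adequate for illustration; a bookkeeping-based fast nondominated sort would be preferable for large populations.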

3) Hypervolume Measure-Based Sorting: To choose exactly $N_p$ solutions, the solutions of the last included front are sorted using the hypervolume measure, and the best among them (i.e., those with larger hypervolume contributions) are selected to fill in the available slots in $G_{t+1}$.

The hypervolume measure or S metric, as proposed by Zitzler and Thiele [54], was defined to be the size of the dominated space. Coello et al. [55] described this metric as the Lebesgue measure $\Lambda$ of the union of hypercubes $a_i$ defined by a nondominated point $m_i$ and a reference point $x_{\text{ref}}$:

$$S(M) = \Lambda\left(\left\{\bigcup_i a_i \,\middle|\, m_i \in M\right\}\right) = \Lambda\left(\bigcup_{m \in M} \{x \mid m \prec x \prec x_{\text{ref}}\}\right) \tag{5}$$

where $M \subseteq O \subseteq \mathbb{R}^n$. Using this metric, one member from the worst rank front ($F_I$) is discarded from the population. To select this individual, $\Delta S(s, F_I) = S(F_I) - S(F_I \setminus \{s\})$ is computed for all the members $s \in F_I$. The individual that minimizes $\Delta S(s, F_I)$ is eliminated from the population. This elimination process is continued until the population size decreases to $N_p$. Hence, this method keeps those solutions which maximize the population's S metric value, which implies that the covered hypervolume of a population will not decrease by this reduction scheme. The sorting schemes are outlined in Fig. 3.
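For two objectives, the reduction by least hypervolume contribution can be sketched as below. This is an illustrative sketch under stated assumptions: both objectives are maximized, the reference point `ref` is finite and lies below every point in both objectives, and the names `hypervolume_2d` and `reduce_front` are hypothetical. The paper states later that it uses an infinite reference point, which this sketch does not reproduce.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def hypervolume_2d(points: Sequence[Point], ref: Point) -> float:
    """Area dominated by the points and bounded below by ref
    (both objectives maximized)."""
    hv = 0.0
    best_f2 = ref[1]
    # Sweep from the largest f1 downwards; each point adds the slab
    # of f2-values not yet covered by points with larger f1.
    for f1, f2 in sorted(points, reverse=True):
        if f2 > best_f2:
            hv += (f1 - ref[0]) * (f2 - best_f2)
            best_f2 = f2
    return hv


def reduce_front(points: Sequence[Point], n_keep: int,
                 ref: Point) -> List[Point]:
    """Repeatedly drop the member s with the smallest
    Delta S(s, F) = S(F) - S(F without s) until n_keep remain."""
    front = list(points)
    while len(front) > n_keep:
        total = hypervolume_2d(front, ref)
        contrib = [
            total - hypervolume_2d(front[:i] + front[i + 1:], ref)
            for i in range(len(front))
        ]
        front.pop(contrib.index(min(contrib)))
    return front
```

Recomputing the full hypervolume for every candidate is the most direct transcription of $\Delta S$. A sorted sweep that reads each contribution from a point's two neighbors would be faster, at the cost of a less obvious correspondence to the definition.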

4) Effect of This Type of Sorting: Due to the nondominated sorting followed by the dominated hypervolume-based sorting, two features that are very helpful for niching come into the picture.

a) Reduction of multiple solutions near a single point: A solution having a smaller value of $\Delta_i$ is more crowded than other solutions in the population, i.e., many candidate solutions are near that point. As the solutions with less crowding are preferred, the aim of maximizing the $\Delta_i$ values effectively reduces crowding of multiple individuals near a particular solution. Hence, the tendency of the evolutionary optimization process to converge the entire population to a single optimum is reduced.


Fig. 4. Variation of the population variance with generations for two sample functions. (a) Himmelblau's function [f8 in Table I(a)]. (b) Inverted Shubert function [f11 in Table I(a)].

b) Diversity enhancement: The second objective is primarily introduced to maintain the population diversity while exploring the search space. During the sorting phase, solutions with a higher value of the second objective are more likely to dominate the other solutions. These dominating solutions survive and reproduce in the next generation. Consequently, the variance of the population does not decrease continuously as in the case of classical DE with selection [18] for detecting a single global optimum. Thus, the diversity of the population is maintained throughout the exploration and exploitation processes. This conclusion is also supported by Fig. 4, where the population variance is plotted against generations for a MOBiDE population of size 50, for two different benchmarks listed in Table I.

5) Putting It All Together: In the following, we provide a step-wise description of the MOBiDE algorithm.

a) Initialization: DE starts with a population of $N_p$ $D$-dimensional real-valued vectors representing the candidate solutions. The initial population at generation $t = 0$ is denoted as $X_0 = \{\vec{X}_{1,0}, \vec{X}_{2,0}, \ldots, \vec{X}_{N_p,0}\}$.

b) Generation of new solutions by DE: The mutation and crossover operators of DE/rand/1/bin [17] are applied on each member $\vec{X}_{i,t}$, $i \in [1, N_p]$, to generate a new population of trial vectors $Y_t = \{\vec{U}_{1,t}, \vec{U}_{2,t}, \ldots, \vec{U}_{N_p,t}\}$. Under this scheme, at first three distinct population members are selected for each vector $\vec{X}_{i,t}$, and the weighted difference of any two of them is added to the third one to create a donor vector $\vec{V}_{i,t}$. Next, the donor vector mixes its components with the target vector to create a final offspring or trial vector $\vec{U}_{i,t}$. However, the trial vector $\vec{U}_{i,t}$ is accepted as a member of $Y_t$ only if it is distinct from all the solutions of the current archive $AR_t$. This generation process is continued until $N_p$ new solutions are created. The scheme can be expressed through the following equations:

$$\vec{V}_{i,t} = \vec{X}_{\alpha^i_1,t} + F \cdot \left(\vec{X}_{\alpha^i_2,t} - \vec{X}_{\alpha^i_3,t}\right) \tag{6}$$

$$u_{j,i,t} = \begin{cases} v_{j,i,t} & \text{if } rand_{i,j} \le Cr \text{ or } j = j_{rand} \\ x_{j,i,t} & \text{otherwise} \end{cases} \tag{7}$$

$$\vec{Y}_{i,t} = \vec{U}_{i,t} \quad \text{if } \left\| \vec{Y}_{i,t} - \overrightarrow{AR}_{k,t} \right\| > \delta \;\; \forall k \in [1, A] \tag{8}$$

where the indices $\alpha^i_1$, $\alpha^i_2$, and $\alpha^i_3$ are mutually exclusive integers randomly chosen from the range $[1, N_p]$, $F$ is the scale factor for amplifying the difference vector, $Cr \in [0, 1]$ is the crossover rate, and $rand_{i,j}[0, 1)$ is a uniformly distributed random number, which is called anew for each $j$th component of the $i$th parameter vector. $j_{rand} \in [1, 2, \ldots, D]$ is a randomly chosen index, which ensures that $\vec{U}_{i,t}$ gets at least one component from $\vec{V}_{i,t}$. $\delta$ is a very small positive number (typically $0 < \delta < 10^{-3}$), and $\overrightarrow{AR}_{k,t}$ is the $k$th member of the archive at generation $t$. If the distance between two solutions is less than $\delta$, we consider them as identical solutions. Hence, the number of trial vectors generated can be greater than $N_p$, and we will denote this number as $N_g$.

c) Calculation of the two objectives: After the creation

of new population $Y_t$, the two populations $X_t$ and $Y_t$ are combined to create $Z_t$ as follows:

$$Z_t = \{\vec{X}_{1,t}, \vec{X}_{2,t}, \ldots, \vec{X}_{N_p,t}, \vec{Y}_{1,t}, \vec{Y}_{2,t}, \ldots, \vec{Y}_{N_p,t}\} = \{\vec{Z}_{1,t}, \ldots, \vec{Z}_{2N_p,t}\}.$$

Then, the two objectives of each solution are calculated as follows:

$$f_{1,i} = f(\vec{Y}_{i,t}) \tag{9a}$$

$$f_{2,i} = \frac{\Delta_i}{2N_p} = \frac{1}{2N_p} \sum_{j=1}^{2N_p} \left\| \vec{Z}_i - \vec{Z}_j \right\| \tag{9b}$$

where $f$ denotes the multimodal function under test.

d) Sorting followed by selection: In this step, the

combined population is sorted using nondomination sorting and hypervolume measure-based sorting, in that order. The latter matters because, if some solutions have the same domination count, they are sorted according to their hypervolume measure. While calculating the Lebesgue measure, we used an infinite reference point for $x_{\text{ref}}$, as done in [56]. In this approach, finding the solution contributing least to the hypervolume is independent of $x_{\text{ref}}$. The extremal points of a front are anyway included in the available slots of $G_{t+1}$. Therefore, a careful choice of $x_{\text{ref}}$ is unnecessary in this selection process.

The S metric-based reduction allows $N_p$ individuals to pass to the next generation and thus keeps the population size constant. This selection strategy is totally different from the selection scheme of the original DE. In fact, this type of selection procedure makes DE capable of finding multiple optima of the objective function.
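Steps b) and c) above can be sketched in a few lines. The sketch is illustrative only and rests on several assumptions that the text does not fix: the three donor indices are also taken to differ from the target index (the usual DE convention), a rejected trial is regenerated for the same target, a `max_trials` cap guards against endless rejection, and bound handling is omitted. All function and parameter names are hypothetical.

```python
import math
import random
from typing import Callable, List, Tuple

Vector = List[float]


def make_trial(pop: List[Vector], i: int, F: float, Cr: float,
               rng: random.Random) -> Vector:
    """DE/rand/1/bin: donor as in (6), binomial crossover as in (7)."""
    D = len(pop[i])
    # Three mutually distinct indices; excluding i is an assumption.
    a, b, c = rng.sample([k for k in range(len(pop)) if k != i], 3)
    donor = [pop[a][j] + F * (pop[b][j] - pop[c][j]) for j in range(D)]
    j_rand = rng.randrange(D)  # guarantees one donor component
    return [donor[j] if (rng.random() <= Cr or j == j_rand) else pop[i][j]
            for j in range(D)]


def generate_offspring(pop: List[Vector], archive: List[Vector],
                       F: float, Cr: float, delta: float,
                       rng: random.Random,
                       max_trials: int = 100_000
                       ) -> Tuple[List[Vector], int]:
    """Collect Np trials farther than delta from every archive member,
    in the spirit of (8). Returns (offspring, Ng), where Ng counts all
    trials generated, including the rejected ones."""
    offspring: List[Vector] = []
    n_generated = 0
    for i in range(len(pop)):
        while True:
            if n_generated >= max_trials:  # safety cap (assumption)
                raise RuntimeError("trial budget exhausted")
            trial = make_trial(pop, i, F, Cr, rng)
            n_generated += 1
            if all(math.dist(trial, ar) > delta for ar in archive):
                offspring.append(trial)
                break
    return offspring, n_generated


def objectives(combined: List[Vector],
               f: Callable[[Vector], float]
               ) -> Tuple[List[float], List[float]]:
    """f1: fitness; f2: mean distance to all members of the combined
    population, as in (9b). Parent fitness would normally be cached;
    it is re-evaluated here only for brevity."""
    n = len(combined)
    f1 = [f(z) for z in combined]
    f2 = [sum(math.dist(z, w) for w in combined) / n for z in combined]
    return f1, f2
```

The population passed to `make_trial` needs at least four members so that three donors distinct from the target exist.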

e) Archive update: Before updating the archive, the global best fitness value ($g_{best}$) is updated if any new solution is found with better fitness ($f_1$). Next, the solutions having fitness near $g_{best}$ are included in the archive if they are not already in the archive, and the archive is updated by introducing a small heuristic parameter $\alpha$ in the following way:

$$\overrightarrow{AR}_{A+1,t+1} = \vec{Z}_{i,t} \quad \text{if} \quad \begin{cases} f(\vec{Z}_{i,t}) < (1 + \alpha) \cdot g_{best} & \text{when } g_{best} > 0 \\ f(\vec{Z}_{i,t}) < (1 - \alpha) \cdot g_{best} & \text{when } g_{best} < 0 \\ f(\vec{Z}_{i,t}) < 0.001 & \text{when } g_{best} \approx 0. \end{cases} \tag{10}$$

We varied $\alpha$ from 0.1 to 0.95 in steps of 0.1 and observed the success rates of the algorithm for different functions. We found that, for most of the benchmark instances, the best success rates were achieved for $\alpha$ lying between 0.05 and 0.25. For all the experiments described in the following sections, we selected $\alpha = 0.1$. Due to this strategy, the archive contains the solutions near some global peak. Since, in the generation step, no new solution very near to any member of the archive is accepted, the archive will contain very few solutions which lie near the same global peak. Only the solutions whose distance from a global peak is just greater than $\delta$ and whose fitness value is close to $g_{best}$ can enter the archive. This is also evident from the simulation results.

However, in the beginning of the optimization process, when $g_{best}$ is not the same as the true global best fitness (the fitness of any global peak), i.e., not a single peak has been detected, some solutions, despite remaining far from any global peak, can enter the archive if their fitness values are close to $g_{best}$. This may cause an unnecessary increase of the archive size and of the complexity of the algorithm, as the number of distance calculations will also increase. To prevent this, we have introduced a clearing strategy for the archive (only) after the above update. The mean of the fitness values of all the members of the archive is calculated (say $g_{bavg}$). All the members having fitness less than $g_{bavg}$ are rejected, and a new archive is created containing the solutions with fitness greater than or equal to $g_{bavg}$. If all the members of the archive have the same fitness ($= g_{best}$), no one is rejected from the archive. Thus, it is ensured that a global optimum, once detected, is never lost from the archive. This preservation of the global peaks improves the performance of the algorithm to a great extent.
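The clearing step can be sketched as follows. This is a minimal illustration that covers only the mean-fitness clearing described in the text, not the admission rule (10). It assumes maximization and an archive stored as a list of `(solution, fitness)` pairs; the name `clear_archive` is hypothetical.

```python
from typing import List, Tuple

Entry = Tuple[List[float], float]  # (solution, fitness)


def clear_archive(archive: List[Entry]) -> List[Entry]:
    """Keep the archive members whose fitness is at least the mean
    archive fitness (g_bavg in the text)."""
    if not archive:
        return []
    fits = [fit for _, fit in archive]
    g_bavg = sum(fits) / len(fits)
    best = max(fits)
    # The explicit `fit == best` test guards against floating-point
    # rounding of the mean when all members share the same fitness,
    # so that the best member is never dropped.
    return [(x, fit) for x, fit in archive
            if fit >= g_bavg or fit == best]
```

Because the best fitness is never below the mean, the best member always survives, which matches the statement that a detected global optimum is never lost from the archive.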

Due to the clearing, however, the individuals occupying the local optima will be removed from the archive whenever any global optimum, having better fitness than the local peaks, is discovered. Hence, at the end of the search process, local solutions are not expected to be present in the archive. Rather, the population itself preserves these local solutions due to the inherent nature of MOBiDE's generation and selection process for maintaining the diversity of the population. If the local and global peaks are not very close to each other, the individuals at local optima will have a good second objective value. Solutions near the local optima are likely to have a similar second objective value but will be inferior in the first objective, by the definition of local optima. Other individuals near global peaks, better in the first objective, are also unlikely to dominate the individuals at local peaks because of the latter's higher $\Delta_i$ values. As the tendency of DE is to converge to the global optima, some solutions are generated near the discovered global peaks in almost every generation, thus adding much to the $\Delta_i$ measure of the local solutions and less to that of the global ones.

On the other hand, if the global and local peaks are closely located, there is a probability that the individuals near global peaks will dominate the local solutions. However, as new solutions within a distance of $\delta$ from the discovered global solutions are not included in the population, there is a high probability that the new candidate solutions will be created at or near the local optima, which are, in this case, close to the global optima but not within a distance of $\delta$. Thus, local solutions can be dynamically added to and discarded from the population. Toward the end of the search, when the population converges, the spread of the solutions decreases and consequently fewer solutions appear close to one peak. At this stage, usually, only the solutions at global peaks can dominate solutions at local peaks. For this reason, the population size $N_p$ is kept higher than the total number of global and local peaks so that, although dominated by a few solutions, the local solutions may remain in the population.

6) Reduction of FEs Using the External Archive: An external archive $AR_t$ is maintained to keep track of the optima already detected. At every generation, some new solutions having fitness nearly the same as the global best fitness are inserted into the archive. After the discovery of (at least) one global peak, the archive contains primarily solutions near different optima. As no new solution very close to $\overrightarrow{AR}_{k,t}$ ($k \in [1, A]$) is included in the population, fitness evaluation does not take place for new solutions near an already detected optimum. Hence, FEs are not wasted on solutions near the detected peaks, resulting in a significant reduction of the total number of FEs required to detect all the peaks.

7) Proof-of-Principle Result With MOBiDE: To demonstrate the ability of MOBiDE to solve multimodal optimization problems, a typical 1-D multimodal function (with one global peak and six local peaks) is selected and optimized by using MOBiDE with parameters $N_p = 50$, $F = 0.8$, and $Cr = 0.9$. The function is defined as $f(x) = \log(x)\,(\sin(e^x) + \sin(3x))$, $x \in (0, 3.5]$.
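To get a feel for this landscape before running any optimizer, the function can be evaluated on a fine grid and its interior local maxima listed. The sketch below is only an exploratory aid, not part of MOBiDE. It assumes that $\log$ denotes the natural logarithm, and the names `poc_function` and `grid_local_maxima`, as well as the grid resolution, are hypothetical choices.

```python
import math
from typing import Callable, List


def poc_function(x: float) -> float:
    """f(x) = log(x) * (sin(e^x) + sin(3x)); natural log assumed."""
    return math.log(x) * (math.sin(math.exp(x)) + math.sin(3 * x))


def grid_local_maxima(func: Callable[[float], float],
                      lo: float, hi: float,
                      n: int = 20000) -> List[float]:
    """Return grid points that are strictly higher than both grid
    neighbors. The left end lo itself is excluded because the
    logarithm is undefined at 0."""
    xs = [lo + (hi - lo) * k / n for k in range(1, n + 1)]
    ys = [func(x) for x in xs]
    return [xs[k] for k in range(1, n - 1)
            if ys[k - 1] < ys[k] > ys[k + 1]]


if __name__ == "__main__":
    for x in grid_local_maxima(poc_function, 0.0, 3.5):
        print(f"x = {x:.4f}, f(x) = {poc_function(x):.4f}")
```

A grid scan can miss very narrow peaks or report plateau artifacts, so its output should be read as a rough map of the landscape rather than as a ground truth for the number of optima.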

The optima detected by the archived solutions of MOBiDE on this function are shown in Fig. 5. It is observed that, along with the global peak, local peaks have also been detected by MOBiDE. This happens because local peaks have better fitness values than some solutions and higher second objective values than some solutions which are nearer to a global peak. Even if the solutions very near to a global peak have better fitness values, they will not be able to dominate the solutions at local peaks due to their lower second objective value, which is decreased by the presence of other solutions (global optimal solutions) close to them. Only the individuals at the global optima can dominate the solutions at local peaks. If the population size is greater than the total number of global and local peaks, all of them (global and local), if detected, will remain in the population. Hence it can be concluded that, with a substantially high population size, MOBiDE can detect both local and global peaks of the function landscape.

TABLE I(a): Summary of Test Function Set 1

| Name | D | Test Function | Range | No. of Global Peaks | Peak Fitness Value |
|---|---|---|---|---|---|
| f1: Two-peak trap | 1 | $f_1(x) = \begin{cases} \frac{160}{15}(15 - x) & \text{for } 0 \le x < 15 \\ \frac{200}{5}(x - 15) & \text{for } 15 \le x \le 20 \end{cases}$ | $0 \le x \le 20$ | 1 | 200 |
| f2: Central two-peak trap | 1 | $f_2(x) = \begin{cases} \frac{160}{10}x & \text{for } 0 \le x < 10 \\ \frac{160}{5}(15 - x) & \text{for } 10 \le x < 15 \\ \frac{200}{5}(x - 15) & \text{for } 15 \le x \le 20 \end{cases}$ | $0 \le x \le 20$ | 1 | 200 |
| f3: Five-uneven-peak trap | 1 | $f_3(x) = \begin{cases} 80(2.5 - x) & \text{for } 0 \le x < 2.5 \\ 64(x - 2.5) & \text{for } 2.5 \le x < 5 \\ 64(7.5 - x) & \text{for } 5 \le x < 7.5 \\ 28(x - 7.5) & \text{for } 7.5 \le x < 12.5 \\ 28(17.5 - x) & \text{for } 12.5 \le x < 17.5 \\ 32(x - 17.5) & \text{for } 17.5 \le x < 22.5 \\ 32(27.5 - x) & \text{for } 22.5 \le x < 27.5 \\ 80(x - 27.5) & \text{for } 27.5 \le x \le 30 \end{cases}$ | $0 \le x \le 30$ | 2 | 200 |
| f4: Equal maxima | 1 | $f_4(x) = \sin^6(5\pi x)$ | $0 \le x \le 1$ | 5 | 1 |
| f5: Decreasing maxima | 1 | $f_5(x) = \exp\left[-2\log(2) \cdot \left(\frac{x - 0.1}{0.8}\right)^2\right] \cdot \sin^6(5\pi x)$ | $0 \le x \le 1$ | 1 | 1 |
| f6: Uneven maxima | 1 | $f_6(x) = \sin^6\left(5\pi\left(x^{3/4} - 0.05\right)\right)$ | $0 \le x \le 1$ | 5 | 1 |
| f7: Uneven decreasing maxima | 1 | $f_7(x) = \exp\left[-2\log(2) \cdot \left(\frac{x - 0.08}{0.854}\right)^2\right] \cdot \sin^6\left(5\pi\left(x^{3/4} - 0.05\right)\right)$ | $0 \le x \le 1$ | 1 | 1 |
| f8: Himmelblau's function | 2 | $f_8(x, y) = 200 - (x^2 + y - 11)^2 - (x + y^2 - 7)^2$ | $-4 \le x, y \le 4$ | 4 | 200 |
| f9: Six-hump camel back | 2 | $f_9(x, y) = -4\left[\left(4 - 2.1x^2 + \frac{x^4}{3}\right)x^2 + xy + \left(-4 + 4y^2\right)y^2\right]$ | $-1.9 \le x \le 1.9$, $-1.1 \le y \le 1.1$ | 2 | 4.1265 |
| f10: Shekel's foxholes | 2 | $f_{10}(x, y) = 500 - \dfrac{1}{0.002 + \sum_{i=0}^{24} \frac{1}{1 + i + (x - a(i))^6 + (y - b(i))^6}}$, where $a(i) = 16(i \bmod 5 - 2)$ and $b(i) = 16(\lfloor i/5 \rfloor - 2)$ | $-65.536 \le x, y \le 65.536$ | 1 | 476.191 |
| f11: 2-D inverted Shubert function | 2 | $f_{11}(\vec{X}) = -\prod_{i=1}^{D} \sum_{j=1}^{5} j \cos[(j + 1)x_i + j]$ | $-10 \le x_1, x_2 \le 10$ | 18 | 186.731 |
| f12: 1-D inverted Vincent function | 1 | $f(\vec{X}) = \frac{1}{D} \sum_{i=1}^{D} \sin(10 \cdot \log(x_i))$ | $0.25 \le x_i \le 10$ | 6 | 1 |
| f13: 2-D inverted Vincent function | 2 | $f(\vec{X}) = \frac{1}{D} \sum_{i=1}^{D} \sin(10 \cdot \log(x_i))$ | $0.25 \le x_i \le 10$ | 36 | 1 |
| f14: 3-D inverted Vincent function | 3 | $f(\vec{X}) = \frac{1}{D} \sum_{i=1}^{D} \sin(10 \cdot \log(x_i))$ | $0.25 \le x_i \le 10$ | 216 | 1 |

C. Runtime Complexity of MOBiDE: A Discussion

The initialization phase involves creating $N_p$ random solutions having $D$ decision variables each and thus has $\Theta(D \cdot N_p)$ complexity. Creation of each donor vector through mutation involves a constant number of random integer generations and two vector additions, which is a $\Theta(D)$ operation, and thus generation of $N_p$ donor vectors takes $\Theta(D \cdot N_p)$ time. Similarly, each crossover operation is $\Theta(D)$, and for $N_p$ trial vectors the complexity is $\Theta(D \cdot N_p)$. However, if some new solutions are rejected, the generation process continues (i.e., the number of trial vectors created exceeds $N_p$) until $N_p$ acceptable solutions are generated. Thus, the worst case runtime complexity of the generation process has a lower bound of $\Omega(D \cdot N_p)$. We will investigate the upper bound subsequently. After the generation phase, the two objectives are evaluated for all new trial vectors. The calculation of the first objective is clearly of $\Theta(N_p)$ complexity. But calculating the second objective of $N_p$ individuals involves distance calculations from each individual to all other individuals. Thus, this step has a complexity of $\Theta(D \cdot N_p^2)$.

Fig. 5. All six maxima are detected by the MOBiDE algorithm.

The next phase is selection. We know that nondominated sorting for a two-objective case will need $O(2 \cdot N_p^2)$ or $O(N_p^2)$ time [53]. As hypervolume measure-based sorting is applied only to solutions in one nondominated front (with two objectives), its runtime complexity is $O(N_p \cdot \log N_p)$ [56]. Hence, the overall complexity of the selection step is $O(N_p^2)$. The archive update phase primarily involves copying and deleting individuals based on the fitness values. As the number of such operations is limited by $O(N_p)$, the complexity of this step is $O(D \cdot N_p)$. Thus, except for the generation process, the worst case complexity of the rest of the computations is $\Theta(D \cdot N_p^2)$.

Coming to the complexity of the generation process, the number of candidate trial solutions generated (say $N_g$) evidently depends on the archive size, the position of the solutions in the archive, and the random numbers involved in mutation and crossover. Thus, it is difficult to analytically determine an exact bound on the runtime. We can, however, estimate the upper bound of $N_g$ from simulations and then infer that the expected complexity of each generation of MOBiDE is $\Theta(D \cdot N_p^2) + O(D \cdot N_g)$. For different test functions, we recorded the maximum value of $N_g$ over all generations. Fig. 6 shows the ratio of $N_g$ to $N_p$ for ten representative functions with their corresponding $N_p$ values. These ten functions cover a wide range of population sizes and search space bounds. Our experiments indicate that, in general, the ratio is far less than $N_p$ in all cases and in fact less than a constant factor of 5 for all the 29 benchmarks. Hence, we can conclude that $N_g < c_1 \cdot N_p$ for some positive constant $c_1$ and $\forall N_p$, indicating that $N_g = O(N_p)$. Hence, from our previous discussion, we see that the runtime complexity of each generation of MOBiDE is $O(D \cdot N_p^2)$. However, in order to detect both global and local peaks, $N_p$ should be chosen much greater than $D$. Consequently, $D$ appears as a small constant as compared to $N_p^2$. Most niching techniques, such as Fitness Sharing, Clearing, and S-CMA, use distance measures between individuals and thus have complexity $O(N_p^2)$. The time complexity of SCGA [25] also lies between $O(N_p)$ and $O(N_p^2)$. From this perspective, MOBiDE is not seriously expensive as compared to some of the prominent niching algorithms existing in the literature.
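The per-generation cost discussed above can be collected in one line. The following is only a summary of the bounds quoted in the text, written with upper bounds throughout; the symbol $T_{\text{gen}}$ is introduced here for illustration and does not appear in the original analysis.

```latex
\begin{align*}
T_{\text{gen}}
  &= \underbrace{O(D \cdot N_g)}_{\text{trial generation}}
   + \underbrace{O(D \cdot N_p^2)}_{\text{second objective}}
   + \underbrace{O(N_p^2)}_{\text{nondominated sorting}}
   + \underbrace{O(N_p \log N_p)}_{\text{hypervolume sorting}}
   + \underbrace{O(D \cdot N_p)}_{\text{archive update}} \\
  &= O(D \cdot N_p^2) + O(D \cdot N_g) \\
  &= O(D \cdot N_p^2) \qquad \text{when } N_g < c_1 \cdot N_p .
\end{align*}
```

The last step relies on the empirical observation $N_g = O(N_p)$ reported above, so the final bound is supported by simulation rather than by proof.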

IV. Experimental Setup

A. Numerical Benchmarks

To evaluate the performance of the MOBiDE algorithm, we used 29 challenging test functions of various characteristics, such as irregular landscape, symmetric or equal distribution of optima, unevenly spaced optima, multiple global optima in the presence of multiple local optima, and dimensional scalability with function rotation. The test functions are divided into two classes: test function set 1, comprising 14 general multimodal functions [20], and test function set 2, comprising 15 scalable composition functions [21]. A brief description of the functions is provided in Tables I(a) and I(b).

Fig. 6. Sample ratios of $N_g$ and $N_p$ for 10 different test functions with values of $N_p$ in braces (along the x-axis).

TABLE I(b): Summary of Test Function Set 2 (Corresponding to the Composite Functions (CF) From [21])

| Name | D | Range | No. of Global Peaks | Peak Fitness Value |
|---|---|---|---|---|
| CF1 | 10, 30 | $-5 \le x_i \le 5$ | 8 | 0.000 |
| CF2 | 10, 30 | $-5 \le x_i \le 5$ | 6 | 0.000 |
| CF3 | 10, 30 | $-5 \le x_i \le 5$ | 6 | 0.000 |
| CF4 | 10, 30 | $-5 \le x_i \le 5$ | 6 | 0.000 |
| CF5 | 10, 30 | $-5 \le x_i \le 5$ | 6 | 0.000 |
| CF6 | 10, 30 | $-5 \le x_i \le 5$ | 6 | 0.000 |
| CF7 | 10, 30 | $-5 \le x_i \le 5$ | 6 | 0.000 |
| CF8 | 10, 30 | $-5 \le x_i \le 5$ | 6 | 0.000 |
| CF9 | 10, 30 | $-5 \le x_i \le 5$ | 6 | 0.000 |
| CF10 | 10, 30 | $-5 \le x_i \le 5$ | 6 | 0.000 |
| CF11 | 10, 30 | $-5 \le x_i \le 5$ | 8 | 0.000 |
| CF12 | 10, 30 | $-5 \le x_i \le 5$ | 6 | 0.000 |
| CF13 | 10, 30 | $-5 \le x_i \le 5$ | 6 | 0.000 |
| CF14 | 10, 30 | $-5 \le x_i \le 5$ | 6 | 0.000 |
| CF15 | 10, 30 | $-5 \le x_i \le 5$ | 6 | 0.000 |

B. Algorithms Compared

Performance of MOBiDE is compared with the following standard multimodal EAs (single objective):

1) crowding DE (CDE) [7];
2) speciation-based DE (SDE) [57];
3) fitness-Euclidean distance ratio PSO (FER-PSO) [58];
4) speciation-based PSO (SPSO) [59];
5) r2pso: r2pso [20] is an lbest PSO with a ring topology in which each member interacts only with its immediate neighbor to its right;
6) r3pso: r3pso [20] is an lbest PSO with a ring topology in which each member interacts with its immediate neighbors on its left and right;
7) CMA-ES with self-adaptive niche radius (S-CMA) [36];
8) CMA-ES with fixed niche radius (CMA) [35].

We also consider two other very recently proposed biobjective EAs for multimodal optimization in our comparative study: 1) biobjective multipopulation genetic algorithm (BMPGA) [13], and 2) niching-based NSGA-II [15]. The parametric setup for these algorithms, as used in this paper, is shown in Table II. A few typical parameters, not mentioned in Table II, were fixed following the corresponding references.


TABLE II: Parametric Setup for BMPGA [13] and Niching-Based NSGA-II [15]

| Niching-Based NSGA-II: Param. | Description | Value | BMPGA: Param. | Description | Value |
|---|---|---|---|---|---|
| $\delta_i$ | Hooke–Jeeves distance | 0.005*D | $N_s$ | Size of subpopulations | 20 |
| $\delta_f$ | Domination parameter | 0.5 | $R_e$ | Elitism percentage | 0.1 |
| $\delta_x$ | Normalized distance | 0.2 | $R_d$ | Percentage of least fit individuals discarded | 0.15 |
| | | | $\sigma$ | Convergence threshold | 0.025 |
| | | | $\sigma_{dis}$ | Global threshold for clearing | 0.23 |

Population sizes for the ten peer algorithms were fixed as per their respective references. Population sizes for MOBiDE were allotted based on the degree of complexity of the functions and are listed in Table III. However, all compared algorithms, including MOBiDE, were allowed to run till the same maximum number of FEs, as discussed in Section IV-D.

C. Performance Measures

To compare the performances of different multimodal algorithms, we specified a level of accuracy $\varepsilon$ (typically $0 < \varepsilon \le 1$) indicating how close the fitness values of the computed solutions are to the known global peaks, i.e., whether $|f_{\max} - f(\vec{x})| \le \varepsilon$. If the difference between the fitness value of a computed solution and that of a known global optimum is below $\varepsilon$, a peak is considered to have been found, and no other solutions within the niche radius (Euclidean distance < niche radius) are counted.

Different values of $\varepsilon$ have been used for different test functions based on the level of complexity of the associated fitness landscapes, following recent works such as [50] and [60]. $\varepsilon$ was set so that detecting the peaks with a sufficient level of accuracy becomes challenging for the algorithms compared. For example, for the relatively easier functions f4 to f7, we used $\varepsilon = 10^{-6}$, whereas in articles like [20], $\varepsilon$ is taken as 0.01. This is also a reason why some of the contender algorithms performed poorly in the comparative study undertaken in Section V as compared to the results reported in the literature.

The performance of each of the multimodal optimizers is measured in terms of the following three criteria: 1) success rate: the percentage of runs in which all global peaks were successfully found within the budget of maximum FEs; 2) average number of optima found [61]; and 3) average number of FEs required to detect all global optima, taken over the successful runs among the 50 runs.
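One possible way to compute the first two criteria is sketched below. This is an illustration under assumptions rather than the evaluation code used in the experiments: it assumes the positions of the global peaks are known, counts a peak as found when some solution lies within the niche radius of it and has fitness within $\varepsilon$ of the peak fitness, and uses the hypothetical names `count_peaks_found` and `success_rate`.

```python
import math
from typing import List, Sequence


def count_peaks_found(solutions: Sequence[Sequence[float]],
                      fitnesses: Sequence[float],
                      peaks: Sequence[Sequence[float]],
                      peak_fitness: float,
                      eps: float,
                      niche_radius: float) -> int:
    """Number of known global peaks that have at least one solution
    within niche_radius whose fitness is within eps of the peak.
    Each peak is counted once, however many solutions sit on it."""
    found = 0
    for peak in peaks:
        if any(math.dist(s, peak) < niche_radius
               and abs(peak_fitness - fs) <= eps
               for s, fs in zip(solutions, fitnesses)):
            found += 1
    return found


def success_rate(peaks_found_per_run: Sequence[int],
                 n_peaks: int) -> float:
    """Percentage of runs in which every global peak was located."""
    hits = sum(1 for k in peaks_found_per_run if k == n_peaks)
    return 100.0 * hits / len(peaks_found_per_run)
```

For example, with four runs that found 5, 5, 4, and 5 of five peaks, `success_rate([5, 5, 4, 5], 5)` gives 75.0, and the average number of optima found is simply the mean of the same list.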

All performances are calculated and averaged over 50 independent runs. All the algorithms are implemented in MATLAB 7.5 and executed on a Pentium Core 2 Duo machine with 2 GB RAM and 2.23 GHz speed.

TABLE III: Population Sizes Used for MOBiDE and Maximum Number of FEs for Different Benchmarks

| Function Number | Population Size ($N_p$) | Maximum Number of FEs |
|---|---|---|
| f1 to f10 | 30*D | 10 000 |
| f11 | 250 | 100 000 |
| f12 | 100 | 20 000 |
| f13 | 200 | 200 000 |
| f14 | 1000 | 400 000 |
| CF1 to CF15 in 10D | 600 | 300 000 |
| CF1 to CF15 in 30D | 1000 | 800 000 |

D. Maximum Number of Evaluations

Different maximum numbers of FEs were allotted for different functions depending upon their degrees of complexity. However, the same maximum number of FEs (as a stopping criterion) was used for all algorithms on the same function in order to ensure homogeneous experimental conditions. The maximum allowable numbers of FEs are tabulated in Table III for the different benchmark maximization problems.

V. Results and Analysis

A. Comparing MOBiDE With Other Evolutionary Multimodal Optimizers

On 29 benchmark functions, all algorithms were run untilall known peaks were found or the maximum budget of FEswas exhausted. Table IV shows the average number of globalpeaks detected by MOBiDE and the other ten evolutionarymultimodal optimizers on test function sets 1. Table Vapresents the corresponding comparative results of MOBiDEand nine other peer algorithms on composition functions CF1to CF15 in 10 dimensions in terms of the mean number ofglobal peaks found. Table Vb shows similar results on thecomposition functions in 30 dimensions. Note that BMPGAhas not been run on test function set 2 as the second objectiveof this algorithm requires at least a numerical estimate ofthe function gradient as indicated in (8) on page 84 of [13].The composition functions are mostly nondifferentiable dueto the incorporation of Weierstrass functions and have verycomplex and rugged fitness landscapes. In this context, authorsof [13] state that “if the function is nondifferentiable andbadly behaved, e.g., a function with large discontinuities,gradient may not be an appropriate measure. There may stillbe ways to design the secondary fitness objectives, but thisis beyond the scope of this paper.” For the niching-basedNSGA-II algorithm [14], we used the neighborhood pointcount method (page 10, [14]) to avoid the need for first-and second-order derivatives. The second and third columnsof Tables IVa and b indicate the level of accuracy and nicheradius (for SDE and SPSO) used in the experiments. All valuesof the niche radius have been taken from works like [20],[21], and [50]. Note that except for the composite functionsCF1 to CF15, the level of accuracy (ε) have been takensufficiently small so that it may become challenging for thealgorithms to locate the peaks. Table VI provides the success

Page 11: Multimodal Optimization Using a Biobjective Differential Evolution Algorithm Enhanced With Mean Distance-Based Selection

676 IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, VOL. 17, NO. 5, OCTOBER 2013

TABLE IV

Average Number of Global Peaks Found for Test Function Set 1 and the Corresponding Ranks (in Parentheses)

Func. Niche Rad. ε MOBi-DE CDE SDE S-CMA CMA SPSO FER-PSO r2pso r3pso Niching- BMPGABased NSGA-II

f1 0.5 0.05 1 (4) 1 (4) 1 (4) 1 (4) 1 (4) 0.44†(11) 0.82†(9) 0.76†(10) 0.84†(8) 1 (4) 1 (4)f2 0.5 0.05 1 (4.5) 1 (4.5) 1 (4.5) 1 (4.5) 1 (4.5) 0.40†(11) 1 (4.5) 0.88†(10) 0.96 (9) 1 (4.5) 1 (4.5)f3 0.5 1e − 6 2 (2) 2 (2) 1.96 (4) 1.92† (6) 1.95 (5) 1.38† (8) 0.8† (9) 0.48† (11) 0.6† (10) 2 (2) 1.50† (7)

f4 0.01 1e − 6 5 (1) 3.84† (9) 4.70† (5) 0.04† (11) 0.6† (10) 4.88 (2) 4.84† (3) 4.68† (6) 4.74† (4) 4.37† (8) 4.64† (7)

f5 0.01 1e − 6 1 (5.5) 0.72†(11) 1 (5.5) 1 (5.5) 1 (5.5) 1 (5.5) 1 (5.5) 1 (5.5) 1 (5.5) 1 (5.5) 1 (5.5)f6 0.01 1e − 6 5 (1.5) 3.96† (8) 4.6† (6) 0† (11) 0.64† (10) 4.92 (3) 4.96 (2) 4.88 (4) 4.72† (5) 4.52† (7) 3.56† (9)

f7 0.01 1e − 6 1 (5) 0.8†(10) 1 (5) 1 (5) 1 (5) 1 (5) 1 (5) 1 (5) 1 (5) 0. 6†(11) 1 (5)f8 0.5 5e − 4 4 (1) 0.32† (11) 3.72† (3.5) 3.72† (3.5) 3.43† (6) 0.84† (10) 3.68† (5) 2.92† (8) 2.76† (9) 3.74† (2) 3.34† (7)

f9 0.5 1e − 6 2 (2.5) 0.04† (11) 2 (2.5) 1.6† (6) 2 (2.5) 0.08† (10) 1.96 (5) 1.44† (8) 1.56† (7) 2 (2.5) 1.24† (9)

f10 0.5 1e − 5 1 (1.5) 0.52† (8) 0.32† (9) 0.06† (11) 0.18† (10) 0.56† (7) 1 (1.5) 0.88† (5) 0.76† (6) 0.94† (4) 0.96 (3)

f11 0.2 5e − 2 17.40 (1) 11.39† (11) 12.78† (8) 12.04† (9) 12† (10) 14.36† (6) 15.61† (5) 15.95† (4) 16.45† (3) 16.94 (2) 13.72† (7)

f12 0.2 0.05 6 (1) 5.56† (5.5) 4.88† (11) 5.56 (5.5) 5.81 (2) 5.6 (4) 5.28† (9) 5.52 (7) 5.16† (10) 5.36† (8) 5.71 (3)

f13 0.2 1e − 3 35.40 (1) 33.8† (2) 23.8† (6) 24.6† (5) 23.6† (7) 25.72† (4) 21.8† (11) 22.4† (8) 22.2† (9) 22.05† (10) 31.76† (3)

f14 0.1 1e − 3 175.88 (1) 152† (2) 50.6† (7) 0.6† (11) 32.5† (10) 70.12† (5) 68.6† (6) 40.6† (9) 45.4† (8) 92.76† (4) 121.54† (3)Total Ranks 31.5 99 81 98 91.5 91.5 80.5 100.5 98.5 74.5 77

Best results are marked in boldface.

TABLE V(a):

Average Number of Global Peaks Found for Composite Functions CF1 to CF15 in 10 Dimensions

Func. Niche Rad. ε MOBiDE CDE SDE S-CMA CMA SPSO FER-PSO r2pso r3pso Niching-BasedNSGA-II

CF1 1 0.5 4.36 (1) 0† (8.5) 1.8† (2) 1.08† (3.5) 1† (5) 0† (8.5) 1.08† (3.5) 0† (8.5) 0† (8.5) 0.8† (6)

CF2 1 0.5 2 (1) 1.2† (5.5) 1.2† (5.5) 1.52† (3) 1.40† (4) 0† (10) 1.8† (2) 0† (10) 0† (10) 1† (7)

CF3 1 0.5 4.80 (1) 0.72† (8) 1.52† (7) 2.21† (4) 2.08† (5) 0† (9) 2.52† (2) 0† (9) 0† (9) 0. 6† (6)

CF4 1 0.5 3.4 (1) 0† (7) 0† (7) 0† (7) 0.4† (3) 0† (7) 0† (7) 0† (7) 0† (7) 0.41† (2)

CF5 1 0.5 3.95 (1) 1.12† (5) 1.32† (3) 1.2† (4) 1† (6) 0† (9) 2† (2) 0† (9) 0† (9) 0.34† (7)

CF6 1 0.5 2.61 (2) 0† (10) 1.4† (4) 2.91‡ (1) 1.54† (3) 0† (10) 1.2† (5) 0† (10) 0† (10) 0† (10)

CF7 1 0.5 1.95 (1) 0† (8) 1† (4) 1.41† (2) 1.07† (3) 0† (8) 0.64† (5) 0† (8) 0† (8) 0† (8)

CF8 1 0.5 1.97 (1) 0† (8) 1.42† (4) 1.59† (2) 1.33† (5) 0† (8) 1.48† (3) 0† (8) 0† (8) 0† (8)

CF9 1 0.5 2.68 (1) 0† (8.5) 1.93† (2) 1.8† (3) 1.42† (5) 0† (8.5) 1.67† (4) 0† (8.5) 0† (8.5) 0.28† (6)

CF10 1 0.5 1.12 (2.5) 0† (8) 1.12 (2.5) 1.26‡ (1) 1† (5) 0† (8) 1.09† (4) 0† (8) 0† (8) 0† (8)

CF11 1 0.5 2.94 (1) 0† (7.5) 1.3† (2) 0.68† (3) 0.32† (4) 0† (7.5) 0† (7.5) 0† (7.5) 0† (7.5) 0† (7.5)

CF12 1 0.5 2.34 (1) 0† (8) 1.64† (3) 1.70† (2) 1.25† (4) 0† (8) 0† (8) 0† (8) 0† (8) 0† (8)

CF13 1 0.5 2.28 (1) 0†(8) 0.88†(4) 1.39† (3) 1.61† (2) 0†(8) 0.35† (5) 0†(8) 0† (8) 0† (8)

CF14 1 0.5 1 (2) 0† (7) 1 (2) 0† (7) 1 (2) 0† (7) 0† (7) 0† (7) 0† (7) 0† (7)

CF15 1 0.5 3.4 (1) 0† (7.5) 1.6† (4) 2† (2.5) 2† (2.5) 0† (7.5) 0† (7.5) 0† (7.5) 0† (7.5) 0† (7.5)Total Rank 18.5 114.5 56 48 54.5 124 72.5 134 134 106

Best results are marked in boldface.

rates of different algorithms. The mean and standard deviation of the number of FEs taken by each algorithm to find all the global peaks of a function are listed in Table VII. In each cell of this table, the average and standard deviation are computed only over the successful runs out of the total 50 runs. Since the 15 composition functions (CF1 to CF15) have far more complicated fitness landscapes than the basic functions, only MOBiDE achieved a nonzero success rate on them, and only on some of them. For MOBiDE, the average success rates were 6%, 2%, 4%, and 3.5% on CF3, CF4, CF5, and CF15 (in 10D), respectively. For this reason, results for the composite functions are not shown in Tables VI and VII. In Table VII, we also excluded function f14, since no algorithm could detect all the global optima of this function within the allowed number of FEs in any run. To complement the results of Table VII, Table VIII reports the average CPU time (in seconds) and the standard deviation for each algorithm on test function sets 1 and 2. The CPU time is recorded after each run of each algorithm has continued until the maximum allowed number of FEs.
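As a minimal sketch of how the entries of Tables VI and VII could be tabulated from raw run logs, the following assumes each run is recorded as a hypothetical `(success, fes)` pair. The function name, the record layout, and the use of the sample standard deviation are illustrative assumptions, not details given in the paper.

```python
from statistics import mean, stdev


def summarize_runs(runs):
    """Return (success rate in %, mean FEs, std of FEs).

    `runs` is a list of (success, fes) pairs (a hypothetical layout).
    The mean and standard deviation use successful runs only.
    """
    successful = [fes for ok, fes in runs if ok]
    rate = 100.0 * len(successful) / len(runs)
    if not successful:
        # Corresponds to a dash entry: no run found all global peaks.
        return rate, None, None
    spread = stdev(successful) if len(successful) > 1 else 0.0
    return rate, mean(successful), spread
```

A function for which every run fails yields a success rate of 0 and no FE statistics, which matches the way such cells are left empty in Table VII.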

B. Statistical Tests

In order to determine the statistical significance of the performance of MOBiDE over the other algorithms, we first conduct a nonparametric statistical test, Wilcoxon's rank sum test [62], [63], on the average number of peaks found by the different competitor algorithms at the 5% significance level. In Tables IV and V, the mark † indicates that MOBiDE performs statistically better than the corresponding algorithm as the


TABLE V(b):

Average Number of Global Peaks Found for Composite Functions CF1 to CF15 in 30 Dimensions

Func. Niche Rad. ε MOBiDE CDE SDE S-CMA CMA SPSO FER-PSO r2pso r3pso Niching-Based NSGA-II

CF1 1 1 2 (1) 0† (7.5) 1.25† (2) 0.28† (4) 0.32† (3) 0† (7.5) 0† (7.5) 0† (7.5) 0† (7.5) 0† (7.5)

CF2 1 1 1.5 (1) 1† (5.5) 1† (5.5) 1.2† (4) 1.25† (3) 0† (8.5) 1.34† (2) 0† (8.5) 0† (8.5) 0† (8.5)

CF3 1 1 3.34 (1) 0† (8) 0.88† (2) 0.6† (3) 0.32† (5) 0† (8) 0.35† (4) 0† (8) 0† (8) 0† (8)

CF4 1 1 1 (1) 0† (6) 0† (6) 0† (6) 0† (6) 0† (6) 0† (6) 0† (6) 0† (6) 0† (6)

CF5 1 1 1 (1) 0† (6) 0† (6) 0† (6) 0† (6) 0† (6) 0† (6) 0† (6) 0† (6) 0† (6)

CF6 1 1 1 (1.5) 0† (7) 1 (1.5) 1 (1.5) 1 (1.5) 0† (7) 0†(7) 0† (7) 0† (7) 0† (7)

CF7 1 1 1.5 (1) 0† (6) 1† (3) 1†(3) 0†(6) 0† (6) 0† (6) 0† (6) 0† (6) 0† (6)

CF8 1 1 1 (1.5) 0† (7.5) 1 (1.5) 0.54† (3) 0.46† (4) 0† (7.5) 0 (7.5) 0† (7.5) 0† (7.5) 0† (7.5)

CF9 1 1 1.2 (1) 0† (6.5) 0† (6.5) 0†(6.5) 0.6†(2) 0† (6.5) 0† (6.5) 0† (6.5) 0† (6.5) 0† (6.5)

CF10 1 1 1 (1.5) 0† (7) 0† (7) 1 (1.5) 0† (7) 0† (7) 0.4† (3) 0† (7) 0† (7) 0† (7)

CF11 1 1 1.68 (1) 0† (6.5) 0† (6.5) 0† (6.5) 0.24† (2) 0† (6.5) 0† (6.5) 0† (6.5) 0† (6.5) 0† (6.5)

CF12 1 1 1.56 (1) 0† (7.5) 1† (3) 1† (3) 1† (3) 0† (7.5) 0† (7.5) 0† (7.5) 0† (7.5) 0† (7.5)

CF13 1 1 0.96 (1) 0†(6.5) 0.2†(2) 0† (6.5) 0†(6.5) 0†(6.5) 0† (6.5) 0†(6.5) 0† (6.5) 0† (6.5)

CF14 1 1 0.72† (2) 0† (6.5) 0†(6.5) 0† (6.5) 1 (1) 0† (6.5) 0† (6.5) 0† (6.5) 0† (6.5) 0† (6.5)

CF15 1 1 1.6 (1) 0† (7.5) 1† (3) 1† (3) 1† (3) 0† (7.5) 0† (7.5) 0† (7.5) 0† (7.5) 0† (7.5)

Total Rank 17.5 101.5 62 63 59 104.5 90 104.5 104.5 104.5

Best results are marked in boldface.
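The ranks shown in parentheses in Table V(b) are consistent with average ranking, in which tied algorithms share the mean of the positions they occupy. The sketch below illustrates that reading on the CF1 row; the function name is hypothetical, and the paper does not spell out its ranking routine.

```python
def average_ranks(scores):
    """Rank scores so the largest gets rank 1; ties share the mean rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        # Extend `end` over the block of entries tied with order[pos].
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        shared = (pos + 1 + end + 1) / 2.0  # mean of 1-based positions
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


# Average number of peaks for CF1 (30D), in the column order of Table V(b).
cf1_30d = [2, 0, 1.25, 0.28, 0.32, 0, 0, 0, 0, 0]
cf1_ranks = average_ranks(cf1_30d)
```

Here the six algorithms that found no peak occupy positions 5 to 10 and therefore each receive rank 7.5, as in the table. Summing such ranks over all functions gives the Total Rank row.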

TABLE VI

Success Rates in Percentage

Function MOBiDE CDE SDE S-CMA CMA SPSO FER-PSO r2pso r3pso BMPGA Niching-Based NSGA-II

f1 100 100 100 100 100 44 82 76 84 100 100

f2 100 100 100 100 100 40 100 88 96 100 100

f3 100 100 96 92 96 36 96 8 8 100 86

f4 100 28 72 12 20 88 84 92 88 100 87

f5 100 72 100 100 100 100 100 100 100 100 100

f6 100 28 60 88 84 92 94 88 72 84 76

f7 100 60 100 100 100 100 100 100 100 56 100

f8 100 0 72 72 67 0 72 28 24 60 87

f9 100 0 100 60 100 0 96 56 60 100 92

f10 100 52 32 0 12 56 100 88 76 84 92

f11 88 18 30 22 25 40 52 54 60 51 36

f12 100 56 48 60 55 72 60 68 56 54 67

f13 92 34 0 0 0 0 0 0 0 2 5

Best results are marked in boldface.

p-values obtained with the rank sum test are less than 0.05. On the other hand, the mark ‡ indicates that the corresponding algorithm is better than MOBiDE. If an entry (not corresponding to MOBiDE) has no such marking, the result of the corresponding algorithm does not differ statistically significantly from that of MOBiDE.
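The marking rule above can be sketched in a few lines. The implementation below is an assumption-laden illustration: it uses the large-sample normal approximation of the rank sum statistic without a tie correction, and the function names and the sample data layout are hypothetical rather than taken from the paper.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test (normal approximation).

    Returns (z, p); z > 0 means the values in x tend to be larger.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n1, n2 = len(x), len(y)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        shared = (i + 1 + j + 1) / 2.0  # mean rank of the tied block
        rank_sum_x += shared * sum(1 for k in range(i, j + 1) if pooled[k][1] == 0)
        i = j + 1
    expected = n1 * (n1 + n2 + 1) / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (rank_sum_x - expected) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


def significance_mark(mobide_peaks, other_peaks, alpha=0.05):
    """Return the dagger, the double dagger, or an empty string."""
    z, p = rank_sum_test(mobide_peaks, other_peaks)
    if p >= alpha:
        return ""  # no statistically significant difference
    return "†" if z > 0 else "‡"
```

Each argument would hold the per-run peak counts of one algorithm on one function, so the mark is decided separately for every table cell.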

To undertake a multiple-function analysis of the performance of MOBiDE and the other contender algorithms, we use additional nonparametric statistical tests, which are usually less restrictive than parametric ones (e.g., repeated-measures ANOVA) and can be used over small samples of results [64]. A performance comparison can be initiated once the algorithms are ranked according to their relative performance. Under the null hypothesis, the k algorithms are of equivalent performance. If the null hypothesis is rejected, then at least one of the k algorithms performed differently from at least one other algorithm. However, this does not indicate which one. To get such information, a post hoc test is used. For testing the statistical hypothesis, we used the Iman–Davenport test [63], [65], based on the following statistic:

$$F_F = \frac{(N-1)\,\chi_F^2}{N(k-1) - \chi_F^2}$$

where k is the number of compared algorithms, N is the number of benchmark functions, and $\chi_F^2$ is Friedman's statistic. For a chosen significance level α (usually 0.05), the null hypothesis is rejected if the p-value is less than α, where the p-value is determined according to the $F_F$ statistic. For the post hoc test, we first use the Bonferroni–Dunn method, according to which the performance of two algorithms is significantly different if their average ranks differ by at least the critical distance (CD)

$$CD = Q_\alpha \sqrt{\frac{k(k+1)}{6N}}$$

where $Q_\alpha$ is the critical value for a multiple nonparametric comparison with a control (see [66, Table B.16]). In


TABLE VII

Mean and Standard Deviation of the Number of FEs Consumed Over the Successful Runs

Func. MOBiDE CDE SDE S-CMA CMA SPSO FER-PSO r2pso r3pso BMPGA Niching-based NSGA-II

f1 112 (17.89) 5.40e+03 (3.02e+02) 5.45e+03 (1.79e+02) 8.78e+03 (3.67e+01) 7.01e+02 (2.34e+01) 8.12e+03 (4.46e+01) 6.36e+02 (4.92e+01) 3.01e+03 (1.19e+01) 2.77e+03 (1.78e+02) 8.73e+03 (2.71e+01) 7.32e+03 (5.82e+01)

f2 142 (23.94) 5.92e+03 (1.25e+02) 4.98e+03 (1.02e+02) 9.45e+03 (2.96e+02) 5.41e+03 (1.03e+03) 7.37e+02 (5.93e+02) 1.96e+02 (2.62e+01) 2.78e+03 (1.32e+02) 4.49e+03 (2.19e+02) 5.08e+03 (2.77e+02) 4.68e+03 (4.08e+02)

f3 400 (251.78) 4.13e+02 (8.16e+01) 7.89e+02 (1.45e+01) 7.89e+02 (5.84e+01) 6.44e+02 (3.09e+01) 5.89e+02 (5.19e+02) 5.70e+02 (1.67e+02) 8.16e+02 (1.47e+02) 7.83e+02 (2.07e+01) 7.84e+02 (1.13e+02) 8.74e+02 (1.82e+01)

f4 326 (124.42) 2.44e+02 (7.22e+01) 4.40e+02 (2.69e+01) 6.49e+02 (3.93e+01) 1.28e+02 (2.86e+01) 3.62e+02 (3.80e+01) 3.79e+02 (2.67e+01) 4.66e+02 (6.89e+01) 5.18e+02 (4.51e+01) 7.63e+02 (3.78e+01) 6.78e+02 (2.56e+01)

f5 122.8 (4.25) 1.86e+02 (5.12e+01) 3.04e+02 (9.78e+01) 2.87e+02 (2.48e+00) 127 (6.48) 1.28e+02 (8.56e+00) 1.67e+02 (1.05e+01) 1.87e+02 (9.62e+00) 1.37e+02 (1.41e+01) 1.58e+02 (7.83e+00) 2.63e+02 (4.62e+00)

f6 285 (96.07) 3.14e+03 (1.28e+03) 1.42e+03 (1.19e+03) 8.76e+02 (4.89e+01) 1.05e+03 (2.43e+02) 3.35e+02 (2.04e+01) 3.69e+02 (3.59e+01) 2.91e+03 (1.84e+03) 3.49e+03 (1.77e+03) 2.75e+03 (1.67e+02) 5.26e+03 (1.25e+02)

f7 136 (28.41) 2.71e+03 (1.14e+03) 3.58e+03 (2.57e+03) 2.46e+03 (1.73e+02) 1.88e+03 (1.12e+02) 1.57e+02 (1.02e+01) 1.89e+02 (1.57e+01) 1.86e+02 (2.48e+01) 1.79e+02 (3.45e+01) 3.85e+02 (9.42e+01) 2.76e+02 (5.78e+01)

f8 1370 (182) – 5.29e+03 (5.17e+03) 7.08e+03 (6.72e+02) 9.24e+03 (1.94e+02) – 4.98e+03 (1.76e+03) 8.76e+03 (2.54e+03) 3.34e+03 (4.98e+03) 2.84e+03 (1.66e+02) 6.53e+03 (1.22e+02)

f9 522 (75.41) – 1.24e+03 (8.14e+02) 2.15e+03 (1.07e+02) 3.35e+03 (1.13e+02) – 8.97e+02 (4.94e+01) 7.64e+02 (3.41e+01) 7.28e+02 (2.19e+01) 7.21e+02 (8.33e+01) 6.42e+02 (2.05e+01)

f10 1470 (490.3) 2.45e+03 (7.67e+02) 7.56e+03 (1.67e+03) – 1.82e+03 (2.98e+02) 3.77e+03 (1.73e+03) 3.09e+03 (3.47e+02) 5.98e+03 (7.24e+02) 4.67e+03 (5.69e+02) 4.76e+03 (1.88e+03) 5.21e+03 (1.84e+03)

f11 5.23e+04 (1.09e+04) 7.00e+04 (2.31e+04) 7.63e+04 (2.32e+03) 8.17e+04 (6.44e+03) 9.14e+04 (3.27e+03) 6.31e+04 (3.97e+03) 9.01e+04 (1.51e+03) 6.89e+04 (2.98e+03) 5.71e+04 (2.17e+03) 6.92e+04 (2.84e+03) 6.18e+04 (1.12e+04)

f12 1656 (678.82) 1.09e+04 (4.19e+03) 7.85e+04 (5.71e+03) 1.74e+04 (3.39e+03) 1.71e+04 (1.38e+03) 1.38e+04 (4.63e+03) 1.25e+04 (4.87e+03) 9.48e+03 (3.78e+03) 1.89e+04 (5.89e+03) 1.78e+04 (1.62e+03) 9.45e+03 (2.76e+02)

f13 6.85e+04 (1411.12) 1.67e+05 (2.14e+04) – – – – – – – 1.52e+05 (2.78e+04) 1.72e+05 (2.32e+03)

A dash (–) indicates that no run of the algorithm could successfully detect all the global optima of the corresponding function. Best results are marked in boldface.

Tables IXa and IXb, we present the algorithm rankings with the p-values and CD for α = 0.05 on test function set 1 and set 2 (in 10 and 30 dimensions), respectively. Comparison of the algorithms on the basis of the Bonferroni–Dunn test is shown in Fig. 7(a) and (b), respectively, for these two sets. The algorithms whose average-rank columns rise above the CD line are significantly worse than the control algorithm MOBiDE.
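The Iman–Davenport statistic and the critical distance can be evaluated with a short sketch. Two ingredients below are not given in this section and are labeled as assumptions: the Friedman statistic is written in its standard textbook form based on average ranks, and the value 2.773 is the commonly tabulated Bonferroni–Dunn critical value for ten algorithms at α = 0.05 (the paper only points to [66, Table B.16]). Function names are hypothetical.

```python
import math


def friedman_chi2(avg_ranks, n_functions):
    """Friedman statistic from average ranks (standard form, assumed)."""
    k = len(avg_ranks)
    sum_sq = sum(r * r for r in avg_ranks)
    return 12.0 * n_functions / (k * (k + 1)) * (sum_sq - k * (k + 1) ** 2 / 4.0)


def iman_davenport(chi2, n_functions, k):
    """F_F = (N - 1) * chi2 / (N * (k - 1) - chi2)."""
    return (n_functions - 1) * chi2 / (n_functions * (k - 1) - chi2)


def critical_distance(q_alpha, k, n_functions):
    """CD = Q_alpha * sqrt(k * (k + 1) / (6 * N))."""
    return q_alpha * math.sqrt(k * (k + 1) / (6.0 * n_functions))


# Test function set 2: k = 10 algorithms, N = 30 instances (15 CFs in 10D and 30D).
# 2.773 is an assumed table value, not a number quoted in the paper.
cd_set2 = critical_distance(2.773, k=10, n_functions=30)
```

Under these assumptions `cd_set2` evaluates to roughly 2.168, which is close to the CD = 2.1675 reported for set 2 in Fig. 7(b).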

In order to undertake multiple pairwise comparisons, we report the adjusted p-values obtained by using the following methods [65]: Nemenyi's test, Holm's procedure, Shaffer's static procedure, and Bergmann–Hommel's dynamic procedure. Table X indicates whether each hypothesis is retained or rejected by comparing its adjusted p-value with the chosen α = 0.01. For this comparison, we consider 10 algorithms, excluding BMPGA (since it could not be run on test function set 2), and a total of 44 benchmark instances from test function sets 1 and 2. In set 2, we have 15 composite functions, each in 10 and 30 dimensions. If a p-value is less than α, then the corresponding hypothesis is considered rejected. We note that hypotheses 1–9 are all rejected by the four procedures, indicating that the overall performance of MOBiDE is significantly better than that of the nine contender algorithms.
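Of the four procedures, Holm's is the simplest to illustrate. The sketch below shows the standard step-down Holm adjustment; it is not the authors' code, the function name is hypothetical, and the input p-values in the usage line are made up for illustration rather than taken from Table X.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for step, idx in enumerate(order):
        # The i-th smallest p-value is multiplied by (m - i + 1), capped at 1.
        candidate = min(1.0, (m - step) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical unadjusted p-values for three pairwise hypotheses.
adjusted = holm_adjust([0.01, 0.04, 0.03])
rejected = [p < 0.05 for p in adjusted]
```

A hypothesis is rejected when its adjusted p-value falls below the chosen α, which is the comparison described for Table X.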

C. Ability to Locate Local Optima

In order to test the ability to locate local optima, five test functions (f1–f3, f5, and f10) are used. The results, in terms of the average number of optima found and the percentage success rates, are shown in Tables XI and XII. For MOBiDE, the final solutions are collected both from the archive and the final population. It can be observed that, besides the global optima, the local optima are also retained by the final population of the algorithm to a good extent for the test cases considered. Note that the population size and maximum number of FEs are set to 500 and 100 000 for function f10. The settings of the remaining parameters are kept unchanged.

D. Discussion on the Comparative Study

This section presents a brief discussion of the performance of the various algorithms that participated in the comparative study. A close inspection of Tables IV and V reveals that, according to the Wilcoxon rank sum test results, one or more algorithms yielded results statistically comparable to those of MOBiDE on the easier, low-dimensional functions f1 to f10, whereas on functions f11 to f14 (which are more complex than the first ten functions in set 1) MOBiDE provided statistically better performance than all the other algorithms compared.

We also note that on the more challenging composition functions from set 2, while the performance of the other algorithms degraded considerably, MOBiDE retained its statistically superior performance compared to the majority of the contender algorithms. According to the Bonferroni–Dunn test results, only the overall performance of niching-based NSGA-II was not statistically worse than that of MOBiDE on test function set 1. However, on test function set 2, niching-based NSGA-II performed poorly, and the overall performance of MOBiDE remained statistically better than that of the nine state-of-the-art multimodal optimizers, as indicated in Fig. 7(b).


TABLE VIII

Average CPU Time Taken (in Seconds) and Corresponding Standard Deviations (in Parentheses)

Test Function Set 1

Func. MOBiDE CDE SDE S-CMA CMA SPSO FER-PSO r2pso r3pso BMPGA Niching-based NSGA-II

f1 1.56 (0.25) 1.04 (0.15) 1.34 (0.6) 2.72 (0.28) 1.86 (0.72) 1.38 (0.26) 2.64 (0.14) 0.88 (0.21) 0.94 (0.16) 1.52 (0.27) 1.86 (0.33)

f2 1.46 (0.28) 1.16 (0.26) 1.52 (0.38) 2.28 (0.36) 2.15 (0.17) 1.32 (0.42) 2.70 (0.42) 0.92 (0.14) 0.94 (0.26) 1.62 (0.49) 2.13 (0.27)

f3 2.82 (0.14) 1.08 (0.37) 1.68 (0.25) 3.14 (0.83) 2.92 (0.46) 1.52 (0.47) 3.04 (0.82) 1.14 (0.27) 1.22 (0.31) 2.84 (0.28) 3.02 (0.17)

f4 2.64 (0.22) 1.32 (0.13) 1.64 (0.29) 2.98 (0.37) 2.74 (0.25) 2.46 (0.16) 3.52 (0.18) 1.24 (0.27) 1.12 (0.27) 3.62 (0.16) 1.84 (0.17)

f5 1.54 (0.26) 0.92 (0.21) 1.28 (0.28) 1.62 (0.51) 1.84 (0.36) 1.36 (0.62) 2.08 (0.23) 0.86 (0.17) 0.84 (0.13) 1.52 (0.16) 1.43 (0.26)

f6 2.04 (0.16) 1.24 (0.11) 1.54 (0.20) 1.84 (0.26) 1.93 (0.35) 1.78 (0.26) 2.14 (0.26) 1.12 (0.52) 1.08 (0.42) 1.88 (0.25) 2.18 (0.32)

f7 1.68 (0.14) 1.18 (0.26) 1.48 (0.35) 1.76 (0.42) 1.84 (0.30) 1.68 (0.39) 2.02 (0.18) 114 (0.24) 0.94 (0.08) 1.84 (0.62) 1.66 (0.37)

f8 18.34 (1.26) 6.26 (1.25) 7.18 (1.03) 12.03 (2.17) 10.12 (3.29) 12.28 (2.05) 14.26 (2.18) 6.90 (1.28) 6.56 (2.33) 10.42 (1.28) 13.28 (3.25)

f9 15.28 (3.29) 8.34 (1.76) 9.24 (2.19) 13.82 (1.77) 12.88 (4.92) 14.60 (2.76) 13.62 (5.27) 9.74 (3.29) 8.57 (3.42) 15.22 (1.86) 13.84 (2.59)

f10 41.24 (5.22) 20.82 (3.72) 24.14 (4.14) 32.58 (3.95) 28.48 (1.62) 27.34 (5.93) 30.24 (2.93) 22.39 (1.42) 18.38 (3.29) 35.92 (8.26) 38.27 (4.27)

f11 46.28 (2.83) 29.44 (1.47) 28.36 (7.39) 35.28 (4.74) 32.74 (5.03) 33.08 (10.24) 36.28 (12.16) 26.38 (1.82) 24.74 (7.93) 37.84 (4.28) 38.58 (10.26)

f12 36.49 (5.93) 23.84 (4.82) 26.84 (6.28) 30.94 (8.36) 24.84 (6.83) 22.88 (4.61) 18.58 (3.67) 12.62 (3.74) 13.26 (2.58) 27.46 (1.83) 29.72 (5.51)

f13 51.74 (6.83) 30.86 (4.06) 36.72 (5.73) 39.62 (5.47) 38.40 (7.82) 32.48 (6.32) 41.76 (13.25) 30.64 (12.17) 28.24 (9.53) 41.74 (3.36) 40.29 (4.72)

f14 72.36 (2.91) 35.86 (9.38) 39.72 (2.64) 54.36 (9.44) 52.74 (8.48) 46.72 (10.24) 57.62 (14.82) 33.72 (2.74) 30.32 (2.94) 59.64 (14.36) 58.24 (13.82)

Test Function Set 2 (in 10D)

Func. MOBiDE CDE SDE S-CMA CMA SPSO FER-PSO r2pso r3pso Niching-based NSGA-II

CF1 2153.46 (213.88) 1544.32 (10.37) 1590.62 (62.64) 1805.28 (30.27) 1739.52 (108.36) 1677.48 (25.39) 1798.34 (20.41) 1286.28 (9.26) 1270.84 (11.29) 2108.36 (173.57)

CF2 2573.58 (206.41) 1964.36 (20.47) 2067.68 (35.08) 2561.58 (26.48) 2274.39 (142.53) 2207.47 (132.63) 2315.62 (44.82) 1835.48 (163.29) 1937.50 (29.47) 2607.36 (173.57)

CF3 2484.00 (27.18) 1867.84 (21.85) 2109.74 (65.57) 2218.46 (17.42) 1973.42 (223.44) 2217.48 (21.46) 2374.32 (367.27) 2095.36 (253.74) 1920.08 (121.26) 2416.27 (162.43)

CF4 2731.84 (214.99) 2410.54 (314.62) 2205.62 (153.82) 2351.47 (109.37) 2017.48 (172.84) 2338.48 (62.83) 2563.92 (21.59) 2292.42 (31.53) 2274.30 (204.17) 2704.38 (412.62)

CF5 2671.28 (281.22) 2473.27 (189.27) 2316.34 (201.36) 2561.80 (63.69) 2583.00 (92.36) 2415.20 (173.29) 2579.28 (13.28) 2271.37 (121.76) 2310.46 (212.44) 2716.28 (63.27)

CF6 2482.74 (79.36) 2182.42 (42.47) 2234.48 (43.27) 2371.36 (21.58) 2365.26 (36) 2198.36 (151.21) 2319.37 (21.11) 2019.37 (121.54) 2104.70 (48.31) 2421.72 (32.62)

CF7 3014.28 (27.49) 2457.72 (167.83) 2562.80 (42.36) 2831.57 (32.54) 2959.23 (153.28) 2217.22 (75.37) 2727.32 (53.37) 2415.74 (121.82) 2536.52 (422.47) 2918.48 (142.47)

CF8 2812.34 (62.46) 2615.74 (53.28) 2627 (78.37) 2804.64 (53.28) 2718.36 (224.93) 2516.88 (32.64) 2724.16 (56.32) 2561.32 (62.47) 2497.30 (143) 2795.42 (73.24)

CF9 3132.56 (47.29) 2547.28 (73.27) 2617.48 (302.42) 2531.74 (134.28) 2251.38 (52.31) 2416.28 (61.46) 2918.47 (162.46) 2416.26 (252.32) 2415.36 (72.57) 3102.42 (17.43)

CF10 2858.28 (12.57) 2517.46 (102.57) 2435.20 (47.26) 2554.62 (84.29) 2674.52 (62.08) 2429.48 (172.37) 2817.18 (73.52) 2536.14 (74.29) 2432.2 (152.84) 2918.42 (26.17)

CF11 3271.48 (27.42) 2819.46 (142.58) 3019.42 (261.47) 3182.63 (15.27) 3042.5 (43.52) 2823.46 (64.16) 3126.52 (71.52) 3001.28 (152.47) 2964.22 (56.49) 3209.36 (241.47)

CF12 3394.28 (64.39) 3126.72 (161.73) 3274.52 (62.58) 3302.52 (84.28) 3162.84 (27.18) 3028.36 (262.74) 3362.48 (261.39) 3218.26 (209.68) 3152.50 (61.42) 3327.20 (85)

CF13 3696.18 (281.46) 3328.62 (74.26) 3398.76 (241.25) 3502.6 (53.28) 3416.74 (516.38) 3525.82 (102.48) 3627.16 (352.18) 3062.17 (271.37) 3071.46 (83.20) 3594.38 (62.48)

CF14 3781.32 (182.32) 3321.00 (92.58) 3402.86 (64.38) 3521.67 (73.46) 3429.42 (27.84) 3553.28 (72.63) 3702.42 (72.32) 3287.24 (102.61) 3218.44 (26.47) 3604.22 (31.82)

CF15 4403.36 (727.12) 4102.48 (261.42) 4226.4 (54.29) 4328.58 (726.58) 4029.72 (52.41) 4163.02 (146.38) 4372.28 (63.32) 3292.46 (172.48) 3282.38 (82.17) 4297.48 (232.46)

Best entries are marked in boldface.


Fig. 7. (a) Overall algorithm comparison (on function set 1) with the Bonferroni–Dunn test with α = 0.05 and CD = 3.1730. (b) Overall algorithm comparison (on function set 2) with the Bonferroni–Dunn test with α = 0.05 and CD = 2.1675.

Table VII further indicates that in most of the test cases, a successful run of MOBiDE was able to detect the global optima while consuming the fewest FEs. Table VIII indicates that the CPU time consumed by MOBiDE remained comparable to that of both BMPGA and niching-based NSGA-II. Also, in some cases, MOBiDE's computational time was only marginally higher than, or comparable to, that of some expensive single-objective niching algorithms such as S-CMA and FER-PSO. Although CDE, SPSO, r2pso, and r3pso took smaller average CPU times per run in most cases, their overall accuracy levels across the benchmarks suggest that they are not ideal choices for offline niching purposes. On the other hand, we must be aware that the test functions, especially those in set 1, are not computationally expensive by themselves. One fitness evaluation of these functions usually takes less than $10^{-4}$ s. However, in many real-world problems, the computation of fitness can be very time consuming. In aerodynamic design optimization, for example, it takes over 10 h to perform one fitness evaluation when it

TABLE IX(a):

Overall Algorithm Ranks for Test Function Set 1

Algorithms Avg. Rank p-value

r2pso 7.1785 3.64e-32

CDE 7.0714 3.64e-32

S-CMA 7 3.64e-32

r3pso 6.6428 3.64e-32

SPSO 6.5357 3.64e-32

CMA 6.5357 3.64e-32

SDE 5.7857 3.64e-32

FER-PSO 5.75 3.64e-32

BMPGA 5.5 3.64e-32

Niching-based NSGA-II 5.3214 3.64e-32

MOBiDE 2.25 3.64e-32

TABLE IX(b):

Overall Algorithm Ranks for Test Function Set 2

Algorithms Avg. Rank p-value

r3pso 7.95 4.63e-29

r2pso 7.95 4.63e-29

SPSO 7.616 4.63e-29

CDE 7.2 4.63e-29

Niching-based NSGA-II 7.016 4.63e-29

FER-PSO 5.416 4.63e-29

SDE 3.93 4.63e-29

CMA 3.78 4.63e-29

S-CMA 3.7 4.63e-29

MOBiDE 1.2 4.63e-29

involves a 3-D computational fluid dynamics (CFD) simulation [67].

Due to MOBiDE's superior searching ability, it will be beneficial to use MOBiDE in circumstances where the computation of fitness takes most of the processing time. In such cases, MOBiDE is expected to yield an acceptably good level of accuracy with the fewest FEs, as indicated by the results of Table VII.

CDE and SPSO can generate acceptable results over simple and low-dimensional functions (test function set 1), but their fine-tuning abilities are limited. Consequently, their performance over difficult and high-dimensional problems (test function set 2) is poor. Although SDE is always able to find a certain number of global peaks and its fine-tuning ability over the found optima is good, it fails to find most of the local optima. SDE also shows very poor performance if the number of global peaks is high. Therefore, if the aim is to detect a few global peaks with high accuracy, SDE could be a good choice. Otherwise, SDE may not be able to generate satisfactory results. Compared to the algorithms proposed in the literature, FER-PSO is able to generate relatively satisfactory results over many test functions. However, its ability to locate local optima is very poor, as can be seen from Tables XI and XII. It cannot be satisfactorily used to locate both global and local optima. r2pso and r3pso perform well on simple problems, but their search ability deteriorates on difficult test functions. Their ability to locate local optima is also poor. These algorithms do not require any niching parameters, and this can be claimed as one of their advantages,


TABLE X

Adjusted p-Values for Pair-Wise Statistical Comparisons

No. Hypothesis Holm Shaffer Nemenyi Bergmann–Hommel

1. MOBiDE versus CMA 0.0046 0.0046 0.0073 0.0069

2. MOBiDE versus S-CMA 4.26 × 10^{-5} 3.36 × 10^{-5} 6.82 × 10^{-4} 1.29 × 10^{-4}

3. MOBiDE versus Niching-based NSGA-II 2.17 × 10^{-6} 5.26 × 10^{-6} 8.65 × 10^{-4} 3.22 × 10^{-6}

4. MOBiDE versus SDE 7.41 × 10^{-6} 8.46 × 10^{-6} 3.05 × 10^{-6} 7.25 × 10^{-6}

5. MOBiDE versus FER-PSO 2.38 × 10^{-7} 1.24 × 10^{-7} 9.03 × 10^{-6} 4.82 × 10^{-7}

6. MOBiDE versus CDE 2.72 × 10^{-9} 3.13 × 10^{-11} 8.45 × 10^{-9} 3.14 × 10^{-9}

7. MOBiDE versus SPSO 5.92 × 10^{-12} 2.81 × 10^{-12} 4.63 × 10^{-12} 2.59 × 10^{-12}

8. MOBiDE versus r2pso 7.37 × 10^{-14} 8.21 × 10^{-14} 9.38 × 10^{-13} 2.35 × 10^{-14}

9. MOBiDE versus r3pso 4.23 × 10^{-22} 4.17 × 10^{-22} 7.28 × 10^{-20} 7.24 × 10^{-22}

TABLE XI

Average of Optima Found in Locating Both Global and Local Peaks

Function MOBiDE CDE SDE S-CMA CMA SPSO FER-PSO r2pso r3pso BMPGA Niching-Based NSGA-II

f1 2 (3.5) 2 (3.5) 1.84† (7) 2 (3.5) 2 (3.5) 1.5† (8) 1.46† (10) 1.7† (9) 1.42† (11) 2 (3.5) 2 (3.5)

f2 2 (3.5) 2 (3.5) 1.68† (8) 2 (3.5) 2 (3.5) 1.64† (9) 1.92† (7) 1.38† (10) 1.26† (11) 2 (3.5) 2 (3.5)

f3 4.46 (1.5) 4.46 (1.5) 3.12† (5) 2† (7) 3.18† (4) 3.06† (6) 0.64† (8) 0.8† (9) 0.4† (11) 3.8†4 (4) 3.56† (3)

f5 4.96 (2) 4.22† (5) 1.44† (6) 0.52† (11) 0.64† (10) 5 (1) 1† (8) 1† (8) 1† (8) 4.64† (3) 4.56† (4)

f10 25 (2) 12.54† (7) 1.26† (10) 0.86† (11) 1.5† (9) 24.6† (4) 5.24† (8) 24.2 (6) 24.32 (5) 25 (2) 25 (2)

Total Rank 12.5 20.5 36 36 30 28 41 42 46 16 16

Corresponding ranks (in parentheses) and Wilcoxon’s rank sum test results are also indicated. Best results are marked in boldface.

TABLE XII

Success Rates (in %) for Locating Both Global and Local Peaks

Function MOBiDE CDE SDE S-CMA CMA SPSO FER-PSO r2pso r3pso BMPGA Niching-Based NSGA-II

f1 100 100 84 100 100 44 66 76 56 100 100

f2 100 100 80 100 100 40 87 54 34 100 100

f3 92 100 12 0 8 38 0 0 0 72 82

f5 94 45 4 0 0 100 0 0 0 80 86

f10 100 0 6 0 0 92 0 62 54 100 100

Best results are marked in boldface.

as the performance of other algorithms may vary according to the assigned values of the niching parameters. The S-CMA-ES and CMA-ES based niching algorithms are quite complicated to implement. Their performance on complex and high-dimensional problems is better than that of some of the well-known existing techniques. However, their poor fine-searching ability causes the algorithms to fail on simple problems when the required level of accuracy is high (e.g., we set it to $10^{-6}$ for the functions f4 to f7 from set 1).

BMPGA, despite being a biobjective multimodal optimizer, is not applicable to functions that are badly behaved and mostly nondifferentiable. Moreover, our results in Table IV indicate that BMPGA, in its current form, does not offer any distinct advantage (in terms of performance) over MOBiDE even on the simpler, differentiable set of functions. As its authors also suggested, a cleverer design of the second objective may extend the usability of BMPGA as well as improve its performance. Niching-based NSGA-II exhibited a good overall performance on test function set 1. However, its performance degraded considerably on test function set 2. One possible reason is that the Hooke–Jeeves neighbourhood count procedure used to construct the second objective did not work well on the rugged landscapes of the composition functions. In this context, despite being a biobjective procedure, MOBiDE is widely applicable and has been shown to maintain a consistent level of performance across a wide variety of benchmarks. Even for 30-dimensional composition functions, its performance scaled best among all the optimizers considered.

TABLE XIII

Average Number of Global Peaks Found Among Ten Independent Runs

Func. D Actual No. of Peaks MOBiDE MOBiDE-no-Archive MOBiDE-no-2nd-Obj

f4 1 5 5 2 4

f5 1 1 1 0.46 1

f6 1 5 5 2.6 3.7

f7 1 1 1 1 0.54

f8 2 4 4 3.1 2.8

f9 2 2 2 1.6 1.2

f10 2 1 1 1 1

f12 1 6 6 2.4 3.9

f13 2 36 35.40 12.7 8.2

CF1 10 8 2 1.4 1

CF2 10 6 2 1.2 0

CF3 10 6 4.80 3.46 0

CF4 10 6 3.4 2.4 1

CF5 10 6 3.95 0.6 0

Best results are marked in boldface.

E. Impact of External Archive and the Second Objective

In this section, we investigate the role of the external archive and the second objective in the search process of MOBiDE. For this purpose, we compare the performance of MOBiDE with two of its variants on the basis of the average number of global peaks found. These two variants are exactly the same as MOBiDE, except that the first one (MOBiDE-no-Archive) does not have any external archive and the second one (MOBiDE-no-2nd-Obj) does not use any second objective. To implement the latter, we set the second objective value to zero for all individuals throughout the evolution process. In order to save space, we selected 14 functions from test function sets 1 and 2. The results are presented in Table XIII.
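The effect of zeroing the second objective can be illustrated with a plain Pareto-dominance check. The sketch below assumes, purely for illustration, that both objectives are minimized and that individuals are represented as hypothetical `(f1, f2)` tuples; neither detail is stated in this section, and the helper names are not from the paper.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization assumed)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def without_second_objective(population):
    """Mimic the no-2nd-Obj variant: force the second objective to zero."""
    return [(f1, 0.0) for f1, _ in population]


# Two hypothetical individuals that trade off the two objectives.
population = [(1.0, 5.0), (2.0, 1.0)]
flattened = without_second_objective(population)
```

With both objectives active, neither individual dominates the other, so both survive nondominated sorting. Once the second objective is constant, dominance collapses to a comparison on the first objective alone, which is why that variant behaves like a single-objective search with an archive.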

We observe that MOBiDE-no-2nd-Obj, which uses the external archive, performs much better than MOBiDE-no-Archive on the problems having a smaller search volume (D = 1). In a small search space, the DE population usually tends to converge quickly, thus leaving some parts of the search space unexplored. In MOBiDE-no-2nd-Obj, however, the archive prevents multiple solutions from accumulating near one global optimum and thus causes more candidate solutions to be created in one generation. For problems with a relatively small search space (where global and local peaks are also close to each other), there is a finite probability that some of these new solutions will be closer to the peaks. On the other hand, MOBiDE-no-Archive suffers from the problem that individuals very close to one discovered peak (within a radius of δ) have a very good first objective value and thus tend to dominate other solutions nearer to the so far undiscovered peaks. Consequently, MOBiDE-no-2nd-Obj performs better than MOBiDE-no-Archive on these problems (f4, f6, f12).

TABLE XIV

Summary of the Game Problems From GAMBIT

Game No. of Players No. of Strategies Available to Each Player No. of NEs Corresponding GAMBIT File Name

1 4 2 3 2 × 2 × 2 × 2.nfg

2 4 2 5 g3.nfg

3 5 2 5 2 × 2 × 2 × 2 × 2.nfg

4 3 2 9 2 × 2 × 2.nfg

In the case of problems with a larger search space (D ≥ 2), however, a more systematic and thorough search is required to find the optimal solutions. In addition, the search converges much more slowly than in the previous case. Thus, MOBiDE-no-Archive, using the second objective, gets the opportunity to maintain the spread of the population for some generations before multiple solutions near one particular optimum start entering the population. MOBiDE-no-2nd-Obj, in contrast, loses population diversity relatively quickly and, as the optima are not close to each other, creating more solutions in one generation does not help much.

Hence, MOBiDE-no-Archive outperforms the other variant on problems such as f8, f9, f13, and CF1 to CF5. MOBiDE-no-Archive and MOBiDE-no-2nd-Obj thus each have their own advantages and disadvantages; when combined, however, the two features together boost the overall performance of the MOBiDE algorithm.

F. Application to Computation of Nash Equilibria for Multiplayer Games

Detection of equilibria in multiplayer, noncooperative, normal-form games is a computationally challenging task. A niching-based EA can be used for searching and detecting multiple Nash equilibria (NEs) of a multiplayer game [68], [69]. A finite n-person strategic game can be defined by a set $N = \{1, 2, \ldots, n\}$ of players, each of whom possesses a strategy set $S_i = \{s_{i1}, s_{i2}, \ldots, s_{im_i}\}$ consisting of $m_i$ pure strategies, and $S = S_1 \times \cdots \times S_n$ is the set of all possible situations in the game. Each player is endowed with a payoff or utility function $u_i : S \to \mathbb{R}$, and the function can be extended to have the domain $\mathbb{R}^m$ (with $m = \sum_{i=1}^{n} m_i$) by the following relation:

$$u_i(p) = \sum_{s \in S} p(s)\, u_i(s),$$

where $p(s) = p_1(s_1) \times \cdots \times p_n(s_n)$, and $p_i$ is a probability measure defined on $S_i$. To keep the notation simple, we write $u_i(t_i, s_{-i})$ for the utility of player i when he plays $t_i$ and the other players choose their strategies according to $s_{-i}$. In general, $p \in P$, where $P = P_1 \times \cdots \times P_n$ and $P_i$ is the set of all real-valued functions on $S_i$. A point $p^* \in P$ can be considered an NE of the game if $p^* \in \Delta$, where $\Delta = \Delta_1 \times \cdots \times \Delta_n$, with $\Delta_i = \{p_i \in P_i : \sum_j p_{ij} = 1,\ p_{ij} \ge 0\}$, and for all $i \in N$ and all $p_i \in \Delta_i$, the relation $u_i(p_i, p^*_{-i}) \le u_i(p^*)$ holds [69].

Computation of the NE can be reformulated as a problem of finding the global minima of a real-valued objective function. This function is composed of three other functions x, z, and g, all defined on P and having values in $\mathbb{R}^m$. The $ij$th value of


TABLE XV

Average Number of NEs Found for Each Game and the Corresponding Standard Deviations (in Parentheses)

Games MOBiDE CDE S-CMA FER-PSO r2pso r3pso Niching-Based NSGA-II BMPGA

1 3 (0) 2.84† (0.132) 3 (0) 2.12† (0.162) 1.88† (0.351) 2.14† (0.315) 3 (0) 2.96 (0.018)

2 5 (0) 4.36† (0.53) 4.68† (0.032) 1.84† (0.462) 3.88† (0.392) 4.12† (0.047) 4.52† (0.517) 3.62† (1.041)

3 4.96 (0.0012) 1.44† (0.735) 2.16† (0.791) 1.80† (0.726) 1.48† (0.569) 1.62† (0.814) 3.18† (0.634) 3.98† (0.783)

4 8.92 (0.472) 7.14† (0.902) 7.08† (1.134) 7.84† (1.092) 6.68† (1.623) 6.74† (1.258) 8.04† (0.873) 7.26† (0.869)

Best results are marked in boldface.

these functions for any $p \in P$, $i \in \{1, 2, \ldots, n\}$, and $s_{ij} \in S_i$ can be defined as follows:

$$x_{ij}(p) = u_i(s_{ij}, p_{-i}), \qquad z_{ij}(p) = x_{ij}(p) - u_i(p),$$

and

$$g_{ij}(p) = \max\bigl(z_{ij}(p),\, 0\bigr).$$

Then, an NE corresponds to a global minimum of the objective function

$$v(p) = \sum_{i \in N} \sum_{1 \le j \le m_i} g_{ij}^2(p). \tag{11}$$
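The objective in (11) can be evaluated directly from a payoff table. The following sketch is an illustration only: the dictionary-based payoff representation, the function names, and the two-player coordination game used as an example are assumptions and are not the GAMBIT games of Table XIV.

```python
from itertools import product


def expected_payoffs(payoff, profile):
    """Expected utility u_i(p) of every player under the mixed profile."""
    n = len(profile)
    totals = [0.0] * n
    for s in product(*(range(len(p_i)) for p_i in profile)):
        prob = 1.0
        for i, s_i in enumerate(s):
            prob *= profile[i][s_i]  # p(s) = p_1(s_1) * ... * p_n(s_n)
        for i in range(n):
            totals[i] += prob * payoff[s][i]
    return totals


def ne_objective(payoff, profile):
    """v(p): sum of squared positive gains from unilateral pure deviations."""
    u = expected_payoffs(payoff, profile)
    v = 0.0
    for i, p_i in enumerate(profile):
        for j in range(len(p_i)):
            pure = [1.0 if t == j else 0.0 for t in range(len(p_i))]
            deviated = profile[:i] + [pure] + profile[i + 1:]
            x_ij = expected_payoffs(payoff, deviated)[i]
            g_ij = max(x_ij - u[i], 0.0)  # z_ij clipped at zero
            v += g_ij ** 2
    return v


# Hypothetical 2x2 coordination game: both players earn 1 when they match.
coordination = {(0, 0): (1, 1), (0, 1): (0, 0), (1, 0): (0, 0), (1, 1): (1, 1)}
```

For this game, `ne_objective` is zero at the two pure equilibria and at the mixed equilibrium where both players randomize uniformly, and positive at a mismatched profile, so the three equilibria appear as three global minima of the same landscape. That is the multimodal structure a niching optimizer is meant to exploit.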

To demonstrate an application of the MOBiDE algorithm, we select four test games that have multiple NEs and are available with the software GAMBIT (ver. 0.2007.01.31) [70], which computes NEs by solving systems of polynomial equations. The games are summarized in Table XIV. Results obtained with MOBiDE are presented in Table XV and contrasted with the results obtained with seven other evolutionary multimodal optimizers. For all algorithms, the maximum number of FEs was kept at 30 000 and the population size for MOBiDE was set to 100. The results for all the algorithms are presented as the mean and standard deviation of the best-of-the-run number of NEs found within the prescribed budget of FEs and within a precision of $\varepsilon = 10^{-8}$ over 50 independent runs. Results of Wilcoxon's rank sum test are also indicated in Table XV in a manner similar to Tables IV and V.

We excluded CMA, SDE, and SPSO from this comparison, as the performance of these algorithms is sensitive to the value of the niching radius chosen, and much more experimentation with the fitness landscape of the NE detection problem is needed to investigate whether an acceptably good value of the niching radius can be found for a wide variety of games. Such experiments are beyond the scope of this paper and can be undertaken in the future. As evident from Table XV, game 1 is a relatively simple one. MOBiDE, S-CMA, and niching-based NSGA-II are the three algorithms that, like GAMBIT itself, could detect all three NEs in all runs for this game. For game 2, MOBiDE alone finds all NEs in all the runs, and for game 3 it comes closest, with an average of 4.96 out of 5 NEs. Game 4 is the most difficult one, and MOBiDE found the best average number of NEs among all the evolutionary multimodal optimizers considered. For this game, even GAMBIT could detect only seven NEs, as reported in [69].

VI. Conclusion

In this paper, we presented a biobjective approach based on the DE algorithm for solving multimodal optimization problems. Introducing a novel second objective to increase the population diversity, we integrated nondominated and hypervolume measure-based sorting techniques with DE to detect multiple global and local optima of a multimodal function landscape. Moreover, we added an external archive, which improved the performance of the algorithm by reducing the number of FEs necessary to find all the optimal solutions. We undertook a comparative study of the proposed MOBiDE algorithm with ten state-of-the-art evolutionary multimodal optimizers, among which two were recently developed biobjective approaches. The experimental study covered 29 distinct test functions, with the number of global peaks varying from 1 to 216 and the search space dimensionality ranging from 1 to 30. The algorithms were compared on the basis of success rates, average number of global and local peaks found, and the total number of FEs necessary to detect all the global peaks. Statistical significance of the results was judged with the nonparametric Wilcoxon’s rank sum test, Iman–Davenport statistic, Bonferroni–Dunn post hoc test, and four other tests for adjusting the p-values for multiple pair-wise comparisons. The experimental study indicated that, in most of the tested cases, the performance of MOBiDE remains statistically better than that of all the other algorithms compared.

Our future work will be directed toward testing the performance of the algorithm on much more massively multimodal problems with high dimensionality and constraints. Future research may also focus on devising the second objective more cleverly, so that the population diversity may be preserved without the need to calculate pair-wise Euclidean distances. Moreover, applications of this algorithm to real-life engineering optimization problems that necessitate the detection of multiple peaks in a single run can be undertaken. The proposed biobjective formulation can be integrated with other multiobjective optimizers such as NSGA-II and multiobjective PSO variants, and the performance over high-dimensional and complicated multimodal functions can be investigated.

References

[1] S. W. Mahfoud, “Niching methods for genetic algorithms,” Illinois Genetic Algorithms Lab., Univ. Illinois at Urbana-Champaign, Urbana, Tech. Rep. 95001, 1995.

[2] A. Della Cioppa, C. De Stefano, and A. Marcelli, “Where are the niches? Dynamic fitness sharing,” IEEE Trans. Evol. Comput., vol. 11, no. 4, pp. 453–465, Aug. 2007.

[3] S. Das, S. Maity, B.-Y. Qu, and P. N. Suganthan, “Real-parameter evolutionary multimodal optimization: A survey of the state-of-the-art,” Swarm Evol. Comput., vol. 1, no. 2, pp. 71–88, Jun. 2011.

[4] B. Sareni and L. Krahenbuhl, “Fitness sharing and niching methods revisited,” IEEE Trans. Evol. Comput., vol. 2, no. 3, pp. 97–106, Sep. 1998.

[5] G. Singh and K. Deb, “Comparison of multimodal optimization algorithms based on evolutionary algorithms,” in Proc. Genetic Evol. Comput. Conf., 2006, pp. 1305–1312.

[6] D. E. Goldberg and J. Richardson, “Genetic algorithms with sharing for multimodal function optimization,” in Proc. 2nd Int. Conf. Genetic Algorithms, 1987, pp. 41–49.

[7] R. Thomsen, “Multimodal optimization using crowding-based differential evolution,” in Proc. IEEE Congr. Evol. Comput., Jun. 2004, pp. 1382–1389.

[8] G. R. Harik, “Finding multimodal solutions using restricted tournament selection,” in Proc. 6th Int. Conf. Genetic Algorithms, 1995, pp. 24–31.

[9] A. Petrowski, “A clearing procedure as a niching method for genetic algorithms,” in Proc. 3rd IEEE Congr. Evol. Comput., May 1996, pp. 798–803.

[10] J. Horn, N. Nafpliotis, and D. E. Goldberg, “A niched Pareto genetic algorithm for multiobjective optimization,” in Proc. 1st IEEE Conf. Evol. Comput., Jun. 1994, pp. 82–87.

[11] I. Schoeman and A. P. Engelbrecht, “Niching for dynamic environments using particle swarm optimization,” in Proc. Simulated Evolution Learning, LNCS 4247, 2006, pp. 134–141.

[12] J. Yao, N. Kharma, and P. Grogono, “BMPGA: A bi-objective multi-population genetic algorithm for multimodal function optimization,” in Proc. IEEE Congr. Evol. Comput., Sep. 2005, pp. 816–823.

[13] J. Yao, N. Kharma, and P. Grogono, “Bi-objective multipopulation genetic algorithm for multimodal function optimization,” IEEE Trans. Evol. Comput., vol. 14, no. 1, pp. 80–102, Feb. 2010.

[14] K. Deb and A. Saha, “Multimodal optimization using a bi-objective evolutionary algorithm,” Indian Instit. Technol. Kanpur, Kanpur, India, KanGAL Rep. 2009006, Dec. 2009.

[15] K. Deb and A. Saha, “Finding multiple solutions for multi-modal optimization problems using a multiobjective evolutionary approach,” in Proc. 12th Annu. Conf. Genet. Evol. Comput., Jul. 2010, pp. 447–454.

[16] R. Storn and K. V. Price, “Differential evolution: A simple and efficient adaptive scheme for global optimization over continuous spaces,” ICSI, Tech. Rep. TR-95-012, 1995 [Online]. Available: http://http.icsi.berkeley.edu/~storn/litera.html

[17] K. Price, R. Storn, and J. Lampinen, Differential Evolution—A Practical Approach to Global Optimization. Berlin, Germany: Springer, 2005.

[18] S. Das and P. N. Suganthan, “Differential evolution: A survey of the state-of-the-art,” IEEE Trans. Evol. Comput., vol. 15, no. 1, pp. 4–31, Feb. 2011.

[19] C. A. C. Coello, G. B. Lamont, and D. A. V. Veldhuizen, Evolutionary Algorithms for Solving Multi-Objective Problems, 2nd ed. Berlin, Germany: Springer, Sep. 2007.

[20] X. Li, “Niching without niching parameters: Particle swarm optimization using a ring topology,” IEEE Trans. Evol. Comput., vol. 14, no. 1, pp. 150–169, Feb. 2010.

[21] B. Y. Qu and P. N. Suganthan, “Novel multimodal problems and differential evolution with ensemble of restricted tournament selection,” in Proc. IEEE Congr. Evol. Comput., Jul. 2010, pp. 1–7.

[22] K. A. De Jong, “An analysis of the behavior of a class of genetic adaptive systems,” Ph.D. dissertation, Univ. Michigan, Ann Arbor, 1975.

[23] S. Mahfoud, “Niching methods for genetic algorithms,” Ph.D. dissertation, Univ. Illinois, Urbana, IL, 1995.

[24] A. Pétrowski, “A clearing procedure as a niching method for genetic algorithms,” in Proc. IEEE Int. Conf. Evol. Comput., May 1996, pp. 798–803.

[25] J.-P. Li, M. E. Balazs, G. T. Parks, and P. J. Clarkson, “A species conserving genetic algorithm for multimodal function optimization,” Evol. Comput., vol. 10, no. 3, pp. 207–234, 2002.

[26] D. Beasley, D. R. Bull, and R. R. Martin, “A sequential niche technique for multimodal function optimization,” Evol. Comput., vol. 1, no. 2, pp. 101–125, 1993.

[27] M. Bessaou, A. Petrowski, and P. Siarry, “Island model cooperating with speciation for multimodal optimization,” in Proc. 6th Int. Conf. Parall. Prob. Solv. Nat., 2000, pp. 16–20.

[28] F. Streichert, G. Stein, H. Ulmer, and A. Zell, “A clustering based niching EA for multimodal search spaces,” in Artificial Evolution (Lecture Notes in Computer Science Series, vol. 2936). Berlin, Germany: Springer, 2004, pp. 293–304.

[29] C. Stoean, M. Preuss, R. Stoean, and D. Dumitrescu, “Multimodal optimization by means of a topological species conservation algorithm,” IEEE Trans. Evol. Comput., vol. 14, no. 6, pp. 842–864, Dec. 2010.

[30] K. Deb and D. E. Goldberg, “An investigation of niche and species formation in genetic function optimization,” in Proc. 3rd Int. Conf. Genetic Algorithms, 1989, pp. 42–50.

[31] K. Deb, “Genetic algorithms in multimodal function optimization, the Clearinghouse for genetic algorithms,” M.S. thesis, Univ. Alabama, Tuscaloosa, Rep. 89002, 1989.

[32] K.-C. Wong, K.-S. Leung, and M.-H. Wong, “An evolutionary algorithm with species-specific explosion for multimodal optimization,” in Proc. 11th Annu. Conf. Genetic Evol. Comput., 2009, pp. 923–930.

[33] C. Im, H. Kim, H. Jung, and K. Choi, “A novel algorithm for multimodal function optimization based on evolution strategy,” IEEE Trans. Magn., vol. 40, no. 2, pp. 1224–1227, Mar. 2004.

[34] O. M. Shir and T. Bäck, “Niching in evolution strategies,” in Proc. Conf. Genet. Evol. Comput., 2005, pp. 915–916.

[35] O. M. Shir and T. Bäck, “Niching with derandomized evolution strategies in artificial and real-world landscapes,” Natural Comput., vol. 8, no. 1, pp. 171–196, Mar. 2009.

[36] O. M. Shir, M. Emmerich, and T. Bäck, “Adaptive niche radii and niche shapes approaches for niching with the CMA-ES,” Evol. Comput., vol. 18, no. 1, pp. 97–126, 2010.

[37] K. Parsopoulos and M. Vrahatis, “Modification of the particle swarm optimizer for locating all the global minima,” in Artificial Neural Networks and Genetic Algorithms. Berlin, Germany: Springer, 2001, pp. 324–327.

[38] K. Parsopoulos and M. Vrahatis, “On the computation of all global minimizers through particle swarm optimization,” IEEE Trans. Evol. Comput., vol. 8, no. 3, pp. 211–224, Jun. 2004.

[39] A. E. R. Brits and F. van den Bergh, “A niching particle swarm optimizer,” in Proc. 4th Asia-Pacif. Conf. Simul. Evol. Learn., Feb. 2002, pp. 692–696.

[40] A. Nickabadi, M. M. Ebadzadhe, and R. Safabakhsh, “A dynamic niching particle swarm optimizer for multimodal optimization,” in Proc. IEEE Congr. Evol. Comput., Jun. 2008, pp. 26–32.

[41] S. Bird and X. Li, “Adaptively choosing niching parameters in a PSO,” in Proc. Genet. Evol. Comput. Conf., 2006, pp. 3–10.

[42] D. Parrott and X. Li, “A particle swarm model for tracking multiple peaks in a dynamic environment using speciation,” IEEE Trans. Evol. Comput., vol. 10, no. 4, pp. 440–458, Aug. 2006.

[43] A. P. Engelbrecht and L. N. H. Van Loggerenberg, “Enhancing the niche PSO,” in Proc. IEEE Congr. Evol. Comput., Sep. 2007, pp. 2297–2302.

[44] J. Barerra and C. A. C. Coello, “A review of PSO methods used for multimodal optimization,” in Innovations Swarm Intelligence, vol. 248, C. P. Lim, L. C. Jain, and S. Dehuri, Eds. Berlin, Germany: Springer, 2009, pp. 9–37.

[45] K. M. Woldemariam and G. G. Yen, “Vaccine-enhanced artificial immune system for multimodal function optimization,” IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 40, no. 1, pp. 218–228, Feb. 2010.

[46] D. Zaharie, “A multipopulation differential evolution algorithm for multimodal optimization,” in Proc. 10th MENDEL Int. Conf. Soft Comput., Jun. 2004, pp. 17–22.

[47] Z. Hendershot, “A differential evolution algorithm for automatically discovering multiple global optima in multidimensional discontinuous spaces,” in Proc. 15th Midwest Artif. Intell. Cognitive Sci. Conf., Apr. 2004, pp. 92–97.

[48] J. Ronkkonen and J. Lampinen, “On determining multiple global optima by differential evolution,” in Proc. Eurogen Evol. Deterministic Methods Design, Optimization, Control, Jun. 2007, pp. 146–151.

[49] K.-C. Wong, C.-H. Wu, R. K. P. Mok, C. Peng, and Z. Zhang, “Evolutionary multimodal optimization using the principle of locality,” Informat. Sci., vol. 194, pp. 138–170, Jul. 2012.

[50] B.-Y. Qu, P. N. Suganthan, and J. J. Liang, “Differential evolution with neighborhood mutation for multimodal optimization,” IEEE Trans. Evol. Comput., vol. 16, no. 5, pp. 601–614, Oct. 2012.

[51] S. Roy, Sk. M. Islam, S. Das, S. Ghosh, and A. V. Vasilakos, “A simulated weed colony system with sub-regional differential evolution for multimodal optimization,” in Eng. Optimization. Taylor and Francis, May 2012.

[52] N. Beume, B. Naujoks, and M. Emmerich, “SMS-EMOA: Multiobjective selection based on dominated hypervolume,” Eur. J. Operational Res., vol. 181, no. 3, pp. 1653–1669, 2007.

[53] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, “A fast and elitist multiobjective genetic algorithm: NSGA-II,” IEEE Trans. Evol. Comput., vol. 6, no. 2, pp. 182–197, Apr. 2002.

[54] E. Zitzler and L. Thiele, “Multiobjective optimization using evolutionary algorithms: A comparative study,” in Proc. 5th Parallel Problem Solving From Nature, 1998, pp. 292–301.

[55] C. A. C. Coello, D. A. Van Veldhuizen, and G. B. Lamont, Evolutionary Algorithms for Solving Multi-Objective Problems. New York: Kluwer Academic, 2002.

[56] M. Emmerich, N. Beume, and B. Naujoks, “An EMO algorithm using the hypervolume measure as selection criterion,” in Proc. Int. Conf. Evol. Multi-Criterion Optimization, vol. 3410, 2005, pp. 62–76.

[57] X. Li, “Efficient differential evolution using speciation for multimodal function optimization,” in Proc. Conf. Genetic Evol. Comput., 2005, pp. 873–880.

[58] X. Li, “Multimodal function optimization based on fitness-Euclidean distance ratio,” in Proc. Genet. Evol. Comput. Conf., 2007, pp. 78–85.

[59] X. Li, “Adaptively choosing neighborhood bests using species in a particle swarm optimizer for multimodal function optimization,” in Proc. Genet. Evol. Comput. Conf., LNCS 3102, 2004, pp. 105–116.

[60] B.-Y. Qu, P. N. Suganthan, and S. Das, “A distance-based locally informed particle swarm model for multimodal optimization,” IEEE Trans. Evol. Comput., Preprint (10.1109/TEVC.2012.2203138), to be published.

[61] J. Gan and K. Warwick, “A variable radius niching technique for speciation in genetic algorithms,” in Proc. Genetic Evol. Comput. Conf., 2000, pp. 96–103.

[62] F. Wilcoxon, “Individual comparisons by ranking methods,” Biometrics, vol. 1, pp. 80–83, 1945.

[63] J. Derrac, S. García, D. Molina, and F. Herrera, “A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms,” Swarm Evol. Comput., vol. 1, no. 1, pp. 3–18, Mar. 2011.

[64] J. Demsar, “Statistical comparisons of classifiers over multiple data sets,” J. Mach. Learning Res., vol. 7, pp. 1–30, 2006.

[65] S. García, A. Fernández, J. Luengo, and F. Herrera, “Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: Experimental analysis of power,” Informat. Sci., vol. 180, no. 10, pp. 2044–2064, 2010.

[66] J. H. Zar, Biostatistical Analysis. Englewood Cliffs, NJ: Prentice-Hall, 1999.

[67] Y. Jin, “A comprehensive survey of fitness approximation in evolutionary computation,” Soft Comput., vol. 9, pp. 3–12, 2005.

[68] M. J. Osborne and A. Rubinstein, A Course in Game Theory. Cambridge, MA: MIT Press, 1994.

[69] R. Lung and D. Dumitrescu, “An evolutionary model for solving multiplayer non cooperative games,” in Proc. Int. Conf. Knowledge Eng., Principles Techniques, Jun. 2007, pp. 209–216.

[70] R. D. McKelvey, A. M. McLennan, and T. L. Turocy. (2006). “Gambit: Software tools for game theory,” Tech. Rep., Version 0.2007.01.07 [Online]. Available: http://econweb.tamu.edu/gambit/

Aniruddha Basak received the B.E.Tel.E. degree from the Electronics and Telecommunication Engineering Department, Jadavpur University, Kolkata, India, in 2011. He is currently pursuing the Ph.D. degree in electrical and computer engineering with the Electrical and Computer Engineering Department, Carnegie Mellon University, Moffett Field, CA.

His current research interests include machine learning, optimization, and evolutionary algorithms.

Swagatam Das (M’10–SM’12) received the B.E.Tel.E., M.E.Tel.E. in control engineering, and Ph.D. degrees from Jadavpur University, Kolkata, India, in 2003, 2005, and 2009, respectively.

He is currently an Assistant Professor with the Electronics and Communication Sciences Unit, Indian Statistical Institute, Kolkata. He has published one research monograph, one edited volume, and more than 150 research articles in peer-reviewed journals and international conferences. His current research interests include evolutionary computing, pattern recognition, multiagent systems, and wireless communication.

Dr. Das is the founding Co-Editor-in-Chief of Swarm and Evolutionary Computation. He is an Associate Editor of the IEEE Transactions on Systems, Man, and Cybernetics Part A and Information Sciences (Elsevier). He is an Editorial Board Member of Progress in Artificial Intelligence (Springer), Mathematical Problems in Engineering, International Journal of Artificial Intelligence and Soft Computing, and International Journal of Adaptive and Autonomous Communication Systems. He is a Regular Reviewer for journals such as Pattern Recognition, the IEEE Transactions on Evolutionary Computation, the IEEE/ACM Transactions on Computational Biology and Bioinformatics, the IEEE Transactions on Systems, Man, Cybernetics A: Systems and Humans, the IEEE Transactions on Systems, Man, Cybernetics B: Cybernetics, and the IEEE Transactions on Systems, Man, Cybernetics C: Applications and Reviews. He has been associated with the International Program Committee and Organizing Committee of several regular international conferences, including IEEE CEC, IEEE SSCI, SEAL, GECCO, and SEMCCO. He has been a Guest Editor for special issues in journals such as the IEEE Transactions on Evolutionary Computation and the IEEE Transactions on Systems, Man, Cybernetics C, Applications. He was the recipient of the 2012 Young Engineer Award from the Indian National Academy of Engineering.

Kay Chen Tan (SM’08) is currently an Associate Professor with the Department of Electrical and Computer Engineering, National University of Singapore, Singapore. He has published more than 100 journal papers and more than 100 papers in conference proceedings, co-authored five books, and co-edited four books.

He has been an Invited Keynote/Plenary Speaker at more than 30 international conferences. He has served on international program committees for more than 100 conferences and has been involved in organizing committees for more than 40 international conferences, including as the General Co-Chair for the IEEE Congress on Evolutionary Computation 2007 in Singapore and the General Co-Chair for the IEEE Symposium on Computational Intelligence in Scheduling 2009 in Tennessee. He has been an IEEE Distinguished Lecturer of the IEEE Computational Intelligence Society since 2011. He is currently the Editor-in-Chief of the IEEE Computational Intelligence Magazine (CIM). He is also an Associate Editor or Editorial Board Member of over 15 international journals, such as the IEEE Transactions on Evolutionary Computation, the IEEE Transactions on Systems, Man and Cybernetics B: Cybernetics, the IEEE Transactions on Computational Intelligence and AI in Games, and Evolutionary Computation (MIT Press). He was the recipient of the 2012 IEEE Computational Intelligence Society Outstanding Early Career Award. He was the recipient of the Recognition Award from the International Network for Engineering Education and Research in 2008. He was the recipient of the NUS Outstanding Educator Award in 2004; the Engineering Educator Award in 2002, 2003, and 2005; the Annual Teaching Excellence Award in 2002, 2003, 2004, 2005, and 2006; and the Honour Roll Award in 2007.