
Search Space Reduction of Particle Swarm Optimization for Hierarchical Similarity Measurement Model

Arun Reungsinkonkarn
Department of Computer Information System
Assumption University
Huamak, Bangkok 10240, Thailand
[email protected]

Paskorn Apirukvorapinit
Department of Information Technology
Thai-Nichi Institute of Technology
[email protected]

Abstract—Particle Swarm Optimization (PSO) is an optimization technique applied in many areas, not limited to engineering and mathematics. It can discover the solution to the problem of finding input to a program based on the similarity of the program's execution. However, identifying such solutions with standard PSO is inefficient and, in some cases, not possible: there is a high probability that particles get stuck in an area of local maxima, chiefly because of excessive exploitation steps. In addition, when a new exploration starts, there is no guarantee that particles will not be generated in areas explored earlier. This paper presents a Search Space Reduction (SSR) algorithm applied to PSO for the Hierarchical Similarity Measurement (HSM) model of program execution. The algorithm uses a fitness function computed from the HSM model. SSR helps to find the solution by eliminating areas where the solution is unlikely to be found, improving the optimization process by reducing excessive exploitation. Moreover, SSR can be applied to all variants of PSO. The experimental results demonstrate that PSO with SSR is the most effective of the three techniques used in the experiment, increasing the effectiveness of finding a solution by 73%. For each program in the experiment, the SSR algorithm was able to find all solutions with the smallest number of exploitations. Regardless of the program's complexity, PSO with SSR usually completes the search faster than both versions of PSO without SSR.

Keywords—Particle Swarm Optimization (PSO), Optimization, Search Space Reduction, Hierarchical Similarity Measurement Model (HSM)

I. INTRODUCTION

Optimization techniques are widely applied in many industries, including engineering, business intelligence, and finance [6][8][10]. Many improved versions of PSO have been studied and developed: a modified PSO using a parameter called "inertia weight" [1], an improved social-learning structure for PSO using the effect of neighborhood topology [2], a hybrid of PSO and a genetic algorithm with breeding and subpopulations [3], and a PSO that utilizes negative entropy to adjust the velocity and position of each particle [4]. Other papers have described the notion of balancing the exploitation and exploration processes [12][13]. The appropriate ratio of exploration to exploitation depends on the problem domain: too few exploitation rounds can stop the search prematurely, whereas overly excessive exploitation wastes search effort [14].

However, none of these methods introduces the concept of Search Space Reduction (SSR). SSR incrementally eliminates undesirable areas, thereby narrowing the search space in which a solution is sought. Its main idea involves three steps: exploration, exploitation, and search space reduction. Exploration generates a group of particles randomly positioned in the search space, whereas exploitation manipulates each particle's position. Search space reduction is the essential step that contracts and reduces the search space by calculating a new distance for the next round of exploration. Contraction occurs when the search reaches the maximum number of exploitations and the position of the best particle in the previous round differs from that of the current round; reduction takes place when the best particle is stuck at its previous best position.

In a typical optimization problem, a mathematical equation is required to formulate the problem. In the case of program execution, however, it is neither necessary nor suitable to form an equation for a program, owing to the varied nature of programs. Consequently, a Hierarchical Similarity Measurement (HSM) model of program execution [5] was developed to compute a proper fitness function, which is used in our SSR algorithm.

II. RELATED WORK

Particle Swarm Optimization (PSO) is a population-based stochastic optimization technique that imitates social models such as bird flocking, fish schooling, and swarming theory. Nearest-neighbor velocity matching is used when initializing the random population to avoid particles moving in a single, unchanging direction. The particles update their velocities and positions based on the local best and global best (gBest) solutions found so far [7].

The PSO approach is also widely used in applications such as decision-making models in finance. One such approach uses PSO to select the global best (gBest) among artificial neural networks that simulate one-step-ahead investment decisions for the stock market; it yields more accurate results, on average, than other PSO algorithms [8].

Sponsored by Bangkok University.



Furthermore, PSO has been integrated as a meta-heuristic search algorithm into dynamic program execution to generate search-based test data with the highest possible coverage rate. To calculate a fitness function for the corresponding coverage criterion, PSO identifies the search direction for the best test to reach the maximum possible coverage ratio [11].

Organizational Evolutionary PSO, an improved variant, is applicable to test case generation. With inter-organizational collaboration and self-learning, it evolves toward a global maximum while significantly reducing the number of test cases [9].

A Hierarchical Similarity Measurement (HSM) model of a program's execution avoids having to explicitly form an equation by working like a black-box model. It uses a similarity value to compute a fitness function and supports primitive, abstract, and complex data types [5].

III. SEARCH SPACE REDUCTION APPLIED TO HSM

Search Space Reduction (SSR) is a method that eliminates undesirable areas in order to find a solution more efficiently. SSR can be applied to all variants of PSO. In this paper, we apply SSR to a standard PSO that utilizes the HSM model; HSM facilitates the computation of the fitness function for a program's execution.
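To make HSM's black-box role concrete, the following sketch (Python is used for all illustrative sketches in this paper) wraps a program under test so that a candidate input is scored by the similarity between actual and expected output. The names hsm_fitness and similarity are ours, not the paper's; the real similarity computation is the HSM model of [5], whose internals are outside the scope of this paper.

def hsm_fitness(program, expected, similarity):
    # Treat the program as a black box: run it on a candidate input and
    # score the actual output (AO) against the expected output (EO).
    # `similarity` stands in for the HSM model and must return a value
    # in [0, 1], where 1.0 means AO matches EO.
    def fv(inputs):
        actual = program(*inputs)
        return similarity(actual, expected)
    return fv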

The particle velocity and position update formulas [7] are

$$
v_{id}(t+1) =
\begin{cases}
w\,v_{id}(t) + c_p u_p \big(p_{id} - x_{id}(t)\big) + c_g u_g \big(p_{gd} - x_{id}(t)\big) & \text{if } -V_{max} \le v_{id}(t+1) \le V_{max} \\
V_{max} & \text{if } v_{id}(t+1) > V_{max} \\
-V_{max} & \text{if } v_{id}(t+1) < -V_{max}
\end{cases}
\tag{1}
$$

$$
x_{id}(t+1) = x_{id}(t) + v_{id}(t+1) \tag{2}
$$

where $w$ is the inertia weight, $c_p$ and $c_g$ are the cognitive and social acceleration coefficients, $u_p$ and $u_g$ are uniform random numbers in $[0, 1]$, $p_{id}$ is the best position found by particle $i$, and $p_{gd}$ is the global best position in dimension $d$.
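A direct per-dimension reading of equations (1) and (2) in Python follows. The coefficient values shown are common defaults and an assumption on our part; the paper does not report its parameter settings.

import random

def update_velocity(v, x, p_id, p_gd, w=0.7, c_p=1.5, c_g=1.5, v_max=4.0):
    # Equation (1): inertia term plus cognitive and social attraction,
    # with the result clamped to [-V_max, V_max].
    u_p, u_g = random.random(), random.random()
    v_new = w * v + c_p * u_p * (p_id - x) + c_g * u_g * (p_gd - x)
    return max(-v_max, min(v_max, v_new))

def update_position(x, v_new):
    # Equation (2): move the particle along its updated velocity.
    return x + v_new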

Search Space Reduction with HSM Algorithm

Given:
  A: a search space dimension
  t: the round of exploitation
  ubPos: an upper bound of the search space in A
  lbPos: a lower bound of the search space in A
  distance: the distance between position_t[gBest] and position_t-1[gBest]
  P: a program to execute
  AO: the actual output of P
  EO: the expected output of P
  HSM: the Hierarchical Similarity Measurement model
  sim_t: the similarity value between AO and EO in round t
  p: a particle object consisting of
    1) position in A
    2) fv: a fitness value derived from sim in round t
    3) pBest: the particle with the best fv among the particles at a particular round of exploitation
    4) gBest: the particle with the best fv of all particles over every round of exploitation
  v: the velocity of p
Output: the gBest particle

 1: Exploration()
 2: if fv_t[gBest[p]] is 1
 3:     return gBest[p]
 4: else do
 5:     t <- t + 1
 6:     change v using equations (1) and (2)
 7:     position_t[p] <- position_t-1[p] + v_t
 8:     call HSM(position_t[p])
 9:     set_pBest(group of p_t, group of p_t-1)
10:     set_gBest(all p)
11:     if fv_t[gBest[p]] is 1
12:         return gBest[p]
13:     else if t reaches the maximum number of exploitations
14:         distance <- calculateDistance(gBest_t[p], gBest_t-1[p])
15:         lbPos_t <- position_t[gBest[p]] - distance
16:         ubPos_t <- position_t[gBest[p]] + distance
17:         Exploration()
18: while resources are not exhausted

The algorithm comprises the three main steps of the SSR method: (1) exploration, (2) exploitation, and (3) space reduction by calculating a new distance.

Line 1 begins the exploration step. In this step, t is initialized and a group of particles is randomly generated with positions within the upper and lower bounds of the search space A. A velocity v is randomly generated for each particle. For each particle in the group, HSM is called to compute the similarity between AO and EO, which is assigned as the particle's fitness value (fv). Lines 2 and 3 check whether the fv of gBest equals 1. If so, the solution has been found and the algorithm returns the gBest particle.
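A minimal sketch of this exploration step, assuming a dictionary-based particle layout of our own choosing and a fitness callable such as the hsm_fitness wrapper sketched earlier:

import random

def exploration(lb, ub, n_particles, fitness):
    # Generate particles with random positions and velocities inside the
    # current bounds, and score each one with the HSM-based fitness.
    # The initial velocity range +/-(hi - lo) is an assumption.
    swarm = []
    for _ in range(n_particles):
        pos = [random.uniform(lo, hi) for lo, hi in zip(lb, ub)]
        vel = [random.uniform(-(hi - lo), hi - lo) for lo, hi in zip(lb, ub)]
        p = {"pos": pos, "vel": vel, "fv": fitness(pos)}
        p["pbest_pos"], p["pbest_fv"] = pos[:], p["fv"]
        swarm.append(p)
    return swarm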

Lines 4 to 18 form a do-while loop that combines the other two steps, exploitation and SSR. The exploitation step is described in lines 4 to 12. The exploitation round t is incremented on line 5, and the velocity v is updated using equations (1) and (2). The algorithm then changes each particle's position and calls HSM to execute the program P and obtain the new fv.

The set_pBest procedure on line 9 finds each pBest by comparing the fv of each particle between the groups of rounds t-1 and t; the best fv is then assigned to the particle in the group. Line 10 calls the set_gBest procedure to find the best fv among all particles.

Lines 13 to 16 represent SSR, which specifies a new search space for the next exploration. If the solution has not been found and t has reached the maximum number of exploitations, a distance is calculated on line 14. The algorithm assigns the lower bound (lbPos) of the new search space as the position of gBest at round t minus the distance, and the upper bound (ubPos) as that position plus the distance. Line 17 starts a new round of exploration. The algorithm repeats lines 4 to 18 until resources are exhausted.
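Putting the pieces together, the following sketch mirrors lines 1-18 of the algorithm, reusing the exploration, update_velocity, and update_position sketches from this section, plus the calculate_distance sketch that follows the calculateDistance procedure below. The 15 x 20 budget defaults mirror the paper's PSO-SSR configuration; everything else is our illustrative scaffolding.

def ssr_pso(lb, ub, fitness, n_particles=20, max_exploit=20, max_explore=15):
    # max_explore plays the role of the "resources" bound on line 18.
    gbest = None
    for _ in range(max_explore):
        swarm = exploration(lb, ub, n_particles, fitness)      # line 1
        gbest = max(swarm, key=lambda q: q["fv"])
        if gbest["fv"] == 1.0:                                 # lines 2-3
            return gbest
        prev_pos = gbest["pos"][:]
        for _ in range(max_exploit):                           # lines 4-12
            for q in swarm:
                for d in range(len(lb)):                       # lines 6-7
                    q["vel"][d] = update_velocity(q["vel"][d], q["pos"][d],
                                                  q["pbest_pos"][d],
                                                  gbest["pos"][d])
                    q["pos"][d] = update_position(q["pos"][d], q["vel"][d])
                q["fv"] = fitness(q["pos"])                    # line 8: run P via HSM
                if q["fv"] > q["pbest_fv"]:                    # line 9: set_pBest
                    q["pbest_pos"], q["pbest_fv"] = q["pos"][:], q["fv"]
            prev_pos = gbest["pos"][:]
            gbest = max(swarm, key=lambda q: q["fv"])          # line 10: set_gBest
            if gbest["fv"] == 1.0:                             # lines 11-12
                return gbest
        dist = calculate_distance(gbest["pos"], prev_pos, lb, ub)  # line 14
        lb = [g - s for g, s in zip(gbest["pos"], dist)]           # line 15
        ub = [g + s for g, s in zip(gbest["pos"], dist)]           # line 16
    return gbest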

procedure calculateDistance(gBest_t[p], gBest_t-1[p])
begin
1: if gBest_t[p] <> gBest_t-1[p]
2:     d <- position_t-1[gBest[p]] - position_t[gBest[p]]
3: else
4:     d <- Min(|ubPos_t - position_t[gBest[p]]|, |lbPos_t - position_t[gBest[p]]|)
5: return d
end



The calculateDistance procedure calculates the distance between the positions of gBest at rounds t and t-1. If the positions are not equal, the variable d is set by subtracting the gBest position of round t from that of round t-1. Otherwise, line 4 sets d to the minimum of the distance between ubPos and the gBest position and the distance between lbPos and the gBest position.
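The same procedure in Python, computing one distance per dimension. The abs() on the contraction branch is our addition so that the bounds on lines 15-16 stay ordered regardless of the direction gBest moved; the paper's pseudocode subtracts the positions directly.

def calculate_distance(cur, prev, lb, ub):
    # Contraction case (line 2): gBest moved, so use how far it moved.
    if cur != prev:
        return [abs(p - c) for p, c in zip(prev, cur)]
    # Reduction case (line 4): gBest is stuck, so shrink to the distance
    # from gBest to the nearest bound in each dimension.
    return [min(abs(hi - c), abs(lo - c)) for c, lo, hi in zip(cur, lb, ub)]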

The figures below explain how SSR contracts and reduces the search space area step by step. Assume that the search space is a two-dimensional (x, y) area where x and y are integer inputs to a program, with 0 < x < 100 and 0 < y < 20, as shown in Figure 1.

Figure 1. Search Space of 0 < x < 100 and 0 < y < 20.

Assume that after the first exploration the solution is (70, 18) and gBest is (60, 14). When t reaches the maximum number of exploitation rounds, the lbPos and ubPos for the next exploration are calculated as follows:

distance(x) = Min(|ubPos(x) - gBest(x)|, |lbPos(x) - gBest(x)|) = Min(|100 - 60|, |0 - 60|) = 40

distance(y) = Min(|ubPos(y) - gBest(y)|, |lbPos(y) - gBest(y)|) = Min(|20 - 14|, |0 - 14|) = 6

lbPos(x) = gBest(x) - distance(x) = 20;  ubPos(x) = gBest(x) + distance(x) = 100

lbPos(y) = gBest(y) - distance(y) = 8;  ubPos(y) = gBest(y) + distance(y) = 20
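The same arithmetic, checked in a few lines of Python (all values taken from the example above):

# First contraction: gBest = (60, 14) inside 0 < x < 100, 0 < y < 20.
gbest, lb, ub = (60, 14), (0, 0), (100, 20)
dist = [min(abs(hi - g), abs(lo - g)) for g, lo, hi in zip(gbest, lb, ub)]
print(dist)                                   # [40, 6]
print([g - d for g, d in zip(gbest, dist)])   # lbPos: [20, 8]
print([g + d for g, d in zip(gbest, dist)])   # ubPos: [100, 20]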

In Figure 2, the solid line represents a new search space after contraction.

Figure 2. New search space of 20 < x < 100 and 8 < y < 20.

At the 4th round of exploration, suppose that the search space has been incrementally contracted as shown in Figure 3, and that the gBest of the 4th round equals the gBest of the 3rd round. It is then very likely that the particle is stuck in a local maximum, so the SSR algorithm must reduce the search space.

Figure 3. Search space where the positions of gBest at rounds t-1 and t are the same.

The next calculation shows how to eliminate an undesirable area for the next round of exploration, and Figure 4 illustrates the new search space area after reduction.

distance(x) = Min(|ubPos(x) - gBest(x)|, |lbPos(x) - gBest(x)|) = 10

distance(y) = Min(|ubPos(y) - gBest(y)|, |lbPos(y) - gBest(y)|) = 2

lbPos(x) = gBest(x) - distance(x) = 80;  ubPos(x) = gBest(x) + distance(x) = 100

lbPos(y) = gBest(y) - distance(y) = 16;  ubPos(y) = gBest(y) + distance(y) = 20

Figure 4. New Search Space after reduction.

IV. EXPERIMENT

An experimental study was conducted to demonstrate the concept of SSR in PSO applied to HSM. Three techniques were used in the experiment: (1) PSO with SSR, (2) PSO without SSR and with low exploitation, and (3) PSO without SSR and with high exploitation. For PSO-SSR, 15x20 denotes 15 rounds of exploration with 20 rounds of exploitation per exploration. For PSO-No-SSR-Low and PSO-No-SSR-High, 20 and 100 indicate the low and high numbers of exploitation rounds, respectively.

Four programs were developed for the experiment. First, Installment V1 calculates the monthly payment from the principal, the interest rate, and the number of payments. The objective is to find a number of payments such that, for a principal of 2,500,000 baht, the monthly payment does not exceed 8,300 baht. Second, Installment V2 is the same as Installment V1 except that it becomes a multiple-objective problem: finding both the number of payments and the interest rate. The third program solves the


two-variable polynomial equation x^2 + y^2 = 4, where x and y are integers with -1000 <= x <= 1000 and -1000 <= y <= 1000. The fourth is a Caesar cipher decryption program whose output string is generated from four input strings, each a single character from "a" to "z". The objective is to find inputs whose decryption yields the string "fish".
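As an illustration of how such a program yields a fitness without an explicit equation, the sketch below scores a four-character candidate for the decryption program by the fraction of characters that decrypt to "fish". The shift of 3 and the per-character similarity are our assumptions; the paper derives its fitness through the HSM model rather than through this formula.

def decryption_fitness(candidate, shift=3, target="fish"):
    # Decrypt each input character with a Caesar shift, then score the
    # result by the fraction of positions matching the target string.
    decrypted = "".join(chr((ord(c) - ord("a") - shift) % 26 + ord("a"))
                        for c in candidate)
    return sum(a == b for a, b in zip(decrypted, target)) / len(target)

print(decryption_fitness("ilvk"))   # 1.0 -> "ilvk" decrypts to "fish"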

TABLE I. EXPERIMENTAL STUDY RESULTS.

                          PSO-SSR (15x20)        PSO-No-SSR-Low (30x20)   PSO-No-SSR-High (6x100)
Program                   Found   Round found    Found   Round found      Found   Round found
Installment V1            Yes     13th           Yes     14th             Yes     12th
Installment V2            Yes     27th           Yes     549th            No      -
Two-variable Polynomial   Yes     71st           No      -                No      -
Decryption                Yes     281st          No      -                No      -

Table I shows the experimental results of the three PSO techniques applied to the four programs.

The results demonstrate that PSO-SSR performs much better than the other two techniques. For the Two-variable Polynomial and Decryption programs, PSO-SSR identifies all solutions whereas both versions of PSO-No-SSR cannot. Likewise, for the Installment V2 program, PSO-SSR found the solution but PSO-No-SSR-High did not. Although both PSO-SSR and PSO-No-SSR-Low found the solution for Installment V2, the number of rounds for PSO-SSR is significantly lower than for PSO-No-SSR-Low (the 27th round compared to the 549th). Another finding is that low exploitation is better than high exploitation, as shown in row 2 of Table I, where PSO-No-SSR-Low found the solution while PSO-No-SSR-High did not.

Finding the solution depends on program complexity. For the purposes of our experiment, two factors define a program's complexity: the number of objectives and the ratio of the number of solutions to the size of the search space. It took far more effort for PSO-No-SSR-Low to find the solution for Installment V2 than for Installment V1 because V2 has multiple objectives whereas V1 has a single objective (the 549th round compared to the 14th). For the Decryption program, the ratio of the number of solutions to the size of the search space is 1 to 456,976, in contrast with 6 to 480 for Installment V1, a far less complicated problem. Additionally, an effectiveness value is used in this research to measure SSR's effectiveness. The value is the average of the difference between the maximum number of exploitation rounds and the fastest round in which a solution is found, divided by the maximum number of exploitation rounds. If PSO-SSR finds a solution while PSO without SSR does not, the effectiveness value equals one; conversely, the value equals zero when PSO-SSR does not find a solution but PSO without SSR does.
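One reading of this effectiveness value in Python follows; the formula is our interpretation of the verbal definition above, since the paper does not state it symbolically.

def effectiveness(max_rounds, ssr_round, no_ssr_round):
    # ssr_round / no_ssr_round: fastest round in which each technique
    # found a solution, or None if no solution was found in the budget.
    if ssr_round is not None and no_ssr_round is None:
        return 1.0   # SSR found a solution where plain PSO did not
    if ssr_round is None:
        return 0.0   # SSR failed (zero in particular when plain PSO succeeds)
    return (max_rounds - ssr_round) / max_rounds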

V. CONCLUSIONS AND FUTURE WORKS

This paper implements the concept of SSR with a PSO that employs HSM as the fitness function, instead of forming an equation for the program's execution, in order to find input to the program. The results of our experiment indicate a significant improvement in the optimization process from SSR's area reduction and contraction: SSR expedites the search and increases the effectiveness value of PSO by 73%. Program complexity, however, affects the speed of the search process. Moreover, a good ratio of exploration rounds to exploitation rounds depends on the type and complexity of the program, and can be found by running experiments on many types of problems. Our study finds that excessive exploitation is time-consuming and ineffective. In addition, the SSR concept can be applied to all variants of PSO. We therefore conclude that SSR is a very effective method for identifying solutions to problems. In future research, it will be applied in software engineering, for instance to bug detection and software vulnerability detection.

ACKNOWLEDGMENT

This paper would not have been possible without the help, support, and patience of two people. The authors would like to specially thank Mr. Sarawut Rasniyom for his programming assistance in this research, and Ms. Waraporn Duriyavanich for her assistance in finishing this paper.

REFERENCES

[1] Shi, Y.; & Eberhart, R.C., "A Modified Particle Swarm Optimizer", IEEE International Conference on Evolutionary Computation, pp. 69-73, 1998.

[2] Kennedy, J. “Small Worlds and Mega-minds: Effects of Neighborhood Topology on Particle Swarm Performance”, Evolutionary Computation, pp. 1931-1938, 1999.

[3] Løvberg, M.R.; Rasmussen, T.K.; & Krink, T. “Hybrid Particle Swarm Optimizer with Breeding and Subpopulations”, Third Genetic and Evolutionary Computation Conference, pp. 469-476, 2001.

[4] Xie, X.F.; Zhang, W.J.; & Yang, Z.L., "A Dissipative Particle Swarm Optimization", IEEE Congress on Evolutionary Computation, pp. 1456-1461, 2002.

[5] Reungsinkonkarn, A., "Hierarchical Similarity Measurement Model of Program Execution", 4th IEEE International Conference on Software Engineering and Service Science, pp. 255-261, 2013.

[6] Xiaohui Hu, Eberhart, R.C., and Yuhui Shi, "Engineering optimization with particle swarm", IEEE Swarm Intelligence Symposium (SIS '03), pp. 53-57, 2003.

[7] Eberhart, R., Kennedy J., “A New Optimizer Using Particle Swarm Theory”, Micro Machine and Human Science, pp. 39-43, 1995.

[8] Nenortaite J., Butleris R., “Application of Particle Swarm Optimization Algorithm to Decision Making Model Incorporating Cluster Analysis”, Human System Interactions, pp. 88-93, 2008.

[9] Xiaoying P., “Using Organizational Evolutionary Particle Swarm Techniques to Generate Test Cases for Combinatorial Testing”, Computational Intelligence and Security, pp. 1580-1583, 2011.


[10] Ernawati and Subanar, “Using particle swarm optimization to a financial time series prediction”, Distributed Framework and Applications (DFmA), pp. 1 - 6, 2010.

[11] Chengying M., Xinxin Y., and Jifu C., "Swarm Intelligence-Based Test Data Generation for Structural Testing", International Conference on Computer and Information Science, 2012.

[12] Parsopoulos, K.E., Vrahatis, M.N., “Parameter selection and adaptation in unified particle swarm optimization”, Mathematical and Computer Modeling 46, pp. 198–213, 2007.

[13] Koduru, P., et al., "A particle swarm optimization-Nelder Mead hybrid algorithm for balanced exploration and exploitation in multidimensional search space", 2006.

[14] Andries P. Engelbrecht, "Computational Intelligence: An Introduction", John Wiley and Sons, pp. 289-358, 2007.
