
(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 7, No. 8, 2016

FPGA Implementation of Parallel Particle Swarm Optimization Algorithm and Compared with Genetic Algorithm

BEN AMEUR Mohamed Sadek, Laboratory of Microelectronics, University of Monastir, Monastir, Tunisia

SAKLY Anis, National Engineering School of Monastir, Monastir, Tunisia

Abstract—In this paper, a digital implementation of the Particle Swarm Optimization (PSO) algorithm is developed for a Field Programmable Gate Array (FPGA). PSO is a recent intelligent heuristic search method whose mechanism is inspired by the swarming of biological populations. PSO is similar to the Genetic Algorithm (GA): both use a combination of deterministic and probabilistic rules. The experimental results are used to evaluate the performance of the PSO implementation against GA and other PSO algorithms. New digital solutions make hardware implementations of PSO algorithms practical. We therefore developed a hardware architecture based on a finite state machine (FSM) and implemented it on an FPGA to solve dispatch computing problems faster than other circuits based on swarm intelligence. Moreover, the inherent parallelism of these new hardware solutions, together with their large computational capacity, keeps the running time small regardless of the complexity of the processing.

Keywords—PSO algorithm; GA; FPGA; finite state machine; hardware

I. INTRODUCTION

Over the last decade, several meta-heuristic algorithms have been proposed to solve hard and complex optimization problems, and they have proven effective on problems that defeat many classical methods. The proposed architecture is tested on standard benchmark functions. We also analyzed the operators of GA and PSO to describe how the performance of each can be enhanced by incorporating features of the other, and we used these benchmark functions to compare the two algorithms. The PSO algorithm uses a technique [1] that explores the whole search space to find the parameters that minimize or maximize an objective. Its ability to solve complex problems simply keeps research in this area active compared with many other optimization techniques [2][3].

This research attempts to show that PSO finds the global optimum as effectively as GA but with better computational efficiency (lower hardware resource usage and shorter execution time). The main objective of this paper is to compare the computational efficiency of our optimized PSO with GA and other PSO algorithms using a set of benchmark test problems. The results could prove important for future studies of PSO. The paper is organized as follows. Section II briefly introduces the general steps of the PSO mechanism, including a brief introduction to the pseudo-random number generator [4]. Section III describes the functional architectures that perform the GA and PSO algorithms and their hardware implementation. The following sections illustrate the experimental results of several benchmark functions applied to the PSO algorithm, compared with GA and other PSO algorithms. Finally, we conclude our work and give some implications and directions for future studies.

II. PARTICLE SWARM OPTIMIZATION

In the Particle Swarm Optimization algorithm, each « bird » represents a candidate solution in the search space. Birds are called particles; to explore the search space, each particle is evaluated by the fitness function, and the flight of the swarm toward the prey is managed by a velocity module. Each particle flies around a solution by following the optimum positions of other particles [5][6]. All particles are associated with points in the search space, and their positions depend on their own best solution and on those of their neighbors. Particles are perturbed randomly at every iteration; they evaluate themselves and observe their neighbors [7], then follow the particles that are successful on the given problem. PSO gives satisfactory results on many dispatch problems in biomedicine, finance, 3D graphics, image processing and other fields [8], but its parameters are hard to choose because the best setting depends on the application. So we first have to set several parameters of the PSO algorithm [9][10]:

- The position and velocity equations of the particles
- The number of particles in the search space
- The best (Gbest) fitness achieved
- The positions of the particles holding the best solutions so far
- The number of iterations

At the beginning, we generate a random population; then, at each iteration, we search for the best solution and the particles update their positions using two best solutions. The first is the particle's own best solution to the problem, named « lbest ». The other optimal solution, the one the PSO algorithm follows, is the best obtained by any particle of the population and is named « Gbest ».

A. The random number generator

Programming PSO algorithms requires a random generator, and there are several methods to generate random numbers. In fact, it is impossible to generate truly random numbers with an algorithm, which is why they are called pseudo-random numbers. Pseudo-random generator programs are effective and particularly suitable for implementation. Most pseudo-random algorithms try to produce outputs that are uniformly distributed. A common class of generators uses a linear congruence; others are inspired by the Fibonacci sequence, adding the two previous values. Among the most popular and fastest algorithms are the linear congruential generators introduced by D. H. Lehmer in 1948 [4], which eventually became extremely popular.

In our algorithm, the pseudo-random generator block [13] is used to set the initial positions of the particles and to feed the velocity vector. We chose the frequently used Lehmer linear congruential generator:

F(n+1) = (A * F(n) + B) mod C                (1)

where:
- F(n+1) is the next random number produced by the function F
- F(n) is the previous number obtained
- A and B are the multiplicative and additive constants, respectively
- C is the modulus
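As a concrete illustration, a minimal software model of this generator is sketched below in C. The constants A, B, C and the seed are placeholder values chosen for illustration; the paper does not report the values used in the hardware block.

```c
#include <stdint.h>
#include <stdio.h>

/* Minimal software model of the Lehmer-style linear congruential
 * generator of Eq. (1). A, B, C and the seed are illustrative values,
 * not the ones used in the hardware block. */
#define LCG_A 25173u   /* multiplicative constant (assumed) */
#define LCG_B 13849u   /* additive constant (assumed) */
#define LCG_C 65536u   /* modulus: 16-bit output range (assumed) */

static uint32_t lcg_state = 12345u;   /* initial seed (assumed) */

uint16_t lcg_next(void)
{
    lcg_state = (LCG_A * lcg_state + LCG_B) % LCG_C;   /* Eq. (1) */
    return (uint16_t)lcg_state;
}

int main(void)
{
    for (int i = 0; i < 5; i++)
        printf("%u\n", lcg_next());   /* a few pseudo-random samples */
    return 0;
}
```

Because the state is reduced modulo C at every step, the same narrow-width arithmetic maps directly onto an FPGA datapath.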

B. Position and velocity equations

The velocity equation changes the position of a particle; generally, the positions indicate the distance to the best particle. These quantities are updated at every iteration using the equations below:

vi(t+1) = w * vi(t) + c1 * r1 * (lbesti - xi(t)) + c2 * r2 * (Gbest - xi(t))    (2)

xi(t+1) = xi(t) + vi(t+1)    (3)

where xi(t) is the position of particle i at time t, vi(t) is its velocity, w is the inertia weight, c1 and c2 are constant coefficients, r1 and r2 are random numbers drawn at each iteration, « Gbest » is the best solution found so far by the whole swarm, and « lbesti » is the best solution found by particle i. The velocity vector thus directs the search process and reflects the sociability of the particles.
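To make Eqs. (2) and (3) concrete, the sketch below applies one update step to a particle; the coefficient values w, c1 and c2 are common choices from the literature, not values reported in this paper.

```c
#include <stdlib.h>

/* One PSO update step for a single particle, following Eqs. (2)-(3).
 * w, c1 and c2 are common literature values, not the paper's settings. */
void pso_update_particle(float *x, float *v, const float *lbest,
                         const float *gbest, int dim)
{
    const float w = 0.7f, c1 = 1.5f, c2 = 1.5f;   /* assumed coefficients */

    for (int d = 0; d < dim; d++) {
        /* r1, r2: fresh random numbers in [0,1] at every iteration */
        float r1 = (float)rand() / (float)RAND_MAX;
        float r2 = (float)rand() / (float)RAND_MAX;

        /* Eq. (2): inertia + cognitive (lbest) + social (Gbest) terms */
        v[d] = w * v[d]
             + c1 * r1 * (lbest[d] - x[d])
             + c2 * r2 * (gbest[d] - x[d]);

        /* Eq. (3): move the particle */
        x[d] += v[d];
    }
}
```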

Convergence to the optimum solution can be declared after a fixed number of iterations, when the variation of the fitness tends to zero (as for the sphere function), or when the fitness reaches its best minimized value. Several parameters come into play:

- The population size
- The size of the neighborhood
- The dimension of the search space
- The values of the coefficients
- The maximum speed

At each iteration, every particle moves as a function of three components:

- Its current velocity
- Its local best solution
- The global best solution in its neighborhood

TABLE I. EXAMPLE OF SOME SELECTED PARTICLES

Particle | Iteration 1      | Iteration 2      | Iteration 3      | ... n
X1       | 0010011111110010 | 0010011111110001 | 0010011111110000 | ...
X2       | 0010011111100011 | 0010011111100011 | 0010011111100011 | ...
X3       | 0010011111000001 | 0010011111000001 | 0010011111000001 | ...
Xn       | Xn1              | Xn2              | Xn3              | ...

This table shows a sample run on the sphere function. One can easily see that the « lbest » of particle X1 is reached at iteration 3 and the « lbest » of particles X2 and X3 at iteration 2, while the « gbest » is held by X3 and is located at iteration 2.

The global minimum of the sphere function is clearly located at xi = 0. At each iteration we pick the « lbest » and save it into memory in order to compare its value with the new position of the particle at the next iteration.
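The per-iteration bookkeeping just described can be sketched as follows, assuming (for illustration) a one-dimensional particle encoding as in Table I and a minimization objective:

```c
/* Sketch of the lbest/gbest bookkeeping described above, assuming
 * one-dimensional particles (as in Table I) and minimization. */
void update_bests(const float fit[], const float x[],
                  float lbest_fit[], float lbest_x[],
                  float *gbest_fit, float *gbest_x, int n_particles)
{
    for (int i = 0; i < n_particles; i++) {
        if (fit[i] < lbest_fit[i]) {      /* better personal best found */
            lbest_fit[i] = fit[i];
            lbest_x[i]   = x[i];          /* saved to memory for the next
                                             iteration's comparison */
        }
        if (fit[i] < *gbest_fit) {        /* better global best found */
            *gbest_fit = fit[i];
            *gbest_x   = x[i];
        }
    }
}
```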

III. ARCHITECTURE OF GA AND PSO ALGORITHMS

A. GA architecture

To optimize a problem with GA, we have to explore the whole search space in order to maximize (or minimize) a chosen function, and the genetic algorithm is well suited to a quick exploration of an area. The organizational chart describing the architecture of the GA is shown in the following figure.

Page 3: FPGA Implementation of Parallel Particle Swarm ...€¦ · implementation on Field Programmable Gate Array (FPGA). PSO is a recent intelligent heuristic search method in which the

(IJACSA) International Journal of Advanced Computer Science and Applications,

Vol. 7, No. 8, 2016

59 | P a g e

www.ijacsa.thesai.org

Fig. 1. Architecture of the GA (random generator, population memory, fitness block, and selection, crossover and mutation modules)

B. PSO architecture

The architecture of our optimized PSO algorithm is presented in the following figure:

Fig. 2. Internal architecture of the PSO algorithm (initialization of particles and counters, random generator, velocity initialization and storage, evaluation module, lbest/gbest update, velocity and position updates, iteration update)

For the hardware implementation of the PSO algorithm, the architecture is decomposed into five operations performed on each particle: update the position, evaluate the fitness, update the particle's best position, update the global best position, and update the velocity.

The two architectures show that the algorithms share some common points: both begin with a random population in the search space, both use a fitness module to evaluate each generation, and both update the generation and search for an optimal value using pseudo-random numbers, without guaranteeing success. However, Particle Swarm Optimization has no crossover or mutation operators; instead, PSO updates its particles through the velocity module.

C. The FSM

In this paper, a dynamic parallel PSO is implemented for large optimization problems and compared with GA and other PSO algorithms. The FSM is used to exploit every kind of parallelism and find the optimum solution in a reduced time. The dynamics of the FSM are represented in Figure 3: at any time, the machine occupies one of a finite number of states. We first fix the number of states; every state may have one or more transitions to neighboring states, and states with no outgoing transitions are called final states.

The algorithm updates the optimum fitness value after all the particles have been evaluated; updating the positions and velocities after each evaluation yields good convergence rates. In dynamic parallel computing, the main performance factor is the communication latency of each transition between states. The goal of parallel dynamic computing is to produce optimal results while using multiple processing units to reduce the running time. In this architecture we paired memory modules to increase the bandwidth, which improves the capabilities of our algorithm and is only possible with dual-channel block RAM: the data memory can then be written and read at the same frequency. Dual-port RAM has one drawback: a read of the memory content is delayed by one clock cycle relative to the request. The states are described below (a software sketch of this state machine follows the list):

S0: Initialize the parameters, signals and counters of the PSO algorithm; go to S1.

S1: Generate the initial population and its velocities using the random generator; go to S2 or S3.

S2: Save the position and velocity values into memory (RAM).

S3: Evaluate the particles using the fitness module; go to S4 or S5.

S4: Save the evaluated values into block RAM; go to S6.



S5: Test gbest: if fit(i) < Global-best(i), update Global-best; if the number of iterations is reached, go to the final state, otherwise go to S7.

S6: Test lbest: if fit(i) < local-best(i), update local-best(i); then go to S5.

S7: Update the particle positions and velocities.

S8: Update the iteration counter; if the iteration limit is not reached, go to S3, otherwise go to the final state (SF).

SF: Display the optimum solution.
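As a software sketch (not the hardware description used for the actual circuit), the control flow of these states can be modeled with a switch-based state machine. The parallel branches of the hardware (S2/S3 and S4/S5) are serialized here for clarity:

```c
/* Software model of the control FSM above. The hardware runs some
 * states in parallel (S2/S3, S4/S5); this sketch serializes them. */
typedef enum { S0, S1, S2, S3, S4, S5, S6, S7, S8, SF } state_t;

void pso_fsm(int max_iter)
{
    state_t s = S0;
    int iter = 0;
    int done = 0;

    while (!done) {
        switch (s) {
        case S0: /* init parameters, signals, counters  */ s = S1; break;
        case S1: /* generate positions/velocities (LCG) */ s = S2; break;
        case S2: /* save positions/velocities to RAM    */ s = S3; break;
        case S3: /* evaluate fitness of all particles   */ s = S4; break;
        case S4: /* save fitness to block RAM (one-cycle
                    read latency in hardware)            */ s = S6; break;
        case S6: /* if fit(i) < local-best(i), update   */ s = S5; break;
        case S5: /* if fit(i) < Global-best, update     */
                 s = (iter >= max_iter) ? SF : S7;         break;
        case S7: /* update positions and velocities     */ s = S8; break;
        case S8: /* bump the iteration counter          */
                 iter++;
                 s = (iter < max_iter) ? S3 : SF;          break;
        case SF: /* display/return the optimum solution */ done = 1; break;
        }
    }
}
```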

Fig. 3. The finite state machine of the PSO algorithm

Fortunately, new advances in processor technology make it possible to run complex programs at low power cost, beyond clusters of mid-range computers. The dynamic process implemented in the particle swarm optimization can thus be separated into two states that update the position and the velocity of each particle, with the goal of reducing the processing time.

In this paper, the spirit of parallel processing was used to build a dynamic PSO algorithm; the aim of applying parallel computing to PSO is to speed up the processing, using a uniform distribution method to reach optimum solutions in a significantly reduced execution time.

Figure 3 presents the finite state machine of the global control module; in particular, it shows the operation of the PSO step by step in order to keep the algorithm practical.

D. Benchmark test functions

Most researchers use a population size between 10 and 50 when comparing algorithms; here we fixed the population at 20 chromosomes for the GA and 20 particles for the PSO algorithm. To test the PSO and compare its performance with the other algorithm, we used the standard benchmark functions listed below:

- Sphere function
- Rosenbrock function
- Rastrigin function
- Zakharov function

These well-known unimodal and multimodal benchmark functions were selected to test and compare the performance of the two implementations. They are defined as follows:

f1(x) = sum_{i=1..n} xi^2                                            (sphere)      (4)

f2(x) = sum_{i=1..n-1} [100 (x(i+1) - xi^2)^2 + (1 - xi)^2]          (Rosenbrock)  (5)

f3(x) = 10 n + sum_{i=1..n} [xi^2 - 10 cos(2*pi*xi)]                 (Rastrigin)   (6)

f4(x) = sum_{i=1..n} xi^2 + (sum_{i=1..n} 0.5 i xi)^2 + (sum_{i=1..n} 0.5 i xi)^4   (Zakharov)    (7)
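For reference, floating-point versions of these four benchmarks are sketched below; the FPGA design evaluates fixed-point equivalents of the same functions.

```c
#include <math.h>

/* Floating-point reference versions of Eqs. (4)-(7); the FPGA design
 * evaluates fixed-point equivalents of these functions. */
static const double PI = 3.14159265358979323846;

double sphere(const double *x, int n)          /* Eq. (4), min at x = 0 */
{
    double s = 0.0;
    for (int i = 0; i < n; i++) s += x[i] * x[i];
    return s;
}

double rosenbrock(const double *x, int n)      /* Eq. (5), min at x = 1 */
{
    double s = 0.0;
    for (int i = 0; i < n - 1; i++)
        s += 100.0 * pow(x[i + 1] - x[i] * x[i], 2.0)
           + pow(1.0 - x[i], 2.0);
    return s;
}

double rastrigin(const double *x, int n)       /* Eq. (6), min at x = 0 */
{
    double s = 10.0 * n;
    for (int i = 0; i < n; i++)
        s += x[i] * x[i] - 10.0 * cos(2.0 * PI * x[i]);
    return s;
}

double zakharov(const double *x, int n)        /* Eq. (7), min at x = 0 */
{
    double s1 = 0.0, s2 = 0.0;
    for (int i = 0; i < n; i++) {
        s1 += x[i] * x[i];
        s2 += 0.5 * (i + 1) * x[i];
    }
    return s1 + s2 * s2 + s2 * s2 * s2 * s2;
}
```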

IV. VALIDATION EXAMPLES

The swarm size used for the PSO is the same as the population size used in the GA: 20 particles in the PSO swarm and 20 chromosomes in the GA population. In the GA, all variables of each individual are represented as binary strings of '0' and '1' referred to as chromosomes. Like the genetic algorithm, PSO begins with a random population; to perform its exploration, the GA uses three operators (selection, crossover and mutation) to propagate its population from one iteration to the next, as sketched below.
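As an illustration of these operators on the 16-bit chromosomes used here, a minimal sketch follows. The paper does not detail its exact operator variants, so the classic one-point crossover and bit-flip mutation are assumed; the 5% mutation rate matches Section V.

```c
#include <stdint.h>
#include <stdlib.h>

/* Classic one-point crossover on two 16-bit chromosomes. The paper does
 * not specify its exact crossover variant; this is the textbook form. */
void one_point_crossover(uint16_t *a, uint16_t *b)
{
    int cut = 1 + rand() % 15;                    /* cut point in 1..15 */
    uint16_t low = (uint16_t)((1u << cut) - 1u);  /* mask of low bits */
    uint16_t child_a = (*a & low) | (*b & ~low);  /* swap the tails */
    uint16_t child_b = (*b & low) | (*a & ~low);
    *a = child_a;
    *b = child_b;
}

/* Bit-flip mutation; p_mut = 0.05 matches the 5% rate of Section V. */
void mutate(uint16_t *c, double p_mut)
{
    for (int bit = 0; bit < 16; bit++)
        if ((double)rand() / (double)RAND_MAX < p_mut)
            *c ^= (uint16_t)(1u << bit);
}
```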

A. The sphere function

The sphere function is useful for evaluating characteristics of our optimization algorithms such as robustness and convergence speed. It is unimodal and continuous and has a single minimum. The search interval is [-1, 1]. Figure 4 presents the ModelSim simulation results for the sphere function.



Fig. 4. Simulation results of function f1

The detailed results show that our solution converges to zero from one iteration to the next.

Fig. 5. Simulation results of function f2

The particles work together in a parallel dynamic fashion to reach the best solution of any function. They update position and velocity even when the algorithm has many particles, without a strong impact on the overall execution speed. Indeed, the number of particles in this algorithm is limited only by the embedded resources of the FPGA. The following tables give the number of LUTs (look-up tables), block RAMs and all the hardware resources used for this function.

TABLE II. DEVICE UTILIZATION SUMMARY OF PSO (SPHERE FUNCTION)

Logic        | Used | Available | Utilization
Slices       | 225  | 1920      | 11%
Flip-flops   | 214  | 3840      | 5%
4-input LUTs | 354  | 3840      | 9%
IOBs         | 10   | 173       | 5%
BRAM         | 5    | 12        | 41%
Multiplexers | 4    | 12        | 33%
GCLKs        | 3    | 8         | 37%

TABLE III. DEVICE UTILIZATION SUMMARY OF GA (SPHERE FUNCTION)

Logic        | Used | Available | Utilization
Slices       | 581  | 1920      | 30%
Flip-flops   | 600  | 3840      | 15%
4-input LUTs | 864  | 3840      | 22%
IOBs         | 23   | 173       | 13%
BRAM         | 2    | 12        | 16%
Multiplexers | 8    | 12        | 66%
GCLKs        | 5    | 8         | 62%

The following figure makes the difference between the two algorithms easy to see: the PSO algorithm uses hardware resources more economically than the genetic algorithm.

Fig. 6. Comparison of hardware resources between PSO and GA

We implemented the sphere function with both algorithms, GA and PSO, on a Xilinx Spartan-3. The processing time of one iteration shows that the PSO algorithm reaches a higher operating speed for optimization problems than the genetic algorithm, as the following table describes.

TABLE IV. PROCESSING TIME OF ONE ITERATION

Algorithm                                  | PSO  | GA
Execution of one iteration (clock cycles)  | 1180 | 9740

B. The Rastrigin function

This function is defined in Eq. (6). The Rastrigin function contains several local minima but only one global minimum; it is highly multimodal, and the locations of the minima are regularly distributed.




Fig. 7. Simulation results with ModelSim

The synthesis results of the whole system are shown in the following tables:

TABLE V. DEVICE UTILIZATION SUMMARY OF PSO (RASTRIGIN FUNCTION)

Logic        | Used | Available | Utilization
Slices       | 307  | 1920      | 15%
Flip-flops   | 259  | 3840      | 6%
4-input LUTs | 547  | 3840      | 14%
IOBs         | 26   | 173       | 15%
BRAM         | 10   | 12        | 83%
Multiplexers | 2    | 12        | 16%
GCLKs        | 3    | 8         | 37%

TABLE VI. DEVICE UTILIZATION SUMMARY OF GA (RASTRIGIN FUNCTION)

Logic        | Used | Available | Utilization
Slices       | 1265 | 1920      | 65%
Flip-flops   | 842  | 3840      | 21%
4-input LUTs | 2231 | 3840      | 58%
IOBs         | 2    | 173       | 1%
BRAM         | 3    | 12        | 25%
Multiplexers | 4    | 12        | 33%
GCLKs        | 8    | 8         | 100%

We can easily see that GA requires far more hardware resources, while the PSO algorithm uses fewer slices and flip-flops, as the following figure shows.

Fig. 8. Comparison of the hardware resources used by the two algorithms

C. Rosenbrock function

The Rosenbrock function is a non-convex benchmark of two variables used to test mathematical optimization methods. It was introduced in 1960 by Howard H. Rosenbrock and is also known as the banana function. Search algorithms converge easily into the valley that contains the global minimum. The function is given in Eq. (5). The global minimum is at the point (x, y) = (1, 1), where the function value is 0. A different coefficient is sometimes used in the second term, but this does not greatly affect the position of the global minimum.

D. Zakharov function

We used another benchmark, the Zakharov function, whose global minimum occurs at x = 0; it is given in Eq. (7).

Fig. 9. Comparison between PSO and GA on the Rosenbrock function

Fig. 10. Comparison between PSO and GA on the Zakharov function

V. EXPERIMENTAL RESULTS

The Spartan-3 FPGA platform is from Xilinx. The Spartan-3 is one of the best low-cost FPGA generations, and the board offers a balance between programmable logic, connectivity and hardware applications at an optimized cost. The design flow creates a PROM file that can be written to the non-volatile memory of the Spartan-3. The Spartan-3 board includes the following elements (Figure 11):

- 200K gates in a 256-ball thin Ball Grid Array package




- 4,320 logic cells and equivalents
- 12 block RAMs of 18K bits each (216K bits total)
- 12 hardware 18x18 multipliers
- 4 Digital Clock Managers (DCMs)
- Up to 173 user I/O signals
- Three 40-pin expansion connectors
- PS/2 mouse/keyboard port, VGA port and serial port

Fig. 11. Block diagram of the Spartan-3

To assess whether this algorithm delivers better solutions in a meaningful time, especially regarding robustness and speed, we tested it against other meta-heuristic algorithms: a genetic algorithm and another PSO algorithm. For the GA we used the basic model with elitism and a mutation probability of 5%. The simulations were carried out on a Xilinx Spartan-3 running at 50 MHz, with the population fixed at n = 20 for all simulations. The results are favorable and show that the proposed PSO can be effective for many problems addressed by these algorithms. The experiments used a tolerance of 5%, which allows judging whether the PSO results are acceptable and optimized in execution time compared with the best results of the other algorithms.

Fig. 12. Display of the number of iterations needed to reach the optimal solution

VI. RELATED WORK

Since its invention, many researchers have worked on the PSO algorithm [11] and on ways to accelerate it, to improve convergence and to reduce hardware resource usage in embedded applications. In this section we present some work on parallelized algorithms proposed by other researchers. There are many interesting PSO improvements for various applications. Reynolds et al. [12] suggested a modified PSO technique for neural networks; it is based on a deterministic approach to updating the particle positions, which simplifies the hardware implementation because the standard PSO uses random generators only for the update operations. To reduce hardware resources, Upegui and Peña [13] use a discrete recombination variant of PSO called PSODR, which decreases the computing time of the velocity module. These modified PSO algorithms generate results that are competitive with those of the basic PSO algorithm [14]. Moreover, simplified models of the PSODR algorithm were analyzed and proposed by Bratton and Blackwell [15], with effective and promising results.

Much research has presented modified variants of PSO, either to reduce resource usage or to eliminate problems tied directly to the PSO architecture. That is why we developed a modified architecture using a finite state machine to program a parallel algorithm that gives effective results on several problems [16]. We fixed the data representation at 20 particles to support several kinds of applications.

A performance comparison of PSO implementations on different processor platforms is given in the following table. We chose two other platforms, the Xilinx xc3s500 [17] and the Xilinx MicroBlaze soft processor core, for the sphere test function.

TABLE VII. COMPARISON OF OTHER PLATFORMS (SPHERE FUNCTION)

Platform                            | Xilinx xc3s500 | Xilinx MicroBlaze | Spartan xc3s200
Average no. of iterations (st. dev) | 338 (30.9)     | 382 (27.0)        | 420 (88.9)
Average exec. time in s (st. dev)   | 0.28 (0.03)    | 10.4 (0.65)       | 0.024 (0.001)

The random number generator plays a big role in the implementation of the two algorithms, which is why the number of iterations can differ even though the same random-generator equation and the same initial seed were used for the three tests. To evaluate the performance of our proposed PSO algorithm, we consider and compare two implementations of the PSO process: the first is our algorithm and the second runs on the Xilinx MicroBlaze processor [18].


TABLE VIII. DEVICE UTILIZATION SUMMARY OF THE PSO ALGORITHM ON SPARTAN XC3S200 AND XC3S500

Tested function (sphere) | Platform       | Number of slices | Block RAM | MULT18x18s
Other PSO algorithm [18] | Xilinx xc3s500 | 1523 (32.7%)     | 7 (35%)   | 8 (40%)
Proposed algorithm       | Xilinx xc3s200 | 225 (11%)        | 5 (41%)   | 3 (37%)

VII. CONCLUSION

In this work, we developed a hardware FPGA implementation of a Particle Swarm Optimization algorithm. The effectiveness of our PSO algorithm has been tested on several benchmark functions with various degrees of parallelism. The architecture exploits the available parallelism to update the particle positions and velocities, obtaining good fitness performance; it is controlled by a finite state machine and implemented in hardware on a Xilinx Spartan-3 (xc3s200). The FSM exploits the parallelism that makes the program converge quickly: it updates the positions and velocities of the particles, after which the best fitness can be read independently from the particle positions. The simulation results demonstrate that the states and modules can be executed concurrently, which greatly reduces the execution time.

The proposed PSO algorithm shows a favorable convergence speed compared with the other meta-heuristic algorithms. The complexity of the algorithm depends on the size of the design space, that is, on the number of allocated particles and the difficulty of the problem. The robustness of PSO is tied to its ability to balance two requirements: the amount of memory used and the processing time needed to solve complex problems.

REFERENCES

[1] P.K. Tripathi, S. Bandyopadhyay, K.S. Pal, "Multi-objective particle swarm optimization with time variant inertia and acceleration coefficients," Information Sciences 177 (22) (2007) 5033-5049.

[2] A.S. Mohais, R. Mohais, C. Ward, "Earthquake classifying neural networks trained with dynamic neighborhood PSOs," in: Proceedings of the 9th Annual Conference on Genetic and Evolutionary Computation (GECCO'07), pp. 110-117.

[3] G.S. Tewolde, D.M. Hanna, R.E. Haskell, "Multi-swarm parallel PSO: hardware implementation," in: Proceedings of the 2009 IEEE Symposium on Swarm Intelligence (2009).

[4] D.H. Lehmer, "Mathematical methods in large-scale computing units," Ann. Computing Lab. Harvard Univ., pp. 141-146, 1951.

[5] R.M. Calazan, N. Nedjah, L.M. Mourelle, "Parallel coprocessor for PSO," International Journal of High Performance Systems Architecture (4) (2011) 233-240.

[6] EDA Industry Working Groups, VHDL - Very High Speed Integrated Circuits Hardware Description Language, September 201

[7] N. Nedjah, L.S. Coelho, L.M. Mourelle, Multi-Objective Swarm Intelligent Systems - Theory & Experiences, vol. 261, Springer, Berlin (2010).

[8] A. Ratnaweera, S.K. Halgamuge, H.C. Watson, "Self-organizing hierarchical particle swarm optimizer with time-varying acceleration coefficients," IEEE Transactions on Systems, Man and Cybernetics 8 (3) (2004) 240-255.

[9] S.-A. Li, C.-C. Hsu, C.-C. Wong, C.-J. Yu, "Hardware/software co-design for particle swarm optimization algorithm," Information Sciences 181, Elsevier (2011) 4582-4596.

[10] R.M. Calazan, N. Nedjah, L.M. Mourelle, "A hardware accelerator for Particle Swarm Optimization," Applied Soft Computing, Elsevier (2013).

[11] R.M. Calazan, N. Nedjah, L.M. Mourelle, "A hardware accelerator for Particle Swarm Optimization," Applied Soft Computing (2013).

[12] P.D. Reynolds, R.W. Duren, M.L. Trumbo, R.J. Marks II, "FPGA implementation of particle swarm optimization for inversion of large neural networks," in: Proceedings of the 2005 IEEE Swarm Intelligence Symposium (2005).

[13] J. Peña, A. Upegui, "A population-oriented architecture for particle swarms," in: Second NASA/ESA Conference on Adaptive Hardware and Systems (AHS 2007), pp. 563-570.

[14] J. Peña, A. Upegui, E. Sanchez, "Particle swarm optimization with discrete recombination: an online optimizer for evolvable hardware," in: First NASA/ESA Conference on Adaptive Hardware and Systems (AHS 2006), pp. 163-170.

[15] D. Bratton, T. Blackwell, "Understanding particle swarms through simplification: a study of recombinant PSO," in: Proceedings of the 9th Annual Conference on Genetic and Evolutionary Computation (GECCO'07), pp. 2621-2627.

[16] Y. Maeda, N. Matsushita, "Simultaneous perturbation particle swarm optimization using FPGA," in: Proceedings of the International Joint Conference on Neural Networks (IJCNN 2007), pp. 2695-2700.

[17] G.S. Tewolde, D.M. Hanna, R.E. Haskell, "Multi-swarm parallel PSO: hardware implementation," in: Proceedings of the 2009 IEEE Symposium on Swarm Intelligence (SIS09), Nashville, TN, pp. 60-66.

[18] G.S. Tewolde, D.M. Hanna, R.E. Haskell, "A modular and efficient hardware architecture for particle swarm optimization algorithm," Microprocessors and Microsystems, Elsevier (2012).