Evolving binary-weights neural network using hybrid...

10

Transcript of Evolving binary-weights neural network using hybrid...

Page 1: Evolving binary-weights neural network using hybrid ...scientiairanica.sharif.edu/article_3740_7198122a57e7c3cc52ab7e03b479d0e3.pdfoptimization may stall on a di erent minimum on each

Scientia Iranica B (2015) 22(4), 1625{1634

Sharif University of TechnologyScientia Iranica

Transactions B: Mechanical Engineeringwww.scientiairanica.com

Evolving binary-weights neural network using hybridoptimization algorithm for color space conversion

C.Y. Chen�

Department of Information and Telecommunications Engineering, Ming Chuan University, Taoyuan, Taiwan, ROC.

Received 22 May 2014; accepted 11 May 2015

KEYWORDSArti�cial neuralnetworks;Neuroevolution;Evolutionaryalgorithms;PSO;Color space converter.

Abstract. Arti�cial Neural Networks (ANNs) are applied to many complex real-worldproblems, ranging from image recognition to autonomous robot control. However, to designa neural network that can implement special task, it is necessary to select an appropriatebiological neuron model, meanwhile, good learning algorithm should be adopted to achievethe expected goal. Neuroevolution is a form of machine learning that uses EvolutionaryAlgorithms (EAs) to train ANNs. EAs, for the learning algorithm used by neural networks,can provide alternative and complementary solution, which can avoid the frequentlyhappened issues of \getting stuck in local minimum" during the iteration process madeby gradient-based learning algorithms. In this paper, a method using Hybrid PSO-basedLearning Algorithm (HPLA) to evolve the connection weights and network parameters ofBinary-Weights Neural Network (BWNN) will be introduced. The extracted knowledgefrom trained BWNN can then be used to construct high-speed shift-and-add based ColorSpace Converter (CSC) hardware architecture. The experimental results in this researchalso show that the performance of implemented hardware architecture is good at RGBto YCbCr color space converting, and it also has the advantages of high-speed and low-complexity.© 2015 Sharif University of Technology. All rights reserved.

1. Introduction

Color spaces is a method by which di�erent colorscan be speci�ed, created and visualized. There aremany existing color spaces and most of them representeach color as a point in a three dimensional coordinatesystem. Di�erent color spaces have historically evolvedfor di�erent applications. RGB, YUV and YCbCr arecommon color spaces in use. YUV and YCbCr areadopted in video codec and transmission (for example,JPEG, MPEG, etc.), while RGB is adopted for display,so the conversion between them is unavoidable [1].

In many Digital Signal Processing (DSP) algo-rithms, multiplier coe�cients are oating-point num-

*. Tel.: +886-3-350-7001#3737;Fax: +886-3-359-3879E-mail address: [email protected]

bers, rational numbers, or real numbers. However,when implementing DSP algorithms, �xed-point coe�-cient operations are often preferred over oating-pointdue to lower complexity and power consumption [2].When we are to realize Color Space Converter (CSC)from RGB to YCbCr, we will meet two dilemmas in thehardware circuit design; �rst, the conversion formulacontains many oating-point operations; next, in theconversion formula, many multiplication operationswill be met. If multiplier is used in the hardwarearchitecture to realize the multiplication operationfunction, then massive operation time and hardwarearea will have to be consumed. In order to solve bade�ciency problem of traditional oating-point basedCSC architecture, when it is used for color conversion,there are lots of di�erent solutions proposed by scholarsin the academy [1,3,4]. The LUT method is one ofthe high e�cient methods especially for embedded

Page 2: Evolving binary-weights neural network using hybrid ...scientiairanica.sharif.edu/article_3740_7198122a57e7c3cc52ab7e03b479d0e3.pdfoptimization may stall on a di erent minimum on each

1626 C.Y. Chen/Scientia Iranica, Transactions B: Mechanical Engineering 22 (2015) 1625{1634

systems [3]. In Ref. [4], Li et al. have proposeda method using Genetic Algorithm (GA) to evolveautomatically shift-and-add based CSC. Such methodis simple and e�ective, and it can also get high-speedCSC architecture. However, in its design process, onlyimage quality is considered instead of considering thehardware resource consumed, hence, the number ofadder used will be too much.

ANNs have been fruitfully used in a variety of�elds, including economics, �nance, meteorology, andengineering. Constructing neural networks involvesdi�cult optimization problems, such as �nding a net-work architecture appropriate for the application athand, and �nding an optimal set of weight values forthe network to solve the problems [5]. The trainingprocess of ANNs consists of two tasks, the �rst oneis the selection of an appropriate architecture for theproblem, and the second is the adjustment of theconnection weights of the network. However, ANNshave their own limitations. They are easy to fallinto local minimum and sometimes hard to adjust thearchitecture. When there are many local minima,a neural network using a hill-climbing algorithm foroptimization may stall on a di�erent minimum on eachrun of the network.

Neuroevolution is a method for modifying neuralnetwork weights, topologies in order to learn a speci�ctask. Various evolutionary computation methods, suchas GA, Genetic Programming (GP) and EvolutionStrategies (ES) etc. could be applied to evolve a neuralnetwork on its topological architecture, connectionweights and activation functions, or even groups oflearning rules [6]. With the development of computertechnology, these methods have already achieved hugeprogress and exhibited a merging tendency [7,8]. GAhas been used to solve each of these optimizationproblems. In weight optimization, the set of weightsis represented as a chromosome, and a genetic search

is applied to the encoded representation to �nd aset of weights that best �ts the training data [5].Koza provides an alternative method to representingANNs [9], under the framework of GP, which enablesmodi�cation of not only the weights, but also thearchitecture for a neural network. In [10], the authorsuse ES to learn the weights of the neural networkinstead of learning the method of the network.

Among lots of computing techniques, ParticleSwarm Optimization (PSO) has characteristics, suchas less parameters setting, robustness and high conver-gence speed. It is inspired by insect swarms and hasproven to be a competitor to GA when it comes tooptimization problems. Successful applications of PSOto some optimization problems, such as function mini-mization and ANNs design [11-13]. However, the mainproblem with PSO is that it prematurely converges [14-15] to stable point, which is not necessarily minimum.

In this study, a method using neuroevolutionstrategy to realize multiplierless CSC architecture willbe introduced. The complementary characteristicsbetween PSO and GA are used to construct high e�-ciency hybrid evolutionary learning algorithm, which isthen used to train BWNN (shown in Figure 1). AfterBWNN training is �nished, the extracted knowledgefrom neural network can then be used to constructhigh-speed shift-and-add based CSC circuit hardwarearchitecture. In the proposed hybrid learning algo-rithm, the elite strategy has been adopted to retainsome particles with superior �tness in the populationbut sort out particles with bad �tness; moreover, theones with better �tness are selected to reorganize anddisturb their encoding strings through crossover andmutation procedures so as to replace the particlessorted out in the population. Through this method,diversi�cation of particles in the population can beenhanced, and the premature convergence of PSO canbe avoided too.

Figure 1. Evolving connection weights in a binary-weights neural network.

Page 3: Evolving binary-weights neural network using hybrid ...scientiairanica.sharif.edu/article_3740_7198122a57e7c3cc52ab7e03b479d0e3.pdfoptimization may stall on a di erent minimum on each

C.Y. Chen/Scientia Iranica, Transactions B: Mechanical Engineering 22 (2015) 1625{1634 1627

The composition of the rest of the paper is asfollows. A review for the ANNs and an introduction forBWNN are given in Section 2. Section 3 is concernedwith the color conversion formulas and the descriptionof the proposed method. Then the multiplierlessCSC implementation and the results and analysis arepresented in Section 4. Finally, Section 5 concludes thepaper.

2. Binary-weights neural network

ANNs are parallel computing machines that learn fromexamples and by interaction instead of by program-ming. They are inspired by biology, and composedof elements with functionality similar to a biologicalneuron. The ANNs can be made in many di�erentways and can try to mimic the brain in many di�erentways. Basically, ANNs are widely used in functionalapproximation and pattern classi�cation applicationsdue to their capability for modeling complex and highlynonlinear functions.

A multilayer neural network consists of a layerof input units, one or more layers of hidden units,and one output layer of units. Each connectionbetween nodes has a weight associated with it. Thelearning process in multilayer neural network is usuallyimplemented by training data set, because the learningprocess is achieved by iteratively adjusting the connec-tion weights. The error calculations used to train amultilayer neural network are very important. Likeleast square, the sum-of-squared error is calculatedby looking at the squared di�erence between whatthe network's output for training data and the targetvalue. Formally the equation is the same as one-halfthe traditional least squares error. For a given set ofinput vectors fxtg, t = 1; 2; :::; N and desired vectorsfdtg, the error function is as follows:

E(w) =12

NXt=1

�y(xt; w)� dt�2 ; (1)

where N is the total number of training cased, y(xt; w)is the observed output for the tth training case, andthe dt is the target output for that case.

BWNN introduced in this paper is of three-layerarchitecture, which is, respectively, input layer, hiddenlayer and output layer, as shown in Figure 2. Theweights of BWNN are only formed by binary digitsof 0 and 1. Input layer receives the input data, andthen the data is processed by hidden layer and sent tooutput layer for �nal result output.

The BWNN can be represented mathematicallyby the following equations:

nethj (x) =pXi=1

whji:xi; j = 0; 1; :::;m; whji 2 f0; 1g(2)

Figure 2. The proposed architecture of BWNN.

yj(x) = f�nethj (x)

�= f

pXi=1

whji:xi

!=

pXi=1

whji:xi

!� j; (3)

Y (x) = f

0@Xj

yj(x)

1A=

0@0@(mXj=0

pXi=1

whji:xi)� j

1A� n

1A+B; (4)

where p and m are two constants set by the user, wjidenotes a weight from the input to the hidden layer,and B is a bias.

3. Using HPLA-based BWNN for CSC design

3.1. RGB to YCbCr conversionThe Red, Green and Blue (RGB) color space is widelyused throughout computer graphics. The image inRGB color space is not suitable for image compressionapplications, because the image in RGB color space ishighly correlated. In the YCbCr color space, imagedata consists of three components: luminance (Y ),blueness (Cb), and redness (Cr). The �rst component,luminance, represents the intensity of the image, whilethe Cb and Cr components called chrominance indicatehow much blue and red is used, respectively. TheYCbCr color space is a scaled and an o�set version ofthe YUV color space. Y is de�ned to have a range of16-235; Cb and Cr are de�ned to have a nominal rangeof 16-240 [16]. The conversion from RGB to Y CbCr isgiven by:

Page 4: Evolving binary-weights neural network using hybrid ...scientiairanica.sharif.edu/article_3740_7198122a57e7c3cc52ab7e03b479d0e3.pdfoptimization may stall on a di erent minimum on each

1628 C.Y. Chen/Scientia Iranica, Transactions B: Mechanical Engineering 22 (2015) 1625{1634

24 YCbCr

35 =

24 0:257 0:504 0:098�0:148 �0:291 0:4390:439 �0:368 �0:071

35� 24RGB

35+

24 16128128

35 : (5)

3.2. Hybrid PSO-based Learning Algorithm(HPLA)

3.2.1. PSO basicsPSO is an evolutionary computation technique devel-oped by Kenney and Eberhart in 1995 [17]. Themethod has been developed through a simulation ofsimpli�ed social models. PSO is based on swarms,such as �sh schooling and bird ocking. Accordingto the research results for bird ocking, birds �ndfood by ocking (not by each individual). It mustalso have a �tness evaluation function that takes theparticle's position and assigns to it a �tness value. Theposition with the highest �tness value in the entirerun is called the global best (G). Each particle alsokeeps track of its highest �tness value. The locationof this value is called its personal best (Pi). The basicalgorithm involves casting a population of particles overthe searching space, remembering the best (most �t)solution encountered. At each iteration, every particleadjusts its velocity vector, based on its momentum andthe in uence of both its best solution and the bestsolution of its neighbors, and then computes a newpoint to examine. The studies shows that the PSO hasmore chance to \ y" into the better solution areas morequickly, so it can discover a reasonable quality solutionmuch faster than other evolutionary algorithms. Theoriginal PSO formulate is described by the followingequations [17]:

Vi;n(t+ 1) =�:Vi;n(t) + r1:rand1: (Pi;n(t)� Li;n(t))

+ r2:rand2: (Gn(t)� Li;n(t)) ; (6)

Li;n(t+ 1) = Li;n(t) + Vi;n(t+ 1); (7)

where n is the dimensional number, i denotes the ithparticle in the population, V is the velocity vector, L isthe position vector and � is the inertia factor. r1 and r2are the cognitive and social learning rates, respectively.

3.2.2. Real-coded GAA genetic or evolutionary algorithm applies the prin-ciples of evolution found in nature to the problem of�nding an optimal solution to a solver problem. GAwas initially introduced by John Holland in seventies asa special technique for function optimization [18]. It isthe most widely known type of evolutionary computa-tion methods [4,19,20]. The basic operators used in GAconsist of selection, crossover, and mutation. GA pa-rameters (such as population size, crossover probability

(Pc), and mutation probability (Pm) etc.) interact incomplex ways. There is evidence showing that theprobabilities of crossover and mutation are critical tothe success of GA [21]. Traditionally, determining whatprobabilities of crossover and mutation should be usedis usually done by means of experimental methods.

An individual or solution to the problem to besolved is represented by a list of parameters calledchromosome or genome. Genes and chromosomes arethe basic building blocks of the GA. The conventionalstandard GA encodes the optimization parameters intobinary code string. The strings are combinations of0 s and 1 s, which represent the value of a number.During each generation, the chromosomes are evalu-ated by a �tness function, which is a measure for thequality of the solution. To create the next generation,the individuals with higher �tness value will havehigher probability of being selected as candidates forthe next generation. In addition, new chromosomes(called o�spring) are formed by either merging twochromosomes from the current generation using thecrossover operator and modifying a chromosome usingthe mutation operator.

To reduce computing time, several researchershave tried to use real-coded values instead of binary bitstrings for implementing chromosome encoding. Therole and behavior of genetic operators in real-codedGA are fundamentally di�erent from those in binaryencodings, although motivation of the operators andthe framework of GA are similar. It has been con�rmedto have better performance than either binary or grayencoding for constrained optimization problems [22-24]. The crossover and mutation operator of real-codedGA are described as follows.

Crossover operators. A crossover is a processwhere new individuals are created from the informationcontained within the parents. It is the main geneticoperator. The crossover operator of a real-codedGA is constructed by borrowing the concept of linearcombination of strings from the area of convex settheory, where the weighted average of two strings(chromosomes) is calculated as follows [25-26]:

x0i = Uniform(0; 1):xi + (1�Uniform(0; 1)) :xi+1;(8)

x0i+1 = (1�Uniform(0; 1)) :xi + Uni form(0; 1):xi+1:(9)

Mutation operators. Random changes are generatedto the population in mutation. Mutation can createa new genetic material in the population to maintainthe population diversity. This prevents the populationfrom converging towards a local optimum. For a pointxk, the mutated solution is as follows:

Page 5: Evolving binary-weights neural network using hybrid ...scientiairanica.sharif.edu/article_3740_7198122a57e7c3cc52ab7e03b479d0e3.pdfoptimization may stall on a di erent minimum on each

C.Y. Chen/Scientia Iranica, Transactions B: Mechanical Engineering 22 (2015) 1625{1634 1629

Figure 3. The ow chart of HPLA.

x0k = xk + rand�N(0; 1): (10)

3.2.3. Implementation of HPLA algorithmIn the introduced hybrid learning algorithm, when allthe particles follow Eqs. (6) and (7) to �nish velocitiesand positions update, we will then retain particles withbetter �tness in the population, then through crossoverand mutation operators by using Eqs. (8)-(10), a lot ofnew particle strings are then generated so as to replaceparticles sorted out due to bad �tness. By doing so, thesize of population is then retained unchanged. Figure 3shows the ow chart of HPLA.

3.3. HPLA-based BWNN for CSC designFigure 4 is the architectural diagram showing the useof HPLA-based BWNN to realize multiplierless CSC.Taking RGB to Y as example, the input value ofBWNN should be (R;G;B), and the output valueshould be Y . If the initial solution of PSO isf�1; �2; �3; ng, then after �nishing the training ofBWNN, the parameters set R = fwji; ng extractedfrom BWNN can then be used to construct multi-plierless CSC architecture expected to be obtained.Similarly, as long as the same procedures are followed,hardware architectures such as RGB to Cb and RGBto Cr can be obtained too.

Figure 4. Multiplierless CSC design using HPLA-basedBWNN method.

3.3.1. Encoding strategy of particlesIn our proposed learning algorithm, each particle isa string encoded by four real number values. Whenparticle string is sent to �tness function to do �tnessvalue evaluation, the �rst three real number values inparticle string will all be converted into binary stings of�xed length to be used as weights for connecting inputlayer and hidden layer in BWNN. The last real numbervalue is then the number of bits to shift right for theoutput value of the output neuron. Figure 5 shows theexample of the encoding of a particle.

3.3.2. Fitness function evaluationWhen too many adders are used, more hardwareresources will be consumed during circuit synthesis,hence, in this study, two indexes such as image qualityand hardware resource consumed will be considered atthe same time, and the designed �tness function is asshown in Eq. (11):

=

��

NXi=1

jfi;desired�fi;obtainedj+� �NAdd

!�1

;(11)

where N denotes the total number of training patterns,NADD denotes the number of adders, parameters �and � are constant between 0 and 1, and fi;obtainedis calculated using Eq. (5).

4. Performance evaluation

In this research, the required parameter set setting ofthe HPLA-based BWNN is as follows: Pop size = 40,M = 10, maxgen =100, Pc = 0:8, Pm = 0:1, � =0:005, � = 1, m = 9, and the bias of BWNN B 2f16; 128g. The search space of �1, �2, and �3 is in therange of [0, 1023], and the search space of n is in therange of [0, 31]. The training data set used by HPLA-based BWNN consists of 729 data patterns which arethe sub-sampled data in the RGB color space. Thedesigned architectures were described in Verilog-HDLand synthesized for Altera Cyclone II EP2C70 FPGA,with the aid of the tool Quartus II.

Page 6: Evolving binary-weights neural network using hybrid ...scientiairanica.sharif.edu/article_3740_7198122a57e7c3cc52ab7e03b479d0e3.pdfoptimization may stall on a di erent minimum on each

1630 C.Y. Chen/Scientia Iranica, Transactions B: Mechanical Engineering 22 (2015) 1625{1634

Figure 5. A coded particle string.

When BWNN that is responsible for the imple-mentation of color space conversion, such as RGB toY , RGB to Cb, and RGB to Cr, has been �nishedwith training, the knowledge and related parametersextracted from black boxes are shown respectively asin Eqs. (12)-(14):

RGB to Y :

wTji =

240 0 0 0 0 0 0 0 0 10 0 1 1 1 1 1 1 1 10 0 0 0 0 1 1 1 0 0

35 ;xi =

24RGB

35 ; n = 11; B = 16 (12)

RGB to Cb:

wTji =

240 0 1 1 0 1 0 0 1 00 0 0 0 1 0 1 0 0 10 0 1 0 0 0 0 1 1 1

35 ;xi =

24�R�GB

35 ; n = 11; B = 128 (13)

RGB to Cr:

wTji =

241 0 0 0 1 0 0 1 1 10 0 0 0 0 0 0 0 1 10 0 0 0 1 0 0 1 0 0

35 ;

xi =

24 R�G�B

35 ; n = 11; B = 128 (14)

Eq. (15) shows the multiplierless CSC architecture fromBWNN framework introduced in this research:

Y =�

((R� 9) + (G� 2) + (G� 3)

+ (G� 4) + (G� 5) + (G� 6)

+ (G� 7) + (G� 8) + (G� 9)

+ (B � 5) + (B � 6) + (B � 7))

� 11�

+ 16;

Cb =�

(�(R� 2)� (R� 3)� (R� 5)

� (R� 8)� (G� 4)� (G� 6)

� (G� 9) + (B � 2) + (B � 7)

+ (B � 8) + (B � 9))� 11�

+ 128;

Page 7: Evolving binary-weights neural network using hybrid ...scientiairanica.sharif.edu/article_3740_7198122a57e7c3cc52ab7e03b479d0e3.pdfoptimization may stall on a di erent minimum on each

C.Y. Chen/Scientia Iranica, Transactions B: Mechanical Engineering 22 (2015) 1625{1634 1631

Cr =�

((R) + (R� 4) + (R� 7) + (R� 8)

+ (R� 9)� (G� 8)� (G� 9)

� (B � 4)� (B � 7))� 11�

+ 128: (15)

In order to enhance the processing e�ciency of thementioned CSC architecture on color space conversion,in this study, how to perform pipelined design on theelectronic circuit architecture, described in Eq. (15),will be further described too. The concept of pipelineddesign has been widely applied in all kinds of engi-neering �elds; for example, the production line of theautomobile plant is operated based on the conceptsimilar to pipeline processing. For the hardware circuit,the application of pipeline architecture is similar tothe production line in the automobile plant, and thedata in massive processing is similar to the objectsin the production line of a plant. Moreover, severalfunctional units with special functions, for example,sub-circuits, such as adder and subtractor, are usedto perform data processing. The processed data arecoordinated and integrated through register, and thenare sent to the next stage for subsequent processing.According to Eq. (15), the circuit representation canbe designed into pipeline architecture, which is shownin Figures 6-8. All these color conversion circuitsare divided into �ve stages, and the sub-circuit ineach stage is formed by signed adder; each stage

Table 1. Comparison of the time needed for processing512�512 size color image using Nios II C/C++ language.

Method Time cost(s)

Clock frequency(MHz)

Traditional CSC 54 100Proposed CSC 1.2 100

is set up with pipeline register at the data outputlocation of the path to store temporarily the processeddata.

Here, the calculation time needed for severaldi�erent CSC architectures proposed in this paper inprocessing single image will be compared so as tounderstand the performance of the proposed methodsin performing color conversion. The related comparisonresults are as shown in Table 1. When we are executingtraditional oating-point based CSC software programsin Nios II processor, the color conversion processingtime needed for single 512 � 512 size image was 54 s.However, if the proposed CSC architecture is usedinstead to implement the same conversion task, thetime needed will be only 1.2 s. Therefore, it is clearthat the CSC architecture realized in this study has avery good performance, which is su�cient to deal withthe need of real-time image processing. In addition tothat, as compared to the CSC architecture used andproposed by Li et al. in 2013 [4], which used 47 adders,the proposed CSC architecture only needs 32 adders.This shows signi�cant di�erence in the consumption ofhardware resource in both cases.

Figure 6. The pipelined architecture for RGB to Y .

Page 8: Evolving binary-weights neural network using hybrid ...scientiairanica.sharif.edu/article_3740_7198122a57e7c3cc52ab7e03b479d0e3.pdfoptimization may stall on a di erent minimum on each

1632 C.Y. Chen/Scientia Iranica, Transactions B: Mechanical Engineering 22 (2015) 1625{1634

Figure 7. The pipelined architecture for RGB to Cb.

Figure 8. The pipelined architecture for RGB to Cr.

When we further use Verilog-HDL to describecircuit architectures, such as Figures 6-8, and downloadthem to Cyclone II FPGA chip, the performance dataof hardware circuit is then as shown in Table 2. Fromthe data listed in the table, it can be seen that thehardware architecture realized in this research can havea maximum operating frequency of 306 MHz in CycloneII FPGA chip, which shows better performance ascompared to the CSC circuit proposed by Bensaali andAmira [27], CAST. Inc. [28], ALMA. Tech. [29], andAmphion Ltd. [30], hence, it can be proved that the

Table 2. Performance comparison with existing CSChardware architecture.

Design parameters Speed(Mhz)

Proposed CSC core 306

Bensaali and Amira [27] 234

CAST. Inc. [28] 112

ALMA. Tech. [29] 105

Amphion Ltd. [30] 90

Page 9: Evolving binary-weights neural network using hybrid ...scientiairanica.sharif.edu/article_3740_7198122a57e7c3cc52ab7e03b479d0e3.pdfoptimization may stall on a di erent minimum on each

C.Y. Chen/Scientia Iranica, Transactions B: Mechanical Engineering 22 (2015) 1625{1634 1633

Table 3. A comparison between color space conversion result and related performance for test images.

Computation timeTest image Conversion results RMSE Nios II Proposed CSC core

YCbCr

0.34410.72220.1623

1.2 s 0.856 ms

YCbCr

0.78030.53480.3991

1.2 s 0.856 ms

YCbCr

0.47930.54830.2877

1.2 s 0.856 ms

above CSC circuit architecture has, in fact, excellentcharacteristic and practical value.

Next, three 512�512 color images which arefrequently used for image quality test in the academywere used to perform the experiencement so as tocompare the execution speed and image quality, whichare as shown in Table 3. From Table 3, it is clearthat when hardware method is used to realize CSC, theprocessing speed will be far faster than that realizedby Nios II C/C++ software design method; in otherwords, the color conversion task of the image can beindeed processed in real time. In addition, RMSE indexin the table shows that the proposed CSC can retaingood image quality while doing color space conversion,and the error between it and the result obtained fromEq. (5) is relatively small (which is about in the range of0.2�0.8), hence, in real application, almost no in uencewill be seen. The calculation of RMSE is shown inEq. (16):

RMSE =vuut1=(NM)N�1Xi=0

M�1Xj=0

(fTemplate(i; j)�fobtained(i; j))2;(16)

where M and N are the number of rows and columnsof the image.

5. Conclusion

In this paper, a method using HPLA-based BWNN todesign high-speed shift-and-add based CSC is intro-

duced. In the learning strategy, we have introducedtwo consideration factors of RMSE and adder numberat the same time, hence, the realized circuit can,under the minimal hardware consumption, still retaina very good color conversion quality. In addition, theproposed CSC architecture is pretty simple and hasvery high execution speed; hence, it is very suitableto be integrated into all kinds of image applicationsystems that need real time processing for taking careof the task of fast color space conversion.

Acknowledgments

This research was supported by the National ScienceCouncil of the Republic of China under Contract No.NSC 102-2221-E-130 -021.

References

1. Yang, Y., Peng, Y. and Liu, Z. \A fast algorithm forY CbCr to RGB conversion", IEEE Trans. ConsumerElectron., 53(4), pp. 1490-1493 (2007).

2. Gustafsson, O. and Qureshi, F. \Addition aware quan-tization for low complexity and high precision constantmultiplication", IEEE Signal Process. Lett., 17(2), pp.173-176 (2010).

3. Webb, J.L.H. \E�cient table access form reversiblevariable-length decoding", IEEE Trans. Circuits Syst.Video Technol., 11(8), pp. 981-985 (2001).

4. Li, S.A., Chen, C.Y. and Chen, C.H. \Design of a shift-and-add based hardware accelerator for color spaceconversion", J. Real Time Image Process (2013). doi:10.1007/s11554-013-0324-7

Page 10: Evolving binary-weights neural network using hybrid ...scientiairanica.sharif.edu/article_3740_7198122a57e7c3cc52ab7e03b479d0e3.pdfoptimization may stall on a di erent minimum on each

1634 C.Y. Chen/Scientia Iranica, Transactions B: Mechanical Engineering 22 (2015) 1625{1634

5. Zhang, B.T. and Muhenbein, H. \Evolving optimalneural networks using genetic algorithms with Occam'srazor", Complex Syst., 7(3), pp. 199-220 (1993).

6. Li, S., Yuan, J., Yue, X. and Luo, J. \The binary-weights neural network for robot control", Proceedingsof 2010 3rd RAS & EMBS, pp. 765-770 (2010).

7. Palmes, P.P., Hayasaka, T.C. and Usui, S. \Mutation-based genetic neural network", IEEE Trans. NeuralNetworks, 16(3), pp. 0587-600 (2005).

8. Yao, X. and Liu, Y. \A new evolutionary systemfor evolving arti�cial neural networks", IEEE Trans.Neural Networks, 8(3), pp. 694-713 (1997).

9. Koza, J.R., Genetic Programming: On the Program-ming of Computers by Means of Neural Selection, MITPress (1992).

10. Berlanga, A., Isasi, P., Sanchis, A. and Molina, J.M.\Neural networks robot controller trained with evolu-tion strategies", Proceedings of the 1999 Congress onEvolutionary Computation, pp. 413-419 (1999).

11. Ratnaweera, A., Saman, K. and Watson, H.C. \Self-organizing hierarchical particle swarm optimizer withtime-varying acceleration coe�cients", IEEE Trans.Evol. Comput., 8(3), pp. 240-255 (2004).

12. Juang, C.F. \A hybrid genetic algorithm and particleswarm optimization for recurrent network design",IEEE Trans. Syst. Man Cybernet., 32, pp. 997-1006(2004).

13. Da, Y. and Ge, X.R. \An improved PSO-based ANNwith simulated annealing technique", Neurocomput.Lett., 63, pp. 527-533 (2005).

14. Van den Bergh, F. and Engelbrecht, A.P. \A coopera-tive approach to particle swarm optimization", IEEETrans. Evol. Comput., 8(3), pp. 225-239 (2004).

15. Premalatha, K. and Natarajan, A.M. \Hybrid PSOand GA models for document clustering", Int. J.Advance. Soft Comput. Appl., 2(3), pp. 1-19 (2010).

16. Hsu, R.L., Abdel-Mottaleb, M. and Jain, A.K. \Facedetection in color images", IEEE Trans. Pattern Anal.Mach. Intell., 24(5), pp. 696-707 (2002).

17. Eberhart, R. and Kennedy, J. \Particle swarm opti-mization", Proceedings of IEEE International Confer-ence on Neural Network, pp. 1942-1948 (1995).

18. Holland, J.H., Adaptation in Natural and Arti�cialSystems, University of Michigan Press, Ann Arbor, MI,USA (1975).

19. Back, T., Evolutionary Algorithm: Comparisons ofApproaches, Chapman and Hall, Cambridge, UK(1994).

20. Page, A.J. and Naughton, T.J. \Framework for taskscheduling in heterogeneous distributed computing

using genetic algorithms", Artif. Intell. Rev., 24(3-4),pp. 415-429 (2005).

21. Ochoa, G., Harvey, I. and Buxton, H. \On recombi-nation and optimal mutation rates", Proceedings ofGenetic and Evolutionary Computation Conference,pp. 488-495 (1999).

22. Lee, K.Y. and Mohamed, P.S. \A Real-coded geneticalgorithm involving a hybrid crossover method forpower plant control system design", Proceedings ofthe 2002 Congress on Evolutionary Computation, pp.1069-1074 (2002).

23. Yalcinoz, T. and Altun, H. \Power economic dispatchusing a hybrid genetic algorithm", IEEE Power Eng.Rev., 21(3), pp. 59-60 (2001).

24. Ramos, R.M., Saldanha, R.R., Takahashi, R.H.C. andMoreira, F.J.S. \The real-biased multiobjective geneticalgorithm and its application to the design of wireantennas", IEEE Trans. Magn., 39(3), pp. 1329-1332(2003).

25. Kaelo, P. and Ali, M.M. \Integrated crossover rulesin real coded genetic algorithms", Eur. J. Oper. Res.,176(1), pp. 60-76 (2007).

26. Ripon, K.S.N., Kwong, S. and Man, K.F. \A real-coding jumping gene genetic algorithm (RJGGA)for multiobjective optimization", Inform. Sciences,177(2), pp. 632-654 (2007).

27. Bensaali, F. and Amira, A. \Design and e�cientFPGA implementation of an RGB to YCrCb colorspace converter using distributed arithmetic", Int. J.Graph. Vis. Image Process., 5(1), pp. 37-47 (2004).

28. Application Note: CSC color space converter, CAST,Inc., http://www.cast-inc.com (2002).

29. Datasheet: High performance color space converter,ALMA Technologies, http://www.alma-tech.com(2002).

30. Datasheet: Color space converters, Amphion semicon-ductor Ltd, DS6400 V1.1, http://www.amphion.com(2002).

Biography

Ching-Yi Chen received his PhD degree fromTamkang University, New Taipei City, Taiwan, in 2006.He joined the Department of Information and Telecom-munications Engineering at Ming Chuan Universityin 2007 as an Assistant Professor. Currently, he isan Associate Professor at the department. His mainresearch interests include swarm intelligence, patternrecognition, and embedded systems.