
Power load forecasts based on hybrid PSO with Gaussian and adaptive mutation and Wv-SVM

    Qi Wu *

    Key Laboratory of Measurement and Control of CSE (School of Automation, Southeast University), Ministry of Education, Nanjing, Jiangsu 210096, China

    School of Mechanical Engineering, Southeast University, Nanjing, Jiangsu 210096, China

Article info

    Keywords:

    Load forecasts

    Wv-SVM

    Particle swarm optimization

    Adaptive mutation

    Gaussian mutation

Abstract

This paper presents a new load forecasting model based on hybrid particle swarm optimization with Gaussian and adaptive mutation (HAGPSO) and the wavelet ν-support vector machine (Wv-SVM). First, it is proved that a mother wavelet function can build a complete basis through horizontal floating (translation) and thereby form a wavelet kernel function; Wv-SVM with this wavelet kernel is then proposed. Second, to overcome the disadvantages of the standard PSO, HAGPSO is proposed to seek the optimal parameters of Wv-SVM. Finally, a load forecasting model based on HAGPSO and Wv-SVM is constructed. The results of an application to load forecasting show that the proposed model is effective and feasible.

© 2009 Elsevier Ltd. All rights reserved.

    1. Introduction

The theoretical study of load forecasting for power systems started in the middle of the last century, simultaneously with the flourishing of system identification, modern control theory and related fields. Before that, because the scales of power systems were limited and there were many uncertain factors, the study of load forecasting had not taken shape. It was not until the 1980s that the theoretical study of mid-to-long-term load forecasting began to emerge, and a series of forecasting methods, such as the AR, MA, general exponential smoothing, ARMA and ARIMA algorithms, were successively developed and are widely accepted in power-system load forecasting at present (Chenhui, 1987). With the development of grey systems, artificial neural networks, expert systems, genetic algorithms (Wu, Yan, & Yang, 2008a) and other theories and methods, mid-to-long-term load forecasting of power systems has continuously improved (Benaouda, Murtagh, Starck, & Renaud, 2006; Liang, 1997; Santos, Martins, & Pires, 2007; Topalli, Erkmen, & Topalli, 2006; Ying & Pan, 2008). In general, most of the algorithms above are based on time series.

Recently, the SVM developed by Vapnik (1995) has received increasing attention, with remarkable results in the field of load forecasting (Hong, 2009; Pai & Hong, 2005; Wu, Tzeng, & Lin, 2009). The main difference between neural networks (NN) and SVM is the principle of risk minimization. An NN implements empirical risk minimization (ERM) to minimize the error on the training data, while SVM implements the principle of structural risk minimization (SRM) by constructing an optimal separating hyperplane in the hidden feature space and using quadratic programming to find a unique solution. SVM has yielded generalization performance significantly better than that of competing methods in load forecasting (Hong, 2009; Pai & Hong, 2005; Wu, Tzeng et al., 2009). However, with the kernel functions used so far, the SVM cannot approximate every curve in the space L²(Rⁿ) (the quadratic continuous integral space), because these kernel functions do not form a complete orthonormal basis. For the same reason, the regression SVM cannot approximate every function. Therefore we need a new kernel function, one that can build a complete basis through horizontal floating (translation) and flexing (dilation). As we know, such functions already exist: they are the wavelet functions. An SVM with a wavelet kernel function is called a wavelet SVM (WSVM). Reviewing the load forecasting literature on support vector machine techniques (Hong, 2009; Pai & Hong, 2005; Wu, Tzeng et al., 2009), little has been written on the application of Wv-SVM to load forecasting.

However, the confirmation of the unknown parameters of Wv-SVM is a complicated process. In fact, it is a multivariable optimization problem in a continuous space. An appropriate parameter combination enhances the degree to which the model approximates the original series, and these unknown parameters have a great effect on the generalization performance of Wv-SVM. Therefore, it is necessary to select an evolutionary algorithm to seek the optimal parameters of Wv-SVM. Particle swarm optimization (PSO), an evolutionary computation technique developed by Kennedy and Eberhart (1995), is considered an excellent technique for solving such combinatorial optimization problems (Lin, Ying, Chen, & Lee, 2008; Shen, Shi, Kong, & Ye, 2007; Wu, Liu, Xiong, & Liu, 2009; Wu, Yan, & Wang, 2009; Wu, 2009; Wu, Yan, & Yang, 2008b; Wu & Yan, 2009, in press; Yuan & Chu, 2007; Yang, Yuan, Yuan, & Mao, 2007; Zhao & Yang, 2009).

0957-4174/$ - see front matter © 2009 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2009.05.011

* Tel.: +86 25 51166581; fax: +86 25 511665260.
E-mail address: [email protected]

Expert Systems with Applications 37 (2010) 194–201
Contents lists available at ScienceDirect
Expert Systems with Applications
journal homepage: www.elsevier.com/locate/eswa

PSO is based on the metaphor of social interaction and communication, such as bird flocking. The original PSO is distinctly different from other evolutionary methods in that it does not use filtering operations (such as crossover and mutation), and the members of the entire population are maintained throughout the search procedure, so that information is socially shared among individuals to direct the search towards the best position in the search space. One of the major drawbacks of the standard PSO is its premature convergence. To overcome this shortcoming, many reported works have focused on modifying PSO (Lin et al., 2008; Shen et al., 2007; Wu et al., 2008b; Yuan & Chu, 2007; Zhao & Yang, 2009) to solve the parameter selection problems of SVM, but little attention has been given to Wv-SVM. Hence, a hybrid PSO with adaptive mutation and Gaussian mutation (HAGPSO) is proposed in this paper to optimize the parameters of Wv-SVM.

Based on the above analysis, a new load forecasting model based on HAGPSO and Wv-SVM is proposed. Its superiority over the traditional model is verified by numerical simulation. The rest of this paper is organized as follows. Section 2 introduces Wv-SVM. HAGPSO is described in Section 3. In Section 4 the steps of HAGPSO and the forecasting method are described. Section 5 gives the experimental simulation and results. Conclusions are drawn at the end.

2. Wavelet ν-support vector machine (Wv-SVM)

2.1. Wavelet kernel theory

Let us consider a set of data points (x₁, y₁), (x₂, y₂), …, (x_l, y_l), which are independently and randomly generated from an unknown function. Specifically, x_i is a column vector of attributes, y_i is a scalar representing the dependent variable, and l denotes the number of data points in the training set.

A support vector kernel function can be not only a dot-product kernel, K(x, x′) = K(⟨x · x′⟩), but also a horizontal floating (translation-invariant) kernel, K(x, x′) = K(x − x′). In fact, any function satisfying Mercer's condition is an allowable support vector kernel function.

Lemma 1 (Mercer, 1909). The symmetric function K(x, x′) is a kernel function of SVM if and only if, for every function g ≠ 0 satisfying $\int_{\mathbb{R}^n} g^2(\xi)\,d\xi < \infty$, the following condition holds:

$$\iint K(x, x')\,g(x)\,g(x')\,dx\,dx' \ge 0, \quad x, x' \in \mathbb{R}^n \qquad (1)$$

This theorem provides a simple method to build kernel functions.

For a horizontal floating function, however, it is hard to decompose the kernel into a product of two identical functions, so we instead give a direct condition for a horizontal floating kernel function.

Lemma 2 (Smola and Scholkopf, 1998). A horizontal floating function K(x − x′) is an allowable support vector kernel function if and only if the Fourier transform of K(x) satisfies:

$$F[K](\omega) = (2\pi)^{-n/2}\int_{\mathbb{R}^n} \exp(-j(\omega \cdot x))\,K(x)\,dx \ge 0, \quad \omega \in \mathbb{R}^n \qquad (2)$$

Suppose the wavelet function ψ(x) satisfies the conditions ψ(x) ∈ L²(R) ∩ L¹(R) and ψ̂(0) = 0, where ψ̂ is the Fourier transform of ψ(x). The wavelet function group can be defined as:

$$\psi_{a,m}(x) = a^{-1/2}\,\psi\!\left(\frac{x - m}{a}\right), \quad x \in \mathbb{R} \qquad (3)$$

where a is the so-called scaling parameter, m is the horizontal floating (translation) coefficient, and ψ(x) is called the "mother wavelet". The translation parameter m ∈ R and the dilation a > 0 may be continuous or discrete. For a function f(x) ∈ L²(R), the wavelet transform of f(x) can be defined as:

$$W(a, m) = a^{-1/2}\int_{-\infty}^{+\infty} f(x)\,\bar{\psi}\!\left(\frac{x - m}{a}\right) dx \qquad (4)$$

where ψ̄(x) stands for the complex conjugate of ψ(x).

The wavelet transform W(a, m) can be considered as a function of the translation m at each scale a. Eq. (4) indicates that wavelet analysis is a time-frequency (or time-scale) analysis. Unlike the Short-Time Fourier Transform (STFT), the wavelet transform can perform multi-scale analysis of a signal through dilation and translation, so it can extract the time-frequency features of a signal effectively.
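The wavelet transform of Eq. (4) can be sketched numerically by replacing the integral with a Riemann sum over a sampled signal. The helper below is our own illustration (not the paper's code, which was written in Matlab) and assumes the real-valued Morlet mother wavelet that the paper adopts later in Eq. (10):

```python
import math

def morlet(u):
    # Real-valued Morlet mother wavelet: psi(u) = cos(1.75 u) * exp(-u^2 / 2)
    return math.cos(1.75 * u) * math.exp(-u * u / 2.0)

def wavelet_transform(samples, xs, a, m):
    # Riemann-sum approximation of Eq. (4):
    # W(a, m) = a^(-1/2) * integral of f(x) * conj(psi((x - m) / a)) dx
    # (this Morlet wavelet is real-valued, so conjugation is a no-op)
    dx = xs[1] - xs[0]
    return (a ** -0.5) * sum(f * morlet((x - m) / a)
                             for f, x in zip(samples, xs)) * dx
```

At scale a = 1 and translation m = 0, transforming the mother wavelet itself returns approximately its squared L² norm, a strictly positive number, as expected.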

The wavelet transform is also reversible, which provides the possibility of reconstructing the original signal. A classical inversion formula for f(x) is:

$$f(x) = C_\psi^{-1}\int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty} W(a, m)\,\psi_{a,m}(x)\,\frac{da}{a^2}\,dm \qquad (5)$$

where

$$C_\psi = \int_{-\infty}^{+\infty}\frac{|\hat{\psi}(w)|^2}{|w|}\,dw < \infty, \quad a > 0.$$

So far, because a wavelet kernel function must satisfy the conditions of Lemma 2, only a few wavelet kernel functions can be expressed in terms of existing functions. Here we give one existing wavelet kernel function, the Morlet wavelet kernel function, and prove that it satisfies the condition of an allowable support vector kernel function. The Morlet wavelet function is defined as follows:

$$\psi(x) = \cos(1.75x)\,\exp\!\left(-\frac{x^2}{2}\right) \qquad (10)$$

Theorem 1. The Morlet wavelet kernel function, defined as

$$K(x, x') = \prod_{i=1}^{l} \cos\!\left(1.75\,\frac{x_i - x'_i}{a}\right)\exp\!\left(-\frac{\|x_i - x'_i\|^2}{2a^2}\right), \quad x \in \mathbb{R}^{ld},\ x_i \in \mathbb{R}^{d} \qquad (11)$$

is an allowable support vector kernel function.

Proof. According to Lemma 2, we only need to prove

$$F[K](\omega) = (2\pi)^{-l/2}\int_{\mathbb{R}^{ld}} \exp(-j(\omega \cdot x))\,K(x)\,dx \ge 0 \qquad (12)$$

where $K(x) = \prod_{i=1}^{l}\psi(x_i/a) = \prod_{i=1}^{l}\cos(1.75 x_i/a)\exp(-\|x_i\|^2/2a^2)$ and j denotes the imaginary unit. We have

    Q. Wu / Expert Systems with Applications 37 (2010) 194–201   195


$$\begin{aligned}
\int_{\mathbb{R}^{ld}} \exp(-j\omega x)\,K(x)\,dx
&= \int_{\mathbb{R}^{ld}} \exp(-j\omega x)\prod_{i=1}^{l}\cos\!\left(\frac{1.75 x_i}{a}\right)\exp\!\left(-\frac{x_i^2}{2a^2}\right) dx \\
&= \prod_{i=1}^{l}\int_{-\infty}^{\infty} \exp(-j\omega_i x_i)\,\frac{\exp(j 1.75 x_i/a) + \exp(-j 1.75 x_i/a)}{2}\,\exp\!\left(-\frac{x_i^2}{2a^2}\right) dx_i \\
&= \prod_{i=1}^{l}\frac{|a|\sqrt{2\pi}}{2}\left(\exp\!\left(-\frac{(1.75 - \omega_i a)^2}{2}\right) + \exp\!\left(-\frac{(1.75 + \omega_i a)^2}{2}\right)\right)
\end{aligned} \qquad (13)$$

Substituting formula (13) into Eq. (12), we obtain

$$F[K](\omega) = \prod_{i=1}^{l}\frac{|a|}{2}\left(\exp\!\left(-\frac{(1.75 - \omega_i a)^2}{2}\right) + \exp\!\left(-\frac{(1.75 + \omega_i a)^2}{2}\right)\right) \qquad (14)$$

Since a ≠ 0, we have

$$F[K](\omega) \ge 0 \qquad (15)$$

This completes the proof of Theorem 1. □
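The kernel of Eq. (11) is easy to evaluate numerically. The sketch below is a hypothetical helper (not code from the paper); it illustrates two elementary properties that follow from the formula, K(x, x) = 1 and symmetry:

```python
import math

def morlet_kernel(x, x_prime, a=1.0):
    # Morlet wavelet kernel of Eq. (11):
    # K(x, x') = prod_i cos(1.75 (x_i - x'_i)/a) * exp(-(x_i - x'_i)^2 / (2 a^2))
    k = 1.0
    for xi, xj in zip(x, x_prime):
        d = xi - xj
        k *= math.cos(1.75 * d / a) * math.exp(-d * d / (2.0 * a * a))
    return k
```

Since every factor equals 1 when x = x′, the kernel of a point with itself is exactly 1, and the kernel is symmetric because the cosine and the Gaussian are even functions.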

2.2. Wavelet ν-support vector machine

Combining the wavelet kernel function with ν-SVM, we can build a new SVM learning algorithm, Wv-SVM. The structure of Wv-SVM is shown in Fig. 1. For a set of data points (x₁, y₁), (x₂, y₂), …, (x_l, y_l), Wv-SVM can be described as:

$$\min_{w,\,\xi^{(*)},\,\varepsilon,\,b}\ \tau(w, \xi^{(*)}, \varepsilon) = \frac{1}{2}\|w\|^2 + C\left(\nu\varepsilon + \frac{1}{l}\sum_{i=1}^{l}(\xi_i + \xi_i^*)\right) \qquad (16)$$

subject to

$$(w \cdot x_i + b) - y_i \le \varepsilon + \xi_i \qquad (17)$$
$$y_i - (w \cdot x_i + b) \le \varepsilon + \xi_i^* \qquad (18)$$
$$\xi_i^{(*)} \ge 0, \quad \varepsilon \ge 0, \quad b \in \mathbb{R} \qquad (19)$$

where w and x_i are column vectors with d dimensions, C > 0 is a penalty factor, ξ_i^{(*)} (i = 1, …, l) are slack variables and ν ∈ (0, 1] is an adjustable regularization parameter.

Problem (16) is a quadratic programming (QP) problem. By means of the Wolfe dual principle, the wavelet kernel function technique and the Karush–Kuhn–Tucker (KKT) conditions, we obtain the dual problem (20) of the original optimization problem (16):

$$\max_{\alpha,\,\alpha^*}\ W(\alpha, \alpha^*) = -\frac{1}{2}\sum_{i,j=1}^{l}(\alpha_i^* - \alpha_i)(\alpha_j^* - \alpha_j)\,K(x_i, x_j) + \sum_{i=1}^{l}(\alpha_i^* - \alpha_i)\,y_i \qquad (20)$$

$$\text{s.t.}\quad 0 \le \alpha_i,\ \alpha_i^* \le \frac{C}{l} \qquad (21)$$
$$\sum_{i=1}^{l}(\alpha_i - \alpha_i^*) = 0 \qquad (22)$$
$$\sum_{i=1}^{l}(\alpha_i + \alpha_i^*) \le C\,\nu \qquad (23)$$

Select appropriate parameters C and ν, and choose as the kernel of the Wv-SVM model the optimal mother wavelet function that best matches the original series over some range of scales. The Wv-SVM output function is then described as follows:

$$f(x) = \sum_{i=1}^{l}(\alpha_i^* - \alpha_i)\prod_{j=1}^{d}\psi\!\left(\frac{x^j - x_i^j}{a}\right) + b, \quad b \in \mathbb{R} \qquad (24)$$

where ψ(x) is the mother wavelet function, a > 0 is the scaling parameter of the wavelet, x^j is the jth component of the test vector x, and x_i^j is the jth component of the sample vector x_i.

The parameter b can be computed by Eq. (25): select two multipliers α_j and α_k* with α_j ∈ (0, C/l) and α_k* ∈ (0, C/l); then

$$b = \frac{1}{2}\left[y_j + y_k - \sum_{i=1}^{l}(\alpha_i^* - \alpha_i)\left(K(x_i, x_j) + K(x_i, x_k)\right)\right] \qquad (25)$$
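Once the dual problem (20) has been solved, evaluating the regression function of Eq. (24) is a direct computation. The sketch below is illustrative only: the support vectors, coefficient differences (α_i* − α_i) and b passed in are hypothetical placeholders, not results of the paper's QP solver:

```python
import math

def psi(u):
    # Morlet mother wavelet of Eq. (10)
    return math.cos(1.75 * u) * math.exp(-u * u / 2.0)

def wvsvm_predict(x, support_vectors, alpha_diff, b, a=0.89):
    # Eq. (24): f(x) = sum_i (alpha_i* - alpha_i) * prod_j psi((x^j - x_i^j) / a) + b
    out = b
    for sv, coef in zip(support_vectors, alpha_diff):
        prod = 1.0
        for xj, svj in zip(x, sv):
            prod *= psi((xj - svj) / a)
        out += coef * prod
    return out
```

Evaluated at one of its own support vectors with a single unit coefficient and b = 0, the wavelet product equals psi(0)^d = 1, so the output is exactly the coefficient.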

    3. Hybrid particle swarm optimization

The confirmation of the unknown parameters of Wv-SVM is a complicated process; in fact, it is a multivariable optimization problem in a continuous space. An appropriate parameter combination enhances the degree to which the model approximates the original series, so it is necessary to select an intelligent algorithm to obtain the optimal parameters of the proposed model. The parameters of Wv-SVM have a great effect on its generalization performance; an appropriate parameter combination corresponds to high generalization performance. The PSO algorithm is considered an excellent technique for solving such combinatorial optimization problems, and the proposed HAGPSO algorithm is used here to determine the parameters of Wv-SVM. The intelligent system shown in Fig. 2, based on the HAGPSO algorithm and the Wv-SVM model, evaluates the performance of the HAGPSO algorithm by forecasting time series. Different Wv-SVMs in different Hilbert spaces are adopted to forecast the power load time

Fig. 1. The architecture of Wv-SVM. [Figure: input components x₁, …, x_n feed kernel units K(x, x₁), …, K(x, x_n), whose outputs, weighted by w₁, …, w_n, are summed (Σ) to give the output y.]


series. For each particular region, only the most adequate Wv-SVM with the optimal parameters is used for the final forecasting.

To evaluate the forecasting capacity of the intelligent system, the fitness function of the HAGPSO algorithm is designed as follows:

$$\mathit{fitness} = \frac{1}{l}\sum_{i=1}^{l}\left(\frac{\hat{y}_i - y_i}{y_i}\right)^2 \qquad (26)$$

where l is the size of the selected sample, ŷ_i denotes the forecast value, and y_i is the original data of the selected sample.
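Eq. (26) is simply the mean of squared relative errors. A minimal sketch of this fitness (our own, for illustration):

```python
def fitness(forecast, actual):
    # Eq. (26): mean squared relative error between forecast and actual values
    return sum(((f - y) / y) ** 2 for f, y in zip(forecast, actual)) / len(actual)
```

A perfect forecast gives a fitness of 0, and a forecast 10% above every actual value gives 0.01, so smaller is better; this is why Algorithm 1 below takes the particle with minimal fitness as the global extremum.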

     3.1. Standard particle swarm optimization

Similarly to other evolutionary computation techniques, PSO (Yang et al., 2007) uses a set of particles representing potential solutions to the problem under consideration. The swarm consists of N particles. Each particle has a position X_i = (x_{i1}, x_{i2}, …, x_{id}) and a velocity V_i = (v_{i1}, v_{i2}, …, v_{id}), where i = 1, 2, …, N and j = 1, 2, …, d, and moves through a d-dimensional search space. According to the global variant of PSO, each particle moves towards its best previous position and towards the best particle p_g in the swarm. Let us denote the best previously visited position of the ith particle (the one giving its best fitness value) as p_i = (p_{i1}, p_{i2}, …, p_{id}), and the best previously visited position of the whole swarm as p_g = (p_{g1}, p_{g2}, …, p_{gd}).

The change of position of each particle from one iteration to the next can be computed from the distance between the current position and its previous best position, and the distance between the current position and the best position of the swarm. The velocity and position updates are then obtained by the following equations:

$$v_{ij}^{k+1} = w\,v_{ij}^{k} + c_1 r_1\,(p_{ij} - x_{ij}^{k}) + c_2 r_2\,(p_{gj} - x_{ij}^{k}) \qquad (27)$$
$$x_{ij}^{k+1} = x_{ij}^{k} + v_{ij}^{k+1} \qquad (28)$$

where w is the inertia weight, employed to control the impact of the previous history of velocities on the current one, k denotes the iteration number, c₁ is the cognitive learning factor, c₂ is the social learning factor, and r₁ and r₂ are random numbers uniformly distributed in the range [0, 1].

Thus, the particle flies through potential solutions towards p_i^k and p_g^k in a guided way while still exploring new areas via the stochastic mechanism, so as to escape from local optima. Since there is no intrinsic mechanism for controlling the velocity of a particle, it is necessary to impose a maximum value V_max on it. If the velocity exceeds this threshold, it is set equal to V_max; this limits the maximum travel distance at each iteration, to avoid the particle flying past good solutions. PSO terminates after a maximal number of generations, or when the best particle position of the entire swarm cannot be improved further after a sufficiently large number of generations. PSO has shown robustness and efficacy in solving function optimization problems in real-number spaces.
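The loop described by Eqs. (27) and (28), together with the V_max clamp, can be sketched as a minimal standard PSO. This is a generic illustration minimizing a simple sphere function, not the Wv-SVM fitness; all parameter values here are illustrative assumptions:

```python
import random

def pso(f, dim=2, n_particles=20, iters=100,
        w=0.7, c1=1.5, c2=1.5, vmax=0.5, seed=0):
    rnd = random.Random(seed)
    # Random initial positions, zero initial velocities
    X = [[rnd.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n_particles)]
    V = [[0.0] * dim for _ in range(n_particles)]
    P = [x[:] for x in X]                 # personal best positions p_i
    pbest = [f(x) for x in X]             # personal best fitness values
    g = min(range(n_particles), key=pbest.__getitem__)
    G, gbest = P[g][:], pbest[g]          # global best position p_g
    for _ in range(iters):
        for i in range(n_particles):
            for j in range(dim):
                # Eq. (27): velocity update, then clamp to [-vmax, vmax]
                V[i][j] = (w * V[i][j]
                           + c1 * rnd.random() * (P[i][j] - X[i][j])
                           + c2 * rnd.random() * (G[j] - X[i][j]))
                V[i][j] = max(-vmax, min(vmax, V[i][j]))
                # Eq. (28): position update
                X[i][j] += V[i][j]
            fx = f(X[i])
            if fx < pbest[i]:             # update personal best
                pbest[i], P[i] = fx, X[i][:]
                if fx < gbest:            # a new personal best may also be a new global best
                    gbest, G = fx, X[i][:]
    return G, gbest
```

On the 2-D sphere function the swarm collapses onto the minimum at the origin well within this iteration budget.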

3.2. Hybrid particle swarm optimization with Gaussian mutation and adaptive mutation

To address the disadvantage of the standard PSO, an adaptive mutation operator is proposed to regulate the inertia weight of the velocity by means of the fitness value of the objective function and the iteration variable, and a Gaussian mutation operator is introduced to correct the direction of the particle velocity. The premature-convergence problem is thus addressed by incorporating adaptive mutation and Gaussian mutation into the previous velocity of the particle. HAGPSO updates the velocity and particle position using the following equations:

$$v_{ij}^{k+1} = (1 - \lambda)\,w_{ij}^{k}\,v_{ij}^{k} + \lambda N(0, \sigma_i^{k}) + c_1 r_1\,(p_{ij} - x_{ij}^{k}) + c_2 r_2\,(p_{gj} - x_{ij}^{k}) \qquad (29)$$
$$x_{ij}^{k+1} = x_{ij}^{k} + v_{ij}^{k+1} \qquad (30)$$
$$w_{ij}^{k} = \beta\left(1 - f(x_i^{k})/f(x_m^{k})\right) + (1 - \beta)\,w_{ij}^{0}\,\exp(-\alpha k^2) \qquad (31)$$
$$\sigma_i^{k+1} = \sigma_i^{k}\,\exp(N_i(0, M_\sigma)) \qquad (32)$$

Fig. 2. The HAGPSO optimizes the parameters of Wv-SVM.


where i = 1, 2, …, N. M_σ is the standard error of the Gaussian distribution, β is the adaptive coefficient, λ is an increment coefficient, α is the coefficient controlling particle velocity attenuation, f(x_i^k) is the fitness of the ith particle in the kth iteration, and f(x_m^k) is the optimal fitness of the particle swarm in the kth iteration.

The parameter w regulates the trade-off between the global and local exploration abilities of the swarm. A large inertia weight facilitates global exploration, while a small one tends to facilitate local exploration. A suitable value of the inertia weight w usually provides balance between global and local exploration abilities and consequently reduces the number of iterations required to locate the optimum solution.

Adaptive mutation, which makes the quality of the solution depend on the mutation operator, is a highly effective mutation operator in real-coded algorithms. The proposed adaptive mutation operator, based on the iteration variable k and the fitness function value f(x^k), is described in Eq. (31). In the first term on the right of Eq. (29), the velocity inertia weight w_{ij}^k balances global and local exploration abilities and consequently reduces the number of iterations required to locate the optimum solution. In Eq. (31), the term 1 − f(x_i^k)/f(x_m^k) means that particles with larger fitness mutate in a smaller scope, while those with smaller fitness mutate in a larger scope. The term w_{ij}^0 exp(−αk²) means that the initial inertia weight w_{ij}^0 mutates over a large scope and searches a larger space at the start (small k), while it mutates over a small scope, searches a small space, and gradually approaches the global optimum towards the end (large k).

The second term of Eq. (29) represents Gaussian mutation based on the iteration variable k. The Gaussian mutation operator, which corrects the moving direction of the particle velocity, is given in Eq. (32). In this mutation strategy, the proposed velocity vector v^{k+1} = (v_1^{k+1}, v_2^{k+1}, …, v_d^{k+1}) is built from the last-generation velocity vector v^k = (v_1^k, v_2^k, …, v_d^k) and the perturbation vector σ^k = (σ_1^k, σ_2^k, …, σ_d^k). The perturbation vector mutates itself by Eq. (32) at each iteration, acting as a controlling vector for the velocity.

The adaptive and Gaussian mutation operators can restore the diversity loss of the population and improve the global search capacity of the algorithm.
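One scalar-dimension HAGPSO update following Eqs. (29)–(32) can be sketched as below. This is our own reading of the equations, not the paper's Matlab code; the fitness values and coefficients passed in are placeholders:

```python
import math
import random

def hagpso_step(v, x, p_i, p_g, w0, k, fit_i, fit_best, sigma,
                lam=0.1, beta=0.8, alpha=2.0, m_sigma=0.5,
                c1=2.0, c2=2.0, rnd=random):
    # Eq. (31): adaptive inertia weight from the fitness ratio and iteration count
    w = beta * (1.0 - fit_i / fit_best) + (1.0 - beta) * w0 * math.exp(-alpha * k * k)
    # Eq. (29): velocity update with a Gaussian mutation term N(0, sigma)
    v_new = ((1.0 - lam) * w * v
             + lam * rnd.gauss(0.0, sigma)
             + c1 * rnd.random() * (p_i - x)
             + c2 * rnd.random() * (p_g - x))
    # Eq. (30): position update
    x_new = x + v_new
    # Eq. (32): the perturbation (mutation scale) mutates itself for the next iteration
    sigma_new = sigma * math.exp(rnd.gauss(0.0, m_sigma))
    return v_new, x_new, sigma_new
```

The returned sigma_new is fed back in as sigma on the next call, so the Gaussian mutation scale evolves across iterations exactly as the perturbation vector σ^k does.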

4. The procedures of HAGPSO and Wv-SVM

    The HAGPSO algorithm is described in steps as follows:

     Algorithm 1

    Step 1. Data preparation: Training, validation, and test sets are

    represented as Tr, Va, and Te, respectively.

Step 2. Particle initialization and PSO parameter setting: generate the initial particles. Set the PSO parameters, including the number of particles (n), particle dimension (m), maximal number of iterations (k_max), error limit of the fitness function, velocity limit (V_max), initial inertia weight for the particle velocity (w⁰), Gaussian distribution (N(0, M_σ)), perturbation momentum (σ_i⁰), coefficient controlling particle velocity attenuation (α), adaptive coefficient (β) and increment coefficient (λ). Set the iteration variable k = 0 and perform the training process of Steps 3–8.

Step 3. Set the iteration variable k = k + 1.

Step 4. Compute the fitness function value of each particle. Take each particle's current position as its individual extremum point, and take the particle with the minimal fitness value as the global extremum point.

Step 5. Stopping-condition check: if the stopping criteria (predefined maximum iterations or the error accuracy of the fitness function) are met, go to Step 8; otherwise, go to the next step.

Step 6. Apply the adaptive mutation operator of Eq. (31) and the Gaussian mutation operator of Eq. (32) to manipulate the particle velocity.
Step 7. Update the particle positions by Eqs. (29) and (30) to form the new particle swarm, then go to Step 3.
Step 8. End the training procedure and output the optimal particle.

On the basis of the Wv-SVM model, we can summarize an estimation algorithm as follows.

Algorithm 2

Step 1. Initialize the original data by normalization and fuzzification, then form the training patterns.
Step 2. Select the appropriate wavelet kernel function K, the control constant ν and the penalty factor C. Construct the QP problem (16) of the Wv-SVM.
Step 3. Solve the optimization problem to obtain the parameters α_i^(*). Compute the regression coefficient b by Eq. (25).
Step 4. For a new forecasting task, extract the load characteristics and form a set of input variables x. Then compute the estimation result ŷ by Eq. (24).

    5. Experiment

To analyze the performance of the proposed HAGPSO algorithm, the forecasting of a power load series by the intelligent system based on HAGPSO and Wv-SVM is studied. For comparison, the standard PSO is also adopted to optimize the parameters of Wv-SVM: the better algorithm will give the better combination of Wv-SVM parameters, and a better parameter combination provides better forecasting capability in the regression estimation of Wv-SVM. To evaluate the forecasting capacity of the intelligent system, evaluation indexes such as the mean absolute error (MAE), mean absolute percentage error (MAPE) and mean square error (MSE) are applied to the forecasting results of HAGPSOWv-SVM and PSOWv-SVM.

In our experiments, the power load series are selected from past load records of a typical power company. The detailed characteristic data and load series compose the corresponding training and testing sample sets. During the power load forecasting process, the six influencing factors shown in Table 1, viz. sunlight, date, air pressure, temperature, rainfall and humidity, are taken into account. All linguistic information of the obtained influencing factors is processed by fuzzy comprehensive evaluation (Feng & Xu, 1999) to form numerical information. Suppose the number of variables is n, with n = n₁ + n₂, where n₁ and n₂ denote the numbers of fuzzy linguistic variables and crisp numerical variables, respectively. The linguistic variables are evaluated at several description levels,

     Table 1

    Influencing factors of power load forecasts.

    Load characteristics Unit Expression Weight

    Sunlight Dimensionless Linguistic information 0.9

Date Dimensionless Linguistic information 0.7

    Air pressure Dimensionless Linguistic information 0.68

    Temperature Dimensionless Linguistic information 0.8

    Rainfall Dimensionless Linguistic information 0.7

    Humidity Dimensionless Linguistic information 0.4


and a real number between 0 and 1 can be assigned to each description level. Distinct numerical variables have different dimensions and should be normalized first. The following normalization is adopted:

$$\bar{x}_i^{e} = \frac{x_i^{e} - \min_{i=1}^{l} x_i^{e}}{\max_{i=1}^{l} x_i^{e} - \min_{i=1}^{l} x_i^{e}}, \quad e = 1, 2, \ldots, n_2 \qquad (33)$$

where l is the number of samples, and x_i^e and x̄_i^e denote the original value and the normalized value, respectively. In fact, all the numerical variables in Eqs. (1) through (32) are normalized values, although they are not marked by bars.
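Eq. (33) is the familiar min-max normalization; a one-function sketch for a single numerical variable (our own illustration):

```python
def minmax_normalize(values):
    # Eq. (33): map each value of one numerical variable into [0, 1]
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]
```

For example, minmax_normalize([2.0, 4.0, 6.0]) returns [0.0, 0.5, 1.0].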

The proposed HAGPSO algorithm has been implemented in the Matlab 7.1 programming language. The experiments were made on a personal computer with a 1.80 GHz Core(TM)2 CPU and 1.0 GB memory under Microsoft Windows XP Professional. The initial parameters of HAGPSO are given as follows: inertia weight w⁰ = 0.9; positive acceleration constants c₁ = c₂ = 2; standard error of the Gaussian distribution M_σ = 0.5; adaptive coefficient β = 0.8; increment coefficient λ = 0.1; the fitness accuracy of the

    Fig. 3.  Mexican hat wavelet transform of load series in the scope of different scale.

    Fig. 4.  Morlet wavelet transform of load series in the scope of different scale.

    Fig. 5.  Gaussian wavelet transform of load series in the scope of different scale.


normalized samples is equal to 0.0002; and the coefficient controlling particle velocity attenuation α = 2. The Morlet, Mexican hat and Gaussian wavelets are selected to analyze the load series on the different scales shown in Figs. 3–5. Among the given wavelet transforms, the Morlet wavelet transform matches the original load series best over the scale range 0.3 to 4.

Therefore, the Morlet wavelet can be chosen as the kernel function of the Wv-SVM model, and the three parameters are set as follows:

$$\nu \in [0, 1], \quad a \in [0.3, 2], \quad C \in \left[\frac{\max(x_{i,j}) - \min(x_{i,j})}{l} \times 10^{-3},\ \frac{\max(x_{i,j}) - \min(x_{i,j})}{l} \times 10^{3}\right]$$

    The trend of fitness value of HAGPSO is shown in   Fig. 6. It is

    obvious that the HAGPSO is convergent. Therefore, HAGPSO is able

    to be applied to seek the parameters of Wv-SVM.

    The optimal combinational parameters are obtained byAlgorithm HAGPSO, viz.,  C  ¼  960:10;v  ¼  0:88 and  a  ¼  0:89. Fig. 7

    illuminates the load series forecasting results given by HAGPSO

    and Wv-SVM.

    For analyzing the parameter searching capacity of HAGPSO

    algorithm, the standard PSO algorithm is used to optimize param-

    eters of Wv-SVM by training the original load series, then give the

    latest 12 weeks forecasting results of each model shown in Table 2.

    The comparison between HAGPSO and PSO optimizing the

    parameters of the same model (Wv-SVM) is shown in Table 3.

Table 3 shows the error indices of the two models. The MAE, MAPE and MSE of HAGPSOWv-SVM are all better than those of PSOWv-SVM. It is evident that the adaptive and Gaussian mutation operators improve the global search ability of particle swarm optimization. The experimental results show that HAGPSO improves the forecasting precision compared with PSO under the same conditions.
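The error indices can be checked directly from the HAGPSOWv-SVM column of Table 2. In the minimal sketch below, MAE and MSE reproduce the Table 3 values exactly; the plain mean-absolute-percentage formula used here yields roughly 0.054 rather than the reported 0.048, so the paper presumably uses a slightly different MAPE definition.

```python
# Real values and HAGPSOWv-SVM forecasts for the latest 12 weeks (Table 2).
real   = [580, 2046, 908, 1625, 452, 2937, 1135, 2580, 2561, 781, 1489, 1532]
hagpso = [703, 2018, 880, 1606, 525, 2920, 1167, 2499, 2566, 884, 1516, 1525]

n = len(real)
errors = [f - r for r, f in zip(real, hagpso)]
mae  = sum(abs(e) for e in errors) / n                    # mean absolute error
mape = sum(abs(e) / r for e, r in zip(errors, real)) / n  # mean abs. percentage error
mse  = sum(e * e for e in errors) / n                     # mean squared error

print(mae, mse)  # MAE = 45.25 and MSE ≈ 3473, as reported in Table 3
```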

    6. Conclusion

In this paper, a new load forecasting model based on HAGPSO and Wv-SVM is proposed. A new version of PSO, viz., hybrid particle swarm optimization with adaptive mutation and Gaussian mutation (HAGPSO), is also proposed to optimize the parameters of Wv-SVM. The performance of HAGPSOWv-SVM is evaluated by forecasting power load data, and the simulation results demonstrate that Wv-SVM deals effectively with high dimensionality, nonlinearity and small samples. Moreover, it is shown that the HAGPSO presented here is well suited to seeking the optimal parameters of Wv-SVM.

Fig. 6.  The trend of the fitness value.

Fig. 7.  The load forecasting results of the HAGPSOWv-SVM model.

Table 2
Comparison of forecasting results from the two models.

Week  Real value  PSOWv-SVM  HAGPSOWv-SVM
1     580         725        703
2     2046        2010       2018
3     908         858        880
4     1625        1585       1606
5     452         547        525
6     2937        2880       2920
7     1135        1046       1167
8     2580        2493       2499
9     2561        2508       2566
10    781         908        884
11    1489        1536       1516
12    1532        1519       1525

Table 3
Error statistics of the two forecasting models.

Model          MAE    MAPE   MSE
PSOWv-SVM      69.92  0.068  6292
HAGPSOWv-SVM   45.25  0.048  3473



In our experiments, the adaptive coefficients (b, k), the control parameter Mr of the second-step normal mutation, and the parameter a controlling velocity attenuation are all held fixed. How to choose appropriate values for these coefficients is not addressed in this paper; studying how the particle velocities change under different settings of these parameters is a meaningful direction for future research.

