ORIGINAL RESEARCH  Open Access

Approximation of multi-parametric functions using the differential polynomial neural network

Ladislav Zjavka

Abstract
Unknown data relations can describe a lot of complex systems through a partial differential equation solution of a multi-parametric function approximation. Common artificial neural network techniques of pattern classification or function approximation are in general based on whole-pattern similarity relations of trained and tested data samples. They apply input variables of only absolute interval values, which may cause problems when the training and testing data ranges differ widely. The differential polynomial neural network is a new type of neural network, developed by the author, which constructs and resolves an unknown general partial differential equation describing a system model of dependent variables. It creates a sum of fractional polynomial terms defining partial mutual derivative changes of combinations of input variables. This type of regression is based on learned generalized data relations. It might improve dynamic system models and standard time-series predictions, as the relative character of the data allows applying a wider range of input interval values than defined by the training data. The characteristics of differential equation solutions also facilitate a great variety of model forms.

Keywords: Polynomial neural network, Data relations, Partial differential equation construction, Multi-parametric function approximation, Sum derivative term

Introduction
Solutions of differential equations can define models for a variety of pattern recognition [1] and primarily function approximation problems, applying genetic programming techniques [2] or an artificial neural network (ANN) construction [3]. A common ANN operating principle is based on entire similarity relations of newly presented input patterns with the trained ones. A principal shortcoming of this functionality in general is its inability to generalize input pattern data relations. It utilizes only input variables of absolute interval values, which are not able to describe a wider range of applied data. The ANN generalization from the training data set may be difficult or problematic if the model has not been trained with inputs in the range covered by the testing data [4]. If the data involve relations, which may become of a stronger or weaker character, the neural network model should generalize them so they can also be applied to different interval values. The differential polynomial neural network (D-PNN) is a new neural network type, which creates and resolves an unknown partial differential equation (DE) of a multi-parametric function approximation. A DE is replaced by producing a sum of fractional polynomial derivative terms, forming a system model of dependent variables. In contrast with the ANN approach, each neuron of the D-PNN can take part directly in the calculation of the total network output. Analogous to ANN function approximation (and pattern identification), the study tried to create a neural network whose function estimation (or pattern recognition) is based on any dependent data relations. In the case of a function approximation, the output of the neural network is a functional value. In the case of a pattern identification, its response should be the same for all input vectors whose variables keep up the trained dependencies, no matter what values they take. However, the principle of both types is the same, analogous to the ANN approach [5].
y = a_0 + \sum_{i=1}^{m} a_i x_i + \sum_{i=1}^{m} \sum_{j=1}^{m} a_{ij} x_i x_j + \sum_{i=1}^{m} \sum_{j=1}^{m} \sum_{k=1}^{m} a_{ijk} x_i x_j x_k + \ldots    (1)

Correspondence: lzjavka@gmail.com
VŠB-Technical University of Ostrava, Centre of Excellence IT4Innovations, Ostrava, Czech Republic

© 2013 Zjavka; licensee Springer. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


In (1): m – number of variables; A(a1, a2, …, am), … – vectors of parameters; X(x1, x2, …, xm) – vector of input variables.

The D-PNN's block skeleton is formed by the GMDH (Group Method of Data Handling) polynomial neural network, which was created by the Ukrainian scientist Aleksey Ivakhnenko in 1968, when the back-propagation technique was not yet known [6]. A general connection between input and output variables can be expressed by the Volterra functional series, a discrete analogue of which is the Kolmogorov-Gabor polynomial (1). This polynomial can approximate any stationary random sequence of observations and can be computed by either adaptive methods or a system of Gaussian normal equations [7]. GMDH decomposes the complexity of a process into many simpler relationships, each described by a low-order polynomial (2) for every pair of input values.

y' = a_0 + a_1 x_i + a_2 x_j + a_3 x_i x_j + a_4 x_i^2 + a_5 x_j^2    (2)
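For illustration, a minimal Python sketch of the GMDH building block (2) follows; it evaluates the pair polynomial and fits its six coefficients to sample data by ordinary least squares (the "Gaussian normal equations" of [7]). The data relation and all numeric values are hypothetical placeholders, not values from the paper.

```python
import numpy as np

def gmdh_pair(xi, xj, a):
    """Low-order GMDH polynomial (2): y' = a0 + a1*xi + a2*xj + a3*xi*xj + a4*xi^2 + a5*xj^2."""
    return (a[0] + a[1] * xi + a[2] * xj + a[3] * xi * xj
            + a[4] * xi ** 2 + a[5] * xj ** 2)

# Hypothetical example: fit the coefficients to samples of the relation y = xi + xj.
rng = np.random.default_rng(0)
xi, xj = rng.uniform(5, 500, 50), rng.uniform(5, 500, 50)
design = np.column_stack([np.ones_like(xi), xi, xj, xi * xj, xi ** 2, xj ** 2])
a_fit, *_ = np.linalg.lstsq(design, xi + xj, rcond=None)
print(gmdh_pair(120.0, 240.0, a_fit))   # close to 360 for this linear relation
```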

Partial differential equation construction
The basic idea of the D-PNN is to create and resolve a generally true partial differential equation (3), which is not known in advance and can describe a system of dependent variables, with a special type of fractional multi-parametric polynomials (4), i.e. sum derivative terms.

a + \sum_{i=1}^{n} b_i \frac{\partial u}{\partial x_i} + \sum_{i=1}^{n} \sum_{j=1}^{n} c_{ij} \frac{\partial^2 u}{\partial x_i \partial x_j} + \ldots = 0, \qquad u = \sum_{k=1}^{\infty} u_k    (3)

where u = f(x1, x2, …, xn) is the searched function of all input variables, and a, B(b1, b2, …, bn), C(c11, c12, …) are polynomial parameters.

Elementary methods of DE solution express the solution in special elementary functions – polynomials (e.g. Bessel functions, Fourier or power series). Numerical integration of differential equations is based on using rational integral functions and trigonometric series.

Partial DE terms are formed by an adapted application of the method of integral analogues, which replaces mathematical operators and symbols of a DE by ratios of corresponding values. Derivatives are replaced by their integral analogues, i.e. derivative operators are removed and, simultaneously, all operators are replaced by similarity or proportion signs in the equations, and all vectors are replaced by their absolute values [8]. However, it should be possible to form sum derivative terms replacing a general partial DE (3) by using different mathematical techniques, e.g. wave series and others.

y_i = \frac{\left(a_0 + a_1 x_1 + a_2 x_2 + \ldots + a_n x_n + a_{n+1} x_1 x_2 + \ldots\right)^{(m+1)/n}}{b_0 + b_1 x_1 + \ldots} = \frac{\partial^m f(x_1, x_2, \ldots, x_n)}{\partial x_1 \partial x_2 \ldots \partial x_m}, \qquad Y = \sum_{i=1}^{\infty} y_i = 0    (4)

where n – combination degree of the n-input-variable polynomial of the numerator; m – combination degree of the denominator; w_t – weights of terms.

The fractional polynomials (4), defining partial relations of n input variables, represent the summation derivative terms (neurons) of a DE. The numerator of eq. (4) is a complete n-variable polynomial, which realizes a new partial function u of formula (3). The denominator of eq. (4) is a derivative part, which gives a partial mutual change of some combination of input variables. It arises from the partial derivation of the complete n-variable polynomial with respect to the variables of the competent combination. The root functions of the numerator of (4) take the polynomials to a correspondent combination degree, but need not be used at all if they are not necessary. They may be adapted to enable the D-PNN to generate an adequate range of desired output values.
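To make the shape of one sum derivative term of (4) concrete, the sketch below evaluates a single fractional neuron for n = 2 inputs: a complete 2-variable polynomial in the numerator, taken to a root exponent, over the derivative polynomial in the denominator. The coefficients, weight and root exponent are illustrative assumptions, not values taken from the paper.

```python
def derivative_term(x1, x2, a, b, w, root=0.5):
    """One sum derivative term of (4) for 2 inputs, differentiated with respect to x1:
    w * (complete 2-variable polynomial)^root / (denominator polynomial of x1)."""
    numerator = (a[0] + a[1] * x1 + a[2] * x2 + a[3] * x1 * x2) ** root
    return w * numerator / (b[0] + b[1] * x1)

# Hypothetical parameter values from the initialization interval <0.5, 1.5>.
print(derivative_term(20.0, 30.0, a=[1.0, 0.9, 1.1, 0.6], b=[0.7, 1.2], w=1.0))
```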

Multi-parametric function approximation
The D-PNN can approximate a multi-parametric function through a general partial sum DE solution (3). Consider first only linear data relations describing the DE, e.g. a simple sum function y_t = x1 + x2 (however, it could be any linear function). The network with 2 inputs, forming 1 functional output value y = f(x1, x2), should approximate the true function y_t by replacing the sum derivative terms of the DE (5). It consists of only 1 block of 2 neurons, terms of both derivative variables x1 and x2 (Figure 1).

y = w_1 \frac{a_0 + a_1 x_1 + a_2 x_2 + a_3 x_1 x_2}{b_0 + b_1 x_1} + w_2 \frac{a_0 + a_1 x_1 + a_2 x_2 + a_3 x_1 x_2}{b_0 + b_1 x_2}    (5)
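A direct transcription of (5) as code: the block output is the weighted sum of two neurons that share the numerator polynomial but carry denominators of x1 and x2, respectively. The parameter values are placeholders to be adjusted by training, not values reported in the paper.

```python
def dpnn_block_2var(x1, x2, a, b, w):
    """Output (5) of the 1-block, 2-neuron D-PNN approximating y = f(x1, x2)."""
    numerator = a[0] + a[1] * x1 + a[2] * x2 + a[3] * x1 * x2
    y1 = w[0] * numerator / (b[0] + b[1] * x1)   # term of derivative variable x1
    y2 = w[1] * numerator / (b[0] + b[1] * x2)   # term of derivative variable x2
    return y1 + y2
```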

The D-PNN can be trained with only a very small data set (6 samples) covering a wide range of input values <5, 500>. Figure 2 shows the approximation errors (y-axis) of the trained network, i.e. the differences between the true and the estimated function for random input vectors with dependent variables; the x-axis thus represents the ideal function. Output errors can result from disproportionate dependent random vector values (Figure 2) that the D-PNN was not trained with, e.g. 360 = 358 + 2. Figure 2 shows only the results of the 2-variable function f(x1, x2) = x1 + x2. The testing random function values f(x1, x2) displayed on the x-axis (Figure 2) exceed the maximal trained sum value 500, while the approximation error increases only slowly.

Figure 1. 1-block D-PNN: inputs x1, x2; two neurons with outputs y1, y2 combined into the output f(x1, x2).

If the number of input variables is increased to 3, the DE composition can apply polynomials of a higher combination degree (= 3), which raises the number of sum derivative terms. The 3-variable D-PNN for a linear true function approximation (e.g. y_t = x1 + x2 + x3) can again contain 1 block, now of 6 neurons: DE terms of all 1- and 2-combination derivative variables of the complete DE, e.g. (6), (7).

y_1 = w_1 \frac{\left(a_0 + a_1 x_1 + a_2 x_2 + a_3 x_3 + a_4 x_1 x_2 + \ldots + a_7 x_1 x_2 x_3\right)^{2/3}}{b_0 + b_1 x_1}    (6)

y_4 = w_2 \frac{a_0 + a_1 x_1 + a_2 x_2 + a_3 x_3 + a_4 x_1 x_2 + \ldots + a_7 x_1 x_2 x_3}{b_0 + b_1 x_1 + b_2 x_2 + b_3 x_1 x_2}    (7)

Figure 2. Approximation of the 2-variable function.

The training data set of the 3-variable function required an extension (in comparison with 2 variables) to enable the D-PNN to reach a desired approximation error. The parameter optimization may apply a proper differential evolution algorithm (EA), supplied with sufficient random mutations to prevent the parameter adjustment from converging before a desired error is reached [9]. Not every experiment succeeds in a functional model.
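As a sketch of the evolutionary parameter search, the following code fits the parameters of the 2-variable block (5) to a few training samples of y_t = x1 + x2 with SciPy's differential evolution optimizer. The RMSE objective, bounds, sample count and optimizer settings are assumptions for illustration, not the exact setup used in the experiments.

```python
import numpy as np
from scipy.optimize import differential_evolution

rng = np.random.default_rng(1)
x_train = rng.uniform(5, 500, size=(6, 2))        # 6 samples from a wide interval <5, 500>
y_train = x_train.sum(axis=1)                     # true function y_t = x1 + x2

def rmse(params):
    a, b, w = params[:4], params[4:6], params[6:8]
    num = (a[0] + a[1] * x_train[:, 0] + a[2] * x_train[:, 1]
           + a[3] * x_train[:, 0] * x_train[:, 1])
    y = (w[0] * num / (b[0] + b[1] * x_train[:, 0])
         + w[1] * num / (b[0] + b[1] * x_train[:, 1]))
    return np.sqrt(np.mean((y - y_train) ** 2))

result = differential_evolution(rmse, bounds=[(0.5, 1.5)] * 8, mutation=(0.5, 1.5), seed=1)
print(result.x, result.fun)
```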

Multi-layered backward D-PNN
A multi-layered D-PNN, consisting of blocks of neurons, forms composite polynomial functions (functions of functions) (8) in each subsequent hidden layer. Each block contains a single polynomial (without a derivative part) whose output enters the next hidden layer (Figure 3). Neurons do not affect the block output; they are applied only directly as the sum derivative terms of the total output calculation (DE composition). The blocks of the 2nd and following hidden layers also form additional extended neurons, i.e. composite terms (CT), which define derivatives of composite functions, applying reverse outputs and inputs of back-connected blocks of previous layers. These partial derivatives with respect to variables of previous layers are calculated according to the composite function derivation rules (9), (10) and formed by products of partial derivatives of the outer and inner functions [10].

Figure 3. The 3-variable multi-layered 2-combination D-PNN; back-connections of the 3rd layer's 1st block (O = simple term, CT = compound term, P = output polynomial).

Figure 4. Comparison of the approximations of the 3-variable linear function.

y_i = \phi_i(X) = \phi_i(x_1, x_2, \ldots, x_n), \quad i = 1, \ldots, m    (8)

F(x_1, x_2, \ldots, x_n) = f(y_1, y_2, \ldots, y_m) = f(\phi_1(X), \phi_2(X), \ldots, \phi_m(X))    (9)

\frac{\partial F}{\partial x_k} = \sum_{i=1}^{m} \frac{\partial f(y_1, y_2, \ldots, y_m)}{\partial y_i} \cdot \frac{\partial \phi_i(X)}{\partial x_k}, \quad k = 1, \ldots, n    (10)

The 1st block of the last (3rd) hidden layer (Figure 3) forms 2 neurons of its own input variables as 2 simple terms (11) of the DE (3). It also creates 4 compound terms of the 2nd (previous) hidden layer, using reverse outputs and inputs of the 2 bound blocks with respect to 4 derivative variables (12). As the couples of variables of the inner functions can differ from each other, their partial derivatives are 0, and so the sum of formula (10) consists of only 1 term. Thus each neuron of the D-PNN represents a DE term. Likewise, compound terms can be created with respect to the 1st hidden layer variables, e.g. (13). The 3 back-joint blocks form 8 CT of the DE, which can conveniently be produced by a recursive algorithm.

Figure 5. Comparison of the approximations of f(x1, x2, x3) = x1² + x2² + x3².

y_{11} = w_1 \frac{\left(a_0 + a_1 x_1'' + a_2 x_2'' + a_3 x_1'' x_2''\right)^{4/7}}{b_0 + b_1 x_1''} = w_1 \frac{\left(x_1'''\right)^{4/7}}{b_0 + b_1 x_1''}    (11)

y_{13} = w_3 \frac{\left(x_1'''\right)^{2/3} \, x_2'' / \left(x_1''\right)^{2/3}}{b_0 + b_1 x_1'}    (12)

y_{17} = w_7 \frac{x_1''' \, x_2'' / \left(x_1'' x_2'\right) / x_1'}{b_0 + b_1 x_1}    (13)
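The compound terms (11)–(13) rest on the chain rule (10). The short symbolic check below verifies that rule with hypothetical inner block polynomials φ1, φ2 and an outer polynomial f standing in for D-PNN blocks; the polynomial forms are arbitrary choices for the check, not the network's actual block polynomials.

```python
import sympy as sp

x1, x2, y1, y2 = sp.symbols('x1 x2 y1 y2')
phi1 = 1 + x1 + x1 * x2          # hypothetical inner block polynomial (previous layer output)
phi2 = 2 + x2 + x1 ** 2          # hypothetical inner block polynomial
f = y1 * y2 + 3 * y1             # hypothetical outer block polynomial of the next layer

F = f.subs({y1: phi1, y2: phi2})                      # composite function (9)
lhs = sp.diff(F, x1)                                  # direct derivative of the composite
rhs = (sp.diff(f, y1) * sp.diff(phi1, x1)             # chain rule (10): sum of products of
       + sp.diff(f, y2) * sp.diff(phi2, x1)           # outer and inner partial derivatives
       ).subs({y1: phi1, y2: phi2})
assert sp.simplify(lhs - rhs) == 0
```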

The D-PNN should create a functional value around the desired output. As the input vector variables can take a wide range of values (Figure 2), the combination polynomials produce big output values. Therefore the multiplication of (10) was replaced by a division operator in the fractions of the compound terms (11), (12), (13), without a negative effect, reducing the combination degree of the composite term polynomials of each previous joint layer. Without this modification the root exponents of the CT fractions would require an adjustment. The numerator exponents are adapted to the current layer calculation, as the combination degree of the polynomials doubles in each following hidden layer (11), (12), (13).

Each neuron has an adjustable term weight w_i, but not every neuron may participate in the total network output calculation (DE composition). The selection of the optimal neuron combination can easily be performed by a proper genetic algorithm (GA) [11]. The parameters of the polynomials are represented by real numbers, whose random initial values are generated from the interval <0.5, 1.5>. They are adjusted together with a simultaneous GA search for the best-fit neuron combination in the initial phase of the DE composition. It would be desirable to apply an adequate gradient steepest descent method [12] in conjunction with the EA [13]. The D-PNN can be trained with only a small input–output data set, likewise the GMDH polynomial neural network does [14]. The D-PNN's total output Y is the arithmetic mean of all active neuron output values (14), to prevent the number of neurons from influencing it.

Y = \frac{1}{k} \sum_{i=1}^{k} y_i, \qquad k = \text{actual number of active neurons}    (14)
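A minimal sketch of the total output (14), assuming the individual term outputs y_i have already been computed and a GA-style bit mask marks which neurons are active; the numeric values are illustrative only.

```python
import numpy as np

def total_output(term_outputs, active_mask):
    """Total D-PNN output (14): arithmetic mean of the active neuron (term) outputs."""
    active = np.asarray(term_outputs)[np.asarray(active_mask, dtype=bool)]
    if active.size == 0:
        raise ValueError("at least one neuron must be active")
    return active.mean()

# Example: 6 term outputs with a GA-selected combination of active neurons.
print(total_output([410.2, 395.7, 402.1, 388.0, 420.5, 401.3], [1, 0, 1, 1, 0, 1]))
```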

Experiments
The presented 3-variable multi-layered D-PNN (Figure 3) is able to approximate any linear function, e.g. the simple sum f(x1, x2, x3) = x1 + x2 + x3. The D-PNN and ANN comparison processes the same 12 fixed training data samples. The progress curves are typical of all following experiments (benchmarks) (Figure 4). The approximation accuracy of both methods is comparable on the trained interval values <10, 300>; however, the ANN approximation ability falls rapidly outside of this range, while the D-PNN's alternating errors grow only slowly. The ANN, with 2 hidden layers of neurons, applied the sigmoidal activation function and the standard back-propagation algorithm. The D-PNN output typically has a wave-like behavior, as its model is composed of sum DE terms.

Figure 6. Comparison of the approximations of f(x1, x2, x3) = x1 + x2² + x3³.

Approximation of non-linear functions requires the extension of the D-PNN block and neuron polynomials (11), (12), (13) with square power variables. The polynomials are the same as those applied by the GMDH algorithm (2). The competent square power (16) and combination (17) derivatives form additional sum terms of the 2nd-order partial DE (15). The compound neurons of these derivatives are also formed according to the composite function derivative rules [10].

F\left(x_1, x_2, u, \frac{\partial u}{\partial x_1}, \frac{\partial u}{\partial x_2}, \frac{\partial^2 u}{\partial x_1^2}, \frac{\partial^2 u}{\partial x_1 \partial x_2}, \frac{\partial^2 u}{\partial x_2^2}\right) = 0    (15)

where F(x1, x2, u, p, q, r, s, t) is a function of 8 variables.

y_{10} = w_{10} \frac{a_0 + a_1 x_1 + a_2 x_2 + a_3 x_1^2 + a_4 x_2^2 + a_5 x_1 x_2}{b_0 + b_1 x_1 + b_2 x_1^2} = \frac{\partial^2 f(x_1, x_2)}{\partial x_1^2}    (16)

y_{12} = w_{12} \frac{a_0 + a_1 x_1 + a_2 x_2 + a_3 x_1^2 + a_4 x_2^2 + a_5 x_1 x_2}{b_0 + b_1 x_1 + b_2 x_2 + b_3 x_1 x_2} = \frac{\partial^2 f(x_1, x_2)}{\partial x_1 \partial x_2}    (17)
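The sketch below enumerates, for 2 inputs, the denominator polynomials that an extended block would pair with the shared square-power numerator, one per derivative of the 2nd-order DE (15), following the pattern of (16) and (17). The treatment of the first-order terms and the coefficient handling are assumptions made for illustration, not the paper's exact term set.

```python
def second_order_terms(x1, x2, a, b, w):
    """Neuron outputs of one extended 2-variable block: a shared GMDH numerator (2) over
    one denominator per derivative of the 2nd-order DE (15), cf. terms (16) and (17)."""
    num = (a[0] + a[1] * x1 + a[2] * x2 + a[3] * x1 ** 2
           + a[4] * x2 ** 2 + a[5] * x1 * x2)
    denominators = {
        "du/dx1": b[0] + b[1] * x1,                                    # assumed linear, as in (5)
        "du/dx2": b[0] + b[1] * x2,
        "d2u/dx1^2": b[0] + b[1] * x1 + b[2] * x1 ** 2,                # pattern of (16)
        "d2u/dx2^2": b[0] + b[1] * x2 + b[2] * x2 ** 2,
        "d2u/dx1dx2": b[0] + b[1] * x1 + b[2] * x2 + b[3] * x1 * x2,   # pattern of (17)
    }
    return {name: w_i * num / den
            for (name, den), w_i in zip(denominators.items(), w)}

print(second_order_terms(2.0, 3.0, a=[1.0] * 6, b=[0.8, 1.2, 0.9, 1.1], w=[1.0] * 5))
```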

Figures 5 and 6 show the D-PNN and ANN comparison approximations of some benchmarks: growing non-linear functions. The 24 training data samples were randomly generated by the benchmark functions from the interval <10, 400> for both network models. The parameter and weight adjustment of both methods appeared heavily time-consuming, and not every experiment succeeded. The optimal number of the D-PNN's derivative neurons for the non-linear benchmark models was around 100. Experiments with other benchmarks (e.g. x2² + x3³ + x3⁴) result in similar outcome graphs. The D-PNN wave-like output can cover a wider range of untrained testing data interval values than the ANN. The inaccuracies of both methods intensify on untrained interval values, as the benchmarks involve power functions. The presented experiments verified the capability of the neural network to approximate any multi-parametric function, though the operating principles of the two techniques differ essentially.

Figure 7. Models of the function f(x1, x2, x3) = (x1 + x2 + x3)² with decreasing training errors (a, b, c).

Figure 7a-c shows D-PNN models with different final training root mean square errors (RMSE). A decrease in RMSE results in a lower generalization on untrained data interval values (Figure 7c). Vice versa, a greater RMSE means the model is valid for a wider data range, while the inaccuracies on the training data interval are overvalued (Figure 7a). This effect became evident in all experiments. As a result, the D-PNN should not be trained to the minimal achievable error value in order to obtain the optimal generalization on testing data. The applied incomplete adjustment and selection methods require improvements, which could yield better results. The presented D-PNN operating principle differs by far from other commonly applied neural network techniques. The benchmark results suggest that the D-PNN will succeed in forming complex models which can be defined in the form of multi-parametric functions.

Conclusion
The D-PNN is a new type of neural network whose function approximation and identification of variable dependences are based on a generalization of data relations. It does not utilize absolute values of variables but relative ones, which can better describe a wide range of input data interval values. The D-PNN constructs a general partial differential equation, which defines a system model of dependent variables, applying integral fractional polynomial sum terms. An acceptable implementation of the sum derivative terms is the principal part of the partial DE substitution. Artificial neural network pattern identification and function approximation models are simpler techniques, based only on whole-pattern similarity relations. A real-data example might be solving weather forecasts for 1 locality, e.g. the prediction of static pressure values, applying some trained data relations of a few nearby localities of the surrounding areas. Phases of a constant time interval of this very complex system could define the input vectors of the training data set. Estimated multi-parametric function values, i.e. next system states of a selected locality after a time delay, form the desired network outputs. The D-PNN could create better long-time models, based on a partial sum DE solution, than a standard ANN time-series prediction (which is based on entire pattern definitions too).

Competing interests
The author declares that they have no competing interests.

Author's contributions
The author designed a new type of neural network based on the GMDH polynomial neural network, which forms its skeleton structure. The proposed new neural network generates a sum of relative derivative polynomial terms as a general differential equation model description. It replaces and resolves the sum partial differential equation as an approximation of an unknown multi-parametric function defined by several discrete point observations. The network was tested with some benchmark functions and compared with artificial neural network models. Real data application models will follow the test experiments.

Acknowledgements
This paper has been elaborated in the framework of the IT4Innovations Centre of Excellence project, reg. no. CZ.1.05/1.1.00/02.0070, supported by the Operational Programme 'Research and Development for Innovations' funded by the Structural Funds of the European Union and the state budget of the Czech Republic, and supported by the Ministry of Industry and Trade of the Czech Republic under grant no. FR-TI1/420 and by SGS VŠB – Technical University of Ostrava, Czech Republic, under grant no. SP2012/58.

Received: 31 May 2012. Accepted: 21 June 2013. Published: 24 July 2013.

References
1. Zhou, B., Xiao-Li, Y., Liu, R., Wei, W.: Image segmentation with partial differential equations. Information Technology Journal 9(5), 1049–1052 (2010)
2. Iba, H.: Inference of differential equation models by genetic programming. Information Sciences 178(23), 4453–4468 (2008)
3. Tsoulos, I., Gavrilis, D., Glavas, E.: Solving differential equations with constructed neural networks. Neurocomputing 72(10–12), 2385–2391 (2009)
4. Giles, C.L.: Noisy time series prediction using recurrent neural networks and grammatical inference. Machine Learning 44, 161–183 (2001)
5. Zjavka, L.: Recognition of generalized patterns by a differential polynomial neural network. ETASR – Engineering, Technology & Applied Science Research 2(1), 167–172 (2012)
6. Ivakhnenko, A.G.: Polynomial theory of complex systems. IEEE Transactions on Systems, Man, and Cybernetics SMC-1(4) (1971)
7. Nikolaev, N.Y., Iba, H.: Adaptive Learning of Polynomial Networks. Springer, New York (2006)
8. Kuneš, J., Vavroch, O., Franta, V.: Fundamentals of Modeling. SNTL, Praha (1989). In Czech
9. Das, S., Abraham, A., Konar, A.: Particle swarm optimization and differential evolution algorithms. Studies in Computational Intelligence 116, 1–38. Springer-Verlag, Berlin (2008)
10. Kluvánek, I., Mišík, L., Švec, M.: Matematics I. SNTL, Bratislava (1966). In Slovak
11. Obitko, M.: Genetic Algorithms. Hochschule für Technik und Wirtschaft, Dresden (1998). [Online] Available: http://www.obitko.com/tutorials/genetic-algorithms
12. Nikolaev, N.Y., Iba, H.: Polynomial harmonic GMDH learning networks for time series modeling. Neural Networks 16, 1527–1540 (2003)
13. Zjavka, L.: Construction and adjustment of differential polynomial neural network. Journal of Engineering and Computer Innovations 2(3), 40–50 (2011)
14. Galkin, I.: Polynomial neural networks. Materials for UML 91.531 Data Mining course, University of Massachusetts Lowell. http://ulcar.uml.edu/~iag/CS/Polynomial-NN.html

doi:10.1186/2251-7456-7-33
Cite this article as: Zjavka: Approximation of multi-parametric functions using the differential polynomial neural network. Mathematical Sciences 2013, 7:33.


pp 4453ndash4468 Information Sciences Volume 178 Issue 23 1 December 2008Pages (2008)

3 Tsoulos I Gavrilis D Glavas E Solving differential equations withconstructed neural networks Neurocomputing Volume 72 Issues 10ndash122385ndash2391 (2009)

4 Giles CL Noisy Time Series Prediction using Recurrent Neural Networks andGrammatical Inference Machine Learning 44 161ndash183 (2001)

5 Zjavka L Recognition of Generalized Patterns by a Differential PolynomialNeural Network ETASR - Engineering Technology amp Applied ScienceResearch 2(1) 167ndash172 (2012)

6 Ivakhnenko AG Polynomial theory of complex systems IEEE Transactionson systems Vol SMC-1 No4 (1971)

7 Nikolaev NY Iba H Adaptive Learning of Polynomial Networks SpringerNew York (2006)

8 Kuneš J Vavroch O Franta V Fundamentals of modeling SNTL Prahain Czech (1989)

9 Das S Abraham A Konar A Particle swarm optimization and Differentialevolution algorithms Studies in Computational Intelligence 116 1ndash38Springer-Verlag Berlin (2008)

10 Kluvaacutenek I Mišiacutek L Svec M Matematics I SNTL Bratislava in Slovak (1966)11 Obitko M Genetic algorithms Hochshule fur Technik und Wirtschaft

Dresden (1998) [Online] Available httpwwwobitkocomtutorialsgenetic-algorithms

12 Nikolaev NY Iba H Polynomial harmonic GMDH learning networks for timeseries modeling Neural Networks 16 1527ndash1540 (2003) Science Direct

13 Zjavka L Construction and adjustment of differential polynomial neuralnetwork pp 40ndash50 Journal of Engineering and Computer Innovations Vol 2Num 3 March 2011 Academic Journals (2011)

14 Galkin I Polynomial neural networks Materials for UML 91 531 Data miningcourse University Mass Lowell httpulcarumledu~iagCSPolynomial-NNhtml

doiCite this article as Zjavka Approximation of multi-parametric functionsusing the differential polynomial neural network Mathematical Sciences

24 Jul 2013

1011862251-7456-7-33

2013 733

  • Abstract
  • Introduction
    • Partial differential equation construction
    • Multi-parametric function approximation
    • Multi-layered backward D-PNN
    • Experiments
      • Conclusion
      • Competing interests
      • Authorrsquos contributions
      • Acknowledgements
      • References
Page 4: ORIGINAL RESEARCH Open Access Approximation of multi ... · Keywords: Polynomial neural network, Data relations, Partial differential equation construction, Multi-parametric function

Figure 4 Comparison of the approximations of the 3-variable linear function


calculated according to the composite-function derivation rules (9) and (10), and are formed as products of the partial derivatives of the outer and inner functions [10]:

y_i = \phi_i(X) = \phi_i(x_1, x_2, \ldots, x_n), \quad i = 1, \ldots, m \qquad (8)

F(x_1, x_2, \ldots, x_n) = f(y_1, y_2, \ldots, y_m) = f(\phi_1(X), \phi_2(X), \ldots, \phi_m(X)) \qquad (9)

\frac{\partial F}{\partial x_k} = \sum_{i=1}^{m} \frac{\partial f(y_1, y_2, \ldots, y_m)}{\partial y_i} \cdot \frac{\partial \phi_i(X)}{\partial x_k}, \quad k = 1, \ldots, n \qquad (10)
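As a concrete illustration of the composite-function derivative rule (10), the short SymPy check below differentiates an arbitrarily chosen outer function of two inner polynomials; the particular functions phi1, phi2 and f are assumptions of this sketch and are not taken from the D-PNN model.

```python
# Illustrative check of the composite-function derivative rule (10) with SymPy.
# The inner functions phi1, phi2 and the outer function f are arbitrary choices.
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
y1, y2 = sp.symbols('y1 y2')

phi1 = x1 * x2          # inner function y1 = phi1(X), as in (8)
phi2 = x1 + x2**2       # inner function y2 = phi2(X)
f = y1**2 + 3 * y2      # outer function f(y1, y2), as in (9)

# F(x1, x2) = f(phi1(X), phi2(X))
F = f.subs({y1: phi1, y2: phi2})

# Right-hand side of (10) for k = 1: sum over i of (df/dy_i) * (dphi_i/dx_1)
rhs = (sp.diff(f, y1).subs({y1: phi1, y2: phi2}) * sp.diff(phi1, x1)
       + sp.diff(f, y2).subs({y1: phi1, y2: phi2}) * sp.diff(phi2, x1))

# Both sides agree, which is the rule used to form the compound terms
assert sp.simplify(sp.diff(F, x1) - rhs) == 0
print(sp.diff(F, x1))
```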

The 1st block of the last (3rd) hidden layer (Figure 3) forms 2 neurons from its own input variables as 2 simple terms (11) of the DE (3). It also creates 4 compound terms of the 2nd (previous) hidden layer, using the reverse outputs and inputs of the 2 bound blocks with respect to 4 derivative variables (12). As the couples of variables of the inner functions can differ from each other, their partial derivatives are 0, so the sum in formula (10) consists of only 1 term. Thus each neuron of the D-PNN represents one DE term. Likewise, compound terms can be created with respect to the 1st-hidden-layer variables, e.g. (13). The 3 back-joint blocks form 8 compound terms (CT) of the DE, and this can conveniently be performed by a recursive algorithm.

Figure 5 Comparison of the approximations of f(x1, x2, x3) = x1^2 + x2^2 + x3^2

y_{11} = w_1 \frac{(a_0 + a_1 x_1 + a_2 x_2 + a_3 x_1 x_2)^{4/7}}{b_0 + b_1 x_1} \qquad (11)

y_{13} = w_3 \frac{3 (x_1)^{2/3} \cdot x_2 / (x_1)^{2/3}}{b_0 + b_1 x_1} \qquad (12)

y_{17} = w_7 \frac{7 x_1 \cdot x_2 / (x_1 x_2) / x_1}{b_0 + b_1 x_1} \qquad (13)

The D-PNN should create a functional value around the desired output. As the input vector variables can take a wide range of values (Figure 2), the combination polynomials produce large output values. Therefore the multiplication in (10) was replaced by a division operator in the fractions of the compound terms (11), (12), (13), without a negative effect, which reduces the combination degree of the composite-term polynomials of each previous joint layer. Without this modification the root exponents of the CT fractions would require an adjustment. The numerator exponents are adapted to the current layer calculation, as the combination degree of the polynomials doubles in each following hidden layer (11), (12), (13).
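For illustration, the sketch below evaluates one fractional sum term of the kind used in (11): a root power of the block's polynomial numerator divided by a reduced denominator polynomial. The parameter values, the 1/2 root exponent and the abs() guard are assumptions of this sketch, not values fitted by the D-PNN.

```python
# Minimal sketch of one fractional DE term in the spirit of (11).
# All parameters below are illustrative placeholders.
import numpy as np

def fractional_term(x1, x2, a, b, w, root=0.5):
    """w * (GMDH-style numerator polynomial)^root / (reduced denominator)."""
    numerator = a[0] + a[1] * x1 + a[2] * x2 + a[3] * x1 * x2
    denominator = b[0] + b[1] * x1
    # abs() guards against a negative base under the fractional power
    return w * np.power(np.abs(numerator), root) / denominator

a = [0.7, 1.1, 0.9, 1.3]   # hypothetical numerator parameters
b = [0.6, 1.2]             # hypothetical denominator parameters
print(fractional_term(20.0, 35.0, a, b, w=0.8))
```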

Each neuron has an adjustable term weight w_i, but not every neuron needs to participate in the total network output calculation (the DE composition). The selection of an optimal neuron combination can easily be performed by a suitable genetic algorithm (GA) [11]. The polynomial parameters are represented by real numbers, whose random initial values are generated from the interval <0.5, 1.5>. They are adjusted simultaneously with the GA search for the best-fit neuron combination in the initial phase of the DE composition. It would be desirable to apply an adequate gradient (steepest-descent) method [12] in conjunction with the evolutionary algorithm [13]. The D-PNN can be trained with only a small input-output data set, just as the GMDH polynomial neural network can [14]. The D-PNN's total output Y is the arithmetic mean of all active neuron output values (14), so that the number of active neurons does not influence it:

Y = \frac{1}{k} \sum_{i=1}^{k} y_i, \qquad k = \text{actual number of active neurons} \qquad (14)
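A minimal sketch of the output rule (14) follows, with a GA-style binary mask standing in for the selected neuron combination; the neuron outputs, the mask and the helper name total_output are illustrative assumptions, not part of the described implementation.

```python
# Sketch of the total output (14): the arithmetic mean of the active neurons,
# selected by a GA-style binary mask. All numbers are made up for illustration.
import numpy as np

def total_output(neuron_outputs, active_mask):
    """Y = (1/k) * sum of active neuron outputs, k = number of active neurons."""
    active = neuron_outputs[active_mask]
    return active.mean() if active.size else 0.0

outputs = np.array([52.1, 47.8, 61.3, 49.5, 55.0])   # y_i of all neurons
mask = np.array([True, True, False, True, False])    # GA-selected combination
print(total_output(outputs, mask))                   # mean over the 3 active neurons
```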

Experiments
The presented 3-variable multi-layered D-PNN (Figure 3) is able to approximate any linear function, e.g. the simple sum f(x1, x2, x3) = x1 + x2 + x3. The D-PNN and ANN comparison uses the same 12 fixed training data samples. The progress curves are typical of all the following experiments (benchmarks) (Figure 4). The approximation accuracy of both methods is comparable on the trained interval values <10, 300>; however, the ANN approximation ability falls rapidly outside of this range, while the D-PNN alternate errors grow only slowly. The ANN, with 2 hidden layers of neurons, applied the sigmoidal activation function and the standard back-propagation algorithm. The D-PNN output typically has a wave-like behavior, as its model is composed of sum DE terms.

Figure 6 Comparison of the approximations of f(x1, x2, x3) = x1 + x2^2 + x3^3
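The extrapolation test of the 3-variable linear benchmark can be reproduced in outline as below; the random seed, the helper names and the <300, 600> testing range are assumptions chosen for this sketch, following only the idea of testing outside the trained interval.

```python
# Sketch of the linear-benchmark setup: 12 training samples of
# f(x1, x2, x3) = x1 + x2 + x3 from <10, 300>, tested outside that range.
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    return x.sum(axis=1)

X_train = rng.uniform(10, 300, size=(12, 3))    # trained interval
X_test = rng.uniform(300, 600, size=(12, 3))    # untrained interval
y_train, y_test = f(X_train), f(X_test)

def rmse(y_pred, y_true):
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

# A trivial "model" that only memorizes the training mean shows how badly
# an absolute-value fit can degrade once inputs leave the trained interval.
y_pred = np.full_like(y_test, y_train.mean())
print(rmse(y_pred, y_test))
```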

The approximation of non-linear functions requires extending the D-PNN block and neuron polynomials (11), (12), (13) with square-power variables. The polynomials are the same as those applied by the GMDH algorithm (2). The corresponding square-power (16) and combination (17) derivatives form additional sum terms of the 2nd-order partial DE (15). The compound neurons of these derivatives are also formed according to the composite-function derivative rules [10]:

F\left(x_1, x_2, u, \frac{\partial u}{\partial x_1}, \frac{\partial u}{\partial x_2}, \frac{\partial^2 u}{\partial x_1^2}, \frac{\partial^2 u}{\partial x_1 \partial x_2}, \frac{\partial^2 u}{\partial x_2^2}\right) = 0 \qquad (15)

where F(x_1, x_2, u, p, q, r, s, t) is a function of 8 variables.

y_{10} = w_{10} \frac{a_0 + a_1 x_1 + a_2 x_2 + a_3 x_1^2 + a_4 x_2^2 + a_5 x_1 x_2}{b_0 + b_1 x_1 + b_2 x_1^2} = \frac{\partial^2 f(x_1, x_2)}{\partial x_1^2} \qquad (16)

y_{12} = w_{12} \frac{a_0 + a_1 x_1 + a_2 x_2 + a_3 x_1^2 + a_4 x_2^2 + a_5 x_1 x_2}{b_0 + b_1 x_1 + b_2 x_2 + b_3 x_1 x_2} = \frac{\partial^2 f(x_1, x_2)}{\partial x_1 \partial x_2} \qquad (17)
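For clarity, the sketch below evaluates one such square-power term in the form of (16), with a full GMDH quadratic polynomial in the numerator and a reduced quadratic denominator in x1; all parameter values are illustrative placeholders rather than fitted values.

```python
# Sketch of one square-power term of the 2nd-order DE, following the form of (16).
import numpy as np

def second_order_term(x1, x2, a, b, w):
    """Term standing in for d^2 f / dx1^2 in the DE substitution (16)."""
    numerator = (a[0] + a[1] * x1 + a[2] * x2
                 + a[3] * x1**2 + a[4] * x2**2 + a[5] * x1 * x2)
    denominator = b[0] + b[1] * x1 + b[2] * x1**2
    return w * numerator / denominator

a = [0.5, 1.2, 0.8, 0.9, 1.1, 0.7]   # hypothetical numerator parameters
b = [0.6, 1.0, 0.9]                  # hypothetical denominator parameters
print(second_order_term(25.0, 40.0, a, b, w=1.3))
```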

Figures 5 and 6 compare the D-PNN and ANN approximations of some benchmarks, i.e. growing non-linear functions.

Figure 7 Models of the function f(x1, x2, x3) = (x1 + x2 + x3)^2 with decreasing training errors (a, b, c)


The 24 training data samples were randomly generated by the benchmark functions from the interval <10, 400> for both network models. The parameter and weight adjustment of both methods proved heavily time-consuming and did not fully succeed in any experiment. The optimal number of the D-PNN's derivative neurons in the non-linear benchmark models was around 100. Experiments with other benchmarks (e.g. x2^2 + x3^3 + x3^4) resulted in similar outcome graphs. The D-PNN wave-like output can cover a wider range of testing data interval values, which were not trained, than the ANN can. The inaccuracies of both methods intensify on untrained interval values, as the benchmarks involve power functions. The presented experiments verified the network's capability to approximate any multi-parametric function, though the operating principles of the two techniques differ essentially.

Figure 7a-c shows D-PNN models with different final training root mean square errors (RMSE). A decrease in RMSE results in lower generalization on untrained data interval values (Figure 7c). Vice versa, a greater RMSE means the model is valid for a wider data range, while the inaccuracies on the training data interval are higher (Figure 7a). This effect became evident in all experiments. As a result, the D-PNN should not be trained to the minimal achievable error value in order to obtain the optimal generalization on testing data. The applied incomplete adjustment and selection methods require improvements, which could yield better results. The presented D-PNN operating principle differs considerably from other commonly applied neural network techniques. The benchmark results suggest that the D-PNN will succeed in forming complex models that can be defined in the form of multi-parametric functions.

Conclusion
The D-PNN is a new type of neural network whose function approximation and identification of variable dependencies are based on a generalization of data relations. It does not utilize absolute values of variables but relative ones, which can better describe a wide range of input data interval values. The D-PNN constructs a general partial differential equation, which defines a system model of dependent variables, applying integral fractional polynomial sum terms. An acceptable implementation of the sum derivative terms is the principal part of the partial DE substitution. Artificial neural network pattern identification and function approximation models are simpler techniques, based only on whole-pattern similarity relations. A real-data application might address weather forecasting for one locality, e.g. the prediction of static pressure values, applying trained data relations of a few nearby localities in the surrounding area. Phases of a constant time interval of this very complex system could define the input vectors of the training data set. The estimated multi-parametric function values, i.e. the next system states of a selected locality after a time delay, form the desired network outputs. The D-PNN could create better long-term models, based on a partial sum DE solution, than a standard ANN time-series prediction (which is also based on entire pattern definitions).

Competing interests
The author declares that they have no competing interests.

Author's contributions
The author designed a new type of neural network, based on the GMDH polynomial neural network, which forms its skeleton structure. The proposed neural network generates a sum of relative derivative polynomial terms as a general differential equation model description. It substitutes and resolves the sum partial differential equation as an approximation of an unknown multi-parametric function defined by several discrete point observations. The network was tested with some benchmark functions and compared with artificial neural network models. Real-data application models will follow the test experiments.

Acknowledgements
This paper has been elaborated in the framework of the IT4Innovations Centre of Excellence project, reg. no. CZ.1.05/1.1.00/02.0070, supported by the Operational Programme 'Research and Development for Innovations' funded by the Structural Funds of the European Union and the state budget of the Czech Republic, and supported by the Ministry of Industry and Trade of the Czech Republic under grant no. FR-TI1/420 and by SGS VŠB – Technical University of Ostrava, Czech Republic, under grant no. SP2012/58.

Received: 31 May 2012. Accepted: 21 June 2013. Published: 24 July 2013.

References
1. Zhou, B., Xiao-Li, Y., Liu, R., Wei, W.: Image segmentation with partial differential equations. Information Technology Journal 9(5), 1049–1052 (2010)
2. Iba, H.: Inference of differential equation models by genetic programming. Information Sciences 178(23), 4453–4468 (2008)
3. Tsoulos, I., Gavrilis, D., Glavas, E.: Solving differential equations with constructed neural networks. Neurocomputing 72(10–12), 2385–2391 (2009)
4. Giles, C.L.: Noisy time series prediction using recurrent neural networks and grammatical inference. Machine Learning 44, 161–183 (2001)
5. Zjavka, L.: Recognition of generalized patterns by a differential polynomial neural network. ETASR – Engineering, Technology & Applied Science Research 2(1), 167–172 (2012)
6. Ivakhnenko, A.G.: Polynomial theory of complex systems. IEEE Transactions on Systems, Man, and Cybernetics SMC-1(4) (1971)
7. Nikolaev, N.Y., Iba, H.: Adaptive Learning of Polynomial Networks. Springer, New York (2006)
8. Kuneš, J., Vavroch, O., Franta, V.: Fundamentals of Modeling. SNTL, Praha (1989), in Czech
9. Das, S., Abraham, A., Konar, A.: Particle swarm optimization and differential evolution algorithms. Studies in Computational Intelligence 116, 1–38. Springer-Verlag, Berlin (2008)
10. Kluvánek, I., Mišík, L., Švec, M.: Matematika I. SNTL, Bratislava (1966), in Slovak
11. Obitko, M.: Genetic algorithms. Hochschule für Technik und Wirtschaft, Dresden (1998). [Online] Available: http://www.obitko.com/tutorials/genetic-algorithms
12. Nikolaev, N.Y., Iba, H.: Polynomial harmonic GMDH learning networks for time series modeling. Neural Networks 16, 1527–1540 (2003)
13. Zjavka, L.: Construction and adjustment of differential polynomial neural network. Journal of Engineering and Computer Innovations 2(3), 40–50 (2011)
14. Galkin, I.: Polynomial neural networks. Materials for UML 91.531 Data Mining course, University of Massachusetts Lowell. http://ulcar.uml.edu/~iag/CS/Polynomial-NN.html

Cite this article as: Zjavka, L.: Approximation of multi-parametric functions using the differential polynomial neural network. Mathematical Sciences 2013, 7:33. doi:10.1186/2251-7456-7-33
