
Evolving Granular Neural Network for Fuzzy Time Series Forecasting

Daniel Leite, Pyramo Costa, and Fernando Gomide

Abstract—A primary requirement of a broad class of evolving intelligent systems is to process a sequence of numeric data over time. This paper introduces a granular neural network framework for evolving fuzzy system modeling from fuzzy data streams. The evolving granular neural network (eGNN) efficiently handles concept changes, distinctive events of nonstationary environments. eGNN constructs interpretable multi-sized local models using fuzzy neural information fusion. An incremental learning algorithm builds the neural network topology from the information contained in data streams. Here we emphasize fuzzy intervals and objects with trapezoidal membership functions. Triangular fuzzy numbers, intervals, and numeric data are particular instances of trapezoids. An example concerning weather time series forecasting illustrates the neural network performance. The goal is to extract, from monthly temperature data, information of interest to attain accurate one-step forecasts and better rapport with reality. Simulation results suggest that eGNN learns from fuzzy data successfully and is competitive with state-of-the-art approaches.

I. INTRODUCTION

The prominent presence of data streams and online information processing in real-world systems, along with the necessity of modeling, analyzing, and understanding these systems, has brought new challenges, higher demands, and new research directions. Research and development of conceptual frameworks, methods, and algorithms capable of extracting knowledge from data streams is also motivated by a manifold of important applications [1] - [3].

Data stream modeling is fundamentally based on computational learning approaches that both process data continuously, attempting to find similarities in their spatio-temporal features, and thereafter provide insights about the phenomenon that governs the data. The ultimate goal is to obtain more abstract (often human-centric) representations of large amounts of detailed data with no apparent value.

As real-world systems become more complex, modeling, processing, and disposing of information become more complex as well. Data streams are characterized by nonstationarity, nonlinearity, and heterogeneity; they are potentially endless and may be subjected to changes of various kinds. Direct application of machine learning and data mining algorithms to data streams is often infeasible because it is difficult to maintain all the data in memory. In particular, a challenge faced in stream modeling concerns handling uncertainty.

Daniel Leite and Fernando Gomide are with the Department of Computer Engineering and Automation, School of Electrical and Computer Engineering, University of Campinas, 13083-852 BRA, e-mail: {danfl7, gomide}@dca.fee.unicamp.br. Pyramo Costa is with the Graduate Program in Electrical Engineering, Pontifical Catholic University of Minas Gerais, 30535-610 BRA, e-mail: [email protected].

Uncertainty is an attribute of information, since our ability to perceive reality is limited [4]. The more complex a system is, the more uncertain the available information, and the more imprecise our understanding of that system. As Kreinovich stated, measurements and expert estimates are never exact [5]. Granular computing theory [6] - [10] hypothesizes that accepting some level of uncertainty may be beneficial and therefore suggests a balance between precision and uncertainty.

Information granulation for uncertainty representation is a fundamental manifestation of human knowledge [6]. Information granulation means that instead of dealing with detailed real-world data, the data are considered from a more abstract and conceptual perspective. The result of information granulation is called an information granule, a granule being a clump of objects, subsets, clusters, or elements of a universe put together by similarity, proximity, or functionality [11]. There are close relations between granulation, data mining [12], data fusion [13], and knowledge discovery [14].

Granular models developed from data streams can be expressed within several computationally tractable frameworks. Of special concern to this paper are fuzzy granular data streams and the evolving neurofuzzy modeling framework. Fuzzy granular data may arise from expert judgment, readings from unreliable sensors, and summaries of numeric (singular) data over time periods. Artificial neural networks are nonlinear, highly plastic systems equipped with significant learning capability. Fuzzy sets and fuzzy neurons provide neural networks with mechanisms of approximate reasoning and transparency of the resulting construction. Fuzzy sets and neurocomputing are complementary in terms of their strengths, thus motivating neurofuzzy granular computing. The evolving aspect of neurofuzzy networks accounts for endless streams of nonstationary data and structural adaptation of models on an incremental basis.

This paper introduces an evolving granular neural network (eGNN) approach for fuzzy time series forecasting. Refer to [15] for the pioneering work in granular non-evolving neural networks, [16] - [17] for regression and semi-supervised classification applications of eGNN, and [18] - [19] for related interval and fuzzy evolving granular approaches. In this paper, the proposed eGNN plays the role of an evolving predictor able to capture the essence of uncertain (fuzzy) time series data in a more abstract and compact representation.

The remainder of this paper is organized as follows. Section II presents fuzzy aggregation neurons, which are key constructs of granular neurofuzzy networks. The topology


of eGNN is introduced in Section III. Section IV addresses the gradual construction of granular networks by means of a one-pass recursive learning algorithm. Section V presents the results obtained by eGNN and alternative approaches in temperature time series forecasting. Section VI concludes the paper and suggests issues for further investigation.

II. FUZZY AGGREGATION NEURON

Aggregation neurons are artificial neuron models based on aggregation operators [20]. Evolving granular neural networks may use different types of aggregation neurons to perform information fusion. In general, there are no specific guidelines to choose a particular aggregation operator to construct a fuzzy neuron. The choice depends on the application environment and domain knowledge [21].

Aggregation operators $A : [0,1]^n \to [0,1]$, $n > 1$, combine input values in the unit hypercube $[0,1]^n$ into a single output value in $[0,1]$. They must satisfy two fundamental properties: (i) monotonicity in all arguments, i.e., given $x^1 = (x_1^1, ..., x_n^1)$ and $x^2 = (x_1^2, ..., x_n^2)$, if $x_j^1 \le x_j^2 \; \forall j$ then $A(x^1) \le A(x^2)$; (ii) boundary conditions: $A(0, 0, ..., 0) = 0$ and $A(1, 1, ..., 1) = 1$. The classes of aggregation operators considered in this work are summarized below. See [7], [21] for a detailed coverage and [16] for other examples of operators that can be used to construct granular networks.

A. T-norm aggregation

T-norms ($T$) are commutative, associative and monotone operators on the unit hypercube whose boundary conditions are $T(\alpha, \alpha, ..., 0) = 0$ and $T(\alpha, 1, ..., 1) = \alpha$, $\alpha \in [0,1]$. An example of T-norm is the minimum operator:

$$T_{min}(x) = \min_{j=1,...,n} x_j, \qquad (1)$$

which is the strongest T-norm because

$$T(x) \le T_{min}(x) \text{ for any } x \in [0,1]^n. \qquad (2)$$

The minimum is also idempotent, symmetric and Lipschitz-continuous. A further example of T-norm is the product,

$$T_{prod}(x) = \prod_{j=1}^{n} x_j, \qquad (3)$$

which is a non-idempotent, but symmetric and Lipschitz-continuous aggregation operator.

B. Averaging aggregation

An aggregation operator $A$ is averaging if for every $x \in [0,1]^n$ it is bounded by

$$T_{min}(x) \le A(x) \le S_{max}(x), \qquad (4)$$

where $S_{max}$ is the maximum S-norm operator,

$$S_{max}(x) = \max_{j=1,...,n} x_j. \qquad (5)$$

The basic rule is that the output value of an averaging operator can be neither lower than the smallest nor higher than the largest input value. An example of averaging operator is the arithmetic mean:

$$M(x) = \frac{1}{n} \sum_{j=1}^{n} x_j. \qquad (6)$$

The arithmetic mean is idempotent, strictly increasing, symmetric, homogeneous, and Lipschitz continuous.
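For illustration, the snippet below renders the operators of Eqs. (1), (3), (5) and (6) in Python; it is a minimal sketch with names of our choosing, not code from the original formulation.

# Illustrative sketch of the aggregation operators in Eqs. (1), (3), (5), (6).
# Inputs are vectors of membership degrees in [0, 1]; names are ours.
from functools import reduce

def t_min(x):   # minimum T-norm, Eq. (1)
    return min(x)

def t_prod(x):  # product T-norm, Eq. (3)
    return reduce(lambda a, b: a * b, x, 1.0)

def s_max(x):   # maximum S-norm, Eq. (5)
    return max(x)

def mean(x):    # arithmetic mean, Eq. (6)
    return sum(x) / len(x)

degrees = [0.9, 0.6, 0.75]
assert t_min(degrees) <= mean(degrees) <= s_max(degrees)  # averaging bound, Eq. (4)
print(t_min(degrees), t_prod(degrees), mean(degrees), s_max(degrees))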

C. Fuzzy aggregation neuron model

Let $\tilde{x} = (\tilde{x}_1, ..., \tilde{x}_n)$ be the vector of membership degrees of a sample $x = (x_1, ..., x_n)$ in the fuzzy sets $G = (G_1, ..., G_n)$. Let $w = (w_1, ..., w_n)$ be a weighting vector such that

$$w_j \in [0, 1], \; j = 1, ..., n. \qquad (7)$$

Fuzzy aggregation neurons employ the product T-norm to perform synaptic processing and an aggregation operator $A$ to fuse the individual results of synaptic processing in the neuron body. The output of a fuzzy aggregation neuron is

$$o = A(\tilde{x}_1 w_1, ..., \tilde{x}_n w_n). \qquad (8)$$

An aggregation neuron produces a diversity of nonlinear mappings between the neuron inputs and output, depending on the choice of the weights $w$ and of the aggregation operator $A$. The structure of a fuzzy aggregation neuron is shown in Fig. 1.

Fig. 1. Fuzzy aggregation neuron model
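To make the neuron model concrete, here is a minimal sketch of Eq. (8), assuming the aggregation operator is supplied as a function; the function name and example values are ours.

# Sketch of a fuzzy aggregation neuron, Eq. (8): o = A(x1*w1, ..., xn*wn).
# Synaptic processing is the product T-norm on (membership degree, weight) pairs.
def aggregation_neuron(degrees, weights, A):
    synapses = [x * w for x, w in zip(degrees, weights)]  # product T-norm synapses
    return A(synapses)                                    # fusion in the neuron body

o = aggregation_neuron([0.9, 0.6, 0.75], [1.0, 0.8, 1.0], A=min)
print(o)  # 0.48 with T-min aggregation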

III. EVOLVING GRANULAR NEURAL NETWORKS

The eGNN approach concerns online modeling of fuzzy data streams. Generally speaking, fuzzy data arise from imprecise perception or description of the value of a variable [22] - [23]. This paper emphasizes fuzzy trapezoidal data, that is, granular data expressed by trapezoidal fuzzy numbers. Trapezoids allow some freedom in the choice of representative granules since they encompass triangular fuzzy numbers, intervals and real values as particular instances [24].

The basic processing units of eGNN are fuzzy aggregation neurons. The network topology encodes a set of fuzzy rules, and neural processing conforms with a fuzzy inference system. The topology results from a gradual network construction that is transparent and interpretable. eGNN manages to discover more abstract, high-level granular knowledge from finer granular data. High-level granular knowledge can be easily translated into a fuzzy knowledge base. The consequent (Then) part of an eGNN rule is composed of linguistic and local functional (real-valued function) terms. Independently of the choice of aggregation neurons, network parameters, and the nature of the input-output data, the linguistic term of the rule consequent produces a granular output, while the functional term gives a singular (pointwise) output.


Learning in eGNN means recursively accommodating new data into the existing granular model. Learning may add, remove and combine granules, neurons and respective connections whenever necessary. The parameters of the real-valued functions of the rule consequents are also objects of learning. This means that eGNN captures new information from data streams, adapts itself to the new scenario, and avoids redesigning and retraining.

A. Fuzzy data stream

Fuzzy data arise from perceptions of expert knowledge, inaccurate measurements, variables that are hard to quantify precisely, or pre-processing steps that introduce uncertainty in singular data. A fuzzy data stream is a sequence of samples that conveys fuzzy granular information. Fuzzy intervals and numbers are instances of fuzzy data. A fuzzy datum $x_j$ has the following canonical form:

$$x_j(z) = \begin{cases} \phi_j, & z \in [\underline{x}_j, \underline{\underline{x}}_j[ \\ 1, & z \in [\underline{\underline{x}}_j, \overline{\overline{x}}_j] \\ \iota_j, & z \in \,]\overline{\overline{x}}_j, \overline{x}_j] \\ 0, & \text{otherwise} \end{cases} \qquad (9)$$

where $z$ is a real number in $X_j$. If the fuzzy datum $x_j$ is normal ($x_j(z) = 1$ for at least one $z \in \Re$) and convex ($x_j(\kappa z_1 + (1 - \kappa) z_2) \ge \min(x_j(z_1), x_j(z_2))$, $z_1, z_2 \in \Re$, $\kappa \in [0, 1]$), then it is a fuzzy interval [7]. In particular, if

$$\phi_j = \frac{z - \underline{x}_j}{\underline{\underline{x}}_j - \underline{x}_j} \quad \text{and} \qquad (10)$$

$$\iota_j = \frac{\overline{x}_j - z}{\overline{x}_j - \overline{\overline{x}}_j}, \qquad (11)$$

then the fuzzy datum (9) has a trapezoidal membership function and can be represented by the quadruple $(\underline{x}_j, \underline{\underline{x}}_j, \overline{\overline{x}}_j, \overline{x}_j)$. When $\underline{\underline{x}}_j = \overline{\overline{x}}_j$, the fuzzy datum is a fuzzy number.

In this paper we focus on data streams of trapezoidal and symmetric fuzzy intervals, similar to [23]. Fuzzy granular data streams generalize singular data streams by allowing fuzzy data.
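To make the canonical form (9) - (11) concrete, the sketch below evaluates the membership function of a trapezoidal fuzzy datum pointwise; the class name and tuple ordering (lower support, lower core, upper core, upper support) are our rendering of the quadruple above.

# Sketch of a trapezoidal fuzzy datum per Eqs. (9)-(11): support [l, u],
# core [ll, uu], with linear slopes on [l, ll[ and ]uu, u].
from dataclasses import dataclass

@dataclass
class Trapezoid:
    l: float   # lower support bound
    ll: float  # lower core bound
    uu: float  # upper core bound
    u: float   # upper support bound

    def membership(self, z: float) -> float:
        if self.ll <= z <= self.uu:                  # core
            return 1.0
        if self.l <= z < self.ll:                    # phi_j, Eq. (10)
            return (z - self.l) / (self.ll - self.l)
        if self.uu < z <= self.u:                    # iota_j, Eq. (11)
            return (self.u - z) / (self.u - self.uu)
        return 0.0

    def midpoint(self) -> float:                     # core midpoint, cf. Eq. (13)
        return (self.ll + self.uu) / 2

x = Trapezoid(0.1, 0.2, 0.4, 0.5)
print(x.membership(0.15), x.membership(0.3), x.midpoint())  # 0.5 1.0 0.3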

B. Structure and processing

Let $x = (x_1, ..., x_n)$ be an input vector and $y$ its corresponding output. Assume that the data stream $(x, y)^{[h]}$, $h = 1, ...,$ consists of samples produced by a nonstationary function $f$. The inputs $x_j$ and the output $y$ are symmetric fuzzy data.

Fig. 2 depicts the four-layer eGNN structure. The first layer inputs samples $x^{[h]}$, one at a time, to the network. The second (granular) layer consists of a collection of fuzzy sets $G_j^i$, $j = 1, ..., n$; $i = 1, ..., c$, stratified from the input data. Fuzzy sets $G_j^i$, $i = 1, ..., c$, form a fuzzy partition of the $j$-th input domain, $X_j$. Similarly, fuzzy sets $\Gamma^i$, $i = 1, ..., c$, give a fuzzy partition of the output domain $Y$. A granule $\gamma^i = G_1^i \times ... \times G_n^i \times \Gamma^i$ is a fuzzy relation, a multidimensional fuzzy set in $X_1 \times ... \times X_n \times Y$. Thus, granule $\gamma^i$ has membership function $\gamma^i(x, y) = \min\{G_1^i(x_1), ..., G_n^i(x_n), \Gamma^i(y)\}$ in $X_1 \times ... \times X_n \times Y$. Granule $\gamma^i$ is denoted by $\gamma^i = (G^i, \Gamma^i)$ with $G^i = (G_1^i, ..., G_n^i)$, for short. The granule $\gamma^i$ has a companion local function $p^i$. In this paper we use real-valued affine functions:

$$p^i(\hat{x}_1, ..., \hat{x}_n) = \hat{y}^i = a_0^i + \sum_{j=1}^{n} a_j^i \hat{x}_j, \qquad (12)$$

where the parameters $a_0^i$ and $a_j^i$ are real values; $\hat{x}_j$ is the midpoint of $x_j = (\underline{x}_j, \underline{\underline{x}}_j, \overline{\overline{x}}_j, \overline{x}_j)$, computed as follows:

$$\text{mp}(x_j) = \hat{x}_j = \frac{\underline{\underline{x}}_j + \overline{\overline{x}}_j}{2}. \qquad (13)$$

The similarity degrees $\tilde{x}^i = (\tilde{x}_1^i, ..., \tilde{x}_n^i)$ result from matching the input $x = (x_1, ..., x_n)$ against the fuzzy sets of $G^i = (G_1^i, ..., G_n^i)$, see Section IV-C. The third (aggregation) layer has fuzzy aggregation neurons $A^i$, $i = 1, ..., c$, to combine the values from different inputs. A fuzzy neuron $A^i$ combines weighted similarity degrees $(\tilde{x}_1^i w_1^i, ..., \tilde{x}_n^i w_n^i)$ into a single value $o^i$. The fourth (output) layer processes weighted values $(o^1 \hat{y}^1 \delta^1, ..., o^c \hat{y}^c \delta^c)$ using a fuzzy aggregation neuron $A^f$ to produce a singular output $\hat{y}^{[h]}$.

Fig. 2. eGNN topology and singular output

An $m$-output eGNN needs a vector of local functions $(p^{i1}, ..., p^{im})$, $m$ output-layer neurons $(A^{f1}, ..., A^{fm})$, and $m$ outputs $(\hat{y}^1, ..., \hat{y}^m)$. The network output $\hat{y}$, obtained as shown in Fig. 2, is a singular approximation of $f$, regardless of whether the input data are singular or granular.

The granular approximation of function $f$ at step $H$ is a set of granules $\gamma^i$, $i = 1, ..., c$, such that:

$$(x, y)^{[h]} \subseteq \bigcup_{i=1}^{c} \gamma^i, \quad h = 1, ..., H. \qquad (14)$$

The granular approximation is constructed by granulating both the input data $x^{[h]}$ into fuzzy sets of $G^i$, as shown in Fig. 2, and the output data $y^{[h]}$ into fuzzy sets $\Gamma^i$, as summarized in Fig. 3. Note that the granular approximation is the convex hull of the output fuzzy sets $\Gamma^{i^*}$, where $i^*$ are the indices of active granules, that is, those for which $o^i > 0$. This guarantees that the singular approximation $\hat{y}^{[h]}$ is included in the granule.

Fig. 3. Granular approximation from input and output data granulation

The convex hull of trapezoidal fuzzy sets $\Gamma^1, ..., \Gamma^i, ..., \Gamma^c$, with $\Gamma^i = (\underline{u}^i, \underline{\underline{u}}^i, \overline{\overline{u}}^i, \overline{u}^i)$, is a trapezoidal fuzzy set $\text{ch}(\Gamma^1, ..., \Gamma^c)$ whose representation is

$$\text{ch}(\Gamma^1, ..., \Gamma^c) = (\min(\underline{u}^1, ..., \underline{u}^c), \min(\underline{\underline{u}}^1, ..., \underline{\underline{u}}^c), \max(\overline{\overline{u}}^1, ..., \overline{\overline{u}}^c), \max(\overline{u}^1, ..., \overline{u}^c)). \qquad (15)$$

In particular, the trapezoid $(\underline{u}^{i^*}, \underline{\underline{u}}^{i^*}, \overline{\overline{u}}^{i^*}, \overline{u}^{i^*})$ of Fig. 3 that results from $\text{ch}(\Gamma^{i^*})$, $i^* = \{i : o^i > 0, i = 1, ..., c\}$, is a granular approximation of $y$. It is worth noting that the granular approximation at step $h$ does not depend on the availability of $y^{[h]}$ because $o^i$ is obtained from $x^{[h]}$ (see Fig. 2). Only the collection of output fuzzy sets $\Gamma^i$ is required.
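A sketch of the convex hull operation (15), with trapezoids written as (lower support, lower core, upper core, upper support) tuples; the naming is ours.

# Convex hull of trapezoids, Eq. (15): elementwise min of the lower
# bounds and max of the upper bounds.
def convex_hull(*traps):
    return (min(t[0] for t in traps), min(t[1] for t in traps),
            max(t[2] for t in traps), max(t[3] for t in traps))

gamma2 = (0.20, 0.25, 0.35, 0.40)
gamma3 = (0.30, 0.38, 0.45, 0.55)
print(convex_hull(gamma2, gamma3))  # (0.2, 0.25, 0.45, 0.55)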

Figure 4 illustrates the singular and granular approximations, $p$ and $\bigcup_{i=1}^{c} \gamma^i$, of a function $f$. In Fig. 4(a), a singular input $x^{[h_1]}$ and a granular input $x^{[h_2]}$ produce singular outputs $\hat{y}^{[h_1]}$ and $\hat{y}^{[h_2]}$ using $p$. In Fig. 4(b), the granular input $x^{[h]}$ activates the fuzzy sets of $G^2$ and $G^3$. Therefore, the granular output is $\text{ch}(\Gamma^2, \Gamma^3)$. Notice that $\hat{y}^{[h]} \subset \text{ch}(\Gamma^2, \Gamma^3)$.

eGNN develops functional and linguistic fuzzy models. While functional fuzzy models are more precise, linguistic fuzzy models are more interpretable. Accuracy and interpretability require tradeoffs, and one usually excels over the other. eGNN links functional and linguistic systems in a single framework. Under assumptions on specific weights and neuron types, the fuzzy rules extracted from eGNN can be of the type:

$R^i$: IF ($x_1$ is $G_1^i$) AND ... AND ($x_n$ is $G_n^i$) THEN ($\hat{y}$ is $\Gamma^i$) AND $\hat{y} = p^i(\hat{x}_1, ..., \hat{x}_n)$,

where the term ($\hat{y}$ is $\Gamma^i$) is the linguistic part of the consequent and $\hat{y} = p^i(\hat{x}_1, ..., \hat{x}_n)$ is the functional part. As an example, eGNN can combine Mamdani and functional Takagi-Sugeno fuzzy models.

IV. RECURSIVE LEARNING

Construction of the fuzzy rules encoded in the eGNN structure and approximation of nonstationary functions from granular data streams are the key goals of the learning approach. Because the application domain may be unknown beforehand, eGNN learning is mostly bottom-up. We assume that no granules and neurons exist before training starts.

(a) eGNN singular approximation of 𝑓

(b) eGNN granular approximation of 𝑓

Fig. 4. eGNN singular (a) and granular (b) approximation of function 𝑓

The algorithm builds the network structure in plug-and-play mode. A single pass over the data enables eGNN to address the issues of unbounded data sets and scalability.

A. Expansion

Membership functions of $G_j^i = (\underline{g}_j^i, \underline{\underline{g}}_j^i, \overline{\overline{g}}_j^i, \overline{g}_j^i)$ and of the input data $x_j = (\underline{x}_j, \underline{\underline{x}}_j, \overline{\overline{x}}_j, \overline{x}_j)$ are trapezoidal. Similarly, $\Gamma^i = (\underline{u}^i, \underline{\underline{u}}^i, \overline{\overline{u}}^i, \overline{u}^i)$ and the output data $y = (\underline{y}, \underline{\underline{y}}, \overline{\overline{y}}, \overline{y})$ are trapezoids. Each rule antecedent $G^i = (G_1^i, ..., G_n^i)$ has a corresponding consequent $\Gamma^i$. With $\gamma^i = (G^i, \Gamma^i)$, eGNN looks at examples $(x, y)$ at a coarser granule size.

The support and the core of a trapezoidal membership function $G_j^i$ are:

$$\text{supp}(G_j^i) = [\underline{g}_j^i, \overline{g}_j^i], \qquad (16)$$

$$\text{core}(G_j^i) = [\underline{\underline{g}}_j^i, \overline{\overline{g}}_j^i]. \qquad (17)$$

The midpoint and width of $G_j^i$ are as follows:

$$\text{mp}(G_j^i) = \frac{\underline{\underline{g}}_j^i + \overline{\overline{g}}_j^i}{2}, \qquad (18)$$

$$\text{wdt}(G_j^i) = \overline{g}_j^i - \underline{g}_j^i. \qquad (19)$$

The maximum width that fuzzy sets $G_j^i$ are allowed to expand to is denoted by $\rho$, i.e., $\text{wdt}(G_j^i) \le \rho$, $j = 1, ..., n$; $i = 1, ..., c$. Let the expansion region of a fuzzy set $G_j^i$ be

$$E_j^i = \left[\text{mp}(G_j^i) - \frac{\rho}{2}, \; \text{mp}(G_j^i) + \frac{\rho}{2}\right]. \qquad (20)$$

It follows that $\text{wdt}(G_j^i) \le \text{wdt}(E_j^i) \; \forall j, i$. Expressions similar to (16) - (20) can be written for the fuzzy sets $\Gamma^i$. Values of $\rho$ allow different representations of the same process at different levels of abstraction. Expansion regions help to derive criteria for deciding whether or not granular data should be considered enclosed by the current model.

B. Granularity adaptation

An appropriate balance between parametric and structural adaptation is key to capturing changes in nonstationary systems online. The procedure developed next gives a mechanism to parsimoniously reconcile parametric and structural changes in eGNN.

The value of $\rho$ affects the granularity, accuracy, and transparency of models. In practice, $\rho \in [0, 1]$ settles the size of the expansion regions (20) and the need to either create or adapt rules. In the most general case, eGNN starts learning with an empty rule base and with no a priori knowledge of data properties. In these circumstances it is worth initializing $\rho$ at an intermediate value, e.g. $\rho^{[0]} = 0.5$.

Let $r$ be the number of rules created in $h_r$ steps. If the number of rules grows faster than a rate $\eta$, that is, $r > \eta$, then $\rho$ is increased,

$$\rho(\text{new}) = \left(1 + \frac{r}{h_r}\right)\rho(\text{old}). \qquad (21)$$

Equation (21) acts against outbursts of growth, since large rule bases increase model complexity and worsen generalization. If the number of rules grows at a rate smaller than $\eta$, that is, $r \le \eta$, then $\rho$ is decreased,

$$\rho(\text{new}) = \left(1 - \frac{\eta - r}{h_r}\right)\rho(\text{old}). \qquad (22)$$

If $\rho = 1$, then eGNN is structurally stable, but unable to capture abrupt changes. Conversely, if $\rho = 0$, then eGNN overfits the data, causing excessive complexity and irreproducibly optimistic results. Life-long adaptability is reached by choosing intermediate values for $\rho$.
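The update of $\rho$ can be sketched as below, assuming $r$, $h_r$ and $\eta$ are tracked by the caller; this is an illustrative reading of (21) - (22), not the authors' code.

# Granularity update, Eqs. (21)-(22): rho grows when more than eta rules
# were created within the last h_r steps; otherwise it shrinks.
def update_rho(rho, r, h_r, eta):
    if r > eta:
        return (1 + r / h_r) * rho        # Eq. (21): coarsen the model
    return (1 - (eta - r) / h_r) * rho    # Eq. (22): refine the model

rho = 0.5                                 # intermediate initial value, rho[0]
print(update_rho(rho, r=6, h_r=40, eta=3))  # 0.575: growth burst -> coarser
print(update_rho(rho, r=1, h_r=40, eta=3))  # 0.475: slow growth  -> finer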

Reducing the maximum granule width may require shrinking larger granules to fit them to new data. In this case, the support of fuzzy set $G_j^i$ is narrowed as follows:

If $\text{mp}(G_j^i) - \frac{\rho(\text{new})}{2} > \underline{g}_j^i$ then $\underline{g}_j^i(\text{new}) = \text{mp}(G_j^i) - \frac{\rho(\text{new})}{2}$

If $\text{mp}(G_j^i) + \frac{\rho(\text{new})}{2} < \overline{g}_j^i$ then $\overline{g}_j^i(\text{new}) = \text{mp}(G_j^i) + \frac{\rho(\text{new})}{2}$

Cores $[\underline{\underline{g}}_j^i, \overline{\overline{g}}_j^i]$, and supports $[\underline{u}^i, \overline{u}^i]$ and cores $[\underline{\underline{u}}^i, \overline{\overline{u}}^i]$ of the fuzzy sets $\Gamma^i$ are handled similarly. Time-varying granularity is useful to avoid guesses on how fast and how often the data stream properties change. The accuracy-interpretability tradeoff is an important issue in neurofuzzy computing [25].

C. Computing similarity between data and models

Data and granules are trapezoidal fuzzy objects, and in this case a convenient similarity measure to quantify how the input data match current knowledge is:

$$\tilde{x}_j^i = 1 - \frac{|\underline{g}_j^i - \underline{x}_j| + |\underline{\underline{g}}_j^i - \underline{\underline{x}}_j| + |\overline{\overline{g}}_j^i - \overline{\overline{x}}_j| + |\overline{g}_j^i - \overline{x}_j|}{4(\max(\overline{g}_j^i, \overline{x}_j) - \min(\underline{g}_j^i, \underline{x}_j))}. \qquad (23)$$

This measure returns $\tilde{x}_j^i = 1$ for identical trapezoids and reduces linearly as any numerator term increases.
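A direct rendering of (23) in Python, with trapezoids as (lower support, lower core, upper core, upper support) tuples; the naming is ours.

# Trapezoid similarity, Eq. (23): 1 for identical trapezoids, decreasing
# linearly with the accumulated distance between corresponding bounds.
def similarity(g, x):
    num = sum(abs(gk - xk) for gk, xk in zip(g, x))
    den = 4 * (max(g[3], x[3]) - min(g[0], x[0]))
    return 1 - num / den

print(similarity((0.2, 0.3, 0.5, 0.6), (0.2, 0.3, 0.5, 0.6)))  # 1.0
print(similarity((0.2, 0.3, 0.5, 0.6), (0.3, 0.4, 0.6, 0.7)))  # 0.8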

D. Creation of new granules

The incremental procedure to create granules runs whenever the support of at least one entry of $(x_1, ..., x_n)$ is not enclosed by the expansion regions $(E_1^i, ..., E_n^i)$, $i = 1, ..., c$. This is the case when the fuzzy sets $G^i$ cannot be expanded beyond the limit $\rho$ to fit the sample. Analogously, if $\text{supp}(y)$ is not enclosed by $E^i$ for at least one $\Gamma^i$, then the sample should be enclosed by a new granule.

A new granule $\gamma^{c+1}$ is formed by fuzzy sets $G_j^{c+1}$ and $\Gamma^{c+1}$ whose parameters match the parameters of the sample:

$$G_j^{c+1} = (\underline{g}_j^{c+1}, \underline{\underline{g}}_j^{c+1}, \overline{\overline{g}}_j^{c+1}, \overline{g}_j^{c+1}) = (\underline{x}_j, \underline{\underline{x}}_j, \overline{\overline{x}}_j, \overline{x}_j) \qquad (24)$$

$$\Gamma^{c+1} = (\underline{u}^{c+1}, \underline{\underline{u}}^{c+1}, \overline{\overline{u}}^{c+1}, \overline{u}^{c+1}) = (\underline{y}, \underline{\underline{y}}, \overline{\overline{y}}, \overline{y}). \qquad (25)$$

The coefficients of the real-valued local function $p^{c+1}$ are

$$a_0^{c+1} = \text{mp}(y), \quad a_j^{c+1} = 0, \; j \ne 0. \qquad (26)$$

E. Adaptation of granules

Adaptation of granules means expanding or contracting the support and the core of the fuzzy sets $G_j^i$ and $\Gamma^i$ and, simultaneously, updating the coefficients of the local functions $p^i$.

Granule $\gamma^i$ can be adapted whenever a sample $(x, y)$ falls within its expansion region, that is, $\text{supp}(x_j) \subset E_j^i$, $j = 1, ..., n$, and $\text{supp}(y) \subset E^i$. In situations in which two or more granules qualify to enclose the data, adapting only one of the granules is enough to guarantee data inclusion. In particular, we may choose $\gamma^i$ such that $i = \arg\max(o^1, ..., o^c)$; in other words, choose the granule with the highest activation level.

Adaptation proceeds depending on where the input datum $x_j$ is located in relation to the fuzzy set $G_j^i$. More specifically:

If $\underline{x}_j \in [\text{mp}(G_j^i) - \frac{\rho}{2}, \underline{g}_j^i]$ then $\underline{g}_j^i(\text{new}) = \underline{x}_j$

If $\underline{\underline{x}}_j \in [\text{mp}(G_j^i) - \frac{\rho}{2}, \underline{\underline{g}}_j^i]$ then $\underline{\underline{g}}_j^i(\text{new}) = \underline{\underline{x}}_j$

If $\underline{\underline{x}}_j \in [\underline{\underline{g}}_j^i, \text{mp}(G_j^i)]$ then $\underline{\underline{g}}_j^i(\text{new}) = \underline{\underline{x}}_j$

If $\underline{\underline{x}}_j \in [\text{mp}(G_j^i), \text{mp}(G_j^i) + \frac{\rho}{2}]$ then $\underline{\underline{g}}_j^i(\text{new}) = \text{mp}(G_j^i)$

If $\overline{\overline{x}}_j \in [\text{mp}(G_j^i) - \frac{\rho}{2}, \text{mp}(G_j^i)]$ then $\overline{\overline{g}}_j^i(\text{new}) = \text{mp}(G_j^i)$

If $\overline{\overline{x}}_j \in [\text{mp}(G_j^i), \overline{\overline{g}}_j^i]$ then $\overline{\overline{g}}_j^i(\text{new}) = \overline{\overline{x}}_j$

If $\overline{\overline{x}}_j \in [\overline{\overline{g}}_j^i, \text{mp}(G_j^i) + \frac{\rho}{2}]$ then $\overline{\overline{g}}_j^i(\text{new}) = \overline{\overline{x}}_j$

If $\overline{x}_j \in [\overline{g}_j^i, \text{mp}(G_j^i) + \frac{\rho}{2}]$ then $\overline{g}_j^i(\text{new}) = \overline{x}_j$

The first and last rules perform support expansion, and the second and seventh rules take care of core expansion. The remaining cases concern core contraction.

Operations on the core parameters, $\underline{\underline{g}}_j^i$ and $\overline{\overline{g}}_j^i$, require adjustment of the midpoint of the respective fuzzy sets as follows:

$$\text{mp}(G_j^i)(\text{new}) = \frac{\underline{\underline{g}}_j^i(\text{new}) + \overline{\overline{g}}_j^i(\text{new})}{2}. \qquad (27)$$

As a result, support contraction may happen on two occasions:

If $\text{mp}(G_j^i)(\text{new}) - \frac{\rho}{2} > \underline{g}_j^i$ then $\underline{g}_j^i(\text{new}) = \text{mp}(G_j^i)(\text{new}) - \frac{\rho}{2}$

If $\text{mp}(G_j^i)(\text{new}) + \frac{\rho}{2} < \overline{g}_j^i$ then $\overline{g}_j^i(\text{new}) = \text{mp}(G_j^i)(\text{new}) + \frac{\rho}{2}$

Adaptation of the consequent fuzzy sets $\Gamma^i$ is done similarly using the output data $y$. Coefficients $a_j^i$ of the local functions $p^i$ are updated using the midpoints of the trapezoidal fuzzy data $(x, y)$ and the recursive least squares algorithm [18] - [19].
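The adaptation rules above can be sketched for a single attribute as follows; the tuple layout and the sequential application of the rules are our assumptions.

# Antecedent adaptation (Section IV-E) for one attribute. g and x are
# (l, ll, uu, u) tuples; rho is the maximum allowed width.
def adapt_fuzzy_set(g, x, rho):
    l, ll, uu, u = g
    xl, xll, xuu, xu = x
    mp = (ll + uu) / 2                        # Eq. (18)
    if mp - rho / 2 <= xl <= l:    l = xl     # support expansion (left)
    if mp - rho / 2 <= xll <= ll:  ll = xll   # core expansion (left)
    if ll <= xll <= mp:            ll = xll   # core contraction
    if mp <= xll <= mp + rho / 2:  ll = mp    # core contraction capped at mp
    if mp - rho / 2 <= xuu <= mp:  uu = mp    # core contraction capped at mp
    if mp <= xuu <= uu:            uu = xuu   # core contraction
    if uu <= xuu <= mp + rho / 2:  uu = xuu   # core expansion (right)
    if u <= xu <= mp + rho / 2:    u = xu     # support expansion (right)
    mp_new = (ll + uu) / 2                    # Eq. (27)
    l = max(l, mp_new - rho / 2)              # support contraction, if needed
    u = min(u, mp_new + rho / 2)
    return (l, ll, uu, u)

print(adapt_fuzzy_set((0.2, 0.3, 0.5, 0.6), (0.15, 0.3, 0.5, 0.65), rho=0.6))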

F. Weights update

Aggregation layer weights $w_j^i \in [0, 1]$ represent the importance of the $j$-th attribute of fuzzy set $G_j^i$ to the neural network output. If $w_j^i = 1$, then the output is not affected. A relatively lower value of $w_j^i$ discounts the impact of the respective attribute. The procedure described below assigns lower weight values to less helpful attributes.

Whenever a granule $\gamma^{c+1}$ is created, the learning procedure sets $w_j^{c+1} = 1 \; \forall j$. If it is known a priori that different input variables have different importance, then the values of $w_j^{c+1}$ can be chosen to reflect the application domain.

Considering the similarity measure (23) and the current approximation error,

$$\epsilon^{[h]} = y^{[h]} - p(x^{[h]}), \qquad (28)$$

the weights $w_j^i$ corresponding to the most active granule $\gamma^i$, where $i = \arg\max(o^1, ..., o^c)$, are recursively updated using

$$w_j^i(\text{new}) = w_j^i(\text{old}) - \tilde{x}_j^i \, o^i \, |\epsilon|. \qquad (29)$$

Equation (29) ascribes to the $j$-th attribute of $G^i$ a proportion of the approximation error.
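In code, the update (29) for the most active granule reads as follows; the names are illustrative.

# Attribute-weight update, Eq. (29): weights shrink in proportion to
# similarity, neuron activation, and the absolute approximation error.
def update_weights(w, x_tilde, o_i, eps):
    return [wj - xj * o_i * abs(eps) for wj, xj in zip(w, x_tilde)]

w = update_weights([1.0, 1.0], x_tilde=[0.9, 0.4], o_i=0.8, eps=-0.1)
print(w)  # [0.928, 0.968]: the more similar attribute absorbs more error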

G. Pruning granules

Pruning inactive granules aims at simplifying the eGNN structure and keeping it flexible enough to model dynamic behavior. Retaining a small number of highly active granules is a way to emphasize compactness and fast processing.

Output layer weights $\delta^i \in [0, 1]$ help pruning by encoding the amount of data assigned to granule $\gamma^i$. Learning starts with $\delta^i = 1$. During the next steps, $\delta^i$ is reduced whenever $\gamma^i$ is not activated within $h_r$ steps, as follows:

$$\delta^i(\text{new}) = \zeta \delta^i(\text{old}), \qquad (30)$$

where $\zeta \in [0, 1]$. Otherwise, if $\gamma^i$ is activated at least once within $h_r$ steps, then $\delta^i$ is increased:

$$\delta^i(\text{new}) = \delta^i(\text{old}) + \zeta(1 - \delta^i(\text{old})). \qquad (31)$$

If the value of $\delta^i$ falls below a threshold $\vartheta$, then granule $\gamma^i$ and its respective neuron $A^i$ are pruned, because they do not affect system accuracy significantly.
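A sketch of the activity bookkeeping (30) - (31) and the pruning test; the parameter values are ours.

# Output-weight update, Eqs. (30)-(31): delta decays geometrically for an
# inactive granule and recovers toward 1 when the granule is active.
def update_delta(delta, active, zeta=0.9):
    if active:
        return delta + zeta * (1 - delta)   # Eq. (31)
    return zeta * delta                     # Eq. (30)

delta, threshold = 1.0, 0.3                 # threshold plays the role of theta
for active in (False, False, False, True):
    delta = update_delta(delta, active)
    print(round(delta, 4), "prune" if delta < threshold else "keep")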

H. Combination of granules

Relationships between granules may be strong enough to justify assembling a larger granule that inherits the information of the smaller granules. A suitable metric to measure the distance between trapezoidal objects is:

$$D(\gamma^{i_1}, \gamma^{i_2}) = \frac{1}{4(n+1)} \Bigl( \sum_{j=1}^{n} \bigl( |\underline{g}_j^{i_1} - \underline{g}_j^{i_2}| + |\underline{\underline{g}}_j^{i_1} - \underline{\underline{g}}_j^{i_2}| + |\overline{\overline{g}}_j^{i_1} - \overline{\overline{g}}_j^{i_2}| + |\overline{g}_j^{i_1} - \overline{g}_j^{i_2}| \bigr) + |\underline{u}^{i_1} - \underline{u}^{i_2}| + |\underline{\underline{u}}^{i_1} - \underline{\underline{u}}^{i_2}| + |\overline{\overline{u}}^{i_1} - \overline{\overline{u}}^{i_2}| + |\overline{u}^{i_1} - \overline{u}^{i_2}| \Bigr). \qquad (32)$$

$D$ satisfies

$D(\gamma^{i_1}, \gamma^{i_2}) \ge 0$

$D(\gamma^{i_1}, \gamma^{i_2}) = 0$ if and only if $\gamma^{i_1} = \gamma^{i_2}$

$D(\gamma^{i_1}, \gamma^{i_2}) = D(\gamma^{i_2}, \gamma^{i_1})$

$D(\gamma^{i_1}, \gamma^{i_3}) \le D(\gamma^{i_1}, \gamma^{i_2}) + D(\gamma^{i_2}, \gamma^{i_3})$

for any $\gamma^{i_1}$, $\gamma^{i_2}$ and $\gamma^{i_3}$. Thus $D$ is a distance measure. In addition, $D$ is fast to compute and more accurate than both the distance between midpoints and the distance between closest points.

Granules are combined after $h_r$ steps considering the lowest value of $D(\gamma^{i_1}, \gamma^{i_2})$, $i_1, i_2 = 1, ..., c$, $i_1 \ne i_2$, and a decision criterion. The decision criterion may consider whether the new granule obeys the maximum allowed width $\rho$.

A new granule $\gamma^i$, a coarsening of $\gamma^{i_1}$ and $\gamma^{i_2}$, is formed by trapezoidal membership functions $G_j^i$ as follows:

$$G_j^i = \text{ch}(G_j^{i_1}, G_j^{i_2}), \quad j = 1, ..., n. \qquad (33)$$

$\Gamma^i$ is obtained similarly. The new granule $\gamma^i$ encloses the support and core of the granules combined.

The coefficients of the new local function $p^i$ are found using the expressions:

$$a_j^i = \frac{1}{2}(a_j^{i_1} + a_j^{i_2}), \quad j = 0, ..., n. \qquad (34)$$
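The distance (32) and the merge (33) - (34) can be sketched as below, with granules as dictionaries of antecedent trapezoids, a consequent trapezoid, and affine coefficients; the layout is ours.

# Granule distance, Eq. (32), and combination, Eqs. (33)-(34).
def distance(g1, g2):
    n = len(g1["G"])
    total = sum(abs(p - q) for t1, t2 in zip(g1["G"], g2["G"])
                for p, q in zip(t1, t2))                    # antecedent bounds
    total += sum(abs(p - q) for p, q in zip(g1["Gamma"], g2["Gamma"]))
    return total / (4 * (n + 1))

def merge(g1, g2):
    ch = lambda t1, t2: (min(t1[0], t2[0]), min(t1[1], t2[1]),
                         max(t1[2], t2[2]), max(t1[3], t2[3]))      # Eq. (15)
    return {"G": [ch(t1, t2) for t1, t2 in zip(g1["G"], g2["G"])],  # Eq. (33)
            "Gamma": ch(g1["Gamma"], g2["Gamma"]),
            "a": [(p + q) / 2 for p, q in zip(g1["a"], g2["a"])]}   # Eq. (34)

g1 = {"G": [(0.1, 0.2, 0.3, 0.4)], "Gamma": (0.2, 0.3, 0.4, 0.5), "a": [0.35, 0.0]}
g2 = {"G": [(0.2, 0.3, 0.4, 0.5)], "Gamma": (0.3, 0.4, 0.5, 0.6), "a": [0.45, 0.1]}
print(distance(g1, g2))  # 0.1
print(merge(g1, g2))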

I. Learning algorithm

The learning algorithm to evolve granular neural networks can be summarized as follows:

———————————————————————
BEGIN
Select a type of neuron for the aggregation and output layers;
Set parameters $\rho$, $h_r$, $\eta$, $\zeta$, $\vartheta$; $c = 0$;
Read $(x, y)^{[h]}$, $h = 1$;
Create granule $\gamma^{c+1}$, neurons $A^{c+1}$, $A^f$, and respective connections;
For $h = 2, ...$ do
  Read $(x, y)^{[h]}$;
  Input $x^{[h]}$ to the network;
  Compute compatibility degrees $(o^1, ..., o^c)$;
  Aggregate values using $A^f$ to get the singular approximation $\hat{y}^{[h]}$;
  Compute the convex hull of $\Gamma^{i^*}$, $i^* = \{i : o^i > 0\}$;
  Find the granular approximation $(\underline{u}^{i^*}, \underline{\underline{u}}^{i^*}, \overline{\overline{u}}^{i^*}, \overline{u}^{i^*})$;
  Compute the output error $\epsilon^{[h]} = y^{[h]} - \hat{y}^{[h]}$;
  If $x^{[h]}$ is not within the expansion regions $E^i$ $\forall i$
    Create granule $\gamma^{c+1}$, neuron $A^{c+1}$ and connections;
  Else
    Update the most active granule $\gamma^i$, $i = \arg\max(o^1, ..., o^c)$;
    Update local function parameters $a_j^i$ using recursive least squares;
    Update connection weights $w_j^i$ $\forall j, i$;
  If $h = \alpha h_r$, $\alpha = 1, 2, ...$
    Combine granules when feasible;
    Update model granularity $\rho$;
    Adapt connection weights $\delta^i$ $\forall i$;
    Prune inactive granules and respective connections;
END
———————————————————————

V. WEATHER TEMPERATURE FORECASTING

This section considers fuzzy granular data streams derived from monthly mean, minimum, and maximum temperatures of weather time series from different geographic regions. The aim is to predict monthly temperatures for all regions.


A. Weather forecasting

Weather forecasts are useful to plan activities, protect property, and assist decision making in several economic sectors such as energy, transportation, aviation, agriculture, and inventory planning. Any system that is sensitive to the state of the atmosphere may benefit from weather forecasts.

Monthly temperature data carry a degree of uncertainty due to the imprecision of atmospheric measurements, instrument malfunction, and transcription errors. Usually temperature data are numerical, but the processes which originate and supply the data are imprecise. Temperature estimates at finer time granularities (days, weeks) are a common demand. Evolving granular approaches such as eGNN provide guaranteed granular predictions of the time series in these cases. How satisfactory a granular prediction is depends on the compactness of the prediction model. Granular predictions together with singular predictions are important because they convey both a value and a range of possible temperature values.

The computational experiments assume fuzzy data whose membership functions are translations of the average minimum, mean and maximum monthly temperatures into triangular fuzzy numbers. The data were normalized to the range [0, 1]. We use data from different weather stations, summarized in Table I (data available at http://eca.knmi.nl and http://cdiac.ornl.gov/epubs/ndp/ushcn/ushcn.html).

TABLE I
MONTHLY TEMPERATURE VALUES

Station        Samples   From       To         Std. Dev.
Bucharest      960       Jan 1930   Dec 2010   0.1795
Death Valley   1308      Jan 1901   Dec 2009   0.1835
Helsinki       1680      Jan 1871   Dec 2010   0.1842
Lisbon         1200      Jan 1910   Dec 2009   0.1556
Ottawa         1380      Jan 1895   Dec 2009   0.1790

In the computational experiments described subsequently, eGNN scans the data only once, on a per-sample basis, to simulate online data stream processing. Algorithm performance is evaluated using the root mean square error of the normalized singular predictions,

$$RMSE = \sqrt{\frac{1}{H} \sum_{h=1}^{H} \left(\text{mp}(y)^{[h]} - \hat{y}^{[h]}\right)^2}, \qquad (35)$$

the non-dimensional error index,

$$NDEI = \frac{RMSE}{\text{std}\left(\text{mp}(y)^{[h]} \,\forall h\right)}, \qquad (36)$$

the average number of rules in the model structure, and the per-sample CPU time in milliseconds. The computer used has a dual-core 2.54 GHz processor and 4 GB of RAM.
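For reference, (35) - (36) translate directly into code; the sample values below are illustrative only.

# Evaluation metrics, Eqs. (35)-(36): RMSE over the midpoints of the
# actual fuzzy outputs, and NDEI = RMSE / std of those midpoints.
import statistics

def rmse(mid_y, y_hat):
    return (sum((m - p) ** 2 for m, p in zip(mid_y, y_hat)) / len(mid_y)) ** 0.5

def ndei(mid_y, y_hat):
    return rmse(mid_y, y_hat) / statistics.pstdev(mid_y)

mid_y = [0.30, 0.42, 0.55, 0.48]   # midpoints mp(y)[h] of the actual data
y_hat = [0.28, 0.45, 0.50, 0.49]   # singular one-step forecasts
print(rmse(mid_y, y_hat), ndei(mid_y, y_hat))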

B. Performance analysis

Different computational intelligence approaches were chosen for performance assessment: the dynamic evolving neuro-fuzzy inference system (DENFIS) [2], evolving Takagi-Sugeno (eTS) [26], fuzzy set-based evolving modeling (FBeM) [19], interval-based evolving modeling (IBeM) [18], the multilayer perceptron neural network (MLP) [27], and extended Takagi-Sugeno (xTS) [28].

The task of the different approaches is to give a one-step-ahead forecast of the monthly temperature $y^{[h+1]}$ using the last 12 observations $x^{[h-11]}, ..., x^{[h]}$. Online methods employ the sample-per-sample testing-before-training approach as follows. First, an estimate $\hat{y}^{[h+1]}$ is derived for a given input $(x^{[h-11]}, ..., x^{[h]})$. One time step later, the actual value $y^{[h+1]}$ becomes available and model adaptation is performed if necessary. Table II shows the forecasting results.

TABLE II
TEMPERATURE FORECASTS

Station        Method   # Rules   RMSE     NDEI     CPU (ms)
Bucharest      DENFIS   5.00      0.0800   0.4457   4.7
               eGNN     3.80      0.0594   0.3309   1.6
               eTS      3.00      0.0598   0.3331   1.1
               FBeM     7.57      0.0603   0.3359   1.1
               IBeM     5.88      0.0643   0.3582   1.0
               MLP      –         0.0892   0.4969   35.5
               xTS      10.00     0.0643   0.3582   1.0
Death Valley   DENFIS   8.00      0.0600   0.3270   4.7
               eGNN     3.91      0.0498   0.2714   1.6
               eTS      3.00      0.0491   0.2676   1.0
               FBeM     8.00      0.0506   0.2757   1.1
               IBeM     8.79      0.0541   0.2948   1.0
               MLP      –         0.0584   0.3183   44.2
               xTS      10.00     0.0503   0.2741   1.1
Helsinki       DENFIS   24.00     0.0780   0.4235   5.7
               eGNN     2.78      0.0607   0.3295   1.6
               eTS      4.00      0.0634   0.3442   1.4
               FBeM     6.00      0.0602   0.3268   1.1
               IBeM     10.38     0.0764   0.4148   1.2
               MLP      –         0.0892   0.4843   35.5
               xTS      16.00     0.0651   0.3534   1.1
Lisbon         DENFIS   12.00     0.0880   0.5656   5.2
               eGNN     2.77      0.0577   0.3708   1.7
               eTS      4.00      0.0714   0.4589   2.3
               FBeM     5.63      0.0599   0.3850   1.2
               IBeM     3.59      0.0687   0.4415   1.0
               MLP      –         0.0955   0.6138   48.2
               xTS      11.00     0.0744   0.4781   1.0
Ottawa         DENFIS   7.00      0.0770   0.4302   4.9
               eGNN     3.88      0.0575   0.3212   1.5
               eTS      3.00      0.0604   0.3374   1.0
               FBeM     6.80      0.0609   0.3402   1.1
               IBeM     9.28      0.0734   0.4101   1.1
               MLP      –         0.0769   0.4296   41.3
               xTS      14.00     0.0631   0.3525   1.1

Table II shows that eGNN gives the most precise forecasts in 3 of the 5 temperature data sets, followed by eTS and FBeM with one each. The eGNN structures are, on average, the most parsimonious. Alternative evolving approaches such as DENFIS, eTS and xTS use numeric data, namely the mean temperature. In contrast, granular approaches such as eGNN, IBeM and FBeM take into account the mean and its neighboring data to bound the forecasts. IBeM and xTS are the fastest among the algorithms evaluated in this paper.

As an example, the one-step singular and granular forecasts of eGNN for the Helsinki time series are shown in Fig. 5. The additional plots of Fig. 5 show the granularity, error indices, and number of rules. Note that while the singular prediction $p$ attempts to match the actual mean temperature value, the corresponding granular information $[\underline{u}, \overline{u}]$, formed by the lower and upper bounds of the consequent trapezoidal membership functions, intends to envelop previous data and the uncertainty of the unknown temperature function $f$.

Fig. 5. eGNN Helsinki temperature forecasts

The results suggest that eGNN benefits from data uncertainty and from the neurofuzzy granular framework to provide accurate and linguistic predictions of fuzzy time series. eGNN is an evolving approach able to process fuzzy data streams and provide simultaneous singular and granular forecasts.

VI. CONCLUSION

This paper has introduced a fuzzy data stream modeling framework based on an evolving granular neural network approach. The eGNN framework processes fuzzy data streams using fuzzy granular models, fuzzy aggregation neurons, and an incremental learning algorithm. Its neurofuzzy structure encodes a set of fuzzy rules and embeds a fuzzy inference system. The resulting modeling approach trades off precision and interpretability by combining functional and linguistic fuzzy models in a single framework. eGNN provides singular as well as granular approximation of functions. An application example concerning weather temperature forecasting has shown that eGNN is highly competitive with state-of-the-art evolving approaches. Further work shall address methods to control the specificity of granules during learning, and linguistic approximation.

ACKNOWLEDGMENT

The first author acknowledges CAPES, the Brazilian Ministry of Education, for his fellowship. The second author is grateful to the Energy Company of Minas Gerais (CEMIG), Brazil, for grant P&D178. The last author thanks CNPq, the Brazilian National Research Council, for grant 304596/2009-4.

REFERENCES

[1] Angelov, P.; Filev, D.; Kasabov, N. (Eds.) Evolving Intelligent Systems: Methodology and Applications. Wiley-IEEE Press Series on CI, 2010.

[2] Kasabov, N. Evolving Connectionist Systems: The Knowledge Engineering Approach. Springer-Verlag - London, 2nd edition, 2007.

[3] Lughofer, E. Evolving Fuzzy Systems - Methodologies, Advanced Concepts and Applications. Springer-Verlag, Berlin Heidelberg, 2011.

[4] Bouchon-Meunier, B.; Marsala, C.; Rifqi, M.; Yager, R. Uncertainty in Intelligent and Information Systems. World Scientific - SG, 2008.

[5] Kreinovich, V. "Interval computations as an important part of granular computing: an introduction." In: Pedrycz; Skowron; Kreinovich (Eds.) Handbook of Granular Computing, pp: 1-31, 2008.

[6] Bargiela, A.; Pedrycz, W. Granular Computing: An Introduction. Kluwer Academic Publishers - Boston, 1st edition, 2002.

[7] Pedrycz, W.; Gomide, F. Fuzzy Systems Engineering: Toward Human-Centric Computing. Wiley - Hoboken, 2007.

[8] Yao, J. T. "A ten-year review of granular computing." IEEE International Conference on Granular Computing, pp: 734-739, 2007.

[9] Yao, Y. Y. "Granular computing: past, present and future." IEEE International Conference on Granular Computing, pp: 80-85, 2008.

[10] Zadeh, L. A. "Fuzzy sets and information granularity." In: Gupta, M. M.; Ragade, R. K.; Yager, R. R. (Eds.) Advances in Fuzzy Set Theory and Applications, North Holland - Amsterdam, pp: 3-18, 1979.

[11] Zadeh, L. A. "Toward a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic." Fuzzy Sets and Systems, Vol. 90, Issue 2, pp: 111-127, 1997.

[12] Witten, I. H.; Frank, E.; Hall, M. A. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann, 3rd edition, 2011.

[13] Liggins, M. E.; Hall, D. L.; Llinas, J. (Eds.) Handbook of Multisensor Data Fusion: Theory and Practice. CRC Press, 2nd edition, 2008.

[14] Maimon, O. Z.; Rokach, L. The Data Mining and Knowledge Discovery Handbook. Springer - New York, USA, 2005.

[15] Pedrycz, W.; Vukovich, W. "Granular neural networks." Neurocomputing, Vol. 36, pp: 205-224, 2001.

[16] Leite, D.; Costa, P.; Gomide, F. "Evolving granular neural networks from fuzzy data streams." Neural Networks (Submitted), 17p. 2012.

[17] Leite, D.; Costa, P.; Gomide, F. "Evolving granular neural network for semi-supervised data stream classification." World Congress on Computational Intelligence, pp: 1877-1884, Jul. 2010.

[18] Leite, D.; Costa, P.; Gomide, F. "Interval approach for evolving granular system modeling." In: Mouchaweh, M.; Lughofer, E. (Eds.) Learning in Non-stationary Environments, Springer - NY, 30p. 2012.

[19] Leite, D. "Evolving Fuzzy Granular Modeling from Nonstationary Fuzzy Data Streams." Evolving Systems (Submitted), 25p. 2012.

[20] Bouchon-Meunier, B. (Ed.) Aggregation and Fusion of Imperfect Information (SFSC). Physica-Verlag, Heidelberg, New York, 1998.

[21] Beliakov, G.; Pradera, A.; Calvo, T. Aggregation Functions: A Guide for Practitioners (SFSC). Springer-Verlag - Berlin, Heidelberg, 2007.

[22] Zadeh, L. A. "Generalized theory of uncertainty (GTU) - principal concepts and ideas." Computational Statistics & Data Analysis, Vol. 51, pp: 15-46, 2006.

[23] Yager, R. R. "Participatory learning with granular observations." IEEE Transactions on Fuzzy Systems, Vol. 17, Issue 1, 2009.

[24] Yager, R. R. "Learning from imprecise granular data using trapezoidal fuzzy set representations." In: Prade, H.; Subrahmanian, V. S. (Eds.) Lecture Notes in Computer Sc., Vol. 4772, pp: 244-254, 2007.

[25] Pedrycz, W. "Heterogeneous fuzzy logic networks: fundamentals and development studies." IEEE Transactions on Neural Networks, Vol. 15, Issue 6, pp: 1466-1481, 2004.

[26] Angelov, P.; Filev, D. "An approach to online identification of Takagi-Sugeno fuzzy models." IEEE Transactions on Systems, Man, and Cybernetics - Part B, Vol. 34, Issue 1, pp: 484-498, 2004.

[27] Haykin, S. Neural Networks: A Comprehensive Foundation. Prentice Hall, 2nd edition, 1999.

[28] Angelov, P.; Zhou, X. "Evolving fuzzy systems from data streams in real-time." Int. Symp. on Evolving Fuzzy Systems, pp: 29-35, 2006.