Fuzzy Growing Hierarchical Self-organizing Networks

Miguel Barreto-Sanz, Andres Perez-Uribe, Carlos-Andres Peña-Reyes and Marco Tomassini

description

Hierarchical Self-Organizing Networks are used to reveal the topology and structure of datasets. Those structures create crisp partitions of the dataset, producing branches or prototype vectors that represent groups of data with similar characteristics. However, when an observation can be represented by several prototypes with similar accuracy, a crisp partition is forced to assign it to just one group, so crisp divisions usually lose information about the real dataset structure. To deal with this challenge we propose the Fuzzy Growing Hierarchical Self-Organizing Networks (FGHSON). FGHSON are adaptive networks able to reflect the underlying structure of the dataset in a hierarchical, fuzzy way. These networks grow by using three parameters that govern the membership degree of data observations to their prototype vectors and the quality of the network representation. The resulting structure can represent heterogeneous groups and those that present similar membership degrees to several clusters.

Transcript of Fuzzy Growing Hierarchical Self-organizing Networks

Page 1: Fuzzy Growing Hierarchical Self-organizing Networks

Fuzzy Growing Hierarchical Self-Organizing Networks

Miguel Barreto-Sanz, Andres Perez-Uribe, Carlos-Andres Peña-Reyes and Marco Tomassini

Page 2: Fuzzy Growing Hierarchical Self-organizing Networks

Outline

• Introduction

Motivation

Challenges

• Fuzzy Growing Hierarchical Self-Organizing Networks

How does it work?

• Experimental testing

• Conclusions

Page 3: Fuzzy Growing Hierarchical Self-organizing Networks

Introduction

Motivation: Clustering of spatio-temporal data in order to find homologous places, for applications in fields such as Geographic Information Systems (GIS), epidemiology, land use, environmental research, natural resource discovery, and spatial business intelligence.

Homologous places for Colombian coffee production: Brazil, Ecuador, East Africa, and New Guinea.

Example: Agriculture (soil, climate, genotype)

Page 4: Fuzzy Growing Hierarchical Self-organizing Networks

Introduction

Challenges:

1. Large databases

2. Resolution levels of abstraction

3. The fuzzy and implicit nature of spatial and spatio-temporal relationships between objects. Boundaries between geographic areas are transition zones rather than sharp boundaries.

[Figure labels: 1 Km by 1 Km cell, 1 point; 1 336,025 points just for Colombia; different resolutions]

Page 5: Fuzzy Growing Hierarchical Self-organizing Networks

Homologous zones

SOM

First solution: Self-Organizing Maps

Advantages: It is possible to obtain prototypes.

Disadvantages: It is not possible to obtain different resolutions (fixed size of the Kohonen map).

Introduction

Page 6: Fuzzy Growing Hierarchical Self-organizing Networks

Similar Zones

Introduction

Second solution: Growing Hierarchical SOM (GHSOM)

Advantages: It is possible to obtain different resolutions.

Disadvantages: It is not possible to represent fuzzy relationships.

Page 7: Fuzzy Growing Hierarchical Self-organizing Networks

Introduction

Crisp zones obtained with SOM and GHSOM

Page 8: Fuzzy Growing Hierarchical Self-organizing Networks

SOM

Fuzzy GHSOM

GHSOM

Introduction

Our solution: Fuzzy Growing Hierarchical Self-Organizing Networks

Page 9: Fuzzy Growing Hierarchical Self-organizing Networks

Fuzzy Kohonen Clustering Networks

FKCN integrate the idea of fuzzy membership from Fuzzy c-Means (FCM) with the updating rules from SOM, thus creating a self-organizing algorithm that automatically adjusts the size of the updated neighborhood during the learning process.

Wi,t represents the centroid of the ith cluster at iteration t.

m(t) is an exponent like the fuzzification index in FCM, and Uik,t is the membership value of the observation Zk in cluster i.
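The update rule itself is not preserved in this transcript. As a minimal sketch, assuming the standard FKCN formulation from the literature (FCM memberships raised to the exponent m act as per-observation learning rates), one iteration could look like the Python below. The function names `fcm_memberships` and `fkcn_step` are hypothetical, and `m` is passed in as a fixed value greater than 1 rather than following a schedule m(t).

```python
import math


def fcm_memberships(data, prototypes, m):
    """Return U[i][k], the FCM membership of observation k in cluster i."""
    exponent = 2.0 / (m - 1.0)
    U = [[0.0] * len(data) for _ in prototypes]
    for k, z in enumerate(data):
        dists = [math.dist(z, w) for w in prototypes]
        hits = [i for i, d in enumerate(dists) if d == 0.0]
        if hits:
            # Observation coincides with a prototype: share membership among exact matches.
            for i in hits:
                U[i][k] = 1.0 / len(hits)
            continue
        for i, d_i in enumerate(dists):
            U[i][k] = 1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
    return U


def fkcn_step(data, prototypes, m):
    """One assumed FKCN step: W_i += sum_k a_ik (Z_k - W_i) / sum_k a_ik, with a_ik = U_ik ** m."""
    U = fcm_memberships(data, prototypes, m)
    updated = []
    for i, w in enumerate(prototypes):
        rates = [u ** m for u in U[i]]
        total = sum(rates)
        if total == 0.0:
            updated.append(list(w))  # no observation pulls on this prototype
            continue
        updated.append([
            w_d + sum(a * (z[d] - w_d) for a, z in zip(rates, data)) / total
            for d, w_d in enumerate(w)
        ])
    return updated, U
```

With one-dimensional data such as `[[0.0], [1.0], [9.0], [10.0]]` and starting prototypes `[[2.0], [8.0]]`, a call to `fkcn_step(data, prototypes, 2.0)` moves each prototype toward the pair of observations nearest to it, while every observation keeps a nonzero membership in both clusters.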

Page 10: Fuzzy Growing Hierarchical Self-organizing Networks

Fuzzy Growing Hierarchical Self-Organizing Networks

Breadth growth process

Depth growth or hierarchical growth

FKCN

Page 11: Fuzzy Growing Hierarchical Self-organizing Networks

Initial Setup and Global Network Control

First prototype vector (one dimension in this example)

W0 is a vector that corresponds to the mean of the input variables.

Membership degrees in each layer

Hierarchical structure

The value of qe0 helps to measure the minimum quality of data representation of the prototype vectors in the subsequent layers; the next prototypes therefore have the task of reducing the global representation error qe0.
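A minimal sketch of this initial setup, assuming that qe0 is the sum of Euclidean distances from every observation to W0 (the transcript does not preserve the formula, so this definition is an assumption). The function name `layer0_prototype` is hypothetical.

```python
import math


def layer0_prototype(data):
    """Return (W0, qe0): the mean vector of the data and its assumed quantization error."""
    n = len(data)
    dims = len(data[0])
    # W0: component-wise mean of the input variables.
    w0 = [sum(z[d] for z in data) / n for d in range(dims)]
    # qe0 (assumed definition): total distance between W0 and all observations.
    qe0 = sum(math.dist(z, w0) for z in data)
    return w0, qe0
```

For the one-dimensional data `[[0.0], [2.0], [4.0]]`, this gives W0 = `[2.0]` and qe0 = 4.0.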

Page 12: Fuzzy Growing Hierarchical Self-organizing Networks

Breadth growth process

Membership degrees in each layer

Hierarchical structure

Two initial prototype vectors

New prototype vectors added in order to reach a suitable representation of the dataset

A membership matrix U is obtained. This matrix contains the membership degree of the dataset elements to the prototype vectors.

Page 13: Fuzzy Growing Hierarchical Self-organizing Networks

The mean quantization error of the map (MQE) is evaluated to measure the quality of data representation, and is also used as the stopping criterion for the growing process of the FKCN.

The stopping criterion

qeu represents the qe of the corresponding prototype u in the upper layer.

FKCN1 is allowed to grow until the qe of the prototype in its preceding layer (qe0 in the case of layer 1) is reduced to at least a fixed percentage τ1.

For layer 1: MQE < τ1 · qe0

In general: MQE < τ1 · qeu

Breadth growth process
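The breadth stopping criterion can be sketched as follows. This is a sketch under stated assumptions: the per-prototype qe is taken as the sum of distances to the observations whose membership reaches a minimal degree `phi`, and MQE is the mean of those values. Neither definition is preserved in the transcript, and all function names are hypothetical.

```python
import math


def prototype_qe(data, prototype, memberships, phi):
    """Assumed qe of one prototype: distances to observations with membership >= phi."""
    return sum(math.dist(z, prototype) for z, u in zip(data, memberships) if u >= phi)


def mean_quantization_error(data, prototypes, U, phi):
    """Assumed MQE: mean of the per-prototype quantization errors."""
    qes = [prototype_qe(data, w, U[i], phi) for i, w in enumerate(prototypes)]
    return sum(qes) / len(qes)


def breadth_growth_done(mqe, qe_upper, tau1):
    """Stop adding prototypes once MQE falls below the fraction tau1 of the upper-layer qe."""
    return mqe < tau1 * qe_upper
```

While `breadth_growth_done` returns `False`, the FKCN would receive additional prototype vectors and be retrained; a smaller `tau1` therefore yields wider networks.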

Page 14: Fuzzy Growing Hierarchical Self-organizing Networks

Depth growth or hierarchical growth

In particular, prototypes with a large quantization error indicate which clusters need a better representation by means of new FKCNs.

Page 15: Fuzzy Growing Hierarchical Self-organizing Networks

Depth growth or hierarchical growth

The prototypes Wi that do not fulfil qei < τ2 · qe0 will be subject to hierarchical expansion.

τ2 is used to describe the desired level of granularity in the data representation.

φ is the minimal membership degree.

The breadth process described in stage 2 then begins with the newly established FKCNs.
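A minimal sketch of the depth growth decision, assuming the criterion has the form qe_i < τ2 · qe0 and that each new FKCN is trained on the observations whose membership to the expanded prototype reaches the minimal membership degree φ. Both assumptions and the function names are illustrative and are not taken from the slides.

```python
def select_for_expansion(qes, qe0, tau2):
    """Indices of prototypes whose qe does not satisfy qe_i < tau2 * qe0."""
    return [i for i, qe in enumerate(qes) if not qe < tau2 * qe0]


def observations_for_child(data, memberships, phi):
    """Observations passed to the child FKCN: membership to the parent prototype >= phi."""
    return [z for z, u in zip(data, memberships) if u >= phi]
```

Because the selection uses membership degrees rather than a nearest-prototype rule, one observation may be handed to several child networks when it has similar membership to several prototypes.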

Page 16: Fuzzy Growing Hierarchical Self-organizing Networks

End of the process

The training process of the FGHSON is terminated when no more prototypes require further expansion.

Note that this training process does not necessarily lead to a balanced hierarchy, i.e. a hierarchy with equal depth in each branch.

Rather, the specific distribution of the input data is modeled by a hierarchical structure, where some clusters require deeper branching than others.

Page 17: Fuzzy Growing Hierarchical Self-organizing Networks

Iris Data Set

There are three Iris categories: Setosa, Versicolor, and Virginica, represented respectively by triangles, plus symbols, and dots. Each has 50 samples with 4 features. Here, only three features are used: PL (petal length), PW (petal width), and SL (sepal length).

τ1 = 0.3, τ2 = 0.065 and φ = 0.2

Page 18: Fuzzy Growing Hierarchical Self-organizing Networks

Iris Data Set

Distribution of the prototype vectors, represented by stars, in each layer of the hierarchy.

Page 19: Fuzzy Growing Hierarchical Self-organizing Networks

Iris Data Set

Distribution of the prototype vectors, represented by stars, in each layer of the hierarchy.

Page 20: Fuzzy Growing Hierarchical Self-organizing Networks

Iris Data Set

Third layer of the FGHSON. In this layer, prototypes are present only in the zone where observations of Virginica and Versicolor share the same area, so the new prototypes represent each category more accurately.

Page 21: Fuzzy Growing Hierarchical Self-organizing Networks

Toy set

τ1 = 0.3, τ2 = 0.065 and φ = 0.2

Here it is possible to illustrate how the model stops the growing process in those parts where the desired representation is reached and keeps on growing where a low membership or poor representation is present.

Page 22: Fuzzy Growing Hierarchical Self-organizing Networks

GIS results

Page 23: Fuzzy Growing Hierarchical Self-organizing Networks

Conclusion

The Fuzzy Growing Hierarchical Self-organizing networks are fully adaptive networks able to hierarchically represent complex datasets.

Moreover, they allow for a fuzzy clustering of the data, allocating more prototype vectors or branches to heterogeneous areas and to areas where observations present similar membership degrees to several clusters. This can help to better describe the dataset structure and the inner data relationships.

Future work will focus on a more accurate way to find the parameters used to tune the algorithm, more specifically φ. In some cases this value can change in order to find better fuzzy sets to represent the structure of the dataset.

Page 24: Fuzzy Growing Hierarchical Self-organizing Networks

Thanks for new ideas and directions to explore! The end ?