G. Folino, A. Forestiero, G. Spezzano {folino,forestiero,spezzano}@icar.cnr.it Swarming Agents for Discovering Clusters in Spatial Data Second International Symposium on Parallel and Distributed Computing Ljubljana, Slovenia · 13-14 October 2003


G. Folino, A. Forestiero, G. Spezzano {folino,forestiero,spezzano}@icar.cnr.it

Swarming Agents for Discovering Clusters in Spatial Data

Second International Symposium on Parallel and Distributed Computing

Ljubljana, Slovenia · 13-14 October 2003


Outline

Introduction
  Swarm intelligence
  Flocking algorithm
  Clustering and spatial datasets
Sparrow-SNN
Experimental results
Conclusions and Future Works


Swarm Intelligence

Swarm Intelligence (SI) is the property of a system whereby the collective behaviors of (unsophisticated) agents interacting locally with their environment cause coherent functional global patterns to emerge.

A swarm has the following interesting properties:
Distributed, without central control
Ability to change the environment
Stigmergy (indirect communication via interaction with the environment)
Fault tolerance
Adaptivity and self-organization

Typical examples include ant colonies and flocks of birds.


Flocking algorithm

Typical example of emergent collective behavior.

No global control
Every agent has limited visibility
The collective behavior emerges only through local interaction, following three simple rules:
  Separation
  Alignment
  Cohesion
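The three rules can be illustrated with a minimal Python sketch. It is not taken from the slides: the `Boid` class, the `steer` function, the visibility `radius`, the `min_dist` separation distance and the rule weights are all hypothetical names and values chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Boid:
    x: float
    y: float
    vx: float
    vy: float


def steer(boid, flock, radius=10.0, min_dist=2.0,
          w_sep=1.0, w_ali=0.5, w_coh=0.3):
    """Velocity change (dvx, dvy) for one boid from the three flocking rules.

    All parameter values are illustrative assumptions.
    """
    # Limited visibility: only flock mates within `radius` are considered.
    visible = [b for b in flock
               if b is not boid
               and math.hypot(b.x - boid.x, b.y - boid.y) <= radius]
    if not visible:
        return 0.0, 0.0
    n = len(visible)

    # Separation: move away from flock mates closer than `min_dist`.
    close = [b for b in visible
             if math.hypot(b.x - boid.x, b.y - boid.y) < min_dist]
    sep_x = sum(boid.x - b.x for b in close)
    sep_y = sum(boid.y - b.y for b in close)

    # Alignment: match the average velocity of the visible flock mates.
    ali_x = sum(b.vx for b in visible) / n - boid.vx
    ali_y = sum(b.vy for b in visible) / n - boid.vy

    # Cohesion: move towards the centroid of the visible flock mates.
    coh_x = sum(b.x for b in visible) / n - boid.x
    coh_y = sum(b.y for b in visible) / n - boid.y

    return (w_sep * sep_x + w_ali * ali_x + w_coh * coh_x,
            w_sep * sep_y + w_ali * ali_y + w_coh * coh_y)
```

Each boid only reads the state of its visible neighbors, so no global controller is needed; the flock-level behavior comes from applying `steer` to every boid at every step.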


Flocking algorithm

Agents can also exhibit exploratory behavior:

First, agents search for a goal of particular interest.

Then, the other flock members are driven towards the goal in order to explore interesting areas more carefully.


Clustering

Clustering means dividing objects into groups (clusters) so that the members of a cluster are as similar as possible, whereas the members of different clusters differ from each other as much as possible.

Spatial clustering should identify clusters of different dimensions, sizes, shapes and densities, which is particularly difficult.


Clustering

A spatial dataset with clusters of different density


SNN algorithm (1)

SNN is based on the well-known Jarvis-Patrick algorithm.
It identifies the K nearest neighbors of each object (data point) in the dataset.
Two objects i and j join the same cluster if:
1) i is one of the K nearest neighbors of j;
2) j is one of the K nearest neighbors of i;
3) i and j have at least Kmin of their K nearest neighbors in common;
where K and Kmin are user-defined parameters.

For each pair of points i and j, a link with an associated weight is defined.

The connectivity of a data point is computed as the sum of the weights associated with its outgoing links.
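The shared-nearest-neighbor test and the connectivity can be sketched in Python as follows. The slides do not say how a link weight is computed, so this sketch assumes the weight is the number of shared neighbors when i and j are in each other's K-nearest-neighbor lists, and 0 otherwise. The function names and the brute-force neighbor search are illustrative choices, not the authors' implementation.

```python
import math


def k_nearest(points, k):
    """Map each point index to the set of indices of its k nearest neighbors."""
    knn = {}
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: math.dist(p, points[j]))
        knn[i] = set(others[:k])
    return knn


def same_cluster(knn, i, j, k_min):
    """Conditions 1) to 3): mutual neighbors sharing at least k_min neighbors."""
    return (i in knn[j] and j in knn[i]
            and len(knn[i] & knn[j]) >= k_min)


def link_weight(knn, i, j):
    """Assumed weight: shared-neighbor count for mutual neighbors, else 0."""
    if i in knn[j] and j in knn[i]:
        return len(knn[i] & knn[j])
    return 0


def connectivity(knn, i):
    """Sum of the weights of the links leaving point i."""
    return sum(link_weight(knn, i, j) for j in knn[i])


# Four points on a unit square plus one far-away point.
points = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10)]
knn = k_nearest(points, k=3)
print(same_cluster(knn, 0, 1, k_min=2))        # True
print(connectivity(knn, 0), connectivity(knn, 4))  # 6 0
```

The isolated point has connectivity 0 because none of its nearest neighbors lists it in return, which is the property the thresholds in the next step rely on.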


SNN algorithm (2)

For every node (data point), calculate the connectivity;
Identify representative points by choosing the points that have high connectivity ( > core_threshold);
Identify noise points by choosing the points that have low connectivity ( < noise_threshold) and remove them;
Remove all links between points that have a weight smaller than a threshold (merge_threshold);
Take connected components of points to form clusters, where every point in a cluster is either a representative point or is connected to a representative point.
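These steps can be sketched in Python under a few stated assumptions: link weights are given as a dictionary keyed by index pairs `(i, j)`, `core_threshold` is larger than `noise_threshold`, and "connected to a representative point" is read as reachable through the remaining links. The function name `snn_clusters` and the data layout are hypothetical.

```python
from collections import defaultdict


def snn_clusters(n_points, weights, core_threshold, noise_threshold,
                 merge_threshold):
    """Return (clusters, noise) from link weights {(i, j): weight}."""
    # Step 1: connectivity is the sum of the weights of a point's links.
    conn = [0] * n_points
    for (i, j), w in weights.items():
        conn[i] += w
        conn[j] += w

    # Step 2: representative points have high connectivity.
    core = {i for i in range(n_points) if conn[i] > core_threshold}

    # Step 3: noise points have low connectivity and are removed.
    noise = {i for i in range(n_points) if conn[i] < noise_threshold}

    # Step 4: keep only links that are strong enough and avoid noise points.
    adj = defaultdict(set)
    for (i, j), w in weights.items():
        if w >= merge_threshold and i not in noise and j not in noise:
            adj[i].add(j)
            adj[j].add(i)

    # Step 5: connected components grown from representative points.
    clusters, seen = [], set()
    for start in sorted(core):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            p = stack.pop()
            if p in component:
                continue
            component.add(p)
            stack.extend(adj[p] - component)
        seen |= component
        clusters.append(component)
    return clusters, noise
```

Growing components only from representative points means that a group of linked points with no representative among them is left unclustered, which matches the last step as stated.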


SPARROW-SNN

Sparrow-SNN combines the stochastic search of an adaptive flocking algorithm with SNN to discover clusters in spatial data.

It uses a variant of the flocking algorithm:

First, agents search for a goal of particular interest.

Then, the other flock members are driven towards the goal in order to explore interesting areas more carefully.

We used Swarm, a software package for multi-agent simulation of complex systems, to implement Sparrow-SNN.


SPARROW-SNN

for i = 1..MaxGenerations
    foreach agent (yellow, green)
        if (not visited(current_point))
            conn = compute_conn();
            if (conn < noise_threshold)
                consider the point for removal from the clustering
            endif
        endif
        mycolor = color_agent();
    end foreach
    foreach agent (yellow, green)
        dir = compute_dir();
    end foreach
    foreach agent (all)
        switch (mycolor) {
            case yellow, green: move(dir, speed(mycolor)); break;
            case white: stop(); generate_new_boid(); break;
            case red: stop(); merge(); generate_new_close_boid(); break;
        }
    end foreach
end for

Pseudo-code of the algorithm


SPARROW-SNN

N agents are generated randomly in the search space.

When an agent falls on a data point not previously explored, it computes the connectivity of that point.

Using connectivity, agents take different colors:
conn > core_threshold -> mycolor = red
noise_threshold < conn <= core_threshold -> mycolor = green
0 < conn < noise_threshold -> mycolor = yellow
conn = 0 -> mycolor = white

Agents can indicate a representative point (red), noise (yellow), border point (green), or obstacle (white).

Red and white agents stop moving, signaling interesting and deserted regions, respectively, to the other agents.
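The color rules translate directly into a small Python function. The rules as listed do not cover the case conn = noise_threshold; the sketch below assumes it falls into the yellow range. The helper `is_stopped` is a hypothetical name for the stop condition.

```python
def color_agent(conn, core_threshold, noise_threshold):
    """Map the connectivity of the current data point to an agent color."""
    if conn > core_threshold:
        return "red"      # representative point
    if conn > noise_threshold:
        return "green"    # border point
    if conn > 0:
        # Assumption: conn == noise_threshold is treated as yellow,
        # because the listed rules leave that value unassigned.
        return "yellow"   # noise
    return "white"        # obstacle (deserted region)


def is_stopped(color):
    """Red and white agents stop; yellow and green agents keep moving."""
    return color in ("red", "white")


for conn in (12, 6, 2, 0):
    print(conn, color_agent(conn, core_threshold=10, noise_threshold=4))
```

With `core_threshold=10` and `noise_threshold=4`, the loop prints red, green, yellow and white for the four sample connectivities.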


SPARROW-SNN

Yellow and green agents move following the modified flocking rules (with repulsion from white agents and attraction towards red agents).

In addition, yellow agents move quickly (through uninteresting zones), whereas green agents move slowly.

Red agents (placed on a representative point) run the merge procedure, which includes in the final cluster the discovered representative point together with the points that share with it a significant number of neighbors (greater than Pmin).
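The behavior on this slide can be sketched in Python under several assumptions. The speed values, the visibility `radius`, the use of unit vectors for attraction and repulsion, the restriction of merge candidates to the nearest neighbors of the representative point, and the way overlapping clusters are joined are all illustrative choices; the slides only state the qualitative rules and the Pmin threshold.

```python
import math

# Assumed speeds: the slide only says yellow agents are faster than green ones.
SPEED = {"yellow": 2.0, "green": 1.0}


def color_force(agent_pos, stopped_agents, radius=10.0):
    """Sum of unit vectors towards red agents and away from white agents."""
    fx = fy = 0.0
    for (x, y), color in stopped_agents:
        dx, dy = x - agent_pos[0], y - agent_pos[1]
        d = math.hypot(dx, dy)
        if d == 0 or d > radius:
            continue  # same position or outside the visibility radius
        sign = 1.0 if color == "red" else -1.0  # red attracts, white repels
        fx += sign * dx / d
        fy += sign * dy / d
    return fx, fy


def merge(rep, knn, p_min):
    """Representative point plus neighbors sharing more than p_min neighbors.

    `knn` maps a point index to the set of its nearest-neighbor indices.
    """
    members = {rep}
    for q in knn[rep]:
        if len(knn[rep] & knn[q]) > p_min:
            members.add(q)
    return members


def add_to_clusters(clusters, members):
    """Join `members` with every existing (disjoint) cluster it overlaps."""
    merged, kept = set(members), []
    for cluster in clusters:
        if cluster & merged:
            merged |= cluster
        else:
            kept.append(cluster)
    kept.append(merged)
    return kept
```

In this reading, `color_force` would be added to the output of the ordinary flocking rules before a yellow or green agent moves at `SPEED[color]`, while a red agent calls `merge` once and stays in place.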


Experimental results (datasets)


Experimental results (clusters found)


Experimental results (random search vs Sparrow–SNN)

Figure: Core Points (y-axis) versus Visited Points (x-axis), SPARROW-SNN compared with RANDOM search. Panels: a) GEORGE, b) North-East.


Experimental results (scalability)


Conclusions and Future Works

Sparrow-SNN is able to discover clusters of arbitrary shape, size and density in spatial data.

It performs approximate clustering well.

It is naturally distributed, fault tolerant and scalable.

We are working on implementing a new version of Sparrow using Anthill, a peer-to-peer multi-agent system based on JXTA.