Graph mining with kernel self-organizing map

Nathalie Villa-Vialaneix, http://www.nathalievilla.org
Institut de Mathématiques de Toulouse and IUT de Carcassonne, Université de Perpignan, France
Joint work with Fabrice Rossi, INRIA, Rocquencourt, France
SanTouVal, February 1st, 2008

description

Rencontres BoSanTouVal, Universidad de Valladolid, Spain February 1st, 2008

Transcript of Graph mining with kernel self-organizing map

Page 1: Graph mining with kernel self-organizing map

Motivations

Dissimilarities and distances between vertices

Kernel SOM

Application and comments

Graph mining with kernel self-organizing map

Nathalie Villa-Vialaneix
http://www.nathalievilla.org

Joint work with Fabrice Rossi, INRIA, Rocquencourt, France

Institut de Mathématiques de Toulouse - IUT de Carcassonne, Université de Perpignan, France

SanTouVal, February 1st, 2008

Nathalie Villa - [email protected] SanTouVal - Feb. 2008

Page 2: Graph mining with kernel self-organizing map

Table of contents

1 Motivations

2 Dissimilarities and distances between vertices

3 Kernel SOM

4 Application and comments

Page 3: Graph mining with kernel self-organizing map

Exploring a large historical database

Data: 1000 agrarian contracts,

from four seigneuries (about 10 villages) in the south-west of France,

established between 1250 and 1350 (before the Hundred Years' War).

Historians' questions:

Are social links driven by family or by geography?

Are there central people with a main social role?

...

⇒ Data mining is required.

Page 6: Graph mining with kernel self-organizing map

A graph clustering problem

From the database, a weighted graph is built:

with 615 vertices $x_1, \ldots, x_n$ := peasants found in the contracts;

with weights $(w_{i,j})_{i,j=1,\ldots,n}$, where $w_{i,j} := \#\{\text{contracts where } x_i \text{ and } x_j \text{ are mentioned}\}$.

Number of vertices: 615
Number of edges: 4193
Total of weights: 40,329
Diameter: 10
Density: 2.2%

Goal: clustering the vertices into homogeneous social groups to understand the structure of the peasant community.
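As a rough illustration of how such a co-occurrence graph could be built, here is a minimal Python sketch. The contract lists and the names in them are invented for the example, and the helper names `cooccurrence_weights` and `density` are hypothetical. Only the counts 615 and 4193 come from the slide.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_weights(contracts):
    """For each unordered pair of people, count the contracts naming both."""
    weights = Counter()
    for people in contracts:
        # Each pair is counted once per contract, whatever the order of names.
        for a, b in combinations(sorted(set(people)), 2):
            weights[(a, b)] += 1
    return weights


def density(n_vertices, n_edges):
    """Fraction of the n(n-1)/2 possible edges that are present."""
    return n_edges / (n_vertices * (n_vertices - 1) / 2)


# Hypothetical toy contracts (names invented for illustration).
contracts = [
    ["Arnaud", "Bernard", "Guilhem"],
    ["Arnaud", "Bernard"],
    ["Guilhem", "Peire"],
]
w = cooccurrence_weights(contracts)
print(w[("Arnaud", "Bernard")])             # 2
print(round(100 * density(615, 4193), 1))   # 2.2
```

The second printed value recomputes the density reported on the slide from the number of vertices and edges.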

Page 10: Graph mining with kernel self-organizing map

Other fields modelled by large graphs

Computer science: World Wide Web, P2P networks...

Social networks

Biology: protein interactions, neuronal networks...

Business, management: transportation networks, industry partnerships...

Question: understanding the structure of these large graphs

Clustering: building relevant homogeneous groups;

Graph drawing: giving a global representation of the graph.

Here: Self-Organizing Map for non-vectorial data.

Page 14: Graph mining with kernel self-organizing map

Usual dissimilarities between vertices

The Dice (Jaccard) index (unweighted graphs):

$$D(x_i, x_j) = \frac{|\Gamma(x_i) \cap \Gamma(x_j)|}{|\Gamma(x_i)| + |\Gamma(x_j)|}$$

Dissimilarities based on the shortest paths;

Dissimilarities or distances based on the Laplacian matrix: spectral clustering.
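The neighborhood-overlap index can be sketched in a few lines of Python. The sketch assumes that Γ(x) denotes the set of neighbors of x, and it implements the ratio exactly as written on the slide. The usual Dice coefficient carries an extra factor 2 in the numerator, and the Jaccard index divides by the size of the union instead. The function name `slide_index` and the toy graph are hypothetical.

```python
def slide_index(neighbors, xi, xj):
    """|Γ(xi) ∩ Γ(xj)| / (|Γ(xi)| + |Γ(xj)|), as written on the slide."""
    gi, gj = neighbors[xi], neighbors[xj]
    total = len(gi) + len(gj)
    return len(gi & gj) / total if total else 0.0


# Hypothetical unweighted graph given as adjacency sets.
neighbors = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
print(slide_index(neighbors, "a", "b"))  # 0.25
```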

Page 17: Graph mining with kernel self-organizing map

Laplacian

Definitions

For a graph with vertices $V = \{x_1, \ldots, x_n\}$ having positive weights $(w_{i,j})_{i,j=1,\ldots,n}$ such that, for all $i, j = 1, \ldots, n$, $w_{i,j} = w_{j,i}$, and with $d_i = \sum_{j=1}^n w_{i,j}$,

Laplacian: $L = (L_{i,j})_{i,j=1,\ldots,n}$ where

$$L_{i,j} = \begin{cases} -w_{i,j} & \text{if } i \neq j \\ d_i & \text{if } i = j \end{cases}$$
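The definition translates directly into NumPy. This is a minimal sketch assuming a symmetric weight matrix with zero diagonal. The function name `laplacian` and the 3-vertex example are illustrative only.

```python
import numpy as np


def laplacian(W):
    """L = D - W for a symmetric weight matrix W with zero diagonal."""
    W = np.asarray(W, dtype=float)
    return np.diag(W.sum(axis=1)) - W


# Hypothetical 3-vertex weighted graph.
W = np.array([[0, 2, 1],
              [2, 0, 0],
              [1, 0, 0]])
L = laplacian(W)
print(L)
```

Each row of L sums to zero, which is why the constant vector always belongs to the kernel of L.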

Page 18: Graph mining with kernel self-organizing map

Laplacian: property I [von Luxburg, 2007]

Connected subgraphs

$\operatorname{Ker} L = \operatorname{Span}\{\mathbf{1}_{A_1}, \ldots, \mathbf{1}_{A_k}\}$ where $A_i$ indicates the positions of the vertices of the $i$th connected component of the graph.

Example (figure: a graph on vertices 1 to 5 with connected components {1, 4, 5} and {2, 3}):

$$\operatorname{Ker} L = \operatorname{Span}\left\{ (1, 0, 0, 1, 1)^T ;\ (0, 1, 1, 0, 0)^T \right\}$$
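The property can be checked numerically on the 5-vertex example. The exact edges of the figure are not recoverable from the transcript, so the sketch below assumes edges 1-4 and 4-5 for the first component and 2-3 for the second. Any connected choice inside each component gives the same kernel.

```python
import numpy as np

# Assumed edges (the figure is lost): 1-4, 4-5 and 2-3, with unit weights.
edges = [(1, 4), (4, 5), (2, 3)]
n = 5
W = np.zeros((n, n))
for a, b in edges:
    W[a - 1, b - 1] = W[b - 1, a - 1] = 1.0
L = np.diag(W.sum(axis=1)) - W

# The dimension of Ker L is the number of (numerically) zero eigenvalues.
eigvals = np.linalg.eigvalsh(L)
null_dim = int(np.sum(np.abs(eigvals) < 1e-10))
print(null_dim)

# Indicator vectors of the two components, as on the slide.
ind_A1 = np.array([1, 0, 0, 1, 1], dtype=float)
ind_A2 = np.array([0, 1, 1, 0, 0], dtype=float)
print(np.allclose(L @ ind_A1, 0), np.allclose(L @ ind_A2, 0))
```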

Page 19: Graph mining with kernel self-organizing map

Laplacian: property II [Boulet et al., 2008]

Perfect community: complete subgraph (clique) whose vertices share the same neighbors outside the clique.

Laplacian and perfect communities

For an unweighted graph, the graph has a perfect community with $m$ vertices

⇔

$L$ has $m$ eigenvectors such that each eigenvector has the same $n - m$ coordinates that vanish.

Application: (figure not preserved in the transcript)

But: only 1/3 of the graph can be drawn this way.

Page 23: Graph mining with kernel self-organizing map

Laplacian: property III [von Luxburg, 2007]

Min Cut problem: suppose that we have a connected graph. Finding a partition $A_1, \ldots, A_k$ of the vertices of the graph such that

$$\frac{1}{2} \sum_{i=1}^{k} \sum_{j \in A_i,\, j' \notin A_i} w_{j,j'}$$

is minimum is equivalent to solving

$$H = \arg\min_{h \in \mathbb{R}^{n \times k}} \operatorname{Tr}\left(h^T L h\right) \quad \text{subject to} \quad h^T h = I \ \text{ and } \ h_i = \frac{1}{\sqrt{|A_i|}} \mathbf{1}_{A_i}$$

⇒ NP-complete problem.

It can be approximated by the relaxed problem

$$H = \arg\min_{h \in \mathbb{R}^{n \times k}} \operatorname{Tr}\left(h^T L h\right) \quad \text{subject to} \quad h^T h = I$$

Spectral clustering: find the eigenvectors associated with the $k$ smallest eigenvalues of $L$, collect them as the columns of $H$, and cluster the rows of $H$.
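The relaxed problem leads to the following kind of procedure, sketched here in NumPy. The slide does not say how the rows of H are clustered. A plain k-means (Lloyd iterations with a deterministic farthest-point start) is used as an assumption, and the toy graph of two triangles joined by a weak edge is invented for the example.

```python
import numpy as np


def spectral_embedding(W, k):
    """Columns of H: eigenvectors of L = D - W for the k smallest eigenvalues."""
    L = np.diag(W.sum(axis=1)) - W
    _, eigvecs = np.linalg.eigh(L)   # eigenvalues are returned in ascending order
    return eigvecs[:, :k]


def kmeans_rows(H, k, n_iter=50):
    """Plain Lloyd iterations on the rows of H (assumed clustering step)."""
    centers = [H[0]]
    for _ in range(1, k):
        # Next center: the row farthest from the centers chosen so far.
        dist = np.min([np.sum((H - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(H[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d2 = ((H[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = H[labels == c].mean(axis=0)
    return labels


# Hypothetical graph: two triangles linked by one weak edge (weight 0.1).
W = np.zeros((6, 6))
for a, b, w in [(0, 1, 1), (0, 2, 1), (1, 2, 1),
                (3, 4, 1), (3, 5, 1), (4, 5, 1), (2, 3, 0.1)]:
    W[a, b] = W[b, a] = w

labels = kmeans_rows(spectral_embedding(W, 2), 2)
print(labels)
```

On this toy graph the two triangles should end up in different clusters, since cutting the weak edge is by far the cheapest cut.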

Page 26: Graph mining with kernel self-organizing map

A regularized version of L

Regularization: the diffusion matrix: for $\beta > 0$,

$$K^\beta = e^{-\beta L} = \sum_{k=0}^{+\infty} \frac{(-\beta L)^k}{k!}$$

⇒

$$k^\beta : V \times V \to \mathbb{R}, \qquad (x_i, x_j) \mapsto K^\beta_{i,j}$$

is the diffusion kernel (or heat kernel).
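One way to compute the diffusion matrix for a small graph is through the eigendecomposition of L, as in the sketch below. It also compares the result with a truncated power series that starts from the identity term (k = 0). The function names and the 3-vertex path graph are assumptions made for the example.

```python
import numpy as np


def diffusion_kernel(L, beta):
    """K^beta = exp(-beta L) via L = U diag(lambda) U^T (L symmetric)."""
    eigvals, U = np.linalg.eigh(L)
    return (U * np.exp(-beta * eigvals)) @ U.T


def diffusion_series(L, beta, terms=30):
    """Truncated series sum_{k=0}^{terms-1} (-beta L)^k / k!."""
    term = np.eye(len(L))
    total = term.copy()
    for k in range(1, terms):
        term = term @ (-beta * L) / k
        total += term
    return total


# Hypothetical path graph 0 - 1 - 2 with unit weights.
W = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
L = np.diag(W.sum(axis=1)) - W
K = diffusion_kernel(L, beta=0.5)

print(np.allclose(K, diffusion_series(L, 0.5)))   # both computations agree
print(np.allclose(K, K.T))                        # K is symmetric
print(np.linalg.eigvalsh(K).min() > 0)            # K is positive definite
print(np.allclose(K.sum(axis=1), 1.0))            # rows sum to 1 since L @ 1 = 0
```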

Page 27: Graph mining with kernel self-organizing map

Diffusion process on the graph

If $Z_0 = (1\ 1\ 1 \ldots 1\ 1)^T$ is the "energy" of each vertex at time 0 and if a small fraction $\varepsilon$ of this energy is propagated along the edges of the graph at each time step, then after $t$ steps, the energy of the vertices of the graph is:

$$Z_t = (I - \varepsilon L)^t Z_0$$

Limit: shrink the time step to $\Delta t$, that is, replace $t$ by $t/\Delta t$ and $\varepsilon$ by $\varepsilon \Delta t$; then letting $\Delta t \to 0$ (continuous process) gives

$$\lim_{\Delta t \to 0} Z_t = e^{-\varepsilon t L} Z_0 = K^{\varepsilon t} Z_0$$
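The passage to the continuous limit can be checked numerically. The sketch below takes the one-step matrix as I − εΔt L, so that the limit matches the convention K^β = e^{−βL}. Since L applied to the all-ones vector is zero, the all-ones starting energy would stay constant, so the sketch injects one unit of energy at a single vertex instead. That starting vector, the path graph and the values of ε and t are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical path graph 0 - 1 - 2 with unit weights.
W = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
L = np.diag(W.sum(axis=1)) - W

eps, t = 0.3, 2.0
eigvals, U = np.linalg.eigh(L)
K = (U * np.exp(-eps * t * eigvals)) @ U.T   # exp(-eps * t * L)

Z0 = np.array([1.0, 0.0, 0.0])               # assumed: unit energy on vertex 0

errors = []
for n_steps in (10, 100, 10000):
    dt = t / n_steps
    step = np.eye(3) - eps * dt * L          # one discrete diffusion step
    Zt = np.linalg.matrix_power(step, n_steps) @ Z0
    errors.append(np.abs(Zt - K @ Z0).max())
    print(n_steps, errors[-1])
```

The printed gap between the discrete process and the kernel shrinks as the time step gets smaller.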

Page 29: Graph mining with kernel self-organizing map

Properties

1 Diffusion on the graph: $k^\beta(x_i, x_j) \simeq$ quantity of energy accumulated in $x_j$ after a given time if energy 1 is injected in $x_i$ at time 0 and if diffusion is done continuously along the edges. $\beta \simeq$ intensity of diffusion;

2 Regularization operator: for $u \in \mathbb{R}^n \sim V$, $u^T (K^\beta)^{-1} u$ is higher for vectors $u$ that vary a lot over "close" vertices of the graph. $\beta \simeq$ intensity of regularization (for small $\beta$, direct neighbors are more important);

3 Reproducing kernel property: $k^\beta$ is symmetric and positive ⇒ there exist a Hilbert space $(\mathcal{H}, \langle \cdot, \cdot \rangle)$ and $\phi : V \to \mathcal{H}$ such that

$$k^\beta(x_i, x_j) = \langle \phi(x_i), \phi(x_j) \rangle$$
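For a finite graph, one explicit feature map can be read off the eigendecomposition of the kernel matrix. The sketch below shows one possible choice (the map φ is not unique) and the induced squared distances, which only require kernel values. The function names and the path graph are hypothetical.

```python
import numpy as np


def feature_map(K):
    """Rows are phi(x_i) in R^n such that K = Phi @ Phi.T (one possible choice)."""
    eigvals, U = np.linalg.eigh(K)
    eigvals = np.clip(eigvals, 0.0, None)   # guard against tiny negative round-off
    return U * np.sqrt(eigvals)


def kernel_distance2(K):
    """||phi(x_i) - phi(x_j)||^2 = K_ii + K_jj - 2 K_ij."""
    diag = np.diag(K)
    return diag[:, None] + diag[None, :] - 2 * K


# Diffusion kernel of a hypothetical path graph 0 - 1 - 2, beta = 0.5.
W = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
L = np.diag(W.sum(axis=1)) - W
lam, U = np.linalg.eigh(L)
K = (U * np.exp(-0.5 * lam)) @ U.T

Phi = feature_map(K)
D2 = kernel_distance2(K)
D2_direct = ((Phi[:, None, :] - Phi[None, :, :]) ** 2).sum(axis=2)
print(np.allclose(Phi @ Phi.T, K), np.allclose(D2, D2_direct))
```

The second check is what makes kernel methods practical here: distances in the feature space never need φ explicitly.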

Page 33: Graph mining with kernel self-organizing map

Kohonen map

Mapping the data onto a 2-dimensional map

Each neuron of the map, $i = 1, \ldots, M$, is associated with a prototype $p_i \in \mathcal{H}$;

Neurons are related to each other by a neighborhood relationship ("distance": $d$).

Classifying the vertices on the map

Each $x_i$ is associated with a neuron (cluster or class) of the map, $f(x_i)$.

Page 34: Graph mining with kernel self-organizing map

Preserving the initial topology

Energy

The goal is to minimize the energy of the map:

$$E = \int \sum_{i=1}^{M} h(d(f(x), i))\, \|x - p_i\|^2_{\mathcal{H}} \, dP(x)$$

where $h$ is a decreasing function (e.g. $h(t) = \alpha e^{-t/2\sigma^2}$).

The energy is approximated by its empirical version:

$$E_n = \sum_{j=1}^{n} \sum_{i=1}^{M} h(d(f(x_j), i))\, \|x_j - p_i\|^2_{\mathcal{H}}$$

and the minimization is approximated by the SOM algorithm.
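The empirical energy can be evaluated from a kernel matrix alone. The sketch below assumes that the observations live in the feature space as φ(x_j), that each prototype is a combination p_i = Σ_u γ_iu φ(x_u), and that d is the Euclidean distance between neuron positions on a rectangular grid. These choices, the function names and the default α = σ = 1 are illustrative assumptions.

```python
import numpy as np


def grid_distances(rows, cols):
    """Euclidean distance d between neurons of a rows x cols rectangular map."""
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)],
                      dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))


def neighborhood(d, alpha=1.0, sigma=1.0):
    """h(t) = alpha * exp(-t / (2 sigma^2)), the example given on the slide."""
    return alpha * np.exp(-d / (2 * sigma ** 2))


def empirical_energy(K, gamma, f, d):
    """E_n = sum_j sum_i h(d(f(x_j), i)) * ||phi(x_j) - p_i||^2."""
    # dist2[i, j] = K_jj - 2 sum_u gamma_iu K_uj + sum_{u,v} gamma_iu gamma_iv K_uv
    quad = np.einsum("iu,uv,iv->i", gamma, K, gamma)
    dist2 = np.diag(K)[None, :] - 2 * gamma @ K + quad[:, None]
    H = neighborhood(d)[:, f]     # H[i, j] = h(d(i, f(x_j)))
    return float((H * dist2).sum())


# Tiny check: 2 orthonormal points, 1 x 2 map, each prototype on one point.
K = np.eye(2)
gamma = np.eye(2)
f = np.array([0, 1])
E = empirical_energy(K, gamma, f, grid_distances(1, 2))
print(E)
```

In this tiny case every point sits on its own prototype, so the energy comes only from the neighboring neuron: 2 points, squared distance 2, weight exp(-1/2).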

Page 38: Graph mining with kernel self-organizing map

Batch kernel SOM [Villa and Rossi, 2007]

Initialize randomly $\gamma^0_{ji} \in \mathbb{R}$ ($i = 1, \ldots, n$, $j = 1, \ldots, M$) and $p^0_j = \sum_{i=1}^n \gamma^0_{ji} \phi(x_i)$.

Then, for $l = 1, \ldots, n$ repeat

Assignment step

for all $x_i$,

$$f^l(x_i) = \arg\min_{j=1,\ldots,M} \left\| \phi(x_i) - \sum_{u=1}^n \gamma^{l-1}_{ju} \phi(x_u) \right\|_{\mathcal{H}}$$

which, expanded with the kernel (the term $k^\beta(x_i, x_i)$ does not depend on $j$ and is dropped), reads

$$f^l(x_i) = \arg\min_{j=1,\ldots,M} \sum_{u,u'=1}^n \gamma^{l-1}_{ju} \gamma^{l-1}_{ju'} k^\beta(x_u, x_{u'}) - 2 \sum_{u=1}^n \gamma^{l-1}_{ju} k^\beta(x_u, x_i)$$

Representation step

$$\gamma^l_j = \arg\min_{\gamma \in \mathbb{R}^n} \sum_{i=1}^n h(d(f^l(x_i), j)) \left\| \phi(x_i) - \sum_{u=1}^n \gamma_u \phi(x_u) \right\|^2_{\mathcal{H}}$$

whose solution is

$$\gamma^l_{ji} = \frac{h(d(f^l(x_i), j))}{\sum_{i'=1}^n h(d(f^l(x_{i'}), j))}$$
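The two steps can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the random start uses normalized positive coefficients, the neighborhood is h(t) = exp(−t/2σ²) with a fixed σ, the number of iterations is a free parameter, and the function name `batch_kernel_som` and the toy graph are hypothetical. The result depends on the random initialization.

```python
import numpy as np


def batch_kernel_som(K, grid_dist, n_iter=20, sigma=1.0, seed=0):
    """Batch kernel SOM on an n x n kernel matrix K.

    grid_dist[j, j2] is the map distance d between neurons j and j2.
    Returns f (neuron index of each vertex) and gamma (M x n coefficients).
    """
    rng = np.random.default_rng(seed)
    n, M = K.shape[0], grid_dist.shape[0]
    gamma = rng.random((M, n))
    gamma /= gamma.sum(axis=1, keepdims=True)   # assumed: convex combinations
    h = np.exp(-grid_dist / (2 * sigma ** 2))   # assumed neighborhood function
    for _ in range(n_iter):
        # Assignment step: ||phi(x_i) - p_j||^2 up to the constant K_ii.
        quad = np.einsum("ju,uv,jv->j", gamma, K, gamma)
        cross = gamma @ K                       # cross[j, i] = sum_u gamma_ju K_ui
        f = np.argmin(quad[:, None] - 2 * cross, axis=0)
        # Representation step: closed-form coefficients.
        weights = h[:, f]                       # weights[j, i] = h(d(j, f(x_i)))
        gamma = weights / weights.sum(axis=1, keepdims=True)
    return f, gamma


# Hypothetical graph: two triangles linked by one weak edge, beta = 0.5.
W = np.zeros((6, 6))
for a, b, w in [(0, 1, 1), (0, 2, 1), (1, 2, 1),
                (3, 4, 1), (3, 5, 1), (4, 5, 1), (2, 3, 0.1)]:
    W[a, b] = W[b, a] = w
L = np.diag(W.sum(axis=1)) - W
lam, U = np.linalg.eigh(L)
K = (U * np.exp(-0.5 * lam)) @ U.T

grid_dist = np.array([[0., 1.], [1., 0.]])      # 1 x 2 map
f, gamma = batch_kernel_som(K, grid_dist)
print(f)
```

Because the prototypes are stored only through the coefficients gamma, the whole algorithm runs on the kernel matrix and never needs the feature map explicitly.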

Page 41: Graph mining with kernel self-organizing map

Results on a 7 × 7 rectangular map

(The result figures are not preserved in this transcript.)

Page 44: Graph mining with kernel self-organizing map

Expected developments

1 Hierarchical clustering;

2 Achieve a classification based on a density criterion (joint work with S. Gadat);

3 Adapting the algorithm to very large graphs (thousands of vertices).

Page 45: Graph mining with kernel self-organizing map

References

Boulet, R., Jouve, B., Rossi, F., and Villa, N. (2008). Batch kernel SOM and related Laplacian methods for social network analysis. Neurocomputing. To appear.

Villa, N. and Rossi, F. (2007). A comparison between dissimilarity SOM and kernel SOM for clustering the vertices of a graph. In Proceedings of the 6th Workshop on Self-Organizing Maps (WSOM 07), Bielefeld, Germany.

von Luxburg, U. (2007). A tutorial on spectral clustering. Technical Report TR-149, Max Planck Institut für biologische Kybernetik. Available at http://www.kyb.mpg.de/publications/attachments/luxburg06_TR_v2_4139%5B1%5D.pdf.