Post on 22-Jun-2015
« Brain Connectivity Graph Classification »
Romain Chion
tutored by: S. Achard, M. Desvignes, F. Forbes, D. Vandeville
gipsa-lab
SYNOPSIS
INTRODUCTION TO GRAPHS
USUAL METHODS
LOCAL MEASURES
RESULTS
CONTEXT
• How to compare graphs to each other?
• Is it possible to model brain connectivity graphs (BCGs)?
• To what extent can we characterize BCGs?
GENERATIVE MODELS
Illustration « Small World », Collective dynamics of ‘small-world’ networks, D. J. Watts & S. H. Strogatz
Illustration « Preferential Attachment », Choice-driven phase transition in complex networks, P. L. Krapivsky and S. Redner
• Erdos-Renyi
• Forest Fire
• Kronecker
• Preferential Attachment
• Random k-regular
• Random Power Law
• Random Typing
• Small-World
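As an illustration of two of the models listed above, here is a minimal Python sketch of an Erdos-Renyi generator and a Watts-Strogatz-style small-world generator. The function names, the edge-set representation and the parameters `n`, `p`, `k` and `beta` are assumptions made for this example; the slides do not say which implementation was used.

```python
import random
from itertools import combinations


def erdos_renyi(n, p, rng):
    """G(n, p): each of the n*(n-1)/2 possible edges is kept with probability p."""
    return {(u, v) for u, v in combinations(range(n), 2) if rng.random() < p}


def small_world(n, k, beta, rng):
    """Ring lattice with k neighbours per node (k even, k < n),
    then each lattice edge is rewired with probability beta."""
    edges = set()
    for u in range(n):
        for step in range(1, k // 2 + 1):
            v = (u + step) % n
            edges.add((min(u, v), max(u, v)))
    # Iterate over a snapshot so the set can be modified while rewiring.
    for u, v in sorted(edges):
        if rng.random() < beta:
            candidates = [w for w in range(n)
                          if w != u and (min(u, w), max(u, w)) not in edges]
            if candidates:
                w = rng.choice(candidates)
                edges.remove((u, v))
                edges.add((min(u, w), max(u, w)))
    return edges


rng = random.Random(42)  # fixed seed so the sketch is reproducible
er_edges = erdos_renyi(n=30, p=0.1, rng=rng)
sw_edges = small_world(n=30, k=4, beta=0.1, rng=rng)
```

Rewiring replaces one edge by another, so the small-world graph keeps the edge count of the initial lattice (n·k/2), which makes its density easy to control.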
GRAPH COMPARISON
STRUCTURAL MEASURES
• Transformation from one graph to another
  ex: Edit distance
LOCAL MEASURES (for each node)
• Nodes' tendency to form clusters, degree distribution, paths between nodes
  ex: Clustering, Characteristic Path Length
OVERALL MEASURES
• Average of all local measures, core and community formation
  ex: Assortativity, Centrality, Modularity, Diameter
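To make the difference between a per-node measure and its overall average concrete, here is a small sketch that computes node degrees and local clustering coefficients on a toy graph. The helper names and the toy graph are assumptions for illustration only; the clustering coefficient used is the usual fraction of linked neighbour pairs.

```python
def adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def local_clustering(adj, node):
    """Fraction of pairs of neighbours of `node` that are themselves linked."""
    nbrs = sorted(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i, a in enumerate(nbrs)
                for b in nbrs[i + 1:] if b in adj[a])
    return 2.0 * links / (k * (k - 1))


# Toy graph: triangle 0-1-2 with a pendant node 3 attached to node 2.
toy_edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
toy_adj = adjacency(toy_edges)

degrees = {v: len(toy_adj[v]) for v in toy_adj}                  # local
clustering = {v: local_clustering(toy_adj, v) for v in toy_adj}  # local
average_clustering = sum(clustering.values()) / len(clustering)  # overall
```

The overall value is a single number, whereas the per-node values (1, 1, 1/3 and 0 here) keep the information about how clustering is distributed over the graph.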
Graphlet counting
STATE OF THE ART : JANSSEN et al. 2012
Diagram:
• Learning Set → Amount of Graphlets → classifier learning
• Graph Instance → Amount of Graphlets → classifier input
• Classifier → Graph model
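For readers unfamiliar with graphlets, the sketch below counts the two connected 3-node graphlets (open path and triangle) by brute force over all node triples. It only illustrates what an "amount of graphlets" feature vector can look like; it is not the counting procedure of Janssen et al., and the function name is hypothetical.

```python
from itertools import combinations


def count_3node_graphlets(n, edges):
    """Count induced open paths and triangles among nodes 0..n-1."""
    edge_set = {frozenset(e) for e in edges}
    counts = {"path": 0, "triangle": 0}
    for trio in combinations(range(n), 3):
        present = sum(frozenset(pair) in edge_set
                      for pair in combinations(trio, 2))
        if present == 3:
            counts["triangle"] += 1
        elif present == 2:
            counts["path"] += 1
    return counts


# Toy graph: triangle 0-1-2 with a pendant node 3 attached to node 2.
graphlet_counts = count_3node_graphlets(4, [(0, 1), (1, 2), (0, 2), (2, 3)])
```

The brute-force loop is cubic in the number of nodes, which is acceptable for a toy example but not for large graphs; the resulting counts are what would be fed to the classifier.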
STATE OF THE ART : MOTALLEBI et al. 2013
Complex Networks Classification
BCG MODELING
Classification of BCGs among 4 generative models (Erdos-Renyi, Preferential Attachment, Random k-regular, Small-World)
Class     Prediction    E-R      P-A      R k-R    S-W
Control   Small-World   0.2502   0.2501   0.2492   0.2505
Patient   Small-World   0.2502   0.2501   0.2492   0.2505
Characterization with global measures and SVM classifier
Confidence of about 25% for each model, i.e. chance level with 4 classes
BCG IDENTIFICATION
                 true Control   true Patient   class precision
pred. Control    13             11             54.17%
pred. Patient    7              6              46.15%
class recall     65.00%         35.29%         50.16%
Identification results with global measures and SVM classifier
Classification accuracy 50.16%, chance level at 50%
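The precision and recall figures of the confusion matrix above can be recomputed from the four counts. The sketch below does so; the dictionary layout (rows are predicted classes, columns are true classes) follows the slide, and the variable names are assumptions.

```python
# Rows: predicted class; columns: true class (counts taken from the slide).
confusion = {
    "Control": {"Control": 13, "Patient": 11},
    "Patient": {"Control": 7, "Patient": 6},
}
classes = ["Control", "Patient"]

# Precision: correct predictions of a class / all predictions of that class.
precision = {c: confusion[c][c] / sum(confusion[c].values()) for c in classes}

# Recall: correct predictions of a class / all true members of that class.
recall = {c: confusion[c][c] / sum(confusion[p][c] for p in classes)
          for c in classes}

total = sum(sum(row.values()) for row in confusion.values())
pooled_accuracy = sum(confusion[c][c] for c in classes) / total
mean_precision = sum(precision.values()) / len(classes)
```

With these counts the mean of the two class precisions is about 50.16%, the figure quoted on the slide, while the pooled ratio of correct predictions is 19/37 (about 51.4%). How the original tool aggregated its result is not stated; either way the classifier is at chance level.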
RESEARCH QUESTION
« Global measures are not representative of local properties of graphs »
Local clustering coefficient histograms for 3 generative models
NORMALIZED HISTOGRAM
• Clustering Coefficient
• Characteristic Path Length
• Degree Distribution
• Efficiency
Diagram:
• Learning Set → Local measures histograms → Normalized Histograms → Average Normalized Histograms
• Graph Instance → Local measures histograms → Normalized Histograms
• Histogram Distances → minimum distance or classifier → Graph model
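The pipeline above can be sketched in a few lines: build a normalized histogram of a local measure for each graph, average the histograms of each model over the learning set, then assign a new graph to the model whose average histogram is closest. The function names, the bin settings, the toy values and the choice of the chi² distance are assumptions for this example.

```python
def normalized_histogram(values, bins=10, lo=0.0, hi=1.0):
    """Histogram of `values` over [lo, hi], scaled so the bins sum to 1."""
    counts = [0] * bins
    for x in values:
        idx = int((x - lo) / (hi - lo) * bins)
        counts[max(0, min(idx, bins - 1))] += 1  # clamp to the valid range
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * bins


def average_histogram(histograms):
    """Bin-wise mean of several histograms with the same number of bins."""
    n = len(histograms)
    return [sum(col) / n for col in zip(*histograms)]


def chi2(p, q):
    """Chi² distance, skipping bins that are empty in both histograms."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)


def classify(hist, model_histograms, distance=chi2):
    """Return the model whose average histogram is closest to `hist`."""
    return min(model_histograms,
               key=lambda m: distance(hist, model_histograms[m]))


# Hypothetical learning set: per-node clustering values for two graphs per model.
learning_set = {
    "model_A": [[0.1, 0.15, 0.2, 0.8], [0.05, 0.1, 0.3, 0.7]],
    "model_B": [[0.6, 0.7, 0.9, 0.95], [0.55, 0.8, 0.85, 0.9]],
}
model_histograms = {
    name: average_histogram([normalized_histogram(g, bins=2) for g in graphs])
    for name, graphs in learning_set.items()
}
instance = normalized_histogram([0.2, 0.3, 0.4, 0.9], bins=2)
predicted = classify(instance, model_histograms)
```

Normalizing each histogram makes graphs with different numbers of nodes comparable, since only the shape of the distribution is kept.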
HISTOGRAM DISTANCES
• Bin-to-bin (dis)similarity measures:
  Bhattacharyya
  Chi²
  Hellinger
• Shape-preserving dissimilarity measures:
  Earth Mover's Distance: the minimal work needed to move earth from one pile to another
  Match: bin-to-bin measure on cumulative histograms
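The formulas for these measures did not survive on the slide, so the sketch below uses commonly quoted textbook definitions for normalized histograms `p` and `q`; the exact variants used in the study (for instance a factor 1/2 in Chi²) are not known and are an assumption here.

```python
import math
from itertools import accumulate


def bhattacharyya(p, q):
    """-ln of the Bhattacharyya coefficient sum(sqrt(p_i * q_i))."""
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return -math.log(bc) if bc > 0 else math.inf


def hellinger(p, q):
    """sqrt(1 - Bhattacharyya coefficient), between 0 and 1."""
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return math.sqrt(max(0.0, 1.0 - bc))


def chi2(p, q):
    """sum((p_i - q_i)^2 / (p_i + q_i)) over bins that are not both empty."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)


def match(p, q):
    """L1 distance between the cumulative histograms."""
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))


# Two histograms with the same shape, shifted by one bin.
p = [0.5, 0.5, 0.0]
q = [0.0, 0.5, 0.5]
distances = {
    "bhattacharyya": bhattacharyya(p, q),
    "hellinger": hellinger(p, q),
    "chi2": chi2(p, q),
    "match": match(p, q),
}
```

For one-dimensional histograms of equal total mass, with the ground distance taken as the difference of bin indices, the Earth Mover's Distance reduces to the match distance, so `match` can stand in for it in that setting. Bin-to-bin measures only compare bins of the same index, whereas the cumulative form also accounts for how far the mass has been shifted.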
GENERATED DATA
Performance
• graphlets: 78%
• global measures: 88% to 97.3% (6 measures or more)
• local measures: 86% or 100% (only 1 measure)
Classification results
Local measures
Model     Accuracy
SW        100%
RPL       100%
RkR       100%
PA        100%
KG        100%
FF        100%
ER        100%
Overall   100%
Global measures
Model     Accuracy
SW        100%
RTG       96%
RPL       98%
PA        99%
KG        96%
FF        98%
ER        93%
Overall   97.2%
CONNECTIVITY GRAPHS
Confusion matrix of Control / Patient identification
GLOBAL, A.N.N.
       C     P
C      11    9     55%
P      5     12    71%
       69%   57%   63%
HISTOGRAM: CLUSTERING AND CHI²
       C     P
C      18    2     90%
P      4     13    76%
       82%   87%   83%
MAX global measures 63% vs. 83% MAX histograms
BCG MODELING
Model   Clustering   Degree
ER      0.418        0.133
FF      0.207        0.074
KG      0.112        0.211
RPL     0.156        0.088
PA      0.437        0.242
RkR     0.459        0.183
SW      0.103        0.238
EMD distance between BCGs and models for two histograms
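Reading the table as a nearest-model lookup, the closest model is the one with the smallest distance for a given histogram. The sketch below applies that rule to the values from the slide; the dictionary layout and the function name are assumptions.

```python
# EMD between BCGs and each generative model, copied from the table above.
emd_to_models = {
    "ER":  {"clustering": 0.418, "degree": 0.133},
    "FF":  {"clustering": 0.207, "degree": 0.074},
    "KG":  {"clustering": 0.112, "degree": 0.211},
    "RPL": {"clustering": 0.156, "degree": 0.088},
    "PA":  {"clustering": 0.437, "degree": 0.242},
    "RkR": {"clustering": 0.459, "degree": 0.183},
    "SW":  {"clustering": 0.103, "degree": 0.238},
}


def nearest_model(table, histogram):
    """Model with the smallest distance for the given histogram type."""
    return min(table, key=lambda model: table[model][histogram])


closest_by_clustering = nearest_model(emd_to_models, "clustering")
closest_by_degree = nearest_model(emd_to_models, "degree")
```

The two histograms do not agree: Small-World is closest for the clustering histogram, Forest Fire for the degree histogram, with Kronecker and Random Power Law close behind in each case.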
SCALABILITY
Rows: increasing density (d); columns: increasing number of nodes (n)
d \ n 100   200   300   400   500   600   700   800   900   1000  1100  1200  1300  1400  1500  1600  1700  1800  1900  2000
0.01  11%   10%   10%   14%   14%   12%   7%    6%    14%   11%   23%   29%   29%   30%   29%   34%   34%   36%   36%   45%
0.02  12%   18%   18%   16%   20%   22%   30%   39%   41%   42%   43%   42%   42%   44%   42%   43%   43%   42%   42%   43%
0.03  10%   19%   20%   27%   28%   41%   41%   45%   43%   43%   43%   42%   41%   40%   44%   43%   44%   43%   43%   43%
0.04  17%   26%   32%   40%   43%   41%   44%   40%   43%   43%   43%   43%   43%   43%   45%   43%   43%   43%   43%   42%
0.05  16%   25%   41%   42%   41%   43%   42%   43%   38%   40%   43%   42%   42%   43%   42%   43%   42%   43%   43%   43%
0.06  33%   41%   43%   44%   43%   42%   42%   46%   41%   43%   43%   43%   42%   43%   43%   43%   49%   43%   44%   43%
0.07  36%   57%   54%   65%   62%   70%   67%   72%   71%   72%   69%   71%   68%   72%   85%   85%   83%   86%   84%   86%
0.08  44%   69%   72%   72%   72%   75%   69%   86%   84%   86%   86%   86%   86%   86%   86%   86%   84%   86%   86%   86%
0.09  41%   81%   85%   93%   96%   93%   90%   97%   94%   90%   86%   86%   84%   85%   71%   71%   70%   72%   71%   71%
0.1   49%   88%   86%   100%  96%   100%  99%   84%   81%   85%   86%   86%   86%   86%   86%   86%   86%   86%   86%   86%
0.11  52%   99%   93%   90%   89%   91%   92%   78%   74%   72%   71%   71%   69%   71%   71%   71%   71%   71%   71%   71%
0.12  62%   83%   85%   72%   71%   68%   72%   68%   74%   73%   72%   71%   71%   71%   72%   71%   71%   72%   71%   71%
0.13  62%   64%   70%   64%   68%   68%   71%   68%   67%   67%   70%   66%   69%   57%   65%   61%   56%   43%   48%   43%
0.14  59%   57%   48%   43%   43%   44%   44%   44%   43%   43%   43%   43%   43%   43%   43%   43%   43%   43%   43%   43%
0.15  54%   49%   45%   49%   42%   42%   45%   42%   42%   43%   43%   43%   43%   43%   43%   43%   43%   43%   43%   43%
0.16  45%   44%   43%   43%   44%   45%   42%   44%   43%   43%   43%   43%   43%   43%   43%   43%   43%   42%   43%   43%
0.17  42%   41%   41%   40%   42%   42%   42%   42%   43%   44%   43%   43%   42%   41%   41%   42%   42%   41%   39%   39%
0.18  45%   43%   44%   43%   43%   45%   42%   42%   43%   42%   43%   41%   41%   37%   40%   37%   34%   32%   31%   31%
0.19  44%   45%   43%   39%   43%   42%   41%   42%   42%   40%   36%   33%   30%   32%   29%   29%   29%   29%   29%   29%
0.2   43%   43%   41%   45%   41%   41%   40%   35%   30%   35%   31%   29%   29%   29%   29%   29%   29%   29%   29%   29%
LEARNING STABILITY
CROSS-VALIDATION d (columns: increasing number of nodes)
n           100   200   300   400   500   600   700   800   900   1000  1100  1200  1300  1400  1500  1600  1700  1800  1900  2000
            67%   70%   67%   69%   70%   71%   73%   75%   74%   77%   77%   78%   77%   78%   79%   80%   80%   78%   79%   81%
CROSS-VALIDATION n (columns: increasing density)
d           0.01  0.02  0.03  0.04  0.05  0.06  0.07  0.08  0.09  0.1   0.11  0.12  0.13  0.14  0.15  0.16  0.17  0.18  0.19  0.2
PREC        76%   82%   95%   97%   97%   97%   99%   99%   99%   99%   97%   99%   98%   99%   99%   99%   99%   99%   99%   99%
MIN PREC.   32%   31%   80%   88%   89%   91%   96%   96%   96%   96%   91%   93%   92%   94%   97%   96%   96%   98%   98%   98%
MIN CLASS.  ER    ER    FF    FF    KG    KG    KG    SW    SW    SW    KG    SW    SW    SW    SW    SW    SW    SW    SW    SW
RANDOMIZATION STABILITY
Columns: increasing randomness
      0%    5%    10%   15%   20%   25%   30%   35%   40%   45%   50%   55%   60%   65%   70%   75%   80%   85%   90%
ER    100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%
FF    100%  100%  97%   97%   100%  97%   100%  100%  100%  100%  100%  97%   100%  100%  100%  100%  100%  100%  100%
KG    100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%
PA    100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%
RkR   100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%
RPL   100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%  100%
SW    100%  0%    0%    0%    0%    0%    0%    0%    0%    0%    0%    0%    0%    0%    0%    0%    0%    0%    0%
REMOVING CLASSES
Erdos-Renyi: FF, RPL
Forest Fire: RPL, SW
Kronecker Graph: FF, 77% SW, 23% RPL
Preferential Attachment: FF, RPL
Random k-Regular: FF, RPL
Random Power Law: FF, 92% SW, 8% PA
Small-World: FF, RPL
Connectivity Graphs: FF, RPL, PA, SW…
PCA : RESULTS
Component   Standard deviation   Proportion of variance   Cumulative variance
PC 1        0.415                0.750                    0.750
PC 2        0.170                0.126                    0.876
PC 3        0.132                0.076                    0.952
PC 4        0.101                0.044                    0.996
PC 5        0.028                0.004                    0.999
PC 6        0.011                0.000                    1.000
PC 7        0.003                0.000                    1.000
Plot: CUMULATIVE VARIANCE against NUMBER OF PRINCIPAL COMPONENTS
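The three numeric columns of the PCA table are linked by simple arithmetic, assuming the first column is the standard deviation of each component (the slide does not label it; the computation below is consistent with that reading up to rounding). The sketch recomputes the proportion and cumulative variance and finds how many components reach a chosen threshold; the 95% threshold and the function name are assumptions.

```python
from itertools import accumulate

# First numeric column of the table, assumed to be standard deviations.
std_dev = [0.415, 0.170, 0.132, 0.101, 0.028, 0.011, 0.003]

variances = [s * s for s in std_dev]
total_variance = sum(variances)
proportion = [v / total_variance for v in variances]
cumulative = list(accumulate(proportion))


def components_needed(cumulative_variance, threshold):
    """Smallest number of leading components whose cumulative variance
    reaches the threshold."""
    return next(i + 1 for i, c in enumerate(cumulative_variance)
                if c >= threshold)


n_components_95 = components_needed(cumulative, 0.95)
```

Under this reading, three components already carry about 95% of the variance, and the first component alone carries about 75%.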
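PCA : STABILITY
AFTER PCA (same layout as the SCALABILITY grid: rows are densities 0.01 to 0.2, columns are 100 to 1900 nodes)
14% 14% 14% 2% 0% 20% 11% 5% 14% 15% 14% 15% 15% 15% 15% 17% 17% 20% 23%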
1% 14% 25% 15% 20% 28% 25% 24% 19% 36% 36% 37% 37% 40% 40% 55% 43% 42% 43%
6% 26% 31% 30% 35% 43% 46% 63% 62% 64% 63% 58% 60% 61% 67% 61% 66% 63% 61%
27% 34% 42% 45% 53% 55% 60% 66% 67% 67% 66% 68% 66% 64% 63% 51% 55% 57% 55%
31% 43% 48% 57% 59% 60% 63% 70% 66% 69% 71% 70% 70% 71% 71% 58% 70% 78% 70%
32% 51% 56% 69% 70% 66% 68% 72% 71% 74% 72% 86% 86% 85% 84% 71% 85% 86% 83%
34% 62% 68% 71% 70% 71% 83% 86% 87% 86% 86% 85% 84% 85% 86% 79% 84% 86% 86%
36% 67% 67% 79% 85% 86% 86% 86% 84% 86% 86% 86% 86% 86% 86% 86% 85% 86% 86%
40% 76% 94% 99% 100% 100% 98% 99% 99% 100% 100% 100% 99% 100% 94% 96% 96% 83% 82%
46% 96% 99% 100% 98% 100% 99% 98% 98% 98% 100% 100% 100% 100% 88% 88% 87% 88% 86%
52% 100% 96% 100% 99% 100% 100% 95% 94% 92% 93% 96% 92% 91% 86% 86% 86% 86% 86%
57% 98% 100% 99% 100% 98% 87% 73% 74% 77% 75% 72% 72% 73% 72% 71% 71% 72% 71%
58% 80% 85% 68% 73% 67% 70% 57% 55% 59% 57% 57% 57% 58% 58% 57% 57% 57% 57%
61% 64% 69% 64% 66% 61% 63% 59% 58% 58% 57% 57% 58% 57% 57% 57% 57% 58% 57%
65% 57% 67% 62% 59% 60% 58% 56% 58% 57% 57% 58% 57% 57% 58% 58% 58% 58% 57%
68% 59% 61% 53% 56% 57% 57% 58% 58% 57% 57% 57% 57% 57% 57% 57% 58% 57% 57%
66% 56% 56% 42% 45% 52% 57% 57% 57% 58% 57% 58% 57% 57% 57% 57% 57% 57% 57%
62% 57% 61% 43% 43% 47% 54% 58% 58% 55% 57% 57% 57% 57% 58% 58% 58% 57% 57%
60% 58% 57% 43% 43% 43% 44% 49% 54% 47% 46% 56% 57% 57% 57% 57% 57% 56% 57%
57% 59% 52% 46% 43% 42% 43% 42% 41% 43% 44% 43% 43% 43% 43% 43% 43% 43% 43%
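BEFORE PCA (first 19 columns of the SCALABILITY grid: rows are densities 0.01 to 0.2, columns are 100 to 1900 nodes)
11% 10% 10% 14% 14% 12% 7% 6% 14% 11% 23% 29% 29% 30% 29% 34% 34% 36% 36%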
12% 18% 18% 16% 20% 22% 30% 39% 41% 42% 43% 42% 42% 44% 42% 43% 43% 42% 42%
10% 19% 20% 27% 28% 41% 41% 45% 43% 43% 43% 42% 41% 40% 44% 43% 44% 43% 43%
17% 26% 32% 40% 43% 41% 44% 40% 43% 43% 43% 43% 43% 43% 45% 43% 43% 43% 43%
16% 25% 41% 42% 41% 43% 42% 43% 38% 40% 43% 42% 42% 43% 42% 43% 42% 43% 43%
33% 41% 43% 44% 43% 42% 42% 46% 41% 43% 43% 43% 42% 43% 43% 43% 49% 43% 44%
36% 57% 54% 65% 62% 70% 67% 72% 71% 72% 69% 71% 68% 72% 85% 85% 83% 86% 84%
44% 69% 72% 72% 72% 75% 69% 86% 84% 86% 86% 86% 86% 86% 86% 86% 84% 86% 86%
41% 81% 85% 93% 96% 93% 90% 97% 94% 90% 86% 86% 84% 85% 71% 71% 70% 72% 71%
49% 88% 86% 100% 96% 100% 99% 84% 81% 85% 86% 86% 86% 86% 86% 86% 86% 86% 86%
52% 99% 93% 90% 89% 91% 92% 78% 74% 72% 71% 71% 69% 71% 71% 71% 71% 71% 71%
62% 83% 85% 72% 71% 68% 72% 68% 74% 73% 72% 71% 71% 71% 72% 71% 71% 72% 71%
62% 64% 70% 64% 68% 68% 71% 68% 67% 67% 70% 66% 69% 57% 65% 61% 56% 43% 48%
59% 57% 48% 43% 43% 44% 44% 44% 43% 43% 43% 43% 43% 43% 43% 43% 43% 43% 43%
54% 49% 45% 49% 42% 42% 45% 42% 42% 43% 43% 43% 43% 43% 43% 43% 43% 43% 43%
45% 44% 43% 43% 44% 45% 42% 44% 43% 43% 43% 43% 43% 43% 43% 43% 43% 42% 43%
42% 41% 41% 40% 42% 42% 42% 42% 43% 44% 43% 43% 42% 41% 41% 42% 42% 41% 39%
45% 43% 44% 43% 43% 45% 42% 42% 43% 42% 43% 41% 41% 37% 40% 37% 34% 32% 31%
44% 45% 43% 39% 43% 42% 41% 42% 42% 40% 36% 33% 30% 32% 29% 29% 29% 29% 29%
43% 43% 41% 45% 41% 41% 40% 35% 30% 35% 31% 29% 29% 29% 29% 29% 29% 29% 29%
CROSS-VALIDATION d: increase of 5 to 16%, average rising from 75% to 84%
CROSS-VALIDATION n: increase of up to 5%, average rising from 96% to 97%
BEFORE AFTER
PCA : INTERPRETATION
Biplot: visual representation
Axes: COMPONENT 1, COMPONENT 2
Labels: RANDOM POWER LAW, SMALL WORLD, FOREST FIRE, PREF ATTACHMENT, ERDOS RENYI, K REGULAR
Legend: FORMER ATTRIBUTES VECTORS
CONCLUSION
Excellent performance on generated data
Histograms of local measures are useful
Local clustering is particularly important
Still dependent on which models exist and how many there are
Results on connectivity data are still lacking
Combined models should be considered
A basis of histograms remains to be constructed