Chapter 2: Graph metrics and generative models
Sophie Achard
GIPSA-lab, CNRS, Univ. Grenoble Alpes
M1 MIASHS
Sophie Achard (CNRS, Grenoble) Chapitre 2 26/09/2018 1 / 23
Graph features: sparsity
[Figure: two example graphs on 11 nodes. Left: low sparsity. Right: high sparsity.]
Density = number of edges present in the graph divided by the total number of possible edges.
$$D_s(G) = \frac{|E|}{N(N-1)/2}$$
where N is the number of vertices and E is the set of edges.
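The density formula can be checked on a toy graph. The sketch below is only an illustration: the function name `density` and the edge-list representation are assumptions, and the graph is taken to be simple and undirected.

```python
def density(n_nodes, edges):
    """Density of a simple undirected graph: |E| / (N (N - 1) / 2)."""
    # Store each edge as an unordered pair so that (u, v) and (v, u)
    # count once. Self-loops are assumed to be absent.
    unique_edges = {frozenset(e) for e in edges}
    possible = n_nodes * (n_nodes - 1) / 2
    return len(unique_edges) / possible


# Path on 4 nodes: 3 edges out of 6 possible.
path_edges = [(0, 1), (1, 2), (2, 3)]
# Complete graph on 4 nodes: all 6 possible edges are present.
complete_edges = [(i, j) for i in range(4) for j in range(i + 1, 4)]

print(density(4, path_edges))
print(density(4, complete_edges))
```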
Graph features: degree
[Figure: two example graphs on 11 nodes. Left: low degree. Right: high degree.]
Degree = number of connections that a node makes to other nodes.
$A = [A_{ij}]_{1 \le i,j \le N}$ is the adjacency matrix, with $A_{ij} = 0$ or $1$ for $1 \le i, j \le N$.
$$d_i = \sum_{j \in G} A_{ij}.$$
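As a small illustration, the degree of each node is the corresponding row sum of the adjacency matrix. The star graph and the helper name `degrees` below are assumptions chosen for the example.

```python
def degrees(adjacency):
    """Row sums of a 0/1 adjacency matrix: d_i = sum_j A_ij."""
    return [sum(row) for row in adjacency]


# Star graph on 4 nodes: node 0 is linked to nodes 1, 2 and 3.
A = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]

d = degrees(A)
print(d)
# In an undirected graph each edge is counted at both of its ends,
# so the degrees sum to twice the number of edges.
print(sum(d) // 2)
```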
Degree distribution
[Figure: example graph with each node labelled by its degree, next to a bar plot of its degree distribution. x-axis: degree (0 to 6). y-axis: fraction of vertices with given degree (0.0 to 0.7).]
degree sequence = {1, 1, 1, 1, 1, 1, 1, 1, 2, 4, 6}
Given a degree sequence, the degree distribution is the set of discrete probabilities $\{p(k)\}$ for $k = 0, \ldots, N-1$ such that
$$p(k) = \frac{\#\{i \in V,\ d_i = k\}}{N}$$
Verify that
$$\sum_{k=0}^{N-1} p(k) = 1$$
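The verification can also be done numerically. The sketch below applies the definition to the degree sequence {1, 1, 1, 1, 1, 1, 1, 1, 2, 4, 6} from the previous slide; the function name `degree_distribution` is an assumption.

```python
from collections import Counter


def degree_distribution(degree_sequence):
    """Return [p(0), ..., p(N-1)] with p(k) = #{i : d_i = k} / N."""
    n = len(degree_sequence)
    counts = Counter(degree_sequence)
    return [counts.get(k, 0) / n for k in range(n)]


sequence = [1, 1, 1, 1, 1, 1, 1, 1, 2, 4, 6]
p = degree_distribution(sequence)

for k, pk in enumerate(p):
    if pk > 0:
        print(k, pk)
print("sum of p(k):", sum(p))
```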
Power law: scale-free graphs
$$P(k) \sim k^{-\gamma}$$
Scale-free
Characterisation of graphs using the distribution of degrees.
Representation of the cumulative distribution in a log-log plot.
[Figure: log-log plot of the cumulative degree distribution. x-axis: log(k) (0 to 4). y-axis: log(cumulative distribution) (-4 to -1). Legend: + data, -- power law, .. exponential law, - truncated power law.]
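The points of such a log-log plot can be computed directly from a degree sequence. This is a minimal sketch, assuming the plotted quantity is the complementary cumulative distribution P(degree >= k), a common convention that the slide does not state explicitly; the function name is hypothetical.

```python
import math
from collections import Counter


def log_log_cumulative(degree_sequence):
    """Return (log k, log P(degree >= k)) for each observed degree k >= 1."""
    n = len(degree_sequence)
    counts = Counter(degree_sequence)
    points = []
    remaining = n  # number of vertices whose degree is >= the current k
    for k in sorted(counts):
        if k >= 1:  # log(0) is undefined, so degree 0 is skipped
            points.append((math.log(k), math.log(remaining / n)))
        remaining -= counts[k]
    return points


points = log_log_cumulative([1, 1, 1, 1, 1, 1, 1, 1, 2, 4, 6])
for x, y in points:
    print(round(x, 3), round(y, 3))
```

For a pure power law the points fall on a straight line, which is what the comparison between the fitted laws in the plot relies on.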
Graph features: shortest path
[Figure: two example graphs on 11 nodes. Left: low average SP. Right: high average SP.]
A shortest path is a path between two vertices such that no shorter path exists.
The average shortest path is the mean of the shortest path lengths over all pairs of vertices.
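For an unweighted graph, shortest path lengths can be obtained by breadth-first search. The sketch below is one possible implementation, not necessarily the one used in the course; the function names and the adjacency-list dictionaries are assumptions, and only reachable pairs are averaged.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest path length (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def average_shortest_path(adj):
    """Mean shortest path length over all ordered pairs of distinct, connected nodes."""
    total, pairs = 0, 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += d
                pairs += 1
    return total / pairs


path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}      # chain 0-1-2-3
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}      # hub 0 with three leaves

print(average_shortest_path(path))
print(average_shortest_path(star))
```

The star has the shorter average path even though both graphs have three edges, because every pair of leaves is only two steps apart through the hub.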
Graph features: clustering
[Figure: two example graphs on 11 nodes. Left: C close to 0. Right: C close to 1.]
Clustering = measure of information transfer in the immediate neighbourhood of each node. Let $N_i$ denote the neighbourhood of node $i$.
We have: $d_i = |N_i|$.
$$C_i = \frac{2 \times \#\{\text{edges connecting nodes in } N_i\}}{d_i(d_i - 1)}.$$
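A direct translation of this formula is sketched below on a toy graph (a triangle with one pendant node). The function name and the dictionary-of-sets representation are assumptions, as is the convention of returning 0 when a node has fewer than two neighbours, where the formula is undefined.

```python
def local_clustering(adj, i):
    """C_i = 2 * (#edges among neighbours of i) / (d_i * (d_i - 1))."""
    neighbours = adj[i]
    d = len(neighbours)
    if d < 2:
        return 0.0  # assumed convention: the formula is undefined for d_i < 2
    # Count each unordered pair of neighbours (u, v) that is itself an edge.
    links = sum(1 for u in neighbours for v in neighbours
                if u < v and v in adj[u])
    return 2 * links / (d * (d - 1))


# Triangle 0-1-2 with a pendant node 3 attached to node 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}

for node in adj:
    print(node, local_clustering(adj, node))
```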
Assortativity
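$$r = \frac{|E|^{-1} \sum_{i=1}^{|E|} j_i k_i - \left[ |E|^{-1} \sum_{i=1}^{|E|} \frac{1}{2}(j_i + k_i) \right]^2}{|E|^{-1} \sum_{i=1}^{|E|} \frac{1}{2}(j_i^2 + k_i^2) - \left[ |E|^{-1} \sum_{i=1}^{|E|} \frac{1}{2}(j_i + k_i) \right]^2} \tag{1}$$
where $j_i$ and $k_i$ are the degrees of the vertices at the ends of the $i$th edge, with $i = 1, \ldots, |E|$.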
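Equation (1) can be evaluated term by term from an edge list. The following is a minimal sketch under that reading of the formula (sums running over edges); the function name is an assumption, and the coefficient is undefined when all vertices have the same degree, since the denominator is then zero.

```python
def assortativity(edges):
    """Degree assortativity r computed from equation (1), summing over edges."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    m = len(edges)
    # The three edge averages that appear in equation (1).
    mean_product = sum(deg[u] * deg[v] for u, v in edges) / m
    mean_half_sum = sum(0.5 * (deg[u] + deg[v]) for u, v in edges) / m
    mean_half_squares = sum(0.5 * (deg[u] ** 2 + deg[v] ** 2) for u, v in edges) / m
    return (mean_product - mean_half_sum ** 2) / (mean_half_squares - mean_half_sum ** 2)


star_edges = [(0, 1), (0, 2), (0, 3)]   # a hub linked only to leaves
path_edges = [(0, 1), (1, 2), (2, 3)]   # chain 0-1-2-3

print(assortativity(star_edges))
print(assortativity(path_edges))
```

The star is the extreme disassortative case: every edge joins the highest-degree vertex to a lowest-degree one.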
Interpretation of graph metrics
[Figure: four pairs of example graphs on 11 nodes, one pair per metric.
Sparsity: sparse vs. dense.
Shortest path: high Eglob vs. low Eglob.
Clustering: low Clust vs. high Clust.
Modularity: low modularity vs. high modularity.]
Regular graphs
Definition
A graph G is regular if each vertex has the same number of neighbours. A k-regular graph is a graph in which each vertex has k neighbours.
[Figure: two example graphs on 10 nodes. Left: lattice graph. Right: k-regular random graph.]
Regular lattice graphs
Definition
A graph G is called a lattice graph if it is defined on a grid.
[Figure: two example graphs on 10 nodes. Left: lattice graph. Right: Erdos-Renyi random graph.]
Lattice graphs
Can you guess what the degree distribution will be? The minimum path length? The clustering?
[Figure: two example graphs on 10 nodes. Left: lattice graph. Right: Erdos-Renyi random graph.]
$$L_{reg} \sim \frac{\mathrm{card}(V)}{2 \langle k \rangle}, \quad \text{and} \quad C_{reg} = \frac{3}{4}$$
where $\langle k \rangle = d(G)$ is the mean number of edges per vertex (as shown in the Watts and Strogatz model).
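These orders of magnitude can be checked on a ring lattice, the regular graph used as the starting point of the Watts and Strogatz model. The sketch below is an illustration under assumptions: the lattice is a ring where each node is linked to its k nearest neighbours (k even), and the function names are hypothetical. For such a ring the clustering is exactly 3(k - 2) / (4(k - 1)), which approaches 3/4 for large k.

```python
from collections import deque


def ring_lattice(n, k):
    """Ring of n nodes, each linked to its k nearest neighbours (k even, k < n)."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def mean_clustering(adj):
    """Average of the local clustering coefficients (all degrees assumed >= 2)."""
    total = 0.0
    for nb in adj.values():
        d = len(nb)
        links = sum(1 for u in nb for v in nb if u < v and v in adj[u])
        total += 2 * links / (d * (d - 1))
    return total / len(adj)


def mean_path_length(adj):
    """Average shortest path length over connected pairs, using BFS."""
    total, pairs = 0, 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


n, k = 100, 10
g = ring_lattice(n, k)
C = mean_clustering(g)
L = mean_path_length(g)
print("C measured:", C, " 3(k-2)/(4(k-1)):", 3 * (k - 2) / (4 * (k - 1)))
print("L measured:", L, " N/(2k):", n / (2 * k))
```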
Erdos-Renyi random graphs
Definition
$G_{N,M}$ model: a graph is chosen uniformly at random from the collection of all graphs which have N vertices and M edges. For example, in the $G_{3,2}$ model, each of the three possible graphs on three vertices and two edges is included with probability 1/3.
$G_{N,p}$ model: a graph is constructed by connecting nodes randomly. Each edge is included in the graph with probability p, independently of every other edge. Equivalently, each graph with N nodes and M edges has probability
$$p^M (1-p)^{L-M}$$
where $L = N(N-1)/2$.
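Both models are straightforward to sample. The sketch below is a simple illustration rather than an efficient generator; the function names `gnp` and `gnm` and the use of a seeded `random.Random` instance are assumptions.

```python
import random
from itertools import combinations


def gnp(n, p, rng):
    """G(N, p): keep each of the N(N-1)/2 possible edges independently with probability p."""
    return [pair for pair in combinations(range(n), 2) if rng.random() < p]


def gnm(n, m, rng):
    """G(N, M): choose M distinct edges uniformly among the N(N-1)/2 possible ones."""
    return rng.sample(list(combinations(range(n), 2)), m)


rng = random.Random(0)
print(len(gnp(10, 0.3, rng)), "edges drawn from G(10, 0.3)")
print(sorted(gnm(10, 7, rng)))

# The G(3, 2) example: there are exactly three graphs with 3 vertices and 2 edges.
possible_edges = list(combinations(range(3), 2))
print(len(list(combinations(possible_edges, 2))))
```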
Properties of Erdos-Renyi random graphs
Theorem
The distribution of the degree of any particular vertex is binomial:
$$P(d(v) = k) = \binom{N-1}{k} p^k (1-p)^{N-1-k}$$
The average degree per vertex is equal to $(N-1)p$.
$G_{N,p}$ should behave similarly to $G_{N,M}$ with $M = \binom{N}{2} p$ as N increases.
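The binomial formula and the mean degree can be checked numerically. In the sketch below the values N = 20 and p = 0.3 are arbitrary choices for illustration, and the function name is an assumption.

```python
from math import comb


def degree_pmf(n, p):
    """P(d(v) = k) for k = 0, ..., N-1 in the G(N, p) model."""
    return [comb(n - 1, k) * p ** k * (1 - p) ** (n - 1 - k) for k in range(n)]


n, p = 20, 0.3
pmf = degree_pmf(n, p)
mean_degree = sum(k * pk for k, pk in enumerate(pmf))

print("sum of probabilities:", sum(pmf))
print("mean degree:", mean_degree, " (N - 1) p:", (n - 1) * p)
print("matching number of edges M:", comb(n, 2) * p)
```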
Random graphs
Can you guess what the degree distribution will be? The minimum path length? The clustering?
[Figure: two example graphs on 10 nodes. Left: lattice graph. Right: Erdos-Renyi random graph.]
Assuming $\mathrm{card}(V) > \langle k \rangle > \ln(\mathrm{card}(V)) > 1$,
$$L_{rand} \sim \frac{\log(\mathrm{card}(V))}{\log(\langle k \rangle)}, \quad \text{and} \quad C_{rand} = \frac{\langle k \rangle}{N}$$
where $\langle k \rangle = d(G)$ is the mean number of edges per vertex, and $\mathrm{card}(V) = N$.
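A quick simulation gives a feel for these two expressions. This is a rough sketch under assumptions: N = 200 and p = 0.05 are arbitrary, the seed is fixed for reproducibility, nodes with fewer than two neighbours contribute a clustering of 0, and path lengths are averaged over connected pairs only. The expressions are asymptotic, so the simulated values are expected to be of the same order rather than equal.

```python
import math
import random
from collections import deque
from itertools import combinations


def gnp_adjacency(n, p, rng):
    """Sample a G(N, p) graph as a dictionary of neighbour sets."""
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if rng.random() < p:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def mean_clustering(adj):
    total = 0.0
    for nb in adj.values():
        d = len(nb)
        if d >= 2:  # nodes with d < 2 contribute 0 (assumed convention)
            links = sum(1 for u in nb for v in nb if u < v and v in adj[u])
            total += 2 * links / (d * (d - 1))
    return total / len(adj)


def mean_path_length(adj):
    total, pairs = 0, 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


n, p = 200, 0.05
g = gnp_adjacency(n, p, random.Random(1))
k_mean = sum(len(nb) for nb in g.values()) / n
C = mean_clustering(g)
L = mean_path_length(g)

print("C simulated:", C, " <k>/N:", k_mean / n)
print("L simulated:", L, " log(N)/log(<k>):", math.log(n) / math.log(k_mean))
```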
Chung-Lu Model
Given a degree sequence $d = (d_1, d_2, \ldots, d_N)$, let us write $2\omega = \sum_i d_i$. We suppose that $\max_i d_i^2 < 2\omega$. An instance is generated by connecting all pairs $(i, j)$ with probability:
$$p_{ij} = \frac{d_i d_j}{2\omega} \tag{2}$$
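Equation (2) translates into a short sampler. The sketch below is illustrative only: the function names and the example degree sequence are assumptions, and it draws only pairs with i < j, so self-loops are excluded. The condition on the maximum degree guarantees that every $p_{ij}$ is a valid probability, and summing $p_{ij}$ over all j (including j = i) gives back $d_i$, which is why the $d_i$ act as expected degrees.

```python
import random
from itertools import combinations


def chung_lu_probabilities(d):
    """Matrix of p_ij = d_i d_j / (2 omega), with 2 omega = sum_i d_i."""
    two_omega = sum(d)
    if max(d) ** 2 >= two_omega:
        raise ValueError("requires max_i d_i^2 < sum_i d_i")
    return [[di * dj / two_omega for dj in d] for di in d]


def chung_lu_sample(d, rng):
    """Connect each pair i < j independently with probability p_ij (no self-loops)."""
    p = chung_lu_probabilities(d)
    return [(i, j) for i, j in combinations(range(len(d)), 2)
            if rng.random() < p[i][j]]


d = [3, 3, 2, 2, 2, 2, 2, 2]          # 2 omega = 18 and max d_i^2 = 9 < 18
p = chung_lu_probabilities(d)
row_sums = [sum(row) for row in p]    # each row sum equals the target degree d_i

print(row_sums)
print(chung_lu_sample(d, random.Random(0)))
```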
Watts and Strogatz Model
[Figure: three example graphs, from left to right: regular, small-world, random.]
[Watts and Strogatz, Nature, 1998]
A simple one: the Watts and Strogatz model.
How to move from a regular graph to a random one by rewiring the edges?
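One way to sketch the rewiring procedure: start from a ring lattice and, with probability p, replace the far end of each lattice edge by a node chosen uniformly at random. The details below (ring lattice with k neighbours per node, no self-loops, no duplicate edges, function name) are assumptions about the construction, not a transcription of the original paper. With p = 0 the graph stays regular, and with p = 1 every edge is rewired.

```python
import random


def watts_strogatz(n, k, p, rng):
    """Ring lattice on n nodes (k neighbours each, k even, k < n) rewired with probability p."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    # Visit every lattice edge (i, i + step) once and rewire it with probability p.
    for step in range(1, k // 2 + 1):
        for i in range(n):
            j = (i + step) % n
            if j in adj[i] and rng.random() < p:
                # New endpoints must avoid self-loops and already existing edges.
                candidates = [v for v in range(n) if v != i and v not in adj[i]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[i].discard(j)
                    adj[j].discard(i)
                    adj[i].add(new)
                    adj[new].add(i)
    return adj


rng = random.Random(0)
for p in (0.0, 0.1, 1.0):
    g = watts_strogatz(20, 4, p, rng)
    n_edges = sum(len(nb) for nb in g.values()) // 2
    print(p, n_edges, sorted(len(nb) for nb in g.values()))
```

Rewiring moves edges without creating or deleting any, so the number of edges, and therefore the density, is the same for every value of p.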
WS Model: Metrics
[Watts and Strogatz, Nature, 1998]
Barabasi-Albert model
Definition
The network begins with an initial connected network of $m_0$ nodes. New nodes are added to the network one at a time. Each new node is connected to $m \le m_0$ existing nodes with a probability that is proportional to the number of links that the existing nodes already have. Formally, the probability $p_i$ that the new node is connected to node $i$ is
$$p_i = \frac{k_i}{\sum_j k_j},$$
where $k_i$ is the degree of node $i$ and the sum is made over all pre-existing nodes $j$ (i.e. the denominator is twice the current number of edges in the network). Heavily linked nodes ("hubs") tend to quickly accumulate even more links, while nodes with only a few links are unlikely to be chosen as the destination for a new link. The new nodes have a "preference" to attach themselves to the already heavily linked nodes.
Note: in this case, $P(k) \sim k^{-3}$.
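The growth rule can be sketched in a few lines. Several details below are assumptions made for the illustration: the initial network is a complete graph on $m_0$ nodes, the m targets of a new node are forced to be distinct by redrawing (which slightly perturbs the exact proportionality), and the function name is hypothetical. Drawing uniformly from the list of edge endpoints selects a node with probability proportional to its degree.

```python
import random


def barabasi_albert(n, m, m0, rng):
    """Grow a graph to n nodes; each new node attaches to m existing nodes."""
    if not (1 <= m <= m0 < n) or m0 < 2:
        raise ValueError("need 1 <= m <= m0 < n and m0 >= 2")
    # Assumed initial network: complete graph on m0 nodes.
    edges = [(i, j) for i in range(m0) for j in range(i + 1, m0)]
    # Each node appears in this list once per incident edge, so a uniform
    # draw from it picks node i with probability k_i / sum_j k_j.
    endpoints = [v for e in edges for v in e]
    for new in range(m0, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in targets:
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges


edges = barabasi_albert(200, 2, 3, random.Random(0))
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1

print("number of edges:", len(edges))
print("largest degrees:", sorted(degree.values(), reverse=True)[:5])
```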
Barabasi-Albert model
Theorem
Using preferential attachment, one can show that (asymptotically in t):
$$P(k) \sim k^{-3}$$
Note: this is shown by calculating the time dependence of the degree $k_i$ of vertex $i$.
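A common way to carry out this calculation is the mean-field (continuum) argument sketched below. It may differ from the derivation intended in the lecture. The symbol $t_i$ (the time at which vertex $i$ is added) is not in the slide and is introduced here as an assumption, together with the approximations $\sum_j k_j \approx 2mt$ and $t_i$ uniformly distributed over $[0, t]$ for large $t$.

```latex
\begin{align*}
\frac{\partial k_i}{\partial t}
  &= m \, \frac{k_i}{\sum_j k_j} \approx \frac{m k_i}{2 m t} = \frac{k_i}{2t},
  \qquad k_i(t_i) = m \\
\Rightarrow \quad k_i(t) &= m \left( \frac{t}{t_i} \right)^{1/2} \\
P\bigl(k_i(t) < k\bigr)
  &= P\!\left( t_i > \frac{m^2 t}{k^2} \right) \approx 1 - \frac{m^2}{k^2} \\
P(k) &= \frac{\partial}{\partial k} P\bigl(k_i(t) < k\bigr)
  \approx \frac{2 m^2}{k^3} \sim k^{-3}
\end{align*}
```

Under these assumptions the exponent 3 does not depend on m, which only enters through the prefactor.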
An example on real data
Comparison of the human brain functional network with other networks:
Erdos-Renyi random graphs: randomly chosen connections
Scale-free graphs: distribution of the degree = power law (e.g. WWW)
[Figure: three example networks, labelled Random, Scale-free and Brain.]