Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 8 05.03.2016 15 / 25 Heuristic approach Focus on...
Transcript of Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 8 05.03.2016 15 / 25 Heuristic approach Focus on...
![Page 1: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 8 05.03.2016 15 / 25 Heuristic approach Focus on edges that connect communities. Edge betweenness -number of shortest paths ˙ st(e)](https://reader036.fdocuments.in/reader036/viewer/2022070908/5f83d535d8b43d54666681bc/html5/thumbnails/1.jpg)
Network structure
Leonid E. Zhukov
School of Data Analysis and Artificial IntelligenceDepartment of Computer Science
National Research University Higher School of Economics
Network Science
Leonid E. Zhukov (HSE) Lecture 8 05.03.2016 1 / 25
![Page 2: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 8 05.03.2016 15 / 25 Heuristic approach Focus on edges that connect communities. Edge betweenness -number of shortest paths ˙ st(e)](https://reader036.fdocuments.in/reader036/viewer/2022070908/5f83d535d8b43d54666681bc/html5/thumbnails/2.jpg)
Network structure
Leonid E. Zhukov (HSE) Lecture 8 05.03.2016 2 / 25
![Page 3: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 8 05.03.2016 15 / 25 Heuristic approach Focus on edges that connect communities. Edge betweenness -number of shortest paths ˙ st(e)](https://reader036.fdocuments.in/reader036/viewer/2022070908/5f83d535d8b43d54666681bc/html5/thumbnails/3.jpg)
Typical network structure
Core-periphery structure of a network
image from J. Leskovec, K. Lang, 2010
Leonid E. Zhukov (HSE) Lecture 8 05.03.2016 3 / 25
![Page 4: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 8 05.03.2016 15 / 25 Heuristic approach Focus on edges that connect communities. Edge betweenness -number of shortest paths ˙ st(e)](https://reader036.fdocuments.in/reader036/viewer/2022070908/5f83d535d8b43d54666681bc/html5/thumbnails/4.jpg)
Graph cores
Definition
A k-core is the largest subgraph such that each vertex is connected to atleast k others in subset
Every vertex in k-core has a degree ki ≥ k(k + 1)-core is always subgraph of k-coreThe core number of a vertex is the highest order of a core that containsthis vertex
Leonid E. Zhukov (HSE) Lecture 8 05.03.2016 4 / 25
![Page 5: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 8 05.03.2016 15 / 25 Heuristic approach Focus on edges that connect communities. Edge betweenness -number of shortest paths ˙ st(e)](https://reader036.fdocuments.in/reader036/viewer/2022070908/5f83d535d8b43d54666681bc/html5/thumbnails/5.jpg)
k-core decomposition
V. Batageli, M. Zaversnik, 2002
If from a given graph G = (V, L) recursively delete all vertices, andlines incident with them, of degree less than k, the remaining graph isthe k-core.
Leonid E. Zhukov (HSE) Lecture 8 05.03.2016 5 / 25
![Page 6: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 8 05.03.2016 15 / 25 Heuristic approach Focus on edges that connect communities. Edge betweenness -number of shortest paths ˙ st(e)](https://reader036.fdocuments.in/reader036/viewer/2022070908/5f83d535d8b43d54666681bc/html5/thumbnails/6.jpg)
K-cores
Zachary karate club: 1,2,3,4 - cores
Leonid E. Zhukov (HSE) Lecture 8 05.03.2016 6 / 25
![Page 7: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 8 05.03.2016 15 / 25 Heuristic approach Focus on edges that connect communities. Edge betweenness -number of shortest paths ˙ st(e)](https://reader036.fdocuments.in/reader036/viewer/2022070908/5f83d535d8b43d54666681bc/html5/thumbnails/7.jpg)
k-cores
k-cores: 1:1458, 2:594, 3:142, 4:12, 5:6k-shells: 1:864-red, 2:452-pale green, 3:130-green, 5:6-blue, 6:6-purple
Leonid E. Zhukov (HSE) Lecture 8 05.03.2016 7 / 25
![Page 8: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 8 05.03.2016 15 / 25 Heuristic approach Focus on edges that connect communities. Edge betweenness -number of shortest paths ˙ st(e)](https://reader036.fdocuments.in/reader036/viewer/2022070908/5f83d535d8b43d54666681bc/html5/thumbnails/8.jpg)
Graph cliques
Definition
A clique is a complete (fully connected) subgraph, i.e. a set of verticeswhere each pair of vertices is connected.
Cliques can overlapLeonid E. Zhukov (HSE) Lecture 8 05.03.2016 8 / 25
![Page 9: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 8 05.03.2016 15 / 25 Heuristic approach Focus on edges that connect communities. Edge betweenness -number of shortest paths ˙ st(e)](https://reader036.fdocuments.in/reader036/viewer/2022070908/5f83d535d8b43d54666681bc/html5/thumbnails/9.jpg)
Graph cliques
A maximal clique is a clique that cannot be extended by includingone more adjacent vertex (not included in larger one)
A maximum clique is a clique of the largest possible size in a givengraph
Graph clique number is the size of the maximum clique
image from D. Eppstein
Leonid E. Zhukov (HSE) Lecture 8 05.03.2016 9 / 25
![Page 10: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 8 05.03.2016 15 / 25 Heuristic approach Focus on edges that connect communities. Edge betweenness -number of shortest paths ˙ st(e)](https://reader036.fdocuments.in/reader036/viewer/2022070908/5f83d535d8b43d54666681bc/html5/thumbnails/10.jpg)
Graph cliques
Maximum cliques
Maximal cliques:Clique size: 2 3 4 5Number of cliques: 11 21 2 2
Zachary, 1977
Leonid E. Zhukov (HSE) Lecture 8 05.03.2016 10 / 25
![Page 11: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 8 05.03.2016 15 / 25 Heuristic approach Focus on edges that connect communities. Edge betweenness -number of shortest paths ˙ st(e)](https://reader036.fdocuments.in/reader036/viewer/2022070908/5f83d535d8b43d54666681bc/html5/thumbnails/11.jpg)
Graph cliques
Computational issues:
Finding click of fixed given size k - O(nkk2)
Finding maximum clique O(3n/3)
But in sparse graphs...
Leonid E. Zhukov (HSE) Lecture 8 05.03.2016 11 / 25
![Page 12: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 8 05.03.2016 15 / 25 Heuristic approach Focus on edges that connect communities. Edge betweenness -number of shortest paths ˙ st(e)](https://reader036.fdocuments.in/reader036/viewer/2022070908/5f83d535d8b43d54666681bc/html5/thumbnails/12.jpg)
Network communities
Definition
Network communities are groups of vertices such that vertices inside thegroup connected with many more edges than between groups.
Community detection is an assignment of vertices to communities.
Will consider non-overlapping communities
Graph partitioning problem
Leonid E. Zhukov (HSE) Lecture 8 05.03.2016 12 / 25
![Page 13: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 8 05.03.2016 15 / 25 Heuristic approach Focus on edges that connect communities. Edge betweenness -number of shortest paths ˙ st(e)](https://reader036.fdocuments.in/reader036/viewer/2022070908/5f83d535d8b43d54666681bc/html5/thumbnails/13.jpg)
Network communities
What makes a community (cohesive subgroup):
Mutuality of ties. Almost everyone in the group has ties (edges) toone another
Compactness. Closeness or reachability of group members in smallnumber of steps, not necessarily adjacency
Density of edges. High frequency of ties within the group
Separation. Higher frequency of ties among group members comparedto non-members
Wasserman and Faust
Leonid E. Zhukov (HSE) Lecture 8 05.03.2016 13 / 25
![Page 14: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 8 05.03.2016 15 / 25 Heuristic approach Focus on edges that connect communities. Edge betweenness -number of shortest paths ˙ st(e)](https://reader036.fdocuments.in/reader036/viewer/2022070908/5f83d535d8b43d54666681bc/html5/thumbnails/14.jpg)
Community density
Graph G (V ,E ), n = |V |, m = |E |Community - set of nodes Sns -number of nodes in S , ms - number of edges in S
Graph density
ρ =m
n(n − 1)/2
community internal density
δint(C ) =ms
ns(ns − 1)/2
external edges density
δext(C ) =mext
nc(n − nc)
community (cluster): δint > ρ, δext < ρ
Leonid E. Zhukov (HSE) Lecture 8 05.03.2016 14 / 25
![Page 15: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 8 05.03.2016 15 / 25 Heuristic approach Focus on edges that connect communities. Edge betweenness -number of shortest paths ˙ st(e)](https://reader036.fdocuments.in/reader036/viewer/2022070908/5f83d535d8b43d54666681bc/html5/thumbnails/15.jpg)
Modularity
Compare fraction of edges within the cluster to expected fraction inrandom graph with identical degree sequence
Q =1
4(ms − E (ms))
Modularity score
Q =1
2m
∑ij
(Aij −
kikj2m
)δ(ci , cj),=
∑u
(euu − a2u)
euu - fraction of edges within community uau =
∑u euv - fraction of ends of edges attached to nodes in u
The higher the modularity score - the better are communities
Modularity score range Q ∈ [−1/2, 1), single community Q = 0
Leonid E. Zhukov (HSE) Lecture 8 05.03.2016 15 / 25
![Page 16: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 8 05.03.2016 15 / 25 Heuristic approach Focus on edges that connect communities. Edge betweenness -number of shortest paths ˙ st(e)](https://reader036.fdocuments.in/reader036/viewer/2022070908/5f83d535d8b43d54666681bc/html5/thumbnails/16.jpg)
Heuristic approach
Focus on edges that connect communities.Edge betweenness -number of shortest paths σst(e) going through edge e
CB(e) =∑s 6=t
σst(e)
σst
Construct communities by progressively removing edgesLeonid E. Zhukov (HSE) Lecture 8 05.03.2016 16 / 25
![Page 17: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 8 05.03.2016 15 / 25 Heuristic approach Focus on edges that connect communities. Edge betweenness -number of shortest paths ˙ st(e)](https://reader036.fdocuments.in/reader036/viewer/2022070908/5f83d535d8b43d54666681bc/html5/thumbnails/17.jpg)
Edge betweenness
Newman-Girvan, 2004
Algorithm: Edge Betweenness
Input: graph G(V,E)
Output: Dendrogram
repeatFor all e ∈ E compute edge betweenness CB(e);
remove edge ei with largest CB(ei ) ;
until edges left;
If bi-partition, then stop when graph splits in two components(check for connectedness)
Leonid E. Zhukov (HSE) Lecture 8 05.03.2016 17 / 25
![Page 18: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 8 05.03.2016 15 / 25 Heuristic approach Focus on edges that connect communities. Edge betweenness -number of shortest paths ˙ st(e)](https://reader036.fdocuments.in/reader036/viewer/2022070908/5f83d535d8b43d54666681bc/html5/thumbnails/18.jpg)
Edge betweenness
Hierarchical algorithm, dendrogram
Leonid E. Zhukov (HSE) Lecture 8 05.03.2016 18 / 25
![Page 19: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 8 05.03.2016 15 / 25 Heuristic approach Focus on edges that connect communities. Edge betweenness -number of shortest paths ˙ st(e)](https://reader036.fdocuments.in/reader036/viewer/2022070908/5f83d535d8b43d54666681bc/html5/thumbnails/19.jpg)
Edge betweenness
Zachary karate club
Leonid E. Zhukov (HSE) Lecture 8 05.03.2016 19 / 25
![Page 20: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 8 05.03.2016 15 / 25 Heuristic approach Focus on edges that connect communities. Edge betweenness -number of shortest paths ˙ st(e)](https://reader036.fdocuments.in/reader036/viewer/2022070908/5f83d535d8b43d54666681bc/html5/thumbnails/20.jpg)
Edge betweenness
Zachary karate club
Leonid E. Zhukov (HSE) Lecture 8 05.03.2016 20 / 25
![Page 21: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 8 05.03.2016 15 / 25 Heuristic approach Focus on edges that connect communities. Edge betweenness -number of shortest paths ˙ st(e)](https://reader036.fdocuments.in/reader036/viewer/2022070908/5f83d535d8b43d54666681bc/html5/thumbnails/21.jpg)
Edge betweenness
Zachary karate club
Leonid E. Zhukov (HSE) Lecture 8 05.03.2016 21 / 25
![Page 22: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 8 05.03.2016 15 / 25 Heuristic approach Focus on edges that connect communities. Edge betweenness -number of shortest paths ˙ st(e)](https://reader036.fdocuments.in/reader036/viewer/2022070908/5f83d535d8b43d54666681bc/html5/thumbnails/22.jpg)
Edge betweenness
Leonid E. Zhukov (HSE) Lecture 8 05.03.2016 22 / 25
![Page 23: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 8 05.03.2016 15 / 25 Heuristic approach Focus on edges that connect communities. Edge betweenness -number of shortest paths ˙ st(e)](https://reader036.fdocuments.in/reader036/viewer/2022070908/5f83d535d8b43d54666681bc/html5/thumbnails/23.jpg)
Edge betweenness
best: clusters = 6, modularity = 0.345
Leonid E. Zhukov (HSE) Lecture 8 05.03.2016 23 / 25
![Page 24: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 8 05.03.2016 15 / 25 Heuristic approach Focus on edges that connect communities. Edge betweenness -number of shortest paths ˙ st(e)](https://reader036.fdocuments.in/reader036/viewer/2022070908/5f83d535d8b43d54666681bc/html5/thumbnails/24.jpg)
Edge betweenness
Zachary karate club
Leonid E. Zhukov (HSE) Lecture 8 05.03.2016 24 / 25
![Page 25: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 8 05.03.2016 15 / 25 Heuristic approach Focus on edges that connect communities. Edge betweenness -number of shortest paths ˙ st(e)](https://reader036.fdocuments.in/reader036/viewer/2022070908/5f83d535d8b43d54666681bc/html5/thumbnails/25.jpg)
References
Finding and evaluating community structure in networks, M.E.J.Newman, M. Girvan, Phys. Rev E, 69, 2004
Modularity and community structure in networks, M.E.J. Newman,PNAS, vol 103, no 26, pp 8577-8582, 2006
S. E. Schaeffer. Graph clustering. Computer Science Review,1(1):2764, 2007.
S. Fortunato. Community detection in graphs, Physics Reports, Vol.486, Iss. 35, pp 75-174, 2010
Leonid E. Zhukov (HSE) Lecture 8 05.03.2016 25 / 25