Thrasyvoulos Spyropoulos / [email protected] Eurecom, Sophia-Antipolis
Random Graph Models: Create/Explain Complex Network Properties
Random Graph Models: Why do we Need Them?
- The networks discussed are quite large! Impossible to describe or visualize explicitly.
- Consider this example:
  - You have a new Internet routing algorithm.
  - You want to evaluate it, but do not have a trace of the Internet topology.
  - You decide to create an "Internet-like" graph on which you will run your algorithm.
  - How do you describe/create this graph?
- Random graphs: local and probabilistic rules by which vertices are connected.
- Goal: from simple probabilistic rules to observed complexity.
- Q: Which rules give us (most of) the observed properties?
Emergent Complexity in Cellular Automata
- This is "Conway's game of life" (there are many other automata).
  - http://www.youtube.com/watch?v=ma7dwLIEiYU&feature=related (demo)
  - http://www.bitstorm.org/gameoflife/ (try your own)
- Local rules:
  - Each cell is either white or blue.
  - Each cell interacts with its 8 neighbors.
  - Time is discrete (rounds).
  1. Any blue cell with fewer than two blue neighbors becomes white.
  2. Any blue cell with two or three blue neighbors lives on to the next round.
  3. Any blue cell with more than three blue neighbors becomes white.
  4. Any white cell with exactly three blue neighbors becomes blue.
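The four local rules fit in a few lines of code. The following is a minimal sketch, assuming an unbounded grid and representing the board as a set of blue cells; the function name `life_step` is hypothetical and not part of the slides.

```python
def life_step(blue):
    """Return the set of blue cells after one round.

    `blue` is a set of (row, col) pairs; every other cell is white.
    """
    # Count blue neighbours for every cell adjacent to at least one blue cell.
    counts = {}
    for (r, c) in blue:
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if (dr, dc) != (0, 0):
                    cell = (r + dr, c + dc)
                    counts[cell] = counts.get(cell, 0) + 1
    # A cell is blue in the next round if it has exactly three blue
    # neighbours (rules 2 and 4), or if it is blue now and has exactly
    # two (rule 2). Every other cell is white (rules 1 and 3).
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in blue)}
```

Applying `life_step` repeatedly to a row of three blue cells makes it flip between a horizontal and a vertical bar, a simple case of structure emerging from purely local rules.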
Back to Networks: (Erdős-Rényi) Random Graphs
- A very (very!) simple local rule: (any) two vertices are connected with probability p.
- Only inputs: number of vertices n and probability p.
- Denote this class of graphs as G(n,p).
- Erdős-Rényi model (1960): connect each pair with probability p.
- Example: p = 1/6, N = 10, average degree <k> ~ 1.5.
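The G(n,p) rule translates directly into code: visit every pair of vertices once and keep it with probability p. A minimal sketch, with hypothetical helper names `gnp` and `average_degree` and an optional `seed` added only for reproducibility:

```python
import random
from itertools import combinations

def gnp(n, p, seed=None):
    """Sample one realization of G(n,p) as a list of edges (u, v) with u < v."""
    rng = random.Random(seed)
    # Each of the n(n-1)/2 vertex pairs is linked independently with probability p.
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]

def average_degree(n, edges):
    """Each edge contributes to the degree of two vertices."""
    return 2 * len(edges) / n
```

For n = 10 and p = 1/6, repeated calls give different graphs whose average degree fluctuates around p(n-1) = 1.5.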
How Many Networks in G(n,p)?
- N and p do not uniquely define the network: we can have many different realizations of it. How many?
- Number of different graphs on N nodes: $2^{N(N-1)/2}$
- G(N,L): a graph with N nodes and L links.
- The probability to form a particular graph G(N,L) is $P(G(N,L)) = p^{L} (1-p)^{\frac{N(N-1)}{2} - L}$
- That is, each graph G(N,L) appears with probability P(G(N,L)).
- Example: G(10, 1/6), with N = 10 and p = 1/6.
Relation of G(N,p) to G(N,L)
- P(L): the probability to have exactly L links in a network of N nodes and probability p:
  $P(L) = \binom{\binom{N}{2}}{L}\, p^{L} (1-p)^{\frac{N(N-1)}{2} - L}$
- $\binom{N}{2} = \frac{N(N-1)}{2}$: the maximum number of links in a network of N nodes.
- $\binom{\binom{N}{2}}{L}$: the number of different ways we can choose L links among all potential links.
- This is a binomial distribution.
G(N,p) statistics
- P(L): the probability to have a network of exactly L links:
  $P(L) = \binom{\binom{N}{2}}{L}\, p^{L} (1-p)^{\frac{N(N-1)}{2} - L}$
- The average number of links <L> in a random graph:
  $\langle L \rangle = \sum_{L=0}^{N(N-1)/2} L\, P(L) = p\, \frac{N(N-1)}{2}$
- The variance (the standard deviation is its square root):
  $\sigma^{2} = p(1-p)\, \frac{N(N-1)}{2}$
- Average node degree <k>:
  $\langle k \rangle = \frac{2 \langle L \rangle}{N} = p(N-1)$
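These closed forms can be checked numerically for the running example N = 10, p = 1/6. The sketch below builds the binomial distribution P(L) explicitly and computes its moments; the helper name `link_distribution` is hypothetical.

```python
from math import comb

def link_distribution(N, p):
    """P(L) for L = 0..M, where M = N(N-1)/2 is the maximum number of links."""
    M = N * (N - 1) // 2
    return [comb(M, L) * p**L * (1 - p)**(M - L) for L in range(M + 1)]

N, p = 10, 1 / 6
P = link_distribution(N, p)

# Moments computed directly from the distribution.
mean_L = sum(L * P_L for L, P_L in enumerate(P))
var_L = sum((L - mean_L) ** 2 * P_L for L, P_L in enumerate(P))
mean_k = 2 * mean_L / N

print(mean_L, var_L, mean_k)
```

Up to floating-point rounding, the printed values agree with p N(N-1)/2 = 7.5, p(1-p) N(N-1)/2 = 6.25 and p(N-1) = 1.5.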
G(N,p) as N → ∞
- As the network size increases, the distribution becomes increasingly narrow, which means that we are increasingly confident that the number of links the graph has is in the vicinity of <L>.
  $\frac{\sigma_L}{\langle L \rangle} = \left[ \frac{1-p}{p} \cdot \frac{1}{N(N-1)/2} \right]^{1/2} \approx O\!\left(\frac{1}{N}\right)$
Random Graphs: Degree Distribution
- The degree distribution is binomial:
  $p(k) = B(k; N, p) = \binom{N-1}{k}\, p^{k} (1-p)^{(N-1)-k}$
  - average degree is <k> = p(N-1)
  - variance σ² = p(1-p)(N-1)
  - relative width: $\frac{\sigma_k}{\langle k \rangle} = \left[ \frac{1-p}{p} \cdot \frac{1}{N-1} \right]^{1/2} \approx \frac{1}{(N-1)^{1/2}}$
- Assuming z = Np is fixed, as N → ∞, B(N,k,p) is approximated by a Poisson distribution:
  $p(k) = P(k; z) = e^{-z}\, \frac{z^{k}}{k!}$
- As N → ∞:
  - Highly concentrated around the mean.
  - Probability of very high node degrees is exponentially small.
  - Very different from a power law!
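How quickly the binomial degree distribution approaches the Poisson one can be seen numerically. A minimal sketch, keeping z = Np fixed as on the slide; the function names and the cutoff `k_max` are hypothetical choices for illustration.

```python
from math import comb, exp, factorial

def binomial_degree_pmf(k, N, p):
    """Probability that a node has degree k in G(N,p): B(k; N-1, p)."""
    return comb(N - 1, k) * p**k * (1 - p)**(N - 1 - k)

def poisson_degree_pmf(k, z):
    """Poisson approximation with mean z."""
    return exp(-z) * z**k / factorial(k)

def max_gap(N, z, k_max=30):
    """Largest pointwise difference between the two distributions for k <= k_max."""
    p = z / N  # keep z = Np fixed while N grows
    return max(abs(binomial_degree_pmf(k, N, p) - poisson_degree_pmf(k, z))
               for k in range(k_max + 1))

for N in (20, 100, 10000):
    print(N, max_gap(N, z=4))
```

The printed gap shrinks as N grows, which is the sense in which sparse G(N,p) graphs are called Poisson graphs.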
Are Erdős-Rényi (Poisson) Graphs Small-World?
- The secret behind the small world effect: looking at the network volume.
- Number of nodes at distance exactly d (e.g. on a 2D grid): $S(d) = 4d$
The Volume of Geometric Graphs
- The secret behind the small world effect: looking at the network volume.
- $N(d) = 4 \sum_{x=1}^{d} x = 2d(d+1) \sim d^{2}$
- Polynomial growth.
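The Exploding Volume of Random Graphs
- The secret behind the small world effect: looking at the network volume.
- Geometric graph: $N(d) = 4 \sum_{x=1}^{d} x = 2d(d+1) \sim d^{2}$ (polynomial growth).
- Random graph: $N(d) = \sum_{x=0}^{d} \langle k \rangle^{x} = \frac{\langle k \rangle^{d+1} - 1}{\langle k \rangle - 1} \sim \langle k \rangle^{d}$ (exponential growth).
The Exploding Volume of Random Graphs (2)
- $\langle k \rangle^{d} \approx N \;\Rightarrow\; d \approx \frac{\ln N}{\ln \langle k \rangle} = \log_{\langle k \rangle} N$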
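The contrast between polynomial and exponential volume growth can be made concrete. The sketch below assumes the geometric graph is an infinite 2D square lattice (an assumption; the transcript does not name the lattice) and idealises the random graph as a tree in which every node opens <k> new nodes per hop. All function names are hypothetical.

```python
from math import log

def lattice_volume(d):
    """Nodes within d hops of the origin on a 2D square lattice (origin excluded)."""
    return sum(1
               for x in range(-d, d + 1)
               for y in range(-d, d + 1)
               if 0 < abs(x) + abs(y) <= d)

def tree_volume(d, k):
    """Idealised random-graph volume: 1 + k + k^2 + ... + k^d."""
    return sum(k**x for x in range(d + 1))

def predicted_distance(N, k):
    """Hops needed until the exponentially growing volume reaches N nodes."""
    return log(N) / log(k)

for d in (1, 2, 5, 10):
    print(d, lattice_volume(d), tree_volume(d, 4))
```

With <k> = 10, `predicted_distance(10**6, 10)` is about 6 hops, whereas a lattice of the same size needs on the order of a thousand hops to cross.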
Distance in Random Graphs: Compare with Real Data
- Prediction: $l_{max} = \frac{\log N}{\log \langle k \rangle}$
- Given the huge differences in scope, size, and average degree, the agreement is excellent!
Random Graphs: Clustering Coefficient
- Consider a random graph G(n,p).
- Q: What is the probability that two of your neighbors are also neighbors?
- A: It is equal to p, independent of local structure.
  - clustering coefficient C = p
  - when z is fixed (sparse networks): C = z/n = O(1/n)
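The claim C = p can be checked on a sampled graph by counting how many pairs of neighbors are themselves linked. A minimal sketch, using hypothetical helper names and a seeded generator for reproducibility:

```python
import random
from itertools import combinations

def gnp_adjacency(n, p, seed=None):
    """Sample G(n,p) as a dict mapping each node to its set of neighbors."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def clustering(adj):
    """Fraction of neighbor pairs (a, b) of a common node that are linked."""
    closed = total = 0
    for nbrs in adj.values():
        for a, b in combinations(nbrs, 2):
            total += 1
            closed += b in adj[a]
    return closed / total if total else 0.0

print(clustering(gnp_adjacency(200, 0.1, seed=0)))
```

The measured value lands close to p = 0.1 and does not depend on which node's neighbors are inspected, since every pair is linked independently.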
Clustering in Random Graphs: Compare with Real Data
- Across networks of very different scope, size, and average degree, there is a clear disagreement between the random-graph prediction and the measured clustering.
Summary: Are Real Networks Random Graphs?
- √ Erdős-Rényi graphs are "small world":
  - path lengths are O(log n)
- X Erdős-Rényi graphs are not "scale-free":
  - Degree distribution is binomial and highly concentrated (no power law).
  - Exponentially small probability to have "hubs" (no heavy tail).
- X Erdős-Rényi graphs are not "clustered":
  - C → 0 as N becomes larger.
- Conclusion: ER random graphs are not a good model of real networks, BUT they still provide a great deal of insight!
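Clustering inhibits the small-worldness
- Some of your neighbors' neighbors are also your own neighbors.
- Without such overlap: $S(1) = \langle k \rangle$, $S(2) = \langle k \rangle^{2}$, ..., $S(d) = \langle k \rangle^{d}$
- Exponential growth: $d \approx \frac{\ln N}{\ln \langle k \rangle}$
Poisson Graph Diameter: Growth is slightly slower
- $S(0) = 1$, $S(1) = \langle k \rangle$
- Each shell can only grow into nodes that have not been reached yet:
  $S(d) \approx p\, S(d-1) \left[ N - 1 - S(1) - S(2) - \dots - S(d-1) \right]$
- For example, $S(2) \approx p\, \langle k \rangle \left( N - 1 - \langle k \rangle \right) < \langle k \rangle^{2}$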
Small World Graphs: Watts-Strogatz Model
- Short paths must be combined with a high clustering coefficient.
- Watts and Strogatz model [WS98]:
  - Start with a ring, where every node is connected to the next k nodes.
  - Rewire each edge to a random node with probability p (or, add a shortcut).
- From order to randomness: p = 0 (order), 0 < p < 1, p = 1 (randomness).
Small World Graphs (2)
- The Watts-Strogatz model: it takes a lot of randomness to ruin the clustering, but a very small amount to overcome locality.
- Figure: clustering coefficient C and characteristic path length L versus p (log scale in p).
- When p = 0: C = 3(k-2)/(4(k-1)) ~ ¾, L = n/k
- For small p: C ~ ¾, L ~ log n
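The construction and the p = 0 clustering formula can be sketched as follows. This is an illustrative version, not necessarily the exact procedure of [WS98]: it assumes k is the (even) total degree, with k/2 ring neighbors on each side, so that C = 3(k-2)/(4(k-1)) applies, and it rewires only one end of each edge. Function names are hypothetical.

```python
import random
from itertools import combinations

def ring_lattice(n, k):
    """Ring of n nodes, each linked to its k/2 nearest neighbors on either side."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj

def watts_strogatz(n, k, p, seed=None):
    """Rewire the far end of each ring edge with probability p."""
    rng = random.Random(seed)
    adj = ring_lattice(n, k)
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            if rng.random() < p:
                # New endpoint: any node that is not v and not already linked to v.
                candidates = [x for x in range(n) if x != v and x not in adj[v]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[v].discard(w)
                    adj[w].discard(v)
                    adj[v].add(new)
                    adj[new].add(v)
    return adj

def average_clustering(adj):
    """Mean over nodes of (links among neighbors) / (possible links among neighbors)."""
    total = 0.0
    for nbrs in adj.values():
        d = len(nbrs)
        if d >= 2:
            links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
            total += links / (d * (d - 1) / 2)
    return total / len(adj)
```

For k = 4 the formula gives C = 0.5 at p = 0, which `average_clustering(ring_lattice(20, 4))` reproduces; rewiring keeps the number of edges fixed.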
Online Social Networks
- Nodes: online users. Links: email contact, tweet, or friendship.
- All distributions show a fat-tail behavior: there are orders of magnitude spread in the degrees.
- Source: Alan Mislove, Measurement and Analysis of Online Social Networks
World Wide Web
Scale-free Graphs: What About Power Laws?
- The configuration model:
  - input: the degree sequence [d1,d2,…,dn]
  - process:
    - Create di copies of node i.
    - Take a random matching (pairing) of the copies.
    - Self-loops and multiple edges are allowed.
  - Figure: example with degrees 4, 1, 3, 2.
- But: Too artificial!
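The copy-and-match process is short to write down. A minimal sketch, with the hypothetical function name `configuration_model`; the requirement that the degrees sum to an even number is added here because every edge uses up two copies.

```python
import random

def configuration_model(degrees, seed=None):
    """Random multigraph with the given degree sequence, as a list of edges."""
    if sum(degrees) % 2:
        raise ValueError("the degree sequence must have an even sum")
    rng = random.Random(seed)
    # Create d_i copies ("stubs") of node i.
    stubs = [node for node, d in enumerate(degrees) for _ in range(d)]
    # A uniformly random matching: shuffle, then pair consecutive stubs.
    rng.shuffle(stubs)
    return list(zip(stubs[0::2], stubs[1::2]))

print(configuration_model([4, 1, 3, 2], seed=7))
```

The output may contain pairs such as (0, 0) (a self-loop) or the same pair twice (a multiple edge), as the model allows.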
One Explanation of Scale-Free(ness): Growth
- ER, WS models: the number of nodes, N, is fixed (static models).
- Networks continuously expand by the addition of new nodes.
- Barabási & Albert, Science 286, 509 (1999)
Growth Models
- (1) Networks continuously expand by the addition of new nodes.
- Add a new node with m links.
- Barabási & Albert, Science 286, 509 (1999)
Growth Models: Preferential Attachment
- Q: Where will the new node link to? ER, WS models: choose randomly.
- A: New nodes prefer to link to highly connected nodes.
- PREFERENTIAL ATTACHMENT: the probability that a node connects to a node with k links is proportional to k:
  $\Pi(k_i) = \frac{k_i}{\sum_j k_j}$
- Barabási & Albert, Science 286, 509 (1999)
Preferential Attachment in Networks
- "The rich get richer"
- First considered by [Price 65] as a model for citation networks:
  - each new paper is generated with m citations (on average)
  - new papers cite previous papers with probability proportional to their indegree (citations)
  - what about papers without any citations?
    - each paper is considered to have a "default" citation
    - probability of citing a paper with degree k is proportional to k+1
- Power law with exponent α = 2+1/m
Barabási-Albert model
- The BA model (undirected graph):
  - input: some initial subgraph G0, and m, the number of edges per new node
  - the process:
    - nodes arrive one at a time
    - each node connects to m other nodes, selecting them with probability proportional to their degree
    - if [d1,…,dt] is the degree sequence at time t, the node t+1 links to node i with probability
      $\frac{d_i}{\sum_j d_j} = \frac{d_i}{2mt}$
- Results in a power law with exponent α = 3
- Various problems: cannot account for every power law observed (Web), correlates age with degree, etc.
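A common way to sample "proportional to degree" is to keep a list in which every node appears once per unit of degree and to draw uniformly from it. The sketch below uses that trick. It assumes G0 is a complete graph on m + 1 nodes and forces the m targets of a new node to be distinct by redrawing, which deviates slightly from exact degree-proportional sampling; both choices and the function name are assumptions, not taken from the slides.

```python
import random

def barabasi_albert(n, m, seed=None):
    """Grow a graph to n nodes; each new node links to m distinct existing nodes."""
    rng = random.Random(seed)
    # G0 (assumption): a complete graph on m + 1 nodes.
    edges = [(u, v) for u in range(m + 1) for v in range(u)]
    # Node i appears d_i times in this list, so a uniform draw from it
    # picks node i with probability d_i / sum_j d_j.
    endpoints = [x for e in edges for x in e]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(endpoints))
        for old in chosen:
            edges.append((new, old))
            endpoints.extend((new, old))
    return edges

edges = barabasi_albert(200, 3, seed=0)
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1
print(max(degree.values()), min(degree.values()))
```

Early nodes tend to end up with the largest degrees, which illustrates the age-degree correlation listed among the model's problems.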
Comparison with Real Networks (B-A Model)
- Path length: $\frac{\ln N}{\ln(\ln N)}$
- Clustering coefficient: larger than ER, but still goes to 0 as N → ∞
Network Resilience or How to Break a Network
Phase Transitions in Random Graphs
- We saw that increasing p → denser networks.
- In the large N case we increase z = Np, the average degree.
- But what really happens as p (or z) increases?
- A random network on 50 nodes: p = 0.01 → disconnected, largest component = 3.
Phase Transitions in Random Graphs (2)
- p = 0.03 → large component appears.
- But almost 40% of nodes are still disconnected.
Phase Transitions in Random Graphs (3)
- p = 0.05 → "giant" component emerges.
- Only 3 nodes disconnected.
- Giant component → the graph "percolates".
Phase Transitions in Random Graphs (4)
- p = 0.10 → all nodes connected.
Connectivity ("Percolation") of Random Graphs
- S: the fraction of nodes in the giant component, S = N_GC/N
- there is a phase transition at <k> = 1:
  - for <k> < 1 there is no giant component
  - for <k> > 1 there is a giant component
  - for large <k> the giant component contains all nodes (S = 1)
- Figure: S versus <k> (http://linbaba.files.wordpress.com/2010/10/erdos-renyi.png)
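The transition at <k> = 1 shows up clearly in simulation. The sketch below samples G(n,p) with p = <k>/(n-1) and tracks connected components with a union-find structure; the function name `giant_fraction` and the choice n = 1000 are assumptions made for illustration.

```python
import random
from itertools import combinations

def giant_fraction(n, mean_degree, seed=None):
    """Fraction S of nodes in the largest component of one G(n,p) sample."""
    rng = random.Random(seed)
    p = mean_degree / (n - 1)
    parent = list(range(n))

    def find(x):
        # Follow parent pointers to the component root, halving the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            parent[find(u)] = find(v)  # merge the two components

    sizes = {}
    for v in range(n):
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values()) / n

for k in (0.5, 1.0, 1.5, 3.0):
    print(k, giant_fraction(1000, k, seed=1))
```

Below <k> = 1 the largest component holds only a small fraction of the nodes; above it, S grows quickly and approaches 1 for large <k>.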
Network Resilience
- Q1: How does the degree distribution affect resilience?
- Q2: How does the removal strategy affect resilience?
- Def: the network is still "functional" as long as there is a "giant component".
- Def: a "giant component" S contains a finite percentage of all nodes n as n → ∞: S = c·n (or Θ(n)).
Uniform removal of vertices
- Def: φ = probability that a vertex has not been removed, i.e. the percentage of vertices present.
- Figure panels: φ = 1, φ = 0.7, φ = 0.3, φ = 0.
Uniform removal of vertices
- Assume degree distribution p_k: the probability of a uniformly chosen node having k neighbors.
- Step 1: pick a random vertex i.
- Step 2: i is not in the giant cluster ⇔ none of its neighbors is in the giant cluster.
- Def: u = average probability that a neighbor j does not connect i to the giant component.
- Result: if i has degree k → Prob(i not in S) = u^k.
Graphical Solution for u
- 2 solutions for u → giant component S exists.
- u = 1: gives S = 0.
- Threshold: φc is given by the point at which the curve $1 - \varphi + \varphi\, g_1(u)$ is tangent to the line y = u at u = 1:
  $\left[ \frac{d}{du} \left( 1 - \varphi + \varphi\, g_1(u) \right) \right]_{u=1} = 1$
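Critical Threshold for Percolation φc
- Threshold condition: $\left[ \frac{d}{du} \left( 1 - \varphi + \varphi\, g_1(u) \right) \right]_{u=1} = 1$
- $\varphi_c = \frac{\langle k \rangle}{\langle k^{2} \rangle - \langle k \rangle}$
- The critical threshold depends on the mean degree (<k>) and the degree variability (<k²>).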
Critical Threshold for Poisson Graphs
- Degree distribution: $p_k = e^{-c}\, \frac{c^{k}}{k!}$
- Average degree <k> = c, <k²> = c(c+1)
- Critical threshold: $\varphi_c = \frac{1}{c}$
- Example: for average degree c = 4 → φc = 0.25
- 75% of the vertices must be removed to "kill" the network.
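The threshold φc = <k> / (<k²> - <k>) can be evaluated for any degree distribution given as a list of probabilities. A minimal sketch with hypothetical function names; the Poisson distribution is truncated at `k_max`, which is an approximation whose error is negligible for small c.

```python
from math import exp, factorial

def critical_threshold(pk):
    """phi_c = <k> / (<k^2> - <k>) for a degree distribution pk[k] = P(degree = k)."""
    mean_k = sum(k * p for k, p in enumerate(pk))
    mean_k2 = sum(k * k * p for k, p in enumerate(pk))
    return mean_k / (mean_k2 - mean_k)

def poisson_pmf(c, k_max=100):
    """Poisson degree distribution with mean c, truncated at k_max."""
    return [exp(-c) * c**k / factorial(k) for k in range(k_max + 1)]

print(critical_threshold(poisson_pmf(4)))
```

For c = 4 the result agrees with 1/c = 0.25 up to rounding. Feeding in a distribution with a heavier tail increases <k²> and pushes the threshold toward 0.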
Critical Threshold for Power-Law Graphs
- Degree distribution: $p_k = c\, k^{-\alpha}$
- Fact: most networks exhibit a power-law degree distribution with α in the range (2,3).
- Q: What is <k> and <k²> for these networks? A: <k> is finite but <k²> is infinite!
- Q: What is the critical threshold? A: It is 0!! → power-law graphs can "survive" any number of failures.
Size of giant cluster
- Figure panels: exponential degree distribution; power-law degree distribution.
Scale-free networks: resilient to random attack
- gnutella (P2P) network, 20% of nodes removed
- 574 nodes in giant component → 427 nodes in giant component
Applications
- Network attacks (good news):
  - Real networks (social networks, internet, P2P networks) are extremely robust to uniform removal attacks.
  - The higher the variance of the degree distribution, the better.
- Malware infections and immunization (bad news):
  - Epidemic occurs if a majority of nodes gets infected.
  - Stop virus/worm/etc. from spreading → vaccinate/fix a number of nodes. Goal: disconnect the "contact" graph.
  - Need to immunize a large majority of nodes to avoid spread.
Non-uniform Removal of Nodes
- A more efficient attack: remove the highest degree nodes.
- Figure panels: exponential degree distribution; power-law degree distribution.
Targeted attacks are effective against scale-free nets
- gnutella network, 22 most connected nodes removed (2.8% of the nodes)
- 574 nodes in giant component → 301 nodes in giant component
Betweenness Centrality: Definition
- $C_B(i) = \sum_{j,k} \frac{g_{jk}(i)}{g_{jk}}$
- where $g_{jk}$ = the number of shortest paths connecting j and k, and $g_{jk}(i)$ = the number of those paths that node i is on.
- In words: betweenness of vertex i = (paths between j and k that pass through i) / (all paths between j and k), summed over pairs j, k.
- Usually normalized by: $C'_B(i) = C_B(i) / n^{2}$
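Betweenness can be computed without enumerating paths by running one breadth-first search per source and accumulating path fractions backwards (Brandes' algorithm). The sketch below is for unweighted, undirected graphs and assumes the sum runs over unordered pairs j, k that do not include i; that convention, the missing normalization, and the function name are choices made here, not taken from the slides.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness for a graph given as {node: iterable of neighbors}."""
    cb = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, passing on each node's share
        # of the shortest paths that run through its predecessors.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                cb[w] += delta[w]
    # Every unordered pair was counted once from each endpoint.
    return {v: c / 2 for v, c in cb.items()}

path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(betweenness(path))
```

On the three-node path a-b-c, only b lies between another pair, so it is the only node with nonzero betweenness.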
Betweenness on toy networks
- Figure label: bridge
Betweenness vs. Degree Centrality
- Nodes are sized by degree, and colored by betweenness.
- Can you spot nodes with high betweenness but relatively low degree?
- What about high degree but relatively low betweenness?
Why is Betweenness Centrality Important? Connectivity
- a) Remove random node
- b) Remove high degree node
- c) Remove high betweenness node
Why is Betweenness Centrality Important?
- The network below is a wireless network (e.g. sensor network).
- Nodes run on battery → total energy Emax.
- Each node picks a destination randomly and sends data at a constant rate.
- Every packet going through a node spends E of its energy.
- Q: How long would it take until the first node runs out of battery?
- Figure labels: S1, D1, D2, S2.
How About in This Network?
Why is Betweenness Centrality Important? Monitoring
- a) Where would you place a traffic monitor in order to track the maximum number of packets (if this was your university network)?
- b) Where would you place traffic cameras if that was a street network?
Why is Betweenness Centrality Important? Traffic Flow
- Each link has capacity 1.
- Q: What is the maximum throughput between S and D?
- A: Max-Flow Min-Cut theorem → max flow is equal to the min number of links that must be removed to disconnect S-D.
- S-D throughput = 1
- Figure labels: S, D.
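With unit link capacities, the maximum S-D throughput can be found by repeatedly searching for an augmenting path and pushing one unit of flow along it. The sketch below uses breadth-first search for the augmenting paths (in the style of Edmonds-Karp) and models each undirected link as two opposite arcs of capacity 1; the function name and the example topology are hypothetical, since the slide's figure is not available in the transcript.

```python
from collections import deque

def max_flow_unit(edges, s, t):
    """Maximum s-t flow when every undirected link in `edges` has capacity 1."""
    cap = {}
    adj = {}
    for u, v in edges:
        for a, b in ((u, v), (v, u)):
            cap[(a, b)] = cap.get((a, b), 0) + 1
            adj.setdefault(a, set()).add(b)
    flow = 0
    while True:
        # BFS for an augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        # Residual capacities are integers >= 1 along the path: push one unit.
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1

# Two well-connected halves joined by the single link m-n.
bridge = [("S", "a"), ("S", "b"), ("a", "m"), ("b", "m"), ("m", "n"),
          ("n", "x"), ("n", "y"), ("x", "D"), ("y", "D")]
print(max_flow_unit(bridge, "S", "D"))
```

In this example every S-D path crosses the link m-n, so removing that one link disconnects S from D and the throughput is limited to 1. The endpoints of such a bottleneck link are also the nodes with high betweenness.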