Eurecom, Sophia-Antipolis
Thrasyvoulos Spyropoulos / spyropoul@eurecom.fr

Part II: Complex Networks

Empirical Properties and Metrics

Textbooks

“Networks, Crowds, and Markets: Reasoning About a Highly Connected World” by D. Easley and J. Kleinberg (“NCM”: publicly available online)

“Networks: An Introduction” by M. Newman (“Networks”: shared copies in library)

“Networked Life: 20 Questions and Answers” by M. Chiang (some chapters; shared copies in library)

What is a Network?

A set of “nodes”: humans, routers, web pages, telephone switches, airports, proteins, scientific articles …

Relations between these nodes:
humans: friendship/relation or online friendship
routers, switches: connected by a communication link
web pages: hyperlinks from one to another
airports: direct flights between them
articles: one citing the other
proteins: link if chemically interacting

A network is often represented as a graph: vertex = node, link = relation (weight = strength)

Social Networks (of the past)

The social network of friendships within a 34-person karate club provides clues to the fault lines that eventually split the club apart (Zachary, 1977)

Social Networks (of the past)

High school dating

Peter S. Bearman, James Moody and Katherine Stovel, “Chains of affection: The structure of adolescent romantic and sexual networks”, American Journal of Sociology 110, 44-91 (2004). Image drawn by Mark Newman.

Network Research of the Past

Mostly done by social scientists
Interested in human (social) networks
Spread of diseases, influence, etc.

Methodology: questionnaires (cumbersome, lots of bias)

Network size: 10s or at most 100s

Email Network

Email flows amongst a large project team. Colors denote each participant’s department

(Online) Social Networks

(a subset!) of the Internet Graph

The Science of Complex Networks

The study of large networks coming from all sorts of diverse areas
We will focus on technological (e.g. Internet) and information networks (e.g. Web, Facebook)
Such networks cannot be observed visually (as in the case of old social networks of a few 10s of nodes), so we need ways to measure them and quantify their properties

The field is often called Social Networks, Network Science, or Network Theory

Question 1: What are the statistical properties of real networks?
Connectivity, path lengths, degree distributions
How do we measure such huge networks? Sampling

Question 2: Why do these properties arise?
Models of large networks: random graphs
Deterministic ways are too complex/restrictive

Question 3: How can we take advantage of these properties?
Connectivity (epidemiology, resilience)
Spread (information, disease)
Search (Web page, person)

Part I: Network Properties of Interest

There are a lot of different properties we might be interested in; the choice also depends on the application

But some properties are commonly studied, for 2 reasons:
1. These properties are important for key applications
2. The majority of networks exhibit surprising similarities with respect to these properties

1. Degree distribution (“scale free structure”)
2. Path length (“small world phenomena”)
3. Clustering (“community structure”)

Measuring Real Networks: Degree distributions

Problem: find the probability distribution that best fits the observed data

[Figure: frequency $f_k$ versus degree $k$]

$f_k$ = fraction of nodes with degree k = probability that a randomly selected node has degree k
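
As a minimal sketch of how $f_k$ could be computed from measured data, the Python snippet below counts degrees in an undirected edge list. The function name `degree_distribution` and the toy star network are illustrative assumptions, not part of the slides.

```python
from collections import Counter
from fractions import Fraction

def degree_distribution(edges):
    """Return {k: f_k}, the fraction of nodes with degree k.

    Only nodes that appear in at least one edge are counted,
    so isolated (degree-0) nodes are not represented here.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)
    counts = Counter(degree.values())
    return {k: Fraction(c, n) for k, c in sorted(counts.items())}

# Hypothetical toy network: a star with hub 0 and four leaves
toy_edges = [(0, 1), (0, 2), (0, 3), (0, 4)]
f = degree_distribution(toy_edges)
```

Exact fractions are used so that the values of $f_k$ sum to exactly 1; for large measured networks plain floats would be the more usual choice.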

Exponential distribution

Probability of having k neighbors:

$p(k) = \lambda e^{-\lambda k}$

Identified by a line in the log-linear plot:

$\log p(k) = -\lambda k + \log \lambda$

[Figure: log frequency versus degree, a straight line with slope $-\lambda$]

Power-law distributions

Right-skewed/heavy-tailed distribution
there is a non-negligible fraction of nodes that has very high degree (hubs)
scale-free: f(ax) = b f(x), no characteristic scale, the average is not informative

$p(k) = C k^{-\alpha}$

A power-law distribution gives a line in the log-log plot:

$\log p(k) = -\alpha \log k + \log C$

α: power-law exponent (typically 2 ≤ α ≤ 3)

[Figure: frequency versus degree on linear axes, and log frequency versus log degree, a straight line with slope $-\alpha$]
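
The straight-line property in the log-log plot suggests a simple way to read off α. The sketch below fits a least-squares line to $\log p(k)$ versus $\log k$ on synthetic, noise-free data; the helper `fit_loglog_slope` and the values C = 0.5, α = 2.5 are assumptions chosen for illustration. On real, noisy degree data this naive regression is known to be biased, so it should be read as an illustration of the log-log relation rather than a recommended estimator.

```python
import math

def fit_loglog_slope(ks, ps):
    """Least-squares slope and intercept of log p versus log k."""
    xs = [math.log(k) for k in ks]
    ys = [math.log(p) for p in ps]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

# Synthetic data p(k) = C * k^(-alpha) with assumed C = 0.5, alpha = 2.5
ks = list(range(1, 101))
ps = [0.5 * k ** -2.5 for k in ks]

slope, intercept = fit_loglog_slope(ks, ps)
alpha_hat = -slope              # slope of the line is -alpha
C_hat = math.exp(intercept)     # intercept of the line is log C
```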

Power Law vs. Exponential Distribution

[Figure: $f(x) = c\,x^{-1}$, $f(x) = c\,x^{-0.5}$ and $f(x) = c^{-x}$ plotted in two panels, semilog and loglog]

This difference is particularly obvious if we plot them on a log vertical scale: for large x there are orders of magnitude differences between the two functions.

Network Science: Scale-Free Property, February 7, 2011

Internet Topology Primer

Internet backbone and regional connectivity

Multi-tier AS topology

Gateway Routers inside ASs

Internet Degree Distribution

Holds for both AS and Router topologies

Degree Distribution for Other Networks

Power Law Exponent in Real Networks (M. Newman 2003)

α : power-law exponent (typically 2 ≤ α ≤ 3)

Measuring path length

$d_{ij}$ = length of the shortest path between i and j

Diameter:

$d = \max_{i,j} d_{ij}$

Average path length:

$\ell = \frac{1}{n(n-1)/2} \sum_{i > j} d_{ij}$

Also of interest: distribution of all shortest paths
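
Both quantities can be computed with one breadth-first search per node. The sketch below is a minimal illustration, assuming a connected, unweighted, undirected graph; the function names are hypothetical, and the edge list is the five-node undirected example E={(1,2),(1,3),(2,3),(3,4),(4,5)} used in the revision slides of this deck.

```python
from collections import deque
from itertools import combinations

def bfs_distances(adj, source):
    """Hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def path_metrics(edges):
    """Diameter and average shortest-path length (graph assumed connected)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    dist = {u: bfs_distances(adj, u) for u in adj}
    # One entry per unordered pair, matching the n(n-1)/2 normalisation
    pair_d = [dist[i][j] for i, j in combinations(sorted(adj), 2)]
    return max(pair_d), sum(pair_d) / len(pair_d)

edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
diameter, avg_len = path_metrics(edges)
```

For this example the longest shortest path runs from node 1 (or 2) to node 5, and the ten pairwise distances sum to 17.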

Path Length: Lattice Network

A total of n nodes arranged in a $\sqrt{n} \times \sqrt{n}$ grid

Only neighbors (up, down, left, right) connected

Q: What is the diameter of the network?

A: $2(\sqrt{n} - 1)$

Q: What is the avg. distance (i.e. picking two nodes randomly)?

A: It is in the order of $\sqrt{n}$ (i.e. $c\sqrt{n}$)
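
The lattice answers can be checked by brute force. The sketch below assumes n is a perfect square, n = m·m, and runs a breadth-first search from every node of the m × m grid; `grid_path_stats` is a hypothetical helper name.

```python
from collections import deque

def grid_path_stats(m):
    """Diameter and average hop distance of an m x m grid (n = m*m nodes)."""
    nodes = [(r, c) for r in range(m) for c in range(m)]
    total, pairs, diameter = 0, 0, 0
    for src in nodes:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            r, c = queue.popleft()
            for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if 0 <= nr < m and 0 <= nc < m and (nr, nc) not in dist:
                    dist[(nr, nc)] = dist[(r, c)] + 1
                    queue.append((nr, nc))
        diameter = max(diameter, max(dist.values()))
        total += sum(dist.values())   # ordered pairs (src, other)
        pairs += len(dist) - 1
    return diameter, total / pairs

# Ratio of average distance to sqrt(n) = m for a few grid sizes
ratios = {m * m: grid_path_stats(m)[1] / m for m in (4, 8, 16)}
```

In these runs the ratio of the average distance to $\sqrt{n}$ stays constant as n grows, which is the $c\sqrt{n}$ behaviour stated above.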

Path Length: Random Geometric Network

n wireless nodes in an area of 1x1
Each transmits at distance R

R must be at least in the order of $\sqrt{\log n / n}$ for connectivity

Q: Choose two random nodes: what is the expected hop count (distance) between them?

A: $O\left(\sqrt{n / \log n}\right)$

Milgram’s small world experiment

Letters were handed out to people in Nebraska to be sent to a target in Boston
People were instructed to pass on the letters to someone they knew on a first-name basis
~60 letters, only about 35% delivered
The letters that reached the destination followed paths of length around 6

Six degrees of separation (play by John Guare)

Milgram’s small world experiment: Email Version

In 2001, Duncan Watts, a professor at Columbia University, recreated Milgram's experiment using an e-mail message as the “package” that needed to be delivered.

Surprisingly, after reviewing the data collected from 48,000 senders and 19 targets in 157 different countries, Watts found that again the average number of intermediaries was 6.

A Few Good Men

Robert Wagner

Austin Powers: The spy who shagged me

Wild Things

Let’s make it legal

Barry Norton

What Price Glory

Monsieur Verdoux

Kevin Bacon number: link 2 actors in same movie

Kevin Bacon Number

(statistics from IMDB)
~740000 linkable actors
Average (path length) = 3
99% of actors less than 6 hops
Try your own actor here: http://www.cs.virginia.edu/oracle/

Erdos number: collaboration networks

Legendary mathematician Paul Erdos, around 1500 papers and 509 collaborators

Collaboration Graph: link between two authors who wrote a paper together

Erdos number of X: hop count between Erdos and author X in collaboration graph

~260,000 in connected component

[Figure: collaboration path, with labels Kostas Psounis and T. Spyropoulos]

Internet Path Lengths

Number of ASs traversed by an email message
• ~35000 nodes
• Avg. path ~ 5!

Number of routers traversed by an email message
• >200000 nodes
• Avg. path ~ 15

plots taken from R. V. Hofstad

Internet Path Length: Different Continents

Measurement Findings: Path Length

Milgram’s experiment => Small World Phenomenon
Short paths exist between most nodes: path length l << total nodes N (in contrast, e.g. a line network has path length l = O(N))

“Small world” = avg. path length l is at most O(log N)

Clustering (Transitivity) coefficient

Measures the density of triangles (local clusters) in the graph

Two different ways to measure it:

The ratio of the means

$C^{(1)} = \frac{\sum_i \text{triangles centered at node } i}{\sum_i \text{triples centered at node } i}$

[Figure: example graph on nodes 1 to 5]

Example

[Figure: example graph on nodes 1 to 5]

$C^{(1)} = \frac{3}{1 + 1 + 6} = \frac{3}{8}$

Clustering (Transitivity) coefficient

Clustering coefficient for node i:

$C_i = \frac{\text{triangles centered at node } i}{\text{triples centered at node } i}$

The mean of the ratios:

$C^{(2)} = \frac{1}{n} \sum_i C_i$

Example

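The two clustering coefficients give different measures

$C^{(2)}$ weights low-degree nodes more heavily, which tends to increase it

[Figure: example graph on nodes 1 to 5]

$C^{(2)} = \frac{1}{5}\left(1 + 1 + \frac{1}{6}\right) = \frac{13}{30}$

$C^{(1)} = \frac{3}{8}$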
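
The two definitions can be compared side by side in a few lines of Python. The example figure is not reproduced in this transcript, so the edge set below is an assumption: a triangle 1-2-3 with two extra leaves attached to node 3, which yields the values 3/8 and 13/30 quoted on the slides. The function name `clustering` is likewise hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def clustering(edges):
    """Return (C1, C2): the ratio of the means and the mean of the ratios."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    closed = {}   # triangles centered at i: pairs of neighbors that are linked
    triples = {}  # connected triples centered at i: all pairs of neighbors
    for i, nbrs in adj.items():
        triples[i] = len(nbrs) * (len(nbrs) - 1) // 2
        closed[i] = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    c1 = Fraction(sum(closed.values()), sum(triples.values()))
    # Nodes with fewer than two neighbors contribute C_i = 0
    ratios = [Fraction(closed[i], triples[i]) if triples[i] else Fraction(0)
              for i in adj]
    c2 = sum(ratios) / len(ratios)
    return c1, c2

# Assumed edge set for the five-node example
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (3, 5)]
c1, c2 = clustering(edges)
```

Nodes 1 and 2 have degree 2 and $C_i = 1$, which pulls the mean of the ratios above the ratio of the means.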

Clustering Coeff. In Real Nets (M. Newman 2003)

Basic Graph Properties: Revision Material

Summary of Findings

Most real networks have…

1. Short paths between nodes (“small world”)

2. Transitivity/Clustering coefficient that is finite > 0

3. Degree distribution that follows a power law

Q1. Can we design graph models that exhibit similar characteristics?
Q2. Can we explain how/why these phenomena occur in the first place?
Q3. Can we take advantage of these properties (e.g. searching, advertising, viral infection/immunization, etc.)?

Undirected Graphs

Graph G=(V,E)
V = set of vertices
E = set of edges

undirected graph
E={(1,2),(1,3),(2,3),(3,4),(4,5)}

Directed Graphs

Graph G=(V,E)
V = set of vertices
E = set of edges

directed graph
E={‹1,2›, ‹2,1›, ‹1,3›, ‹3,2›, ‹3,4›, ‹4,5›}

Weighted and Unweighted Graphs

Edges have / do not have a weight associated with them

[Figure: a weighted graph and an unweighted graph]

Undirected graph: Degree Distribution

degree d(i) of node i: number of edges incident on node i

degree distribution:
1 node with degree 1
3 nodes with degree 2
1 node with degree 3
P(1) = 1/5, P(2) = 3/5, P(3) = 1/5

[Figure: example graph on nodes 1 to 5 and histogram of the number of nodes with degree 1, 2 and 3]

Undirected Graph: Degree Distribution

[Figure: histogram of P(k) versus k, for k = 1 to 4]

Network Science: Graph Theory, January 24, 2011

Directed Graph: In- and Out-Degree

in-degree $d_{in}(i)$ of node i: number of edges pointing to node i

out-degree $d_{out}(i)$ of node i: number of edges leaving node i

in-degree sequence: [1,2,1,1,1]

out-degree sequence: [2,1,2,1,0]
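
The two sequences can be reproduced from the arc list of the directed example graph, E={‹1,2›, ‹2,1›, ‹1,3›, ‹3,2›, ‹3,4›, ‹4,5›}. This is a minimal sketch; the function name `degree_sequences` is an illustrative assumption.

```python
def degree_sequences(n, arcs):
    """In- and out-degree sequences for nodes 1..n of a directed graph."""
    d_in = [0] * n
    d_out = [0] * n
    for src, dst in arcs:
        d_out[src - 1] += 1   # arc leaves src
        d_in[dst - 1] += 1    # arc points to dst
    return d_in, d_out

arcs = [(1, 2), (2, 1), (1, 3), (3, 2), (3, 4), (4, 5)]
d_in, d_out = degree_sequences(5, arcs)
```

Every arc contributes once to each sequence, so both sequences sum to the number of arcs.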

Paths

Path from node i to node j: a sequence of edges (directed or undirected) from node i to node j
path length: number of edges on the path
nodes i and j are connected if a path exists between them
cycle: a path that starts and ends at the same node

Shortest Paths

Shortest path from node i to node j, also known as BFS path or geodesic path

Diameter

The longest shortest path in the graph

Undirected graph: Components

Connected graph: a graph where every pair of nodes is connected

Disconnected graph: a graph that is not connected

Connected components: maximal subsets of vertices that are connected to each other

Fully Connected Graph

Clique Kn

A graph that has all possible n(n-1)/2 edges

Directed Graph

Strongly connected graph: there exists a path from every i to every j

Weakly connected graph: If edges are made to be undirected the graph is connected

Adjacency Matrix: Undirected Graph

Adjacency matrix: symmetric matrix for undirected graphs

$A = \begin{pmatrix} 0&1&1&0&0 \\ 1&0&1&0&0 \\ 1&1&0&1&0 \\ 0&0&1&0&1 \\ 0&0&0&1&0 \end{pmatrix}$
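
A small sketch of how an adjacency matrix can be built from an edge list, using the undirected example E={(1,2),(1,3),(2,3),(3,4),(4,5)} from the revision slides. The helper name `adjacency_matrix` and the plain list-of-lists representation are assumptions made for illustration.

```python
def adjacency_matrix(n, edges, directed=False):
    """n x n 0/1 matrix for nodes labelled 1..n; A[i-1][j-1] = 1 for edge (i, j)."""
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i - 1][j - 1] = 1
        if not directed:
            A[j - 1][i - 1] = 1   # undirected edges are stored in both directions
    return A

edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
A = adjacency_matrix(5, edges)
is_symmetric = all(A[i][j] == A[j][i] for i in range(5) for j in range(5))
degrees = [sum(row) for row in A]   # row sums give the node degrees
```

Passing `directed=True` stores each edge only once, so the resulting matrix is in general not symmetric.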

Adjacency Matrix: Directed Graph

Adjacency matrix: non-symmetric matrix for directed graphs

$A = \begin{pmatrix} 0&1&1&0&0 \\ 1&0&0&0&0 \\ 0&1&0&1&0 \\ 0&0&0&0&1 \\ 0&0&0&0&0 \end{pmatrix}$

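Examples of Adjacency Matrices

$G_1 = \begin{pmatrix} 0&1&1&1 \\ 1&0&1&1 \\ 1&1&0&1 \\ 1&1&1&0 \end{pmatrix}$ (nodes 0 to 3, symmetric)

$G_2 = \begin{pmatrix} 0&1&0 \\ 1&0&0 \\ 0&1&0 \end{pmatrix}$ (nodes 0 to 2)

$G_3 = \begin{pmatrix} 0&1&1&0&0&0&0&0 \\ 1&0&0&1&0&0&0&0 \\ 1&0&0&1&0&0&0&0 \\ 0&1&1&0&0&0&0&0 \\ 0&0&0&0&0&1&0&0 \\ 0&0&0&0&1&0&1&0 \\ 0&0&0&0&0&1&0&1 \\ 0&0&0&0&0&0&1&0 \end{pmatrix}$ (nodes 0 to 7, symmetric)

undirected: $n^2/2$, directed: $n^2$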

Random Graph Models: Create/Explain Complex Network Properties

Random Graph Models: Why do we Need Them?

The networks discussed are quite large!
Impossible to describe or visualize explicitly.

Consider this example:
You have a new Internet routing algorithm
You want to evaluate it, but do not have a trace of the Internet topology
You decide to create an “Internet-like” graph on which you will run your algorithm
How do you describe/create this graph?

Random graphs: local and probabilistic rules by which vertices are connected

Goal: from simple probabilistic rules to observed complexity

Q: Which rules give us (most of) the observed properties?

Emergence of Complexity

Emergent Complexity in Cellular Automata

This is “Conway’s game of life” (there are many other automata)
http://www.youtube.com/watch?v=ma7dwLIEiYU&feature=related (demo)
http://www.bitstorm.org/gameoflife/ (try your own)

Local Rules
Each cell is either white or blue
Each cell interacts with its 8 neighbors
Time is discrete (rounds)
1. Any blue cell with fewer than two blue neighbors becomes white
2. Any blue cell with two or three blue neighbors lives on to the next round
3. Any blue cell with more than three blue neighbors becomes white
4. Any white cell with exactly three blue neighbors becomes blue
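
The four local rules fit in a few lines of code. The sketch below is one possible implementation, assuming an unbounded grid and representing the state as the set of blue cells; `life_step` and the “blinker” starting pattern are illustrative choices, not taken from the slides.

```python
def life_step(blue):
    """Apply one round of the rules; `blue` is a set of (row, col) blue cells."""
    counts = {}
    # Count, for every cell next to a blue cell, how many blue neighbors it has
    for r, c in blue:
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr or dc:
                    cell = (r + dr, c + dc)
                    counts[cell] = counts.get(cell, 0) + 1
    # Rules 1-3: a blue cell stays blue only with 2 or 3 blue neighbors.
    # Rule 4: a white cell with exactly 3 blue neighbors becomes blue.
    return {cell for cell, k in counts.items()
            if k == 3 or (k == 2 and cell in blue)}

blinker = {(1, 0), (1, 1), (1, 2)}   # horizontal bar of three blue cells
after_one = life_step(blinker)
after_two = life_step(after_one)
```

The bar flips between horizontal and vertical every round, a simple instance of a persistent pattern emerging from purely local rules.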

Back to Networks: (Erdős-Rényi) Random Graphs

A very (very!) simple local rule: (any) two vertices are connected with probability p
Only inputs: number of vertices n and probability p

Denote this class of graphs as G(n,p)

Erdős-Rényi model (1960)

Connect with probability p

[Figure: a realization with p = 1/6, N = 10, ⟨k⟩ ~ 1.5]
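
A minimal sketch of the G(n,p) rule: every pair of vertices is linked independently with probability p. The function name `gnp`, the seeds and the number of samples are assumptions made for illustration.

```python
import random
from itertools import combinations

def gnp(n, p, seed=None):
    """Sample an edge list from G(n, p) on nodes 0..n-1."""
    rng = random.Random(seed)
    # One independent coin flip per unordered pair of nodes
    return [(i, j) for i, j in combinations(range(n), 2) if rng.random() < p]

n, p = 10, 1 / 6
samples = [gnp(n, p, seed=s) for s in range(2000)]
# Average degree of one realization is 2L/n; average it over all realizations
mean_degree = sum(2 * len(edges) / n for edges in samples) / len(samples)
```

Individual realizations differ from each other, but averaged over many samples the mean degree settles close to p(n-1) = 1.5 for these parameters.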

How Many Networks in G(n,p)?

N and p do not uniquely define the network: we can have many different realizations of it. How many?

$2^{\frac{N(N-1)}{2}}$

[Figure: realizations of G(10, 1/6), N = 10, p = 1/6]

G(N,L): a graph with N nodes and L links

The probability to form a particular graph G(N,L) is

$P(G(N,L)) = p^L (1-p)^{\frac{N(N-1)}{2} - L}$

That is, each graph G(N,L) appears with probability P(G(N,L)).

Relation of G(N,p) to G(N,L)

P(L): the probability to have exactly L links in a network of N nodes and probability p:

$P(L) = \binom{\binom{N}{2}}{L} p^L (1-p)^{\frac{N(N-1)}{2} - L}$

$\binom{N}{2} = \frac{N(N-1)}{2}$: the maximum number of links in a network of N nodes.

$\binom{\binom{N}{2}}{L}$: number of different ways we can choose L links among all potential links.

Binomial distribution...

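G(N,p) statistics

P(L): the probability to have a network of exactly L links

$P(L) = \binom{\binom{N}{2}}{L} p^L (1-p)^{\frac{N(N-1)}{2} - L}$

The average number of links <L> in a random graph

$\langle L \rangle = \sum_{L=0}^{N(N-1)/2} L \, P(L) = p \, \frac{N(N-1)}{2}$

The standard deviation σ satisfies

$\sigma^2 = p(1-p) \frac{N(N-1)}{2}$

Average node degree <k>

$\langle k \rangle = p(N-1)$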
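
These closed forms can be checked numerically by summing over the binomial distribution of the link count directly. The sketch below does this for N = 10 and p = 1/6; `link_count_pmf` is a hypothetical helper name.

```python
from math import comb

def link_count_pmf(N, p):
    """P(L) for L = 0..M, where M = N(N-1)/2 is the number of node pairs."""
    M = N * (N - 1) // 2
    return [comb(M, L) * p**L * (1 - p) ** (M - L) for L in range(M + 1)]

N, p = 10, 1 / 6
pmf = link_count_pmf(N, p)
mean_L = sum(L * q for L, q in enumerate(pmf))                # p N(N-1)/2
var_L = sum((L - mean_L) ** 2 * q for L, q in enumerate(pmf))  # p(1-p) N(N-1)/2
mean_k = 2 * mean_L / N                                        # p (N-1)
```

Each link contributes to the degree of two nodes, which is why the average degree is 2<L>/N.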

G(N,p) as N → ∞

As the network size increases, the distribution becomes increasingly narrow, which means that we are increasingly confident that the number of links the graph has is in the vicinity of <L>.

$\frac{\sigma_L}{\langle L \rangle} = \left[ \frac{1-p}{p} \, \frac{2}{N(N-1)} \right]^{1/2} = O\left(\frac{1}{N}\right)$

Random Graphs: Degree Distribution

The degree distribution follows a binomial:

$p(k) = B(k; N, p) = \binom{N-1}{k} p^k (1-p)^{(N-1)-k}$

average degree is $\langle k \rangle = p(N-1)$
variance $\sigma_k^2 = p(1-p)(N-1)$

$\frac{\sigma_k}{\langle k \rangle} = \left[ \frac{1-p}{p} \, \frac{1}{N-1} \right]^{1/2} \approx \frac{1}{(N-1)^{1/2}}$

Assuming z = Np is fixed, as N → ∞, B(k; N, p) is approximated by a Poisson distribution:

$p(k) = P(k; z) = \frac{z^k}{k!} e^{-z}$

As N → ∞:
Highly concentrated around the mean
Probability of very high node degrees is exponentially small
Very different from power law!
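
The quality of the Poisson approximation can be inspected numerically. The sketch below compares the exact binomial degree distribution with its Poisson limit for a fixed z = Np; the value z = 4, the chosen network sizes and the function names are assumptions made for illustration.

```python
from math import comb, exp, factorial

def binomial_degree(k, N, p):
    """Exact G(N, p) degree distribution B(k; N, p)."""
    return comb(N - 1, k) * p**k * (1 - p) ** (N - 1 - k)

def poisson_degree(k, z):
    """Poisson limit with mean z."""
    return z**k * exp(-z) / factorial(k)

z = 4.0  # assumed fixed value of z = N p
gaps = {}
for N in (10, 100, 10000):
    p = z / N
    ks = range(min(N, 30))
    # Largest pointwise difference between the two distributions
    gaps[N] = max(abs(binomial_degree(k, N, p) - poisson_degree(k, z)) for k in ks)
```

The largest pointwise difference shrinks as N grows with z held fixed, which is the sense in which the binomial is approximated by the Poisson distribution.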

Are Erdos-Renyi (Poisson) Graphs Small-World?

The secret behind the small world effect: looking at the network volume

$S(d) = 4d$ (number of nodes at distance exactly d from a given node in a 2-D grid)

The Volume of Geometric Graphs

The secret behind the small world effect: looking at the network volume

$N(d) = \sum_{x=1}^{d} 4x = 2d(d+1) \sim d^2$

Polynomial growth

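The Exploding Volume of Random Graphs

The secret behind the small world effect: looking at the network volume

Geometric graph:

$N(d) = \sum_{x=1}^{d} 4x = 2d(d+1) \sim d^2$

Polynomial growth

Random graph:

$N(d) \approx \sum_{x=1}^{d} \langle k \rangle^x \approx \frac{\langle k \rangle^{d+1} - 1}{\langle k \rangle - 1} \sim \langle k \rangle^d$

Exponential growth

$\langle k \rangle^d \approx N \;\Rightarrow\; d \approx \frac{\ln N}{\ln \langle k \rangle} = \log_{\langle k \rangle} N$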

Distance in Random Graphs: Compare with Real Data

$l_{max} = \frac{\log N}{\log \langle k \rangle}$

Given the huge differences in scope, size, and average degree, the agreement is excellent!
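
The log N / log⟨k⟩ estimate can also be compared against a simulated random graph instead of measured data. The sketch below samples one G(n,p) graph and measures its mean hop distance by breadth-first search; the size n = 500, the average degree 8 and the seed are assumptions chosen so that the example runs quickly.

```python
import math
import random
from collections import deque
from itertools import combinations

def gnp_adjacency(n, p, seed):
    """Adjacency lists of one G(n, p) sample on nodes 0..n-1."""
    rng = random.Random(seed)
    adj = {i: [] for i in range(n)}
    for i, j in combinations(range(n), 2):
        if rng.random() < p:
            adj[i].append(j)
            adj[j].append(i)
    return adj

def mean_distance(adj):
    """Average hop distance over all ordered pairs that are connected."""
    total = pairs = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

n, mean_k = 500, 8.0   # assumed size and average degree
adj = gnp_adjacency(n, mean_k / (n - 1), seed=1)
measured = mean_distance(adj)
predicted = math.log(n) / math.log(mean_k)
```

The formula is a scaling argument, so only rough agreement between `measured` and `predicted` should be expected.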

Random Graphs: Clustering Coefficient

Consider a random graph G(n,p)

Q: What is the probability that two of your neighbors are also neighbors?
A: It is equal to p, independent of local structure

clustering coefficient C = p

when z is fixed (sparse networks): C = z/n = O(1/n)
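
A quick simulation illustrates that the clustering coefficient of G(n,p) is close to p. The sketch below measures the fraction of connected triples that are closed in one sampled graph; n = 200, p = 0.1 and the seed are assumed values, and `gnp_transitivity` is a hypothetical name.

```python
import random
from itertools import combinations

def gnp_transitivity(n, p, seed):
    """Fraction of connected triples that are closed, in one G(n, p) sample."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if rng.random() < p:
            adj[i].add(j)
            adj[j].add(i)
    closed = triples = 0
    for i in adj:
        # Every pair of neighbors of i is a triple centered at i
        for a, b in combinations(adj[i], 2):
            triples += 1
            closed += b in adj[a]   # closed if the two neighbors are linked
    return closed / triples

n, p = 200, 0.1   # assumed parameters
C = gnp_transitivity(n, p, seed=7)
```

Keeping the average degree z = np fixed while increasing n would make p, and with it C, shrink like 1/n.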

Clustering in Random Graphs: Compare with Real Data

Across networks of very different scope, size, and average degree, there is a clear disagreement.

Summary: Are Real Networks Random Graphs?

Erdos-Renyi graphs are “small world”
path lengths are O(log n)

Erdos-Renyi graphs are not “scale-free”
Degree distribution is binomial and highly concentrated (no power law)
Exponentially small probability to have “hubs” (no heavy tail)

Erdos-Renyi graphs are not “clustered”
C → 0 as N becomes larger

Conclusion: ER random graphs are not a good model of real networks
BUT: they still provide a great deal of insight!

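Poisson Graph Diameter: Growth is slightly slower

Tree approximation (exponential growth): S(d), the number of nodes at distance d, is
S(1) = k, S(2) = k^2, ..., S(d) = k^d
so the whole graph is reached when k^d ≈ N, i.e. d ≈ ln N / ln k

But some of your neighbors' neighbors are also your own neighbors, so only nodes not reached yet count (p ≈ k/N):
S(0) = 1
S(1) = k
S(2) ≈ S(1) (N - 1 - k) p
S(3) ≈ S(2) (N - 1 - k - S(2)) p
S(d) ≈ S(d-1) (N - 1 - S(1) - S(2) - ... - S(d-1)) p

Clustering inhibits the small-worldness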
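To see how much slower the growth is, the sketch below compares the tree estimate k^d with a mean-field estimate of the number of nodes first reached at distance d. It is an illustration under stated assumptions, not the slide's derivation: the function name `shell_sizes` is hypothetical, p = k/N is assumed, and the saturating factor 1 - (1 - p)^S(d-1) is used in place of the product p·S(d-1) (the two agree when p·S(d-1) is small) so that the estimate can never exceed the nodes still available.

```python
import math

def shell_sizes(N, k, max_d):
    """Mean-field estimate of the number of nodes first reached at distance 0..max_d.

    Assumption: each not-yet-reached node is linked to each node of the
    previous shell independently with probability p = k / N.
    """
    p = k / N
    sizes, reached = [1.0], 1.0
    for _ in range(max_d):
        # a fresh node joins the next shell if at least one frontier node links to it
        new = (N - reached) * (1.0 - (1.0 - p) ** sizes[-1])
        sizes.append(new)
        reached += new
    return sizes

N, k = 10_000, 10   # illustrative values (assumption)
sizes = shell_sizes(N, k, max_d=6)
for d, s in enumerate(sizes):
    print(f"d = {d}: tree estimate k^d = {k**d:>8}, corrected S(d) = {s:10.1f}")
print("tree estimate of the diameter: ln N / ln k =", math.log(N) / math.log(k))
```

The first shells track k^d closely; the correction only becomes visible once a sizeable fraction of the N nodes has been reached.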

Page 72: Eurecom, Sophia-Antipolis Thrasyvoulos Spyropoulos / spyropoul@eurecom.fr Part II: Complex Networks Empirical Properties and Metrics.

Small World Graphs: Watts-Strogatz Model

Short paths must be combined with a high clustering coefficient

Watts and Strogatz model [WS98]
- Start with a ring, where every node is connected to the next k nodes
- With probability p, rewire every edge (or add a shortcut) to a random node

[Figure: ring networks from order to randomness: p = 0, 0 < p < 1, p = 1]
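The two steps of the model can be sketched as below. This is a minimal illustration, not the reference implementation from [WS98]: the function name `watts_strogatz`, the choice to keep one endpoint fixed and to forbid self-loops and duplicate edges when rewiring, and the example parameters are assumptions.

```python
import random

def watts_strogatz(n, k, p, rng):
    """Ring of n nodes, each linked to its next k nodes (so degree 2k at p = 0);
    every lattice edge is then rewired with probability p."""
    if n <= 2 * k:
        raise ValueError("need n > 2k so that ring edges are distinct")
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for j in range(1, k + 1):
            u = (v + j) % n
            adj[v].add(u)
            adj[u].add(v)
    for v in range(n):
        for j in range(1, k + 1):
            if rng.random() < p:
                u = (v + j) % n
                # new endpoint: not v itself and not already a neighbor of v
                choices = [w for w in range(n) if w != v and w not in adj[v]]
                if choices:
                    w = rng.choice(choices)
                    adj[v].discard(u)
                    adj[u].discard(v)
                    adj[v].add(w)
                    adj[w].add(v)
    return adj

rng = random.Random(3)                   # fixed seed (assumption)
ring = watts_strogatz(100, 2, 0.0, rng)  # p = 0: pure order
mixed = watts_strogatz(100, 2, 0.2, rng) # 0 < p < 1: some shortcuts
print(sorted(ring[0]), sorted(mixed[0]))
```

Rewiring moves edges rather than adding them, so the number of edges stays n·k for every p in this variant.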

Page 73: Eurecom, Sophia-Antipolis Thrasyvoulos Spyropoulos / spyropoul@eurecom.fr Part II: Complex Networks Empirical Properties and Metrics.

Small World Graphs (2)

The Watts-Strogatz model: it takes a lot of randomness to ruin the clustering, but a very small amount to overcome locality

[Figure: clustering coefficient C and characteristic path length L versus p (log scale in p)]

When p = 0: C = 3(k-2)/(4(k-1)) ~ ¾ and L ~ n/k (linear in n)
For small p: C ~ ¾ and L ~ log n
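The claim can be checked on a small example. The sketch below is an illustration under assumptions, not the experiment behind the slide: it uses the "add a shortcut" variant rather than rewiring, takes k to be the node degree (k/2 ring neighbors on each side, which is the convention under which C = 3(k-2)/(4(k-1)) holds), and the helper names, seed, and values n = 400, k = 6, p = 0.05 are hypothetical choices.

```python
import random
from collections import deque
from itertools import combinations

def ring_with_shortcuts(n, k, p, rng):
    """Ring lattice of degree k (k even); for each lattice edge, add one
    random shortcut with probability p."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for j in range(1, k // 2 + 1):
            adj[v].add((v + j) % n)
            adj[(v + j) % n].add(v)
    for _ in range(n * k // 2):          # one trial per lattice edge
        if rng.random() < p:
            a, b = rng.sample(range(n), 2)
            adj[a].add(b)
            adj[b].add(a)
    return adj

def avg_clustering(adj):
    """Mean over nodes of (linked neighbor pairs) / (all neighbor pairs)."""
    total = 0.0
    for nbrs in adj.values():
        pairs = list(combinations(nbrs, 2))
        total += sum(b in adj[a] for a, b in pairs) / len(pairs)
    return total / len(adj)

def avg_path_length(adj):
    """Mean shortest-path length over all node pairs, via BFS from each node."""
    total = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
    n = len(adj)
    return total / (n * (n - 1))

rng = random.Random(1)                   # fixed seed (assumption)
n, k = 400, 6
lattice = ring_with_shortcuts(n, k, 0.0, rng)
shortcut = ring_with_shortcuts(n, k, 0.05, rng)
C0, L0 = avg_clustering(lattice), avg_path_length(lattice)
C1, L1 = avg_clustering(shortcut), avg_path_length(shortcut)
print(f"p = 0:    C = {C0:.3f}, L = {L0:.1f}")
print(f"p = 0.05: C = {C1:.3f}, L = {L1:.1f}")
```

With k as the degree, the exact p = 0 path length is close to n/(2k); the prefactor depends on the convention for k, the linear growth in n does not.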

Page 74: Eurecom, Sophia-Antipolis Thrasyvoulos Spyropoulos / spyropoul@eurecom.fr Part II: Complex Networks Empirical Properties and Metrics.

Online Social Networks

Nodes: online users
Links: email contact, tweet, or friendship

All distributions show fat-tailed behavior: the degrees spread over orders of magnitude

Source: Alan Mislove, Measurement and Analysis of Online Social Networks

Page 75: Eurecom, Sophia-Antipolis Thrasyvoulos Spyropoulos / spyropoul@eurecom.fr Part II: Complex Networks Empirical Properties and Metrics.

World Wide Web

Page 76: Eurecom, Sophia-Antipolis Thrasyvoulos Spyropoulos / spyropoul@eurecom.fr Part II: Complex Networks Empirical Properties and Metrics.

Scale-free Graphs: What About Power Laws?

The configuration model
- input: the degree sequence [d1, d2, …, dn]
- process:
  - create di copies of node i
  - take a random matching (pairing) of the copies, i.e. link them randomly
  - self-loops and multiple edges are allowed

[Figure: example; node labels 4, 1, 3, 2]

But: Too artificial!
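The process above fits in a few lines. This is a minimal sketch: the function name `configuration_model` is hypothetical, and the sequence [4, 1, 3, 2] is borrowed from the figure labels as a plausible example input (an assumption, since the figure itself is not available).

```python
import random

def configuration_model(degrees, rng):
    """Return a list of edges realising the given degree sequence.

    Node i contributes degrees[i] copies ("stubs"); the stubs are shuffled
    and paired off. Self-loops and multiple edges are allowed.
    """
    if sum(degrees) % 2:
        raise ValueError("the degree sum must be even to pair all copies")
    stubs = [i for i, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(stubs)                       # random matching of the copies
    return list(zip(stubs[::2], stubs[1::2]))

degrees = [4, 1, 3, 2]                       # example input (assumption)
edges = configuration_model(degrees, random.Random(7))

# Recount the degrees; a self-loop (a, a) contributes 2 to node a.
realised = [0] * len(degrees)
for a, b in edges:
    realised[a] += 1
    realised[b] += 1
print(edges, realised)
```

Any degree sequence with an even sum can be realised this way, including a power-law one, which is why the model reproduces the degree distribution without explaining where it comes from.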

Page 77: Eurecom, Sophia-Antipolis Thrasyvoulos Spyropoulos / spyropoul@eurecom.fr Part II: Complex Networks Empirical Properties and Metrics.

One Explanation of Scale-Free(ness): Growth

ER, WS models: the number of nodes, N, is fixed (static models)

Networks continuously expand by the addition of new nodes

Barabási & Albert, Science 286, 509 (1999)

Page 78: Eurecom, Sophia-Antipolis Thrasyvoulos Spyropoulos / spyropoul@eurecom.fr Part II: Complex Networks Empirical Properties and Metrics.

Growth Models

(1) Networks continuously expand by the addition of new nodes

Add a new node with m links

Barabási & Albert, Science 286, 509 (1999)

Page 79: Eurecom, Sophia-Antipolis Thrasyvoulos Spyropoulos / spyropoul@eurecom.fr Part II: Complex Networks Empirical Properties and Metrics.

Growth Models: Preferential Attachment

Q: Where will the new node link to?
ER, WS models: choose randomly.
A: New nodes prefer to link to highly connected nodes.

PREFERENTIAL ATTACHMENT:
the probability that a new node connects to a node with k links is proportional to k:

Π(k_i) = k_i / Σ_j k_j

Barabási & Albert, Science 286, 509 (1999)

Page 80: Eurecom, Sophia-Antipolis Thrasyvoulos Spyropoulos / spyropoul@eurecom.fr Part II: Complex Networks Empirical Properties and Metrics.

Preferential Attachment in Networks

“The rich get richer”

First considered by [Price 65] as a model for citation networks
- each new paper is generated with m citations (on average)
- new papers cite previous papers with probability proportional to their indegree (citations)
- what about papers without any citations?
  - each paper is considered to have a “default” citation
  - probability of citing a paper with indegree k is proportional to k+1

Power law with exponent α = 2 + 1/m
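A simple way to sample with probability proportional to k+1 is to keep a list of "tickets" in which each paper appears once by default plus once per citation received. The sketch below is illustrative only: the function name `price_model`, the m uncited seed papers, and the use of exactly m distinct citations per new paper (rather than m on average) are simplifying assumptions, and drawing distinct targets slightly perturbs exact proportionality.

```python
import random

def price_model(n, m, rng):
    """Grow a citation network of n papers; return the list of indegrees."""
    indeg = [0] * m             # m seed papers with no citations (assumption)
    tickets = list(range(m))    # one "default" ticket per paper
    for new in range(m, n):
        cited = set()
        while len(cited) < m:   # draw until m distinct earlier papers are chosen
            cited.add(rng.choice(tickets))
        for old in cited:
            indeg[old] += 1
            tickets.append(old) # one extra ticket per citation received
        indeg.append(0)
        tickets.append(new)     # the new paper's default ticket
    return indeg

n, m = 5000, 3                  # illustrative values (assumption)
indeg = price_model(n, m, random.Random(5))
print("mean indegree:", sum(indeg) / n, "max indegree:", max(indeg))
```

The gap between the mean and the maximum indegree gives a first impression of the heavy tail; estimating the exponent 2 + 1/m would need a proper fit on a larger sample.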

Page 81: Eurecom, Sophia-Antipolis Thrasyvoulos Spyropoulos / spyropoul@eurecom.fr Part II: Complex Networks Empirical Properties and Metrics.

Barabasi-Albert model

The BA model (undirected graph)
- input: some initial subgraph G0, and m, the number of edges per new node
- the process:
  - nodes arrive one at a time
  - each node connects to m other nodes, selecting them with probability proportional to their degree
  - if [d1, …, dt] is the degree sequence at time t, node t+1 links to node i with probability

    d_i / Σ_j d_j = d_i / (2mt)

Results in a power law with exponent α = 3

Various problems: cannot account for every power law observed (Web), correlates age with degree, etc.
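The BA process can be sketched with a list of edge endpoints: node i appears in it d_i times, so a uniform draw from the list picks node i with probability d_i / Σ_j d_j. This is a minimal illustration, not the authors' code: the function name `barabasi_albert`, the choice of a complete graph on m+1 nodes as G0, and redrawing until m distinct targets are found are assumptions.

```python
import random

def barabasi_albert(n, m, rng):
    """Return the edge list of a BA graph on n nodes with m edges per new node."""
    # G0 (assumption): complete graph on the first m + 1 nodes
    edges = [(u, v) for u in range(m + 1) for v in range(u)]
    ends = [x for e in edges for x in e]   # node i appears d_i times
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:            # degree-proportional draws, distinct targets
            targets.add(rng.choice(ends))
        for t in targets:
            edges.append((new, t))
            ends += (new, t)               # both endpoints gain one degree
    return edges

n, m = 2000, 3                             # illustrative values (assumption)
edges = barabasi_albert(n, m, random.Random(11))
degree = [0] * n
for a, b in edges:
    degree[a] += 1
    degree[b] += 1
print("min degree:", min(degree), "max degree:", max(degree))
print("oldest nodes:", degree[:5], "newest nodes:", degree[-5:])
```

Comparing the degrees of the oldest and newest nodes also illustrates the age-degree correlation listed among the model's problems.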

Page 82: Eurecom, Sophia-Antipolis Thrasyvoulos Spyropoulos / spyropoul@eurecom.fr Part II: Complex Networks Empirical Properties and Metrics.

Poisson graphs as a function of p

As p increases, so does the density of the graph
For small p (< 0.2), notice that not all nodes are connected
For p = 0.2 there is only one isolated node

[Figure: sample graphs for p = 0, p = 0.1, p = 0.2]

Page 83: Eurecom, Sophia-Antipolis Thrasyvoulos Spyropoulos / spyropoul@eurecom.fr Part II: Complex Networks Empirical Properties and Metrics.

Phase Transitions in Random Graphs

We saw that increasing p → denser networks
In the large N case we increase z = Np, the average degree
But what really happens as p (or z) increases?

A random network on 50 nodes:
p = 0.01 → disconnected, largest component = 3

Page 84: Eurecom, Sophia-Antipolis Thrasyvoulos Spyropoulos / spyropoul@eurecom.fr Part II: Complex Networks Empirical Properties and Metrics.

Phase Transitions in Random Graphs (2)

p = 0.03 → large component appears
But almost 40% of nodes still disconnected

Page 85: Eurecom, Sophia-Antipolis Thrasyvoulos Spyropoulos / spyropoul@eurecom.fr Part II: Complex Networks Empirical Properties and Metrics.

Phase Transitions in Random Graphs (3)

p = 0.05 → “giant” component emerges
Only 3 nodes disconnected
Giant component → the graph “percolates”

Page 86: Eurecom, Sophia-Antipolis Thrasyvoulos Spyropoulos / spyropoul@eurecom.fr Part II: Complex Networks Empirical Properties and Metrics.

Phase Transitions in Random Graphs (4)

p = 0.10 → all nodes connected
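The 50-node experiment of the last four slides can be reproduced in spirit with the sketch below. It is an illustration, not the code behind the figures: the helper name `largest_component`, the seed, and the coupling trick (one uniform number per possible edge, so that the graphs for increasing p are nested) are assumptions, and the component sizes will differ from those quoted on the slides because the random sample differs.

```python
import random
from itertools import combinations

def largest_component(n, edges):
    """Size of the largest connected component, via union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)
    sizes = {}
    for v in range(n):
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values())

rng = random.Random(42)                     # fixed seed (assumption)
n = 50
# One uniform number per possible edge; the edge belongs to G(n, p) iff its
# number is below p, so every edge present at a small p is kept at a larger p.
draw = {e: rng.random() for e in combinations(range(n), 2)}
results = {}
for p in (0.01, 0.03, 0.05, 0.10):
    edges = [e for e, x in draw.items() if x < p]
    results[p] = largest_component(n, edges)
    print(f"p = {p:.2f}  z = Np = {n * p:.1f}  largest component = {results[p]}")
```

Because the graphs are nested, the largest component can only grow with p, which makes the jump around z = 1 easier to see in a single run.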

Page 87: Eurecom, Sophia-Antipolis Thrasyvoulos Spyropoulos / spyropoul@eurecom.fr Part II: Complex Networks Empirical Properties and Metrics.

Connectivity (“Percolation”) of Random Graphs

S: the fraction of nodes in the giant component, S = N_GC/N

There is a phase transition at <k> = 1:
- for <k> < 1 there is no giant component
- for <k> > 1 there is a giant component
- for large <k> the giant component contains all nodes (S = 1)

[Figure: S versus <k>; http://linbaba.files.wordpress.com/2010/10/erdos-renyi.png]
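The shape of the S versus <k> curve can be computed from the self-consistency equation S = 1 - exp(-<k> S). That equation is the standard large-N result for G(n,p) and is not stated on the slide, so treat it as an added assumption; the function name `giant_fraction` and the fixed-point iteration are likewise illustrative choices.

```python
import math

def giant_fraction(k_mean, iters=2000):
    """Solve S = 1 - exp(-k_mean * S) by fixed-point iteration from S = 1.

    Starting from S = 1 makes the iteration settle on the largest solution,
    which is 0 for k_mean < 1 and positive for k_mean > 1. Convergence is
    slow exactly at k_mean = 1, so that point is avoided below.
    """
    S = 1.0
    for _ in range(iters):
        S = 1.0 - math.exp(-k_mean * S)
    return S

for k_mean in (0.5, 0.9, 1.5, 2.0, 3.0, 5.0):
    print(f"<k> = {k_mean:3.1f}  S = {giant_fraction(k_mean):.4f}")
```

S stays at zero below <k> = 1, rises steeply just above it, and approaches 1 only asymptotically for large <k>.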