Dynamics of large networks
-
Upload
bradley-summers -
Category
Documents
-
view
21 -
download
1
description
Transcript of Dynamics of large networks
![Page 1: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/1.jpg)
Dynamics of large networks
Jure LeskovecMachine Learning DepartmentCarnegie Mellon University
Thesis defense, September 3 2008
![Page 2: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/2.jpg)
2
Web: Rich data
Today: Large on-line systems have detailed records of human activity On-line communities:
▪ Facebook (64 million users, billion dollar business)▪ MySpace (300 million users)
Communication:▪ Instant Messenger (~1 billion users)
News and Social media:▪ Blogging (250 million blogs world-wide, presidential candidates run
blogs) On-line worlds:
▪ World of Warcraft (internal economy 1 billion USD)▪ Second Life (GDP of 700 million USD in ‘07)
Can study phenomena and behaviors at scales that
before were never possible
![Page 3: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/3.jpg)
3
Rich data: Networks
b) Internet (AS)c) Social networksa) World wide web
d) Communication e) Citationsf) Biological networks
![Page 4: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/4.jpg)
4
Networks: What do we know? We know lots about the network structure:
Properties: Scale free [Barabasi ’99], Clustering [Watts-Strogatz ‘98], Navigation [Adamic-Adar ’03, LibenNowell ’05], Bipartite cores [Kumar et al. ’99], Network motifs [Milo et al. ‘02], Communities [Newman ‘99], Conductance [Mihail-Papadimitriou-Saberi ‘06], Hubs and authorities [Page et al. ’98, Kleinberg ‘99]
Models: Preferential attachment [Barabasi ’99], Small-world [Watts-Strogatz ‘98], Copying model [Kleinberg el al. ’99], Heuristically optimized tradeoffs [Fabrikant et al. ‘02], Congestion [Mihail et al. ‘03], Searchability [Kleinberg ‘00], Bowtie [Broder et al. ‘00], Transit-stub [Zegura ‘97], Jellyfish [Tauro et al. ‘01]
We know much less about processes and
dynamics of networks
![Page 5: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/5.jpg)
5
This thesis: Network dynamics
Network evolution How network structure changes as the
network grows and evolves? Diffusion and cascading behavior
How do rumors and diseases spread over networks?
Large data Observe phenomena that is “invisible” at
smaller scales
![Page 6: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/6.jpg)
6
Data size matters
We need massive network data for the patterns to emerge MSN Messenger network [WWW ’08]
(the largest social network ever analyzed)▪ 240M people, 255B messages, 4.5 TB data
Product recommendations [EC ‘06]
▪ 4M people, 16M recommendations Blogosphere [in progress]
▪ 164M posts, 127M links
![Page 7: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/7.jpg)
7
This thesis: The structure
Network Evolution
Network Cascades
Large Data
Observations
Q1: How does network structure
evolve over time?
Q4: What are patterns of diffusion in networks?
Q7: What are the properties of a
social network of the whole planet?
ModelsQ2: How to
model individual edge
attachment?
Q5: How do we model influence
propagation?
Q8: What is community
structure of large networks?
Algorithms
(applications)
Q3: How to generate
realistic looking networks?
Q6: How to identify
influential nodes and epidemics?
Q9: How to predict search result quality from the web
graph?
![Page 8: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/8.jpg)
8
This thesis: The structure
Network Evolution
Network Cascades
Large Data
Observations
Q1: How does network structure
evolve over time?
Q4: What are patterns of diffusion in networks?
Q7: What are the properties of a
social network of the whole planet?
ModelsQ2: How to
model individual edge
attachment?
Q5: How do we model influence
propagation?
Q8: What is community
structure of large networks?
Algorithms
(applications)
Q3: How to generate
realistic looking networks?
Q6: How to identify
influential nodes and epidemics?
Q9: How to predict search result quality from the web
graph?
![Page 9: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/9.jpg)
9
Background: Network models
Empirical findings on real graphs led to new network models
Such models make assumptions/predictions about other network properties
What about network evolution?
log degree
log pro
b. Model
Power-law degree distribution
Preferential attachment
Explains
![Page 10: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/10.jpg)
10
Networks are denser over time Densification Power Law:
a … densification exponent (1 ≤ a ≤ 2)
Q1) Network evolution
What is the relation between the number of nodes and the edges over time?
Prior work assumes: constant average degree over time
Internet
Citations
a=1.2
a=1.6
N(t)
E(t
)
N(t)
E(t
)
[w/ Kleinberg-Faloutsos, KDD ’05]
![Page 11: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/11.jpg)
11
Q1) Network evolution
Prior models and intuition say that the network diameter slowly grows (like log N, log log N)
time
diam
eter
diam
eter
size of the graph
Internet
Citations
Diameter shrinks over time as the network grows the
distances between the nodes slowly decrease
[w/ Kleinberg-Faloutsos, KDD ’05]
![Page 12: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/12.jpg)
12
Q2) Modeling edge attachment
We directly observe atomic events of network evolution (and not only network snapshots)
We can model evolution at finest scale Test individual edge attachment
Directly observe events leading to network properties Compare network models by likelihood
(and not by just summary network statistics)
and so on for
millions…
[w/ Backstrom-Kumar-Tomkins, KDD ’08]
![Page 13: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/13.jpg)
13
Setting: Edge-by-edge evolution
Network datasets Full temporal information from the first edge onwards LinkedIn (N=7m, E=30m), Flickr (N=600k, E=3m), Delicious
(N=200k, E=430k), Answers (N=600k, E=2m)
We model 3 processes governing the evolution P1) Node arrival: node enters the network P2) Edge initiation: node wakes up, initiates an edge, goes
to sleep P3) Edge destination: where to attach a new edge
▪ Are edges more likely to attach to high degree nodes?▪ Are edges more likely to attach to nodes that are close?
[w/ Backstrom-Kumar-Tomkins, KDD ’08]
![Page 14: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/14.jpg)
14
Edge attachment degree bias
Are edges more likely to connect to higher degree nodes?
kkpe )(Gnp
PA
Flickr
Network
τ
Gnp 0
PA 1
Flickr 1
Delicious
1
Answers
0.9
0.6
First direct proof of preferential
attachment!
[w/ Backstrom-Kumar-Tomkins, KDD ’08]
![Page 15: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/15.jpg)
15
uw
v
But, edges also attach locally Just before the edge (u,w) is placed how many
hops is between u and w?
Network
% Δ
Flickr 66%
Delicious
28%
Answers
23%
50%
Fraction of triad closing
edges
Real edges are local.Most of them close
triangles!
[w/ Backstrom-Kumar-Tomkins, KDD ’08]
GnpPA
Flickr
![Page 16: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/16.jpg)
16
New triad-closing edge (u,w) appears next We model this as:
1. u chooses neighbor v 2. v chooses neighbor w3. Connect (u,w) We consider 25 triad closing strategies and compute their log-likelihood
Triad closing is best explained by choosing a node based on the number of common
friends and time since last activity (just choosing random neighbor also works well)
How to best close a triangle?
[w/ Backstrom-Kumar-Tomkins, KDD ’08]
uw
v
![Page 17: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/17.jpg)
17
Q3) Generating realistic graphs
Problem: generate a realistic looking synthetic network
Why synthetic graphs? Anomaly detection, Simulations, Predictions, Null-model,
Sharing privacy sensitive graphs, …
Q: Which network properties do we care about? Q: What is a good model and how do we fit it?
Compare graph properties, e.g.,
degree distribution
Given a real
network
Generate a synthetic network
![Page 18: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/18.jpg)
18
Q3) The model: Kronecker graphs
We prove Kronecker graphs mimic real graphs: Power-law degree distribution, Densification,
Shrinking/stabilizing diameter, Spectral properties
Initiator
(9x9)(3x3)
(27x27)
Kronecker product of graph adjacency matrices
[w/ Chakrabarti-Kleinberg-Faloutsos, PKDD ’05]
pij
Edge probabilityEdge probability
Given a real graph. How to estimate the initiator G1?
![Page 19: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/19.jpg)
19
Maximum likelihood estimation
Q5) Kronecker graphs: Estimation
Naïve estimation takes O(N!N2): N! for different node labelings:
▪ Our solution: Metropolis sampling: N! (big) const N2 for traversing graph adjacency matrix
▪ Our solution: Kronecker product (E << N2): N2 E Do stochastic gradient descent
a bc d
P( | ) Kronecker
arg max
We estimate the model in O(E)
[w/ Faloutsos, ICML ’07]
![Page 20: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/20.jpg)
20
Estimation: Epinions (N=76k, E=510k)
We search the space of ~101,000,000
permutations Fitting takes 2 hours Real and Kronecker are very close
Degree distribution
node degree
pro
babili
ty
0.99 0.54
0.49 0.13
Path lengths
number of hops
# r
each
able
pair
s
“Network” values
ranknetw
ork
valu
e
[w/ Faloutsos, ICML ’07]
![Page 21: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/21.jpg)
21
Thesis: The structure
Network Evolution
Network Cascades
Large Data
Observations
Q1: How does network structure
evolve over time?
Q4: What are patterns of diffusion in networks?
Q7: What are the properties of a
social network of the whole planet?
ModelsQ2: How to
model individual edge
attachment?
Q5: How do we model influence
propagation?
Q8: What is community
structure of large networks?
Algorithms
(applications)
Q3: How to generate
realistic looking networks?
Q6: How to identify
influential nodes and epidemics?
Q9: How to predict search result quality from the web
graph?
![Page 22: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/22.jpg)
22
Thesis: The structure
Network Evolution
Network Cascades
Large Data
Observations
Q1: How does network structure
evolve over time?
Q4: What are patterns of diffusion in networks?
Q7: What are the properties of a
social network of the whole planet?
ModelsQ2: How to
model individual edge
attachment?
Q5: How can we model influence
propagation?
Q8: What is community
structure of large networks?
Algorithms
(applications)
Q3: How to generate
realistic looking networks?
Q6: How to identify
influential nodes and epidemics?
Q9: How to predict search result quality from the web
graph?
![Page 23: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/23.jpg)
23
Part 2: Diffusion and Cascades
Behavior that cascades from node to node like an epidemic News, opinions, rumors Word-of-mouth in marketing Infectious diseases
As activations spread through the network they leave a trace – a cascade
Cascade (propagation graph)Network
We observe cascading behavior in large
networks
![Page 24: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/24.jpg)
24
Setting 1: Viral marketing
People send and receive product recommendations, purchase products
Data: Large online retailer: 4 million people, 16 million recommendations, 500k products
10% credit 10% off
[w/ Adamic-Huberman, EC ’06]
![Page 25: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/25.jpg)
25
Setting 2: Blogosphere
Bloggers write posts and refer (link) to other posts and the information propagates
Data: 10.5 million posts, 16 million links
[w/ Glance-Hurst et al., SDM ’07]
![Page 26: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/26.jpg)
26
Viral marketing cascades are more social: Collisions (no summarizers) Richer non-tree structures
Q4) What do cascades look like? Are they stars? Chains? Trees?
Information cascades (blogosphere):
Viral marketing (DVD recommendations):
(ordered by frequency)
pro
pagati
on
[w/ Kleinberg-Singh, PAKDD ’06]
![Page 27: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/27.jpg)
27
Q5) Human adoption curves Prob. of adoption depends on the number of friends
who have adopted [Bass ‘69, Granovetter ’78] What is the shape?
Distinction has consequences for models and algorithms
k = number of friends adopting
Pro
b.
of
ad
op
tion
k = number of friends adopting
Pro
b.
of
ad
op
tion
Diminishing returns? Critical mass?
To find the answer we need lots of
data
![Page 28: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/28.jpg)
28
Q5) Adoption curve: Validation
Later similar findings were made for group membership [Backstrom-Huttenlocher-Kleinberg ‘06], and probability of communication [Kossinets-Watts ’06]
Prob
abili
ty o
f pur
chas
ing
0 5 10 15 20 25 30 35 400
0.010.020.030.040.050.060.070.080.09
0.1DVD
recommendations(8.2 million
observations)
# recommendations received
Adoption curve follows the diminishing returns.Can we exploit this?
[w/ Adamic-Huberman, EC ’06]
![Page 29: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/29.jpg)
29
Q6) Cascade & outbreak detection
Blogs – information epidemics Which are the influential/infectious blogs?
Viral marketing Who are the trendsetters? Influential people?
Disease spreading Where to place monitoring stations to detect
epidemics?
![Page 30: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/30.jpg)
30
Q6) The problem: Detecting cascades
How to quickly detect epidemics as they spread?
c1
c2
c3
[w/ Krause-Guestrin et al., KDD ’07]
![Page 31: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/31.jpg)
31
Two parts to the problem Cost:
Cost of monitoring is node dependent
Reward: Minimize the number of affected
nodes:▪ If A are the monitored nodes, let R(A)
denote the number of nodes we save
We also consider other rewards: ▪ Minimize time to detection▪ Maximize number of detected outbreaks
R(A)A
( )
[w/ Krause-Guestrin et al., KDD ’07]
![Page 32: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/32.jpg)
32
Reward for detecting cascade i
Given: Graph G(V,E), budget M Data on how cascades C1, …, Ci, …,CK spread over time
Select a set of nodes A maximizing the reward
subject to cost(A) ≤ M
Solving the problem exactly is NP-hard Max-cover [Khuller et al. ’99]
Optimization problem
![Page 33: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/33.jpg)
33
Solution: CELF Algorithm
We develop CELF (cost-effective lazy forward-selection) algorithm: Two independent runs of a modified greedy
▪ Solution set A’: ignore cost, greedily optimize reward▪ Solution set A’’: greedily optimize reward/cost ratio
Pick best of the two: arg max(R(A’), R(A’’))
Theorem: If R is submodular then CELF is near optimal CELF achieves ½(1-1/e) factor approximation
[w/ Krause-Guestrin et al., KDD ’07]
![Page 34: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/34.jpg)
34
Problem structure: Submodularity
Theorem: Reward function R is submodular (diminishing returns, think of it as “concavity”)
[w/ Krause-Guestrin et al., KDD ’07]
Gain of adding a node to a small set
Gain of adding a node to a large set
R(A {u}) – R(A) ≥ R(B {u}) – R(B)A B
S1
S2
Placement A={S1, S2}
S’
New monitored
node:
Adding S’ helps a lot
S2
S4
S1
S3
Placement B={S1, S2, S3, S4}
S’
Adding S’
helps very little
![Page 35: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/35.jpg)
35
Blogs: Information epidemics Question: Which blogs should one read to
catch big stories? Idea: Each blog covers part of the blogosphere
• Each dot is a blog• Proximity is based on the number of common cascades
![Page 36: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/36.jpg)
36
Blogs: Information epidemics
For more info see our website: www.blogcascade.org
Which blogs should one read to catch big stories?
CELF
In-links
Random
# postsOut-links
Number of selected blogs (sensors)
Reward
(higher is
better)
(used by Technorati)
![Page 37: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/37.jpg)
37
CELF: Scalability
CELF
GreedyExhaustive search
Number of selected blogs (sensors)
Run time
(seconds)
(lower is better)
CELF runs 700x faster than simple greedy algorithm
![Page 38: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/38.jpg)
38
Same problem: Water Network Given:
a real city water distribution network
data on how contaminants spread over time
Place sensors (to save lives)
Problem posed by the US Environmental Protection Agency
SS
c1
c2
[w/ Krause et al., J. of Water Resource Planning]
![Page 39: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/39.jpg)
39
Water network: Results
Our approach performed best at the Battle of Water Sensor Networks competition
CELF
PopulationRandom
Flow
Degree
Author Score
CMU (CELF) 26
Sandia 21
U Exter 20
Bentley systems 19
Technion (1) 14
Bordeaux 12
U Cyprus 11
U Guelph 7
U Michigan 4
Michigan Tech U 3
Malcolm 2
Proteo 2
Technion (2) 1
Number of placed sensors
Population
saved(higher is better)
[w/ Krause et al., J. of Water Resource Planning]
![Page 40: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/40.jpg)
40
Thesis: The structure
Network Evolution
Network Cascades
Large Data
Observations
Q1: How does network structure
evolve over time?
Q4: What are patterns of diffusion in networks?
Q7: What are the properties of a
social network of the whole planet?
Models
Q2: How to model
individual edge
attachment?
Q5: How do we model influence
propagation?
Q8: What is community
structure of large networks?
Algorithms
(applications)
Q3: How to generate
realistic looking networks?
Q6: How to identify
influential nodes and epidemics?
Q9: How to predict search result quality from the web
graph?
![Page 41: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/41.jpg)
41
Thesis: The structure
Network Evolution
Network Cascades
Large Data
Observations
Q1: How does network structure
evolve over time?
Q4: What are patterns of diffusion in networks?
Q7: What are the properties of a
social network of the whole planet?
ModelsQ2: How to
model individual edge
attachment?
Q5: How do we model influence
propagation?
Q8: What is community
structure of large networks?
Algorithms
(applications)
Q3: How to generate
realistic looking networks?
Q6: How to identify
influential nodes and epidemics?
Q9: How to predict search result quality from the web
graph?
![Page 42: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/42.jpg)
42
3 case studies on large data
Benefits from working with large data: Q7) Can test hypothesis at planetary scale
6 degrees of separation
Q8) Observe phenomena previously invisible Network community structure
Q9) Making global predictions from local network structure Web search
![Page 43: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/43.jpg)
43
Q7) Planetary look on a small-world
Milgram’s small world experiment
(i.e., hops + 1)
Small-world experiment [Milgram ‘67] People send letters from Nebraska to Boston
How many steps does it take? Messenger social network – largest network analyzed
240M people, 255B messages, 4.5TB data
[w/ Horvitz, WWW ’08]
MSN Messenger network
Avg. is 6.2. Thus, 6 degrees of separation
Avg. is 6.6! Wikipedia calls it 7 degrees of
separation
![Page 44: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/44.jpg)
44
Q8) Network community structure
How community like is a set of nodes?
Need a natural intuitive measure
Conductance:Φ(S) = # edges cut / # edges inside
Plot: Score of best cut of volume k=|S|
S
S’
[w/ Dasgupta-Lang-Mahoney, WWW ’08]
![Page 45: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/45.jpg)
45
Example: Small network
Collaborations between scientists (N=397, E=914)Cluster size, log k
log Φ
(k)
![Page 46: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/46.jpg)
46
Example: Large network
Collaboration network (N=4,158, E=13,422)
Cluster size, log k
log Φ
(k)
[w/ Dasgupta-Lang-Mahoney, WWW ’08]
![Page 47: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/47.jpg)
47
Q8) Suggested network structure
Good cuts exist only to size of 100
nodes
Denser and denser
core of the network
Expander like core contains
60% of the nodes and 80% edges
Network structure: Core-
periphery (jellyfish, octopus)
• Good cuts exist at small scales• Communities blend into the core as they grow • Consequences: There is a scale to a cluster (community) size
100
![Page 48: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/48.jpg)
48
Q9) Web Projections
User types in a query to a search engine Search engine returns results:
Is this a good set of search results?
Result returned by the search engine
Hyperlinks
Non-search resultsconnecting the graph
[w/ Dumais-Horvitz, WWW ’07]
![Page 49: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/49.jpg)
49
Q9) Web Projections: Results We can predict search result quality with
80% accuracy just from the connection patterns between the results
Predict “Good” Predict “Poor”
Good?
Poor?
[w/ Dumais-Horvitz, WWW ’07]
![Page 50: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/50.jpg)
50
Thesis: The structure
Network Evolution
Network Cascades
Large Data
Observations
Densification and shrinking
diameterCascade shapes
7 degrees of separation of
MSN
Models Triangle closing model
Diminishing returns of
human adoption
Network community structure
Algorithms
(applications)
Kronecker graphs and
fitting
Cascade and outbreak detection
Web projections
![Page 51: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/51.jpg)
51
Future directions: Evolution
Why are networks the way they are? Health of a social network
Steer the network evolution Better design networked services
Predictive modeling of large communities Online massively multi-player games are closed
worlds with detailed traces of activity
![Page 52: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/52.jpg)
52
Future directions: Diffusion
Predictive models of information diffusion When, where and what post will create a cascade? Where should one tap the network to get the
effect they want? Social Media Marketing
How do news and information spread New ranking and influence measures for blogs Sentiment analysis from cascade structure
![Page 53: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/53.jpg)
53
What’s next?
Observations: Data analysis
Models: Predictions
Algorithms: Applications
Actively influencing the network
![Page 54: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/54.jpg)
54
Christos FaloutsosJon KleinbergEric Horvitz
Carlos GuestrinAndreas KrauseSusan Dumais
Andrew TomkinsRavi Kumar
Lada AdamicBernardo Huberman
Kevin LangMichael Manohey
Zoubin GharamaniAnirban Dasgupta
Tom MitchellJohn Lafferty
Larry WassermanAvrim Blum
Sam MaddenMichalis Faloutsos
Ajit SinghJohn Shawe-Taylor
Lars BackstromMarko Grobelnik
Natasa Milic-FraylingDunja Mladenic
Jeff HammerbacherMatt Hurst
Deepay ChakrabartiMary McGlohonNatalie Glance
Jeanne VanBriesen
Microsoft ResearchYahoo Research
FacebookEve online
Spinn3rLinkedIn
Nielsen BuzzMetricsMicrosoft Live Labs
Thanks!
Everyone is invited toWean 4623 at 5pmThere will be food
![Page 55: Dynamics of large networks](https://reader036.fdocuments.in/reader036/viewer/2022062517/56813361550346895d9a779c/html5/thumbnails/55.jpg)
55
Thesis: The structure
Network Evolution
Network Cascades
Large Data
Observations
Densification and shrinking
diameterCascade shapes
7 degrees of separation of
MSN
Models Triangle closing model
Diminishing returns of
human adoption
Network community structure
Algorithms
(applications)
Kronecker graphs and
fitting
Cascade and outbreak detection
Web projections