The Power of Locality for Network Algorithms
JENNIFER CHAYES, MICROSOFT RESEARCH
CHRISTIAN BORGS, MICHAEL BRAUTBAR, SANJEEV KHANNA, BRENDAN LUCIER, AND SHANGHUA TENG

Description: Workshop on Random Graphs, 24-26 October. Talk by Jennifer Chayes.

Transcript

Page 1

The Power of Locality for Network Algorithms

JENNIFER CHAYES, MICROSOFT RESEARCH

CHRISTIAN BORGS, MICHAEL BRAUTBAR, SANJEEV KHANNA, BRENDAN LUCIER, AND SHANGHUA TENG

Page 2

Online Networks

Online networks are often massive

WWW has trillions of (static) sites

Facebook has over a billion users

3D representation of WWW by Opte

Small piece of FB mapped by Nexus

Page 3

Algorithmic Network Questions

• Ranking of the sites (e.g., PageRank)

• Finding the most influential site (or k most influential sites) under various definitions of influence

– Most highly connected

– Most influential under a certain model, e.g., KKT independent cascade model

• Covering the graph via local moves (the recruiter problem)

Page 4

Constraints

• Limitations on network visibility: e.g., Facebook, LinkedIn, etc. only let you see one or two hops away on the graph.

• Limitations on compute time, especially relevant for online computation on massive graphs

Need local (approximation) algorithms!

We want local (approximation) algorithms to be efficient, at the expense of the approximation factor if necessary.

Page 5

Outline of the Talk

I. Network algorithms with local access constraints
   – Context: local information algorithms
   – Algorithms on preferential attachment networks
   – Algorithms on general networks

II. Using locality to get sublinear algorithms without a priori access constraints
   – PageRank problem
   – Finding the most influential nodes (viral marketing in the independent cascade model)

Borgs, Brautbar, C, Lucier, Khanna : WINE ‘12

Borgs, Brautbar, C, Teng : WAW ’12 & Internet Mathematics

Borgs, Brautbar, C, Lucier: SODA ‘14

Page 6

A Networking Problem with Local Access Constraints

Goal: Meet the most influential people.

Pages 7-8

A Networking Problem

Goal: Meet the most influential people.

Find the highest-degree vertex.

Pages 9-13

A Networking Problem with Local Access Constraints

Page 14

Motivating Question

How well can a graph algorithm perform when it has only local visibility of the network structure?

… on “natural” networks?

… as a function of the “level” of visibility?

Page 15

Online Social Networks

Social network applications differ in what is visible:

Examples: Facebook, LinkedIn, Orkut, Google+.

Question: what is the impact of this design choice?

Page 16

Local Algorithms

More generally:

Search Problems: find the highest-degree node, the most central, …

Coverage Problems: minimum dominating set, maximum k-coverage, …

Connectivity Problems: shortest path, multicast, …

“Local”: Graph topology is revealed locally as the algorithm builds its output set.

Page 17

Outline of Part I: Algorithms with Local Access Constraints

1. A model of local information algorithms

2. Algorithms for preferential attachment networks

3. Minimum dominating set problem on general networks

Page 18

Local Information Algorithms

Input: Graph G = (V,E), initially unknown.
Output: a subset S of the vertices (e.g., find a feasible S minimizing |S|).

Two operations:
1. Add a random node to S.
2. Add any visible node to S.

Visible region (r-local algorithm): all nodes at distance ≤ r from S, plus the induced subgraph, plus the degrees of the outermost nodes.

Note: To map this onto Facebook and LinkedIn, think of r as the distance out from your current set of connections, i.e., your set of friends.
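To make the access model concrete, here is a minimal Python sketch of this interface (the class and method names are illustrative, not from the talk); the true adjacency structure is hidden, and the algorithm may only use the two operations and the visible region:

```python
import random

class LocalGraphAccess:
    """Minimal sketch of the r-local information model.

    An algorithm may only (1) add a uniformly random node to S, or
    (2) add a node it can currently see. It sees all nodes within
    distance r of S plus the degrees of the outermost visible nodes.
    """

    def __init__(self, adj, r):
        self._adj = adj            # hidden: node -> set of neighbors
        self.r = r
        self.S = set()

    def add_random_node(self):
        """Operation 1: add a uniformly random node to S."""
        v = random.choice(list(self._adj))
        self.S.add(v)
        return v

    def add_visible_node(self, v):
        """Operation 2: add any currently visible node to S."""
        assert v in self.visible_region(), "node must be within distance r of S"
        self.S.add(v)

    def visible_region(self):
        """All nodes at distance <= r from S (truncated BFS from S)."""
        seen = set(self.S)
        frontier = set(self.S)
        for _ in range(self.r):
            frontier = {w for v in frontier for w in self._adj[v]} - seen
            seen |= frontier
        return seen

    def degree(self, v):
        """Degrees are revealed even for the outermost visible nodes."""
        assert v in self.visible_region()
        return len(self._adj[v])
```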

Pages 19-22

1-Local Algorithm

Pages 23-24

2-Local Algorithm

This talk: focus mainly on 1-local algorithms.

Page 25

Preferential Attachment Networks

Page 26

Preferential Attachment

Random network growth model [BA’99,BR’00,…]

1. Begin with small fixed graph (e.g. clique).

2. Each new node v connects to m ≥ 2 previous nodes at random, proportional to their degrees:

   Pr[v connects to j] ∝ deg(j)
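A minimal sketch of this growth process (my own implementation; conventions for the initial graph and for repeated targets vary across the literature):

```python
import random

def preferential_attachment(n, m=2):
    """Grow a preferential attachment graph on n nodes: each arriving
    node attaches to m distinct earlier nodes, chosen with probability
    proportional to their current degrees."""
    # One common convention: start from a clique on m + 1 nodes.
    adj = {i: set() for i in range(m + 1)}
    ends = []                      # node u appears deg(u) times in this list
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            adj[i].add(j); adj[j].add(i)
            ends += [i, j]
    for v in range(m + 1, n):
        adj[v] = set()
        # Sampling a uniform entry of `ends` picks a node with
        # probability proportional to its degree.
        targets = set()
        while len(targets) < m:
            targets.add(random.choice(ends))
        for u in targets:
            adj[v].add(u); adj[u].add(v)
            ends += [u, v]
    return adj
```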

Page 27

Preferential Attachment

Properties:

• Connected (with high probability)
• Small diameter: O(log n / log log n)
• Power-law degree sequence: P(k) ~ k^(-3)
• Older nodes tend to have higher degree: E[deg(i)] ≈ (n/i)^(1/2)

Page 28

Finding the Root

Problem: Return a set S containing node 1.

Opportunistic algorithm (1-local):
  Initialize S to an arbitrary node.
  While S does not contain node 1:
    Add the node v ∈ N(S) with the largest degree.

Note: it is possible to remove the assumption that the algorithm can detect node 1.
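A short Python sketch of the opportunistic algorithm; the simulation consults the full graph for convenience, but it only uses frontier membership and frontier degrees, i.e., 1-local information:

```python
def find_root(adj, start):
    """Opportunistic 1-local search: repeatedly absorb the visible
    neighbor of largest degree until node 1 (the root) is in S."""
    S = {start}
    queries = 0
    while 1 not in S:
        frontier = {w for v in S for w in adj[v]} - S
        v = max(frontier, key=lambda w: len(adj[w]))   # largest visible degree
        S.add(v)
        queries += 1
    return S, queries
```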

Page 29

Finding the Root

Theorem: The opportunistic algorithm finds node 1 in O(log^4 n) queries, with high probability over the random graph process.

Compare: opportunistic algorithm O(log^4 n) vs. random walk Ω(n^(1/2)).

Note: a random walk requires O*(n^(1/2)) queries.

Page 30

Applications

s-t connectivity: O(log^4 n)
  – connect s and t to node 1
  – can connect k terminals in O(k log^4 n)

Find the k nodes of largest degree: O(log^4 n + k)
  – find node 1, but don't stop the algorithm

Page 31

Proof Sketch

Theorem: The greedy algorithm finds node 1 in time O(log^4 n).

The hope: the algorithm reaches a node of degree 2^k after k · polylog(n) iterations.

The problem: reaching a node of high degree does not necessarily imply progress.

(Figure: a low-degree bottleneck separating the current set from node 1.)

Page 32

Proof Sketch

Observation: if there is a path connecting S to node 1 with all nodes of degree ≥ d, then the algorithm never queries a node of degree < d.

Q: How common are these "good" paths?

A: For m ≥ 2, most nodes lie on good paths with constant probability. (Proof: detailed probabilistic analysis.)

Page 33

General Graphs

Page 34

Minimum Dominating Set

Problem: find smallest set S s.t. N(S) ∪ S = V.

Page 35

Minimum Dominating Set

Problem: find smallest set S s.t. N(S) ∪ S = V.

Lower bound: Ω(log Δ), from set cover, where Δ is the max degree.

Upper bound: O(log Δ), 3-local [GK'98].

How well can a 1-local algorithm perform?

Page 36

A local algorithm

Greedy Algorithm:
  Initialize S to a random node.
  While |D(S)| < n:
    Add the node v ∈ N(S) that maximizes |D(S ∪ {v})|.

(Here D(S) = S ∪ N(S) denotes the set of nodes dominated by S.)

Page 37

A local algorithm

Greedy Algorithm:
  Initialize S to a random node.
  While |D(S)| < n:
    Add the node v ∈ N(S) that maximizes |D(S ∪ {v})|.

(Figure: a bad example where the optimum is O(1) while the greedy algorithm needs Ω(n) nodes.)

Page 38

A local algorithm

Greedy-Random Algorithm:
  Initialize S to a random node.
  While |D(S)| < n:
    Add the node v ∈ N(S) that maximizes |D(S ∪ {v})|.
    Add a random node from N(v) \ D(S).

Theorem: The greedy-random algorithm obtains a (1 + 2 log Δ)-approximation (in expectation and w.h.p.).
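A sketch of the greedy-random algorithm in Python (simulated over a known adjacency dict for brevity; function and variable names are mine):

```python
import random

def greedy_random_mds(adj):
    """Sketch of the greedy-random dominating set algorithm on a
    connected graph. D(S) is the dominated set: S plus all neighbors
    of S."""
    nodes = set(adj)
    v0 = random.choice(list(nodes))
    S = {v0}
    dominated = {v0} | adj[v0]
    while len(dominated) < len(nodes):
        frontier = set().union(*(adj[v] for v in S)) - S
        # Greedy step: the visible node maximizing |D(S U {v})|, i.e.
        # the one that newly dominates the most vertices.
        v = max(frontier, key=lambda w: len(adj[w] - dominated))
        newly = adj[v] - dominated        # N(v) \ D(S), before v joins S
        S.add(v)
        dominated |= {v} | adj[v]
        # Random step: add a random node from N(v) \ D(S).
        if newly:
            u = random.choice(list(newly))
            S.add(u)
            dominated |= {u} | adj[u]
    return S
```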

Pages 39-41

Analysis

Pick v in OPT. We wish to show that our algorithm does not waste too many steps covering N(v). Consider some x chosen greedily by the algorithm which covers some vertex in N(v).

Case 1: v was visible. Then x must cover many nodes, due to greediness.

Case 2: v was not visible. If x covers few nodes, there is a good chance of revealing v on the "random" step.

Page 42

Conclusions from Part I:

• Local information algorithms: sequential decisions with limited network visibility.

• Many problems can be solved locally and efficiently in preferential attachment and general networks.

• The level of visibility can have a strong impact on the approximability of a problem.

Page 43

Part II: Using Locality to Get Sublinear Algorithms

• Sublinear algorithms to find high-PageRank nodes

• Sublinear algorithms for influence maximization

Page 44

Sublinear Algorithms for PageRank: Definitions

• View the WWW as a directed graph G = (V,E), with V the webpages and E the (directed) hyperlinks.

• PageRank: Do a random walk on the webgraph, restarting at a uniformly random site with probability a at each step (so on average every 1/a steps). The relative weight of a page in the stationary distribution is the PageRank of that page.

Page 45

PageRank: Definitions

• Random walk matrix M: M_uv = (1/d_out(u)) A_uv, where d_out(u) is the out-degree of vertex u and A_uv is the adjacency matrix of the directed graph.

• Stationary distribution p: p = p M.

• PageRank matrix P: P = a·I + (1 - a) P M.

• Personalized PageRank vector p(u): p(u) = e_u P = P_{u,·} (the walk always restarts at u).

• Contribution vector c(v): c(v) = e_v P^T = P_{·,v} (all contributions to v).

• PageRank: PR_v = Σ_u P_uv.

• Random walk with restart at u: in each step, do one random walk step with probability 1 - a, and jump to u with probability a:

  p(u) = a·e_u + (1 - a) p(u) M
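For concreteness, a small dense power-iteration sketch of the restart recursion above (illustrative only; it assumes every node has at least one out-link):

```python
import numpy as np

def personalized_pagerank(A, u, a=0.15, iters=100):
    """Iterate p(u) = a*e_u + (1 - a) * p(u) @ M to a fixed point.
    A is the (dense) adjacency matrix of a directed graph."""
    n = A.shape[0]
    M = A / A.sum(axis=1, keepdims=True)   # row-stochastic random walk matrix
    e_u = np.zeros(n)
    e_u[u] = 1.0
    p = e_u.copy()
    for _ in range(iters):
        p = a * e_u + (1 - a) * (p @ M)
    return p                                # row u of the PageRank matrix P

# PR_v is then sum over u of personalized_pagerank(A, u)[v],
# matching the definition PR_v = sum_u P_uv above.
```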

Page 46

Computing PageRank

• Take G = (V,E) with |V| = n and |E| = m.

• Significant PageRank Problem (SPP): Find all nodes with PR > D, and no nodes with PR < D/2.

• Previous results on running time:
  – Power iteration method (Bianchi et al. '03): O*(m)
  – Linear algebra improvement (Langville & Meyer '04): O*(n)

• Lower bound on running time: Ω(n/D). (Roughly: from n/D sites with PR = D and all other sites with PR = 0.)

• Approximate SPP: Can we use locality to get an (additive) ε-approximation which essentially matches the lower bound, i.e., time O*(n/D)?

Page 47

Roadmap for Approximate SPP

• Steps of the calculation:

  i.  Compute each P_uv.
  ii. Each PR_v is the sum of the n terms in the contribution (column) vector: PR_v = Σ_u P_uv.
  iii. Do this for all n nodes, i.e., for all contribution vectors.

• A priori, each step should take Ω(n) time.

Page 48

i. Local calculation of ε-approximate P_uv

• Previous results
  – Deterministic (bad if the in-degree is unbounded):
    • Jeh-Widom '03: O((log n) ε^(-1) max-in-degree)
    • Andersen et al. '06: O(ε^(-1) max-in-degree)
  – Random:
    • Fogaras et al. '05: Monte Carlo based approach which removed the dependence on max-in-degree, but gave multiplicative rather than additive error, and where the approximation depended on P_uv.

• Our approach
  – Modification of Fogaras et al.
  – Handled concentration better, to remove the dependence on P_uv.

Page 49

i. Local calculation of (ε, δ)-approximate P_uv

• Local method: uses a Terminating Random Walk, a RW which terminates with probability a at each step, and with probability 1 - a moves uniformly to a random outlink of the current node.

• Algorithm:
  – For O(ε^(-1) δ^(-2) log n) iterations do:
    – Run a new terminating RW starting at u, up to a maximum (capping) length of log_{1/(1-a)}(1/ε).
    – If the walk terminates before reaching the capping length, add one to the counter of the node the walk last visited before terminating.
  – Output the average count accumulated at each node.

• Running time: O(ε^(-1) δ^(-2) log n · log ε^(-1)) ~ O*(ε^(-1) δ^(-2)).

Note: The probability that a terminating RW starting at u happens to end at v is exactly P_uv.
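A Monte Carlo sketch of the terminating-walk estimator (the default walk count and cap are placeholders; the real choices are dictated by the bounds above):

```python
import random
from collections import Counter

def estimate_ppr_row(adj_out, u, a=0.15, num_walks=10000, cap=50):
    """Estimate the personalized PageRank row P_u: run terminating
    random walks from u; the fraction of walks that terminate at v
    estimates P_uv. Walks hitting the cap are discarded, which
    introduces only a small additive error."""
    counts = Counter()
    for _ in range(num_walks):
        node, steps = u, 0
        while True:
            if random.random() < a:
                counts[node] += 1            # walk terminated at `node`
                break
            if steps >= cap:
                break                        # capped: discard this walk
            nxt = adj_out[node]
            if not nxt:
                counts[node] += 1            # dangling node: treat as termination
                break
            node = random.choice(nxt)
            steps += 1
    return {v: c / num_walks for v, c in counts.items()}
```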

Page 50

ii. From P_uv to an approximation of PR_v = Σ_u P_uv

• Obviously we can't just sum (takes time n).

• Alternative: naïve sampling
  – Pick L random u_i ∈_R {1, …, n}.
  – Check whether the sum of these L terms, Σ_i P_{u_i v}, is large or small with respect to D·L/n.

• Problem with naïve sampling:
  – To make sure the error does not drown out the expectation, need ε = O(D/n).
  – To get concentration (Chernoff), need L = O*(n/D).
  – Runtime = L · O*(1/ε) = O*(n²/D²), rather than O*(n/D).

Page 51

ii. Multiscale Sampling

• Choose many scales ε_t = 2^(-t).

• For each scale, estimate how many entries P_{·,v} of the contribution vector c(v) lie in the interval (ε_t, 2ε_t).

• We end up spending most of our time (lots of work) on the estimates of the larger entries P_{·,v}.

• Estimate whether PR_v > D in running time O*(n/D).

Page 52

iii. From the question "PR_v > D" for one v to all v

• Key: Use sparse matrix methods to do all n columns in parallel.

• Maintain running time O*(n/D).

Page 53

Conclusion for PageRank

Locality + multiscale analysis + sparse matrix methods:

Running time O*(n/D) to find an approximation of all nodes with significant PageRank PR_v.

The running time is sublinear in n for D = Θ(n^p), 0 < p < 1.

Page 54

Final Topic: Sublinear Algorithms for Models of Influence Maximization

Page 55

Influence Maximization: Definitions

Independent Cascade Model (Kempe, Kleinberg, Tardos ‘03)

Introduced as a model of viral marketing.

• G = (V,E) an oriented graph with |V| = n, |E| = m, and edge weights {p_e | e ∈ E}

• p_e = probability the infection spreads along edge e

• I(S) = (random) size of the set that is eventually infected, starting from seed set S ⊆ V

Problem: For fixed k = |S|, find the seed set S which maximizes the expected influence E[I(S)].
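A sketch of estimating E[I(S)] by direct simulation of the independent cascade (the data-structure choices are mine):

```python
import random

def simulate_cascade(adj_out, p_edge, seeds):
    """One run of the independent cascade: each newly infected node u
    gets one chance to infect each out-neighbor v, succeeding with
    probability p_edge[(u, v)]."""
    infected = set(seeds)
    frontier = list(seeds)
    while frontier:
        u = frontier.pop()
        for v in adj_out[u]:
            if v not in infected and random.random() < p_edge[(u, v)]:
                infected.add(v)
                frontier.append(v)
    return len(infected)

def estimate_influence(adj_out, p_edge, seeds, trials=1000):
    """Monte Carlo estimate of the expected influence E[I(S)]."""
    return sum(simulate_cascade(adj_out, p_edge, seeds)
               for _ in range(trials)) / trials
```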

Page 56

KKT Model: Previous Results

• KKT: E[I(S)] is submodular, so maximizing E[I(S)] can be approximated to within (1 - 1/e) via the greedy algorithm.

• With oracle access to E[I(S)], the greedy algorithm has runtime O(kn).

• Oracle access can be simulated. Total runtime: O*(mnk · poly(ε^(-1))), i.e., even on a sparse graph, at least quadratic in n.

Page 57

Influence Maximization: Our Results

• Nearly linear time algorithm: We can find an approximately optimal seed set with an approximation factor of (1 - 1/e - ε) in time O*((m + n) ε^(-3)).
  – Note: There is a lower bound of Ω(m + n), so this is essentially optimal.

• Sublinear time algorithm: We can find an approximately optimal seed set with an approximation factor of (1/β) in time O*(n α(G)/β), where α(G) is the arboricity* of the graph G.
  – Taking β = Θ(n^p), 0 < p < 1, we get a running time sublinear in n.

*The arboricity of G is the minimum number of spanning forests necessary to cover all edges of G. Roughly speaking, arboricity corresponds to the density of the graph.

Page 58

Key Elements of the Proof

• Key idea: Preprocess G with random sampling to build a sparse hypergraph representation which retains the influence characteristics of high-influence nodes.
  – Each hypergraph edge represents a set of nodes influenced by a random node in the transpose graph.
  – The degree of a set S in the hypergraph is approximately proportional to the influence of S in the original graph.
  – This allows us to efficiently estimate marginal influence in the original diffusion process with very few samples.

• Local and applicable in many access models: the only operations are accessing a random vertex and traversing edges incident to a previously accessed vertex.

Page 59

Key Elements of the Proof

• Sublinear variant: Construct two candidate seed sets: one using a greedy algorithm on the constructed hypergraph, and the other a singleton selected at random according to the hypergraph degree distribution.

Page 60

Conclusions

• Local network algorithms may either be required due to local information access constraints, or simply desirable for increased runtime efficiency.

• Recurring elements in the sublinear network algorithms:
  – Sampling (sometimes at multiple scales) rather than probing all elements.
  – Interspersing greedy steps with random steps to see more of the space.
  – Maintaining locality by using backwards random walks, transposes of matrices, etc., to find large contributors.

Page 61

Conclusions

• With local network methods, it is possible to get sublinear time algorithms with reasonable approximation ratios for questions of interest in massive networks:
  – Finding the most highly connected node or nodes
  – Finding connections between nodes
  – Covering the network (the "recruiter problem")
  – Ranking of sites on the network (significant PageRank problem)
  – Finding sets of maximum influence in the independent cascade model

Page 62

Thanks for your attention!