Transcript of Communities, Spectral Clustering, and Random Walks (bindel/present/2012-07-mmds.pdf)

Communities, Spectral Clustering, and Random Walks

David Bindel

Department of Computer Science, Cornell University

13 Jul 2012

Spectral clustering recipe

Ingredients:
1. A subspace basis with useful information
   (eigenvectors of L, L̄, B, etc.; NMF factors)
2. A way to extract clusters from the basis
   (sign patterns; (scaled) latent coordinates)

What works for small communities? With overlap?

Classic spectral bisection

Goal: Split graph into equal parts, minimizing edges cut.

Strategy: turn it into a constrained integer QP
- Identify the partition with a label vector s̄ ∈ {±1}^n
- Objective: s̄ᵀ L s̄ = 4 × |edges cut|
- Constraint: eᵀ s̄ = 0

This is NP-hard, but...

Classical spectral bisection

[Plot: Fiedler vector entries vs. node index for a 20-node example]

Hard: min s̄ᵀ L s̄  s.t.  eᵀ s̄ = 0,  s̄ ∈ {±1}^n.
Easy: min vᵀ L v  s.t.  eᵀ v = 0,  v ∈ R^n,  ‖v‖² = n.

v is the Fiedler vector (second eigenvector of L)
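The relaxation above fits in a few lines of numpy. A minimal sketch (the toy graph of two cliques joined by one edge is my own illustration, not from the slides): build L = D − A, take the eigenvector for the second-smallest eigenvalue, and split on its sign pattern.

```python
import numpy as np

# Toy graph: two 4-node cliques joined by a single edge (nodes 3 and 4).
n = 8
A = np.zeros((n, n))
for block in ([0, 1, 2, 3], [4, 5, 6, 7]):
    for i in block:
        for j in block:
            if i != j:
                A[i, j] = 1.0
A[3, 4] = A[4, 3] = 1.0          # the one edge that must be cut

L = np.diag(A.sum(axis=1)) - A   # combinatorial Laplacian L = D - A

# Fiedler vector: eigenvector for the second-smallest eigenvalue of L.
vals, vecs = np.linalg.eigh(L)
v = vecs[:, 1]

# The sign pattern of v recovers the two cliques.
part = v > 0
```

Rounding v back to a ±1 label vector s, the objective sᵀ L s equals 4 × |edges cut|, which is 4 here since only one edge crosses the split.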

Three clusters?

Take three eigenvectors as latent coordinates + cluster:

V ≈ [r₁; …; r₁; r₂; …; r₂; r₃; …; r₃] = S R,

where each row of V repeats the latent coordinate rᵢ of its cluster, S ∈ {0,1}^{n×3} is the cluster indicator matrix, and R stacks the rows r₁, r₂, r₃.

Equivalent to a matrix factorization!
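This can be tried numerically. A sketch (the three-clique toy graph and the hand-rolled Lloyd iteration are my choices for illustration): rows of the top-eigenvector matrix V are nearly constant within each block, so clustering the rows recovers the factor S.

```python
import numpy as np

# Three 5-node cliques with weak ties; latent coordinates come from the
# top three eigenvectors of A (toy illustration, not the talk's data).
n, k = 15, 3
A = np.zeros((n, n))
for b in range(k):
    for i in range(5 * b, 5 * b + 5):
        for j in range(5 * b, 5 * b + 5):
            if i != j:
                A[i, j] = 1.0
A[4, 5] = A[5, 4] = A[9, 10] = A[10, 9] = 1.0   # weak inter-block ties

vals, vecs = np.linalg.eigh(A)
V = vecs[:, -k:]                 # n x k latent coordinates (rows of V)

# Rows of V are nearly constant within each block, so V ~ S R: cluster
# the rows with a few Lloyd (k-means) iterations, one center per block.
C = V[[0, 5, 10]].copy()
for _ in range(20):
    labels = ((V[:, None, :] - C[None]) ** 2).sum(-1).argmin(axis=1)
    C = np.array([V[labels == c].mean(axis=0) for c in range(k)])
```

Each block of five rows lands in its own cluster, i.e. `labels` is (up to relabeling) the indicator structure S.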

Overlapping communities

[Figure: spy plot of an adjacency matrix with overlapping diagonal blocks, ~180 nodes, nz = 3996]

A ≈ S diag(β) Sᵀ,  S ∈ {0,1}^{n×c}

Spectrum for A with overlap

[Plot: eigenvalues vs. index]

Clustering and overlap

Dominant eigenvectors for A: [plot of eigenvector entries]

Alternate basis for the space: [plot of near-indicator vectors with entries in [0, 1]]

How do we get the latter basis?

Recovering indicators

Given a basis U, want to extract a vector s̃ s.t.
- s̃ lies close to the span of U
- s̃ is almost an indicator for a community
- Some known “seeds” are in the community

minimize ‖s̃‖₁      (proxy for sparsity of s̃)
s.t.     s̃ = U y    (s̃ in the right space)
         s̃_i ≥ 1    (“seed” constraint)
         s̃ ≥ 0      (componentwise nonnegativity)

Indicators from subspaces: LP edition

minimize ‖s̃‖₁      (proxy for sparsity of s̃)
s.t.     s̃ = U y    (s̃ in the right space)
         s̃_i ≥ 1    (“seed” constraint)
         s̃ ≥ 0      (componentwise nonnegativity)

Recovers the smallest set containing node i if
- U = S Y⁻¹ exactly.
- Each set contains at least one element only in that set.

(Frequently works if there is not “too much” overlap.)

Got noise? Need thresholding!
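The LP above is small enough for an off-the-shelf solver. A sketch with scipy.optimize.linprog in the exact setting U = S Y⁻¹ (the 9-node example, the two overlapping communities, and the mixing matrix Y are made-up illustrations). Since s̃ = U y ≥ 0, the objective ‖s̃‖₁ equals 1ᵀ U y, so the problem is linear in y alone:

```python
import numpy as np
from scipy.optimize import linprog

# Exact setting U = S Y^{-1}: 9 nodes, two overlapping communities
# (node 4 belongs to both); sizes and the mixing Y are made up.
S = np.zeros((9, 2))
S[:5, 0] = 1.0                   # community 0: nodes 0-4
S[4:, 1] = 1.0                   # community 1: nodes 4-8
Y = np.array([[2.0, 1.0], [1.0, 3.0]])
U = S @ np.linalg.inv(Y)         # basis that hides the indicators

seed = 1                         # known to lie only in community 0
# Since s = U y >= 0, ||s||_1 = 1' U y, so the LP is linear in y:
#   minimize 1' U y   s.t.   (U y)_seed >= 1,   U y >= 0
c = U.sum(axis=0)
A_ub = np.vstack([-U, -U[seed]])             # -Uy <= 0, -(Uy)_seed <= -1
b_ub = np.concatenate([np.zeros(9), [-1.0]])
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * 2)
s_tilde = U @ res.x              # recovered indicator for community 0
```

Here each community has a node belonging only to it, so the LP recovers the exact {0,1} indicator of the seed's community.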

Alas

The method is willing, but the space is weak:
- Eigenvectors of A may not be ideal – what about L, L̄, transition matrices for random walks?
- Many small communities =⇒ many eigenvectors?

So consider:
- Why eigenvectors?
  - Optimization connection
  - Dynamics connection
- What might serve better?

Optimization and quadratic forms

Indicate V′ ⊆ V by s ∈ {0,1}^n. Measure the subgraph:

sᵀ A s = |E′| = internal edges
sᵀ D s = edges incident on the subgraph
sᵀ L s = edges between V′ and V̄′
sᵀ B s = “surprising” internal edges

Rayleigh quotients

sᵀ A s / sᵀ s   = mean internal degree in the subgraph
sᵀ L s / sᵀ s   = edges cut between V′ and V̄′ relative to |V′|
sᵀ A s / sᵀ D s = fraction of incident edges internal to V′
sᵀ L s / sᵀ D s = fraction of incident edges cut
sᵀ B s / sᵀ s   = mean “surprising” internal degree in the subgraph
sᵀ B s / sᵀ D s = mean fraction of internal degree that is surprising
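These quotient interpretations are easy to sanity-check numerically. A sketch on a toy graph of my choosing (two triangles joined by one edge, with s indicating the first triangle):

```python
import numpy as np

# Two triangles joined by one edge; s indicates the first triangle.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
d = A.sum(axis=1)
D = np.diag(d)
L = D - A
m = d.sum() / 2                       # |E| = 7
B = A - np.outer(d, d) / (2 * m)      # modularity matrix

s = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])

mean_internal_deg = (s @ A @ s) / (s @ s)     # 2 internal neighbors each
cut_per_node      = (s @ L @ s) / (s @ s)     # 1 cut edge over 3 nodes
frac_internal     = (s @ A @ s) / (s @ D @ s) # 6 of 7 incident endpoints
frac_cut          = (s @ L @ s) / (s @ D @ s) # the remaining 1 of 7
surprising        = (s @ B @ s) / (s @ s)     # (6 - 7**2 / 14) / 3
```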

Rayleigh quotients and eigenvalues

Basic connection (M spd):

(xᵀ K x)/(xᵀ M x) stationary at x ⇐⇒ K x = λ M x

Easy despite lack of convexity.

Rayleigh quotients big and small

Decompose:

Wᵀ M W = I and Wᵀ K W = Λ = diag(λ_1, …, λ_n).

For any x ≠ 0 with xᵀ M x = 1, we have

x = Σ_j w_j z_j,    xᵀ K x = Σ_j λ_j z_j².

So

sᵀ K s / sᵀ M s ≈ λ_max =⇒ s ≈ Σ_{λ_j ≈ λ_max} w_j z_j.

So look at invariant subspaces for extreme eigenvalues.

The dynamics perspective

The random walker

Basic idea: extract structure from random walk.

Start at the seed and walk forward
Day 1: I came up with a funny joke!
Day 2: I tell everyone in my family.
Day 3: My mother tells a friend?

Or look at how quickly the source is forgotten
Day 1: David came up with a funny joke!
Day 2: There’s a joke going around Cornell CS.
Day 3: I read this bad joke on the web...

The random walker

Lazy random walk with transition matrix

S = (αI + A D⁻¹)/(1 + α).

1. Start at p_0, take k steps. Distribution:

   p_k = S^k p_0   (→ d/m as k → ∞)

2. End at q_0 after k steps. Conditional distribution on the start:

   q_k ∝ (Sᵀ)^k q_0   (→ e/n as k → ∞)

Notes:
- If the graph is undirected, Sᵀ = D⁻¹ S D.
- If the graph is also regular, Sᵀ = S.
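A minimal numpy sketch of the lazy walk (the 4-node graph and the choice α = 1 are assumptions for illustration): the forward iteration converges to a distribution proportional to the degree vector, and for this undirected graph Sᵀ = D⁻¹ S D.

```python
import numpy as np

# Lazy walk S = (alpha I + A D^{-1}) / (1 + alpha) on a small toy graph.
alpha = 1.0
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
d = A.sum(axis=0)
S = (alpha * np.eye(4) + A / d) / (1 + alpha)   # A/d = A D^{-1}, columns sum to 1

p = np.zeros(4)
p[0] = 1.0                        # point mass on the seed node
for _ in range(500):
    p = S @ p                     # p_k = S^k p_0 -> proportional to d
```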

Simon-Ando theory

Markov chain with loosely-coupled subchains:
- Rapid local mixing: after a few steps

  p_k ≈ Σ_{j=1}^c α_{j,k} p_∞^{(j)},

  where p_∞^{(j)} is a local equilibrium for the jth subchain
- Slow equilibration: α_{j,k} → α_{j,∞}.

Alternately, rapid local mixing looks like:

q_k ≈ Σ_{j=1}^c γ_{j,k} s_j,

where s_j is an indicator for the nodes in one subchain.

Spectral Simon-Ando picture

Exactly decoupled case (c decoupled chains):
- Eigenvalue one has multiplicity c.
- Eigenvectors of S are local equilibria.
- Eigenvectors of Sᵀ are indicators for chains.
- Rapid mixing =⇒ large gap to λ_{c+1}.

Weakly coupled case:
- Cluster of c eigenvalues near 1.
- Eigenvectors of S are combinations of local equilibria.
- Eigenvectors of Sᵀ are combinations of indicators.
- Large gap between λ_c and λ_{c+1}.
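The weakly coupled picture is easy to reproduce. A toy example of my own (two 5-cliques with one weak tie; the coupling weight is arbitrary): the transition matrix has c = 2 eigenvalues near 1 and a visible gap to the rest.

```python
import numpy as np

# Two 5-cliques with one weak tie: a cluster of c = 2 eigenvalues
# near 1, then a gap down to the fast local-mixing modes.
n = 10
A = np.zeros((n, n))
for block in (range(5), range(5, 10)):
    for i in block:
        for j in block:
            if i != j:
                A[i, j] = 1.0
A[4, 5] = A[5, 4] = 0.05          # weak coupling between the chains
P = A / A.sum(axis=0)             # column-stochastic transition matrix
vals = np.sort(np.abs(np.linalg.eigvals(P)))[::-1]
# vals[0] = 1 and vals[1] sits just below it; then a gap to vals[2]
```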

Beyond eigenvectors

Connections that suggest using dominant eigenvectors:
- Optimization of quadratic forms
- Asymptotic dynamics of random walks

Works well for a few big communities – what about local ones?

Ritz vectors

Variational characterization of eigenpairs:

(xᵀ K x)/(xᵀ M x) stationary at x ⇐⇒ K x = λ M x

Rayleigh-Ritz approximation: x = V y

((V y)ᵀ K (V y))/((V y)ᵀ M (V y)) stationary at y ⇐⇒ (Vᵀ K V) y = λ̂ (Vᵀ M V) y

Krylov + Rayleigh-Ritz

Usual approach to large-scale symmetric eigenproblems:
1. Generate a basis for a Krylov subspace

   K_k(A, x_0) = span{x_0, A x_0, A² x_0, …, A^{k−1} x_0}

2. Use Rayleigh-Ritz on the subspace
3. Re-start if the subspace gets too large

Krylov + Rayleigh-Ritz + dynamics

Eigendecomposition picture:

p_k = Σ_{j=1}^N α_j λ_j^k v_j

Krylov + Rayleigh-Ritz:

{(μ_j, w_j)}_{j=1}^r = Ritz pairs from K_r(S, p_0)

p_k = Σ_{j=1}^r β_j μ_j^k w_j,   k < r

The idea: Ritz subspaces

Idea: Use unconverged subspace computations

Invariant subspace    | Ritz subspace
--------------------- | ------------------
Long random walks     | Short random walks
Global information    | Local information
Potentially expensive | Cheap to compute
May need large space  | Small space okay

Current favorite method

1. Pick “seed” nodes j_1, j_2, …
2. Take short random walks (length k) from each seed
3. Extract a few Ritz vectors from span{q_0, q_1, …, q_{k−1}}
4. Approximate indicators in the span of all Ritz vectors
5. Possibly add more seeds and return to step 1
6. Convert the raw “score” vector to a {0,1} indicator

Wang test graph

[Figure: Wang test graph layout, nodes 1–30]

Spectrum for Wang test graph

[Plot: eigenvalues vs. index]

Zachary Karate graph

[Figure: Zachary karate club graph layout, nodes 1–34]

Spectrum for Karate

[Plot: eigenvalues vs. index]

Football graph

[Figure: college football graph layout, nodes 1–115]

Spectrum for Football

[Plot: eigenvalues vs. index]

Dolphin graph

[Figure: dolphin social network layout, nodes 1–62]

Spectrum for Dolphin

[Plot: eigenvalues vs. index]

Non-overlapping synthetic benchmark (µ = 0.5)

[Figure: spy plot of the benchmark adjacency matrix, 1000 nodes, nz = 15746]

Spectrum for synthetic benchmark

[Plot: eigenvalues vs. index]

Score vector

[Plot: score vs. node index]

Score vector for the two-node seed of 492 and 513 in the first LFR benchmark graph. Ten steps, three Ritz vectors.

Non-overlapping synthetic benchmark (µ = 0.6)

[Figure: spy plot of the benchmark adjacency matrix, 1000 nodes, nz = 15316]

Spectrum for synthetic benchmark

[Plot: eigenvalues vs. index]

Score vector

[Plot: score vs. node index]

Score vector for the two-node seed of 492 and 513 in the first LFR benchmark graph. Ten steps, three Ritz vectors.

Overlapping synthetic benchmark (µ = 0.3)

- 1000 nodes
- 47 communities
- 500 nodes belong to two communities

Spectrum for synthetic benchmark

[Plot: eigenvalues vs. index]

Score vector

[Plot: score vs. node index]

Score vector for the two-node seed of 521 and 892. The desired indicator is in red.

Score vector

[Plot: score vs. node index]

Score vector for the two-node seed of 521 and 892 plus twelve reseeds. The desired indicator is in red.

Conclusions

Classic spectral methods use eigenvectors to cluster, but:
- We don’t need to stop at partitioning!
  - Overlap is okay
  - Key is how we mine the subspace
- We don’t need to stop at eigenvectors!
  - Can also use Ritz vectors
  - Computation is cheap: short random walks

Still very much need:
- Better theoretical grounding
- Better ideas for thresholding
- Testing on large-scale examples