Random Dot Product Graphs

Random Dot Product Graphs Ed Scheinerman Applied Mathematics & Statistics Johns Hopkins University IPAM Intelligent Extraction of Information from Graphs & High Dimensional Data July 26, 2005


Transcript of Random Dot Product Graphs

Page 1: Random Dot Product Graphs

Random Dot Product Graphs

Ed Scheinerman
Applied Mathematics & Statistics
Johns Hopkins University

IPAM: Intelligent Extraction of Information from Graphs & High Dimensional Data
July 26, 2005

Page 2: Random Dot Product Graphs

Coconspirators

• Libby Beer
• John Conroy (IDA)
• Paul Hand (Columbia)
• Miro Kraetzl (DSTO)
• Christine Nickel
• Carey Priebe
• Kim Tucker
• Stephen Young (Georgia Tech)

Page 3: Random Dot Product Graphs

Overview

• Mathematical context

• Modeling networks

• Random dot product model

• The inverse problem

Page 4: Random Dot Product Graphs

Mathematical Context

Page 5: Random Dot Product Graphs

Graphs I Have Loved

• Interval graphs & intersection graphs

• Random graphs

• Random intersection graphs

• Threshold graphs & dot product graphs

Page 6: Random Dot Product Graphs

Interval Graphs

$v \mapsto I_v$

$v \sim w \iff I_v \cap I_w \ne \emptyset$

Page 7: Random Dot Product Graphs

Intersection Graphs

$v \mapsto S_v$

$v \sim w \iff S_v \cap S_w \ne \emptyset$

[Figure: example graph with vertex sets {1}, {1}, {1,2}, {2}]

Page 8: Random Dot Product Graphs

Random Graphs

Erdős-Rényi style…

[Figure: each pair of vertices is an edge with probability $p$, a non-edge with probability $1 - p$]

Randomness is “in” the edges. Vertices are “dumb” placeholders.

Page 9: Random Dot Product Graphs

Random Intersection Graphs

• Assign random sets to vertices.

• Two vertices are adjacent iff their sets intersect.

• Randomness is “in” the vertices.

• Edges reflect relationships between vertices.

Page 10: Random Dot Product Graphs

Threshold Graphs

$v \mapsto x_v \in \mathbb{R}$

$v \sim w \iff x_v + x_w \ge 1$

[Figure: example graph with vertex weights 0.5, 0.6, 0.8, 0.3]

Page 11: Random Dot Product Graphs

Dot Product Graphs

$v \mapsto x_v \in \mathbb{R}^d$

$v \sim w \iff x_v \cdot x_w \ge 1$

[Figure: example graph with vertex vectors [1 0], [2 0], [1 1], [0 1]]

Fractional intersection graphs
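
The adjacency rule on this slide can be sketched in a few lines of Python. This is only an illustration under the threshold-1 rule stated above; the function names `dot` and `dot_product_graph` are hypothetical, and the four vectors are the ones shown in the slide's figure.

```python
from itertools import combinations


def dot(u, v):
    """Ordinary dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def dot_product_graph(vectors, threshold=1.0):
    """Return the set of pairs (i, j), i < j, with x_i . x_j >= threshold."""
    return {
        (i, j)
        for i, j in combinations(range(len(vectors)), 2)
        if dot(vectors[i], vectors[j]) >= threshold
    }


# The four example vectors from the slide.
vectors = [(1, 0), (2, 0), (1, 1), (0, 1)]
edges = dot_product_graph(vectors)
print(sorted(edges))
```

Here [0 1] is adjacent only to [1 1], because its dot product with [1 0] and with [2 0] is 0.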

Page 12: Random Dot Product Graphs

Communication Networks

Page 13: Random Dot Product Graphs

Physical Networks

Telephone

Local area network

Power grid

Internet

Page 14: Random Dot Product Graphs

Social Networks

[Figure: Alice and Bob as vertices A and B; label: 2003-4-10]

Page 15: Random Dot Product Graphs

Social Network Graphs

Vertices (Actors)    Edges (Dyads)
Telephones           Calls
Email addresses      Messages
Computers            IP packets
Human beings         Acquaintance
Academicians         Coauthorship

Page 16: Random Dot Product Graphs

Example: Email at HP

• 485 employees

• 185,000 emails

• Social network (who emails whom) identified 7 “communities”, validated by interviews with employees.

Page 17: Random Dot Product Graphs

Properties of Social Networks

• Clustering

• Low diameter

• Power law

Page 18: Random Dot Product Graphs

Properties of Social Networks

• Clustering

• Low diameter

• Power law

$P[a \sim c \mid a \sim b \sim c] > P[a \sim c]$

[Figure: vertices a, b, c]

Page 19: Random Dot Product Graphs

Properties of Social Networks

• Clustering

• Low diameter

• Power law

“Six degrees of separation”

Page 20: Random Dot Product Graphs

Properties of Social Networks

• Clustering

• Low diameter

• Power law

[Plot: degree histogram, $\log N(d)$ vs. $\log d$]

Page 21: Random Dot Product Graphs

Degree Histogram Example 1

[Plot: number of vertices vs. degree, 2838 vertices]

Page 22: Random Dot Product Graphs

Degree Histogram Example 2

[Plot: number of vertices vs. degree, 16142 vertices]

Page 23: Random Dot Product Graphs

Random Graph Models

Goal: Simple and realistic random graph models of social networks.

Page 24: Random Dot Product Graphs

Erdős-Rényi?

• Low diameter!

• No clustering: P[a~c]=P[a~c|a~b~c].

• No power-law degree distribution.

Not a good model.

Page 25: Random Dot Product Graphs

Model by Fan Chung et al.

Consider only those graphs with $N(d) = \lfloor \alpha d^{-\beta} \rfloor$, with all such graphs equally likely.

Page 26: Random Dot Product Graphs

People as Vectors

$a = [a_1, a_2, a_3, a_4]^T$, $b = [b_1, b_2, b_3, b_4]^T$

Coordinates: Sports, Politics, Movies, Graph theory

Page 27: Random Dot Product Graphs

Shared Interests

$P[a \sim b] = f(a \cdot b)$

Alice and Bob are more likely to communicate when they have more shared interests.

Page 28: Random Dot Product Graphs

Selecting the Function

$P[a \sim b] = f(a \cdot b)$

$f(t) = \frac{1}{\pi} \tan^{-1}(t) + \frac{1}{2}$, $f : [-\infty, +\infty] \to [0, 1]$

Page 29: Random Dot Product Graphs

Selecting the Function

$P[a \sim b] = f(a \cdot b)$

$f(t) = \frac{t}{1+t}$, $a \cdot b \ge 0$, $f : [0, \infty] \to [0, 1]$

Page 30: Random Dot Product Graphs

Selecting the Function

$P[a \sim b] = f(a \cdot b)$

$f(t) = t^r$, $a \cdot b \in [0, 1]$, $f : [0, 1] \to [0, 1]$
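
The three candidate functions from the "Selecting the Function" slides can be written down directly. The sketch below is illustrative only: the names `f_arctan`, `f_ratio`, and `f_power` are hypothetical, and the domain checks simply mirror the domains stated on the slides.

```python
import math


def f_arctan(t):
    """f(t) = (1/pi) * arctan(t) + 1/2, defined for every real t."""
    return math.atan(t) / math.pi + 0.5


def f_ratio(t):
    """f(t) = t / (1 + t), for dot products t >= 0."""
    if t < 0:
        raise ValueError("f_ratio needs a nonnegative dot product")
    return t / (1.0 + t)


def f_power(t, r=1.0):
    """f(t) = t**r, for dot products t in [0, 1]."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("f_power needs a dot product in [0, 1]")
    return t ** r


for name, f in [("arctan", f_arctan), ("ratio", f_ratio), ("power", f_power)]:
    print(name, f(0.5))
```

All three are increasing in t, which matches the idea that more shared interests make communication more likely.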

Page 31: Random Dot Product Graphs

Random Dot Product Graphs, I

Given $x_1, x_2, \ldots, x_n \in \mathbb{R}^d$:

$P[i \sim j] = x_i \cdot x_j$ (or $= f(x_i \cdot x_j)$)

Write $X = [x_1, x_2, \ldots, x_n]$.

$P_X(G) = \left( \prod_{ij \in E} x_i \cdot x_j \right) \times \left( \prod_{ij \notin E,\ i \ne j} (1 - x_i \cdot x_j) \right)$
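
The product formula for $P_X(G)$ is easier to evaluate on a log scale. The following is a minimal sketch, assuming $f(t) = t$ and vectors whose pairwise dot products lie in $[0, 1]$; the function name `log_likelihood` and the three example vectors are hypothetical.

```python
import math
from itertools import combinations


def log_likelihood(vectors, edges):
    """log P_X(G): sum of log(x_i . x_j) over edges and log(1 - x_i . x_j) over non-edges."""
    edge_set = {frozenset(e) for e in edges}
    total = 0.0
    for i, j in combinations(range(len(vectors)), 2):
        p = sum(a * b for a, b in zip(vectors[i], vectors[j]))
        if not 0.0 <= p <= 1.0:
            raise ValueError("dot products must lie in [0, 1] to be probabilities")
        q = p if frozenset((i, j)) in edge_set else 1.0 - p
        if q == 0.0:
            return -math.inf  # the graph has probability zero under X
        total += math.log(q)
    return total


# Hypothetical one-dimensional example with a single edge {0, 1}.
vectors = [(0.5,), (1.0,), (0.5,)]
ll = log_likelihood(vectors, [(0, 1)])
print(math.exp(ll))
```

For this example the edge {0, 1} contributes 0.5 and the two non-edges contribute 0.75 and 0.5, so the product is 0.1875.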

Page 32: Random Dot Product Graphs

Generalize Erdős-Rényi

Take $x_1 = x_2 = \cdots = x_n = x$ with $x \cdot x = p$.

Page 33: Random Dot Product Graphs

Generalize Intersection Graphs

If $i \mapsto A_i \subseteq \{1, 2, \ldots, k\}$, take $x_i = \chi(A_i) \in \{0, 1\}^k$ and

$f(t) = \begin{cases} 0 & t = 0 \\ 1 & t > 0 \end{cases}$

Page 34: Random Dot Product Graphs

Whence the Vectors?

• Vectors are given in advance.

• Vectors chosen (iid) from some distribution.

$P(G) = \int P_X(G) \, dX$

Page 35: Random Dot Product Graphs

Random Dot Product Graphs, II

• Step 1: Pick the vectors.
    - Given by fiat.
    - Chosen iid from a distribution.

• Step 2: For all i < j:
    - Let p = f(x_i · x_j).
    - Insert an edge from i to j with probability p.
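
The two steps above can be sketched as a short sampler. This is an illustration, not the speaker's code: `sample_rdpg` is a hypothetical name, the vectors are drawn iid uniform on [0, 1] in dimension 1, and f(t) = t^r with r = 3 is just one of the choices discussed earlier.

```python
import random
from itertools import combinations


def sample_rdpg(vectors, f=lambda t: t, rng=None):
    """Step 2: for every i < j, insert edge (i, j) with probability f(x_i . x_j)."""
    rng = rng or random.Random()
    edges = set()
    for i, j in combinations(range(len(vectors)), 2):
        p = f(sum(a * b for a, b in zip(vectors[i], vectors[j])))
        if rng.random() < p:
            edges.add((i, j))
    return edges


# Step 1 (assumed here): vectors chosen iid uniform on [0, 1], dimension 1.
rng = random.Random(0)
n, r = 200, 3
xs = [(rng.random(),) for _ in range(n)]
edges = sample_rdpg(xs, f=lambda t: t ** r, rng=rng)
print(len(edges), "edges on", n, "vertices")
```

This loop visits every pair, so it takes time proportional to n^2 regardless of how sparse the result is.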

Page 36: Random Dot Product Graphs

Megageneralization

• Generalization of:
    - Intersection graphs (ordinary & random)
    - Threshold graphs
    - Dot product graphs
    - Erdős-Rényi random graphs

• Randomness is “in” both the vertices and the edges.

• The events a ~ b and c ~ d are independent when a, b, c, d are distinct.

Page 37: Random Dot Product Graphs

Results in Dimension 1

Choose $x_i$ iid uniform in $[0, 1]$ and use $f(t) = t^r$.

Equivalently, choose $x_i$ independently from $U^r[0, 1]$ with $P(i \sim j) = x_i x_j$, i.e. $f(t) = t$.

Page 38: Random Dot Product Graphs

Probability/Number of Edges

$P[i \sim j] = \int_0^1 \int_0^1 (x_i x_j)^r \, dx_i \, dx_j = \frac{1}{(1+r)^2}$

Expected number of edges $= \binom{n}{2} (1+r)^{-2}$.
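
The double integral factors because the integrand is a product of a function of $x_i$ and a function of $x_j$. A short worked version, with the case $r = 1$ added purely as an illustration:

```latex
\[
P[i \sim j]
  = \int_0^1 \!\! \int_0^1 (x_i x_j)^r \, dx_i \, dx_j
  = \left( \int_0^1 x^r \, dx \right)^{2}
  = \left( \frac{1}{1+r} \right)^{2}.
\]
% Illustration: for r = 1 the edge probability is 1/4,
% so the expected number of edges is \binom{n}{2} / 4.
```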

Page 39: Random Dot Product Graphs

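Clustering

$P[a \sim c \mid a \sim b \sim c] = \frac{P[a \sim c \ \&\ a \sim b \sim c]}{P[a \sim b \sim c]} = \frac{\iiint (xy)^r (xz)^r (yz)^r \, dx \, dy \, dz}{\iiint (xy)^r (yz)^r \, dx \, dy \, dz} = \frac{(1+r)^2 (1+2r)}{(1+2r)^3} > \frac{1}{(1+r)^2} = P[a \sim c]$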
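
The two triple integrals can be evaluated coordinate by coordinate. The sketch below fills in that step, writing $x, y, z$ for the values attached to $a, b, c$ as on the slide; the $r = 1$ numbers are an added illustration.

```latex
\begin{align*}
\iiint (xy)^r (xz)^r (yz)^r \, dx \, dy \, dz
  &= \left( \int_0^1 t^{2r} \, dt \right)^{3} = \frac{1}{(1+2r)^3}, \\
\iiint (xy)^r (yz)^r \, dx \, dy \, dz
  &= \int_0^1 x^r \, dx \int_0^1 y^{2r} \, dy \int_0^1 z^r \, dz
   = \frac{1}{(1+r)^2 (1+2r)}, \\
P[a \sim c \mid a \sim b \sim c]
  &= \frac{(1+r)^2 (1+2r)}{(1+2r)^3}
   = \left( \frac{1+r}{1+2r} \right)^{2}.
\end{align*}
% The inequality with 1/(1+r)^2 is equivalent to (1+r)^2 > 1+2r, i.e. r^2 > 0.
% Illustration: for r = 1 this gives 4/9 versus P[a ~ c] = 1/4.
```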

Page 40: Random Dot Product Graphs

Power Law

Believe: $N(d) \propto d^{-c}$, $c = 1 - 1/r$

Can show: $N((1-\varepsilon)d, (1+\varepsilon)d) \propto 2\varepsilon d^{-c}$
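
A heuristic for where the exponent $c = 1 - 1/r$ comes from, offered only as a sketch and not as the talk's proof: it treats each vertex's degree as equal to its expected degree and ignores random fluctuation.

```latex
% Assumption: x_i iid uniform on [0,1], P[i ~ j] = (x_i x_j)^r.
% Expected degree of a vertex with value x:
\[
d(x) = (n-1) \, x^r \int_0^1 y^r \, dy \approx \frac{n \, x^r}{1+r}
\quad\Longrightarrow\quad
x = \left( \frac{(1+r)\, d}{n} \right)^{1/r}.
\]
% Since x is uniform, the number of vertices with degree near d scales like
\[
n \, \frac{dx}{dd} \;\propto\; d^{\,1/r - 1} = d^{-(1 - 1/r)} .
\]
```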

Page 41: Random Dot Product Graphs

Power Law Example

$n = 30000$

$P[i \sim j] = (x_i x_j)^3$

Page 42: Random Dot Product Graphs

Isolated Vertices

$E[N(0)] \sim C_r n^{(r-1)/r} = o(n)$, where $C_r = \frac{(1+r)^{1/r} \, \Gamma(1/r)}{r}$.

Thus, the graph is not connected, but…

Page 43: Random Dot Product Graphs

“Mostly” Connected

“Giant” connected component

A “few” isolated vertices

Page 44: Random Dot Product Graphs

Six Degrees of Separation

Diameter ≤ 6

Page 45: Random Dot Product Graphs

Diameter ≤ 6 Proof Outline

[Figure: vertices split into $\{i : x_i \ge \tau\}$ (diameter = 2) and $\{i : x_i < \tau\}$; labels: attached, attached pair, isolated]

Page 46: Random Dot Product Graphs

Diameter ≤ 6 Proof Outline

Page 47: Random Dot Product Graphs

Graphs to Vectors

The Inverse Problem

Page 48: Random Dot Product Graphs

Given Graphs, Find Vectors

• Given: A graph, or a series of graphs, on a common vertex set.

• Problem: Find vectors to assign to vertices that “best” model the graph(s).

Page 49: Random Dot Product Graphs

Maximum Likelihood Method

• Feasible in dimension 1. Awful for d > 1.

• Nice results for f(t) = t / (1+t).

$\arg\max_X \{ P_X(G) \}$

Page 50: Random Dot Product Graphs

Gram Matrix Approach

Given $G_1, G_2, \ldots, G_m$.

Let $A = \frac{1}{m} \sum_{j=1}^{m} A(G_j)$.

$\therefore a_{ij} \approx P[i \sim j] = x_i \cdot x_j$ $(i \ne j)$

$X = [x_1, x_2, \ldots, x_n]$ $(d \times n)$

$A \approx X^T X$
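
Forming the averaged matrix $A$ is a one-liner with NumPy. The sketch below assumes the graphs are supplied as 0/1 adjacency matrices on a common vertex set; the function name `average_adjacency` and the two small example graphs are hypothetical.

```python
import numpy as np


def average_adjacency(adjacency_matrices):
    """Entrywise mean of m adjacency matrices on a common vertex set."""
    stack = np.asarray(adjacency_matrices, dtype=float)
    if stack.ndim != 3 or stack.shape[1] != stack.shape[2]:
        raise ValueError("expected m square matrices of the same size")
    return stack.mean(axis=0)


# Two hypothetical graphs on three vertices.
A1 = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
A2 = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
A_bar = average_adjacency([A1, A2])
print(A_bar)
```

Entry (i, j) of the result is the fraction of graphs containing the edge ij, which is the empirical estimate of P[i ~ j].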

Page 51: Random Dot Product Graphs

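Wrong Best Solution

Minimize $f(X) = \| A - X^T X \|_F^2$.

$A = U^T \Lambda U$; $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$

$X = g_d(A) := \begin{bmatrix} \sqrt{\lambda_1^+} & 0 & \cdots & 0 \\ 0 & \sqrt{\lambda_2^+} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sqrt{\lambda_d^+} \end{bmatrix} \begin{bmatrix} u_1^T \\ u_2^T \\ \vdots \\ u_d^T \end{bmatrix}$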
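
A minimal NumPy sketch of the map $g_d$, under two assumptions that go beyond the slide: $\lambda^+$ is read as $\max(\lambda, 0)$, and each eigenvector is scaled by $\sqrt{\lambda^+}$ so that $X^T X$ reproduces the top-$d$ part of $A$. The function name `g_d` follows the slide; the test matrix is made up.

```python
import numpy as np


def g_d(A, d):
    """Return a d x n matrix X built from the d largest eigenpairs of symmetric A."""
    eigenvalues, eigenvectors = np.linalg.eigh(A)    # eigenvalues in ascending order
    top = np.argsort(eigenvalues)[::-1][:d]          # indices of the d largest
    lam_plus = np.clip(eigenvalues[top], 0.0, None)  # positive part (assumed meaning of +)
    # Row k of the result is sqrt(lambda_k^+) * u_k^T.
    return np.sqrt(lam_plus)[:, None] * eigenvectors[:, top].T


# Sanity check on a matrix that is exactly a rank-2 Gram matrix.
rng = np.random.default_rng(0)
X0 = rng.random((2, 5))
A = X0.T @ X0
X = g_d(A, 2)
print(np.allclose(X.T @ X, A))
```

The recovered X generally differs from X0 by an orthogonal transformation, since only the dot products are determined by A.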

Page 52: Random Dot Product Graphs

Real Problem

Minimize $f(X) = \| A - X^T X + I \circ (X^T X) \|_F^2$.

We don't want $x_i \cdot x_i \approx 0 = a_{ii}$.

Idea: $a_{ii} \leftarrow x_i \cdot x_i$

Page 53: Random Dot Product Graphs

Iterative Algorithm

1. $D = 0_{n \times n}$

2. $X = g_d(A + D)$

3. $D = I \circ (X^T X)$

4. Go to 2

Minimize $f(X) = \| A - X^T X + I \circ (X^T X) \|_F^2$.
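
The four steps can be sketched as a short loop. This is an illustration under stated assumptions, not the speaker's implementation: $g_d$ is taken to scale the top-$d$ eigenvectors by the square root of the positive part of the eigenvalues, the loop runs a fixed number of iterations instead of testing convergence, and the example graph (two triangles joined by an edge) is made up.

```python
import numpy as np


def g_d(A, d):
    """d x n matrix from the d largest eigenpairs of symmetric A (positive part)."""
    eigenvalues, eigenvectors = np.linalg.eigh(A)
    top = np.argsort(eigenvalues)[::-1][:d]
    lam_plus = np.clip(eigenvalues[top], 0.0, None)
    return np.sqrt(lam_plus)[:, None] * eigenvectors[:, top].T


def objective(A, X):
    """f(X) = || A - X^T X + I o (X^T X) ||_F^2, i.e. the off-diagonal misfit."""
    G = X.T @ X
    R = A - G + np.diag(np.diag(G))
    return float(np.sum(R ** 2))


def fit_vectors(A, d, iterations=100):
    n = A.shape[0]
    D = np.zeros((n, n))                  # step 1
    for _ in range(iterations):
        X = g_d(A + D, d)                 # step 2
        D = np.diag(np.diag(X.T @ X))     # step 3: keep only the diagonal of X^T X
    return X                              # step 4 is the loop itself


# Hypothetical example: two triangles joined by the edge {2, 3}.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

X = fit_vectors(A, d=2)
print(objective(A, X))
```

Each pass alternates between fitting X for a fixed diagonal and refreshing the diagonal for a fixed X, so the objective should not increase from one pass to the next when the diagonal of A is zero.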

Page 54: Random Dot Product Graphs

Convergence

If (when) the algorithm converges, then the rows of $X$ are eigenvectors of $A + I \circ (X^T X)$ and $X$ is a local min of $f(X) = \| A - X^T X + I \circ (X^T X) \|_F^2$.

Page 55: Random Dot Product Graphs

Convergence

[Plot: diagonal entries vs. iteration, $G(n = 40, m = 115)$, $d = 2$]

Page 56: Random Dot Product Graphs

Convergence

[Plot: diagonal entries vs. iteration, $G(n = 40, m = 115)$, $d = 5$]

$\max\{x_i \cdot x_j : i \ne j\} = 1.05$

Page 57: Random Dot Product Graphs

Convergence

[Plot: diagonal entries vs. iteration, $G(n = 40, m = 115)$, $d = 12$]

$\max\{x_i \cdot x_j : i \ne j\} = 1.152$

Page 58: Random Dot Product Graphs

Convergence

[Plot: diagonal entries vs. iteration, $G = C_{12}$, $d = 1$, 30 iterations]

Page 59: Random Dot Product Graphs

Convergence

[Plot: diagonal entries vs. iteration, $G = C_{12}$, $d = 1$, 150 iterations]

Page 60: Random Dot Product Graphs

Convergence

[Plot: diagonal entries vs. iteration, $G = C_{12}$, $d = 1$, 500 iterations]

Page 61: Random Dot Product Graphs

Enron example

Page 62: Random Dot Product Graphs

Applications

Network Change/Anomaly Detection

Clustering

Page 63: Random Dot Product Graphs

Change/Anomaly Detection

$\underbrace{G_1, G_2, \ldots, G_r}_{X}$   $\underbrace{H_1, H_2, \ldots, H_s}_{Y}$

Align $X, Y$.

Find $i$ with $\| x_i - y_i \|$ large.
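
One way to make "Align X, Y" concrete is an orthogonal Procrustes fit, since the vectors are only determined up to an orthogonal transformation (it leaves every dot product unchanged). The talk does not say which alignment it uses, so this choice is an assumption, and the names `align` and `vertex_changes` are hypothetical.

```python
import numpy as np


def align(X, Y):
    """Rotate/reflect Y (d x n) onto X (d x n): minimize ||X - Q Y||_F over orthogonal Q."""
    U, _, Vt = np.linalg.svd(X @ Y.T)
    Q = U @ Vt
    return Q @ Y


def vertex_changes(X, Y):
    """Per-vertex distance ||x_i - y_i|| after alignment; large values flag candidates."""
    return np.linalg.norm(X - align(X, Y), axis=0)


# Hypothetical check: Y is X in a rotated coordinate system, so nothing changed.
rng = np.random.default_rng(1)
X = rng.random((2, 6))
theta = 0.7
Q0 = np.array([[np.cos(theta), -np.sin(theta)],
               [np.sin(theta),  np.cos(theta)]])
Y = Q0 @ X
print(vertex_changes(X, Y))
```

Without the alignment step, the raw differences x_i - y_i would be large here even though the two embeddings describe identical dot products.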

Page 64: Random Dot Product Graphs

Change/Anomaly Detection

Page 65: Random Dot Product Graphs

Graph Clustering

Page 66: Random Dot Product Graphs

Graph Clustering

Page 67: Random Dot Product Graphs

Synthetic Lethality Graphs

• Vertices are genes in yeast.

• Edge between u and v iff deleting one of u or v does not kill, but deleting both is lethal.

Page 68: Random Dot Product Graphs

SL Graph Status

• Yeast has about 6000 genes.

• Full graph known on 126 “query” genes (about 1300 edges).

• Partial graph known on 1000 “library” genes.

Page 69: Random Dot Product Graphs
Page 70: Random Dot Product Graphs

What Next?

Page 71: Random Dot Product Graphs

Random Dot Product Graphs

• Extension to higher dimension
    - Cube
    - Unit ball intersect positive orthant

• Small world measures: clustering coefficient

• Other random graph properties

$\gamma(v) = E(N(v)) \div \binom{N(v)}{2}$

$\gamma(G) = \text{average } \gamma(v) : d(v) \ge 2.$
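
A small sketch of the clustering coefficient as defined above, reading $E(N(v))$ as the number of edges among the neighbors of $v$ and $\binom{N(v)}{2}$ as the number of neighbor pairs. The function name `clustering_coefficient` and the four-vertex example are hypothetical.

```python
from itertools import combinations


def clustering_coefficient(n, edges):
    """Return ({v: gamma(v)} for vertices with degree >= 2, average gamma or None)."""
    nbrs = [set() for _ in range(n)]
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    local = {}
    for v in range(n):
        d = len(nbrs[v])
        if d < 2:
            continue  # gamma(v) is only defined when v has at least two neighbors
        links = sum(1 for a, b in combinations(nbrs[v], 2) if b in nbrs[a])
        local[v] = links / (d * (d - 1) / 2)
    average = sum(local.values()) / len(local) if local else None
    return local, average


# Hypothetical example: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
local, gamma_G = clustering_coefficient(4, [(0, 1), (0, 2), (1, 2), (2, 3)])
print(local, gamma_G)
```

In this example vertices 0 and 1 have coefficient 1, vertex 2 has 1/3, and vertex 3 is skipped because its degree is 1.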

Page 72: Random Dot Product Graphs

Vector Estimation

• MLE method
    - Computationally efficient?
    - More useful?

• Eigenvalue method
    - Understand convergence
    - Prove that it globally minimizes
    - Extension to missing data

• Validate against real data

Page 73: Random Dot Product Graphs

Network Evolution

• Communication influences interests:

$X = [x_1, x_2, \ldots, x_n]$

$X(k+1) = F[G, X(k)]$

Page 74: Random Dot Product Graphs

Rapid Generation

• Can we generate a sparse random dot product graph with n vertices and m edges in time O(n+m)?

• Partial answer: Yes, but.

Page 75: Random Dot Product Graphs

The End