Nonparametric Link Prediction in Dynamic Graphs

Purnamrita Sarkar (UC Berkeley), Deepayan Chakrabarti (Facebook), Michael Jordan (UC Berkeley)

Link Prediction: Who is most likely to interact with a given node?

Friend suggestion in Facebook

Should Facebook suggest Alice as a friend for Bob?

Link Prediction

Charlie

Movie recommendation in Netflix

Should Netflix suggest this movie to Alice?

Link Prediction: Prediction using simple features

degree of a node
number of common neighbors
last time a link appeared
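A minimal sketch of these three features for one pair of nodes, assuming each graph snapshot is stored as a dict mapping a node to its set of neighbors. The helper names (`degree`, `common_neighbors`, `last_link`, `make_graph`) and the toy snapshots are illustrative assumptions, not part of the talk.

```python
def make_graph(edges):
    """Build an undirected graph as {node: set of neighbors}."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph

def degree(graph, node):
    """Number of neighbors of `node` in one snapshot."""
    return len(graph.get(node, set()))

def common_neighbors(graph, i, j):
    """Number of nodes adjacent to both i and j in one snapshot."""
    return len(graph.get(i, set()) & graph.get(j, set()))

def last_link(snapshots, i, j):
    """1-based index of the latest snapshot containing edge (i, j); 0 if never linked."""
    for t in range(len(snapshots), 0, -1):
        if j in snapshots[t - 1].get(i, set()):
            return t
    return 0

# Toy data: two snapshots G1, G2 (hypothetical).
snapshots = [
    make_graph([("alice", "bob"), ("bob", "charlie")]),
    make_graph([("alice", "charlie"), ("bob", "charlie")]),
]
latest = snapshots[-1]
features = (
    common_neighbors(latest, "alice", "bob"),
    degree(latest, "bob"),
    last_link(snapshots, "alice", "bob"),
)
```

Each feature only needs the adjacency sets, which is why such features are cheap to compute on a single static graph.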

What if the graph is dynamic?

Related Work

Generative models
Exp. family random graph models [Hanneke+/’06]
Dynamics in latent space [Sarkar+/’05]
Extension of mixed membership block models [Fu+/10]

Other approaches
Autoregressive models for links [Huang+/09]
Extensions of static features [Tylenda+/09]

Link Prediction incorporating graph dynamics, requiring weak modeling assumptions, allowing fast predictions, and offering consistency guarantees.

Outline

Model
Estimator
Consistency
Scalability
Experiments

The Link Prediction Problem in Dynamic Graphs

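G1, G2, …, GT, GT+1

Y1(i,j) = 1, Y2(i,j) = 0, YT+1(i,j) = ?

$Y_{T+1}(i,j) \mid G_1, G_2, \ldots, G_T \sim \mathrm{Bernoulli}\big(g_{G_1, G_2, \ldots, G_T}(i,j)\big)$

Left-hand side: edge in T+1. Argument of the Bernoulli: features of previous graphs and this pair of nodes.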

Including graph-based features

Example set of features for pair (i,j):
cn(i,j) (common neighbors)
ℓℓ(i,j) (last time a link was formed)
deg(j)

Represent dynamics using “datacubes” of these features (≈ multi-dimensional histogram on binned feature values)

Example cell: 1 ≤ cn ≤ 3, 3 ≤ deg ≤ 6, 1 ≤ ℓℓ ≤ 2

ηt = #pairs in Gt with these features

ηt+ = #pairs in Gt with these features, which had an edge in Gt+1

High ηt+/ηt ⇒ this feature combination is more likely to create a new edge at time t+1
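The counts ηt and ηt+ can be sketched as two histograms over binned feature cells. This is a minimal illustration, assuming a single feature (common-neighbor count, capped at 3) and graphs stored as dicts of neighbor sets; `cn_cell`, `datacube`, and the toy graphs are hypothetical names and data.

```python
from collections import Counter
from itertools import combinations

def make_graph(edges):
    """Undirected graph as {node: set of neighbors}."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph

def cn_cell(graph, i, j):
    """Feature cell of a pair: common-neighbor count capped at 3 (assumed binning)."""
    return (min(len(graph[i] & graph[j]), 3),)

def datacube(g_t, g_next, feature=cn_cell):
    """Return (eta, eta_plus) counters over all node pairs of g_t."""
    eta, eta_plus = Counter(), Counter()
    for i, j in combinations(sorted(g_t), 2):
        cell = feature(g_t, i, j)
        eta[cell] += 1                    # pair falls in this cell in G_t
        if j in g_next.get(i, set()):
            eta_plus[cell] += 1           # ... and has an edge in G_{t+1}
    return eta, eta_plus

g1 = make_graph([("a", "b"), ("b", "c"), ("c", "d")])
g2 = make_graph([("a", "b"), ("b", "c"), ("c", "d"), ("a", "c")])
eta, eta_plus = datacube(g1, g2)
ratios = {cell: eta_plus[cell] / eta[cell] for cell in eta}
```

Following the wording above, ηt+ counts every pair that has an edge in Gt+1, including edges that already existed in Gt.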

G1, G2, …, GT

Y1(i,j) = 1, Y2(i,j) = 0, YT+1(i,j) = ?

1 ≤ cn(i,j) ≤ 3, 3 ≤ deg(i,j) ≤ 6, 1 ≤ ℓℓ(i,j) ≤ 2

Including graph-based features

How do we form these datacubes?
Vanilla idea: one datacube for Gt→Gt+1, aggregated over all pairs (i,j)
Does not allow for differently evolving communities

YT+1(i,j) = ?

1 ≤ cn(i,j) ≤ 3, 3 ≤ deg(i,j) ≤ 6, 1 ≤ ℓℓ(i,j) ≤ 2

Our Model

How do we form these datacubes?
Our Model: one datacube for each neighborhood
Captures local evolution

G1, G2, …, GT

Y1(i,j) = 1, Y2(i,j) = 0

Our Model

Datacube

Number of node pairs
- with feature s
- in the neighborhood of i
- at time t

Number of node pairs
- with feature s
- in the neighborhood of i
- at time t
- which got connected at time t+1

Example cell: 1 ≤ cn(i,j) ≤ 3, 3 ≤ deg(i,j) ≤ 6, 1 ≤ ℓℓ(i,j) ≤ 2

Neighborhood Nt(i)= nodes within 2 hops

Features extracted from (Nt-p,…Nt)
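A per-neighborhood datacube can be sketched by restricting the pair counts to the 2-hop neighborhood of a node. This sketch assumes that a pair belongs to the neighborhood of i when both endpoints are within 2 hops of i, and it uses a single-snapshot common-neighbor feature rather than features over a window (Nt-p,…,Nt); both choices, and all function names, are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

def make_graph(edges):
    """Undirected graph as {node: set of neighbors}."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph

def two_hop_neighborhood(graph, i):
    """Nodes within 2 hops of i, including i itself."""
    hood = {i} | graph.get(i, set())
    for u in graph.get(i, set()):
        hood |= graph.get(u, set())
    return hood

def cn_cell(graph, u, v):
    """Feature cell of a pair: common-neighbor count capped at 3 (assumed binning)."""
    return (min(len(graph[u] & graph[v]), 3),)

def local_datacube(g_t, g_next, i, feature=cn_cell):
    """(eta, eta_plus) counted only over pairs inside the 2-hop neighborhood of i."""
    eta, eta_plus = Counter(), Counter()
    for u, v in combinations(sorted(two_hop_neighborhood(g_t, i)), 2):
        cell = feature(g_t, u, v)
        eta[cell] += 1
        if v in g_next.get(u, set()):
            eta_plus[cell] += 1
    return eta, eta_plus

# Path a-b-c-d-e; at the next step only the (a, c) triangle closes.
g_t = make_graph([("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")])
g_next = make_graph([("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("a", "c")])
eta_a, plus_a = local_datacube(g_t, g_next, "a")
eta_e, plus_e = local_datacube(g_t, g_next, "e")
```

In this toy example the two ends of the path see the same η counts but different η+ counts, which is the kind of local difference a single aggregated datacube would average away.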

Our Model

Datacube dt(i) captures graph evolution in the local neighborhood of a node in the recent past

Model:

$Y_{T+1}(i,j) \mid G_1, G_2, \ldots, G_T \sim \mathrm{Bernoulli}\big(g(d_t(i), s_t(i,j))\big)$

$s_t(i,j)$: features of the pair
$d_t(i)$: local evolution patterns

What is g(.)?

Outline

Kernel Estimator for g

G1, G2, ……, GT-2, GT-1, GT

query datacube at T-1 and feature vector at time T

compute similarities with the (datacube, feature) pairs of earlier timesteps (…, t=3, …)

Factorize the similarity function
Allows computation of g(.) via simple lookups

$K\big(d_t(i), d_{t'}(i')\big) \cdot I\{s_t(i,j) == s_{t'}(i',j')\}$

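G1, G2, ……, GT-2, GT-1, GT

datacubes at t=1, t=2, t=3, …

compute similarities only between datacubes

Datacube counts: (η1, η1+), (η2, η2+), (η3, η3+), (η4, η4+), with similarity weights w1, w2, w3, w4

$\dfrac{w_1 \eta_1^{+} + w_2 \eta_2^{+} + w_3 \eta_3^{+} + w_4 \eta_4^{+}}{w_1 \eta_1 + w_2 \eta_2 + w_3 \eta_3 + w_4 \eta_4}$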

Factorize the similarity function
Allows computation of g(.) via simple lookups
What is K(·,·)?

$K\big(d_t(i), d_{t'}(i')\big) \cdot I\{s_t(i,j) == s_{t'}(i',j')\}$
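With the similarity factorized this way, the estimate for a query cell reduces to lookups in past datacubes. The sketch below assumes the estimate is a ratio of similarity-weighted η+ counts to similarity-weighted η counts, as the weights and counts on this slide suggest; the weights are taken as given, and the function name and toy numbers are hypothetical.

```python
def estimate_link_probability(query_cell, weights, cubes):
    """Ratio of weighted eta_plus to weighted eta for one feature cell.

    cubes:   list of (eta, eta_plus) dicts, one per past datacube
    weights: one datacube similarity K(query cube, past cube) per entry
    Looking up only `query_cell` plays the role of the indicator I{s == s'}.
    """
    num = sum(w * plus.get(query_cell, 0) for w, (_, plus) in zip(weights, cubes))
    den = sum(w * eta.get(query_cell, 0) for w, (eta, _) in zip(weights, cubes))
    return num / den if den > 0 else None   # None: the cell was never observed

# Two past datacubes with assumed similarities 1.0 and 0.5 to the query datacube.
cubes = [
    ({"s": 10, "r": 4}, {"s": 5, "r": 1}),
    ({"s": 20}, {"s": 2}),
]
weights = [1.0, 0.5]
p_s = estimate_link_probability("s", weights, cubes)
p_r = estimate_link_probability("r", weights, cubes)
p_unseen = estimate_link_probability("z", weights, cubes)
```

Because the feature part of the similarity is an exact-match indicator, only the datacube similarities need to be computed; everything else is a dictionary lookup.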

Similarity between two datacubes

Idea 1
For each cell s, take (η1+/η1 − η2+/η2)² and sum

Problem: magnitude of η is ignored; 5/10 and 50/100 are treated equally

Consider the distribution

Datacubes being compared: (η1, η1+) and (η2, η2+)

Similarity between two datacubes

For datacubes $d$ and $d'$: $K(d, d') = b^{\mathrm{dist}(d, d')}$, $0 < b < 1$

As $b \to 0$, $K(d, d') \to 0$ unless $\mathrm{dist}(d, d') = 0$

Idea 2
For each cell s, compute the posterior distribution of the edge creation probability
dist = total variation distance between distributions, summed over all cells

Datacubes being compared: (η1, η1+) and (η2, η2+)
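Idea 2 can be sketched numerically. The code below assumes a uniform prior, so the posterior for a cell with counts (η, η+) is Beta(η+ + 1, η − η+ + 1), and it assumes a kernel that decays as b raised to the distance with 0 < b < 1; the slide does not state the prior, so treat both as illustrative assumptions. The total variation distance is approximated with a midpoint rule.

```python
import math

def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x in (0, 1), computed in log space for stability."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))

def cell_tv_distance(eta1, plus1, eta2, plus2, grid=2000):
    """Total variation distance between the two Beta posteriors (midpoint rule)."""
    total = 0.0
    for k in range(grid):
        x = (k + 0.5) / grid
        p = beta_pdf(x, plus1 + 1, eta1 - plus1 + 1)
        q = beta_pdf(x, plus2 + 1, eta2 - plus2 + 1)
        total += abs(p - q)
    return 0.5 * total / grid

def kernel(cube1, cube2, b=0.5):
    """b ** (sum of per-cell TV distances); cubes map cell -> (eta, eta_plus)."""
    cells = set(cube1) | set(cube2)
    dist = sum(
        cell_tv_distance(*cube1.get(s, (0, 0)), *cube2.get(s, (0, 0)))
        for s in cells
    )
    return b ** dist

same = cell_tv_distance(10, 5, 10, 5)        # identical counts
scaled = cell_tv_distance(10, 5, 100, 50)    # same ratio, different magnitude
k_same = kernel({"s": (10, 5)}, {"s": (10, 5)})
k_scaled = kernel({"s": (10, 5)}, {"s": (100, 50)})
```

Unlike the squared difference of ratios, this distance separates 5/10 from 50/100, because the posterior for 50/100 is much more concentrated around 0.5.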

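$\hat{g}(\cdot,\cdot) = \hat{h}(\cdot,\cdot) \,/\, \hat{f}(\cdot,\cdot)$

$\hat{h}(\cdot,\cdot) = \frac{1}{\#} \sum K(\cdot,\cdot)\, \eta^{+}_{t+1}$

$\hat{f}(\cdot,\cdot) = \frac{1}{\#} \sum K(\cdot,\cdot)\, \eta_{t+1}$

Want to show: $\hat{g} \to g$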

Outline

Consistency of Estimator

Lemma 1: As T→∞, for some R>0,

Proof using:

$\hat{g}(\cdot,\cdot) = \hat{h}(\cdot,\cdot) \,/\, \hat{f}(\cdot,\cdot)$

As T→∞,

Lemma 2: As T→∞,

$\hat{g}(\cdot,\cdot) = \hat{h}(\cdot,\cdot) \,/\, \hat{f}(\cdot,\cdot)$

Assumption: finite graph

Proof sketch:
Dynamics are Markovian with finite state space
⇒ the chain must eventually enter a closed, irreducible communication class
⇒ geometric ergodicity if class is aperiodic (if not, more complicated…)
⇒ strong mixing with exponential decay
⇒ variances decay as o(1/T)

Theorem:

Proof Sketch:

for some R>0

Outline

Scalability

Full solution: summing over all n datacubes for all T timesteps
Infeasible

Approximate solution: sum over nearest neighbors of query datacube

How do we find nearest neighbors?
Locality Sensitive Hashing (LSH) [Indyk+/98, Broder+/98]

Using LSH

Devise a hashing function for datacubes such that “similar” datacubes tend to be hashed to the same bucket
“Similar” = small total variation distance between cells of datacubes

Using LSH

Step 1: Map datacubes to bit vectors

Use B1 buckets to discretize [0,1]
Use B2 bits for each bucket
For probability mass p the first bits are set to 1

Total M*B1*B2 bits, where M = max number of occupied cells << total number of cells

Using LSH

Step 1: Map datacubes to bit vectors
Total variation distance corresponds to L1 distance between distributions, which corresponds to Hamming distance between bit vectors

Step 2: Hash function = k out of M*B1*B2 bits
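The two steps can be sketched for a single cell. The sketch assumes that a bucket with probability mass p gets its first round(p * B2) bits set to 1 (the exact count is not given on the slide), and that the hash samples k fixed bit positions; a full datacube would concatenate one such vector per occupied cell. Function names and the example distributions are hypothetical.

```python
import random

def to_bits(masses, b2):
    """Unary-encode each bucket's probability mass with b2 bits (assumed rounding)."""
    bits = []
    for p in masses:
        ones = round(p * b2)
        bits.extend([1] * ones + [0] * (b2 - ones))
    return bits

def hamming(u, v):
    """Number of positions where two equal-length bit vectors differ."""
    return sum(a != b for a, b in zip(u, v))

def make_hash(num_bits, k, seed=0):
    """Hash function that reads k fixed, randomly chosen bit positions."""
    positions = random.Random(seed).sample(range(num_bits), k)
    return lambda bits: tuple(bits[p] for p in positions)

B1, B2 = 4, 8                       # buckets over [0,1], bits per bucket
post1 = [0.5, 0.5, 0.0, 0.0]        # discretized posterior of one cell
post2 = [0.5, 0.25, 0.25, 0.0]
bits1, bits2 = to_bits(post1, B2), to_bits(post2, B2)

l1 = sum(abs(p - q) for p, q in zip(post1, post2))
ham = hamming(bits1, bits2)         # equals B2 * l1 for these inputs

h = make_hash(B1 * B2, k=6)
key1, key2 = h(bits1), h(bits2)     # equal keys land in the same hash bucket
```

With the unary encoding, each bucket contributes B2 times its absolute mass difference to the Hamming distance, so vectors that are close in L1 distance agree on most sampled bits and are likely to collide.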

Fast Search Using LSH

[Figure: example bit vectors]

Outline

Experiments

Baselines
LL: last link (time of last occurrence of a pair)
CN: rank by number of common neighbors in
AA: more weight to low-degree common neighbors
Katz: accounts for longer paths
CN-all: apply CN to
AA-all, Katz-all: similar

Pick random subset S from nodes with degree>0 in GT+1
For each s in S, predict a ranked list of nodes likely to link to s
Report mean AUC (higher is better)
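The evaluation metric can be sketched as follows, assuming that for each test node s the predictor gives a score to every candidate node and that the true neighbors of s in the test graph are known. AUC is computed here as the fraction of (linked, non-linked) candidate pairs ranked correctly, with ties counted as half; the function names and scores are hypothetical.

```python
def auc(scores, positives):
    """AUC of one ranked list: P(a true neighbor outranks a non-neighbor)."""
    pos = [v for node, v in scores.items() if node in positives]
    neg = [v for node, v in scores.items() if node not in positives]
    if not pos or not neg:
        return None                      # undefined without both classes
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def mean_auc(per_node):
    """Average AUC over test nodes, skipping nodes where AUC is undefined."""
    values = [a for a in (auc(s, p) for s, p in per_node) if a is not None]
    return sum(values) / len(values)

# Scores for candidates of two test nodes, with their true neighbors in the test graph.
node1 = ({"b": 0.9, "c": 0.4, "d": 0.7, "e": 0.1}, {"b", "c"})
node2 = ({"a": 0.8, "c": 0.2, "d": 0.1}, {"a"})
auc1 = auc(*node1)
auc2 = auc(*node2)
overall = mean_auc([node1, node2])
```

A value of 0.5 corresponds to a random ranking and 1.0 to a ranking that places every true neighbor above every non-neighbor.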

Training data: G1, G2, …, GT

Test data: GT+1

Simulations

Social network model of Hoff et al.
Each node has an independently drawn feature vector
Edge(i,j) depends on features of i and j

Seasonality effect
Feature importance varies with season ⇒ different communities in each season
Feature vectors evolve smoothly over time ⇒ evolving community structures

Simulations

NonParam is much better than others in the presence of seasonality

CN, AA, and Katz implicitly assume smooth evolution

Sensor Network*

* www.select.cs.cmu.edu/data

Summary

Link formation is assumed to depend on the neighborhood’s evolution over a time window

Admits a kernel-based estimator
Consistency
Scalability via LSH

Works particularly well for
Seasonal effects
Differently evolving communities
