Inferring Network Structure from Indirect … › ~gsp16 › rabat.pdfInferring Network Structure...

25
Inferring Network Structure from Indirect Observations Mike Rabbat McGill University, Montr´ eal, Canada GSP Workshop, Philadelphia, May 2016 1 / 25

Transcript of Inferring Network Structure from Indirect … › ~gsp16 › rabat.pdfInferring Network Structure...

Page 1: Inferring Network Structure from Indirect … › ~gsp16 › rabat.pdfInferring Network Structure from Indirect Observations Mike Rabbat McGill University, Montr eal, Canada GSP Workshop,

Inferring Network Structure from IndirectObservations

Mike Rabbat

McGill University, Montreal, Canada

GSP Workshop, Philadelphia, May 2016

1 / 25

Page 2: Inferring Network Structure from Indirect … › ~gsp16 › rabat.pdfInferring Network Structure from Indirect Observations Mike Rabbat McGill University, Montr eal, Canada GSP Workshop,

Motivation

• Need “the graph(s)” to do graph signal processing

• Fundamental problem: Given signals, recover the graph

2 / 25

Page 3: Inferring Network Structure from Indirect … › ~gsp16 › rabat.pdfInferring Network Structure from Indirect Observations Mike Rabbat McGill University, Montr eal, Canada GSP Workshop,

The Agenda

• Historical overview

• Case study: Inferring network structure from cascades

• Open directions

3 / 25

Page 4: Inferring Network Structure from Indirect … › ~gsp16 › rabat.pdfInferring Network Structure from Indirect Observations Mike Rabbat McGill University, Montr eal, Canada GSP Workshop,

Combinatorial vs. Statistical Approaches

Combinatorial Graph Reconstruction

• Let G = (V,E) be an graph

• A card Gi = V \ vi is a vertex-deleted subgraph

• The deck is the set of all cards

Deck(G) = {Gi = V \ vi : vi ∈ V }

• Does Deck(G) uniquely determine G?

• How many cards are necessary?

4 / 25

Page 5: Inferring Network Structure from Indirect … › ~gsp16 › rabat.pdfInferring Network Structure from Indirect Observations Mike Rabbat McGill University, Montr eal, Canada GSP Workshop,

Statistical Approaches – Historical Overview

• Covariance selection (Dempster ’72)

• Estimate Σ = E[xx>

]from x1,x2, . . . ,xm

i.i.d.∼ N (0,Σ)• Explore sparsity of Σ−1

• Σ−1i,j = 0 encodes conditional independence relationships

• Theoretical and computational advances• Meinshausen and Buhlmann (2006)• Ravikumar, Wainwright, and Lafferty (2010)• Friedman, Hastie, and Tibshirani (2007)• Banerjee, El Ghaoui, and d’Aspremont (2008)• Marjanovic and Hero (2014)• Main tools: Convex optimization + sparsity/high-d statistics

5 / 25

Page 6: Inferring Network Structure from Indirect … › ~gsp16 › rabat.pdfInferring Network Structure from Indirect Observations Mike Rabbat McGill University, Montr eal, Canada GSP Workshop,

Statistical Approaches – Historical Overview

• Learning probabilistic graphical models• Beyond Gaussian/Ising• Survey by Heckerman (1995)• Main tools: Search over discrete space of graphs

6 / 25

Page 7: Inferring Network Structure from Indirect … › ~gsp16 › rabat.pdfInferring Network Structure from Indirect Observations Mike Rabbat McGill University, Montr eal, Canada GSP Workshop,

Statistical Approaches – Historical Overview

Learning causal networks

• Edge u→ v iff ¬v =⇒ ¬u• Interventions: Pearl (2000)

• Predictive: Granger (1969)

Other network representations

• Partial correlations

• Structural equation models

• ...

7 / 25

Page 8: Inferring Network Structure from Indirect … › ~gsp16 › rabat.pdfInferring Network Structure from Indirect Observations Mike Rabbat McGill University, Montr eal, Canada GSP Workshop,

Application-Specific Approaches

Network tomography

• Castro, Coates, Liang, Nowak, and Yu, Statistical Science,2004

S

R1 R2 R3

S

R1 R2 R3

S

R1 R2 R3

8 / 25

Page 9: Inferring Network Structure from Indirect … › ~gsp16 › rabat.pdfInferring Network Structure from Indirect Observations Mike Rabbat McGill University, Montr eal, Canada GSP Workshop,

The Graph Signal Processing View

Session: Thursday P.M.

9 / 25

Page 10: Inferring Network Structure from Indirect … › ~gsp16 › rabat.pdfInferring Network Structure from Indirect Observations Mike Rabbat McGill University, Montr eal, Canada GSP Workshop,

Relevant Questions

Identifiability: Do the observations (asymptotically) uniquelyidentify G?

Consistency: Does G (asymptotically) converge to G?

Statistical complexity: How many samples are required forG = G w.h.p.?

Computational complexity: To compute G from observations?

10 / 25

Page 11: Inferring Network Structure from Indirect … › ~gsp16 › rabat.pdfInferring Network Structure from Indirect Observations Mike Rabbat McGill University, Montr eal, Canada GSP Workshop,

Case Study

M. Gomez-Rodriguez, L. Song, H. Daneshmand, and B. Scholkopf,“Estimating diffusion network stuctures”, ICML 2014.

11 / 25

Page 12: Inferring Network Structure from Indirect … › ~gsp16 › rabat.pdfInferring Network Structure from Indirect Observations Mike Rabbat McGill University, Montr eal, Canada GSP Workshop,

Observations of Cascades

a

c

b

d

e

f

a

c

b

d

e

f

tb=0 te=2.1

1.8

tf=3.2

a

c

b

d

e

f

tb=0 te=2.1

1.8

tf=3.2tc=∞

ta=∞

G cascade observation

Observe multiple cascades t1, t2, . . . , tm where tcv ∈ [0, T c] ∪ {∞}

Each cascade is a graph signal (of exposure times)

12 / 25

Page 13: Inferring Network Structure from Indirect … › ~gsp16 › rabat.pdfInferring Network Structure from Indirect Observations Mike Rabbat McGill University, Montr eal, Canada GSP Workshop,

Continuous-time diffusion modelGomez-Rodriguez et al. (2011)

1. Cascade begins at seed node si.i.d.∼ P(s)

2. Subsequent transmission over edge j → i governed by density

f(ti|tj ;αji) = f(ti − tj ;αji)

13 / 25

Page 14: Inferring Network Structure from Indirect … › ~gsp16 › rabat.pdfInferring Network Structure from Indirect Observations Mike Rabbat McGill University, Montr eal, Canada GSP Workshop,

Examples

• Exponential

f(τ ;αji) = αjie−αjiτ1 {τ ≥ 0}

• Power-law (δ > 0)

f(τ ;αji) =αjiδ

(τδ

)−1−αji

1 {τ ≥ δ}

• Rayleighf(τ ;αji) = αjiτe

−αjiτ2/21 {τ ≥ 0}

14 / 25

Page 15: Inferring Network Structure from Indirect … › ~gsp16 › rabat.pdfInferring Network Structure from Indirect Observations Mike Rabbat McGill University, Montr eal, Canada GSP Workshop,

Continuous-time diffusion modelGomez-Rodriguez et al. (2011)

1. Cascade begins at seed node si.i.d.∼ P(s)

2. Subsequent transmission over edge j → i governed by density

f(ti|tj ;αji) = f(ti − tj ;αji)

Also define survival function

s(ti|tj ;αji) = 1−∫ ti

tj

f(t− ti;αji)dt

and hazard function

h(ti|tj ;αji) =f(ti|tj ;αji)s(ti|tj ;αji)

15 / 25

Page 16: Inferring Network Structure from Indirect … › ~gsp16 › rabat.pdfInferring Network Structure from Indirect Observations Mike Rabbat McGill University, Montr eal, Canada GSP Workshop,

Likelihood of infected nodes

Likelihood that a particular node j is the parent of i,

f(ti|tj ;αji)∏

k 6=j:tk<ti

s(ti|tk;αki)

Marginalizing over all nodes infected before i,∑j:tj<ti

f(ti|tj ;αji)∏

k 6=j:tk<ti

s(ti|tk;αki)

=∏

k:tk<ti

s(ti|tk;αki)∑j:tj<ti

h(ti|tj ;αji)

16 / 25

Page 17: Inferring Network Structure from Indirect … › ~gsp16 › rabat.pdfInferring Network Structure from Indirect Observations Mike Rabbat McGill University, Montr eal, Canada GSP Workshop,

Likelihood of a Cascade

Network structure is encoded in matrix A = [αji]For a cascade t = (t1, . . . , tN ) with T = max{ti : ti <∞}

f(t;A) =∏i:ti≤T

∏u:tu>T

s(T |ti;αiu)

×∏

k:tk<ti

s(ti|tk;αki)∑j:tj<ti

h(ti|tj ;αji)

Maximum likelihood: Given t1, . . . , tm, find A to

minimize −∑m

c=1 log f(tc;A)subject to αji ≥ 0, i, j = 1, . . . , N, i 6= j

Note: Problem decouples into independent per-node sub-problems

17 / 25

Page 18: Inferring Network Structure from Indirect … › ~gsp16 › rabat.pdfInferring Network Structure from Indirect Observations Mike Rabbat McGill University, Montr eal, Canada GSP Workshop,

Per-Node Problems

For node i optimize over αi = {αji : j = 1, . . . , N, j 6= i}.

minimize `i(αi)def= −

∑mc=1 gi(t

c;αi)subject to αji ≥ 0, j = 1, . . . , N

where (e.g., for exponential delays)

gi(t;αi) =

log(∑

j:tj<tiαji

)−∑

j:tj<tiαji(ti − tj) if ti ≤ T

−∑

j:tj≤T αji(T − tj) if ti > T

Note: Problem is convex in αi for all models mentioned before

18 / 25

Page 19: Inferring Network Structure from Indirect … › ~gsp16 › rabat.pdfInferring Network Structure from Indirect Observations Mike Rabbat McGill University, Montr eal, Canada GSP Workshop,

Identifiability and Consistency

Are per-node problems identifiable? (`i(α) = `i(α?) iff α = α?)

The Hessian Q(αi) = ∇2`i(αi) has the form

Q(αi) = X(αi)[X(αi)]>

where X(αi) ∈ RN×m with entries

[X(αi)]jc =

(∑

k:tck<tciαki

)−1if tcj < tci

0 if tcj > tci .

Suppose P(s) > 0 for all s = 1, . . . , N .Then, as m→∞, X(αi) is full row-rank.=⇒ Q(αi) is positive definite=⇒ If α 6= α? then `i(α) > `i(α

?).Consistency also follows since the problem is strictly convex.

19 / 25

Page 20: Inferring Network Structure from Indirect … › ~gsp16 › rabat.pdfInferring Network Structure from Indirect Observations Mike Rabbat McGill University, Montr eal, Canada GSP Workshop,

Finite Sample Recovery

When can we recover G (w.h.p.) from a finite number cascades?

• Are some networks more difficult to recover than others?

• What kind of cascades are needed for recovery?

Examine “population” log-likelihood E [`i(α)] = E [gi(t;α)].

20 / 25

Page 21: Inferring Network Structure from Indirect … › ~gsp16 › rabat.pdfInferring Network Structure from Indirect Observations Mike Rabbat McGill University, Montr eal, Canada GSP Workshop,

Let Q?def= ∇2E [gi(t;α)] |α=α? .

Let P denote the true set of parents.

Dependency Condition: There exist constants Cmin, Cmax > 0such that

Cmin < λmin(Q?PP ) < λmax(Q?PP ) < Cmax.

Incoherence Condition: There exists a constant ε ∈ (0, 1] suchthat

‖Q?P cP (Q?PP )−1‖∞ ≤ 1− ε.

21 / 25

Page 22: Inferring Network Structure from Indirect … › ~gsp16 › rabat.pdfInferring Network Structure from Indirect Observations Mike Rabbat McGill University, Montr eal, Canada GSP Workshop,

Conditions depend on A = [αji] and P(s)

1

0

3

2

5

4

7

6

2

1

3

0

0

1

2

3

4

n

22 / 25

Page 23: Inferring Network Structure from Indirect … › ~gsp16 › rabat.pdfInferring Network Structure from Indirect Observations Mike Rabbat McGill University, Montr eal, Canada GSP Workshop,

Finite-Sample Recovery Results

Choose αi to solve the regularized problem

minimize `i(αi) + λ‖αi‖1subject to αji ≥ 0, j = 1, . . . , N

Theorem (Gomez-Rodriguez et al. (2016)): Suppose

λ ≥ C12− εε

√logN

N.

If m > C2|P |3 logN then with probability at least1− 2 exp(C3λ

2m),

1. αi is unique, and

2. αi = α?i .

23 / 25

Page 24: Inferring Network Structure from Indirect … › ~gsp16 › rabat.pdfInferring Network Structure from Indirect Observations Mike Rabbat McGill University, Montr eal, Canada GSP Workshop,

Summary

Identifiability,Consistency,Statistical complexity, andComputational complexity

Open questions for this particular model:

• Noise/uncertainty in observations

• Allow for / model unobserved incubation periods

• Track time-varying αji’s

24 / 25

Page 25: Inferring Network Structure from Indirect … › ~gsp16 › rabat.pdfInferring Network Structure from Indirect Observations Mike Rabbat McGill University, Montr eal, Canada GSP Workshop,

Directions For Topology Identification in GSP

• Inferring topologies such that observed signals are smooth• Generative models (diffusions, Gaussian)• Discriminative

• Inference with confidences (or p(G|X))

• Understand how errors in G impact subsequent GSPoperations (filtering, sampling)

• Infer topology characteristics

[email protected]

25 / 25