Bipartite Edge Prediction via Transductive Learning over...

52
Bipartite Edge Prediction via Transductive Learning over Product Graphs Bipartite Edge Prediction via Transductive Learning over Product Graphs Hanxiao Liu, Yiming Yang School of Computer Science, Carnegie Mellon University July 8, 2015 ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 1

Transcript of Bipartite Edge Prediction via Transductive Learning over...

Page 1: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product Graphs

Bipartite Edge Prediction via TransductiveLearning over Product Graphs

Hanxiao Liu, Yiming Yang

School of Computer Science, Carnegie Mellon University

July 8, 2015

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 1

Page 2: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsProblem Description

Outline

1 Problem Description

2 The Proposed Framework

3 FormulationProduct Graph ConstructionGraph-based Transductive Learning

4 Optimization

5 Experiment

6 Conclusion

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 2

Page 3: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsProblem Description

Problem Description

Many applications involve predicting the edges of a bipartite graph.

I

II

A

B

C

?

?

?

?-2

+5

Graph G Graph H

1 Recommender System2 Host-Pathogen Interaction3 Question-Answering Mapping4 Citation Network . . .

Sometimes, vertex sets on both sides are intrinsically structured.Heterogeneous info: G + H + partial observationsCombine them to make better edge predictions?

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 3

Page 4: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsProblem Description

Problem Description

Many applications involve predicting the edges of a bipartite graph.

I

II

A

B

C

?

?

?

?-2

+5Graph G Graph H

1 Recommender System2 Host-Pathogen Interaction3 Question-Answering Mapping4 Citation Network . . .

Sometimes, vertex sets on both sides are intrinsically structured.

Heterogeneous info: G + H + partial observationsCombine them to make better edge predictions?

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 4

Page 5: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsProblem Description

Problem Description

Many applications involve predicting the edges of a bipartite graph.

I

II

A

B

C

?

?

?

?-2

+5Graph G Graph H

1 Recommender System2 Host-Pathogen Interaction3 Question-Answering Mapping4 Citation Network . . .

Sometimes, vertex sets on both sides are intrinsically structured.Heterogeneous info: G + H + partial observations

Combine them to make better edge predictions?

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 5

Page 6: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsProblem Description

Problem Description

Many applications involve predicting the edges of a bipartite graph.

I

II

A

B

C

?

?

?

?-2

+5Graph G Graph H

1 Recommender System2 Host-Pathogen Interaction3 Question-Answering Mapping4 Citation Network . . .

Sometimes, vertex sets on both sides are intrinsically structured.Heterogeneous info: G + H + partial observationsCombine them to make better edge predictions?

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 6

Page 7: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsThe Proposed Framework

The Proposed Framework

I

II

A

B

C

?

?

?

?-2

+5Graph G Graph H

Transductive learning should be effective1 Labeled edges (red) are highly sparse2 Unlabeled edges (gray) are massively available

Assumption: similar edges should have similar labelsPrerequisite: a similarity measure among the edges, i.e. a “Graph ofEdges” (not directly provided)Can be induced from G and H via Graph Product!

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 7

Page 8: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsThe Proposed Framework

The Proposed Framework

I

II

A

B

C

?

?

?

?-2

+5Graph G Graph H

Transductive learning should be effective1 Labeled edges (red) are highly sparse2 Unlabeled edges (gray) are massively available

Assumption: similar edges should have similar labels

Prerequisite: a similarity measure among the edges, i.e. a “Graph ofEdges” (not directly provided)Can be induced from G and H via Graph Product!

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 8

Page 9: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsThe Proposed Framework

The Proposed Framework

I

II

A

B

C

?

?

?

?-2

+5Graph G Graph H

Transductive learning should be effective1 Labeled edges (red) are highly sparse2 Unlabeled edges (gray) are massively available

Assumption: similar edges should have similar labelsPrerequisite: a similarity measure among the edges, i.e. a “Graph ofEdges” (not directly provided)

Can be induced from G and H via Graph Product!

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 9

Page 10: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsThe Proposed Framework

The Proposed Framework

I

II

A

B

C

?

?

?

?-2

+5Graph G Graph H

Transductive learning should be effective1 Labeled edges (red) are highly sparse2 Unlabeled edges (gray) are massively available

Assumption: similar edges should have similar labelsPrerequisite: a similarity measure among the edges, i.e. a “Graph ofEdges” (not directly provided)Can be induced from G and H via Graph Product!

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 10

Page 11: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsThe Proposed Framework

The Proposed Framework

The “Graph of Edges” can be induced by taking the product of G and H

In the product graph G ◦HEach Vertex ∼ edge (in the original bipartite graph)Each Edge ∼ edge-edge similarity

The adjacency matrix of the product graph is defined by “◦” (to bediscussed later).

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 11

Page 12: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsThe Proposed Framework

The Proposed Framework

The “Graph of Edges” can be induced by taking the product of G and H

In the product graph G ◦HEach Vertex ∼ edge (in the original bipartite graph)Each Edge ∼ edge-edge similarity

The adjacency matrix of the product graph is defined by “◦” (to bediscussed later).

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 12

Page 13: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsThe Proposed Framework

The Proposed Framework

Problem Mapping

Edge Prediction(Original Problem)Given G, H and labeled edges,predict the unlabeled edges

I

II

A

B

C

?

?

?

?-2

+5

Vertex Prediction(Equivalent Problem)Given G◦H and labeled vertices,predict the unlabeled vertices

(I, C)?

(I, A)-2

(I, B)?

(II, C)?

(II, A)?

(II, B)+5

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 13

Page 14: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsFormulation

Outline

1 Problem Description

2 The Proposed Framework

3 FormulationProduct Graph ConstructionGraph-based Transductive Learning

4 Optimization

5 Experiment

6 Conclusion

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 14

Page 15: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsFormulation

Product Graph Construction

Outline

1 Problem Description

2 The Proposed Framework

3 FormulationProduct Graph ConstructionGraph-based Transductive Learning

4 Optimization

5 Experiment

6 Conclusion

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 15

Page 16: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsFormulation

Product Graph Construction

Product Graph Construction

Q: When should vertex (i, j) ∼ (i′, j′) in the product graph?Tensor GP i ∼ i′ in G AND j ∼ j′ in H

Cartesian GP(i ∼ i′ in G AND j = j′

)OR

(i = i′ AND j ∼ j′ in H

)

Can be trivially generalized to weighted graphs.

To compute the adjacency matrices of PGG ◦Tensor H = G⊗H︸ ︷︷ ︸

Kronecker (a.k.a. Tensor) Product

G ◦Cartesian H = G⊗ I + I ⊗H = G⊕H︸ ︷︷ ︸Kronecker Sum

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 16

Page 17: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsFormulation

Product Graph Construction

Product Graph Construction

Q: When should vertex (i, j) ∼ (i′, j′) in the product graph?Tensor GP i ∼ i′ in G AND j ∼ j′ in H

Cartesian GP(i ∼ i′ in G AND j = j′

)OR

(i = i′ AND j ∼ j′ in H

)Can be trivially generalized to weighted graphs.

To compute the adjacency matrices of PGG ◦Tensor H = G⊗H︸ ︷︷ ︸

Kronecker (a.k.a. Tensor) Product

G ◦Cartesian H = G⊗ I + I ⊗H = G⊕H︸ ︷︷ ︸Kronecker Sum

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 17

Page 18: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsFormulation

Product Graph Construction

Product Graph Construction

Q: When should vertex (i, j) ∼ (i′, j′) in the product graph?Tensor GP i ∼ i′ in G AND j ∼ j′ in H

Cartesian GP(i ∼ i′ in G AND j = j′

)OR

(i = i′ AND j ∼ j′ in H

)Can be trivially generalized to weighted graphs.

To compute the adjacency matrices of PGG ◦Tensor H = G⊗H︸ ︷︷ ︸

Kronecker (a.k.a. Tensor) Product

G ◦Cartesian H = G⊗ I + I ⊗H = G⊕H︸ ︷︷ ︸Kronecker Sum

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 18

Page 19: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsFormulation

Product Graph Construction

Product Graph Construction

Both GPs can be written in the form of spectral decomposition

G ◦Tensor H =∑i,j

(λi × µj)(ui ⊗ vj)(ui ⊗ vj)> (1)

G ◦Cartesian H =∑i,j

(λi + µj)(ui ⊗ vj)(ui ⊗ vj)> (2)

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 19

Page 20: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsFormulation

Product Graph Construction

Product Graph Construction

Both GPs can be written in the form of spectral decomposition

G ◦Tensor H =∑i,j

(λi × µj)︸ ︷︷ ︸soft AND

(ui ⊗ vj)(ui ⊗ vj)> (1)

G ◦Cartesian H =∑i,j

(λi + µj)︸ ︷︷ ︸soft OR

(ui ⊗ vj)(ui ⊗ vj)> (2)

The interplay of graphs is captured by the interplay of their spectrum!

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 20

Page 21: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsFormulation

Product Graph Construction

Product Graph Construction

Both GPs can be written in the form of spectral decomposition

G ◦Tensor H =∑i,j

(λi × µj)︸ ︷︷ ︸soft AND

(ui ⊗ vj)(ui ⊗ vj)> (1)

G ◦Cartesian H =∑i,j

(λi + µj)︸ ︷︷ ︸soft OR

(ui ⊗ vj)(ui ⊗ vj)> (2)

The interplay of graphs is captured by the interplay of their spectrum!

Generalization: Spectral Graph Product

G ◦H def=∑i,j

(λi ◦ µj)(ui ⊗ vj)(ui ⊗ vj)> (3)

where “◦” can be arbitrary binary operator (“×”, “+”, . . . )

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 21

Page 22: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsFormulation

Product Graph Construction

Product Graph Construction

Both GPs can be written in the form of spectral decomposition

G ◦Tensor H =∑i,j

(λi × µj)︸ ︷︷ ︸soft AND

(ui ⊗ vj)(ui ⊗ vj)> (1)

G ◦Cartesian H =∑i,j

(λi + µj)︸ ︷︷ ︸soft OR

(ui ⊗ vj)(ui ⊗ vj)> (2)

The interplay of graphs is captured by the interplay of their spectrum!

Generalization: Spectral Graph Product

G ◦H def=∑i,j

(λi ◦ µj)(ui ⊗ vj)(ui ⊗ vj)> (3)

where “◦” can be arbitrary binary operator (“×”, “+”, . . . )

Commutative Property: G ◦H and H ◦G are isomorphic.ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 22

Page 23: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsFormulation

Graph-based Transductive Learning

Outline

1 Problem Description

2 The Proposed Framework

3 FormulationProduct Graph ConstructionGraph-based Transductive Learning

4 Optimization

5 Experiment

6 Conclusion

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 23

Page 24: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsFormulation

Graph-based Transductive Learning

Graph-based Transductive Learning

With the product graph A def= G ◦H constructed, we solve a standardgraph-based transductive learning problem over A

Learning Objective

minf

`(f)︸︷︷︸Loss Function

+ λf>A−1f︸ ︷︷ ︸Graph Regularization

(4)

fi system-predicted value for vertex i in A`(f) quantifies the gap between f and partially observed labels.

λf>A−1f quantifies the smoothness over graphUnderlying assumption: f ∼ N (0, A)

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 24

Page 25: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsFormulation

Graph-based Transductive Learning

Graph-based Transductive Learning

With the product graph A def= G ◦H constructed, we solve a standardgraph-based transductive learning problem over A

Learning Objective

minf

`(f)︸︷︷︸Loss Function

+ λf>A−1f︸ ︷︷ ︸Graph Regularization

(4)

fi system-predicted value for vertex i in A`(f) quantifies the gap between f and partially observed labels.

λf>A−1f quantifies the smoothness over graphUnderlying assumption: f ∼ N (0, A)

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 25

Page 26: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsFormulation

Graph-based Transductive Learning

Graph-based Transductive Learning

The enhanced learning objective

minf

`(f)︸︷︷︸Loss Function

+ λf>κ(A)−1f︸ ︷︷ ︸Graph Regularization

(5)

to incorporate a variety of graph transduction patterns:

k-step Random Walk κ(A) = Ak

Regularized Laplacian κ(A) = (εI −A)−1 = I +A+A2 +A3 + . . .

Diffusion Process κ(A) = exp(A) ≡ I +A+ 12!A

2 + 13!A

3 + · · ·

All can be viewed as to transform the spectrum of A :=∑i θiuiu

>i

Ak =∑i

θki uiu>i (εI−A)−1 =

∑i

1ε− θi

uiu>i exp(A) =

∑i

eθiuiu>i

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 26

Page 27: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsFormulation

Graph-based Transductive Learning

Graph-based Transductive Learning

The enhanced learning objective

minf

`(f)︸︷︷︸Loss Function

+ λf>κ(A)−1f︸ ︷︷ ︸Graph Regularization

(5)

to incorporate a variety of graph transduction patterns:k-step Random Walk κ(A) = Ak

Regularized Laplacian κ(A) = (εI −A)−1 = I +A+A2 +A3 + . . .

Diffusion Process κ(A) = exp(A) ≡ I +A+ 12!A

2 + 13!A

3 + · · ·

All can be viewed as to transform the spectrum of A :=∑i θiuiu

>i

Ak =∑i

θki uiu>i (εI−A)−1 =

∑i

1ε− θi

uiu>i exp(A) =

∑i

eθiuiu>i

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 27

Page 28: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsFormulation

Graph-based Transductive Learning

Graph-based Transductive Learning

The enhanced learning objective

minf

`(f)︸︷︷︸Loss Function

+ λf>κ(A)−1f︸ ︷︷ ︸Graph Regularization

(5)

to incorporate a variety of graph transduction patterns:k-step Random Walk κ(A) = Ak

Regularized Laplacian κ(A) = (εI −A)−1 = I +A+A2 +A3 + . . .

Diffusion Process κ(A) = exp(A) ≡ I +A+ 12!A

2 + 13!A

3 + · · ·

All can be viewed as to transform the spectrum of A :=∑i θiuiu

>i

Ak =∑i

θki uiu>i (εI−A)−1 =

∑i

1ε− θi

uiu>i exp(A) =

∑i

eθiuiu>i

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 28

Page 29: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsOptimization

Outline

1 Problem Description

2 The Proposed Framework

3 FormulationProduct Graph ConstructionGraph-based Transductive Learning

4 Optimization

5 Experiment

6 Conclusion

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 29

Page 30: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsOptimization

Optimization

Transductive Learning over Product Graph

minf

`(f) + λ f>κ(A)−1f︸ ︷︷ ︸r(f)

(6)

Challenge: κ(A) = κ( G︸︷︷︸m×m

◦ H︸︷︷︸n×n

) is a huge mn×mn matrix!

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 30

Page 31: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsOptimization

Optimization

Transductive Learning over Product Graph

minf

`(f) + λ f>κ(A)−1f︸ ︷︷ ︸r(f)

(6)

Challenge: κ(A) = κ( G︸︷︷︸m×m

◦ H︸︷︷︸n×n

) is a huge mn×mn matrix!

Even if κ(A)−1 is given, it is expensive to compute ∇r(f) naively

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 31

Page 32: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsOptimization

Optimization

Transductive Learning over Product Graph

minf

`(f) + λ f>κ(A)−1f︸ ︷︷ ︸r(f)

(6)

Challenge: κ(A) = κ( G︸︷︷︸m×m

◦ H︸︷︷︸n×n

) is a huge mn×mn matrix!

Prohibitive to load it into memoryProhibitive to compute its inverseEven if κ(A)−1 is given, it is expensive to compute ∇r(f) naively

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 32

Page 33: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsOptimization

Optimization

Transductive Learning over Product Graph

minf

`(f) + λ f>κ(A)−1f︸ ︷︷ ︸r(f)

(6)

Challenge: κ(A) = κ( G︸︷︷︸m×m

◦ H︸︷︷︸n×n

) is a huge mn×mn matrix!

Prohibitive to load it into memory No need to store κ(A)Prohibitive to compute its inverseEven if κ(A)−1 is given, it is expensive to compute ∇r(f) naively

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 33

Page 34: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsOptimization

Optimization

Transductive Learning over Product Graph

minf

`(f) + λ f>κ(A)−1f︸ ︷︷ ︸r(f)

(6)

Challenge: κ(A) = κ( G︸︷︷︸m×m

◦ H︸︷︷︸n×n

) is a huge mn×mn matrix!

Prohibitive to load it into memory No need to store κ(A)Prohibitive to compute its inverse No need of matrix inverseEven if κ(A)−1 is given, it is expensive to compute ∇r(f) naively

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 34

Page 35: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsOptimization

Optimization

Transductive Learning over Product Graph

minf

`(f) + λ f>κ(A)−1f︸ ︷︷ ︸r(f)

(6)

Challenge: κ(A) = κ( G︸︷︷︸m×m

◦ H︸︷︷︸n×n

) is a huge mn×mn matrix!

Prohibitive to load it into memory No need to store κ(A)Prohibitive to compute its inverse No need of matrix inverseEven if κ(A)−1 is given, it is expensive to compute ∇r(f) naivelyCan be performed much more efficiently

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 35

Page 36: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsOptimization

Optimization

Keys for complexity reduction1 Instead of matrices—

κ only manipulates eigenvalues◦ only manipulates the interplay of eigenvalues

2 The “vec” trick:

Bottleneck: multiplication (X ⊗ Y )ff = vec(F ), where Fij

def= system-predicted score for edge (i, j)

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 36

Page 37: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsOptimization

Optimization

Keys for complexity reduction1 Instead of matrices—

κ only manipulates eigenvalues◦ only manipulates the interplay of eigenvalues

2 The “vec” trick:Bottleneck: multiplication (X ⊗ Y )f

f = vec(F ), where Fijdef= system-predicted score for edge (i, j)

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 37

Page 38: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsOptimization

Optimization

Keys for complexity reduction1 Instead of matrices—

κ only manipulates eigenvalues◦ only manipulates the interplay of eigenvalues

2 The “vec” trick:Bottleneck: multiplication (X ⊗ Y )ff = vec(F ), where Fij

def= system-predicted score for edge (i, j)

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 38

Page 39: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsOptimization

Optimization

Keys for complexity reduction1 Instead of matrices—

κ only manipulates eigenvalues◦ only manipulates the interplay of eigenvalues

2 The “vec” trick:Bottleneck: multiplication (X ⊗ Y )ff = vec(F ), where Fij

def= system-predicted score for edge (i, j)(X ⊗ Y )f︸ ︷︷ ︸

O(m2n2) time/space

= (X ⊗ Y )vec(F )

≡ vec(XFY >)︸ ︷︷ ︸O(mn(m + n)) time, O((m + n)2) space

(7)

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 39

Page 40: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsOptimization

Optimization with Low-rank Constraint

Further speedup is possible by factorizing F into two low-rank matrices

The cost of each alternating gradient step is proportional torank(F ) · rank(Σ)Σ: a “Characteristic Matrix” where Σij = 1

κ(λi◦µj)

An interesting observation: rank(Σ) is usually a small constant!Example: Diffusion process over the Cartesian PG

Σ =

e−(λ1+µ1) . . . e−(λ1+µn)

.... . .

...e−(λm+µ1) . . . e−(λm+µn)

=

e−λ1

...e−λm

[e−µ1 . . . e−µn]

=⇒ rank(Σ) = 1

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 40

Page 41: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsOptimization

Optimization with Low-rank Constraint

Further speedup is possible by factorizing F into two low-rank matricesThe cost of each alternating gradient step is proportional torank(F ) · rank(Σ)

Σ: a “Characteristic Matrix” where Σij = 1κ(λi◦µj)

An interesting observation: rank(Σ) is usually a small constant!Example: Diffusion process over the Cartesian PG

Σ =

e−(λ1+µ1) . . . e−(λ1+µn)

.... . .

...e−(λm+µ1) . . . e−(λm+µn)

=

e−λ1

...e−λm

[e−µ1 . . . e−µn]

=⇒ rank(Σ) = 1

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 41

Page 42: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsOptimization

Optimization with Low-rank Constraint

Further speedup is possible by factorizing F into two low-rank matricesThe cost of each alternating gradient step is proportional torank(F ) · rank(Σ)Σ: a “Characteristic Matrix” where Σij = 1

κ(λi◦µj)

An interesting observation: rank(Σ) is usually a small constant!Example: Diffusion process over the Cartesian PG

Σ =

e−(λ1+µ1) . . . e−(λ1+µn)

.... . .

...e−(λm+µ1) . . . e−(λm+µn)

=

e−λ1

...e−λm

[e−µ1 . . . e−µn]

=⇒ rank(Σ) = 1

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 42

Page 43: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsOptimization

Optimization with Low-rank Constraint

Further speedup is possible by factorizing F into two low-rank matricesThe cost of each alternating gradient step is proportional torank(F ) · rank(Σ)Σ: a “Characteristic Matrix” where Σij = 1

κ(λi◦µj)An interesting observation: rank(Σ) is usually a small constant!

Example: Diffusion process over the Cartesian PG

Σ =

e−(λ1+µ1) . . . e−(λ1+µn)

.... . .

...e−(λm+µ1) . . . e−(λm+µn)

=

e−λ1

...e−λm

[e−µ1 . . . e−µn]

=⇒ rank(Σ) = 1

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 43

Page 44: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsOptimization

Optimization with Low-rank Constraint

Further speedup is possible by factorizing F into two low-rank matricesThe cost of each alternating gradient step is proportional torank(F ) · rank(Σ)Σ: a “Characteristic Matrix” where Σij = 1

κ(λi◦µj)An interesting observation: rank(Σ) is usually a small constant!Example: Diffusion process over the Cartesian PG

Σ =

e−(λ1+µ1) . . . e−(λ1+µn)

.... . .

...e−(λm+µ1) . . . e−(λm+µn)

=

e−λ1

...e−λm

[e−µ1 . . . e−µn]

=⇒ rank(Σ) = 1

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 44

Page 45: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsExperiment

Outline

1 Problem Description

2 The Proposed Framework

3 FormulationProduct Graph ConstructionGraph-based Transductive Learning

4 Optimization

5 Experiment

6 Conclusion

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 45

Page 46: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsExperiment

Datasets and Baselines

Datasets

Dataset G H

Movielens-100K Users MoviesCora Publications Publications

Courses Courses Prerequisite Courses

BaselinesMC Matrix Completion.

Ignores the info of G and H.TK Tensor Kernel.

Implicitly construct PG, no transductionGRMC Graph Regularized Matrix Completion.

Transduction over G and H, no PG constructed

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 46

Page 47: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsExperiment

Results

Performance of several interesting combinations of ◦ and κ

Dataset Graph Transduction Graph Product MAP AUC ndcg@3

CoursesRandom Walk Tensor 0.488 0.827 0.461

Diffusion Cartesian 0.518 0.872 0.500von-Neumann Tensor 0.472 0.861 0.449von-Neumann Cartesian 0.366 0.531 0.359

Sigmoid Cartesian 0.443 0.617 0.431

CoraRandom Walk Tensor 0.222 0.764 0.205

Diffusion Cartesian 0.256 0.884 0.232von-Neumann Tensor 0.230 0.853 0.211von-Neumann Cartesian 0.218 0.633 0.212

Sigmoid Cartesian 0.192 0.443 0.188

MovieLensRandom Walk Tensor - - 0.7695

Diffusion Cartesian - - 0.7702von-Neumann Tensor - - 0.7720von-Neumann Cartesian - - 0.7624

Sigmoid Cartesian - - 0.7650

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 47

Page 48: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsExperiment

Results

Proposed method (Diff + Cartesian GP) v.s. Baselines

Dataset Method MAP AUC ndcg@3

CoursesMC 0.319 0.758 0.294

GRMC 0.366 0.777 0.343TK 0.449 0.810 0.446

Proposed 0.490 0.838 0.473

CoraMC 0.101 0.697 0.086

GRMC 0.115 0.702 0.101TK 0.248 0.872 0.231

Proposed 0.268 0.894 0.243

MovieLensMC - - 0.748

GRMC - - 0.752TK - - 0.718

Proposed - - 0.765

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 48

Page 49: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsConclusion

Outline

1 Problem Description

2 The Proposed Framework

3 FormulationProduct Graph ConstructionGraph-based Transductive Learning

4 Optimization

5 Experiment

6 Conclusion

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 49

Page 50: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsConclusion

Conclusion

SummaryProblem Predicting the missing edges of a bipartite graph with

graph-structured vertex sets on both sides.Contribution A novel approach via transductive learning over product

graph, efficient algorithmic solution and good results.

On-going WorkExtend to k Graphs (k > 2)

Bipartite Graph → k-partite GraphEdge → Hyperedge

Determine the “optimal” graph product for any given problem.

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 50

Page 51: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsConclusion

Conclusion

SummaryProblem Predicting the missing edges of a bipartite graph with

graph-structured vertex sets on both sides.Contribution A novel approach via transductive learning over product

graph, efficient algorithmic solution and good results.

On-going WorkExtend to k Graphs (k > 2)

Bipartite Graph → k-partite GraphEdge → Hyperedge

Determine the “optimal” graph product for any given problem.

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 51

Page 52: Bipartite Edge Prediction via Transductive Learning over ...nyc.lti.cs.cmu.edu/.../Publications/liu-icml2015-slides.pdfICML 2015 Bipartite Edge Prediction via Transductive Learning

Bipartite Edge Prediction via Transductive Learning over Product GraphsConclusion

[email protected]

ICML 2015 Bipartite Edge Prediction via Transductive Learning over Product Graphs 52