How Powerful are Graph Neural Networks
Jure Leskovec. Joint work with R. Ying, J. You, M. Zitnik, W. Hamilton, W. Hu, et al.
BIG Data
Digital transformation of science and society
Machine Learning
Advances in Machine Learning & Statistics
[Diagram: feed-forward neural network, from input layer to output layer]
New Paradigm For Discovery
Massive data: Observe “invisible” patterns
Data Science, Machine Learning
Data → Models and insights
Modern ML Toolbox
Jure Leskovec, Stanford University
Images
Text/Speech
Modern deep learning toolbox is designed for simple sequences & grids
But not everything can be represented as a sequence or a grid
How can we develop neural networks that are much more broadly applicable?
New frontiers beyond classic neural networks that learn on images and sequences.
Networks: Common Language
Peter Mary
Albert
Tom
co-worker
friend
brothers
friend
Protein 1 Protein 2Protein 5
Protein 9
Movie 1
Movie 3Movie 2
Actor 3
Actor 1 Actor 2
Actor 4
|N| = 4, |E| = 4
Deep Learning in Graphs
Input: Network
Predictions: Node labels, New links, Generated graphs and subgraphs
Why is it Hard?
But networks are far more complex!
§ Arbitrary size and complex topological structure (i.e., no spatial locality like grids)
§ No fixed node ordering or reference point
§ Often dynamic and have multimodal features
[Figure: networks vs. images and text]
GraphSAGE: Graph Neural Networks
Inductive Representation Learning on Large Graphs. W. Hamilton, R. Ying, J. Leskovec. Neural Information Processing Systems (NIPS), 2017.
Representation Learning on Graphs: Methods and Applications. W. Hamilton, R. Ying, J. Leskovec. IEEE Data Engineering Bulletin, 2017.
http://snap.stanford.edu/graphsage
Setup
We have a graph G:
§ V is the vertex set
§ A is the (binary) adjacency matrix
§ X ∈ ℝ^(m×|V|) is a matrix of node features
§ Meaningful node features:
§ Social networks: user profiles
§ Biological networks: gene expression profiles, gene functional information
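A minimal, illustrative instance of this setup (the graph and feature values are made up for the example, not from the talk):

```python
import numpy as np

# Toy instance of the setup: an undirected graph with |V| = 4 nodes
# and m = 3 features per node.
V = [0, 1, 2, 3]                       # vertex set
A = np.array([[0, 1, 1, 0],            # binary adjacency matrix
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])
X = np.random.rand(3, len(V))          # node feature matrix, shape (m, |V|)

assert (A == A.T).all()                # undirected graph: A is symmetric
neighbors_of_2 = np.nonzero(A[2])[0]   # neighborhood N(2), read off row 2
```

Everything downstream (GraphSAGE, GIN, PinSAGE, Decagon) consumes exactly these three objects: the vertex set, the adjacency structure, and the feature matrix.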
Graph Neural Networks
Learn how to propagate information across the graph to compute node features
Determine the node's computation graph.
Propagate and transform information.
Idea: A node's neighborhood defines a computation graph.
The Graph Neural Network Model. F. Scarselli et al. IEEE Transactions on Neural Networks, 2009.
Semi-Supervised Classification with Graph Convolutional Networks. T. N. Kipf, M. Welling. ICLR 2017.
[Figure: input graph over nodes A–F; the target node's two-hop neighborhood unrolled into a computation graph]
Graph Neural Networks
Each node defines a computation graph.
§ Each edge in this graph is a transformation/aggregation function.
Inductive Representation Learning on Large Graphs. W. Hamilton, R. Ying, J. Leskovec. NIPS, 2017.
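The unrolling step can be sketched directly: starting from a target node, each layer of its computation graph is the multiset of neighbors feeding the layer above. The graph below is a made-up 4-node example, not the one on the slide.

```python
import numpy as np

def computation_graph_layers(A, target, depth=2):
    """Unroll `target`'s neighborhood into the layers of its computation graph:
    layer 0 is the target node, layer k holds the neighbors that feed layer k-1.
    Nodes may repeat across layers, mirroring the unrolled tree on the slide."""
    layers = [[target]]
    for _ in range(depth):
        frontier = []
        for v in layers[-1]:
            frontier.extend(np.nonzero(A[v])[0].tolist())
        layers.append(frontier)
    return layers

A = np.array([[0, 1, 1, 0],   # illustrative 4-node graph
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])
layers = computation_graph_layers(A, target=0, depth=2)
# layers[0] is the target; layers[1] its neighbors; layers[2] their neighbors.
```

Note that the target node itself reappears in layer 2: the computation graph is a tree of neighborhood copies, not a subgraph.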
Graph Neural Networks
Intuition: Nodes aggregate information from their neighbors using neural networks
Inductive Representation Learning on Large Graphs. W. Hamilton, R. Ying, J. Leskovec. NIPS, 2017.
Idea: Aggregate Neighbors
Intuition: Network neighborhood defines a computation graph
Every node defines a computation graph based on its neighborhood!
Can be viewed as learning a generic linear combination of graph low-pass and high-pass operators
Our Approach: GraphSAGE
[NIPS ‘17]
Update for node v:
h_v^(ℓ+1) = σ( W_ℓ · h_v^(ℓ) , Σ_{u∈N(v)} σ(Q_ℓ · h_u^(ℓ)) )
§ h_v^(0) = attributes x_v of node v; σ(·) is a sigmoid activation function
§ W_ℓ transforms v's own embedding from level ℓ (combine)
§ Q_ℓ transforms and aggregates the embeddings of v's neighbors (aggregate)
§ The result is the (ℓ+1)-st level embedding of node v
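One layer of this update, sketched in numpy. The published GraphSAGE concatenates a node's own transformed embedding with the aggregated neighbor term; summing the two here is a simplification to keep the sketch short, and the weights are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sage_layer(A, H, W, Q):
    """One GraphSAGE-style update, a sketch of the slide's formula:
    h_v^(l+1) = sigma(W h_v^(l) + sum over u in N(v) of sigma(Q h_u^(l))).
    H holds one node embedding per column; W combines, Q aggregates."""
    msgs = sigmoid(Q @ H)                         # transform each neighbor embedding
    H_new = np.zeros((W.shape[0], H.shape[1]))
    for v in range(A.shape[0]):
        agg = msgs[:, A[v] == 1].sum(axis=1)      # aggregate over N(v)
        H_new[:, v] = sigmoid(W @ H[:, v] + agg)  # combine with v's own embedding
    return H_new

# Two connected nodes, 2-dimensional embeddings, illustrative identity weights.
A = np.array([[0, 1], [1, 0]])
H0 = np.eye(2)                                    # h_v^(0) = node attributes x_v
H1 = sage_layer(A, H0, W=np.eye(2), Q=np.eye(2))
```

Stacking K such layers gives each node a K-hop receptive field, matching the K-layer computation graphs above.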
GraphSAGE: Training
§ Aggregation parameters are shared for all nodes
§ Number of model parameters is independent of |V|
§ Can use different loss functions:
§ Classification/Regression: ℒ(h_v) = ‖y_v − f(h_v)‖²
§ Pairwise loss: ℒ(h_u, h_v) = max(0, 1 − dist(h_u, h_v))
[Figure: the compute graphs for nodes A and B share the same aggregation parameters W^(k), B^(k)]
[NIPS ‘17]
Inductive Capability
Train with a snapshot → a new node arrives → generate an embedding z_u for the new node.
Even for nodes we never trained on!
Embedding Entire Graphs
Don't just embed individual nodes. Embed the entire graph.
Problem: Learn how to hierarchically pool nodes to embed the entire graph.
Our solution: DIFFPOOL
§ Learns a hierarchical pooling strategy
§ Sets of nodes are pooled hierarchically
Hierarchical Graph Representation Learning with Differentiable Pooling. R. Ying, et al. NeurIPS 2018.
[NeurIPS ‘18]
How expressive are Graph Neural Networks?
How expressive are GNNs?
Theoretical framework to characterize a GNN's discriminative power:
§ Characterize the upper bound of the discriminative power of GNNs
§ Propose a maximally powerful GNN
§ Characterize the discriminative power of popular GNNs
How Powerful are Graph Neural Networks? K. Xu, et al. ICLR 2019.
GNN tree:
Discriminative Power of GNNs
Theorem: GNNs can be at most as powerful as the Weisfeiler-Lehman graph isomorphism test (a.k.a. canonical
labeling or color refinement)
Color nodes by their degrees. Aggregate the colors of neighbors into a multiset. Compress multisets into new colors. Repeat K times or until the colors in G and G′ differ.
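The WL color-refinement steps above can be sketched in a few lines. The triangle-vs-path example is mine, chosen because one round already separates the two graphs:

```python
def wl_colors(adj, rounds):
    """Weisfeiler-Lehman color refinement (sketch of the steps above).
    adj maps node -> list of neighbors. Colors start as degrees; each round
    maps (own color, sorted multiset of neighbor colors) to a new color."""
    color = {v: len(nbrs) for v, nbrs in adj.items()}
    for _ in range(rounds):
        sig = {v: (color[v], tuple(sorted(color[u] for u in adj[v])))
               for v in adj}
        palette = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        color = {v: palette[sig[v]] for v in adj}   # compress multisets
    return sorted(color.values())

# Triangle vs. 3-node path: their color histograms differ after one round,
# so the WL test (and hence a maximally powerful GNN) distinguishes them.
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
path = {0: [1], 1: [0, 2], 2: [1]}
```

A GNN layer mirrors one WL round: aggregating neighbor embeddings plays the role of collecting the color multiset, and the update function plays the role of the compression step.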
Discriminative Power of GNNs
Theorem: Power(GNNs) ≤ Power(WL)
Why?
So, to distinguish 2 nodes, GNN needs to distinguish structure of their rooted subtrees
We develop GIN – provably most powerful GNN!
PinSAGE for Recommender Systems
Graph Convolutional Neural Networks for Web-Scale Recommender Systems. R. Ying, R. He, K. Chen, P. Eksombatchai, W. L. Hamilton, J. Leskovec. KDD, 2018.
§ 300M users
§ 4+B pins, 2+B boards
Under review as a conference paper at ICLR 2019
sum (multiset) > mean (distribution) > max (set)
Figure 2: Ranking by expressive power for sum, mean and max-pooling aggregators over a multiset. The left panel shows the input multiset and the three panels illustrate the aspects of the multiset a given aggregator is able to capture: sum captures the full multiset, mean captures the proportion/distribution of elements of a given type, and the max aggregator ignores multiplicities (reduces the multiset to a simple set).
Figure 3: Examples of simple graph structures that mean and max-pooling aggregators fail to distinguish: (a) mean and max both fail; (b) max fails; (c) mean and max both fail. Figure 2 gives reasoning about how different aggregators “compress” different graph structures/multisets.
Existing GNNs instead use a 1-layer perceptron σ ∘ W (Duvenaud et al., 2015; Kipf & Welling, 2017; Zhang et al., 2018), a linear mapping followed by a non-linear activation function such as a ReLU. Such 1-layer mappings are examples of Generalized Linear Models (Nelder & Wedderburn, 1972). Therefore, we are interested in understanding whether 1-layer perceptrons are enough for graph learning. Lemma 7 suggests that there are indeed network neighborhoods (multisets) that models with 1-layer perceptrons can never distinguish.

Lemma 7. There exist finite multisets X₁ ≠ X₂ so that for any linear mapping W, Σ_{x∈X₁} ReLU(Wx) = Σ_{x∈X₂} ReLU(Wx).

The main idea of the proof for Lemma 7 is that 1-layer perceptrons can behave much like linear mappings, so the GNN layers degenerate into simply summing over neighborhood features. Our proof builds on the fact that the bias term is lacking in the linear mapping. With the bias term and sufficiently large output dimensionality, 1-layer perceptrons might be able to distinguish different multisets. Nonetheless, unlike models using MLPs, the 1-layer perceptron (even with the bias term) is not a universal approximator of multiset functions. Consequently, even if GNNs with 1-layer perceptrons can embed different graphs to different locations to some degree, such embeddings may not adequately capture structural similarity, and can be difficult for simple classifiers, e.g., linear classifiers, to fit. In Section 7, we will empirically see that GNNs with 1-layer perceptrons, when applied to graph classification, sometimes severely underfit training data and often underperform GNNs with MLPs in terms of test accuracy.
5.2 STRUCTURES THAT CONFUSE MEAN AND MAX-POOLING
What happens if we replace the sum in h(X) = Σ_{x∈X} f(x) with mean or max-pooling as in GCN and GraphSAGE? Mean and max-pooling aggregators are still well-defined multiset functions because they are permutation invariant. But they are not injective. Figure 2 ranks the three aggregators by their representational power, and Figure 3 illustrates pairs of structures that the mean and max-pooling aggregators fail to distinguish. Here, node colors denote different node features, and we assume the GNNs aggregate neighbors first before combining them with the central node.

In Figure 3a, every node has the same feature a and f(a) is the same across all nodes (for any function f). When performing neighborhood aggregation, the mean or maximum over f(a) remains f(a) and, by induction, we always obtain the same node representation everywhere. Thus, mean and max-pooling aggregators fail to capture any structural information. In contrast, a sum aggregator distinguishes the structures because 2·f(a) and 3·f(a) give different values. The same argument
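The aggregator ranking can be checked numerically in a few lines. The feature vector `a` below is an arbitrary choice standing in for the shared node feature in Figure 3a:

```python
import numpy as np

# Sum tells apart multisets that mean and max conflate, e.g. {a} vs. {a, a, a}
# for a single shared feature vector a (the situation behind Figure 3a).
a = np.array([1.0, 2.0])
small = np.stack([a])          # multiset {a}: one neighbor
big = np.stack([a, a, a])      # multiset {a, a, a}: three neighbors

mean_conflates = np.allclose(small.mean(0), big.mean(0))   # mean ignores counts
max_conflates = np.allclose(small.max(0), big.max(0))      # max ignores counts
sum_separates = not np.allclose(small.sum(0), big.sum(0))  # sum preserves counts
```

This is exactly why GIN builds on sum aggregation: it is the only one of the three that is injective on multisets.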
Application: Pinterest
PinSage graph convolutional network:
§ Goal: Generate embeddings for nodes in a large-scale Pinterest graph containing billions of objects
§ Key idea: Borrow information from nearby nodes
§ E.g., a bed-rail pin might look like a garden fence, but gates and beds are rarely adjacent in the graph
§ Pin embeddings are essential to various tasks like recommendation of pins, classification, and ranking
§ Services like “Related Pins”, “Search”, “Shopping”, “Ads”
Pinterest Graph
Boards: human-curated collections of pins.
Pins: visual bookmarks someone has saved from the internet to a board they’ve created.
Pin features: image, text, links
Pin Recommendation
Task: Recommend related pins to users
Source pin
Predict whether two nodes in a graph are related.
Task: Learn node embeddings z_i such that d(z_cake1, z_cake2) < d(z_cake1, z_sweater)
PinSAGE Training
Goal: Identify the target pin among 3B pins
§ Issue: Need to learn with a resolution of 100 vs. 3B
§ Massive size: 3 billion nodes, 20 billion edges
§ Idea: Use harder and harder negative samples
[Figure: source pin, positive example, hard negative, easy negative]
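A sketch of the training signal behind the hard-negative idea. The loss below is a generic triplet-style max-margin loss, not PinSAGE's exact objective, and the embeddings are synthetic stand-ins for source pin, positive, and negatives:

```python
import numpy as np

def max_margin_loss(z_q, z_pos, z_neg, margin=1.0):
    """Triplet-style max-margin loss (a sketch, not PinSAGE's exact form):
    push the positive pin's score above the negative's by at least `margin`,
    with scores taken as inner products of embeddings."""
    return max(0.0, float(z_q @ z_neg) - float(z_q @ z_pos) + margin)

rng = np.random.default_rng(0)
z_query = rng.normal(size=16)
z_positive = z_query + 0.1 * rng.normal(size=16)       # related pin: nearby
z_easy_negative = rng.normal(size=16)                  # random pin
z_hard_negative = z_query + 0.5 * rng.normal(size=16)  # similar-looking, unrelated

# Curriculum idea from the slide: start with easy negatives, then mix in
# harder ones so the model learns fine-grained (100 vs. 3B) resolution.
loss_easy = max_margin_loss(z_query, z_positive, z_easy_negative)
loss_hard = max_margin_loss(z_query, z_positive, z_hard_negative)
```

Hard negatives sit close to the query in embedding space, so they dominate the gradient once easy negatives are already separated.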
PinSAGE Performance
Related pin recommendations
§ Given that a user is looking at pin Q, predict which pin X they will save next
§ Setup: Embed 3B pins, perform nearest-neighbor search to generate recommendations
[Bar chart: Mean Reciprocal Rank (0 to 0.35); PinSage outperforms the RW-GCN, Visual, and Annotation baselines]
PinSAGE Example
Computational Drug Discovery: Drug Side Effect Prediction
Modeling Polypharmacy Side Effects with Graph Convolutional Networks. M. Zitnik, M. Agrawal, J. Leskovec. Bioinformatics, 2018.
http://snap.stanford.edu/decagon/
Polypharmacy side effects
Many patients take multiple drugs to treat complex or co-existing diseases:
§ 46% of people ages 70-79 take more than 5 drugs
§ Many patients take more than 20 drugs to treat heart disease, depression, insomnia, etc.
Task: Given a pair of drugs predict adverse side effects
[Figure: drug-pair queries with predicted side-effect probabilities of 30% and 65%]
Approach: Build a Graph
Node types: drug nodes and protein nodes.
Edge types:
§ Drug–drug interaction of type r_i, e.g., nausea
§ Drug–target interaction
§ Protein–protein interaction
Task: Link Prediction
Task: Given a partially observed graph, predict labeled edges between drug nodes.
[Figure: graph over drugs Ciprofloxacin (C), Simvastatin (S), Mupirocin (M), and Doxycycline (D) with edges labeled r1, r2]
Example query: Given drugs c and s, how likely is an edge (c, r2, s)?
Co-prescribed drugs c and s lead to side effect r2.
Decagon: Graph Neural Net
Node v's computation graph
Network neighborhood of node v
Decoder: Link Prediction
Parameter weight matrices
Probability of an edge of type r_i between two nodes
Tensor-factorized model to capture dependencies between different types of edges
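One way such a tensor-factorized decoder can look, sketched with made-up weights: a per-edge-type diagonal factor combined with a matrix shared across all edge types (the factorization shape is in the spirit of the slide, not a verbatim reproduction of the paper's decoder):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def edge_prob(z_u, z_v, D_r, R):
    """Tensor-factorized link decoder (sketch): score a candidate edge of
    type r between nodes u and v using a per-type diagonal factor D_r and
    a matrix R shared across edge types, so parameters are tied and
    dependencies between different side effects can be captured."""
    return sigmoid(z_u @ D_r @ R @ D_r @ z_v)

d = 8
rng = np.random.default_rng(1)
z_c, z_d = rng.normal(size=d), rng.normal(size=d)   # two drug embeddings
D_nausea = np.diag(rng.normal(size=d))              # edge-type-specific factor
R_shared = rng.normal(size=(d, d))                  # shared across edge types
p = edge_prob(z_c, z_d, D_nausea, R_shared)         # probability of (c, nausea, d)
```

Because only the small diagonal D_r is specific to each of the hundreds of side-effect types, rare side effects still benefit from the shared matrix R.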
Results: Side Effect Prediction
36% average improvement in AP@50 over baselines
[Bar chart: AUROC and AP@50 (0 to 0.9) for Decagon vs. baselines]
De novo Predictions
[Table: top predicted side effects for pairs (drug c, drug d)]
De novo Predictions
[Table: drug pairs (drug c, drug d) for which evidence was found in the literature]
Predictions in the Clinic
Clinical validation via drug-drug interaction markers, lab values, and surrogates
First method to predict side effects of drug pairs, even for drug combinations not yet used in patients
Reasoning in Knowledge Graphs
Embedding Logical Queries on Knowledge Graphs. W. Hamilton, P. Bajaj, M. Zitnik, D. Jurafsky, J. Leskovec. Neural Information Processing Systems (NIPS), 2018.
Knowledge as a Graph
Heterogeneous Knowledge Graphs
Biological interactions Online communities
Conjunctive Graph Queries
Query formula
Query DAG
Example subgraphs that satisfy the query
Predictive Graph Queries
Key challenges: Big graphs and queries can involve noisy and unobserved data!
Problem: Naïve link prediction and graph template matching are too expensive
Some links might be noisy, unobserved, or might not have occurred yet.
Overview of Our Framework
Goal: Answer complex logical queries, e.g., “Predict drugs C likely to target proteins P associated with diseases d1 and d2.”
Idea: Logical operators become spatial operators in the embedding space.
Model Specification
Given: Knowledge graph
Find:
§ Node embeddings
§ Projection operator P: P(q, τ) = R_τ · q
§ Applies the transition matrix R_τ of relation τ to query embedding q
§ Intersection operator I: I(q_1, …, q_n) = W_γ · Ψ(NN(q_1), …, NN(q_n))
§ Set intersection in the embedding space
τ … edge type; γ … node type; R_τ, W_γ … matrices; Ψ … aggregator; NN … neural net
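The two operators can be sketched as plain matrix operations. The choices of NN (a single ReLU layer) and Ψ (elementwise minimum, one simple permutation-invariant aggregator) are my assumptions for the sketch; the paper learns both:

```python
import numpy as np

def project(q, R_tau):
    """Projection operator from the slide: P(q, tau) = R_tau @ q."""
    return R_tau @ q

def intersect(queries, W_gamma, nn_W):
    """Intersection operator, sketched: pass each branch embedding through a
    small neural net NN (here one ReLU layer), aggregate with an elementwise
    minimum as the permutation-invariant Psi, then apply the node-type
    matrix W_gamma."""
    transformed = [np.maximum(0.0, nn_W @ q) for q in queries]  # NN(q_i)
    aggregated = np.minimum.reduce(transformed)                 # Psi(...)
    return W_gamma @ aggregated

d = 4
rng = np.random.default_rng(2)
q1, q2 = rng.normal(size=d), rng.normal(size=d)
R_tau = rng.normal(size=(d, d))
W_gamma = rng.normal(size=(d, d))
nn_W = rng.normal(size=(d, d))
# "Follow relation tau from both branches, then intersect the answer sets":
answer = intersect([project(q1, R_tau), project(q2, R_tau)], W_gamma, nn_W)
```

Chaining `project` and `intersect` along the query DAG turns a conjunctive query into a single embedding, which is then matched against node embeddings by nearest-neighbor search.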
Model Training
Training examples: Queries on the graph
§ Positives: Paths with a known answer
§ “Standard” negatives: Random nodes of the correct answer type
§ “Hard” negatives: Correct answers if a logical conjunction is relaxed to a disjunction
§ Loss:
Performance
§ Performance on different query types:
Query graph
How can this technology be used for other problems?
Many other applications:
§ Nodes: Predict tissue-specific protein functions
§ Subgraphs: Predict which drug treats which disease
§ Graphs: Predict properties of molecules/drugs
We can now apply neural networks much more broadly
New frontiers beyond classic neural networks that learn on images and sequences
Summary
§ Graph convolutional neural networks
§ Generalize beyond simple convolutions
§ Fuse node features & graph structure
§ State-of-the-art accuracy for node classification and link prediction
§ Model size is independent of graph size; can scale to billions of nodes
§ Largest embedding to date (3B nodes, 20B edges)
§ Leads to significant performance gains
Conclusion
Results from the past 2-3 years have shown:
§ Representation learning paradigm can be extended to graphs
§ No feature engineering necessary
§ Can effectively combine node attribute data with the network information
§ State-of-the-art results in a number of domains/tasks
§ Use end-to-end training instead of multi-stage approaches for better performance
PhD Students
Post-Doctoral Fellows
Funding
Collaborators
Industry Partnerships
Alexandra Porter
Camilo Ruiz
Claire Donnat
Emma Pierson
Jiaxuan You
Bowen Liu
Mohit Tiwari
Rex Ying
Baharan Mirzasoleiman
Marinka Zitnik
Michele Catasta
Srijan Kumar
Rok Sosic
Research Staff
Adrijan Bradaschia
Dan Jurafsky, Linguistics, Stanford University
David Grusky, Sociology, Stanford University
Stephen Boyd, Electrical Engineering, Stanford University
David Gleich, Computer Science, Purdue University
VS Subrahmanian, Computer Science, University of Maryland
Sarah Kunz, Medicine, Harvard University
Russ Altman, Medicine, Stanford University
Jochen Profit, Medicine, Stanford University
Eric Horvitz, Microsoft Research
Jon Kleinberg, Computer Science, Cornell University
Sendhil Mullainathan, Economics, Harvard University
Scott Delp, Bioengineering, Stanford University
James Zou, Medicine, Stanford University
Shantao Li
Hongwei Wang
Weihua Hu
Pan Li
References
§ Tutorial on Representation Learning on Networks at WWW 2018: http://snap.stanford.edu/proj/embeddings-www/
§ Inductive Representation Learning on Large Graphs. W. Hamilton, R. Ying, J. Leskovec. NIPS 2017.
§ Representation Learning on Graphs: Methods and Applications. W. Hamilton, R. Ying, J. Leskovec.IEEE Data Engineering Bulletin, 2017.
§ Graph Convolutional Neural Networks for Web-Scale Recommender Systems. R. Ying, R. He, K. Chen, P. Eksombatchai, W. L. Hamilton, J. Leskovec. KDD, 2018.
§ Modeling Polypharmacy Side Effects with Graph Convolutional Networks. M. Zitnik, M. Agrawal, J. Leskovec. Bioinformatics, 2018.
§ Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation. J. You, B. Liu, R. Ying, V. Pande, J. Leskovec, NeurIPS 2018.
§ Embedding Logical Queries on Knowledge Graphs. W. Hamilton, P. Bajaj, M. Zitnik, D. Jurafsky, J. Leskovec. NeurIPS, 2018.
§ How Powerful are Graph Neural Networks? K. Xu, W. Hu, J. Leskovec, S. Jegelka. ICLR 2019.
§ Code:
§ http://snap.stanford.edu/graphsage
§ http://snap.stanford.edu/decagon/
§ https://github.com/bowenliu16/rl_graph_generation
§ https://github.com/williamleif/graphqembed
§ https://github.com/snap-stanford/GraphRNN