Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph...

66
Graph Neural Networks: A Review of Methods and Applications Jie Zhou * , Ganqu Cui * , Zhengyan Zhang * , Cheng Yang, Zhiyuan Liu, Maosong Sun Tsinghua University arxiv 2019 https://qdata.github.io/deep2Read Presenter : Jack Lanchantin 1 / 60

Transcript of Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph...

Page 1: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Graph Neural Networks:A Review of Methods and Applications

Jie Zhou∗, Ganqu Cui∗, Zhengyan Zhang∗, Cheng Yang, Zhiyuan Liu,Maosong Sun

Tsinghua Universityarxiv 2019

https://qdata.github.io/deep2Read

Presenter : Jack Lanchantin

1 / 60

Page 2: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Outline

1 Introduction

2 Graph Neural Networks

3 Graph Variants

4 Objective Variants

5 Propagation VariantsConvolution

SpectralSpatial

GatingAttentionSkip Connections

6 Training Variants

7 Applications and Datasets

2 / 60

Page 3: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Graphs

Type of data structure which models a set of objects (vertices) and theirrelationships (edges).

3 / 60

Page 4: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Graphs

Type of data structure which models a set of objects (vertices) and theirrelationships (edges).

3 / 60

Page 5: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Euclidean Data: RNNs/CNNs

4 / 60

Page 6: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Non-Euclidean Data

5 / 60

Page 7: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

How to extend CNNs to graph-structured data?

Assumption: Non-Euclidean data are locally stationary and manifesthierarchical structures.

But, how to define compositionality on graphs? (convolution andpooling on graphs)

6 / 60

Page 8: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Outline

1 Introduction

2 Graph Neural Networks

3 Graph Variants

4 Objective Variants

5 Propagation VariantsConvolution

SpectralSpatial

GatingAttentionSkip Connections

6 Training Variants

7 Applications and Datasets

7 / 60

Page 9: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Learning on Graphs

Naive approach: a per-node classifier

Represent each node v as vector hv ∈ Rs

Completely drop the graph structure, and classify each nodeindividually, with a shared deep neural network classifier:

ov = f (hv ; W) (1)

Methods like DeepWalk also classify each node independently, butinject graph structure indirectly using learned embeddings

8 / 60

Page 10: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Learning on Graphs

Naive approach: a per-node classifier

Represent each node v as vector hv ∈ Rs

Completely drop the graph structure, and classify each nodeindividually, with a shared deep neural network classifier:

ov = f (hv ; W) (1)

Methods like DeepWalk also classify each node independently, butinject graph structure indirectly using learned embeddings

8 / 60

Page 11: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Learning on Graphs(from Petar Velickovic)

Most deep learning is done this way, even if there are relationshipsbetween training examples

9 / 60

Page 12: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Graph Neural Networks

Originally introduced by Scarselli et. al. (2009)

The goal of GNNs is to learn a state embedding hv ∈ Rs whichcontains information of the neighborhood for each node

hv can be used to produce an output ov such as the node label

10 / 60

Page 13: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Graph Neural Networks

Local transition function f : updates each node state according to itsneighborhood (shared among all nodes)

Output function g : describes how the output is produced

Then, hv and ov are defined as follows:

hv = f (xv , xe(v),hn(v), xn(v)) (2)

ov = g(hv , xv ) (3)

where xv are the features of v , xe(v) the features of its edges, hn(v)

the states of the nodes in the neighborhood of v , and xn(v) thefeatures of the nodes in the neighborhood of v

11 / 60

Page 14: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Graph Neural Networks

Local transition function f : updates each node state according to itsneighborhood (shared among all nodes)

Output function g : describes how the output is produced

Then, hv and ov are defined as follows:

hv = f (xv , xe(v),hn(v), xn(v)) (2)

ov = g(hv , xv ) (3)

where xv are the features of v , xe(v) the features of its edges, hn(v)

the states of the nodes in the neighborhood of v , and xn(v) thefeatures of the nodes in the neighborhood of v

11 / 60

Page 15: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Graph Neural Networks Update Step

GNNs use the following iterative update to compute the state:

Ht+1 = f (Ht ,X) (4)

where Ht denotes the t-th iteration of H.

12 / 60

Page 16: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Graph Neural Networks

13 / 60

Page 17: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Graph Neural Network Training

Given a GNN framework, learn the parameters of f and g

With the target information (tv for a specific node) for thesupervision, the loss can be written as:

loss =

p∑i=1

(ti − oi ) (5)

where p is the number of supervised nodes.

Weights W of f and g are updated via gradient descent

14 / 60

Page 18: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Graph Neural Networks Training

Original GNN constrains f to be a contractive map

Contractive map, on a metric space (M, d) is a function f from M toitself, with the property that there is some real number 0 ≤ k < 1 suchthat for all x and y in M, d(f (x), f (y)) ≤ k d(x , y)

This implies that the hv vectors will always converge to a unique fixedpoint (very restrictive)

Impossible to inject problem-specific information into h0v (will alwaysconverge to same value regardless of initialization)

15 / 60

Page 19: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Graph Neural Networks Training

Original GNN constrains f to be a contractive map

Contractive map, on a metric space (M, d) is a function f from M toitself, with the property that there is some real number 0 ≤ k < 1 suchthat for all x and y in M, d(f (x), f (y)) ≤ k d(x , y)

This implies that the hv vectors will always converge to a unique fixedpoint (very restrictive)

Impossible to inject problem-specific information into h0v (will alwaysconverge to same value regardless of initialization)

15 / 60

Page 20: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Graph Neural Networks (compact representation)

Let H, O, X, and XN be the vectors constructed by stacking all the states,all the outputs, all the features, and all the node features, respectively.Then we have a compact form as:

H = f (H,X) (6)

O = g(H,XN) (7)

16 / 60

Page 21: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Outline

1 Introduction

2 Graph Neural Networks

3 Graph Variants

4 Objective Variants

5 Propagation VariantsConvolution

SpectralSpatial

GatingAttentionSkip Connections

6 Training Variants

7 Applications and Datasets

17 / 60

Page 22: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Directed Graphs

e.g. Knowledge graphs where there is a parent and child.

Can use different weights for each edge type

18 / 60

Page 23: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Heterogeneous Graphs

e.g. Multi-model biological network data

Can group neighbors based on types or distance

19 / 60

Page 24: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Graphs with Edge Information

e.g. Drug interaction data with different types of interactions (edges).Each edge also has its information like the weight or the type of theedge.

Can use different types of weight matrices

20 / 60

Page 25: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Graph Types

21 / 60

Page 26: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Outline

1 Introduction

2 Graph Neural Networks

3 Graph Variants

4 Objective Variants

5 Propagation VariantsConvolution

SpectralSpatial

GatingAttentionSkip Connections

6 Training Variants

7 Applications and Datasets

22 / 60

Page 27: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Objectives

Node-level outputs: prediction for each node in the graph (e.g.classify each person in a social network)

Edge-level outputs: prediction for each edge in the graph (e.g.classify each friendship in a social network)

Graph-level outputs; prediction on the graph as a whole (e.g.classify an entire group in a social network)

23 / 60

Page 28: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Objective Frameworks

Supervised learning for node, edge, or graph level classification:Given full labeled networks, predict target objective

Semi-supervised learning for node or edge level classification:Given a single network with partial nodes being labeled and othersremaining unlabeled

Unsupervised learning for graph embedding: When no class labelsare available in graphs, we can learn the graph embedding in a purelyunsupervised way

24 / 60

Page 29: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Outline

1 Introduction

2 Graph Neural Networks

3 Graph Variants

4 Objective Variants

5 Propagation VariantsConvolution

SpectralSpatial

GatingAttentionSkip Connections

6 Training Variants

7 Applications and Datasets

25 / 60

Page 30: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Propagation Variants

The propagation (update) step and output step are the crucialcomponents to obtain the hidden states of nodes (or edges).

H = f (H,X)

Variants utilize different aggregators to gather information from eachnode’s neighbors and updaters to update nodes’ hidden states

26 / 60

Page 31: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Outline

1 Introduction

2 Graph Neural Networks

3 Graph Variants

4 Objective Variants

5 Propagation VariantsConvolution

SpectralSpatial

GatingAttentionSkip Connections

6 Training Variants

7 Applications and Datasets

27 / 60

Page 32: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Convolution

Advances in this direction are often categorized into either spectral orspatial approaches

28 / 60

Page 33: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Spectral Convolution Methods

Define a convolutional operation by operating on the graph in thespectral domain, leveraging the convolution theorem

These approaches utilise the graph Laplacian matrix, L, defined as L= D− A, where D is the degree matrix (diagonal matrix with Dii =deg(i)) and A is the adjacency matrix.

29 / 60

Page 34: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Limitations of Spectral Methods

Poor generalization to new/different graphs

Graphs with variable size: spectral techniques work with fixed sizegraphs.

Directed graphs: definition of directed graph Laplacian is unclear.

30 / 60

Page 35: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Transductive Learning(from Petar Velickovic)

Training algorithm sees all features (including test nodes)31 / 60

Page 36: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Inductive Learning(from Petar Velickovic)

Algorithm does not have access to all nodes upfront. Two variations:1 Test nodes are (incrementally) inserted into training graphs2 Test graphs are disjoint and completely unseen

Requires generalizing across arbitrary graph structures, thus manytransductive methods are useless

32 / 60

Page 37: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Spectral vs Spatial

33 / 60

Page 38: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Spatial Convolution Methods

Spatial approaches define convolutions directly on the graph,operating on spatially close neighbors

Major challenge: defining convolution with different sizedneighborhoods and maintaining the local invariance of CNNs

34 / 60

Page 39: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Spatial Convolution MethodsDuvenaud GCN - Duvenaud et. al., 2015

Different weight matrices are used for nodes with different degrees,

x = hv +Nv∑i=1

hi

h′v = σ(

xWNvL

) (8)

where WNvL is the weight matrix for nodes with degree Nv at layer L

Problem: doesn’t scale to graphs with very wide degree distributions

35 / 60

Page 40: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Spatial Convolution MethodsDiffusion-convolutional neural networks (DCNNs) - Atwood et. al.,2016

Transition matrices are used to define the neighborhood for nodes inDCNN. For node classification, it has

H = f (Wc � P∗X) (9)

X is an N × F matrix of input features (N is the number of nodesand F is the number of features)

P∗ is an N × K × N tensor which contains the power series {P,P2,..., PK} of matrix P (where P is the degree- normalized transitionmatrix from the adjacency matrix A)

P∗ transforms each node into a K × F diffusion convolutionalrepresentation

36 / 60

Page 41: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Spatial Convolution MethodsGraphSAGE - Hamilton et. al.,2017

General inductive framework

Restricts every degree to be the same (by sampling a fixed-size set anode’s local neighborhood, during both training and inference)

htNv

= AGGREGATEt

({ht−1

u ,∀u ∈ Nv})

htv = σ

(Wt · [ht−1

v ‖htNv

]) (10)

Key is that Wt is not shared across time steps T

37 / 60

Page 42: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Spatial Convolution MethodsGraphSAGE - Hamilton et. al.,2017

General inductive framework

Restricts every degree to be the same (by sampling a fixed-size set anode’s local neighborhood, during both training and inference)

htNv

= AGGREGATEt

({ht−1

u ,∀u ∈ Nv})

htv = σ

(Wt · [ht−1

v ‖htNv

]) (10)

Key is that Wt is not shared across time steps T

37 / 60

Page 43: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Spatial Methods: Summary

38 / 60

Page 44: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Outline

1 Introduction

2 Graph Neural Networks

3 Graph Variants

4 Objective Variants

5 Propagation VariantsConvolution

SpectralSpatial

GatingAttentionSkip Connections

6 Training Variants

7 Applications and Datasets

39 / 60

Page 45: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

GatingGated Graph Neural Networks (GGNNs) - Li et al. 2016

Extension of original GNN (Scarselli et. al. 2009)

Propagate for T steps, but do not restrict the propagation model tobe contractive

Use gating in the propagation step to alleviate gradient issues

40 / 60

Page 46: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Gating

Initial idea: update htv using neighborhood summation and tanh

atv = b +

∑j∈Nv

ht−1j (11)

htv = tanh(Wat

v ) (12)

Now, extend this to incorporate gating mechanisms, to prevent fulloverwrite of ht−1

v by htv

41 / 60

Page 47: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

GatingGated Graph Neural Networks (GGNNs) - Li et al. 2016

Gating mechanism defined similar to LSTM:

atv = b +

∑j∈Nv

ht−1j (13)

ztv = σ(Wzat

v + Uzht−1v

)(14)

rtv = σ(Wrat

v + Urht−1v

)(15)

h̃tv = tanh

(Wat

v + U(rtv � ht−1

v

))(16)

htv =

(1− ztv

)� ht−1

v + ztv � h̃tv (17)

Where � is element-wise multiplication, and zv and rv are the update andreset vectors, respectively

42 / 60

Page 48: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Outline

1 Introduction

2 Graph Neural Networks

3 Graph Variants

4 Objective Variants

5 Propagation VariantsConvolution

SpectralSpatial

GatingAttentionSkip Connections

6 Training Variants

7 Applications and Datasets

43 / 60

Page 49: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

AttentionGraph attention network (GAT) - Velickovic et. al., 2017

Gating mechanisms are designed for data that changes sequentially;however, our graphs have static features

GAT incorporates attention mechanisms into the propagation step

44 / 60

Page 50: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Attention

GAT uses a self-attention mechanism to compute the hidden states ofeach node by attending over itself and its neighbors

45 / 60

Page 51: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Attention

GAT uses a self-attention mechanism to compute the hidden states ofeach node by attending over itself and its neighbors

45 / 60

Page 52: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

AttentionGraph attention network (GAT) - Velickovic et. al., 2017

evj = a(Wahv ,Wbhj) (18)

αvj =exp(evj)∑

k∈Nvexp(eik)

(19)

hv = σ

( ∑j∈Nv

αvjWzhj

)(20)

46 / 60

Page 53: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Attention

47 / 60

Page 54: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Gating and Attention

48 / 60

Page 55: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Outline

1 Introduction

2 Graph Neural Networks

3 Graph Variants

4 Objective Variants

5 Propagation VariantsConvolution

SpectralSpatial

GatingAttentionSkip Connections

6 Training Variants

7 Applications and Datasets

49 / 60

Page 56: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Skip ConnectionsHighway GCN - Rahimi et. al.,2018

Uses uses layer-wise gates. The output of a layer is summed with its inputwith gating weights (inspired by Highway nets)

T(ht) = σ(Wtht + bt

)ht+1 = ht+1 � T(ht) + ht � (1− T(ht))

(21)

50 / 60

Page 57: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Skip ConnectionsJump Knowledge Network (JKN) - Xu et. al., 2018

With neighborhood aggregation, the receptive field of each nodegrows exponentially w.r.t. the number of layers (steps) TThe Jump Knowledge Network selects from all of the intermediaterepresentations for each node at the last layer

Allows the model adapt the effective neighborhood size for each nodeas needed

51 / 60

Page 58: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Propagation Methods: Summary

52 / 60

Page 59: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Outline

1 Introduction

2 Graph Neural Networks

3 Graph Variants

4 Objective Variants

5 Propagation VariantsConvolution

SpectralSpatial

GatingAttentionSkip Connections

6 Training Variants

7 Applications and Datasets

53 / 60

Page 60: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Training Methods

GraphSAGE solved the problems of the original GCN by replacing fullgraph Laplacian with learnable aggregation functions, which are keyto generalize to unseen nodes.

In addition, GraphSAGE uses neighbor sampling to alleviate receptivefield expansion

54 / 60

Page 61: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Training Methods

Chen et. al (2018) proposed a control-variate based stochasticapproximation algorithms by utilizing the historical activations ofnodes as a control variate.

This limits the receptive field in the 1-hop neighborhood, but isefficient

55 / 60

Page 62: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Training Methods

Li et. al. (2018) note that GNNs requires many additional labeleddata for validation and also suffers from the localized nature of theconvolutional filter

To solve the limitations, the authors propose a method to find thenearest neighbors of training data and a boosting-like method

56 / 60

Page 63: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Training Methods

57 / 60

Page 64: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Outline

1 Introduction

2 Graph Neural Networks

3 Graph Variants

4 Objective Variants

5 Propagation VariantsConvolution

SpectralSpatial

GatingAttentionSkip Connections

6 Training Variants

7 Applications and Datasets

58 / 60

Page 65: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Application Areas

59 / 60

Page 66: Graph Neural Networks: A Review of Methods and Applications€¦ · Unsupervised learning for graph embedding: When no class labels are available in graphs, we can learn the graph

Commonly Used Datasets

60 / 60