Introduction to graph neural networks

Yueh-Hua Tu 杜岳華

2019.04.28

Outline

  • Brief history and some graph applications
  • Some definitions about graphs
  • Taxonomy of GNN
  • Tasks
  • Graph convolution networks
  • Graph attention networks
  • Graph autoencoder
  • Graph generative networks
  • Graph spatial-temporal networks
  • Graph network (GN) block

Brief history

1998 LeNet - LeCun

2003 Language model - Bengio

2005 GNN - Gori, Monfardini, Scarselli

2006 RBM - Hinton

2009 GNN ("The Graph Neural Network Model") - Scarselli, Gori, Monfardini

2012 AlexNet - Hinton

2013 Word2Vec - Google

2015 Deep learning - Hinton

Graph applications

  • Social network
  • Biological network (gene regulatory network, protein-protein interaction network)
  • Computer graphics
  • Traffic network
  • Knowledge graph

Internet

Traffic network

Global flight network

Social network

Biological pathway

KEGG

Protein-protein interaction network


Schizophrenia PPI network


Human gene coexpression network

Computer graphics

Knowledge graph

NLP and knowledge graph

PageRank

Network science

Definition

$$ G = (V, E) $$

$$ V = \{1, 2, 3, 4, 5, 6, 7, 8, 9\} $$

$$ E = \{(1, 2), (2, 3), (2, 7), (3, 4), (4, 7), (5, 6), (6, 7), (7, 8), (7, 9)\} $$

Adjacency matrix

$$ A = \begin{bmatrix} 0& 1& 0& 0& 0& 0& 0& 0& 0 \\ 1& 0& 1& 0& 0& 0& 1& 0& 0 \\ 0& 1& 0& 1& 0& 0& 0& 0& 0 \\ 0& 0& 1& 0& 0& 0& 1& 0& 0 \\ 0& 0& 0& 0& 0& 1& 0& 0& 0 \\ 0& 0& 0& 0& 1& 0& 1& 0& 0 \\ 0& 1& 0& 1& 0& 1& 0& 1& 1 \\ 0& 0& 0& 0& 0& 0& 1& 0& 0 \\ 0& 0& 0& 0& 0& 0& 1& 0& 0 \end{bmatrix} $$

Degree

$$ \deg(v_i) = \text{the number of edges incident to } v_i $$

e.g. $\deg(7) = 5$ in the example graph.

Degree matrix

$$ D = \begin{bmatrix} 1& 0& 0& 0& 0& 0& 0& 0& 0 \\ 0& 3& 0& 0& 0& 0& 0& 0& 0 \\ 0& 0& 2& 0& 0& 0& 0& 0& 0 \\ 0& 0& 0& 2& 0& 0& 0& 0& 0 \\ 0& 0& 0& 0& 1& 0& 0& 0& 0 \\ 0& 0& 0& 0& 0& 2& 0& 0& 0 \\ 0& 0& 0& 0& 0& 0& 5& 0& 0 \\ 0& 0& 0& 0& 0& 0& 0& 1& 0 \\ 0& 0& 0& 0& 0& 0& 0& 0& 1 \end{bmatrix} $$

Laplacian matrix

$$ L = D - A $$

For the example graph:

$$ L = \begin{bmatrix} 1& -1& 0& 0& 0& 0& 0& 0& 0 \\ -1& 3& -1& 0& 0& 0& -1& 0& 0 \\ 0& -1& 2& -1& 0& 0& 0& 0& 0 \\ 0& 0& -1& 2& 0& 0& -1& 0& 0 \\ 0& 0& 0& 0& 1& -1& 0& 0& 0 \\ 0& 0& 0& 0& -1& 2& -1& 0& 0 \\ 0& -1& 0& -1& 0& -1& 5& -1& -1 \\ 0& 0& 0& 0& 0& 0& -1& 1& 0 \\ 0& 0& 0& 0& 0& 0& -1& 0& 1 \end{bmatrix} $$

Symmetric normalized Laplacian matrix

$$ L^{\text{sym}} = D^{-\frac{1}{2}}LD^{-\frac{1}{2}} = I - D^{-\frac{1}{2}}AD^{-\frac{1}{2}} $$
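These matrices are easy to build directly. A minimal NumPy sketch for the example graph above (variable names are my own, not from any particular library):

```python
import numpy as np

# Edge list of the 9-node example graph (1-based node ids).
edges = [(1, 2), (2, 3), (2, 7), (3, 4), (4, 7),
         (5, 6), (6, 7), (7, 8), (7, 9)]
n = 9

# Adjacency matrix: symmetric because the graph is undirected.
A = np.zeros((n, n))
for u, v in edges:
    A[u - 1, v - 1] = A[v - 1, u - 1] = 1

# Degree matrix: node degrees on the diagonal.
D = np.diag(A.sum(axis=1))

# Laplacian and symmetric normalized Laplacian.
L = D - A
d_inv_sqrt = 1 / np.sqrt(A.sum(axis=1))
L_sym = np.eye(n) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

print(D[6, 6])  # degree of node 7: 5.0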

Graph signals and labels

$A$: adjacency matrix, $X$: input node features, $l$: node/edge labels

GNN and network embedding

Taxonomy of GNN

  • Graph convolution networks
  • Graph attention networks
  • Graph autoencoders
  • Graph generative networks
  • Graph spatial-temporal networks

Tasks

Node-level tasks

  • node regression, node classification
  • graph convolution layers produce each node's latent representation

Edge-level tasks

  • link prediction, edge classification
  • an additional function takes two nodes' latent representations (from graph convolution layers) as input to predict the link or edge label

Graph-level (global-level) tasks

  • graph classification
  • graph pooling summarizes node representations into a reduced, graph-level representation

Semi-supervised learning for node classification

Supervised learning for graph classification

Unsupervised learning for graph embedding

Convolution on graphs

Graph convolution networks

Spectral-based GCN

Spatial-based GCN

Spectral-based GCN

Single-node layer

$$ x_j^{t+1} = \sigma \Big( \sum_i U \Theta_{i,j}^t U^T x_i^t \Big) $$

$x_i^t$: the $i$-th input channel at layer $t$, $\Theta_{i,j}^t$: a diagonal matrix of learnable spectral filter parameters, so the fixed graph operator is replaced by a learned one $U \Theta_{i,j}^t U^T$.

Spectral CNN

$$ L = I - D^{-\frac{1}{2}}AD^{-\frac{1}{2}} = U \Lambda U^T $$

Graph Fourier transform

$$ \mathcal{F}(x) = U^Tx $$

The graph convolution of an input signal $x$ and a filter $g \in \mathbb{R}^N$ is defined as

$$ x *_G g = \mathcal{F}^{-1}(\mathcal{F}(x) \odot \mathcal{F}(g)) = U(U^Tx \odot U^Tg) = U g_{\theta} U^T x, \quad g_{\theta} = \text{diag}(U^Tg) $$
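Continuing with the same example graph, a small NumPy sketch of the graph Fourier transform and spectral convolution; the random signal and filter are placeholders:

```python
import numpy as np

# Rebuild the example graph's normalized Laplacian (see earlier sketch).
n = 9
edges = [(1, 2), (2, 3), (2, 7), (3, 4), (4, 7),
         (5, 6), (6, 7), (7, 8), (7, 9)]
A = np.zeros((n, n))
for u, v in edges:
    A[u - 1, v - 1] = A[v - 1, u - 1] = 1
d_inv_sqrt = 1 / np.sqrt(A.sum(axis=1))
L_sym = np.eye(n) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

# Eigendecomposition L_sym = U diag(lam) U^T; eigh since L_sym is symmetric.
lam, U = np.linalg.eigh(L_sym)

rng = np.random.default_rng(0)
x = rng.standard_normal(n)   # a graph signal: one value per node
g = rng.standard_normal(n)   # a filter, also defined on the nodes

# Convolution theorem on the graph: transform, multiply, inverse transform.
y = U @ ((U.T @ x) * (U.T @ g))

# Equivalent filter form: U g_theta U^T x with g_theta = diag(U^T g).
y2 = U @ np.diag(U.T @ g) @ U.T @ x
assert np.allclose(y, y2)
```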

Chebyshev spectral CNN (ChebNet)

For efficiency, ChebNet parameterizes the filter as a truncated Chebyshev polynomial expansion.

$$ g_{\theta} = \sum_{k=0}^{K} \theta_k T_k(\tilde{\Lambda}), \quad \tilde{\Lambda} = \frac{2\Lambda}{\lambda_{max}} - I $$
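A sketch of how the Chebyshev recurrence $T_k(x) = 2xT_{k-1}(x) - T_{k-2}(x)$ avoids the eigendecomposition: the filter is applied through repeated multiplications by the rescaled Laplacian. The function name, `theta`, and the default `lam_max = 2` (the upper bound of the normalized Laplacian's spectrum) are my choices; at least two coefficients are assumed:

```python
import numpy as np

def cheb_conv(L_sym, x, theta, lam_max=2.0):
    """ChebNet filtering: sum_k theta[k] * T_k(L_tilde) @ x, where
    L_tilde = 2 L / lam_max - I rescales eigenvalues into [-1, 1]."""
    n = L_sym.shape[0]
    L_tilde = 2.0 * L_sym / lam_max - np.eye(n)
    t_prev, t_curr = x, L_tilde @ x            # T_0(L)x and T_1(L)x
    out = theta[0] * t_prev + theta[1] * t_curr
    for k in range(2, len(theta)):
        # Chebyshev recurrence: T_k = 2 L_tilde T_{k-1} - T_{k-2}.
        t_prev, t_curr = t_curr, 2.0 * L_tilde @ t_curr - t_prev
        out = out + theta[k] * t_curr
    return out
```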

First-order ChebNet

First-order approximation of ChebNet

$$ x *_G g_{\theta} = \theta (I + D^{-\frac{1}{2}}AD^{-\frac{1}{2}})x \\ = \theta \tilde{A} x $$
$$ X^{t+1} = \tilde{A} X^t \Theta $$
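A minimal sketch of the resulting propagation rule. Note it uses the renormalization trick from the GCN paper, $\tilde{A} = \tilde{D}^{-\frac{1}{2}}(A + I)\tilde{D}^{-\frac{1}{2}}$, rather than $I + D^{-\frac{1}{2}}AD^{-\frac{1}{2}}$ verbatim; the function and variable names are illustrative:

```python
import numpy as np

def gcn_layer(A, X, W):
    """One first-order graph convolution sigma(A_tilde @ X @ W).
    Uses the renormalization trick A_tilde = D^{-1/2} (A + I) D^{-1/2},
    which improves numerical stability over I + D^{-1/2} A D^{-1/2}."""
    n = A.shape[0]
    A_hat = A + np.eye(n)                      # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_tilde = d_inv_sqrt[:, None] * A_hat * d_inv_sqrt[None, :]
    return np.maximum(A_tilde @ X @ W, 0.0)    # sigma = ReLU
```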

Spatial-based GCN

Recurrent-based GCN (spatial)

Recurrent-based methods apply the same graph convolution layer repeatedly to update hidden representations.

Composition-based GCN (spatial)

Composition-based methods stack different graph convolution layers to update hidden representations.

Recurrent-based GCN

Graph neural network (GNN)

$$ x_i(t+1) = f(l_i, l_{co[i]}, x_{ne[i]}(t), l_{ne[i]}) \\ o_i(t) = g(x_i(t), l_i) $$

$x_i$: state of node $i$, $l_i$: label of node $i$, $l_{co[i]}$: labels of its incident edges, $x_{ne[i]}$, $l_{ne[i]}$: states and labels of its neighbors

Composition-based GCN

Message passing neural networks (MPNN)


Message passing phase

$$ m_v^{t+1} = \sum_{u \in N(v)} M_t(h_v^t, h_u^t, e_{uv}) \\ h_v^{t+1} = U_t(h_v^t, m_v^{t+1}) $$

$h_v^t$: vertex features, $e_{uv}$: edge features, $N(v)$: neighbors of vertex $v$

Readout phase

$$ \hat{y} = R(\{h_v^T \mid v \in G\}) $$
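A toy sketch of one message-passing step and a sum readout. `M` and `U` stand in for the learned message and update functions (e.g. small MLPs); edge features $e_{uv}$ are omitted for brevity and node ids are assumed 0-based:

```python
import numpy as np

def mpnn_step(h, edges, M, U):
    """One message-passing step. M(h_v, h_u) computes a message and
    U(h_v, m_v) updates the state; edge features are omitted here."""
    n = len(h)
    m = [np.zeros_like(h[0]) for _ in range(n)]
    for u, v in edges:             # undirected: messages flow both ways
        m[v] += M(h[v], h[u])
        m[u] += M(h[u], h[v])
    return [U(h[i], m[i]) for i in range(n)]

def readout(h):
    """Readout phase: a permutation-invariant function of all node states."""
    return np.sum(h, axis=0)
```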

Graph pooling module

mean/max/sum pooling

Reduces the number of vertices of a graph.
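For global pooling, each variant is a one-liner; `H` below is a placeholder node-feature matrix:

```python
import numpy as np

H = np.random.randn(9, 16)   # node representations: 9 nodes, 16 features

h_mean = H.mean(axis=0)   # mean pooling
h_max = H.max(axis=0)     # max pooling
h_sum = H.sum(axis=0)     # sum pooling; each yields one graph-level vector
```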

Graph attention networks

Graph attention network (GAT)

$$ h_i^t = \sigma \Big( \sum_{j \in \mathcal{N}(i)} \alpha (h_i^{t-1}, h_j^{t-1}) W^{t-1} h_j^{t-1} \Big) $$

Multi-head attention

$$ h_i^t = \Vert_{k=1}^K \sigma \Big( \sum_{j \in \mathcal{N}(i)} \alpha_k (h_i^{t-1}, h_j^{t-1}) W_k^{t-1} h_j^{t-1} \Big) $$
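A single-head sketch of the attention computation (a multi-head layer would concatenate $K$ copies of this). The dense masking and the choice of ReLU for $\sigma$ are simplifications of mine:

```python
import numpy as np

def gat_layer(A, H, W, a, slope=0.2):
    """Single-head graph attention: e_ij = LeakyReLU(a^T [W h_i || W h_j]),
    normalized with a softmax over the neighborhood of node i."""
    HW = H @ W                                   # projected features (n, f')
    n, f = HW.shape
    # Split a into source/target halves: a^T [x || y] = a_s^T x + a_t^T y.
    e = (HW @ a[:f])[:, None] + (HW @ a[f:])[None, :]
    e = np.where(e > 0, e, slope * e)            # LeakyReLU
    mask = A + np.eye(n)                         # attend to neighbors and self
    e = np.where(mask > 0, e, -np.inf)
    att = np.exp(e - e.max(axis=1, keepdims=True))
    att = att / att.sum(axis=1, keepdims=True)   # row-wise softmax
    return np.maximum(att @ HW, 0.0)             # sigma = ReLU
```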


Graph autoencoder

Graph autoencoder (GAE)

$$ Z = GCN(X, A) = \tilde{A} \ \text{ReLU}(\tilde{A}XW_0) \ W_1 \\ \hat{A} = \sigma (ZZ^T) $$

Variational graph autoencoder (VGAE)

$$ \mu = GCN_{\mu}(X, A) = \tilde{A} \ \text{ReLU}(\tilde{A}XW_0) \ W_{\mu} \\ \log \sigma = GCN_{\sigma}(X, A) = \tilde{A} \ \text{ReLU}(\tilde{A}XW_0) \ W_{\sigma} \\ q(Z \mid X, A) = \prod_{i=1}^N \mathcal{N}(z_i \mid \mu_i, diag(\sigma_i^2)) \\ p(A \mid Z) = \sigma (ZZ^T) $$
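A sketch of the VGAE forward pass under the equations above; `A_tilde` is the normalized adjacency and the weight matrices are assumed given:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def vgae_forward(A_tilde, X, W0, W_mu, W_sigma, rng):
    """VGAE forward pass: shared first GCN layer, separate second layers
    for mu and log(sigma); decode edge probabilities as sigmoid(Z Z^T)."""
    hidden = np.maximum(A_tilde @ X @ W0, 0.0)            # ReLU GCN layer
    mu = A_tilde @ hidden @ W_mu
    log_sigma = A_tilde @ hidden @ W_sigma
    # Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I).
    Z = mu + np.exp(log_sigma) * rng.standard_normal(mu.shape)
    return Z, sigmoid(Z @ Z.T)            # latent codes, reconstructed edges
```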

Graph generative networks

MolGAN

An implicit generative model (GAN) for small molecular graphs.

Graph spatial-temporal networks

Diffusion convolutional recurrent neural network (DCRNN)


Diffusion convolution + sequence-to-sequence architecture + scheduled sampling.
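A sketch of the diffusion convolution at the core of DCRNN, assuming a directed adjacency matrix in which every node has at least one in- and out-edge; the seq2seq and scheduled-sampling parts are omitted:

```python
import numpy as np

def diffusion_conv(A, X, theta_fwd, theta_bwd):
    """Bidirectional diffusion convolution on a directed graph:
    sum_k theta_fwd[k] (D_O^{-1} A)^k X + theta_bwd[k] (D_I^{-1} A^T)^k X."""
    P_fwd = A / A.sum(axis=1, keepdims=True)       # out-degree normalized
    P_bwd = A.T / A.T.sum(axis=1, keepdims=True)   # in-degree normalized
    out = np.zeros_like(X)
    Zf, Zb = X.copy(), X.copy()                    # k = 0 terms: X itself
    for tf, tb in zip(theta_fwd, theta_bwd):
        out = out + tf * Zf + tb * Zb
        Zf, Zb = P_fwd @ Zf, P_bwd @ Zb            # one more diffusion step
    return out
```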


Relational inductive bias

Graph as model of relations

Graph network (GN) block

Graph network

Update of graph network

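From the cited "Relational inductive biases" paper, the GN block update proceeds in three stages: per-edge updates, per-node updates using aggregated incident edges, and a global update.

$$ e_k' = \phi^e(e_k, v_{r_k}, v_{s_k}, u) $$

$$ v_i' = \phi^v(\bar{e}_i', v_i, u), \quad \bar{e}_i' = \rho^{e \to v}(\{e_k' : r_k = i\}) $$

$$ u' = \phi^u(\bar{e}', \bar{v}', u) $$

$\phi^e, \phi^v, \phi^u$: update functions (e.g. MLPs), $\rho$: permutation-invariant aggregations (e.g. sum), $r_k, s_k$: receiver and sender of edge $k$.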

Thank you for your attention

Reference papers

  • A Comprehensive Survey on Graph Neural Networks
  • Relational inductive biases, deep learning, and graph networks
  • Geometric deep learning: going beyond Euclidean data
  • The Graph Neural Network Model
  • Variational Graph Auto-Encoders
  • Neural Message Passing for Quantum Chemistry
  • Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting
  • Graph Attention Networks
  • MolGAN: An implicit generative model for small molecular graphs