Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering
M. Defferrard¹, X. Bresson², P. Vandergheynst¹
¹ EPFL, Lausanne, Switzerland
² Nanyang Technological University, Singapore
Itay Boneh, Asher Kabakovitch
Tel-Aviv University, Deep Learning Seminar
2017
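Spectral Filtering: Non-parametric
From the convolution theorem, with $\hat{g} = U^T g$:
$$x *_G g = U\left(U^T g \odot U^T x\right) = U\left(\hat{g} \odot U^T x\right)$$
Or algebraically:
$$x *_G g = U \begin{bmatrix} \hat{g}_1 & & 0 \\ & \ddots & \\ 0 & & \hat{g}_n \end{bmatrix} U^T x$$
- Not localized
- $O(n)$ parameters to train
- $O(n^2)$ multiplications (no FFT)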
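
As a minimal sketch of the non-parametric filter above, the following NumPy snippet filters a signal through the eigenbasis of a graph Laplacian. The ring graph, the helper names `graph_laplacian` and `spectral_filter`, and the use of the combinatorial Laplacian $L = D - W$ are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def graph_laplacian(W):
    """Combinatorial Laplacian L = D - W of a symmetric weight matrix W (assumed choice)."""
    return np.diag(W.sum(axis=1)) - W

def spectral_filter(W, x, g_hat):
    """Apply y = U diag(g_hat) U^T x, with one free coefficient per eigenvalue."""
    lam, U = np.linalg.eigh(graph_laplacian(W))  # L = U diag(lam) U^T
    x_hat = U.T @ x                              # graph Fourier transform, O(n^2)
    return U @ (g_hat * x_hat)                   # filter, then inverse transform, O(n^2)

# Hypothetical toy graph: a ring on 6 nodes.
n = 6
W = np.zeros((n, n))
for i in range(n):
    W[i, (i + 1) % n] = W[(i + 1) % n, i] = 1.0

x = np.arange(n, dtype=float)

# An all-ones filter leaves the signal unchanged.
y_identity = spectral_filter(W, x, np.ones(n))

# Using the eigenvalues themselves as coefficients reproduces L @ x.
lam = np.linalg.eigvalsh(graph_laplacian(W))
y_laplacian = spectral_filter(W, x, lam)
```

The vector `g_hat` has $n$ entries, which is the $O(n)$ parameter count listed above, and the two dense products with `U` account for the $O(n^2)$ cost.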
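Spectral Filtering: Polynomial Parametrization
$g$ is a continuous 1-D function, parametrized by $\theta$:
$$x *_G g = U\left(g_\theta(\lambda) \odot U^T x\right) = U \begin{bmatrix} g_\theta(\lambda_1) & & 0 \\ & \ddots & \\ 0 & & g_\theta(\lambda_n) \end{bmatrix} U^T x = U g_\theta(\Lambda) U^T x = g_\theta(L)x$$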
$T\phi = \phi\Lambda \;\Rightarrow\; f(T) = \phi f(\Lambda)\phi^{-1}$
Polynomial parametrization:
$$g_\theta(\Lambda) = \sum_{k=0}^{K-1} \theta_k \Lambda^k$$
- K-localized: each application of $L$ reaches one hop further
- $K$ parameters to train, independent of the graph's size
- Still $O(n^2)$ multiplications (products with the basis $U$)
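
The localization property can be checked numerically. The sketch below applies a polynomial filter with $K = 3$ coefficients to an impulse on a path graph and records where the response is non-zero. The path graph, the coefficient values, and the function name `polynomial_filter` are assumptions chosen for illustration.

```python
import numpy as np

def polynomial_filter(L, x, theta):
    """Compute y = sum_k theta[k] * L^k x using repeated products with L."""
    y = np.zeros_like(x, dtype=float)
    Lk_x = x.astype(float)        # L^0 x
    for t in theta:
        y += t * Lk_x
        Lk_x = L @ Lk_x           # next power of L applied to x
    return y

# Hypothetical toy graph: a path on 9 nodes with unit weights.
n = 9
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
L = np.diag(W.sum(axis=1)) - W

delta = np.zeros(n)
delta[4] = 1.0                        # impulse at the middle node
theta = np.array([1.0, 0.5, 0.25])    # K = 3 coefficients (arbitrary values)

y = polynomial_filter(L, delta, theta)
support = np.flatnonzero(np.abs(y) > 1e-12)   # nodes with a non-zero response
```

With these values the response is confined to nodes 2 through 6, that is, within $K - 1 = 2$ hops of the impulse, and the number of coefficients does not depend on $n$.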
Spectral Filtering: Polynomial Parametrization
[Figure: impulse response on a 2D Euclidean domain]
[Figure: impulse response on a graph]
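Spectral Filtering: Recursive Polynomial Parametrization
Chebyshev polynomials: $T_k(\lambda) = 2\lambda T_{k-1}(\lambda) - T_{k-2}(\lambda)$, with $T_0(\lambda) = 1$ and $T_1(\lambda) = \lambda$.
Parametrization:
$$g_\theta(L) = \sum_{k=0}^{K-1} \theta_k T_k(\tilde{L})$$
$$\tilde{L} = 2L\lambda_n^{-1} - I$$
The rescaling maps the eigenvalues of $L$ from $[0, \lambda_n]$ to $[-1, 1]$, the interval on which the Chebyshev polynomials form an orthogonal basis.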
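Spectral Filtering: Recursive Polynomial Parametrization
Filtering:
$$y = g_\theta(L)x = \sum_{k=0}^{K-1} \theta_k T_k(\tilde{L})x = \sum_{k=0}^{K-1} \theta_k \bar{x}_k$$
Recurrence: $\bar{x}_k = T_k(\tilde{L})x = 2\tilde{L}\bar{x}_{k-1} - \bar{x}_{k-2}$, with $\bar{x}_0 = x$ and $\bar{x}_1 = \tilde{L}x$, where $\tilde{L} = 2L\lambda_n^{-1} - I$ is the rescaled Laplacian.
- K-localized: each application of $L$ reaches one hop further
- $K$ parameters to train, independent of the graph's size
- $O(Kn)$ multiplications (more precisely $O(K|\mathcal{E}|)$, since each product with the sparse $\tilde{L}$ costs $O(|\mathcal{E}|)$)
- No EVD of $L$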
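
The recurrence can be sketched in a few lines of NumPy. The function name `chebyshev_filter`, the ring graph, and the coefficient values are assumptions for illustration. The eigendecomposition in the example is used only to obtain $\lambda_n$ and a reference result for this toy graph; the filter itself needs only products with $\tilde{L}$.

```python
import numpy as np
from numpy.polynomial.chebyshev import chebval

def chebyshev_filter(L, x, theta, lmax):
    """Compute y = sum_k theta[k] * T_k(L_tilde) x, with L_tilde = 2 L / lmax - I."""
    L_tilde = 2.0 * L / lmax - np.eye(L.shape[0])
    x_prev = x.astype(float)                  # x_0 = x
    y = theta[0] * x_prev
    if len(theta) > 1:
        x_curr = L_tilde @ x_prev             # x_1 = L_tilde x
        y = y + theta[1] * x_curr
        for k in range(2, len(theta)):
            # x_k = 2 L_tilde x_{k-1} - x_{k-2}
            x_next = 2.0 * (L_tilde @ x_curr) - x_prev
            y = y + theta[k] * x_next
            x_prev, x_curr = x_curr, x_next
    return y

# Hypothetical toy graph: a ring on 8 nodes.
n = 8
W = np.zeros((n, n))
for i in range(n):
    W[i, (i + 1) % n] = W[(i + 1) % n, i] = 1.0
L = np.diag(W.sum(axis=1)) - W

lam, U = np.linalg.eigh(L)        # only for lmax and the reference below
rng = np.random.default_rng(0)
x = rng.standard_normal(n)
theta = np.array([0.5, -0.3, 0.2, 0.1])   # K = 4 coefficients (arbitrary values)

y = chebyshev_filter(L, x, theta, lam[-1])

# Reference: evaluate the same Chebyshev series on the rescaled eigenvalues.
y_ref = U @ (chebval(2.0 * lam / lam[-1] - 1.0, theta) * (U.T @ x))
```

The dense matrix here keeps the sketch short. With a sparse $\tilde{L}$, each of the $K - 1$ products touches only the non-zero entries, which is where the $O(K|\mathcal{E}|)$ cost comes from.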
Graph CNN
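Graph CNN: Learning Filters
$$y_{s,j} = \sum_{i=1}^{F_{in}} g_{\theta_{i,j}}(L)\,x_{s,i}$$
$$\frac{\partial \mathcal{L}}{\partial \theta_{i,j}} = \sum_{s=1}^{S} \left[\bar{x}_{s,i,0}, \ldots, \bar{x}_{s,i,K-1}\right]^T \frac{\partial \mathcal{L}}{\partial y_{s,j}}$$
$$\frac{\partial \mathcal{L}}{\partial x_{s,i}} = \sum_{j=1}^{F_{out}} g_{\theta_{i,j}}(L)\,\frac{\partial \mathcal{L}}{\partial y_{s,j}}$$
where $\bar{x}_{s,i,k} = T_k(\tilde{L})x_{s,i}$ and:
- $s = 1, \ldots, S$: sample index
- $i = 1, \ldots, F_{in}$: input feature map index
- $j = 1, \ldots, F_{out}$: output feature map index
- $\theta_{i,j}$: the $F_{in} \times F_{out}$ vectors of Chebyshev coefficients, each of order $K$
- $\mathcal{L}$: loss over a mini-batch of $S$ samples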
Cost: $O(K|\mathcal{E}|F_{in}F_{out}S)$ operations
Easily parallelized
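
The forward pass and both gradients can be written as tensor contractions over a precomputed Chebyshev basis. The sketch below is one possible arrangement, not the authors' implementation: the array layouts, the function names, the path graph, and the squared-error loss (for which $\partial\mathcal{L}/\partial y = y$) are all assumptions.

```python
import numpy as np

def chebyshev_basis(L_tilde, X, K):
    """Stack T_k(L_tilde) x for k < K. X has shape (S, F, n); result is (S, F, K, n)."""
    basis = [X]
    if K > 1:
        basis.append(X @ L_tilde)   # L_tilde is symmetric, so this applies it to every signal
    for _ in range(2, K):
        basis.append(2.0 * (basis[-1] @ L_tilde) - basis[-2])
    return np.stack(basis, axis=2)

def forward(Xbar, theta):
    """y[s, j] = sum_i sum_k theta[i, j, k] * Xbar[s, i, k]."""
    return np.einsum('sikn,ijk->sjn', Xbar, theta)

def grad_theta(Xbar, dY):
    """dLoss/dtheta[i, j, k] = sum_s <Xbar[s, i, k], dY[s, j]>."""
    return np.einsum('sikn,sjn->ijk', Xbar, dY)

def grad_input(L_tilde, dY, theta):
    """dLoss/dx[s, i] = sum_j g_theta[i, j](L) dY[s, j]."""
    dYbar = chebyshev_basis(L_tilde, dY, theta.shape[2])
    return np.einsum('sjkn,ijk->sin', dYbar, theta)

# Hypothetical sizes and a path graph on n nodes.
S, Fin, Fout, K, n = 2, 3, 4, 3, 5
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
L = np.diag(W.sum(axis=1)) - W
L_tilde = 2.0 * L / np.linalg.eigvalsh(L)[-1] - np.eye(n)

rng = np.random.default_rng(0)
X = rng.standard_normal((S, Fin, n))
theta = rng.standard_normal((Fin, Fout, K))

Xbar = chebyshev_basis(L_tilde, X, K)
Y = forward(Xbar, theta)                 # shape (S, Fout, n)
g_theta = grad_theta(Xbar, Y)            # loss = 0.5 * sum(Y**2), so dLoss/dY = Y
g_x = grad_input(L_tilde, Y, theta)
```

Every contraction is independent across samples and feature-map pairs, which is why the computation parallelizes easily.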
Graph Pooling: Coarsening
- Multilevel clustering algorithm
- Reduce the size of the graph by a specified factor (2)
- Do all this efficiently

Graclus multilevel clustering algorithm
- Maximizes the local normalized cut
- Greedily pick an unmarked vertex $i$ and match it with an unmatched neighbor $j$ that maximizes the local normalized cut $W_{i,j}(1/d_i + 1/d_j)$.
- Extremely fast.
- Divides the number of nodes by approximately 2 at each level.
- May generate singleton (unmatched) nodes; this is handled by adding fake nodes.
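
One level of the greedy matching described above can be sketched in plain Python. This is a simplified illustration rather than the Graclus implementation: the function name `greedy_matching`, the dense list-of-lists weight matrix, and visiting vertices in index order are assumptions.

```python
def greedy_matching(W):
    """Assign each vertex a cluster id by greedy pairwise matching.

    W is a symmetric weight matrix given as a list of lists.
    Vertices are visited in index order (an assumed, simplified ordering).
    """
    n = len(W)
    degree = [sum(row) for row in W]
    cluster = [-1] * n            # -1 marks a vertex that is still unmatched
    next_id = 0
    for i in range(n):
        if cluster[i] != -1:
            continue
        best_j, best_score = None, 0.0
        for j in range(n):
            if j != i and cluster[j] == -1 and W[i][j] > 0:
                # Local normalized cut W_ij * (1/d_i + 1/d_j)
                score = W[i][j] * (1.0 / degree[i] + 1.0 / degree[j])
                if score > best_score:
                    best_j, best_score = j, score
        cluster[i] = next_id
        if best_j is not None:    # otherwise i stays a singleton
            cluster[best_j] = next_id
        next_id += 1
    return cluster

# Hypothetical weighted path graph on 5 vertices.
W = [
    [0, 1, 0, 0, 0],
    [1, 0, 2, 0, 0],
    [0, 2, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
clusters = greedy_matching(W)
n_coarse = max(clusters) + 1
```

On this graph vertices 0 and 1 are matched, then 2 and 3, and vertex 4 remains a singleton because its only neighbor is already matched, so 5 nodes coarsen to 3.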
Graph Pooling: Fast Pooling
Example: pooling by 4
- Graclus generates singletons: $n_0 = 8 \to n_1 = 5 \to n_2 = 3$
- Adding fake nodes, working back from the coarsest level, gives: $n_2 = 3 \to n_1 = 6 \to n_0 = 12$
- $z = [\max(x_0, x_1), \max(x_4, x_5, x_6), \max(x_8, x_9, x_{10})]$
- Balanced binary trees ⇒ efficient on GPUs.
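
Once the vertices are rearranged and padded, pooling reduces to ordinary 1D pooling. The sketch below reproduces the pooling-by-4 example, assuming the fake nodes sit at positions 2, 3, 7 and 11 (the indices absent from $z$) and assuming they carry $-\infty$ so that they are never selected by the maximum. The signal values are made up.

```python
FAKE = float("-inf")   # assumed neutral value for max pooling

def max_pool(x, size):
    """Regular 1D max pooling over consecutive groups of `size` entries."""
    return [max(x[i:i + size]) for i in range(0, len(x), size)]

# Level-0 signal in the rearranged order, padded from 8 real nodes to n0 = 12.
x = [0.3, 0.9, FAKE, FAKE,
     0.1, 0.4, 0.7, FAKE,
     0.5, 0.2, 0.6, FAKE]

z = max_pool(x, 4)                         # one pooling of size 4
z_two_steps = max_pool(max_pool(x, 2), 2)  # two poolings of size 2
```

Both routes give the same result, which reflects the balanced binary tree: pooling by 4 is two coarsening levels collapsed into a single strided operation.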
Experiments: MNIST
$$W_{i,j} = e^{-\|x_i - x_j\|_2^2/\sigma^2}$$
- 28×28 pixels + 192 fake nodes ⇒ $n = |\mathcal{V}| = 976$
- 8-NN graph ⇒ $|\mathcal{E}| = 3198$
- Based on LeNet-5 ⇒ $K = 5$
- Isotropic filters (no orientation)
- Uninvestigated optimizations and initializations
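
A k-NN graph with Gaussian weights of the form $W_{i,j} = e^{-\|x_i - x_j\|_2^2/\sigma^2}$ can be built as follows. This is a small dense sketch under several assumptions: $x_i$ is taken to be the 2D coordinate of pixel $i$, the graph is symmetrized with an element-wise maximum, $\sigma^2$ is passed in by hand, and the grid is 6×6 rather than 28×28 to keep the example light.

```python
import numpy as np

def knn_gaussian_graph(coords, k, sigma2):
    """Symmetric k-NN weight matrix with W_ij = exp(-||x_i - x_j||^2 / sigma2)."""
    n = coords.shape[0]
    diff = coords[:, None, :] - coords[None, :, :]
    dist2 = (diff ** 2).sum(axis=-1)             # squared Euclidean distances
    np.fill_diagonal(dist2, np.inf)              # exclude self-loops
    nearest = np.argsort(dist2, axis=1)[:, :k]   # k nearest neighbors of each node
    rows = np.repeat(np.arange(n), k)
    cols = nearest.ravel()
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-dist2[rows, cols] / sigma2)
    return np.maximum(W, W.T)                    # assumed symmetrization

# Hypothetical 6x6 pixel grid.
side = 6
ii, jj = np.meshgrid(np.arange(side), np.arange(side), indexing="ij")
coords = np.stack([ii.ravel(), jj.ravel()], axis=1).astype(float)

W = knn_gaussian_graph(coords, k=8, sigma2=1.0)
num_edges = int(np.count_nonzero(W)) // 2        # undirected edges
```

Because the weights depend only on the distance between pixels, filters learned on such a graph have no notion of direction, which is consistent with the isotropy noted above.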
Experiments: 20NEWS
- 18,846 text documents associated with 20 classes
- 10K most common words from the 94K unique words ⇒ $n = |\mathcal{V}| = 10\text{K}$
- 16-NN graph, $W_{i,j} = e^{-\|z_i - z_j\|_2^2/\sigma^2}$ ($z_i$: word2vec embedding) ⇒ $|\mathcal{E}| = 132{,}834$
- $x$: bag-of-words model
Experiments: 20NEWS
Comparison with other methods ($K = 5$)
- Slightly worse than a multinomial naive Bayes classifier.
- Outperforms fully connected networks with far fewer parameters.

Total training time divided by the number of gradient steps
- Scales as $O(n)$, as opposed to $O(n^2)$
Experiments: Graph Quality
⇒ The data structure is important
⇒ A well constructed graph is important
⇒ Proper approximations (LSHForest) can be used for larger DBs.
Conclusions and Future Work
- Introduced a model with linear complexity.
- The quality of the input graph is of paramount importance.
- Local stationarity and compositionality are verified for text documents as long as the graph is well constructed.