word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings of Structured Data

Martin Grohe, RWTH Aachen
X2vec

Vector Embeddings
Represent objects from an arbitrary class X by vectors in some vector space H:

space X of objects → vector space H

Example
Word embeddings in natural language processing: word2vec
Motivation and Ideas

How?
- Embed objects into a vector space (a Hilbert space) in such a way that semantic relationships between objects are reflected in geometric relationships of their images.
- Most importantly: the "semantic distance" between objects correlates with the geometric distance.

Why?
- Standard machine learning algorithms operate on vector representations of data.
- Vector representations enable a quantitative analysis using mathematical techniques ranging from linear algebra to functional analysis.
Node Embeddings

- The objects that are embedded are the vertices of a graph.
- The graph can be directed, labelled, weighted, et cetera.
- Typical applications: node classification in social networks, link prediction in knowledge graphs.
Graph Embeddings

- The objects that are embedded are graphs.
- The graphs can be directed, labelled, weighted, et cetera.
- Typical applications: classification and regression on chemical molecules.
Node Embeddings vs Graph Embeddings

Commonality
- Same structure.
- A graph embedding applied to the neighbourhoods of nodes yields a node embedding.

Important differences
- The nodes of a single graph have explicit relations between them and a clearly defined distance, whereas different graphs are only "semantically" related.
- Node embeddings can (often) be trained for a fixed graph, but we rarely have to deal with a finite set of graphs fixed in advance.
Towards a Theory of Vector Embeddings

- Vector embeddings allow for a quantitative analysis of discrete data using standard machine learning algorithms and mathematical techniques from linear algebra to functional analysis.
- Vector embeddings make it possible to integrate data from different sources into a common framework.
- Node and graph embeddings have been studied in machine learning for a long time. The existing work focusses on graphs and binary structures; not much is known about embedding arbitrary relations.
- Beyond the mathematical foundations, there is little theory. Vector embeddings deserve a deeper study using the tools and methods of TCS (analysis of algorithms and complexity, logic, structural graph theory, et cetera).
Key Questions

Algorithms and Complexity
Can we compute the embedding efficiently? Can we invert it efficiently (i.e., reconstruct graphs from their images)?

Logic and Semantics
What is the semantic meaning of the embeddings? Which properties are preserved? Can we answer queries on the embedded data?

Comparison
How do different embeddings compare? Do they induce the same metric, the same topology?
Embedding Techniques
Matrix Factorisation

A classical approach to defining node embeddings based on the graph metric. Given a graph G = (V, E):

- Define a similarity matrix S ∈ R^{V×V}, e.g. by

  S_{vw} := exp(−dist_G(v, w))

- Compute a low-dimensional factorisation, e.g. a matrix A ∈ R^{V×k} minimising

  ‖A Aᵀ − S‖

- The rows a_v of A define an embedding V → R^k; the inner product ⟨a_v, a_w⟩ approximates the similarity S_{vw}.
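The three steps above can be sketched in a few lines of Python (a minimal illustration under my own assumptions, not an implementation from the talk): distances via BFS, the similarity matrix S_{vw} = exp(−dist_G(v, w)), and a rank-k factorisation from the eigendecomposition of the symmetric matrix S. The function names and the dict-of-neighbour-lists graph format are illustrative choices.

```python
import numpy as np
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def factorisation_embedding(adj, k):
    """Embed nodes into R^k so that <a_v, a_w> ~ exp(-dist_G(v, w))."""
    nodes = sorted(adj)
    n = len(nodes)
    S = np.zeros((n, n))
    for i, v in enumerate(nodes):
        d = bfs_distances(adj, v)
        for j, w in enumerate(nodes):
            S[i, j] = np.exp(-d.get(w, np.inf))  # exp(-inf) = 0 if unreachable
    # A best rank-k factorisation A A^T of the symmetric matrix S keeps the
    # k largest eigenvalues; negative ones are clipped so the sqrt exists.
    vals, vecs = np.linalg.eigh(S)
    top = np.argsort(vals)[-k:]
    A = vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))
    return nodes, A
```

With k equal to the number of nodes and a positive semidefinite S, the product A Aᵀ recovers S exactly; smaller k gives the low-dimensional embedding.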
Random Walks

DeepWalk and node2vec are node embedding algorithms based on the following idea:

- Compile a list of short random walks in the graph.
- Treat the walks like sentences in natural language, with the nodes of the graph as words, and use word embedding techniques (the skip-gram technique of word2vec) to compute the embedding.

(Perozzi, Al-Rfou, Skiena 2014; Grover, Leskovec 2016)
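The first step, compiling the walk corpus, can be sketched as follows (a hypothetical helper, not code from DeepWalk or node2vec; uniform neighbour sampling, whereas node2vec biases the walk). The resulting "sentences" would then be passed to a skip-gram word embedding model, e.g. gensim's Word2Vec, which is omitted here.

```python
import random

def random_walks(adj, walks_per_node=10, walk_length=8, seed=0):
    """Compile a corpus of short random walks, one 'sentence' per walk.

    Nodes are rendered as strings so the corpus can be fed directly to a
    word embedding model that expects lists of tokens.
    """
    rng = random.Random(seed)
    corpus = []
    for _ in range(walks_per_node):
        for start in adj:
            walk = [start]
            while len(walk) < walk_length:
                nbrs = adj[walk[-1]]
                if not nbrs:          # dead end: stop this walk early
                    break
                walk.append(rng.choice(nbrs))
            corpus.append([str(v) for v in walk])
    return corpus
```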
Graph Neural Networks

Graph Neural Networks (GNNs) are a deep learning framework for graphs:

- a learned message-passing network on the input graph
- each node v has a vector-valued state x_v^(t) ∈ R^k at each point t in time
- states are updated based on the states of the neighbours, e.g.

  Aggregate:  a_v^(t+1) ← Σ_{w ∈ N(v)} W_agg · x_w^(t)
  Update:     x_v^(t+1) ← σ( W_up · (x_v^(t) ; a_v^(t+1)) )

  with learned parameter matrices W_agg, W_up and a nonlinear "activation function" σ; here (x ; a) denotes the concatenation of the two vectors.
- iterated for a fixed number s of rounds; v ↦ x_v^(s) is the resulting node embedding
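One aggregate/update round can be written out directly in NumPy (a minimal sketch of the equations above, with ReLU standing in for σ; the weights would normally be learned, and the function name is my own):

```python
import numpy as np

def gnn_layer(adj, X, W_agg, W_up):
    """One aggregate/update round of a message-passing GNN.

    Aggregate: a_v = sum over neighbours w of W_agg @ x_w
    Update:    x_v' = relu(W_up @ concat(x_v, a_v))
    X has one row per node; adj maps a node index to its neighbour list.
    """
    n, k = X.shape
    A = np.zeros((n, W_agg.shape[0]))
    for v in range(n):
        for w in adj[v]:
            A[v] += W_agg @ X[w]
    H = np.concatenate([X, A], axis=1)   # (x_v ; a_v) for every node
    return np.maximum(H @ W_up.T, 0.0)   # ReLU as the activation sigma
```

Iterating this layer s times and reading off the final states gives the node embedding v ↦ x_v^(s).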
Graph Neural Networks (cont'd)

- A GNN encodes the embedding function, and not just a list of vectors for the nodes of a fixed graph (as the previous techniques do).
- Once trained, a GNN can still be applied if the graph changes, or even to an entirely different graph.
- We say that GNNs provide an inductive node-embedding method, whereas matrix factorisation and random walk techniques are transductive.
- By aggregating the vectors of all nodes, we can also use GNNs to compute graph embeddings.

(Hamilton, Ying, Leskovec 2017)
Graph Kernels

Graph kernels are the inner product functions of graph embeddings, that is, functions

K(G, H) = ⟨Φ(G), Φ(H)⟩

for some graph embedding Φ.

- Kernel trick: compute and use K(G, H) without ever computing Φ(G), Φ(H).
- Graph kernels are widely used for ML problems on graphs.
- Common graph kernels are based on comparing random walks, counting small subgraphs, and the Weisfeiler-Leman algorithm.

(Kashima et al. 2003; Gärtner et al. 2003; Shervashidze et al. 2009; . . . )
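Of the families just listed, the Weisfeiler-Leman subtree kernel is easy to sketch: run colour refinement on both graphs with a shared relabelling table and take the inner product of the colour-count histograms accumulated over all rounds (so here Φ(G) is the explicit histogram, with no kernel trick). This is a simplified sketch, not the implementation of Shervashidze et al.:

```python
from collections import Counter

def wl_kernel(adj_G, adj_H, rounds=3):
    """Weisfeiler-Leman subtree kernel (sketch).

    Each node's colour is refined by hashing its colour together with the
    sorted multiset of its neighbours' colours; a table shared by both
    graphs maps each signature to a globally fresh colour id.
    """
    table = {}

    def fresh(sig):
        return table.setdefault(sig, len(table))

    colours = [{v: fresh('init') for v in adj} for adj in (adj_G, adj_H)]
    hists = [Counter(c.values()) for c in colours]
    for r in range(rounds):
        for i, adj in enumerate((adj_G, adj_H)):
            colours[i] = {
                v: fresh((r, colours[i][v],
                          tuple(sorted(colours[i][w] for w in adj[v]))))
                for v in adj}
            hists[i].update(colours[i].values())
    # K(G, H) = inner product of the two colour histograms
    return sum(hists[0][c] * hists[1][c] for c in hists[0])
```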
The Weisfeiler-Leman Algorithm
A Simple Combinatorial Algorithm

The 1-dimensional Weisfeiler-Leman algorithm (1-WL), a.k.a. colour refinement, iteratively computes a colouring of the vertices of a graph G.

Initialisation: All vertices get the same colour.
Refinement step: Two vertices v, w get different colours if there is some colour c such that v and w have different numbers of neighbours of colour c.

The refinement step is repeated until the colouring stays stable.

Run 1-WL (demo by Erkal Selman).
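The refinement loop above is short enough to code directly. Here is a minimal sketch (my own, not from the lecture), representing a graph as an adjacency list; a round's new colour is the pair (old colour, sorted multiset of neighbour colours), interned to small integers:

```python
def colour_refinement(adj):
    """adj: dict mapping each vertex to a list of neighbours.
    Returns the stable 1-WL colouring as a dict vertex -> colour id."""
    colour = {v: 0 for v in adj}            # initialisation: one colour for all
    num_colours = 1
    while True:
        # refinement step: new colour = (old colour, sorted neighbour colours)
        signature = {
            v: (colour[v], tuple(sorted(colour[w] for w in adj[v])))
            for v in adj
        }
        # intern signatures as consecutive integers
        ids = {sig: i for i, sig in enumerate(sorted(set(signature.values())))}
        new_colour = {v: ids[signature[v]] for v in adj}
        if len(ids) == num_colours:         # no class split: colouring is stable
            return new_colour
        colour, num_colours = new_colour, len(ids)

# A path on four vertices: the endpoints (degree 1) and the inner
# vertices (degree 2) end up in different colour classes.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
stable = colour_refinement(path)
```

Since refinement only ever splits colour classes, the loop stops as soon as the number of classes does not grow, which happens after at most n rounds.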
Example

[Figure: an initial graph, its colouring after round 1, its colouring after round 2, and the stable colouring reached after round 3.]
Complexity

Theorem (Cardon, Crochemore 1982; Paige, Tarjan 1987)
The stable colouring of a given graph with n vertices and m edges can be computed in time O((n + m) log n).

This is optimal for a fairly general class of partitioning algorithms (Berkholz, Bonsma, G. 2017).
1-WL as an Isomorphism Test

1-WL distinguishes two graphs G, H if their colour histograms differ, that is, some colour appears a different number of times in G and in H. Thus 1-WL can be used as an (incomplete) isomorphism test.

- It works on almost all graphs (Babai, Erdős, Selkow 1980).
- It fails on some very simple graphs, e.g. the 6-cycle and the disjoint union of two triangles: both are 2-regular, so 1-WL never splits any colour class.
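The failure case can be checked mechanically. A self-contained sketch that runs colour refinement on the 6-cycle and on two disjoint triangles (the graph encodings here are mine, not from the slides) and compares the resulting colour histograms:

```python
from collections import Counter

def stable_histogram(adj):
    """Run colour refinement and return the histogram of stable colours."""
    colour = {v: 0 for v in adj}
    while True:
        sig = {v: (colour[v], tuple(sorted(colour[w] for w in adj[v])))
               for v in adj}
        ids = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        new = {v: ids[sig[v]] for v in adj}
        if len(ids) == len(set(colour.values())):   # stable
            return Counter(new.values())
        colour = new

def cycle(n, offset=0):
    """Adjacency list of a cycle on vertices offset..offset+n-1."""
    return {offset + i: [offset + (i - 1) % n, offset + (i + 1) % n]
            for i in range(n)}

c6 = cycle(6)                                        # one 6-cycle
two_triangles = {**cycle(3), **cycle(3, offset=3)}   # two disjoint 3-cycles

# Both graphs are 2-regular, so every vertex keeps the same colour:
# the histograms agree even though the graphs are not isomorphic.
same = stable_histogram(c6) == stable_histogram(two_triangles)
```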
Colours and Trees

Λ_r = set of all colours used by 1-WL in the first r rounds
Λ = ⋃_{r≥0} Λ_r = set of all colours used

We can think of the colours in Λ_r as rooted trees of height r.

Example
[Figure: a graph G together with the trees arising in rounds 0, 1, and 2.]
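The tree view can be made literal: if colours are never interned, the round-r colour of a vertex is a nested tuple, which is exactly an encoding of a rooted tree of height r (the root carries the vertex's previous colour, the children are the neighbours' previous colours). A small illustrative sketch, my own:

```python
def wl_colour(adj, v, r):
    """Round-r 1-WL colour of vertex v as a nested tuple.
    The tuple is a rooted tree of height r: first component is the
    previous colour of v, second the sorted previous neighbour colours.
    (Exponential in r -- for illustration only; real implementations
    intern colours each round.)"""
    if r == 0:
        return ()                         # round 0: all vertices alike
    return (wl_colour(adj, v, r - 1),
            tuple(sorted(wl_colour(adj, w, r - 1) for w in adj[v])))

path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
# After one round, the endpoints carry a different tree than the
# inner vertices, since their root has one child instead of two.
```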
Weisfeiler-Leman Graph Kernels

For λ ∈ Λ: wl(λ, G) = number of vertices of G of colour λ.

WL Graph Embeddings
Graph embeddings WL_r for every r ≥ 0, defined by

WL_r(G) = ( wl(λ, G) | λ ∈ Λ_r ) ∈ R^{Λ_r}.

WL Graph Kernels
The r-round WL kernel is the mapping K_WL^{(r)} defined by

K_WL^{(r)}(G, H) := Σ_{ℓ=0}^{r} Σ_{λ∈Λ_ℓ} wl(λ, G) · wl(λ, H).

(Shervashidze, Schweitzer, van Leeuwen, Mehlhorn, Borgwardt 2011)

We can also define an infinite-dimensional version based on the embedding WL(G) = ( wl(λ, G) | λ ∈ Λ ) ∈ R^Λ.
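The r-round kernel is easy to realise from the definition: count the colours of each round in both graphs and sum the inner products of the counts. A minimal sketch (not the authors' implementation); colours are kept as nested tuples so that they are comparable across graphs:

```python
from collections import Counter

def wl_colours_per_round(adj, r):
    """Colour histograms of rounds 0..r; colours are nested tuples."""
    colour = {v: () for v in adj}               # round 0: everyone alike
    histograms = [Counter(colour.values())]
    for _ in range(r):
        colour = {v: (colour[v], tuple(sorted(colour[w] for w in adj[v])))
                  for v in adj}
        histograms.append(Counter(colour.values()))
    return histograms

def wl_kernel(adj_g, adj_h, r):
    """K_WL^(r): sum over rounds of the inner product of colour counts."""
    hg = wl_colours_per_round(adj_g, r)
    hh = wl_colours_per_round(adj_h, r)
    return sum(cg[lam] * ch[lam]
               for cg, ch in zip(hg, hh)
               for lam in set(cg) | set(ch))

path3 = {0: [1], 1: [0, 2], 2: [1]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
# Round 0 contributes 3*3 = 9; in round 1 only the degree-2 colour is
# shared (once in the path, three times in the triangle), adding 3.
k = wl_kernel(path3, triangle, 1)
```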
![Page 69: word2vec, node2vec, graph2vec, X2vec: Towards a Theory of ... · word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings of Structured Data MartinGrohe RWTHAachen](https://reader036.fdocuments.net/reader036/viewer/2022090609/6060044cba037c3ba7548bea/html5/thumbnails/69.jpg)
Weisfeiler-Leman Graph KernelsFor λ ∈ Λ: wl(λ,G ) = number of vertices of G of colour λ
WL Graph EmbeddingsGraph embeddings WLr for every r ≥ 0 defined by:
WLr (G ) =(
wl(λ,G ) | λ ∈ Λr
)∈ RΛr .
WL Graph KernelsThe r -round WL-kernel is the mapping K
(r)WL defined by
K(r)WL(G ,H) :=
r∑`=0
∑λ∈Λ`
wl(λ,G ) · wl(λ,H).
(Shervashidze, Schweitzer, van Leeuwen, Mehlhorn, Borgwardt 2011)
We can also define an infinite-dimensional version based on theembedding WL(G ) =
(wl(λ,G ) | λ ∈ Λ
)∈ RΛ.
21
![Page 70: word2vec, node2vec, graph2vec, X2vec: Towards a Theory of ... · word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings of Structured Data MartinGrohe RWTHAachen](https://reader036.fdocuments.net/reader036/viewer/2022090609/6060044cba037c3ba7548bea/html5/thumbnails/70.jpg)
Weisfeiler-Leman Graph KernelsFor λ ∈ Λ: wl(λ,G ) = number of vertices of G of colour λ
WL Graph EmbeddingsGraph embeddings WLr for every r ≥ 0 defined by:
WLr (G ) =(
wl(λ,G ) | λ ∈ Λr
)∈ RΛr .
WL Graph KernelsThe r -round WL-kernel is the mapping K
(r)WL defined by
K(r)WL(G ,H) :=
r∑`=0
∑λ∈Λ`
wl(λ,G ) · wl(λ,H).
(Shervashidze, Schweitzer, van Leeuwen, Mehlhorn, Borgwardt 2011)
We can also define an infinite-dimensional version based on theembedding WL(G ) =
(wl(λ,G ) | λ ∈ Λ
)∈ RΛ.
21
![Page 71: word2vec, node2vec, graph2vec, X2vec: Towards a Theory of ... · word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings of Structured Data MartinGrohe RWTHAachen](https://reader036.fdocuments.net/reader036/viewer/2022090609/6060044cba037c3ba7548bea/html5/thumbnails/71.jpg)
Weisfeiler-Leman Graph KernelsFor λ ∈ Λ: wl(λ,G ) = number of vertices of G of colour λ
WL Graph EmbeddingsGraph embeddings WLr for every r ≥ 0 defined by:
WLr (G ) =(
wl(λ,G ) | λ ∈ Λr
)∈ RΛr .
WL Graph KernelsThe r -round WL-kernel is the mapping K
(r)WL defined by
K(r)WL(G ,H) :=
r∑`=0
∑λ∈Λ`
wl(λ,G ) · wl(λ,H).
(Shervashidze, Schweitzer, van Leeuwen, Mehlhorn, Borgwardt 2011)
We can also define an infinite-dimensional version based on theembedding WL(G ) =
(wl(λ,G ) | λ ∈ Λ
)∈ RΛ.
21
![Page 72: word2vec, node2vec, graph2vec, X2vec: Towards a Theory of ... · word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings of Structured Data MartinGrohe RWTHAachen](https://reader036.fdocuments.net/reader036/viewer/2022090609/6060044cba037c3ba7548bea/html5/thumbnails/72.jpg)
Higher-Dimensional Weisfeiler-Leman

The k-dimensional Weisfeiler-Leman algorithm (k-WL) iteratively colours k-tuples of vertices (Weisfeiler and Leman 1968, Babai ∼1980).

Running time: O(n^k log n)

k-WL is much more powerful than 1-WL, but still not a complete isomorphism test.

Theorem (Cai, Fürer, Immerman 1992)
For every k there are non-isomorphic graphs G_k, H_k not distinguished by k-WL.
Weisfeiler and Leman go Neural

Theorem (Morris, Ritzert, Fey, Hamilton, Lenssen, Rattan, G. 2019; Xu, Hu, Leskovec, Jegelka 2019)
GNNs can extract exactly the same information from a graph as 1-WL.
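To see what the theorem is about, here is one sum-aggregation message-passing layer in the style of GIN (Xu et al. 2019), one of the architectures achieving 1-WL power: each vertex combines its own state with the sum of its neighbours' states, mirroring one refinement round. The update rule is standard; the concrete code and features are illustrative only:

```python
def gnn_layer(adj, h, eps=0.0):
    """One message-passing step. h: dict vertex -> feature vector.
    GIN-style combine: (1 + eps) * own state + sum of neighbour states;
    an injective MLP would normally follow (omitted in this sketch)."""
    out = {}
    for v in adj:
        agg = [sum(h[w][i] for w in adj[v]) for i in range(len(h[v]))]
        out[v] = [(1 + eps) * h[v][i] + agg[i] for i in range(len(h[v]))]
    return out

path = {0: [1], 1: [0, 2], 2: [1]}
h0 = {v: [1.0] for v in path}        # constant initial features, like 1-WL
h1 = gnn_layer(path, h0)
# After one layer the endpoint features differ from the middle vertex's,
# exactly as the first 1-WL round separates degree-1 from degree-2 vertices.
```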
Digression: GNNs in Chemical Engineering

Higher Dimensional GNNs
Based on the higher dimensional WL algorithm (Morris, Ritzert, Fey, Hamilton, Lenssen, Rattan, G. 2019).

Predicting Fuel Ignition Quality
Prediction model for combustion-related properties of hydrocarbons, based on 2-dimensional GNNs.

[Figure: pipeline taking a SMILES string through a 1-GNN (Python & RDKit & k-GNN) and a 2-GNN (Python & PyTorch & PyTorch Geometric) to predict DCN, MON, or RON.]

(Schweidtmann, Rittig, König, G., Mitsos, and Dahmen, 2020)
Logical Characterisation

C is the (syntactic) extension of first-order predicate logic with counting quantifiers ∃^{≥i}x.

Theorem (Cai, Fürer, Immerman 1992)
For every k ≥ 1 and all graphs G, H, the following are equivalent:
1. k-WL does not distinguish G and H.
2. G and H satisfy the same sentences of the logic C with at most k + 1 variables.
Fractional Isomorphisms

G, H graphs with vertex sets V, W and adjacency matrices A, B.

System L(G, H) of linear equations in the variables X_{vw}:

Σ_{v′∈V} A_{vv′} X_{v′w} = Σ_{w′∈W} X_{vw′} B_{w′w}   for all v ∈ V, w ∈ W,
Σ_{v′∈V} X_{v′w} = Σ_{w′∈W} X_{vw′} = 1   for all v ∈ V, w ∈ W.

Observation
L(G, H) has a nonnegative integer solution iff G and H are isomorphic.

Theorem (Tinhofer 1990)
L(G, H) has a nonnegative rational solution (called a fractional isomorphism) if and only if 1-WL does not distinguish G and H.
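Tinhofer's theorem can be checked by hand on a 2-regular pair: for G = the 6-cycle and H = two disjoint triangles, 1-WL does not distinguish G and H, and indeed the uniform doubly stochastic matrix X = J/n solves L(G, H), since AX and XB both equal (2/n)·J for 2-regular graphs. A pure-Python sketch (graph encodings are mine), with no LP solver needed because we can exhibit the solution directly:

```python
def cycle_adj_matrix(n):
    """Adjacency matrix of the n-cycle."""
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        A[i][(i + 1) % n] = A[(i + 1) % n][i] = 1
    return A

def block_diag(A, B):
    """Block-diagonal matrix of A and B (disjoint union of graphs)."""
    n, m = len(A), len(B)
    C = [[0] * (n + m) for _ in range(n + m)]
    for i in range(n):
        for j in range(n):
            C[i][j] = A[i][j]
    for i in range(m):
        for j in range(m):
            C[n + i][n + j] = B[i][j]
    return C

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

n = 6
A = cycle_adj_matrix(6)                                     # C6
B = block_diag(cycle_adj_matrix(3), cycle_adj_matrix(3))    # two triangles
X = [[1 / n] * n for _ in range(n)]   # J/n: all row and column sums are 1

# AX = XB: every entry of either product is deg/n = 2/6.
fractional = matmul(A, X) == matmul(X, B)
```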
Higher Dimensional Version

Atserias and Maneva 2013
- There is a close correspondence between k-WL and the k-th level of the Sherali-Adams hierarchy over L(G, H) (together with the inequalities X_{vw} ≥ 0).
- An exact correspondence was given by G. and Otto 2015.

We denote the system of linear equations whose nonnegative rational solutions characterise k-WL by L^{(k)}(G, H).
Homomorphism Vectors
Counting Graph Homomorphisms

Homomorphism Counts
hom(F, G) := number of homomorphisms from F to G

Homomorphism Graph Embeddings
For every class F of graphs, we define Hom_F : G → R^F (where G is the class of all graphs) by

Hom_F(G) := ( hom(F, G) | F ∈ F ).

We can define an inner product on rg(Hom_F) by

⟨Hom_F(G), Hom_F(H)⟩ = Σ_{k≥1} 1/(2^k · |F_k|) Σ_{F∈F_k} hom(F, G) · hom(F, H),

where F_k := { F ∈ F : |F| = k }.
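For small pattern graphs F, hom(F, G) can be computed by brute force: enumerate all maps V(F) → V(G) and keep those that send edges to edges. A minimal sketch (the graph encoding as vertex/edge lists is an assumption of this example):

```python
from itertools import product

def hom(F_vertices, F_edges, G_vertices, G_edges):
    """Number of homomorphisms from F to G, by exhaustive search
    over all |V(G)|^|V(F)| maps. Exponential, but fine for tiny F."""
    G_adj = set(G_edges) | {(v, u) for (u, v) in G_edges}
    count = 0
    for image in product(G_vertices, repeat=len(F_vertices)):
        phi = dict(zip(F_vertices, image))
        if all((phi[u], phi[v]) in G_adj for (u, v) in F_edges):
            count += 1
    return count

# hom from a single edge into a triangle: every ordered pair of
# distinct vertices is adjacent, so 2 * |E| = 6.
triangle_V, triangle_E = [0, 1, 2], [(0, 1), (1, 2), (0, 2)]
edge_V, edge_E = ["a", "b"], [("a", "b")]
h = hom(edge_V, edge_E, triangle_V, triangle_E)
```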
Example

The embedding Hom_F for F = {K1, K2, K3} (a single vertex, a single edge, and a triangle).

The graphs G and H are mapped to the same vector:

Hom_F(G) = Hom_F(H) = (10, 20, 0).
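The slide's figure is not reproduced in this transcript, but a plausible instance of the example is G = the 10-cycle and H = the disjoint union of two 5-cycles: both have 10 vertices and 10 edges and contain no triangle, so Hom_F maps both to (10, 20, 0). A self-contained brute-force check (graph choice and encoding are my assumption):

```python
from itertools import product

def hom(F_vertices, F_edges, G_vertices, G_edges):
    """Number of homomorphisms from F to G, by exhaustive search."""
    G_adj = set(G_edges) | {(v, u) for (u, v) in G_edges}
    count = 0
    for image in product(G_vertices, repeat=len(F_vertices)):
        phi = dict(zip(F_vertices, image))
        if all((phi[u], phi[v]) in G_adj for (u, v) in F_edges):
            count += 1
    return count

def cycle(n, offset=0):
    V = [offset + i for i in range(n)]
    E = [(offset + i, offset + (i + 1) % n) for i in range(n)]
    return V, E

K1 = ([0], [])
K2 = ([0, 1], [(0, 1)])
K3 = ([0, 1, 2], [(0, 1), (1, 2), (0, 2)])

G = cycle(10)                            # C10
V5a, E5a = cycle(5)
V5b, E5b = cycle(5, offset=5)
H = (V5a + V5b, E5a + E5b)               # disjoint union of two C5s

vec_G = tuple(hom(F[0], F[1], G[0], G[1]) for F in (K1, K2, K3))
vec_H = tuple(hom(F[0], F[1], H[0], H[1]) for F in (K1, K2, K3))
```

Note that Hom over the class of all graphs would still tell these two apart (e.g. via hom(C5, ·)), in line with Lovász's theorem on the next slide's topic; only the restriction to F = {K1, K2, K3} collapses them.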
Homomorphism Indistinguishability

Graphs G and H are homomorphism indistinguishable over a class F of graphs if Hom_F(G) = Hom_F(H).

Theorem (Lovász 1967)
For all graphs G, H, the following are equivalent:
1. G and H are homomorphism indistinguishable over the class of all graphs.
2. G and H are isomorphic.
Proof
aut(G) := number of automorphisms of G
I(F, G) := number of injective homomorphisms from F to G
S(F, G) := number of surjective homomorphisms from F to G

We view hom, I, and S as infinite matrices in R^(G×G), with row and column indices ordered by increasing size.

Observation 1
I is upper triangular with positive diagonal entries, and S is lower triangular with positive diagonal entries.

Observation 2
hom(F, H) = Σ_G S(F, G) · (1/aut(G)) · I(G, H).

That is, hom = S · D · I for a diagonal matrix D with positive diagonal entries.

Hence the columns of hom are linearly independent and thus mutually distinct.
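Observation 2 can be checked numerically on tiny graphs. A sketch (my own verification, not code from the talk) that brute-forces hom, I, S, and aut: for patterns on at most 2 vertices, every homomorphic image has at most 2 vertices, so the infinite sum collapses to the three isomorphism types K1, E2 (two isolated vertices), and K2.

```python
from itertools import product

def maps(F, G):
    """Yield all homomorphisms from pattern F to graph G (brute force)."""
    (FV, FE), (GV, GE) = F, G
    adj = {(u, v) for u, v in GE} | {(v, u) for u, v in GE}
    for image in product(GV, repeat=len(FV)):
        h = dict(zip(FV, image))
        if all((h[u], h[v]) in adj for u, v in FE):
            yield h

def hom(F, G):
    return sum(1 for _ in maps(F, G))

def inj(F, G):   # I(F, G): injective homomorphisms
    return sum(1 for h in maps(F, G) if len(set(h.values())) == len(h))

def surj(F, G):  # S(F, G): homomorphisms surjective on vertices and edges
    GV, GE = G
    target_edges = {frozenset(e) for e in GE}
    return sum(
        1 for h in maps(F, G)
        if set(h.values()) == set(GV)
        and {frozenset((h[u], h[v])) for u, v in F[1]} >= target_edges
    )

def aut(G):      # a bijective endomorphism of a finite graph is an automorphism
    return inj(G, G)

K1 = ([0], set())            # single vertex
E2 = ([0, 1], set())         # two isolated vertices
K2 = ([0, 1], {(0, 1)})      # a single edge
small = [K1, E2, K2]         # all graphs on ≤ 2 vertices, up to isomorphism

for F in small:
    for H in small:
        assert hom(F, H) == sum(surj(F, G) * inj(G, H) / aut(G) for G in small)
print("hom = S · D · I verified on all graphs with at most 2 vertices")
```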
Bounded Tree Width

Theorem (Dvořák 2010)
For all graphs G, H, the following are equivalent.
1. G and H are homomorphism indistinguishable over the class of all graphs of tree width at most k.
2. G and H are not distinguished by the k-dimensional Weisfeiler-Leman graph isomorphism algorithm.

Corollary
For all graphs G, H, the following are equivalent.
1. G and H are homomorphism indistinguishable over the class of all graphs of tree width at most k.
2. L(k)(G, H) has a nonnegative rational solution.
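The k = 1 case of Weisfeiler-Leman is colour refinement, which is short enough to sketch here (the standard algorithm, not code from the talk; run on the disjoint union so colours are comparable across the two graphs):

```python
def wl1_distinguishes(adj_G, adj_H):
    """1-dimensional Weisfeiler-Leman (colour refinement).

    adj_G, adj_H: dicts mapping each vertex to a list of its neighbours.
    Returns True if the stable colour histograms of G and H differ.
    """
    # Refine on the disjoint union so colour names match across both graphs.
    union = {("G", v): [("G", u) for u in adj_G[v]] for v in adj_G}
    union.update({("H", v): [("H", u) for u in adj_H[v]] for v in adj_H})
    color = {v: 0 for v in union}
    for _ in range(len(union)):
        # New colour = old colour together with the multiset of neighbour colours.
        sig = {v: (color[v], tuple(sorted(color[u] for u in union[v])))
               for v in union}
        relabel = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        new = {v: relabel[sig[v]] for v in union}
        if new == color:          # partition is stable
            break
        color = new
    hist = lambda side: sorted(c for (s, _), c in color.items() if s == side)
    return hist("G") != hist("H")

# C6 and two disjoint triangles are both 2-regular on 6 vertices,
# so 1-WL cannot separate them.
c6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_k3 = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
print(wl1_distinguishes(c6, two_k3))   # → False
```

This matches the theorem's hierarchy: the triangle, a pattern of tree width 2, does separate this pair, while patterns of tree width 1 (captured by 1-WL) cannot.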
Paths, Cycles, and Planar Graphs

Theorem (Dell, G., Rattan 2018)
For all graphs G, H, the following are equivalent.
1. G and H are hom. ind. over the class of all paths.
2. L(G, H) has a rational solution.

Theorem (Folklore)
For all graphs G, H, the following are equivalent.
1. G and H are hom. ind. over the class of all cycles.
2. G and H are co-spectral.

Theorem (Mančinska and Roberson 2019)
For all graphs G, H, the following are equivalent.
1. G and H are hom. ind. over the class of all planar graphs.
2. G and H are quantum isomorphic.
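The cycle case has a concrete linear-algebra reading: for k ≥ 3, hom(C_k, G) counts closed walks of length k, i.e. trace(A^k), so agreeing on all cycle counts amounts to co-spectrality. A sketch (my example pair, not from the slides) on the smallest co-spectral pair, the star K_{1,4} and C4 plus an isolated vertex:

```python
def matmul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace_powers(A, kmax):
    """Return [trace(A^1), ..., trace(A^kmax)]: closed-walk counts,
    which for k >= 3 equal hom(C_k, G)."""
    traces, P = [], [row[:] for row in A]
    for _ in range(kmax):
        traces.append(sum(P[i][i] for i in range(len(A))))
        P = matmul(P, A)
    return traces

star = [[0, 1, 1, 1, 1],          # K_{1,4}: vertex 0 joined to 1..4
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0]]
c4_plus_k1 = [[0, 1, 0, 1, 0],    # 4-cycle 0-1-2-3 plus isolated vertex 4
              [1, 0, 1, 0, 0],
              [0, 1, 0, 1, 0],
              [1, 0, 1, 0, 0],
              [0, 0, 0, 0, 0]]

# Non-isomorphic (the star has a degree-4 vertex), yet co-spectral:
print(trace_powers(star, 6))         # → [0, 8, 0, 32, 0, 128]
print(trace_powers(c4_plus_k1, 6))   # → [0, 8, 0, 32, 0, 128]
```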
Logical Characterisations

Recall: C is the (syntactical) extension of first-order predicate logic with counting quantifiers ∃≥i x.

Corollary (Dvořák 2010)
For all graphs G, H, the following are equivalent.
1. G and H are hom. ind. over the class of all graphs of tree width at most k.
2. G and H satisfy the same sentences of the logic C with at most k + 1 variables.

Theorem (G. 2020)
For all graphs G, H, the following are equivalent.
1. G and H are hom. ind. over the class of all graphs of tree depth at most k.
2. G and H satisfy the same sentences of the logic C of quantifier rank at most k.
Remarks

- Lovász's deep theory of Graph Limits is based on the metric induced by HomG.
- Homomorphism embeddings can be extended to arbitrary relational structures and to weighted graphs and structures; many of the results are preserved.
- There are also homomorphism node embeddings based on homomorphism counts of rooted graphs.
Concluding Remarks
Research Directions

- Study relations between graph metrics, for example, the measure induced by WL graph kernels and the homomorphism vectors over trees.
- Design algorithms for reconstructing graphs from vectors.
- Design vector embeddings for more complex objects like relational structures, processes, dynamical systems.
- Develop a better understanding of the semantics of vector embeddings and answer queries directly on the embedded data.