Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf ·...
Transcript of Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf ·...
![Page 1: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/1.jpg)
Learning Representations of
Large-Scale Networks Jian Tang
HEC Montréal Montréal Institute of Learning Algorithms (MILA) [email protected]
1
![Page 2: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/2.jpg)
2
Networks
Social Network
World Wide Web Internet-of-Things
Road Network
• Ubiquitous in real world
• A flexible and general data structure • Many types of data can be formulated as networks
![Page 3: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/3.jpg)
3
Continental Airline: Source: http://www.airlineroutemaps.com/
![Page 4: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/4.jpg)
4 Graph from Albert-László Barabási’ s SIGIR09 keynote
![Page 5: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/5.jpg)
5
Gene-Regulatory Network - Abdollahi et al. Transcriptional network governing the angiogenic switch in human pancreatic cancer. PNAS vol.
104 no. 31,2007
![Page 6: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/6.jpg)
Network Mining: Link Prediction
Does she know Richard Gere?
6
![Page 7: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/7.jpg)
Network Mining: Ranking
Co-citation network of Sloan Digital Sky Survey - http://nevac.ischool.drexel.edu/~james/infovis09/FP-tree-visual.html
• Importance of vertices • Which is the most influential paper?
7
![Page 8: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/8.jpg)
Network Mining: Community Detection
Coauthor
Network Information retrieval
Machine learning Data mining
Who tend to work together? - Q.Mei, D.Cai, D.Zhang, and C.Zhai, Topic Modeling with Network Regularization, WWW 2008
8
![Page 9: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/9.jpg)
Network Mining: Classification
• d1 is democratic • d2 is republican • What can we say about d3 and d4?
- Graph from Jerry Zhu’s tutorial in ICML 07
9
![Page 10: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/10.jpg)
Network Mining: Many Other Tasks
• Sampling
• Recommendation
• Structure analysis (e.g., structural holes)
• Evolution
• Matching
• Visualization
• „
10
![Page 11: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/11.jpg)
11
Traditional Representations of Networks
• Suffer from data sparsity • Suffer from high dimensionality • Does not facilitate computation • Does not represent “semantics” • „
![Page 12: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/12.jpg)
Research Question and Challenges (1)
• How to effectively and efficiently represent a Large-scale network?
• Challenges:
• Large-scale: millions of nodes and billions of edges
• Heterogeneous: directed/undirected, and binary/weighted
12
![Page 13: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/13.jpg)
13
Learning Node Representations for Networks
• Node Classification • Node Clustering • Link Prediction • Recommendation • „
• Word representation
Unstructured text
Text representation, e.g., word and document representation, „
„
Deep learning has been attracting increasing attention „
A future direction of deep learning is to integrate unlabeled data „ The Skip-gram model is quite effective and efficient „ „
degree
network
edge
node word
document
classification
text
embedding
Word co-occurrence network
• E.g., Facebook social network -> user representations (features)-> friend recommendation
Network Node representations
![Page 14: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/14.jpg)
Extremely Low-dimensional Representations: 2D/3D for Visualizing Networks
14
„. „.
„. „. „.
„. Networks 2D/3D Layout
Heatmaps
Network Diagrams Scatter Plots
„.
High-dimensional Data
![Page 15: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/15.jpg)
Visualizing Scientific Papers (Tang et al. 2016)
15
10M Scientific Papers on One Slide
![Page 16: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/16.jpg)
In some cases, we are given a collection of small networks„
16
Sentence dependency graph
Molecule networks
• A collection of small networks with different structures
„ Ego-networks
![Page 17: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/17.jpg)
Research Question and Challenges (2)
• How to learn an effective representation for an entire network?
• Challenges:
• The structures of different networks are different
17
![Page 18: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/18.jpg)
Outline
• Part I: Learning Node Representations of Networks
• Related Work: Laplacian Eigenmap, Word2Vec
• LINE, DeepWalk, and Node2Vec
• Extensions
• Part II: Visualizing Networks and High-Dimensional Data
• t-SNE
• LargeVis
• Pat III: Learning Representations of Entire Networks
• CNN
• Neural Message Passing
• Part IV: Summary, Challenges & Future Work
18
![Page 19: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/19.jpg)
Outline
• Part I: Learning Node Representations of Networks
• Related Work: Laplacian Eigenmap, Word2Vec
• LINE, DeepWalk, and Node2Vec
• Extensions
• Part II: Visualizing Networks and High-Dimensional Data
• t-SNE
• LargeVis
• Pat III: Learning Representations of Entire Networks
• CNN
• Neural Message Passing
• Part IV: Summary, Challenges & Future Work
19
![Page 20: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/20.jpg)
Problem Definition: Node Embedding
20
Networks Node representations
• Given a network/graph G=(V, E, W), where V is the set of nodes, E is the set of edges between the nodes, and W is the set of weights of the edges, the goal of node embedding is to represent each node i with a vector , which preserves the structure of networks.
![Page 21: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/21.jpg)
Related Work
• Classical graph embedding algorithms • MDS, IsoMap, LLE, Laplacian Eigenmap, „ • Hard to scale up
• Graph factorization (Ahmed et al. 2013) • Not specifically designed for network representation • Undirected graphs only
• Neural word embeddings (Bengio et al. 2003) • Neural language model • word2vec (skipgram), paragraph vectors, etc.
21
![Page 22: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/22.jpg)
Laplacian Eigenmap (Belkin and Niyogi, 2003)
22
• Intuition: the embeddings of similar nodes should be close to each other
• Objective:
• Where , L is the Laplacian matrix , and
• Optimization by finding the eigenvectors of smallest eigenvalues of the Laplacian matrix L:
• Computationally expensive for finding eigenvectors when networks are very big
L =D-W Dii = wijj
å
Lu = lDu
Mikhail Belkin and Partha Niyogi. Laplacian Eigenmaps for Dimensionality Reduction and Data Representation. Neural Computation, 2003.
![Page 23: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/23.jpg)
Skipgram (Mikolov et al. 2014)
23
• Goal: represent each word i with a vector by training from a sequence
• Distributional hypothesis (John Rupert Firth): You know a word by the company it keeps
• Skip-gram: learning word representations by predicting the nearby words within a window
Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, Jeffrey Dean. Distributed Representations of Words and Phrases and their Compositionality. NIPS 2014
![Page 24: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/24.jpg)
Skipgram
24
• Objective:
• Where c is the window size • Direct optimization is computationally expensive due to the softmax function
• Negative sampling:
• Where is a noisy distribution
![Page 25: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/25.jpg)
LINE: Large-scale Information Network Embedding (Tang et al., Most Cited Paper of WWW 2015)
• Arbitrary types of networks • Directed, undirected, and/or weighted
• Clear objective function • Preserve the first-order and second-order proximity
• Scalable • Asynchronous stochastic gradient descent • Millions of nodes and billions of edges: a coupe of hours on a single machine
25
Jian Tang, Meng Qu, Mingzhe Wang, Jun Yan, Ming Zhang and Qiaozhu Mei. LINE: Large-scale Information Network Embedding. WWW’15
![Page 26: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/26.jpg)
First-order Proximity
26
• The local pairwise proximity between the nodes • However, many links between the nodes are not observed
• Not sufficient for preserving the entire network structure
![Page 27: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/27.jpg)
Second-order Proximity
27
• Proximity between the neighborhood structures of the nodes
“The degree of overlap of two people’s friendship networks correlates with the strength of ties between them” --Mark Granovetter
“You shall know a word by the company it keeps” --John Rupert Firth
![Page 28: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/28.jpg)
Preserving the First-order Proximity (LINE 1st)
28
• Distributions: : (defined on the undirected edge i - j)
• Objective:
p̂1(vi,v j ) =wij
wmn(m,n)ÎE
å
O1 =KL( p̂1, p1) = - wij log p1(vi,v j )(i, j )ÎE
å
: Embedding of i
Empirical distribution of first-order proximity:
Model distribution of first-order proximity:
![Page 29: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/29.jpg)
Preserving the Second-order Proximity (LINE 2nd)
29
• Distributions: (defined on the directed edge i -> j)
• Objective:
p̂2(v j | vi ) =wij
wikkÎV
å
O2 = KL( p̂2(× | vi ), p2(× | vi ))i
å = - wij log p2(v j | vi )(i, j )ÎE
å
Empirical distribution of neighborhood structure:
Model distribution of neighborhood structure:
![Page 30: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/30.jpg)
Optimization Tricks
30
• Stochastic gradient descent + Negative Sampling • Randomly sample an edge and multiple negative edges
• The gradient w.r.t the embedding with edge (i, j)
• Problematic when the variances of weights of the edges are large • The variance of the gradients are large
• Solution: edge sampling • Sample the edges according to their weights and treat the edges as binary
• Complexity: O(d*K*|E|) • Linear to the dimensionality d, the number of negative samples K, and the number of edges
![Page 31: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/31.jpg)
Discussion
31
• Embed nodes with few neighbors • Expand the neighbors by adding higher-order neighbors • Breadth-first search (BFS) • Adding only second-order neighbors works well in most cases
• Embed new nodes • Fix the embeddings of existing nodes • Optimize the objective w.r.t. the embeddings of new nodes
![Page 32: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/32.jpg)
DeepWalk (Perozzi et al. 2014)
32
• Learning node representations with the technique for learning word representations, i.e., Skipgram
• Treat random walks on networks as sentences
Random walk generation (generate node contexts through random search)
Predict the nearby nodes in the random walks
Bryan Perozzi, Rami Al-Rfou, Steven Skiena. DeepWalk: Online Learning of Social Representations. KDD’14
![Page 33: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/33.jpg)
DeepWalk (Perozzi et al. 2014)
33
• Optimization: hierarchical softmax (Morin, Bengio, 2005)
• Assign the nodes to the leaves of a binary tree • Predict the node => predict a path in the tree
• Make binary decisions along the path
• Complexity from |V| to log(|V|)
Predict the nearby nodes in the random walks (v1->v3, v1->v5)
v1
Hierarchical softmax
![Page 34: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/34.jpg)
Node2Vec (Grover and Leskovec, 2016)
34
• Find the node context by a hybrid strategy of • Breadth-first Sampling (BFS): homophily • Depth-first Sampling (DFS): structural equivalence
Aditya Grover and Jure Leskovec. node2vec: Scalable Feature Learning for Networks. KDD’16
![Page 35: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/35.jpg)
Expand Node Contexts with Biased Random Walk
35
• Biased random walk with two parameters p and q • p: controls the probability of revisiting a node in the walk • q: controls the probability of exploring “outward” nodes • Find optimal p and q through cross-validation on labeled data
• Optimized through similar objective as LINE with first-order proximity
![Page 36: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/36.jpg)
Comparison between LINE, DeepWalk, and Node2Vec
36
Algorithm Neighbor Expansion
Proximity Optimization Labeled Data
LINE BFS 1st or 2nd Negative Sampling No
DeepWalk Random 2nd Hierarchical Softmax
No
Node2Vec BFS + DFS 1st Negative Sampling Yes
![Page 37: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/37.jpg)
Applications
37
• Node classification (Perozzi et al. 2014, Tang et al. 2015a, Grover et al. 2015 )
• Node visualization (Tang et al. 2015a) • Link prediction (Grover et al. 2015) • Recommendation (Zhao et al. 2016) • Text representation (Tang et al. 2015a, Tang et al. 2015b)
• „
![Page 38: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/38.jpg)
Node Classification
38
Table: Results on Flickr Network (Perozzi et al. 2014)
• social network => user representations (features) => classifier
• Community identities as classification labels
DeepWalk > Laplacian Eigenmap
![Page 39: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/39.jpg)
Node Classification
39
Table: Results on Youtube Network(Tang et al. 2015a)
• social network => user representations (features) => classifier
• Community identities as classification labels
LINE(1st + 2nd ) >LINE(2nd )> DeepWalk > LINE(1st )
![Page 40: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/40.jpg)
Node Visualization (Tang et al. 2015a)
40
• Coauthor network: authors from three different research fields
“Data mining” “Machine learning” “Computer vision”
(a) Graph factorization (b) DeepWalk (c) LINE(2nd )
![Page 41: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/41.jpg)
Link Prediction (Grover and Leskovec, 2016)
41
Node Embeddings (LINE, DeepWalk, node2vec) > Jaccard’s Coefficient > Adamic-Adar
Table: Results of Link Prediction
![Page 42: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/42.jpg)
Unsupervised Text Representation (Tang et al. 2015a)
degree
network
edge
node word
document
classification
text
embedding
Word co-occurrence network
Unstructured text
Text representation, e.g., word and document representation, „
„
Deep learning has been attracting increasing attention „
A future direction of deep learning is to integrate unlabeled data „
The Skip-gram model is quite effective and efficient „ Information networks encode the relationships between the data objects „ text
information
network
word …
classification
doc_1
doc_2
doc_3
doc_4 …
…
Word-document network
42
• Construct text networks from unstructured text
![Page 43: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/43.jpg)
Word Analogy (Tang et al. 2015a)
• Entire Wikipedia articles => word co-occurrence network (~2M words, 1B edges)
LINE(2nd) > LINE(1st)
LINE(2nd) > SkipGram
• Size of word co-occurrence networks does not grow linearly with data size • Only the weights of edges change
43
Algorithm Semantic(%) Syntactic(%) Overall
GF 61.38 44.08 51.93
SkipGram 69.14 57.94 63.02
LINE(1st) 58.08 49.42 53.35
LINE(2nd ) 73.79 59.72 66.10
![Page 44: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/44.jpg)
Text Classification on Long Documents (Tang et al. 2015a)
• Word co-occurrence network (w-w) , word-document network (w-d) to learn the word embedding
• Document embedding as average of word embeddings in the document
LINE(w-w) > SkipGram LINE(w-d) > LINE(w-w)
44
Accuracy
![Page 45: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/45.jpg)
Text Classification on Short Documents (Tang et al. 2015a)
• Word co-occurrence network (w-w) , word-document network (w-d) to learn the word embedding
• Document embedding as average of word embeddings in the document
LINE(w-w) > SkipGram LINE(w-w) > LINE(w-d)
45
Accuracy
![Page 46: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/46.jpg)
Extensions
• Other variants
• Multi-view networks
• Networks with node attributes
• Heterogeneous networks
• Task-specific network embedding
46
![Page 47: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/47.jpg)
Extensions
• Other variants
• Multi-view networks
• Networks with node attributes
• Heterogeneous networks
• Task-specific network embedding
47
![Page 48: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/48.jpg)
Other Variants
• Leverage global structural information (Cao et al. 2015)
• Non-linear methods based on autoencoders (Wang et al. 2016)
• Directed network embedding (Ou et al. 2016)
• Signed network embedding (Wang et al. 2017)
• Shaosheng Cao, Wei Lu, and Qiongkai Xu. GraRep: Learning graph representations with global structural information. CIKM’ 2015.
• Mingdong Ou, Peng Cui, Jian Pei, Wenwu Zhu. Asymmetric transitivity preserving graph embedding. KDD, 2016.
• Daixing Wang, Peng Cui, Wenwu Zhu. Structural deep network embedding. KDD, 2016. • Suhang Wang, Jiliang Tang, Charu Aggarwal, Yi Chang, Huan Liu. Signed network embedding in social media.
SDM 2017.
48
![Page 49: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/49.jpg)
Extensions
• Other variants
• Multi-view networks
• Networks with node attributes
• Heterogeneous networks
• Task-specific network embedding
49
![Page 50: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/50.jpg)
Multi-view Network Embedding ( Qu and Tang et al. 2017)
• Multiple types of relationships between nodes exist in real-world
networks
• E.g., following, retweeting relationships between users in Twitter
• Each type of relationship => a view of the network
• Multiple types of relationships => multi-view networks
• Infer robust node representations with multiple views
• Complementary information in different views
Figure: Networks with multiple views Meng Qu, Jian Tang, Jingbo Shang, Xiang Ren, Ming Zhang, and Jiawei Han. Learning Distributed Node
Representations for Networks with Multiple Views. To appear in CIKM 2017.
50
![Page 51: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/51.jpg)
A Co-Regularization Approach ( Qu and Tang et al. 2017)
• Each node has a robust representation and multiple view-specific
representations
• Preserve the structure of different views through view-specific representations
• Promote the collaboration of different views to vote for robust representations
• Regularize the view-specific representations
51
![Page 52: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/52.jpg)
A Co-Regularization Approach ( Qu and Tang et al. 2017)
• Objective
View-specific objective
Regularization objective
:robust embedding of node i
:view-specific node embedding of node i
:weights of views of node i
52
![Page 53: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/53.jpg)
Learning the Weights of the Views via Neural Attention ( Qu and Tang et al. 2017)
• Define the attention weight of views for each node:
• According to the regularization term:
: concatenation of view-specific embeddings of node i
• Learning the weights with supervised data, e.g., node classification
: embedding of view k
53
![Page 54: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/54.jpg)
Results of Multi-label Node Classification
54
MVE > MVE-NoAttn > LINE/node2vec
Without learning the weights of views
Single best view
![Page 55: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/55.jpg)
Extensions
• Other variants
• Multi-view networks
• Networks with node attributes
• Heterogeneous networks
• Task-specific network embedding
55
![Page 56: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/56.jpg)
Networks with Node Attributes (Yang et al. 2015, N.Kipf et al. 2016, Liao et al. 2017)
• Networks with text information (Yang et al. 2015)
• Networks with attributes (Liao et al. 2017)
• Gender, location, text, „
• Variational graph autoencoders (N.Kipf et al. 2016)
• Encode the node with neighborhood structures and attributes
• Decode the neighborhood structures
56
• Cheng Yang, Zhiyuan Liu, Deli Zhao, Maosong Sun, Edward Y. Chang. Network representation learning with rich text information. IJCAI 2015.
• Thomas N.Kipf and Max Welling. Variational Graph Auto-encoders. NIPS Workshop 2016. • Lizi Liao, Xiangnan He, Hanwang Zhang, and Tat-Seng Chua. Attributed Social Network Embedding. arXiv, 2017.
![Page 57: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/57.jpg)
Extensions
• Other variants
• Multi-view networks
• Networks with node attributes
• Heterogeneous networks
• Task-specific network embedding
57
![Page 58: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/58.jpg)
Heterogeneous Network Embedding via Deep Architectures (Chang et al. 2015)
• Heterogeneous networks of images and text
• Make the embeddings of linked objects close to each other
• image-image, image-text, text-text
58
Siyu Chang, Wei Han, Jiliang Tang, Guo-Jun Qi, Charu C. Aggarwal, Thomas S. Huang. Heterogeneous network embedding via Deep Architectures. KDD’15
![Page 59: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/59.jpg)
Heterogeneous Star Network Embedding (Chen et al. 2017)
• Heterogeneous Star Networks
• Paper, keywords, authors, venues
• Aims to embed the center objects
• paper
59
Ting Chen and Yizhou Sun, "Task-Guided and Path-Augmented Heterogeneous Network Embedding for Author Identification. WSDM’17.
![Page 60: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/60.jpg)
Extensions
• Other variants
• Multi-view networks
• Networks with node attributes
• Heterogeneous networks
• Task-specific network embedding
60
![Page 61: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/61.jpg)
Semi-supervised Text Representation (Tang et al. 2015b) • Heterogeneous text network
• Word-word, word-document, and word-label networks
• Different levels of word co-occurrences: local context-level, document-level, label-level
• Learning word embeddings through jointly training the heterogeneous networks
• Document embeddings as the average of word embeddings
Unstructured text
Text representation, e.g., word and document representation, „
„
Deep learning has been attracting increasing „
A future direction of deep learning is to integrate „
The Skip-gram model is quite effective and efficient „ Information networks encode the relationships
label document
label
label
null
null
null
degree
network
edge
node word
document
classification
text
embedding
Word co-occurrence network
text
information
network
word …
classification
doc_1
doc_2
doc_3
doc_4 …
…
Word-document network
text
information
network
word …
classification
label_2
label_1
label_3 … …
Word-label network
Jian Tang, Meng Qu, and Qiaozhu Mei. PTE: Predictive Text Embedding through Large-scale Heterogeneous Text Networks. KDD’15.
61
![Page 62: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/62.jpg)
Results on Text Classification of Long Documents
20newsgroup Wikipedia IMDB
Type Algorithm Micro-F1 Macro-F1 Micro-F1 Macro-F1 Micro-F1 Macro-F1
Unsupervised LINE(G_wd) 79.73 78.40 80.14 80.13 89.14 89.14
Predictive embedding
CNN 80.15 79.43 79.25 79.32 89.00 89.00
PTE(G_wl) 82.70 81.97 79.00 79.02 85.98 85.98
PTE(G_ww+G_wl) 83.90 83.11 81.65 81.62 89.14 89.14
PTE(G_wd+G_wl) 84.39 83.64 82.29 82.27 89.76 89.76
PTE(joint) 84.20 83.39 82.51 82.49 89.80 89.80
PTE > CNN
62
![Page 63: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/63.jpg)
Results on Text Classification of Short Documents
20newsgroup Wikipedia IMDB
Type Algorithm Micro-F1 Macro-F1 Micro-F1 Macro-F1 Micro-F1 Macro-F1
Unsupervised LINE(G_ww) 74.22 70.12 71.13 71.12 73.84 73.84
Predictive embedding
CNN 76.16 73.08 72.71 72.69 75.97 75.96
PTE(G_wl) 76.45 72.74 73.44 73.42 73.92 73.91
PTE(G_ww+G_wl) 76.80 73.28 72.93 72.92 74.93 74.92
PTE(G_wd+G_wl) 77.46 74.03 73.13 73.11 75.61 75.61
PTE(joint) 77.15 73.61 73.58 73.57 75.21 75.21
PTE CNN »
63
![Page 64: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/64.jpg)
Semi-supervised Classification with Graph Convolutional Networks (Kipf et al. 2017)
• Task: Given a graph G = (V, E), and the features of nodes
, and the labels of a subset of nodes are given.
• Learning the node representations through multi-layer graph convolutional networks
• Combining node representations (self-link) and representations of neighbors
X Î RN´D
Thomas N.Kipf and Max Welling. Semi-Supervised Classification with Graph Convolutional Networks. ICLR’17.
64
![Page 65: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/65.jpg)
Multi-layer Graph Convolution Neural Networks
• Final objective:
65
• Starting from the node features • Define the propagation rule
Add the self-links
= Normalize the matrix
Nonlinear propagation
H (0) = X
![Page 66: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/66.jpg)
Experimental Results (Kipf et al. 2017)
66
GCN > Label Propagation
![Page 67: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/67.jpg)
Outline
• Part I: Learning Node Representations of Networks
• Related Work: Laplacian Eigenmap, Word2Vec
• LINE, DeepWalk, and Node2Vec
• Extensions
• Part II: Visualizing Networks and High-Dimensional Data
• t-SNE
• LargeVis
• Pat III: Learning Representations of Entire Networks
• CNN
• Neural Message Passing
• Part IV: Summary, Challenges & Future Work
67
![Page 68: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/68.jpg)
Extremely Low-dimensional Representations: 2D/3D for Visualizing Networks
„.
„. „. „.
„. Networks 2D/3D Layout
„.
High-dimensional Data
K-Nearest Neighbor Graph (KNN-G) Construction
Graph Layout
68 Heatmaps
Network Diagrams Scatter Plots
![Page 69: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/69.jpg)
t-SNE (Maarten and Hinton, 2008, 2014 )
• State-of-the-art algorithms for high-dimensional data visualization
• Deployed in Tensorflow for visualizing the representations learned by deep neural networks.
Visualizations of MNIST Data TensorBoard Visualizations by t-SNE
L.J.P. van der Maaten and G.E. Hinton. Visualizing High-Dimensional Data Using t-SNE. JMLR, 2008. L.J.P. van der Maaten. Accelerating t-SNE using Tree-Based Algorithms. JMLR, 2014.
69
![Page 70: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/70.jpg)
Constructing the K-nearest Neighbor Graph
• Finding the nearest neighbors for all the data points
• Vantage-point tree
• Calculating the similarities between the data points
• Complexity: O(NlogN) w.r.t. the number of data points N
: nearest neighbors of node i
70
![Page 71: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/71.jpg)
K-nearest Neighbor Graph Layout
• Similarity between two data points i and j in low-dimensional space is defined as:
• Objective: minimize the distance between the similarities defined in the high-dimensional spaces and low-dimensional spaces
• The complexity: O(NLogN) (Maarten, 2014).
: low-dimensional representations (coordinates) of node i
71
![Page 72: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/72.jpg)
Limitations of t-SNE
• K-NNG construction: complexity grows O(NlogN) to the number of data points N
• Graph layout: complexity is O(NlogN) • Very sensitive parameters
72
![Page 73: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/73.jpg)
LargeVis (Tang et al., Best Paper Nomination at WWW 2016) • Efficient approximation of K-NNG construction
• 30 times faster than t-SNE (3 million data points) • Better time-accuracy tradeoff
• Efficient probabilistic model for graph layout • O(NlogN) -> O(N) • 7 times faster than t-SNE (3 million data points) • Better visualization layouts • Stable parameters across different data sets
Jian Tang, Jingzhou Liu, Ming Zhang, and Qiaozhu Mei. Visualizing Large-scale and High-dimensional Data. WWW’16
73
![Page 74: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/74.jpg)
Random Projection Trees
• Partition the whole space into different regions with multiple hyperplanes
74
![Page 75: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/75.jpg)
Random Projection Trees
75
![Page 76: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/76.jpg)
Random Projection Trees
76
![Page 77: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/77.jpg)
Random Projection Trees
77
![Page 78: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/78.jpg)
Random Projection Trees
78
![Page 79: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/79.jpg)
K-NNG Construction
• Search nearest neighbors through traversing trees • Only data points in the leaf are considered as nearest neighbors
• Multiple trees are usually used to improve the accuracy • e.g., hundreds
79
![Page 80: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/80.jpg)
Reduce the Number of Trees
• Construct a less accurate K-NNG with a few trees • Iteratively refine the K-NNG through “neighbor exploring”
• “A neighbor of my neighbor is also likely to be my neighbor” • Second-order neighbors are also treated as candidates of first-order neighbors
80
![Page 81: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/81.jpg)
It Works!
• X axis: accuracy of K-NNG • Y axis: running time (minutes) • tSNE: 16 hours (95% accuracy) • LargeVis: 25 minutes
• >30 times faster than t-SNE
LargeVis
t-SNE
Random projection trees
81
![Page 82: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/82.jpg)
Learning the Layout of KNN Graph
• Preserve the similarities of the nodes in 2D/3D space • Represent each node with a 2D/3D vector • Keep similar data close while dissimilar data far apart
• Probability of observing a binary edge between nodes (i,j):
• Likelihood of observing a weighted edge between nodes (i,j):
p(eij =wij ) = p(eij =1)wij
82
![Page 83: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/83.jpg)
A Probabilistic Model for Graph Layout
• Objective:
• Randomly sample some negative edges • Optimized through asynchronous stochastic gradient descent • Time complexity: linear to the number of data points
γ : an unified weight assigned to negative edge
O = p(eij =wij )(i, j )ÎE
Õ (1- p(eij =wij )(i, j )ÎE
Õ )g
83
![Page 84: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/84.jpg)
It Works Too!
• Time complexity • t-SNE: O(NlogN) • LargeVis: O(N)
• On 3 million data points • t-SNE: 45 hours • LargeVis: 5.6 hours • Seven times faster LargeVis
t-SNE
84
![Page 85: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/85.jpg)
Visualization Quality
• Metric: classification accuracy with KNN
on 2D space
• Configuration: • LargeVis with default parameters
• t-SNE with default and optimal parameters
(tuned per data set)
• LargeVis t-SNE with optimal parameters
• LargeVis >> t-SNE with default parameters
• Parameters of LargeVis are very stable
LargeVis t-SNE (optimal parameters)
t-SNE (default parameters)
»
85
![Page 86: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/86.jpg)
10M Scientific Papers on One Slide
86
![Page 87: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/87.jpg)
10M Scientific Papers on One Slide
87
![Page 88: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/88.jpg)
Computer Science Mathematics
88
![Page 89: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/89.jpg)
Physics Biology
89
![Page 90: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/90.jpg)
Computer Science vs. Mathematics
90
![Page 91: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/91.jpg)
Computer Science vs. Physics
91
![Page 92: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/92.jpg)
Wikipedia Articles (color: semantic cluster)
92
![Page 93: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/93.jpg)
LiveJournal Network (color: community)
93
![Page 94: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/94.jpg)
Computer Science Authors (color: community)
94
![Page 95: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/95.jpg)
Summary • LargeVis: a new technique for visualizing networks
and high-dimensional data
• A better tool than t-SNE.
• >7 times faster than t-SNE on three million data points
95
![Page 96: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/96.jpg)
Software
https://github.com/tangjianpku/LINE
https://github.com/lferry007/LargeVis
LINE: (C++)
LargeVis : (C++&Python)
https://github.com/elbamos/largeVis
(307 stars, released since 2015.3)
(307 stars, released since 2016.7)
LargeVis Tutorial: https://jlorince.github.io/viz-tutorial/
Interactive Visualization:
https://github.com/NLeSC/DiVE
R version in CRAN:
Our release:
Other tools based on our implementation:
„
96
![Page 97: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/97.jpg)
Outline
• Part I: Learning Node Representations of Networks
• Laplacian Eigenmap
• Word2Vec
• LINE, DeepWalk, and Node2Vec
• Part II: Visualizing Networks and High-Dimensional Data
• t-SNE
• LargeVis
• Pat III: Learning Representations of Entire Networks
• CNN
• Neural Message Passing
• Part IV: Summary, Challenges & Future Work
97
![Page 98: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/98.jpg)
Learning Representations of Entire (Small) Networks
98
• A collection of small networks with different structures • How to represent an entire (small) network
Sentence dependency graph
Molecules
„ Ego-networks
![Page 99: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/99.jpg)
Convolutional Neural Networks
99
![Page 100: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/100.jpg)
PATCHY-SAN [Niepert+ '16]
Node selection (w=4 nodes)
c f g
c
d
b
a
f
e
g
h
i
d
c
d
b
f
e
c
f
e
g
d f
g i
Neighborhood assembly (at least k=4 nodes)
Neighborhood normalization (exactly k=4 nodes)
1
4
3
2
3
1
2
4
3 2
1 4
j
c
d
g i
j
1
2 3
4
100
![Page 101: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/101.jpg)
PATCHY-SAN [Niepert+ '16]
1 2 3 4 a1 a2 „ an
Attributes
Nodes
Apply CNNs
101
Normalized neighborhood
1
4
3
2 1
2 3
4
1 2 3 4 a1 a2 „ an
„
„
![Page 102: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/102.jpg)
Neural Message Passing
102
![Page 103: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/103.jpg)
Structure2vec (Dai et al. 2016)
103
• ũi = σ (W1xi+W2Σ j∈N(i) ũj+W3Σ j∈N(i)xj)
• {W1, W2, W3} are parameters
• N(i) are neighbors of i
• σ is an activation function
Message passing
![Page 104: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/104.jpg)
Structure2vec (Dai et al. 2016)
104
Message passing
• We have embedding vectors {ui}
• ũi = σ (W1xi+W2Σ j∈N(i) ũj+W3Σ j∈N(i)xj)
• Represent a graph by Σ i ũi
• Minimize the empirical square loss
• (y - θ Tσ (Σ i ũi))2
• y is the graph label
• θ is a parameter
![Page 105: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/105.jpg)
Neural Message Passing Framework (Gilmer et al. 2017)
105
• Message passing phrase
• Readout phase
v
w1 w2
w3 w4
: message function
: vertex update function
: readout function
![Page 106: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/106.jpg)
Gated Graph Neural Networks (Li et al. 2016)
106
• Message function:
• Vertex updating function
• Readout function
a learned matrix for each type of edge (discrete type)
Neural networks
![Page 107: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/107.jpg)
Neural Message Passing (Gilmer et al. 2017)
107
• Message function:
• Vertex updating function
• Readout function
or Set2Set (Vinyals et al. 2015)
A neural network maps a edge vector (continuous) to a matrix
Permutation invariant
![Page 108: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/108.jpg)
Neural Message Passing (Gilmer et al. 2017)
108
• Add virtual edges
• Add a specific type of edge between nodes without edges
• Add a master node
• Connect to every node
Table :Results on the task of predicting chemical properties (QM9 Dataset)
![Page 109: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/109.jpg)
Summary
• Convolutional Neural Network
• Define “receptive field” on graphs
• Neural message passing
• Define message passing, node updating, and graph pooling functions
109
![Page 110: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/110.jpg)
Outline
• Part I: Learning Node Representations of Networks
• Laplacian Eigenmap
• Word2Vec
• LINE, DeepWalk, and Node2Vec
• Part II: Visualizing Networks and High-Dimensional Data
• t-SNE
• LargeVis
• Pat III: Learning Representations of Entire Networks
• CNN
• Neural message passing
• Part IV: Summary, Challenges & Future Work
110
![Page 111: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/111.jpg)
Summary
• Network representation: a new methodology for analyzing and mining networks
• State-of-the-art approaches for node representation learning
• LINE, DeepWalk, and Node2Vec
• Moving towards to task-specific node representations (e.g., PTE and GraphConv)
• Visualizing large-scale networks and high-dimensional data
• LargeVis
• Sales up to tens of millions of nodes or data points
• Learning representations of network substructures
• CNN, Neural message passing 111
![Page 112: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/112.jpg)
Challenges & Future Work
• Scalability
• How to scale up to networks with billions of nodes
• Hierarchical representations
• How to learn hierarchical representations of networks
• Dynamic
• Heterogeneous networks
• Multiple types of nodes, multiple types of edges
• Learning isomorphism-invariance representations of entire networks
• „
112
![Page 113: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/113.jpg)
References ### Node Embeddings ###
[Belkin et al. 2003] Mikhail Belkin and Partha Niyogi. Laplacian Eigenmaps for Dimensionality Reduction and Data Representation. Neural Computation, 2003.
[Mikolov et al. 2014] Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, Jeffrey Dean. Distributed Representations of Words and Phrases and their Compositionality. NIPS 2014.
[Tang et al. 2015a] Jian Tang, Meng Qu, Mingzhe Wang, Jun Yan, and Qiaozhu Mei. LINE: Large-scale Information Network Embedding. WWW’15
[Perozzi et al. 2014] Bryan Perozzi, Rami Al-Rfou, Steven Skiena. DeepWalk: Online Learning of Social Representations. KDD’14
[Grover et al. 2016] Aditya Grover and Jure Leskovec. node2vec: Scalable Feature Learning for Networks. KDD’16
[Cao et al. 2015] Shaosheng Cao, Wei Lu, and Qiongkai Xu. GraRep: learning graph representations with global structural information. CIKM’15.
[Qu et al. 2017] Meng Qu, Jian Tang, Jingbo Shang, Xiang Ren, Ming Zhang, and Jiawei Han. Learning Distributed Node Representations for Networks with Multiple Views.
[Yang et al. 2015] Cheng Yang, Zhiyuan Liu, Deli Zhao, Maosong Sun, Edward Y. Chang. Network representation learning with rich text information. IJCAI 2015.
[Kipf et al. 2016] Thomas N.Kipf and Max Welling. Variational Graph Auto-encoders. NIPS Workshop 2016.
[Liao et al. 2017]Lizi Liao, Xiangnan He, Hanwang Zhang, and Tat-Seng Chua. Attributed Social Network Embedding. arXiv, 2017.
[Tang et al. 2015b] Jian Tang, Meng Qu, and Qiaozhu Mei. PTE: Predictive Text Embedding through Large-scale Heterogeneous Text Networks. KDD’15.
[Kipf et al. 2017]Thomas N.Kipf and Max Welling. Semi-Supervised Classification with Graph Convolutional Networks. ICLR’17.
[Chang et al. 2017] Siyu Chang, Wei Han, Jiliang Tang, Guo-Jun Qi, Charu C. Aggarwal, Thomas S. Huang. Heterogeneous network embedding via Deep Architectures. KDD’15
[Chen et al. 2017] Ting Chen and Yizhou Sun, "Task-Guided and Path-Augmented Heterogeneous Network Embedding for Author Identification. WSDM’17.
113
![Page 114: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/114.jpg)
References [Wang et al. 2017] Daixin Wang, Peng Cui, Wenwu Zhu. Structural deep network embedding. KDD, 2016.
### Node Visualizations ###
[Maaten et al. 2008] L.J.P. van der Maaten and G.E. Hinton. Visualizing High-Dimensional Data Using t-SNE. JMLR, 2008.
[Maaten et al. 2014] L.J.P. van der Maaten. Accelerating t-SNE using Tree-Based Algorithms. JMLR, 2014.
[Tang et al. 2016] Jian Tang, Jingzhou Liu, Ming Zhang, and Qiaozhu Mei. Visualizing Large-scale and High-dimensional Data. WWW’16
### Graph Embeddings ###
[Li et al. 2016] Cheng Li, Xiaoxiao Guo, and Qiaozhu Mei. 2016. DeepGraph: Graph Structure Predicts Network Growth. arXiv preprint arXiv:1610.06251 (2016).
[Niepert et al. 2016] Mathias Niepert, Mohamed Ahmed, and Konstantin Kutzkov. 2016. Learning convolutional neural networks for graphs. In Proceedings of the 33rd annual international conference on machine learning. ACM.
[Li et al. 2017] Cheng Li, Jiaqi Ma, Xiaoxiao Guo, and Qiaozhu Mei. 2017. DeepCas: an End-to-end Predictor of Information Cascades. In Proceedings of the 26th international conference on World wide web.
[Dai et al. 2016] Dai, Hanjun, Bo Dai, and Le Song. "Discriminative embeddings of latent variable models for structured data." International Conference on Machine Learning. 2016.
[Gilmer et al. 2017] J. Gilmer, S. S. Schoenholz, P. F. Riley, O. Vinyals, G. E. Dahl. Neural message passing for quantum chemistry. arXiv, 2017.
[Li et al. 2016] Y. Li, D. Tarlow, M. Brockschmidt, R. Zemel. Gated graph sequence neural networks. ICLR, 2016.
114
![Page 116: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/116.jpg)
Optimization
• The gradient w.r.t. the embedding
• Z is the partition function:
• The complexity w.r.t. the number of data points N is O(N^2)
• Too expensive!
116
![Page 117: Learning Representations of Large-Scale Networksir.sdu.edu.cn/~zhuminchen/RL/tangjian2017.pdf · Learning Representations of Large-Scale Networks Jian Tang HEC Montréal Montréal](https://reader030.fdocuments.net/reader030/viewer/2022021803/5b93282709d3f27f5d8c9d96/html5/thumbnails/117.jpg)
Barnes-Hut Approximation
• Rewriting the gradient as:
Attractive forces Complexity: linear to the number of edges
Repulsive forces Complexity: O(N^2)
• Constructing a quadtree of the nodes according to the current low-dimensional representations
Sum of node i and nodes in a cell:
From O(N^2) to O(NLogN)!
117