![Page 1: Learning Mixed Membership Models using Spectral Methodscseweb.ucsd.edu/~slovett/workshops/big-graphs-2016/talks/anandkumar.pdfDatavs. Information Big data: Missing observations, gross](https://reader034.fdocuments.net/reader034/viewer/2022050419/5f8f1b772810091f917058ea/html5/thumbnails/1.jpg)
Learning Mixed Membership Models using Spectral Methods
Anima Anandkumar
U.C. Irvine
![Page 2: Learning Mixed Membership Models using Spectral Methodscseweb.ucsd.edu/~slovett/workshops/big-graphs-2016/talks/anandkumar.pdfDatavs. Information Big data: Missing observations, gross](https://reader034.fdocuments.net/reader034/viewer/2022050419/5f8f1b772810091f917058ea/html5/thumbnails/2.jpg)
Data vs. Information
Big data: Missing observations, gross corruptions, outliers.
Learning useful information is like finding a needle in a haystack!
Learning with big data: statistically and computationally challenging!
![Page 3: Learning Mixed Membership Models using Spectral Methodscseweb.ucsd.edu/~slovett/workshops/big-graphs-2016/talks/anandkumar.pdfDatavs. Information Big data: Missing observations, gross](https://reader034.fdocuments.net/reader034/viewer/2022050419/5f8f1b772810091f917058ea/html5/thumbnails/3.jpg)
Optimization for Learning
Most learning problems can be cast as optimization.
Unsupervised Learning
Clustering: k-means, hierarchical, ...
Maximum likelihood estimator: probabilistic latent variable models
Supervised Learning
Optimizing a neural network with respect to a loss function
(Figure: neural network with input, neuron, and output labels.)
![Page 5: Learning Mixed Membership Models using Spectral Methodscseweb.ucsd.edu/~slovett/workshops/big-graphs-2016/talks/anandkumar.pdfDatavs. Information Big data: Missing observations, gross](https://reader034.fdocuments.net/reader034/viewer/2022050419/5f8f1b772810091f917058ea/html5/thumbnails/5.jpg)
Convex vs. Non-convex Optimization
Progress is only the tip of the iceberg. The real world is mostly non-convex!
Images taken from https://www.facebook.com/nonconvex
![Page 9: Learning Mixed Membership Models using Spectral Methodscseweb.ucsd.edu/~slovett/workshops/big-graphs-2016/talks/anandkumar.pdfDatavs. Information Big data: Missing observations, gross](https://reader034.fdocuments.net/reader034/viewer/2022050419/5f8f1b772810091f917058ea/html5/thumbnails/9.jpg)
Convex vs. Nonconvex Optimization
Convex: unique optimum (global = local). Non-convex: multiple local optima.
In high dimensions, possibly exponentially many local optima.
How to deal with non-convexity?
Spectral methods provide an answer.
![Page 10: Learning Mixed Membership Models using Spectral Methodscseweb.ucsd.edu/~slovett/workshops/big-graphs-2016/talks/anandkumar.pdfDatavs. Information Big data: Missing observations, gross](https://reader034.fdocuments.net/reader034/viewer/2022050419/5f8f1b772810091f917058ea/html5/thumbnails/10.jpg)
Outline
1 Introduction
2 Spectral Methods: Matrices and Tensors
3 Spectral Methods for Learning Community Models
   Community Detection in Graphs
   Spectral Methods for Community Detection
   Theoretical Guarantees
   Experimental Results
4 Conclusion
![Page 11: Learning Mixed Membership Models using Spectral Methodscseweb.ucsd.edu/~slovett/workshops/big-graphs-2016/talks/anandkumar.pdfDatavs. Information Big data: Missing observations, gross](https://reader034.fdocuments.net/reader034/viewer/2022050419/5f8f1b772810091f917058ea/html5/thumbnails/11.jpg)
Matrices and Tensors as Data Structures
Modern data: multi-modal, multi-relational data.
Matrices: pairwise relations. Tensors: higher order relations.
Multi-modal data figure from Lise Getoor slides.
![Page 13: Learning Mixed Membership Models using Spectral Methodscseweb.ucsd.edu/~slovett/workshops/big-graphs-2016/talks/anandkumar.pdfDatavs. Information Big data: Missing observations, gross](https://reader034.fdocuments.net/reader034/viewer/2022050419/5f8f1b772810091f917058ea/html5/thumbnails/13.jpg)
Tensor Representation of Higher Order Moments
Multivariate Moments: Many possibilities, . . .
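For a random vector $x$, can compute empirical moments
$$\mathbb{E}[x \otimes x], \quad \mathbb{E}[x \otimes x \otimes x], \quad \ldots$$
Invert moments to learn about model parameters.
Matrix: Pairwise Moments
$\mathbb{E}[x \otimes x] \in \mathbb{R}^{d \times d}$ is a second order tensor.
$\mathbb{E}[x \otimes x]_{i_1, i_2} = \mathbb{E}[x_{i_1} x_{i_2}]$.
For matrices: $\mathbb{E}[x \otimes x] = \mathbb{E}[x x^\top]$.
Tensor: Higher order Moments
$\mathbb{E}[x \otimes x \otimes x] \in \mathbb{R}^{d \times d \times d}$ is a third order tensor.
$\mathbb{E}[x \otimes x \otimes x]_{i_1, i_2, i_3} = \mathbb{E}[x_{i_1} x_{i_2} x_{i_3}]$.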
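The moment definitions on this slide can be sketched numerically. The snippet below is only an illustration: the function name `empirical_moments` and the Gaussian sample data are assumptions, not part of the talk. It averages outer products over samples to estimate the second and third order moment tensors.

```python
import numpy as np

def empirical_moments(samples):
    """Estimate E[x (x) x] and E[x (x) x (x) x] from the rows of `samples`."""
    n = samples.shape[0]
    # Average of outer products x x^T over the n samples.
    m2 = np.einsum("ni,nj->ij", samples, samples) / n
    # Average of third order outer products over the n samples.
    m3 = np.einsum("ni,nj,nk->ijk", samples, samples, samples) / n
    return m2, m3

# Hypothetical data: 1000 samples of a d = 4 dimensional random vector.
rng = np.random.default_rng(0)
x = rng.standard_normal((1000, 4))
m2, m3 = empirical_moments(x)
print(m2.shape, m3.shape)  # (4, 4) (4, 4, 4)
```

The second moment is a d × d matrix and the third moment is a d × d × d array, matching the dimensions stated above.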
![Page 15: Learning Mixed Membership Models using Spectral Methodscseweb.ucsd.edu/~slovett/workshops/big-graphs-2016/talks/anandkumar.pdfDatavs. Information Big data: Missing observations, gross](https://reader034.fdocuments.net/reader034/viewer/2022050419/5f8f1b772810091f917058ea/html5/thumbnails/15.jpg)
Spectral Decomposition of Tensors
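Matrix $M_2$:
$$M_2 = \sum_i \lambda_i\, u_i \otimes v_i = \lambda_1 u_1 \otimes v_1 + \lambda_2 u_2 \otimes v_2 + \cdots$$
Tensor $M_3$:
$$M_3 = \sum_i \lambda_i\, u_i \otimes v_i \otimes w_i = \lambda_1 u_1 \otimes v_1 \otimes w_1 + \lambda_2 u_2 \otimes v_2 \otimes w_2 + \cdots$$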
We have developed efficient methods to solve tensor decomposition.
![Page 21: Learning Mixed Membership Models using Spectral Methodscseweb.ucsd.edu/~slovett/workshops/big-graphs-2016/talks/anandkumar.pdfDatavs. Information Big data: Missing observations, gross](https://reader034.fdocuments.net/reader034/viewer/2022050419/5f8f1b772810091f917058ea/html5/thumbnails/21.jpg)
Orthogonal Tensor Power Method
Symmetric orthogonal tensor $T \in \mathbb{R}^{d \times d \times d}$:
$$T = \sum_{i \in [k]} \lambda_i\, v_i \otimes v_i \otimes v_i.$$
Recall matrix power method:
$$v \mapsto \frac{M(I, v)}{\|M(I, v)\|} = \frac{Mv}{\|Mv\|}.$$
Algorithm: tensor power method:
$$v \mapsto \frac{T(I, v, v)}{\|T(I, v, v)\|}.$$
• The $v_i$'s are the only robust fixed points.
• All other eigenvectors are saddle points.
For an orthogonal tensor, no spurious local optima!
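The update on this slide can be tried out on a small synthetic tensor. What follows is a minimal sketch rather than the speaker's implementation: the random restarts, the fixed iteration count, the deflation step, and all function names are assumptions made for illustration, and positive weights λ_i are assumed.

```python
import numpy as np

def power_iteration(T, v0, n_iter=100):
    """Repeat v <- T(I, v, v) / ||T(I, v, v)|| starting from v0."""
    v = v0 / np.linalg.norm(v0)
    for _ in range(n_iter):
        w = np.einsum("ijk,j,k->i", T, v, v)   # T(I, v, v)
        v = w / np.linalg.norm(w)
    lam = np.einsum("ijk,i,j,k->", T, v, v, v)  # T(v, v, v)
    return lam, v

def decompose(T, k, n_restarts=10, seed=0):
    """Recover k components one at a time, deflating after each (assumed scheme)."""
    rng = np.random.default_rng(seed)
    T = T.copy()
    lams, vecs = [], []
    for _ in range(k):
        runs = [power_iteration(T, rng.standard_normal(T.shape[0]))
                for _ in range(n_restarts)]
        lam, v = max(runs, key=lambda run: run[0])    # keep the largest weight
        lams.append(lam)
        vecs.append(v)
        T -= lam * np.einsum("i,j,k->ijk", v, v, v)   # remove the found term
    return np.array(lams), np.array(vecs)

# Synthetic orthogonal tensor with d = 5, k = 3 (hypothetical test data).
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
V = Q[:, :3]                                   # orthonormal columns v_1..v_3
true_lams = np.array([3.0, 2.0, 1.0])
T = np.einsum("i,ai,bi,ci->abc", true_lams, V, V, V)
lams, vecs = decompose(T, k=3)
```

On this exactly orthogonal input, the recovered weights in `lams` should match `true_lams` up to ordering, and each row of `vecs` should align with one column of `V`.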
![Page 22: Learning Mixed Membership Models using Spectral Methodscseweb.ucsd.edu/~slovett/workshops/big-graphs-2016/talks/anandkumar.pdfDatavs. Information Big data: Missing observations, gross](https://reader034.fdocuments.net/reader034/viewer/2022050419/5f8f1b772810091f917058ea/html5/thumbnails/22.jpg)
Guaranteed Tensor Decomposition
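Non-orthogonal decomposition: $T_1 = \sum_{i \in [k]} \lambda_i\, a_i \otimes a_i \otimes a_i$.
Whitening matrix $W$: computed from tensor slices.
Multilinear transform: $T_2 = T_1(W, W, W)$ to orthogonalize.
To do whitening we need $A_1$ to be full column rank.
(Figure: $W$ maps the components $a_1, a_2, a_3$ of tensor $T_1$ to the orthogonal components $v_1, v_2, v_3$ of tensor $T_2$.)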
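The whitening step can be illustrated on synthetic data. The sketch below makes two assumptions that go beyond the slide: it reads $A_1$ as the matrix whose columns are the $a_i$, and it builds $W$ from the pairwise matrix $\sum_i \lambda_i a_i a_i^\top$ rather than from tensor slices as the slide states. Function names are hypothetical.

```python
import numpy as np

def whitening_matrix(M2, k):
    """Return W (d x k) with W^T M2 W = I, from the top-k eigenpairs of M2."""
    eigvals, eigvecs = np.linalg.eigh(M2)
    top = np.argsort(eigvals)[::-1][:k]
    return eigvecs[:, top] / np.sqrt(eigvals[top])

def multilinear(T, W):
    """Return T(W, W, W): contract every mode of T with the columns of W."""
    return np.einsum("abc,ai,bj,ck->ijk", T, W, W, W)

# Hypothetical non-orthogonal components: d = 6, k = 3, full column rank.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))
lam = np.array([0.5, 0.3, 0.2])
M2 = (A * lam) @ A.T                                # sum_i lam_i a_i a_i^T
T1 = np.einsum("i,ai,bi,ci->abc", lam, A, A, A)     # sum_i lam_i a_i^(x)3

W = whitening_matrix(M2, k=3)
T2 = multilinear(T1, W)
V = (W.T @ A) * np.sqrt(lam)    # columns v_i = sqrt(lam_i) W^T a_i
```

Under these assumptions the columns of `V` are orthonormal and `T2` equals $\sum_i \lambda_i^{-1/2}\, v_i \otimes v_i \otimes v_i$, so the orthogonal power method applies to `T2`.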
![Page 23: Learning Mixed Membership Models using Spectral Methodscseweb.ucsd.edu/~slovett/workshops/big-graphs-2016/talks/anandkumar.pdfDatavs. Information Big data: Missing observations, gross](https://reader034.fdocuments.net/reader034/viewer/2022050419/5f8f1b772810091f917058ea/html5/thumbnails/23.jpg)
Summary on Tensor Methods
(Figure: a tensor written as a sum of rank-one terms.)
Tensor decomposition: iterative algorithm with tensor contractions.
Tensor contraction: T (W1,W2,W3), where Wi are matrices/vectors.
Strengths of tensor method
Guaranteed convergence to globally optimal solution!
Fast and accurate, orders of magnitude faster than previous methods.
Embarrassingly parallel and suited for distributed systems, e.g. Spark.
Exploit optimized BLAS operations on CPUs and GPUs.
Open source software available: on CPU, GPU and Spark.
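The contraction $T(W_1, W_2, W_3)$ named on this slide can be written as a single `einsum` call. This is only a sketch: the helper name `contract` and the convention of treating a vector as a one-column matrix are assumptions for illustration.

```python
import numpy as np

def contract(T, W1, W2, W3):
    """Return T(W1, W2, W3) with entries
    sum_{a,b,c} T[a,b,c] * W1[a,i] * W2[b,j] * W3[c,k].
    A 1-D vector is treated as a single-column matrix."""
    W1, W2, W3 = (np.reshape(W, (np.shape(W)[0], -1)) for W in (W1, W2, W3))
    return np.einsum("abc,ai,bj,ck->ijk", T, W1, W2, W3)

# Hypothetical data: a random 4 x 4 x 4 tensor and a random vector.
rng = np.random.default_rng(0)
d = 4
T = rng.standard_normal((d, d, d))
v = rng.standard_normal(d)
Tv = contract(T, np.eye(d), v, v)[:, 0, 0]   # the power-method map T(I, v, v)
```

Because each contraction reduces to dense multiply-and-sum operations, it maps naturally onto BLAS-style kernels.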
![Page 28: Learning Mixed Membership Models using Spectral Methodscseweb.ucsd.edu/~slovett/workshops/big-graphs-2016/talks/anandkumar.pdfDatavs. Information Big data: Missing observations, gross](https://reader034.fdocuments.net/reader034/viewer/2022050419/5f8f1b772810091f917058ea/html5/thumbnails/28.jpg)
Mining Graph Data
Various Statistics
Clustering/Grouping Nodes
Link Prediction: Predict Missing Edges
Accomplish all tasks via statistical community models
![Page 29: Learning Mixed Membership Models using Spectral Methodscseweb.ucsd.edu/~slovett/workshops/big-graphs-2016/talks/anandkumar.pdfDatavs. Information Big data: Missing observations, gross](https://reader034.fdocuments.net/reader034/viewer/2022050419/5f8f1b772810091f917058ea/html5/thumbnails/29.jpg)
Network Communities in Various Domains
Social Networks
Social ties: e.g. friendships, co-authorships
Biological Networks
Functional relationships: e.g. gene regulation, neural activity.
Recommendation Systems
Recommendations: e.g. Yelp reviews.
Community Detection: Infer hidden communities from observed network.
![Page 30: Learning Mixed Membership Models using Spectral Methodscseweb.ucsd.edu/~slovett/workshops/big-graphs-2016/talks/anandkumar.pdfDatavs. Information Big data: Missing observations, gross](https://reader034.fdocuments.net/reader034/viewer/2022050419/5f8f1b772810091f917058ea/html5/thumbnails/30.jpg)
Mixed Membership Community Models
![Page 36: Learning Mixed Membership Models using Spectral Methodscseweb.ucsd.edu/~slovett/workshops/big-graphs-2016/talks/anandkumar.pdfDatavs. Information Big data: Missing observations, gross](https://reader034.fdocuments.net/reader034/viewer/2022050419/5f8f1b772810091f917058ea/html5/thumbnails/36.jpg)
Connection to LDA Topic Models
What statistics yield guaranteed learning of communities?
Topic Model Analogy
(Figure: node x linked to nodes a, b, c; labels: Word 1, Word 2, Word 3, Documents.)
Subgraph Counts as Graph Moments
![Page 39: Learning Mixed Membership Models using Spectral Methodscseweb.ucsd.edu/~slovett/workshops/big-graphs-2016/talks/anandkumar.pdfDatavs. Information Big data: Missing observations, gross](https://reader034.fdocuments.net/reader034/viewer/2022050419/5f8f1b772810091f917058ea/html5/thumbnails/39.jpg)
Subgraph Counts as Graph Moments
Utilize presence of star counts for learning
2-Star Count Matrix
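$$M_2(a, b) = \frac{1}{|X|}\, \#\text{ of common neighbors in } X = \frac{1}{|X|} \sum_{x \in X} G(x, a)\, G(x, b).$$
$$M_2 = \frac{1}{|X|} \sum_{x \in X} \left[ G_{x,A}^\top \otimes G_{x,B}^\top \right]$$
(Figure: node x in X with neighbors a in A and b in B.)
n: Network size. $M_2 \in \mathbb{R}^{|A| \times |B|}$, of size $O(n \times n)$.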
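The 2-star count matrix can be computed directly from an adjacency matrix. The sketch below assumes `G` is a dense 0/1 NumPy array and that `X`, `A`, `B` are lists of node indices; the function name and the toy graph are hypothetical.

```python
import numpy as np

def two_star_counts(G, X, A, B):
    """M2[a, b] = (1/|X|) * sum over x in X of G[x, a] * G[x, b]."""
    G_XA = G[np.ix_(X, A)]          # rows x in X, columns a in A
    G_XB = G[np.ix_(X, B)]          # rows x in X, columns b in B
    return G_XA.T @ G_XB / len(X)

# Toy undirected graph on 6 nodes (hypothetical edges).
G = np.zeros((6, 6), dtype=int)
for u, v in [(0, 2), (0, 4), (1, 2), (1, 4), (1, 5)]:
    G[u, v] = G[v, u] = 1

M2 = two_star_counts(G, X=[0, 1], A=[2, 3], B=[4, 5])
```

In this toy graph, nodes 2 and 4 are both adjacent to nodes 0 and 1, so the entry for the pair (2, 4) is 2/2 = 1, while the pair (2, 5) shares only node 1 and gets 1/2.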
![Page 40: Learning Mixed Membership Models using Spectral Methodscseweb.ucsd.edu/~slovett/workshops/big-graphs-2016/talks/anandkumar.pdfDatavs. Information Big data: Missing observations, gross](https://reader034.fdocuments.net/reader034/viewer/2022050419/5f8f1b772810091f917058ea/html5/thumbnails/40.jpg)
3-star Counts
Third order tensor: three-dimensional array.
3-Star Count Tensor
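$$M_3(a, b, c) = \frac{1}{|X|}\, \#\text{ of common neighbors in } X = \frac{1}{|X|} \sum_{x \in X} G(x, a)\, G(x, b)\, G(x, c).$$
$$M_3 = \frac{1}{|X|} \sum_{x \in X} \left[ G_{x,A}^\top \otimes G_{x,B}^\top \otimes G_{x,C}^\top \right]$$
(Figure: node x in X with neighbors a in A, b in B, and c in C.)
n: Network size. $M_3 \in \mathbb{R}^{|A| \times |B| \times |C|}$, of size $O(n \times n \times n)$.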
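The 3-star count tensor follows the same pattern with one more partition. As before, this is a sketch under assumptions: `G` is taken to be a dense 0/1 adjacency array, the partitions are index lists, and the function name and random test graph are hypothetical.

```python
import numpy as np

def three_star_counts(G, X, A, B, C):
    """M3[a, b, c] = (1/|X|) * sum over x in X of G[x, a] * G[x, b] * G[x, c]."""
    G_XA = G[np.ix_(X, A)]
    G_XB = G[np.ix_(X, B)]
    G_XC = G[np.ix_(X, C)]
    return np.einsum("xa,xb,xc->abc", G_XA, G_XB, G_XC) / len(X)

# Random undirected graph on 12 nodes, split into four partitions of 3 nodes.
rng = np.random.default_rng(0)
upper = np.triu(rng.integers(0, 2, size=(12, 12)), k=1)
G = upper + upper.T
X, A, B, C = [0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]
M3 = three_star_counts(G, X, A, B, C)
```

Forming the full tensor costs memory cubic in the partition sizes, which is consistent with the $O(n \times n \times n)$ size noted on the slide.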
![Page 41: Learning Mixed Membership Models using Spectral Methodscseweb.ucsd.edu/~slovett/workshops/big-graphs-2016/talks/anandkumar.pdfDatavs. Information Big data: Missing observations, gross](https://reader034.fdocuments.net/reader034/viewer/2022050419/5f8f1b772810091f917058ea/html5/thumbnails/41.jpg)
Multi-view Representation
Conditional independence of the three views
πx: community membership vector of node x.
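(Figure, panel "3-stars": node x in X with neighbors in A, B, C. Panel "Graphical model": hidden $\pi_x$ with views $G_{x,A}^\top$, $G_{x,B}^\top$, $G_{x,C}^\top$ and parameters $F_A$, $F_B$, $F_C$.)
Linear Multiview Model: $\mathbb{E}[G_{x,A}^\top \mid \Pi] = F_A \pi_x$.
$$\mathbb{E}[M_3 \mid \Pi_{A,B,C}] = \sum_{i \in [k]} \lambda_i \left[ (F_A)_i \otimes (F_B)_i \otimes (F_C)_i \right]$$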
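The linear multiview relation can be made concrete with a small numeric example. The slide does not define $F_A$; the sketch below assumes the usual mixed membership block model form, in which an edge between x and a appears with probability $\pi_x^\top P \pi_a$ and hence $F_A = \Pi_A^\top P^\top$. The connectivity matrix `P`, the memberships, and the function name are all hypothetical.

```python
import numpy as np

def view_parameters(Pi, P, A):
    """Return F_A = Pi_A^T P^T, assuming edge probability pi_x^T P pi_a."""
    Pi_A = Pi[:, A]                 # k x |A| memberships of the nodes in A
    return Pi_A.T @ P.T             # |A| x k

# Hypothetical model: k = 2 communities, p = 0.8 within, q = 0.1 across.
P = np.array([[0.8, 0.1],
              [0.1, 0.8]])
# Columns are membership vectors pi_0 .. pi_3 (each sums to 1).
Pi = np.array([[1.0, 1.0, 0.0, 0.5],
               [0.0, 0.0, 1.0, 0.5]])

F_A = view_parameters(Pi, P, A=[1, 2, 3])
expected_edges = F_A @ Pi[:, 0]     # E[G_{x,A}^T | Pi] for x = node 0
```

Here node 0 belongs purely to community 0, so its expected edge to node 1 (same community) is p = 0.8, to node 2 (other community) is q = 0.1, and to the evenly mixed node 3 is 0.45.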
![Page 44: Learning Mixed Membership Models using Spectral Methodscseweb.ucsd.edu/~slovett/workshops/big-graphs-2016/talks/anandkumar.pdfDatavs. Information Big data: Missing observations, gross](https://reader034.fdocuments.net/reader034/viewer/2022050419/5f8f1b772810091f917058ea/html5/thumbnails/44.jpg)
Main Results
k communities, n nodes. Uniform communities.
α0: Sparsity level of community memberships (Dirichlet parameter).
p, q: intra/inter-community edge density.
Scaling Requirements
$$n = \Omega\left(k^2 (\alpha_0 + 1)^3\right), \qquad \frac{p - q}{\sqrt{p}} = \Omega\left(\frac{(\alpha_0 + 1)^{1.5}\, k}{\sqrt{n}}\right).$$
For stochastic block model ($\alpha_0 = 0$), tight results
Tight guarantees for sparse graphs (scaling of p, q)
Tight guarantees on community size: require at least $\sqrt{n}$ sized communities
Efficient scaling w.r.t. sparsity level of memberships $\alpha_0$
“A Tensor Spectral Approach to Learning Mixed Membership Community Models” by A. Anandkumar, R. Ge, D. Hsu, and S.M. Kakade. COLT 2013.
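A short worked step, not on the slide, may help connect the scaling requirements to the community-size statement. Assuming uniform communities, so that each has size $n/k$, and setting $\alpha_0 = 0$ in the requirements as stated on this slide:

$$
n = \Omega(k^2) \;\iff\; k = O(\sqrt{n}) \;\iff\; \frac{n}{k} = \Omega(\sqrt{n}),
\qquad
\frac{p - q}{\sqrt{p}} = \Omega\!\left(\frac{k}{\sqrt{n}}\right).
$$

Read this way, the first condition says each community must contain on the order of $\sqrt{n}$ nodes or more, and the second bounds how close the intra- and inter-community densities may be.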
![Page 46: Learning Mixed Membership Models using Spectral Methodscseweb.ucsd.edu/~slovett/workshops/big-graphs-2016/talks/anandkumar.pdfDatavs. Information Big data: Missing observations, gross](https://reader034.fdocuments.net/reader034/viewer/2022050419/5f8f1b772810091f917058ea/html5/thumbnails/46.jpg)
Main Results (Contd)
α0: Sparsity level of community memberships (Dirichlet parameter).
Π: Community membership matrix, Π(i): ith community
S: Estimated supports, S(i, j): Support for node j in community i.
Norm Guarantees
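$$\frac{1}{n} \max_i \|\hat{\Pi}_i - \Pi_i\|_1 = O\left(\frac{(\alpha_0 + 1)^{3/2} \sqrt{p}}{(p - q)\sqrt{n}}\right)$$
Support Recovery
$\exists\, \xi$ s.t. for all nodes $j \in [n]$ and all communities $i \in [k]$, w.h.p.
$$\Pi(i, j) \ge \xi \Rightarrow S(i, j) = 1 \quad \text{and} \quad \Pi(i, j) \le \frac{\xi}{2} \Rightarrow S(i, j) = 0.$$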
Zero-error Support Recovery of Significant Memberships of All Nodes
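The two quantities on this slide can be computed as follows. This is a sketch under assumptions: memberships are stored as a k × n array with one row per community, the estimate is already matched to the true communities (no permutation step), and the support is obtained by simple thresholding, which the slide does not specify. Function names and the example values are hypothetical.

```python
import numpy as np

def max_row_l1_error(Pi_hat, Pi):
    """(1/n) * max over communities i of the l1 distance between rows i."""
    n = Pi.shape[1]
    return np.abs(Pi_hat - Pi).sum(axis=1).max() / n

def estimate_support(Pi_hat, threshold):
    """S[i, j] = 1 when the estimated membership of node j in community i
    reaches the threshold (assumed rule, for illustration only)."""
    return (Pi_hat >= threshold).astype(int)

# Hypothetical example: k = 2 communities, n = 3 nodes.
Pi = np.array([[1.0, 0.0, 0.5],
               [0.0, 1.0, 0.5]])
Pi_hat = np.array([[0.9, 0.1, 0.5],
                   [0.1, 0.9, 0.5]])

err = max_row_l1_error(Pi_hat, Pi)
S = estimate_support(Pi_hat, threshold=0.3)
```

In this example each row differs by 0.2 in ℓ1 norm, so the normalized error is 0.2/3, and thresholding at 0.3 marks nodes 0 and 2 for community 0 and nodes 1 and 2 for community 1.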
![Page 48: Learning Mixed Membership Models using Spectral Methodscseweb.ucsd.edu/~slovett/workshops/big-graphs-2016/talks/anandkumar.pdfDatavs. Information Big data: Missing observations, gross](https://reader034.fdocuments.net/reader034/viewer/2022050419/5f8f1b772810091f917058ea/html5/thumbnails/48.jpg)
Experimental Results
FB: n ∼ 20,000
Yelp: n ∼ 40,000
DBLP: n ∼ 1 million
Tensor vs. Variational
(Figure: running time in seconds, log scale from 10^0 to 10^5, for the Variational and Tensor methods on the datasets FB, YP, DBLP-sub, and DBLP.)
![Page 50: Learning Mixed Membership Models using Spectral Methodscseweb.ucsd.edu/~slovett/workshops/big-graphs-2016/talks/anandkumar.pdfDatavs. Information Big data: Missing observations, gross](https://reader034.fdocuments.net/reader034/viewer/2022050419/5f8f1b772810091f917058ea/html5/thumbnails/50.jpg)
Experimental Results on Yelp
Lowest error business categories & largest weight businesses
| Rank | Category | Business |
|------|----------|----------|
| 1 | Latin American | Salvadoreno Restaurant |
| 2 | Gluten Free | P.F. Chang’s China Bistro |
| 3 | Hobby Shops | Make Meaning |
| 4 | Mass Media | KJZZ |
| 5 | Yoga | Sutra Midtown |
Top-5 bridging businesses
| Business | Categories |
|----------|------------|
| Four Peaks Brewing | Restaurants, Bars, American, Nightlife, Food, Pubs, Tempe |
| Pizzeria Bianco | Restaurants, Pizza, Phoenix |
| FEZ | Restaurants, Bars, American, Nightlife, Mediterranean, Lounges, Phoenix |
| Matt’s Big Breakfast | Restaurants, Phoenix, Breakfast & Brunch |
| Cornish Pasty Co | Restaurants, Bars, Nightlife, Pubs, Tempe |
![Page 52: Learning Mixed Membership Models using Spectral Methodscseweb.ucsd.edu/~slovett/workshops/big-graphs-2016/talks/anandkumar.pdfDatavs. Information Big data: Missing observations, gross](https://reader034.fdocuments.net/reader034/viewer/2022050419/5f8f1b772810091f917058ea/html5/thumbnails/52.jpg)
Learning Communities using Spectral Methods
Guaranteed to recover correct model
Efficient sample and computational complexities
Better performance compared to EM, variational Bayes, etc.
Other Models
Expand to other hypergraph community models.
Our recent work considers social tagging networks: tripartite 3-hypergraph models.