Scalable MCMC in degree corrected stochastic block model
Soumyasundar Pal
Dept. of Electrical and Computer Engineering, McGill University,
Montreal, Quebec, Canada
February 11, 2019
Introduction
Community detection from networks
Academic collaboration, protein interaction, social networks
Community: dense internal and sparse external connections
Earlier approaches: hierarchical clustering, modularity optimization, spectral clustering, clique percolation
Challenges: handling sparsity, scalability
Introduction
Heuristic objective function, greedy optimization
Plethora of techniques [1], [2]
Numerous quality metrics [3]
Principled approach: statistical modelling of community structures
[1] S. Fortunato, “Community detection in graphs,” Phys. Rep., vol. 486, pp. 75–174, Feb. 2010.
[2] S. Parthasarathy, Y. Ruan, and V. Satuluri, “Community discovery in social networks: Applications, methods and emerging trends,” in Social Network Data Analytics, pp. 79–113. Springer US, Boston, MA, Mar. 2011.
[3] T. Chakraborty, A. Dalmia, A. Mukherjee, and N. Ganguly, “Metrics for community analysis: a survey,” ACM Comput. Surv., vol. 50, no. 4, pp. 1–37, Aug. 2017.
Stochastic Block Model (SBM)
Connectivity depends on community membership [4].
Stochastic equivalence of nodes within the same community
$N$: number of nodes, $K$: number of communities
$c_i$: membership of node $i$
$c_i \in \{1, 2, \dots, K\}$, $C = \{c_i\}_{i=1}^{N}$
$y_{ab} \in \{0, 1\}$: $(a, b)$'th entry in the adjacency matrix
$\beta_{k\ell} \in (0, 1)$: link probability between two nodes in communities $k$ and $\ell$
$y_{ab} \mid (c_a = k, c_b = \ell) \sim \mathrm{Bernoulli}(\beta_{k\ell})$
[4] E. Abbe, “Community detection and stochastic block models,” Found. and Trends Commun. and Inform. Theory, vol. 14, no. 1-2, pp. 1–162, Jun. 2018.
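The generative model on this slide can be sketched in a few lines of NumPy. This is only an illustration under assumptions not stated on the slide: the graph is taken to be undirected without self-loops, community labels are zero-based, and the function name `sample_sbm` and the numeric values of `beta` are made up for the example.

```python
import numpy as np

def sample_sbm(c, beta, rng):
    """Sample a symmetric 0/1 adjacency matrix without self-loops (assumed)."""
    c = np.asarray(c)
    # p[a, b] = beta[c_a, c_b]: link probability for every node pair
    p = beta[c[:, None], c[None, :]]
    # Draw each unordered pair once (strict upper triangle), then mirror it
    upper = np.triu(rng.random(p.shape) < p, k=1)
    return (upper | upper.T).astype(int)

rng = np.random.default_rng(0)
c = np.repeat([0, 1], 50)            # two communities of 50 nodes each
beta = np.array([[0.30, 0.02],
                 [0.02, 0.30]])      # illustrative: dense within, sparse between
y = sample_sbm(c, beta, rng)

# Empirical edge densities within and between communities
same = c[:, None] == c[None, :]
off_diag = ~np.eye(len(c), dtype=bool)
within_density = y[same & off_diag].mean()
between_density = y[~same].mean()
```

With these values the within-community density should be close to 0.30 and the between-community density close to 0.02, which matches the "dense internal, sparse external" notion of a community from the introduction.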
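Stochastic Block Model (SBM) [5]
$c_i \in \{1, 2, \dots, K\}$, $C = \{c_i\}_{i=1}^{N}$
$y_{ab} \in \{0, 1, 2, \dots\}$: $(a, b)$'th entry in the adjacency matrix
$\omega_{k\ell} \in \mathbb{R}_{+}$: average number of links between two nodes in communities $k$ and $\ell$
$y_{ab} \mid (c_a = k, c_b = \ell) \sim \mathrm{Poisson}(\omega_{k\ell})$
Log-likelihood (up to an additive constant that does not depend on $\omega$):
$L(y \mid C, \omega) = \sum_{k,\ell} \left( m_{k\ell} \log \omega_{k\ell} - n_k n_\ell \, \omega_{k\ell} \right)$,
where $m_{k\ell} = \sum_{a,b} y_{ab} \, 1_{\{c_a = k,\, c_b = \ell\}}$ and $n_k = \sum_{a} 1_{\{c_a = k\}}$.
[5] B. Karrer and M. E. J. Newman, “Stochastic blockmodels and community structure in networks,” Phys. Rev. E, vol. 83, no. 1, pp. 016107, Jan. 2011.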
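As a rough check of the likelihood above, the block counts $m_{k\ell}$ and sizes $n_k$ can be computed with a one-hot membership matrix. The helper names `block_counts` and `poisson_sbm_loglik`, the zero-based labels, and the toy graph are assumptions made for this sketch.

```python
import numpy as np

def block_counts(y, c, K):
    """Return m (K x K link counts between communities) and n (community sizes)."""
    z = np.eye(K)[np.asarray(c)]     # one-hot memberships, shape N x K
    m = z.T @ y @ z                  # m[k, l] = sum_{a,b} y_ab 1{c_a=k, c_b=l}
    n = z.sum(axis=0)                # n[k]    = sum_a 1{c_a=k}
    return m, n

def poisson_sbm_loglik(y, c, omega):
    """Poisson SBM log-likelihood, dropping the term that is constant in omega."""
    m, n = block_counts(y, c, omega.shape[0])
    return np.sum(m * np.log(omega) - np.outer(n, n) * omega)

# Toy multigraph: one link inside community 0, a double link inside community 1
y = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 2],
              [0, 0, 2, 0]], dtype=float)
c = [0, 0, 1, 1]
omega = np.array([[0.5, 0.1],
                  [0.1, 1.0]])
m, n = block_counts(y, c, 2)
loglik = poisson_sbm_loglik(y, c, omega)
```

Here `m` is `[[2, 0], [0, 4]]` because the sum runs over ordered pairs $(a, b)$, so every undirected link is counted twice.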
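Stochastic Block Model (SBM)
ML estimate:
$\hat{\omega}_{k\ell} = \dfrac{m_{k\ell}}{n_k n_\ell}$
Profile likelihood (after dropping additive constants and rescaling by $1/m$):
$L(y \mid C) = \max_{\omega} L(y \mid C, \omega) = \sum_{k,\ell} \left( \dfrac{m_{k\ell}}{m} \right) \log \dfrac{m_{k\ell}/m}{n_k n_\ell / N^2}$,
where $m = \sum_{k,\ell} m_{k\ell}$.
Greedy algorithm
pick a random node, place it in the community that maximally increases the objective.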
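The objective and the greedy move can be sketched as follows. This is a simplified variant rather than the exact procedure from the talk: it visits every node once in random order, recomputes the objective from scratch for each candidate community (simple but expensive), and treats $0 \log 0$ as 0. The function names and the toy graph are hypothetical.

```python
import numpy as np

def sbm_profile_objective(y, c, K):
    """sum_{k,l} (m_kl/m) log[(m_kl/m) / (n_k n_l / N^2)], with 0 log 0 := 0."""
    z = np.eye(K)[np.asarray(c)]
    m_kl = z.T @ y @ z
    n_k = z.sum(axis=0)
    p = m_kl / m_kl.sum()
    q = np.outer(n_k, n_k) / len(c) ** 2
    mask = p > 0                     # p > 0 implies q > 0, so the ratio is safe
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))

def greedy_sweep(y, c, K, rng):
    """Visit nodes in random order; move each to its best-scoring community."""
    c = np.array(c)
    for a in rng.permutation(len(c)):
        scores = []
        for k in range(K):
            c[a] = k
            scores.append(sbm_profile_objective(y, c, K))
        c[a] = int(np.argmax(scores))
    return c

# Toy graph: two 4-cliques joined by a single link
y = np.zeros((8, 8))
y[:4, :4] = 1.0
y[4:, 4:] = 1.0
np.fill_diagonal(y, 0.0)
y[3, 4] = y[4, 3] = 1.0

rng = np.random.default_rng(0)
c0 = rng.integers(0, 2, size=8)      # random initial assignment
before = sbm_profile_objective(y, c0, 2)
c1 = greedy_sweep(y, c0, 2, rng)
after = sbm_profile_objective(y, c1, 2)
```

Because the current label is always among the candidates, a sweep never decreases the objective. It can still stop in a local optimum, so several random restarts are a common precaution.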
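Degree Corrected Stochastic Block Model (DC-SBM)
degree heterogeneity within community
$\theta_a \in (0, 1)$: degree correction parameters
$y_{ab} \mid (c_a = k, c_b = \ell) \sim \mathrm{Poisson}(\theta_a \theta_b \, \omega_{k\ell})$, with $\sum_{a} \theta_a 1_{\{c_a = k\}} = 1$ for each community $k$.
$L(y \mid C) = \max_{\omega, \theta} L(y \mid C, \omega, \theta) = \sum_{k,\ell} \left( \dfrac{m_{k\ell}}{m} \right) \log \dfrac{m_{k\ell}/m}{\kappa_k \kappa_\ell / m^2}$,
where $\kappa_k = \sum_{a} d_a 1_{\{c_a = k\}}$ and $d_a = \sum_{b} y_{ab}$ is the degree of node $a$.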
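A minimal sketch of the degree-corrected objective, again with hypothetical names and zero-based labels. Since $\kappa_k = \sum_\ell m_{k\ell}$, the expression compares the normalized block counts with the product of their marginals, so it has the form of a mutual information and is never negative.

```python
import numpy as np

def dcsbm_profile_objective(y, c, K):
    """sum_{k,l} (m_kl/m) log[(m_kl/m) / (kappa_k kappa_l / m^2)], 0 log 0 := 0."""
    z = np.eye(K)[np.asarray(c)]
    m_kl = z.T @ y @ z
    d = y.sum(axis=1)                # node degrees d_a
    kappa = z.T @ d                  # kappa[k] = total degree in community k
    m = m_kl.sum()
    p = m_kl / m
    q = np.outer(kappa, kappa) / m ** 2
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))

# Toy graph: two disconnected triangles
y = np.zeros((6, 6))
y[:3, :3] = 1.0
y[3:, 3:] = 1.0
np.fill_diagonal(y, 0.0)

split = dcsbm_profile_objective(y, [0, 0, 0, 1, 1, 1], 2)   # one label per triangle
merged = dcsbm_profile_objective(y, [0, 0, 0, 0, 0, 0], 2)  # everything in one community
```

For this toy graph the correct split scores $\log 2$ and the single-community assignment scores 0.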
Mixed Membership Stochastic Blockmodel (MMSB)
Overlapping communities [6]
Community membership probability: $\pi_a \in (0, 1)^K$, $\sum_{k=1}^{K} \pi_{ak} = 1$
Prior distributions: $\beta_{k\ell} \sim \mathrm{Beta}(\eta)$, $\pi_a \sim \mathrm{Dir}(\alpha)$
Generative Model
for any two nodes $a$ and $b$:
  sample $Z_{ab} \sim \pi_a$ and $Z_{ba} \sim \pi_b$
  sample $y_{ab} \mid (Z_{ab} = k, Z_{ba} = \ell) \sim \mathrm{Bernoulli}(\beta_{k\ell})$
Posterior inference of $p(\beta, \pi \mid y)$
assortative MMSB (a-MMSB): $\beta_{k\ell} = \delta$ for $k \neq \ell$
[6] E. M. Airoldi, D. M. Blei, S. E. Fienberg, and E. P. Xing, “Mixed membership stochastic blockmodels,” in J. Mach. Learn. Res., vol. 9, pp. 1981–2014, Jun. 2008.
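The generative process can be sketched as below. Several details are assumptions for illustration only: the graph is treated as undirected, $\eta$ is read as a pair of Beta shape parameters `(eta0, eta1)`, $\alpha$ is a single symmetric Dirichlet concentration, and $\beta$ is symmetrized. The function name `sample_mmsb` is hypothetical.

```python
import numpy as np

def sample_mmsb(N, K, alpha, eta, rng):
    """Sample (y, pi, beta) from an MMSB-style generative process."""
    pi = rng.dirichlet(np.full(K, alpha), size=N)   # pi[a] lies on the K-simplex
    beta = rng.beta(eta[0], eta[1], size=(K, K))
    beta = np.triu(beta) + np.triu(beta, 1).T       # make beta symmetric (assumed)
    y = np.zeros((N, N), dtype=int)
    for a in range(N):
        for b in range(a + 1, N):
            z_ab = rng.choice(K, p=pi[a])           # role of a when interacting with b
            z_ba = rng.choice(K, p=pi[b])           # role of b when interacting with a
            y[a, b] = y[b, a] = rng.random() < beta[z_ab, z_ba]
    return y, pi, beta

rng = np.random.default_rng(0)
y, pi, beta = sample_mmsb(N=30, K=3, alpha=0.5, eta=(1.0, 1.0), rng=rng)
```

Unlike the SBM, a node draws a fresh community indicator for every pair it takes part in, which is what allows memberships to overlap.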
Inference in a-MMSB
Variational inference [7]
– Mean field approximation
– Stochastic gradient optimization
– Outperforms traditional techniques
Markov chain Monte Carlo [8]
– Stochastic gradient Riemannian Langevin dynamics (SGRLD)
– Faster convergence
– Better approximation of posterior
[7] P. K. Gopalan, S. Gerrish, M. Freedman, D. M. Blei, and D. M. Mimno, “Scalable inference of overlapping communities,” in Proc. Adv. Neural Inf. Proc. Systems, Dec. 2012.
[8] W. Li, S. Ahn, and M. Welling, “Scalable MCMC for mixed membership stochastic blockmodels,” in Proc. Artificial Intell. and Statist., May 2016.
Mixed Membership Degree Corrected Blockmodel (MMDCB) [9]
Generalization of a-MMSB
Node specific degree correction parameters: $r_a \in \mathbb{R}$
Community specific parameters: $q_k > 0$
Generative Model
for any two nodes $a$ and $b$:
  sample $Z_{ab} \sim \pi_a$ and $Z_{ba} \sim \pi_b$
  if $Z_{ab} = Z_{ba} = k$:
    sample $y_{ab} \sim \mathrm{Bernoulli}(\mathrm{logit}^{-1}(q_k + r_a + r_b))$
  else:
    sample $y_{ab} \sim \mathrm{Bernoulli}(\mathrm{logit}^{-1}(r_a + r_b))$
Prior distributions: $r_a \sim \mathcal{N}(0, \sigma^2)$, $q_k \sim \mathcal{N}(0, \sigma^2)\, 1_{\{q_k > 0\}}$
Posterior inference of $p(q, r, \pi \mid y)$
[9] S. Pal and M. Coates, “Scalable MCMC in degree corrected stochastic block model,” in Proc. Intl. Conf. Acoust., Speech and Signal Proc., May 2019 (Accepted).
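A small sampler makes the role of $q_k$ and $r_a$ concrete. It is a sketch under assumptions: the graph is undirected, $\pi_a$ is drawn from a symmetric Dirichlet as on the MMSB slide (this slide does not restate the prior on $\pi$), and the truncated normal prior on $q_k$ is sampled as the absolute value of a normal draw, which gives the same half-normal distribution. The names `sample_mmdcb` and `sigmoid` and all numeric values are illustrative.

```python
import numpy as np

def sigmoid(x):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-x))

def sample_mmdcb(pi, q, r, rng):
    """Sample an undirected graph from the MMDCB generative process."""
    N, K = pi.shape
    y = np.zeros((N, N), dtype=int)
    for a in range(N):
        for b in range(a + 1, N):
            z_ab = rng.choice(K, p=pi[a])
            z_ba = rng.choice(K, p=pi[b])
            logit = r[a] + r[b]              # degree effect is always present
            if z_ab == z_ba:
                logit += q[z_ab]             # community bonus only when roles agree
            y[a, b] = y[b, a] = rng.random() < sigmoid(logit)
    return y

rng = np.random.default_rng(0)
N, K, sigma = 30, 3, 1.0
pi = rng.dirichlet(np.full(K, 0.5), size=N)   # assumed Dirichlet prior on memberships
r = rng.normal(0.0, sigma, size=N)            # r_a ~ N(0, sigma^2)
q = np.abs(rng.normal(0.0, sigma, size=K))    # half-normal: N(0, sigma^2) restricted to q_k > 0
y = sample_mmdcb(pi, q, r, rng)
```

Because $q_k > 0$, a pair whose indicators agree always has a higher link probability than the same pair with differing indicators, which keeps the model assortative while $r_a$ absorbs degree heterogeneity.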
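Langevin Monte Carlo
Metropolis adjusted Langevin algorithm (MALA)
parameter $\theta$, observed data $X = \{x_1, x_2, \dots, x_N\}$
prior distribution $p(\theta)$, generative model $p(X \mid \theta) = \prod_{i=1}^{N} p(x_i \mid \theta)$
posterior distribution: $p(\theta \mid X) \propto p(\theta) \prod_{i=1}^{N} p(x_i \mid \theta)$
proposal: $q(\theta^* \mid \theta) = \mathcal{N}\left( \theta^* \,\middle|\, \theta + \dfrac{\epsilon}{2} \left( \nabla_\theta \log p(\theta) + \sum_{i=1}^{N} \nabla_\theta \log p(x_i \mid \theta) \right),\ \epsilon I \right)$
acceptance probability: $\min\left( 1,\ \dfrac{p(\theta^* \mid X)\, q(\theta \mid \theta^*)}{p(\theta \mid X)\, q(\theta^* \mid \theta)} \right)$
Preconditioning: Riemannian Langevin Dynamics (RLD)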
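One MALA transition can be sketched as follows. The toy model (a Gaussian mean with a standard normal prior and unit-variance likelihood), the step size, and the function names are assumptions chosen so that the exact posterior is known for comparison. They are not taken from the talk.

```python
import numpy as np

def mala_step(theta, log_post, grad_log_post, eps, rng):
    """One Metropolis adjusted Langevin step; returns (new_theta, accepted)."""
    def log_q(to, frm):
        # Log proposal density up to a constant that cancels in the ratio
        mean = frm + 0.5 * eps * grad_log_post(frm)
        return -np.sum((to - mean) ** 2) / (2.0 * eps)

    mean = theta + 0.5 * eps * grad_log_post(theta)
    prop = mean + np.sqrt(eps) * rng.standard_normal(theta.shape)
    log_alpha = (log_post(prop) + log_q(theta, prop)
                 - log_post(theta) - log_q(prop, theta))
    if np.log(rng.random()) < log_alpha:
        return prop, True
    return theta, False

rng = np.random.default_rng(0)
x = rng.normal(1.0, 1.0, size=100)           # toy data

def log_post(th):
    # Unnormalized log posterior: N(0, 1) prior, N(theta, 1) likelihood
    return -0.5 * np.sum(th ** 2) - 0.5 * np.sum((x - th) ** 2)

def grad_log_post(th):
    return -th + np.sum(x - th)

theta = np.zeros(1)
samples, n_accept = [], 0
for _ in range(3000):
    theta, accepted = mala_step(theta, log_post, grad_log_post, eps=0.01, rng=rng)
    samples.append(theta[0])
    n_accept += accepted
samples = np.array(samples)

post_mean = x.sum() / (len(x) + 1)           # exact posterior mean for this toy model
accept_rate = n_accept / len(samples)
```

Each step evaluates the gradient and the posterior over all $N$ data points twice (at the current and the proposed value), which is the cost that the stochastic gradient variant avoids.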
Langevin Monte Carlo
Stochastic gradient Langevin dynamics (SGLD)
complexity in LD: $O(N)$
stochastic gradient: $O(n)$ complexity
$\nabla_\theta \log p(X \mid \theta) \approx \dfrac{N}{n} \sum_{x_{ti} \in X_t} \nabla_\theta \log p(x_{ti} \mid \theta)$
annealed step-size schedule: $\sum_{t=1}^{\infty} \epsilon_t = \infty$ and $\sum_{t=1}^{\infty} \epsilon_t^2 < \infty$
no acceptance probability computation
asymptotic convergence [10] to the posterior distribution
[10] M. Welling and Y. W. Teh, “Bayesian learning via stochastic gradient Langevin dynamics,” in Proc. Intl. Conf. Machine Learning, Bellevue, WA, USA, Jun. 2011.
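The update can be sketched on a toy Gaussian-mean problem. The polynomial schedule $\epsilon_t = a (b + t)^{-\gamma}$ with $\gamma \in (0.5, 1]$ is one choice that satisfies both step-size conditions. The constants `a`, `b`, `gamma`, the mini-batch size, and the toy model are assumptions for this example only.

```python
import numpy as np

def sgld(x, grad_log_prior, grad_log_lik, theta0, n, a, b, gamma, steps, rng):
    """Stochastic gradient Langevin dynamics with step size a * (b + t) ** (-gamma)."""
    N = len(x)
    theta = np.array(theta0, dtype=float)
    trace = []
    for t in range(1, steps + 1):
        eps = a * (b + t) ** (-gamma)
        batch = x[rng.choice(N, size=n, replace=False)]       # mini-batch X_t
        # Rescaled mini-batch estimate of the full-data gradient
        grad = grad_log_prior(theta) + (N / n) * np.sum(grad_log_lik(batch, theta), axis=0)
        # Langevin update with injected N(0, eps) noise and no accept/reject step
        theta = theta + 0.5 * eps * grad + np.sqrt(eps) * rng.standard_normal(theta.shape)
        trace.append(theta.copy())
    return np.array(trace)

rng = np.random.default_rng(0)
x = rng.normal(1.0, 1.0, size=1000)           # toy data: N(theta, 1) likelihood

trace = sgld(
    x,
    grad_log_prior=lambda th: -th,                 # N(0, 1) prior
    grad_log_lik=lambda xb, th: xb[:, None] - th,  # per-point gradients, shape (n, 1)
    theta0=np.zeros(1), n=50, a=1e-3, b=10.0, gamma=0.55, steps=2000, rng=rng,
)
post_mean = x.sum() / (len(x) + 1)            # exact posterior mean for comparison
```

Each iteration touches only `n` of the `N` data points. The price is that, for any finite run, the samples are only approximately distributed according to the posterior.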
![Page 40: Scalable MCMC in degree corrected stochastic block modelnetworks.ece.mcgill.ca/.../Soumya...presentation.pdf · Scalable MCMC in degree corrected stochastic block model Soumyasundar](https://reader036.fdocuments.net/reader036/viewer/2022070917/5fb70b41e628500f700930ef/html5/thumbnails/40.jpg)
Langevin Monte Carlo
Stochastic gradient Langevin dynamics (SGLD)
complexity in LD: O(N)
stochastic gradient: O(n) complexity
∇θ log p(X|θ) ≈ N
n
∑xti∈Xt
∇θ log p(xti |θ)
annealed step-size schedule∞∑t=1
εt =∞ and∞∑t=1
ε2t <∞
no acceptance probability computation
asymptotic convergence10 to the posterior distribution
10M. Welling and Y. W. Teh, “Bayesian learning via stochastic gradient Langevin dynamics,”in Proc. Intl. Conf. Machine Learning, Bellevue, WA, USA, Jun. 2011. 12
![Page 45: Scalable MCMC in degree corrected stochastic block modelnetworks.ece.mcgill.ca/.../Soumya...presentation.pdf · Scalable MCMC in degree corrected stochastic block model Soumyasundar](https://reader036.fdocuments.net/reader036/viewer/2022070917/5fb70b41e628500f700930ef/html5/thumbnails/45.jpg)
Numerical Experiments and Results
| | NETSCIENCE | RELATIVITY | HEP-TH | HEP-PH [11] |
|---|---|---|---|---|
| Nodes | 1589 | 5242 | 9877 | 12008 |
| Edges | 2742 | 14996 | 25998 | 118521 |
held out test set: 10% of the links, same number of non-links
evaluation metrics:
- average perplexity:
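$$\mathrm{perp}_{\mathrm{avg}}\left(Y_{\mathrm{test}} \mid \{\pi^{(i)}, q^{(i)}, r^{(i)}\}_{i=1}^{T}\right) = \exp\left(-\frac{\sum_{y_{ab} \in Y_{\mathrm{test}}} \log\left\{\frac{1}{T}\sum_{i=1}^{T} p\left(y_{ab} \mid \pi^{(i)}, q^{(i)}, r^{(i)}\right)\right\}}{|Y_{\mathrm{test}}|}\right)$$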
- area under ROC (AUC) for link prediction task
[11] J. Leskovec, J. Kleinberg, and C. Faloutsos, “Graph evolution: densification and shrinking diameters,” ACM Trans. Knowl. Discov. Data, vol. 1, no. 1, Mar. 2007.
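The evaluation protocol on this slide (hold out 10% of the links plus an equal number of non-links, then report average perplexity and AUC) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the code used for the reported results: the function names `held_out_split`, `average_perplexity`, and `auc` are hypothetical, the array `probs` of per-sample predictive probabilities is assumed to be produced by the model elsewhere, non-links are drawn by simple rejection sampling, and AUC is computed through its pairwise (Mann-Whitney) definition.

```python
import numpy as np


def held_out_split(edges, num_nodes, frac=0.1, seed=0):
    """Hold out `frac` of the links and an equal number of non-links."""
    rng = np.random.default_rng(seed)
    edges = [tuple(sorted(e)) for e in edges]
    edge_set = set(edges)
    n_test = max(1, int(round(frac * len(edges))))
    test_idx = rng.choice(len(edges), size=n_test, replace=False)
    test_links = [edges[i] for i in test_idx]
    non_links = set()
    # Rejection sampling: adequate for sparse graphs, where most pairs
    # are non-links.
    while len(non_links) < n_test:
        a, b = (int(v) for v in rng.integers(0, num_nodes, size=2))
        pair = (min(a, b), max(a, b))
        if a != b and pair not in edge_set:
            non_links.add(pair)
    return test_links, sorted(non_links)


def average_perplexity(probs):
    """probs[i, j] = p(y_j | i-th posterior sample); shape (T, |Y_test|)."""
    probs = np.asarray(probs, dtype=float)
    avg = probs.mean(axis=0)  # Monte Carlo average over the T samples
    return float(np.exp(-np.mean(np.log(avg))))


def auc(scores, labels):
    """Probability that a random link outscores a random non-link.

    Ties count as 1/2. Uses O(#links * #non-links) memory, which is fine
    for a sketch but not for very large test sets.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)
```

Here `probs[i, j]` is the probability the i-th posterior sample assigns to the observed value of the j-th test pair (link or non-link), so a model that always predicts 0.5 has perplexity 2. The held-out links would also need to be excluded from the training graph, which this sketch leaves to the caller.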
![Page 48: Scalable MCMC in degree corrected stochastic block modelnetworks.ece.mcgill.ca/.../Soumya...presentation.pdf · Scalable MCMC in degree corrected stochastic block model Soumyasundar](https://reader036.fdocuments.net/reader036/viewer/2022070917/5fb70b41e628500f700930ef/html5/thumbnails/48.jpg)
Convergence of Perplexity
[Plot: perplexity (y-axis) versus time in seconds, ×10^4 (x-axis), for a-MMSB and MMDCB]
Figure: Convergence of perplexity for HEP-PH dataset, K = 50
![Page 49: Scalable MCMC in degree corrected stochastic block modelnetworks.ece.mcgill.ca/.../Soumya...presentation.pdf · Scalable MCMC in degree corrected stochastic block model Soumyasundar](https://reader036.fdocuments.net/reader036/viewer/2022070917/5fb70b41e628500f700930ef/html5/thumbnails/49.jpg)
Comparison of Perplexity at convergence
[Four-panel plot: perplexity (y-axis) versus number of communities K = 25, 50, 75, 100 (x-axis), for a-MMSB and MMDCB]
Figure: (a) NETSCIENCE, (b) RELATIVITY, (c) HEP-TH and (d) HEP-PH
![Page 50: Scalable MCMC in degree corrected stochastic block modelnetworks.ece.mcgill.ca/.../Soumya...presentation.pdf · Scalable MCMC in degree corrected stochastic block model Soumyasundar](https://reader036.fdocuments.net/reader036/viewer/2022070917/5fb70b41e628500f700930ef/html5/thumbnails/50.jpg)
Comparison of AUC at convergence
[Four-panel plot: AUC (y-axis) versus number of communities K = 25, 50, 75, 100 (x-axis), for a-MMSB and MMDCB]
Figure: (a) NETSCIENCE, (b) RELATIVITY, (c) HEP-TH and (d) HEP-PH
![Page 51: Scalable MCMC in degree corrected stochastic block modelnetworks.ece.mcgill.ca/.../Soumya...presentation.pdf · Scalable MCMC in degree corrected stochastic block model Soumyasundar](https://reader036.fdocuments.net/reader036/viewer/2022070917/5fb70b41e628500f700930ef/html5/thumbnails/51.jpg)
Conclusion
MMDCB models the observed graph better than a-MMSB.
SG-MCMC algorithms scale well to large networks.
Future work:
- better generative models
- efficient mini-batch sampling, variance reduction
- more advanced SG-MCMC algorithms