Leonid E. Zhukov - Leonid Zhukov · Leonid E. Zhukov School of Data Analysis and Arti cial...
Transcript of Leonid E. Zhukov - Leonid Zhukov · Leonid E. Zhukov School of Data Analysis and Arti cial...
Graph partitioning algorithms
Leonid E. Zhukov
School of Data Analysis and Artificial IntelligenceDepartment of Computer Science
National Research University Higher School of Economics
Structural Analysis and Visualization of Networks
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 1 / 36
Lecture outline
1 Graph paritioningMetricsAlgorithms
2 Spectral optimizationMin cutNormalized cutModularity maximization
3 Multilevel spectral
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 2 / 36
Network Communities
graph density
ρ =m
n(n − 1)/2
community (cluster) density
δint(C ) =mc
nc(nc − 1)/2
external edges density
δext(C ) =mext
nc(n − nc)
community (cluster): δint > ρ, δext < ρ
cluster detectionmax(δint − δext)
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 3 / 36
Community detection
Consider only sparse graphs m� n2
Each community should be connected
Combinatorial optimization problem:- optimization criterion- optimization method
Exact solution NP-hard(bi-partition: n = n1 + n2, n!/(n1!n2!) combinations)
Solved by greedy, approximate algorithms or heuristics
Recursive top-down 2-way partition, multiway partiton
Balanced class partition vs communities
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 4 / 36
Graph cut
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 5 / 36
Optimization criterion: graph cut
Graph G (E ,V ) partition: V = V1 + V2
Graph cut
Q = cut(V1,V2) =∑
i∈V1,j∈V2
eij
Ratio cut:
Q =cut(V1,V2)
||V1||+
cut(V1,V2)
||V2||Normalized cut:
Q =cut(V1,V2)
Vol(V1)+
cut(V1,V2)
Vol(V2)
Quotient cut (conductance):
Q =cut(V1,V2)
min (Vol(V1),Vol(V2))
where: Vol(V1) =∑
i∈V1,j∈V eij =∑
i∈V1ki
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 6 / 36
Optimization methods
Greedy optimization:– Local search [Kernighan and Lin, 1970], [Fidducia and Mettheyses,1982]
Approximate optimization:– Spectral graph partitioning [M. Fiedler, 1972], [Pothen et al 1990],[Shi and Malik, 2000]– Multicommodity flow [Leighton and Rao, 1988]
Heuristics algorithms:– Multilevel graph partitioning (METIS) [G. Karypis, Kumar 1998]
Randomized algorithms:–Randomized min cut [D. Karger, 1993]
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 7 / 36
Graph cuts
Let V = V+ + V− be partitioning of the nodes
Let s = {+1,−1,+1, ...− 1,+1}T - indicator vector
x-1 x-1 x+1 x+1 x+1
s(i) =
{+1 : if v(i) ∈ V+
−1 : if v(i) ∈ V−
Number of edges, connecting V+ and V−
cut(V+,V−) =1
4
∑e(i ,j)
(s(i)− s(j))2 =1
8
∑i ,j
Aij(s(i)− s(j))2 =
=1
4
∑i ,j
(kiδijs(i)2 − Aijs(i)s(j)) =1
4
∑i ,j
(kiδij − Aij)s(i)s(j)
cut(V+,V−) =1
4
∑i ,j
(Dij − Aij)s(i)s(j)
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 8 / 36
Graph cuts
Graph Laplacian: Lij = Dij − Aij , where Dii = diag(ki )
Lij =
k(i) , if i = j−1 , if ∃ e(i , j)
0 , otherwise
Laplacian matrix 5x5:
L =
1 −1−1 2 −1
−1 2 −1−1 2 −1
−1 1
x1 x2 x3 x4 x5
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 9 / 36
Graph cuts
Graph Laplacian: L = D− A
Graph cut:
Q(s) =sTLs
4
Minimal cut:mins
Q(s)
Balanced cut constraint: ∑i
s(i) = 0
Integer minimization problem, exact solution is NP-hard!
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 10 / 36
Spectral method - relaxation
Discrete problem → continuous problem
Discrete problem: find
mins
(1
4sTLs)
under constraints: s(i) = ±1,∑
i s(i) = 0;
Relaxation - continuous problem: find
minx
(1
4xTLx)
under constraints:∑
i x(i)2 = n ,∑
i x(i) = 0
Given x(i), round them up by s(i) = sign(x(i))
Exact constraint satisfies relaxed equation, but not other way around!
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 11 / 36
Spectral method - computations
Constraint optimization problem (Lagrange multipliers):
Q(x) =1
4xTLx− λ(xTx− n), xTe = 0
Eigenvalue problem:Lx = λx, x ⊥ e
Solution:Q(xi) =
n
4λi
First (smallest) eigenvector:
Le = 0, λ = 0, x1 = e
Looking for the second smallest eigenvalue/eigenvector λ2 and x2
Minimization of Rayleigh-Ritz quotient:
minx⊥x1
(xTLx
xTx)
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 12 / 36
Spectral graph theory
λ1 = 0
Number of λi = 0 equal to the number of connected components
0 ≤ λ2 ≤ 2λ2 = 0, disconnected graphλ2 = 1, totally connected
Graph diameter (longest shortest path)
D(G ) >=4
nλ2
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 13 / 36
Spectral graph partitioning algorithm
Algorithm: Spectral graph partitioning - normalized cuts
Input: adjacency matrix A
Output: class indicator vector s
compute D = diag(deg(A));
compute L = D− A;
solve for second smallest eigenvector:min cut: Lx = λx;
normalized cut : Lx = λDx;
set s = sign(x2)
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 14 / 36
Example
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 15 / 36
Example
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 16 / 36
Example
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 17 / 36
Example
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 18 / 36
Example
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 19 / 36
Example
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 20 / 36
Spectral ordering
Compute the eigenvector x2 corresponding to λ2
Sort eigenvector x2 and retain permutation vector p, x2(p(i)) - sorted
Permute columns and rows of A according to “new” ordering, A(p,p)
Since∑
e(i ,j)(x(i)− x(j))2 is minimized ⇒there are few edges connecting distant x(i) and x(j)
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 21 / 36
Example
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 22 / 36
Optimization criterion: graph cut
Graph cut
Q = cut(V1,V2) =∑
i∈V1,j∈V2
eij
Ratio cut:
Q =cut(V1,V2)
||V1||+
cut(V1,V2)
||V2||Normalized cut:
Q =cut(V1,V2)
Vol(V1)+
cut(V1,V2)
Vol(V2)
Quotient cut (conductance):
Q =cut(V1,V2)
min (Vol(V1),Vol(V2))
where: Vol(V1) =∑
i∈V1,j∈V eij =∑
i∈V1ki
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 23 / 36
Cut metrics
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 24 / 36
Optimization criterion: modularity
Modularity:
Q =1
2m
∑ij
(Aij −
kikj2m
)δ(ci , cj)
where nc - number of classes and
δ(ci , cj) =
{1 : if ci = cj0 : if ci 6= cj
- kronecker delta
[Maximization!]
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 25 / 36
Spectral modularity maximization
Direct modularity maximization for bi-partitioning, [Newman, 2006]
Let two classes c1 = V+, c2 = V−, indicator variable s = ±1
δ(ci , cj) =1
2(si sj + 1)
Modularity
Q =1
4m
∑ij
(Aij −
kikj2m
)(si sj + 1) =
1
4m
∑i ,j
Bijsi sj
where
Bij = Aij −kikj2m
M. Newman, 2006
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 26 / 36
Spectral modularity maximization
Qudratic form:
Q(s) =1
4msTBs
Integer optimization - NP, relaxation s → x , x ∈ R
Keep norm ||x ||2 =∑
i x2i = xTx = n
Quadratic optimization
Q(x) =1
4mxTBx− λ(xTx− n)
Eigenvector problemBxi = λixi
Approximate modularity
Q(xi) =n
4mλi
Modularity maximization - largest λ = λmax
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 27 / 36
Spectral modularity maximization
Can’t choose s = xk, can select optimal s
Decompose in the basis: s =∑
j ajxj, where aj = xTj s
Modularity
Q(s) =1
4msTBs =
1
4m
∑i
(xTi s)2λi
maxQ(s) reached when λ1 = λmax and max xT1 s =∑
j x1jsj
Choose s as close as possible to x, i.e. maxs(sTx) = maxsi∑
sixi :si = +1, if xi > 0si = −1, if xi < 0
Choose s ‖ x1, s = sign(x1)
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 28 / 36
Modularity maximization
Algorithm: Spectral moduldarity maximization: two-way parition
Input: adjacency matrix A
Output: class indicator vector s
compute k = deg(A);
compute B = A− 12mkkT ;
solve for maximal eigenvector Bx = λx;
set s = sign(xmax)
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 29 / 36
Example
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 30 / 36
Example
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 31 / 36
Example
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 32 / 36
Example
Graph density: ρ = 0.139
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 33 / 36
Multilevel spectral
recursive partitioning
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 34 / 36
Multilevel spectral
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 35 / 36
References
M. Fiedler. Algebraic connectivity of graphs, Czech. Math. J, 23, pp298-305, 1973A. Pothen, H. Simon and K. Liou. Partitioning sparse matrices witheigenvectors of graphs, SIAM Journal of Matrix Analysis, 11, pp430-452, 1990Bruce Hendrickson and Robert Leland. A Multilevel Algorithm forPartitioning Graphs, Sandia National Laboratories, 1995Jianbo Shi and Jitendra Malik. Normalized Cuts and ImageSegmentation, IEEE Transactions on Pattern Analysis and MachineIntelligence, vol. 22, N 8, pp 888-905, 2000M.E.J. Newman. Modularity and community structure in networks.PNAS Vol. 103, N 23, pp 8577-8582, 2006B. Good, Y.-A. de Montjoye, A. Clauset. Performance of modularitymaximization in practical contexts, Physical Review E 81, 046106,2010
Leonid E. Zhukov (HSE) Lecture 9 10.03.2015 36 / 36