Spectral Algorithms for Graph Partitioning
Luca Trevisan, U.C. Berkeley
Based on work with Tsz Chiu Kwok, Lap Chi Lau, James Lee,Yin Tat Lee, and Shayan Oveis Gharan
spectral graph theory
Use linear algebra to
• understand/characterize graph properties
• design efficient graph algorithms
linear algebra and graphs
Take an undirected graph G = (V, E), and consider the matrix

A[i, j] := weight of edge (i, j)

eigenvalues −→ combinatorial properties
eigenvectors −→ algorithms for combinatorial problems
fundamental thm of spectral graph theory
G = (V, E) undirected, A adjacency matrix, d_v the degree of v, D the diagonal matrix of degrees

L := I − D^{−1/2} A D^{−1/2}
L is symmetric, all its eigenvalues are real and
• 0 = λ1 ≤ λ2 ≤ · · · ≤ λn ≤ 2
• λk = 0 iff G has ≥ k connected components
• λn = 2 iff G has a bipartite connected component
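The theorem can be checked numerically. The sketch below is not from the talk; it assumes numpy and uses two made-up example graphs (two disjoint triangles, and a single edge). It builds L = I − D^{−1/2} A D^{−1/2} and inspects its spectrum.

```python
import numpy as np

def normalized_laplacian(A):
    """L = I - D^{-1/2} A D^{-1/2} for a weighted adjacency matrix A."""
    d = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)  # assumes no isolated vertices
    return np.eye(len(A)) - (A * d_inv_sqrt[:, None]) * d_inv_sqrt[None, :]

# Two disjoint triangles: 2 connected components, each an odd cycle.
tri = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], float)
A = np.block([[tri, np.zeros((3, 3))], [np.zeros((3, 3)), tri]])
lam = np.linalg.eigvalsh(normalized_laplacian(A))  # ascending order
# lambda_1 = lambda_2 = 0 (two components); lambda_n < 2 (no bipartite component)
print(np.round(lam, 6))

# A single edge is connected and bipartite, so lambda_n = 2.
edge = np.array([[0, 1], [1, 0]], float)
lam_edge = np.linalg.eigvalsh(normalized_laplacian(edge))
print(np.round(lam_edge, 6))
```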
[Figure: example graphs with the spectra of their normalized Laplacians]
Spectrum: 0, 2/3, 2/3, 2/3, 2/3, 2/3, 5/3, 5/3, 5/3, 5/3
Spectrum: 0, 0, .69, .69, 1.5, 1.5, 1.8, 1.8
Spectrum: 0, 2/3, 2/3, 2/3, 4/3, 4/3, 4/3, 2
eigenvalues as solutions to optimization problems

If M ∈ R^{n×n} is symmetric, and λ1 ≤ λ2 ≤ ··· ≤ λn are its eigenvalues counted with multiplicities, then

λ_k = min_{A: k-dim subspace of R^n} max_{x∈A} (x^T M x) / (x^T x)

and the eigenvectors of λ1, ..., λ_k are a basis of the optimal A
why eigenvalues relate to connected components

If G = (V, E) is undirected and L is its normalized Laplacian, then, writing y = D^{−1/2} x,

(x^T L x) / (x^T x) = ( Σ_{(u,v)∈E} |y_u − y_v|² ) / ( Σ_v d_v · y_v² )

If λ1 ≤ λ2 ≤ ··· ≤ λn are the eigenvalues of L with multiplicities,

λ_k = min_{A: k-dim subspace of R^V} max_{x∈A} R(x)

where

R(x) := ( Σ_{(u,v)∈E} |x_u − x_v|² ) / ( Σ_v d_v · x_v² )

Note: Σ_{(u,v)∈E} |x_u − x_v|² = 0 iff x is constant on connected components
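A quick numerical check of this characterization (a sketch, assuming numpy; the 4-vertex path is a made-up example): with y = D^{−1/2} x for x an eigenvector of λ2, the quotient R(y) evaluates exactly to λ2.

```python
import numpy as np

def rayleigh(A, y):
    """R(y) = sum over edges |y_u - y_v|^2  /  sum_v d_v y_v^2."""
    d = A.sum(axis=1)
    n = len(A)
    num = 0.0
    for u in range(n):
        for v in range(u + 1, n):  # each undirected edge counted once
            num += A[u, v] * (y[u] - y[v]) ** 2
    return num / (d * y ** 2).sum()

# Path graph on 4 vertices (a small hypothetical example).
A = np.zeros((4, 4))
for u, v in [(0, 1), (1, 2), (2, 3)]:
    A[u, v] = A[v, u] = 1.0
d = A.sum(axis=1)
L = np.eye(4) - A / np.sqrt(np.outer(d, d))  # normalized Laplacian
lam, X = np.linalg.eigh(L)
y = X[:, 1] / np.sqrt(d)   # the change of variable y = D^{-1/2} x
print(lam[1], rayleigh(A, y))  # the two values agree
```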
if G has ≥ k connected components

λ_k = min_{A: k-dim subspace of R^V} max_{x∈A} ( Σ_{(u,v)∈E} |x_u − x_v|² ) / ( Σ_v d_v · x_v² )

Consider the space of all vectors that are constant on each connected component; it has dimension at least k, and so it witnesses λ_k = 0.
if λ_k = 0

λ_k = min_{A: k-dim subspace of R^V} max_{x∈A} ( Σ_{(u,v)∈E} |x_u − x_v|² ) / ( Σ_v d_v · x_v² )

If λ_k = 0, there is a k-dimensional space A of vectors, each constant on connected components.

The graph must have ≥ k connected components.
conductance

vol(S) := Σ_{v∈S} d_v

φ(S) := E(S, V − S) / vol(S)

φ(G) := min_{S⊆V: 0 < vol(S) ≤ vol(V)/2} φ(S)

In regular graphs, this is the same as edge expansion and, up to a factor of 2, the same as non-uniform sparsest cut.
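These definitions can be made concrete with a brute-force sketch (assuming numpy; the "barbell" of two triangles joined by one edge is a made-up example, and the enumeration is exponential, so this is only for tiny graphs):

```python
import numpy as np
from itertools import combinations

def conductance(A, S):
    """phi(S) = E(S, V - S) / vol(S) for adjacency matrix A."""
    S = list(S)
    T = [v for v in range(len(A)) if v not in S]
    return A[np.ix_(S, T)].sum() / A[S].sum()

def phi(A):
    """phi(G): brute force over all S with 0 < vol(S) <= vol(V)/2."""
    n = len(A)
    total = A.sum()  # vol(V)
    best = float('inf')
    for r in range(1, n):
        for S in combinations(range(n), r):
            if 0 < A[list(S)].sum() <= total / 2:
                best = min(best, conductance(A, S))
    return best

# Barbell: two triangles joined by one edge; the bottleneck is that edge.
A = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[u, v] = A[v, u] = 1.0
print(phi(A))  # 1/7: one cut edge, vol({0,1,2}) = 2 + 2 + 3 = 7
```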
conductance

[Figure: a disconnected graph with spectrum 0, 0, .69, .69, 1.5, 1.5, 1.8, 1.8; φ(G) = φ({0, 1, 2}) = 0]

[Figure: a connected graph with spectrum 0, .055, .055, .211, ...; φ(G) = 2/18]

[Figure: a graph with spectrum 0, 2/3, 2/3, 2/3, 2/3, 2/3, 5/3, 5/3, 5/3, 5/3; φ(G) = 5/15]
finding S of small conductance

• G is a social network; S is a set of users more likely to be friends with each other than with anyone else
• G is a similarity graph on the items of a data set; S is a set of items more similar to each other than to other items; [Shi, Malik]: image segmentation
• G is an input of an NP-hard optimization problem; recurse on S and V − S, and use dynamic programming to combine in time 2^{O(|E(S, V−S)|)}
conductance versus λ2

Recall: λ2 = 0 ⇔ G disconnected ⇔ φ(G) = 0

Cheeger inequality (Alon, Milman, Dodziuk, mid-1980s):

λ2 / 2 ≤ φ(G) ≤ √(2 λ2)
Cheeger inequality

λ2 / 2 ≤ φ(G) ≤ √(2 λ2)

Constructive proof of φ(G) ≤ √(2 λ2):

• given an eigenvector x of λ2
• sort the vertices v according to x_v
• some set of vertices S occurring as an initial segment or final segment of the sorted order has

φ(S) ≤ √(2 λ2) ≤ 2 √(φ(G))

Fiedler’s sweep algorithm can be implemented in O(|V| + |E|) time
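The sweep algorithm above can be written out directly (a sketch, assuming numpy; this dense-matrix version runs in polynomial but not O(|V| + |E|) time, and the barbell graph is a made-up example):

```python
import numpy as np

def sweep(A):
    """Fiedler's sweep: sort vertices by the lambda_2 eigenvector (in the
    y = D^{-1/2} x coordinates) and return the best prefix cut by conductance."""
    n = len(A)
    d = A.sum(axis=1)
    L = np.eye(n) - A / np.sqrt(np.outer(d, d))  # normalized Laplacian
    lam, X = np.linalg.eigh(L)
    y = X[:, 1] / np.sqrt(d)
    order = np.argsort(y)
    total = d.sum()
    best_phi, best_S = float('inf'), None
    cut, vol = 0.0, 0.0
    inside = np.zeros(n, bool)
    for v in order[:-1]:  # every proper prefix S of the sorted order
        # adding v: gains edges to the outside, loses edges into S
        cut += d[v] - 2 * A[v, inside].sum()
        vol += d[v]
        inside[v] = True
        phi_S = cut / min(vol, total - vol)  # smaller side, as in the proof
        if phi_S < best_phi:
            best_phi, best_S = phi_S, set(np.flatnonzero(inside))
    return best_phi, best_S

# Barbell: two triangles joined by one edge.
A = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[u, v] = A[v, u] = 1.0
phi_S, S = sweep(A)
print(phi_S, S)  # 1/7 and one of the two triangles
```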
applications

λ2 / 2 ≤ φ(G) ≤ √(2 λ2)

• rigorous analysis of the sweep algorithm (practical performance is usually better than the worst-case analysis)
• to certify φ ≥ Ω(1), it is enough to certify λ2 ≥ Ω(1)
• the mixing time of the random walk is O((1/λ2) log |V|), and so also O((1/φ²(G)) log |V|)
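The mixing-time bound can be illustrated numerically (a sketch on a made-up example, the 16-vertex cycle, assuming numpy): after O((1/λ2) log |V|) steps of the lazy walk, the distribution is close to stationary.

```python
import numpy as np

def lazy_walk_matrix(A):
    """W = (I + D^{-1} A) / 2: the lazy random walk on G."""
    d = A.sum(axis=1)
    return 0.5 * (np.eye(len(A)) + A / d[:, None])

# Cycle on n vertices (regular, so the stationary distribution is uniform).
n = 16
A = np.zeros((n, n))
for v in range(n):
    A[v, (v + 1) % n] = A[v, (v - 1) % n] = 1.0
W = lazy_walk_matrix(A)

d = A.sum(axis=1)
L = np.eye(n) - A / np.sqrt(np.outer(d, d))
lam2 = np.linalg.eigvalsh(L)[1]

t = int(np.ceil((1 / lam2) * 4 * np.log(n)))  # O((1/lambda_2) log |V|) steps
p = np.zeros(n)
p[0] = 1.0                                    # start at a single vertex
p_t = p @ np.linalg.matrix_power(W, t)
print(np.abs(p_t - 1 / n).sum())              # small L1 distance to uniform
```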
goal

A similar theory for the other eigenvectors, with algorithmic applications

One intended application: just as Cheeger’s inequality is a worst-case analysis of Fiedler’s algorithm, develop a worst-case analysis of spectral clustering
spectral clustering

The following algorithm works well in practice. (But to solve what problem?)

1 Compute eigenvectors x^(1), ..., x^(k) of the k smallest eigenvalues of L
2 Define the mapping F: V → R^k, F(v) := (x^(1)_v, x^(2)_v, ..., x^(k)_v)
3 Apply k-means to the points that the vertices are mapped to
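The three steps can be sketched as follows. This is an illustrative implementation, not the talk's: it assumes numpy and substitutes a deterministic toy Lloyd's k-means for a library implementation.

```python
import numpy as np

def spectral_clustering(A, k, iters=100):
    """Embed with the k bottom eigenvectors of the normalized Laplacian,
    then cluster with a small deterministic Lloyd's k-means."""
    n = len(A)
    d = A.sum(axis=1)
    L = np.eye(n) - A / np.sqrt(np.outer(d, d))
    _, X = np.linalg.eigh(L)      # columns sorted by ascending eigenvalue
    F = X[:, :k]                  # F(v) = (x^(1)_v, ..., x^(k)_v)
    # farthest-point initialization (deterministic, for reproducibility)
    idx = [0]
    for _ in range(1, k):
        dists = ((F[:, None, :] - F[idx][None]) ** 2).sum(-1).min(axis=1)
        idx.append(int(np.argmax(dists)))
    centers = F[idx].copy()
    for _ in range(iters):        # Lloyd iterations
        labels = np.argmin(((F[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for c in range(k):
            if (labels == c).any():
                centers[c] = F[labels == c].mean(axis=0)
    return labels

# Two triangles joined by one edge: two natural clusters.
A = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[u, v] = A[v, u] = 1.0
labels = spectral_clustering(A, 2)
print(labels)  # one label on {0,1,2}, the other on {3,4,5}
```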
[Figure: spectral embedding of a random graph, k = 3]
When λk is small
Lee, Oveis-Gharan, T, 2012
From fundamental theorem of spectral graph theory:
• λk = 0⇔ G has ≥ k connected components
We prove:
• λk small ⇔ G has ≥ k disjoint sets of small conductance
order-k conductance

Define the order-k conductance

φ_k(G) := min_{S_1, ..., S_k disjoint} max_{i=1,...,k} φ(S_i)

Note:

• φ(G) = φ_2(G)
• φ_k(G) = 0 ⇔ G has ≥ k connected components
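φ_k can be computed by brute force on tiny examples (a sketch, assuming numpy; the barbell graph below is a made-up example, and the enumeration is exponential in |V|):

```python
import numpy as np
from itertools import product

def phi_set(A, S):
    """phi(S) = E(S, V - S) / vol(S)."""
    T = [v for v in range(len(A)) if v not in S]
    return A[np.ix_(S, T)].sum() / A[S].sum()

def phi_k(A, k):
    """Order-k conductance: minimize max_i phi(S_i) over k disjoint
    nonempty sets, by exhaustive labeling. Tiny graphs only."""
    n = len(A)
    best = float('inf')
    for labels in product(range(k + 1), repeat=n):  # label k = "in no set"
        sets = [[v for v in range(n) if labels[v] == i] for i in range(k)]
        if all(sets):
            best = min(best, max(phi_set(A, S) for S in sets))
    return best

# Two triangles joined by one edge: phi_2 is attained by the two triangles.
A = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[u, v] = A[v, u] = 1.0
print(phi_k(A, 2))  # = phi(G) = 1/7
print(phi_k(A, 3))  # strictly larger: a third disjoint set must cut more
```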
[Figure: a disconnected graph with spectrum 0, 0, .69, .69, 1.5, 1.5, 1.8, 1.8; φ3(G) = max{0, 2/6, 2/4} = 1/2]

[Figure: a connected graph with spectrum 0, .055, .055, .211, ...; φ3(G) = max{2/12, 2/12, 2/14} = 1/6]
λk

λ_k = 0 ⇔ φ_k = 0

λ_k / 2 ≤ φ_k(G) ≤ O(k²) · √λ_k   (Lee, Oveis Gharan, T, 2012)

The upper bound is constructive: in nearly-linear time we can find k disjoint sets, each of expansion at most O(k²) √λ_k.
λk

λ_k / 2 ≤ φ_k(G) ≤ O(k²) · √λ_k

φ_k(G) ≤ O(√((log k) · λ_{1.1 k}))   (Lee, Oveis Gharan, T, 2012)

φ_k(G) ≤ O(√((log k) · λ_{O(k)}))   (Louis, Raghavendra, Tetali, Vempala, 2012)

The factor O(√(log k)) is tight
λk

φ_k(G) ≤ O(√((log k) · λ_{O(k)}))

(Louis, Raghavendra, Tetali, Vempala, 2012) (Lee, Oveis Gharan, T, 2012)

Miclo 2013:

• analog for infinite-dimensional Markov processes
• application: every hyper-bounded operator has a spectral gap
• solves a 40+ year old open problem
spectral clustering

Recall the spectral clustering algorithm:

1 Compute eigenvectors x^(1), ..., x^(k) of the k smallest eigenvalues of L
2 Define the mapping F: V → R^k, F(v) := (x^(1)_v, x^(2)_v, ..., x^(k)_v)
3 Apply k-means to the points that the vertices are mapped to

The LOT and LRTV algorithms find k low-conductance sets if they exist.

Their first step is the spectral embedding; then geometric partitioning, but not k-means.
When λk is large
Cheeger inequality

λ2 / 2 ≤ φ(G) ≤ √(2 λ2)

φ(G) ≤ O(k) · λ2 / √λ_k   (Kwok, Lau, Lee, Oveis Gharan, T, 2013)

and the upper bound holds for the sweep algorithm.

The nearly-linear time sweep algorithm finds S such that, for every k,

φ(S) ≤ φ(G) · O(k / √λ_k)
Liu 2014: analog for Riemannian manifolds, with applications to bounding eigenvalue gaps
proof structure

1 We find a 2k-valued vector y such that

||x − y||² ≤ O(λ2 / λ_k)

where x is the eigenvector of λ2

2 For every k-valued vector y, applying the sweep algorithm to x finds a set S such that

φ(S) ≤ O(k) · (λ2 + √λ2 · ||x − y||)

So the sweep algorithm finds S with

φ(S) ≤ O(k) · λ2 · (1 / √λ_k)
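Spelling out the arithmetic that combines the two steps (using only λ_k ≤ 2):

```latex
% Step 1 gives \|x - y\| \le O(\sqrt{\lambda_2 / \lambda_k}), so in Step 2:
\phi(S) \;\le\; O(k)\Big(\lambda_2 + \sqrt{\lambda_2}\,\|x - y\|\Big)
        \;\le\; O(k)\Big(\lambda_2 + O\big(\tfrac{\lambda_2}{\sqrt{\lambda_k}}\big)\Big)
        \;=\;   O(k)\cdot\frac{\lambda_2}{\sqrt{\lambda_k}},
% where the last step uses \lambda_k \le 2, so that
% \lambda_2 \le \sqrt{2}\cdot\lambda_2 / \sqrt{\lambda_k}.
```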
bounded threshold rank

[Arora, Barak, Steurer], [Barak, Raghavendra, Steurer], [Guruswami, Sinop], ...

If λ_k is large, then a cut of approximately minimal conductance can be found in time exponential in k, using spectral methods [ABS] or convex relaxations [BRS, GS].

[Kwok, Lau, Lee, Oveis-Gharan, T]: the nearly-linear time sweep algorithm finds a good approximation if λ_k is large.
bounded threshold rank

If λ_k is large:

[Kwok, Lau, Lee, Oveis-Gharan, T]: the nearly-linear time sweep algorithm finds a good approximation

[Oveis-Gharan, T]: the Goemans-Linial relaxation (solvable in polynomial time independent of k) can be rounded better than in the Arora-Rao-Vazirani analysis

[Oveis-Gharan, T]: the graph satisfies a weak regularity lemma in the sense of Frieze and Kannan
bounded threshold rank

If λ_k is large, the graph is an easy instance for several algorithms. Two types of results:

• There is a near-optimal combinatorial solution with a simple structure; find it by brute force
• The optimum of a relaxation has a special structure; an improved rounding algorithm uses that special structure
When λk+1 is large and λk is small
eigenvalue gap

• if λ_k = 0 and λ_{k+1} > 0, then G has exactly k connected components
• [Tanaka 2012] If λ_{k+1} > 5k √λ_k, then the vertices of G can be partitioned into k sets of small conductance, each inducing a subgraph of large conductance. [Non-algorithmic proof]
• [Oveis-Gharan, T 2013] It is sufficient that φ_{k+1}(G) > (1 + ε) φ_k(G); if λ_{k+1} > O(k²) √λ_k, then the partition can be found algorithmically
In Summary

λ_k is small: there are k disjoint non-expanding sets; they can be found efficiently

λ_k is large: easy instance for various algorithms

λ_k is small and λ_{k+1} is large: the vertices can be partitioned into k sets, each with high internal expansion and small external expansion
lucatrevisan.wordpress.com/tag/expanders/