Sparse Approximations


Page 1: Sparse Approximations

Sparse Approximations

Nick Harvey, University of British Columbia

Page 2: Sparse Approximations

Approximating Dense Objects by Sparse Objects

Floor joists

Wood Joists / Engineered Joists

Page 3: Sparse Approximations

Approximating Dense Objects by Sparse Objects

Bridges

Masonry Arch / Truss Arch

Page 4: Sparse Approximations

Approximating Dense Objects by Sparse Objects

Bones

Human Femur / Robin Bone

Page 5: Sparse Approximations

Mathematically

• Can an object with many pieces be approximately represented by fewer pieces?

• Independent random sampling usually does well.

• Theme of this talk: When can we beat random sampling?

Dense Graph / Sparse Graph

[Figure: a dense graph and its dense Laplacian matrix, shown beside a sparse weighted graph and its sparse Laplacian matrix.]

Dense Matrix / Sparse Matrix

Page 6: Sparse Approximations

Talk Outline

• Vignette #1: Discrepancy theory

• Vignette #2: Singular values and eigenvalues

• Vignette #3: Graphs

• Theorem on “Spectrally Thin Trees”

Page 7: Sparse Approximations

Discrepancy

• Given vectors v_1,…,v_n ∈ R^d with ‖v_i‖_p bounded. Want y ∈ {−1,1}^n with ‖∑_i y_i v_i‖_q small.

• Eg 1: If ‖v_i‖_∞ ≤ 1 then uniformly random signs give E‖∑_i y_i v_i‖_∞ ≤ O(√(n log d)).

• Eg 2: If ‖v_i‖_∞ ≤ 1 then ∃y s.t. ‖∑_i y_i v_i‖_∞ ≤ 6√n (for d = n).

Non-algorithmic:
• Spencer ’85: partial coloring + entropy method
• Gluskin ’89: Šidák’s lemma
• Giannopoulos ’97: partial coloring + Šidák

Algorithmic:
• Bansal ’10: Brownian motion + semidefinite program
• Bansal-Spencer ’11: Brownian motion + potential function
• Lovett-Meka ’12: Brownian motion
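Eg 1 is easy to reproduce numerically. A minimal numpy sketch, where the vectors, sizes, and comparison scale are illustrative assumptions rather than anything from the talk:

import numpy as np

rng = np.random.default_rng(0)
n, d = 2000, 200
V = rng.uniform(-1, 1, size=(n, d))   # rows v_i with ||v_i||_inf <= 1

# Random coloring: independent uniform signs y_i.
y = rng.choice([-1, 1], size=n)
disc = np.abs(y @ V).max()            # || sum_i y_i v_i ||_inf

print(f"random-signs discrepancy: {disc:.1f}")
print(f"sqrt(n log d) scale:      {np.sqrt(n * np.log(d)):.1f}")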

Page 8: Sparse Approximations

Discrepancy

• Given vectors v_1,…,v_n ∈ R^d with ‖v_i‖_p bounded. Want y ∈ {−1,1}^n with ‖∑_i y_i v_i‖_q small.

• Eg 1: If ‖v_i‖_∞ ≤ 1 then random signs give E‖∑_i y_i v_i‖_∞ ≤ O(√(n log d)).

• Eg 2: If ‖v_i‖_∞ ≤ 1 then ∃y s.t. ‖∑_i y_i v_i‖_∞ ≤ 6√n (for d = n).

• Eg 3: If ‖v_i‖_∞ ≤ β, ‖v_i‖_1 ≤ δ, and ‖∑_i v_i‖_∞ ≤ 1, then ∃y with ‖∑_i y_i v_i‖_∞ small, with a bound carrying a log(δ/β²) factor. Harvey ’13: using the Lovász Local Lemma. Question: Can the log(δ/β²) factor be improved?

Page 9: Sparse Approximations

Talk Outline

• Vignette #1: Discrepancy theory

• Vignette #2: Singular values and eigenvalues

• Vignette #3: Graphs

• Theorem on “Spectrally Thin Trees”

Page 10: Sparse Approximations

Partitioning sums of rank-1 matrices

• Let v_1,…,v_n ∈ R^d satisfy ∑_i v_i v_i^T = I and ‖v_i‖² ≤ δ. Want y ∈ {−1,1}^n with ‖∑_i y_i v_i v_i^T‖ small.

• Random sampling: E‖∑_i y_i v_i v_i^T‖ ≤ O(√(δ log d)). Rudelson ’96: proofs using majorizing measures, then the non-commutative Khintchine inequality.

• Marcus-Spielman-Srivastava ’13: ∃y ∈ {−1,1}^n with ‖∑_i y_i v_i v_i^T‖ ≤ O(√δ).

Page 11: Sparse Approximations

Partitioning sums of matrices

• Given d×d symmetric matrices M_1,…,M_n with ∑_i M_i = I and ‖M_i‖ ≤ δ. Want y ∈ {−1,1}^n with ‖∑_i y_i M_i‖ small.

• Random sampling: E‖∑_i y_i M_i‖ ≤ O(√(δ log d)). Also follows from non-commutative Khintchine. Ahlswede-Winter ’02: using the matrix moment generating function. Tropp ’12: using the matrix cumulant generating function.
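A quick numerical illustration of the random-sampling bound. The construction below, including the sizes and the isotropicizing trick that forces ∑_i M_i = I exactly, is my own choice for the demo:

import numpy as np

rng = np.random.default_rng(1)
n, d = 4000, 100

# Build rank-one pieces M_i = u_i u_i^T with sum_i M_i = I exactly, by
# transforming random vectors with the inverse square root of their covariance.
V = rng.normal(size=(n, d))
evals, evecs = np.linalg.eigh(V.T @ V)
U = V @ (evecs @ np.diag(evals**-0.5) @ evecs.T)   # now sum_i u_i u_i^T = I
delta = (U**2).sum(axis=1).max()                   # delta = max_i ||M_i||

y = rng.choice([-1.0, 1.0], size=n)
S = (U * y[:, None]).T @ U                         # sum_i y_i u_i u_i^T
norm = np.abs(np.linalg.eigvalsh(S)).max()

print(f"delta = {delta:.4f}")
print(f"||sum y_i M_i|| = {norm:.4f}, sqrt(delta log d) = {np.sqrt(delta * np.log(d)):.4f}")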

Page 12: Sparse Approximations

Partitioning sums of matrices

• Given d×d symmetric matrices M_1,…,M_n with ∑_i M_i = I and ‖M_i‖ ≤ δ. Want y ∈ {−1,1}^n with ‖∑_i y_i M_i‖ small.

• Random sampling: E‖∑_i y_i M_i‖ ≤ O(√(δ log d)).

• Question: ∃y ∈ {−1,1}^n with ‖∑_i y_i M_i‖ ≤ O(√δ)? False!

• Conjecture: Suppose ∑_i M_i = I and ‖M_i‖_{Sch-1} ≤ δ (Schatten 1-norm, i.e., trace norm). ∃y ∈ {−1,1}^n with ‖∑_i y_i M_i‖ ≤ O(√δ)?
  – MSS ’13: the rank-one case is true.
  – Harvey ’13: the diagonal case is true (ignoring a log(·) factor).

Page 13: Sparse Approximations

Partitioning sums of matrices

• Given d×d symmetric matrices M_1,…,M_n with ∑_i M_i = I and ‖M_i‖ ≤ δ. Want y ∈ {−1,1}^n with ‖∑_i y_i M_i‖ small.

• Random sampling: E‖∑_i y_i M_i‖ ≤ O(√(δ log d)).

• Question: Suppose only that ‖M_i‖ ≤ 1. ∃y ∈ {−1,1}^n with ‖∑_i y_i M_i‖ ≤ O(√n)?
  – Spencer/Gluskin: the diagonal case is true.

Page 14: Sparse Approximations

Column-subset selection

• Given vectors v_1,…,v_n ∈ R^d with ‖v_i‖_2 = 1. Let st.rank = n/‖∑_i v_i v_i^T‖. Let k = ⌊ε²·st.rank⌋. Then ∃y ∈ {0,1}^n s.t. ∑_i y_i = k and (1−ε)² ≤ λ_k(∑_i y_i v_i v_i^T).

Spielman-Srivastava ’09: potential function argument.

Youssef ’12: for a suitable k = Θ(ε²)·st.rank, ∃y ∈ {0,1}^n s.t. ∑_i y_i = k, (1−ε)² ≤ λ_k(∑_i y_i v_i v_i^T), and λ_1(∑_i y_i v_i v_i^T) ≤ (1+ε)².
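The theorem itself is proved with a barrier/potential argument. As a cheap numerical stand-in, the naive greedy heuristic below, my own construction with no guarantee of matching the theorem, picks k columns while trying to keep λ_k large:

import numpy as np

rng = np.random.default_rng(2)
n, d, eps = 300, 40, 0.5
V = rng.normal(size=(n, d))
V /= np.linalg.norm(V, axis=1, keepdims=True)   # unit vectors v_i

A = V.T @ V
stable_rank = n / np.linalg.eigvalsh(A)[-1]
k = int(eps**2 * stable_rank)

# Naive greedy: grow S one column at a time, each step adding the column
# that maximizes the j-th largest eigenvalue of the partial sum.
S, M = [], np.zeros((d, d))
for j in range(1, k + 1):
    best_i, best_val = None, -np.inf
    for i in range(n):
        if i in S:
            continue
        lam = np.linalg.eigvalsh(M + np.outer(V[i], V[i]))[-j]
        if lam > best_val:
            best_i, best_val = i, lam
    S.append(best_i)
    M += np.outer(V[best_i], V[best_i])

lam_k = np.linalg.eigvalsh(M)[-k]
print(f"k = {k}, lambda_k = {lam_k:.3f}, target (1-eps)^2 = {(1 - eps)**2:.3f}")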

Page 15: Sparse Approximations

Column-subset selection up to the stable rank

• Given vectors v_1,…,v_n ∈ R^d with ‖v_i‖_2 = 1. Let st.rank = n/‖∑_i v_i v_i^T‖. Let k = ⌊st.rank⌋. For y ∈ {0,1}^n s.t. ∑_i y_i = k, can we control λ_k(∑_i y_i v_i v_i^T) and λ_1(∑_i y_i v_i v_i^T)?
  – λ_k can be very small, say O(1/d).
  – Rudelson’s theorem: can get λ_1 ≤ O(log d) and λ_k > 0.
  – Harvey-Olver ’13: λ_1 ≤ O(log d / log log d) and λ_k > 0.
  – MSS ’13: if the v_i are isotropic (∑_i v_i v_i^T proportional to I), can get λ_1 ≤ O(1) and λ_k > 0.

Page 16: Sparse Approximations

Talk Outline

• Vignette #1: Discrepancy theory

• Vignette #2: Singular values and eigenvalues

• Vignette #3: Graphs

• Theorem on “Spectrally Thin Trees”

Page 17: Sparse Approximations

Graph Laplacian

Graph with weights u: u(ab) = 2, u(ac) = 5, u(bc) = 1, u(cd) = 10.

Laplacian matrix: L_u = D − A =

      a    b    c    d
 a [  7   −2   −5    0 ]
 b [ −2    3   −1    0 ]
 c [ −5   −1   16  −10 ]
 d [  0    0  −10   10 ]

Each diagonal entry is the weighted degree of a node (e.g., 16 is the weighted degree of c); each off-diagonal entry is the negative of an edge weight (e.g., −5 is the negative of u(ac)).

Effective resistance from s to t: the voltage difference when each edge e is a (1/u_e)-ohm resistor and a 1-amp current source is placed between s and t:

  R(s,t) = (e_s − e_t)^T L_u⁺ (e_s − e_t)

Effective conductance: c_st = 1 / (effective resistance from s to t).
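These definitions translate directly into a few lines of numpy. The sketch below rebuilds the 4-node example and computes an effective resistance via the pseudoinverse L_u⁺:

import numpy as np

# The 4-node example: vertices a,b,c,d = 0,1,2,3 with weights
# u(ab)=2, u(ac)=5, u(bc)=1, u(cd)=10.
edges = [(0, 1, 2.0), (0, 2, 5.0), (1, 2, 1.0), (2, 3, 10.0)]
n = 4
L = np.zeros((n, n))
for s, t, w in edges:
    L[s, s] += w; L[t, t] += w
    L[s, t] -= w; L[t, s] -= w

Lpinv = np.linalg.pinv(L)           # Moore-Penrose pseudoinverse L_u^+

def effective_resistance(s, t):
    x = np.zeros(n); x[s], x[t] = 1.0, -1.0
    return x @ Lpinv @ x            # (e_s - e_t)^T L^+ (e_s - e_t)

R_ad = effective_resistance(0, 3)
print(f"R(a,d) = {R_ad:.4f}, effective conductance c_ad = {1 / R_ad:.4f}")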

Page 18: Sparse Approximations

Spectral approximation of graphs

α-spectral sparsifier: L_u ⪯ L_w ⪯ α·L_u

[Figure: a dense graph with edge weights u and its Laplacian L_u, shown beside a sparse graph with edge weights w and its Laplacian L_w.]
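To check an α-spectral approximation numerically, one can look at the generalized eigenvalues of the pair (L_w, L_u): L_u ⪯ L_w ⪯ α·L_u holds iff they all lie in [1, α]. A small sketch; the complete-graph-versus-star example is mine, chosen only to exercise the code:

import numpy as np
import itertools

def laplacian(n, edges):
    """Weighted graph Laplacian from (s, t, w) triples."""
    L = np.zeros((n, n))
    for s, t, w in edges:
        L[s, s] += w; L[t, t] += w
        L[s, t] -= w; L[t, s] -= w
    return L

def approx_eigs(Lu, Lw, tol=1e-9):
    """Nonzero eigenvalues of Lu^{+/2} Lw Lu^{+/2}: Lw is an alpha-spectral
    sparsifier of Lu iff these all lie in [1, alpha]."""
    evals, evecs = np.linalg.eigh(Lu)
    keep = evals > tol
    S = evecs[:, keep] @ np.diag(evals[keep]**-0.5)   # thin Lu^{+/2}
    return np.linalg.eigvalsh(S.T @ Lw @ S)

n = 6
Ku = laplacian(n, [(i, j, 1.0) for i, j in itertools.combinations(range(n), 2)])
star = laplacian(n, [(0, j, n / 2) for j in range(1, n)])  # reweighted star
lams = approx_eigs(Ku, star)
print(f"generalized eigenvalues in [{lams.min():.3f}, {lams.max():.3f}]")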

Page 19: Sparse Approximations

Ramanujan Graphs

• Suppose L_u is the complete graph on n vertices (u_e = 1 ∀e).

• Lubotzky-Phillips-Sarnak ’86: For infinitely many d and n, ∃w ∈ {0,1}^E such that ∑_e w_e = dn/2 (in fact L_w is d-regular) and the suitably scaled L_w is an α-spectral approximation of L_u with α = (d + 2√(d−1)) / (d − 2√(d−1)).

• MSS ’13: Holds for all d ≥ 3, and all n = c·2^k.

• Friedman ’04: If L_w is a random d-regular graph, then ∀ε > 0 its nontrivial adjacency eigenvalues are at most 2√(d−1) + ε in absolute value with high probability.
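Friedman's theorem is easy to observe empirically. This sketch assumes networkx is available (and that n·d is even, as random regular graphs require), comparing the largest nontrivial adjacency eigenvalue of a random d-regular graph with the Ramanujan bound 2√(d−1):

import numpy as np
import networkx as nx

d, n = 4, 1000
G = nx.random_regular_graph(d, n, seed=0)
A = nx.to_numpy_array(G)
evals = np.linalg.eigvalsh(A)                # ascending; evals[-1] = d is trivial
lambda2 = max(abs(evals[0]), evals[-2])      # largest nontrivial |eigenvalue|
print(f"lambda_2 = {lambda2:.3f}, Ramanujan bound 2*sqrt(d-1) = {2 * np.sqrt(d - 1):.3f}")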

Page 20: Sparse Approximations

Arbitrary graphs

• Spielman-Srivastava ’08: For any graph L_u with n = |V|, ∃w ∈ R^E such that |support(w)| = O(n log(n)/ε²) and L_u ⪯ L_w ⪯ (1+ε)·L_u.
  Proof: follows from Rudelson’s theorem.

• MSS ’13: For any graph L_u with n = |V|, ∃w ∈ R^E such that each w_e is an integer multiple of Θ(ε²)·(effective conductance of e), |support(w)| = O(n/ε²), and L_u ⪯ L_w ⪯ (1+ε)·L_u.
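A sketch of effective-resistance sampling in the spirit of Spielman-Srivastava ’08; the oversampling constant and the test graph are illustrative guesses, not the paper's tuned values:

import numpy as np
import itertools

rng = np.random.default_rng(3)
n, eps = 100, 1.0
edges = [(i, j, 1.0) for i, j in itertools.combinations(range(n), 2)]

L = np.zeros((n, n))
for s, t, w in edges:
    L[s, s] += w; L[t, t] += w; L[s, t] -= w; L[t, s] -= w
Lp = np.linalg.pinv(L)

# Keep each edge with probability ~ q * w_e * R_e, reweighting kept edges
# by 1/p so the sparsifier is unbiased in expectation.
q = 2 * np.log(n) / eps**2
Lw = np.zeros((n, n))
kept = 0
for s, t, w in edges:
    v = np.zeros(n); v[s], v[t] = 1, -1
    p = min(1.0, q * w * (v @ Lp @ v))       # v @ Lp @ v = effective resistance
    if rng.random() < p:
        kept += 1
        ww = w / p
        Lw[s, s] += ww; Lw[t, t] += ww; Lw[s, t] -= ww; Lw[t, s] -= ww

# Quality check: generalized eigenvalues of (Lw, L) should cluster around 1.
evals, evecs = np.linalg.eigh(L)
S = evecs[:, evals > 1e-9] @ np.diag(evals[evals > 1e-9]**-0.5)
g = np.linalg.eigvalsh(S.T @ Lw @ S)
print(f"kept {kept}/{len(edges)} edges; eigenvalues in [{g.min():.2f}, {g.max():.2f}]")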

Page 21: Sparse Approximations

Spectrally-thin trees

• Question: Let G be an unweighted graph with n vertices. Let C = min_e (effective conductance of edge e). Want a subtree T of G with L_T ⪯ (α/C)·L_G.

• Equivalent to ‖L_G^{+/2} L_T L_G^{+/2}‖ ≤ α/C.

• Goddyn’s Conjecture ’85: There is a subtree T that is combinatorially thin, using at most an O(1/C) fraction of every cut of G.
  – Relates to conjectures of Tutte (’54) on nowhere-zero flows, and to approximations of the traveling salesman problem.

Page 22: Sparse Approximations

Spectrally-thin trees

• Question: Let G be an unweighted graph with n vertices. Let C = min_e (effective conductance of edge e). Want a subtree T of G with L_T ⪯ (α/C)·L_G.

• Rudelson’s theorem: easily gives α = O(log n).

• Harvey-Olver ’13: α = O(log n / log log n). Moreover, there is an efficient algorithm to find such a tree.

• MSS ’13: α = O(1), but not algorithmic.

Page 23: Sparse Approximations

Talk Outline

• Vignette #1: Discrepancy theory

• Vignette #2: Singular values and eigenvalues

• Vignette #3: Graphs

• Theorem on “Spectrally Thin Trees”

Page 24: Sparse Approximations

Spectrally Thin Trees

Theorem: Given an (unweighted) graph G with effective conductances ≥ C, one can find an unweighted tree T with L_T ⪯ O(log n / log log n)·(1/C)·L_G.

Proof overview:
1. Show independent sampling gives spectral thinness, but not a tree.
   ► Sample every edge e independently with probability x_e = 1/c_e.
2. Show dependent sampling gives a tree, and spectral thinness still works.

Page 25: Sparse Approximations

Matrix Concentration

Theorem [Tropp ’12]: Let Y_1,…,Y_m be independent PSD matrices of size n×n. Let Y = ∑_i Y_i and Z = E[Y]. Suppose Y_i ⪯ R·Z a.s. Then

  Pr[ Y ⋠ t·Z ] ≤ n·(e^{t−1}/t^t)^{1/R}.
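For intuition about the tail's size, a tiny helper (mine, not from the talk) that evaluates the bound n·(e^{t−1}/t^t)^{1/R} at the parameters used on the next slide:

import numpy as np

def matrix_chernoff_tail(t, R, n):
    """Tail bound n * (e^(t-1) / t^t)^(1/R) from the theorem above,
    computed in log-space for numerical stability."""
    return n * np.exp((t - 1 - t * np.log(t)) / R)

n = 1000
t = 6 * np.log(n) / np.log(np.log(n))
print(f"t = {t:.2f}, bound = {matrix_chernoff_tail(t, 1.0, n):.2e}")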

Page 26: Sparse Approximations

Independent sampling

Define sampling probabilities x_e = 1/c_e. It is known that ∑_e x_e = n−1 (Foster’s theorem).

Claim: Independent sampling gives T ⊆ E with E[|T|] = n−1 and ∑_{e∈T} L_e ⪯ (α/C)·L_G whp, where α = O(log n / log log n).

Theorem [Tropp ’12]: Let M_1,…,M_m be n×n PSD matrices. Let D(x) be a product distribution on {0,1}^m with marginals x. Let Z = E_{y∼D(x)}[∑_i y_i M_i]. Suppose M_i ⪯ Z. Then Pr[ ∑_i y_i M_i ⋠ t·Z ] ≤ n·(e^{t−1}/t^t).

Define M_e = c_e·L_e, where L_e is the Laplacian of the single edge e. Then Z = ∑_e x_e M_e = ∑_e L_e = L_G, and M_e ⪯ Z holds; these are exactly the properties of conductances used (c_e·L_e ⪯ L_G is the definition of effective conductance). Setting α = 6 log n / log log n, we get ∑_{e∈T} c_e L_e ⪯ α·L_G whp, and since c_e ≥ C this gives the claim. But T is not a tree!
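A small numpy experiment, with the graph choice and sizes as illustrative assumptions, checking both ingredients on a complete graph: Foster's identity ∑_e x_e = n−1, and the spectral thinness of an independently sampled edge set:

import numpy as np
import itertools

rng = np.random.default_rng(4)

def laplacian(n, edges):
    L = np.zeros((n, n))
    for s, t, w in edges:
        L[s, s] += w; L[t, t] += w
        L[s, t] -= w; L[t, s] -= w
    return L

n = 30
edges = [(i, j, 1.0) for i, j in itertools.combinations(range(n), 2)]
LG = laplacian(n, edges)
Lp = np.linalg.pinv(LG)

# x_e = 1/c_e is the effective resistance of e; Foster: these sum to n-1.
x = []
for s, t, w in edges:
    v = np.zeros(n); v[s], v[t] = 1, -1
    x.append(w * (v @ Lp @ v))
print(f"sum of x_e = {sum(x):.4f} (Foster: n-1 = {n - 1})")

# Sample each edge independently with probability x_e.
T = [e for e, xe in zip(edges, x) if rng.random() < xe]
LT = laplacian(n, T)

# Spectral thinness: largest eigenvalue of LG^{+/2} LT LG^{+/2}.
evals, evecs = np.linalg.eigh(LG)
S = evecs[:, evals > 1e-9] @ np.diag(evals[evals > 1e-9]**-0.5)
print(f"|T| = {len(T)}, thinness = {np.linalg.eigvalsh(S.T @ LT @ S).max():.3f}")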

Page 27: Sparse Approximations

Spectrally Thin Trees

Theorem: Given an (unweighted) graph G with effective conductances ≥ C, one can find an unweighted tree T with L_T ⪯ O(log n / log log n)·(1/C)·L_G.

Proof overview:
1. Show independent sampling gives spectral thinness, but not a tree.
   ► Sample every edge e independently with probability x_e = 1/c_e.
2. Show dependent sampling gives a tree, and spectral thinness still works.
   ► Run pipage rounding to get a tree T with Pr[ e ∈ T ] = x_e = 1/c_e.

Page 28: Sparse Approximations

Pipage rounding
[Ageev-Sviridenko ’04, Srinivasan ’01, Calinescu et al. ’07, Chekuri et al. ’09]

Let P be any matroid polytope, e.g., the convex hull of characteristic vectors of spanning trees. Given a fractional x:

• Find coordinates a and b s.t. the line z ↦ x + z(e_a − e_b) stays in the current face.
• Find the two points where the line leaves P.
• Randomly choose one of those points s.t. the expectation is x.
• Repeat until x = χ_T is integral.

x is a martingale: the expectation of the final χ_T is the original fractional x.

[Figure: the polytope P with integral vertices χ_T1, …, χ_T6 and the fractional point x in its interior.]
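To make the rounding step concrete, here is a minimal Python sketch of pipage rounding specialized to the simplest matroid, the uniform matroid (all subsets of size k), where staying inside the polytope only requires keeping the coordinate sum fixed; the spanning-tree case needs real matroid machinery on top of this. All names and sizes are illustrative:

import numpy as np

rng = np.random.default_rng(5)

def pipage_round(x, rng):
    """Pipage rounding on the uniform-matroid polytope {x in [0,1]^m : sum = k}.
    Each step moves along e_a - e_b, keeping the sum constant, and picks an
    endpoint with probabilities that keep E[x] unchanged (so x is a martingale)."""
    x = x.astype(float).copy()
    while True:
        frac = np.where((x > 1e-12) & (x < 1 - 1e-12))[0]
        if len(frac) < 2:
            break
        a, b = frac[:2]
        up = min(1 - x[a], x[b])      # how far the line goes with z > 0
        down = min(x[a], 1 - x[b])    # how far it goes with z < 0
        z = up if rng.random() < down / (up + down) else -down   # E[z] = 0
        x[a] += z; x[b] -= z          # at least one coordinate becomes integral
    return np.round(x).astype(int)

x0 = np.array([0.5, 0.25, 0.75, 0.5, 1.0])      # fractional point, sum = 3
samples = np.array([pipage_round(x0, rng) for _ in range(20000)])
print("empirical marginals:", samples.mean(axis=0), "target:", x0)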

Page 29: Sparse Approximations

Pipage rounding and concavity

Say f : R^m → R is concave under swaps if z ↦ f(x + z(e_a − e_b)) is concave ∀x ∈ P, ∀a, b ∈ [m].
Let X_0 be the initial point and χ_T the final point visited by pipage rounding.
Claim: If f is concave under swaps then E[f(χ_T)] ≤ f(X_0). [Jensen]

Let E ⊆ {0,1}^m be an event. Let g : [0,1]^m → R be a pessimistic estimator for E, i.e., Pr_{y∼D(x)}[y ∈ E] ≤ g(x).
Claim: Suppose g is concave under swaps. Then Pr[ χ_T ∈ E ] ≤ g(X_0).

Page 30: Sparse Approximations

Chernoff Bound

Chernoff Bound: Fix any w, x ∈ [0,1]^m and let μ = w^T x. Define g_{t,θ}(x) = e^{−θt}·∏_i (1 + (e^{θ·w_i} − 1)·x_i). Then

  Pr_{y∼D(x)}[ w^T y ≥ t ] ≤ g_{t,θ}(x) ≤ e^{−θt}·e^{(e^θ−1)·μ}.

Claim: g_{t,θ} is concave under swaps. [Elementary calculus]

Let X_0 be the initial point and χ_T the final point visited by pipage rounding. Let μ = w^T X_0. Then Pr[ w^T χ_T ≥ t ] ≤ g_{t,θ}(X_0): the bound achieved by independent sampling is also achieved by pipage rounding.
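Both the estimator and the concavity claim can be sanity-checked numerically; in this sketch the weights, parameters, and finite-difference test are my own choices:

import numpy as np

def g(x, w, t, theta):
    """Pessimistic estimator e^(-theta t) * prod_i (1 + (e^(theta w_i) - 1) x_i)."""
    return np.exp(-theta * t) * np.prod(1 + (np.exp(theta * w) - 1) * x)

rng = np.random.default_rng(6)
m = 10
w = rng.uniform(size=m)
x = rng.uniform(size=m)
t, theta = w @ x + 1.0, 0.5

# Concavity of z -> g(x + z(e_a - e_b)), checked via a second difference.
a, b, h = 0, 1, 1e-4
d = np.zeros(m); d[a], d[b] = 1, -1
second_diff = g(x + h * d, w, t, theta) - 2 * g(x, w, t, theta) + g(x - h * d, w, t, theta)
print(f"second difference: {second_diff:.3e} (should be <= 0)")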

Page 31: Sparse Approximations

Matrix Pessimistic Estimators

Theorem [Tropp ’12]: Let M_1,…,M_m be n×n PSD matrices. Let D(x) be a product distribution on {0,1}^m with marginals x. Let Z = E_{y∼D(x)}[∑_i y_i M_i] and suppose M_i ⪯ Z. Let

  g_{t,θ}(x) = e^{−θt}·tr exp( ∑_i log(I + (e^{θ·M_i} − I)·x_i) ).

Then Pr_{y∼D(x)}[ ∑_i y_i M_i ⋠ t·Z ] ≤ g_{t,θ}(x), and g_{t,θ}(x) ≤ n·(e^{t−1}/t^t) for a suitable θ.

Main Theorem: g_{t,θ} is concave under swaps.

So g_{t,θ} is a pessimistic estimator, and the bound achieved by independent sampling is also achieved by pipage rounding.
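The following sketch evaluates this matrix pessimistic estimator numerically; the formula is as reconstructed above, and the matrix sizes and normalization of the M_i are my assumptions for the demo:

import numpy as np
from scipy.linalg import expm, logm

def g_matrix(x, Ms, t, theta):
    """Matrix pessimistic estimator:
    e^(-theta t) * tr exp( sum_i log(I + (e^(theta M_i) - I) x_i) )."""
    n = Ms[0].shape[0]
    S = np.zeros((n, n))
    for xi, Mi in zip(x, Ms):
        S = S + logm(np.eye(n) + (expm(theta * Mi) - np.eye(n)) * xi)
    return np.exp(-theta * t) * np.trace(expm(S)).real

rng = np.random.default_rng(7)
n, m = 5, 8
Ms = []
for _ in range(m):
    B = rng.normal(size=(n, n))
    M = B @ B.T
    Ms.append(M / np.trace(M))        # small PSD pieces
x = np.full(m, 0.5)
print(f"g = {g_matrix(x, Ms, t=3.0, theta=1.0):.4f}")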

Page 32: Sparse Approximations

Spectrally Thin Trees

Theorem: Given an (unweighted) graph G with effective conductances ≥ C, one can find an unweighted tree T with L_T ⪯ O(log n / log log n)·(1/C)·L_G.

Proof overview:
1. Show independent sampling gives spectral thinness, but not a tree.
   ► Sample every edge e independently with probability x_e = 1/c_e.
2. Show dependent sampling gives a tree, and spectral thinness still works.
   ► Run pipage rounding to get a tree T with Pr[ e ∈ T ] = x_e = 1/c_e.

Page 33: Sparse Approximations

Matrix Analysis

Matrix concentration inequalities are usually proven via sophisticated inequalities in matrix analysis.

• Rudelson: non-commutative Khintchine inequality.
• Ahlswede-Winter: Golden-Thompson inequality: if A, B are symmetric, then tr(e^{A+B}) ≤ tr(e^A e^B).
• Tropp: Lieb’s concavity inequality [1973]: if A, B are Hermitian and C is PD, then z ↦ tr exp( A + log(C + zB) ) is concave.

Key technical result: a new variant of Lieb’s theorem: if A is Hermitian, B_1, B_2 are PSD, and C_1, C_2 are PD, then z ↦ tr exp( A + log(C_1 + zB_1) + log(C_2 − zB_2) ) is concave.
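The Golden-Thompson inequality is easy to test numerically; a quick check with random symmetric matrices and scipy's matrix exponential:

import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(8)
n = 6
A = rng.normal(size=(n, n)); A = (A + A.T) / 2   # random symmetric matrices
B = rng.normal(size=(n, n)); B = (B + B.T) / 2

lhs = np.trace(expm(A + B))
rhs = np.trace(expm(A) @ expm(B))
print(f"tr e^(A+B) = {lhs:.4f} <= tr(e^A e^B) = {rhs:.4f}: {lhs <= rhs}")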

Page 34: Sparse Approximations

Questions

• Can the Spencer/Gluskin theorem be extended to matrices?
• Can MSS ’13 be made algorithmic?
• Can MSS ’13 be extended to large-rank matrices?
• O(1)-spectrally thin trees exist. Can one be found algorithmically?
• Are O(1)-spectrally thin trees helpful for Goddyn’s conjecture?