Provable Deterministic Leverage Score Sampling

Dimitris Papailiopoulos (UC Berkeley)

Anastasios Kyrillidis (EPFL)

Christos Boutsidis (Yahoo Labs)

New York, New York

August 27th, 2014

Singular Value Decomposition

m × n matrix A

k < ρ = rank(A)

Low-rank matrix approximation problem:

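$$\min_{X \in \mathbb{R}^{m \times n},\ \operatorname{rank}(X) \le k} \|A - X\|_F$$

Singular Value Decomposition (SVD):

$$A = U \Sigma V^T = \underbrace{\begin{pmatrix} U_k & U_{\rho-k} \end{pmatrix}}_{m \times \rho} \underbrace{\begin{pmatrix} \Sigma_k & 0 \\ 0 & \Sigma_{\rho-k} \end{pmatrix}}_{\rho \times \rho} \underbrace{\begin{pmatrix} V_k^T \\ V_{\rho-k}^T \end{pmatrix}}_{\rho \times n}$$

$U_k \in \mathbb{R}^{m \times k}$, $\Sigma_k \in \mathbb{R}^{k \times k}$, and $V_k \in \mathbb{R}^{n \times k}$.

Solution via the Eckart-Young theorem:

$$A_k = U_k \Sigma_k V_k^T = A V_k V_k^T,$$

computable in $O(mn \min\{m, n\})$ time.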
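The two expressions for the best rank-k approximation can be checked numerically. The following is a minimal NumPy sketch, not taken from the slides; the helper name `best_rank_k` and the random test matrix are illustrative assumptions.

```python
import numpy as np

def best_rank_k(A, k):
    """Return (A_k, V_k): the truncated-SVD approximation and the top-k right singular vectors.

    Hypothetical helper for illustration; it uses a full (thin) SVD.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    A_k = (U[:, :k] * s[:k]) @ Vt[:k, :]   # U_k Sigma_k V_k^T
    V_k = Vt[:k, :].T                       # n x k
    return A_k, V_k

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 6))             # assumed example matrix
k = 2
A_k, V_k = best_rank_k(A, k)

# Frobenius error of the truncation equals the norm of the discarded singular values.
s = np.linalg.svd(A, compute_uv=False)
err = np.linalg.norm(A - A_k, "fro")
```

Here `A @ V_k @ V_k.T` reproduces `A_k`, which is the identity used on the slide.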

The Column Subset Selection Problem (CSSP)

Definition

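Let $A \in \mathbb{R}^{m \times n}$ and let $c < n$ be a sampling parameter. Find $c$ columns of $A$, denoted as $C \in \mathbb{R}^{m \times c}$, that minimize

$$\|A - CC^\dagger A\|_F \quad \text{or} \quad \|A - CC^\dagger A\|_2,$$

where $C^\dagger$ denotes the Moore-Penrose pseudo-inverse.

CSSP gives a low-rank matrix factorization of $A$: $A \approx CX$ with $X = C^\dagger A$.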
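To make the objective concrete, here is a small sketch that evaluates the CSSP residual for a given set of column indices. The function name `cssp_error` and the example matrix are assumptions introduced for illustration only.

```python
import numpy as np

def cssp_error(A, cols, norm="fro"):
    """Return ||A - C C^+ A|| where C = A[:, cols]; norm is "fro" or 2 (spectral)."""
    C = A[:, cols]
    X = np.linalg.pinv(C) @ A        # X = C^+ A, so that A is approximated by C X
    return np.linalg.norm(A - C @ X, norm)

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 5))      # assumed example matrix
err_two_cols = cssp_error(A, [0, 2])
err_three_cols = cssp_error(A, [0, 2, 3])
err_all_cols = cssp_error(A, list(range(A.shape[1])))
```

Adding a column can only enlarge the subspace that A is projected onto, so the residual does not increase, and selecting every column drives it to zero.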

Motivation

Consider applying this to date-by-stock matrices.

Returns the most important stocks in the portfolio.

Interpretable matrix decompositions in general.

Prior work on CSSP

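| Ref. | $c$ | $\lVert A - CC^\dagger A \rVert_F^2 \le$ | Running time |
|---|---|---|---|
| 1 | $k/\varepsilon^2$ | $\lVert A - A_k \rVert_F^2 + \varepsilon \lVert A \rVert_F^2$ | $\operatorname{nnz}(A)$ |
| 2 | $(k \log k)/\varepsilon^2$ | $(1 + \varepsilon) \lVert A - A_k \rVert_F^2$ | |
| 3 | $(k \log k)/\varepsilon^2$ | $(1 + \varepsilon) \lVert A - A_k \rVert_F^2$ | $mnk^2 \log k$ |
| 4 | $k/\varepsilon$ | $(1 + \varepsilon) \lVert A - A_k \rVert_F^2$ | $mnk/\varepsilon$ |
| 5 | $k/\varepsilon$ | $(1 + \varepsilon) \lVert A - A_k \rVert_F^2$ | $m^3nk/\varepsilon$ |

References:

1 Frieze, Kannan, and Vempala. FOCS, 2003.

2 Drineas, Mahoney, and Muthukrishnan. RANDOM, 2006.

3 Deshpande, Rademacher, Vempala, and Wang. SODA, 2006.

4 Boutsidis, Drineas, and Magdon-Ismail. FOCS, 2011.

5 Guruswami and Sinop. SODA, 2012.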

There are more results in the linear algebra literature focusing on the spectral norm version of the CSSP.

Leverage scores and randomized sampling [Drineas, Mahoney, and Muthukrishnan. RANDOM, 2006]

Definition

[Leverage scores] Let $V_k \in \mathbb{R}^{n \times k}$ contain the top $k$ right singular vectors of an $m \times n$ matrix $A$ with rank $\rho = \operatorname{rank}(A) \ge k$. Then, the (rank-$k$) leverage score of the $i$-th column of $A$ is defined as

$$\ell_i^{(k)} = \|[V_k]_{i,:}\|_2^2, \quad i = 1, 2, \ldots, n.$$

For a target rank $k < \operatorname{rank}(A)$, define a probability distribution over the columns of $A$: $p_i = \ell_i^{(k)} / k$.

In $c$ independent and identically distributed passes, sample with replacement $c$ columns from $A$.

For $c = O(k \log k / \varepsilon^2)$ and with constant probability: $\|A - CC^\dagger A\|_F \le (1 + \varepsilon) \|A - A_k\|_F$.
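The definition and the sampling step above can be sketched in a few lines of NumPy. This is an illustrative sketch only: the function names `leverage_scores` and `sample_columns`, the example matrix, and the choice of `c` are assumptions, and the sketch draws column indices without any rescaling of the sampled columns.

```python
import numpy as np

def leverage_scores(A, k):
    """Rank-k leverage scores: squared row norms of V_k (top-k right singular vectors)."""
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    V_k = Vt[:k, :].T                       # n x k
    return np.sum(V_k ** 2, axis=1)

def sample_columns(A, k, c, rng):
    """Draw c column indices i.i.d. with replacement, with p_i = score_i / k."""
    p = leverage_scores(A, k) / k
    p = p / p.sum()                         # guard against floating-point drift
    idx = rng.choice(A.shape[1], size=c, replace=True, p=p)
    return idx, p

rng = np.random.default_rng(2)
A = rng.standard_normal((30, 12))           # assumed example matrix
k, c = 3, 6
scores = leverage_scores(A, k)
idx, p = sample_columns(A, k, c, rng)
C = A[:, idx]
```

Because the columns of $V_k$ are orthonormal, the scores sum to $k$ and each lies in $[0, 1]$, which is why dividing by $k$ yields a probability distribution.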

Deterministic leverage score sampling [Jolliffe, 1972]

Compute the leverage scores of A w.r.t. some k .

Pick the c columns with the largest leverage scores.

Nice empirical results.

No theoretical analysis.

Contribution of this talk: theoretical analysis of deterministic leverage score sampling.

Deterministic leverage score sampling [revisited]

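Input: $A \in \mathbb{R}^{m \times n}$, $k$, $\theta$ ($0 < \theta < k$)

- Compute $V_k \in \mathbb{R}^{n \times k}$ (via SVD).

- Compute the leverage scores: for $i = 1, 2, \ldots, n$, set $\ell_i^{(k)} = \|[V_k]_{i,:}\|_2^2$.

- Without loss of generality, let the $\ell_i^{(k)}$'s be sorted:

$$\ell_1^{(k)} \ge \cdots \ge \ell_i^{(k)} \ge \ell_{i+1}^{(k)} \ge \cdots \ge \ell_n^{(k)}.$$

- Find the index $c \in \{1, \ldots, n\}$ such that:

$$c = \operatorname*{argmin}_c \left( \sum_{i=1}^{c} \ell_i^{(k)} > \theta \right).$$

- If $c < k$, set $c = k$.

Output: $C \in \mathbb{R}^{m \times c}$ containing the first $c$ columns of $A$ (in the sorted order).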
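The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `deterministic_leverage_sampling` and the example inputs are hypothetical, and the sketch sorts indices explicitly instead of assuming the columns are pre-sorted.

```python
import numpy as np

def deterministic_leverage_sampling(A, k, theta):
    """Select the columns with the largest rank-k leverage scores.

    Keeps the smallest number c of top-scoring columns whose scores sum to
    more than theta, with c forced to be at least k.
    """
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    scores = np.sum(Vt[:k, :] ** 2, axis=0)          # squared row norms of V_k
    order = np.argsort(-scores, kind="stable")       # indices by decreasing score
    cumulative = np.cumsum(scores[order])
    # First position where the running sum exceeds theta (0-based), hence +1.
    c = int(np.searchsorted(cumulative, theta, side="right")) + 1
    c = min(max(c, k), A.shape[1])
    cols = order[:c]
    return A[:, cols], cols

rng = np.random.default_rng(3)
A = rng.standard_normal((20, 30))                    # assumed example matrix
k, eps = 3, 0.5
theta = k - eps
C, cols = deterministic_leverage_sampling(A, k, theta)
```

The only costly step is the SVD; the selection itself is a sort followed by a cumulative sum.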

Main result

Theorem

Let $\theta = k - \varepsilon$ for some $\varepsilon \in (0, 1)$. Then, for $\xi \in \{2, F\}$, we have

$$\|A - CC^\dagger A\|_\xi^2 < (1 + \varepsilon) \cdot \|A - A_k\|_\xi^2.$$

Weak result if the leverage scores are almost uniform.
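A short worked case shows why. Assume, as an idealized illustration not taken from the slides, that all leverage scores are exactly uniform. Since the scores sum to $k$, the threshold rule with $\theta = k - \varepsilon$ then gives:

```latex
\ell_i^{(k)} = \frac{k}{n} \ \text{for all } i
\quad\Longrightarrow\quad
\sum_{i=1}^{c} \ell_i^{(k)} = \frac{ck}{n} > k - \varepsilon
\iff
c > n\left(1 - \frac{\varepsilon}{k}\right).
```

For example, with $k = 10$ and $\varepsilon = 0.5$ the algorithm keeps more than 95% of the columns, so the guarantee says little.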

Main result: leverage scores following a power law

Theorem

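Let the leverage scores follow a power-law decay with exponent $\alpha_k = 1 + \eta$, for $\eta > 0$:

$$\ell_i^{(k)} = \frac{\ell_1^{(k)}}{i^{\alpha_k}}.$$

Let $\theta = k - \varepsilon$. Then,

$$\ldots\bigr)^{\frac{1}{1+\eta}}$$

and $\|A - CC^\dagger A\|_\xi^2 < (1 + \varepsilon) \cdot \|A - A_k\|_\xi^2$.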

Is power law a realistic assumption?

Test leverage scores of large graphs.

Show leverage scores follow power law decays.

Power law is a realistic assumption

[Figure: sorted rank-k leverage scores for k = 10, one panel per dataset; x-axis: index of the sorted leverage score (1 to 1000); y-axis: leverage score on a logarithmic scale. Fitted exponents α_10 per panel: amazon 1.45, citeseer 1.5, foursquare 1.7, github 1.13, gnutella 2, google 1.6, gowalla 0.9, livejournal 0.2, slashdot 0.9, skitter 0.2, writers 4, youtube groups 1.75, youtube 0.5. Three further panels whose titles did not survive extraction have α_10 = 1.6, 0.12, and 1.58.]

k = 10.

The decay of the leverage scores is shown on a logarithmic scale.

A fitted power-law curve $\beta \cdot x^{-\alpha_k}$ is plotted for each dataset.

True leverage scores are plotted with a red × marker.

The fitted curves are denoted with a solid blue line.
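One common way to obtain such a fit is ordinary least squares in log-log space. The slides do not say how the curves were fitted, so the following is only a sketch under that assumption; `fit_power_law` and the synthetic scores are hypothetical.

```python
import numpy as np

def fit_power_law(sorted_scores):
    """Fit scores[i] ~ beta * i**(-alpha) by least squares on log-log axes.

    Assumes the scores are positive and sorted in decreasing order.
    Returns (alpha, beta).
    """
    i = np.arange(1, len(sorted_scores) + 1, dtype=float)
    slope, intercept = np.polyfit(np.log(i), np.log(sorted_scores), deg=1)
    return -slope, float(np.exp(intercept))

# Synthetic scores with a known exponent, used only to exercise the fit.
i = np.arange(1, 1001, dtype=float)
synthetic = 0.8 * i ** -1.5
alpha, beta = fit_power_law(synthetic)
```

On real leverage scores the fit is approximate, and a log-log least-squares fit weights the tail heavily, so other fitting procedures may give somewhat different exponents.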

Power-law decaying leverage scores

[Figure: two rows of four panels (k = 5, 10, 50, 100); x-axis: number of selected columns c; y-axis: $\|A - CC^\dagger A\|_2^2 / \|A - A_k\|_2^2$. Surviving magenta-line annotations: c = 152 and c = 129 in the k = 100 panels of the first and second row, respectively.]

m = 200, n = 1000.

k = 5, 10, 50, 100.

c = 1, 2, ..., 1000.

$\alpha_k = 0.5$ and $\alpha_k = 1.5$.

The blue curve is the relative error ratio $\|A - CC^\dagger A\|_2^2 / \|A - A_k\|_2^2$.

The vertical cyan line corresponds to the point where k = c.

The vertical magenta line indicates the point where the c sampled columns offer a better approximation than the best rank-k matrix $A_k$.

Nearly-uniform leverage scores

[Figure: four panels (k = 5, 10, 50, 100); x-axis: number of selected columns c; y-axis: $\|A - CC^\dagger A\|_2^2 / \|A - A_k\|_2^2$. Magenta-line annotations: c = 473 (k = 5), c = 404 (k = 10), c = 629 (k = 50), c = 630 (k = 100).]

m = 200, n = 1000.

k = 5, 10, 50, 100.

c = 1, 2, ..., 1000.

The blue curve is the relative error ratio $\|A - CC^\dagger A\|_2^2 / \|A - A_k\|_2^2$.

The leftmost vertical cyan line corresponds to the point where k = c.

The rightmost vertical magenta line indicates the point where the c sampled columns offer as good an approximation as that of the best rank-k matrix $A_k$.
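A curve of this kind can be reproduced in outline as follows. The slides do not describe how the test matrices were constructed, so this sketch substitutes a small Gaussian random matrix (an assumption, chosen because its leverage scores tend to be close to uniform); `error_ratio_curve` and all sizes are hypothetical.

```python
import numpy as np

def error_ratio_curve(A, k, c_values):
    """Spectral-norm ratio ||A - C C^+ A||_2^2 / ||A - A_k||_2^2 for each c.

    C holds the c columns of A with the largest rank-k leverage scores.
    """
    _, s, Vt = np.linalg.svd(A, full_matrices=False)
    best = s[k] ** 2                                  # ||A - A_k||_2^2 = sigma_{k+1}^2
    scores = np.sum(Vt[:k, :] ** 2, axis=0)
    order = np.argsort(-scores, kind="stable")
    ratios = []
    for c in c_values:
        C = A[:, order[:c]]
        residual = A - C @ (np.linalg.pinv(C) @ A)
        ratios.append(np.linalg.norm(residual, 2) ** 2 / best)
    return np.array(ratios)

rng = np.random.default_rng(4)
A = rng.standard_normal((40, 120))                    # assumed stand-in matrix
k = 5
c_values = list(range(5, 121, 5))
ratios = error_ratio_curve(A, k, c_values)
```

The first `c` at which the ratio drops to 1 or below plays the role of the magenta line in the figure.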

Conclusions

The Column Subset Selection Problem

approach: sampling w.r.t. the leverage scores.

Randomized leverage score sampling

theory: strong results [Drineas et al., 2008].

practice: strong performance.

Deterministic leverage score sampling

theory: good performance if leverage scores follow a power-law decay.

practice: many real data sets exhibit leverage scores with power-law decays.
