Greedy Column Subset Selection: New Bounds and Distributed Algorithms
Jason Altschuler
Joint work with Aditya Bhaskara, Thomas Fu, Vahab Mirrokni, Afshin Rostamizadeh, and Morteza Zadimoghaddam
ICML 2016
Talk Outline
1. Background/motivation for Column Subset Selection (CSS)
2. Previous work + our contributions
3. (Single-machine) greedy algorithm
4. (Distributed) coreset greedy algorithm
5. Further optimizations
6. Experiments
7. [Time permitting] Proof sketches
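Low-Rank Approximation
Given a (large) matrix $A \in \mathbb{R}^{m \times n}$ and target rank $k \ll m, n$, find a rank-$k$ matrix that approximates $A$ as closely as possible.
• Optimal solution: rank-k SVD
• Applications:
  • Dimensionality reduction
  • Signal denoising
  • Compression
  • ...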
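A minimal NumPy sketch of the rank-k SVD solution, assuming the approximation error is measured in squared Frobenius norm (the slide does not state the norm, and `best_rank_k` is a hypothetical helper name):

```python
import numpy as np

def best_rank_k(A: np.ndarray, k: int) -> np.ndarray:
    """Return the rank-k truncated SVD of A."""
    # Thin SVD: A = U @ diag(s) @ Vt, singular values sorted in decreasing order.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Keep only the top-k singular triplets.
    return (U[:, :k] * s[:k]) @ Vt[:k, :]
```

Under that assumption, the squared error of the truncation equals the sum of the discarded squared singular values, which gives a quick sanity check for any low-rank routine.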
Column Subset Selection (CSS)
• Columns often have important meaning
• CSS: Low-rank matrix approximation in the column space of A
[Figure: the m × n matrix A approximated using k selected columns A[S]]
Why use CSS for dimensionality reduction?
• Unsupervised
  • Don’t need labeled data
• Classifier independent
  • Can reuse output for different classifiers
• Interpretable
  • Generate features by subselecting instead of applying an arbitrary function
• Efficient during inference
  • Feature subselection (CSS) better than matrix multiplication (SVD) if:
    • Latency sensitive
    • SVD projection matrix prohibitively large
    • Sparse
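To make the inference-time comparison concrete, here is a small sketch. The function names and the projection matrix `V_k` are illustrative assumptions, not taken from the talk:

```python
import numpy as np

def css_features(x: np.ndarray, S) -> np.ndarray:
    """CSS at inference: read off the k selected coordinates of x."""
    # k lookups; no projection matrix to store, and entries of x pass through unchanged.
    return x[np.asarray(S)]

def svd_features(x: np.ndarray, V_k: np.ndarray) -> np.ndarray:
    """SVD-style reduction at inference: multiply by an n x k projection matrix."""
    # Roughly n * k multiply-adds, and the n x k matrix V_k must be kept in memory.
    return V_k.T @ x
```

With n in the millions, the dense `V_k` can be the dominant memory cost, while the index list `S` stays at k integers.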
(Very simplified) background on CSS
• CSS is UG-hard [Civril 2014]
• Importance sampling [Drineas et al. 2004, Frieze et al. 2004, …]
  • Fast, but additive-error bounds
• More complicated algorithms [Deshpande et al. 2006, Drineas et al. 2006, Boutsidis et al. 2009, Boutsidis et al. 2011, Cohen et al. 2015, …]
  • Multiplicative-error bounds, but complicated → not as fast/distributable
• Greedy [Farahat et al. 2011, Civril et al. 2011, Boutsidis et al. 2015]
  • Multiplicative-error bounds and fast/distributable
Contributions
• Prove tight approximation guarantee for the greedy algorithm
• First distributed implementation with provable approximation factors
• Further optimizations for the greedy algorithm
• Empirical results showing these algorithms are extremely scalable and have accuracy comparable with the state-of-the-art
Generalized Column Subset Selection (GCSS)
CSS(A, k)
GCSS(A, B, k)
• GCSS(A, B, k) uses k columns of B to approximate A
• Note: GCSS(A, A, k) = CSS(A, k)
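The cost that a candidate column set incurs can be sketched as follows. This assumes the usual squared-Frobenius objective, where A is projected onto the span of the chosen columns of B. The objective itself is not legible on the slide, and `gcss_cost` is a hypothetical name:

```python
import numpy as np

def gcss_cost(A: np.ndarray, B: np.ndarray, S) -> float:
    """Squared Frobenius error of approximating A with the columns B[:, S]."""
    S = list(S)
    if not S:
        # With no columns selected, nothing of A is explained.
        return float(np.linalg.norm(A, "fro") ** 2)
    B_S = B[:, S]
    # Least squares yields the orthogonal projection of A onto span(B_S),
    # and stays well-defined even if the chosen columns are linearly dependent.
    coef = np.linalg.lstsq(B_S, A, rcond=None)[0]
    return float(np.linalg.norm(A - B_S @ coef, "fro") ** 2)
```

Calling `gcss_cost(A, A, S)` evaluates the plain CSS objective, mirroring the note that GCSS(A, A, k) = CSS(A, k).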
Convenient reformulation of GCSS
• Denote by f(S) the amount by which the selected columns S reduce the original GCSS cost function
• GCSS ⟺ maximizing f subject to a cardinality constraint
• Intuition: f measures how much of A is “covered/explained” by the selected columns
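One standard way to write a reformulation of this kind, assuming the cost is the squared Frobenius error and writing $\Pi_{B[S]}$ for the orthogonal projector onto the span of the selected columns (the notation here is an assumption, not copied from the slide):

```latex
\begin{align*}
\mathrm{cost}(S) &= \lVert A - \Pi_{B[S]} A \rVert_F^2
  = \lVert A \rVert_F^2 - \lVert \Pi_{B[S]} A \rVert_F^2
  && \text{(Pythagoras, since } \Pi_{B[S]} \text{ is an orthogonal projector)} \\
f(S) &:= \lVert \Pi_{B[S]} A \rVert_F^2
  && \text{(the part of } A \text{ explained by } S\text{)} \\
\min_{|S| \le k} \mathrm{cost}(S) &\iff \max_{|S| \le k} f(S)
  && \text{(because } \lVert A \rVert_F^2 \text{ does not depend on } S\text{)}
\end{align*}
```

Under these assumptions f(∅) = 0 and f never decreases when a column is added, which is what makes a greedy maximization natural.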
GREEDY algorithm to maximize f
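The pseudocode on this slide is not legible, so the following is a sketch of the generic greedy scheme rather than the authors' exact procedure. It assumes f(S) is the squared Frobenius norm of A projected onto the span of B[:, S]; `f` and `greedy` are illustrative names:

```python
import numpy as np

def f(A: np.ndarray, B: np.ndarray, S) -> float:
    """Squared Frobenius norm of A projected onto span(B[:, S])."""
    S = list(S)
    if not S:
        return 0.0
    B_S = B[:, S]
    coef = np.linalg.lstsq(B_S, A, rcond=None)[0]
    return float(np.linalg.norm(B_S @ coef, "fro") ** 2)

def greedy(A: np.ndarray, B: np.ndarray, r: int) -> list:
    """Pick r columns of B, each time adding the column with the largest marginal gain."""
    S = []
    for _ in range(min(r, B.shape[1])):
        candidates = [j for j in range(B.shape[1]) if j not in S]
        # Since f(S) is fixed within an iteration, maximizing f(S + [j])
        # is the same as maximizing the marginal gain f(S + [j]) - f(S).
        values = [f(A, B, S + [j]) for j in candidates]
        S.append(candidates[int(np.argmax(values))])
    return S
```

This naive version recomputes a least-squares problem for every candidate in every iteration, which is the cost that the later optimizations aim to reduce.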
Our result: Analysis of GREEDY
• We expect the vectors in $\mathrm{OPT}_k$ to be well-conditioned (think “almost orthogonal”), so $1/\sigma_{\min}(\mathrm{OPT}_k)$ is small
• If $\sigma_{\min}(\mathrm{OPT}_k)$ is bounded below by a constant, then only $O(k/\varepsilon)$ columns are needed
• Significant improvement upon current bounds, which depend on the worst singular value of any k columns
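A guarantee of the shape these bullets describe can be written as below. This is a hedged reconstruction: $c$ is a placeholder for an absolute constant, $\sigma_{\min}(\mathrm{OPT}_k)$ denotes the smallest singular value of the matrix formed by the optimal $k$ columns, $\mathrm{GREEDY}_r$ denotes the set chosen after $r$ greedy steps, and the columns of $B$ are assumed to be normalized to unit length.

```latex
r \;\ge\; \frac{c\,k}{\varepsilon\,\sigma_{\min}(\mathrm{OPT}_k)}
\quad\Longrightarrow\quad
f(\mathrm{GREEDY}_r) \;\ge\; (1-\varepsilon)\, f(\mathrm{OPT}_k)
```

When $\sigma_{\min}(\mathrm{OPT}_k)$ is bounded below by a constant, the requirement on $r$ becomes $r = O(k/\varepsilon)$. The key difference from a worst-case bound is that only the conditioning of the optimal columns enters, not that of every possible set of $k$ columns.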
DISTGREEDY: GCSS(A, B, k) with L machines
[Figure: the columns of B are split across Machine 1, Machine 2, …, Machine L, which feed a designated machine]
DISTGREEDY: first observations
• Easy/natural to implement in MapReduce
• 2-pass streaming algorithm in random arrival model for columns
• Can also do multiple rounds/epochs. Good for:
  • Massive datasets
  • Getting better approximations (next slide)
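A single-process simulation of the two-round pattern suggested by the DISTGREEDY diagram might look like the sketch below. It rests on several assumptions not confirmed by the slides: columns of B are partitioned uniformly at random, every machine can evaluate f against the same A, each machine keeps k columns, and the designated machine reruns greedy on the union. All function names are illustrative.

```python
import numpy as np

def f(A, B, S):
    """Squared Frobenius norm of A projected onto span(B[:, S])."""
    if not S:
        return 0.0
    B_S = B[:, S]
    coef = np.linalg.lstsq(B_S, A, rcond=None)[0]
    return float(np.linalg.norm(B_S @ coef, "fro") ** 2)

def greedy_on(A, B, k, candidates):
    """Greedy selection of up to k columns restricted to the given candidate indices."""
    pool, S = list(candidates), []
    for _ in range(min(k, len(pool))):
        values = [f(A, B, S + [j]) for j in pool]
        S.append(pool.pop(int(np.argmax(values))))
    return S

def dist_greedy(A, B, k, num_machines, seed=0):
    """Round 1: per-machine greedy on a random part. Round 2: greedy on the union."""
    rng = np.random.default_rng(seed)
    parts = np.array_split(rng.permutation(B.shape[1]), num_machines)
    coreset = []
    for part in parts:  # in MapReduce this loop would be the map phase
        coreset.extend(greedy_on(A, B, k, [int(j) for j in part]))
    # The designated machine only sees the (at most) k * num_machines coreset columns.
    return greedy_on(A, B, k, coreset)
```

The appeal of this pattern is that the second round works on a candidate pool whose size depends on k and the number of machines rather than on the number of columns of B.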
Our results: Analysis of DISTGREEDY
Consider an instance GCSS(A, B, k)
Scalable Implementation: GREEDY++
4 optimizations that preserve our approximation guarantee:
1. JL Lemma [Johnson & Lindenstrauss 1982, Sarlos 2006]: randomly project to fewer rows while still preserving k-linear combos
2. Projection-Cost Preserving Sketches [Cohen et al. 2015]: sketch A with fewer columns
3. “Stochastic Greedy” [Mirzasoleiman et al. 2015]: each iteration uses only a sample of the marginal utility calls instead of all of them
4. Updating A every iteration [Farahat et al. 2013]: after each iteration, remove projections of A and B onto the selected column. Reduces the complexity of each marginal utility computation
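Optimizations 3 and 4 can be sketched together as follows. This is an illustration under assumptions, not the GREEDY++ implementation: f is taken to be the squared Frobenius norm of the projection of A, candidate sampling is uniform, and `greedy_deflate`, `sample_size`, and `tol` are hypothetical names. After deflation, the marginal gain of a column b reduces to ‖Aᵀb‖² / ‖b‖² on the residual matrices.

```python
import numpy as np

def greedy_deflate(A, B, r, sample_size=None, seed=0, tol=1e-12):
    """Greedy selection with residual updates and optional candidate sampling."""
    rng = np.random.default_rng(seed)
    A_res = np.array(A, dtype=float)  # residual of A after removing selected directions
    B_res = np.array(B, dtype=float)  # residual of B after removing selected directions
    n = B_res.shape[1]
    S = []
    for _ in range(min(r, n)):
        remaining = np.array([j for j in range(n) if j not in S])
        if sample_size is not None and sample_size < len(remaining):
            # Stochastic-greedy style: score only a random sample of candidates.
            remaining = rng.choice(remaining, size=sample_size, replace=False)
        cols = B_res[:, remaining]
        norms_sq = np.sum(cols ** 2, axis=0)
        scores = np.sum((A_res.T @ cols) ** 2, axis=0)
        # Marginal gain of each candidate; columns already in the span get gain 0.
        gains = np.where(norms_sq > tol, scores / np.maximum(norms_sq, tol), 0.0)
        best = int(np.argmax(gains))
        if norms_sq[best] <= tol:
            break  # no sampled candidate adds a new direction
        j = int(remaining[best])
        S.append(j)
        q = B_res[:, j] / np.sqrt(norms_sq[best])
        # Remove the component along q from every column of A and B.
        A_res -= np.outer(q, q @ A_res)
        B_res -= np.outer(q, q @ B_res)
    return S
```

Without sampling this selects the same columns as recomputing a projection for every candidate (up to ties and rounding), but each gain costs only a matrix-vector product on the residuals.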
“Small” dataset (MNIST): to show accuracy
• Takeaway: GREEDY, GREEDY++, and GREEDY-core have roughly the same accuracy as the state-of-the-art
Large dataset (news20.binary): to show scalability
• Takeaway: DISTGREEDY is able to scale to massive datasets while still selecting effective features
Proof sketch: Analysis of GREEDY
● Key lemma: there exists an element of $\mathrm{OPT}_k$ that gives a large marginal gain to $\mathrm{GREEDY}_r$
  ● Closes the gap to $f(\mathrm{OPT}_k)$
  ● Similar to submodular functions
Questions?