Sparsity Control for Robust Principal Component Analysis
Gonzalo Mateos and Georgios B. Giannakis
ECE Department, University of Minnesota
Acknowledgments: NSF grants no. CCF-1016605, EECS-1002180
Asilomar Conference, November 10, 2010
Principal Component Analysis
Our goal: robustify PCA by controlling outlier sparsity
Motivation: (statistical) learning from high-dimensional data
Principal component analysis (PCA) [Pearson’1901]
Extraction of low-dimensional data structure
Data compression and reconstruction
PCA is non-robust to outliers [Jolliffe’86]
Figure panels: DNA microarray; traffic surveillance
Our work in context
Robust PCA
Robust covariance matrix estimators [Campbell’80], [Huber’81]
Computer vision [Xu-Yuille’95], [De la Torre-Black’03]
Low-rank matrix recovery from sparse errors [Wright et al’09]
Huber’s M-class and sparsity in linear regression [Fuchs’99]
Contemporary applications
Anomaly detection in IP networks [Huang et al’07], [Kim et al’09]
Video surveillance, e.g., [Oliver et al’99]
Figure panels: original; robust PCA; ‘outliers’
PCA formulations
Training data:
Minimum reconstruction error, with a dimensionality reduction operator and a reconstruction operator:
Maximum variance:
Factor analysis model:
Solution:
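For reference, the classical formulations listed on this slide can be written as follows. This is a sketch in standard textbook notation; the symbols $y_n$, $m$, $U$, $s_n$, $e_n$ and the dimension $q$ are assumptions and may differ from the ones used on the slides. Here $U^\top$ plays the role of the dimensionality reduction operator and $U$ that of the reconstruction operator.

```latex
% Assumed notation: training data y_1, ..., y_N in R^p, target dimension q < p
\begin{align*}
\text{Minimum reconstruction error:}\quad
  & \min_{m,\;U:\,U^\top U = I_q}\ \sum_{n=1}^{N}
    \bigl\| y_n - m - U U^\top (y_n - m) \bigr\|_2^2 \\
\text{Maximum variance:}\quad
  & \max_{U:\,U^\top U = I_q}\ \sum_{n=1}^{N}
    \bigl\| U^\top (y_n - \bar{y}) \bigr\|_2^2,
    \qquad \bar{y} = \tfrac{1}{N}\sum_{n=1}^{N} y_n \\
\text{Factor analysis model:}\quad
  & y_n = m + U s_n + e_n, \qquad n = 1,\dots,N \\
\text{Solution:}\quad
  & \hat{m} = \bar{y}, \qquad
    \hat{U} = \text{$q$ dominant eigenvectors of }
    \tfrac{1}{N}\sum_{n=1}^{N} (y_n - \bar{y})(y_n - \bar{y})^\top
\end{align*}
```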
Robustifying PCA
Least-trimmed squares (LTS) regression [Rousseeuw’87]
(LTS PCA)
LTS-based PCA for robustness
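Each term in the cost is an order statistic among the squared reconstruction residuals; only the smallest ones are summed
Trimming constant determines breakdown point
Q: How should we go about minimizing the (LTS PCA) cost?
(LTS PCA) is nonconvex; existence of minimizer(s)?
A: Try all subsets whose size equals the trimming constant, solve, and pick the best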
Simple but intractable beyond small problems
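The "try all subsets" answer can be sketched in a few lines of Python. This is an illustrative toy, not the authors' code: the function names, the use of an SVD for the per-subset PCA fit, and the symbol `nu` for the trimming constant (the number of points retained) are assumptions. The number of subsets grows combinatorially, which is why the approach is only usable on very small problems.

```python
from itertools import combinations

import numpy as np


def pca_fit(Y, q):
    """Least-squares PCA: sample mean and top-q principal directions."""
    m = Y.mean(axis=0)
    # Right singular vectors of the centered data span the principal subspace.
    _, _, Vt = np.linalg.svd(Y - m, full_matrices=False)
    return m, Vt[:q].T


def sq_residuals(Y, m, U):
    """Squared reconstruction error of each row of Y."""
    Z = Y - m
    return np.sum((Z - Z @ U @ U.T) ** 2, axis=1)


def lts_pca_bruteforce(Y, q, nu):
    """Fit PCA on every subset of nu rows; keep the fit with the smallest cost."""
    best = (np.inf, None, None, None)
    for subset in combinations(range(len(Y)), nu):
        idx = list(subset)
        m, U = pca_fit(Y[idx], q)
        cost = sq_residuals(Y[idx], m, U).sum()
        if cost < best[0]:
            best = (cost, idx, m, U)
    return best


# Toy data (assumed): 8 inliers exactly on the line y = 2x, plus 2 outliers.
t = np.linspace(-1.0, 1.0, 8)
inliers = np.outer(t, [1.0, 2.0])
outliers = np.array([[5.0, -5.0], [-4.0, 6.0]])
Y = np.vstack([inliers, outliers])

cost, kept, m, U = lts_pca_bruteforce(Y, q=1, nu=8)
print(kept)  # indices of the retained points
```

With 10 points and `nu=8` there are only 45 subsets; the count explodes quickly as the data set grows.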
Modeling outliers
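Remarks
Both the model parameters and the auxiliary outlier variables are unknown
If outliers are sporadic, then the vector of outlier variables is sparse!
Introduce auxiliary variables that are zero if a datum is an inlier and nonzero if it is an outlier
Inliers obey the factor analysis model; outliers obey something else
Inlier noise: zero-mean i.i.d. random vectors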
Natural (but intractable) estimator
LTS PCA as sparse regression
Lagrangian form
The tuning parameter controls sparsity in the outlier variables, and thus the number of outliers
(P0)
Justifies the model and its estimator (P0); ties sparsity with robustness
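Proposition 1: If an estimate solves (P0) with the tuning parameter chosen such that the number of nonzero outlier vectors equals the number of trimmed data, then it solves (LTS PCA) too.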
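One way to write the outlier-aware model and its sparsity-regularized estimator is sketched below. The notation ($Y$, $S$, $O$, $o_n$, $\lambda_0$) and the orthonormality constraint on $U$ are assumptions made for illustration; the slide's own equation may be stated differently.

```latex
% Assumed notation: Y = [y_1 ... y_N]^T, S = [s_1 ... s_N]^T, O = [o_1 ... o_N]^T
\begin{align*}
& y_n = m + U s_n + e_n + o_n, \qquad
  o_n = 0 \ \text{(inlier)}, \quad o_n \neq 0 \ \text{(outlier)} \\
\text{(P0)}\quad
& \min_{m,\,U,\,S,\,O}\ \bigl\| Y - 1_N m^\top - S U^\top - O \bigr\|_F^2
  + \lambda_0 \| O \|_0
  \quad \text{s.t. } U^\top U = I_q
\end{align*}
% \|O\|_0 counts the nonzero rows of O, i.e. the data flagged as outliers.
```

Under this reading, a larger $\lambda_0$ forces more rows of $O$ to zero, so fewer data are declared outliers.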
Just relax!
(P0) is NP-hard, so relax it
(P2)
Q: Does (P2) yield robust estimates?
A: Yes! Huber's estimator is a special case
The role of the sparsity-controlling parameter is central
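To see how a group-Lasso type relaxation controls sparsity, consider the outlier update for fixed residuals. Assuming (P2) penalizes the sum of Euclidean norms of the outlier vectors with weight `lam` (an assumed name), each outlier vector has a closed-form minimizer: residual rows with norm below `lam / 2` are set exactly to zero and declared inliers. The sketch below illustrates that step only.

```python
import numpy as np


def group_soft_threshold(R, lam):
    """Row-wise minimizer of ||r - o||_2^2 + lam * ||o||_2 for each row r of R."""
    norms = np.linalg.norm(R, axis=1, keepdims=True)
    # Guard against division by zero for all-zero rows.
    scale = np.maximum(1.0 - lam / (2.0 * np.maximum(norms, 1e-12)), 0.0)
    return scale * R


# Toy residuals (assumed): one small row, one large row, one zero row.
R = np.array([[0.1, -0.2], [3.0, 4.0], [0.0, 0.0]])
O = group_soft_threshold(R, lam=2.0)
print(O)
```

Plugging the minimizer back into the cost gives a quadratic loss for small residual norms and a linear one for large norms, which is the Huber-type behavior the slide alludes to.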
Entrywise outliers
Use ℓ1-norm regularization
(P1)
Figure panels: original; robust PCA (P2), entire image rejected; robust PCA (P1), outlier pixels rejected
Alternating minimization
(P1)
Subspace update: reduced-rank Procrustes rotation
Outlier update: coordinatewise soft-thresholding
Proposition 2: Alg. 1’s iterates converge to a stationary point of (P1).
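A minimal sketch of such an alternating scheme is given below. It assumes (P1) has the form of a Frobenius-norm fit of `Y` by a mean `m`, a low-rank term `S @ U.T` with orthonormal `U`, and an outlier matrix `O`, plus `lam` times the entrywise ℓ1 norm of `O`. The function names, the initialization, and the synthetic data are assumptions and not taken from the slides.

```python
import numpy as np


def soft(X, t):
    """Entrywise soft-thresholding with threshold t."""
    return np.sign(X) * np.maximum(np.abs(X) - t, 0.0)


def objective(Y, m, S, U, O, lam):
    """Assumed (P1)-type cost: squared fit error plus l1 penalty on O."""
    return np.sum((Y - m - S @ U.T - O) ** 2) + lam * np.abs(O).sum()


def robust_pca_l1(Y, q, lam, n_iter=50):
    # Initialize with ordinary PCA and no outliers.
    O = np.zeros_like(Y)
    m = Y.mean(axis=0)
    _, _, Vt = np.linalg.svd(Y - m, full_matrices=False)
    U = Vt[:q].T
    S = (Y - m) @ U
    history = [objective(Y, m, S, U, O, lam)]
    for _ in range(n_iter):
        m = (Y - O - S @ U.T).mean(axis=0)      # mean update
        Z = Y - m - O                            # outlier-compensated data
        S = Z @ U                                # coefficients (U orthonormal)
        # Procrustes step: U = L R^T from the thin SVD of Z^T S.
        L, _, Rt = np.linalg.svd(Z.T @ S, full_matrices=False)
        U = L @ Rt
        O = soft(Y - m - S @ U.T, lam / 2.0)     # coordinatewise thresholding
        history.append(objective(Y, m, S, U, O, lam))
    return m, U, S, O, history


# Synthetic data (assumed): 2-D subspace in R^8 with two corrupted entries.
rng = np.random.default_rng(0)
N, p, q = 200, 8, 2
U0 = np.eye(p)[:, :q]
Y = rng.standard_normal((N, q)) @ U0.T + 0.01 * rng.standard_normal((N, p))
Y[5, 3] += 5.0
Y[17, 6] -= 5.0

m, U, S, O, history = robust_pca_l1(Y, q, lam=1.0)
print(history[0], history[-1])
```

Each update minimizes the cost exactly over one block of variables with the others held fixed, so the recorded cost does not increase from one iteration to the next.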
Refinements
Nonconvex penalty terms approximate the ℓ0-(pseudo)norm in (P0) better
Options: SCAD [Fan-Li’01], or sum-of-logs [Candes et al’08]
Iterative linearization-minimization of the nonconvex penalty around the current iterate
Iteratively reweighted version of Alg. 1
Warm start: solution of (P1) or (P2)
Bias reduction in the outlier estimates (cf. weighted Lasso [Zou’06])
Discard the outliers identified
Re-estimate: a missing data problem
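The reweighting idea can be illustrated for the sum-of-logs penalty. Linearizing log(|o| + delta) around a previous estimate gives the weight 1 / (|o_prev| + delta), so the next outlier update is a soft-thresholding step with entry-dependent thresholds. The sketch below shows that single step; the function name, `delta`, and the toy numbers are assumptions.

```python
import numpy as np


def reweighted_threshold(R, O_prev, lam, delta=1e-2):
    """Weighted soft-thresholding of residuals R using weights from O_prev."""
    # Entries that were large before get small weights (little shrinkage);
    # entries that were zero before get large weights (strong shrinkage).
    W = 1.0 / (np.abs(O_prev) + delta)
    return np.sign(R) * np.maximum(np.abs(R) - lam * W / 2.0, 0.0)


# Toy values (assumed): one entry flagged in the warm start, one not.
R = np.array([[4.0, 0.3]])
O_prev = np.array([[3.5, 0.0]])
O_new = reweighted_threshold(R, O_prev, lam=0.1)
print(O_new)
```

The previously flagged entry is barely shrunk, which is the bias-reduction effect, while the other entry stays at zero.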
Online robust PCA
Motivation: real-time data and memory limitations
Exponentially-weighted robust PCA
Approximation [Yang’95]: at each time instant, do not re-estimate past outlier vectors
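An exponentially weighted cost of this kind could look as follows. This is a hedged sketch: the forgetting factor $\beta$, the penalty weight $\lambda_2$, and the exact form of the regularizer are assumptions rather than the slide's equation.

```latex
% Assumed notation: forgetting factor 0 < \beta \le 1, current time n
\begin{equation*}
\min_{m,\,U,\,\{s_i\},\,\{o_i\}}\ \sum_{i=1}^{n} \beta^{\,n-i}
\Bigl[ \bigl\| y_i - m - U s_i - o_i \bigr\|_2^2 + \lambda_2 \| o_i \|_2 \Bigr]
\end{equation*}
% Approximation: the outlier vectors o_i for i < n are held at the values
% estimated when y_i arrived, so only the newest datum needs processing.
```

Choosing $\beta < 1$ discounts old data, which lets the estimate follow a slowly changing subspace with bounded memory.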
Video surveillance
Figure panels: original; PCA; robust PCA; ‘outliers’
Data: http://www.cs.cmu.edu/~ftorre/
Online PCA in action
Figure axis: angle between C(n) and C
Inliers:
Outliers:
Figure of merit: angle between C(n) and C
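An angle between two subspaces can be computed from principal angles. The slide does not say which definition is used, so the sketch below assumes the largest principal angle between the column spaces of two basis matrices; the function name is hypothetical.

```python
import numpy as np


def largest_principal_angle(A, B):
    """Largest principal angle (radians) between span(A) and span(B)."""
    # Orthonormalize both bases, then read cosines off the singular values.
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return float(np.arccos(np.clip(cosines.min(), -1.0, 1.0)))


# Two lines in the plane, 45 degrees apart.
A = np.array([[1.0], [0.0]])
B = np.array([[1.0], [1.0]])
angle = largest_principal_angle(A, B)

# Same subspace described by two different bases.
C1 = np.eye(3)[:, :2]
C2 = C1 @ np.array([[2.0, 1.0], [0.0, 3.0]])
same = largest_principal_angle(C1, C2)
print(angle, same)
```

The measure depends only on the subspaces, not on the particular bases, so it is suited to comparing a tracked estimate against a reference subspace.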
Concluding summary
Sparsity control for robust PCA
LTS PCA as ℓ0-(pseudo)norm regularized regression (NP-hard)
Relaxation: (group)-Lassoed PCA, an M-type estimator
Sparsity-controlling role of the tuning parameter is central
Tests on real video surveillance data for anomaly extraction
Batch and online robust PCA algorithms
i) Outlier identification, ii) Robust subspace tracking
Refinements via nonconvex penalty terms
Ongoing research
Preference measurement: conjoint analysis and collaborative filtering
Robustifying kernel PCA and blind dictionary learning