EE 290A: Generalized Principal Component Analysis
Lecture 2 (by Allen Y. Yang): Extensions of PCA
Sastry & Yang © Spring, 2011
Last time
Challenges in modern data clustering problems.
PCA reduces dimensionality of the data while retaining as much data variation as possible.
Statistical view: The first d PCs are given by the d leading eigenvectors of the covariance.
Geometric view: Fitting a d-dim subspace model via SVD.
This lecture
Determine an optimal number of PCs: d
Probabilistic PCA
Kernel PCA
(Robust PCA shall be discussed later.)
Determine the number of PCs
Choosing the optimal number of PCs in the noise-free case is straightforward:
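The slide's formula did not survive extraction; as a standard reconstruction, in the noise-free case the number of PCs is simply the rank of the mean-subtracted data matrix X \in \mathbb{R}^{D \times n}, read off its singular values:

    d = \operatorname{rank}(X), \qquad \sigma_1 \ge \cdots \ge \sigma_d > \sigma_{d+1} = \cdots = \sigma_D = 0.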
In the noisy case
With noise, all singular values are nonzero, so a common heuristic picks d at the knee point of the singular-value spectrum, where the values stop dropping sharply.
[Figure: singular-value spectrum of a noisy data matrix, with the knee point marked]
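A minimal Python sketch of this heuristic (the threshold rule drop_ratio is an illustrative assumption, not the lecture's prescription):

    import numpy as np

    def estimate_d_knee(X, drop_ratio=0.1):
        """Estimate the number of PCs from a knee in the singular-value
        spectrum of the mean-subtracted data matrix X (D x n).
        `drop_ratio` is an illustrative threshold, not from the lecture."""
        Xc = X - X.mean(axis=1, keepdims=True)   # subtract the sample mean
        s = np.linalg.svd(Xc, compute_uv=False)  # singular values, descending
        # declare the knee where a singular value falls below drop_ratio
        # times the largest one
        below = np.nonzero(s < drop_ratio * s[0])[0]
        return int(below[0]) if below.size else len(s)

    # Example: a 3-dim subspace in R^10 with mild Gaussian noise
    rng = np.random.default_rng(0)
    X = rng.normal(size=(10, 3)) @ rng.normal(size=(3, 200))
    X += 0.05 * rng.normal(size=X.shape)
    print(estimate_d_knee(X))  # typically prints 3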
A Model Selection Problem
With moderate Gaussian noise, preserving 100% fidelity of the data requires keeping all D dimensions.
However, can we still find a tradeoff between model complexity and data fidelity?
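One way to make the tradeoff concrete, in the spirit of the GPCA textbook treatment (reconstructed here; the weight \kappa > 0 is a free parameter balancing the two terms):

    \hat{d} = \arg\min_d \; \frac{\sigma_{d+1}^2}{\sum_{i=1}^{d} \sigma_i^2} + \kappa\, d,

where the first term measures residual energy relative to what the first d PCs capture (data fidelity) and the second penalizes model complexity.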
More principled conditions
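The details of this slide did not survive extraction. Classical principled choices are information-theoretic model-selection criteria such as BIC/MDL: for a model with k free parameters and maximized likelihood \hat{L} over n samples,

    \mathrm{BIC} = -2 \ln \hat{L} + k \ln n,

and one selects the dimension d whose model minimizes the score.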
Probabilistic PCA: A generative approach
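The model equation (*) itself did not survive extraction; in the Tipping-Bishop formulation that this slide follows, each observation x \in \mathbb{R}^D is generated from a d-dimensional latent variable y:

    x = W y + \mu + \varepsilon, \qquad (*)

with principal axes W \in \mathbb{R}^{D \times d}, mean \mu, and noise \varepsilon.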
Given only sample statistics, the generative model (*) contains ambiguities.
Assume y is standard normal and ε is isotropic Gaussian noise.
Then each observation is also Gaussian.
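Concretely (standard PPCA, stated as a reconstruction): with y \sim \mathcal{N}(0, I_d) and \varepsilon \sim \mathcal{N}(0, \sigma^2 I_D) independent,

    x \sim \mathcal{N}(\mu, C), \qquad C = W W^{\top} + \sigma^2 I_D.

The ambiguity is that W is determined only up to rotation: W R produces the same C for any orthogonal R.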
Determining principal axes by MLE
Compute the log-likelihood for n samples
Setting the gradient of L to zero gives the stationary points.
There are two nontrivial solutions:
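The formulas did not survive extraction; the standard PPCA results (Tipping & Bishop) that the slide presumably states are

    \mathcal{L} = -\frac{n}{2}\left( D \ln 2\pi + \ln\det C + \operatorname{tr}(C^{-1} S) \right), \qquad S = \frac{1}{n}\sum_{i=1}^{n} (x_i - \mu)(x_i - \mu)^{\top},

and, besides W = 0, setting \partial \mathcal{L}/\partial W = 0 yields

    W_{\mathrm{ML}} = U_d (\Lambda_d - \sigma^2 I)^{1/2} R, \qquad \sigma^2_{\mathrm{ML}} = \frac{1}{D - d} \sum_{j=d+1}^{D} \lambda_j,

where U_d and \Lambda_d hold the top d eigenvectors/eigenvalues of the sample covariance S, and R is an arbitrary orthogonal matrix.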
Kernel PCA: for nonlinear data
Nonlinear embedding
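As a concrete illustration (an assumed example, not necessarily the slide's): points on the unit circle x_1^2 + x_2^2 = 1 obey no linear relation in \mathbb{R}^2, but under the quadratic embedding

    \phi(x) = (x_1^2, \; x_1 x_2, \; x_2^2)^{\top}

they satisfy \phi_1(x) + \phi_3(x) = 1, i.e. they lie on an affine hyperplane in the embedded space, so linear (PCA-style) tools apply.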
Question: How to recover the coefficients? Compute the null space of the embedded data matrix.
The special polynomial embedding is called the Veronese map
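Reconstructed in the standard GPCA notation: the Veronese map of degree n stacks all monomials of degree n in the entries of x,

    \nu_n(x) = (x_1^n, \; x_1^{n-1} x_2, \; \ldots, \; x_D^n)^{\top},

and a polynomial p(x) = c^{\top} \nu_n(x) vanishes on every sample x_1, \ldots, x_m exactly when

    V c = 0, \qquad V = [\nu_n(x_1), \ldots, \nu_n(x_m)]^{\top},

so the coefficient vector c is read off the null space of the embedded data matrix V.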
Dimensionality Issue in Embedding
Given D and order n, what is the dimension of the Veronese map?
Often the dimension blows up with large D or n.
Question: Can we find the higher-order nonlinear structures without explicitly calling the embedding function?
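The count is a standard fact: the number of degree-n monomials in D variables is

    M_n(D) = \binom{n + D - 1}{n},

e.g. D = 10 and n = 4 already give \binom{13}{4} = 715 dimensions, and the count grows combinatorially in both D and n.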
Nonlinear PCA
Nonlinear PCs
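The formulas on this slide did not survive extraction; in the standard formulation, NLPCA is ordinary PCA applied after the embedding: stack the (mean-subtracted) embedded samples as \Phi = [\phi(x_1), \ldots, \phi(x_n)] \in \mathbb{R}^{M \times n} and take the nonlinear PCs from the leading eigenvectors of the embedded covariance \frac{1}{n} \Phi \Phi^{\top}.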
In the case where M is much larger than n
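The standard workaround, reconstructed here: rather than eigendecompose the M x M matrix \Phi \Phi^{\top}, eigendecompose the n x n Gram matrix \Phi^{\top} \Phi. If \Phi^{\top}\Phi\, v = \lambda v with \lambda > 0 and \|v\| = 1, then

    u = \frac{\Phi v}{\sqrt{\lambda}} \quad \text{satisfies} \quad (\Phi \Phi^{\top})\, u = \lambda u, \qquad \|u\| = 1,

so the leading eigenvectors of the huge matrix are recovered from the small one.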
Kernel PCA
Computations in NLPCA involve only inner products of the embedded samples, never the embedded vectors themselves.
Therefore, PCA can be carried out in the embedded space without explicitly evaluating the embedding function.
The inner product of two embedded samples is called the kernel function.
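Standard examples (not necessarily the slide's list): the polynomial kernel k(x, z) = (x^{\top} z + 1)^n, which corresponds to a polynomial embedding of degree up to n, and the Gaussian kernel k(x, z) = \exp(-\|x - z\|^2 / 2\sigma^2), whose embedding is infinite-dimensional.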
Computing NLPCs via Kernel Matrix
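A minimal Python sketch of this computation, assuming only a kernel function is available (the centering step K <- HKH is the standard one; the slide's own derivation did not survive extraction):

    import numpy as np

    def kernel_pca(X, kernel, d):
        """Nonlinear PCs of the columns of X (D x n) via the kernel trick.
        `kernel(x, z)` returns the inner product of the embedded samples."""
        n = X.shape[1]
        # kernel (Gram) matrix over all sample pairs
        K = np.array([[kernel(X[:, i], X[:, j]) for j in range(n)]
                      for i in range(n)])
        # center the embedded data implicitly: K <- H K H, H = I - 11^T / n
        H = np.eye(n) - np.ones((n, n)) / n
        Kc = H @ K @ H
        # eigendecomposition of the centered kernel matrix
        lam, V = np.linalg.eigh(Kc)
        lam, V = lam[::-1], V[:, ::-1]        # sort descending
        lam = np.clip(lam, 1e-12, None)       # guard tiny negative eigenvalues
        # projections of the samples onto the top d nonlinear PCs
        return Kc @ V[:, :d] / np.sqrt(lam[:d])

    # Example with a degree-2 polynomial kernel
    rng = np.random.default_rng(0)
    X = rng.normal(size=(2, 100))
    Y = kernel_pca(X, lambda x, z: (x @ z + 1) ** 2, d=3)
    print(Y.shape)  # (100, 3)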