Advanced Introduction to
Machine Learning CMU-10715
Principal Component Analysis
Barnabás Póczos
Contents
Motivation
PCA algorithms
Applications
Some of these slides are taken from
• Karl Booksh Research group
• Tom Mitchell
• Ron Parr
Motivation
• Data Visualization
• Data Compression
• Noise Reduction
PCA Applications
Data Visualization
Example:
• Given 53 blood and urine measurements (features) for 65 people (instances).
• How can we visualize the measurements?
• Matrix format (65x53)
H-WBC H-RBC H-Hgb H-Hct H-MCV H-MCH H-MCHC
A1 8.0000 4.8200 14.1000 41.0000 85.0000 29.0000 34.0000
A2 7.3000 5.0200 14.7000 43.0000 86.0000 29.0000 34.0000
A3 4.3000 4.4800 14.1000 41.0000 91.0000 32.0000 35.0000
A4 7.5000 4.4700 14.9000 45.0000 101.0000 33.0000 33.0000
A5 7.3000 5.5200 15.4000 46.0000 84.0000 28.0000 33.0000
A6 6.9000 4.8600 16.0000 47.0000 97.0000 33.0000 34.0000
A7 7.8000 4.6800 14.7000 43.0000 92.0000 31.0000 34.0000
A8 8.6000 4.8200 15.8000 42.0000 88.0000 33.0000 37.0000
A9 5.1000 4.7100 14.0000 43.0000 92.0000 30.0000 32.0000
(rows: instances, columns: features)
Difficult to see the correlations between the features...
Data Visualization
• Spectral format (65 curves, one for each person)
[Plot: one curve per person; x-axis: measurement (0–60), y-axis: value (0–1000)]
Difficult to compare the different patients...
Data Visualization
[Plot: H-Bands per person; x-axis: person (0–70), y-axis: value (0–1.8)]
• Spectral format (53 pictures, one for each feature)
Difficult to see the correlations between the features...
Data Visualization
[Left, bi-variate plot: C-LDH vs. C-Triglycerides. Right, tri-variate plot: M-EPI vs. C-Triglycerides and C-LDH]
Bi-variate                Tri-variate
How can we visualize the other variables???
… difficult to see in 4 or higher dimensional spaces...
Data Visualization
• Is there a representation better than the coordinate axes?
• Is it really necessary to show all 53 dimensions?
• … what if there are strong correlations between the features?
• How could we find the smallest subspace of the 53-D space that keeps the most information about the original data?
• A solution: Principal Component Analysis
Data Visualization
PCA Algorithms
Orthogonal projection of the data onto a lower-dimensional linear space that...
• maximizes the variance of the projected data (purple line), and equivalently
• minimizes the mean squared distance between each data point and its projection (sum of the blue lines)
PCA:
Principal Component Analysis
Idea:
Given data points in a d-dimensional space, project them into a lower dimensional space while preserving as much information as possible.
• Find best planar approximation of 3D data
• Find the best 12-D approximation of 10⁴-D data
In particular, choose projection that minimizes squared error in reconstructing the original data.
Principal Component Analysis
PCA vectors originate from the center of mass.
Principal component #1 points in the direction of the largest variance.
Each subsequent principal component
• is orthogonal to the previous ones, and
• points in the direction of the largest variance of the residual subspace
Principal Component Analysis
Properties:
2D Gaussian dataset
1st PCA axis
2nd PCA axis
Given the centered data {x_1, …, x_m}, compute the principal vectors:

1st PCA vector — to find w_1, maximize the variance of the projection of x:

  w_1 = argmax_{||w||=1} (1/m) · sum_{i=1}^m (w^T x_i)^2

2nd PCA vector — to find w_2, maximize the variance of the projection in the residual subspace:

  w_2 = argmax_{||w||=1} (1/m) · sum_{i=1}^m [w^T (x_i − w_1 w_1^T x_i)]^2

Here x' = w_1 (w_1^T x) is the PCA reconstruction of x, and x − x' is the residual.

PCA algorithm I (sequential)
Given w_1, …, w_{k−1}, we compute the k-th principal vector as before, maximizing the variance of the projection in the residual subspace:

  w_k = argmax_{||w||=1} (1/m) · sum_{i=1}^m [w^T (x_i − sum_{j=1}^{k−1} w_j w_j^T x_i)]^2

Here x' = w_1 (w_1^T x) + w_2 (w_2^T x) + … is the PCA reconstruction of x.

PCA algorithm I (sequential)
• Given data {x_1, …, x_m}, compute the covariance matrix Σ:

  Σ = (1/m) · sum_{i=1}^m (x_i − x̄)(x_i − x̄)^T,  where  x̄ = (1/m) · sum_{i=1}^m x_i

• The PCA basis vectors are the eigenvectors of Σ.
• The larger the eigenvalue, the more important the corresponding eigenvector.

PCA algorithm II (sample covariance matrix)
PCA(X, k): top k eigenvalues/eigenvectors
% X = N × m data matrix,
% … each data point x_i = a column vector, i = 1..m
• x̄ ← (1/m) · sum_{i=1}^m x_i
• X ← subtract the mean x̄ from each column vector x_i in X
• Σ ← X X^T … the covariance matrix of X
• {λ_i, u_i}_{i=1..N} ← eigenvectors/eigenvalues of Σ, sorted so that λ_1 ≥ λ_2 ≥ … ≥ λ_N
• Return {λ_i, u_i}_{i=1..k} % the top k PCA components

PCA algorithm II (sample covariance matrix)
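A NumPy sketch of this algorithm, assuming the slide's convention that X stores one data point per column (`pca` is a hypothetical helper name):

```python
import numpy as np

def pca(X, k):
    """Top-k PCA via the sample covariance matrix.
    X: (N, m) data matrix with one data point per column."""
    xbar = X.mean(axis=1, keepdims=True)          # mean vector
    Xc = X - xbar                                 # center each column
    Sigma = (Xc @ Xc.T) / X.shape[1]              # (1/m) sum (x_i - xbar)(x_i - xbar)^T
    lam, U = np.linalg.eigh(Sigma)                # eigh returns ascending eigenvalues
    order = np.argsort(lam)[::-1]                 # re-sort: lambda_1 >= ... >= lambda_N
    return lam[order][:k], U[:, order][:, :k]

# usage: top-2 components of random 5-D data
rng = np.random.default_rng(1)
X = rng.normal(size=(5, 200))
lam, U = pca(X, 2)
```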
Finding Eigenvectors with Power Iteration

First eigenvector — repeat until convergence (power iteration 1, 2, …):
  v = Sigma*v;
  v = v/sqrt(v'*v);
⇒ v → vPCA1

First eigenvalue: lambda1 = vPCA1'*Sigma*vPCA1

Second eigenvector — deflate the covariance and repeat the iteration:
  Sigma2 = Sigma - lambda1*vPCA1*vPCA1';
  v = Sigma2*v;
  v = v/sqrt(v'*v);
⇒ v → vPCA2
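A NumPy rendering of the snippet above (toy 4-D covariance; the fixed iteration count stands in for a proper convergence test):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 50))
Sigma = (A @ A.T) / A.shape[1]                 # a sample covariance matrix (4 x 4)

def power_iteration(S, iters=1000):
    v = rng.normal(size=S.shape[0])
    for _ in range(iters):
        v = S @ v                              # v = Sigma*v;
        v = v / np.sqrt(v @ v)                 # v = v/sqrt(v'*v);
    return v

v1 = power_iteration(Sigma)                    # first eigenvector (vPCA1)
lam1 = v1 @ Sigma @ v1                         # vPCA1'*Sigma*vPCA1
Sigma2 = Sigma - lam1 * np.outer(v1, v1)       # Sigma2 = Sigma - lambda1*vPCA1*vPCA1'
v2 = power_iteration(Sigma2)                   # second eigenvector (vPCA2)
```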
Singular Value Decomposition of the centered data matrix X (features × samples):

  X = U S V^T

The leading singular values and vectors capture the significant structure of the data; the trailing ones mostly capture noise.

PCA algorithm III (SVD of the data matrix)
• Columns of U
• the principal vectors, { u(1), …, u(k) }
• orthonormal – so U^T U = I
• Can reconstruct the data using linear combinations of { u(1), …, u(k) }
• Matrix S
• Diagonal
• Shows importance of each eigenvector
• Columns of VT
• The coefficients for reconstructing the samples
PCA algorithm III
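A sketch of Algorithm III in NumPy; the relation λ_i = s_i²/m connects the singular values back to the covariance eigenvalues of Algorithm II:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 40))                   # features x samples
Xc = X - X.mean(axis=1, keepdims=True)          # the SVD is taken of the centered matrix

U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
# columns of U: principal vectors; diagonal s: importance; columns of Vt: coefficients

k = 3
Xk = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]      # rank-k reconstruction of the data

lam = s ** 2 / Xc.shape[1]                      # eigenvalues of the sample covariance
```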
Applications
Want to identify specific person, based on facial image
Robust to glasses, lighting,…
Can’t just use the given 256 x 256 pixels
Face Recognition
Method A: Build a PCA subspace for each person and check which subspace can reconstruct the test image the best
Method B: Build one PCA database for the whole dataset and then classify based on the weights.
Applying PCA: Eigenfaces
Example data set: Images of faces
• Eigenface approach [Turk & Pentland], [Sirovich & Kirby]
Each face x is …
• 256 × 256 values (luminance at each location)
• x ∈ R^65536 (view it as a 64K-dimensional vector)
Form X = [x_1, …, x_m], the centered data matrix
Compute Σ = X X^T
Problem: Σ is 64K × 64K … HUGE!!!
Applying PCA: Eigenfaces
[X = [x_1, …, x_m]: m face columns, each a vector of 256 × 256 real values]
Suppose m instances, each of size N
• Eigenfaces: m = 500 faces, each of size N = 64K
Given the N × N covariance matrix Σ, we can compute
• all N eigenvectors/eigenvalues in O(N³)
• the first k eigenvectors/eigenvalues in O(kN²)
If N = 64K, this is EXPENSIVE!
Computational Complexity
• Note that m ≪ 64K
• Use L = X^T X instead of Σ = X X^T
• If v is an eigenvector of L, then Xv is an eigenvector of Σ

Proof:   L v = λ v
         X^T X v = λ v
         X (X^T X v) = X (λ v) = λ (Xv)
         (X X^T) (X v) = λ (Xv)
         Σ (Xv) = λ (Xv)
A Clever Workaround
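The trick is easy to verify numerically; a sketch with N ≫ m toy data standing in for the face matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
N, m = 1000, 5                       # N >> m, as with 64K-pixel faces and a few hundred images
X = rng.normal(size=(N, m))
X -= X.mean(axis=1, keepdims=True)   # center: subtract the mean column

L = X.T @ X                          # small m x m matrix instead of the huge N x N one
lam, V = np.linalg.eigh(L)           # ascending eigenvalues
u = X @ V[:, -1]                     # lift the top eigenvector: v -> Xv
u /= np.linalg.norm(u)               # normalize, since ||Xv|| != 1 in general
```

By the proof above, u satisfies Σu = λu for λ the top eigenvalue of L, without ever forming the N × N matrix.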
Principal Components (Method B)
Requires carefully controlled data:
• All faces centered in frame
• Same size
• Some sensitivity to angle
Method is completely knowledge free
• (sometimes this is good!)
• Doesn’t know that faces are wrapped around 3D objects (heads)
• Makes no effort to preserve class distinctions
Shortcomings
Happiness subspace
Disgust subspace
Facial Expression Recognition Movies
Image Compression
Divide the original 372 × 492 image into patches:
• each patch is an instance containing 12 × 12 pixels on a grid
Consider each patch as a 144-D vector.
Original Image
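The patch pipeline can be sketched as follows; a random array stands in for the actual picture, and the split uses non-overlapping patches (one reading of "on a grid"):

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((372, 492))                     # stand-in for the 372x492 image
P = 12

# cut the image into non-overlapping 12x12 patches; each patch -> one 144-D instance
patches = np.array([img[r:r + P, c:c + P].ravel()
                    for r in range(0, img.shape[0], P)
                    for c in range(0, img.shape[1], P)])

mean = patches.mean(axis=0)
Xc = patches - mean
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

k = 60                                           # e.g. compress 144-D -> 60-D
codes = Xc @ Vt[:k].T                            # k coefficients per patch
recon = codes @ Vt[:k] + mean                    # patches reconstructed from the codes
```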
L2 reconstruction error vs. PCA dimension
PCA compression: 144D → 60D
PCA compression: 144D → 16D
16 most important eigenvectors
PCA compression: 144D → 6D
6 most important eigenvectors
PCA compression: 144D → 3D
3 most important eigenvectors
PCA compression: 144D → 1D
Looks like the discrete cosine basis used by JPEG!...
60 most important eigenvectors
http://en.wikipedia.org/wiki/Discrete_cosine_transform
2D Discrete Cosine Basis
Noise Filtering
Project x onto the leading principal components and reconstruct: x' = U (U^T x). Keeping only the significant columns of U filters the noise out of x.
Noise Filtering
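A minimal sketch of this kind of filtering on synthetic data (a 3-D subspace plus Gaussian noise; the noise level 0.1 is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(0)
basis = rng.normal(size=(50, 3))
clean = basis @ rng.normal(size=(3, 200))        # 50-D points lying in a 3-D subspace
noisy = clean + 0.1 * rng.normal(size=clean.shape)

mu = noisy.mean(axis=1, keepdims=True)
Xc = noisy - mu
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

k = 3
Uk = U[:, :k]                                    # keep only the significant components
denoised = Uk @ (Uk.T @ Xc) + mu                 # x' = U (U^T x), plus the mean
```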
Noisy image
Denoised image using 15 PCA components
PCA Shortcomings
PCA doesn't know about class labels!
Problematic Data Set for PCA
• PCA maximizes variance, independent of the classes (magenta line)
• FLD (Fisher Linear Discriminant) attempts to separate the classes (green line)
PCA vs Fisher Linear Discriminant
PCA cannot capture NON-LINEAR structure!
Problematic Data Set for PCA
PCA:
• finds an orthonormal basis for the data
• sorts dimensions in order of "importance"
• discards low-significance dimensions

Applications:
• get a compact description
• remove noise
• improve classification (hopefully)

Not magic:
• doesn't know class labels
• can only capture linear variations

One of many tricks to reduce dimensionality!
PCA Conclusions
PCA Theory
GOAL:
Justification of Algorithm II
x is centered!
Justification of Algorithm II
GOAL:
Use Lagrange-multipliers for the constraints.
Justification of Algorithm II
Justification of Algorithm II
Thanks for the Attention!
Attic
Kernel PCA
Performing PCA in the feature space
Lemma
Proof:
Kernel PCA
Lemma
Kernel PCA
Proof
Kernel PCA
How can we use this to calculate the projection of a new sample t?
Where was I cheating?
The data should be centered in the feature space, too! But this is manageable...
Kernel PCA
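A sketch of the whole procedure with an RBF kernel, including the feature-space centering of the Gram matrix just mentioned (bandwidth and component count are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                    # 100 samples in 2-D

# RBF kernel Gram matrix: K_ij = exp(-||x_i - x_j||^2 / 2)
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq / 2.0)

# center in the feature space: Kc = K - 1K - K1 + 1K1, with 1 = (1/m) * ones
m = len(X)
one = np.full((m, m), 1.0 / m)
Kc = K - one @ K - K @ one + one @ K @ one

lam, A = np.linalg.eigh(Kc)
lam, A = lam[::-1], A[:, ::-1]                   # sort eigenvalues descending
alpha = A[:, :2] / np.sqrt(lam[:2])              # scale so feature-space axes have unit norm
Z = Kc @ alpha                                   # samples projected onto the top 2 kernel PCs
```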
http://en.wikipedia.org/wiki/Kernel_principal_component_analysis
Input points before kernel PCA
The three groups are distinguishable using the first component only
Output after kernel PCA
… faster if we train with…
• only people w/out glasses
• same lighting conditions
Reconstructing… (Method B)