Lec 13: Low Dimension Embedding


Transcript of Lec 13: Low Dimension Embedding

Page 1: Lec 13: Low Dimension Embedding · Laplacian Eigenmap: direct data embedding without explicit projection (input data -> affinity graph -> graph Laplacian eigenvectors -> embedding)

Spring 2020: Venue: Haag 315, Time: M/W 4-5:15pm

ECE 5582 Computer Vision, Lec 13: Low Dimension Embedding

Zhu Li, Dept of CSEE, UMKC

Office: FH560E, Email: [email protected], Ph: x 2346. http://l.web.umkc.edu/lizhu


Slides created with WPS Office Linux and the EqualX LaTeX equation editor.

Page 2:

Outline

Recap: Part I

Linear Algebra Refresher
SVD and Principal Component Analysis (PCA)
Laplacian Eigen Map (LEM)
Stochastic Neighbor Embedding (SNE)

Page 3:

Handcrafted Feature Pipeline

An image retrieval pipeline (hand crafted features)

Image Formation: Homography, Color space
Feature Computing: Color histogram, Filtering, Edge Detection, HoG, Harris Detector, LoG Scale space, SIFT
Feature Aggregation: BoW, VLAD, Fisher Vector, Supervector
Classification: kNN, Bayesian, SVM, Kernel Machine
Knowledge/Data Base; retrieval evaluation: TPR, FPR, Precision, Recall, mAP

Page 4:

Vector and Matrix Notations

Vector

Matrix
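The notation itself did not survive the transcript; a standard rendering consistent with the rest of the lecture:

x = [x_1, x_2, \ldots, x_n]^T \in \mathbb{R}^n \; (\text{column vector}), \qquad A = [a_{ij}] \in \mathbb{R}^{m \times n} \; (\text{m rows, n columns, with columns } a_1, \ldots, a_n)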

Page 5:

Vector Products

Inner Product

Outer Product
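In standard notation (the slide's equations were lost in extraction), for x in R^n and y in R^n (inner) or y in R^m (outer):

x^T y = \sum_{i=1}^{n} x_i y_i \in \mathbb{R}, \qquad x y^T \in \mathbb{R}^{n \times m}, \;\; (x y^T)_{ij} = x_i y_j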

Page 6:

Matrix-Vector Product

y=Ax

So y is a linear combination of the columns {a_k} of A, with weights given by the entries of x.
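Written out, with a_k the k-th column of A:

y = A x = \sum_{k} x_k \, a_k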

Page 7:

Matrix Product

C=AB

Associative: ABC = (AB)C = A(BC)

Distributive: A(B+C) = AB + AC

Dimensions: A is n×p, B is p×m, so C = AB is n×m.
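Entrywise:

c_{ij} = \sum_{k=1}^{p} a_{ik} \, b_{kj}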

Page 8:

Outer Product/Kron

Vector outer product:

Example
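A small worked example (my own numbers, standing in for the slide's lost example), plus the Kronecker product pattern:

x y^T: \begin{bmatrix}1\\2\end{bmatrix}\begin{bmatrix}3 & 4\end{bmatrix} = \begin{bmatrix}3 & 4\\ 6 & 8\end{bmatrix}, \qquad A \otimes B = \begin{bmatrix} a_{11}B & a_{12}B \\ a_{21}B & a_{22}B \end{bmatrix}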

Page 9:

Matrix Transpose

Transpose
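The defining property and the standard rules, since the slide's equations did not survive:

(A^T)_{ij} = A_{ji}, \qquad (A^T)^T = A, \qquad (AB)^T = B^T A^T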

Page 10:

Matrix Trace and Determinant

Trace Tr(A): defined only for an n×n square matrix; the sum of the diagonal entries of A.

Determinant Det(A): the (signed) volume spanned by the columns of A, i.e., by all possible linear combinations of a1 and a2 (in the 2×2 case).


Det(A) = |2-9| = 7;

Page 11:

Eigen Values and Eigen Vectors

Definition: for an n×n matrix A, a non-zero vector v is an eigenvector with eigenvalue λ if A v = λ v.

In Matlab: [P, V]=eig(A);
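A minimal check of this call on a toy matrix (the example matrix is my own, not the slide's):

A = [2 1; 1 3];
[P, V] = eig(A);        % columns of P: eigenvectors; diag(V): eigenvalues
norm(A*P - P*V)         % ~0, i.e., A*p_k = lambda_k * p_k for each column k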

Page 12:

Eigen Vectors of Symmetric Matrix

If a square matrix A (n×n) is symmetric, A = A^T,
then its eigenvalues are real and its eigenvectors are orthonormal:
A = U S U^T
where S is a diagonal matrix with the eigenvalues of A, and the columns of U are the corresponding orthonormal eigenvectors.

Application: solution to the quadratic form maximization, max x^T A x subject to ||x|| = 1:
the maximum value will be the largest eigenvalue of A, and the maximizer x* will be the corresponding eigenvector.
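In equation form:

\max_{\|x\|_2 = 1} x^T A x = \lambda_{\max}(A), \qquad x^* = u_{\max} \; (\text{the eigenvector of } \lambda_{\max})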

Page 13:

SVD for a non-square matrix A (m×n):

A = U Σ V^T

Page 14:

SVD as Signal Expansion
The Singular Value Decomposition (SVD) of an m×n matrix A is:
where the diagonal of S holds the singular values [σ1, σ2, …, σr] (the square roots of the eigenvalues of A A^T), the columns of U are eigenvectors of A A^T, and the columns of V are eigenvectors of A^T A;
the outer products u_i v_i^T are the basis of the reconstruction of A:


A = U S V^T = σ1 u1 v1^T + σ2 u2 v2^T + … + σr ur vr^T

A (m×n) = U (m×m) S (m×n) V^T (n×n)

The 1st-order SVD approximation of A is:

A1 = σ1 · U(:,1) · V(:,1)^T

Page 15:

SVD approximation of an image

Very easy…
function [x] = svd_approx(x0, k)
dbg = 0;
if dbg
    x0 = fix(100*randn(4,6)); k = 2;
end
[u, s, v] = svd(x0);                  % full SVD of the input
[m, n] = size(s);
x = zeros(m, n);
sgm = diag(s);                        % vector of singular values
for j = 1:k
    x = x + sgm(j)*u(:,j)*v(:,j)';    % add the j-th rank-1 term: sigma_j * u_j * v_j'
end

Page 16:

SVD for Separable Filtering

Take the LoG filter for example:
h = fspecial('LoG', 11, 2.0);       % 11x11 LoG kernel, sigma = 2.0
[u, s, v] = svd(h);
h1 = s(1,1)*u(:,1)*v(:,1)';         % rank-1 (separable) approximation


h1 is the rank-1 SVD approximation of the LoG filter.

Many implications for deep network acceleration!
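Because h1 is rank-1 (an outer product), filtering with it separates into a 1-D column pass and a 1-D row pass; a minimal sketch, assuming an input image img and the u, s, v from the svd(h) call above:

col = sqrt(s(1,1)) * u(:,1);                     % 1-D column filter
row = sqrt(s(1,1)) * v(:,1)';                    % 1-D row filter
y_sep = conv2(col, row, double(img), 'same');    % same result as conv2(double(img), h1, 'same')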

Page 17:

Norm
Vector norm: the length of a vector.
Euclidean norm (L2 norm): norm(x, 2)
Lp norm: ||x||_p = (sum_i |x_i|^p)^(1/p)
Matrix norm: Frobenius norm ||A||_F = sqrt(sum_{i,j} a_ij^2) = sqrt(Tr(A^T A))

Page 18:

Quadratic Form

Quadratic form f(x) = x^T A x, f: R^n -> R:
Positive Definite (PD): for all non-zero x, x^T A x > 0
Positive Semi-Definite (PSD): for all non-zero x, x^T A x >= 0
Indefinite: there exist non-zero x1, x2 with x1^T A x1 > 0 while x2^T A x2 < 0

Page 19:

Matrix Calculus

Gradient of f(A):

Matrix Gradient Properties
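The identities themselves did not survive the transcript; some standard ones that the rest of the lecture relies on:

\nabla_x (b^T x) = b, \qquad \nabla_x (x^T A x) = (A + A^T) x, \qquad \nabla_A \, \mathrm{Tr}(AB) = B^T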

Page 20:

Hessian of f(X)

For a function f: R^n -> R

Gradient & Hessian of the quadratic form f(x) = x^T A x:
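The standard results for this quadratic form:

\nabla_x f(x) = (A + A^T) x = 2 A x \; (\text{for symmetric } A), \qquad \nabla_x^2 f(x) = A + A^T = 2 A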

Page 21:

PCA - Dimension Reduction

A typical image retrieval pipeline

Image Formation -> Feature Computing -> Feature Aggregation -> Classification, backed by a Knowledge/Data Base
e.g., dense SIFT: 12000 x 128; e.g., Fisher Vector: k=64, d=128
Dimension reduction: R^d -> R^p

Page 22:

Outline

Recap: Part I

Linear Algebra Refresher
SVD and Principal Component Analysis (PCA)
Laplacian Eigen Map (LEM)
Stochastic Neighbor Embedding (SNE)

Page 23:

Principal Component Analysis

The formulation: for data points {x1, x2, …} in R^n, find a lower-dimensional representation in R^m via a projection W (m×n), such that the energy of the data is preserved.
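One standard way to write this objective (assuming zero-mean data with scatter/covariance matrix S):

\max_{W} \; \mathrm{Tr}\left( W S W^T \right) \quad \text{s.t.} \quad W W^T = I_m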

Page 24:

PCA solution

Take the Lagrangian of the problem

Take the derivative w.r.t. w; the KKT condition gives us:

This is an eigen problem: the optimal projection directions are eigenvectors of the scatter matrix, along which the data is simply scaled.

S w = λ w
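For a single projection direction w, the steps spelled out:

L(w, \lambda) = w^T S w - \lambda (w^T w - 1), \qquad \frac{\partial L}{\partial w} = 2 S w - 2 \lambda w = 0 \;\Rightarrow\; S w = \lambda w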

Page 25:

PCA – how to compute

PCA via SVD on the Covariance matrix


S: covariance, nxn
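A minimal sketch of this computation (X is an N x n data matrix with one sample per row and p is the target dimension; both names are assumptions, not the slide's):

Xc = X - repmat(mean(X,1), size(X,1), 1);   % center the data
S  = (Xc' * Xc) / (size(X,1) - 1);          % n x n covariance matrix
[U, Sig, ~] = svd(S);                       % eigenvectors of S = principal directions
W  = U(:, 1:p)';                            % p x n projection matrix
Y  = Xc * U(:, 1:p);                        % N x p embedded data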

Page 26:

2d Data

[Figure: scatter plot of the 2-D example data]

Page 27:

Principal Components

[Figure: the 2-D data with the 1st and 2nd principal vectors overlaid]
The 1st principal vector gives the best axis to project onto: minimum RMS error.
Principal vectors are orthogonal.

Page 28:

PCA on HoGs

Matlab Implementation of PCA: [A, s, eig_values]=princomp(hogs);
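Note: princomp has since been removed from MATLAB; in newer releases the equivalent call (an assumption about the reader's MATLAB version, not part of the original slide) is:

[A, s, eig_values] = pca(hogs);   % columns of A: principal directions; eig_values: per-component variances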


HoG basis function

Page 29:

PCA Application in Aggregation

SIFT aggregation: usually a PCA is done on the SIFT features to reduce the dimension from 128 to, say, 24 or 32. Then a GMM is trained in the R^32 space for FV encoding.

Homework-2 Aggregation: Fisher Vector aggregation of SIFT


load ../../dataset/cdvs_sift_aggregation_test_data.mat;
[n_sift, kd_sift] = size(gd_sift_cdvs);
offs = randperm(n_sift); offs = offs(1:200*2^10);     % random subsample of ~200k SIFT descriptors
% PCA
[A1, s1, lat1] = princomp(double(gd_sift_cdvs(offs,:)));
figure(41); hold on; grid on;
stem(lat1, '.'); title('sift pca eigen values');

Page 30:

SIFT PCA

Eigen values

That is why we use kd = [24, 32, 48] for the SIFT GMM in FV aggregation.

Page 31:

SIFT PCA Basis Functions

Capturing max variation directions

Page 32:

Visualizing SIFT in lower dimensional space

Project SIFTs from 2 images to 2D space
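A hedged sketch of such a projection, reusing A1 from the princomp call above; sift1 and sift2 (the SIFT descriptors of the two images) are assumed variable names:

y1 = double(sift1) * A1(:, 1:2);    % project onto the first two principal directions
y2 = double(sift2) * A1(:, 1:2);
figure; plot(y1(:,1), y1(:,2), 'r.'); hold on; plot(y2(:,1), y2(:,2), 'b.');

(Strictly, princomp scores subtract the sample mean first; omitting it only shifts the scatter plot.)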

Page 33:

Laplacian Eigen Map

Directly compute an embedding {y_k} from the input {x_k} in R^D, without an explicit projection model A (i.e., without requiring Y = AX).
Objective function:

where the n×n affinity matrix W reflects the relationships between the data points in the original space X.
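In equation form (as in Belkin and Niyogi's formulation):

\min_{\{y_k\}} \; \sum_{i,j} \| y_i - y_j \|^2 \, W_{ij}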


M. Belkin and P. Niyogi. Laplacian Eigenmaps and spectral techniques for embedding and clustering. In Advances in Neural Information Processing Systems, volume 14, pages 585–591, Cambridge, MA, USA, 2002. The MIT Press

Page 34:

Graph Laplacian

Graph Laplacian: L= D - W
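A minimal sketch of building W and L, assuming a Gaussian (heat-kernel) affinity with bandwidth sigma and the Statistics Toolbox function pdist2; X (n x d) holds the input points:

D2 = pdist2(X, X).^2;            % n x n pairwise squared distances
W  = exp(-D2 / (2*sigma^2));     % affinity matrix
W(1:size(W,1)+1:end) = 0;        % zero the diagonal (no self-edges)
Dg = diag(sum(W, 2));            % degree matrix
L  = Dg - W;                     % graph Laplacian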

Page 35:

Laplacian Eigenmap

Minimizing the objective sum_{i,j} ||y_i - y_j||^2 W_ij
is equivalent to minimizing Tr(Y^T L Y), subject to Y^T D Y = I,
where D is the degree matrix (diagonal) with D_ii = sum_j W_ij.

Page 36:

Laplacian Eigen Map Solution

Numerically, solve the eigen problem:

where the eigenvectors corresponding to the d smallest non-zero eigenvalues form the d-dimensional embedding {y_k} (the trivial constant eigenvector with eigenvalue 0 is discarded).


L y = λ D y

eigenvectors: n points give an n×n Laplacian; the first k eigenvectors (each of length n) give us the k-dimensional induced embedding.
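A minimal sketch of this solve, continuing from the L and Dg sketched above (d is the target dimension; all names are assumptions, not the slide's):

[V, E] = eig(L, Dg);                 % generalized eigen problem L*v = lambda*Dg*v
[~, idx] = sort(diag(E), 'ascend');  % smallest eigenvalues first
Y = V(:, idx(2:d+1));                % skip the trivial constant eigenvector (lambda ~ 0)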

Page 37:

Stochastic Neighbor Embedding

Hinton's work: for high-dimensional data {x_i}, e.g., 20x20 digit images from MNIST, find a lower-dimensional (e.g., 2-D) embedding such that the relative affinities are preserved.
Unsupervised (no label info utilized).

Page 38:

Probability Preserving Embedding

• Each point in high-dim has a conditional probability of picking each other point as its neighbor.

• The distribution over neighbors is based on the high-dimensional pairwise distances.
– If we do not have coordinates for the data points, we can use a matrix of dissimilarities instead of pairwise distances.

[Figure: points i, j, k in the High-D space; p(j|i) is the probability of picking j given that you start at i]

Page 39:

Problem Formulation
SNE starts by converting the Euclidean distances d(x_i, x_j) between high-dimensional data points into conditional probabilities that represent similarity.
Its lower-dimensional embedding, mapping {x_i} in R^D to {y_i} in R^d, should induce similar conditional probabilities:
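The conditional probabilities, in the form defined by Hinton and Roweis (σ_i is the bandwidth chosen per point i):

p_{j|i} = \frac{\exp(-\|x_i - x_j\|^2 / 2\sigma_i^2)}{\sum_{k \neq i} \exp(-\|x_i - x_k\|^2 / 2\sigma_i^2)}, \qquad q_{j|i} = \frac{\exp(-\|y_i - y_j\|^2)}{\sum_{k \neq i} \exp(-\|y_i - y_k\|^2)}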

Stochastic Neighbor Embedding

not symmetric: in general p(j|i) ≠ p(i|j)

Page 40:

Stochastic Neighbor Embedding (SNE)
Preserves the pairwise probability relationships in terms of conditional probabilities, i.e., minimizes the differences between p(j|i) and q(j|i) for all pairs {x_i, x_j} and {y_i, y_j}.
The KL divergence measures the difference between two distributions (bonus points for HW-1: using KL to measure histogram distance). It has a coding-penalty interpretation:

http://sce2.umkc.edu/csee/lizhu/teaching/2018.fall.video-com/notes/lec02.pdf
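The cost function being minimized is the sum of KL divergences between the neighbor distributions in the two spaces:

C = \sum_i \mathrm{KL}(P_i \,\|\, Q_i) = \sum_i \sum_j p_{j|i} \log \frac{p_{j|i}}{q_{j|i}}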

Page 41:

SNE solution

Gradient of the total KL distance:

This gives us a gradient-descent solution:
move along the negative gradient, with a momentum factor.
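The gradient and momentum update, as given in Hinton and Roweis's SNE paper (the update is written here as standard gradient descent with learning rate η and momentum α(t)):

\frac{\partial C}{\partial y_i} = 2 \sum_j \left( p_{j|i} - q_{j|i} + p_{i|j} - q_{i|j} \right) (y_i - y_j)

y_i^{(t)} = y_i^{(t-1)} - \eta \frac{\partial C}{\partial y_i} + \alpha(t) \left( y_i^{(t-1)} - y_i^{(t-2)} \right)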

Page 42:

Matlab Implementation

t-distributed SNE (t-SNE) example: HW-2 data embedding
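A hedged sketch using MATLAB's built-in tsne (Statistics and Machine Learning Toolbox, R2017a or later); 'features' and 'labels' are assumed HW-2 variable names, not the lecture's:

Y = tsne(double(features), 'NumDimensions', 2, 'Perplexity', 30);   % 2-D t-SNE embedding
gscatter(Y(:,1), Y(:,2), labels);                                   % scatter colored by class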

Page 43:

Summary
SVD and PCA
  SVD: non-square matrix decomposition; a left transform and a right transform, with scaling in between
  SVD as an image decomposition: a linear combination of outer-product basis
  PCA: eigenvalues indicate the amount of info/energy in each dimension
  PCA: the basis vectors are the eigenvectors of the covariance matrix
Laplacian Eigenmap
  Direct data embedding without an explicit projection
  Input data -> affinity graph -> graph Laplacian eigenvectors -> embedding by picking eigenvectors
Stochastic Neighbor Embedding
  No explicit projection matrix
  Embedding by preserving probabilistic affinity
  Solution via a gradient algorithm
