Sequential Projection Learning for Hashing with Compact Codes

Jun Wang (1), Sanjiv Kumar (2), and Shih-Fu Chang (1)
(1) Columbia University, New York, USA
(2) Google Research, New York, USA
Nearest Neighbor Search

- Nearest neighbor search in large databases of high-dimensional points is important (e.g., image/video/document retrieval).
- Exact search is not practical: the computational cost is too high, and so is the storage cost (the original data points must be kept).
- Approximate nearest neighbor (ANN):
  - tree-based approaches (KD-tree, metric tree, ball tree, ...)
  - hashing methods (locality-sensitive hashing, spectral hashing, ...)
Binary Hashing

- Hyperplane partitioning
- Linear projection based hashing

          x1     x2     x3     x4     x5
    h1     0      1      1      0      1
    h2     1      0      1      0      1
    ...   ...    ...    ...    ...    ...
    hk    ...    ...    ...    ...    ...
    code  010…   100…   111…   001…   110…
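The linear-projection scheme in the table above can be sketched in a few lines of NumPy (a toy illustration, not the paper's code; the function name, `W`, and the random data are my own assumptions):

```python
import numpy as np

def hash_codes(X, W):
    """K-bit codes via linear projection: bit k of point x is 1 iff
    w_k . x > 0, i.e. which side of the k-th hyperplane x falls on."""
    return (X @ W > 0).astype(np.uint8)

# toy data: 5 points in 10-D, K = 3 random hyperplanes (LSH-style)
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 10))
W = rng.standard_normal((10, 3))
codes = hash_codes(X, W)
print(codes)  # one row of 3 bits per point, as in the table above
```

With random `W` this is plain LSH; the methods in this talk instead learn the projections.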
Related Work

- Different choices of projections:
  - random projections: locality-sensitive hashing (LSH, Indyk et al. '98), shift-invariant kernel hashing (SIKH, Raginsky et al. '09)
  - principal projections: spectral hashing (SH, Weiss et al. '08)
- Different choices of quantization function:
  - identity function: LSH & Boosted SSC (Shakhnarovich '05)
  - sinusoidal function: SH & SIKH
- Other recent work: restricted Boltzmann machines (RBMs, Hinton et al. '06); Jain et al. '08 (metric learning); Kulis et al. '09 & Mu et al. '10 (kernelized); Kulis NIPS '09 (binary reconstructive embedding)
Main Issues

- Existing hashing methods rely mostly on random or principal projections: the resulting codes are not compact and accuracy is low.
- Simple metrics are usually not enough to express semantic similarity; instead, similarity is given by a few pairwise labels.
- Goal: learn binary hash functions that give high accuracy with compact codes, in both the semi-supervised and the unsupervised case.
Formulation

[Figure: training points x1–x8 and their pairwise relations]
Formulation - Empirical Fitness

- Count the number of correctly hashed pairs and the number of wrongly hashed pairs.
Empirical Fitness

- Measures how well the hash codes fit the training data.
- Recall the definition of the pairwise label matrix.
- Objective: maximize the empirical fitness, i.e. (# of correctly hashed pairs) minus (# of wrongly hashed pairs).
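Written out, the objective takes the following form (a reconstruction in my own notation, not copied from the slides: h_k(x) ∈ {−1, 1} is the k-th hash bit, 𝓜 the set of neighbor pairs, 𝓒 the set of non-neighbor pairs):

```latex
J(H) = \frac{1}{2} \sum_{k=1}^{K}
  \left[ \sum_{(x_i, x_j) \in \mathcal{M}} h_k(x_i)\, h_k(x_j)
       - \sum_{(x_i, x_j) \in \mathcal{C}} h_k(x_i)\, h_k(x_j) \right]
```

Each product h_k(x_i) h_k(x_j) is +1 when the pair receives the same bit and −1 otherwise, so for each bit the bracket counts correctly hashed pairs minus wrongly hashed pairs.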
Relaxing Empirical Fitness

- Replace the sign of the projections with their signed magnitude.
- This yields a simpler matrix form of the objective.
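With the relaxation h_k(x) ≈ w_k^T x, the objective collapses to a trace form (again a reconstruction in assumed notation: the columns of X_l are the labeled points, W = [w_1, …, w_K], and S encodes the pairwise labels):

```latex
J(W) = \tfrac{1}{2}\, \operatorname{tr}\!\left( W^{\top} X_l\, S\, X_l^{\top} W \right),
\qquad
S_{ij} =
\begin{cases}
 1 & (x_i, x_j) \in \mathcal{M} \\
-1 & (x_i, x_j) \in \mathcal{C} \\
 0 & \text{otherwise}
\end{cases}
```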
Information Theoretic Regularizer

- Maximizing empirical fitness alone is not sufficient.
- Instead, maximize the combination of empirical fitness over the training data and the entropy of the hash codes (maximum entropy principle).

[Figure: a −1/1 partition separating a neighbor pair and a non-neighbor pair]
Relaxing the Regularization Term

- Step 1: Maximum entropy is equivalent to a balanced partition.
- Step 2: A balanced partition is equivalent to a partition with maximum variance.
- Step 3: Substitute the maximum bit-variance term with its lower bound.
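In symbols, the three steps replace the hard-to-optimize bit variance with the variance of the real-valued projections (my reconstruction, assuming zero-centered data X and a scaling constant η for the lower bound):

```latex
\max H\!\left(h_k(x)\right)
 \;\Longleftrightarrow\; \textstyle\sum_i h_k(x_i) = 0
 \;\Longleftrightarrow\; \max \operatorname{var}\!\left[h_k(x)\right],
\qquad
\operatorname{var}\!\left[h_k(x)\right] \;\ge\; \eta \operatorname{var}\!\left[w_k^{\top} x\right]
```

Summing the relaxed term over all K bits gives the tractable regularizer
\(\sum_k \operatorname{var}[w_k^{\top} x] = \tfrac{1}{n}\operatorname{tr}(W^{\top} X X^{\top} W)\).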
Final Objective

- Maximize empirical fitness plus the regularizer; the relaxed problem has an orthogonal solution given by eigendecomposition of the "adjusted" covariance matrix M.
- But this orthogonal solution alone is not very accurate!
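A minimal NumPy sketch of that eigendecomposition step, assuming the adjusted covariance has the form M = X_l S X_l^T + η X X^T (labeled-pair fitness plus variance regularizer); the function and variable names are mine, not the paper's:

```python
import numpy as np

def orthogonal_projections(X_l, S, X, eta, K):
    """Top-K eigenvectors of the 'adjusted' covariance
    M = X_l S X_l^T + eta * X X^T  (label fitness + variance regularizer)."""
    M = X_l @ S @ X_l.T + eta * (X @ X.T)
    M = 0.5 * (M + M.T)                          # guard against numerical asymmetry
    vals, vecs = np.linalg.eigh(M)               # ascending eigenvalues
    return vecs[:, np.argsort(vals)[::-1][:K]]   # d x K, largest eigenvalues first

# toy usage: 6 labeled points in 8-D, 20 unlabeled points, 4 hash bits
rng = np.random.default_rng(1)
X_l = rng.standard_normal((8, 6))
S = np.triu(np.sign(rng.standard_normal((6, 6))), 1)
S = S + S.T                                      # symmetric +-1 toy labels
X = rng.standard_normal((8, 20))
W = orthogonal_projections(X_l, S, X, eta=0.5, K=4)
```

Because the columns of `W` are eigenvectors of a symmetric matrix, they come out orthonormal, which is exactly the orthogonality constraint of the relaxed problem.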
Sequential Solution

- Motivation: learn each new hash function so that it corrects the mistakes made by the previous hash functions.
S3PLH Algorithm Summary
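The sequential loop can be sketched as follows (boosting-style: the pairwise label matrix is reweighted toward the pairs the newest bit hashed wrongly; the names and the exact update rule here are my assumptions, not the paper's verbatim algorithm):

```python
import numpy as np

def s3plh(X_l, S, X, eta, K, alpha=0.5):
    """Sequential sketch: each bit is the leading eigenvector of the adjusted
    covariance; the label matrix is then updated so the next bit focuses on
    pairs the current bit hashed wrongly."""
    Sk = S.astype(float).copy()
    W = np.zeros((X.shape[0], K))
    for k in range(K):
        M = X_l @ Sk @ X_l.T + eta * (X @ X.T)
        M = 0.5 * (M + M.T)
        vals, vecs = np.linalg.eigh(M)
        w = vecs[:, -1]                        # leading eigenvector
        W[:, k] = w
        h = np.where(w @ X_l >= 0, 1.0, -1.0)  # +-1 bits on labeled points
        agree = np.outer(h, h)                 # +1 iff a pair gets the same bit
        wrong = (Sk * agree) < 0               # pairs this bit hashes wrongly
        Sk = Sk - alpha * (agree * wrong)      # boost the weight of those pairs
    return W
```

For a wrongly separated neighbor pair (`Sk > 0`, `agree == -1`) the update adds `alpha` to that entry, so the next projection is pushed harder to keep the pair together; non-neighbor collisions are handled symmetrically.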
Unsupervised Extension (USPLH)

- Observation: hashing errors concentrate near the partition boundary.
Pseudo Pair-Wise Labels

- Generate a pseudo set of labeled data containing point pairs sampled near the current partition boundary.
- From these pairs, generate a pseudo label matrix.
- Unlike SPLH, USPLH generates new pseudo labels and a corresponding label matrix at each iteration, instead of updating the weights of a fixed set of given labels.
Experiments

- Datasets:
  - MNIST (70K) – supervised case
  - SIFT data (1 million SIFT features) – unsupervised case
- Evaluation protocol: Mean Average Precision and recall.
- Training setup:
  - MNIST (semi-supervised): 1K labeled samples for training, 1K samples for query testing
  - SIFT (unsupervised): 2K pseudo labels for training, 10K samples for query testing
MNIST Digits

[Figure: 48-bit recall curves]
Training and Test Time
SIFT 1 Million Data

[Figure: 48-bit recall curves]
Summary and Conclusion

- A semi-supervised paradigm for learning to hash: empirical risk with an information-theoretic regularizer.
- A sequential learning idea for error correction.
- An extension to the unsupervised case.
- Easy to implement and highly scalable.
- Future work: theoretical analysis of performance guarantees; weighted Hamming embedding.