Neighborhood Component Analysis 20071108


Neighbourhood Component Analysis

T.S. Yo

References

J. Goldberger, S. Roweis, G. Hinton, and R. Salakhutdinov (2005). "Neighbourhood Components Analysis". In Advances in Neural Information Processing Systems 17 (NIPS 2004). MIT Press.

Outline

● Introduction
● Learn the distance metric from data
● The size of k
● Procedure of NCA
● Experiments
● Discussions

Introduction (1/2)

● KNN
  – Simple and effective
  – Nonlinear decision surface
  – Non-parametric
  – Quality improves with more data
  – Only one parameter, k → easy to tune

Introduction (2/2)

● Drawbacks of KNN
  – Computationally expensive: the whole training set must be searched at test time
  – How to define the "distance" properly?

● Learn the distance metric from data, and force it to be low rank.

Learn the Distance from Data (1/5)

● What is a good distance metric?
  – The one that minimizes (optimizes) the cost!
● Then, what is the cost?
  – The expected test error
  – Best estimated with the leave-one-out (LOO) cross-validation error on the training data

Kohavi, Ron (1995). "A study of cross-validation and bootstrap for accuracy estimation and model selection". Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence 2 (12): 1137-1143. Morgan Kaufmann, San Mateo.
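As a concrete illustration of this cost (not part of the original slides), the LOO error of a plain KNN classifier can be estimated as in the sketch below; scikit-learn, the wine dataset and k = 3 are my own illustrative choices.

    # Sketch: estimating the LOO cross-validation error of plain KNN.
    from sklearn.datasets import load_wine
    from sklearn.model_selection import LeaveOneOut, cross_val_score
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_wine(return_X_y=True)
    knn = KNeighborsClassifier(n_neighbors=3)

    # One fit per sample: each point is classified by the remaining n-1 points.
    scores = cross_val_score(knn, X, y, cv=LeaveOneOut())
    print("LOO error rate:", 1.0 - scores.mean())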

Learn the Distance from Data (2/5)

● Modeling the LOO error:
  – Let pij be the probability that point xj is selected as point xi's neighbour.
  – The probability that xi is correctly classified when it is used as the reference point is pi (reconstructed below).

● To maximize pi for all xi means to minimize LOO error.
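A reconstruction of the expression referred to above, following the standard NCA formulation of Goldberger et al. (the original equation image is not in the transcript). With C_i = { j : y_j = y_i } the set of points sharing xi's class label,

    p_i = \sum_{j \in C_i} p_{ij}, \qquad p_{ii} = 0 .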

[Figure "Softmax Function": plot of exp(-x) against x, with axis values from 0 to 1.]

Learn the Distance from Data (3/5)

● Then, how do we define pij?
  – As the softmax of the distance dij (reconstructed below)
  – Relatively smoother than using dij directly
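The softmax definition the slide refers to, reconstructed from the standard NCA formulation (the equation image is missing from the transcript):

    p_{ij} = \frac{\exp(-d_{ij})}{\sum_{k \neq i} \exp(-d_{ik})}, \qquad p_{ii} = 0 .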

Learn the Distance from Data (4/5)

● How do we define dij?
● Restrict the distance measure to a Mahalanobis (quadratic) distance.
● That is to say, we project the original feature vectors x into another vector space with a q × d transformation matrix A.
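Reconstructed from the standard NCA formulation (the equation images are missing from the transcript): the learned metric is the Mahalanobis distance induced by Q = AᵀA,

    d_{ij} = (x_i - x_j)^\top A^\top A \,(x_i - x_j) = \| A x_i - A x_j \|^2 .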

Learn the Distance from Data (5/5)

● Substitute dij into pij (reconstructed below):
● Now we have the objective function f(A):
● Maximize f(A) w.r.t. A → minimize the overall LOO error
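A reconstruction of the two expressions this slide refers to, following Goldberger et al.:

    p_{ij} = \frac{\exp(-\|Ax_i - Ax_j\|^2)}{\sum_{k \neq i} \exp(-\|Ax_i - Ax_k\|^2)}, \qquad p_{ii} = 0,

    f(A) = \sum_i \sum_{j \in C_i} p_{ij} = \sum_i p_i .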

The Size of k

● For the probability distribution pij:
● The perplexity of the neighbour distribution can be used as an estimate of the number of neighbours to consider, k (a reconstruction follows below).
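A plausible reconstruction of this estimate (the formula itself is missing from the transcript): the perplexity of the neighbour distribution for point xi is

    k_i \approx \exp\!\Big( -\sum_{j \neq i} p_{ij} \log p_{ij} \Big),

and averaging k_i over the training points gives a data-driven suggestion for k.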

Procedure of NCA (1/2)

● Use the objective function and its gradient to learn the transformation matrix A and the neighbourhood size k from the training data Dtrain (with or without dimension reduction).

● Project the test data Dtest into the transformed space.

● Perform traditional KNN (with k, using ADtrain as the reference set) on the transformed test data ADtest.
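A minimal sketch of these three steps in Python, using scikit-learn's NeighborhoodComponentsAnalysis rather than the authors' own optimizer; the wine dataset, n_components = 2, the 70%/30% split and k = 3 are illustrative assumptions (the perplexity-based choice of k is not included).

    # Sketch of the NCA + KNN procedure described above.
    from sklearn.datasets import load_wine
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import NeighborhoodComponentsAnalysis, KNeighborsClassifier

    X, y = load_wine(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

    # 1. Learn the transformation A from D_train (rank 2 here, i.e. with dimension reduction).
    nca = NeighborhoodComponentsAnalysis(n_components=2, random_state=0)
    nca.fit(X_train, y_train)

    # 2. Project training and test data into the transformed space: A D_train, A D_test.
    X_train_t = nca.transform(X_train)
    X_test_t = nca.transform(X_test)

    # 3. Traditional KNN in the transformed space.
    knn = KNeighborsClassifier(n_neighbors=3)
    knn.fit(X_train_t, y_train)
    print("Test error rate:", 1.0 - knn.score(X_test_t, y_test))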

Procedure of NCA (2/2)

● Functions used for optimization
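The slide's content is not in the transcript; below is a sketch (not the authors' code) of the two functions such an optimization needs, the objective f(A) and its gradient, written directly from the Goldberger et al. formulas with NumPy and SciPy. The variable names and the choice of L-BFGS are my own assumptions.

    # Sketch: NCA objective f(A) and its gradient. X has shape (n, d), A has shape (q, d).
    import numpy as np
    from scipy.optimize import minimize

    def nca_objective_and_grad(A_flat, X, y, q):
        n, d = X.shape
        A = A_flat.reshape(q, d)
        AX = X @ A.T                                  # projected points, shape (n, q)

        # Pairwise squared distances in the transformed space, with d_ii forced to +inf
        # so that p_ii = 0 after the softmax.
        diff = AX[:, None, :] - AX[None, :, :]        # (n, n, q)
        dist2 = np.sum(diff ** 2, axis=-1)            # (n, n)
        np.fill_diagonal(dist2, np.inf)
        dist2 = dist2 - dist2.min(axis=1, keepdims=True)   # stabilise the softmax

        P = np.exp(-dist2)
        P /= P.sum(axis=1, keepdims=True)             # p_ij

        same_class = (y[:, None] == y[None, :])
        p_i = np.sum(P * same_class, axis=1)          # p_i = sum_{j in C_i} p_ij
        f = p_i.sum()                                 # objective f(A)

        # Gradient: dF/dA = 2 A sum_i ( p_i sum_k p_ik x_ik x_ik^T - sum_{j in C_i} p_ij x_ij x_ij^T ),
        # with x_ik = x_i - x_k taken in the original space.
        Xdiff = X[:, None, :] - X[None, :, :]         # (n, n, d)
        W = P * p_i[:, None] - P * same_class         # weight of each outer product x_ik x_ik^T
        M = np.einsum('ik,ika,ikb->ab', W, Xdiff, Xdiff)
        grad = 2.0 * A @ M

        # SciPy minimizes, so return the negated objective and gradient.
        return -f, -grad.ravel()

    # Usage sketch (X, y assumed given): learn a rank-q matrix A with L-BFGS.
    # q = 2
    # A0 = 0.1 * np.random.randn(q, X.shape[1])
    # res = minimize(nca_objective_and_grad, A0.ravel(), args=(X, y, q),
    #                jac=True, method='L-BFGS-B')
    # A = res.x.reshape(q, X.shape[1])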

Experiments – Datasets (1/2)

● 4 from UCI ML Repository, 2 self-made

Experiments – Datasets (2/2)

n2d is a mixture of two bivariate normal distributions with different means and covariance matrices. ring consists of two concentric rings in two dimensions plus 8 dimensions of uniform random noise.
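Since the slides do not give the exact parameters of these two self-made datasets, the sketch below only illustrates how data of this shape could be generated; every mean, covariance, radius and sample size is an assumption.

    # Sketch: generating datasets of the kind described above (parameters are illustrative).
    import numpy as np

    rng = np.random.default_rng(0)
    n_per_class = 200

    # n2d: mixture of two bivariate normals with different means and covariance matrices.
    mean0, cov0 = np.array([0.0, 0.0]), np.array([[1.0, 0.3], [0.3, 1.0]])
    mean1, cov1 = np.array([2.0, 2.0]), np.array([[0.5, -0.2], [-0.2, 1.5]])
    X_n2d = np.vstack([rng.multivariate_normal(mean0, cov0, n_per_class),
                       rng.multivariate_normal(mean1, cov1, n_per_class)])
    y_n2d = np.repeat([0, 1], n_per_class)

    # ring: two concentric 2-D rings plus 8 dimensions of uniform random noise.
    def ring_points(radius, n):
        theta = rng.uniform(0.0, 2.0 * np.pi, n)
        r = radius + rng.normal(scale=0.1, size=n)
        return np.column_stack([r * np.cos(theta), r * np.sin(theta)])

    X_ring = np.hstack([np.vstack([ring_points(1.0, n_per_class), ring_points(2.0, n_per_class)]),
                        rng.uniform(-1.0, 1.0, size=(2 * n_per_class, 8))])
    y_ring = np.repeat([0, 1], n_per_class)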

Experiments – Results (1/4)

Error rates of KNN and NCA with the same K.

The results show that NCA generally improves the performance of KNN.

Experiments – Results (2/4)

Experiments – Results (3/4)

● Compare with other classifiers

Experiments – Results (4/4)

● Rank 2 dimension reduction

Discussions (1/8)

● Rank 2 transformation for wine

Discussions (2/8)

● Rank 1 transformation for n2d

Discussions (3/8)

● Results of Goldberger et al.

(40 realizations of 30%/70% splits)

Discussions (4/8)

● Results of Goldberger et al.

(rank 2 transformation)

Discussions (5/8)

● The experimental results suggest that KNN classification can be improved with the distance metric learned by the NCA algorithm.

● NCA also outperforms traditional dimension reduction methods for several datasets.

Discussions (6/8)

● Compared to other classification methods (e.g. LDA and QDA), NCA usually does not give the best accuracy.

● Some odd results on dimension reduction suggest that further investigation of the optimization algorithm is necessary.

Discussions (7/8)

● Optimizing over a matrix
● "Can we optimize these functions?" (Michael L. Overton)
  – Globally, no. Related problems are NP-hard (Blondel-Tsitsiklis, Nemirovski).
  – Locally, yes.
    ● But not by standard methods for nonconvex, smooth optimization
    ● Steepest descent, BFGS, or nonlinear conjugate gradient will typically jam because of nonsmoothness

Discussions (8/8)

● Other methods also learn a distance metric from data:
  – Discriminant Common Vectors (DCV)
    ● Similar to NCA, DCV focuses on optimizing the distance metric for certain objective functions
  – Laplacianfaces (LAP)
    ● Puts more emphasis on dimension reduction

J. Liu and S. Chen, "Discriminant Common Vectors Versus Neighbourhood Components Analysis and Laplacianfaces: A comparative study in small sample size problem". Image and Vision Computing.

Questions?

Thank you!

Derive the Objective Function (1/5)

● From the assumptions, we have:

Derive the Objective Function (2/5)

Derive the Objective Function (3/5)

Derive the Objective Function (4/5)

Derive the Objective Function (5/5)
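The derivation itself is missing from the transcript; the end result, as given in Goldberger et al. and reconstructed here, is the gradient used in the optimization. Writing x_{ij} = x_i - x_j,

    \frac{\partial f}{\partial A}
      = 2A \sum_i \Big( p_i \sum_k p_{ik}\, x_{ik} x_{ik}^\top
        \;-\; \sum_{j \in C_i} p_{ij}\, x_{ij} x_{ij}^\top \Big),

obtained by differentiating f(A) = \sum_i \sum_{j \in C_i} p_{ij} with the softmax definition of p_{ij} and collecting terms.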