Regression on Manifolds using Kernel Dimension Reduction€¦ · Manifold Kernel Dimension...

29
, Background Manifold Kernel Dimension Reduction Experimental Results Summary Regression on Manifolds using Kernel Dimension Reduction Jens Nilsson 1 Fei Sha 2 Michael I. Jordan 3 1 Centre for Mathematical Sciences Lund University 2 Computer Science Division University of California, Berkeley 3 Computer Science Division and Department of Statistics University of California, Berkeley 24’th International Conference on Machine Learning Regression on Manifolds using Kernel Dimension Reduction

Transcript of Regression on Manifolds using Kernel Dimension Reduction€¦ · Manifold Kernel Dimension...

Page 1: Regression on Manifolds using Kernel Dimension Reduction€¦ · Manifold Kernel Dimension Reduction Experimental Results Summary Regression on Manifolds using Kernel Dimension Reduction

,

BackgroundManifold Kernel Dimension Reduction

Experimental ResultsSummary

Regression on Manifolds using KernelDimension Reduction

Jens Nilsson1 Fei Sha2 Michael I. Jordan3

1Centre for Mathematical SciencesLund University

2Computer Science DivisionUniversity of California, Berkeley

3Computer Science Division and Department of StatisticsUniversity of California, Berkeley

24’th International Conference on Machine Learning

Regression on Manifolds using Kernel Dimension Reduction

Page 2: Regression on Manifolds using Kernel Dimension Reduction€¦ · Manifold Kernel Dimension Reduction Experimental Results Summary Regression on Manifolds using Kernel Dimension Reduction

,

BackgroundManifold Kernel Dimension Reduction

Experimental ResultsSummary

Aim

A methodology for discovering a data manifold that best preservesinformation relevant to a nonlinear regression.

Regression on Manifolds using Kernel Dimension Reduction

Page 3: Regression on Manifolds using Kernel Dimension Reduction€¦ · Manifold Kernel Dimension Reduction Experimental Results Summary Regression on Manifolds using Kernel Dimension Reduction

,

BackgroundManifold Kernel Dimension Reduction

Experimental ResultsSummary

Outline

BackgroundDimensionality ReductionSufficient Dimension ReductionKernel Dimension Reduction

Manifold Kernel Dimension Reduction

Experimental ResultsRegression on a TorusGlobal Temperature DataImage Data

Regression on Manifolds using Kernel Dimension Reduction

Page 4: Regression on Manifolds using Kernel Dimension Reduction€¦ · Manifold Kernel Dimension Reduction Experimental Results Summary Regression on Manifolds using Kernel Dimension Reduction

,

BackgroundManifold Kernel Dimension Reduction

Experimental ResultsSummary

Dimensionality ReductionSufficient Dimension ReductionKernel Dimension Reduction

Dimensionality Reduction

I Given:I covariate vectors xi ∈ X ⊂ R

D , i = 1, . . . ,m.

I Find a lower-dimensional subspace Z ⊂ X and a mapping

X 3 xi 7→ zi ∈ Z, i = 1, . . . ,m

such that “zi represents xi well”.

Regression on Manifolds using Kernel Dimension Reduction

Page 5: Regression on Manifolds using Kernel Dimension Reduction€¦ · Manifold Kernel Dimension Reduction Experimental Results Summary Regression on Manifolds using Kernel Dimension Reduction

,

BackgroundManifold Kernel Dimension Reduction

Experimental ResultsSummary

Dimensionality ReductionSufficient Dimension ReductionKernel Dimension Reduction

Regression Analysis

I Given:I covariate vectors xi ∈ X ⊂ R

D

I response vectors yi ∈ Y ⊂ RR , i = 1, . . . ,m

I Assume xi and yi are samples from random variables X ,Y .I Estimate properties of P(Y |X).

I Typically, learn E[Y |X = x] as a function of x.

Regression on Manifolds using Kernel Dimension Reduction

Page 6: Regression on Manifolds using Kernel Dimension Reduction€¦ · Manifold Kernel Dimension Reduction Experimental Results Summary Regression on Manifolds using Kernel Dimension Reduction

,

BackgroundManifold Kernel Dimension Reduction

Experimental ResultsSummary

Dimensionality ReductionSufficient Dimension ReductionKernel Dimension Reduction

Dimensionality Reduction in Regression

I Find a lower-dimensional subspace Z ⊂ X and a mapping

X 3 xi 7→ zi ∈ Z, i = 1, . . . ,m

such that zi retains maximal predictive power w.r.t yi.I Supervised dimensionality reduction

Regression on Manifolds using Kernel Dimension Reduction

Page 7: Regression on Manifolds using Kernel Dimension Reduction€¦ · Manifold Kernel Dimension Reduction Experimental Results Summary Regression on Manifolds using Kernel Dimension Reduction

,

BackgroundManifold Kernel Dimension Reduction

Experimental ResultsSummary

Dimensionality ReductionSufficient Dimension ReductionKernel Dimension Reduction

Some Dimensionality Reduction Approaches

Unsupervised Supervised

linear Principal Component Analysis Linear Discriminant Analysis (LDA)Z Canonical Correlation Analysis (CCA)

Sufficient Dimension Reduction (SDR)

non- Isomap Kernel LDAlinear Locally Linear Embeddings Kernel CCAZ Laplacian Eigenmaps Our method

Regression on Manifolds using Kernel Dimension Reduction

Page 8: Regression on Manifolds using Kernel Dimension Reduction€¦ · Manifold Kernel Dimension Reduction Experimental Results Summary Regression on Manifolds using Kernel Dimension Reduction

,

BackgroundManifold Kernel Dimension Reduction

Experimental ResultsSummary

Dimensionality ReductionSufficient Dimension ReductionKernel Dimension Reduction

Some Dimensionality Reduction Approaches

Unsupervised SupervisedE[Y |Z ]

linear Principal Component Analysis LDAZ CCA linear

SDR model-free

non- Isomap Kernel LDAlinear Locally Linear Embeddings Kernel CCA linearZ Laplacian Eigenmaps Our method model-free

Regression on Manifolds using Kernel Dimension Reduction

Page 9: Regression on Manifolds using Kernel Dimension Reduction€¦ · Manifold Kernel Dimension Reduction Experimental Results Summary Regression on Manifolds using Kernel Dimension Reduction

,

BackgroundManifold Kernel Dimension Reduction

Experimental ResultsSummary

Dimensionality ReductionSufficient Dimension ReductionKernel Dimension Reduction

Sufficient Dimension Reduction

I Parameterize Z by B ∈ RD×d where BTB = I.I Find B such that

Y y X | BTX (1)

I The intersection of all such ZB defines thecentral subspace, S.

Regression on Manifolds using Kernel Dimension Reduction

Page 10: Regression on Manifolds using Kernel Dimension Reduction€¦ · Manifold Kernel Dimension Reduction Experimental Results Summary Regression on Manifolds using Kernel Dimension Reduction

,

BackgroundManifold Kernel Dimension Reduction

Experimental ResultsSummary

Dimensionality ReductionSufficient Dimension ReductionKernel Dimension Reduction

Why search for the Central Subspace?

I More efficient regression modeling in the lower-dimensional SI Exploratory analysis of S itself.

Regression on Manifolds using Kernel Dimension Reduction

Page 11: Regression on Manifolds using Kernel Dimension Reduction€¦ · Manifold Kernel Dimension Reduction Experimental Results Summary Regression on Manifolds using Kernel Dimension Reduction

,

BackgroundManifold Kernel Dimension Reduction

Experimental ResultsSummary

Dimensionality ReductionSufficient Dimension ReductionKernel Dimension Reduction

Identifying the Central Subspace

I Methods based on inverse regression [Li 1991; Cook & Li 1991] .I Often limited to one-dimensional responsesI Impose assumptions on the distribution of X

I Kernel Dimension Reduction (KDR) [Fukumizu, Bach & Jordan

2004; Fukumizu, Bach & Jordan 2006] overcomes theseshortcomings

Regression on Manifolds using Kernel Dimension Reduction

Page 12: Regression on Manifolds using Kernel Dimension Reduction€¦ · Manifold Kernel Dimension Reduction Experimental Results Summary Regression on Manifolds using Kernel Dimension Reduction

,

BackgroundManifold Kernel Dimension Reduction

Experimental ResultsSummary

Dimensionality ReductionSufficient Dimension ReductionKernel Dimension Reduction

Kernel Dimension ReductionMeasure conditional independence in RKHS

I Map X and Y to reproducing kernel Hilbert spaces HX ,HY .

X 7→ f ∈ HX , Y 7→ g ∈ HY

I We can analyze statistical properties of f and gI Cross-covariance C f g between f and g can be represented by

an operator ΣYX : HX → HY such that

〈g,ΣYX f〉HY = C f g, ∀f , g (2)

I Conditional covariance operator

ΣYY |X = ΣYY −ΣYXΣ−1XXΣXY . (3)

Regression on Manifolds using Kernel Dimension Reduction

Page 13: Regression on Manifolds using Kernel Dimension Reduction€¦ · Manifold Kernel Dimension Reduction Experimental Results Summary Regression on Manifolds using Kernel Dimension Reduction

,

BackgroundManifold Kernel Dimension Reduction

Experimental ResultsSummary

Dimensionality ReductionSufficient Dimension ReductionKernel Dimension Reduction

KDR Theorem

I Fukumizu, Bach & Jordan (2006):1. ΣYY |X ≺ ΣYY |BTX

2. ΣYY |X = ΣYY |BTX ⇐⇒ Y y X |BTX

I The central space can be found by minimizing ΣYY |BTX .

Regression on Manifolds using Kernel Dimension Reduction

Page 14: Regression on Manifolds using Kernel Dimension Reduction€¦ · Manifold Kernel Dimension Reduction Experimental Results Summary Regression on Manifolds using Kernel Dimension Reduction

,

BackgroundManifold Kernel Dimension Reduction

Experimental ResultsSummary

Dimensionality ReductionSufficient Dimension ReductionKernel Dimension Reduction

KDR Algorithm

I The minimization of ΣYY |BTX can be formulated as

min Tr JK cY (K c

BTX+ NεI)−1K

such that BTB = I(4)

where K cY and K c

BTXare centered kernel matrices.

Regression on Manifolds using Kernel Dimension Reduction

Page 15: Regression on Manifolds using Kernel Dimension Reduction€¦ · Manifold Kernel Dimension Reduction Experimental Results Summary Regression on Manifolds using Kernel Dimension Reduction

,

BackgroundManifold Kernel Dimension Reduction

Experimental ResultsSummary

Manifold KDR (mKDR)

I Nice properties of KDR:I Weak assumptions on XI Kernel formulation

I Incorporate intrinsic geometry of the covariate vectorsI Apply “manifold learning” kernelsI Specifically, we use the normalized Graph Laplacian

I Manifold Kernel Dimension Reduction (mKDR)

Regression on Manifolds using Kernel Dimension Reduction

Page 16: Regression on Manifolds using Kernel Dimension Reduction€¦ · Manifold Kernel Dimension Reduction Experimental Results Summary Regression on Manifolds using Kernel Dimension Reduction

,

BackgroundManifold Kernel Dimension Reduction

Experimental ResultsSummary

Eigenvectors of the Graph Laplacian

Figure: First four (non-constant)eigenvectors of the graphLaplacian on a torus.

I Harmonics on the manifoldI Reflect intrinsic coordinates

Regression on Manifolds using Kernel Dimension Reduction

Page 17: Regression on Manifolds using Kernel Dimension Reduction€¦ · Manifold Kernel Dimension Reduction Experimental Results Summary Regression on Manifolds using Kernel Dimension Reduction

,

BackgroundManifold Kernel Dimension Reduction

Experimental ResultsSummary

Work in Kernel Feature Space

I The KDR optimization is computationally heavyI To speed up we work in (truncated) kernel feature space

spanned by the M principal kernel eigenvectorsv i , i = 1, . . . ,M.

I Approximate the image of central space with a lineartransformation ΦVT.

Regression on Manifolds using Kernel Dimension Reduction

Page 18: Regression on Manifolds using Kernel Dimension Reduction€¦ · Manifold Kernel Dimension Reduction Experimental Results Summary Regression on Manifolds using Kernel Dimension Reduction

,

BackgroundManifold Kernel Dimension Reduction

Experimental ResultsSummary

mKDR Algorithm

I mKDR minimization problem:

min Tr JK cY (VΩVT + NεI)−1K

such that Ω 0Tr(Ω) = 1

(5)

I Φ =√Ω

I Solve using the projected gradient method

Regression on Manifolds using Kernel Dimension Reduction

Page 19: Regression on Manifolds using Kernel Dimension Reduction€¦ · Manifold Kernel Dimension Reduction Experimental Results Summary Regression on Manifolds using Kernel Dimension Reduction

,

BackgroundManifold Kernel Dimension Reduction

Experimental ResultsSummary

Regression on a TorusGlobal Temperature DataImage Data

Regression on a Torus

I xi have intrinsic coordinates[θi , φi] ∈ S1 × S1

I y is a sigmoid function of∣∣∣∣∣∣[θ, φ]∣∣∣∣∣∣

Regression on Manifolds using Kernel Dimension Reduction

Page 20: Regression on Manifolds using Kernel Dimension Reduction€¦ · Manifold Kernel Dimension Reduction Experimental Results Summary Regression on Manifolds using Kernel Dimension Reduction

,

BackgroundManifold Kernel Dimension Reduction

Experimental ResultsSummary

Regression on a TorusGlobal Temperature DataImage Data

mKDR finds the Central SubspaceI Ω nearly rank 1⇒ Ω ≈ aaT ; Project onto a

−3 −2 −1 0 1

x 10−3

0

0.2

0.4

0.6

0.8

1

Va

y

Figure: Uniform grid sampling

−3 −2 −1 0 1

x 10−3

0

0.2

0.4

0.6

0.8

1

Va

y

Figure: Uniformly randomsampling with additive noise

Regression on Manifolds using Kernel Dimension Reduction

Page 21: Regression on Manifolds using Kernel Dimension Reduction€¦ · Manifold Kernel Dimension Reduction Experimental Results Summary Regression on Manifolds using Kernel Dimension Reduction

,

BackgroundManifold Kernel Dimension Reduction

Experimental ResultsSummary

Regression on a TorusGlobal Temperature DataImage Data

mKDR can be used to guide visualizationI Map xi onto the eigenvectors vi with largest weight in Φ.I “Predictive eigenvectors” in contrast to principal eigenvectors

used in e.g. Laplacian eigenmaps.

v1v

2

v 3

Figure: Principal eigenvectors

v1v

3

v 5

Figure: Predictive eigenvectors

Regression on Manifolds using Kernel Dimension Reduction

Page 22: Regression on Manifolds using Kernel Dimension Reduction€¦ · Manifold Kernel Dimension Reduction Experimental Results Summary Regression on Manifolds using Kernel Dimension Reduction

,

BackgroundManifold Kernel Dimension Reduction

Experimental ResultsSummary

Regression on a TorusGlobal Temperature DataImage Data

Global Temperature Data

I yi are satellite measurements of atmosphericaltemperatures around the globe.

I 3168 observation pointsI xi lie on a spheroid in R3

I Regress the temperature y on x.

Regression on Manifolds using Kernel Dimension Reduction

Page 23: Regression on Manifolds using Kernel Dimension Reduction€¦ · Manifold Kernel Dimension Reduction Experimental Results Summary Regression on Manifolds using Kernel Dimension Reduction

,

BackgroundManifold Kernel Dimension Reduction

Experimental ResultsSummary

Regression on a TorusGlobal Temperature DataImage Data

Regression Model of Temperature DistributionCompute the central space ΦVT and use linear regression to model E[Y |ΦVT].

Figure: Central space coordinate

Figure: Predicted temperature

Figure: Prediction errorRegression on Manifolds using Kernel Dimension Reduction

Page 24: Regression on Manifolds using Kernel Dimension Reduction€¦ · Manifold Kernel Dimension Reduction Experimental Results Summary Regression on Manifolds using Kernel Dimension Reduction

,

BackgroundManifold Kernel Dimension Reduction

Experimental ResultsSummary

Regression on a TorusGlobal Temperature DataImage Data

Visualization of an Image Data Manifold

I xi are a set of 1000 grayscale images of size 100 × 80 pixelsI 4 degrees of freedom: rotation angle, tilt angle and

translations in the image planeI Data lie on a 4-dimensional manifold in R100·80

I Create a lower-dimensional embedding that captures thevariation in rotation angle

Regression on Manifolds using Kernel Dimension Reduction

Page 25: Regression on Manifolds using Kernel Dimension Reduction€¦ · Manifold Kernel Dimension Reduction Experimental Results Summary Regression on Manifolds using Kernel Dimension Reduction

,

BackgroundManifold Kernel Dimension Reduction

Experimental ResultsSummary

Regression on a TorusGlobal Temperature DataImage Data

Unsupervised EmbeddingI Project onto the principal eigenvectors, i.e. Laplacian

Eigenmaps

Figure: Principal eigenvectors.Color by tilt angle.

Figure: Principal eigenvectors.Color by rotation angle.

Regression on Manifolds using Kernel Dimension Reduction

Page 26: Regression on Manifolds using Kernel Dimension Reduction€¦ · Manifold Kernel Dimension Reduction Experimental Results Summary Regression on Manifolds using Kernel Dimension Reduction

,

BackgroundManifold Kernel Dimension Reduction

Experimental ResultsSummary

Regression on a TorusGlobal Temperature DataImage Data

Predictive embedding guided by mKDR

I Apply mKDR with rotationangle as response

I Map data onto predictiveeigenvectors of the GraphLaplacian

Figure: Predictive eigenvectors

Regression on Manifolds using Kernel Dimension Reduction

Page 27: Regression on Manifolds using Kernel Dimension Reduction€¦ · Manifold Kernel Dimension Reduction Experimental Results Summary Regression on Manifolds using Kernel Dimension Reduction

,

BackgroundManifold Kernel Dimension Reduction

Experimental ResultsSummary

Summary

I mKDR discovers manifolds that optimally preserve predictivepower w.r.t response variables.

I mKDR enables:I flexible regression modelingI supervised exploration of nonlinear data manifolds

I mKDR extends:I sufficient dimension reduction to nonlinear manifolds.I manifold learning to the supervised setting.

Regression on Manifolds using Kernel Dimension Reduction

Page 28: Regression on Manifolds using Kernel Dimension Reduction€¦ · Manifold Kernel Dimension Reduction Experimental Results Summary Regression on Manifolds using Kernel Dimension Reduction

,

BackgroundManifold Kernel Dimension Reduction

Experimental ResultsSummary

Acknowledgements

Funding sources:I Yahoo! ResearchI Microsoft ResearchI AstraZeneca R&D LundI The Swedish Knowledge FoundationI The Royal Swedish Academy of Engineering SciencesI The Royal Physiographic Society in Lund

Regression on Manifolds using Kernel Dimension Reduction

Page 29: Regression on Manifolds using Kernel Dimension Reduction€¦ · Manifold Kernel Dimension Reduction Experimental Results Summary Regression on Manifolds using Kernel Dimension Reduction

,

Appendix References

References I

Cook and Li][1991]Cook1991 Cook, R. D., & Li, B. (1991).Discussion of Li (1991).Journal of the American Statistical Association, 86, 328–332.

Fukumizu et al.][2004]Fukumizu2004 Fukumizu, K., Bach, F. R., & Jordan, M. I. (2004).Dimensionality reduction for supervised learning with reproducing kernel Hilbert spaces.Journal of Machine Learning Research, 5, 73–99.

Fukumizu et al.][2006]Fukumizu2006 Fukumizu, K., Bach, F. R., & Jordan, M. I. (2006).Kernel dimension reduction in regression (Technical Report).Department of Statistics, University of California, Berkeley.

Li][1991]Li1991 Li, K.-C. (1991).Sliced inverse regresion for dimension reduction.Journal of the American Statistical Association, 86, 316–327.

Regression on Manifolds using Kernel Dimension Reduction