Regression on Manifolds using Kernel Dimension Reduction
Jens Nilsson (1), Fei Sha (2), Michael I. Jordan (3)
(1) Centre for Mathematical Sciences, Lund University
(2) Computer Science Division, University of California, Berkeley
(3) Computer Science Division and Department of Statistics, University of California, Berkeley
24th International Conference on Machine Learning
Aim
A methodology for discovering a data manifold that best preserves information relevant to a nonlinear regression.
Outline
- Background: Dimensionality Reduction; Sufficient Dimension Reduction; Kernel Dimension Reduction
- Manifold Kernel Dimension Reduction
- Experimental Results: Regression on a Torus; Global Temperature Data; Image Data
Dimensionality Reduction
- Given: covariate vectors $x_i \in \mathcal{X} \subset \mathbb{R}^D$, $i = 1, \ldots, m$.
- Find a lower-dimensional subspace $\mathcal{Z} \subset \mathcal{X}$ and a mapping
  $$\mathcal{X} \ni x_i \mapsto z_i \in \mathcal{Z}, \quad i = 1, \ldots, m,$$
  such that "$z_i$ represents $x_i$ well".
Regression Analysis
- Given:
  - covariate vectors $x_i \in \mathcal{X} \subset \mathbb{R}^D$, $i = 1, \ldots, m$
  - response vectors $y_i \in \mathcal{Y} \subset \mathbb{R}^R$, $i = 1, \ldots, m$
- Assume $x_i$ and $y_i$ are samples from random variables $X, Y$.
- Estimate properties of $P(Y \mid X)$.
- Typically, learn $E[Y \mid X = x]$ as a function of $x$.
Dimensionality Reduction in Regression
- Find a lower-dimensional subspace $\mathcal{Z} \subset \mathcal{X}$ and a mapping
  $$\mathcal{X} \ni x_i \mapsto z_i \in \mathcal{Z}, \quad i = 1, \ldots, m,$$
  such that $z_i$ retains maximal predictive power w.r.t. $y_i$.
- This is supervised dimensionality reduction.
Some Dimensionality Reduction Approaches

                 Unsupervised                 Supervised
  linear Z       Principal Component          Linear Discriminant Analysis (LDA)
                 Analysis                     Canonical Correlation Analysis (CCA)
                                              Sufficient Dimension Reduction (SDR)
  nonlinear Z    Isomap                       Kernel LDA
                 Locally Linear Embeddings    Kernel CCA
                 Laplacian Eigenmaps          Our method
Some Dimensionality Reduction Approaches

                 Unsupervised                 Supervised                 E[Y|Z]
  linear Z       Principal Component          LDA, CCA                   linear
                 Analysis                     SDR                        model-free
  nonlinear Z    Isomap                       Kernel LDA, Kernel CCA     linear
                 Locally Linear Embeddings
                 Laplacian Eigenmaps          Our method                 model-free
Sufficient Dimension Reduction
- Parameterize $\mathcal{Z}$ by $B \in \mathbb{R}^{D \times d}$ with $B^T B = I$.
- Find $B$ such that
  $$Y \perp\!\!\!\perp X \mid B^T X \qquad (1)$$
- The intersection of all such $\mathcal{Z}_B$ defines the central subspace, $\mathcal{S}$.
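A toy numerical illustration of the definition (my own construction, not from the talk; the data and functions are invented): when $Y$ depends on $X$ only through $B^T X$, the residual of $Y$ after accounting for $B^T X$ carries essentially no information about the remaining directions of $X$.

```python
import numpy as np

# Toy SDR setup: Y is a function of the projection B^T X plus small noise,
# with the central subspace being the first coordinate axis of R^3.
rng = np.random.default_rng(0)
D, d, m = 3, 1, 500
B = np.zeros((D, d))
B[0, 0] = 1.0                                   # central subspace: span{e_1}
X = rng.normal(size=(m, D))
Y = np.tanh(X @ B).ravel() + 0.01 * rng.normal(size=m)

# Conditional on B^T X, the other coordinates of X are uninformative:
# the residual of Y is (nearly) uncorrelated with X[:, 1].
resid = Y - np.tanh(X @ B).ravel()
print(abs(np.corrcoef(resid, X[:, 1])[0, 1]))
```

The printed correlation is small because, given $B^T X$, the response is conditionally independent of the remaining directions, exactly the statement of (1).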
Why Search for the Central Subspace?
- More efficient regression modeling in the lower-dimensional $\mathcal{S}$.
- Exploratory analysis of $\mathcal{S}$ itself.
Identifying the Central Subspace
- Methods based on inverse regression [Li 1991; Cook & Li 1991]:
  - often limited to one-dimensional responses
  - impose assumptions on the distribution of $X$
- Kernel Dimension Reduction (KDR) [Fukumizu, Bach & Jordan 2004; Fukumizu, Bach & Jordan 2006] overcomes these shortcomings.
Kernel Dimension Reduction: Measure Conditional Independence in RKHS
- Map $X$ and $Y$ to reproducing kernel Hilbert spaces $\mathcal{H}_X, \mathcal{H}_Y$:
  $$X \mapsto f \in \mathcal{H}_X, \quad Y \mapsto g \in \mathcal{H}_Y$$
- We can analyze statistical properties of $f$ and $g$.
- The cross-covariance $C_{fg}$ between $f$ and $g$ can be represented by an operator $\Sigma_{YX} : \mathcal{H}_X \to \mathcal{H}_Y$ such that
  $$\langle g, \Sigma_{YX} f \rangle_{\mathcal{H}_Y} = C_{fg}, \quad \forall f, g \qquad (2)$$
- Conditional covariance operator:
  $$\Sigma_{YY|X} = \Sigma_{YY} - \Sigma_{YX} \Sigma_{XX}^{-1} \Sigma_{XY} \qquad (3)$$
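A rough numerical illustration of (2) (my own toy construction, not from the talk): taking $f$ and $g$ to be kernel sections $k(\cdot, x_0)$ and $k(\cdot, y_0)$, the pairing under the *empirical* cross-covariance operator is just the sample covariance of $f(X)$ and $g(Y)$, which is clearly nonzero when $X$ and $Y$ are dependent and near zero when they are not.

```python
import numpy as np

# f(x) = k(x, x0) and g(y) = k(y, y0) are kernel sections; for the empirical
# cross-covariance operator, <g, Sigma_YX f> equals Cov_hat(f(X), g(Y)).
rng = np.random.default_rng(1)
N = 200
X = rng.normal(size=(N, 1))
Y = np.sin(X) + 0.1 * rng.normal(size=(N, 1))   # Y depends on X

k = lambda a, b: np.exp(-(a - b) ** 2 / 2)      # Gaussian kernel on R
x0 = 0.3
y0 = float(np.sin(0.3))                         # place g's bump where Y lives
fX = k(X, x0).ravel()                           # f evaluated at the sample
gY = k(Y, y0).ravel()                           # g evaluated at the sample

cov_fg = ((fX - fX.mean()) * (gY - gY.mean())).mean()

# Breaking the dependence (shuffling Y) drives the covariance toward zero.
g_ind = k(rng.permutation(Y), y0).ravel()
cov_ind = ((fX - fX.mean()) * (g_ind - g_ind.mean())).mean()
print(cov_fg, cov_ind)
```

The operator $\Sigma_{YX}$ packages all such covariances at once, which is what makes the conditional covariance operator (3) a usable measure of conditional (in)dependence.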
KDR Theorem
- Fukumizu, Bach & Jordan (2006):
  1. $\Sigma_{YY|X} \preceq \Sigma_{YY|B^T X}$
  2. $\Sigma_{YY|X} = \Sigma_{YY|B^T X} \iff Y \perp\!\!\!\perp X \mid B^T X$
- The central subspace can therefore be found by minimizing $\Sigma_{YY|B^T X}$ over $B$.
KDR Algorithm
- The minimization of $\Sigma_{YY|B^T X}$ can be formulated as
  $$\min_B \; \operatorname{Tr}\left[ K^c_Y \left( K^c_{B^T X} + N \varepsilon I \right)^{-1} \right] \quad \text{such that } B^T B = I \qquad (4)$$
  where $K^c_Y$ and $K^c_{B^T X}$ are centered kernel matrices.
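A minimal sketch of evaluating the objective in (4) (not the authors' implementation; kernels, bandwidths, and data are my own assumptions): the trace is smaller when $B$ spans the central subspace than when it is orthogonal to it, which is what the optimization in (4) exploits.

```python
import numpy as np

def centered_gram(Z, sigma=1.0):
    # Centered Gaussian Gram matrix K^c = H K H with H = I - (1/N) 11^T.
    sq = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    K = np.exp(-sq / (2 * sigma ** 2))
    N = len(Z)
    H = np.eye(N) - 1.0 / N
    return H @ K @ H

def kdr_objective(B, X, Y, eps=1e-3):
    # Tr[K^c_Y (K^c_{B^T X} + N*eps*I)^{-1}], the quantity minimized in (4).
    N = len(X)
    KY = centered_gram(Y)
    KB = centered_gram(X @ B)
    return np.trace(KY @ np.linalg.inv(KB + N * eps * np.eye(N)))

rng = np.random.default_rng(2)
X = rng.normal(size=(80, 3))
Y = np.tanh(X[:, :1])                        # Y depends on X only via x_1
B_good = np.array([[1.0], [0.0], [0.0]])     # spans the central subspace
B_bad = np.array([[0.0], [1.0], [0.0]])      # orthogonal to it
print(kdr_objective(B_good, X, Y), kdr_objective(B_bad, X, Y))
```

Minimizing (4) over the Stiefel manifold $\{B : B^T B = I\}$ then recovers a basis for the central subspace.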
Manifold KDR (mKDR)
- Nice properties of KDR:
  - weak assumptions on $X$
  - kernel formulation
- Incorporate the intrinsic geometry of the covariate vectors:
  - apply "manifold learning" kernels
  - specifically, we use the normalized graph Laplacian
- Result: Manifold Kernel Dimension Reduction (mKDR).
Eigenvectors of the Graph Laplacian
Figure: First four (non-constant) eigenvectors of the graph Laplacian on a torus.
- Harmonics on the manifold
- Reflect intrinsic coordinates
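The normalized graph Laplacian behind these harmonics can be sketched as follows (a toy point cloud on a circle stands in for the torus; the kernel bandwidth and sample size are assumptions of mine):

```python
import numpy as np

# Points sampled on a circle, a simple manifold with known harmonics.
rng = np.random.default_rng(4)
t = np.sort(rng.uniform(0, 2 * np.pi, 200))
X = np.c_[np.cos(t), np.sin(t)]

# Gaussian affinity matrix and normalized graph Laplacian
# L = I - D^{-1/2} W D^{-1/2}.
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
W = np.exp(-sq / 0.1)
deg = W.sum(1)
L = np.eye(len(X)) - W / np.sqrt(deg[:, None] * deg[None, :])

w, U = np.linalg.eigh(L)
# w[0] is 0 (its eigenvector D^{1/2}*1 is "constant" up to degree weighting);
# the next eigenvectors are low-frequency harmonics in the intrinsic
# coordinate t, which is why they reflect the manifold's geometry.
print(w[:4])
```

On the torus of the figure the same construction yields harmonics in both angular coordinates.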
Work in Kernel Feature Space
- The KDR optimization is computationally heavy.
- To speed it up, we work in the (truncated) kernel feature space spanned by the $M$ principal kernel eigenvectors $v_i$, $i = 1, \ldots, M$.
- Approximate the image of the central space with a linear transformation $\Phi V^T$.
mKDR Algorithm
- mKDR minimization problem:
  $$\min_\Omega \; \operatorname{Tr}\left[ K^c_Y \left( V \Omega V^T + N \varepsilon I \right)^{-1} \right] \quad \text{such that } \Omega \succeq 0, \; \operatorname{Tr}(\Omega) = 1 \qquad (5)$$
- $\Phi = \sqrt{\Omega}$
- Solve using the projected gradient method.
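A simplified sketch of a projected-gradient solver for (5) (not the authors' code; the basis $V$, the response kernel, and the step-size scheme are stand-ins of mine). The gradient of the trace with respect to $\Omega$ is $-V^T A^{-1} K^c_Y A^{-1} V$ with $A = V \Omega V^T + N \varepsilon I$, and each iterate is projected back onto $\{\Omega \succeq 0, \operatorname{Tr}(\Omega) = 1\}$ by projecting the eigenvalues of $\Omega$ onto the probability simplex.

```python
import numpy as np

def project_psd_unit_trace(Omega):
    # Project onto {Omega PSD, Tr(Omega) = 1}: eigendecompose, then project
    # the eigenvalues onto the probability simplex (sort-based algorithm).
    w, U = np.linalg.eigh((Omega + Omega.T) / 2)
    s = np.sort(w)[::-1]
    cs = np.cumsum(s) - 1
    rho = np.nonzero(s - cs / np.arange(1, len(s) + 1) > 0)[0][-1]
    w = np.maximum(w - cs[rho] / (rho + 1), 0)
    return U @ np.diag(w) @ U.T

def objective(Omega, V, KY, eps=1e-2):
    N = len(KY)
    A = V @ Omega @ V.T + N * eps * np.eye(N)
    return np.trace(KY @ np.linalg.inv(A))

def gradient(Omega, V, KY, eps=1e-2):
    # d/dOmega Tr[KY A^{-1}] = -V^T A^{-1} KY A^{-1} V.
    N = len(KY)
    Ainv = np.linalg.inv(V @ Omega @ V.T + N * eps * np.eye(N))
    return -V.T @ Ainv @ KY @ Ainv @ V

rng = np.random.default_rng(3)
N, M = 40, 4
V, _ = np.linalg.qr(rng.normal(size=(N, M)))   # stand-in eigenvector basis
Y = rng.normal(size=(N, 1))
H = np.eye(N) - 1.0 / N
KY = H @ np.exp(-(Y - Y.T) ** 2 / 2) @ H       # centered response kernel

Omega = np.eye(M) / M                          # feasible starting point
vals = [objective(Omega, V, KY)]
lr = 1.0
for _ in range(50):
    cand = project_psd_unit_trace(Omega - lr * gradient(Omega, V, KY))
    if objective(cand, V, KY) < vals[-1]:      # accept only descent steps
        Omega = cand
        vals.append(objective(Omega, V, KY))
    else:
        lr *= 0.5                              # otherwise shrink the step
print(vals[0], vals[-1])
```

Accepting only descent steps keeps the objective monotone; in the talk's setting $\Omega$ typically ends up near low rank, which is what makes the projection onto a few directions meaningful.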
Regression on a Torus
- $x_i$ have intrinsic coordinates $[\theta_i, \phi_i] \in S^1 \times S^1$.
- $y$ is a sigmoid function of $\lVert [\theta, \phi] \rVert$.
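The torus covariates can be generated as follows (a sketch; the major/minor radii and the exact sigmoid are my own assumptions, since the slide only specifies $[\theta, \phi] \in S^1 \times S^1$ and $y = \mathrm{sigmoid}(\lVert [\theta, \phi] \rVert)$):

```python
import numpy as np

rng = np.random.default_rng(5)
m = 500
theta = rng.uniform(-np.pi, np.pi, m)   # intrinsic coordinates on S^1 x S^1
phi = rng.uniform(-np.pi, np.pi, m)

# Standard torus embedding in R^3 (assumed radii R, r).
R, r = 2.0, 1.0
X = np.c_[(R + r * np.cos(phi)) * np.cos(theta),
          (R + r * np.cos(phi)) * np.sin(theta),
          r * np.sin(phi)]

# Response: sigmoid of the norm of the intrinsic coordinate vector.
y = 1.0 / (1.0 + np.exp(-np.hypot(theta, phi)))
```

Note that $y$ depends on $x$ only through the intrinsic coordinates, so the central subspace lives on the manifold rather than in the ambient $\mathbb{R}^3$.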
mKDR Finds the Central Subspace
- $\Omega$ is nearly rank 1 $\Rightarrow \Omega \approx a a^T$; project onto $a$.
Figure: Projection $Va$ vs. $y$, uniform grid sampling.
Figure: Projection $Va$ vs. $y$, uniformly random sampling with additive noise.
mKDR Can Be Used to Guide Visualization
- Map $x_i$ onto the eigenvectors $v_i$ with the largest weight in $\Phi$.
- These are "predictive eigenvectors," in contrast to the principal eigenvectors used in e.g. Laplacian eigenmaps.
Figure: Principal eigenvectors ($v_1$, $v_2$, $v_3$).
Figure: Predictive eigenvectors ($v_1$, $v_3$, $v_5$).
Global Temperature Data
- $y_i$ are satellite measurements of atmospheric temperatures around the globe.
- 3168 observation points.
- $x_i$ lie on a spheroid in $\mathbb{R}^3$.
- Regress the temperature $y$ on $x$.
Regression Model of Temperature Distribution
Compute the central space $\Phi V^T$ and use linear regression to model $E[Y \mid \Phi V^T]$.
Figure: Central space coordinate.
Figure: Predicted temperature.
Figure: Prediction error.
Visualization of an Image Data Manifold
- $x_i$ are a set of 1000 grayscale images of size 100 × 80 pixels.
- Four degrees of freedom: rotation angle, tilt angle, and translations in the image plane.
- Data lie on a 4-dimensional manifold in $\mathbb{R}^{100 \cdot 80}$.
- Goal: create a lower-dimensional embedding that captures the variation in rotation angle.
Unsupervised Embedding
- Project onto the principal eigenvectors, i.e. Laplacian Eigenmaps.
Figure: Principal eigenvectors, colored by tilt angle.
Figure: Principal eigenvectors, colored by rotation angle.
Predictive Embedding Guided by mKDR
- Apply mKDR with rotation angle as the response.
- Map the data onto the predictive eigenvectors of the graph Laplacian.
Figure: Predictive eigenvectors.
Summary
- mKDR discovers manifolds that optimally preserve predictive power w.r.t. the response variables.
- mKDR enables:
  - flexible regression modeling
  - supervised exploration of nonlinear data manifolds
- mKDR extends:
  - sufficient dimension reduction to nonlinear manifolds
  - manifold learning to the supervised setting
Acknowledgements
Funding sources:
- Yahoo! Research
- Microsoft Research
- AstraZeneca R&D Lund
- The Swedish Knowledge Foundation
- The Royal Swedish Academy of Engineering Sciences
- The Royal Physiographic Society in Lund
References
Cook, R. D., & Li, B. (1991). Discussion of Li (1991). Journal of the American Statistical Association, 86, 328–332.
Fukumizu, K., Bach, F. R., & Jordan, M. I. (2004). Dimensionality reduction for supervised learning with reproducing kernel Hilbert spaces. Journal of Machine Learning Research, 5, 73–99.
Fukumizu, K., Bach, F. R., & Jordan, M. I. (2006). Kernel dimension reduction in regression (Technical Report). Department of Statistics, University of California, Berkeley.
Li, K.-C. (1991). Sliced inverse regression for dimension reduction. Journal of the American Statistical Association, 86, 316–327.