Fisher metric, support stability and optimal number of ... · Joint work with Clarice Poon...
Transcript of Fisher metric, support stability and optimal number of ... · Joint work with Clarice Poon...
Fisher metric, support stability and optimal number of measurements in
compressive off-the-grid recovery
Nicolas KerivenEcole Normale Supérieure (Paris)
CFM-ENS chair in Data Science
Joint work with Clarice Poon (Cambridge Uni.), Gabriel Peyré (ENS)
Discrete compressive sensing
1/14
Discrete compressive sensing
1/14
Discrete compressive sensing
1/14
• Signal: vector
Discrete compressive sensing
1/14
• Signal: vector• Sparsity: few non-zeros coefficients
Discrete compressive sensing
1/14
• Signal: vector• Sparsity: few non-zeros coefficients• Dimensionality reduction (often random matrix)
Discrete compressive sensing
1/14
• Signal: vector• Sparsity: few non-zeros coefficients• Dimensionality reduction (often random matrix)• Recovery: convex relaxation LASSO
Discrete compressive sensing
1/14
Continuous sparsity ?
2/14
Continuous sparsity ?
2/14
- Fourier coeff- Convolution
Continuous sparsity ?
2/14
- Fourier coeff- Convolution
Continuous sparsity ?
2/14
- Fourier coeff- Convolution
Continuous sparsity ?
• Signal: Radon measure
2/14
- Fourier coeff- Convolution
Continuous sparsity ?
• Signal: Radon measure• Sparsity:
2/14
- Fourier coeff- Convolution
Continuous sparsity ?
• Signal: Radon measure• Sparsity:• Dimensionality reduction (e.g. first Fourier coefficients)
2/14
- Fourier coeff- Convolution
Continuous sparsity ?
• Signal: Radon measure• Sparsity:• Dimensionality reduction (e.g. first Fourier coefficients)• Recovery: convex relaxation?
See Keriven 2017, Gribonval 2017
2/14
- Fourier coeff- Convolution
Continuous sparsity ?
• Signal: Radon measure• Sparsity:• Dimensionality reduction (e.g. first Fourier coefficients)• Recovery: convex relaxation? BLASSO [De Castro, Gamboa 2012]
See Keriven 2017, Gribonval 2017
2/14
- Fourier coeff- Convolution
Continuous sparsity ?
• Signal: Radon measure• Sparsity:• Dimensionality reduction (e.g. first Fourier coefficients)• Recovery: convex relaxation? BLASSO [De Castro, Gamboa 2012]
Other approaches: « Prony-like » ESPRIT, MUSIC… (but only 1d noiseless Fourier)
See Keriven 2017, Gribonval 2017
2/14
Example of applications
Fluorescence microscopy (3D) PALM, STORM… [Betzig 2006]
3/14
Example of applications
Fluorescence microscopy (3D) PALM, STORM… [Betzig 2006]
Astronomy (2D)[Puschmann 2017]
3/14
Example of applications
Fluorescence microscopy (3D) PALM, STORM… [Betzig 2006]
Compressive mixture model learning(many D)[Keriven 2017]
Astronomy (2D)[Puschmann 2017]
3/14
Example of applications
Fluorescence microscopy (3D) PALM, STORM… [Betzig 2006]
Compressive mixture model learning(many D)[Keriven 2017]
Astronomy (2D)[Puschmann 2017]
• Neuro-imaging with EEG (3D) [Gramfort 2013]
• 1-layer neural network (many D) [Bach 2017]
• Radar• Geophysics• …
3/14
A seminal result [Candès, Fernandez-Granda 2012]
4/14
A seminal result [Candès, Fernandez-Granda 2012]
First Fourier coefficients
4/14
A seminal result [Candès, Fernandez-Granda 2012]
First Fourier coefficients
Inverse Fourier
4/14
A seminal result [Candès, Fernandez-Granda 2012]
First Fourier coefficients
Inverse Fourier
BLASSO
4/14
A seminal result [Candès, Fernandez-Granda 2012]
First Fourier coefficients
• Regular Fourier on the Torus
• Minimal separation
Inverse Fourier
BLASSO
4/14
A seminal result [Candès, Fernandez-Granda 2012]
First Fourier coefficients
• Regular Fourier on the Torus
• Minimal separation
• Reconstruction: formulated as SDP (other: Frank-Wolfe,
greedy, Prony-like…)
Inverse Fourier
BLASSO
4/14
A seminal result [Candès, Fernandez-Granda 2012]
First Fourier coefficients
• Regular Fourier on the Torus
• Minimal separation
• Reconstruction: formulated as SDP (other: Frank-Wolfe,
greedy, Prony-like…)
• Noise: weak convergence (mass of concentrated
around true Diracs)
Inverse Fourier
BLASSO
4/14
A seminal result [Candès, Fernandez-Granda 2012]
First Fourier coefficients
• Regular Fourier on the Torus
• Minimal separation
• Reconstruction: formulated as SDP (other: Frank-Wolfe,
greedy, Prony-like…)
• Noise: weak convergence (mass of concentrated
around true Diracs)
Inverse Fourier
BLASSO
4/14
Many extensions since…
Previous work, contribution
5/14
Relevant previous works:
• [Tang, Recht 2013]: random Fourier coefficients are sufficient
Previous work, contribution
5/14
Relevant previous works:
• [Tang, Recht 2013]: random Fourier coefficients are sufficient• Random signs assumption• 1D discrete Fourier
Previous work, contribution
5/14
Relevant previous works:
• [Tang, Recht 2013]: random Fourier coefficients are sufficient• Random signs assumption• 1D discrete Fourier
• [Bendory et al. 2016]: extension to other measurement operators
Previous work, contribution
5/14
Relevant previous works:
• [Tang, Recht 2013]: random Fourier coefficients are sufficient• Random signs assumption• 1D discrete Fourier
• [Bendory et al. 2016]: extension to other measurement operators• Minimal separation does not take into account geometry of the meas. operator
Previous work, contribution
5/14
Relevant previous works:
• [Tang, Recht 2013]: random Fourier coefficients are sufficient• Random signs assumption• 1D discrete Fourier
• [Bendory et al. 2016]: extension to other measurement operators• Minimal separation does not take into account geometry of the meas. operator
• [Duval, Peyré 2015]: In some cases, support stability in the small noise regime
Previous work, contribution
5/14
Relevant previous works:
• [Tang, Recht 2013]: random Fourier coefficients are sufficient• Random signs assumption• 1D discrete Fourier
• [Bendory et al. 2016]: extension to other measurement operators• Minimal separation does not take into account geometry of the meas. operator
• [Duval, Peyré 2015]: In some cases, support stability in the small noise regime• Noise level under which support stability is achieved?
Previous work, contribution
5/14
Relevant previous works:
• [Tang, Recht 2013]: random Fourier coefficients are sufficient• Random signs assumption• 1D discrete Fourier
• [Bendory et al. 2016]: extension to other measurement operators• Minimal separation does not take into account geometry of the meas. operator
• [Duval, Peyré 2015]: In some cases, support stability in the small noise regime• Noise level under which support stability is achieved?
Contributions:
• Generalize to many multi-d measurement operators, express the minimal separation as a geometry-aware Fisher metric
Previous work, contribution
5/14
Relevant previous works:
• [Tang, Recht 2013]: random Fourier coefficients are sufficient• Random signs assumption• 1D discrete Fourier
• [Bendory et al. 2016]: extension to other measurement operators• Minimal separation does not take into account geometry of the meas. operator
• [Duval, Peyré 2015]: In some cases, support stability in the small noise regime• Noise level under which support stability is achieved?
Contributions:
• Generalize to many multi-d measurement operators, express the minimal separation as a geometry-aware Fisher metric
• 1: Remove the random sign assumption (weak convergence)
Previous work, contribution
5/14
Relevant previous works:
• [Tang, Recht 2013]: random Fourier coefficients are sufficient• Random signs assumption• 1D discrete Fourier
• [Bendory et al. 2016]: extension to other measurement operators• Minimal separation does not take into account geometry of the meas. operator
• [Duval, Peyré 2015]: In some cases, support stability in the small noise regime• Noise level under which support stability is achieved?
Contributions:
• Generalize to many multi-d measurement operators, express the minimal separation as a geometry-aware Fisher metric
• 1: Remove the random sign assumption (weak convergence)
• 2: Prove support stability when (with random signs)
Outline
Background on dual certificates
Minimal separation and Fisher metric
Conclusion, outlooks
Main results, applications
Dual certificates
6/14
Random linear operator:
Dual certificates
6/14
Noisy measurement:Random linear operator:
Dual certificates
6/14
The BLASSO problem:
Noisy measurement:Random linear operator:
Dual certificates
First-order conditions
solution ofBLASSO
6/14
The BLASSO problem:
Noisy measurement:Random linear operator:
solution of
Dual certificates
First-order conditions
Dual certificate (noiseless case)
solution ofBLASSO
6/14
The BLASSO problem:
Noisy measurement:Random linear operator:
What is a dual certificate ?
What does it look like ?
7/14
What is a dual certificate ?
What does it look like ?
7/14
Case :
What is a dual certificate ?
What does it look like ?
7/14
Case :
What is a dual certificate ?
What does it look like ?
7/14
Case :
What is a dual certificate ?
What does it look like ?
7/14
Ensures uniqueness and robustness…
Non-degenerate dual certif.
Case :
What is a dual certificate ?
What does it look like ?
7/14
Ensures uniqueness and robustness…
Non-degenerate dual certif.
Intuitively:
Larger -> easierinterpolation
Proof strategy
8/14
Step 1: Study the limit case to derive an appropriate notion of minimal separation
Proof strategy
8/14
Step 1: Study the limit case to derive an appropriate notion of minimal separation
Step 2: bound the deviation for finite number of measurements
Proof strategy
8/14
Step 1: Study the limit case to derive an appropriate notion of minimal separation
Step 2: bound the deviation for finite number of measurements
Proof strategy
m=10
8/14
Step 1: Study the limit case to derive an appropriate notion of minimal separation
Step 2: bound the deviation for finite number of measurements
Proof strategy
m=10 m=50
8/14
Step 1: Study the limit case to derive an appropriate notion of minimal separation
Step 2: bound the deviation for finite number of measurements
Proof strategy
m=500m=10 m=50
8/14
Step 1: Study the limit case to derive an appropriate notion of minimal separation
Step 2: bound the deviation for finite number of measurements
Proof strategy
m=500m=10 m=50
8/14
Step 1: Study the limit case to derive an appropriate notion of minimal separation
Step 2: bound the deviation for finite number of measurements
Step 3: recovery• Adaptation of [Azaïs 2015] for weak convergence• Quantitative Implicit Function Theorem [Denoyelle 2015] for support stability
Outline
Background on dual certificates
Minimal separation and Fisher metric
Conclusion, outlooks
Main results, applications
Limit covariance kernels
9/14
How to construct a certificate ?Study limit covariance kernel when :
Limit covariance kernels
9/14
How to construct a certificate ?Study limit covariance kernel when :
Sub-sampled version:
Limit covariance kernels
9/14
How to construct a certificate ?Study limit covariance kernel when :
Strategy under minimal separation
Sub-sampled version:
Limit covariance kernels
9/14
How to construct a certificate ?Study limit covariance kernel when :
Strategy under minimal separation
Sub-sampled version:
Limit covariance kernels
Min. Separation
9/14
How to construct a certificate ?Study limit covariance kernel when :
Strategy under minimal separation
Sub-sampled version:
Limit covariance kernels
Min. Separation
1: kernel at each saturation point
9/14
How to construct a certificate ?Study limit covariance kernel when :
Strategy under minimal separation
Sub-sampled version:
Limit covariance kernels
Min. Separation
2: Small adjustements(minimal separation)
1: kernel at each saturation point
9/14
How to construct a certificate ?Study limit covariance kernel when :
Strategy under minimal separation
Sub-sampled version:
Minimal separation and Fisher metric
10/14
Which metric for separation ?
Minimal separation and Fisher metric
10/14
Which metric for separation ?
Classical case: translation-invariant kernel
Minimal separation and Fisher metric
10/14
Which metric for separation ?
Classical case: translation-invariant kernel
natural
Minimal separation and Fisher metric
10/14
Which metric for separation ?
Classical case: translation-invariant kernel Non translation-inv. ?
Kernel for microscopynatural
Minimal separation and Fisher metric
10/14
Which metric for separation ?
Classical case: translation-invariant kernel Non translation-inv. ?
Kernel for microscopy
Riemannian metric associated to a kernel [Amari 99]:
: metric tensor : geodesic distance
natural
Minimal separation and Fisher metric
10/14
Which metric for separation ?
Classical case: translation-invariant kernel Non translation-inv. ?
Kernel for microscopy
Riemannian metric associated to a kernel [Amari 99]:
: metric tensor : geodesic distance
Thm: under some hypothesis, for , there exists non-degenerate
natural
Examples
16/04/2018
Features
Fisher metric and minimal separation
Kernel
11/14
Examples
16/04/2018
Features
Fisher metric and minimal separation
Discrete Fourier on Torus:Féjer kernel
Kernel
11/14
Examples
16/04/2018
Features
Fisher metric and minimal separation
Discrete Fourier on Torus:Féjer kernel
Kernel
11/14
Examples
16/04/2018
Features
Fisher metric and minimal separation
Discrete Fourier on Torus:Féjer kernel
Kernel
11/14
Examples
16/04/2018
Features
Fisher metric and minimal separation
Discrete Fourier on Torus:Féjer kernel
Continuous Gaussian Fourier:Gaussian kernel
Kernel
11/14
Examples
16/04/2018
Features
Fisher metric and minimal separation
Discrete Fourier on Torus:Féjer kernel
Continuous Gaussian Fourier:Gaussian kernel
Kernel
11/14
Examples
16/04/2018
Features
Fisher metric and minimal separation
Discrete Fourier on Torus:Féjer kernel
Continuous Gaussian Fourier:Gaussian kernel
Kernel
11/14
Examples
16/04/2018
Features
Fisher metric and minimal separation
Discrete Fourier on Torus:Féjer kernel
Continuous Gaussian Fourier:Gaussian kernel
Microscopy (Laplace transform):
Kernel
11/14
Examples
16/04/2018
Features
Fisher metric and minimal separation
Discrete Fourier on Torus:Féjer kernel
Continuous Gaussian Fourier:Gaussian kernel
Microscopy (Laplace transform):
Kernel
11/14
Examples
16/04/2018
Features
Fisher metric and minimal separation
Discrete Fourier on Torus:Féjer kernel
Continuous Gaussian Fourier:Gaussian kernel
Microscopy (Laplace transform):
Kernel
11/14
Outline
Background on dual certificates
Minimal separation and Fisher metric
Conclusion, outlooks
Main results, applications
Main results
Thm: Eliminating the random signs
12/14
Main results
Thm: Eliminating the random signs
Depend on kernel
12/14
Main results
Thm: Eliminating the random signs
• The recovered measure concentrate around true Diracs
Depend on kernel
12/14
Main results
Thm: Eliminating the random signs
• The recovered measure concentrate around true Diracs
• Proof: golfing scheme [Gross 2009, Candès Plan 2011]
Depend on kernel
12/14
Main results
Thm: Eliminating the random signs
• The recovered measure concentrate around true Diracs
• Proof: golfing scheme [Gross 2009, Candès Plan 2011]
Thm: Support stability
Depend on kernel
12/14
Main results
Thm: Eliminating the random signs
• The recovered measure concentrate around true Diracs
• Proof: golfing scheme [Gross 2009, Candès Plan 2011]
Thm: Support stability
Depend on kernel
With random signs Without random signs
12/14
Main results
Thm: Eliminating the random signs
• The recovered measure concentrate around true Diracs
• Proof: golfing scheme [Gross 2009, Candès Plan 2011]
Thm: Support stability
Depend on kernel
With random signs Without random signs
12/14
• Quantified small noise : if , then:
Main results
Thm: Eliminating the random signs
• The recovered measure concentrate around true Diracs
• Proof: golfing scheme [Gross 2009, Candès Plan 2011]
Thm: Support stability
Depend on kernel
With random signs Without random signs
12/14
• Quantified small noise : if , then:
• The recovered measure is formed of exactly Diracs
Applications
16/04/2018
• Féjer kernel (discrete Fourier):
13/14
Applications
16/04/2018
• Féjer kernel (discrete Fourier):
• Microscopy with Laplace transform:
13/14
Applications
16/04/2018
• Féjer kernel (discrete Fourier):
• Microscopy with Laplace transform:
• Application: Gaussian Mixture Model learning
13/14
Applications
16/04/2018
• Féjer kernel (discrete Fourier):
• Microscopy with Laplace transform:
• Application: Gaussian Mixture Model learning
• Given
13/14
Applications
16/04/2018
• Féjer kernel (discrete Fourier):
• Microscopy with Laplace transform:
• Application: Gaussian Mixture Model learning
• Given
• Compute with Gaussian
13/14
(streaming, distributed)
Applications
16/04/2018
• Féjer kernel (discrete Fourier):
• Microscopy with Laplace transform:
• Application: Gaussian Mixture Model learning
• Given
• Compute with Gaussian
• Solve the BLASSO with as characteristic function of a Gaussian
13/14
(streaming, distributed)
Applications
16/04/2018
• Féjer kernel (discrete Fourier):
• Microscopy with Laplace transform:
• Application: Gaussian Mixture Model learning
• Given
• Compute with Gaussian
• Solve the BLASSO with as characteristic function of a Gaussian
Then: if
The BLASSO yields exactly s Diracs: non-asymptotic model selection !
13/14
(streaming, distributed)
Outline
Background on dual certificates
Minimal separation and Fisher metric
Conclusion, outlooks
Main results, applications
Summary, outlooks
• Summary: generalization of existing results on super-resolutionwith random measurements (and minimal separation)
• Introduction of the kernel Fisher metric to measure minimal separation• Application in particular to a non-translation-invariant example for microscopy
• No need of random signs for weak convergence (golfing scheme)• Quantitative support stability
14/14
Summary, outlooks
• Summary: generalization of existing results on super-resolutionwith random measurements (and minimal separation)
• Introduction of the kernel Fisher metric to measure minimal separation• Application in particular to a non-translation-invariant example for microscopy
• No need of random signs for weak convergence (golfing scheme)• Quantitative support stability
14/14
• Outlooks• Implication of support stability for algorithms ? (active field)• Better characterization of the « universality » of the geodesic distance• More quantified treatment of dimension• Other practical applications (eg 1-layer neural networks with continuum of
neurons [Bach 2017])
Poon, Keriven, Peyré. A Dual Certificates Analysis of Compressive Off-the-Grid Recovery. Preprint arxiv:1802.08464
Poon, Keriven, Peyré. Support Localization and the Fisher Metric for off-the-grid Sparse Regularization. Preprint arxiv:1810.03340
Thank you !
14/14
data-ens.github.ioEnter the data challenges!Come to the colloquium!Come to the Laplace seminars!