


Learning Sparse High Dimensional Filters: Image Filtering, Dense CRFs and Bilateral Neural Networks Varun Jampani1, Martin Kiefel1,2, Peter V. Gehler1,2

1MPI for Intelligent Systems, Tübingen; 2Bernstein Center for Computational Neuroscience, Tübingen

{varun.jampani, martin.kiefel, peter.gehler}@tuebingen.mpg.de Perceiving Systems – ps.is.tue.mpg.de

Sparse High Dimensional Filtering 1

For example, bilateral filtering of an input $v$ with features $f$:

A common choice is pixel position and color: $f = (x, y, r, g, b)$. Using position features $f = (x, y)$, the filter reduces to a standard spatial Gaussian convolution.

Figure: Bilateral filters are image dependent. Bilateral filter output, $f = (x, y, r, g, b)$, vs. spatial Gauss filter output, $f = (x, y)$.

$$v'_i = \sum_{j \in \mathcal{N}} e^{-\frac{1}{2\sigma^2}\,\|f_i - f_j\|^2}\, v_j$$
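As a concrete illustration, the sum above can be evaluated brute force. The sketch below is a plain NumPy, O(n²) version (the function name `bilateral_filter` and the toy 1-D signal are ours, not from the released code); it shows how the choice of features switches between a spatial Gaussian blur and an edge-preserving filter:

```python
import numpy as np

def bilateral_filter(v, f, sigma=1.0):
    """Brute-force bilateral filter: v'_i = sum_j exp(-||f_i-f_j||^2 / (2 sigma^2)) v_j.

    v: (n, c) input values; f: (n, d) feature vectors, e.g. (x, y, r, g, b)."""
    # Pairwise squared feature distances, shape (n, n).
    d2 = ((f[:, None, :] - f[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    # Normalize rows so weights sum to 1 (common in practice; the poster's
    # equation omits the normalization).
    w /= w.sum(axis=1, keepdims=True)
    return w @ v

# Toy 1-D signal with a sharp edge: position-only features give a plain
# Gaussian blur; adding the signal value as a feature preserves the edge.
x = np.arange(10, dtype=float)
v = np.where(x < 5, 0.0, 1.0)[:, None]
blur = bilateral_filter(v, x[:, None], 2.0)                  # f = (x,)
edge = bilateral_filter(v, np.stack([x, 5 * v[:, 0]], 1), 2.0)  # f = (x, 5v)
```

Near the edge the value feature suppresses cross-edge weights, so `edge` stays much closer to the step than `blur`.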

We learn image adaptive filters. This generalizes DenseCRF and enables bilateral CNNs. Code: http://bilateralnn.is.tuebingen.mpg.de

Single Filter Applications

1. Joint Bilateral Upsampling: Upsample a low-resolution result using a high-resolution guidance image [3].

Figure: Sample results of color upsampling (above) and depth upsampling (below). Panels: Input, Guidance, Ground Truth, Bicubic Upsampling, Gauss Bilateral Upsampling, Learned Bilateral Upsampling.

Bilateral Neural Networks

References:
1. Aurich, V., & Weule, J. Non-linear Gaussian filters performing edge preserving diffusion. In Mustererkennung, 1995.
2. Adams, A., Baek, J., & Davis, M. A. Fast high-dimensional filtering using the permutohedral lattice. Computer Graphics Forum, 29(2), 2010.
3. Kopf, J., et al. Joint bilateral upsampling. ACM Transactions on Graphics (TOG), 26(3), 2007.
4. Krähenbühl, P., & Koltun, V. Efficient inference in fully connected CRFs with Gaussian edge potentials. In NIPS, 2011.
5. Everingham, M., et al. The PASCAL visual object classes (VOC) challenge. IJCV, 88(2), 2010.
6. Bell, S., et al. Material recognition in the wild with the Materials in Context database. In CVPR, 2015.
7. Chen, L.-C., et al. Semantic image segmentation with deep convolutional nets and fully connected CRFs. In ICLR, 2015.

Learning Permutohedral Lattice Filters 2

We use the permutohedral lattice from [2].

Figure: Bilateral filtering using the permutohedral lattice: Splat → Convolve (Gauss filter) → Slice.

$$v' = S_{\mathrm{slice}}\, W\, S_{\mathrm{splat}}\, v$$
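A minimal sketch of the splat-convolve-slice factorization, with two simplifications flagged in the comments: points are splatted to the nearest cell of a regular grid rather than barycentrically onto the permutohedral lattice, and the filter W is a fixed Gaussian rather than learned. The helper name `make_splat_slice` is ours.

```python
import numpy as np

def make_splat_slice(f, cell=1.0):
    """Build S_splat / S_slice for a simplified lattice: each point is assigned
    to the nearest cell of a regular grid (the real method uses barycentric
    weights on a permutohedral lattice; this nearest-cell version is a sketch)."""
    keys = np.floor(f / cell).astype(int)
    verts, idx = np.unique(keys, axis=0, return_inverse=True)
    idx = idx.ravel()
    n, m = len(f), len(verts)
    S_slice = np.zeros((n, m))
    S_slice[np.arange(n), idx] = 1.0                      # each point reads its cell
    S_splat = S_slice.T / S_slice.sum(axis=0)[:, None]    # cell = mean of its points
    return S_splat, S_slice, verts.astype(float)

rng = np.random.default_rng(0)
f = rng.uniform(0, 4, size=(50, 2))          # features, e.g. (x, y)
v = rng.normal(size=(50, 1))                 # input values

S_splat, S_slice, verts = make_splat_slice(f)
# W filters between lattice cells. A fixed Gaussian here; the paper learns W.
d2 = ((verts[:, None, :] - verts[None, :, :]) ** 2).sum(-1)
W = np.exp(-0.5 * d2)
W /= W.sum(axis=1, keepdims=True)
v_out = S_slice @ (W @ (S_splat @ v))        # v' = S_slice W S_splat v
```

Because splat averages, W is row-normalized, and slice reads single cells, every output value is a convex combination of inputs, so the output stays within the input range.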

Conclusion

We propose a filter parameterization and learning technique for sparse and high dimensional data. This can be used for fast learnable bilateral filtering or general high dimensional filtering.

Figure: Faster and better convergence with BNNs.

Horizontal and vertical stacking of high-dimensional filters with end-to-end learning.

Enables Bilateral Neural Networks (BNN): High dimensional data; unordered (sparse) set of inputs; needs no grid layout.

Example: Character recognition with hand-written input. •  Splat and filter only sparse foreground points.

Figure: Permutohedral filter bank.

Generalization to non-separable fully parameterized filters results in runtime linear in feature dimensions.

Table: CPU/GPU runtime (in ms) comparisons, d-dim Caffe vs. Bilateral Convolutional Layer (BCL).

Figure: 3D mesh denoising using learned bilateral filtering. Panels: Ground Truth, Noisy Mesh, Denoised Mesh.

2. 3D Mesh Denoising

Problem: Learn filters in high dimensional space, given data $\{(f_i, v_i)\}_{i=1}^n$ with values $v$ and features $f$.

Figure: Lattice visualization on an image with features $f = (x, y)$, $f = (r, g, b)$, and $f = (x, y, r, g, b)$. Pixels in the same simplex are shown with the same color.

Generalization of DenseCRF 4

Parameterize the filter kernel instead of fixing it to a Gaussian. We also generalize to non-separable filters.

Learn permutohedral filter via stochastic gradient descent.

Splat, Convolve and Slice are linear operations and differentiable.

Gradients of a scalar loss $L$ with respect to the input $v$ and the filter weights $W$ (primes denote transposed operations):

$$\frac{\partial L}{\partial v} = S'_{\mathrm{splat}}\, W'\, S'_{\mathrm{slice}}\, \frac{\partial L}{\partial v'} \qquad \frac{\partial L}{\partial (W)_{i,j}} = \Big( S'_{\mathrm{slice}}\, \frac{\partial L}{\partial v'} \Big)_i \big( S_{\mathrm{splat}}\, v \big)_j$$

Consequences:
•  Fast and learnable high dimensional filtering.
•  Image adaptive filtering.
•  Filtering of unordered sets of points (e.g. sparse 3D points).
•  Input and output points can be different.
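Because all three stages are linear, the backward pass is just the transposed pipeline. A NumPy sketch (small random dense matrices standing in for the sparse lattice operators, a hypothetical squared-error loss) that verifies both gradient expressions against finite differences:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 6, 4
S_splat = rng.random((m, n))   # points -> lattice
W       = rng.random((m, m))   # filtering on the lattice
S_slice = rng.random((n, m))   # lattice -> points
v = rng.random((n, 1))
target = rng.random((n, 1))

forward = lambda v, W: S_slice @ W @ S_splat @ v
loss = lambda v, W: 0.5 * ((forward(v, W) - target) ** 2).sum()

# Analytic gradients (primes in the poster = transposes):
dL_dvp = forward(v, W) - target                   # dL/dv'
dL_dv = S_splat.T @ W.T @ S_slice.T @ dL_dvp      # dL/dv
dL_dW = (S_slice.T @ dL_dvp) @ (S_splat @ v).T    # dL/dW_ij

# Finite-difference check on dL/dv:
eps = 1e-6
g = np.zeros_like(v)
for i in range(n):
    e = np.zeros_like(v)
    e[i] = eps
    g[i] = (loss(v + e, W) - loss(v - e, W)) / (2 * eps)
```

The finite-difference gradient `g` agrees with the analytic `dL_dv`, which is exactly why the whole pipeline can be trained with stochastic gradient descent.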

Mean-field inference in densely connected CRFs is tractable [4]. It reduces to bilateral filtering:

In [4]: Gaussian edge potentials (the potential $\psi_p$ below).

Here: Learn pairwise potentials with back-propagation through truncated message passing.

Table: IoU on Pascal VOC12 [5] val./test dataset with DeepLab unaries [7].

Table: Total/Class accuracy on MINC dataset [6] with AlexNet unaries [6].

Figure: Sample result of material segmentation. Panels: Input, Ground Truth, AlexNet [6], +2stepMF-GaussCRF, +2stepMF-LearnedCRF.

$$\underbrace{q_i^{t+1}(x_i)}_{\text{beliefs}} = \frac{1}{Z_i}\exp\Big\{-\underbrace{\psi_u(x_i)}_{\text{unary}} - \sum_{l\in\mathcal{L}} \underbrace{\sum_{j\neq i} \overbrace{\psi_p^{ij}(x_i, l)}^{\text{pairwise}}\, q_j^t(l)}_{\text{bilateral filtering}}\Big\}$$
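A toy dense, O(n²) sketch of truncated mean-field inference with the Gaussian edge potential; here the label compatibility is fixed to a Potts model and the feature covariance to the identity, whereas the poster's point is that these pairwise terms are learned by back-propagation. The function name `mean_field_step` is ours.

```python
import numpy as np

def mean_field_step(q, unary, f, mu, Sigma_inv):
    """One mean-field update of a dense CRF: message passing is bilateral
    filtering of the current beliefs q (n, L), followed by a label
    compatibility transform mu (L, L) and a softmax re-normalization."""
    diff = f[:, None, :] - f[None, :, :]
    # Gaussian edge potential k_ij = exp(-(f_i - f_j)^T Sigma^-1 (f_i - f_j)).
    k = np.exp(-np.einsum('ijd,de,ije->ij', diff, Sigma_inv, diff))
    np.fill_diagonal(k, 0.0)                  # sum runs over j != i
    msg = k @ q                               # bilateral filtering of beliefs
    logits = -unary - msg @ mu.T              # apply compatibility mu(l, l')
    e = np.exp(logits - logits.max(1, keepdims=True))
    return e / e.sum(1, keepdims=True)

n, L, d = 8, 3, 2
rng = np.random.default_rng(2)
f = rng.random((n, d))
unary = rng.random((n, L))
mu = 1.0 - np.eye(L)                          # Potts; learned in the paper
q = np.full((n, L), 1.0 / L)                  # uniform initial beliefs
for _ in range(2):                            # truncated: 2 message-passing steps
    q = mean_field_step(q, unary, f, mu, np.eye(d))
```

Truncating to a fixed number of steps is what makes the update a finite chain of differentiable operations, so gradients can flow back into the pairwise potentials.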


$$\psi_p^{ij}(x_i, x_j) = \mu(x_i, x_j)\, \exp\!\big(-(f_i - f_j)^{\top} \Sigma^{-1} (f_i - f_j)\big)$$


Figure: Filtering on grids in 1-D, 2-D, 3-D, and in N-D?