
Transcript of: Recurrent Transformer Networks for Semantic Correspondence (NeurIPS 2018)

Page 1

Neural Information Processing Systems (NeurIPS) 2018

Recurrent Transformer Networks for Semantic Correspondence

Seungryong Kim1, Stephen Lin2, Sangryul Jeon1, Dongbo Min3, Kwanghoon Sohn1

Dec. 05, 2018

1) Yonsei University  2) Microsoft Research  3) Ewha Womans University

Page 2

Semantic Correspondence

• Establishing dense correspondences between semantically similar images, i.e., different instances within the same object or scene categories

• For example, the wheels of two different cars, or the bodies of different people or animals


Introduction

Page 3

Challenges in Semantic Correspondence


Introduction

Photometric Deformations

• Intra-class appearance and attribute variations

• Etc.

Geometric Deformations

• Different viewpoint or baseline
• Non-rigid shape deformations
• Etc.

Lack of Supervision

• Labor-intensive annotation
• Degraded by subjectivity
• Etc.


Page 4

Objective

How to estimate locally-varying affine transformation fields without ground-truth supervision?


Problem Formulation

$\mathbf{T}_i = [\mathbf{A}_i, \mathbf{f}_i]$ for each pixel $i$

$i' = \mathbf{T}_i\, i$
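Below is a minimal sketch (PyTorch; the tensor layout and explicit grid construction are assumptions for illustration, not the authors' implementation) of how such a per-pixel affine field $[\mathbf{A}_i, \mathbf{f}_i]$ maps each source coordinate $i$ to its estimate $i' = \mathbf{T}_i\, i$:

```python
import torch

def apply_affine_field(A, f, H, W):
    """Map each pixel coordinate i to i' = A_i @ i + f_i.

    A: (H, W, 2, 2) per-pixel affine matrices A_i
    f: (H, W, 2)    per-pixel translations f_i
    Returns the transformed coordinates i', shape (H, W, 2).
    """
    ys, xs = torch.meshgrid(torch.arange(H, dtype=torch.float32),
                            torch.arange(W, dtype=torch.float32),
                            indexing="ij")
    coords = torch.stack([xs, ys], dim=-1)                # pixel coordinates i
    return torch.einsum("hwij,hwj->hwi", A, coords) + f   # i' = A_i i + f_i

# Sanity check: identity matrices and zero flow leave the grid unchanged.
H, W = 16, 16
A = torch.eye(2).expand(H, W, 2, 2)
f = torch.zeros(H, W, 2)
i_prime = apply_affine_field(A, f, H, W)
```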

Page 5

Methods for Geometric Invariance in the Feature Extraction Step

• UCN [Choy et al., NeurIPS’16]
• CAT-FCSS [Kim et al., TPAMI’18]
• Etc.


Background

Spatial Transformer Networks (STN)-based methods [Jaderberg et al., NeurIPS’15]:

• $\mathbf{A}_i$ is learned without ground-truth $\mathbf{A}_i^*$

• But $\mathbf{f}_i$ is learned with ground-truth $\mathbf{f}_i^*$

• Geometric inference is based on only the source or target image

Page 6

Methods for Geometric Invariance in the Regularization Step

• GMat. [Rocco et al., CVPR’17]
• GMat. w/Inl. [Rocco et al., CVPR’18]
• Etc.


Background

• $\mathbf{T}_i$ is learned without ground-truth $\mathbf{T}_i^*$, using self- or meta-supervision

• Geometric inference uses both the source and target images

• But only globally-varying geometric inference, applied to fixed, untransformed versions of the features

Page 7

Network Configuration

• Weaves together the advantages of STN-based methods and geometric matching methods by recursively estimating geometric transformation residuals using geometry-aligned feature activations


Recurrent Transformer Networks (RTNs)

Page 8

Feature Extraction Networks

• Input images $I^s$ and $I^t$ are passed through Siamese convolutional networks with shared parameters $\mathbf{W}_F$, such that $D_i = \mathcal{F}(I \,|\, \mathbf{W}_F)$

• Using CAT-FCSS, VGGNet (conv4-4), ResNet (conv4-23)
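As a rough illustration of the Siamese extraction (a sketch assuming a torchvision VGG-19 backbone truncated near conv4-4 and channel-wise L2 normalization; the cut index and normalization are assumptions, not the paper's exact configuration):

```python
import torch
import torch.nn as nn
from torchvision import models

class SiameseFeatureExtractor(nn.Module):
    """Shared-weight feature extractor D = F(I | W_F)."""
    def __init__(self):
        super().__init__()
        # VGG-19 truncated around conv4-4; index 27 (~relu4_4) is an assumption.
        vgg = models.vgg19(weights=None)  # pretrained weights would normally be loaded
        self.backbone = nn.Sequential(*list(vgg.features.children())[:27])

    def forward(self, image):
        feat = self.backbone(image)                   # (B, C, H', W')
        return nn.functional.normalize(feat, dim=1)   # L2-normalize channel vectors

extractor = SiameseFeatureExtractor()
I_s, I_t = torch.randn(1, 3, 240, 240), torch.randn(1, 3, 240, 240)
D_s, D_t = extractor(I_s), extractor(I_t)   # the same weights W_F for both images
```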


Recurrent Transformer Networks (RTNs)

Page 9

Recurrent Geometric Matching Networks

• Constrained correlation volume construction:

$$C(D_i^s, D^t(\mathbf{T}_j)) = \langle D_i^s, D^t(\mathbf{T}_j)\rangle \,/\, \big\|\langle D_i^s, D^t(\mathbf{T}_j)\rangle\big\|_2$$
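A small sketch of this correlation (PyTorch; the candidate-window layout, and reading the denominator as an L2 norm over the candidates $j \in M_i$, are assumptions made for illustration):

```python
import torch

def correlation_volume(D_s, D_t_warped, eps=1e-6):
    """Constrained correlation volume C(D_i^s, D^t(T_j)).

    D_s:        (B, C, H, W)    source features
    D_t_warped: (B, C, K, H, W) target features sampled under the K candidate
                transformations T_j of each search window M_i
    Returns:    (B, K, H, W)    correlations, L2-normalized over the K candidates.
    """
    corr = torch.einsum("bchw,bckhw->bkhw", D_s, D_t_warped)  # <D_i^s, D^t(T_j)>
    norm = corr.norm(p=2, dim=1, keepdim=True).clamp(min=eps)
    return corr / norm

# Toy usage with K = 9 candidate transformations per pixel.
B, C, K, H, W = 1, 64, 9, 20, 20
C_vol = correlation_volume(torch.randn(B, C, H, W), torch.randn(B, C, K, H, W))
```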


Recurrent Transformer Networks (RTNs)

(Figure: source and target images)

Page 10

Recurrent Geometric Matching Networks

• Recurrent geometric inference

$$\mathbf{T}_i^k - \mathbf{T}_i^{k-1} = \mathcal{F}\big(C(D_i^s, D^t(\mathbf{T}_i^{k-1}))\,\big|\,\mathbf{W}_G\big)$$
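A sketch of this recurrence (PyTorch), reusing `correlation_volume` from the previous sketch and assuming a hypothetical `sample_target_features(D_t, T)` that warps the target features by the current field, plus a small CNN `geometry_net` standing in for $\mathcal{F}(\cdot \,|\, \mathbf{W}_G)$; the 6-parameter affine layout is also an assumption:

```python
import torch

def recurrent_geometric_inference(D_s, D_t, geometry_net,
                                  sample_target_features, num_iters=4):
    """Iteratively refine the per-pixel transformation field T.

    Each iteration predicts a residual,
        T^k = T^{k-1} + F(C(D^s, D^t(T^{k-1})) | W_G),
    so the correlation is always computed on geometry-aligned features.
    """
    B, _, H, W = D_s.shape
    # Start from the identity transform: A = I, f = 0, stored as 6 affine
    # parameters per pixel laid out as [a11, a12, tx, a21, a22, ty] (assumed).
    T = torch.zeros(B, 6, H, W, device=D_s.device)
    T[:, 0], T[:, 4] = 1.0, 1.0

    for _ in range(num_iters):
        D_t_warped = sample_target_features(D_t, T)   # D^t(T^{k-1})
        corr = correlation_volume(D_s, D_t_warped)    # C(D^s, D^t(T^{k-1}))
        T = T + geometry_net(corr)                    # add the predicted residual
    return T
```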


Recurrent Transformer Networks (RTNs)

(Figure: source image, target image, and alignment results over iterations 1–4)

Page 11

Weakly-supervised Learning

• Intuition: the matching score between the source feature $D_i^s$ at each pixel $i$ and the target feature $D^t(\mathbf{T}_i)$ should be maximized while keeping the scores of other candidates low

• Loss function:

$$L(D_i^s, D^t(\mathbf{T})) = -\sum_{j \in M_i} p_j^* \log\big(p(D_i^s, D^t(\mathbf{T}_j))\big)$$

where $p(D_i^s, D^t(\mathbf{T}_j))$ is a softmax probability,

$$p(D_i^s, D^t(\mathbf{T}_j)) = \frac{\exp\big(C(D_i^s, D^t(\mathbf{T}_j))\big)}{\sum_{l \in M_i} \exp\big(C(D_i^s, D^t(\mathbf{T}_l))\big)}$$

and $p_j^*$ denotes a class label defined as 1 if $j = i$, and 0 otherwise
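A minimal sketch of this loss (PyTorch; treating the window center index as the positive candidate $j = i$ is an assumption made for illustration, not taken from the paper's code):

```python
import torch
import torch.nn.functional as F

def weakly_supervised_loss(C_vol):
    """Cross-entropy over each constrained window M_i.

    C_vol: (B, K, H, W) correlation scores C(D_i^s, D^t(T_j)) for the K
           candidates j in M_i. The label p_j^* is 1 only for the candidate
           corresponding to j = i, assumed here to be the window center.
    """
    B, K, H, W = C_vol.shape
    logits = C_vol.permute(0, 2, 3, 1).reshape(-1, K)   # one window per row
    target = torch.full((logits.shape[0],), K // 2,
                        dtype=torch.long, device=C_vol.device)
    # softmax + negative log-likelihood == -sum_j p_j^* log p(D_i^s, D^t(T_j))
    return F.cross_entropy(logits, target)

# Usage: the loss is driven purely by image pairs, with no keypoint annotations.
loss = weakly_supervised_loss(torch.randn(2, 9, 20, 20, requires_grad=True))
loss.backward()
```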


Recurrent Transformer Networks (RTNs)

Page 12

Results on the TSS Benchmark


Experimental Results

(Figure: qualitative comparison of source images, target images, SCNet [Han et al., ICCV’17], GMat. w/Inl. [Rocco et al., CVPR’18], and RTNs)

Page 13

Results on the PF-PASCAL Benchmark


Experimental Results

(Figure: qualitative comparison of source images, target images, SCNet [Han et al., ICCV’17], GMat. w/Inl. [Rocco et al., CVPR’18], and RTNs)

Page 14

Results on the PF-PASCAL Benchmark


Experimental Results

(Figure: qualitative comparison of source images, target images, SCNet [Han et al., ICCV’17], GMat. w/Inl. [Rocco et al., CVPR’18], and RTNs)

Page 15


Concluding Remarks

• RTNs learn to infer locally-varying geometric fields for semantic correspondence in an end-to-end and weakly-supervised fashion

• The key idea is to utilize and iteratively refine the transformations and convolutional activations through matching between the image pair

• A technique is presented for weakly-supervised training of RTNs

Page 16

Seungryong Kim, Ph.D.
Digital Image Media Lab.
Yonsei University, Seoul, Korea
Tel: +82-2-2123-2879
E-mail: [email protected]
Homepage: http://diml.yonsei.ac.kr/~srkim/

Thank you! See you at 210 & 230 AB #119