High Dimensional Convolutional Neural Networks · 2020-03-23 · Very deep convolutional neural...


High Dimensional Convolutional Neural Networks for 3D perception

Chris Choy, Ph.D. candidate @ Stanford Vision and Learning Lab

The Success of Convolutional Networks

AlexNet [Krizhevsky et al.]

R-CNN [Girshick et al.]

FCNN [Long et al.]

GAN [Goodfellow et al.]

The Success of Convolutional Networks

Experience

Versatility

Efficiency

Speech Recognition, Abdel-Hamid et al.

Machine Translation

Object Detection

Semantic Segmentation

Examples of 3D Vision Tasks

3D Reconstruction

3D Object Pose Estimation

3D Registration

3D Object Tracking

3D Vision in Action

Nvidia Research, 2019

Microsoft HoloLens

Amazon AR View

3D Perception

3D Reconstruction

3D Semantic Segmentation

Perception on a Set of 3D Data

3D Feature Learning

4D Spatio-Temporal Perception

4D and 6D for Registration

Supervised Reconstruction


3D Reconstruction

● 3D-Recurrent Reconstruction Neural Networks, Chris, Danfei, JunYoung, Kevin, Silvio, ECCV’16
● Universal Correspondence Networks, Chris, JunYoung, Silvio, Manmohan, NIPS’16
● Weakly supervised 3D Reconstruction with Adversarial Constraint, JunYoung, Chris, Manmohan, Animesh, Silvio, 3DV’17
● DeformNet: Free-Form Deformation Network for 3D Shape Reconstruction from a Single Image, Andrey, Jingwei, Animesh, Viraj, JunYoung, Chris, Silvio, WACV’18
● Text2Shape: Generating Shapes from Natural Language by Learning Joint Embeddings, Kevin, Chris, Manolis, Angel, Thomas, Silvio, ACCV’18
● 4D-Spatio Temporal ConvNets: Minkowski Convolutional Neural Networks, Chris, JunYoung, Silvio, CVPR’19

3D Reconstruction from Few Images

● Single or Multi-view images of an object
● Online retail stores

[Figure: Input Images, 3D Reconstruction]


3D Reconstruction from Few Images

● Wide baseline

● Specular / texture-less region

● Single view

3D Reconstruction

Observations (Images)

3D Representation

Algorithms

● Structure from Motion [Longuet-Higgins, Haming et al., Snavely et al., …]
● Depth Estimation [Eigen et al., Saxena et al., …]
● MVS
● Tomography
● Object-centric Reconstruction

3D Recurrent Reconstruction Neural Networks

● End-to-end 3D reconstruction
● Unified framework
● Single-view & Multi-view reconstruction
● 3D-Convolutional LSTM
● Update hidden states

Chris, Danfei, JunYoung, Kevin, Silvio, 3D-Recurrent Reconstruction Neural Networks, ECCV’16
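The "update hidden states" step can be sketched as a generic convolutional LSTM cell acting on a 3D hidden grid, with one update per input view. This is a minimal NumPy sketch and not the exact cell from the ECCV’16 paper; the function names (`conv3d`, `conv_lstm_step`), the grid sizes, and the random weights are all assumptions made for illustration.

```python
import numpy as np

def conv3d(x, w):
    """Naive zero-padded 3x3x3 convolution. x: (C_in, D, H, W), w: (C_out, C_in, 3, 3, 3)."""
    _, d, h, wd = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1), (1, 1)))
    out = np.zeros((w.shape[0], d, h, wd))
    for i in range(3):
        for j in range(3):
            for k in range(3):
                patch = xp[:, i:i + d, j:j + h, k:k + wd]
                out += np.einsum("oc,cdhw->odhw", w[:, :, i, j, k], patch)
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def conv_lstm_step(x, h, c, w, b):
    """One ConvLSTM update: all four gates come from a 3D convolution over [x, h]."""
    gates = conv3d(np.concatenate([x, h], axis=0), w) + b[:, None, None, None]
    i, f, o, g = np.split(gates, 4, axis=0)
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)   # keep old memory, add new evidence
    h_new = sigmoid(o) * np.tanh(c_new)
    return h_new, c_new

rng = np.random.default_rng(0)
c_in, c_hid, n = 2, 3, 4                                # hypothetical sizes
w = rng.normal(scale=0.1, size=(4 * c_hid, c_in + c_hid, 3, 3, 3))
b = np.zeros(4 * c_hid)
h = np.zeros((c_hid, n, n, n))
c = np.zeros((c_hid, n, n, n))
for view in range(3):                                   # one update per input view
    x = rng.normal(size=(c_in, n, n, n))                # stand-in for encoded image features
    h, c = conv_lstm_step(x, h, c, w, b)
```

Because the hidden state is itself a 3D grid, each new view can refine or simply maintain the current prediction cell by cell.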


[Figure. Axis: Number of images. Annotations: Increasing confidence on armrests; Update / maintain prediction]

Chris, Danfei, JunYoung, Kevin, Silvio, 3D-Recurrent Reconstruction Neural Networks, ECCV’16

Robustness to texture and # views

Chris, Danfei, JunYoung, Kevin, Silvio, 3D-Recurrent Reconstruction Neural Networks, ECCV’16


3D Perception

● SegCloud: Semantic Segmentation of 3D Point Clouds, Lyne, Chris, Iro, JunYoung, Silvio, 3DV’17
● 4D-Spatio Temporal ConvNets: Minkowski Convolutional Neural Networks, Chris, JunYoung, Silvio, CVPR’19
● Fully Convolutional Geometric Features, Chris, Jaesik, Vladlen, ICCV’19

Sparsity of 3D data

O(N^3) volume vs. O(N^2) surface

20cm voxel : 18%
10cm voxel : 9%
5cm voxel : 4.5%
2.5cm voxel : 1.8%
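The trend in these numbers follows from the O(N^3) volume vs. O(N^2) surface argument: a surface occupies a shrinking fraction of the grid as the voxels get smaller. The sketch below illustrates this on a synthetic sphere rather than a real scan; the function name `occupancy_ratio`, the sphere, and the grid resolutions are assumptions chosen for illustration.

```python
import numpy as np

def occupancy_ratio(points, extent, resolution):
    """Fraction of cells in a resolution^3 grid over [0, extent]^3 that contain a point."""
    voxel = extent / resolution
    idx = np.clip(np.floor(points / voxel).astype(np.int64), 0, resolution - 1)
    occupied = np.unique(idx, axis=0).shape[0]
    return occupied / resolution ** 3

rng = np.random.default_rng(0)
v = rng.normal(size=(200_000, 3))
# Points on a unit sphere centred at (1, 1, 1): a stand-in for a scanned surface.
points = 1.0 + v / np.linalg.norm(v, axis=1, keepdims=True)

resolutions = (10, 20, 40, 80)
ratios = [occupancy_ratio(points, extent=2.0, resolution=n) for n in resolutions]
for n, r in zip(resolutions, ratios):
    print(f"{n:3d}^3 grid: {100 * r:.1f}% occupied")
```

The number of occupied cells grows roughly with N^2 while the grid grows with N^3, so the occupied fraction should shrink roughly in proportion to the voxel size.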

Sparse Representations and Convolution

Discrete Representation
● OctNet and Octree [Riegler et al.]
● Sparse Tensor [Graham et al., Choy et al.]

Continuous Representation
● Points and PointNet [Qi et al.]
● Continuous Convolution
○ PointCNN
○ Monte Carlo Conv
○ Surface / Tangent Conv
● Occupancy Net [Mescheder et al.]
● Deep SDF [Park et al.]
● Deep Level Sets [Michalkiewicz et al.]
● …

Graph Representation
● Graph Net [Kipf & Welling]
● Conv on Graph [Defferrard et al.]
● …

Hybrid Representation
● Continuous + Graph

Sparse Matrix

● Majority of elements are 0
● Efficient representation
● Non-zero elements only
● Compressed sparse row (CSR)
● List of lists
● COOrdinate list
● Etc.
● Example: 2x2 matrix
○ COOrdinate (COO) representation
○ 4 at (0, 0)
○ 1 at (1, 1)

[Figure: matrix grid with origin (0, 0)]
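The 2x2 example can be written out directly. The sketch below stores the matrix in COO form and converts it to a dense matrix and to CSR; it uses plain Python lists, and the helper names `coo_to_dense` and `coo_to_csr` are illustrative, not a library API.

```python
# COO: parallel lists of row indices, column indices, and values.
rows, cols, vals = [0, 1], [0, 1], [4, 1]
shape = (2, 2)

def coo_to_dense(rows, cols, vals, shape):
    """Scatter the non-zero entries into a dense list-of-lists matrix."""
    dense = [[0] * shape[1] for _ in range(shape[0])]
    for r, c, v in zip(rows, cols, vals):
        dense[r][c] = v
    return dense

def coo_to_csr(rows, cols, vals, n_rows):
    """CSR keeps entries sorted by row, plus a row-pointer array of length n_rows + 1."""
    order = sorted(range(len(vals)), key=lambda i: (rows[i], cols[i]))
    indptr = [0] * (n_rows + 1)
    for r in rows:
        indptr[r + 1] += 1              # count entries per row
    for i in range(n_rows):
        indptr[i + 1] += indptr[i]      # prefix sum turns counts into offsets
    return indptr, [cols[i] for i in order], [vals[i] for i in order]

dense = coo_to_dense(rows, cols, vals, shape)
indptr, indices, data = coo_to_csr(rows, cols, vals, shape[0])
```

COO is the easiest form to build and to extend to more dimensions, which is why it carries over to sparse tensors.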

Sparse Tensor

● High-dimensional extension
● COOrdinate representation
○ 4 at (0, 0, 0)
○ 1 at (1, 1, 0)
○ 9 at (1, 1, 1)

[Figure: tensor grid with origin (0, 0, 0)]
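The same three entries can be held as a coordinate list plus a value list, with a hash map for lookup. This is a minimal pure-Python sketch; the names `value_at` and `to_dense` and the 2x2x2 bounding shape are assumptions for illustration.

```python
# Sparse tensor in COO form: one coordinate tuple per non-zero entry.
coords = [(0, 0, 0), (1, 1, 0), (1, 1, 1)]
feats = [4, 1, 9]

# A hash map from coordinate to entry index gives constant expected-time lookup.
index = {c: i for i, c in enumerate(coords)}

def value_at(coord):
    """Return the stored value, or 0 for any coordinate that is not stored."""
    i = index.get(tuple(coord))
    return feats[i] if i is not None else 0

def to_dense(coords, feats, shape):
    """Materialise the dense tensor (only sensible for tiny examples)."""
    dense = [[[0] * shape[2] for _ in range(shape[1])] for _ in range(shape[0])]
    for (x, y, z), f in zip(coords, feats):
        dense[x][y][z] = f
    return dense

dense = to_dense(coords, feats, (2, 2, 2))
density = len(coords) / (2 * 2 * 2)     # fraction of cells that are stored
```

Storage grows with the number of non-zero entries rather than with the volume of the bounding grid.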

Convolution on a Sparse Tensor

Convolution

Sparse Convolution

Dense Tensor Kernel

Static Sparsity Pattern

Cannot support arbitrary sparsity

[Graham et al., Submanifold Sparse ConvNet, 2017]

[Graham and Maaten, 3D Sparse ConvNet, 2018]

Generalized Convolution

● Sparse Tensor Kernel
● Dynamic Sparsity Pattern
● Can support arbitrary sparsity

[Graham et al.] [Choy et al.]

Sparsity pattern manipulation
● Ex) C = A + B
● Ex) Pruning

High-dimensional ConvNet
● Volume of dense convolution kernel: O(N^D)
● Sparse convolution kernel: O(D)

Generative Tasks

Choy et al., 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks, CVPR’19
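A generalized convolution can be sketched as a sum over an arbitrary set of kernel offsets, evaluated only at chosen output coordinates and only over input coordinates that exist. The code below is a minimal NumPy sketch under those assumptions, not the Minkowski Engine implementation; the helper names are illustrative, and the cross-shaped kernel is used here as one example of a kernel whose size grows as O(D).

```python
import itertools
import numpy as np

def hypercube_offsets(dim, size=3):
    """Dense kernel: size^dim offsets."""
    r = size // 2
    return list(itertools.product(range(-r, r + 1), repeat=dim))

def hypercross_offsets(dim):
    """Sparse cross-shaped kernel: the centre plus +/-1 along each axis, 2*dim + 1 offsets."""
    offs = [tuple([0] * dim)]
    for axis in range(dim):
        for step in (-1, 1):
            o = [0] * dim
            o[axis] = step
            offs.append(tuple(o))
    return offs

def generalized_sparse_conv(in_coords, in_feats, out_coords, offsets, weights):
    """out[u] = sum over offsets i with (u + i) present in the input of in[u + i] @ W_i."""
    lookup = {tuple(c): j for j, c in enumerate(in_coords)}
    out = np.zeros((len(out_coords), weights.shape[2]))
    for m, u in enumerate(out_coords):
        for k, off in enumerate(offsets):
            j = lookup.get(tuple(a + b for a, b in zip(u, off)))
            if j is not None:           # absent neighbours contribute nothing
                out[m] += in_feats[j] @ weights[k]
    return out

coords = [(0, 0, 0), (1, 1, 0), (1, 1, 1)]
feats = np.array([[4.0], [1.0], [9.0]])
offsets = hypercross_offsets(3)
weights = np.ones((len(offsets), 1, 1))  # all-ones weights: output is a neighbour sum
out = generalized_sparse_conv(coords, feats, coords, offsets, weights)

dense_kernel_4d = len(hypercube_offsets(4))    # 3^4 offsets
cross_kernel_4d = len(hypercross_offsets(4))   # 2*4 + 1 offsets
```

Since the output coordinates are an explicit argument, the same routine covers outputs on the input pattern, on a strided pattern, or on a new pattern altogether.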

Generalized Convolution: Special Cases

Sparse Tensor Kernel

Dynamic Sparsity Pattern

• Dilated Convolution

• Separable Convolution

• Sparse Convolution

• Octree Generative Networks

Arbitrary sparsity

• Dense Convolution

Minkowski Engine

A convolutional neural network library for sparse tensors

● Convolution
● [Max/Avg/Global] Pool
● Broadcast
● [Batch/Instance] Normalization
● Tensor arithmetic
● Pruning
● …

Choy et al., 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks, CVPR’19
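Tensor arithmetic and pruning both change which coordinates exist, not just the values stored at them. The sketch below shows the idea with plain dictionaries keyed by coordinate; it does not use the Minkowski Engine API, and `add` and `prune` are hypothetical helper names.

```python
def add(a, b):
    """C = A + B: the output sparsity pattern is the union of the two input patterns."""
    c = dict(a)
    for coord, value in b.items():
        c[coord] = c.get(coord, 0) + value
    return c

def prune(t, keep):
    """Remove every coordinate whose value fails the keep() predicate."""
    return {coord: v for coord, v in t.items() if keep(v)}

a = {(0, 0, 0): 4, (1, 1, 0): 1}
b = {(1, 1, 0): 2, (1, 1, 1): 9}
c = add(a, b)                          # three coordinates: union of both patterns
pruned = prune(c, lambda v: v >= 4)    # drops (1, 1, 0), whose value is 3
```

Addition can grow the sparsity pattern and pruning can shrink it, which is what a dynamic sparsity pattern has to support.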

Minkowski Network

● Very deep convolutional neural networks possible in 3D
○ 42-layer deep neural networks for semantic segmentation
○ 101 layers for classification
● Reuse network architectures from years of research in 2D

[Figure: ResNet18, 4D MinkNet18]

Choy et al., 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks, CVPR’19

Minkowski Engine for other applications

Choy et al., 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks, CVPR’19

Sparsity Pattern Reconstruction

Choy et al., 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks, CVPR’19

3D Perception: Semantic Segmentation

● Partition 3D scans or data into semantic parts
● Label each voxel or 3D point with one of the semantic labels

3D Semantic Segmentation on Sparse Tensors

● Sparse tensors for all input/output feature maps
● U-shaped network
○ Hierarchical map
○ Increases receptive field size exponentially

Choy et al., 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks, CVPR’19
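The exponential growth can be checked with the standard receptive-field recurrence for stacked layers. The layer configuration below (one kernel-3 convolution followed by a kernel-2, stride-2 downsampling per level) is a hypothetical example, not the architecture from the slides.

```python
def receptive_field(levels, conv_kernel=3, pool_kernel=2, pool_stride=2):
    """Receptive field (in input voxels, per axis) after each encoder level."""
    rf, jump = 1, 1                        # jump = input spacing between adjacent outputs
    sizes = []
    for _ in range(levels):
        rf += (conv_kernel - 1) * jump     # stride-1 convolution
        rf += (pool_kernel - 1) * jump     # strided downsampling
        jump *= pool_stride                # each level doubles the spacing
        sizes.append(rf)
    return sizes

sizes = receptive_field(4)   # [4, 10, 22, 46]
```

Under these assumptions the receptive field after level L is 3 * 2^L - 2, so it roughly doubles at each level of the hierarchy.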

Results: ScanNet

Choy et al., 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks, CVPR’19

Results: Stanford 3D

Choy et al., 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks, CVPR’19


3D Feature Learning

● Universal Correspondence Network, Chris, JunYoung, Silvio, Manmohan, NIPS’16
● Fully Convolutional Geometric Features, Chris, Jaesik, Vladlen, ICCV’19

3D Geometric Feature

● A vector representation of the local / global 3D geometry
○ Correspondence, registration, tracking, scene flow, ...

Prior works in 3D Geometric Features

● Extract a small 3D patch
○ Limits context, receptive field
○ Features extracted separately
● Preprocessing
○ Normal, Signed Distance Function, curvatures

Hand-designed Features: Spin Image, USC, SHOT, PFH, FPFH

Learned Features: 3DMatch, CGF, PointNet, PPF, FoldNet, PPFFold, CapsuleNet, DirectReg, SmoothNet

Choy et al., Fully Convolutional Geometric Features, ICCV’19

Fully Convolutional Metric Learning

● No preprocessing, no patch extraction
○ No receptive field limit by crop size
○ Efficient reuse of shared computation
● Hardest Negative Mining

Choy et al., Universal Correspondence Network, NIPS’16

Choy et al., Fully Convolutional Geometric Features, ICCV’19
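Hardest negative mining can be sketched as a contrastive loss in which each feature is pushed away from its closest non-matching feature. The sketch below is a minimal NumPy version under stated assumptions: row `i` of `fa` matches row `i` of `fb`, every other pairing is a negative, and the margin values are illustrative defaults rather than values taken from the slides.

```python
import numpy as np

def hardest_contrastive_loss(fa, fb, pos_margin=0.1, neg_margin=1.4):
    """fa[i] and fb[i] are features of a matching pair; all other pairs are negatives."""
    d = np.linalg.norm(fa[:, None, :] - fb[None, :, :], axis=2)   # (N, N) distances
    pos = np.diag(d)                                              # matching pairs
    masked = d.copy()
    np.fill_diagonal(masked, np.inf)                              # exclude the positives
    hardest_for_a = masked.min(axis=1)    # closest wrong match for each fa[i]
    hardest_for_b = masked.min(axis=0)    # closest wrong match for each fb[i]
    pos_loss = np.maximum(pos - pos_margin, 0.0) ** 2
    neg_loss = 0.5 * (np.maximum(neg_margin - hardest_for_a, 0.0) ** 2
                      + np.maximum(neg_margin - hardest_for_b, 0.0) ** 2)
    return float(pos_loss.mean() + neg_loss.mean())

separated = 10.0 * np.eye(4)              # matches coincide, negatives are far apart
collapsed = np.zeros((4, 3))              # all features identical
loss_separated = hardest_contrastive_loss(separated, separated)
loss_collapsed = hardest_contrastive_loss(collapsed, collapsed)
```

Only the single hardest negative per feature enters the loss, so well-separated features contribute nothing while collapsed features are penalized.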

Fully Convolutional Geometric Features

Choy et al., Fully Convolutional Geometric Features, ICCV’19


4D Spatio-temporal data (3D Video)


3D to 4D Spatio-temporal perception

● 4D Markov Random Fields for Medical Imaging [McInerney & Terzopoulos, 1995]
● 4D Cardiac Image Segmentation [Lorenzo-Valdés et al., 2014]

Advantages of 4D data
• Temporal consistency
• Novel viewpoint
• Dynamics / Action

Challenges of 4D data
• Weak 3D perception
• Complexity
  Memory: O(TN^3)
  Computation: O(K^4 TN^3)

High Dimensional Spaces and Generalized Convolution

Challenges
• Weak 3D perception
• Complexity
  Memory: O(TN^3)
  Computation: O(K^4 TN^3)

Minkowski ConvNet
• Sparse Tensor
• Generalized Convolution

Choy et al., 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks, CVPR’19
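A rough count makes the complexity gap concrete. The sketch below compares the dense O(TN^3) and O(K^4 TN^3) costs with a sparse estimate; the sequence length, grid size, 5% occupancy, and 9-element kernel are hypothetical numbers, and the sparse cost model (work proportional to stored entries times kernel elements) is a simplifying assumption.

```python
def dense_cost(t, n, k):
    """Cells stored and multiply-adds per channel pair for a dense 4D convolution."""
    cells = t * n ** 3
    return cells, k ** 4 * cells

def sparse_cost(t, n, kernel_volume, occupancy):
    """Same counts when only occupied cells are stored and visited."""
    nnz = round(occupancy * t * n ** 3)
    return nnz, kernel_volume * nnz

dense_mem, dense_ops = dense_cost(t=4, n=100, k=3)
sparse_mem, sparse_ops = sparse_cost(t=4, n=100, kernel_volume=9, occupancy=0.05)
print(dense_mem, dense_ops)     # 4000000 324000000
print(sparse_mem, sparse_ops)   # 200000 1800000
```

With these illustrative numbers the sparse formulation stores 20 times fewer cells and performs 180 times fewer multiply-adds.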

4D Spatio-Temporal Semantic Segmentation

● Spatially aligned 3D video
○ Static objects have the same 3D coordinates
○ GPS, SLAM
● Synthetic dataset: Synthia
● Network:
○ U-shaped Net for semantic segmentation, in 4D

Choy et al., 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks, CVPR’19

Results: 4D Synthia Dataset

Faster & Better

Regularized

Full 4D convolution

More effective for small objects

Choy et al., 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks, CVPR’19

[Figure rows: 3D ConvNet, 4D ConvNet]

Choy et al., 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks, CVPR’19


3D Reconstruction

3D Scans

Preprocessing

Fragments

3D Pairwise Registration

Fragment A

Fragment B

Global Consistency

3D Pairwise Registration


3D Fragments

Feature Extraction

Correspondence

Global Registration

OANet, Zhao et al., 2019

LFGC, Yi et al., 2018

FCGF, Choy et al. 2019

SmoothNet, Gojcic et al. 2019

CapsuleNet, Zhao et al., 2019

PPF, PPF-Fold, Deng et al., 2019, 2018

FoldingNet, Yang et al., 2017

3D Pairwise Registration

3D Fragments

Feature Extraction

Correspondence

Global Registration

OANet, Zhao et al., 2019

LFGC, Yi et al., 2018

((x,y,z), (x’, y’, z’))

Nearest Neighbor

Feature Extraction

Dimensionless data

Approximate

P(correspondence correct)

Geometry of 3D Correspondence

Choy et al., High-dimensional Convolutional Networks for Geometric Pattern Recognition, 2020

Fragment A

Fragment B

(x,y,z), (x’, y’, z’)

Inliers: Blue, Outliers: Red

3D Correspondences and 6D Surface

Choy et al., High-dimensional Convolutional Networks for Geometric Pattern Recognition, 2020

(x,y,z), (x’, y’, z’)

• (x,y,z) → Fragment A

• (x’,y’,z’) → Fragment B

Concatenate

• (x, y, z, x’, y’, z’)

• First 3 follow A, last 3 follow B

• Inliers follow the common geometry

6D Hyper Surface
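The concatenation step can be sketched with synthetic data: each correspondence becomes one 6D point, and inliers are the points consistent with a single rigid motion between the fragments. The rotation, translation, point counts, and tolerance below are made-up values for illustration, and the ground-truth motion is used only to label the points.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = np.pi / 6                          # hypothetical rotation about the z axis
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([0.5, -0.2, 1.0])             # hypothetical translation

a = rng.uniform(-1, 1, size=(8, 3))        # points in fragment A
b = a @ R.T + t                            # their true matches in fragment B
b[-2:] = rng.uniform(-1, 1, size=(2, 3))   # corrupt the last two matches

pairs6d = np.concatenate([a, b], axis=1)   # one 6D point per correspondence

# Inliers lie on the 3D surface {(x, Rx + t)} embedded in 6D space.
residual = np.linalg.norm(pairs6d[:, 3:] - (pairs6d[:, :3] @ R.T + t), axis=1)
is_inlier = residual < 1e-6
```

In practice the motion is unknown, which is why deciding inlier vs. outlier becomes a pattern recognition problem on the 6D points.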

Correspondences form high-dimensional geometry

● X = {1,2,3,4,5}
● Y = T(X) where T(x) := x + 4
● Correspondence
○ {(1, 5), (3, 7), (4, 8), (5, 9), (2, 9)}
● Correct correspondences
○ Follow the common geometry
○ Inliers
● Incorrect correspondences
○ Outliers

Choy et al., High-dimensional Convolutional Networks for Geometric Pattern Recognition, 2020
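The toy example can be checked directly: a correspondence (x, y) is an inlier exactly when it lies on the line y = x + 4. The variable names in this short sketch are illustrative.

```python
def T(x):
    return x + 4

correspondences = [(1, 5), (3, 7), (4, 8), (5, 9), (2, 9)]

# Inliers follow the common geometry y = T(x); everything else is an outlier.
inliers = [(x, y) for x, y in correspondences if y == T(x)]
outliers = [(x, y) for x, y in correspondences if y != T(x)]
```

Here (2, 9) is the only outlier, since T(2) = 6.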

Inlier vs. Outlier

Label each correspondence as Inlier vs. Outlier

→ Label each 6D point as an Inlier vs. Outlier

→ Label each 3D point as chair, bed, …

Choy et al., High-dimensional Convolutional Networks for Geometric Pattern Recognition, 2020

6D Convolutional Neural Network

Choy et al., High-dimensional Convolutional Networks for Geometric Pattern Recognition, 2020

Translation invariance: Fragments can be located anywhere in 3D space

Multi-resolution (large receptive field, less sparse)

Results: 3D Correspondence Segmentation

3D Fragments

Feature Extraction

Correspondence

Global Registration

Choy et al., High-dimensional Convolutional Networks for Geometric Pattern Recognition, 2020

6D ConvNet Confidence Filter

Results: 3D Correspondence Segmentation

Yi et al., Learning to find good correspondences, 2018

Choy et al., High-dimensional Convolutional Networks for Geometric Pattern Recognition, 2020


3D Correspondences and 6D Geometry

Choy et al., High-dimensional Convolutional Networks for Geometric Pattern Recognition, 2020

Fragment A

Fragment B

(x,y,z), (x’, y’, z’)

3D Fragments

Feature Extraction

Correspondence

Global Registration

2D Correspondences and 4D Geometry

Choy et al., High-dimensional Convolutional Networks for Geometric Pattern Recognition, 2020

Image A

Image B

(x,y), (x’, y’)

Images

Feature Extraction

Correspondence

Global Registration

2D Correspondences and 4D Geometry

Choy et al., High-dimensional Convolutional Networks for Geometric Pattern Recognition, 2020

(x,y), (x’, y’)

• (x,y) → Image A

• (x’,y’) → Image B

2D Correspondences and 4D Geometry

Choy et al., High-dimensional Convolutional Networks for Geometric Pattern Recognition, 2020

Second-degree polynomial in (x, y, x’, y’) = 0

Conic Sections

4D Hyper Conic Section of 5D Hyper Cones

2D Correspondences and 4D Geometry

Choy et al., High-dimensional Convolutional Networks for Geometric Pattern Recognition, 2020

(x,y), (x’, y’)

• (x,y) → Image A

• (x’,y’) → Image B

• 2nd-degree polynomial = 0

4D hyper conic section
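One standard way to see where the second-degree polynomial comes from, assuming the usual two-view pinhole setting (the slides do not spell this out): matching image points satisfy the epipolar constraint with a 3x3 fundamental matrix F, whose entries f_ij are the assumed symbols here. Expanding the constraint gives

```latex
\[
\begin{aligned}
\begin{pmatrix} x' & y' & 1 \end{pmatrix}
F
\begin{pmatrix} x \\ y \\ 1 \end{pmatrix} &= 0, \\
f_{11}\,x'x + f_{12}\,x'y + f_{13}\,x'
+ f_{21}\,y'x + f_{22}\,y'y + f_{23}\,y'
+ f_{31}\,x + f_{32}\,y + f_{33} &= 0 .
\end{aligned}
\]
```

Every term has degree at most two in (x, y, x', y'), so the inlier correspondences lie on a quadric hypersurface in 4D.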

YFCC 100M dataset

Yi et al., Learning to find good correspondences, 2018

Zhang et al., Learning Two-View Correspondences and Geometry Using Order-Aware Network, 2019

Choy et al., High-dimensional Convolutional Networks for Geometric Pattern Recognition, 2020

[Figure rows: Ours, Zhang et al., Yi et al.]

Yi et al., Learning to find good correspondences, 2018

Zhang et al., Learning Two-View Correspondences and Geometry Using Order-Aware Network, 2019

Choy et al., High-dimensional Convolutional Networks for Geometric Pattern Recognition, 2020

3D Perception

3D Reconstruction

3D Semantic Segmentation

Perception on a Set of 3D Data

3D Feature Learning

4D Spatio-Temporal Perception

4D and 6D for Registration

Supervised Reconstruction

3D Convolutional Networks

4D Convolutional Networks

4D Convolutional Networks

6D Convolutional Networks

Conclusions

7D Convolutional Networks

32D Convolutional Networks

Conclusions and Future Work

● Many more high-dimensional problems
○ Geometric structure
● Expand the high-dimensional pattern recognition problems to
○ 3D object detection
○ Tracking
○ Reconstruction

Thank you

Vladlen Koltun, Jaesik Park, JunYoung Gwak, Iro Armeni, Lyne Tchapmi, Manmohan Chandraker, Kevin Chen, Kuan Fang

Leonidas Guibas, Benjamin Van Roy, Gordon Wetzstein, Tsachy Weissman

Danfei Xu, Yuke Zhu, Animesh Garg, Andrey Kurenkov, Manolis Savva, Angel Chang, Namhoon Lee, Yu Xiang, Junha Lee, Michael Stark

Thank you for your attention