Recent Progress in Object Detectionvspc.ee.cuhk.edu.hk/slides/detection-talk.pdf · Recent Progress...
Transcript of Recent Progress in Object Detectionvspc.ee.cuhk.edu.hk/slides/detection-talk.pdf · Recent Progress...
Recent Progress in Object Detection
Jiaqi Wang
Multimedia Laboratory
The Chinese University of Hong Kong
Progress
2014 2015 2016 2017 2018
R-CNN
Fast R-CNNFaster R-CNN R-FCN
FPN
Mask R-CNN
SSD
YOLO
RetinaNet
Cascade R-CNN
YOLO v2
Relation Network
CornerNetMultiBox
SNIP
recent
General pipeline
backbone neckimage
Feature generation
proposalsdense
cls. & reg.
sliding window
anchors
RoI feature
extractortask head
Region proposal
RoI
features
Region recognition
Two-stage Detector
General pipeline
backbone neckimage
Feature generation
dense
cls. & reg.
sliding window
anchors
Single-stage Detector
Faster R-CNN
Training pipeline
• Joint training: multi-task
backbone
RPN
Fast R-
CNN head
proposalsNo gradient
preferred
RetinaNet
Focal Loss
• Problem: class imbalance• inefficient training
• loss is overwhelmed by negative samples
Model Solution
Two-stage detectors 1) proposal
2) mini-batch sampling
SSD Hard negative mining
RetinaNet Focal loss
COCO Challenge 2018
Comparison of our approach with 2017 winning entries on COCO test-dev.
0.44
0.467
0.474
0.49
0.42
0.43
0.44
0.45
0.46
0.47
0.48
0.49
0.5
MA
SK
AP
2017 winner (single model) 2017 winner
Single model Final results
0.505
0.526
0.541
0.56
0.48
0.49
0.5
0.51
0.52
0.53
0.54
0.55
0.56
0.57
BO
X A
P
2017 winner (single model) 2017 winner
Single model Final results
COCO Challenge 2018
1. We developed a hybrid task cascade framework for detection and segmentation.
Detection &Segmentation
COCO Challenge 2018
1. We developed a hybrid task cascade framework for detection and segmentation.
2. We proposed a feature guided anchoring scheme to improve the average recall (AR) of RPN by 10 points.
ProposalDetection &Segmentation
COCO Challenge 2018
1. We developed a hybrid task cascade framework for detection and segmentation.
2. We proposed a feature guided anchoring scheme to improve the average recall (AR) of RPN by 10 points.
3. We designed a new backbone FishNet.
Backbone ProposalDetection &Segmentation
COCO Challenge 2018
Hybrid Task Cascade (HTC)
• Cascade Mask R-CNN (Cascade R-CNN + Mask R-CNN)
Fp
oo
l
pool
po
ol
M3 B3M2 B2M1 B1RPN
Problem: Two branches at each stage are executed in parallel, without interaction.
COCO Challenge 2018
Hybrid Task Cascade (HTC)
• Interleaved execution
F
po
ol
po
ol
po
ol
M3
B3
M2
B2
M1
B1RPN
po
ol
Problem: No direct information flow between mask branches at different stages.
COCO Challenge 2018
Hybrid Task Cascade (HTC)
• Mask Information Flow
F
po
ol
po
ol
po
ol
M3
B3
M2
B2
M1
B1RPN
po
ol
Problem: Spatial context is not much explored.
COCO Challenge 2018
Hybrid Task Cascade (HTC)
• Spatial context
F
po
ol
po
ol
po
ol
M3
B3
S M2
B2
M1
B1RPN
po
ol
COCO Challenge 2018
Guided anchoring
backbone neckimage
Feature generation
proposalsdense
cls. & reg.
sliding window
anchors
RoI feature
extractortask head
Region proposal
RoI
features
Region recognition
learnable
COCO Challenge 2018
Guided anchoring
• Our goal• Sparse
• Arbitrary shape
• General rules for anchor design• Alignment
• Consistency
COCO Challenge 2018
Guided anchoring
RPN(ResNet-50)
RPN (ResNet-152)
RPN (ResNeXt-101)
GA-RPN (ResNet-50)
GA-RPN (SENet-154)
RPN (SENet-154)
0.58
0.6
0.62
0.64
0.66
0.68
0.7
0.72
0 2 4 6 8 10 12
AR
1000
Runtime on TITAN X (fps)
COCO Challenge 2018
1. Training scales• short edge: random sampled from 400 ~ 1400• long edge: 1600
2. Test scales• (600, 900), (800, 1200), (1000, 1500), (1200, 1800), (1400, 2100)
3. Pipeline• Joint training• Finetune with GA-RPN proposals• Test with GA-RPN proposals
4. Resources• 32 Tesla V100 GPUs (16GB) for 3 days
Implementation details
COCO Challenge 2018
Implementation details
Backbones
• SENet-154• ResNeXt101 (64*4d) • ResNeXt101 (32*8d)• DPN-107• FishNet
comparable
~0.8 points higher
COCO Challenge 2018
Implementation details
Other tricks
• w/ SoftNMS• w/o OHEM• w/o classwise balance sampling• w/o voting for bbox or mask
COCO Challenge 2018
• With bells and whistles
baselineR-50 Cascade
with mask
HTC
deformableconv
synchronize BN
multi-scaletraining
betterbackbone
GARPNfinetune
multi-scale &flip testing
modelensemble
35
37
39
41
43
45
47
49
mask AP on test-dev
36.9
47.4
(+2.1)
38.4
(+1.5)
45.3
(+1.0)
39.5
(+1.1)
40.7
(+1.2)
42.5
(+1.8)
49.0
(+1.6)
44.3
(+1.8)
mmdetection
GitHub: mmdetection
√
√
√
√
√
√
√
√
√
√
√
• Comprehensive
RPN Fast/Faster R-CNN
Mask R-CNN FPN
Cascade R-CNN RetinaNet
More … …
• High performance
Better performance
Optimized memory consumption
Faster speed
• Handy to develop
Written with PyTorch
Modular design
Hybrid Task Cascade for Instance Segmentation (Accepted to CVPR 2019)
https://arxiv.org/abs/1901.07518
Region Proposal by Guided Anchoring (Accepted to CVPR 2019)
https://arxiv.org/abs/1901.03278
FishNet: A Versatile Backbone for Image, Region, and Pixel Level Prediction (Accepted to NIPS 2018)
https://arxiv.org/abs/1901.03495