MSCOCO Instance Segmentation Challenges...

MSCOCO

Instance Segmentation

Challenges 2018

Megvii (Face++) Team

lizeming@megvii.com

I. COCO’18 Instance Seg

Zeming LI Yueqing ZHUANG Gang YU Jian SUNXiangyu ZHANG

Overview

2015 2016 2017(Megvii) Ours

Detector mmAP

46.7 48.8

2015 2016 2017 Ours

Mask mmAP

Object Detector

3.4% improvement

The results is obtained on test-dev

Instance Segmentation

2.1% improvement

Improvements

Outline

1) Location Sensitive Header 2) Backbone Improvement

3) Two-Pass Pipeline 4) Results

Outline

3) Two-Pass Pipeline 4)Results

FPN Original Mask HeadInstance Seg

Det mmAP

Original Paper(detectron 1x) 33.6 -

Our Re-implement 34.4 37.0

Mask RCNN Baseline

Location Sensitive Header

Overall Architecture Comparison

1) Location Sensitive Detector

name Mask AP Bbox AP Improvement

Baseline 34.4 37.0 -

+ Local Sensitive Detector 35.4 38.7 + 1.0 / +1.7

2) Multi-Scale RoI

+ Local Sensitive Detector 35.6 38.7 + 1.0 / +1.7

+ Multi-Scale RoI 35.8 38.9 + 0.2 / +0.2

3) Heavier Header

Heavier Header 35.3 36.8 + 0.9 / -0.2

4) Mask Edge Loss

Mask Edge Loss 35.0 37.0 + 0.6 / +0.0

4) Mask Edge Loss

Sigmoid Cross Entropy

Review of overall Architecture

Location Sensitive Header:

1) Location Sensitive Detector

2) Multi-Scale RoI

3) Heavier Header

4) Mask Edge Loss

Overall Performance in Small and Large Model

BackBone Header Mask AP Bbox AP Improvement

ResNet50 Baseline 34.4 37.0 -

Location Sensitive Header 37.0 39.3 + 2.6 / + 2.0

ShuffleV2-GAP Baseline 40.3 45.0 -

Location Sensitive Header 42.3 46.5 +2.0/+1.5

We will introduce backbone in next slides

Outline

Backbone Improvement1. Channel Information Flow

Ma N, Zhang X, Zheng H T, et al. ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design[J].

2. Add Global Information

+GAP 35.1 37.7 +0.7/+ 0.7

Backbone Improvement

Outline

Two-Pass Pipeline

Outline

Results

name Mask AP(val) Bbox AP(val) Improvement

ResNet50 ( 2x-2batch-setting) 36.1 39.3 -

ShuffleV2 (1batch) 40.3 45.0 +3.8/+5.7

2x Means 2x training setting used in Detectron

Trained On Megvii’s Megbrain

Results

ShuffleV2 (1batch) 40.3 45.0 +3.8/+5.7

+ Location Sensitive Header 42.3 46.5 +2.0 /+1.5

Results

ShuffleV2 (1batch) 40.3 45.0 +3.8/+5.7

+ Local Sensitive Header 42.3 46.5 +2.0 /+1.5

+ 2 Batch Per GPU

+ Multi Scale Training

+ BN training

44.5 49.3 +2.2/ 2.8

Results

ShuffleV2 (1batch) 40.3 45.0 +3.8/+5.7

+ 2 Batch Per GPU

+ BN training

44.5 49.3 +2.2/ 2.8

+ Improve on Dets 47.6 55.4 +3.1/ 6.1

Results

ShuffleV2 (1batch) 40.3 45.0 +3.8/+5.7

+ 2 Batch Per GPU

+ BN training

44.5 49.3 +2.2/ 2.8

+ Improve on Dets 47.6 55.4 +3.1/ 6.1

+ Seg Multi-scale Testing 48.1 55.4 +0.5/0.0

Results

ShuffleV2 (1batch) 40.3 45.0 +3.8/+5.7

+ 2 Batch Per GPU

+ BN training

44.5 49.3 +2.2/ 2.8

+ Improve on Dets 47.6 55.4 +3.1/ 6.1

+ Seg Multi-scale Testing 48.1/ 48.8(dev) 55.4/ 56.0(dev) +0.5/0.0

Instance Segmentation is obtained by single instance segmentation model

Results name Bbox AP(val) Improvement

Baseline 49.3 -

+Soft-Nms 49.8 +0.5

+Multi-scale Testing 51.6 +1.8

+Ensemble 53.6 +2.0

add an additional model for ensemble:

+with cascade R-CNN

+external COCO++ 11W data

55.4 +1.8

Results

Visualization Comparison

baseline

Location

Sensitive

Header

Refine Location Error

Our Baseline Location Sensitive Header

Visualization

Detector Results Mask Results

Visualization

Detector Results Mask Results

Summary & thanks

1. Location Sensitive Header

2. Backbone Improvement

3. Pipeline Optimization

Other Improvements:1. Multi-Scale Training

2. Large Batch (MegDet : [C. Peng, CVPR’ 18])

3. Multi-Scale and Flip Testing

4. Ensemble (only for Detection)

Looking for Interns, Researcher，Research Engineer

career@megvii.com

yugang@megvii.com

MSCOCO Instance Segmentation Challenges...

Documents

Transcript of MSCOCO Instance Segmentation Challenges...

DRIT-ECCV18 Hsin-Ying Lee Hung-Yu Tseng Jia-Bin Huang ...jbhuang/supp/ECCV_2018_DRIT_Poster.pdf · Hsin-Ying Lee*1 Hung-Yu Tseng*1 Jia-Bin Huang2 Maneesh Singh3 Ming-HsuanYang1,4

End-to-End Learning of Driving Models with Surround-View ...people.ee.ethz.ch/~daid/publications/[eccv18]End-to-End-Driving... · our eyes, side-view mirrors and a rear-view mirror

COCO Challenge 2018 Panoptic Segmentation Taskpresentations.cocodataset.org/ECCV18/COCO18-Panoptic-PKU_360.pdf• Thing segments override stuff segments. • Comparison between semantic

MSCOCO Keypoints Challenge 2018 - cocodataset.orgpresentations.cocodataset.org/ECCV18/COCO18-Keypoints-Megvii.pdf · test_challenge test_dev mini_validation Ensemble model 76.4(final

Model Adaptation with Synthetic and Real Data for Semantic Dense Foggy Scene …daid/publications/[eccv18... · 2018. 9. 6. · Model Adaptation with Synthetic and Real Data for Semantic

Human Motion Reconstruction from Action Video Data Using …cs231n.stanford.edu/reports/2017/pdfs/945.pdfthe winning method in 2016 MSCOCO Keypoints Chal-lenge. The overall pipeline

MSCOCO Keypoints Challenge 2017 - cocodataset.orgpresentations.cocodataset.org/COCO17-Keypoints-Megvii.pdfIs Hourglass good for COCO keypoint ResNet-FPN-like[1]network works better

Cross-Modal Hamming Hashing - Yue Caoyue-cao.me/doc/cross-modal-hamming-hashing-eccv18.pdf · hashing methods [14,15,16,42,43,44] have given state-of-the-art results on many image

Microsoft COCO Captions: Data Collection and Evaluation Serverjeanoh/16-785/papers/chen-arxiv2015-mscoco-m… · b(C;S) = (1 if l C>l S e1 l S=l C if l C l S; (2) where l C is the

Show and Tell: Lessons learned from the 2015 MSCOCO Image ... · Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge Oriol Vinyals, Alexander Toshev, Samy

FPN-basedNetworkfor Panoptic Segmentationpresentations.cocodataset.org/ECCV18/COCO18-Panoptic... · 2018-10-29 · Lin, Tsung-Yi, et al. "Feature Pyramid Networks for Object Detection."CVPR.

Constraint-Aware Deep Neural Network Compressionmori/research/papers/chen-eccv18.pdf · Output: Compressed network F 1:F[0] = F 0 2: for t = 1 to Tdo 3:Update cooled constraint values

Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge · · 2016-09-22Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge Oriol

End-to-End Learning of Driving Models with Surround-View ...daid/publications/[eccv18]End-to-End... · Fig.1: An illustration of our driving system. Cameras provide a 360-degree view

Hierarchical Metric Learning and Matching for 2D and 3D ... › ~mkchandraker › pdf › eccv18... · image pyramids often used for matching tasks. We propose concrete CNN architectures

MSCOCO & Mapillary Panoptic Segmentation Challenge 2018 · Residual L2 Loss 𝑙 𝑐𝑙𝑠 𝑙 𝐿2 Base Network 𝑔 − 𝑓1 Design F1 F2 1. Extract two feature maps from

Puneet K. Dokania · Puneet Dokania!1 Riemannian Walk for Incremental Learning: Understanding Forgetting and Intransigence (ECCV18) Puneet K. Dokania (University of Oxford) 7th Aug,

MSCOCO & Mapillary Panoptic Segmentation Challenge 2018presentations.cocodataset.org/ECCV18/COCO18-Panoptic-Megvii.pdf · Huanyu Liu Xiangyu Zhang Gang Yu Jian Sun Xu Liu Zeming Li

DRIT-ECCV18 Hsin-Ying Lee Hung-Yu Tseng Jia-Bin Huang ...jbhuang/supp/ECCV_2018_DRIT_Poster.pdf · Hsin-Ying Lee1 Hung-Yu Tseng1 Jia-Bin Huang2 Maneesh Singh3 Ming-HsuanYang1,4