
International Conference on Computer Vision 2017

Single Shot Text Detector with Regional Attention

Pan He¹, Weilin Huang², Tong He³, Qile Zhu¹, Yu Qiao³ and Andy Li¹

¹ National Science Foundation Center for Big Learning, University of Florida
² Department of Engineering Science, University of Oxford
³ Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences

Introduction

Ø Goal:
• Improve the speed and accuracy of scene text detection.

Ø Existing works:
• Pixel-based detectors [1]: use cascaded Fully Convolutional Networks (FCNs) to cast character-based detection as pixel-wise text semantic segmentation.
• Box-based detectors [2]: extend general object detectors such as Faster R-CNN [3] or SSD [4] to predict text boxes directly from bounding-box annotations.

Ø Problem & Motivation:
• Although pixel-based text detectors identify rough text regions effectively, they fail to produce accurate word-level predictions with a single model; the main challenge is to precisely separate individual words within a detected rough text region.
• Box-based text detectors are typically trained with only bounding-box annotations, which may be too coarse (high-level) to provide direct and detailed supervision, unlike pixel-based detectors, where a pixel-wise text mask is available.

Ø Our idea:
• We propose techniques that bridge the gap between pixel-based and box-based detectors, resulting in a single-shot model that essentially works in a coarse-to-fine manner.

References

[1] Z. Zhang, C. Zhang, W. Shen, C. Yao, W. Liu, and X. Bai. Multi-oriented text detection with fully convolutional networks. CVPR, 2016.
[2] Z. Tian, W. Huang, T. He, P. He, and Y. Qiao. Detecting text in natural image with connectionist text proposal network. ECCV, 2016.
[3] S. Ren, K. He, R. Girshick, and J. Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. NIPS, 2015.
[4] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C. Y. Fu, and A. C. Berg. SSD: Single shot multibox detector. ECCV, 2016.
[5] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. CVPR, 2015.
[6] P. He, W. Huang, Y. Qiao, C. C. Loy, and X. Tang. Reading scene text in deep convolutional sequences. AAAI, 2016.

Experiment Results

Ø Comparisons with state-of-the-art results (Tables 1 and 2)
Ø Exploration study (Table 3)
Ø Qualitative results

Text Attention Module

Ø Idea:
• A top-down spatial attention on text regions suppresses background interference and casts the cascaded FCN detectors into a single model.

Ø Attention Map:
• We compute a text attention map from the Aggregated Inception Features (AIFs).
• The attention map indicates rough text regions and is further encoded back into the AIFs (via an element-wise dot product), as sketched in the code below.
• The attention module is trained using a pixel-wise binary mask of text.
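To make the attention encoding concrete, here is a minimal PyTorch sketch of such a module. It is not the authors' released implementation: the channel count, the 1×1 scoring convolution, and the bilinear upsampling of the logits to the mask resolution are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextAttentionModule(nn.Module):
    """Sketch of a top-down text attention module.

    From an AIF feature map, predict a 2-channel (non-text / text)
    saliency map, take the softmax "text" channel as the attention
    map, and encode it back into the features by element-wise
    multiplication. Channel counts are illustrative assumptions.
    """

    def __init__(self, in_channels: int = 512):
        super().__init__()
        # 1x1 convolution producing per-pixel non-text / text scores
        self.score = nn.Conv2d(in_channels, 2, kernel_size=1)

    def forward(self, aif: torch.Tensor):
        logits = self.score(aif)                      # (N, 2, H, W)
        attention = F.softmax(logits, dim=1)[:, 1:2]  # text channel, (N, 1, H, W)
        # Element-wise (broadcast) product re-encodes the rough text
        # regions into the AIFs, suppressing background activations.
        return aif * attention, logits

def attention_loss(logits: torch.Tensor, text_mask: torch.Tensor) -> torch.Tensor:
    """Train the attention branch with per-pixel cross-entropy against
    a {0, 1} text mask of shape (N, H, W). Bilinear upsampling of the
    logits is an assumption; the paper's exact scheme may differ."""
    logits = F.interpolate(logits, size=text_mask.shape[-2:],
                           mode="bilinear", align_corners=False)
    return F.cross_entropy(logits, text_mask.long())
```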

Hierarchical Inception Module

Ø Idea:
• Aggregating inception features from different layers (with varied resolutions) enhances local detail and encodes richer context information; see the sketch after this list.

Ø Aggregated Inception Features:
• Similar to the Inception architecture in GoogLeNet [5], we compute inception features through four different convolutional operations, with dilated convolutions applied.
• We further enhance the inception features by aggregating multi-layer inception features via channel concatenation.
• Each AIF is computed by fusing the inception features of the current layer with those of its two directly adjacent layers.
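Below is a minimal PyTorch sketch of both pieces, under stated assumptions: the four parallel branches use hypothetical kernel sizes and dilation rates, and bilinear resampling stands in for whatever pooling/deconvolution the paper uses to bring adjacent layers to a common resolution before channel concatenation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InceptionBlock(nn.Module):
    """Sketch of an inception-style block: four parallel convolutions
    (two of them dilated) whose outputs are concatenated along the
    channel axis. Kernel sizes and dilation rates are assumptions."""

    def __init__(self, in_channels: int, branch_channels: int = 64):
        super().__init__()
        self.b1 = nn.Conv2d(in_channels, branch_channels, kernel_size=1)
        self.b2 = nn.Conv2d(in_channels, branch_channels, kernel_size=3, padding=1)
        self.b3 = nn.Conv2d(in_channels, branch_channels, kernel_size=3,
                            padding=2, dilation=2)
        self.b4 = nn.Conv2d(in_channels, branch_channels, kernel_size=3,
                            padding=4, dilation=4)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.cat([self.b1(x), self.b2(x), self.b3(x), self.b4(x)], dim=1)

def aggregate_aif(prev: torch.Tensor, cur: torch.Tensor,
                  nxt: torch.Tensor) -> torch.Tensor:
    """Fuse the inception features of the current layer with its two
    directly adjacent layers: resample both neighbours to the current
    resolution, then concatenate along channels to form the AIF."""
    size = cur.shape[-2:]
    prev = F.interpolate(prev, size=size, mode="bilinear", align_corners=False)
    nxt = F.interpolate(nxt, size=size, mode="bilinear", align_corners=False)
    return torch.cat([prev, cur, nxt], dim=1)
```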

Figure 2: Architecture of Text Attention Module

Figure 3: Architecture of Inception Module

Figure 1: Framework of SSTD with Regional Attention

Table 1: Performance on the ICDAR 2013 and ICDAR 2015 datasets

Table 2: Performance on the COCO-Text dataset

Table 3: Exploration study on the ICDAR 2013 dataset