Text Detection in Video

Min Cai

2002.3.13

Background

Video OCR: Text detection, extraction and recognition

Detection Target: Artificial text

Text detection:
  Detect the text region in a single frame
  Refine the region by combining consecutive frames

Existing Work

Feature extraction    Text detection based on the feature
Color                 Connected-component
Texture               Texture segmentation
Edge                  Top-down
                      Bottom-up

Connected-component-based methods

Basic idea:
  Treat text as having a uniform color (color level) and classify each pixel as text or non-text according to its color value.
  Combine connected text pixels into connected components.
  Group collinear connected components into a text string.

Advantage:
  Can detect text of arbitrary orientation, provided the text has a similar color throughout and sits on a simple background.

Disadvantages:
  Sensitive to color variance
  Lossy compression of video introduces color bleeding
  Complex backgrounds
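The first two steps of the basic idea can be sketched in a few lines. This is only an illustration under stated assumptions: the frame is a gray-level grid, and `text_level`, `tolerance` and the function name are hypothetical parameters that do not come from the slides. Grouping collinear components into strings is left out.

```python
from collections import deque

def label_text_components(image, text_level, tolerance):
    """Return bounding boxes (top, left, bottom, right), inclusive, of
    4-connected groups of pixels within `tolerance` of `text_level`."""
    rows, cols = len(image), len(image[0])
    # Step 1: classify each pixel as text or non-text by its gray level.
    is_text = [[abs(image[r][c] - text_level) <= tolerance for c in range(cols)]
               for r in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    # Step 2: combine connected text pixels with a breadth-first search.
    for r in range(rows):
        for c in range(cols):
            if not is_text[r][c] or seen[r][c]:
                continue
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and is_text[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes

frame = [
    [0, 0,   0, 0, 0,   0],
    [0, 255, 0, 0, 250, 0],
    [0, 255, 0, 0, 250, 0],
    [0, 0,   0, 0, 0,   0],
]
boxes = label_text_components(frame, text_level=255, tolerance=10)
```

Shrinking `tolerance` below 5 makes the second stroke (gray level 250) disappear, which is the sensitivity to color variance listed among the disadvantages.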

Texture Segmentation method

Basic idea:
  Treat text as a type of texture.
  Use texture segmentation algorithms to detect text:
    Gabor filter
    Gaussian derivatives

Advantage:
  Can efficiently segment text areas and graphic areas in a simple background. It is usually used in document analysis.

Disadvantages:
  Time-consuming
  Cannot handle text embedded in varied backgrounds well
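As a rough illustration of the texture view, the sketch below builds a real-valued Gabor kernel from its standard formula and compares its response on two striped patches. The parameter values and the helper names `gabor_kernel` and `filter_response` are assumptions chosen for the example; the slides do not specify a filter bank or a classifier.

```python
import math

def gabor_kernel(size, wavelength, theta, sigma, gamma=1.0, psi=0.0):
    """Real Gabor kernel: a Gaussian envelope times a cosine carrier
    oriented at angle `theta` (radians). `size` should be odd."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates into the filter's frame.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + gamma * gamma * yr * yr)
                                / (2 * sigma * sigma))
            row.append(envelope * math.cos(2 * math.pi * xr / wavelength + psi))
        kernel.append(row)
    return kernel

def filter_response(patch, kernel):
    """Absolute correlation of a patch with a kernel of the same size."""
    return abs(sum(p * k
                   for prow, krow in zip(patch, kernel)
                   for p, k in zip(prow, krow)))

kernel = gabor_kernel(size=5, wavelength=2, theta=0.0, sigma=2.0)

# Zero-mean stripes of period 2: vertical strokes vs. horizontal strokes.
vertical = [[1, -1, 1, -1, 1] for _ in range(5)]
horizontal = [[v] * 5 for v in (1, -1, 1, -1, 1)]

response_vertical = filter_response(vertical, kernel)
response_horizontal = filter_response(horizontal, kernel)
```

The kernel responds strongly to the stripes that match its orientation and weakly to the others. Evaluating several orientations and scales at every pixel is what makes the approach time-consuming.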

Bottom-Up method

Basic idea:
  A seed region is defined as a small region with high edge density.
  Grow each seed region into successively larger components until all seed regions in the image have been reached.

Advantage:
  It is a generic method for detecting homogeneous objects of various shapes. That is, it can detect not only rectangular objects but also other shapes.

Disadvantages:
  Sensitive to noise
  Cannot handle a large range of font sizes
  Sensitive to stroke density (which differs between languages)

Top-Down method

Basic idea:
  Based on the run-length smoothing algorithm
  Analyze horizontal and vertical projection profiles

Advantages:
  Can detect the boundary of a horizontally aligned text string quickly and correctly
  Insensitive to noise

Disadvantages:
  Cannot handle diagonally aligned text
  One pass of horizontal and vertical projection cannot handle a complex layout
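A minimal sketch of the two ingredients named above, assuming a binary edge map stored as nested lists. The `max_gap` value and the function names are illustrative choices, not taken from the slides.

```python
def rlsa_row(row, max_gap):
    """Run-length smoothing on one binary row: background runs of length
    <= max_gap lying between two foreground pixels are filled with 1."""
    out = list(row)
    last_fg = None
    for i, value in enumerate(row):
        if value:
            if last_fg is not None and 0 < i - last_fg - 1 <= max_gap:
                for j in range(last_fg + 1, i):
                    out[j] = 1
            last_fg = i
    return out

def projection_runs(profile, min_value=1):
    """Return (start, end) pairs, end exclusive, of the maximal runs
    in which the projection profile is at least `min_value`."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value >= min_value and start is None:
            start = i
        elif value < min_value and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs

edges = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0, 0, 0, 0],
]
smoothed = [rlsa_row(row, max_gap=1) for row in edges]
# Horizontal projection: number of foreground pixels in each row.
horizontal_profile = [sum(row) for row in smoothed]
text_lines = projection_runs(horizontal_profile)
```

Here `text_lines` holds the row ranges of the two candidate text lines. Because the profile sums along image rows, a diagonal string would smear across many rows, which is why the method is limited to horizontally aligned text.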

Analysis (1)

A certain contrast against the background
  Artificial text strings are designed to be read easily
A certain stroke density
Text strings always appear horizontally
Spatial cohesion
  Characters of the same text string have similar height, orientation and spacing
Size constraint
  Text strings are subject to certain size restrictions
A text string appears in multiple consecutive frames at a similar position

Analysis (2)

Problems                                              Resolutions
How to extract more useful edges?                     Local thresholding
How to highlight text areas?                          Text area recovery
How to detect text regions quickly and correctly?     Coarse-to-fine detection

Single Threshold

Local threshold (1)

Use a small kernel (red) to scan the whole image. In a bigger window (gray) surrounding the kernel, calculate the local threshold corresponding to its local histogram.

[Figure: a. Window move. b. Local threshold selection: histogram of count against edge strength, from 0, with MIN, T-local and MAX marked and the range split into a low half and a high half.]
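The scan can be sketched as follows. The slide text does not state how T-local is chosen from the histogram, so this sketch assumes the midpoint between the minimum and maximum edge strength in the window; `kernel`, `margin` and the function name are likewise hypothetical.

```python
def local_threshold_map(edge, kernel=2, margin=1):
    """Binarize an edge-strength map block by block. Each kernel x kernel
    block is thresholded with a value computed from a larger window
    (the block extended by `margin` pixels on every side, clipped)."""
    rows, cols = len(edge), len(edge[0])
    out = [[0] * cols for _ in range(rows)]
    for r0 in range(0, rows, kernel):
        for c0 in range(0, cols, kernel):
            wr0, wr1 = max(0, r0 - margin), min(rows, r0 + kernel + margin)
            wc0, wc1 = max(0, c0 - margin), min(cols, c0 + kernel + margin)
            window = [edge[r][c]
                      for r in range(wr0, wr1) for c in range(wc0, wc1)]
            # Assumed rule: midpoint of the window's edge-strength range.
            t_local = (min(window) + max(window)) / 2
            for r in range(r0, min(rows, r0 + kernel)):
                for c in range(c0, min(cols, c0 + kernel)):
                    out[r][c] = 1 if edge[r][c] > t_local else 0
    return out

edge_strength = [
    [200, 40,  0, 0],
    [ 40, 40,  0, 0],
    [  0,  0, 30, 5],
    [  0,  0,  5, 5],
]
binary = local_threshold_map(edge_strength, kernel=2, margin=1)
```

With the same midpoint rule applied to the whole image, the threshold would be 100 and the weak edge of strength 30 would be lost. The local version keeps it because its window contains no strong edge.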

Local threshold (2)

Text-like area recovery (1)

[Figures: before recovery, after recovery]

Text-like area recovery (2)

[Figures: before recovery, after recovery]

High pass filter

Using a top-down scheme to detect text-like areas

Coarse-to-Fine detection

[Flowchart]
  Initial: add the whole image to the processing array
  Take the first region from the array
  Horizontal projection
  Vertical projection
  Can divide?
    Yes: add the sub-regions to the processing array
    No: add the region to the result array
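The loop in the flowchart can be sketched as below, assuming a binary text-likeness map and treating "can divide" as "the projections produce anything other than the region itself". The coarse and fine projection thresholds implied by the slide title are not modeled, and all names are illustrative.

```python
from collections import deque

def nonzero_runs(profile):
    """(start, end) pairs, end exclusive, of maximal runs of nonzero values."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i
        elif not value and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs

def detect_regions(binary):
    """Split the image recursively with projection profiles.
    Regions are (top, bottom, left, right) with exclusive ends."""
    processing = deque([(0, len(binary), 0, len(binary[0]))])
    results = []
    while processing:
        top, bottom, left, right = processing.popleft()
        # Horizontal projection: foreground count per row of the region.
        h_profile = [sum(binary[r][left:right]) for r in range(top, bottom)]
        parts = []
        for r0, r1 in nonzero_runs(h_profile):
            # Vertical projection inside each row band.
            v_profile = [sum(binary[r][c] for r in range(top + r0, top + r1))
                         for c in range(left, right)]
            for c0, c1 in nonzero_runs(v_profile):
                parts.append((top + r0, top + r1, left + c0, left + c1))
        if parts == [(top, bottom, left, right)]:
            results.append(parts[0])   # cannot divide: final region
        else:
            processing.extend(parts)   # divided or shrunk: examine again
    return results

binary = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
]
regions = detect_regions(binary)
```

In this layout a single pass yields one band covering rows 0 to 2 for both upper blocks. Re-projecting each sub-region tightens them to rows 0 to 1 and rows 1 to 2 respectively, which a single pass cannot do.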

Detect text-like areas

[Figure: panels 1) to 4); b. Coarse vertical projection]

Refinement

Combine neighboring text areas of similar height

Use size constraints to remove areas that do not satisfy them
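A sketch of both refinement steps on bounding boxes. Every numeric limit here (`max_gap`, `height_ratio`, `min_width`, `min_height`) is an assumed value for illustration; the slides do not give the actual constraints.

```python
def can_merge(a, b, max_gap, height_ratio):
    """Boxes are (top, bottom, left, right) with exclusive ends."""
    ha, hb = a[1] - a[0], b[1] - b[0]
    similar_height = min(ha, hb) >= height_ratio * max(ha, hb)
    same_line = min(a[1], b[1]) > max(a[0], b[0])   # rows overlap
    gap = max(a[2], b[2]) - min(a[3], b[3])         # negative if overlapping
    return similar_height and same_line and gap <= max_gap

def refine(boxes, max_gap=3, height_ratio=0.8, min_width=8, min_height=4):
    boxes = list(boxes)
    merged_something = True
    # Merge pairs repeatedly until no pair qualifies any more.
    while merged_something:
        merged_something = False
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                if can_merge(boxes[i], boxes[j], max_gap, height_ratio):
                    a, b = boxes[i], boxes[j]
                    boxes[i] = (min(a[0], b[0]), max(a[1], b[1]),
                                min(a[2], b[2]), max(a[3], b[3]))
                    del boxes[j]
                    merged_something = True
                    break
            if merged_something:
                break
    # Size constraints: drop boxes that are too small to be text.
    return [b for b in boxes
            if b[3] - b[2] >= min_width and b[1] - b[0] >= min_height]

candidates = [
    (10, 20, 5, 30),     # first half of a caption
    (11, 20, 32, 60),    # second half, 2 pixels to the right
    (40, 42, 5, 7),      # tiny speckle
    (10, 30, 100, 140),  # taller area on the same rows
]
refined = refine(candidates)
```

The two caption halves merge into one box, the speckle is removed by the size constraint, and the taller area stays separate because its height differs too much.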

Multi-frame analysis

Text region matching
  Find all the regions corresponding to the same text

Text region enhancement
  Enhance the text image quality by multi-frame integration

Repetitive text elimination
  Record the text only at its first appearance
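The three steps might be sketched as follows. Matching by box overlap (intersection over union) and integrating by per-pixel averaging are assumptions made for this example; the slides name the steps but not the operators, and `min_iou` is a hypothetical parameter.

```python
def iou(a, b):
    """Intersection over union of boxes (top, bottom, left, right),
    exclusive ends."""
    ih = min(a[1], b[1]) - max(a[0], b[0])
    iw = min(a[3], b[3]) - max(a[2], b[2])
    if ih <= 0 or iw <= 0:
        return 0.0
    inter = ih * iw
    area_a = (a[1] - a[0]) * (a[3] - a[2])
    area_b = (b[1] - b[0]) * (b[3] - b[2])
    return inter / (area_a + area_b - inter)

def track_text(frames_boxes, min_iou=0.7):
    """Link boxes across consecutive frames. Each track records the frame
    of first appearance, so repeated detections are not reported again."""
    tracks = []   # each: {"first_frame": int, "boxes": [box, ...]}
    active = []   # indices of tracks seen in the previous frame
    for frame_idx, boxes in enumerate(frames_boxes):
        next_active = []
        for box in boxes:
            match = None
            for ti in active:
                last_box = tracks[ti]["boxes"][-1]
                if ti not in next_active and iou(last_box, box) >= min_iou:
                    match = ti
                    break
            if match is None:
                tracks.append({"first_frame": frame_idx, "boxes": [box]})
                match = len(tracks) - 1
            else:
                tracks[match]["boxes"].append(box)
            next_active.append(match)
        active = next_active
    return tracks

def integrate(patches):
    """Per-pixel average of equally sized gray-level patches."""
    n = len(patches)
    height, width = len(patches[0]), len(patches[0][0])
    return [[sum(p[r][c] for p in patches) / n for c in range(width)]
            for r in range(height)]

frames_boxes = [
    [(10, 20, 5, 60)],
    [(10, 20, 6, 60)],
    [(10, 20, 5, 60), (40, 50, 0, 30)],
]
tracks = track_text(frames_boxes)
first_frames = [t["first_frame"] for t in tracks]
enhanced = integrate([[[100, 0]], [[110, 10]], [[90, 20]]])
```

Averaging relies on the text staying still while the background changes, which matches the earlier observation that a text string appears in consecutive frames at a similar position.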

Thank you!

End