
Post on 22-Dec-2015


1

Incremental Detection of Text on Road Signs from Video

Wen Wu

Joint work with Xilin Chen and Jie Yang

2

Acquire Text, Process Text

[Diagram. Labels: Corpus, Language (Text), Web, Visual, Speech, NLP, Translation, IR/IE, Multimedia, Speech]

3

Text helps to understand images

4

Why interested in text on signs?

• Signs are everywhere in our daily life: shop names, billboards, street names, etc.;

• Like other information devices, road signs are placed to convey information to people for different purposes;

• Text could be the most flexible way to express dynamic information.

• Why not have computers understand that text and assist people further?

5

Too many signs cause problems

6

It happened in Pittsburgh too!

7

Task

• Automatically detect text on road signs from video.

8

Related work

9

What makes us detect signs?

10

What do you think?

11

Vertical plane property of signs

12

Divide-and-Conquer Strategy

• Decompose the original task into two sub-tasks: localization of road signs and detection of text;

• Propose an algorithm for each sub-task and integrate the two by mapping corresponding feature points;

• Use features not only from individual 2D images but also from the temporal dependency between them.
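The decomposition above can be sketched in a few lines. This is a minimal illustration, not the authors' algorithms: `localize_signs` and `detect_text` are hypothetical callables standing in for the two sub-tasks, and the integration here is a plain crop plus coordinate shift rather than the feature-point mapping named on the slide.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)
Image = List[List[int]]          # row-major grayscale pixels


def crop(image: Image, box: Box) -> Image:
    """Return the sub-image covered by box."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]


def detect_text_on_signs(
    image: Image,
    localize_signs: Callable[[Image], List[Box]],  # hypothetical sub-task 1
    detect_text: Callable[[Image], List[Box]],     # hypothetical sub-task 2
) -> List[Box]:
    """Search for text only inside the regions proposed as signs."""
    found: List[Box] = []
    for sx, sy, sw, sh in localize_signs(image):
        region = crop(image, (sx, sy, sw, sh))
        for tx, ty, tw, th in detect_text(region):
            # Shift the text box from region coordinates to frame coordinates.
            found.append((sx + tx, sy + ty, tw, th))
    return found


# Toy stand-ins: one sign region, one text box inside it.
blank_frame: Image = [[0] * 8 for _ in range(6)]
text_boxes = detect_text_on_signs(
    blank_frame,
    localize_signs=lambda img: [(2, 1, 4, 3)],
    detect_text=lambda region: [(1, 1, 2, 1)],
)
print(text_boxes)  # [(3, 2, 2, 1)]
```

Restricting the text detector to sign regions is what lets the second sub-task ignore the rest of the frame.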

13

Incremental Detection Framework

14

Why incremental?

• Computation requirement

– Detection is a computationally expensive step;

– In contrast, mapping correspondence points is a cheap step;

• Video resolution

– Detection requires low resolution

– OCR requires high resolution

[Figure: stages over time: Localize, Detect, Recognize]
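The cost argument above can be made concrete with a generic detect-then-track loop. This is a sketch under stated assumptions, not the framework from the talk: `detect` and `track` are hypothetical callables, and the fixed `redetect_every` interval is an illustrative choice that the slides do not specify.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


def run_incremental(
    frames: Sequence,
    detect: Callable[[object], List[Box]],            # expensive step (hypothetical)
    track: Callable[[List[Box], object], List[Box]],  # cheap step (hypothetical)
    redetect_every: int = 10,                         # assumed interval, not from the slides
) -> Tuple[List[List[Box]], Dict[str, int]]:
    """Run the expensive detector on a few frames and the cheap tracker on the rest."""
    if redetect_every < 1:
        raise ValueError("redetect_every must be at least 1")
    boxes: List[Box] = []
    per_frame: List[List[Box]] = []
    calls = {"detect": 0, "track": 0}
    for index, frame in enumerate(frames):
        if index % redetect_every == 0 or not boxes:
            # Nothing to follow yet, or time to refresh: pay for detection.
            boxes = detect(frame)
            calls["detect"] += 1
        else:
            # Otherwise carry the previous boxes forward cheaply.
            boxes = track(boxes, frame)
            calls["track"] += 1
        per_frame.append(list(boxes))
    return per_frame, calls


# Toy stand-ins: frames are integers and the "sign" moves one pixel per frame.
def toy_detect(frame: int) -> List[Box]:
    return [(frame, 0, 10, 10)]


def toy_track(boxes: List[Box], frame: int) -> List[Box]:
    return [(x + 1, y, w, h) for x, y, w, h in boxes]


per_frame, calls = run_incremental(range(30), toy_detect, toy_track)
print(calls)  # {'detect': 3, 'track': 27}
```

In this toy run the expensive step executes on 3 of 30 frames, which is the kind of saving the slide points to.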

15

System Implementation Prototype

Built on a PC with an Intel Pentium 4 CPU @ 1.8 GHz and 1 GB of memory, running Windows XP.

Data:

1) Captured by a DV camera mounted on a minivan.

2) Video frame size is 640×480.

3) The database includes about 3 hours of video captured under different conditions: in the morning, in the afternoon, and at dusk.

16

A Demo

Demo

17

Sequences of the Demo

18

Incremental vs. Non-incremental

Another demo

19

Summary of Evaluation

• 22 video sequences with different driving situations;

• Vehicle's speed varies from 20 to 55 MPH;

• Testing data contain ~90 road signs and > 300 words.

Table 1. Sign localization performance

# of signs   Hit rate   False hits
92           92.4%      17.9%

Table 2. Text detection performance

Method            Hit rate   False hits   Speed
Non-incremental   80.2%      85.6%        2-6 fps
Incremental       88.9%      9.2%         8-16 fps
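The slides do not define "hit rate" or "false hits". As a worked check, the sketch below assumes that hit rate means correctly localized signs divided by the total number of signs; the helper names are hypothetical. Under that assumption, 85 is the only whole count out of 92 that rounds to the reported 92.4%, while no whole count out of 92 gives 17.9%, so the false-hit figure presumably uses a different denominator that the slides do not state.

```python
from typing import List


def rate_percent(count: int, total: int) -> float:
    """Percentage of count out of total, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * count / total, 1)


def counts_matching(reported_percent: float, total: int) -> List[int]:
    """Whole counts out of total that round to the reported percentage."""
    return [k for k in range(total + 1) if rate_percent(k, total) == reported_percent]


# Assumption: hit rate = correctly localized signs / all signs (92 in Table 1).
print(rate_percent(85, 92))       # 92.4
print(counts_matching(92.4, 92))  # [85]
print(counts_matching(17.9, 92))  # []
```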

20

Contributions

• Proposed a unified framework for automatically detecting text on road signs from video based on the natural characteristics of the task;

• Exploited features for text detection not only from individual 2D images but also from temporal dependency in video;

• Made a connection between understanding visual information and understanding language (text).

21

Conclusions & Future Work

• Automatic detection of text on road signs could be very useful in various applications;

• Experiments have shown that the new framework could significantly improve the robustness and efficiency of any existing text detection algorithm;

• Future work: Apply various language methods to detected texts in video, e.g., translation, IR, etc.

22

Questions?

Thank You