Chinese Character Recognition for Video Presented by: Vincent Cheung Date: 25 October 1999.
Chinese Character Recognition for Video
Presented by: Vincent Cheung
Date: 25 October 1999
Introduction
Chinese has many dialects, but written Chinese characters are common to all of them.
Many video programs have Chinese subtitles nowadays
Extracting text from digital video programs can help with indexing, searching and retrieval
Features of Subtitles
Characters are in the foreground
They are monochrome
They are rigid from frame to frame
They are upright
They have size restrictions
They contrast with the background
They appear in clusters at a limited distance, aligned to a horizontal line
Steps to Recognise Text
Clearing the background, removing noise
Segmenting the characters
Recognising them by pattern matching
Demo Video
A news report from ATV about the Airport Authority Hong Kong, reported in Cantonese
In MPEG format
1... 2... 3... Action!
MPEG Video
Consists of a video track and an audio track
Consists of frames
In the video track, each frame represents a static image
Steps to Remove Background
Agnihotri & Dimitrova suggested a 7-step procedure:
Channel Separation
Image Enhancement
Edge Detection
Edge Filtering
Character Detection
Text Box Detection
Text Line Detection & Enhancement
Sample Frame
The 100th frame of the demo video
Channel Separation
Use the red channel, which gives higher-contrast edges
Natural environments are more likely to be blue or green
[Figure panels: Green Channel, Red Channel, Blue Channel]
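The channel separation step can be illustrated with a minimal sketch. It assumes a frame stored as rows of (R, G, B) tuples and uses a made-up toy frame (white subtitle pixels on a blue-green background); the function names and colour values are illustrative assumptions, not part of the original presentation.

```python
def split_channels(frame):
    """Split an RGB frame (rows of (R, G, B) tuples) into three grey-level images."""
    red = [[pixel[0] for pixel in row] for row in frame]
    green = [[pixel[1] for pixel in row] for row in frame]
    blue = [[pixel[2] for pixel in row] for row in frame]
    return red, green, blue


def max_horizontal_step(channel):
    """Largest absolute difference between horizontally adjacent pixels."""
    return max(
        abs(row[x + 1] - row[x])
        for row in channel
        for x in range(len(row) - 1)
    )


# Toy frame (assumed values): white text on a blue-green background.
BACKGROUND = (20, 120, 200)
TEXT = (255, 255, 255)
frame = [
    [BACKGROUND, TEXT, BACKGROUND],
    [BACKGROUND, TEXT, BACKGROUND],
]

red, green, blue = split_channels(frame)
contrast = {
    "red": max_horizontal_step(red),
    "green": max_horizontal_step(green),
    "blue": max_horizontal_step(blue),
}
```

On this toy frame the red channel shows the largest step between text and background, which is the effect the slide relies on. Real footage will not always behave this way.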
Image Enhancement
To filter salt-and-pepper noise
To sharpen the edges
The quality of our MPEG video is good enough that this step is not needed
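Although this step was skipped for the demo video, salt-and-pepper noise is commonly removed with a median filter. The sketch below is one possible approach, not necessarily the one used by Agnihotri & Dimitrova; it assumes a grey-level image stored as a list of rows and leaves the border pixels unchanged. Edge sharpening is not covered.

```python
def median_filter_3x3(image):
    """Replace each interior pixel with the median of its 3x3 neighbourhood."""
    height, width = len(image), len(image[0])
    filtered = [row[:] for row in image]  # border pixels are copied unchanged
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = sorted(
                image[y + dy][x + dx]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            )
            filtered[y][x] = window[4]  # middle of the 9 sorted values
    return filtered


# Toy image (assumed values): flat grey with one "salt" and one "pepper" pixel.
noisy = [[100] * 5 for _ in range(5)]
noisy[2][2] = 255  # salt
noisy[1][3] = 0    # pepper
clean = median_filter_3x3(noisy)
```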
Edge Detection
Find the edges in the image
Use a 3x3 matrix mask:
[ -1 -1 -1 ]
[ -1 12 -1 ]
[ -1 -1 -1 ]
Use Sobel filter instead; edges around text may be broken and not connected
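A minimal sketch of applying the 3x3 mask to a grey-level image follows. The threshold value and the toy image are assumptions chosen for illustration. Note that the mask coefficients sum to 4, so a flat region of intensity v still produces a response of 4v; the threshold has to sit above that.

```python
MASK = [
    [-1, -1, -1],
    [-1, 12, -1],
    [-1, -1, -1],
]


def detect_edges(image, threshold):
    """Mark interior pixels whose mask response exceeds the threshold."""
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]  # border pixels stay 0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            response = sum(
                MASK[dy + 1][dx + 1] * image[y + dy][x + dx]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            )
            if response > threshold:
                edges[y][x] = 1
    return edges


# Toy image (assumed values): dark background with a bright vertical stroke.
image = [[10, 10, 200, 10, 10] for _ in range(5)]
edges = detect_edges(image, threshold=1000)  # threshold is a hypothetical value
```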
Sample Edge Image
Edge Filtering
To remove areas which possibly do not contain text
Characters give a high density of objects, hence a high density of edges
Find areas with a high density of edges, which hint at where the characters are located
Density of edges in horizontal lines
Filtering the Irrelevant Edges
Density of edges in vertical lines
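The density-based filtering can be sketched as follows, assuming a binary edge image (1 = edge pixel) stored as a list of rows. The function names and the 0.5 density cut-off are illustrative assumptions; the slides do not state how the threshold was chosen.

```python
def row_density(edges):
    """Fraction of edge pixels in each horizontal line."""
    return [sum(row) / len(row) for row in edges]


def column_density(edges):
    """Fraction of edge pixels in each vertical line."""
    height = len(edges)
    return [sum(row[x] for row in edges) / height for x in range(len(edges[0]))]


def dense_rows(edges, min_density):
    """Indices of rows whose edge density reaches min_density."""
    return [y for y, d in enumerate(row_density(edges)) if d >= min_density]


def keep_rows(edges, rows):
    """Zero out every row that is not listed in rows."""
    wanted = set(rows)
    return [row[:] if y in wanted else [0] * len(row) for y, row in enumerate(edges)]


# Toy edge image (assumed): rows 2 and 3 hold a subtitle, row 0 a stray edge.
edges = [
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 1],
    [1, 0, 1, 1, 1, 0],
]
text_rows = dense_rows(edges, min_density=0.5)
filtered = keep_rows(edges, text_rows)
```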
What if the subtitle is short?
Cut the image into several parts and calculate the density of edges in each part
This handles the case where a short subtitle is too small to stand out in the density of the whole line
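A small sketch of the idea: a short subtitle barely affects the edge density of a whole line, but it dominates the density inside the part that contains it. The code assumes the image is cut into equal-width vertical strips, which the slides do not specify; the names and the toy data are illustrative.

```python
def split_into_strips(edges, parts):
    """Cut a binary edge image into `parts` vertical strips of near-equal width."""
    width = len(edges[0])
    bounds = [width * i // parts for i in range(parts + 1)]
    return [
        [row[bounds[i]:bounds[i + 1]] for row in edges]
        for i in range(parts)
    ]


def strip_row_density(edges, parts):
    """Row-wise edge density computed separately inside each strip."""
    return [
        [sum(row) / len(row) for row in strip]
        for strip in split_into_strips(edges, parts)
    ]


# Toy edge image (assumed): a short subtitle occupying only 2 of 8 columns.
edges = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
whole_line_density = sum(edges[1]) / len(edges[1])
per_strip = strip_row_density(edges, parts=4)
```

Here the whole-line density of the subtitle row is low, while the first strip shows a density of 1.0 for the same row.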
Sample Image Divided into Parts
Challenges in Chinese Characters Segmentation
Square? Not really, they vary in size!
They have different heights and widths, e.g. (日, 曰)
This leads to problems with fixed-distance segmentation
More problems arise when mixed with English, numbers and symbols, e.g. 18部「IBM」電腦
Usually written horizontally, like English
Segment them like English?
English: each character is horizontally linked
Chinese: may not have such linkage, e.g. 八, 川
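The linkage problem can be demonstrated with a simple gap-based segmenter that cuts wherever a column contains no ink. This is an assumed baseline for illustration, not the method proposed in the presentation; the tiny glyphs are hand-drawn stand-ins for a Latin letter and for 川.

```python
def to_bitmap(rows):
    """Convert strings of '#' (ink) and '.' (blank) into a boolean bitmap."""
    return [[ch == "#" for ch in row] for row in rows]


def segment_by_gaps(bitmap):
    """Return (start, end) column ranges separated by completely blank columns."""
    width = len(bitmap[0])
    has_ink = [any(row[x] for row in bitmap) for x in range(width)]
    segments = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                      # a new segment begins
        elif not ink and start is not None:
            segments.append((start, x))    # a blank column closes the segment
            start = None
    if start is not None:
        segments.append((start, width))
    return segments


# Hand-drawn toy glyphs (assumed shapes).
LETTER_L = to_bitmap(["#..", "#..", "###"])       # horizontally linked
CHUAN = to_bitmap(["#.#.#", "#.#.#", "#.#.#"])    # 川: three separate strokes

letter_segments = segment_by_gaps(LETTER_L)
chuan_segments = segment_by_gaps(CHUAN)
```

The letter comes out as a single segment, while the 川 stand-in is wrongly split into three pieces.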
Character Recognition
Pattern Matching
The most straightforward approach
Two patterns are compared using a pattern distance
Classification for Faster Matching
By blackness (e.g. 一, 鬱)
By projection profiles
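A minimal sketch of these ideas follows. The slides do not define the pattern distance, so the code assumes a simple count of differing pixels between two equal-sized binary bitmaps; the tolerance value, the function names and the 3x3 toy templates (a full square standing in for a dense character such as 鬱) are also assumptions.

```python
def to_bitmap(rows):
    """Convert strings of '#' (black) and '.' (white) into a 0/1 bitmap."""
    return [[1 if ch == "#" else 0 for ch in row] for row in rows]


def pattern_distance(a, b):
    """Number of pixels that differ between two equal-sized bitmaps (assumed measure)."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def blackness(bitmap):
    """Fraction of black pixels in the bitmap."""
    return sum(map(sum, bitmap)) / (len(bitmap) * len(bitmap[0]))


def projection_profiles(bitmap):
    """Black-pixel counts per row and per column."""
    rows = [sum(row) for row in bitmap]
    cols = [sum(col) for col in zip(*bitmap)]
    return rows, cols


def recognise(unknown, templates, tolerance):
    """Pre-filter templates by blackness, then pick the smallest pattern distance."""
    target = blackness(unknown)
    candidates = {
        name: bitmap
        for name, bitmap in templates.items()
        if abs(blackness(bitmap) - target) <= tolerance
    }
    if not candidates:
        return None
    return min(candidates, key=lambda name: pattern_distance(unknown, candidates[name]))


# Toy templates (assumed shapes): a sparse character and a dense one.
templates = {
    "一": to_bitmap(["...", "###", "..."]),
    "dense": to_bitmap(["###", "###", "###"]),
}
unknown = to_bitmap(["...", "###", "..#"])  # 一 with one noisy pixel
best = recognise(unknown, templates, tolerance=0.2)
```

The blackness pre-filter discards the dense template before any pixel-by-pixel comparison is made, which is where the speed-up comes from.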
Possible Enhancement
Pick out moving objects by keeping track of a number of consecutive frames
Use a lexicon to choose the most probable character
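One way to pick out moving objects, sketched here as an assumption rather than the presenter's design, is to flag pixels whose grey level changes across consecutive frames. Since subtitles are rigid from frame to frame, the unflagged pixels are the candidates for text. The tolerance and the toy frames are illustrative.

```python
def moving_mask(frames, tolerance):
    """Return 1 where a pixel's grey level varies by more than tolerance across frames."""
    height, width = len(frames[0]), len(frames[0][0])
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            values = [frame[y][x] for frame in frames]
            row.append(1 if max(values) - min(values) > tolerance else 0)
        mask.append(row)
    return mask


# Three toy one-row frames (assumed values): the last pixel belongs to a moving object.
frames = [
    [[10, 200, 50]],
    [[12, 200, 90]],
    [[11, 200, 130]],
]
mask = moving_mask(frames, tolerance=5)
```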
Q & A