Developing Document Image Retrieval System

K. Zagoris, K. Ergina and N. Papamarkos

Image Processing and Multimedia Laboratory

Department of Electrical & Computer Engineering

Democritus University of Thrace,

67100 Xanthi, Greece

Phenomenal growth in the volume of multimedia data, especially document images

Caused by the ease of creating such images with scanners or digital cameras

Huge quantities of document images are created and stored in image archives without any indexing information

The Document Retrieval Problem

The overall structure of the Document Image Retrieval System

Binarization (Otsu Technique)

Original Document

Preprocessing stage

Median Filter
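The binarization step can be illustrated with a minimal sketch of Otsu's thresholding in plain Python. The function names `otsu_threshold` and `binarize`, the 0..255 grey-level range, and the convention that dark pixels become 1 are assumptions made for this illustration; the slides do not show the system's own code.

```python
def otsu_threshold(pixels):
    """Return the grey level that maximizes the between-class variance.

    `pixels` is a flat iterable of integer grey levels in 0..255.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_dark = 0    # number of pixels at or below the candidate threshold
    sum_dark = 0  # sum of their grey levels
    for t in range(256):
        w_dark += hist[t]
        sum_dark += t * hist[t]
        w_light = total - w_dark
        if w_dark == 0:
            continue
        if w_light == 0:
            break
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        between_var = w_dark * w_light * (mean_dark - mean_light) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def binarize(pixels, threshold):
    """Map dark pixels (<= threshold) to 1 (ink) and the rest to 0."""
    return [1 if p <= threshold else 0 for p in pixels]
```

For a page with dark text on a light background, `binarize(pixels, otsu_threshold(pixels))` yields the kind of black-and-white image that the later stages operate on.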

Identify all the Connected Components (CCs)

Calculate the most common height of the document CCs (CCch)

Reject the CCs with height less than 70% of the CCch. This rejects only punctuation marks and noise.

Expand the left and right sides of the resulting CCs by 20% of the CCch

The words are the merged overlapping CCs

Using the Connected Components Labeling and Filtering method

Word Segmentation
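The segmentation steps can be sketched on bounding boxes alone. This is a hedged illustration, not the system's code: it assumes the connected components have already been labeled and are given as `(x0, y0, x1, y1)` tuples, and the names `segment_words`, `overlaps` and `merge_overlapping` are hypothetical.

```python
from collections import Counter


def overlaps(a, b):
    """True if two (x0, y0, x1, y1) boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def merge_overlapping(boxes):
    """Repeatedly merge intersecting boxes until none overlap."""
    boxes = list(boxes)
    changed = True
    while changed:
        changed = False
        result = []
        for box in boxes:
            for i, other in enumerate(result):
                if overlaps(box, other):
                    result[i] = (min(box[0], other[0]), min(box[1], other[1]),
                                 max(box[2], other[2]), max(box[3], other[3]))
                    changed = True
                    break
            else:
                result.append(box)
        boxes = result
    return boxes


def segment_words(cc_boxes):
    """Apply the filtering, expansion and merging steps to CC boxes."""
    heights = [y1 - y0 for (_, y0, _, y1) in cc_boxes]
    cch = Counter(heights).most_common(1)[0][0]      # most common CC height
    kept = [b for b in cc_boxes if (b[3] - b[1]) >= 0.7 * cch]
    pad = 0.2 * cch                                  # horizontal expansion
    expanded = [(x0 - pad, y0, x1 + pad, y1) for (x0, y0, x1, y1) in kept]
    return sorted(merge_overlapping(expanded))
```

With three letter-sized boxes and one small dot, for example, the dot is rejected, the two neighbouring letters merge into one word box, and the distant letter stays a separate word.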

Width to Height Ratio

Word Area Density. The percentage of the black pixels included in the word-bounding box

Center of Gravity. The Euclidean distance from the word’s center of gravity to the upper left corner of the bounding box:

Features

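$$C_x = \frac{M(1,0)}{M(0,0)}, \qquad C_y = \frac{M(0,1)}{M(0,0)}$$

$$M(p,q) = \sum_{x}\sum_{y} \left(\frac{x}{width}\right)^{p} \left(\frac{y}{height}\right)^{q} f(x,y)$$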
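A minimal sketch of the three scalar features follows. It assumes the word image is a list of rows with 1 for black pixels and 0 for background, so that f(x, y) is the pixel value, and it reads the moments as normalized by the box width and height. The names `moment` and `scalar_features` are hypothetical, and the density is returned as a fraction rather than a percentage.

```python
import math


def moment(image, p, q):
    """Normalized moment M(p, q) of a binary word image."""
    height, width = len(image), len(image[0])
    return sum(
        (x / width) ** p * (y / height) ** q * value
        for y, row in enumerate(image)
        for x, value in enumerate(row)
    )


def scalar_features(image):
    """Width/height ratio, area density and center-of-gravity distance.

    Assumes the image contains at least one black pixel.
    """
    height, width = len(image), len(image[0])
    m00 = moment(image, 0, 0)            # number of black pixels
    cx = moment(image, 1, 0) / m00
    cy = moment(image, 0, 1) / m00
    return {
        "ratio": width / height,
        "density": m00 / (width * height),
        "cog_distance": math.hypot(cx, cy),  # distance to the upper left corner
    }
```

Because x and y are divided by the box dimensions, the center-of-gravity distance does not depend on the absolute size of the word image.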

Vertical Projection. The first twenty (20) coefficients of the Discrete Cosine Transform (DCT) of the smoothed and normalized vertical projection.

Features

[Figure: original image, the vertical projection, and the smoothed and normalized projection]
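The vertical projection feature can be sketched as follows. The slides do not say how the projection is smoothed or normalized, so the 3-point moving average and the division by the peak value are assumptions, as are the function names. The DCT is the standard unnormalized DCT-II, written out directly so the block needs no external library.

```python
import math


def vertical_projection(image):
    """Count black pixels (value 1) in each column of a binary word image."""
    return [sum(column) for column in zip(*image)]


def smooth(signal, radius=1):
    """Moving average; the window is clipped at the ends of the signal."""
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - radius): i + radius + 1]
        out.append(sum(window) / len(window))
    return out


def normalize(signal):
    """Scale so the largest value is 1 (left unchanged if all zero)."""
    peak = max(signal)
    return [v / peak for v in signal] if peak > 0 else list(signal)


def dct_ii(signal, n_coeffs):
    """First n_coeffs coefficients of the unnormalized DCT-II."""
    n = len(signal)
    return [
        sum(v * math.cos(math.pi / n * (i + 0.5) * k)
            for i, v in enumerate(signal))
        for k in range(n_coeffs)
    ]


def vertical_projection_feature(image, n_coeffs=20):
    """DCT coefficients of the smoothed, normalized vertical projection."""
    profile = normalize(smooth(vertical_projection(image)))
    return dct_ii(profile, n_coeffs)
```

Keeping only the low-order coefficients retains the coarse shape of the projection and discards fine detail, which fits the aim of suppressing small differences between fonts.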

Top – Bottom Shape Projections. A vector of 50 elements

The first 25 values are the first 25 coefficients of the smoothed and normalized Top Shape Projection DCT

The remaining 25 values are the first 25 coefficients of the smoothed and normalized Bottom Shape Projection DCT.
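One common way to define the top and bottom shape projections is sketched below: for each column, the distance from the top edge of the word box to the first black pixel, and the distance from the bottom edge to the last black pixel. This definition, the handling of empty columns, and the name `top_bottom_profiles` are assumptions, since the slides do not spell out the construction.

```python
def top_bottom_profiles(image):
    """Return (top, bottom) shape profiles of a binary word image.

    top[c]    = rows between the top edge and the first black pixel in column c
    bottom[c] = rows between the bottom edge and the last black pixel in column c
    Columns with no black pixel get the full image height in both profiles.
    """
    height = len(image)
    top, bottom = [], []
    for column in zip(*image):
        rows = [r for r, value in enumerate(column) if value]
        if rows:
            top.append(rows[0])
            bottom.append(height - 1 - rows[-1])
        else:
            top.append(height)
            bottom.append(height)
    return top, bottom
```

Each profile would then be smoothed, normalized and passed through the DCT, keeping 25 coefficients per profile to form the 50-element vector.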

Features

Upper Grid Features is a ten element vector with binary values which are extracted from the upper part of each word image.

Down Grid Features is a ten element vector with binary values which are extracted from the lower part of the word image.

Features

Upper – Down Grid Features

[0,0,0,1195,0,0,0,0,0,0]

[0,0,0,1,0,0,0,0,0,

[0,0,0,0,0,0,0,598,50,33]

[0,0,0,0,0,0,0,1,1,0]
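The example vectors above suggest that per-cell pixel counts are turned into binary values by some cutoff, since a count of 50 maps to 1 while 33 maps to 0. The sketch below shows only that final step. The function name and the threshold of 40 are hypothetical, chosen to reproduce the example, and the rule the system actually uses is not stated in the slides.

```python
def binarize_counts(counts, threshold):
    """Map each grid-cell pixel count to 1 if it exceeds threshold, else 0."""
    return [1 if count > threshold else 0 for count in counts]


# Pixel counts taken from the example above; 40 is an assumed cutoff.
counts = [0, 0, 0, 0, 0, 0, 0, 598, 50, 33]
grid_feature = binarize_counts(counts, threshold=40)
```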

Descriptor

The Structure of the Descriptor

User enters a query word

The proposed system creates an image of the query word with font height equal to the average height of all the word-boxes obtained through the Word Segmentation stage of the Offline operation.

For our experimental set the average height is 50

The font type of the query image is Arial

The smoothing and normalization of the various features described before suppress small differences between various types of fonts

Query Image Creation

The Matching Process

100 image documents created artificially from various texts

Then Gaussian and “Salt and Pepper” noise were added

A text search engine was implemented in parallel, which makes it easier to verify and evaluate the search results of the proposed system

Experimental Document Database

Implementation

o Visual Studio

o Microsoft .NET Framework 2.0

o C# Language

o Microsoft SQL Server 2005

http://orpheus.ee.duth.gr/irs2_5/

Evaluation

o Precision and Recall metrics

o 30 searches in 100 document images

o Font Query:

Mean Precision: 87.8%

Mean Recall: 99.26%
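For reference, the two metrics can be computed per query as in the sketch below. The function name and the use of sets of word identifiers are assumptions for illustration; the reported means would be averages of these values over the 30 searches.

```python
def precision_recall(retrieved, relevant):
    """Precision and recall for one query.

    retrieved: set of word identifiers returned by the system
    relevant:  set of word identifiers that truly match the query
    """
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

Precision is the share of returned words that are correct, while recall is the share of correct words that were returned.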

FineReader® 9.0 OCR Program

Evaluation

Query Font Name “Tahoma”.

The query word is given in text and then transformed to word image

The proposed system extracts nine (9) powerful features for the description of the word images

These features describe the shape of the words satisfactorily while at the same time suppressing small differences due to noise, size and type of fonts

Based on our experiments, the proposed system performs better on the same database than a commercial OCR package

Conclusion

Thank you!
