Developing Document Image Retrieval System
![Page 1: Developing Document Image Retrieval System](https://reader035.fdocuments.in/reader035/viewer/2022062514/5580bb22d8b42ac6088b4fcc/html5/thumbnails/1.jpg)
K. Zagoris, K. Ergina and N. Papamarkos
Developing Document Image Retrieval System
Image Processing and Multimedia Laboratory
Department of Electrical & Computer Engineering
Democritus University of Thrace,
67100 Xanthi, Greece
![Page 2: Developing Document Image Retrieval System](https://reader035.fdocuments.in/reader035/viewer/2022062514/5580bb22d8b42ac6088b4fcc/html5/thumbnails/2.jpg)
Multimedia data, and document images in particular, are growing rapidly in volume
This growth is driven by how easily such images can be created with scanners or digital cameras
Huge quantities of document images are created and stored in image archives without any indexing information
The Document Retrieval Problem
![Page 3: Developing Document Image Retrieval System](https://reader035.fdocuments.in/reader035/viewer/2022062514/5580bb22d8b42ac6088b4fcc/html5/thumbnails/3.jpg)
The overall structure of the Document Image Retrieval System
![Page 4: Developing Document Image Retrieval System](https://reader035.fdocuments.in/reader035/viewer/2022062514/5580bb22d8b42ac6088b4fcc/html5/thumbnails/4.jpg)
Binarization (Otsu Technique)
Original Document
Preprocessing stage
Median Filter
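A minimal sketch of the two preprocessing operations named on this slide, assuming a greyscale page stored as a 2D list of integers in 0..255. The 3x3 window, the clamped borders, and applying the median filter before Otsu binarization are assumptions; the slide does not state them. The function names are illustrative only.

```python
from statistics import median


def median_filter(img, radius=1):
    """Median filter with a (2*radius+1)^2 window; border pixels are clamped."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in range(-radius, radius + 1)
                for dx in range(-radius, radius + 1)
            ]
            out[y][x] = int(median(window))
    return out


def otsu_threshold(img, levels=256):
    """Grey level that maximises the between-class variance (Otsu's criterion)."""
    hist = [0] * levels
    for row in img:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * c for i, c in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(levels):
        w0 += hist[t]            # pixels with grey level <= t
        sum0 += t * hist[t]
        w1 = total - w0          # pixels with grey level > t
        if w0 == 0:
            continue
        if w1 == 0:
            break
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(img):
    """Return a 0/1 image where 1 marks ink (dark) pixels. Assumes dark text."""
    t = otsu_threshold(img)
    return [[1 if v <= t else 0 for v in row] for row in img]
```

A call such as `binarize(median_filter(page))` would then give the black-and-white image that the later stages work on.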
![Page 5: Developing Document Image Retrieval System](https://reader035.fdocuments.in/reader035/viewer/2022062514/5580bb22d8b42ac6088b4fcc/html5/thumbnails/5.jpg)
Identify all the Connected Components (CCs)
Calculate the most common height of the document CCs (CCch)
Reject the CCs with height less than 70% of the CCch. This rejects only punctuation marks and noise.
Expand the left and right sides of the resulting CCs by 20% of the CCch
The words are the merged overlapping CCs
Using the Connected Components Labeling and Filtering method
Word Segmentation
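The segmentation steps listed on this slide can be sketched roughly as follows, assuming a binary image given as a 2D list with 1 for ink. The 8-connectivity, the rounding of the 20% expansion to whole pixels, and the function names are assumptions made for illustration, not details taken from the slides.

```python
from collections import Counter, deque


def connected_component_boxes(binary):
    """Bounding boxes (x0, y0, x1, y1), inclusive, of 8-connected ink regions."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for sy in range(h):
        for sx in range(w):
            if not binary[sy][sx] or seen[sy][sx]:
                continue
            seen[sy][sx] = True
            queue = deque([(sx, sy)])
            x0 = x1 = sx
            y0 = y1 = sy
            while queue:
                x, y = queue.popleft()
                x0, x1, y0, y1 = min(x0, x), max(x1, x), min(y0, y), max(y1, y)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = x + dx, y + dy
                        if (0 <= nx < w and 0 <= ny < h
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes


def overlaps(a, b):
    """True when two inclusive boxes share at least one pixel."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def segment_words(binary, min_height_ratio=0.7, expand_ratio=0.2):
    boxes = connected_component_boxes(binary)
    if not boxes:
        return []
    # CCch: the most common CC height in the document.
    cc_ch = Counter(y1 - y0 + 1 for _, y0, _, y1 in boxes).most_common(1)[0][0]
    pad = round(expand_ratio * cc_ch)
    width = len(binary[0])
    # Drop short CCs (punctuation, noise), then widen the rest on both sides.
    grown = [
        (max(x0 - pad, 0), y0, min(x1 + pad, width - 1), y1)
        for x0, y0, x1, y1 in boxes
        if (y1 - y0 + 1) >= min_height_ratio * cc_ch
    ]
    # Merge overlapping boxes repeatedly until nothing changes.
    merged = True
    while merged:
        merged = False
        result = []
        for box in grown:
            for i, other in enumerate(result):
                if overlaps(box, other):
                    result[i] = (min(box[0], other[0]), min(box[1], other[1]),
                                 max(box[2], other[2]), max(box[3], other[3]))
                    merged = True
                    break
            else:
                result.append(box)
        grown = result
    return sorted(grown)
```

Note that the returned word boxes still include the horizontal padding; whether the original system trims it again is not stated on the slide.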
![Page 6: Developing Document Image Retrieval System](https://reader035.fdocuments.in/reader035/viewer/2022062514/5580bb22d8b42ac6088b4fcc/html5/thumbnails/6.jpg)
Width to Height Ratio
Word Area Density. The percentage of the black pixels included in the word-bounding box
Center of Gravity. The Euclidean distance from the word’s center of gravity to the upper left corner of the bounding box:
$$C_x = \frac{M_{(1,0)}}{M_{(0,0)}}, \qquad C_y = \frac{M_{(0,1)}}{M_{(0,0)}}$$
$$M_{(p,q)} = \sum_{x}\sum_{y} \left(\frac{x}{\mathrm{width}}\right)^{p} \left(\frac{y}{\mathrm{height}}\right)^{q} f(x,y)$$
Features
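As a rough illustration of the three scalar features on this slide, the sketch below assumes the word image is a 2D list cropped to its bounding box, with f(x, y) = 1 for black pixels and 0 otherwise, and with the origin at the upper left corner. The function name and the error raised for a blank image are assumptions.

```python
import math


def scalar_features(word):
    """Return (width/height ratio, area density in %, centre-of-gravity distance)."""
    height, width = len(word), len(word[0])

    def moment(p, q):
        # M(p,q) = sum_x sum_y (x/width)^p (y/height)^q f(x,y)
        return sum(
            (x / width) ** p * (y / height) ** q * word[y][x]
            for y in range(height)
            for x in range(width)
        )

    m00 = moment(0, 0)  # number of black pixels
    if m00 == 0:
        raise ValueError("word image contains no black pixels")
    ratio = width / height
    density = 100.0 * m00 / (width * height)
    cx = moment(1, 0) / m00
    cy = moment(0, 1) / m00
    # Distance from the centre of gravity to the upper left corner (0, 0).
    return ratio, density, math.hypot(cx, cy)
```

Because x and y are divided by the box width and height, the centre-of-gravity distance does not depend on the size at which the word was scanned or rendered.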
![Page 7: Developing Document Image Retrieval System](https://reader035.fdocuments.in/reader035/viewer/2022062514/5580bb22d8b42ac6088b4fcc/html5/thumbnails/7.jpg)
Vertical Projection. The first twenty (20) coefficients of the Discrete Cosine Transform (DCT) of the smoothed and normalized vertical projection.
Features
Original Image
The Vertical Projection
Smoothed and normalized
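One way the vertical projection feature could be computed is sketched below. The slide does not say how the projection is smoothed or normalized, so the moving-average smoothing, the resampling to 64 samples, the scaling to a peak of 1, and the unscaled DCT-II are all assumptions chosen for illustration.

```python
import math


def vertical_projection(word):
    """Number of black pixels in each column of a 0/1 word image."""
    return [sum(column) for column in zip(*word)]


def smooth(values, radius=1):
    """Moving average; the window shrinks at both ends (assumed smoothing)."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - radius): i + radius + 1]
        out.append(sum(window) / len(window))
    return out


def normalize(values, length=64):
    """Resample to a fixed length by linear interpolation and scale the peak to 1."""
    n = len(values)
    resampled = []
    for i in range(length):
        pos = i * (n - 1) / (length - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        resampled.append(values[lo] * (1 - frac) + values[hi] * frac)
    peak = max(resampled)
    return [v / peak for v in resampled] if peak > 0 else resampled


def dct(values, count):
    """First `count` coefficients of the unscaled DCT-II."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * (i + 0.5) * k / n) for i, v in enumerate(values))
        for k in range(count)
    ]


def vertical_projection_feature(word, count=20):
    """Twenty DCT coefficients of the smoothed, normalized vertical projection."""
    return dct(normalize(smooth(vertical_projection(word))), count)
```

Keeping only the low-order coefficients discards fine detail in the profile, which is in line with the deck's later remark that smoothing and normalizing suppress small differences between fonts.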
![Page 8: Developing Document Image Retrieval System](https://reader035.fdocuments.in/reader035/viewer/2022062514/5580bb22d8b42ac6088b4fcc/html5/thumbnails/8.jpg)
Top – Bottom Shape Projections. A vector of 50 elements
The first 25 values are the first 25 coefficients of the smoothed and normalized Top Shape Projection DCT
The remaining 25 values are the first 25 coefficients of the smoothed and normalized Bottom Shape Projection DCT
Features
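The slide does not define the shape projections themselves. A common reading, assumed here, is that the top projection records for each column the distance from the top of the bounding box to the first black pixel, and the bottom projection the distance from the bottom edge upward. The sketch below builds the 50-element vector under that assumption; it leaves out the smoothing and normalization step for brevity and uses an unscaled DCT-II, so it is an outline rather than the authors' procedure.

```python
import math


def top_bottom_profiles(word):
    """Per-column distance to the first ink pixel from the top and from the bottom.

    Columns with no ink get the full box height (an assumption).
    """
    height = len(word)
    top, bottom = [], []
    for column in zip(*word):
        ink_rows = [y for y, v in enumerate(column) if v]
        if ink_rows:
            top.append(ink_rows[0])
            bottom.append(height - 1 - ink_rows[-1])
        else:
            top.append(height)
            bottom.append(height)
    return top, bottom


def dct(values, count):
    """First `count` coefficients of the unscaled DCT-II."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * (i + 0.5) * k / n) for i, v in enumerate(values))
        for k in range(count)
    ]


def shape_projection_feature(word, count=25):
    """Concatenate `count` DCT coefficients of the top and of the bottom profile."""
    top, bottom = top_bottom_profiles(word)
    return dct(top, count) + dct(bottom, count)
```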
![Page 9: Developing Document Image Retrieval System](https://reader035.fdocuments.in/reader035/viewer/2022062514/5580bb22d8b42ac6088b4fcc/html5/thumbnails/9.jpg)
Upper Grid Features: a ten-element vector of binary values extracted from the upper part of each word image.
Down Grid Features: a ten-element vector of binary values extracted from the lower part of each word image.
Features
![Page 10: Developing Document Image Retrieval System](https://reader035.fdocuments.in/reader035/viewer/2022062514/5580bb22d8b42ac6088b4fcc/html5/thumbnails/10.jpg)
Upper – Down Grid Features
[0,0,0,1195,0,0,0,0,0,0]
[0,0,0,1,0,0,0,0,0,0]
[0,0,0,0,0,0,0,598,50,33]
[0,0,0,0,0,0,0,1,1,0]
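The example vectors suggest that each grid feature starts as ten per-cell values (presumably black-pixel counts) that are then reduced to binary. A minimal sketch of that last step follows; the threshold of 40 is a hypothetical value picked only because it reproduces the vectors shown on this slide, and the actual rule used by the system is not given.

```python
def to_binary_grid(cell_counts, min_count=40):
    """Map per-cell counts to 0/1. `min_count` is a hypothetical threshold."""
    return [1 if count >= min_count else 0 for count in cell_counts]


upper = to_binary_grid([0, 0, 0, 1195, 0, 0, 0, 0, 0, 0])
lower = to_binary_grid([0, 0, 0, 0, 0, 0, 0, 598, 50, 33])
```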
![Page 11: Developing Document Image Retrieval System](https://reader035.fdocuments.in/reader035/viewer/2022062514/5580bb22d8b42ac6088b4fcc/html5/thumbnails/11.jpg)
Descriptor
The Structure of the Descriptor
![Page 12: Developing Document Image Retrieval System](https://reader035.fdocuments.in/reader035/viewer/2022062514/5580bb22d8b42ac6088b4fcc/html5/thumbnails/12.jpg)
User enters a query word
The proposed system creates an image of the query word with font height equal to the average height of all the word-boxes obtained through the Word Segmentation stage of the Offline operation
For our experimental set the average height is 50
The font type of the query image is Arial
The smoothing and normalizing of the various features described before suppress small differences between various types of fonts
Query Image Creation
![Page 13: Developing Document Image Retrieval System](https://reader035.fdocuments.in/reader035/viewer/2022062514/5580bb22d8b42ac6088b4fcc/html5/thumbnails/13.jpg)
The Matching Process
![Page 14: Developing Document Image Retrieval System](https://reader035.fdocuments.in/reader035/viewer/2022062514/5580bb22d8b42ac6088b4fcc/html5/thumbnails/14.jpg)
100 document images were created artificially from various texts
Gaussian and “Salt and Pepper” noise was then added
A text search engine was implemented in parallel, which makes it easier to verify and evaluate the search results of the proposed system
Experimental Document Database
![Page 15: Developing Document Image Retrieval System](https://reader035.fdocuments.in/reader035/viewer/2022062514/5580bb22d8b42ac6088b4fcc/html5/thumbnails/15.jpg)
Implementation
o Visual Studio 2008
o Microsoft .NET Framework 2.0
o C# Language
o Microsoft SQL Server 2005
http://orpheus.ee.duth.gr/irs2_5/
![Page 16: Developing Document Image Retrieval System](https://reader035.fdocuments.in/reader035/viewer/2022062514/5580bb22d8b42ac6088b4fcc/html5/thumbnails/16.jpg)
Evaluation
o Precision and Recall metrics
o 30 searches in 100 document images
o Query Font: Arial
Mean Precision: 87.8% Mean Recall: 99.26%
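For reference, precision and recall for a single query, and their means over a set of queries, can be computed as in the sketch below. It uses the standard definitions; treating an empty result or an empty ground truth as 0.0 is an assumption, and the function names are illustrative.

```python
def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved, recall = hits / relevant, for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def mean_precision_recall(queries):
    """Average precision and recall over (retrieved, relevant) pairs."""
    scores = [precision_recall(ret, rel) for ret, rel in queries]
    n = len(scores)
    return (sum(p for p, _ in scores) / n, sum(r for _, r in scores) / n)
```

Here `relevant` would come from the parallel text search engine mentioned earlier, which supplies the ground truth for each query word.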
![Page 17: Developing Document Image Retrieval System](https://reader035.fdocuments.in/reader035/viewer/2022062514/5580bb22d8b42ac6088b4fcc/html5/thumbnails/17.jpg)
Evaluation
FineReader® 9.0 OCR Program: Mean Precision: 76.67% Mean Recall: 58.42%
Query Font Name “Tahoma”: Mean Precision: 89.44% Mean Recall: 88.05%
![Page 18: Developing Document Image Retrieval System](https://reader035.fdocuments.in/reader035/viewer/2022062514/5580bb22d8b42ac6088b4fcc/html5/thumbnails/18.jpg)
The query word is given as text and then transformed into a word image
The proposed system extracts nine (9) powerful features for the description of the word images
These features describe the shape of the words satisfactorily while at the same time suppressing small differences due to noise, size and type of fonts
Based on our experiments, the proposed system performs better on the same database than a commercial OCR package
Conclusion
![Page 19: Developing Document Image Retrieval System](https://reader035.fdocuments.in/reader035/viewer/2022062514/5580bb22d8b42ac6088b4fcc/html5/thumbnails/19.jpg)
Thank you!