Binarisation Algorithms Analysis on Document and Natural Scene Images

11
International Journal on Recent and Innovation Trends in Computing and Communication ISSN: 2321-8169 Volume: 3 Issue: 8 5235 - 5245 _____________________________________________________________________________________________ 5235 IJRITCC | August 2015, Available @ http://www.ijritcc.org _______________________________________________________________________________________ Binarisation Algorithms Analysis on Document and Natural Scene Images Dharam Veer Sharma, Assistant Professor, Department of Computer Science, Punjabi University Patiala-147001, India [email protected] Sukhdev Singh Assistant Professor, Department of Computer Science, Multani Mal Modi College Patiala, India [email protected] Abstract:-The binarisation plays an important role in a system for text extraction from images which is a prominent area in digital image processing. The primary goal of the binarisation techniques are to covert colored and gray scale image into black and white image so that overall computational overhead can be minimized. It has great impact on performance of the system for text extraction from image. Such system has number of applications like navigation system for visually impaired persons, automatic text extraction from document images, and number plate detection to enforcement traffic rules etc. The present study analysed the performance of well known binarisation algorithms on degraded documents and camera captured images. The statistical parameters namely Precession, Recall and F-measure and PSNR are used to evaluate the performance. To find the suitability of the binarisation method for text preservation in natural scene images, we have also considered visual observation Keywords:- Binarisation, Ostu, Niblack, Sauvola, Kittler, Frengs, Bernsen, Natural scene image, degraded documents ___________________________________________________****_________________________________________________ I. Introduction: The information present in document images and camera captured scene images can be used for many applications such as automatic document reader, Navigation System for Visually handicap. A lot of research is being carried out to improve the performance and efficiency of OCR for documents and text extraction system for camera captured images. In the standard OCR and text extraction system, the binarization is carried out to convert colored or grey scale image to monochromatic or binary images. The binarization algorithms [1] are classified into two categories based on discontinuity or similarity of grey values. The algorithms, using discontinuity, segment an image based on abrupt changes in grey level, whereas algorithms using similarity are based on thresholding, region growing, region splitting and merging. In case of thresholding algorithms, the conversion is based on finding a threshold grey value and deciding whether a pixel having a particular grey value is to be converted to black or white. Usually within an image the pixels having grey value greater than the threshold is transformed to white and the pixels having grey value lesser than the threshold is transformed to black The major challenges faced in document binarisation are presence of noise, improper illumination, ink overlapping, sequences. Whereas in case of digital images like X-ray, Ultrasound images, MRI images and camera captured images, binarisation process have to address different level of complexity. Particularly, when we are working on medical images, binarisation is going to be hardest task as very fine detail in image has great significant. In literature [2], it is found that most of the time; these images are suffered due to noise like Speckle, Gaussian noise and Salt Pepper noise. The camera captured images which are also termed as natural scene images have its own complexity issues. As natural scene images carry itself complex background around the object of interest in the scene. The binarisation of such images is work of art as we are interested in objects in the image which are sometime very hard to distinguish from background. The present study has considered well known Binarization algorithm from literature that has shown good results for Document images or camera capture image. These algorithms are tested on document and natural scene image which are different from document images. The complication level of document and camera captured images is discussed and compared as below: TABLE I. TABLE1: COMPARISON BETWEEN DOCUMENT IMAGES AND NATURAL SCENE IMAGES Sr.no. Complexity Document Binarization Natural Scene Image 1 Background color The images have homogenous colored background. Generally back or white is preferred as background of most of documents. The images have heterogeneous colored background. Natural scene is full of colors and color combination is also very from scene to scene. 2 Shapes in background Generally, the documents do not have shapes in background. If document has shapes then shapes are clearly separated from text. The images have full of different shapes and there is no clear separation between text and background object. It is difficult to separate text from background. 3 Foreground text The document images have homogenous text style and color in text and are easy to identify text as foreground. There is clear division of text and non text regions. The natural scene images have different style and shape of text and are difficult to identify text in background.

description

The binarisation plays an important role in a system for text extraction from images which is a prominent area in digital image processing. The primary goal of the binarisation techniques are to covert colored and gray scale image into black and white image so that overall computational overhead can be minimized. It has great impact on performance of the system for text extraction from image. Such system has number of applications like navigation system for visually impaired persons, automatic text extraction from document images, and number plate detection to enforcement traffic rules etc. The present study analysed the performance of well known binarisation algorithms on degraded documents and camera captured images. The statistical parameters namely Precession, Recall and F-measure and PSNR are used to evaluate the performance. To find the suitability of the binarisation method for text preservation in natural scene images, we have also considered visual observation.

Transcript of Binarisation Algorithms Analysis on Document and Natural Scene Images

Page 1: Binarisation Algorithms Analysis on Document and Natural Scene Images

International Journal on Recent and Innovation Trends in Computing and Communication ISSN: 2321-8169 Volume: 3 Issue: 8 5235 - 5245

_____________________________________________________________________________________________

5235 IJRITCC | August 2015, Available @ http://www.ijritcc.org

_______________________________________________________________________________________

Binarisation Algorithms Analysis on Document and Natural Scene Images

Dharam Veer Sharma,

Assistant Professor,

Department of Computer Science,

Punjabi University Patiala-147001, India

[email protected]

Sukhdev Singh

Assistant Professor,

Department of Computer Science,

Multani Mal Modi College Patiala, India

[email protected]

Abstract:-The binarisation plays an important role in a system for text extraction from images which is a prominent area in digital image

processing. The primary goal of the binarisation techniques are to covert colored and gray scale image into black and white image so that overall

computational overhead can be minimized. It has great impact on performance of the system for text extraction from image. Such system has

number of applications like navigation system for visually impaired persons, automatic text extraction from document images, and number plate

detection to enforcement traffic rules etc. The present study analysed the performance of well known binarisation algorithms on degraded

documents and camera captured images. The statistical parameters namely Precession, Recall and F-measure and PSNR are used to evaluate the

performance. To find the suitability of the binarisation method for text preservation in natural scene images, we have also considered visual

observation

Keywords:- Binarisation, Ostu, Niblack, Sauvola, Kittler, Frengs, Bernsen, Natural scene image, degraded documents

___________________________________________________****_________________________________________________

I. Introduction:

The information present in document images and camera

captured scene images can be used for many applications

such as automatic document reader, Navigation System for

Visually handicap. A lot of research is being carried out to

improve the performance and efficiency of OCR for

documents and text extraction system for camera captured

images. In the standard OCR and text extraction system, the

binarization is carried out to convert colored or grey scale

image to monochromatic or binary images. The binarization

algorithms [1] are classified into two categories based on

discontinuity or similarity of grey values. The algorithms,

using discontinuity, segment an image based on abrupt

changes in grey level, whereas algorithms using similarity are

based on thresholding, region growing, region splitting and

merging. In case of thresholding algorithms, the conversion is

based on finding a threshold grey value and deciding whether

a pixel having a particular grey value is to be converted to

black or white. Usually within an image the pixels having

grey value greater than the threshold is transformed to white

and the pixels having grey value lesser than the threshold is

transformed to black

The major challenges faced in document binarisation are

presence of noise, improper illumination, ink overlapping,

sequences. Whereas in case of digital images like X-ray,

Ultrasound images, MRI images and camera captured images,

binarisation process have to address different level of

complexity. Particularly, when we are working on medical

images, binarisation is going to be hardest task as very fine

detail in image has great significant. In literature [2], it is

found that most of the time; these images are suffered due to

noise like Speckle, Gaussian noise and Salt Pepper noise. The

camera captured images which are also termed as natural

scene images have its own complexity issues. As natural

scene images carry itself complex background around the

object of interest in the scene. The binarisation of such

images is work of art as we are interested in objects in the

image which are sometime very hard to distinguish from

background. The present study has considered well known

Binarization algorithm from literature that has shown good

results for Document images or camera capture image. These

algorithms are tested on document and natural scene image

which are different from document images. The complication

level of document and camera captured images is discussed

and compared as below:

TABLE I. TABLE1: COMPARISON BETWEEN DOCUMENT IMAGES AND NATURAL SCENE IMAGES

Sr.no. Complexity Document Binarization Natural Scene Image

1 Background

color

The images have homogenous colored background.

Generally back or white is preferred as background of most of documents.

The images have heterogeneous colored background.

Natural scene is full of colors and color combination is also very from scene to scene.

2 Shapes in

background

Generally, the documents do not have shapes in

background. If document has shapes then shapes are clearly separated from text.

The images have full of different shapes and there is

no clear separation between text and background object. It is difficult to separate text from background.

3 Foreground text The document images have homogenous text style and

color in text and are easy to identify text as foreground. There is clear division of text and non text regions.

The natural scene images have different style and

shape of text and are difficult to identify text in background.

Page 2: Binarisation Algorithms Analysis on Document and Natural Scene Images

International Journal on Recent and Innovation Trends in Computing and Communication ISSN: 2321-8169 Volume: 3 Issue: 8 5235 - 5245

_____________________________________________________________________________________________

5236 IJRITCC | August 2015, Available @ http://www.ijritcc.org

_______________________________________________________________________________________

4. Orientation of

Text

We can predict orientation of text in document image.

Generally text is in horizontal and vertical orientation. Moreover, orientation of the text in most of cases is

uniform throughout the page.

The orientation of text in un-predicted, text may have

any orientation i.e. horizontal, vertical, and diagonal. In many cases single scene may have text in multiple

orientations.

6 Illumination Most of the document images are equally and highly

illuminated. Even in some case there is minor variation in

illumination which can be overcome with simple image enhancement methods

The natural scene images are unequally illuminated

due to multiple sources of light like sun light, tube

light, reflection, shadow etc. It can also be affected with weather conditions at the time of capturing the

image.

7. Size of text The document images have similar size of text size like all

heading in documents have similar size, text under heading have similar size, etc.

In case of the natural scene image, there no standard

rule can be framed to categorize size of text.

8 Text like

pattern

In documents images, there are very rare object which

mislead as content as text and non text. There is very clear classification between text and non text regions in a

document image.

In the natural scene images, there are objects which

may mislead to identification of text like fences, leaf.

II. BINARISATION

Binarisation segments input image into two sections i.e.

background and foreground. Usually two intensity levels 0

for background and 1 for foreground are considered. The

appropriate threshold value is needed to calculate to

segments the image into appropriate segments. We have

under gone recent literature [15-21] in the field of

binarisation and find out most frequent techniques used for

binarisation of documents and camera captured images. The

following well known algorithms are considered for

analysis.

TABLE2: LIST OF BINARISATION TECHNIQUES

1. Otsu‟s Binarization Technique

2. Niblack‟s Binarization technique

3. Souvola‟s Binarization

technique

4. Morphological Binarization

methods

5. Bernsen‟s Binarization

6. Feng‟s Binarization

7. Sliding Window Mean

Thresholding

8. Sliding Window Median

Thresholding method

9. Sliding Window MidGray Thresholding

method

10. Wolf‟s Binarization method

11. White and Rohrer‟s Binarization

12. T.R.Singh‟s Binarization method

2.1. Otsu binarization

Otsu‟s binarization technique [2,3] is global thresholding

technique in which single threshold value is used to

binarized the whole image. The algorithm assumes that the

image is to represent into two classes of pixels i.e.

foreground and background. The optimum thresholding

value is calculated such that intra-class variance is minimal.

It is one of the fastest binarization algorithms because only

histogram calculation is required using array of length 256.

The histogram calculation helps to calculate mean value of

pixels. The calculation of the inter-classes or intra-classes

variances is based on the normalized histogram

Hn= { Hn(0)………………………Hn(255)} of image,

where Hn i n i=0 = 1

The inter-classes variance for each gray level is given by

Vinter_variance=q1(t) × q2(t)×[μ1(t)- μ2(t)]2

Where, μ1(t)= 1

q1(t) Hn i × it−1

i=0

μ2(t)= 1

q2(t) Hn i × i255

i=0

q1(t)= Hn i t−1i=0

and q2(t)= Hn i 255i=0

Niblack Binarization Technique

Niblack‟s binarisation [10, 4] is local Binarization method

based on sliding window in which thresholding is computed

for each and every pixels of the image. The sliding window

is a rectangular region about the pixel for which

thresholding is being computed. In this technique local

mean and standard deviation is calculated within

rectangular region to find out thresholding value for central

Page 3: Binarisation Algorithms Analysis on Document and Natural Scene Images

International Journal on Recent and Innovation Trends in Computing and Communication ISSN: 2321-8169 Volume: 3 Issue: 8 5235 - 5245

_____________________________________________________________________________________________

5237 IJRITCC | August 2015, Available @ http://www.ijritcc.org

_______________________________________________________________________________________

pixel surrounded by neighborhood pixels. The thresholding

is decided by the following formula.

T(x,y)=m(x,y)+K* S(x,y),

Where m(x,y) and S(x,y) are the average of intensity values

in local area of window and standard deviation for pixel

intensity values of the window. Drawback of the method

given above is considerable sensitivity of the choice of

window size and decision parameters K. To overcome the

problem of sensitivity the parameters „K‟ and „R‟ are used

as

T(x,y)=m(x,y)*[1+K(1-S(x,y)/R] where R and K are

empirical constants.

2.2. Souvola Binarization Technique

Sauvola‟s binarization [14,5] is a locally adaptive

thresholding of grey scale image. According to the

algorithm, local mean and standard deviation are calculated

on W× W window around a pixel. Let us consider T

threshold for pixels of image and Mw(i ,j),σw(i,j) are

considered as local mean and standard deviation on local

window of size WxW for pixel location (i,j). The value of T

can be calculated as

Tw,k(i,j)= Mw(i,j) *(1+ K (σw (i,j) /R – 1))

Where, Mw(i,j) is called local mean for pixels in window of

size Wx, σw(i,j) represents standard deviation for pixels in

window of size of size W × W. The parameters, R, k are

constant parameters (also called control Factors) to control

sensitivity to noise.

2.3. Bernsen’s Binarization

Bernsen‟s binarization [17,6] introduced local thresholding

method. In this method we compute local minimum and

maximum for neighborhood around each pixel f(x,y) ϵ [0,L-

1] and use median of two as the threshold for each pixel in

image with reference of size of window. The sliding

window is mask of size NxN centered pixel is coincide with

the pixel in focus. We keep on slide window from one pixel

location to another pixel position in the image to find out

appropriate threshold value.

T(x,y)= (Plow +Phigh)/2 , If Phigh - Plow ≥ L

T(x,y)= GT , If Phigh - Plow < L

Where Plow is low value of gray pixel value in the N×M

size window, Phigh is high value of gray pixel value in the

NXM size window. GT stands for Global Thresholding

using Otsu‟s Binarization.

G(x,y)=(Fmax(x,y)+Fmin(x,y))/2 and B(x,y)= 1 if

f(x,y)<g(x,y), otherwise 0.

2.4. Feng’s Binarization:

Feng‟s binarization [7] is based on local thresholding in

which in place of single sliding window two local windows

are used. One window is sliding over each pixel and

calculates attributes on the basis of intensity values of

neighborhood pixels. Second window differ in the size,

usually larger than the size of local window. The values of

local mean (Mean Local-m) and minimum gray value

(MinGray-M) are calculated in the primary local window

with reference to pixels located in the neighborhood. The

values of standard deviation are calculated in the both the

Windows. The variable S(Local Deviation) and Rs

(standard deviation) of pixels of large window. The

following formula is used to calculate threshold:

Tfeng =(1- α 1)*m+α2*(S/Rs)*(m-M)+ α3*M

Where α2=k1*(S/Rs)ϒ , α3=k2*(S/Rs)ϒ, value of ϒ=2, S

is Standard deviation within local windows standard

deviation of secondary window, m is local mean within

primary window, M is minimum gray value within primary

window.

Sliding Window Mean Thresholding

Sliding window mean thresholding [21] is performed by

using calculating mean of gray scale value within local

window. The sliding window is placed over each pixel of

the image and neighboring pixels values are considered for

calculating mean. The following equation is used to find

local mean using sliding window.

pixel={ Pixel value > Local Mean-C ? Object:

Background}, where c=0 or any value to control noise level.

Sliding Window Median Thresholding

Sliding window median thresholding [21] is performed by

using calculating mean of gray scale value within local

window. The sliding window is placed over each pixel of

the image and neighboring pixels values are considered for

calculating median. The pixels surrounded by central pixels

are considered for sorting and after sorting the pixel median

is calculated. The following equation is used to find local

mean using sliding window.

Pixel= {Pixel value > Local Median-C? Object:

Background}, Where, c=0 or any value to control noise

level.

Page 4: Binarisation Algorithms Analysis on Document and Natural Scene Images

International Journal on Recent and Innovation Trends in Computing and Communication ISSN: 2321-8169 Volume: 3 Issue: 8 5235 - 5245

_____________________________________________________________________________________________

5238 IJRITCC | August 2015, Available @ http://www.ijritcc.org

_______________________________________________________________________________________

2.5. Sliding Window MidGray Thresholding

Sliding window median thresholding [21] is performed by

using calculating mean of gray scale value within local

window. The sliding window is placed over each pixel of

the image and neighboring pixels values are considered for

calculating median. The following equation is used to find

local mean using sliding window.

Pixel= {Pixel value > (Local Maximum + Median)/2 –C ?

Object: Background}, usually c=0 or any value to control

noise level.

2.6. Wolf’s binarization technique

Wolf‟s binarization technique [8,10] is implemented by

calculating stranded deviation and mean within local

window and over whole image. The sliding window is

placed over each pixel of the image and neighboring pixels

values are considered for calculating m(mean) and

S(standard deviation). The following equation is used to

find the threshold over local window.

Twolf=(1-K)*m+K*M+K*S/R(m-M)

Where k=0.5(Fixed), M- Minimum Gray value, m-mean

local, M-mean global, S-Standard Deviation local, R

Standard Global

2.7. White and Rohrer’s Binarization

White and Rohrer‟s binarization technique [11] is used to

calculate threshold value for Binarization of image by using

bias parameters as multiple to pixel value under

consideration. The concept of sliding window is used to

find out mean value of neighborhood pixels around centered

pixel. The following equation is used to compute threshold

value for central pixel in the image.

mlocal(x,y)<I(x,y).Bias set pixel I(x,y) to

1, otherwise 0

Where, mlocal(x,y) is the value of local mean within

window by using neighborhood pixels. I(x,y) is the value of

pixel considered for Binarized. Bias is the control variable

(suitable value is 2).

2.8. T.R. Singh and et. al ’s binarization

T.R Singh et. al [12]introduced local thresholding

computation by calculate local mean and mean deviation by

using following equation.

T(x,y)=m(x,y)[1+K*(δ(x,y)/(1-δ)-1)]

Where δ (x,y)=I(x,y)-m(x,y), k is a bias and k € [0,1] ,

mlocal(x,y) is the value of local mean within window by

using neighborhood pixels. I(x,y) is the value of pixel

considered for Binarized.

III. ANALYSIS:

We have considered ICDAR2009[13] dataset for binarisation

of degraded document images and ICDAR 2011[14] for

natural scene images.In literature several papers are found

that use precision, recall analysis and f-measure. But in case

of natural scene image, the background is too complex that it

is not easy to separate object from it. There is lot of text like

objects like tree leaf, branches, window, etc. In current study

the following methods are used. The more than two thousand

images are captures using Nikon D3200 Digital Camera.

Images covered different location that contains sign board,

notice board, banners, and text written on wall and vehicles

in city Patiala. The text present in these boards is in

Gurmukhi Script. The images are captured at different

locations and at time so that different lighting conditions are

covered. Images are resized to cover come computation

overhead.

3.1. Parameters of analysis.

The outputs from different Binarization algorithms are

evaluated to find out the best suitable algorithm.

Visual Observation: visual observation is a most

efficient method to observe the effect of

binarization techniques separating text from

background

Recall(r): It is ratio of number of relevant object

pixels retrieved to the total number of relevant

object pixels in the reference image.

Recall= (A/(A+B))*100

Precision (p): It is ratio of number of the number of

relevant object pixels retrieved to the total number of

irrelevant and relevant object pixels retrieved. The precision

can be expressed by the following equation

Page 5: Binarisation Algorithms Analysis on Document and Natural Scene Images

International Journal on Recent and Innovation Trends in Computing and Communication ISSN: 2321-8169 Volume: 3 Issue: 8 5235 - 5245

_____________________________________________________________________________________________

5239 IJRITCC | August 2015, Available @ http://www.ijritcc.org

_______________________________________________________________________________________

Precision= ((A)/(A + C ))*100

F-measure (f): it is defined to show goodness of the

results. A higher value of F-measure represents a better

result. It is expressed by the following equation

F-measure= (2 * Recall * Precision)/(Recall + Precision)

Peak Signal to Noise Ratio (PSNR): it is defined as the

ratio of peak signal power to average noise power. It can be

calculated as

PSNR= 10 log10 2552*MN/ Ʃi Ʃj(x(i,j)-y(i,j)2

The different algorithms which are consider for Binarization

are Otsu‟s Binarization Technique, Niblack‟s Binarization

technique, Souvola‟s Binarization technique, Morphological

Binarization, Bernsen‟s Binarization, Feng‟s Binarization,

Sliding Window Mean Thresholding, Sliding Window

Median Thresholding, Sliding Window MidGray

Thresholding, Wolf‟s Binarization, White and Rohrer‟s

Binarization, T.R.Singh‟s Binarization

3.2. Ground Truth Images: The ground truth images

are created with manual tool and image is divided into two

area foreground and background. The area contains text is

labeled as foreground and other area is labeled as

background. The databases of ground truth images [14]

from Eleventh International Conference on Document

Analysis and Recognition (ICDAR 2011) are consider of

evaluation of the Binarization algorithms. The following are

few sample ground truth images.

TABLE3: GROUND TRUTH IMAGES FROM ICDAR 2009.

Original Image Ground Truth Image Original Image Ground Truth Image

TABLE4: GROUND TRUTH NATURAL SCENE IMAGES FROM ICDAR 2011

Page 6: Binarisation Algorithms Analysis on Document and Natural Scene Images

International Journal on Recent and Innovation Trends in Computing and Communication ISSN: 2321-8169 Volume: 3 Issue: 8 5235 - 5245

_____________________________________________________________________________________________

5240 IJRITCC | August 2015, Available @ http://www.ijritcc.org

_______________________________________________________________________________________

TABLE 5: GROUND TRUTH IMAGE OF NATURAL SCENE IMAGES CAPTURED DURING STUDY

Original Image Ground Truth Image Original Image Ground Truth Image

1.1. Visual performance: Visual performance of Binarisation methods on document images:

Kitter’s method Otsu’s Niblack’s method Wolf’s method

Souvola’s method Wolf’s Binarisation Bernsen’s method White&Rohrer’s

Feng’s method Median filter MidGray filter T.R.S.’s method

Page 7: Binarisation Algorithms Analysis on Document and Natural Scene Images

International Journal on Recent and Innovation Trends in Computing and Communication ISSN: 2321-8169 Volume: 3 Issue: 8 5235 - 5245

_____________________________________________________________________________________________

5241 IJRITCC | August 2015, Available @ http://www.ijritcc.org

_______________________________________________________________________________________

Analysis of statistical parameters: We use recall, precision, f-measure and PSNR to access the performance of document and natural

scene images. The following table shows performance using these parameters.

1.2. Decision parameters based analysis: The following parameters are used to access the performance of the

different methods.

TABLE7. ANALYSIS ON BASED PRECISION, RECALL, F-MEASURE AND PSNR.

Sr.No Method Precision Recall f-measure PSNR

1. Otsu‟s Binarisation 0.8967 0.9270 91.16829 20.01837

2. Niblack‟s Binarisation 0.2364 0.9843 38.1246 7.9908

3. Souvola‟s Binarisation 0.8802 0.9332 90.5918 20.1616

4. Wolf‟s Binarisation 0.9846 0.5021 66.5088 12.9281

5. Bernsen‟s Binarisation 0.1699 0.8658 28.4012 3.5669

6. Feng‟s Binarisation 0.0153 0.1251 2.7198 0.4498

7. Median filter‟s Binarisation 0.9409 30.2122 0.0988 3.5857

8. MidGray filter‟s Binarisation 0.7728 36.0587 0.1269 5.5863

9. Mean filter‟s Binarisation 0.3184 11.2004 0.0453 2.9353

10. Wolf‟s Binarisation 0.9846 0.5021 66.5088 12.9281

11. White&Rohrer‟s Binarisation 0.9255 0.3917 55.0452 11.9066

12. T.R.S.‟s Binarisation 0.2106 0.8992 34.1320 4.5634

13. Kitter‟s Binarisation 0.8968 0.9271 91.1683 20.4922

TABLE 8: ICDAR-2011(SAMPLE DATABASE)

ICDAR-2011 Image Ostu Niblack

Page 8: Binarisation Algorithms Analysis on Document and Natural Scene Images

International Journal on Recent and Innovation Trends in Computing and Communication ISSN: 2321-8169 Volume: 3 Issue: 8 5235 - 5245

_____________________________________________________________________________________________

5242 IJRITCC | August 2015, Available @ http://www.ijritcc.org

_______________________________________________________________________________________

Page 9: Binarisation Algorithms Analysis on Document and Natural Scene Images

International Journal on Recent and Innovation Trends in Computing and Communication ISSN: 2321-8169 Volume: 3 Issue: 8 5235 - 5245

_____________________________________________________________________________________________

5243 IJRITCC | August 2015, Available @ http://www.ijritcc.org

_______________________________________________________________________________________

Results: The best suitable algorithms for Binarization of Natural Scene images are implemented on ICDAR-2011

database which gave satisfactory results. The above are few examples with results.

TABLE 9: IMPLEMENTATION OF BINARISATION METHODS ON NATURAL SCENE IMAGES

Original Image Otsu‟s Binarization Original Image Niblack‟s Binarization

Original Image Otsu‟s Binarization Original Image Bernsen’s Binarization

Original Image Wolf‟s binarization Original Image White and Rohrer‟s

Page 10: Binarisation Algorithms Analysis on Document and Natural Scene Images

International Journal on Recent and Innovation Trends in Computing and Communication ISSN: 2321-8169 Volume: 3 Issue: 8 5235 - 5245

_____________________________________________________________________________________________

5244 IJRITCC | August 2015, Available @ http://www.ijritcc.org

_______________________________________________________________________________________

3.3. Detail Analysis:

The Binarization is one of the key steps to recognize text in the

image so that it can be used for appropriate function such as

automatic document reader, Navigation System for Visually

handicap. The twelve algorithms from literature has been

considered for present research work and their performance has

been evaluated using Visual Observation and Mathematical

noise and Error calculation methods such as Recall, Precision

and F-measure and PSNR. The following observations have

been noticed to find out best suitable algorithm.

1. Otsu‟s method is good Global Binarization method for

image where background is uniformly illuminated. It is one

of the fasted methods as computational over head is very

low. It produces noise into image when there is non-

uniform illumination. It gives best result for uniform

illumination i.e. Histogram should be bimodal.

2. Window size plays an important role in performance of

Niblack‟s Binarization techniques. The size of the sliding

window should be small enough to preserve local

significance and large enough to suppress noise. Through

experiments it is come to know that 15 x 15 is an ideal size

for Binarization. But larger the window

size slow down the binarization process. So it is

recommended to use appropriate size of window like 15 X15

or 17x17. It is good algorithm to segment text region in the

image but it produce a large amount of binarization noise in

the non-text regions. Large window size minimizes the noise.

The values of parameters p and k play an important role to

overcome noise problem in the output Binarized image. The

most suitable value of p=0.5 and k=-0.2.

3. Sauvola‟s Binarization algorithm is modification of Niblack‟s

Algorithm as in case of Niback‟s algorithm there is unwanted

noise in non-text area. In this algorithm noise control is better

than Niblack. The value of K is always positive and lies

between interval [0.2 to 0.5].The experiment results reveal

that appropriate value are k=0.5 and k=0.34. The size of the

window plays important role to segment the image, the value

of window size 15 x 15 gives good results as in the case of

Niblack. Similarly there is trade-off between size of window

and execution time.

4. In Sauvola‟s Binarization, value of R=128 is fixed for Gray

scale Image.

When loca neighborhood has high contrast then σw(i,j)=R

T=Mw(i,J)*(1+k(1-1)), When local neighborhood has low

constant then Twk(i,j) goes below the mean value.

Sauvola‟s technique reduces noise and enhances textual

part. Computation of Mw and σw(i,j) for each

neighborhood is time consuming and computing

complexity O(n2w2) for MxN size of image. If the text part

of the image is small as compare to total image size then

local standard deviation of pixels in the window of image

suppress text regions in the image.

5. Feng's Binarization introduced concept of two sliding

window where one is local and other is of larger size

depending upon the application type. The current research

used local window of size 15 x 15 and second window of

size NxM, where N and M are dimensions of the image. It

gives good result for all type of images but the Drawback

of the method is determination of empirical parameters.

Through experiment it is revealed that one set of

parameters which are good for specific types of images

may not be suitable for other kind of images. There are two

types of windows to compute local thresholding so it is

slow as compare to single window algorithm. Output of

sample images are displayed above in the table.

6. The method used by TR Singh is based on calculation of

loam mean and Bias constant which may take value

between(0,1). The significant improvement in this

algorithm is the size of sliding window. It gives good result

even for small size of window for example w=7. If it is

compare with Niblack and Sauvola, visually outputs are

almost similar but there is lot of difference in

computational time. In case of Niblack and Sauvola‟s

Binarization methods suitable size of window is 17 to 21

but TR Singh's method requires only 7 to 9 size window to

show significant result. Comparatively it is faster than

Niback and Sauvola Binarization methods.

7. Morphological Binarization method make use of Erosion

operation as it gives good result as compare to Dilation,

opening and closing operation. It is suitable for image

where text size is large and there is uniform font is used. It

distorts the shape of small text and disappear text of very

small size. The best convocation mask suitable for

Binarization of text images is 7x7.

8. Sliding window mean make use of window of size 7 x 7 to

find out local mean of the image. The local mean is

compare with current pixel of the local window to find out

the threshold. it is good method for images where text

written on uniform background. it also add lot of noise in

the image as multiple colored background image have

different mean in their local area result is non-uniform

black and white area as shown in the above table.

9. As in the case of sliding mean thresholding, local window

is used to calculate median of the local area in the window.

Page 11: Binarisation Algorithms Analysis on Document and Natural Scene Images

International Journal on Recent and Innovation Trends in Computing and Communication ISSN: 2321-8169 Volume: 3 Issue: 8 5235 - 5245

_____________________________________________________________________________________________

5245 IJRITCC | August 2015, Available @ http://www.ijritcc.org

_______________________________________________________________________________________

The pixels in the neighborhood are stored in acceding order

with their gray values than threshold is calculated by

comparing current pixel and median value. it gives good

result only when there is uniform background or minimum

color are used in the text area. Even in it good result

significant noise is noticed. It is good for only very few

images as natural scene images are generally full of

background color. MidGray value is calculated by

considering pixel in the window and it is also useful when

there is large text area.

10. The algorithm suggested by Wolf et al. is based on

calculation of local standard deviation and local mean. The

sliding window is used to find values of standard deviation

and local mean so size of the sliding window is in the key

role. Through experiments is revealed that best suitable size

of window is 11x 11 and above. It gave good result for

almost all kind of images. The image with uniform

background is clearly Binarized and exact shape and size of

the text is preserved as compare to Otsu's Binarization is a

little bit slow but is gives good result even with non-

uniform background which is not possible by Otsu. If it is

compared with Niblack‟s and Sauvola‟s Binarization is

good as there is less amount of Noise as compare to both

the algorithm. The suitable size of sliding window of

Niblack‟s and Sauvola‟s Binarization is 17x17 and more

where as in case of Wolf we can get good result even at

size of 11X11.

REFERENCES

[1] Stathis, Pavlos, Ergina Kavallieratou, and Nikos Papamarkos.

"An evaluation survey of binarization algorithms on historical

documents." Pattern Recognition, 2008.ICPR 2008.19th

International Conference on .IEEE, 2008.

[2] N. Otsu, “A threshold selection method from gray-level

histograms”, IEEE Transactions on Systems, Man, and

Cybernetics, vol. 9, Issue 1, pp. 62–66, 1979.

[3] W. Niblack, An Introduction to Digital Image Processing,

Prentice Hall, Englewood Cliffs, NJ, USA, 1986.

[4] J. Sauvola and M. Pietikäinen, “Adaptive document image

binarization,” Pattern Recognition, vol. 33, no. 2, pp. 225–

236, 2000.

[5] J. Bernsen, “Dynamic thresholding of grey-level images,” in

Proceedings of the 8th International Conference on Pattern

Recognition, pp. 1251–1255, Paris, France, 1986.

[6] Valizadeh, M., Kabir, E.: “Binarization of degraded document

image based on feature space partitioning and classification.

International Journal on Document Analysis and Recognition

Vol.15, Issue 1, pp.57–69, 2012.

[7] Su, B., Lu, S., Tan, C.L.,“ Robust document image

binarization technique for degraded document Images”, IEEE

Transactions on Image Processing. Vol.22 (4), pp.1408–1417,

2013.

[8] Morteza, V., Ehsanollah, K.,“An adaptive water flow model

for binarization of degraded document images”. International

Journal on Document Analysis and Recognition, Vol.16 (2),

pp. 165–176, 2013.

[9] Gaceb,Djamel, Frank Lebourgeois, Jean Duong."Adaptative

Smart-Binarization Method: For Images of Business

Documents”, in proceedings of 12th International Conference

on Document Analysis and Recognition (ICDAR), 2013.

[10] W. Niblack, “An Introduction to Digital image processing”,

Prentice Hall, pp 115-116, 1986.

[11] J. Kittler, J. Illingworth, “On threshold selection using

clustering criteria”, IEEE Trans. Systems Man Cybernet.15,

pp.652–655, 1985.

[12] Y.Yang and H.Yan, „An adaptive logical method for

binarization of degraded document images‟, Pattern

Recognition (PR), pp.787–807, 2000.

[13] T. W. Ridler and S. Calvard, “Picture thresholding using an

iterative selection method”, IEEE Transactions on Systems,

Man, and Cybernetics, pp.630-632, 1978.

[14] J. Sauvola, M. Pietikainen,"Adaptive document image

binarization", pp 33, 225–236, 2000

[15] L. Gottesfeld Brown, “A survey of Image Registration

Techniques”, ACM Computing Surveys, Vol 24, No

4,pp.325-376, 1992.

[16] Gatos, B., Ntirgiannis, K., Perantonis S.J. “Improved

document image binarization by using a combination of

multiple binarization techniques and adapted edge

information”, in proceedings of 19th International Conference

on Pattern Recognition (ICPR), pp. 1–4, 2008.

[17] Bernsen, J. “Dynamic thresholding of gray level images”, in

the proceedings of International conference on pattern

recognition (ICPR), pp. 1251–1255,1986.

[18] Rabeux, Vincent, "Quality evaluation of ancient digitized

documents for binarization prediction."Document Analysis

and Recognition (ICDAR), 2013 12th International

Conference on.IEEE, 2013.

[19] He J,Do QDM,Downton AC and Kim JH,"A comparison of

binarization methods for historical archive

documents.In:Eighth international conference on document

analysis and recognition (ICDAR‟05), pp.538–542.

[20] B.Gatos,I.Pratikakis, S.J. Perantonis, “Adaptive degraded

document image binarization”, Pattern Recognition,

vol.39, pp.317–327, 2006.

[21] M. Sezgin and B. Sankur,“Survey over image thresholding

techniques and quantitative performance evaluation,” J.

Electron.Imag.,vol.13,no.1,pp.146–165,2004.