International Journal on Recent and Innovation Trends in Computing and Communication ISSN: 2321-8169 Volume: 3 Issue: 8 5235 - 5245
_____________________________________________________________________________________________
5235 IJRITCC | August 2015, Available @ http://www.ijritcc.org
_______________________________________________________________________________________
Binarisation Algorithms Analysis on Document and Natural Scene Images
Dharam Veer Sharma,
Assistant Professor,
Department of Computer Science,
Punjabi University Patiala-147001, India
Sukhdev Singh
Assistant Professor,
Department of Computer Science,
Multani Mal Modi College Patiala, India
Abstract:- Binarisation plays an important role in systems for text extraction from images, which is a prominent area of digital image processing. The primary goal of binarisation techniques is to convert coloured and gray scale images into black and white images so that the overall computational overhead is minimised. It has a great impact on the performance of a text extraction system. Such systems have a number of applications, such as navigation systems for visually impaired persons, automatic text extraction from document images, and number plate detection for enforcing traffic rules. The present study analyses the performance of well known binarisation algorithms on degraded documents and camera captured images. The statistical parameters Precision, Recall, F-measure and PSNR are used to evaluate the performance. To assess the suitability of each binarisation method for text preservation in natural scene images, we have also considered visual observation.
Keywords:- Binarisation, Otsu, Niblack, Sauvola, Kittler, Feng, Bernsen, Natural scene image, degraded documents
___________________________________________________****_________________________________________________
I. INTRODUCTION
The information present in document images and camera captured scene images can be used for many applications, such as automatic document readers and navigation systems for the visually impaired. A lot of research is being carried out to improve the performance and efficiency of OCR for documents and of text extraction systems for camera captured images. In standard OCR and text extraction systems, binarization is carried out to convert coloured or grey scale images to monochromatic (binary) images. Binarization algorithms [1] are classified into two categories, based on discontinuity or similarity of grey values. Algorithms using discontinuity segment an image based on abrupt changes in grey level, whereas algorithms using similarity are based on thresholding, region growing, and region splitting and merging. In thresholding algorithms, the conversion is based on finding a threshold grey value and deciding whether a pixel with a particular grey value is to be converted to black or white. Usually, pixels with a grey value greater than the threshold are transformed to white and pixels with a grey value less than the threshold are transformed to black.
The major challenges faced in document binarisation are the presence of noise, improper illumination, ink overlapping and sequences. In the case of digital images such as X-ray, ultrasound, MRI and camera captured images, the binarisation process has to address a different level of complexity. In particular, binarisation of medical images is the hardest task, because very fine detail in the image is highly significant. In the literature [2], it is found that these images mostly suffer from noise such as speckle, Gaussian noise and salt-and-pepper noise. Camera captured images, also termed natural scene images, have their own complexity issues, because a natural scene carries a complex background around the object of interest. Binarisation of such images is difficult because the objects of interest are sometimes very hard to distinguish from the background. The present study considers well known binarization algorithms from the literature that have shown good results for document images or camera captured images. These algorithms are tested on document images and on natural scene images, which differ from document images. The complexity of document and camera captured images is discussed and compared below:
TABLE 1: COMPARISON BETWEEN DOCUMENT IMAGES AND NATURAL SCENE IMAGES
| Sr. no. | Complexity | Document Image | Natural Scene Image |
| 1 | Background color | The images have a homogeneous coloured background. Generally black or white is preferred as the background of most documents. | The images have a heterogeneous coloured background. A natural scene is full of colours, and the colour combination also varies from scene to scene. |
| 2 | Shapes in background | Generally, documents do not have shapes in the background. If a document has shapes, they are clearly separated from the text. | The images are full of different shapes and there is no clear separation between text and background objects. It is difficult to separate text from the background. |
| 3 | Foreground text | Document images have a homogeneous text style and colour, and the text is easy to identify as foreground. There is a clear division between text and non-text regions. | Natural scene images have text of different styles and shapes, and it is difficult to identify text against the background. |
| 4 | Orientation of text | The orientation of text in a document image can be predicted. Generally text is in horizontal or vertical orientation. Moreover, the orientation of the text is in most cases uniform throughout the page. | The orientation of text is unpredictable; text may have any orientation, i.e. horizontal, vertical or diagonal. In many cases a single scene has text in multiple orientations. |
| 6 | Illumination | Most document images are evenly and highly illuminated. Where there is minor variation in illumination, it can be overcome with simple image enhancement methods. | Natural scene images are unevenly illuminated due to multiple sources of light such as sunlight, tube light, reflection and shadow. Illumination can also be affected by weather conditions at the time of capture. |
| 7 | Size of text | Document images have consistent text sizes: all headings have a similar size, the text under headings has a similar size, etc. | In natural scene images, no standard rule can be framed to categorise the size of text. |
| 8 | Text-like pattern | In document images, objects that could be mistaken for text are very rare. There is a very clear distinction between text and non-text regions. | In natural scene images, there are objects, such as fences and leaves, which may be misidentified as text. |
II. BINARISATION
Binarisation segments the input image into two sections, i.e. background and foreground. Usually two intensity levels are considered, 0 for background and 1 for foreground. An appropriate threshold value must be calculated to segment the image correctly. We have reviewed recent literature [15-21] in the field of binarisation and identified the techniques most frequently used for binarisation of documents and camera captured images. The following well known algorithms are considered for analysis.
TABLE 2: LIST OF BINARISATION TECHNIQUES
1. Otsu's Binarization technique
2. Niblack's Binarization technique
3. Sauvola's Binarization technique
4. Morphological Binarization methods
5. Bernsen's Binarization
6. Feng's Binarization
7. Sliding Window Mean Thresholding
8. Sliding Window Median Thresholding method
9. Sliding Window MidGray Thresholding method
10. Wolf's Binarization method
11. White and Rohrer's Binarization
12. T.R. Singh's Binarization method
2.1. Otsu binarization
Otsu's binarization technique [2,3] is a global thresholding technique in which a single threshold value is used to binarize the whole image. The algorithm assumes that the image consists of two classes of pixels, i.e. foreground and background. The optimum threshold value is chosen such that the intra-class variance is minimal, which is equivalent to making the inter-class variance maximal. It is one of the fastest binarization algorithms because only a histogram calculation is required, using an array of length 256. The histogram is also used to calculate the mean value of the pixels in each class. The calculation of the inter-class or intra-class variance is based on the normalized histogram of the image,

$$H_n = \{H_n(0), \ldots, H_n(255)\}, \qquad \sum_{i=0}^{255} H_n(i) = 1$$

The inter-class variance for each gray level t is given by

$$V_{inter}(t) = q_1(t)\, q_2(t)\, [\mu_1(t) - \mu_2(t)]^2$$

where

$$\mu_1(t) = \frac{1}{q_1(t)} \sum_{i=0}^{t-1} H_n(i)\, i, \qquad \mu_2(t) = \frac{1}{q_2(t)} \sum_{i=t}^{255} H_n(i)\, i$$

$$q_1(t) = \sum_{i=0}^{t-1} H_n(i), \qquad q_2(t) = \sum_{i=t}^{255} H_n(i)$$
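The search over gray levels described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the implementation used in the study: the image is assumed to be a list of rows of integer gray values in 0..255, the function names are hypothetical, and pixels at or above the threshold are assumed to map to white.

```python
def otsu_threshold(img):
    """Return the gray level t that maximises q1(t) * q2(t) * (mu1(t) - mu2(t))**2."""
    pixels = [p for row in img for p in row]
    hist = [0.0] * 256
    for p in pixels:
        hist[p] += 1.0 / len(pixels)   # normalised histogram Hn, sums to 1

    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        q1 = sum(hist[:t])             # weight of gray levels 0 .. t-1
        q2 = sum(hist[t:])             # weight of gray levels t .. 255
        if q1 == 0 or q2 == 0:
            continue                   # one class is empty, so skip this t
        mu1 = sum(i * hist[i] for i in range(t)) / q1
        mu2 = sum(i * hist[i] for i in range(t, 256)) / q2
        var_inter = q1 * q2 * (mu1 - mu2) ** 2
        if var_inter > best_var:
            best_t, best_var = t, var_inter
    return best_t


def binarise_global(img, t):
    """Assumed convention: pixels at or above t become 1 (white), others 0 (black)."""
    return [[1 if p >= t else 0 for p in row] for row in img]


image = [[10, 200], [10, 200]]         # toy image: one dark and one bright column
t = otsu_threshold(image)
binary = binarise_global(image, t)
```

When several thresholds give the same inter-class variance, this sketch keeps the smallest one; other tie-breaking rules are equally valid.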
Niblack Binarization Technique
Niblack's binarisation [10, 4] is a local binarization method based on a sliding window, in which a threshold is computed for each and every pixel of the image. The sliding window is a rectangular region around the pixel for which the threshold is being computed. In this technique the local mean and standard deviation are calculated within the rectangular region to find the threshold value for the central pixel surrounded by its neighbourhood pixels. The threshold is given by the following formula.

$$T(x,y) = m(x,y) + K \cdot S(x,y)$$

where m(x,y) and S(x,y) are the mean and the standard deviation of the pixel intensity values in the local window. The drawback of this method is its considerable sensitivity to the choice of window size and of the decision parameter K. To reduce this sensitivity, the parameters K and R are used as

$$T(x,y) = m(x,y)\left[1 + K\left(1 - \frac{S(x,y)}{R}\right)\right]$$

where R and K are empirical constants.
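A minimal sketch of the first Niblack rule, T = m + K·S, is given below under stated assumptions: the image is a list of rows of gray values, the window is clipped at the image border, and pixels at or above the threshold map to 1 (white). The helper `local_mean_std` and the border handling are illustrative choices; the defaults w = 15 and k = -0.2 are the values reported as suitable in the observations of this paper.

```python
import math


def local_mean_std(img, y, x, w):
    """Mean and standard deviation in the w x w window centred on (y, x)."""
    r = w // 2
    vals = [img[j][i]
            for j in range(max(0, y - r), min(len(img), y + r + 1))
            for i in range(max(0, x - r), min(len(img[0]), x + r + 1))]
    m = sum(vals) / len(vals)
    s = math.sqrt(sum((v - m) ** 2 for v in vals) / len(vals))
    return m, s


def niblack(img, w=15, k=-0.2):
    """Threshold every pixel with T = m + k * s computed in its own window."""
    out = []
    for y, row in enumerate(img):
        out_row = []
        for x, p in enumerate(row):
            m, s = local_mean_std(img, y, x, w)
            t = m + k * s
            out_row.append(1 if p >= t else 0)   # assumed: >= T means white
        out.append(out_row)
    return out


# Toy image: bright background with a single dark pixel in the middle.
image = [[200] * 5 for _ in range(5)]
image[2][2] = 20
binary = niblack(image, w=3)
```

Recomputing the window statistics for every pixel is what makes the method slow for large windows; running sums or integral images are a common way to speed this up.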
2.2. Sauvola Binarization Technique
Sauvola's binarization [14,5] is a locally adaptive thresholding method for grey scale images. According to the algorithm, the local mean and standard deviation are calculated on a W × W window around each pixel. Let T be the threshold for a pixel of the image, and let Mw(i,j) and σw(i,j) be the local mean and standard deviation on the local window of size W × W at pixel location (i,j). The value of T can be calculated as

$$T_{w,k}(i,j) = M_w(i,j)\left(1 + k\left(\frac{\sigma_w(i,j)}{R} - 1\right)\right)$$

where Mw(i,j) is the local mean of the pixels in the window of size W × W and σw(i,j) is the standard deviation of the pixels in the same window. The parameters R and k are constants (also called control factors) that control sensitivity to noise.
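The Sauvola threshold can be sketched in the same style. The code below is a hedged illustration rather than the study's implementation: the border clipping and the function names are assumptions, R = 128 and k = 0.34 are values mentioned in the observations of this paper, and pixels at or above the threshold are assumed to map to 1 (white).

```python
import math


def local_mean_std(img, y, x, w):
    """Mean and standard deviation in the w x w window centred on (y, x)."""
    r = w // 2
    vals = [img[j][i]
            for j in range(max(0, y - r), min(len(img), y + r + 1))
            for i in range(max(0, x - r), min(len(img[0]), x + r + 1))]
    m = sum(vals) / len(vals)
    s = math.sqrt(sum((v - m) ** 2 for v in vals) / len(vals))
    return m, s


def sauvola(img, w=15, k=0.34, R=128):
    """Threshold every pixel with T = m * (1 + k * (s / R - 1))."""
    out = []
    for y, row in enumerate(img):
        out_row = []
        for x, p in enumerate(row):
            m, s = local_mean_std(img, y, x, w)
            t = m * (1 + k * (s / R - 1))
            out_row.append(1 if p >= t else 0)   # assumed: >= T means white
        out.append(out_row)
    return out


image = [[200] * 5 for _ in range(5)]
image[2][2] = 20                       # a single dark "text" pixel
binary = sauvola(image, w=3)
```

In a flat region s is 0, so the threshold drops to m·(1 − k), well below the local mean, which is why flat background stays white here.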
2.3. Bernsen's Binarization
Bernsen [17,6] introduced a local thresholding method. In this method the local minimum and maximum are computed in a neighbourhood around each pixel f(x,y) ∈ [0, L-1], and the midpoint of the two is used as the threshold for that pixel, with respect to the chosen window size. The sliding window is a mask of size N × N whose centre coincides with the pixel in focus. The window is moved from one pixel location to the next across the image to find the appropriate threshold value at each position.

$$T(x,y) = \frac{P_{low} + P_{high}}{2} \quad \text{if } P_{high} - P_{low} \geq L$$
$$T(x,y) = GT \quad \text{if } P_{high} - P_{low} < L$$

where Plow is the lowest gray value in the N × M window, Phigh is the highest gray value in the N × M window, and the local contrast Phigh - Plow is compared with the limit L. GT stands for the global threshold obtained using Otsu's binarization. Equivalently,

$$G(x,y) = \frac{F_{max}(x,y) + F_{min}(x,y)}{2}$$

and B(x,y) = 1 if f(x,y) < G(x,y), otherwise 0.
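The two-case rule above can be sketched as follows. This is an illustrative version only: the default contrast limit of 15 and the default global threshold of 128 are hypothetical placeholders (the text obtains the global threshold from Otsu's method), the window is clipped at the border, and 1 marks pixels darker than the threshold, following the B(x,y) rule.

```python
def bernsen(img, w=15, contrast_limit=15, global_t=128):
    """Bernsen thresholding with a global fallback in low-contrast windows."""
    h, wd, r = len(img), len(img[0]), w // 2
    out = [[0] * wd for _ in range(h)]
    for y in range(h):
        for x in range(wd):
            vals = [img[j][i]
                    for j in range(max(0, y - r), min(h, y + r + 1))
                    for i in range(max(0, x - r), min(wd, x + r + 1))]
            lo, hi = min(vals), max(vals)
            if hi - lo >= contrast_limit:
                t = (lo + hi) / 2      # enough contrast: use the local mid value
            else:
                t = global_t           # flat window: fall back to the global threshold
            out[y][x] = 1 if img[y][x] < t else 0
    return out


image = [[200] * 5 for _ in range(5)]
image[2][2] = 20
binary = bernsen(image, w=3)           # only the dark centre pixel is marked 1
```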
2.4. Feng's Binarization
Feng's binarization [7] is based on local thresholding in which two local windows are used in place of a single sliding window. The first (primary) window slides over each pixel, and attributes are calculated from the intensity values of the neighbourhood pixels. The second window differs in size and is usually larger than the primary window. The local mean (m) and the minimum gray value (M) are calculated in the primary local window from the pixels in the neighbourhood. The standard deviation is calculated in both windows: S denotes the standard deviation in the primary window and Rs the standard deviation of the pixels in the larger window. The following formula is used to calculate the threshold:

$$T_{feng} = (1 - \alpha_1)\, m + \alpha_2 \frac{S}{R_s}(m - M) + \alpha_3 M$$

where α2 = k1·(S/Rs)^γ, α3 = k2·(S/Rs)^γ, γ = 2, S is the standard deviation within the primary window, Rs is the standard deviation within the secondary window, m is the local mean within the primary window, and M is the minimum gray value within the primary window.
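As a worked illustration of the formula, the threshold for one pixel can be computed from the window statistics as below. The function name and the example values of alpha1, k1 and k2 are assumptions chosen only to make the arithmetic easy to follow; the text does not specify them.

```python
def feng_threshold(m, M, s, Rs, alpha1, k1, k2, gamma=2):
    """Feng threshold from primary-window mean m, minimum M, deviation s,
    and secondary-window deviation Rs."""
    ratio = s / Rs if Rs > 0 else 0.0          # guard against a flat secondary window
    alpha2 = k1 * ratio ** gamma
    alpha3 = k2 * ratio ** gamma
    return (1 - alpha1) * m + alpha2 * ratio * (m - M) + alpha3 * M


# Illustrative values only: with s == Rs the ratio is 1, so
# T = 0.9 * 100 + 0.2 * (100 - 20) + 0.05 * 20 = 107 (up to rounding).
t = feng_threshold(m=100, M=20, s=30, Rs=30, alpha1=0.1, k1=0.2, k2=0.05)
```

When the primary window is flat (s = 0), both alpha2 and alpha3 vanish and the threshold reduces to (1 − alpha1)·m.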
Sliding Window Mean Thresholding
Sliding window mean thresholding [21] is performed by calculating the mean of the gray scale values within a local window. The sliding window is placed over each pixel of the image and the neighbouring pixel values are used to calculate the mean. The following rule is used to classify each pixel against the local mean.
Pixel = {Pixel value > Local Mean - C ? Object : Background}, where C = 0 or any value chosen to control the noise level.
Sliding Window Median Thresholding
Sliding window median thresholding [21] is performed by calculating the median of the gray scale values within a local window. The sliding window is placed over each pixel of the image and the neighbouring pixel values are used to calculate the median. The pixels surrounding the central pixel are sorted, and the median is taken from the sorted values. The following rule is used to classify each pixel against the local median.
Pixel = {Pixel value > Local Median - C ? Object : Background}, where C = 0 or any value chosen to control the noise level.
2.5. Sliding Window MidGray Thresholding
Sliding window mid-gray thresholding [21] is performed by calculating the mid-gray value, i.e. the midpoint of the local maximum and local minimum gray values, within a local window. The sliding window is placed over each pixel of the image and the neighbouring pixel values are used to find the maximum and minimum. The following rule is used to classify each pixel against the local mid-gray value.
Pixel = {Pixel value > (Local Maximum + Local Minimum)/2 - C ? Object : Background}, where usually C = 0 or any value chosen to control the noise level.
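The three sliding window rules (mean, median and mid-gray) differ only in the local statistic, so they can be sketched with one function. This is a minimal illustration under assumptions: the mid-gray value is taken as the midpoint of the local maximum and minimum, the window is clipped at the image border, the `stat` parameter name is hypothetical, and the default w = 7 follows the window size mentioned for mean thresholding in the observations.

```python
from statistics import median


def sliding_threshold(img, stat="mean", w=7, c=0):
    """Mark a pixel as object (1) when pixel > local statistic - c."""
    h, wd, r = len(img), len(img[0]), w // 2
    out = [[0] * wd for _ in range(h)]
    for y in range(h):
        for x in range(wd):
            vals = [img[j][i]
                    for j in range(max(0, y - r), min(h, y + r + 1))
                    for i in range(max(0, x - r), min(wd, x + r + 1))]
            if stat == "mean":
                ref = sum(vals) / len(vals)
            elif stat == "median":
                ref = median(vals)
            elif stat == "midgray":
                ref = (max(vals) + min(vals)) / 2
            else:
                raise ValueError("stat must be 'mean', 'median' or 'midgray'")
            out[y][x] = 1 if img[y][x] > ref - c else 0
    return out


image = [[200] * 5 for _ in range(5)]
image[2][2] = 20
by_mean = sliding_threshold(image, "mean", w=3, c=10)
by_median = sliding_threshold(image, "median", w=3, c=10)
by_midgray = sliding_threshold(image, "midgray", w=3, c=10)
```

With c = 0 a perfectly flat region fails the strict comparison and is labelled background, which is one reason a small positive c is useful in practice.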
2.6. Wolf's binarization technique
Wolf's binarization technique [8,10] is implemented by calculating the standard deviation and mean within a local window and over the whole image. The sliding window is placed over each pixel of the image and the neighbouring pixel values are used to calculate m (mean) and S (standard deviation). The following equation is used to find the threshold over the local window.

$$T_{wolf} = (1 - K)\, m + K M + K \frac{S}{R}(m - M)$$

where K = 0.5 (fixed), M is the minimum gray value, m is the local mean, S is the local standard deviation and R is the global standard deviation term.
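A sketch of this threshold is given below. It makes two assumptions that the text leaves open: M is taken as the minimum gray value of the whole image, and R is taken as the largest local standard deviation found over the image, as in Wolf and Jolion's formulation. The default window of 11 is the size reported as suitable in the observations, and pixels at or above the threshold are assumed to map to 1 (white).

```python
import math


def wolf(img, w=11, k=0.5):
    """Wolf thresholding: T = (1 - k) * m + k * M + k * (s / R) * (m - M)."""
    h, wd, r = len(img), len(img[0]), w // 2
    stats = []
    for y in range(h):                     # first pass: local mean and deviation
        row_stats = []
        for x in range(wd):
            vals = [img[j][i]
                    for j in range(max(0, y - r), min(h, y + r + 1))
                    for i in range(max(0, x - r), min(wd, x + r + 1))]
            m = sum(vals) / len(vals)
            s = math.sqrt(sum((v - m) ** 2 for v in vals) / len(vals))
            row_stats.append((m, s))
        stats.append(row_stats)

    M = min(min(row) for row in img)       # assumed: global minimum gray value
    R = max(s for row in stats for _, s in row)   # assumed: max local deviation

    out = []
    for y in range(h):                     # second pass: apply the threshold
        out_row = []
        for x in range(wd):
            m, s = stats[y][x]
            ratio = s / R if R > 0 else 0.0
            t = (1 - k) * m + k * M + k * ratio * (m - M)
            out_row.append(1 if img[y][x] >= t else 0)
        out.append(out_row)
    return out


image = [[200] * 5 for _ in range(5)]
image[2][2] = 20
binary = wolf(image, w=3)
```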
2.7. White and Rohrer's Binarization
White and Rohrer's binarization technique [11] calculates the threshold for binarization of the image by using a bias parameter as a multiplier of the pixel value under consideration. The concept of a sliding window is used to find the mean value of the neighbourhood pixels around the central pixel. The following rule is used to decide the value of the central pixel in the image.
If mlocal(x,y) < I(x,y) · Bias, set pixel I(x,y) to 1, otherwise 0
where mlocal(x,y) is the local mean within the window, computed from the neighbourhood pixels, I(x,y) is the value of the pixel being binarized, and Bias is the control variable (a suitable value is 2).
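The decision rule can be sketched as below. The default window size of 15 and the border clipping are assumptions for illustration, since the text gives only the bias value.

```python
def white_rohrer(img, w=15, bias=2):
    """Set a pixel to 1 when the local mean is below bias * pixel value."""
    h, wd, r = len(img), len(img[0]), w // 2
    out = [[0] * wd for _ in range(h)]
    for y in range(h):
        for x in range(wd):
            vals = [img[j][i]
                    for j in range(max(0, y - r), min(h, y + r + 1))
                    for i in range(max(0, x - r), min(wd, x + r + 1))]
            m_local = sum(vals) / len(vals)
            out[y][x] = 1 if m_local < img[y][x] * bias else 0
    return out


image = [[200] * 5 for _ in range(5)]
image[2][2] = 20                       # dark pixel: 180 < 20 * 2 is false, so 0
binary = white_rohrer(image, w=3)
```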
2.8. T.R. Singh et al.'s binarization
T.R. Singh et al. [12] introduced a local thresholding computation that calculates the local mean and the mean deviation using the following equation.

$$T(x,y) = m(x,y)\left[1 + k\left(\frac{\delta(x,y)}{1 - \delta(x,y)} - 1\right)\right]$$

where δ(x,y) = I(x,y) - m(x,y), k is a bias with k ∈ [0,1], m(x,y) is the local mean within the window, computed from the neighbourhood pixels, and I(x,y) is the value of the pixel being binarized.
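A minimal sketch of this rule follows. It assumes that gray values are first normalised to the range [0, 1], so that the term 1 − δ stays positive; this normalisation and the value k = 0.1 are illustrative assumptions, while the default window of 7 follows the size mentioned in the observations. Pixels at or above the threshold are assumed to map to 1 (white).

```python
def singh(img, w=7, k=0.1):
    """Threshold with T = m * (1 + k * (d / (1 - d) - 1)), d = I - m, on [0, 1] values."""
    h, wd, r = len(img), len(img[0]), w // 2
    out = [[0] * wd for _ in range(h)]
    for y in range(h):
        for x in range(wd):
            vals = [img[j][i] / 255
                    for j in range(max(0, y - r), min(h, y + r + 1))
                    for i in range(max(0, x - r), min(wd, x + r + 1))]
            m = sum(vals) / len(vals)          # local mean of normalised values
            p = img[y][x] / 255
            delta = p - m                      # mean deviation of this pixel
            t = m * (1 + k * (delta / (1 - delta) - 1))
            out[y][x] = 1 if p >= t else 0
    return out


image = [[200] * 5 for _ in range(5)]
image[2][2] = 20
binary = singh(image, w=3)
```

Only the local mean is needed, with no standard deviation, which is consistent with the lower computational cost the paper attributes to this method.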
III. ANALYSIS
We have considered the ICDAR 2009 [13] dataset for binarisation of degraded document images and the ICDAR 2011 [14] dataset for natural scene images. Several papers in the literature use precision, recall and F-measure analysis. In the case of natural scene images, however, the background is so complex that it is not easy to separate the object from it. There are many text-like objects such as tree leaves, branches and windows. In the current study the following method is used. More than two thousand images were captured using a Nikon D3200 digital camera. The images cover different locations and contain sign boards, notice boards, banners, and text written on walls and vehicles in the city of Patiala. The text present on these boards is in Gurmukhi script. The images were captured at different locations and at different times so that different lighting conditions are covered. The images were resized to reduce the computational overhead.
3.1. Parameters of analysis.
The outputs from the different binarization algorithms are evaluated to find the most suitable algorithm.
Visual Observation: visual observation is an efficient method to observe the effect of binarization techniques in separating text from background.
Recall (r): the ratio of the number of relevant object pixels retrieved to the total number of relevant object pixels in the reference image.
Recall = (A/(A + B)) * 100
Precision (p): the ratio of the number of relevant object pixels retrieved to the total number of irrelevant and relevant object pixels retrieved. The precision can be expressed by the following equation
Precision = (A/(A + C)) * 100
Here A is the number of relevant object pixels retrieved, B is the number of relevant object pixels that were not retrieved, and C is the number of irrelevant pixels retrieved.
F-measure (f): it is defined to show the goodness of the results. A higher value of F-measure represents a better result. It is expressed by the following equation
F-measure = (2 * Recall * Precision)/(Recall + Precision)
Peak Signal to Noise Ratio (PSNR): it is defined as the ratio of peak signal power to average noise power. It can be calculated as

$$PSNR = 10 \log_{10} \frac{255^2 \, M N}{\sum_i \sum_j \left(x(i,j) - y(i,j)\right)^2}$$

where x and y are the two images being compared and M × N is the image size.
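These measures are straightforward to compute from a binarised output and its ground truth. The sketch below is illustrative: it assumes both images are lists of rows with 1 marking object pixels for precision and recall, takes A as correctly retrieved object pixels, B as missed object pixels and C as wrongly retrieved pixels, and assumes 0..255 valued images for PSNR. The function names are hypothetical.

```python
import math


def precision_recall_f(pred, truth):
    """Return (precision, recall, F-measure) in percent for two 0/1 images."""
    A = B = C = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                A += 1                 # object pixel correctly retrieved
            elif t:
                B += 1                 # object pixel missed
            elif p:
                C += 1                 # background pixel wrongly retrieved
    recall = 100.0 * A / (A + B) if A + B else 0.0
    precision = 100.0 * A / (A + C) if A + C else 0.0
    total = recall + precision
    f_measure = 2 * recall * precision / total if total else 0.0
    return precision, recall, f_measure


def psnr(x, y):
    """PSNR in dB between two equally sized images with values in 0..255."""
    sq_err = sum((a - b) ** 2 for rx, ry in zip(x, y) for a, b in zip(rx, ry))
    if sq_err == 0:
        return math.inf                # identical images: no noise
    n_pixels = len(x) * len(x[0])
    return 10 * math.log10(255 ** 2 * n_pixels / sq_err)


pred = [[1, 1], [0, 0]]
truth = [[1, 0], [1, 0]]
p, r, f = precision_recall_f(pred, truth)      # A = 1, B = 1, C = 1
noise = psnr([[255, 255], [0, 0]], [[255, 0], [255, 0]])
```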
The algorithms considered for binarization are Otsu's Binarization technique, Niblack's Binarization technique, Sauvola's Binarization technique, Morphological Binarization, Bernsen's Binarization, Feng's Binarization, Sliding Window Mean Thresholding, Sliding Window Median Thresholding, Sliding Window MidGray Thresholding, Wolf's Binarization, White and Rohrer's Binarization, and T.R. Singh's Binarization.
3.2. Ground Truth Images: The ground truth images are created with a manual tool, and each image is divided into two areas, foreground and background. The area containing text is labelled as foreground and the remaining area is labelled as background. The database of ground truth images [14] from the Eleventh International Conference on Document Analysis and Recognition (ICDAR 2011) is considered for evaluation of the binarization algorithms. The following are a few sample ground truth images.
TABLE 3: GROUND TRUTH IMAGES FROM ICDAR 2009 (columns: Original Image, Ground Truth Image)
TABLE 4: GROUND TRUTH NATURAL SCENE IMAGES FROM ICDAR 2011
TABLE 5: GROUND TRUTH IMAGES OF NATURAL SCENE IMAGES CAPTURED DURING STUDY (columns: Original Image, Ground Truth Image)
1.1. Visual performance: Visual performance of binarisation methods on document images.
[Image panels: Kittler's method, Otsu's method, Niblack's method, Wolf's method, Sauvola's method, Wolf's binarisation, Bernsen's method, White and Rohrer's method, Feng's method, Median filter, MidGray filter, T.R. Singh's method]
Analysis of statistical parameters: We use recall, precision, F-measure and PSNR to assess the performance on document and natural scene images. The following table shows the performance using these parameters.
1.2. Decision parameters based analysis: The following parameters are used to assess the performance of the different methods.
TABLE 7: ANALYSIS BASED ON PRECISION, RECALL, F-MEASURE AND PSNR
| Sr. No | Method | Precision | Recall | F-measure | PSNR |
| 1 | Otsu's Binarisation | 0.8967 | 0.9270 | 91.16829 | 20.01837 |
| 2 | Niblack's Binarisation | 0.2364 | 0.9843 | 38.1246 | 7.9908 |
| 3 | Sauvola's Binarisation | 0.8802 | 0.9332 | 90.5918 | 20.1616 |
| 4 | Wolf's Binarisation | 0.9846 | 0.5021 | 66.5088 | 12.9281 |
| 5 | Bernsen's Binarisation | 0.1699 | 0.8658 | 28.4012 | 3.5669 |
| 6 | Feng's Binarisation | 0.0153 | 0.1251 | 2.7198 | 0.4498 |
| 7 | Median filter's Binarisation | 0.9409 | 30.2122 | 0.0988 | 3.5857 |
| 8 | MidGray filter's Binarisation | 0.7728 | 36.0587 | 0.1269 | 5.5863 |
| 9 | Mean filter's Binarisation | 0.3184 | 11.2004 | 0.0453 | 2.9353 |
| 10 | Wolf's Binarisation | 0.9846 | 0.5021 | 66.5088 | 12.9281 |
| 11 | White and Rohrer's Binarisation | 0.9255 | 0.3917 | 55.0452 | 11.9066 |
| 12 | T.R. Singh's Binarisation | 0.2106 | 0.8992 | 34.1320 | 4.5634 |
| 13 | Kittler's Binarisation | 0.8968 | 0.9271 | 91.1683 | 20.4922 |
TABLE 8: ICDAR-2011 (SAMPLE DATABASE) (columns: ICDAR-2011 Image, Otsu, Niblack)
Results: The most suitable algorithms for binarization of natural scene images were implemented on the ICDAR-2011 database and gave satisfactory results. The above are a few examples with results.
TABLE 9: IMPLEMENTATION OF BINARISATION METHODS ON NATURAL SCENE IMAGES
[Image panels, each paired with its original image: Otsu's Binarization, Niblack's Binarization, Otsu's Binarization, Bernsen's Binarization, Wolf's Binarization, White and Rohrer's Binarization]
3.3. Detailed Analysis:
Binarization is one of the key steps in recognising text in an image so that it can be used for applications such as automatic document readers and navigation systems for the visually impaired. Twelve algorithms from the literature have been considered in the present research work, and their performance has been evaluated using visual observation and mathematical noise and error calculation methods such as Recall, Precision, F-measure and PSNR. The following observations were made in order to find the most suitable algorithm.
1. Otsu's method is a good global binarization method for images where the background is uniformly illuminated. It is one of the fastest methods, as its computational overhead is very low. It introduces noise into the image when the illumination is non-uniform. It gives the best result for uniform illumination, i.e. when the histogram is bimodal.
2. Window size plays an important role in the performance of Niblack's binarization technique. The sliding window should be small enough to preserve local significance and large enough to suppress noise. Experiments show that 15 x 15 is an ideal size for binarization. A larger window size slows down the binarization process, so it is recommended to use an appropriate window size such as 15 x 15 or 17 x 17. It is a good algorithm for segmenting text regions in the image, but it produces a large amount of binarization noise in the non-text regions. A large window size minimises the noise. The values of the parameters p and k play an important role in overcoming the noise problem in the output binarized image. The most suitable values are p = 0.5 and k = -0.2.
3. Sauvola's binarization algorithm is a modification of Niblack's algorithm, introduced because Niblack's algorithm leaves unwanted noise in non-text areas. Noise control in this algorithm is better than in Niblack's. The value of k is always positive and lies in the interval [0.2, 0.5]. The experimental results reveal that appropriate values are k = 0.5 and k = 0.34. The size of the window plays an important role in segmenting the image; a window size of 15 x 15 gives good results, as in the case of Niblack. Similarly, there is a trade-off between window size and execution time.
4. In Sauvola's binarization, the value R = 128 is fixed for gray scale images. When the local neighbourhood has high contrast, σw(i,j) = R and T = Mw(i,j)*(1 + k(1 - 1)) = Mw(i,j), so the threshold equals the local mean. When the local neighbourhood has low contrast, Tw,k(i,j) goes below the mean value. Sauvola's technique reduces noise and enhances the textual part. Computation of Mw(i,j) and σw(i,j) for each neighbourhood is time consuming, with computational complexity O(n²w²) for an image of size n × n and a window of size w × w. If the text part of the image is small compared to the total image size, then the local standard deviation of the pixels in the window suppresses text regions in the image.
5. Feng's binarization introduced the concept of two sliding windows, where one is local and the other is of larger size, depending upon the application type. The current research used a local window of size 15 x 15 and a second window of size N x M, where N and M are the dimensions of the image. It gives good results for all types of images, but the drawback of the method is the determination of its empirical parameters. Experiments reveal that a set of parameters that is good for specific types of images may not be suitable for other kinds of images. Because two windows are used to compute the local threshold, it is slow compared to single window algorithms. Outputs for sample images are displayed in the table above.
6. The method used by T.R. Singh is based on the calculation of the local mean and a bias constant which may take a value in (0,1). The significant improvement in this algorithm is the size of the sliding window. It gives good results even for a small window size, for example w = 7. Compared with Niblack and Sauvola, the outputs are visually almost the same, but there is a large difference in computational time. For Niblack's and Sauvola's binarization methods a suitable window size is 17 to 21, but T.R. Singh's method requires a window of only 7 to 9 to show significant results. It is therefore faster than Niblack's and Sauvola's binarization methods.
7. The morphological binarization method makes use of the erosion operation, as it gives good results compared to the dilation, opening and closing operations. It is suitable for images where the text size is large and a uniform font is used. It distorts the shape of small text and makes very small text disappear. The most suitable convolution mask for binarization of text images is 7 x 7.
8. Sliding window mean thresholding makes use of a window of size 7 x 7 to find the local mean of the image. The local mean is compared with the current pixel of the local window to decide the threshold. It is a good method for images where text is written on a uniform background. It also adds a lot of noise to the image, because images with multi-coloured backgrounds have different means in their local areas; the result is non-uniform black and white areas, as shown in the table above.
9. As in the case of sliding window mean thresholding, a local window is used to calculate the median of the local area in the window. The pixels in the neighbourhood are sorted in ascending order of their gray values, and the threshold is then decided by comparing the current pixel with the median value. It gives good results only when the background is uniform or a minimum of colours is used in the text area. Even in its good results, significant noise is noticed. It is good for only very few images, as natural scene images are generally full of background colour. The MidGray value is calculated from the pixels in the window, and it is also useful when there is a large text area.
10. The algorithm suggested by Wolf et al. is based on the calculation of the local standard deviation and local mean. A sliding window is used to find the values of the standard deviation and local mean, so the size of the sliding window plays a key role. Experiments reveal that the most suitable window size is 11 x 11 and above. It gave good results for almost all kinds of images. Images with a uniform background are cleanly binarized and the exact shape and size of the text is preserved. Compared with Otsu's binarization it is a little slower, but it gives good results even with non-uniform backgrounds, which is not possible with Otsu. Compared with Niblack's and Sauvola's binarization it is good, as there is less noise than with either algorithm. The suitable sliding window size for Niblack's and Sauvola's binarization is 17 x 17 or more, whereas with Wolf's method good results can be obtained even at a size of 11 x 11.
REFERENCES
[1] P. Stathis, E. Kavallieratou, and N. Papamarkos, "An evaluation survey of binarization algorithms on historical documents," in 19th International Conference on Pattern Recognition (ICPR 2008), IEEE, 2008.
[2] N. Otsu, "A threshold selection method from gray-level histograms," IEEE Transactions on Systems, Man, and Cybernetics, vol. 9, no. 1, pp. 62–66, 1979.
[3] W. Niblack, An Introduction to Digital Image Processing, Prentice Hall, Englewood Cliffs, NJ, USA, 1986.
[4] J. Sauvola and M. Pietikäinen, "Adaptive document image binarization," Pattern Recognition, vol. 33, no. 2, pp. 225–236, 2000.
[5] J. Bernsen, "Dynamic thresholding of grey-level images," in Proceedings of the 8th International Conference on Pattern Recognition, pp. 1251–1255, Paris, France, 1986.
[6] M. Valizadeh and E. Kabir, "Binarization of degraded document image based on feature space partitioning and classification," International Journal on Document Analysis and Recognition, vol. 15, no. 1, pp. 57–69, 2012.
[7] B. Su, S. Lu, and C. L. Tan, "Robust document image binarization technique for degraded document images," IEEE Transactions on Image Processing, vol. 22, no. 4, pp. 1408–1417, 2013.
[8] V. Morteza and K. Ehsanollah, "An adaptive water flow model for binarization of degraded document images," International Journal on Document Analysis and Recognition, vol. 16, no. 2, pp. 165–176, 2013.
[9] D. Gaceb, F. Lebourgeois, and J. Duong, "Adaptative Smart-Binarization Method: For Images of Business Documents," in Proceedings of the 12th International Conference on Document Analysis and Recognition (ICDAR), 2013.
[10] W. Niblack, An Introduction to Digital Image Processing, Prentice Hall, pp. 115–116, 1986.
[11] J. Kittler and J. Illingworth, "On threshold selection using clustering criteria," IEEE Transactions on Systems, Man, and Cybernetics, vol. 15, pp. 652–655, 1985.
[12] Y. Yang and H. Yan, "An adaptive logical method for binarization of degraded document images," Pattern Recognition, pp. 787–807, 2000.
[13] T. W. Ridler and S. Calvard, "Picture thresholding using an iterative selection method," IEEE Transactions on Systems, Man, and Cybernetics, pp. 630–632, 1978.
[14] J. Sauvola and M. Pietikäinen, "Adaptive document image binarization," Pattern Recognition, vol. 33, pp. 225–236, 2000.
[15] L. Gottesfeld Brown, "A survey of image registration techniques," ACM Computing Surveys, vol. 24, no. 4, pp. 325–376, 1992.
[16] B. Gatos, K. Ntirgiannis, and S. J. Perantonis, "Improved document image binarization by using a combination of multiple binarization techniques and adapted edge information," in Proceedings of the 19th International Conference on Pattern Recognition (ICPR), pp. 1–4, 2008.
[17] J. Bernsen, "Dynamic thresholding of gray level images," in Proceedings of the International Conference on Pattern Recognition (ICPR), pp. 1251–1255, 1986.
[18] V. Rabeux, "Quality evaluation of ancient digitized documents for binarization prediction," in 12th International Conference on Document Analysis and Recognition (ICDAR), IEEE, 2013.
[19] J. He, Q. D. M. Do, A. C. Downton, and J. H. Kim, "A comparison of binarization methods for historical archive documents," in Eighth International Conference on Document Analysis and Recognition (ICDAR'05), pp. 538–542.
[20] B. Gatos, I. Pratikakis, and S. J. Perantonis, "Adaptive degraded document image binarization," Pattern Recognition, vol. 39, pp. 317–327, 2006.
[21] M. Sezgin and B. Sankur, "Survey over image thresholding techniques and quantitative performance evaluation," J. Electron. Imag., vol. 13, no. 1, pp. 146–165, 2004.