Investigation on Algorithm for Handwritten Gujarati OCR
Ph.D. Synopsis
Submitted To
Gujarat Technological University
For the Degree
of
Doctor of Philosophy
in
Electronics and Communication Engineering
By
Mikita R. Gandhi
Enrollment No: 149997111007
Supervisor:
Dr. Vishvjit Thakar
Associate Professor and Head
Information and Communication Technology,
Sankalchand Patel University,
Visnagar.
Co-Supervisor:
Dr. Hetal N. Patel
Professor & Head
Electronics and Communication Department,
A.D.Patel Institute of Technology,
New V. V. Nagar.
Table of Contents
1. Title of the Thesis and Abstract
2. Brief description on the state of the art of the research topic
3. Objective and Scope of the work
4. Original contribution by the thesis
5. Methodology of Research, Results / Comparisons
6. Achievements with respect to objectives
7. Conclusions
8. List of publications arising from the thesis
9. References
1. Title of the Thesis and Abstract
1.1 Title of the Thesis:
Investigation on Algorithm for Handwritten Gujarati OCR
1.2 Abstract:
Optical Character Recognition (OCR) is receiving increasing attention because it enables a computer to learn and recognize regional languages well; where it succeeds, it opens up a whole new world of possibilities. Machine-printed characters can be recognized accurately, which has solved many problems and led to commercial systems in routine use, but recognition of handwritten characters is far more difficult, and methods for recognizing handwritten documents are still a subject of active research. No common algorithm is possible for all Indian languages, because each Indian language has its own features and restrictions. In the state of Gujarat, Gujarati is the commercial language, and most communication in Government offices, schools and the private sector is done in Gujarati. A handwritten Gujarati OCR system was developed for handwritten amounts on cheques, automatic reading of marks from answer sheets, and a learning application for the education system. The research work mainly focuses on the implementation of a robust algorithm for handwritten Gujarati OCR.
The KNN and SVM classifiers were used with different feature extraction methods: pixel count ratio, object gradient, object geometry, character profile, local binary pattern, center-symmetric local binary pattern and the wavelet transform. Furthermore, hybrid feature extraction methods were used to increase character recognition performance. As another, novel approach, features were extracted automatically using deep learning and given to an SVM for handwritten character classification. To increase the character recognition rate further, a pretrained deep neural network (AlexNet) was used, and three applications were implemented: handwritten Gujarati numeral to speech conversion, character to speech conversion, and automatic handwritten marks recognition. KNN, SVM and the deep neural network give recognition accuracies of 98.14%, 98.72% and 99.30% for numerals, 92.37%, 92.21% and 97.65% for characters, and 92.64%, 92.93% and 97.73% for combined numerals and characters, respectively.
2. Brief description on the state of the art of the research topic
As the world moves closer to the concept of the "paperless office," more and more communication and storage of documents is performed digitally. Documents and files that were once stored physically on paper are now converted into electronic form to facilitate quicker additions, searches and modifications; doing so also prolongs the life of such documents. Advances in character recognition have largely been limited to the extraction of English-language characters, both printed and handwritten. Character recognition for Indian languages can help authors, novelists and many other people to recognize Indian characters and even to extract old heritage documents. Research work on handwritten character recognition is nearly negligible for Indian languages in general and for the Gujarati language in particular. In Gujarat State, all Government agency documents are written in the Gujarati language. Software is available for printed Gujarati OCR, but recognition of handwritten characters is still a challenging task.
The basic block diagram of an OCR system is shown in figure 2.1. There are five major stages: preprocessing, segmentation, representation, training and recognition, and post-processing.
Figure 2.1 Basic block diagram
Preprocessing makes the raw data usable in the subsequent stages of character analysis: smoothing, sharpening, binarizing the image, removing the background and extracting the required information. Segmentation converts the document into separate characters: first the lines are segmented, then each line is segmented into words, and the words into individual characters, which are used by the classifier. In the representation stage, a set of features is extracted to distinguish one class of images from another. Classifiers such as KNN, SVM, neural networks and deep learning are used for training and recognition.
Work on Gujarati OCR was initiated by Sameer Antani et al. [1] on printed Gujarati script. KNN and Hamming distance classifiers were applied to 15 characters, with 30 samples for
each character, and gave 67% and 41.33% accuracy respectively. Using template matching and wavelet transform coefficients [2], Shah S. K. et al. attained 72.30% accuracy for printed Gujarati OCR. Ankit K. Sharma et al. [3] worked on a zoning method and, using a multilayer feed-forward neural network classifier, achieved 95.92% accuracy for handwritten Gujarati numerals, and Archana Vyas et al. [4] got 96.99% accuracy using KNN. Using a hybrid feature-space method and an SVM classifier, A. Desai [5] recognized forty handwritten Gujarati characters with 86.66% accuracy. Zonal boundaries were successfully detected [6] by Jignesh Dholakia et al. using a zoning method. Swital J. Macwan et al. [7] applied the discrete wavelet transform to handwritten Gujarati characters and got 89.46% accuracy. V. A. Naik et al. used different structural and statistical features for recognition of handwritten numerals and acquired 95% accuracy [8], and Dinesh Satange et al. obtained 90% accuracy using a multilayer perceptron [9].
Ashutosh Aggarwal et al. [10] worked on gradient-based feature extraction and an SVM classifier for handwritten Hindi character recognition. LBP features were used for Bangla digit recognition in 2015 [11], achieving 96.7% accuracy with a KNN classifier; the same LBP features were applied to Persian/Arabic handwritten digit recognition [12]. Sekhar Mandal et al. [13] proposed an algorithm for machine-printed character recognition in the Bangla language using the two-dimensional wavelet transform and gradient information. Saleem Pasha et al. solved the handwritten recognition problem for the Kannada language using statistical features and the wavelet transform [14].
A two-stage CNN was used by Shibaprasad Sen et al. [15] for online Bengali handwritten character recognition, gaining 99.40% accuracy. Akm Ashiquzzaman et al. worked on a 10-layer CNN architecture for Arabic handwritten digits and achieved 97.4% accuracy [16]. Chaouki Boufenar et al. showed three different deep learning approaches for handwritten Arabic character recognition: (i) training from scratch, (ii) transfer learning and (iii) fine-tuning [17].
Table 2.1 shows the Gujarati numerals and characters used for the research work.
Numerals
૦ ૧ ૨ ૩ ૪ ૫ ૬ ૭ ૮ ૯
Characters
ક ખ ગ ઘ ચ છ જ ઝ ટ ઠ ડ ઢ ણ ત થ દ ધ ન પ ફ
બ ભ મ ય ર લ વ શ ષ સ હ ળ ક્ષ જ્ઞ શ્ર અ ઋ ઈ ઉ ઊ
Table 2.1 Gujarati Numerals and Characters
3. Objective and Scope of work
3.1 Objective:
To develop feature extraction algorithms for handwritten Gujarati OCR to recognize
numerals and characters.
To design an Optical Character Recognition system for handwritten Gujarati
Numerals.
To design an Optical Character Recognition system for handwritten Gujarati
Characters.
To design an Optical Character Recognition system for combined handwritten
Gujarati Numerals and Characters.
3.2 Scope of work:
The research work is useful for automatic detection of amounts written in Gujarati on
bank cheques, of marks written on answer sheets, and in Gujarati numeral and
character learning applications.
Handwritten Gujarati numeral and character to speech conversion.
The implemented algorithms can be useful for recognition of Gujarati text with modifiers.
4. Original contribution by the thesis
Development of different feature extraction methods, along with creation of a
database of Gujarati handwritten numerals and characters.
Three different classification methods, KNN, SVM and deep learning, were used for
recognition.
Hybrid features were combined with the above three classification methods.
The transfer learning approach of deep learning was used for better accuracy.
Three applications were developed:
Gujarati handwritten numeral to speech conversion
Gujarati handwritten character to speech conversion
Automatic handwritten marks recognition.
5. Methodology of research, results and comparisons
Since no standard database is available for handwritten Gujarati OCR, a database was
created for the research work containing 5000 samples for numerals and 10,000 samples
for characters, collected from people of different ages.
5.1 Feature Extraction Method:
Method 1: Pixel Count Ratio
The image is resized to a 64 X 64 matrix and some morphological operations are applied
to it. The image is then divided into 8 X 8 zones, giving 64 zones in total. From each
zone, the ratio of the number of white pixels to the number of black pixels is taken as a
feature, so 64 features are generated in total.
Figure 5.1 shows the image of Gujarati numeral ‘1’ and its 64 zones.
Figure 5.1 Pixel Count Ratio
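The zone-wise computation can be sketched in Python as follows (a minimal illustration, not the thesis implementation; the fallback for an all-white zone, where the black-pixel count is zero, is an assumption):

```python
def pixel_count_ratio_features(img):
    """img: 64x64 binary image (list of lists, 1 = white/foreground).
    Returns 64 features: the white/black pixel ratio of each 8x8 zone."""
    assert len(img) == 64 and all(len(row) == 64 for row in img)
    features = []
    for zr in range(8):                    # zone row
        for zc in range(8):                # zone column
            white = sum(img[zr * 8 + i][zc * 8 + j]
                        for i in range(8) for j in range(8))
            black = 64 - white             # each 8x8 zone has 64 pixels
            # guard against an all-white zone (black == 0); this fallback
            # is an assumption, not specified in the text
            features.append(white / black if black else float(white))
    return features

# toy example: a single all-white 8x8 zone in the top-left corner
img = [[1 if (r < 8 and c < 8) else 0 for c in range(64)] for r in range(64)]
feats = pixel_count_ratio_features(img)
```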
Method 2: Object Gradient
The gradient magnitude and gradient direction are used as features. First the image is
divided into 9 sub-images. A code is assigned to each 30° span of direction, so 12 codes
are assigned in each sub-image, and a total of 12 X 9 = 108 features are obtained from a
single image.
Figure 5.2 shows the image of Gujarati numeral ‘0’ and its 9 zones. For each sub-image
the gradient magnitude and gradient direction are computed; the gradient direction of each
white pixel is then checked to see which span it lies in, and the corresponding code is
assigned to that pixel.
Figure 5.2 Object Gradient
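The direction-coding step can be sketched as below (central differences for the gradient and a per-sub-image histogram of the 12 codes are assumptions; the exact gradient operator used in the thesis is not specified):

```python
import math

def gradient_direction_codes(img):
    """img: HxW grayscale image (list of lists). Splits the image into a 3x3
    grid of sub-images and, for each, histograms the gradient direction into
    12 codes (30-degree spans), giving 9 x 12 = 108 features."""
    h, w = len(img), len(img[0])
    feats = []
    for br in range(3):
        for bc in range(3):
            hist = [0] * 12
            r0, r1 = br * h // 3, (br + 1) * h // 3
            c0, c1 = bc * w // 3, (bc + 1) * w // 3
            for r in range(max(r0, 1), min(r1, h - 1)):
                for c in range(max(c0, 1), min(c1, w - 1)):
                    gx = img[r][c + 1] - img[r][c - 1]   # central differences
                    gy = img[r + 1][c] - img[r - 1][c]
                    if math.hypot(gx, gy) == 0:
                        continue                         # flat pixel: no code
                    ang = math.degrees(math.atan2(gy, gx)) % 360
                    hist[int(ang // 30)] += 1            # 30-degree span code
            feats.extend(hist)
    return feats

# toy example: an intensity ramp whose gradient points along +x (code 0)
img = [[c for c in range(9)] for r in range(9)]
feats = gradient_direction_codes(img)
```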
Method 3: Object Geometry
In this method [18], object geometry is used for features: horizontal lines, vertical lines,
right-diagonal and left-diagonal lines, area and Euler number. For this feature extraction
the image is divided into 9 sub-images. In each sub-image, the starting points and
intersection points are first found, and the numbers of line segments are then counted in
each direction (horizontal, vertical, right diagonal and left diagonal). The first four
features of each sub-image are the values of these lines, computed by equation (1).
Value = 1 - ((number of lines /10) * 2) …………… (1)
The next four features are computed from the lengths of the lines: if a particular line is
not present, the value is taken as -1; otherwise the normalized length is taken as the
feature value. The last feature is the area of the sub-image. So there are 9 features per
sub-image X 9 sub-images = 81 features, and the Euler number of the image is taken as
one more feature; hence 82 features are computed in total.
Figure 5.3 Object Geometry
Figure 5.3 shows one of the sub-images of digit ‘0’. The 9 features are calculated as below:
No. of segments: 3
o No. of horizontal lines : 0
o No. of vertical lines : 0
o No. of right diagonal lines :2
o No. of left diagonal lines :1
Value is calculated by:
Value =1 - ((number of lines /10) * 2)
So the first four features are:
o Value of horizontal lines : 1
o Value of vertical lines :1
o Value of right diagonal lines:0.6
o Value of left diagonal lines :0.8
The next 4 features are calculated as:
Length = (total no. of pixels in a particular direction) / (total no. of pixels belonging to
the skeleton)
Total no. of pixels belonging to the skeleton: 20
If there are no pixels in a particular direction, the length is taken as -1.
o Normalized Length of all horizontal lines :-1
o Normalized Length of all vertical lines :-1
o Normalized Length of all right diagonal lines :12/20 =0.60
o Normalized Length of all left diagonal lines :5/20 =0.25
The 9th feature from each zone is computed as:
o Normalized Area of the Skeleton = (total no. of pixels belonging to the
skeleton) / (size of the sub-image)
o Normalized Area of the Skeleton = 20/289 = 0.0692
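The worked example above can be checked numerically with a small script (the helper names are illustrative only, not from the thesis):

```python
def line_value(n_lines):
    """Feature value for one line direction: Value = 1 - ((n / 10) * 2)."""
    return 1 - (n_lines / 10) * 2

def norm_length(pixels_in_dir, skeleton_pixels):
    """Normalized length, or -1 when the direction has no pixels."""
    return pixels_in_dir / skeleton_pixels if pixels_in_dir else -1

# counts taken from the worked example above (sub-image of digit '0')
counts = {"h": 0, "v": 0, "rd": 2, "ld": 1}          # line segments per direction
lengths = {"h": 0, "v": 0, "rd": 12, "ld": 5}        # skeleton pixels per direction
skeleton, sub_image_area = 20, 17 * 17               # 17x17 sub-image = 289 pixels

features = [line_value(counts[d]) for d in ("h", "v", "rd", "ld")]
features += [norm_length(lengths[d], skeleton) for d in ("h", "v", "rd", "ld")]
features.append(skeleton / sub_image_area)           # normalized skeleton area
```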
Method 4: Character Profile
The object’s horizontal, vertical, right-diagonal and left-diagonal profiles are taken as
features. The image is resized to 50 X 50, so a total of 298 features = 50 horizontal + 50
vertical + 99 right-diagonal + 99 left-diagonal profiles are calculated.
Figure 5.4 shows the character profile for numeral ‘1’.
Figure 5.4 Character Profile
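The profile computation for a 50 X 50 binary image can be sketched as follows (projection profiles, i.e. per-row, per-column and per-diagonal foreground-pixel counts, are assumed as the meaning of "profile"):

```python
def profile_features(img):
    """img: 50x50 binary image. Returns 50 + 50 + 99 + 99 = 298 features:
    per-row, per-column and per-diagonal foreground-pixel counts."""
    n = 50
    horiz = [sum(row) for row in img]                       # 50 row profiles
    vert = [sum(img[r][c] for r in range(n)) for c in range(n)]
    right = [0] * (2 * n - 1)                               # diagonals r + c = const
    left = [0] * (2 * n - 1)                                # diagonals r - c = const
    for r in range(n):
        for c in range(n):
            right[r + c] += img[r][c]
            left[r - c + n - 1] += img[r][c]
    return horiz + vert + right + left

# toy example: a single foreground pixel at the top-left corner
img = [[0] * 50 for _ in range(50)]
img[0][0] = 1
f = profile_features(img)
```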
Method 5: Local Binary Pattern
Local Binary Patterns (LBP) is widely used as a feature extraction method in face,
fingerprint and texture recognition. It operates on image pixels and replaces each value
with a decimal number: each central pixel value is compared with its eight neighboring
pixels; if a neighboring pixel has a smaller value, 0 is assigned, otherwise 1. Taking the
top-left corner as the first bit and rotating in a clockwise manner generates an eight-bit
binary code, and the central pixel value is replaced by the decimal value of that code.
The histogram of these decimal values is used as the features.
Figure 5.5 shows LBP code generation. The generated binary code is 11000010, which is
equivalent to 194 in decimal. To implement rotation-invariant features and reduce the
size of the feature vector, uniform LBP is used.
Figure 5.5 (a) Input image (b) The LBP (8,1) operator (c) LBP coded block
A local binary pattern is called uniform if its uniformity measure is at most 2. For
example, the patterns 00000000 (0 transitions), 01110000 (2 transitions) and 11001111
(2 transitions) are uniform, while the patterns 11001001 (4 transitions) and 01010011
(6 transitions) are not. In the uniform LBP mapping there is a separate output label for
each uniform pattern and all non-uniform patterns are assigned to a single label; there
are 58 uniform patterns and 1 non-uniform label, so a 59-point feature vector is
considered.
To obtain uniform LBP features in this research work, the binary image is first converted
into a gray image and then divided into blocks of various sizes. For a block size of
12 X 12, a total of 16 blocks are generated and 59 features are obtained from each block,
so in total 16 X 59 + 59 features from the whole image = 1003 features are obtained.
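The LBP code generation and the uniformity test can be sketched as follows (the clockwise neighbour order starting at the top-left, and the >= comparison for ties, are assumptions; the toy patch is invented for illustration):

```python
def lbp_code(block):
    """block: 3x3 grayscale patch. Compare the 8 neighbours with the centre,
    clockwise from the top-left corner, and pack the bits into one byte."""
    centre = block[1][1]
    # clockwise neighbour order starting at the top-left corner
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for r, c in order:
        code = (code << 1) | (1 if block[r][c] >= centre else 0)
    return code

def is_uniform(code):
    """A pattern is uniform if it has at most 2 circular 0/1 transitions."""
    bits = [(code >> i) & 1 for i in range(8)]
    transitions = sum(bits[i] != bits[(i + 1) % 8] for i in range(8))
    return transitions <= 2

# toy 3x3 patch, centre value 6
block = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
code = lbp_code(block)        # 0b10001111 = 143
```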
Method 6: Center-Symmetric Local Binary Pattern
Center-Symmetric Local Binary Pattern (CSLBP) is an extension of LBP in which the
differences of opposite pixel values are taken: if the difference is greater than some
threshold value, bit 1 is assigned, otherwise bit 0, so the CSLBP histogram has only 16 bins.
Figure 5.6 shows the generation of a CSLBP code. Consider a threshold value of 8: the
difference between the opposite pixel values 80 - 70 = 10 is greater than 8, so a 1 is
assigned; the other opposite pairs are evaluated in the same way and the binary code is
generated. Here the binary code is 1011, whose decimal equivalent is 11.
Figure 5.6 Computation of CSLBP (8,1) with threshold = 8
Like LBP, CSLBP features are obtained by converting the image to gray scale and then
dividing it into blocks. The total number of features for a 12 X 12 block size is 16 blocks
X 16 features + 16 features of the whole image = 272.
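A sketch of the 4-bit CSLBP code for one 3 X 3 patch, using the same threshold of 8 as the example above (the pixel values in the toy patch and the pair ordering are invented for illustration):

```python
def cslbp_code(block, threshold=8):
    """block: 3x3 patch. Compare the 4 centre-symmetric neighbour pairs;
    bit = 1 when the difference exceeds the threshold -> 4-bit code (0..15)."""
    # opposite (centre-symmetric) neighbour pairs
    pairs = [((0, 0), (2, 2)), ((0, 1), (2, 1)),
             ((0, 2), (2, 0)), ((1, 2), (1, 0))]
    code = 0
    for (r1, c1), (r2, c2) in pairs:
        diff = block[r1][c1] - block[r2][c2]
        code = (code << 1) | (1 if diff > threshold else 0)
    return code

# toy patch: differences 10, -5, 60, 30 -> bits 1, 0, 1, 1 -> 0b1011 = 11
patch = [[80, 50, 90],
         [10,  0, 40],
         [30, 55, 70]]
code = cslbp_code(patch)
```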
Method 7: Wavelet Transform
The wavelet transform is widely used for representing and analyzing images. An image is
represented by a two-dimensional matrix; the wavelet transform is applied first row-wise
and then column-wise, so the image is divided into four sub-bands [LL, HL, LH, HH],
giving the image’s approximation, horizontal, vertical and diagonal details. Figure 5.7
shows the first-level wavelet transform. The approximation details are used as features in
the research work; the number of features is 256, 64 and 16 for levels 2, 3 and 4
respectively.
Figure 5.7 2-D Wavelet Transform
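One level of the row-then-column decomposition can be sketched with the Haar wavelet (the thesis does not state which wavelet was used, so Haar, with a pairwise average/difference filter, is an assumption):

```python
def haar_level(img):
    """One level of the 2-D Haar transform on an even-sized grayscale image.
    Returns the four sub-bands (ll, lh, hl, hh); the approximation ll is the
    one used as the feature vector in the text."""
    def step(rows):                          # pairwise average / difference
        lo = [[(r[2*i] + r[2*i+1]) / 2 for i in range(len(r) // 2)] for r in rows]
        hi = [[(r[2*i] - r[2*i+1]) / 2 for i in range(len(r) // 2)] for r in rows]
        return lo, hi

    t = lambda m: [list(col) for col in zip(*m)]   # matrix transpose
    lo, hi = step(img)                             # row-wise pass
    ll, lh = (t(x) for x in step(t(lo)))           # column-wise pass on lo
    hl, hh = (t(x) for x in step(t(hi)))           # column-wise pass on hi
    return ll, lh, hl, hh

# a 64x64 input gives a 32x32 ll at level 1, 16x16 = 256 features at level 2,
# 8x8 = 64 at level 3 and 4x4 = 16 at level 4, matching the counts above
ll, lh, hl, hh = haar_level([[1, 3], [5, 7]])
```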
5.2 Classification Method
Classification is the task of assigning an object to one of a set of predetermined classes.
Here five-fold cross-validation is used for evaluation.
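The five-fold split can be sketched as follows (an interleaved assignment of samples to folds is assumed; shuffled or stratified splits are equally valid):

```python
def kfold_indices(n, k=5):
    """Split n sample indices into k folds; each fold serves once as the
    test set while the remaining folds form the training set."""
    folds = [list(range(i, n, k)) for i in range(k)]     # interleaved folds
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

# toy example: 10 samples, 5 folds of 2 samples each
splits = list(kfold_indices(10, 5))
```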
5.2.1 K-Nearest Neighbors Classifier
In pattern recognition, KNN is a method for classifying objects based on the closest
training examples in the feature space.
KNN is a type of lazy learning where the function is only approximated locally and
all computation is deferred until classification.
The simplest of all machine learning algorithms: an object is classified by a majority
vote of its neighbors, with the object being assigned to the class most common
amongst its k nearest neighbors. k is a positive integer, typically small.
If k = 1, then the object is simply assigned to the class of its nearest neighbor.
K-NN assumes that the data lie in a feature space. The data points can be scalars or
vectors; since the points are in a feature space, they have a notion of distance.
Given an m-by-n data matrix X, which is treated as m (1-by-n) row vectors x1, x2,
..., xm, the various distances between the vectors xs and xt are defined as follows:
o Euclidean distance:
d_st = sqrt( (x_s - x_t)(x_s - x_t)' )
o City block metric:
d_st = sum_j | x_sj - x_tj |
o Cosine distance:
d_st = 1 - (x_s x_t') / ( sqrt(x_s x_s') sqrt(x_t x_t') )
o Correlation distance:
d_st = 1 - ( (x_s - mean(x_s))(x_t - mean(x_t))' ) /
( sqrt((x_s - mean(x_s))(x_s - mean(x_s))') sqrt((x_t - mean(x_t))(x_t - mean(x_t))') )
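The four distance measures, and the majority vote itself, can be sketched in Python (a minimal illustration; the thesis experiments would use a full feature matrix rather than these toy vectors):

```python
import math

def euclidean(xs, xt):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xs, xt)))

def cityblock(xs, xt):
    return sum(abs(a - b) for a, b in zip(xs, xt))

def cosine_dist(xs, xt):
    dot = sum(a * b for a, b in zip(xs, xt))
    ns = math.sqrt(sum(a * a for a in xs))
    nt = math.sqrt(sum(b * b for b in xt))
    return 1 - dot / (ns * nt)

def correlation_dist(xs, xt):
    ms, mt = sum(xs) / len(xs), sum(xt) / len(xt)
    ds = [a - ms for a in xs]
    dt = [b - mt for b in xt]
    num = sum(a * b for a, b in zip(ds, dt))
    return 1 - num / (math.sqrt(sum(a * a for a in ds)) *
                      math.sqrt(sum(b * b for b in dt)))

def knn_predict(train, labels, x, k=3, dist=euclidean):
    """Majority vote among the k nearest training samples."""
    nearest = sorted(range(len(train)), key=lambda i: dist(train[i], x))[:k]
    votes = [labels[i] for i in nearest]
    return max(set(votes), key=votes.count)
```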
Table 5.1 shows the accuracy for numerals, characters, and combined numerals and
characters for the different feature extraction methods using the KNN classifier.
Feature Extraction Method                  Numerals    Characters    Mix
Pixel Count Intensity                      97.18%      78.12%        80.65%
Gradient based                             98.14%      89.06%        89.81%
Object Geometry based                      90.02%      67.25%        71.38%
Object Profile                             95.82%      76.71%        79.19%
Center Symmetric Local Binary Patterns     97.92%      87.65%        89.38%
Local Binary Pattern                       97.92%      88.12%        88.97%
Wavelet transform                          97.86%      81.40%        84.02%
Table 5.1 KNN Classifier Accuracy for Different Feature Extraction Method
Figure 5.8 shows the accuracy of the hybrid feature extraction methods. By concatenating
CSLBP and gradient features, character recognition accuracy reaches up to 92.37%.
Figure 5.8 Hybrid feature extraction methods for characters
Figure 5.9 shows the accuracy of the hybrid feature extraction methods for mixed
numerals and characters. By concatenating CSLBP and gradient features, the recognition
accuracy reaches up to 92.64%.
Figure 5.9 Hybrid feature extraction methods for mixed numerals and characters
5.2.2 Support Vector Machine
Support vector machines (SVMs) work extremely well even when there is little prior
knowledge about the data.
They work with unstructured and semi-structured data such as images, text and trees.
The kernel trick is the main strength of SVM: with a suitable kernel function, it is
possible to deal with many kinds of complex problems.
In contrast to neural networks, SVM does not suffer from local optima.
It scales relatively well to high-dimensional data.
SVM often gives better results than a shallow ANN.
SVM requires a good kernel selection and a large dataset, and its training time is
longer than that of other classifiers.
Common kernels
o Linear: K(x,z) = x^T z
o Quadratic: K(x,z) = (1 + x^T z)^2
o Polynomial: K(x,z) = (1 + x^T z)^d
o RBF: K(x,z) = exp(-||x - z||^2)
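The four kernels can be written directly from the formulas above (the RBF width parameter gamma is an added generalization; the formula in the text corresponds to gamma = 1):

```python
import math

def linear(x, z):
    """K(x,z) = x^T z"""
    return sum(a * b for a, b in zip(x, z))

def quadratic(x, z):
    """K(x,z) = (1 + x^T z)^2"""
    return (1 + linear(x, z)) ** 2

def polynomial(x, z, d=3):
    """K(x,z) = (1 + x^T z)^d"""
    return (1 + linear(x, z)) ** d

def rbf(x, z, gamma=1.0):
    """K(x,z) = exp(-gamma * ||x - z||^2); gamma = 1 matches the text."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))
```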
Table 5.2 shows the accuracy for numerals, characters, and combined numerals and
characters for the different feature extraction methods using the SVM classifier.
Feature Extraction Method                  Numerals    Characters    Mix
Pixel Count Intensity                      96.90%      76.28%        81.05%
Gradient based                             98.72%      89.57%        92.10%
Object Geometry based                      90.70%      67.59%        70.56%
Object Profile                             93.50%      64.45%        70.08%
Local Binary Pattern                       97.50%      85.99%        88.26%
Center Symmetric Local Binary Patterns     95.82%      83.30%        86.21%
Wavelet transform                          97.40%      84.93%        86.68%
Table 5.2 SVM Classifier Accuracy for Different Feature Extraction Method
Figure 5.10 shows the accuracy of the hybrid feature extraction methods. By
concatenating wavelet and gradient features, character recognition accuracy reaches up
to 92.21%.
Figure 5.10 Hybrid feature extraction methods for characters
Figure 5.11 shows the accuracy of the hybrid feature extraction methods for mixed
numerals and characters. By concatenating wavelet and gradient features, the recognition
accuracy reaches up to 92.93%.
Figure 5.11 Hybrid feature extraction methods for mixed numerals and characters
5.3 Deep Learning:
In deep learning, a computer model learns to perform classification tasks directly
from images, text, or sound.
Deep learning models can achieve state-of-the-art accuracy, sometimes exceeding
human-level performance.
Models are trained by using a large set of labeled data and neural network
architectures that contain many layers.
Deep learning requires substantial computing power. High-performance GPUs have a
parallel architecture that is efficient for deep learning.
Most deep learning methods use neural network architectures, which is why deep
learning models are often referred to as deep neural networks.
5.3.1 Convolutional Neural Network (CNN)
The CNN is the most popular deep learning algorithm for images; it is composed of an
input layer, an output layer, and many hidden layers in between.
A CNN usually consists of
o Convolution Layer:
The convolutional layer extracts features from the image: different filters with
different weights are applied to the input image, and the outcome is a set of
two-dimensional feature maps.
o Pooling Layer or Sub-Sampling:
The pooling layer operates on each feature map. It decreases the spatial
dimensions but not the depth of the feature maps, and it trims down the number
of parameters and the amount of computation in the network.
o Fully Connected Layer (Classification):
The fully connected layer maps the features obtained by the previous layers onto
the number of classes.
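The sub-sampling step can be sketched as a 2 X 2 max-pooling pass (max pooling is an assumption; average pooling is an equally common choice, and the pooling size used in the thesis is not stated):

```python
def maxpool2x2(fmap):
    """2x2 max pooling with stride 2: halves the height and width of one
    feature map; applied independently per map, so depth is unchanged."""
    h, w = len(fmap), len(fmap[0])
    return [[max(fmap[r][c], fmap[r][c + 1],
                 fmap[r + 1][c], fmap[r + 1][c + 1])
             for c in range(0, w, 2)]
            for r in range(0, h, 2)]

# toy 4x4 feature map -> 2x2 pooled map
pooled = maxpool2x2([[1, 2, 5, 6],
                     [3, 4, 7, 8],
                     [1, 1, 1, 1],
                     [2, 2, 2, 9]])
```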
Figure 5.12 Architecture of CNN
Figure 5.13 Implementation of CNN
Figure 5.12 shows the CNN architecture, which has two convolutional layers, two pooling
layers and one fully connected layer. It also contains two ReLU layers, which increase the
non-linearity by replacing all negative values with zero, and a Softmax layer, which
highlights the largest value and suppresses values significantly below the maximum.
The implementation of the CNN network is shown in figure 5.13: the center blocks of the
figure show the sequence of layers, the left side shows the size of the feature map of each
layer, and the right side shows the number of feature maps with their sizes. The output of
each layer is also shown in figure 5.13.
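The ReLU and Softmax operations described above can be sketched directly (the max-subtraction inside softmax is a standard numerical-stability detail, not something stated in the text):

```python
import math

def relu(feature_map):
    """ReLU: replace every negative activation with zero."""
    return [[max(0, v) for v in row] for row in feature_map]

def softmax(scores):
    """Softmax: emphasise the largest score and suppress the rest;
    the outputs are positive and sum to 1."""
    m = max(scores)                         # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```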
Table 5.3 shows the accuracy for numerals and characters for input sizes 64 X 64 and
96 X 96.
CNN Layers (No. of Filters)                  % Accuracy for Numerals    % Accuracy for Characters
                                             64X64 input  96X96 input   64X64 input  96X96 input
Conv layer 1: 20, Conv layer 2: 40           80.70        94.40         70.15        74.15
Conv layer 1: 40, Conv layer 2: 80           88.25        95.10         72.00        76.25
Table 5.3 Proposed CNN Architecture Accuracy for Numerals and Characters
5.3.2 Pretrained Network Approach
A pretrained network is a network previously trained on a large standard dataset for a
problem similar to the one we want to solve. It already knows how to extract features
that are informative and powerful. More than a million images are used to train this type
of network, and its output is classified into approximately 1000 classes.
Figure 5.14 Pretrained Network Approach
Pretrained networks such as AlexNet, VGG16, VGG19, GoogLeNet, ResNet18, ResNet50
and ShuffleNet can be used for a new task either for feature extraction only or in a
transfer learning approach, as shown in figure 5.14. AlexNet is used here as the
pretrained network; it contains 25 layers, was trained on the ImageNet database, and
classifies images into 1000 classes such as different animals, mouse, pencil, cup and
ambulance.
A. Pretrained network as feature extractor
In this approach, the AlexNet model is used for feature extraction and the extracted
features are given to an SVM classifier for training and testing. Features are extracted
using the first 20 layers of AlexNet, up to layer ‘fc7’. The recognition accuracy for the
numeral, character and mixed databases is shown in table 5.4.
B. Transfer Learning Approach
In this approach, all layers of the pretrained AlexNet are used except the last three. The
new task is carried out by replacing those last three layers with a new fully connected
layer, a Softmax layer and a classification output layer. The recognition accuracy for the
numeral, character and mixed databases for this approach is shown in table 5.4.
Database      SVM Classification Approach    Transfer Learning Approach
Numerals      97.40%                         99.30%
Characters    86.50%                         97.65%
Mixed         89.33%                         97.73%
Table 5.4 Pretrained Network Approach
6. Achievements with respect to objectives
A Gujarati handwritten numeral and character database has been created, with
5000 samples for numerals and 10,000 samples for characters.
Handwritten Gujarati numerals (10 classes in total) are recognized with 98.14%
accuracy using the gradient feature extraction method and a KNN classifier, with
98.72% accuracy using gradient features and an SVM classifier, and with 99.30%
accuracy using the transfer learning approach in deep learning.
Handwritten Gujarati characters (40 classes in total) are recognized with 92.37%
accuracy using the CSLBP + gradient hybrid feature extraction method and a KNN
classifier, with 92.21% accuracy using wavelet + gradient hybrid features and an
SVM classifier, and with 97.65% accuracy using the transfer learning approach in
deep learning.
Handwritten Gujarati numerals and characters combined (48 classes in total) are
recognized with 92.64% accuracy using the CSLBP + gradient hybrid feature
extraction method and a KNN classifier, with 92.93% accuracy using wavelet +
gradient hybrid features and an SVM classifier, and with 97.73% accuracy using the
transfer learning approach in deep learning.
Three applications are implemented:
1) Handwritten Gujarati numeral to speech conversion.
2) Handwritten Gujarati character to speech conversion.
3) Automatic handwritten marks recognition.
7. Conclusion:
The handwritten Gujarati numeral recognition algorithm was successfully developed
using a large number (5000) of test images, with an accuracy of 99.30%.
The handwritten Gujarati character recognition algorithm was successfully developed
using a large number (10,000) of test images, with an accuracy of 97.65%.
The handwritten Gujarati numeral and character recognition algorithm was
successfully developed using a large number (15,000) of test images, with an accuracy
of 97.73%.
8. List of publications
1) Mikita Gandhi, V. K. Thakar, H. N. Patel, “Handwritten Gujarati Numeral Recognition
using Wavelet Transform”, Journal of Applied Science and Computation (JASC),
Volume VI, Issue IV, 2019.
2) Mikita Gandhi, V. K. Thakar, H. N. Patel, “Gujarati Handwritten Character Recognition
Using Convolutional Neural Network”, Journal of Emerging Technologies and
Innovative Research (JETIR), Volume VI, Issue V, May 2019.
References
1) Antani S, Agnihotri L “Gujarati character recognition”, In: Proceedings of fifth
international conference on document analysis and recognition, 1999 (ICDAR’99), pp
418–421
2) Shah SK, Sharma A, “Design and implementation of optical character recognition
system to recognize Gujarati script using template matching”, J. Inst Eng (India)
Electron Telecommunication Eng.,2006.
3) Ankit K. Sharma, Dipak M. Adhyaru, Tanish H. Zaveri, Priyank B Thakkar,
“Comparative analysis of zoning based methods for Gujarati handwritten numeral
recognition”, 5th Nirma University International Conference on Engineering
(NUiCONE),IEEE 2015
4) Vyas, A. N., Goswami, M. M., “Classification of hand written Gujarati numerals”,
IEEE transactions on pattern analysis and machine intelligence, pp.1231- 1237,2015
5) A. Desai, “Support vector machine for identification of handwritten Gujarati
alphabets using hybrid feature space”, CSIT, Springer, January 2015.
6) Dholakia J, Negi A, Rama Mohan S, “Zone identification in the printed Gujarati text”,
In: Proceedings of the eight international conference on document analysis and
recognition, 2005 (ICDAR’05).
7) Swital J. Macwan, Archana N. Vyas, "Classification of Offline Gujarati Handwritten
Characters", International Conference on Advances in Computing, Communications
and Informatics (ICACCI), 2015
8) Dr. Dinesh Satange, Dr. P E Ajmire, Fozia I. Khandwani, “Offline Handwritten
Gujrati Numeral Recognition Using MLP Classifier”, International Journal of Novel
Research and Development, Volume 3, Issue 8 August 2018
9) Sekhar Mandal, Sanjib Sur, Avishek Dan, “Handwritten Bangla Character Recognition
in Machine-printed Forms using Gradient Information and Haar Wavelet”, 2011
International Conference on Image Information Processing.
10) Ashutosh Aggarwal, Rajneesh Rani, Renu Dhir, "Handwritten Devanagari Character
Recognition Using Gradient Features", International Journal of Advanced Research in
Computer Science and Software Engineering (ISSN: 2277-128X), Vol. 2, Issue 5, pp.
85- 90, May 2012
11) T. Hassan, H. Khan, “Handwritten Bangla Numeral Recognition using Local Binary
Pattern”, 2nd Int'l Conf. on Electrical Engineering and Information & Communication
Technology (ICEEICT), 2015.
12) M. Pietikäinen, A. Hadid, G. Zhao, T. Ahonen (2011), ‘Local Binary Patterns for Still
Images, Computer Vision Using Local Binary Patterns’, Chapter 2, Computational
Imaging and Vision 40, Springer-Verlag London Limited, pp 13 – 47.
13) Sekhar Mandal, Sanjib Sur, Avishek Dan, “Handwritten Bangla Character Recognition
in Machine-printed Forms using Gradient Information and Haar Wavelet”, 2011
International Conference on Image Information Processing.
14) Saleem Pasha, M.C.Padma, “Handwritten Kannada Character Recognition using
Wavelet Transform and Structural Features” International Conference on Emerging
Research in Electronics, Computer Science and Technology – 2015
15) Sen S., Shaoo D., Paul S., Sarkar R., Roy K., “Online Handwritten Bangla Character
Recognition Using CNN: A Deep Learning Approach”, In: Bhateja V., Coello Coello
C., Satapathy S., Pattnaik P. (eds) Intelligent Engineering Informatics. Advances in
Intelligent Systems and Computing, vol. 695. Springer, Singapore, 2018.
16) Alom, M.Z., Sidike, P., Taha, T.M., Asari, V.K., “Handwritten Bangla Digit
Recognition Using Deep Learning”, Neural Processing Letters, vol. 45, pp. 703–725,
2017.
17) C. Boufenar, A. Kerboua, M. Batouche, "Investigation on deep learning for off-line
handwritten Arabic character recognition", Cogn. Syst. Res., 2017.
18) M. Blumenstein, B. K. Verma and H. Basli, “A Novel Feature Extraction Technique
for the Recognition of Segmented Handwritten Characters”, 7th International
Conference on Document Analysis and Recognition (ICDAR ’03), Edinburgh,
Scotland, pp. 137–141, 2003.
19) Patel CN, Desai AA, “Segmentation of text lines into words for Gujarati handwritten
text”, In: Proceedings of the international conference on signal and image processing,
2010 (ICSIP’10), IEEE Xplore, 15–17.
20) Dapeng Tao, Xu Lin, Lianwen Jin, “Principal Component 2-D Long Short-Term
Memory for Font Recognition on Single Chinese Characters”, IEEE Transactions on
Cybernetics, Vol. 46, No. 3, March 2016.
21) Devendra K. Sahu and C. V. Jawahar, “Unsupervised Feature Learning for Optical
Character Recognition”, In: Proceedings of the 13th International Conference on
Document Analysis and Recognition, 2015 (ICDAR’15).
22) Mohamed Dahi, Noura A. Semary, and Mohiy M. Hadhoud, “Primitive Printed
Arabic Optical Character Recognition using Statistical Features”, IEEE Seventh
International Conference on Intelligent Computing and Information Systems, 2015
(ICICIS'15).
23) Tanzila Saba, “Language Independent Rule Based Classification of Printed &
Handwritten Text”, IEEE International Conference on Evolving and Adaptive
Intelligent Systems (EAIS), December 2015.
24) Abdeljalil Gattal, “Segmentation-Verification Based on Fuzzy Integral for Connected
Handwritten Digit Recognition”, IEEE International Conference on Image Processing
Theory, Tools and Applications, 2015.
25) Patel CN, Desai AA (2013), “Gujarati handwritten character recognition using hybrid
method based on binary tree-classifier and k-nearest neighbour”, Int J Eng Res
Technology, II(6): 2337–2345.
26) D. Bradley and G. Roth, “Adaptive thresholding using the integral image”, Journal of
Graphics Tools, Vol. 12, No. 2, pp. 13–21, Jun 2007.
27) Wojciech Bieniecki, Szymon Grabowski and Wojciech Rozenberg, “Image
Preprocessing for Improving OCR Accuracy”, International Conference on
Perspective Technologies and Methods in MEMS Design, MEMSTECH 2007, pp.
75–80, 23-26 May 2007.
28) Luis R. Blando, Junichi Kanai, and Thomas A. Nartker, “Prediction of OCR
Accuracy Using Simple Image Features”, IEEE Proceedings of the Third International
Conference on Document Analysis and Recognition, Vol. 1, pp. 319–322, 14-16 Aug
1995.
29) Chinmay Chinara, Nishant Nath, Subhajeet Mishra, “A Novel Approach to Skew-
Detection and Correction of English Alphabets for OCR”, IEEE Student Conference
on Research and Development (SCOReD), pp. 241–244, 5-6 Dec. 2012.
30) Xiaoling Fu, Yazhuo Xu, Lijing Tong, “Document Image Skew Adjusting Based on
the Feedback Information Recognized By OCR”, IEEE 3rd International Conference
on Communication Software and Networks (ICCSN), pp. 376–378, 27-29 May
2011.
31) E. Kavallieratou, N. Fakotakis and G. Kokkinakis, “Handwritten Character Recognition
based on Structural Characteristics”, IEEE 16th International Conference on Pattern
Recognition, 2002. Proceedings, Vol. 3, pp. 139–142.
32) Hanchuan Peng, Fuhui Long, Zheru Chi, “Document Image Recognition Based on
Template Matching of Component Block Projections”, IEEE Transactions on Pattern
Analysis and Machine Intelligence, Vol. 25, No. 9, pp. 1188–1192, September 2003.
33) Pepe Siy, C. S. Chen, “Fuzzy Logic for Handwritten Numeral Character
Recognition”, IEEE Transactions on Systems, Man and Cybernetics,
Vol. 4, No. 6, pp. 570–575.
34) Salvador España-Boquera, Maria Jose Castro-Bleda, Jorge Gorbe-Moya, and
Francisco Zamora-Martinez, “Improving Offline Handwritten Text Recognition with
Hybrid HMM/ANN Models”, IEEE Transactions on Pattern Analysis and Machine
Intelligence, Vol. 33, No. 4, pp. 767–779, April 2011.
35) D. Bradley and G. Roth, “Adaptive thresholding using the integral image”, Journal of
Graphics Tools, Vol. 12, No. 2, pp. 13–21, Jun 2007.
36) Koga, M., “Camera-based Kanji OCR for mobile-phones: practical issues”, Eighth
IEEE International Conference on Document Analysis and Recognition, Vol. 2, pp.
635–639, 29 Aug.-1 Sept. 2005.
37) Lund, W.B., “Error Correction with In-domain Training across Multiple OCR System
Outputs”, IEEE International Conference on Document Analysis and Recognition
(ICDAR), pp. 658–662, 18-21 Sept. 2011.
38) Bhattacharya, U., “Handwritten Numeral Databases of Indian Scripts and Multistage
Recognition of Mixed Numerals”, IEEE Transactions on Pattern Analysis and
Machine Intelligence, Vol. 31, No. 3, pp. 444–457, March 2009.
39) Kavallieratou, E., “New algorithms for skewing correction and slant removal on word-
level [OCR]”, The 6th IEEE International Conference on Electronics, Circuits and
Systems, Vol. 2, pp. 1159–1162, 5-8 Sep 1999.
40) Nikhil Pai, Vijaykumar S. Kolkure, “Optical Character Recognition: An
Encompassing Review”, International Journal of Research in Engineering and
Technology, Vol. 04, Issue 01, Jan 2015.
41) J. Mantas, "An overview of character recognition methodologies”, Pattern
Recognition, vol. 19, no. 6, pp. 425–430, 1986.
42) Réjean Plamondon, Fellow, IEEE, and Sargur N. Srihari, Fellow, IEEE, “On-Line and
Off-Line Handwriting Recognition: A Comprehensive Survey”, IEEE Transactions
on Pattern Analysis and Machine Intelligence, Vol. 22, No. 1, January 2000.
43) Amritha Sampath, Tripti C, Govindaru V, "Freeman code based online handwritten
character recognition for Malayalam using back propagation neural networks",
International Journal on Advanced Computing, Vol. 3, No. 4, pp. 51–58, July 2012.
44) Pradeep, E. Shrinivasan and S. Himavathi, "Diagonal Based Feature Extraction for
Handwritten Alphabets Recognition System Using Neural Network", International
Journal of Computer Science & Information Technology (IJCSIT), Vol. 3, No. 1, Feb
2011.
45) Om Prakash Sharma, M. K. Ghose, Krishna Bikram Shah, "An Improved Zone Based
Hybrid Feature Extraction Model for Handwritten Alphabets Recognition Using Euler
Number", International Journal of Soft Computing and Engineering (ISSN: 2231-
2307), Vol. 2, Issue 2, pp. 504–508, May 2012.
46) He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image
recognition. In Proceedings of the IEEE conference on computer vision and pattern
recognition (pp. 770–778).
47) Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with
deep convolutional neural networks. In Advances in neural information processing
systems (pp. 1097–1105).
48) R. C. Gonzalez and R. E. Woods, Digital Image Processing, 3rd edition, Prentice Hall.
49) www.mathworks.com