Investigation on Algorithm for Handwritten Gujarati OCR · 99.30% for Numeral, 92.37%, 92.21% and...

Investigation on Algorithm for Handwritten Gujarati OCR Ph.D. Synopsis Submitted To Gujarat Technological University For the Degree of Doctor of Philosophy in Electronics and Communication Engineering By Mikita R. Gandhi Enrollment No: 149997111007 Supervisor: Dr. Vishvjit Thakar Associate Professor and Head Information and Communication Technology, Sankalchand Patel University, Visnagar. Co-Supervisor: Dr. Hetal N. Patel Professor & Head Electronics and Communication Department, A.D.Patel Institute of Technology, New V. V. Nagar.


  • Investigation on Algorithm for Handwritten Gujarati OCR

    Ph.D. Synopsis

    Submitted To

    Gujarat Technological University

    For the Degree

    of

    Doctor of Philosophy

    in

    Electronics and Communication Engineering

    By

    Mikita R. Gandhi

    Enrollment No: 149997111007

    Supervisor:

    Dr. Vishvjit Thakar

    Associate Professor and Head

    Information and Communication Technology,

    Sankalchand Patel University,

    Visnagar.

    Co-Supervisor:

    Dr. Hetal N. Patel

    Professor & Head

    Electronics and Communication Department,

    A.D.Patel Institute of Technology,

    New V. V. Nagar.

  • Table of Contents

    1. Title of the Thesis and Abstract............................................................................. 1

    2. Brief description on the state of the art of the research topic................................ 2

    3. Objective and Scope of the work........................................................................... 4

    4. Original contribution by the thesis......................................................................... 4

    5. Methodology of Research, Results / Comparisons................................................ 4

    6. Achievements with respect to objectives............................................................... 17

    7. Conclusions............................................................................................................ 18

    8. List of publication arising from the thesis............................................................. 19

    9. References.............................................................................................................. 20

  • 1

    1. Title of the Thesis and Abstract

    1.1 Title of the Thesis:

    Investigation on Algorithm for Handwritten Gujarati OCR

    1.2 Abstract:

    Optical Character Recognition (OCR) is attracting growing attention because it lets a computer learn and recognize regional languages, and success here opens up many new applications. Machine-printed characters can already be recognized accurately, which has solved many problems and led to commercial products in routine use. Recognition of handwritten characters is much harder, and methods for recognizing handwritten documents remain a subject of active research. No single algorithm suits all Indian languages, because each language has its own features and restrictions. In Gujarat state, Gujarati is the language of commerce, and most communication in Government offices, schools and the private sector is done in Gujarati. A handwritten Gujarati OCR system was developed for reading handwritten amounts on cheques, automatic reading of marks from answer sheets and a learning application for the education system. The research work is mainly focused on implementing a robust algorithm for handwritten Gujarati OCR.

    The KNN and SVM classifiers were applied to different feature extraction methods: pixel count ratio, object gradient, geometry, profile, local binary pattern, center-symmetric local binary pattern and wavelet transform. Hybrid feature extraction methods were then used to increase the character recognition performance. A further novel approach, automated feature extraction, was implemented using deep learning, with the extracted features given to an SVM for handwritten character classification. To increase the recognition rate further, a pretrained deep neural network (AlexNet) was used, and three applications were implemented: handwritten Gujarati numeral to speech conversion, character to speech conversion and automatic handwritten marks recognition. KNN, SVM and deep neural networks give recognition accuracies of 98.14%, 98.72% and 99.30% for numerals; 92.37%, 92.21% and 97.65% for characters; and 92.64%, 92.93% and 97.73% for combined numerals and characters, respectively.

  • 2

    2. Brief descriptions on the state of the art of the research topic

    As the world moves closer to the concept of the “paperless office,” more and more communication and storage of documents is performed digitally. Documents and files that were once stored physically on paper are now being converted into electronic form to make additions, searches and modifications quicker, and this also prolongs the life of such documents. Advances in character recognition have largely been limited to English characters, both machine-printed and handwritten. Character recognition for Indian languages can help authors, novelists and many others to recognize Indian characters and even to extract text from old heritage documents. Very little research has been reported on handwritten character recognition for Indian languages in general and for Gujarati in particular. In Gujarat State, all Government agency documents are written in Gujarati. Software is available for printed Gujarati OCR, but recognition of handwritten characters is still a challenging task.

    The basic block diagram of an OCR system is shown in figure 2.1. There are five major stages: preprocessing, segmentation, representation, training and recognition, and post processing.

    Figure 2.1 Basic block diagram

    Preprocessing makes the raw data usable in the later stages of character analysis through operations such as smoothing, sharpening, binarizing the image, removing the background and extracting the required information. Segmentation splits the document into separate characters by first segmenting it into lines, then the lines into words and finally the words into individual characters, which are passed to the classifier. In the representation stage, a set of features is extracted to distinguish one class of images from another. Classifiers such as KNN, SVM, neural networks and deep learning are used for training and recognition.

    Work on Gujarati OCR was initiated by Sameer Antani et al. [1] on printed Gujarati script. KNN and Hamming distance classifiers were applied to 15 characters, with 30 samples for each character, and gave 67% and 41.33% accuracy respectively. Using template matching and wavelet transform coefficients [2], Shah S. K. et al. attained 72.30% accuracy for printed Gujarati OCR. Ankit K. Sharma et al. [3] worked on the zoning method and, using a multilayer feed-forward neural network classifier, achieved 95.92% accuracy for handwritten Gujarati numerals, and Archana Vyas et al. [4] obtained 96.99% accuracy using KNN. Using a hybrid feature space and an SVM classifier, A. Desai [5] recognized forty handwritten Gujarati characters with 86.66% accuracy. The zonal boundary was successfully detected by Jignesh Dholakia et al. [6] using the zoning method. Swital J. Macwan et al. [7] applied the discrete wavelet transform to handwritten Gujarati characters and obtained 89.46% accuracy. V. A. Naik et al. used different structural and statistical features for recognition of handwritten numerals and achieved 95% accuracy [8], and Dinesh Satange et al. obtained 90% accuracy using a multilayer perceptron [9].

    Ashutosh Aggarwal et al. [10] worked on gradient-based feature extraction and an SVM classifier for Hindi handwritten character recognition. LBP features were used for Bangla digit recognition in 2015 [11], achieving 96.7% accuracy with a KNN classifier; the same LBP features were applied to Persian/Arabic handwritten digit recognition [12]. Sekhar Mandal et al. [13] proposed an algorithm for machine-printed character recognition in Bangla using the two-dimensional wavelet transform and gradient information. Saleem Pasha et al. addressed handwritten recognition for Kannada using statistical features and the wavelet transform [14].

    A two-stage CNN was used by Shibaprasad Sen et al. [15] for online Bengali handwritten character recognition and gave 99.40% accuracy. Akm Ashiquzzaman et al. worked on a CNN architecture of 10 different layers for Arabic handwritten digits and achieved 97.4% accuracy [16]. Chaouki Boufenar et al. presented three deep learning approaches for handwritten Arabic character recognition: (i) training from scratch, (ii) transfer learning and (iii) fine-tuning [17].

    Table 2.1 shows the Gujarati Numerals and Characters used for research work.

    Numerals

    ૦ ૧ ૨ ૩ ૪ ૫ ૬ ૭ ૮ ૯

    Characters

    ક ખ ગ ઘ ચ છ જ ઝ ટ ઠ ડ ઢ ણ ત થ દ ધ ન પ ફ

    બ ભ મ ય ર લ વ શ ષ સ હ ળ ક્ષ જ્ઞ શ્ર અ ઋ ઈ ઉ ઊ

    Table 2.1 Gujarati Numerals and Characters

  • 4

    3. Objective and Scope of work

    3.1 Objective:

    To develop feature extraction algorithms for handwritten Gujarati OCR to recognize numerals and characters.

    To design an Optical Character Recognition system for handwritten Gujarati

    Numerals.

    To design an Optical Character Recognition system for handwritten Gujarati

    Characters.

    To design an Optical Character Recognition system for combined handwritten

    Gujarati Numerals and Characters.

    3.2 Scope of work:

    The research work is useful for automatic detection of amounts written in Gujarati on bank cheques and of marks written on answer sheets, and for Gujarati numeral and character learning applications.

    Handwritten Gujarati numeral and character to speech conversion.

    The implemented algorithms can be useful for recognition of Gujarati text modifiers.

    4. Original contribution by the thesis

    Development of different feature extraction methods, along with creation of a database of handwritten Gujarati numerals and characters.

    Three different classification methods, KNN, SVM and deep learning, were used for recognition.

    Hybrid features were used with the three classification methods listed above.

    The transfer learning approach of deep learning is used for better accuracy.

    Three applications are developed:

    Gujarati handwritten Numeral to speech conversion

    Gujarati handwritten Character to speech conversion

    Automatic Handwritten Marks recognition.

    5. Methodology of research, results and comparisons

    Since no standard database is available for handwritten Gujarati OCR, a database was created for the research work, containing 5000 samples of numerals and 10,000 samples of characters collected from people of different ages.

  • 5

    5.1 Feature Extraction Method:

    Method 1: Pixel Count Ratio

    The image is resized to a 64 × 64 matrix and some morphological operations are applied to it. The image is then divided into an 8 × 8 grid of zones, each 8 × 8 pixels, which gives 64 zones in total. From each zone, the ratio of the number of white pixels to the number of black pixels is taken as a feature, so 64 features are generated in total.

    Figure 5.1 shows the image of Gujarati numeral ‘1’ and its 64 different zones.

    Figure 5.1 Pixel count ratio
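    The zoning step of Method 1 can be sketched in a few lines of Python. This is an illustrative sketch rather than the implementation used in the research work: the function name, the list-of-rows image format and the handling of a zone with no black pixels (the ratio is capped at the zone area) are assumptions, and the morphological operations are omitted.

```python
def pixel_count_ratio_features(image, zone=8):
    """Return one white/black pixel-count ratio per zone of a binary image.

    `image` is a square list of rows holding 0 (black) or 1 (white).
    """
    size = len(image)
    features = []
    for top in range(0, size, zone):
        for left in range(0, size, zone):
            white = sum(
                image[r][c]
                for r in range(top, top + zone)
                for c in range(left, left + zone)
            )
            black = zone * zone - white
            # Assumption: an all-white zone would divide by zero, so the
            # ratio is capped at the zone area instead.
            features.append(white / black if black else float(zone * zone))
    return features


# A 64 x 64 test image with a vertical white stroke 4 pixels wide.
image = [[1 if 30 <= c < 34 else 0 for c in range(64)] for r in range(64)]
features = pixel_count_ratio_features(image)
print(len(features))  # one feature per zone
```

    Zones that the stroke does not touch give a ratio of 0, so the feature vector mainly encodes where the strokes of the numeral lie.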

    Method 2: Object Gradient

    The gradient magnitude and gradient direction are used as features. First the image is divided into 9 sub-images. A code is assigned to each 30° span of direction, so 12 codes are assigned to each sub-image and a total of 12 × 9 = 108 features are obtained from a single image.

    Figure 5.2 shows the image of Gujarati numeral ‘0’ and its 9 different zones. For each sub-image the gradient magnitude and gradient direction are computed. The gradient direction of each white pixel is then examined to determine which span it lies in, and the code of that span is assigned to the pixel.

    Figure 5.2 Object Gradient
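    A minimal sketch of the direction coding is given below. Several details are assumptions, since the text does not specify them: the gradient is estimated with central differences, pixels with a zero gradient are skipped, and each feature is taken to be the number of white pixels in a sub-image that received a given code. The function name is hypothetical.

```python
import math


def gradient_code_features(image, grid=3, span_deg=30):
    """Count 30-degree gradient-direction codes in each of grid x grid sub-images."""
    h, w = len(image), len(image[0])
    n_codes = 360 // span_deg                        # 12 codes
    features = [0] * (grid * grid * n_codes)         # 9 x 12 = 108 features
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            if not image[r][c]:
                continue                             # only white (object) pixels
            gx = image[r][c + 1] - image[r][c - 1]   # central differences
            gy = image[r + 1][c] - image[r - 1][c]
            if gx == 0 and gy == 0:
                continue                             # direction is undefined
            angle = math.degrees(math.atan2(gy, gx)) % 360
            code = int(angle // span_deg) % n_codes  # which 30-degree span
            zone = (r * grid // h) * grid + (c * grid // w)
            features[zone * n_codes + code] += 1
    return features


# A 9 x 9 image with a 3 x 3 white square in the central sub-image.
image = [[1 if 3 <= r <= 5 and 3 <= c <= 5 else 0 for c in range(9)] for r in range(9)]
features = gradient_code_features(image)
print(len(features))
```

    In this example only the eight border pixels of the square have a non-zero gradient, and all of them fall in the central sub-image.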

  • 6

    Method 3: Object Geometry

    In this method [18], the geometry of the object is used as features: horizontal, vertical, right diagonal and left diagonal lines, area and Euler number. For feature extraction the image is divided into 9 sub-images. In each sub-image, the starting point and intersection points are found first, and then the number of line segments in each direction (horizontal, vertical, right diagonal and left diagonal) is counted. The first four features of each sub-image are the values of these lines, computed by equation (1).

    Value = 1 - ((number of lines / 10) * 2)    (1)

    The next four features are computed from the lengths of the lines. If a particular line is not present, the value is taken as -1; otherwise the normalized length is taken as the feature value. The last feature is the area of the sub-image. This gives 9 features from each sub-image × 9 sub-images = 81 features, and the Euler number of the image is taken as one more feature, so 82 features are computed in total.

    Figure 5.3 Object Geometry

    Figure 5.3 shows one of the sub-images of digit ‘0’. Its 9 features are calculated as below:

    No. of segments: 3

    o No. of horizontal lines: 0
    o No. of vertical lines: 0
    o No. of right diagonal lines: 2
    o No. of left diagonal lines: 1

    Value is calculated by:

    Value = 1 - ((number of lines / 10) * 2)

    So the first four features are:

    o Value of horizontal lines: 1
    o Value of vertical lines: 1
    o Value of right diagonal lines: 0.6
    o Value of left diagonal lines: 0.8

    The next 4 features are calculated as:

    Length = (total no. of pixels in a particular direction) / (total no. of all pixels belonging to skeleton)

    Total no. of all pixels belonging to skeleton: 20

    If there are no pixels in a particular direction, then length = -1.

    o Normalized length of all horizontal lines: -1
    o Normalized length of all vertical lines: -1
    o Normalized length of all right diagonal lines: 12/20 = 0.60
    o Normalized length of all left diagonal lines: 5/20 = 0.25

    The 9th feature from each zone is computed as:

    o Normalized area of the skeleton = (total no. of all pixels belonging to skeleton) / (size of sub-image)
    o Normalized area of the skeleton = 20/289 = 0.0692
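    The arithmetic of this worked example can be checked with the short script below. The line counts, pixel counts and sub-image size are the ones given above; the function and variable names are illustrative only, and the line detection itself is not shown.

```python
def line_count_value(n_lines):
    """Equation (1): map a line count to a feature value."""
    return 1 - (n_lines / 10) * 2


def normalized_length(pixels_in_direction, skeleton_pixels):
    """Share of skeleton pixels lying in one direction, or -1 if there are none."""
    if pixels_in_direction == 0:
        return -1
    return pixels_in_direction / skeleton_pixels


directions = ["horizontal", "vertical", "right_diagonal", "left_diagonal"]
line_counts = {"horizontal": 0, "vertical": 0, "right_diagonal": 2, "left_diagonal": 1}
pixel_counts = {"horizontal": 0, "vertical": 0, "right_diagonal": 12, "left_diagonal": 5}
skeleton_pixels = 20
sub_image_size = 289

features = (
    [line_count_value(line_counts[d]) for d in directions]
    + [normalized_length(pixel_counts[d], skeleton_pixels) for d in directions]
    + [skeleton_pixels / sub_image_size]
)
print([round(f, 4) for f in features])
# [1.0, 1.0, 0.6, 0.8, -1, -1, 0.6, 0.25, 0.0692]
```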

    Method 4: Character Profile

    The horizontal, vertical, right diagonal and left diagonal profiles of the object are used as features. The image is resized to 50 × 50, so a total of 298 features are calculated: 50 horizontal + 50 vertical + 99 right diagonal + 99 left diagonal profile values.

    Figure 5.4 shows the character profile for numeral ‘1’.

    Figure 5.4 Character Profile
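    The feature count of 298 follows from the fact that an n × n image has n rows, n columns and 2n - 1 diagonals in each diagonal direction. The sketch below illustrates this under one assumption that the text does not confirm: a profile value is taken here to be the number of object pixels along a row, column or diagonal. Which diagonal direction is called "right" and which "left" is also an assumption.

```python
def profile_features(image):
    """Row, column and diagonal pixel counts of a square binary image."""
    n = len(image)
    rows = [sum(row) for row in image]
    cols = [sum(image[r][c] for r in range(n)) for c in range(n)]
    right_diag = [0] * (2 * n - 1)   # cells with the same r + c
    left_diag = [0] * (2 * n - 1)    # cells with the same r - c
    for r in range(n):
        for c in range(n):
            right_diag[r + c] += image[r][c]
            left_diag[r - c + n - 1] += image[r][c]
    return rows + cols + right_diag + left_diag


# A 50 x 50 image whose main diagonal is white.
image = [[1 if r == c else 0 for c in range(50)] for r in range(50)]
features = profile_features(image)
print(len(features))  # 50 + 50 + 99 + 99
```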

  • 8

    Method 5: Local Binary Pattern

    Local Binary Patterns (LBP) are widely used as a feature extraction method in face, fingerprint and texture recognition. LBP operates on image pixels and replaces each pixel value with a decimal number. Each central pixel value is compared with its eight neighboring pixels: a neighbor with a smaller value than the central pixel is assigned 0, otherwise it is assigned 1. Taking the top left corner as the first bit and moving clockwise generates an eight-bit binary code, and the central pixel value is replaced by the decimal value of that code. The histogram of these decimal values is used as the feature vector.

    Figure 5.5 shows LBP code generation. The generated binary code is 11000010, which is equivalent to 194 in decimal. To implement rotation-invariant features and reduce the size of the feature vector, uniform LBP is used.

    Figure 5.5 (a) Input image (b) The LBP (8,1) operator (c) LBP coded block

    A local binary pattern is called uniform if its uniformity measure, the number of 0/1 transitions when the bits are read circularly, is at most 2. For example, the patterns 00000000 (0 transitions), 01110000 (2 transitions) and 11001111 (2 transitions) are uniform, while the patterns 11001001 (4 transitions) and 01010011 (6 transitions) are not. In uniform LBP mapping there is a separate output label for each uniform pattern, and all non-uniform patterns are assigned to a single label. There are 58 uniform patterns and 1 label for the non-uniform patterns, so a 59-point feature vector is obtained.

    To obtain uniform LBP features in this research work, the binary image is first converted into a gray image and then divided into blocks of various sizes. For example, with a block size of 12 × 12, 16 blocks are generated and 59 features are obtained from each block, so the total is 16 × 59 + 59 features from the whole image = 1003 features.
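    The code generation and the uniformity test described above can be sketched as follows. The 3 × 3 block is made up for illustration (the pixel values of Figure 5.5 are not reproduced here); it is chosen so that it yields the same code, 11000010. Function names are illustrative.

```python
def lbp_code(block):
    """8-bit LBP code of a 3 x 3 block, top-left neighbour first, clockwise."""
    center = block[1][1]
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    bits = [1 if block[r][c] >= center else 0 for r, c in order]
    return int("".join(map(str, bits)), 2)


def transitions(code):
    """Number of 0/1 changes when the 8 bits are read as a circle."""
    bits = [(code >> i) & 1 for i in range(8)]
    return sum(bits[i] != bits[(i + 1) % 8] for i in range(8))


block = [[6, 7, 2],
         [1, 5, 3],
         [9, 4, 0]]
print(lbp_code(block))            # 194, i.e. binary 11000010

uniform = [code for code in range(256) if transitions(code) <= 2]
print(len(uniform))               # 58 uniform patterns
print(16 * (len(uniform) + 1) + (len(uniform) + 1))   # 1003 features
```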

    Method 6: Center Symmetric Local Binary Pattern.

    Center Symmetric Local Binary Pattern (CSLBP) is an extension of LBP in which the differences between opposite pixel values are taken. If a difference is greater than some threshold value, bit 1 is assigned, otherwise bit 0. There are four opposite pairs, so the code has 4 bits and the CSLBP histogram has 16 points.

  • 9

    Figure 5.6 shows the generation of the CSLBP code. Consider a threshold value of 8. The difference between one pair of opposite pixel values is 80 - 70 = 10, and since 10 is greater than 8 the bit is set to 1. The bits for the other opposite pairs are computed in the same way and the binary code is generated. Here the binary code is 1011 and its decimal equivalent is 11.

    Figure 5.6 Computation of CSLBP (8,1) with threshold = 8

    Like LBP, CSLBP features are obtained by converting the image to gray scale and then dividing it into blocks. The total number of features for a 12 × 12 block size is 16 blocks × 16 features + 16 features of the whole image = 272.
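    A small sketch of the CSLBP computation is given below. Only the pair 80 and 70 and the threshold 8 come from the example above; the remaining pixel values and the order in which the four pairs are read are assumptions, chosen so that the block produces the same code, 1011.

```python
def cslbp_code(block, threshold):
    """4-bit CSLBP code of a 3 x 3 block from its four opposite-pixel pairs."""
    # Assumed pair order: top-left/bottom-right, top/bottom,
    # top-right/bottom-left, right/left.
    pairs = [((0, 0), (2, 2)), ((0, 1), (2, 1)), ((0, 2), (2, 0)), ((1, 2), (1, 0))]
    code = 0
    for (r1, c1), (r2, c2) in pairs:
        bit = 1 if block[r1][c1] - block[r2][c2] > threshold else 0
        code = (code << 1) | bit
    return code


block = [[80, 50, 90],
         [40, 65, 75],
         [60, 48, 70]]
print(cslbp_code(block, threshold=8))   # 11, i.e. binary 1011
print(16 * 16 + 16)                     # 272 features
```

    Because the centre pixel is not used and only four comparisons are made, the histogram is much shorter than that of LBP (16 bins instead of 59 per block).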

    Method 7: Wavelet Transform

    The wavelet transform is commonly used for representing and analyzing images. An image is represented by a two-dimensional matrix; the wavelet transform is applied first row-wise and then column-wise, so the image is divided into four sub-bands, [LL, HL, LH, HH], which give the image's approximation, horizontal detail, vertical detail and diagonal detail. Figure 5.7 shows the first-level wavelet transform. The approximation coefficients are used as features in the research work. The number of features is 256, 64 and 16 for levels 2, 3 and 4 respectively.

    Figure 5.7 2-D Wavelet Transform
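    The feature counts quoted above are consistent with a 64 × 64 input, since each decomposition level halves both dimensions of the approximation band. The sketch below illustrates this with the Haar wavelet; the choice of wavelet and the 64 × 64 input size are assumptions, as the text names neither for this method.

```python
def haar_approximation(image):
    """One level of the 2-D Haar transform, keeping only the LL sub-band."""
    half = len(image) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 2
            for c in range(half)
        ]
        for r in range(half)
    ]


def wavelet_features(image, level):
    """Flattened approximation coefficients after `level` decompositions."""
    for _ in range(level):
        image = haar_approximation(image)
    return [value for row in image for value in row]


image = [[(r + c) % 2 for c in range(64)] for r in range(64)]
counts = {level: len(wavelet_features(image, level)) for level in (2, 3, 4)}
print(counts)   # {2: 256, 3: 64, 4: 16}
```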

  • 10

    5.2 Classification Method

    Classification is the task of assigning an object to one of a set of predetermined classes. Here five-fold cross validation is used to evaluate the classifiers.
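    As an illustration of five-fold cross validation, the sketch below splits sample indices into five folds; each fold is used once for testing while the other four are used for training. The function name is hypothetical, and the sketch does not shuffle or stratify the samples, which a practical experiment would normally do.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train, test) index lists for k roughly equal folds."""
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# With the 5000 numeral samples, each fold tests on 1000 and trains on 4000.
folds = list(k_fold_indices(5000, k=5))
print([(len(train), len(test)) for train, test in folds])
```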

    5.2.1 K-Nearest Neighbors Classifier

    In pattern recognition, KNN is a method for classifying objects based on the closest

    training examples in the feature space.

    KNN is a type of lazy learning where the function is only approximated locally and

    all computation is deferred until classification.

    It is among the simplest of all machine learning algorithms: an object is classified by a majority vote of its neighbors, with the object being assigned to the class most common amongst its k nearest neighbors. k is a positive integer, typically small.

    If k = 1, then the object is simply assigned to the class of its nearest neighbor.

    KNN assumes that the data lie in a feature space. The data points can be scalars or multidimensional vectors and, since they are in a feature space, there is a notion of distance between them.

    Given an m-by-n data matrix X, which is treated as m (1-by-n) row vectors x1, x2, ..., xm, the various distances between the vectors xs and xt are defined as follows:

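    o Euclidean distance: d_{st} = \sqrt{(x_s - x_t)(x_s - x_t)^{\top}}

    o City block metric: d_{st} = \sum_{j=1}^{n} \lvert x_{sj} - x_{tj} \rvert

    o Cosine distance: d_{st} = 1 - \frac{x_s x_t^{\top}}{\sqrt{(x_s x_s^{\top})(x_t x_t^{\top})}}

    o Correlation distance: d_{st} = 1 - \frac{(x_s - \bar{x}_s)(x_t - \bar{x}_t)^{\top}}{\sqrt{(x_s - \bar{x}_s)(x_s - \bar{x}_s)^{\top}} \sqrt{(x_t - \bar{x}_t)(x_t - \bar{x}_t)^{\top}}}

    where \bar{x}_s = \frac{1}{n} \sum_{j=1}^{n} x_{sj} is the mean of the elements of x_s, and \bar{x}_t is defined in the same way.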
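    The four distances and the majority vote can be sketched in plain Python as below. This is an illustrative sketch, not the implementation used in the research work; the function names and the toy two-dimensional training data are made up for the example.

```python
import math
from collections import Counter


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cityblock(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1 - dot / norms


def correlation(a, b):
    # Cosine distance between the mean-centred vectors.
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return cosine([x - mean_a for x in a], [y - mean_b for y in b])


def knn_predict(train_x, train_y, query, k=3, distance=euclidean):
    """Label the query by majority vote among its k nearest training samples."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


train_x = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [1.0, 0.9], [0.8, 1.1], [1.2, 1.0]]
train_y = ["0", "0", "0", "1", "1", "1"]
print(knn_predict(train_x, train_y, [0.9, 1.0], k=3))   # prints 1
```

    Passing a different function through the `distance` argument, for example `distance=cityblock`, switches the metric without changing the voting step.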

  • 11

    Table 5.1 shows the accuracy of numerals, characters and combining numerals and characters

    for different feature extraction method using KNN classifier.

    Feature Extraction Method                 Accuracy
                                              Numerals   Characters   Mix
    Pixel Count Intensity                     97.18%     78.12%       80.65%
    Gradient based                            98.14%     89.06%       89.81%
    Object Geometry based                     90.02%     67.25%       71.38%
    Object Profile                            95.82%     76.71%       79.19%
    Local Binary Pattern                      97.92%     88.12%       88.97%
    Center Symmetric Local Binary Patterns    97.92%     87.65%       89.38%
    Wavelet transform                         97.86%     81.40%       84.02%

    Table 5.1 KNN Classifier Accuracy for Different Feature Extraction Method

    Figure 5.8 shows the accuracy of the hybrid feature extraction methods for characters. By concatenating CSLBP and gradient features, the character recognition accuracy reaches 92.37%.

    Figure 5.8 hybrid feature extraction method for characters

    Figure 5.9 shows the accuracy of the hybrid feature extraction methods for mixed numerals and characters. By concatenating CSLBP and gradient features, the recognition accuracy reaches 92.64%.

    [Bar chart "Hybrid Feature Extraction Methods": % accuracy (vertical axis) against feature extraction method (horizontal axis) for Local Binary Pattern + Gradient, Center Symmetric Local Binary Patterns + Gradient and Wavelet Transform + Gradient.]

  • 12

    Figure 5.9 hybrid feature extraction method for mixed numerals and characters

    5.2.2 Support Vector Machine

    Support vector machines (SVM) work well even when little is known in advance about the data.

    SVM works with unstructured and semi-structured data such as images, text and trees.

    The kernel trick is the main strength of SVM. With a suitable kernel function, it can handle complex, non-linearly separable problems.

    In contrast to neural networks, SVM training does not get trapped in local optima.

    It scales well to high-dimensional data.

    In many cases SVM gives better results than an ANN.

    SVM requires a good choice of kernel and a large dataset.

    SVM can take a longer training time than other classifiers.

    Common kernels

    o Linear: K(x,z) = x^{T} z

    o Quadratic: K(x,z) = (1 + x^{T} z)^{2}

    o Polynomial: K(x,z) = (1 + x^{T} z)^{d}

    o RBF: K(x,z) = \exp(-\lVert x - z \rVert^{2})
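    The kernels listed above can be evaluated directly, as in the sketch below. The function names are illustrative, and the RBF kernel is written exactly as listed, without a width parameter; practical SVM libraries usually expose such a parameter.

```python
import math


def dot(x, z):
    return sum(a * b for a, b in zip(x, z))


def linear_kernel(x, z):
    return dot(x, z)


def polynomial_kernel(x, z, d=2):
    # d = 2 gives the quadratic kernel.
    return (1 + dot(x, z)) ** d


def rbf_kernel(x, z):
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, z)))


x, z = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(x, z))          # 11.0
print(polynomial_kernel(x, z))      # 144.0
print(rbf_kernel(x, x))             # 1.0
```

    Each kernel returns a similarity between two feature vectors; the SVM uses only these values, which is how it can separate classes that are not linearly separable in the original feature space.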


  • 13

    Table 5.2 shows the accuracy of numerals, characters and combining numerals and characters

    for different feature extraction method using SVM classifier.

    Feature Extraction Method                 Accuracy
                                              Numerals   Characters   Mix
    Pixel Count Intensity                     96.90%     76.28%       81.05%
    Gradient based                            98.72%     89.57%       92.10%
    Object Geometry based                     90.70%     67.59%       70.56%
    Object Profile                            93.50%     64.45%       70.08%
    Local Binary Pattern                      97.50%     85.99%       88.26%
    Center Symmetric Local Binary Patterns    95.82%     83.3%        86.21%
    Wavelet transform                         97.40%     84.93%       86.68%

    Table 5.2 SVM Classifier Accuracy for Different Feature Extraction Method

    Figure 5.10 shows the accuracy of the hybrid feature extraction methods for characters. By concatenating wavelet and gradient features, the character recognition accuracy reaches 92.21%.

    Figure 5.10 hybrid feature extraction method for characters

    Figure 5.11 shows the accuracy of the hybrid feature extraction methods for mixed numerals and characters. By concatenating wavelet and gradient features, the recognition accuracy reaches 92.93%.

    [Bar chart "Hybrid Feature Extraction Methods": % accuracy (vertical axis) against feature extraction method (horizontal axis) for Local Binary Pattern + Gradient, Center Symmetric Local Binary Patterns + Gradient and Wavelet Transform + Gradient.]

  • 14

    Figure 5.11 hybrid feature extraction method for mixed numerals and characters

    5.3 Deep Learning:

    In deep learning, a computer model learns to perform classification tasks directly

    from images, text, or sound.

    Deep learning models can achieve state-of-the-art accuracy, sometimes exceeding

    human-level performance.

    Models are trained by using a large set of labeled data and neural network

    architectures that contain many layers.

    Deep learning requires substantial computing power. High-performance GPUs have a

    parallel architecture that is efficient for deep learning.

    Most deep learning methods use neural network architectures, which is why deep

    learning models are often referred to as deep neural networks.

    5.3.1 Convolutional Neural Network (CNN)

    A CNN is one of the most popular algorithms for deep learning with images. It is composed of an input layer, an output layer and many hidden layers in between.

    CNN usually consists of

    o Convolution Layer :

    The convolutional layer is used to extract features from the image. Different filters with different weights are applied to the input, and the outcome is a set of two-dimensional feature maps.


  • 15

    o Pooling Layer or Sub Sampling :

    The pooling layer operates on each feature map independently. It reduces the spatial dimensions of the feature maps but not their depth, which cuts down the number of parameters and the amount of computation in the network.

    o Fully Connected Layer (Classification) :

    The fully connected layer maps the features obtained by the previous layers to the output classes.

    Figure 5.12 Architecture of CNN

    Figure 5.13 Implementation of CNN

  • 16

    Figure 5.12 shows the CNN architecture, which has two convolutional layers, two pooling layers and one fully connected layer. It also contains two ReLU layers, which introduce non-linearity by replacing all negative values with zero. A Softmax layer is also used, which highlights the largest value and suppresses values that are significantly below the maximum.

    Implementation of the CNN is shown in figure 5.12. The center blocks of the image show the sequence of layers, the left side shows the size of the feature map of each layer and the right side shows the number of feature maps with their size. The output of each layer is shown in figure 5.13.
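    The behaviour of the ReLU, pooling and Softmax layers can be illustrated on a small feature map. This sketch is only meant to show what each operation does; the function names are illustrative, and the use of 2 × 2 max pooling is an assumption, since the text does not state the pooling type or window size.

```python
import math


def relu(feature_map):
    """Replace every negative value with zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(feature_map[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, cols - size + 1, size)
        ]
        for r in range(0, rows - size + 1, size)
    ]


def softmax(scores):
    """Turn class scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


feature_map = [[1, -2, 3, 0],
               [4, 5, -6, 7],
               [0, 1, 2, 3],
               [-1, -2, -3, -4]]
pooled = max_pool(relu(feature_map))
print(pooled)                       # [[5, 7], [1, 3]]
probabilities = softmax([5, 7, 1, 3])
print(probabilities.index(max(probabilities)))   # 1
```

    Pooling halves each spatial dimension of the 4 × 4 map, and Softmax gives the largest score most of the probability mass.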

    Table 5.3 shows the accuracy for numerals and characters for input sizes of 64 × 64 and 96 × 96.

    CNN Layers              No. of    % Accuracy for Numerals    % Accuracy for Characters
                            Filters   64 × 64      96 × 96       64 × 64      96 × 96
    Convolutional layer 1   20        80.7         94.4          70.15        74.15
    Convolutional layer 2   40
    Convolutional layer 1   40        88.25        95.1          72           76.25
    Convolutional layer 2   80

    Table 5.3 Proposed CNN Architecture Accuracy for Numerals and Characters

    5.3.2 Pretrained Network Approach

    A pretrained network is a network that has previously been trained on a large standard dataset for a problem similar to the one to be solved. It already knows how to extract informative and powerful features. Such a network is trained on more than a million images, and its output is classified into approximately 1000 classes.

    Figure 5.14 Pretrained Network Approach

  • 17

    Pretrained networks such as alexnet, vgg16, vgg19, googlenet, resnet18, resnet50 and shufflenet can be used for a new task either purely for feature extraction or through the transfer learning approach, as shown in figure 5.14. Here AlexNet is used as the pretrained network. The pretrained AlexNet model contains 25 layers. It was trained on the ImageNet database and classifies images into 1000 classes such as different animals, mouse, pencil, cup and ambulance.

    A. Pretrained network as feature extractor

    In this approach, the AlexNet model is used for feature extraction and the extracted features are given to an SVM classifier for training and testing. Features are extracted using the first 20 layers of AlexNet, that is, up to layer ‘fc7’. The recognition accuracy for the numerals, characters and mixed databases is shown in table 5.4.

    B. Transfer Learning Approach

    In this approach, all the layers of the pretrained AlexNet are used except the last three. The new task is carried out by replacing those last three layers with a fully connected layer, a Softmax layer and a classification output layer. The recognition accuracy for the numerals, characters and mixed databases for this approach is shown in table 5.4.

    Database      SVM Classification Approach   Transfer Learning Approach
    Numerals      97.40%                        99.30%
    Characters    86.50%                        97.65%
    Mixed         89.33%                        97.73%

    Table 5.4 Pretrained Network Approach

    6. Achievements with respect to objectives

    A database of handwritten Gujarati numerals and characters has been created, containing 5000 samples of numerals and 10,000 samples of characters.

    Handwritten Gujarati numerals (10 classes in total) are recognized with 98.14% accuracy using the gradient feature extraction method and KNN classifier, 98.72% accuracy using gradient features and SVM classifier, and 99.30% accuracy using the transfer learning approach in deep learning.

    Handwritten Gujarati characters (40 classes in total) are recognized with 92.37% accuracy using the CSLBP + gradient based hybrid feature extraction method and KNN classifier, 92.21% accuracy using wavelet + gradient based hybrid features and SVM classifier, and 97.65% accuracy using the transfer learning approach in deep learning.

  • 18

    Handwritten Gujarati numerals and characters (48 classes in total) are recognized with 92.64% accuracy using the CSLBP + gradient based hybrid feature extraction method and KNN classifier, 92.93% accuracy using wavelet + gradient based hybrid features and SVM classifier, and 97.73% accuracy using the transfer learning approach in deep learning.

    Three applications are implemented:

    1) Handwritten Gujarati Numerals to speech conversion.

    2) Handwritten Gujarati Characters to speech conversion.

    3) Automatic Handwritten Marks Recognition.

    7. Conclusion:

    The handwritten Gujarati numeral recognition algorithm was successfully developed using a large number (5000) of test images, with an accuracy of 99.30%.

    The handwritten Gujarati character recognition algorithm was successfully developed using a large number (10,000) of test images, with an accuracy of 97.65%.

    The handwritten Gujarati numeral and character recognition algorithm was successfully developed using a large number (15,000) of test images, with an accuracy of 97.73%.

  • 19

    8. List of publications

    1) Mikita Gandhi, V.K.Thakar, H.N.Patel, “Handwritten Gujarati Numeral Recognition

    using Wavelet Transform”, Journal of Applied Science and Computation (JASC),

    Volume VI, Issue IV, 2019

    2) Mikita Gandhi, V.K.Thakar, H.N.Patel, “Gujarati Handwritten Character Recognition

    Using Convolutional Neural Network”, Journal of Emerging Technologies and

    Innovative Research (JETIR), Volume VI, Issue V, May 2019


    References

    1) Antani S, Agnihotri L, “Gujarati character recognition”, In: Proceedings of fifth

    international conference on document analysis and recognition, 1999 (ICDAR’99), pp

    418–421

    2) Shah SK, Sharma A, “Design and implementation of optical character recognition

    system to recognize Gujarati script using template matching”, J. Inst Eng (India)

    Electron Telecommunication Eng., 2006.

    3) Ankit K. Sharma, Dipak M. Adhyaru, Tanish H. Zaveri, Priyank B Thakkar,

    “Comparative analysis of zoning based methods for Gujarati handwritten numeral

    recognition”, 5th Nirma University International Conference on Engineering

    (NUiCONE), IEEE, 2015

    4) Vyas, A. N., Goswami, M. M., “Classification of hand written Gujarati numerals”,

    IEEE transactions on pattern analysis and machine intelligence, pp. 1231-1237, 2015

    5) A. Desai, “Support vector machine for identification of handwritten Gujarati

    alphabets using hybrid feature space”, CSIT, Springer, January 2015.

    6) Dholakia J, Negi A, Rama Mohan S, “Zone identification in the printed Gujarati text”,

    In: Proceedings of the eighth international conference on document analysis and

    recognition, 2005 (ICDAR’05).

    7) Swital J. Macwan, Archana N. Vyas, "Classification of Offline Gujarati Handwritten

    Characters", International Conference on Advances in Computing, Communications

    and Informatics (ICACCI), 2015

    8) Dr. Dinesh Satange, Dr. P E Ajmire, Fozia I. Khandwani, “Offline Handwritten

    Gujrati Numeral Recognition Using MLP Classifier”, International Journal of Novel

    Research and Development, Volume 3, Issue 8 August 2018

    9) Sekhar Mandal, Sanjib Sur, Avishek Dan, “Handwritten Bangla Character Recognition

    in Machine-printed Forms using Gradient Information and Haar Wavelet”, 2011

    International Conference on Image Information Processing.

    10) Ashutosh Aggarwal, Rajneesh Rani, Renu Dhir, "Handwritten Devanagari Character

    Recognition Using Gradient Features", International Journal of Advanced Research in

    Computer Science and Software Engineering (ISSN: 2277-128X), Vol. 2, Issue 5,

    pp. 85-90, May 2012

    11) T. Hassan, H. Khan, “Handwritten Bangla Numeral Recognition using Local Binary

    Pattern”, 2nd Int'l Conf. on Electrical Engineering and Information & Communication

    Technology (ICEEICT), 2015.


    12) M. Pietikäinen, A. Hadid, G. Zhao, T. Ahonen (2011), ‘Local Binary Patterns for Still

    Images, Computer Vision Using Local Binary Patterns’, Chapter 2, Computational

    Imaging and Vision 40, Springer-Verlag London Limited, pp 13 – 47.

    13) Sekhar Mandal, Sanjib Sur, Avishek Dan, “Handwritten Bangla Character Recognition

    in Machine-printed Forms using Gradient Information and Haar Wavelet”, 2011

    International Conference on Image Information Processing.

    14) Saleem Pasha, M.C.Padma, “Handwritten Kannada Character Recognition using

    Wavelet Transform and Structural Features”, International Conference on Emerging

    Research in Electronics, Computer Science and Technology – 2015

    15) Sen S., Shaoo D., Paul S., Sarkar R., Roy K., “Online Handwritten Bangla Character

    Recognition Using CNN: A Deep Learning Approach”, In: Bhateja V., Coello Coello

    C., Satapathy S., Pattnaik P. (eds) Intelligent Engineering Informatics. Advances in

    Intelligent Systems and Computing, vol 695. Springer, Singapore, 2018.

    16) Alom, M.Z., Sidike, P., Taha, T.M., Asari, V.K., “Handwritten Bangla Digit

    Recognition Using Deep Learning”, Neural Processing Letters, 45,

    pp. 703-725, 2017.

    17) C. Boufenar, A. Kerboua, M. Batouche, "Investigation on deep learning for off-line

    handwritten Arabic character recognition", Cogn. Syst. Res., 2017.

    18) M. Blumenstein, B. K. Verma and H. Basli, A Novel Feature Extraction Technique

    for the Recognition of Segmented Handwritten Characters, 7th International

    Conference on Document Analysis and Recognition (ICDAR ’03), Edinburgh,

    Scotland: pp.137-141, 2003.

    19) Patel CN, Desai AA, “Segmentation of text lines into words for Gujarati handwritten

    text”, In: Proceedings of international conference on signal and image processing,

    2010 (ICSIP’10), IEEE Xplore, 15–17.

    20) Dapping Tao, Xu Lin, Lianwen Jin, “Principal Component 2-D Long Short-Term

    Memory for Font Recognition on Single Chinese Characters”, IEEE

    Transactions on Cybernetics, Vol. 46, No. 3, March 2016

    21) Devendra K Sahu and C. V. Jawahar, “Unsupervised Feature Learning for Optical

    Character Recognition”, In: Proceedings of the 13th International Conference on

    Document Analysis and Recognition, 2015 (ICDAR’15)

    22) Mohamed Dahi, Noura A. Semary, and Mohiy M. Hadhoud, “Primitive Printed

    Arabic Optical Character Recognition using Statistical Features”, IEEE Seventh

    International Conference on Intelligent Computing and Information Systems, 2015

    (ICICIS'15).

    23) Tanzila Saba, “Language Independent Rule Based Classification of Printed &

    Handwritten Text”, IEEE International Conference on Evolving and Adaptive

    Intelligent Systems (EAIS), December 2015.

    24) Abdeljalil Gattal, “Segmentation-Verification Based on Fuzzy Integral for Connected

    Handwritten Digit Recognition”, IEEE transaction on Image Processing Theory,

    Tools and Applications, 2015.

    25) Patel CN, Desai AA (2013) Gujarati handwritten character recognition using hybrid

    method based on binary tree-classifier and k-nearest neighbour. Int J Eng Res

    Technology, II(6):2337–2345.

    26) D. Bradley and G. Roth, “Adaptive thresholding using the integral image”, Journal of

    Graphics Tools, Vol. 12, No. 2, pp. 13-21, Jun 2007.

    27) Wojciech Bieniecki, Szymon Grabowski and Wojciech Rozenberg, “Image

    Preprocessing for Improving OCR Accuracy”, International Conference on

    Perspective Technologies and Methods in MEMS Design, MEMSTECH 2007,

    pp. 75-80, 23-26 May 2007.

    28) Luis R. Blando, Junichi Kanai, and Thomas A. Nartker, “Prediction of OCR

    Accuracy Using Simple Image Features” IEEE Proceedings of the Third International

    Conference on Document Analysis and Recognition, Vol.1 PP. 319 – 322, 14-16 Aug

    1995

    29) Chinmay Chinara, Nishant Nath, Subhajeet Mishra, “A Novel Approach to Skew-

    Detection and Correction of English Alphabets for OCR”, IEEE Student Conference

    on Research and Development (SCOReD), pp. 241–244, 5-6 Dec. 2012

    30) Xiaoling Fu, Yazhuo Xu, Lijing Tong “Document Image Skew Adjusting Based on

    the Feedback Information Recognized By OCR” IEEE 3rd International Conference

    on Communication Software and Networks (ICCSN), pp. 376 – 378, 27-29 May

    2011.

    31) E.Kavallieratou, N.Fakotakis and G.Kokkinakis “Handwritten Character Recognition

    based on Structural Characteristics” IEEE 16th International Conference on Pattern

    Recognition, 2002. Proceedings. Vol.3 pp.139 - 142 .

    32) Hanchuan Peng, Fuhui Long, Zheru Chi, “Document Image Recognition Based on

    Template Matching of Component Block Projections”, IEEE Transactions on Pattern

    Analysis and Machine Intelligence, Vol. 25, No. 9, pp. 1188-1192, September 2003.


    33) Pepe Siy, C. S. Chen, “Fuzzy Logic for Handwritten Numeral Character

    Recognition”, IEEE Transactions on Systems, Man and Cybernetics,

    Vol. 4, No. 6, pp. 570-575

    34) Salvador España-Boquera, Maria Jose Castro-Bleda, Jorge Gorbe-Moya, and

    Francisco Zamora-Martinez, “Improving Offline Handwritten Text Recognition with

    Hybrid HMM/ANN Models”, IEEE Transactions on Pattern Analysis and Machine

    Intelligence, Vol. 33, No. 4, pp. 767-779, April 2011.

    35) D. Bradley and G. Roth, “Adaptive thresholding using the integral image”, Journal of

    Graphics Tools, Vol. 12, No. 2, pp. 13-21, Jun 2007.

    36) Koga, M. “Camera-based Kanji OCR for mobile-phones: practical issues” Eighth

    IEEE International Conference on Document Analysis and Recognition, Vol. 2, pp.

    635 –639, 29 Aug.-1 Sept. 2005

    37) Lund, W.B. “Error Correction with In-domain Training across Multiple OCR System

    Outputs” IEEE International Conference on Document Analysis and Recognition

    (ICDAR), pp. 658–662, 18-21 Sept. 2011.

    38) Bhattacharya, U. “Handwritten Numeral Databases of Indian Scripts and Multistage

    Recognition of Mixed Numerals”, IEEE Transactions on Pattern Analysis and

    Machine Intelligence, Vol. 31, No. 3, pp. 444-457, March 2009

    39) Kavallieratou, E. “New algorithms for skewing correction and slant removal on word-

    level [OCR]” The 6th IEEE International Conference on Electronics, Circuits and

    Systems, Vol. 2, pp. 1159–1162, 5-8 Sep 1999

    40) Nikhil Pai, Vijaykumar S. Kolkure, “Optical Character Recognition: An

    Encompassing Review”, International Journal of Research in Engineering and

    Technology, Vol. 04, Issue 01, Jan 2015

    41) J. Mantas, “An overview of character recognition methodologies”, Pattern

    Recognition, vol. 19, no. 6, pp. 425-430, 1986.

    42) Rajean Plamondon, Fellow IEEE and Sargur N. Srihari, Fellow IEEE, “On-Line And

    Off-Line Handwriting character Recognition: A Comprehensive Survey”, IEEE

    Transactions on Pattern Analysis and Machine Intelligence,

    Vol. 22, No. 1, January 2000

    43) Amritha Sampath, Tripti C, Govindaru V, "Freeman code based online handwritten

    character recognition for Malayalam using back propagation neural networks",

    International journal on Advanced computing, Vol. 3, No. 4, pp. 51 - 58, July 2012.


    44) Pradeep, E Shrinivasan and S.Himavathi, "Diagonal Based Feature Extraction for

    Handwritten Alphabets Recognition System Using Neural Network", International

    Journal of Computer Science & Information Technology (IJCSIT), vol. 3, No 1, Feb

    2011.

    45) Om Prakash Sharma, M. K. Ghose, Krishna Bikram Shah, "An Improved Zone Based

    Hybrid Feature Extraction Model for Handwritten Alphabets Recognition Using Euler

    Number", International Journal of Soft Computing and Engineering (ISSN: 2231-2307),

    Vol. 2, Issue 2, pp. 504-508, May 2012

    46) He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image

    recognition. In Proceedings of the IEEE conference on computer vision and pattern

    recognition (pp. 770–778).

    47) Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with

    deep convolutional neural networks. In Advances in neural information processing

    systems (pp. 1097–1105).

    48) R. C. Gonzalez, R. E. Woods, Digital Image Processing, 3rd edition, Prentice Hall.

    49) www.mathworks.com