International Journal of Recent Trends in Engineering & Research (IJRTER) Volume 02, Issue 07; July - 2016 [ISSN: 2455-1457]
@IJRTER-2016, All Rights Reserved
OCR FOR KANNADA SCRIPT
Saritha M., Assistant Professor, Department of Computer Science & Engg., SDMIT, Ujire
Abstract— Handwriting recognition has been one of the most active and challenging research areas in the field of pattern recognition. It has numerous applications, including reading aids for the blind, bank cheque processing, and conversion of handwritten documents into structured text form. Comparatively little work exists on character recognition for Indian languages, especially the Kannada script. In this paper an attempt is made to recognize handwritten Kannada characters using neural networks. Each handwritten Kannada character is resized to 20x30 pixels, and the resized character is used to train the neural network. Once training is complete, the same character is given as input to networks with different numbers of neurons in the hidden layer, and the recognition accuracy for different Kannada characters is calculated and compared. The results show that the proposed system yields recognition accuracy comparable to that of other handwritten character recognition systems.
I. INTRODUCTION
One of the most classical applications of artificial neural networks is character recognition. Such systems form the basis of many applications in various fields, many of which we use in daily life. Because it is cost-effective and saves time, businesses, post offices, banks, security systems, and even the field of robotics employ character recognition at the core of their operations. Whether processing a cheque, performing an eye or face scan at an airport entrance, or teaching a robot to pick up an object, character recognition is at work. Recognizing the characters present in an image makes the subsequent processing of many kinds of data considerably easier. OCR is becoming an important part of modern research-based computer applications, especially with the advent of Unicode and the support of complex scripts on personal computers. A computing device can be outfitted with a camera so that software on the device can capture images of data such as handwritten text and output the characters it contains. A proper user interface has to be created to help the user easily enter an image and accurately obtain the meaning of any Kannada word present in it. Conversion of handwritten characters is important for turning historical documents, such as manuscripts, into machine-editable form so that they can be easily accessed and preserved.
Optical Character Recognition (OCR) systems transform large numbers of documents, whether printed or handwritten, into machine-encoded text in spite of noise, resolution variations and other distortions. OCR means identifying the alphabet of a language from an image of a document and converting it into an editable format. This technology is in full swing now: as more and more old and ancient manuscripts are studied extensively, OCR systems become ever more important. English and other international languages have many years of work and research behind them, while Indian languages, in which much more information is waiting to be revealed to the public, are still at an infant stage. Neural networks have been well-known tools for recognition processes in the imaging industry for many years. They work on principles derived from the working of the human brain, and thus take their name from the neurons of the brain. The platform used here is MATLAB (MATrix LABoratory), well known mainly for DSP (digital signal processing), image processing, speech processing and so on; it also combines the power of VB and Java for GUI applications. The Kannada alphabet, when compared to English, has much more complex characters, such as conjuncts and subscripts, which make the recognition process much more complex.
The Kannada alphabet is classified into two main categories: vowels and consonants. There are 16 vowels and 35 consonants, as shown in figures 1.1 and 1.2. Words in Kannada are composed of aksharas, which are analogous to the characters in an English word. While vowels and consonants are themselves aksharas, the vast majority of aksharas are composed of combinations of these, in a manner similar to most other Indian scripts. An akshara can be one of the following:
A standalone vowel or consonant (i.e. a symbol appearing in figure 1.1 or 1.2).
A consonant modified by a vowel.
A consonant modified by one or more consonants and a vowel.
Fig. 1.1: Vowels in Kannada Fig. 1.2: Consonants in Kannada
When a vowel constitutes the whole akshara, the vowel normally appears at the beginning of a word.
A consonant can also form the whole akshara and can come anywhere in the word. These aksharas
appear in the middle region of the line and are represented by the same glyph as shown in figures 1.1
and 1.2. A consonant C and a vowel V can combine to form an akshara. Here the akshara is
composed by retaining most of the consonant glyph and by attaching to it the glyph corresponding to
the vowel modifier. The vowel modifier glyphs are different from those of the vowels and are shown
in figure 1.3. The glyph of the vowel modifier for a particular vowel is attached to all consonants
mostly in the same way, though, in a few cases the glyphs of the vowel modifier may change
depending on the consonant. Figure 1.3 shows two of the consonants modified by all the 16 vowels.
In this figure, the second row shows the vowels, the third row shows the glyphs of the vowel
modifiers, the fourth and fifth rows show the consonant–vowel(C–V ) combinations for two
consonants which phonetically correspond to c as in cat and y as in yacht. As can be seen from figure 1.3, the vowel modifier glyphs attach to the consonant glyphs at up to three places, corresponding to the top, right and bottom positions of the consonant. It can be observed that the widths of the C–V combinations vary widely and also that the image of a single akshara may be composed of two or more disconnected components.
Fig. 1.3: Consonants and Vowel Combinations
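This composition is mirrored in Unicode, where an akshara is stored as a base consonant codepoint followed by a vowel-sign codepoint. A small Python sketch illustrates this (the paper itself works on pixel images, not encoded text):

```python
import unicodedata

# Kannada consonant KA (U+0C95) combined with the vowel sign AA (U+0CBE)
# forms the single akshara "kaa", stored as two codepoints.
ka = "\u0c95"       # ಕ  KANNADA LETTER KA
sign_aa = "\u0cbe"  # ಾ  KANNADA VOWEL SIGN AA
kaa = ka + sign_aa  # rendered as one glyph cluster: ಕಾ

for ch in kaa:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")

# The akshara is one visual unit but two codepoints, which echoes why a
# single akshara image may consist of several disconnected components.
print(len(kaa))  # 2
```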
II. LITERATURE SURVEY
Early optical character recognition can be traced to activity around two issues: expanding telegraphy and creating reading devices for the blind. Later, OCR technology continued to be developed for data entry: it was proposed that data records be photographed and then, using photocells, matched against a template containing the desired identification pattern. OCR software is an analytical artificial intelligence system that considers sequences of characters rather than whole words or phrases. Based on the analysis of sequential lines and curves, OCR makes 'best guesses' at characters, using database look-up tables to closely associate or match the strings of characters that form words. An OCR engine developed by Hewlett-Packard between 1985 and 1994 is one of the most important applications of the technology; it is most suitable for back-end work, and apart from character recognition the software can also detect whether text is monospaced or proportional. Various papers have been presented on OCR over the years; some of them are reviewed below.
Paper [1] gives an insight into logo matching that handles translation, scale and rotation of the image containing the logo. The image is prepared by processing it with various transformations. As the paper deals with logos rather than characters of similar fonts and sizes, feature extraction is used for processing the image and for retrieval. Experiments such as a baseline technique and evaluation metrics are used to compare the accuracy of the application.
Paper [2] describes an accurate OCR for English. The paper mainly concentrates on business cards with fixed font and color characters. The approach taken is a very simple one: comparing each character with those present in a database, as English has only 26 letters. No neural network of any type, whether artificial or Kohonen, is used. The author takes a very soft approach but tries different experiments to show that 100% accuracy is possible in OCR. This paper gives a very basic idea of the technology and introduces it to beginners.
Paper [3] describes the application of OCR to scanning books, with the main aim of making the technology useful for reading e-texts and e-books. The unique words in the vocabulary of the book are lined up against the outputs of the OCR; this is done repeatedly until the number of such words becomes very small. A distance-based alignment algorithm is used for alignment of the text. This approach is used for character recognition of books written in Spanish, French, English and German.
The work in [4] explains optical character recognition for Bangla characters using a neural network, giving an object-oriented modeling framework for a Kohonen-based character recognition system. The paper provides an insight into the regional language, the challenges faced and the feature extraction method used for character detection. It helps in learning how to implement OCR for Indian regional languages, as the set of characters, including vowels, consonants and compound letters, is very similar to that of most other Indian languages.
Claudiu et al. [5] investigated how simple training-data pre-processing yields expert networks whose errors are less correlated than those of different nets trained on the same or bootstrapped data. Committees that simply average the expert outputs therefore considerably improve recognition rates. Their committee-based classifiers of isolated handwritten characters were the first on par with human performance and can be used as basic building blocks of any OCR system; all their results were achieved by software running on powerful yet cheap gaming cards.
Georgios et al. [6] presented a methodology for off-line handwritten character recognition. The proposed methodology relies on a new feature extraction technique based on
recursive subdivisions of the character image so that the resulting sub images at each iteration have
balanced (approximately equal) numbers of foreground pixels, as far as this is possible. Feature
extraction is followed by a two-stage classification scheme based on the level of granularity of the
feature extraction method. Classes with high values in the confusion matrix are merged at a certain
level, and for each group of merged classes, granularity features from the level that best distinguishes them are employed. Two handwritten character databases as well as two handwritten digit databases were used in order to demonstrate the effectiveness of the proposed technique.
Sankaran et al. [7] presented a novel recognition approach that results in a 15% decrease in word error rate on heavily degraded Indian-language document images. OCRs perform considerably well on good-quality documents but fail easily in the presence of degradations; classical OCR approaches also perform poorly on complex scripts such as those of Indian languages. They addressed these issues by proposing to recognize character n-gram images, which are basically groupings of consecutive character or component segments. Their approach is unique in that it uses character n-grams as a primitive for recognition rather than for post-processing.
Jawahar et al. [8] proposed a recognition scheme for the Indian script Devanagari. Recognition accuracy for Devanagari script is not yet comparable to that of its Roman counterparts, mainly due to the complexity of the script, writing styles, etc. Their solution uses a recurrent neural network known as Bidirectional Long Short-Term Memory (BLSTM). The approach does not require word-to-character segmentation, which is one of the most common reasons for high word error rates. They reported a reduction of more than 20% in word error rate and over 9% in character error rate compared with the best available OCR system.
Zhang et al. [9] discussed how misty, foggy or hazy weather conditions lead to image color distortion and reduce the resolution and contrast of the observed object in outdoor scene acquisition. In order to detect and remove haze, the article proposes a novel, effective algorithm for visibility enhancement from a single gray or color image. Since the haze can be considered to concentrate mainly in one component of the multilayer image, the haze-free image is reconstructed through haze-layer estimation based on an image filtering approach using both a low-rank technique and an overlap averaging scheme. Using parallel analysis with Monte Carlo simulation on the coarse atmospheric veil obtained by the median filter, a refined smooth haze layer is acquired with less texture while retaining depth changes. With the dark channel prior, the normalized transmission coefficient is calculated to restore the fog-free image. Experimental results show that the proposed algorithm is a simple and efficient method for clarity improvement and contrast enhancement from a single foggy image; moreover, it is comparable with state-of-the-art methods and even outperforms them in some cases.
III. SYSTEM DESIGN AND ANALYSIS
Fig. 3.1: Block diagram of OCR
3.1 Steps in OCR
The following steps are performed in order to recognize a character with minimal error, as shown in figure 3.1:
Image acquisition: Acquiring the test image from the user. Typical acquisition hardware includes web cameras, digital cameras, cell phones, etc.
Preprocessing: Correcting the irregularities in the image. This process removes most of the
noise and unrelated components of the image.
Segmentation: This step isolates individual alphabets from others so that the recognition is done
more accurately.
Feature extraction: The preprocessed image is further processed to extract important features
for recognition process.
Recognition: This process uses the extracted features from the previous section to identify the
alphabets in the documents.
This paper uses neural networks for the recognition process. All the alphabets of the Kannada language are used to train the network. For simplicity, this project considers only images of printed documents, avoiding handwritten documents, which vary in size, style, etc.
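As a sketch, the five steps above can be wired into a minimal pipeline. All function bodies here are illustrative toys, not the paper's implementation:

```python
def preprocess(image):
    # Binarize: treat any non-zero gray level as foreground (toy rule).
    return [[1 if px > 0 else 0 for px in row] for row in image]

def segment(image):
    # Placeholder: a real system isolates individual characters here;
    # this sketch returns the whole image as one "character".
    return [image]

def extract_features(char_img):
    # Toy feature: flatten the pixel grid into a single vector.
    return [px for row in char_img for px in row]

def recognize(features, classifier):
    # The classifier (e.g. a trained neural network) maps features to a label.
    return classifier(features)

def ocr(image, classifier):
    chars = segment(preprocess(image))
    return [recognize(extract_features(c), classifier) for c in chars]

# Usage with a dummy classifier that just counts foreground pixels:
labels = ocr([[0, 255], [255, 0]], classifier=lambda f: sum(f))
print(labels)  # [2]
```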
3.2 Data Flow Diagram
Fig. 3.2: Data flow diagram of OCR
3.2.1 Scanning of documents
Through the scanning process a digital image of the original document is captured. In OCR
optical scanners are used, which generally consist of a transport mechanism plus a sensing device
that converts light intensity into gray-levels.
3.2.2 Segmentation
Segmentation is the process of extracting objects of interest from an image. The first step in
segmentation is detecting lines. The subsequent steps are detecting the words in each line and the
individual characters in each word.
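A common way to implement the line-detection step is a horizontal projection profile: rows that contain foreground pixels belong to a text line, and empty rows separate lines. The paper does not specify its exact method, so this is only a sketch:

```python
def find_lines(binary_img):
    """Return (start_row, end_row) spans of text lines in a 0/1 image,
    using the horizontal projection profile."""
    profile = [sum(row) for row in binary_img]   # foreground pixels per row
    lines, start = [], None
    for r, count in enumerate(profile):
        if count > 0 and start is None:
            start = r                            # a line begins
        elif count == 0 and start is not None:
            lines.append((start, r - 1))         # the line ends
            start = None
    if start is not None:                        # line touching bottom edge
        lines.append((start, len(profile) - 1))
    return lines

img = [
    [0, 1, 1, 0],   # line 1
    [0, 1, 0, 0],
    [0, 0, 0, 0],   # blank gap
    [1, 1, 0, 0],   # line 2
]
print(find_lines(img))  # [(0, 1), (3, 3)]
```

The same idea, applied to vertical projections within each line, isolates words and individual characters.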
3.2.3 Noise removal step
The image resulting from the scanning process may contain a certain amount of noise, and noisy pixels may still be present after the segmentation step. To remove this noise we apply an image smoothing algorithm. Smoothing implies both filling and thinning: filling eliminates small breaks, gaps and holes in the digitized characters, while thinning reduces the width of the lines. The most common smoothing techniques move a window across the pixel set of the character, applying certain rules to the contents of the window. Here the central pixel is compared to its 8 neighbours and the smoothing rule is applied.
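A filling rule of this window-based kind might look as follows; the 8-neighbour threshold is an assumed rule for illustration, since the paper does not list its exact rules:

```python
def fill_gaps(img, threshold=5):
    """Set a background pixel to 1 when at least `threshold` of its
    8 neighbours are foreground - a simple 'filling' smoothing rule.
    Border pixels are left unchanged for simplicity."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            neighbours = sum(
                img[y + dy][x + dx]
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                if (dy, dx) != (0, 0)
            )
            if img[y][x] == 0 and neighbours >= threshold:
                out[y][x] = 1   # fill a small hole or break
    return out

img = [
    [1, 1, 1],
    [1, 0, 1],   # central hole surrounded by 8 foreground pixels
    [1, 1, 1],
]
print(fill_gaps(img))  # the hole at (1, 1) is filled
```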
3.2.4 Feature extraction step
In this step we extract features that characterize the symbols while leaving out unimportant attributes. The techniques for extracting such features are often divided into three main groups: distribution of points, transformations and series expansions, and structural analysis of characters.
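The first group, distribution of points, can be illustrated with simple zoning features: divide the character image into a grid of zones and take the foreground-pixel density of each zone. This is an illustrative sketch, not necessarily the paper's method:

```python
def zoning_features(img, zh=2, zw=2):
    """Split a 0/1 image into zh x zw zones and return the fraction of
    foreground pixels in each zone (row-major order)."""
    h, w = len(img), len(img[0])
    rh, rw = h // zh, w // zw               # rows/cols per zone
    feats = []
    for zy in range(zh):
        for zx in range(zw):
            total = sum(
                img[y][x]
                for y in range(zy * rh, (zy + 1) * rh)
                for x in range(zx * rw, (zx + 1) * rw)
            )
            feats.append(total / (rh * rw))  # pixel density of this zone
    return feats

img = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
]
print(zoning_features(img))  # [1.0, 0.0, 0.0, 0.5]
```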
3.2.5 Post-processing step
In the post-processing step we group characters to form strings. The process of performing this association of symbols into strings is commonly referred to as grouping. The grouping of symbols into strings is based on the symbols' locations in the document: symbols found to be sufficiently close together are grouped.
IV. IMPLEMENTATION
4.1 Back Propagation (BP) Algorithm:
One of the most popular NN algorithms is the back propagation (BP) algorithm. After choosing the weights of the network randomly, the back propagation algorithm is used to compute the necessary corrections. The algorithm can be decomposed into the following four steps:
Feed-forward computation
Back propagation to the output layer
Back propagation to the hidden layer
Weight updates
The algorithm is stopped when the value of the error function has become sufficiently small. This is a very rough and basic outline of the BP algorithm; variations have been proposed by other scientists, but Rojas' formulation is quite accurate and easy to follow. The last step, the weight update, happens throughout the algorithm.
Worked example
The NN in figure 4.1 has two nodes (N0,0 and N0,1) in the input layer, two nodes (N1,0 and N1,1) in the hidden layer and one node (N2,0) in the output layer. The input layer nodes are connected to the hidden layer nodes with weights (W0,0–W0,3), and the hidden layer nodes are connected to the output layer node with weights (W1,0 and W1,1). The initial weight values are chosen randomly and will be changed during the BP iterations. The input node values and desired outputs are given in table 4.1; a learning rate of 0.45 and a momentum of 0.9 are used. The activation is the sigmoid function f(x) = 1.0/(1.0 + exp(-x)). Below, calculations are shown for this simple network for a single example set (input values 1 and 1 with desired output 1).
4.1.1 Feed-forward computation:
Feed forward computation or forward pass is two step process. First part is getting the values
of the hidden layer nodes and second part is using those values from hidden layer to compute value
or values of output layer. Input values of nodes N0, 0 and N0, 1 are pushed up to the network
towards nodes in hidden layer (N1, 0 and N1, 1). They are multiplied with weights of connecting
nodes and values of hidden layer nodes are calculated.
Fig. 4.1: Neural Network of layers
N0,0  N0,1  Output N2,0
1     1     1
1     0     0
0     1     0
0     0     0
Table 4.1: Pattern data for AND
The sigmoid function f(x) = 1.0/(1.0 + exp(-x)) is used for the calculations.
N1,0=f(x1)=f(w0,0*n0,0+w0,1*n0,1)=f(0.4+0.1)=f(0.5)=0.622459
N1,1=f(x2)=f(w0,2*n0,0+w0,3*n0,1)=f(-0.1-0.1)=f(-0.2)=0.450166
When the hidden layer values have been calculated, the network propagates them forward to the output layer node (N2,0). This is the second step of the feed-forward computation.
N2,0=f(x3)=f(w1,0*n1,0+w1,1*n1,1)=f(0.06*0.622459+(-0.4)*0.450166)=f(-0.142719)=0.464381
Having calculated N2,0, the forward pass is complete.
4.1.2 Back propagation to the output layer
The next step is to calculate the error of node N2,0. From table 4.1, the output should be 1. The predicted value (N2,0) in our example is 0.464381. The error calculation is done in the following way:
N2,0error=n2,0*(1-n2,0)*(N2,0desired-n2,0)=0.46438*(1-0.46438)*(1-0.46438)=0.13322
Once the error is known, it is used for backward propagation and weight adjustment. This is a two-step process: the error is propagated from the output layer to the hidden layer first. This is where the learning rate and momentum enter the equations, so weights W1,0 and W1,1 will be updated first. Before a weight can be updated, its rate of change needs to be found. This is done by multiplying the learning rate, the error value and the value of node N1,0.
∆W1,0=β*N2,0error*n1,0=0.45*0.133225*0.622459=0.037317
Now new weight for W1,0 can be calculated.
W1,0new=w1,0old+∆W1,0+(α*∆(t-1))=0.06+0.037317+0.9*0=0.097317
∆W1,1=β*N2,0error*n1,1=0.45*0.133225*0.450166=0.026988
W1,1new=w1,1old+∆W1,1+(α*∆(t-1))= -0.4+0.026988= -0.373012
The value of ∆(t-1) is the previous delta change of the weight. In our example there is no previous delta change, so it is always 0; in subsequent iterations it would have a value.
4.1.3 Back propagation to the hidden layer
Now the error has to be propagated from the hidden layer down to the input layer. This is a bit more complicated than propagating the error from the output to the hidden layer: in the previous case the desired output of node N2,0 was known beforehand, while the outputs of nodes N1,0 and N1,1 are not. Let's find the N1,0 error first. It is calculated by multiplying the new weight W1,0 value by the error for node N2,0. The error for node N1,1 is found in the same way.
N1,0error=N2,0error*W1,0new=0.133225*0.097317=0.012965
N1,1error=N2,0error*W1,1new=0.133225*(-0.373012)= -0.049706
Once error for hidden layer nodes is known, weights between input and hidden layer can be updated.
Rate of change first needs to be calculated for every weight:
∆W0,0=β*N1,0error*n0,0=0.45*0.012965*1=0.005834
∆W0,1=β*N1,0error*n0,1=0.45*0.012965*1=0.005834
∆W0,2=β*N1,1error*n0,0=0.45*-0.049706*1= -0.022368
∆W0,3=β*N1,1error*n0,1=0.45*-0.049706*1= -0.022368
Then we calculate the new weights between the input and hidden layer.
W0,0new=w0,0old+∆W0,0+(α*∆(t-1))=0.4+0.005834+0.9*0=0.405834
W0,1new=w0,1old+∆W0,1+(α*∆(t-1))=0.1+0.005834+0=0.105834
W0,2new=w0,2old+∆W0,2+(α*∆(t-1))= -0.1+ -0.022368+0= -0.122368
W0,3new=w0,3old+∆W0,3+(α*∆(t-1))= -0.1+ -0.022368+0= -0.122368
4.1.4 Weight updates
The important thing is not to update any weights until all errors have been calculated; it is easy to forget this, and if new weights were used while calculating errors, the results would not be valid. Here is a quick second pass using the new weights, to see whether the error has decreased.
N1,0=f(x1)=f(w0,0*n0,0+w0,1*n0,1)=f(0.405834+0.105834)=f(0.511668)=0.625197
N1,1=f(x2)=f(w0,2*n0,0+w0,3*n0,1)=f(-0.122368-0.122368)=f(-0.244736)=0.439121
N2,0=f(x3)=f(w1,0*n1,0+w1,1*n1,1)=f(0.097317*0.625197+(-0.373012)*0.439121)=f(-0.102955)=0.474283
Having calculated N2,0, the forward pass is complete. The next step is to calculate the error of node N2,0. From table 4.1, the output should be 1; the predicted value (N2,0) is now 0.474283. The error calculation is done in the following way.
N2,0error=n2,0*(1-n2,0)*(N2,0desired-n2,0)=0.474283*(1-0.474283)*(1-0.474283)=0.131082
So after the initial iteration the calculated error was 0.133225, and the new calculated error is 0.131082. The algorithm has improved, though not by much; this should give a good idea of how the BP algorithm works. Although this was a very simple example, it should help in understanding the basic operation of the BP algorithm. It can be said that the algorithm learns through iterations: the number of iterations in a typical NN could be anywhere from ten to tens of thousands. This was only one pass over one example set, and it would be repeated many times until the error is small enough.
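The whole worked example can be reproduced in a few lines of Python. This sketch follows the text's update rules exactly, including its simplification of omitting the sigmoid derivative at the hidden layer; the momentum term is dropped since the previous delta is zero on this first pass:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Initial weights and inputs from the worked example
w = {"00": 0.4, "01": 0.1, "02": -0.1, "03": -0.1, "10": 0.06, "11": -0.4}
beta = 0.45                       # learning rate
n00, n01, target = 1.0, 1.0, 1.0  # training pattern: 1 AND 1 -> 1

def forward(w):
    n10 = sigmoid(w["00"] * n00 + w["01"] * n01)
    n11 = sigmoid(w["02"] * n00 + w["03"] * n01)
    n20 = sigmoid(w["10"] * n10 + w["11"] * n11)
    return n10, n11, n20

# --- feed-forward computation and output-layer error ---
n10, n11, n20 = forward(w)
err = n20 * (1 - n20) * (target - n20)

# --- back propagation to the output layer ---
w["10"] += beta * err * n10
w["11"] += beta * err * n11

# --- back propagation to the hidden layer (uses the new weights, as in the text) ---
e10 = err * w["10"]
e11 = err * w["11"]
w["00"] += beta * e10 * n00
w["01"] += beta * e10 * n01
w["02"] += beta * e11 * n00
w["03"] += beta * e11 * n01

# --- second forward pass: the error has decreased ---
_, _, n20_new = forward(w)
err_new = n20_new * (1 - n20_new) * (target - n20_new)
print(round(err, 6), round(err_new, 6))  # roughly 0.133226 -> 0.131082
```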
4.2 Implementation Phases
The character recognition system must first be created through a few simple steps in order to prepare it for presentation to MATLAB. The matrices of each Kannada character must be created along with the network structure. In addition, one must understand how to pull the binary input code from the matrix and how to interpret the binary output code that the computer ultimately produces.
4.2.1 Pattern creation
The most popular and simple approach to the OCR problem is based on a feed-forward neural network with back propagation learning. The main idea is that we first prepare a training set and then train a neural network to recognize the patterns in it. In the training step we teach the network to respond with the desired output for a specified input; for this purpose each training sample is represented by two components: a possible input and the desired network output for that input. After the training step is done, we can give an arbitrary input to the network and the network will form an output from which we can resolve the pattern type presented to it. Let's assume that we want to train a network to recognize 29 characters represented as images of 10x10 pixels.
One of the most obvious ways to convert an image to the input part of a training sample is to create a vector of size 100 (in our case) containing "1" in all positions corresponding to letter pixels and "0" in all positions corresponding to background pixels. However, in many neural network training tasks it is preferable to represent training patterns in a so-called "bipolar" way, placing "0.5" into the input vector instead of "1" and "-0.5" instead of "0". This sort of pattern coding leads to a considerable improvement in learning performance.
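The bipolar coding described above is straightforward to implement (values 0.5 and -0.5 as in the text):

```python
def to_training_vector(img, bipolar=True):
    """Flatten a 0/1 character image into a training vector.
    Bipolar coding maps letter pixels to 0.5 and background to -0.5,
    which typically speeds up sigmoid-network learning."""
    hi, lo = (0.5, -0.5) if bipolar else (1.0, 0.0)
    return [hi if px else lo for row in img for px in row]

img = [[1, 0],
       [0, 1]]
print(to_training_vector(img))         # [0.5, -0.5, -0.5, 0.5]
print(to_training_vector(img, False))  # [1.0, 0.0, 0.0, 1.0]
```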
4.2.2 Neural network
A feed-forward back propagation neural network is used in this work for classifying and recognizing the Kannada handwritten characters. The neural classifier consists of a hidden layer besides an input layer and an output layer, as shown in figure 4.2. The hidden layer uses the tansig activation function, and the output layer is a competitive layer, as exactly one of the characters is required to be identified at any point in time. The neural network receives a 24-element input vector in which each element represents a particular Kannada character of 20x30 pixels in size. Once the neural network has been trained successfully, it is required to identify the corresponding character of 20x30 pixels. In addition, the network should also be able to handle noise: in practice the network does not receive a perfect Kannada character as input, and it should make as few mistakes as possible when classifying characters with noise.
Fig. 4.2: Trainee Neural network
Networks 1 2 3 4
No of Layers 2 2 2 2
No of neurons in hidden layer 10 25 50 100
No of neurons in output layer 18 18 18 18
Learning rate 0.1 0.1 0.1 0.1
Table 4.2: Four Neural based character recognition system
4.2.3 Neural network training
TRAINBFGC FUNCTION:
Syntax: [net,TR,Y,E,Pf,Af,flag_stop] = trainbfgc(net,P,T,Pi,Ai,epochs,TS,Q)
        info = trainbfgc(code)
Description: trainbfgc is a network training function that updates weight and bias values according to
the BFGS quasi-Newton method. This function is called from nnmodref, a GUI for the model
reference adaptive control Simulink block.
[net,TR,Y,E,Pf,Af,flag_stop] = trainbfgc(net,P,T,Pi,Ai,epochs,TS,Q) takes these inputs:
net - Neural network
P - Delayed input vectors
T - Layer target vectors
Pi - Initial input delay conditions
Ai - Initial layer delay conditions
epochs - Number of iterations for training
TS - Time steps
Q - Batch size
Training occurs according to trainbfgc's training parameters, shown here with their default values:
net.trainParam.epochs 100 Maximum number of epochs to train
net.trainParam.show 25 Epochs between displays
net.trainParam.goal 0 Performance goal
net.trainParam.time inf Maximum time to train in seconds
net.trainParam.min_grad 1e-6 Minimum performance gradient
net.trainParam.max_fail 5 Maximum validation failures
net.trainParam.searchFcn 'srchbacx' Name of line search routine to use
Training stops when any of these conditions occurs:
The maximum number of epochs (repetitions) is reached.
The maximum amount of time is exceeded.
Performance is minimized to the goal.
The performance gradient falls below min_grad.
Precision problems have occurred in the matrix inversion.
V. RESULT ANALYSIS
Fig. 5.1: Creating Kannada character "Na" Fig. 5.2: Array of kannada character "Na"
Fig. 5.3: Array of kannada alphabets Fig. 5.4: Neural network trainee
Table 5.1: Performance comparison
Performance graph: Figure 5.5 shows the lines of the profiled function that take up the most time, the time spent executing each line, the percentage of the function's total time spent on that line, and a bar graph showing the relative time spent on the line.
Fig. 5.5: Plot perform Graph
Training state: The training state plot shows the progress of other training variables, such as the gradient magnitude, the number of validation checks, etc., as shown in figure 5.6.
Fig. 5.6: Plot train state Graph
Error histogram: The error histogram plot in figure 5.7 shows the distribution of the network errors.
Fig. 5.7: Plot histogram Graph
Final Output:
Fig. 5.8: Handwritten Input Image “DhaNa” Fig. 5.9: Output of Handwritten Image “DhaNa”
Fig. 5.10: Input file image “GaNaPa” Fig. 5.11: Output of image “GaNaPa”
VI. APPLICATION
Banking
One widely known application is in banking, where OCR is used to process cheques without human involvement. An image of a cheque can be captured with a mobile camera, the writing on it is scanned instantly, and the correct amount of money is transferred.
Legal Industry
In the legal industry, there has also been a significant movement to digitize paper documents. In order to save space and eliminate the need to sift through boxes of paper files, documents are being scanned. OCR further simplifies the process by making documents text-searchable.
Other Industries
OCR is widely used in many other fields, including education, finance, and government
agencies. OCR has made countless texts available online, saving money for students and allowing
knowledge to be shared.
Vocal Monitoring
In some cases, oral information is more efficient than written messages: the appeal is stronger, while visual attention can remain on other sources of information. Hence the idea of generating speech from documents.
VII. CONCLUSION
A neural network based Kannada character recognition system has been introduced in this paper for classifying and recognizing Kannada handwritten and printed characters. The pixel values derived from the resized characters using image processing techniques have been used directly to train the neural network; as a result, the proposed system is less complex than other character recognition methods. Of the several neural network architectures used for classifying the Kannada characters, the one with a hidden layer of 50 neurons has been found to yield the highest recognition accuracy of 99.58%. The handwritten recognition system described in this paper will find potential applications in handwritten name recognition, document reading, conversion of handwritten documents into structured text form, and postal address recognition.
FUTURE SCOPE
Our next work on the OCR mobile application will include improving the results by the use of table-boundary detection techniques and text post-processing techniques to detect noise and correct badly recognized words. The OCR application will also display signatures and other symbols as they appear in the document. It will also be extended with new features, including translation from one language to another, so that it will be helpful for people from other countries who cannot understand the local language.
REFERENCES
1. G. Zhu and D. Doermann, "Logo Matching for Document Image Retrieval", 10th International Conference on Document Analysis and Recognition, pp. 606-610, 2009.
2. J. Tariq, "α-Soft: An English language OCR", International Conference on Computer Engineering and Applications (ICCEA).
3. I. Z. Yalniz and R. Manmatha, "A Fast Alignment Scheme for Automatic OCR Evaluation of Books", International Conference on Document Analysis and Recognition (ICDAR), 2011.
4. R. Shukla, "Object oriented framework modeling of a Kohonen network based character recognition system", International Conference on Computer Communication and Informatics (ICCCI), pp. 93-100, 2012.
5. Dan Claudiu Cireşan, Ueli Meier, Luca Maria Gambardella and Jürgen Schmidhuber, "Convolutional Neural Network Committees for Handwritten Character Classification", 2011 International Conference on Document Analysis and Recognition, IEEE, 2011.
6. Georgios Vamvakas, Basilis Gatos and Stavros J. Perantonis, "Handwritten character recognition through two-stage foreground sub-sampling", Pattern Recognition, Volume 43, Issue 8, August 2010.
7. Shrey Dutta, Naveen Sankaran, Pramod Sankar K. and C. V. Jawahar, "Robust Recognition of Degraded Documents Using Character N-Grams", IEEE, 2012.
8. Naveen Sankaran and C. V. Jawahar, "Recognition of Printed Devanagari Text Using BLSTM Neural Network", IEEE, 2012.
9. Yong-Qin Zhang, Yu Ding, Jin-Sheng Xiao, Jiaying Liu and Zongming Guo, "Visibility enhancement using an image filtering approach", EURASIP Journal on Advances in Signal Processing, 2012.