International Journal of Recent Trends in Engineering & Research (IJRTER) Volume 02, Issue 07; July - 2016 [ISSN: 2455-1457]
@IJRTER-2016, All Rights Reserved
OCR FOR KANNADA SCRIPT
Saritha M., Assistant Professor, Department of Computer Science & Engg., SDMIT, Ujire
Abstract— Handwriting recognition has been one of the most active and challenging research areas in the field of pattern recognition. It has numerous applications, including reading aids for the blind, bank cheque processing, and conversion of handwritten documents into structured text form. Comparatively little work exists on character recognition for Indian languages, especially the Kannada script. In this paper an attempt is made to recognize handwritten Kannada characters using neural networks. Each handwritten Kannada character is resized to 20x30 pixels, and the resized character is used to train the neural network. Once training is complete, the same character is given as input to networks with different numbers of neurons in the hidden layer, and the recognition accuracy for different Kannada characters is calculated and compared. The results show that the proposed system yields recognition accuracy comparable to that of other handwritten character recognition systems.
I. INTRODUCTION
One of the most classical applications of artificial neural networks is character recognition. Such systems form the basis of many applications in various fields, many of which we use in daily life. Because it is cost-effective and saves time, businesses, post offices, banks, security systems, and even the field of robotics employ character recognition at the core of their operations. Whether processing a cheque, performing an eye or face scan at an airport entrance, or teaching a robot to pick up an object, character recognition is at work. Recognizing the characters present in an image makes the subsequent processing of many kinds of data considerably easier. OCR is becoming an important part of modern research-based computer applications, especially with the advent of Unicode and the support of complex scripts on personal computers. A computing device can be outfitted with a camera so that software on the device can capture images of data such as handwritten text and output the characters it contains. A proper user interface has to be created to help the user easily enter an image and accurately obtain the meaning of any Kannada word present in it. Conversion of handwritten characters is important for turning historical documents, such as manuscripts, into machine-editable form so that they can be easily accessed and preserved.
Optical Character Recognition (OCR) systems transform large numbers of documents, whether printed or handwritten, into machine-encoded text in spite of noise, resolution variations and other distortions. OCR means identifying the alphabet of a language from an image of a document and converting it into an editable format. This technology is in full swing now: as more and more old and ancient manuscripts are studied extensively, OCR systems become ever more important. English and other international languages have many years of work and research behind them, while Indian languages, in which much more information is waiting to be revealed to the public, are still at an infant stage. Neural networks have been well-known tools for recognition processes in the imaging industry for many years. They work on principles derived from the working of the human brain, and thus take their name from the neurons of the brain. The platform used here is MATLAB (MATrix LABoratory), well known mainly for DSP (digital signal processing), image processing, speech processing and so on; it also combines the power of VB and Java for GUI applications. The Kannada alphabet, when compared to English, has much more complex characters, such as conjuncts and subscripts, which make the recognition process much more complex.
The Kannada alphabet is classified into two main categories: vowels and consonants. There are 16 vowels and 35 consonants, as shown in figures 1.1 and 1.2. Words in Kannada are composed of aksharas, which are analogous to the characters in an English word. While vowels and consonants are themselves aksharas, the vast majority of aksharas are composed of combinations of these, in a manner similar to most other Indian scripts. An akshara can be one of the following:
A standalone vowel or consonant (i.e. a symbol appearing in figure 1.1 or 1.2).
A consonant modified by a vowel.
A consonant modified by one or more consonants and a vowel.
Fig. 1.1: Vowels in Kannada Fig. 1.2: Consonants in Kannada
When a vowel constitutes the whole akshara, the vowel normally appears at the beginning of a word.
A consonant can also form the whole akshara and can come anywhere in the word. These aksharas
appear in the middle region of the line and are represented by the same glyph as shown in figures 1.1
and 1.2. A consonant C and a vowel V can combine to form an akshara. Here the akshara is
composed by retaining most of the consonant glyph and by attaching to it the glyph corresponding to
the vowel modifier. The vowel modifier glyphs are different from those of the vowels and are shown
in figure 1.3. The glyph of the vowel modifier for a particular vowel is attached to all consonants
mostly in the same way, though, in a few cases the glyphs of the vowel modifier may change
depending on the consonant. Figure 1.3 shows two of the consonants modified by all the 16 vowels.
In this figure, the second row shows the vowels, the third row shows the glyphs of the vowel
modifiers, the fourth and fifth rows show the consonant–vowel(C–V ) combinations for two
consonants which phonetically correspond to c as in cat and y as in yacht. As can be seen from figure 1.3, the vowel modifier glyphs attach to the consonant glyphs at up to three places, corresponding to the top, right and bottom positions of the consonant. It can be observed that the widths of the C–V combinations vary widely and also that the image of a single akshara may be composed of two or more disconnected components.
Fig. 1.3: Consonants and Vowel Combinations
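This composition is mirrored in Unicode, where an akshara is stored as a base consonant codepoint followed by a vowel-sign codepoint. A small Python sketch illustrates this (the paper itself works on pixel images, not encoded text):

```python
import unicodedata

# Kannada consonant KA (U+0C95) combined with the vowel sign AA (U+0CBE)
# forms the single akshara "kaa", stored as two codepoints.
ka = "\u0c95"       # ಕ  KANNADA LETTER KA
sign_aa = "\u0cbe"  # ಾ  KANNADA VOWEL SIGN AA
kaa = ka + sign_aa  # rendered as one glyph cluster: ಕಾ

for ch in kaa:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")

# The akshara is one visual unit but two codepoints, which echoes why a
# single akshara image may consist of several disconnected components.
print(len(kaa))  # 2
```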
II. LITERATURE SURVEY
Early optical character recognition can be traced to activity around two issues: expanding telegraphy and creating reading devices for the blind. Later, OCR technology continued to be developed for data entry: it was proposed that data records be photographed and then, using photocells, matched against a template containing the desired identification pattern. OCR software is an analytical artificial intelligence system that considers sequences of characters rather than whole words or phrases. Based on the analysis of sequential lines and curves, OCR makes 'best guesses' at characters, using database look-up tables to closely associate or match the strings of characters that form words. An OCR engine developed by Hewlett-Packard between 1985 and 1994 is one of the most important applications of the technology; it is most suitable for back-end work, and apart from character recognition the software can also detect whether text is monospaced or proportional. Various papers have been presented on OCR over the years; some of them are reviewed below.
Paper [1] gives an insight into logo matching that handles translation, scale and rotation of the image containing the logo. The image is prepared by processing it with various transformations. As the paper deals with logos rather than characters of similar fonts and sizes, feature extraction is used for processing the image and for retrieval. Experiments such as a baseline technique and evaluation metrics are used to compare the accuracy of the application.
Paper [2] describes an accurate OCR for English. The paper mainly concentrates on business cards with fixed font and color characters. The approach taken is a very simple one: comparing each character with those present in a database, as English has only 26 letters. No neural network of any type, whether artificial or Kohonen, is used. The author takes a very soft approach but tries different experiments to show that 100% accuracy is possible in OCR. This paper gives a very basic idea of the technology and introduces it to beginners.
Paper [3] describes the application of OCR to scanning books, with the main aim of making the technology useful for reading e-texts and e-books. The unique words in the vocabulary of the book are lined up against the outputs of the OCR; this is done repeatedly until the number of such words becomes very small. A distance-based alignment algorithm is used for alignment of the text. This approach is used for character recognition of books written in Spanish, French, English and German.
The work in [4] explains optical character recognition for Bangla characters using a neural network, giving an object-oriented modeling framework for a Kohonen-based character recognition system. The paper provides an insight into the regional language, the challenges faced and the feature extraction method used for character detection. It helps in learning how to implement OCR for Indian regional languages, as the set of characters, including vowels, consonants and compound letters, is very similar to that of most other Indian languages.
Claudiu et al. [5] investigated how simple training-data pre-processing yields expert networks whose errors are less correlated than those of different nets trained on the same or bootstrapped data. Committees that simply average the expert outputs therefore considerably improve recognition rates. Their committee-based classifiers of isolated handwritten characters were the first on par with human performance and can be used as basic building blocks of any OCR system; all their results were achieved by software running on powerful yet cheap gaming cards.
Georgios et al. [6] presented a methodology for off-line handwritten character recognition. The proposed methodology relies on a new feature extraction technique based on
recursive subdivisions of the character image so that the resulting sub images at each iteration have
balanced (approximately equal) numbers of foreground pixels, as far as this is possible. Feature
extraction is followed by a two-stage classification scheme based on the level of granularity of the
feature extraction method. Classes with high values in the confusion matrix are merged at a certain
level, and for each group of merged classes, granularity features from the level that best distinguishes them are employed. Two handwritten character databases as well as two handwritten digit databases were used in order to demonstrate the effectiveness of the proposed technique.
Sankaran et al. [7] presented a novel recognition approach that results in a 15% decrease in word error rate on heavily degraded Indian-language document images. OCRs perform considerably well on good-quality documents but fail easily in the presence of degradations; classical OCR approaches also perform poorly on complex scripts such as those of Indian languages. They addressed these issues by proposing to recognize character n-gram images, which are basically groupings of consecutive character or component segments. Their approach is unique in that it uses character n-grams as a primitive for recognition rather than for post-processing.
Jawahar et al. [8] proposed a recognition scheme for the Indian script Devanagari. Recognition accuracy for Devanagari script is not yet comparable to that of its Roman counterparts, mainly due to the complexity of the script, writing styles, etc. Their solution uses a recurrent neural network known as Bidirectional Long Short-Term Memory (BLSTM). The approach does not require word-to-character segmentation, which is one of the most common reasons for high word error rates. They reported a reduction of more than 20% in word error rate and over 9% in character error rate compared with the best available OCR system.
Zhang et al. [9] discussed how misty, foggy or hazy weather conditions lead to image color distortion and reduce the resolution and contrast of the observed object in outdoor scene acquisition. In order to detect and remove haze, the article proposes a novel, effective algorithm for visibility enhancement from a single gray or color image. Since the haze can be considered to concentrate mainly in one component of the multilayer image, the haze-free image is reconstructed through haze-layer estimation based on an image filtering approach using both a low-rank technique and an overlap averaging scheme. Using parallel analysis with Monte Carlo simulation on the coarse atmospheric veil obtained by the median filter, a refined smooth haze layer is acquired with less texture while retaining depth changes. With the dark channel prior, the normalized transmission coefficient is calculated to restore the fog-free image. Experimental results show that the proposed algorithm is a simple and efficient method for clarity improvement and contrast enhancement from a single foggy image; moreover, it is comparable with state-of-the-art methods and even outperforms them in some cases.
III. SYSTEM DESIGN AND ANALYSIS
Fig. 3.1: Block diagram of OCR
3.1 Steps in OCR
The following steps are performed in order to recognize a character with minimal error, as shown in figure 3.1:
Image acquisition: Acquiring the test image from the user. Typical acquisition hardware includes web cameras, digital cameras, cell phones, etc.
Preprocessing: Correcting the irregularities in the image. This process removes most of the
noise and unrelated components of the image.
Segmentation: This step isolates individual alphabets from others so that the recognition is done
more accurately.
Feature extraction: The preprocessed image is further processed to extract important features
for recognition process.
Recognition: This process uses the extracted features from the previous section to identify the
alphabets in the documents.
This paper uses neural networks for the recognition process. All the alphabets of the Kannada language are used to train the network. For simplicity, this project considers only images of printed documents, avoiding handwritten documents, which vary in size, style, etc.
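As a sketch, the five steps above can be wired into a minimal pipeline. All function bodies here are illustrative toys, not the paper's implementation:

```python
def preprocess(image):
    # Binarize: treat any non-zero gray level as foreground (toy rule).
    return [[1 if px > 0 else 0 for px in row] for row in image]

def segment(image):
    # Placeholder: a real system isolates individual characters here;
    # this sketch returns the whole image as one "character".
    return [image]

def extract_features(char_img):
    # Toy feature: flatten the pixel grid into a single vector.
    return [px for row in char_img for px in row]

def recognize(features, classifier):
    # The classifier (e.g. a trained neural network) maps features to a label.
    return classifier(features)

def ocr(image, classifier):
    chars = segment(preprocess(image))
    return [recognize(extract_features(c), classifier) for c in chars]

# Usage with a dummy classifier that just counts foreground pixels:
labels = ocr([[0, 255], [255, 0]], classifier=lambda f: sum(f))
print(labels)  # [2]
```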
3.2 Data Flow Diagram
Fig. 3.2: Data flow diagram of OCR
3.2.1 Scanning of documents
Through the scanning process a digital image of the original document is captured. In OCR
optical scanners are used, which generally consist of a transport mechanism plus a sensing device
that converts light intensity into gray-levels.
3.2.2 Segmentation
Segmentation is the process of extracting objects of interest from an image. The first step in
segmentation is detecting lines. The subsequent steps are detecting the words in each line and the
individual characters in each word.
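A common way to implement the line-detection step is a horizontal projection profile: rows that contain foreground pixels belong to a text line, and empty rows separate lines. The paper does not specify its exact method, so this is only a sketch:

```python
def find_lines(binary_img):
    """Return (start_row, end_row) spans of text lines in a 0/1 image,
    using the horizontal projection profile."""
    profile = [sum(row) for row in binary_img]   # foreground pixels per row
    lines, start = [], None
    for r, count in enumerate(profile):
        if count > 0 and start is None:
            start = r                            # a line begins
        elif count == 0 and start is not None:
            lines.append((start, r - 1))         # the line ends
            start = None
    if start is not None:                        # line touching bottom edge
        lines.append((start, len(profile) - 1))
    return lines

img = [
    [0, 1, 1, 0],   # line 1
    [0, 1, 0, 0],
    [0, 0, 0, 0],   # blank gap
    [1, 1, 0, 0],   # line 2
]
print(find_lines(img))  # [(0, 1), (3, 3)]
```

The same idea, applied to vertical projections within each line, isolates words and individual characters.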
3.2.3 Noise removal step
The image resulting from the scanning process may contain a certain amount of noise, and noisy pixels may still be present after the segmentation step. To remove this noise we apply an image smoothing algorithm. Smoothing implies both filling and thinning: filling eliminates small breaks, gaps and holes in the digitized characters, while thinning reduces the width of the lines. The most common smoothing techniques move a window across the pixel set of the character, applying certain rules to the contents of the window. Here the central pixel is compared to its 8 neighbours and the smoothing rule is applied.
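A filling rule of this window-based kind might look as follows; the 8-neighbour threshold is an assumed rule for illustration, since the paper does not list its exact rules:

```python
def fill_gaps(img, threshold=5):
    """Set a background pixel to 1 when at least `threshold` of its
    8 neighbours are foreground - a simple 'filling' smoothing rule.
    Border pixels are left unchanged for simplicity."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            neighbours = sum(
                img[y + dy][x + dx]
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                if (dy, dx) != (0, 0)
            )
            if img[y][x] == 0 and neighbours >= threshold:
                out[y][x] = 1   # fill a small hole or break
    return out

img = [
    [1, 1, 1],
    [1, 0, 1],   # central hole surrounded by 8 foreground pixels
    [1, 1, 1],
]
print(fill_gaps(img))  # the hole at (1, 1) is filled
```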
3.2.4 Feature extraction step
In this step we extract features that characterize the symbols while leaving out unimportant attributes. The techniques for extracting such features are often divided into three main groups: distribution of points, transformations and series expansions, and structural analysis of characters.
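The first group, distribution of points, can be illustrated with simple zoning features: divide the character image into a grid of zones and take the foreground-pixel density of each zone. This is an illustrative sketch, not necessarily the paper's method:

```python
def zoning_features(img, zh=2, zw=2):
    """Split a 0/1 image into zh x zw zones and return the fraction of
    foreground pixels in each zone (row-major order)."""
    h, w = len(img), len(img[0])
    rh, rw = h // zh, w // zw               # rows/cols per zone
    feats = []
    for zy in range(zh):
        for zx in range(zw):
            total = sum(
                img[y][x]
                for y in range(zy * rh, (zy + 1) * rh)
                for x in range(zx * rw, (zx + 1) * rw)
            )
            feats.append(total / (rh * rw))  # pixel density of this zone
    return feats

img = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
]
print(zoning_features(img))  # [1.0, 0.0, 0.0, 0.5]
```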
3.2.5 Post-processing step
In the post-processing step we group characters to form strings. The process of performing this association of symbols into strings is commonly referred to as grouping. The grouping of symbols into strings is based on the symbols' locations in the document: symbols found to be sufficiently close together are grouped.
IV. IMPLEMENTATION
4.1 Back Propagation (BP) Algorithm:
One of the most popular NN algorithms is the back propagation (BP) algorithm. After choosing the weights of the network randomly, the back propagation algorithm is used to compute the necessary corrections. The algorithm can be decomposed into the following four steps:
Feed-forward computation
Back propagation to the output layer
Back propagation to the hidden layer
Weight updates
The algorithm is stopped when the value of the error function has become sufficiently small. This is a very rough and basic outline of the BP algorithm; variations have been proposed by other scientists, but Rojas' formulation is quite accurate and easy to follow. The last step, the weight update, happens throughout the algorithm.
Worked example
The NN in figure 4.1 has two nodes (N0,0 and N0,1) in the input layer, two nodes (N1,0 and N1,1) in the hidden layer and one node (N2,0) in the output layer. The input layer nodes are connected to the hidden layer nodes with weights (W0,0–W0,3), and the hidden layer nodes are connected to the output layer node with weights (W1,0 and W1,1). The initial weight values are chosen randomly and will be changed during the BP iterations. The input node values and desired outputs are given in table 4.1; a learning rate of 0.45 and a momentum of 0.9 are used. The activation is the sigmoid function f(x) = 1.0/(1.0 + exp(-x)). Below, calculations are shown for this simple network for a single example set (input values 1 and 1 with desired output 1).
4.1.1 Feed-forward computation:
Feed forward computation or forward pass is two step process. First part is getting the values
of the hidden layer nodes and second part is using those values from hidden layer to compute value
or values of output layer. Input values of nodes N0, 0 and N0, 1 are pushed up to the network
towards nodes in hidden layer (N1, 0 and N1, 1). They are multiplied with weights of connecting
nodes and values of hidden layer nodes are calculated.
Fig. 4.1: Neural Network of layers
N0,0  N0,1  Output N2,0
1     1     1
1     0     0
0     1     0
0     0     0
Table 4.1: Pattern data for AND
The sigmoid function f(x) = 1.0/(1.0 + exp(-x)) is used for the calculations.
N1,0=f(x1)=f(w0,0*n0,0+w0,1*n0,1)=f(0.4+0.1)=f(0.5)=0.622459
N1,1=f(x2)=f(w0,2*n0,0+w0,3*n0,1)=f(-0.1-0.1)=f(-0.2)=0.450166
When the hidden layer values have been calculated, the network propagates them forward to the output layer node (N2,0). This is the second step of the feed-forward computation.
N2,0=f(x3)=f(w1,0*n1,0+w1,1*n1,1)=f(0.06*0.622459+(-0.4)*0.450166)=f(-0.142719)=0.464381
Having calculated N2,0, the forward pass is complete.
4.1.2 Back propagation to the output layer
The next step is to calculate the error of node N2,0. From table 4.1, the output should be 1. The predicted value (N2,0) in our example is 0.464381. The error calculation is done in the following way:
N2,0error=n2,0*(1-n2,0)*(N2,0desired-n2,0)=0.46438*(1-0.46438)*(1-0.46438)=0.13322
Once the error is known, it is used for backward propagation and weight adjustment. This is a two-step process: the error is propagated from the output layer to the hidden layer first. This is where the learning rate and momentum enter the equations, so weights W1,0 and W1,1 will be updated first. Before a weight can be updated, its rate of change needs to be found. This is done by multiplying the learning rate, the error value and the value of node N1,0.
∆W1,0=β*N2,0error*n1,0=0.45*0.133225*0.622459=0.037317
Now new weight for W1,0 can be calculated.
W1,0new=w1,0old+∆W1,0+(α*∆(t-1))=0.06+0.037317+0.9*0=0.097317
∆W1,1=β*N2,0error*n1,1=0.45*0.133225*0.450166=0.026988
W1,1new=w1,1old+∆W1,1+(α*∆(t-1))= -0.4+0.026988= -0.373012
The value of ∆(t-1) is the previous delta change of the weight. In our example there is no previous delta change, so it is always 0; in subsequent iterations it would have a value.
4.1.3 Back propagation to the hidden layer
Now the error has to be propagated from the hidden layer down to the input layer. This is a bit more complicated than propagating the error from the output to the hidden layer: in the previous case the desired output of node N2,0 was known beforehand, while the outputs of nodes N1,0 and N1,1 are not. Let's find the N1,0 error first. It is calculated by multiplying the new weight W1,0 value by the error for node N2,0. The error for node N1,1 is found in the same way.
N1,0error=N2,0error*W1,0new=0.133225*0.097317=0.012965
N1,1error=N2,0error*W1,1new=0.133225*(-0.373012)= -0.049706
Once error for hidden layer nodes is known, weights between input and hidden layer can be updated.
Rate of change first needs to be calculated for every weight:
∆W0,0=β*N1,0error*n0,0=0.45*0.012965*1=0.005834
∆W0,1=β*N1,0error*n0,1=0.45*0.012965*1=0.005834
∆W0,2=β*N1,1error*n0,0=0.45*-0.049706*1= -0.022368
∆W0,3=β*N1,1error*n0,1=0.45*-0.049706*1= -0.022368
Then we calculate the new weights between the input and hidden layer.
W0,0new=w0,0old+∆W0,0+(α*∆(t-1))=0.4+0.005834+0.9*0=0.405834
W0,1new=w0,1old+∆W0,1+(α*∆(t-1))=0.1+0.005834+0=0.105834
W0,2new=w0,2old+∆W0,2+(α*∆(t-1))= -0.1+ -0.022368+0= -0.122368
W0,3new=w0,3old+∆W0,3+(α*∆(t-1))= -0.1+ -0.022368+0= -0.122368
4.1.4 Weight updates
The important thing is not to update any weights until all errors have been calculated; it is easy to forget this, and if new weights were used while calculating errors, the results would not be valid. Here is a quick second pass using the new weights, to see whether the error has decreased.
N1,0=f(x1)=f(w0,0*n0,0+w0,1*n0,1)=f(0.405834+0.105834)=f(0.511668)=0.625197
N1,1=f(x2)=f(w0,2*n0,0+w0,3*n0,1)=f(-0.122368-0.122368)=f(-0.244736)=0.439121
N2,0=f(x3)=f(w1,0*n1,0+w1,1*n1,1)=f(0.097317*0.625197+(-0.373012)*0.439121)=f(-0.102955)=0.474283
Having calculated N2,0, the forward pass is complete. The next step is to calculate the error of node N2,0. From table 4.1, the output should be 1; the predicted value (N2,0) is now 0.474283. The error calculation is done in the following way.
N2,0error=n2,0*(1-n2,0)*(N2,0desired-n2,0)=0.474283*(1-0.474283)*(1-0.474283)=0.131082
So after the initial iteration the calculated error was 0.133225, and the new calculated error is 0.131082. The algorithm has improved, though not by much; this should give a good idea of how the BP algorithm works. Although this was a very simple example, it should help in understanding the basic operation of the BP algorithm. It can be said that the algorithm learns through iterations: the number of iterations in a typical NN could be anywhere from ten to tens of thousands. This was only one pass over one example set, and it would be repeated many times until the error is small enough.
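The whole worked example can be reproduced in a few lines of Python. This sketch follows the text's update rules exactly, including its simplification of omitting the sigmoid derivative at the hidden layer; the momentum term is dropped since the previous delta is zero on this first pass:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Initial weights and inputs from the worked example
w = {"00": 0.4, "01": 0.1, "02": -0.1, "03": -0.1, "10": 0.06, "11": -0.4}
beta = 0.45                       # learning rate
n00, n01, target = 1.0, 1.0, 1.0  # training pattern: 1 AND 1 -> 1

def forward(w):
    n10 = sigmoid(w["00"] * n00 + w["01"] * n01)
    n11 = sigmoid(w["02"] * n00 + w["03"] * n01)
    n20 = sigmoid(w["10"] * n10 + w["11"] * n11)
    return n10, n11, n20

# --- feed-forward computation and output-layer error ---
n10, n11, n20 = forward(w)
err = n20 * (1 - n20) * (target - n20)

# --- back propagation to the output layer ---
w["10"] += beta * err * n10
w["11"] += beta * err * n11

# --- back propagation to the hidden layer (uses the new weights, as in the text) ---
e10 = err * w["10"]
e11 = err * w["11"]
w["00"] += beta * e10 * n00
w["01"] += beta * e10 * n01
w["02"] += beta * e11 * n00
w["03"] += beta * e11 * n01

# --- second forward pass: the error has decreased ---
_, _, n20_new = forward(w)
err_new = n20_new * (1 - n20_new) * (target - n20_new)
print(round(err, 6), round(err_new, 6))  # roughly 0.133226 -> 0.131082
```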
4.2 Implementation Phases
The character recognition system must first be created through a few simple steps in order to prepare it for presentation to MATLAB. The matrices of each Kannada character must be created along with the network structure. In addition, one must understand how to pull the binary input code from the matrix and how to interpret the binary output code that the computer ultimately produces.
4.2.1 Pattern creation
The most popular and simple approach to the OCR problem is based on a feed-forward neural network with back propagation learning. The main idea is that we first prepare a training set and then train a neural network to recognize the patterns in it. In the training step we teach the network to respond with the desired output for a specified input; for this purpose each training sample is represented by two components: a possible input and the desired network output for that input. After the training step is done, we can give an arbitrary input to the network and the network will form an output from which we can resolve the pattern type presented to it. Let's assume that we want to train a network to recognize 29 characters represented as images of 10x10 pixels.
One of the most obvious ways to convert an image to the input part of a training sample is to create a vector of size 100 (in our case) containing "1" in all positions corresponding to letter pixels and "0" in all positions corresponding to background pixels. However, in many neural network training tasks it is preferable to represent training patterns in a so-called "bipolar" way, placing "0.5" into the input vector instead of "1" and "-0.5" instead of "0". This sort of pattern coding leads to a considerable improvement in learning performance.
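The bipolar coding described above is straightforward to implement (values 0.5 and -0.5 as in the text):

```python
def to_training_vector(img, bipolar=True):
    """Flatten a 0/1 character image into a training vector.
    Bipolar coding maps letter pixels to 0.5 and background to -0.5,
    which typically speeds up sigmoid-network learning."""
    hi, lo = (0.5, -0.5) if bipolar else (1.0, 0.0)
    return [hi if px else lo for row in img for px in row]

img = [[1, 0],
       [0, 1]]
print(to_training_vector(img))         # [0.5, -0.5, -0.5, 0.5]
print(to_training_vector(img, False))  # [1.0, 0.0, 0.0, 1.0]
```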
4.2.2 Neural network
A feed-forward back propagation neural network is used in this work for classifying and recognizing the Kannada handwritten characters. The neural classifier consists of a hidden layer besides an input layer and an output layer, as shown in figure 4.2. The hidden layer uses the tansig activation function, and the output layer is a competitive layer, as exactly one of the characters is required to be identified at any point in time. The neural network receives a 24-element input vector in which each element represents a particular Kannada character of 20x30 pixels in size. Once the neural network has been trained successfully, it is required to identify the corresponding character of 20x30 pixels. In addition, the network should also be able to handle noise: in practice the network does not receive a perfect Kannada character as input, and it should make as few mistakes as possible when classifying characters with noise.
Fig. 4.2: Trainee Neural network
Networks 1 2 3 4
No of Layers 2 2 2 2
No of neurons in hidden layer 10 25 50 100
No of neurons in output layer 18 18 18 18
Learning rate 0.1 0.1 0.1 0.1
Table 4.2: Four Neural based character recognition system
4.2.3 Neural network training
TRAINBFGC FUNCTION:
Syntax: [net,TR,Y,E,Pf,Af,flag_stop] = trainbfgc(net,P,T,Pi,Ai,epochs,TS,Q)
        info = trainbfgc(code)
Description: trainbfgc is a network training function that updates weight and bias values according to
the BFGS quasi-Newton method. This function is called from nnmodref, a GUI for the model
reference adaptive control Simulink block.
[net,TR,Y,E,Pf,Af,flag_stop] = trainbfgc(net,P,T,Pi,Ai,epochs,TS,Q) takes these inputs:
net - Neural network
P - Delayed input vectors
T - Layer target vectors
Pi - Initial input delay conditions
Ai - Initial layer delay conditions
epochs - Number of iterations for training
TS - Time steps
Q - Batch size
Training occurs according to trainbfgc's training parameters, shown here with their default values:
net.trainParam.epochs 100 Maximum number of epochs to train
net.trainParam.show 25 Epochs between displays
net.trainParam.goal 0 Performance goal
net.trainParam.time inf Maximum time to train in seconds
net.trainParam.min_grad 1e-6 Minimum performance gradient
net.trainParam.max_fail 5 Maximum validation failures
net.trainParam.searchFcn 'srchbacx' Name of line search routine to use
Training stops when any of these conditions occurs:
The maximum number of epochs (repetitions) is reached.
The maximum amount of time is exceeded.
Performance is minimized to the goal.
The performance gradient falls below min_grad.
Precision problems have occurred in the matrix inversion.
V. RESULT ANALYSIS
Fig. 5.1: Creating Kannada character "Na" Fig. 5.2: Array of kannada character "Na"
Fig. 5.3: Array of kannada alphabets Fig. 5.4: Neural network trainee
Table 5.1: Performance comparison
Performance graph: Figure 5.5 shows the lines of the profiled function that take up the most time, the time spent executing each line, the percentage of the function's total time spent on that line, and a bar graph showing the relative time spent on the line.
Fig. 5.5: Plot perform Graph
Training state: The training state plot shows the progress of other training variables, such as the gradient magnitude, the number of validation checks, etc., as shown in figure 5.6.
Fig. 5.6: Plot train state Graph
Error histogram: The error histogram plot in figure 5.7 shows the distribution of the network errors.
Fig. 5.7: Plot histogram Graph
Final Output:
Fig. 5.8: Handwritten Input Image “DhaNa” Fig. 5.9: Output of Handwritten Image “DhaNa”
Fig. 5.10: Input file image “GaNaPa” Fig. 5.11: Output of image “GaNaPa”
VI. APPLICATION
Banking
One widely known application is in banking, where OCR is used to process cheques without human involvement. An image of a cheque can be captured with a mobile camera, the writing on it is scanned instantly, and the correct amount of money is transferred.
Legal Industry
In the legal industry, there has also been a significant movement to digitize paper documents. In order to save space and eliminate the need to sift through boxes of paper files, documents are being scanned. OCR further simplifies the process by making documents text-searchable.
Other Industries
OCR is widely used in many other fields, including education, finance, and government
agencies. OCR has made countless texts available online, saving money for students and allowing
knowledge to be shared.
Vocal Monitoring
In some cases, oral information is more efficient than written messages: the appeal is stronger, while visual attention can remain on other sources of information. Hence the idea of generating speech from documents.
VII. CONCLUSION
A neural network based Kannada character recognition system has been introduced in this paper for classifying and recognizing Kannada handwritten and printed characters. The pixel values derived from the resized characters using image processing techniques have been used directly to train the neural network; as a result, the proposed system is less complex than other character recognition methods. Of the several neural network architectures used for classifying the Kannada characters, the one with a hidden layer of 50 neurons has been found to yield the highest recognition accuracy of 99.58%. The handwritten recognition system described in this paper will find potential applications in handwritten name recognition, document reading, conversion of handwritten documents into structured text form, and postal address recognition.
FUTURE SCOPE
Our next work on the OCR mobile application will include improving the results by the use of table-boundary detection techniques and text post-processing techniques to detect noise and correct badly recognized words. The OCR application will also display signatures and other symbols as they appear in the document. It will also be extended with new features, including translation from one language to another, so that it will be helpful for people from other countries who cannot understand the local language.
REFERENCES
1. G. Zhu and D. Doermann, "Logo Matching for Document Image Retrieval", 10th International Conference on Document Analysis and Recognition, pp. 606-610, 2009.
2. J. Tariq, "α-Soft: An English language OCR", International Conference on Computer Engineering and Applications (ICCEA).
3. I. Z. Yalniz and R. Manmatha, "A Fast Alignment Scheme for Automatic OCR Evaluation of Books", International Conference on Document Analysis and Recognition (ICDAR), 2011.
4. R. Shukla, "Object oriented framework modeling of a Kohonen network based character recognition system", International Conference on Computer Communication and Informatics (ICCCI), pp. 93-100, 2012.
5. Dan Claudiu Cireşan, Ueli Meier, Luca Maria Gambardella and Jürgen Schmidhuber, "Convolutional Neural Network Committees for Handwritten Character Classification", 2011 International Conference on Document Analysis and Recognition, IEEE, 2011.
6. Georgios Vamvakas, Basilis Gatos and Stavros J. Perantonis, "Handwritten character recognition through two-stage foreground sub-sampling", Pattern Recognition, Volume 43, Issue 8, August 2010.
7. Shrey Dutta, Naveen Sankaran, Pramod Sankar K. and C. V. Jawahar, "Robust Recognition of Degraded Documents Using Character N-Grams", IEEE, 2012.
8. Naveen Sankaran and C. V. Jawahar, "Recognition of Printed Devanagari Text Using BLSTM Neural Network", IEEE, 2012.
9. Yong-Qin Zhang, Yu Ding, Jin-Sheng Xiao, Jiaying Liu and Zongming Guo, "Visibility enhancement using an image filtering approach", EURASIP Journal on Advances in Signal Processing, 2012.