
    COMENIUS UNIVERSITY IN BRATISLAVA

    FACULTY OF MATHEMATICS, PHYSICS AND INFORMATICS

    HANDWRITTEN CHARACTER RECOGNITION USING

    MACHINE LEARNING METHODS

    Bachelor's Thesis

    Study Program: Applied Informatics

    Branch of Study: 2511 Applied Informatics

    Educational Institution: Department of Applied Informatics

    Supervisor: Mgr. Ľudovít Malinovský

    Bratislava, 2013

    Ivor Uhliarik


    Acknowledgement

    I would like to express my sincere gratitude to my supervisor Mgr. Ľudovít Malinovský for invaluable consultations, interest, initiative, and continuous support throughout the duration of writing this thesis. I would also like to thank my family, fellow colleagues and all the people that have supported me. Finally, I would like to thank Martin Boze and the rest of the n'c community for listening to my rants and all the witty remarks.


    Declaration on Word of Honor

    I declare that this thesis has been written by myself using only the listed references and consultations provided by my supervisor.

    Bratislava, ........................ 2013

    ..................................
    Ivor Uhliarik


    Abstract

    The aim of this work is to review existing methods for the handwritten character recognition problem using machine learning algorithms and to implement one of them in a user-friendly Android application. The main tasks the application provides a solution for are handwriting recognition based on touch input, handwriting recognition from live camera frames or a picture file, learning new characters, and learning interactively based on the user's feedback. The recognition model we have chosen is a multilayer perceptron, a feedforward artificial neural network, especially because of its high performance on non-linearly separable problems. It has also proved powerful in OCR and ICR systems [1], which could be seen as a further extension of this work. We evaluated the perceptron's performance and configured its parameters in the GNU Octave programming language, after which we implemented the Android application using the same perceptron architecture, learning parameters, and optimization algorithms. The application was then tested on a training set consisting of digits, with the ability to learn alphabetical or different characters.

    Keywords: Character recognition, Multilayer perceptron, Backpropagation, RPROP, Image processing


    Abstrakt

    ... recognition from an image file, learning of new, previously unknown characters, and interactive learning based on the user's feedback. As the learning model, we have chosen a feedforward neural network, the multilayer perceptron. The choice was influenced by its high performance on non-linearly separable problems, as well as by the fact that it is often used in OCR and ICR systems [1], which could be considered natural extensions of this work ... parameters and optimization algorithms. We tested the application on a training set of digits, with the possibility of adding and learning alphabetical and other characters.

    Keywords: Character recognition, Multilayer perceptron, Backpropagation, RPROP, Image processing


    Table of Contents

    Introduction
    1. Overview
    1.1 Pro...


    List of Abbreviations and Symbols

    OCR     Optical Character Recognition.
    ICR     Intelligent Character Recognition.
    X       Matrix of all training examples. Each row contains a feature vector of a single example.
    x       Feature vector of a training example.
    y       A vector of labels of training examples.
    L       Number of layers in a network.
    m       Number of training examples.
    Θ       Weights of a neural network. A superscript denotes a layer, subscripts denote the index of an element.
    a(l)    Vector of neuron activations in layer l.
    hΘ(x)   Hypothesis of a classifier with given weights Θ.
    sj      The size of layer j.
    g(z)    Sigmoid function of z.
    J(Θ)    Cost function of weights Θ.
    δ(l)    Vector of error terms for neurons in layer l.
    Δ(l)    Accumulator of error terms for neurons in layer l in the context of the pure backpropagation algorithm.
    Δ(t)    Matrix of weight update sizes for iteration t in the context of RPROP.
    ΔΘ(t)   Matrix of weight change values for iteration t in the context of RPROP.
    Δ0      Initial weight step size in RPROP.
    η+      Acceleration of the weight step size in RPROP.
    η−      Deceleration of the weight step size in RPROP.
    λ       Network regularization parameter.


    Introduction

    Handwritten character recognition is a field of research in artificial intelligence, computer vision, and pattern recognition. A computer performing handwriting recognition is said to be able to acquire and detect characters in paper documents, pictures, touchscreen devices and other sources and convert them into a machine-encoded form. Its application is found in optical character recognition and in more advanced intelligent character recognition systems. Most of these systems nowadays implement machine learning mechanisms such as neural networks.

    Machine learning is a branch of artificial intelligence inspired by psychology and biology that deals with learning from a set of data and can be applied to solve a wide spectrum of problems. A supervised machine learning model is given instances of data specific to a problem domain and an answer that solves the problem for each instance. When learning is complete, the model is able to provide answers with high precision not only for the data it has learned on, but also for yet unseen data.

    Neural networks are learning models used in machine learning. Their aim is to simulate the learning process that occurs in an animal or human neural system. Being one of the most powerful learning models, they are useful in the automation of tasks where the decision of a human being takes too long, or is imprecise. A neural network can be very fast at delivering results and may detect connections between seen instances of data that a human cannot see.

    We have decided to implement a neural network in an Android application that recognizes characters written by hand on the device's touch screen or extracted from the camera and images provided by the device. Having acquired the knowledge that is explained in this text, the neural network has been implemented on a low level, without using libraries that already facilitate the process. By doing this, we evaluate the performance of neural networks on the given problem and provide source code for the network that can be used to solve many different classification problems. The resulting system is a subset of a complex OCR or ICR system; these are seen as possible future extensions of this work.

    In the first chapter, we describe the overall pro...


    approaches+ algorithms and systems of similar nature# ,or o!er!ie(+ (e also *riefly

    e)plain the specific algorithms that ha!e *een used in the implementation of the pro


    1. Overview

    In order to reach the analysis of the used learning model and the specification and implementation of the algorithms, and consequently the Android application, we had to review existing approaches to the problem of character recognition. In this chapter, we describe the pro...


    We understand interactive learning with user feedback as using online machine learning. This should not be confused with online and offline handwriting recognition, which is described above. Online learning is defined as learning one instance of data at a time, expecting feedback (the real label of a character input by a user) to be provided after the neural network's prediction. On the other hand, offline learning performs all of the learning process prior to making any predictions and does not change its prediction hypothesis afterward. The two methods are examples of "lazy" and "eager" learning, respectively.

    Online machine learning makes the system more adaptive to change, such as a change in trends, prices, market needs, etc. In our case, user feedback makes the system able to adapt to a change in handwriting style, perhaps caused by a change of user. We use both offline and online learning in this work. Initially, the user expects from the application to ...


    ... image, but some of them, such as cropping the written character and scaling it to our input size, are also performed in the touch mode.

    Digital capture and conversion of an image often introduces noise, which makes it hard to decide what is actually a part of the ob...


    The task of image segmentation is to split an image into parts with a strong correlation with ob...


    Finally, in both touch-based and image-based recognition in our work, we have used cropping and scaling of the images to a small fixed size.

    1.2.2 Feature Extraction

    Features of input data are the measurable properties of observations which one uses to analyze or classify these instances of data. The task of feature extraction is to choose relevant features that discriminate the instances well and are independent of each other. According to [3], the selection of a feature extraction method is probably the single most important factor in achieving high recognition performance. There is a vast number of methods for feature extraction from character images, each having different characteristics, invariance properties, and reconstructability of characters. [3] states that in order to answer the question of which method is best suited for a given situation, an experimental evaluation must be performed.

    The methods explained in [3] are template matching, deformable templates, unitary image transforms, graph description, pro...


    Fi+6r) ": H#ri#'tal a'* v)rti&al

    9r#8)&ti#' $ist#+ras ?"@7

    As we have mentioned, there is no method that is intrinsically perfect for a given task. Evaluation of such methods would take a lot of time and is not in the scope of this work. Instead, we set our focus on multilayer feedforward neural networks, which can be viewed as a combination of a feature extractor and a classifier [3], the latter of which will be explained shortly.

    In our work, we have used the multilayer perceptron neural network model, which will be described in more depth in the next chapter. For now, we can think of this model as a directed graph consisting of at least 3 layers of nodes.

    The first layer is called the input layer, the last layer is the output layer, and a number of intermediate layers are known as hidden layers. Except for the input layer, the nodes of neural networks are also called neurons or units. Each node of a layer typically has a weighted connection to the nodes of the next layer. The hidden layers are important for feature extraction, as they create an internal abstraction of the data fed into the network. The more hidden layers there are in a network, the more abstract the extracted features are.


    Figure 4: Basic view of the multilayer perceptron architecture. It contains three layers, one of them being a hidden layer. Layers consist of neurons; each layer is fully connected to the next one.

    !727" Classi-i&ati#'

    Classification is defined as the task of assigning labels (categories, classes) to yet unseen observations (instances of data). In machine learning, this is done on the basis of training an algorithm on a set of training examples. Classification is a supervised learning problem, where a "teacher" links a label to every instance of data. A label is a discrete number that identifies the class a particular instance belongs to. It is usually represented as a non-negative integer.

    There are many machine learning models that implement classification; these are known as classifiers. The aim of classifiers is to fit a decision boundary (Figure 5) in feature space that separates the training examples, so that the class of a new observation instance can be correctly labeled. In general, the decision boundary is a hypersurface that separates an n-dimensional space into two partitions, itself being (n-1)-dimensional.



    Figure 5: Visualization of a decision boundary. In a feature space given by x1 and x2, a decision boundary is plotted between two linearly separable classes.

    !727"7! L#+isti& R)+r)ssi#'

    Logistic regression is a simple linear classifier. This algorithm tries to find the decision boundary by iterating over the training examples, trying to fit parameters that describe the decision boundary hypersurface equation. During this learning process, the algorithm computes a cost function (also called an error function), which represents the error measure of its hypothesis (the output value, the prediction). This value is used for penalization, which updates the parameters to better fit the decision boundary. The goal of this process is to converge to parameter values that minimize the cost function. It has been proved [8] that the logistic regression cost function is always convex, therefore the minimization process can always converge to a minimum, thus finding the best fit of the decision boundary this algorithm can provide.

    Until now, we have been discussing binary classification. To apply logistic regression to handwriting recognition, we would need more than 2 distinguishing classes; hence we need multiclass classification. This can be solved by using the one-vs-all approach [9].


    Figure 6: Multiclass classification of three classes as it is split into three subproblems.

    The principle of one-vs-all is to split the training set into a number of binary classification problems. Considering we want to classify handwritten digits, the problem degrades into 10 subproblems, where individual digits are separated from the rest. Figure 6 shows the same for 3 classes of ob...


    !727"72 M6ltila0)r P)r&)9tr#'

    "ultilayer perceptrons Q"LPsR are artificial neural net(or&s+ learning models inspired *y

    *iology# As opposed to logistic regression+ (hich is only a linear classifier on its o(n+ the

    multilayer perceptron learning model+ (hich (e already ha!e mentioned in terms of

    feature e)traction+ can also distinguish data that are not linearly separa*le#

    8e ha!e already outlined the architecture of an "LP+ as seen in Q,igure R#

    In order to calculate the class prediction+ one must perform feedforward propagation# Input

    data are fed into the input layer and propagated further+ passing through (eighted

    connections into hidden layers+ using an activation function#Dence+ the node's activation

    Qoutput !alue at the nodeR is a function of the (eighted sum of the connected nodes at a

    pre!ious layer# This process continues until the output layer is reached#

    The learning algorithm+ backpropagation+ is different from the one in logistic regression#

    ,irst+ the cost function is measured on the output layer+ propagating *ac& to the

    connections *et(een the input and the first hidden layer after(ards+ updating unit (eights#

    "LPs can perform multiclass classification as (ell+ (ithout any modifications# 8e simply

    set the output layer si-e to the num*er of classes (e (ant to recogni-e# After the

    hypothesis is calculated+ (e pic& the one (ith the ma)imum !alue#
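    As a minimal illustration of this last step (the helper below is ours, not part of the application's source), picking the predicted class amounts to finding the index of the largest output activation:

        // Returns the index of the largest output activation, i.e. the predicted class (illustrative sketch).
        static int predictClass(double[] outputActivations) {
            int best = 0;
            for (int i = 1; i < outputActivations.length; i++) {
                if (outputActivations[i] > outputActivations[best]) {
                    best = i;
                }
            }
            return best;
        }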

    A nonlinear activation function is required for the network to be able to separate non-linearly separable data instances. This, along with the mentioned algorithms, will be explained in the next chapters.

    !7" E>isti'+ A99li&ati#'s

    Handwritten character recognition is currently used extensively in OCR and ICR systems. These are used for various purposes, some of which are listed below.

    !7"7! F6ll-)at6r)* D#&6)'t OCR a'* ICR S0st)s

    ABBYY FineReader by the ABBYY company is a piece of software with worldwide recognition that deals with OCR and ICR systems, as well as applied linguistics [11]. The company has also developed a business card reader mobile application that uses a


    smartphone's camera for text recognition to import contact information [12]. The application is, among others, also available for the Android platform.

    Tesseract-ocr is an OCR engine developed at HP Labs between 1985 and 1995. It is claimed [13] that this engine is the most accurate open source OCR engine available, supporting a wide variety of image formats and over 60 languages. It is free and open source software, licensed under the Apache License 2.0.

    Google Goggles is an image recognition Android and iOS application, featuring searching based on pictures taken by compatible devices and using character recognition for some use cases [14].

    !7"72 I'96t )t$#*s"icrosoft has *een supporting a ta*let hand(riting*ased input method since the release of

    8indo(s OP Ta*let P4 Edition 6157# This allo(s users of de!ices (ith this platform to

    (rite te)t using a digiti-ing ta*let+ a touch screen+ or a mouse+ (hich is con!erted into te)t

    that can *e used in most applications running on the platform#

    9oogle Translate+ a machine translation Android application from 9oogle+ features

    hand(riting recognition as an input method+ as (ell as translating directly from the camera

    61M7# This closely resem*les a possi*le e)tension of our (or& in the future#


    2. Learning Model in Detail

    In this chapter, we explain in detail the model that has been used in the pro...


    In binary classification, using a single output neuron is recommended [9]. Here, the hypothesis is typically a real value. A threshold is then used to determine the predicted class.

    In multiclass problems, the size of the output layer is typically equal to the number of classes. Thus, the output data is represented as a vector of real values. The predicted class is the element with the maximum value.

    In general, however, we assume the output layer is always a real-valued vector:

        h_\Theta(x) = a^{(L)} \in \mathbb{R}^{s_L}        for output layer L    (2.1)

    To perform supervised learning, a set of labels (classes) has to be provided. We represent these as a vector of the same size as the output vector:

        y \in \{0, 1\}^{s_L}    (2.2)
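    For illustration, such a label vector can be produced from a class index as in the following sketch (a hypothetical helper we introduce for clarity, not code from the thesis):

        // Expands a class index into a vector of the output layer's size with a single 1 at the labeled position.
        static double[] toLabelVector(int classIndex, int outputLayerSize) {
            double[] y = new double[outputLayerSize]; // all elements are 0 by default
            y[classIndex] = 1.0;                      // mark the labeled class
            return y;
        }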

    Figure 7: The architecture of a multilayer neural network. Layer 1 is the input layer; layer 2 is a hidden layer; layer 3 is the output layer. x1, x2, and x3 are features fed to the network; a1, a2, and a3 are hidden layer units; h(x) is the output value (hypothesis).


    The weights of a neuron are represented as real-valued matrices:

        \Theta^{(l)} =
        \begin{bmatrix}
        \Theta^{(l)}_{1,1} & \Theta^{(l)}_{1,2} & \cdots & \Theta^{(l)}_{1,s_l+1} \\
        \Theta^{(l)}_{2,1} & \Theta^{(l)}_{2,2} & \cdots & \Theta^{(l)}_{2,s_l+1} \\
        \vdots & \vdots & \ddots & \vdots \\
        \Theta^{(l)}_{s_{l+1},1} & \Theta^{(l)}_{s_{l+1},2} & \cdots & \Theta^{(l)}_{s_{l+1},s_l+1}
        \end{bmatrix}        for layer l < L    (2.3)

    Here, l is any layer except the output layer. Using our notation, Θ(l) is the matrix of weights corresponding to the connection mapping from layer l to layer l + 1. s_l is the number of units in layer l, and s_{l+1} is the number of units in layer l + 1. Thus, the size of Θ(l) is [s_{l+1}, s_l + 1].

    The additional neuron that is included in layer l is the bias neuron. Usually marked as x_0 or a_0(l), the bias is a very important element in the network. It can alter the shift of the activation function along the x axis. The bias neuron is only connected to the next layer; it has no input connections.

    Note that a row in the weights matrix represents connections from all of the neurons in layer l to a single neuron in layer l + 1. Conversely, a column in the matrix represents connections from a single neuron in layer l to all of the neurons in layer l + 1.

    2.2 Hypothesis

    Throughout this text, we have referred to the hypothesis several times. It is the prediction of a class, the output value of a classifier. As mentioned in chapter 1, in order to enable the network to solve complex nonlinear problems, the use of a nonlinear activation function is required.

    In many cases, the sigmoid activation function is used:

        g(z) = \frac{1}{1 + e^{-z}}, \quad z \in \mathbb{R}    (2.4)

    The range of the sigmoid function is (0, 1), which is therefore also the range of the elements in the output layer.


    An activation of a neuron in a layer is computed as a sigmoid function of the linear combination of the weights vector corresponding to the neuron and the activations of all connected neurons from the previous layer. For convenience, we define the input layer neuron vector as

        x = a^{(1)}    (2.5)

    Using (2.4) and (2.5), we generalize the computation of a neuron's activation in a vectorized form as

        a^{(l)} = g(\Theta^{(l-1)} a^{(l-1)})        for layer l > 1    (2.6)

    Here, the sigmoid function is applied element-wise to the product of the weights and the connected neurons from the previous layer, therefore a^{(l)} \in \mathbb{R}^{s_l}.

    It may be intuitive to go ahead and use (2.6) recursively to compute the overall hypothesis of the network, but as we are assuming the bias neuron in the architecture, it needs to be added to the vector of activations in layer l in each step.

    The process of determining the value of the hypothesis in the described way is called forward propagation. The algorithm, broken up into steps, follows (a short code sketch is given after the steps):

    1. Start with the first hidden layer.
    2. Compute the activations in the current layer using (2.6).
    3. If the current layer is the output layer, we have reached the hypothesis and end.
    4. Add a bias unit a_0(l) = 1 to the vector of computed activations.
    5. Advance to the next layer and go to step 2.
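    The sketch below is our own illustration of these steps, not the application's code; it assumes weights[k] is the k-th weight matrix of the network and that column 0 of each matrix corresponds to the bias unit.

        // Sketch of forward propagation with a sigmoid activation (illustrative).
        static double sigmoid(double z) {
            return 1.0 / (1.0 + Math.exp(-z));
        }

        static double[] forwardPropagate(double[][][] weights, double[] input) {
            double[] a = input;                          // a(1) = x
            for (int l = 0; l < weights.length; l++) {
                double[] withBias = new double[a.length + 1];
                withBias[0] = 1.0;                       // prepend the bias unit a_0 = 1
                System.arraycopy(a, 0, withBias, 1, a.length);
                double[] next = new double[weights[l].length];
                for (int i = 0; i < next.length; i++) {  // weighted sum followed by the sigmoid
                    double z = 0.0;
                    for (int j = 0; j < withBias.length; j++) {
                        z += weights[l][i][j] * withBias[j];
                    }
                    next[i] = sigmoid(z);
                }
                a = next;                                // advance to the next layer
            }
            return a;                                    // output layer activations = hypothesis
        }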

    27" L)ar'i'+: Ba&%9r#9a+ati#'

    A multilayer perceptron is a supervised learning model. As such, every example in the training set is assigned a label, which is used to compute a cost function (an error measure). As mentioned in the first chapter, learning is an optimization problem that updates free parameters (weights) in order to minimize the cost function.


    There are a number of different cost functions typically used when training multilayer perceptrons. Training is often performed by minimizing the mean squared error, which is a sum of squared differences between computed hypotheses and actual labels. However, the mean non-squared error is also used. To be consistent with [9], we have used a generalization of the cost function used in logistic regression:

        J(\Theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K} \left[ y_k^{(i)} \log\left((h_\Theta(x^{(i)}))_k\right) + (1 - y_k^{(i)}) \log\left(1 - (h_\Theta(x^{(i)}))_k\right) \right]    (2.7)

    In (2.7), m is the number of training examples and K is the total number of possible labels. h_\Theta(x^{(i)}) is computed using the forward propagation described above.

    The cost function above is not always convex; there may be multiple local minima. However, according to [9], it is sufficient to reach a local minimum that is not global. In order to reach a local minimum of the cost function, we use an optimization algorithm, such as gradient descent. Gradient descent is a simple optimization algorithm that converges to a local minimum by taking steps in the weight space iteratively, in so-called epochs. The size of each step is proportional to the negative of the gradient of the cost function at the current point.

    There is an important factor that modifies the descent step size in machine learning: the learning rate. It is a modifier that is used to tune how fast and how accurately the optimization proceeds, and it heavily determines the efficiency of a learning algorithm.

    Figure 8: The effect of the learning rate on gradient descent. In (a...


    Since the gradient is the partial derivative of the cost function with respect to the individual parameters, the change of parameters in a single gradient descent step is performed as:

        \Theta := \Theta - \alpha \frac{\partial J(\Theta)}{\partial \Theta}    (2.8)

    In Figure 8 we can see the effect of a wrong choice of the learning rate. There are advanced ways of determining the right learning rate value, but it is usually sufficient to determine it empirically by applying various learning rates to the learning algorithm and picking the one with the minimum error.
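    As a minimal sketch of a single step of (2.8) (illustrative only; the gradient itself is obtained by backpropagation, described next):

        // One gradient descent step over a flattened weight vector (illustrative sketch).
        static void gradientDescentStep(double[] weights, double[] gradient, double alpha) {
            for (int i = 0; i < weights.length; i++) {
                weights[i] -= alpha * gradient[i]; // move against the gradient, scaled by the learning rate
            }
        }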

    Obtaining the gradient in a multilayer perceptron is not trivial and is done in several steps. As each neuron has its own activation and weighted connections, it takes its own part in the cost function. To propagate the error measured on the output layer after a prediction, each neuron's weights need to be updated differently. To achieve this, we introduce the concept of an error term δ_j(l), representing the error of node j in layer l.

    To obtain the error terms for all neurons in the network except the input layer (as there is no error in the input data), we do the following. Given an instance of input data x, forward propagation is performed to determine the hypothesis. With the input label y_j, starting at the end of the network, we calculate the error terms for the output layer per neuron j:

        \delta_j^{(L)} = a_j^{(L)} - y_j    (2.9)

    Note that the output activation a_j^{(L)} is a part of the hypothesis as shown in (2.1).

    We then propagate the error to lower layers:

        \delta^{(l)} = (\Theta^{(l)})^T \delta^{(l+1)} \; .* \; g'(z^{(l)})        for layer 1 < l < L    (2.10)

    where g' is the gradient of the sigmoid function and z^{(l)} is a vector of the linear combinations of all neurons and their respective weights in layer l - 1:

        z^{(l)} = \Theta^{(l-1)} a^{(l-1)}        for layer 1 < l \le L    (2.11)

    Using (2.11), it can be shown that the sigmoid gradient is

        g'(z^{(l)}) = a^{(l)} \; .* \; (1 - a^{(l)})        for layer 1 < l \le L    (2.12)

    therefore (2.10) can be expressed using (2.12) as:


        \delta^{(l)} = (\Theta^{(l)})^T \delta^{(l+1)} \; .* \; \left( a^{(l)} \; .* \; (1 - a^{(l)}) \right)        for layer 1 < l < L    (2.13)

    Collecting the error terms is essential for the computation of the partial derivatives of the network. We will now describe the overall process of obtaining the partial derivatives, called backpropagation, in the following pseudocode. Note that in this pseudocode, matrix elements are indexed from 0, as opposed to the mathematical notation, where indexing starts at 1.

    01| We are given a training set {(x(1), y(1)), ..., (x(m), y(m))}
    02| Set Δi,j(l) := 0 for all l, i, j
    03| For i = 1 to m
    04|     Set a(1) := x(i)
    05|     Perform forward propagation to compute a(l) for l = 2, 3, ..., L
    06|     Using y(i), compute δ(L) = a(L) − y(i)
    07|     Compute δ(L−1), δ(L−2), ..., δ(2)
    08|     Set Δ(l) := Δ(l) + δ(l+1) (a(l))T
    09| Di,j(l) := (1/m) Δi,j(l)

    Now, the D(l) term is equal to the following:

        D^{(l)} = \frac{\partial J(\Theta)}{\partial \Theta^{(l)}}        for layer l < L    (2.14)

    Using these partial derivatives, we can perform gradient descent to minimize the cost function and thus enable the algorithm to make predictions on new data.

    It is important to note that before using backpropagation to learn the parameters, the weights should be initialized to random small numbers between 0 and 1. The weights must not be initialized to zero, otherwise the individual weight updates will be constant for all weights and the minimization algorithm will fail to converge to a local minimum.
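    A possible initialization is sketched below (the epsilon bound is our own illustrative choice; the text only requires small random non-zero values):

        // Fills a weight matrix with small random values to break symmetry (illustrative sketch).
        static void randomInitialize(double[][] theta, double epsilon) {
            java.util.Random rnd = new java.util.Random();
            for (int i = 0; i < theta.length; i++) {
                for (int j = 0; j < theta[i].length; j++) {
                    theta[i][j] = rnd.nextDouble() * epsilon; // uniform random value in [0, epsilon), epsilon < 1
                }
            }
        }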

    2.4 Learning: Resilient Backpropagation

    Resilient backpropagation (RPROP) is an efficient optimization algorithm proposed by Riedmiller and Braun in 1993 [4]. It is based on the principle of gradient descent used with pure backpropagation. Instead of updating the weights of a network with a fixed learning rate


    that is constant for all weight connections, it performs a direct adaptation of the weight step using only the sign of the partial derivative, not its magnitude. As such, it overcomes the difficulty of setting the right learning rate value.

    For each weight, an individual weight step size Δi,j is introduced (not to be confused with the symbol used when accumulating the error in pure backpropagation). Given epoch t > 0 and considering Θ as a weight matrix for a single layer (this can be applied to any valid layer), this value evolves during the learning process according to the following rule:

        \Delta_{i,j}^{(t)} =
            \eta^{+} \cdot \Delta_{i,j}^{(t-1)}     if  \frac{\partial J(\Theta)}{\partial \Theta_{i,j}}^{(t-1)} \cdot \frac{\partial J(\Theta)}{\partial \Theta_{i,j}}^{(t)} > 0
            \eta^{-} \cdot \Delta_{i,j}^{(t-1)}     if  \frac{\partial J(\Theta)}{\partial \Theta_{i,j}}^{(t-1)} \cdot \frac{\partial J(\Theta)}{\partial \Theta_{i,j}}^{(t)} < 0
            \Delta_{i,j}^{(t-1)}                    otherwise    (2.15)

    The change of the sign of the partial derivative across two epochs indicates that the local minimum of the cost function has been missed ...


    ... it has been shown [4] that the choice of this value is not critical.

    Empirically, reasonable values for η−, η+, and Δ0 are 0.5, 1.2, and 0.1, respectively [4].

    In the original paper [4] it was suggested that the previous update step be reverted if the sign of the gradient changes (the minimum was missed). This is called backtracking. However, in [5], an RPROP variation without backtracking has been proposed, simply leaving this step out, as it is not crucial in the optimization process and is easier to implement.
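    The following sketch shows the per-weight update for one epoch under these rules, in a variant without backtracking; the names, the step-size bounds, and the zeroing of the gradient after a sign change follow one common formulation and are our own illustration, not the application's code:

        // Sketch of an RPROP epoch for a single weight matrix, without backtracking (illustrative).
        // theta: weights, grad: current gradient, prevGrad: gradient from the previous epoch,
        // step: per-weight update sizes Delta(i,j). All arrays have the same shape.
        static void rpropUpdate(double[][] theta, double[][] grad, double[][] prevGrad,
                                double[][] step, double etaPlus, double etaMinus,
                                double stepMin, double stepMax) {
            for (int i = 0; i < theta.length; i++) {
                for (int j = 0; j < theta[i].length; j++) {
                    double change = prevGrad[i][j] * grad[i][j];
                    if (change > 0) {          // same sign: accelerate the step size
                        step[i][j] = Math.min(step[i][j] * etaPlus, stepMax);
                    } else if (change < 0) {   // sign flipped: the minimum was missed, decelerate
                        step[i][j] = Math.max(step[i][j] * etaMinus, stepMin);
                        grad[i][j] = 0;        // suppress the update for this weight in this epoch
                    }
                    theta[i][j] -= Math.signum(grad[i][j]) * step[i][j]; // move by the step in the sign direction
                    prevGrad[i][j] = grad[i][j];
                }
            }
        }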

    2.5 Bias, Variance

    High bias and high variance, also called underfitting and overfitting, respectively, are common causes of unsatisfying learning algorithm performance.

    If a learning model is presented with input with too many complex features, the learned hypothesis may fit the training set very well (J(Θ) ≈ 0), but may fail to generalize the prediction to new examples. In this case, overfitting may be observed. On the other hand, if a classifier generalizes the hypothesis to an overly simple form, the error is usually high on both the training set and new examples, which is caused by underfitting.

    Figure 9: High bias, high variance. On the left, a learning algorithm underfits the training examples and will likely fail at predictions on unseen examples; in the center, the algorithm fits the examples "just right"; on the right, the algorithm overfits the examples, fitting the training set, but will likely fail at predictions on unseen examples.


    To address overfitting, we may do one of the following:

    1. Reduce the number of features
       - Manually select which features to keep
       - Use a dimensionality reduction algorithm, such as principal component analysis (not covered in this work)
    2. Get more examples (may help in some cases)
    3. Apply regularization
       - Decreases the values of free parameters for better generalization
       - Used when all features contribute to a successful hypothesis

    In regularization, a regularization parameter λ is used to penalize free parameters (weights in an MLP). This is a real scalar value that takes part in the cost function and optimization functions in order to affect the choice of free parameters and help with the high variance problem. In the learning process, when λ > 0, the machine learning algorithm's parameters are reduced; when λ < 0, the parameters are increased; and when λ = 0, no regularization is performed. That being stated, we choose a high value for the regularization parameter to avoid high variance and lower it in the case of high bias, because setting a regularization parameter that is too large may itself be the cause of high bias.

    To add regularization to the multilayer perceptron algorithms, we must modify (2.7):

        J(\Theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K} \left[ y_k^{(i)} \log\left((h_\Theta(x^{(i)}))_k\right) + (1 - y_k^{(i)}) \log\left(1 - (h_\Theta(x^{(i)}))_k\right) \right]
                   + \frac{\lambda}{2m} \sum_{l=1}^{L-1} \sum_{i=1}^{s_{l+1}} \sum_{j=2}^{s_l+1} \left( \Theta_{i,j}^{(l)} \right)^2    (2.18)

    In effect, this adds a condition to line 9 of the backpropagation pseudocode:

    09| Di,j(l) := (1/m) Δi,j(l)                     if j = 0
    10| Di,j(l) := (1/m) (Δi,j(l) + λ Θi,j(l))       if j ≠ 0

    As shown in (2.18) and in line 9, we do not add the regularization term for the bias units, therefore we skip the first column of the weight matrix.
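    A sketch of lines 09 and 10 in code (illustrative; 0-indexed with column 0 being the bias column, as in the pseudocode):

        // Converts accumulated error terms into regularized partial derivatives (illustrative sketch).
        static double[][] regularizedGradient(double[][] delta, double[][] theta, int m, double lambda) {
            double[][] d = new double[delta.length][];
            for (int i = 0; i < delta.length; i++) {
                d[i] = new double[delta[i].length];
                for (int j = 0; j < delta[i].length; j++) {
                    d[i][j] = delta[i][j] / m;
                    if (j != 0) {                       // skip the bias column
                        d[i][j] += (lambda / m) * theta[i][j];
                    }
                }
            }
            return d;
        }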


    "7 S#l6ti#' D)s&ri9ti#'

    "7! F6'&ti#'al S9)&i-i&ati#'As mentioned in the aim+ there are four main reYuirements of the Android application that

    has *een de!eloped:

    a*ility to recogni-e characters (ritten using the touchsensiti!e display+

    a*ility to recogni-e characters gi!en an image or camera frames as input+

    a*ility to learn progressi!ely *ased on user's feed*ac&

    a*ility to learn ne(+ pre!iously unseen characters#

    8e (ill no( descri*e these reYuirements in detail+ listed *elo(#

    [01] The application provides means to enter a "touch mode" screen. Here, the user can draw characters freely by hand.

    [01.1] When the user is done drawing, the drawing is recognized and the prediction is shown to the user.

    [01.2] The drawing, along with the predicted label, can be saved to a persistent location.

    [01.3] The user can provide feedback on the prediction, signaling whether the prediction was correct, or making a correction to the prediction, performing online learning.

    [02] The application provides means to enter a "camera mode" screen. Here, the device's camera is used to present its input to the screen.

    [02.1] After showing the camera frame, it is analyzed and found patterns are recognized and shown to the user.

    [02.2] The process in [02] and [02.1] is done continuously as the camera frames are updated over time.

    [02.3] The user can provide feedback on the prediction, signaling whether the prediction was correct, or making a correction to the prediction, performing online learning.

    [03] The application provides means to enter an "image mode" screen. Here, the user


    can load an image file present on the device, which is then shown on the screen.

    [03.1] After showing the image, it is analyzed and found patterns are recognized and shown to the user.

    [04] The user can see the list of all learned characters.

    [05] The user can add a new character and train it using the touch mode as described in [01].

    [06] The user can perform offline learning on the current persistent data at any time.

    3.2 Plan of Solution

    Here we are going to describe the choices that needed to be made before starting to implement the solution.

    "727! R)+'iti#' Pr#&)ss Pi9)li')s

    So far, we have named and described several algorithms dealing with image preprocessing, learning, and optimization. These are thought of as building blocks when designing the recognition process pipeline that would satisfy the given requirements, and they are often used in sequence, one providing input for another. As briefly mentioned in the overview, we have considered a process pipeline for recognition based on touch input, and a separate pipeline for recognition based on static image input.

    The recognition pipeline used to recognize characters in the touch input mode follows (step 5 is sketched in code after the list):

    1. Acquire the handwritten character image as a grayscale bitmap.
    2. Resize this bitmap to 20x20 pixels.
    3. Acquire a binary bitmap of points where each stroke has started and ended.
    4. Resize this bitmap to 20x20 pixels.
    5. Unroll the bitmap matrices into a feature vector of 800 elements.
    6. Feed this vector to a trained multilayer perceptron, giving us the prediction.
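    A minimal sketch of step 5 (our own illustration, not the application's code): the two 20x20 bitmaps are flattened and concatenated into a single 800-element feature vector.

        // Unrolls the drawing bitmap and the stroke-end bitmap into one feature vector (illustrative sketch).
        static double[] toFeatureVector(double[][] drawing, double[][] strokeEnds) {
            int size = 20;                                   // both bitmaps are 20x20
            double[] features = new double[2 * size * size]; // 800 elements in total
            int k = 0;
            for (int y = 0; y < size; y++) {
                for (int x = 0; x < size; x++) {
                    features[k++] = drawing[y][x];
                }
            }
            for (int y = 0; y < size; y++) {
                for (int x = 0; x < size; x++) {
                    features[k++] = strokeEnds[y][x];
                }
            }
            return features;
        }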

    The bitmaps have been chosen to be grayscale and to have a small resolution because it is sufficient to perform correct prediction and the feature vector size is small enough to make


    the learning and prediction computationally feasible. Similar bitmap sizes have been chosen in related problems by Ng [9] and LeCun et al. [1].

    As we have more data available in the touch mode than a pure image bitmap, we have also decided to collect the bitmap of stroke end points to be able to better distinguish characters such as '8' and 'B', as mentioned in the overview. The resized bitmaps of these characters are often similar, but the writing style of each is usually different. By providing this extra bitmap with each example, we are giving the neural network classifier a hint about what features to focus on when performing automatic feature extraction with the hidden layer.

    The pipeline for recognition based on an image or a camera frame is different (steps 2-6 are sketched in code after the list):

    1. Acquire the image bitmap in grayscale colors.
    2. Apply a median filter to the bitmap.
    3. Segment the bitmap using thresholding to get a binary bitmap.
    4. Find the bounding boxes of external contours in the bitmap.
    5. Extract sub-bitmaps from the bounding boxes.
    6. Resize the sub-bitmaps to 20x20 pixels.
    7. Unroll the sub-bitmap matrices into feature vectors of 400 elements each.
    8. Feed each feature vector to a trained multilayer perceptron, giving us predictions.
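    Steps 2-6 can be sketched with the OpenCV for Android Java API, which the application uses for preprocessing; the filter and threshold parameters below are illustrative choices of ours, not the values from the thesis:

        import org.opencv.core.*;
        import org.opencv.imgproc.Imgproc;
        import java.util.ArrayList;
        import java.util.List;

        // Sketch of image-mode preprocessing: blur, threshold, find contours, crop and resize (illustrative).
        static List<Mat> extractCharacterBitmaps(Mat grayscale) {
            Mat blurred = new Mat();
            Imgproc.medianBlur(grayscale, blurred, 3);                 // remove noise while preserving edges
            Mat binary = new Mat();
            Imgproc.threshold(blurred, binary, 0, 255,
                    Imgproc.THRESH_BINARY_INV | Imgproc.THRESH_OTSU);  // segment characters from background
            List<MatOfPoint> contours = new ArrayList<>();
            Imgproc.findContours(binary.clone(), contours, new Mat(),
                    Imgproc.RETR_EXTERNAL, Imgproc.CHAIN_APPROX_SIMPLE); // external contours only
            List<Mat> characters = new ArrayList<>();
            for (MatOfPoint contour : contours) {
                Rect box = Imgproc.boundingRect(contour);              // bounding box of one pattern
                Mat resized = new Mat();
                Imgproc.resize(binary.submat(box), resized, new Size(20, 20));
                characters.add(resized);
            }
            return characters;
        }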

    Our intention is to produce a similar input for the network as in the case of touch input, only without the stroke end bitmaps, as we do not possess this information in this mode. With this approach, we are able to reuse bitmaps produced in touch input to train a network that operates on images.

    When preprocessing the image, we apply a median blur to remove noise. The median blur was preferred to other blur filters because of its property of preserving edges, which are important in character recognition.

    Thresholding is a very simple segmentation algorithm that is sufficient to segment out characters from the background and thus binarize the bitmap. This step is required, as the background is extra information we do not need in the recognition process.

    Finding contours in the image and finding the bounding boxes of each pattern allows us to


    detect more characters in an image at once. Given the bounding boxes, we are able to extract bitmaps of the individual characters (or other patterns) and then perform the rest of the processes ...


    "727" O--li') a'* O'li') L)ar'i'+

    Offline and online learning have already been defined in the overview chapter as eager and lazy learning mechanisms, respectively. As the specification denotes, both mechanisms have to be implemented in our application in order to be able to learn progressively based on the user's feedback, as well as to perform offline learning on persistent data at any time.

    The preferred algorithm for offline learning is RPROP, because of the described advantages over pure backpropagation. It is evident the algorithm is superior to backpropagation and other learning algorithms, as shown in (Figure 10).

    However, RPROP is not an online learning algorithm; it would not make sense to perform a single learning iteration, because no weight update acceleration or deceleration occurs. In effect, using RPROP for online learning would be equal to using backpropagation if the initial weight update value of RPROP were equal to the learning rate of backpropagation. Therefore, pure backpropagation has been chosen to perform online learning.

    For backpropagation, we have set the learning rate to 0.3. For RPROP, the initial weight update size is 0.01, η− has been set to the traditional value of 0.5, and η+ is equal to 1.2. After testing the performance of the algorithms, we have decided not to use regularization at all (the regularization parameter is 0). Both backpropagation and RPROP perform 100 optimization epochs.

    Figure 10: Average number of required epochs in different learning problem scenarios. It is apparent that RPROP is superior to backpropagation (BP...


    "727 Us)* T)&$'#l#+i)s

    We have developed a native Android application using the Java SE Runtime Environment 6, the Java programming language, and the Android SDK. This is the traditional way to build Android applications and does not rely on a third party. The application has been targeted at Android 4.2.2; however, Android 2.2 and up should be supported.

    Having explained the algorithms used in multilayer perceptrons, we have decided to implement machine learning on the matrix level, only using a library for matrix manipulations and avoiding existing machine learning libraries. For this, we have used a subset of the JAMA Java package that deals with matrices.
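    As an illustration of this matrix-level approach, the sketch below uses the public JAMA API; the element-wise sigmoid loop is our own addition, not a JAMA feature, and the method is not taken from the application's source:

        import Jama.Matrix;

        // Computes the activations of the next layer from weights and previous activations using JAMA (sketch).
        static Matrix nextLayerActivations(Matrix theta, Matrix aWithBias) {
            Matrix z = theta.times(aWithBias);        // z(l) = Theta(l-1) * a(l-1)
            double[][] values = z.getArray();         // JAMA returns the internal array, so edits update z
            for (int i = 0; i < values.length; i++) {
                for (int j = 0; j < values[i].length; j++) {
                    values[i][j] = 1.0 / (1.0 + Math.exp(-values[i][j])); // element-wise sigmoid
                }
            }
            return z;
        }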

    Because image preprocessing is not the primary focus of this work, we haven't implemented these algorithms from scratch; instead, we have used the OpenCV library for Android. This library contains algorithms for computer vision in general. Omitting its machine learning capabilities, we have used it to perform the image preprocessing steps in the image and camera modes of the application.

    To prototype, test, and evaluate algorithms, we have used the GNU Octave language. This approach allowed us to produce statistics and plots useful for configuring the network architecture and debugging the algorithms.


    4. Implementation

    In this chapter, we will describe how the Android application requirements have been satisfied, document the implementation, describe the problems we have run into and how they were solved, and provide a brief user guide in the process.

    4.1 Android Application Implementation

    The main entry point of the application is a navigation activity. Being one of the four main building blocks of Android applications, activities are single, focused things that the user can do [17]. They usually appear as windows containing user interface elements that the user can interact with. Our navigation activity contains buttons to access the other activities: input mode, camera mode, image mode, and character list. Thus, this activity is required in order to be able to satisfy [01], [02], [03], and [04].

    The touch mode activity is designed to implement the features required in [01]. The main piece of user interface it contains is a custom view ob...


    By default, each time a prediction is made, the drawing and the stroke end bitmaps are saved to external storage as image files in pairs, so that the perceptron weights can be removed and offline learning can be performed. This behavior satisfies [01.2] and can be turned off in the application settings to save memory when no learning, or only online learning, is required.

    The camera mode contains an OpenCV user interface element tailored for showing and processing camera frames in Android. When the user enters the activity, they are immediately presented with continuously updated camera images, which are processed according to the image mode recognition pipeline as described. Individual contours that are likely to be characters are found and marked with rectangles. A predicted label is shown above each such rectangle. In practice, on a Nexus S device with a single-core 1 GHz CPU (the GPU is, unfortunately, not used by the implementation of the OpenCV library, which would have likely performed better) and a camera resolution of 720x480 pixels, the updating frequency is approximately 2 frames per second. This depends on the number of ob...


    When the user enters the image mode, they are instructed to start by opening an image file. The system then shows all possible applications that can handle this request; a default gallery application is preferred, as these usually implement the behavior of providing data correctly; however, other applications, such as file managers, may also be used. The loaded image file is processed according to the image mode recognition pipeline using the same techniques as in camera mode, and is then shown to the user. The segmentation control views are also present. Because this activity would share a lot of source code with the camera activity, methods that could be reasonably separated from the classes have been moved to a public class and declared as static.

    Unlike the camera activity, which is constrained to the landscape screen orientation, we wanted to make this activity flexible. It can be rotated, while its state is retained in a fragment, which represents a portion of a user interface or a behavior of an activity [17].

    The last activity from the four navigation elements is a character list that satisfies [04]. It shows a grid view of learned characters, filling the items with saved bitmaps if present. When the user selects an item in the grid view, they are taken into the touch mode activity that is modified to focus on training the single character label. The difference is that no feedback dialog is shown to the user; it is assumed that the character the user selected in the character list is always the anticipated label, and online learning is performed after each prediction. Otherwise, the activity is unchanged; the examples are still saved to external storage if set to do so in the settings, and the network state is saved on appropriate events.

    The character list activity also contains a menu item for adding a new character, as required in [05]. After being selected, the user is asked to assign a label to the new character and is then taken into the touch input activity to train the new character as described above. This action adds the character to the list of known characters and modifies the structure of the perceptron used for touch input recognition. In particular, the output layer size is increased by one (to be able to predict the new class) and weight connections to this layer are ad...


    ... using online learning in the related activities, or by performing offline learning in the case that the bitmaps for the new character have been saved.
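    A minimal sketch of that structural change (our own illustration; the application itself stores weights in JAMA matrices): the last weight matrix gains one randomly initialized row, so the network can emit one more output.

        // Adds one output neuron by appending a randomly initialized row to the last weight matrix (sketch).
        static double[][] addOutputClass(double[][] lastLayerWeights) {
            int rows = lastLayerWeights.length;
            int cols = lastLayerWeights[0].length;
            double[][] grown = new double[rows + 1][cols];
            for (int i = 0; i < rows; i++) {
                System.arraycopy(lastLayerWeights[i], 0, grown[i], 0, cols); // keep existing classes untouched
            }
            java.util.Random rnd = new java.util.Random();
            for (int j = 0; j < cols; j++) {
                grown[rows][j] = rnd.nextDouble() * 0.1;  // small random weights for the new class
            }
            return grown;
        }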

    So far, we have mentioned the application settings only in relation to the option of saving character bitmaps. Settings are another activity that the user can access from any other activity at any time. Here, besides the saving option, the user is able to pick the error minimization algorithm used for offline learning: backpropagation or RPROP, the latter being the default. The initial learning rate used to perform the online learning iteration (using backpropagation) is 0.3, but if the user feels the need to ad...


    ... backlight are allowed to be turned off, but the CPU is kept on until all partial wake locks have been released [17].

    4.2 Android Application Source Code

    The application source code is composed of Java classes arranged in packages. Here, we will take a brief look at the packages and describe the classes. For the complete source code documentation, along with the source code of the Android application and the GNU Octave scripts, see the content included on the enclosed DVD.

    The list of packages is as follows:

    1. eu.uhliarik.charrec.gui.activities
    2. eu.uhliarik.charrec.gui.adapters
    3. eu.uhliarik.charrec.gui.dialogs
    4. eu.uhliarik.charrec.gui.fragments
    5. eu.uhliarik.charrec.gui.models
    6. eu.uhliarik.charrec.gui.views
    7. eu.uhliarik.charrec.learning
    8. eu.uhliarik.charrec.services
    9. eu.uhliarik.charrec.utils
    10. eu.uhliarik.mlp
    11. ...


    The models package comprises CharacterGridItemModel, which is a class that holds a bitmap and a label of a character used in the character list.

    We have created two custom views in the views package, namely CharacterGridView and CharacterView. The CharacterGridView class extends GridView and adds a few methods to facilitate the use of the class. CharacterView extends the View class and is the main component of the touch input mode. Its main task is to override the onDraw and onTouchEvent methods to interact with the user and allow them to draw on the view's canvas.
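    A minimal sketch of such a view follows; the class and field names are ours and the drawing logic is simplified, so this is not the application's CharacterView source:

        import android.content.Context;
        import android.graphics.Canvas;
        import android.graphics.Paint;
        import android.graphics.Path;
        import android.view.MotionEvent;
        import android.view.View;

        // Sketch of a drawing view: collects touch strokes into a Path and renders them (illustrative).
        public class DrawingView extends View {
            private final Path path = new Path();
            private final Paint paint = new Paint();

            public DrawingView(Context context) {
                super(context);
                paint.setStrokeWidth(12f);
                paint.setStyle(Paint.Style.STROKE);
            }

            @Override
            protected void onDraw(Canvas canvas) {
                canvas.drawPath(path, paint);                  // render the strokes drawn so far
            }

            @Override
            public boolean onTouchEvent(MotionEvent event) {
                if (event.getAction() == MotionEvent.ACTION_DOWN) {
                    path.moveTo(event.getX(), event.getY());   // a stroke starts here
                } else if (event.getAction() == MotionEvent.ACTION_MOVE) {
                    path.lineTo(event.getX(), event.getY());   // extend the current stroke
                }
                invalidate();                                  // request a redraw
                return true;
            }
        }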

    The eu.uhliarik.charrec.learning package contains the CharacterMlp class, which extends a MultilayerPerceptron in the eu.uhliarik.mlp package and is used throughout the application as the main learning model.

    The Android services described earlier in this chapter can be found in the eu.uhliarik.charrec.services package: TrainingService and TransferService. The former performs offline learning of the neural networks, while the latter manages the transfer of the initial data when the application is used for the first time.

    Code that would otherwise have been shared by several classes has been separated into the FileUtils and ImageUtils classes in the eu.uhliarik.charrec.utils package. Intuitively, FileUtils contains public static methods that deal with saving and loading perceptron data, extracting files from the compressed initial data file, and more. ImageUtils, on the other hand, deals with loading and converting the dataset (character bitmaps), bitmap manipulations, bitmap matrix format conversions, and image preprocessing algorithms.

    Package 10 is placed outside of the eu.uhliarik.charrec package, as it is meant to be usable by any application, not only in our pro...


    5. Results

    Before implementing the learning algorithms in Java, we have tested them as prototypes using GNU Octave. This allowed us to easily collect results on how well the learning algorithms perform. This chapter presents and discusses a comparison of the collected results.

    5.1 Collection Methods

    We have measured the performance of the algorithms used in multilayer perceptrons: backpropagation and resilient backpropagation. We have considered the scenario of recognition from an image, where the dataset consists of only 40 character image bitmaps per character. For this comparison, the datasets are only comprised of the digit characters, therefore the dataset contains 400 examples.

    For relevant values, we have split the dataset into training and validation sets, with the ratio being 7:3. Also, before using the learning algorithms, the dataset has been randomly shuffled.
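    A minimal sketch of this preparation step (illustrative; the example type and the helper are ours):

        import java.util.ArrayList;
        import java.util.Collections;
        import java.util.List;

        // Shuffles the dataset and splits it into a 7:3 training/validation pair (illustrative sketch).
        static List<List<double[]>> shuffleAndSplit(List<double[]> examples) {
            List<double[]> shuffled = new ArrayList<>(examples);
            Collections.shuffle(shuffled);                        // random order before splitting
            int cut = (int) (shuffled.size() * 0.7);              // 70% for training, 30% for validation
            List<List<double[]>> result = new ArrayList<>();
            result.add(new ArrayList<>(shuffled.subList(0, cut)));
            result.add(new ArrayList<>(shuffled.subList(cut, shuffled.size())));
            return result;
        }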

    The error rates have been obtained using the logarithmic cost function shown in (2.7).

    The configuration of the learning model whose results are presented here is:

    The regularization parameter is 0.
    The number of epochs is 100.
    In backpropagation, the learning rate is 0.3.
    In resilient backpropagation, η−, η+, and Δ0 are 0.5, 1.2, and 0.01, respectively.
    The perceptron architectures are as described in the plan of solution.

    This configuration has been chosen based on recommendations from various resources that have been referred to in the explanation of the individual model characteristics, and on testing, the results of which would be too long to fit in this chapter.


    5.2 Result Comparison

    We have measured the error of the backpropagation and RPROP algorithms on the training and validation sets. This has been tested using fractions of the dataset of various sizes, and a learning curve has been plotted. A learning curve represents the error as a function of the dataset size and is a perfect tool to visualize high bias or variance.

    Figure 11: Learning curve of backpropagation performed on the training and validation sets.


    In the learning curves, no significant overfitting or underfitting is apparent. We can see that the RPROP algorithm manages to converge to a better minimum in the given 100 epochs than backpropagation. This is caused by the advantages of the RPROP algorithm over pure backpropagation that we explained earlier in this work. Table 1 confirms these findings.

    Algorithm    Training set error    ...

    Appendix B: Android Application Screenshots

    Screenshot 1: Android application navigation screen

    Screenshot 2: Android application touch mode screen


    S&r))'s$#t ": A'*r#i* a99li&ati#' &a)ra #*)

    s&r))'