
    COMENIUS UNIVERSITY IN BRATISLAVA

    FACULTY OF MATHEMATICS, PHYSICS AND INFORMATICS

    HANDWRITTEN CHARACTER RECOGNITION USING

    MACHINE LEARNING METHODS

    Bachelor's Thesis

    Study Program: Applied Informatics

    Branch of Study: 2511 Applied Informatics

    Educational Institution: Department of Applied Informatics

    Supervisor: Mgr. Ľudovít Malinovský

    Bratislava, 2013

    Ivor Uhliarik


    Acknowledgement

    I would like to express my sincere gratitude to my supervisor Mgr. Ľudovít Malinovský for invaluable consultations, interest, initiative, and continuous support throughout the duration of writing this thesis. I would also like to thank my family, fellow colleagues and all the people that have supported me. Finally, I would like to thank Martin Boze and the rest of the n'c community for listening to my rants and all the witty remarks.


    Declaration on Word of Honor

    I declare that this thesis has been written by myself using only the listed references and consultations provided by my supervisor.

    Bratislava, ........................ 2013

    ..................................
    Ivor Uhliarik


    Abstract

    The aim of this work is to review existing methods for the handwritten character recognition problem using machine learning algorithms and to implement one of them in a user-friendly Android application. The main tasks the application provides a solution for are handwriting recognition based on touch input, handwriting recognition from live camera frames or a picture file, learning new characters, and learning interactively based on the user's feedback. The recognition model we have chosen is a multilayer perceptron, a feedforward artificial neural network, especially because of its high performance on non-linearly separable problems. It has also proved powerful in OCR and ICR systems [1], which could be seen as a further extension of this work. We evaluated the perceptron's performance and configured its parameters in the GNU Octave programming language, after which we implemented the Android application using the same perceptron architecture, learning parameters, and optimization algorithms. The application was then tested on a training set consisting of digits, with the ability to learn alphabetical or different characters.

    Keywords: Character recognition, Multilayer perceptron, Backpropagation, RPROP, Image processing


    Abstrakt

    ... recognition from an image file, learning of new, previously unknown characters, and interactive learning based on the user's feedback. As the learning model, we have chosen a feedforward neural network, the multilayer perceptron. The choice was influenced by its high performance on non-linearly separable problems, as well as by the fact that it is often used in OCR and ICR systems [1], which could be considered natural extensions of this work ... parameters and optimization algorithms. We tested the application on a training set of digits, with the possibility of adding and learning alphabetical and other characters.

    Keywords: Character recognition, Multilayer perceptron, Backpropagation, RPROP, Image processing


    Table of Contents

    Introduction
    1. Overview
    1.1 Pro...


    List of Abbreviations and Symbols

    OCR     Optical Character Recognition.
    ICR     Intelligent Character Recognition.
    X       Matrix of all training examples. Each row contains a feature vector of a single example.
    x       Feature vector of a training example.
    y       A vector of labels of training examples.
    L       Number of layers in a network.
    m       Number of training examples.
    Θ       Weights of a neural network. A superscript denotes a layer, subscripts denote the index of an element.
    a(l)    Vector of neuron activations in layer l.
    hΘ(x)   Hypothesis of a classifier with given weights Θ.
    sj      The size of layer j.
    g(z)    Sigmoid function of z.
    J(Θ)    Cost function of weights Θ.
    δ(l)    Vector of error terms for neurons in layer l.
    Δ(l)    Accumulator of error terms for neurons in layer l in the context of the pure backpropagation algorithm.
    Δ(t)    Matrix of weight update sizes for iteration t in the context of RPROP.
    ΔΘ(t)   Matrix of weight change values for iteration t in the context of RPROP.
    Δ0      Initial weight step size in RPROP.
    η+      Acceleration of the weight step size in RPROP.
    η−      Deceleration of the weight step size in RPROP.
    λ       Network regularization parameter.


    Introduction

    Handwritten character recognition is a field of research in artificial intelligence, computer vision, and pattern recognition. A computer performing handwriting recognition is said to be able to acquire and detect characters in paper documents, pictures, touchscreen devices and other sources and convert them into a machine-encoded form. Its application is found in optical character recognition and in more advanced intelligent character recognition systems. Most of these systems nowadays implement machine learning mechanisms such as neural networks.

    Machine learning is a branch of artificial intelligence inspired by psychology and biology that deals with learning from a set of data and can be applied to solve a wide spectrum of problems. A supervised machine learning model is given instances of data specific to a problem domain and an answer that solves the problem for each instance. When learning is complete, the model is able to provide answers with high precision not only for the data it has learned on, but also for yet unseen data.

    Neural networks are learning models used in machine learning. Their aim is to simulate the learning process that occurs in an animal or human neural system. Being one of the most powerful learning models, they are useful in the automation of tasks where the decision of a human being takes too long, or is imprecise. A neural network can be very fast at delivering results and may detect connections between seen instances of data that a human cannot see.

    We have decided to implement a neural network in an Android application that recognizes characters written by hand on the device's touch screen or extracted from the camera and images provided by the device. Having acquired the knowledge that is explained in this text, the neural network has been implemented on a low level, without using libraries that already facilitate the process. By doing this, we evaluate the performance of neural networks on the given problem and provide source code for the network that can be used to solve many different classification problems. The resulting system is a subset of a complex OCR or ICR system; these are seen as possible future extensions of this work.

    In the first chapter, we describe the overall pro...


    approaches+ algorithms and systems of similar nature# ,or o!er!ie(+ (e also *riefly

    e)plain the specific algorithms that ha!e *een used in the implementation of the pro


    1. Overview

    In order to reach the analysis of the used learning model and the specification and implementation of the algorithms, and consequently the Android application, we had to review existing approaches to the problem of character recognition. In this chapter, we describe the pro...


    We understand interactive learning with user feedback as using online machine learning. This should not be confused with online and offline handwriting recognition, which is described above. Online learning is defined as learning one instance of data at a time, expecting feedback (the real label of a character input by a user) to be provided after the neural network's prediction. On the other hand, offline learning performs all of the learning process prior to making any predictions and does not change its prediction hypothesis afterward. The two methods are examples of "lazy" and "eager" learning, respectively.

    Online machine learning makes the system more adaptive to change, such as a change in trends, prices, market needs, etc. In our case, user feedback makes the system able to adapt to a change in handwriting style, perhaps caused by a change of user. We use both offline and online learning in this work. Initially, the user expects from the application to ...


    ... image, but some of them, such as cropping the written character and scaling it to our input size, are also performed in the touch mode.

    Digital capture and conversion of an image often introduces noise, which makes it hard to decide what is actually a part of the ob...


    The task of image segmentation is to split an image into parts with a strong correlation with ob...


    Finally, in both touch-based and image-based recognition in our work, we have used cropping and scaling of the images to a small fixed size.

    1.2.2 Feature Extraction

    Features of input data are the measurable properties of observations which one uses to analyze or classify these instances of data. The task of feature extraction is to choose relevant features that discriminate the instances well and are independent of each other. According to [3], the selection of a feature extraction method is probably the single most important factor in achieving high recognition performance. There is a vast number of methods for feature extraction from character images, each having different characteristics, invariance properties, and reconstructability of characters. [3] states that in order to answer the question of which method is best suited for a given situation, an experimental evaluation must be performed.

    The methods explained in [3] are template matching, deformable templates, unitary image transforms, graph description, pro...


    Fi+6r) ": H#ri#'tal a'* v)rti&al

    9r#8)&ti#' $ist#+ras ?"@7

    As we have mentioned, there is no method that is intrinsically perfect for a given task. Evaluation of such methods would take a lot of time and is not in the scope of this work. Instead, we set our focus on multilayer feedforward neural networks, which can be viewed as a combination of a feature extractor and a classifier [3], the latter of which will be explained shortly.

    In our work, we have used the multilayer perceptron neural network model, which will be described in more depth in the next chapter. For now, we can think of this model as a directed graph consisting of at least 3 layers of nodes.

    The first layer is called the input layer, the last layer is the output layer, and a number of intermediate layers are known as hidden layers. Except for the input layer, the nodes of neural networks are also called neurons or units. Each node of a layer typically has a weighted connection to the nodes of the next layer. The hidden layers are important for feature extraction, as they create an internal abstraction of the data fed into the network. The more hidden layers there are in a network, the more abstract the extracted features are.


    Figure 4: Basic view of the multilayer perceptron architecture. It contains three layers, one of them being a hidden layer. Layers consist of neurons; each layer is fully connected to the next one.

    !727" Classi-i&ati#'

    Classification is defined as the task of assigning labels (categories, classes) to yet unseen observations (instances of data). In machine learning, this is done on the basis of training an algorithm on a set of training examples. Classification is a supervised learning problem, where a "teacher" links a label to every instance of data. A label is a discrete number that identifies the class a particular instance belongs to. It is usually represented as a non-negative integer.

    There are many machine learning models that implement classification; these are known as classifiers. The aim of classifiers is to fit a decision boundary (Figure 5) in feature space that separates the training examples, so that the class of a new observation instance can be correctly labeled. In general, the decision boundary is a hypersurface that separates an n-dimensional space into two partitions, itself being (n-1)-dimensional.



    Figure 5: Visualization of a decision boundary. In a feature space given by x1 and x2, a decision boundary is plotted between two linearly separable classes.

    !727"7! L#+isti& R)+r)ssi#'

    Logistic regression is a simple linear classifier. This algorithm tries to find the decision boundary by iterating over the training examples, trying to fit parameters that describe the decision boundary hypersurface equation. During this learning process, the algorithm computes a cost function (also called an error function), which represents the error measure of its hypothesis (the output value, the prediction). This value is used for penalization, which updates the parameters to better fit the decision boundary. The goal of this process is to converge to parameter values that minimize the cost function. It has been proved [8] that the logistic regression cost function is always convex, therefore the minimization process can always converge to a minimum, thus finding the best fit of the decision boundary this algorithm can provide.

    Until now, we have been discussing binary classification. To apply logistic regression to handwriting recognition, we would need more than 2 distinguishing classes; hence we need multiclass classification. This can be solved by using the one-vs-all approach [9].


    Figure 6: Multiclass classification of three classes as it is split into three subproblems.

    The principle of one-vs-all is to split the training set into a number of binary classification problems. Considering we want to classify handwritten digits, the problem degrades into 10 subproblems, where individual digits are separated from the rest. Figure 6 shows the same for 3 classes of ob...


    !727"72 M6ltila0)r P)r&)9tr#'

    "ultilayer perceptrons Q"LPsR are artificial neural net(or&s+ learning models inspired *y

    *iology# As opposed to logistic regression+ (hich is only a linear classifier on its o(n+ the

    multilayer perceptron learning model+ (hich (e already ha!e mentioned in terms of

    feature e)traction+ can also distinguish data that are not linearly separa*le#

    8e ha!e already outlined the architecture of an "LP+ as seen in Q,igure R#

    In order to calculate the class prediction+ one must perform feedforward propagation# Input

    data are fed into the input layer and propagated further+ passing through (eighted

    connections into hidden layers+ using an activation function#Dence+ the node's activation

    Qoutput !alue at the nodeR is a function of the (eighted sum of the connected nodes at a

    pre!ious layer# This process continues until the output layer is reached#

    The learning algorithm+ backpropagation+ is different from the one in logistic regression#

    ,irst+ the cost function is measured on the output layer+ propagating *ac& to the

    connections *et(een the input and the first hidden layer after(ards+ updating unit (eights#

    "LPs can perform multiclass classification as (ell+ (ithout any modifications# 8e simply

    set the output layer si-e to the num*er of classes (e (ant to recogni-e# After the

    hypothesis is calculated+ (e pic& the one (ith the ma)imum !alue#
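    As a minimal illustration of this last step (the helper below is ours, not part of the application's source), picking the predicted class amounts to finding the index of the largest output activation:

        // Returns the index of the largest output activation, i.e. the predicted class (illustrative sketch).
        static int predictClass(double[] outputActivations) {
            int best = 0;
            for (int i = 1; i < outputActivations.length; i++) {
                if (outputActivations[i] > outputActivations[best]) {
                    best = i;
                }
            }
            return best;
        }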

    A nonlinear activation function is required for the network to be able to separate non-linearly separable data instances. This, along with the mentioned algorithms, will be explained in the next chapters.

    !7" E>isti'+ A99li&ati#'s

    Handwritten character recognition is currently used extensively in OCR and ICR systems. These are used for various purposes, some of which are listed below.

    !7"7! F6ll-)at6r)* D#&6)'t OCR a'* ICR S0st)s

    ABBYY FineReader by the ABBYY company is a piece of software with worldwide recognition that deals with OCR and ICR systems, as well as applied linguistics [11]. The company has also developed a business card reader mobile application that uses a


    smartphone's camera for text recognition to import contact information [12]. The application is, among others, also available for the Android platform.

    Tesseract-ocr is an OCR engine developed at HP Labs between 1985 and 1995. It is claimed [13] that this engine is the most accurate open source OCR engine available, supporting a wide variety of image formats and over 60 languages. It is free and open source software, licensed under the Apache License 2.0.

    Google Goggles is an image recognition Android and iOS application, featuring searching based on pictures taken by compatible devices and using character recognition for some use cases [14].

    !7"72 I'96t )t$#*s"icrosoft has *een supporting a ta*let hand(riting*ased input method since the release of

    8indo(s OP Ta*let P4 Edition 6157# This allo(s users of de!ices (ith this platform to

    (rite te)t using a digiti-ing ta*let+ a touch screen+ or a mouse+ (hich is con!erted into te)t

    that can *e used in most applications running on the platform#

    9oogle Translate+ a machine translation Android application from 9oogle+ features

    hand(riting recognition as an input method+ as (ell as translating directly from the camera

    61M7# This closely resem*les a possi*le e)tension of our (or& in the future#


    2. Learning Model in Detail

    In this chapter, we explain in detail the model that has been used in the pro...


    In binary classification, using a single output neuron is recommended [9]. Here, the hypothesis is typically a real value. A threshold is then used to determine the predicted class.

    In multiclass problems, the size of the output layer is typically equal to the number of classes. Thus, the output data is represented as a vector of real values. The predicted class is the element with the maximum value.

    In general, however, we assume the output layer is always a real-valued vector:

        h_\Theta(x) = a^{(L)} \in \mathbb{R}^{s_L}        for output layer L    (2.1)

    To perform supervised learning, a set of labels (classes) has to be provided. We represent these as a vector of the same size as the output vector:

        y \in \{0, 1\}^{s_L}    (2.2)
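    For illustration, such a label vector can be produced from a class index as in the following sketch (a hypothetical helper we introduce for clarity, not code from the thesis):

        // Expands a class index into a vector of the output layer's size with a single 1 at the labeled position.
        static double[] toLabelVector(int classIndex, int outputLayerSize) {
            double[] y = new double[outputLayerSize]; // all elements are 0 by default
            y[classIndex] = 1.0;                      // mark the labeled class
            return y;
        }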

    Figure 7: The architecture of a multilayer neural network. Layer 1 is the input layer; layer 2 is a hidden layer; layer 3 is the output layer. x1, x2, and x3 are features fed to the network; a1, a2, and a3 are hidden layer units; h(x) is the output value (hypothesis).


    The weights of a neuron are represented as real-valued matrices:

        \Theta^{(l)} =
        \begin{bmatrix}
        \Theta^{(l)}_{1,1} & \Theta^{(l)}_{1,2} & \cdots & \Theta^{(l)}_{1,s_l+1} \\
        \Theta^{(l)}_{2,1} & \Theta^{(l)}_{2,2} & \cdots & \Theta^{(l)}_{2,s_l+1} \\
        \vdots & \vdots & \ddots & \vdots \\
        \Theta^{(l)}_{s_{l+1},1} & \Theta^{(l)}_{s_{l+1},2} & \cdots & \Theta^{(l)}_{s_{l+1},s_l+1}
        \end{bmatrix}        for layer l < L    (2.3)

    Here, l is any layer except the output layer. Using our notation, Θ(l) is the matrix of weights corresponding to the connection mapping from layer l to layer l + 1. s_l is the number of units in layer l, and s_{l+1} is the number of units in layer l + 1. Thus, the size of Θ(l) is [s_{l+1}, s_l + 1].

    The additional neuron that is included in layer l is the bias neuron. Usually marked as x_0 or a_0(l), the bias is a very important element in the network. It can alter the shift of the activation function along the x axis. The bias neuron is only connected to the next layer; it has no input connections.

    Note that a row in the weights matrix represents connections from all of the neurons in layer l to a single neuron in layer l + 1. Conversely, a column in the matrix represents connections from a single neuron in layer l to all of the neurons in layer l + 1.

    2.2 Hypothesis

    Throughout this text, we have referred to the hypothesis several times. It is the prediction of a class, the output value of a classifier. As mentioned in chapter 1, in order to enable the network to solve complex nonlinear problems, the use of a nonlinear activation function is required.

    In many cases, the sigmoid activation function is used:

        g(z) = \frac{1}{1 + e^{-z}}, \quad z \in \mathbb{R}    (2.4)

    The range of the sigmoid function is (0, 1), which is therefore also the range of the elements in the output layer.


    An activation of a neuron in a layer is computed as a sigmoid function of the linear combination of the weights vector corresponding to the neuron and the activations of all connected neurons from the previous layer. For convenience, we define the input layer neuron vector as

        x = a^{(1)}    (2.5)

    Using (2.4) and (2.5), we generalize the computation of a neuron's activation in a vectorized form as

        a^{(l)} = g(\Theta^{(l-1)} a^{(l-1)})        for layer l > 1    (2.6)

    Here, the sigmoid function is applied element-wise to the product of the weights and the connected neurons from the previous layer, therefore a^{(l)} \in \mathbb{R}^{s_l}.

    It may be intuitive to go ahead and use (2.6) recursively to compute the overall hypothesis of the network, but as we are assuming the bias neuron in the architecture, it needs to be added to the vector of activations in layer l in each step.

    The process of determining the value of the hypothesis in the described way is called forward propagation. The algorithm, broken up into steps, follows (a short code sketch is given after the steps):

    1. Start with the first hidden layer.
    2. Compute the activations in the current layer using (2.6).
    3. If the current layer is the output layer, we have reached the hypothesis and end.
    4. Add a bias unit a_0(l) = 1 to the vector of computed activations.
    5. Advance to the next layer and go to step 2.
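    The sketch below is our own illustration of these steps, not the application's code; it assumes weights[k] is the k-th weight matrix of the network and that column 0 of each matrix corresponds to the bias unit.

        // Sketch of forward propagation with a sigmoid activation (illustrative).
        static double sigmoid(double z) {
            return 1.0 / (1.0 + Math.exp(-z));
        }

        static double[] forwardPropagate(double[][][] weights, double[] input) {
            double[] a = input;                          // a(1) = x
            for (int l = 0; l < weights.length; l++) {
                double[] withBias = new double[a.length + 1];
                withBias[0] = 1.0;                       // prepend the bias unit a_0 = 1
                System.arraycopy(a, 0, withBias, 1, a.length);
                double[] next = new double[weights[l].length];
                for (int i = 0; i < next.length; i++) {  // weighted sum followed by the sigmoid
                    double z = 0.0;
                    for (int j = 0; j < withBias.length; j++) {
                        z += weights[l][i][j] * withBias[j];
                    }
                    next[i] = sigmoid(z);
                }
                a = next;                                // advance to the next layer
            }
            return a;                                    // output layer activations = hypothesis
        }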

    27" L)ar'i'+: Ba&%9r#9a+ati#'

    A multilayer perceptron is a supervised learning model. As such, every example in the training set is assigned a label, which is used to compute a cost function (an error measure). As mentioned in the first chapter, learning is an optimization problem that updates free parameters (weights) in order to minimize the cost function.


    There are a number of different cost functions typically used when training multilayer perceptrons. Training is often performed by minimizing the mean squared error, which is a sum of squared differences between computed hypotheses and actual labels. However, the mean non-squared error is also used. To be consistent with [9], we have used a generalization of the cost function used in logistic regression:

        J(\Theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K} \left[ y_k^{(i)} \log\left((h_\Theta(x^{(i)}))_k\right) + (1 - y_k^{(i)}) \log\left(1 - (h_\Theta(x^{(i)}))_k\right) \right]    (2.7)

    In (2.7), m is the number of training examples and K is the total number of possible labels. h_\Theta(x^{(i)}) is computed using the forward propagation described above.

    The cost function above is not always convex; there may be multiple local minima. However, according to [9], it is sufficient to reach a local minimum that is not global. In order to reach a local minimum of the cost function, we use an optimization algorithm, such as gradient descent. Gradient descent is a simple optimization algorithm that converges to a local minimum by taking steps in the weight space iteratively, in so-called epochs. The size of each step is proportional to the negative of the gradient of the cost function at the current point.

    There is an important factor that modifies the descent step size in machine learning: the learning rate. It is a modifier that is used to tune how fast and how accurately the optimization proceeds, and it heavily determines the efficiency of a learning algorithm.

    Figure 8: The effect of the learning rate on gradient descent. In (a...


    Since the gradient is the partial derivative of the cost function with respect to the individual parameters, the change of parameters in a single gradient descent step is performed as:

        \Theta := \Theta - \alpha \frac{\partial J(\Theta)}{\partial \Theta}    (2.8)

    In Figure 8 we can see the effect of a wrong choice of the learning rate. There are advanced ways of determining the right learning rate value, but it is usually sufficient to determine it empirically by applying various learning rates to the learning algorithm and picking the one with the minimum error.
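    As a minimal sketch of a single step of (2.8) (illustrative only; the gradient itself is obtained by backpropagation, described next):

        // One gradient descent step over a flattened weight vector (illustrative sketch).
        static void gradientDescentStep(double[] weights, double[] gradient, double alpha) {
            for (int i = 0; i < weights.length; i++) {
                weights[i] -= alpha * gradient[i]; // move against the gradient, scaled by the learning rate
            }
        }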

    Obtaining the gradient in a multilayer perceptron is not trivial and is done in several steps. As each neuron has its own activation and weighted connections, it takes its own part in the cost function. To propagate the error measured on the output layer after a prediction, each neuron's weights need to be updated differently. To achieve this, we introduce the concept of an error term δ_j(l), representing the error of node j in layer l.

    To obtain the error terms for all neurons in the network except the input layer (as there is no error in the input data), we do the following. Given an instance of input data x, forward propagation is performed to determine the hypothesis. With the input label y_j, starting at the end of the network, we calculate the error terms for the output layer per neuron j:

        \delta_j^{(L)} = a_j^{(L)} - y_j    (2.9)

    Note that the output activation a_j^{(L)} is a part of the hypothesis as shown in (2.1).

    We then propagate the error to lower layers:

        \delta^{(l)} = (\Theta^{(l)})^T \delta^{(l+1)} \; .* \; g'(z^{(l)})        for layer 1 < l < L    (2.10)

    where g' is the gradient of the sigmoid function and z^{(l)} is a vector of the linear combinations of all neurons and their respective weights in layer l - 1:

        z^{(l)} = \Theta^{(l-1)} a^{(l-1)}        for layer 1 < l \le L    (2.11)

    Using (2.11), it can be shown that the sigmoid gradient is

        g'(z^{(l)}) = a^{(l)} \; .* \; (1 - a^{(l)})        for layer 1 < l \le L    (2.12)

    therefore (2.10) can be expressed using (2.12) as:


        \delta^{(l)} = (\Theta^{(l)})^T \delta^{(l+1)} \; .* \; \left( a^{(l)} \; .* \; (1 - a^{(l)}) \right)        for layer 1 < l < L    (2.13)

    Collecting the error terms is essential for the computation of the partial derivatives of the network. We will now describe the overall process of obtaining the partial derivatives, called backpropagation, in the following pseudocode. Note that in this pseudocode, matrix elements are indexed from 0, as opposed to the mathematical notation, where indexing starts at 1.

    01| We are given a training set {(x(1), y(1)), ..., (x(m), y(m))}
    02| Set Δi,j(l) := 0 for all l, i, j
    03| For i = 1 to m
    04|     Set a(1) := x(i)
    05|     Perform forward propagation to compute a(l) for l = 2, 3, ..., L
    06|     Using y(i), compute δ(L) = a(L) − y(i)
    07|     Compute δ(L−1), δ(L−2), ..., δ(2)
    08|     Set Δ(l) := Δ(l) + δ(l+1) (a(l))T
    09| Di,j(l) := (1/m) Δi,j(l)

    Now, the D(l) term is equal to the following:

        D^{(l)} = \frac{\partial J(\Theta)}{\partial \Theta^{(l)}}        for layer l < L    (2.14)

    Using these partial derivatives, we can perform gradient descent to minimize the cost function and thus enable the algorithm to make predictions on new data.

    It is important to note that before using backpropagation to learn the parameters, the weights should be initialized to random small numbers between 0 and 1. The weights must not be initialized to zero, otherwise the individual weight updates will be constant for all weights and the minimization algorithm will fail to converge to a local minimum.
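    A possible initialization is sketched below (the epsilon bound is our own illustrative choice; the text only requires small random non-zero values):

        // Fills a weight matrix with small random values to break symmetry (illustrative sketch).
        static void randomInitialize(double[][] theta, double epsilon) {
            java.util.Random rnd = new java.util.Random();
            for (int i = 0; i < theta.length; i++) {
                for (int j = 0; j < theta[i].length; j++) {
                    theta[i][j] = rnd.nextDouble() * epsilon; // uniform random value in [0, epsilon), epsilon < 1
                }
            }
        }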

    2.4 Learning: Resilient Backpropagation

    Resilient backpropagation (RPROP) is an efficient optimization algorithm proposed by Riedmiller and Braun in 1993 [4]. It is based on the principle of gradient descent used with pure backpropagation. Instead of updating the weights of a network with a fixed learning rate


    that is constant for all weight connections, it performs a direct adaptation of the weight step using only the sign of the partial derivative, not its magnitude. As such, it overcomes the difficulty of setting the right learning rate value.

    For each weight, an individual weight step size Δi,j is introduced (not to be confused with the symbol used when accumulating the error in pure backpropagation). Given epoch t > 0 and considering Θ as a weight matrix for a single layer (this can be applied to any valid layer), this value evolves during the learning process according to the following rule:

        \Delta_{i,j}^{(t)} =
            \eta^{+} \cdot \Delta_{i,j}^{(t-1)}     if  \frac{\partial J(\Theta)}{\partial \Theta_{i,j}}^{(t-1)} \cdot \frac{\partial J(\Theta)}{\partial \Theta_{i,j}}^{(t)} > 0
            \eta^{-} \cdot \Delta_{i,j}^{(t-1)}     if  \frac{\partial J(\Theta)}{\partial \Theta_{i,j}}^{(t-1)} \cdot \frac{\partial J(\Theta)}{\partial \Theta_{i,j}}^{(t)} < 0
            \Delta_{i,j}^{(t-1)}                    otherwise    (2.15)

    The change of the sign of the partial derivative across two epochs indicates that the local minimum of the cost function has been missed ...


    ... it has been shown [4] that the choice of this value is not critical.

    Empirically, reasonable values for η−, η+, and Δ0 are 0.5, 1.2, and 0.1, respectively [4].

    In the original paper [4] it was suggested that the previous update step be reverted if the sign of the gradient changes (the minimum was missed). This is called backtracking. However, in [5], an RPROP variation without backtracking has been proposed, simply leaving this step out, as it is not crucial in the optimization process and is easier to implement.
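    The following sketch shows the per-weight update for one epoch under these rules, in a variant without backtracking; the names, the step-size bounds, and the zeroing of the gradient after a sign change follow one common formulation and are our own illustration, not the application's code:

        // Sketch of an RPROP epoch for a single weight matrix, without backtracking (illustrative).
        // theta: weights, grad: current gradient, prevGrad: gradient from the previous epoch,
        // step: per-weight update sizes Delta(i,j). All arrays have the same shape.
        static void rpropUpdate(double[][] theta, double[][] grad, double[][] prevGrad,
                                double[][] step, double etaPlus, double etaMinus,
                                double stepMin, double stepMax) {
            for (int i = 0; i < theta.length; i++) {
                for (int j = 0; j < theta[i].length; j++) {
                    double change = prevGrad[i][j] * grad[i][j];
                    if (change > 0) {          // same sign: accelerate the step size
                        step[i][j] = Math.min(step[i][j] * etaPlus, stepMax);
                    } else if (change < 0) {   // sign flipped: the minimum was missed, decelerate
                        step[i][j] = Math.max(step[i][j] * etaMinus, stepMin);
                        grad[i][j] = 0;        // suppress the update for this weight in this epoch
                    }
                    theta[i][j] -= Math.signum(grad[i][j]) * step[i][j]; // move by the step in the sign direction
                    prevGrad[i][j] = grad[i][j];
                }
            }
        }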

    2.5 Bias, Variance

    High bias and high variance, also called underfitting and overfitting, respectively, are common causes of unsatisfying learning algorithm performance.

    If a learning model is presented with input with too many complex features, the learned hypothesis may fit the training set very well (J(Θ) ≈ 0), but may fail to generalize the prediction to new examples. In this case, overfitting may be observed. On the other hand, if a classifier generalizes the hypothesis to an overly simple form, the error is usually high on both the training set and new examples, which is caused by underfitting.

    Figure 9: High bias, high variance. On the left, a learning algorithm underfits the training examples and will likely fail at predictions on unseen examples; in the center, the algorithm fits the examples "just right"; on the right, the algorithm overfits the examples, fitting the training set, but will likely fail at predictions on unseen examples.


    To address overfitting, we may do one of the following:

    1. Reduce the number of features
       - Manually select which features to keep
       - Use a dimensionality reduction algorithm, such as principal component analysis (not covered in this work)
    2. Get more examples (may help in some cases)
    3. Apply regularization
       - Decreases the values of free parameters for better generalization
       - Used when all features contribute to a successful hypothesis

    In regularization, a regularization parameter λ is used to penalize free parameters (weights in an MLP). This is a real scalar value that takes part in the cost function and optimization functions in order to affect the choice of free parameters and help with the high variance problem. In the learning process, when λ > 0, the machine learning algorithm's parameters are reduced; when λ < 0, the parameters are increased; and when λ = 0, no regularization is performed. That being stated, we choose a high value for the regularization parameter to avoid high variance and lower it in the case of high bias, because setting a regularization parameter that is too large may itself be the cause of high bias.

    To add regularization to the multilayer perceptron algorithms, we must modify (2.7):

        J(\Theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K} \left[ y_k^{(i)} \log\left((h_\Theta(x^{(i)}))_k\right) + (1 - y_k^{(i)}) \log\left(1 - (h_\Theta(x^{(i)}))_k\right) \right]
                   + \frac{\lambda}{2m} \sum_{l=1}^{L-1} \sum_{i=1}^{s_{l+1}} \sum_{j=2}^{s_l+1} \left( \Theta_{i,j}^{(l)} \right)^2    (2.18)

    In effect, this adds a condition to line 9 of the backpropagation pseudocode:

    09| Di,j(l) := (1/m) Δi,j(l)                     if j = 0
    10| Di,j(l) := (1/m) (Δi,j(l) + λ Θi,j(l))       if j ≠ 0

    As shown in (2.18) and in line 9, we do not add the regularization term for the bias units, therefore we skip the first column of the weight matrix.
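    A sketch of lines 09 and 10 in code (illustrative; 0-indexed with column 0 being the bias column, as in the pseudocode):

        // Converts accumulated error terms into regularized partial derivatives (illustrative sketch).
        static double[][] regularizedGradient(double[][] delta, double[][] theta, int m, double lambda) {
            double[][] d = new double[delta.length][];
            for (int i = 0; i < delta.length; i++) {
                d[i] = new double[delta[i].length];
                for (int j = 0; j < delta[i].length; j++) {
                    d[i][j] = delta[i][j] / m;
                    if (j != 0) {                       // skip the bias column
                        d[i][j] += (lambda / m) * theta[i][j];
                    }
                }
            }
            return d;
        }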


    "7 S#l6ti#' D)s&ri9ti#'

    "7! F6'&ti#'al S9)&i-i&ati#'As mentioned in the aim+ there are four main reYuirements of the Android application that

    has *een de!eloped:

    a*ility to recogni-e characters (ritten using the touchsensiti!e display+

    a*ility to recogni-e characters gi!en an image or camera frames as input+

    a*ility to learn progressi!ely *ased on user's feed*ac&

    a*ility to learn ne(+ pre!iously unseen characters#

    8e (ill no( descri*e these reYuirements in detail+ listed *elo(#

    [01] The application provides means to enter a "touch mode" screen. Here, the user can draw characters freely by hand.

    [01.1] When the user is done drawing, the drawing is recognized and the prediction is shown to the user.

    [01.2] The drawing, along with the predicted label, can be saved to a persistent location.

    [01.3] The user can provide feedback on the prediction, signaling whether the prediction was correct, or making a correction to the prediction, performing online learning.

    [02] The application provides means to enter a "camera mode" screen. Here, the device's camera is used to present its input to the screen.

    [02.1] After showing the camera frame, it is analyzed and found patterns are recognized and shown to the user.

    [02.2] The process in [02] and [02.1] is done continuously as the camera frames are updated over time.

    [02.3] The user can provide feedback on the prediction, signaling whether the prediction was correct, or making a correction to the prediction, performing online learning.

    [03] The application provides means to enter an "image mode" screen. Here, the user


    can load an image file present on the device, which is then shown on the screen.

    [03.1] After showing the image, it is analyzed and found patterns are recognized and shown to the user.

    [04] The user can see the list of all learned characters.

    [05] The user can add a new character and train it using the touch mode as described in [01].

    [06] The user can perform offline learning on the current persistent data at any time.

    3.2 Plan of Solution

    Here we are going to describe the choices that needed to be made before starting to implement the solution.

    "727! R)+'iti#' Pr#&)ss Pi9)li')s

    So far, we have named and described several algorithms dealing with image preprocessing, learning, and optimization. These are thought of as building blocks when designing the recognition process pipeline that would satisfy the given requirements, and they are often used in sequence, one providing input for another. As briefly mentioned in the overview, we have considered a process pipeline for recognition based on touch input, and a separate pipeline for recognition based on static image input.

    The recognition pipeline used to recognize characters in the touch input mode follows (step 5 is sketched in code after the list):

    1. Acquire the handwritten character image as a grayscale bitmap.
    2. Resize this bitmap to 20x20 pixels.
    3. Acquire a binary bitmap of points where each stroke has started and ended.
    4. Resize this bitmap to 20x20 pixels.
    5. Unroll the bitmap matrices into a feature vector of 800 elements.
    6. Feed this vector to a trained multilayer perceptron, giving us the prediction.
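    A minimal sketch of step 5 (our own illustration, not the application's code): the two 20x20 bitmaps are flattened and concatenated into a single 800-element feature vector.

        // Unrolls the drawing bitmap and the stroke-end bitmap into one feature vector (illustrative sketch).
        static double[] toFeatureVector(double[][] drawing, double[][] strokeEnds) {
            int size = 20;                                   // both bitmaps are 20x20
            double[] features = new double[2 * size * size]; // 800 elements in total
            int k = 0;
            for (int y = 0; y < size; y++) {
                for (int x = 0; x < size; x++) {
                    features[k++] = drawing[y][x];
                }
            }
            for (int y = 0; y < size; y++) {
                for (int x = 0; x < size; x++) {
                    features[k++] = strokeEnds[y][x];
                }
            }
            return features;
        }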

    The bitmaps have been chosen to be grayscale and to have a small resolution because it is sufficient to perform correct prediction and the feature vector size is small enough to make


    the learning and prediction computationally feasible. Similar bitmap sizes have been chosen in related problems by Ng [9] and LeCun et al. [1].

    As we have more data available in the touch mode than a pure image bitmap, we have also decided to collect the bitmap of stroke end points to be able to better distinguish characters such as '8' and 'B', as mentioned in the overview. The resized bitmaps of these characters are often similar, but the writing style of each is usually different. By providing this extra bitmap with each example, we are giving the neural network classifier a hint about what features to focus on when performing automatic feature extraction with the hidden layer.

    The pipeline for recognition based on an image or a camera frame is different (steps 2-6 are sketched in code after the list):

    1. Acquire the image bitmap in grayscale colors.
    2. Apply a median filter to the bitmap.
    3. Segment the bitmap using thresholding to get a binary bitmap.
    4. Find the bounding boxes of external contours in the bitmap.
    5. Extract sub-bitmaps from the bounding boxes.
    6. Resize the sub-bitmaps to 20x20 pixels.
    7. Unroll the sub-bitmap matrices into feature vectors of 400 elements each.
    8. Feed each feature vector to a trained multilayer perceptron, giving us predictions.
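    Steps 2-6 can be sketched with the OpenCV for Android Java API, which the application uses for preprocessing; the filter and threshold parameters below are illustrative choices of ours, not the values from the thesis:

        import org.opencv.core.*;
        import org.opencv.imgproc.Imgproc;
        import java.util.ArrayList;
        import java.util.List;

        // Sketch of image-mode preprocessing: blur, threshold, find contours, crop and resize (illustrative).
        static List<Mat> extractCharacterBitmaps(Mat grayscale) {
            Mat blurred = new Mat();
            Imgproc.medianBlur(grayscale, blurred, 3);                 // remove noise while preserving edges
            Mat binary = new Mat();
            Imgproc.threshold(blurred, binary, 0, 255,
                    Imgproc.THRESH_BINARY_INV | Imgproc.THRESH_OTSU);  // segment characters from background
            List<MatOfPoint> contours = new ArrayList<>();
            Imgproc.findContours(binary.clone(), contours, new Mat(),
                    Imgproc.RETR_EXTERNAL, Imgproc.CHAIN_APPROX_SIMPLE); // external contours only
            List<Mat> characters = new ArrayList<>();
            for (MatOfPoint contour : contours) {
                Rect box = Imgproc.boundingRect(contour);              // bounding box of one pattern
                Mat resized = new Mat();
                Imgproc.resize(binary.submat(box), resized, new Size(20, 20));
                characters.add(resized);
            }
            return characters;
        }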

    Our intention is to produce a similar input for the network as in the case of touch input, only without the stroke end bitmaps, as we do not possess this information in this mode. With this approach, we are able to reuse bitmaps produced in touch input to train a network that operates on images.

    When preprocessing the image, we apply a median blur to remove noise. The median blur was preferred to other blur filters because of its property of preserving edges, which are important in character recognition.

    Thresholding is a very simple segmentation algorithm that is sufficient to segment out characters from the background and thus binarize the bitmap. This step is required, as the background is extra information we do not need in the recognition process.

    Finding contours in the image and finding the bounding boxes of each pattern allows us to


    detect more characters in an image at once. Given the bounding boxes, we are able to extract bitmaps of the individual characters (or other patterns) and then perform the rest of the processes ...


    "727" O--li') a'* O'li') L)ar'i'+

    Offline and online learning have already been defined in the overview chapter as eager and lazy learning mechanisms, respectively. As the specification denotes, both mechanisms have to be implemented in our application in order to be able to learn progressively based on the user's feedback, as well as to perform offline learning on persistent data at any time.

    The preferred algorithm for offline learning is RPROP, because of the described advantages over pure backpropagation. It is evident the algorithm is superior to backpropagation and other learning algorithms, as shown in (Figure 10).

    However, RPROP is not an online learning algorithm; it would not make sense to perform a single learning iteration, because no weight update acceleration or deceleration occurs. In effect, using RPROP for online learning would be equal to using backpropagation if the initial weight update value of RPROP were equal to the learning rate of backpropagation. Therefore, pure backpropagation has been chosen to perform online learning.

    For backpropagation, we have set the learning rate to 0.3. For RPROP, the initial weight update size is 0.01, η− has been set to the traditional value of 0.5, and η+ is equal to 1.2. After testing the performance of the algorithms, we have decided not to use regularization at all (the regularization parameter is 0). Both backpropagation and RPROP perform 100 optimization epochs.

    Figure 10: Average number of required epochs in different learning problem scenarios. It is apparent that RPROP is superior to backpropagation (BP...


    "727 Us)* T)&$'#l#+i)s

    We have developed a native Android application using the Java SE Runtime Environment 6, the Java programming language, and the Android SDK. This is the traditional way to build Android applications and does not rely on a third party. The application has been targeted at Android 4.2.2; however, Android 2.2 and up should be supported.

    Having explained the algorithms used in multilayer perceptrons, we have decided to implement machine learning on the matrix level, only using a library for matrix manipulations and avoiding existing machine learning libraries. For this, we have used a subset of the JAMA Java package that deals with matrices.
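    As an illustration of this matrix-level approach, the sketch below uses the public JAMA API; the element-wise sigmoid loop is our own addition, not a JAMA feature, and the method is not taken from the application's source:

        import Jama.Matrix;

        // Computes the activations of the next layer from weights and previous activations using JAMA (sketch).
        static Matrix nextLayerActivations(Matrix theta, Matrix aWithBias) {
            Matrix z = theta.times(aWithBias);        // z(l) = Theta(l-1) * a(l-1)
            double[][] values = z.getArray();         // JAMA returns the internal array, so edits update z
            for (int i = 0; i < values.length; i++) {
                for (int j = 0; j < values[i].length; j++) {
                    values[i][j] = 1.0 / (1.0 + Math.exp(-values[i][j])); // element-wise sigmoid
                }
            }
            return z;
        }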

    Because image preprocessing is not the primary focus of this work, we haven't implemented these algorithms from scratch; instead, we have used the OpenCV library for Android. This library contains algorithms for computer vision in general. Omitting its machine learning capabilities, we have used it to perform the image preprocessing steps in the image and camera modes of the application.

    To prototype, test, and evaluate algorithms, we have used the GNU Octave language. This approach allowed us to produce statistics and plots useful for configuring the network architecture and debugging the algorithms.


    4. Implementation

    In this chapter, we will describe how the Android application requirements have been satisfied, document the implementation, describe the problems we have run into and how they were solved, and provide a brief user guide in the process.

    4.1 Android Application Implementation

    The main entry point of the application is a navigation activity. Being one of the four main building blocks of Android applications, activities are single, focused things that the user can do [17]. They usually appear as windows containing user interface elements that the user can interact with. Our navigation activity contains buttons to access the other activities: input mode, camera mode, image mode, and character list. Thus, this activity is required in order to be able to satisfy [01], [02], [03], and [04].

    The touch mode activity is designed to implement the features required in [01]. The main piece of user interface it contains is a custom view ob...


    By default, each time a prediction is made, the drawing and the stroke end bitmaps are saved to external storage as image files in pairs, so that the perceptron weights can be removed and offline learning can be performed. This behavior satisfies [01.2] and can be turned off in the application settings to save memory when no learning, or only online learning, is required.

    The camera mode contains an OpenCV user interface element tailored for showing and processing camera frames in Android. When the user enters the activity, they are immediately presented with continuously updated camera images, which are processed according to the image mode recognition pipeline as described. Individual contours that are likely to be characters are found and marked with rectangles. A predicted label is shown above each such rectangle. In practice, on a Nexus S device with a single-core 1 GHz CPU (the GPU is, unfortunately, not used by the implementation of the OpenCV library, which would have likely performed better) and a camera resolution of 720x480 pixels, the updating frequency is approximately 2 frames per second. This depends on the number of ob...


    When the user enters the image mode, they are instructed to start by opening an image file. The system then shows all possible applications that can handle this request; a default gallery application is preferred, as these usually implement the behavior of providing data correctly; however, other applications, such as file managers, may also be used. The loaded image file is processed according to the image mode recognition pipeline using the same techniques as in camera mode, and is then shown to the user. The segmentation control views are also present. Because this activity would share a lot of source code with the camera activity, methods that could be reasonably separated from the classes have been moved to a public class and declared as static.

    Unlike the camera activity, which is constrained to the landscape screen orientation, we wanted to make this activity flexible. It can be rotated, while its state is retained in a fragment, which represents a portion of a user interface or a behavior of an activity [17].

    The last activity from the four navigation elements is a character list that satisfies [04]. It shows a grid view of learned characters, filling the items with saved bitmaps if present. When the user selects an item in the grid view, they are taken into the touch mode activity that is modified to focus on training the single character label. The difference is that no feedback dialog is shown to the user; it is assumed that the character the user selected in the character list is always the anticipated label, and online learning is performed after each prediction. Otherwise, the activity is unchanged; the examples are still saved to external storage if set to do so in the settings, and the network state is saved on appropriate events.

    The character list activity also contains a menu item for adding a new character, as required in [05]. After being selected, the user is asked to assign a label to the new character and is then taken into the touch input activity to train the new character as described above. This action adds the character to the list of known characters and modifies the structure of the perceptron used for touch input recognition. In particular, the output layer size is increased by one (to be able to predict the new class) and weight connections to this layer are ad...


    ... using online learning in the related activities, or by performing offline learning in the case that the bitmaps for the new character have been saved.
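    A minimal sketch of that structural change (our own illustration; the application itself stores weights in JAMA matrices): the last weight matrix gains one randomly initialized row, so the network can emit one more output.

        // Adds one output neuron by appending a randomly initialized row to the last weight matrix (sketch).
        static double[][] addOutputClass(double[][] lastLayerWeights) {
            int rows = lastLayerWeights.length;
            int cols = lastLayerWeights[0].length;
            double[][] grown = new double[rows + 1][cols];
            for (int i = 0; i < rows; i++) {
                System.arraycopy(lastLayerWeights[i], 0, grown[i], 0, cols); // keep existing classes untouched
            }
            java.util.Random rnd = new java.util.Random();
            for (int j = 0; j < cols; j++) {
                grown[rows][j] = rnd.nextDouble() * 0.1;  // small random weights for the new class
            }
            return grown;
        }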

    So far, we have mentioned the application settings only in relation to the option of saving character bitmaps. Settings are another activity that the user can access from any other activity at any time. Here, besides the saving option, the user is able to pick the error minimization algorithm used for offline learning: backpropagation or RPROP, the latter being the default. The initial learning rate used to perform the online learning iteration (using backpropagation) is 0.3, but if the user feels the need to ad...


    ... backlight are allowed to be turned off, but the CPU is kept on until all partial wake locks have been released [17].

    4.2 Android Application Source Code

    The application source code is composed of Java classes arranged in packages. Here, we will take a brief look at the packages and describe the classes. For the complete source code documentation, along with the source code of the Android application and the GNU Octave scripts, see the content included on the enclosed DVD.

    The list of packages is as follows:

    1. eu.uhliarik.charrec.gui.activities
    2. eu.uhliarik.charrec.gui.adapters
    3. eu.uhliarik.charrec.gui.dialogs
    4. eu.uhliarik.charrec.gui.fragments
    5. eu.uhliarik.charrec.gui.models
    6. eu.uhliarik.charrec.gui.views
    7. eu.uhliarik.charrec.learning
    8. eu.uhliarik.charrec.services
    9. eu.uhliarik.charrec.utils
    10. eu.uhliarik.mlp
    11. ...


    The models package comprises CharacterGridItemModel, which is a class that holds a bitmap and a label of a character used in the character list.

    We have created two custom views in the views package, namely CharacterGridView and CharacterView. The CharacterGridView class extends GridView and adds a few methods to facilitate the use of the class. CharacterView extends the View class and is the main component of the touch input mode. Its main task is to override the onDraw and onTouchEvent methods to interact with the user and allow them to draw on the view's canvas.
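    A minimal sketch of such a view follows; the class and field names are ours and the drawing logic is simplified, so this is not the application's CharacterView source:

        import android.content.Context;
        import android.graphics.Canvas;
        import android.graphics.Paint;
        import android.graphics.Path;
        import android.view.MotionEvent;
        import android.view.View;

        // Sketch of a drawing view: collects touch strokes into a Path and renders them (illustrative).
        public class DrawingView extends View {
            private final Path path = new Path();
            private final Paint paint = new Paint();

            public DrawingView(Context context) {
                super(context);
                paint.setStrokeWidth(12f);
                paint.setStyle(Paint.Style.STROKE);
            }

            @Override
            protected void onDraw(Canvas canvas) {
                canvas.drawPath(path, paint);                  // render the strokes drawn so far
            }

            @Override
            public boolean onTouchEvent(MotionEvent event) {
                if (event.getAction() == MotionEvent.ACTION_DOWN) {
                    path.moveTo(event.getX(), event.getY());   // a stroke starts here
                } else if (event.getAction() == MotionEvent.ACTION_MOVE) {
                    path.lineTo(event.getX(), event.getY());   // extend the current stroke
                }
                invalidate();                                  // request a redraw
                return true;
            }
        }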

    The eu.uhliarik.charrec.learning package contains the CharacterMlp class, which extends a MultilayerPerceptron in the eu.uhliarik.mlp package and is used throughout the application as the main learning model.

    The Android services described earlier in this chapter can be found in the eu.uhliarik.charrec.services package: TrainingService and TransferService. The former performs offline learning of the neural networks, while the latter manages the transfer of the initial data when the application is used for the first time.

    Code that would otherwise have been shared by several classes has been separated into the FileUtils and ImageUtils classes in the eu.uhliarik.charrec.utils package. Intuitively, FileUtils contains public static methods that deal with saving and loading perceptron data, extracting files from the compressed initial data file, and more. ImageUtils, on the other hand, deals with loading and converting the dataset (character bitmaps), bitmap manipulations, bitmap matrix format conversions, and image preprocessing algorithms.

    Package 10 is placed outside of the eu.uhliarik.charrec package, as it is meant to be usable by any application, not only in our pro...


    5. Results

    Before implementing the learning algorithms in Java, we have tested them as prototypes using GNU Octave. This allowed us to easily collect results on how well the learning algorithms perform. This chapter presents and discusses a comparison of the collected results.

    5.1 Collection Methods

    We have measured the performance of the algorithms used in multilayer perceptrons: backpropagation and resilient backpropagation. We have considered the scenario of recognition from an image, where the dataset consists of only 40 character image bitmaps per character. For this comparison, the datasets are only comprised of the digit characters, therefore the dataset contains 400 examples.

    For relevant values, we have split the dataset into training and validation sets, with the ratio being 7:3. Also, before using the learning algorithms, the dataset has been randomly shuffled.
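    A minimal sketch of this preparation step (illustrative; the example type and the helper are ours):

        import java.util.ArrayList;
        import java.util.Collections;
        import java.util.List;

        // Shuffles the dataset and splits it into a 7:3 training/validation pair (illustrative sketch).
        static List<List<double[]>> shuffleAndSplit(List<double[]> examples) {
            List<double[]> shuffled = new ArrayList<>(examples);
            Collections.shuffle(shuffled);                        // random order before splitting
            int cut = (int) (shuffled.size() * 0.7);              // 70% for training, 30% for validation
            List<List<double[]>> result = new ArrayList<>();
            result.add(new ArrayList<>(shuffled.subList(0, cut)));
            result.add(new ArrayList<>(shuffled.subList(cut, shuffled.size())));
            return result;
        }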

    The error rates have been obtained using the logarithmic cost function shown in (2.7).

    The configuration of the learning model whose results are presented here is:

    The regularization parameter is 0.
    The number of epochs is 100.
    In backpropagation, the learning rate is 0.3.
    In resilient backpropagation, η−, η+, and Δ0 are 0.5, 1.2, and 0.01, respectively.
    The perceptron architectures are as described in the plan of solution.

    This configuration has been chosen based on recommendations from various resources that have been referred to in the explanation of the individual model characteristics, and on testing, the results of which would be too long to fit in this chapter.


    5.2 Result Comparison

    We have measured the error of the backpropagation and RPROP algorithms on the training and validation sets. This has been tested using fractions of the dataset of various sizes, and a learning curve has been plotted. A learning curve represents the error as a function of the dataset size and is a perfect tool to visualize high bias or variance.

    Figure 11: Learning curve of backpropagation performed on the training and validation sets.


    In the learning curves, no significant overfitting or underfitting is apparent. We can see that the RPROP algorithm manages to converge to a better minimum in the given 100 epochs than backpropagation. This is caused by the advantages of the RPROP algorithm over pure backpropagation that we explained earlier in this work. Table 1 confirms these findings.

    Algorithm    Training set error    ...

    Appendix B: Android Application Screenshots

    Screenshot 1: Android application navigation screen

    Screenshot 2: Android application touch mode screen


    S&r))'s$#t ": A'*r#i* a99li&ati#' &a)ra #*)

    s&r))'