AN OCR SYSTEM FOR ARABIC HANDWRITING


  • 8/13/2019 ChaptersAN OCR SYSTEM FOR ARABIC HANDWRITING AN OCR SYSTEM FOR ARABIC HANDWRITING AN OCR SYSTE


    CHAPTER ONE

    Introduction

    OCR stands for Optical Character Recognition. It has been an active subject of research
    since the early days of computers. Despite its age, it remains one of the most challenging
    and exciting areas of research in computer science, and it has recently grown into a mature
    discipline [1].

    OCR can be defined as the task of transforming text represented in the spatial form of
    graphical marks (i.e., handwritten) into its symbolic representation in a computer system.
    The importance of OCR emerges from the expectation that paper will become obsolete in the
    age of digital computers. Most students in a lecture theatre, for example, are more
    comfortable with computer-written documents or notes than with handwritten ones. OCR
    provides a convenient way of quickly converting handwritten text into computer-typed
    text [1].

    1.1 Problem Definition

    OCR is still one of the challenging areas of research. A lot of work has been done on OCR
    for English characters, but little has been done for Arabic characters. The problem arises
    here: since Arab people also deal with digital computer technology, they need a way by
    which their manual papers, documents, receipts, etc. can be converted and stored in a
    computer system. This will make life faster, as it will reduce the time needed to write a
    manual document and then type it into a computer system. This will save time for many
    organizations (universities, companies, etc.), which are considered an important core of
    every Arab nation. It will also be useful for those who are not used to typing.

    1.2 Objective


    The objective of this work is to design and develop a new algorithm for detecting and
    recognizing a handwritten Arabic letter and displaying the corresponding character in the
    computer system.

    1.3 Methodologies & Tools

    MATLAB software will be used to manipulate the image of the letter. The design and
    development processes will all take place in the MATLAB environment. Otsu's method will be
    used in thresholding the image. The proposed features will be extracted using built-in
    algorithms in MATLAB. A new algorithm will be developed to extract the ratio of the
    letter's width to its height. The letters will be clustered and classified manually into
    classes according to the extracted features. A feed-forward neural network that consists
    of only one neuron will be used for each class to distinguish between letters in the same
    class. The final numeric output of the neural network will be translated into a meaningful
    output that represents the recognized letter. Finally, an interface that interacts with
    the user will be implemented, enabling MATLAB to read the image directly.

    1.4 Knowledge Used

    i. Digital Image Processing (DIP).

    ii. Artificial Neural Networks (ANN).

    1.5 Thesis Layout

    The rest of the thesis is organized as follows:

    Chapter 2 Literature Review: presents previous work in OCR. It also covers the literature
    related to OCR, DIP and ANN to make the reader more familiar with the terminology used in
    these fields.

    Chapter 3 Design & Modeling: provides the reader with an analysis of Arabic letters and a
    conceptual design for the system, together with the flowcharts and algorithms developed
    and specified to achieve our objective.

    Chapter 4 Implementation & Results: shows the implementation of the system, how it was
    tested, and the results obtained from the test. It also discusses the results.

    Chapter 5 Conclusion & Future Work: specifies what is recommended to be done in order to
    improve this system in the future.

    APPENDIX A: contains tables of the width-to-height ratios measured for all Arabic letters.

    APPENDIX B: presents the questionnaire (form) that was distributed to collect samples for
    testing the system.

    APPENDIX C: presents the system code.

    APPENDIX D: contains tables of some measured values that the reader needs in order to
    understand some values used in the code.

    CHAPTER TWO

    Literature Review

    This chapter gives a brief description of the literature related to work in the area of
    OCR. OCR basically depends on Digital Image Processing for manipulating the image and
    extracting features, and on either Hidden Markov Models (HMM) or Artificial Neural
    Networks (ANN) for the recognition. Before that, previous works on OCR for Arabic letters
    are discussed.

    2.1 Previous Work

    Few works have been done on Arabic handwritten text. The methods used can be divided into
    2 categories: segmentation-free methods and segmentation-based methods. Gillies used the
    approach of over-segmenting the word to ensure that no segment belongs to more than one
    character, and then used the combined features as classifiers and passed them to a neural
    network. Elgammal and Ismail segmented the word into scripts (small connected segments)
    using a Line Adjacency Graph (LAG), then used the features of these scripts as
    classifiers [1].

    In segmentation-free methods, a wide range of different features have been used by
    different groups. Many of these features require normalization of the image before
    features can be extracted. An important part of this is often finding the baseline of the
    word (the line on which it is written and by which the characters are often connected).
    Other techniques have been developed in this area, such as the Nearest Neighbor method,
    which was used on a small dataset containing 20 words but required resizing of the image.
    Peter Burrow described a method called the radial method, used to obtain statistical
    features of the words. This method had very poor performance [1].

    In this thesis, it is intended to perform Arabic letter recognition using a few
    topological descriptors and a proposed descriptor, the ratio of the letter's width to its
    height. No image resizing is performed and no image characteristics need to be specified,
    thus eliminating any constraints on the image attributes.

    2.2 Digital Image Processing

    Digital image processing (DIP) is a science in which a special kind of signals known as
    images can be manipulated within a computer system in order to obtain and extract
    information relevant to the objective(s) of the manipulation. It is a discipline of DSP
    (digital signal processing).

    A digital image is represented as a 2-dimensional matrix in the computer system. It is a
    function of 2 variables f(x,y), where x and y indicate the row and the column in which the
    pixel lies, respectively. The value of the function f(x,y) represents the gray level
    associated with the pixel (x,y), and it varies from 0 (black) to 255 (white).

    Figure (2.2) (From left to right) red, green, and blue components of the digital image in
    Figure (2.1)

    2.2.1 Basic Concepts & Terminology in DIP

    2.2.1.1 Connectivity: two pixels are said to be connected if they are neighbors and their
    gray levels satisfy a specified criterion of similarity (say, if they are equal).

    2.2.1.2 Region: let R be a subset of pixels in an image; R is a region if R is a
    connected set.

    2.2.1.3 Boundary (Contour or Border): the boundary of a region R is the set of pixels in
    the region that have one or more neighbors that are not in R.
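The boundary definition can be illustrated with a short sketch. It assumes that a region is stored as a set of (row, column) coordinates and that "neighbors" means the 4-neighborhood; neither detail is fixed by the text, and the function name `boundary` is hypothetical.

```python
def boundary(region):
    """Return the pixels of `region` that have at least one 4-neighbor outside it."""
    region = set(region)
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # assumed 4-neighborhood
    return {
        (r, c)
        for (r, c) in region
        if any((r + dr, c + dc) not in region for dr, dc in offsets)
    }


# A solid 3x3 square: only the centre pixel is interior.
square = {(r, c) for r in range(3) for c in range(3)}
print(sorted(square - boundary(square)))  # [(1, 1)]
```

With an 8-neighborhood the offsets list would simply gain the four diagonal directions.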

    2.2.2 DIP Processes

    There are 3 basic processes in digital image processing:

    i. Image Segmentation.

    ii. Feature(s) Extraction.

    iii. Pattern Recognition.

    Figure (2.3) Processes of DIP

    2.2.2.1 Image Segmentation

    Segmentation is the process in which the image is subdivided into object(s) and
    background. It is the process by which the computer system knows which pixels belong to
    the object that is to be manipulated and which must be discarded because they belong to
    the background.

    There are many techniques used to segment images, such as detection of discontinuities,
    Hough transforms, and thresholding.

    Thresholding is the mapping of the pixel values of a 2-dimensional image into one of 2
    values (usually 0 or 1) using a specified threshold. It is the process of converting a
    2-dimensional image into a binary image. A binary image is one whose pixels have only one
    of 2 values; these values are labels used to refer to either object or background. The
    threshold is used to divide the image into 2 regions: one with gray level values greater
    than the threshold and another with gray level values less than the threshold.

    A thresholded image g(x,y) for an image f(x,y) (with threshold T) is defined as follows:

        g(x,y) = \begin{cases} 1, & f(x,y) > T \\ 0, & f(x,y) \le T \end{cases}        (2.1)
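A minimal sketch of this mapping follows. It assumes the image is a list of rows of integer gray levels and that pixels strictly greater than the threshold are labeled 1; the choice of which side becomes the object is an assumption, and `threshold` is a hypothetical name.

```python
def threshold(image, t):
    """Return a binary image: 1 where the gray level exceeds t, else 0."""
    return [[1 if pixel > t else 0 for pixel in row] for row in image]


gray = [[12, 200, 34],
        [180, 90, 220]]
print(threshold(gray, 128))  # [[0, 1, 0], [1, 0, 1]]
```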

    The thresholding process is histogram-based; it requires knowing the histogram of the
    image to define the threshold that can divide the image into its object(s) and background.
    A histogram of a digital image with gray levels in the range [0, L-1] is the discrete
    function:

        h(r_k) = n_k        (2.2)

    where:

    r_k: the kth gray level.

    n_k: the number of pixels in the image having gray level (r_k).

    Figure (2.4) A 2-dimensional image (left) and its histogram (right)

    The threshold value can be computed using several mathematical methods. One of these
    methods is to compute the threshold from the image's statistics (the histogram) using
    Otsu's method, which is described as follows [2]:

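    i. Evaluate the normalized histogram for the image, i.e., treat the histogram as a
    discrete probability density function, as in:

        p_r(r_q) = \frac{n_q}{n}, \quad q = 0, 1, 2, \ldots, L-1        (2.3)

    where n is the total number of pixels in the image.

    ii. Suppose that a threshold (k) is chosen such that C_0 is the set of pixels with levels
    [0, 1, 2, ..., k-1] and C_1 is the set of pixels with levels [k, k+1, k+2, ..., L-1].

    iii. Choose the value of (k) that maximizes the between-class variance \sigma_B^2:

        \sigma_B^2 = \omega_0 (\mu_0 - \mu_T)^2 + \omega_1 (\mu_1 - \mu_T)^2        (2.4)

    where:

        \omega_0 = \sum_{q=0}^{k-1} p_r(r_q)                         (2.5)

        \omega_1 = \sum_{q=k}^{L-1} p_r(r_q)                         (2.6)

        \mu_0 = \sum_{q=0}^{k-1} q \, p_r(r_q) / \omega_0            (2.7)

        \mu_1 = \sum_{q=k}^{L-1} q \, p_r(r_q) / \omega_1            (2.8)

        \mu_T = \sum_{q=0}^{L-1} q \, p_r(r_q)                       (2.9)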
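The steps of Otsu's method can be sketched in a few lines. This is only an illustration under stated assumptions: the image is a non-empty list of rows of integer gray levels in [0, levels-1], class C0 holds levels below k and class C1 holds levels k and above, and ties are resolved by keeping the smallest k. The function name `otsu_threshold` is hypothetical and is not the thesis code, which relies on MATLAB.

```python
def otsu_threshold(image, levels=256):
    """Return the gray level k that maximizes the between-class variance."""
    hist = [0] * levels
    for row in image:
        for pixel in row:
            hist[pixel] += 1
    n = sum(hist)                                     # total number of pixels
    total = sum(q * hist[q] for q in range(levels))   # sum of all gray levels
    mu_t = total / n                                  # global mean

    best_k, best_var = 0, -1.0
    n0 = 0  # number of pixels in C0 = {0, ..., k-1}
    s0 = 0  # sum of the gray levels in C0
    for k in range(1, levels):
        n0 += hist[k - 1]
        s0 += (k - 1) * hist[k - 1]
        n1 = n - n0
        if n0 == 0 or n1 == 0:
            continue  # one class is empty, so the class means are undefined
        w0, w1 = n0 / n, n1 / n
        mu0, mu1 = s0 / n0, (total - s0) / n1
        var_b = w0 * (mu0 - mu_t) ** 2 + w1 * (mu1 - mu_t) ** 2
        if var_b > best_var:
            best_k, best_var = k, var_b
    return best_k


image = [[10, 12, 200],
         [11, 198, 202]]
k = otsu_threshold(image)
binary = [[1 if p >= k else 0 for p in row] for row in image]
print(binary)  # [[0, 0, 1], [0, 1, 1]]
```

Working with integer pixel counts rather than accumulated probabilities avoids rounding problems when deciding whether a class is empty.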


    Simple regional descriptors include area, perimeter, compactness, and the mean and median
    of the gray level.

    Topological descriptors are useful for global description of regions in the image plane.
    Topology is the study of properties of a figure that are unaffected by any deformation, as
    long as there is no tearing or joining of the figure [3]. Topological descriptors include
    the following:

    i. Number of Holes (H).

    ii. Number of Connected Components (C).

    iii. Euler Number (E).

    where:

        E = C - H        (2.10)
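The three topological descriptors can be computed with a simple flood fill, as in the following sketch. It assumes a binary image given as a list of rows of 0/1 values, 8-connectivity for the foreground, and 4-connectivity for the background (a common pairing, but an assumption here). The names `count_components` and `topological_descriptors` are hypothetical.

```python
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def count_components(cells, neighbors):
    """Count connected groups in a set of (row, col) cells."""
    remaining = set(cells)
    count = 0
    while remaining:
        stack = [remaining.pop()]
        count += 1
        while stack:
            r, c = stack.pop()
            for dr, dc in neighbors:
                nxt = (r + dr, c + dc)
                if nxt in remaining:
                    remaining.remove(nxt)
                    stack.append(nxt)
    return count


def topological_descriptors(binary):
    """Return (C, H, E) for a binary image, with E = C - H."""
    rows, cols = len(binary), len(binary[0])
    # Shift by one so that a background frame surrounds the whole image.
    fg = {(r + 1, c + 1) for r in range(rows) for c in range(cols) if binary[r][c]}
    bg = {(r, c) for r in range(rows + 2) for c in range(cols + 2)} - fg
    components = count_components(fg, N8)
    holes = count_components(bg, N4) - 1  # the outer background is not a hole
    return components, holes, components - holes


ring = [[1, 1, 1],
        [1, 0, 1],
        [1, 1, 1]]
two_dots = [[1, 0, 1]]
print(topological_descriptors(ring))      # (1, 1, 0)
print(topological_descriptors(two_dots))  # (2, 0, 2)
```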


    2.3 Artificial Neural Networks

    The human brain is estimated to have around 10 billion neurons, each connected on average
    to 10000 other neurons, forming a complicated network of neurons. Artificial neural
    networks were inspired by biological findings relating to the behavior of the brain [4].

    An artificial neural network may be defined as a network of units called neurons (the
    basic processing elements of the network) which communicate by sending signals to each
    other over a large number of weighted connections.

    Figure (2.7) Layers of neurons

    There are several activation functions used in neural networks. Some of them are shown in
    Figure (2.8) below. The choice of the activation function is based upon the purpose of the
    neural network design. The chosen function must be able to classify given input pattern
    vectors into the desired classes (which are called targets).

    Figure (2.8) Some of the activation functions used in ANN

    Neural networks can be classified into architectures based on the type of activation
    function used. One of the famous architectures is the perceptron, which is a neuron with
    a hard-limiting activation function.
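A single perceptron neuron can be sketched as follows. The convention that the hard limiter outputs 1 for a non-negative net input and 0 otherwise is an assumption, and the weight and bias values are hypothetical; they are not values from this thesis.

```python
def hardlim(n):
    """Hard-limiting activation: 1 if the net input is non-negative, else 0."""
    return 1 if n >= 0 else 0


def perceptron(inputs, weights, bias):
    """Output of one neuron: hardlim(w . x + b)."""
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    return hardlim(net)


# Hypothetical one-input neuron that switches on for large inputs.
print(perceptron([0.9], weights=[1.0], bias=-0.6))  # 1
print(perceptron([0.3], weights=[1.0], bias=-0.6))  # 0
```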

    2.3.2 Learning & Training ANN

    The most important characteristic of an artificial neural network is that it can learn
    and detect regularity or irregularity in the input data. There are several learning rules
    and algorithms developed to perform the learning process; each group of rules is
    applicable to certain structures of neural networks. The concept of learning is based upon
    modifying the connecting weights so that the mean square error (the mean square of the
    difference between the desired output and the actual output of the network) is as small as
    possible (the goal is ideally zero) and the relationship between the input and the output
    is estimated with more accuracy. This process is similar to curve-fitting problems in
    numerical analysis, where certain points in an n-dimensional space are given and a curve
    that passes through them is to be estimated. Learning in which the targets are known is
    called supervised learning.

    Figure (2.9) Block diagram for the supervised learning process

    Training is the process of collecting as many (input pattern, output) pairs as possible
    from the application domain where the neural network is intended to be used, and
    presenting them to it in an appropriate form so that its weights are adjusted, according
    to the learning rule embedded within it, to map the inputs to the desired outputs. An
    important term in training is the epoch, which is defined as one pass through all the
    training data.
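As an illustration of supervised training by epochs, the sketch below trains a one-input perceptron with the classical perceptron learning rule (add the error times the input to the weight, and the error to the bias). The rule, the zero initial values, and the sample data are assumptions chosen for the example; they are not taken from the thesis experiments.

```python
def hardlim(n):
    """Hard-limiting activation: 1 if the net input is non-negative, else 0."""
    return 1 if n >= 0 else 0


def train_perceptron(samples, targets, max_epochs=100):
    """Train a one-input perceptron; return (weight, bias, epochs_used)."""
    w, b = 0.0, 0.0
    for epoch in range(1, max_epochs + 1):
        errors = 0
        for x, t in zip(samples, targets):
            e = t - hardlim(w * x + b)  # error = target - actual output
            if e != 0:
                w += e * x              # perceptron learning rule
                b += e
                errors += 1
        if errors == 0:                 # one full pass without any mistake
            return w, b, epoch
    return w, b, max_epochs


# Hypothetical, linearly separable inputs with known targets.
inputs = [0.2, 0.3, 0.9, 1.1]
labels = [0, 0, 1, 1]
w, b, epochs = train_perceptron(inputs, labels)
print([hardlim(w * x + b) for x in inputs])  # [0, 0, 1, 1]
```

When the loop ends without errors, every training sample is mapped to its target, so the mean square error on the training data is zero.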

    CHAPTER THREE

    Design & Modeling

    In this chapter, an analysis of the shapes of Arabic letters in terms of the topological
    descriptors described earlier is introduced. This analysis is followed by the conceptual
    design of the system.

    3.1 Arabic Letters' Characteristics & Descriptors

    3.1.1 Arabic Letters Classes

    The Arabic letters will be classified into classes; each class is characterized by a
    3-dimensional descriptor pattern (number of connected components, number of holes, and
    Euler number), which will be denoted as the [C, H, E] pattern. The following table shows
    the evaluation of these descriptors for Arabic letters.

    Table (3.1) The descriptors C, H & E for some Arabic letters

    Letter  C  H  E      Letter  C  H  E      Letter  C  H  E
    E       2  1  1      F       2  0  2      G       1  0  1
    H       3  1  2      ?       1  0  1      J       2  0  2
    K       2  0  2      L       4  0  4      M       3  0  3
    N       1  0  1      ?       1  1  0      P       4  0  4
    Q       1  0  1      ?       2  1  1      S       2  0  2
    T       2  0  2      U       1  1  0      V       1  0  1
    WWX     1  2  -1     Y       2  1  1      Z       2  0  2
    [       1  1  0      \       1  0  1      ]       1  0  1
    ^       3  0  3      _       2  0  2      `       2  0  2

    Table (3.2) Classification of the Arabic letters into the proposed classes

    Class       Letters
    2 0 2       T K _ F ` Z S J
    2 1 1       E Y
    3 0 3       ^ M
    4 0 4       L P
    1 0 1       Q N \ ] V G
    1 1 0       [ U
    1 2 -1      WX
    3 1 2       H

    A simple inspection of the 3 descriptors [C, H, E] identifies the class to which the
    letter belongs; what remains is to find out (or to recognize) which letter within the
    class it is. This holds for all classes except the 2 classes [1 2 -1] and [3 1 2], since
    each of them consists of only one letter (as shown in the previous table).

    The ratio of the letter's width to its height will be taken as an additional descriptor to
    obtain a 4-dimensional pattern [C, H, E, R], where (R) is the ratio just mentioned. It
    will be assumed that each letter maintains its own ratio regardless of the handwriting.
    Figure (3.1) illustrates the concept of the ratio.

    Figure (3.1) The width & height of an Arabic letter

    where

        R = \frac{\text{width}}{\text{height}}        (3.1)
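One way to measure this ratio is from the bounding box of the letter's pixels, as sketched below. This is not the algorithm developed in the thesis, whose code is in Appendix C; it assumes a binary image stored as a list of rows in which letter pixels are 1, and the name `width_height_ratio` is hypothetical.

```python
def width_height_ratio(binary):
    """Return the bounding-box width divided by the bounding-box height."""
    rows = [r for r, row in enumerate(binary) if any(row)]
    cols = [c for c in range(len(binary[0])) if any(row[c] for row in binary)]
    if not rows:
        raise ValueError("image contains no foreground pixels")
    width = cols[-1] - cols[0] + 1
    height = rows[-1] - rows[0] + 1
    return width / height


letter = [[0, 0, 0, 0, 0],
          [0, 1, 1, 1, 0],
          [0, 1, 0, 0, 0],
          [0, 0, 0, 0, 0]]
print(width_height_ratio(letter))  # 1.5
```

Because only the bounding box is used, the result does not change when empty margins are added around the letter or when the image is scaled uniformly.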

    3.1.2 Selection of Certain Letters for Each Class

    The ratios were measured for all Arabic letters typed by computer using 3 fonts:
    Traditional, Courier, and Transparent. It is believed that neural networks solve
    classification problems if they are linearly separable, which implies that each Arabic
    letter should have a range of ratios that does not intersect with those of other letters.
    So certain letters from each class were chosen because they satisfy this criterion (see
    Appendix A).

    Measurements of the mean, variance, and confidence interval were taken for the ratios of
    100 random samples of the selected letters. These letters were extracted from the Arabic
    Database Project developed by students from Sudan University for Science & Technology.
    Some results were excluded as they generated extreme values. Some letters were excluded
    from their classes as they affected the confidence interval for other letters. As a
    result, some classes will have only one letter, as shown in the tables from (3.3) onward.
    The red circles indicate which letters were selected.
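The selection criterion can be sketched as a comparison of confidence intervals. The text does not state how the intervals were computed, so the sketch assumes a normal-approximation 95% interval for the mean (mean plus or minus 1.96 standard errors); the sample ratios are hypothetical, and `summarize` and `overlap` are illustrative names.

```python
import math
import statistics


def summarize(ratios, z=1.96):
    """Return (mean, sample variance, confidence interval) for a list of ratios."""
    mean = statistics.mean(ratios)
    variance = statistics.variance(ratios)  # sample variance (divides by n - 1)
    half_width = z * math.sqrt(variance / len(ratios))
    return mean, variance, (mean - half_width, mean + half_width)


def overlap(a, b):
    """True if the closed intervals a and b intersect."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical width-to-height ratios for two letters of the same class.
letter_a = [0.60, 0.70, 0.65, 0.75, 0.70]
letter_b = [1.00, 1.10, 0.95, 1.05, 1.00]
_, _, ci_a = summarize(letter_a)
_, _, ci_b = summarize(letter_b)
print(overlap(ci_a, ci_b))  # False
```

Under this reading, two letters would be kept in the same class only when their intervals do not intersect.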

    Table (3.3) Sample of letters in class 202

    Letter       J         S         `
    Mean         0.730     1.0143    0.6613
    Variance     0.1112    0.1264    0.0432

    /in :.8:: :.: :.:>=?6

    Table (3.10) Training data for class 202

    Network Name: class202NN

    Training data            Network Type    Targets                      Meaning of targets
    Data collected from      Perceptron      One of the numbers [0 1]     0: `
    previous graphs.                                                      1: S

    Table (3.11) Training data for class 110

    Network Name: class110NN

    Training data            Network Type    Targets                          Meaning of targets
    Data collected from      Linear          One of the numbers [1 2 >?]      1: U
    previous graphs.                                                          2: [
                                                                              >?:

    Table (3.12) Training data for class 211

    Network Name: class211NN

    Training data            Network Type    Targets                      Meaning of targets
    Data collected from      Perceptron      One of the numbers [0 1]     0: Y
    previous graphs.                                                      1:

    3.3.3.2 Training results

    Tables (3.13) to (3.16) include the results obtained from training the ANNs. From these
    results, the values of the weights & biases were set in the networks:

    Table (3.13) Training results for class101NN

    Network Name: class101NN

    Max. Number of epochs      116

    The minimum value of the mean square error was 0.10226, and with this value only 7.1% of
    the training samples were misclassified. Since the curve was stuck at this value, it was
    decided to stop here and take the values of the weight and bias.

    Table (3.14) Training results for class202NN

    Network Name: class202NN

    Max. Number of epochs        12
    Mean Square Error (MSE)      0.0
    Samples Size                 41
    Samples misclassified        0
    Error probability            0.0
    Weight                       3.27

    In table =.!

    Figure (3.7) Performance graph for class202NN

    Figure (3.8) Performance graph for class101NN

    Figure (3.9) Performance graph for class110NN

    Figure (3.10) Performance graph for class211NN

    CHAPTER FOUR

    Implementation & Results

    In this chapter, the system implementation is discussed as well as the mechanism by which
    the system was tested. Testing results are also interpreted.

    4.1 Implementation

    In order to implement & run this OCR system, the developed M-files (please refer to
    Appendix C to see their names & code) need to be placed in MATLAB's work directory
    (version 7.0 or more).

    To run the system, "OCR" or "ocr" is typed in the command window and a dialog box will
    appear, as shown in Figure (4.1). Through this dialog box the user can browse for the
    desired image file. Once "open" is clicked, the OCR system is executed and the result of
    the recognition is printed in MATLAB's command window.

    Figure (4.1) Dialog box that appears when running the system

    Figure (4.2) Snapshot of the system execution

    Table (4.1) lists the message that is displayed for each Arabic letter, provided that the
    system recognized it.

    Table (4.1) The messages that must be printed in correspondence to each letter

    letter   message      letter   message
    G        Alif         ?        Sad
    M        Teh          ?        Dad
    P        Theh         U        Tah
    S        Jeem         Y        Thah
    V        Hah          H        af
    ]        Dal          WX       Haa
    `        Thal         [        Waw
    ?        Seen

    4.2 Testing the system

    A form was distributed to 100 random people inside the faculty, asking them to fill it in
    with their handwriting. The Arabic letters tested are shown in Appendix B.

    The forms were collected. Scanners were used to acquire the images. Images of each letter
    were submitted to the system. Table (4.2) shows the results obtained from the testing.
    Figure (4.3) displays the results graphically.

    Table (4.2) Test results

    Letter    Misclassified in the    Misclassified in the correct    Recognized
              correct class (%)       ratio region (%)                successfully (%)

    G !> ! A

    P =A A>

    S 8A B AB

    V =< 8 =B

    ] !: A8 8

    ` != 8! AA

    8 8 B:

    =8 A A8

    = !: = = !

    Y >< == 88

    H A: >:

    WX >

    [ =: >8 8

    Figure (4.3) Test results

    It is clear that the recognition percentage varies among the letters. The samples that the
    OCR system could not recognize can be classified into 2 categories:

    i. Those which were not classified in the correct [C, H, E] class.

    ii. Those which were not recognized by their ratios but were classified successfully (they
    had the right [C, H, E] class).

    The first category was due to handwriting errors such as imperfect holes, unnecessary
    additional holes, connected dots, and disconnected components (writing without raising the
    pen to write dots or hamza); these errors are responsible for mistaking a letter's class
    for another.


    CHAPTER FIVE

    Conclusion & Future Work

    5.1 Conclusion

    It can be concluded that this OCR system is very sensitive to the accuracy of the
    handwriting, especially the number of holes & connected components. If the user is
    accurate in his handwriting, or at least complies with the rules of holes & connected
    components, it can be predicted that the recognition percentages will increase to the

    following values for each letter see table

    H !:: = P WX = S =

    [ >A : V

    Although the system can achieve its objective this way, OCR systems (and any system that
    deals with human beings in general) are intended to be easy to apply and convenient for
    the user, so it is not acceptable to force the user to comply with the system rather than
    forcing ourselves, as system designers, to comply with the user's needs.

    It can also be concluded that the ratio alone is not a sufficient classifier for Arabic
    letters, except for very few letters, as discussed before.

    On the positive side, it can be concluded that the objectives have been partially
    achieved; image type constraints and image resizing requirements were eliminated.

    5.2 Future Work

    The following enhancements are recommended in order to improve this OCR system:

    i. Applying more experiments on handwritten Arabic letters to obtain more accurate
    features and descriptors for them than those used in this system, so as to extend the
    pattern length as well as the number of letters to be recognized.

    ii. Using more complex neural network architectures, especially those with built-in
    memories such as ART (Adaptive Resonance Theory) based neural networks, as it is believed
    that the secret to the evolution & improvement of this project is in the architecture of
    the neural network used as well as the length of the pattern.

    REFERENCES

    [1] Peter Burrow, Arabic Handwriting Recognition, Master of Science, School of
    Informatics, University of Edinburgh, 2004.

    [2] Rafael C. Gonzalez, Richard E. Woods, Steven L. Eddins, Digital Image Processing
    using MATLAB, 2004.

    [3] Rafael C. Gonzalez, Richard E. Woods, Digital Image Processing, 2nd edition, 2002.

    [4] Christopher Bishop, Neural Networks for Pattern Recognition, Oxford, 1995.