Research Article A Study of Moment Based Features...
Transcript of Research Article A Study of Moment Based Features...
Research ArticleA Study of Moment Based Features onHandwritten Digit Recognition
Pawan Kumar Singh Ram Sarkar and Mita Nasipuri
Department of Computer Science and Engineering Jadavpur University 188 Raja S C Mullick RoadKolkata West Bengal 700032 India
Correspondence should be addressed to Pawan Kumar Singh pawansinghjugmailcom
Received 3 November 2015 Revised 16 January 2016 Accepted 27 January 2016
Academic Editor Miin-Shen Yang
Copyright copy 2016 Pawan Kumar Singh et al This is an open access article distributed under the Creative Commons AttributionLicense which permits unrestricted use distribution and reproduction in any medium provided the original work is properlycited
Handwritten digit recognition plays a significant role in many user authentication applications in the modern world As thehandwritten digits are not of the same size thickness style and orientation therefore these challenges are to be faced to resolvethis problem A lot of work has been done for various non-Indic scripts particularly in case of Roman but in case of Indicscripts the research is limited This paper presents a script invariant handwritten digit recognition system for identifying digitswritten in five popular scripts of Indian subcontinent namely Indo-Arabic BanglaDevanagari Roman and Telugu A 130-elementfeature set which is basically a combination of six different types of moments namely geometric moment moment invariant affinemoment invariant Legendremoment Zernikemoment and complexmoment has been estimated for each digit sample Finally thetechnique is evaluated on CMATER and MNIST databases using multiple classifiers and after performing statistical significancetests it is observed that Multilayer Perceptron (MLP) classifier outperforms the others Satisfactory recognition accuracies areattained for all the five mentioned scripts
1 Introduction
Thefield of automated reading of printed or handwritten doc-uments by the electronic devices is known as Optical Char-acter Recognition (OCR) system which is broadly definedas the process of recognizing either printed or handwrittentext from document images and converting it into electronicform OCR systems can contribute tremendously to theadvancement of the automation process and can improve theinteraction between man and machine in many applicationsincluding office automation bank check verification postalautomation and a large variety of business and data entryapplications Handwritten digit recognition is the method ofrecognizing and classifying handwritten digits from 0 to 9without human interaction [1] Although the recognition ofhandwritten numerals has been studied for more than threedecades and many techniques with high accuracy rates havealready been developed the research in this area continueswith the aim of improving the recognition rates further
Handwritten digit recognition is a complex problem dueto the fact that variation exists in writing style of different
writers The phenomenon that makes the problem morechallenging is the inherent variation in writing styles atdifferent instances Due to this reason building a genericrecognizer that is capable of recognizing handwritten digitswritten by diverse writers is not always feasible [2] Howeverthe extraction of the most informative features with highlydiscriminatory ability to improve the classification accuracywith reduced complexity remains one of the most importantproblems for this task It is a task of great importancefor which there are standard databases that allow differentapproaches to be compared and validated
India is a multilingual country with 23 constitutionallyrecognized languages written in 12 major scripts [1] Besidesthese hundreds of other languages are used in India each onewith a number of dialectsThe officially recognized languagesare Hindi Bengali Punjabi Marathi Gujarati Oriya SindhiAssamese Nepali Urdu Sanskrit Tamil Telugu KannadaMalayalam KashmiriManipuri KonkaniMaithili SanthaliBodo English and Dogri The 12 major scripts used to writethese languages are Devanagari Bangla Oriya GujaratiGurumukhi Tamil Telugu Kannada Malayalam Manipuri
Hindawi Publishing CorporationApplied Computational Intelligence and So ComputingVolume 2016 Article ID 2796863 17 pageshttpdxdoiorg10115520162796863
2 Applied Computational Intelligence and Soft Computing
Roman and Urdu In a multilingual country like India it is acommon scenario that a document like job application formrailway ticket reservation form and so forth is composed oftext contents written in different languagesscripts in orderto reach a larger cross section of people The variation ofdifferent scripts may be in the form of numerals or alphanumerals in a single document page But the techniquesdeveloped for text identification generally do not incorporatethe recognition of digitsThis is because the features requiredfor the text identification may not be applicable for identify-ing the digits
The paper is organized as follows Section 2 presents abrief review of some of the previous approaches to hand-written digit recognition whereas in Section 3 we introduceour script independent handwritten digit recognition systemSection 4describes the performance of our systemon realisticdatabases of handwritten digits and finally Section 5 con-cludes the paper
2 Review of Related Works
Gorgevik and Cakmakov [3] developed Support VectorMachine (SVM) based digits recognition system for hand-written Roman numerals They extracted four types of fea-tures from each digit image (1) projection histograms (2)contour profiles (3) ring-zones and (4) Kirsch featuresTheyreported 9727 recognition accuracy on National Instituteof Standards and Technology (NIST) handwritten digitsdatabase [4] In [5] Chen et al proposed max-min poste-rior pseudoprobabilities framework for Roman handwrittendigit recognition They extracted 256 dimension directionalfeatures from the input image Finally these features weretransformed into a set of 128 features using Principal Com-ponent Analysis (PCA) They reported recognition accuracyof 9876 on NIST database [4] Labusch et al [6] describeda sparse coding based feature extraction method with SVMas a classifier They found recognition accuracy of 9941on MNIST (Modified NIST) handwritten digits database [7]The work described in [8] combined three recognizers bymajority vote and one of them is based on Kirsch gradient(four orientations) dimensionality reduction by PCA andclassification by SVM They achieved an accuracy rate of9505 with 093 error on 10000 test samples of MNISTdatabase [7] Mane and Ragha [9] performed handwrittendigit recognition using elastic image matching techniquebased on eigendeformation which is estimated by the PCAof actual deformations automatically selected by the elasticmatching They achieved an overall accuracy of 9491 ontheir owndatabase collected fromdifferent individuals of var-ious professions for the experiment Cruz et al [10] presenteda handwritten digit recognition system which uses multiplefeature extraction methods and classifier ensemble A totalof six feature extraction algorithms namely MultizoningModified Edge Maps Structural Characteristics ProjectionsConcavities Measurements and Gradient Directional wereevaluated in this paper A scheme using neural networks as acombiner achieved a recognition rate of 9968 on a trainingset of 60000 images and a test set of 10000 images of MNISTdatabase
Dhandra et al [11] investigated a script independentautomatic numeral recognition system for recognition ofKannada Telugu and Devanagari handwritten numerals Inthe proposed method 30 classes were reduced to 18 classesby extracting the global and local structural features likedirectional density estimation water reservoirs maximumprofile distances and fill-hole density Finally a probabilisticneural network (PNN) classifier was used for the recognitionsystem which yielded an accuracy of 9720 on a totalof 2550 numeral images written in Kannada Telugu andDevanagari scripts In [12] Yang et al proposed supervisedmatrix factorization method used directly as multiclass clas-sifier They reported recognition accuracy of 9871 withsupervised learning approach on MNIST database [7] In[13] a mixture of multiclass logistic regression models wasdescribed They claimed recognition accuracy of 98 onthe Indian digit database provided by CENPARMI [14] Daset al [15] described a technique for creating a pool of localregions and selection of an optimal set of local regions fromthat pool for extracting optimal discriminating informationfor handwritten Bangla digit recognition Genetic algorithm(GA) was then applied on these local regions to samplethe best discriminating features The features extracted fromthese selected local regions were then classified with SVMand recognition accuracy of 97 was achieved In [16] awavelet analysis based technique for feature extraction wasreported For classification SVM and k-Nearest Neighbor (k-NN) were used and an overall recognition accuracy of 9704was reported on MNIST digit database [7] A comparativestudy in [17] was conducted by training the neural networkusing Backpropagation (BP) algorithm and further usingPCA for feature extraction Digit recognition was finallycarried out using 13 algorithms neural network algorithmand the Fisher Discriminant Analysis (FDA) algorithm TheFDA algorithm proved less efficient with an overall accuracyof 7767 whereas the BP algorithm with PCA for its featureextraction gave an accuracy of 912
In [18] a set of structural features (namely number ofholes water reservoirs in four directions maximum profiledistances in four directions and fill-hole density) and k-NNclassifier were employed for classification and recognition ofhandwritten digits They reported recognition accuracy of9694 on 5000 samples of MNIST digit database [7] In[19] AlKhateeb and Alseid proposed an Arabic handwrittendigit recognition system using Dynamic Bayesian NetworkThey employed DCT coefficients based features for classifi-cation The system was tested on Indo-Arabic digits database(ADBase) which contains 70000 Indo-Arabic digits [20] andan average recognition accuracy of 8526 was achieved on10000 samples Ebrahimzadeh and Jampour [21] proposedan appearance feature-based approach using Histogram ofOriented Gradients (HOG) for handwritten digit recogni-tion A linear SVM was then used for classification of thedigits in MNIST dataset and an overall accuracy of 9725had been realized Gil et al [22] presented a novel approachusing SVM binary classifiers and unbalanced decision treesTwo classifiers were proposed in this study where one usedthe digit characteristics as input and the other used thewhole image as such It is observed that a handwritten
Applied Computational Intelligence and Soft Computing 3
digit recognition accuracy of 100 was achieved on MNISTdatabase using the whole image as input El Qacimy et al[23] investigated the effectiveness of four feature extractionapproaches based on Discrete Cosine Transform (DCT)namely DCT upper left corner (ULC) coefficients DCTzigzag coefficients block based DCT ULC coefficients andblock based DCT zigzag coefficients The coefficients of eachDCT variant were used as input data for SVM classifier andit was found that block based DCT zigzag feature extractionyielded a superior recognition accuracy of 9876 onMNISTdatabase AL-Mansoori [24] implemented aMLP classifier torecognize and predict handwritten digits A dataset of 5000samples were obtained from MNIST database and an overallaccuracy of 9932 was achieved
From the above literature it is clear thatmost of the workshave been done for the Roman script whereas relatively fewworks [11 15 19] have been reported for the digit recognitionwritten in Indic scripts The main reasons for this slowprogress could be attributed to the complexity of the shapeof Indic scripts as opposed to Roman script Again thediscriminating power of the features exploited till now isnot easily measurable investigative experimentations will benecessary for identifying new feature descriptors for effec-tive classification of complex handwritten digits of differentscripts It is also revealed that the methods described inthe literature suffer from larger computational time mainlydue to feature extraction from large dataset In addition theabove recognition systems fail to meet the desired accuracywhen exposed to different multiscript scenario Hence itwould be beneficial for multilingual country like India ifthere is a method which is independent of script and yieldsreasonable recognition accuracy This has motivated us tointroduce a script invariant handwritten digit recognitionsystem for identifying digits written in five popular scriptsnamely Indo-ArabicBanglaDevanagariRoman andTeluguThe key module of the proposed methodology is shown inFigure 1
3 Feature Extraction Methodology
One of the basic problems in the design of any patternrecognition system is the selection of a set of appropriatefeatures to be extracted from the object of interest Researchon the utilization of moments for object characterizationin both invariant and noninvariant tasks has received con-siderable attention in recent years Describing digit imageswith moments instead of other more commonly used patternrecognition features (described in [21ndash23]) means that globalproperties of the digit image are used rather than localproperties So for the present work we considered amomentbased approach which is described in the next subsection
31 Moments Moments are pure statistical measure of pixeldistribution around the center of gravity of the image andallow capturing global shapes information [25]Theydescribenumerical quantities at some distance from a referencepoint or axis Moments are commonly used in statisticsto characterize the distribution of random variables and
Handwritten digit images
Feature extraction using moment based features
Classification of digits using multiple classifiers
Statistical significance tests for comparison of multipleclassifiers using multiple datasets
Selection of appropriate classifier
Detailed performance evaluation of chosen classifierusing training and test sets
Figure 1 Schematic diagram illustrating the key modules of theproposed methodology
similarly in mechanics to characterize bodies by their spatialdistribution of mass
A complete characterization of moment functional over aclass of univariate functions was given by Hausdorff [26] in1921
Let 120583119899 be a real sequence of numbers and let us define
Δ119898
120583119899=
119898
sum119894=0
(minus1)119894
(119898
119894)120583119899+119894 (1)
Note that Δ119898120583119899can be viewed as the 119898th order derivative of
120583119899By theHausdorff theorem a necessary and sufficient con-
dition that there exists a monotonic function 119865(119909) satisfyingthe system
120583119899= int1
0
119909119899
119889119865 (119909) 119899 = 0 1 2 (2)
is that the system of linear inequalities
Δ119896
120583119899ge 0 119896 = 0 1 2 (3)
should be satisfied that is if 119891(119909) is a positive function (incase of image processing) then the set of functionals
int1
0
119909119899
119891 (119909) 119889119909 119899 = 0 1 (4)
completely characterizes the functionA necessary and sufficient condition that there exists a
function 119865(119909) of bounded variation satisfying (7) is that thesequence
119901
sum119898=0
(119901
119898)1003816100381610038161003816Δ119901minus119898
120583119898
1003816100381610038161003816 119901 = 0 1 2 (5)
4 Applied Computational Intelligence and Soft Computing
should be bounded The use of moments for image analysisis straightforward if we consider a binary or gray level imagesegment as a two-dimensional density distribution functionIt can be assumed that an image can be represented by a real-valued measurable function 119891(119909 119910) In this way momentsmay be used to characterize an image segment and extractproperties that have analogies in statistics and mechanics Inimage processing and computer vision an image momentis a certain particular weighted average (moment) of theimage pixelsrsquo intensities or a function of such momentsusually chosen to have some attractive property or interpre-tation The first significant work considering moments forpattern recognition was performed by Hu [27] He derivedrelative and absolute combinations of moment values thatare invariant with respect to scale position and orientationbased on the theories of invariant algebra that deal with theproperties of certain classes of algebraic expressions whichremain invariant under general linear transformations Sizeinvariant moments are derived from algebraic invariants butcan be shown to be the result of simple size normalizationTranslation invariance is achieved by computing momentsthat have been translated by the negative distance to thecentroid and thus normalized so that the center of mass ofthe distribution is at the origin (central moments)
32 Geometric Moments Geometric moments are defined asthe projection of the image intensity function119891(119909 119910) onto themonomial 119909119901119910119902 [25] The (119901 + 119902)th order geometric moment119872119901119902
of a gray level image 119891(119909 119910) is defined as
119872119901119902= ∬infin
minusinfin
119909119901
119910119902
119891 (119909 119910) 119889119909 119889119910 (6)
where 119901 119902 = 0 1 2 infin Note that the monomial product119909119901
119910119902 is the basis function for this moment definition A set
of 119899 moments consists of all 119872119901119902rsquos for 119901 + 119902 le 119899 that is
the set contains (12)(119899 + 1)(119899 + 2) elements If 119891(119909 119910) ispiecewise continuous and contains nonzero values only ina finite region of the 119909119910-plane then the moment sequence119872119901119902 is uniquely determined by 119891(119909 119910) and conversely
119891(119909 119910) is uniquely determined by 119872119901119902 Considering the
fact that an image segment has finite area or in the worst caseis piecewise continuous moments of all orders exist and acomplete moment set can be computed and used uniquely todescribe the information contained in the image Howeverobtaining all the information contained in the image requiresan infinite number of moment values Therefore to selecta meaningful subset of the moment values that containsufficient information to characterize the image uniquely fora specific application becomes very important In case of adigital image of size 119872 times 119873 the double integral in (6) isreplaced by a summation which turns into this simplifiedform
119898119901119902=
119872
sum119909=1
119873
sum119910=1
119909119901
119910119902
119891 (119909 119910) (7)
where 119901 119902 = 0 1 2 are integersWhen119891(119909 119910) changes by translating rotating or scaling
then the image may be positioned such that its center of mass
(COM) is coincided with the origin of the field of view thatis (119909 = 0) and (119910 = 0) and then the moments computedfor that object are referred to as central moment [25] and it isdesignated by 120583
119901119902 The simplified form of central moment of
order (119901 + 119902) is defined as follows
120583119901119902=
119872
sum119909=1
119873
sum119910=1
(119909 minus 119909)119901
(119910 minus 119910)119902
119891 (119909 119910) (8)
where 119909 = 1198981011989800and 119910 = 119898
0111989800
The pixel point (119909 119910) is theCOMof the imageThe centralmoments 120583
119901119902computed using the centroid of the image are
equivalent to 119898119901119902
whose center has been shifted to centroidof the image Therefore the central moments are invariantto image translations Scale invariance can be obtained bynormalization The normalized central moments denoted by120578119901119902
are defined as
120578119901119902=120583119901119902
120583120574
00
(9)
where 120574 = (119901 + 119902)2 + 1 for (119901 + 119902) = 2 3 The second order moments 120578
02 12057811 12057820 known as the
moments of inertia may be used to determine an importantimage feature called orientation [25] Here the feature valuesF1ndashF3 have been computed from moments of inertia of theword images In general the orientation of an image describeshow the image lies in the field of view or the directions of theprincipal axes In terms of moments the orientation of theprincipal axis 120579 taken as feature value F4 is given by
120579 =1
2tanminus1 (
212058311
12058320minus 12058302
) (10)
where 120579 is the angle of the principal axis nearest to the 119909-axis and is in the range minus1205874 le 120579 le 1205874 The minimumandmaximumdistances (119903min and 119903max) between the centroidand the boundary of an image are also feature descriptorsTheratio 119903max119903min is called elongation or eccentricity (F5) and canbe defined in terms of central moments as follows
119890 =(12058320minus 12058302)2
+ 41205832
11
12058300
(11)
33 Moment Invariants Based on the theory of algebraicinvariants Hu [27] derived relative and absolute combina-tions of moments that are invariant with respect to scaleposition and orientation The method of moment invariantsis derived from algebraic invariants applied to the momentgenerating function under a rotation transformationThe setof absolute moment invariants consists of a set of nonlinearcombinations of central moment values that remain invariantunder rotation A set of seven invariant moments can bederived based on the normalized central moments of order
Applied Computational Intelligence and Soft Computing 5
three that are invariant with respect to image scale transla-tion and rotation Consider
1206011= 12057820+ 12057802
1206012= (12057820minus 12057802)2
+ 41205782
11
1206013= (12057830minus 312057812)2
+ (312057821minus 12057803)2
1206014= (12057830+ 12057812)2
+ (12057821+ 12057803)2
1206015= (12057830minus 312057812) (12057830+ 12057812)
sdot [(12057830+ 12057812)2
minus 3 (12057821+ 12057803)2
] + (312057821minus 12057803)
sdot (12057821+ 12057803) [3 (120578
30+ 12057812)2
minus (12057821+ 12057803)2
]
1206016= (12057820minus 12057802) [(12057830+ 12057812)2
minus (12057821+ 12057803)2
]
+ 412057811(12057830+ 12057812) (12057821+ 12057803)
1206017= (312057821minus 12057803) (12057830+ 12057812)
sdot [(12057830+ 12057812)2
minus 3 (12057821+ 12057803)2
] + (312057812minus 12057830)
sdot (12057821+ 12057803) [3 (120578
30+ 12057812)2
minus (12057821+ 12057803)2
]
(12)
This set of moments is invariant to translation scale changemirroring (within a minus sign) and rotation The 2Dmoment invariant gives seven features (F6ndashF12) which hadbeen used for the current work
34 Affine Moment Invariants The affine moment invariantsare derived to be invariants to translation rotation andscaling of shapes and under 2D Affine transformation Thesix affine moment invariants [28] used for the present workare defined as follows
1198681=
1
120583400
(1205832012058302minus 1205832
11)
1198682=
1
1205831000
(1205832
301205832
03minus 612058330120583211205831212058303+ 4120583301205833
12
+ 4120583031205833
21minus 31205832
211205832
12)
1198683=
1
120583700
(12058320(1205832112058303minus 1205832
12) minus 12058311(1205833012058303minus 1205832112058312)
+ 12058302(1205833012058312minus 1205832
21))
1198684=
1
1205831100
(1205833
201205832
03minus 61205832
20120583111205831212058303minus 61205832
20120583211205830212058303
+ 91205832
20120583021205832
12+ 12120583
201205832
111205830312058321
+ 61205832012058311120583021205833012058303minus 18120583
2012058311120583021205832112058312
minus 81205833
111205830312058330minus 6120583201205832
021205833012058312+ 9120583201205832
021205832
21
+ 121205832
11120583021205833012058312minus 61205832
111205832
021205833012058321+ 1205832
021205832
30)
1198685=
1
120583600
(1205834012058304minus 41205833112058313+ 31205832
22)
1198686=
1
120583900
(120583401205830412058322+ 2120583311205832212058313minus 120583401205832
13minus 120583041205832
31
minus 1205833
22)
(13)
A total of 6 features (F13ndashF18) is extracted from each of thehandwritten digit images for the present work
35 The Legendre Moment The 2D Legendre moment [29]of order (119901 + 119902) of an object with intensity function 119891(119909 119910) isdefined as follows
119871119901119902=(2119901 + 1) (2119902 + 1)
4
sdot ∬+1
minus1
119875119901(119909) 119875119902(119910) 119891 (119909 119910) 119889119909 119889119910
(14)
where the kernel function 119875119901(119909) denotes the 119901th-order
Legendre polynomial and is given by
119875119901(119909) =
119901
sum119896=0
119862119901119896[(1 minus 119909)
119896
+ (minus1)119901
(1 + 119909)119896
] (15)
where
119862119901119896=(minus1)119896
2119896+1
(119901 + 119896)
(119901 minus 119896) (119896)2 (16)
Since the Legendre polynomials are orthogonal over theinterval [minus1 1] [20] a square image of 119873 times 119873 pixels withintensity function 119891(119894 119895) with 1 le 119894 119895 le 119873 must be scaledto be within the region minus1 le 119909 119910 le 1 The graphical plotfor first 10 Legendre polynomials is shown in Figures 2(a)-2(b) When an analog image is digitized to its discrete formthe 2D Legendre moments 119871
119901119902 defined by (14) is usually
approximated by the formula
119871119901119902
=(2119901 + 1) (2119902 + 1)
(119873 minus 1)2
119873
sum119894=1
119873
sum119895=1
119875119901(119909119894) 119875119902(119910119895) 119891 (119909
119894 119910119895)
(17)
where 119909119894= (2119894minus119873minus1)(119873minus1) and 119910
119895= (2119895minus119873minus1)(119873minus1)
and for a binary image 119891(119909119894 119910119895) is given as
119891 (119909119894 119910119895) =
1 if (119894 119895) is in original object
0 otherwise(18)
As indicated by Liao and Pawlak [30] (17) is not a veryaccurate approximation of (14) For achieving better accuracythey proposed to use the following approximated form
119901119902=(2119901 + 1) (2119902 + 1)
4
119873
sum119894=1
119873
sum119895=1
ℎ119901119902(119909119894119910119895) 119891 (119909
119894 119910119895) (19)
6 Applied Computational Intelligence and Soft Computing
Legendre polynomials
minus05 0 05 1minus1
x
Pn(x)
P0(x)
P1(x)
P2(x)
P3(x)
P4(x)
P5(x)
minus1
0
02
04
06
08
1
minus08
minus06
minus04
minus02
(a)
Legendre polynomials
minus05 0 05 1minus1
x
P6(x)
P7(x)
P8(x)
P9(x)
P10(x)
minus1
0
02
04
06
08
1
minus08
minus06
minus04
minus02
Pn(x)
(b)
Figure 2 Graph showing the plots for the two-dimensional Legendre polynomials 119875119899(119909) (a) 119875
1(119909) to 119875
5(119909) and (b) 119875
6(119909) to 119875
10(119909)
where
ℎ119901119902(119909119894 119910119895) = int
119909119894+Δ1199092
119909119894minusΔ1199092
int119910119895+Δ1199102
119910119895minusΔ1199102
119875119901(119909) 119875119901(119910) 119889119909 119889119910 (20)
withΔ119909 = 119909119894minus119909119894minus1
= 2(119873minus1) andΔ119910 = 119910119895minus119910119895minus1
= 2(119873minus1)To evaluate the double integral ℎ
119901119902(119909119894 119910119895) defined by (20)
an alternative extended Simpson rule was proposed by Liaoand Pawlak These values were then used to calculate the2D Legendre moments
119901119902defined by (19) Therefore this
method requires a large number of computing operations Asone can see
119901119902can be expressed with the help of a useful
formula that will be given below as a linear combination of119871119898119899 with 0 le 119898 le 119901 0 le 119899 le 119902A set of 10 Legendre moments (F19ndashF28) can also be
derived based on the set of invariant moments found in theprevious subsection
11987100= 12060100
11987110= (
3
4) 12060110
11987101= (
3
4) 12060101
11987120= (
5
4) [(
3
2) 12060120minus (
1
2) 12060100]
11987102= (
5
4) [(
3
2) 12060102minus (
1
2) 12060100]
11987111= (
9
4) 12060111
11987130= (
7
4) [(
5
2) 12060130minus (
3
2) 12060110]
11987103= (
7
4) [(
5
2) 12060103minus (
3
2) 12060101]
11987121= (
15
4) [(
3
2) 12060121minus (
1
2) 12060101]
11987112= (
15
4) [(
3
2) 12060112minus (
1
2) 12060110]
(21)
36 Zernike Moments Zernike polynomials are orthogonalseries of basis functions normalized over a unit circle Thecomplexity of these polynomials increases with increasingpolynomial order [31] To calculate the Zernikemoments theimage (or region of interest) is first mapped to the unit discusing polar coordinates where the center of the image is theorigin of the unit disc The pixels falling outside the unit discare not considered here The coordinates are then describedby the length of the vector from the origin to the coordinatepoint The mapping from Cartesian to polar coordinates isdefined as follows
119909 = 119903 cos 120579
119910 = 119903 sin 120579(22)
Applied Computational Intelligence and Soft Computing 7
where
119903 = radic1199092 + 1199102
120579 = tanminus1 (119910
119909)
(23)
An important attribute of the geometric representationsof Zernike polynomials is that lower order polynomialsapproximate the global features of the shapesurface whilethe higher ordered polynomials capture local shapesurfacefeatures Zernikemoments are a class of orthogonalmomentsand have been shown to be effective in terms of imagerepresentation
Zernike introduced a set of complex polynomials whichforms a complete orthogonal set over the interior of the unitcircle that is 1199092 + 1199102 = 1 Let the set of these polynomials bedenoted by 119881
119899119898(119909 119910) The form of these polynomials is as
follows
119881119899119898(119909 119910) = 119881
119899119898(119901 120579) = 119877
119899119898(120588) exp (119895119898120579) (24)
where
119899 positive integer or zero
119898 positive and negative integers subject to con-straints 119899 minus |119898| even |119898| le 119899
120588 length of vector from origin to 119891(119909 119910) pixel
120579 angle between vector 120588 and 119909-axis in counterclock-wise direction
As mentioned above the complex Zernike moments of order119899 with repetition 119898 for a continuous image function 119891(119909 119910)are defined as follows
119881119899119898
=119899 + 1
120587∬119891(119909 119910)119881
lowast
119899119898(120588 120579) 119889119909 119889119910 (25)
in the 119909119910 image plane where 1199092 + 1199102 le 1 and lowast indicatesthe complex conjugate Note that for the moments to beorthogonal the image must be scaled within a unit circlecentered at the origin and
119881119899119898
=119899 + 1
120587int2120587
0
int1
0
119891 (120588 120579) 119877119899119898(120588) exp (minus119895119898120579) 120588 119889120588 119889120579
(26)
in polar coordinates The Zernike moment of the rotatedimage in the same coordinates is given by
119881119903
119899119898=119899 + 1
120587
sdot int2120587
0
int1
0
119891 (120588 120579 minus 120572) 119877119899119898(120588) exp (minus119895119898120579) 120588 119889120588 119889120579
(27)
By change of variable 1205791= 120579 minus 120572
119881119903
119899119898=119899 + 1
120587int2120587
0
int1
0
119891 (120588 1205791) 119877119899119898(120588)
sdot exp (minus119895119898 (1205791+ 120572)) 120588 119889120588 119889120579 = [
119899 + 1
120587
sdot int2120587
0
int1
0
119891 (120588 1205791) 119877119899119898(120588) exp (minus119895119898120579
1) 120588 119889120588 119889120579]
sdot exp (minus119895119898120572) = 119881119899119898
exp (minus119895119898120572)
(28)
Equation (28) shows that Zernike moments have simplerotational transformation properties each Zernike momentmerely acquires a phase shift on rotation This simpleproperty leads to the conclusion that the magnitudes ofthe Zernike moments of a rotated image function remainidentical to those before rotationThus the magnitude of theZernike moment |119881
119899119898| can be taken as a rotation invariant
feature of the underlying image function The real-valuedradial polynomial 119877
119899119898(120588) is defined as follows
119877119899119898(120588) =
(119899minus|119898|)2
sum119909=0
(minus1)119904
sdot(119899 minus 119904)
119904 ((119899 + |119898|) 2 minus 119904) ((119899 minus |119898|) 2 minus 119904)120588119899minus2119904
(29)
where 119899 minus |119898| = even and |119898| le 119899Zernike moments may also be derived from conventional
moments 120583119901119902
as follows
119885119899119897=(119899 + 1)
120587
sdot
119899
sum119896=119897
119902
sum119895=0
119897
sum119898=0
(minus119894)119898
(119902
119895)(
119897
119898)119861119899119897119896120583119896minus2119895minus119897+1198982119895+119897minus119898
(30)
Zernikemomentsmay bemore easily derived from rotationalmoments119863
119899119896 by
119885119899119897=
119899
sum119896=119897
119861119899119897119896119863119899119896 (31)
When computing the Zernike moments if the center of apixel falls inside the border of unit disk 119909
2
+ 1199102
le 1this pixel will be used in the computation otherwise thepixel will be discarded Therefore the area covered by themoment computation is not exactly the area of the unitdisk Advantages of Zernike moments can be summarized asfollows
(1) The magnitude of Zernike moment has rotationalinvariant property
(2) They are robust to noise and shape variations to someextent
(3) Since the basis is orthogonal they have minimumredundant information
8 Applied Computational Intelligence and Soft Computing
Table 1 List of Zernike moments and their corresponding numbersof features from order 0 to order 10
Order(119899)
Zernike moments of order 119899with repetition119898 (119881
119899119898)
Total number ofmoments up to order 10
0 11988100
36
1 11988111
2 11988120 11988122
3 11988131 11988133
4 11988140 11988142 11988144
5 11988151 11988153 11988155
6 11988160 11988162 11988164 11988166
7 11988171 11988173 11988175 11988177
8 11988180 11988182 119881841198818611988188
9 11988191 11988193 119881951198819711988199
10 119881100 119881102 119881104 119881106 119881108 1198811010
(4) An image can better be described by a small set of itsZernike moments than any other types of momentssuch as geometric moments
(5) A relatively small set of Zernike moments can char-acterize the global shape of pattern Lower ordermoments represent the global shape of patternwhereas the higher order moments represent thedetails
Therefore we choose Zernike moments as our shape descrip-tor in digit recognition process Table 1 lists the rotationinvariant Zernike moment features (F29ndashF64) and theircorresponding numbers from order 0 to order 10 used for thepresent work
The defined features on the Zernike moments are onlyrotation invariant To obtain scale and translation invariancethe digit image is first subjected to a normalization processusing its regular moments The rotation invariant Zernikefeatures are then extracted from the scale and translationnormalized image
37 Complex Moments The notion of complex moments wasintroduced in [32] as a simple and straightforward techniqueto derive a set of invariant moments The two-dimensionalcomplex moments of order (119901 119902) for the image function119891(119909 119910) are defined by
119862119901119902= int1198862
1198861
int1198872
1198871
(119909 + 119895119910)119901
(119909 minus 119895119910)119902
119891 (119909 119910) 119889119909 119889119910 (32)
where 119901 and 119902 are nonnegative integers and 119895 = radicminus1 Someadvantages of the complex moments can be described asfollows
(1) When the central complex moments are taken as thefeatures the effects of the imagersquos lateral displacementcan be eliminated
(2) A set of complex moment invariants can also bederived which are invariant to the rotation of theobject
(3) Since the complex moment is an intermediate stepbetween ordinary moments and moment invariantsit is relatively more simple to compute and morepowerful than other moment features in any patternclassification problem
The complex moments of order (119901 119902) are a linear combina-tion with complex coefficients of all the geometric moments119872119899119898 satisfying 119901 + 119902 = 119899 + 119898 In polar coordinates the
complex moments of order (119901 + 119902) can be written as follows
119862119897
119899= 119862119901119902
= int2120587
0
int+infin
0
120588119901+119902
119890119895(119901minus119902)120579
119891 (120588 cos 120579 120588 sin 120579) 120588 119889120588 119889120579(33)
where119901+119902 = 119899 and119901minus119902 = 119897denote the order and repetition ofthe complex moments respectively If the complex momentof the original image and that of the rotated image in thesame polar coordinates are denoted by 119862
119901119902and 119862
119903
119901119902 the
relationship [33] between them is given as follows
119862119903
119901119902= 119862119901119902119890minus119895(119901minus119902)120579
(34)
where 120579 is the angle at which the original image is rotatedThecomplex moment features represent the invariant propertiesto lateral displacement and rotation Based on the definitionof moment invariants we know that as the image is rotatedeach complex moment goes through all possible phasesof a complex number while its magnitude |119862
119901119902| remains
unchanged If the exponential factor of the complex momentis canceled out we will obtain its absolute invariant valuewhich is invariant to the rotation of the images The rotationinvariant complex moment features (F65ndashF130) and theircorresponding numbers from order 0 to order 10 used for thepresent work are listed in Table 2
Finally a feature vector consisting of 130 moment basedfeatures is calculated from each of the handwritten numeralimages belonging to five different scripts Summarization ofthe overall moment based feature set used in the present workis enlisted in Table 3
4 Experimental Study and Analysis
In this section we present the detailed experimental resultsto illustrate the suitability of moment based approach tohandwritten digit recognition All the experiments are imple-mented inMATLAB 2010 under aWindows XP environmenton an Intel Core2 Duo 24GHz processor with 1 GB of RAMand performed on gray-scale digit imagesThe accuracy usedas assessment criteria for measuring the performance of theproposed system is expressed as follows
Accuracy Rate () =Number of Correctly classified digits
Total number of digitstimes 100 (35)
Applied Computational Intelligence and Soft Computing 9
Table 2 List of complex moments and their corresponding numbers of features from order 0 to order 10
Order(119899) Complex moments (119862
119901119902) Complex moments of order 119899 with repetition 119897 (119862119897
119899) Total number of moments
up to order 100 119862
00 1198620
0
66
1 11986210 11986201 119862
1
1 119862minus1
1
2 11986220 11986211 11986202 119862
2
2 1198620
2 119862minus2
2
3 11986230 11986221 11986212 11986203 119862
3
3 1198621
3 119862minus1
3119862minus3
3
4 11986240 11986231 11986222 11986213 11986204 119862
4
4 1198622
4 1198620
4 119862minus2
4 119862minus4
4
5 11986250 11986241 11986232 11986223 11986214 11986205 119862
5
5 1198623
5 1198621
5 119862minus1
5 119862minus3
5 119862minus5
5
6 11986260 11986251 11986242 11986233 11986224 11986215 11986206 119862
6
6 1198624
6 1198622
6 1198620
6 119862minus2
6 119862minus4
6 119862minus6
6
7 11986270 11986261 11986252 11986243 11986234 11986225 11986216 11986207 119862
7
7 1198625
7 1198623
7 1198621
7 119862minus1
7 119862minus3
7 119862minus4
7 119862minus7
7
8 11986280 11986271 11986262 11986253 11986244 11986235 11986226 11986217 11986208 119862
8
8 1198626
8 1198624
8 1198622
8 1198620
8 119862minus2
8 119862minus4
8 119862minus6
8 119862minus8
8
9 11986290 11986281 11986272 11986263 11986254 11986245 11986236 11986227 11986218 11986209 119862
9
9 1198627
9 1198625
9 1198623
9 1198621
9 119862minus1
9 119862minus3
9 119862minus5
9 119862minus7
9 119862minus9
9
10 119862100 11986291 11986282 11986273 11986264 11986255 11986246 11986237 11986228 11986219 119862010 119862
10
10 1198628
10 1198626
10 1198624
10 1198622
10 1198620
10 119862minus2
10 119862minus4
10 119862minus6
10 119862minus8
10 119862minus10
10
Table 3 Description of feature vector
Serialnumber List of moments Number of
features1 Geometric moment (F1ndashF5) 52 Moment invariant (F6ndashF12) 73 Affine moment invariant (F13ndashF18) 64 Legendre moment (F19ndashF28) 105 Zernike moment (F29ndashF64) 366 Complex moment (F65ndashF130) 66
Total 130
41 Detailed Dataset Description Handwritten numeralsfrom five different popular scripts namely Indo-ArabicBangla Devanagari Roman and Telugu are used in theexperiments for investigating the effectiveness of themomentbased feature sets as compared to conventional features Indo-Arabic or Eastern-Arabic is widely used in the Middle-Eastand also in the Indian subcontinent On the other handDevanagari and Bangla are ranked as the top two popular (interms of the number of native speakers) scripts in the Indiansubcontinent [34] Roman originally evolved from the Greekalphabet is spoken and used all over the world Also Teluguone of the oldest andpopular South Indian languages of Indiais spoken by more than 74 million people [34] It essentiallyranks third by the number of native speakers in India
The present approach is tested on the database named asCMATERdb3 where CMATER stands for Center for Micro-processor Application for Training Education and Researcha research laboratory at Computer Science and EngineeringDepartment of Jadavpur University India where the currentresearch activity took place db stands for database andthe numeric value 3 represents handwritten digit recogni-tion database stored in the said database repository Thetesting is currently done on four versions of CMATERdb3namely CMATERdb311 CMATERdb321 CMATERdb331and CMATERdb341 representing the databases created forhandwritten digit recognition system for four major scriptsnamely BanglaDevanagari Indo-Arabic and Telugu respec-tively
Each of the digit images are first preprocessed using basicoperations of skew corrections and morphological filtering[25] and then binarized using an adaptive global thresholdvalue computed as the average of minimum and maximumintensities in that imageThe binarized digit images may con-tain noisy pixels which have been removed by using Gaussianfilter [25] A well-known algorithm known as Canny EdgeDetection algorithm [25] is then applied for smoothing theedges of the binarized digit images Finally the boundingrectangular box of each digit image is separately normalizedto 32 times 32 pixels Database is made available freely in theCMATER website (httpwwwcmaterjuorgcmaterdbhtm)and at httpcodegooglecompcmaterdb
A dataset of 3000 digit samples is considered for each ofthe Devanagari Indo-Arabic and Telugu scripts For each ofthese datasets 2000 samples are used for training purposeand the rest of the samples are used for the test purposewhereas a dataset of 6000 samples is used by selecting 600samples for each of 10-digit classes of handwritten Bangladigits A training set of 4000 samples and a test set of 2000samples are then chosen for Bangla numerals by consideringequal number of digit samples from each class For Romannumerals a dataset of 6000 training samples is formed byrandom selection from the standard handwritten MNIST [7]training dataset of size 60000 samples In the same way4000 digit samples are selected from MNIST test dataset ofsize 10000 samples These digit samples are enclosed in aminimum bounding square and are normalized to 32 times 32pixels dimension Typical handwritten digit samples takenfrom the abovementioned databases used for evaluating thepresent work are shown in Figure 3
42 Recognition Process To realize the effectiveness of theproposed approach our comprehensive experimental testsare conducted on the five aforementioned datasets A totalof 6000 (for Devanagari Indo-Arabic and Telugu scripts)numerals have been used for the training purpose whereasthe remaining 3000 numerals (1000 from each of the script)have been used for the testing purpose For Bangla andRoman scripts a total of 8000 numerals (4000 taken from
10 Applied Computational Intelligence and Soft Computing
(a) (b)
(c) (d)
(e)
Figure 3 Samples of digit images taken from CMATER and MNIST databases written in five different scripts (a) Indo-Arabic (b) Bangla(c) Devanagari (d) Roman and (e) Telugu
each script) have been used for the training purpose whereasthe remaining 4000 numerals (2000 taken from each script)have been used for the testing purpose The designed featureset has been individually applied to eight well-known classi-fiers namely Naıve Bayes Bayes Net MLP SVM RandomForest Bagging Multiclass Classifier and Logistic For thepresent work the following abovementioned classifiers withthe given parameters are designed
Naıve Bayes Naıve Bayes classifier for details refer to[35]
Bayes Net Estimator = SimpleEstimator-A 05 searchalgorithm = K2MLP Learning Rate = 03Momentum= 02 Numberof Epochs = 1000 minerror = 002SVM Support Vector Machine using radial basiskernel with (119901 = 1) for details refer to [36]Random Forest Ensemble classifier that consists ofmany decision trees and outputs the class that is themode of the classes output by individual trees fordetails refer to [37]
Applied Computational Intelligence and Soft Computing 11
Table 4 Recognition accuracies of eight classifiers and their corresponding ranks using 12 different datasets (ranks in the parentheses areused for performing the Friedman test)
DatasetsRecognition accuracy ()
ClassifiersNaıve Bayes Bayes Net MLP SVM Random Forest Bagging Multiclass Classifier Logistic
1 86 (8) 92 (7) 100 (1) 99 (25) 97 (6) 99 (25) 98 (45) 98 (45)
2 99 (3) 98 (55) 100 (1) 99 (3) 96 (75) 96 (75) 97 (55) 99 (3)
3 98 (5) 94 (7) 100 (1) 93 (8) 99 (25) 98 (5) 99 (25) 98 (5)
4 93 (8) 94 (7) 99 (15) 98 (35) 98 (35) 97 (5) 96 (6) 99 (15)
5 91 (8) 92 (7) 100 (15) 99 (35) 99 (35) 97 (6) 98 (5) 100 (15)
6 99 (25) 98 (45) 100 (1) 96 (7) 97 (6) 98 (45) 99 (25) 95 (8)
7 91 (8) 94 (7) 100 (1) 99 (3) 99 (3) 99 (3) 98 (55) 98 (55)
8 91 (8) 96 (7) 100 (1) 98 (45) 98 (45) 97 (6) 99 (25) 99 (25)
9 92 (8) 94 (7) 100 (15) 100 (15) 97 (6) 99 (4) 99 (4) 99 (4)
10 99 (25) 97 (65) 100 (1) 99 (25) 97 (65) 97 (65) 98 (4) 97 (65)
11 94 (8) 96 (7) 100 (1) 98 (5) 99 (3) 98 (5) 99 (3) 99 (3)
12 98 (4) 98 (4) 100 (1) 97 (7) 98 (4) 99 (2) 97 (7) 97 (7)
Mean rank R1 = 608 R2 = 637 R3 = 1125 R4 = 425 R5 = 467 R6 = 475 R7 = 433 R8 = 433
Bagging Bagging Classifier for detail refer to [38]Multiclass Classifier Method = ldquo1 against allrdquo ran-domWidthFactor = 20 seed = 1Logistic LogitBoost is used with simple regressionfunctions as base learner for details refer to [39]
The design parameters of classifiers are chosen as typicalvalues used in the literature or by experience The classifiersare not specifically tuned for the dataset at hand eventhough they may achieve a better performance with anotherparameter set since the goal is to design an automatedhandwritten digit recognition system based on the chosen setof classifiers
The digit recognition performances of the present tech-nique using each of these classifiers and their correspondingsuccess rates achieved at 95 confidence level are shownin Figures 4(a)-4(b) respectively It can be seen fromFigure 4 that the highest digit recognition accuracy hasbeen achieved by the MLP classifier which are found to be993 995 9892 9977 and 988 on Indo-ArabicBangla Devanagari Roman and Telugu scripts respectivelyThe performance analysis involves two parameters namelyModel Building Time (MBT) and Recognition Time (RT)MBT is based on the time required to train the system onthe given training samples whereas RT is based on the timerequired to recognize the given test samples The MBT andRT required for the abovementioned classifiers on all the fivedatabases are shown in Figures 5(a)-5(b)
43 Statistical Significance Tests The statistical significancetest is one of the essential ways for validating the performanceof themultiple classifiers usingmultiple datasets To do so wehave performed a safe and robust nonparametric Friedmantest [40] with the corresponding post hoc tests on Indo-Arabic script database For the present experimental setupthe number of datasets (119873) and the number of classifiers (119896)are set as 12 and 8 respectively These datasets are chosenrandomly from the test setTheperformances of the classifiers
on different datasets are shown in Table 4 On the basis ofthese performances the classifiers are then ranked for eachdataset separately the best performing algorithm gets therank 1 the second best gets rank 2 and so on (see Table 4)In case of ties average ranks are assigned to the classifiers tobreak the tie
Let 119903119894119895be the rank of the 119895th classifier on 119894th datasetThen
the mean of the ranks of the 119895th classifier over all the 119873datasets will be computed as follows
119877119895=1
119873
119873
sum119894=1
119903119894
119895 (36)
The null hypothesis states that all the classifiers are equivalentand so their ranks 119877
119895should be equal To justify it the
Friedman statistic [40] is computed as follows
1205942
119865=
12119873
119896 (119896 + 1)[
[
sum119895
1198772
119895minus119896 (119896 + 1)
2
4]
]
(37)
Under the current experimentation this statistic is dis-tributed according to 1205942
119865with 119896 minus 1 (=7) degrees of freedom
Using (37) the value of 1205942119865is calculated as 3046 From the
table of critical values (see any standard statistical book) thevalue of 1205942
119865with 7 degrees of freedom is 140671 for 120572 = 005
(where 120572 is known as level of significance) It can be seen thatthe computed 1205942
119865differs significantly from the standard 1205942
119865
So the null hypothesis is rejectedSingh et al [40] derived a better statistic using the
following formula
119865119865=
(119873 minus 1) 1205942
119865
119873(119896 minus 1) minus 1205942119865
(38)
119865119865is distributed according to the 119865-distribution with 119896 minus 1
(=7) and (119896minus1)(119873minus1) (=77) degrees of freedom Using (38)the value of 119865
119865is calculated as 80659 The critical value of 119865
(7 77) for 120572 = 005 is 2147 (see any standard statistical book)
12 Applied Computational Intelligence and Soft Computing
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
8486889092949698
100
Reco
gniti
on ac
cura
cy (
)
(a)
Classifiers
ArabicBanglaDevanagari
TeluguRoman
888990919293949596979899
100
95
confi
denc
e sco
re (
)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(b)
Figure 4 Graph showing (a) recognition accuracies and (b) 95 confidence scores of the proposed handwritten digit recognition techniqueusing eight well-known classifiers on digits of five different scripts
05
10152025303540
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Mod
el b
uild
ing
time (
s)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(a)
0
1
2
3
4
5
6
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Reco
gniti
on ti
me (
s)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(b)
Figure 5 Graphical comparison of (a) MBTs and (b) RTs required by eight different classifiers on all the five databases for handwritten digitrecognition
which shows a significant difference between the standardand calculated values of 119865
119865 Thus both Friedman and Iman
et al statistics reject the null hypothesisAs the null hypothesis is rejected a post hoc test known
as the Nemenyi test [40] is carried out for pairwise com-parisons of the best and worst performing classifiers Theperformances of two classifiers are significantly different ifthe corresponding average ranks differ by at least the criticaldifference (CD) which is expressed as follows
CD = 119902120572
radic119896 (119896 + 1)
6119873 (39)
For the Nemenyi test the value of 119902005
for eight classifiersis 3031 (see Table 5(a) of [41]) So the CD is calculated as3031radic89612 that is 3031 using (39) Since the differencebetween mean ranks of the best and worst classifier is muchgreater than the CD (see Table 3) we can conclude that thereis a significant difference between the performing abilitiesof the classifiers For comparing all classifiers with a controlclassifier (say MLP) we have applied the Bonferroni-Dunntest [40] For this test CD is calculated using the same (39)But here the value of 119902
005for eight classifiers is 2690 (see
Table 5(b) of [41]) So the CD for the Bonferroni-Dunntest is calculated as 2690radic89612 that is 2690 As the
Applied Computational Intelligence and Soft Computing 13
49555245
3125 3545 36253205 3205
3031
Differences of mean ranksDifference of mean ranksCD
0
1
2
3
4
5
6M
ean
rank
s
|R3minusR1|
|R3minusR2|
|R3minusR4|
|R3minusR5|
|R3minusR6|
|R3minusR7|
|R3minusR8|
Pairwise comparison of classifiers for the Nemenyi test
(a)
49555245
3125 3545 36253205 3205
269
Differences of mean ranksDifference of mean ranksCD
|R3minusR1|
|R3minusR2|
|R3minusR4|
|R3minusR5|
|R3minusR6|
|R3minusR7|
|R3minusR8|0
1
2
3
4
5
6
Mea
n ra
nks
Comparison of classifiers with MLP for the Bonferroni-Dunn test
(b)
Figure 6 Graphical representation of comparison of multiple classifiers for (a) the Nemenyi test and (b) the Bonferroni-Dunn test
difference between the mean ranks of any classifier and MLPis always greater than CD (see Table 3) the chosen controlclassifier performs significantly better than other classifiersfor Indo-Arabic database A graphical representation of theabovementioned post hoc tests for comparison of eight dif-ferent classifiers on Dataset 1 is shown in Figure 6 Similarlyit can also be shown for Bangla Devanagari Roman andTelugu databases that the chosen classifier (MLP) performssignificantly better than the other seven classifiers
44 Comparison among Moment Based Features For thejustification of the feature set used in the present workthe diverse combinations of six different types of momentsnamely geometric moment (F1ndashF5) moment invariant(F6ndashF12) affine moment invariant (F13ndashF18) Legendremoment (F19ndashF28) Zernike moment (F29ndashF64) and com-plex moment (F65ndashF130) are compared by considering allthe possible combinations This is done for measuring thediscriminating strength of the individual moment featuresand their combinations based on their complementary infor-mation These can be listed as follows
(a) Geometric moment + moment invariant + affineMoment invariant (F1ndashF18)
(b) Legendre moment (F19ndashF28)(c) Geometric moment + moment invariant + affine
moment invariant + Legendre moment (F1ndashF28)(d) Zernike moment (F29ndashF64)(e) Geometric moment + moment invariant + affine
moment invariant + Legendre moment + Zernikemoment (F1ndashF64)
(f) Legendre moment + Zernike moment (F19ndashF64)(g) Complex moment (F65ndashF130)(h) Zernike moment + complex moment (F29ndashF130)
86
88
90
92
94
96
98
100
Feature set and all possible combinations
ArabicBanglaDevanagari
RomanTelugu
Reco
gniti
on ac
cura
cy (
)
F1
ndashF18
F1
ndashF64
F19
ndashF28
F1
ndashF28
F29
ndashF64
F19
ndashF64
F65
ndashF130
F29
ndashF130
F1
ndashF130
Figure 7Graphical comparison showing the recognition accuraciesof all the possible combinations of moment based features achievedby MLP classifier
(i) Geometric moment + moment invariant + affinemoment invariant + Legendre moment + Zernikemoment + complex moment (F1ndashF130)
The graphical comparison of the corresponding numeralrecognition accuracies achieved by MLP classifier over thesame test set is shown in Figure 7 It can be observed fromFigure 7 that the present combination of moment feature setoutperforms all the other possible combinations
45 Detail Evaluation of MLP Classifier In the present workdetailed error analysis with respect to different parametersnamely Kappa statistics mean absolute error (MAE) root
14 Applied Computational Intelligence and Soft Computing
Table 5 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Indo-Arabic numerals (here MAE means mean absolute error RMSE means root mean square error TPR means True Positiverate FPR means False Positive rate MCC means Matthews Correlation Coefficient and AUC means Area under ROC)
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09922 01758 0293
1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0990 0001 0990 0990 0990 0989 1000ldquo2rdquo 0960 0002 0980 0960 0970 0966 0990ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0004 0961 0990 0975 0973 0999ldquo7rdquo 0990 0000 1000 0990 0995 0994 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09922 01758 0293 0993 00007 09931 0993 0993 09922 09989
Table 6 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Bangla numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09944 00535 0115
1000 0001 0990 1000 0995 0994 1000ldquo1rdquo 1000 0002 0980 1000 0990 0989 1000ldquo2rdquo 1000 0001 0990 1000 0995 0994 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0990 0001 0990 0990 0990 0989 1000ldquo6rdquo 0970 0000 1000 0970 0985 0983 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09944 00535 0115 0995 00005 0995 0995 0995 09943 1000
Table 7 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Devanagari numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
0988 00342 00847
0969 0002 0984 0969 0977 0974 0999ldquo1rdquo 0969 0000 1000 0969 0984 0983 1000ldquo2rdquo 1000 0005 0956 1000 0977 0975 1000ldquo3rdquo 0985 0003 0970 0985 0977 0975 0999ldquo4rdquo 1000 0002 0985 1000 0992 0992 1000ldquo5rdquo 0985 0000 1000 0985 0992 0991 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 0985 0000 1000 0985 0992 0991 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 0988 00342 00857 09893 00012 09895 09893 09891 09881 09998
mean square error (RMSE) True Positive rate (TPR) FalsePositive rate (FPR) precision recall 119865-measure MatthewsCorrelationCoefficient (MCC) andArea under ROC (AUC)is computed Tables 5ndash9 provide the said statistical mea-surements for handwritten numeral recognition written inIndo-Arabic Bangla Devanagari Roman and Telugu scriptsrespectively
5 Conclusion
India is a multilingual and multiscript country comprisingof 12 different scripts But there are not much competentworks done towards handwritten numeral recognition ofIndic scripts The following issues are observed with hand-written digit recognition system (1) mostly they have worked
Applied Computational Intelligence and Soft Computing 15
Table 8 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Roman numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09975 016 02716
1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0995 0001 0995 0995 0995 0994 0999ldquo2rdquo 1000 0000 1000 1000 1000 1000 1000ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0995 0000 1000 0995 0997 0997 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 0994 0001 0989 0994 0991 0990 0999ldquo9rdquo 0994 0001 0994 0994 0994 0994 0999Mean 09975 016 02716 09978 00003 09978 09978 09977 09975 09997
Table 9 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Telugu numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09867 00051 00449
0980 0001 0990 0980 0985 0983 0988ldquo1rdquo 0970 0002 0980 0970 0975 0972 0991ldquo2rdquo 1000 0002 0980 1000 0990 0989 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 0999ldquo4rdquo 0990 0002 0980 0990 0985 0983 0998ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0003 0971 0990 0980 0978 0995ldquo7rdquo 0990 0002 0980 0990 0985 0983 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 0970 0000 1000 0970 0985 0983 0994Mean 09867 00051 00449 0988 00012 09881 0988 0988 09865 09965
on limited dataset (2) Training and testing times are notmentioned in most of the works (3) Most of the works havebeen done for Roman because of the availability of largerdataset like MNIST (4) Recognition systems for Indic scriptsare mainly focused on single script (5) Limitation to somefeature extraction methods also exist that is they are localto a particular scriptlanguage rather having a global scopeIn this work we have verified the effectiveness of a momentbased approach to handwritten digit recognition problemthat includes geometric moment moment invariant affinemoment invariant Legendre moment Zernike moment andcomplex moment The present scheme has been tested forfive different popular scripts namely Indo-Arabic BanglaDevanagari Roman and Telugu These methods have beenevaluated on the CMATER and MNIST databases usingmultiple classifiers FinallyMLP classifier is found to producethe highest recognition accuracies of 993 995 98929977 and 988 on Indo-Arabic Bangla DevanagariRoman and Telugu scripts respectively The results havedemonstrated that the application ofmoment based approachleads to a higher accuracy compared to its counterpartsAmong the most important ones an advantage of thisfeature extraction algorithm is that it is less computationally
expensive where the most of the published works need morecomputation time These features are also very simple toimplement compared to other methods It is obvious thatto improve the performance of proposed system furtherwe need to investigate more the sources of errors Potentialmoment features other than the presented ones may alsoexist
To further improve the performance possible futureworks are as follows (1) although the moment based featuresperform superbly on the whole complementary featureslike concavity analysis may help in discriminating confusingnumerals For example Indo-Arabic numerals ldquo2rdquo and ldquo3rdquo canbetter be separated by considering the original size beforenormalization (2) For classifier design it is better to selectmodel parameters (classifier structures) by cross validationrather than empirically as done in our experiments (3)Combining multiple classifiers can improve the recognitionaccuracy
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
16 Applied Computational Intelligence and Soft Computing
Acknowledgments
The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India
References
[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015
[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003
[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005
[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997
[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007
[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008
[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998
[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007
[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009
[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010
[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011
[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011
[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011
[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003
[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012
[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013
[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013
[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014
[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014
[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008
[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014
[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014
[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014
[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015
[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992
[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921
[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962
[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004
Applied Computational Intelligence and Soft Computing 17
[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005
[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996
[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990
[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984
[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985
[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions
in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995
[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001
[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001
[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996
[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992
[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014
[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006
Submit your manuscripts athttpwwwhindawicom
Computer Games Technology
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Distributed Sensor Networks
International Journal of
Advances in
FuzzySystems
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014
International Journal of
ReconfigurableComputing
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied Computational Intelligence and Soft Computing
thinspAdvancesthinspinthinsp
Artificial Intelligence
HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014
Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Electrical and Computer Engineering
Journal of
Journal of
Computer Networks and Communications
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation
httpwwwhindawicom Volume 2014
Advances in
Multimedia
International Journal of
Biomedical Imaging
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
ArtificialNeural Systems
Advances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
RoboticsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Computational Intelligence and Neuroscience
Industrial EngineeringJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Human-ComputerInteraction
Advances in
Computer EngineeringAdvances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
2 Applied Computational Intelligence and Soft Computing
Roman and Urdu In a multilingual country like India it is acommon scenario that a document like job application formrailway ticket reservation form and so forth is composed oftext contents written in different languagesscripts in orderto reach a larger cross section of people The variation ofdifferent scripts may be in the form of numerals or alphanumerals in a single document page But the techniquesdeveloped for text identification generally do not incorporatethe recognition of digitsThis is because the features requiredfor the text identification may not be applicable for identify-ing the digits
The paper is organized as follows Section 2 presents abrief review of some of the previous approaches to hand-written digit recognition whereas in Section 3 we introduceour script independent handwritten digit recognition systemSection 4describes the performance of our systemon realisticdatabases of handwritten digits and finally Section 5 con-cludes the paper
2 Review of Related Works
Gorgevik and Cakmakov [3] developed Support VectorMachine (SVM) based digits recognition system for hand-written Roman numerals They extracted four types of fea-tures from each digit image (1) projection histograms (2)contour profiles (3) ring-zones and (4) Kirsch featuresTheyreported 9727 recognition accuracy on National Instituteof Standards and Technology (NIST) handwritten digitsdatabase [4] In [5] Chen et al proposed max-min poste-rior pseudoprobabilities framework for Roman handwrittendigit recognition They extracted 256 dimension directionalfeatures from the input image Finally these features weretransformed into a set of 128 features using Principal Com-ponent Analysis (PCA) They reported recognition accuracyof 9876 on NIST database [4] Labusch et al [6] describeda sparse coding based feature extraction method with SVMas a classifier They found recognition accuracy of 9941on MNIST (Modified NIST) handwritten digits database [7]The work described in [8] combined three recognizers bymajority vote and one of them is based on Kirsch gradient(four orientations) dimensionality reduction by PCA andclassification by SVM They achieved an accuracy rate of9505 with 093 error on 10000 test samples of MNISTdatabase [7] Mane and Ragha [9] performed handwrittendigit recognition using elastic image matching techniquebased on eigendeformation which is estimated by the PCAof actual deformations automatically selected by the elasticmatching They achieved an overall accuracy of 9491 ontheir owndatabase collected fromdifferent individuals of var-ious professions for the experiment Cruz et al [10] presenteda handwritten digit recognition system which uses multiplefeature extraction methods and classifier ensemble A totalof six feature extraction algorithms namely MultizoningModified Edge Maps Structural Characteristics ProjectionsConcavities Measurements and Gradient Directional wereevaluated in this paper A scheme using neural networks as acombiner achieved a recognition rate of 9968 on a trainingset of 60000 images and a test set of 10000 images of MNISTdatabase
Dhandra et al [11] investigated a script independentautomatic numeral recognition system for recognition ofKannada Telugu and Devanagari handwritten numerals Inthe proposed method 30 classes were reduced to 18 classesby extracting the global and local structural features likedirectional density estimation water reservoirs maximumprofile distances and fill-hole density Finally a probabilisticneural network (PNN) classifier was used for the recognitionsystem which yielded an accuracy of 9720 on a totalof 2550 numeral images written in Kannada Telugu andDevanagari scripts In [12] Yang et al proposed supervisedmatrix factorization method used directly as multiclass clas-sifier They reported recognition accuracy of 9871 withsupervised learning approach on MNIST database [7] In[13] a mixture of multiclass logistic regression models wasdescribed They claimed recognition accuracy of 98 onthe Indian digit database provided by CENPARMI [14] Daset al [15] described a technique for creating a pool of localregions and selection of an optimal set of local regions fromthat pool for extracting optimal discriminating informationfor handwritten Bangla digit recognition Genetic algorithm(GA) was then applied on these local regions to samplethe best discriminating features The features extracted fromthese selected local regions were then classified with SVMand recognition accuracy of 97 was achieved In [16] awavelet analysis based technique for feature extraction wasreported For classification SVM and k-Nearest Neighbor (k-NN) were used and an overall recognition accuracy of 9704was reported on MNIST digit database [7] A comparativestudy in [17] was conducted by training the neural networkusing Backpropagation (BP) algorithm and further usingPCA for feature extraction Digit recognition was finallycarried out using 13 algorithms neural network algorithmand the Fisher Discriminant Analysis (FDA) algorithm TheFDA algorithm proved less efficient with an overall accuracyof 7767 whereas the BP algorithm with PCA for its featureextraction gave an accuracy of 912
In [18] a set of structural features (namely number ofholes water reservoirs in four directions maximum profiledistances in four directions and fill-hole density) and k-NNclassifier were employed for classification and recognition ofhandwritten digits They reported recognition accuracy of9694 on 5000 samples of MNIST digit database [7] In[19] AlKhateeb and Alseid proposed an Arabic handwrittendigit recognition system using Dynamic Bayesian NetworkThey employed DCT coefficients based features for classifi-cation The system was tested on Indo-Arabic digits database(ADBase) which contains 70000 Indo-Arabic digits [20] andan average recognition accuracy of 8526 was achieved on10000 samples Ebrahimzadeh and Jampour [21] proposedan appearance feature-based approach using Histogram ofOriented Gradients (HOG) for handwritten digit recogni-tion A linear SVM was then used for classification of thedigits in MNIST dataset and an overall accuracy of 9725had been realized Gil et al [22] presented a novel approachusing SVM binary classifiers and unbalanced decision treesTwo classifiers were proposed in this study where one usedthe digit characteristics as input and the other used thewhole image as such It is observed that a handwritten
Applied Computational Intelligence and Soft Computing 3
digit recognition accuracy of 100 was achieved on MNISTdatabase using the whole image as input El Qacimy et al[23] investigated the effectiveness of four feature extractionapproaches based on Discrete Cosine Transform (DCT)namely DCT upper left corner (ULC) coefficients DCTzigzag coefficients block based DCT ULC coefficients andblock based DCT zigzag coefficients The coefficients of eachDCT variant were used as input data for SVM classifier andit was found that block based DCT zigzag feature extractionyielded a superior recognition accuracy of 9876 onMNISTdatabase AL-Mansoori [24] implemented aMLP classifier torecognize and predict handwritten digits A dataset of 5000samples were obtained from MNIST database and an overallaccuracy of 9932 was achieved
From the above literature it is clear thatmost of the workshave been done for the Roman script whereas relatively fewworks [11 15 19] have been reported for the digit recognitionwritten in Indic scripts The main reasons for this slowprogress could be attributed to the complexity of the shapeof Indic scripts as opposed to Roman script Again thediscriminating power of the features exploited till now isnot easily measurable investigative experimentations will benecessary for identifying new feature descriptors for effec-tive classification of complex handwritten digits of differentscripts It is also revealed that the methods described inthe literature suffer from larger computational time mainlydue to feature extraction from large dataset In addition theabove recognition systems fail to meet the desired accuracywhen exposed to different multiscript scenario Hence itwould be beneficial for multilingual country like India ifthere is a method which is independent of script and yieldsreasonable recognition accuracy This has motivated us tointroduce a script invariant handwritten digit recognitionsystem for identifying digits written in five popular scriptsnamely Indo-ArabicBanglaDevanagariRoman andTeluguThe key module of the proposed methodology is shown inFigure 1
3 Feature Extraction Methodology
One of the basic problems in the design of any patternrecognition system is the selection of a set of appropriatefeatures to be extracted from the object of interest Researchon the utilization of moments for object characterizationin both invariant and noninvariant tasks has received con-siderable attention in recent years Describing digit imageswith moments instead of other more commonly used patternrecognition features (described in [21ndash23]) means that globalproperties of the digit image are used rather than localproperties So for the present work we considered amomentbased approach which is described in the next subsection
31 Moments Moments are pure statistical measure of pixeldistribution around the center of gravity of the image andallow capturing global shapes information [25]Theydescribenumerical quantities at some distance from a referencepoint or axis Moments are commonly used in statisticsto characterize the distribution of random variables and
Handwritten digit images
Feature extraction using moment based features
Classification of digits using multiple classifiers
Statistical significance tests for comparison of multipleclassifiers using multiple datasets
Selection of appropriate classifier
Detailed performance evaluation of chosen classifierusing training and test sets
Figure 1 Schematic diagram illustrating the key modules of theproposed methodology
similarly in mechanics to characterize bodies by their spatialdistribution of mass
A complete characterization of moment functional over aclass of univariate functions was given by Hausdorff [26] in1921
Let 120583119899 be a real sequence of numbers and let us define
Δ119898
120583119899=
119898
sum119894=0
(minus1)119894
(119898
119894)120583119899+119894 (1)
Note that Δ119898120583119899can be viewed as the 119898th order derivative of
120583119899By theHausdorff theorem a necessary and sufficient con-
dition that there exists a monotonic function 119865(119909) satisfyingthe system
120583119899= int1
0
119909119899
119889119865 (119909) 119899 = 0 1 2 (2)
is that the system of linear inequalities
Δ119896
120583119899ge 0 119896 = 0 1 2 (3)
should be satisfied that is if 119891(119909) is a positive function (incase of image processing) then the set of functionals
int1
0
119909119899
119891 (119909) 119889119909 119899 = 0 1 (4)
completely characterizes the functionA necessary and sufficient condition that there exists a
function 119865(119909) of bounded variation satisfying (7) is that thesequence
119901
sum119898=0
(119901
119898)1003816100381610038161003816Δ119901minus119898
120583119898
1003816100381610038161003816 119901 = 0 1 2 (5)
4 Applied Computational Intelligence and Soft Computing
should be bounded The use of moments for image analysisis straightforward if we consider a binary or gray level imagesegment as a two-dimensional density distribution functionIt can be assumed that an image can be represented by a real-valued measurable function 119891(119909 119910) In this way momentsmay be used to characterize an image segment and extractproperties that have analogies in statistics and mechanics Inimage processing and computer vision an image momentis a certain particular weighted average (moment) of theimage pixelsrsquo intensities or a function of such momentsusually chosen to have some attractive property or interpre-tation The first significant work considering moments forpattern recognition was performed by Hu [27] He derivedrelative and absolute combinations of moment values thatare invariant with respect to scale position and orientationbased on the theories of invariant algebra that deal with theproperties of certain classes of algebraic expressions whichremain invariant under general linear transformations Sizeinvariant moments are derived from algebraic invariants butcan be shown to be the result of simple size normalizationTranslation invariance is achieved by computing momentsthat have been translated by the negative distance to thecentroid and thus normalized so that the center of mass ofthe distribution is at the origin (central moments)
32 Geometric Moments Geometric moments are defined asthe projection of the image intensity function119891(119909 119910) onto themonomial 119909119901119910119902 [25] The (119901 + 119902)th order geometric moment119872119901119902
of a gray level image 119891(119909 119910) is defined as
119872119901119902= ∬infin
minusinfin
119909119901
119910119902
119891 (119909 119910) 119889119909 119889119910 (6)
where 119901 119902 = 0 1 2 infin Note that the monomial product119909119901
119910119902 is the basis function for this moment definition A set
of 119899 moments consists of all 119872119901119902rsquos for 119901 + 119902 le 119899 that is
the set contains (12)(119899 + 1)(119899 + 2) elements If 119891(119909 119910) ispiecewise continuous and contains nonzero values only ina finite region of the 119909119910-plane then the moment sequence119872119901119902 is uniquely determined by 119891(119909 119910) and conversely
119891(119909 119910) is uniquely determined by 119872119901119902 Considering the
fact that an image segment has finite area or in the worst caseis piecewise continuous moments of all orders exist and acomplete moment set can be computed and used uniquely todescribe the information contained in the image Howeverobtaining all the information contained in the image requiresan infinite number of moment values Therefore to selecta meaningful subset of the moment values that containsufficient information to characterize the image uniquely fora specific application becomes very important In case of adigital image of size 119872 times 119873 the double integral in (6) isreplaced by a summation which turns into this simplifiedform
119898119901119902=
119872
sum119909=1
119873
sum119910=1
119909119901
119910119902
119891 (119909 119910) (7)
where 119901 119902 = 0 1 2 are integersWhen119891(119909 119910) changes by translating rotating or scaling
then the image may be positioned such that its center of mass
(COM) is coincided with the origin of the field of view thatis (119909 = 0) and (119910 = 0) and then the moments computedfor that object are referred to as central moment [25] and it isdesignated by 120583
119901119902 The simplified form of central moment of
order (119901 + 119902) is defined as follows
120583119901119902=
119872
sum119909=1
119873
sum119910=1
(119909 minus 119909)119901
(119910 minus 119910)119902
119891 (119909 119910) (8)
where 119909 = 1198981011989800and 119910 = 119898
0111989800
The pixel point (119909 119910) is theCOMof the imageThe centralmoments 120583
119901119902computed using the centroid of the image are
equivalent to 119898119901119902
whose center has been shifted to centroidof the image Therefore the central moments are invariantto image translations Scale invariance can be obtained bynormalization The normalized central moments denoted by120578119901119902
are defined as
120578119901119902=120583119901119902
120583120574
00
(9)
where 120574 = (119901 + 119902)2 + 1 for (119901 + 119902) = 2 3 The second order moments 120578
02 12057811 12057820 known as the
moments of inertia may be used to determine an importantimage feature called orientation [25] Here the feature valuesF1ndashF3 have been computed from moments of inertia of theword images In general the orientation of an image describeshow the image lies in the field of view or the directions of theprincipal axes In terms of moments the orientation of theprincipal axis 120579 taken as feature value F4 is given by
120579 =1
2tanminus1 (
212058311
12058320minus 12058302
) (10)
where 120579 is the angle of the principal axis nearest to the 119909-axis and is in the range minus1205874 le 120579 le 1205874 The minimumandmaximumdistances (119903min and 119903max) between the centroidand the boundary of an image are also feature descriptorsTheratio 119903max119903min is called elongation or eccentricity (F5) and canbe defined in terms of central moments as follows
119890 =(12058320minus 12058302)2
+ 41205832
11
12058300
(11)
33 Moment Invariants Based on the theory of algebraicinvariants Hu [27] derived relative and absolute combina-tions of moments that are invariant with respect to scaleposition and orientation The method of moment invariantsis derived from algebraic invariants applied to the momentgenerating function under a rotation transformationThe setof absolute moment invariants consists of a set of nonlinearcombinations of central moment values that remain invariantunder rotation A set of seven invariant moments can bederived based on the normalized central moments of order
Applied Computational Intelligence and Soft Computing 5
three that are invariant with respect to image scale transla-tion and rotation Consider
1206011= 12057820+ 12057802
1206012= (12057820minus 12057802)2
+ 41205782
11
1206013= (12057830minus 312057812)2
+ (312057821minus 12057803)2
1206014= (12057830+ 12057812)2
+ (12057821+ 12057803)2
1206015= (12057830minus 312057812) (12057830+ 12057812)
sdot [(12057830+ 12057812)2
minus 3 (12057821+ 12057803)2
] + (312057821minus 12057803)
sdot (12057821+ 12057803) [3 (120578
30+ 12057812)2
minus (12057821+ 12057803)2
]
1206016= (12057820minus 12057802) [(12057830+ 12057812)2
minus (12057821+ 12057803)2
]
+ 412057811(12057830+ 12057812) (12057821+ 12057803)
1206017= (312057821minus 12057803) (12057830+ 12057812)
sdot [(12057830+ 12057812)2
minus 3 (12057821+ 12057803)2
] + (312057812minus 12057830)
sdot (12057821+ 12057803) [3 (120578
30+ 12057812)2
minus (12057821+ 12057803)2
]
(12)
This set of moments is invariant to translation scale changemirroring (within a minus sign) and rotation The 2Dmoment invariant gives seven features (F6ndashF12) which hadbeen used for the current work
34 Affine Moment Invariants The affine moment invariantsare derived to be invariants to translation rotation andscaling of shapes and under 2D Affine transformation Thesix affine moment invariants [28] used for the present workare defined as follows
1198681=
1
120583400
(1205832012058302minus 1205832
11)
1198682=
1
1205831000
(1205832
301205832
03minus 612058330120583211205831212058303+ 4120583301205833
12
+ 4120583031205833
21minus 31205832
211205832
12)
1198683=
1
120583700
(12058320(1205832112058303minus 1205832
12) minus 12058311(1205833012058303minus 1205832112058312)
+ 12058302(1205833012058312minus 1205832
21))
1198684=
1
1205831100
(1205833
201205832
03minus 61205832
20120583111205831212058303minus 61205832
20120583211205830212058303
+ 91205832
20120583021205832
12+ 12120583
201205832
111205830312058321
+ 61205832012058311120583021205833012058303minus 18120583
2012058311120583021205832112058312
minus 81205833
111205830312058330minus 6120583201205832
021205833012058312+ 9120583201205832
021205832
21
+ 121205832
11120583021205833012058312minus 61205832
111205832
021205833012058321+ 1205832
021205832
30)
1198685=
1
120583600
(1205834012058304minus 41205833112058313+ 31205832
22)
1198686=
1
120583900
(120583401205830412058322+ 2120583311205832212058313minus 120583401205832
13minus 120583041205832
31
minus 1205833
22)
(13)
A total of 6 features (F13ndashF18) is extracted from each of thehandwritten digit images for the present work
35 The Legendre Moment The 2D Legendre moment [29]of order (119901 + 119902) of an object with intensity function 119891(119909 119910) isdefined as follows
119871119901119902=(2119901 + 1) (2119902 + 1)
4
sdot ∬+1
minus1
119875119901(119909) 119875119902(119910) 119891 (119909 119910) 119889119909 119889119910
(14)
where the kernel function 119875119901(119909) denotes the 119901th-order
Legendre polynomial and is given by
119875119901(119909) =
119901
sum119896=0
119862119901119896[(1 minus 119909)
119896
+ (minus1)119901
(1 + 119909)119896
] (15)
where
119862119901119896=(minus1)119896
2119896+1
(119901 + 119896)
(119901 minus 119896) (119896)2 (16)
Since the Legendre polynomials are orthogonal over theinterval [minus1 1] [20] a square image of 119873 times 119873 pixels withintensity function 119891(119894 119895) with 1 le 119894 119895 le 119873 must be scaledto be within the region minus1 le 119909 119910 le 1 The graphical plotfor first 10 Legendre polynomials is shown in Figures 2(a)-2(b) When an analog image is digitized to its discrete formthe 2D Legendre moments 119871
119901119902 defined by (14) is usually
approximated by the formula
119871119901119902
=(2119901 + 1) (2119902 + 1)
(119873 minus 1)2
119873
sum119894=1
119873
sum119895=1
119875119901(119909119894) 119875119902(119910119895) 119891 (119909
119894 119910119895)
(17)
where 119909119894= (2119894minus119873minus1)(119873minus1) and 119910
119895= (2119895minus119873minus1)(119873minus1)
and for a binary image 119891(119909119894 119910119895) is given as
119891 (119909119894 119910119895) =
1 if (119894 119895) is in original object
0 otherwise(18)
As indicated by Liao and Pawlak [30] (17) is not a veryaccurate approximation of (14) For achieving better accuracythey proposed to use the following approximated form
119901119902=(2119901 + 1) (2119902 + 1)
4
119873
sum119894=1
119873
sum119895=1
ℎ119901119902(119909119894119910119895) 119891 (119909
119894 119910119895) (19)
6 Applied Computational Intelligence and Soft Computing
Legendre polynomials
minus05 0 05 1minus1
x
Pn(x)
P0(x)
P1(x)
P2(x)
P3(x)
P4(x)
P5(x)
minus1
0
02
04
06
08
1
minus08
minus06
minus04
minus02
(a)
Legendre polynomials
minus05 0 05 1minus1
x
P6(x)
P7(x)
P8(x)
P9(x)
P10(x)
minus1
0
02
04
06
08
1
minus08
minus06
minus04
minus02
Pn(x)
(b)
Figure 2 Graph showing the plots for the two-dimensional Legendre polynomials 119875119899(119909) (a) 119875
1(119909) to 119875
5(119909) and (b) 119875
6(119909) to 119875
10(119909)
where
ℎ119901119902(119909119894 119910119895) = int
119909119894+Δ1199092
119909119894minusΔ1199092
int119910119895+Δ1199102
119910119895minusΔ1199102
119875119901(119909) 119875119901(119910) 119889119909 119889119910 (20)
withΔ119909 = 119909119894minus119909119894minus1
= 2(119873minus1) andΔ119910 = 119910119895minus119910119895minus1
= 2(119873minus1)To evaluate the double integral ℎ
119901119902(119909119894 119910119895) defined by (20)
an alternative extended Simpson rule was proposed by Liaoand Pawlak These values were then used to calculate the2D Legendre moments
119901119902defined by (19) Therefore this
method requires a large number of computing operations Asone can see
119901119902can be expressed with the help of a useful
formula that will be given below as a linear combination of119871119898119899 with 0 le 119898 le 119901 0 le 119899 le 119902A set of 10 Legendre moments (F19ndashF28) can also be
derived based on the set of invariant moments found in theprevious subsection
11987100= 12060100
11987110= (
3
4) 12060110
11987101= (
3
4) 12060101
11987120= (
5
4) [(
3
2) 12060120minus (
1
2) 12060100]
11987102= (
5
4) [(
3
2) 12060102minus (
1
2) 12060100]
11987111= (
9
4) 12060111
11987130= (
7
4) [(
5
2) 12060130minus (
3
2) 12060110]
11987103= (
7
4) [(
5
2) 12060103minus (
3
2) 12060101]
11987121= (
15
4) [(
3
2) 12060121minus (
1
2) 12060101]
11987112= (
15
4) [(
3
2) 12060112minus (
1
2) 12060110]
(21)
36 Zernike Moments Zernike polynomials are orthogonalseries of basis functions normalized over a unit circle Thecomplexity of these polynomials increases with increasingpolynomial order [31] To calculate the Zernikemoments theimage (or region of interest) is first mapped to the unit discusing polar coordinates where the center of the image is theorigin of the unit disc The pixels falling outside the unit discare not considered here The coordinates are then describedby the length of the vector from the origin to the coordinatepoint The mapping from Cartesian to polar coordinates isdefined as follows
119909 = 119903 cos 120579
119910 = 119903 sin 120579(22)
Applied Computational Intelligence and Soft Computing 7
where
119903 = radic1199092 + 1199102
120579 = tanminus1 (119910
119909)
(23)
An important attribute of the geometric representationsof Zernike polynomials is that lower order polynomialsapproximate the global features of the shapesurface whilethe higher ordered polynomials capture local shapesurfacefeatures Zernikemoments are a class of orthogonalmomentsand have been shown to be effective in terms of imagerepresentation
Zernike introduced a set of complex polynomials whichforms a complete orthogonal set over the interior of the unitcircle that is 1199092 + 1199102 = 1 Let the set of these polynomials bedenoted by 119881
119899119898(119909 119910) The form of these polynomials is as
follows
119881119899119898(119909 119910) = 119881
119899119898(119901 120579) = 119877
119899119898(120588) exp (119895119898120579) (24)
where
119899 positive integer or zero
119898 positive and negative integers subject to con-straints 119899 minus |119898| even |119898| le 119899
120588 length of vector from origin to 119891(119909 119910) pixel
120579 angle between vector 120588 and 119909-axis in counterclock-wise direction
As mentioned above the complex Zernike moments of order119899 with repetition 119898 for a continuous image function 119891(119909 119910)are defined as follows
119881119899119898
=119899 + 1
120587∬119891(119909 119910)119881
lowast
119899119898(120588 120579) 119889119909 119889119910 (25)
in the 119909119910 image plane where 1199092 + 1199102 le 1 and lowast indicatesthe complex conjugate Note that for the moments to beorthogonal the image must be scaled within a unit circlecentered at the origin and
119881119899119898
=119899 + 1
120587int2120587
0
int1
0
119891 (120588 120579) 119877119899119898(120588) exp (minus119895119898120579) 120588 119889120588 119889120579
(26)
in polar coordinates The Zernike moment of the rotatedimage in the same coordinates is given by
119881119903
119899119898=119899 + 1
120587
sdot int2120587
0
int1
0
119891 (120588 120579 minus 120572) 119877119899119898(120588) exp (minus119895119898120579) 120588 119889120588 119889120579
(27)
By change of variable 1205791= 120579 minus 120572
119881119903
119899119898=119899 + 1
120587int2120587
0
int1
0
119891 (120588 1205791) 119877119899119898(120588)
sdot exp (minus119895119898 (1205791+ 120572)) 120588 119889120588 119889120579 = [
119899 + 1
120587
sdot int2120587
0
int1
0
119891 (120588 1205791) 119877119899119898(120588) exp (minus119895119898120579
1) 120588 119889120588 119889120579]
sdot exp (minus119895119898120572) = 119881119899119898
exp (minus119895119898120572)
(28)
Equation (28) shows that Zernike moments have simplerotational transformation properties each Zernike momentmerely acquires a phase shift on rotation This simpleproperty leads to the conclusion that the magnitudes ofthe Zernike moments of a rotated image function remainidentical to those before rotationThus the magnitude of theZernike moment |119881
119899119898| can be taken as a rotation invariant
feature of the underlying image function The real-valuedradial polynomial 119877
119899119898(120588) is defined as follows
119877119899119898(120588) =
(119899minus|119898|)2
sum119909=0
(minus1)119904
sdot(119899 minus 119904)
119904 ((119899 + |119898|) 2 minus 119904) ((119899 minus |119898|) 2 minus 119904)120588119899minus2119904
(29)
where 119899 minus |119898| = even and |119898| le 119899Zernike moments may also be derived from conventional
moments 120583119901119902
as follows
119885119899119897=(119899 + 1)
120587
sdot
119899
sum119896=119897
119902
sum119895=0
119897
sum119898=0
(minus119894)119898
(119902
119895)(
119897
119898)119861119899119897119896120583119896minus2119895minus119897+1198982119895+119897minus119898
(30)
Zernikemomentsmay bemore easily derived from rotationalmoments119863
119899119896 by
119885119899119897=
119899
sum119896=119897
119861119899119897119896119863119899119896 (31)
When computing the Zernike moments if the center of apixel falls inside the border of unit disk 119909
2
+ 1199102
le 1this pixel will be used in the computation otherwise thepixel will be discarded Therefore the area covered by themoment computation is not exactly the area of the unitdisk Advantages of Zernike moments can be summarized asfollows
(1) The magnitude of Zernike moment has rotationalinvariant property
(2) They are robust to noise and shape variations to someextent
(3) Since the basis is orthogonal they have minimumredundant information
8 Applied Computational Intelligence and Soft Computing
Table 1 List of Zernike moments and their corresponding numbersof features from order 0 to order 10
Order(119899)
Zernike moments of order 119899with repetition119898 (119881
119899119898)
Total number ofmoments up to order 10
0 11988100
36
1 11988111
2 11988120 11988122
3 11988131 11988133
4 11988140 11988142 11988144
5 11988151 11988153 11988155
6 11988160 11988162 11988164 11988166
7 11988171 11988173 11988175 11988177
8 11988180 11988182 119881841198818611988188
9 11988191 11988193 119881951198819711988199
10 119881100 119881102 119881104 119881106 119881108 1198811010
(4) An image can better be described by a small set of itsZernike moments than any other types of momentssuch as geometric moments
(5) A relatively small set of Zernike moments can char-acterize the global shape of pattern Lower ordermoments represent the global shape of patternwhereas the higher order moments represent thedetails
Therefore we choose Zernike moments as our shape descrip-tor in digit recognition process Table 1 lists the rotationinvariant Zernike moment features (F29ndashF64) and theircorresponding numbers from order 0 to order 10 used for thepresent work
The defined features on the Zernike moments are onlyrotation invariant To obtain scale and translation invariancethe digit image is first subjected to a normalization processusing its regular moments The rotation invariant Zernikefeatures are then extracted from the scale and translationnormalized image
37 Complex Moments The notion of complex moments wasintroduced in [32] as a simple and straightforward techniqueto derive a set of invariant moments The two-dimensionalcomplex moments of order (119901 119902) for the image function119891(119909 119910) are defined by
119862119901119902= int1198862
1198861
int1198872
1198871
(119909 + 119895119910)119901
(119909 minus 119895119910)119902
119891 (119909 119910) 119889119909 119889119910 (32)
where 119901 and 119902 are nonnegative integers and 119895 = radicminus1 Someadvantages of the complex moments can be described asfollows
(1) When the central complex moments are taken as thefeatures the effects of the imagersquos lateral displacementcan be eliminated
(2) A set of complex moment invariants can also bederived which are invariant to the rotation of theobject
(3) Since the complex moment is an intermediate stepbetween ordinary moments and moment invariantsit is relatively more simple to compute and morepowerful than other moment features in any patternclassification problem
The complex moments of order (119901 119902) are a linear combina-tion with complex coefficients of all the geometric moments119872119899119898 satisfying 119901 + 119902 = 119899 + 119898 In polar coordinates the
complex moments of order (119901 + 119902) can be written as follows
119862119897
119899= 119862119901119902
= int2120587
0
int+infin
0
120588119901+119902
119890119895(119901minus119902)120579
119891 (120588 cos 120579 120588 sin 120579) 120588 119889120588 119889120579(33)
where119901+119902 = 119899 and119901minus119902 = 119897denote the order and repetition ofthe complex moments respectively If the complex momentof the original image and that of the rotated image in thesame polar coordinates are denoted by 119862
119901119902and 119862
119903
119901119902 the
relationship [33] between them is given as follows
119862119903
119901119902= 119862119901119902119890minus119895(119901minus119902)120579
(34)
where 120579 is the angle at which the original image is rotatedThecomplex moment features represent the invariant propertiesto lateral displacement and rotation Based on the definitionof moment invariants we know that as the image is rotatedeach complex moment goes through all possible phasesof a complex number while its magnitude |119862
119901119902| remains
unchanged If the exponential factor of the complex momentis canceled out we will obtain its absolute invariant valuewhich is invariant to the rotation of the images The rotationinvariant complex moment features (F65ndashF130) and theircorresponding numbers from order 0 to order 10 used for thepresent work are listed in Table 2
Finally a feature vector consisting of 130 moment basedfeatures is calculated from each of the handwritten numeralimages belonging to five different scripts Summarization ofthe overall moment based feature set used in the present workis enlisted in Table 3
4 Experimental Study and Analysis
In this section we present the detailed experimental resultsto illustrate the suitability of moment based approach tohandwritten digit recognition All the experiments are imple-mented inMATLAB 2010 under aWindows XP environmenton an Intel Core2 Duo 24GHz processor with 1 GB of RAMand performed on gray-scale digit imagesThe accuracy usedas assessment criteria for measuring the performance of theproposed system is expressed as follows
Accuracy Rate () =Number of Correctly classified digits
Total number of digitstimes 100 (35)
Applied Computational Intelligence and Soft Computing 9
Table 2 List of complex moments and their corresponding numbers of features from order 0 to order 10
Order(119899) Complex moments (119862
119901119902) Complex moments of order 119899 with repetition 119897 (119862119897
119899) Total number of moments
up to order 100 119862
00 1198620
0
66
1 11986210 11986201 119862
1
1 119862minus1
1
2 11986220 11986211 11986202 119862
2
2 1198620
2 119862minus2
2
3 11986230 11986221 11986212 11986203 119862
3
3 1198621
3 119862minus1
3119862minus3
3
4 11986240 11986231 11986222 11986213 11986204 119862
4
4 1198622
4 1198620
4 119862minus2
4 119862minus4
4
5 11986250 11986241 11986232 11986223 11986214 11986205 119862
5
5 1198623
5 1198621
5 119862minus1
5 119862minus3
5 119862minus5
5
6 11986260 11986251 11986242 11986233 11986224 11986215 11986206 119862
6
6 1198624
6 1198622
6 1198620
6 119862minus2
6 119862minus4
6 119862minus6
6
7 11986270 11986261 11986252 11986243 11986234 11986225 11986216 11986207 119862
7
7 1198625
7 1198623
7 1198621
7 119862minus1
7 119862minus3
7 119862minus4
7 119862minus7
7
8 11986280 11986271 11986262 11986253 11986244 11986235 11986226 11986217 11986208 119862
8
8 1198626
8 1198624
8 1198622
8 1198620
8 119862minus2
8 119862minus4
8 119862minus6
8 119862minus8
8
9 11986290 11986281 11986272 11986263 11986254 11986245 11986236 11986227 11986218 11986209 119862
9
9 1198627
9 1198625
9 1198623
9 1198621
9 119862minus1
9 119862minus3
9 119862minus5
9 119862minus7
9 119862minus9
9
10 119862100 11986291 11986282 11986273 11986264 11986255 11986246 11986237 11986228 11986219 119862010 119862
10
10 1198628
10 1198626
10 1198624
10 1198622
10 1198620
10 119862minus2
10 119862minus4
10 119862minus6
10 119862minus8
10 119862minus10
10
Table 3 Description of feature vector
Serialnumber List of moments Number of
features1 Geometric moment (F1ndashF5) 52 Moment invariant (F6ndashF12) 73 Affine moment invariant (F13ndashF18) 64 Legendre moment (F19ndashF28) 105 Zernike moment (F29ndashF64) 366 Complex moment (F65ndashF130) 66
Total 130
41 Detailed Dataset Description Handwritten numeralsfrom five different popular scripts namely Indo-ArabicBangla Devanagari Roman and Telugu are used in theexperiments for investigating the effectiveness of themomentbased feature sets as compared to conventional features Indo-Arabic or Eastern-Arabic is widely used in the Middle-Eastand also in the Indian subcontinent On the other handDevanagari and Bangla are ranked as the top two popular (interms of the number of native speakers) scripts in the Indiansubcontinent [34] Roman originally evolved from the Greekalphabet is spoken and used all over the world Also Teluguone of the oldest andpopular South Indian languages of Indiais spoken by more than 74 million people [34] It essentiallyranks third by the number of native speakers in India
The present approach is tested on the database named asCMATERdb3 where CMATER stands for Center for Micro-processor Application for Training Education and Researcha research laboratory at Computer Science and EngineeringDepartment of Jadavpur University India where the currentresearch activity took place db stands for database andthe numeric value 3 represents handwritten digit recogni-tion database stored in the said database repository Thetesting is currently done on four versions of CMATERdb3namely CMATERdb311 CMATERdb321 CMATERdb331and CMATERdb341 representing the databases created forhandwritten digit recognition system for four major scriptsnamely BanglaDevanagari Indo-Arabic and Telugu respec-tively
Each of the digit images are first preprocessed using basicoperations of skew corrections and morphological filtering[25] and then binarized using an adaptive global thresholdvalue computed as the average of minimum and maximumintensities in that imageThe binarized digit images may con-tain noisy pixels which have been removed by using Gaussianfilter [25] A well-known algorithm known as Canny EdgeDetection algorithm [25] is then applied for smoothing theedges of the binarized digit images Finally the boundingrectangular box of each digit image is separately normalizedto 32 times 32 pixels Database is made available freely in theCMATER website (httpwwwcmaterjuorgcmaterdbhtm)and at httpcodegooglecompcmaterdb
A dataset of 3000 digit samples is considered for each ofthe Devanagari Indo-Arabic and Telugu scripts For each ofthese datasets 2000 samples are used for training purposeand the rest of the samples are used for the test purposewhereas a dataset of 6000 samples is used by selecting 600samples for each of 10-digit classes of handwritten Bangladigits A training set of 4000 samples and a test set of 2000samples are then chosen for Bangla numerals by consideringequal number of digit samples from each class For Romannumerals a dataset of 6000 training samples is formed byrandom selection from the standard handwritten MNIST [7]training dataset of size 60000 samples In the same way4000 digit samples are selected from MNIST test dataset ofsize 10000 samples These digit samples are enclosed in aminimum bounding square and are normalized to 32 times 32pixels dimension Typical handwritten digit samples takenfrom the abovementioned databases used for evaluating thepresent work are shown in Figure 3
42 Recognition Process To realize the effectiveness of theproposed approach our comprehensive experimental testsare conducted on the five aforementioned datasets A totalof 6000 (for Devanagari Indo-Arabic and Telugu scripts)numerals have been used for the training purpose whereasthe remaining 3000 numerals (1000 from each of the script)have been used for the testing purpose For Bangla andRoman scripts a total of 8000 numerals (4000 taken from
10 Applied Computational Intelligence and Soft Computing
(a) (b)
(c) (d)
(e)
Figure 3 Samples of digit images taken from CMATER and MNIST databases written in five different scripts (a) Indo-Arabic (b) Bangla(c) Devanagari (d) Roman and (e) Telugu
each script) have been used for the training purpose whereasthe remaining 4000 numerals (2000 taken from each script)have been used for the testing purpose The designed featureset has been individually applied to eight well-known classi-fiers namely Naıve Bayes Bayes Net MLP SVM RandomForest Bagging Multiclass Classifier and Logistic For thepresent work the following abovementioned classifiers withthe given parameters are designed
Naıve Bayes Naıve Bayes classifier for details refer to[35]
Bayes Net Estimator = SimpleEstimator-A 05 searchalgorithm = K2MLP Learning Rate = 03Momentum= 02 Numberof Epochs = 1000 minerror = 002SVM Support Vector Machine using radial basiskernel with (119901 = 1) for details refer to [36]Random Forest Ensemble classifier that consists ofmany decision trees and outputs the class that is themode of the classes output by individual trees fordetails refer to [37]
Applied Computational Intelligence and Soft Computing 11
Table 4 Recognition accuracies of eight classifiers and their corresponding ranks using 12 different datasets (ranks in the parentheses areused for performing the Friedman test)
DatasetsRecognition accuracy ()
ClassifiersNaıve Bayes Bayes Net MLP SVM Random Forest Bagging Multiclass Classifier Logistic
1 86 (8) 92 (7) 100 (1) 99 (25) 97 (6) 99 (25) 98 (45) 98 (45)
2 99 (3) 98 (55) 100 (1) 99 (3) 96 (75) 96 (75) 97 (55) 99 (3)
3 98 (5) 94 (7) 100 (1) 93 (8) 99 (25) 98 (5) 99 (25) 98 (5)
4 93 (8) 94 (7) 99 (15) 98 (35) 98 (35) 97 (5) 96 (6) 99 (15)
5 91 (8) 92 (7) 100 (15) 99 (35) 99 (35) 97 (6) 98 (5) 100 (15)
6 99 (25) 98 (45) 100 (1) 96 (7) 97 (6) 98 (45) 99 (25) 95 (8)
7 91 (8) 94 (7) 100 (1) 99 (3) 99 (3) 99 (3) 98 (55) 98 (55)
8 91 (8) 96 (7) 100 (1) 98 (45) 98 (45) 97 (6) 99 (25) 99 (25)
9 92 (8) 94 (7) 100 (15) 100 (15) 97 (6) 99 (4) 99 (4) 99 (4)
10 99 (25) 97 (65) 100 (1) 99 (25) 97 (65) 97 (65) 98 (4) 97 (65)
11 94 (8) 96 (7) 100 (1) 98 (5) 99 (3) 98 (5) 99 (3) 99 (3)
12 98 (4) 98 (4) 100 (1) 97 (7) 98 (4) 99 (2) 97 (7) 97 (7)
Mean rank R1 = 608 R2 = 637 R3 = 1125 R4 = 425 R5 = 467 R6 = 475 R7 = 433 R8 = 433
Bagging Bagging Classifier for detail refer to [38]Multiclass Classifier Method = ldquo1 against allrdquo ran-domWidthFactor = 20 seed = 1Logistic LogitBoost is used with simple regressionfunctions as base learner for details refer to [39]
The design parameters of classifiers are chosen as typicalvalues used in the literature or by experience The classifiersare not specifically tuned for the dataset at hand eventhough they may achieve a better performance with anotherparameter set since the goal is to design an automatedhandwritten digit recognition system based on the chosen setof classifiers
The digit recognition performances of the present tech-nique using each of these classifiers and their correspondingsuccess rates achieved at 95 confidence level are shownin Figures 4(a)-4(b) respectively It can be seen fromFigure 4 that the highest digit recognition accuracy hasbeen achieved by the MLP classifier which are found to be993 995 9892 9977 and 988 on Indo-ArabicBangla Devanagari Roman and Telugu scripts respectivelyThe performance analysis involves two parameters namelyModel Building Time (MBT) and Recognition Time (RT)MBT is based on the time required to train the system onthe given training samples whereas RT is based on the timerequired to recognize the given test samples The MBT andRT required for the abovementioned classifiers on all the fivedatabases are shown in Figures 5(a)-5(b)
43 Statistical Significance Tests The statistical significancetest is one of the essential ways for validating the performanceof themultiple classifiers usingmultiple datasets To do so wehave performed a safe and robust nonparametric Friedmantest [40] with the corresponding post hoc tests on Indo-Arabic script database For the present experimental setupthe number of datasets (119873) and the number of classifiers (119896)are set as 12 and 8 respectively These datasets are chosenrandomly from the test setTheperformances of the classifiers
on different datasets are shown in Table 4 On the basis ofthese performances the classifiers are then ranked for eachdataset separately the best performing algorithm gets therank 1 the second best gets rank 2 and so on (see Table 4)In case of ties average ranks are assigned to the classifiers tobreak the tie
Let 119903119894119895be the rank of the 119895th classifier on 119894th datasetThen
the mean of the ranks of the 119895th classifier over all the 119873datasets will be computed as follows
119877119895=1
119873
119873
sum119894=1
119903119894
119895 (36)
The null hypothesis states that all the classifiers are equivalentand so their ranks 119877
119895should be equal To justify it the
Friedman statistic [40] is computed as follows
1205942
119865=
12119873
119896 (119896 + 1)[
[
sum119895
1198772
119895minus119896 (119896 + 1)
2
4]
]
(37)
Under the current experimentation this statistic is dis-tributed according to 1205942
119865with 119896 minus 1 (=7) degrees of freedom
Using (37) the value of 1205942119865is calculated as 3046 From the
table of critical values (see any standard statistical book) thevalue of 1205942
119865with 7 degrees of freedom is 140671 for 120572 = 005
(where 120572 is known as level of significance) It can be seen thatthe computed 1205942
119865differs significantly from the standard 1205942
119865
So the null hypothesis is rejectedSingh et al [40] derived a better statistic using the
following formula
119865119865=
(119873 minus 1) 1205942
119865
119873(119896 minus 1) minus 1205942119865
(38)
119865119865is distributed according to the 119865-distribution with 119896 minus 1
(=7) and (119896minus1)(119873minus1) (=77) degrees of freedom Using (38)the value of 119865
119865is calculated as 80659 The critical value of 119865
(7 77) for 120572 = 005 is 2147 (see any standard statistical book)
12 Applied Computational Intelligence and Soft Computing
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
8486889092949698
100
Reco
gniti
on ac
cura
cy (
)
(a)
Classifiers
ArabicBanglaDevanagari
TeluguRoman
888990919293949596979899
100
95
confi
denc
e sco
re (
)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(b)
Figure 4 Graph showing (a) recognition accuracies and (b) 95 confidence scores of the proposed handwritten digit recognition techniqueusing eight well-known classifiers on digits of five different scripts
05
10152025303540
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Mod
el b
uild
ing
time (
s)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(a)
0
1
2
3
4
5
6
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Reco
gniti
on ti
me (
s)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(b)
Figure 5 Graphical comparison of (a) MBTs and (b) RTs required by eight different classifiers on all the five databases for handwritten digitrecognition
which shows a significant difference between the standardand calculated values of 119865
119865 Thus both Friedman and Iman
et al statistics reject the null hypothesisAs the null hypothesis is rejected a post hoc test known
as the Nemenyi test [40] is carried out for pairwise com-parisons of the best and worst performing classifiers Theperformances of two classifiers are significantly different ifthe corresponding average ranks differ by at least the criticaldifference (CD) which is expressed as follows
CD = 119902120572
radic119896 (119896 + 1)
6119873 (39)
For the Nemenyi test the value of 119902005
for eight classifiersis 3031 (see Table 5(a) of [41]) So the CD is calculated as3031radic89612 that is 3031 using (39) Since the differencebetween mean ranks of the best and worst classifier is muchgreater than the CD (see Table 3) we can conclude that thereis a significant difference between the performing abilitiesof the classifiers For comparing all classifiers with a controlclassifier (say MLP) we have applied the Bonferroni-Dunntest [40] For this test CD is calculated using the same (39)But here the value of 119902
005for eight classifiers is 2690 (see
Table 5(b) of [41]) So the CD for the Bonferroni-Dunntest is calculated as 2690radic89612 that is 2690 As the
Applied Computational Intelligence and Soft Computing 13
49555245
3125 3545 36253205 3205
3031
Differences of mean ranksDifference of mean ranksCD
0
1
2
3
4
5
6M
ean
rank
s
|R3minusR1|
|R3minusR2|
|R3minusR4|
|R3minusR5|
|R3minusR6|
|R3minusR7|
|R3minusR8|
Pairwise comparison of classifiers for the Nemenyi test
(a)
49555245
3125 3545 36253205 3205
269
Differences of mean ranksDifference of mean ranksCD
|R3minusR1|
|R3minusR2|
|R3minusR4|
|R3minusR5|
|R3minusR6|
|R3minusR7|
|R3minusR8|0
1
2
3
4
5
6
Mea
n ra
nks
Comparison of classifiers with MLP for the Bonferroni-Dunn test
(b)
Figure 6 Graphical representation of comparison of multiple classifiers for (a) the Nemenyi test and (b) the Bonferroni-Dunn test
difference between the mean ranks of any classifier and MLPis always greater than CD (see Table 3) the chosen controlclassifier performs significantly better than other classifiersfor Indo-Arabic database A graphical representation of theabovementioned post hoc tests for comparison of eight dif-ferent classifiers on Dataset 1 is shown in Figure 6 Similarlyit can also be shown for Bangla Devanagari Roman andTelugu databases that the chosen classifier (MLP) performssignificantly better than the other seven classifiers
44 Comparison among Moment Based Features For thejustification of the feature set used in the present workthe diverse combinations of six different types of momentsnamely geometric moment (F1ndashF5) moment invariant(F6ndashF12) affine moment invariant (F13ndashF18) Legendremoment (F19ndashF28) Zernike moment (F29ndashF64) and com-plex moment (F65ndashF130) are compared by considering allthe possible combinations This is done for measuring thediscriminating strength of the individual moment featuresand their combinations based on their complementary infor-mation These can be listed as follows
(a) Geometric moment + moment invariant + affineMoment invariant (F1ndashF18)
(b) Legendre moment (F19ndashF28)(c) Geometric moment + moment invariant + affine
moment invariant + Legendre moment (F1ndashF28)(d) Zernike moment (F29ndashF64)(e) Geometric moment + moment invariant + affine
moment invariant + Legendre moment + Zernikemoment (F1ndashF64)
(f) Legendre moment + Zernike moment (F19ndashF64)(g) Complex moment (F65ndashF130)(h) Zernike moment + complex moment (F29ndashF130)
86
88
90
92
94
96
98
100
Feature set and all possible combinations
ArabicBanglaDevanagari
RomanTelugu
Reco
gniti
on ac
cura
cy (
)
F1
ndashF18
F1
ndashF64
F19
ndashF28
F1
ndashF28
F29
ndashF64
F19
ndashF64
F65
ndashF130
F29
ndashF130
F1
ndashF130
Figure 7Graphical comparison showing the recognition accuraciesof all the possible combinations of moment based features achievedby MLP classifier
(i) Geometric moment + moment invariant + affinemoment invariant + Legendre moment + Zernikemoment + complex moment (F1ndashF130)
The graphical comparison of the corresponding numeralrecognition accuracies achieved by MLP classifier over thesame test set is shown in Figure 7 It can be observed fromFigure 7 that the present combination of moment feature setoutperforms all the other possible combinations
45 Detail Evaluation of MLP Classifier In the present workdetailed error analysis with respect to different parametersnamely Kappa statistics mean absolute error (MAE) root
14 Applied Computational Intelligence and Soft Computing
Table 5 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Indo-Arabic numerals (here MAE means mean absolute error RMSE means root mean square error TPR means True Positiverate FPR means False Positive rate MCC means Matthews Correlation Coefficient and AUC means Area under ROC)
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09922 01758 0293
1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0990 0001 0990 0990 0990 0989 1000ldquo2rdquo 0960 0002 0980 0960 0970 0966 0990ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0004 0961 0990 0975 0973 0999ldquo7rdquo 0990 0000 1000 0990 0995 0994 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09922 01758 0293 0993 00007 09931 0993 0993 09922 09989
Table 6 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Bangla numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09944 00535 0115
1000 0001 0990 1000 0995 0994 1000ldquo1rdquo 1000 0002 0980 1000 0990 0989 1000ldquo2rdquo 1000 0001 0990 1000 0995 0994 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0990 0001 0990 0990 0990 0989 1000ldquo6rdquo 0970 0000 1000 0970 0985 0983 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09944 00535 0115 0995 00005 0995 0995 0995 09943 1000
Table 7 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Devanagari numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
0988 00342 00847
0969 0002 0984 0969 0977 0974 0999ldquo1rdquo 0969 0000 1000 0969 0984 0983 1000ldquo2rdquo 1000 0005 0956 1000 0977 0975 1000ldquo3rdquo 0985 0003 0970 0985 0977 0975 0999ldquo4rdquo 1000 0002 0985 1000 0992 0992 1000ldquo5rdquo 0985 0000 1000 0985 0992 0991 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 0985 0000 1000 0985 0992 0991 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 0988 00342 00857 09893 00012 09895 09893 09891 09881 09998
mean square error (RMSE) True Positive rate (TPR) FalsePositive rate (FPR) precision recall 119865-measure MatthewsCorrelationCoefficient (MCC) andArea under ROC (AUC)is computed Tables 5ndash9 provide the said statistical mea-surements for handwritten numeral recognition written inIndo-Arabic Bangla Devanagari Roman and Telugu scriptsrespectively
5 Conclusion
India is a multilingual and multiscript country comprisingof 12 different scripts But there are not much competentworks done towards handwritten numeral recognition ofIndic scripts The following issues are observed with hand-written digit recognition system (1) mostly they have worked
Applied Computational Intelligence and Soft Computing 15
Table 8 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Roman numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09975 016 02716
1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0995 0001 0995 0995 0995 0994 0999ldquo2rdquo 1000 0000 1000 1000 1000 1000 1000ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0995 0000 1000 0995 0997 0997 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 0994 0001 0989 0994 0991 0990 0999ldquo9rdquo 0994 0001 0994 0994 0994 0994 0999Mean 09975 016 02716 09978 00003 09978 09978 09977 09975 09997
Table 9 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Telugu numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09867 00051 00449
0980 0001 0990 0980 0985 0983 0988ldquo1rdquo 0970 0002 0980 0970 0975 0972 0991ldquo2rdquo 1000 0002 0980 1000 0990 0989 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 0999ldquo4rdquo 0990 0002 0980 0990 0985 0983 0998ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0003 0971 0990 0980 0978 0995ldquo7rdquo 0990 0002 0980 0990 0985 0983 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 0970 0000 1000 0970 0985 0983 0994Mean 09867 00051 00449 0988 00012 09881 0988 0988 09865 09965
on limited dataset (2) Training and testing times are notmentioned in most of the works (3) Most of the works havebeen done for Roman because of the availability of largerdataset like MNIST (4) Recognition systems for Indic scriptsare mainly focused on single script (5) Limitation to somefeature extraction methods also exist that is they are localto a particular scriptlanguage rather having a global scopeIn this work we have verified the effectiveness of a momentbased approach to handwritten digit recognition problemthat includes geometric moment moment invariant affinemoment invariant Legendre moment Zernike moment andcomplex moment The present scheme has been tested forfive different popular scripts namely Indo-Arabic BanglaDevanagari Roman and Telugu These methods have beenevaluated on the CMATER and MNIST databases usingmultiple classifiers FinallyMLP classifier is found to producethe highest recognition accuracies of 993 995 98929977 and 988 on Indo-Arabic Bangla DevanagariRoman and Telugu scripts respectively The results havedemonstrated that the application ofmoment based approachleads to a higher accuracy compared to its counterpartsAmong the most important ones an advantage of thisfeature extraction algorithm is that it is less computationally
expensive where the most of the published works need morecomputation time These features are also very simple toimplement compared to other methods It is obvious thatto improve the performance of proposed system furtherwe need to investigate more the sources of errors Potentialmoment features other than the presented ones may alsoexist
To further improve the performance possible futureworks are as follows (1) although the moment based featuresperform superbly on the whole complementary featureslike concavity analysis may help in discriminating confusingnumerals For example Indo-Arabic numerals ldquo2rdquo and ldquo3rdquo canbetter be separated by considering the original size beforenormalization (2) For classifier design it is better to selectmodel parameters (classifier structures) by cross validationrather than empirically as done in our experiments (3)Combining multiple classifiers can improve the recognitionaccuracy
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
16 Applied Computational Intelligence and Soft Computing
Acknowledgments
The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India
References
[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015
[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003
[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005
[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997
[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007
[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008
[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998
[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007
[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009
[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010
[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011
[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011
[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011
[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003
[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012
[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013
[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013
[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014
[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014
[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008
[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014
[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014
[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014
[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015
[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992
[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921
[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962
[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004
Applied Computational Intelligence and Soft Computing 17
[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005
[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996
[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990
[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984
[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985
[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions
in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995
[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001
[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001
[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996
[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992
[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014
[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006
Submit your manuscripts athttpwwwhindawicom
Computer Games Technology
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Distributed Sensor Networks
International Journal of
Advances in
FuzzySystems
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014
International Journal of
ReconfigurableComputing
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied Computational Intelligence and Soft Computing
thinspAdvancesthinspinthinsp
Artificial Intelligence
HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014
Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Electrical and Computer Engineering
Journal of
Journal of
Computer Networks and Communications
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation
httpwwwhindawicom Volume 2014
Advances in
Multimedia
International Journal of
Biomedical Imaging
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
ArtificialNeural Systems
Advances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
RoboticsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Computational Intelligence and Neuroscience
Industrial EngineeringJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Human-ComputerInteraction
Advances in
Computer EngineeringAdvances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied Computational Intelligence and Soft Computing 3
digit recognition accuracy of 100 was achieved on MNISTdatabase using the whole image as input El Qacimy et al[23] investigated the effectiveness of four feature extractionapproaches based on Discrete Cosine Transform (DCT)namely DCT upper left corner (ULC) coefficients DCTzigzag coefficients block based DCT ULC coefficients andblock based DCT zigzag coefficients The coefficients of eachDCT variant were used as input data for SVM classifier andit was found that block based DCT zigzag feature extractionyielded a superior recognition accuracy of 9876 onMNISTdatabase AL-Mansoori [24] implemented aMLP classifier torecognize and predict handwritten digits A dataset of 5000samples were obtained from MNIST database and an overallaccuracy of 9932 was achieved
From the above literature it is clear thatmost of the workshave been done for the Roman script whereas relatively fewworks [11 15 19] have been reported for the digit recognitionwritten in Indic scripts The main reasons for this slowprogress could be attributed to the complexity of the shapeof Indic scripts as opposed to Roman script Again thediscriminating power of the features exploited till now isnot easily measurable investigative experimentations will benecessary for identifying new feature descriptors for effec-tive classification of complex handwritten digits of differentscripts It is also revealed that the methods described inthe literature suffer from larger computational time mainlydue to feature extraction from large dataset In addition theabove recognition systems fail to meet the desired accuracywhen exposed to different multiscript scenario Hence itwould be beneficial for multilingual country like India ifthere is a method which is independent of script and yieldsreasonable recognition accuracy This has motivated us tointroduce a script invariant handwritten digit recognitionsystem for identifying digits written in five popular scriptsnamely Indo-ArabicBanglaDevanagariRoman andTeluguThe key module of the proposed methodology is shown inFigure 1
3 Feature Extraction Methodology
One of the basic problems in the design of any patternrecognition system is the selection of a set of appropriatefeatures to be extracted from the object of interest Researchon the utilization of moments for object characterizationin both invariant and noninvariant tasks has received con-siderable attention in recent years Describing digit imageswith moments instead of other more commonly used patternrecognition features (described in [21ndash23]) means that globalproperties of the digit image are used rather than localproperties So for the present work we considered amomentbased approach which is described in the next subsection
31 Moments Moments are pure statistical measure of pixeldistribution around the center of gravity of the image andallow capturing global shapes information [25]Theydescribenumerical quantities at some distance from a referencepoint or axis Moments are commonly used in statisticsto characterize the distribution of random variables and
Handwritten digit images
Feature extraction using moment based features
Classification of digits using multiple classifiers
Statistical significance tests for comparison of multipleclassifiers using multiple datasets
Selection of appropriate classifier
Detailed performance evaluation of chosen classifierusing training and test sets
Figure 1 Schematic diagram illustrating the key modules of theproposed methodology
similarly in mechanics to characterize bodies by their spatialdistribution of mass
A complete characterization of moment functional over aclass of univariate functions was given by Hausdorff [26] in1921
Let 120583119899 be a real sequence of numbers and let us define
Δ119898
120583119899=
119898
sum119894=0
(minus1)119894
(119898
119894)120583119899+119894 (1)
Note that Δ119898120583119899can be viewed as the 119898th order derivative of
120583119899By theHausdorff theorem a necessary and sufficient con-
dition that there exists a monotonic function 119865(119909) satisfyingthe system
120583119899= int1
0
119909119899
119889119865 (119909) 119899 = 0 1 2 (2)
is that the system of linear inequalities
Δ119896
120583119899ge 0 119896 = 0 1 2 (3)
should be satisfied that is if 119891(119909) is a positive function (incase of image processing) then the set of functionals
int1
0
119909119899
119891 (119909) 119889119909 119899 = 0 1 (4)
completely characterizes the functionA necessary and sufficient condition that there exists a
function 119865(119909) of bounded variation satisfying (7) is that thesequence
119901
sum119898=0
(119901
119898)1003816100381610038161003816Δ119901minus119898
120583119898
1003816100381610038161003816 119901 = 0 1 2 (5)
4 Applied Computational Intelligence and Soft Computing
should be bounded The use of moments for image analysisis straightforward if we consider a binary or gray level imagesegment as a two-dimensional density distribution functionIt can be assumed that an image can be represented by a real-valued measurable function 119891(119909 119910) In this way momentsmay be used to characterize an image segment and extractproperties that have analogies in statistics and mechanics Inimage processing and computer vision an image momentis a certain particular weighted average (moment) of theimage pixelsrsquo intensities or a function of such momentsusually chosen to have some attractive property or interpre-tation The first significant work considering moments forpattern recognition was performed by Hu [27] He derivedrelative and absolute combinations of moment values thatare invariant with respect to scale position and orientationbased on the theories of invariant algebra that deal with theproperties of certain classes of algebraic expressions whichremain invariant under general linear transformations Sizeinvariant moments are derived from algebraic invariants butcan be shown to be the result of simple size normalizationTranslation invariance is achieved by computing momentsthat have been translated by the negative distance to thecentroid and thus normalized so that the center of mass ofthe distribution is at the origin (central moments)
32 Geometric Moments Geometric moments are defined asthe projection of the image intensity function119891(119909 119910) onto themonomial 119909119901119910119902 [25] The (119901 + 119902)th order geometric moment119872119901119902
of a gray level image 119891(119909 119910) is defined as
119872119901119902= ∬infin
minusinfin
119909119901
119910119902
119891 (119909 119910) 119889119909 119889119910 (6)
where 119901 119902 = 0 1 2 infin Note that the monomial product119909119901
119910119902 is the basis function for this moment definition A set
of 119899 moments consists of all 119872119901119902rsquos for 119901 + 119902 le 119899 that is
the set contains (12)(119899 + 1)(119899 + 2) elements If 119891(119909 119910) ispiecewise continuous and contains nonzero values only ina finite region of the 119909119910-plane then the moment sequence119872119901119902 is uniquely determined by 119891(119909 119910) and conversely
119891(119909 119910) is uniquely determined by 119872119901119902 Considering the
fact that an image segment has finite area or in the worst caseis piecewise continuous moments of all orders exist and acomplete moment set can be computed and used uniquely todescribe the information contained in the image Howeverobtaining all the information contained in the image requiresan infinite number of moment values Therefore to selecta meaningful subset of the moment values that containsufficient information to characterize the image uniquely fora specific application becomes very important In case of adigital image of size 119872 times 119873 the double integral in (6) isreplaced by a summation which turns into this simplifiedform
119898119901119902=
119872
sum119909=1
119873
sum119910=1
119909119901
119910119902
119891 (119909 119910) (7)
where 119901 119902 = 0 1 2 are integersWhen119891(119909 119910) changes by translating rotating or scaling
then the image may be positioned such that its center of mass
(COM) is coincided with the origin of the field of view thatis (119909 = 0) and (119910 = 0) and then the moments computedfor that object are referred to as central moment [25] and it isdesignated by 120583
119901119902 The simplified form of central moment of
order (119901 + 119902) is defined as follows
120583119901119902=
119872
sum119909=1
119873
sum119910=1
(119909 minus 119909)119901
(119910 minus 119910)119902
119891 (119909 119910) (8)
where 119909 = 1198981011989800and 119910 = 119898
0111989800
The pixel point (119909 119910) is theCOMof the imageThe centralmoments 120583
119901119902computed using the centroid of the image are
equivalent to 119898119901119902
whose center has been shifted to centroidof the image Therefore the central moments are invariantto image translations Scale invariance can be obtained bynormalization The normalized central moments denoted by120578119901119902
are defined as
120578119901119902=120583119901119902
120583120574
00
(9)
where 120574 = (119901 + 119902)2 + 1 for (119901 + 119902) = 2 3 The second order moments 120578
02 12057811 12057820 known as the
moments of inertia may be used to determine an importantimage feature called orientation [25] Here the feature valuesF1ndashF3 have been computed from moments of inertia of theword images In general the orientation of an image describeshow the image lies in the field of view or the directions of theprincipal axes In terms of moments the orientation of theprincipal axis 120579 taken as feature value F4 is given by
120579 =1
2tanminus1 (
212058311
12058320minus 12058302
) (10)
where 120579 is the angle of the principal axis nearest to the 119909-axis and is in the range minus1205874 le 120579 le 1205874 The minimumandmaximumdistances (119903min and 119903max) between the centroidand the boundary of an image are also feature descriptorsTheratio 119903max119903min is called elongation or eccentricity (F5) and canbe defined in terms of central moments as follows
119890 =(12058320minus 12058302)2
+ 41205832
11
12058300
(11)
33 Moment Invariants Based on the theory of algebraicinvariants Hu [27] derived relative and absolute combina-tions of moments that are invariant with respect to scaleposition and orientation The method of moment invariantsis derived from algebraic invariants applied to the momentgenerating function under a rotation transformationThe setof absolute moment invariants consists of a set of nonlinearcombinations of central moment values that remain invariantunder rotation A set of seven invariant moments can bederived based on the normalized central moments of order
Applied Computational Intelligence and Soft Computing 5
three that are invariant with respect to image scale transla-tion and rotation Consider
1206011= 12057820+ 12057802
1206012= (12057820minus 12057802)2
+ 41205782
11
1206013= (12057830minus 312057812)2
+ (312057821minus 12057803)2
1206014= (12057830+ 12057812)2
+ (12057821+ 12057803)2
1206015= (12057830minus 312057812) (12057830+ 12057812)
sdot [(12057830+ 12057812)2
minus 3 (12057821+ 12057803)2
] + (312057821minus 12057803)
sdot (12057821+ 12057803) [3 (120578
30+ 12057812)2
minus (12057821+ 12057803)2
]
1206016= (12057820minus 12057802) [(12057830+ 12057812)2
minus (12057821+ 12057803)2
]
+ 412057811(12057830+ 12057812) (12057821+ 12057803)
1206017= (312057821minus 12057803) (12057830+ 12057812)
sdot [(12057830+ 12057812)2
minus 3 (12057821+ 12057803)2
] + (312057812minus 12057830)
sdot (12057821+ 12057803) [3 (120578
30+ 12057812)2
minus (12057821+ 12057803)2
]
(12)
This set of moments is invariant to translation scale changemirroring (within a minus sign) and rotation The 2Dmoment invariant gives seven features (F6ndashF12) which hadbeen used for the current work
34 Affine Moment Invariants The affine moment invariantsare derived to be invariants to translation rotation andscaling of shapes and under 2D Affine transformation Thesix affine moment invariants [28] used for the present workare defined as follows
1198681=
1
120583400
(1205832012058302minus 1205832
11)
1198682=
1
1205831000
(1205832
301205832
03minus 612058330120583211205831212058303+ 4120583301205833
12
+ 4120583031205833
21minus 31205832
211205832
12)
1198683=
1
120583700
(12058320(1205832112058303minus 1205832
12) minus 12058311(1205833012058303minus 1205832112058312)
+ 12058302(1205833012058312minus 1205832
21))
1198684=
1
1205831100
(1205833
201205832
03minus 61205832
20120583111205831212058303minus 61205832
20120583211205830212058303
+ 91205832
20120583021205832
12+ 12120583
201205832
111205830312058321
+ 61205832012058311120583021205833012058303minus 18120583
2012058311120583021205832112058312
minus 81205833
111205830312058330minus 6120583201205832
021205833012058312+ 9120583201205832
021205832
21
+ 121205832
11120583021205833012058312minus 61205832
111205832
021205833012058321+ 1205832
021205832
30)
1198685=
1
120583600
(1205834012058304minus 41205833112058313+ 31205832
22)
1198686=
1
120583900
(120583401205830412058322+ 2120583311205832212058313minus 120583401205832
13minus 120583041205832
31
minus 1205833
22)
(13)
A total of 6 features (F13ndashF18) is extracted from each of thehandwritten digit images for the present work
35 The Legendre Moment The 2D Legendre moment [29]of order (119901 + 119902) of an object with intensity function 119891(119909 119910) isdefined as follows
119871119901119902=(2119901 + 1) (2119902 + 1)
4
sdot ∬+1
minus1
119875119901(119909) 119875119902(119910) 119891 (119909 119910) 119889119909 119889119910
(14)
where the kernel function 119875119901(119909) denotes the 119901th-order
Legendre polynomial and is given by
119875119901(119909) =
119901
sum119896=0
119862119901119896[(1 minus 119909)
119896
+ (minus1)119901
(1 + 119909)119896
] (15)
where
119862119901119896=(minus1)119896
2119896+1
(119901 + 119896)
(119901 minus 119896) (119896)2 (16)
Since the Legendre polynomials are orthogonal over theinterval [minus1 1] [20] a square image of 119873 times 119873 pixels withintensity function 119891(119894 119895) with 1 le 119894 119895 le 119873 must be scaledto be within the region minus1 le 119909 119910 le 1 The graphical plotfor first 10 Legendre polynomials is shown in Figures 2(a)-2(b) When an analog image is digitized to its discrete formthe 2D Legendre moments 119871
119901119902 defined by (14) is usually
approximated by the formula
119871119901119902
=(2119901 + 1) (2119902 + 1)
(119873 minus 1)2
119873
sum119894=1
119873
sum119895=1
119875119901(119909119894) 119875119902(119910119895) 119891 (119909
119894 119910119895)
(17)
where 119909119894= (2119894minus119873minus1)(119873minus1) and 119910
119895= (2119895minus119873minus1)(119873minus1)
and for a binary image 119891(119909119894 119910119895) is given as
119891 (119909119894 119910119895) =
1 if (119894 119895) is in original object
0 otherwise(18)
As indicated by Liao and Pawlak [30] (17) is not a veryaccurate approximation of (14) For achieving better accuracythey proposed to use the following approximated form
119901119902=(2119901 + 1) (2119902 + 1)
4
119873
sum119894=1
119873
sum119895=1
ℎ119901119902(119909119894119910119895) 119891 (119909
119894 119910119895) (19)
6 Applied Computational Intelligence and Soft Computing
Legendre polynomials
minus05 0 05 1minus1
x
Pn(x)
P0(x)
P1(x)
P2(x)
P3(x)
P4(x)
P5(x)
minus1
0
02
04
06
08
1
minus08
minus06
minus04
minus02
(a)
Legendre polynomials
minus05 0 05 1minus1
x
P6(x)
P7(x)
P8(x)
P9(x)
P10(x)
minus1
0
02
04
06
08
1
minus08
minus06
minus04
minus02
Pn(x)
(b)
Figure 2 Graph showing the plots for the two-dimensional Legendre polynomials 119875119899(119909) (a) 119875
1(119909) to 119875
5(119909) and (b) 119875
6(119909) to 119875
10(119909)
where
ℎ119901119902(119909119894 119910119895) = int
119909119894+Δ1199092
119909119894minusΔ1199092
int119910119895+Δ1199102
119910119895minusΔ1199102
119875119901(119909) 119875119901(119910) 119889119909 119889119910 (20)
withΔ119909 = 119909119894minus119909119894minus1
= 2(119873minus1) andΔ119910 = 119910119895minus119910119895minus1
= 2(119873minus1)To evaluate the double integral ℎ
119901119902(119909119894 119910119895) defined by (20)
an alternative extended Simpson rule was proposed by Liaoand Pawlak These values were then used to calculate the2D Legendre moments
119901119902defined by (19) Therefore this
method requires a large number of computing operations Asone can see
119901119902can be expressed with the help of a useful
formula that will be given below as a linear combination of119871119898119899 with 0 le 119898 le 119901 0 le 119899 le 119902A set of 10 Legendre moments (F19ndashF28) can also be
derived based on the set of invariant moments found in theprevious subsection
11987100= 12060100
11987110= (
3
4) 12060110
11987101= (
3
4) 12060101
11987120= (
5
4) [(
3
2) 12060120minus (
1
2) 12060100]
11987102= (
5
4) [(
3
2) 12060102minus (
1
2) 12060100]
11987111= (
9
4) 12060111
11987130= (
7
4) [(
5
2) 12060130minus (
3
2) 12060110]
11987103= (
7
4) [(
5
2) 12060103minus (
3
2) 12060101]
11987121= (
15
4) [(
3
2) 12060121minus (
1
2) 12060101]
11987112= (
15
4) [(
3
2) 12060112minus (
1
2) 12060110]
(21)
36 Zernike Moments Zernike polynomials are orthogonalseries of basis functions normalized over a unit circle Thecomplexity of these polynomials increases with increasingpolynomial order [31] To calculate the Zernikemoments theimage (or region of interest) is first mapped to the unit discusing polar coordinates where the center of the image is theorigin of the unit disc The pixels falling outside the unit discare not considered here The coordinates are then describedby the length of the vector from the origin to the coordinatepoint The mapping from Cartesian to polar coordinates isdefined as follows
119909 = 119903 cos 120579
119910 = 119903 sin 120579(22)
Applied Computational Intelligence and Soft Computing 7
where
119903 = radic1199092 + 1199102
120579 = tanminus1 (119910
119909)
(23)
An important attribute of the geometric representationsof Zernike polynomials is that lower order polynomialsapproximate the global features of the shapesurface whilethe higher ordered polynomials capture local shapesurfacefeatures Zernikemoments are a class of orthogonalmomentsand have been shown to be effective in terms of imagerepresentation
Zernike introduced a set of complex polynomials whichforms a complete orthogonal set over the interior of the unitcircle that is 1199092 + 1199102 = 1 Let the set of these polynomials bedenoted by 119881
119899119898(119909 119910) The form of these polynomials is as
follows
119881119899119898(119909 119910) = 119881
119899119898(119901 120579) = 119877
119899119898(120588) exp (119895119898120579) (24)
where
119899 positive integer or zero
119898 positive and negative integers subject to con-straints 119899 minus |119898| even |119898| le 119899
120588 length of vector from origin to 119891(119909 119910) pixel
120579 angle between vector 120588 and 119909-axis in counterclock-wise direction
As mentioned above the complex Zernike moments of order119899 with repetition 119898 for a continuous image function 119891(119909 119910)are defined as follows
119881119899119898
=119899 + 1
120587∬119891(119909 119910)119881
lowast
119899119898(120588 120579) 119889119909 119889119910 (25)
in the 119909119910 image plane where 1199092 + 1199102 le 1 and lowast indicatesthe complex conjugate Note that for the moments to beorthogonal the image must be scaled within a unit circlecentered at the origin and
119881119899119898
=119899 + 1
120587int2120587
0
int1
0
119891 (120588 120579) 119877119899119898(120588) exp (minus119895119898120579) 120588 119889120588 119889120579
(26)
in polar coordinates The Zernike moment of the rotatedimage in the same coordinates is given by
119881119903
119899119898=119899 + 1
120587
sdot int2120587
0
int1
0
119891 (120588 120579 minus 120572) 119877119899119898(120588) exp (minus119895119898120579) 120588 119889120588 119889120579
(27)
By change of variable 1205791= 120579 minus 120572
119881119903
119899119898=119899 + 1
120587int2120587
0
int1
0
119891 (120588 1205791) 119877119899119898(120588)
sdot exp (minus119895119898 (1205791+ 120572)) 120588 119889120588 119889120579 = [
119899 + 1
120587
sdot int2120587
0
int1
0
119891 (120588 1205791) 119877119899119898(120588) exp (minus119895119898120579
1) 120588 119889120588 119889120579]
sdot exp (minus119895119898120572) = 119881119899119898
exp (minus119895119898120572)
(28)
Equation (28) shows that Zernike moments have simplerotational transformation properties each Zernike momentmerely acquires a phase shift on rotation This simpleproperty leads to the conclusion that the magnitudes ofthe Zernike moments of a rotated image function remainidentical to those before rotationThus the magnitude of theZernike moment |119881
119899119898| can be taken as a rotation invariant
feature of the underlying image function The real-valuedradial polynomial 119877
119899119898(120588) is defined as follows
119877119899119898(120588) =
(119899minus|119898|)2
sum119909=0
(minus1)119904
sdot(119899 minus 119904)
119904 ((119899 + |119898|) 2 minus 119904) ((119899 minus |119898|) 2 minus 119904)120588119899minus2119904
(29)
where 119899 minus |119898| = even and |119898| le 119899Zernike moments may also be derived from conventional
moments 120583119901119902
as follows
119885119899119897=(119899 + 1)
120587
sdot
119899
sum119896=119897
119902
sum119895=0
119897
sum119898=0
(minus119894)119898
(119902
119895)(
119897
119898)119861119899119897119896120583119896minus2119895minus119897+1198982119895+119897minus119898
(30)
Zernikemomentsmay bemore easily derived from rotationalmoments119863
119899119896 by
119885119899119897=
119899
sum119896=119897
119861119899119897119896119863119899119896 (31)
When computing the Zernike moments if the center of apixel falls inside the border of unit disk 119909
2
+ 1199102
le 1this pixel will be used in the computation otherwise thepixel will be discarded Therefore the area covered by themoment computation is not exactly the area of the unitdisk Advantages of Zernike moments can be summarized asfollows
(1) The magnitude of Zernike moment has rotationalinvariant property
(2) They are robust to noise and shape variations to someextent
(3) Since the basis is orthogonal they have minimumredundant information
8 Applied Computational Intelligence and Soft Computing
Table 1 List of Zernike moments and their corresponding numbersof features from order 0 to order 10
Order(119899)
Zernike moments of order 119899with repetition119898 (119881
119899119898)
Total number ofmoments up to order 10
0 11988100
36
1 11988111
2 11988120 11988122
3 11988131 11988133
4 11988140 11988142 11988144
5 11988151 11988153 11988155
6 11988160 11988162 11988164 11988166
7 11988171 11988173 11988175 11988177
8 11988180 11988182 119881841198818611988188
9 11988191 11988193 119881951198819711988199
10 119881100 119881102 119881104 119881106 119881108 1198811010
(4) An image can better be described by a small set of itsZernike moments than any other types of momentssuch as geometric moments
(5) A relatively small set of Zernike moments can char-acterize the global shape of pattern Lower ordermoments represent the global shape of patternwhereas the higher order moments represent thedetails
Therefore we choose Zernike moments as our shape descrip-tor in digit recognition process Table 1 lists the rotationinvariant Zernike moment features (F29ndashF64) and theircorresponding numbers from order 0 to order 10 used for thepresent work
The defined features on the Zernike moments are onlyrotation invariant To obtain scale and translation invariancethe digit image is first subjected to a normalization processusing its regular moments The rotation invariant Zernikefeatures are then extracted from the scale and translationnormalized image
37 Complex Moments The notion of complex moments wasintroduced in [32] as a simple and straightforward techniqueto derive a set of invariant moments The two-dimensionalcomplex moments of order (119901 119902) for the image function119891(119909 119910) are defined by
119862119901119902= int1198862
1198861
int1198872
1198871
(119909 + 119895119910)119901
(119909 minus 119895119910)119902
119891 (119909 119910) 119889119909 119889119910 (32)
where 119901 and 119902 are nonnegative integers and 119895 = radicminus1 Someadvantages of the complex moments can be described asfollows
(1) When the central complex moments are taken as thefeatures the effects of the imagersquos lateral displacementcan be eliminated
(2) A set of complex moment invariants can also bederived which are invariant to the rotation of theobject
(3) Since the complex moment is an intermediate stepbetween ordinary moments and moment invariantsit is relatively more simple to compute and morepowerful than other moment features in any patternclassification problem
The complex moments of order (119901 119902) are a linear combina-tion with complex coefficients of all the geometric moments119872119899119898 satisfying 119901 + 119902 = 119899 + 119898 In polar coordinates the
complex moments of order (119901 + 119902) can be written as follows
119862119897
119899= 119862119901119902
= int2120587
0
int+infin
0
120588119901+119902
119890119895(119901minus119902)120579
119891 (120588 cos 120579 120588 sin 120579) 120588 119889120588 119889120579(33)
where119901+119902 = 119899 and119901minus119902 = 119897denote the order and repetition ofthe complex moments respectively If the complex momentof the original image and that of the rotated image in thesame polar coordinates are denoted by 119862
119901119902and 119862
119903
119901119902 the
relationship [33] between them is given as follows
119862119903
119901119902= 119862119901119902119890minus119895(119901minus119902)120579
(34)
where 120579 is the angle at which the original image is rotatedThecomplex moment features represent the invariant propertiesto lateral displacement and rotation Based on the definitionof moment invariants we know that as the image is rotatedeach complex moment goes through all possible phasesof a complex number while its magnitude |119862
119901119902| remains
unchanged If the exponential factor of the complex momentis canceled out we will obtain its absolute invariant valuewhich is invariant to the rotation of the images The rotationinvariant complex moment features (F65ndashF130) and theircorresponding numbers from order 0 to order 10 used for thepresent work are listed in Table 2
Finally a feature vector consisting of 130 moment basedfeatures is calculated from each of the handwritten numeralimages belonging to five different scripts Summarization ofthe overall moment based feature set used in the present workis enlisted in Table 3
4 Experimental Study and Analysis
In this section we present the detailed experimental resultsto illustrate the suitability of moment based approach tohandwritten digit recognition All the experiments are imple-mented inMATLAB 2010 under aWindows XP environmenton an Intel Core2 Duo 24GHz processor with 1 GB of RAMand performed on gray-scale digit imagesThe accuracy usedas assessment criteria for measuring the performance of theproposed system is expressed as follows
Accuracy Rate () =Number of Correctly classified digits
Total number of digitstimes 100 (35)
Applied Computational Intelligence and Soft Computing 9
Table 2 List of complex moments and their corresponding numbers of features from order 0 to order 10
Order(119899) Complex moments (119862
119901119902) Complex moments of order 119899 with repetition 119897 (119862119897
119899) Total number of moments
up to order 100 119862
00 1198620
0
66
1 11986210 11986201 119862
1
1 119862minus1
1
2 11986220 11986211 11986202 119862
2
2 1198620
2 119862minus2
2
3 11986230 11986221 11986212 11986203 119862
3
3 1198621
3 119862minus1
3119862minus3
3
4 11986240 11986231 11986222 11986213 11986204 119862
4
4 1198622
4 1198620
4 119862minus2
4 119862minus4
4
5 11986250 11986241 11986232 11986223 11986214 11986205 119862
5
5 1198623
5 1198621
5 119862minus1
5 119862minus3
5 119862minus5
5
6 11986260 11986251 11986242 11986233 11986224 11986215 11986206 119862
6
6 1198624
6 1198622
6 1198620
6 119862minus2
6 119862minus4
6 119862minus6
6
7 11986270 11986261 11986252 11986243 11986234 11986225 11986216 11986207 119862
7
7 1198625
7 1198623
7 1198621
7 119862minus1
7 119862minus3
7 119862minus4
7 119862minus7
7
8 11986280 11986271 11986262 11986253 11986244 11986235 11986226 11986217 11986208 119862
8
8 1198626
8 1198624
8 1198622
8 1198620
8 119862minus2
8 119862minus4
8 119862minus6
8 119862minus8
8
9 11986290 11986281 11986272 11986263 11986254 11986245 11986236 11986227 11986218 11986209 119862
9
9 1198627
9 1198625
9 1198623
9 1198621
9 119862minus1
9 119862minus3
9 119862minus5
9 119862minus7
9 119862minus9
9
10 119862100 11986291 11986282 11986273 11986264 11986255 11986246 11986237 11986228 11986219 119862010 119862
10
10 1198628
10 1198626
10 1198624
10 1198622
10 1198620
10 119862minus2
10 119862minus4
10 119862minus6
10 119862minus8
10 119862minus10
10
Table 3 Description of feature vector
Serialnumber List of moments Number of
features1 Geometric moment (F1ndashF5) 52 Moment invariant (F6ndashF12) 73 Affine moment invariant (F13ndashF18) 64 Legendre moment (F19ndashF28) 105 Zernike moment (F29ndashF64) 366 Complex moment (F65ndashF130) 66
Total 130
41 Detailed Dataset Description Handwritten numeralsfrom five different popular scripts namely Indo-ArabicBangla Devanagari Roman and Telugu are used in theexperiments for investigating the effectiveness of themomentbased feature sets as compared to conventional features Indo-Arabic or Eastern-Arabic is widely used in the Middle-Eastand also in the Indian subcontinent On the other handDevanagari and Bangla are ranked as the top two popular (interms of the number of native speakers) scripts in the Indiansubcontinent [34] Roman originally evolved from the Greekalphabet is spoken and used all over the world Also Teluguone of the oldest andpopular South Indian languages of Indiais spoken by more than 74 million people [34] It essentiallyranks third by the number of native speakers in India
The present approach is tested on the database named asCMATERdb3 where CMATER stands for Center for Micro-processor Application for Training Education and Researcha research laboratory at Computer Science and EngineeringDepartment of Jadavpur University India where the currentresearch activity took place db stands for database andthe numeric value 3 represents handwritten digit recogni-tion database stored in the said database repository Thetesting is currently done on four versions of CMATERdb3namely CMATERdb311 CMATERdb321 CMATERdb331and CMATERdb341 representing the databases created forhandwritten digit recognition system for four major scriptsnamely BanglaDevanagari Indo-Arabic and Telugu respec-tively
Each of the digit images are first preprocessed using basicoperations of skew corrections and morphological filtering[25] and then binarized using an adaptive global thresholdvalue computed as the average of minimum and maximumintensities in that imageThe binarized digit images may con-tain noisy pixels which have been removed by using Gaussianfilter [25] A well-known algorithm known as Canny EdgeDetection algorithm [25] is then applied for smoothing theedges of the binarized digit images Finally the boundingrectangular box of each digit image is separately normalizedto 32 times 32 pixels Database is made available freely in theCMATER website (httpwwwcmaterjuorgcmaterdbhtm)and at httpcodegooglecompcmaterdb
A dataset of 3000 digit samples is considered for each ofthe Devanagari Indo-Arabic and Telugu scripts For each ofthese datasets 2000 samples are used for training purposeand the rest of the samples are used for the test purposewhereas a dataset of 6000 samples is used by selecting 600samples for each of 10-digit classes of handwritten Bangladigits A training set of 4000 samples and a test set of 2000samples are then chosen for Bangla numerals by consideringequal number of digit samples from each class For Romannumerals a dataset of 6000 training samples is formed byrandom selection from the standard handwritten MNIST [7]training dataset of size 60000 samples In the same way4000 digit samples are selected from MNIST test dataset ofsize 10000 samples These digit samples are enclosed in aminimum bounding square and are normalized to 32 times 32pixels dimension Typical handwritten digit samples takenfrom the abovementioned databases used for evaluating thepresent work are shown in Figure 3
42 Recognition Process To realize the effectiveness of theproposed approach our comprehensive experimental testsare conducted on the five aforementioned datasets A totalof 6000 (for Devanagari Indo-Arabic and Telugu scripts)numerals have been used for the training purpose whereasthe remaining 3000 numerals (1000 from each of the script)have been used for the testing purpose For Bangla andRoman scripts a total of 8000 numerals (4000 taken from
10 Applied Computational Intelligence and Soft Computing
(a) (b)
(c) (d)
(e)
Figure 3 Samples of digit images taken from CMATER and MNIST databases written in five different scripts (a) Indo-Arabic (b) Bangla(c) Devanagari (d) Roman and (e) Telugu
each script) have been used for the training purpose whereasthe remaining 4000 numerals (2000 taken from each script)have been used for the testing purpose The designed featureset has been individually applied to eight well-known classi-fiers namely Naıve Bayes Bayes Net MLP SVM RandomForest Bagging Multiclass Classifier and Logistic For thepresent work the following abovementioned classifiers withthe given parameters are designed
Naıve Bayes Naıve Bayes classifier for details refer to[35]
Bayes Net Estimator = SimpleEstimator-A 05 searchalgorithm = K2MLP Learning Rate = 03Momentum= 02 Numberof Epochs = 1000 minerror = 002SVM Support Vector Machine using radial basiskernel with (119901 = 1) for details refer to [36]Random Forest Ensemble classifier that consists ofmany decision trees and outputs the class that is themode of the classes output by individual trees fordetails refer to [37]
Applied Computational Intelligence and Soft Computing 11
Table 4 Recognition accuracies of eight classifiers and their corresponding ranks using 12 different datasets (ranks in the parentheses areused for performing the Friedman test)
DatasetsRecognition accuracy ()
ClassifiersNaıve Bayes Bayes Net MLP SVM Random Forest Bagging Multiclass Classifier Logistic
1 86 (8) 92 (7) 100 (1) 99 (25) 97 (6) 99 (25) 98 (45) 98 (45)
2 99 (3) 98 (55) 100 (1) 99 (3) 96 (75) 96 (75) 97 (55) 99 (3)
3 98 (5) 94 (7) 100 (1) 93 (8) 99 (25) 98 (5) 99 (25) 98 (5)
4 93 (8) 94 (7) 99 (15) 98 (35) 98 (35) 97 (5) 96 (6) 99 (15)
5 91 (8) 92 (7) 100 (15) 99 (35) 99 (35) 97 (6) 98 (5) 100 (15)
6 99 (25) 98 (45) 100 (1) 96 (7) 97 (6) 98 (45) 99 (25) 95 (8)
7 91 (8) 94 (7) 100 (1) 99 (3) 99 (3) 99 (3) 98 (55) 98 (55)
8 91 (8) 96 (7) 100 (1) 98 (45) 98 (45) 97 (6) 99 (25) 99 (25)
9 92 (8) 94 (7) 100 (15) 100 (15) 97 (6) 99 (4) 99 (4) 99 (4)
10 99 (25) 97 (65) 100 (1) 99 (25) 97 (65) 97 (65) 98 (4) 97 (65)
11 94 (8) 96 (7) 100 (1) 98 (5) 99 (3) 98 (5) 99 (3) 99 (3)
12 98 (4) 98 (4) 100 (1) 97 (7) 98 (4) 99 (2) 97 (7) 97 (7)
Mean rank R1 = 608 R2 = 637 R3 = 1125 R4 = 425 R5 = 467 R6 = 475 R7 = 433 R8 = 433
Bagging Bagging Classifier for detail refer to [38]Multiclass Classifier Method = ldquo1 against allrdquo ran-domWidthFactor = 20 seed = 1Logistic LogitBoost is used with simple regressionfunctions as base learner for details refer to [39]
The design parameters of classifiers are chosen as typicalvalues used in the literature or by experience The classifiersare not specifically tuned for the dataset at hand eventhough they may achieve a better performance with anotherparameter set since the goal is to design an automatedhandwritten digit recognition system based on the chosen setof classifiers
The digit recognition performances of the present tech-nique using each of these classifiers and their correspondingsuccess rates achieved at 95 confidence level are shownin Figures 4(a)-4(b) respectively It can be seen fromFigure 4 that the highest digit recognition accuracy hasbeen achieved by the MLP classifier which are found to be993 995 9892 9977 and 988 on Indo-ArabicBangla Devanagari Roman and Telugu scripts respectivelyThe performance analysis involves two parameters namelyModel Building Time (MBT) and Recognition Time (RT)MBT is based on the time required to train the system onthe given training samples whereas RT is based on the timerequired to recognize the given test samples The MBT andRT required for the abovementioned classifiers on all the fivedatabases are shown in Figures 5(a)-5(b)
43 Statistical Significance Tests The statistical significancetest is one of the essential ways for validating the performanceof themultiple classifiers usingmultiple datasets To do so wehave performed a safe and robust nonparametric Friedmantest [40] with the corresponding post hoc tests on Indo-Arabic script database For the present experimental setupthe number of datasets (119873) and the number of classifiers (119896)are set as 12 and 8 respectively These datasets are chosenrandomly from the test setTheperformances of the classifiers
on different datasets are shown in Table 4 On the basis ofthese performances the classifiers are then ranked for eachdataset separately the best performing algorithm gets therank 1 the second best gets rank 2 and so on (see Table 4)In case of ties average ranks are assigned to the classifiers tobreak the tie
Let 119903119894119895be the rank of the 119895th classifier on 119894th datasetThen
the mean of the ranks of the 119895th classifier over all the 119873datasets will be computed as follows
119877119895=1
119873
119873
sum119894=1
119903119894
119895 (36)
The null hypothesis states that all the classifiers are equivalentand so their ranks 119877
119895should be equal To justify it the
Friedman statistic [40] is computed as follows
1205942
119865=
12119873
119896 (119896 + 1)[
[
sum119895
1198772
119895minus119896 (119896 + 1)
2
4]
]
(37)
Under the current experimentation this statistic is dis-tributed according to 1205942
119865with 119896 minus 1 (=7) degrees of freedom
Using (37) the value of 1205942119865is calculated as 3046 From the
table of critical values (see any standard statistical book) thevalue of 1205942
119865with 7 degrees of freedom is 140671 for 120572 = 005
(where 120572 is known as level of significance) It can be seen thatthe computed 1205942
119865differs significantly from the standard 1205942
119865
So the null hypothesis is rejectedSingh et al [40] derived a better statistic using the
following formula
119865119865=
(119873 minus 1) 1205942
119865
119873(119896 minus 1) minus 1205942119865
(38)
119865119865is distributed according to the 119865-distribution with 119896 minus 1
(=7) and (119896minus1)(119873minus1) (=77) degrees of freedom Using (38)the value of 119865
119865is calculated as 80659 The critical value of 119865
(7 77) for 120572 = 005 is 2147 (see any standard statistical book)
12 Applied Computational Intelligence and Soft Computing
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
8486889092949698
100
Reco
gniti
on ac
cura
cy (
)
(a)
Classifiers
ArabicBanglaDevanagari
TeluguRoman
888990919293949596979899
100
95
confi
denc
e sco
re (
)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(b)
Figure 4 Graph showing (a) recognition accuracies and (b) 95 confidence scores of the proposed handwritten digit recognition techniqueusing eight well-known classifiers on digits of five different scripts
05
10152025303540
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Mod
el b
uild
ing
time (
s)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(a)
0
1
2
3
4
5
6
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Reco
gniti
on ti
me (
s)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(b)
Figure 5 Graphical comparison of (a) MBTs and (b) RTs required by eight different classifiers on all the five databases for handwritten digitrecognition
which shows a significant difference between the standardand calculated values of 119865
119865 Thus both Friedman and Iman
et al statistics reject the null hypothesisAs the null hypothesis is rejected a post hoc test known
as the Nemenyi test [40] is carried out for pairwise com-parisons of the best and worst performing classifiers Theperformances of two classifiers are significantly different ifthe corresponding average ranks differ by at least the criticaldifference (CD) which is expressed as follows
CD = 119902120572
radic119896 (119896 + 1)
6119873 (39)
For the Nemenyi test the value of 119902005
for eight classifiersis 3031 (see Table 5(a) of [41]) So the CD is calculated as3031radic89612 that is 3031 using (39) Since the differencebetween mean ranks of the best and worst classifier is muchgreater than the CD (see Table 3) we can conclude that thereis a significant difference between the performing abilitiesof the classifiers For comparing all classifiers with a controlclassifier (say MLP) we have applied the Bonferroni-Dunntest [40] For this test CD is calculated using the same (39)But here the value of 119902
005for eight classifiers is 2690 (see
Table 5(b) of [41]) So the CD for the Bonferroni-Dunntest is calculated as 2690radic89612 that is 2690 As the
Applied Computational Intelligence and Soft Computing 13
49555245
3125 3545 36253205 3205
3031
Differences of mean ranksDifference of mean ranksCD
0
1
2
3
4
5
6M
ean
rank
s
|R3minusR1|
|R3minusR2|
|R3minusR4|
|R3minusR5|
|R3minusR6|
|R3minusR7|
|R3minusR8|
Pairwise comparison of classifiers for the Nemenyi test
(a)
49555245
3125 3545 36253205 3205
269
Differences of mean ranksDifference of mean ranksCD
|R3minusR1|
|R3minusR2|
|R3minusR4|
|R3minusR5|
|R3minusR6|
|R3minusR7|
|R3minusR8|0
1
2
3
4
5
6
Mea
n ra
nks
Comparison of classifiers with MLP for the Bonferroni-Dunn test
(b)
Figure 6 Graphical representation of comparison of multiple classifiers for (a) the Nemenyi test and (b) the Bonferroni-Dunn test
difference between the mean ranks of any classifier and MLPis always greater than CD (see Table 3) the chosen controlclassifier performs significantly better than other classifiersfor Indo-Arabic database A graphical representation of theabovementioned post hoc tests for comparison of eight dif-ferent classifiers on Dataset 1 is shown in Figure 6 Similarlyit can also be shown for Bangla Devanagari Roman andTelugu databases that the chosen classifier (MLP) performssignificantly better than the other seven classifiers
44 Comparison among Moment Based Features For thejustification of the feature set used in the present workthe diverse combinations of six different types of momentsnamely geometric moment (F1ndashF5) moment invariant(F6ndashF12) affine moment invariant (F13ndashF18) Legendremoment (F19ndashF28) Zernike moment (F29ndashF64) and com-plex moment (F65ndashF130) are compared by considering allthe possible combinations This is done for measuring thediscriminating strength of the individual moment featuresand their combinations based on their complementary infor-mation These can be listed as follows
(a) Geometric moment + moment invariant + affineMoment invariant (F1ndashF18)
(b) Legendre moment (F19ndashF28)(c) Geometric moment + moment invariant + affine
moment invariant + Legendre moment (F1ndashF28)(d) Zernike moment (F29ndashF64)(e) Geometric moment + moment invariant + affine
moment invariant + Legendre moment + Zernikemoment (F1ndashF64)
(f) Legendre moment + Zernike moment (F19ndashF64)(g) Complex moment (F65ndashF130)(h) Zernike moment + complex moment (F29ndashF130)
86
88
90
92
94
96
98
100
Feature set and all possible combinations
ArabicBanglaDevanagari
RomanTelugu
Reco
gniti
on ac
cura
cy (
)
F1
ndashF18
F1
ndashF64
F19
ndashF28
F1
ndashF28
F29
ndashF64
F19
ndashF64
F65
ndashF130
F29
ndashF130
F1
ndashF130
Figure 7Graphical comparison showing the recognition accuraciesof all the possible combinations of moment based features achievedby MLP classifier
(i) Geometric moment + moment invariant + affinemoment invariant + Legendre moment + Zernikemoment + complex moment (F1ndashF130)
The graphical comparison of the corresponding numeralrecognition accuracies achieved by MLP classifier over thesame test set is shown in Figure 7 It can be observed fromFigure 7 that the present combination of moment feature setoutperforms all the other possible combinations
45 Detail Evaluation of MLP Classifier In the present workdetailed error analysis with respect to different parametersnamely Kappa statistics mean absolute error (MAE) root
14 Applied Computational Intelligence and Soft Computing
Table 5 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Indo-Arabic numerals (here MAE means mean absolute error RMSE means root mean square error TPR means True Positiverate FPR means False Positive rate MCC means Matthews Correlation Coefficient and AUC means Area under ROC)
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09922 01758 0293
1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0990 0001 0990 0990 0990 0989 1000ldquo2rdquo 0960 0002 0980 0960 0970 0966 0990ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0004 0961 0990 0975 0973 0999ldquo7rdquo 0990 0000 1000 0990 0995 0994 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09922 01758 0293 0993 00007 09931 0993 0993 09922 09989
Table 6 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Bangla numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09944 00535 0115
1000 0001 0990 1000 0995 0994 1000ldquo1rdquo 1000 0002 0980 1000 0990 0989 1000ldquo2rdquo 1000 0001 0990 1000 0995 0994 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0990 0001 0990 0990 0990 0989 1000ldquo6rdquo 0970 0000 1000 0970 0985 0983 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09944 00535 0115 0995 00005 0995 0995 0995 09943 1000
Table 7 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Devanagari numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
0988 00342 00847
0969 0002 0984 0969 0977 0974 0999ldquo1rdquo 0969 0000 1000 0969 0984 0983 1000ldquo2rdquo 1000 0005 0956 1000 0977 0975 1000ldquo3rdquo 0985 0003 0970 0985 0977 0975 0999ldquo4rdquo 1000 0002 0985 1000 0992 0992 1000ldquo5rdquo 0985 0000 1000 0985 0992 0991 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 0985 0000 1000 0985 0992 0991 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 0988 00342 00857 09893 00012 09895 09893 09891 09881 09998
mean square error (RMSE) True Positive rate (TPR) FalsePositive rate (FPR) precision recall 119865-measure MatthewsCorrelationCoefficient (MCC) andArea under ROC (AUC)is computed Tables 5ndash9 provide the said statistical mea-surements for handwritten numeral recognition written inIndo-Arabic Bangla Devanagari Roman and Telugu scriptsrespectively
5 Conclusion
India is a multilingual and multiscript country comprisingof 12 different scripts But there are not much competentworks done towards handwritten numeral recognition ofIndic scripts The following issues are observed with hand-written digit recognition system (1) mostly they have worked
Applied Computational Intelligence and Soft Computing 15
Table 8 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Roman numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09975 016 02716
1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0995 0001 0995 0995 0995 0994 0999ldquo2rdquo 1000 0000 1000 1000 1000 1000 1000ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0995 0000 1000 0995 0997 0997 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 0994 0001 0989 0994 0991 0990 0999ldquo9rdquo 0994 0001 0994 0994 0994 0994 0999Mean 09975 016 02716 09978 00003 09978 09978 09977 09975 09997
Table 9 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Telugu numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09867 00051 00449
0980 0001 0990 0980 0985 0983 0988ldquo1rdquo 0970 0002 0980 0970 0975 0972 0991ldquo2rdquo 1000 0002 0980 1000 0990 0989 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 0999ldquo4rdquo 0990 0002 0980 0990 0985 0983 0998ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0003 0971 0990 0980 0978 0995ldquo7rdquo 0990 0002 0980 0990 0985 0983 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 0970 0000 1000 0970 0985 0983 0994Mean 09867 00051 00449 0988 00012 09881 0988 0988 09865 09965
on limited dataset (2) Training and testing times are notmentioned in most of the works (3) Most of the works havebeen done for Roman because of the availability of largerdataset like MNIST (4) Recognition systems for Indic scriptsare mainly focused on single script (5) Limitation to somefeature extraction methods also exist that is they are localto a particular scriptlanguage rather having a global scopeIn this work we have verified the effectiveness of a momentbased approach to handwritten digit recognition problemthat includes geometric moment moment invariant affinemoment invariant Legendre moment Zernike moment andcomplex moment The present scheme has been tested forfive different popular scripts namely Indo-Arabic BanglaDevanagari Roman and Telugu These methods have beenevaluated on the CMATER and MNIST databases usingmultiple classifiers FinallyMLP classifier is found to producethe highest recognition accuracies of 993 995 98929977 and 988 on Indo-Arabic Bangla DevanagariRoman and Telugu scripts respectively The results havedemonstrated that the application ofmoment based approachleads to a higher accuracy compared to its counterpartsAmong the most important ones an advantage of thisfeature extraction algorithm is that it is less computationally
expensive where the most of the published works need morecomputation time These features are also very simple toimplement compared to other methods It is obvious thatto improve the performance of proposed system furtherwe need to investigate more the sources of errors Potentialmoment features other than the presented ones may alsoexist
To further improve the performance possible futureworks are as follows (1) although the moment based featuresperform superbly on the whole complementary featureslike concavity analysis may help in discriminating confusingnumerals For example Indo-Arabic numerals ldquo2rdquo and ldquo3rdquo canbetter be separated by considering the original size beforenormalization (2) For classifier design it is better to selectmodel parameters (classifier structures) by cross validationrather than empirically as done in our experiments (3)Combining multiple classifiers can improve the recognitionaccuracy
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
16 Applied Computational Intelligence and Soft Computing
Acknowledgments
The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India
References
[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015
[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003
[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005
[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997
[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007
[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008
[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998
[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007
[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009
[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010
[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011
[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011
[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011
[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003
[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012
[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013
[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013
[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014
[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014
[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008
[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014
[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014
[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014
[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015
[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992
[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921
[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962
[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004
Applied Computational Intelligence and Soft Computing 17
[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005
[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996
[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990
[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984
[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985
[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions
in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995
[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001
[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001
[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996
[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992
[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014
[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006
Submit your manuscripts athttpwwwhindawicom
Computer Games Technology
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Distributed Sensor Networks
International Journal of
Advances in
FuzzySystems
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014
International Journal of
ReconfigurableComputing
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied Computational Intelligence and Soft Computing
thinspAdvancesthinspinthinsp
Artificial Intelligence
HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014
Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Electrical and Computer Engineering
Journal of
Journal of
Computer Networks and Communications
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation
httpwwwhindawicom Volume 2014
Advances in
Multimedia
International Journal of
Biomedical Imaging
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
ArtificialNeural Systems
Advances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
RoboticsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Computational Intelligence and Neuroscience
Industrial EngineeringJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Human-ComputerInteraction
Advances in
Computer EngineeringAdvances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
4 Applied Computational Intelligence and Soft Computing
should be bounded The use of moments for image analysisis straightforward if we consider a binary or gray level imagesegment as a two-dimensional density distribution functionIt can be assumed that an image can be represented by a real-valued measurable function 119891(119909 119910) In this way momentsmay be used to characterize an image segment and extractproperties that have analogies in statistics and mechanics Inimage processing and computer vision an image momentis a certain particular weighted average (moment) of theimage pixelsrsquo intensities or a function of such momentsusually chosen to have some attractive property or interpre-tation The first significant work considering moments forpattern recognition was performed by Hu [27] He derivedrelative and absolute combinations of moment values thatare invariant with respect to scale position and orientationbased on the theories of invariant algebra that deal with theproperties of certain classes of algebraic expressions whichremain invariant under general linear transformations Sizeinvariant moments are derived from algebraic invariants butcan be shown to be the result of simple size normalizationTranslation invariance is achieved by computing momentsthat have been translated by the negative distance to thecentroid and thus normalized so that the center of mass ofthe distribution is at the origin (central moments)
32 Geometric Moments Geometric moments are defined asthe projection of the image intensity function119891(119909 119910) onto themonomial 119909119901119910119902 [25] The (119901 + 119902)th order geometric moment119872119901119902
of a gray level image 119891(119909 119910) is defined as
119872119901119902= ∬infin
minusinfin
119909119901
119910119902
119891 (119909 119910) 119889119909 119889119910 (6)
where 119901 119902 = 0 1 2 infin Note that the monomial product119909119901
119910119902 is the basis function for this moment definition A set
of 119899 moments consists of all 119872119901119902rsquos for 119901 + 119902 le 119899 that is
the set contains (12)(119899 + 1)(119899 + 2) elements If 119891(119909 119910) ispiecewise continuous and contains nonzero values only ina finite region of the 119909119910-plane then the moment sequence119872119901119902 is uniquely determined by 119891(119909 119910) and conversely
119891(119909 119910) is uniquely determined by 119872119901119902 Considering the
fact that an image segment has finite area or in the worst caseis piecewise continuous moments of all orders exist and acomplete moment set can be computed and used uniquely todescribe the information contained in the image Howeverobtaining all the information contained in the image requiresan infinite number of moment values Therefore to selecta meaningful subset of the moment values that containsufficient information to characterize the image uniquely fora specific application becomes very important In case of adigital image of size 119872 times 119873 the double integral in (6) isreplaced by a summation which turns into this simplifiedform
119898119901119902=
119872
sum119909=1
119873
sum119910=1
119909119901
119910119902
119891 (119909 119910) (7)
where 119901 119902 = 0 1 2 are integersWhen119891(119909 119910) changes by translating rotating or scaling
then the image may be positioned such that its center of mass
(COM) is coincided with the origin of the field of view thatis (119909 = 0) and (119910 = 0) and then the moments computedfor that object are referred to as central moment [25] and it isdesignated by 120583
119901119902 The simplified form of central moment of
order (119901 + 119902) is defined as follows
120583119901119902=
119872
sum119909=1
119873
sum119910=1
(119909 minus 119909)119901
(119910 minus 119910)119902
119891 (119909 119910) (8)
where 119909 = 1198981011989800and 119910 = 119898
0111989800
The pixel point (119909 119910) is theCOMof the imageThe centralmoments 120583
119901119902computed using the centroid of the image are
equivalent to 119898119901119902
whose center has been shifted to centroidof the image Therefore the central moments are invariantto image translations Scale invariance can be obtained bynormalization The normalized central moments denoted by120578119901119902
are defined as
120578119901119902=120583119901119902
120583120574
00
(9)
where 120574 = (119901 + 119902)2 + 1 for (119901 + 119902) = 2 3 The second order moments 120578
02 12057811 12057820 known as the
moments of inertia may be used to determine an importantimage feature called orientation [25] Here the feature valuesF1ndashF3 have been computed from moments of inertia of theword images In general the orientation of an image describeshow the image lies in the field of view or the directions of theprincipal axes In terms of moments the orientation of theprincipal axis 120579 taken as feature value F4 is given by
120579 =1
2tanminus1 (
212058311
12058320minus 12058302
) (10)
where 120579 is the angle of the principal axis nearest to the 119909-axis and is in the range minus1205874 le 120579 le 1205874 The minimumandmaximumdistances (119903min and 119903max) between the centroidand the boundary of an image are also feature descriptorsTheratio 119903max119903min is called elongation or eccentricity (F5) and canbe defined in terms of central moments as follows
119890 =(12058320minus 12058302)2
+ 41205832
11
12058300
(11)
33 Moment Invariants Based on the theory of algebraicinvariants Hu [27] derived relative and absolute combina-tions of moments that are invariant with respect to scaleposition and orientation The method of moment invariantsis derived from algebraic invariants applied to the momentgenerating function under a rotation transformationThe setof absolute moment invariants consists of a set of nonlinearcombinations of central moment values that remain invariantunder rotation A set of seven invariant moments can bederived based on the normalized central moments of order
Applied Computational Intelligence and Soft Computing 5
three that are invariant with respect to image scale transla-tion and rotation Consider
1206011= 12057820+ 12057802
1206012= (12057820minus 12057802)2
+ 41205782
11
1206013= (12057830minus 312057812)2
+ (312057821minus 12057803)2
1206014= (12057830+ 12057812)2
+ (12057821+ 12057803)2
1206015= (12057830minus 312057812) (12057830+ 12057812)
sdot [(12057830+ 12057812)2
minus 3 (12057821+ 12057803)2
] + (312057821minus 12057803)
sdot (12057821+ 12057803) [3 (120578
30+ 12057812)2
minus (12057821+ 12057803)2
]
1206016= (12057820minus 12057802) [(12057830+ 12057812)2
minus (12057821+ 12057803)2
]
+ 412057811(12057830+ 12057812) (12057821+ 12057803)
1206017= (312057821minus 12057803) (12057830+ 12057812)
sdot [(12057830+ 12057812)2
minus 3 (12057821+ 12057803)2
] + (312057812minus 12057830)
sdot (12057821+ 12057803) [3 (120578
30+ 12057812)2
minus (12057821+ 12057803)2
]
(12)
This set of moments is invariant to translation scale changemirroring (within a minus sign) and rotation The 2Dmoment invariant gives seven features (F6ndashF12) which hadbeen used for the current work
34 Affine Moment Invariants The affine moment invariantsare derived to be invariants to translation rotation andscaling of shapes and under 2D Affine transformation Thesix affine moment invariants [28] used for the present workare defined as follows
1198681=
1
120583400
(1205832012058302minus 1205832
11)
1198682=
1
1205831000
(1205832
301205832
03minus 612058330120583211205831212058303+ 4120583301205833
12
+ 4120583031205833
21minus 31205832
211205832
12)
1198683=
1
120583700
(12058320(1205832112058303minus 1205832
12) minus 12058311(1205833012058303minus 1205832112058312)
+ 12058302(1205833012058312minus 1205832
21))
1198684=
1
1205831100
(1205833
201205832
03minus 61205832
20120583111205831212058303minus 61205832
20120583211205830212058303
+ 91205832
20120583021205832
12+ 12120583
201205832
111205830312058321
+ 61205832012058311120583021205833012058303minus 18120583
2012058311120583021205832112058312
minus 81205833
111205830312058330minus 6120583201205832
021205833012058312+ 9120583201205832
021205832
21
+ 121205832
11120583021205833012058312minus 61205832
111205832
021205833012058321+ 1205832
021205832
30)
1198685=
1
120583600
(1205834012058304minus 41205833112058313+ 31205832
22)
1198686=
1
120583900
(120583401205830412058322+ 2120583311205832212058313minus 120583401205832
13minus 120583041205832
31
minus 1205833
22)
(13)
A total of 6 features (F13ndashF18) is extracted from each of thehandwritten digit images for the present work
35 The Legendre Moment The 2D Legendre moment [29]of order (119901 + 119902) of an object with intensity function 119891(119909 119910) isdefined as follows
119871119901119902=(2119901 + 1) (2119902 + 1)
4
sdot ∬+1
minus1
119875119901(119909) 119875119902(119910) 119891 (119909 119910) 119889119909 119889119910
(14)
where the kernel function 119875119901(119909) denotes the 119901th-order
Legendre polynomial and is given by
119875119901(119909) =
119901
sum119896=0
119862119901119896[(1 minus 119909)
119896
+ (minus1)119901
(1 + 119909)119896
] (15)
where
119862119901119896=(minus1)119896
2119896+1
(119901 + 119896)
(119901 minus 119896) (119896)2 (16)
Since the Legendre polynomials are orthogonal over theinterval [minus1 1] [20] a square image of 119873 times 119873 pixels withintensity function 119891(119894 119895) with 1 le 119894 119895 le 119873 must be scaledto be within the region minus1 le 119909 119910 le 1 The graphical plotfor first 10 Legendre polynomials is shown in Figures 2(a)-2(b) When an analog image is digitized to its discrete formthe 2D Legendre moments 119871
119901119902 defined by (14) is usually
approximated by the formula
119871119901119902
=(2119901 + 1) (2119902 + 1)
(119873 minus 1)2
119873
sum119894=1
119873
sum119895=1
119875119901(119909119894) 119875119902(119910119895) 119891 (119909
119894 119910119895)
(17)
where 119909119894= (2119894minus119873minus1)(119873minus1) and 119910
119895= (2119895minus119873minus1)(119873minus1)
and for a binary image 119891(119909119894 119910119895) is given as
119891 (119909119894 119910119895) =
1 if (119894 119895) is in original object
0 otherwise(18)
As indicated by Liao and Pawlak [30] (17) is not a veryaccurate approximation of (14) For achieving better accuracythey proposed to use the following approximated form
119901119902=(2119901 + 1) (2119902 + 1)
4
119873
sum119894=1
119873
sum119895=1
ℎ119901119902(119909119894119910119895) 119891 (119909
119894 119910119895) (19)
6 Applied Computational Intelligence and Soft Computing
Legendre polynomials
minus05 0 05 1minus1
x
Pn(x)
P0(x)
P1(x)
P2(x)
P3(x)
P4(x)
P5(x)
minus1
0
02
04
06
08
1
minus08
minus06
minus04
minus02
(a)
Legendre polynomials
minus05 0 05 1minus1
x
P6(x)
P7(x)
P8(x)
P9(x)
P10(x)
minus1
0
02
04
06
08
1
minus08
minus06
minus04
minus02
Pn(x)
(b)
Figure 2 Graph showing the plots for the two-dimensional Legendre polynomials 119875119899(119909) (a) 119875
1(119909) to 119875
5(119909) and (b) 119875
6(119909) to 119875
10(119909)
where
ℎ119901119902(119909119894 119910119895) = int
119909119894+Δ1199092
119909119894minusΔ1199092
int119910119895+Δ1199102
119910119895minusΔ1199102
119875119901(119909) 119875119901(119910) 119889119909 119889119910 (20)
withΔ119909 = 119909119894minus119909119894minus1
= 2(119873minus1) andΔ119910 = 119910119895minus119910119895minus1
= 2(119873minus1)To evaluate the double integral ℎ
119901119902(119909119894 119910119895) defined by (20)
an alternative extended Simpson rule was proposed by Liaoand Pawlak These values were then used to calculate the2D Legendre moments
119901119902defined by (19) Therefore this
method requires a large number of computing operations Asone can see
119901119902can be expressed with the help of a useful
formula that will be given below as a linear combination of119871119898119899 with 0 le 119898 le 119901 0 le 119899 le 119902A set of 10 Legendre moments (F19ndashF28) can also be
derived based on the set of invariant moments found in theprevious subsection
11987100= 12060100
11987110= (
3
4) 12060110
11987101= (
3
4) 12060101
11987120= (
5
4) [(
3
2) 12060120minus (
1
2) 12060100]
11987102= (
5
4) [(
3
2) 12060102minus (
1
2) 12060100]
11987111= (
9
4) 12060111
11987130= (
7
4) [(
5
2) 12060130minus (
3
2) 12060110]
11987103= (
7
4) [(
5
2) 12060103minus (
3
2) 12060101]
11987121= (
15
4) [(
3
2) 12060121minus (
1
2) 12060101]
11987112= (
15
4) [(
3
2) 12060112minus (
1
2) 12060110]
(21)
36 Zernike Moments Zernike polynomials are orthogonalseries of basis functions normalized over a unit circle Thecomplexity of these polynomials increases with increasingpolynomial order [31] To calculate the Zernikemoments theimage (or region of interest) is first mapped to the unit discusing polar coordinates where the center of the image is theorigin of the unit disc The pixels falling outside the unit discare not considered here The coordinates are then describedby the length of the vector from the origin to the coordinatepoint The mapping from Cartesian to polar coordinates isdefined as follows
119909 = 119903 cos 120579
119910 = 119903 sin 120579(22)
Applied Computational Intelligence and Soft Computing 7
where
119903 = radic1199092 + 1199102
120579 = tanminus1 (119910
119909)
(23)
An important attribute of the geometric representationsof Zernike polynomials is that lower order polynomialsapproximate the global features of the shapesurface whilethe higher ordered polynomials capture local shapesurfacefeatures Zernikemoments are a class of orthogonalmomentsand have been shown to be effective in terms of imagerepresentation
Zernike introduced a set of complex polynomials whichforms a complete orthogonal set over the interior of the unitcircle that is 1199092 + 1199102 = 1 Let the set of these polynomials bedenoted by 119881
119899119898(119909 119910) The form of these polynomials is as
follows
119881119899119898(119909 119910) = 119881
119899119898(119901 120579) = 119877
119899119898(120588) exp (119895119898120579) (24)
where
119899 positive integer or zero
119898 positive and negative integers subject to con-straints 119899 minus |119898| even |119898| le 119899
120588 length of vector from origin to 119891(119909 119910) pixel
120579 angle between vector 120588 and 119909-axis in counterclock-wise direction
As mentioned above the complex Zernike moments of order119899 with repetition 119898 for a continuous image function 119891(119909 119910)are defined as follows
119881119899119898
=119899 + 1
120587∬119891(119909 119910)119881
lowast
119899119898(120588 120579) 119889119909 119889119910 (25)
in the 119909119910 image plane where 1199092 + 1199102 le 1 and lowast indicatesthe complex conjugate Note that for the moments to beorthogonal the image must be scaled within a unit circlecentered at the origin and
119881119899119898
=119899 + 1
120587int2120587
0
int1
0
119891 (120588 120579) 119877119899119898(120588) exp (minus119895119898120579) 120588 119889120588 119889120579
(26)
in polar coordinates The Zernike moment of the rotatedimage in the same coordinates is given by
119881119903
119899119898=119899 + 1
120587
sdot int2120587
0
int1
0
119891 (120588 120579 minus 120572) 119877119899119898(120588) exp (minus119895119898120579) 120588 119889120588 119889120579
(27)
By change of variable 1205791= 120579 minus 120572
119881119903
119899119898=119899 + 1
120587int2120587
0
int1
0
119891 (120588 1205791) 119877119899119898(120588)
sdot exp (minus119895119898 (1205791+ 120572)) 120588 119889120588 119889120579 = [
119899 + 1
120587
sdot int2120587
0
int1
0
119891 (120588 1205791) 119877119899119898(120588) exp (minus119895119898120579
1) 120588 119889120588 119889120579]
sdot exp (minus119895119898120572) = 119881119899119898
exp (minus119895119898120572)
(28)
Equation (28) shows that Zernike moments have simplerotational transformation properties each Zernike momentmerely acquires a phase shift on rotation This simpleproperty leads to the conclusion that the magnitudes ofthe Zernike moments of a rotated image function remainidentical to those before rotationThus the magnitude of theZernike moment |119881
119899119898| can be taken as a rotation invariant
feature of the underlying image function The real-valuedradial polynomial 119877
119899119898(120588) is defined as follows
119877119899119898(120588) =
(119899minus|119898|)2
sum119909=0
(minus1)119904
sdot(119899 minus 119904)
119904 ((119899 + |119898|) 2 minus 119904) ((119899 minus |119898|) 2 minus 119904)120588119899minus2119904
(29)
where 119899 minus |119898| = even and |119898| le 119899Zernike moments may also be derived from conventional
moments 120583119901119902
as follows
119885119899119897=(119899 + 1)
120587
sdot
119899
sum119896=119897
119902
sum119895=0
119897
sum119898=0
(minus119894)119898
(119902
119895)(
119897
119898)119861119899119897119896120583119896minus2119895minus119897+1198982119895+119897minus119898
(30)
Zernikemomentsmay bemore easily derived from rotationalmoments119863
119899119896 by
119885119899119897=
119899
sum119896=119897
119861119899119897119896119863119899119896 (31)
When computing the Zernike moments if the center of apixel falls inside the border of unit disk 119909
2
+ 1199102
le 1this pixel will be used in the computation otherwise thepixel will be discarded Therefore the area covered by themoment computation is not exactly the area of the unitdisk Advantages of Zernike moments can be summarized asfollows
(1) The magnitude of Zernike moment has rotationalinvariant property
(2) They are robust to noise and shape variations to someextent
(3) Since the basis is orthogonal they have minimumredundant information
8 Applied Computational Intelligence and Soft Computing
Table 1 List of Zernike moments and their corresponding numbersof features from order 0 to order 10
Order(119899)
Zernike moments of order 119899with repetition119898 (119881
119899119898)
Total number ofmoments up to order 10
0 11988100
36
1 11988111
2 11988120 11988122
3 11988131 11988133
4 11988140 11988142 11988144
5 11988151 11988153 11988155
6 11988160 11988162 11988164 11988166
7 11988171 11988173 11988175 11988177
8 11988180 11988182 119881841198818611988188
9 11988191 11988193 119881951198819711988199
10 119881100 119881102 119881104 119881106 119881108 1198811010
(4) An image can better be described by a small set of itsZernike moments than any other types of momentssuch as geometric moments
(5) A relatively small set of Zernike moments can char-acterize the global shape of pattern Lower ordermoments represent the global shape of patternwhereas the higher order moments represent thedetails
Therefore we choose Zernike moments as our shape descrip-tor in digit recognition process Table 1 lists the rotationinvariant Zernike moment features (F29ndashF64) and theircorresponding numbers from order 0 to order 10 used for thepresent work
The defined features on the Zernike moments are onlyrotation invariant To obtain scale and translation invariancethe digit image is first subjected to a normalization processusing its regular moments The rotation invariant Zernikefeatures are then extracted from the scale and translationnormalized image
37 Complex Moments The notion of complex moments wasintroduced in [32] as a simple and straightforward techniqueto derive a set of invariant moments The two-dimensionalcomplex moments of order (119901 119902) for the image function119891(119909 119910) are defined by
119862119901119902= int1198862
1198861
int1198872
1198871
(119909 + 119895119910)119901
(119909 minus 119895119910)119902
119891 (119909 119910) 119889119909 119889119910 (32)
where 119901 and 119902 are nonnegative integers and 119895 = radicminus1 Someadvantages of the complex moments can be described asfollows
(1) When the central complex moments are taken as thefeatures the effects of the imagersquos lateral displacementcan be eliminated
(2) A set of complex moment invariants can also bederived which are invariant to the rotation of theobject
(3) Since the complex moment is an intermediate stepbetween ordinary moments and moment invariantsit is relatively more simple to compute and morepowerful than other moment features in any patternclassification problem
The complex moments of order (119901 119902) are a linear combina-tion with complex coefficients of all the geometric moments119872119899119898 satisfying 119901 + 119902 = 119899 + 119898 In polar coordinates the
complex moments of order (119901 + 119902) can be written as follows
119862119897
119899= 119862119901119902
= int2120587
0
int+infin
0
120588119901+119902
119890119895(119901minus119902)120579
119891 (120588 cos 120579 120588 sin 120579) 120588 119889120588 119889120579(33)
where119901+119902 = 119899 and119901minus119902 = 119897denote the order and repetition ofthe complex moments respectively If the complex momentof the original image and that of the rotated image in thesame polar coordinates are denoted by 119862
119901119902and 119862
119903
119901119902 the
relationship [33] between them is given as follows
119862119903
119901119902= 119862119901119902119890minus119895(119901minus119902)120579
(34)
where 120579 is the angle at which the original image is rotatedThecomplex moment features represent the invariant propertiesto lateral displacement and rotation Based on the definitionof moment invariants we know that as the image is rotatedeach complex moment goes through all possible phasesof a complex number while its magnitude |119862
119901119902| remains
unchanged If the exponential factor of the complex momentis canceled out we will obtain its absolute invariant valuewhich is invariant to the rotation of the images The rotationinvariant complex moment features (F65ndashF130) and theircorresponding numbers from order 0 to order 10 used for thepresent work are listed in Table 2
Finally a feature vector consisting of 130 moment basedfeatures is calculated from each of the handwritten numeralimages belonging to five different scripts Summarization ofthe overall moment based feature set used in the present workis enlisted in Table 3
4 Experimental Study and Analysis
In this section we present the detailed experimental resultsto illustrate the suitability of moment based approach tohandwritten digit recognition All the experiments are imple-mented inMATLAB 2010 under aWindows XP environmenton an Intel Core2 Duo 24GHz processor with 1 GB of RAMand performed on gray-scale digit imagesThe accuracy usedas assessment criteria for measuring the performance of theproposed system is expressed as follows
Accuracy Rate () =Number of Correctly classified digits
Total number of digitstimes 100 (35)
Applied Computational Intelligence and Soft Computing 9
Table 2 List of complex moments and their corresponding numbers of features from order 0 to order 10
Order(119899) Complex moments (119862
119901119902) Complex moments of order 119899 with repetition 119897 (119862119897
119899) Total number of moments
up to order 100 119862
00 1198620
0
66
1 11986210 11986201 119862
1
1 119862minus1
1
2 11986220 11986211 11986202 119862
2
2 1198620
2 119862minus2
2
3 11986230 11986221 11986212 11986203 119862
3
3 1198621
3 119862minus1
3119862minus3
3
4 11986240 11986231 11986222 11986213 11986204 119862
4
4 1198622
4 1198620
4 119862minus2
4 119862minus4
4
5 11986250 11986241 11986232 11986223 11986214 11986205 119862
5
5 1198623
5 1198621
5 119862minus1
5 119862minus3
5 119862minus5
5
6 11986260 11986251 11986242 11986233 11986224 11986215 11986206 119862
6
6 1198624
6 1198622
6 1198620
6 119862minus2
6 119862minus4
6 119862minus6
6
7 11986270 11986261 11986252 11986243 11986234 11986225 11986216 11986207 119862
7
7 1198625
7 1198623
7 1198621
7 119862minus1
7 119862minus3
7 119862minus4
7 119862minus7
7
8 11986280 11986271 11986262 11986253 11986244 11986235 11986226 11986217 11986208 119862
8
8 1198626
8 1198624
8 1198622
8 1198620
8 119862minus2
8 119862minus4
8 119862minus6
8 119862minus8
8
9 11986290 11986281 11986272 11986263 11986254 11986245 11986236 11986227 11986218 11986209 119862
9
9 1198627
9 1198625
9 1198623
9 1198621
9 119862minus1
9 119862minus3
9 119862minus5
9 119862minus7
9 119862minus9
9
10 119862100 11986291 11986282 11986273 11986264 11986255 11986246 11986237 11986228 11986219 119862010 119862
10
10 1198628
10 1198626
10 1198624
10 1198622
10 1198620
10 119862minus2
10 119862minus4
10 119862minus6
10 119862minus8
10 119862minus10
10
Table 3 Description of feature vector
Serialnumber List of moments Number of
features1 Geometric moment (F1ndashF5) 52 Moment invariant (F6ndashF12) 73 Affine moment invariant (F13ndashF18) 64 Legendre moment (F19ndashF28) 105 Zernike moment (F29ndashF64) 366 Complex moment (F65ndashF130) 66
Total 130
41 Detailed Dataset Description Handwritten numeralsfrom five different popular scripts namely Indo-ArabicBangla Devanagari Roman and Telugu are used in theexperiments for investigating the effectiveness of themomentbased feature sets as compared to conventional features Indo-Arabic or Eastern-Arabic is widely used in the Middle-Eastand also in the Indian subcontinent On the other handDevanagari and Bangla are ranked as the top two popular (interms of the number of native speakers) scripts in the Indiansubcontinent [34] Roman originally evolved from the Greekalphabet is spoken and used all over the world Also Teluguone of the oldest andpopular South Indian languages of Indiais spoken by more than 74 million people [34] It essentiallyranks third by the number of native speakers in India
The present approach is tested on the database named asCMATERdb3 where CMATER stands for Center for Micro-processor Application for Training Education and Researcha research laboratory at Computer Science and EngineeringDepartment of Jadavpur University India where the currentresearch activity took place db stands for database andthe numeric value 3 represents handwritten digit recogni-tion database stored in the said database repository Thetesting is currently done on four versions of CMATERdb3namely CMATERdb311 CMATERdb321 CMATERdb331and CMATERdb341 representing the databases created forhandwritten digit recognition system for four major scriptsnamely BanglaDevanagari Indo-Arabic and Telugu respec-tively
Each of the digit images are first preprocessed using basicoperations of skew corrections and morphological filtering[25] and then binarized using an adaptive global thresholdvalue computed as the average of minimum and maximumintensities in that imageThe binarized digit images may con-tain noisy pixels which have been removed by using Gaussianfilter [25] A well-known algorithm known as Canny EdgeDetection algorithm [25] is then applied for smoothing theedges of the binarized digit images Finally the boundingrectangular box of each digit image is separately normalizedto 32 times 32 pixels Database is made available freely in theCMATER website (httpwwwcmaterjuorgcmaterdbhtm)and at httpcodegooglecompcmaterdb
A dataset of 3000 digit samples is considered for each ofthe Devanagari Indo-Arabic and Telugu scripts For each ofthese datasets 2000 samples are used for training purposeand the rest of the samples are used for the test purposewhereas a dataset of 6000 samples is used by selecting 600samples for each of 10-digit classes of handwritten Bangladigits A training set of 4000 samples and a test set of 2000samples are then chosen for Bangla numerals by consideringequal number of digit samples from each class For Romannumerals a dataset of 6000 training samples is formed byrandom selection from the standard handwritten MNIST [7]training dataset of size 60000 samples In the same way4000 digit samples are selected from MNIST test dataset ofsize 10000 samples These digit samples are enclosed in aminimum bounding square and are normalized to 32 times 32pixels dimension Typical handwritten digit samples takenfrom the abovementioned databases used for evaluating thepresent work are shown in Figure 3
42 Recognition Process To realize the effectiveness of theproposed approach our comprehensive experimental testsare conducted on the five aforementioned datasets A totalof 6000 (for Devanagari Indo-Arabic and Telugu scripts)numerals have been used for the training purpose whereasthe remaining 3000 numerals (1000 from each of the script)have been used for the testing purpose For Bangla andRoman scripts a total of 8000 numerals (4000 taken from
10 Applied Computational Intelligence and Soft Computing
(a) (b)
(c) (d)
(e)
Figure 3 Samples of digit images taken from CMATER and MNIST databases written in five different scripts (a) Indo-Arabic (b) Bangla(c) Devanagari (d) Roman and (e) Telugu
each script) have been used for the training purpose whereasthe remaining 4000 numerals (2000 taken from each script)have been used for the testing purpose The designed featureset has been individually applied to eight well-known classi-fiers namely Naıve Bayes Bayes Net MLP SVM RandomForest Bagging Multiclass Classifier and Logistic For thepresent work the following abovementioned classifiers withthe given parameters are designed
Naıve Bayes Naıve Bayes classifier for details refer to[35]
Bayes Net Estimator = SimpleEstimator-A 05 searchalgorithm = K2MLP Learning Rate = 03Momentum= 02 Numberof Epochs = 1000 minerror = 002SVM Support Vector Machine using radial basiskernel with (119901 = 1) for details refer to [36]Random Forest Ensemble classifier that consists ofmany decision trees and outputs the class that is themode of the classes output by individual trees fordetails refer to [37]
Applied Computational Intelligence and Soft Computing 11
Table 4 Recognition accuracies of eight classifiers and their corresponding ranks using 12 different datasets (ranks in the parentheses areused for performing the Friedman test)
DatasetsRecognition accuracy ()
ClassifiersNaıve Bayes Bayes Net MLP SVM Random Forest Bagging Multiclass Classifier Logistic
1 86 (8) 92 (7) 100 (1) 99 (25) 97 (6) 99 (25) 98 (45) 98 (45)
2 99 (3) 98 (55) 100 (1) 99 (3) 96 (75) 96 (75) 97 (55) 99 (3)
3 98 (5) 94 (7) 100 (1) 93 (8) 99 (25) 98 (5) 99 (25) 98 (5)
4 93 (8) 94 (7) 99 (15) 98 (35) 98 (35) 97 (5) 96 (6) 99 (15)
5 91 (8) 92 (7) 100 (15) 99 (35) 99 (35) 97 (6) 98 (5) 100 (15)
6 99 (25) 98 (45) 100 (1) 96 (7) 97 (6) 98 (45) 99 (25) 95 (8)
7 91 (8) 94 (7) 100 (1) 99 (3) 99 (3) 99 (3) 98 (55) 98 (55)
8 91 (8) 96 (7) 100 (1) 98 (45) 98 (45) 97 (6) 99 (25) 99 (25)
9 92 (8) 94 (7) 100 (15) 100 (15) 97 (6) 99 (4) 99 (4) 99 (4)
10 99 (25) 97 (65) 100 (1) 99 (25) 97 (65) 97 (65) 98 (4) 97 (65)
11 94 (8) 96 (7) 100 (1) 98 (5) 99 (3) 98 (5) 99 (3) 99 (3)
12 98 (4) 98 (4) 100 (1) 97 (7) 98 (4) 99 (2) 97 (7) 97 (7)
Mean rank R1 = 608 R2 = 637 R3 = 1125 R4 = 425 R5 = 467 R6 = 475 R7 = 433 R8 = 433
Bagging Bagging Classifier for detail refer to [38]Multiclass Classifier Method = ldquo1 against allrdquo ran-domWidthFactor = 20 seed = 1Logistic LogitBoost is used with simple regressionfunctions as base learner for details refer to [39]
The design parameters of classifiers are chosen as typicalvalues used in the literature or by experience The classifiersare not specifically tuned for the dataset at hand eventhough they may achieve a better performance with anotherparameter set since the goal is to design an automatedhandwritten digit recognition system based on the chosen setof classifiers
The digit recognition performances of the present tech-nique using each of these classifiers and their correspondingsuccess rates achieved at 95 confidence level are shownin Figures 4(a)-4(b) respectively It can be seen fromFigure 4 that the highest digit recognition accuracy hasbeen achieved by the MLP classifier which are found to be993 995 9892 9977 and 988 on Indo-ArabicBangla Devanagari Roman and Telugu scripts respectivelyThe performance analysis involves two parameters namelyModel Building Time (MBT) and Recognition Time (RT)MBT is based on the time required to train the system onthe given training samples whereas RT is based on the timerequired to recognize the given test samples The MBT andRT required for the abovementioned classifiers on all the fivedatabases are shown in Figures 5(a)-5(b)
43 Statistical Significance Tests The statistical significancetest is one of the essential ways for validating the performanceof themultiple classifiers usingmultiple datasets To do so wehave performed a safe and robust nonparametric Friedmantest [40] with the corresponding post hoc tests on Indo-Arabic script database For the present experimental setupthe number of datasets (119873) and the number of classifiers (119896)are set as 12 and 8 respectively These datasets are chosenrandomly from the test setTheperformances of the classifiers
on different datasets are shown in Table 4 On the basis ofthese performances the classifiers are then ranked for eachdataset separately the best performing algorithm gets therank 1 the second best gets rank 2 and so on (see Table 4)In case of ties average ranks are assigned to the classifiers tobreak the tie
Let 119903119894119895be the rank of the 119895th classifier on 119894th datasetThen
the mean of the ranks of the 119895th classifier over all the 119873datasets will be computed as follows
119877119895=1
119873
119873
sum119894=1
119903119894
119895 (36)
The null hypothesis states that all the classifiers are equivalentand so their ranks 119877
119895should be equal To justify it the
Friedman statistic [40] is computed as follows
1205942
119865=
12119873
119896 (119896 + 1)[
[
sum119895
1198772
119895minus119896 (119896 + 1)
2
4]
]
(37)
Under the current experimentation this statistic is dis-tributed according to 1205942
119865with 119896 minus 1 (=7) degrees of freedom
Using (37) the value of 1205942119865is calculated as 3046 From the
table of critical values (see any standard statistical book) thevalue of 1205942
119865with 7 degrees of freedom is 140671 for 120572 = 005
(where 120572 is known as level of significance) It can be seen thatthe computed 1205942
119865differs significantly from the standard 1205942
119865
So the null hypothesis is rejectedSingh et al [40] derived a better statistic using the
following formula
119865119865=
(119873 minus 1) 1205942
119865
119873(119896 minus 1) minus 1205942119865
(38)
119865119865is distributed according to the 119865-distribution with 119896 minus 1
(=7) and (119896minus1)(119873minus1) (=77) degrees of freedom Using (38)the value of 119865
119865is calculated as 80659 The critical value of 119865
(7 77) for 120572 = 005 is 2147 (see any standard statistical book)
12 Applied Computational Intelligence and Soft Computing
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
8486889092949698
100
Reco
gniti
on ac
cura
cy (
)
(a)
Classifiers
ArabicBanglaDevanagari
TeluguRoman
888990919293949596979899
100
95
confi
denc
e sco
re (
)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(b)
Figure 4 Graph showing (a) recognition accuracies and (b) 95 confidence scores of the proposed handwritten digit recognition techniqueusing eight well-known classifiers on digits of five different scripts
05
10152025303540
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Mod
el b
uild
ing
time (
s)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(a)
0
1
2
3
4
5
6
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Reco
gniti
on ti
me (
s)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(b)
Figure 5 Graphical comparison of (a) MBTs and (b) RTs required by eight different classifiers on all the five databases for handwritten digitrecognition
which shows a significant difference between the standardand calculated values of 119865
119865 Thus both Friedman and Iman
et al statistics reject the null hypothesisAs the null hypothesis is rejected a post hoc test known
as the Nemenyi test [40] is carried out for pairwise com-parisons of the best and worst performing classifiers Theperformances of two classifiers are significantly different ifthe corresponding average ranks differ by at least the criticaldifference (CD) which is expressed as follows
CD = 119902120572
radic119896 (119896 + 1)
6119873 (39)
For the Nemenyi test the value of 119902005
for eight classifiersis 3031 (see Table 5(a) of [41]) So the CD is calculated as3031radic89612 that is 3031 using (39) Since the differencebetween mean ranks of the best and worst classifier is muchgreater than the CD (see Table 3) we can conclude that thereis a significant difference between the performing abilitiesof the classifiers For comparing all classifiers with a controlclassifier (say MLP) we have applied the Bonferroni-Dunntest [40] For this test CD is calculated using the same (39)But here the value of 119902
005for eight classifiers is 2690 (see
Table 5(b) of [41]) So the CD for the Bonferroni-Dunntest is calculated as 2690radic89612 that is 2690 As the
Applied Computational Intelligence and Soft Computing 13
49555245
3125 3545 36253205 3205
3031
Differences of mean ranksDifference of mean ranksCD
0
1
2
3
4
5
6M
ean
rank
s
|R3minusR1|
|R3minusR2|
|R3minusR4|
|R3minusR5|
|R3minusR6|
|R3minusR7|
|R3minusR8|
Pairwise comparison of classifiers for the Nemenyi test
(a)
49555245
3125 3545 36253205 3205
269
Differences of mean ranksDifference of mean ranksCD
|R3minusR1|
|R3minusR2|
|R3minusR4|
|R3minusR5|
|R3minusR6|
|R3minusR7|
|R3minusR8|0
1
2
3
4
5
6
Mea
n ra
nks
Comparison of classifiers with MLP for the Bonferroni-Dunn test
(b)
Figure 6 Graphical representation of comparison of multiple classifiers for (a) the Nemenyi test and (b) the Bonferroni-Dunn test
difference between the mean ranks of any classifier and MLPis always greater than CD (see Table 3) the chosen controlclassifier performs significantly better than other classifiersfor Indo-Arabic database A graphical representation of theabovementioned post hoc tests for comparison of eight dif-ferent classifiers on Dataset 1 is shown in Figure 6 Similarlyit can also be shown for Bangla Devanagari Roman andTelugu databases that the chosen classifier (MLP) performssignificantly better than the other seven classifiers
44 Comparison among Moment Based Features For thejustification of the feature set used in the present workthe diverse combinations of six different types of momentsnamely geometric moment (F1ndashF5) moment invariant(F6ndashF12) affine moment invariant (F13ndashF18) Legendremoment (F19ndashF28) Zernike moment (F29ndashF64) and com-plex moment (F65ndashF130) are compared by considering allthe possible combinations This is done for measuring thediscriminating strength of the individual moment featuresand their combinations based on their complementary infor-mation These can be listed as follows
(a) Geometric moment + moment invariant + affineMoment invariant (F1ndashF18)
(b) Legendre moment (F19ndashF28)(c) Geometric moment + moment invariant + affine
moment invariant + Legendre moment (F1ndashF28)(d) Zernike moment (F29ndashF64)(e) Geometric moment + moment invariant + affine
moment invariant + Legendre moment + Zernikemoment (F1ndashF64)
(f) Legendre moment + Zernike moment (F19ndashF64)(g) Complex moment (F65ndashF130)(h) Zernike moment + complex moment (F29ndashF130)
86
88
90
92
94
96
98
100
Feature set and all possible combinations
ArabicBanglaDevanagari
RomanTelugu
Reco
gniti
on ac
cura
cy (
)
F1
ndashF18
F1
ndashF64
F19
ndashF28
F1
ndashF28
F29
ndashF64
F19
ndashF64
F65
ndashF130
F29
ndashF130
F1
ndashF130
Figure 7Graphical comparison showing the recognition accuraciesof all the possible combinations of moment based features achievedby MLP classifier
(i) Geometric moment + moment invariant + affinemoment invariant + Legendre moment + Zernikemoment + complex moment (F1ndashF130)
The graphical comparison of the corresponding numeralrecognition accuracies achieved by MLP classifier over thesame test set is shown in Figure 7 It can be observed fromFigure 7 that the present combination of moment feature setoutperforms all the other possible combinations
45 Detail Evaluation of MLP Classifier In the present workdetailed error analysis with respect to different parametersnamely Kappa statistics mean absolute error (MAE) root
14 Applied Computational Intelligence and Soft Computing
Table 5 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Indo-Arabic numerals (here MAE means mean absolute error RMSE means root mean square error TPR means True Positiverate FPR means False Positive rate MCC means Matthews Correlation Coefficient and AUC means Area under ROC)
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09922 01758 0293
1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0990 0001 0990 0990 0990 0989 1000ldquo2rdquo 0960 0002 0980 0960 0970 0966 0990ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0004 0961 0990 0975 0973 0999ldquo7rdquo 0990 0000 1000 0990 0995 0994 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09922 01758 0293 0993 00007 09931 0993 0993 09922 09989
Table 6 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Bangla numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09944 00535 0115
1000 0001 0990 1000 0995 0994 1000ldquo1rdquo 1000 0002 0980 1000 0990 0989 1000ldquo2rdquo 1000 0001 0990 1000 0995 0994 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0990 0001 0990 0990 0990 0989 1000ldquo6rdquo 0970 0000 1000 0970 0985 0983 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09944 00535 0115 0995 00005 0995 0995 0995 09943 1000
Table 7 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Devanagari numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
0988 00342 00847
0969 0002 0984 0969 0977 0974 0999ldquo1rdquo 0969 0000 1000 0969 0984 0983 1000ldquo2rdquo 1000 0005 0956 1000 0977 0975 1000ldquo3rdquo 0985 0003 0970 0985 0977 0975 0999ldquo4rdquo 1000 0002 0985 1000 0992 0992 1000ldquo5rdquo 0985 0000 1000 0985 0992 0991 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 0985 0000 1000 0985 0992 0991 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 0988 00342 00857 09893 00012 09895 09893 09891 09881 09998
mean square error (RMSE) True Positive rate (TPR) FalsePositive rate (FPR) precision recall 119865-measure MatthewsCorrelationCoefficient (MCC) andArea under ROC (AUC)is computed Tables 5ndash9 provide the said statistical mea-surements for handwritten numeral recognition written inIndo-Arabic Bangla Devanagari Roman and Telugu scriptsrespectively
5 Conclusion
India is a multilingual and multiscript country comprisingof 12 different scripts But there are not much competentworks done towards handwritten numeral recognition ofIndic scripts The following issues are observed with hand-written digit recognition system (1) mostly they have worked
Applied Computational Intelligence and Soft Computing 15
Table 8 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Roman numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09975 016 02716
1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0995 0001 0995 0995 0995 0994 0999ldquo2rdquo 1000 0000 1000 1000 1000 1000 1000ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0995 0000 1000 0995 0997 0997 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 0994 0001 0989 0994 0991 0990 0999ldquo9rdquo 0994 0001 0994 0994 0994 0994 0999Mean 09975 016 02716 09978 00003 09978 09978 09977 09975 09997
Table 9 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Telugu numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09867 00051 00449
0980 0001 0990 0980 0985 0983 0988ldquo1rdquo 0970 0002 0980 0970 0975 0972 0991ldquo2rdquo 1000 0002 0980 1000 0990 0989 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 0999ldquo4rdquo 0990 0002 0980 0990 0985 0983 0998ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0003 0971 0990 0980 0978 0995ldquo7rdquo 0990 0002 0980 0990 0985 0983 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 0970 0000 1000 0970 0985 0983 0994Mean 09867 00051 00449 0988 00012 09881 0988 0988 09865 09965
on limited dataset (2) Training and testing times are notmentioned in most of the works (3) Most of the works havebeen done for Roman because of the availability of largerdataset like MNIST (4) Recognition systems for Indic scriptsare mainly focused on single script (5) Limitation to somefeature extraction methods also exist that is they are localto a particular scriptlanguage rather having a global scopeIn this work we have verified the effectiveness of a momentbased approach to handwritten digit recognition problemthat includes geometric moment moment invariant affinemoment invariant Legendre moment Zernike moment andcomplex moment The present scheme has been tested forfive different popular scripts namely Indo-Arabic BanglaDevanagari Roman and Telugu These methods have beenevaluated on the CMATER and MNIST databases usingmultiple classifiers FinallyMLP classifier is found to producethe highest recognition accuracies of 993 995 98929977 and 988 on Indo-Arabic Bangla DevanagariRoman and Telugu scripts respectively The results havedemonstrated that the application ofmoment based approachleads to a higher accuracy compared to its counterpartsAmong the most important ones an advantage of thisfeature extraction algorithm is that it is less computationally
expensive where the most of the published works need morecomputation time These features are also very simple toimplement compared to other methods It is obvious thatto improve the performance of proposed system furtherwe need to investigate more the sources of errors Potentialmoment features other than the presented ones may alsoexist
To further improve the performance possible futureworks are as follows (1) although the moment based featuresperform superbly on the whole complementary featureslike concavity analysis may help in discriminating confusingnumerals For example Indo-Arabic numerals ldquo2rdquo and ldquo3rdquo canbetter be separated by considering the original size beforenormalization (2) For classifier design it is better to selectmodel parameters (classifier structures) by cross validationrather than empirically as done in our experiments (3)Combining multiple classifiers can improve the recognitionaccuracy
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
16 Applied Computational Intelligence and Soft Computing
Acknowledgments
The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India
References
[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015
[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003
[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005
[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997
[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007
[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008
[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998
[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007
[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009
[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010
[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011
[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011
[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011
[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003
[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012
[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013
[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013
[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014
[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014
[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008
[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014
[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014
[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014
[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015
[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992
[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921
[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962
[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004
Applied Computational Intelligence and Soft Computing 17
[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005
[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996
[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990
[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984
[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985
[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions
in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995
[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001
[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001
[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996
[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992
[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014
[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006
Submit your manuscripts athttpwwwhindawicom
Computer Games Technology
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Distributed Sensor Networks
International Journal of
Advances in
FuzzySystems
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014
International Journal of
ReconfigurableComputing
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied Computational Intelligence and Soft Computing
thinspAdvancesthinspinthinsp
Artificial Intelligence
HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014
Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Electrical and Computer Engineering
Journal of
Journal of
Computer Networks and Communications
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation
httpwwwhindawicom Volume 2014
Advances in
Multimedia
International Journal of
Biomedical Imaging
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
ArtificialNeural Systems
Advances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
RoboticsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Computational Intelligence and Neuroscience
Industrial EngineeringJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Human-ComputerInteraction
Advances in
Computer EngineeringAdvances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied Computational Intelligence and Soft Computing 5
three that are invariant with respect to image scale transla-tion and rotation Consider
1206011= 12057820+ 12057802
1206012= (12057820minus 12057802)2
+ 41205782
11
1206013= (12057830minus 312057812)2
+ (312057821minus 12057803)2
1206014= (12057830+ 12057812)2
+ (12057821+ 12057803)2
1206015= (12057830minus 312057812) (12057830+ 12057812)
sdot [(12057830+ 12057812)2
minus 3 (12057821+ 12057803)2
] + (312057821minus 12057803)
sdot (12057821+ 12057803) [3 (120578
30+ 12057812)2
minus (12057821+ 12057803)2
]
1206016= (12057820minus 12057802) [(12057830+ 12057812)2
minus (12057821+ 12057803)2
]
+ 412057811(12057830+ 12057812) (12057821+ 12057803)
1206017= (312057821minus 12057803) (12057830+ 12057812)
sdot [(12057830+ 12057812)2
minus 3 (12057821+ 12057803)2
] + (312057812minus 12057830)
sdot (12057821+ 12057803) [3 (120578
30+ 12057812)2
minus (12057821+ 12057803)2
]
(12)
This set of moments is invariant to translation scale changemirroring (within a minus sign) and rotation The 2Dmoment invariant gives seven features (F6ndashF12) which hadbeen used for the current work
34 Affine Moment Invariants The affine moment invariantsare derived to be invariants to translation rotation andscaling of shapes and under 2D Affine transformation Thesix affine moment invariants [28] used for the present workare defined as follows
1198681=
1
120583400
(1205832012058302minus 1205832
11)
1198682=
1
1205831000
(1205832
301205832
03minus 612058330120583211205831212058303+ 4120583301205833
12
+ 4120583031205833
21minus 31205832
211205832
12)
1198683=
1
120583700
(12058320(1205832112058303minus 1205832
12) minus 12058311(1205833012058303minus 1205832112058312)
+ 12058302(1205833012058312minus 1205832
21))
1198684=
1
1205831100
(1205833
201205832
03minus 61205832
20120583111205831212058303minus 61205832
20120583211205830212058303
+ 91205832
20120583021205832
12+ 12120583
201205832
111205830312058321
+ 61205832012058311120583021205833012058303minus 18120583
2012058311120583021205832112058312
minus 81205833
111205830312058330minus 6120583201205832
021205833012058312+ 9120583201205832
021205832
21
+ 121205832
11120583021205833012058312minus 61205832
111205832
021205833012058321+ 1205832
021205832
30)
1198685=
1
120583600
(1205834012058304minus 41205833112058313+ 31205832
22)
1198686=
1
120583900
(120583401205830412058322+ 2120583311205832212058313minus 120583401205832
13minus 120583041205832
31
minus 1205833
22)
(13)
A total of 6 features (F13ndashF18) is extracted from each of thehandwritten digit images for the present work
35 The Legendre Moment The 2D Legendre moment [29]of order (119901 + 119902) of an object with intensity function 119891(119909 119910) isdefined as follows
119871119901119902=(2119901 + 1) (2119902 + 1)
4
sdot ∬+1
minus1
119875119901(119909) 119875119902(119910) 119891 (119909 119910) 119889119909 119889119910
(14)
where the kernel function 119875119901(119909) denotes the 119901th-order
Legendre polynomial and is given by
119875119901(119909) =
119901
sum119896=0
119862119901119896[(1 minus 119909)
119896
+ (minus1)119901
(1 + 119909)119896
] (15)
where
119862119901119896=(minus1)119896
2119896+1
(119901 + 119896)
(119901 minus 119896) (119896)2 (16)
Since the Legendre polynomials are orthogonal over theinterval [minus1 1] [20] a square image of 119873 times 119873 pixels withintensity function 119891(119894 119895) with 1 le 119894 119895 le 119873 must be scaledto be within the region minus1 le 119909 119910 le 1 The graphical plotfor first 10 Legendre polynomials is shown in Figures 2(a)-2(b) When an analog image is digitized to its discrete formthe 2D Legendre moments 119871
119901119902 defined by (14) is usually
approximated by the formula
119871119901119902
=(2119901 + 1) (2119902 + 1)
(119873 minus 1)2
119873
sum119894=1
119873
sum119895=1
119875119901(119909119894) 119875119902(119910119895) 119891 (119909
119894 119910119895)
(17)
where 119909119894= (2119894minus119873minus1)(119873minus1) and 119910
119895= (2119895minus119873minus1)(119873minus1)
and for a binary image 119891(119909119894 119910119895) is given as
119891 (119909119894 119910119895) =
1 if (119894 119895) is in original object
0 otherwise(18)
As indicated by Liao and Pawlak [30] (17) is not a veryaccurate approximation of (14) For achieving better accuracythey proposed to use the following approximated form
119901119902=(2119901 + 1) (2119902 + 1)
4
119873
sum119894=1
119873
sum119895=1
ℎ119901119902(119909119894119910119895) 119891 (119909
119894 119910119895) (19)
6 Applied Computational Intelligence and Soft Computing
Legendre polynomials
minus05 0 05 1minus1
x
Pn(x)
P0(x)
P1(x)
P2(x)
P3(x)
P4(x)
P5(x)
minus1
0
02
04
06
08
1
minus08
minus06
minus04
minus02
(a)
Legendre polynomials
minus05 0 05 1minus1
x
P6(x)
P7(x)
P8(x)
P9(x)
P10(x)
minus1
0
02
04
06
08
1
minus08
minus06
minus04
minus02
Pn(x)
(b)
Figure 2 Graph showing the plots for the two-dimensional Legendre polynomials 119875119899(119909) (a) 119875
1(119909) to 119875
5(119909) and (b) 119875
6(119909) to 119875
10(119909)
where
ℎ119901119902(119909119894 119910119895) = int
119909119894+Δ1199092
119909119894minusΔ1199092
int119910119895+Δ1199102
119910119895minusΔ1199102
119875119901(119909) 119875119901(119910) 119889119909 119889119910 (20)
withΔ119909 = 119909119894minus119909119894minus1
= 2(119873minus1) andΔ119910 = 119910119895minus119910119895minus1
= 2(119873minus1)To evaluate the double integral ℎ
119901119902(119909119894 119910119895) defined by (20)
an alternative extended Simpson rule was proposed by Liaoand Pawlak These values were then used to calculate the2D Legendre moments
119901119902defined by (19) Therefore this
method requires a large number of computing operations Asone can see
119901119902can be expressed with the help of a useful
formula that will be given below as a linear combination of119871119898119899 with 0 le 119898 le 119901 0 le 119899 le 119902A set of 10 Legendre moments (F19ndashF28) can also be
derived based on the set of invariant moments found in theprevious subsection
11987100= 12060100
11987110= (
3
4) 12060110
11987101= (
3
4) 12060101
11987120= (
5
4) [(
3
2) 12060120minus (
1
2) 12060100]
11987102= (
5
4) [(
3
2) 12060102minus (
1
2) 12060100]
11987111= (
9
4) 12060111
11987130= (
7
4) [(
5
2) 12060130minus (
3
2) 12060110]
11987103= (
7
4) [(
5
2) 12060103minus (
3
2) 12060101]
11987121= (
15
4) [(
3
2) 12060121minus (
1
2) 12060101]
11987112= (
15
4) [(
3
2) 12060112minus (
1
2) 12060110]
(21)
36 Zernike Moments Zernike polynomials are orthogonalseries of basis functions normalized over a unit circle Thecomplexity of these polynomials increases with increasingpolynomial order [31] To calculate the Zernikemoments theimage (or region of interest) is first mapped to the unit discusing polar coordinates where the center of the image is theorigin of the unit disc The pixels falling outside the unit discare not considered here The coordinates are then describedby the length of the vector from the origin to the coordinatepoint The mapping from Cartesian to polar coordinates isdefined as follows
119909 = 119903 cos 120579
119910 = 119903 sin 120579(22)
Applied Computational Intelligence and Soft Computing 7
where
119903 = radic1199092 + 1199102
120579 = tanminus1 (119910
119909)
(23)
An important attribute of the geometric representationsof Zernike polynomials is that lower order polynomialsapproximate the global features of the shapesurface whilethe higher ordered polynomials capture local shapesurfacefeatures Zernikemoments are a class of orthogonalmomentsand have been shown to be effective in terms of imagerepresentation
Zernike introduced a set of complex polynomials whichforms a complete orthogonal set over the interior of the unitcircle that is 1199092 + 1199102 = 1 Let the set of these polynomials bedenoted by 119881
119899119898(119909 119910) The form of these polynomials is as
follows
119881119899119898(119909 119910) = 119881
119899119898(119901 120579) = 119877
119899119898(120588) exp (119895119898120579) (24)
where
119899 positive integer or zero
119898 positive and negative integers subject to con-straints 119899 minus |119898| even |119898| le 119899
120588 length of vector from origin to 119891(119909 119910) pixel
120579 angle between vector 120588 and 119909-axis in counterclock-wise direction
As mentioned above the complex Zernike moments of order119899 with repetition 119898 for a continuous image function 119891(119909 119910)are defined as follows
119881119899119898
=119899 + 1
120587∬119891(119909 119910)119881
lowast
119899119898(120588 120579) 119889119909 119889119910 (25)
in the 119909119910 image plane where 1199092 + 1199102 le 1 and lowast indicatesthe complex conjugate Note that for the moments to beorthogonal the image must be scaled within a unit circlecentered at the origin and
119881119899119898
=119899 + 1
120587int2120587
0
int1
0
119891 (120588 120579) 119877119899119898(120588) exp (minus119895119898120579) 120588 119889120588 119889120579
(26)
in polar coordinates The Zernike moment of the rotatedimage in the same coordinates is given by
119881119903
119899119898=119899 + 1
120587
sdot int2120587
0
int1
0
119891 (120588 120579 minus 120572) 119877119899119898(120588) exp (minus119895119898120579) 120588 119889120588 119889120579
(27)
By change of variable 1205791= 120579 minus 120572
119881119903
119899119898=119899 + 1
120587int2120587
0
int1
0
119891 (120588 1205791) 119877119899119898(120588)
sdot exp (minus119895119898 (1205791+ 120572)) 120588 119889120588 119889120579 = [
119899 + 1
120587
sdot int2120587
0
int1
0
119891 (120588 1205791) 119877119899119898(120588) exp (minus119895119898120579
1) 120588 119889120588 119889120579]
sdot exp (minus119895119898120572) = 119881119899119898
exp (minus119895119898120572)
(28)
Equation (28) shows that Zernike moments have simplerotational transformation properties each Zernike momentmerely acquires a phase shift on rotation This simpleproperty leads to the conclusion that the magnitudes ofthe Zernike moments of a rotated image function remainidentical to those before rotationThus the magnitude of theZernike moment |119881
119899119898| can be taken as a rotation invariant
feature of the underlying image function The real-valuedradial polynomial 119877
119899119898(120588) is defined as follows
119877119899119898(120588) =
(119899minus|119898|)2
sum119909=0
(minus1)119904
sdot(119899 minus 119904)
119904 ((119899 + |119898|) 2 minus 119904) ((119899 minus |119898|) 2 minus 119904)120588119899minus2119904
(29)
where 119899 minus |119898| = even and |119898| le 119899Zernike moments may also be derived from conventional
moments 120583119901119902
as follows
119885119899119897=(119899 + 1)
120587
sdot
119899
sum119896=119897
119902
sum119895=0
119897
sum119898=0
(minus119894)119898
(119902
119895)(
119897
119898)119861119899119897119896120583119896minus2119895minus119897+1198982119895+119897minus119898
(30)
Zernikemomentsmay bemore easily derived from rotationalmoments119863
119899119896 by
119885119899119897=
119899
sum119896=119897
119861119899119897119896119863119899119896 (31)
When computing the Zernike moments if the center of apixel falls inside the border of unit disk 119909
2
+ 1199102
le 1this pixel will be used in the computation otherwise thepixel will be discarded Therefore the area covered by themoment computation is not exactly the area of the unitdisk Advantages of Zernike moments can be summarized asfollows
(1) The magnitude of Zernike moment has rotationalinvariant property
(2) They are robust to noise and shape variations to someextent
(3) Since the basis is orthogonal they have minimumredundant information
8 Applied Computational Intelligence and Soft Computing
Table 1 List of Zernike moments and their corresponding numbersof features from order 0 to order 10
Order(119899)
Zernike moments of order 119899with repetition119898 (119881
119899119898)
Total number ofmoments up to order 10
0 11988100
36
1 11988111
2 11988120 11988122
3 11988131 11988133
4 11988140 11988142 11988144
5 11988151 11988153 11988155
6 11988160 11988162 11988164 11988166
7 11988171 11988173 11988175 11988177
8 11988180 11988182 119881841198818611988188
9 11988191 11988193 119881951198819711988199
10 119881100 119881102 119881104 119881106 119881108 1198811010
(4) An image can better be described by a small set of itsZernike moments than any other types of momentssuch as geometric moments
(5) A relatively small set of Zernike moments can char-acterize the global shape of pattern Lower ordermoments represent the global shape of patternwhereas the higher order moments represent thedetails
Therefore we choose Zernike moments as our shape descrip-tor in digit recognition process Table 1 lists the rotationinvariant Zernike moment features (F29ndashF64) and theircorresponding numbers from order 0 to order 10 used for thepresent work
The defined features on the Zernike moments are onlyrotation invariant To obtain scale and translation invariancethe digit image is first subjected to a normalization processusing its regular moments The rotation invariant Zernikefeatures are then extracted from the scale and translationnormalized image
37 Complex Moments The notion of complex moments wasintroduced in [32] as a simple and straightforward techniqueto derive a set of invariant moments The two-dimensionalcomplex moments of order (119901 119902) for the image function119891(119909 119910) are defined by
119862119901119902= int1198862
1198861
int1198872
1198871
(119909 + 119895119910)119901
(119909 minus 119895119910)119902
119891 (119909 119910) 119889119909 119889119910 (32)
where 119901 and 119902 are nonnegative integers and 119895 = radicminus1 Someadvantages of the complex moments can be described asfollows
(1) When the central complex moments are taken as thefeatures the effects of the imagersquos lateral displacementcan be eliminated
(2) A set of complex moment invariants can also bederived which are invariant to the rotation of theobject
(3) Since the complex moment is an intermediate stepbetween ordinary moments and moment invariantsit is relatively more simple to compute and morepowerful than other moment features in any patternclassification problem
The complex moments of order (119901 119902) are a linear combina-tion with complex coefficients of all the geometric moments119872119899119898 satisfying 119901 + 119902 = 119899 + 119898 In polar coordinates the
complex moments of order (119901 + 119902) can be written as follows
119862119897
119899= 119862119901119902
= int2120587
0
int+infin
0
120588119901+119902
119890119895(119901minus119902)120579
119891 (120588 cos 120579 120588 sin 120579) 120588 119889120588 119889120579(33)
where119901+119902 = 119899 and119901minus119902 = 119897denote the order and repetition ofthe complex moments respectively If the complex momentof the original image and that of the rotated image in thesame polar coordinates are denoted by 119862
119901119902and 119862
119903
119901119902 the
relationship [33] between them is given as follows
119862119903
119901119902= 119862119901119902119890minus119895(119901minus119902)120579
(34)
where 120579 is the angle at which the original image is rotatedThecomplex moment features represent the invariant propertiesto lateral displacement and rotation Based on the definitionof moment invariants we know that as the image is rotatedeach complex moment goes through all possible phasesof a complex number while its magnitude |119862
119901119902| remains
unchanged If the exponential factor of the complex momentis canceled out we will obtain its absolute invariant valuewhich is invariant to the rotation of the images The rotationinvariant complex moment features (F65ndashF130) and theircorresponding numbers from order 0 to order 10 used for thepresent work are listed in Table 2
Finally a feature vector consisting of 130 moment basedfeatures is calculated from each of the handwritten numeralimages belonging to five different scripts Summarization ofthe overall moment based feature set used in the present workis enlisted in Table 3
4 Experimental Study and Analysis
In this section we present the detailed experimental resultsto illustrate the suitability of moment based approach tohandwritten digit recognition All the experiments are imple-mented inMATLAB 2010 under aWindows XP environmenton an Intel Core2 Duo 24GHz processor with 1 GB of RAMand performed on gray-scale digit imagesThe accuracy usedas assessment criteria for measuring the performance of theproposed system is expressed as follows
Accuracy Rate () =Number of Correctly classified digits
Total number of digitstimes 100 (35)
Applied Computational Intelligence and Soft Computing 9
Table 2 List of complex moments and their corresponding numbers of features from order 0 to order 10
Order(119899) Complex moments (119862
119901119902) Complex moments of order 119899 with repetition 119897 (119862119897
119899) Total number of moments
up to order 100 119862
00 1198620
0
66
1 11986210 11986201 119862
1
1 119862minus1
1
2 11986220 11986211 11986202 119862
2
2 1198620
2 119862minus2
2
3 11986230 11986221 11986212 11986203 119862
3
3 1198621
3 119862minus1
3119862minus3
3
4 11986240 11986231 11986222 11986213 11986204 119862
4
4 1198622
4 1198620
4 119862minus2
4 119862minus4
4
5 11986250 11986241 11986232 11986223 11986214 11986205 119862
5
5 1198623
5 1198621
5 119862minus1
5 119862minus3
5 119862minus5
5
6 11986260 11986251 11986242 11986233 11986224 11986215 11986206 119862
6
6 1198624
6 1198622
6 1198620
6 119862minus2
6 119862minus4
6 119862minus6
6
7 11986270 11986261 11986252 11986243 11986234 11986225 11986216 11986207 119862
7
7 1198625
7 1198623
7 1198621
7 119862minus1
7 119862minus3
7 119862minus4
7 119862minus7
7
8 11986280 11986271 11986262 11986253 11986244 11986235 11986226 11986217 11986208 119862
8
8 1198626
8 1198624
8 1198622
8 1198620
8 119862minus2
8 119862minus4
8 119862minus6
8 119862minus8
8
9 11986290 11986281 11986272 11986263 11986254 11986245 11986236 11986227 11986218 11986209 119862
9
9 1198627
9 1198625
9 1198623
9 1198621
9 119862minus1
9 119862minus3
9 119862minus5
9 119862minus7
9 119862minus9
9
10 119862100 11986291 11986282 11986273 11986264 11986255 11986246 11986237 11986228 11986219 119862010 119862
10
10 1198628
10 1198626
10 1198624
10 1198622
10 1198620
10 119862minus2
10 119862minus4
10 119862minus6
10 119862minus8
10 119862minus10
10
Table 3 Description of feature vector
Serialnumber List of moments Number of
features1 Geometric moment (F1ndashF5) 52 Moment invariant (F6ndashF12) 73 Affine moment invariant (F13ndashF18) 64 Legendre moment (F19ndashF28) 105 Zernike moment (F29ndashF64) 366 Complex moment (F65ndashF130) 66
Total 130
41 Detailed Dataset Description Handwritten numeralsfrom five different popular scripts namely Indo-ArabicBangla Devanagari Roman and Telugu are used in theexperiments for investigating the effectiveness of themomentbased feature sets as compared to conventional features Indo-Arabic or Eastern-Arabic is widely used in the Middle-Eastand also in the Indian subcontinent On the other handDevanagari and Bangla are ranked as the top two popular (interms of the number of native speakers) scripts in the Indiansubcontinent [34] Roman originally evolved from the Greekalphabet is spoken and used all over the world Also Teluguone of the oldest andpopular South Indian languages of Indiais spoken by more than 74 million people [34] It essentiallyranks third by the number of native speakers in India
The present approach is tested on the database named asCMATERdb3 where CMATER stands for Center for Micro-processor Application for Training Education and Researcha research laboratory at Computer Science and EngineeringDepartment of Jadavpur University India where the currentresearch activity took place db stands for database andthe numeric value 3 represents handwritten digit recogni-tion database stored in the said database repository Thetesting is currently done on four versions of CMATERdb3namely CMATERdb311 CMATERdb321 CMATERdb331and CMATERdb341 representing the databases created forhandwritten digit recognition system for four major scriptsnamely BanglaDevanagari Indo-Arabic and Telugu respec-tively
Each of the digit images are first preprocessed using basicoperations of skew corrections and morphological filtering[25] and then binarized using an adaptive global thresholdvalue computed as the average of minimum and maximumintensities in that imageThe binarized digit images may con-tain noisy pixels which have been removed by using Gaussianfilter [25] A well-known algorithm known as Canny EdgeDetection algorithm [25] is then applied for smoothing theedges of the binarized digit images Finally the boundingrectangular box of each digit image is separately normalizedto 32 times 32 pixels Database is made available freely in theCMATER website (httpwwwcmaterjuorgcmaterdbhtm)and at httpcodegooglecompcmaterdb
A dataset of 3000 digit samples is considered for each ofthe Devanagari Indo-Arabic and Telugu scripts For each ofthese datasets 2000 samples are used for training purposeand the rest of the samples are used for the test purposewhereas a dataset of 6000 samples is used by selecting 600samples for each of 10-digit classes of handwritten Bangladigits A training set of 4000 samples and a test set of 2000samples are then chosen for Bangla numerals by consideringequal number of digit samples from each class For Romannumerals a dataset of 6000 training samples is formed byrandom selection from the standard handwritten MNIST [7]training dataset of size 60000 samples In the same way4000 digit samples are selected from MNIST test dataset ofsize 10000 samples These digit samples are enclosed in aminimum bounding square and are normalized to 32 times 32pixels dimension Typical handwritten digit samples takenfrom the abovementioned databases used for evaluating thepresent work are shown in Figure 3
42 Recognition Process To realize the effectiveness of theproposed approach our comprehensive experimental testsare conducted on the five aforementioned datasets A totalof 6000 (for Devanagari Indo-Arabic and Telugu scripts)numerals have been used for the training purpose whereasthe remaining 3000 numerals (1000 from each of the script)have been used for the testing purpose For Bangla andRoman scripts a total of 8000 numerals (4000 taken from
10 Applied Computational Intelligence and Soft Computing
(a) (b)
(c) (d)
(e)
Figure 3 Samples of digit images taken from CMATER and MNIST databases written in five different scripts (a) Indo-Arabic (b) Bangla(c) Devanagari (d) Roman and (e) Telugu
each script) have been used for the training purpose whereasthe remaining 4000 numerals (2000 taken from each script)have been used for the testing purpose The designed featureset has been individually applied to eight well-known classi-fiers namely Naıve Bayes Bayes Net MLP SVM RandomForest Bagging Multiclass Classifier and Logistic For thepresent work the following abovementioned classifiers withthe given parameters are designed
Naıve Bayes Naıve Bayes classifier for details refer to[35]
Bayes Net Estimator = SimpleEstimator-A 05 searchalgorithm = K2MLP Learning Rate = 03Momentum= 02 Numberof Epochs = 1000 minerror = 002SVM Support Vector Machine using radial basiskernel with (119901 = 1) for details refer to [36]Random Forest Ensemble classifier that consists ofmany decision trees and outputs the class that is themode of the classes output by individual trees fordetails refer to [37]
Applied Computational Intelligence and Soft Computing 11
Table 4 Recognition accuracies of eight classifiers and their corresponding ranks using 12 different datasets (ranks in the parentheses areused for performing the Friedman test)
DatasetsRecognition accuracy ()
ClassifiersNaıve Bayes Bayes Net MLP SVM Random Forest Bagging Multiclass Classifier Logistic
1 86 (8) 92 (7) 100 (1) 99 (25) 97 (6) 99 (25) 98 (45) 98 (45)
2 99 (3) 98 (55) 100 (1) 99 (3) 96 (75) 96 (75) 97 (55) 99 (3)
3 98 (5) 94 (7) 100 (1) 93 (8) 99 (25) 98 (5) 99 (25) 98 (5)
4 93 (8) 94 (7) 99 (15) 98 (35) 98 (35) 97 (5) 96 (6) 99 (15)
5 91 (8) 92 (7) 100 (15) 99 (35) 99 (35) 97 (6) 98 (5) 100 (15)
6 99 (25) 98 (45) 100 (1) 96 (7) 97 (6) 98 (45) 99 (25) 95 (8)
7 91 (8) 94 (7) 100 (1) 99 (3) 99 (3) 99 (3) 98 (55) 98 (55)
8 91 (8) 96 (7) 100 (1) 98 (45) 98 (45) 97 (6) 99 (25) 99 (25)
9 92 (8) 94 (7) 100 (15) 100 (15) 97 (6) 99 (4) 99 (4) 99 (4)
10 99 (25) 97 (65) 100 (1) 99 (25) 97 (65) 97 (65) 98 (4) 97 (65)
11 94 (8) 96 (7) 100 (1) 98 (5) 99 (3) 98 (5) 99 (3) 99 (3)
12 98 (4) 98 (4) 100 (1) 97 (7) 98 (4) 99 (2) 97 (7) 97 (7)
Mean rank R1 = 608 R2 = 637 R3 = 1125 R4 = 425 R5 = 467 R6 = 475 R7 = 433 R8 = 433
Bagging Bagging Classifier for detail refer to [38]Multiclass Classifier Method = ldquo1 against allrdquo ran-domWidthFactor = 20 seed = 1Logistic LogitBoost is used with simple regressionfunctions as base learner for details refer to [39]
The design parameters of classifiers are chosen as typicalvalues used in the literature or by experience The classifiersare not specifically tuned for the dataset at hand eventhough they may achieve a better performance with anotherparameter set since the goal is to design an automatedhandwritten digit recognition system based on the chosen setof classifiers
The digit recognition performances of the present tech-nique using each of these classifiers and their correspondingsuccess rates achieved at 95 confidence level are shownin Figures 4(a)-4(b) respectively It can be seen fromFigure 4 that the highest digit recognition accuracy hasbeen achieved by the MLP classifier which are found to be993 995 9892 9977 and 988 on Indo-ArabicBangla Devanagari Roman and Telugu scripts respectivelyThe performance analysis involves two parameters namelyModel Building Time (MBT) and Recognition Time (RT)MBT is based on the time required to train the system onthe given training samples whereas RT is based on the timerequired to recognize the given test samples The MBT andRT required for the abovementioned classifiers on all the fivedatabases are shown in Figures 5(a)-5(b)
43 Statistical Significance Tests The statistical significancetest is one of the essential ways for validating the performanceof themultiple classifiers usingmultiple datasets To do so wehave performed a safe and robust nonparametric Friedmantest [40] with the corresponding post hoc tests on Indo-Arabic script database For the present experimental setupthe number of datasets (119873) and the number of classifiers (119896)are set as 12 and 8 respectively These datasets are chosenrandomly from the test setTheperformances of the classifiers
on different datasets are shown in Table 4 On the basis ofthese performances the classifiers are then ranked for eachdataset separately the best performing algorithm gets therank 1 the second best gets rank 2 and so on (see Table 4)In case of ties average ranks are assigned to the classifiers tobreak the tie
Let 119903119894119895be the rank of the 119895th classifier on 119894th datasetThen
the mean of the ranks of the 119895th classifier over all the 119873datasets will be computed as follows
119877119895=1
119873
119873
sum119894=1
119903119894
119895 (36)
The null hypothesis states that all the classifiers are equivalentand so their ranks 119877
119895should be equal To justify it the
Friedman statistic [40] is computed as follows
1205942
119865=
12119873
119896 (119896 + 1)[
[
sum119895
1198772
119895minus119896 (119896 + 1)
2
4]
]
(37)
Under the current experimentation this statistic is dis-tributed according to 1205942
119865with 119896 minus 1 (=7) degrees of freedom
Using (37) the value of 1205942119865is calculated as 3046 From the
table of critical values (see any standard statistical book) thevalue of 1205942
119865with 7 degrees of freedom is 140671 for 120572 = 005
(where 120572 is known as level of significance) It can be seen thatthe computed 1205942
119865differs significantly from the standard 1205942
119865
So the null hypothesis is rejectedSingh et al [40] derived a better statistic using the
following formula
119865119865=
(119873 minus 1) 1205942
119865
119873(119896 minus 1) minus 1205942119865
(38)
119865119865is distributed according to the 119865-distribution with 119896 minus 1
(=7) and (119896minus1)(119873minus1) (=77) degrees of freedom Using (38)the value of 119865
119865is calculated as 80659 The critical value of 119865
(7 77) for 120572 = 005 is 2147 (see any standard statistical book)
12 Applied Computational Intelligence and Soft Computing
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
8486889092949698
100
Reco
gniti
on ac
cura
cy (
)
(a)
Classifiers
ArabicBanglaDevanagari
TeluguRoman
888990919293949596979899
100
95
confi
denc
e sco
re (
)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(b)
Figure 4 Graph showing (a) recognition accuracies and (b) 95 confidence scores of the proposed handwritten digit recognition techniqueusing eight well-known classifiers on digits of five different scripts
05
10152025303540
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Mod
el b
uild
ing
time (
s)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(a)
0
1
2
3
4
5
6
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Reco
gniti
on ti
me (
s)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(b)
Figure 5 Graphical comparison of (a) MBTs and (b) RTs required by eight different classifiers on all the five databases for handwritten digitrecognition
which shows a significant difference between the standardand calculated values of 119865
119865 Thus both Friedman and Iman
et al statistics reject the null hypothesisAs the null hypothesis is rejected a post hoc test known
as the Nemenyi test [40] is carried out for pairwise com-parisons of the best and worst performing classifiers Theperformances of two classifiers are significantly different ifthe corresponding average ranks differ by at least the criticaldifference (CD) which is expressed as follows
CD = 119902120572
radic119896 (119896 + 1)
6119873 (39)
For the Nemenyi test the value of 119902005
for eight classifiersis 3031 (see Table 5(a) of [41]) So the CD is calculated as3031radic89612 that is 3031 using (39) Since the differencebetween mean ranks of the best and worst classifier is muchgreater than the CD (see Table 3) we can conclude that thereis a significant difference between the performing abilitiesof the classifiers For comparing all classifiers with a controlclassifier (say MLP) we have applied the Bonferroni-Dunntest [40] For this test CD is calculated using the same (39)But here the value of 119902
005for eight classifiers is 2690 (see
Table 5(b) of [41]) So the CD for the Bonferroni-Dunntest is calculated as 2690radic89612 that is 2690 As the
Applied Computational Intelligence and Soft Computing 13
49555245
3125 3545 36253205 3205
3031
Differences of mean ranksDifference of mean ranksCD
0
1
2
3
4
5
6M
ean
rank
s
|R3minusR1|
|R3minusR2|
|R3minusR4|
|R3minusR5|
|R3minusR6|
|R3minusR7|
|R3minusR8|
Pairwise comparison of classifiers for the Nemenyi test
(a)
49555245
3125 3545 36253205 3205
269
Differences of mean ranksDifference of mean ranksCD
|R3minusR1|
|R3minusR2|
|R3minusR4|
|R3minusR5|
|R3minusR6|
|R3minusR7|
|R3minusR8|0
1
2
3
4
5
6
Mea
n ra
nks
Comparison of classifiers with MLP for the Bonferroni-Dunn test
(b)
Figure 6 Graphical representation of comparison of multiple classifiers for (a) the Nemenyi test and (b) the Bonferroni-Dunn test
difference between the mean ranks of any classifier and MLPis always greater than CD (see Table 3) the chosen controlclassifier performs significantly better than other classifiersfor Indo-Arabic database A graphical representation of theabovementioned post hoc tests for comparison of eight dif-ferent classifiers on Dataset 1 is shown in Figure 6 Similarlyit can also be shown for Bangla Devanagari Roman andTelugu databases that the chosen classifier (MLP) performssignificantly better than the other seven classifiers
44 Comparison among Moment Based Features For thejustification of the feature set used in the present workthe diverse combinations of six different types of momentsnamely geometric moment (F1ndashF5) moment invariant(F6ndashF12) affine moment invariant (F13ndashF18) Legendremoment (F19ndashF28) Zernike moment (F29ndashF64) and com-plex moment (F65ndashF130) are compared by considering allthe possible combinations This is done for measuring thediscriminating strength of the individual moment featuresand their combinations based on their complementary infor-mation These can be listed as follows
(a) Geometric moment + moment invariant + affineMoment invariant (F1ndashF18)
(b) Legendre moment (F19ndashF28)(c) Geometric moment + moment invariant + affine
moment invariant + Legendre moment (F1ndashF28)(d) Zernike moment (F29ndashF64)(e) Geometric moment + moment invariant + affine
moment invariant + Legendre moment + Zernikemoment (F1ndashF64)
(f) Legendre moment + Zernike moment (F19ndashF64)(g) Complex moment (F65ndashF130)(h) Zernike moment + complex moment (F29ndashF130)
86
88
90
92
94
96
98
100
Feature set and all possible combinations
ArabicBanglaDevanagari
RomanTelugu
Reco
gniti
on ac
cura
cy (
)
F1
ndashF18
F1
ndashF64
F19
ndashF28
F1
ndashF28
F29
ndashF64
F19
ndashF64
F65
ndashF130
F29
ndashF130
F1
ndashF130
Figure 7Graphical comparison showing the recognition accuraciesof all the possible combinations of moment based features achievedby MLP classifier
(i) Geometric moment + moment invariant + affinemoment invariant + Legendre moment + Zernikemoment + complex moment (F1ndashF130)
The graphical comparison of the corresponding numeralrecognition accuracies achieved by MLP classifier over thesame test set is shown in Figure 7 It can be observed fromFigure 7 that the present combination of moment feature setoutperforms all the other possible combinations
45 Detail Evaluation of MLP Classifier In the present workdetailed error analysis with respect to different parametersnamely Kappa statistics mean absolute error (MAE) root
14 Applied Computational Intelligence and Soft Computing
Table 5 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Indo-Arabic numerals (here MAE means mean absolute error RMSE means root mean square error TPR means True Positiverate FPR means False Positive rate MCC means Matthews Correlation Coefficient and AUC means Area under ROC)
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09922 01758 0293
1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0990 0001 0990 0990 0990 0989 1000ldquo2rdquo 0960 0002 0980 0960 0970 0966 0990ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0004 0961 0990 0975 0973 0999ldquo7rdquo 0990 0000 1000 0990 0995 0994 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09922 01758 0293 0993 00007 09931 0993 0993 09922 09989
Table 6 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Bangla numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09944 00535 0115
1000 0001 0990 1000 0995 0994 1000ldquo1rdquo 1000 0002 0980 1000 0990 0989 1000ldquo2rdquo 1000 0001 0990 1000 0995 0994 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0990 0001 0990 0990 0990 0989 1000ldquo6rdquo 0970 0000 1000 0970 0985 0983 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09944 00535 0115 0995 00005 0995 0995 0995 09943 1000
Table 7 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Devanagari numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
0988 00342 00847
0969 0002 0984 0969 0977 0974 0999ldquo1rdquo 0969 0000 1000 0969 0984 0983 1000ldquo2rdquo 1000 0005 0956 1000 0977 0975 1000ldquo3rdquo 0985 0003 0970 0985 0977 0975 0999ldquo4rdquo 1000 0002 0985 1000 0992 0992 1000ldquo5rdquo 0985 0000 1000 0985 0992 0991 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 0985 0000 1000 0985 0992 0991 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 0988 00342 00857 09893 00012 09895 09893 09891 09881 09998
mean square error (RMSE) True Positive rate (TPR) FalsePositive rate (FPR) precision recall 119865-measure MatthewsCorrelationCoefficient (MCC) andArea under ROC (AUC)is computed Tables 5ndash9 provide the said statistical mea-surements for handwritten numeral recognition written inIndo-Arabic Bangla Devanagari Roman and Telugu scriptsrespectively
5 Conclusion
India is a multilingual and multiscript country comprisingof 12 different scripts But there are not much competentworks done towards handwritten numeral recognition ofIndic scripts The following issues are observed with hand-written digit recognition system (1) mostly they have worked
Applied Computational Intelligence and Soft Computing 15
Table 8 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Roman numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09975 016 02716
1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0995 0001 0995 0995 0995 0994 0999ldquo2rdquo 1000 0000 1000 1000 1000 1000 1000ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0995 0000 1000 0995 0997 0997 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 0994 0001 0989 0994 0991 0990 0999ldquo9rdquo 0994 0001 0994 0994 0994 0994 0999Mean 09975 016 02716 09978 00003 09978 09978 09977 09975 09997
Table 9 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Telugu numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09867 00051 00449
0980 0001 0990 0980 0985 0983 0988ldquo1rdquo 0970 0002 0980 0970 0975 0972 0991ldquo2rdquo 1000 0002 0980 1000 0990 0989 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 0999ldquo4rdquo 0990 0002 0980 0990 0985 0983 0998ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0003 0971 0990 0980 0978 0995ldquo7rdquo 0990 0002 0980 0990 0985 0983 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 0970 0000 1000 0970 0985 0983 0994Mean 09867 00051 00449 0988 00012 09881 0988 0988 09865 09965
on limited dataset (2) Training and testing times are notmentioned in most of the works (3) Most of the works havebeen done for Roman because of the availability of largerdataset like MNIST (4) Recognition systems for Indic scriptsare mainly focused on single script (5) Limitation to somefeature extraction methods also exist that is they are localto a particular scriptlanguage rather having a global scopeIn this work we have verified the effectiveness of a momentbased approach to handwritten digit recognition problemthat includes geometric moment moment invariant affinemoment invariant Legendre moment Zernike moment andcomplex moment The present scheme has been tested forfive different popular scripts namely Indo-Arabic BanglaDevanagari Roman and Telugu These methods have beenevaluated on the CMATER and MNIST databases usingmultiple classifiers FinallyMLP classifier is found to producethe highest recognition accuracies of 993 995 98929977 and 988 on Indo-Arabic Bangla DevanagariRoman and Telugu scripts respectively The results havedemonstrated that the application ofmoment based approachleads to a higher accuracy compared to its counterpartsAmong the most important ones an advantage of thisfeature extraction algorithm is that it is less computationally
expensive where the most of the published works need morecomputation time These features are also very simple toimplement compared to other methods It is obvious thatto improve the performance of proposed system furtherwe need to investigate more the sources of errors Potentialmoment features other than the presented ones may alsoexist
To further improve the performance possible futureworks are as follows (1) although the moment based featuresperform superbly on the whole complementary featureslike concavity analysis may help in discriminating confusingnumerals For example Indo-Arabic numerals ldquo2rdquo and ldquo3rdquo canbetter be separated by considering the original size beforenormalization (2) For classifier design it is better to selectmodel parameters (classifier structures) by cross validationrather than empirically as done in our experiments (3)Combining multiple classifiers can improve the recognitionaccuracy
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
16 Applied Computational Intelligence and Soft Computing
Acknowledgments
The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India
References
[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015
[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003
[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005
[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997
[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007
[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008
[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998
[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007
[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009
[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010
[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011
[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011
[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011
[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003
[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012
[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013
[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013
[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014
[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014
[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008
[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014
[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014
[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014
[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015
[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992
[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921
[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962
[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004
Applied Computational Intelligence and Soft Computing 17
[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005
[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996
[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990
[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984
[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985
[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions
in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995
[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001
[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001
[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996
[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992
[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014
[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006
Submit your manuscripts athttpwwwhindawicom
Computer Games Technology
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Distributed Sensor Networks
International Journal of
Advances in
FuzzySystems
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014
International Journal of
ReconfigurableComputing
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied Computational Intelligence and Soft Computing
thinspAdvancesthinspinthinsp
Artificial Intelligence
HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014
Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Electrical and Computer Engineering
Journal of
Journal of
Computer Networks and Communications
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation
httpwwwhindawicom Volume 2014
Advances in
Multimedia
International Journal of
Biomedical Imaging
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
ArtificialNeural Systems
Advances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
RoboticsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Computational Intelligence and Neuroscience
Industrial EngineeringJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Human-ComputerInteraction
Advances in
Computer EngineeringAdvances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
6 Applied Computational Intelligence and Soft Computing
Legendre polynomials
minus05 0 05 1minus1
x
Pn(x)
P0(x)
P1(x)
P2(x)
P3(x)
P4(x)
P5(x)
minus1
0
02
04
06
08
1
minus08
minus06
minus04
minus02
(a)
Legendre polynomials
minus05 0 05 1minus1
x
P6(x)
P7(x)
P8(x)
P9(x)
P10(x)
minus1
0
02
04
06
08
1
minus08
minus06
minus04
minus02
Pn(x)
(b)
Figure 2 Graph showing the plots for the two-dimensional Legendre polynomials 119875119899(119909) (a) 119875
1(119909) to 119875
5(119909) and (b) 119875
6(119909) to 119875
10(119909)
where
ℎ119901119902(119909119894 119910119895) = int
119909119894+Δ1199092
119909119894minusΔ1199092
int119910119895+Δ1199102
119910119895minusΔ1199102
119875119901(119909) 119875119901(119910) 119889119909 119889119910 (20)
withΔ119909 = 119909119894minus119909119894minus1
= 2(119873minus1) andΔ119910 = 119910119895minus119910119895minus1
= 2(119873minus1)To evaluate the double integral ℎ
119901119902(119909119894 119910119895) defined by (20)
an alternative extended Simpson rule was proposed by Liaoand Pawlak These values were then used to calculate the2D Legendre moments
119901119902defined by (19) Therefore this
method requires a large number of computing operations Asone can see
119901119902can be expressed with the help of a useful
formula that will be given below as a linear combination of119871119898119899 with 0 le 119898 le 119901 0 le 119899 le 119902A set of 10 Legendre moments (F19ndashF28) can also be
derived based on the set of invariant moments found in theprevious subsection
11987100= 12060100
11987110= (
3
4) 12060110
11987101= (
3
4) 12060101
11987120= (
5
4) [(
3
2) 12060120minus (
1
2) 12060100]
11987102= (
5
4) [(
3
2) 12060102minus (
1
2) 12060100]
11987111= (
9
4) 12060111
11987130= (
7
4) [(
5
2) 12060130minus (
3
2) 12060110]
11987103= (
7
4) [(
5
2) 12060103minus (
3
2) 12060101]
11987121= (
15
4) [(
3
2) 12060121minus (
1
2) 12060101]
11987112= (
15
4) [(
3
2) 12060112minus (
1
2) 12060110]
(21)
36 Zernike Moments Zernike polynomials are orthogonalseries of basis functions normalized over a unit circle Thecomplexity of these polynomials increases with increasingpolynomial order [31] To calculate the Zernikemoments theimage (or region of interest) is first mapped to the unit discusing polar coordinates where the center of the image is theorigin of the unit disc The pixels falling outside the unit discare not considered here The coordinates are then describedby the length of the vector from the origin to the coordinatepoint The mapping from Cartesian to polar coordinates isdefined as follows
119909 = 119903 cos 120579
119910 = 119903 sin 120579(22)
Applied Computational Intelligence and Soft Computing 7
where
119903 = radic1199092 + 1199102
120579 = tanminus1 (119910
119909)
(23)
An important attribute of the geometric representationsof Zernike polynomials is that lower order polynomialsapproximate the global features of the shapesurface whilethe higher ordered polynomials capture local shapesurfacefeatures Zernikemoments are a class of orthogonalmomentsand have been shown to be effective in terms of imagerepresentation
Zernike introduced a set of complex polynomials whichforms a complete orthogonal set over the interior of the unitcircle that is 1199092 + 1199102 = 1 Let the set of these polynomials bedenoted by 119881
119899119898(119909 119910) The form of these polynomials is as
follows
119881119899119898(119909 119910) = 119881
119899119898(119901 120579) = 119877
119899119898(120588) exp (119895119898120579) (24)
where
119899 positive integer or zero
119898 positive and negative integers subject to con-straints 119899 minus |119898| even |119898| le 119899
120588 length of vector from origin to 119891(119909 119910) pixel
120579 angle between vector 120588 and 119909-axis in counterclock-wise direction
As mentioned above the complex Zernike moments of order119899 with repetition 119898 for a continuous image function 119891(119909 119910)are defined as follows
119881119899119898
=119899 + 1
120587∬119891(119909 119910)119881
lowast
119899119898(120588 120579) 119889119909 119889119910 (25)
in the 119909119910 image plane where 1199092 + 1199102 le 1 and lowast indicatesthe complex conjugate Note that for the moments to beorthogonal the image must be scaled within a unit circlecentered at the origin and
119881119899119898
=119899 + 1
120587int2120587
0
int1
0
119891 (120588 120579) 119877119899119898(120588) exp (minus119895119898120579) 120588 119889120588 119889120579
(26)
in polar coordinates The Zernike moment of the rotatedimage in the same coordinates is given by
119881119903
119899119898=119899 + 1
120587
sdot int2120587
0
int1
0
119891 (120588 120579 minus 120572) 119877119899119898(120588) exp (minus119895119898120579) 120588 119889120588 119889120579
(27)
By change of variable 1205791= 120579 minus 120572
119881119903
119899119898=119899 + 1
120587int2120587
0
int1
0
119891 (120588 1205791) 119877119899119898(120588)
sdot exp (minus119895119898 (1205791+ 120572)) 120588 119889120588 119889120579 = [
119899 + 1
120587
sdot int2120587
0
int1
0
119891 (120588 1205791) 119877119899119898(120588) exp (minus119895119898120579
1) 120588 119889120588 119889120579]
sdot exp (minus119895119898120572) = 119881119899119898
exp (minus119895119898120572)
(28)
Equation (28) shows that Zernike moments have simplerotational transformation properties each Zernike momentmerely acquires a phase shift on rotation This simpleproperty leads to the conclusion that the magnitudes ofthe Zernike moments of a rotated image function remainidentical to those before rotationThus the magnitude of theZernike moment |119881
119899119898| can be taken as a rotation invariant
feature of the underlying image function The real-valuedradial polynomial 119877
119899119898(120588) is defined as follows
119877119899119898(120588) =
(119899minus|119898|)2
sum119909=0
(minus1)119904
sdot(119899 minus 119904)
119904 ((119899 + |119898|) 2 minus 119904) ((119899 minus |119898|) 2 minus 119904)120588119899minus2119904
(29)
where 119899 minus |119898| = even and |119898| le 119899Zernike moments may also be derived from conventional
moments 120583119901119902
as follows
119885119899119897=(119899 + 1)
120587
sdot
119899
sum119896=119897
119902
sum119895=0
119897
sum119898=0
(minus119894)119898
(119902
119895)(
119897
119898)119861119899119897119896120583119896minus2119895minus119897+1198982119895+119897minus119898
(30)
Zernikemomentsmay bemore easily derived from rotationalmoments119863
119899119896 by
119885119899119897=
119899
sum119896=119897
119861119899119897119896119863119899119896 (31)
When computing the Zernike moments if the center of apixel falls inside the border of unit disk 119909
2
+ 1199102
le 1this pixel will be used in the computation otherwise thepixel will be discarded Therefore the area covered by themoment computation is not exactly the area of the unitdisk Advantages of Zernike moments can be summarized asfollows
(1) The magnitude of Zernike moment has rotationalinvariant property
(2) They are robust to noise and shape variations to someextent
(3) Since the basis is orthogonal they have minimumredundant information
8 Applied Computational Intelligence and Soft Computing
Table 1 List of Zernike moments and their corresponding numbersof features from order 0 to order 10
Order(119899)
Zernike moments of order 119899with repetition119898 (119881
119899119898)
Total number ofmoments up to order 10
0 11988100
36
1 11988111
2 11988120 11988122
3 11988131 11988133
4 11988140 11988142 11988144
5 11988151 11988153 11988155
6 11988160 11988162 11988164 11988166
7 11988171 11988173 11988175 11988177
8 11988180 11988182 119881841198818611988188
9 11988191 11988193 119881951198819711988199
10 119881100 119881102 119881104 119881106 119881108 1198811010
(4) An image can better be described by a small set of itsZernike moments than any other types of momentssuch as geometric moments
(5) A relatively small set of Zernike moments can char-acterize the global shape of pattern Lower ordermoments represent the global shape of patternwhereas the higher order moments represent thedetails
Therefore we choose Zernike moments as our shape descrip-tor in digit recognition process Table 1 lists the rotationinvariant Zernike moment features (F29ndashF64) and theircorresponding numbers from order 0 to order 10 used for thepresent work
The defined features on the Zernike moments are onlyrotation invariant To obtain scale and translation invariancethe digit image is first subjected to a normalization processusing its regular moments The rotation invariant Zernikefeatures are then extracted from the scale and translationnormalized image
37 Complex Moments The notion of complex moments wasintroduced in [32] as a simple and straightforward techniqueto derive a set of invariant moments The two-dimensionalcomplex moments of order (119901 119902) for the image function119891(119909 119910) are defined by
119862119901119902= int1198862
1198861
int1198872
1198871
(119909 + 119895119910)119901
(119909 minus 119895119910)119902
119891 (119909 119910) 119889119909 119889119910 (32)
where 119901 and 119902 are nonnegative integers and 119895 = radicminus1 Someadvantages of the complex moments can be described asfollows
(1) When the central complex moments are taken as thefeatures the effects of the imagersquos lateral displacementcan be eliminated
(2) A set of complex moment invariants can also bederived which are invariant to the rotation of theobject
(3) Since the complex moment is an intermediate stepbetween ordinary moments and moment invariantsit is relatively more simple to compute and morepowerful than other moment features in any patternclassification problem
The complex moments of order (119901 119902) are a linear combina-tion with complex coefficients of all the geometric moments119872119899119898 satisfying 119901 + 119902 = 119899 + 119898 In polar coordinates the
complex moments of order (119901 + 119902) can be written as follows
119862119897
119899= 119862119901119902
= int2120587
0
int+infin
0
120588119901+119902
119890119895(119901minus119902)120579
119891 (120588 cos 120579 120588 sin 120579) 120588 119889120588 119889120579(33)
where119901+119902 = 119899 and119901minus119902 = 119897denote the order and repetition ofthe complex moments respectively If the complex momentof the original image and that of the rotated image in thesame polar coordinates are denoted by 119862
119901119902and 119862
119903
119901119902 the
relationship [33] between them is given as follows
119862119903
119901119902= 119862119901119902119890minus119895(119901minus119902)120579
(34)
where 120579 is the angle at which the original image is rotatedThecomplex moment features represent the invariant propertiesto lateral displacement and rotation Based on the definitionof moment invariants we know that as the image is rotatedeach complex moment goes through all possible phasesof a complex number while its magnitude |119862
119901119902| remains
unchanged If the exponential factor of the complex momentis canceled out we will obtain its absolute invariant valuewhich is invariant to the rotation of the images The rotationinvariant complex moment features (F65ndashF130) and theircorresponding numbers from order 0 to order 10 used for thepresent work are listed in Table 2
Finally a feature vector consisting of 130 moment basedfeatures is calculated from each of the handwritten numeralimages belonging to five different scripts Summarization ofthe overall moment based feature set used in the present workis enlisted in Table 3
4 Experimental Study and Analysis
In this section we present the detailed experimental resultsto illustrate the suitability of moment based approach tohandwritten digit recognition All the experiments are imple-mented inMATLAB 2010 under aWindows XP environmenton an Intel Core2 Duo 24GHz processor with 1 GB of RAMand performed on gray-scale digit imagesThe accuracy usedas assessment criteria for measuring the performance of theproposed system is expressed as follows
Accuracy Rate () =Number of Correctly classified digits
Total number of digitstimes 100 (35)
Applied Computational Intelligence and Soft Computing 9
Table 2 List of complex moments and their corresponding numbers of features from order 0 to order 10
Order(119899) Complex moments (119862
119901119902) Complex moments of order 119899 with repetition 119897 (119862119897
119899) Total number of moments
up to order 100 119862
00 1198620
0
66
1 11986210 11986201 119862
1
1 119862minus1
1
2 11986220 11986211 11986202 119862
2
2 1198620
2 119862minus2
2
3 11986230 11986221 11986212 11986203 119862
3
3 1198621
3 119862minus1
3119862minus3
3
4 11986240 11986231 11986222 11986213 11986204 119862
4
4 1198622
4 1198620
4 119862minus2
4 119862minus4
4
5 11986250 11986241 11986232 11986223 11986214 11986205 119862
5
5 1198623
5 1198621
5 119862minus1
5 119862minus3
5 119862minus5
5
6 11986260 11986251 11986242 11986233 11986224 11986215 11986206 119862
6
6 1198624
6 1198622
6 1198620
6 119862minus2
6 119862minus4
6 119862minus6
6
7 11986270 11986261 11986252 11986243 11986234 11986225 11986216 11986207 119862
7
7 1198625
7 1198623
7 1198621
7 119862minus1
7 119862minus3
7 119862minus4
7 119862minus7
7
8 11986280 11986271 11986262 11986253 11986244 11986235 11986226 11986217 11986208 119862
8
8 1198626
8 1198624
8 1198622
8 1198620
8 119862minus2
8 119862minus4
8 119862minus6
8 119862minus8
8
9 11986290 11986281 11986272 11986263 11986254 11986245 11986236 11986227 11986218 11986209 119862
9
9 1198627
9 1198625
9 1198623
9 1198621
9 119862minus1
9 119862minus3
9 119862minus5
9 119862minus7
9 119862minus9
9
10 119862100 11986291 11986282 11986273 11986264 11986255 11986246 11986237 11986228 11986219 119862010 119862
10
10 1198628
10 1198626
10 1198624
10 1198622
10 1198620
10 119862minus2
10 119862minus4
10 119862minus6
10 119862minus8
10 119862minus10
10
Table 3 Description of feature vector
Serialnumber List of moments Number of
features1 Geometric moment (F1ndashF5) 52 Moment invariant (F6ndashF12) 73 Affine moment invariant (F13ndashF18) 64 Legendre moment (F19ndashF28) 105 Zernike moment (F29ndashF64) 366 Complex moment (F65ndashF130) 66
Total 130
41 Detailed Dataset Description Handwritten numeralsfrom five different popular scripts namely Indo-ArabicBangla Devanagari Roman and Telugu are used in theexperiments for investigating the effectiveness of themomentbased feature sets as compared to conventional features Indo-Arabic or Eastern-Arabic is widely used in the Middle-Eastand also in the Indian subcontinent On the other handDevanagari and Bangla are ranked as the top two popular (interms of the number of native speakers) scripts in the Indiansubcontinent [34] Roman originally evolved from the Greekalphabet is spoken and used all over the world Also Teluguone of the oldest andpopular South Indian languages of Indiais spoken by more than 74 million people [34] It essentiallyranks third by the number of native speakers in India
The present approach is tested on the database named asCMATERdb3 where CMATER stands for Center for Micro-processor Application for Training Education and Researcha research laboratory at Computer Science and EngineeringDepartment of Jadavpur University India where the currentresearch activity took place db stands for database andthe numeric value 3 represents handwritten digit recogni-tion database stored in the said database repository Thetesting is currently done on four versions of CMATERdb3namely CMATERdb311 CMATERdb321 CMATERdb331and CMATERdb341 representing the databases created forhandwritten digit recognition system for four major scriptsnamely BanglaDevanagari Indo-Arabic and Telugu respec-tively
Each of the digit images are first preprocessed using basicoperations of skew corrections and morphological filtering[25] and then binarized using an adaptive global thresholdvalue computed as the average of minimum and maximumintensities in that imageThe binarized digit images may con-tain noisy pixels which have been removed by using Gaussianfilter [25] A well-known algorithm known as Canny EdgeDetection algorithm [25] is then applied for smoothing theedges of the binarized digit images Finally the boundingrectangular box of each digit image is separately normalizedto 32 times 32 pixels Database is made available freely in theCMATER website (httpwwwcmaterjuorgcmaterdbhtm)and at httpcodegooglecompcmaterdb
A dataset of 3000 digit samples is considered for each ofthe Devanagari Indo-Arabic and Telugu scripts For each ofthese datasets 2000 samples are used for training purposeand the rest of the samples are used for the test purposewhereas a dataset of 6000 samples is used by selecting 600samples for each of 10-digit classes of handwritten Bangladigits A training set of 4000 samples and a test set of 2000samples are then chosen for Bangla numerals by consideringequal number of digit samples from each class For Romannumerals a dataset of 6000 training samples is formed byrandom selection from the standard handwritten MNIST [7]training dataset of size 60000 samples In the same way4000 digit samples are selected from MNIST test dataset ofsize 10000 samples These digit samples are enclosed in aminimum bounding square and are normalized to 32 times 32pixels dimension Typical handwritten digit samples takenfrom the abovementioned databases used for evaluating thepresent work are shown in Figure 3
42 Recognition Process To realize the effectiveness of theproposed approach our comprehensive experimental testsare conducted on the five aforementioned datasets A totalof 6000 (for Devanagari Indo-Arabic and Telugu scripts)numerals have been used for the training purpose whereasthe remaining 3000 numerals (1000 from each of the script)have been used for the testing purpose For Bangla andRoman scripts a total of 8000 numerals (4000 taken from
10 Applied Computational Intelligence and Soft Computing
(a) (b)
(c) (d)
(e)
Figure 3 Samples of digit images taken from CMATER and MNIST databases written in five different scripts (a) Indo-Arabic (b) Bangla(c) Devanagari (d) Roman and (e) Telugu
each script) have been used for the training purpose whereasthe remaining 4000 numerals (2000 taken from each script)have been used for the testing purpose The designed featureset has been individually applied to eight well-known classi-fiers namely Naıve Bayes Bayes Net MLP SVM RandomForest Bagging Multiclass Classifier and Logistic For thepresent work the following abovementioned classifiers withthe given parameters are designed
Naıve Bayes Naıve Bayes classifier for details refer to[35]
Bayes Net Estimator = SimpleEstimator-A 05 searchalgorithm = K2MLP Learning Rate = 03Momentum= 02 Numberof Epochs = 1000 minerror = 002SVM Support Vector Machine using radial basiskernel with (119901 = 1) for details refer to [36]Random Forest Ensemble classifier that consists ofmany decision trees and outputs the class that is themode of the classes output by individual trees fordetails refer to [37]
Applied Computational Intelligence and Soft Computing 11
Table 4 Recognition accuracies of eight classifiers and their corresponding ranks using 12 different datasets (ranks in the parentheses areused for performing the Friedman test)
DatasetsRecognition accuracy ()
ClassifiersNaıve Bayes Bayes Net MLP SVM Random Forest Bagging Multiclass Classifier Logistic
1 86 (8) 92 (7) 100 (1) 99 (25) 97 (6) 99 (25) 98 (45) 98 (45)
2 99 (3) 98 (55) 100 (1) 99 (3) 96 (75) 96 (75) 97 (55) 99 (3)
3 98 (5) 94 (7) 100 (1) 93 (8) 99 (25) 98 (5) 99 (25) 98 (5)
4 93 (8) 94 (7) 99 (15) 98 (35) 98 (35) 97 (5) 96 (6) 99 (15)
5 91 (8) 92 (7) 100 (15) 99 (35) 99 (35) 97 (6) 98 (5) 100 (15)
6 99 (25) 98 (45) 100 (1) 96 (7) 97 (6) 98 (45) 99 (25) 95 (8)
7 91 (8) 94 (7) 100 (1) 99 (3) 99 (3) 99 (3) 98 (55) 98 (55)
8 91 (8) 96 (7) 100 (1) 98 (45) 98 (45) 97 (6) 99 (25) 99 (25)
9 92 (8) 94 (7) 100 (15) 100 (15) 97 (6) 99 (4) 99 (4) 99 (4)
10 99 (25) 97 (65) 100 (1) 99 (25) 97 (65) 97 (65) 98 (4) 97 (65)
11 94 (8) 96 (7) 100 (1) 98 (5) 99 (3) 98 (5) 99 (3) 99 (3)
12 98 (4) 98 (4) 100 (1) 97 (7) 98 (4) 99 (2) 97 (7) 97 (7)
Mean rank R1 = 608 R2 = 637 R3 = 1125 R4 = 425 R5 = 467 R6 = 475 R7 = 433 R8 = 433
Bagging Bagging Classifier for detail refer to [38]Multiclass Classifier Method = ldquo1 against allrdquo ran-domWidthFactor = 20 seed = 1Logistic LogitBoost is used with simple regressionfunctions as base learner for details refer to [39]
The design parameters of classifiers are chosen as typicalvalues used in the literature or by experience The classifiersare not specifically tuned for the dataset at hand eventhough they may achieve a better performance with anotherparameter set since the goal is to design an automatedhandwritten digit recognition system based on the chosen setof classifiers
The digit recognition performances of the present tech-nique using each of these classifiers and their correspondingsuccess rates achieved at 95 confidence level are shownin Figures 4(a)-4(b) respectively It can be seen fromFigure 4 that the highest digit recognition accuracy hasbeen achieved by the MLP classifier which are found to be993 995 9892 9977 and 988 on Indo-ArabicBangla Devanagari Roman and Telugu scripts respectivelyThe performance analysis involves two parameters namelyModel Building Time (MBT) and Recognition Time (RT)MBT is based on the time required to train the system onthe given training samples whereas RT is based on the timerequired to recognize the given test samples The MBT andRT required for the abovementioned classifiers on all the fivedatabases are shown in Figures 5(a)-5(b)
43 Statistical Significance Tests The statistical significancetest is one of the essential ways for validating the performanceof themultiple classifiers usingmultiple datasets To do so wehave performed a safe and robust nonparametric Friedmantest [40] with the corresponding post hoc tests on Indo-Arabic script database For the present experimental setupthe number of datasets (119873) and the number of classifiers (119896)are set as 12 and 8 respectively These datasets are chosenrandomly from the test setTheperformances of the classifiers
on different datasets are shown in Table 4 On the basis ofthese performances the classifiers are then ranked for eachdataset separately the best performing algorithm gets therank 1 the second best gets rank 2 and so on (see Table 4)In case of ties average ranks are assigned to the classifiers tobreak the tie
Let 119903119894119895be the rank of the 119895th classifier on 119894th datasetThen
the mean of the ranks of the 119895th classifier over all the 119873datasets will be computed as follows
119877119895=1
119873
119873
sum119894=1
119903119894
119895 (36)
The null hypothesis states that all the classifiers are equivalentand so their ranks 119877
119895should be equal To justify it the
Friedman statistic [40] is computed as follows
1205942
119865=
12119873
119896 (119896 + 1)[
[
sum119895
1198772
119895minus119896 (119896 + 1)
2
4]
]
(37)
Under the current experimentation this statistic is dis-tributed according to 1205942
119865with 119896 minus 1 (=7) degrees of freedom
Using (37) the value of 1205942119865is calculated as 3046 From the
table of critical values (see any standard statistical book) thevalue of 1205942
119865with 7 degrees of freedom is 140671 for 120572 = 005
(where 120572 is known as level of significance) It can be seen thatthe computed 1205942
119865differs significantly from the standard 1205942
119865
So the null hypothesis is rejectedSingh et al [40] derived a better statistic using the
following formula
119865119865=
(119873 minus 1) 1205942
119865
119873(119896 minus 1) minus 1205942119865
(38)
119865119865is distributed according to the 119865-distribution with 119896 minus 1
(=7) and (119896minus1)(119873minus1) (=77) degrees of freedom Using (38)the value of 119865
119865is calculated as 80659 The critical value of 119865
(7 77) for 120572 = 005 is 2147 (see any standard statistical book)
12 Applied Computational Intelligence and Soft Computing
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
8486889092949698
100
Reco
gniti
on ac
cura
cy (
)
(a)
Classifiers
ArabicBanglaDevanagari
TeluguRoman
888990919293949596979899
100
95
confi
denc
e sco
re (
)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(b)
Figure 4 Graph showing (a) recognition accuracies and (b) 95 confidence scores of the proposed handwritten digit recognition techniqueusing eight well-known classifiers on digits of five different scripts
05
10152025303540
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Mod
el b
uild
ing
time (
s)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(a)
0
1
2
3
4
5
6
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Reco
gniti
on ti
me (
s)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(b)
Figure 5 Graphical comparison of (a) MBTs and (b) RTs required by eight different classifiers on all the five databases for handwritten digitrecognition
which shows a significant difference between the standardand calculated values of 119865
119865 Thus both Friedman and Iman
et al statistics reject the null hypothesisAs the null hypothesis is rejected a post hoc test known
as the Nemenyi test [40] is carried out for pairwise com-parisons of the best and worst performing classifiers Theperformances of two classifiers are significantly different ifthe corresponding average ranks differ by at least the criticaldifference (CD) which is expressed as follows
CD = 119902120572
radic119896 (119896 + 1)
6119873 (39)
For the Nemenyi test the value of 119902005
for eight classifiersis 3031 (see Table 5(a) of [41]) So the CD is calculated as3031radic89612 that is 3031 using (39) Since the differencebetween mean ranks of the best and worst classifier is muchgreater than the CD (see Table 3) we can conclude that thereis a significant difference between the performing abilitiesof the classifiers For comparing all classifiers with a controlclassifier (say MLP) we have applied the Bonferroni-Dunntest [40] For this test CD is calculated using the same (39)But here the value of 119902
005for eight classifiers is 2690 (see
Table 5(b) of [41]) So the CD for the Bonferroni-Dunntest is calculated as 2690radic89612 that is 2690 As the
Applied Computational Intelligence and Soft Computing 13
49555245
3125 3545 36253205 3205
3031
Differences of mean ranksDifference of mean ranksCD
0
1
2
3
4
5
6M
ean
rank
s
|R3minusR1|
|R3minusR2|
|R3minusR4|
|R3minusR5|
|R3minusR6|
|R3minusR7|
|R3minusR8|
Pairwise comparison of classifiers for the Nemenyi test
(a)
49555245
3125 3545 36253205 3205
269
Differences of mean ranksDifference of mean ranksCD
|R3minusR1|
|R3minusR2|
|R3minusR4|
|R3minusR5|
|R3minusR6|
|R3minusR7|
|R3minusR8|0
1
2
3
4
5
6
Mea
n ra
nks
Comparison of classifiers with MLP for the Bonferroni-Dunn test
(b)
Figure 6 Graphical representation of comparison of multiple classifiers for (a) the Nemenyi test and (b) the Bonferroni-Dunn test
difference between the mean ranks of any classifier and MLPis always greater than CD (see Table 3) the chosen controlclassifier performs significantly better than other classifiersfor Indo-Arabic database A graphical representation of theabovementioned post hoc tests for comparison of eight dif-ferent classifiers on Dataset 1 is shown in Figure 6 Similarlyit can also be shown for Bangla Devanagari Roman andTelugu databases that the chosen classifier (MLP) performssignificantly better than the other seven classifiers
44 Comparison among Moment Based Features For thejustification of the feature set used in the present workthe diverse combinations of six different types of momentsnamely geometric moment (F1ndashF5) moment invariant(F6ndashF12) affine moment invariant (F13ndashF18) Legendremoment (F19ndashF28) Zernike moment (F29ndashF64) and com-plex moment (F65ndashF130) are compared by considering allthe possible combinations This is done for measuring thediscriminating strength of the individual moment featuresand their combinations based on their complementary infor-mation These can be listed as follows
(a) Geometric moment + moment invariant + affineMoment invariant (F1ndashF18)
(b) Legendre moment (F19ndashF28)(c) Geometric moment + moment invariant + affine
moment invariant + Legendre moment (F1ndashF28)(d) Zernike moment (F29ndashF64)(e) Geometric moment + moment invariant + affine
moment invariant + Legendre moment + Zernikemoment (F1ndashF64)
(f) Legendre moment + Zernike moment (F19ndashF64)(g) Complex moment (F65ndashF130)(h) Zernike moment + complex moment (F29ndashF130)
86
88
90
92
94
96
98
100
Feature set and all possible combinations
ArabicBanglaDevanagari
RomanTelugu
Reco
gniti
on ac
cura
cy (
)
F1
ndashF18
F1
ndashF64
F19
ndashF28
F1
ndashF28
F29
ndashF64
F19
ndashF64
F65
ndashF130
F29
ndashF130
F1
ndashF130
Figure 7Graphical comparison showing the recognition accuraciesof all the possible combinations of moment based features achievedby MLP classifier
(i) Geometric moment + moment invariant + affinemoment invariant + Legendre moment + Zernikemoment + complex moment (F1ndashF130)
The graphical comparison of the corresponding numeralrecognition accuracies achieved by MLP classifier over thesame test set is shown in Figure 7 It can be observed fromFigure 7 that the present combination of moment feature setoutperforms all the other possible combinations
45 Detail Evaluation of MLP Classifier In the present workdetailed error analysis with respect to different parametersnamely Kappa statistics mean absolute error (MAE) root
14 Applied Computational Intelligence and Soft Computing
Table 5 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Indo-Arabic numerals (here MAE means mean absolute error RMSE means root mean square error TPR means True Positiverate FPR means False Positive rate MCC means Matthews Correlation Coefficient and AUC means Area under ROC)
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09922 01758 0293
1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0990 0001 0990 0990 0990 0989 1000ldquo2rdquo 0960 0002 0980 0960 0970 0966 0990ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0004 0961 0990 0975 0973 0999ldquo7rdquo 0990 0000 1000 0990 0995 0994 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09922 01758 0293 0993 00007 09931 0993 0993 09922 09989
Table 6 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Bangla numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09944 00535 0115
1000 0001 0990 1000 0995 0994 1000ldquo1rdquo 1000 0002 0980 1000 0990 0989 1000ldquo2rdquo 1000 0001 0990 1000 0995 0994 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0990 0001 0990 0990 0990 0989 1000ldquo6rdquo 0970 0000 1000 0970 0985 0983 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09944 00535 0115 0995 00005 0995 0995 0995 09943 1000
Table 7 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Devanagari numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
0988 00342 00847
0969 0002 0984 0969 0977 0974 0999ldquo1rdquo 0969 0000 1000 0969 0984 0983 1000ldquo2rdquo 1000 0005 0956 1000 0977 0975 1000ldquo3rdquo 0985 0003 0970 0985 0977 0975 0999ldquo4rdquo 1000 0002 0985 1000 0992 0992 1000ldquo5rdquo 0985 0000 1000 0985 0992 0991 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 0985 0000 1000 0985 0992 0991 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 0988 00342 00857 09893 00012 09895 09893 09891 09881 09998
mean square error (RMSE) True Positive rate (TPR) FalsePositive rate (FPR) precision recall 119865-measure MatthewsCorrelationCoefficient (MCC) andArea under ROC (AUC)is computed Tables 5ndash9 provide the said statistical mea-surements for handwritten numeral recognition written inIndo-Arabic Bangla Devanagari Roman and Telugu scriptsrespectively
5 Conclusion
India is a multilingual and multiscript country comprisingof 12 different scripts But there are not much competentworks done towards handwritten numeral recognition ofIndic scripts The following issues are observed with hand-written digit recognition system (1) mostly they have worked
Applied Computational Intelligence and Soft Computing 15
Table 8 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Roman numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09975 016 02716
1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0995 0001 0995 0995 0995 0994 0999ldquo2rdquo 1000 0000 1000 1000 1000 1000 1000ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0995 0000 1000 0995 0997 0997 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 0994 0001 0989 0994 0991 0990 0999ldquo9rdquo 0994 0001 0994 0994 0994 0994 0999Mean 09975 016 02716 09978 00003 09978 09978 09977 09975 09997
Table 9 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Telugu numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09867 00051 00449
0980 0001 0990 0980 0985 0983 0988ldquo1rdquo 0970 0002 0980 0970 0975 0972 0991ldquo2rdquo 1000 0002 0980 1000 0990 0989 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 0999ldquo4rdquo 0990 0002 0980 0990 0985 0983 0998ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0003 0971 0990 0980 0978 0995ldquo7rdquo 0990 0002 0980 0990 0985 0983 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 0970 0000 1000 0970 0985 0983 0994Mean 09867 00051 00449 0988 00012 09881 0988 0988 09865 09965
on limited dataset (2) Training and testing times are notmentioned in most of the works (3) Most of the works havebeen done for Roman because of the availability of largerdataset like MNIST (4) Recognition systems for Indic scriptsare mainly focused on single script (5) Limitation to somefeature extraction methods also exist that is they are localto a particular scriptlanguage rather having a global scopeIn this work we have verified the effectiveness of a momentbased approach to handwritten digit recognition problemthat includes geometric moment moment invariant affinemoment invariant Legendre moment Zernike moment andcomplex moment The present scheme has been tested forfive different popular scripts namely Indo-Arabic BanglaDevanagari Roman and Telugu These methods have beenevaluated on the CMATER and MNIST databases usingmultiple classifiers FinallyMLP classifier is found to producethe highest recognition accuracies of 993 995 98929977 and 988 on Indo-Arabic Bangla DevanagariRoman and Telugu scripts respectively The results havedemonstrated that the application ofmoment based approachleads to a higher accuracy compared to its counterpartsAmong the most important ones an advantage of thisfeature extraction algorithm is that it is less computationally
expensive where the most of the published works need morecomputation time These features are also very simple toimplement compared to other methods It is obvious thatto improve the performance of proposed system furtherwe need to investigate more the sources of errors Potentialmoment features other than the presented ones may alsoexist
To further improve the performance possible futureworks are as follows (1) although the moment based featuresperform superbly on the whole complementary featureslike concavity analysis may help in discriminating confusingnumerals For example Indo-Arabic numerals ldquo2rdquo and ldquo3rdquo canbetter be separated by considering the original size beforenormalization (2) For classifier design it is better to selectmodel parameters (classifier structures) by cross validationrather than empirically as done in our experiments (3)Combining multiple classifiers can improve the recognitionaccuracy
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
16 Applied Computational Intelligence and Soft Computing
Acknowledgments
The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India
References
[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015
[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003
[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005
[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997
[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007
[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008
[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998
[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007
[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009
[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010
[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011
[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011
[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011
[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003
[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012
[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013
[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013
[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014
[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014
[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008
[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014
[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014
[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014
[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015
[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992
[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921
[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962
[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004
Applied Computational Intelligence and Soft Computing 17
[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005
[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996
[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990
[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984
[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985
[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions
in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995
[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001
[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001
[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996
[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992
[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014
[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006
Submit your manuscripts athttpwwwhindawicom
Computer Games Technology
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Distributed Sensor Networks
International Journal of
Advances in
FuzzySystems
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014
International Journal of
ReconfigurableComputing
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied Computational Intelligence and Soft Computing
thinspAdvancesthinspinthinsp
Artificial Intelligence
HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014
Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Electrical and Computer Engineering
Journal of
Journal of
Computer Networks and Communications
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation
httpwwwhindawicom Volume 2014
Advances in
Multimedia
International Journal of
Biomedical Imaging
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
ArtificialNeural Systems
Advances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
RoboticsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Computational Intelligence and Neuroscience
Industrial EngineeringJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Human-ComputerInteraction
Advances in
Computer EngineeringAdvances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied Computational Intelligence and Soft Computing 7
where
119903 = radic1199092 + 1199102
120579 = tanminus1 (119910
119909)
(23)
An important attribute of the geometric representationsof Zernike polynomials is that lower order polynomialsapproximate the global features of the shapesurface whilethe higher ordered polynomials capture local shapesurfacefeatures Zernikemoments are a class of orthogonalmomentsand have been shown to be effective in terms of imagerepresentation
Zernike introduced a set of complex polynomials whichforms a complete orthogonal set over the interior of the unitcircle that is 1199092 + 1199102 = 1 Let the set of these polynomials bedenoted by 119881
119899119898(119909 119910) The form of these polynomials is as
follows
119881119899119898(119909 119910) = 119881
119899119898(119901 120579) = 119877
119899119898(120588) exp (119895119898120579) (24)
where
119899 positive integer or zero
119898 positive and negative integers subject to con-straints 119899 minus |119898| even |119898| le 119899
120588 length of vector from origin to 119891(119909 119910) pixel
120579 angle between vector 120588 and 119909-axis in counterclock-wise direction
As mentioned above the complex Zernike moments of order119899 with repetition 119898 for a continuous image function 119891(119909 119910)are defined as follows
119881119899119898
=119899 + 1
120587∬119891(119909 119910)119881
lowast
119899119898(120588 120579) 119889119909 119889119910 (25)
in the 119909119910 image plane where 1199092 + 1199102 le 1 and lowast indicatesthe complex conjugate Note that for the moments to beorthogonal the image must be scaled within a unit circlecentered at the origin and
119881119899119898
=119899 + 1
120587int2120587
0
int1
0
119891 (120588 120579) 119877119899119898(120588) exp (minus119895119898120579) 120588 119889120588 119889120579
(26)
in polar coordinates The Zernike moment of the rotatedimage in the same coordinates is given by
119881119903
119899119898=119899 + 1
120587
sdot int2120587
0
int1
0
119891 (120588 120579 minus 120572) 119877119899119898(120588) exp (minus119895119898120579) 120588 119889120588 119889120579
(27)
By change of variable 1205791= 120579 minus 120572
119881119903
119899119898=119899 + 1
120587int2120587
0
int1
0
119891 (120588 1205791) 119877119899119898(120588)
sdot exp (minus119895119898 (1205791+ 120572)) 120588 119889120588 119889120579 = [
119899 + 1
120587
sdot int2120587
0
int1
0
119891 (120588 1205791) 119877119899119898(120588) exp (minus119895119898120579
1) 120588 119889120588 119889120579]
sdot exp (minus119895119898120572) = 119881119899119898
exp (minus119895119898120572)
(28)
Equation (28) shows that Zernike moments have simplerotational transformation properties each Zernike momentmerely acquires a phase shift on rotation This simpleproperty leads to the conclusion that the magnitudes ofthe Zernike moments of a rotated image function remainidentical to those before rotationThus the magnitude of theZernike moment |119881
119899119898| can be taken as a rotation invariant
feature of the underlying image function The real-valuedradial polynomial 119877
119899119898(120588) is defined as follows
119877119899119898(120588) =
(119899minus|119898|)2
sum119909=0
(minus1)119904
sdot(119899 minus 119904)
119904 ((119899 + |119898|) 2 minus 119904) ((119899 minus |119898|) 2 minus 119904)120588119899minus2119904
(29)
where 119899 minus |119898| = even and |119898| le 119899Zernike moments may also be derived from conventional
moments 120583119901119902
as follows
119885119899119897=(119899 + 1)
120587
sdot
119899
sum119896=119897
119902
sum119895=0
119897
sum119898=0
(minus119894)119898
(119902
119895)(
119897
119898)119861119899119897119896120583119896minus2119895minus119897+1198982119895+119897minus119898
(30)
Zernikemomentsmay bemore easily derived from rotationalmoments119863
119899119896 by
119885119899119897=
119899
sum119896=119897
119861119899119897119896119863119899119896 (31)
When computing the Zernike moments if the center of apixel falls inside the border of unit disk 119909
2
+ 1199102
le 1this pixel will be used in the computation otherwise thepixel will be discarded Therefore the area covered by themoment computation is not exactly the area of the unitdisk Advantages of Zernike moments can be summarized asfollows
(1) The magnitude of Zernike moment has rotationalinvariant property
(2) They are robust to noise and shape variations to someextent
(3) Since the basis is orthogonal they have minimumredundant information
8 Applied Computational Intelligence and Soft Computing
Table 1 List of Zernike moments and their corresponding numbersof features from order 0 to order 10
Order(119899)
Zernike moments of order 119899with repetition119898 (119881
119899119898)
Total number ofmoments up to order 10
0 11988100
36
1 11988111
2 11988120 11988122
3 11988131 11988133
4 11988140 11988142 11988144
5 11988151 11988153 11988155
6 11988160 11988162 11988164 11988166
7 11988171 11988173 11988175 11988177
8 11988180 11988182 119881841198818611988188
9 11988191 11988193 119881951198819711988199
10 119881100 119881102 119881104 119881106 119881108 1198811010
(4) An image can better be described by a small set of itsZernike moments than any other types of momentssuch as geometric moments
(5) A relatively small set of Zernike moments can char-acterize the global shape of pattern Lower ordermoments represent the global shape of patternwhereas the higher order moments represent thedetails
Therefore we choose Zernike moments as our shape descrip-tor in digit recognition process Table 1 lists the rotationinvariant Zernike moment features (F29ndashF64) and theircorresponding numbers from order 0 to order 10 used for thepresent work
The defined features on the Zernike moments are onlyrotation invariant To obtain scale and translation invariancethe digit image is first subjected to a normalization processusing its regular moments The rotation invariant Zernikefeatures are then extracted from the scale and translationnormalized image
37 Complex Moments The notion of complex moments wasintroduced in [32] as a simple and straightforward techniqueto derive a set of invariant moments The two-dimensionalcomplex moments of order (119901 119902) for the image function119891(119909 119910) are defined by
119862119901119902= int1198862
1198861
int1198872
1198871
(119909 + 119895119910)119901
(119909 minus 119895119910)119902
119891 (119909 119910) 119889119909 119889119910 (32)
where 119901 and 119902 are nonnegative integers and 119895 = radicminus1 Someadvantages of the complex moments can be described asfollows
(1) When the central complex moments are taken as thefeatures the effects of the imagersquos lateral displacementcan be eliminated
(2) A set of complex moment invariants can also bederived which are invariant to the rotation of theobject
(3) Since the complex moment is an intermediate stepbetween ordinary moments and moment invariantsit is relatively more simple to compute and morepowerful than other moment features in any patternclassification problem
The complex moments of order (119901 119902) are a linear combina-tion with complex coefficients of all the geometric moments119872119899119898 satisfying 119901 + 119902 = 119899 + 119898 In polar coordinates the
complex moments of order (119901 + 119902) can be written as follows
119862119897
119899= 119862119901119902
= int2120587
0
int+infin
0
120588119901+119902
119890119895(119901minus119902)120579
119891 (120588 cos 120579 120588 sin 120579) 120588 119889120588 119889120579(33)
where119901+119902 = 119899 and119901minus119902 = 119897denote the order and repetition ofthe complex moments respectively If the complex momentof the original image and that of the rotated image in thesame polar coordinates are denoted by 119862
119901119902and 119862
119903
119901119902 the
relationship [33] between them is given as follows
119862119903
119901119902= 119862119901119902119890minus119895(119901minus119902)120579
(34)
where 120579 is the angle at which the original image is rotatedThecomplex moment features represent the invariant propertiesto lateral displacement and rotation Based on the definitionof moment invariants we know that as the image is rotatedeach complex moment goes through all possible phasesof a complex number while its magnitude |119862
119901119902| remains
unchanged If the exponential factor of the complex momentis canceled out we will obtain its absolute invariant valuewhich is invariant to the rotation of the images The rotationinvariant complex moment features (F65ndashF130) and theircorresponding numbers from order 0 to order 10 used for thepresent work are listed in Table 2
Finally a feature vector consisting of 130 moment basedfeatures is calculated from each of the handwritten numeralimages belonging to five different scripts Summarization ofthe overall moment based feature set used in the present workis enlisted in Table 3
4 Experimental Study and Analysis
In this section we present the detailed experimental resultsto illustrate the suitability of moment based approach tohandwritten digit recognition All the experiments are imple-mented inMATLAB 2010 under aWindows XP environmenton an Intel Core2 Duo 24GHz processor with 1 GB of RAMand performed on gray-scale digit imagesThe accuracy usedas assessment criteria for measuring the performance of theproposed system is expressed as follows
Accuracy Rate () =Number of Correctly classified digits
Total number of digitstimes 100 (35)
Applied Computational Intelligence and Soft Computing 9
Table 2 List of complex moments and their corresponding numbers of features from order 0 to order 10
Order(119899) Complex moments (119862
119901119902) Complex moments of order 119899 with repetition 119897 (119862119897
119899) Total number of moments
up to order 100 119862
00 1198620
0
66
1 11986210 11986201 119862
1
1 119862minus1
1
2 11986220 11986211 11986202 119862
2
2 1198620
2 119862minus2
2
3 11986230 11986221 11986212 11986203 119862
3
3 1198621
3 119862minus1
3119862minus3
3
4 11986240 11986231 11986222 11986213 11986204 119862
4
4 1198622
4 1198620
4 119862minus2
4 119862minus4
4
5 11986250 11986241 11986232 11986223 11986214 11986205 119862
5
5 1198623
5 1198621
5 119862minus1
5 119862minus3
5 119862minus5
5
6 11986260 11986251 11986242 11986233 11986224 11986215 11986206 119862
6
6 1198624
6 1198622
6 1198620
6 119862minus2
6 119862minus4
6 119862minus6
6
7 11986270 11986261 11986252 11986243 11986234 11986225 11986216 11986207 119862
7
7 1198625
7 1198623
7 1198621
7 119862minus1
7 119862minus3
7 119862minus4
7 119862minus7
7
8 11986280 11986271 11986262 11986253 11986244 11986235 11986226 11986217 11986208 119862
8
8 1198626
8 1198624
8 1198622
8 1198620
8 119862minus2
8 119862minus4
8 119862minus6
8 119862minus8
8
9 11986290 11986281 11986272 11986263 11986254 11986245 11986236 11986227 11986218 11986209 119862
9
9 1198627
9 1198625
9 1198623
9 1198621
9 119862minus1
9 119862minus3
9 119862minus5
9 119862minus7
9 119862minus9
9
10 119862100 11986291 11986282 11986273 11986264 11986255 11986246 11986237 11986228 11986219 119862010 119862
10
10 1198628
10 1198626
10 1198624
10 1198622
10 1198620
10 119862minus2
10 119862minus4
10 119862minus6
10 119862minus8
10 119862minus10
10
Table 3 Description of feature vector
Serialnumber List of moments Number of
features1 Geometric moment (F1ndashF5) 52 Moment invariant (F6ndashF12) 73 Affine moment invariant (F13ndashF18) 64 Legendre moment (F19ndashF28) 105 Zernike moment (F29ndashF64) 366 Complex moment (F65ndashF130) 66
Total 130
41 Detailed Dataset Description Handwritten numeralsfrom five different popular scripts namely Indo-ArabicBangla Devanagari Roman and Telugu are used in theexperiments for investigating the effectiveness of themomentbased feature sets as compared to conventional features Indo-Arabic or Eastern-Arabic is widely used in the Middle-Eastand also in the Indian subcontinent On the other handDevanagari and Bangla are ranked as the top two popular (interms of the number of native speakers) scripts in the Indiansubcontinent [34] Roman originally evolved from the Greekalphabet is spoken and used all over the world Also Teluguone of the oldest andpopular South Indian languages of Indiais spoken by more than 74 million people [34] It essentiallyranks third by the number of native speakers in India
The present approach is tested on the database named asCMATERdb3 where CMATER stands for Center for Micro-processor Application for Training Education and Researcha research laboratory at Computer Science and EngineeringDepartment of Jadavpur University India where the currentresearch activity took place db stands for database andthe numeric value 3 represents handwritten digit recogni-tion database stored in the said database repository Thetesting is currently done on four versions of CMATERdb3namely CMATERdb311 CMATERdb321 CMATERdb331and CMATERdb341 representing the databases created forhandwritten digit recognition system for four major scriptsnamely BanglaDevanagari Indo-Arabic and Telugu respec-tively
Each of the digit images are first preprocessed using basicoperations of skew corrections and morphological filtering[25] and then binarized using an adaptive global thresholdvalue computed as the average of minimum and maximumintensities in that imageThe binarized digit images may con-tain noisy pixels which have been removed by using Gaussianfilter [25] A well-known algorithm known as Canny EdgeDetection algorithm [25] is then applied for smoothing theedges of the binarized digit images Finally the boundingrectangular box of each digit image is separately normalizedto 32 times 32 pixels Database is made available freely in theCMATER website (httpwwwcmaterjuorgcmaterdbhtm)and at httpcodegooglecompcmaterdb
A dataset of 3000 digit samples is considered for each ofthe Devanagari Indo-Arabic and Telugu scripts For each ofthese datasets 2000 samples are used for training purposeand the rest of the samples are used for the test purposewhereas a dataset of 6000 samples is used by selecting 600samples for each of 10-digit classes of handwritten Bangladigits A training set of 4000 samples and a test set of 2000samples are then chosen for Bangla numerals by consideringequal number of digit samples from each class For Romannumerals a dataset of 6000 training samples is formed byrandom selection from the standard handwritten MNIST [7]training dataset of size 60000 samples In the same way4000 digit samples are selected from MNIST test dataset ofsize 10000 samples These digit samples are enclosed in aminimum bounding square and are normalized to 32 times 32pixels dimension Typical handwritten digit samples takenfrom the abovementioned databases used for evaluating thepresent work are shown in Figure 3
42 Recognition Process To realize the effectiveness of theproposed approach our comprehensive experimental testsare conducted on the five aforementioned datasets A totalof 6000 (for Devanagari Indo-Arabic and Telugu scripts)numerals have been used for the training purpose whereasthe remaining 3000 numerals (1000 from each of the script)have been used for the testing purpose For Bangla andRoman scripts a total of 8000 numerals (4000 taken from
10 Applied Computational Intelligence and Soft Computing
(a) (b)
(c) (d)
(e)
Figure 3 Samples of digit images taken from CMATER and MNIST databases written in five different scripts (a) Indo-Arabic (b) Bangla(c) Devanagari (d) Roman and (e) Telugu
each script) have been used for the training purpose whereasthe remaining 4000 numerals (2000 taken from each script)have been used for the testing purpose The designed featureset has been individually applied to eight well-known classi-fiers namely Naıve Bayes Bayes Net MLP SVM RandomForest Bagging Multiclass Classifier and Logistic For thepresent work the following abovementioned classifiers withthe given parameters are designed
Naıve Bayes Naıve Bayes classifier for details refer to[35]
Bayes Net Estimator = SimpleEstimator-A 05 searchalgorithm = K2MLP Learning Rate = 03Momentum= 02 Numberof Epochs = 1000 minerror = 002SVM Support Vector Machine using radial basiskernel with (119901 = 1) for details refer to [36]Random Forest Ensemble classifier that consists ofmany decision trees and outputs the class that is themode of the classes output by individual trees fordetails refer to [37]
Applied Computational Intelligence and Soft Computing 11
Table 4 Recognition accuracies of eight classifiers and their corresponding ranks using 12 different datasets (ranks in the parentheses areused for performing the Friedman test)
DatasetsRecognition accuracy ()
ClassifiersNaıve Bayes Bayes Net MLP SVM Random Forest Bagging Multiclass Classifier Logistic
1 86 (8) 92 (7) 100 (1) 99 (25) 97 (6) 99 (25) 98 (45) 98 (45)
2 99 (3) 98 (55) 100 (1) 99 (3) 96 (75) 96 (75) 97 (55) 99 (3)
3 98 (5) 94 (7) 100 (1) 93 (8) 99 (25) 98 (5) 99 (25) 98 (5)
4 93 (8) 94 (7) 99 (15) 98 (35) 98 (35) 97 (5) 96 (6) 99 (15)
5 91 (8) 92 (7) 100 (15) 99 (35) 99 (35) 97 (6) 98 (5) 100 (15)
6 99 (25) 98 (45) 100 (1) 96 (7) 97 (6) 98 (45) 99 (25) 95 (8)
7 91 (8) 94 (7) 100 (1) 99 (3) 99 (3) 99 (3) 98 (55) 98 (55)
8 91 (8) 96 (7) 100 (1) 98 (45) 98 (45) 97 (6) 99 (25) 99 (25)
9 92 (8) 94 (7) 100 (15) 100 (15) 97 (6) 99 (4) 99 (4) 99 (4)
10 99 (25) 97 (65) 100 (1) 99 (25) 97 (65) 97 (65) 98 (4) 97 (65)
11 94 (8) 96 (7) 100 (1) 98 (5) 99 (3) 98 (5) 99 (3) 99 (3)
12 98 (4) 98 (4) 100 (1) 97 (7) 98 (4) 99 (2) 97 (7) 97 (7)
Mean rank R1 = 608 R2 = 637 R3 = 1125 R4 = 425 R5 = 467 R6 = 475 R7 = 433 R8 = 433
Bagging Bagging Classifier for detail refer to [38]Multiclass Classifier Method = ldquo1 against allrdquo ran-domWidthFactor = 20 seed = 1Logistic LogitBoost is used with simple regressionfunctions as base learner for details refer to [39]
The design parameters of classifiers are chosen as typicalvalues used in the literature or by experience The classifiersare not specifically tuned for the dataset at hand eventhough they may achieve a better performance with anotherparameter set since the goal is to design an automatedhandwritten digit recognition system based on the chosen setof classifiers
The digit recognition performances of the present tech-nique using each of these classifiers and their correspondingsuccess rates achieved at 95 confidence level are shownin Figures 4(a)-4(b) respectively It can be seen fromFigure 4 that the highest digit recognition accuracy hasbeen achieved by the MLP classifier which are found to be993 995 9892 9977 and 988 on Indo-ArabicBangla Devanagari Roman and Telugu scripts respectivelyThe performance analysis involves two parameters namelyModel Building Time (MBT) and Recognition Time (RT)MBT is based on the time required to train the system onthe given training samples whereas RT is based on the timerequired to recognize the given test samples The MBT andRT required for the abovementioned classifiers on all the fivedatabases are shown in Figures 5(a)-5(b)
43 Statistical Significance Tests The statistical significancetest is one of the essential ways for validating the performanceof themultiple classifiers usingmultiple datasets To do so wehave performed a safe and robust nonparametric Friedmantest [40] with the corresponding post hoc tests on Indo-Arabic script database For the present experimental setupthe number of datasets (119873) and the number of classifiers (119896)are set as 12 and 8 respectively These datasets are chosenrandomly from the test setTheperformances of the classifiers
on different datasets are shown in Table 4 On the basis ofthese performances the classifiers are then ranked for eachdataset separately the best performing algorithm gets therank 1 the second best gets rank 2 and so on (see Table 4)In case of ties average ranks are assigned to the classifiers tobreak the tie
Let 119903119894119895be the rank of the 119895th classifier on 119894th datasetThen
the mean of the ranks of the 119895th classifier over all the 119873datasets will be computed as follows
119877119895=1
119873
119873
sum119894=1
119903119894
119895 (36)
The null hypothesis states that all the classifiers are equivalentand so their ranks 119877
119895should be equal To justify it the
Friedman statistic [40] is computed as follows
1205942
119865=
12119873
119896 (119896 + 1)[
[
sum119895
1198772
119895minus119896 (119896 + 1)
2
4]
]
(37)
Under the current experimentation this statistic is dis-tributed according to 1205942
119865with 119896 minus 1 (=7) degrees of freedom
Using (37) the value of 1205942119865is calculated as 3046 From the
table of critical values (see any standard statistical book) thevalue of 1205942
119865with 7 degrees of freedom is 140671 for 120572 = 005
(where 120572 is known as level of significance) It can be seen thatthe computed 1205942
119865differs significantly from the standard 1205942
119865
So the null hypothesis is rejectedSingh et al [40] derived a better statistic using the
following formula
119865119865=
(119873 minus 1) 1205942
119865
119873(119896 minus 1) minus 1205942119865
(38)
119865119865is distributed according to the 119865-distribution with 119896 minus 1
(=7) and (119896minus1)(119873minus1) (=77) degrees of freedom Using (38)the value of 119865
119865is calculated as 80659 The critical value of 119865
(7 77) for 120572 = 005 is 2147 (see any standard statistical book)
12 Applied Computational Intelligence and Soft Computing
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
8486889092949698
100
Reco
gniti
on ac
cura
cy (
)
(a)
Classifiers
ArabicBanglaDevanagari
TeluguRoman
888990919293949596979899
100
95
confi
denc
e sco
re (
)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(b)
Figure 4 Graph showing (a) recognition accuracies and (b) 95 confidence scores of the proposed handwritten digit recognition techniqueusing eight well-known classifiers on digits of five different scripts
05
10152025303540
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Mod
el b
uild
ing
time (
s)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(a)
0
1
2
3
4
5
6
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Reco
gniti
on ti
me (
s)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(b)
Figure 5 Graphical comparison of (a) MBTs and (b) RTs required by eight different classifiers on all the five databases for handwritten digitrecognition
which shows a significant difference between the standardand calculated values of 119865
119865 Thus both Friedman and Iman
et al statistics reject the null hypothesisAs the null hypothesis is rejected a post hoc test known
as the Nemenyi test [40] is carried out for pairwise com-parisons of the best and worst performing classifiers Theperformances of two classifiers are significantly different ifthe corresponding average ranks differ by at least the criticaldifference (CD) which is expressed as follows
CD = 119902120572
radic119896 (119896 + 1)
6119873 (39)
For the Nemenyi test the value of 119902005
for eight classifiersis 3031 (see Table 5(a) of [41]) So the CD is calculated as3031radic89612 that is 3031 using (39) Since the differencebetween mean ranks of the best and worst classifier is muchgreater than the CD (see Table 3) we can conclude that thereis a significant difference between the performing abilitiesof the classifiers For comparing all classifiers with a controlclassifier (say MLP) we have applied the Bonferroni-Dunntest [40] For this test CD is calculated using the same (39)But here the value of 119902
005for eight classifiers is 2690 (see
Table 5(b) of [41]) So the CD for the Bonferroni-Dunntest is calculated as 2690radic89612 that is 2690 As the
Applied Computational Intelligence and Soft Computing 13
49555245
3125 3545 36253205 3205
3031
Differences of mean ranksDifference of mean ranksCD
0
1
2
3
4
5
6M
ean
rank
s
|R3minusR1|
|R3minusR2|
|R3minusR4|
|R3minusR5|
|R3minusR6|
|R3minusR7|
|R3minusR8|
Pairwise comparison of classifiers for the Nemenyi test
(a)
49555245
3125 3545 36253205 3205
269
Differences of mean ranksDifference of mean ranksCD
|R3minusR1|
|R3minusR2|
|R3minusR4|
|R3minusR5|
|R3minusR6|
|R3minusR7|
|R3minusR8|0
1
2
3
4
5
6
Mea
n ra
nks
Comparison of classifiers with MLP for the Bonferroni-Dunn test
(b)
Figure 6 Graphical representation of comparison of multiple classifiers for (a) the Nemenyi test and (b) the Bonferroni-Dunn test
difference between the mean ranks of any classifier and MLPis always greater than CD (see Table 3) the chosen controlclassifier performs significantly better than other classifiersfor Indo-Arabic database A graphical representation of theabovementioned post hoc tests for comparison of eight dif-ferent classifiers on Dataset 1 is shown in Figure 6 Similarlyit can also be shown for Bangla Devanagari Roman andTelugu databases that the chosen classifier (MLP) performssignificantly better than the other seven classifiers
44 Comparison among Moment Based Features For thejustification of the feature set used in the present workthe diverse combinations of six different types of momentsnamely geometric moment (F1ndashF5) moment invariant(F6ndashF12) affine moment invariant (F13ndashF18) Legendremoment (F19ndashF28) Zernike moment (F29ndashF64) and com-plex moment (F65ndashF130) are compared by considering allthe possible combinations This is done for measuring thediscriminating strength of the individual moment featuresand their combinations based on their complementary infor-mation These can be listed as follows
(a) Geometric moment + moment invariant + affineMoment invariant (F1ndashF18)
(b) Legendre moment (F19ndashF28)(c) Geometric moment + moment invariant + affine
moment invariant + Legendre moment (F1ndashF28)(d) Zernike moment (F29ndashF64)(e) Geometric moment + moment invariant + affine
moment invariant + Legendre moment + Zernikemoment (F1ndashF64)
(f) Legendre moment + Zernike moment (F19ndashF64)(g) Complex moment (F65ndashF130)(h) Zernike moment + complex moment (F29ndashF130)
86
88
90
92
94
96
98
100
Feature set and all possible combinations
ArabicBanglaDevanagari
RomanTelugu
Reco
gniti
on ac
cura
cy (
)
F1
ndashF18
F1
ndashF64
F19
ndashF28
F1
ndashF28
F29
ndashF64
F19
ndashF64
F65
ndashF130
F29
ndashF130
F1
ndashF130
Figure 7Graphical comparison showing the recognition accuraciesof all the possible combinations of moment based features achievedby MLP classifier
(i) Geometric moment + moment invariant + affinemoment invariant + Legendre moment + Zernikemoment + complex moment (F1ndashF130)
The graphical comparison of the corresponding numeralrecognition accuracies achieved by MLP classifier over thesame test set is shown in Figure 7 It can be observed fromFigure 7 that the present combination of moment feature setoutperforms all the other possible combinations
45 Detail Evaluation of MLP Classifier In the present workdetailed error analysis with respect to different parametersnamely Kappa statistics mean absolute error (MAE) root
14 Applied Computational Intelligence and Soft Computing
Table 5 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Indo-Arabic numerals (here MAE means mean absolute error RMSE means root mean square error TPR means True Positiverate FPR means False Positive rate MCC means Matthews Correlation Coefficient and AUC means Area under ROC)
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09922 01758 0293
1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0990 0001 0990 0990 0990 0989 1000ldquo2rdquo 0960 0002 0980 0960 0970 0966 0990ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0004 0961 0990 0975 0973 0999ldquo7rdquo 0990 0000 1000 0990 0995 0994 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09922 01758 0293 0993 00007 09931 0993 0993 09922 09989
Table 6 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Bangla numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09944 00535 0115
1000 0001 0990 1000 0995 0994 1000ldquo1rdquo 1000 0002 0980 1000 0990 0989 1000ldquo2rdquo 1000 0001 0990 1000 0995 0994 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0990 0001 0990 0990 0990 0989 1000ldquo6rdquo 0970 0000 1000 0970 0985 0983 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09944 00535 0115 0995 00005 0995 0995 0995 09943 1000
Table 7 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Devanagari numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
0988 00342 00847
0969 0002 0984 0969 0977 0974 0999ldquo1rdquo 0969 0000 1000 0969 0984 0983 1000ldquo2rdquo 1000 0005 0956 1000 0977 0975 1000ldquo3rdquo 0985 0003 0970 0985 0977 0975 0999ldquo4rdquo 1000 0002 0985 1000 0992 0992 1000ldquo5rdquo 0985 0000 1000 0985 0992 0991 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 0985 0000 1000 0985 0992 0991 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 0988 00342 00857 09893 00012 09895 09893 09891 09881 09998
mean square error (RMSE) True Positive rate (TPR) FalsePositive rate (FPR) precision recall 119865-measure MatthewsCorrelationCoefficient (MCC) andArea under ROC (AUC)is computed Tables 5ndash9 provide the said statistical mea-surements for handwritten numeral recognition written inIndo-Arabic Bangla Devanagari Roman and Telugu scriptsrespectively
5 Conclusion
India is a multilingual and multiscript country comprisingof 12 different scripts But there are not much competentworks done towards handwritten numeral recognition ofIndic scripts The following issues are observed with hand-written digit recognition system (1) mostly they have worked
Applied Computational Intelligence and Soft Computing 15
Table 8 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Roman numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09975 016 02716
1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0995 0001 0995 0995 0995 0994 0999ldquo2rdquo 1000 0000 1000 1000 1000 1000 1000ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0995 0000 1000 0995 0997 0997 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 0994 0001 0989 0994 0991 0990 0999ldquo9rdquo 0994 0001 0994 0994 0994 0994 0999Mean 09975 016 02716 09978 00003 09978 09978 09977 09975 09997
Table 9 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Telugu numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09867 00051 00449
0980 0001 0990 0980 0985 0983 0988ldquo1rdquo 0970 0002 0980 0970 0975 0972 0991ldquo2rdquo 1000 0002 0980 1000 0990 0989 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 0999ldquo4rdquo 0990 0002 0980 0990 0985 0983 0998ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0003 0971 0990 0980 0978 0995ldquo7rdquo 0990 0002 0980 0990 0985 0983 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 0970 0000 1000 0970 0985 0983 0994Mean 09867 00051 00449 0988 00012 09881 0988 0988 09865 09965
on limited dataset (2) Training and testing times are notmentioned in most of the works (3) Most of the works havebeen done for Roman because of the availability of largerdataset like MNIST (4) Recognition systems for Indic scriptsare mainly focused on single script (5) Limitation to somefeature extraction methods also exist that is they are localto a particular scriptlanguage rather having a global scopeIn this work we have verified the effectiveness of a momentbased approach to handwritten digit recognition problemthat includes geometric moment moment invariant affinemoment invariant Legendre moment Zernike moment andcomplex moment The present scheme has been tested forfive different popular scripts namely Indo-Arabic BanglaDevanagari Roman and Telugu These methods have beenevaluated on the CMATER and MNIST databases usingmultiple classifiers FinallyMLP classifier is found to producethe highest recognition accuracies of 993 995 98929977 and 988 on Indo-Arabic Bangla DevanagariRoman and Telugu scripts respectively The results havedemonstrated that the application ofmoment based approachleads to a higher accuracy compared to its counterpartsAmong the most important ones an advantage of thisfeature extraction algorithm is that it is less computationally
expensive where the most of the published works need morecomputation time These features are also very simple toimplement compared to other methods It is obvious thatto improve the performance of proposed system furtherwe need to investigate more the sources of errors Potentialmoment features other than the presented ones may alsoexist
To further improve the performance possible futureworks are as follows (1) although the moment based featuresperform superbly on the whole complementary featureslike concavity analysis may help in discriminating confusingnumerals For example Indo-Arabic numerals ldquo2rdquo and ldquo3rdquo canbetter be separated by considering the original size beforenormalization (2) For classifier design it is better to selectmodel parameters (classifier structures) by cross validationrather than empirically as done in our experiments (3)Combining multiple classifiers can improve the recognitionaccuracy
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
16 Applied Computational Intelligence and Soft Computing
Acknowledgments
The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India
References
[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015
[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003
[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005
[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997
[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007
[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008
[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998
[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007
[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009
[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010
[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011
[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011
[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011
[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003
[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012
[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013
[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013
[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014
[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014
[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008
[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014
[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014
[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014
[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015
[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992
[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921
[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962
[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004
Applied Computational Intelligence and Soft Computing 17
[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005
[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996
[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990
[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984
[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985
[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions
in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995
[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001
[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001
[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996
[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992
[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014
[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006
Submit your manuscripts athttpwwwhindawicom
Computer Games Technology
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Distributed Sensor Networks
International Journal of
Advances in
FuzzySystems
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014
International Journal of
ReconfigurableComputing
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied Computational Intelligence and Soft Computing
thinspAdvancesthinspinthinsp
Artificial Intelligence
HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014
Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Electrical and Computer Engineering
Journal of
Journal of
Computer Networks and Communications
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation
httpwwwhindawicom Volume 2014
Advances in
Multimedia
International Journal of
Biomedical Imaging
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
ArtificialNeural Systems
Advances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
RoboticsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Computational Intelligence and Neuroscience
Industrial EngineeringJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Human-ComputerInteraction
Advances in
Computer EngineeringAdvances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
8 Applied Computational Intelligence and Soft Computing
Table 1 List of Zernike moments and their corresponding numbersof features from order 0 to order 10
Order(119899)
Zernike moments of order 119899with repetition119898 (119881
119899119898)
Total number ofmoments up to order 10
0 11988100
36
1 11988111
2 11988120 11988122
3 11988131 11988133
4 11988140 11988142 11988144
5 11988151 11988153 11988155
6 11988160 11988162 11988164 11988166
7 11988171 11988173 11988175 11988177
8 11988180 11988182 119881841198818611988188
9 11988191 11988193 119881951198819711988199
10 119881100 119881102 119881104 119881106 119881108 1198811010
(4) An image can better be described by a small set of itsZernike moments than any other types of momentssuch as geometric moments
(5) A relatively small set of Zernike moments can char-acterize the global shape of pattern Lower ordermoments represent the global shape of patternwhereas the higher order moments represent thedetails
Therefore we choose Zernike moments as our shape descrip-tor in digit recognition process Table 1 lists the rotationinvariant Zernike moment features (F29ndashF64) and theircorresponding numbers from order 0 to order 10 used for thepresent work
The defined features on the Zernike moments are onlyrotation invariant To obtain scale and translation invariancethe digit image is first subjected to a normalization processusing its regular moments The rotation invariant Zernikefeatures are then extracted from the scale and translationnormalized image
37 Complex Moments The notion of complex moments wasintroduced in [32] as a simple and straightforward techniqueto derive a set of invariant moments The two-dimensionalcomplex moments of order (119901 119902) for the image function119891(119909 119910) are defined by
119862119901119902= int1198862
1198861
int1198872
1198871
(119909 + 119895119910)119901
(119909 minus 119895119910)119902
119891 (119909 119910) 119889119909 119889119910 (32)
where 119901 and 119902 are nonnegative integers and 119895 = radicminus1 Someadvantages of the complex moments can be described asfollows
(1) When the central complex moments are taken as thefeatures the effects of the imagersquos lateral displacementcan be eliminated
(2) A set of complex moment invariants can also bederived which are invariant to the rotation of theobject
(3) Since the complex moment is an intermediate stepbetween ordinary moments and moment invariantsit is relatively more simple to compute and morepowerful than other moment features in any patternclassification problem
The complex moments of order (119901 119902) are a linear combina-tion with complex coefficients of all the geometric moments119872119899119898 satisfying 119901 + 119902 = 119899 + 119898 In polar coordinates the
complex moments of order (119901 + 119902) can be written as follows
119862119897
119899= 119862119901119902
= int2120587
0
int+infin
0
120588119901+119902
119890119895(119901minus119902)120579
119891 (120588 cos 120579 120588 sin 120579) 120588 119889120588 119889120579(33)
where119901+119902 = 119899 and119901minus119902 = 119897denote the order and repetition ofthe complex moments respectively If the complex momentof the original image and that of the rotated image in thesame polar coordinates are denoted by 119862
119901119902and 119862
119903
119901119902 the
relationship [33] between them is given as follows
119862119903
119901119902= 119862119901119902119890minus119895(119901minus119902)120579
(34)
where 120579 is the angle at which the original image is rotatedThecomplex moment features represent the invariant propertiesto lateral displacement and rotation Based on the definitionof moment invariants we know that as the image is rotatedeach complex moment goes through all possible phasesof a complex number while its magnitude |119862
119901119902| remains
unchanged If the exponential factor of the complex momentis canceled out we will obtain its absolute invariant valuewhich is invariant to the rotation of the images The rotationinvariant complex moment features (F65ndashF130) and theircorresponding numbers from order 0 to order 10 used for thepresent work are listed in Table 2
Finally a feature vector consisting of 130 moment basedfeatures is calculated from each of the handwritten numeralimages belonging to five different scripts Summarization ofthe overall moment based feature set used in the present workis enlisted in Table 3
4 Experimental Study and Analysis
In this section we present the detailed experimental resultsto illustrate the suitability of moment based approach tohandwritten digit recognition All the experiments are imple-mented inMATLAB 2010 under aWindows XP environmenton an Intel Core2 Duo 24GHz processor with 1 GB of RAMand performed on gray-scale digit imagesThe accuracy usedas assessment criteria for measuring the performance of theproposed system is expressed as follows
Accuracy Rate () =Number of Correctly classified digits
Total number of digitstimes 100 (35)
Applied Computational Intelligence and Soft Computing 9
Table 2 List of complex moments and their corresponding numbers of features from order 0 to order 10
Order(119899) Complex moments (119862
119901119902) Complex moments of order 119899 with repetition 119897 (119862119897
119899) Total number of moments
up to order 100 119862
00 1198620
0
66
1 11986210 11986201 119862
1
1 119862minus1
1
2 11986220 11986211 11986202 119862
2
2 1198620
2 119862minus2
2
3 11986230 11986221 11986212 11986203 119862
3
3 1198621
3 119862minus1
3119862minus3
3
4 11986240 11986231 11986222 11986213 11986204 119862
4
4 1198622
4 1198620
4 119862minus2
4 119862minus4
4
5 11986250 11986241 11986232 11986223 11986214 11986205 119862
5
5 1198623
5 1198621
5 119862minus1
5 119862minus3
5 119862minus5
5
6 11986260 11986251 11986242 11986233 11986224 11986215 11986206 119862
6
6 1198624
6 1198622
6 1198620
6 119862minus2
6 119862minus4
6 119862minus6
6
7 11986270 11986261 11986252 11986243 11986234 11986225 11986216 11986207 119862
7
7 1198625
7 1198623
7 1198621
7 119862minus1
7 119862minus3
7 119862minus4
7 119862minus7
7
8 11986280 11986271 11986262 11986253 11986244 11986235 11986226 11986217 11986208 119862
8
8 1198626
8 1198624
8 1198622
8 1198620
8 119862minus2
8 119862minus4
8 119862minus6
8 119862minus8
8
9 11986290 11986281 11986272 11986263 11986254 11986245 11986236 11986227 11986218 11986209 119862
9
9 1198627
9 1198625
9 1198623
9 1198621
9 119862minus1
9 119862minus3
9 119862minus5
9 119862minus7
9 119862minus9
9
10 119862100 11986291 11986282 11986273 11986264 11986255 11986246 11986237 11986228 11986219 119862010 119862
10
10 1198628
10 1198626
10 1198624
10 1198622
10 1198620
10 119862minus2
10 119862minus4
10 119862minus6
10 119862minus8
10 119862minus10
10
Table 3 Description of feature vector
Serialnumber List of moments Number of
features1 Geometric moment (F1ndashF5) 52 Moment invariant (F6ndashF12) 73 Affine moment invariant (F13ndashF18) 64 Legendre moment (F19ndashF28) 105 Zernike moment (F29ndashF64) 366 Complex moment (F65ndashF130) 66
Total 130
41 Detailed Dataset Description Handwritten numeralsfrom five different popular scripts namely Indo-ArabicBangla Devanagari Roman and Telugu are used in theexperiments for investigating the effectiveness of themomentbased feature sets as compared to conventional features Indo-Arabic or Eastern-Arabic is widely used in the Middle-Eastand also in the Indian subcontinent On the other handDevanagari and Bangla are ranked as the top two popular (interms of the number of native speakers) scripts in the Indiansubcontinent [34] Roman originally evolved from the Greekalphabet is spoken and used all over the world Also Teluguone of the oldest andpopular South Indian languages of Indiais spoken by more than 74 million people [34] It essentiallyranks third by the number of native speakers in India
The present approach is tested on the database named asCMATERdb3 where CMATER stands for Center for Micro-processor Application for Training Education and Researcha research laboratory at Computer Science and EngineeringDepartment of Jadavpur University India where the currentresearch activity took place db stands for database andthe numeric value 3 represents handwritten digit recogni-tion database stored in the said database repository Thetesting is currently done on four versions of CMATERdb3namely CMATERdb311 CMATERdb321 CMATERdb331and CMATERdb341 representing the databases created forhandwritten digit recognition system for four major scriptsnamely BanglaDevanagari Indo-Arabic and Telugu respec-tively
Each of the digit images are first preprocessed using basicoperations of skew corrections and morphological filtering[25] and then binarized using an adaptive global thresholdvalue computed as the average of minimum and maximumintensities in that imageThe binarized digit images may con-tain noisy pixels which have been removed by using Gaussianfilter [25] A well-known algorithm known as Canny EdgeDetection algorithm [25] is then applied for smoothing theedges of the binarized digit images Finally the boundingrectangular box of each digit image is separately normalizedto 32 times 32 pixels Database is made available freely in theCMATER website (httpwwwcmaterjuorgcmaterdbhtm)and at httpcodegooglecompcmaterdb
A dataset of 3000 digit samples is considered for each ofthe Devanagari Indo-Arabic and Telugu scripts For each ofthese datasets 2000 samples are used for training purposeand the rest of the samples are used for the test purposewhereas a dataset of 6000 samples is used by selecting 600samples for each of 10-digit classes of handwritten Bangladigits A training set of 4000 samples and a test set of 2000samples are then chosen for Bangla numerals by consideringequal number of digit samples from each class For Romannumerals a dataset of 6000 training samples is formed byrandom selection from the standard handwritten MNIST [7]training dataset of size 60000 samples In the same way4000 digit samples are selected from MNIST test dataset ofsize 10000 samples These digit samples are enclosed in aminimum bounding square and are normalized to 32 times 32pixels dimension Typical handwritten digit samples takenfrom the abovementioned databases used for evaluating thepresent work are shown in Figure 3
42 Recognition Process To realize the effectiveness of theproposed approach our comprehensive experimental testsare conducted on the five aforementioned datasets A totalof 6000 (for Devanagari Indo-Arabic and Telugu scripts)numerals have been used for the training purpose whereasthe remaining 3000 numerals (1000 from each of the script)have been used for the testing purpose For Bangla andRoman scripts a total of 8000 numerals (4000 taken from
10 Applied Computational Intelligence and Soft Computing
(a) (b)
(c) (d)
(e)
Figure 3 Samples of digit images taken from CMATER and MNIST databases written in five different scripts (a) Indo-Arabic (b) Bangla(c) Devanagari (d) Roman and (e) Telugu
each script) have been used for the training purpose whereasthe remaining 4000 numerals (2000 taken from each script)have been used for the testing purpose The designed featureset has been individually applied to eight well-known classi-fiers namely Naıve Bayes Bayes Net MLP SVM RandomForest Bagging Multiclass Classifier and Logistic For thepresent work the following abovementioned classifiers withthe given parameters are designed
Naıve Bayes Naıve Bayes classifier for details refer to[35]
Bayes Net Estimator = SimpleEstimator-A 05 searchalgorithm = K2MLP Learning Rate = 03Momentum= 02 Numberof Epochs = 1000 minerror = 002SVM Support Vector Machine using radial basiskernel with (119901 = 1) for details refer to [36]Random Forest Ensemble classifier that consists ofmany decision trees and outputs the class that is themode of the classes output by individual trees fordetails refer to [37]
Applied Computational Intelligence and Soft Computing 11
Table 4 Recognition accuracies of eight classifiers and their corresponding ranks using 12 different datasets (ranks in the parentheses areused for performing the Friedman test)
DatasetsRecognition accuracy ()
ClassifiersNaıve Bayes Bayes Net MLP SVM Random Forest Bagging Multiclass Classifier Logistic
1 86 (8) 92 (7) 100 (1) 99 (25) 97 (6) 99 (25) 98 (45) 98 (45)
2 99 (3) 98 (55) 100 (1) 99 (3) 96 (75) 96 (75) 97 (55) 99 (3)
3 98 (5) 94 (7) 100 (1) 93 (8) 99 (25) 98 (5) 99 (25) 98 (5)
4 93 (8) 94 (7) 99 (15) 98 (35) 98 (35) 97 (5) 96 (6) 99 (15)
5 91 (8) 92 (7) 100 (15) 99 (35) 99 (35) 97 (6) 98 (5) 100 (15)
6 99 (25) 98 (45) 100 (1) 96 (7) 97 (6) 98 (45) 99 (25) 95 (8)
7 91 (8) 94 (7) 100 (1) 99 (3) 99 (3) 99 (3) 98 (55) 98 (55)
8 91 (8) 96 (7) 100 (1) 98 (45) 98 (45) 97 (6) 99 (25) 99 (25)
9 92 (8) 94 (7) 100 (15) 100 (15) 97 (6) 99 (4) 99 (4) 99 (4)
10 99 (25) 97 (65) 100 (1) 99 (25) 97 (65) 97 (65) 98 (4) 97 (65)
11 94 (8) 96 (7) 100 (1) 98 (5) 99 (3) 98 (5) 99 (3) 99 (3)
12 98 (4) 98 (4) 100 (1) 97 (7) 98 (4) 99 (2) 97 (7) 97 (7)
Mean rank R1 = 608 R2 = 637 R3 = 1125 R4 = 425 R5 = 467 R6 = 475 R7 = 433 R8 = 433
Bagging Bagging Classifier for detail refer to [38]Multiclass Classifier Method = ldquo1 against allrdquo ran-domWidthFactor = 20 seed = 1Logistic LogitBoost is used with simple regressionfunctions as base learner for details refer to [39]
The design parameters of classifiers are chosen as typicalvalues used in the literature or by experience The classifiersare not specifically tuned for the dataset at hand eventhough they may achieve a better performance with anotherparameter set since the goal is to design an automatedhandwritten digit recognition system based on the chosen setof classifiers
The digit recognition performances of the present tech-nique using each of these classifiers and their correspondingsuccess rates achieved at 95 confidence level are shownin Figures 4(a)-4(b) respectively It can be seen fromFigure 4 that the highest digit recognition accuracy hasbeen achieved by the MLP classifier which are found to be993 995 9892 9977 and 988 on Indo-ArabicBangla Devanagari Roman and Telugu scripts respectivelyThe performance analysis involves two parameters namelyModel Building Time (MBT) and Recognition Time (RT)MBT is based on the time required to train the system onthe given training samples whereas RT is based on the timerequired to recognize the given test samples The MBT andRT required for the abovementioned classifiers on all the fivedatabases are shown in Figures 5(a)-5(b)
43 Statistical Significance Tests The statistical significancetest is one of the essential ways for validating the performanceof themultiple classifiers usingmultiple datasets To do so wehave performed a safe and robust nonparametric Friedmantest [40] with the corresponding post hoc tests on Indo-Arabic script database For the present experimental setupthe number of datasets (119873) and the number of classifiers (119896)are set as 12 and 8 respectively These datasets are chosenrandomly from the test setTheperformances of the classifiers
on different datasets are shown in Table 4 On the basis ofthese performances the classifiers are then ranked for eachdataset separately the best performing algorithm gets therank 1 the second best gets rank 2 and so on (see Table 4)In case of ties average ranks are assigned to the classifiers tobreak the tie
Let 119903119894119895be the rank of the 119895th classifier on 119894th datasetThen
the mean of the ranks of the 119895th classifier over all the 119873datasets will be computed as follows
119877119895=1
119873
119873
sum119894=1
119903119894
119895 (36)
The null hypothesis states that all the classifiers are equivalentand so their ranks 119877
119895should be equal To justify it the
Friedman statistic [40] is computed as follows
1205942
119865=
12119873
119896 (119896 + 1)[
[
sum119895
1198772
119895minus119896 (119896 + 1)
2
4]
]
(37)
Under the current experimentation this statistic is dis-tributed according to 1205942
119865with 119896 minus 1 (=7) degrees of freedom
Using (37) the value of 1205942119865is calculated as 3046 From the
table of critical values (see any standard statistical book) thevalue of 1205942
119865with 7 degrees of freedom is 140671 for 120572 = 005
(where 120572 is known as level of significance) It can be seen thatthe computed 1205942
119865differs significantly from the standard 1205942
119865
So the null hypothesis is rejectedSingh et al [40] derived a better statistic using the
following formula
119865119865=
(119873 minus 1) 1205942
119865
119873(119896 minus 1) minus 1205942119865
(38)
119865119865is distributed according to the 119865-distribution with 119896 minus 1
(=7) and (119896minus1)(119873minus1) (=77) degrees of freedom Using (38)the value of 119865
119865is calculated as 80659 The critical value of 119865
(7 77) for 120572 = 005 is 2147 (see any standard statistical book)
12 Applied Computational Intelligence and Soft Computing
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
8486889092949698
100
Reco
gniti
on ac
cura
cy (
)
(a)
Classifiers
ArabicBanglaDevanagari
TeluguRoman
888990919293949596979899
100
95
confi
denc
e sco
re (
)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(b)
Figure 4 Graph showing (a) recognition accuracies and (b) 95 confidence scores of the proposed handwritten digit recognition techniqueusing eight well-known classifiers on digits of five different scripts
05
10152025303540
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Mod
el b
uild
ing
time (
s)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(a)
0
1
2
3
4
5
6
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Reco
gniti
on ti
me (
s)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(b)
Figure 5 Graphical comparison of (a) MBTs and (b) RTs required by eight different classifiers on all the five databases for handwritten digitrecognition
which shows a significant difference between the standardand calculated values of 119865
119865 Thus both Friedman and Iman
et al statistics reject the null hypothesisAs the null hypothesis is rejected a post hoc test known
as the Nemenyi test [40] is carried out for pairwise com-parisons of the best and worst performing classifiers Theperformances of two classifiers are significantly different ifthe corresponding average ranks differ by at least the criticaldifference (CD) which is expressed as follows
CD = 119902120572
radic119896 (119896 + 1)
6119873 (39)
For the Nemenyi test the value of 119902005
for eight classifiersis 3031 (see Table 5(a) of [41]) So the CD is calculated as3031radic89612 that is 3031 using (39) Since the differencebetween mean ranks of the best and worst classifier is muchgreater than the CD (see Table 3) we can conclude that thereis a significant difference between the performing abilitiesof the classifiers For comparing all classifiers with a controlclassifier (say MLP) we have applied the Bonferroni-Dunntest [40] For this test CD is calculated using the same (39)But here the value of 119902
005for eight classifiers is 2690 (see
Table 5(b) of [41]) So the CD for the Bonferroni-Dunntest is calculated as 2690radic89612 that is 2690 As the
Applied Computational Intelligence and Soft Computing 13
49555245
3125 3545 36253205 3205
3031
Differences of mean ranksDifference of mean ranksCD
0
1
2
3
4
5
6M
ean
rank
s
|R3minusR1|
|R3minusR2|
|R3minusR4|
|R3minusR5|
|R3minusR6|
|R3minusR7|
|R3minusR8|
Pairwise comparison of classifiers for the Nemenyi test
(a)
49555245
3125 3545 36253205 3205
269
Differences of mean ranksDifference of mean ranksCD
|R3minusR1|
|R3minusR2|
|R3minusR4|
|R3minusR5|
|R3minusR6|
|R3minusR7|
|R3minusR8|0
1
2
3
4
5
6
Mea
n ra
nks
Comparison of classifiers with MLP for the Bonferroni-Dunn test
(b)
Figure 6 Graphical representation of comparison of multiple classifiers for (a) the Nemenyi test and (b) the Bonferroni-Dunn test
difference between the mean ranks of any classifier and MLPis always greater than CD (see Table 3) the chosen controlclassifier performs significantly better than other classifiersfor Indo-Arabic database A graphical representation of theabovementioned post hoc tests for comparison of eight dif-ferent classifiers on Dataset 1 is shown in Figure 6 Similarlyit can also be shown for Bangla Devanagari Roman andTelugu databases that the chosen classifier (MLP) performssignificantly better than the other seven classifiers
44 Comparison among Moment Based Features For thejustification of the feature set used in the present workthe diverse combinations of six different types of momentsnamely geometric moment (F1ndashF5) moment invariant(F6ndashF12) affine moment invariant (F13ndashF18) Legendremoment (F19ndashF28) Zernike moment (F29ndashF64) and com-plex moment (F65ndashF130) are compared by considering allthe possible combinations This is done for measuring thediscriminating strength of the individual moment featuresand their combinations based on their complementary infor-mation These can be listed as follows
(a) Geometric moment + moment invariant + affineMoment invariant (F1ndashF18)
(b) Legendre moment (F19ndashF28)(c) Geometric moment + moment invariant + affine
moment invariant + Legendre moment (F1ndashF28)(d) Zernike moment (F29ndashF64)(e) Geometric moment + moment invariant + affine
moment invariant + Legendre moment + Zernikemoment (F1ndashF64)
(f) Legendre moment + Zernike moment (F19ndashF64)(g) Complex moment (F65ndashF130)(h) Zernike moment + complex moment (F29ndashF130)
86
88
90
92
94
96
98
100
Feature set and all possible combinations
ArabicBanglaDevanagari
RomanTelugu
Reco
gniti
on ac
cura
cy (
)
F1
ndashF18
F1
ndashF64
F19
ndashF28
F1
ndashF28
F29
ndashF64
F19
ndashF64
F65
ndashF130
F29
ndashF130
F1
ndashF130
Figure 7Graphical comparison showing the recognition accuraciesof all the possible combinations of moment based features achievedby MLP classifier
(i) Geometric moment + moment invariant + affinemoment invariant + Legendre moment + Zernikemoment + complex moment (F1ndashF130)
The graphical comparison of the corresponding numeralrecognition accuracies achieved by MLP classifier over thesame test set is shown in Figure 7 It can be observed fromFigure 7 that the present combination of moment feature setoutperforms all the other possible combinations
45 Detail Evaluation of MLP Classifier In the present workdetailed error analysis with respect to different parametersnamely Kappa statistics mean absolute error (MAE) root
14 Applied Computational Intelligence and Soft Computing
Table 5 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Indo-Arabic numerals (here MAE means mean absolute error RMSE means root mean square error TPR means True Positiverate FPR means False Positive rate MCC means Matthews Correlation Coefficient and AUC means Area under ROC)
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09922 01758 0293
1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0990 0001 0990 0990 0990 0989 1000ldquo2rdquo 0960 0002 0980 0960 0970 0966 0990ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0004 0961 0990 0975 0973 0999ldquo7rdquo 0990 0000 1000 0990 0995 0994 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09922 01758 0293 0993 00007 09931 0993 0993 09922 09989
Table 6 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Bangla numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09944 00535 0115
1000 0001 0990 1000 0995 0994 1000ldquo1rdquo 1000 0002 0980 1000 0990 0989 1000ldquo2rdquo 1000 0001 0990 1000 0995 0994 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0990 0001 0990 0990 0990 0989 1000ldquo6rdquo 0970 0000 1000 0970 0985 0983 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09944 00535 0115 0995 00005 0995 0995 0995 09943 1000
Table 7 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Devanagari numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
0988 00342 00847
0969 0002 0984 0969 0977 0974 0999ldquo1rdquo 0969 0000 1000 0969 0984 0983 1000ldquo2rdquo 1000 0005 0956 1000 0977 0975 1000ldquo3rdquo 0985 0003 0970 0985 0977 0975 0999ldquo4rdquo 1000 0002 0985 1000 0992 0992 1000ldquo5rdquo 0985 0000 1000 0985 0992 0991 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 0985 0000 1000 0985 0992 0991 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 0988 00342 00857 09893 00012 09895 09893 09891 09881 09998
mean square error (RMSE) True Positive rate (TPR) FalsePositive rate (FPR) precision recall 119865-measure MatthewsCorrelationCoefficient (MCC) andArea under ROC (AUC)is computed Tables 5ndash9 provide the said statistical mea-surements for handwritten numeral recognition written inIndo-Arabic Bangla Devanagari Roman and Telugu scriptsrespectively
5 Conclusion
India is a multilingual and multiscript country comprisingof 12 different scripts But there are not much competentworks done towards handwritten numeral recognition ofIndic scripts The following issues are observed with hand-written digit recognition system (1) mostly they have worked
Applied Computational Intelligence and Soft Computing 15
Table 8 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Roman numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09975 016 02716
1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0995 0001 0995 0995 0995 0994 0999ldquo2rdquo 1000 0000 1000 1000 1000 1000 1000ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0995 0000 1000 0995 0997 0997 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 0994 0001 0989 0994 0991 0990 0999ldquo9rdquo 0994 0001 0994 0994 0994 0994 0999Mean 09975 016 02716 09978 00003 09978 09978 09977 09975 09997
Table 9 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Telugu numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09867 00051 00449
0980 0001 0990 0980 0985 0983 0988ldquo1rdquo 0970 0002 0980 0970 0975 0972 0991ldquo2rdquo 1000 0002 0980 1000 0990 0989 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 0999ldquo4rdquo 0990 0002 0980 0990 0985 0983 0998ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0003 0971 0990 0980 0978 0995ldquo7rdquo 0990 0002 0980 0990 0985 0983 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 0970 0000 1000 0970 0985 0983 0994Mean 09867 00051 00449 0988 00012 09881 0988 0988 09865 09965
on limited dataset (2) Training and testing times are notmentioned in most of the works (3) Most of the works havebeen done for Roman because of the availability of largerdataset like MNIST (4) Recognition systems for Indic scriptsare mainly focused on single script (5) Limitation to somefeature extraction methods also exist that is they are localto a particular scriptlanguage rather having a global scopeIn this work we have verified the effectiveness of a momentbased approach to handwritten digit recognition problemthat includes geometric moment moment invariant affinemoment invariant Legendre moment Zernike moment andcomplex moment The present scheme has been tested forfive different popular scripts namely Indo-Arabic BanglaDevanagari Roman and Telugu These methods have beenevaluated on the CMATER and MNIST databases usingmultiple classifiers FinallyMLP classifier is found to producethe highest recognition accuracies of 993 995 98929977 and 988 on Indo-Arabic Bangla DevanagariRoman and Telugu scripts respectively The results havedemonstrated that the application ofmoment based approachleads to a higher accuracy compared to its counterpartsAmong the most important ones an advantage of thisfeature extraction algorithm is that it is less computationally
expensive where the most of the published works need morecomputation time These features are also very simple toimplement compared to other methods It is obvious thatto improve the performance of proposed system furtherwe need to investigate more the sources of errors Potentialmoment features other than the presented ones may alsoexist
To further improve the performance possible futureworks are as follows (1) although the moment based featuresperform superbly on the whole complementary featureslike concavity analysis may help in discriminating confusingnumerals For example Indo-Arabic numerals ldquo2rdquo and ldquo3rdquo canbetter be separated by considering the original size beforenormalization (2) For classifier design it is better to selectmodel parameters (classifier structures) by cross validationrather than empirically as done in our experiments (3)Combining multiple classifiers can improve the recognitionaccuracy
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
16 Applied Computational Intelligence and Soft Computing
Acknowledgments
The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India
References
[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015
[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003
[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005
[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997
[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007
[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008
[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998
[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007
[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009
[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010
[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011
[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011
[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011
[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003
[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012
[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013
[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013
[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014
[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014
[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008
[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014
[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014
[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014
[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015
[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992
[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921
[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962
[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004
Applied Computational Intelligence and Soft Computing 17
[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005
[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996
[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990
[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984
[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985
[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions
in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995
[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001
[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001
[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996
[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992
[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014
[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006
Submit your manuscripts athttpwwwhindawicom
Computer Games Technology
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Distributed Sensor Networks
International Journal of
Advances in
FuzzySystems
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014
International Journal of
ReconfigurableComputing
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied Computational Intelligence and Soft Computing
thinspAdvancesthinspinthinsp
Artificial Intelligence
HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014
Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Electrical and Computer Engineering
Journal of
Journal of
Computer Networks and Communications
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation
httpwwwhindawicom Volume 2014
Advances in
Multimedia
International Journal of
Biomedical Imaging
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
ArtificialNeural Systems
Advances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
RoboticsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Computational Intelligence and Neuroscience
Industrial EngineeringJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Human-ComputerInteraction
Advances in
Computer EngineeringAdvances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied Computational Intelligence and Soft Computing 9
Table 2 List of complex moments and their corresponding numbers of features from order 0 to order 10
Order(119899) Complex moments (119862
119901119902) Complex moments of order 119899 with repetition 119897 (119862119897
119899) Total number of moments
up to order 100 119862
00 1198620
0
66
1 11986210 11986201 119862
1
1 119862minus1
1
2 11986220 11986211 11986202 119862
2
2 1198620
2 119862minus2
2
3 11986230 11986221 11986212 11986203 119862
3
3 1198621
3 119862minus1
3119862minus3
3
4 11986240 11986231 11986222 11986213 11986204 119862
4
4 1198622
4 1198620
4 119862minus2
4 119862minus4
4
5 11986250 11986241 11986232 11986223 11986214 11986205 119862
5
5 1198623
5 1198621
5 119862minus1
5 119862minus3
5 119862minus5
5
6 11986260 11986251 11986242 11986233 11986224 11986215 11986206 119862
6
6 1198624
6 1198622
6 1198620
6 119862minus2
6 119862minus4
6 119862minus6
6
7 11986270 11986261 11986252 11986243 11986234 11986225 11986216 11986207 119862
7
7 1198625
7 1198623
7 1198621
7 119862minus1
7 119862minus3
7 119862minus4
7 119862minus7
7
8 11986280 11986271 11986262 11986253 11986244 11986235 11986226 11986217 11986208 119862
8
8 1198626
8 1198624
8 1198622
8 1198620
8 119862minus2
8 119862minus4
8 119862minus6
8 119862minus8
8
9 11986290 11986281 11986272 11986263 11986254 11986245 11986236 11986227 11986218 11986209 119862
9
9 1198627
9 1198625
9 1198623
9 1198621
9 119862minus1
9 119862minus3
9 119862minus5
9 119862minus7
9 119862minus9
9
10 119862100 11986291 11986282 11986273 11986264 11986255 11986246 11986237 11986228 11986219 119862010 119862
10
10 1198628
10 1198626
10 1198624
10 1198622
10 1198620
10 119862minus2
10 119862minus4
10 119862minus6
10 119862minus8
10 119862minus10
10
Table 3 Description of feature vector
Serialnumber List of moments Number of
features1 Geometric moment (F1ndashF5) 52 Moment invariant (F6ndashF12) 73 Affine moment invariant (F13ndashF18) 64 Legendre moment (F19ndashF28) 105 Zernike moment (F29ndashF64) 366 Complex moment (F65ndashF130) 66
Total 130
41 Detailed Dataset Description Handwritten numeralsfrom five different popular scripts namely Indo-ArabicBangla Devanagari Roman and Telugu are used in theexperiments for investigating the effectiveness of themomentbased feature sets as compared to conventional features Indo-Arabic or Eastern-Arabic is widely used in the Middle-Eastand also in the Indian subcontinent On the other handDevanagari and Bangla are ranked as the top two popular (interms of the number of native speakers) scripts in the Indiansubcontinent [34] Roman originally evolved from the Greekalphabet is spoken and used all over the world Also Teluguone of the oldest andpopular South Indian languages of Indiais spoken by more than 74 million people [34] It essentiallyranks third by the number of native speakers in India
The present approach is tested on the database named asCMATERdb3 where CMATER stands for Center for Micro-processor Application for Training Education and Researcha research laboratory at Computer Science and EngineeringDepartment of Jadavpur University India where the currentresearch activity took place db stands for database andthe numeric value 3 represents handwritten digit recogni-tion database stored in the said database repository Thetesting is currently done on four versions of CMATERdb3namely CMATERdb311 CMATERdb321 CMATERdb331and CMATERdb341 representing the databases created forhandwritten digit recognition system for four major scriptsnamely BanglaDevanagari Indo-Arabic and Telugu respec-tively
Each of the digit images are first preprocessed using basicoperations of skew corrections and morphological filtering[25] and then binarized using an adaptive global thresholdvalue computed as the average of minimum and maximumintensities in that imageThe binarized digit images may con-tain noisy pixels which have been removed by using Gaussianfilter [25] A well-known algorithm known as Canny EdgeDetection algorithm [25] is then applied for smoothing theedges of the binarized digit images Finally the boundingrectangular box of each digit image is separately normalizedto 32 times 32 pixels Database is made available freely in theCMATER website (httpwwwcmaterjuorgcmaterdbhtm)and at httpcodegooglecompcmaterdb
A dataset of 3000 digit samples is considered for each ofthe Devanagari Indo-Arabic and Telugu scripts For each ofthese datasets 2000 samples are used for training purposeand the rest of the samples are used for the test purposewhereas a dataset of 6000 samples is used by selecting 600samples for each of 10-digit classes of handwritten Bangladigits A training set of 4000 samples and a test set of 2000samples are then chosen for Bangla numerals by consideringequal number of digit samples from each class For Romannumerals a dataset of 6000 training samples is formed byrandom selection from the standard handwritten MNIST [7]training dataset of size 60000 samples In the same way4000 digit samples are selected from MNIST test dataset ofsize 10000 samples These digit samples are enclosed in aminimum bounding square and are normalized to 32 times 32pixels dimension Typical handwritten digit samples takenfrom the abovementioned databases used for evaluating thepresent work are shown in Figure 3
42 Recognition Process To realize the effectiveness of theproposed approach our comprehensive experimental testsare conducted on the five aforementioned datasets A totalof 6000 (for Devanagari Indo-Arabic and Telugu scripts)numerals have been used for the training purpose whereasthe remaining 3000 numerals (1000 from each of the script)have been used for the testing purpose For Bangla andRoman scripts a total of 8000 numerals (4000 taken from
10 Applied Computational Intelligence and Soft Computing
(a) (b)
(c) (d)
(e)
Figure 3 Samples of digit images taken from CMATER and MNIST databases written in five different scripts (a) Indo-Arabic (b) Bangla(c) Devanagari (d) Roman and (e) Telugu
each script) have been used for the training purpose whereasthe remaining 4000 numerals (2000 taken from each script)have been used for the testing purpose The designed featureset has been individually applied to eight well-known classi-fiers namely Naıve Bayes Bayes Net MLP SVM RandomForest Bagging Multiclass Classifier and Logistic For thepresent work the following abovementioned classifiers withthe given parameters are designed
Naıve Bayes Naıve Bayes classifier for details refer to[35]
Bayes Net Estimator = SimpleEstimator-A 05 searchalgorithm = K2MLP Learning Rate = 03Momentum= 02 Numberof Epochs = 1000 minerror = 002SVM Support Vector Machine using radial basiskernel with (119901 = 1) for details refer to [36]Random Forest Ensemble classifier that consists ofmany decision trees and outputs the class that is themode of the classes output by individual trees fordetails refer to [37]
Applied Computational Intelligence and Soft Computing 11
Table 4 Recognition accuracies of eight classifiers and their corresponding ranks using 12 different datasets (ranks in the parentheses areused for performing the Friedman test)
DatasetsRecognition accuracy ()
ClassifiersNaıve Bayes Bayes Net MLP SVM Random Forest Bagging Multiclass Classifier Logistic
1 86 (8) 92 (7) 100 (1) 99 (25) 97 (6) 99 (25) 98 (45) 98 (45)
2 99 (3) 98 (55) 100 (1) 99 (3) 96 (75) 96 (75) 97 (55) 99 (3)
3 98 (5) 94 (7) 100 (1) 93 (8) 99 (25) 98 (5) 99 (25) 98 (5)
4 93 (8) 94 (7) 99 (15) 98 (35) 98 (35) 97 (5) 96 (6) 99 (15)
5 91 (8) 92 (7) 100 (15) 99 (35) 99 (35) 97 (6) 98 (5) 100 (15)
6 99 (25) 98 (45) 100 (1) 96 (7) 97 (6) 98 (45) 99 (25) 95 (8)
7 91 (8) 94 (7) 100 (1) 99 (3) 99 (3) 99 (3) 98 (55) 98 (55)
8 91 (8) 96 (7) 100 (1) 98 (45) 98 (45) 97 (6) 99 (25) 99 (25)
9 92 (8) 94 (7) 100 (15) 100 (15) 97 (6) 99 (4) 99 (4) 99 (4)
10 99 (25) 97 (65) 100 (1) 99 (25) 97 (65) 97 (65) 98 (4) 97 (65)
11 94 (8) 96 (7) 100 (1) 98 (5) 99 (3) 98 (5) 99 (3) 99 (3)
12 98 (4) 98 (4) 100 (1) 97 (7) 98 (4) 99 (2) 97 (7) 97 (7)
Mean rank R1 = 608 R2 = 637 R3 = 1125 R4 = 425 R5 = 467 R6 = 475 R7 = 433 R8 = 433
Bagging Bagging Classifier for detail refer to [38]Multiclass Classifier Method = ldquo1 against allrdquo ran-domWidthFactor = 20 seed = 1Logistic LogitBoost is used with simple regressionfunctions as base learner for details refer to [39]
The design parameters of classifiers are chosen as typicalvalues used in the literature or by experience The classifiersare not specifically tuned for the dataset at hand eventhough they may achieve a better performance with anotherparameter set since the goal is to design an automatedhandwritten digit recognition system based on the chosen setof classifiers
The digit recognition performances of the present tech-nique using each of these classifiers and their correspondingsuccess rates achieved at 95 confidence level are shownin Figures 4(a)-4(b) respectively It can be seen fromFigure 4 that the highest digit recognition accuracy hasbeen achieved by the MLP classifier which are found to be993 995 9892 9977 and 988 on Indo-ArabicBangla Devanagari Roman and Telugu scripts respectivelyThe performance analysis involves two parameters namelyModel Building Time (MBT) and Recognition Time (RT)MBT is based on the time required to train the system onthe given training samples whereas RT is based on the timerequired to recognize the given test samples The MBT andRT required for the abovementioned classifiers on all the fivedatabases are shown in Figures 5(a)-5(b)
43 Statistical Significance Tests The statistical significancetest is one of the essential ways for validating the performanceof themultiple classifiers usingmultiple datasets To do so wehave performed a safe and robust nonparametric Friedmantest [40] with the corresponding post hoc tests on Indo-Arabic script database For the present experimental setupthe number of datasets (119873) and the number of classifiers (119896)are set as 12 and 8 respectively These datasets are chosenrandomly from the test setTheperformances of the classifiers
on different datasets are shown in Table 4 On the basis ofthese performances the classifiers are then ranked for eachdataset separately the best performing algorithm gets therank 1 the second best gets rank 2 and so on (see Table 4)In case of ties average ranks are assigned to the classifiers tobreak the tie
Let 119903119894119895be the rank of the 119895th classifier on 119894th datasetThen
the mean of the ranks of the 119895th classifier over all the 119873datasets will be computed as follows
119877119895=1
119873
119873
sum119894=1
119903119894
119895 (36)
The null hypothesis states that all the classifiers are equivalentand so their ranks 119877
119895should be equal To justify it the
Friedman statistic [40] is computed as follows
1205942
119865=
12119873
119896 (119896 + 1)[
[
sum119895
1198772
119895minus119896 (119896 + 1)
2
4]
]
(37)
Under the current experimentation this statistic is dis-tributed according to 1205942
119865with 119896 minus 1 (=7) degrees of freedom
Using (37) the value of 1205942119865is calculated as 3046 From the
table of critical values (see any standard statistical book) thevalue of 1205942
119865with 7 degrees of freedom is 140671 for 120572 = 005
(where 120572 is known as level of significance) It can be seen thatthe computed 1205942
119865differs significantly from the standard 1205942
119865
So the null hypothesis is rejectedSingh et al [40] derived a better statistic using the
following formula
119865119865=
(119873 minus 1) 1205942
119865
119873(119896 minus 1) minus 1205942119865
(38)
119865119865is distributed according to the 119865-distribution with 119896 minus 1
(=7) and (119896minus1)(119873minus1) (=77) degrees of freedom Using (38)the value of 119865
119865is calculated as 80659 The critical value of 119865
(7 77) for 120572 = 005 is 2147 (see any standard statistical book)
12 Applied Computational Intelligence and Soft Computing
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
8486889092949698
100
Reco
gniti
on ac
cura
cy (
)
(a)
Classifiers
ArabicBanglaDevanagari
TeluguRoman
888990919293949596979899
100
95
confi
denc
e sco
re (
)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(b)
Figure 4 Graph showing (a) recognition accuracies and (b) 95 confidence scores of the proposed handwritten digit recognition techniqueusing eight well-known classifiers on digits of five different scripts
05
10152025303540
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Mod
el b
uild
ing
time (
s)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(a)
0
1
2
3
4
5
6
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Reco
gniti
on ti
me (
s)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(b)
Figure 5 Graphical comparison of (a) MBTs and (b) RTs required by eight different classifiers on all the five databases for handwritten digitrecognition
which shows a significant difference between the standardand calculated values of 119865
119865 Thus both Friedman and Iman
et al statistics reject the null hypothesisAs the null hypothesis is rejected a post hoc test known
as the Nemenyi test [40] is carried out for pairwise com-parisons of the best and worst performing classifiers Theperformances of two classifiers are significantly different ifthe corresponding average ranks differ by at least the criticaldifference (CD) which is expressed as follows
CD = 119902120572
radic119896 (119896 + 1)
6119873 (39)
For the Nemenyi test the value of 119902005
for eight classifiersis 3031 (see Table 5(a) of [41]) So the CD is calculated as3031radic89612 that is 3031 using (39) Since the differencebetween mean ranks of the best and worst classifier is muchgreater than the CD (see Table 3) we can conclude that thereis a significant difference between the performing abilitiesof the classifiers For comparing all classifiers with a controlclassifier (say MLP) we have applied the Bonferroni-Dunntest [40] For this test CD is calculated using the same (39)But here the value of 119902
005for eight classifiers is 2690 (see
Table 5(b) of [41]) So the CD for the Bonferroni-Dunntest is calculated as 2690radic89612 that is 2690 As the
Applied Computational Intelligence and Soft Computing 13
49555245
3125 3545 36253205 3205
3031
Differences of mean ranksDifference of mean ranksCD
0
1
2
3
4
5
6M
ean
rank
s
|R3minusR1|
|R3minusR2|
|R3minusR4|
|R3minusR5|
|R3minusR6|
|R3minusR7|
|R3minusR8|
Pairwise comparison of classifiers for the Nemenyi test
(a)
49555245
3125 3545 36253205 3205
269
Differences of mean ranksDifference of mean ranksCD
|R3minusR1|
|R3minusR2|
|R3minusR4|
|R3minusR5|
|R3minusR6|
|R3minusR7|
|R3minusR8|0
1
2
3
4
5
6
Mea
n ra
nks
Comparison of classifiers with MLP for the Bonferroni-Dunn test
(b)
Figure 6 Graphical representation of comparison of multiple classifiers for (a) the Nemenyi test and (b) the Bonferroni-Dunn test
difference between the mean ranks of any classifier and MLPis always greater than CD (see Table 3) the chosen controlclassifier performs significantly better than other classifiersfor Indo-Arabic database A graphical representation of theabovementioned post hoc tests for comparison of eight dif-ferent classifiers on Dataset 1 is shown in Figure 6 Similarlyit can also be shown for Bangla Devanagari Roman andTelugu databases that the chosen classifier (MLP) performssignificantly better than the other seven classifiers
44 Comparison among Moment Based Features For thejustification of the feature set used in the present workthe diverse combinations of six different types of momentsnamely geometric moment (F1ndashF5) moment invariant(F6ndashF12) affine moment invariant (F13ndashF18) Legendremoment (F19ndashF28) Zernike moment (F29ndashF64) and com-plex moment (F65ndashF130) are compared by considering allthe possible combinations This is done for measuring thediscriminating strength of the individual moment featuresand their combinations based on their complementary infor-mation These can be listed as follows
(a) Geometric moment + moment invariant + affineMoment invariant (F1ndashF18)
(b) Legendre moment (F19ndashF28)(c) Geometric moment + moment invariant + affine
moment invariant + Legendre moment (F1ndashF28)(d) Zernike moment (F29ndashF64)(e) Geometric moment + moment invariant + affine
moment invariant + Legendre moment + Zernikemoment (F1ndashF64)
(f) Legendre moment + Zernike moment (F19ndashF64)(g) Complex moment (F65ndashF130)(h) Zernike moment + complex moment (F29ndashF130)
86
88
90
92
94
96
98
100
Feature set and all possible combinations
ArabicBanglaDevanagari
RomanTelugu
Reco
gniti
on ac
cura
cy (
)
F1
ndashF18
F1
ndashF64
F19
ndashF28
F1
ndashF28
F29
ndashF64
F19
ndashF64
F65
ndashF130
F29
ndashF130
F1
ndashF130
Figure 7Graphical comparison showing the recognition accuraciesof all the possible combinations of moment based features achievedby MLP classifier
(i) Geometric moment + moment invariant + affinemoment invariant + Legendre moment + Zernikemoment + complex moment (F1ndashF130)
The graphical comparison of the corresponding numeralrecognition accuracies achieved by MLP classifier over thesame test set is shown in Figure 7 It can be observed fromFigure 7 that the present combination of moment feature setoutperforms all the other possible combinations
45 Detail Evaluation of MLP Classifier In the present workdetailed error analysis with respect to different parametersnamely Kappa statistics mean absolute error (MAE) root
14 Applied Computational Intelligence and Soft Computing
Table 5 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Indo-Arabic numerals (here MAE means mean absolute error RMSE means root mean square error TPR means True Positiverate FPR means False Positive rate MCC means Matthews Correlation Coefficient and AUC means Area under ROC)
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09922 01758 0293
1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0990 0001 0990 0990 0990 0989 1000ldquo2rdquo 0960 0002 0980 0960 0970 0966 0990ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0004 0961 0990 0975 0973 0999ldquo7rdquo 0990 0000 1000 0990 0995 0994 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09922 01758 0293 0993 00007 09931 0993 0993 09922 09989
Table 6 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Bangla numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09944 00535 0115
1000 0001 0990 1000 0995 0994 1000ldquo1rdquo 1000 0002 0980 1000 0990 0989 1000ldquo2rdquo 1000 0001 0990 1000 0995 0994 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0990 0001 0990 0990 0990 0989 1000ldquo6rdquo 0970 0000 1000 0970 0985 0983 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09944 00535 0115 0995 00005 0995 0995 0995 09943 1000
Table 7 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Devanagari numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
0988 00342 00847
0969 0002 0984 0969 0977 0974 0999ldquo1rdquo 0969 0000 1000 0969 0984 0983 1000ldquo2rdquo 1000 0005 0956 1000 0977 0975 1000ldquo3rdquo 0985 0003 0970 0985 0977 0975 0999ldquo4rdquo 1000 0002 0985 1000 0992 0992 1000ldquo5rdquo 0985 0000 1000 0985 0992 0991 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 0985 0000 1000 0985 0992 0991 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 0988 00342 00857 09893 00012 09895 09893 09891 09881 09998
mean square error (RMSE) True Positive rate (TPR) FalsePositive rate (FPR) precision recall 119865-measure MatthewsCorrelationCoefficient (MCC) andArea under ROC (AUC)is computed Tables 5ndash9 provide the said statistical mea-surements for handwritten numeral recognition written inIndo-Arabic Bangla Devanagari Roman and Telugu scriptsrespectively
5 Conclusion
India is a multilingual and multiscript country comprisingof 12 different scripts But there are not much competentworks done towards handwritten numeral recognition ofIndic scripts The following issues are observed with hand-written digit recognition system (1) mostly they have worked
Applied Computational Intelligence and Soft Computing 15
Table 8 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Roman numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09975 016 02716
1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0995 0001 0995 0995 0995 0994 0999ldquo2rdquo 1000 0000 1000 1000 1000 1000 1000ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0995 0000 1000 0995 0997 0997 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 0994 0001 0989 0994 0991 0990 0999ldquo9rdquo 0994 0001 0994 0994 0994 0994 0999Mean 09975 016 02716 09978 00003 09978 09978 09977 09975 09997
Table 9 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Telugu numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09867 00051 00449
0980 0001 0990 0980 0985 0983 0988ldquo1rdquo 0970 0002 0980 0970 0975 0972 0991ldquo2rdquo 1000 0002 0980 1000 0990 0989 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 0999ldquo4rdquo 0990 0002 0980 0990 0985 0983 0998ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0003 0971 0990 0980 0978 0995ldquo7rdquo 0990 0002 0980 0990 0985 0983 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 0970 0000 1000 0970 0985 0983 0994Mean 09867 00051 00449 0988 00012 09881 0988 0988 09865 09965
on limited dataset (2) Training and testing times are notmentioned in most of the works (3) Most of the works havebeen done for Roman because of the availability of largerdataset like MNIST (4) Recognition systems for Indic scriptsare mainly focused on single script (5) Limitation to somefeature extraction methods also exist that is they are localto a particular scriptlanguage rather having a global scopeIn this work we have verified the effectiveness of a momentbased approach to handwritten digit recognition problemthat includes geometric moment moment invariant affinemoment invariant Legendre moment Zernike moment andcomplex moment The present scheme has been tested forfive different popular scripts namely Indo-Arabic BanglaDevanagari Roman and Telugu These methods have beenevaluated on the CMATER and MNIST databases usingmultiple classifiers FinallyMLP classifier is found to producethe highest recognition accuracies of 993 995 98929977 and 988 on Indo-Arabic Bangla DevanagariRoman and Telugu scripts respectively The results havedemonstrated that the application ofmoment based approachleads to a higher accuracy compared to its counterpartsAmong the most important ones an advantage of thisfeature extraction algorithm is that it is less computationally
expensive where the most of the published works need morecomputation time These features are also very simple toimplement compared to other methods It is obvious thatto improve the performance of proposed system furtherwe need to investigate more the sources of errors Potentialmoment features other than the presented ones may alsoexist
To further improve the performance possible futureworks are as follows (1) although the moment based featuresperform superbly on the whole complementary featureslike concavity analysis may help in discriminating confusingnumerals For example Indo-Arabic numerals ldquo2rdquo and ldquo3rdquo canbetter be separated by considering the original size beforenormalization (2) For classifier design it is better to selectmodel parameters (classifier structures) by cross validationrather than empirically as done in our experiments (3)Combining multiple classifiers can improve the recognitionaccuracy
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
16 Applied Computational Intelligence and Soft Computing
Acknowledgments
The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India
References
[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015
[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003
[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005
[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997
[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007
[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008
[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998
[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007
[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009
[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010
[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011
[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011
[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011
[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003
[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012
[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013
[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013
[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014
[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014
[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008
[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014
[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014
[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014
[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015
[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992
[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921
[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962
[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004
Applied Computational Intelligence and Soft Computing 17
[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005
[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996
[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990
[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984
[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985
[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions
in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995
[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001
[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001
[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996
[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992
[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014
[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006
Submit your manuscripts athttpwwwhindawicom
Computer Games Technology
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Distributed Sensor Networks
International Journal of
Advances in
FuzzySystems
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014
International Journal of
ReconfigurableComputing
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied Computational Intelligence and Soft Computing
thinspAdvancesthinspinthinsp
Artificial Intelligence
HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014
Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Electrical and Computer Engineering
Journal of
Journal of
Computer Networks and Communications
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation
httpwwwhindawicom Volume 2014
Advances in
Multimedia
International Journal of
Biomedical Imaging
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
ArtificialNeural Systems
Advances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
RoboticsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Computational Intelligence and Neuroscience
Industrial EngineeringJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Human-ComputerInteraction
Advances in
Computer EngineeringAdvances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
10 Applied Computational Intelligence and Soft Computing
(a) (b)
(c) (d)
(e)
Figure 3 Samples of digit images taken from CMATER and MNIST databases written in five different scripts (a) Indo-Arabic (b) Bangla(c) Devanagari (d) Roman and (e) Telugu
each script) have been used for the training purpose whereasthe remaining 4000 numerals (2000 taken from each script)have been used for the testing purpose The designed featureset has been individually applied to eight well-known classi-fiers namely Naıve Bayes Bayes Net MLP SVM RandomForest Bagging Multiclass Classifier and Logistic For thepresent work the following abovementioned classifiers withthe given parameters are designed
Naıve Bayes Naıve Bayes classifier for details refer to[35]
Bayes Net Estimator = SimpleEstimator-A 05 searchalgorithm = K2MLP Learning Rate = 03Momentum= 02 Numberof Epochs = 1000 minerror = 002SVM Support Vector Machine using radial basiskernel with (119901 = 1) for details refer to [36]Random Forest Ensemble classifier that consists ofmany decision trees and outputs the class that is themode of the classes output by individual trees fordetails refer to [37]
Applied Computational Intelligence and Soft Computing 11
Table 4 Recognition accuracies of eight classifiers and their corresponding ranks using 12 different datasets (ranks in the parentheses areused for performing the Friedman test)
DatasetsRecognition accuracy ()
ClassifiersNaıve Bayes Bayes Net MLP SVM Random Forest Bagging Multiclass Classifier Logistic
1 86 (8) 92 (7) 100 (1) 99 (25) 97 (6) 99 (25) 98 (45) 98 (45)
2 99 (3) 98 (55) 100 (1) 99 (3) 96 (75) 96 (75) 97 (55) 99 (3)
3 98 (5) 94 (7) 100 (1) 93 (8) 99 (25) 98 (5) 99 (25) 98 (5)
4 93 (8) 94 (7) 99 (15) 98 (35) 98 (35) 97 (5) 96 (6) 99 (15)
5 91 (8) 92 (7) 100 (15) 99 (35) 99 (35) 97 (6) 98 (5) 100 (15)
6 99 (25) 98 (45) 100 (1) 96 (7) 97 (6) 98 (45) 99 (25) 95 (8)
7 91 (8) 94 (7) 100 (1) 99 (3) 99 (3) 99 (3) 98 (55) 98 (55)
8 91 (8) 96 (7) 100 (1) 98 (45) 98 (45) 97 (6) 99 (25) 99 (25)
9 92 (8) 94 (7) 100 (15) 100 (15) 97 (6) 99 (4) 99 (4) 99 (4)
10 99 (25) 97 (65) 100 (1) 99 (25) 97 (65) 97 (65) 98 (4) 97 (65)
11 94 (8) 96 (7) 100 (1) 98 (5) 99 (3) 98 (5) 99 (3) 99 (3)
12 98 (4) 98 (4) 100 (1) 97 (7) 98 (4) 99 (2) 97 (7) 97 (7)
Mean rank R1 = 608 R2 = 637 R3 = 1125 R4 = 425 R5 = 467 R6 = 475 R7 = 433 R8 = 433
Bagging Bagging Classifier for detail refer to [38]Multiclass Classifier Method = ldquo1 against allrdquo ran-domWidthFactor = 20 seed = 1Logistic LogitBoost is used with simple regressionfunctions as base learner for details refer to [39]
The design parameters of classifiers are chosen as typicalvalues used in the literature or by experience The classifiersare not specifically tuned for the dataset at hand eventhough they may achieve a better performance with anotherparameter set since the goal is to design an automatedhandwritten digit recognition system based on the chosen setof classifiers
The digit recognition performances of the present tech-nique using each of these classifiers and their correspondingsuccess rates achieved at 95 confidence level are shownin Figures 4(a)-4(b) respectively It can be seen fromFigure 4 that the highest digit recognition accuracy hasbeen achieved by the MLP classifier which are found to be993 995 9892 9977 and 988 on Indo-ArabicBangla Devanagari Roman and Telugu scripts respectivelyThe performance analysis involves two parameters namelyModel Building Time (MBT) and Recognition Time (RT)MBT is based on the time required to train the system onthe given training samples whereas RT is based on the timerequired to recognize the given test samples The MBT andRT required for the abovementioned classifiers on all the fivedatabases are shown in Figures 5(a)-5(b)
43 Statistical Significance Tests The statistical significancetest is one of the essential ways for validating the performanceof themultiple classifiers usingmultiple datasets To do so wehave performed a safe and robust nonparametric Friedmantest [40] with the corresponding post hoc tests on Indo-Arabic script database For the present experimental setupthe number of datasets (119873) and the number of classifiers (119896)are set as 12 and 8 respectively These datasets are chosenrandomly from the test setTheperformances of the classifiers
on different datasets are shown in Table 4 On the basis ofthese performances the classifiers are then ranked for eachdataset separately the best performing algorithm gets therank 1 the second best gets rank 2 and so on (see Table 4)In case of ties average ranks are assigned to the classifiers tobreak the tie
Let 119903119894119895be the rank of the 119895th classifier on 119894th datasetThen
the mean of the ranks of the 119895th classifier over all the 119873datasets will be computed as follows
119877119895=1
119873
119873
sum119894=1
119903119894
119895 (36)
The null hypothesis states that all the classifiers are equivalentand so their ranks 119877
119895should be equal To justify it the
Friedman statistic [40] is computed as follows
1205942
119865=
12119873
119896 (119896 + 1)[
[
sum119895
1198772
119895minus119896 (119896 + 1)
2
4]
]
(37)
Under the current experimentation this statistic is dis-tributed according to 1205942
119865with 119896 minus 1 (=7) degrees of freedom
Using (37) the value of 1205942119865is calculated as 3046 From the
table of critical values (see any standard statistical book) thevalue of 1205942
119865with 7 degrees of freedom is 140671 for 120572 = 005
(where 120572 is known as level of significance) It can be seen thatthe computed 1205942
119865differs significantly from the standard 1205942
119865
So the null hypothesis is rejectedSingh et al [40] derived a better statistic using the
following formula
119865119865=
(119873 minus 1) 1205942
119865
119873(119896 minus 1) minus 1205942119865
(38)
119865119865is distributed according to the 119865-distribution with 119896 minus 1
(=7) and (119896minus1)(119873minus1) (=77) degrees of freedom Using (38)the value of 119865
119865is calculated as 80659 The critical value of 119865
(7 77) for 120572 = 005 is 2147 (see any standard statistical book)
12 Applied Computational Intelligence and Soft Computing
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
8486889092949698
100
Reco
gniti
on ac
cura
cy (
)
(a)
Classifiers
ArabicBanglaDevanagari
TeluguRoman
888990919293949596979899
100
95
confi
denc
e sco
re (
)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(b)
Figure 4 Graph showing (a) recognition accuracies and (b) 95 confidence scores of the proposed handwritten digit recognition techniqueusing eight well-known classifiers on digits of five different scripts
05
10152025303540
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Mod
el b
uild
ing
time (
s)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(a)
0
1
2
3
4
5
6
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Reco
gniti
on ti
me (
s)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(b)
Figure 5 Graphical comparison of (a) MBTs and (b) RTs required by eight different classifiers on all the five databases for handwritten digitrecognition
which shows a significant difference between the standardand calculated values of 119865
119865 Thus both Friedman and Iman
et al statistics reject the null hypothesisAs the null hypothesis is rejected a post hoc test known
as the Nemenyi test [40] is carried out for pairwise com-parisons of the best and worst performing classifiers Theperformances of two classifiers are significantly different ifthe corresponding average ranks differ by at least the criticaldifference (CD) which is expressed as follows
CD = 119902120572
radic119896 (119896 + 1)
6119873 (39)
For the Nemenyi test the value of 119902005
for eight classifiersis 3031 (see Table 5(a) of [41]) So the CD is calculated as3031radic89612 that is 3031 using (39) Since the differencebetween mean ranks of the best and worst classifier is muchgreater than the CD (see Table 3) we can conclude that thereis a significant difference between the performing abilitiesof the classifiers For comparing all classifiers with a controlclassifier (say MLP) we have applied the Bonferroni-Dunntest [40] For this test CD is calculated using the same (39)But here the value of 119902
005for eight classifiers is 2690 (see
Table 5(b) of [41]) So the CD for the Bonferroni-Dunntest is calculated as 2690radic89612 that is 2690 As the
Applied Computational Intelligence and Soft Computing 13
49555245
3125 3545 36253205 3205
3031
Differences of mean ranksDifference of mean ranksCD
0
1
2
3
4
5
6M
ean
rank
s
|R3minusR1|
|R3minusR2|
|R3minusR4|
|R3minusR5|
|R3minusR6|
|R3minusR7|
|R3minusR8|
Pairwise comparison of classifiers for the Nemenyi test
(a)
49555245
3125 3545 36253205 3205
269
Differences of mean ranksDifference of mean ranksCD
|R3minusR1|
|R3minusR2|
|R3minusR4|
|R3minusR5|
|R3minusR6|
|R3minusR7|
|R3minusR8|0
1
2
3
4
5
6
Mea
n ra
nks
Comparison of classifiers with MLP for the Bonferroni-Dunn test
(b)
Figure 6 Graphical representation of comparison of multiple classifiers for (a) the Nemenyi test and (b) the Bonferroni-Dunn test
difference between the mean ranks of any classifier and MLPis always greater than CD (see Table 3) the chosen controlclassifier performs significantly better than other classifiersfor Indo-Arabic database A graphical representation of theabovementioned post hoc tests for comparison of eight dif-ferent classifiers on Dataset 1 is shown in Figure 6 Similarlyit can also be shown for Bangla Devanagari Roman andTelugu databases that the chosen classifier (MLP) performssignificantly better than the other seven classifiers
44 Comparison among Moment Based Features For thejustification of the feature set used in the present workthe diverse combinations of six different types of momentsnamely geometric moment (F1ndashF5) moment invariant(F6ndashF12) affine moment invariant (F13ndashF18) Legendremoment (F19ndashF28) Zernike moment (F29ndashF64) and com-plex moment (F65ndashF130) are compared by considering allthe possible combinations This is done for measuring thediscriminating strength of the individual moment featuresand their combinations based on their complementary infor-mation These can be listed as follows
(a) Geometric moment + moment invariant + affineMoment invariant (F1ndashF18)
(b) Legendre moment (F19ndashF28)(c) Geometric moment + moment invariant + affine
moment invariant + Legendre moment (F1ndashF28)(d) Zernike moment (F29ndashF64)(e) Geometric moment + moment invariant + affine
moment invariant + Legendre moment + Zernikemoment (F1ndashF64)
(f) Legendre moment + Zernike moment (F19ndashF64)(g) Complex moment (F65ndashF130)(h) Zernike moment + complex moment (F29ndashF130)
86
88
90
92
94
96
98
100
Feature set and all possible combinations
ArabicBanglaDevanagari
RomanTelugu
Reco
gniti
on ac
cura
cy (
)
F1
ndashF18
F1
ndashF64
F19
ndashF28
F1
ndashF28
F29
ndashF64
F19
ndashF64
F65
ndashF130
F29
ndashF130
F1
ndashF130
Figure 7Graphical comparison showing the recognition accuraciesof all the possible combinations of moment based features achievedby MLP classifier
(i) Geometric moment + moment invariant + affinemoment invariant + Legendre moment + Zernikemoment + complex moment (F1ndashF130)
The graphical comparison of the corresponding numeralrecognition accuracies achieved by MLP classifier over thesame test set is shown in Figure 7 It can be observed fromFigure 7 that the present combination of moment feature setoutperforms all the other possible combinations
45 Detail Evaluation of MLP Classifier In the present workdetailed error analysis with respect to different parametersnamely Kappa statistics mean absolute error (MAE) root
14 Applied Computational Intelligence and Soft Computing
Table 5 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Indo-Arabic numerals (here MAE means mean absolute error RMSE means root mean square error TPR means True Positiverate FPR means False Positive rate MCC means Matthews Correlation Coefficient and AUC means Area under ROC)
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09922 01758 0293
1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0990 0001 0990 0990 0990 0989 1000ldquo2rdquo 0960 0002 0980 0960 0970 0966 0990ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0004 0961 0990 0975 0973 0999ldquo7rdquo 0990 0000 1000 0990 0995 0994 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09922 01758 0293 0993 00007 09931 0993 0993 09922 09989
Table 6 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Bangla numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09944 00535 0115
1000 0001 0990 1000 0995 0994 1000ldquo1rdquo 1000 0002 0980 1000 0990 0989 1000ldquo2rdquo 1000 0001 0990 1000 0995 0994 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0990 0001 0990 0990 0990 0989 1000ldquo6rdquo 0970 0000 1000 0970 0985 0983 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09944 00535 0115 0995 00005 0995 0995 0995 09943 1000
Table 7 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Devanagari numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
0988 00342 00847
0969 0002 0984 0969 0977 0974 0999ldquo1rdquo 0969 0000 1000 0969 0984 0983 1000ldquo2rdquo 1000 0005 0956 1000 0977 0975 1000ldquo3rdquo 0985 0003 0970 0985 0977 0975 0999ldquo4rdquo 1000 0002 0985 1000 0992 0992 1000ldquo5rdquo 0985 0000 1000 0985 0992 0991 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 0985 0000 1000 0985 0992 0991 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 0988 00342 00857 09893 00012 09895 09893 09891 09881 09998
mean square error (RMSE) True Positive rate (TPR) FalsePositive rate (FPR) precision recall 119865-measure MatthewsCorrelationCoefficient (MCC) andArea under ROC (AUC)is computed Tables 5ndash9 provide the said statistical mea-surements for handwritten numeral recognition written inIndo-Arabic Bangla Devanagari Roman and Telugu scriptsrespectively
5 Conclusion
India is a multilingual and multiscript country comprisingof 12 different scripts But there are not much competentworks done towards handwritten numeral recognition ofIndic scripts The following issues are observed with hand-written digit recognition system (1) mostly they have worked
Applied Computational Intelligence and Soft Computing 15
Table 8 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Roman numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09975 016 02716
1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0995 0001 0995 0995 0995 0994 0999ldquo2rdquo 1000 0000 1000 1000 1000 1000 1000ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0995 0000 1000 0995 0997 0997 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 0994 0001 0989 0994 0991 0990 0999ldquo9rdquo 0994 0001 0994 0994 0994 0994 0999Mean 09975 016 02716 09978 00003 09978 09978 09977 09975 09997
Table 9 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Telugu numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09867 00051 00449
0980 0001 0990 0980 0985 0983 0988ldquo1rdquo 0970 0002 0980 0970 0975 0972 0991ldquo2rdquo 1000 0002 0980 1000 0990 0989 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 0999ldquo4rdquo 0990 0002 0980 0990 0985 0983 0998ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0003 0971 0990 0980 0978 0995ldquo7rdquo 0990 0002 0980 0990 0985 0983 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 0970 0000 1000 0970 0985 0983 0994Mean 09867 00051 00449 0988 00012 09881 0988 0988 09865 09965
on limited dataset (2) Training and testing times are notmentioned in most of the works (3) Most of the works havebeen done for Roman because of the availability of largerdataset like MNIST (4) Recognition systems for Indic scriptsare mainly focused on single script (5) Limitation to somefeature extraction methods also exist that is they are localto a particular scriptlanguage rather having a global scopeIn this work we have verified the effectiveness of a momentbased approach to handwritten digit recognition problemthat includes geometric moment moment invariant affinemoment invariant Legendre moment Zernike moment andcomplex moment The present scheme has been tested forfive different popular scripts namely Indo-Arabic BanglaDevanagari Roman and Telugu These methods have beenevaluated on the CMATER and MNIST databases usingmultiple classifiers FinallyMLP classifier is found to producethe highest recognition accuracies of 993 995 98929977 and 988 on Indo-Arabic Bangla DevanagariRoman and Telugu scripts respectively The results havedemonstrated that the application ofmoment based approachleads to a higher accuracy compared to its counterpartsAmong the most important ones an advantage of thisfeature extraction algorithm is that it is less computationally
expensive where the most of the published works need morecomputation time These features are also very simple toimplement compared to other methods It is obvious thatto improve the performance of proposed system furtherwe need to investigate more the sources of errors Potentialmoment features other than the presented ones may alsoexist
To further improve the performance possible futureworks are as follows (1) although the moment based featuresperform superbly on the whole complementary featureslike concavity analysis may help in discriminating confusingnumerals For example Indo-Arabic numerals ldquo2rdquo and ldquo3rdquo canbetter be separated by considering the original size beforenormalization (2) For classifier design it is better to selectmodel parameters (classifier structures) by cross validationrather than empirically as done in our experiments (3)Combining multiple classifiers can improve the recognitionaccuracy
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
16 Applied Computational Intelligence and Soft Computing
Acknowledgments
The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India
References
[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015
[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003
[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005
[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997
[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007
[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008
[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998
[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007
[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009
[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010
[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011
[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011
[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011
[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003
[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012
[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013
[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013
[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014
[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014
[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008
[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014
[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014
[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014
[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015
[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992
[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921
[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962
[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004
Applied Computational Intelligence and Soft Computing 17
[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005
[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996
[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990
[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984
[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985
[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions
in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995
[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001
[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001
[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996
[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992
[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014
[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006
Submit your manuscripts athttpwwwhindawicom
Computer Games Technology
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Distributed Sensor Networks
International Journal of
Advances in
FuzzySystems
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014
International Journal of
ReconfigurableComputing
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied Computational Intelligence and Soft Computing
thinspAdvancesthinspinthinsp
Artificial Intelligence
HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014
Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Electrical and Computer Engineering
Journal of
Journal of
Computer Networks and Communications
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation
httpwwwhindawicom Volume 2014
Advances in
Multimedia
International Journal of
Biomedical Imaging
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
ArtificialNeural Systems
Advances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
RoboticsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Computational Intelligence and Neuroscience
Industrial EngineeringJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Human-ComputerInteraction
Advances in
Computer EngineeringAdvances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied Computational Intelligence and Soft Computing 11
Table 4 Recognition accuracies of eight classifiers and their corresponding ranks using 12 different datasets (ranks in the parentheses areused for performing the Friedman test)
DatasetsRecognition accuracy ()
ClassifiersNaıve Bayes Bayes Net MLP SVM Random Forest Bagging Multiclass Classifier Logistic
1 86 (8) 92 (7) 100 (1) 99 (25) 97 (6) 99 (25) 98 (45) 98 (45)
2 99 (3) 98 (55) 100 (1) 99 (3) 96 (75) 96 (75) 97 (55) 99 (3)
3 98 (5) 94 (7) 100 (1) 93 (8) 99 (25) 98 (5) 99 (25) 98 (5)
4 93 (8) 94 (7) 99 (15) 98 (35) 98 (35) 97 (5) 96 (6) 99 (15)
5 91 (8) 92 (7) 100 (15) 99 (35) 99 (35) 97 (6) 98 (5) 100 (15)
6 99 (25) 98 (45) 100 (1) 96 (7) 97 (6) 98 (45) 99 (25) 95 (8)
7 91 (8) 94 (7) 100 (1) 99 (3) 99 (3) 99 (3) 98 (55) 98 (55)
8 91 (8) 96 (7) 100 (1) 98 (45) 98 (45) 97 (6) 99 (25) 99 (25)
9 92 (8) 94 (7) 100 (15) 100 (15) 97 (6) 99 (4) 99 (4) 99 (4)
10 99 (25) 97 (65) 100 (1) 99 (25) 97 (65) 97 (65) 98 (4) 97 (65)
11 94 (8) 96 (7) 100 (1) 98 (5) 99 (3) 98 (5) 99 (3) 99 (3)
12 98 (4) 98 (4) 100 (1) 97 (7) 98 (4) 99 (2) 97 (7) 97 (7)
Mean rank R1 = 608 R2 = 637 R3 = 1125 R4 = 425 R5 = 467 R6 = 475 R7 = 433 R8 = 433
Bagging Bagging Classifier for detail refer to [38]Multiclass Classifier Method = ldquo1 against allrdquo ran-domWidthFactor = 20 seed = 1Logistic LogitBoost is used with simple regressionfunctions as base learner for details refer to [39]
The design parameters of classifiers are chosen as typicalvalues used in the literature or by experience The classifiersare not specifically tuned for the dataset at hand eventhough they may achieve a better performance with anotherparameter set since the goal is to design an automatedhandwritten digit recognition system based on the chosen setof classifiers
The digit recognition performances of the present tech-nique using each of these classifiers and their correspondingsuccess rates achieved at 95 confidence level are shownin Figures 4(a)-4(b) respectively It can be seen fromFigure 4 that the highest digit recognition accuracy hasbeen achieved by the MLP classifier which are found to be993 995 9892 9977 and 988 on Indo-ArabicBangla Devanagari Roman and Telugu scripts respectivelyThe performance analysis involves two parameters namelyModel Building Time (MBT) and Recognition Time (RT)MBT is based on the time required to train the system onthe given training samples whereas RT is based on the timerequired to recognize the given test samples The MBT andRT required for the abovementioned classifiers on all the fivedatabases are shown in Figures 5(a)-5(b)
43 Statistical Significance Tests The statistical significancetest is one of the essential ways for validating the performanceof themultiple classifiers usingmultiple datasets To do so wehave performed a safe and robust nonparametric Friedmantest [40] with the corresponding post hoc tests on Indo-Arabic script database For the present experimental setupthe number of datasets (119873) and the number of classifiers (119896)are set as 12 and 8 respectively These datasets are chosenrandomly from the test setTheperformances of the classifiers
on different datasets are shown in Table 4 On the basis ofthese performances the classifiers are then ranked for eachdataset separately the best performing algorithm gets therank 1 the second best gets rank 2 and so on (see Table 4)In case of ties average ranks are assigned to the classifiers tobreak the tie
Let 119903119894119895be the rank of the 119895th classifier on 119894th datasetThen
the mean of the ranks of the 119895th classifier over all the 119873datasets will be computed as follows
119877119895=1
119873
119873
sum119894=1
119903119894
119895 (36)
The null hypothesis states that all the classifiers are equivalentand so their ranks 119877
119895should be equal To justify it the
Friedman statistic [40] is computed as follows
1205942
119865=
12119873
119896 (119896 + 1)[
[
sum119895
1198772
119895minus119896 (119896 + 1)
2
4]
]
(37)
Under the current experimentation this statistic is dis-tributed according to 1205942
119865with 119896 minus 1 (=7) degrees of freedom
Using (37) the value of 1205942119865is calculated as 3046 From the
table of critical values (see any standard statistical book) thevalue of 1205942
119865with 7 degrees of freedom is 140671 for 120572 = 005
(where 120572 is known as level of significance) It can be seen thatthe computed 1205942
119865differs significantly from the standard 1205942
119865
So the null hypothesis is rejectedSingh et al [40] derived a better statistic using the
following formula
119865119865=
(119873 minus 1) 1205942
119865
119873(119896 minus 1) minus 1205942119865
(38)
119865119865is distributed according to the 119865-distribution with 119896 minus 1
(=7) and (119896minus1)(119873minus1) (=77) degrees of freedom Using (38)the value of 119865
119865is calculated as 80659 The critical value of 119865
(7 77) for 120572 = 005 is 2147 (see any standard statistical book)
12 Applied Computational Intelligence and Soft Computing
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
8486889092949698
100
Reco
gniti
on ac
cura
cy (
)
(a)
Classifiers
ArabicBanglaDevanagari
TeluguRoman
888990919293949596979899
100
95
confi
denc
e sco
re (
)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(b)
Figure 4 Graph showing (a) recognition accuracies and (b) 95 confidence scores of the proposed handwritten digit recognition techniqueusing eight well-known classifiers on digits of five different scripts
05
10152025303540
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Mod
el b
uild
ing
time (
s)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(a)
0
1
2
3
4
5
6
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Reco
gniti
on ti
me (
s)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(b)
Figure 5 Graphical comparison of (a) MBTs and (b) RTs required by eight different classifiers on all the five databases for handwritten digitrecognition
which shows a significant difference between the standardand calculated values of 119865
119865 Thus both Friedman and Iman
et al statistics reject the null hypothesisAs the null hypothesis is rejected a post hoc test known
as the Nemenyi test [40] is carried out for pairwise com-parisons of the best and worst performing classifiers Theperformances of two classifiers are significantly different ifthe corresponding average ranks differ by at least the criticaldifference (CD) which is expressed as follows
CD = 119902120572
radic119896 (119896 + 1)
6119873 (39)
For the Nemenyi test the value of 119902005
for eight classifiersis 3031 (see Table 5(a) of [41]) So the CD is calculated as3031radic89612 that is 3031 using (39) Since the differencebetween mean ranks of the best and worst classifier is muchgreater than the CD (see Table 3) we can conclude that thereis a significant difference between the performing abilitiesof the classifiers For comparing all classifiers with a controlclassifier (say MLP) we have applied the Bonferroni-Dunntest [40] For this test CD is calculated using the same (39)But here the value of 119902
005for eight classifiers is 2690 (see
Table 5(b) of [41]) So the CD for the Bonferroni-Dunntest is calculated as 2690radic89612 that is 2690 As the
Applied Computational Intelligence and Soft Computing 13
49555245
3125 3545 36253205 3205
3031
Differences of mean ranksDifference of mean ranksCD
0
1
2
3
4
5
6M
ean
rank
s
|R3minusR1|
|R3minusR2|
|R3minusR4|
|R3minusR5|
|R3minusR6|
|R3minusR7|
|R3minusR8|
Pairwise comparison of classifiers for the Nemenyi test
(a)
49555245
3125 3545 36253205 3205
269
Differences of mean ranksDifference of mean ranksCD
|R3minusR1|
|R3minusR2|
|R3minusR4|
|R3minusR5|
|R3minusR6|
|R3minusR7|
|R3minusR8|0
1
2
3
4
5
6
Mea
n ra
nks
Comparison of classifiers with MLP for the Bonferroni-Dunn test
(b)
Figure 6 Graphical representation of comparison of multiple classifiers for (a) the Nemenyi test and (b) the Bonferroni-Dunn test
difference between the mean ranks of any classifier and MLPis always greater than CD (see Table 3) the chosen controlclassifier performs significantly better than other classifiersfor Indo-Arabic database A graphical representation of theabovementioned post hoc tests for comparison of eight dif-ferent classifiers on Dataset 1 is shown in Figure 6 Similarlyit can also be shown for Bangla Devanagari Roman andTelugu databases that the chosen classifier (MLP) performssignificantly better than the other seven classifiers
44 Comparison among Moment Based Features For thejustification of the feature set used in the present workthe diverse combinations of six different types of momentsnamely geometric moment (F1ndashF5) moment invariant(F6ndashF12) affine moment invariant (F13ndashF18) Legendremoment (F19ndashF28) Zernike moment (F29ndashF64) and com-plex moment (F65ndashF130) are compared by considering allthe possible combinations This is done for measuring thediscriminating strength of the individual moment featuresand their combinations based on their complementary infor-mation These can be listed as follows
(a) Geometric moment + moment invariant + affineMoment invariant (F1ndashF18)
(b) Legendre moment (F19ndashF28)(c) Geometric moment + moment invariant + affine
moment invariant + Legendre moment (F1ndashF28)(d) Zernike moment (F29ndashF64)(e) Geometric moment + moment invariant + affine
moment invariant + Legendre moment + Zernikemoment (F1ndashF64)
(f) Legendre moment + Zernike moment (F19ndashF64)(g) Complex moment (F65ndashF130)(h) Zernike moment + complex moment (F29ndashF130)
86
88
90
92
94
96
98
100
Feature set and all possible combinations
ArabicBanglaDevanagari
RomanTelugu
Reco
gniti
on ac
cura
cy (
)
F1
ndashF18
F1
ndashF64
F19
ndashF28
F1
ndashF28
F29
ndashF64
F19
ndashF64
F65
ndashF130
F29
ndashF130
F1
ndashF130
Figure 7Graphical comparison showing the recognition accuraciesof all the possible combinations of moment based features achievedby MLP classifier
(i) Geometric moment + moment invariant + affinemoment invariant + Legendre moment + Zernikemoment + complex moment (F1ndashF130)
The graphical comparison of the corresponding numeralrecognition accuracies achieved by MLP classifier over thesame test set is shown in Figure 7 It can be observed fromFigure 7 that the present combination of moment feature setoutperforms all the other possible combinations
45 Detail Evaluation of MLP Classifier In the present workdetailed error analysis with respect to different parametersnamely Kappa statistics mean absolute error (MAE) root
14 Applied Computational Intelligence and Soft Computing
Table 5 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Indo-Arabic numerals (here MAE means mean absolute error RMSE means root mean square error TPR means True Positiverate FPR means False Positive rate MCC means Matthews Correlation Coefficient and AUC means Area under ROC)
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09922 01758 0293
1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0990 0001 0990 0990 0990 0989 1000ldquo2rdquo 0960 0002 0980 0960 0970 0966 0990ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0004 0961 0990 0975 0973 0999ldquo7rdquo 0990 0000 1000 0990 0995 0994 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09922 01758 0293 0993 00007 09931 0993 0993 09922 09989
Table 6 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Bangla numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09944 00535 0115
1000 0001 0990 1000 0995 0994 1000ldquo1rdquo 1000 0002 0980 1000 0990 0989 1000ldquo2rdquo 1000 0001 0990 1000 0995 0994 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0990 0001 0990 0990 0990 0989 1000ldquo6rdquo 0970 0000 1000 0970 0985 0983 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09944 00535 0115 0995 00005 0995 0995 0995 09943 1000
Table 7 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Devanagari numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
0988 00342 00847
0969 0002 0984 0969 0977 0974 0999ldquo1rdquo 0969 0000 1000 0969 0984 0983 1000ldquo2rdquo 1000 0005 0956 1000 0977 0975 1000ldquo3rdquo 0985 0003 0970 0985 0977 0975 0999ldquo4rdquo 1000 0002 0985 1000 0992 0992 1000ldquo5rdquo 0985 0000 1000 0985 0992 0991 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 0985 0000 1000 0985 0992 0991 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 0988 00342 00857 09893 00012 09895 09893 09891 09881 09998
mean square error (RMSE) True Positive rate (TPR) FalsePositive rate (FPR) precision recall 119865-measure MatthewsCorrelationCoefficient (MCC) andArea under ROC (AUC)is computed Tables 5ndash9 provide the said statistical mea-surements for handwritten numeral recognition written inIndo-Arabic Bangla Devanagari Roman and Telugu scriptsrespectively
5 Conclusion
India is a multilingual and multiscript country comprisingof 12 different scripts But there are not much competentworks done towards handwritten numeral recognition ofIndic scripts The following issues are observed with hand-written digit recognition system (1) mostly they have worked
Applied Computational Intelligence and Soft Computing 15
Table 8 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Roman numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09975 016 02716
1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0995 0001 0995 0995 0995 0994 0999ldquo2rdquo 1000 0000 1000 1000 1000 1000 1000ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0995 0000 1000 0995 0997 0997 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 0994 0001 0989 0994 0991 0990 0999ldquo9rdquo 0994 0001 0994 0994 0994 0994 0999Mean 09975 016 02716 09978 00003 09978 09978 09977 09975 09997
Table 9 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Telugu numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09867 00051 00449
0980 0001 0990 0980 0985 0983 0988ldquo1rdquo 0970 0002 0980 0970 0975 0972 0991ldquo2rdquo 1000 0002 0980 1000 0990 0989 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 0999ldquo4rdquo 0990 0002 0980 0990 0985 0983 0998ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0003 0971 0990 0980 0978 0995ldquo7rdquo 0990 0002 0980 0990 0985 0983 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 0970 0000 1000 0970 0985 0983 0994Mean 09867 00051 00449 0988 00012 09881 0988 0988 09865 09965
on limited dataset (2) Training and testing times are notmentioned in most of the works (3) Most of the works havebeen done for Roman because of the availability of largerdataset like MNIST (4) Recognition systems for Indic scriptsare mainly focused on single script (5) Limitation to somefeature extraction methods also exist that is they are localto a particular scriptlanguage rather having a global scopeIn this work we have verified the effectiveness of a momentbased approach to handwritten digit recognition problemthat includes geometric moment moment invariant affinemoment invariant Legendre moment Zernike moment andcomplex moment The present scheme has been tested forfive different popular scripts namely Indo-Arabic BanglaDevanagari Roman and Telugu These methods have beenevaluated on the CMATER and MNIST databases usingmultiple classifiers FinallyMLP classifier is found to producethe highest recognition accuracies of 993 995 98929977 and 988 on Indo-Arabic Bangla DevanagariRoman and Telugu scripts respectively The results havedemonstrated that the application ofmoment based approachleads to a higher accuracy compared to its counterpartsAmong the most important ones an advantage of thisfeature extraction algorithm is that it is less computationally
expensive where the most of the published works need morecomputation time These features are also very simple toimplement compared to other methods It is obvious thatto improve the performance of proposed system furtherwe need to investigate more the sources of errors Potentialmoment features other than the presented ones may alsoexist
To further improve the performance possible futureworks are as follows (1) although the moment based featuresperform superbly on the whole complementary featureslike concavity analysis may help in discriminating confusingnumerals For example Indo-Arabic numerals ldquo2rdquo and ldquo3rdquo canbetter be separated by considering the original size beforenormalization (2) For classifier design it is better to selectmodel parameters (classifier structures) by cross validationrather than empirically as done in our experiments (3)Combining multiple classifiers can improve the recognitionaccuracy
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
16 Applied Computational Intelligence and Soft Computing
Acknowledgments
The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India
References
[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015
[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003
[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005
[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997
[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007
[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008
[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998
[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007
[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009
[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010
[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011
[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011
[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011
[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003
[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012
[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013
[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013
[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014
[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014
[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008
[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014
[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014
[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014
[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015
[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992
[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921
[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962
[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004
Applied Computational Intelligence and Soft Computing 17
[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005
[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996
[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990
[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984
[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985
[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions
in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995
[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001
[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001
[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996
[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992
[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014
[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006
Submit your manuscripts athttpwwwhindawicom
Computer Games Technology
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Distributed Sensor Networks
International Journal of
Advances in
FuzzySystems
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014
International Journal of
ReconfigurableComputing
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied Computational Intelligence and Soft Computing
thinspAdvancesthinspinthinsp
Artificial Intelligence
HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014
Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Electrical and Computer Engineering
Journal of
Journal of
Computer Networks and Communications
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation
httpwwwhindawicom Volume 2014
Advances in
Multimedia
International Journal of
Biomedical Imaging
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
ArtificialNeural Systems
Advances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
RoboticsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Computational Intelligence and Neuroscience
Industrial EngineeringJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Human-ComputerInteraction
Advances in
Computer EngineeringAdvances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
12 Applied Computational Intelligence and Soft Computing
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
8486889092949698
100
Reco
gniti
on ac
cura
cy (
)
(a)
Classifiers
ArabicBanglaDevanagari
TeluguRoman
888990919293949596979899
100
95
confi
denc
e sco
re (
)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(b)
Figure 4 Graph showing (a) recognition accuracies and (b) 95 confidence scores of the proposed handwritten digit recognition techniqueusing eight well-known classifiers on digits of five different scripts
05
10152025303540
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Mod
el b
uild
ing
time (
s)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(a)
0
1
2
3
4
5
6
Classifiers
ArabicBanglaDevanagari
TeluguRoman
Reco
gniti
on ti
me (
s)
Naiuml
ve B
ayes
Baye
s Net
MLP
SVM
Rand
om F
ores
t
Bagg
ing
Mul
ticla
ss C
lass
ifier
Logi
stic
(b)
Figure 5 Graphical comparison of (a) MBTs and (b) RTs required by eight different classifiers on all the five databases for handwritten digitrecognition
which shows a significant difference between the standardand calculated values of 119865
119865 Thus both Friedman and Iman
et al statistics reject the null hypothesisAs the null hypothesis is rejected a post hoc test known
as the Nemenyi test [40] is carried out for pairwise com-parisons of the best and worst performing classifiers Theperformances of two classifiers are significantly different ifthe corresponding average ranks differ by at least the criticaldifference (CD) which is expressed as follows
CD = 119902120572
radic119896 (119896 + 1)
6119873 (39)
For the Nemenyi test the value of 119902005
for eight classifiersis 3031 (see Table 5(a) of [41]) So the CD is calculated as3031radic89612 that is 3031 using (39) Since the differencebetween mean ranks of the best and worst classifier is muchgreater than the CD (see Table 3) we can conclude that thereis a significant difference between the performing abilitiesof the classifiers For comparing all classifiers with a controlclassifier (say MLP) we have applied the Bonferroni-Dunntest [40] For this test CD is calculated using the same (39)But here the value of 119902
005for eight classifiers is 2690 (see
Table 5(b) of [41]) So the CD for the Bonferroni-Dunntest is calculated as 2690radic89612 that is 2690 As the
Applied Computational Intelligence and Soft Computing 13
49555245
3125 3545 36253205 3205
3031
Differences of mean ranksDifference of mean ranksCD
0
1
2
3
4
5
6M
ean
rank
s
|R3minusR1|
|R3minusR2|
|R3minusR4|
|R3minusR5|
|R3minusR6|
|R3minusR7|
|R3minusR8|
Pairwise comparison of classifiers for the Nemenyi test
(a)
49555245
3125 3545 36253205 3205
269
Differences of mean ranksDifference of mean ranksCD
|R3minusR1|
|R3minusR2|
|R3minusR4|
|R3minusR5|
|R3minusR6|
|R3minusR7|
|R3minusR8|0
1
2
3
4
5
6
Mea
n ra
nks
Comparison of classifiers with MLP for the Bonferroni-Dunn test
(b)
Figure 6 Graphical representation of comparison of multiple classifiers for (a) the Nemenyi test and (b) the Bonferroni-Dunn test
difference between the mean ranks of any classifier and MLPis always greater than CD (see Table 3) the chosen controlclassifier performs significantly better than other classifiersfor Indo-Arabic database A graphical representation of theabovementioned post hoc tests for comparison of eight dif-ferent classifiers on Dataset 1 is shown in Figure 6 Similarlyit can also be shown for Bangla Devanagari Roman andTelugu databases that the chosen classifier (MLP) performssignificantly better than the other seven classifiers
44 Comparison among Moment Based Features For thejustification of the feature set used in the present workthe diverse combinations of six different types of momentsnamely geometric moment (F1ndashF5) moment invariant(F6ndashF12) affine moment invariant (F13ndashF18) Legendremoment (F19ndashF28) Zernike moment (F29ndashF64) and com-plex moment (F65ndashF130) are compared by considering allthe possible combinations This is done for measuring thediscriminating strength of the individual moment featuresand their combinations based on their complementary infor-mation These can be listed as follows
(a) Geometric moment + moment invariant + affineMoment invariant (F1ndashF18)
(b) Legendre moment (F19ndashF28)(c) Geometric moment + moment invariant + affine
moment invariant + Legendre moment (F1ndashF28)(d) Zernike moment (F29ndashF64)(e) Geometric moment + moment invariant + affine
moment invariant + Legendre moment + Zernikemoment (F1ndashF64)
(f) Legendre moment + Zernike moment (F19ndashF64)(g) Complex moment (F65ndashF130)(h) Zernike moment + complex moment (F29ndashF130)
86
88
90
92
94
96
98
100
Feature set and all possible combinations
ArabicBanglaDevanagari
RomanTelugu
Reco
gniti
on ac
cura
cy (
)
F1
ndashF18
F1
ndashF64
F19
ndashF28
F1
ndashF28
F29
ndashF64
F19
ndashF64
F65
ndashF130
F29
ndashF130
F1
ndashF130
Figure 7Graphical comparison showing the recognition accuraciesof all the possible combinations of moment based features achievedby MLP classifier
(i) Geometric moment + moment invariant + affinemoment invariant + Legendre moment + Zernikemoment + complex moment (F1ndashF130)
The graphical comparison of the corresponding numeralrecognition accuracies achieved by MLP classifier over thesame test set is shown in Figure 7 It can be observed fromFigure 7 that the present combination of moment feature setoutperforms all the other possible combinations
45 Detail Evaluation of MLP Classifier In the present workdetailed error analysis with respect to different parametersnamely Kappa statistics mean absolute error (MAE) root
14 Applied Computational Intelligence and Soft Computing
Table 5 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Indo-Arabic numerals (here MAE means mean absolute error RMSE means root mean square error TPR means True Positiverate FPR means False Positive rate MCC means Matthews Correlation Coefficient and AUC means Area under ROC)
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09922 01758 0293
1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0990 0001 0990 0990 0990 0989 1000ldquo2rdquo 0960 0002 0980 0960 0970 0966 0990ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0004 0961 0990 0975 0973 0999ldquo7rdquo 0990 0000 1000 0990 0995 0994 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09922 01758 0293 0993 00007 09931 0993 0993 09922 09989
Table 6 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Bangla numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09944 00535 0115
1000 0001 0990 1000 0995 0994 1000ldquo1rdquo 1000 0002 0980 1000 0990 0989 1000ldquo2rdquo 1000 0001 0990 1000 0995 0994 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0990 0001 0990 0990 0990 0989 1000ldquo6rdquo 0970 0000 1000 0970 0985 0983 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09944 00535 0115 0995 00005 0995 0995 0995 09943 1000
Table 7 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Devanagari numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
0988 00342 00847
0969 0002 0984 0969 0977 0974 0999ldquo1rdquo 0969 0000 1000 0969 0984 0983 1000ldquo2rdquo 1000 0005 0956 1000 0977 0975 1000ldquo3rdquo 0985 0003 0970 0985 0977 0975 0999ldquo4rdquo 1000 0002 0985 1000 0992 0992 1000ldquo5rdquo 0985 0000 1000 0985 0992 0991 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 0985 0000 1000 0985 0992 0991 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 0988 00342 00857 09893 00012 09895 09893 09891 09881 09998
mean square error (RMSE) True Positive rate (TPR) FalsePositive rate (FPR) precision recall 119865-measure MatthewsCorrelationCoefficient (MCC) andArea under ROC (AUC)is computed Tables 5ndash9 provide the said statistical mea-surements for handwritten numeral recognition written inIndo-Arabic Bangla Devanagari Roman and Telugu scriptsrespectively
5 Conclusion
India is a multilingual and multiscript country comprisingof 12 different scripts But there are not much competentworks done towards handwritten numeral recognition ofIndic scripts The following issues are observed with hand-written digit recognition system (1) mostly they have worked
Applied Computational Intelligence and Soft Computing 15
Table 8 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Roman numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09975 016 02716
1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0995 0001 0995 0995 0995 0994 0999ldquo2rdquo 1000 0000 1000 1000 1000 1000 1000ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0995 0000 1000 0995 0997 0997 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 0994 0001 0989 0994 0991 0990 0999ldquo9rdquo 0994 0001 0994 0994 0994 0994 0999Mean 09975 016 02716 09978 00003 09978 09978 09977 09975 09997
Table 9 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Telugu numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09867 00051 00449
0980 0001 0990 0980 0985 0983 0988ldquo1rdquo 0970 0002 0980 0970 0975 0972 0991ldquo2rdquo 1000 0002 0980 1000 0990 0989 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 0999ldquo4rdquo 0990 0002 0980 0990 0985 0983 0998ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0003 0971 0990 0980 0978 0995ldquo7rdquo 0990 0002 0980 0990 0985 0983 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 0970 0000 1000 0970 0985 0983 0994Mean 09867 00051 00449 0988 00012 09881 0988 0988 09865 09965
on limited dataset (2) Training and testing times are notmentioned in most of the works (3) Most of the works havebeen done for Roman because of the availability of largerdataset like MNIST (4) Recognition systems for Indic scriptsare mainly focused on single script (5) Limitation to somefeature extraction methods also exist that is they are localto a particular scriptlanguage rather having a global scopeIn this work we have verified the effectiveness of a momentbased approach to handwritten digit recognition problemthat includes geometric moment moment invariant affinemoment invariant Legendre moment Zernike moment andcomplex moment The present scheme has been tested forfive different popular scripts namely Indo-Arabic BanglaDevanagari Roman and Telugu These methods have beenevaluated on the CMATER and MNIST databases usingmultiple classifiers FinallyMLP classifier is found to producethe highest recognition accuracies of 993 995 98929977 and 988 on Indo-Arabic Bangla DevanagariRoman and Telugu scripts respectively The results havedemonstrated that the application ofmoment based approachleads to a higher accuracy compared to its counterpartsAmong the most important ones an advantage of thisfeature extraction algorithm is that it is less computationally
expensive where the most of the published works need morecomputation time These features are also very simple toimplement compared to other methods It is obvious thatto improve the performance of proposed system furtherwe need to investigate more the sources of errors Potentialmoment features other than the presented ones may alsoexist
To further improve the performance possible futureworks are as follows (1) although the moment based featuresperform superbly on the whole complementary featureslike concavity analysis may help in discriminating confusingnumerals For example Indo-Arabic numerals ldquo2rdquo and ldquo3rdquo canbetter be separated by considering the original size beforenormalization (2) For classifier design it is better to selectmodel parameters (classifier structures) by cross validationrather than empirically as done in our experiments (3)Combining multiple classifiers can improve the recognitionaccuracy
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
16 Applied Computational Intelligence and Soft Computing
Acknowledgments
The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India
References
[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015
[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003
[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005
[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997
[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007
[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008
[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998
[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007
[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009
[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010
[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011
[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011
[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011
[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003
[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012
[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013
[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013
[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014
[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014
[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008
[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014
[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014
[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014
[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015
[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992
[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921
[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962
[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004
Applied Computational Intelligence and Soft Computing 17
[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005
[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996
[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990
[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984
[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985
[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions
in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995
[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001
[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001
[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996
[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992
[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014
[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006
Submit your manuscripts athttpwwwhindawicom
Computer Games Technology
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Distributed Sensor Networks
International Journal of
Advances in
FuzzySystems
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014
International Journal of
ReconfigurableComputing
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied Computational Intelligence and Soft Computing
thinspAdvancesthinspinthinsp
Artificial Intelligence
HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014
Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Electrical and Computer Engineering
Journal of
Journal of
Computer Networks and Communications
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation
httpwwwhindawicom Volume 2014
Advances in
Multimedia
International Journal of
Biomedical Imaging
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
ArtificialNeural Systems
Advances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
RoboticsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Computational Intelligence and Neuroscience
Industrial EngineeringJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Human-ComputerInteraction
Advances in
Computer EngineeringAdvances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied Computational Intelligence and Soft Computing 13
49555245
3125 3545 36253205 3205
3031
Differences of mean ranksDifference of mean ranksCD
0
1
2
3
4
5
6M
ean
rank
s
|R3minusR1|
|R3minusR2|
|R3minusR4|
|R3minusR5|
|R3minusR6|
|R3minusR7|
|R3minusR8|
Pairwise comparison of classifiers for the Nemenyi test
(a)
49555245
3125 3545 36253205 3205
269
Differences of mean ranksDifference of mean ranksCD
|R3minusR1|
|R3minusR2|
|R3minusR4|
|R3minusR5|
|R3minusR6|
|R3minusR7|
|R3minusR8|0
1
2
3
4
5
6
Mea
n ra
nks
Comparison of classifiers with MLP for the Bonferroni-Dunn test
(b)
Figure 6 Graphical representation of comparison of multiple classifiers for (a) the Nemenyi test and (b) the Bonferroni-Dunn test
difference between the mean ranks of any classifier and MLPis always greater than CD (see Table 3) the chosen controlclassifier performs significantly better than other classifiersfor Indo-Arabic database A graphical representation of theabovementioned post hoc tests for comparison of eight dif-ferent classifiers on Dataset 1 is shown in Figure 6 Similarlyit can also be shown for Bangla Devanagari Roman andTelugu databases that the chosen classifier (MLP) performssignificantly better than the other seven classifiers
44 Comparison among Moment Based Features For thejustification of the feature set used in the present workthe diverse combinations of six different types of momentsnamely geometric moment (F1ndashF5) moment invariant(F6ndashF12) affine moment invariant (F13ndashF18) Legendremoment (F19ndashF28) Zernike moment (F29ndashF64) and com-plex moment (F65ndashF130) are compared by considering allthe possible combinations This is done for measuring thediscriminating strength of the individual moment featuresand their combinations based on their complementary infor-mation These can be listed as follows
(a) Geometric moment + moment invariant + affineMoment invariant (F1ndashF18)
(b) Legendre moment (F19ndashF28)(c) Geometric moment + moment invariant + affine
moment invariant + Legendre moment (F1ndashF28)(d) Zernike moment (F29ndashF64)(e) Geometric moment + moment invariant + affine
moment invariant + Legendre moment + Zernikemoment (F1ndashF64)
(f) Legendre moment + Zernike moment (F19ndashF64)(g) Complex moment (F65ndashF130)(h) Zernike moment + complex moment (F29ndashF130)
86
88
90
92
94
96
98
100
Feature set and all possible combinations
ArabicBanglaDevanagari
RomanTelugu
Reco
gniti
on ac
cura
cy (
)
F1
ndashF18
F1
ndashF64
F19
ndashF28
F1
ndashF28
F29
ndashF64
F19
ndashF64
F65
ndashF130
F29
ndashF130
F1
ndashF130
Figure 7Graphical comparison showing the recognition accuraciesof all the possible combinations of moment based features achievedby MLP classifier
(i) Geometric moment + moment invariant + affinemoment invariant + Legendre moment + Zernikemoment + complex moment (F1ndashF130)
The graphical comparison of the corresponding numeralrecognition accuracies achieved by MLP classifier over thesame test set is shown in Figure 7 It can be observed fromFigure 7 that the present combination of moment feature setoutperforms all the other possible combinations
45 Detail Evaluation of MLP Classifier In the present workdetailed error analysis with respect to different parametersnamely Kappa statistics mean absolute error (MAE) root
14 Applied Computational Intelligence and Soft Computing
Table 5 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Indo-Arabic numerals (here MAE means mean absolute error RMSE means root mean square error TPR means True Positiverate FPR means False Positive rate MCC means Matthews Correlation Coefficient and AUC means Area under ROC)
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09922 01758 0293
1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0990 0001 0990 0990 0990 0989 1000ldquo2rdquo 0960 0002 0980 0960 0970 0966 0990ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0004 0961 0990 0975 0973 0999ldquo7rdquo 0990 0000 1000 0990 0995 0994 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09922 01758 0293 0993 00007 09931 0993 0993 09922 09989
Table 6 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Bangla numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09944 00535 0115
1000 0001 0990 1000 0995 0994 1000ldquo1rdquo 1000 0002 0980 1000 0990 0989 1000ldquo2rdquo 1000 0001 0990 1000 0995 0994 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0990 0001 0990 0990 0990 0989 1000ldquo6rdquo 0970 0000 1000 0970 0985 0983 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09944 00535 0115 0995 00005 0995 0995 0995 09943 1000
Table 7 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Devanagari numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
0988 00342 00847
0969 0002 0984 0969 0977 0974 0999ldquo1rdquo 0969 0000 1000 0969 0984 0983 1000ldquo2rdquo 1000 0005 0956 1000 0977 0975 1000ldquo3rdquo 0985 0003 0970 0985 0977 0975 0999ldquo4rdquo 1000 0002 0985 1000 0992 0992 1000ldquo5rdquo 0985 0000 1000 0985 0992 0991 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 0985 0000 1000 0985 0992 0991 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 0988 00342 00857 09893 00012 09895 09893 09891 09881 09998
mean square error (RMSE) True Positive rate (TPR) FalsePositive rate (FPR) precision recall 119865-measure MatthewsCorrelationCoefficient (MCC) andArea under ROC (AUC)is computed Tables 5ndash9 provide the said statistical mea-surements for handwritten numeral recognition written inIndo-Arabic Bangla Devanagari Roman and Telugu scriptsrespectively
5 Conclusion
India is a multilingual and multiscript country comprisingof 12 different scripts But there are not much competentworks done towards handwritten numeral recognition ofIndic scripts The following issues are observed with hand-written digit recognition system (1) mostly they have worked
Applied Computational Intelligence and Soft Computing 15
Table 8 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Roman numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09975 016 02716
1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0995 0001 0995 0995 0995 0994 0999ldquo2rdquo 1000 0000 1000 1000 1000 1000 1000ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0995 0000 1000 0995 0997 0997 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 0994 0001 0989 0994 0991 0990 0999ldquo9rdquo 0994 0001 0994 0994 0994 0994 0999Mean 09975 016 02716 09978 00003 09978 09978 09977 09975 09997
Table 9 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Telugu numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09867 00051 00449
0980 0001 0990 0980 0985 0983 0988ldquo1rdquo 0970 0002 0980 0970 0975 0972 0991ldquo2rdquo 1000 0002 0980 1000 0990 0989 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 0999ldquo4rdquo 0990 0002 0980 0990 0985 0983 0998ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0003 0971 0990 0980 0978 0995ldquo7rdquo 0990 0002 0980 0990 0985 0983 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 0970 0000 1000 0970 0985 0983 0994Mean 09867 00051 00449 0988 00012 09881 0988 0988 09865 09965
on limited dataset (2) Training and testing times are notmentioned in most of the works (3) Most of the works havebeen done for Roman because of the availability of largerdataset like MNIST (4) Recognition systems for Indic scriptsare mainly focused on single script (5) Limitation to somefeature extraction methods also exist that is they are localto a particular scriptlanguage rather having a global scopeIn this work we have verified the effectiveness of a momentbased approach to handwritten digit recognition problemthat includes geometric moment moment invariant affinemoment invariant Legendre moment Zernike moment andcomplex moment The present scheme has been tested forfive different popular scripts namely Indo-Arabic BanglaDevanagari Roman and Telugu These methods have beenevaluated on the CMATER and MNIST databases usingmultiple classifiers FinallyMLP classifier is found to producethe highest recognition accuracies of 993 995 98929977 and 988 on Indo-Arabic Bangla DevanagariRoman and Telugu scripts respectively The results havedemonstrated that the application ofmoment based approachleads to a higher accuracy compared to its counterpartsAmong the most important ones an advantage of thisfeature extraction algorithm is that it is less computationally
expensive where the most of the published works need morecomputation time These features are also very simple toimplement compared to other methods It is obvious thatto improve the performance of proposed system furtherwe need to investigate more the sources of errors Potentialmoment features other than the presented ones may alsoexist
To further improve the performance possible futureworks are as follows (1) although the moment based featuresperform superbly on the whole complementary featureslike concavity analysis may help in discriminating confusingnumerals For example Indo-Arabic numerals ldquo2rdquo and ldquo3rdquo canbetter be separated by considering the original size beforenormalization (2) For classifier design it is better to selectmodel parameters (classifier structures) by cross validationrather than empirically as done in our experiments (3)Combining multiple classifiers can improve the recognitionaccuracy
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
16 Applied Computational Intelligence and Soft Computing
Acknowledgments
The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India
References
[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015
[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003
[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005
[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997
[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007
[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008
[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998
[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007
[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009
[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010
[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011
[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011
[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011
[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003
[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012
[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013
[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013
[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014
[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014
[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008
[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014
[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014
[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014
[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015
[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992
[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921
[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962
[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004
Applied Computational Intelligence and Soft Computing 17
[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005
[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996
[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990
[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984
[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985
[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions
in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995
[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001
[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001
[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996
[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992
[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014
[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006
Submit your manuscripts athttpwwwhindawicom
Computer Games Technology
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Distributed Sensor Networks
International Journal of
Advances in
FuzzySystems
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014
International Journal of
ReconfigurableComputing
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied Computational Intelligence and Soft Computing
thinspAdvancesthinspinthinsp
Artificial Intelligence
HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014
Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Electrical and Computer Engineering
Journal of
Journal of
Computer Networks and Communications
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation
httpwwwhindawicom Volume 2014
Advances in
Multimedia
International Journal of
Biomedical Imaging
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
ArtificialNeural Systems
Advances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
RoboticsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Computational Intelligence and Neuroscience
Industrial EngineeringJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Human-ComputerInteraction
Advances in
Computer EngineeringAdvances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
14 Applied Computational Intelligence and Soft Computing
Table 5 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Indo-Arabic numerals (here MAE means mean absolute error RMSE means root mean square error TPR means True Positiverate FPR means False Positive rate MCC means Matthews Correlation Coefficient and AUC means Area under ROC)
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09922 01758 0293
1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0990 0001 0990 0990 0990 0989 1000ldquo2rdquo 0960 0002 0980 0960 0970 0966 0990ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0004 0961 0990 0975 0973 0999ldquo7rdquo 0990 0000 1000 0990 0995 0994 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09922 01758 0293 0993 00007 09931 0993 0993 09922 09989
Table 6 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Bangla numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09944 00535 0115
1000 0001 0990 1000 0995 0994 1000ldquo1rdquo 1000 0002 0980 1000 0990 0989 1000ldquo2rdquo 1000 0001 0990 1000 0995 0994 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0990 0001 0990 0990 0990 0989 1000ldquo6rdquo 0970 0000 1000 0970 0985 0983 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 09944 00535 0115 0995 00005 0995 0995 0995 09943 1000
Table 7 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Devanagari numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
0988 00342 00847
0969 0002 0984 0969 0977 0974 0999ldquo1rdquo 0969 0000 1000 0969 0984 0983 1000ldquo2rdquo 1000 0005 0956 1000 0977 0975 1000ldquo3rdquo 0985 0003 0970 0985 0977 0975 0999ldquo4rdquo 1000 0002 0985 1000 0992 0992 1000ldquo5rdquo 0985 0000 1000 0985 0992 0991 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 0985 0000 1000 0985 0992 0991 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 1000 0000 1000 1000 1000 1000 1000Mean 0988 00342 00857 09893 00012 09895 09893 09891 09881 09998
mean square error (RMSE) True Positive rate (TPR) FalsePositive rate (FPR) precision recall 119865-measure MatthewsCorrelationCoefficient (MCC) andArea under ROC (AUC)is computed Tables 5ndash9 provide the said statistical mea-surements for handwritten numeral recognition written inIndo-Arabic Bangla Devanagari Roman and Telugu scriptsrespectively
5 Conclusion
India is a multilingual and multiscript country comprisingof 12 different scripts But there are not much competentworks done towards handwritten numeral recognition ofIndic scripts The following issues are observed with hand-written digit recognition system (1) mostly they have worked
Applied Computational Intelligence and Soft Computing 15
Table 8 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Roman numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09975 016 02716
1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0995 0001 0995 0995 0995 0994 0999ldquo2rdquo 1000 0000 1000 1000 1000 1000 1000ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0995 0000 1000 0995 0997 0997 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 0994 0001 0989 0994 0991 0990 0999ldquo9rdquo 0994 0001 0994 0994 0994 0994 0999Mean 09975 016 02716 09978 00003 09978 09978 09977 09975 09997
Table 9 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Telugu numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09867 00051 00449
0980 0001 0990 0980 0985 0983 0988ldquo1rdquo 0970 0002 0980 0970 0975 0972 0991ldquo2rdquo 1000 0002 0980 1000 0990 0989 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 0999ldquo4rdquo 0990 0002 0980 0990 0985 0983 0998ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0003 0971 0990 0980 0978 0995ldquo7rdquo 0990 0002 0980 0990 0985 0983 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 0970 0000 1000 0970 0985 0983 0994Mean 09867 00051 00449 0988 00012 09881 0988 0988 09865 09965
on limited dataset (2) Training and testing times are notmentioned in most of the works (3) Most of the works havebeen done for Roman because of the availability of largerdataset like MNIST (4) Recognition systems for Indic scriptsare mainly focused on single script (5) Limitation to somefeature extraction methods also exist that is they are localto a particular scriptlanguage rather having a global scopeIn this work we have verified the effectiveness of a momentbased approach to handwritten digit recognition problemthat includes geometric moment moment invariant affinemoment invariant Legendre moment Zernike moment andcomplex moment The present scheme has been tested forfive different popular scripts namely Indo-Arabic BanglaDevanagari Roman and Telugu These methods have beenevaluated on the CMATER and MNIST databases usingmultiple classifiers FinallyMLP classifier is found to producethe highest recognition accuracies of 993 995 98929977 and 988 on Indo-Arabic Bangla DevanagariRoman and Telugu scripts respectively The results havedemonstrated that the application ofmoment based approachleads to a higher accuracy compared to its counterpartsAmong the most important ones an advantage of thisfeature extraction algorithm is that it is less computationally
expensive where the most of the published works need morecomputation time These features are also very simple toimplement compared to other methods It is obvious thatto improve the performance of proposed system furtherwe need to investigate more the sources of errors Potentialmoment features other than the presented ones may alsoexist
To further improve the performance possible futureworks are as follows (1) although the moment based featuresperform superbly on the whole complementary featureslike concavity analysis may help in discriminating confusingnumerals For example Indo-Arabic numerals ldquo2rdquo and ldquo3rdquo canbetter be separated by considering the original size beforenormalization (2) For classifier design it is better to selectmodel parameters (classifier structures) by cross validationrather than empirically as done in our experiments (3)Combining multiple classifiers can improve the recognitionaccuracy
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
16 Applied Computational Intelligence and Soft Computing
Acknowledgments
The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India
References
[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015
[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003
[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005
[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997
[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007
[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008
[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998
[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007
[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009
[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010
[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011
[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011
[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011
[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003
[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012
[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013
[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013
[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014
[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014
[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008
[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014
[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014
[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014
[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015
[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992
[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921
[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962
[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004
Applied Computational Intelligence and Soft Computing 17
[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005
[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996
[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990
[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984
[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985
[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions
in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995
[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001
[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001
[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996
[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992
[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014
[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006
Submit your manuscripts athttpwwwhindawicom
Computer Games Technology
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Distributed Sensor Networks
International Journal of
Advances in
FuzzySystems
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014
International Journal of
ReconfigurableComputing
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied Computational Intelligence and Soft Computing
thinspAdvancesthinspinthinsp
Artificial Intelligence
HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014
Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Electrical and Computer Engineering
Journal of
Journal of
Computer Networks and Communications
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation
httpwwwhindawicom Volume 2014
Advances in
Multimedia
International Journal of
Biomedical Imaging
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
ArtificialNeural Systems
Advances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
RoboticsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Computational Intelligence and Neuroscience
Industrial EngineeringJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Human-ComputerInteraction
Advances in
Computer EngineeringAdvances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied Computational Intelligence and Soft Computing 15
Table 8 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Roman numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09975 016 02716
1000 0000 1000 1000 1000 1000 1000ldquo1rdquo 0995 0001 0995 0995 0995 0994 0999ldquo2rdquo 1000 0000 1000 1000 1000 1000 1000ldquo3rdquo 1000 0000 1000 1000 1000 1000 1000ldquo4rdquo 1000 0000 1000 1000 1000 1000 1000ldquo5rdquo 0995 0000 1000 0995 0997 0997 1000ldquo6rdquo 1000 0000 1000 1000 1000 1000 1000ldquo7rdquo 1000 0000 1000 1000 1000 1000 1000ldquo8rdquo 0994 0001 0989 0994 0991 0990 0999ldquo9rdquo 0994 0001 0994 0994 0994 0994 0999Mean 09975 016 02716 09978 00003 09978 09978 09977 09975 09997
Table 9 Statistical performance measures along with their respective means (styled in bold) achieved by the proposed technique onhandwritten Telugu numerals
Class Statistical performance measuresKappa statistics MAE RMSE TPR FPR Precision Recall 119865-measure MCC AUC
ldquo0rdquo
09867 00051 00449
0980 0001 0990 0980 0985 0983 0988ldquo1rdquo 0970 0002 0980 0970 0975 0972 0991ldquo2rdquo 1000 0002 0980 1000 0990 0989 1000ldquo3rdquo 0990 0000 1000 0990 0995 0994 0999ldquo4rdquo 0990 0002 0980 0990 0985 0983 0998ldquo5rdquo 1000 0000 1000 1000 1000 1000 1000ldquo6rdquo 0990 0003 0971 0990 0980 0978 0995ldquo7rdquo 0990 0002 0980 0990 0985 0983 1000ldquo8rdquo 1000 0000 1000 1000 1000 1000 1000ldquo9rdquo 0970 0000 1000 0970 0985 0983 0994Mean 09867 00051 00449 0988 00012 09881 0988 0988 09865 09965
on limited dataset (2) Training and testing times are notmentioned in most of the works (3) Most of the works havebeen done for Roman because of the availability of largerdataset like MNIST (4) Recognition systems for Indic scriptsare mainly focused on single script (5) Limitation to somefeature extraction methods also exist that is they are localto a particular scriptlanguage rather having a global scopeIn this work we have verified the effectiveness of a momentbased approach to handwritten digit recognition problemthat includes geometric moment moment invariant affinemoment invariant Legendre moment Zernike moment andcomplex moment The present scheme has been tested forfive different popular scripts namely Indo-Arabic BanglaDevanagari Roman and Telugu These methods have beenevaluated on the CMATER and MNIST databases usingmultiple classifiers FinallyMLP classifier is found to producethe highest recognition accuracies of 993 995 98929977 and 988 on Indo-Arabic Bangla DevanagariRoman and Telugu scripts respectively The results havedemonstrated that the application ofmoment based approachleads to a higher accuracy compared to its counterpartsAmong the most important ones an advantage of thisfeature extraction algorithm is that it is less computationally
expensive where the most of the published works need morecomputation time These features are also very simple toimplement compared to other methods It is obvious thatto improve the performance of proposed system furtherwe need to investigate more the sources of errors Potentialmoment features other than the presented ones may alsoexist
To further improve the performance possible futureworks are as follows (1) although the moment based featuresperform superbly on the whole complementary featureslike concavity analysis may help in discriminating confusingnumerals For example Indo-Arabic numerals ldquo2rdquo and ldquo3rdquo canbetter be separated by considering the original size beforenormalization (2) For classifier design it is better to selectmodel parameters (classifier structures) by cross validationrather than empirically as done in our experiments (3)Combining multiple classifiers can improve the recognitionaccuracy
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
16 Applied Computational Intelligence and Soft Computing
Acknowledgments
The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India
References
[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015
[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003
[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005
[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997
[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007
[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008
[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998
[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007
[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009
[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010
[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011
[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011
[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011
[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003
[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012
[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013
[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013
[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014
[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014
[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008
[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014
[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014
[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014
[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015
[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992
[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921
[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962
[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004
Applied Computational Intelligence and Soft Computing 17
[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005
[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996
[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990
[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984
[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985
[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions
in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995
[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001
[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001
[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996
[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992
[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014
[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006
Submit your manuscripts athttpwwwhindawicom
Computer Games Technology
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Distributed Sensor Networks
International Journal of
Advances in
FuzzySystems
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014
International Journal of
ReconfigurableComputing
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied Computational Intelligence and Soft Computing
thinspAdvancesthinspinthinsp
Artificial Intelligence
HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014
Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Electrical and Computer Engineering
Journal of
Journal of
Computer Networks and Communications
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation
httpwwwhindawicom Volume 2014
Advances in
Multimedia
International Journal of
Biomedical Imaging
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
ArtificialNeural Systems
Advances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
RoboticsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Computational Intelligence and Neuroscience
Industrial EngineeringJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Human-ComputerInteraction
Advances in
Computer EngineeringAdvances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
16 Applied Computational Intelligence and Soft Computing
Acknowledgments
The authors are thankful to the Center for MicroprocessorApplication for Training Education and Research (CMATER)and Project on Storage Retrieval and Understanding ofVideo for Multimedia (SRUVM) of Computer Science andEngineering Department Jadavpur University for providinginfrastructure facilities during progress of the work Thecurrent work reported here has been partially funded byUniversity with Potential for Excellence (UPE) Phase-IIUGC Government of India
References
[1] P K Singh R Sarkar and M Nasipuri ldquoOffline Script Identi-fication from multilingual Indic-script documents a state-of-the-artrdquo Computer Science Review vol 15-16 pp 1ndash28 2015
[2] C-L Liu K Nakashima H Sako and H Fujisawa ldquoHand-written digit recognition benchmarking of state-of-the-arttechniquesrdquo Pattern Recognition vol 36 no 10 pp 2271ndash22852003
[3] D Gorgevik and D Cakmakov ldquoHandwritten digit recognitionby combining SVM classifiersrdquo in Proceedings of the Interna-tional Conference on Computer as a Tool (EUROCON rsquo05) vol2 pp 1393ndash1396 Belgrade Serbia November 2005
[4] M D Garris J L Blue and G T Candela NIST Form-BasedHandprint Recognition System NIST 1997
[5] X Chen X Liu and Y Jia ldquoLearning handwritten digit recog-nition by the max-min posterior pseudo-probabilities methodrdquoin Proceedings of the 9th International Conference on DocumentAnalysis and Recognition (ICDAR rsquo07) pp 342ndash346 ParanaBrazil September 2007
[6] K Labusch E Barth and T Martinetz ldquoSimple method forhigh-performance digit recognition based on sparse codingrdquoIEEE Transactions on Neural Networks vol 19 no 11 pp 1985ndash1989 2008
[7] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-basedlearning applied to document recognitionrdquo Proceedings of theIEEE vol 86 no 11 pp 2278ndash2324 1998
[8] Y Wen Y Lu and P Shi ldquoHandwritten Bangla numeralrecognition system and its application to postal automationrdquoPattern Recognition vol 40 no 1 pp 99ndash107 2007
[9] V Mane and L Ragha ldquoHandwritten character recognitionusing elastic matching and PCArdquo in Proceedings of the Interna-tional Conference on Advances in Computing Communicationand Control pp 410ndash415 ACM Mumbai India January 2009
[10] R M O Cruz G D C Cavalcanti and T I Ren ldquoHandwrittendigit recognition using multiple feature extraction techniquesand classifier ensemblerdquo in Proceedings of the 17th InternationalConference on Systems Signals and Image Processing pp 215ndash218 Rio de Janeiro Brazil June 2010
[11] B V Dhandra R G Benne and M Hangarge ldquoKannadatelugu and devanagari handwritten numeral recognition withprobabilistic neural network a script independent approachrdquoInternational Journal of Computer Applications vol 26 no 9pp 11ndash16 2011
[12] J Yang J Wang and T Huang ldquoLearning the sparse repre-sentation for classificationrdquo in Proceedings of the 12th IEEEInternational Conference on Multimedia and Expo (ICME rsquo11)pp 1ndash6 IEEE Barcelona Spain July 2011
[13] A Gimenez J Andres-Ferrer A Juan and N Serrano ldquoDis-criminative bernoulli mixture models for handwritten digitrecognitionrdquo in Proceedings of the 11th International Conferenceon Document Analysis and Recognition (ICDAR rsquo11) pp 558ndash562 IEEE Beijing China September 2011
[14] YAl-OhaliMCheriet andC Suen ldquoDatabases for recognitionof handwrittenArabic chequesrdquo Pattern Recognition vol 36 no1 pp 111ndash121 2003
[15] N Das R Sarkar S Basu M Kundu M Nasipuri and D KBasu ldquoA genetic algorithm based region sampling for selectionof local features in handwritten digit recognition applicationrdquoApplied Soft Computing vol 12 no 5 pp 1592ndash1606 2012
[16] M S Akhtar andH AQureshi ldquoHandwritten digit recognitionthrough wavelet decomposition and wavelet packet decom-positionrdquo in Proceedings of the 8th International Conferenceon Digital Information Management (ICDIM rsquo13) pp 143ndash148IEEE Islamabad Pakistan September 2013
[17] Z Dan and C Xu ldquoThe recognition of handwritten digits basedon BP neural network and the implementation on Androidrdquoin Proceedings of the 3rd International Conference on IntelligentSystem Design and Engineering Applications (ISDEA rsquo13) pp1498ndash1501 IEEE Hong Kong January 2013
[18] U R Babu Y Venkateswarlu and A K Chintha ldquoHandwrittendigit recognition using K-nearest neighbour classifierrdquo in Pro-ceedings of the World Congress on Computing and Communica-tion Technologies (WCCCT rsquo14) pp 60ndash65 Trichirappalli IndiaMarch 2014
[19] J H AlKhateeb and M Alseid ldquoDBNmdashbased learning forArabic handwritten digit recognition using DCT featuresrdquo inProceedings of the 6th International Conference on ComputerScience and Information Technology (CSIT rsquo14) pp 222ndash226Amman Jordan March 2014
[20] S Abdleazeem and E El-Sherif ldquoArabic handwritten digitrecognitionrdquo International Journal on Document Analysis andRecognition vol 11 no 3 pp 127ndash141 2008
[21] R Ebrahimzadeh and M Jampour ldquoEfficient handwritten digitrecognition based on Histogram of oriented gradients andSVMrdquo International Journal of Computer Applications vol 104no 9 pp 10ndash13 2014
[22] A M Gil C F F C Filho and M G F Costa ldquoHandwrittendigit recognition using SVM binary classifiers and unbalanceddecision treesrdquo in Image Analysis and Recognition vol 8814 ofLecture Notes in Computer Science pp 246ndash255 Springer BaselSwitzerland 2014
[23] B El Qacimy M A Kerroum and A Hammouch ldquoFeatureextraction based on DCT for handwritten digit recognitionrdquoInternational Journal of Computer Science Issues vol 11 no 6(2)pp 27ndash33 2014
[24] S AL-Mansoori ldquoIntelligent handwritten digit recognitionusing artificial neural networkrdquo International Journal of Engi-neering Research and Applications vol 5 no 5 pp 46ndash51 2015
[25] R C Gonzalez and R E Woods Digital Image Processing volI Prentice-Hall New Delhi India 1992
[26] F Hausdorff ldquoSummationsmethoden und Momentfolgen IrdquoMathematische Zeitschrift vol 9 no 1-2 pp 74ndash109 1921
[27] M-K Hu ldquoVisual pattern recognition by moment invariantsrdquoIRE Transactions on Information Theory vol 8 no 2 pp 179ndash187 1962
[28] M Petrou and A Kadyrov ldquoAffine invariant features from thetrace transformrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 26 no 1 pp 30ndash44 2004
Applied Computational Intelligence and Soft Computing 17
[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005
[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996
[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990
[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984
[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985
[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions
in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995
[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001
[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001
[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996
[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992
[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014
[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006
Submit your manuscripts athttpwwwhindawicom
Computer Games Technology
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Distributed Sensor Networks
International Journal of
Advances in
FuzzySystems
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014
International Journal of
ReconfigurableComputing
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied Computational Intelligence and Soft Computing
thinspAdvancesthinspinthinsp
Artificial Intelligence
HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014
Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Electrical and Computer Engineering
Journal of
Journal of
Computer Networks and Communications
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation
httpwwwhindawicom Volume 2014
Advances in
Multimedia
International Journal of
Biomedical Imaging
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
ArtificialNeural Systems
Advances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
RoboticsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Computational Intelligence and Neuroscience
Industrial EngineeringJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Human-ComputerInteraction
Advances in
Computer EngineeringAdvances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied Computational Intelligence and Soft Computing 17
[29] P-T Yap and R Paramesran ldquoAn efficient method for thecomputation of Legendre momentsrdquo IEEE Transactions onPattern Analysis and Machine Intelligence vol 27 no 12 pp1996ndash2002 2005
[30] S X Liao and M Pawlak ldquoOn image analysis by momentsrdquoIEEE Transactions on Pattern Analysis andMachine Intelligencevol 18 no 3 pp 254ndash266 1996
[31] A Khotanzad and Y H Hong ldquoInvariant image recognition byZernike momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 12 no 5 pp 489ndash497 1990
[32] Y S Abu-Mostafa and D Psaltis ldquoRecognitive aspects ofmoment invariantsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol PAMI-6 no 6 pp 698ndash706 1984
[33] Y S Abu-Mostafa and D Psaltis ldquoImage normalization bycomplex momentsrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 7 no 1 pp 46ndash55 1985
[34] August 2015 httpenwikipediaorgwikiLanguages of India[35] G H John and P Langley ldquoEstimating continuous distributions
in Bayesian classifiersrdquo in Proceedings of the 11th Conference onUncertainty in Artificial Intelligence (UAI rsquo95) pp 338ndash345 SanMateo Calif USA August 1995
[36] S S Keerthi S K Shevade C Bhattacharyya and K R KMurthy ldquoImprovements to plattrsquos SMO algorithm for SVMclassifier designrdquo Neural Computation vol 13 no 3 pp 637ndash649 2001
[37] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001
[38] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996
[39] S le Cessie and J C van Houwelingen ldquoRidge estimators inlogistic regressionrdquo Applied Statistics vol 41 no 1 pp 191ndash2011992
[40] P K Singh R Sarkar N Das S Basu and M Nasipuri ldquoSta-tistical comparison of classifiers for script identification frommulti-script handwritten documentsrdquo International Journal ofApplied Pattern Recognition vol 1 no 2 pp 152ndash172 2014
[41] J Demsar ldquoStatistical comparisons of classifiers over multipledata setsrdquo Journal of Machine Learning Research vol 7 pp 1ndash302006
Submit your manuscripts athttpwwwhindawicom
Computer Games Technology
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Distributed Sensor Networks
International Journal of
Advances in
FuzzySystems
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014
International Journal of
ReconfigurableComputing
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied Computational Intelligence and Soft Computing
thinspAdvancesthinspinthinsp
Artificial Intelligence
HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014
Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Electrical and Computer Engineering
Journal of
Journal of
Computer Networks and Communications
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation
httpwwwhindawicom Volume 2014
Advances in
Multimedia
International Journal of
Biomedical Imaging
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
ArtificialNeural Systems
Advances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
RoboticsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Computational Intelligence and Neuroscience
Industrial EngineeringJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Human-ComputerInteraction
Advances in
Computer EngineeringAdvances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Submit your manuscripts athttpwwwhindawicom
Computer Games Technology
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Distributed Sensor Networks
International Journal of
Advances in
FuzzySystems
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014
International Journal of
ReconfigurableComputing
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied Computational Intelligence and Soft Computing
thinspAdvancesthinspinthinsp
Artificial Intelligence
HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014
Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Electrical and Computer Engineering
Journal of
Journal of
Computer Networks and Communications
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation
httpwwwhindawicom Volume 2014
Advances in
Multimedia
International Journal of
Biomedical Imaging
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
ArtificialNeural Systems
Advances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
RoboticsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Computational Intelligence and Neuroscience
Industrial EngineeringJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Human-ComputerInteraction
Advances in
Computer EngineeringAdvances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014