Hindawi Computational Intelligence and Neuroscience, Volume 2018, Article ID 6747098, 13 pages
https://doi.org/10.1155/2018/6747098

Research Article
Handwritten Bangla Character Recognition Using the State-of-the-Art Deep Convolutional Neural Networks

Md Zahangir Alom,1 Paheding Sidike,2 Mahmudul Hasan,3 Tarek M. Taha,1 and Vijayan K. Asari1

1Department of Electrical and Computer Engineering, University of Dayton, Dayton, OH, USA
2Department of Earth and Atmospheric Sciences, Saint Louis University, St. Louis, MO, USA
3Comcast Labs, Washington, DC, USA

Correspondence should be addressed to Md Zahangir Alom; alomm1@udayton.edu

Received 1 March 2018; Revised 10 May 2018; Accepted 10 July 2018; Published 27 August 2018

Academic Editor: Friedhelm Schwenker

Copyright © 2018 Md Zahangir Alom et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

In spite of advances in object recognition technology, handwritten Bangla character recognition (HBCR) remains largely unsolved due to the presence of many ambiguous handwritten characters and excessively cursive Bangla handwriting. Even many advanced existing methods do not lead to satisfactory HBCR performance in practice. In this paper, a set of state-of-the-art deep convolutional neural networks (DCNNs) is discussed, and their performance on HBCR is systematically evaluated. The main advantage of DCNN approaches is that they can extract discriminative features from raw data and represent them with a high degree of invariance to object distortions. The experimental results show the superior performance of DCNN models compared with other popular object recognition approaches, which implies that DCNNs can be a good candidate for building an automatic HBCR system for practical applications.

1. Introduction

Automatic handwriting character recognition has many academic and commercial interests. The main challenge in handwritten character recognition is to deal with the enormous variety of handwriting styles of different writers. Furthermore, some complex handwriting scripts comprise different styles for writing words. Depending on the language, characters are written isolated from each other in some cases (e.g., Thai, Laos, and Japanese). In other cases, they are cursive, and sometimes characters are connected to each other (e.g., English, Bangladeshi, and Arabic). This challenge has already been recognized by many researchers in the field of natural language processing (NLP) [1–3].

Handwritten character recognition is more challenging than recognition of printed characters for the following reasons: (1) handwritten characters written by different writers are not only nonidentical but also vary in aspects such as size and shape; (2) numerous variations in the writing style of an individual character make the recognition task difficult; (3) similarities between different characters in shape, as well as the overlaps and interconnections of neighbouring characters, further complicate the problem. In summary, the large variety of writing styles and the complex features of handwritten characters make it challenging to accurately classify handwritten characters.

Bangla is one of the most widely spoken languages, ranked fifth in the world and spoken by more than 200 million people [4, 5]. It is the national and official language of Bangladesh and the second most popular language in India. In addition, Bangla has a rich heritage: February 21st was declared International Mother Language Day by UNESCO in honour of the martyrs who died for the language in Bangladesh in 1952. The Bangla character set involves a Sanskrit-based script that is inherently different from English- or Latin-based scripts, and it is relatively difficult to achieve the desired accuracy on recognition tasks.


Therefore, developing a recognition system for Bangla characters is of great interest [4, 6, 7].

In the Bangla language, there are 10 digits and 50 characters, including vowels and consonants, some of which carry an additional sign above and/or below. Moreover, Bangla includes many similarly shaped characters; in some cases, a character differs from its similar counterpart only by a single dot or mark. Furthermore, Bangla also contains some special characters that are equivalent representations of vowels. This makes it difficult to achieve good performance with simple classification techniques and hinders the development of a reliable handwritten Bangla character recognition (HBCR) system.

There are many applications of HBCR, such as Bangla optical character recognition, national ID number recognition, automatic license plate recognition for vehicle and parking lot management, post office automation, and online banking. Some example images of these applications are shown in Figure 1. In this work, we investigate HBCR on Bangla numerals, alphabets, and special characters using state-of-the-art deep convolutional neural network (DCNN) [8] models. The contributions of this paper can be summarized as follows:

(i) The first comprehensive evaluation of the state-of-the-art DCNN models, including VGG Net [9], the All Convolutional Network (All-Conv) [10], Network in Network (NiN) [11], Residual Network (ResNet) [12], Fractal Network (FractalNet) [13], and the Densely Connected Network (DenseNet) [14], on HBCR.

(ii) Extensive experiments on HBCR, including handwritten digit, alphabet, and special character recognition.

(iii) Better recognition accuracy, to the best of our knowledge, than the existing approaches reported in the literature.

2. Related Work

Although some studies on Bangla character recognition have been reported in past years [15–17], few remarkable works are available for HBCR. Pal and Chaudhuri [5] proposed a feature extraction-based method for handwritten Bangla character recognition in which the concept of water overflowing from a reservoir is utilized. Liu and Suen [18] introduced directional gradient features for handwritten Bangla digit classification using the ISI Bangla numeral dataset [19], which consists of 19,392 training samples, 4,000 test samples, and 10 classes (i.e., 0 to 9). Surinta et al. [20] proposed a system using a set of features such as the contour of the handwritten image computed using 8-directional codes, the distance between hotspots and black pixels, and the pixel intensity of small blocks. Each of these features is separately fed into a support vector machine (SVM) [21] classifier, and the final decision is made by majority voting. Das et al. [22] exploited a genetic algorithm-based region sampling method for local feature selection and achieved 97% accuracy on HBCR. Xu et al. [23] used a hierarchical Bayesian network that directly takes raw images as inputs and classifies them using a bottom-up approach. A sparse representation classifier has also been applied to Bangla digit recognition [4], where 94% accuracy was reported. In [6], handwritten Bangla basic and compound character recognition using a multilayer perceptron (MLP) [24] and an SVM classifier was suggested, while handwritten Bangla numeral recognition using an MLP was presented in [7], where the average recognition rate reached 96.67%.

Recently, deep learning-based methods have drawn increasing attention in handwritten character recognition [25, 26]. Ciregan and Meier [27] applied multicolumn CNNs to Chinese character classification. Kim and Xie [25] applied a DCNN to Hangul handwritten character recognition, achieving superior performance over classical methods. A CNN-based HBCR scheme was introduced in [26], where the best recognition accuracy reached 85.36% on the authors' own dataset. In this paper, we for the first time introduce the very latest DCNN models, including VGG Net, All-Conv, NiN, ResNet, FractalNet, and DenseNet, for handwritten Bangla character (i.e., digit, alphabet, and special character) recognition.

3. Deep Neural Networks

Deep neural networks (DNNs) are an active area in the field of machine learning and computer vision [28], and they generally encompass three popular architectures: the Deep Belief Network (DBN) [29], the Stacked Autoencoder (SAE) [30], and the CNN. Due to the composition of many layers, DNN methods are more capable of representing highly varying nonlinear functions than shallow learning approaches [31]. The low and middle levels of a DNN abstract features from the input image, whereas the high level performs classification on the extracted features. As a result, an end-to-end framework is formed by integrating all necessary layers within a single network; therefore, DNN models often lead to better accuracy than other types of machine learning methods. Recent successful practice of DNNs covers a variety of topics, such as electricity consumption monitoring [32], radar signal examination [33], medical image analysis [34–36], food security [37–39], and remote sensing [40–42].

Among all deep learning approaches, the CNN is one of the most popular models and has provided state-of-the-art performance on segmentation [43, 44], human action recognition [45], image superresolution [46], scene labelling [47], and visual tracking [48].

3.1. Convolutional Neural Network (CNN). The CNN was initially applied to the digit recognition task by LeCun et al. [8], and CNNs and their variants have gradually been adopted in various applications [46, 49]. The CNN is designed to imitate human visual processing, and it has highly optimized structures for processing 2D images; furthermore, it can effectively learn the extraction and abstraction of 2D features. In detail, the max-pooling layer of the CNN is very effective in absorbing shape variations. Moreover, sparse connections with tied weights give the CNN fewer parameters than a fully connected network of similar size. Most importantly, the CNN is trainable with a gradient-based learning algorithm and suffers less from the diminishing gradient problem. Given that the gradient-based algorithm trains the whole network to minimize an error criterion directly, the CNN can produce highly optimized weights and good generalization performance [50].

The overall architecture of a CNN, as shown in Figure 2, consists of two main parts: a feature extractor and a classifier. In the feature extraction unit, each layer of the network receives the output of its immediate previous layer as input and passes its output as input to the immediate next layer, whereas the classification part generates the predicted outputs associated with the input data. The two basic layers in a CNN architecture are the convolution and pooling layers [8]. In a convolution layer, each node extracts features from the input images by a convolution operation on the input nodes. The outputs of the (l−1)th layer are used as inputs to the lth layer, where the inputs go through a set of kernels followed by the nonlinear activation function ReLU. Here f refers to the ReLU activation function, x_i^{l-1} denotes the inputs from the (l−1)th layer, k_{ij}^l denotes the kernels of the lth layer, and b_j^l represents the biases of the lth layer. Then the convolution operation can be expressed as

  x_j^l = f(x_i^{l-1} * k_{ij}^l + b_j^l)   (1)
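Equation (1) can be illustrated with a minimal single-channel sketch in NumPy. The function names, the 5 × 5 kernel, and the random data are illustrative choices, not the paper's implementation:

```python
import numpy as np

def conv2d_valid(x, k, b=0.0):
    """'Valid' 2D convolution (cross-correlation, as in most CNN
    implementations) of one input map x with one kernel k, plus bias b."""
    n, f = x.shape[0], k.shape[0]
    m = n - f + 1                      # output size for stride 1
    out = np.empty((m, m))
    for r in range(m):
        for c in range(m):
            out[r, c] = np.sum(x[r:r+f, c:c+f] * k) + b
    return out

def relu(z):                           # the nonlinearity f in equation (1)
    return np.maximum(z, 0.0)

x = np.random.randn(32, 32)            # a 32 x 32 input map, as in Figure 2
k = np.random.randn(5, 5)              # a 5 x 5 kernel
y = relu(conv2d_valid(x, k, b=0.1))    # x_j^l = f(x_i^{l-1} * k_ij^l + b_j^l)
print(y.shape)                         # (28, 28)
```

Note that a 32 × 32 input with a 5 × 5 kernel and stride 1 yields a 28 × 28 output map, matching the dimensions in Figure 2.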

The subsampling or pooling layer abstracts the features through an average or maximum operation on the input nodes. For example, if a 2 × 2 downsampling kernel is applied, then each output dimension will be half of the corresponding input dimension. The pooling operation can be stated as

  x_j^l = down(x_i^{l-1})   (2)

where down(·) denotes the downsampling (pooling) operation.
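The 2 × 2 max-pooling case of equation (2) is a one-liner in NumPy (an illustrative sketch):

```python
import numpy as np

def max_pool_2x2(x):
    """2 x 2 max pooling with stride 2: each output dimension is half
    of the input dimension, per equation (2)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

x = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool_2x2(x))    # [[5. 7.], [13. 15.]]
```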

In contrast to traditional neural networks, the CNN extracts low- to high-level features: the higher-level features are derived from the propagated features of the lower-level layers. As the features propagate to the highest layer, their dimension is reduced depending on the size of the convolution and pooling masks; however, the number of feature maps usually increases in order to select or map the most suitable features of the input images for better classification accuracy. The outputs of the last layer of the CNN are used as inputs to a fully connected network, which typically uses a Softmax operation to produce the classification outputs. For an input sample x, weight vector w, and K distinct linear functions, the Softmax operation for the ith class can be defined as

  P(y = i | x) = exp(x^T w_i) / Σ_{k=1}^{K} exp(x^T w_k)   (3)
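Equation (3) can be sketched directly in NumPy. The 512-dimensional feature vector and the max-subtraction stability trick are illustrative choices, not taken from the paper:

```python
import numpy as np

def softmax(scores):
    """Softmax of equation (3); subtracting the max is a standard
    numerical-stability trick and does not change the result."""
    e = np.exp(scores - scores.max())
    return e / e.sum()

# scores[i] plays the role of x^T w_i for K = 10 classes
x = np.random.randn(512)           # feature vector from the last layer
W = np.random.randn(10, 512)       # one weight vector w_i per class
p = softmax(W @ x)
print(round(p.sum(), 6))           # 1.0: a valid probability distribution
```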

Various DCNN architectures have been proposed over the last few years. The following section discusses six popular DCNN models.

3.2. CNN Variants. As far as CNN architecture is concerned, some important and fundamental components are used to construct an efficient DCNN architecture: the convolution layer, pooling layer, fully connected layer, and Softmax layer.

Figure 1: Applications of handwritten character recognition: (a) national ID number recognition system, (b) post office automation with code number recognition on envelopes, and (c) automatic license plate recognition.

The advanced architecture of this type of network consists of a stack of convolutional layers and max-pooling layers followed by fully connected and Softmax layers at the end. Noticeable examples of such networks include LeNet [8], AlexNet [49], VGG Net, All-Conv, and NiN. Other alternative and advanced architectures have been proposed, including GoogleNet with inception layers [51], ResNet, FractalNet, and DenseNet. However, there are some topological differences among the modern architectures. Of the many DCNN architectures, AlexNet, VGG Net, GoogleNet, ResNet, DenseNet, and FractalNet can be viewed as the most popular, owing to their enormous performance on different benchmarks for object classification. Among these models, some are designed especially for large-scale implementation, such as ResNet and GoogleNet, whereas VGG Net has a more general architecture. FractalNet is an alternative to ResNet. In contrast, DenseNet's architecture is unique in terms of unit connectivity, where every layer is directly connected to all subsequent layers. In this paper, we provide a review and comparative study of All-Conv, NiN, VGG-16, ResNet, FractalNet, and DenseNet for Bangla character recognition. A basic overview of these architectures is given in the following sections.

3.2.1. VGG-16. The Visual Geometry Group (VGG) network was the runner-up of the ImageNet Large Scale Visual Recognition Competition (ILSVRC) in 2014 [52]. In this architecture, two convolutional layers are used consecutively with rectified linear unit (ReLU) [53] activation, followed by a single max-pooling layer, several fully connected layers with ReLU, and a Softmax final layer. There are three types of VGG Net based on depth: these networks contain 11, 16, and 19 layers and are named VGG-11, VGG-16, and VGG-19, respectively. The basic structure of VGG-11 contains eight convolution layers, one max-pooling layer, and three fully connected (FC) layers followed by a single Softmax layer. The configuration of VGG-16 is as follows: convolution layers: 13; FC layers: 3; Softmax layer: 1; total weights: 138 million. VGG-19 consists of 16 convolutional layers, one max-pooling layer, and 3 FC layers followed by a Softmax layer. The basic building blocks of the VGG architecture are shown in Figure 3. In this implementation, we use a VGG-16 network with fewer feature maps in the convolutional layers than the standard VGG-16 network.
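As a side check on the quoted figure of 138 million weights, the parameter count of the standard VGG-16 can be tallied in a few lines (assuming the usual ImageNet configuration with 3 × 3 convolutions, three FC layers, and a 1000-class output, counting biases):

```python
# Per-layer (in_channels, out_channels) for the 13 convolutions of VGG-16.
conv_cfg = [(3, 64), (64, 64),                        # block 1
            (64, 128), (128, 128),                    # block 2
            (128, 256), (256, 256), (256, 256),       # block 3
            (256, 512), (512, 512), (512, 512),       # block 4
            (512, 512), (512, 512), (512, 512)]       # block 5
fc_cfg = [(512 * 7 * 7, 4096), (4096, 4096), (4096, 1000)]

# 3 x 3 kernels: weights = 3*3*cin*cout, plus one bias per output channel.
conv_params = sum(3 * 3 * cin * cout + cout for cin, cout in conv_cfg)
fc_params = sum(nin * nout + nout for nin, nout in fc_cfg)
total = conv_params + fc_params
print(total)        # 138357544, i.e., roughly 138 million
```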

3.2.2. All Convolutional Network (All-Conv). The layer specification of All-Conv is given in Figure 4. The basic architecture is composed of two convolutional layers followed by a max-pooling layer. Instead of a fully connected layer, global average pooling (GAP) [11] with a dimension of 6 × 6 is used. Finally, a Softmax layer is used for classification, with the output dimension assigned based on the number of classes.

3.2.3. Network in Network (NiN). This model is quite different from the aforementioned DCNN models due to the following properties [11]:

(i) It uses multilayer convolution, where convolution is performed with 1 × 1 filters.

(ii) It uses GAP instead of a fully connected layer.

The concept of 1 × 1 convolution helps to increase the depth of the network. GAP significantly changes the network structure and is nowadays often used as a replacement for fully connected layers: GAP over a large feature map generates a final low-dimensional feature vector directly, instead of reducing the feature map to a small size and then flattening it.
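The effect of GAP can be sketched as follows (a NumPy illustration; the 512 × 6 × 6 shape is only an example):

```python
import numpy as np

# Global average pooling (GAP) collapses each feature map to a single
# number, so a (channels, H, W) stack becomes a length-`channels` vector
# that can feed the Softmax directly -- no flattening or FC layer needed.
feature_maps = np.random.randn(512, 6, 6)   # e.g., 512 maps of size 6 x 6
gap_vector = feature_maps.mean(axis=(1, 2))
print(gap_vector.shape)                     # (512,)
```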

Figure 2: Basic CNN architecture for digit recognition (feature extraction followed by classification): input 32 × 32 → convolution → 32 feature maps of 28 × 28 → max-pooling → 32 maps of 14 × 14 → convolution → 64 maps of 10 × 10 → max-pooling → 64 maps of 5 × 5 → outputs.

3.2.4. Residual Network (ResNet). The ResNet architecture has become very popular in the computer vision community. ResNet variants have been experimented with different numbers of layers, for example: convolution layers: 49 (34, 152, and 1202 layers in other versions of ResNet); fully connected layers: 1; weights: 25.5M. The basic block diagram of the ResNet architecture is shown in Figure 5. If the input of a residual block is x_{l-1}, its output is x_l. After performing operations (e.g., convolution with different filter sizes, and batch normalization (BN) [54] followed by an activation function such as ReLU) on x_{l-1}, the output F(x_{l-1}) is produced. The final output of the residual unit is defined as

  x_l = F(x_{l-1}) + x_{l-1}   (4)

The Residual Network consists of several basic residual units. Different residual units have been proposed with different types of layers; however, the operations between the residual units vary depending on the architecture, as explained in [12].
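A toy sketch of equation (4), with a deliberately simplified F (two per-channel linear maps with ReLU; batch normalization omitted for brevity), illustrates the identity shortcut:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def residual_unit(x, w1, w2):
    """Toy residual unit: x_l = F(x_{l-1}) + x_{l-1}, per equation (4).
    Here F is two linear maps with a ReLU in between -- a stand-in for
    the convolution/BN/ReLU stack of a real residual block."""
    f = w2 @ relu(w1 @ x)       # F(x_{l-1})
    return f + x                # identity shortcut

c = 8                            # illustrative channel count
x = np.random.randn(c)
w1, w2 = np.random.randn(c, c), np.random.randn(c, c)
y = residual_unit(x, w1, w2)
# If F collapses to zero (e.g., zero weights), the unit is the identity:
print(np.allclose(residual_unit(x, np.zeros((c, c)), np.zeros((c, c))), x))
```

This identity behaviour is exactly why residual units ease gradient propagation: the skip path always carries the input through unchanged.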

3.2.5. FractalNet. The FractalNet architecture is an advanced alternative to ResNet; it is very efficient for designing very large networks with shallow subnetworks and shorter paths for the propagation of gradients during training [13]. The concept is based on drop-path, another regularization technique for large networks; as a result, it helps to enforce a speed-versus-accuracy tradeoff. The basic block diagram of FractalNet is shown in Figure 6, where x is the actual input of FractalNet, and z and f(z) are the input and output of a fractal block, respectively.

3.2.6. Densely Connected Network (DenseNet). DenseNet is a densely connected CNN in which each layer is connected to all previous layers [14]; it therefore forms very dense connectivity between the layers, hence the name DenseNet. DenseNet consists of several dense blocks, and the layer between two adjacent blocks is called a transition layer. The conceptual diagram of a dense block is shown in Figure 7. The lth layer receives all the feature maps x_0, x_1, x_2, ..., x_{l-1} from the previous layers as input, which is expressed by

  x_l = H_l([x_0, x_1, x_2, ..., x_{l-1}])   (5)

Figure 3: Basic architecture of VGG Net: convolution (Conv) and ReLU blocks with max-pooling, FC for fully connected layers, and a Softmax layer at the end.

Figure 4: All convolutional network framework: input 32 × 32; two 3 × 3 Conv 128 ReLU layers; 3 × 3 max-pooling with stride 2; two 3 × 3 Conv 256 ReLU layers; 3 × 3 max-pooling with stride 2; two Conv 512 ReLU layers; 6 × 6 global average pooling; Softmax.

Figure 5: Basic diagram of the residual block (convolution and ReLU activation layers with an additive skip connection).


where [x_0, x_1, x_2, ..., x_{l-1}] denotes the features from layers 0, ..., l−1 concatenated into a single tensor, and H_l(·) performs three consecutive operations: BN, followed by ReLU and a 3 × 3 convolution. In the transition block, a 1 × 1 convolution is performed with BN, followed by a 2 × 2 average pooling layer. This architecture has achieved state-of-the-art accuracy for object recognition on five different competitive benchmarks.
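The growth of the concatenated feature stack in equation (5) can be sketched as follows. Here h_l is only a shape-faithful stand-in for H_l (a real H_l is BN → ReLU → 3 × 3 convolution); the fixed averaging keeps the sketch dependency-free:

```python
import numpy as np

def h_l(inputs, k):
    """Stand-in for H_l(.): concatenates all preceding feature maps and
    produces k new maps, mimicking equation (5) at the shape level."""
    stacked = np.concatenate(inputs, axis=0)       # [x_0, ..., x_{l-1}]
    return np.stack([stacked.mean(axis=0)] * k)    # k new feature maps

k = 3                                   # growth rate, as in Figure 7
x0 = np.random.randn(4, 8, 8)           # 4 input maps of size 8 x 8
features = [x0]
for _ in range(4):                      # a 4-layer dense block
    features.append(h_l(features, k))

channels = sum(f.shape[0] for f in features)
print(channels)                         # 4 + 4*3 = 16 maps after the block
```

The key point the sketch makes concrete is that each layer adds exactly k (the growth rate) feature maps to an ever-growing concatenated input.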

3.2.7. Network Parameters. The number of network parameters is a very important criterion for assessing the complexity of an architecture, and it can be used to compare different architectures. First, the dimension of the output feature maps can be computed as

  M = (N − F) / S + 1   (6)

where N denotes the dimension of the input feature maps, F refers to the dimension of the filters or receptive field, S represents the stride of the convolution, and M is the dimension of the output feature maps. The number of parameters (without biases) of a single layer is obtained by

  P_l = (F × F × FM_{l−1}) × FM_l   (7)

where P_l represents the total number of parameters of the lth layer, FM_l is the number of output feature maps of the lth layer, and FM_{l−1} is the number of feature maps of the (l−1)th layer. For example, let a 32 × 32 image be the input (N = 32), with filter size F = 5 and stride S = 1 for a convolutional layer. The output dimension M of the convolutional layer is then 28 × 28, calculated according to (6). For better illustration, a summary of the parameters used in the All-Conv architecture is shown in Table 1. Note that the number of trainable parameters is zero in the pooling layers.
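Equations (6) and (7) are easy to check in code against the worked example and the first rows of Table 1 (a small sketch; the function names are ours):

```python
def out_dim(n, f, s=1):
    """Equation (6): M = (N - F) / S + 1."""
    return (n - f) // s + 1

def conv_params(f, fm_prev, fm_out):
    """Equation (7): P_l = (F x F x FM_{l-1}) x FM_l, biases excluded."""
    return f * f * fm_prev * fm_out

# The worked example from the text: 32 x 32 input, 5 x 5 filter, stride 1.
print(out_dim(32, 5))             # 28

# Checking two rows of Table 1 (All-Conv):
print(conv_params(3, 3, 128))     # C1: 3456
print(conv_params(3, 128, 128))   # C2: 147456
```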

4. Results and Discussion

The entire experiment was performed on a desktop computer with an Intel® Core-i7 CPU @ 3.33 GHz and 56.00 GB of memory, using Keras with Theano on the backend in a Linux environment. We evaluate the state-of-the-art DCNN models on three datasets from CMATERdb (available at https://code.google.com/archive/p/cmaterdb) containing Bangla handwritten digits, alphabets, and special characters.

Figure 7: A 4-layer dense block with a growth rate of k = 3. Each of the layers (BN-ReLU-Conv) takes all of the preceding feature maps as input; a transition layer follows the block.

Figure 6: An example of the FractalNet architecture (a fractal block with convolution, joining, and pooling layers).


The statistics of the three datasets used in this paper are summarized in Table 2. For convenience, we name the datasets Digit-10, Alphabet-50, and SpecialChar-13, respectively. All images are rescaled to 32 × 32 pixels in our experiments.

4.1. Bangla Handwritten Digit Dataset. The standard samples of the numerals, with the corresponding Arabic numerals, are shown in Figure 8. The performance of the DCNN models is evaluated on a Bangla handwritten benchmark dataset called CMATERdb 3.1.1 [22]. This dataset contains 6000 images of unconstrained handwritten isolated Bangla numerals. Each digit has 600 images, rescaled to 32 × 32 pixels. Some sample images from the database are shown in Figure 9. Visual inspection shows no visible noise; however, variability in writing style is quite high due to user dependency. In our experiments, the dataset is split into a training set and a test set for the evaluation of the different DCNN models: the training set consists of 4000 images (400 randomly selected images of each digit), and the remaining 2000 images are used for testing.
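The per-class split described above can be sketched as follows (the filenames are hypothetical placeholders; the actual CMATERdb 3.1.1 file layout may differ):

```python
import random

# For each of the 10 digits, 400 of its 600 images go to training
# and the remaining 200 to testing, as described in the text.
random.seed(0)
train, test = [], []
for digit in range(10):
    images = [f"digit{digit}_{i:03d}.png" for i in range(600)]
    random.shuffle(images)           # random selection within the class
    train += images[:400]
    test += images[400:]

print(len(train), len(test))         # 4000 2000
```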

Figure 10 shows the training loss of all DCNN models over 250 epochs. It can be observed that FractalNet and DenseNet converge faster than the other networks, while the worst convergence is obtained by the All-Conv network. The validation accuracy is shown in Figure 11, where DenseNet and FractalNet show the best recognition accuracy among all DCNN models. Finally, the testing accuracy of all the DCNN models is shown in Figure 12. From the results, it can be clearly seen that DenseNet provides the best recognition accuracy of all the networks.

4.2. Bangla Handwritten Alphabet-50. In our implementation, the basic fifty alphabetic characters, including 11 vowels and 39 consonants, are considered. Samples of the 39 consonant and 11 vowel characters are shown in Figures 13(a) and 13(b), respectively. The Alphabet-50 dataset contains 15,000 samples, of which 12,000 are used for training and the remaining 3,000 for testing. Since the dataset contains samples of different dimensions, we rescale all input images to 32 × 32 pixels for better fitting to the convolutional operation. Some randomly selected samples from this database are shown in Figure 14.

Table 1: Parameter specification of the All-Conv model.

Layers   Operations    Feature maps  Size of feature maps  Size of kernels  Number of parameters
Inputs   —             —             32 × 32 × 3           —                —
C1       Convolution   128           30 × 30               3 × 3            3456
C2       Convolution   128           28 × 28               3 × 3            147456
S1       Max-pooling   128           14 × 14               2 × 2            NA
C3       Convolution   256           12 × 12               3 × 3            294912
C4       Convolution   256           10 × 10               3 × 3            589824
S2       Max-pooling   256           5 × 5                 2 × 2            NA
C5       Convolution   512           3 × 3                 3 × 3            1179648
C6       Convolution   512           3 × 3                 1 × 1            262144
GAP1     GAP           512           3 × 3                 NA               NA
Outputs  Softmax       10            1 × 1                 NA               5120

Table 2: Statistics of the databases used in our experiments.

Dataset         Training samples  Testing samples  Total samples  Number of classes
Digit-10        4000              2000             6000           10
Alphabet-50     12000             3000             15000          50
SpecialChar-13  2196              935              3131           13

Figure 8: The first row shows the actual Bangla digits, and the second row shows the corresponding Arabic numerals (0 to 9).

Figure 9: Sample handwritten Bangla numeral images from the CMATERdb 3.1.1 database.

Figure 10: Training loss of the different architectures (All-Conv, NiN, VGG-16, ResNet, FractalNet, and DenseNet) for Bangla handwritten digits.


The training loss of the different DCNN models is shown in Figure 15. It is clear that DenseNet shows the best convergence of the DCNN approaches. Similar to the previous experiment, All-Conv shows the worst convergence behavior; in addition, unexpected convergence behavior is observed for the NiN model. However, all DCNN models tend to converge after 200 epochs. The corresponding validation accuracy on Alphabet-50 is shown in Figure 16, where DenseNet again shows superior validation accuracy compared with the other DCNN approaches.

Figure 17 shows the testing results on handwritten Alphabet-50. DenseNet shows the best testing accuracy, with a recognition rate of 98.31%. On the other hand, the All-Conv network provides around 94.31% testing accuracy, the lowest among all the DCNN models.

4.3. Bangla Handwritten Special Characters. There are several special characters (SpecialChar-13) that are equivalent representations of vowels and are combined with consonants to make meaningful words. In our evaluation, we use 13 special characters: those for the 11 vowels plus two additional special characters. Some samples of Bangla special characters are shown in Figure 18. It can be seen that the quality of the samples is poor, and significant variation within the same symbol makes this recognition task even more difficult.

The training loss and validation accuracy for SpecialChar-13 are shown in Figures 19 and 20, respectively. From these results, it can be seen that DenseNet provides better performance, with the lowest loss and the highest validation accuracy among all DCNN models. Figure 21 shows the testing accuracy of the DCNN models on the SpecialChar-13 dataset. It is observed from Figure 21 that DenseNet shows the highest testing accuracy, with the lowest training loss and very fast convergence. The VGG-19 network shows promising recognition accuracy as well.

4.4. Performance Comparison. The testing performance is compared with several existing methods; the results are presented in Table 3. The experimental results show that the modern DCNN models, including DenseNet, FractalNet, and ResNet, provide better testing accuracy than the other deep learning approaches and the previously proposed classical methods. In particular, DenseNet provides 99.13% testing accuracy for handwritten digit recognition, which is, to the best of our knowledge, the best accuracy that has been publicly reported. In the case of 50-alphabet recognition, DenseNet yields 98.31% recognition accuracy, which is almost 2.5% better than the method in [55]. As far as we know, this is the highest accuracy for handwritten Bangla 50-alphabet recognition. In addition, on the 13-special-character recognition task, the DCNNs show promising recognition accuracy; in particular, DenseNet achieves the best accuracy, at 98.18%.

Figure 11: Validation accuracy of different architectures for Bangla handwritten 1–10 digits.

Figure 12: Testing accuracy for Bangla handwritten digit recognition.

Figure 13: Example images of handwritten characters: (a) Bangla consonant characters and (b) vowels.


4.5. Parameter Evaluation. For an impartial comparison, we trained and tested the networks with the same optimized numbers of parameters as in the references. Table 4 shows the number of parameters used by the different networks for 50-alphabet recognition. The number of network parameters for digit and special character recognition was the same, except for the number of neurons in the classification layer.

4.6. Computation Time. We also calculate the computational cost for all methods, although the computation time depends on the complexity of the architecture. Table 5 presents the computational time per epoch (in seconds) during training of all the networks for the Digit-10, Alphabet-50, and SpecialChar-13 recognition tasks. From Table 5, it can be seen that DenseNet takes the longest time during training due to its dense structure but yields the best accuracy.

Figure 14: Randomly selected handwritten characters of Bangla alphabets from the Bangla handwritten Alphabet-50 dataset.

Figure 15: Training loss of different DCNN models for Bangla handwritten Alphabet-50.

Figure 16: Validation accuracy of different architectures for Bangla handwritten Alphabet-50.


Figure 17: Testing accuracy for handwritten 50-alphabet recognition using different DCNN techniques.

Figure 18: Randomly selected images of special characters from the dataset.

Figure 19: Training loss of different architectures for the Bangla 13 special characters (SpecialChar-13).


5. Conclusions

In this research, we investigated the performance of several popular deep convolutional neural networks (DCNNs) for handwritten Bangla character (e.g., digit, alphabet, and special character) recognition. The experimental results indicated that DenseNet is the best performer in classifying Bangla digits, alphabets, and special characters. Specifically, we achieved recognition rates of 99.13% for handwritten Bangla digits, 98.31% for the handwritten Bangla alphabet, and 98.18% for special character recognition using DenseNet. To the best of our knowledge, these are the best recognition results on the CMATERdb dataset. In the future, fusion-based DCNN models such as the Inception Recurrent Convolutional Neural Network (IRCNN) [47] will be explored and developed for handwritten Bangla character recognition.

Figure 20: Validation accuracy of different architectures for the Bangla 13 special characters (SpecialChar-13).

Figure 21: Testing accuracy of different architectures for the Bangla 13 special characters (SpecialChar-13).

Table 4: Number of parameters comparison.

Models              Number of parameters
VGG-16 [9]          ~8.43M
All-Conv Net [10]   ~2.26M
NiN [11]            ~2.81M
ResNet [12]         ~5.63M
FractalNet [13]     ~7.84M
DenseNet [14]       ~4.25M

Table 5: Computational time (in seconds) per epoch for different DCNN models on Digit-10, Alphabet-50, and SpecialChar-13.

Models              Digit-10   Alphabet-50   SpecialChar-13
VGG-16 [9]          32         83            15
All-Conv Net [10]   7          23            4
NiN [11]            9          27            5
ResNet [12]         64         154           34
FractalNet [13]     32         102           18
DenseNet [14]       95         210           58

Table 3: The testing accuracy of the VGG-16 network, All-Conv network, NiN, ResNet, FractalNet, and DenseNet on Digit-10, Alphabet-50, and SpecialChar-13, compared against other existing methods.

Types                 Methods           Digit-10 (%)   Alphabet-50 (%)   SpecialChar-13 (%)
Existing approaches   MLP [7]           96.67          —                 —
                      MPCA+QTLR [56]    98.55          —                 —
                      GA [22]           97.00          —                 —
                      LeNet+DBN [57]    98.64          —                 —
DCNN                  VGG Net [9]       97.57          97.56             96.15
                      All-Conv [10]     97.08          94.31             95.58
                      NiN [11]          97.36          96.73             97.24
                      ResNet [12]       98.51          97.33             97.64
                      FractalNet [13]   98.92          97.87             97.98
                      DenseNet [14]     99.13          98.31             98.18


Data Availability

The data used to support the findings of this study are available at https://code.google.com/archive/p/cmaterdb.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

[1] D. C. Ciresan, U. Meier, L. M. Gambardella, and J. Schmidhuber, "Deep, big, simple neural nets for handwritten digit recognition," Neural Computation, vol. 22, no. 12, pp. 3207–3220, 2010.

[2] U. Meier, D. C. Ciresan, L. M. Gambardella, and J. Schmidhuber, "Better digit recognition with a committee of simple neural nets," in Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), pp. 1250–1254, Beijing, China, September 2011.

[3] W. Song, S. Uchida, and M. Liwicki, "Comparative study of part-based handwritten character recognition methods," in Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), pp. 814–818, Beijing, China, September 2011.

[4] H. A. Khan, A. Al Helal, and K. I. Ahmed, "Handwritten Bangla digit recognition using sparse representation classifier," in Proceedings of the 2014 International Conference on Informatics, Electronics and Vision (ICIEV), pp. 1–6, IEEE, Dhaka, Bangladesh, May 2014.

[5] U. Pal and B. Chaudhuri, "Automatic recognition of unconstrained off-line Bangla handwritten numerals," in Proceedings of Advances in Multimodal Interfaces–ICMI 2000, pp. 371–378, Springer, Beijing, China, October 2000.

[6] N. Das, B. Das, R. Sarkar, S. Basu, M. Kundu, and M. Nasipuri, "Handwritten Bangla basic and compound character recognition using MLP and SVM classifier," 2010, http://arxiv.org/abs/1002.4040.

[7] S. Basu, N. Das, R. Sarkar, M. Kundu, M. Nasipuri, and D. K. Basu, "An MLP based approach for recognition of handwritten Bangla numerals," 2012, http://arxiv.org/abs/1203.0876.

[8] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.

[9] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," 2014, http://arxiv.org/abs/1409.1556.

[10] J. T. Springenberg, A. Dosovitskiy, T. Brox, and M. Riedmiller, "Striving for simplicity: the all convolutional net," 2014, http://arxiv.org/abs/1412.6806.

[11] M. Lin, Q. Chen, and S. Yan, "Network in network," 2013, http://arxiv.org/abs/1312.4400.

[12] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778, Las Vegas, NV, USA, June–July 2016.

[13] G. Larsson, M. Maire, and G. Shakhnarovich, "FractalNet: ultra-deep neural networks without residuals," 2016, http://arxiv.org/abs/1605.07648.

[14] G. Huang, Z. Liu, K. Q. Weinberger, and L. van der Maaten, "Densely connected convolutional networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 1, no. 2, p. 3, Honolulu, HI, USA, July 2017.

[15] B. Chaudhuri and U. Pal, "A complete printed Bangla OCR system," Pattern Recognition, vol. 31, no. 5, pp. 531–549, 1998.

[16] U. Pal, On the development of an optical character recognition (OCR) system for printed Bangla script, Ph.D. dissertation, Indian Statistical Institute, Kolkata, India, 1997.

[17] U. Pal and B. Chaudhuri, "Indian script character recognition: a survey," Pattern Recognition, vol. 37, no. 9, pp. 1887–1899, 2004.

[18] C.-L. Liu and C. Y. Suen, "A new benchmark on the recognition of handwritten Bangla and Farsi numeral characters," Pattern Recognition, vol. 42, no. 12, pp. 3287–3295, 2009.

[19] B. Chaudhuri, "A complete handwritten numeral database of Bangla–a major Indic script," in Proceedings of the Tenth International Workshop on Frontiers in Handwriting Recognition, Suvisoft, Baule, France, October 2006.

[20] O. Surinta, L. Schomaker, and M. Wiering, "A comparison of feature and pixel-based methods for recognizing handwritten Bangla digits," in Proceedings of the 12th International Conference on Document Analysis and Recognition (ICDAR), pp. 165–169, IEEE, Buffalo, NY, USA, 2013.

[21] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.

[22] N. Das, R. Sarkar, S. Basu, M. Kundu, M. Nasipuri, and D. K. Basu, "A genetic algorithm based region sampling for selection of local features in handwritten digit recognition application," Applied Soft Computing, vol. 12, no. 5, pp. 1592–1606, 2012.

[23] J.-W. Xu, J. Xu, and Y. Lu, "Handwritten Bangla digit recognition using hierarchical Bayesian network," in Proceedings of the 3rd International Conference on Intelligent System and Knowledge Engineering, vol. 1, pp. 1096–1099, IEEE, Xiamen, China, November 2008.

[24] D. E. Rumelhart, J. L. McClelland, et al., Parallel Distributed Processing, Vol. 1, MIT Press, Cambridge, MA, USA, 1987.

[25] I.-J. Kim and X. Xie, "Handwritten Hangul recognition using deep convolutional neural networks," International Journal on Document Analysis and Recognition, vol. 18, no. 1, pp. 1–13, 2015.

[26] M. M. Rahman, M. Akhand, S. Islam, P. C. Shill, and M. H. Rahman, "Bangla handwritten character recognition using convolutional neural network," International Journal of Image, Graphics and Signal Processing, vol. 7, no. 8, pp. 42–49, 2015.

[27] D. Ciregan and U. Meier, "Multi-column deep neural networks for offline handwritten Chinese character classification," in Proceedings of the 2015 International Joint Conference on Neural Networks (IJCNN), pp. 1–6, IEEE, Killarney, Ireland, July 2015.

[28] A. Voulodimos, N. Doulamis, A. Doulamis, and E. Protopapadakis, "Deep learning for computer vision: a brief review," Computational Intelligence and Neuroscience, vol. 2018, Article ID 7068349, 13 pages, 2018.

[29] G. E. Hinton, S. Osindero, and Y.-W. Teh, "A fast learning algorithm for deep belief nets," Neural Computation, vol. 18, no. 7, pp. 1527–1554, 2006.

[30] P. Vincent, H. Larochelle, Y. Bengio, and P.-A. Manzagol, "Extracting and composing robust features with denoising autoencoders," in Proceedings of the 25th International Conference on Machine Learning, pp. 1096–1103, ACM, Helsinki, Finland, July 2008.

[31] Y. Bengio et al., "Learning deep architectures for AI," Foundations and Trends® in Machine Learning, vol. 2, no. 1, pp. 1–127, 2009.

[32] J. Kim, T.-T.-H. Le, and H. Kim, "Nonintrusive load monitoring based on advanced deep learning and novel signature," Computational Intelligence and Neuroscience, vol. 2017, Article ID 4216281, 22 pages, 2017.

[33] E. Protopapadakis, A. Voulodimos, A. Doulamis, N. Doulamis, D. Dres, and M. Bimpas, "Stacked autoencoders for outlier detection in over-the-horizon radar signals," Computational Intelligence and Neuroscience, vol. 2017, Article ID 5891417, 11 pages, 2017.

[34] H. Greenspan, B. van Ginneken, and R. M. Summers, "Guest editorial: deep learning in medical imaging: overview and future promise of an exciting new technique," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1153–1159, 2016.

[35] D. Shen, G. Wu, and H.-I. Suk, "Deep learning in medical image analysis," Annual Review of Biomedical Engineering, vol. 19, pp. 221–248, 2017.

[36] H.-C. Shin, H. R. Roth, M. Gao, et al., "Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1285–1298, 2016.

[37] S. Sladojevic, M. Arsenovic, A. Anderla, D. Culibrk, and D. Stefanovic, "Deep neural networks based recognition of plant diseases by leaf image classification," Computational Intelligence and Neuroscience, vol. 2016, Article ID 3289801, 11 pages, 2016.

[38] S. P. Mohanty, D. P. Hughes, and M. Salathe, "Using deep learning for image-based plant disease detection," Frontiers in Plant Science, vol. 7, p. 1419, 2016.

[39] G. Wang, Y. Sun, and J. Wang, "Automatic image-based plant disease severity estimation using deep learning," Computational Intelligence and Neuroscience, vol. 2017, Article ID 2917536, 8 pages, 2017.

[40] L. Zhang, L. Zhang, and B. Du, "Deep learning for remote sensing data: a technical tutorial on the state of the art," IEEE Geoscience and Remote Sensing Magazine, vol. 4, no. 2, pp. 22–40, 2016.

[41] A. Romero, C. Gatta, and G. Camps-Valls, "Unsupervised deep feature extraction for remote sensing image classification," IEEE Transactions on Geoscience and Remote Sensing, vol. 54, no. 3, pp. 1349–1362, 2016.

[42] K. Nogueira, O. A. Penatti, and J. A. dos Santos, "Towards better exploiting convolutional neural networks for remote sensing scene classification," Pattern Recognition, vol. 61, pp. 539–556, 2017.

[43] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440, Boston, MA, USA, June 2015.

[44] V. Badrinarayanan, A. Kendall, and R. Cipolla, "SegNet: a deep convolutional encoder-decoder architecture for image segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 12, pp. 2481–2495, 2017.

[45] S. Ji, W. Xu, M. Yang, and K. Yu, "3D convolutional neural networks for human action recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 221–231, 2013.

[46] C. Dong, C. C. Loy, K. He, and X. Tang, "Image super-resolution using deep convolutional networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 2, pp. 295–307, 2016.

[47] C. Farabet, C. Couprie, L. Najman, and Y. LeCun, "Learning hierarchical features for scene labeling," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 8, pp. 1915–1929, 2013.

[48] K. Zhang, Q. Liu, Y. Wu, and M.-H. Yang, "Robust visual tracking via convolutional networks without training," IEEE Transactions on Image Processing, vol. 25, no. 4, pp. 1779–1792, 2016.

[49] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proceedings of Advances in Neural Information Processing Systems, pp. 1097–1105, Lake Tahoe, NV, USA, December 2012.

[50] M. Matsugu, K. Mori, Y. Mitari, and Y. Kaneda, "Subject independent facial expression recognition with robust face detection using a convolutional neural network," Neural Networks, vol. 16, no. 5-6, pp. 555–559, 2003.

[51] C. Szegedy, W. Liu, Y. Jia, et al., "Going deeper with convolutions," in Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1–9, Boston, MA, USA, June 2015.

[52] O. Russakovsky, J. Deng, H. Su, et al., "ImageNet large scale visual recognition challenge," International Journal of Computer Vision, vol. 115, no. 3, pp. 211–252, 2015.

[53] V. Nair and G. E. Hinton, "Rectified linear units improve restricted Boltzmann machines," in Proceedings of the 27th International Conference on Machine Learning (ICML-10), pp. 807–814, Haifa, Israel, June 2010.

[54] S. Ioffe and C. Szegedy, "Batch normalization: accelerating deep network training by reducing internal covariate shift," in Proceedings of the International Conference on Machine Learning, pp. 448–456, Lille, France, July 2015.

[55] U. Bhattacharya, M. Shridhar, S. K. Parui, P. Sen, and B. Chaudhuri, "Offline recognition of handwritten Bangla characters: an efficient two-stage approach," Pattern Analysis and Applications, vol. 15, no. 4, pp. 445–458, 2012.

[56] N. Das, J. M. Reddy, R. Sarkar, et al., "A statistical–topological feature combination for recognition of handwritten numerals," Applied Soft Computing, vol. 12, no. 8, pp. 2486–2495, 2012.

[57] M. Z. Alom, P. Sidike, T. M. Taha, and V. K. Asari, "Handwritten Bangla digit recognition using deep learning," 2017, http://arxiv.org/abs/1705.02680.




Therefore, developing a recognition system for Bangla characters is of great interest [4, 6, 7].

In the Bangla language, there are 10 digits and 50 characters, including vowels and consonants, some of which carry an additional sign above and/or below. Moreover, Bangla contains many similarly shaped characters; in some cases, a character differs from a similar one only by a single dot or mark. Furthermore, Bangla also contains some special characters that are equivalent representations of vowels. This makes it difficult to achieve good performance with simple classification techniques and hinders the development of a reliable handwritten Bangla character recognition (HBCR) system.

There are many applications of HBCR, such as Bangla optical character recognition, national ID number recognition, automatic license plate recognition for vehicle and parking lot management, post office automation, and online banking. Some example images of these applications are shown in Figure 1. In this work, we investigate HBCR on Bangla numerals, alphabets, and special characters using state-of-the-art deep convolutional neural network (DCNN) [8] models. The contributions of this paper can be summarized as follows:

(i) The first comprehensive evaluation of the state-of-the-art DCNN models, including VGG Net [9], the All Convolutional Network (All-Conv) [10], Network in Network (NiN) [11], the Residual Network (ResNet) [12], the Fractal Network (FractalNet) [13], and the Densely Connected Network (DenseNet) [14], on the application of HBCR.

(ii) Extensive experiments on HBCR, including handwritten digit, alphabet, and special character recognition.

(iii) Better recognition accuracy, to the best of our knowledge, than the other existing approaches reported in the literature.

2. Related Work

Although some studies on Bangla character recognition have been reported in past years [15–17], few remarkable works are available for HBCR. Pal and Chaudhuri [5] proposed a new feature extraction-based method for handwritten Bangla character recognition in which the concept of water overflow from a reservoir is utilized. Liu and Suen [18] introduced directional gradient features for handwritten Bangla digit classification using the ISI Bangla numeral dataset [19], which consists of 19,392 training samples, 4,000 test samples, and 10 classes (i.e., 0 to 9). Surinta et al. [20] proposed a system using a set of features such as the contour of the handwritten image computed using 8-directional codes, the distance between hotspots and black pixels, and the pixel intensities of small blocks. Each of these features is separately fed into a support vector machine (SVM) [21] classifier, and the final decision is made by a majority voting strategy. Das et al. [22] exploited a genetic algorithm-based region sampling method for local feature selection and achieved 97% accuracy on HBCR. Xu et al. [23] used a hierarchical Bayesian network that directly takes raw images as the network inputs and classifies them using a bottom-up approach. A sparse representation classifier has also been applied to Bangla digit recognition [4], where 94% accuracy was reported. In [6], handwritten Bangla basic and compound character recognition using a multilayer perceptron (MLP) [24] and an SVM classifier was suggested, while handwritten Bangla numeral recognition using an MLP was presented in [7], where the average recognition rate reached 96.67%.

Recently, deep learning-based methods have drawn increasing attention in handwritten character recognition [25, 26]. Ciregan and Meier [27] applied multicolumn CNNs to Chinese character classification. Kim and Xie [25] applied a DCNN to Hangul handwritten character recognition, achieving superior performance against classical methods. A CNN-based HBCR scheme was introduced in [26], where the best recognition accuracy reached 85.36% on their own dataset. In this paper, we for the first time introduce the very latest DCNN models, including the VGG network, All-Conv, NiN, ResNet, FractalNet, and DenseNet, for handwritten Bangla character (e.g., digit, alphabet, and special character) recognition.

3. Deep Neural Networks

The deep neural network (DNN) is an active area in the fields of machine learning and computer vision [28], and it generally comprises three popular architectures: the Deep Belief Net (DBN) [29], the Stacked Autoencoder (SAE) [30], and the CNN. Due to the composition of many layers, DNN methods are more capable of representing highly varying nonlinear functions than shallow learning approaches [31]. The low and middle levels of a DNN abstract features from the input image, whereas the high level performs the classification operation on the extracted features. As a result, an end-to-end framework is formed by integrating all necessary layers within a single network; therefore, DNN models often lead to better accuracy than other types of machine learning methods. Recent successful practice of DNNs covers a variety of topics, such as electricity consumption monitoring [32], radar signal examination [33], medical image analysis [34–36], food security [37–39], and remote sensing [40–42].

Among all deep learning approaches, the CNN is one of the most popular models and has provided state-of-the-art performance on segmentation [43, 44], human action recognition [45], image super-resolution [46], scene labelling [47], and visual tracking [48].

3.1. Convolutional Neural Network (CNN). The CNN was initially applied to the digit recognition task by LeCun et al. [8]; the CNN and its variants have gradually been adopted in various applications [46, 49]. The CNN is designed to imitate human visual processing, and it has highly optimized structures for processing 2D images. Furthermore, it can effectively learn the extraction and abstraction of 2D features. In detail, the


max-pooling layer of the CNN is very effective in absorbing shape variations. Moreover, sparse connections with tied weights make the CNN involve fewer parameters than a fully connected network of similar size. Most importantly, the CNN is trainable with gradient-based learning algorithms and suffers less from the diminishing gradient problem. Given that the gradient-based algorithm trains the whole network to minimize an error criterion directly, the CNN can produce highly optimized weights and good generalization performance [50].

The overall architecture of a CNN, as shown in Figure 2, consists of two main parts: a feature extractor and a classifier. In the feature extraction unit, each layer of the network receives the output from its immediate previous layer as input and passes its current output as input to the immediate next layer, whereas the classification part generates the predicted outputs associated with the input data. The two basic layers in a CNN architecture are the convolution and pooling [8] layers. In the convolution layer, each node extracts features from the input images by a convolution operation on the input nodes. The outputs of the (l−1)th layer are used as inputs for the lth layer, where the inputs go through a set of kernels followed by the nonlinear ReLU activation function f. For example, if x_i^(l−1) are the inputs from the (l−1)th layer, k_ij^l are the kernels of the lth layer, and b_j^l are the biases of the lth layer, then the convolution operation can be expressed as

x_j^l = f(x_i^(l−1) ∗ k_ij^l + b_j^l).  (1)
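As a concrete illustration of equation (1), the following is a minimal NumPy sketch of a single-feature-map convolution layer. This is not the authors' implementation; the kernel and bias values are made up for the example, and, like most deep learning frameworks, it computes cross-correlation without flipping the kernel.

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

def conv2d_valid(x, k, b):
    """Single-channel 'valid' convolution x * k plus bias b, then ReLU (equation (1))."""
    h, w = x.shape
    kh, kw = k.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Slide the kernel over the input and take the weighted sum of the patch.
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k) + b
    return relu(out)

x = np.arange(16, dtype=float).reshape(4, 4)  # toy 4x4 "image"
k = np.ones((3, 3)) / 9.0                     # arbitrary 3x3 averaging kernel
y = conv2d_valid(x, k, b=-5.0)
print(y.shape)  # (2, 2): a 4x4 input and a 3x3 kernel give a 2x2 valid output
```

Note how the output spatial size shrinks from 4 × 4 to 2 × 2, matching the feature-map shrinkage visible in Figure 2.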

The subsampling or pooling layer abstracts the features through an average or maximum operation on the input nodes. For example, if a 2 × 2 down-sampling kernel is applied, then each output dimension will be half of the corresponding input dimension. The pooling operation can be stated as follows:

x_j^l = down(x_i^(l−1)).  (2)
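The down(·) operation in equation (2) can be sketched as non-overlapping 2 × 2 pooling; this is a toy NumPy illustration (not the authors' code), covering both the max and average variants mentioned above.

```python
import numpy as np

def pool2d(x, size=2, mode="max"):
    """Non-overlapping size x size pooling: the down(.) operation of equation (2)."""
    h, w = x.shape
    # Split the image into size x size blocks (trimming any ragged border),
    # then reduce each block with max or mean.
    blocks = x[:h - h % size, :w - w % size].reshape(h // size, size, w // size, size)
    reduce = np.max if mode == "max" else np.mean
    return reduce(blocks, axis=(1, 3))

x = np.array([[1., 2., 5., 6.],
              [3., 4., 7., 8.],
              [0., 0., 1., 1.],
              [0., 2., 1., 3.]])
print(pool2d(x))                 # max pooling halves each spatial dimension
print(pool2d(x, mode="mean"))    # average pooling over the same 2x2 blocks
```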

In contrast to traditional neural networks, the CNN extracts low- to high-level features; the higher-level features are derived from the propagated features of the lower-level layers. As the features propagate to the highest layer, the dimension of the features is reduced depending on the size of the convolution and pooling masks. However, the number of feature maps usually increases for selecting or mapping the most suitable features of the input images for better classification accuracy. The outputs of the last layer of the CNN are used as inputs to the fully connected network, which typically uses a Softmax operation to produce the classification outputs. For an input sample x, weight vectors w, and K distinct linear functions, the Softmax operation can be defined for the ith class as follows:

P(y = i | x) = exp(x^T w_i) / Σ_{k=1}^{K} exp(x^T w_k).  (3)
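Equation (3) amounts to only a few lines of code; the NumPy sketch below uses made-up weights for K = 3 classes (the shift by the maximum logit is a standard numerical-stability trick and does not change the resulting probabilities).

```python
import numpy as np

def softmax_scores(x, W):
    """Equation (3): P(y = i | x) = exp(x^T w_i) / sum_k exp(x^T w_k).
    Each column of W is one class weight vector w_k."""
    logits = x @ W
    logits -= logits.max()   # stability shift; cancels in the ratio
    e = np.exp(logits)
    return e / e.sum()

x = np.array([1.0, 2.0])                 # toy feature vector
W = np.array([[1.0, 0.0, 0.5],           # made-up weights for 3 classes
              [0.0, 1.0, 0.5]])
p = softmax_scores(x, W)
print(p, p.sum())                        # class probabilities summing to 1
```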

However, different variants of the DCNN architecture have been proposed over the last few years. The following section discusses six popular DCNN models.

3.2. CNN Variants. As far as CNN architecture is concerned, it can be observed that there are some important and fundamental components that are used to construct an efficient DCNN architecture. These components are the convolution layer, pooling layer, fully connected layer, and

Figure 1: Applications of handwritten character recognition: (a) national ID number recognition system; (b) post office automation with code number recognition on envelopes; (c) automatic license plate recognition.


Softmax layer. The advanced architecture of this kind of network consists of a stack of convolutional layers and max-pooling layers followed by fully connected and Softmax layers at the end. Noticeable examples of such networks include LeNet [8], AlexNet [49], VGG Net, All-Conv, and NiN. Other alternative and advanced architectures have been proposed, including GoogleNet with inception layers [51], ResNet, FractalNet, and DenseNet, although some topological differences are observed among the modern architectures. Of the many DCNN architectures, AlexNet, VGG Net, GoogleNet, ResNet, DenseNet, and FractalNet can be viewed as the most popular, owing to their enormous performance on different benchmarks for object classification. Among these models, some are designed especially for large-scale implementation, such as ResNet and GoogleNet, whereas VGG Net has a more general architecture. FractalNet is an alternative to ResNet. In contrast, DenseNet's architecture is unique in terms of unit connectivity, where every layer is directly connected to all subsequent layers. In this paper, we provide a review and comparative study of All-Conv, NiN, VGG-16, ResNet, FractalNet, and DenseNet for Bangla character recognition. A basic overview of these architectures is given in the following sections.

3.2.1. VGG-16. The visual geometry group (VGG) network was the runner-up in the ImageNet Large Scale Visual Recognition Competition (ILSVRC) in 2014 [52]. In this architecture, two convolutional layers are used consecutively with a rectified linear unit (ReLU) [53] activation function, followed by a single max-pooling layer, several fully connected layers with ReLU, and Softmax as the final layer. There are three types of VGG Net; these networks contain 11, 16, and 19 layers and are named VGG-11, VGG-16, and VGG-19, respectively. The basic structure of the VGG-11 architecture contains eight convolution layers, one max-pooling layer, and three fully connected (FC) layers followed by a single Softmax layer. The configuration of VGG-16 is as follows: convolution layers: 13, FC layers: 3, and Softmax layer: 1; the total number of weights is 138 million. VGG-19 consists of 16 convolutional layers, one max-pooling layer, and 3 FC layers followed by a Softmax layer. The basic building blocks of the VGG architecture are shown in Figure 3. In this implementation, we have used a VGG-16 network with fewer feature maps in the convolutional layers than the standard VGG-16 network.
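The layer stacking described above is often written as a compact configuration list. The sketch below uses the channel counts of the standard VGG-16, not the reduced-feature-map variant trained in this paper; it simply makes the 13-convolution-layer structure explicit.

```python
# Standard VGG-16 convolutional configuration as a compact list:
# integers are 3x3 conv layers (number of output channels),
# "M" marks a 2x2 max-pooling stage.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]

n_conv = sum(1 for v in VGG16_CFG if v != "M")
n_pool = VGG16_CFG.count("M")
print(n_conv, n_pool)  # 13 conv layers; 5 pooling stages in the standard net
```

The 13 convolution layers plus the 3 FC layers give the 16 weight layers that name the network.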

3.2.2. All Convolutional Network (All-Conv). The layer specification of All-Conv is given in Figure 4. The basic architecture is composed of two convolutional layers followed by a max-pooling layer. Instead of a fully connected layer, global average pooling (GAP) [11] with a dimension of 6 × 6 is used. Finally, the Softmax layer is used for classification; its output dimension is assigned based on the number of classes.
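GAP simply averages each feature map over its spatial extent, turning an H × W × C stack directly into a C-dimensional vector with no extra weights. A minimal sketch (the array shapes here are illustrative, not the network's actual dimensions):

```python
import numpy as np

def global_average_pooling(fmaps):
    """Collapse an H x W x C feature-map stack to a length-C vector by
    averaging over the two spatial dimensions."""
    return fmaps.mean(axis=(0, 1))

fmaps = np.random.rand(6, 6, 10)   # e.g., a 6x6 spatial grid with 10 feature maps
v = global_average_pooling(fmaps)
print(v.shape)                     # (10,): one value per feature map
```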

3.2.3. Network in Network (NiN). This model is quite different from the aforementioned DCNN models due to the following properties [11]:

(i) It uses multilayer convolution, where convolution is performed with 1 × 1 filters.

(ii) It uses GAP instead of a fully connected layer.

The concept of using 1 × 1 convolution helps to increase the depth of the network. The GAP significantly changes the network structure and is nowadays often used as a replacement for fully connected layers: the GAP on a large feature map generates a final low-dimensional feature vector directly, instead of reducing the feature map to a small size and then flattening it.
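A 1 × 1 convolution is just a linear map across channels, shared by every spatial position, which is why it deepens the network without changing the spatial size of the feature maps. A toy sketch (shapes and weights are made up for illustration):

```python
import numpy as np

def conv1x1(fmaps, W):
    """A 1x1 convolution: each spatial position is mapped independently from
    C_in channels to C_out channels, i.e., a tiny shared linear layer."""
    h, w, c_in = fmaps.shape
    return (fmaps.reshape(-1, c_in) @ W).reshape(h, w, -1)  # W: (C_in, C_out)

fmaps = np.random.rand(4, 4, 8)   # 8 input feature maps on a 4x4 grid
W = np.random.rand(8, 3)          # project 8 channels down to 3
out = conv1x1(fmaps, W)
print(out.shape)                  # (4, 4, 3): depth changed, spatial size kept
```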

Figure 2: Basic CNN architecture for digit recognition (input 32 × 32; feature maps 32@28 × 28 → 32@14 × 14 → 64@10 × 10 → 64@5 × 5 through alternating convolution and max-pooling stages in the feature extraction part, followed by the classification part producing the outputs).

4 Computational Intelligence and Neuroscience

3.2.4. Residual Network (ResNet). The ResNet architecture has become very popular in the computer vision community. The ResNet variants have been experimented with different numbers of layers, for example: 49 convolutional layers (34, 152, and 1202 layers for other versions of ResNet), one fully connected layer, and about 25.5M weights. The basic block diagram of the ResNet architecture is shown in Figure 5. If the input of the residual block is x_{l−1}, the output of this block is x_l. After performing operations (e.g., convolution with different sizes of filters and batch normalization (BN) [54] followed by an activation function such as ReLU) on x_{l−1}, the output F(x_{l−1}) is produced. The final output of the residual unit is defined as

x_l = F(x_{l−1}) + x_{l−1}. (4)

The Residual Network consists of several basic residual units. Different residual units have been proposed with different types of layers. However, the operations between the residual units vary depending on the architecture, as explained in [12].
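The residual computation in (4) can be sketched in pure Python, with a plain function standing in for the conv/BN/ReLU stack F (an illustrative toy, not the actual ResNet code):

```python
def residual_unit(x_prev, transform):
    """x_l = F(x_{l-1}) + x_{l-1}: add the block's output to its input.

    x_prev: input feature vector.
    transform: stand-in for the conv/BN/ReLU stack F (a plain function
    here, for illustration only).
    """
    fx = transform(x_prev)
    return [a + b for a, b in zip(fx, x_prev)]

# With F as a toy elementwise transform, the identity path is preserved:
out = residual_unit([1.0, 2.0, 3.0], lambda v: [0.5 * a for a in v])
print(out)  # [1.5, 3.0, 4.5]
```

The additive identity path is what lets gradients flow directly to earlier layers: even if F's gradient vanishes, the "+ x_{l−1}" term contributes a gradient of 1.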

3.2.5. FractalNet. The FractalNet architecture is an advanced alternative to ResNet, which is very efficient for designing very large networks with shallow subnetworks but shorter paths for the propagation of the gradient during training [13]. This concept is based on drop-path, which is another regularization technique for large networks. As a result, this concept helps to enforce a speed versus accuracy tradeoff. The basic block diagram of FractalNet is shown in Figure 6. Here, x is the actual input of FractalNet, and z and f(z) are the input and output of a fractal block, respectively.
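Assuming the fractal expansion rule of [13] (base case f_1 = conv; f_{C+1}(z) = join(f_C(f_C(z)), conv(z))), the longest path through a block doubles at each level while the shortest path stays a single convolution. A small sketch of the longest-path depth:

```python
def fractal_depth(C):
    """Depth (in conv layers) of the longest path in a fractal block f_C.

    Assumes the expansion rule f_{C+1} = join(f_C o f_C, conv), which
    doubles the longest path at each level; the base case f_1 is one conv.
    """
    if C == 1:
        return 1
    return 2 * fractal_depth(C - 1)

print([fractal_depth(c) for c in range(1, 5)])  # [1, 2, 4, 8]
```

The coexistence of a depth-1 shortcut and a depth-2^(C−1) chain inside one block is what gives drop-path its mix of shallow and deep subnetworks to regularize over.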

3.2.6. Densely Connected Network (DenseNet). DenseNet is a densely connected CNN, where each layer is connected to all previous layers [14]. Therefore, it forms very dense connectivity between the layers, and so it is called DenseNet. DenseNet consists of several dense blocks, and the layer between two adjacent blocks is called a transition layer. The conceptual diagram of a dense block is shown in Figure 7. According to the figure, the lth layer receives all the feature maps x_0, x_1, x_2, ..., x_{l−1} from the previous layers as input, which is expressed by

x_l = H_l([x_0, x_1, x_2, ..., x_{l−1}]), (5)

Figure 3: Basic architecture of VGG Net: stacked convolution (Conv) and ReLU blocks, max-pooling, fully connected (FC) layers, and a Softmax layer at the end.

Input: 32 × 32
Convolution: (i) 3 × 3 Conv. 128, ReLU; (ii) 3 × 3 Conv. 128, ReLU
Pooling: 3 × 3 max-pooling, stride 2
Convolution: (i) 3 × 3 Conv. 256, ReLU; (ii) 3 × 3 Conv. 256, ReLU
Pooling: 3 × 3 max-pooling, stride 2
Convolution: (i) 3 × 3 Conv. 512, ReLU; (ii) 3 × 3 Conv. 512, ReLU
6 × 6 global average pooling
Softmax

Figure 4: All convolutional network framework.

Figure 5: Basic diagram of a residual block: two convolution + ReLU stages whose output is added (+) to the block input.


where [x_0, x_1, x_2, ..., x_{l−1}] denotes the concatenated features from layers 0, ..., l − 1, and H_l(·) produces a single tensor. DenseNet performs three consecutive operations: BN, followed by ReLU and a 3 × 3 convolution. In the transition block, a 1 × 1 convolutional operation is performed with BN, followed by a 2 × 2 average pooling layer. This new architecture has achieved state-of-the-art accuracy for object recognition on five different competitive benchmarks.
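The concatenation in (5) implies a simple channel-growth rule: with growth rate k, layer l sees the block's input maps plus k maps from each preceding layer. A sketch, assuming the block input provides k0 maps (k0 = 3 here is an illustrative choice):

```python
def dense_block_input_channels(k0, k, num_layers):
    """Channels seen by each layer of a dense block with growth rate k.

    Layer l concatenates the block input (k0 maps) with the k maps
    produced by each of the l preceding layers inside the block.
    """
    return [k0 + k * l for l in range(num_layers)]

# A 4-layer dense block with growth rate k = 3 (as in Figure 7),
# assuming k0 = 3 input maps:
print(dense_block_input_channels(3, 3, 4))  # [3, 6, 9, 12]
```

This linear growth is why the 1 × 1 convolutions in the transition layers matter: they compress the accumulated channels before the next dense block.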

3.2.7. Network Parameters. The number of network parameters is a very important criterion to assess the complexity of an architecture. The number of parameters can be used to make comparisons between different architectures. First, the dimension of the output feature map can be computed as

M = (N − F)/S + 1, (6)

where N denotes the dimension of the input feature maps, F refers to the dimension of the filters or receptive field, S represents the stride of the convolution, and M is the dimension of the output feature maps. The number of parameters (without biases) for a single layer is obtained by

P_l = (F × F × FM_{l−1}) × FM_l, (7)

where P_l represents the total number of parameters in the lth layer, FM_l is the total number of output feature maps of the lth layer, and FM_{l−1} is the total number of feature maps in the (l − 1)th layer. For example, let a 32 × 32 dimensional (N) image be the input, with a filter (F) of size 5 × 5 and a stride (S) of 1 for the convolutional layer. The output dimension (M) of the convolutional layer is then 28 × 28, calculated according to (6). For better illustration, a summary of the parameters used in the All-Conv architecture is shown in Table 1. Note that the number of trainable parameters is zero in the pooling layers.
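Equations (6) and (7) can be checked directly against the paper's example and the first rows of Table 1; a minimal sketch:

```python
def output_dim(N, F, S=1):
    """Eq. (6): M = (N - F) / S + 1, for a 'valid' convolution."""
    return (N - F) // S + 1

def layer_params(F, fm_prev, fm_curr):
    """Eq. (7): P_l = (F * F * FM_{l-1}) * FM_l, weights without biases."""
    return F * F * fm_prev * fm_curr

# The paper's example: 32x32 input, 5x5 filter, stride 1 -> 28x28 output
print(output_dim(32, 5, 1))       # 28
# Table 1, layer C1: 3x3 kernels, 3 input maps, 128 output maps
print(layer_params(3, 3, 128))    # 3456
# Table 1, layer C2: 3x3 kernels, 128 -> 128 maps
print(layer_params(3, 128, 128))  # 147456
```

Summing `layer_params` over all weighted layers reproduces the per-model totals compared later in Table 4.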

4. Results and Discussion

The entire experiment is performed on a desktop computer with an Intel® Core i7 CPU (3.33 GHz) and 56 GB of memory, using Keras with Theano on the backend in a Linux environment. We evaluate the state-of-the-art DCNN models on three datasets from CMATERdb (available at https://code.google.com/archive/p/cmaterdb) containing Bangla handwritten digits, alphabets, and special characters.

Figure 7: A 4-layer dense block with a growth rate of k = 3. Each layer applies BN-ReLU-Conv and takes all of the preceding feature maps as input; the block is followed by a transition layer.

Figure 6: An example of the FractalNet architecture: convolution layers, joining layers, and a pooling layer composed into a fractal block mapping z to f(z).


The statistics of the three datasets used in this paper are summarized in Table 2. For convenience, we name the datasets Digit-10, Alphabet-50, and SpecialChar-13, respectively. All images are rescaled to 32 × 32 pixels in our experiments.

4.1. Bangla Handwritten Digit Dataset. The standard samples of the numerals with the corresponding Arabic numerals are shown in Figure 8. The performance of both DBN and CNN is evaluated on a Bangla handwritten benchmark dataset called CMATERdb 3.1.1 [22]. This dataset contains 6000 images of unconstrained handwritten isolated Bangla numerals. Each digit has 600 images, which are rescaled to 32 × 32 pixels. Some sample images from the database are shown in Figure 9. Visual inspection shows that there is no visible noise; however, variability in writing style is quite high due to user dependency. In our experiments, the dataset is split into a training set and a test set for the evaluation of the different DCNN models. The training set consists of 4000 images (400 randomly selected images of each digit). The remaining 2000 images are used for testing.
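The per-digit split described above (400 of each digit's 600 images for training, the rest for testing) can be sketched as follows; the flat index layout and the fixed seed are illustrative assumptions, not the paper's actual code:

```python
import random

def per_class_split(samples_per_class, train_per_class, num_classes, seed=0):
    """Split indices class-by-class, e.g., 400 of each digit's 600 images.

    Assumes a flat index space where class c occupies indices
    [c * samples_per_class, (c + 1) * samples_per_class).
    Returns (train_indices, test_indices).
    """
    rng = random.Random(seed)
    train, test = [], []
    for c in range(num_classes):
        base = c * samples_per_class
        idx = list(range(base, base + samples_per_class))
        rng.shuffle(idx)
        train.extend(idx[:train_per_class])
        test.extend(idx[train_per_class:])
    return train, test

train, test = per_class_split(600, 400, 10)
print(len(train), len(test))  # 4000 2000
```

Splitting per class rather than over the whole pool keeps both sets balanced at exactly 400/200 images per digit.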

Figure 10 shows the training loss of all DCNN models during 250 epochs. It can be observed that FractalNet and DenseNet converge faster compared with the other networks, and the worst convergence is obtained for the All-Conv Network. The validation accuracy is shown in Figure 11, where DenseNet and FractalNet show better recognition accuracy among all DCNN models. Finally, the testing accuracy of all the DCNN models is shown in Figure 12. From the results, it can clearly be seen that DenseNet provides the best recognition accuracy compared with the other networks.

4.2. Bangla Handwritten Alphabet-50. In our implementation, the basic fifty alphabets, including 11 vowels and 39 consonants, are considered. The samples of the 39 consonant and 11 vowel characters are shown in Figures 13(a) and 13(b), respectively. The Alphabet-50 dataset contains 15000 samples, where 12000 are used for training and the remaining 3000 samples are used for testing. Since the dataset contains samples with different dimensions, we rescale all input images to 32 × 32 pixels for better fitting to the convolutional operation. Some randomly selected samples from this database are shown in Figure 14.

Table 1: Parameter specification in the All-Conv model.

Layers | Operations | Feature maps | Size of feature maps | Size of kernels | Number of parameters
Inputs | — | — | 32 × 32 × 3 | — | —
C1 | Convolution | 128 | 30 × 30 | 3 × 3 | 3456
C2 | Convolution | 128 | 28 × 28 | 3 × 3 | 147456
S1 | Max-pooling | 128 | 14 × 14 | 2 × 2 | NA
C3 | Convolution | 256 | 12 × 12 | 3 × 3 | 294912
C4 | Convolution | 256 | 10 × 10 | 3 × 3 | 589824
S2 | Max-pooling | 256 | 5 × 5 | 2 × 2 | NA
C5 | Convolution | 512 | 3 × 3 | 3 × 3 | 1179648
C6 | Convolution | 512 | 3 × 3 | 1 × 1 | 262144
GAP1 | GAP | 512 | 3 × 3 | NA | NA
Outputs | Softmax | 10 | 1 × 1 | NA | 5120

Table 2: Statistics of the databases used in our experiments.

Dataset | Training samples | Testing samples | Total samples | Number of classes
Digit-10 | 4000 | 2000 | 6000 | 10
Alphabet-50 | 12000 | 3000 | 15000 | 50
SpecialChar-13 | 2196 | 935 | 2231 | 13

Figure 8: The first row shows the actual Bangla digits, and the second row shows the corresponding Arabic numerals.

Figure 9: Sample handwritten Bangla numeral images from the CMATERdb 3.1.1 database, including digits from 1 to 10.

Figure 10: Training loss of different architectures (All-Conv, VGG-16, NiN, ResNet, FractalNet, and DenseNet) for Bangla handwritten 1–10 digits.


The training loss for the different DCNN models is shown in Figure 15. It is clear that DenseNet shows the best convergence compared with the other DCNN approaches. Similar to the previous experiment, All-Conv shows the worst convergence behavior. In addition, unexpected convergence behavior is observed in the case of the NiN model. However, all DCNN models tend to converge after 200 epochs. The corresponding validation accuracy on Alphabet-50 is shown in Figure 16. DenseNet again shows superior validation accuracy compared with the other DCNN approaches.

Figure 17 shows the testing results on handwritten Alphabet-50. DenseNet shows the best testing accuracy, with a recognition rate of 98.31%. On the other hand, the All-Conv Net provides around 94.31% testing accuracy, which is the lowest among all the DCNN models.

4.3. Bangla Handwritten Special Characters. There are several special characters (SpecialChar-13), which are equivalent representations of vowels that are combined with consonants to make meaningful words. In our evaluation, we use 13 special characters: 11 for vowels and two additional special characters. Some samples of Bangla special characters are shown in Figure 18. It can be seen that the quality of the samples is poor, and significant variation within the same symbols makes this recognition task even more difficult.

The training loss and validation accuracy for SpecialChar-13 are shown in Figures 19 and 20, respectively. From these results, it can be seen that DenseNet provides better performance, with lower loss and the highest validation accuracy among all DCNN models. Figure 21 shows the testing accuracy of the DCNN models on the SpecialChar-13 dataset. It is observed from Figure 21 that DenseNet shows the highest testing accuracy, with the lowest training loss and very fast convergence. On the other hand, the VGG-19 network shows promising recognition accuracy as well.

4.4. Performance Comparison. The testing performance is compared with several existing methods; the results are presented in Table 3. The experimental results show that the modern DCNN models, including DenseNet, FractalNet, and ResNet, provide better testing accuracy than the other deep learning approaches and the previously proposed classical methods. Overall, DenseNet provides 99.13% testing accuracy for handwritten digit recognition, which is, to the best of our knowledge, the best accuracy that has been publicly reported. In the case of 50-alphabet recognition, DenseNet yields 98.31% recognition accuracy, which is almost 2.5% better than the method in [55]. As far as we know, this is the highest accuracy for handwritten Bangla 50-alphabet recognition. In addition, on the 13-special-character recognition task, the DCNNs show promising recognition accuracy; in particular, DenseNet achieves the best accuracy, at 98.18%.

Figure 11: Validation accuracy of different architectures (All-Conv, VGG-16, NiN, ResNet, FractalNet, and DenseNet) for Bangla handwritten 1–10 digits.

Figure 12: Testing accuracy for Bangla handwritten digit recognition: VGG Net 97.57%, All-Conv 97.08%, NiN 97.36%, ResNet 98.51%, FractalNet 98.92%, and DenseNet 99.13%.

Figure 13: Example images of handwritten characters: (a) Bangla consonant characters and (b) vowels.

8 Computational Intelligence and Neuroscience

4.5. Parameter Evaluation. For an impartial comparison, we have trained and tested the networks with the same optimized numbers of parameters as in the references. Table 4 shows the number of parameters used in the different networks for 50-alphabet recognition. The numbers of network parameters for digit and special-character recognition were the same, except for the number of neurons in the classification layer.

4.6. Computation Time. We also calculate the computational cost for all methods, although the computation time depends on the complexity of the architecture. Table 5 presents the computational time per epoch (in seconds) during training of all the networks for the Digit-10, Alphabet-50, and SpecialChar-13 recognition tasks. From Table 5, it can be seen that DenseNet takes the longest time during training due to its dense structure but yields the best accuracy.

Figure 14: Randomly selected handwritten characters of the Bangla alphabet from the Bangla handwritten Alphabet-50 dataset.

Figure 15: Training loss of different DCNN models (All-Conv, VGG, NiN, ResNet, FractalNet, and DenseNet) for Bangla handwritten Alphabet-50.

Figure 16: Validation accuracy of different architectures (All-Conv, VGG, NiN, ResNet, FractalNet, and DenseNet) for Bangla handwritten Alphabet-50.


Figure 17: Testing accuracy for handwritten 50-alphabet recognition using different DCNN techniques: VGG Net 97.56%, All-Conv 94.31%, NiN 96.73%, ResNet 97.33%, FractalNet 97.87%, and DenseNet 98.31%.

Figure 18: Randomly selected images of special characters from the dataset.

Figure 19: Training loss of different architectures (All-Conv, VGG-19, NiN, ResNet, FractalNet, and DenseNet) for the 13 Bangla special characters (SpecialChar-13).


5. Conclusions

In this research, we investigated the performance of several popular deep convolutional neural networks (DCNNs) for handwritten Bangla character (i.e., digit, alphabet, and special character) recognition. The experimental results indicate that DenseNet is the best performer in classifying Bangla digits, alphabets, and special characters. Specifically, we achieved recognition rates of 99.13% for handwritten Bangla digits, 98.31% for the handwritten Bangla alphabet, and 98.18% for special-character recognition using DenseNet. To the best of our knowledge, these are the best recognition results on the CMATERdb dataset. In the future, fusion-based DCNN models, such as the Inception Recurrent Convolutional Neural Network (IRCNN) [47], will be explored and developed for handwritten Bangla character recognition.

Figure 20: Validation accuracy of different architectures (All-Conv, VGG, NiN, ResNet, FractalNet, and DenseNet) for the 13 Bangla special characters (SpecialChar-13).

Figure 21: Testing accuracy of different architectures (VGG Net, All-Conv, NiN, ResNet, FractalNet, and DenseNet) for the 13 Bangla special characters (SpecialChar-13); DenseNet achieves the highest accuracy at 98.18%.

Table 4: Comparison of the number of parameters.

Models | Number of parameters
VGG-16 [9] | ~8.43M
All-Conv Net [10] | ~2.26M
NiN [11] | ~2.81M
ResNet [12] | ~5.63M
FractalNet [13] | ~7.84M
DenseNet [14] | ~4.25M

Table 5: Computational time (in seconds) per epoch for the different DCNN models on Digit-10, Alphabet-50, and SpecialChar-13.

Models | Digit-10 | Alphabet-50 | SpecialChar-13
VGG-16 [9] | 32 | 83 | 15
All-Conv Net [10] | 7 | 23 | 4
NiN [11] | 9 | 27 | 5
ResNet [12] | 64 | 154 | 34
FractalNet [13] | 32 | 102 | 18
DenseNet [14] | 95 | 210 | 58

Table 3: Testing accuracy of the VGG-16 Network, All-Conv Network, NiN, ResNet, FractalNet, and DenseNet on Digit-10, Alphabet-50, and SpecialChar-13, and comparison against other existing methods.

Types | Methods | Digit-10 (%) | Alphabet-50 (%) | SpecialChar-13 (%)
Existing approaches | MLP [7] | 96.67 | — | —
Existing approaches | MPCA + QTLR [56] | 98.55 | — | —
Existing approaches | GA [22] | 97.00 | — | —
Existing approaches | LeNet + DBN [57] | 98.64 | — | —
DCNN | VGG Net [9] | 97.57 | 97.56 | 96.15
DCNN | All-Conv [10] | 97.08 | 94.31 | 95.58
DCNN | NiN [11] | 97.36 | 96.73 | 97.24
DCNN | ResNet [12] | 98.51 | 97.33 | 97.64
DCNN | FractalNet [13] | 98.92 | 97.87 | 97.98
DCNN | DenseNet [14] | 99.13 | 98.31 | 98.18


Data Availability

The data used to support the findings of this study are available at https://code.google.com/archive/p/cmaterdb.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

[1] D. C. Ciresan, U. Meier, L. M. Gambardella, and J. Schmidhuber, "Deep, big, simple neural nets for handwritten digit recognition," Neural Computation, vol. 22, no. 12, pp. 3207–3220, 2010.

[2] U. Meier, D. C. Ciresan, L. M. Gambardella, and J. Schmidhuber, "Better digit recognition with a committee of simple neural nets," in Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), pp. 1250–1254, Beijing, China, September 2011.

[3] W. Song, S. Uchida, and M. Liwicki, "Comparative study of part-based handwritten character recognition methods," in Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), pp. 814–818, Beijing, China, September 2011.

[4] H. A. Khan, A. Al Helal, and K. I. Ahmed, "Handwritten Bangla digit recognition using sparse representation classifier," in Proceedings of the 2014 International Conference on Informatics, Electronics and Vision (ICIEV), pp. 1–6, IEEE, Dhaka, Bangladesh, May 2014.

[5] U. Pal and B. Chaudhuri, "Automatic recognition of unconstrained off-line Bangla handwritten numerals," in Proceedings of Advances in Multimodal Interfaces–ICMI 2000, pp. 371–378, Springer, Beijing, China, October 2000.

[6] N. Das, B. Das, R. Sarkar, S. Basu, M. Kundu, and M. Nasipuri, "Handwritten Bangla basic and compound character recognition using MLP and SVM classifier," 2010, http://arxiv.org/abs/1002.4040.

[7] S. Basu, N. Das, R. Sarkar, M. Kundu, M. Nasipuri, and D. K. Basu, "An MLP based approach for recognition of handwritten Bangla numerals," 2012, http://arxiv.org/abs/1203.0876.

[8] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.

[9] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," 2014, http://arxiv.org/abs/1409.1556.

[10] J. T. Springenberg, A. Dosovitskiy, T. Brox, and M. Riedmiller, "Striving for simplicity: the all convolutional net," 2014, http://arxiv.org/abs/1412.6806.

[11] M. Lin, Q. Chen, and S. Yan, "Network in network," 2013, http://arxiv.org/abs/1312.4400.

[12] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778, Las Vegas, NV, USA, June–July 2016.

[13] G. Larsson, M. Maire, and G. Shakhnarovich, "FractalNet: ultra-deep neural networks without residuals," 2016, http://arxiv.org/abs/1605.07648.

[14] G. Huang, Z. Liu, K. Q. Weinberger, and L. van der Maaten, "Densely connected convolutional networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 1, no. 2, p. 3, Honolulu, HI, USA, July 2017.

[15] B. Chaudhuri and U. Pal, "A complete printed Bangla OCR system," Pattern Recognition, vol. 31, no. 5, pp. 531–549, 1998.

[16] U. Pal, On the development of an optical character recognition (OCR) system for printed Bangla script, Ph.D. dissertation, Indian Statistical Institute, Kolkata, India, 1997.

[17] U. Pal and B. Chaudhuri, "Indian script character recognition: a survey," Pattern Recognition, vol. 37, no. 9, pp. 1887–1899, 2004.

[18] C.-L. Liu and C. Y. Suen, "A new benchmark on the recognition of handwritten Bangla and Farsi numeral characters," Pattern Recognition, vol. 42, no. 12, pp. 3287–3295, 2009.

[19] B. Chaudhuri, "A complete handwritten numeral database of Bangla–a major Indic script," in Proceedings of the Tenth International Workshop on Frontiers in Handwriting Recognition, Suvisoft, Baule, France, October 2006.

[20] O. Surinta, L. Schomaker, and M. Wiering, "A comparison of feature and pixel-based methods for recognizing handwritten Bangla digits," in Proceedings of the 12th International Conference on Document Analysis and Recognition (ICDAR), pp. 165–169, IEEE, Buffalo, NY, USA, 2013.

[21] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.

[22] N. Das, R. Sarkar, S. Basu, M. Kundu, M. Nasipuri, and D. K. Basu, "A genetic algorithm based region sampling for selection of local features in handwritten digit recognition application," Applied Soft Computing, vol. 12, no. 5, pp. 1592–1606, 2012.

[23] J.-W. Xu, J. Xu, and Y. Lu, "Handwritten Bangla digit recognition using hierarchical Bayesian network," in Proceedings of the 3rd International Conference on Intelligent System and Knowledge Engineering, vol. 1, pp. 1096–1099, IEEE, Xiamen, China, November 2008.

[24] D. E. Rumelhart, J. L. McClelland, P. R. Group et al., Parallel Distributed Processing, Vol. 1, MIT Press, Cambridge, MA, USA, 1987.

[25] I.-J. Kim and X. Xie, "Handwritten Hangul recognition using deep convolutional neural networks," International Journal on Document Analysis and Recognition, vol. 18, no. 1, pp. 1–13, 2015.

[26] M. M. Rahman, M. Akhand, S. Islam, P. C. Shill, and M. H. Rahman, "Bangla handwritten character recognition using convolutional neural network," International Journal of Image, Graphics and Signal Processing, vol. 7, no. 8, pp. 42–49, 2015.

[27] D. Ciregan and U. Meier, "Multi-column deep neural networks for offline handwritten Chinese character classification," in Proceedings of the 2015 International Joint Conference on Neural Networks (IJCNN), pp. 1–6, IEEE, Killarney, Ireland, July 2015.

[28] A. Voulodimos, N. Doulamis, A. Doulamis, and E. Protopapadakis, "Deep learning for computer vision: a brief review," Computational Intelligence and Neuroscience, vol. 2018, Article ID 7068349, 13 pages, 2018.

[29] G. E. Hinton, S. Osindero, and Y.-W. Teh, "A fast learning algorithm for deep belief nets," Neural Computation, vol. 18, no. 7, pp. 1527–1554, 2006.

[30] P. Vincent, H. Larochelle, Y. Bengio, and P.-A. Manzagol, "Extracting and composing robust features with denoising autoencoders," in Proceedings of the 25th International Conference on Machine Learning, pp. 1096–1103, ACM, Helsinki, Finland, July 2008.

[31] Y. Bengio et al., "Learning deep architectures for AI," Foundations and Trends® in Machine Learning, vol. 2, no. 1, pp. 1–127, 2009.

[32] J. Kim, T.-T.-H. Le, and H. Kim, "Nonintrusive load monitoring based on advanced deep learning and novel signature," Computational Intelligence and Neuroscience, vol. 2017, Article ID 4216281, 22 pages, 2017.

[33] E. Protopapadakis, A. Voulodimos, A. Doulamis, N. Doulamis, D. Dres, and M. Bimpas, "Stacked autoencoders for outlier detection in over-the-horizon radar signals," Computational Intelligence and Neuroscience, vol. 2017, Article ID 5891417, 11 pages, 2017.

[34] H. Greenspan, B. van Ginneken, and R. M. Summers, "Guest editorial: deep learning in medical imaging: overview and future promise of an exciting new technique," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1153–1159, 2016.

[35] D. Shen, G. Wu, and H.-I. Suk, "Deep learning in medical image analysis," Annual Review of Biomedical Engineering, vol. 19, pp. 221–248, 2017.

[36] H.-C. Shin, H. R. Roth, M. Gao et al., "Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1285–1298, 2016.

[37] S. Sladojevic, M. Arsenovic, A. Anderla, D. Culibrk, and D. Stefanovic, "Deep neural networks based recognition of plant diseases by leaf image classification," Computational Intelligence and Neuroscience, vol. 2016, Article ID 3289801, 11 pages, 2016.

[38] S. P. Mohanty, D. P. Hughes, and M. Salathe, "Using deep learning for image-based plant disease detection," Frontiers in Plant Science, vol. 7, p. 1419, 2016.

[39] G. Wang, Y. Sun, and J. Wang, "Automatic image-based plant disease severity estimation using deep learning," Computational Intelligence and Neuroscience, vol. 2017, Article ID 2917536, 8 pages, 2017.

[40] L. Zhang, L. Zhang, and B. Du, "Deep learning for remote sensing data: a technical tutorial on the state of the art," IEEE Geoscience and Remote Sensing Magazine, vol. 4, no. 2, pp. 22–40, 2016.

[41] A. Romero, C. Gatta, and G. Camps-Valls, "Unsupervised deep feature extraction for remote sensing image classification," IEEE Transactions on Geoscience and Remote Sensing, vol. 54, no. 3, pp. 1349–1362, 2016.

[42] K. Nogueira, O. A. Penatti, and J. A. dos Santos, "Towards better exploiting convolutional neural networks for remote sensing scene classification," Pattern Recognition, vol. 61, pp. 539–556, 2017.

[43] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440, Boston, MA, USA, June 2015.

[44] V. Badrinarayanan, A. Kendall, and R. Cipolla, "SegNet: a deep convolutional encoder-decoder architecture for image segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 12, pp. 2481–2495, 2017.

[45] S. Ji, W. Xu, M. Yang, and K. Yu, "3D convolutional neural networks for human action recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 221–231, 2013.

[46] C. Dong, C. C. Loy, K. He, and X. Tang, "Image super-resolution using deep convolutional networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 2, pp. 295–307, 2016.

[47] C. Farabet, C. Couprie, L. Najman, and Y. LeCun, "Learning hierarchical features for scene labeling," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 8, pp. 1915–1929, 2013.

[48] K. Zhang, Q. Liu, Y. Wu, and M.-H. Yang, "Robust visual tracking via convolutional networks without training," IEEE Transactions on Image Processing, vol. 25, no. 4, pp. 1779–1792, 2016.

[49] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proceedings of Advances in Neural Information Processing Systems, pp. 1097–1105, Lake Tahoe, NV, USA, December 2012.

[50] M. Matsugu, K. Mori, Y. Mitari, and Y. Kaneda, "Subject independent facial expression recognition with robust face detection using a convolutional neural network," Neural Networks, vol. 16, no. 5-6, pp. 555–559, 2003.

[51] C. Szegedy, W. Liu, Y. Jia et al., "Going deeper with convolutions," in Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1–9, Boston, MA, USA, June 2015.

[52] O. Russakovsky, J. Deng, H. Su et al., "ImageNet large scale visual recognition challenge," International Journal of Computer Vision, vol. 115, no. 3, pp. 211–252, 2015.

[53] V. Nair and G. E. Hinton, "Rectified linear units improve restricted Boltzmann machines," in Proceedings of the 27th International Conference on Machine Learning (ICML-10), pp. 807–814, Haifa, Israel, June 2010.

[54] S. Ioffe and C. Szegedy, "Batch normalization: accelerating deep network training by reducing internal covariate shift," in Proceedings of the International Conference on Machine Learning, pp. 448–456, Lille, France, July 2015.

[55] U. Bhattacharya, M. Shridhar, S. K. Parui, P. Sen, and B. Chaudhuri, "Offline recognition of handwritten Bangla characters: an efficient two-stage approach," Pattern Analysis and Applications, vol. 15, no. 4, pp. 445–458, 2012.

[56] N. Das, J. M. Reddy, R. Sarkar et al., "A statistical–topological feature combination for recognition of handwritten numerals," Applied Soft Computing, vol. 12, no. 8, pp. 2486–2495, 2012.

[57] M. Z. Alom, P. Sidike, T. M. Taha, and V. K. Asari, "Handwritten Bangla digit recognition using deep learning," 2017, http://arxiv.org/abs/1705.02680.



The max-pooling layer of a CNN is very effective in absorbing shape variations. Moreover, sparse connections with tied weights make a CNN involve far fewer parameters than a fully connected network of similar size. Most importantly, a CNN is trainable with gradient-based learning algorithms and suffers less from the diminishing gradient problem. Given that the gradient-based algorithm trains the whole network to minimize an error criterion directly, a CNN can produce highly optimized weights and good generalization performance [50].

The overall architecture of a CNN, as shown in Figure 2, consists of two main parts: a feature extractor and a classifier. In the feature extraction unit, each layer of the network receives the output of its immediate previous layer as input and passes its output as input to the immediate next layer, whereas the classification part generates the predicted outputs associated with the input data. The two basic layers in a CNN architecture are the convolution and pooling [8] layers. In the convolution layer, each node extracts features from the input images by a convolution operation on the input nodes. The outputs of the (l − 1)th layer are used as inputs for the lth layer, where the inputs go through a set of kernels followed by the nonlinear function ReLU. Here, f refers to the ReLU activation function. For example, if x_i^{l−1} are the inputs from the (l − 1)th layer, k_{ij}^l are the kernels of the lth layer, and b_j^l are the biases of the lth layer, then the convolution operation can be expressed as

x_j^l = f(x_i^{l−1} * k_{ij}^l + b_j^l), (1)
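The convolution step in (1) can be sketched naively in pure Python, for illustration only (written as cross-correlation, the usual CNN convention, with ReLU as the activation f):

```python
def conv2d_valid(image, kernel, bias=0.0, f=lambda v: max(v, 0.0)):
    """Naive 'valid' convolution with an activation f, as in eq. (1).

    image, kernel: 2D lists. Returns a 2D feature map of size
    (H - kH + 1) x (W - kW + 1).
    """
    H, W = len(image), len(image[0])
    kH, kW = len(kernel), len(kernel[0])
    out = []
    for i in range(H - kH + 1):
        row = []
        for j in range(W - kW + 1):
            s = sum(image[i + u][j + v] * kernel[u][v]
                    for u in range(kH) for v in range(kW))
            row.append(f(s + bias))
        out.append(row)
    return out

# A 32x32 input with a 5x5 kernel yields a 28x28 feature map
img = [[1.0] * 32 for _ in range(32)]
ker = [[1.0] * 5 for _ in range(5)]
out = conv2d_valid(img, ker)
print(len(out), len(out[0]))  # 28 28
```

The 32 → 28 shrinkage here is exactly the M = (N − F)/S + 1 relation used later in Section 3.2.7.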

The subsampling or pooling layer abstracts features through an average or maximum operation on the input nodes. For example, if a 2 × 2 down-sampling kernel is applied, then each output dimension will be half of the corresponding input dimension for all the inputs. The pooling operation can be stated as follows:

x_j^l = down(x_i^{l-1}). (2)
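As a concrete illustration of (1) and (2), the following minimal single-channel NumPy sketch applies a "valid" convolution (bias and activation omitted) and a 2 × 2 max-pooling. The 32 × 32 input and the 28 × 28 and 14 × 14 outputs match the first two stages of Figure 2; the 5 × 5 kernel size is an assumption consistent with those dimensions.

```python
import numpy as np

def conv2d_valid(x, k):
    """'Valid' 2-D convolution of one feature map x with one kernel k;
    the bias and the ReLU f(.) of Equation (1) are omitted for brevity."""
    n, f = x.shape[0], k.shape[0]
    m = n - f + 1                          # output size, cf. Equation (6) with S = 1
    out = np.zeros((m, m))
    for i in range(m):
        for j in range(m):
            out[i, j] = np.sum(x[i:i + f, j:j + f] * k)
    return out

def max_pool_2x2(x):
    """2 x 2 max-pooling of Equation (2): each output dimension is half the input's."""
    n = x.shape[0] // 2
    return x.reshape(n, 2, n, 2).max(axis=(1, 3))

x = np.random.rand(32, 32)                 # a 32 x 32 input map, as in Figure 2
fmap = conv2d_valid(x, np.random.rand(5, 5))
print(fmap.shape)                          # (28, 28)
print(max_pool_2x2(fmap).shape)            # (14, 14)
```

A real CNN applies many such kernels per layer (one per feature map) and sums over input channels; this sketch keeps a single channel to make the dimension bookkeeping visible.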

In contrast to traditional neural networks, CNN extracts low- to high-level features. The higher-level features can be derived from the features propagated from the lower-level layers. As the features propagate to the highest layer, the dimension of the features is reduced depending on the size of the convolution and pooling masks. However, the number of feature maps is usually increased for selecting or mapping the most suitable features of the input images for better classification accuracy. The outputs of the last layer of the CNN are used as inputs to the fully connected network, which typically uses a Softmax operation to produce the classification outputs. For an input sample x, weight vector w, and K distinct linear functions, the Softmax operation can be defined for the ith class as follows:

P(y = i | x) = exp(x^T w_i) / Σ_{k=1}^{K} exp(x^T w_k). (3)
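Equation (3) can be checked numerically with a small NumPy sketch; the max-subtraction is a standard numerical-stability step that leaves the ratio unchanged, not part of the equation itself.

```python
import numpy as np

def softmax_probs(x, W):
    """Equation (3): P(y = i | x) = exp(x^T w_i) / sum_k exp(x^T w_k).
    x is a feature vector; column k of W is the weight vector w_k of class k."""
    logits = x @ W
    logits = logits - logits.max()     # shift for numerical stability (ratio unchanged)
    e = np.exp(logits)
    return e / e.sum()

p = softmax_probs(np.random.rand(8), np.random.rand(8, 10))
print(p.shape)                         # (10,) -- one probability per class
```

The outputs are positive and sum to one, so they can be read directly as class probabilities for the K = 10 digit classes.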

There are different variants of the DCNN architecture that have been proposed over the last few years. The following sections discuss six popular DCNN models.

Figure 1: Applications of handwritten character recognition: (a) national ID number recognition, (b) post office automation with code number recognition on envelopes, and (c) automatic license plate recognition.

3.2. CNN Variants. As far as CNN architecture is concerned, there are some important and fundamental components that are used to construct an efficient DCNN architecture. These components are the convolution layer, pooling layer, fully connected layer, and

Computational Intelligence and Neuroscience 3

Softmax layer. The advanced architecture of this type of network consists of a stack of convolutional and max-pooling layers followed by fully connected and Softmax layers at the end. Noticeable examples of such networks include LeNet [8], AlexNet [49], VGG Net, All-Conv, and NiN. Other alternative and advanced architectures have been proposed, including GoogleNet with inception layers [51], ResNet, FractalNet, and DenseNet, although there are some topological differences among the modern architectures. Among the many DCNN architectures, AlexNet, VGG Net, GoogleNet, ResNet, DenseNet, and FractalNet can be viewed as the most popular, owing to their strong performance on different benchmarks for object classification. Some of these models are designed especially for large-scale implementation, such as ResNet and GoogleNet, whereas VGG Net has a more general architecture. FractalNet is an alternative to ResNet. In contrast, DenseNet's architecture is unique in terms of unit connectivity: every layer is directly connected to all subsequent layers. In this paper, we provide a review and comparative study of All-Conv, NiN, VGG-16, ResNet, FractalNet, and DenseNet for Bangla character recognition. A basic overview of these architectures is given in the following sections.

3.2.1. VGG-16. The Visual Geometry Group (VGG) network was the runner-up of the ImageNet Large Scale Visual Recognition Competition (ILSVRC) in 2014 [52]. In this architecture, two convolutional layers are used consecutively with a rectified linear unit (ReLU) [53] activation function, followed by a single max-pooling layer, several fully connected layers with ReLU, and Softmax as the final layer. There are three types of VGG Net based on the architecture. These three networks contain 11, 16, and 19 layers and are named VGG-11, VGG-16, and VGG-19, respectively. The basic structure of the VGG-11 architecture contains eight convolution layers, one max-pooling layer, and three fully connected (FC) layers followed by a single Softmax layer. The configuration of VGG-16 is as follows: convolution and max-pooling layers: 13; max-pooling layer: 1; FC layers: 3; and Softmax layer: 1, with 138 million weights in total. VGG-19 consists of 16 convolutional layers, one max-pooling layer, and 3 FC layers followed by a Softmax layer. The basic building blocks of the VGG architecture are shown in Figure 3. In this implementation, we use a VGG-16 network with fewer feature maps in the convolutional layers compared with the standard VGG-16 network.

3.2.2. All Convolutional Network (All-Conv). The layer specification of All-Conv is given in Figure 4. The basic architecture is composed of two convolutional layers followed by a max-pooling layer. Instead of a fully connected layer, global average pooling (GAP) [11] with a dimension of 6 × 6 is used. Finally, a Softmax layer is used for classification. The output dimension is assigned based on the number of classes.

3.2.3. Network in Network (NiN). This model is quite different from the aforementioned DCNN models due to the following properties [11]:

(i) It uses multilayer convolution, where convolution is performed with 1 × 1 filters.

(ii) It uses GAP instead of a fully connected layer.

The concept of using 1 × 1 convolution helps to increase the depth of the network. GAP significantly changes the network structure and is nowadays often used as a replacement for fully connected layers. The GAP on a large feature map is used to generate a final low-dimensional feature vector directly, instead of reducing the feature map to a small size and then flattening it into a feature vector.
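A minimal NumPy sketch of GAP follows: each of the C feature maps is collapsed to its mean, so a C × H × W stack becomes a length-C vector that can feed a Softmax classifier directly. The 512 × 6 × 6 shape below is illustrative, not taken from a specific layer of this paper's networks.

```python
import numpy as np

def global_average_pooling(fmaps):
    """Collapse each feature map to its mean: a stack of C maps of any
    spatial size becomes a length-C vector, replacing the flatten + FC step."""
    return fmaps.mean(axis=(1, 2))

fmaps = np.random.rand(512, 6, 6)            # illustrative: 512 maps entering a 6 x 6 GAP
print(global_average_pooling(fmaps).shape)   # (512,)
```

Because GAP has no weights, it adds no trainable parameters, which is one reason NiN and All-Conv are comparatively small networks (cf. Table 4).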

Figure 2: Basic CNN architecture for digit recognition. A 32 × 32 input passes through a feature-extraction stage (convolution to 32 feature maps of 28 × 28, max-pooling to 32 maps of 14 × 14, convolution to 64 maps of 10 × 10, and max-pooling to 64 maps of 5 × 5) before the classification stage produces the outputs.


3.2.4. Residual Network (ResNet). The ResNet architecture has become very popular in the computer vision community. ResNet variants have been experimented with different numbers of layers, for example: convolution layers: 49 (with 34, 152, and 1202 layers in other versions of ResNet); fully connected layers: 1; weights: 25.5M. The basic block diagram of the ResNet architecture is shown in Figure 5. If the input of the residual block is x_{l-1}, the output of this block is x_l. After performing operations (e.g., convolution with different filter sizes, and batch normalization (BN) [54] followed by an activation function such as ReLU) on x_{l-1}, the output F(x_{l-1}) is produced. The final output of the residual unit is defined as

x_l = F(x_{l-1}) + x_{l-1}. (4)

The Residual Network consists of several basic residual units. Different residual units have been proposed with different types of layers; however, the operations between the residual units vary depending on the architecture, as explained in [12].
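The identity shortcut of Equation (4) can be sketched in a few lines; the stand-in transform F below is hypothetical, since a real residual unit applies convolution, BN, and ReLU as in Figure 5.

```python
import numpy as np

def residual_unit(x, F):
    """Equation (4): x_l = F(x_{l-1}) + x_{l-1}, an identity shortcut
    added to the residual path. F stands in for the Conv-BN-ReLU stack."""
    return F(x) + x

# Toy check with a hypothetical F (real residual units use convolutions):
F = lambda v: np.maximum(0.5 * v, 0.0)     # halve, then ReLU
out = residual_unit(np.array([1.0, -2.0, 3.0]), F)
print(out)                                  # the residual [0.5, 0., 1.5] added to the input
```

The key property is that even if F contributes nothing (F(x) = 0), the unit passes x through unchanged, which is what lets gradients flow through very deep stacks.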

3.2.5. FractalNet. The FractalNet architecture is an advanced alternative to ResNet that is very efficient for designing very large networks with shallow subnetworks, but with shorter paths for the propagation of gradients during training [13]. This concept is based on drop-path, which is another regularization technique for large networks. As a result, this concept helps to enforce a speed-versus-accuracy tradeoff. The basic block diagram of FractalNet is shown in Figure 6. Here, x is the actual input of FractalNet, and z and f(z) are the inputs and outputs of a fractal block, respectively.

3.2.6. Densely Connected Network (DenseNet). DenseNet is a densely connected CNN in which each layer is connected to all previous layers [14]. Therefore, it forms very dense connectivity between the layers, and so it is called DenseNet. DenseNet consists of several dense blocks, and the layers between two adjacent blocks are called transition layers. The conceptual diagram of a dense block is shown in Figure 7. According to the figure, the lth layer receives all the feature maps x_0, x_1, x_2, ..., x_{l-1} from the previous layers as input, which is expressed by

x_l = H_l([x_0, x_1, x_2, ..., x_{l-1}]), (5)

Figure 3: Basic architecture of VGG Net: convolution (Conv) layers with ReLU and max-pooling, FC for fully connected layers, and a Softmax layer at the end.

Figure 4: All convolutional network framework: input 32 × 32; two 3 × 3 convolutions with 128 ReLU maps; 3 × 3 max-pooling with stride 2; two 3 × 3 convolutions with 256 ReLU maps; 3 × 3 max-pooling with stride 2; two 3 × 3 convolutions with 512 ReLU maps; 6 × 6 global average pooling; Softmax.

Figure 5: Basic diagram of the residual block (convolution and ReLU activation layers whose output is added to the block input through the shortcut connection).


where [x_0, x_1, x_2, ..., x_{l-1}] is the concatenation of the feature maps from layers 0, ..., l − 1, treated as a single tensor, and H_l(·) performs three consecutive operations: BN, followed by ReLU and a 3 × 3 convolution. In the transition block, 1 × 1 convolutional operations are performed with BN, followed by a 2 × 2 average pooling layer. This architecture has achieved state-of-the-art accuracy for object recognition on five different competitive benchmarks.
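The connectivity of Equation (5) can be sketched as follows, with a hypothetical stand-in for H_l(·) (the real H_l is the BN-ReLU-convolution stack described above):

```python
import numpy as np

def dense_block(x0, layers):
    """Equation (5): layer l receives the concatenation [x_0, ..., x_{l-1}]
    of every preceding feature map and appends its own output to the list."""
    features = [x0]
    for H in layers:
        features.append(H(np.concatenate(features)))
    return features

# Toy check: each stand-in H_l emits k = 3 new features (the growth rate).
k = 3
H = lambda v: np.full(k, v.mean())
outs = dense_block(np.ones(k), [H, H, H])
print([len(f) for f in outs])      # [3, 3, 3, 3]: a 4-layer block with growth rate 3
```

Note how the input to each successive H_l grows by k features per layer, which is why DenseNet needs transition layers to keep the feature count manageable between blocks.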

3.2.7. Network Parameters. The number of network parameters is a very important criterion to assess the complexity of an architecture, and it can be used to compare different architectures. First, the dimension of the output feature maps can be computed as

M = (N − F)/S + 1, (6)

where N denotes the dimension of the input feature maps, F refers to the dimension of the filters or receptive field, S represents the stride of the convolution, and M is the dimension of the output feature maps. The number of parameters (without biases) of a single layer is obtained by

P_l = (F × F × FM_{l−1}) × FM_l, (7)

where P_l represents the total number of parameters in the lth layer, FM_l is the total number of output feature maps of the lth layer, and FM_{l−1} is the total number of feature maps in the (l − 1)th layer. For example, let a 32 × 32 dimensional (N) image be the input, with filter size (F) 5 × 5 and stride (S) 1 for a convolutional layer. The output dimension (M) of the convolutional layer is then 28 × 28, calculated according to (6). For better illustration, a summary of the parameters used in the All-Conv architecture is shown in Table 1. Note that the number of trainable parameters is zero in the pooling layers.

4. Results and Discussion

The entire experiment is performed on a desktop computer with an Intel® Core i7 CPU (3.33 GHz) and 56.00 GB of memory, using Keras with Theano on the backend in a Linux environment. We evaluate the state-of-the-art DCNN models on three datasets from CMATERdb (available at https://code.google.com/archive/p/cmaterdb) containing Bangla handwritten digits, alphabets, and special characters.

Figure 7: A 4-layer dense block with a growth rate of k = 3. Each of the layers takes all of the preceding feature maps as input and applies BN-ReLU-Conv operations; a transition layer follows the block.

Figure 6: An example of the FractalNet architecture (convolution, joining, and pooling layers; z and f(z) denote the input and output of a fractal block).


The statistics of the three datasets used in this paper are summarized in Table 2. For convenience, we name the datasets Digit-10, Alphabet-50, and SpecialChar-13, respectively. All images are rescaled to 32 × 32 pixels in our experiments.

4.1. Bangla Handwritten Digit Dataset. Standard samples of the numerals with the corresponding Arabic numerals are shown in Figure 8. The performance of both DBN and CNN is evaluated on a Bangla handwritten benchmark dataset called CMATERdb 3.1.1 [22]. This dataset contains 6000 images of unconstrained handwritten isolated Bangla numerals. Each digit has 600 images, rescaled to 32 × 32 pixels. Some sample images from the database are shown in Figure 9. Visual inspection shows no visible noise; however, variability in writing style is quite high due to user dependency. In our experiments, the dataset is split into a training set and a test set for the evaluation of the different DCNN models. The training set consists of 4000 images (400 randomly selected images of each digit), and the remaining 2000 images are used for testing.
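The per-digit split described above can be sketched as follows. This is a hypothetical index-based version for illustration; the actual experiments sample image files from CMATERdb 3.1.1.

```python
import random

def split_per_digit(total_per_digit=600, train_per_digit=400, digits=10, seed=0):
    """For each digit class, pick 400 random samples for training and keep
    the remaining 200 for testing, yielding 4000/2000 overall."""
    rng = random.Random(seed)
    train, test = [], []
    for d in range(digits):
        idx = list(range(total_per_digit))
        rng.shuffle(idx)
        train += [(d, i) for i in idx[:train_per_digit]]
        test += [(d, i) for i in idx[train_per_digit:]]
    return train, test

train, test = split_per_digit()
print(len(train), len(test))     # 4000 2000
```

Splitting per class rather than over the pooled 6000 images keeps both sets balanced at exactly 400/200 samples per digit.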

Figure 10 shows the training loss of all DCNN models during 250 epochs. It can be observed that FractalNet and DenseNet converge faster than the other networks, and the worst convergence is obtained for the All-Conv network. The validation accuracy is shown in Figure 11, where DenseNet and FractalNet show the best recognition accuracy among all DCNN models. Finally, the testing accuracy of all the DCNN models is shown in Figure 12. From the results, it can be clearly seen that DenseNet provides the best recognition accuracy of all the networks.

4.2. Bangla Handwritten Alphabet-50. In our implementation, the basic fifty alphabets, including 11 vowels and 39 consonants, are considered. Samples of the 39 consonant and 11 vowel characters are shown in Figures 13(a) and 13(b), respectively. The Alphabet-50 dataset contains 15000 samples, of which 12000 are used for training and the remaining 3000 for testing. Since the dataset contains samples with different dimensions, we rescale all input images to 32 × 32 pixels to better fit the convolutional operation. Some randomly selected samples from this database are shown in Figure 14.

Table 1: Parameter specification of the All-Conv model.

Layers  | Operations  | Feature maps | Feature-map size | Kernel size | Number of parameters
Inputs  | —           | —            | 32 × 32 × 3      | —           | —
C1      | Convolution | 128          | 30 × 30          | 3 × 3       | 3456
C2      | Convolution | 128          | 28 × 28          | 3 × 3       | 147456
S1      | Max-pooling | 128          | 14 × 14          | 2 × 2       | NA
C3      | Convolution | 256          | 12 × 12          | 3 × 3       | 294912
C4      | Convolution | 256          | 10 × 10          | 3 × 3       | 589824
S2      | Max-pooling | 256          | 5 × 5            | 2 × 2       | NA
C5      | Convolution | 512          | 3 × 3            | 3 × 3       | 1179648
C6      | Convolution | 512          | 3 × 3            | 1 × 1       | 262144
GAP1    | GAP         | 512          | 3 × 3            | NA          | NA
Outputs | Softmax     | 10           | 1 × 1            | NA          | 5120

Table 2: Statistics of the databases used in our experiments.

Dataset        | Training samples | Testing samples | Total samples | Number of classes
Digit-10       | 4000             | 2000            | 6000          | 10
Alphabet-50    | 12000            | 3000            | 15000         | 50
SpecialChar-13 | 2196             | 935             | 3131          | 13

Figure 8: The first row shows the actual Bangla digits, and the second row shows the corresponding Arabic numerals (0-9).

Figure 9: Sample handwritten Bangla numeral images from the CMATERdb 3.1.1 database, including the digits from 1 to 10.

Figure 10: Training loss of the different architectures (All-Conv, VGG-16, NiN, ResNet, FractalNet, and DenseNet) for Bangla handwritten digits 1-10.


The training loss of the different DCNN models is shown in Figure 15. It is clear that DenseNet shows the best convergence compared with the other DCNN approaches. Similar to the previous experiment, All-Conv shows the worst convergence behavior. In addition, unexpected convergence behavior is observed in the case of the NiN model. However, all DCNN models tend to converge after 200 epochs. The corresponding validation accuracy on Alphabet-50 is shown in Figure 16; DenseNet again shows superior validation accuracy compared with the other DCNN approaches.

Figure 17 shows the testing results on handwritten Alphabet-50. DenseNet shows the best testing accuracy, with a recognition rate of 98.31%. On the other hand, the All-Conv network provides around 94.31% testing accuracy, which is the lowest among all the DCNN models.

4.3. Bangla Handwritten Special Characters. There are several special characters (SpecialChar-13), which are equivalent representations of vowels that are combined with consonants to make meaningful words. In our evaluation, we use 13 special characters: those for the 11 vowels plus two additional special characters. Some samples of Bangla special characters are shown in Figure 18. It can be seen that the quality of the samples is poor, and significant variation within the same symbols makes this recognition task even more difficult.

The training loss and validation accuracy for SpecialChar-13 are shown in Figures 19 and 20, respectively. From these results, it can be seen that DenseNet provides better performance, with the lowest loss and the highest validation accuracy among all DCNN models. Figure 21 shows the testing accuracy of the DCNN models on the SpecialChar-13 dataset. It is observed from Figure 21 that DenseNet shows the highest testing accuracy, with the lowest training loss and very fast convergence. On the other hand, the VGG-19 network shows promising recognition accuracy as well.

4.4. Performance Comparison. The testing performance is compared with several existing methods; the results are presented in Table 3. The experimental results show that the modern DCNN models, including DenseNet, FractalNet, and ResNet, provide better testing accuracy than the other deep learning approaches and the previously proposed classical methods. In particular, DenseNet provides 99.13% testing accuracy for handwritten digit recognition, which is the best accuracy that has been publicly reported, to the best of our knowledge. In the case of 50-alphabet recognition, DenseNet yields 98.31% recognition accuracy, which is almost 2.5% better than the method in [55]. As far as we know, this is the highest accuracy for handwritten Bangla 50-alphabet recognition. In addition, on the 13-special-character recognition task, the DCNNs show promising recognition accuracy; in particular, DenseNet achieves the best accuracy, at 98.18%.

Figure 11: Validation accuracy of the different architectures for Bangla handwritten digits 1-10.

Figure 12: Testing accuracy for Bangla handwritten digit recognition (VGG Net 97.57%, All-Conv 97.08%, NiN 97.36%, ResNet 98.51%, FractalNet 98.92%, and DenseNet 99.13%).

Figure 13: Example images of handwritten characters: (a) Bangla consonant characters and (b) vowels.


4.5. Parameter Evaluation. For an impartial comparison, we trained and tested the networks with the same optimized number of parameters as in the references. Table 4 shows the number of parameters used in the different networks for 50-alphabet recognition. The number of network parameters for digit and special character recognition was the same, except for the number of neurons in the classification layer.

4.6. Computation Time. We also calculate the computational cost of all methods, although the computation time depends on the complexity of the architecture. Table 5 presents the computational time per epoch (in seconds) during training of all the networks on the Digit-10, Alphabet-50, and SpecialChar-13 recognition tasks. From Table 5, it can be seen that DenseNet takes the longest time during training due to its dense structure, but it yields the best accuracy.

Figure 14: Randomly selected handwritten Bangla alphabet characters from the Bangla handwritten Alphabet-50 dataset.

Figure 15: Training loss of the different DCNN models for Bangla handwritten Alphabet-50.

Figure 16: Validation accuracy of the different architectures for Bangla handwritten Alphabet-50.


Figure 17: Testing accuracy for handwritten 50-alphabet recognition using different DCNN techniques (VGG Net 97.56%, All-Conv 94.31%, NiN 96.73%, ResNet 97.33%, FractalNet 97.87%, and DenseNet 98.31%).

Figure 18: Randomly selected images of special characters from the dataset.

Figure 19: Training loss of the different architectures for the 13 Bangla special characters (SpecialChar-13).


5. Conclusions

In this research, we investigated the performance of several popular deep convolutional neural networks (DCNNs) for handwritten Bangla character (i.e., digit, alphabet, and special character) recognition. The experimental results indicate that DenseNet is the best performer in classifying Bangla digits, alphabets, and special characters. Specifically, we achieved recognition rates of 99.13% for handwritten Bangla digits, 98.31% for the handwritten Bangla alphabet, and 98.18% for special character recognition using DenseNet. To the best of our knowledge, these are the best recognition results on the CMATERdb dataset. In the future, fusion-based DCNN models, such as the Inception Recurrent Convolutional Neural Network (IRCNN) [47], will be explored and developed for handwritten Bangla character recognition.

Figure 20: Validation accuracy of the different architectures for the 13 Bangla special characters (SpecialChar-13).

Figure 21: Testing accuracy of the different architectures for the 13 Bangla special characters (SpecialChar-13).

Table 4: Comparison of the number of parameters.

Models             | Number of parameters
VGG-16 [9]         | ~8.43M
All-Conv Net [10]  | ~2.26M
NiN [11]           | ~2.81M
ResNet [12]        | ~5.63M
FractalNet [13]    | ~7.84M
DenseNet [14]      | ~4.25M

Table 5: Computational time (in seconds) per epoch for the different DCNN models on Digit-10, Alphabet-50, and SpecialChar-13.

Models             | Digit-10 | Alphabet-50 | SpecialChar-13
VGG-16 [9]         | 32       | 83          | 15
All-Conv Net [10]  | 7        | 23          | 4
NiN [11]           | 9        | 27          | 5
ResNet [12]        | 64       | 154         | 34
FractalNet [13]    | 32       | 102         | 18
DenseNet [14]      | 95       | 210         | 58

Table 3: Testing accuracy of the VGG-16 network, All-Conv network, NiN, ResNet, FractalNet, and DenseNet on Digit-10, Alphabet-50, and SpecialChar-13, compared against other existing methods.

Types               | Methods          | Digit-10 (%) | Alphabet-50 (%) | SpecialChar-13 (%)
Existing approaches | MLP [7]          | 96.67        | —               | —
                    | MPCA + QTLR [56] | 98.55        | —               | —
                    | GA [22]          | 97.00        | —               | —
                    | LeNet + DBN [57] | 98.64        | —               | —
DCNN                | VGG Net [9]      | 97.57        | 97.56           | 96.15
                    | All-Conv [10]    | 97.08        | 94.31           | 95.58
                    | NiN [11]         | 97.36        | 96.73           | 97.24
                    | ResNet [12]      | 98.51        | 97.33           | 97.64
                    | FractalNet [13]  | 98.92        | 97.87           | 97.98
                    | DenseNet [14]    | 99.13        | 98.31           | 98.18


Data Availability

The data used to support the findings of this study are available at https://code.google.com/archive/p/cmaterdb.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

[1] D C Ciresan U Meier L M Gambardella andJ Schmidhuber ldquoDeep big simple neural nets for hand-written digit recognitionrdquo Neural Computation vol 22no 12 pp 3207ndash3220 2010

[2] U Meier D C Ciresan L M Gambardella andJ Schmidhuber ldquoBetter digit recognition with a committee ofsimple neural netsrdquo in Proceedings of International Conferenceon Document Analysis and Recognition (ICDAR) pp 1250ndash1254 Beijing China September 2011

[3] W Song S Uchida and M Liwicki ldquoComparative study ofpart-based handwritten character recognition methodsrdquo inProceedings of International Conference on DocumentAnalysis and Recognition (ICDAR) pp 814ndash818 BeijingChina September 2011

[4] H A Khan A Al Helal and K I Ahmed ldquoHandwrittenBangla digit recognition using sparse representation classi-fierrdquo in Proceedings of 2014 International Conference on In-formatics Electronics and Vision (ICIEV) pp 1ndash6 IEEEDhaka Bangladesh May 2014

[5] U Pal and B Chaudhuri ldquoAutomatic recognition of un-constrained off-line Bangla handwritten numeralsrdquo in Pro-ceedings of Advances in Multimodal InterfacesndashICMI 2000pp 371ndash378 Springer Beijing China October 2000

[6] N Das B Das R Sarkar S Basu M Kundu andM NasipurildquoHandwritten Bangla basic and compound character recog-nition using MLP and SVM classifierrdquo 2010 httparxivorgabs10024040

[7] S Basu N Das R Sarkar M Kundu M Nasipuri andD K Basu ldquoAn MLP based approach for recognition ofhandwritten Bangla numeralsrdquo 2012 httparxivorgabs12030876

[8] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-based learning applied to document recognitionrdquo Proceedingsof the IEEE vol 86 no 11 pp 2278ndash2324 1998

[9] K Simonyan and A Zisserman ldquoVery deep convolutionalnetworks for large-scale image recognitionrdquo 2014 httparxivorgabs14091556

[10] J T Springenberg A Dosovitskiy T Brox and M RiedmillerldquoStriving for simplicity (e all convolutional netrdquo 2014 httparxivorgabs14126806

[11] M Lin Q Chen and S Yan ldquoNetwork in networkrdquo 2013httparxivorgabs13124400

[12] K He X Zhang S Ren and J Sun ldquoDeep residual learningfor image recognitionrdquo in Proceedings of the IEEE Conferenceon Computer Vision and Pattern Recognition pp 770ndash778 LasVegas NV USA June-July 2016

[13] G Larsson M Maire and G Shakhnarovich ldquoFractalnetultra-deep neural networks without residualsrdquo 2016 httparxivorgabs160507648

[14] G Huang Z Liu K Q Weinberger and L van der MaatenldquoDensely connected convolutional networksrdquo in Proceedingsof the IEEE Conference on Computer Vision and PatternRecognition vol 1 no 2 p 3 Honolulu HI USA July 2017

[15] B Chaudhuri and U Pal ldquoA complete printed Bangla OCRsystemrdquo Pattern Recognition vol 31 no 5 pp 531ndash549 1998

[16] U Pal On the development of an optical character recognition(OCR) system for printed Bangla script PhD dissertationIndian Statistical Institute Kolkata India 1997

[17] U Pal and B Chaudhuri ldquoIndian script character recognitiona surveyrdquo Pattern Recognition vol 37 no 9 pp 1887ndash18992004

[18] C-L Liu and C Y Suen ldquoA new benchmark on the rec-ognition of handwritten Bangla and Farsi numeral charac-tersrdquo Pattern Recognition vol 42 no 12 pp 3287ndash3295 2009

[19] B Chaudhuri ldquoA complete handwritten numeral database ofBanglandasha major Indic scriptrdquoin Proceedings of Tenth In-ternational Workshop on Frontiers in Handwriting Recogni-tion Suvisoft Baule France October 2006

[20] O Surinta L Schomaker and M Wiering ldquoA comparison offeature and pixel-based methods for recognizing handwrittenBangla digitsrdquo in Proceedings of 12th International Conferenceon Document Analysis and Recognition (ICDAR) pp 165ndash169IEEE Buffalo NY USA 2013

[21] C Cortes and V Vapnik ldquoSupport-vector networksrdquo Ma-chine Learning vol 20 no 3 pp 273ndash297 1995

[22] N Das R Sarkar S Basu M Kundu M Nasipuri andD K Basu ldquoA genetic algorithm based region sampling forselection of local features in handwritten digit recognitionapplicationrdquo Applied Soft Computing vol 12 no 5pp 1592ndash1606 2012

[23] J-W Xu J Xu and Y Lu ldquoHandwritten Bangla digit rec-ognition using hierarchical Bayesian networkrdquo in Proceedingsof 3rd International Conference on Intelligent System andKnowledge Engineering vol 1 pp 1096ndash1099 IEEE XiamenChina November 2008

[24] D E Rumelhart J L McClelland P R Group et al ParallelDistributed Processing MIT Press Vol 1 MIT PressCambridge MA USA 1987

[25] I-J Kim and X Xie ldquoHandwritten Hangul recognition usingdeep convolutional neural networksrdquo International Journal onDocument Analysis and Recognition vol 18 no 1 pp 1ndash132015

[26] M M Rahman M Akhand S Islam P C Shill andM H Rahman ldquoBangla handwritten character recognitionusing convolutional neural networkrdquo International Journal ofImage Graphics and Signal Processing vol 7 no 8 pp 42ndash492015

[27] D Ciregan and U Meier ldquoMulti-column deep neural net-works for offline handwritten Chinese character classifica-tionrdquo in Proceedings of 2015 International Joint Conference onNeural Networks (IJCNN) pp 1ndash6 IEEE Killarney IrelandJuly 2015

[28] A Voulodimos N Doulamis A Doulamis andE Protopapadakis ldquoDeep learning for computer visiona brief reviewrdquo Computational Intelligence and Neurosci-ence vol 2018 Article ID 7068349 13 pages 2018

[29] G E Hinton S Osindero and Y-W Teh ldquoA fast learningalgorithm for deep belief netsrdquo Neural Computation vol 18no 7 pp 1527ndash1554 2006

[30] P Vincent H Larochelle Y Bengio and P-A ManzagolldquoExtracting and composing robust features with denoisingautoencodersrdquo in Proceedings of the 25th InternationalConference on Machine Learning pp 1096ndash1103 ACMHelsinki Finland July 2008

[31] Y Bengio et al ldquoLearning deep architectures for AIrdquoFoundations and Trendsreg in Machine Learning vol 2 no 1pp 1ndash127 2009

12 Computational Intelligence and Neuroscience

[32] J Kim T-T-H Le and H Kim ldquoNonintrusive load moni-toring based on advanced deep learning and novel signaturerdquoComputational Intelligence and Neuroscience vol 2017Article ID 4216281 22 pages 2017

[33] E Protopapadakis A Voulodimos A Doulamis N DoulamisD Dres and M Bimpas ldquoStacked autoencoders for outlierdetection in over-the-horizon radar signalsrdquo ComputationalIntelligence and Neuroscience vol 2017 Article ID 589141711 pages 2017

[34] H Greenspan B van Ginneken and R M Summers ldquoGuesteditorial deep learning in medical imaging overview andfuture promise of an exciting new techniquerdquo IEEE Trans-actions on Medical Imaging vol 35 no 5 pp 1153ndash11592016

[35] D Shen G Wu and H-I Suk ldquoDeep learning in medicalimage analysisrdquo Annual Review of Biomedical Engineeringvol 19 pp 221ndash248 2017

[36] H-C Shin H R Roth M Gao et al ldquoDeep convolutionalneural networks for computer-aided detection CNN archi-tectures dataset characteristics and transfer learningrdquo IEEETransactions on Medical Imaging vol 35 no 5 pp 1285ndash1298 2016

[37] S Sladojevic M Arsenovic A Anderla D Culibrk andD Stefanovic ldquoDeep neural networks based recognition ofplant diseases by leaf image classificationrdquo ComputationalIntelligence and Neuroscience vol 2016 Article ID 328980111 pages 2016

[38] S P Mohanty D P Hughes and M Salathe ldquoUsing deeplearning for image-based plant disease detectionrdquo Frontiers inPlant Science vol 7 p 1419 2016

[39] G Wang Y Sun and J Wang ldquoAutomatic image-based plantdisease severity estimation using deep learningrdquo Computa-tional Intelligence and Neuroscience vol 2017 Article ID2917536 8 pages 2017

[40] L Zhang L Zhang and B Du ldquoDeep learning for remotesensing data a technical tutorial on the state of the artrdquo IEEEGeoscience and Remote Sensing Magazine vol 4 no 2pp 22ndash40 2016


[53] V. Nair and G. E. Hinton, "Rectified linear units improve restricted Boltzmann machines," in Proceedings of the 27th International Conference on Machine Learning (ICML-10), pp. 807–814, Haifa, Israel, June 2010.

[54] S. Ioffe and C. Szegedy, "Batch normalization: accelerating deep network training by reducing internal covariate shift," in Proceedings of International Conference on Machine Learning, pp. 448–456, Lille, France, July 2015.

[55] U. Bhattacharya, M. Shridhar, S. K. Parui, P. Sen, and B. Chaudhuri, "Offline recognition of handwritten Bangla characters: an efficient two-stage approach," Pattern Analysis and Applications, vol. 15, no. 4, pp. 445–458, 2012.

[56] N. Das, J. M. Reddy, R. Sarkar et al., "A statistical–topological feature combination for recognition of handwritten numerals," Applied Soft Computing, vol. 12, no. 8, pp. 2486–2495, 2012.

[57] M. Z. Alom, P. Sidike, T. M. Taha, and V. K. Asari, "Handwritten Bangla digit recognition using deep learning," 2017, http://arxiv.org/abs/1705.02680.

Computational Intelligence and Neuroscience 13


Softmax layer. The advanced architecture of this type of network consists of a stack of convolutional layers and max-pooling layers, followed by fully connected and Softmax layers at the end. Notable examples of such networks include LeNet [8], AlexNet [49], VGG Net, All-Conv, and NiN. Some alternative and more advanced architectures have also been proposed, including GoogleNet with inception layers [51], ResNet, FractalNet, and DenseNet, although some topological differences are observed among the modern architectures. Of the many DCNN architectures, AlexNet, VGG Net, GoogleNet, ResNet, DenseNet, and FractalNet can be viewed as the most popular, owing to their excellent performance on different benchmarks for object classification. Among these models, some are designed especially for large-scale implementation, such as ResNet and GoogleNet, whereas VGG Net has a more general architecture. FractalNet is an alternative to ResNet. DenseNet's architecture, in contrast, is unique in terms of unit connectivity: every layer is directly connected to all subsequent layers. In this paper, we provide a review and comparative study of All-Conv, NiN, VGG-16, ResNet, FractalNet, and DenseNet for Bangla character recognition. A basic overview of these architectures is given in the following sections.

3.2.1. VGG-16. The Visual Geometry Group (VGG) network was the runner-up of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2014 [52]. In this architecture, two convolutional layers are used consecutively with a rectified linear unit (ReLU) [53] activation function, followed by a single max-pooling layer, several fully connected layers with ReLU, and Softmax as the final layer. There are three types of VGG Net based on the architecture; these networks contain 11, 16, and 19 layers and are named VGG-11, VGG-16, and VGG-19, respectively. The basic VGG-11 architecture contains eight convolutional layers, one max-pooling layer, and three fully connected (FC) layers, followed by a single Softmax layer. The configuration of VGG-16 is as follows: 13 convolutional layers with max-pooling, 3 FC layers, and 1 Softmax layer; the total number of weights is 138 million. VGG-19 consists of 16 convolutional layers, one max-pooling layer, and 3 FC layers followed by a Softmax layer. The basic building blocks of the VGG architecture are shown in Figure 3. In this implementation, we use a VGG-16 network with fewer feature maps in the convolutional layers than the standard VGG-16 network.
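The 138-million-weight figure can be checked directly from the layer configuration with the usual parameter count for convolutional and FC layers. A minimal sketch, assuming the standard VGG-16 configuration for 224 × 224 ImageNet inputs (not the reduced network used in this paper):

```python
# Parameter count for the standard VGG-16 (13 conv layers + 3 FC layers, with biases).
# Each conv entry is (in_channels, out_channels); all conv filters are 3 x 3.
convs = [(3, 64), (64, 64),                   # block 1
         (64, 128), (128, 128),               # block 2
         (128, 256), (256, 256), (256, 256),  # block 3
         (256, 512), (512, 512), (512, 512),  # block 4
         (512, 512), (512, 512), (512, 512)]  # block 5

conv_params = sum((3 * 3 * c_in + 1) * c_out for c_in, c_out in convs)

# After five 2 x 2 poolings, 224 x 224 inputs give 7 x 7 x 512 = 25088 features.
fcs = [(25088, 4096), (4096, 4096), (4096, 1000)]
fc_params = sum((n_in + 1) * n_out for n_in, n_out in fcs)

total = conv_params + fc_params
print(total)  # 138357544, i.e., the "138 million" weights quoted above
```

The same arithmetic applied to the reduced feature-map counts used in this paper gives the much smaller totals reported in Table 4.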

3.2.2. All Convolutional Network (All-Conv). The layer specification of All-Conv is given in Figure 4. The basic architecture is composed of two convolutional layers followed by a max-pooling layer. Instead of a fully connected layer, global average pooling (GAP) [11] with a dimension of 6 × 6 is used. Finally, a Softmax layer is used for classification; its output dimension is assigned based on the number of classes.
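The feature-map sizes in this stack can be traced with the output-dimension formula M = (N − F)/S + 1 (a small sketch; the layer list follows the sizes reported for All-Conv in Table 1):

```python
def out_dim(n, f, s):
    """Output size of a valid convolution/pooling with input n, filter f, stride s."""
    return (n - f) // s + 1

n = 32                        # input image size
sizes = []
for f, s in [(3, 1), (3, 1),  # C1, C2: 3 x 3 convolutions, stride 1
             (2, 2),          # S1: 2 x 2 max-pooling, stride 2
             (3, 1), (3, 1),  # C3, C4
             (2, 2),          # S2
             (3, 1), (1, 1)]: # C5: 3 x 3 convolution; C6: 1 x 1 convolution
    n = out_dim(n, f, s)
    sizes.append(n)

print(sizes)  # [30, 28, 14, 12, 10, 5, 3, 3], matching Table 1
```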

3.2.3. Network in Network (NiN). This model is quite different from the aforementioned DCNN models due to the following properties [11]:

(i) It uses multilayer convolution, where convolution is performed with 1 × 1 filters.

(ii) It uses GAP instead of fully connected layers.

The concept of 1 × 1 convolution helps to increase the depth of the network. The GAP significantly changes the network structure and is nowadays often used as a replacement for fully connected layers. The GAP on a large feature map is used to generate the final low-dimensional feature vector directly, instead of reducing the feature map to a small size and then flattening it into a feature vector.
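Both ideas are easy to state in array terms: a 1 × 1 convolution is an independent linear map across channels at every pixel, and GAP averages each channel over all spatial positions. A minimal numpy sketch (shapes and values are illustrative only, not from any trained network):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 64))   # feature map: height x width x channels

# 1 x 1 convolution: a per-pixel linear map over channels.
w = rng.standard_normal((64, 32))     # 64 input channels -> 32 output channels
y = x @ w                             # shape (8, 8, 32)

# Global average pooling: one number per channel, so no flatten/FC stage is needed.
gap = y.mean(axis=(0, 1))             # shape (32,)

print(y.shape, gap.shape)
```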

Figure 2: Basic CNN architecture for digit recognition: a feature extraction stage (input 32 × 32; feature maps 32@28 × 28 and 32@14 × 14 after the first convolution and max-pooling, then 64@10 × 10 and 64@5 × 5 after the second) followed by a classification stage that produces the outputs.


3.2.4. Residual Network (ResNet). The ResNet architecture has become very popular in the computer vision community. ResNet variants have been experimented with different numbers of layers, for example: convolutional layers, 49 (34, 152, and 1202 layers in other versions of ResNet); fully connected layer, 1; weights, 25.5M. The basic block diagram of the ResNet architecture is shown in Figure 5. If the input of the residual block is x_{l-1}, its output is x_l. After performing operations on x_{l-1} (e.g., convolution with different filter sizes, or batch normalization (BN) [54] followed by an activation function such as ReLU), the output F(x_{l-1}) is produced. The final output of the residual unit is defined as

x_l = F(x_{l-1}) + x_{l-1}. (4)

The Residual Network consists of several such basic residual units. Different residual units have been proposed with different types of layers, and the operations between the residual units vary depending on the architecture, as explained in [12].
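Equation (4) can be sketched in a few lines of numpy. Here F is stood in for by two toy linear-plus-ReLU transforms (the weights w1, w2 are illustrative, not from any trained ResNet); note that with the residual weights at zero, the unit reduces exactly to the identity:

```python
import numpy as np

def relu(v):
    return np.maximum(v, 0.0)

def residual_unit(x, w1, w2):
    """x_l = F(x_{l-1}) + x_{l-1}; F is a toy stand-in for conv/BN/ReLU."""
    f = relu(relu(x @ w1) @ w2)   # F(x_{l-1})
    return f + x                  # identity shortcut

rng = np.random.default_rng(1)
x = rng.standard_normal((4, 16))          # batch of 4 feature vectors
w1 = rng.standard_normal((16, 16)) * 0.1
w2 = rng.standard_normal((16, 16)) * 0.1
y = residual_unit(x, w1, w2)              # same shape as x

# With F's weights all zero, F(x) = 0 and the unit is exactly the identity.
y0 = residual_unit(x, np.zeros((16, 16)), np.zeros((16, 16)))
print(np.allclose(y0, x))  # True
```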

3.2.5. FractalNet. The FractalNet architecture is an advanced alternative to ResNet that is very efficient for designing very large networks with shallow subnetworks but shorter paths for the propagation of gradients during training [13]. This concept is based on drop-path, another regularization technique for large networks, and it helps to enforce a speed-versus-accuracy tradeoff. The basic block diagram of FractalNet is shown in Figure 6. Here, x is the actual input of FractalNet, and z and f(z) are the input and output of a fractal block, respectively.
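The expansion rule behind Figure 6 is that f_1 is a single convolution and f_{c+1}(z) joins the branch f_c(f_c(z)) with a parallel convolution of z. The toy sketch below only counts layers, to show how depth grows under this rule (an illustration of the published expansion rule, not the authors' code):

```python
def fractal(c):
    """Return (total_convs, deepest_path) for the fractal block f_c.

    Expansion rule: f_1 = conv;  f_{c+1}(z) = join(f_c(f_c(z)), conv(z)),
    where join averages its branches (drop-path randomly drops branches
    during training).
    """
    if c == 1:
        return 1, 1
    total, deepest = fractal(c - 1)
    # two sequential copies of f_{c-1} on one branch, one conv on the other
    return 2 * total + 1, 2 * deepest

for c in range(1, 5):
    print(c, fractal(c))
# the deepest path doubles with each expansion: 1, 2, 4, 8 layers
```

This is why a single fractal block contains both very deep and very shallow subnetworks, which is what the drop-path regularizer exploits.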

3.2.6. Densely Connected Network (DenseNet). DenseNet is a densely connected CNN in which each layer is connected to all previous layers [14]; because it forms such dense connectivity between the layers, it is called DenseNet. DenseNet consists of several dense blocks, and the layers between two adjacent blocks are called transition layers. The conceptual diagram of a dense block is shown in Figure 7. As the figure shows, the l-th layer receives all the feature maps x_0, x_1, x_2, ..., x_{l-1} from the previous layers as input, which is expressed by

x_l = H_l([x_0, x_1, x_2, ..., x_{l-1}]), (5)

Figure 3: Basic architecture of VGG Net: convolutional (Conv) layers with ReLU activations, a max-pooling layer, fully connected (FC) layers, and a Softmax layer at the end.

Figure 4: All convolutional network framework: input (32 × 32); two 3 × 3 convolutions with 128 feature maps and ReLU; 3 × 3 max-pooling with stride 2; two 3 × 3 convolutions with 256 feature maps and ReLU; 3 × 3 max-pooling with stride 2; two convolutions with 512 feature maps and ReLU; 6 × 6 global average pooling; and a Softmax layer.

Figure 5: Basic diagram of the residual block: two ReLU-activation and convolution stages followed by an additive (+) skip connection from the block input.


where [x_0, x_1, x_2, ..., x_{l-1}] denotes the feature maps from layers 0, ..., l-1 concatenated into a single tensor, and H_l(·) performs three consecutive operations: BN, followed by ReLU and a 3 × 3 convolution. In the transition block, a 1 × 1 convolutional operation is performed with BN, followed by a 2 × 2 average pooling layer. This new architecture has achieved state-of-the-art accuracy for object recognition on five different competitive benchmarks.
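The concatenation in (5) means the channel count grows linearly with depth: if the block input has k0 channels and every H_l emits k new feature maps (the growth rate), the block output has k0 + L·k channels after L layers. A numpy sketch of a dense block, with H_l reduced to a toy random channel-mixing map for brevity (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(2)

def h(x_concat, k):
    """Toy stand-in for H_l (really BN -> ReLU -> 3 x 3 conv): a random
    channel-mixing map producing k new feature maps."""
    w = rng.standard_normal((x_concat.shape[-1], k))
    return np.maximum(x_concat @ w, 0.0)

k0, k, n_layers = 3, 3, 4              # as in Figure 7: 4-layer block, growth rate k = 3
x0 = rng.standard_normal((8, 8, k0))   # block input x_0
features = [x0]
for _ in range(n_layers):
    # x_l = H_l([x_0, ..., x_{l-1}]): every layer sees all earlier feature maps
    x_l = h(np.concatenate(features, axis=-1), k)
    features.append(x_l)

out = np.concatenate(features, axis=-1)
print(out.shape[-1])  # 3 + 4*3 = 15 channels leaving the block
```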

3.2.7. Network Parameters. The number of network parameters is a very important criterion for assessing the complexity of an architecture, and it can be used to compare different architectures. First, the dimension of the output feature map can be computed as

M = (N - F)/S + 1, (6)

where N denotes the dimension of the input feature maps, F refers to the dimension of the filters or the receptive field, S represents the stride of the convolution, and M is the dimension of the output feature maps. The number of parameters (without biases) in a single layer is obtained by

P_l = (F × F × FM_{l-1}) × FM_l, (7)

where P_l represents the total number of parameters in the l-th layer, FM_l is the total number of output feature maps of the l-th layer, and FM_{l-1} is the total number of feature maps in the (l-1)-th layer. For example, let a 32 × 32 dimensional (N) image be the input, with a filter (F) of size 5 × 5 and a stride (S) of 1 for the convolutional layer. The output dimension (M) of the convolutional layer is then 28 × 28, calculated according to (6). For better illustration, a summary of the parameters used in the All-Conv architecture is shown in Table 1. Note that the number of trainable parameters is zero in the pooling layers.
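As a check, equations (6) and (7) reproduce the Table 1 entries for All-Conv exactly (a short sketch; parameter counts are biasless, as in (7)):

```python
# Equation (6) with N = 32, F = 5, S = 1, as in the example above.
m = (32 - 5) // 1 + 1
print(m)  # 28

# (feature maps in, feature maps out, filter size) for each conv layer of All-Conv,
# with the final Softmax layer treated as a 512 -> 10 map, as in Table 1.
layers = {"C1": (3, 128, 3), "C2": (128, 128, 3),
          "C3": (128, 256, 3), "C4": (256, 256, 3),
          "C5": (256, 512, 3), "C6": (512, 512, 1)}

params = {name: (f * f * fm_in) * fm_out            # equation (7), no biases
          for name, (fm_in, fm_out, f) in layers.items()}
params["Softmax"] = 512 * 10

print(params)
# {'C1': 3456, 'C2': 147456, 'C3': 294912, 'C4': 589824,
#  'C5': 1179648, 'C6': 262144, 'Softmax': 5120} -- matching Table 1
```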

4. Results and Discussion

The entire experiment was performed on a desktop computer with an Intel® Core i7 CPU @ 3.33 GHz and 56.00 GB of memory, using Keras with Theano as the backend in a Linux environment. We evaluate the state-of-the-art DCNN models on three datasets from CMATERdb (available at https://code.google.com/archive/p/cmaterdb) containing Bangla handwritten digits, alphabets, and special characters.

Figure 7: A 4-layer dense block with a growth rate of k = 3. The input passes through successive BN-ReLU-Conv layers (H_1, H_2, H_3 producing x_1, x_2, x_3) before the transition layer, and each layer takes all of the preceding feature maps as input.

Figure 6: An example of the FractalNet architecture: convolution, joining, and pooling layers, with network input x and output y; within a fractal block, z denotes the block input and f_2(z) its output.


The statistics of the three datasets used in this paper are summarized in Table 2. For convenience, we name the datasets Digit-10, Alphabet-50, and SpecialChar-13, respectively. All images are rescaled to 32 × 32 pixels in our experiments.

4.1. Bangla Handwritten Digit Dataset. Standard samples of the numerals, with the corresponding Arabic numerals, are shown in Figure 8. The performance of the DCNN models is evaluated on a Bangla handwritten benchmark dataset called CMATERdb 3.1.1 [22]. This dataset contains 6000 images of unconstrained handwritten isolated Bangla numerals. Each digit has 600 images, which are rescaled to 32 × 32 pixels. Some sample images from the database are shown in Figure 9. Visual inspection reveals no visible noise; however, the variability in writing style is quite high due to user dependency. In our experiments, the dataset is split into a training set and a test set for the evaluation of the different DCNN models. The training set consists of 4000 images (400 randomly selected images of each digit), and the remaining 2000 images are used for testing.
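The per-class split described above (600 images per digit, 400 randomly chosen for training and 200 for testing) can be sketched as follows; `labels` here is a synthetic stand-in for the CMATERdb 3.1.1 labels, not the real data:

```python
import numpy as np

rng = np.random.default_rng(42)
labels = np.repeat(np.arange(10), 600)   # 6000 synthetic labels, 600 per digit

train_idx, test_idx = [], []
for digit in range(10):
    idx = rng.permutation(np.where(labels == digit)[0])
    train_idx.extend(idx[:400])          # 400 random images per digit for training
    test_idx.extend(idx[400:])           # remaining 200 per digit for testing

print(len(train_idx), len(test_idx))  # 4000 2000
```

Splitting per class like this keeps both sets balanced across the ten digits.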

Figure 10 shows the training loss of all DCNN models over 250 epochs. It can be observed that FractalNet and DenseNet converge faster than the other networks, while the worst convergence is obtained for the All-Conv network. The validation accuracy is shown in Figure 11, where DenseNet and FractalNet show the best recognition accuracy among all DCNN models. Finally, the testing accuracy of all the DCNN models is shown in Figure 12. From these results, it can be clearly seen that DenseNet provides the best recognition accuracy of all the networks.

4.2. Bangla Handwritten Alphabet-50. In our implementation, the basic fifty alphabets, including 11 vowels and 39 consonants, are considered. Samples of the 39 consonant and 11 vowel characters are shown in Figures 13(a) and 13(b), respectively. The Alphabet-50 dataset contains 15,000 samples, of which 12,000 are used for training and the remaining 3000 for testing. Since the dataset contains samples of different dimensions, we rescale all input images to 32 × 32 pixels to better fit the convolutional operation. Some randomly selected samples from this database are shown in Figure 14.

Table 1: Parameter specification of the All-Conv model.

Layers    Operations    Feature maps    Size of feature maps    Size of kernels    Number of parameters
Inputs    -             -               32 × 32 × 3             -                  -
C1        Convolution   128             30 × 30                 3 × 3              3456
C2        Convolution   128             28 × 28                 3 × 3              147456
S1        Max-pooling   128             14 × 14                 2 × 2              NA
C3        Convolution   256             12 × 12                 3 × 3              294912
C4        Convolution   256             10 × 10                 3 × 3              589824
S2        Max-pooling   256             5 × 5                   2 × 2              NA
C5        Convolution   512             3 × 3                   3 × 3              1179648
C6        Convolution   512             3 × 3                   1 × 1              262144
GAP1      GAP           512             3 × 3                   NA                 NA
Outputs   Softmax       10              1 × 1                   NA                 5120

Table 2: Statistics of the databases used in our experiments.

Dataset          Training samples    Testing samples    Total samples    Number of classes
Digit-10         4000                2000               6000             10
Alphabet-50      12000               3000               15000            50
SpecialChar-13   2196                935                3131             13

Figure 8: The first row shows the actual Bangla digits, and the second row shows the corresponding Arabic numerals (0, 1, 2, 3, 4, 5, 6, 7, 8, 9).

Figure 9: Sample handwritten Bangla numeral images from the CMATERdb 3.1.1 database, including digits from 1 to 10.

Figure 10: Training loss of the different architectures (All-Conv, ResNet, FractalNet, NiN, DenseNet, and VGG-16) for Bangla handwritten 1-10 digits.


The training loss of the different DCNN models is shown in Figure 15. It is clear that DenseNet shows the best convergence of all the DCNN approaches. As in the previous experiment, All-Conv shows the worst convergence behavior, and unexpected convergence behavior is observed for the NiN model. However, all DCNN models tend to converge after 200 epochs. The corresponding validation accuracy on Alphabet-50 is shown in Figure 16; DenseNet again shows validation accuracy superior to the other DCNN approaches.

Figure 17 shows the testing results on handwritten Alphabet-50. DenseNet achieves the best testing accuracy, with a recognition rate of 98.31%. On the other hand, the All-Conv network provides around 94.31% testing accuracy, the lowest among all the DCNN models.

4.3. Bangla Handwritten Special Characters. There are several special characters (SpecialChar-13) that are equivalent representations of vowels and are combined with consonants to make meaningful words. In our evaluation, we use 13 special characters, corresponding to 11 vowels plus two additional special characters. Some samples of Bangla special characters are shown in Figure 18. It can be seen that the quality of the samples is poor, and significant variation within the same symbols makes this recognition task even more difficult.

The training loss and validation accuracy for SpecialChar-13 are shown in Figures 19 and 20, respectively. These results show that DenseNet provides better performance, with the lowest loss and the highest validation accuracy among all DCNN models. Figure 21 shows the testing accuracy of the DCNN models on the SpecialChar-13 dataset. It is observed from Figure 21 that DenseNet shows the highest testing accuracy, with the lowest training loss and very fast convergence. The VGG-19 network shows promising recognition accuracy as well.

4.4. Performance Comparison. The testing performance is compared with several existing methods; the results are presented in Table 3. The experimental results show that the modern DCNN models, including DenseNet, FractalNet, and ResNet, provide better testing accuracy than the other deep learning approaches and the previously proposed classical methods. In particular, DenseNet provides 99.13% testing accuracy for handwritten digit recognition, which, to the best of our knowledge, is the best accuracy that has been publicly reported. In the case of 50-alphabet recognition, DenseNet yields 98.31% recognition accuracy, which is almost 2.5% better than the method in [55]; as far as we know, this is the highest accuracy for handwritten Bangla 50-alphabet recognition. In addition, on the 13-special-character recognition task, the DCNNs show promising recognition accuracy, with DenseNet again achieving the best accuracy, 98.18%.

Figure 11: Validation accuracy of the different architectures (All-Conv, ResNet, VGG-16, NiN, DenseNet, and FractalNet) for Bangla handwritten 1-10 digits.

Figure 12: Testing accuracy for Bangla handwritten digit recognition: VGG Net 97.57%, All-Conv 97.08%, NiN 97.36%, ResNet 98.51%, FractalNet 98.92%, and DenseNet 99.13%.

Figure 13: Example images of handwritten characters: (a) Bangla consonant characters and (b) vowels.


4.5. Parameter Evaluation. For an impartial comparison, we trained and tested the networks with the same optimized numbers of parameters as in the references. Table 4 shows the number of parameters used by the different networks for 50-alphabet recognition. The numbers of network parameters for digit and special character recognition were the same, except for the number of neurons in the classification layer.

4.6. Computation Time. We also calculated the computational cost of all methods, although the computation time depends on the complexity of the architecture. Table 5 presents the computation time per epoch (in seconds) during training of all the networks for the Digit-10, Alphabet-50, and SpecialChar-13 recognition tasks. From Table 5, it can be seen that DenseNet takes the longest time during training due to its dense structure but yields the best accuracy.

Figure 14: Randomly selected handwritten characters of the Bangla alphabet from the Bangla handwritten Alphabet-50 dataset.

Figure 15: Training loss of the different DCNN models (All-Conv, VGG, DenseNet, ResNet, NiN, and FractalNet) for Bangla handwritten Alphabet-50.

Figure 16: Validation accuracy of the different architectures (ResNet, All-Conv, VGG, NiN, DenseNet, and FractalNet) for Bangla handwritten Alphabet-50.


Figure 17: Testing accuracy for handwritten 50-alphabet recognition using different DCNN techniques: VGG Net 97.56%, All-Conv 94.31%, NiN 96.73%, ResNet 97.33%, FractalNet 97.87%, and DenseNet 98.31%.

Figure 18: Randomly selected images of special characters from the dataset.

Figure 19: Training loss of the different architectures (All-Conv, ResNet, FractalNet, NiN, DenseNet, and VGG-19) for the 13 Bangla special characters (SpecialChar-13).


5. Conclusions

In this research, we investigated the performance of several popular deep convolutional neural networks (DCNNs) for handwritten Bangla character (i.e., digit, alphabet, and special character) recognition. The experimental results indicate that DenseNet is the best performer in classifying Bangla digits, alphabets, and special characters. Specifically, using DenseNet, we achieved recognition rates of 99.13% for handwritten Bangla digits, 98.31% for the handwritten Bangla alphabet, and 98.18% for special characters. To the best of our knowledge, these are the best recognition results on the CMATERdb dataset. In the future, fusion-based DCNN models, such as the Inception Recurrent Convolutional Neural Network (IRCNN) [47], will be explored and developed for handwritten Bangla character recognition.

Figure 20: Validation accuracy of the different architectures (All-Conv, NiN, ResNet, FractalNet, DenseNet, and VGG) for the 13 Bangla special characters (SpecialChar-13).

Figure 21: Testing accuracy of the different architectures for the 13 Bangla special characters (SpecialChar-13): VGG Net 97.24%, All-Conv 95.58%, NiN 97.15%, ResNet 97.64%, FractalNet 97.98%, and DenseNet 98.18%.

Table 4: Comparison of the number of network parameters.

Models              Number of parameters
VGG-16 [9]          ~8.43M
All-Conv Net [10]   ~2.26M
NiN [11]            ~2.81M
ResNet [12]         ~5.63M
FractalNet [13]     ~7.84M
DenseNet [14]       ~4.25M

Table 5: Computation time (in seconds) per epoch for the different DCNN models on Digit-10, Alphabet-50, and SpecialChar-13.

Models              Digit-10    Alphabet-50    SpecialChar-13
VGG-16 [9]          32          83             15
All-Conv Net [10]   7           23             4
NiN [11]            9           27             5
ResNet [12]         64          154            34
FractalNet [13]     32          102            18
DenseNet [14]       95          210            58

Table 3: Testing accuracy of the VGG-16 network, All-Conv network, NiN, ResNet, FractalNet, and DenseNet on Digit-10, Alphabet-50, and SpecialChar-13, compared against other existing methods.

Types                 Methods           Digit-10 (%)    Alphabet-50 (%)    SpecialChar-13 (%)
Existing approaches   MLP [7]           96.67           -                  -
                      MPCA+QTLR [56]    98.55           -                  -
                      GA [22]           97.00           -                  -
                      LeNet+DBN [57]    98.64           -                  -
DCNN                  VGG Net [9]       97.57           97.56              96.15
                      All-Conv [10]     97.08           94.31              95.58
                      NiN [11]          97.36           96.73              97.24
                      ResNet [12]       98.51           97.33              97.64
                      FractalNet [13]   98.92           97.87              97.98
                      DenseNet [14]     99.13           98.31              98.18


Data Availability

The data used to support the findings of this study are available at https://code.google.com/archive/p/cmaterdb.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

[1] D. C. Ciresan, U. Meier, L. M. Gambardella, and J. Schmidhuber, "Deep, big, simple neural nets for handwritten digit recognition," Neural Computation, vol. 22, no. 12, pp. 3207–3220, 2010.

[2] U. Meier, D. C. Ciresan, L. M. Gambardella, and J. Schmidhuber, "Better digit recognition with a committee of simple neural nets," in Proceedings of International Conference on Document Analysis and Recognition (ICDAR), pp. 1250–1254, Beijing, China, September 2011.

[3] W. Song, S. Uchida, and M. Liwicki, "Comparative study of part-based handwritten character recognition methods," in Proceedings of International Conference on Document Analysis and Recognition (ICDAR), pp. 814–818, Beijing, China, September 2011.

[4] H. A. Khan, A. Al Helal, and K. I. Ahmed, "Handwritten Bangla digit recognition using sparse representation classifier," in Proceedings of 2014 International Conference on Informatics, Electronics and Vision (ICIEV), pp. 1–6, IEEE, Dhaka, Bangladesh, May 2014.

[5] U. Pal and B. Chaudhuri, "Automatic recognition of unconstrained off-line Bangla handwritten numerals," in Proceedings of Advances in Multimodal Interfaces–ICMI 2000, pp. 371–378, Springer, Beijing, China, October 2000.

[6] N. Das, B. Das, R. Sarkar, S. Basu, M. Kundu, and M. Nasipuri, "Handwritten Bangla basic and compound character recognition using MLP and SVM classifier," 2010, http://arxiv.org/abs/1002.4040.

[7] S. Basu, N. Das, R. Sarkar, M. Kundu, M. Nasipuri, and D. K. Basu, "An MLP based approach for recognition of handwritten Bangla numerals," 2012, http://arxiv.org/abs/1203.0876.

[8] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.

[9] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," 2014, http://arxiv.org/abs/1409.1556.

[10] J. T. Springenberg, A. Dosovitskiy, T. Brox, and M. Riedmiller, "Striving for simplicity: the all convolutional net," 2014, http://arxiv.org/abs/1412.6806.

[11] M. Lin, Q. Chen, and S. Yan, "Network in network," 2013, http://arxiv.org/abs/1312.4400.

[12] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778, Las Vegas, NV, USA, June–July 2016.

[13] G. Larsson, M. Maire, and G. Shakhnarovich, "FractalNet: ultra-deep neural networks without residuals," 2016, http://arxiv.org/abs/1605.07648.

[14] G. Huang, Z. Liu, K. Q. Weinberger, and L. van der Maaten, "Densely connected convolutional networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 1, no. 2, p. 3, Honolulu, HI, USA, July 2017.

[15] B. Chaudhuri and U. Pal, "A complete printed Bangla OCR system," Pattern Recognition, vol. 31, no. 5, pp. 531–549, 1998.

[16] U. Pal, On the development of an optical character recognition (OCR) system for printed Bangla script, Ph.D. dissertation, Indian Statistical Institute, Kolkata, India, 1997.

[17] U. Pal and B. Chaudhuri, "Indian script character recognition: a survey," Pattern Recognition, vol. 37, no. 9, pp. 1887–1899, 2004.

[18] C.-L. Liu and C. Y. Suen, "A new benchmark on the recognition of handwritten Bangla and Farsi numeral characters," Pattern Recognition, vol. 42, no. 12, pp. 3287–3295, 2009.

[19] B. Chaudhuri, "A complete handwritten numeral database of Bangla, a major Indic script," in Proceedings of Tenth International Workshop on Frontiers in Handwriting Recognition, Suvisoft, Baule, France, October 2006.

[20] O. Surinta, L. Schomaker, and M. Wiering, "A comparison of feature and pixel-based methods for recognizing handwritten Bangla digits," in Proceedings of 12th International Conference on Document Analysis and Recognition (ICDAR), pp. 165–169, IEEE, Buffalo, NY, USA, 2013.

[21] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.

[22] N. Das, R. Sarkar, S. Basu, M. Kundu, M. Nasipuri, and D. K. Basu, "A genetic algorithm based region sampling for selection of local features in handwritten digit recognition application," Applied Soft Computing, vol. 12, no. 5, pp. 1592–1606, 2012.

[23] J.-W. Xu, J. Xu, and Y. Lu, "Handwritten Bangla digit recognition using hierarchical Bayesian network," in Proceedings of 3rd International Conference on Intelligent System and Knowledge Engineering, vol. 1, pp. 1096–1099, IEEE, Xiamen, China, November 2008.

[24] D. E. Rumelhart, J. L. McClelland, and the PDP Research Group, Parallel Distributed Processing, Vol. 1, MIT Press, Cambridge, MA, USA, 1987.

[25] I.-J. Kim and X. Xie, "Handwritten Hangul recognition using deep convolutional neural networks," International Journal on Document Analysis and Recognition, vol. 18, no. 1, pp. 1–13, 2015.

[26] M. M. Rahman, M. Akhand, S. Islam, P. C. Shill, and M. H. Rahman, "Bangla handwritten character recognition using convolutional neural network," International Journal of Image, Graphics and Signal Processing, vol. 7, no. 8, pp. 42–49, 2015.

[27] D. Ciregan and U. Meier, "Multi-column deep neural networks for offline handwritten Chinese character classification," in Proceedings of 2015 International Joint Conference on Neural Networks (IJCNN), pp. 1–6, IEEE, Killarney, Ireland, July 2015.

[28] A. Voulodimos, N. Doulamis, A. Doulamis, and E. Protopapadakis, "Deep learning for computer vision: a brief review," Computational Intelligence and Neuroscience, vol. 2018, Article ID 7068349, 13 pages, 2018.

[29] G. E. Hinton, S. Osindero, and Y.-W. Teh, "A fast learning algorithm for deep belief nets," Neural Computation, vol. 18, no. 7, pp. 1527–1554, 2006.

[30] P. Vincent, H. Larochelle, Y. Bengio, and P.-A. Manzagol, "Extracting and composing robust features with denoising autoencoders," in Proceedings of the 25th International Conference on Machine Learning, pp. 1096–1103, ACM, Helsinki, Finland, July 2008.

[31] Y. Bengio et al., "Learning deep architectures for AI," Foundations and Trends® in Machine Learning, vol. 2, no. 1, pp. 1–127, 2009.


[32] J. Kim, T.-T.-H. Le, and H. Kim, "Nonintrusive load monitoring based on advanced deep learning and novel signature," Computational Intelligence and Neuroscience, vol. 2017, Article ID 4216281, 22 pages, 2017.

[33] E. Protopapadakis, A. Voulodimos, A. Doulamis, N. Doulamis, D. Dres, and M. Bimpas, "Stacked autoencoders for outlier detection in over-the-horizon radar signals," Computational Intelligence and Neuroscience, vol. 2017, Article ID 5891417, 11 pages, 2017.

[34] H. Greenspan, B. van Ginneken, and R. M. Summers, "Guest editorial: deep learning in medical imaging: overview and future promise of an exciting new technique," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1153–1159, 2016.

[35] D. Shen, G. Wu, and H.-I. Suk, "Deep learning in medical image analysis," Annual Review of Biomedical Engineering, vol. 19, pp. 221–248, 2017.

[36] H.-C. Shin, H. R. Roth, M. Gao et al., "Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1285–1298, 2016.

[37] S. Sladojevic, M. Arsenovic, A. Anderla, D. Culibrk, and D. Stefanovic, "Deep neural networks based recognition of plant diseases by leaf image classification," Computational Intelligence and Neuroscience, vol. 2016, Article ID 3289801, 11 pages, 2016.

[38] S. P. Mohanty, D. P. Hughes, and M. Salathe, "Using deep learning for image-based plant disease detection," Frontiers in Plant Science, vol. 7, p. 1419, 2016.

[39] G. Wang, Y. Sun, and J. Wang, "Automatic image-based plant disease severity estimation using deep learning," Computational Intelligence and Neuroscience, vol. 2017, Article ID 2917536, 8 pages, 2017.

[40] L. Zhang, L. Zhang, and B. Du, "Deep learning for remote sensing data: a technical tutorial on the state of the art," IEEE Geoscience and Remote Sensing Magazine, vol. 4, no. 2, pp. 22–40, 2016.

[41] A. Romero, C. Gatta, and G. Camps-Valls, "Unsupervised deep feature extraction for remote sensing image classification," IEEE Transactions on Geoscience and Remote Sensing, vol. 54, no. 3, pp. 1349–1362, 2016.

[42] K. Nogueira, O. A. Penatti, and J. A. dos Santos, "Towards better exploiting convolutional neural networks for remote sensing scene classification," Pattern Recognition, vol. 61, pp. 539–556, 2017.

[43] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440, Boston, MA, USA, June 2015.

[44] V. Badrinarayanan, A. Kendall, and R. Cipolla, "SegNet: a deep convolutional encoder-decoder architecture for image segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 12, pp. 2481–2495, 2017.

[45] S. Ji, W. Xu, M. Yang, and K. Yu, "3D convolutional neural networks for human action recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 221–231, 2013.

[46] C. Dong, C. C. Loy, K. He, and X. Tang, "Image super-resolution using deep convolutional networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 2, pp. 295–307, 2016.

[47] C. Farabet, C. Couprie, L. Najman, and Y. LeCun, "Learning hierarchical features for scene labeling," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 8, pp. 1915–1929, 2013.

[48] K. Zhang, Q. Liu, Y. Wu, and M.-H. Yang, "Robust visual tracking via convolutional networks without training," IEEE Transactions on Image Processing, vol. 25, no. 4, pp. 1779–1792, 2016.

[49] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proceedings of Advances in Neural Information Processing Systems, pp. 1097–1105, Lake Tahoe, NV, USA, December 2012.

[50] M. Matsugu, K. Mori, Y. Mitari, and Y. Kaneda, "Subject independent facial expression recognition with robust face detection using a convolutional neural network," Neural Networks, vol. 16, no. 5-6, pp. 555–559, 2003.

[51] C. Szegedy, W. Liu, Y. Jia et al., "Going deeper with convolutions," in Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1–9, Boston, MA, USA, June 2015.

[52] O. Russakovsky, J. Deng, H. Su et al., "ImageNet large scale visual recognition challenge," International Journal of Computer Vision, vol. 115, no. 3, pp. 211–252, 2015.

[53] V Nair and G E Hinton ldquoRectified linear units improverestricted Boltzmann machinesrdquo in Proceedings of the 27thInternational Conference on Machine Learning (ICML-10)pp 807ndash814 Haifa Israel June 2010

[54] S Ioffe and C Szegedy ldquoBatch normalization acceleratingdeep network training by reducing internal covariate shiftrdquo inProceedings of International Conference on Machine Learningpp 448ndash456 Lille France July 2015

[55] U Bhattacharya M Shridhar S K Parui P Sen andB Chaudhuri ldquoOffline recognition of handwritten Banglacharacters an efficient two-stage approachrdquo Pattern Analysisand Applications vol 15 no 4 pp 445ndash458 2012

[56] N Das J M Reddy R Sarkar et al ldquoA statisticalndashtopologicalfeature combination for recognition of handwritten nu-meralsrdquoApplied Soft Computing vol 12 no 8 pp 2486ndash24952012

[57] M Z Alom P Sidike T M Taha and V K AsarildquoHandwritten Bangla digit recognition using deep learningrdquo2017 httparxivorgabs170502680

Computational Intelligence and Neuroscience 13


3.2.4. Residual Network (ResNet). The ResNet architecture has become very popular in the computer vision community. The ResNet variants have been experimented with different numbers of layers, as follows: number of convolution layers: 49 (34, 152, and 1202 layers for other versions of ResNet); number of fully connected layers: 1; weights: 25.5M. The basic block diagram of the ResNet architecture is shown in Figure 5. If the input of the residual block is x_{l−1}, its output is x_l. After performing operations (e.g., convolution with different filter sizes, and batch normalization (BN) [54] followed by an activation function such as ReLU) on x_{l−1}, the output F(x_{l−1}) is produced. The final output of the residual unit is defined as

x_l = F(x_{l−1}) + x_{l−1}.   (4)

The Residual Network consists of several basic residual units. Different residual units have been proposed with different types of layers; however, the operations between the residual units vary depending on the architecture, as explained in [12].
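Equation (4) amounts to an additive identity shortcut around the transformation F(·). The following toy sketch is illustrative only: the scalar-weight F below is a stand-in for the convolution–BN–ReLU stack of a real residual unit, not the networks evaluated in this paper.

```python
def relu(v):
    # Element-wise ReLU activation.
    return [max(0.0, x) for x in v]

def residual_unit(x_prev, F):
    # Equation (4): x_l = F(x_{l-1}) + x_{l-1}.
    return [a + b for a, b in zip(F(x_prev), x_prev)]

# Toy F: element-wise scaling plus ReLU, standing in for the
# convolution + batch-normalization + ReLU stack of a residual unit.
F = lambda v: relu([0.5 * x for x in v])

print(residual_unit([2.0, -4.0], F))  # [3.0, -4.0]
```

Note that the shortcut passes x_{l−1} through unchanged, which is what keeps gradients flowing in very deep stacks.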

3.2.5. FractalNet. The FractalNet architecture is an advanced alternative to ResNet that is very efficient for designing very large networks with shallow subnetworks, while providing shorter paths for the propagation of gradients during training [13]. The design is based on drop-path, which is another regularization technique for large networks; as a result, it helps to enforce a speed-versus-accuracy tradeoff. The basic block diagram of FractalNet is shown in Figure 6, where x is the actual input of FractalNet, and z and f(z) are the input and output of a fractal block, respectively.
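The fractal expansion rule behind this design can be sketched numerically. The snippet below is a simplified illustration of the recursion f_{k+1}(z) = join(f_k(f_k(z)), base(z)) from [13], with the join taken as an element-wise mean and a toy "adds one" base layer standing in for a convolution; drop-path regularization is omitted for brevity.

```python
def fractal(f_base, k):
    # Fractal expansion rule of [13]: f_1 = f_base and
    # f_{k+1}(z) = join(f_k(f_k(z)), f_base(z)).
    if k == 1:
        return f_base
    sub = fractal(f_base, k - 1)
    def f(z):
        deep = sub(sub(z))             # long path through two sub-fractals
        shallow = f_base(z)            # short path through one base layer
        return 0.5 * (deep + shallow)  # join = element-wise mean
    return f

# Toy base "layer": adds 1.0, standing in for a convolution.
base = lambda z: z + 1.0

f3 = fractal(base, 3)
print(f3(0.0))  # 2.0
```

The shallow branch gives every expansion level a short route from input to output, which is why gradients reach early layers quickly even when the deep branch is very long.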

3.2.6. Densely Connected Network (DenseNet). DenseNet is a densely connected CNN in which each layer is connected to all previous layers [14]. Because it forms very dense connectivity between the layers, it is called DenseNet. DenseNet consists of several dense blocks, and the layers between two adjacent blocks are called transition layers. The conceptual diagram of a dense block is shown in Figure 7. According to the figure, the lth layer receives all the feature maps x_0, x_1, x_2, …, x_{l−1} from the previous layers as input, which is expressed by

x_l = H_l([x_0, x_1, x_2, …, x_{l−1}]),   (5)

Figure 3: Basic architecture of VGG Net: convolution (Conv) and fully connected (FC) layers, with a Softmax layer at the end.

Figure 4: All convolutional network framework (32 × 32 input; stacked 3 × 3 convolutions with 128, 256, and 512 ReLU feature maps; 3 × 3 max-pooling with stride 2 between stages; 6 × 6 global average pooling; Softmax).

Figure 5: Basic diagram of the residual block (two convolution + ReLU stages with an additive identity shortcut).


where [x_0, x_1, x_2, …, x_{l−1}] denotes the features concatenated from layers 0 to l − 1, and H_l(·) is treated as a single tensor operation. DenseNet performs three consecutive operations: BN, followed by ReLU and a 3 × 3 convolution. In the transition block, 1 × 1 convolutional operations are performed with BN, followed by a 2 × 2 average pooling layer. This architecture has achieved state-of-the-art accuracy for object recognition on five different competitive benchmarks.
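The feed-forward pass of equation (5) through a dense block can be sketched as follows. The helper H below is a toy stand-in for the BN–ReLU–convolution stack of a real dense layer, used only to show how each layer consumes the concatenation of all earlier feature maps.

```python
def dense_block(x0, H, num_layers):
    # Equation (5): layer l receives the concatenation of all preceding
    # feature maps and emits x_l = H_l([x_0, x_1, ..., x_{l-1}]).
    features = [x0]
    for _ in range(num_layers):
        concatenated = [v for fmap in features for v in fmap]
        features.append(H(concatenated))
    return features

# Toy H: collapses its input to a one-element "feature map", standing in
# for the BN -> ReLU -> 3 x 3 convolution of a real dense layer.
H = lambda concat: [sum(concat)]

print(dense_block([1.0, 2.0], H, 3))  # [[1.0, 2.0], [3.0], [6.0], [12.0]]
```

Because the concatenation grows with every layer, later layers see progressively richer inputs; the growth rate k of a real DenseNet bounds how many new feature maps each H_l adds.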

3.2.7. Network Parameters. The number of network parameters is a very important criterion for assessing the complexity of an architecture, and it can be used to compare different architectures. First, the dimension of the output feature map can be computed as

M = (N − F)/S + 1,   (6)

where N denotes the dimension of the input feature maps, F refers to the dimension of the filters (receptive field), S represents the stride of the convolution, and M is the dimension of the output feature maps. The number of parameters (without biases) of a single layer is obtained by

P_l = (F × F × FM_{l−1}) × FM_l,   (7)

where P_l represents the total number of parameters in the lth layer, FM_l is the total number of output feature maps of the lth layer, and FM_{l−1} is the total number of feature maps in the (l − 1)th layer. For example, let a 32 × 32 dimensional (N) image be the input, with a 5 × 5 filter (F) and a stride (S) of 1 for the convolutional layer. The output dimension (M) of the convolutional layer is then 28 × 28, calculated according to (6). For better illustration, a summary of the parameters used in the All-Conv architecture is shown in Table 1. Note that the number of trainable parameters is zero in the pooling layers.
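Equations (6) and (7) are easy to check numerically; the short sketch below reproduces the 32 × 32 example from the text and the parameter counts of layers C1 and C2 in Table 1.

```python
def output_dim(N, F, S):
    # Equation (6): M = (N - F) / S + 1.
    return (N - F) // S + 1

def layer_params(F, fm_prev, fm_curr):
    # Equation (7): P_l = (F x F x FM_{l-1}) x FM_l, biases ignored.
    return F * F * fm_prev * fm_curr

# The 32 x 32 input with a 5 x 5 filter and stride 1 from the text:
print(output_dim(32, 5, 1))       # 28
# Layers C1 and C2 of the All-Conv model (Table 1):
print(layer_params(3, 3, 128))    # 3456
print(layer_params(3, 128, 128))  # 147456
```

The same two functions reproduce the remaining convolution rows of Table 1, e.g. C5 with 3 × 3 filters mapping 256 to 512 feature maps gives 1,179,648 parameters.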

4. Results and Discussion

The entire experiment is performed on a desktop computer with an Intel® Core i7 CPU (3.33 GHz) and 56.00 GB of memory, using Keras with Theano on the backend in a Linux environment. We evaluate the state-of-the-art DCNN models on three datasets from CMATERdb (available at https://code.google.com/archive/p/cmaterdb) containing Bangla handwritten digits, alphabets, and special characters.

Figure 7: A 4-layer dense block with a growth rate of k = 3. Each layer takes all of the preceding feature maps as input.

Figure 6: An example of the FractalNet architecture, built from convolution, joining, and pooling layers; z and f(z) denote the input and output of a fractal block.


The statistics of the three datasets used in this paper are summarized in Table 2. For convenience, we name the datasets Digit-10, Alphabet-50, and SpecialChar-13, respectively. All images are rescaled to 32 × 32 pixels in our experiments.

4.1. Bangla Handwritten Digit Dataset. The standard samples of the numerals with the respective Arabic numerals are shown in Figure 8. The performance of both DBN and CNN is evaluated on a Bangla handwritten benchmark dataset called CMATERdb 3.1.1 [22]. This dataset contains 6000 images of unconstrained handwritten isolated Bangla numerals. Each digit has 600 images that are rescaled to 32 × 32 pixels. Some sample images from the database are shown in Figure 9. Visual inspection shows no visible noise; however, variability in writing style is quite high due to user dependency. In our experiments, the dataset is split into a training set and a test set for the evaluation of the different DCNN models. The training set consists of 4000 images (400 randomly selected images of each digit); the remaining 2000 images are used for testing.
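The split described above can be sketched as a per-class random partition. The snippet below is an illustration with placeholder image ids standing in for the 32 × 32 pixel arrays, not the actual loading code used in the experiments; it reproduces the 4000/2000 division.

```python
import random

def split_per_class(images_by_class, train_per_class, seed=0):
    # Randomly select a fixed number of training images per class;
    # the remainder of each class forms the test set.
    rng = random.Random(seed)
    train, test = [], []
    for label, images in images_by_class.items():
        shuffled = list(images)
        rng.shuffle(shuffled)
        train += [(label, im) for im in shuffled[:train_per_class]]
        test += [(label, im) for im in shuffled[train_per_class:]]
    return train, test

# 10 digit classes with 600 images each, as in CMATERdb 3.1.1
# (integer ids stand in for the pixel arrays).
data = {d: range(600) for d in range(10)}
train, test = split_per_class(data, 400)
print(len(train), len(test))  # 4000 2000
```

Splitting per class rather than over the pooled dataset keeps both sets class-balanced, which matters when per-class accuracies are compared across models.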

Figure 10 shows the training loss of all DCNN models over 250 epochs. It can be observed that FractalNet and DenseNet converge faster than the other networks, and the worst convergence is obtained by the All-Conv Network. The validation accuracy is shown in Figure 11, where DenseNet and FractalNet show the best recognition accuracy among all DCNN models. Finally, the testing accuracy of all the DCNN models is shown in Figure 12. From the results, it can be clearly seen that DenseNet provides the best recognition accuracy compared with the other networks.

4.2. Bangla Handwritten Alphabet-50. In our implementation, the basic fifty alphabet characters, including 11 vowels and 39 consonants, are considered. Samples of the 39 consonant and 11 vowel characters are shown in Figures 13(a) and 13(b), respectively. The Alphabet-50 dataset contains 15,000 samples, of which 12,000 are used for training and the remaining 3000 for testing. Since the dataset contains samples with different dimensions, we rescale all input images to 32 × 32 pixels to better fit the convolutional operations. Some randomly selected samples from this database are shown in Figure 14.

Table 1: Parameter specification in the All-Conv model.

Layers    Operations    Feature maps   Size of feature maps   Size of kernels   Number of parameters
Inputs    —             —              32 × 32 × 3            —                 —
C1        Convolution   128            30 × 30                3 × 3             3456
C2        Convolution   128            28 × 28                3 × 3             147456
S1        Max-pooling   128            14 × 14                2 × 2             NA
C3        Convolution   256            12 × 12                3 × 3             294912
C4        Convolution   256            10 × 10                3 × 3             589824
S2        Max-pooling   256            5 × 5                  2 × 2             NA
C5        Convolution   512            3 × 3                  3 × 3             1179648
C6        Convolution   512            3 × 3                  1 × 1             262144
GAP1      GAP           512            3 × 3                  NA                NA
Outputs   Softmax       10             1 × 1                  NA                5120

Table 2: Statistics of the databases used in our experiments.

Dataset          Training samples   Testing samples   Total samples   Number of classes
Digit-10         4000               2000              6000            10
Alphabet-50      12000              3000              15000           50
SpecialChar-13   2196               935               3131            13

Figure 8: The first row shows the actual Bangla digits, and the second row shows the corresponding Arabic numerals (0–9).

Figure 9: Sample handwritten Bangla numeral images from the CMATERdb 3.1.1 database, including digits from 1 to 10.

Figure 10: Training loss of different architectures (All-Conv, ResNet, FractalNet, NiN, DenseNet, and VGG-16) for Bangla handwritten 1–10 digits.


The training loss for the different DCNN models is shown in Figure 15. It is clear that DenseNet shows the best convergence compared with the other DCNN approaches. Similar to the previous experiment, All-Conv shows the worst convergence behavior. In addition, an unexpected convergence behavior is observed in the case of the NiN model; however, all DCNN models tend to converge after 200 epochs. The corresponding validation accuracy on Alphabet-50 is shown in Figure 16, where DenseNet again shows superior validation accuracy compared with the other DCNN approaches.

Figure 17 shows the testing results on handwritten Alphabet-50. DenseNet shows the best testing accuracy, with a recognition rate of 98.31%. On the other hand, the All-Conv Net provides around 94.31% testing accuracy, which is the lowest among all the DCNN models.

4.3. Bangla Handwritten Special Characters. There are several special characters (SpecialChar-13) that are equivalent representations of vowels combined with consonants to form meaningful words. In our evaluation, we use 13 special characters: 11 for vowels and two additional special characters. Some samples of Bangla special characters are shown in Figure 18. It can be seen that the quality of the samples is poor, and significant variation within the same symbols makes this recognition task even more difficult.

The training loss and validation accuracy for SpecialChar-13 are shown in Figures 19 and 20, respectively. From these results, it can be seen that DenseNet provides better performance, with the lowest loss and the highest validation accuracy among all DCNN models. Figure 21 shows the testing accuracy of the DCNN models on the SpecialChar-13 dataset. It is observed from Figure 21 that DenseNet shows the highest testing accuracy with the lowest training loss, and it converges very fast. On the other hand, the VGG-19 network shows promising recognition accuracy as well.

4.4. Performance Comparison. The testing performance is compared with several existing methods, and the results are presented in Table 3. The experimental results show that the modern DCNN models, including DenseNet, FractalNet, and ResNet, provide better testing accuracy than the other deep learning approaches and the previously proposed classical methods. In particular, DenseNet provides 99.13% testing accuracy for handwritten digit recognition, which is, to the best of our knowledge, the best publicly reported accuracy. In the case of 50-alphabet recognition, DenseNet yields 98.31% recognition accuracy, which is almost 2.5% better than the method in [55]; as far as we know, this is the highest accuracy for handwritten Bangla 50-alphabet recognition. In addition, on the 13-special-character recognition task, the DCNNs show promising recognition accuracy, with DenseNet achieving the best accuracy of 98.18%.

Figure 11: Validation accuracy of different architectures for Bangla handwritten 1–10 digits.

Figure 12: Testing accuracy for Bangla handwritten digit recognition: VGG Net 97.57%, All-Conv 97.08%, NiN 97.36%, ResNet 98.51%, FractalNet 98.92%, and DenseNet 99.13%.

Figure 13: Example images of handwritten characters: (a) Bangla consonant characters and (b) vowels.


4.5. Parameter Evaluation. For an impartial comparison, we have trained and tested the networks with the same optimized numbers of parameters as in the references. Table 4 shows the number of parameters used by the different networks for 50-alphabet recognition. The number of network parameters for digit and special-character recognition was the same, except for the number of neurons in the classification layer.

4.6. Computation Time. We also calculated the computational cost of all the methods, although the computation time depends on the complexity of the architecture. Table 5 presents the computational time per epoch (in seconds) during training of all the networks for the Digit-10, Alphabet-50, and SpecialChar-13 recognition tasks. From Table 5, it can be seen that DenseNet takes the longest time during training due to its dense structure, but it yields the best accuracy.

Figure 14: Randomly selected handwritten characters of the Bangla alphabet from the Bangla handwritten Alphabet-50 dataset.

Figure 15: Training loss of different DCNN models for Bangla handwritten Alphabet-50.

Figure 16: Validation accuracy of different architectures for Bangla handwritten Alphabet-50.


Figure 17: Testing accuracy for handwritten 50-alphabet recognition using different DCNN techniques: VGG Net 97.56%, All-Conv 94.31%, NiN 96.73%, ResNet 97.33%, FractalNet 97.87%, and DenseNet 98.31%.

Figure 18: Randomly selected images of special characters from the dataset.

Figure 19: Training loss of different architectures for Bangla 13 special characters (SpecialChar-13).


5. Conclusions

In this research, we investigated the performance of several popular deep convolutional neural networks (DCNNs) for handwritten Bangla character (i.e., digit, alphabet, and special character) recognition. The experimental results indicated that DenseNet is the best performer in classifying Bangla digits, alphabets, and special characters. Specifically, we achieved recognition rates of 99.13% for handwritten Bangla digits, 98.31% for the handwritten Bangla alphabet, and 98.18% for special character recognition using DenseNet. To the best of our knowledge, these are the best recognition results on the CMATERdb dataset. In the future, fusion-based DCNN models, such as the Inception Recurrent Convolutional Neural Network (IRCNN) [47], will be explored and developed for handwritten Bangla character recognition.

Figure 20: Validation accuracy of different architectures for Bangla 13 special characters (SpecialChar-13).

Figure 21: Testing accuracy of different architectures for Bangla 13 special characters (SpecialChar-13).

Table 4: Comparison of the number of parameters.

Models              Number of parameters
VGG-16 [9]          ∼8.43M
All-Conv Net [10]   ∼2.26M
NiN [11]            ∼2.81M
ResNet [12]         ∼5.63M
FractalNet [13]     ∼7.84M
DenseNet [14]       ∼4.25M

Table 5: Computational time (in seconds) per epoch for the different DCNN models on Digit-10, Alphabet-50, and SpecialChar-13.

Models              Digit-10   Alphabet-50   SpecialChar-13
VGG-16 [9]          32         83            15
All-Conv Net [10]   7          23            4
NiN [11]            9          27            5
ResNet [12]         64         154           34
FractalNet [13]     32         102           18
DenseNet [14]       95         210           58

Table 3: Testing accuracy of the VGG-16 network, All-Conv network, NiN, ResNet, FractalNet, and DenseNet on Digit-10, Alphabet-50, and SpecialChar-13, compared against other existing methods.

Types                 Methods            Digit-10 (%)   Alphabet-50 (%)   SpecialChar-13 (%)
Existing approaches   MLP [7]            96.67          —                 —
                      MPCA+QTLR [56]     98.55          —                 —
                      GA [22]            97.00          —                 —
                      LeNet+DBN [57]     98.64          —                 —
DCNN                  VGG Net [9]        97.57          97.56             96.15
                      All-Conv [10]      97.08          94.31             95.58
                      NiN [11]           97.36          96.73             97.24
                      ResNet [12]        98.51          97.33             97.64
                      FractalNet [13]    98.92          97.87             97.98
                      DenseNet [14]      99.13          98.31             98.18


Data Availability

The data used to support the findings of this study are available at https://code.google.com/archive/p/cmaterdb.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

[1] D. C. Ciresan, U. Meier, L. M. Gambardella, and J. Schmidhuber, "Deep, big, simple neural nets for handwritten digit recognition," Neural Computation, vol. 22, no. 12, pp. 3207–3220, 2010.

[2] U. Meier, D. C. Ciresan, L. M. Gambardella, and J. Schmidhuber, "Better digit recognition with a committee of simple neural nets," in Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), pp. 1250–1254, Beijing, China, September 2011.

[3] W. Song, S. Uchida, and M. Liwicki, "Comparative study of part-based handwritten character recognition methods," in Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), pp. 814–818, Beijing, China, September 2011.

[4] H. A. Khan, A. Al Helal, and K. I. Ahmed, "Handwritten Bangla digit recognition using sparse representation classifier," in Proceedings of the 2014 International Conference on Informatics, Electronics and Vision (ICIEV), pp. 1–6, IEEE, Dhaka, Bangladesh, May 2014.

[5] U. Pal and B. Chaudhuri, "Automatic recognition of unconstrained off-line Bangla handwritten numerals," in Proceedings of Advances in Multimodal Interfaces–ICMI 2000, pp. 371–378, Springer, Beijing, China, October 2000.

[6] N. Das, B. Das, R. Sarkar, S. Basu, M. Kundu, and M. Nasipuri, "Handwritten Bangla basic and compound character recognition using MLP and SVM classifier," 2010, http://arxiv.org/abs/1002.4040.

[7] S. Basu, N. Das, R. Sarkar, M. Kundu, M. Nasipuri, and D. K. Basu, "An MLP based approach for recognition of handwritten Bangla numerals," 2012, http://arxiv.org/abs/1203.0876.

[8] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.

[9] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," 2014, http://arxiv.org/abs/1409.1556.

[10] J. T. Springenberg, A. Dosovitskiy, T. Brox, and M. Riedmiller, "Striving for simplicity: the all convolutional net," 2014, http://arxiv.org/abs/1412.6806.

[11] M. Lin, Q. Chen, and S. Yan, "Network in network," 2013, http://arxiv.org/abs/1312.4400.

[12] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778, Las Vegas, NV, USA, June–July 2016.

[13] G. Larsson, M. Maire, and G. Shakhnarovich, "FractalNet: ultra-deep neural networks without residuals," 2016, http://arxiv.org/abs/1605.07648.

[14] G. Huang, Z. Liu, K. Q. Weinberger, and L. van der Maaten, "Densely connected convolutional networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 1, no. 2, p. 3, Honolulu, HI, USA, July 2017.

[15] B. Chaudhuri and U. Pal, "A complete printed Bangla OCR system," Pattern Recognition, vol. 31, no. 5, pp. 531–549, 1998.

[16] U. Pal, On the development of an optical character recognition (OCR) system for printed Bangla script, Ph.D. dissertation, Indian Statistical Institute, Kolkata, India, 1997.

[17] U. Pal and B. Chaudhuri, "Indian script character recognition: a survey," Pattern Recognition, vol. 37, no. 9, pp. 1887–1899, 2004.

[18] C.-L. Liu and C. Y. Suen, "A new benchmark on the recognition of handwritten Bangla and Farsi numeral characters," Pattern Recognition, vol. 42, no. 12, pp. 3287–3295, 2009.

[19] B. Chaudhuri, "A complete handwritten numeral database of Bangla–a major Indic script," in Proceedings of the Tenth International Workshop on Frontiers in Handwriting Recognition, Suvisoft, La Baule, France, October 2006.

[20] O. Surinta, L. Schomaker, and M. Wiering, "A comparison of feature and pixel-based methods for recognizing handwritten Bangla digits," in Proceedings of the 12th International Conference on Document Analysis and Recognition (ICDAR), pp. 165–169, IEEE, Buffalo, NY, USA, 2013.

[21] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.

[22] N. Das, R. Sarkar, S. Basu, M. Kundu, M. Nasipuri, and D. K. Basu, "A genetic algorithm based region sampling for selection of local features in handwritten digit recognition application," Applied Soft Computing, vol. 12, no. 5, pp. 1592–1606, 2012.

[23] J.-W. Xu, J. Xu, and Y. Lu, "Handwritten Bangla digit recognition using hierarchical Bayesian network," in Proceedings of the 3rd International Conference on Intelligent System and Knowledge Engineering, vol. 1, pp. 1096–1099, IEEE, Xiamen, China, November 2008.

[24] D. E. Rumelhart, J. L. McClelland, and the PDP Research Group, Parallel Distributed Processing, Vol. 1, MIT Press, Cambridge, MA, USA, 1987.

[25] I.-J. Kim and X. Xie, "Handwritten Hangul recognition using deep convolutional neural networks," International Journal on Document Analysis and Recognition, vol. 18, no. 1, pp. 1–13, 2015.

[26] M. M. Rahman, M. Akhand, S. Islam, P. C. Shill, and M. H. Rahman, "Bangla handwritten character recognition using convolutional neural network," International Journal of Image, Graphics and Signal Processing, vol. 7, no. 8, pp. 42–49, 2015.

[27] D. Ciregan and U. Meier, "Multi-column deep neural networks for offline handwritten Chinese character classification," in Proceedings of the 2015 International Joint Conference on Neural Networks (IJCNN), pp. 1–6, IEEE, Killarney, Ireland, July 2015.

[28] A. Voulodimos, N. Doulamis, A. Doulamis, and E. Protopapadakis, "Deep learning for computer vision: a brief review," Computational Intelligence and Neuroscience, vol. 2018, Article ID 7068349, 13 pages, 2018.

[29] G. E. Hinton, S. Osindero, and Y.-W. Teh, "A fast learning algorithm for deep belief nets," Neural Computation, vol. 18, no. 7, pp. 1527–1554, 2006.

[30] P. Vincent, H. Larochelle, Y. Bengio, and P.-A. Manzagol, "Extracting and composing robust features with denoising autoencoders," in Proceedings of the 25th International Conference on Machine Learning, pp. 1096–1103, ACM, Helsinki, Finland, July 2008.

[31] Y. Bengio, "Learning deep architectures for AI," Foundations and Trends® in Machine Learning, vol. 2, no. 1, pp. 1–127, 2009.


[32] J. Kim, T.-T.-H. Le, and H. Kim, "Nonintrusive load monitoring based on advanced deep learning and novel signature," Computational Intelligence and Neuroscience, vol. 2017, Article ID 4216281, 22 pages, 2017.

[33] E. Protopapadakis, A. Voulodimos, A. Doulamis, N. Doulamis, D. Dres, and M. Bimpas, "Stacked autoencoders for outlier detection in over-the-horizon radar signals," Computational Intelligence and Neuroscience, vol. 2017, Article ID 5891417, 11 pages, 2017.

[34] H. Greenspan, B. van Ginneken, and R. M. Summers, "Guest editorial: deep learning in medical imaging: overview and future promise of an exciting new technique," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1153–1159, 2016.

[35] D. Shen, G. Wu, and H.-I. Suk, "Deep learning in medical image analysis," Annual Review of Biomedical Engineering, vol. 19, pp. 221–248, 2017.

[36] H.-C. Shin, H. R. Roth, M. Gao et al., "Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1285–1298, 2016.

[37] S. Sladojevic, M. Arsenovic, A. Anderla, D. Culibrk, and D. Stefanovic, "Deep neural networks based recognition of plant diseases by leaf image classification," Computational Intelligence and Neuroscience, vol. 2016, Article ID 3289801, 11 pages, 2016.

[38] S. P. Mohanty, D. P. Hughes, and M. Salathe, "Using deep learning for image-based plant disease detection," Frontiers in Plant Science, vol. 7, p. 1419, 2016.

[39] G. Wang, Y. Sun, and J. Wang, "Automatic image-based plant disease severity estimation using deep learning," Computational Intelligence and Neuroscience, vol. 2017, Article ID 2917536, 8 pages, 2017.

[40] L. Zhang, L. Zhang, and B. Du, "Deep learning for remote sensing data: a technical tutorial on the state of the art," IEEE Geoscience and Remote Sensing Magazine, vol. 4, no. 2, pp. 22–40, 2016.

[41] A. Romero, C. Gatta, and G. Camps-Valls, "Unsupervised deep feature extraction for remote sensing image classification," IEEE Transactions on Geoscience and Remote Sensing, vol. 54, no. 3, pp. 1349–1362, 2016.

[42] K. Nogueira, O. A. Penatti, and J. A. dos Santos, "Towards better exploiting convolutional neural networks for remote sensing scene classification," Pattern Recognition, vol. 61, pp. 539–556, 2017.

[43] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440, Boston, MA, USA, June 2015.

[44] V. Badrinarayanan, A. Kendall, and R. Cipolla, "SegNet: a deep convolutional encoder-decoder architecture for image segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 12, pp. 2481–2495, 2017.

[45] S. Ji, W. Xu, M. Yang, and K. Yu, "3D convolutional neural networks for human action recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 221–231, 2013.

[46] C. Dong, C. C. Loy, K. He, and X. Tang, "Image super-resolution using deep convolutional networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 2, pp. 295–307, 2016.

[47] C. Farabet, C. Couprie, L. Najman, and Y. LeCun, "Learning hierarchical features for scene labeling," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 8, pp. 1915–1929, 2013.

[48] K. Zhang, Q. Liu, Y. Wu, and M.-H. Yang, "Robust visual tracking via convolutional networks without training," IEEE Transactions on Image Processing, vol. 25, no. 4, pp. 1779–1792, 2016.

[49] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proceedings of Advances in Neural Information Processing Systems, pp. 1097–1105, Lake Tahoe, NV, USA, December 2012.

[50] M. Matsugu, K. Mori, Y. Mitari, and Y. Kaneda, "Subject independent facial expression recognition with robust face detection using a convolutional neural network," Neural Networks, vol. 16, no. 5-6, pp. 555–559, 2003.

[51] C. Szegedy, W. Liu, Y. Jia et al., "Going deeper with convolutions," in Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1–9, Boston, MA, USA, June 2015.

[52] O. Russakovsky, J. Deng, H. Su et al., "ImageNet large scale visual recognition challenge," International Journal of Computer Vision, vol. 115, no. 3, pp. 211–252, 2015.

[53] V. Nair and G. E. Hinton, "Rectified linear units improve restricted Boltzmann machines," in Proceedings of the 27th International Conference on Machine Learning (ICML-10), pp. 807–814, Haifa, Israel, June 2010.

[54] S. Ioffe and C. Szegedy, "Batch normalization: accelerating deep network training by reducing internal covariate shift," in Proceedings of the International Conference on Machine Learning, pp. 448–456, Lille, France, July 2015.

[55] U. Bhattacharya, M. Shridhar, S. K. Parui, P. Sen, and B. Chaudhuri, "Offline recognition of handwritten Bangla characters: an efficient two-stage approach," Pattern Analysis and Applications, vol. 15, no. 4, pp. 445–458, 2012.

[56] N. Das, J. M. Reddy, R. Sarkar et al., "A statistical–topological feature combination for recognition of handwritten numerals," Applied Soft Computing, vol. 12, no. 8, pp. 2486–2495, 2012.

[57] M. Z. Alom, P. Sidike, T. M. Taha, and V. K. Asari, "Handwritten Bangla digit recognition using deep learning," 2017, http://arxiv.org/abs/1705.02680.

Computational Intelligence and Neuroscience 13


where [x0, x1, x2, ..., x(l-1)] is the concatenation of the feature maps from layers 0, ..., l - 1 and Hl(·) produces a single tensor. DenseNet performs three consecutive operations: batch normalization (BN), followed by ReLU and a 3 × 3 convolution. In the transition block, 1 × 1 convolutional operations are performed with BN, followed by a 2 × 2 average pooling layer. This new architecture has achieved state-of-the-art accuracy for object recognition on five different competitive benchmarks.
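The growth pattern described above can be sketched in a few lines of NumPy. This is not the paper's implementation: H below is a hypothetical stand-in for the BN-ReLU-3 × 3-convolution composite (a random channel projection plus ReLU), used only to show how each layer consumes the concatenation of all preceding feature maps.

```python
import numpy as np

def H(x, out_channels=3, rng=np.random.default_rng(0)):
    # Stand-in for BN -> ReLU -> 3x3 conv: any map that emits k new feature maps.
    # Here we just project the channel dimension down to `out_channels` (the growth rate k).
    c, h, w = x.shape
    weights = rng.standard_normal((out_channels, c))
    return np.maximum(np.einsum("oc,chw->ohw", weights, x), 0.0)  # ReLU

def dense_block(x0, num_layers=4, growth_rate=3):
    """Each layer sees the concatenation of ALL preceding feature maps."""
    features = [x0]
    for _ in range(num_layers):
        x = np.concatenate(features, axis=0)   # [x0, x1, ..., x_{l-1}] along channels
        features.append(H(x, growth_rate))
    return np.concatenate(features, axis=0)

out = dense_block(np.ones((3, 8, 8)), num_layers=4, growth_rate=3)
print(out.shape)  # channels grow as 3 + 4*3 = 15
```

Note how the channel count grows linearly with depth (by k per layer), which is why DenseNet can stay parameter-efficient despite the dense connectivity.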

3.2.7. Network Parameters. The number of network parameters is a very important criterion for assessing the complexity of an architecture, and it can be used to compare different architectures. First, the dimension of the output feature map can be computed as

M = (N - F)/S + 1, (6)

where N denotes the dimension of the input feature maps, F refers to the dimension of the filters (receptive field), S represents the stride of the convolution, and M is the dimension of the output feature maps. The number of parameters (without biases) for a single layer is obtained by

Pl = (F × F × FM(l-1)) × FMl, (7)

where Pl represents the total number of parameters in the l-th layer, FMl is the total number of output feature maps of the l-th layer, and FM(l-1) is the total number of feature maps in the (l - 1)-th layer. For example, let a 32 × 32 dimensional (N) image be the input. If the size of the filter (F) is 5 × 5 and the stride (S) is 1, the output dimension (M) of the convolutional layer is 28 × 28, calculated according to (6). For better illustration, a summary of the parameters used in the All-Conv architecture is shown in Table 1. Note that the number of trainable parameters is zero in the pooling layers.
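Equations (6) and (7) are easy to verify programmatically; the short sketch below reproduces the worked example and the first rows of Table 1 (the function names are ours, not from the paper):

```python
def conv_output_dim(n, f, s=1):
    # Equation (6): M = (N - F) / S + 1
    return (n - f) // s + 1

def conv_params(f, fm_in, fm_out):
    # Equation (7): P_l = (F * F * FM_{l-1}) * FM_l, biases ignored
    return f * f * fm_in * fm_out

# Worked example from the text: 32x32 input, 5x5 filter, stride 1 -> 28x28 output.
print(conv_output_dim(32, 5, 1))   # 28
# First two convolutional layers of the All-Conv model (Table 1):
print(conv_params(3, 3, 128))      # C1: 3456
print(conv_params(3, 128, 128))    # C2: 147456
```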

4. Results and Discussion

The entire experiment was performed on a desktop computer with an Intel® Core-i7 CPU @ 3.33 GHz and 56.00 GB of memory, using Keras with Theano as the backend in a Linux environment. We evaluate the state-of-the-art DCNN models on three datasets from CMATERdb (available at https://code.google.com/archive/p/cmaterdb) containing Bangla handwritten digits, alphabets, and special characters.

Figure 7: A 4-layer dense block with a growth rate of k = 3. Each layer takes all of the preceding feature maps as input (inputs pass through BN-ReLU-Conv blocks H1, H2, H3 producing x1, x2, x3, followed by a transition layer).

Figure 6: An example of the FractalNet architecture (a fractal block built from convolution layers, joining layers, and a pooling layer).


The statistics of the three datasets used in this paper are summarized in Table 2. For convenience, we name the datasets Digit-10, Alphabet-50, and SpecialChar-13, respectively. All images are rescaled to 32 × 32 pixels in our experiments.

4.1. Bangla Handwritten Digit Dataset. The standard samples of the numerals with the corresponding Arabic numerals are shown in Figure 8. The performance of both DBN and CNN is evaluated on a Bangla handwritten benchmark dataset called CMATERdb 3.1.1 [22]. This dataset contains 6000 images of unconstrained, handwritten, isolated Bangla numerals. Each digit has 600 images, which are rescaled to 32 × 32 pixels. Some sample images from the database are shown in Figure 9. Visual inspection shows that there is no visible noise; however, variability in writing style is quite high due to writer dependency. In our experiments, the dataset is split into a training set and a test set for the evaluation of the different DCNN models. The training set consists of 4000 images (400 randomly selected images of each digit); the remaining 2000 images are used for testing.
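The split described above (400 randomly selected images per digit for training, the rest for testing) can be sketched as follows; the helper name and the use of Python's random module are our assumptions, not details from the paper:

```python
import random

def split_per_class(samples, labels, per_class_train=400, seed=1):
    """Randomly pick `per_class_train` samples of each class for training;
    the rest go to the test set (mirrors the 4000/2000 split described above)."""
    by_class = {}
    for s, y in zip(samples, labels):
        by_class.setdefault(y, []).append(s)
    rng = random.Random(seed)
    train, test = [], []
    for y, items in by_class.items():
        rng.shuffle(items)
        train += [(s, y) for s in items[:per_class_train]]
        test += [(s, y) for s in items[per_class_train:]]
    return train, test

# Toy check with 600 dummy samples for each of the 10 digit classes:
samples = [i for d in range(10) for i in range(600)]
labels = [d for d in range(10) for _ in range(600)]
train, test = split_per_class(samples, labels)
print(len(train), len(test))  # 4000 2000
```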

Figure 10 shows the training loss of all DCNN models over 250 epochs. It can be observed that FractalNet and DenseNet converge faster than the other networks, while the worst convergence is observed for the All-Conv network. The validation accuracy is shown in Figure 11, where DenseNet and FractalNet show the best recognition accuracy among all DCNN models. Finally, the testing accuracy of all the DCNN models is shown in Figure 12. From these results, it can be clearly seen that DenseNet provides the best recognition accuracy compared with the other networks.

4.2. Bangla Handwritten Alphabet-50. In our implementation, the basic fifty alphabets, including 11 vowels and 39 consonants, are considered. Samples of the 39 consonant and 11 vowel characters are shown in Figures 13(a) and 13(b), respectively. The Alphabet-50 dataset contains 15000 samples, of which 12000 are used for training and the remaining 3000 for testing. Since the dataset contains samples of different dimensions, we rescale all input images to 32 × 32 pixels to better fit the convolutional operation. Some randomly selected samples from this database are shown in Figure 14.
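Rescaling samples of varying dimensions to 32 × 32 can be sketched with a simple nearest-neighbour resize; the interpolation method actually used in the original pipeline is not specified in the paper, so this is only an illustrative stand-in:

```python
import numpy as np

def rescale_nearest(img, size=32):
    """Nearest-neighbour rescale of a 2-D grayscale array to size x size.
    (A stand-in for whatever interpolation the original pipeline used.)"""
    h, w = img.shape
    rows = np.arange(size) * h // size   # source row index for each output row
    cols = np.arange(size) * w // size   # source column index for each output column
    return img[np.ix_(rows, cols)]

img = np.arange(64 * 48).reshape(64, 48)   # a sample with non-square dimensions
print(rescale_nearest(img).shape)  # (32, 32)
```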

Table 1: Parameter specification of the All-Conv model.

Layers  | Operations  | Feature maps | Size of feature maps | Size of kernels | Number of parameters
Inputs  | -           | -            | 32 × 32 × 3          | -               | -
C1      | Convolution | 128          | 30 × 30              | 3 × 3           | 3456
C2      | Convolution | 128          | 28 × 28              | 3 × 3           | 147456
S1      | Max-pooling | 128          | 14 × 14              | 2 × 2           | N/A
C3      | Convolution | 256          | 12 × 12              | 3 × 3           | 294912
C4      | Convolution | 256          | 10 × 10              | 3 × 3           | 589824
S2      | Max-pooling | 256          | 5 × 5                | 2 × 2           | N/A
C5      | Convolution | 512          | 3 × 3                | 3 × 3           | 1179648
C6      | Convolution | 512          | 3 × 3                | 1 × 1           | 262144
GAP1    | GAP         | 512          | 3 × 3                | N/A             | N/A
Outputs | Softmax     | 10           | 1 × 1                | N/A             | 5120

Table 2: Statistics of the databases used in our experiment.

Dataset        | Training samples | Testing samples | Total samples | Number of classes
Digit-10       | 4000             | 2000            | 6000          | 10
Alphabet-50    | 12000            | 3000            | 15000         | 50
SpecialChar-13 | 2196             | 935             | 3131          | 13

Figure 8: The first row shows the actual Bangla digits, and the second row shows the corresponding Arabic numerals (0-9).

Figure 9: Sample handwritten Bangla numeral images from the CMATERdb 3.1.1 database, including digits from 1 to 10.

Figure 10: Training loss (versus number of epochs) of different architectures for Bangla handwritten digits 1-10: All-Conv, VGG-16, NiN, ResNet, FractalNet, and DenseNet.


The training loss of the different DCNN models is shown in Figure 15. It is clear that DenseNet shows the best convergence compared with the other DCNN approaches. As in the previous experiment, All-Conv shows the worst convergence behavior. In addition, unexpected convergence behavior is observed in the case of the NiN model. However, all DCNN models tend to converge after 200 epochs. The corresponding validation accuracy on Alphabet-50 is shown in Figure 16; DenseNet again shows superior validation accuracy compared with the other DCNN approaches.

Figure 17 shows the testing results on handwritten Alphabet-50. DenseNet shows the best testing accuracy, with a recognition rate of 98.31%. On the other hand, the All-Conv network provides around 94.31% testing accuracy, which is the lowest among all the DCNN models.

4.3. Bangla Handwritten Special Characters. There are several special characters (SpecialChar-13), which are equivalent representations of vowels that are combined with consonants to form meaningful words. In our evaluation, we use 13 special characters: modifiers for the 11 vowels and two additional special characters. Some samples of Bangla special characters are shown in Figure 18. It can be seen that the quality of the samples is poor, and significant variation within the same symbol makes this recognition task even more difficult.

The training loss and validation accuracy for SpecialChar-13 are shown in Figures 19 and 20, respectively. From these results, it can be seen that DenseNet provides the best performance, with the lowest loss and the highest validation accuracy among all DCNN models. Figure 21 shows the testing accuracy of the DCNN models on the SpecialChar-13 dataset. It is observed from Figure 21 that DenseNet shows the highest testing accuracy with the lowest training loss, and it converges very fast. The VGG-19 network also shows promising recognition accuracy.

4.4. Performance Comparison. The testing performance is compared with several existing methods; the results are presented in Table 3. The experimental results show that the modern DCNN models, including DenseNet, FractalNet, and ResNet, provide better testing accuracy than the other deep learning approaches and the previously proposed classical methods. Overall, DenseNet provides 99.13% testing accuracy for handwritten digit recognition, which is, to the best of our knowledge, the best publicly reported accuracy. In the case of 50-alphabet recognition, DenseNet yields 98.31% recognition accuracy, which is almost 2.5% better than the method in [55]. As far as we know, this is the highest accuracy for handwritten Bangla 50-alphabet recognition. In addition, on the 13-special-character recognition task, the DCNNs show promising recognition accuracy; in particular, DenseNet achieves the best accuracy, 98.18%.

Figure 11: Validation accuracy (versus number of epochs) of different architectures for Bangla handwritten digits 1-10.

Figure 12: Testing accuracy for Bangla handwritten digit recognition: VGG Net 97.57%, All-Conv 97.08%, NiN 97.36%, ResNet 98.51%, FractalNet 98.92%, and DenseNet 99.13%.

Figure 13: Example images of handwritten characters: (a) Bangla consonant characters and (b) vowels.


4.5. Parameter Evaluation. For a fair comparison, we trained and tested the networks with the same optimized numbers of parameters as in the reference implementations. Table 4 shows the number of parameters used by the different networks for 50-alphabet recognition. The numbers of network parameters for digit and special-character recognition were the same, except for the number of neurons in the classification layer.

4.6. Computation Time. We also measured the computational cost of all methods, although computation time depends on the complexity of the architecture. Table 5 presents the computational time per epoch (in seconds) during training of all the networks for the Digit-10, Alphabet-50, and SpecialChar-13 recognition tasks. From Table 5, it can be seen that DenseNet takes the longest training time due to its dense structure but yields the best accuracy.

Figure 14: Randomly selected handwritten characters of the Bangla alphabet from the Bangla handwritten Alphabet-50 dataset.

Figure 15: Training loss (versus number of epochs) of different DCNN models for Bangla handwritten Alphabet-50.

Figure 16: Validation accuracy (versus number of epochs) of different architectures for Bangla handwritten Alphabet-50.


Figure 17: Testing accuracy for handwritten 50-alphabet recognition using different DCNN techniques: VGG Net 97.56%, All-Conv 94.31%, NiN 96.73%, ResNet 97.33%, FractalNet 97.87%, and DenseNet 98.31%.

Figure 18: Randomly selected images of special characters from the dataset.

Figure 19: Training loss (versus number of epochs) of different architectures for the Bangla 13 special characters (SpecialChar-13).


5. Conclusions

In this research, we investigated the performance of several popular deep convolutional neural networks (DCNNs) for handwritten Bangla character (e.g., digit, alphabet, and special character) recognition. The experimental results indicate that DenseNet is the best performer in classifying Bangla digits, alphabets, and special characters. Specifically, we achieved recognition rates of 99.13% for handwritten Bangla digits, 98.31% for the handwritten Bangla alphabet, and 98.18% for special character recognition using DenseNet. To the best of our knowledge, these are the best recognition results on the CMATERdb dataset. In the future, fusion-based DCNN models, such as the Inception Recurrent Convolutional Neural Network (IRCNN) [47], will be explored and developed for handwritten Bangla character recognition.

Figure 20: Validation accuracy (versus number of epochs) of different architectures for the Bangla 13 special characters (SpecialChar-13).

Figure 21: Testing accuracy of different architectures for the Bangla 13 special characters (SpecialChar-13): VGG Net, All-Conv, NiN, ResNet, FractalNet, and DenseNet.

Table 4: Comparison of the number of parameters.

Models            | Number of parameters
VGG-16 [9]        | ~8.43M
All-Conv Net [10] | ~2.26M
NiN [11]          | ~2.81M
ResNet [12]       | ~5.63M
FractalNet [13]   | ~7.84M
DenseNet [14]     | ~4.25M

Table 5: Computational time (in seconds) per epoch for different DCNN models on Digit-10, Alphabet-50, and SpecialChar-13.

Models            | Digit-10 | Alphabet-50 | SpecialChar-13
VGG-16 [9]        | 32       | 83          | 15
All-Conv Net [10] | 7        | 23          | 4
NiN [11]          | 9        | 27          | 5
ResNet [12]       | 64       | 154         | 34
FractalNet [13]   | 32       | 102         | 18
DenseNet [14]     | 95       | 210         | 58

Table 3: Testing accuracy of the VGG-16 network, All-Conv network, NiN, ResNet, FractalNet, and DenseNet on Digit-10, Alphabet-50, and SpecialChar-13, compared against other existing methods.

Types               | Methods         | Digit-10 (%) | Alphabet-50 (%) | SpecialChar-13 (%)
Existing approaches | MLP [7]         | 96.67        | -               | -
Existing approaches | MPCA+QTLR [56]  | 98.55        | -               | -
Existing approaches | GA [22]         | 97.00        | -               | -
Existing approaches | LeNet+DBN [57]  | 98.64        | -               | -
DCNN                | VGGNet [9]      | 97.57        | 97.56           | 96.15
DCNN                | All-Conv [10]   | 97.08        | 94.31           | 95.58
DCNN                | NiN [11]        | 97.36        | 96.73           | 97.24
DCNN                | ResNet [12]     | 98.51        | 97.33           | 97.64
DCNN                | FractalNet [13] | 98.92        | 97.87           | 97.98
DCNN                | DenseNet [14]   | 99.13        | 98.31           | 98.18


Data Availability

The data used to support the findings of this study are available at https://code.google.com/archive/p/cmaterdb.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

[1] D. C. Ciresan, U. Meier, L. M. Gambardella, and J. Schmidhuber, "Deep, big, simple neural nets for handwritten digit recognition," Neural Computation, vol. 22, no. 12, pp. 3207–3220, 2010.

[2] U. Meier, D. C. Ciresan, L. M. Gambardella, and J. Schmidhuber, "Better digit recognition with a committee of simple neural nets," in Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), pp. 1250–1254, Beijing, China, September 2011.

[3] W. Song, S. Uchida, and M. Liwicki, "Comparative study of part-based handwritten character recognition methods," in Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), pp. 814–818, Beijing, China, September 2011.

[4] H. A. Khan, A. Al Helal, and K. I. Ahmed, "Handwritten Bangla digit recognition using sparse representation classifier," in Proceedings of the 2014 International Conference on Informatics, Electronics and Vision (ICIEV), pp. 1–6, IEEE, Dhaka, Bangladesh, May 2014.

[5] U. Pal and B. Chaudhuri, "Automatic recognition of unconstrained off-line Bangla handwritten numerals," in Proceedings of Advances in Multimodal Interfaces–ICMI 2000, pp. 371–378, Springer, Beijing, China, October 2000.

[6] N. Das, B. Das, R. Sarkar, S. Basu, M. Kundu, and M. Nasipuri, "Handwritten Bangla basic and compound character recognition using MLP and SVM classifier," 2010, http://arxiv.org/abs/1002.4040.

[7] S. Basu, N. Das, R. Sarkar, M. Kundu, M. Nasipuri, and D. K. Basu, "An MLP based approach for recognition of handwritten Bangla numerals," 2012, http://arxiv.org/abs/1203.0876.

[8] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.

[9] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," 2014, http://arxiv.org/abs/1409.1556.

[10] J. T. Springenberg, A. Dosovitskiy, T. Brox, and M. Riedmiller, "Striving for simplicity: the all convolutional net," 2014, http://arxiv.org/abs/1412.6806.

[11] M. Lin, Q. Chen, and S. Yan, "Network in network," 2013, http://arxiv.org/abs/1312.4400.

[12] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778, Las Vegas, NV, USA, June–July 2016.

[13] G. Larsson, M. Maire, and G. Shakhnarovich, "FractalNet: ultra-deep neural networks without residuals," 2016, http://arxiv.org/abs/1605.07648.

[14] G. Huang, Z. Liu, K. Q. Weinberger, and L. van der Maaten, "Densely connected convolutional networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 1, no. 2, p. 3, Honolulu, HI, USA, July 2017.

[15] B. Chaudhuri and U. Pal, "A complete printed Bangla OCR system," Pattern Recognition, vol. 31, no. 5, pp. 531–549, 1998.

[16] U. Pal, On the development of an optical character recognition (OCR) system for printed Bangla script, Ph.D. dissertation, Indian Statistical Institute, Kolkata, India, 1997.

[17] U. Pal and B. Chaudhuri, "Indian script character recognition: a survey," Pattern Recognition, vol. 37, no. 9, pp. 1887–1899, 2004.

[18] C.-L. Liu and C. Y. Suen, "A new benchmark on the recognition of handwritten Bangla and Farsi numeral characters," Pattern Recognition, vol. 42, no. 12, pp. 3287–3295, 2009.

[19] B. Chaudhuri, "A complete handwritten numeral database of Bangla–a major Indic script," in Proceedings of the Tenth International Workshop on Frontiers in Handwriting Recognition, Suvisoft, La Baule, France, October 2006.

[20] O. Surinta, L. Schomaker, and M. Wiering, "A comparison of feature and pixel-based methods for recognizing handwritten Bangla digits," in Proceedings of the 12th International Conference on Document Analysis and Recognition (ICDAR), pp. 165–169, IEEE, Buffalo, NY, USA, 2013.

[21] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.

[22] N. Das, R. Sarkar, S. Basu, M. Kundu, M. Nasipuri, and D. K. Basu, "A genetic algorithm based region sampling for selection of local features in handwritten digit recognition application," Applied Soft Computing, vol. 12, no. 5, pp. 1592–1606, 2012.

[23] J.-W. Xu, J. Xu, and Y. Lu, "Handwritten Bangla digit recognition using hierarchical Bayesian network," in Proceedings of the 3rd International Conference on Intelligent System and Knowledge Engineering, vol. 1, pp. 1096–1099, IEEE, Xiamen, China, November 2008.

[24] D. E. Rumelhart, J. L. McClelland, and the PDP Research Group, Parallel Distributed Processing, vol. 1, MIT Press, Cambridge, MA, USA, 1987.

[25] I.-J. Kim and X. Xie, "Handwritten Hangul recognition using deep convolutional neural networks," International Journal on Document Analysis and Recognition, vol. 18, no. 1, pp. 1–13, 2015.

[26] M. M. Rahman, M. Akhand, S. Islam, P. C. Shill, and M. H. Rahman, "Bangla handwritten character recognition using convolutional neural network," International Journal of Image, Graphics and Signal Processing, vol. 7, no. 8, pp. 42–49, 2015.

[27] D. Ciregan and U. Meier, "Multi-column deep neural networks for offline handwritten Chinese character classification," in Proceedings of the 2015 International Joint Conference on Neural Networks (IJCNN), pp. 1–6, IEEE, Killarney, Ireland, July 2015.

[28] A. Voulodimos, N. Doulamis, A. Doulamis, and E. Protopapadakis, "Deep learning for computer vision: a brief review," Computational Intelligence and Neuroscience, vol. 2018, Article ID 7068349, 13 pages, 2018.

[29] G. E. Hinton, S. Osindero, and Y.-W. Teh, "A fast learning algorithm for deep belief nets," Neural Computation, vol. 18, no. 7, pp. 1527–1554, 2006.

[30] P. Vincent, H. Larochelle, Y. Bengio, and P.-A. Manzagol, "Extracting and composing robust features with denoising autoencoders," in Proceedings of the 25th International Conference on Machine Learning, pp. 1096–1103, ACM, Helsinki, Finland, July 2008.

[31] Y. Bengio, "Learning deep architectures for AI," Foundations and Trends® in Machine Learning, vol. 2, no. 1, pp. 1–127, 2009.

[32] J. Kim, T.-T.-H. Le, and H. Kim, "Nonintrusive load monitoring based on advanced deep learning and novel signature," Computational Intelligence and Neuroscience, vol. 2017, Article ID 4216281, 22 pages, 2017.

[33] E. Protopapadakis, A. Voulodimos, A. Doulamis, N. Doulamis, D. Dres, and M. Bimpas, "Stacked autoencoders for outlier detection in over-the-horizon radar signals," Computational Intelligence and Neuroscience, vol. 2017, Article ID 5891417, 11 pages, 2017.

[34] H. Greenspan, B. van Ginneken, and R. M. Summers, "Guest editorial: deep learning in medical imaging: overview and future promise of an exciting new technique," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1153–1159, 2016.

[35] D. Shen, G. Wu, and H.-I. Suk, "Deep learning in medical image analysis," Annual Review of Biomedical Engineering, vol. 19, pp. 221–248, 2017.

[36] H.-C. Shin, H. R. Roth, M. Gao, et al., "Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1285–1298, 2016.

[37] S. Sladojevic, M. Arsenovic, A. Anderla, D. Culibrk, and D. Stefanovic, "Deep neural networks based recognition of plant diseases by leaf image classification," Computational Intelligence and Neuroscience, vol. 2016, Article ID 3289801, 11 pages, 2016.

[38] S. P. Mohanty, D. P. Hughes, and M. Salathe, "Using deep learning for image-based plant disease detection," Frontiers in Plant Science, vol. 7, p. 1419, 2016.

[39] G. Wang, Y. Sun, and J. Wang, "Automatic image-based plant disease severity estimation using deep learning," Computational Intelligence and Neuroscience, vol. 2017, Article ID 2917536, 8 pages, 2017.

[40] L. Zhang, L. Zhang, and B. Du, "Deep learning for remote sensing data: a technical tutorial on the state of the art," IEEE Geoscience and Remote Sensing Magazine, vol. 4, no. 2, pp. 22–40, 2016.

[41] A. Romero, C. Gatta, and G. Camps-Valls, "Unsupervised deep feature extraction for remote sensing image classification," IEEE Transactions on Geoscience and Remote Sensing, vol. 54, no. 3, pp. 1349–1362, 2016.

[42] K. Nogueira, O. A. Penatti, and J. A. dos Santos, "Towards better exploiting convolutional neural networks for remote sensing scene classification," Pattern Recognition, vol. 61, pp. 539–556, 2017.

[43] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440, Boston, MA, USA, June 2015.

[44] V. Badrinarayanan, A. Kendall, and R. Cipolla, "SegNet: a deep convolutional encoder-decoder architecture for image segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 12, pp. 2481–2495, 2017.

[45] S. Ji, W. Xu, M. Yang, and K. Yu, "3D convolutional neural networks for human action recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 221–231, 2013.

[46] C. Dong, C. C. Loy, K. He, and X. Tang, "Image super-resolution using deep convolutional networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 2, pp. 295–307, 2016.

[47] C. Farabet, C. Couprie, L. Najman, and Y. LeCun, "Learning hierarchical features for scene labeling," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 8, pp. 1915–1929, 2013.

[48] K. Zhang, Q. Liu, Y. Wu, and M.-H. Yang, "Robust visual tracking via convolutional networks without training," IEEE Transactions on Image Processing, vol. 25, no. 4, pp. 1779–1792, 2016.

[49] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proceedings of Advances in Neural Information Processing Systems, pp. 1097–1105, Lake Tahoe, NV, USA, December 2012.

[50] M. Matsugu, K. Mori, Y. Mitari, and Y. Kaneda, "Subject independent facial expression recognition with robust face detection using a convolutional neural network," Neural Networks, vol. 16, no. 5-6, pp. 555–559, 2003.

[51] C. Szegedy, W. Liu, Y. Jia, et al., "Going deeper with convolutions," in Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1–9, Boston, MA, USA, June 2015.

[52] O. Russakovsky, J. Deng, H. Su, et al., "ImageNet large scale visual recognition challenge," International Journal of Computer Vision, vol. 115, no. 3, pp. 211–252, 2015.

[53] V. Nair and G. E. Hinton, "Rectified linear units improve restricted Boltzmann machines," in Proceedings of the 27th International Conference on Machine Learning (ICML-10), pp. 807–814, Haifa, Israel, June 2010.

[54] S. Ioffe and C. Szegedy, "Batch normalization: accelerating deep network training by reducing internal covariate shift," in Proceedings of the International Conference on Machine Learning, pp. 448–456, Lille, France, July 2015.

[55] U. Bhattacharya, M. Shridhar, S. K. Parui, P. Sen, and B. Chaudhuri, "Offline recognition of handwritten Bangla characters: an efficient two-stage approach," Pattern Analysis and Applications, vol. 15, no. 4, pp. 445–458, 2012.

[56] N. Das, J. M. Reddy, R. Sarkar, et al., "A statistical–topological feature combination for recognition of handwritten numerals," Applied Soft Computing, vol. 12, no. 8, pp. 2486–2495, 2012.

[57] M. Z. Alom, P. Sidike, T. M. Taha, and V. K. Asari, "Handwritten Bangla digit recognition using deep learning," 2017, http://arxiv.org/abs/1705.02680.



(e statistics of three datasets used in this paper aresummarized in Table 2 For convenience we named thedatasets as Digit-10 Alphabet-50 and SpecialChar-13 re-spectively All images are rescaled to 32 times 32 pixels in ourexperiment

41BanglaHandwrittenDigitDataset (e standard samplesof the numeral with respective Arabic numerals are shown inFigure 8 (e performance of both DBN and CNN isevaluated on a Bangla handwritten benchmark dataset calledCMATERdb 311 [22] (is dataset contains 6000 images ofunconstrained handwritten isolated Bangla numerals Eachdigit has 600 images that are rescaled to 32 times 32 pixels Somesample images in the database are shown in Figure 9 Visualinspection depicts that there is no visible noise Howevervariability in writing style is quite high due to user de-pendency In our experiments the dataset is split intoa training set and a test set for the evaluation of differentDCNN models (e training set consists of 4000 images(400 randomly selected images of each digit) (e rest of the2000 images are used for testing

Figure 10 shows the training loss of all DCNN modelsduring 250 epochs It can be observed that FractalNet andDenseNet converge faster compared with other networksand worst convergence is obtained to be for the All-ConvNetwork (e validation accuracy is shown in Figure 11where DenseNet and FractalNet show better recognitionaccuracy among all DCNN models Finally the testing ac-curacy of all the DCNNmodels is shown in Figure 12 Fromthe result it can be clearly seen that DenseNet provides thebest recognition accuracy compared with other networks

42 Bangla Handwritten Alphabet-50 In our implementa-tion the basic fifty alphabets including 11 vowels and 39consonants are considered (e samples of 39-consonantand 11-vowel characters are shown in Figures 13(a) and13(b) respectively (e Alphabet-50 dataset contains 15000samples where 12000 are used for training and theremaining 3000 samples are used for testing Since thedataset contains samples with different dimensions we

rescale all input images to 32 times 32 pixels for better fitting tothe convolutional operation Some randomly selectedsamples from this database are shown in Figure 14

Table 1 Parameter specification in All-Conv model

Layers Operations Featuremaps

Size offeaturemaps

Size ofkernels

Number ofparameters

Inputs 32times 32times 3C1 Convolution 128 30times 30 3times 3 3456C2 Convolution 128 28times 28 3times 3 147456

S1Max-pooling 128 14times14 2times 2 NA

C3 Convolution 256 12times12 3times 3 294912C4 Convolution 256 10times10 3times 3 589824

S2Max-pooling 256 5times 5 2times 2 NA

C5 Convolution 512 3times 3 3times 3 1179648C6 Convolution 512 3times 3 1times 1 262144GAP1 GAP 512 3times 3 NA NAOutputs Softmax 10 1x1 NA 5120

Table 2: Statistics of the databases used in our experiments.

Dataset        | Training samples | Testing samples | Total samples | Number of classes
Digit-10       | 4000             | 2000            | 6000          | 10
Alphabet-50    | 12000            | 3000            | 15000         | 50
SpecialChar-13 | 2196             | 935             | 3131          | 13

Figure 8: First row shows the actual Bangla digits, and the second row shows the corresponding Arabic numerals (0–9).

Figure 9: Sample handwritten Bangla numeral images from the CMATERdb 3.1.1 database, including digits from 1 to 10.

Figure 10: Training loss of different architectures for Bangla handwritten 1–10 digits (legend: All-Conv, ResNet, FractalNet, NiN, DenseNet, VGG-16; x-axis: number of epochs).

Computational Intelligence and Neuroscience 7

The training loss for the different DCNN models is shown in Figure 15. It is clear that DenseNet shows the best convergence compared with the other DCNN approaches. Similar to the previous experiment, All-Conv shows the worst convergence behavior. In addition, an unexpected convergence behavior is observed in the case of the NiN model. However, all DCNN models tend to converge after 200 epochs. The corresponding validation accuracy on Alphabet-50 is shown in Figure 16, where DenseNet again shows superior validation accuracy compared with the other DCNN approaches.

Figure 17 shows the testing results on handwritten Alphabet-50. DenseNet shows the best testing accuracy, with a recognition rate of 98.31%. On the other hand, the All-Conv network provides around 94.31% testing accuracy, which is the lowest among all the DCNN models.

4.3. Bangla Handwritten Special Characters. There are several special characters (SpecialChar-13) that are equivalent representations of vowels and are combined with consonants to form meaningful words. In our evaluation, we use 13 special characters: those for the 11 vowels plus two additional special characters. Some samples of Bangla special characters are shown in Figure 18. It can be seen that the quality of the samples is poor, and significant variation within the same symbol makes this recognition task even more difficult.

The training loss and validation accuracy for SpecialChar-13 are shown in Figures 19 and 20, respectively. From these results, it can be seen that DenseNet provides the best performance, with the lowest loss and the highest validation accuracy among all DCNN models. Figure 21 shows the testing accuracy of the DCNN models on the SpecialChar-13 dataset. It is observed from Figure 21 that DenseNet shows the highest testing accuracy, with the lowest training loss and very fast convergence. The VGG-19 network shows promising recognition accuracy as well.

4.4. Performance Comparison. The testing performance is compared with several existing methods, and the results are presented in Table 3. The experimental results show that the modern DCNN models, including DenseNet, FractalNet, and ResNet, provide better testing accuracy than the other deep learning approaches and the previously proposed classical methods. Specifically, DenseNet achieves 99.13% testing accuracy for handwritten digit recognition, which, to the best of our knowledge, is the best accuracy that has been publicly reported. In the case of 50-alphabet recognition, DenseNet yields 98.31% recognition accuracy, which is almost 2.5% better than the method in [55]; as far as we know, this is the highest accuracy for handwritten Bangla 50-alphabet recognition. In addition, on the 13-special-character recognition task, the DCNNs show promising recognition accuracy, and DenseNet again achieves the best accuracy, 98.18%.

Figure 11: Validation accuracy of different architectures for Bangla handwritten 1–10 digits (legend: All-Conv, ResNet, VGG-16, NiN, DenseNet, FractalNet; x-axis: number of epochs).

Figure 12: Testing accuracy for Bangla handwritten digit recognition: VGG Net 97.57%, All-Conv 97.08%, NiN 97.36%, ResNet 98.51%, FractalNet 98.92%, DenseNet 99.13%.

Figure 13: Example images of handwritten characters: (a) Bangla consonant characters and (b) vowels.


4.5. Parameter Evaluation. For an impartial comparison, we trained and tested the networks with the same optimized numbers of parameters as in the respective references. Table 4 shows the number of parameters used by the different networks for 50-alphabet recognition. The numbers of network parameters for digit and special-character recognition were the same, except for the number of neurons in the classification layer.

4.6. Computation Time. We also measured the computational cost of all methods, although the computation time depends on the complexity of the architecture. Table 5 presents the computation time per epoch (in seconds) during training of all the networks on the Digit-10, Alphabet-50, and SpecialChar-13 recognition tasks. From Table 5, it can be seen that DenseNet takes the longest time during training due to its dense structure, but it yields the best accuracy.
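Per-epoch wall-clock figures like those in Table 5 can be collected by timing the batch loop of one epoch. This is a hedged sketch of the measurement, not the authors' harness; `train_step` and the toy batches are illustrative placeholders.

```python
import time

def timed_epoch(train_step, batches):
    """Return the wall-clock seconds spent running one training epoch,
    where train_step is any callable that processes a single batch."""
    start = time.perf_counter()
    for batch in batches:
        train_step(batch)
    return time.perf_counter() - start

# toy stand-in for a real training step over 1000 small batches
elapsed = timed_epoch(lambda b: sum(b), [[1, 2, 3]] * 1000)
print(elapsed >= 0.0)  # True
```

Averaging this measurement over several epochs (after a warm-up epoch) gives a more stable per-epoch estimate, since the first epoch often pays one-time setup costs.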

Figure 14: Randomly selected handwritten characters of the Bangla alphabet from the Bangla handwritten Alphabet-50 dataset.

Figure 15: Training loss of different DCNN models for Bangla handwritten Alphabet-50 (legend: All-Conv, VGG, DenseNet, ResNet, NiN, FractalNet; x-axis: number of epochs).

Figure 16: Validation accuracy of different architectures for Bangla handwritten Alphabet-50 (legend: ResNet, All-Conv, VGG, NiN, DenseNet, FractalNet; x-axis: number of epochs).


Figure 17: Testing accuracy for handwritten 50-alphabet recognition using different DCNN techniques: VGG Net 97.56%, All-Conv 94.31%, NiN 96.73%, ResNet 97.33%, FractalNet 97.87%, DenseNet 98.31%.

Figure 18: Randomly selected images of special characters from the dataset.

Figure 19: Training loss of different architectures for the 13 Bangla special characters (SpecialChar-13) (legend: All-Conv, ResNet, FractalNet, NiN, DenseNet, VGG-19; x-axis: number of epochs).


5. Conclusions

In this research, we investigated the performance of several popular deep convolutional neural networks (DCNNs) for handwritten Bangla character (e.g., digit, alphabet, and special character) recognition. The experimental results indicated that DenseNet is the best performer in classifying Bangla digits, alphabets, and special characters. Specifically, using DenseNet we achieved recognition rates of 99.13% for handwritten Bangla digits, 98.31% for the handwritten Bangla alphabet, and 98.18% for special-character recognition. To the best of our knowledge, these are the best recognition results on the CMATERdb dataset. In the future, fusion-based DCNN models, such as the Inception Recurrent Convolutional Neural Network (IRCNN) [47], will be explored and developed for handwritten Bangla character recognition.

Figure 20: Validation accuracy of different architectures for the 13 Bangla special characters (SpecialChar-13) (legend: All-Conv, NiN, ResNet, FractalNet, DenseNet, VGG; x-axis: epochs).

Figure 21: Testing accuracy of different architectures for the 13 Bangla special characters (SpecialChar-13).

Table 4: Comparison of the number of parameters.

Models            | Number of parameters
VGG-16 [9]        | ~8.43M
All-Conv Net [10] | ~2.26M
NiN [11]          | ~2.81M
ResNet [12]       | ~5.63M
FractalNet [13]   | ~7.84M
DenseNet [14]     | ~4.25M

Table 5: Computation time (in seconds) per epoch for the different DCNN models on Digit-10, Alphabet-50, and SpecialChar-13.

Models            | Digit-10 | Alphabet-50 | SpecialChar-13
VGG-16 [9]        | 32       | 83          | 15
All-Conv Net [10] | 7        | 23          | 4
NiN [11]          | 9        | 27          | 5
ResNet [12]       | 64       | 154         | 34
FractalNet [13]   | 32       | 102         | 18
DenseNet [14]     | 95       | 210         | 58

Table 3: Testing accuracy of the VGG-16 network, All-Conv network, NiN, ResNet, FractalNet, and DenseNet on Digit-10, Alphabet-50, and SpecialChar-13, and comparison against other existing methods.

Types              | Methods         | Digit-10 (%) | Alphabet-50 (%) | SpecialChar-13 (%)
Existing approaches| MLP [7]         | 96.67        | —               | —
                   | MPCA+QTLR [56]  | 98.55        | —               | —
                   | GA [22]         | 97.00        | —               | —
                   | LeNet+DBN [57]  | 98.64        | —               | —
DCNN               | VGG Net [9]     | 97.57        | 97.56           | 96.15
                   | All-Conv [10]   | 97.08        | 94.31           | 95.58
                   | NiN [11]        | 97.36        | 96.73           | 97.24
                   | ResNet [12]     | 98.51        | 97.33           | 97.64
                   | FractalNet [13] | 98.92        | 97.87           | 97.98
                   | DenseNet [14]   | 99.13        | 98.31           | 98.18


Data Availability

The data used to support the findings of this study are available at https://code.google.com/archive/p/cmaterdb.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

[1] D. C. Ciresan, U. Meier, L. M. Gambardella, and J. Schmidhuber, "Deep, big, simple neural nets for handwritten digit recognition," Neural Computation, vol. 22, no. 12, pp. 3207–3220, 2010.

[2] U. Meier, D. C. Ciresan, L. M. Gambardella, and J. Schmidhuber, "Better digit recognition with a committee of simple neural nets," in Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), pp. 1250–1254, Beijing, China, September 2011.

[3] W. Song, S. Uchida, and M. Liwicki, "Comparative study of part-based handwritten character recognition methods," in Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), pp. 814–818, Beijing, China, September 2011.

[4] H. A. Khan, A. Al Helal, and K. I. Ahmed, "Handwritten Bangla digit recognition using sparse representation classifier," in Proceedings of the 2014 International Conference on Informatics, Electronics and Vision (ICIEV), pp. 1–6, IEEE, Dhaka, Bangladesh, May 2014.

[5] U. Pal and B. Chaudhuri, "Automatic recognition of unconstrained off-line Bangla handwritten numerals," in Advances in Multimodal Interfaces–ICMI 2000, pp. 371–378, Springer, Beijing, China, October 2000.

[6] N. Das, B. Das, R. Sarkar, S. Basu, M. Kundu, and M. Nasipuri, "Handwritten Bangla basic and compound character recognition using MLP and SVM classifier," 2010, http://arxiv.org/abs/1002.4040.

[7] S. Basu, N. Das, R. Sarkar, M. Kundu, M. Nasipuri, and D. K. Basu, "An MLP based approach for recognition of handwritten Bangla numerals," 2012, http://arxiv.org/abs/1203.0876.

[8] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.

[9] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," 2014, http://arxiv.org/abs/1409.1556.

[10] J. T. Springenberg, A. Dosovitskiy, T. Brox, and M. Riedmiller, "Striving for simplicity: the all convolutional net," 2014, http://arxiv.org/abs/1412.6806.

[11] M. Lin, Q. Chen, and S. Yan, "Network in network," 2013, http://arxiv.org/abs/1312.4400.

[12] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778, Las Vegas, NV, USA, June–July 2016.

[13] G. Larsson, M. Maire, and G. Shakhnarovich, "FractalNet: ultra-deep neural networks without residuals," 2016, http://arxiv.org/abs/1605.07648.

[14] G. Huang, Z. Liu, K. Q. Weinberger, and L. van der Maaten, "Densely connected convolutional networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, July 2017.

[15] B. Chaudhuri and U. Pal, "A complete printed Bangla OCR system," Pattern Recognition, vol. 31, no. 5, pp. 531–549, 1998.

[16] U. Pal, On the development of an optical character recognition (OCR) system for printed Bangla script, Ph.D. dissertation, Indian Statistical Institute, Kolkata, India, 1997.

[17] U. Pal and B. Chaudhuri, "Indian script character recognition: a survey," Pattern Recognition, vol. 37, no. 9, pp. 1887–1899, 2004.

[18] C.-L. Liu and C. Y. Suen, "A new benchmark on the recognition of handwritten Bangla and Farsi numeral characters," Pattern Recognition, vol. 42, no. 12, pp. 3287–3295, 2009.

[19] B. Chaudhuri, "A complete handwritten numeral database of Bangla–a major Indic script," in Proceedings of the Tenth International Workshop on Frontiers in Handwriting Recognition, Suvisoft, Baule, France, October 2006.

[20] O. Surinta, L. Schomaker, and M. Wiering, "A comparison of feature and pixel-based methods for recognizing handwritten Bangla digits," in Proceedings of the 12th International Conference on Document Analysis and Recognition (ICDAR), pp. 165–169, IEEE, Buffalo, NY, USA, 2013.

[21] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.

[22] N. Das, R. Sarkar, S. Basu, M. Kundu, M. Nasipuri, and D. K. Basu, "A genetic algorithm based region sampling for selection of local features in handwritten digit recognition application," Applied Soft Computing, vol. 12, no. 5, pp. 1592–1606, 2012.

[23] J.-W. Xu, J. Xu, and Y. Lu, "Handwritten Bangla digit recognition using hierarchical Bayesian network," in Proceedings of the 3rd International Conference on Intelligent System and Knowledge Engineering, vol. 1, pp. 1096–1099, IEEE, Xiamen, China, November 2008.

[24] D. E. Rumelhart, J. L. McClelland, and the PDP Research Group, Parallel Distributed Processing, Vol. 1, MIT Press, Cambridge, MA, USA, 1987.

[25] I.-J. Kim and X. Xie, "Handwritten Hangul recognition using deep convolutional neural networks," International Journal on Document Analysis and Recognition, vol. 18, no. 1, pp. 1–13, 2015.

[26] M. M. Rahman, M. Akhand, S. Islam, P. C. Shill, and M. H. Rahman, "Bangla handwritten character recognition using convolutional neural network," International Journal of Image, Graphics and Signal Processing, vol. 7, no. 8, pp. 42–49, 2015.

[27] D. Ciregan and U. Meier, "Multi-column deep neural networks for offline handwritten Chinese character classification," in Proceedings of the 2015 International Joint Conference on Neural Networks (IJCNN), pp. 1–6, IEEE, Killarney, Ireland, July 2015.

[28] A. Voulodimos, N. Doulamis, A. Doulamis, and E. Protopapadakis, "Deep learning for computer vision: a brief review," Computational Intelligence and Neuroscience, vol. 2018, Article ID 7068349, 13 pages, 2018.

[29] G. E. Hinton, S. Osindero, and Y.-W. Teh, "A fast learning algorithm for deep belief nets," Neural Computation, vol. 18, no. 7, pp. 1527–1554, 2006.

[30] P. Vincent, H. Larochelle, Y. Bengio, and P.-A. Manzagol, "Extracting and composing robust features with denoising autoencoders," in Proceedings of the 25th International Conference on Machine Learning, pp. 1096–1103, ACM, Helsinki, Finland, July 2008.

[31] Y. Bengio, "Learning deep architectures for AI," Foundations and Trends in Machine Learning, vol. 2, no. 1, pp. 1–127, 2009.

[32] J. Kim, T.-T.-H. Le, and H. Kim, "Nonintrusive load monitoring based on advanced deep learning and novel signature," Computational Intelligence and Neuroscience, vol. 2017, Article ID 4216281, 22 pages, 2017.

[33] E. Protopapadakis, A. Voulodimos, A. Doulamis, N. Doulamis, D. Dres, and M. Bimpas, "Stacked autoencoders for outlier detection in over-the-horizon radar signals," Computational Intelligence and Neuroscience, vol. 2017, Article ID 5891417, 11 pages, 2017.

[34] H. Greenspan, B. van Ginneken, and R. M. Summers, "Guest editorial: deep learning in medical imaging: overview and future promise of an exciting new technique," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1153–1159, 2016.

[35] D. Shen, G. Wu, and H.-I. Suk, "Deep learning in medical image analysis," Annual Review of Biomedical Engineering, vol. 19, pp. 221–248, 2017.

[36] H.-C. Shin, H. R. Roth, M. Gao et al., "Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1285–1298, 2016.

[37] S. Sladojevic, M. Arsenovic, A. Anderla, D. Culibrk, and D. Stefanovic, "Deep neural networks based recognition of plant diseases by leaf image classification," Computational Intelligence and Neuroscience, vol. 2016, Article ID 3289801, 11 pages, 2016.

[38] S. P. Mohanty, D. P. Hughes, and M. Salathe, "Using deep learning for image-based plant disease detection," Frontiers in Plant Science, vol. 7, p. 1419, 2016.

[39] G. Wang, Y. Sun, and J. Wang, "Automatic image-based plant disease severity estimation using deep learning," Computational Intelligence and Neuroscience, vol. 2017, Article ID 2917536, 8 pages, 2017.

[40] L. Zhang, L. Zhang, and B. Du, "Deep learning for remote sensing data: a technical tutorial on the state of the art," IEEE Geoscience and Remote Sensing Magazine, vol. 4, no. 2, pp. 22–40, 2016.

[41] A. Romero, C. Gatta, and G. Camps-Valls, "Unsupervised deep feature extraction for remote sensing image classification," IEEE Transactions on Geoscience and Remote Sensing, vol. 54, no. 3, pp. 1349–1362, 2016.

[42] K. Nogueira, O. A. Penatti, and J. A. dos Santos, "Towards better exploiting convolutional neural networks for remote sensing scene classification," Pattern Recognition, vol. 61, pp. 539–556, 2017.

[43] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440, Boston, MA, USA, June 2015.

[44] V. Badrinarayanan, A. Kendall, and R. Cipolla, "SegNet: a deep convolutional encoder-decoder architecture for image segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 12, pp. 2481–2495, 2017.

[45] S. Ji, W. Xu, M. Yang, and K. Yu, "3D convolutional neural networks for human action recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 221–231, 2013.

[46] C. Dong, C. C. Loy, K. He, and X. Tang, "Image super-resolution using deep convolutional networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 2, pp. 295–307, 2016.

[47] C. Farabet, C. Couprie, L. Najman, and Y. LeCun, "Learning hierarchical features for scene labeling," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 8, pp. 1915–1929, 2013.

[48] K. Zhang, Q. Liu, Y. Wu, and M.-H. Yang, "Robust visual tracking via convolutional networks without training," IEEE Transactions on Image Processing, vol. 25, no. 4, pp. 1779–1792, 2016.

[49] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proceedings of Advances in Neural Information Processing Systems, pp. 1097–1105, Lake Tahoe, NV, USA, December 2012.

[50] M. Matsugu, K. Mori, Y. Mitari, and Y. Kaneda, "Subject independent facial expression recognition with robust face detection using a convolutional neural network," Neural Networks, vol. 16, no. 5-6, pp. 555–559, 2003.

[51] C. Szegedy, W. Liu, Y. Jia et al., "Going deeper with convolutions," in Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1–9, Boston, MA, USA, June 2015.

[52] O. Russakovsky, J. Deng, H. Su et al., "ImageNet large scale visual recognition challenge," International Journal of Computer Vision, vol. 115, no. 3, pp. 211–252, 2015.

[53] V. Nair and G. E. Hinton, "Rectified linear units improve restricted Boltzmann machines," in Proceedings of the 27th International Conference on Machine Learning (ICML-10), pp. 807–814, Haifa, Israel, June 2010.

[54] S. Ioffe and C. Szegedy, "Batch normalization: accelerating deep network training by reducing internal covariate shift," in Proceedings of the International Conference on Machine Learning, pp. 448–456, Lille, France, July 2015.

[55] U. Bhattacharya, M. Shridhar, S. K. Parui, P. Sen, and B. Chaudhuri, "Offline recognition of handwritten Bangla characters: an efficient two-stage approach," Pattern Analysis and Applications, vol. 15, no. 4, pp. 445–458, 2012.

[56] N. Das, J. M. Reddy, R. Sarkar et al., "A statistical–topological feature combination for recognition of handwritten numerals," Applied Soft Computing, vol. 12, no. 8, pp. 2486–2495, 2012.

[57] M. Z. Alom, P. Sidike, T. M. Taha, and V. K. Asari, "Handwritten Bangla digit recognition using deep learning," 2017, http://arxiv.org/abs/1705.02680.



[26] M M Rahman M Akhand S Islam P C Shill andM H Rahman ldquoBangla handwritten character recognitionusing convolutional neural networkrdquo International Journal ofImage Graphics and Signal Processing vol 7 no 8 pp 42ndash492015

[27] D Ciregan and U Meier ldquoMulti-column deep neural net-works for offline handwritten Chinese character classifica-tionrdquo in Proceedings of 2015 International Joint Conference onNeural Networks (IJCNN) pp 1ndash6 IEEE Killarney IrelandJuly 2015

[28] A Voulodimos N Doulamis A Doulamis andE Protopapadakis ldquoDeep learning for computer visiona brief reviewrdquo Computational Intelligence and Neurosci-ence vol 2018 Article ID 7068349 13 pages 2018

[29] G E Hinton S Osindero and Y-W Teh ldquoA fast learningalgorithm for deep belief netsrdquo Neural Computation vol 18no 7 pp 1527ndash1554 2006

[30] P Vincent H Larochelle Y Bengio and P-A ManzagolldquoExtracting and composing robust features with denoisingautoencodersrdquo in Proceedings of the 25th InternationalConference on Machine Learning pp 1096ndash1103 ACMHelsinki Finland July 2008

[31] Y Bengio et al ldquoLearning deep architectures for AIrdquoFoundations and Trendsreg in Machine Learning vol 2 no 1pp 1ndash127 2009


[32] J. Kim, T.-T.-H. Le, and H. Kim, "Nonintrusive load monitoring based on advanced deep learning and novel signature," Computational Intelligence and Neuroscience, vol. 2017, Article ID 4216281, 22 pages, 2017.

[33] E. Protopapadakis, A. Voulodimos, A. Doulamis, N. Doulamis, D. Dres, and M. Bimpas, "Stacked autoencoders for outlier detection in over-the-horizon radar signals," Computational Intelligence and Neuroscience, vol. 2017, Article ID 5891417, 11 pages, 2017.

[34] H. Greenspan, B. van Ginneken, and R. M. Summers, "Guest editorial: deep learning in medical imaging: overview and future promise of an exciting new technique," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1153–1159, 2016.

[35] D. Shen, G. Wu, and H.-I. Suk, "Deep learning in medical image analysis," Annual Review of Biomedical Engineering, vol. 19, pp. 221–248, 2017.

[36] H.-C. Shin, H. R. Roth, M. Gao et al., "Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1285–1298, 2016.

[37] S. Sladojevic, M. Arsenovic, A. Anderla, D. Culibrk, and D. Stefanovic, "Deep neural networks based recognition of plant diseases by leaf image classification," Computational Intelligence and Neuroscience, vol. 2016, Article ID 3289801, 11 pages, 2016.

[38] S. P. Mohanty, D. P. Hughes, and M. Salathe, "Using deep learning for image-based plant disease detection," Frontiers in Plant Science, vol. 7, p. 1419, 2016.

[39] G. Wang, Y. Sun, and J. Wang, "Automatic image-based plant disease severity estimation using deep learning," Computational Intelligence and Neuroscience, vol. 2017, Article ID 2917536, 8 pages, 2017.

[40] L. Zhang, L. Zhang, and B. Du, "Deep learning for remote sensing data: a technical tutorial on the state of the art," IEEE Geoscience and Remote Sensing Magazine, vol. 4, no. 2, pp. 22–40, 2016.

[41] A. Romero, C. Gatta, and G. Camps-Valls, "Unsupervised deep feature extraction for remote sensing image classification," IEEE Transactions on Geoscience and Remote Sensing, vol. 54, no. 3, pp. 1349–1362, 2016.

[42] K. Nogueira, O. A. Penatti, and J. A. dos Santos, "Towards better exploiting convolutional neural networks for remote sensing scene classification," Pattern Recognition, vol. 61, pp. 539–556, 2017.

[43] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440, Boston, MA, USA, June 2015.

[44] V. Badrinarayanan, A. Kendall, and R. Cipolla, "SegNet: a deep convolutional encoder-decoder architecture for image segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 12, pp. 2481–2495, 2017.

[45] S. Ji, W. Xu, M. Yang, and K. Yu, "3D convolutional neural networks for human action recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 221–231, 2013.

[46] C. Dong, C. C. Loy, K. He, and X. Tang, "Image super-resolution using deep convolutional networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 2, pp. 295–307, 2016.

[47] C. Farabet, C. Couprie, L. Najman, and Y. LeCun, "Learning hierarchical features for scene labeling," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 8, pp. 1915–1929, 2013.

[48] K. Zhang, Q. Liu, Y. Wu, and M.-H. Yang, "Robust visual tracking via convolutional networks without training," IEEE Transactions on Image Processing, vol. 25, no. 4, pp. 1779–1792, 2016.

[49] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proceedings of Advances in Neural Information Processing Systems, pp. 1097–1105, Lake Tahoe, NV, USA, December 2012.

[50] M. Matsugu, K. Mori, Y. Mitari, and Y. Kaneda, "Subject independent facial expression recognition with robust face detection using a convolutional neural network," Neural Networks, vol. 16, no. 5-6, pp. 555–559, 2003.

[51] C. Szegedy, W. Liu, Y. Jia et al., "Going deeper with convolutions," in Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1–9, Boston, MA, USA, June 2015.

[52] O. Russakovsky, J. Deng, H. Su et al., "ImageNet large scale visual recognition challenge," International Journal of Computer Vision, vol. 115, no. 3, pp. 211–252, 2015.

[53] V. Nair and G. E. Hinton, "Rectified linear units improve restricted Boltzmann machines," in Proceedings of the 27th International Conference on Machine Learning (ICML-10), pp. 807–814, Haifa, Israel, June 2010.

[54] S. Ioffe and C. Szegedy, "Batch normalization: accelerating deep network training by reducing internal covariate shift," in Proceedings of International Conference on Machine Learning, pp. 448–456, Lille, France, July 2015.

[55] U. Bhattacharya, M. Shridhar, S. K. Parui, P. Sen, and B. Chaudhuri, "Offline recognition of handwritten Bangla characters: an efficient two-stage approach," Pattern Analysis and Applications, vol. 15, no. 4, pp. 445–458, 2012.

[56] N. Das, J. M. Reddy, R. Sarkar et al., "A statistical–topological feature combination for recognition of handwritten numerals," Applied Soft Computing, vol. 12, no. 8, pp. 2486–2495, 2012.

[57] M. Z. Alom, P. Sidike, T. M. Taha, and V. K. Asari, "Handwritten Bangla digit recognition using deep learning," 2017, http://arxiv.org/abs/1705.02680.




975

6

943

1

967

3 973

3

978

7

983

1

Testi

ng ac

cura

cyVGG NetAll-ConvNiN

ResNetFractalNetDenseNet

Figure 17 Testing accuracy for handwritten 50-alphabet recognition using different DCNN techniques

Figure 18 Randomly selected images of special character from the dataset

0

1

2

3

4

5

6

7

1 51 101 151 201

Loss

Number of epochs

All-ConvResNetFractalNet

NiNDenseNetVGG-19

Figure 19 Training loss of different architectures for Bangla 13 special characters (SpecialChar-13)

10 Computational Intelligence and Neuroscience

5 Conclusions

In this research we investigated the performance of severalpopular deep convolutional neural networks (DCNNs) forhandwritten Bangla character (eg digits alphabets andspecial characters) recognition (e experimental resultsindicated that DenseNet is the best performer in classifyingBangla digits alphabets and special characters Specificallywe achieved recognition rate of 9913 for handwrittenBangla digits 9831 for handwritten Bangla alphabet and9818 for special character recognition using DenseNet Tothe best of knowledge these are the best recognition resultson the CMATERdb dataset In future some fusion-basedDCNN models such as Inception Recurrent ConvolutionalNeural Network (IRCNN) [47] will be explored and de-veloped for handwritten Bangla character recognition

0010203040506070809

1

1 51 101 151 201

Acc

urac

y

EpochsAll-Conv

NiNResNetFractalNetDenseNet

VGG

Figure 20 Validation accuracy of different architectures for Bangla 13 special characters (SpecialChar-13)

972

4

955

8

971

5 976

4

979

8

981

8

Testi

ng ac

cura

cy

VGG NetAll-ConvNiN

ResNetFractalNetDenseNet

Figure 21 Testing accuracy of different architectures for Bangla 13special characters (SpecialChar-13)

Table 4 Number of parameter comparison

Models Number of parametersVGG-16 [9] sim 843MAll-Conv Net [10] sim 226MNiN [11] sim 281MResNet [12] sim 563MFractalNet [13] sim 784MDenseNet [14] sim 425M

Table 5 Computational time (in sec) per epoch for differentDCNN models on Digit-10 Alphabet-50 and SpecialChar-13

Models Digit-10 Alphabet-50 SpecialChar-13VGG-16 [9] 32 83 15All-Conv Net [10] 7 23 4NiN [11] 9 27 5ResNet [12] 64 154 34FractalNet [13] 32 102 18DenseNet [14] 95 210 58

Table 3 (e testing accuracy of VGG-16 Network All-ConvNetwork NiN ResNet FractalNet and DenseNet on Digit-10Alphabet-50 and SpecialChar-13 and comparison against otherexisting methods

Types Methods Digit-10()

Alphabet-50()

SpecialChar-13 ()

Existingapproaches

MLP [7] 9667 mdash mdashMPCA

+QTLR [56] 9855 mdash mdash

GA [22] 9700 mdash mdashLeNet

+DBN [57] 9864 mdash mdash

VGGNet [9] 9757 9756 9615

DCNN

All-Conv[10] 9708 9431 9558

NiN [11] 9736 9673 9724ResNet [12] 9851 9733 9764FractalNet

[13] 9892 9787 9798

DenseNet[14] 9913 9831 9818

Computational Intelligence and Neuroscience 11

Data Availability

(e data used to support the findings of this study areavailable at httpscodegooglecomarchivepcmaterdb

Conflicts of Interest

(e authors declare that they have no conflicts of interest

References

[1] D. C. Ciresan, U. Meier, L. M. Gambardella, and J. Schmidhuber, "Deep, big, simple neural nets for handwritten digit recognition," Neural Computation, vol. 22, no. 12, pp. 3207–3220, 2010.
[2] U. Meier, D. C. Ciresan, L. M. Gambardella, and J. Schmidhuber, "Better digit recognition with a committee of simple neural nets," in Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), pp. 1250–1254, Beijing, China, September 2011.
[3] W. Song, S. Uchida, and M. Liwicki, "Comparative study of part-based handwritten character recognition methods," in Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), pp. 814–818, Beijing, China, September 2011.
[4] H. A. Khan, A. Al Helal, and K. I. Ahmed, "Handwritten Bangla digit recognition using sparse representation classifier," in Proceedings of the 2014 International Conference on Informatics, Electronics and Vision (ICIEV), pp. 1–6, IEEE, Dhaka, Bangladesh, May 2014.
[5] U. Pal and B. Chaudhuri, "Automatic recognition of unconstrained off-line Bangla handwritten numerals," in Proceedings of Advances in Multimodal Interfaces–ICMI 2000, pp. 371–378, Springer, Beijing, China, October 2000.
[6] N. Das, B. Das, R. Sarkar, S. Basu, M. Kundu, and M. Nasipuri, "Handwritten Bangla basic and compound character recognition using MLP and SVM classifier," 2010, http://arxiv.org/abs/1002.4040.
[7] S. Basu, N. Das, R. Sarkar, M. Kundu, M. Nasipuri, and D. K. Basu, "An MLP based approach for recognition of handwritten Bangla numerals," 2012, http://arxiv.org/abs/1203.0876.
[8] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
[9] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," 2014, http://arxiv.org/abs/1409.1556.
[10] J. T. Springenberg, A. Dosovitskiy, T. Brox, and M. Riedmiller, "Striving for simplicity: the all convolutional net," 2014, http://arxiv.org/abs/1412.6806.
[11] M. Lin, Q. Chen, and S. Yan, "Network in network," 2013, http://arxiv.org/abs/1312.4400.
[12] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778, Las Vegas, NV, USA, June–July 2016.
[13] G. Larsson, M. Maire, and G. Shakhnarovich, "FractalNet: ultra-deep neural networks without residuals," 2016, http://arxiv.org/abs/1605.07648.
[14] G. Huang, Z. Liu, K. Q. Weinberger, and L. van der Maaten, "Densely connected convolutional networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 1, no. 2, p. 3, Honolulu, HI, USA, July 2017.
[15] B. Chaudhuri and U. Pal, "A complete printed Bangla OCR system," Pattern Recognition, vol. 31, no. 5, pp. 531–549, 1998.
[16] U. Pal, On the development of an optical character recognition (OCR) system for printed Bangla script, Ph.D. dissertation, Indian Statistical Institute, Kolkata, India, 1997.
[17] U. Pal and B. Chaudhuri, "Indian script character recognition: a survey," Pattern Recognition, vol. 37, no. 9, pp. 1887–1899, 2004.
[18] C.-L. Liu and C. Y. Suen, "A new benchmark on the recognition of handwritten Bangla and Farsi numeral characters," Pattern Recognition, vol. 42, no. 12, pp. 3287–3295, 2009.
[19] B. Chaudhuri, "A complete handwritten numeral database of Bangla–a major Indic script," in Proceedings of the Tenth International Workshop on Frontiers in Handwriting Recognition, Suvisoft, La Baule, France, October 2006.
[20] O. Surinta, L. Schomaker, and M. Wiering, "A comparison of feature and pixel-based methods for recognizing handwritten Bangla digits," in Proceedings of the 12th International Conference on Document Analysis and Recognition (ICDAR), pp. 165–169, IEEE, Buffalo, NY, USA, 2013.
[21] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.
[22] N. Das, R. Sarkar, S. Basu, M. Kundu, M. Nasipuri, and D. K. Basu, "A genetic algorithm based region sampling for selection of local features in handwritten digit recognition application," Applied Soft Computing, vol. 12, no. 5, pp. 1592–1606, 2012.
[23] J.-W. Xu, J. Xu, and Y. Lu, "Handwritten Bangla digit recognition using hierarchical Bayesian network," in Proceedings of the 3rd International Conference on Intelligent System and Knowledge Engineering, vol. 1, pp. 1096–1099, IEEE, Xiamen, China, November 2008.
[24] D. E. Rumelhart, J. L. McClelland, PDP Research Group, et al., Parallel Distributed Processing, vol. 1, MIT Press, Cambridge, MA, USA, 1987.
[25] I.-J. Kim and X. Xie, "Handwritten Hangul recognition using deep convolutional neural networks," International Journal on Document Analysis and Recognition, vol. 18, no. 1, pp. 1–13, 2015.
[26] M. M. Rahman, M. Akhand, S. Islam, P. C. Shill, and M. H. Rahman, "Bangla handwritten character recognition using convolutional neural network," International Journal of Image, Graphics and Signal Processing, vol. 7, no. 8, pp. 42–49, 2015.
[27] D. Ciregan and U. Meier, "Multi-column deep neural networks for offline handwritten Chinese character classification," in Proceedings of the 2015 International Joint Conference on Neural Networks (IJCNN), pp. 1–6, IEEE, Killarney, Ireland, July 2015.
[28] A. Voulodimos, N. Doulamis, A. Doulamis, and E. Protopapadakis, "Deep learning for computer vision: a brief review," Computational Intelligence and Neuroscience, vol. 2018, Article ID 7068349, 13 pages, 2018.
[29] G. E. Hinton, S. Osindero, and Y.-W. Teh, "A fast learning algorithm for deep belief nets," Neural Computation, vol. 18, no. 7, pp. 1527–1554, 2006.
[30] P. Vincent, H. Larochelle, Y. Bengio, and P.-A. Manzagol, "Extracting and composing robust features with denoising autoencoders," in Proceedings of the 25th International Conference on Machine Learning, pp. 1096–1103, ACM, Helsinki, Finland, July 2008.
[31] Y. Bengio et al., "Learning deep architectures for AI," Foundations and Trends® in Machine Learning, vol. 2, no. 1, pp. 1–127, 2009.


[32] J. Kim, T.-T.-H. Le, and H. Kim, "Nonintrusive load monitoring based on advanced deep learning and novel signature," Computational Intelligence and Neuroscience, vol. 2017, Article ID 4216281, 22 pages, 2017.
[33] E. Protopapadakis, A. Voulodimos, A. Doulamis, N. Doulamis, D. Dres, and M. Bimpas, "Stacked autoencoders for outlier detection in over-the-horizon radar signals," Computational Intelligence and Neuroscience, vol. 2017, Article ID 5891417, 11 pages, 2017.
[34] H. Greenspan, B. van Ginneken, and R. M. Summers, "Guest editorial: deep learning in medical imaging: overview and future promise of an exciting new technique," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1153–1159, 2016.
[35] D. Shen, G. Wu, and H.-I. Suk, "Deep learning in medical image analysis," Annual Review of Biomedical Engineering, vol. 19, pp. 221–248, 2017.
[36] H.-C. Shin, H. R. Roth, M. Gao, et al., "Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1285–1298, 2016.
[37] S. Sladojevic, M. Arsenovic, A. Anderla, D. Culibrk, and D. Stefanovic, "Deep neural networks based recognition of plant diseases by leaf image classification," Computational Intelligence and Neuroscience, vol. 2016, Article ID 3289801, 11 pages, 2016.
[38] S. P. Mohanty, D. P. Hughes, and M. Salathe, "Using deep learning for image-based plant disease detection," Frontiers in Plant Science, vol. 7, p. 1419, 2016.
[39] G. Wang, Y. Sun, and J. Wang, "Automatic image-based plant disease severity estimation using deep learning," Computational Intelligence and Neuroscience, vol. 2017, Article ID 2917536, 8 pages, 2017.
[40] L. Zhang, L. Zhang, and B. Du, "Deep learning for remote sensing data: a technical tutorial on the state of the art," IEEE Geoscience and Remote Sensing Magazine, vol. 4, no. 2, pp. 22–40, 2016.
[41] A. Romero, C. Gatta, and G. Camps-Valls, "Unsupervised deep feature extraction for remote sensing image classification," IEEE Transactions on Geoscience and Remote Sensing, vol. 54, no. 3, pp. 1349–1362, 2016.
[42] K. Nogueira, O. A. Penatti, and J. A. dos Santos, "Towards better exploiting convolutional neural networks for remote sensing scene classification," Pattern Recognition, vol. 61, pp. 539–556, 2017.
[43] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440, Boston, MA, USA, June 2015.
[44] V. Badrinarayanan, A. Kendall, and R. Cipolla, "SegNet: a deep convolutional encoder-decoder architecture for image segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 12, pp. 2481–2495, 2017.
[45] S. Ji, W. Xu, M. Yang, and K. Yu, "3D convolutional neural networks for human action recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 221–231, 2013.
[46] C. Dong, C. C. Loy, K. He, and X. Tang, "Image super-resolution using deep convolutional networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 2, pp. 295–307, 2016.
[47] C. Farabet, C. Couprie, L. Najman, and Y. LeCun, "Learning hierarchical features for scene labeling," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 8, pp. 1915–1929, 2013.
[48] K. Zhang, Q. Liu, Y. Wu, and M.-H. Yang, "Robust visual tracking via convolutional networks without training," IEEE Transactions on Image Processing, vol. 25, no. 4, pp. 1779–1792, 2016.
[49] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proceedings of Advances in Neural Information Processing Systems, pp. 1097–1105, Lake Tahoe, NV, USA, December 2012.
[50] M. Matsugu, K. Mori, Y. Mitari, and Y. Kaneda, "Subject independent facial expression recognition with robust face detection using a convolutional neural network," Neural Networks, vol. 16, no. 5-6, pp. 555–559, 2003.
[51] C. Szegedy, W. Liu, Y. Jia, et al., "Going deeper with convolutions," in Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1–9, Boston, MA, USA, June 2015.
[52] O. Russakovsky, J. Deng, H. Su, et al., "ImageNet large scale visual recognition challenge," International Journal of Computer Vision, vol. 115, no. 3, pp. 211–252, 2015.
[53] V. Nair and G. E. Hinton, "Rectified linear units improve restricted Boltzmann machines," in Proceedings of the 27th International Conference on Machine Learning (ICML-10), pp. 807–814, Haifa, Israel, June 2010.
[54] S. Ioffe and C. Szegedy, "Batch normalization: accelerating deep network training by reducing internal covariate shift," in Proceedings of the International Conference on Machine Learning, pp. 448–456, Lille, France, July 2015.
[55] U. Bhattacharya, M. Shridhar, S. K. Parui, P. Sen, and B. Chaudhuri, "Offline recognition of handwritten Bangla characters: an efficient two-stage approach," Pattern Analysis and Applications, vol. 15, no. 4, pp. 445–458, 2012.
[56] N. Das, J. M. Reddy, R. Sarkar, et al., "A statistical–topological feature combination for recognition of handwritten numerals," Applied Soft Computing, vol. 12, no. 8, pp. 2486–2495, 2012.
[57] M. Z. Alom, P. Sidike, T. M. Taha, and V. K. Asari, "Handwritten Bangla digit recognition using deep learning," 2017, http://arxiv.org/abs/1705.02680.




5. Conclusions

In this research, we investigated the performance of several popular deep convolutional neural networks (DCNNs) for handwritten Bangla character (i.e., digit, alphabet, and special character) recognition. The experimental results indicated that DenseNet is the best performer in classifying Bangla digits, alphabets, and special characters. Specifically, using DenseNet we achieved recognition rates of 99.13% for handwritten Bangla digits, 98.31% for the handwritten Bangla alphabet, and 98.18% for special characters. To the best of our knowledge, these are the best recognition results on the CMATERdb dataset. In the future, fusion-based DCNN models such as the Inception Recurrent Convolutional Neural Network (IRCNN) [47] will be explored and developed for handwritten Bangla character recognition.

Figure 20: Validation accuracy of different architectures (VGG, All-Conv, NiN, ResNet, FractalNet, and DenseNet) over training epochs for Bangla 13 special characters (SpecialChar-13).

Figure 21: Testing accuracy of different architectures (VGG Net, All-Conv, NiN, ResNet, FractalNet, and DenseNet) for Bangla 13 special characters (SpecialChar-13).

Table 4: Number of parameters comparison.

Models              Number of parameters
VGG-16 [9]          ~8.43M
All-Conv Net [10]   ~2.26M
NiN [11]            ~2.81M
ResNet [12]         ~5.63M
FractalNet [13]     ~7.84M
DenseNet [14]       ~4.25M
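Parameter counts like those in Table 4 follow directly from layer shapes. As a hedged sketch (the layer sizes below are illustrative, not taken from the paper's exact architectures), the per-layer contributions can be computed as:

```python
def conv2d_params(in_ch, out_ch, kh, kw, bias=True):
    """Parameters of a 2D convolution: each output channel owns a
    kh x kw x in_ch kernel, plus an optional bias term."""
    return out_ch * (kh * kw * in_ch + (1 if bias else 0))

def dense_params(in_dim, out_dim, bias=True):
    """Parameters of a fully connected layer: a weight per input-output
    pair, plus an optional bias per output."""
    return out_dim * (in_dim + (1 if bias else 0))

# Illustrative: a 3x3 convolution from 64 to 128 channels
print(conv2d_params(64, 128, 3, 3))  # 73856
```

Summing such terms over every layer of a network yields totals in the millions, which is why the table reports counts in the ~2M–8M range.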

Table 5: Computational time (in sec) per epoch for different DCNN models on Digit-10, Alphabet-50, and SpecialChar-13.

Models              Digit-10   Alphabet-50   SpecialChar-13
VGG-16 [9]          32         83            15
All-Conv Net [10]   7          23            4
NiN [11]            9          27            5
ResNet [12]         64         154           34
FractalNet [13]     32         102           18
DenseNet [14]       95         210           58
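Per-epoch times such as those in Table 5 are typically obtained by wrapping each training epoch in a wall-clock timer. A minimal sketch of that measurement pattern follows (`train_one_epoch` is a hypothetical stand-in for the real training loop, not the authors' code):

```python
import time

def timed_epoch(train_one_epoch):
    """Run one training epoch and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = train_one_epoch()
    elapsed = time.perf_counter() - start
    return result, elapsed

# Illustrative stand-in for a real training epoch
result, secs = timed_epoch(lambda: sum(range(10**6)))
print(f"epoch finished in {secs:.3f} s")
```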

Table 3 (e testing accuracy of VGG-16 Network All-ConvNetwork NiN ResNet FractalNet and DenseNet on Digit-10Alphabet-50 and SpecialChar-13 and comparison against otherexisting methods

Types Methods Digit-10()

Alphabet-50()

SpecialChar-13 ()

Existingapproaches

MLP [7] 9667 mdash mdashMPCA

+QTLR [56] 9855 mdash mdash

GA [22] 9700 mdash mdashLeNet

+DBN [57] 9864 mdash mdash

VGGNet [9] 9757 9756 9615

DCNN

All-Conv[10] 9708 9431 9558

NiN [11] 9736 9673 9724ResNet [12] 9851 9733 9764FractalNet

[13] 9892 9787 9798

DenseNet[14] 9913 9831 9818

Computational Intelligence and Neuroscience 11

Data Availability

(e data used to support the findings of this study areavailable at httpscodegooglecomarchivepcmaterdb

Conflicts of Interest

(e authors declare that they have no conflicts of interest

References

[1] D C Ciresan U Meier L M Gambardella andJ Schmidhuber ldquoDeep big simple neural nets for hand-written digit recognitionrdquo Neural Computation vol 22no 12 pp 3207ndash3220 2010

[2] U Meier D C Ciresan L M Gambardella andJ Schmidhuber ldquoBetter digit recognition with a committee ofsimple neural netsrdquo in Proceedings of International Conferenceon Document Analysis and Recognition (ICDAR) pp 1250ndash1254 Beijing China September 2011

[3] W Song S Uchida and M Liwicki ldquoComparative study ofpart-based handwritten character recognition methodsrdquo inProceedings of International Conference on DocumentAnalysis and Recognition (ICDAR) pp 814ndash818 BeijingChina September 2011

[4] H A Khan A Al Helal and K I Ahmed ldquoHandwrittenBangla digit recognition using sparse representation classi-fierrdquo in Proceedings of 2014 International Conference on In-formatics Electronics and Vision (ICIEV) pp 1ndash6 IEEEDhaka Bangladesh May 2014

[5] U Pal and B Chaudhuri ldquoAutomatic recognition of un-constrained off-line Bangla handwritten numeralsrdquo in Pro-ceedings of Advances in Multimodal InterfacesndashICMI 2000pp 371ndash378 Springer Beijing China October 2000

[6] N Das B Das R Sarkar S Basu M Kundu andM NasipurildquoHandwritten Bangla basic and compound character recog-nition using MLP and SVM classifierrdquo 2010 httparxivorgabs10024040

[7] S Basu N Das R Sarkar M Kundu M Nasipuri andD K Basu ldquoAn MLP based approach for recognition ofhandwritten Bangla numeralsrdquo 2012 httparxivorgabs12030876

[8] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-based learning applied to document recognitionrdquo Proceedingsof the IEEE vol 86 no 11 pp 2278ndash2324 1998

[9] K Simonyan and A Zisserman ldquoVery deep convolutionalnetworks for large-scale image recognitionrdquo 2014 httparxivorgabs14091556

[10] J T Springenberg A Dosovitskiy T Brox and M RiedmillerldquoStriving for simplicity (e all convolutional netrdquo 2014 httparxivorgabs14126806

[11] M Lin Q Chen and S Yan ldquoNetwork in networkrdquo 2013httparxivorgabs13124400

[12] K He X Zhang S Ren and J Sun ldquoDeep residual learningfor image recognitionrdquo in Proceedings of the IEEE Conferenceon Computer Vision and Pattern Recognition pp 770ndash778 LasVegas NV USA June-July 2016

[13] G Larsson M Maire and G Shakhnarovich ldquoFractalnetultra-deep neural networks without residualsrdquo 2016 httparxivorgabs160507648

[14] G Huang Z Liu K Q Weinberger and L van der MaatenldquoDensely connected convolutional networksrdquo in Proceedingsof the IEEE Conference on Computer Vision and PatternRecognition vol 1 no 2 p 3 Honolulu HI USA July 2017

[15] B Chaudhuri and U Pal ldquoA complete printed Bangla OCRsystemrdquo Pattern Recognition vol 31 no 5 pp 531ndash549 1998

[16] U Pal On the development of an optical character recognition(OCR) system for printed Bangla script PhD dissertationIndian Statistical Institute Kolkata India 1997

[17] U Pal and B Chaudhuri ldquoIndian script character recognitiona surveyrdquo Pattern Recognition vol 37 no 9 pp 1887ndash18992004

[18] C-L Liu and C Y Suen ldquoA new benchmark on the rec-ognition of handwritten Bangla and Farsi numeral charac-tersrdquo Pattern Recognition vol 42 no 12 pp 3287ndash3295 2009

[19] B Chaudhuri ldquoA complete handwritten numeral database ofBanglandasha major Indic scriptrdquoin Proceedings of Tenth In-ternational Workshop on Frontiers in Handwriting Recogni-tion Suvisoft Baule France October 2006

[20] O Surinta L Schomaker and M Wiering ldquoA comparison offeature and pixel-based methods for recognizing handwrittenBangla digitsrdquo in Proceedings of 12th International Conferenceon Document Analysis and Recognition (ICDAR) pp 165ndash169IEEE Buffalo NY USA 2013

[21] C Cortes and V Vapnik ldquoSupport-vector networksrdquo Ma-chine Learning vol 20 no 3 pp 273ndash297 1995

[22] N Das R Sarkar S Basu M Kundu M Nasipuri andD K Basu ldquoA genetic algorithm based region sampling forselection of local features in handwritten digit recognitionapplicationrdquo Applied Soft Computing vol 12 no 5pp 1592ndash1606 2012

[23] J-W Xu J Xu and Y Lu ldquoHandwritten Bangla digit rec-ognition using hierarchical Bayesian networkrdquo in Proceedingsof 3rd International Conference on Intelligent System andKnowledge Engineering vol 1 pp 1096ndash1099 IEEE XiamenChina November 2008

[24] D E Rumelhart J L McClelland P R Group et al ParallelDistributed Processing MIT Press Vol 1 MIT PressCambridge MA USA 1987

[25] I-J Kim and X Xie ldquoHandwritten Hangul recognition usingdeep convolutional neural networksrdquo International Journal onDocument Analysis and Recognition vol 18 no 1 pp 1ndash132015

[26] M M Rahman M Akhand S Islam P C Shill andM H Rahman ldquoBangla handwritten character recognitionusing convolutional neural networkrdquo International Journal ofImage Graphics and Signal Processing vol 7 no 8 pp 42ndash492015

[27] D Ciregan and U Meier ldquoMulti-column deep neural net-works for offline handwritten Chinese character classifica-tionrdquo in Proceedings of 2015 International Joint Conference onNeural Networks (IJCNN) pp 1ndash6 IEEE Killarney IrelandJuly 2015

[28] A Voulodimos N Doulamis A Doulamis andE Protopapadakis ldquoDeep learning for computer visiona brief reviewrdquo Computational Intelligence and Neurosci-ence vol 2018 Article ID 7068349 13 pages 2018

[29] G E Hinton S Osindero and Y-W Teh ldquoA fast learningalgorithm for deep belief netsrdquo Neural Computation vol 18no 7 pp 1527ndash1554 2006

[30] P Vincent H Larochelle Y Bengio and P-A ManzagolldquoExtracting and composing robust features with denoisingautoencodersrdquo in Proceedings of the 25th InternationalConference on Machine Learning pp 1096ndash1103 ACMHelsinki Finland July 2008

[31] Y Bengio et al ldquoLearning deep architectures for AIrdquoFoundations and Trendsreg in Machine Learning vol 2 no 1pp 1ndash127 2009

12 Computational Intelligence and Neuroscience

[32] J Kim T-T-H Le and H Kim ldquoNonintrusive load moni-toring based on advanced deep learning and novel signaturerdquoComputational Intelligence and Neuroscience vol 2017Article ID 4216281 22 pages 2017

[33] E Protopapadakis A Voulodimos A Doulamis N DoulamisD Dres and M Bimpas ldquoStacked autoencoders for outlierdetection in over-the-horizon radar signalsrdquo ComputationalIntelligence and Neuroscience vol 2017 Article ID 589141711 pages 2017

[34] H Greenspan B van Ginneken and R M Summers ldquoGuesteditorial deep learning in medical imaging overview andfuture promise of an exciting new techniquerdquo IEEE Trans-actions on Medical Imaging vol 35 no 5 pp 1153ndash11592016

[35] D Shen G Wu and H-I Suk ldquoDeep learning in medicalimage analysisrdquo Annual Review of Biomedical Engineeringvol 19 pp 221ndash248 2017

[36] H-C Shin H R Roth M Gao et al ldquoDeep convolutionalneural networks for computer-aided detection CNN archi-tectures dataset characteristics and transfer learningrdquo IEEETransactions on Medical Imaging vol 35 no 5 pp 1285ndash1298 2016

[37] S Sladojevic M Arsenovic A Anderla D Culibrk andD Stefanovic ldquoDeep neural networks based recognition ofplant diseases by leaf image classificationrdquo ComputationalIntelligence and Neuroscience vol 2016 Article ID 328980111 pages 2016

[38] S P Mohanty D P Hughes and M Salathe ldquoUsing deeplearning for image-based plant disease detectionrdquo Frontiers inPlant Science vol 7 p 1419 2016

[39] G Wang Y Sun and J Wang ldquoAutomatic image-based plantdisease severity estimation using deep learningrdquo Computa-tional Intelligence and Neuroscience vol 2017 Article ID2917536 8 pages 2017

[40] L Zhang L Zhang and B Du ldquoDeep learning for remotesensing data a technical tutorial on the state of the artrdquo IEEEGeoscience and Remote Sensing Magazine vol 4 no 2pp 22ndash40 2016

[41] A Romero C Gatta and G Camps-Valls ldquoUnsuperviseddeep feature extraction for remote sensing image classifica-tionrdquo IEEE Transactions on Geoscience and Remote Sensingvol 54 no 3 pp 1349ndash1362 2016

[42] K Nogueira O A Penatti and J A dos Santos ldquoTowardsbetter exploiting convolutional neural networks for remotesensing scene classificationrdquo Pattern Recognition vol 61pp 539ndash556 2017

[43] J Long E Shelhamer and T Darrell ldquoFully convolutionalnetworks for semantic segmentationrdquo in Proceedings of theIEEE Conference on Computer Vision and Pattern Recognitionpp 3431ndash3440 Boston MA USA June 2015

[44] V Badrinarayanan A Kendall and R Cipolla ldquoSegnet a deepconvolutional encoder-decoder architecture for image seg-mentationrdquo IEEE Transactions on Pattern Analysis andMachine Intelligence vol 39 no 12 pp 2481ndash2495 2017

[45] S Ji W Xu M Yang and K Yu ldquo3d convolutional neuralnetworks for human action recognitionrdquo IEEE Transactionson Pattern Analysis and Machine Intelligence vol 35 no 1pp 221ndash231 2013

[46] C Dong C C Loy K He and X Tang ldquoImage super-resolution using deep convolutional networksrdquo IEEE Trans-actions on Pattern Analysis and Machine Intelligence vol 38no 2 pp 295ndash307 2016

[47] C Farabet C Couprie L Najman and Y LeCun ldquoLearninghierarchical features for scene labelingrdquo IEEE Transactions on

Pattern Analysis and Machine Intelligence vol 35 no 8pp 1915ndash1929 2013

[48] K Zhang Q Liu Y Wu and M-H Yang ldquoRobust visualtracking via convolutional networks without trainingrdquo IEEETransactions on Image Processing vol 25 no 4 pp 1779ndash1792 2016

[49] A Krizhevsky I Sutskever and G E Hinton ldquoImagenetclassification with deep convolutional neural networksrdquo inProceedings of Advances in Neural Information ProcessingSystems pp 1097ndash1105 Lake Tahoe NV USA December2012

[50] M Matsugu K Mori Y Mitari and Y Kaneda ldquoSubjectindependent facial expression recognition with robust facedetection using a convolutional neural networkrdquo NeuralNetworks vol 16 no 5-6 pp 555ndash559 2003

[51] C Szegedy W Liu Y Jia et al ldquoGoing deeper with con-volutionsrdquo in Proceedings of 2015 IEEE Conference onComputer Vision and Pattern Recognition (CVPR) pp 1ndash9Boston MA USA June 2015

[52] O Russakovsky J Deng H Su et al ldquoImagenet large scalevisual recognition challengerdquo International Journal of Com-puter Vision vol 115 no 3 pp 211ndash252 2015

[53] V Nair and G E Hinton ldquoRectified linear units improverestricted Boltzmann machinesrdquo in Proceedings of the 27thInternational Conference on Machine Learning (ICML-10)pp 807ndash814 Haifa Israel June 2010

[54] S Ioffe and C Szegedy ldquoBatch normalization acceleratingdeep network training by reducing internal covariate shiftrdquo inProceedings of International Conference on Machine Learningpp 448ndash456 Lille France July 2015

[55] U Bhattacharya M Shridhar S K Parui P Sen andB Chaudhuri ldquoOffline recognition of handwritten Banglacharacters an efficient two-stage approachrdquo Pattern Analysisand Applications vol 15 no 4 pp 445ndash458 2012

[56] N Das J M Reddy R Sarkar et al ldquoA statisticalndashtopologicalfeature combination for recognition of handwritten nu-meralsrdquoApplied Soft Computing vol 12 no 8 pp 2486ndash24952012

[57] M Z Alom P Sidike T M Taha and V K AsarildquoHandwritten Bangla digit recognition using deep learningrdquo2017 httparxivorgabs170502680

Computational Intelligence and Neuroscience 13

Computer Games Technology

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

Advances in

FuzzySystems

Hindawiwwwhindawicom

Volume 2018

International Journal of

ReconfigurableComputing

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

thinspArtificial Intelligence

Hindawiwwwhindawicom Volumethinsp2018

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawiwwwhindawicom Volume 2018

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Engineering Mathematics

International Journal of

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Computational Intelligence and Neuroscience

Hindawiwwwhindawicom Volume 2018

Mathematical Problems in Engineering

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Hindawiwwwhindawicom Volume 2018

Human-ComputerInteraction

Advances in

Hindawiwwwhindawicom Volume 2018

Scientic Programming

Submit your manuscripts atwwwhindawicom

Page 12: HandwrittenBanglaCharacterRecognitionUsingthe State-of-the ...downloads.hindawi.com/journals/cin/2018/6747098.pdf · 2019. 7. 30. · ten Bangla digit classification using ISI Bangla

Data Availability

(e data used to support the findings of this study areavailable at httpscodegooglecomarchivepcmaterdb

Conflicts of Interest

(e authors declare that they have no conflicts of interest

References

[1] D C Ciresan U Meier L M Gambardella andJ Schmidhuber ldquoDeep big simple neural nets for hand-written digit recognitionrdquo Neural Computation vol 22no 12 pp 3207ndash3220 2010

[2] U Meier D C Ciresan L M Gambardella andJ Schmidhuber ldquoBetter digit recognition with a committee ofsimple neural netsrdquo in Proceedings of International Conferenceon Document Analysis and Recognition (ICDAR) pp 1250ndash1254 Beijing China September 2011

[3] W Song S Uchida and M Liwicki ldquoComparative study ofpart-based handwritten character recognition methodsrdquo inProceedings of International Conference on DocumentAnalysis and Recognition (ICDAR) pp 814ndash818 BeijingChina September 2011

[4] H A Khan A Al Helal and K I Ahmed ldquoHandwrittenBangla digit recognition using sparse representation classi-fierrdquo in Proceedings of 2014 International Conference on In-formatics Electronics and Vision (ICIEV) pp 1ndash6 IEEEDhaka Bangladesh May 2014

[5] U Pal and B Chaudhuri ldquoAutomatic recognition of un-constrained off-line Bangla handwritten numeralsrdquo in Pro-ceedings of Advances in Multimodal InterfacesndashICMI 2000pp 371ndash378 Springer Beijing China October 2000

[6] N Das B Das R Sarkar S Basu M Kundu andM NasipurildquoHandwritten Bangla basic and compound character recog-nition using MLP and SVM classifierrdquo 2010 httparxivorgabs10024040

[7] S Basu N Das R Sarkar M Kundu M Nasipuri andD K Basu ldquoAn MLP based approach for recognition ofhandwritten Bangla numeralsrdquo 2012 httparxivorgabs12030876

[8] Y LeCun L Bottou Y Bengio and P Haffner ldquoGradient-based learning applied to document recognitionrdquo Proceedingsof the IEEE vol 86 no 11 pp 2278ndash2324 1998

[9] K Simonyan and A Zisserman ldquoVery deep convolutionalnetworks for large-scale image recognitionrdquo 2014 httparxivorgabs14091556

[10] J T Springenberg A Dosovitskiy T Brox and M RiedmillerldquoStriving for simplicity (e all convolutional netrdquo 2014 httparxivorgabs14126806

[11] M Lin Q Chen and S Yan ldquoNetwork in networkrdquo 2013httparxivorgabs13124400

[12] K He X Zhang S Ren and J Sun ldquoDeep residual learningfor image recognitionrdquo in Proceedings of the IEEE Conferenceon Computer Vision and Pattern Recognition pp 770ndash778 LasVegas NV USA June-July 2016

[13] G. Larsson, M. Maire, and G. Shakhnarovich, "FractalNet: ultra-deep neural networks without residuals," 2016, http://arxiv.org/abs/1605.07648.

[14] G. Huang, Z. Liu, K. Q. Weinberger, and L. van der Maaten, "Densely connected convolutional networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 1, no. 2, p. 3, Honolulu, HI, USA, July 2017.

[15] B. Chaudhuri and U. Pal, "A complete printed Bangla OCR system," Pattern Recognition, vol. 31, no. 5, pp. 531–549, 1998.

[16] U. Pal, On the Development of an Optical Character Recognition (OCR) System for Printed Bangla Script, Ph.D. dissertation, Indian Statistical Institute, Kolkata, India, 1997.

[17] U. Pal and B. Chaudhuri, "Indian script character recognition: a survey," Pattern Recognition, vol. 37, no. 9, pp. 1887–1899, 2004.

[18] C.-L. Liu and C. Y. Suen, "A new benchmark on the recognition of handwritten Bangla and Farsi numeral characters," Pattern Recognition, vol. 42, no. 12, pp. 3287–3295, 2009.

[19] B. Chaudhuri, "A complete handwritten numeral database of Bangla–a major Indic script," in Proceedings of Tenth International Workshop on Frontiers in Handwriting Recognition, Suvisoft, La Baule, France, October 2006.

[20] O. Surinta, L. Schomaker, and M. Wiering, "A comparison of feature and pixel-based methods for recognizing handwritten Bangla digits," in Proceedings of 12th International Conference on Document Analysis and Recognition (ICDAR), pp. 165–169, IEEE, Buffalo, NY, USA, 2013.

[21] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.

[22] N. Das, R. Sarkar, S. Basu, M. Kundu, M. Nasipuri, and D. K. Basu, "A genetic algorithm based region sampling for selection of local features in handwritten digit recognition application," Applied Soft Computing, vol. 12, no. 5, pp. 1592–1606, 2012.

[23] J.-W. Xu, J. Xu, and Y. Lu, "Handwritten Bangla digit recognition using hierarchical Bayesian network," in Proceedings of 3rd International Conference on Intelligent System and Knowledge Engineering, vol. 1, pp. 1096–1099, IEEE, Xiamen, China, November 2008.

[24] D. E. Rumelhart, J. L. McClelland, PDP Research Group, et al., Parallel Distributed Processing, vol. 1, MIT Press, Cambridge, MA, USA, 1987.

[25] I.-J. Kim and X. Xie, "Handwritten Hangul recognition using deep convolutional neural networks," International Journal on Document Analysis and Recognition, vol. 18, no. 1, pp. 1–13, 2015.

[26] M. M. Rahman, M. Akhand, S. Islam, P. C. Shill, and M. H. Rahman, "Bangla handwritten character recognition using convolutional neural network," International Journal of Image, Graphics and Signal Processing, vol. 7, no. 8, pp. 42–49, 2015.

[27] D. Ciregan and U. Meier, "Multi-column deep neural networks for offline handwritten Chinese character classification," in Proceedings of 2015 International Joint Conference on Neural Networks (IJCNN), pp. 1–6, IEEE, Killarney, Ireland, July 2015.

[28] A. Voulodimos, N. Doulamis, A. Doulamis, and E. Protopapadakis, "Deep learning for computer vision: a brief review," Computational Intelligence and Neuroscience, vol. 2018, Article ID 7068349, 13 pages, 2018.

[29] G. E. Hinton, S. Osindero, and Y.-W. Teh, "A fast learning algorithm for deep belief nets," Neural Computation, vol. 18, no. 7, pp. 1527–1554, 2006.

[30] P. Vincent, H. Larochelle, Y. Bengio, and P.-A. Manzagol, "Extracting and composing robust features with denoising autoencoders," in Proceedings of the 25th International Conference on Machine Learning, pp. 1096–1103, ACM, Helsinki, Finland, July 2008.

[31] Y. Bengio et al., "Learning deep architectures for AI," Foundations and Trends® in Machine Learning, vol. 2, no. 1, pp. 1–127, 2009.
12 Computational Intelligence and Neuroscience

[32] J. Kim, T.-T.-H. Le, and H. Kim, "Nonintrusive load monitoring based on advanced deep learning and novel signature," Computational Intelligence and Neuroscience, vol. 2017, Article ID 4216281, 22 pages, 2017.

[33] E. Protopapadakis, A. Voulodimos, A. Doulamis, N. Doulamis, D. Dres, and M. Bimpas, "Stacked autoencoders for outlier detection in over-the-horizon radar signals," Computational Intelligence and Neuroscience, vol. 2017, Article ID 5891417, 11 pages, 2017.

[34] H. Greenspan, B. van Ginneken, and R. M. Summers, "Guest editorial: deep learning in medical imaging: overview and future promise of an exciting new technique," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1153–1159, 2016.

[35] D. Shen, G. Wu, and H.-I. Suk, "Deep learning in medical image analysis," Annual Review of Biomedical Engineering, vol. 19, pp. 221–248, 2017.

[36] H.-C. Shin, H. R. Roth, M. Gao, et al., "Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1285–1298, 2016.

[37] S. Sladojevic, M. Arsenovic, A. Anderla, D. Culibrk, and D. Stefanovic, "Deep neural networks based recognition of plant diseases by leaf image classification," Computational Intelligence and Neuroscience, vol. 2016, Article ID 3289801, 11 pages, 2016.

[38] S. P. Mohanty, D. P. Hughes, and M. Salathé, "Using deep learning for image-based plant disease detection," Frontiers in Plant Science, vol. 7, p. 1419, 2016.

[39] G. Wang, Y. Sun, and J. Wang, "Automatic image-based plant disease severity estimation using deep learning," Computational Intelligence and Neuroscience, vol. 2017, Article ID 2917536, 8 pages, 2017.

[40] L. Zhang, L. Zhang, and B. Du, "Deep learning for remote sensing data: a technical tutorial on the state of the art," IEEE Geoscience and Remote Sensing Magazine, vol. 4, no. 2, pp. 22–40, 2016.

[41] A. Romero, C. Gatta, and G. Camps-Valls, "Unsupervised deep feature extraction for remote sensing image classification," IEEE Transactions on Geoscience and Remote Sensing, vol. 54, no. 3, pp. 1349–1362, 2016.

[42] K. Nogueira, O. A. Penatti, and J. A. dos Santos, "Towards better exploiting convolutional neural networks for remote sensing scene classification," Pattern Recognition, vol. 61, pp. 539–556, 2017.

[43] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440, Boston, MA, USA, June 2015.

[44] V. Badrinarayanan, A. Kendall, and R. Cipolla, "SegNet: a deep convolutional encoder-decoder architecture for image segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 12, pp. 2481–2495, 2017.

[45] S. Ji, W. Xu, M. Yang, and K. Yu, "3D convolutional neural networks for human action recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 221–231, 2013.

[46] C. Dong, C. C. Loy, K. He, and X. Tang, "Image super-resolution using deep convolutional networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 2, pp. 295–307, 2016.

[47] C. Farabet, C. Couprie, L. Najman, and Y. LeCun, "Learning hierarchical features for scene labeling," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 8, pp. 1915–1929, 2013.

[48] K. Zhang, Q. Liu, Y. Wu, and M.-H. Yang, "Robust visual tracking via convolutional networks without training," IEEE Transactions on Image Processing, vol. 25, no. 4, pp. 1779–1792, 2016.

[49] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proceedings of Advances in Neural Information Processing Systems, pp. 1097–1105, Lake Tahoe, NV, USA, December 2012.

[50] M. Matsugu, K. Mori, Y. Mitari, and Y. Kaneda, "Subject independent facial expression recognition with robust face detection using a convolutional neural network," Neural Networks, vol. 16, no. 5-6, pp. 555–559, 2003.

[51] C. Szegedy, W. Liu, Y. Jia, et al., "Going deeper with convolutions," in Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1–9, Boston, MA, USA, June 2015.

[52] O. Russakovsky, J. Deng, H. Su, et al., "ImageNet large scale visual recognition challenge," International Journal of Computer Vision, vol. 115, no. 3, pp. 211–252, 2015.

[53] V. Nair and G. E. Hinton, "Rectified linear units improve restricted Boltzmann machines," in Proceedings of the 27th International Conference on Machine Learning (ICML-10), pp. 807–814, Haifa, Israel, June 2010.

[54] S. Ioffe and C. Szegedy, "Batch normalization: accelerating deep network training by reducing internal covariate shift," in Proceedings of International Conference on Machine Learning, pp. 448–456, Lille, France, July 2015.

[55] U. Bhattacharya, M. Shridhar, S. K. Parui, P. Sen, and B. Chaudhuri, "Offline recognition of handwritten Bangla characters: an efficient two-stage approach," Pattern Analysis and Applications, vol. 15, no. 4, pp. 445–458, 2012.

[56] N. Das, J. M. Reddy, R. Sarkar, et al., "A statistical–topological feature combination for recognition of handwritten numerals," Applied Soft Computing, vol. 12, no. 8, pp. 2486–2495, 2012.

[57] M. Z. Alom, P. Sidike, T. M. Taha, and V. K. Asari, "Handwritten Bangla digit recognition using deep learning," 2017, http://arxiv.org/abs/1705.02680.
