Feature Extraction and Classification of Handwritten Patterns using Neural Networks


International Journal of Advanced Engineering Research and Technology (IJAERT) Volume 2 Issue 4, July 2014, ISSN No.: 2348 – 8190

119

www.ijaert.org

Dimple Bhasin*, Prof. Gulshan Goyal**, Dr. Maitreyee Dutta***

* CSE, Chandigarh Group of Colleges, Gharuan, Punjab

** CSE, Chandigarh Group of Colleges, Gharuan, Punjab

*** CSE, NITTTR, Chandigarh

ABSTRACT

Handwritten pattern recognition is one of the most captivating areas in the field of image processing and soft computing. The biggest challenge faced by current technology is recognizing patterns or objects accurately. Training machines for the task of recognition is tedious because of the variability in patterns and writing styles, yet trained systems provide a useful solution that saves time and money. Neural networks provide an efficient solution to the problem of recognition. This paper presents a system that combines efficient pre-processing steps with the training of a neural network and thereby provides better recognition results. The system aims to reduce processing time while simultaneously improving recognition rates.

Keywords - Artificial Neural Network, Feature Extraction, Filters, Handwritten Pattern Recognition, Pre-Processing, Thinning.

1. INTRODUCTION

Machine simulation of human functions has been a cumbersome and challenging research area since the advent of digital computers. Although extensive improvements have been achieved, humans still outperform even the most powerful computers in many routine functions. Handwritten pattern recognition is one such area where machine simulation of human functions is far from complete [1]. Recognition of handwritten patterns is complicated by the numerous variations in the shape of patterns and by writing styles that vary from person to person. Neural networks, with their inherent learning ability, provide an efficient solution to handwritten pattern recognition.

2. ARTIFICIAL NEURAL NETWORK

An Artificial Neural Network (ANN) is an information processing paradigm inspired by the way biological nervous systems, such as the brain, process information. The key element of the ANN paradigm is the novel structure of the information processing system. It is composed of a large number of highly interconnected processing elements (neurons) working in unison to solve specific problems [2].

Advantages of neural networks include:

Adaptive learning: The ability to learn how to do tasks based on the data given for training or initial experience.

Self-organization: An ANN can create its own organization or representation of the information it receives during learning.

Real-time operation: ANN computations may be carried out in parallel, and special hardware devices are being designed and manufactured to take advantage of this capability.

Fault tolerance via redundant information coding: Partial destruction of a network leads to a corresponding degradation of performance. However, some network capabilities may be retained even with major network damage [3].

3. LITERATURE REVIEW

Pattern recognition aims to classify patterns based either on prior knowledge or on statistical information extracted from the patterns [5]. It has been one of the most fascinating and challenging research areas. A great deal of research has already been done, and work continues on more efficient and user-friendly approaches to the problem. A brief review of several research papers is given in the following paragraphs.

In [4], a system is presented for the recognition of Devnagari handwriting using a neural network. The system consists of three parts: character separation, pre-processing (comprising image binarization, thinning and windowing), and character recognition and classification, which is done using multilayer perceptrons trained by the error back-propagation algorithm. Deriving the network input from the character matrix is also an important step. The presented system is able to recognize most handwriting, and the size of the dataset has a large impact on the recognition rate.


In [5], three soft computing techniques for training a feedforward neural network, namely an evolutionary algorithm, the back-propagation algorithm and a hybrid evolutionary algorithm, are used for the recognition of handwritten English alphabets. The techniques are evaluated on the basis of the number of iterations and the count of convergence weight matrices, and are also compared on the basis of speed and computational cost. In these comparisons, the hybrid evolutionary algorithm performs best among the three.

In [6], a system for the recognition of unconstrained offline texts using a hybrid hidden Markov model/artificial neural network is presented. The performance of the system is evaluated in terms of the word recognition error rate, and it obtains a 42 percent improvement over the systems adopted as baselines for comparison.

4. RECOGNITION PROCESS

The whole recognition process for offline handwritten patterns consists of the following stages, as shown in Fig. 1:

1) Image acquisition

2) Pre-processing

3) Feature Extraction

4) Training and Classification

4.1 Image Acquisition

The images used are obtained from www.mathwork.com. The available dataset comprises 650 images of the 26 English alphabets, out of which 260 images have been selected. Each image is of size 50x50 and is in BMP format. A sample of the dataset is given in Fig. 2.

Fig. 1: Schematic diagram of the Recognition process

Fig. 2: Dataset of Handwritten Input images.

4.2 Pre-processing

Pre-processing consists of several steps that mould the input image into a form suitable for further processing. The main objectives of pre-processing are:

1) Noise reduction

2) Normalization of the data

3) Compression of the amount of information to be retained.

The presented pre-processing approach consists of:

1. Image Cropping

2. Noise reduction

3. Image Binarization

4. Cropping to edges

5. Image Complement

6. Normalization

7. Thinning

The steps involved in pre-processing are shown in Fig. 3.


Fig. 3: Design of the pre-processing approach for offline handwritten images

4.2.1 Image cropping

It involves the removal of surplus pixels and blank spaces from the image. It can be automated or done manually. In the present work, cropping is done manually using the 'imcrop' command in MATLAB, as shown in Fig. 4.

4.2.2 Filtering

Filtering aims to remove noise and spurious points introduced by an uneven writing surface and a poor sampling rate of the data acquisition device [5]. The main types of filtering methods are shown in Fig. 5.

Fig. 4: a) Original image b) Cropped image obtained using the 'imcrop' command

Fig. 5: Different types of filters

4.2.2.1 Performance evaluation of filters

The different filters that can be applied to the image are evaluated in terms of:

1. Mean Square error (MSE)

2. Signal to noise ratio (SNR)

3. Peak signal to noise ratio (PSNR)

4. Root mean square error (RMSE)

The smaller the value of MSE and the larger the values of SNR and PSNR, the lower the noise and the higher the quality of the image.
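The four measures can be sketched as follows. This is a minimal illustration in plain Python rather than the MATLAB environment used in the paper; the function names are hypothetical, images are assumed to be lists of rows of 8-bit gray levels, and the SNR is taken as mean signal power over mean error power, which is only one of several definitions in use.

```python
import math

def mse(reference, filtered):
    """Mean square error between two equally sized grayscale images."""
    total, count = 0.0, 0
    for ref_row, out_row in zip(reference, filtered):
        for r, f in zip(ref_row, out_row):
            total += (r - f) ** 2
            count += 1
    return total / count

def rmse(reference, filtered):
    """Root mean square error."""
    return math.sqrt(mse(reference, filtered))

def snr_db(reference, filtered):
    """SNR in dB (assumed definition: mean signal power / mean error power)."""
    pixels = [r for row in reference for r in row]
    signal_power = sum(r * r for r in pixels) / len(pixels)
    error = mse(reference, filtered)
    return float("inf") if error == 0 else 10.0 * math.log10(signal_power / error)

def psnr_db(reference, filtered, peak=255.0):
    """PSNR in dB for images whose maximum gray level is `peak`."""
    error = mse(reference, filtered)
    return float("inf") if error == 0 else 10.0 * math.log10(peak * peak / error)

# Toy 2x2 example: two pixels differ by 10 gray levels each.
reference = [[100, 100], [200, 200]]
filtered = [[110, 90], [200, 200]]
print(mse(reference, filtered), rmse(reference, filtered))
print(snr_db(reference, filtered), psnr_db(reference, filtered))
```

A filter that leaves the image closer to the reference gives a smaller MSE and RMSE and larger SNR and PSNR values.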

According to Table 1, the Wiener filter has the lowest value of MSE and the highest values of SNR and PSNR. It reduces noise and improves the quality of the images effectively. Hence, the Wiener filter is used for filtering.

Table 1: Comparative analysis of different performance evaluation parameters


4.2.3 Image binarization

Binarization involves the conversion of a grayscale image into a binary image in which each pixel takes a value of 0 or 1.
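A minimal sketch of binarization with a fixed global threshold is given below. The paper does not state how the threshold is chosen, so the value 128 and the function name are assumptions made for illustration only.

```python
def binarize(gray, threshold=128):
    """Map each gray level (0-255) to 1 if it reaches the threshold, else 0."""
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in gray]

# Dark ink (low values) on a light background (high values).
gray = [[250, 30, 240],
        [245, 20, 235]]
binary = binarize(gray)
print(binary)
```

With this convention the light background becomes 1 and the dark ink becomes 0.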

4.2.4 Cropping to edges

It involves resizing the image to fit the given window.

Fig. 6: Image cropped to edges
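One common way to crop to edges is to keep only the bounding box of the ink pixels. The sketch below assumes that interpretation, and assumes that ink is still 0 on a background of 1 because the complement step comes later; the function name is hypothetical.

```python
def crop_to_edges(binary, ink=0):
    """Return the smallest rectangle of `binary` that contains every ink pixel."""
    rows = [i for i, row in enumerate(binary) if ink in row]
    if not rows:
        # No ink at all: return an unchanged copy.
        return [row[:] for row in binary]
    cols = [j for j in range(len(binary[0])) if any(row[j] == ink for row in binary)]
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in binary[top:bottom + 1]]

page = [[1, 1, 1, 1],
        [1, 0, 0, 1],
        [1, 1, 0, 1],
        [1, 1, 1, 1]]
print(crop_to_edges(page))
```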

4.2.5 Image complement

Image complement involves the conversion of the image into its negative. This is done because it is a prerequisite for thinning. An example of an image complement is shown in Fig. 7.

Fig. 7: Image Complement
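For a binary image the complement simply swaps 0 and 1, so that the ink becomes the foreground value expected by the thinning step. A minimal sketch, with a hypothetical function name:

```python
def complement(binary):
    """Swap 0 and 1 in a binary image so that the ink becomes 1."""
    return [[1 - pixel for pixel in row] for row in binary]

cropped = [[0, 0],
           [1, 0]]
print(complement(cropped))
```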

4.2.6 Normalization

Normalization aims to obtain standardized data. In the present work, the binarized and complemented image is normalized to a size of 25x25, as shown in Fig. 8.

Fig. 8: a) Cropped image b) Resized image 25x25
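The resizing step can be sketched with nearest-neighbour sampling. The paper does not say which interpolation method is used, so nearest-neighbour is an assumption here, chosen because it keeps the image strictly binary; the function name is hypothetical.

```python
def resize_nearest(image, out_rows=25, out_cols=25):
    """Resize a 2-D image to out_rows x out_cols by nearest-neighbour sampling."""
    in_rows, in_cols = len(image), len(image[0])
    return [
        [image[r * in_rows // out_rows][c * in_cols // out_cols]
         for c in range(out_cols)]
        for r in range(out_rows)
    ]

small = [[1, 0],
         [0, 1]]
print(resize_nearest(small, 4, 4))
```

When an image is shrunk this way, strokes that are only one pixel wide can be skipped by the sampling grid, which is one reason to normalize before thinning rather than after.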

4.2.7 Thinning

Thinning is a morphological operation that removes selected foreground pixels from binary images and is used in particular for skeletonization [8]. Thinning can remove irregularities in letters and, in turn, makes the recognition algorithm simpler, because it only has to operate on character strokes that are one pixel wide [4] [7].

In the present work, thinning is done using two fast parallel thinning algorithms, namely the Zhang-Suen algorithm and the Guo-Hall algorithm.
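As an illustration of the first of these, the sketch below follows the standard published form of the Zhang-Suen algorithm, with two sub-iterations per pass and all deletions within a sub-iteration applied simultaneously. It is not the authors' MATLAB code; it assumes foreground pixels are 1 and that the image has a one-pixel background border, and the function name is hypothetical.

```python
def zhang_suen_thin(image):
    """Thin a binary image (foreground = 1) with the Zhang-Suen algorithm."""
    img = [row[:] for row in image]
    rows, cols = len(img), len(img[0])

    def neighbours(r, c):
        # P2..P9, clockwise, starting from the pixel directly above.
        return [img[r - 1][c], img[r - 1][c + 1], img[r][c + 1], img[r + 1][c + 1],
                img[r + 1][c], img[r + 1][c - 1], img[r][c - 1], img[r - 1][c - 1]]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for r in range(1, rows - 1):
                for c in range(1, cols - 1):
                    if img[r][c] != 1:
                        continue
                    n = neighbours(r, c)
                    p2, p4, p6, p8 = n[0], n[2], n[4], n[6]
                    b = sum(n)  # number of foreground neighbours
                    # number of 0 -> 1 transitions around the pixel
                    a = sum(1 for i in range(8) if n[i] == 0 and n[(i + 1) % 8] == 1)
                    if step == 0:
                        cond = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        cond = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if 2 <= b <= 6 and a == 1 and cond:
                        to_clear.append((r, c))
            # Parallel deletion: clear all flagged pixels at once.
            for r, c in to_clear:
                img[r][c] = 0
            changed = changed or bool(to_clear)
    return img

# A bar three pixels thick inside a 7x12 image.
bar = [[1 if 2 <= r <= 4 and 2 <= c <= 9 else 0 for c in range(12)] for r in range(7)]
thin = zhang_suen_thin(bar)
for row in thin:
    print("".join("#" if p else "." for p in row))
```

For this bar the skeleton is a one-pixel-wide segment along the middle row. The Guo-Hall algorithm has the same two-sub-iteration structure but uses different deletion conditions.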

4.2.7.1 Performance evaluation of thinning algorithm

Thinning rate: The aim of image thinning is to obtain the skeleton of the image. To evaluate a thinning algorithm, the thinning rate (TR) is defined as:

$$TR = \left(1 - \frac{T_1}{T_0}\right) \times 100\%$$

where $T_0$ is the number of object pixels in the original image and $T_1$ is the number of object pixels in the thinned image.
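The thinning rate can be computed directly from the two pixel counts. A minimal sketch, assuming object pixels are stored as 1 and using a hypothetical function name:

```python
def thinning_rate(original, thinned):
    """Thinning rate in percent: share of object pixels removed by thinning."""
    t0 = sum(sum(row) for row in original)  # object pixels before thinning
    t1 = sum(sum(row) for row in thinned)   # object pixels after thinning
    return (1 - t1 / t0) * 100.0

original = [[1, 1],
            [1, 1]]
thinned = [[1, 0],
           [0, 0]]
print(thinning_rate(original, thinned))  # 75.0
```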

Elapsed time or execution time: It is defined as the total time taken to complete the task of thinning.

Generally, a larger TR and a smaller execution time indicate a higher degree of thinning [9]. A comparative analysis is presented in Table 2. Since the thinning rate obtained by the Guo-Hall algorithm is greater, it can be used for thinning.

Table 2: Comparative analysis of different thinning algorithms


4.3 Feature Extraction

Feature extraction is done in order to extract relevant information by reducing the dimension of the data. In this step a pattern is analyzed and a set of features is selected that can uniquely identify the pattern [21]. In the present system, the thinned image is converted into a 5x7 matrix of intensity values. These values represent the minimum features required to identify the pattern. This set of features is then used as the input vector to the network.
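The paper does not describe how the 25x25 image is reduced to the 5x7 matrix. One plausible reading, sketched below, divides the image into 7 bands of rows and 5 bands of columns and takes the mean intensity of each cell, which yields 35 values between 0 and 1. The block-averaging method, the 7-row by 5-column orientation and the function name are all assumptions.

```python
def extract_features(image, out_rows=7, out_cols=5):
    """Reduce a binary image to out_rows * out_cols mean-intensity features."""
    in_rows, in_cols = len(image), len(image[0])
    features = []
    for i in range(out_rows):
        r0, r1 = i * in_rows // out_rows, (i + 1) * in_rows // out_rows
        for j in range(out_cols):
            c0, c1 = j * in_cols // out_cols, (j + 1) * in_cols // out_cols
            cell = [image[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            features.append(sum(cell) / len(cell))
    return features

# A 25x25 image containing a single vertical stroke in column 12.
stroke = [[1 if c == 12 else 0 for c in range(25)] for r in range(25)]
features = extract_features(stroke)
print(len(features))
print(features[:5])
```

Averaging produces fractional intensities rather than strict 0/1 values, which is consistent with the later remark that the network does not receive perfect Boolean inputs.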

4.4 Training and Classification

4.4.1 Neural Network

The network receives the 35 intensity values as a 35-element input vector. It is then required to identify the letter by responding with a 26-element output vector. To operate accurately, the network should respond with a 1 in the position of the letter being presented and with 0 in all other positions. The network should also be able to handle noise, because it does not receive the intensity values as perfect Boolean values, i.e. exactly 0 and 1 [4].
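The target encoding described above, and the natural way to read a noisy output vector back as a letter, can be sketched as follows. Picking the position of the largest output is an assumption about how the decision is made, and the function names are hypothetical.

```python
import string

LETTERS = string.ascii_uppercase  # "A" .. "Z", 26 classes

def target_vector(letter):
    """26-element target: 1 at the position of `letter`, 0 elsewhere."""
    return [1.0 if candidate == letter else 0.0 for candidate in LETTERS]

def decode(output):
    """Return the letter whose output position holds the largest value."""
    best = max(range(len(output)), key=lambda k: output[k])
    return LETTERS[best]

target = target_vector("C")
noisy = [value + 0.1 for value in target]  # imperfect, non-Boolean response
noisy[2] = 0.8
print(decode(noisy))  # C
```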

4.4.2 Architecture

A feedforward network is created with 35 neurons in the input layer and 26 neurons in the output layer. The number of neurons in the hidden layer is chosen arbitrarily. The MATLAB function newff creates the feedforward network, which is a two-layer log-sigmoid/linear network. The network is trained using the Levenberg-Marquardt algorithm, which is generally among the fastest backpropagation training algorithms for networks of moderate size. The created network is shown in Fig. 9 and its training is shown in Fig. 10.

Fig. 9: Neural network architecture

Fig. 10: Neural Network training
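The forward pass of such a two-layer log-sigmoid/linear network can be sketched in a few lines. The code below uses random, untrained weights purely to show the 35-10-26 shape (10 hidden neurons as listed in Table 3); it does not reproduce newff or Levenberg-Marquardt training, and all names are hypothetical.

```python
import math
import random

def logsig(x):
    """Log-sigmoid transfer function."""
    return 1.0 / (1.0 + math.exp(-x))

def init_layer(n_out, n_in, rng):
    """Random weights and biases for a layer with n_in inputs and n_out neurons."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
    return weights, biases

def forward(features, hidden, output):
    """Hidden layer with log-sigmoid units followed by a linear output layer."""
    w1, b1 = hidden
    w2, b2 = output
    h = [logsig(sum(w * x for w, x in zip(row, features)) + b)
         for row, b in zip(w1, b1)]
    return [sum(w * a for w, a in zip(row, h)) + b for row, b in zip(w2, b2)]

rng = random.Random(0)
hidden = init_layer(10, 35, rng)   # 35 inputs -> 10 hidden neurons
output = init_layer(26, 10, rng)   # 10 hidden neurons -> 26 outputs
scores = forward([0.0] * 35, hidden, output)
print(len(scores))
```

Training would adjust the weights so that, for each input pattern, the output at the position of the correct letter approaches 1 and all others approach 0.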


5. RESULTS

The network training parameters chosen are shown in Table 3. The system is analyzed on the basis of the number of neurons in the hidden layer, the mean square error and the accuracy, where accuracy denotes the rate of recognition.

Feedforward neural network parameters

Parameter               Value
Input nodes             35
Output nodes            26
Epochs                  1000
Learning rate           0.75
Training algorithm      Levenberg-Marquardt algorithm
Hidden layer neurons    10
Performance goal        0.01

Table 3: Feedforward neural network training parameters

Fig. 11: Performance graph for a single character

The results are tabulated in Table 4. An accuracy of 80-100% was obtained for more than half of the letters.

Alphabet   Number of Variants   Number of Successes   Accuracy (%)
A          10                   6                     60
B          10                   9                     90
C          10                   8                     80
D          10                   9                     90
E          10                   6                     60
F          10                   9                     90
G          10                   9                     90
H          10                   7                     70
I          10                   6                     60
J          10                   9                     90
K          10                   7                     70
L          10                   5                     50
M          10                   10                    100
N          10                   7                     70
O          10                   8                     80
P          10                   9                     90
Q          10                   6                     60
R          10                   10                    100
S          10                   9                     90
T          10                   8                     80
U          10                   7                     70
V          10                   7                     70
W          10                   8                     80
X          10                   10                    100
Y          10                   6                     60
Z          10                   8                     80

Table 4: Results
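The accuracy column, and the summary figures implied by it, follow directly from the success counts. The short sketch below recomputes them from the values in Table 4; the overall rate is derived here and is not a figure reported in the paper.

```python
# Number of correctly recognized samples per letter, copied from Table 4.
successes = {
    "A": 6, "B": 9, "C": 8, "D": 9, "E": 6, "F": 9, "G": 9, "H": 7, "I": 6,
    "J": 9, "K": 7, "L": 5, "M": 10, "N": 7, "O": 8, "P": 9, "Q": 6, "R": 10,
    "S": 9, "T": 8, "U": 7, "V": 7, "W": 8, "X": 10, "Y": 6, "Z": 8,
}
SAMPLES_PER_LETTER = 10

# Accuracy (%) = successes / samples * 100 for each letter.
per_letter = {k: 100.0 * v / SAMPLES_PER_LETTER for k, v in successes.items()}
# Letters in the 80-100% band.
high = [k for k, acc in per_letter.items() if acc >= 80.0]
# Overall recognition rate across all 260 samples.
overall = 100.0 * sum(successes.values()) / (SAMPLES_PER_LETTER * len(successes))

print(len(high), "of", len(successes), "letters reach 80% or more")
print(round(overall, 2))  # 78.08
```

This gives 15 of 26 letters in the 80-100% band and 203 of 260 samples recognized overall.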

6. CONCLUSION

Offline handwritten patterns are difficult to recognize because of the high variability in writing styles and fonts from person to person. The presented system works efficiently with the given network parameters and is able to recognize a good proportion of the handwriting samples. The use of an efficient pre-processing approach has eased the recognition problem to a great extent. The thinning rate had a large impact on the recognition rate: the higher the thinning rate, the better the recognition. The recognition rate also improved with a greater number of neurons in the hidden layer. The only limitation is the small number of samples taken. However, the system yields good results for the given samples and hence can be used for different handwriting in different languages.


REFERENCES

[1] Arica N., Yarman-Vural F.T. (2001), "An Overview of Character Recognition Focused on Off-Line Handwriting", IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, vol. 31, no. 2, pp. 216-233.

[2] José C. Principe, Neil R. Euliano, Curt W. Lefebvre, "Neural and Adaptive Systems: Fundamentals Through Simulations", ISBN 0-471-35167-9.

[3] Ni D.X. (2007), "Applications of neural network to character recognition", Proceedings of Student/Faculty Research Day, CSIC, Pace University.

[4] Rajput K.Y., Mishra S., "Recognition and editing of devnagari handwriting using neural network", Proceedings of SPIT-IEEE Colloquium and International Conference, vol. 1, pp. 66-70.

[5] Shrivastava S., Singh M.P. (2010), "Performance evaluation of feed forward network with soft computing techniques for handwritten English alphabets", Applied Soft Computing, Elsevier, vol. 11, pp. 1156-1182.

[6] Espana-Boquera S., Castro-Bleda M.J., Gorbe-Moya J., Zamora-Martinez F. (2011), "Improving offline handwritten text recognition with hybrid HMM/ANN models", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 4, pp. 767-779.

[7] Indira B., Shalini M., Murthy Raman M.V., Shaik M.S. (2012), "Classification and Recognition of Printed Hindi Characters Using Artificial Neural Networks", I.J. Image, Graphics and Signal Processing, vol. 6, pp. 15-21.

[8] Shenbagavadivu S., Dr. Devi M.R. (2013), "An investigation of noise removing techniques used in spatial domain image processing", International Journal of Computer Science and Mobile Computing, vol. 2, issue 7, pp. 378-384.

[9] Dancheng Xu, Bailiang Li, Nijholt A. (2009), "A novel approach based on PCNNs Template for fingerprint Image Thinning", Eighth IEEE/ACIS International Conference on Computer and Information Science, pp. 115-119.

[10] Goyal G., Dr. Dutta M., Er. Girdhar A. (2010), "A Parallel Thinning Algorithm for Numeral Pattern images in BMP Format".

[11] R. Gonzalez and R.E. Woods, Digital Image Processing, Prentice Hall, 2011.

[12] Rani R., Kaur K. (2013), "Experiment analysis of different texture based features of image using simplified Gabor Gaussian Wavelet transform", International Journal of Engineering and Advanced Technology (IJEAT), vol. 2, pp. 365-368.

[13] K. Purnima, T.V. Sampath Kumar, "Lossless Image Compression Using Traditional and Lifting Based Wavelet Transform", International Journal of Innovative Research and Studies, ISSN 2319-9725.

[14] Saeed K., Tabędzki M., Rybnik M., Adamski M. (2010), "K3M: A Universal algorithm for image skeletonization and a review of thinning techniques", International Journal of Applied Mathematics and Computer Science, vol. 20, no. 2, pp. 317-335.

[15] Xu D., Li B., Nijholt A. (2009), "A novel Approach Based on PCNNs Template for Fingerprint Image Thinning", Eighth IEEE/ACIS International Conference on Computer and Information Science, pp. 115-119.

[16] Ahmed P. (1995), "A neural network based dedicated thinning method", Elsevier, pp. 585-590.

[17] Shang L., Yi Z. (2007), "A class of binary images thinning using two PCNNs", Elsevier, pp. 1096-1101.

[18] Hallale S.B., Salunke G.D. (2013), "Offline and handwritten digit recognition using neural network", International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering, vol. 2, issue 9.

[19] Yadav D., Sanchez-Cuadrado S., Morato J. (March 2013), "Optical character recognition for Hindi language using a neural network approach", J. Inf. Process Syst., vol. 9, no. 1, pp. 117-138.

[20] Kumar H. and Kaur P. (2011), "A Comparative Study of Iterative Thinning Algorithms for BMP Images", (IJCSIT) International Journal of Computer Science and Information Technologies, vol. 2 (5), pp. 2375-2379.

[21] Vipin, Dass R., Rajni (2013), "Character recognition using neural network", International Journal of Advanced Trends in Computer Science and Engineering, pp. 62-67.