HMM Based Handwritten Text Recognition Using Biometrical Data Acquisition Pen
Ondrej Rohlik, Pavel Mautner, Vaclav Matousek, Juergen Kempf
Department of Computer Science and Engineering
University of West Bohemia in Pilsen
Ondrej Rohlik, IEEE CIRA 2003, Kobe, Japan 2
Outline
• Data acquisition device: The BiSP pen
• Handwritten text recognition
• Hidden Markov models
• Experimental results
• Future Work
Input Devices – Overview
• off-line (static)
  – scanners
  – cameras*
• on-line (dynamic)
  – electronic pens
  – digitizers, tablets
  – cameras*
  – mouse*
Input Device: The BiSP Pen
• Electronic pen* is used for data acquisition
* built at University of Applied Sciences in Regensburg, Germany
Input Device – Writing
Input Device – Signals
Handwritten Text Recognition
• Objective: To convert handwritten sentences or phrases in analog form (off-line or on-line sources) into digital form (ASCII or Unicode).
• isolated character recognition (TM, DTW, NN)
• word recognition (HMMs)
• gesture recognition
Handwritten Text
hand printed characters
spaced discrete characters
cursive script words
Signal Description
• Pairs of x and y signals are transformed into a sequence of primitives
[Table: primitives (observations) 1 to 4, each defined by the trend of the x and y signals]
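The transformation from signal pairs to primitives can be sketched as follows. This is only an illustration: the slide does not preserve which trend combination maps to which primitive, so the coding below (rising/falling x combined with rising/falling y) and the function name `to_primitives` are assumptions.

```python
def to_primitives(xs, ys):
    """Map consecutive (x, y) samples to primitive symbols 1..4.

    Assumed coding (hypothetical, not taken from the slides):
      1: x rising,  y rising      2: x rising,  y falling
      3: x falling, y rising      4: x falling, y falling
    A non-decreasing step is counted as "rising".
    """
    if len(xs) != len(ys):
        raise ValueError("x and y signals must have the same length")
    primitives = []
    for k in range(1, len(xs)):
        x_up = xs[k] >= xs[k - 1]  # trend of the x signal on this step
        y_up = ys[k] >= ys[k - 1]  # trend of the y signal on this step
        if x_up and y_up:
            symbol = 1
        elif x_up:
            symbol = 2
        elif y_up:
            symbol = 3
        else:
            symbol = 4
        primitives.append(symbol)
    return primitives


observations = to_primitives([0, 1, 2, 1, 0], [0, 1, 0, 1, 0])
```

The resulting symbol sequence is what a discrete HMM would consume as its observation sequence.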
Hidden Markov Models
• left-to-right model (used mostly in speech recognition)
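In a left-to-right model, a state can only be kept or left for a later state, so the transition matrix is upper triangular. A minimal sketch, assuming the simplest topology (self-loop plus a step to the next state only) and a hypothetical uniform `p_stay` value that training would normally replace:

```python
def left_to_right_transitions(n_states, p_stay=0.5):
    """Build a transition matrix where each state either stays
    (probability p_stay) or advances to the next state."""
    if n_states < 1:
        raise ValueError("n_states must be at least 1")
    if not 0.0 <= p_stay <= 1.0:
        raise ValueError("p_stay must be a probability")
    A = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states - 1):
        A[i][i] = p_stay            # remain in the current state
        A[i][i + 1] = 1.0 - p_stay  # move one state to the right
    A[-1][-1] = 1.0                 # the final state has nowhere else to go
    return A


A = left_to_right_transitions(3, p_stay=0.6)
```

Every entry below the diagonal stays zero, which is what prevents the model from going back to an earlier state.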
Hidden Markov Models
• Training – Baum-Welch algorithm
• Recognition – Backward algorithm
• Matrices that describe the model (A, B, π) are decomposed after the training – one model for each letter
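Recognition with the backward algorithm amounts to computing the likelihood of the observation sequence under each candidate model and picking the best one. The sketch below is a textbook version for a discrete HMM, not the authors' implementation; the names `backward_likelihood` and `recognize` are hypothetical, observation symbols are 0-based indices into the rows of B, and no scaling is applied, so long sequences may underflow.

```python
def backward_likelihood(A, B, pi, observations):
    """Return P(observations | model) using the backward recursion.

    A[i][j]: transition probability from state i to state j
    B[i][o]: probability of emitting symbol o in state i
    pi[i]:   initial probability of state i
    """
    if not observations:
        raise ValueError("observation sequence must not be empty")
    n = len(pi)
    beta = [1.0] * n  # beta_T(i) = 1 for every state
    # Walk backwards over o_T ... o_2, updating beta_t from beta_{t+1}.
    for o in reversed(observations[1:]):
        beta = [sum(A[i][j] * B[j][o] * beta[j] for j in range(n))
                for i in range(n)]
    first = observations[0]
    return sum(pi[i] * B[i][first] * beta[i] for i in range(n))


def recognize(models, observations):
    """Return the label whose model gives the highest likelihood.

    models: dict mapping a label to a tuple (A, B, pi).
    """
    return max(models,
               key=lambda label: backward_likelihood(*models[label],
                                                     observations))
```

For example, `recognize({"a": (A_a, B_a, pi_a), "b": (A_b, B_b, pi_b)}, obs)` returns whichever label explains `obs` best.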
Word HMM Decomposition
Word HMM Composition
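Since there is one model for each letter, a word model can be obtained by chaining letter models. The slide's figure is not preserved, so the sketch below shows only one plausible way to do this. It assumes each letter model is left-to-right and that whatever probability mass is missing from the last row of its transition matrix is the probability of leaving the letter. The name `compose_word_model` is hypothetical.

```python
def compose_word_model(letter_models, word):
    """Chain per-letter models into one left-to-right word model.

    letter_models: dict mapping a letter to (A, B), where the last row
    of A sums to less than 1 and the remainder is the exit probability
    (an assumption of this sketch).
    Returns (A_word, B_word, pi_word).
    """
    if not word:
        raise ValueError("word must not be empty")
    total = sum(len(letter_models[ch][0]) for ch in word)
    A_word = [[0.0] * total for _ in range(total)]
    B_word = []
    offset = 0
    for idx, ch in enumerate(word):
        A, B = letter_models[ch]
        n = len(A)
        # Copy the letter's transitions onto the block diagonal.
        for i in range(n):
            for j in range(n):
                A_word[offset + i][offset + j] = A[i][j]
        B_word.extend(list(row) for row in B)
        last = offset + n - 1
        exit_prob = 1.0 - sum(A[n - 1])
        if idx + 1 < len(word):
            # Link the last state of this letter to the next letter.
            A_word[last][offset + n] = exit_prob
        else:
            # Last letter: fold the exit mass back into the self-loop.
            A_word[last][last] += exit_prob
        offset += n
    pi_word = [1.0] + [0.0] * (total - 1)  # always start in the first state
    return A_word, B_word, pi_word
```

The composed matrices have the same form as a single-letter model, so the same likelihood computation can be applied to whole words.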
Experimental Results
• the method has been tested on three independent data sets of various sizes
• limited number of letters used in our data sets: 15
  – reduced complexity of tagging the training set
Vocabulary size          1649    2198    5129
Recognition rate (%)       88      90      82
Recognition time (min)  17-26   27-49     360
Future Work
• to speed up the algorithm to achieve real-time recognition
• incorporation of language models to improve the recognition rate
• special attention will be paid to signature analysis and signature verification
• application in tele-robotics and robot sensing
  – robot-aided signature forging
Forgeries – Overview
a) genuine b) zero-effort c) unskilled d) skilled
Example of Two Features
Class Boundaries
Signature Verification – Algorithms

Training algorithm
For each class C
  For each feature f
    For each pair of signatures Classes[C][i] and Classes[C][j]
      Compute the difference between Classes[C][i] and Classes[C][j] and add it to an extra variable Sum[f]
    Compute mean value mean[f] and variance var[f] of each feature over all pairs using the variable Sum[f]
  Compute critical cluster coefficient using variances var[f] and weights w[f] over all features f

Classification algorithm
For class C to be verified
  For each pattern Classes[C][i]
    For each feature f
      Compute the difference and remember the least one over all patterns
  Sum up products of least differences and weights w[f] and compare the sum with critical cluster coefficient
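The two algorithms can be sketched in Python as follows. The slides do not give the formula for the critical cluster coefficient, so the one used here (a weighted sum of the standard deviations of the pairwise differences, times a hypothetical `scale` factor) is purely an assumption, as are the function names and the use of absolute differences.

```python
from itertools import combinations
from math import sqrt


def train_class(signatures, weights, scale=1.0):
    """Training for one class: statistics of pairwise feature differences.

    signatures: list of feature vectors of genuine signatures.
    Returns (means, variances, critical_coefficient).
    """
    if len(signatures) < 2:
        raise ValueError("at least two signatures are needed to form pairs")
    means, variances = [], []
    for f in range(len(weights)):
        # Difference of feature f for every pair of signatures.
        diffs = [abs(a[f] - b[f]) for a, b in combinations(signatures, 2)]
        mean = sum(diffs) / len(diffs)
        var = sum((d - mean) ** 2 for d in diffs) / len(diffs)
        means.append(mean)
        variances.append(var)
    # ASSUMPTION: the slides only say the coefficient is computed from
    # var[f] and w[f]; this particular formula is illustrative.
    critical = scale * sum(w * sqrt(v) for w, v in zip(weights, variances))
    return means, variances, critical


def verify(candidate, signatures, weights, critical):
    """Classification: accept if the weighted sum of least differences
    does not exceed the critical cluster coefficient."""
    least = [min(abs(candidate[f] - s[f]) for s in signatures)
             for f in range(len(weights))]
    score = sum(w * d for w, d in zip(weights, least))
    return score <= critical
```

A typical use is to call `train_class` once per writer on the enrolled signatures and then `verify` for each questioned signature; in practice `scale` and the weights would need tuning against genuine and forged samples.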