HANDWRITING RECOGNITION - CUBS - Homecubs.buffalo.edu/docs/Govindaraju-InvitedTalk-Fall2015.pdf ·...

HANDWRITING RECOGNITION A PERSPECTIVE ON TWO DECADES OF INNOVATIONS Venu Govindaraju University at Buffalo New York, USA


HANDWRITING RECOGNITION: A PERSPECTIVE ON TWO DECADES OF INNOVATIONS

Venu Govindaraju

University at Buffalo

New York, USA

Organizing Multiple Experts for Efficient Pattern Recognition

Active Pattern Recognition Using Genetic Programming

A Complexity Framework for Combination of Classifiers in Verification and Identification Systems

Image Processing using Ontology Concepts for Image Segmentation

Language Motivated Approaches for Human Action Recognition and Spotting

Towards a Globally Optimal Approach for Learning Deep Unsupervised Models

PATTERN RECOGNITION

Sequential Pattern Classification without Explicit Feature Extraction

Exploiting the Gap between Human and Machine Abilities in Handwriting Recognition for Web Security Applications

Stochastic Modeling of High-level Structures in Handwritten Word Recognition

A Stochastic Framework for Font Independent Devanagari OCR

Language Models and Automatic Topic Categorization for Information Retrieval in Handwritten Documents

Automatic Recognition of Handwritten Medical Forms for Search

Statistical Techniques for Efficient Indexing and Retrieval of Document Images

Enhancing Cyber Security through the use of Synthetic Handwritten CAPTCHAs

Methods for Biomedical Image Content Extraction Toward Improved Multimodal Retrieval of Biomedical Articles

A Semi Supervised Framework for Handwritten Document Analysis

Bayesian Background Models for Retrieval of Handwritten Documents

Accents in Handwriting: A Hierarchical Bayesian Approach to Handwriting Analysis

Multilingual Word Spotting in Offline Handwritten Documents

Enhancement and Retrieval of Low Quality Handwritten Documents

Hierarchical and Dynamic-Relational Models for Handwriting Recognition

Probabilistic Random Field based Text Identification

DOCUMENT ANALYSIS


Minutia-Based Partial Fingerprint Recognition

Integrating Facial Expressions and Skin Texture in Face Recognition

Integrating Minutiae Based Fingerprint Matching with Local Correlation Methods

A Novel Multi-sample Fusion Methodology for Improving Biometric Recognition

A Framework for Fingerprint Enhancement and Feature Detection

A Framework for Efficient Fingerprint Identification using a Minutiae Tree

Face Modeling and Biometric Anti-spoofing using Probability Distribution Transfer Learning

Intrusion Detection using Spatial Information and Behavioral Biometrics

BIOMETRICS


Stone Tablet to iPad tablet

Full Circle


Air Writing


E-signatures and Finger Doodles

This AI Algorithm Learns Simple Tasks as Fast as We Do

Software that learns to recognize written characters from just one example may point the way towards more powerful, more humanlike artificial intelligence. MIT Technology Review, Dec 2015.

AI

All Things Handwritten

Dynamic lexicons and integrated segmentation

Interactive lexicons and multiresolution

Serial fusion and lexicon density

Principled fusion methods

Indexing and Spotting

Text retrieval

CAPTCHAs and security

Accents, signatures, and security

Personal archives

Flipped classrooms

Key Innovations

1995-2000

2000-2005

2010-2015

2000-2005

2015-2020

LEXICONS

1 Lexicon

Postal

ML Success Story - Highlighted in the CCC Symposium on “Computing Research That Changed the World” (2009); Seminal work of Kim, Govindaraju (PAMI 1997) at the core of the technology

10-10-20 Rule

Lexicon

H. Xue, V. Govindaraju, “On the dependence of handwritten word recognizers on lexicons”, IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE Computer Society Press, 24(12): 1553-1564 (2002)


30% ZIPs contain less than 100 street names; Max streets returned is 3,071

Lexicon

Dynamic Lexicon

Rank Recog result

1 glyburide

2 bumetanide

3 indapamide

Rank Recog result

1 fosinopril

2 perindopril

3 benazepril

Rank Recog result

1 metoprolol

2 metolazone

3 torsemide

Medical: Faxed prescriptions


Lexicons

Lexicon

R. Jayadevan, U. Pal, and F. Kimura: Recognition of Words from Legal Amounts of Indian Bank Cheques. ICFHR 2010

• Census forms recognition

Lexicon of professions

• Prescription forms

Lexicon of medicines

• Registry of land ownerships

Lexicon of owner

[Figure: segmentation graph over segment boundaries 1-9; each edge carries a character hypothesis with its score, e.g. w[7.6], o[7.6], r[6.3], d[4.9]]

G. Kim, Venu Govindaraju: A Lexicon Driven Approach to Handwritten Word Recognition for Real-Time Applications. IEEE Trans. Pattern Anal. Mach. Intell. 19(4): 366-379 (1997)
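A lexicon-driven matcher of this kind can be sketched as a small dynamic program: each lexicon word is aligned to the sequence of stroke segments, with every character allowed to consume a few consecutive segments. The sketch below is illustrative only. The names `match_word`, `rank_lexicon`, `toy_cost` and the cost table are hypothetical, and it assumes scores are distance-like (lower is better) and that a character spans at most `max_span` segments.

```python
from typing import Callable, List, Tuple

INF = float("inf")

def match_word(word: str, n_segments: int,
               char_cost: Callable[[str, int, int], float],
               max_span: int = 4) -> float:
    """Minimum total cost of reading segments [0, n_segments) as `word`."""
    # best[k][e]: cheapest way to match the first k characters to the first e segments
    best = [[INF] * (n_segments + 1) for _ in range(len(word) + 1)]
    best[0][0] = 0.0
    for k, ch in enumerate(word, start=1):
        for end in range(k, n_segments + 1):
            for span in range(1, max_span + 1):
                start = end - span
                if start < k - 1:  # earlier characters need at least one segment each
                    break
                prev = best[k - 1][start]
                if prev < INF:
                    best[k][end] = min(best[k][end], prev + char_cost(ch, start, end))
    return best[len(word)][n_segments]

def rank_lexicon(lexicon: List[str], n_segments: int,
                 char_cost: Callable[[str, int, int], float]) -> List[Tuple[float, str]]:
    """Score every lexicon entry and return (cost, word) pairs, best first."""
    return sorted((match_word(w, n_segments, char_cost), w) for w in lexicon)

# Toy character costs (hypothetical): cost of reading segments [start, end) as a letter.
TOY_COSTS = {("w", 0, 2): 1.0, ("o", 2, 3): 1.0, ("r", 3, 4): 1.0, ("d", 4, 5): 1.0}

def toy_cost(ch: str, start: int, end: int) -> float:
    return TOY_COSTS.get((ch, start, end), 5.0)

ranking = rank_lexicon(["well", "world", "house", "school", "word"], 5, toy_cost)
print(ranking[0])
```

Because segmentation is decided jointly with recognition, a word that cannot fit the available segments (here "school", six letters over five segments) receives an infinite cost and drops to the bottom of the ranking.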


Lexicon

Integrated Segmentation

WORD

WELL

WORLD

HOUSE

SCHOOL

Hyderabad A D A - - - A - A

A - A A - A A -    A - A - -    A - A - - A - -

A - - - D - -
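The ascender/descender strings above can be reproduced with a few lines of code. This is a minimal sketch, assuming a fixed letter classification (capitals and b, d, f, h, k, l, t as ascenders; g, j, p, q, y as descenders); the function names `shape_pattern` and `prune_lexicon` are hypothetical, and exact pattern equality is used only for illustration, since a real holistic matcher would tolerate feature-detection errors.

```python
from typing import List

ASCENDERS = set("bdfhklt")   # assumed ascender letters
DESCENDERS = set("gjpqy")    # assumed descender letters

def shape_pattern(word: str) -> str:
    """Map each letter to 'A' (ascender), 'D' (descender) or '-' (neither)."""
    out = []
    for ch in word:
        if ch.isupper() or ch in ASCENDERS:
            out.append("A")
        elif ch in DESCENDERS:
            out.append("D")
        else:
            out.append("-")
    return "".join(out)

def prune_lexicon(lexicon: List[str], observed_pattern: str) -> List[str]:
    """Keep only the lexicon entries whose shape pattern equals the observed one."""
    return [w for w in lexicon if shape_pattern(w) == observed_pattern]

cities = ["Delhi", "Kolkatta", "Patna", "Dehradun"]
for city in cities:
    print(city, " ".join(shape_pattern(city)))
print(prune_lexicon(cities, "A-A--"))
```

Holistic features like these are cheap to compute, so they can shrink the lexicon before a more expensive character-level recognizer runs.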

Interactive Features

a. Delhi b. Kolkatta c. Patna d. Dehradun

Lexicon

a) Amherst

b) Buffalo

c) Boston

d) None of the above

S. Madhvanath, V. Govindaraju, “The role of holistic paradigms in handwritten word recognition”, IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE Computer Society Press, 23(2): 149-164 (2001)

Lexicon

Interactive Features

J. Park, V. Govindaraju, and S. Srihari: OCR in a Hierarchical Feature Space. IEEE Trans. Pattern Anal. Mach. Intell. 22(4): 400-407 (2000)


Lexicon

Multiresolution features

A*

Lexicon

Lookup table: 29 x 10 x 85 (quad tree, 4 levels)

Interactive Lexicons

a. New Delhi b. Kolkatta c. Patna d. Dehradun

Lexicon

Length

Short (<5 chars)

Long (>8 chars)

Ascenders / Descenders

Beginning (< 3 chars)

End (> 6 chars)

Parts

1

>1

Holes / Loops

0

>0

i-dots / t-crossings

IMPACT

1995-2000

Lex size            %
10                  96
100                 91
1000 (Top 50)       80 (98)
20000 (Top 100)     62 (94)
Postal Encoding     40

Today

Lex size            %
10                  ~99
100                 ~99
1000                ~95
Postal Encoding     95

FUSION

2 Fusion

Sergey Tulyakov, Venu Govindaraju: Use of Identification Trial Statistics for the Combination of Biometric Matchers. IEEE Transactions on Information Forensics and Security 3(4):719-733 (2008)


Fusion

Serial Fusion

Lexicon: N → Recognizer C1 → Top x% → Recognizer C2 → Top y% → ... → Recognizer Cn

• How does one determine x, y ?

• How does one determine the ordering of C1, C2, … Cn

• Is there a difference based on whether it is an Identification task or Verification task?

• Can it be a generalized principled approach?
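The serial arrangement these questions refer to can be sketched as a cascade in which each recognizer re-ranks the surviving candidates and passes only a top fraction to the next stage. The sketch is illustrative, not the method from the talk: `serial_fusion`, the two toy scorers, and the fractions 0.5 and 0.3 are hypothetical, and the constant `OBSERVED` merely stands in for a word image.

```python
import math
from typing import Callable, List, Sequence

def serial_fusion(lexicon: Sequence[str],
                  recognizers: Sequence[Callable[[str], float]],
                  keep_fractions: Sequence[float]) -> List[str]:
    """Run recognizers in order; after each one keep only the top fraction."""
    candidates = list(lexicon)
    for score, fraction in zip(recognizers, keep_fractions):
        ranked = sorted(candidates, key=score, reverse=True)  # higher score = better
        keep = max(1, math.ceil(fraction * len(ranked)))
        candidates = ranked[:keep]
    return candidates

OBSERVED = "metoprolol"  # stand-in for the handwritten word image

def fast_length_score(word: str) -> float:
    """Cheap first stage: penalize length mismatch only."""
    return -abs(len(word) - len(OBSERVED))

def slow_char_score(word: str) -> float:
    """Costlier second stage: count position-wise character agreements."""
    return sum(a == b for a, b in zip(word, OBSERVED))

LEXICON = ["glyburide", "fosinopril", "metolazone", "torsemide", "metoprolol", "benazepril"]
shortlist = serial_fusion(LEXICON, [fast_length_score, slow_char_score], [0.5, 0.3])
print(shortlist)
```

Ordering the cheap recognizer first means the expensive one sees only half the lexicon; choosing the fractions x and y trades speed against the risk of pruning the correct entry early.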


Fusion

Lexicon Density

V. Govindaraju, P. Slavik*, and H. Xue, “Lexicon density as a measure for performance evaluation of handwritten recognizers”, IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE Computer Society Press, 24(6): 789-800 (2002)

• Lexicon size as feature

• Different lexicons with the same size have inconsistent results

• Lexicon density as metric

• Recognizer specific

• Performance is consistent with the same level of density
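The contrast between lexicon size and lexicon density can be illustrated with a toy proxy. The sketch below is not the measure from the cited paper: it assumes plain edit distance as a stand-in for the recognizer-specific distance between entries, and uses the reciprocal of the average pairwise distance as the density. The names `edit_distance` and `density_proxy` are hypothetical.

```python
from itertools import combinations
from typing import Sequence

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution or match
        prev = cur
    return prev[-1]

def density_proxy(lexicon: Sequence[str]) -> float:
    """Reciprocal of the average pairwise distance (needs >= 2 distinct entries)."""
    pairs = list(combinations(lexicon, 2))
    average = sum(edit_distance(a, b) for a, b in pairs) / len(pairs)
    return 1.0 / average

dense = ["cat", "bat", "hat", "rat"]        # same size, entries easily confused
sparse = ["cat", "house", "zebra", "kiwi"]  # same size, entries far apart
print(density_proxy(dense), density_proxy(sparse))
```

Both lexicons have four entries, yet the first is much harder for a recognizer; a density-style measure captures that difference while lexicon size alone does not.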

R. Milewski, A. Bharadwaj, and V. Govindaraju, “Automatic Recognition of Handwritten Medical Forms for Search Engines”, International Journal of Document Analysis and Recognition, Springer, 2009.

Bootstrapping for Large Lexicons

Fusion

Pruning Lexicons

Fusion

             CLT to RLT   CL to RL   CLT to ALT
HR           7.48%        7.42%      17.58%
Error Rate   10.78%       10.88%     24.53%

Map: M matchers x N classes scores to N combined scores

Principled Approach

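Lexicon entries Lex 1, Lex 2, ..., Lex N correspond to classes $C_1, \dots, C_i, \dots, C_N$. Matcher $j$ ($j = 1, \dots, M$) assigns score $s_i^j$ to class $i$, and the combination function $f$ maps the score matrix to combined scores $S_1, \dots, S_N$:

$$\begin{pmatrix} s_1^1 & \cdots & s_i^1 & \cdots & s_N^1 \\ \vdots & & \vdots & & \vdots \\ s_1^j & \cdots & s_i^j & \cdots & s_N^j \\ \vdots & & \vdots & & \vdots \\ s_1^M & \cdots & s_i^M & \cdots & s_N^M \end{pmatrix} \xrightarrow{\;f\;} (S_1, \dots, S_i, \dots, S_N)$$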

Verification

Verify if combined score of hypothesis $> \theta$

$S_i = f(s_i^1, s_i^2, \dots, s_i^M)$

Identification

Class of maximum combined score

$\arg\max_{i = 1, \dots, N} S_i$

Fusion

Lex 1 Lex i Lex N

Recognizer C1

Sergey Tulyakov, Venu Govindaraju: Use of Identification Trial Statistics for the Combination of Biometric Matchers. IEEE Transactions on Information Forensics and Security 3(4):719-733 (2008)

Fusion

Score Matrix

Recognizer C1

Recognizer CM

Optimal Methods

Fusion

Likelihood Ratio

$$\prod_j \frac{p(s_k^j \mid C_k)}{p(s_k^j \mid \overline{C}_k)}$$

$S_i = f_V$

$S_i = f_N$

$S_i$ ? Accept : Reject

$\arg\max_{i = 1, \dots, N} S_i$

$S_i = f_V$ ?

$S_i = f_N$ ?

C1     C2     C1 ^ C2   C1 v C2   LR     Weighted sum   NN
54.8   77.2   48.9      83.0      68.8   81.6           81.7

A B C …

.95 .89 .76 …

A B C …

.80 .54 .43 …
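The two score lists above can serve as a worked example of the difference between identification and verification. The sketch assumes that each list comes from one recognizer (called C1 and C2 here), that the combination function is an equally weighted sum, and that the verification threshold is 0.8; none of these choices is stated on the slide, and the function names are hypothetical.

```python
from typing import Dict

# Scores for classes A, B, C; assigning the lists to C1 and C2 is an assumption.
SCORES: Dict[str, Dict[str, float]] = {
    "C1": {"A": 0.95, "B": 0.89, "C": 0.76},
    "C2": {"A": 0.80, "B": 0.54, "C": 0.43},
}
WEIGHTS = {"C1": 0.5, "C2": 0.5}  # hypothetical equal weights

def combine(scores: Dict[str, Dict[str, float]],
            weights: Dict[str, float]) -> Dict[str, float]:
    """Weighted-sum combination: one combined score S_i per class."""
    classes = next(iter(scores.values())).keys()
    return {c: sum(weights[m] * scores[m][c] for m in scores) for c in classes}

def identify(combined: Dict[str, float]) -> str:
    """Identification: the class with the maximum combined score."""
    return max(combined, key=combined.get)

def verify(combined: Dict[str, float], claimed: str, theta: float) -> bool:
    """Verification: accept the claimed class if its combined score exceeds theta."""
    return combined[claimed] > theta

combined = combine(SCORES, WEIGHTS)
print(identify(combined), verify(combined, "A", 0.8), verify(combined, "B", 0.8))
```

Identification compares classes against each other, while verification looks at one claimed class against a fixed threshold, which is why a combination function tuned for one task need not be the best choice for the other.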

$$\arg\max_k \prod_j \frac{p(s_k^j, t_k^j \mid C_k)}{p(s_k^j, t_k^j \mid \overline{C}_k)}$$

Sergey Tulyakov, Chaohong Wu, Venu Govindaraju: On the Difference between Optimal Combination Functions for Verification and Identification Systems. IJPRAI 24(2): 173-191 (2010)
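A likelihood-ratio combination can be sketched in a few lines once score densities are available. The example below is a simplification under loud assumptions: matchers are treated as independent, genuine and impostor scores of each matcher are modeled as Gaussians with made-up parameters, and the identification-trial statistics discussed in the cited work are left out. All names and numbers are hypothetical.

```python
from statistics import NormalDist
from typing import Dict, Sequence

# Hypothetical per-matcher score models (genuine vs. impostor), not fitted to real data.
GENUINE = [NormalDist(0.8, 0.1), NormalDist(0.7, 0.15)]
IMPOSTOR = [NormalDist(0.4, 0.1), NormalDist(0.3, 0.15)]

def likelihood_ratio(scores: Sequence[float]) -> float:
    """Product over matchers of p(score | genuine) / p(score | impostor)."""
    ratio = 1.0
    for s, gen, imp in zip(scores, GENUINE, IMPOSTOR):
        ratio *= gen.pdf(s) / imp.pdf(s)
    return ratio

def identify(score_columns: Dict[str, Sequence[float]]) -> str:
    """Pick the class whose scores give the largest likelihood ratio."""
    return max(score_columns, key=lambda k: likelihood_ratio(score_columns[k]))

# One (matcher 1, matcher 2) score pair per class; values are illustrative.
COLUMNS = {"A": (0.95, 0.80), "B": (0.89, 0.54), "C": (0.76, 0.43)}
print(identify(COLUMNS))
```

For verification the same ratio would be compared against a threshold; for identification only the ranking of the ratios across classes matters.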

Fusion

IMPACT: Practical

• Dependent Scores

• Independent Matchers


IMPACT: Theoretical

Fusion

$S_i = f_i(\{s_i^j\}_{i,j})$

Low          $C_f(M)$
Medium I     $N\,C_f(M)$
Medium II    $C_f(NM)$
High         $N\,C_f(NM)$

INDEXING AND RETRIEVAL

3 Retrieval

Retrieval

Historical Documents

Catalin I. Tomai, Bin Zhang, Venu Govindaraju: Transcript mapping for historic handwritten document images. IWFHR 2002: 413-418

Transcript mapping

Retrieval

1755: p.12, p.22, …

Express: p.12, p.14, ..

May: p.12, p.45, ..
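An index of this form is an inverted index from words to page numbers. The sketch below builds one from per-page word lists; the page contents are hypothetical and chosen only to reproduce the entries shown above, and `build_index` is an illustrative name.

```python
from collections import defaultdict
from typing import Dict, List

def build_index(pages: Dict[int, List[str]]) -> Dict[str, List[int]]:
    """Map every word to the sorted list of pages on which it occurs."""
    index = defaultdict(list)
    for page_no in sorted(pages):
        for word in sorted(set(pages[page_no])):  # count each word once per page
            index[word].append(page_no)
    return dict(index)

# Hypothetical transcripts (or recognizer outputs) keyed by page number.
PAGES = {
    12: ["1755", "Express", "May"],
    14: ["Express"],
    22: ["1755"],
    45: ["May"],
}

index = build_index(PAGES)
for word in ("1755", "Express", "May"):
    print(word + ":", ", ".join(f"p.{p}" for p in index[word]))
```

With handwritten sources the word lists come from transcript mapping or word spotting rather than clean text, so a practical index would also store a confidence per hit.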

Word Spotting

Retrieval


Retrieval

• Imperfect Segmentation

• Multiple Writers

• Use recognition scores

)(#)|()|(ˆ,

it

w

ji wPcwPqefr

Impact

           Controlled     Historical     Medical   Postal   Checks
Lexicon    10K     1K     10K     1K     4K        100      40
Top 1      57      67     12      28     20        95       99
Top 10     74      75     32      72     42        99       99

CAPTCHAS, ACCENTS AND SECURITY

4 Security

Security

New Captchas

Security

ith component trajectory:

$$\varphi_i(t) = \theta_{s_i} + \frac{\theta_{e_i} - \theta_{s_i}}{D_i} \int_0^t v_{t_i}(\tau)\, d\tau$$
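The angular position of a stroke component can be evaluated in closed form if the speed profile is lognormal, as in the sigma-lognormal model: the integral of the speed up to time t equals D_i times the lognormal cumulative distribution, so D_i cancels. The sketch below relies on that assumption and on the stroke starting at a time t0 >= 0; the parameter values and function names are illustrative, not taken from the talk.

```python
import math

def lognormal_cdf(t: float, t0: float, mu: float, sigma: float) -> float:
    """Fraction of the stroke length covered by time t (0 before onset t0)."""
    if t <= t0:
        return 0.0
    z = (math.log(t - t0) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))

def angular_position(t: float, theta_s: float, theta_e: float,
                     t0: float, mu: float, sigma: float) -> float:
    """Direction along the stroke: sweeps from theta_s to theta_e as the stroke is drawn."""
    return theta_s + (theta_e - theta_s) * lognormal_cdf(t, t0, mu, sigma)

# Illustrative parameters: angles in radians, times in seconds.
THETA_S, THETA_E = 0.2, 1.4
T0, MU, SIGMA = 0.1, -1.5, 0.3

for t in (0.0, 0.2, 0.3, 0.5, 1.0):
    print(t, round(angular_position(t, THETA_S, THETA_E, T0, MU, SIGMA), 4))
```

Sampling this angle over time, together with the speed profile, yields the component's path; perturbing the parameters produces natural-looking variants of the same stroke, which is the property a handwritten CAPTCHA generator can use.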

[Plot: x (mm) vs. y (mm)]

Neuromotor Model

Security

Chetan Ramaiah, Réjean Plamondon, Venu Govindaraju: A Sigma-Lognormal Model for Handwritten Text CAPTCHA Generation. ICPR 2014: 250-255


Accents

Security

IMPACT


Security

Dynamic lexicons and integrated segmentation

Interactive lexicons and multiresolution

Serial fusion and lexicon density

Principled fusion methods

Indexing and Spotting

Text retrieval

CAPTCHAs and security

Accents, signatures, and security

Personal archives

Flipped classrooms

Marianne Craig Moore was an American Modernist poet, critic, translator, and editor. Her poetry is noted for formal innovation, precise diction, irony, and wit.

Transform based method works for both gray scale and binary images

Multilingual MADCAT

Neighboring patches likely to have the same class label

$\psi(i, j) = \psi_h(x_i, x_j)\,\psi_o(x_i, y_i)$

$\psi_h$: Nbd similarity; $\psi_o$: Observ dependency

Hand vs Machine vs Graphics

),(),(1),( jien

xxDjiD

jih eexx

$\sigma_x$: dominant gap between words

$\sigma_y$: dominant gap between lines

)),(/(1),( iim yxD

iio eyx

Distance in Spatial Space ($D_n$) and Feature Space ($D_e$):

$$D_n(i, j) = \frac{(dx_{i,j})^2}{2\sigma_x^2} + \frac{(dy_{i,j})^2}{2\sigma_y^2}$$

Dm: Distance in Feature Space

MRF

Flipped Class

Summary

Thank You