Page 1: Large Vocabulary Unconstrained Handwriting Recognition J Subrahmonia Pen Technologies IBM T J Watson Research Center.

Large Vocabulary Unconstrained Handwriting Recognition

J Subrahmonia

Pen Technologies

IBM T J Watson Research Center

Page 2

Pen Technologies

Pen-based interfaces in mobile computing

Page 3

Mathematical Formulation

H : Handwriting evidence on the basis of which a recognizer will make its decision

– H = {h1, h2, h3, h4, …, hm}

W : Word string from a large vocabulary

– W = {w1, w2, w3, w4, …, wn}

Recognizer :

– $\hat{W} = \arg\max_{W} p(W \mid H)$

Page 4

Mathematical Formulation

$\hat{W} = \arg\max_{W} p(W \mid H) = \arg\max_{W} \frac{p(H \mid W)\, p(W)}{p(H)} = \arg\max_{W} p(H \mid W)\, p(W)$

CHANNEL : $p(H \mid W)$

SOURCE : $p(W)$

Page 5

Source Channel Model

[Diagram: W -> CHANNEL (WRITER -> DIGITIZER -> FEATURE EXTRACTOR) -> H -> DECODER]

Page 6

Source Channel Model

$\hat{W} = \arg\max_{W} p(W \mid H) = \arg\max_{W} p(H \mid W)\, p(W)$

Handwriting Modeling (HMMs) : $p(H \mid W)$

Language Modeling : $p(W)$

SEARCH STRATEGY : $\arg\max_{W}$
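The decision rule can be illustrated with a minimal sketch. The two candidate words and all scores below are hypothetical values chosen only for illustration; a real recognizer would obtain p(H|W) from its handwriting models and p(W) from its language model, and would search a far larger vocabulary.

```python
def decode(candidates, p_h_given_w, p_w):
    """Return the candidate W that maximises p(H|W) * p(W)."""
    return max(candidates, key=lambda w: p_h_given_w[w] * p_w[w])


# Hypothetical scores for one handwriting sample H (illustrative values only).
p_h_given_w = {"clear": 0.30, "dear": 0.40}  # handwriting model, p(H|W)
p_w = {"clear": 0.02, "dear": 0.01}          # language model, p(W)

best = decode(["clear", "dear"], p_h_given_w, p_w)
print(best)
```

In this toy case the language model outweighs the slightly better handwriting score of the other word, which is the point of combining the two terms.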

Page 7

Hidden Markov Models

[Diagram: four models linked by two operations]

Memoryless Model + Add Memory => Markov Model

Memoryless Model + Hide Something => Mixture Model

Markov Model + Hide Something => Hidden Markov Model

Mixture Model + Add Memory => Hidden Markov Model

Alan B Poritz : Hidden Markov Models : A Guided Tour ICASSP 1988

Page 8

Memoryless Model

COIN : Heads (1) : probability p; Tails (0) : probability 1-p

Flip the coin 10 times (IID random sequence)

Sequence : 1 0 1 0 0 0 1 1 1 1

Probability = p*(1-p)*p*(1-p)*(1-p)*(1-p)*p*p*p*p = $p^{6}(1-p)^{4}$
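As a small sketch of the memoryless computation, the function below multiplies one factor per flip. The function name and the value p = 0.5 are assumptions for illustration; the slide leaves p symbolic.

```python
def iid_sequence_probability(sequence, p_heads):
    """Probability of a 0/1 sequence under independent flips of a single coin."""
    prob = 1.0
    for outcome in sequence:
        prob *= p_heads if outcome == 1 else 1.0 - p_heads
    return prob


flips = [1, 0, 1, 0, 0, 0, 1, 1, 1, 1]  # the sequence from the slide
p = 0.5                                  # assumed value, for illustration only
print(iid_sequence_probability(flips, p))
```

Because the flips are independent, only the counts matter: six heads and four tails give p^6 (1-p)^4 regardless of their order.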

Page 9

Add Memory – Markov Model

2 Coins :

COIN 1 => p(1) = 0.9, p(0) = 0.1

COIN 2 => p(1) = 0.1, p(0) = 0.9

Experiment :

Flip COIN 1, note the outcome

If (outcome = Head) Flip COIN 1

Else Flip COIN 2

End

Sequence 1100 : Probability = 0.9*0.9*0.1*0.9

Sequence 1010 : Probability = 0.9*0.1*0.1*0.1
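The experiment can be sketched as follows, assuming (as in the two worked sequences) that the coin choice is re-applied after every flip. The names `COINS` and `markov_sequence_probability` are illustrative.

```python
# Output distribution of each coin: COINS[coin][outcome] = probability.
COINS = {1: {1: 0.9, 0: 0.1}, 2: {1: 0.1, 0: 0.9}}


def markov_sequence_probability(sequence):
    """Probability of a 0/1 sequence when the last outcome selects the next coin."""
    coin = 1  # the experiment starts with COIN 1
    prob = 1.0
    for outcome in sequence:
        prob *= COINS[coin][outcome]
        coin = 1 if outcome == 1 else 2  # Head -> COIN 1, Tail -> COIN 2
    return prob


print(markov_sequence_probability([1, 1, 0, 0]))  # 0.9*0.9*0.1*0.9
print(markov_sequence_probability([1, 0, 1, 0]))  # 0.9*0.1*0.1*0.1
```

The coin in use is fully determined by the previous outcome, so the observed outputs reveal the state sequence.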

Page 10

State Sequence Representation

[Diagram: two states, 1 and 2. State 1 outputs 1 with probability 0.9 and 0 with probability 0.1; state 2 outputs 1 with probability 0.1 and 0 with probability 0.9.]

Observed Output Sequence <=> Unique State Sequence

Page 11

Hide the states => Hidden Markov Model

[Diagram: two states, s1 and s2. The arcs carry transition probabilities 0.9, 0.1, 0.1, 0.9, and each arc has its own output distribution: (0.9, 0.1), (0.1, 0.9), (0.1, 0.9), (0.9, 0.1).]

Page 12

Why use Hidden Markov Models Instead of Non-hidden?

Hidden Markov Models can be smaller: fewer parameters to estimate

States may be truly hidden

– Position of the hand

– Positions of articulators

Page 13

Summary of HMM Basics

We are interested in assigning probabilities p(H) to feature sequences

Memoryless model : $p(H) = \prod_{i=1}^{n} p(h_i)$

– This model has no memory of the past

Markov noticed that in some sequences the future depends on the past. He introduced the concept of a STATE: an equivalence class of the past that influences the future

– $p(h_i \mid h_{i-1}, \ldots, h_1) = p(h_i \mid s_{i-1})$

Hide the states : HMM

– $p(H) = \sum_{S} p(H, S)$

Page 14

Hidden Markov Models

Given an observed sequence H

– Compute p(H) for decoding

– Find the most likely state sequence for a given Markov model (Viterbi algorithm)

– Estimate the parameters of the Markov source (training)

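Page 15

Compute p(H)

[Diagram: three-state model with states s1, s2, s3. Each arc is listed with its transition probability and, for arcs that produce an output, its output distribution (p(a), p(b)):]

s1 -> s1 : 0.5, (0.8, 0.2)

s1 -> s2 : 0.3, (0.7, 0.3)

s1 -> s2 : 0.2, null (no output)

s2 -> s2 : 0.4, (0.5, 0.5)

s2 -> s3 : 0.5, (0.3, 0.7)

s2 -> s3 : 0.1, null (no output)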

Page 16

Compute p(H) – contd.

Compute p(H) where H = a a b b

Enumerate all ways of producing h1 = a

s1 -> s1 (a) : 0.5x0.8 = 0.40

s1 -> s2 (a) : 0.3x0.7 = 0.21

s1 -> s2 (null) -> s2 (a) : 0.2 x 0.4x0.5 = 0.04

s1 -> s2 (null) -> s3 (a) : 0.2 x 0.5x0.3 = 0.03

Page 17

Compute p(H) – contd.

Enumerate all ways of producing h1 = a, h2 = a

[Diagram: the path tree for h1 = a grown by one more level. From every node reached after h1 = a, the same arcs are applied again to produce h2 = a: 0.5x0.8 (s1 -> s1), 0.3x0.7 (s1 -> s2), 0.2 (s1 -> s2, null), 0.4x0.5 (s2 -> s2), 0.5x0.3 (s2 -> s3).]
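The enumeration can be carried to the full sequence H = a a b b with a short recursive sketch. The arc list is read off the arc weights used in the trellis for this example; treating s3 as the required final state is an assumption, and the names `ARCS` and `enumerate_paths` are illustrative.

```python
# Arcs: (source, destination, transition prob, output distribution or None for a null arc).
# States s1, s2, s3 are indexed 0, 1, 2.
ARCS = [
    (0, 0, 0.5, {"a": 0.8, "b": 0.2}),
    (0, 1, 0.3, {"a": 0.7, "b": 0.3}),
    (0, 1, 0.2, None),
    (1, 1, 0.4, {"a": 0.5, "b": 0.5}),
    (1, 2, 0.5, {"a": 0.3, "b": 0.7}),
    (1, 2, 0.1, None),
]


def enumerate_paths(arcs, outputs, final_state, state=0, pos=0, prob=1.0):
    """Yield the probability of every complete path that produces `outputs`."""
    if state == final_state and pos == len(outputs):
        yield prob
    for src, dst, p, out in arcs:
        if src != state:
            continue
        if out is None:  # null arc: change state without consuming an output
            yield from enumerate_paths(arcs, outputs, final_state, dst, pos, prob * p)
        elif pos < len(outputs):  # emitting arc: consume the next output symbol
            yield from enumerate_paths(
                arcs, outputs, final_state, dst, pos + 1, prob * p * out[outputs[pos]]
            )


path_probs = list(enumerate_paths(ARCS, "aabb", final_state=2))
print(len(path_probs), sum(path_probs))
```

The number of paths grows quickly with the length of H, which is what motivates combining paths in a trellis.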

Page 18

Compute p(H)

Can save computation by combining paths

[Diagram: the path tree with nodes for the same state at the same step (s1, s2, s3) merged.]

Page 19

Compute p(H)

Trellis Diagram

[Trellis: rows s1, s2, s3; columns 0, a, aa, aab, aabb. Arc weights (transition probability x output probability) for the four output steps a, a, b, b:]

s1 -> s1 : .5x.8, .5x.8, .5x.2, .5x.2

s2 -> s2 : .4x.5, .4x.5, .4x.5, .4x.5

s1 -> s2 : .3x.7, .3x.7, .3x.3, .3x.3

s2 -> s3 : .5x.3, .5x.3, .5x.7, .5x.7

s1 -> s2 (null, inside each of the five columns) : .2

s2 -> s3 (null, inside each of the five columns) : .1

Page 20

Basic Recursion

Prob(Node) = sum( Prob(predecessor) x Prob(predecessor -> node) )

Boundary condition : Prob(s1, 0) = 1

Node probabilities (columns are the output produced so far):

        0       a       aa      aab     aabb

s1      1.0     0.4     .16     .016    .0016

s2      0.2     0.33    .182    .054    .01256

s3      0.02    0.063   .0677   .0691   .020156

Contributions to each node, written as "predecessor, output : value" (output 0 marks a null transition):

s1 : a = (s1, a : 0.4); aa = (s1, a : .16); aab = (s1, b : .016); aabb = (s1, b : .0016)

s2 : 0 = (s1, 0 : 0.2); a = (s1, 0 : .08) + (s1, a : .21) + (s2, a : .04); aa = (s1, 0 : .032) + (s1, a : .084) + (s2, a : .066); aab = (s1, 0 : .0032) + (s1, b : .0144) + (s2, b : .0364); aabb = (s1, 0 : .00032) + (s1, b : .00144) + (s2, b : .0108)

s3 : 0 = (s2, 0 : 0.02); a = (s2, 0 : .033) + (s2, a : .03); aa = (s2, 0 : .0182) + (s2, a : .0495); aab = (s2, 0 : .0054) + (s2, b : .0637); aabb = (s2, 0 : .001256) + (s2, b : .0189)
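The recursion can be sketched in a few lines and checked against the trellis values. The arc list is read off the arc weights of this example; the representation and the name `forward` are assumptions, and the sketch relies on null arcs only leading to higher-numbered states so that each column can be filled top to bottom.

```python
# Arcs: (source, destination, transition prob, output distribution or None for a null arc).
# States s1, s2, s3 are indexed 0, 1, 2.
ARCS = [
    (0, 0, 0.5, {"a": 0.8, "b": 0.2}),
    (0, 1, 0.3, {"a": 0.7, "b": 0.3}),
    (0, 1, 0.2, None),
    (1, 1, 0.4, {"a": 0.5, "b": 0.5}),
    (1, 2, 0.5, {"a": 0.3, "b": 0.7}),
    (1, 2, 0.1, None),
]


def forward(arcs, n_states, outputs):
    """Return alpha[t][s]: probability of producing outputs[:t] and being in state s."""
    alpha = [[0.0] * n_states for _ in range(len(outputs) + 1)]
    alpha[0][0] = 1.0  # boundary condition: Prob(s1, 0) = 1
    for t in range(len(outputs) + 1):
        for s in range(n_states):  # increasing order, so null-arc sources are complete
            for src, dst, p, out in arcs:
                if dst != s:
                    continue
                if out is None:  # null arc stays in the same column
                    alpha[t][s] += alpha[t][src] * p
                elif t > 0:      # emitting arc comes from the previous column
                    alpha[t][s] += alpha[t - 1][src] * p * out[outputs[t - 1]]
    return alpha


alpha = forward(ARCS, 3, "aabb")
for column in alpha:
    print([round(v, 6) for v in column])
```

The last column's s3 entry is 0.020156, the bottom-right trellis value. The work is proportional to the number of arcs times the length of H, rather than to the number of paths.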

Page 21

More Formally – Forward Algorithm

$\alpha_t(s) = \sum_{s'} \alpha_{t-1}(s')\, P(s \mid s')\, P(h_t \mid s' \to s) + \sum_{s'} \alpha_t(s')\, P(s \mid s')$

(first sum: transitions that produce the output $h_t$; second sum: null transitions)

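Page 22

Find Most Likely Path for aabb : Dynamic Programming or Viterbi

Max Prob(Node) = MAX( Max Prob(predecessor) x Prob(predecessor -> node) )

Trellis rows s1, s2, s3; columns 0, a, aa, aab, aabb. Candidates for each node, written as "predecessor, output : value" (output 0 marks a null transition); the node keeps the largest:

s1 : 0 = 1.0; a = (s1, a : 0.4); aa = (s1, a : .16); aab = (s1, b : .016); aabb = (s1, b : .0016)

s2 : 0 = (s1, 0 : 0.2); a = (s1, 0 : .08), (s1, a : .21), (s2, a : .04), max .21; aa = (s1, 0 : .032), (s1, a : .084), (s2, a : .042), max .084; aab = (s1, 0 : .0032), (s1, b : .0144), (s2, b : .0168), max .0168; aabb = (s1, 0 : .00032), (s1, b : .00144), (s2, b : .00336), max .00336

s3 : 0 = (s2, 0 : 0.02); a = (s2, 0 : .021), (s2, a : .03), max .03; aa = (s2, 0 : .0084), (s2, a : .0315), max .0315; aab = (s2, 0 : .00168), (s2, b : .0294), max .0294; aabb = (s2, 0 : .000336), (s2, b : .00588), max .00588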
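The same trellis with max in place of sum can be sketched as follows, with back-pointers to recover the path. The arc list is read off the arc weights of this example; the representation, the name `viterbi`, and the choice of s3 as the final state are assumptions, and null arcs are assumed to lead only to higher-numbered states.

```python
# Arcs: (source, destination, transition prob, output distribution or None for a null arc).
# States s1, s2, s3 are indexed 0, 1, 2.
ARCS = [
    (0, 0, 0.5, {"a": 0.8, "b": 0.2}),
    (0, 1, 0.3, {"a": 0.7, "b": 0.3}),
    (0, 1, 0.2, None),
    (1, 1, 0.4, {"a": 0.5, "b": 0.5}),
    (1, 2, 0.5, {"a": 0.3, "b": 0.7}),
    (1, 2, 0.1, None),
]


def viterbi(arcs, n_states, outputs):
    """Return (probability, [(column, state), ...]) of the most likely path."""
    T = len(outputs)
    best = [[0.0] * n_states for _ in range(T + 1)]
    back = [[None] * n_states for _ in range(T + 1)]
    best[0][0] = 1.0
    for t in range(T + 1):
        for s in range(n_states):  # increasing order, so null-arc sources are complete
            for src, dst, p, out in arcs:
                if dst != s:
                    continue
                if out is None:
                    score, prev = best[t][src] * p, (t, src)
                elif t > 0:
                    score, prev = best[t - 1][src] * p * out[outputs[t - 1]], (t - 1, src)
                else:
                    continue
                if score > best[t][s]:
                    best[t][s], back[t][s] = score, prev
    # Follow the back-pointers from the final state in the last column.
    node, path = (T, n_states - 1), []
    while node is not None:
        path.append(node)
        node = back[node[0]][node[1]]
    return best[T][n_states - 1], path[::-1]


prob, path = viterbi(ARCS, 3, "aabb")
print(prob, path)
```

For aabb the best path is s1 -a-> s1 -a-> s2 -b-> s2 -b-> s3 with probability 0.4 x 0.21 x 0.2 x 0.35 = 0.00588.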

Page 23

Training HMM parameters

[Diagram: initial model. The three transitions leaving the first state have probability 1/3 each; the two transitions leaving the second state have probability 1/2 each; p(a) = p(b) = 1/2.]

H = abaa

Path probabilities (as laid out in the figure) : .000385 .000578 .000868 .001302 .001157 .002604 .001736

p(H) = .008632

Page 24

Training HMM parameters

[Diagram: the model with its transitions labelled t1, t2, t3, t4, t5]

$c_i$ = a posteriori probability of path i = $p_i / p(H)$

c1 = .045, c2 = .067, c3 = .134, c4 = .100, c5 = .201, c6 = .150, c7 = .301

$c(t_1) = 3c_1 + 2c_2 + 2c_3 + c_4 + c_5 = 0.838$

$c(t_2) = c_3 + c_5 + c_7 = 0.637$

$c(t_3) = c_1 + c_2 + c_4 + c_6 = 0.363$

New : $p(t_1) = 0.46$, $p(t_2) = 0.34$, $p(t_3) = 0.20$

Page 25

Training HMM parameters

[Diagram: the model with the transitions t1, t2, t4, t5 labelled]

$c(t_1, a) = 2c_1 + c_2 + c_3 + c_4 + c_5 = 0.592$

$c(t_1, b) = c_1 + c_2 + c_3 = 0.246$

New : $p(t_1, a) = 0.71$, $p(t_1, b) = 0.29$

Page 26

Training HMM parameters

[Diagram: the model redrawn with the re-estimated parameters. Labels in the figure: .71, .29, .68, .32, .64, .36, .60, .40, .34, .46, .20, .60, .40. These include p(t1) = .46, p(t2) = .34, p(t3) = .20, p(t1, a) = .71 and p(t1, b) = .29.]

New path probabilities : p1 = 0.00108, p2 = 0.00129, p3 = 0.00404, p4 = 0.00212, p5 = 0.00537, p6 = 0.00253, p7 = 0.00791

$p(H) = 0.02438 > 0.008632$

Keep on repeating : 600 iterations : p(H) = .037037037

Another initial parameter set : p(H) = 0.0625
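One re-estimation step by path enumeration can be sketched as below. The model structure is an assumption inferred from the count formulas: t1 is a self-loop on the first state, t2 (with output) and t3 (null) lead to the second state, t4 is a self-loop on the second state, and t5 leads to the final state. All names are illustrative.

```python
from collections import defaultdict


def initial_model():
    """Arc name -> (source, destination, transition prob, output distribution or None)."""
    half = {"a": 0.5, "b": 0.5}
    return {
        "t1": (0, 0, 1 / 3, dict(half)),
        "t2": (0, 1, 1 / 3, dict(half)),
        "t3": (0, 1, 1 / 3, None),  # null transition
        "t4": (1, 1, 1 / 2, dict(half)),
        "t5": (1, 2, 1 / 2, dict(half)),
    }


def paths(model, outputs, state=0, pos=0, prob=1.0, used=()):
    """Yield (probability, ((arc, symbol or None), ...)) for every complete path."""
    if state == 2 and pos == len(outputs):
        yield prob, used
    for name, (src, dst, p, out) in model.items():
        if src != state:
            continue
        if out is None:
            yield from paths(model, outputs, dst, pos, prob * p, used + ((name, None),))
        elif pos < len(outputs):
            sym = outputs[pos]
            yield from paths(model, outputs, dst, pos + 1, prob * p * out[sym],
                             used + ((name, sym),))


def em_step(model, outputs):
    """Return (p(H) under `model`, re-estimated model)."""
    all_paths = list(paths(model, outputs))
    p_h = sum(prob for prob, _ in all_paths)
    arc_count, out_count = defaultdict(float), defaultdict(float)
    for prob, used in all_paths:
        c = prob / p_h  # a posteriori probability of this path
        for name, sym in used:
            arc_count[name] += c
            if sym is not None:
                out_count[name, sym] += c
    leaving = defaultdict(float)  # total count of arcs leaving each state
    for name, (src, _, _, _) in model.items():
        leaving[src] += arc_count[name]
    new_model = {}
    for name, (src, dst, _, out) in model.items():
        total = arc_count[name] or 1.0  # guard against an arc that is never used
        new_out = None if out is None else {s: out_count[name, s] / total for s in out}
        new_model[name] = (src, dst, arc_count[name] / leaving[src], new_out)
    return p_h, new_model


p0, model1 = em_step(initial_model(), "abaa")
p1, _ = em_step(model1, "abaa")
print(round(p0, 6), round(p1, 5))
```

Under these assumptions the first step reproduces p(H) = .008632 and the second evaluates the re-estimated model at about .0244, in line with the figures on the slides.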

Page 27

Training HMM parameters

Converges to local maximum

There are 7 (at least) local maxima

Final solution depends on starting point

Speed of convergence depends on starting point

Page 28

Training HMM parameters : Forward Backward algorithm

Improves on enumerating algorithm by using the Trellis

Results in reduction from exponential computation to linear computation

Page 29

Forward Backward Algorithm

[Diagram: trellis of states against output positions, drawn as a grid of dots, with states $s_a$ and $s_b$ and output position j marked.]

Page 30

Forward Backward Algorithm

$p_j(t_i, H)$ = Probability that $h_j$ is produced by $t_i$ and the complete output is H

$p_j(t_i, H) = \alpha_{j-1}(s_a)\, P(t_i)\, P(h_j \mid t_i)\, \beta_j(s_b)$

$\alpha_{j-1}(s_a)$ = Probability of being in state $s_a$ and producing the output h1, ..., hj-1

$\beta_j(s_b)$ = Probability of being in state $s_b$ and producing the output hj+1, ..., hm

Page 31

Forward Backward Algorithm

Transition count

$C(t_i \mid H) = \sum_{j} p_j(t_i, H) / p(H)$

$\beta_t(s) = \sum_{s'} \beta_{t+1}(s')\, P(s' \mid s)\, P(h_{t+1} \mid s \to s') + \sum_{s'} \beta_t(s')\, P(s' \mid s)$

(first sum: transitions that produce the output $h_{t+1}$; second sum: null transitions)

Page 32

Training HMM parameters

Guess initial values for all parameters

Compute forward and backward pass probabilities

Compute counts

Re-estimate probabilities

BAUM-WELCH, BAUM-EAGON, FORWARD-BACKWARD, E-M
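The forward and backward passes and the count computation can be sketched on the training example H = abaa. The model structure is an assumption inferred from the count formulas (t1 self-loop on the first state, t2 with output and t3 null to the second state, t4 self-loop on the second state, t5 to the final state); the names are illustrative, and null arcs are assumed to lead only to higher-numbered states.

```python
# Arc name -> (source, destination, transition prob, output distribution or None for null).
HALF = {"a": 0.5, "b": 0.5}
ARCS = {
    "t1": (0, 0, 1 / 3, HALF),
    "t2": (0, 1, 1 / 3, HALF),
    "t3": (0, 1, 1 / 3, None),
    "t4": (1, 1, 1 / 2, HALF),
    "t5": (1, 2, 1 / 2, HALF),
}
N_STATES = 3  # the last state is assumed to be the final state


def forward(outputs):
    alpha = [[0.0] * N_STATES for _ in range(len(outputs) + 1)]
    alpha[0][0] = 1.0
    for t in range(len(outputs) + 1):
        for s in range(N_STATES):  # increasing order for null arcs
            for src, dst, p, out in ARCS.values():
                if dst != s:
                    continue
                if out is None:
                    alpha[t][s] += alpha[t][src] * p
                elif t > 0:
                    alpha[t][s] += alpha[t - 1][src] * p * out[outputs[t - 1]]
    return alpha


def backward(outputs):
    T = len(outputs)
    beta = [[0.0] * N_STATES for _ in range(T + 1)]
    beta[T][N_STATES - 1] = 1.0
    for t in range(T, -1, -1):
        for s in range(N_STATES - 1, -1, -1):  # decreasing order, mirror of forward
            for src, dst, p, out in ARCS.values():
                if src != s:
                    continue
                if out is None:
                    beta[t][s] += p * beta[t][dst]
                elif t < T:
                    beta[t][s] += p * out[outputs[t]] * beta[t + 1][dst]
    return beta


def transition_counts(outputs):
    """Return (p(H), expected number of times each arc is used given H)."""
    T = len(outputs)
    alpha, beta = forward(outputs), backward(outputs)
    p_h = alpha[T][N_STATES - 1]
    counts = {}
    for name, (src, dst, p, out) in ARCS.items():
        if out is None:
            total = sum(alpha[t][src] * p * beta[t][dst] for t in range(T + 1))
        else:
            total = sum(alpha[j - 1][src] * p * out[outputs[j - 1]] * beta[j][dst]
                        for j in range(1, T + 1))
        counts[name] = total / p_h
    return p_h, counts


p_h, counts = transition_counts("abaa")
print(round(p_h, 6), {name: round(c, 3) for name, c in counts.items()})
```

Under these assumptions the counts for t1, t2 and t3 agree with the path-enumeration values .838, .637 and .363; normalising the counts of the arcs leaving each state gives the re-estimated transition probabilities.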