Combining Shape and Physical Models for Online Cursive Handwriting Synthesis Jue Wang (University of Washington) Chenyu Wu (Carnegie Mellon University) Ying-Qing Xu (Microsoft Research Asia) Heung-Yeung Shum (Microsoft Research Asia) International Journal on Document Analysis and Recognition (IJDAR) 2004

Combining Shape and Physical Models for Online Cursive Handwriting Synthesis

Jue Wang (University of Washington)
Chenyu Wu (Carnegie Mellon University)
Ying-Qing Xu (Microsoft Research Asia)
Heung-Yeung Shum (Microsoft Research Asia)

International Journal on Document Analysis and Recognition (IJDAR) 2004

Introduction

Handwriting computing techniques (pen-based devices)

  • Handwriting recognition: makes it possible for computers to understand the information contained in handwriting
  • Handwriting modulation: handwriting editing, error correction, script searching

Introduction

Handwriting modeling and synthesis

  • Movement-simulation techniques
      • are based on motor models and try to model the process of handwriting production
      • focus on the representation and analysis of real handwriting signals rather than on handwriting synthesis

Introduction

  • Shape-simulation methods
      • consider the static shape of the handwriting trajectory
      • are more practical than movement-simulation techniques when dynamic information is not available
      • straightforward approach: synthesize from collected handwritten glyphs
      • learning-based cursive handwriting synthesis approach

Introduction

  • A successful handwriting synthesis algorithm must consider
      • shapes of letters vs. training samples
      • connection between synthesized letters
  • A novel cursive handwriting synthesis technique
      • combines the advantages of the shape-simulation and the movement-simulation methods

Outline

  • Sample collection and segmentation
  • Learning strategies
  • Synthesis strategies
  • Experimental results
  • Discussion and conclusion

Sample Collection

  • About 200 words
  • Each letter appears more than 5 times
  • The handwriting samples first pass through a low-pass filter and are then re-sampled to produce equidistant points
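The preprocessing step above can be sketched as follows. This is only an illustration under stated assumptions: the slides do not say which low-pass filter or spacing was used, so the moving-average filter, the `window` size, and the `spacing` value are hypothetical choices.

```python
import math

def low_pass(points, window=3):
    """Smooth a pen trajectory with a centered moving average
    (one simple low-pass filter; the filter actually used is not specified)."""
    half = window // 2
    out = []
    for k in range(len(points)):
        lo, hi = max(0, k - half), min(len(points), k + half + 1)
        xs = [p[0] for p in points[lo:hi]]
        ys = [p[1] for p in points[lo:hi]]
        out.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return out

def resample_equidistant(points, spacing):
    """Return points that are `spacing` apart, measured along the polyline."""
    out = [points[0]]
    carried = 0.0  # arc length travelled since the last emitted point
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        while seg > 0 and carried + seg >= spacing:
            t = (spacing - carried) / seg
            x0, y0 = x0 + t * (x1 - x0), y0 + t * (y1 - y0)
            out.append((x0, y0))
            seg = math.hypot(x1 - x0, y1 - y0)
            carried = 0.0
        carried += seg
    return out

# Hypothetical raw pen samples with uneven spacing
raw = [(0.0, 0.0), (0.1, 0.05), (0.5, 0.0), (2.0, 0.1), (2.1, 3.0)]
resampled = resample_equidistant(low_pass(raw), 0.5)
```

Filtering before resampling keeps pen jitter from being copied into the equidistant points.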

Sample Segmentation

Overview

  • Segmentation-based recognition method
  • Recognition-based segmentation (relies heavily on the performance of the recognition engine)
  • Level-building
      • simultaneously outputs the recognition and segmentation results
      • segmentation and recognition are merged to give an optimal result

A Two-level Framework

  • Framework of traditional handwriting segmentation approaches
  • Temporal handwriting sequence: $S = \{z_1, \ldots, z_T\}$
  • $z_t$ is a low-level feature that denotes the coordinate and velocity of the sequence at time $t$

Segmentation

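The segmentation problem is to find the identity string $\{I_1, \ldots, I_n\}$, with the corresponding segments of the sequence $\{S_1, \ldots, S_n\}$, $S_1 = \{z_1, \ldots, z_{t_1}\}, \ldots, S_n = \{z_{t_{n-1}}, \ldots, z_T\}$, that best explain the sequence:

$$
\begin{aligned}
\{I_1^*, \ldots, I_n^*\} &= \arg\max\; p(S_1, \ldots, S_n \mid I_1, \ldots, I_n)\, p(I_1, \ldots, I_n) \\
&= \arg\max\; p(I_1)\, p(S_1 \mid I_1) \prod_{i=2}^{n} p(S_i \mid I_i)\, p(I_i \mid I_{i-1})
\end{aligned}
$$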
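The maximization of $p(I_1)\,p(S_1 \mid I_1)\prod_{i \ge 2} p(S_i \mid I_i)\,p(I_i \mid I_{i-1})$ over all segmentations and letter strings can be illustrated with a small dynamic program. This is a generic sketch, not the level-building implementation used by the authors; the one-dimensional toy features and the scoring functions `seg_loglik`, `prior_logp`, and `trans_logp` are hypothetical stand-ins for trained models.

```python
import math

def segment(seq, letters, seg_loglik, prior_logp, trans_logp):
    """Return the best letter string and segment boundaries for `seq`."""
    T = len(seq)
    # best[t][c] = (score, start of last segment, previous letter) for the best
    # segmentation of seq[:t] whose last segment is labelled c
    best = [dict() for _ in range(T + 1)]
    for t in range(1, T + 1):
        for c in letters:
            cand = (prior_logp(c) + seg_loglik(seq[:t], c), 0, None)
            for s in range(1, t):
                for p, (score, _, _) in best[s].items():
                    total = score + trans_logp(p, c) + seg_loglik(seq[s:t], c)
                    if total > cand[0]:
                        cand = (total, s, p)
            best[t][c] = cand
    # Backtrack from the best final letter
    c = max(letters, key=lambda k: best[T][k][0])
    t, labels, cuts = T, [], []
    while c is not None:
        _, s, p = best[t][c]
        labels.append(c)
        cuts.append((s, t))
        t, c = s, p
    return labels[::-1], cuts[::-1]

# Toy models (assumptions): each letter is a mean value of a 1-D feature
MEANS = {"a": 0.0, "b": 1.0}

def seg_loglik(seg, c):
    return -sum((z - MEANS[c]) ** 2 for z in seg)

def prior_logp(c):
    return math.log(0.5)

def trans_logp(prev, c):
    return math.log(0.5)

labels, cuts = segment([0.0, 0.1, 0.0, 1.0, 0.9, 1.0], "ab",
                       seg_loglik, prior_logp, trans_logp)
```

Each extra segment pays one transition log-probability, which is what keeps the toy example from splitting the sequence into single frames.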

Segmentation

  • For the training of the writer-independent segmentation system
      • the low-level feature-based segmentation algorithm works well for a small number of writers
  • A script code is calculated from the handwriting data as the middle-level feature

Middle Level Feature

  • Five kinds of key points are extracted
      • points of maximum/minimum x-coordinate ($X^+$, $X^-$)
      • points of maximum/minimum y-coordinate ($Y^+$, $Y^-$)
      • crossing points
  • Average direction of the interval sequence between two adjacent key points

Script Codes Examples

Middle Level Feature

  • Samples of each character are divided into several clusters
      • those in the same cluster have a similar structural topology
  • Since the length of the script code might not be the same in all cases, the similarity cannot be computed directly
  • The script code is modeled as a homogeneous Markov chain

Middle Level Feature

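  • Given two script codes $T_1$, $T_2$
  • We may compute the stationary distributions $\pi_1$, $\pi_2$ and transition matrices $A_1$, $A_2$
  • The similarity between two script codes is measured as

$$
d(T_1, T_2) = \exp\left\{ -\frac{\alpha_1}{2}\left[KL(\pi_1, \pi_2) + KL(\pi_2, \pi_1)\right] - \frac{\alpha_2}{2}\left[KL(A_1, A_2) + KL(A_2, A_1)\right] - \alpha_3 (n_1 - n_2)^2 \right\}
$$

$$
KL(\pi_1, \pi_2) = \sum_{l=1}^{n} \pi_1(l) \log \frac{\pi_1(l)}{\pi_2(l)}
$$

where $n_1$ and $n_2$ are the lengths of the two script codes and $\alpha_1, \alpha_2, \alpha_3$ are weights.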
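A minimal sketch of such a similarity measure, assuming each script code is a list of key-point symbols, might look like this. Several details are assumptions rather than the authors' choices: the additive smoothing of transition counts, the power-iteration estimate of the stationary distribution, the row-wise KL divergence between transition matrices weighted by the stationary distribution, and the weight values `a1`, `a2`, `a3`.

```python
import math

def transition_matrix(code, states, smooth=1.0):
    """Estimate a row-stochastic transition matrix (additive smoothing is an assumption)."""
    idx = {s: k for k, s in enumerate(states)}
    counts = [[smooth] * len(states) for _ in states]
    for a, b in zip(code, code[1:]):
        counts[idx[a]][idx[b]] += 1.0
    return [[c / sum(row) for c in row] for row in counts]

def stationary(A, iters=500):
    """Approximate the stationary distribution by repeated multiplication pi <- pi A."""
    pi = [1.0 / len(A)] * len(A)
    for _ in range(iters):
        pi = [sum(pi[i] * A[i][j] for i in range(len(A))) for j in range(len(A))]
    return pi

def kl(p, q):
    """Kullback-Leibler divergence between two discrete distributions."""
    return sum(pl * math.log(pl / ql) for pl, ql in zip(p, q) if pl > 0)

def kl_matrix(A, B, pi):
    """Row-wise KL divergence weighted by pi (one possible definition)."""
    return sum(pi[i] * kl(A[i], B[i]) for i in range(len(A)))

def similarity(code1, code2, states, a1=1.0, a2=1.0, a3=0.01):
    A1, A2 = transition_matrix(code1, states), transition_matrix(code2, states)
    p1, p2 = stationary(A1), stationary(A2)
    sym_pi = kl(p1, p2) + kl(p2, p1)
    sym_a = kl_matrix(A1, A2, p1) + kl_matrix(A2, A1, p2)
    length_gap = (len(code1) - len(code2)) ** 2
    return math.exp(-a1 / 2 * sym_pi - a2 / 2 * sym_a - a3 * length_gap)

STATES = ["X+", "X-", "Y+", "Y-"]
code_a = ["X+", "Y+", "X-", "Y-", "X+", "Y+"]
code_b = ["X+", "X-", "X+", "X-", "Y+"]
d_same = similarity(code_a, code_a, STATES)
d_diff = similarity(code_a, code_b, STATES)
```

Because every term in the exponent is symmetric in the two codes, the measure does not depend on argument order.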

Middle Level Feature

  • The positions of $\pi_1$, $\pi_2$, $A_1$, $A_2$ are enforced symmetrically
  • The weights $\alpha_1, \alpha_2, \alpha_3$ balance the variance of the KL divergence and the difference in code length
  • If both the stationary distributions and the transition matrices of two script codes match well, and their code lengths are almost the same, then $d(T_1, T_2)$ is close to 1

Segmentation

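After introducing the script code as a middle-level feature, the optimization problem becomes

$$
\begin{aligned}
\{I_1^*, \ldots, I_n^*\} &= \arg\max\; p(S_1, \ldots, S_n \mid I_1, \ldots, I_n)\, p(I_1, \ldots, I_n) \\
&= \arg\max\; p(S, T \mid I_1, \ldots, I_n)\, p(I_1, \ldots, I_n) \\
&= \arg\max\; p(I_1)\, p(T_1 \mid I_1)\, p(S_1 \mid T_1, I_1) \prod_{i=2}^{n} p(I_i \mid I_{i-1})\, p(T_i \mid I_i, T_{i-1}, I_{i-1})\, p(S_i \mid T_i, I_i)
\end{aligned}
$$

  • improves the accuracy of segmentation
  • dramatically reduces the computational complexity of level-building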

Graph Model

Result

Outline

  • Sample collection and segmentation
  • Learning strategies
  • Synthesis strategies
  • Experimental results
  • Discussion and conclusion

Learning Strategies

  • Data alignment
      • Trajectory matching
      • Training set alignment
  • Shape models

Trajectory Matching

  • Segmentation and Reconstruction of On-line Handwritten Scripts (Pattern Recognition, 1998)
  • Each piece is a simple arc, so points can be sampled equidistantly from it to represent the stroke

Trajectory Matching

  • Landmark-point-extraction method
      • pen-down, pen-up points
      • local extrema of curvature
      • inflection points of curvature
  • A handwriting sample can be divided into as many as six pieces
  • Samples of the same character are mostly composed of the same number of pieces, and they match each other naturally

Trajectory Matching

A handwriting sample can be represented by a point vector

$$
X = \{(x_1^1, x_2^1, \ldots, x_{n_1}^1), \ldots, (x_1^s, x_2^s, \ldots, x_{n_s}^s),\; (y_1^1, y_2^1, \ldots, y_{n_1}^1), \ldots, (y_1^s, y_2^s, \ldots, y_{n_s}^s)\}
$$

  • $s$: number of static pieces segmented from the sample
  • $n_i$: number of points extracted from the $i$-th piece

Trajectory Matching

  • The next step is to align the different vectors into a common coordinate frame
      • estimate an affine transform for each sample that transforms the sample into the coordinate frame
  • Affine transformations: translation, rotation, scaling

Training Set Alignment

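  • Iterative algorithm (Learning from One Example Through Shared Densities on Transforms, IEEE CVPR 2000)
  • A deformable-energy-based criterion is defined as

$$
E = -\log\left[\frac{1}{N_s} \sum_{i=1}^{N_s} \exp\left(-\frac{\|X_i - \bar{X}\|^2}{2 V_x}\right)\right]
$$

$$
\bar{X} = \frac{1}{N_s} \sum_{i=1}^{N_s} X_i, \qquad V_x = \frac{1}{N_s} \sum_{i=1}^{N_s} \|X_i - \bar{X}\|^2
$$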

Training Set Alignment - Algorithm

  • Maintain an affine transform matrix $U_i$ for each sample, initially set to the identity
  • Compute the deformable-energy-based criterion $E$
  • Repeat until convergence:
      • For each of the six unit affine matrices [14], $A_j$, $j = 1, \ldots, 6$:
          • Let $U_i^{new} = A_j U_i$
          • Apply $U_i^{new}$ to the sample and recalculate the criterion $E$
          • If $E$ has been reduced, accept $U_i^{new}$; otherwise let $U_i^{new} = A_j^{-1} U_i$ and apply again
          • If $E$ has been reduced, accept $U_i^{new}$; otherwise revert to $U_i$
  • End

Shape Models

By modeling the distribution of aligned vectors, new examples can be generated that are similar to those in the training set

As in the Active Shape Model, principal component analysis (PCA) is applied to the data (Statistical Models of Appearance for Computer Vision, draft report, 2000)

Shape Model

  • Formally, the covariance of the data is calculated as

$$
S = \frac{1}{s-1} \sum_{i=1}^{s} (X_i - \bar{X})(X_i - \bar{X})^T
$$

  • Then the eigenvectors $\phi_i$ and corresponding eigenvalues $\lambda_i$ of $S$ are computed and sorted so that $\lambda_i \ge \lambda_{i+1}$
  • The training set is approximated by

$$
X \approx \bar{X} + \Phi b
$$

      • $\Phi = (\phi_1 | \phi_2 | \ldots | \phi_t)$ represents the $t$ eigenvectors corresponding to the largest eigenvalues
      • $b$ is a $v_t$-dimensional vector given by $b = \Phi^T (X - \bar{X})$
  • By varying the elements of $b$, new handwriting trajectories can be generated from this model
      • apply limits of $\pm 3\sqrt{\lambda_i}$ to the elements $b_i$
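Generating a new shape from such a model, with each $b_i$ limited to $\pm 3\sqrt{\lambda_i}$, can be sketched as below. The mean vector, eigenvectors, and eigenvalues here are small made-up values standing in for a trained model, and the function names are hypothetical.

```python
import math

def generate_shape(mean, eigvecs, eigvals, b):
    """Return X = mean + Phi b after clamping each b_i to +/- 3*sqrt(lambda_i)."""
    clamped = [max(-3 * math.sqrt(lam), min(3 * math.sqrt(lam), bi))
               for bi, lam in zip(b, eigvals)]
    shape = list(mean)
    for bi, phi in zip(clamped, eigvecs):
        for k in range(len(shape)):
            shape[k] += bi * phi[k]
    return shape, clamped

def project(mean, eigvecs, x):
    """Return b = Phi^T (X - mean) for orthonormal eigenvectors."""
    return [sum(p * (xk - mk) for p, xk, mk in zip(phi, x, mean))
            for phi in eigvecs]

# Toy model (assumption): 4-dimensional shape vectors, two retained modes
MEAN = [0.0, 0.0, 1.0, 1.0]
EIGVECS = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.6, 0.8, 0.0]]  # orthonormal rows
EIGVALS = [4.0, 0.25]

shape, b_used = generate_shape(MEAN, EIGVECS, EIGVALS, [10.0, 0.5])
b_back = project(MEAN, EIGVECS, shape)
```

The first coefficient is outside its limit and is pulled back to the boundary, so the generated shape stays within the variation seen in training.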

Outline

  • Sample collection and segmentation
  • Learning strategies
  • Synthesis strategies
  • Experimental results
  • Discussion and conclusion

Synthesis Strategies

  • Generate each individual letter in the word
  • Then the baselines of these letters are aligned and the letters are juxtaposed in a sequence
  • Concatenate letters with their neighbors to form cursive handwriting; this cannot be easily achieved
  • To solve this problem, a conditional sampling algorithm based on the delta log-normal model is proposed

Individual Letter Synthesis

Delta Log-normal Model

  • A powerful tool for analyzing rapid human movements
  • With respect to handwriting generation, the movement of a simple stroke is controlled by velocity
  • The magnitude of the velocity is described as (Why Handwriting Segmentation Can Be Misleading?, 13th International Conference on Pattern Recognition, 1996)

$$
v(t) = D_1 \Lambda(t; t_0, \mu_1, \sigma_1^2) - D_2 \Lambda(t; t_0, \mu_2, \sigma_2^2)
$$

  • $\Lambda$ is a log-normal function (on a logarithmic scale axis):

$$
\Lambda(t; t_0, \mu, \sigma^2) = \frac{1}{\sigma \sqrt{2\pi}\,(t - t_0)} \exp\left\{ -\frac{[\ln(t - t_0) - \mu]^2}{2\sigma^2} \right\}, \quad t_0 < t
$$

  • $t_0$: activation time
  • $D_i$: amplitude of the impulse commands
  • $\mu_i$: mean time delay
  • $\sigma_i$: response time of the agonist and antagonist systems
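The velocity profile $v(t) = D_1 \Lambda(t; t_0, \mu_1, \sigma_1^2) - D_2 \Lambda(t; t_0, \mu_2, \sigma_2^2)$ is easy to evaluate numerically. The sketch below uses illustrative parameter values that are not taken from the slides; since each log-normal term integrates to 1 over time, the area under $v(t)$ approaches $D_1 - D_2$.

```python
import math

def lognormal(t, t0, mu, sigma):
    """Log-normal impulse response; defined as zero at or before the activation time t0."""
    if t <= t0:
        return 0.0
    d = t - t0
    return (math.exp(-(math.log(d) - mu) ** 2 / (2 * sigma ** 2))
            / (sigma * math.sqrt(2 * math.pi) * d))

def delta_lognormal(t, t0, d1, mu1, s1, d2, mu2, s2):
    """Velocity magnitude: agonist term minus antagonist term."""
    return d1 * lognormal(t, t0, mu1, s1) - d2 * lognormal(t, t0, mu2, s2)

# Illustrative parameters (assumptions): t0, D1, mu1, sigma1, D2, mu2, sigma2
PARAMS = (0.0, 1.0, -1.6, 0.3, 0.2, -1.2, 0.3)

dt = 0.001
profile = [delta_lognormal(k * dt, *PARAMS) for k in range(3001)]
travelled = sum(profile) * dt  # numerical integral of v(t) over [0, 3]
```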

Delta Log-normal Model

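  • The direction of the movement can be expressed as

$$
\theta(t) = \theta_0 + c_0 \int_{t_0}^{t} v(u)\, du
$$

      • $\theta_0$: initial direction
      • $c_0$: constant
  • The angular velocity is calculated as the derivative of $\theta(t)$:

$$
v_\theta(t) = c_0\, v(t)
$$

  • Given $v_\theta(t)$, the curvature along a stroke piece is calculated as

$$
c = \lim_{\Delta s \to 0} \frac{\Delta\theta}{\Delta s} = \lim_{\Delta t \to 0} \frac{v_\theta(t)\,\Delta t}{v(t)\,\Delta t} = c_0
$$

  • The static shape of the piece is an arc, characterized by $S = \langle \theta, c, D \rangle$, with $\theta = \theta_0$, $c = c_0$, $D = D_1 - D_2$ (arc length)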
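Since a piece is a circular arc described by an initial direction, a constant curvature, and an arc length, its points can be generated in closed form. The following is a sketch under that reading; the function names, the start point, and the number of samples `n` are hypothetical.

```python
import math

def arc_points(theta0, c, D, start=(0.0, 0.0), n=20):
    """Sample n+1 points on an arc with initial direction theta0,
    constant curvature c, and arc length D."""
    pts = []
    for k in range(n + 1):
        s = D * k / n  # arc length travelled so far
        if abs(c) < 1e-12:
            # Zero curvature: the piece is a straight line
            x = start[0] + s * math.cos(theta0)
            y = start[1] + s * math.sin(theta0)
        else:
            # Integrate (cos, sin) of the direction theta0 + c*s along the arc
            x = start[0] + (math.sin(theta0 + c * s) - math.sin(theta0)) / c
            y = start[1] - (math.cos(theta0 + c * s) - math.cos(theta0)) / c
        pts.append((x, y))
    return pts

def end_direction(theta0, c, D):
    """Direction of the tangent at the end of the piece."""
    return theta0 + c * D

quarter_circle = arc_points(0.0, 1.0, math.pi / 2)
straight = arc_points(0.0, 0.0, 2.0)
```

With unit curvature and arc length $\pi/2$ the piece is a quarter of a unit circle, which makes a convenient sanity check.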

Delta Log-normal Model - Example [Why Handwriting Segmentation Can Be Misleading?, IEEE ICPR 1996]

Conditional Sampling

  • First, the trajectories of the synthesized handwriting letters are decomposed into static pieces
  • The first piece of a trajectory is called the head piece, and the last piece is called the tail piece
  • In the concatenation process, the trajectories of letters are deformed to produce natural cursive handwriting by changing the parameters of the head and tail pieces, $S_h = \langle \theta_h, c_h, D_h \rangle$ and $S_t = \langle \theta_t, c_t, D_t \rangle$, from $S_h$, $S_t$ to $S_h^*$, $S_t^*$

Conditional Sampling

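  • A deformation energy of a stroke is defined as

$$
E_d^{h/t} = (\theta_{h/t}^* - \theta_{h/t})^2 + (c_{h/t}^* - c_{h/t})^2 + (D_{h/t}^* - D_{h/t})^2
$$

  • A concatenation energy between the $i$-th letter and the $(i+1)$-th letter is defined as

$$
E_c(i, i+1) = \beta_1 \left[E_d^t(i) + E_d^h(i+1)\right] + \beta_2 \left[\theta_t^*(i) + c_t^*(i)\, D_t^*(i) - \theta_h^*(i+1)\right]^2 + \beta_3 \left[\|p_t(i) - p_h(i+1)\|^2\right]
$$

where $\beta_1, \beta_2, \beta_3$ are weights.

  • By minimizing the second and third terms, the two letters are forced to connect with each other smoothly and naturally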

Conditional Sampling

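  • The concatenation energy of a whole word is calculated as

$$
E_c = \sum_{i=1}^{N_l - 1} E_c(i, i+1)
$$

  • We must ensure that the deformed letters are consistent with the models
  • The sampling energy is calculated as

$$
E_s = \sum_{i=1}^{v_t} f\left(\frac{b_i^*}{3\sqrt{\lambda_i}}\right), \qquad
f(x) = \begin{cases} 0 & |x| \le 1 \\ x^2 & |x| > 1 \end{cases}
$$

  • The whole energy formulation is finally given as

$$
E = \gamma_c E_c + \gamma_s \sum_{i=1}^{N_l} E_s(i)
$$

where $\gamma_c$ and $\gamma_s$ are weights.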
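The sampling energy penalizes only model parameters that leave the $\pm 3\sqrt{\lambda_i}$ range used by the shape model. A minimal sketch, assuming the penalty is $f(x) = 0$ for $|x| \le 1$ and $x^2$ otherwise, applied to $b_i^* / (3\sqrt{\lambda_i})$; the example values are made up.

```python
import math

def f(x):
    """Zero inside the allowed range, quadratic penalty outside it."""
    return 0.0 if abs(x) <= 1 else x * x

def sampling_energy(b_star, eigvals):
    """Sum of penalties over the shape-model parameters of one letter."""
    return sum(f(b / (3 * math.sqrt(lam))) for b, lam in zip(b_star, eigvals))

inside = sampling_energy([3.0, 0.0], [1.0, 4.0])   # both parameters within limits
outside = sampling_energy([6.0, 0.0], [1.0, 4.0])  # first parameter at twice its limit
```

Deformations that keep a letter inside the learned variation are free, while larger ones are discouraged quadratically.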

Synthesis-Iterative Approach

  1. Randomly generate a vector $b(i)$ for each letter initially
  2. Generate trajectories $S_i$ of the letters and calculate an affine transform $T_i$ for each letter (transforming it to its desired position)
  3. For each pair of adjacent letters $\{S_i, S_{i+1}\}$, deform the pieces in these letters to minimize the concatenation energy $E_c(i, i+1)$
  4. Project the deformed shape into the model coordinate frame
  5. Update the model parameters
  6. If not converged, return to step 2

Experimental Results

Discussion & Conclusion

  • Performance is limited by the samples used for training, since the shape models can only generate novel shapes within the variation of the training samples

  • Although some experimental results are shown, it is still not known how to evaluate the synthesized scripts objectively or how to compare different synthesis approaches

Markov chains

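  • A Markov chain on a space $X$ with transitions $T$ is a random process (an infinite sequence of random variables) $(x^{(0)}, x^{(1)}, \ldots, x^{(t)}, \ldots)$ that satisfies

$$
p(x^{(t)} \mid x^{(t-1)}, \ldots, x^{(1)}) = T(x^{(t-1)}, x^{(t)})
$$

  • That is, the probability of being in a particular state at time $t$, given the state history, depends only on the state at time $t-1$
  • If the transition probabilities are fixed for all $t$, the chain is considered homogeneous

Example with three states $x_1$, $x_2$, $x_3$:

$$
T = \begin{pmatrix} 0.7 & 0.3 & 0 \\ 0.3 & 0.4 & 0.3 \\ 0 & 0.3 & 0.7 \end{pmatrix}
$$

[Figure: state diagram over $x_1$, $x_2$, $x_3$ labeled with the transition probabilities of $T$]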

Stationary distribution

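Consider the Markov chain given above:

$$
T = \begin{pmatrix} 0.7 & 0.3 & 0 \\ 0.3 & 0.4 & 0.3 \\ 0 & 0.3 & 0.7 \end{pmatrix}
$$

[Figure: state diagram over $x_1$, $x_2$, $x_3$ labeled with the transition probabilities of $T$]

The stationary distribution is

$$
\begin{pmatrix} 0.33 & 0.33 & 0.33 \end{pmatrix}
\begin{pmatrix} 0.7 & 0.3 & 0 \\ 0.3 & 0.4 & 0.3 \\ 0 & 0.3 & 0.7 \end{pmatrix}
= \begin{pmatrix} 0.33 & 0.33 & 0.33 \end{pmatrix}
$$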
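The stationary distribution of this three-state chain can be checked numerically by repeatedly multiplying a starting distribution by $T$. This is a small verification sketch; the starting distribution and the iteration count are arbitrary choices.

```python
T = [[0.7, 0.3, 0.0],
     [0.3, 0.4, 0.3],
     [0.0, 0.3, 0.7]]

def step(pi, T):
    """One step of the chain: returns the row vector pi multiplied by T."""
    return [sum(pi[i] * T[i][j] for i in range(len(T))) for j in range(len(T))]

def stationary(T, iters=1000):
    """Iterate pi <- pi T starting from a point mass on the first state."""
    pi = [1.0] + [0.0] * (len(T) - 1)
    for _ in range(iters):
        pi = step(pi, T)
    return pi

pi = stationary(T)
pi_next = step(pi, T)  # should equal pi again, up to rounding
```

The chain forgets its starting state: any initial distribution converges to the same uniform vector, each entry being about 0.33.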