Online Graphics Recognition: State-of-the-Art

Online Graphics Recognition: State-of-the-Art Liu Wenyin Dept of Computer Science, City University of Hong Kong [email protected] http://www.cs.cityu.edu.hk/~liuwy/

Transcript of Online Graphics Recognition: State-of-the-Art

Page 1: Online Graphics Recognition: State-of-the-Art

Online Graphics Recognition: State-of-the-Art

Liu Wenyin, Dept of Computer Science, City University of Hong Kong

[email protected] http://www.cs.cityu.edu.hk/~liuwy/

Page 2: Online Graphics Recognition: State-of-the-Art

Outline

Introduction: Problem and Motivation
Primitive Shape Recognition
Composite Graphic Object Recognition
Document Recognition and Understanding
Performance Evaluation and User Study
Summary
Open problems

Page 3: Online Graphics Recognition: State-of-the-Art

The Problem

Input graphic objects by freehand sketching on a tablet

Instant and continuous recognition as strokes are added
  determine type/class & parameters (& semantics)
  (optionally) beautify appearance
  find and correct errors earlier, thanks to immediate feedback

Similar problem: handwriting

Page 4: Online Graphics Recognition: State-of-the-Art

Motivation

Driven by
  Pen-based interaction devices
  Ubiquitous computing
  Natural, humanistic, & convenient UI

Tablet PCs are heating up the market for handwriting software
  e.g., Microsoft and IBM are developing their own handwriting software

Page 5: Online Graphics Recognition: State-of-the-Art

Motivation: Why Sketch-based?

Current UI for Graphics Input
  Mouse clicks on menus & buttons

Disadvantages of Mouse and Menus
  Inconvenient: too many clicks; hard to remember/find
  Unnatural: interruptive, unsuitable for quick tasks/ideas
  Unsuitable for small screen devices

Graphics Input UI for Creative Design by Sketching
  Sketching with a pen is a mode of informal, perceptual, direct interaction: especially good for creative design tasks
  Continuous, non-interruptive, natural ways for interaction
  Productivity, instant feedback for quick correction/position

Page 6: Online Graphics Recognition: State-of-the-Art

Sketch-based UI: Design Principles

Humanistic
  Works the way humans do: informal, fast/instant feedback, ambiguity, creativity...
  Machine adapts to human, and not vice versa

Efficient and Intelligent
  Guess the user’s intention and act accordingly
  Supported by graphics recognition: sketchy to regular

Personalized
  Learn the user’s drawing styles, habits, and preferences from user actions (as virtual feedback)

Page 7: Online Graphics Recognition: State-of-the-Art

Graphic Objects

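[Diagram: hierarchy of graphic objects. Nodes: Stroke, Primitive Shape, Composite Graphic Object, Text, Visual Token, Graphic Document. Links are labeled with the multiplicities 1..m, 2..m, and m.]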

Page 8: Online Graphics Recognition: State-of-the-Art

Graphic Objects

Stroke: trajectory of a pen movement while it touches the tablet
  minimal unit of user input, represented by a chain of points

Primitive Shape: simple shape, closed/open, single/multi-stroke
  limited # of classes: e.g., triangles, ellipses, or line/arc segments

Composite Graphic Object: consists of 2+ primitive shapes
  assumption: the components of one object are input consecutively & adjacently

Visual Token: component of a graphic document
  just like words in a text document
  composite graphic objects or free/single primitive shapes

Graphic Document: complete document for a purpose
  composed of visual tokens
  semantics: defined by the tokens, their parameters & spatial relations

Page 9: Online Graphics Recognition: State-of-the-Art

Recognition Tasks/Stages

1. Primitive shape (or stroke) recognition
  simultaneous recognition
  immediate recognition

2. Composite graphic object recognition
  simultaneous recognition
  immediate recognition

3. Document recognition and understanding
  simultaneous recognition
  immediate recognition

Page 10: Online Graphics Recognition: State-of-the-Art

Primitive/Stroke Recognition

Determine type & parameters (pos, size, & orientation)
Regularize/beautify to the most common form (params)
  because that is what most users intend

In most cases, immediate recognition:
  recognize a shape after it is completely input

Single or multiple strokes
  link multiple strokes first

Gestures: visual commands (e.g., undo/re-do, remove)
  represented by primitive shapes, so they need this level of recognition

Page 11: Online Graphics Recognition: State-of-the-Art

Examples of Editing Gestures

SILK (Landay and Myers, IEEE Computer 2001)

Different groups are developing different sets of gestures
Standardization is necessary

Page 12: Online Graphics Recognition: State-of-the-Art

Simultaneous Recognition

Fluid Sketches (Arvo and Novins, UIST2000)

while a freehand stroke is being input
  guess/suggest what the user is intending to draw
  simultaneous or immediate feedback

Page 13: Online Graphics Recognition: State-of-the-Art

Fluid Sketches (Arvo & Novins)

Page 14: Online Graphics Recognition: State-of-the-Art

Primitive Recognition: 4 Stages

Stroke curve pre-processing

Shape classification

Shape fitting

Shape regularization/beautification

Page 15: Online Graphics Recognition: State-of-the-Art

Stroke Pre-Processing: Problem

Input: a freehand stroke

Output: a refined polyline

Requirement: the output is similar to the input freehand stroke but with some necessary refinement (noise reduction)
Link multiple strokes or segment a single stroke, if necessary

Page 16: Online Graphics Recognition: State-of-the-Art

Pre-Processing

[Figure: polygonal approximation of a stroke with ε = 1.0 pixel and with ε = 5.0 pixels]

[Figure: end-point artifacts (hooklet and circlet); a sketchy line before and after processing, after pulling the end points, and after deleting extra points]
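The polygonal approximation step can be illustrated with a short sketch. The slides only give the tolerance ε, not the algorithm, so the Douglas-Peucker method below is an assumed stand-in, and the names `approximate_polyline` and `point_segment_distance` are hypothetical.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def approximate_polyline(points, epsilon):
    """Douglas-Peucker (assumed method): drop points closer than epsilon to the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= epsilon:
        return [first, last]
    # Keep the farthest point and refine both halves recursively.
    left = approximate_polyline(points[: index + 1], epsilon)
    right = approximate_polyline(points[index:], epsilon)
    return left[:-1] + right

# A shaky L-shaped stroke (illustrative data, in pixels).
stroke = [(0, 0), (10, 0.8), (20, -0.6), (30, 0.4), (40, 0),
          (40.5, 10), (39.6, 20), (40, 30)]
refined = approximate_polyline(stroke, epsilon=1.0)
```

With ε = 1.0 the jitter is absorbed and only the two end points and the corner remain; a smaller ε keeps more of the original points.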

Page 17: Online Graphics Recognition: State-of-the-Art

Shape Classification: Problem

Input: the refined polyline

Output: the type id of a basic shape class: e.g., line, triangle, quadrangle, pentagon, hexagon, ellipse, or free curve

Requirement: correctness: the output type matches the user’s intent

Page 18: Online Graphics Recognition: State-of-the-Art

Shape Classification

Based on features
  extracted from the stroke’s vector polyline (or image)
  represent the stroke

Many pattern recognition methods can be used:
  Rule-Based Approaches
  Neural-Network-Based Approaches
  SVM-Based Approaches
    e.g., Ernesto Tapia and Raul Rojas (ICDAR 2003)
  etc.

Page 19: Online Graphics Recognition: State-of-the-Art

Features Used in Recognition

Corners can be found by
  speed (Davis 2002; Calhoun et al. 2002)
  curvature

Turning angle functions (Arkin et al. 1991)
Attraction force model (Jin et al. PG2002)
Stroke order and direction
  especially for composite objects
Domain-specific or independent knowledge

Page 20: Online Graphics Recognition: State-of-the-Art

Corners: Speed & Curvature

[Figures from Davis (2002) and Calhoun et al. (2002)]
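A rough sketch of corner detection from the two cues named here, pen speed and curvature. It is not the method of Davis or Calhoun et al.: requiring both cues at once, the thresholds, and the function names are assumptions, and the stroke is assumed to arrive as points with timestamps.

```python
import math

def turning_angles(points):
    """Absolute change of direction (radians) at each interior point."""
    angles = []
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        a_in = math.atan2(y1 - y0, x1 - x0)
        a_out = math.atan2(y2 - y1, x2 - x1)
        # Wrap the difference into [-pi, pi) before taking its magnitude.
        diff = (a_out - a_in + math.pi) % (2 * math.pi) - math.pi
        angles.append(abs(diff))
    return angles

def pen_speeds(points, times):
    """Average pen speed over the two segments meeting at each interior point."""
    speeds = []
    for i in range(1, len(points) - 1):
        dist = math.dist(points[i - 1], points[i]) + math.dist(points[i], points[i + 1])
        speeds.append(dist / (times[i + 1] - times[i - 1]))
    return speeds

def find_corners(points, times, angle_threshold=math.radians(45), speed_ratio=0.5):
    """Indices where the pen turns sharply AND slows down (assumed combination)."""
    angles = turning_angles(points)
    speeds = pen_speeds(points, times)
    mean_speed = sum(speeds) / len(speeds)
    return [
        i
        for i, (angle, speed) in enumerate(zip(angles, speeds), start=1)
        if angle > angle_threshold and speed < speed_ratio * mean_speed
    ]

# Illustrative L-shaped stroke: the pen slows down around the corner at index 3.
points = [(0, 0), (10, 0), (20, 0), (30, 0), (30, 10), (30, 20), (30, 30)]
times = [0.0, 0.1, 0.2, 0.6, 1.0, 1.1, 1.2]
corners = find_corners(points, times)
```

Speed is useful because users tend to slow down at intended corners, which helps separate them from accidental wobbles that curvature alone would flag.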

Page 21: Online Graphics Recognition: State-of-the-Art

Turning Angle Functions

Arkin et al. (IEEE T-PAMI 1991)

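[Figure: turning function T(s) of a polygon, plotted against the normalized arc length s from 0 to 1; the angle starts at v and reaches v + 2π.]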
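As a minimal sketch of the turning function idea: for a closed polygon, T(s) is the cumulative tangent angle as a function of normalized arc length s, starting at the direction v of the first edge. The function name and the step-list representation are assumptions made for illustration.

```python
import math

def turning_function(polygon):
    """Return [(s, theta)]: from normalized arc length s on, the tangent angle is theta."""
    n = len(polygon)
    edges = [
        (polygon[(i + 1) % n][0] - polygon[i][0],
         polygon[(i + 1) % n][1] - polygon[i][1])
        for i in range(n)
    ]
    lengths = [math.hypot(dx, dy) for dx, dy in edges]
    perimeter = sum(lengths)
    theta = math.atan2(edges[0][1], edges[0][0])  # initial direction v
    steps, s = [(0.0, theta)], 0.0
    for i in range(1, n):
        prev_angle = math.atan2(edges[i - 1][1], edges[i - 1][0])
        cur_angle = math.atan2(edges[i][1], edges[i][0])
        # Signed turn at the vertex, wrapped into [-pi, pi).
        turn = (cur_angle - prev_angle + math.pi) % (2 * math.pi) - math.pi
        theta += turn
        s += lengths[i - 1] / perimeter
        steps.append((s, theta))
    return steps

# Counterclockwise unit square: four quarter turns.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
steps = turning_function(square)
```

For a convex polygon traversed counterclockwise, the final turn back onto the first edge brings the total to v + 2π. Two shapes can then be compared by a distance between their step functions.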

Page 22: Online Graphics Recognition: State-of-the-Art

Attraction Force Model

Jin, Liu, Sun & Sun (2002)
  Inner angle of the attracted point
  Inner angle of the attracting point
  Distance between the two points

[Figure: panels (a) to (d) showing configurations of points A, B, and C]

$f(A, B) = \dfrac{(\alpha, \beta)}{Dis^{2}(A, B)}$

Page 23: Online Graphics Recognition: State-of-the-Art

Decision Making

Rule-based: # of corners (or vertices)

Construction of Classifiers
  SVM (can be used for incremental learning)
    One-against-one structure: n(n-1)/2 classifiers
    One-against-all structure: n classifiers
    Max-win scheme
    Training with samples
  Neural Network
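The one-against-one structure with max-win voting can be sketched as follows. The pairwise deciders here are toy stand-ins for trained binary SVMs: they compare a vertex count against assumed prototype values, and all names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Assumed prototype vertex counts, standing in for learned decision functions.
PROTOTYPES = {"line": 2, "triangle": 3, "quadrangle": 4, "pentagon": 5}

def make_pairwise(a, b):
    """Toy binary classifier for classes a and b (a trained SVM would go here)."""
    def decide(vertex_count):
        closer_to_a = abs(vertex_count - PROTOTYPES[a]) <= abs(vertex_count - PROTOTYPES[b])
        return a if closer_to_a else b
    return decide

def max_win_predict(classes, pairwise_classifiers, features):
    """Each pairwise classifier casts one vote; the class with the most wins is chosen."""
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[pairwise_classifiers[(a, b)](features)] += 1
    # Ties are broken in favor of the class listed first.
    return max(classes, key=lambda c: votes[c])

classes = list(PROTOTYPES)
classifiers = {(a, b): make_pairwise(a, b) for a, b in combinations(classes, 2)}
predicted = max_win_predict(classes, classifiers, 4)
```

With n = 4 classes this builds n(n-1)/2 = 6 pairwise classifiers, whereas a one-against-all structure would need only 4, each trained against all remaining classes.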

Page 24: Online Graphics Recognition: State-of-the-Art

Shape Fitting: Problem

Input: the type id and the stroke (original and refined polyline)

Output: the fitted shape (characterized by parameters)

Requirement: the output has the lowest average distance to the input stroke
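A small sketch of the fitting requirement for the simplest case, a straight line. It estimates a center point and an axis orientation from the second moments of the stroke points, then reports the average distance of the points to the fitted line. This minimizes squared rather than average distance, which is a common stand-in; the function names are assumptions.

```python
import math

def fit_line(points):
    """Return (center, theta): centroid and principal-axis orientation in radians."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    sxx = sum((x - cx) ** 2 for x, _ in points)
    syy = sum((y - cy) ** 2 for _, y in points)
    sxy = sum((x - cx) * (y - cy) for x, y in points)
    # Orientation of the axis along which the points spread the most.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return (cx, cy), theta

def mean_distance_to_line(points, center, theta):
    """Average perpendicular distance from the points to the fitted line."""
    cx, cy = center
    nx, ny = -math.sin(theta), math.cos(theta)  # unit normal of the line
    return sum(abs((x - cx) * nx + (y - cy) * ny) for x, y in points) / len(points)

# Illustrative jittery horizontal stroke.
stroke = [(0, 0.1), (1, -0.1), (2, 0.1), (3, -0.1), (4, 0.1)]
center, theta = fit_line(stroke)
error = mean_distance_to_line(stroke, center, theta)
```

The same center-plus-orientation estimate is a natural starting point for ellipse fitting, where the axis lengths would then be chosen to keep the stroke points close to the outline.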

Page 25: Online Graphics Recognition: State-of-the-Art

Shape Fitting

[Figure: polygonal fitting, panels (a) to (d)]

[Figure: ellipse fitting, panels (a) and (b), showing the center point and the axis orientation in the x-y plane]

Page 26: Online Graphics Recognition: State-of-the-Art

Shape Regularization: Problem

Input: the fitted shape

Output: the regularized shape (characterized by parameters)

Requirement: the output is similar to the original freehand stroke but also appears in its most beautiful form, e.g., conforming as much as possible to the connectedness, perpendicularity, congruence, and symmetry intended by the user.

Also referred to as beautification (Igarashi et al. 1997)

Page 27: Online Graphics Recognition: State-of-the-Art

Shape Regularization

Inner-Shape Regularization

Inter-Shape Regularization

Page 28: Online Graphics Recognition: State-of-the-Art

Inner-Shape Regularization

Equilateral Rectification
  Edges and axes

Parallelism Rectification
  Edges

Special Angle Rectification
  90, 30, 45, 60, 120, etc.

Horizontal/Vertical Rectification
  Edges, axes, diagonals
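Special angle rectification can be sketched as snapping a measured angle to the nearest special value when it is close enough. The tolerance and the extra values 0, 135, 150, and 180 are assumptions added for illustration; the slide lists only 90, 30, 45, 60, 120, etc.

```python
def snap_angle(angle_deg, special=(0, 30, 45, 60, 90, 120, 135, 150, 180), tolerance=7.0):
    """Snap an angle in [0, 180] degrees to the closest special angle within tolerance."""
    nearest = min(special, key=lambda s: abs(angle_deg - s))
    if abs(angle_deg - nearest) <= tolerance:
        return float(nearest)
    # Too far from any special value: assume the user meant the drawn angle.
    return angle_deg

snapped = [snap_angle(a) for a in (87.0, 48.0, 75.0)]
```

The same rule covers horizontal/vertical rectification when it is applied to the orientation of an edge, axis, or diagonal with 0 and 90 as the only special values. A tolerance that is too large overrides what the user drew, so in practice it would have to be tuned or learned per user.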

Page 29: Online Graphics Recognition: State-of-the-Art

Inner-Shape Rectification Rules

[Diagram: a fitted shape is rectified toward more regular forms. Shapes shown: circle, ellipse, triangle, equilateral triangle, isosceles triangle, right triangle, quadrangle, parallelogram, trapezoid, diamond, rectangle, square.]

Page 30: Online Graphics Recognition: State-of-the-Art

Inter-Shape Rectification

Affected by neighbors in a document

Size Rectification

Position/orientation Rectification
  Alignment
  Intersection
  Tangency
  Concentric
  …

Page 31: Online Graphics Recognition: State-of-the-Art

Composite Object Recognition: Problem

After recognizing the current stroke
  Combine the current shape with previous ones
  Based on their sequential & spatial relationship
    assumptions: consisting of 2+ primitive shapes; input components consecutively & adjacently
  Determine or predict the type & parameters of the composite object
  Regularization or beautification

Page 32: Online Graphics Recognition: State-of-the-Art

Composite Object Recognition: Approaches

Classifier-based approaches: decision tree
  Fonseca and Jorge (2000): fuzzy
  Peng, Sun, Liu, & Cong (GREC2003)

Similarity-based approaches
  Based on similarities of component and constraints
  Representation: ARG (Li 2000), RAG (Lladós 2001)
  Relational distance metric (Shapiro 1993)
  Directional shape similarity (Liu et al. 2001)
  Ernesto Tapia and Raul Rojas (GREC2003)

Page 33: Online Graphics Recognition: State-of-the-Art

Directional Composite Similarity

Principles for composite similarity metrics
  Partial (for partial match)
  Structural/topological
  Stroke-number free
  Stroke-order free

Used in our demo system (SmartSketchpad)

Page 34: Online Graphics Recognition: State-of-the-Art

Scenario for Composite Input

for partial input match:

Page 35: Online Graphics Recognition: State-of-the-Art

Document Recognition and Understanding: Problem

Analyze the connections and relationships among the elements

Obtain and represent the semantics in current (part/whole) drawing as one document

Beautify and re-display it into a neat layout

Comparison to offline document recognition
  similar to engineering drawings, but sketches are more cursive and engineering drawings more regular

Page 36: Online Graphics Recognition: State-of-the-Art

Document Recognition and Understanding: Applications

Mainly for quick design

2D diagrams:
  GUI: Landay & Myers (2001), Caetano et al. (2002)
  UML diagrams: Blostein et al. (2002)
  …

3D object input:
  Igarashi et al. (SIGGraph1999): TEDDY
  Lipson and Shpitalni (2002)
  Hsu and Lee (1994): 2.5D animations

Page 37: Online Graphics Recognition: State-of-the-Art

Sketch Input for a Dog in Animation

Fabian Di Fiore & Frank Van Reeth (2002)
A Multi-Level Sketching Tool for “Pencil&Paper” Animation

Page 38: Online Graphics Recognition: State-of-the-Art

Document Recognition and Understanding Approaches

Gross (1994, 1996): Sketch-A-Sketch
  detect & maintain spatial relationship (constraints)
    represented as binary predicates, e.g., “concentric”, “contains”, “connects”, “overlap”
    by the bounding box, size, & starting-ending points
  by-product: learn composite objects from examples

Pinto-Albuquerque et al. (2000): DocSketch
  syntax as a fuzzy relational adjacency grammar
  visual syntax analyzer
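Binary spatial predicates computed from bounding boxes can be sketched as below. This is an illustration of the idea only, not Gross's implementation: the `Box` type, the tolerance, and the exact definitions are assumptions, and “connects” is left out because it would need the stroke's starting and ending points.

```python
import math
from typing import NamedTuple

class Box(NamedTuple):
    """Axis-aligned bounding box of a recognized shape (hypothetical type)."""
    left: float
    top: float
    right: float
    bottom: float

def contains(a, b):
    """True if box a fully encloses box b."""
    return a.left <= b.left and a.top <= b.top and a.right >= b.right and a.bottom >= b.bottom

def overlap(a, b):
    """True if the two boxes share some area."""
    return a.left < b.right and b.left < a.right and a.top < b.bottom and b.top < a.bottom

def concentric(a, b, tolerance=5.0):
    """True if the box centers lie within an assumed tolerance of each other."""
    ax, ay = (a.left + a.right) / 2, (a.top + a.bottom) / 2
    bx, by = (b.left + b.right) / 2, (b.top + b.bottom) / 2
    return math.hypot(ax - bx, ay - by) <= tolerance

PREDICATES = {"concentric": concentric, "contains": contains, "overlap": overlap}

def relations(a, b):
    """Names of all predicates that hold for the ordered pair (a, b)."""
    return [name for name, holds in PREDICATES.items() if holds(a, b)]

outer = Box(0, 0, 100, 100)
inner = Box(40, 40, 60, 60)
far = Box(200, 200, 220, 220)
found = relations(outer, inner)
```

Storing the predicates that hold between every pair of elements gives a relational description of the drawing, which can be kept as constraints while the user edits it or compared against stored examples of composite objects.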

Page 39: Online Graphics Recognition: State-of-the-Art

Prototype Systems

ASSIST: Alvarado and Davis (IJCAI2001), MIT
SketchIT: Stahovich (1996)…, MIT, CMU
SILK: Landay & Myers (2001), UC Berkeley & CMU
Teddy: Igarashi et al. (SIGGraph1999), U-Tokyo
Tivoli: Pedersen et al. (CHI1993), Xerox PARC
EsQUIsE: Pierre Leclercq (GREC2003)
SmartSketchpad: Liu et al. (2001, 2002)
…

Page 40: Online Graphics Recognition: State-of-the-Art

ASSIST

A Shrewd Sketch Interpretation & Simulation Tool
MIT AI Lab
Christine Alvarado’s Master Thesis (2000)
Alvarado and Davis (IJCAI-2001)

Each new stroke triggers a three-stage process:
  Recognition: generate all possible interpretations
    primitive stroke recognition & composite object (device) recognition
  Reasoning: score each interpretation
  Resolution: select the current best consistent interpretation

Gesture recognition: arrows and pointing

Page 41: Online Graphics Recognition: State-of-the-Art

SketchIT

Conceptual Design for CAD (mechanical engineering)
  instead of precise design

Thomas F. Stahovich (1996), MIT PhD Thesis
Stahovich, Davis & Shrobe (AAAI-1997)
  QC-space for representing interaction among mechanical parts

Calhoun, Stahovich, et al. (2002)
  semantic network based recognizer for multi-strokes

Stahovich, Davis & Shrobe (AI-1998)
  generate multiple new designs from a sketch

Page 42: Online Graphics Recognition: State-of-the-Art

Experiments of Primitive Shape Recognition

Page 43: Online Graphics Recognition: State-of-the-Art

Database of Composite Objects

In this experiment, we created 97 composite graphic objects, each composed of fewer than ten primitive shapes. The weights and thresholds we used are ε = 20, w1 = 0.4, w2 = 0.3, w3 = 0.3, k1 = k2 = 0.5.

We randomly selected 10 objects (IDs 73, 65, 54, 88, 22, 5, 12, 81, 18, and 76) and drew them as queries. In most cases, the intended object appears in the smart toolbox (ranked in the first 10) after only a few components are drawn.

Page 44: Online Graphics Recognition: State-of-the-Art

User Study and Performance Evaluation

[Figure: four plots, one each for Object 73, Object 54, Object 76, and Object 5; vertical axis from 0 to 100, horizontal axis from 1 to at most 7.]

Page 45: Online Graphics Recognition: State-of-the-Art

Sketches for Evaluating Different UIs

(a) sketch1 (b) sketch2

Page 46: Online Graphics Recognition: State-of-the-Art

Drawing Time for Different Sketches Using Different UIs

          Drawing Time for Sketch 1 (s)    Drawing Time for Sketch 2 (s)
User ID   Sketch-based    Traditional      Sketch-based    Traditional
1         104             125              157             190
2         93              99               151             288
3         69              98               156             294
4         59              81               122             156
5         64              178              135             191
6         63              100              85              231
7         61              120              92              203
8         72              91               119             195
9         70              70               156             252
10        78              110              125             201
Average   73.3            107.2            129.8           220.1
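The averages in the table can be re-derived with a few lines of code. The data are copied from the table; the summary fields `time_saved` and `faster_users` are additions made here for illustration, not measures reported in the slides.

```python
from statistics import mean

# Drawing times in seconds for users 1..10, copied from the table.
DRAWING_TIMES = {
    "sketch 1": {
        "sketch-based": [104, 93, 69, 59, 64, 63, 61, 72, 70, 78],
        "traditional": [125, 99, 98, 81, 178, 100, 120, 91, 70, 110],
    },
    "sketch 2": {
        "sketch-based": [157, 151, 156, 122, 135, 85, 92, 119, 156, 125],
        "traditional": [190, 288, 294, 156, 191, 231, 203, 195, 252, 201],
    },
}

def summarize(times):
    """Per sketch: average time per UI, relative time saved, and users who were faster."""
    summary = {}
    for sketch, by_ui in times.items():
        sketch_avg = mean(by_ui["sketch-based"])
        trad_avg = mean(by_ui["traditional"])
        summary[sketch] = {
            "sketch-based": sketch_avg,
            "traditional": trad_avg,
            "time_saved": 1 - sketch_avg / trad_avg,
            "faster_users": sum(s < t for s, t in zip(by_ui["sketch-based"], by_ui["traditional"])),
        }
    return summary

summary = summarize(DRAWING_TIMES)
```

On these numbers the sketch-based UI saves roughly 32% of the drawing time for sketch 1 and 41% for sketch 2. Every user was faster with it on sketch 2, and nine of ten on sketch 1 (user 9 took 70 s with both UIs).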

Page 47: Online Graphics Recognition: State-of-the-Art

Open Problems

Complex editing gestures recognition and editing-related applications
  up to several hundred different gestures

Composite object recognition for large object set: for graphics input
  up to 10,000 master objects in MS VISIO
  user and domain adaptation

Semantic level understanding for creative/conceptual design
  reasoning & prediction of the user’s intentions

Page 48: Online Graphics Recognition: State-of-the-Art

Where to Find Papers?

2002 AAAI Spring Symposium Series: Sketch Understanding
  Chairs: Tom Stahovich, James Landay, Randy Davis
  Online Proc.: http://automatix.inesc.pt/sketch02/

ACM Annual Conference on Human Factors in Computing Systems (SIGCHI)
ACM Annual Symposiums on User Interface Software and Technology (UIST)
GREC and ICDAR
SIGGRAPH

Page 49: Online Graphics Recognition: State-of-the-Art

Summary

Brief survey of online graphics recognition
  Problems, Approaches, and Applications

Supportive for pen-based UI
  improve user productivity
  convenient for creative tasks: quick design ideas
  users unanimously prefer the sketch-based UI

Page 50: Online Graphics Recognition: State-of-the-Art

Thank You!

Contact Liu Wenyin [email protected]

See some of my research work at http://www.cs.cityu.edu.hk/~liuwy/

The survey paper can be found at http://www.cs.cityu.edu.hk/~liuwy/publications/GREC2003_LNCS.pdf