Toolkits for Supporting Gestures in Applications
Justin Weisz
05-830 UI Software
Nov. 16, 2004
What is a gesture?
“A single stroke indicates the operation (move text), the operand (the text to be moved), and additional parameters (the new location of the text).”
-- Rubine
Uses of gestures
Editing existing objects
Creating new objects
Issuing commands (e.g. Back, Reload page, Menu > Copy)
Applications of gesturing - Lightpens
Applications of gesturing - Tablets
Applications of gesturing - PDAs
Applications of gesturing - Video games
PowerGlove in “The Wizard”
Applications of gesturing - Video games
Black and White - 2001
GRANDMA: Gesture Recognizers Automated (in a) Novel Direct Manipulation Architecture
Rubine [1991]
For the line gesture:
recog = [Seq :[handler mousetool:LineCursor]
             :[[view createLine] setEndpoint:0 x:<start X> y:<start Y>] ];
manip = [recog setEndpoint:1 x:<current X> y:<current Y>];
done = nil;
Rubine [1991]
BUT, how are gestures actually represented and recognized?
Assumptions:
• Gestures are 2D, single strokes
• The start and end of a gesture are clearly defined
Representation:
$G = \{g_0, \ldots, g_{P-1}\}$: a set of P sample points
$g_p = (x_p, y_p, t_p)$: position and timestamp, preprocessed to remove jitter
Rubine [1991]
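Feature vector extracted from G:
$f = \{f_1, \ldots, f_F\}$
Example features:
• Cosine of the initial angle:
  $f_1 = \frac{x_2 - x_0}{\sqrt{(x_2 - x_0)^2 + (y_2 - y_0)^2}}$
• Length of the bounding-box diagonal:
  $f_3 = \sqrt{(x_{max} - x_{min})^2 + (y_{max} - y_{min})^2}$
• Turning angle at point p (the angle between the two segments joining three consecutive points), with $\Delta x_p = x_{p+1} - x_p$ and $\Delta y_p = y_{p+1} - y_p$:
  $\theta_p = \arctan \frac{\Delta x_p \Delta y_{p-1} - \Delta x_{p-1} \Delta y_p}{\Delta x_p \Delta x_{p-1} + \Delta y_p \Delta y_{p-1}}$
• Total angle traversed:
  $f_9 = \sum_{p=1}^{P-2} \theta_p$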
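The example features can be sketched in Python as follows. This is a minimal illustration, not Rubine's implementation: the function name `rubine_features` and the choice to return 0.0 for a degenerate initial angle are assumptions, and `math.atan2` is swapped in for a plain arctangent so the turning angle keeps its quadrant. The denominator of the turning angle is the dot product of two successive segment vectors and the numerator is their cross product.

```python
import math


def rubine_features(points):
    """Return (f1, f3, f9) for a stroke given as a list of (x, y, t) samples.

    Hypothetical helper for illustration only.
    """
    if len(points) < 3:
        raise ValueError("need at least three sample points")
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]

    # f1: cosine of the initial angle, measured from point 0 to point 2.
    dx0, dy0 = xs[2] - xs[0], ys[2] - ys[0]
    d0 = math.hypot(dx0, dy0)
    f1 = dx0 / d0 if d0 else 0.0  # assumption: 0.0 when the points coincide

    # f3: length of the bounding-box diagonal.
    f3 = math.hypot(max(xs) - min(xs), max(ys) - min(ys))

    # f9: sum of the signed turning angles theta_p for p = 1 .. P-2.
    f9 = 0.0
    for p in range(1, len(points) - 1):
        dx_p, dy_p = xs[p + 1] - xs[p], ys[p + 1] - ys[p]
        dx_q, dy_q = xs[p] - xs[p - 1], ys[p] - ys[p - 1]
        cross = dx_p * dy_q - dx_q * dy_p
        dot = dx_p * dx_q + dy_p * dy_q
        f9 += math.atan2(cross, dot)
    return f1, f3, f9
```

For an L-shaped stroke that runs right and then turns once by a quarter circle, f1 is 1.0 and the magnitude of f9 is π/2.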
Rubine [1991]
BUT...
“The aforementioned feature set was empirically determined by the author to work well on a number of different gesture sets” -- Rubine
Rubine [1991]
Classification
Each gesture class represented by a weight vector
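$w_{\hat{c}} = \{w_{\hat{c}0}, \ldots, w_{\hat{c}F}\}$
To classify gesture G:
$v_{\hat{c}} = w_{\hat{c}0} + \sum_{i=1}^{F} w_{\hat{c}i} f_i$
where $v_{\hat{c}}$ is the score of class $\hat{c}$, $w_{\hat{c}0}$ is a constant (bias) term, $w_{\hat{c}i}$ is the weight of feature i, and $f_i$ is feature i of gesture G.
Take the highest score:
$\arg\max_{\hat{c} \in C} v_{\hat{c}}$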
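The scoring rule is a plain linear classifier: each class score is a constant term plus a weighted sum of the features, and the class with the highest score wins. A minimal sketch follows; the function name `classify`, the dict layout, and the weight values in the usage note are made up for illustration and do not come from a trained recognizer.

```python
def classify(weights, features):
    """Pick the class with the highest linear score.

    weights:  dict mapping class name -> [w0, w1, ..., wF] (w0 is the constant term)
    features: list [f1, ..., fF] extracted from the gesture
    Returns (best_class, scores).
    """
    scores = {}
    for name, w in weights.items():
        if len(w) != len(features) + 1:
            raise ValueError("weight vector must have one more entry than the features")
        scores[name] = w[0] + sum(wi * fi for wi, fi in zip(w[1:], features))
    best = max(scores, key=scores.get)
    return best, scores
```

With the illustrative weights `{"line": [0.0, 1.0, 0.0], "circle": [0.5, 0.0, 1.0]}` and features `[2.0, 1.0]`, the scores are 2.0 and 1.5, so "line" is chosen.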
Rubine [1991]
Training
Optimal classifier
$w_{\hat{c}} = \{w_{\hat{c}0}, \ldots, w_{\hat{c}F}\}$
Rubine [1991]
Rejection
Gesture G → Classification i → Pr(G matches i) > 0.95? Yes: ACCEPT. No: REJECT.
(Figure labels: mean(i), g1, g2, g3)
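The slide does not say how Pr(G matches i) is computed. The sketch below assumes the estimate given in Rubine's paper, 1 / Σ_j exp(v_j − v_i), where v_i is the score of the winning class. The function name `accept` and its return shape are hypothetical.

```python
import math


def accept(scores, threshold=0.95):
    """Decide whether to accept the top-scoring class.

    scores: dict mapping class name -> linear score v_c
    Returns (best_class, estimated_probability, accepted).
    Assumption: probability estimate 1 / sum_j exp(v_j - v_best).
    """
    best = max(scores, key=scores.get)
    v_best = scores[best]
    prob = 1.0 / sum(math.exp(v - v_best) for v in scores.values())
    return best, prob, prob > threshold
```

When one score clearly dominates, the estimate approaches 1 and the gesture is accepted. When two classes score almost the same, the estimate drops toward 0.5 and the gesture is rejected as ambiguous.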
Rubine [1991]
Evaluation
Aside: Agate - Landay, Myers [1993]
gdt - Long et al. [1999]
Newton and Palm users reported:
• Gestures are powerful, efficient and convenient
• Want more commands to have gestures
• Want to define new gestures
• Recognition accuracy is not good enough
gdt - Long et al. [1999]
Oh Agate, I will make you beautiful!
gdt - Long et al. [1999]
Distance matrix
gdt - Long et al. [1999]
Classification matrix
gdt - Long et al. [1999]
Experiment - Hypotheses
• “Participants could use gdt to improve their gesture sets.”
• “The tables gdt provided would aid designers.”
• “PDA users and non-PDA users would perform differently.”
gdt - Long et al. [1999]
Experiment - Procedure
(pay no attention to the man behind the curtain...)
gdt - Long et al. [1999]
Experiment - Results
gdt - Long et al. [1999]
Experiment - Problems with gdt
“Clustering” (figure labels: new class, existing gesture classes, distance d)
Reverse direction
gdt - Long et al. [1999]
Experiment - Problems with gdt
“Sloppiness”
Gesture overloading
Delete
gdt - Long et al. [1999]
Lessons learned
• gdt was helpful, but participants averaged a 95.4% recognition rate
• Tables too confusing, didn’t help performance (better: “Gesture class A is too similar to gesture class B”)
• Should be able to create a test set of gestures and run it against a different gesture class
(Figure labels: rect, copy)
Break time!
Muchas gracias to my officemate for the suggestion. Smiling babies make people happy. BE HAPPY!
GT2k - Westeyn et al. [2003]
Problem:
Real problem: it is (still) cumbersome to design a system to perform gesture recognition
GT2k - Westeyn et al. [2003]
GT2k system components
Data generator
Sensors
Microphones
Cameras
Accelerometers
$f = \{f_1, \ldots, f_F\}$
I’m back!
Results interpreter → <action>
Aside: Hidden Markov Models
HMM: $\lambda = (\Lambda, B, \pi)$
• Transition probs.: $\Lambda = \{a_{ij}\}$, $a_{ij} = \Pr(q_{t+1} = j \mid q_t = i)$
• Symbol output probs.: $B = \{b_j(k)\}$, $b_j(k) = \Pr(o_t = v_k \mid q_t = j)$, where $v_k$ is the kth symbol in the alphabet
• Initial state dist.: $\pi = \{\pi_i\}$, $\pi_i = \Pr(q_1 = i)$
Aside: Hidden Markov Models
• Evaluation problem
  – Given HMM and O = {o1,...,oT}, compute Pr(O|HMM)
  – Forward algorithm
• Decoding problem
  – Given O, compute the most likely state sequence that produced O
  – Viterbi algorithm
• Learning problem
  – Given O, compute transition probs. to maximize the likelihood of observing O
  – Forward-Backward algorithm (aka Baum-Welch)
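The first two problems are short enough to sketch directly. The code below is an illustrative implementation for a discrete HMM stored as nested lists. It is not GT2k code, the names `forward` and `viterbi` are just conventional, and it multiplies raw probabilities, so it would underflow on long sequences (real systems use scaling or log probabilities).

```python
def forward(A, B, pi, obs):
    """Evaluation problem: return Pr(O | HMM) with the forward algorithm.

    A[i][j]: transition prob. i -> j, B[j][k]: prob. of symbol k in state j,
    pi[i]: initial state prob., obs: list of symbol indices.
    """
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    return sum(alpha)


def viterbi(A, B, pi, obs):
    """Decoding problem: return the most likely state sequence for obs."""
    n = len(pi)
    delta = [pi[i] * B[i][obs[0]] for i in range(n)]
    back = []  # back[t][j]: best predecessor of state j at step t + 1
    for o in obs[1:]:
        prev = [max(range(n), key=lambda i: delta[i] * A[i][j])
                for j in range(n)]
        delta = [delta[prev[j]] * A[prev[j]][j] * B[j][o] for j in range(n)]
        back.append(prev)
    # Trace back from the most probable final state.
    state = max(range(n), key=lambda i: delta[i])
    path = [state]
    for prev in reversed(back):
        state = prev[state]
        path.append(state)
    path.reverse()
    return path
```

A quick sanity check for `forward` is that the probabilities of all possible observation sequences of a fixed length sum to 1.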
GT2k - Westeyn et al. [2003]
Grammars
MoveForward = Advance Slow_Down Halt
MoveBackward = Reverse Slow_Down Halt
command = Attention <MoveForward | MoveBackward>
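To see which gesture sequences this grammar admits, the rules can be expanded mechanically. The sketch below encodes the three rules as a Python dict. That encoding and the `expand` helper are assumptions made for illustration and do not reflect GT2k's own grammar file format.

```python
from itertools import product

# Each rule maps a name to a list of alternatives; each alternative is a
# sequence of symbols. Symbols without a rule are primitive gestures.
GRAMMAR = {
    "MoveForward": [["Advance", "Slow_Down", "Halt"]],
    "MoveBackward": [["Reverse", "Slow_Down", "Halt"]],
    "command": [["Attention", "MoveForward"], ["Attention", "MoveBackward"]],
}


def expand(symbol, grammar=GRAMMAR):
    """Return every sequence of primitive gestures that `symbol` can produce."""
    if symbol not in grammar:
        return [[symbol]]
    results = []
    for alternative in grammar[symbol]:
        parts = [expand(s, grammar) for s in alternative]
        for combo in product(*parts):
            results.append([token for part in combo for token in part])
    return results
```

Calling `expand("command")` yields two sequences: Attention, Advance, Slow_Down, Halt and Attention, Reverse, Slow_Down, Halt.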
GT2k - Westeyn et al. [2003]
Converting raw sensor data to feature vectors
1 56 Attention
57 175 Advance
176 235 Slow_Down
236 250 Halt
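The annotation above reads as "start end label" triples over the sensor stream. A small sketch of reading such a listing follows; the function names are hypothetical, and treating both bounds as inclusive sample indices is an assumption, since the slide does not say.

```python
def parse_labels(text):
    """Parse lines of the form '<start> <end> <label>' into tuples."""
    segments = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        start, end, label = line.split()
        segments.append((int(start), int(end), label))
    return segments


def label_at(segments, index):
    """Return the label covering a sample index, or None if unlabeled.

    Assumption: start and end are both inclusive.
    """
    for start, end, label in segments:
        if start <= index <= end:
            return label
    return None
```

With the four lines from the slide, sample 100 falls in the Advance segment and sample 251 is unlabeled.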
GT2k - Westeyn et al. [2003]
Training
Training and validation procedure
(Figure labels: train, test, “overfit!”, “only during continuous recognition”)
GT2k - Westeyn et al. [2003]
Accuracy
A = accuracy
N = number of examples
S = # substitution errors (misclassification)
D = # deletion errors (failed to recognize a gesture)
I = # insertion errors (system hallucinates a gesture)
$A = \frac{N - S - D - I}{N}$
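As a worked check of the formula, here is a small helper. The function name and keyword arguments are illustrative.

```python
def accuracy(n, substitutions=0, deletions=0, insertions=0):
    """Accuracy A = (N - S - D - I) / N, returned as a fraction."""
    if n <= 0:
        raise ValueError("n must be positive")
    return (n - substitutions - deletions - insertions) / n
```

Plugging in the figures reported for two of the applications reproduces the stated results: 251 examples with 2 substitution errors give 249/251, about 99.20%, and 48 examples with 5 substitution errors give 43/48, about 89.6%. Because insertion errors are subtracted as well, A can drop below zero when the recognizer reports many spurious gestures.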
GT2k - Westeyn et al. [2003]
Applications - Gesture Panel
gesture = up | down | left | right | up-left | up-right | down-left | down-right
Result: 99.20% accuracy on 251 examples (2 substitution errors)
GT2k - Westeyn et al. [2003]
Applications - Prescott
blinkprint = person_1 | person_2 | person_3
Result: 89.6% accuracy on 48 examples (5 substitution errors, not good!)
GT2k - Westeyn et al. [2003]
Applications - TeleSign
word = my | computer | helps | me | talk
sentence = ( calibrate word word word word word exit )
Result: 90.48% accuracy on 72 examples
GT2k - Westeyn et al. [2003]
Applications - Workshop Activity Recognition
gesture = hammer | file | sand | saw | screw | vise | drill | clap | use_drawer | grind
Result: 93.33% accuracy on 10 examples per activity
GT2k - Westeyn et al. [2003]
Major conclusions
• HMMs can learn from arbitrary types of data
• Domain-specific knowledge may be needed to construct proper HMM topologies
• Shouldn’t assume that gestures are only applicable to 2D strokes with a mouse
• Wearing all that gear just to speak 5 sign language words is kind of ridiculous
BONUS SLIDES: What are the neat gesturing apps?
gestures used for:
handwriting & issuing commands
system-wide commands, interacting with UI widgets
http://www.xstroke.org/
http://www.bitart.com/
BONUS SLIDES: What are the neat gesturing apps?
gestures used for:
issuing commands
(gesturing built in)
(several gestures plugins available)
BONUS SLIDES: What are the neat gesturing apps?
• SwingGestures
  – Java Swing gesture recognizer
  – http://sourceforge.net/projects/swinggestures/
• Jestur
  – Python gesture recognizer
  – http://sourceforge.net/projects/jestur/
• Quill
  – Java gesture creation toolkit
  – http://sourceforge.net/projects/quill/
BONUS SLIDES: What are the neat gesturing apps?
• Quill