Toolkits for Supporting Gestures in Applications
Justin Weisz
05-830 UI Software
Nov. 16, 2004
What is a gesture?
“A single stroke indicates the operation (move text), the operand (the text to be moved), and additional parameters (the new location of the text).”
-- Rubine
Uses of gestures
Editing existing objects
Creating new objects
Issuing commands
Back, Reload page, Menu > Copy
Applications of gesturing - Lightpens
Applications of gesturing - Tablets
Applications of gesturing - PDAs
Applications of gesturing - Video games
PowerGlove in “The Wizard”
Applications of gesturing - Video games
Black and White - 2001
GRANDMA: Gesture Recognizers Automated in a Novel Direct Manipulation Architecture
Rubine [1991]
For the line gesture:
recog = [Seq :[handler mousetool:LineCursor] :[[view createLine] setEndpoint:0 x:<start X> y:<start Y>] ];
manip = [recog setEndpoint:1 x:<current X> y:<current Y>];
done = nil;
Rubine [1991]
BUT, how are gestures actually represented and recognized?
Assumptions:
• Gestures are 2D, single strokes
• Start and end of a gesture are clearly defined
Representation:
$G = \{g_0, \ldots, g_{P-1}\}$ (set of $P$ sample points)
$g_p = \{x_p, y_p, t_p\}$ (position & timestamp, preprocessed to remove jitter)
Rubine [1991]
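Feature vector extracted from G:
$f = \{f_1, \ldots, f_F\}$
Example features:
$f_1 = \dfrac{x_2 - x_0}{\sqrt{(x_2 - x_0)^2 + (y_2 - y_0)^2}}$ (cosine of initial angle)
$f_3 = \sqrt{(x_{max} - x_{min})^2 + (y_{max} - y_{min})^2}$ (length of bounding-box diagonal)
$\theta_p = \arctan \dfrac{\Delta x_p \Delta y_{p-1} - \Delta x_{p-1} \Delta y_p}{\Delta x_p \Delta x_{p-1} + \Delta y_p \Delta y_{p-1}}$, where $\Delta x_p = x_{p+1} - x_p$ and $\Delta y_p = y_{p+1} - y_p$ (turning angle between successive segments, i.e. the angle defined by three consecutive points)
$f_9 = \sum_{p=1}^{P-2} \theta_p$ (total angle traversed)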
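As a rough illustration of the three example features, here is a minimal Python sketch. It assumes a gesture is a list of `(x, y, t)` tuples; the function name `rubine_features` and the choice to return 0 for `f1` when points 0 and 2 coincide are assumptions made for this example, not part of Rubine's description. It uses `atan2` on the cross and dot products of successive segments so that the turning angle lands in the correct quadrant.

```python
import math

def rubine_features(points):
    """Compute three example features from a list of (x, y, t) samples.

    Returns (f1, f3, f9): cosine of the initial angle, length of the
    bounding-box diagonal, and total signed angle traversed.
    """
    if len(points) < 3:
        raise ValueError("need at least three sample points")
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]

    # f1: cosine of the initial angle, measured from point 0 to point 2.
    dx0, dy0 = xs[2] - xs[0], ys[2] - ys[0]
    d0 = math.hypot(dx0, dy0)
    f1 = dx0 / d0 if d0 > 0 else 0.0  # assumption: 0 when the points coincide

    # f3: length of the bounding-box diagonal.
    f3 = math.hypot(max(xs) - min(xs), max(ys) - min(ys))

    # f9: sum of the turning angles theta_p for p = 1 .. P-2.
    f9 = 0.0
    for p in range(1, len(points) - 1):
        dx_p, dy_p = xs[p + 1] - xs[p], ys[p + 1] - ys[p]   # segment p
        dx_q, dy_q = xs[p] - xs[p - 1], ys[p] - ys[p - 1]   # segment p-1
        cross = dx_p * dy_q - dx_q * dy_p
        dot = dx_p * dx_q + dy_p * dy_q
        f9 += math.atan2(cross, dot)
    return f1, f3, f9
```

For an L-shaped stroke that runs right and then turns through a single right angle, `f9` comes out as a quarter turn in magnitude; its sign depends on the direction of the turn and on whether the y axis points up or down.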
Rubine [1991]
BUT...
“The aforementioned feature set was empirically determined by the author to work well on a number of different gesture sets” -- Rubine
Rubine [1991]
Classification
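Each gesture class $\hat{c}$ is represented by a weight vector
$w_{\hat{c}} = \{w_{\hat{c}0}, \ldots, w_{\hat{c}F}\}$
To classify gesture G:
$v_{\hat{c}} = w_{\hat{c}0} + \sum_{i=1}^{F} w_{\hat{c}i} f_i$
where $v_{\hat{c}}$ is the score of class $\hat{c}$, $w_{\hat{c}0}$ is the bias (constant term), $w_{\hat{c}i}$ is the weight of feature i, and $f_i$ is feature i of gesture G
Take the highest score:
$\arg\max_{\hat{c} \in C} v_{\hat{c}}$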
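The scoring rule is a plain linear discriminant per class, which can be sketched in a few lines of Python. The function name `classify`, the dictionary layout, and the two example classes with their weights are all hypothetical, chosen only to make the arithmetic visible.

```python
def classify(features, weights):
    """Return (best_class, scores) for a feature vector.

    `weights` maps each class name to [w_c0, w_c1, ..., w_cF]; the score
    is v_c = w_c0 + sum_i w_ci * f_i, and the highest score wins.
    """
    scores = {}
    for cls, w in weights.items():
        if len(w) != len(features) + 1:
            raise ValueError(f"class {cls!r}: expected {len(features) + 1} weights")
        scores[cls] = w[0] + sum(wi * fi for wi, fi in zip(w[1:], features))
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical weights for two made-up classes and F = 2 features.
example_weights = {
    "line":   [0.5, 2.0, -1.0],
    "circle": [-0.5, 0.0, 1.5],
}
best, scores = classify([1.0, 0.2], example_weights)
```

Classification is cheap: one dot product per class, so the cost grows linearly with the number of classes and features.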
Rubine [1991]
Training
Optimal classifier
$w_{\hat{c}} = \{w_{\hat{c}0}, \ldots, w_{\hat{c}F}\}$
Rubine [1991]
Rejection
Gesture G → classification i → Pr(G matches i) > 0.95? ACCEPT, otherwise REJECT
[Figure labels: mean(i), g1, g2, g3]
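A possible sketch of the accept/reject step in Python follows. The slide gives only the 0.95 threshold; the probability estimate used here, 1 / Σ_j exp(v_j − v_i) over the per-class scores, follows the form in Rubine's paper and should be read as an assumption with respect to these slides. The function names are hypothetical.

```python
import math

def match_probability(scores, chosen):
    """Estimate Pr(G matches chosen) as 1 / sum_j exp(v_j - v_chosen).

    `scores` maps class names to linear-classifier scores v_j. `chosen`
    is expected to be the highest-scoring class, so every exponent is
    at most 0 and the sum cannot overflow.
    """
    v_i = scores[chosen]
    return 1.0 / sum(math.exp(v_j - v_i) for v_j in scores.values())

def accept(scores, chosen, threshold=0.95):
    """Accept the classification only if the estimate exceeds the threshold."""
    return match_probability(scores, chosen) > threshold
```

A gesture whose best score is well separated from the others is accepted, while one with two nearly tied classes is rejected as ambiguous.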
Rubine [1991]
Evaluation
Aside: Agate - Landay, Myers [1993]
gdt - Long et al. [1999]
Newton and Palm users reported:
• Gestures are powerful, efficient and convenient
• Want more commands to have gestures
• Want to define new gestures
• Recognition accuracy is not good enough
gdt - Long et al. [1999]
Oh Agate, I will make you beautiful!
gdt - Long et al. [1999]
Distance matrix
gdt - Long et al. [1999]
Classification matrix
gdt - Long et al. [1999]
Experiment - Hypotheses
• “Participants could use gdt to improve their gesture sets.”
• “The tables gdt provided would aid designers.”
• “PDA users and non-PDA users would perform differently.”
gdt - Long et al. [1999]
Experiment - Procedure
(pay no attention to the man behind the curtain...)
gdt - Long et al. [1999]
Experiment - Results
gdt - Long et al. [1999]
Experiment - Problems with gdt
“Clustering” [figure labels: new class, existing gesture classes, distance d]
Reverse direction
gdt - Long et al. [1999]
Experiment - Problems with gdt
“Sloppiness”
Gesture overloading
Delete
gdt - Long et al. [1999]
Lessons learned
• gdt was helpful, but participants averaged a 95.4% recognition rate
• Tables too confusing, didn’t help performance (better: “Gesture class A is too similar to gesture class B”)
• Should be able to create a test set of gestures and run it against a different gesture class
rect
copy
Break time!
Muchas gracias to my officemate for the suggestion. Smiling babies make people happy. BE HAPPY!
GT2k - Westeyn et al. [2003]
Problem:
Real problem: it is (still) cumbersome to design a system to perform gesture recognition
GT2k - Westeyn et al. [2003]
GT2k system components
Data generator
Sensors: microphones, cameras, accelerometers
$f = \{f_1, \ldots, f_F\}$ (I’m back!)
Results interpreter → <action>
Aside: Hidden Markov Models
HMM: $\lambda = (\Lambda, B, \pi)$
Transition probs.: $\Lambda = \{a_{ij}\}$, $a_{ij} = \Pr(q_{t+1} = j \mid q_t = i)$
Symbol output probs.: $B = \{b_j(k)\}$, $b_j(k) = \Pr(o_t = v_k \mid q_t = j)$, where $v_k$ is the kth symbol in the alphabet
Initial state dist.: $\pi = \{\pi_i\}$, $\pi_i = \Pr(q_1 = i)$
Aside: Hidden Markov Models
• Evaluation problem
  – Given HMM and O = {o1,...,oT}, compute Pr(O | HMM)
  – Forward algorithm
• Decoding problem
  – Given O, compute the most likely state sequence that produced O
  – Viterbi algorithm
• Learning problem
  – Given O, compute the model parameters (transition and symbol output probs.) that maximize the likelihood of observing O
  – Forward-Backward algorithm (aka Baum-Welch)
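To make the first two problems concrete, here is a minimal Python sketch of the forward and Viterbi algorithms for a discrete HMM. It assumes `A`, `B`, and `pi` are plain nested lists indexed by state and symbol number, and it works with raw probabilities, so long sequences would underflow; a practical version would use log probabilities or scaling. The learning problem (Baum-Welch) is left out for brevity.

```python
def forward(A, B, pi, obs):
    """Pr(O | HMM) via the forward algorithm.

    A[i][j] = Pr(q_{t+1}=j | q_t=i), B[j][k] = Pr(o_t=v_k | q_t=j),
    pi[i] = Pr(q_1=i); obs is a non-empty list of symbol indices k.
    """
    if not obs:
        raise ValueError("observation sequence must be non-empty")
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    return sum(alpha)

def viterbi(A, B, pi, obs):
    """Most likely state sequence for obs, and its joint probability."""
    if not obs:
        raise ValueError("observation sequence must be non-empty")
    n = len(pi)
    delta = [pi[i] * B[i][obs[0]] for i in range(n)]
    back = []  # back[t][j] = best predecessor of state j at step t+1
    for o in obs[1:]:
        ptr, new = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: delta[i] * A[i][j])
            ptr.append(best_i)
            new.append(delta[best_i] * A[best_i][j] * B[j][o])
        back.append(ptr)
        delta = new
    state = max(range(n), key=lambda i: delta[i])
    prob = delta[state]
    path = [state]
    for ptr in reversed(back):  # follow back-pointers to the first state
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path, prob
```

The two functions share the same recursion; the forward algorithm sums over predecessor states where Viterbi takes the maximum and remembers which predecessor achieved it.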
GT2k - Westeyn et al. [2003]
Grammars
MoveForward = Advance Slow_Down Halt
MoveBackward = Reverse Slow_Down Halt
command = Attention <MoveForward | MoveBackward>
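One way to read this grammar is as a filter on the sequence of recognized gestures. The Python sketch below checks a sequence against it. It assumes that the angle brackets mean "one or more repetitions", as in HTK-style grammar notation; that reading is an assumption, as are the function and constant names.

```python
# Rules transcribed from the slide; the strings are gesture labels.
MOVE_FORWARD = ["Advance", "Slow_Down", "Halt"]
MOVE_BACKWARD = ["Reverse", "Slow_Down", "Halt"]

def is_command(gestures):
    """True if gestures is Attention followed by one or more moves.

    Assumption: <a | b> means "one or more repetitions of a or b".
    """
    gestures = list(gestures)
    if not gestures or gestures[0] != "Attention":
        return False
    rest = gestures[1:]
    if not rest:
        return False  # at least one move is required
    while rest:
        head = rest[:3]  # each move rule is exactly three gestures long
        if head not in (MOVE_FORWARD, MOVE_BACKWARD):
            return False
        rest = rest[3:]
    return True
```

Constraining recognition with a grammar like this rules out sequences such as a `Halt` with no preceding `Advance` or `Reverse`, which narrows what the recognizer has to choose between at each step.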
GT2k - Westeyn et al. [2003]
Converting raw sensor data to feature vectors
1 56 Attention
57 175 Advance
176 235 Slow_Down
236 250 Halt
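The four lines above appear to be an annotation of one recording: start frame, end frame, and gesture label. Under that reading, and assuming 1-based inclusive frame numbers, a small Python sketch for cutting a stream of per-frame feature vectors into labeled segments might look like this. The function names are hypothetical, and this is not GT2k's own file format handling.

```python
def parse_labels(text):
    """Parse lines of 'start end label' into (start, end, label) tuples."""
    segments = []
    for line in text.strip().splitlines():
        start, end, label = line.split()
        segments.append((int(start), int(end), label))
    return segments

def slice_segments(frames, segments):
    """Group per-frame feature vectors by label.

    Assumption: frame numbers are 1-based and ranges are inclusive.
    """
    return [(label, frames[start - 1:end]) for start, end, label in segments]

labels_text = """
1 56 Attention
57 175 Advance
176 235 Slow_Down
236 250 Halt
"""
segments = parse_labels(labels_text)
```

Each labeled slice can then serve as one training example for the model of the corresponding gesture.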
GT2k - Westeyn et al. [2003]
Training
Training and validation procedure
train
test
overfit!
only during continuous recognition
GT2k - Westeyn et al. [2003]
Accuracy
A = accuracy
N = number of examples
S = # substitution errors (misclassification)
D = # deletion errors (failed to recognize a gesture)
I = # insertion errors (system hallucinates a gesture)
$A = \dfrac{N - S - D - I}{N}$
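The accuracy measure is simple enough to check by hand, but a short Python sketch makes the roles of the three error counts explicit. The function and parameter names are chosen for this example.

```python
def accuracy(n, substitutions=0, deletions=0, insertions=0):
    """A = (N - S - D - I) / N.

    Insertions count against the score even though they are not among
    the N examples, so the result can be negative.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return (n - substitutions - deletions - insertions) / n
```

Plugging in the figures reported for the applications in these slides, 2 substitution errors on 251 examples gives about 99.20%, and 5 substitution errors on 48 examples gives about 89.6%.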
GT2k - Westeyn et al. [2003]
Applications - Gesture Panel
gesture = up | down | left | right | up-left | up-right | down-left | down-right
Result: 99.20% accuracy on 251 examples (2 substitution errors)
GT2k - Westeyn et al. [2003]
Applications - Prescott
blinkprint = person_1 | person_2 | person_3
Result: 89.6% accuracy on 48 examples (5 substitution errors, not good!)
GT2k - Westeyn et al. [2003]
Applications - TeleSign
word = my | computer | helps | me | talk
sentence = ( calibrate word word word word word exit )
Result: 90.48% accuracy on 72 examples
GT2k - Westeyn et al. [2003]
Applications - Workshop Activity Recognition
gesture = hammer | file | sand | saw | screw | vise | drill | clap | use_drawer | grind
Result: 93.33% accuracy on 10 examples per activity
GT2k - Westeyn et al. [2003]
Major conclusions
• HMMs can learn from arbitrary types of data
• Domain-specific knowledge may be needed to construct proper HMM topologies
• Shouldn’t assume that gestures are only applicable to 2D strokes with a mouse
• Wearing all that gear just to speak 5 sign language words is kind of ridiculous
BONUS SLIDES: What are the neat gesturing apps?
gestures used for:
handwriting & issuing commands
system-wide commands, interacting with UI widgets
http://www.xstroke.org/
http://www.bitart.com/
BONUS SLIDES: What are the neat gesturing apps?
gestures used for:
issuing commands
(gesturing built in)
(several gestures plugins available)
BONUS SLIDES: What are the neat gesturing apps?
• SwingGestures
  – Java Swing gesture recognizer
  – http://sourceforge.net/projects/swinggestures/
• Jestur
  – Python gesture recognizer
  – http://sourceforge.net/projects/jestur/
• Quill
  – Java gesture creation toolkit
  – http://sourceforge.net/projects/quill/
BONUS SLIDES: What are the neat gesturing apps?
• Quill