Gestures without Libraries, Toolkits or Training: A $1 Recognizer for User Interface Prototypes (Wobbrock et al., 2007)

31 March 2008

Summary:
The authors created an easy, cheap, and highly portable gesture recognizer called the $1 Gesture Recognizer. The algorithm requires only about one hundred lines of code and uses only basic geometry and trigonometry. The algorithm's contributions include being easy for novice user interface prototypers to implement, serving as a measuring stick against more advanced algorithms, and giving insight into which gestures are "best" for people and computer systems. Challenges for gesture recognizers in general include being resilient to sampling variations, supporting optional and configurable rotation, scale, and position invariance, requiring no advanced math techniques, being easy to write in a few lines of code, being teachable with one example, returning an N-best list with sensible scores independent of the number of input points, and providing recognition rates competitive with more advanced algorithms.

$1 copes with those challenges in its four-step algorithm: 1) resample the point path to N points, where 32 <= N <= 256, 2) rotate once so that the indicative angle, which is the angle formed between the gesture's centroid and its starting point, lies at 0 degrees, 3) scale non-uniformly to a reference square and translate so that the centroid becomes the origin, and 4) recognize by comparing the candidate against each template at its optimal angle to find the best score. Analyzing rotation invariance shows that there is no guarantee that the candidate's points and a template's points will optimally align after rotating the indicative angle to 0 degrees, so $1 uses a Golden Section Search (GSS), which repeatedly narrows the search range using the Golden Ratio, to find the rotation that minimizes the distance between candidate and template.
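
To make the four steps concrete, here's a minimal Python sketch of the pipeline as I understand it from the paper. The function names are my own, and the constants (N = 64, a 250-unit reference square, a search range of -45 to +45 degrees with a 2-degree threshold) are values along the lines the paper suggests, so treat this as an illustration rather than the authors' exact code.

```python
import math

N = 64              # resampling resolution (paper suggests 32 <= N <= 256)
SQUARE_SIZE = 250.0 # side of the reference square used in non-uniform scaling

def centroid(points):
    return (sum(p[0] for p in points) / len(points),
            sum(p[1] for p in points) / len(points))

def path_length(points):
    return sum(math.dist(points[i - 1], points[i]) for i in range(1, len(points)))

def resample(points, n=N):
    # Step 1: walk the stroke, emitting n equidistantly spaced points.
    interval = path_length(points) / (n - 1)
    pts = list(points)
    resampled = [pts[0]]
    d, i = 0.0, 1
    while i < len(pts):
        seg = math.dist(pts[i - 1], pts[i])
        if d + seg >= interval:
            t = (interval - d) / seg
            q = (pts[i - 1][0] + t * (pts[i][0] - pts[i - 1][0]),
                 pts[i - 1][1] + t * (pts[i][1] - pts[i - 1][1]))
            resampled.append(q)
            pts.insert(i, q)  # q becomes the start of the next segment
            d = 0.0
        else:
            d += seg
        i += 1
    while len(resampled) < n:  # float rounding can leave us one point short
        resampled.append(pts[-1])
    return resampled

def rotate_by(points, angle):
    # Rotate all points around the centroid by the given angle (radians).
    cx, cy = centroid(points)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return [((p[0] - cx) * cos_a - (p[1] - cy) * sin_a + cx,
             (p[0] - cx) * sin_a + (p[1] - cy) * cos_a + cy) for p in points]

def rotate_to_zero(points):
    # Step 2: rotate once so the indicative angle (centroid -> first point) is 0.
    cx, cy = centroid(points)
    return rotate_by(points, -math.atan2(points[0][1] - cy, points[0][0] - cx))

def scale_and_translate(points, size=SQUARE_SIZE):
    # Step 3: scale non-uniformly to a reference square, then translate the
    # centroid to the origin. (A near-zero width or height, i.e. a 1D line,
    # blows up here -- the line-distortion limitation noted below.)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    w, h = max(xs) - min(xs), max(ys) - min(ys)
    scaled = [(p[0] * size / w, p[1] * size / h) for p in points]
    cx, cy = centroid(scaled)
    return [(p[0] - cx, p[1] - cy) for p in scaled]

def path_distance(a, b):
    # Average point-to-point distance between two equally resampled paths.
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

def distance_at_best_angle(candidate, template,
                           a=math.radians(-45), b=math.radians(45),
                           threshold=math.radians(2)):
    # Step 4: Golden Section Search over the rotation angle. The Golden Ratio
    # shrinks the search range [a, b] each iteration until it is narrower than
    # the threshold, converging on the rotation with the minimum distance.
    phi = 0.5 * (-1.0 + math.sqrt(5.0))
    x1 = phi * a + (1 - phi) * b
    x2 = (1 - phi) * a + phi * b
    f1 = path_distance(rotate_by(candidate, x1), template)
    f2 = path_distance(rotate_by(candidate, x2), template)
    while abs(b - a) > threshold:
        if f1 < f2:
            b, x2, f2 = x2, x1, f1
            x1 = phi * a + (1 - phi) * b
            f1 = path_distance(rotate_by(candidate, x1), template)
        else:
            a, x1, f1 = x1, x2, f2
            x2 = (1 - phi) * a + phi * b
            f2 = path_distance(rotate_by(candidate, x2), template)
    return min(f1, f2)
```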

Limitations of $1 include being unable to distinguish gestures whose identities depend on specific orientations, aspect ratios, or locations; distorting horizontal and vertical lines, since non-uniform scaling squashes 1D gestures into the reference square; and being unable to differentiate gestures by speed, since it doesn't use time. To handle variation with $1, multiple templates can be defined under a single gesture name, so that different ways of making the same gesture all map to one result. A study compared $1 with a modified Rubine classifier and a Dynamic Time Warping (DTW) template matcher. The study showed that $1 and DTW were more accurate than Rubine, and that $1 and Rubine executed faster than DTW.
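
As a hypothetical illustration of the multiple-templates idea, here is how the sketch above might be extended: recognition returns the best-scoring template, so variants of a gesture simply become extra templates stored under the same name. The names recognize, add_template, and the example stroke variables are mine, not the paper's.

```python
templates = []  # (name, preprocessed points); several entries may share a name

def preprocess(points):
    # The three preprocessing steps from the sketch above, in order.
    return scale_and_translate(rotate_to_zero(resample(points)))

def add_template(name, points):
    templates.append((name, preprocess(points)))

def recognize(points):
    candidate = preprocess(points)
    best_name, best_d = None, float("inf")
    for name, tmpl in templates:
        d = distance_at_best_angle(candidate, tmpl)
        if d < best_d:
            best_name, best_d = name, d
    # Paper-style [0..1] score: 1 minus the distance over half the
    # reference square's diagonal.
    half_diagonal = 0.5 * math.sqrt(2.0) * SQUARE_SIZE
    return best_name, 1.0 - best_d / half_diagonal

# Two ways of drawing the same shape, stored under one name (hypothetical strokes):
# add_template("circle", clockwise_circle_points)
# add_template("circle", counterclockwise_circle_points)
```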

Discussion:
I guess I should change the discussion a bit because we're now looking at this paper from a GR perspective instead of an SR perspective. Our SR class was quite critical of this paper at the time, given that there were already existing SR algorithms that were more capable. Maybe $1 isn't as bad for GR, given that the simplicity of this algorithm would help bring GR-based applications into the mainstream, and also because gestures for glove- and wand-based devices probably aren't as complicated to handle as those for pen-based devices. The limitations that we noted in the SR class haven't gone away simply because we shifted to GR, but I don't think they're as disadvantageous the second time around. I guess we won't know for sure unless we start experimenting with various applications that use $1.
