Paul Taele's Blog on Gesture Recognition: February 2008

[10] Hand Tension as a Gesture Segmentation Cue (Harling & Edwards – 1997)

02 February 2008

Current Mood: speechless

Blogs I Commented On:

Rant:
How is it that this university has two chicken finger restaurants near campus, yet there is no buffalo wings restaurant nearby? (No, Buffalo Wild is not nearby.) What they need to do is convert Layne's into a buffalo wings restaurant. Their chicken fingers and sauce suck compared to Raising Cane's.

Summary:
Gesture classes can be defined by whether the hand posture is static or dynamic and whether the hand location is static or dynamic, which gives four classes. Recognizers designed for hand gestures have difficulty deciding whether two distinct, consecutive hand motions form a single atomic gesture or not. The authors propose a system to help remedy that by considering what happens to the muscles of the fingers as a posture is formed. They observe that as the hand moves from one posture to another, the amount of tension changes, with the hand becoming tenser for certain postures. They thus theorize that intentional gestures are made with a tense hand rather than a relaxed one. Of the four gesture classes, their segmentation method based on this theory works best when dynamic finger motions are not involved.

Current input technology does not directly measure finger tension, so their model treats a finger as a light rigid rod of fixed length, with two light elastic strings attached to the end of the rod. They resolve the forces along the finger using Hooke’s law to determine the amount of tension in each finger. Summing over the fingers gives the total hand tension, which they observe rises when a single finger’s tension rises and falls when it falls.
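
To make the spring idea concrete for myself, here is a minimal Python sketch. It is not the authors' actual model: I am assuming a single normalized bend value per finger, a made-up rest position of 0.5, and a made-up stiffness of 1.0, and I collapse the two elastic strings into one Hooke's-law term (tension = stiffness times displacement from rest). The names finger_tension and hand_tension are my own.

```python
from typing import Sequence


def finger_tension(bend: float, rest: float = 0.5, stiffness: float = 1.0) -> float:
    """Hooke's law on one finger: tension grows linearly with the
    displacement of the bend value from an assumed rest position.

    bend is a normalized sensor reading (0.0 = straight, 1.0 = fully bent).
    rest and stiffness are illustrative assumptions, not values from the paper.
    """
    return stiffness * abs(bend - rest)


def hand_tension(bends: Sequence[float], rest: float = 0.5, stiffness: float = 1.0) -> float:
    """Total hand tension is the sum of the per-finger tensions."""
    return sum(finger_tension(b, rest, stiffness) for b in bends)


if __name__ == "__main__":
    relaxed = [0.5, 0.5, 0.5, 0.5]   # four measured fingers at rest
    pointing = [0.0, 1.0, 1.0, 1.0]  # one finger straight, the rest curled
    print(hand_tension(relaxed), hand_tension(pointing))
```

Even in this toy version, pushing any one finger away from rest raises the total, which is the property the segmentation idea relies on.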

The hand model was tested on two sets of gestures from the domain of BSL, using a Mattel Power Glove that measured the bend of every finger except the pinky. When sentence fragments were performed with hand gestures, the graph of hand tension for the fragments displayed local maxima while gestures were performed, and local minima during the transitions to the next gesture.
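
The maxima/minima observation suggests a very simple segmenter. The sketch below is my own illustration, not the authors' procedure: it assumes the tension graph is available as a plain list of samples, treats strict local maxima as candidate gestures and strict local minima as candidate boundaries, and does no smoothing, so noisy data or flat plateaus would need extra handling.

```python
from typing import List, Sequence, Tuple


def local_extrema(signal: Sequence[float]) -> Tuple[List[int], List[int]]:
    """Return (maxima, minima) as lists of sample indices.

    A sample counts as a maximum if it is strictly greater than both
    neighbours, and as a minimum if it is strictly smaller than both.
    Endpoints and flat plateaus are ignored in this simplified version.
    """
    maxima: List[int] = []
    minima: List[int] = []
    for i in range(1, len(signal) - 1):
        if signal[i - 1] < signal[i] > signal[i + 1]:
            maxima.append(i)
        elif signal[i - 1] > signal[i] < signal[i + 1]:
            minima.append(i)
    return maxima, minima


if __name__ == "__main__":
    # Made-up tension trace: two tense postures with a relaxed transition between.
    tension = [0.1, 0.8, 0.2, 0.9, 0.3]
    gestures, boundaries = local_extrema(tension)
    print("candidate gestures at", gestures)
    print("candidate boundaries at", boundaries)
```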

Discussion:
It’s an interesting idea to use finger tension as a way to segment different hand gestures in their domain of BSL (which isn’t much different from ASL, I suppose). Their improvised way to measure finger tension seems to offer decent results in segmentation, so their theory held up quite well. On the other hand, they note that this approach doesn’t work for dynamic hand postures and locations, which is a shame since these types of gestures are more natural. Though their approach alone wouldn’t be very useful for a robust gesture recognition system, it could be a useful metric to aid in a particular area of hand gesture recognition.

One other point that I wished to discuss is the authors’ comments about the use of finger tension for aiding actual recognition rather than segmentation. Originally, they proposed their system for the problem of segmentation, but they observed from their sample data that different postures in their study exhibited unique levels of hand tension. Hence, they wonder if these different levels of finger tension hold in the general case. I have my doubts that this metric would be reliable for a domain with a large gesture library, but it might be useful for a smaller library. Of course, the segmentation problem would also decrease in complexity anyway with a smaller library. I think it would be more productive to focus their attention on solving the dynamic portion of the hand gesture classes, but their idea on this matter is intriguing and worthy of a look. I definitely would like to see if this correlation holds.

[09] A Multi-Class Pattern Recognition System for Practical Finger Spelling Translation (Hernandez-Rebollar, et al – 2002)

Current Mood: studious

Blogs I Commented On:

Rant:
Looks like the groundhog didn't see its shadow today, because the weather's so nice right now. I hope I'm right.

Summary:
The authors propose a system for ASL that addresses portability and affordability, by using an inexpensive microcontroller and a set of five dual-axis accelerometers. Their input device is the Accele Glove, which doesn’t require an external tracking system. The glove also serves as the key component to their system, since it measures finger position. A PC is used for data analysis and algorithm training, and a voice synthesizer is used to vocally output recognized letters. Dual-axis sensors are attached to middle joints of the fingers to eliminate ambiguity, while the accelerometers measure joint flexion at the y-axis and hand roll/yaw and individual finger abduction at the x-axis.

The Accele Glove can measure finger flexion (or hand shape), and also hand orientation (with respect to the gravitational vector) without needing an external sensor. Thus, the extracted features are: orientation of fingers, total number of fingers bent, and palm orientation (closed, horizontal, and vertical/open). The sample space used for classification consists of 50 samples for the 26 ASL symbols. A decision tree was used to handle classification in a hierarchical structure, where different features are tested at each level. Many letters are recognized in two steps, while the most difficult ones are recognized at the bottom level.
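
Here is a rough Python sketch of what "different features tested at each level" could look like. Everything specific in it is hypothetical: the HandFeatures fields, the thresholds, and the group_* labels are placeholders I made up, not the paper's actual tree or its mapping to ASL letters. It only shows the hierarchical shape, where easy cases exit early and harder ones fall through to deeper tests.

```python
from dataclasses import dataclass


@dataclass
class HandFeatures:
    palm: str          # "closed", "horizontal", or "vertical" (from the summary)
    fingers_bent: int  # total number of fingers bent, 0..5
    index_bent: bool   # hypothetical per-finger feature used at the last level


def classify(f: HandFeatures) -> str:
    """Hand-written decision tree; labels and thresholds are placeholders."""
    # Level 1: palm orientation splits the sample space first.
    if f.palm == "closed":
        return "group_closed"
    if f.palm == "horizontal":
        # Level 2: number of bent fingers resolves this branch in two steps.
        return "group_horizontal_fist" if f.fingers_bent >= 4 else "group_horizontal_open"
    if f.palm == "vertical":
        if f.fingers_bent == 0:
            return "group_vertical_flat"
        # Level 3: the hardest cases need one more feature test.
        return "group_vertical_index_bent" if f.index_bent else "group_vertical_index_straight"
    raise ValueError(f"unknown palm orientation: {f.palm!r}")


if __name__ == "__main__":
    print(classify(HandFeatures(palm="vertical", fingers_bent=3, index_bent=False)))
```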

Ten samples were collected from each of five volunteers to train and test the system. 21 of the 26 ASL letters achieved perfect recognition, while for the rest linear functions could not separate the classes without errors, giving accuracies such as 90%, 78%, and 96%.

Discussion:
Finally, a hand gesture paper that doesn’t use HMMs. Instead, this paper uses decision trees for recognition. I found this very interesting, because I didn’t know decision trees could be used successfully for something that can’t be easily linearly classified, such as hand gestures. I think there’s a lot of other good in this paper. Examples include not using the Cyberglove but an inexpensive alternative instead, having lots of sample data from several people instead of one, and actually testing on all the ASL symbols with a few more to produce basic communication. Using index finger position as the initial posture recognition component is an interesting aspect, which makes more sense with a decision tree implementation, but we can definitely see where the implementation reaches its limits with erroneous classifications on a few ASL letters. I wonder if there’s an easy way to incorporate the power of non-linear functions into a decision tree implementation…

[08] A Dynamic Gesture Interface for Virtual Environments Based on Hidden Markov Models (Chen, et al – 2005)

01 February 2008

Current Mood: slightly peeved

Blogs I Commented On:

Rant:
The papers...they never end...

Summary:
Meaningful hand gestures consist of two types: static postures (e.g., ASL) and continuous dynamic gestures. The latter consists of global hand motions (i.e., large hand rotations and translations) and local finger motions (i.e., parameterized with a set of joint angles). The focus of this paper is a continuous dynamic gesture recognition system based on HMMs. The prototype for continuous dynamic gesture recognition involves rotating a cube with three different gestures.

The implementation of the system contains three steps. The first step involves collecting raw data and preprocessing it. Dynamic gestures are modeled with discrete HMMs, and the observation signals are standard deviations of angle variations for each finger joint. The standard deviation describes the dynamic character of angle variation for each finger joint, thus transforming multi-dimensional observation signals into easier-to-process discrete one-dimensional ones. The second step is training the HMMs using the Baum-Welch algorithm. Ten data sets were taken for each dynamic gesture, resulting in three HMMs. The third step is the gesture recognition itself, which uses the forward-backward algorithm. The paper gave no results for their system.
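
Since the paper is thin on details, here is my own minimal Python sketch of the two pieces the summary names: a standard-deviation feature over a joint's angle variations, and scoring a discrete observation sequence against per-gesture HMMs. The scoring uses only the forward pass of the forward-backward procedure, which is enough to get a sequence likelihood. How the authors quantize the standard deviations into discrete symbols is not stated, so that step is left out, and the model parameters and gesture names in the demo are invented.

```python
from statistics import pstdev
from typing import Dict, List, Sequence, Tuple

# (start probabilities, transition matrix, emission matrix)
HMM = Tuple[List[float], List[List[float]], List[List[float]]]


def angle_variation_std(angles: Sequence[float]) -> float:
    """Standard deviation of frame-to-frame angle changes for one joint.

    Needs at least two samples so that there is at least one change.
    """
    deltas = [b - a for a, b in zip(angles, angles[1:])]
    return pstdev(deltas)


def forward_likelihood(obs: Sequence[int], model: HMM) -> float:
    """P(obs | model) for a discrete HMM, computed with the forward pass."""
    start, trans, emit = model
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][o]
            for s in range(n)
        ]
    return sum(alpha)


def recognize(obs: Sequence[int], models: Dict[str, HMM]) -> str:
    """Pick the gesture whose HMM gives the observations the highest likelihood."""
    return max(models, key=lambda name: forward_likelihood(obs, models[name]))


if __name__ == "__main__":
    # Two invented 2-state models over a 2-symbol alphabet.
    models = {
        "gesture_a": ([1.0, 0.0], [[0.5, 0.5], [0.0, 1.0]], [[0.9, 0.1], [0.1, 0.9]]),
        "gesture_b": ([1.0, 0.0], [[0.5, 0.5], [0.0, 1.0]], [[0.1, 0.9], [0.9, 0.1]]),
    }
    print(recognize([0, 0, 1, 1], models))
```

For long sequences the raw probabilities underflow, so a real version would scale alpha at each step or work in log space.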

Discussion:
What an interesting paper. There were neither any results to show the performance of their system, nor any convincing arguments as to why HMMs were used with their standard deviation technique. I do think that the authors’ standard deviation technique using HMMs could work well for other applications -- in theory. In theory, communism works. In theory.
