[09] A Multi-Class Pattern Recognition System for Practical Finger Spelling Translation (Hernandez-Rebollar, et al – 2002)

02 February 2008

Current Mood: studious

Blogs I Commented On:


Rant:
Looks like the groundhog didn't see its shadow today, because the weather's so nice right now. I hope I'm right.

Summary:
The authors propose a system for ASL recognition that addresses portability and affordability by using an inexpensive microcontroller and a set of five dual-axis accelerometers. Their input device is the Accele Glove, which doesn't require an external tracking system. The glove also serves as the key component of their system, since it measures finger position. A PC is used for data analysis and algorithm training, and a voice synthesizer vocally outputs recognized letters. The dual-axis sensors are attached to the middle joints of the fingers to eliminate ambiguity; the accelerometers measure joint flexion on the y-axis, and hand roll/yaw and individual finger abduction on the x-axis.
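The flexion measurement works because a static accelerometer senses the projection of gravity onto each axis, so a single-axis reading can be inverted into a tilt angle. Here's a minimal sketch of that idea; the function name and the clamping detail are my own, not the paper's:

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def tilt_angle(axis_reading, g=G):
    """Estimate a tilt angle (radians) from one accelerometer axis.

    At rest, a reading is a = g * sin(theta), so theta = asin(a / g).
    The ratio is clamped to [-1, 1] to tolerate sensor noise that
    would otherwise push asin() out of its domain.
    """
    ratio = max(-1.0, min(1.0, axis_reading / g))
    return math.asin(ratio)

# A y-axis reading of g*sin(45 deg) maps back to ~45 degrees of joint flexion.
print(round(math.degrees(tilt_angle(G * math.sin(math.radians(45))))))  # 45
```

The same inversion applied to the x-axis reading would give hand roll, which is how one pair of sensors per finger covers both feature types.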

The Accele Glove can measure finger flexion (hand shape) and also hand orientation (with respect to the gravitational vector) without needing an external sensor. Thus, the extracted features are: orientation of fingers, total number of fingers bent, and palm orientation (closed, horizontal, and vertical/open). The sample space used for classification consists of 50 samples for each of the 26 ASL symbols. A decision tree handles classification in a hierarchical structure, where different features are tested at each level. Many letters are recognized in two steps, while the most difficult ones are recognized at the bottom level.
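To make the hierarchical idea concrete, here is a toy hand-rolled tree over the three feature types the paper extracts. The thresholds, feature encoding, and letter assignments below are purely illustrative assumptions, not the paper's actual tree:

```python
def classify(features):
    """Toy hierarchical classifier over hypothetical glove features.

    features: dict with
      'bent'  - total number of fingers flexed (0-5)
      'palm'  - palm orientation: 'closed', 'horizontal', or 'vertical'
      'index' - index-finger orientation angle in degrees
    Each tree level tests one feature; easy letters exit early,
    hard ones fall through to deeper tests.
    """
    if features['bent'] == 0:
        # Level 1: all fingers extended -> level 2 tests palm orientation
        return 'B' if features['palm'] == 'vertical' else '5'
    if features['bent'] == 5:
        # Level 1: fist-like shapes -> level 2 tests index orientation
        return 'A' if features['index'] > 45 else 'S'
    # The real tree would keep testing features at deeper levels here
    return 'unknown'

print(classify({'bent': 0, 'palm': 'vertical', 'index': 0}))  # B
```

The appeal of this structure is that each decision is a cheap threshold test, which suits the paper's goal of running on a small microcontroller.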

Ten sample sets from each of five volunteers were collected to train and test the system. 21 of the 26 ASL letters achieved perfect recognition, while the remaining letters could not be separated by linear functions without error, achieving accuracies such as 90%, 78%, and 96%.

Discussion:
Finally, a hand gesture paper that doesn't use HMMs. Instead, this paper relies on decision trees for recognition. I found this very interesting, because I didn't know decision trees could be used successfully for something that can't easily be linearly classified, like hand gestures. In my opinion, there's a lot of other good in this paper: not using the Cyberglove but an inexpensive alternative, collecting lots of sample data from several people instead of just one, and actually testing on all the ASL symbols plus a few more to support basic communication. Using index finger position as the initial posture recognition component is an interesting aspect that makes sense in a decision tree implementation, but we can definitely see where the implementation reaches its limits with erroneous classifications on a few ASL letters. I wonder if there's an easy way to incorporate the power of non-linear functions into a decision tree implementation…

1 comments:

Brandon said...

you can probably introduce some non-linearity into a decision tree by using branching probability that could possibly be based on context. for example, if spelling the word 'sun' using their approach we know that 'U' is a problem letter. they say this letter is ambiguous with 'r' and 'v'. obviously 'srn' and 'svn' are not words. also, we know that almost never will a 'v' follow an 's', occasionally an 'r' will follow an 's', and frequently a 'u' follows an 's'. using these prior probabilities with the decision tree could boost accuracy. then again, abandoning the decision tree and going with a more complex pattern recognition algorithm could also solve the problem since there would then be no need for the dimensionality reduction they performed in the paper.
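Brandon's bigram idea is easy to prototype: weight the classifier's scores for the ambiguous letters by P(letter | previous letter) and pick the best rescored candidate. The probabilities below are made-up illustrative values, not real English bigram statistics:

```python
# Hypothetical priors: after 's', 'u' is common, 'r' occasional, 'v' rare.
bigrams = {'s': {'u': 0.6, 'r': 0.3, 'v': 0.01}}

def disambiguate(candidate_scores, prev_letter, bigrams):
    """Rescore ambiguous classifier outputs with letter-bigram priors.

    candidate_scores: classifier confidence per candidate letter.
    Unseen bigrams get a small floor probability (0.05) so no
    candidate is ruled out entirely.
    """
    priors = bigrams.get(prev_letter, {})
    rescored = {
        letter: score * priors.get(letter, 0.05)
        for letter, score in candidate_scores.items()
    }
    return max(rescored, key=rescored.get)

# The glove confuses 'u', 'r', and 'v' after the 's' in "sun";
# the prior breaks the near-tie in favor of 'u'.
print(disambiguate({'u': 0.34, 'r': 0.33, 'v': 0.33}, 's', bigrams))  # u
```

This keeps the cheap decision tree intact and only adds a lookup table, which fits the paper's low-cost hardware constraint better than switching to a heavier pattern recognition algorithm.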