[03] American Sign Language Finger Spelling Recognition System (Allen, et al – 2003)

26 January 2008

Current Mood: sinister

Rant:
Oh man, it's going to take me a while to get used to reading two papers per class instead of one. I must hone my gesture-fu skills. *executes a high-pitched Bruce Lee scream*

Summary:
Sign language is a form of communication used by the deaf and deaf-blind community, but most people are not familiar with it. Sign language interpreters are also expensive and typically used only in formal settings. The authors' long-term goal is wearable technology that recognizes sign language, translates it into printed and spoken English, and transforms that English into an animated and tactile language. Toward that goal, they created a recognition system that classifies finger-spelled letters based on subunits of hand shape, using a neural network for pattern classification. The classification technique matches the descriptive parameters used in sign language dictionaries, which differs from conventional techniques that must recognize the entire sign. Their input device is a CyberGlove, which contains 18 sensors and measures finger and wrist position and movement.

In developing the recognition system, the authors first created a LabVIEW program that collected glove data and saved it to a file. The data was then loaded into MATLAB to train the neural network. A second MATLAB program matched incoming glove data to the most similar American Sign Language (ASL) letter the network was trained to recognize. Once the system could differentiate the letters, a second LabVIEW program translated the classifications into spoken form by outputting the corresponding English sound. The final step integrated these pieces into a single system.

The main recognition system used a perceptron network, since it produced the best results of all the network types tested. The input matrix represented the 18 glove sensor values for each of 24 ASL letters (2 letters were omitted since they also require arm movement), and the output was a 24x24 matrix. A trained network took in an 18x1 matrix of sensor values and output a 24x1 matrix, where the position of the 1 in the output gave the index of the recognized ASL letter. Results reached up to 90% accuracy using training data from a single person, and the authors expect that more data will produce better results.
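
To make those matrix shapes concrete, here is a minimal Python sketch of a single-layer perceptron with 18 inputs and 24 output units. It is not the authors' MATLAB code. The synthetic "glove" data, the function names, the learning rate, the epoch count, and the largest-activation tie-break in `classify` are all my own assumptions for illustration.

```python
import random

NUM_SENSORS = 18   # CyberGlove sensor count, per the summary above
NUM_LETTERS = 24   # ASL letters left after dropping the two that need arm movement


def make_synthetic_data(samples_per_letter=5, noise=0.02, seed=0):
    """Fabricated stand-in for glove recordings: one random hand shape per letter plus jitter."""
    rng = random.Random(seed)
    prototypes = [[rng.random() for _ in range(NUM_SENSORS)] for _ in range(NUM_LETTERS)]
    data = []
    for letter, proto in enumerate(prototypes):
        for _ in range(samples_per_letter):
            data.append(([v + rng.gauss(0.0, noise) for v in proto], letter))
    return data


def activations(weights, x):
    """Raw weighted sum for each of the 24 output units (the last weight is the bias)."""
    return [sum(w_i * x_i for w_i, x_i in zip(w[:-1], x)) + w[-1] for w in weights]


def train_perceptron(data, epochs=50, rate=0.1):
    """Perceptron learning rule with a hard-limit (step) activation, one unit per letter."""
    weights = [[0.0] * (NUM_SENSORS + 1) for _ in range(NUM_LETTERS)]
    for _ in range(epochs):
        errors = 0
        for x, letter in data:
            for unit, act in enumerate(activations(weights, x)):
                target = 1 if unit == letter else 0
                output = 1 if act > 0 else 0
                delta = target - output
                if delta != 0:
                    errors += 1
                    w = weights[unit]
                    for i in range(NUM_SENSORS):
                        w[i] += rate * delta * x[i]
                    w[-1] += rate * delta
        if errors == 0:  # every unit got every sample right in this pass
            break
    return weights


def classify(weights, x):
    """Map an 18-value reading to a 24-element one-hot list.

    Assumption: if several units fire (or none do), pick the largest activation.
    """
    acts = activations(weights, x)
    best = max(range(NUM_LETTERS), key=lambda i: acts[i])
    return [1 if i == best else 0 for i in range(NUM_LETTERS)]


data = make_synthetic_data()
weights = train_perceptron(data)
one_hot = classify(weights, data[0][0])
accuracy = sum(classify(weights, x).index(1) == letter for x, letter in data) / len(data)
print(f"training accuracy on synthetic data: {accuracy:.2f}")
```

The position of the 1 in `one_hot` plays the role of the letter index in the 24x1 output described above. Any accuracy printed here reflects only the made-up data and says nothing about the paper's 90% figure.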

Discussion:
I have never believed in researching technology for the sake of technology; technology should have beneficial applications in order to justify its existence. In the case of haptics research, the authors' work is an example of that justification. On the other hand, it does seem strange that they resorted to classifying ASL letters with a linear perceptron network. Most neural network applications use more powerful non-linear networks with sigmoid units, especially for a domain as complex as haptics. The authors also state that their results reached up to 90% accuracy. I wonder whether those results would have been stated as above 90% had they employed a more powerful neural network.
