Enabling fast and effortless customization in accelerometer based gesture interaction (Mantyjarvi, et al – 2004)

31 March 2008

Current Mood:

Blogs I Commented On:


Summary:
The purpose of this paper is to develop a procedure that lets users create customized accelerometer-based gesture controls using HMMs. The authors define gestures as hand movements captured by a set of sensors in a handheld device and modeled with machine learning methods. They use HMMs to recognize gestures because HMMs can model time series with spatial and temporal variability. Their system first preprocesses the data, normalizing each gesture to equal length and amplitude. A vector quantizer then maps the three-dimensional acceleration vectors into a one-dimensional sequence of codebook indices, where the codebook was generated from collected gesture vectors using a k-means algorithm. This symbol sequence is fed to an HMM with an ergodic topology; a codebook size of 8 and a model size of 5 states were chosen. After vector quantization, a gesture is used either to train the HMM or to evaluate its recognition capability. Finally, the authors added noise to copies of the gesture data to test whether noise-distorted duplicates can reduce the number of training repetitions the user must perform with discrete HMMs. They used uniform and Gaussian noise distributions and experimented with various signal-to-noise ratios (SNR) to determine which ratio gave the best results. The system was evaluated on eight popular gestures applicable to a DVD playback system, and the experiments consisted of finding an optimal threshold value for HMM convergence, examining accuracy for different numbers of training repetitions, finding an optimal SNR value, and examining the effects of training with noise-distorted signal duplicates. With six training repetitions, accuracy was over 95%; the best accuracy for Gaussian noise (SNR = 3) was 97.2% and for uniformly distributed noise (SNR = 5) was 96.3%.
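The paper doesn't include code, but the front end of the pipeline is easy to picture. Here's a minimal sketch, in Python with NumPy, of how I read the preprocessing, k-means codebook, vector quantization, and noise-duplication steps; the function names, the 40-sample gesture length, and the Gaussian-only noise are my own assumptions, not the authors' implementation:

```python
import numpy as np

def normalize_gesture(samples, target_len=40):
    """Resample a gesture (N x 3 accelerometer samples) to a fixed
    length and scale its amplitude to unit peak, roughly matching the
    paper's equal-length/equal-amplitude preprocessing."""
    samples = np.asarray(samples, dtype=float)
    old_t = np.linspace(0.0, 1.0, len(samples))
    new_t = np.linspace(0.0, 1.0, target_len)
    # Linear interpolation along each of the three axes.
    resampled = np.column_stack(
        [np.interp(new_t, old_t, samples[:, k]) for k in range(3)]
    )
    peak = np.abs(resampled).max()
    return resampled / peak if peak > 0 else resampled

def kmeans_codebook(vectors, k=8, iters=50, seed=0):
    """Plain k-means over collected 3-D gesture vectors; the codebook
    is the set of k centroids (the paper uses k = 8)."""
    rng = np.random.default_rng(seed)
    centroids = vectors[rng.choice(len(vectors), k, replace=False)]
    for _ in range(iters):
        # Assign each vector to its nearest centroid, then re-estimate.
        dists = np.linalg.norm(vectors[:, None, :] - centroids[None], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = vectors[labels == j].mean(axis=0)
    return centroids

def quantize(gesture, codebook):
    """Map each 3-D sample to the index of its nearest codeword,
    yielding the 1-D symbol sequence a discrete HMM consumes."""
    dists = np.linalg.norm(gesture[:, None, :] - codebook[None], axis=2)
    return dists.argmin(axis=1)

def add_noise_copies(gesture, snr, n_copies=4, seed=0):
    """Duplicate a training gesture with additive Gaussian noise at a
    given signal-to-noise ratio -- the paper's trick for cutting the
    number of repetitions a user has to record."""
    rng = np.random.default_rng(seed)
    noise_power = np.mean(gesture ** 2) / snr
    return [gesture + rng.normal(0.0, np.sqrt(noise_power), gesture.shape)
            for _ in range(n_copies)]
```

A user's recorded repetition would be run through `normalize_gesture` and `quantize`, and the noisy copies from `add_noise_copies` would be quantized the same way and added to the HMM's training set.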
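On the recognition side, classifying a quantized gesture against the trained models comes down to scoring the symbol sequence with each gesture's HMM and picking the winner. A bare-bones forward-algorithm sketch (my own illustration of discrete-HMM scoring, not the authors' code; training the models via Baum-Welch is omitted) could look like:

```python
import numpy as np

def forward_log_likelihood(obs, log_A, log_B, log_pi):
    """Forward algorithm for a discrete HMM: returns log P(obs | model).
    obs    : sequence of codebook indices
    log_A  : (S, S) log transition matrix (ergodic: all entries finite)
    log_B  : (S, K) log emission probabilities over K codebook symbols
    log_pi : (S,) log initial state distribution
    """
    alpha = log_pi + log_B[:, obs[0]]
    for o in obs[1:]:
        # Log-sum-exp over the previous state, then emit symbol o.
        alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0) + log_B[:, o]
    return np.logaddexp.reduce(alpha)

def classify(obs, models):
    """Pick the gesture whose HMM assigns obs the highest likelihood.
    models maps gesture name -> (log_A, log_B, log_pi)."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))
```

With the paper's parameters this would be S = 5 states and K = 8 symbols per model, one model per gesture in the DVD-control set.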

Discussion:
Some thoughts on the paper:
  • It felt like the authors wanted to build a system that lets users create customized “macros” for hand-motion gestures. It's an interesting idea, but my main concern is the system's robustness for other users who didn't train these “macros.” Incorporating noise into existing training data to generalize the system while keeping the number of training repetitions low is a novel idea, but the paper never tells us how it performs across multiple users. The reported accuracy rates are very high, but that's a bit misleading, since I didn't see a separate test set from, say, another user. I do think it's a fine system for an application meant for one specific user, but it doesn't seem robust for multiple users; if the latter is desired, I have no idea whether this system will perform well.
  • They tested only on 2D gesture data, which seems like a waste of the z-axis, since the same experiments could have been run by simply omitting that third dimension. Then again, I can't really imagine truly useful gestures that would take advantage of z-axis data.
  • I think it would be better to have a system where users sketch their gestures in 2D on-screen, and then have the system try to recognize the accelerometer data using existing sketch recognition techniques.

1 comment:

Grandmaster Mash said...

I want to see the sets of gestures that different people came up with. That would be a much more interesting paper