Abstract
Exemplar-based approaches to dynamic hand gesture recognition usually require a large collection of gestures to achieve good performance. An efficient visual representation of motion patterns is therefore essential for gesture recognition to scale to large databases. In this paper, we propose a new visual representation for hand motions based on motion divergence fields, which can be normalized to gray-scale images. Salient regions such as Maximum Stable Extremal Regions (MSER) are then detected on the motion divergence maps. From each detected region, a local descriptor is extracted to capture local motion patterns. We further adapt indexing techniques from image search to gesture recognition: the extracted descriptors are indexed using a pre-trained vocabulary, so that a new gesture sample can be efficiently matched against database gestures through a term frequency-inverse document frequency (TF-IDF) weighting scheme. We have collected a hand gesture database with 10 categories and 1050 video samples for performance evaluation and further applications. On this database, the proposed method achieves higher recognition accuracy than other state-of-the-art motion and spatio-temporal features. In addition, the average recognition time of our method is only 34.53 ms per gesture sequence.
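To make two steps of this pipeline concrete, the following is a minimal sketch, not the authors' implementation. It assumes an optical flow field is already available as two 2-D lists `u` and `v` (horizontal and vertical components), and that local descriptors have already been quantized into integer visual-word IDs. MSER detection and descriptor extraction are omitted. All function names (`divergence`, `to_gray`, `build_idf`, `tfidf_vector`, `best_match`) are hypothetical, and the cosine-similarity scoring is an assumed choice rather than one stated in the abstract.

```python
import math
from collections import Counter


def divergence(u, v):
    """Finite-difference divergence du/dx + dv/dy of a flow field (u, v)."""
    h, w = len(u), len(u[0])

    def diff(f, y, x, dy, dx):
        # Central difference inside the grid, one-sided at the borders.
        y0, y1 = max(y - dy, 0), min(y + dy, h - 1)
        x0, x1 = max(x - dx, 0), min(x + dx, w - 1)
        span = (y1 - y0) + (x1 - x0)
        return (f[y1][x1] - f[y0][x0]) / span if span else 0.0

    return [[diff(u, y, x, 0, 1) + diff(v, y, x, 1, 0) for x in range(w)]
            for y in range(h)]


def to_gray(field):
    """Linearly rescale a scalar field to integer gray levels in [0, 255]."""
    flat = [val for row in field for val in row]
    lo, hi = min(flat), max(flat)
    scale = 255.0 / (hi - lo) if hi > lo else 0.0
    return [[int(round((val - lo) * scale)) for val in row] for row in field]


def build_idf(database):
    """Inverse document frequency of each visual word over all gestures."""
    n = len(database)
    df = Counter(w for words in database.values() for w in set(words))
    return {w: math.log(n / d) for w, d in df.items()}


def tfidf_vector(words, idf):
    """L2-normalized TF-IDF vector for one gesture's visual words."""
    counts = Counter(words)
    total = sum(counts.values())
    vec = {w: (c / total) * idf.get(w, 0.0) for w, c in counts.items()}
    norm = math.sqrt(sum(x * x for x in vec.values()))
    return {w: x / norm for w, x in vec.items()} if norm else vec


def best_match(query_words, database):
    """Name of the database gesture with the highest cosine similarity."""
    idf = build_idf(database)
    q = tfidf_vector(query_words, idf)

    def score(words):
        d = tfidf_vector(words, idf)
        return sum(q[w] * d.get(w, 0.0) for w in q)

    return max(database, key=lambda name: score(database[name]))
```

For example, an expanding flow with `u[y][x] = x` and `v[y][x] = y` has constant positive divergence, while a contracting flow has negative divergence. Rescaling with `to_gray` turns such a field into an image on which a region detector could then be run.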
Original language | English (US) |
---|---|
Pages (from-to) | 227-235 |
Number of pages | 9 |
Journal | Image and Vision Computing |
Volume | 30 |
Issue number | 3 |
State | Published - Mar 2012 |
Keywords
- Divergence fields
- Hand gesture recognition
- Maximum Stable Extremal Regions
- Optical flow
- Term frequency-inverse document frequency (TF-IDF)
ASJC Scopus subject areas
- Signal Processing
- Computer Vision and Pattern Recognition