TY - GEN
T1 - Motion divergence fields for dynamic hand gesture recognition
AU - Shen, Xiaohui
AU - Hua, Gang
AU - Williams, Lance
AU - Wu, Ying
PY - 2011
Y1 - 2011
N2 - Although it is generally difficult to track articulated hand motion, exemplar-based approaches provide a robust solution for hand gesture recognition. Presumably, a rich set of dynamic hand gestures is needed for a meaningful recognition system. How to build a visual representation of the motion patterns is key to scalable recognition. We propose a novel representation based on the divergence map of the gestural motion field, which transforms motion patterns into spatial patterns. Given the motion divergence maps, we leverage modern image feature detectors to extract salient spatial patterns, such as Maximally Stable Extremal Regions (MSER). A local descriptor is extracted from each region to capture the local motion pattern. The descriptors from gesture exemplars are subsequently indexed using a pre-trained vocabulary tree. New gestures are then efficiently matched against the database gestures using a TF-IDF scheme. Our extensive experiments on a large hand gesture database with 10 categories and 1050 video samples validate the efficacy of the extracted motion patterns for gesture recognition. The proposed approach achieves an overall recognition rate of 97.62%, while the average recognition time is only 34.53 ms.
AB - Although it is generally difficult to track articulated hand motion, exemplar-based approaches provide a robust solution for hand gesture recognition. Presumably, a rich set of dynamic hand gestures is needed for a meaningful recognition system. How to build a visual representation of the motion patterns is key to scalable recognition. We propose a novel representation based on the divergence map of the gestural motion field, which transforms motion patterns into spatial patterns. Given the motion divergence maps, we leverage modern image feature detectors to extract salient spatial patterns, such as Maximally Stable Extremal Regions (MSER). A local descriptor is extracted from each region to capture the local motion pattern. The descriptors from gesture exemplars are subsequently indexed using a pre-trained vocabulary tree. New gestures are then efficiently matched against the database gestures using a TF-IDF scheme. Our extensive experiments on a large hand gesture database with 10 categories and 1050 video samples validate the efficacy of the extracted motion patterns for gesture recognition. The proposed approach achieves an overall recognition rate of 97.62%, while the average recognition time is only 34.53 ms.
UR - http://www.scopus.com/inward/record.url?scp=79958742553&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79958742553&partnerID=8YFLogxK
U2 - 10.1109/FG.2011.5771447
DO - 10.1109/FG.2011.5771447
M3 - Conference contribution
AN - SCOPUS:79958742553
SN - 9781424491407
T3 - 2011 IEEE International Conference on Automatic Face and Gesture Recognition and Workshops, FG 2011
SP - 492
EP - 499
BT - 2011 IEEE International Conference on Automatic Face and Gesture Recognition and Workshops, FG 2011
T2 - 2011 IEEE International Conference on Automatic Face and Gesture Recognition and Workshops, FG 2011
Y2 - 21 March 2011 through 25 March 2011
ER -