Cross-view action modeling, learning, and recognition

Jiang Wang*, Xiaohan Nie, Yin Xia, Ying Wu, Song Chun Zhu

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

340 Scopus citations

Abstract

Existing methods for video-based action recognition are generally view-dependent, i.e., they perform recognition from the same views seen in the training data. We present a novel multiview spatio-temporal and-or graph (MST-AOG) representation for cross-view action recognition, i.e., recognition performed on video from an unknown and unseen view. As a compositional model, MST-AOG compactly represents the hierarchical combinatorial structures of cross-view actions by explicitly modeling variations in geometry, appearance, and motion. This paper proposes effective methods to learn the structure and parameters of MST-AOG. Inference based on MST-AOG enables action recognition from novel views. Training MST-AOG takes advantage of 3D human skeleton data obtained from Kinect cameras to avoid annotating enormous numbers of multi-view video frames, which is error-prone and time-consuming; recognition, however, does not need 3D information and is based on 2D video input. A new Multiview Action3D dataset has been created and will be released. Extensive experiments demonstrate that this new action representation significantly improves the accuracy and robustness of cross-view action recognition on 2D videos.

Original language: English (US)
Title of host publication: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Publisher: IEEE Computer Society
Pages: 2649-2656
Number of pages: 8
ISBN (Electronic): 9781479951178
State: Published - Sep 24 2014
Event: 27th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2014 - Columbus, United States
Duration: Jun 23 2014 to Jun 28 2014

Publication series

Name: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
ISSN (Print): 1063-6919

Other

Other: 27th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2014
Country/Territory: United States
City: Columbus
Period: 6/23/14 to 6/28/14

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition
