Human action segmentation using 3D fully convolutional network

Pei Yu, Jiang Wang, Ying Wu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution



Detailed action analysis, such as action detection, localization and segmentation, has received increasing attention in recent years. Compared to action classification, action segmentation and localization are more useful in many practical applications that require precise spatio-temporal information about the actions. However, performing action segmentation and localization is more challenging, because determining the pixel-level locations of actions not only requires a strong spatial model that captures the visual appearances of the actions, but also calls for a temporal model that characterizes their dynamics. Most existing methods either use hand-crafted spatial models, or can only extract short-term motion information. In this paper, we propose a 3D fully convolutional deep network to jointly exploit spatial and temporal information in a unified framework for action segmentation and localization. The proposed deep network is trained to combine both kinds of information in an end-to-end fashion. Extensive experimental results show that the proposed method outperforms state-of-the-art methods by a large margin.
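The key idea in the abstract is that a 3D convolution couples spatial appearance and temporal dynamics in a single filter. As a minimal illustrative sketch (not the authors' architecture), the following NumPy code implements a naive single-channel 3D convolution and applies a hypothetical temporal-difference filter that responds to motion but not to static appearance:

```python
import numpy as np

def conv3d(clip, kernel):
    """Naive 'valid' 3D convolution (cross-correlation) of a single-channel
    video clip of shape (T, H, W) with a kernel of shape (kt, kh, kw).
    Each filter window spans both time and space."""
    T, H, W = clip.shape
    kt, kh, kw = kernel.shape
    out = np.zeros((T - kt + 1, H - kh + 1, W - kw + 1))
    for t in range(out.shape[0]):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                out[t, i, j] = np.sum(clip[t:t + kt, i:i + kh, j:j + kw] * kernel)
    return out

# A temporal-difference filter (an assumption for illustration):
# it subtracts the average of one frame's patch from the next frame's,
# so a static scene yields zero response.
motion_kernel = np.zeros((2, 3, 3))
motion_kernel[0] = -1.0 / 9
motion_kernel[1] = 1.0 / 9

static_clip = np.ones((4, 5, 5))          # no motion between frames
response = conv3d(static_clip, motion_kernel)
print(response.shape)                     # (3, 3, 3)
print(np.abs(response).max())             # ~0 for a static clip
```

A full 3D fully convolutional network stacks many such learned filters (with nonlinearities) and keeps the output spatially dense, so the final layer can predict a per-pixel, per-frame action label rather than a single clip-level class.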

Original language: English (US)
Title of host publication: British Machine Vision Conference 2017, BMVC 2017
Publisher: BMVA Press
ISBN (Electronic): 190172560X, 9781901725605
State: Published - 2017
Event: 28th British Machine Vision Conference, BMVC 2017 - London, United Kingdom
Duration: Sep 4 2017 - Sep 7 2017

Publication series

Name: British Machine Vision Conference 2017, BMVC 2017


Conference: 28th British Machine Vision Conference, BMVC 2017
Country/Territory: United Kingdom

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition

