Action recognition by spatio-temporal oriented energies

Zhen, Xiantong, Shao, Ling and Li, Xuelong (2014) Action recognition by spatio-temporal oriented energies. Information Sciences, 281. pp. 295-309. ISSN 0020-0255

Official URL: http://dx.doi.org/10.1016/j.ins.2014.05.021

Abstract

In this paper, we present a unified representation based on the spatio-temporal steerable pyramid (STSP) for the holistic representation of human actions. A video sequence is viewed as a spatio-temporal volume that preserves all the appearance and motion information of the action it contains. By decomposing spatio-temporal volumes into band-passed sub-volumes, the spatio-temporal Laplacian pyramid provides an effective technique for multi-scale analysis of video sequences, so that spatio-temporal patterns at different scales can be well localized and captured. To efficiently explore the underlying local spatio-temporal orientation structures at multiple scales, a bank of three-dimensional separable steerable filters is applied to each sub-volume of the Laplacian pyramid. The outputs of each quadrature pair of steerable filters are squared and summed to yield a more robust oriented energy representation. For further invariance and compactness, a spatio-temporal max pooling operation is performed between the filter responses at adjacent scales and over spatio-temporal neighbourhoods. To capture the appearance, local geometric structure and motion of an action, we apply the STSP to the intensity, 3D gradients and optical flow of video sequences, yielding a unified holistic representation of human actions.
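The "squared and summed" energy step and the two pooling steps can be illustrated with a minimal one-dimensional sketch. This is not the authors' implementation: it swaps in a 1D Gabor cosine/sine pair for the paper's three-dimensional separable steerable filters, and all function names and parameter values (gabor_pair, oriented_energy, pool_adjacent_scales, max_pool, sigma, freq, radius) are assumptions made for illustration only.

```python
import math


def gabor_pair(sigma, freq, radius):
    """Hypothetical helper: an even (cosine) and odd (sine) Gabor kernel.

    The two kernels share a Gaussian envelope and differ by a 90 degree
    phase shift, so they form an approximate quadrature pair.
    """
    xs = range(-radius, radius + 1)
    env = [math.exp(-x * x / (2.0 * sigma ** 2)) for x in xs]
    even = [e * math.cos(2 * math.pi * freq * x) for e, x in zip(env, xs)]
    odd = [e * math.sin(2 * math.pi * freq * x) for e, x in zip(env, xs)]
    return even, odd


def correlate_valid(signal, kernel):
    """Slide the kernel over the signal, keeping only fully overlapping positions."""
    n, k = len(signal), len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(n - k + 1)]


def oriented_energy(signal, even, odd):
    """Square and sum the two quadrature responses at every position."""
    r_even = correlate_valid(signal, even)
    r_odd = correlate_valid(signal, odd)
    return [a * a + b * b for a, b in zip(r_even, r_odd)]


def pool_adjacent_scales(energy_a, energy_b):
    """Element-wise max of two responses, assumed already resampled to equal length."""
    return [max(a, b) for a, b in zip(energy_a, energy_b)]


def max_pool(values, size):
    """Max over non-overlapping neighbourhoods of the given size."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]


# Usage: a sinusoid at the filters' tuned frequency (assumed test signal).
even, odd = gabor_pair(sigma=8.0, freq=0.1, radius=32)
signal = [math.sin(2 * math.pi * 0.1 * t + 0.7) for t in range(200)]
energy = oriented_energy(signal, even, odd)
pooled = max_pool(energy, size=5)
```

In this sketch each individual filter response oscillates with the phase of the input, while the summed energy stays nearly constant along the signal, which is one reading of why the abstract calls the energy representation "more robust". The pooling functions then trade positional and scale detail for compactness.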

Taking advantage of multi-scale, multi-orientation analysis and feature pooling, STSP produces a compact but informative and invariant representation of human actions. We conduct extensive experiments on the KTH, UCF Sports and HMDB51 datasets, which show that the unified STSP achieves results comparable to those of state-of-the-art methods.

Item Type: Article
Uncontrolled Keywords: Action recognition; Steerable filters; Spatio-temporal oriented energies; Spatio-temporal Laplacian pyramid
Subjects: G400 Computer Science; G900 Others in Mathematical and Computing Sciences
Department: Faculties > Engineering and Environment > Computer Science and Digital Technologies
Depositing User: Paul Burns
Date Deposited: 10 Jun 2015 10:09
Last Modified: 10 Aug 2015 11:03
URI: http://nrl.northumbria.ac.uk/id/eprint/22810
