A local descriptor based on Laplacian pyramid coding for action recognition

Zhen, Xiantong and Shao, Ling (2013) A local descriptor based on Laplacian pyramid coding for action recognition. Pattern Recognition Letters, 34 (15). pp. 1899-1905. ISSN 0167-8655

Full text not available from this repository.
Official URL: http://dx.doi.org/10.1016/j.patrec.2012.10.021

Abstract

We present a new descriptor for the local representation of human actions. In contrast to state-of-the-art descriptors, which use spatio-temporal features to describe cuboids detected from video sequences, we propose a 2D descriptor based on the Laplacian pyramid for efficiently encoding spatio-temporal regions of interest. Image templates, including structural planes and motion templates, are first extracted from a cuboid to encode its structural and motion features. A 2D Laplacian pyramid then decomposes each of these images into a series of sub-band feature maps, which is followed by a two-stage feature extraction, i.e., Gabor filtering and max pooling. The filtering enhances motion-related edge and orientation information. To capture more discriminative and invariant features, max pooling is applied to the outputs of Gabor filtering, both between scales within filter banks and over spatial neighbors. The resulting local features associated with cuboids are fed to localized soft-assignment coding with max pooling in the Bag-of-Words (BoW) model to represent an action.
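The decomposition and spatial pooling steps described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the 5-tap binomial blur, nearest-neighbour upsampling, number of levels, pooling window, and all function names are assumptions chosen for illustration, and the Gabor filtering stage between the two steps is omitted.

```python
import numpy as np

def blur(img):
    """Separable 5-tap binomial blur with edge padding (assumed kernel)."""
    k = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
    padded = np.pad(img, 2, mode="edge")
    # Filter along the horizontal axis, then along the vertical axis.
    rows = sum(k[i] * padded[:, i:i + img.shape[1]] for i in range(5))
    return sum(k[i] * rows[i:i + img.shape[0], :] for i in range(5))

def upsample(img, shape):
    """Nearest-neighbour upsampling by 2, cropped to `shape`, then blurred."""
    up = np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)
    return blur(up[:shape[0], :shape[1]])

def laplacian_pyramid(img, levels=3):
    """Return `levels` band-pass maps plus the final low-pass residual."""
    bands = []
    current = img.astype(float)
    for _ in range(levels):
        down = blur(current)[::2, ::2]          # blur, then subsample
        bands.append(current - upsample(down, current.shape))
        current = down
    bands.append(current)                       # low-pass residual
    return bands

def reconstruct(bands):
    """Invert the pyramid by adding each band to the upsampled coarser level."""
    current = bands[-1]
    for band in reversed(bands[:-1]):
        current = band + upsample(current, band.shape)
    return current

def max_pool(fmap, size=2):
    """Non-overlapping max pooling over size x size spatial neighbourhoods."""
    h = fmap.shape[0] // size * size
    w = fmap.shape[1] // size * size
    blocks = fmap[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))
```

In a fuller pipeline each sub-band map would be passed through a Gabor filter bank before pooling. Because the same `upsample` is used for analysis and synthesis, `reconstruct(laplacian_pyramid(img))` recovers `img` up to floating-point error.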

The image templates, i.e., MHI and TOP, explicitly encode the motion and structure information in the video sequences, and the proposed Laplacian pyramid coding descriptor provides an informative representation of them owing to its multi-scale analysis. The use of localized soft-assignment coding and max pooling gives a robust representation of actions. Experimental results on the benchmark KTH dataset and the newly released and challenging HMDB51 dataset demonstrate the effectiveness of the proposed method for human action recognition.
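The coding and pooling stage can be sketched as follows. This is a hedged illustration of localized soft-assignment coding in its commonly used form (each descriptor is softly assigned to its k nearest codewords only), not the configuration used in the paper: the neighbourhood size `k`, the smoothing factor `beta`, and the function names are hypothetical.

```python
import numpy as np

def localized_soft_assignment(descriptors, codebook, k=5, beta=1.0):
    """Softly assign each descriptor to its k nearest codewords.

    descriptors: (n, d) array of local features.
    codebook:    (m, d) array of visual words.
    Returns an (n, m) array whose rows sum to 1 and have at most k nonzeros.
    """
    # Squared Euclidean distances between descriptors and codewords.
    diff = descriptors[:, None, :] - codebook[None, :, :]
    d2 = (diff ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]      # indices of k nearest words
    rows = np.arange(d2.shape[0])[:, None]
    local_d2 = d2[rows, nearest]
    # Subtracting the row minimum avoids underflow; normalisation cancels it.
    weights = np.exp(-beta * (local_d2 - local_d2.min(axis=1, keepdims=True)))
    weights /= weights.sum(axis=1, keepdims=True)
    codes = np.zeros_like(d2)
    codes[rows, nearest] = weights               # all other words stay at 0
    return codes

def encode_video(descriptors, codebook, k=5, beta=1.0):
    """Max-pool the codes over all descriptors to get one vector per video."""
    codes = localized_soft_assignment(descriptors, codebook, k, beta)
    return codes.max(axis=0)
```

Restricting the assignment to the nearest codewords keeps distant, unreliable distances out of the code, and max pooling keeps only the strongest response per codeword across all cuboids in a video.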

Item Type: Article
Uncontrolled Keywords: Action recognition; Laplacian pyramid; Localized soft-assignment coding; Max pooling
Subjects: G400 Computer Science
Department: Faculties > Engineering and Environment > Computer Science and Digital Technologies
Depositing User: Paul Burns
Date Deposited: 10 Jun 2015 14:20
Last Modified: 10 Aug 2015 11:07
URI: http://nrl.northumbria.ac.uk/id/eprint/22833
