A Two-Stream Recurrent Network for Skeleton-based Human Interaction Recognition

Men, Qianhui, Ho, Edmond, Shum, Hubert and Leung, Howard (2021) A Two-Stream Recurrent Network for Skeleton-based Human Interaction Recognition. In: 2020 25th International Conference on Pattern Recognition (ICPR 2020): Milan, Italy, 10-15 January 2021. Proceedings of the International Conference on Pattern Recognition (ICPR). IEEE, Piscataway, NJ, pp. 2771-2778. ISBN 9781728188096, 9781728188089

Interaction_recognition_icpr.pdf - Accepted Version

Official URL: https://doi.org/10.1109/icpr48806.2021.9412538

Abstract

This paper addresses the problem of recognizing human-human interactions from skeletal sequences. Existing methods are mainly designed to classify single-person actions. Many of them simply stack the movement features of the two characters to handle interactions, neglecting the rich relationships between the characters. We propose a novel two-stream recurrent neural network that adopts geometric features from both single actions and interactions to describe spatial correlations with different discriminative abilities. The first stream is built on pairwise joint distance (PJD) in a fully-connected mesh to categorize interactions with explicit distance patterns. To better distinguish similar interactions, the second stream combines PJD with spatial features from individual joint positions, using graph convolutions to detect implicit correlations among joints; the joint connections in the graph are adaptive, which allows flexible correlations. After spatial modeling, each stream is fed to a bi-directional LSTM to encode two-way temporal properties. To take advantage of the diverse discriminative power of the two streams, we propose a late fusion algorithm that combines their output predictions based on information entropy. Experimental results show that the proposed framework achieves state-of-the-art performance on 3D interaction datasets and comparable performance on 2D interaction datasets. Moreover, the late fusion results show improved recognition accuracy compared with either single stream.
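
As a rough illustration of two ideas named in the abstract, the sketch below computes pairwise joint distances for a single frame and fuses two streams' class probabilities using their entropy. It is not the authors' implementation. The abstract does not give the exact PJD layout or the fusion rule, so both are assumptions here: distances are taken over every unordered pair among the joints of both people, and each stream is weighted by one minus its normalized entropy. The function names `pairwise_joint_distances` and `entropy_weighted_fusion` are hypothetical.

```python
import math
from itertools import combinations


def pairwise_joint_distances(person_a, person_b):
    """Euclidean distance for every unordered pair among all joints of both people.

    Assumption: "fully-connected mesh" is read as all pairs over the combined
    joint set of one frame; the paper may define the feature differently.
    """
    joints = list(person_a) + list(person_b)
    return [math.dist(p, q) for p, q in combinations(joints, 2)]


def entropy(probs):
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def entropy_weighted_fusion(probs_1, probs_2):
    """Fuse two class-probability vectors, trusting the lower-entropy stream more.

    Assumption: the weight of a stream is 1 - H / log(num_classes). This is one
    plausible entropy-based scheme, not the rule from the paper.
    """
    max_entropy = math.log(len(probs_1))
    w1 = max(0.0, 1.0 - entropy(probs_1) / max_entropy)
    w2 = max(0.0, 1.0 - entropy(probs_2) / max_entropy)
    total = w1 + w2
    if total < 1e-12:
        # Both streams are (near) uniform: fall back to a plain average.
        w1 = w2 = 0.5
    else:
        w1, w2 = w1 / total, w2 / total
    return [w1 * p + w2 * q for p, q in zip(probs_1, probs_2)]


if __name__ == "__main__":
    # Toy frame: two joints per person, 3D coordinates (made-up values).
    a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    b = [(0.0, 3.0, 4.0), (0.0, 0.0, 1.0)]
    print(pairwise_joint_distances(a, b))

    # Toy predictions from two streams over three classes (made-up values).
    print(entropy_weighted_fusion([0.9, 0.05, 0.05], [0.4, 0.3, 0.3]))
```

With this weighting, a stream that is confident about its prediction dominates the fused output, while a near-uniform stream contributes little.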

Item Type: Book Section
Additional Information: Funding information: The project is supported in part by grants from City University of Hong Kong (Project No. 9220077 and 9678139), and the Royal Society (Ref: IES\R2\181024 and IES\R1\191147).
Subjects: G400 Computer Science
Department: Faculties > Engineering and Environment > Computer and Information Sciences
Depositing User: John Coen
Date Deposited: 07 Dec 2020 12:14
Last Modified: 31 Jul 2021 10:34
URI: http://nrl.northumbria.ac.uk/id/eprint/44929
