A Two-Stream Recurrent Network for Skeleton-based Human Interaction Recognition

Men, Qianhui, Ho, Edmond, Shum, Hubert and Leung, Howard (2020) A Two-Stream Recurrent Network for Skeleton-based Human Interaction Recognition. In: 2020 25th International Conference on Pattern Recognition (ICPR): putting artificial intelligence to work on patterns. Proceedings of the International Conference on Pattern Recognition (ICPR). IEEE, Piscataway. (In Press)

Full text not available from this repository.


This paper addresses the problem of recognizing human-human interactions from skeletal sequences. Existing methods are mainly designed to classify single-person actions. Many of them handle interactions by simply stacking the movement features of the two characters, neglecting the rich relationships between them. In this paper, we propose a novel two-stream recurrent neural network that adopts geometric features from both single actions and interactions to describe spatial correlations with different discriminative abilities. The first stream is built on pairwise joint distances (PJD) in a fully-connected mesh to categorize interactions with explicit distance patterns. To better distinguish similar interactions, the second stream combines PJD with spatial features from individual joint positions using graph convolutions to detect implicit correlations among joints, where the joint connections in the graph are adaptive to allow flexible correlations. After spatial modeling, each stream is fed to a bi-directional LSTM to encode two-way temporal properties. To take advantage of the diverse discriminative power of the two streams, we propose a late fusion algorithm that combines their output predictions based on information entropy. Experimental results show that the proposed framework achieves state-of-the-art performance on 3D interaction datasets and comparable performance on 2D interaction datasets. Moreover, the late fusion results show improved recognition accuracy compared with either single stream.
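Two of the ideas in the abstract, pairwise joint distances and entropy-aware late fusion, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact PJD layout or the fusion rule, so the function names, the choice to measure distances over all joints of both people in one frame, and the weighting scheme (each stream weighted by one minus its normalized prediction entropy) are all assumptions made for illustration.

```python
import math
from itertools import combinations


def pairwise_joint_distances(skeleton_a, skeleton_b):
    """Distances between every pair of joints across both skeletons (one frame).

    Assumption: the "fully-connected mesh" covers all joints of both people.
    Each skeleton is a list of (x, y, z) tuples.
    """
    joints = list(skeleton_a) + list(skeleton_b)
    return [math.dist(p, q) for p, q in combinations(joints, 2)]


def entropy(probs):
    """Shannon entropy of a probability vector (zero entries contribute nothing)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def entropy_weighted_fusion(probs_1, probs_2):
    """Fuse two class-probability vectors, trusting the more confident stream more.

    Hypothetical rule: weight = 1 - entropy / max_entropy, then normalize.
    """
    max_h = math.log(len(probs_1))
    w1 = max(0.0, 1.0 - entropy(probs_1) / max_h)
    w2 = max(0.0, 1.0 - entropy(probs_2) / max_h)
    total = w1 + w2
    if total < 1e-9:
        # Both streams are (near) uniform: fall back to a plain average.
        w1 = w2 = 0.5
    else:
        w1, w2 = w1 / total, w2 / total
    return [w1 * a + w2 * b for a, b in zip(probs_1, probs_2)]


# Toy usage: two people with two joints each, and two stream predictions.
person_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
person_b = [(0.0, 3.0, 4.0), (1.0, 0.0, 0.0)]
pjd = pairwise_joint_distances(person_a, person_b)

confident = [1.0, 0.0, 0.0]
uncertain = [1 / 3, 1 / 3, 1 / 3]
fused = entropy_weighted_fusion(confident, uncertain)
```

In this toy case the uncertain stream has maximal entropy and so receives almost no weight, which is the behaviour an entropy-based fusion is meant to capture; the actual algorithm in the paper may weight the streams differently.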

Item Type: Book Section
Subjects: G400 Computer Science
Department: Faculties > Engineering and Environment > Computer and Information Sciences
Depositing User: John Coen
Date Deposited: 07 Dec 2020 12:14
Last Modified: 07 Dec 2020 12:14
URI: http://nrl.northumbria.ac.uk/id/eprint/44929
