Learning discriminative features for human motion understanding

Zhang, Jingtian (2019) Learning discriminative features for human motion understanding. Doctoral thesis, Northumbria University.

Text (Doctoral Thesis)
zhang.jingtian_phd.pdf - Submitted Version



Human motion understanding has attracted considerable research interest for its applications in video surveillance, content-based search and healthcare. Depending on the capture method, human motion can be recorded in various forms (e.g. skeletal data, video and images). Compared with 2D video and images, skeletal data recorded by motion capture devices contains full 3D movement information. We first address a gait analysis problem based on 3D skeletal data, proposing an automatic framework for identifying musculoskeletal and neurological disorders among older people. In this framework, two new gait features and a feature selection strategy are proposed; the strategy chooses an optimal feature set from the input features to optimise classification accuracy.
Due to self-occlusion caused by a single shooting angle, 2D video and images cannot record full 3D geometric information. Viewpoint variation therefore dramatically degrades performance in many 2D-based applications (e.g. arbitrary-view action recognition and image-based 3D human shape reconstruction). Leveraging the view-invariance of 3D models is a popular way to improve performance on 2D computer vision problems. Accordingly, in the second contribution, we adopt 3D models built with computer graphics technology to assist in arbitrary-view action recognition. We propose a new transfer dictionary learning framework that uses computer graphics technologies to synthesise realistic 2D and 3D training videos and can project a real-world 2D video into a view-invariant sparse representation.
In the third contribution, 3D models are utilised to build an end-to-end 3D human shape reconstruction system, which can recover the 3D human shape from a single image without any prior parametric model. In contrast to most existing methods, which calculate 3D joint locations, the method proposed in this thesis produces a richer and more useful point-cloud-based representation. Synthesised high-quality 2D images and dense 3D point clouds are used to train a CNN-based encoder and a 3D regression module.
Overall, the methods introduced in this thesis explore human motion understanding from 3D to 2D. We investigate how to compensate for the lack of full geometric information in 2D-based applications with view-invariance learnt from 3D models.

Item Type: Thesis (Doctoral)
Uncontrolled Keywords: Gait disorder diagnosis, arbitrary view action recognition, 3D human shape reconstruction, machine learning, computer vision
Subjects: G400 Computer Science
G500 Information Systems
Department: Faculties > Engineering and Environment > Computer and Information Sciences
University Services > Graduate School > Doctor of Philosophy
Depositing User: John Coen
Date Deposited: 24 Mar 2020 15:57
Last Modified: 31 Jul 2021 18:49
URI: http://nrl.northumbria.ac.uk/id/eprint/42562
