Deep learning based image description generation

Kinghorn, Philip, Zhang, Li and Shao, Ling (2017) Deep learning based image description generation. In: 2017 International Joint Conference on Neural Networks (IJCNN). IEEE, Piscataway, pp. 919-926. ISBN 978-1-5090-6183-9

Full text not available from this repository.
Official URL: https://doi.org/10.1109/IJCNN.2017.7965950

Abstract

Describing the contents of images is a challenging task for machines. It requires accurate recognition not only of objects and humans, but also of their attributes and relationships, as well as scene information. Extending this process to identify falls and hazardous objects, to aid elderly users or others in need of care, is more challenging still. This research makes initial attempts to address these challenges and to produce multi-sentence natural language descriptions of image contents. It employs a local region based approach to extract regional image details and combines multiple techniques, including deep learning and attribute learning, using machine learned features to create high level labels from which detailed descriptions of real-world images can be generated. The system contains the core functions of scene classification, object detection and classification, attribute learning, relationship detection and sentence generation. We have further extended this process to deal with open-ended fall detection and hazard identification. In comparison to state-of-the-art related research, our system shows superior robustness and flexibility in dealing with test images from new, unrelated domains, which pose great challenges to many existing methods. Our system is evaluated on a subset of Flickr8k and Pascal VOC 2012 and achieves an average BLEU score of 46. It outperforms related research by a significant margin of 10 BLEU points when evaluated on a small dataset of images containing falls and hazardous objects. It also performs well when evaluated on a subset of the IAPR TC-12 dataset.
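
The abstract reports results as BLEU scores. For readers unfamiliar with the metric, the following is a minimal sketch of sentence-level BLEU (clipped n-gram precision combined with a brevity penalty), scaled to 0-100. It is not the authors' evaluation code: the function names `ngrams` and `bleu`, the whitespace tokenisation, the default `max_n=4`, the absence of smoothing, and the 0-100 scaling are all assumptions made for illustration, and the abstract does not state which BLEU variant was used.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) occurring in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, references, max_n=4):
    """Illustrative sentence-level BLEU on a 0-100 scale, without smoothing.

    candidate:  generated description (string, split on whitespace)
    references: list of human-written reference descriptions (strings)
    """
    cand = candidate.split()
    refs = [r.split() for r in references]
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        # For each n-gram, the largest count seen in any single reference.
        max_ref = Counter()
        for ref in refs:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        # Clip candidate counts so repeated words are not over-rewarded.
        clipped = sum(min(c, max_ref[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if clipped == 0 or total == 0:
            return 0.0  # no smoothing: any zero precision gives a zero score
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty, using the reference length closest to the candidate's
    # (ties broken in favour of the shorter reference).
    c = len(cand)
    r = min((len(ref) for ref in refs), key=lambda length: (abs(length - c), length))
    bp = 1.0 if c > r else math.exp(1 - r / c)

    # Geometric mean of the n-gram precisions, scaled to 0-100.
    return 100 * bp * math.exp(sum(log_precisions) / max_n)
```

Under these assumptions, a candidate identical to its reference scores 100, and a candidate sharing no words with any reference scores 0. Published BLEU figures typically depend on tokenisation, smoothing and the n-gram order used, so scores from this sketch are not directly comparable to those reported in the abstract.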

Item Type: Book Section
Uncontrolled Keywords: Deep neural networks, Support Vector Machines, Applications of Deep neural networks
Subjects: G400 Computer Science; G700 Artificial Intelligence
Department: Faculties > Engineering and Environment > Computer and Information Sciences
Depositing User: Becky Skoyles
Date Deposited: 24 Oct 2017 08:52
Last Modified: 12 Oct 2019 19:20
URI: http://nrl.northumbria.ac.uk/id/eprint/32375