Lawrence, Tom (2022) Deep neural network generation for image classification within resource-constrained environments using evolutionary and hand-crafted processes. Doctoral thesis, Northumbria University.
Text (Doctoral thesis)
lawrence.tom_phd(15035651).pdf - Submitted Version (7MB)
Abstract
Constructing Convolutional Neural Network (CNN) models is a manual process requiring expert knowledge and trial and error. Background research highlights the following knowledge gaps. 1) Existing efficiency-focused CNN models make design choices that impact model performance. Better ways are needed to construct accurate models for resource-constrained environments, such as CCTV cameras and IoT devices, that lack graphics processing units (GPUs) to speed up model inference. 2) Existing methods for automatically designing CNN architectures do not explore the search space effectively for the best solution. 3) Existing methods for automatically designing CNN architectures do not exploit modern architecture design patterns such as residual connections. The lack of residual connections means the model depth is limited owing to the vanishing gradient problem. Furthermore, existing methods for automatically designing CNN architectures adopt search strategies that make them vulnerable to local minima traps.
Better techniques for constructing efficient CNN models and automated approaches that can produce accurate deep models advance many areas, such as hazard detection, medical diagnosis and robotics, in both academia and industry.
The work undertaken during this research comprises 1) the proposal of an efficient and accurate CNN architecture for resource-constrained environments, based on a novel block structure containing 1x3 and 3x1 convolutions to save computational cost, 2) a particle swarm optimization (PSO) method for automatically constructing efficient deep CNN architectures with greater accuracy, using a novel encoding and search strategy, and 3) a PSO-based method for automatically constructing deeper CNN models with improved accuracy, using a novel encoding scheme that employs residual connections and a novel search mechanism that follows the global and neighbouring best leaders.
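To illustrate why 1x3 and 3x1 convolutions can save computational cost, the following minimal sketch counts multiply-accumulate operations for a single 3x3 convolution and for a 1x3 convolution followed by a 3x1 convolution. The function names, the feature-map size and the channel counts are assumptions chosen for illustration; this is not the block structure proposed in the thesis, and it is not intended to reproduce the reported efficiency figures.

```python
def conv_macs(height, width, c_in, c_out, k_h, k_w):
    """Multiply-accumulates for a stride-1 convolution whose output
    keeps the input's spatial size ('same' padding), ignoring bias."""
    return height * width * c_in * c_out * k_h * k_w


def full_3x3_macs(height, width, c_in, c_out):
    """Cost of one standard 3x3 convolution."""
    return conv_macs(height, width, c_in, c_out, 3, 3)


def factorised_macs(height, width, c_in, c_out):
    """Cost of a 1x3 convolution followed by a 3x1 convolution.
    Assumption: the intermediate feature map has c_out channels."""
    first = conv_macs(height, width, c_in, c_out, 1, 3)
    second = conv_macs(height, width, c_out, c_out, 3, 1)
    return first + second


if __name__ == "__main__":
    # Hypothetical layer: 32x32 feature map, 64 input and 64 output channels.
    full = full_3x3_macs(32, 32, 64, 64)
    fact = factorised_macs(32, 32, 64, 64)
    print(f"3x3: {full} MACs, 1x3 + 3x1: {fact} MACs, ratio {fact / full:.3f}")
```

Under these assumptions (equal input and output channel counts), the factorised pair needs six kernel weights per channel pair instead of nine, so it uses two thirds of the operations of the single 3x3 layer.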
The main findings of this research are as follows. 1) The proposed efficiency-focused CNN model outperformed MobileNetV2 by 13.43% with respect to accuracy and by 39.63% with respect to efficiency, measured in floating-point operations. A reduction in floating-point operations means the model has the potential for faster inference times, which is beneficial to applications within resource-constrained environments without GPUs, such as CCTV cameras. 2) The proposed automatic CNN generation technique outperformed existing methods by 7.58% with respect to accuracy and achieved a 63% improvement in search time efficiency, owing to the proposal of more efficient architectures that speed up the search process. 3) The proposed automatic deep residual CNN generation method improved model accuracy by 4.43% when compared against related studies, owing to deeper model construction and improvements in the search process. The proposed search process embeds human knowledge of constructing deep residual networks and provides constraint settings which can be used to limit the proposed model's depth and width. The ability to constrain a model's depth and width is important as it ensures the upper bounds of a proposed model will fit within the constraints of resource-constrained environments.
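The idea of a swarm search that follows global and neighbouring best leaders while respecting depth and width limits can be sketched as a single particle update. The sketch below is a generic PSO step under stated assumptions, not the encoding or search mechanism of the thesis: the function name `pso_step`, the coefficients `w`, `c1`, `c2`, `c3`, the inclusion of a personal-best term from standard PSO, and the representation of a particle as a list of numbers such as `[depth, width]` are all hypothetical.

```python
import random


def pso_step(position, velocity, personal_best, neighbour_best, global_best,
             bounds, w=0.7, c1=1.5, c2=1.5, c3=1.5, rng=random):
    """One velocity-and-position update for a single particle.

    Each dimension is pulled towards the particle's own best, a
    neighbouring leader and the global leader, then clamped to the
    (low, high) pair in `bounds` so that quantities such as depth and
    width stay within user-set limits. All coefficients are assumed values.
    """
    new_position, new_velocity = [], []
    for i, (x, v) in enumerate(zip(position, velocity)):
        r1, r2, r3 = rng.random(), rng.random(), rng.random()
        v_new = (w * v
                 + c1 * r1 * (personal_best[i] - x)
                 + c2 * r2 * (neighbour_best[i] - x)
                 + c3 * r3 * (global_best[i] - x))
        low, high = bounds[i]
        x_new = min(max(x + v_new, low), high)  # enforce the constraint
        new_velocity.append(v_new)
        new_position.append(x_new)
    return new_position, new_velocity


if __name__ == "__main__":
    # Hypothetical particle: [depth, width], limited to 1..16 and 8..128.
    pos, vel = pso_step([10.0, 64.0], [0.0, 0.0],
                        [12.0, 96.0], [14.0, 100.0], [20.0, 256.0],
                        bounds=[(1, 16), (8, 128)])
    print(pos, vel)
```

Clamping the position after each update is one simple way to guarantee that no candidate exceeds the configured upper bounds; other constraint-handling schemes are possible.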
Item Type: | Thesis (Doctoral) |
---|---|
Uncontrolled Keywords: | computer vision, CNN, PSO |
Subjects: | G400 Computer Science |
Department: | Faculties > Engineering and Environment > Computer and Information Sciences; University Services > Graduate School > Doctor of Philosophy |
Depositing User: | John Coen |
Date Deposited: | 11 Jan 2023 08:35 |
Last Modified: | 11 Jan 2023 08:45 |
URI: | https://nrl.northumbria.ac.uk/id/eprint/51121 |