Improving K-means clustering with enhanced Firefly Algorithms

Xie, Hailun, Zhang, Li, Lim, Chee Peng, Yu, Yonghong, Liu, Chengyu, Liu, Han and Walters, Julie (2019) Improving K-means clustering with enhanced Firefly Algorithms. Applied Soft Computing, 84. p. 105763. ISSN 1568-4946

1-s2.0-S1568494619305447-main.pdf - Published Version
Available under License Creative Commons Attribution 4.0.

journal_optimized_revised.pdf - Accepted Version
Restricted to Repository staff only until 7 September 2020.
Available under License Creative Commons Attribution Non-commercial No Derivatives 4.0.

Official URL: https://doi.org/10.1016/j.asoc.2019.105763

Abstract

In this research, we propose two variants of the Firefly Algorithm (FA), namely inward intensified exploration FA (IIEFA) and compound intensified exploration FA (CIEFA), to address two persistent problems of the K-means clustering model: initialization sensitivity and local optima traps. To enhance both exploitation and exploration, matrix-based search parameters and dispersing mechanisms are incorporated into the two proposed FA models. In the IIEFA model, we first replace the attractiveness coefficient with a randomized control matrix, releasing the FA from the constraints of biological law: exploitation in the neighbourhood is elevated from a one-dimensional to a multi-dimensional search mechanism with enhanced diversity in search scopes, scales, and directions. In addition, the second model, CIEFA, employs a dispersing mechanism that dispatches fireflies with high similarities to new positions outside the close neighbourhood to perform global exploration. This dispersing mechanism ensures sufficient variance between the fireflies being compared, which increases search efficiency. The ALL-IDB2 database, a skin lesion data set, and a total of 15 UCI data sets are employed to evaluate the efficiency of the proposed FA models on clustering tasks. The minimum Redundancy Maximum Relevance (mRMR)-based feature selection method is also adopted to reduce feature dimensionality. The empirical results indicate that the proposed FA models are statistically significantly superior in both distance and performance measures for clustering tasks in comparison with conventional K-means clustering, five classical search methods, and five advanced FA variants.
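The contrast between a scalar attractiveness coefficient and a matrix-based one can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no equations, so the per-dimension uniform random coefficients in `matrix_move`, the greedy acceptance rule, and all function and parameter names are assumptions made for illustration. The classical FA random-walk term and the CIEFA dispersing mechanism are omitted because the abstract does not specify them.

```python
import math
import random


def sse(centroids, points):
    """K-means objective: sum of squared distances to the nearest centroid."""
    return sum(
        min(sum((p - c) ** 2 for p, c in zip(pt, cen)) for cen in centroids)
        for pt in points
    )


def standard_move(xi, xj, beta0=1.0, gamma=1.0):
    """Classical FA attraction: one scalar coefficient scales every dimension."""
    r2 = sum((a - b) ** 2 for a, b in zip(xi, xj))
    beta = beta0 * math.exp(-gamma * r2)
    return [a + beta * (b - a) for a, b in zip(xi, xj)]


def matrix_move(xi, xj, rng):
    """Assumed IIEFA-style step: an independent random coefficient per dimension.

    Drawing each coefficient uniformly from [0, 1) is an illustrative choice,
    not a detail taken from the paper.
    """
    return [a + rng.random() * (b - a) for a, b in zip(xi, xj)]


def cluster(points, k, n_fireflies=10, iters=30, seed=0):
    """Search for k centroids with a simplified firefly loop (hypothetical helper)."""
    rng = random.Random(seed)
    # Each firefly is a candidate set of k centroids drawn from the data.
    swarm = [[list(rng.choice(points)) for _ in range(k)]
             for _ in range(n_fireflies)]
    fit = [sse(f, points) for f in swarm]
    for _ in range(iters):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if fit[j] < fit[i]:  # firefly j is "brighter" than firefly i
                    cand = [matrix_move(ci, cj, rng)
                            for ci, cj in zip(swarm[i], swarm[j])]
                    cand_fit = sse(cand, points)
                    if cand_fit < fit[i]:  # greedy acceptance (a simplification)
                        swarm[i], fit[i] = cand, cand_fit
    best = min(range(n_fireflies), key=fit.__getitem__)
    return swarm[best], fit[best]
```

Calling `cluster(points, k=2)` on a list of equal-length numeric tuples returns the best centroid set found and its objective value. Because a separate coefficient is drawn for each dimension, a firefly can land anywhere in the box spanned by the two positions rather than only on the line segment between them, which is one way to read the "multi-dimensional search" described above.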

Item Type: Article
Uncontrolled Keywords: Data clustering, Firefly Algorithm, Swarm intelligence algorithm, K-means clustering
Subjects: G400 Computer Science; G500 Information Systems
Department: Faculties > Engineering and Environment > Computer and Information Sciences
Depositing User: Elena Carlaw
Date Deposited: 16 Sep 2019 08:36
Last Modified: 05 Nov 2019 16:45
URI: http://nrl.northumbria.ac.uk/id/eprint/40689
