Overlapping Clusters and Support Vector Machines Based Interval Type-2 Fuzzy System for the Prediction of Peptide Binding Affinity

Uslan, Volkan, Seker, Huseyin and John, Robert (2019) Overlapping Clusters and Support Vector Machines Based Interval Type-2 Fuzzy System for the Prediction of Peptide Binding Affinity. IEEE Access, 7. pp. 49756-49764. ISSN 2169-3536

08685099.pdf - Published Version
Available under License Creative Commons Attribution 4.0.

Official URL: https://doi.org/10.1109/access.2019.2910078

Abstract

In the post-genome era, biological datasets that are high-dimensional, nonlinear, and contain few instances are becoming increasingly difficult to process. This paper addresses these characteristics because they degrade the performance of predictive models in bioinformatics. An interval type-2 Takagi-Sugeno fuzzy predictive model is proposed to manage the high dimensionality and nonlinearity that are common in bioinformatics datasets. To this end, a new clustering framework, based on the overlapping regions between clusters, is proposed to simplify the antecedent operations of an interval type-2 fuzzy system. Cluster analysis of the partitions, together with statistical information derived from them, identifies the upper and lower membership functions that form the premise part. The model is further enhanced by adapting the regression version of support vector machines for the consequent part. The proposed method is applied to the quantitative prediction of the binding affinities of peptides to biomolecules. This case study is a challenge in post-genome studies and remains an open problem owing to the complexity of the biological system, the diversity of peptides, and the curse of dimensionality of the amino acid index representation that characterizes the peptides. On four different peptide binding affinity datasets, the proposed method generalized better in every case, improving prediction accuracy on unseen peptides by up to 58.2% compared with the predictive methods presented in the literature. Source code of the algorithm is available at https://github.com/sekerbigdatalab.
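The abstract gives no formulas, so the following is only a minimal sketch of how an interval type-2 Takagi-Sugeno rule base can produce a prediction; it is not the authors' implementation. It assumes Gaussian membership functions with a fixed center and an uncertain width, uses placeholder callables where the paper uses support vector regression models, and replaces a full type-reduction procedure with a simple average of the lower and upper weighted outputs. All function and parameter names are hypothetical.

```python
import numpy as np

def gaussian(x, center, sigma):
    """Gaussian membership grade of x for a set with the given center and width."""
    return np.exp(-0.5 * ((x - center) / sigma) ** 2)

def it2_membership(x, center, sigma_lower, sigma_upper):
    """Lower and upper membership grades of an interval type-2 set.

    Assumption: the set has a fixed center and an uncertain width, with
    sigma_lower <= sigma_upper, so the narrower Gaussian bounds it from below.
    """
    return gaussian(x, center, sigma_lower), gaussian(x, center, sigma_upper)

def it2_ts_predict(x, centers, sigma_lower, sigma_upper, consequents):
    """Predict one output from an interval type-2 Takagi-Sugeno rule base.

    x            : input vector, shape (d,)
    centers      : rule centers, shape (r, d)
    sigma_lower  : lower widths, shape (r, d)
    sigma_upper  : upper widths, shape (r, d)
    consequents  : list of r callables, one per rule (placeholders for the
                   per-rule regression models)
    """
    x = np.asarray(x, dtype=float)
    lo, up = it2_membership(x, centers, sigma_lower, sigma_upper)  # (r, d) each
    f_lower = lo.prod(axis=1)  # lower firing strength of each rule
    f_upper = up.prod(axis=1)  # upper firing strength of each rule
    y_rules = np.array([g(x) for g in consequents], dtype=float)
    # Simplified stand-in for type reduction: average the two weighted means.
    y_lower = f_lower @ y_rules / f_lower.sum()
    y_upper = f_upper @ y_rules / f_upper.sum()
    return 0.5 * (y_lower + y_upper), (f_lower, f_upper)

if __name__ == "__main__":
    centers = np.array([[0.0, 0.0], [4.0, 4.0]])
    s_lo = np.full((2, 2), 1.0)
    s_up = np.full((2, 2), 2.0)
    rules = [lambda v: 1.0, lambda v: 3.0]  # constant placeholder consequents
    y, (f_lo, f_up) = it2_ts_predict([1.0, 1.0], centers, s_lo, s_up, rules)
    print(y, f_lo, f_up)
```

The gap between the lower and upper firing strengths is what distinguishes this from a type-1 system: where neighbouring clusters overlap, that gap widens and the final output reflects more uncertainty about which rule applies.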

Item Type: Article
Uncontrolled Keywords: Interval type-2 fuzzy systems, support vector regression, overlapping clusters, peptide binding affinity, clustering, high-dimensionality.
Subjects: F200 Materials Science
G400 Computer Science
Department: Faculties > Engineering and Environment > Computer and Information Sciences
Depositing User: Elena Carlaw
Date Deposited: 01 May 2019 10:29
Last Modified: 11 Oct 2019 10:24
URI: http://nrl.northumbria.ac.uk/id/eprint/39121

