A Computational Statistics Approach to Evaluate Blood Biomarkers for Breast Cancer Risk Stratification

Oktay, Kaan, Santaliz-Casiano, Ashlie, Patel, Meera, Marino, Natascia, Storniolo, Anna Maria V., Torun, Hamdi, Acar, Burak and Madak Erdogan, Zeynep (2020) A Computational Statistics Approach to Evaluate Blood Biomarkers for Breast Cancer Risk Stratification. Hormones and Cancer, 11 (1). pp. 17-33. ISSN 1868-8497

HOCA-R1_v2.pdf - Accepted Version

Official URL: https://doi.org/10.1007/s12672-019-00372-3


Breast cancer is the second leading cause of cancer mortality among women. Mammography and tumor biopsy followed by histopathological analysis are the current methods for diagnosing breast cancer. Mammography does not detect all breast tumor subtypes, especially those that are more aggressive and arise in younger women or women with dense breast tissue. There is an urgent need for circulating prognostic molecules and liquid biopsy methods that can aid breast cancer diagnosis and reduce the mortality rate. In this study, we systematically evaluated metabolites and proteins in blood to develop a pipeline for identifying potential circulating biomarkers of breast cancer risk. Our aim is to identify a group of molecules to be used in the design of portable and low-cost biomarker detection devices. We obtained plasma samples from women who are cancer free (healthy) and from women who were cancer free at the time of blood collection but developed breast cancer later (susceptible). We extracted potential prognostic biomarkers for breast cancer risk from plasma metabolomics and proteomics data using statistical and discriminative power analyses. We pre-processed the data to ensure the quality of subsequent analyses, and used two main feature selection methods to determine the importance of each molecule. After further feature elimination based on pairwise dependencies, we measured the performance of a logistic regression classifier on the remaining molecules and compared their biological relevance. We identified six signatures that predicted breast cancer risk with different specificity and selectivity. The best-performing signature had 13 factors. We validated the difference in the level of one of the biomarkers, SCF/KITLG, between plasma from healthy and susceptible individuals. These biomarkers will be used to develop low-cost liquid biopsy methods toward early identification of breast cancer risk and hence decreased mortality. Our findings provide the knowledge basis needed to proceed in this direction.
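To make the "rank each molecule, then eliminate features based on pairwise dependencies" idea concrete, the following is a minimal sketch and not the authors' pipeline. The abstract does not name the scoring or dependency measures, so the Welch t statistic, the Pearson correlation, and the 0.9 threshold used here are assumptions chosen for illustration. The marker names and values are made up.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)

def welch_t(a, b):
    """Absolute Welch t statistic between two groups (assumed univariate score)."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((v - ma) ** 2 for v in a) / (na - 1)
    vb = sum((v - mb) ** 2 for v in b) / (nb - 1)
    se = math.sqrt(va / na + vb / nb)
    return 0.0 if se == 0 else abs(ma - mb) / se

def select_features(columns, labels, max_abs_corr=0.9):
    """Score each feature, then greedily drop features that are strongly
    correlated with an already-kept, higher-scoring feature."""
    scores = {}
    for name, values in columns.items():
        healthy = [v for v, y in zip(values, labels) if y == 0]
        susceptible = [v for v, y in zip(values, labels) if y == 1]
        scores[name] = welch_t(healthy, susceptible)
    ranked = sorted(columns, key=lambda n: scores[n], reverse=True)
    kept = []
    for name in ranked:
        if all(abs(pearson(columns[name], columns[k])) <= max_abs_corr
               for k in kept):
            kept.append(name)
    return kept, scores

# Hypothetical toy data: 0 = healthy, 1 = susceptible.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "marker_a": [1.0, 1.2, 0.9, 1.1, 2.0, 2.3, 1.9, 2.1],
    "marker_b": [1.1, 1.2, 0.8, 1.2, 1.9, 2.4, 1.8, 2.2],  # noisy copy of marker_a
    "marker_c": [5.0, 3.0, 4.0, 6.0, 5.5, 3.5, 4.5, 6.5],  # weakly informative
}

kept, scores = select_features(columns, labels)
print(kept)
```

In this toy example the redundant `marker_b` is removed because it closely tracks the higher-scoring `marker_a`. The surviving features would then be passed to a classifier such as logistic regression for evaluation.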

Item Type: Article
Uncontrolled Keywords: Liquid biopsy, Breast cancer risk, Circulating biomarker, Machine learning, Feature selection
Subjects: B800 Medical Technology; C400 Genetics; G300 Statistics
Department: Faculties > Engineering and Environment > Mathematics, Physics and Electrical Engineering
Depositing User: Ay Okpokam
Date Deposited: 02 Jan 2020 12:56
Last Modified: 31 Jul 2021 14:17
URI: http://nrl.northumbria.ac.uk/id/eprint/41786
