DNA methylation-based age prediction using massively parallel sequencing data and multiple machine learning models

Aliferi, Anastasia, Ballard, David, Gallidabino, Matteo, Thurtle, Helen, Barron, Leon and Syndercombe Court, Denise (2018) DNA methylation-based age prediction using massively parallel sequencing data and multiple machine learning models. Forensic Science International: Genetics, 37. pp. 215-226. ISSN 1872-4973

Aliferi_FSIGEN_37_215ss.pdf - Accepted Version

Official URL: http://dx.doi.org/10.1016/j.fsigen.2018.09.003

Abstract

The field of DNA intelligence focuses on retrieving information from DNA evidence that can help narrow down large groups of suspects or define target groups of interest. With recent breakthroughs in the estimation of geographical ancestry and physical appearance, the estimation of chronological age completes this circle of information. Recent studies have identified methylation sites in the human genome that correlate strongly with age and can be used for the development of age-estimation algorithms. In this study, 110 whole blood samples from individuals aged 11–93 years were analysed using a DNA methylation quantification assay based on bisulphite conversion and massively parallel sequencing (Illumina MiSeq) of 12 CpG sites. Using these data, 17 different statistical modelling approaches were compared based on root mean square error (RMSE), and a Support Vector Machine with polynomial function (SVMp) model was selected for further testing. For the selected model (RMSE = 4.9 years), the mean absolute error (MAE) of the blind test (n = 33) was calculated at 4.1 years, with 52% of the samples predicted with less than 4 years of error and 86% with less than 7 years. Furthermore, the sensitivity of the method was assessed in terms of both methylation quantification accuracy and prediction accuracy, in the first validation of this kind. The described method retained its accuracy down to 10 ng of initial DNA input or ∼2 ng bisulphite PCR input. Finally, 34 saliva samples were analysed and, following basic normalisation, the chronological age of the donors was predicted with less than 4 years of error for 50% of the samples and with less than 7 years of error for 70%.
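
The abstract reports model performance as RMSE, MAE, and the share of samples predicted within a given error margin. The following is a minimal sketch of how such figures are conventionally computed, not the authors' code. The function names and the four age values are hypothetical and chosen only for illustration; they are not data from the study.

```python
import math

def rmse(true_ages, predicted_ages):
    """Root mean square error between true and predicted ages."""
    squared = [(p - t) ** 2 for t, p in zip(true_ages, predicted_ages)]
    return math.sqrt(sum(squared) / len(squared))

def mae(true_ages, predicted_ages):
    """Mean of the absolute prediction errors."""
    absolute = [abs(p - t) for t, p in zip(true_ages, predicted_ages)]
    return sum(absolute) / len(absolute)

def fraction_within(true_ages, predicted_ages, threshold_years):
    """Share of samples whose absolute error is below threshold_years."""
    hits = sum(
        1 for t, p in zip(true_ages, predicted_ages)
        if abs(p - t) < threshold_years
    )
    return hits / len(true_ages)

# Hypothetical values for illustration only (NOT data from the study).
true_ages = [20.0, 35.0, 50.0, 65.0]
predicted_ages = [23.0, 33.0, 58.0, 64.0]

print(f"RMSE: {rmse(true_ages, predicted_ages):.2f} years")
print(f"MAE: {mae(true_ages, predicted_ages):.2f} years")
print(f"Within 4 years: {fraction_within(true_ages, predicted_ages, 4):.0%}")
print(f"Within 7 years: {fraction_within(true_ages, predicted_ages, 7):.0%}")
```

Because RMSE squares each error before averaging, a single large miss raises it more than it raises MAE, which is consistent with the abstract reporting an RMSE above the MAE.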

Item Type: Article
Uncontrolled Keywords: DNA methylation, Age prediction, Artificial neural networks, Machine learning, Whole blood, Saliva, Sperm
Subjects: C400 Genetics
F400 Forensic and Archaeological Science
Department: Faculties > Health and Life Sciences > Applied Sciences
Depositing User: Becky Skoyles
Date Deposited: 24 Sep 2018 14:15
Last Modified: 11 Oct 2019 13:00
URI: http://nrl.northumbria.ac.uk/id/eprint/35880
