A machine learning model for predicting fetal Haemoglobin levels in sickle cell disease patients

2021 
Sickle cell disease is one of the most common genetic diseases and is characterized by a decrease in hemoglobin concentration in the blood. The main known factor that can alleviate the disease is the persistence of fetal hemoglobin (HbF). The aim of our research is therefore to build a model that predicts a patient's HbF% based on the three regulating genes of the disease (BCL11A, Xmn1-HBG2 and HBS1L-MYB). A machine learning approach is employed to improve the accuracy of the model, and several algorithms of that type are explored. The K-nearest neighbors algorithm is chosen, and an initial version of it is implemented and tested. The algorithm is then optimized, enabling the final model to predict a patient's HbF% with 87.25% accuracy, a major improvement over the existing alternative, which has a mean error of 336.33%. Furthermore, 93.45% of our predictions have an absolute error of less than 0.5. These results support the model as a quick and accurate estimation tool for small and medium-sized clinical trials, where fast HbF% predictions can help adjust for genetic background variability that obscures test outcomes.
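To make the K-nearest neighbors idea concrete, the following is a minimal sketch of KNN regression on three genotype features. It is not the authors' implementation. The genotype encoding (minor-allele counts of 0, 1 or 2 per gene), the Manhattan distance, the value of k, the function names and every number in the toy data set are assumptions made purely for illustration, and none of them comes from the study.

```python
from statistics import mean

# Hypothetical encoding: each patient is a tuple of minor-allele counts
# (0, 1 or 2) for the three genes, paired with a measured HbF%.
# Every value below is invented for illustration; none is study data.
TOY_TRAINING = [
    ((0, 0, 0), 3.0),
    ((0, 0, 1), 4.0),
    ((1, 0, 0), 5.0),
    ((1, 1, 0), 8.0),
    ((1, 1, 1), 10.0),
    ((2, 1, 1), 12.0),
    ((2, 2, 1), 15.0),
    ((2, 2, 2), 18.0),
]


def manhattan(a, b):
    """Sum of absolute per-gene differences between two genotypes."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train, query, k=3):
    """Predict HbF% as the unweighted mean of the k closest training patients."""
    if not 1 <= k <= len(train):
        raise ValueError("k must be between 1 and the number of training rows")
    # sorted() is stable, so ties in distance keep their training-set order.
    nearest = sorted(train, key=lambda row: manhattan(row[0], query))[:k]
    return mean(hbf for _, hbf in nearest)


def mean_absolute_error(train, test, k=3):
    """Average absolute gap between predicted and measured HbF% on a test set."""
    return mean(abs(knn_predict(train, genotype, k) - hbf) for genotype, hbf in test)


if __name__ == "__main__":
    print(knn_predict(TOY_TRAINING, (2, 2, 2), k=3))
    print(mean_absolute_error(TOY_TRAINING, [((2, 2, 2), 16.0)], k=3))
```

In a sketch like this, optimizing the algorithm would typically mean tuning k, the distance function and the neighbor weighting against held-out patients. The abstract does not say which of these the authors adjusted.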