A Machine Learning Framework Based on Extreme Gradient Boosting for Intelligent Alzheimer’s Disease Diagnosis Using Structure MRI

2022 
Alzheimer’s Disease (AD) is difficult to diagnose even with recent advanced diagnostic methods. Other disorders, such as frontotemporal lobe dementia or vascular dementia, can be misdiagnosed as AD. Various deep learning models for early detection of AD from MRI data have demonstrated promising results. However, the results of these methods are difficult to interpret because they do not identify the specific structural changes that are related to the disease. They also require large amounts of data and computational resources. In this study, we propose a machine learning framework for the diagnosis of AD. Our framework uses the FreeSurfer library to extract informative features, such as volumetric measures and voxel- or vertex-wise atrophy, from structural MRI (sMRI) brain scans collected by the Alzheimer’s Disease Neuroimaging Initiative (ADNI). The extracted features are then fed into Extreme Gradient Boosting (XGBoost), an ensemble learning algorithm that uses decision trees as base learners, to distinguish AD patients from cognitively normal (CN) subjects. XGBoost also provides a measure of variable importance, evaluated by criteria such as information gain or feature frequency, which gives insight into which structural features have the greatest impact on the final diagnosis. Our model was trained on 144 features extracted from 924 sMRI images (462 images of AD and 462 images of CN) and achieved an average Area Under the Curve (AUC) of 91% under 5-fold cross-validation. Based on the feature ranking, we observed that the third-ventricle, intracranial, and supratentorial volumes were among the brain structural features most strongly affected in AD. This information could assist doctors and experts in AD diagnosis. In conclusion, the proposed framework diagnoses AD using XGBoost and brain structural atrophy. The feature extraction phase, which consumes significant computational resources, is currently the bottleneck of our diagnosis pipeline. In the future, we plan to improve the efficiency of feature extraction to reduce computational cost while maintaining diagnostic accuracy.
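The evaluation protocol described in the abstract (boosted trees, stratified 5-fold cross-validation scored by AUC, then a feature ranking by gain and by split frequency) can be illustrated with a small sketch. This is not the authors' code and does not call the XGBoost library: it is a simplified stand-in that boosts depth-one trees with the second-order split gain used in XGBoost's formulation, trained on synthetic data rather than FreeSurfer features. All function names, the six synthetic features, and the hyperparameters (`rounds`, `eta`, `lam`) are assumptions made for illustration.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def best_stump(X, grad, hess, lam):
    """Return (gain, feature, threshold, left_weight, right_weight) of the best split."""
    G, H = sum(grad), sum(hess)
    best = None
    for j in range(len(X[0])):
        order = sorted(range(len(X)), key=lambda i: X[i][j])
        gl = hl = 0.0
        for k in range(len(order) - 1):
            i, nxt = order[k], order[k + 1]
            gl += grad[i]
            hl += hess[i]
            if X[i][j] == X[nxt][j]:
                continue  # cannot split between equal values
            gr, hr = G - gl, H - hl
            gain = 0.5 * (gl * gl / (hl + lam) + gr * gr / (hr + lam)
                          - G * G / (H + lam))
            if best is None or gain > best[0]:
                best = (gain, j, X[nxt][j], -gl / (hl + lam), -gr / (hr + lam))
    return best

def fit(X, y, rounds=20, eta=0.3, lam=1.0):
    """Boost depth-one trees on the logistic loss; returns the list of stumps."""
    margin = [0.0] * len(X)
    stumps = []
    for _ in range(rounds):
        p = [sigmoid(m) for m in margin]
        grad = [pi - yi for pi, yi in zip(p, y)]   # first derivative of the loss
        hess = [pi * (1.0 - pi) for pi in p]       # second derivative of the loss
        stump = best_stump(X, grad, hess, lam)
        if stump is None:
            break
        stumps.append(stump)
        _, j, thr, wl, wr = stump
        margin = [m + eta * (wl if x[j] < thr else wr) for m, x in zip(margin, X)]
    return stumps

def predict(stumps, x, eta=0.3):
    """Raw margin for one sample; higher means more likely class 1."""
    return sum(eta * (wl if x[j] < thr else wr) for _, j, thr, wl, wr in stumps)

def importance(stumps, n_features):
    """Total split gain and split count per feature."""
    gain, freq = [0.0] * n_features, [0] * n_features
    for g, j, *_ in stumps:
        gain[j] += g
        freq[j] += 1
    return gain, freq

def stratified_folds(y, k, rng):
    """Deal each class round-robin into k folds so class balance is preserved."""
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, yi in enumerate(y) if yi == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds

# Synthetic, balanced data: feature 0 is strongly informative, feature 1 weakly,
# features 2-5 are pure noise. Real inputs would be the extracted sMRI features.
rng = random.Random(0)
y = [0] * 100 + [1] * 100
X = [[rng.gauss(2.0 * yi, 1.0), rng.gauss(0.7 * yi, 1.0)]
     + [rng.gauss(0.0, 1.0) for _ in range(4)] for yi in y]

folds = stratified_folds(y, 5, rng)
aucs = []
for test_idx in folds:
    held_out = set(test_idx)
    train_idx = [i for i in range(len(y)) if i not in held_out]
    model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
    aucs.append(auc([y[i] for i in test_idx],
                    [predict(model, X[i]) for i in test_idx]))
mean_auc = sum(aucs) / len(aucs)

gain, freq = importance(fit(X, y), n_features=6)
print("per-fold AUC:", [round(a, 3) for a in aucs], "mean:", round(mean_auc, 3))
print("total gain per feature:", [round(g, 2) for g in gain])
print("split count per feature:", freq)
```

The two importance lists correspond in spirit to the `total_gain` and `weight` importance types that the XGBoost library reports; with the real library, `fit` and `predict` would be replaced by its classifier and the ranking read from the trained booster.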