Histological classification of non-small cell lung cancer with RNA-seq data using machine learning models

2021 
This study develops an automated model, based on a supervised learning framework, for classifying the histological subtypes of non-small cell lung cancer (NSCLC). The machine learning (ML) approach is applied to gene expression profiles for the diagnosis of lung cancer, which is the primary cause of cancer deaths worldwide. The performance of five classical ML estimators and four ensemble ML classifiers is evaluated on an RNA-Seq dataset of 127 NSCLC cases. The Decision Tree (DT) and Bagging models show promising results, with classification accuracy of up to 100% and areas under the curve (AUCs) above 0.97. The implemented ensemble methods collectively exhibit good performance, with AUCs ranging from 0.68 to 1.00. The findings are comparable to those of high-precision ML models, and the results provide insight into which supervised models can achieve higher diagnostic accuracy on RNA-Seq-based gene expression profiles of NSCLC subtypes.
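The abstract reports results as classification accuracy and AUC without saying how they were computed. As a rough illustration only, the following sketch shows one common way to obtain both metrics for a two-class problem. The function names `roc_auc` and `accuracy`, the 0.5 threshold, and the toy labels and scores are assumptions made for this example; they are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases whose thresholded score matches the label.
    The 0.5 threshold is an assumption for this sketch."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(labels, predicted) if y == p)
    return correct / len(labels)


# Toy data, invented for illustration: 1 and 0 stand for two subtypes,
# and each score is a hypothetical classifier's confidence in subtype 1.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

print(f"AUC = {roc_auc(labels, scores):.3f}")
print(f"accuracy = {accuracy(labels, scores):.3f}")
```

The two metrics can diverge: accuracy depends on the chosen threshold, whereas AUC only depends on how the scores rank the cases, which may be why the abstract reports both.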