Decision tree‑based classifiers for lung cancer diagnosis and subtyping using TCGA miRNA expression data

2019 
Lung cancer has the world's highest cancer-associated mortality rate, making biomarker discovery for this cancer a pressing issue. Machine learning approaches to identify molecular biomarkers are not as prevalent as screening of potential biomarkers by differential expression analysis. However, several differentially expressed miRNAs involved in cancer have been identified using this approach. The availability of The Cancer Genome Atlas (TCGA) allows the use of machine learning methods for the molecular profiling of tumors. The present study used empirical negative control microRNAs (miRNAs) in lung cancer to normalize the lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) datasets from TCGA, and then built decision trees to classify lung cancer status and subtype. The two primary classification models together used four miRNAs: hsa-miR-183 and hsa-miR-135b distinguished lung tumors from normal samples taken from tissues adjacent to the tumor site, and hsa-miR-944 and hsa-miR-205 further classified the tumors into the major subtypes LUAD and LUSC. Subtype-specific cancer status classification models were also presented.
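The two-stage structure described in the abstract (tumor versus normal first, then LUAD versus LUSC) can be illustrated with a minimal sketch. Everything numeric here is an assumption: the cut-off values are hypothetical placeholders, not thresholds from the study, and the direction of each split (higher expression pointing to tumor, or to LUSC) and the way the two markers are combined at each stage are likewise assumed for illustration. The function names `is_tumor`, `subtype`, and `classify` are invented for this example.

```python
from typing import Dict

# Hypothetical cut-offs on normalized expression values.
# These are placeholders for illustration, NOT values reported by the study.
TUMOR_CUTOFFS = {"hsa-miR-183": 1.0, "hsa-miR-135b": 1.0}
LUSC_CUTOFFS = {"hsa-miR-944": 1.0, "hsa-miR-205": 1.0}


def is_tumor(expr: Dict[str, float]) -> bool:
    """Stage 1 (assumed rule): tumor if either marker exceeds its cut-off."""
    return any(expr[mirna] > cutoff for mirna, cutoff in TUMOR_CUTOFFS.items())


def subtype(expr: Dict[str, float]) -> str:
    """Stage 2 (assumed rule): LUSC if both markers exceed their cut-offs."""
    above = all(expr[mirna] > cutoff for mirna, cutoff in LUSC_CUTOFFS.items())
    return "LUSC" if above else "LUAD"


def classify(expr: Dict[str, float]) -> str:
    """Run the diagnosis step, then the subtyping step only for tumors."""
    if not is_tumor(expr):
        return "normal"
    return subtype(expr)


if __name__ == "__main__":
    # Made-up sample; the values are arbitrary and carry no biological meaning.
    sample = {
        "hsa-miR-183": 2.4,
        "hsa-miR-135b": 0.3,
        "hsa-miR-944": 3.1,
        "hsa-miR-205": 2.2,
    }
    print(classify(sample))
```

In practice the split points and tree shape would be learned from the normalized TCGA expression data rather than fixed by hand; the sketch only shows how a diagnosis model and a subtyping model can be chained.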