Identification of biomarkers for acute leukemia via machine learning-based stemness index.

2021 
Abstract Traditional methods for studying the biological characteristics of leukemia stem cells (LSCs) include constructing LSC-like cells and mouse models by transgenic or knock-in techniques. However, these methods have potential pitfalls, such as retroviral insertional mutagenesis, non-physiological levels of gene expression, non-physiological expansion, and difficulty of construction. The mRNA expression-based stemness index (mRNAsi), computed by machine learning for each sample of The Cancer Genome Atlas (TCGA), could avoid these pitfalls. In this work, we aimed to construct a network of LSC genes using the mRNAsi. First, we analyzed mRNAsi values in acute myeloid leukemia (AML) samples with respect to expression distribution, survival, age, and gender. Then, we used weighted gene co-expression network analysis (WGCNA) to construct modules of stemness genes. We analyzed the correlation among the transcription levels of the LSC genes and the interplay among the LSC proteins. We performed functional and pathway enrichment analysis to annotate the stemness genes. Survival analysis using clinical data from TCGA and the Gene Expression Omnibus (GEO) database further identified prognostic biomarkers. We found that mRNAsi was not significantly associated with overall survival, which may be due to the heterogeneity of AML across stages of myeloid differentiation, as reflected in the French–American–British (FAB) classification system. Enrichment analysis indicated that the stemness genes were biologically clustered as a group and mainly associated with the cell cycle and mitosis. Moreover, 10 key genes (SNRNP40, RFC4, RFC5, CDC6, HSPE1, PA2G4, SNAP23P, DARS2, MIS18A, and HPRT1) were identified by survival analysis with data from TCGA and GEO. Among them, RFC4 and RFC5 stood out as biomarkers because their prognostic value was validated in both databases. Additionally, the expression of RFC4 and RFC5 followed the same trend as the mRNAsi score across FAB subtypes.
In conclusion, our results demonstrate that mRNAsi-based LSC-related genes interact strongly as a cluster. These genes, especially RFC4 and RFC5, could be therapeutic targets for inhibiting the stemness characteristics of AML. This work also provides a comprehensive pipeline for future cancer stem cell studies.
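
As an illustration of how a stemness index of this kind is typically assigned to samples, the following minimal sketch scores each sample by the Spearman correlation between its expression profile and a stemness signature, then rescales the scores to the range 0 to 1. This follows the general description of mRNAsi in the stemness index literature and is not the authors' code. The weight vector, sample names, and expression values are hypothetical toy numbers; in practice the signature weights would come from a trained model, and the function names here are illustrative only.

```python
from typing import Dict, List, Sequence


def ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation computed as Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def stemness_scores(
    weights: Sequence[float], samples: Dict[str, Sequence[float]]
) -> Dict[str, float]:
    """Correlate each sample with the signature, then min-max scale to [0, 1]."""
    raw = {name: spearman(weights, expr) for name, expr in samples.items()}
    lo, hi = min(raw.values()), max(raw.values())
    if hi == lo:  # all samples scored identically; scaling is undefined
        return {name: 0.0 for name in raw}
    return {name: (r - lo) / (hi - lo) for name, r in raw.items()}


# Hypothetical signature weights for four genes (toy values, not real data).
weights = [0.9, 0.4, -0.2, -0.7]

# Hypothetical expression values for the same four genes in three samples.
samples = {
    "sample_A": [8.1, 6.0, 3.2, 1.5],
    "sample_B": [1.0, 2.5, 6.3, 7.9],
    "sample_C": [5.0, 7.0, 2.0, 4.0],
}

scores = stemness_scores(weights, samples)
for name, score in scores.items():
    print(name, round(score, 3))
```

Because the scaling is relative to the lowest and highest scoring samples in the cohort, the resulting values are comparable only within the same set of samples.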