MiningBreastCancer: Selection of Candidate Gene Associated with Breast Cancer via Comparison between Data Mining of TCGA and Text Mining of PubMed

2020 
In 2016, 12,676 new cases of breast cancer were diagnosed among Taiwanese women. In 2018, the standardized death rate from breast cancer was 12.5 per 100,000 persons. Previous studies have integrated data and text mining to yield fusion genes, identify genetic factors for breast cancer, and select single-gene feature sets for colon cancer discrimination. However, our study is the first to select genes with significantly different expression between normal breast tissue and breast cancer using TCGA data and biostatistics, and then to exclude known genes using PubMed abstracts and natural language processing. The top twenty genes for research potential selected by MiningBreastCancer are EML3, ABCB9, GRASP, KANK3, GPR146, ZNF623, CCDC9, ADCY4, DLL1, ADAM33, GRRP1, LRRN4CL, C14orf180, ABCD4, ABCC6P1, PEAR1, FAM43A, C20orf160, KIF21A and PPFIA3. Few studies of these genes exist, yet they show significantly different expression between breast cancer and normal tissue, across pathologic tumor and lymph node stages, or across pathologic metastasis stages. These results show that MiningBreastCancer can help scientists select genes with research potential. MiningBreastCancer is available at http://bio.yungyun.com.tw/MiningBreastCancer.aspx.
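The two-stage idea described above (keep genes whose expression differs between tumor and normal tissue, then drop genes that are already well covered in the literature) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not name the statistical test or the cutoffs, so Welch's t statistic, the `t_cutoff` and `max_hits` thresholds, the function names, the placeholder gene names, and all numbers are assumptions made for the example.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples with unequal variances."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def select_candidates(tumor, normal, abstract_hits, t_cutoff=4.0, max_hits=5):
    """Keep genes that differ strongly between groups but are rarely mentioned.

    t_cutoff and max_hits are hypothetical thresholds chosen for illustration.
    """
    picked = []
    for gene in tumor:
        t = welch_t(tumor[gene], normal[gene])
        if abs(t) >= t_cutoff and abstract_hits.get(gene, 0) <= max_hits:
            picked.append((gene, t))
    # Strongest expression difference first.
    return sorted(picked, key=lambda item: abs(item[1]), reverse=True)


# Toy, made-up values for illustration only (not TCGA or PubMed data).
tumor = {
    "GENE_A": [2.1, 2.4, 1.9, 2.2],
    "GENE_B": [8.9, 9.3, 9.1, 8.8],
    "GENE_C": [5.0, 5.2, 4.9, 5.1],
}
normal = {
    "GENE_A": [6.0, 6.3, 5.8, 6.1],
    "GENE_B": [4.0, 4.2, 3.9, 4.1],
    "GENE_C": [5.1, 4.8, 5.0, 5.2],
}
# Hypothetical counts of abstracts mentioning each gene together with breast cancer.
abstract_hits = {"GENE_A": 2, "GENE_B": 1500, "GENE_C": 0}

candidates = select_candidates(tumor, normal, abstract_hits)
```

In this toy input, GENE_B differs strongly but is heavily published, and GENE_C is unstudied but shows no expression difference, so only GENE_A survives both filters. A real analysis would also need multiple-testing correction across thousands of genes, which this sketch omits.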