A Concept Lattice Method for Eliminating Redundant Features

2021 
Microarray technology has made gene expression data readily obtainable, and extracting useful information from omics gene data quickly is now an important research task. Feature selection is a key step in data preprocessing and strongly affects how well an algorithm can extract information. Because a single feature selection method can bias the resulting feature subset, we introduce ensemble learning to address redundancy among clusters. We propose a new method called Multi-Cluster minimum Redundancy (MCmR). First, features are clustered using the L1-norm. Next, redundant features among clusters are removed with the mRMR algorithm. Finally, the features in the resulting subset are ranked by their MCFS_score, and the features with the highest scores are output. The concept lattice constructed with MCmR contains fewer redundant concepts while preserving its structure, which improves the efficiency of data analysis. We verify the validity of MCmR on multiple disease gene datasets; its accuracy (ACC) on the Prostate_Tumor, Lung_cancer, Breast_cancer and Leukemia datasets reached 95.4, 94.9, 96.0 and 95.8, respectively.
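To make the redundancy-removal step more concrete, the following is a minimal sketch of greedy mRMR-style selection. It is not the authors' MCmR implementation: the function names `pearson` and `mrmr_select` and the toy data are hypothetical, and absolute Pearson correlation is used here as a simple stand-in for the mutual information that mRMR normally relies on. The clustering and MCFS_score ranking steps are not shown.

```python
import math


def pearson(u, v):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su > 0 and sv > 0 else 0.0


def mrmr_select(features, target, k):
    """Greedy mRMR-style selection of k feature indices.

    features: list of feature columns, each a list of floats
    target:   one label or score per sample
    Assumption: |Pearson correlation| replaces mutual information
    for both the relevance and the redundancy terms.
    """
    n = len(features)
    relevance = [abs(pearson(col, target)) for col in features]
    # Start from the single most relevant feature.
    selected = [max(range(n), key=lambda j: relevance[j])]
    while len(selected) < min(k, n):
        best, best_score = None, -math.inf
        for j in range(n):
            if j in selected:
                continue
            # Redundancy: mean |correlation| with features already chosen.
            redundancy = sum(
                abs(pearson(features[j], features[s])) for s in selected
            ) / len(selected)
            score = relevance[j] - redundancy
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected


# Toy data (hypothetical): feature 1 is a near-copy of feature 0,
# feature 2 carries independent information about the target.
a = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
a_copy = [0.1, 1.0, 2.1, 3.0, 4.1, 5.0]
b = [1.0, -1.0, 0.0, 0.0, -1.0, 1.0]
target = [x + 2.0 * z for x, z in zip(a, b)]

selected = mrmr_select([a, a_copy, b], target, k=2)
print(selected)
```

On this toy input the selection keeps the independent feature and only one of the two near-duplicates, which is the behaviour the redundancy penalty is meant to produce.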