Gene Selection inside Pathways using Weighted L1-norm Support Vector Machine

2017 
A common issue with high-dimensional gene expression data is that many of the genes may not be relevant to the disease under study. Genes are naturally organized into pathways, where each pathway contains several genes that share a biological function. Gene selection has been shown to be an effective way to improve the results of many classification methods, so it is of great interest to incorporate pathway knowledge into gene selection. In this paper, a weighted sparse support vector machine is proposed for identifying genes and pathways; it combines the support vector machine with a weighted L1-norm penalty. Experimental results on three publicly available gene expression datasets show that the proposed method significantly outperforms three competing methods in terms of classification accuracy, G-mean, and area under the curve. In addition, the results demonstrate that the top identified genes and pathways are biologically related to the cancer type. Thus, the proposed method can be useful for cancer classification using DNA gene expression data in real clinical practice.
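To make the idea of a weighted L1-norm penalty concrete, the following is a minimal sketch, not the authors' implementation. It assumes a linear SVM with squared hinge loss and a penalty of the form sum_j w_j |beta_j|, fitted with scikit-learn's LinearSVC by rescaling each gene's column by 1/w_j and mapping the coefficients back. The abstract does not say how the weights are derived from pathway information, so the weights, the synthetic data, and the helper names fit_weighted_l1_svm and g_mean are hypothetical placeholders.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.svm import LinearSVC


def fit_weighted_l1_svm(X, y, weights, C=1.0):
    """Fit a linear SVM with penalty sum_j weights[j] * |beta_j|.

    Dividing column j by weights[j] and fitting an ordinary L1-penalised
    SVM is equivalent to the weighted penalty on the original scale.
    Note: LinearSVC uses squared hinge loss here and also regularises
    the intercept, so this is only an approximation of a textbook SVM.
    """
    weights = np.asarray(weights, dtype=float)
    X_scaled = X / weights  # larger weight -> smaller column -> stronger shrinkage
    clf = LinearSVC(penalty="l1", loss="squared_hinge", dual=False,
                    C=C, max_iter=10000, random_state=0)
    clf.fit(X_scaled, y)
    beta = clf.coef_.ravel() / weights  # coefficients on the original gene scale
    return beta, float(clf.intercept_[0])


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity for 0/1 labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    sensitivity = np.mean(y_pred[y_true == 1] == 1)
    specificity = np.mean(y_pred[y_true == 0] == 0)
    return float(np.sqrt(sensitivity * specificity))


# Synthetic stand-in for an expression matrix: 80 samples, 20 "genes".
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 20))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Hypothetical weights: genes 0-4 are treated as belonging to a pathway
# of interest and are penalised less than the remaining genes.
weights = np.ones(20)
weights[:5] = 0.5

beta, intercept = fit_weighted_l1_svm(X, y, weights, C=0.1)
scores = X @ beta + intercept
y_pred = (scores > 0).astype(int)

print("selected genes:", np.flatnonzero(beta))
print("training G-mean:", g_mean(y, y_pred))
print("training AUC:", roc_auc_score(y, scores))
```

Genes with nonzero entries in beta are the ones the sparse model retains. In practice the metrics would be computed on held-out samples rather than on the training data as in this toy run.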