Attribute Selection and Classification of Prostate Cancer Gene Expression Data Using Artificial Neural Networks

2016 
Artificial Intelligence (AI) approaches for medical diagnosis and prediction of cancer are important and ever-growing areas of research. Artificial Neural Networks (ANNs) are one such approach that has been successfully applied in these areas. Various types of clinical datasets have been used in intelligent decision-making systems for medical diagnosis, especially of cancer, for over three decades. However, gene expression datasets are complex, with large numbers of attributes, which makes classification and prediction more difficult for AI approaches. The Prostate Cancer dataset is one such dataset, with 12,600 attributes and only 102 samples. In this paper, we propose an extended ANN-based approach for classification and prediction of prostate cancer using gene expression data. First, we use four attribute selection approaches, namely Sequential Floating Forward Selection (SFFS), RELIEFF, Sequential Backward Feature Selection (SFBS) and Significant Attribute Evaluation (SAE), to identify the most influential attributes among the 12,600. We then use ANNs and Naive Bayes for classification with the complete set of attributes as well as with various subsets obtained from the attribute selection methods. Experimental results show that the ANN outperformed Naive Bayes, achieving a classification accuracy of 98.2% compared to 62.74% with the full set of attributes. Further, with 21 selected attributes obtained with SFFS, ANNs achieved better classification accuracy (100%) than Naive Bayes. For prediction using ANNs, SFFS achieved the best results, with an accuracy of 92.31%, correctly predicting 24 out of 26 samples provided for independent sample testing. Moreover, some of the genes selected by SFFS are identified as having a direct reference to cancer and tumours. Our results indicate that a combination of standard feature selection methods in conjunction with ANNs provides the most impressive results.
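To make the attribute selection step more concrete, the following is a minimal sketch of how sequential floating forward selection generally works: a forward step adds the most helpful attribute, and a conditional backward ("floating") step drops attributes whenever that beats the best smaller subset seen so far. It is not the setup used in the paper. The function names `sffs` and `separability`, the Fisher-style scoring criterion, and the small synthetic dataset are all assumptions chosen for illustration; the paper works with the real 12,600-attribute dataset and evaluates the selected attributes with ANNs and Naive Bayes.

```python
import random


def separability(X, y, subset):
    """Assumed stand-in criterion (not from the paper): squared distance
    between the two class centroids divided by the mean within-class
    squared distance to the own centroid, restricted to `subset`."""
    cols = sorted(subset)
    groups = {0: [], 1: []}
    for row, label in zip(X, y):
        groups[label].append([row[j] for j in cols])
    centroids = {
        c: [sum(col) / len(rows) for col in zip(*rows)]
        for c, rows in groups.items()
    }
    between = sum((a - b) ** 2 for a, b in zip(centroids[0], centroids[1]))
    within = sum(
        sum((v - m) ** 2 for v, m in zip(row, centroids[c]))
        for c, rows in groups.items()
        for row in rows
    ) / len(X)
    return between / (within + 1e-12)  # small constant avoids division by zero


def sffs(X, y, k, score=separability):
    """Sequential floating forward selection of k feature indices."""
    n_features = len(X[0])
    selected = []
    best = {}  # best score seen so far for each subset size
    while len(selected) < k:
        # Forward step: add the single feature that improves the score most.
        candidates = [j for j in range(n_features) if j not in selected]
        add = max(candidates, key=lambda j: score(X, y, selected + [j]))
        selected.append(add)
        size = len(selected)
        best[size] = max(best.get(size, float("-inf")), score(X, y, selected))
        # Floating step: drop a feature while doing so strictly beats the
        # best subset of the smaller size recorded so far.
        while len(selected) > 2:
            drop = max(
                selected,
                key=lambda j: score(X, y, [i for i in selected if i != j]),
            )
            reduced = [i for i in selected if i != drop]
            reduced_score = score(X, y, reduced)
            if reduced_score > best[len(reduced)]:
                selected, best[len(reduced)] = reduced, reduced_score
            else:
                break
    return sorted(selected)


# Synthetic toy data (an assumption, not the prostate cancer dataset):
# 60 samples, 8 features; only features 0 and 3 carry the class signal.
rng = random.Random(0)
y = [i % 2 for i in range(60)]
X = [
    [10 * label + rng.uniform(-1, 1) if j in (0, 3) else rng.uniform(-1, 1)
     for j in range(8)]
    for label in y
]
chosen = sffs(X, y, k=2)
print(chosen)
```

On this toy data the two signal-carrying columns are far better separated than the noise columns, so the selector should recover them. In a wrapper-style setting, the `score` argument could instead be a cross-validated classifier accuracy, which is much more expensive with thousands of attributes.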