SSCMDA: spy and super cluster strategy for MiRNA-disease association prediction

2018 
Qi Zhao 1,2, Di Xie 1, Hongsheng Liu 2,3, Fan Wang 4,5, Gui-Ying Yan 6 and Xing Chen 7

1 School of Mathematics, Liaoning University, Shenyang, China
2 Research Center for Computer Simulating and Information Processing of Bio-Macromolecules of Liaoning Province, Shenyang, China
3 School of Life Science, Liaoning University, Shenyang, China
4 School of Mechatronic Engineering, China University of Mining and Technology, Xuzhou, China
5 Jiangsu Key Laboratory of Mine Mechanical and Electrical Equipment, China University of Mining and Technology, Xuzhou, China
6 Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
7 School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China

Correspondence to: Qi Zhao, email: zhaoqi@lnu.edu.cn; Xing Chen, email: xingchen@amss.ac.cn

Keywords: microRNA; disease; association prediction; spy strategy; super cluster strategy

Received: July 18, 2017
Accepted: October 30, 2017
Published: December 01, 2017

ABSTRACT

Identifying associations between microRNAs (miRNAs) and diseases has received increasing attention in biology as a highly meaningful line of study for clinical medicine. However, confirming miRNA-disease associations by experimental methods is expensive and time-consuming. In recent years, several effective computational models have therefore been developed to predict potential miRNA-disease associations. In this paper, we propose the Spy and Super Cluster strategy for MiRNA-Disease Association prediction (SSCMDA), which is based on known miRNA-disease associations, integrated disease similarity and integrated miRNA similarity. Unknown miRNA-disease pairs mix potential associations with true negative associations, and treating them all as negatives leads to inaccurate prediction. SSCMDA therefore adopts a spy strategy to identify reliable negative samples among the unknown miRNA-disease pairs.
Moreover, the super-cluster strategy gathers as many positive samples as possible, improving prediction accuracy by compensating for the shortage of positive training samples. The AUCs of global leave-one-out cross validation (LOOCV), local LOOCV and 5-fold cross validation were 0.9007, 0.8747 and 0.8806+/-0.0025, respectively. According to these AUC results, SSCMDA shows a significant improvement over some previous models. We further carried out case studies based on various versions of the HMDD database to test the robustness of SSCMDA's prediction performance. We also implemented a case study to examine whether SSCMDA is effective for new diseases without any known associated miRNAs. A large proportion of the predicted miRNAs have been verified by experimental reports.
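
The abstract does not spell out how the spy strategy is implemented. As a rough illustration only, the following Python sketch shows the generic spy technique from positive-unlabeled learning and should not be read as SSCMDA's actual procedure. The feature vectors, the function name `select_reliable_negatives`, the `spy_ratio` parameter, the centroid-based scorer and the minimum-spy-score threshold are all assumptions made for this example. A few known positives ("spies") are hidden among the unknown pairs. Unknown pairs that score lower than every spy are then kept as reliable negatives.

```python
import random


def centroid(rows):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_reliable_negatives(positives, unlabeled, spy_ratio=0.2, seed=0):
    """Return indices of unlabeled samples that look reliably negative.

    Hypothetical sketch: the scorer is a simple centroid comparison,
    standing in for whatever classifier a real model would use.
    """
    if len(positives) < 2:
        raise ValueError("need at least two positives to spare a spy")

    rng = random.Random(seed)
    # Pick the spies, always leaving at least one positive behind.
    n_spies = max(1, int(len(positives) * spy_ratio))
    n_spies = min(n_spies, len(positives) - 1)
    spy_idx = set(rng.sample(range(len(positives)), n_spies))
    spies = [p for i, p in enumerate(positives) if i in spy_idx]
    kept = [p for i, p in enumerate(positives) if i not in spy_idx]

    # Spies are mixed into the unlabeled set before scoring.
    pos_c = centroid(kept)
    mixed_c = centroid(unlabeled + spies)

    def score(x):
        # Higher score: closer to the positive centroid than to the mixed one.
        return sq_dist(x, mixed_c) - sq_dist(x, pos_c)

    # Assumed threshold: the lowest score any spy receives.
    threshold = min(score(s) for s in spies)
    return [i for i, u in enumerate(unlabeled) if score(u) < threshold]
```

Taking the minimum spy score as the threshold is the strictest choice. A noise-tolerant variant would instead allow a small fraction of spies to fall below it.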