Predicting human lncRNA-disease associations based on geometric matrix completion.
2019
Increasing evidence indicates that dysregulation of long non-coding RNAs (lncRNAs) is implicated in various complex diseases. However, only a limited number of lncRNA-disease associations have been experimentally verified. Prioritizing potential lncRNA-disease associations is beneficial not only for understanding disease mechanisms at the lncRNA level, but also for disease diagnosis. Various computational methods have been proposed, but precise prediction and full use of the intrinsic structure of the data remain challenging. In this study, we propose a new method, named GMCLDA (Geometric Matrix Completion lncRNA-Disease Association), to infer potential lncRNA-disease associations based on geometric matrix completion. By utilizing association patterns among functionally similar lncRNAs and phenotypically similar diseases, GMCLDA makes use of the intrinsic structure of the lncRNA-disease association matrix. In addition, limiting the scope of the predicted values gives rise to a certain sparsity in computation and enhances the robustness of GMCLDA. GMCLDA computes disease semantic similarity based on the Disease Ontology (DO) hierarchy and calculates Gaussian interaction profile kernel similarity for lncRNAs. Then, GMCLDA measures lncRNA sequence similarity using the Needleman-Wunsch algorithm. For a new lncRNA, GMCLDA prefills the association profile according to its K-nearest neighbors defined by sequence similarity. Finally, GMCLDA completes the association matrix based on the geometric matrix completion framework. Computational results show that GMCLDA can effectively predict lncRNA-disease associations with higher accuracy than existing methods. Further case studies show that GMCLDA is able to correctly predict candidate lncRNAs for renal cancer, ovarian cancer, and prostate cancer.
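Two of the preprocessing steps named in the abstract, the Gaussian interaction profile (GIP) kernel similarity and the K-nearest-neighbor prefill for a new lncRNA, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `gip_kernel_similarity` and `knn_prefill` are hypothetical, the bandwidth normalization follows the commonly used GIP convention, and the similarity-weighted average used for the prefill is an assumption, since the abstract does not state the exact weighting.

```python
import numpy as np


def gip_kernel_similarity(assoc, gamma_prime=1.0):
    """GIP kernel similarity between the rows (lncRNAs) of a binary
    lncRNA x disease association matrix.

    Assumption: the bandwidth is gamma_prime divided by the mean squared
    norm of the row profiles (the usual GIP convention).
    """
    assoc = np.asarray(assoc, dtype=float)
    sq_norms = np.sum(assoc ** 2, axis=1)
    mean_sq_norm = sq_norms.mean()
    if mean_sq_norm == 0:
        # All profiles are empty, hence identical: similarity is 1 everywhere.
        return np.ones((assoc.shape[0], assoc.shape[0]))
    gamma = gamma_prime / mean_sq_norm
    # Pairwise squared Euclidean distances between row profiles.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (assoc @ assoc.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against round-off
    return np.exp(-gamma * sq_dists)


def knn_prefill(assoc, seq_sim, new_idx, k=2):
    """Prefill the empty profile of lncRNA `new_idx` from its k most
    sequence-similar neighbors.

    Assumption: the prefilled profile is the similarity-weighted average
    of the neighbors' profiles.
    """
    assoc = np.asarray(assoc, dtype=float)
    sims = np.asarray(seq_sim, dtype=float)[new_idx].copy()
    sims[new_idx] = -np.inf  # never pick the lncRNA itself
    neighbors = np.argsort(sims)[::-1][:k]
    weights = sims[neighbors]
    filled = assoc.copy()
    total = weights.sum()
    if total > 0:
        filled[new_idx] = weights @ assoc[neighbors] / total
    return filled


# Toy data (illustrative only): 3 lncRNAs x 3 diseases; lncRNA 2 is "new".
A = np.array([[1, 0, 1],
              [1, 0, 0],
              [0, 0, 0]])
# Hypothetical sequence similarities, e.g. normalized alignment scores.
S = np.array([[1.0, 0.8, 0.6],
              [0.8, 1.0, 0.2],
              [0.6, 0.2, 1.0]])

K = gip_kernel_similarity(A)
A_prefilled = knn_prefill(A, S, new_idx=2, k=2)
```

In this toy setup the prefilled matrix, rather than the original one with an all-zero row, would be the input to the matrix completion step, which gives a new lncRNA a non-trivial starting profile.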