iDNA6mA-Rice: a computational tool for detecting N6-methyladenine sites in rice

2019 
DNA N6-methyladenine (6mA) is a prevalent DNA modification that is involved in a variety of biological processes. Accurate genome-wide identification of 6mA sites is invaluable for better understanding its biological functions. Because experimental methods for detecting 6mA in eukaryotic genomes are labor-intensive and expensive, computational methods for identifying 6mA genome-wide are urgently needed, especially for plants. With this in mind, the current study was devoted to constructing a machine learning-based method to predict 6mA sites in the rice genome. We first used mono-nucleotide binary encoding to formulate the positive and negative samples. A Random Forest classifier was then used to identify 6mA sites in the rice genome. In five-fold cross-validation, the proposed method achieved an area under the receiver operating characteristic curve (AUC) of 0.964 with an overall accuracy of 0.917. An independent dataset test was also carried out to evaluate the generalization ability of the method. It yielded an AUC of 0.981, suggesting that the proposed method predicts 6mA sites in rice well. To make 6mA site prediction convenient, we built a web server based on the proposed method, called iDNA6mA-Rice, which is freely accessible at http://lin-group.cn/server/iDNA6mA-Rice.
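The mono-nucleotide binary encoding mentioned above can be illustrated with a minimal sketch. It assumes that each nucleotide is mapped to a four-bit one-hot vector and that the vectors are concatenated along the sequence. The bit order and the names `NUCLEOTIDE_CODES`, `encode_sequence`, and `encode_samples` are hypothetical, chosen for illustration, and are not taken from the paper.

```python
# Assumed one-hot mapping; the bit order is illustrative, not taken from the paper.
NUCLEOTIDE_CODES = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "T": (0, 0, 0, 1),
}


def encode_sequence(sequence):
    """Return a flat 0/1 feature vector with four entries per nucleotide."""
    features = []
    for base in sequence.upper():
        if base not in NUCLEOTIDE_CODES:
            # Ambiguous bases such as N are rejected in this sketch.
            raise ValueError(f"unsupported nucleotide: {base!r}")
        features.extend(NUCLEOTIDE_CODES[base])
    return features


def encode_samples(sequences):
    """Encode equal-length sequences into a list of feature vectors."""
    lengths = {len(s) for s in sequences}
    if len(lengths) > 1:
        raise ValueError("all sequences must have the same length")
    return [encode_sequence(s) for s in sequences]
```

Under these assumptions a sequence of length n becomes a vector of 4n binary features. Such vectors, together with positive and negative labels, could then be passed to a Random Forest implementation such as scikit-learn's `RandomForestClassifier`; whether the authors used that library is not stated here.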