Predicting Carbonylation Sites of Human Proteins with a New Max-Significance and Min-Redundancy Feature Selection Criterion

2018 
Protein carbonylation is a typical oxidative stress-induced post-translational modification (PTM), which has an important role in proteasome activity and cellular dysfunctions. Mass spectrometry is one of the most common ways to analyze protein carbonyl levels and identify exact modification sites. It has been found that four types of amino acid residues, namely lysine (K), arginine (R), threonine (T) and proline (P), are prone to carbonylation. However, such experimental approaches are not suitable for batch processing of proteins, and the predictive power of existing bioinformatics tools is still weak. In this paper, an improved method with a new max-significance and min-redundancy (MSAMR) feature selection criterion is proposed to predict human carbonylation sites. Using 10-fold cross-validation, the method achieves total accuracies of 87.43%, 86.83%, 86.56% and 88.10% for K, R, T and P carbonylation site prediction, respectively. The different kinds of features involved in the method, especially a new kind of customized amino acid composition (CAAC) feature, are analyzed, and the performance of the MSAMR feature selection criterion is discussed. Furthermore, a software tool, CarSPred2.0, has been released for human carbonylation site prediction. All datasets and the software are available at https://github.com/lhqxinghun/bioinformatics/tree/master/CarSPred-2.0/.
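To make the prediction setting concrete, the following minimal sketch enumerates candidate sites (every K, R, T or P residue) in a protein sequence and computes a plain amino acid composition vector for the window around each site. It is an illustration under stated assumptions, not the method of the paper: the function names, the window half-width and the padding character are hypothetical, and the composition shown is the standard one, not the customized amino acid composition (CAAC) features, whose definition is not given in this abstract.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard residues
CANDIDATE_RESIDUES = "KRTP"            # residue types prone to carbonylation


def candidate_windows(sequence, half_width=3, pad="X"):
    """Yield (1-based position, residue, window) for every K, R, T or P.

    half_width and pad are illustrative assumptions, not values from the paper.
    """
    padded = pad * half_width + sequence + pad * half_width
    for i, residue in enumerate(sequence):
        if residue in CANDIDATE_RESIDUES:
            # The window is centred on the candidate residue.
            yield i + 1, residue, padded[i:i + 2 * half_width + 1]


def amino_acid_composition(window):
    """Return the fraction of each standard amino acid in the window.

    Padding and non-standard characters are ignored in the denominator.
    """
    counts = Counter(window)
    total = sum(counts[a] for a in AMINO_ACIDS)
    return [counts[a] / total if total else 0.0 for a in AMINO_ACIDS]


if __name__ == "__main__":
    for position, residue, window in candidate_windows("MKTAP", half_width=1):
        print(position, residue, window, amino_acid_composition(window))
```

Each window would then be turned into a feature vector and passed to feature selection and a classifier; those steps are omitted here because the abstract does not specify them.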