The identification of variable-length coevolutionary patterns for predicting HIV-1 protease cleavage sites

2020 
The substrate specificity of human immunodeficiency virus 1 (HIV-1) protease plays an essential role in designing HIV-1 inhibitors for therapeutic purposes. Hence, to predict HIV-1 protease cleavage sites, a variety of computational algorithms have been developed that exploit the homogeneous information in substrate sequences. However, few of them can fully exploit such information, as they cannot identify variable-length coevolutionary patterns. To overcome this limitation, we propose a novel algorithm that identifies variable-length coevolutionary patterns. Based on these patterns, we compose a feature vector for each substrate and train an SVM classifier to predict HIV-1 protease cleavage sites. Experimental results show that the use of variable-length coevolutionary patterns can improve prediction performance in terms of AUC and PR-AUC.
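As a rough illustration of the pipeline described above, the following sketch turns a set of variable-length patterns into a binary feature vector for a substrate and computes a rank-based AUC from classifier scores. Everything specific here is an assumption made for illustration and is not taken from the paper: the representation of a pattern as a tuple of (position, residue) pairs, the function names `matches`, `feature_vector`, and `roc_auc`, and the toy substrates and patterns. The pattern-mining algorithm and the SVM training step are not reproduced; any classifier that outputs a score per substrate could be plugged in before the AUC step.

```python
from typing import List, Sequence, Tuple

# Hypothetical representation (an assumption, not the paper's definition):
# a pattern is a tuple of (position, residue) pairs, so patterns may
# involve any number of positions, i.e. they have variable length.
Pattern = Tuple[Tuple[int, str], ...]


def matches(substrate: str, pattern: Pattern) -> bool:
    """Return True if every (position, residue) pair of the pattern
    is present in the substrate."""
    return all(
        pos < len(substrate) and substrate[pos] == residue
        for pos, residue in pattern
    )


def feature_vector(substrate: str, patterns: Sequence[Pattern]) -> List[int]:
    """One binary feature per pattern: 1 if the substrate matches it."""
    return [1 if matches(substrate, p) else 0 for p in patterns]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC: the fraction of (positive, negative) pairs in which
    the positive example receives the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy patterns of length 2, 1, and 3 (made up for illustration).
    toy_patterns: List[Pattern] = [
        ((3, "Y"), (4, "P")),
        ((0, "A"),),
        ((1, "Q"), (5, "I"), (7, "Q")),
    ]
    # Toy octamer substrates (illustrative strings, not a real dataset).
    for substrate in ["SQNYPIVQ", "AAAAAAAA"]:
        print(substrate, feature_vector(substrate, toy_patterns))
```

The binary encoding is only one possible choice; weighted or count-based features would fit the same interface, since the classifier only sees the resulting vectors.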