2D RNA-QSAR: assigning ACC oxidase family membership with stochastic molecular descriptors; isolation and prediction of a sequence from Psidium guajava L.

2005 
Abstract: Quantitative structure–activity relationship (QSAR) techniques developed for small molecules could, in principle, be applied to nucleic acids. Unfortunately, almost all molecular descriptors are more successful at encoding branching information than sequences, and/or cannot be back-projected. One way to scale the QSAR problem up to RNA may be to transform sequences into secondary structures first. Our group has used Markovian negentropies as molecular descriptors for drug design, with preliminary results in bioinformatics [Bioinformatics 2003, 19, 2079]. However, QSAR studies on RNA molecules have not been described to date. Here, novel Markovian negentropies are introduced as molecular descriptors for 2D-RNA structures, and an RNA-QSAR study of the ACC proteins from different plants is carried out. The QSAR model recognizes 19/20 sequences (95.0%) within the ACC family and 12/17 (70.6%) of the control group sequences. The model has a high Matthews correlation coefficient (C = 0.68). Overall average accuracies in cross-validation were 14 out of 15 for ACC sequences (93.3%) and 10 out of 13 for control sequences (76.9%). Finally, ACC oxidase family membership was assigned to a new sequence isolated for the first time in this work from Psidium guajava L. A back-projection map for this sequence identifies the left stem (40%) and the main stem (45%) as highly important substructures. Results of an nBLAST experiment are consistent with this finding and indicate a high conservation score (>70) for the left stem and the main stem, whereas the major loop, right stem, cap, and right half of the major loop were hardly conserved.
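The classification figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes that C is the standard Matthews correlation coefficient and that the training counts map onto a confusion matrix as follows: 19 ACC sequences recognized (true positives), 1 missed (false negative), 12 control sequences recognized (true negatives), and 5 misassigned (false positives). The function name `matthews_cc` and the variable names are illustrative only and do not come from the paper.

```python
from math import sqrt


def matthews_cc(tp: int, fn: int, tn: int, fp: int) -> float:
    """Matthews correlation coefficient for a 2x2 confusion matrix."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Training-set counts as read from the abstract (assumed mapping).
tp, fn = 19, 1   # ACC family: 19 of 20 recognized
tn, fp = 12, 5   # control group: 12 of 17 recognized

sensitivity = tp / (tp + fn)                 # fraction of ACC sequences recognized
specificity = tn / (tn + fp)                 # fraction of controls recognized
accuracy = (tp + tn) / (tp + fn + tn + fp)   # overall fraction correct
c = matthews_cc(tp, fn, tn, fp)

print(f"sensitivity = {sensitivity:.1%}")
print(f"specificity = {specificity:.1%}")
print(f"accuracy    = {accuracy:.1%}")
print(f"C           = {c:.2f}")
```

Under these assumptions the counts reproduce the reported 95.0%, 70.6%, and C = 0.68, which supports reading C as a correlation coefficient between predicted and observed class membership.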