Evolutionary analysis of base-pairing interactions in DNA and RNA secondary structures

2018 
Pairs of nucleotides within biologically functional nucleic acid secondary structures often exhibit evidence of coevolution that is consistent with the maintenance of canonical base-pairing. MESSI is a sequence evolution model that infers substitution rates associated with base-paired sites in alignments of DNA or RNA sequences. MESSI can estimate these rates whilst simultaneously accounting for the uncertainty associated with an unknown RNA or DNA secondary structure shared across an alignment of sequences. Moreover, the unknown structure can be predicted, or a base-pairing probability matrix calculated. MESSI optionally leverages CUDA GPU parallelism to accelerate inference. MESSI was used to infer coevolution rates associated with GC, AU (AT in DNA), and GU (GT in DNA) pairs in non-coding RNA alignments and in single-stranded RNA and DNA virus alignments. Inferred rates of GU pair coevolution were found to be higher at base-paired sites in single-stranded RNA viruses and non-coding RNAs than those of GT pairs in single-stranded DNA viruses, suggesting that GT pairs do not stabilise DNA secondary structures to the same extent as GU pairs stabilise RNA secondary structures. The relative coevolution rates associated with GC, AU, and GU pairs were largely consistent with their relative chemical base-pairing stabilities (GC base-pairs being more stable than AU base-pairs, and AU base-pairs being more stable than GU base-pairs). Additionally, MESSI estimates the degrees of coevolution at individual base-paired sites in an alignment. These estimates were computed for a SHAPE-MaP-determined HIV-1 NL4-3 RNA secondary structure and two corresponding alignments. MESSI's estimates of coevolution were significantly more strongly correlated with experimentally determined SHAPE-MaP pairing scores than were three non-evolutionary measures of base-pairing covariation.
Finally, to assist researchers in prioritising substructures with potential biological functionality, MESSI automatically identifies substructures and ranks them by degrees of coevolution at base-paired sites within them. Such a ranking was created for an HIV-1 subtype B alignment, revealing an excess of top-ranking substructures that have been previously identified in the literature as having structure-related functional importance, and a number of top-ranking structures that have not yet been characterised.
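The ranking step described above can be illustrated with a minimal sketch. It is not MESSI's implementation. It assumes a pseudoknot-free structure in dot-bracket notation, treats each outermost base-pair and everything nested inside it as one substructure, and ranks substructures by the mean of a per-pair coevolution score. The function names, the substructure definition, and the score values are all hypothetical and chosen only for illustration.

```python
from statistics import mean


def parse_pairs(dot_bracket):
    """Return sorted (i, j) base-pair index tuples from a dot-bracket string."""
    stack, pairs = [], []
    for pos, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            pairs.append((stack.pop(), pos))
    return sorted(pairs)


def top_level_substructures(pairs):
    """Group pairs into substructures, each delimited by an outermost pair.

    Assumption: `pairs` is sorted by opening position and contains no
    pseudoknots, so a pair opening before the current end is nested inside it.
    """
    subs = []
    for i, j in pairs:
        if subs and i < subs[-1]["end"]:
            subs[-1]["pairs"].append((i, j))
        else:
            subs.append({"start": i, "end": j, "pairs": [(i, j)]})
    return subs


def rank_substructures(dot_bracket, pair_scores):
    """Rank substructures by mean per-pair score, highest first."""
    subs = top_level_substructures(parse_pairs(dot_bracket))
    for sub in subs:
        sub["mean_score"] = mean(pair_scores[p] for p in sub["pairs"])
    return sorted(subs, key=lambda sub: sub["mean_score"], reverse=True)


# Toy structure with two hairpins; the scores are made up for illustration.
structure = "((..))..(())"
scores = {(0, 5): 1.0, (1, 4): 2.0, (8, 11): 5.0, (9, 10): 3.0}
ranking = rank_substructures(structure, scores)
for rank, sub in enumerate(ranking, start=1):
    print(rank, sub["start"], sub["end"], sub["mean_score"])
```

Averaging per-pair scores is only one possible summary. A median or a length-normalised sum would give a different ranking and may be preferable when substructures differ greatly in size.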