Accurate detection of convergent substitutions

2018 
In the history of life, some phenotypes have been acquired several times independently, in a process known as convergent evolution. Recently, many genome-scale studies have sought to identify nucleotides or amino acids that changed in a convergent manner when the convergent phenotypes themselves evolved. These efforts have had mixed results, probably because of differences in the detection methods and because of underlying conceptual differences about the definition of a convergent substitution. Some methods contend that substitutions are convergent only if they occur repeatedly towards the exact same state at a given nucleotide or amino acid position. Others are much looser in their requirements and define a convergent substitution as one that leads the site at which it occurs to favor a phylogeny in which species with the convergent phenotype group together. Here we define convergent substitutions as substitutions that occur on all branches where the phenotype changed and that correspond to a change in the type of amino acid preferred at this position. We implement the corresponding probabilistic model in new open-source software named PCOC. Using simulations, we show that it outperforms existing methods in both sensitivity and specificity. In particular, it outperforms competing methods whether there are few or many events of convergent evolution. We test it on a plant protein alignment where convergent evolution has been studied in detail and find that our method recovers many previously identified convergent substitutions and proposes credible new candidates.
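The definition used here can be illustrated with a toy, deterministic check. This is only a sketch and not the PCOC model, which is probabilistic: the four amino-acid classes, the function name `is_toy_convergent`, the branch labels, and the requirement that all convergent branches end in the same class are assumptions made for illustration.

```python
# Toy illustration only: a deterministic stand-in for a probabilistic model.
# The four amino-acid classes below are an assumption, not PCOC's profiles.
AA_CLASS = {}
for cls, residues in {
    "hydrophobic": "AVLIMFWC",
    "polar": "STNQYGP",
    "positive": "KRH",
    "negative": "DE",
}.items():
    for aa in residues:
        AA_CLASS[aa] = cls


def is_toy_convergent(branch_states, convergent_branches):
    """Return True if the site changes amino-acid class on every convergent
    branch and all those branches end in the same class.

    branch_states maps a (hypothetical) branch name to a pair
    (ancestral_amino_acid, descendant_amino_acid).
    """
    new_classes = set()
    for branch in convergent_branches:
        ancestral, descendant = branch_states[branch]
        if AA_CLASS[ancestral] == AA_CLASS[descendant]:
            return False  # the preferred class did not change on this branch
        new_classes.add(AA_CLASS[descendant])
    # Exactly one shared destination class (also rejects an empty branch list)
    return len(new_classes) == 1


# One alignment site, described branch by branch (made-up data).
site = {"b1": ("A", "D"), "b2": ("V", "E"), "b3": ("L", "I")}
print(is_toy_convergent(site, ["b1", "b2"]))
print(is_toy_convergent(site, ["b1", "b3"]))
```

In the first call the two branches end in different residues (D and E), so a method requiring the exact same state would reject the site, whereas a class-based criterion accepts it. In the second call, branch `b3` stays within the hydrophobic class, so the site is rejected.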