Modeling evolution of protein coding DNA sequences

2003 
We develop a new class of computationally feasible stochastic models for the statistical analysis of genetic sequence evolution and for inference of properties of the underlying substitution processes within a maximum likelihood framework. Existing models for the evolution of protein-coding sequences allow site-to-site variation in nonsynonymous substitution rates, but assume that the rate of synonymous substitutions is constant across all sites. The new models provide a rigorous statistical framework for testing the hypothesis of synonymous rate constancy, and enable a host of data exploration and analysis tools. For several indicative data sets, the constancy assumption is shown to be violated, and some possible explanations are given. We also present an algorithm for improving the efficiency of maximum likelihood evaluations, and discuss HyPhy, a user-friendly and publicly distributed software implementation of our methods.