Hidden patterns of codon usage bias across kingdoms

2018 
The genetic code encodes 20 amino acids using 64 nucleotide triplets, or codons. Eighteen of the 20 amino acids are encoded by multiple synonymous codons, which are used in organismal genomes in a biased fashion. Codon bias arises because evolutionary selection favours particular nucleotide sequences over others encoding the same amino acid sequence. Despite many existing hypotheses, there is currently no consensus on what the evolutionary drivers are. Using ideas from stochastic thermodynamics, we derive from first principles a mathematical model describing the statistics of codon usage bias and apply it to extensive genomic data. Our main conclusions include the following: (1) Codon usage cannot be explained solely by selection pressures acting on the genome-wide frequency of codons; pressures acting at the level of individual genes also contribute. (2) Codon usage is biased not only in the usage frequency of nucleotide triplets but also in how they are distributed across mRNAs. (3) A new model-based measure of codon usage bias, which extends existing measures by taking into account both codon frequency and codon distribution, reveals distinct, amino-acid-specific patterns of selection in different branches of the tree of life.
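
The counting behind the opening sentences can be sketched in a few lines of Python. This is an illustration only, not the model described in the abstract: it assumes the standard genetic code (written in the usual TCAG ordering), and the helper name `synonymous_codon_fractions` and the short example sequence are hypothetical. It covers only the frequency side of codon usage, not how codons are distributed across mRNAs.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code in TCAG order; "*" marks the three stop codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Group sense codons by the amino acid they encode.
SYNONYMOUS = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        SYNONYMOUS[aa].append(codon)

# Amino acids with more than one codon (all except Met and Trp).
DEGENERATE = sorted(aa for aa, codons in SYNONYMOUS.items() if len(codons) > 1)


def synonymous_codon_fractions(cds):
    """Hypothetical helper: for each amino acid present in an in-frame
    coding sequence, return the fraction of its occurrences contributed
    by each synonymous codon."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    # Count sense codons only; stops and unrecognised triplets are skipped.
    counts = Counter(c for c in codons if CODON_TABLE.get(c, "*") != "*")
    fractions = {}
    for aa, syn in SYNONYMOUS.items():
        total = sum(counts[c] for c in syn)
        if total:
            fractions[aa] = {c: counts[c] / total for c in syn}
    return fractions


if __name__ == "__main__":
    print(len(CODON_TABLE), "codons,", len(SYNONYMOUS), "amino acids,",
          len(DEGENERATE), "with synonymous codons")
    # Made-up sequence: four leucine codons, three of them CTG.
    print(synonymous_codon_fractions("CTGCTGCTGTTA")["L"])
```

Unequal fractions among the codons of one amino acid are the frequency bias the abstract refers to; a gene-level or distribution-level analysis would need per-gene counts rather than a single pooled sequence.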