Analyses of the 5-prime Ends of Escherichia coli ORFs

2021 
Sequence biases at the 5′ ends of coding sequences differ from those of the remainder of ORFs, reflecting differences in function. Internal sequence biases promote translational efficiency by several mechanisms, including matching codon usage to tRNA concentration. The early region, however, may also facilitate translational initiation, establishment of the reading frame, and polypeptide processing. Here we examine the beginnings of the ORFs of an Escherichia coli K12 reference genome. The results extend previous observations of A-richness to include an overabundance of the AAA triplet in all reading frames, consistent with the hypothesis that the beginnings of ORFs contribute to initiation site accessibility. The results are also consistent with the idea that the first two amino acids are under selection because they facilitate solvation of the amino-terminus at the end of the ribosomal exit channel. Moreover, serine is highly overrepresented as the second amino acid, possibly because it can facilitate removal of the terminal formylmethionine. Non-AUG initiation codons are known to be less efficient than AUG at directing initiation, presumably because of relatively weak base pairing to the initiator tRNA. However, non-AUG initiation codons are not followed by unusual 3′ nearest-neighbor codons. Moreover, the four NUG initiation codons do not differ in their propensity to frameshift in an assay known to be sensitive to base pair strength. Altogether, these data suggest that the 5′ ends of ORFs are under selection for several functions, and that initiation codon identity may not be critical beyond its role in initiation.
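As a rough illustration of the kind of tally described above, the following Python sketch counts a triplet in each of the three reading frames near the start of a set of ORFs and tabulates the second codon. It is not the authors' method: the function names, the 30-nucleotide window, and the toy sequences are all assumptions made for the example, and a real analysis would read the ORFs from an annotated E. coli K12 genome and compare the counts against a suitable background.

```python
from collections import Counter


def triplet_counts_by_frame(orfs, triplet="AAA", window=30):
    """Count `triplet` per reading frame in the early region of each ORF.

    Hypothetical helper. Frame 0 is the frame set by the initiation codon.
    The `window` (nucleotides scanned after the initiation codon) is an
    arbitrary choice for illustration, not a value from the paper.
    `triplet` is expected in the DNA alphabet (A, C, G, T).
    """
    counts = [0, 0, 0]
    for orf in orfs:
        # Normalize to upper-case DNA and skip the initiation codon itself.
        region = orf.upper().replace("U", "T")[3:3 + window]
        for i in range(len(region) - 2):
            if region[i:i + 3] == triplet:
                counts[i % 3] += 1
    return counts


def second_codon_counts(orfs):
    """Tally the codon immediately following the initiation codon."""
    return Counter(
        orf.upper().replace("U", "T")[3:6] for orf in orfs if len(orf) >= 6
    )


# Made-up toy sequences, not E. coli data.
toy_orfs = ["ATGAAAAAACGT", "GTGTCTAAAGGC", "ATGAGCAAATAA"]

frame_counts = triplet_counts_by_frame(toy_orfs)
second_codons = second_codon_counts(toy_orfs)
print(frame_counts)
print(second_codons)
```

Overlapping occurrences are counted separately, so a run such as AAAAAA contributes to all three frames; whether that is the appropriate convention depends on the question being asked.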