Persistent biases in the amino acid composition of prokaryotic proteins

2006 
Summary Correspondence analysis of 28 proteomes selected to span the entire realm of prokaryotes revealed universal biases in the proteins’ amino acid distribution. Integral Inner Membrane Proteins always form an individual cluster, which can then be used to predict protein localisation in unknown proteomes, independently of the organism’s biotope or kingdom. Orphan proteins are consistently rich in aromatic residues. Another bias is also ubiquitous: the amino acid composition is driven by the G+C content of the first codon position. An unexpected bias is driven, in many proteomes, by the AAN box of the genetic code, suggesting some functional biochemical relationship between asparagine and lysine. Less-significant biases are driven by the rare amino acids, cysteine and tryptophan. Some allow identification of species-specific functions or localisation such as surface or exported proteins. Errors in genome annotations are also revealed by correspondence analysis, making it useful for quality control and correction. BioEssays 28:726–738, 2006. © 2006 Wiley Periodicals, Inc.
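To make the method named in the summary concrete, the sketch below applies correspondence analysis to a toy table of amino acid counts (one row per protein, one column per residue) and extracts the first axis. It is an illustration under stated assumptions, not the authors' pipeline: the function names `composition_table` and `first_ca_axis` and the four short sequences are hypothetical, and a plain power iteration stands in for the full singular value decomposition that a real analysis would normally use.

```python
from collections import Counter
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition_table(proteins):
    """One row per protein, one column per amino acid (raw counts)."""
    table = []
    for seq in proteins:
        counts = Counter(seq.upper())
        table.append([counts[aa] for aa in AMINO_ACIDS])
    return table


def first_ca_axis(table, iterations=500):
    """Return (row_scores, inertia) for the first correspondence-analysis axis."""
    total = sum(sum(row) for row in table)
    if total == 0:
        raise ValueError("the count table is empty")
    row_mass = [sum(row) / total for row in table]
    col_mass = [sum(col) / total for col in zip(*table)]

    # Standardized residuals: (observed - expected) / sqrt(expected).
    resid = []
    for row, r in zip(table, row_mass):
        line = []
        for n, c in zip(row, col_mass):
            expected = r * c
            line.append((n / total - expected) / math.sqrt(expected)
                        if expected > 0 else 0.0)
        resid.append(line)

    def times_resid(vec):
        """Multiply the residual matrix by a column-space vector."""
        return [sum(s * x for s, x in zip(line, vec)) for line in resid]

    # Power iteration on resid^T resid (assumption: a simple stand-in for SVD).
    v = [float(j + 1) for j in range(len(col_mass))]
    for _ in range(iterations):
        sv = times_resid(v)
        w = [sum(line[j] * sv[i] for i, line in enumerate(resid))
             for j in range(len(v))]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # all rows share the same composition
            return [0.0] * len(table), 0.0
        v = [x / norm for x in w]

    sv = times_resid(v)
    inertia = sum(x * x for x in sv)  # squared leading singular value
    # Principal coordinates of the rows (proteins) on the first axis.
    scores = [x / math.sqrt(r) if r > 0 else 0.0 for x, r in zip(sv, row_mass)]
    return scores, inertia


# Hypothetical toy input: two hydrophobic and two charged sequences.
proteins = [
    "LLLLIIIVVVFFAAGL",
    "LIVLFAVLLIIGAVFL",
    "KKEEDDRRKEDKRNQS",
    "EEKKDRDKENQKRDES",
]
scores, inertia = first_ca_axis(composition_table(proteins))
```

On this toy input the two hydrophobic sequences receive scores of one sign and the two charged sequences the opposite sign, which is the kind of separation along an axis that the summary describes for Integral Inner Membrane Proteins. The sign of an axis is arbitrary, so only the grouping is meaningful.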