The Compressed Vocabulary of the Proteins of Archaea

2017 
The origin and evolution of molecular functions hold the key to the emergence of modern biochemistry and cellular organization. Here we explore the existence of a growing vocabulary in the proteins and molecular functions of Archaea. A genomic census of structural domains and its mappings to Gene Ontology terms provide the raw data for the search for meaningful patterns and processes that drive molecular change. We present evidence supporting the existence of statistical laws of language and socioeconomic-linked diffusion-of-innovation models intricately embedded in both protein structure and domain organization. Patterns of origin and diversification of organismal repertoires of proteins (proteomes) and functions (functionomes) reveal that their makeup depends on trade-off solutions between three principles that favor organismal and molecular persistence: economy, flexibility and robustness. We find that the microbes of Archaea and Bacteria maximize economy, while eukaryotic organisms maximize flexibility and robustness. In the process, archaeal organisms engage in extreme semantic and pragmatic compression of their messages in response to evolutionary constraints, which were probably imposed historically by microbial lifestyle and harsh environments. Archaea preserves an economy-driven primordial vocabulary that is highly homogeneous and is the most ancient of the cellular world.