Computational Prediction of De Novo Emerged Protein-Coding Genes

2019 
: De novo genes, that is, protein-coding genes originating from previously noncoding sequence, have gone from being considered impossibly unlikely to being recognized as an important source of genetic novelty in eukaryotic genomes. It is clear that de novo gene evolution is a rare but consistent feature of eukaryotic genomes, being detected in every genome studied. However, different studies often use different computational methods, and the numbers and identities of the detected genes vary greatly. Here we present a coherent protocol for the computational identification of de novo genes by comparative genomics. The method described uses homology searches, identification of syntenic regions, and ancestral sequence reconstruction to produce high-confidence candidates with robust evidence of de novo emergence. It is designed to be easily applicable given the basic knowledge of bioinformatic tools and scalable so that it can be applied on large and small datasets.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    32
    References
    10
    Citations
    NaN
    KQI
    []