ESUCA: a pipeline for genome-wide identification of upstream open reading frames with evolutionarily conserved sequences and determination of the taxonomic range of their conservation

2019 
Background: Some upstream open reading frames (uORFs) in the 5′ leaders of eukaryotic mRNAs encode evolutionarily conserved functional peptides, such as cis-acting regulatory peptides that control main ORF (mORF) translation. To comprehensively identify uORFs with conserved peptide sequences (CPuORFs), we previously developed the BAIUCAS pipeline, in which uORF sequences of a chosen species are compared with those of any other species for which transcript sequence databases are available. However, further selection is needed to identify CPuORFs encoding functional peptides. The purpose of this study is to develop a novel pipeline that efficiently identifies CPuORFs likely to encode functional peptides.

Results: Here, we present the ESUCA pipeline. In addition to the functions of BAIUCAS, ESUCA has the following new functions: 1) to identify CPuORFs likely to be conserved because of the functions of their encoded small peptides, not because they encode parts of the mORF-encoded proteins; 2) to systematically calculate Ka/Ks ratios to assess whether uORF sequences are conserved at the nucleotide or amino acid level; 3) to determine the taxonomic range of uORF sequence conservation. We applied ESUCA to five plant genomes and identified 88 novel CPuORF families. Using a transient expression assay, we examined the effects of eight CPuORFs conserved in various taxonomic ranges on mORF translation. Three of seven CPuORFs conserved across diverse eudicots showed sequence-dependent regulatory effects.

Conclusions: This study demonstrates that ESUCA, when applied to many species, can efficiently identify many CPuORFs conserved in various taxonomic ranges and is highly useful for selecting CPuORFs likely to encode functional peptides.
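
To make the notion of a uORF in a 5′ leader concrete, the following is a minimal Python sketch of how candidate uORFs could be extracted from a leader sequence. It is not the ESUCA implementation: the function name `find_uorfs`, the `min_codons` parameter, and the rule that a uORF must start with ATG and terminate at an in-frame stop codon inside the leader are all assumptions made for illustration.

```python
# Illustrative sketch only; not the ESUCA code. Names and rules are assumptions.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def find_uorfs(leader, min_codons=2):
    """Return (start, end, sequence) for each ATG-initiated ORF that ends
    at an in-frame stop codon within the 5' leader.

    start is 0-based, end is exclusive and includes the stop codon.
    min_codons counts sense codons (the ATG included, the stop excluded).
    """
    seq = leader.upper().replace("U", "T")  # accept RNA or DNA input
    uorfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by this ATG.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                n_codons = (pos - start) // 3
                if n_codons >= min_codons:
                    uorfs.append((start, pos + 3, seq[start:pos + 3]))
                break  # the first in-frame stop ends this ORF
    return uorfs


if __name__ == "__main__":
    # Hypothetical leader sequence, made up for this example.
    for start, end, orf in find_uorfs("CCATGGCTTAAGGATGTGA"):
        print(start, end, orf)
```

Candidates found this way would still need the downstream steps the abstract describes (homology search, Ka/Ks calculation, and determination of the taxonomic range) before they could be called CPuORFs.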