Genome-Wide Search for Translated Upstream Open Reading Frames in Arabidopsis thaliana

2016 
Upstream open reading frames (uORFs) are open reading frames that occur within the 5' UTR of an mRNA. uORFs have been found in many organisms, where they play an important role in gene regulation, cell development, and various metabolic processes. Translated uORFs are believed to reduce the translational efficiency of the main coding region. However, only a few uORFs have been experimentally characterized. In this paper, we use ribosome footprinting together with a semi-supervised approach based on stacking classification models to identify translated uORFs in Arabidopsis thaliana. Our approach identified 5360 potentially translated uORFs in 2051 genes. GO terms enriched in genes with translated uORFs include catalytic activity, binding, transferase activity, phosphotransferase activity, kinase activity, and transcription regulator activity. The reported uORFs occur with a higher frequency in multi-isoform genes, and some uORFs are affected by alternative transcript start sites or alternative splicing events. Association rule mining revealed sequence features associated with the translation status of the uORFs. We hypothesize that uORF translation is a complex process that might be regulated by multiple factors. The identified uORFs are available online at https://www.dropbox.com/sh/zdutupedxafhly8/AABFsdNR5zDfiozB7B4igFcja?dl=0. This paper is the extended version of our research presented at ISBRA 2015.
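To make the definition of a uORF concrete, the following minimal sketch enumerates candidate uORFs in a single transcript. It is not the pipeline described above, which relies on ribosome footprinting and stacked classifiers to decide which candidates are translated. The function name `find_candidate_uorfs`, its parameters, and the restriction to ATG start codons are assumptions made for illustration only.

```python
# Illustrative sketch only: lists candidate uORFs by sequence, without
# any evidence of translation. Names and defaults are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_candidate_uorfs(transcript, utr5_len, min_codons=2):
    """Return (start, end) pairs, 0-based and end-exclusive, for ORFs
    whose ATG start codon lies entirely within the 5' UTR.

    transcript : mRNA sequence written as DNA (A, C, G, T)
    utr5_len   : length of the 5' UTR, i.e. index of the main start codon
    min_codons : minimum ORF length in codons, counting start and stop
    """
    seq = transcript.upper()
    uorfs = []
    # The start codon must fit inside the 5' UTR: start + 3 <= utr5_len.
    for start in range(utr5_len - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk downstream codon by codon until the first in-frame stop.
        # The stop may lie past the UTR (a uORF overlapping the main CDS).
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                n_codons = (pos + 3 - start) // 3
                if n_codons >= min_codons:
                    uorfs.append((start, pos + 3))
                break
    return uorfs


if __name__ == "__main__":
    # Toy transcript: a 15-nt 5' UTR followed by a short main ORF.
    toy = "GGATGAAATAAGGCCATGGCTTAA"
    print(find_candidate_uorfs(toy, utr5_len=15))
```

Candidates found this way would still need experimental or model-based support, such as ribosome footprint coverage, before being called translated.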