TranSuite: a software suite for accurate translation and characterization of transcripts

2020 
Protein translation programs often select the longest open reading frame (ORF) in a transcript, leading to numerous inaccurate and mis-annotated ORFs in databases. Unproductive transcript isoforms containing premature termination codons (PTCs) are potential substrates for nonsense-mediated decay (NMD). These transcripts often contain truncated ORFs but are incorrectly annotated because a long ORF beginning at an AUG downstream of the PTC is selected, even though the transcript contains the authentic translation start AUG. In gene expression and alternative splicing analyses, it is important to identify transcript isoforms that code for different protein variants and to distinguish these from potential NMD substrates. Here, we present TranSuite, a pipeline of bioinformatics tools that addresses these challenges by performing accurate translations, characterizing alternative ORFs and identifying NMD and other features of transcripts in newly assembled and existing transcriptomes. Directly comparing ORFs defined by TranSuite and TransDecoder for the Arabidopsis transcriptome AtRTD2 identified ORF mis-calling by TransDecoder in over 16k (27%) of transcripts.
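
The mis-annotation described above can be illustrated with a toy example. The following is a minimal sketch, not TranSuite's implementation: the function names, the toy sequence, and the assumption that the position of the authentic start codon is already known are all hypothetical, and the sequence uses the DNA alphabet (ATG for AUG). It contrasts a longest-ORF rule with translation from a fixed start codon on a transcript that carries a PTC.

```python
# Illustrative sketch only; names and the toy transcript are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_from(seq, start):
    """Return (start, end) of the ORF opening at `start`, reading in frame
    up to and including the first stop codon; None if no stop is found."""
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return (start, i + 3)
    return None


def longest_orf(seq):
    """Longest-ORF rule: try every ATG and keep the longest complete ORF."""
    best = None
    for start in range(len(seq) - 2):
        if seq[start:start + 3] == "ATG":
            orf = orf_from(seq, start)
            if orf and (best is None or orf[1] - orf[0] > best[1] - best[0]):
                best = orf
    return best


def orf_from_fixed_start(seq, start):
    """Fixed-start rule: translate only from an ATG assumed to be authentic."""
    if seq[start:start + 3] != "ATG":
        raise ValueError("no ATG at the given start position")
    return orf_from(seq, start)


# Toy transcript: assumed authentic ATG at index 2, a premature stop (TAA)
# two codons later, then a downstream ATG at index 12 opening a longer ORF.
seq = "CC" + "ATG" + "GCT" + "TAA" + "C" + "ATG" + "GCT" * 5 + "TGA" + "CC"

naive = longest_orf(seq)                  # (12, 33): starts downstream of the PTC
authentic = orf_from_fixed_start(seq, 2)  # (2, 11): truncated ORF ending at the PTC
```

In this toy case the longest-ORF rule reports the 21-nucleotide ORF that begins after the PTC, while translating from the assumed authentic start yields the short, truncated ORF that would mark the isoform as a potential NMD substrate.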