Screening thousands of transcribed coding and non-coding regions reveals sequence determinants of RNA polymerase II elongation potential

2021 
Abstract: Organismal growth and development rely on RNA Polymerase II (RNAPII) synthesizing the appropriate repertoire of messenger RNAs (mRNAs) from protein-coding genes. Productive elongation of full-length transcripts is essential for mRNA function; however, what determines whether an engaged RNAPII molecule will terminate prematurely or transcribe processively remains poorly understood. Notably, despite a common process for transcription initiation across RNAPII-synthesized RNAs [1], RNAPII is highly susceptible to termination when transcribing non-coding RNAs such as upstream antisense RNAs (uaRNAs) and enhancer RNAs (eRNAs) [2], suggesting that differences arise during RNAPII elongation. To investigate the impact of transcribed sequence on elongation potential, we developed a method to screen the effects of thousands of INtegrated Sequences on Expression of RNA and Translation using high-throughput sequencing (INSERT-seq). We found that higher AT content in uaRNAs and eRNAs, rather than specific sequence motifs, underlies the propensity for RNAPII termination on these transcripts. Further, we demonstrate that 5’ splice sites exert both splicing-dependent and autonomous, splicing-independent stimulation of transcription, even in the absence of polyadenylation signals. Together, our results reveal a potent role for transcribed sequence in dictating gene output at mRNA and non-coding RNA loci, and demonstrate the power of INSERT-seq for illuminating these contributions.