    Plant defence responses against pathogen infection are crucial to plant survival. Plant immunity is tightly regulated both transcriptionally and posttranscriptionally. Once transcribed, target gene RNA must be processed prior to translation. This processing includes polyadenylation, 5' capping, editing, splicing, and mRNA export. RNA-binding proteins (RBPs) have been implicated at each level of RNA processing. Previous research has primarily focused on structural RNA-binding proteins of yeast and mammals; however, more recent work has characterized a number of plant RBPs and revealed their roles in plant immune responses. This paper provides an update on the known functions of RBPs in plant immune response regulation. Future in-depth analysis of RBPs and other related players will unveil the sophisticated regulatory mechanisms of RNA processing during plant immune responses.
    Plant Immunity
    Citations (44)
    RNA-binding proteins (RBPs) lie at the center of post-transcriptional regulation and protein synthesis, adding complexity to the RNA life cycle. RBPs also participate in the formation of membrane-less organelles (MLOs) by undergoing liquid-liquid phase separation (LLPS), the process that underlies MLO formation in eukaryotic cells. RBP-triggered LLPS mainly relies on the interaction between RNA recognition motifs (RRMs) and capped mRNA transcripts and on heterotypic multivalent interactions between intrinsically disordered regions (IDRs) or prion-like domains (PLDs). In turn, the aggregation of RBPs also depends on LLPS. RBP-driven LLPS is involved in many intracellular processes (regulation of translation, mRNA storage and stabilization, and cell signaling) and is central to cellular physiology and pathology. Thus, it is essential to understand the potential roles and investigate the internal mechanism of RBP-triggered LLPS. In this review, we primarily expound on our current understanding of RBPs and the LLPS they trigger, and summarize their physiological and pathological functions. Furthermore, we summarize the potential roles of RBP-triggered LLPS as a novel therapeutic mechanism for human diseases. This review will help understand the mechanisms underlying LLPS and downstream regulation of RBPs and provide insights into the pathogenesis and therapy of complex diseases.
    Stress granule
    Intrinsically Disordered Proteins
    Translational regulation
    Citations (10)
    RNA binding proteins induce RNA life in IDD and OA. RBPs play a central role in PTGR, including transcription, splicing, and export. The effective function of RBPs is critical for coordinating various post-transcriptional events, and dysfunction of RBPs can lead to apoptosis, extracellular matrix disruption, angiogenesis, and inflammation, which cause IDD and OA.
    Transcription
    Citations (0)
    Split End (SPEN) family proteins have three members: SPEN, RBM15, and RBM15B. SPEN family proteins contain three conserved RNA recognition motifs in the N-terminal region and an SPOC domain in the C-terminal region. RBM15 is fused to MKL1 in the chromosome translocation t(1;22), which causes childhood acute megakaryoblastic leukemia (AMKL). Haploinsufficiency of RBM15 in AMKL indicates that RBM15 is a tumor suppressor. Both SPEN and RBM15 are mutated in a variety of cancer types, implying that they are tumor suppressors. SPEN and RBM15 are required for the development of multiple organs, including hematopoiesis, partly via regulating the NOTCH signaling pathway, as well as the WNT signaling pathway, in species ranging from Drosophila to mammals. Besides transcriptional regulation, RBM15 regulates RNA export and RNA splicing. In this review, we summarize data in the literature on how the members of the SPEN family regulate gene expression at the transcription and RNA processing steps. The crosstalk between epigenetic regulation and RNA metabolism is increasingly appreciated in understanding tumorigenesis. Studying the SPEN family of RNA binding proteins will create new perspectives for cancer therapy.
    Nuclear export signal
    RNA methylation
    Citations (9)
    Emerging studies support that RNA-binding proteins (RBPs) play critical roles in human biology and pathogenesis. RBPs are essential players in RNA processing and metabolism, including pre-mRNA splicing, polyadenylation, transport, surveillance, mRNA localization, mRNA stability control, translational control and editing of various types of RNAs. Aberrant expression of and mutations in RBP genes affect various steps of RNA processing, altering target gene function. RBPs have been associated with various diseases, including neurological diseases. Here, we mainly focus on selected RNA-binding proteins including Nova-1/Nova-2, HuR/HuB/HuC/HuD, TDP-43, Fus, Rbfox1/Rbfox2, QKI and FMRP, discussing their function and roles in human diseases.
    Citations (74)
    It is challenging for RNA processing machineries to select exons within long intronic regions. We find that intronic LINE repeat sequences (LINEs) contribute to this selection by recruiting dozens of RNA-binding proteins (RBPs). This includes MATR3, which promotes binding of PTBP1 to multivalent binding sites in LINEs. Both RBPs repress splicing and 3’ end processing within and around LINEs, as demonstrated in cultured human cells and mouse brain. Notably, repressive RBPs preferentially bind to evolutionarily young LINEs, which are confined to deep intronic regions. These RBPs insulate both LINEs and surrounding regions from RNA processing. Upon evolutionary divergence, gradual loss of insulation diversifies the roles of LINEs. Older LINEs are located closer to exons, are a common source of tissue-specific exons, and increasingly bind to RBPs that enhance RNA processing. Thus, LINEs are hubs for assembly of repressive RBPs, and contribute to evolution of new, lineage-specific transcripts in mammals.
    Lineage (genetic)
    Citations (2)
    Abstract

    Circular RNAs (circRNAs) represent an abundant and conserved entity of non-coding RNAs; however, the principles of biogenesis are currently not fully understood. Here, we identify two factors, splicing factor proline/glutamine rich (SFPQ) and non-POU domain-containing octamer-binding protein (NONO), to be enriched around circRNA loci. We observe a subclass of circRNAs, coined DALI circRNAs, with distal inverted Alu elements and long flanking introns to be highly deregulated upon SFPQ knockdown. Moreover, SFPQ depletion leads to increased intron retention with concomitant induction of cryptic splicing, premature transcription termination, and polyadenylation, particularly prevalent for long introns. Aberrant splicing in the upstream and downstream regions of circRNA-producing exons is critical for shaping the circRNAome, and specifically, we identify missplicing in the immediate upstream region to be a conserved driver of circRNA biogenesis. Collectively, our data show that SFPQ plays an important role in maintaining intron integrity by ensuring accurate splicing of long introns, and disclose novel features governing Alu-independent circRNA production.

    Introduction

    Gene expression is the output of multiple tightly coupled and controlled steps within the cell, which are highly regulated by a variety of factors and processes. Among these are the physical and functional interactions between the transcriptional and splicing machineries that are of great importance for the generation of both canonical and alternative isoforms of RNA transcripts. This includes a novel class of unique, closed circular RNA (circRNA) molecules. CircRNAs are evolutionarily conserved and display differential expression across cell types, tissues, and developmental stages. 
The highly stable circular conformation is obtained by covalently joining a downstream splice donor to an upstream splice acceptor, a backsplicing process catalyzed by the spliceosome (Memczak et al., 2013; Jeck et al., 2013; Salzman et al., 2013; Ashwal-Fluss et al., 2014; Hansen et al., 2013). The vast majority of circRNAs derive from coding sequences, making their biogenesis compete with the production of linear isoforms (Ashwal-Fluss et al., 2014; Salzman et al., 2013, Salzman et al., 2012). Complementary sequences in the flanking introns can facilitate the production of circRNAs (Dubin et al., 1995; Ivanov et al., 2015; Jeck et al., 2013; Westholm et al., 2014; Zhang et al., 2014), where the primate-specific Alu repeats are found to be significantly enriched in the flanking introns of circRNAs (Jeck et al., 2013; Ivanov et al., 2015; Venø et al., 2015). In some cases, exon skipping has been shown to stimulate circularization of the skipped exon (Barrett et al., 2015). However, in both human and Drosophila, biogenesis of the most abundant and conserved pool of circRNAs tend to be driven by long flanking introns rather than the presence of proximal inverted repeats in the flanking sequences (Westholm et al., 2014; Stagsted et al., 2019). The biogenesis of circRNAs without inverted repeats is currently not understood in detail, although RNA-binding proteins (RBPs) associating with the flanking introns of circRNAs have been shown to be important (Ashwal-Fluss et al., 2014; Conn et al., 2015; Errichelli et al., 2017). Here, we aim to identify additional protein factors involved in circRNA biogenesis. To this end, we exploited the enormous eCLIP and RNA sequencing resource available from the ENCODE consortium (ENCODE Project Consortium, 2012). 
Stratifying eCLIP hits across the genome with circRNA loci coordinates revealed the splicing factor proline/glutamine rich (SFPQ) and non-POU domain-containing octamer-binding protein (NONO) as highly enriched around circRNAs compared to other exons. Both proteins belong to the multifunctional Drosophila behavior/human splicing (DBHS) family with highly conserved RNA recognition motifs (RRMs) (Dong et al., 1993) and they are often found as a heterodimeric complex (Knott et al., 2016; Knott et al., 2015; Lee et al., 2015; Passon et al., 2012). The proteins are predominantly located to the nucleus, in particular to the membrane-less condensates known as paraspeckles (Clemson et al., 2009; Fox et al., 2018), where they play a pivotal role in cellular mechanisms ranging from regulation of transcription by interaction with the C-terminal domain (CTD) of RNA polymerase II (Buxadé et al., 2008; Rosonina et al., 2005; Urban et al., 2000), pre-mRNA splicing (Emili et al., 2002; Ito et al., 2008; Kameoka et al., 2004; Peng et al., 2002) and 3’end processing (Kaneko et al., 2007; Rosonina et al., 2005) to nuclear retention (Zhang and Carmichael, 2001) and nuclear export of RNA (Furukawa et al., 2015). Recently, SFPQ has been implicated in ensuring proper transcription elongation of neuronal genes (Takeuchi et al., 2018) representing an interesting link to circRNAs, as these are highly abundant in neuronal tissues and often derive from neuronal genes (Rybak-Wolf et al., 2015). Here, we show that SFPQ depletion leads to specific deregulation of circRNAs with long flanking introns devoid of proximal inverted Alu elements. Moreover, we show that long introns in particular are prone to intron retention and alternative splicing with concomitant premature termination. 
While premature termination is not the main driver of circRNA deregulation, we provide evidence for a complex interplay between upstream (acting positively on circRNA production) and downstream features (acting negatively) that collectively govern the production of individual circRNAs in the absence of SFPQ. This not only elucidates a conserved role for SFPQ in circRNA regulation but also identifies upstream alternative splicing as an approach toward circRNA production. Results The DALI circRNAs are defined by long flanking introns and distal inverted Alu elements To stratify circRNAs by their inverted Alu element dependencies, we characterized the circRNAome in two of the main ENCODE cell lines, HepG2 and K562 (Supplementary file 1). Using the joint prediction of two circRNA detection algorithms, ciri2 and find_circ, we identified 3044 and 7656 circRNAs in HepG2 and K562, respectively. While proximal inverted Alu elements (IAEs) are important for the biogenesis of a subset of circRNAs (Jeck et al., 2013; Ivanov et al., 2015), we and others have shown that long flanking introns associate with circRNA loci, particularly for the conserved and abundant circRNAs (Stagsted et al., 2019; Westholm et al., 2014), and the biogenesis of this group of circRNA species is largely unresolved. To focus our analysis on the non-Alu, long intron fraction of circRNAs, we subgrouped circRNAs based on their IAE distance and flanking intron length using median distance and length as cutoffs (Figure 1A–C). We observed that these two features show interdependent distributions, where approximately 70% of the top1000 expressed circRNAs group as either Distal-Alu-Long-Intron (DALI) circRNAs or Proximal-Alu-Short-Intron (PASI) circRNAs (Figure 1D). Apart from long flanking introns and distal IAEs, DALI circRNAs show higher overall expression compared to PASI circRNAs, longer genomic lengths, but similar distribution of mature lengths (Figure 1—figure supplement 1A–D). 
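The median-based subgrouping described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the record fields (`name`, `intron_len`, `iae_dist`), the function name, and the choice to treat values strictly above the median as "long" or "distal" are assumptions made for the example.

```python
from statistics import median


def classify_circrnas(circs):
    """Label circRNAs as DALI, PASI, or Other using median cutoffs.

    `circs` is a list of dicts with hypothetical keys:
      name       - circRNA identifier
      intron_len - summed length of the annotated flanking introns
      iae_dist   - summed distance to the most proximal inverted Alu element
    Values strictly above the median count as long/distal (an assumption).
    """
    len_cut = median(c["intron_len"] for c in circs)
    dist_cut = median(c["iae_dist"] for c in circs)
    labels = {}
    for c in circs:
        long_intron = c["intron_len"] > len_cut
        distal_alu = c["iae_dist"] > dist_cut
        if long_intron and distal_alu:
            labels[c["name"]] = "DALI"   # Distal-Alu-Long-Intron
        elif not long_intron and not distal_alu:
            labels[c["name"]] = "PASI"   # Proximal-Alu-Short-Intron
        else:
            labels[c["name"]] = "Other"  # mixed features, unclassified
    return labels


if __name__ == "__main__":
    # Toy values, invented purely for illustration.
    toy = [
        {"name": "a", "intron_len": 1000, "iae_dist": 200},
        {"name": "b", "intron_len": 50000, "iae_dist": 9000},
        {"name": "c", "intron_len": 2000, "iae_dist": 7000},
        {"name": "d", "intron_len": 80000, "iae_dist": 100},
    ]
    print(classify_circrnas(toy))
```

Because both cutoffs are medians of the same set, the four cells of the resulting table would each hold about a quarter of the circRNAs if the two features were independent; the roughly 70% DALI-plus-PASI share reported above is what indicates that they are not.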
Moreover, almost half of a previously characterized subgroup of circRNAs, the AUG circRNAs (Stagsted et al., 2019), derive from DALI circRNAs (Figure 1—figure supplement 1B), and interestingly, when filtering circRNAs for conservation (in mouse and human), 69–72% of conserved circRNAs are DALI circRNAs (Figure 1E). This finding suggests that the IAE-dependent biogenesis pathways may not be relevant for the most conserved and abundant circRNAs and that other factors must be involved.

Figure 1 (with 1 supplement). Characteristics of DALI-circRNA. (A) Schematics showing the flanking intron length (red) defined by the sum of annotated flanking introns and inverted Alu element (IAE) distance (blue) defined by the sum of distance to the most proximal IAE. (B–C) Density plot for the distribution of flanking intron lengths (B) and IAE Distance (C) for the top1000 expressed circRNAs in HepG2 (upper facet) and K562 (lower facet). The vertical line represents the median. (D) Contingency table showing the 4-way distribution of circRNAs with long and short flanking introns (in respect to the median) and proximal and distal IAEs (also in respect to the median, see B and C) for HepG2 (left facet) and K562 (right facet). The contingency table is color-coded by circRNA subgroup; DALI (distal Alu, long flanking introns, in red), PASI (proximal Alu, short flanking introns, in blue) and ‘Other’ (unclassified, in gray) circRNAs. The p-values are Fisher's exact test of independence. (E) As in D, but for the subset of circRNAs with conserved expression in mouse.

SFPQ and NONO are specifically enriched in the introns flanking DALI circRNAs

In order to identify RBPs that could drive circRNA formation, we used the elaborate ENCODE eCLIP data (ENCODE Project Consortium, 2012; Supplementary file 2). 
We scrutinized the immediate flanking regions of the 1000 most highly expressed circRNAs in HepG2 and K562 with the assumption that factors directly involved in backsplicing are likely to bind in the vicinity of the back-splicing sites. We extracted an eCLIP enrichment score using Wilcoxon rank-sum tests between the number of eCLIP reads aligned to circRNA flanking regions (upstream and downstream) compared to flanking regions of host exons, that is, other exons from the circRNA-expressing genes. In HepG2, we found SFPQ to be the protein most highly enriched in the circRNA flanking regions, while NONO – a known interaction partner for SFPQ (Dong et al., 1993) – shows enrichment in K562 cells (Figure 2A–B; to our knowledge, eCLIP datasets on SFPQ in K562 and NONO in HepG2 are not available). Comparing DALI and PASI circRNAs shows that SFPQ is DALI circRNA specific, both upstream and downstream of the circularizing exons (Figure 2C, p≤1.2e-16), whereas NONO associates with circRNA loci more generally and with the upstream regions of DALI circRNAs specifically (Figure 2D). SFPQ, like circRNAs, is known to associate with long introns (Iida et al., 2020; Takeuchi et al., 2018). To exclude that the enrichment seen is a mere bias from the flanking intron length, we extracted a subset of annotated splice acceptor (SA) and splice donor (SD) pairs sampled to match the expression level (linear spliced reads) and flanking intron lengths of DALI circRNAs (denoted ‘DALI-like exons’) (Figure 2—figure supplement 1A–H). This analysis shows that both SFPQ and NONO were significantly more enriched around circRNA exons compared to sampled DALI-like exons (Figure 2—figure supplement 1E–H).

Figure 2 (with 2 supplements). SFPQ and NONO show enriched binding in the flanking regions of DALI circRNAs. 
(A–B) Barplot showing enrichment/depletion of eCLIP signal (see Supplementary file 2) in the vicinity of circRNAs (+/- 2000 nt) compared to host exons (+/- 2000 nt) as determined by Wilcoxon rank-sum tests for HepG2 (A) and K562 (B) eCLIP samples. (C–D) Cumulative plots of SFPQ (C) and NONO (D) eCLIP read distribution upstream and downstream of circRNA subgroups and host exons as denoted. (E) Schematic showing localization of primers (+/- 2000 nt) for targeting either upstream (up) or downstream (down) intronic regions of splice sites in respect to circRNA exons or host exon. (F) Western blotting of immunoprecipitated (IP), endogenous SFPQ or NONO from nuclear fractions of HepG2 cells with Histone H3 as a loading control. (G–H) RT-qPCR of intronic regions flanking a downstream host gene exon (left facet) or flanking the circRNA producing exon(s) (right facet) of CDYL (G) and ZKSCAN1 (H) upon RNA IP of endogenous SFPQ or NONO from nuclear fractions of HepG2 cells. The relative expression of immunoprecipitate (IP)/input is plotted. Data for three biological replicates are shown. To validate the binding of SFPQ and NONO on nascent circRNA transcripts, we conducted RNA immunoprecipitation (RIP) qPCR in HepG2 (Figure 2E–H and Figure 2—figure supplement 2A–B) and HEK293T cells (Figure 2—figure supplement 2C–H) and quantified the expression of a panel of representative DALI and PASI circRNAs. Here, the flanking regions of DALI circRNAs, circCDYL and circARHGAP5 (circEYA1 in HEK293T), were significantly enriched for SFPQ binding compared to downstream intronic regions (Figure 2G and Figure 2—figure supplement 2A,E and G). However, we found no enrichment for PASI circRNAs, circZKSCAN1 (Figure 2H and Figure 2—figure supplement 2F) and circNEIL3 (Figure 2—figure supplement 2B and H). Thus, we conclude that SFPQ and NONO associate with the flanking introns of DALI circRNAs, and this may be indicative of a functional role in circRNA biogenesis. 
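For readers unfamiliar with the rank-sum comparison used for the eCLIP enrichment analysis, the following is a minimal pure-Python sketch of a two-sided Wilcoxon rank-sum (Mann-Whitney U) test using the normal approximation with tie correction and no continuity correction. The function name, the toy read counts, and the use of the sign of the z-score as an enrichment direction are assumptions for illustration; the article does not specify its exact scoring formula, and a dedicated statistics library would normally be used in practice.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, tie-corrected).

    Returns (U, z, p), where U is the Mann-Whitney statistic for sample x.
    A positive z means values in x tend to be larger than values in y.
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    n = n1 + n2
    # Pool the observations, remembering which sample each came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values occupy ranks i+1 .. j and share their average rank.
        avg_rank = (i + 1 + j) / 2.0
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 0.0, 1.0  # all values identical: no evidence either way
    z = (u - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return u, z, p


if __name__ == "__main__":
    # Hypothetical eCLIP read counts in +/- 2000 nt windows (invented numbers).
    circ_flanks = [5, 6, 7, 8]
    host_exon_flanks = [1, 2, 3, 4]
    print(rank_sum_test(circ_flanks, host_exon_flanks))
```

The normal approximation is only reasonable for moderately large samples, such as the 1000 circRNAs per cell line analysed here; for very small groups an exact test would be preferable.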
SFPQ depletion represses DALI circRNAs production

To study the impact of SFPQ and NONO on circRNA production, we depleted SFPQ and NONO in HepG2 and HEK293T cells using two different siRNAs for each target (Figure 3—figure supplement 1A, Supplementary files 3 and 4). Western blot and RT-qPCR (Figure 3A, Figure 3—figure supplement 1B–E) showed that expression of both proteins was efficiently reduced upon siRNA treatment, although, unexpectedly, the expression levels of SFPQ mRNA appeared unaffected by SFPQ knockdown and greatly elevated upon NONO depletion (Figure 3—figure supplement 1C and E). This, we speculate, is the result of compensatory effects or autoregulatory mechanisms. We performed total RNA sequencing of the knockdown samples, and conducted gene expression analysis of circRNA and mRNAs. Principal Component Analysis (PCA) of HEK293T and HepG2 samples shows clear grouping of treatments (SFPQ, NONO, and CTRL), both on mRNA and circRNA levels (Figure 3—figure supplement 1F–I), suggesting that most of the variance between samples is explained by the knockdown. For HepG2, however, two samples (siSFPQ1_rep1 and siNONO2_rep2) displayed outlier signatures and were thus removed from downstream analyses. Overall, the composition and expression of DALI and PASI circRNAs in the HepG2 and HEK293T-derived samples look very similar to the ENCODE-based analysis (Figure 3—figure supplement 1J–M).

Figure 3 (with 3 supplements). Knockdown of SFPQ affects DALI circRNAs. (A) Western blotting of proteins from HEK293T (upper panel) and HepG2 (lower panel) cells transfected with either CTRL siRNAs, siRNAs targeting NONO mRNA, or siRNAs targeting SFPQ mRNA using antibodies against SFPQ, NONO, and β-tubulin (loading control) as denoted. 
(B–C) Volcano plot showing deregulated circRNAs upon NONO (left facet) and SFPQ (right facet) depletion in HEK293T cells (B) or HepG2 cells (C) color-coded by circRNA subgroup; DALI circRNAs (red), PASI circRNAs (blue) and ‘other’ circRNAs (gray). (D–E) Boxplot showing overall changes in expression (log2Foldchange) of the three circRNA subgroups upon NONO and SFPQ depletion in HEK293T (D) and HepG2 (E) cells. p-Values are calculated using two-sided Wilcoxon rank-sum tests. (F) Genome screen dump of the circCDYL expressing locus with BSJ-spanning reads visualized as junction-track in the IGV browser (G) RT-qPCR quantification of circCDYL and linear CDYL expression upon SFPQ and NONO-depletion in HepG2 cells relative to GAPDH mRNA using two different siRNA designs for each target. Data for four biological replicates are shown. p-Values are calculated using Student’s two-tailed t-test. (H–I) as in F and G, but for the PASI circRNA, circZKSCAN1. (J) Boxplot showing eCLIP enrichment for SFPQ either immediately upstream or downstream (within 2000 nucleotides from the circRNA splice sites) of expressed circRNAs stratified either by circRNA subgroup or by deregulation upon SFPQ depletion in HepG2 cells. p-Values are calculated using two-sided Wilcoxon rank-sum tests. The differential circRNA expression analysis showed that DALI circRNAs are generally reduced upon SFPQ depletion, whereas PASI circRNAs are practically unaffected in both HEK293T (Figure 3B and D) and HepG2 (Figure 3C and E) cells. For NONO, we observed almost no impact on circRNA production in both cell lines (Figure 3B–E). This could either indicate that NONO is less involved in circRNA biogenesis, or that the effect is in part masked by the concomitant upregulation of SFPQ observed upon NONO depletion. 
Consistently, RT-qPCR analyses of abundant DALI circRNAs, circCDYL (Figure 3F–G, Figure 3—figure supplement 2C), circARHGAP5 (Figure 3—figure supplement 2A) and circEYA1 (Figure 3—figure supplement 2E), and PASI circRNAs, circZKSCAN1 (Figure 3H–I, Figure 3—figure supplement 2D) and circNEIL3 (Figure 3—figure supplement 2B and F) confirmed repressed expression of DALI circRNAs and unchanged PASI circRNA expression relative to host gene levels. Finally, to support a direct role for SFPQ in circRNA formation, we overlaid the results from SFPQ-depleted HepG2 cells with the SFPQ eCLIP data and observed a significant association of SFPQ binding with the flanking regions of DALI circRNAs, as expected, but also a clear association with deregulated circRNAs compared to unchanged circRNAs (Figure 3J). Here, SFPQ appears to associate upstream and downstream of repressed circRNAs, whereas upregulated circRNAs only show significant enrichment in the upstream region. In addition, we examined previously published total RNAseq from SFPQ conditional knock-out (KO) mouse brain (Takeuchi et al., 2018; Supplementary file 5). Here, as in human cell lines, DALI and PASI circRNAs are prevalent subclasses (Figure 3—figure supplement 3A) with DALI circRNAs showing higher abundance compared to PASI circRNAs (Figure 3—figure supplement 3B). SFPQ depletion in mouse brain affects global circRNA expression (Figure 3—figure supplement 3C); however, in contrast to HEK293T and HepG2 cells, we found a more equal distribution of up- and downregulated circRNAs upon SFPQ removal (Figure 3—figure supplement 3D–E), and we detect a clear tendency for DALI circRNAs to be more prone to SFPQ-mediated regulation (25% vs 5% showing significant deregulation, Figure 3—figure supplement 3F, p=8.2e-80, Fisher’s exact test). 
Consistent with HepG2, eCLIP analysis from mouse brain (Takeuchi et al., 2018) shows a clear association with DALI circRNAs, and a similar tendency toward upstream-only enrichment for circRNAs with increased expression upon SFPQ knockout (Figure 3—figure supplement 3G). Collectively, these findings suggest that SFPQ (and to a lesser degree NONO) regulates DALI circRNA biogenesis in mice and humans.

SFPQ depletion affects alternative splicing and intron retention in long genes

Next, to understand the impact of SFPQ and NONO on transcription and splicing in general, we used the RNAseq data to investigate SFPQ/NONO-sensitive mRNAs. Here, we found that SFPQ-depletion triggers a general repression of long genes (stratified by median gene length, Figure 4A). The read distribution of highly repressed genes showed a peculiar expression profile with unaffected read densities at the genic 5’ends but with dramatic reduction at the 3’end in HepG2 cells (Figure 4—figure supplement 1A–D) indicating that the transcription machinery drops off mid gene. This prompted us to survey genes globally for a ‘drop-off’ phenotype. Thus, we subgrouped genes by their expression profile by slicing each gene into 20 equally sized bins and conducting differential gene expression on all bins. To subgroup genes with similar profiles in an unsupervised manner, we clustered the log2foldchanges across genes into five categories, denoted kc1-5, using k-means clustering (Figure 4B). Here, kc5 but also kc3 and 4 showed ‘drop-off’ effects but to different degrees, and interestingly, the effect correlates with gene length (Figure 4B–C). We obtain almost identical results from SFPQ-depleted HEK293T cells (Figure 4—figure supplement 2A–C) and mouse brain (Figure 4—figure supplement 2F–H).

Figure 4 (with 3 supplements). SFPQ ensures long-gene expression and suppresses cryptic splicing. 
(A) Volcano plot depicting differential expression of annotated genes upon NONO or SFPQ KD compared to CTRL in HepG2 cells, stratified by median gene length into ‘long’ and ‘short’ genes as denoted. (B) Boxplot showing binned expression of clustered genes. Each gene is sliced into 20 equally sized bins, and the differential expression of each bin is determined and subgrouped into five k-means clusters (kc) (see Materials and methods). (C) Boxplot showing gene lengths distribution (0.25, 0.5 and 0.75 quantiles) stratified by clusters obtained in B. (D) Schematic representation of alternative splicing, where canonical (gray) denoted the most abundant splicing from the splice donor in question. Inclusion (green) and skipping (red) denotes an alternative splicing event shorter or longer than canonical, respectively. (E) Scatter plot showing alternative splicing in NONO and SFPQ depleted samples as a function of canonical intron length and color-coded by type of splicing either inclusion or skipping, see schematics in D. (F) Barplot with the number of unique alternative splicing events showing significant deregulation upon NONO and SFPQ depletion stratified by inclusion (green) and skipping (red), and whether the alternative SA site is annotated (transparent) or not (opaque). (G) Scatter plot showing effects on intron retention (IR) upon SFPQ and NONO depletion as a function of intron length, color-coded by significance (adjusted p-value<0.05) as denoted. (H) Scatterplot showing for each detectable intron the correlation between changes in exon-inclusion/skipping (red/green) and intron retention upon SFPQ depletion. (I) Boxplot showing the IP/Input enrichment of SFPQ eCLIP reads in introns harboring an exon inclusion or an intron retention event color-coded by whether the event is up or down (red or blue, respectively) or not significant (n/s, gray). 
(J) Schematic showing coordinates and full genic locus of DENND1A (top panel) and exon 8 and 9 with alternative, unannotated exon in-between (green, middle panel). Merged intron-spanning reads (lower panel) from CTRL, NONO-KD, and SFPQ-KD samples (HepG2) are shown and color-coded by splicing type; canonical (gray), inclusion (green), and skipping (red), see D. (K–M) RT-qPCR analysis of alternative splicing event (K), upstream expression (L) and downstream expression (M) relative to GAPDH mRNA using two different siRNA designs for each target. Data for four biological replicates are shown. p-Values are calculated using student’s two-tailed t-test. Upon inspection of the downregulated genes in our samples, we found an upregulation of alternative splicing in the SFPQ KD samples (Figure 4—figure supplement 1). We classified all alternative splicing events as either inclusion or skipping relative to their respective canonical isoform (Figure 4D) and performed differential expression analysis using DESeq2. This showed an extensive change (mostly upregulation) of alternative splicing events correlating with intron length in both HepG2, HEK293T and mouse brain (Figure 4E, Figure 4—figure supplement 2D and I). Of the 2106 significantly deregulated inclusion events in HepG2, more than 96% are upregulated and of these, 76% are not annotated by gencode (Figure 4F, in HEK293T: 95% upregulated, 78% unannotated, in mouse: 90% upregulated, 88% unannotated: data not shown), and consequently, we suggest that these events are mostly cryptic or aberrant splicing. Furthermore, analyzing the levels of intron retention by quantifying unspliced intronic reads shows a very similar intron-length-dependent pattern with significant retention of long intron upon SFPQ depletion (Figure 4G, Figure 4—figure supplement 2E and J). 
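The inclusion/skipping classification relative to the canonical junction can be illustrated with a short Python sketch. It follows the definition in the Figure 4D legend (canonical is the most abundant junction from a given splice donor; inclusion events are shorter and skipping events longer than canonical). The input layout `(donor, acceptor, read_count)`, the function name, and the arbitrary tie-breaking between equally abundant junctions are assumptions for the example, and strand handling is omitted.

```python
from collections import defaultdict


def classify_junctions(junctions):
    """Label each splice junction as canonical, inclusion, or skipping.

    `junctions` is a list of (donor, acceptor, read_count) tuples
    (hypothetical layout; genomic coordinates on one strand).
    For each donor, the junction with the most reads is canonical;
    alternatives spanning less are 'inclusion', the rest 'skipping'.
    """
    by_donor = defaultdict(list)
    for donor, acceptor, count in junctions:
        by_donor[donor].append((acceptor, count))

    labels = {}
    for donor, events in by_donor.items():
        # Most abundant junction from this donor (first one wins on ties).
        canonical_acc, _ = max(events, key=lambda e: e[1])
        canonical_span = abs(canonical_acc - donor)
        for acceptor, _ in events:
            span = abs(acceptor - donor)
            if acceptor == canonical_acc:
                label = "canonical"
            elif span < canonical_span:
                label = "inclusion"  # acceptor lies inside the canonical intron
            else:
                label = "skipping"   # junction reaches past the canonical acceptor
            labels[(donor, acceptor)] = label
    return labels


if __name__ == "__main__":
    # Invented coordinates and counts, for illustration only.
    toy = [(100, 5000, 90), (100, 2000, 7), (100, 9000, 3)]
    print(classify_junctions(toy))
```

Counts per labelled event could then be passed to a differential expression tool, as done with DESeq2 in the text.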
Consistently, we find a clear correlation between exon inclusion and intron retention (Figure 4H), and a clear enrichment of SFPQ eCLIP signal in regions subjected to alternative splicing and intron retention (Figure 4I). As an example, for DENND1A, we observe a previously unannotated splicing event joining exon eight to an alternative splice acceptor dinucleotide (AG) residing in intron eight of this gene (Figure 4J), which is only detectable upon SFPQ knockdown (Figure 4K). In DENND1A, this cryptic event marks the transition from unaffected to repressed state, as quantification of the upstream region shows modest to no effect between control and knockdown, whereas the downstream region is highly suppressed (Figure 4K–M). To strengthen the direct effect of SFPQ on cryptic SA inclusion, we co-introduced an siRNA-resistant SFPQ expression vector (Figure 4—figure supplement 3A–B). This showed an almost complete rescue of the DENND1A cryptic splicing otherwise observed upon SFPQ depletion (Figure 4—figure supplement 3C–E). Collectively, this suggests that intron retention and alternative splicing are conserved effects of SFPQ depletion, and that SFPQ plays a vital role in splicing integrity for long introns in particular. SFPQ depletion results in premature termination events In order for alternative splicing to result in premature termination of transcription, the alternative/cryptic-included exons need to harbor a polyA-signal that can serve as a functional terminator of transcription. To investigate the magnitude of polyA-signal appearance in SFPQ knockdown samples, we subjected SFPQ and NONO-depleted HEK293T cells to 3’end quantSeq (Figure 5—figure supplement 1A, Supplementary file 6). Putative polyA-signals were retrieved using the MACS2 callpeak algorithm, and to further increase the signal-to-noise ratio, we characterized each peak by the presence of a bonafide polyA-signal (PAS: AAUAAA or AUUAAA). 
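As a rough illustration of how 3' end peaks might be annotated, the sketch below checks a peak sequence for the two PAS hexamers named above and, as a companion measure for flagging A-rich artefacts, reports the highest number of A's found in any 15-nucleotide window. The function names, the choice to accept either RNA or DNA alphabet input, and the handling of sequences shorter than the window are assumptions for the example; how sequences are extracted around each MACS2 peak is not shown.

```python
# DNA-alphabet forms of the PAS hexamers AAUAAA and AUUAAA.
PAS_HEXAMERS = ("AATAAA", "ATTAAA")


def has_pas(seq):
    """Return True if the sequence contains a PAS hexamer."""
    s = seq.upper().replace("U", "T")  # accept RNA or DNA input
    return any(hexamer in s for hexamer in PAS_HEXAMERS)


def max_a_in_window(seq, window=15):
    """Highest count of A's in any `window`-nt stretch of the sequence.

    Sequences shorter than the window are scored as a whole (an assumption).
    """
    s = seq.upper()
    if len(s) <= window:
        return s.count("A")
    count = s[:window].count("A")
    best = count
    # Slide the window one base at a time, updating the running count.
    for i in range(window, len(s)):
        count += (s[i] == "A") - (s[i - window] == "A")
        best = max(best, count)
    return best


if __name__ == "__main__":
    peak = "GGCUAAUAAAGCUUGCAAAAAAAAAAAAAAAGG"  # invented sequence
    print(has_pas(peak), max_a_in_window(peak))
```

A filter could then keep a peak only when `has_pas` is true and `max_a_in_window` stays below a chosen cutoff; the cutoff itself is a parameter to be set from the data.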
Furthermore, for each quantseq peak, we also extracted the highest prevalence of A’s in all possible 15-nucleotide windows to reduce non-polyA-tail artefacts in the samples. The fraction of PAS-containing peaks dropped markedly when 14- or 15-nucleotide A-stretches were found (Figure 5—figure supplement 1B), suggesting that these A-rich peaks are likely polyA-tail-independent artefacts; they were thus removed from the analysis. The remaining peaks were classified as PAS sites, and for all PASs, the genic origin was annotated, and the differential usage was determined by DESeq2. This showed a clear enrichment of intronic PAS usage and a repression of exonic PAS usage upon SFPQ knockdown (Figure 5A). As before, NONO-depletion only showed a modest effect.

Figure 5 (with 3 supplements). SFPQ depletion activates intronic polyA signal and premature termination. (A) Volcano plot showing deregulated PAS usage as measured by quantseq upon NONO and SFPQ depletion in HEK293T cells. PAS signals are color-coded by their genic origin; intronic (dark blue), exonic (light blue), or ambiguous (gray). (B) Plot showing the cumulative fraction of PASs as a function of relative genic position stratified by genic origin (ambiguous, exonic or intronic, vertical facets) and color-coded by whether the PAS is significantly up (red) or downregulated (blue) upon SFPQ knockdown. (C) Schematic representation of the DENND1A exon 8–9 locus with alternative exon (green) and putative PAS element (purple). Below, merged quantseq coverage from each experiment. (D) RT-qPCR on input and oligo-dT purified RNA from control and SFPQ-depleted HEK293T cells using amplicons specific for GAPDH mRNA (positive control), circZKSCAN1 (negative control), and the alternative SFPQ-activated exon. Values reflect ratios between oligo-dT purified and input quantities. Data for two biological replicates are shown. 
(E) Venn diagrams showing the number of unique introns with co-occurring upregulation of PAS and upregulated alternative splicing. The number of expressed introns without any evidence of enriched PASs or alternative splicing is denoted below the diagram. P-values are calculated by Fisher’s exact test. (F–G) Schematic showing the outline of the analysis (upper panel): For each circRNA, the locus spanning from the promoter to the circRNA splice donor was interrogated for the presence of quantseq PASs (F) or exon inclusion (G). Barplot (lower panel) showing the fraction of upregulated and downregulated circRNAs upon SFPQ depletion in HEK293T cells with evidence of a concomitant upregulated upstream PAS (F) or an upstream exon inclusion event (G). Numbers indicate the total number of circRNAs in each group. p-Values are calculated by Fisher’s exact test. As an upstream termination impacts downstream elements, we determined the relative genic position of up- and downregulated PASs. This showed a clear and general 5’region tendency of upregulated vs downregulated intronic PASs (Figure 5B), suggesting that activation of upstream PASs may subsequently repress the usage of downstream PASs. In addition, activation of upstream PASs were particularly pronounced in kc4 and 5 (Figure 5—figure supplement 1C) indicating that the ‘drop-off’-phenotype may be a consequence of intronic PAS activation and premature termination. To investigate how alternative splicing relates to premature termination in a global manner, we assessed for the co-presence of alternative exon inclusion and significantly enriched PASs across all expressed introns (Figure 5E). Overall, this showed a significant overlap (Figure 5E) with kc4 exhibiting the highest degree of overlap with 72 distinct introns harboring both events (Figure 5—figu
    Splicing factor
    Citations (2)