Abstract Motivation Ribosome profiling (Ribo-seq) is a powerful approach based on ribosome-protected RNA fragments to explore the translatome of a cell, and is especially useful for the detection of small proteins (<=70 amino acids) that are recalcitrant to biochemical and in silico approaches. While pipelines are available to analyze Ribo-seq data, none are designed explicitly for the analysis of Ribo-seq data from prokaryotes, nor are they focused on the discovery of unannotated open reading frames (ORFs) in bacteria. Results We present HRIBO (High-throughput annotation by Ribo-seq), a workflow to enable reproducible and high-throughput analysis of bacterial Ribo-seq data. The workflow performs all required pre-processing and quality control steps. Importantly, HRIBO outputs annotation-independent ORF predictions based on two complementary bacteria-focused tools, and integrates them with additional features. This facilitates the rapid discovery of novel ORFs and their prioritization for functional characterization. Availability HRIBO is a free and open source project available under the GPL-3 license at: https://github.com/RickGelhausen/HRIBO
Abstract RNA helicases play crucial roles in RNA biology. In plants, RNA helicases are encoded by large gene families, performing roles in abiotic stress responses, development, the post-transcriptional regulation of gene expression, as well as housekeeping functions. Several of these RNA helicases are targeted to the organelles, the mitochondria and chloroplasts. Cyanobacteria are the direct evolutionary ancestors of plant chloroplasts. The cyanobacterium Synechocystis 6803 encodes a single DEAD-box RNA helicase, CrhR, that is induced by a range of abiotic stresses, including low temperature. Though the ΔcrhR mutant exhibits a severe cold-sensitive phenotype, the physiological function(s) performed by CrhR have not been described. To identify transcripts interacting with CrhR, we performed RNA co-immunoprecipitation with extracts from a Synechocystis crhR deletion mutant expressing the FLAG-tagged native CrhR or a K57A mutated version with an anticipated enhanced RNA binding. The composition of the interactome was strikingly biased towards photosynthesis-associated and redox-controlled transcripts. A transcript highly enriched in all experiments was the crhR mRNA, suggesting an autoregulatory molecular mechanism. The identified interactome explains the described physiological role of CrhR in response to the redox poise of the photosynthetic electron transport chain and characterizes CrhR as an enzyme with a diverse range of transcripts as molecular targets.
Abstract Background Post-transcriptional regulation via RNA-binding proteins plays a fundamental role in every organism, but the underlying regulatory mechanisms remain poorly understood. They can be elucidated by cross-linking immunoprecipitation in combination with high-throughput sequencing (CLIP-Seq). CLIP-Seq answers questions about the functional role of an RNA-binding protein and its targets by determining binding sites at nucleotide resolution, together with the associated sequence and structural binding patterns. In recent years the amount of CLIP-Seq data has skyrocketed, creating a need for automated data analysis that can deal with different experimental set-ups. However, noncanonical data, new protocols, and a huge variety of tools, especially for peak calling, have made it difficult to define a standard. Findings CLIP-Explorer is a flexible and reproducible data analysis pipeline for iCLIP data that, for the first time, also supports eCLIP, FLASH, and uvCLAP data. Individual steps such as peak calling can be changed to adapt to different experimental settings. We validate CLIP-Explorer on eCLIP data, finding similar or nearly identical motifs for various proteins in comparison with other databases. In addition, we detect new sequence motifs for PTBP1 and U2AF2. Finally, we optimize the peak calling with 3 different peak callers on RBFOX2 data, discuss the difficulty of the peak-calling step, and give advice for different experimental set-ups. Conclusion CLIP-Explorer meets the demand for a flexible CLIP-Seq data analysis pipeline that is applicable to up-to-date CLIP protocols. The article further shows the limitations of current peak-calling algorithms and the importance of robust peak detection.
The eCLIP data provided here is a subset of the RBFOX2 eCLIP data from a study published by Van Nostrand et al. (2016, http://dx.doi.org/10.1038/nmeth.3810). The dataset contains the first biological replicate of the RBFOX2 CLIP-seq experiment and the input control experiment (*.fastq files). The data was changed and downsampled to reduce processing time; thus, the datasets do not correspond to the original data from Van Nostrand et al. (2016, http://dx.doi.org/10.1038/nmeth.3810). Also included are a text file (.txt) containing the chromosome sizes of hg19 obtained from UCSC (http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/hg19.chrom.sizes) and a genome annotation (.gtf) file taken from Ensembl (http://ftp.ensemblorg.ebi.ac.uk/pub/release-74/gtf/homo_sapiens/). The data is used for a Galaxy training course on CLIP-Seq data analysis.
Ribosome profiling (Ribo-seq) is a powerful approach based on deep sequencing of cDNA libraries generated from ribosome-protected RNA fragments to explore the translatome of a cell, and is especially useful for the detection of small proteins (50-100 amino acids) that are recalcitrant to many standard biochemical and in silico approaches. While pipelines are available to analyze Ribo-seq data, none are designed explicitly for the automatic processing and analysis of data from bacteria, nor are they focused on the discovery of unannotated open reading frames (ORFs). We present HRIBO (High-throughput annotation by Ribo-seq), a workflow to enable reproducible and high-throughput analysis of bacterial Ribo-seq data. The workflow performs all required pre-processing and quality control steps. Importantly, HRIBO outputs annotation-independent ORF predictions based on two complementary bacteria-focused tools, and integrates them with additional feature information and expression values. This facilitates the rapid and high-confidence discovery of novel ORFs and their prioritization for functional characterization. HRIBO is a free and open source project available under the GPL-3 license at: https://github.com/RickGelhausen/HRIBO.
The datasets are used for the Galaxy CUT&RUN training material. The CUT&RUN data was generated by Zhu et al. 2019 and downsampled to speed up the training. The ChIP-seq data comes from a GATA1 experiment by Canver et al. 2017.
Abstract HnRNPs are ubiquitously expressed RNA-binding proteins that tightly control posttranscriptional gene regulation. Consequently, hnRNP networks are essential for cellular homeostasis, and their dysregulation is associated with cancer and other diseases. However, the physiological functions of hnRNPs in non-cancerous cell systems are poorly understood. We analyzed the importance of HNRNPDL in endothelial cell functions. Knockdown of HNRNPDL led to impaired proliferation, migration and sprouting of spheroids. Transcriptome analysis identified cyclin D1 (CCND1) and tropomyosin 4 (TPM4) as targets of HNRNPDL, reflecting the phenotypic changes after knockdown. Our findings underline the importance of HNRNPDL for the homeostasis of physiological processes in endothelial cells.
Galaxy is a mature, browser accessible workbench for scientific computing. It enables scientists to share, analyze and visualize their own data, with minimal technical impediments. A thriving global community continues to use, maintain and contribute to the project, with support from multiple national infrastructure providers that enable freely accessible analysis and training services. The Galaxy Training Network supports free, self-directed, virtual training with >230 integrated tutorials. Project engagement metrics have continued to grow over the last 2 years, including source code contributions, publications, software packages wrapped as tools, registered users and their daily analysis jobs, and new independent specialized servers. Key Galaxy technical developments include an improved user interface for launching large-scale analyses with many files, interactive tools for exploratory data analysis, and a complete suite of machine learning tools. Important scientific developments enabled by Galaxy include Vertebrate Genome Project (VGP) assembly workflows and global SARS-CoV-2 collaborations.