Analyzing a functional genomics experiment, such as ATAC-, ChIP- or RNA-sequencing, requires reference data including a genome assembly and gene annotation. These resources can generally be retrieved from different organizations and in different versions. Most bioinformatic workflows require the user to supply this genomic data manually, which can be a tedious and error-prone process. Here we present genomepy, which can search, download, and preprocess the right genomic data for your analysis. Genomepy can search genomic data on NCBI, Ensembl, UCSC and GENCODE, and compare available gene annotations to enable an informed decision. The selected genome and gene annotation can be downloaded and preprocessed with sensible, yet controllable, defaults. Additional supporting data can be automatically generated or downloaded, such as aligner indexes, genome metadata and blacklists. Genomepy is freely available at https://github.com/vanheeringen-lab/genomepy under the MIT license and can be installed through pip or bioconda.
In Cnidaria, modes of gastrulation to produce the two body layers vary greatly between species. In the hydrozoan species Clytia hemisphaerica gastrulation involves unipolar ingression of presumptive endoderm cells from an oral domain of the blastula, followed by migration of these cells to fill the blastocoel with concomitant narrowing of the gastrula and elongation along the oral-aboral axis. We developed a 2D computational boundary model capable of simulating the morphogenetic changes during embryonic development from early blastula stage to the end of gastrulation. Cells are modeled as polygons with elastic membranes and cytoplasm, colliding and adhering to other cells, and capable of forming filopodia. With this model we could simulate compaction of the embryo preceding gastrulation, bottle cell formation, ingression, and intercalation between cells of the ingressing presumptive endoderm. We show that embryo elongation is dependent on the number of endodermal cells, low endodermal cell-cell adhesion, and planar cell polarity (PCP). When the strength of PCP is reduced in our model, resultant embryo morphologies closely resemble those reported previously following morpholino-mediated knockdown of the core PCP proteins Strabismus and Frizzled. Based on our results, we postulate that the cellular processes of apical constriction, compaction, and ingression, followed by reduced cell-cell adhesion and mediolateral intercalation in the presumptive endoderm, are required and, when combined, sufficient for Clytia gastrulation.
Sequencing databases contain enormous amounts of functional genomics data, making them an extensive resource for genome-scale analysis. Reanalyzing publicly available data, and integrating it with new, project-specific data sets, can be invaluable. With current technologies, genomic experiments have become feasible for virtually any species of interest. However, using and integrating this data comes with its challenges, such as standardized and reproducible analysis. Seq2science is a multi-purpose workflow that covers preprocessing, quality control, visualization, and analysis of functional genomics sequencing data. It facilitates the downloading of sequencing data from all major databases, including NCBI SRA, EBI ENA, DDBJ, GSA, and ENCODE. Furthermore, it automates the retrieval of any genome assembly available from Ensembl, NCBI, and UCSC. It has been tested on a variety of species, and includes diverse workflows such as ATAC-, RNA-, and ChIP-seq. It consists of both generic and advanced steps, such as differential gene expression or peak accessibility analysis and differential motif analysis. Seq2science is built on the Snakemake workflow language and thus can be run on a range of computing infrastructures. It is available at https://github.com/vanheeringen-lab/seq2science.
During vertebrate gastrulation, mesoderm is induced in pluripotent cells, concomitant with dorsal-ventral patterning and establishing of the dorsal axis. We applied single-cell chromatin accessibility and transcriptome analyses to explore the emergence of cellular heterogeneity during gastrulation in Xenopus tropicalis. Transcriptionally inactive lineage-restricted genes exhibit relatively open chromatin in animal caps, whereas chromatin accessibility in dorsal marginal zone cells more closely reflects transcriptional activity. We characterized single-cell trajectories and identified head and trunk organizer cell clusters in early gastrulae. By integrating chromatin accessibility and transcriptome data, we inferred the activity of transcription factors in single-cell clusters and tested the activity of organizer-expressed transcription factors in animal caps, alone or in combination. The expression profile induced by a combination of Foxb1 and Eomes most closely resembles that observed in the head organizer. Genes induced by Eomes, Otx2, or the Irx3-Otx2 combination are enriched for maternally regulated H3K4me3 modifications, whereas Lhx8-induced genes are marked more frequently by zygotically controlled H3K4me3. Taken together, our results show that transcription factors cooperate in a combinatorial fashion in generally open chromatin to orchestrate zygotic gene expression.
Supplementary data for the seq2science manuscript. Includes a description of how to install seq2science and, for each re-analysis, its configuration, samples file, and QC report.
Background and Aims: Systemic lupus erythematosus (SLE) is an autoimmune disease directed against nuclear antigens, including those derived from apoptotic microparticles (MPs) and neutrophil extracellular traps (NETs). Innate immune cells display a hyperreactive phenotype in patients with SLE, with increased expression of immunostimulatory surface markers and increased production of proinflammatory cytokines. We hypothesize that the hyperreactive phenotype of innate immune cells in SLE is caused by the induction of trained immunity. Trained immunity is a de facto innate immune memory elicited by an initial stimulus that induces metabolic and epigenetic changes and results in a more vigorous inflammatory response to subsequent stimuli. Here we aim to investigate whether nuclear autoantigens derived from MPs and NETs can induce trained immunity in SLE patients. Method: To investigate the capability of MPs and NETs to induce trained immunity, we stimulated healthy PBMCs with isolated NETs and MPs or with plasma from SLE patients for 24 hours, then washed and rested the cells for five days. Cells were restimulated with lipopolysaccharide (LPS) and Pam3CSK4. To test the activation status of innate immune cells, PBMCs were isolated from patients with SLE and healthy controls and stimulated with different TLR agonists. Cytokine production was measured using ELISA. Immune cell subsets in SLE patients were analyzed by flow cytometry. We performed RNA sequencing and chromatin immunoprecipitation (ChIP) sequencing for histone 3 lysine 4 trimethylation (H3K4me3) on monocytes from SLE patients and healthy controls. Results: We found that in vitro both MPs and NETs, as well as plasma from SLE patients, can induce trained immunity. Initial stimulation with MPs, NETs or SLE plasma resulted in increased production of tumor necrosis factor (TNF) and interleukin (IL)-6 upon restimulation with different TLR agonists. Assessment of circulating immune cells showed higher percentages of monocytes in SLE patients compared to healthy controls, and we found that circulating monocytes from SLE patients produce increased levels of pro-inflammatory cytokines (IL-6, IL-1β, TNF) after stimulation with Toll-like receptor agonists, indicating trained immunity. This is accompanied by increased expression of metabolism and inflammation-related genes, underscoring the hyperreactive phenotype typical in trained innate immune cells. Epigenetic analysis of monocytes revealed major changes in H3K4me3, an epigenetic mark associated with trained immunity. Conclusion: Our findings provide new insight into the pathogenesis of SLE by showing that trained immunity can be elicited by SLE-related antigens present in MPs and NETs, and demonstrating that circulating monocytes from SLE patients have a trained immunity phenotype. Trained immunity yields a possible biomarker for the risk of SLE flares and offers a new potential target for developing therapeutic strategies.
Proper cell fate determination is largely orchestrated by complex gene regulatory networks centered around transcription factors. However, experimental elucidation of key transcription factors that drive cellular identity is currently often intractable. Here, we present ANANSE (ANalysis Algorithm for Networks Specified by Enhancers), a network-based method that exploits enhancer-encoded regulatory information to identify the key transcription factors in cell fate determination. As cell type-specific transcription factors predominantly bind to enhancers, we use regulatory networks based on enhancer properties to prioritize transcription factors. First, we predict genome-wide binding profiles of transcription factors in various cell types using enhancer activity and transcription factor binding motifs. Subsequently, applying these inferred binding profiles, we construct cell type-specific gene regulatory networks, and then predict key transcription factors controlling cell fate transitions using differential networks between cell types. This method outperforms existing approaches in correctly predicting major transcription factors previously identified to be sufficient for trans-differentiation. Finally, we apply ANANSE to define an atlas of key transcription factors in 18 normal human tissues. In conclusion, we present a ready-to-implement computational tool for efficient prediction of transcription factors in cell fate determination and to study transcription factor-mediated regulatory mechanisms. ANANSE is freely available at https://github.com/vanheeringen-lab/ANANSE.
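The differential-network idea in the ANANSE abstract can be illustrated with a minimal Python sketch. This is not ANANSE's actual scoring scheme or API: the function names, the placeholder transcription factor and gene labels, and the simple "sum of gained edge weight" ranking are all assumptions made purely for illustration.

```python
from collections import defaultdict


def differential_network(source, target):
    """Keep only edges that score higher in the target cell type than in the source.

    Both networks map (transcription_factor, target_gene) -> interaction score.
    Edges absent from the source network are treated as having score 0.
    """
    diff = {}
    for edge, weight in target.items():
        gain = weight - source.get(edge, 0.0)
        if gain > 0:
            diff[edge] = gain
    return diff


def rank_transcription_factors(diff):
    """Rank transcription factors by the total edge weight they gain (toy criterion)."""
    scores = defaultdict(float)
    for (tf, _gene), gain in diff.items():
        scores[tf] += gain
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy networks with placeholder names (not real data).
source_network = {("TF_A", "gene1"): 0.9, ("TF_B", "gene2"): 0.2}
target_network = {
    ("TF_A", "gene1"): 0.5,
    ("TF_B", "gene2"): 0.8,
    ("TF_B", "gene3"): 0.7,
    ("TF_C", "gene1"): 0.3,
}

diff = differential_network(source_network, target_network)
ranking = rank_transcription_factors(diff)
for tf, score in ranking:
    print(f"{tf}\t{score:.2f}")
```

In this toy setting, the transcription factor whose outgoing edges gain the most weight in the target cell type ranks first; a real method would also need to account for expression levels and network topology.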
Motivation: Analyzing a functional genomics experiment, such as ATAC-, ChIP-, or RNA-sequencing, requires genomic resources such as a reference genome assembly and gene annotation. These data can generally be retrieved from different organizations and in different versions. Most bioinformatic workflows require the user to supply this genomic data manually, which can be a tedious and error-prone process. Results: Here, we present genomepy, which can search, download, and preprocess the right genomic data for your analysis. Genomepy can search genomic data on NCBI, Ensembl, UCSC, and GENCODE, and inspect available gene annotations to enable an informed decision. The selected genome and gene annotation can be downloaded and preprocessed with sensible, yet controllable, defaults. Additional supporting data can be automatically generated or downloaded, such as aligner indexes, genome metadata, and blacklists. Availability and implementation: Genomepy is freely available at https://github.com/vanheeringen-lab/genomepy under the MIT license and can be installed through pip or Bioconda.