Increasing evidence suggests that interactions between regulatory genomic elements play an important role in regulating gene expression. We generated a genome-wide interaction map of regulatory elements in human cells (ENCODE tier 1 cells, K562, GM12878) using Chromatin Interaction Analysis by Paired-End Tag sequencing (ChIA-PET) experiments targeting six broadly distributed factors. Bound regions covered 80% of DNase I hypersensitive sites including 99.7% of TSS and 98% of enhancers. Correlating this map with ChIP-seq and RNA-seq data sets revealed cohesin, CTCF, and ZNF143 as key components of three-dimensional chromatin structure and revealed how the distal chromatin state affects gene transcription. Comparison of interactions between cell types revealed that enhancer–promoter interactions were highly cell-type-specific. Construction and comparison of distal and proximal regulatory networks revealed stark differences in structure and biological function. Proximal binding events are enriched at genes with housekeeping functions, while distal binding events interact with genes involved in dynamic biological processes including response to stimulus. This study reveals new mechanistic and functional insights into regulatory region organization in the nucleus.
Abstract Pooled CRISPR-Cas9 screens have recently emerged as a powerful method for functionally characterizing regulatory elements in the non-coding genome, but off-target effects in these experiments have not been systematically evaluated. Here, we conducted a genome-scale screen for essential CTCF loop anchors in the K562 leukemia cell line. Surprisingly, the primary drivers of signal in this screen were single guide RNAs (sgRNAs) with low specificity scores. After removing these guides, we found that there were no CTCF loop anchors critical for cell growth. We also observed this effect in an independent screen fine-mapping the core motifs in enhancers of the GATA1 gene. We then conducted screens in parallel with CRISPRi and CRISPRa, which do not induce DNA damage, and found that an unexpected and distinct set of off-targets also caused strong confounding growth effects with these epigenome-editing platforms. Promisingly, strict filtering of CRISPRi libraries using GuideScan specificity scores removed these confounded sgRNAs and allowed for the identification of essential enhancers, which we validated extensively. Together, our results show off-target activity can severely limit identification of essential functional motifs by active Cas9, while strictly filtered CRISPRi screens can be reliably used for assaying larger regulatory elements.
Abstract Pooled CRISPR-Cas9 screens are a powerful method for functionally characterizing regulatory elements in the non-coding genome, but off-target effects in these experiments have not been systematically evaluated. Here, we investigate Cas9, dCas9, and CRISPRi/a off-target activity in screens for essential regulatory elements. The single guide RNAs (sgRNAs) with the largest effects in genome-scale screens for essential CTCF loop anchors in K562 cells were not those that disrupted gene expression near the on-target CTCF anchor. Rather, these sgRNAs had high off-target activity that, while only weakly correlated with absolute off-target site number, could be predicted by the recently developed GuideScan specificity score. Screens conducted in parallel with CRISPRi/a, which do not induce double-stranded DNA breaks, revealed that a distinct set of off-targets also causes strong confounding fitness effects with these epigenome-editing tools. Promisingly, filtering of CRISPRi libraries using GuideScan specificity scores removed these confounded sgRNAs and enabled identification of essential regulatory elements.
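The library-filtering step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `Guide` record, the function names, the assumption that higher scores mean more specific guides, and the cutoff value of 0.2 are all hypothetical and are not taken from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guide:
    guide_id: str
    target_element: str
    specificity: float  # assumed: precomputed score, higher = more specific


def filter_guides(guides, min_specificity=0.2):
    """Keep only guides whose specificity score meets the cutoff.

    The default cutoff is an illustrative placeholder, not a value
    stated in the abstract.
    """
    return [g for g in guides if g.specificity >= min_specificity]


def elements_with_coverage(guides, min_guides=2):
    """Return target elements still covered by at least `min_guides` guides."""
    counts = {}
    for g in guides:
        counts[g.target_element] = counts.get(g.target_element, 0) + 1
    return {element for element, n in counts.items() if n >= min_guides}


library = [
    Guide("g1", "enhancer_A", 0.85),
    Guide("g2", "enhancer_A", 0.40),
    Guide("g3", "enhancer_B", 0.05),  # low specificity: removed by the filter
    Guide("g4", "enhancer_B", 0.60),
]
kept = filter_guides(library)
covered = elements_with_coverage(kept)
```

Checking coverage after filtering matters because a strict cutoff can leave some elements with too few guides to score reliably.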
Abstract Background Alzheimer’s disease (AD) genome‐wide association studies (GWAS) identified 75 disease‐associated loci; however, interpretation of these results has proven difficult for two primary reasons. First, linkage disequilibrium between nearby variants makes it difficult to identify the causal variant(s) at each locus. Second, because the vast majority of AD risk variants are non‐coding and can regulate genes from distances of greater than one million base pairs, the genes impacted by AD risk variants are largely unknown. Recently developed genomic approaches can address these hurdles but must be applied to the correct cell types and biological contexts. Method We quantified the impact of 3,576 AD‐associated genetic variants on enhancer activity by performing massively parallel reporter assays (MPRAs) in resting and activated human macrophages. To map these variants to the genes they regulate we built regulatory networks by mapping 3D chromatin structure (Hi‐C), chromatin accessibility (ATAC), histone H3K27 acetylation (CUT&RUN), and gene expression (RNA‐seq) in both resting and activated iPSC‐derived microglia. Result In total we identified 20,992 chromatin loops across resting and activated iPSC‐derived microglia. Among these, 9,544 loops linked 14,576 enhancers to 11,487 genes. 58 loops connected 414 AD‐risk variants queried by our MPRAs to 85 genes. 212, 19,256, and 2,790 of these loops, enhancers, and genes (respectively) changed in response to activation, providing further insight into their regulatory mechanisms and highlighting the need to study these events in the correct biological context. Conclusion Ongoing analyses and data integration should reveal further mechanistic details and provide novel AD risk genes for further research and therapeutic development.
By intersecting our MPRA results and dynamic regulatory networks with AD GWAS, we were able to identify novel putative genes regulated by non‐coding AD risk variants in a cell‐type and condition‐specific manner.
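One way to picture the variant-to-gene mapping through chromatin loops is the interval-overlap sketch below. It is an assumption-laden toy, not the study's method: the `Interval` and `Loop` types, the function `link_variants_to_genes`, half-open coordinates, and the example variant and gene names are all hypothetical.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive (half-open interval, an assumed convention)


class Loop(NamedTuple):
    anchor_a: Interval
    anchor_b: Interval


def overlaps(a, b):
    """True if two half-open intervals on the same chromosome intersect."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def link_variants_to_genes(variants, promoters, loops):
    """Pair each variant with genes whose promoter sits at the opposite anchor.

    variants:  dict of variant id -> Interval
    promoters: dict of gene name  -> Interval
    """
    links = set()
    for loop in loops:
        # Try both orientations: the variant may sit in either anchor.
        for near, far in ((loop.anchor_a, loop.anchor_b),
                          (loop.anchor_b, loop.anchor_a)):
            hit_variants = [v for v, iv in variants.items() if overlaps(iv, near)]
            if not hit_variants:
                continue
            hit_genes = [g for g, iv in promoters.items() if overlaps(iv, far)]
            links.update((v, g) for v in hit_variants for g in hit_genes)
    return links


variants = {"variant_1": Interval("chr1", 1000, 1001)}
promoters = {"GENE_A": Interval("chr1", 500000, 502000),
             "GENE_B": Interval("chr2", 500000, 502000)}
loops = [Loop(Interval("chr1", 0, 5000), Interval("chr1", 500000, 505000))]
links = link_variants_to_genes(variants, promoters, loops)
```

The nested scan is quadratic and only suitable for small inputs; at the scale of tens of thousands of loops, an interval tree or a sorted sweep would be the usual choice.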
SUMMARY The three-dimensional arrangement of the human genome comprises a complex network of structural and regulatory chromatin loops important for coordinating changes in transcription during human development. To better understand the mechanisms underlying context-specific 3D chromatin structure and transcription during cellular differentiation, we generated comprehensive in situ Hi-C maps of DNA loops during human monocyte-to-macrophage differentiation. We demonstrate that dynamic looping events are regulatory rather than structural in nature and uncover widespread coordination of dynamic enhancer activity at preformed and acquired DNA loops. Enhancer-bound loop formation and enhancer-activation of preformed loops represent two distinct modes of regulation that together form multi-loop activation hubs at key macrophage genes. Activation hubs connect 3.4 enhancers per promoter and exhibit a strong enrichment for Activator Protein 1 (AP-1) binding events, suggesting multi-loop activation hubs driven by cell-type specific transcription factors may represent an important class of regulatory chromatin structures for the spatiotemporal control of transcription. HIGHLIGHTS High resolution and high sensitivity of loop detection via deeply sequenced in situ Hi-C experiments during monocyte-to-macrophage differentiation (> 10 billion total reads). Multi-loop interaction communities identified surrounding key macrophage genes. Multi-loop communities connect dynamic enhancers through both static and newly acquired DNA loops, forming hubs of activation. Macrophage activation hubs are enriched for AP-1 bound long-range enhancer interactions, suggesting cell-type specific TFs drive changes in 3D structure and transcription through regulatory DNA loops.
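A statistic such as "enhancers per promoter" within a multi-loop community can be illustrated with the sketch below. The summary does not say how communities were detected, so connected components of the loop graph are used here as a simple stand-in; the anchor identifiers, the `anchor_type` annotation, and both function names are hypothetical.

```python
from collections import defaultdict


def loop_communities(loops):
    """Group anchors into communities: connected components of the loop graph.

    loops: iterable of (anchor_id, anchor_id) pairs, one pair per loop.
    """
    graph = defaultdict(set)
    for a, b in loops:
        graph[a].add(b)
        graph[b].add(a)

    seen, communities = set(), []
    for start in graph:
        if start in seen:
            continue
        # Depth-first traversal collecting every anchor reachable from `start`.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        communities.append(component)
    return communities


def enhancers_per_promoter(community, anchor_type):
    """Ratio of enhancer anchors to promoter anchors; None if no promoter."""
    n_enh = sum(1 for a in community if anchor_type.get(a) == "enhancer")
    n_prom = sum(1 for a in community if anchor_type.get(a) == "promoter")
    return n_enh / n_prom if n_prom else None


loops = [("P1", "E1"), ("P1", "E2"), ("E2", "E3"), ("P2", "E4")]
anchor_type = {"P1": "promoter", "P2": "promoter",
               "E1": "enhancer", "E2": "enhancer",
               "E3": "enhancer", "E4": "enhancer"}
communities = loop_communities(loops)
ratios = [enhancers_per_promoter(c, anchor_type) for c in communities]
```

In this toy input the first community links three enhancers to one promoter through a mix of direct and indirect loops, which is the kind of multi-loop structure the summary calls a hub.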
Abstract Osteoarthritis (OA) affects millions worldwide, yet effective treatments remain elusive due to poorly understood molecular mechanisms. While genome-wide association studies (GWAS) have identified over 100 OA-associated loci, identifying the genes impacted at each locus remains challenging. Several studies have mapped expression quantitative trait loci (eQTL) in chondrocytes and colocalized them with OA GWAS variants to identify putative OA risk genes; however, the degree to which genetic variants influence OA risk via alternative splicing has not been explored. We investigated the role of alternative splicing in OA pathogenesis using RNA-seq data from 101 human chondrocyte samples treated with PBS (control) or fibronectin fragment (FN-f), an OA trigger. We identified 590 differentially spliced genes between conditions, with FN-f inducing splicing events similar to those in primary OA tissue. We used CRISPR/Cas9 to mimic an SNRNP70 splicing event observed in OA and FN-f-treated chondrocytes and found that it induced an OA-like expression pattern. Integration with genotyping data revealed 7,188 splicing quantitative trait loci (sQTL) affecting 3,056 genes. While many sQTLs were shared, we identified 738 and 343 condition-specific sQTLs for control and FN-f, respectively. We identified 15 RNA binding proteins whose binding sites were enriched at sQTL splice junctions and found that expression of those RNA binding proteins correlated with exon inclusion. Colocalization with OA GWAS identified 6 putative risk genes, including a novel candidate, PBRM1. Our study highlights the significant impact of alternative splicing in OA and provides potential therapeutic targets for future research.
The R programming language is one of the most widely used programming languages for transforming raw genomic data sets into meaningful biological conclusions through analysis and visualization, which has been largely facilitated by infrastructure and tools developed by the Bioconductor project. However, existing plotting packages rely on relative positioning and sizing of plots, which is often sufficient for exploratory analysis but is poorly suited for the creation of publication-quality multi-panel images inherent to scientific manuscript preparation. We present plotgardener, a coordinate-based genomic data visualization package that offers a new paradigm for multi-plot figure generation in R. Plotgardener allows precise, programmatic control over the placement, aesthetics, and arrangements of plots while maximizing user experience through fast and memory-efficient data access, support for a wide variety of data and file types, and tight integration with the Bioconductor environment. Plotgardener also allows precise placement and sizing of ggplot2 plots, making it an invaluable tool for R users and data scientists from virtually any discipline. Availability Package: https://bioconductor.org/packages/plotgardener Code: https://github.com/PhanstielLab/plotgardener Documentation: https://phanstiellab.github.io/plotgardener/