During macronuclear differentiation of the ciliate Tetrahymena thermophila, genome-wide DNA rearrangements eliminate nearly 50 Mbp of germline-derived DNA, creating a streamlined somatic genome. The transposon-like and other repetitive sequences to be eliminated are identified using a piRNA pathway and packaged as heterochromatin prior to their removal. In this study, we show that LIA5, which encodes a zinc-finger protein likely of transposon origin, is required for both chromosome fragmentation and DNA elimination events. Lia5p acts after the establishment of RNAi-directed heterochromatin modifications, but prior to the excision of the modified sequences. In ∆LIA5 cells, DNA elimination foci, large nuclear sub-structures containing the sequences to be eliminated and the essential chromodomain protein Pdd1p, do not form. Lia5p, unlike Pdd1p, is not stably associated with these structures, but is required for their formation. In the absence of Lia5p, we could recover foci formation by ectopically inducing DNA damage by UV treatment. Foci in both wild-type and UV-treated ∆LIA5 cells contain dephosphorylated Pdd1p. These studies of LIA5 reveal that DNA elimination foci form after the excision of germline-limited sequences occurs and indicate that Pdd1p reorganization is likely mediated through a DNA damage response.
The molecular organization of the human neocortex historically has been studied in the context of its histological layers. However, emerging spatial transcriptomic technologies have enabled unbiased identification of transcriptionally defined spatial domains that move beyond classic cytoarchitecture. We used the Visium spatial gene expression platform to generate a data-driven molecular neuroanatomical atlas across the anterior-posterior axis of the human dorsolateral prefrontal cortex. Integration with paired single-nucleus RNA-sequencing data revealed distinct cell type compositions and cell-cell interactions across spatial domains. Using PsychENCODE and publicly available data, we mapped the enrichment of cell types and genes associated with neuropsychiatric disorders to discrete spatial domains.
Neuropsychiatric genome-wide association studies (GWASs), including those for autism spectrum disorder and schizophrenia, show strong enrichment for regulatory elements in the developing brain. However, prioritizing risk genes and mechanisms is challenging without a unified regulatory atlas. Across 672 diverse developing human brains, we identified 15,752 genes harboring gene, isoform, and/or splicing quantitative trait loci, mapping 3739 to cellular contexts. Gene expression heritability drops during development, likely reflecting both increasing cellular heterogeneity and the intrinsic properties of neuronal maturation. Isoform-level regulation, particularly in the second trimester, mediated the largest proportion of GWAS heritability. Through colocalization, we prioritized mechanisms for about 60% of GWAS loci across five disorders, exceeding adult brain findings. Finally, we contextualized results within gene and isoform coexpression networks, revealing the comprehensive landscape of transcriptome regulation in development and disease.
Single-cell genomics is a powerful tool for studying heterogeneous tissues such as the brain. Yet little is understood about how genetic variants influence cell-level gene expression. Addressing this, we uniformly processed single-nuclei, multiomics datasets into a resource comprising >2.8 million nuclei from the prefrontal cortex across 388 individuals. For 28 cell types, we assessed population-level variation in expression and chromatin across gene families and drug targets. We identified >550,000 cell type-specific regulatory elements and >1.4 million single-cell expression quantitative trait loci, which we used to build cell-type regulatory and cell-to-cell communication networks. These networks manifest cellular changes in aging and neuropsychiatric disorders. We further constructed an integrative model accurately imputing single-cell expression and simulating perturbations; the model prioritized ~250 disease-risk genes and drug targets with associated cell types.
INTRODUCTION The brain is responsible for cognition, behavior, and much of what makes us uniquely human. The development of the brain is a highly complex process, and this process is reliant on precise regulation of molecular and cellular events grounded in the spatiotemporal regulation of the transcriptome. Disruption of this regulation can lead to neuropsychiatric disorders. RATIONALE The regulatory, epigenomic, and transcriptomic features of the human brain have not been comprehensively compiled across time, regions, or cell types. Understanding the etiology of neuropsychiatric disorders requires knowledge not just of endpoint differences between healthy and diseased brains but also of the developmental and cellular contexts in which these differences arise. Moreover, an emerging body of research indicates that many aspects of the development and physiology of the human brain are not well recapitulated in model organisms, and therefore it is necessary that neuropsychiatric disorders be understood in the broader context of the developing and adult human brain. RESULTS Here we describe the generation and analysis of a variety of genomic data modalities at the tissue and single-cell levels, including transcriptome, DNA methylation, and histone modifications across multiple brain regions ranging in age from embryonic development through adulthood. We observed a widespread transcriptomic transition beginning during late fetal development and consisting of sharply decreased regional differences. This reduction coincided with increases in the transcriptional signatures of mature neurons and the expression of genes associated with dendrite development, synapse development, and neuronal activity, all of which were temporally synchronous across neocortical areas, as well as myelination and oligodendrocytes, which were asynchronous. 
Moreover, genes including MEF2C, SATB2, and TCF4, with genetic associations to multiple brain-related traits and disorders, converged in a small number of modules exhibiting spatial or spatiotemporal specificity. CONCLUSION We generated and applied our dataset to document transcriptomic and epigenetic changes across human development and then related those changes to major neuropsychiatric disorders. These data allowed us to identify genes, cell types, gene coexpression modules, and spatiotemporal loci where disease risk might converge, demonstrating the utility of the dataset and providing new insights into human development and disease. Spatiotemporal dynamics of human brain development and neuropsychiatric risks. Human brain development begins during embryonic development and continues through adulthood (top). Integrating data modalities (bottom left) revealed age- and cell type–specific properties and global patterns of transcriptional dynamics, including a late fetal transition (bottom middle). We related the variation in gene expression (brown, high; purple, low) to regulatory elements in the fetal and adult brains, cell type–specific signatures, and genetic loci associated with neuropsychiatric disorders (bottom right; gray circles indicate enrichment for corresponding features among module genes). Relationships depicted in this panel do not correspond to specific observations. CBC, cerebellar cortex; STR, striatum; HIP, hippocampus; MD, mediodorsal nucleus of thalamus; AMY, amygdala.
Abstract Studies of complex disorders benefit from integrative analyses of multiple omics data. Yet, sample mix-ups frequently occur in multi-omics studies, weakening statistical power and risking false findings. Accurately aligning sample information, genotype, and corresponding omics data is critical for integrative analyses. We developed DRAMS (https://github.com/Yi-Jiang/DRAMS) to Detect and Re-Align Mixed-up Samples to address the sample mix-up problem. It uses a logistic regression model followed by a modified topological sorting algorithm to identify the potential true IDs based on the relationships among multi-omics data. According to tests using simulated data, the more types of omics data used or the smaller the proportion of mix-ups, the better DRAMS performs. Applying DRAMS to real data from the PsychENCODE BrainGVEX project, we detected and corrected 201 (12.5% of total data generated) mix-ups. Of the 21 mix-ups involving errors of racial identity, DRAMS re-assigned all samples to the correct racial cluster in the 1000 Genomes project. In doing so, detected quantitative trait loci (QTL) (FDR<0.01) increased by an average of 1.62-fold. The use of DRAMS in multi-omics studies will strengthen the statistical power of the study and improve the quality of the results. Although few studies currently have multi-omics data in place, we expect such data to increase quickly, and with it the need for DRAMS. Author summary Sample mix-ups inevitably happen during sample collection, processing, and data management. They lead to reduced statistical power and sometimes false findings. It is of great importance to correct mixed-up samples before conducting any downstream analyses. We developed DRAMS to detect and re-align mixed-up samples in multi-omics studies. The basic idea of DRAMS is to align the data and labels for each sample by leveraging the genetic information in multi-omics data. DRAMS corrects sample IDs following a two-step strategy. 
First, it estimates pairwise genetic relatedness among all the data generated from all the individuals. Because the different data generated from the same individual should share the same genetics, we can cluster all the highly related data and consider that the data from one cluster have only one potential ID. Then, it uses a “majority vote” strategy to infer the potential ID for individuals in each cluster. Other information, such as the match between genetics-based and reported sexes and omics priority, is also used to guide identification of the potential IDs. DRAMS performed very well on both simulated data and the PsychENCODE BrainGVEX multi-omics data.
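The two-step strategy described in the author summary (cluster highly related data, then take a majority vote over the reported IDs in each cluster) can be sketched as follows. This is a minimal illustration under stated assumptions, not the DRAMS implementation: it takes the list of highly related data pairs as given rather than estimating relatedness, it uses a plain union-find for clustering in place of the logistic regression and modified topological sorting named in the abstract, it ignores sex matching and omics priority, and it leaves tied clusters unresolved. All function names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def find(parent, x):
    """Return the cluster root of x, compressing the path as we go."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def cluster_related(data_ids, related_pairs):
    """Group data sets into clusters connected by highly related pairs."""
    parent = {d: d for d in data_ids}
    for a, b in related_pairs:
        root_a, root_b = find(parent, a), find(parent, b)
        if root_a != root_b:
            parent[root_b] = root_a
    clusters = defaultdict(list)
    for d in data_ids:
        clusters[find(parent, d)].append(d)
    return list(clusters.values())


def majority_vote_ids(clusters, reported_id):
    """Assign each cluster the most common reported ID among its members.

    Assumption of this sketch: a tie between the top two IDs is left
    unresolved (None) instead of being broken by extra information.
    """
    inferred = {}
    for members in clusters:
        counts = Counter(reported_id[d] for d in members).most_common()
        tied = len(counts) > 1 and counts[0][1] == counts[1][1]
        winner = None if tied else counts[0][0]
        for d in members:
            inferred[d] = winner
    return inferred


# Hypothetical toy data: two individuals, three data types each.
# The two ATAC data sets carry swapped labels.
reported_id = {
    "geno_A": "S1", "rna_A": "S1", "atac_A": "S2",
    "geno_B": "S2", "rna_B": "S2", "atac_B": "S1",
}
# Pairs assumed to have passed a relatedness threshold upstream.
related_pairs = [
    ("geno_A", "rna_A"), ("rna_A", "atac_A"),
    ("geno_B", "rna_B"), ("rna_B", "atac_B"),
]

clusters = cluster_related(list(reported_id), related_pairs)
inferred = majority_vote_ids(clusters, reported_id)
mixups = sorted(
    d for d in reported_id
    if inferred[d] is not None and inferred[d] != reported_id[d]
)
print(mixups)
```

In this toy case the two swapped ATAC data sets are the only ones whose inferred ID differs from the reported one. A cluster with only two members and conflicting labels would stay unresolved, which is one reason the abstract reports better performance when more omics types are available.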
The impact of genetic variants on gene expression has been intensely studied at the transcription level, yielding valuable insights into the association between genes and the risk of complex disorders, such as schizophrenia (SCZ). However, the downstream impact of these variants and the molecular mechanisms connecting transcription variation to disease risk are not well understood. We quantitated ribosome occupancy in prefrontal cortex samples of the BrainGVEX cohort. Together with transcriptomics and proteomics data from the same cohort, we performed cis-Quantitative Trait Locus (QTL) mapping and identified 3,253 expression QTLs (eQTLs), 1,344 ribosome occupancy QTLs (rQTLs), and 657 protein QTLs (pQTLs) out of 7,458 genes quantitated in all three omics types from 185 samples. Of the eQTLs identified, only 34% have their effects propagated to the protein level. Further analysis of the effect size of prefrontal cortex eQTLs identified from an independent dataset showed clear post-transcriptional attenuation of eQTL effects. To investigate the biological relevance of the attenuated eQTLs, we identified 70 expression-specific QTLs (esQTLs), 51 ribosome-occupancy-specific QTLs (rsQTLs), and 107 protein-specific QTLs (psQTLs). Five of these omics-specific QTLs showed strong colocalization with SCZ GWAS signals; three of them are esQTLs. The limited number of GWAS colocalization discoveries from omics-specific QTLs and the apparent prevalence of eQTL attenuation prompted us to take a complementary approach to investigate the functional relevance of attenuated eQTLs. Using S-PrediXcan we identified 74 SCZ risk genes, 34% of which were novel, and 67% of these risk genes were replicated in a MR-Egger test. 
Notably, 52 out of 74 risk genes were identified using eQTL data and 70% of these SCZ-risk-gene-driving eQTLs show little to no evidence of driving corresponding variations at the protein level. The effect of eQTLs on gene expression in the prefrontal cortex is commonly attenuated post-transcriptionally. Many of the attenuated eQTLs still correlate with SCZ GWAS signal. Further investigation is needed to elucidate a mechanistic link between attenuated eQTLs and SCZ disease risk.
Most genetic risk for psychiatric disease lies in regulatory regions, implicating pathogenic dysregulation of gene expression and splicing. However, comprehensive assessments of transcriptomic organization in diseased brains are limited. In this work, we integrated genotypes and RNA sequencing in brain samples from 1695 individuals with autism spectrum disorder (ASD), schizophrenia, and bipolar disorder, as well as controls. More than 25% of the transcriptome exhibits differential splicing or expression, with isoform-level changes capturing the largest disease effects and genetic enrichments. Coexpression networks isolate disease-specific neuronal alterations, as well as microglial, astrocyte, and interferon-response modules defining previously unidentified neural-immune mechanisms. We integrated genetic and genomic data to perform a transcriptome-wide association study, prioritizing disease loci likely mediated by cis effects on brain expression. This transcriptome-wide characterization of the molecular pathology across three major psychiatric disorders provides a comprehensive resource for mechanistic insight and therapeutic development.
Cellular heterogeneity in the human brain obscures the identification of robust cellular regulatory networks, which is necessary to understand the function of non-coding elements and the impact of non-coding genetic variation. Here we integrate genome-wide chromosome conformation data from purified neurons and glia with transcriptomic and enhancer profiles to characterize the gene regulatory landscape of two major cell classes in the human brain. We then leverage cell-type-specific regulatory landscapes to gain insight into the cellular etiology of several brain disorders. We find that Alzheimer's disease (AD)-associated epigenetic dysregulation is linked to neurons and oligodendrocytes, whereas genetic risk factors for AD highlight microglia, suggesting that different cell types may contribute to disease risk via different mechanisms. Moreover, integration of glutamatergic and GABAergic regulatory maps with genetic risk factors for schizophrenia (SCZ) and bipolar disorder (BD) identifies shared (parvalbumin-expressing interneurons) and distinct cellular etiologies (upper layer neurons for BD, and deeper layer projection neurons for SCZ). Collectively, these findings shed new light on cell-type-specific gene regulatory networks in brain disorders.