Abstract Understanding the function of genes and their regulation in tissue homeostasis and disease requires knowing the cellular context in which genes are expressed in tissues across the body. Single-cell genomics allows the generation of detailed cellular atlases in human tissues, but most efforts are focused on single tissue types. Here, we establish a framework for profiling multiple tissues across the human body at single-cell resolution using single-nucleus RNA-seq (snRNA-seq), and apply it to 8 diverse, archived, frozen tissue types (three donors per tissue). We apply four snRNA-seq methods to each of 25 samples from 16 donors, generating a cross-tissue atlas of 209,126 nuclei profiles, and benchmark them vs. scRNA-seq of comparable fresh tissues. We use a conditional variational autoencoder (cVAE) to integrate an atlas across tissues, donors, and laboratory methods. We highlight shared and tissue-specific features of tissue-resident immune cells, identifying tissue-restricted and non-restricted resident myeloid populations. These include a cross-tissue conserved dichotomy between LYVE1- and HLA class II-expressing macrophages, and the broad presence of LAM-like macrophages across healthy tissues that is also observed in disease. For rare, monogenic muscle diseases, we identify cell types that likely underlie the neuromuscular, metabolic, and immune components of these diseases, and biological processes involved in their pathology. For common complex diseases and traits analyzed by GWAS, we identify the cell types and gene modules that potentially underlie disease mechanisms. The experimental and analytical frameworks we describe will enable the generation of large-scale studies of how cellular and molecular processes vary across individuals and populations.
Effective management and treatment of cancer continues to be complicated by the rapid evolution and resulting heterogeneity of tumors. Phylogenetic study of cell populations in single tumors provides a way to delineate intra-tumoral heterogeneity and identify robust features of evolutionary processes. The introduction of single-cell sequencing has shown great promise for advancing single-tumor phylogenetics; however, the volume and high noise in these data present challenges for inference, especially with regard to chromosome abnormalities that typically dominate tumor evolution. Here, we investigate a strategy to use such data to track differences in tumor cell genomic content during progression. We propose a reference-free approach to mining single-cell genome sequence reads to allow predictive classification of tumors into heterogeneous cell types and reconstruct models of their evolution. The approach extracts k-mer counts from single-cell tumor genomic DNA sequences, and uses differences in normalized k-mer frequencies as a proxy for overall evolutionary distance between distinct cells. The approach computationally simplifies deriving phylogenetic markers, which normally relies on first aligning sequence reads to a reference genome and then processing the data to extract meaningful progression markers for constructing phylogenetic trees. The approach also provides a way to bypass some of the challenges that massive genome rearrangement typical of tumor genomes presents for reference-based methods. We illustrate the method on a publicly available breast tumor single-cell sequencing dataset. We have demonstrated a computational approach for learning tumor progression from single-cell sequencing data using k-mer counts. k-mer features classify tumor cells by stage of progression with high accuracy. Phylogenies built from these k-mer spectrum distance matrices yield splits that are statistically significant when tested for their ability to partition cells at different stages of cancer.
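The reference-free idea described above (count k-mers per cell, normalize, and compare frequency profiles) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and not taken from the abstract: the function names, the choice of k = 3, the Euclidean metric, and the toy reads are hypothetical, and the noise correction and tree-building steps are not modeled.

```python
from collections import Counter
from itertools import combinations
import math


def kmer_counts(reads, k):
    """Count every length-k substring across one cell's reads."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Skip k-mers containing ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return counts


def normalize(counts):
    """Convert raw k-mer counts into frequencies that sum to 1."""
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def kmer_distance(freq_a, freq_b):
    """Euclidean distance between two frequency profiles (assumed metric)."""
    keys = set(freq_a) | set(freq_b)
    return math.sqrt(
        sum((freq_a.get(key, 0.0) - freq_b.get(key, 0.0)) ** 2 for key in keys)
    )


def distance_matrix(cells, k=3):
    """Pairwise distances between cells, keyed by (name_a, name_b)."""
    names = sorted(cells)
    freqs = {name: normalize(kmer_counts(cells[name], k)) for name in names}
    dist = {(name, name): 0.0 for name in names}
    for a, b in combinations(names, 2):
        dist[(a, b)] = dist[(b, a)] = kmer_distance(freqs[a], freqs[b])
    return names, dist


# Hypothetical toy input: cell_1 and cell_2 differ by a single base,
# while cell_3 has an unrelated sequence composition.
toy_cells = {
    "cell_1": ["ACGTACGTAC", "GTACGTACGT"],
    "cell_2": ["ACGTACGTAC", "GTACGTACGA"],
    "cell_3": ["TTTTGGGGCC", "CCGGGGTTTT"],
}
names, dist = distance_matrix(toy_cells, k=3)
```

In this toy input the two nearly identical cells end up closer to each other than either is to the third. A distance matrix of this kind is the sort of input a distance-based tree-building method would consume; which method and which k the authors used is not stated in the abstract.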
The molecular underpinnings of organ dysfunction in acute COVID-19 and its potential long-term sequelae are under intense investigation. To shed light on these in the context of liver function, we performed single-nucleus RNA-seq and spatial transcriptomic profiling of livers from 17 COVID-19 decedents. We identified hepatocytes positive for SARS-CoV-2 RNA with an expression phenotype resembling infected lung epithelial cells. Integrated analysis and comparisons with healthy controls revealed extensive changes in the cellular composition and expression states in COVID-19 liver, reflecting hepatocellular injury, ductular reaction, pathologic vascular expansion, and fibrogenesis. We also observed Kupffer cell proliferation and erythrocyte progenitors for the first time in a human liver single-cell atlas, resembling responses seen in liver injury in mice and in sepsis, respectively. Despite the absence of a clinical acute liver injury phenotype, endothelial cell composition was dramatically impacted in COVID-19, concomitantly with extensive alterations and profibrogenic activation of reactive cholangiocytes and mesenchymal cells. Our atlas provides novel insights into liver physiology and pathology in COVID-19 and forms a foundational resource for its investigation and understanding.
Abstract The SARS-CoV-2 pandemic has caused over 1 million deaths globally, mostly due to acute lung injury and acute respiratory distress syndrome, or direct complications resulting in multiple-organ failures. Little is known about the host tissue immune and cellular responses associated with COVID-19 infection, symptoms, and lethality. To address this, we collected tissues from 11 organs during the clinical autopsy of 17 individuals who succumbed to COVID-19, resulting in a tissue bank of approximately 420 specimens. We generated comprehensive cellular maps capturing COVID-19 biology related to patients’ demise through single-cell and single-nucleus RNA-seq of lung, kidney, liver and heart tissues, and further contextualized our findings through spatial RNA profiling of distinct lung regions. We developed a computational framework that incorporates removal of ambient RNA and automated cell type annotation to facilitate comparison with other healthy and diseased tissue atlases. In the lung, we uncovered significantly altered transcriptional programs within the epithelial, immune, and stromal compartments and cell intrinsic changes in multiple cell types relative to lung tissue from healthy controls. We observed evidence of: alveolar type 2 (AT2) differentiation replacing depleted alveolar type 1 (AT1) lung epithelial cells, as previously seen in fibrosis; a concomitant increase in myofibroblasts reflective of defective tissue repair; and putative TP63+ intrapulmonary basal-like progenitor (IPBLP) cells, similar to cells identified in H1N1 influenza, that may serve as an emergency cellular reserve for severely damaged alveoli. Together, these findings suggest the activation and failure of multiple avenues for regeneration of the epithelium in these terminal lungs. SARS-CoV-2 RNA reads were enriched in lung mononuclear phagocytic cells and endothelial cells, and these cells expressed distinct host response transcriptional programs. We corroborated the compositional and transcriptional changes in lung tissue through spatial analysis of RNA profiles in situ and distinguished unique tissue host responses between regions with and without viral RNA, and in COVID-19 donor tissues relative to healthy lung. Finally, we analyzed genetic regions implicated in COVID-19 GWAS with transcriptomic data to implicate specific cell types and genes associated with disease severity. Overall, our COVID-19 cell atlas is a foundational dataset to better understand the biological impact of SARS-CoV-2 infection across the human body and empowers the identification of new therapeutic interventions and prevention strategies.
In high-throughput studies, hundreds to millions of hypotheses are typically tested. Statistical methods that control the false discovery rate (FDR) have emerged as popular and powerful tools for error rate control. While classic FDR methods use only p values as input, more modern FDR methods have been shown to increase power by incorporating complementary information as informative covariates to prioritize, weight, and group hypotheses. However, there is currently no consensus on how the modern methods compare to one another. We investigate the accuracy, applicability, and ease of use of two classic and six modern FDR-controlling methods by performing a systematic benchmark comparison using simulation studies as well as six case studies in computational biology. Methods that incorporate informative covariates are modestly more powerful than classic approaches, and do not underperform classic approaches, even when the covariate is completely uninformative. The majority of methods are successful at controlling the FDR, with the exception of two modern methods under certain settings. Furthermore, we find that the improvement of the modern FDR methods over the classic methods increases with the informativeness of the covariate, total number of hypothesis tests, and proportion of truly non-null hypotheses. Modern FDR methods that use an informative covariate provide advantages over classic FDR-controlling procedures, with the relative gain dependent on the application and informativeness of available covariates. We present our findings as a practical guide and provide recommendations to aid researchers in their choice of methods to correct for false discoveries.
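As a point of reference for what a classic, p-value-only FDR procedure looks like, here is a minimal sketch of the Benjamini-Hochberg step-up procedure. The abstract does not name the two classic methods it benchmarks, so choosing Benjamini-Hochberg here is an assumption; the function names and the example p values are hypothetical, and no covariate information is used.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return booleans marking which hypotheses are rejected at FDR level alpha."""
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest 1-based rank r with p_(r) <= alpha * r / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


def bh_adjust(p_values):
    """BH-adjusted p values, with monotonicity enforced from the largest rank down."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p values from eight tests.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
rejected = benjamini_hochberg(example_p, alpha=0.05)
adjusted = bh_adjust(example_p)
```

With these example values only the two smallest p values pass the step-up thresholds at alpha = 0.05. The covariate-aware methods discussed above replace this single global ranking with weights or groupings informed by the covariate, which is where the reported power gain comes from.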
Effective management and treatment of cancer is greatly complicated by the rapid evolution and resulting heterogeneity of tumors. In prior work, we showed that phylogenetic study of cell populations in single tumors provides a way to make sense of this heterogeneity and identify robust features of evolutionary processes of single tumors. The introduction of single-cell sequencing has shown great promise for advancing single-tumor phylogenetics, but the volume and high noise of these data present many challenges for studying tumor evolution, especially with regard to the chromosome abnormalities that typically dominate tumor evolution. We propose a reference-free approach to mining genome sequence reads to allow predictive classification of tumors into heterogeneous types and reconstruct models of their evolution. The approach extracts k-mer counts from single-cell tumor sequences, using differences in normalized k-mer frequencies as a proxy for overall evolutionary distance between distinct cells. The approach is computationally more efficient in time and space than standard protocols for deriving phylogenetic markers, which rely on first aligning sequence reads to a reference genome and then processing the data downstream to extract meaningful progression markers and use them to construct phylogenetic trees. The approach also provides a way to bypass some of the challenges that massive genome rearrangement typical of tumor genomes presents for reference-based methods. To handle the unique challenges of single-cell sequencing data, we have applied a series of noise correction measures intended to account for biases due to the sequencing technology. We illustrate the method using publicly available tumor single-cell sequencing data. Phylogenies built from these k-mer spectrum distance matrices yield splits that are statistically significant when tested for their ability to partition cells at different stages of cancer.
Abstract SARS-CoV-2, the coronavirus that causes COVID-19, binds to angiotensin-converting enzyme 2 (ACE2) on human cells. Beyond the lung, COVID-19 impacts diverse tissues including the kidney. ACE2 is a key member of the Renin-Angiotensin-Aldosterone System (RAAS), which regulates blood pressure, largely through its effects on the kidney. RAAS blockers such as ACE inhibitors (ACEi) and Angiotensin Receptor Blockers (ARBs) are widely used therapies for hypertension, cardiovascular and chronic kidney diseases, and there is therefore intense interest in their effect on ACE2 expression and its implications for SARS-CoV-2 pathogenicity. Here, we analyzed single-cell and single-nucleus RNA-seq of human kidney to interrogate the association of ACEi/ARB use with ACE2 expression in specific cell types. First, we performed an integrated analysis aggregating 176,421 cells across 49 donors, 8 studies and 8 centers, and adjusting for sex, age, donor and center effects, to assess the relationship of ACE2 with age and sex at baseline. We observed a statistically significant increase in ACE2 expression in tubular epithelial cells of the thin loop of Henle (tLoH) in males relative to females at younger ages, with the trend reversing and losing significance at older ages. ACE2 expression in tLoH increases with age in females, with an opposite, weak effect in males. In an independent cohort, we detected a statistically significant increase in ACE2 expression with ACEi/ARB use in epithelial cells of the proximal tubule and thick ascending limb, and endothelial cells, but the association was confounded in this small cohort by the underlying disease. Our study illuminates the dynamics of ACE2 expression in specific kidney cells, with implications for SARS-CoV-2 entry and pathogenicity.
Abstract The role of tumor heterogeneity in the progression of lung adenocarcinoma remains poorly understood. To understand how tumor heterogeneity impacts tumor evolution, we profiled single-cell transcriptomes from genetically engineered mouse lung tumors at various stages. We observed a set of reproducible transcriptional states whose diversity increased over time. Additionally, we identified a highly plastic cell state that arose in every tumor we obtained. We profiled this cell state and identified a robust potential for phenotypic switching, an increased potential for spheroid formation in tumor sphere cultures, and an enrichment of the highly plastic cell state after chemotherapeutic stress in vivo. Our work suggests that the highly plastic cell state plays a significant role in, and perhaps even drives, tumor progression and resistance to therapy in lung adenocarcinoma. Citation Format: Jason Earl Chan, Nemanja Despot Marjanovic, Matan Hofree, David Canner, Katherine Wu, Griffin Hartmann, Olivia C. Smith, Jonathan Kim, Anna Hudson, Ayshwarya Subramanian, Kenneth Pitter, Natasha Rekhtman, Pierre P. Massion, John T. Poirier, Charles M. Rudin, Tyler Jacks, Aviv Regev, Tuomas Tammela. A highly plastic cell state emerges during lung adenocarcinoma evolution [abstract]. In: Proceedings of the Annual Meeting of the American Association for Cancer Research 2020; 2020 Apr 27-28 and Jun 22-24. Philadelphia (PA): AACR; Cancer Res 2020;80(16 Suppl):Abstract nr 1511.