Transcriptional programmes active in haematopoietic cells enable a variety of functions including dedifferentiation, innate immunity and adaptive immunity. Understanding how these programmes function in the context of cancer can provide valuable insights into host immune response, cancer severity and potential therapy response. Here we present a method that uses the transcriptomes of over 200 murine haematopoietic cells to infer the lineage-specific haematopoietic activity present in human breast tumours. Correlating this activity with patient survival and tumour purity reveals that the transcriptional programmes of many cell types influence patient prognosis and are found in environments of high lymphocytic infiltration. Collectively, these results allow for a detailed and personalized assessment of the patient immune response to a tumour. When combined with routinely collected patient biopsy genomic data, this method can enable a richer understanding of the complex interplay between the host immune system and cancer.
We develop a statistical framework to study the relationship between chromatin features and gene expression. The framework can be used to predict the expression of protein-coding genes as well as microRNAs. We demonstrate the prediction in a variety of contexts, focusing particularly on the modENCODE worm datasets. Moreover, our framework reveals how distinct chromatin features at different positions around genes (upstream or downstream) contribute to the overall prediction of expression levels.
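To make the idea of a positional contribution concrete, the sketch below correlates a chromatin signal with expression separately in each position bin around a gene. This is only an illustration under stated assumptions: the abstract does not describe the framework's actual model, and the per-bin Pearson correlation, the bin layout, the function names `pearson` and `positional_contribution`, and the toy values are all hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant bin carries no information here
    return cov / sqrt(vx * vy)

def positional_contribution(signal, expression):
    """Correlate one chromatin feature with expression, bin by bin.

    signal: dict gene -> list of signal values, one per position bin
            (assumed ordered from upstream to downstream of the gene).
    expression: dict gene -> expression level.
    Returns one correlation per bin, in bin order.
    """
    genes = [g for g in signal if g in expression]
    n_bins = len(signal[genes[0]])
    levels = [expression[g] for g in genes]
    return [
        pearson([signal[g][b] for g in genes], levels)
        for b in range(n_bins)
    ]

# Toy data (hypothetical): three genes, three position bins.
toy_signal = {"g1": [0, 1, 5], "g2": [0, 2, 3], "g3": [0, 3, 1]}
toy_expression = {"g1": 1.0, "g2": 2.0, "g3": 3.0}
profile = positional_contribution(toy_signal, toy_expression)
```

Bins with a strongly positive or negative correlation are the positions where that feature is most informative about expression in this toy setting; a real analysis would combine many features in a multivariate model.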
Among the many factors determining protein evolutionary rate, protein-protein interaction degree (PPID) has been intensively investigated in recent years, but its precise effect on protein evolutionary rate is still heavily debated. We first confirmed that the correlation between protein evolutionary rate and PPID varies considerably across different protein interaction datasets. Specifically, because the inconsistency was greatest between yeast two-hybrid and other datasets, we reasoned that differences in experimental methods contribute to our inability to clearly define how PPID affects protein evolutionary rate. To address this, we integrated protein interaction and gene co-expression data to derive a co-expressed protein-protein interaction degree (ePPID) measure, which reflects the number of partners with which a protein can permanently interact. Irrespective of the experimental method employed, we found that (1) ePPID is a better predictor of protein evolutionary rate than PPID, (2) ePPID is a more robust predictor of protein evolutionary rate than PPID, and (3) the contribution of ePPID to protein evolutionary rate is statistically independent of expression level. Analysis of hub proteins in the Structural Interaction Network further supported ePPID as a better predictor of protein evolutionary rate than the number of distinct binding interfaces and clarified the slower evolution of co-expressed multi-interface hub proteins relative to other hub proteins. Our study firmly established ePPID as a robust predictor of protein evolutionary rate, irrespective of experimental method, and underscored the importance of permanent interactions in shaping the evolutionary outcome.
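One way to read the ePPID definition above is as a filtered degree count: of all the partners a protein interacts with, keep only those whose expression profile tracks its own. The sketch below assumes Pearson correlation and a cutoff of 0.5 as the co-expression criterion; both are illustrative choices, not parameters stated in the abstract, and the function names `pearson` and `eppid` and the toy data are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # constant profile: treat as not co-expressed
    return cov / sqrt(vx * vy)

def eppid(interactions, expression, cutoff=0.5):
    """Return {protein: (PPID, ePPID)} from interactions and expression.

    interactions: iterable of undirected (protein_a, protein_b) pairs.
    expression: dict protein -> list of expression values across conditions.
    cutoff: assumed co-expression threshold (illustrative only).
    """
    partners = {}
    for a, b in interactions:
        if a == b:
            continue  # skip self-interactions
        partners.setdefault(a, set()).add(b)
        partners.setdefault(b, set()).add(a)

    result = {}
    for protein, others in partners.items():
        co_expressed = sum(
            1
            for other in others
            if protein in expression
            and other in expression
            and pearson(expression[protein], expression[other]) >= cutoff
        )
        # PPID counts all partners; ePPID counts only co-expressed ones.
        result[protein] = (len(others), co_expressed)
    return result

# Toy data (hypothetical): A interacts with B, C and D.
toy_interactions = [("A", "B"), ("A", "C"), ("A", "D")]
toy_expression = {
    "A": [1, 2, 3, 4],
    "B": [2, 4, 6, 8],   # perfectly correlated with A
    "C": [4, 3, 2, 1],   # anti-correlated with A
    "D": [1, 3, 2, 4],   # correlation 0.8 with A
}
print(eppid(toy_interactions, toy_expression)["A"])  # (3, 2)
```

In this toy case protein A has a PPID of 3 but an ePPID of 2, because partner C is never expressed alongside it.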
ENCODE comprises thousands of functional genomics datasets, and the encyclopedia covers hundreds of cell types, providing a universal annotation for genome interpretation. However, for particular applications, it may be advantageous to use a customized annotation. Here, we develop such a custom annotation by leveraging advanced assays, such as eCLIP, Hi-C, and whole-genome STARR-seq, on several data-rich ENCODE cell types. A key aspect of this annotation is a set of comprehensive, experimentally derived networks of both transcription factors and RNA-binding proteins (TFs and RBPs). Cancer, a disease of system-wide dysregulation, is an ideal application for such a network-based annotation. Specifically, for cancer-associated cell types, we put regulators into hierarchies and measure their network change (rewiring) during oncogenesis. We also extensively survey TF-RBP crosstalk, highlighting how SUB1, a previously uncharacterized RBP, drives aberrant tumor expression and amplifies the effect of MYC, a well-known oncogenic TF. Furthermore, we show how our annotation allows us to place oncogenic transformations in the context of a broad cell space; here, many normal-to-tumor transitions move towards a stem-like state, while oncogene knockdowns show an opposing trend. Finally, we organize the resource into a coherent workflow to prioritize key elements and variants, in addition to regulators. We showcase the application of this prioritization to somatic burdening, cancer differential expression and GWAS. Targeted validations of the prioritized regulators, elements and variants using siRNA knockdowns, CRISPR-based editing, and luciferase assays demonstrate the value of the ENCODE resource.
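As a rough illustration of what measuring rewiring can mean, the sketch below scores each regulator by how much its set of targets differs between two networks, using one minus the Jaccard index. The abstract does not say how rewiring is quantified, so this metric, the function name `rewiring_score`, and the placeholder regulator and gene names are assumptions made purely for illustration.

```python
def rewiring_score(targets_before, targets_after):
    """Score per-regulator target turnover between two networks.

    targets_before, targets_after: dict regulator -> iterable of targets,
    for example from a normal and a tumor cell type respectively.
    Returns dict regulator -> score in [0, 1], where 0 means identical
    target sets and 1 means no shared targets.
    """
    regulators = set(targets_before) | set(targets_after)
    scores = {}
    for reg in regulators:
        before = set(targets_before.get(reg, ()))
        after = set(targets_after.get(reg, ()))
        union = before | after
        if not union:
            scores[reg] = 0.0  # no targets in either network
            continue
        # One minus Jaccard index: fraction of edges gained or lost.
        scores[reg] = 1.0 - len(before & after) / len(union)
    return scores

# Placeholder networks (hypothetical names, not real regulators).
normal = {"R1": {"g1", "g2", "g3"}, "R2": {"g4"}, "R4": {"g1", "g2"}}
tumor = {"R1": {"g1", "g2", "g3"}, "R2": {"g5"}, "R4": {"g2", "g3"}}
scores = rewiring_score(normal, tumor)
# R1 keeps all targets (0.0); R2 swaps its only target (1.0).
```

Ranking regulators by such a score gives a simple first pass at which ones change their wiring most between conditions; a full analysis would also need to account for differences in data depth between the two networks.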
Determining how immune cells functionally interact in the tumor microenvironment and identifying their biological roles and clinical value are critical for understanding cancer progression and developing new therapeutic strategies. Here we introduce TimiGP, a computational method to infer inter-cell functional interaction networks and annotate the corresponding prognostic effect from bulk gene expression and survival statistics data. When applied to metastatic melanoma, TimiGP overcomes the prognostic bias caused by immune co-infiltration and identifies the prognostic value of immune cells consistent with their anti- or pro-tumor roles. It reveals a functional interaction network in which the interaction X→Y indicates that cell X has a more positive impact on survival than cell Y. This network provides immunological insights to facilitate the development of prognostic models, as evidenced by our computationally friendly, biologically interpretable, independently validated models. By leveraging single-cell RNA-seq data for specific immune cell subsets, TimiGP has the flexibility to delineate the tumor microenvironment at different resolutions and is readily applicable to a wide range of cancer types.
<p>Supplementary Figure S4. Additional data supporting Figure 6. In the I/V model, BALB/c mice were administered 100,000 CT26 cells and treated when tumors reached a size of 40 mm³ with isotype (I) or anti-VISTA (V). In the CPV model, BALB/c mice were administered 100,000 CT26 cells and treated with anti-CTLA-4 and anti-PD-1 plus isotype (CP) or plus anti-VISTA (CPV) when tumors reached a size of 600 mm³. CD45+ (A-C, E) or CD8+ AH1-tetramer+ (D) cells were isolated from the TME and analyzed by scRNA-seq. (A) Stacked bar graph showing all lymphoid cluster frequencies as a proportion of the lymphoid infiltrate for I/V and CP/CPV. (B) Stacked bar graph showing NK cluster frequencies as a proportion of the NK infiltrate for I/V and CP/CPV. (C) Stacked bar graph showing CD8+ T cell cluster frequencies as a proportion of the CD8+ T cell infiltrate for I/V and CP/CPV. (D) Mean cluster frequencies as a proportion of CD8+ AH1-tetramer+ T cells in I/V or CP/CPV treatment. (E) Violin plots showing enrichment of a quiescence gene signature in CD8+ T cell clusters in the CD45+ sorted I/V dataset [58]. *P &lt; 0.05, **P &lt; 0.01, ***P &lt; 0.001.</p>