Abstract Cell-free DNA (cfDNA) contains a composite map of the epigenomes of its cells-of-origin. Tissue-specific transcription factor (TF) binding inferred from cfDNA could enable us to track disease states in humans in a minimally invasive manner. Here, by enriching for short cfDNA fragments, we directly map TF footprints at single binding sites from plasma. We show that the enrichment of TF footprints in plasma reflects the binding strength of the TF in cfDNA tissue-of-origin. Based on this principle, we were able to identify the subset of genome-wide binding sites for selected TFs that leave TF-specific footprints in plasma. These footprints enabled us to not only identify the tissue-of-origin of cfDNA but also map the chromatin structure around the factor-bound sites in their cells-of-origin. To ask if we can use these plasma TF footprints to map cancer states, we first defined pure tumor TF signatures in plasma in vivo using estrogen receptor-positive (ER+) breast cancer xenografts. We found that the tumor-specific cfDNA protections of ER-α could distinguish WT, ER-amplified, and ER-mutated xenografts. Further, tumor-specific cfDNA protections of ER-α and FOXA1 reflect TF-specific accessibility across human breast tumors, demonstrating our ability to capture tumor TF binding in plasma. We then scored TF binding in human plasma samples and identified specific binding sites whose plasma TF protections can identify the presence of cancer and specifically breast cancer. Thus, plasma TF footprints enable minimally invasive mapping of the regulatory landscape of cancer in humans.
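The footprint idea can be illustrated with a minimal sketch: count the short fragments whose midpoints fall near a binding site and compare them with nucleosome-sized fragments at the same site. The length cutoffs (80 bp or less for short, 130-180 bp for nucleosomal), the 50 bp window, the pseudocount, and the function name are assumptions chosen for illustration, not the parameters used in the study.

```python
import math

def footprint_enrichment(fragments, site_center, window=50,
                         short_max=80, nuc_range=(130, 180), pseudocount=1.0):
    """Return log2(short / nucleosome-sized) fragment counts near one site.

    fragments: iterable of (start, end) half-open coordinates on one chromosome.
    Only fragments whose midpoint lies within `window` bp of the site count.
    """
    short = 0
    nucleosomal = 0
    for start, end in fragments:
        length = end - start
        midpoint = start + length // 2
        if abs(midpoint - site_center) > window:
            continue  # fragment is not centered near the binding site
        if length <= short_max:
            short += 1
        elif nuc_range[0] <= length <= nuc_range[1]:
            nucleosomal += 1
    # The pseudocount keeps the ratio finite when one class is absent.
    return math.log2((short + pseudocount) / (nucleosomal + pseudocount))

# Toy fragments (arbitrary coordinates, not real data).
toy_fragments = [(975, 1025), (960, 1020), (990, 1050), (920, 1080), (2000, 2050)]
score = footprint_enrichment(toy_fragments, site_center=1000)
```

A positive score means short fragments dominate at the site, which is the pattern expected where a TF rather than a nucleosome protects the DNA.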
Abstract Our current understanding of solid tumors and their progression relies primarily on in vitro models, cell lines, patient-derived xenografts, and scarce data from invasive tissue biopsies. The ability to monitor changes in chromatin structure and transcription factor (TF) binding in tumor cells using a minimally invasive approach in humans has the potential to revolutionize our understanding of disease progression and treatment resistance. In this study, we use the example of estrogen receptor-positive (ER+) breast cancer, the most common disease subtype, and define the ER axis from plasma cell-free DNA (cfDNA). While lymphoid/myeloid cell turnover is the dominant source of cfDNA in the bloodstream, a detectable fraction of DNA from the tumor tissue-of-origin can be found in patients with solid cancers. cfDNA is the product of endogenous nuclease action on chromatin and retains a map of the epigenomes of its cells of origin. Our method therefore non-invasively captures TF-nucleosome dynamics in the tumor tissue-of-origin using plasma cfDNA. First, we show that we can reliably identify active binding of the hematopoietic pioneer factor PU.1 and of CTCF from the cfDNA of healthy humans and cancer patients. Then, to define cfDNA binding of the disease-specific TF ER, we used ER+ patient-derived xenograft (PDX) models, which allow a clear separation of tumor signal from the hematopoietic background. This allowed us to establish the sensitivity and specificity of our approach. We also identified the subset of CUT&RUN-defined ER binding sites that feature the strongest binding in vivo from both the lymphocyte background and cancer cells. Furthermore, we can define the active binding sites of the pioneer factor FOXA1, which facilitates ER binding by opening the chromatin.
Based on the TF protection levels from cfDNA, we were able to define tumor-specific as well as hematopoietic-specific TF binding sites that can serve as potential hotspots to monitor ER+ disease state at around 1% tumor fraction. These data demonstrate our ability to simultaneously monitor TF and nucleosome dynamics at disease sites from plasma alone, which can enable real-time monitoring of disease phenotype in a minimally invasive manner. Citation Format: Satyanarayan Rao, Amy Han, Alexis Zukowski, Etana Kopin, Peter Kabos, Srinivas Ramachandran. Transcription factor-nucleosome dynamics inferred from plasma cfDNA delineates tumor and tumor-microenvironment phenotype [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr 2611.
e21198 Background: Despite revolutionizing cancer therapy, immune checkpoint inhibitors (ICI) still do not benefit a significant proportion of patients. The risks of ICI-related adverse events, the mixed performance of PD-L1 staining in predicting treatment response, and its high cost present a clinical need for more precise methods to define disease states in the context of ICI treatment. ICI response is thought to depend on the phenotype of the tumor and the associated immune response, especially the functional state of CD8+ T cells in the tumor microenvironment (TME). Methods: Here we show that CD8+ T cell signatures from plasma cell-free DNA (cfDNA) could be used to predict the response of patients to anti-PD-1 therapy, using samples collected before and shortly after the start of treatment. Blood samples were drawn just before the first dose (1 day to 1 week before the start of treatment) and 45-60 days after the start of treatment. The treatment duration varied depending on response. Response was evaluated by CT scans every 8-12 weeks. Eleven of the samples are from patients with no or minor response (< 6 months of treatment), and 12 are from patients with prolonged benefit from the medication (> 1 year of treatment). Exhausted CD8+ T cells, one of the targets of immunotherapy, are epigenetically regulated. We therefore used naïve and PD-1high CD8+ T cell ATAC-seq data to define the repertoire of accessible chromatin specific to these cell types. We found an enrichment of short cfDNA fragments at a significant fraction of these sites, enabling us to define responder-specific and non-responder-specific accessible PD-1high regions from cfDNA. Results: We used the enrichment of short versus long cfDNA fragments (reflecting transcription factor-nucleosome dynamics in the tissue of origin of cfDNA) as a scoring function, and used repeated cross-validation of a classifier model to identify differentially enriched features between responders and non-responders.
Our model accurately predicts the two classes of response both pre-treatment (mean test AUC = 0.86; SD = 0.14). Conclusions: Notably, in addition to generating an accurate classifier, our analysis enabled us to identify predominant transcription factor motifs from predictive ATAC-seq peaks that characterize both response and lack of response, paving the way for further understanding the mechanistic basis of patient-specific response to ICI. Our results suggest the possibility of personalized prediction of treatment response that is independent of specific tumor genotype.
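To make the reported metric concrete, the following is a minimal sketch of how a mean test AUC and its SD can be obtained from repeated cross-validation. It assumes a single score per sample and a deliberately trivial "classifier" that only learns the direction of the score from the training folds. The function names, fold count, and repeat count are hypothetical, and the abstract does not specify the actual model.

```python
import random
import statistics

def auc(scores, labels):
    """Rank-based AUC: chance that a positive outscores a negative (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_folds(labels, k, rng):
    """Shuffle each class separately and deal its samples round-robin into k folds."""
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds

def repeated_cv_auc(features, labels, k=3, repeats=20, seed=0):
    """Return (mean, SD) of held-out AUC over repeats * k test folds."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(repeats):
        for test in stratified_folds(labels, k, rng):
            held_out = set(test)
            train = [i for i in range(len(labels)) if i not in held_out]
            # "Training": learn only whether responders score higher or lower.
            pos_mean = statistics.mean([features[i] for i in train if labels[i] == 1])
            neg_mean = statistics.mean([features[i] for i in train if labels[i] == 0])
            sign = 1.0 if pos_mean >= neg_mean else -1.0
            test_scores = [sign * features[i] for i in test]
            aucs.append(auc(test_scores, [labels[i] for i in test]))
    return statistics.mean(aucs), statistics.stdev(aucs)

# Toy, perfectly separable scores (arbitrary values, not study data).
toy_features = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6]
toy_labels = [0] * 6 + [1] * 6
mean_auc, sd_auc = repeated_cv_auc(toy_features, toy_labels)
```

Reporting the SD alongside the mean, as the abstract does, shows how much the held-out performance varies between splits, which matters for cohorts of this size.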
DNA-binding proteins play important roles in various cellular processes, but the mechanisms by which proteins recognize genomic target sites remain incompletely understood. Functional groups at the edges of the base pairs (bp) exposed in the DNA grooves represent physicochemical signatures. As these signatures enable proteins to form specific contacts between protein residues and bp, their study can provide mechanistic insights into protein-DNA binding. Existing experimental methods, such as X-ray crystallography, can reveal such mechanisms based on physicochemical interactions between proteins and their DNA target sites. However, the low throughput of structural biology methods limits mechanistic insights into the selection of many genomic sites. High-throughput binding assays enable prediction of potential target sites by determining the relative binding affinities of a protein to massive numbers of DNA sequences. Many currently available computational methods are based on the sequence of standard Watson-Crick bp. They assume that each base pair contributes independently to the overall binding affinity, or alternatively include dinucleotides or short k-mers. These methods cannot be directly extended to physicochemical contacts, and they are not suited to DNA modifications or non-Watson-Crick bp, such as DNA methylation and synthetic or mismatched bp. The proposed method, DeepRec, can predict relative binding affinities as a function of physicochemical signatures, as well as the effect of DNA methylation or other chemical modifications on binding. Sequence-based modeling methods are, in comparison, a coarse-grained description and cannot achieve such insights. Our chemistry-based modeling framework provides a path towards understanding genome function at a mechanistic level.
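As a sketch of what a physicochemical signature can look like in practice, the example below encodes each base pair by the textbook pattern of functional groups along its major-groove edge (hydrogen-bond acceptor, donor, methyl group, nonpolar hydrogen). This is an illustrative assumption and not necessarily the exact encoding used by DeepRec. The symbols `M` (5-methylcytosine) and `g` (guanine paired with 5-methylcytosine) are a convention assumed here for illustration.

```python
# Dictionary keys are base symbols; values are major-groove group codes read
# across the base pair from the named base to its partner:
#   A = H-bond acceptor, D = H-bond donor, M = methyl group, H = nonpolar hydrogen
MAJOR_GROOVE = {
    "A": "ADAM",  # A-T pair
    "T": "MADA",  # T-A pair
    "G": "AADH",  # G-C pair
    "C": "HDAA",  # C-G pair
    "M": "MDAA",  # 5mC-G pair: the C5 hydrogen is replaced by a methyl group
    "g": "AADM",  # G-5mC pair: the same change seen from the guanine side
}
GROUPS = "ADMH"

def signature(sequence):
    """Return the major-groove group pattern for each base pair."""
    return [MAJOR_GROOVE[base] for base in sequence]

def one_hot(sequence):
    """Flatten the signature into a 0/1 vector: 4 groove positions x 4 group types per bp."""
    vector = []
    for pattern in signature(sequence):
        for group in pattern:
            vector.extend(1 if group == g else 0 for g in GROUPS)
    return vector

unmethylated = signature("ACGT")
methylated = signature("AMgT")
```

In this representation a methylated cytosine differs from an unmethylated one at a single groove position, whereas a purely sequence-based one-hot encoding would need an entirely new letter with no relation to cytosine.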
Cell-free DNA (cfDNA) has the potential to enable non-invasive detection of disease states and progression. Beyond its sequence, cfDNA also represents the nucleosomal landscape of cell(s)-of-origin and captures the dynamics of the epigenome. In this review, we highlight the emergence of cfDNA epigenomic methods that assess disease beyond the scope of mutant tumour genotyping. Detection of tumour mutations is the gold standard for sequencing methods in clinical oncology. However, limitations inherent to mutation targeting in cfDNA, and the possibilities of uncovering molecular mechanisms underlying disease, have made epigenomics of cfDNA an exciting alternative. We discuss the epigenomic information revealed by cfDNA, and how epigenomic methods exploit cfDNA to detect and characterize cancer. Future applications of cfDNA epigenomic methods that complement, and act orthogonally to, current clinical practices have the potential to transform cancer management and improve cancer patient outcomes.
DNA shape analysis has demonstrated the potential to reveal structure-based mechanisms of protein-DNA binding. However, information about the influence of chemical modification of DNA is limited. Cytosine methylation, the most frequent modification, represents the addition of a methyl group at the major groove edge of the cytosine base. In mammalian genomes, cytosine methylation most frequently occurs at CpG dinucleotides. In addition to changing the chemical signature of C/G base pairs, cytosine methylation can affect DNA structure. Since the original discovery of DNA methylation, major efforts have been made to understand its effect from a sequence perspective. Compared to unmethylated DNA, however, little structural information is available for methylated DNA, due to the limited number of experimentally determined structures. To achieve a better mechanistic understanding of the effect of CpG methylation on local DNA structure, we developed a high-throughput method, methyl-DNAshape, for predicting the effect of cytosine methylation on DNA shape. Using our new method, we found that CpG methylation significantly altered local DNA shape. Four DNA shape features (helix twist, minor groove width, propeller twist, and roll) were considered in this analysis. Distinct distributions of effect size were observed for different features. Roll and propeller twist were the DNA shape features most strongly affected by CpG methylation, with an effect size depending on the local sequence context. Methylation-induced changes in DNA shape were predictive of the measured rate of cleavage by DNase I and suggest a possible mechanism for some of the methylation sensitivities that were recently observed for human Pbx-Hox complexes. CpG methylation is an important epigenetic mark in the mammalian genome. Understanding its role in protein-DNA recognition can further our knowledge of gene regulation.
Our high-throughput methyl-DNAshape method can be used to predict the effect of cytosine methylation on DNA shape and its subsequent influence on protein-DNA interactions. This approach overcomes the limited availability of experimental DNA structures that contain 5-methylcytosine.
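Because the analysis centers on CpG dinucleotides, a first step in any such workflow is to locate them and mark the methylated positions on both strands. The sketch below does only that and does not predict any shape feature. The function names are hypothetical, and the output alphabet (`M` for 5-methylcytosine, `g` for the guanine paired with a 5-methylcytosine on the opposite strand) is an assumed convention for illustration.

```python
def cpg_positions(sequence):
    """Return the 0-based index of the C in every CpG dinucleotide."""
    seq = sequence.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]

def methylate_cpgs(sequence):
    """Mark symmetric CpG methylation on a forward-strand sequence.

    The C of each CpG becomes 'M'; the following G becomes 'g' because its
    partner cytosine on the reverse strand is methylated as well.
    """
    chars = list(sequence.upper())
    for i in cpg_positions(sequence):
        chars[i] = "M"
        chars[i + 1] = "g"
    return "".join(chars)

marked = methylate_cpgs("ACGTCGA")
```

Comparing a shape prediction for the marked sequence with one for the original sequence, position by position, would then give the per-feature effect sizes described above.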
Here, we present a pipeline to map states of protein-binding DNA in vivo. Our pipeline infers as well as quantifies cooperative binding. Using dual-enzyme single-molecule footprinting (dSMF) data, we show how our workflow identifies binding states at an enhancer in Drosophila S2 cells. Data from cells lacking endogenous DNA methylation are a prerequisite for this pipeline. For complete details on the use and execution of this protocol, please refer to Rao et al. (2021) and Krebs et al. (2017).
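One simple way to think about quantifying cooperative binding from single-molecule data is to compare how often two sites are bound on the same molecule with how often that would happen if binding were independent. The sketch below assumes each molecule has already been reduced to a pair of bound/unbound calls for two sites. The function name and the observed-over-expected ratio are illustrative assumptions and not the published pipeline.

```python
from collections import Counter

def binding_state_summary(molecules):
    """Summarize binding states for two sites across single DNA molecules.

    molecules: list of (bound_a, bound_b) booleans, one tuple per molecule.
    Returns the state counts and the observed/expected co-occupancy ratio;
    a ratio above 1 suggests the two sites tend to be bound together.
    """
    counts = Counter(molecules)
    n = len(molecules)
    p_a = (counts[(True, True)] + counts[(True, False)]) / n
    p_b = (counts[(True, True)] + counts[(False, True)]) / n
    p_ab = counts[(True, True)] / n
    expected = p_a * p_b  # co-occupancy expected under independent binding
    ratio = p_ab / expected if expected > 0 else float("nan")
    return {"counts": dict(counts), "p_a": p_a, "p_b": p_b,
            "p_ab": p_ab, "cooccupancy_ratio": ratio}

# Toy molecules (arbitrary calls, not real dSMF data).
toy_molecules = ([(True, True)] * 4 + [(True, False)] + [(False, True)]
                 + [(False, False)] * 4)
summary = binding_state_summary(toy_molecules)
```

Because every call comes from one molecule, the joint state is observed directly instead of being inferred from population averages, which is what makes a co-occupancy comparison of this kind possible.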