Abstract Background: The cancer genome is commonly altered by thousands of structural rearrangements, including insertions, deletions, translocations, inversions, duplications, and copy number variations. Thus, structural variant (SV) characterization plays a paramount role in cancer target identification, oncology diagnostics, and personalized medicine. As part of the SEQC2 Consortium effort, the present study established and evaluated a consensus SV call set using a breast cancer reference cell line and a matched normal control derived from the same donor, which were used as reference samples in our companion benchmarking studies. Results: We systematically investigated somatic SVs in the reference cancer cell line by comparing it to a matched normal cell line using multiple NGS platforms, including Illumina short reads, 10X Genomics linked reads, PacBio long reads, Oxford Nanopore long reads, and high-throughput chromosome conformation capture (Hi-C). We established a consensus SV call set of 1788 SVs for the reference cancer cell line, comprising 717 deletions, 230 duplications, 551 insertions, 133 inversions, 146 translocations, and 11 breakends. To independently evaluate and cross-validate the accuracy of our consensus SV call set, we used orthogonal methods including PCR-based validation, Affymetrix arrays, Bionano optical mapping, and identification of fusion genes detected from RNA-seq. We evaluated the strengths and weaknesses of each NGS technology for SV determination, and our findings provide an actionable guide to improve cancer genome SV detection sensitivity and accuracy. Conclusions: A high-confidence consensus SV call set was established for the reference cancer cell line. A large subset of the variants identified was validated by multiple orthogonal methods.
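As a quick arithmetic check on the call set composition reported above, the following minimal sketch tallies the per-type counts and prints each type's share of the total. The counts are taken from the abstract; the dictionary layout and variable names are illustrative assumptions, not part of the study's pipeline.

```python
# Per-type counts of the consensus SV call set, as reported in the abstract.
sv_counts = {
    "deletion": 717,
    "duplication": 230,
    "insertion": 551,
    "inversion": 133,
    "translocation": 146,
    "breakend": 11,
}

# The per-type counts should add up to the reported total.
total = sum(sv_counts.values())

# Print the types from most to least frequent with their share of the total.
for sv_type, count in sorted(sv_counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{sv_type:<14}{count:>5}  {100 * count / total:5.1f}%")
print(f"{'total':<14}{total:>5}")
```

The six categories sum to the reported total of 1788, with deletions and insertions together making up the majority of the call set.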
Cell lines derived from Trichoplusia ni are commonly used for recombinant protein expression via baculovirus infection to generate materials approved for clinical use and for use in clinical trials. In order to develop systems biology and genome engineering tools to improve protein expression in this host, we performed de novo genome assembly of the Trichoplusia ni-derived cell line Tni-FNL. By integrating PacBio single-molecule sequencing, Bionano optical mapping, and 10X Genomics linked-reads data, we have produced a draft genome assembly of Tni-FNL. Our assembly contains 280 scaffolds, with an N50 scaffold size of 2.3 Mb and a total length of 359 Mb. Annotation of the Tni-FNL genome resulted in 14,101 predicted genes, and 93.2% of the predicted proteome contained recognizable protein domains. Ortholog searches within the superorder Holometabola provided further evidence of the high accuracy and completeness of the Tni-FNL genome assembly. This first draft Tni-FNL genome assembly was enabled by complementary long-read technologies and represents a high-quality, well-annotated genome that provides novel insight into the complexity of this insect cell line and can serve as a reference for future large-scale genome engineering work in this and other similar recombinant protein production hosts.
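The scaffold N50 quoted above is the length L such that scaffolds of length L or longer together cover at least half of the total assembly length. A minimal sketch of that calculation follows. The function name `n50` and the toy scaffold lengths are hypothetical illustrations, not data from the Tni-FNL assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    N50 is the largest length L such that scaffolds of length >= L
    together account for at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down, accumulating covered length.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy lengths (arbitrary units), not the real assembly.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))
```

For the toy input the total length is 300, and the two longest scaffolds (80 and 70) already cover 150, so the sketch reports an N50 of 70.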
Abstract With the rapid advancement of sequencing technologies in the past decade, next-generation sequencing (NGS) analysis has been widely applied in cancer genomics research. More recently, NGS has been adopted in clinical oncology to advance personalized medicine. Clinical applications of precision oncology require accurate tests that can distinguish tumor-specific mutations from errors or artifacts introduced during NGS processes or data analysis. Therefore, there is an urgent need both for best practices in NGS-based cancer mutation detection and for standard reference data sets with which to systematically benchmark sequencing platforms, library protocols, and bioinformatics pipelines and to measure accuracy and reproducibility across platforms and methods. Within the SEQC2 consortium context, we established paired tumor-normal reference samples: a human triple-negative breast cancer cell line and a matched normal cell line derived from B lymphocytes. We generated whole-genome sequencing (WGS) and whole-exome sequencing (WES) data using 16 NGS library preparation protocols and seven sequencing platforms at six different centers. We systematically interrogated somatic mutations in the paired reference samples to identify factors affecting detection reproducibility and accuracy in cancer genomes. These large cross-platform, cross-site WGS and WES datasets, generated from well-characterized reference samples, will represent a powerful resource for benchmarking NGS technologies and bioinformatics pipelines and for cancer genomics studies.
Abstract The desmoplastic stroma of pancreatic cancers forms a physical barrier that impedes intratumoral drug delivery. Attempts to modulate the desmoplastic stroma to increase delivery of administered chemotherapy have not shown positive clinical results thus far, and preclinical reports in which chemotherapeutic drugs were coadministered with antistromal therapies did not universally demonstrate increased genotoxicity despite increased intratumoral drug levels. In this study, we tested whether TGFβ antagonism can break the stromal barrier and enhance perfusion and tumoral drug delivery, and we interrogated the cellular and molecular mechanisms by which the tumor prevents synergism with coadministered gemcitabine. TGFβ inhibition in genetically engineered murine models (GEMM) of pancreas cancer enhanced tumoral perfusion and increased intratumoral gemcitabine levels. However, tumors rapidly adapted to TGFβ-dependent stromal modulation, and intratumoral perfusion returned to pre-treatment levels upon extended TGFβ inhibition. Perfusion was governed by the phenotypic identity and distribution of cancer-associated fibroblasts (CAF) with the myelofibroblastic phenotype (myCAFs), and myCAFs that harbored unique genomic signatures rapidly escaped the restricting effects of TGFβ inhibition. Despite the reformation of the stromal barrier and reversal of initially increased intratumoral exposure levels, TGFβ inhibition in cooperation with gemcitabine effectively suppressed tumor growth via cooperative reprogramming of T regulatory cells and stimulation of CD8 T cell–mediated antitumor activity. The antitumor activity was further improved by the addition of anti–PD-L1 immune checkpoint blockade to offset adaptive PD-L1 upregulation induced by TGFβ inhibition. These findings support the development of combined antistroma anticancer therapies capable of impacting the tumor beyond the disruption of the desmoplastic stroma as a physical barrier to improve drug delivery.
Abstract The standard treatment for locally advanced rectal cancer is chemoradiotherapy followed by surgery. However, patient response to pre-operative chemoradiotherapy is highly heterogeneous, ranging from complete tumor regression to no response. Over the course of disease, up to 40% of patients suffer from local relapse or the development of metachronous metastatic disease, compromising survival. It is not yet known how treatment resistance and disease progression in these patients evolve – whether from the selective outgrowth of pre-existing resistant clones during treatment or through the acquisition of additional, resistance-conferring genomic alterations. We aim to delineate the genomic evolution and clonal architecture underlying treatment resistance and metastatic progression in rectal cancer based on a cohort of 42 patients, from whom 98 tumor samples were collected longitudinally over the course of treatment and metastatic progression. Tumor diagnosis and treatment response assessment are routinely based on the histopathologic evaluation of formalin-fixed paraffin-embedded (FFPE) patient tissues. While formalin fixation and paraffin embedding optimally preserve histomorphology and allow long-term tissue storage at ambient temperature, they also cause DNA crosslinks that pose a challenge for sequencing analyses. To overcome this limitation, we have successfully developed a protocol for dissociation of FFPE tissues into single cells that are suitable for immunolabeling, flow sorting, and genomic analysis by whole exome sequencing of pure tumor cell populations, single cell copy number variation (CNV) sequencing, and multiplex interphase fluorescence in situ hybridization (miFISH). Our genomic analyses at the population as well as the individual cell level revealed divergent aberration patterns and distinct levels of intra-tumor heterogeneity.
Major tumor clones were characterized by gains of EGFR (7p11.2), MYC (8q24.21), CDX2 (13q12.2) and ZNF217 (20q13.2) along with losses of SMAD4 (18q21.2) and TP53 (17p13.1). While in some patients the clonal composition remained largely stable throughout treatment, in others major clonal shifts occurred favoring clones with TP53 loss. Our preliminary results indicate different propensities of genetically distinct tumor cell clones in therapy response and metastatic progression. Furthermore, our data show the feasibility of high-resolution clonal reconstruction from whole exome sequencing and single cell CNV sequencing data of FFPE cells. Citation Format: Daniela Hirsch, Judith Lieberich, Kerstin Heselmeyer-Haddad, Yonca Ceribas, Michael Kelly, Keyur Talsania, Yongmei Zhao, Timo Gaiser, Thomas Ried. Genomic heterogeneity and clonal dynamics of resistance evolution and metastatic progression in rectal cancer [abstract]. In: Proceedings of the AACR Virtual Special Conference on Tumor Heterogeneity: From Single Cells to Clinical Impact; 2020 Sep 17-18. Philadelphia (PA): AACR; Cancer Res 2020;80(21 Suppl):Abstract nr PO-106.
Gene expression analysis by RNA sequencing (RNA-seq) enables unique insights into clinical samples that can potentially lead to a mechanistic understanding of the basis of various diseases as well as of resistance and/or susceptibility mechanisms. However, formalin-fixed paraffin-embedded (FFPE) tissues, the most common format for preserving tissue morphology in clinical specimens, are not the best sources for gene expression profiling analysis. The RNA obtained from such samples is often degraded, fragmented, and chemically modified, which leads to suboptimal sequencing libraries. In turn, these libraries generate poor quality sequence data that may not be reliable for gene expression analysis and mutation discovery. In order to make the most of FFPE samples and obtain the best possible data from low quality samples, it is important to take certain precautions when planning the experimental design, preparing sequencing libraries, and analyzing data. These precautions include using appropriate metrics for precise sample quality control (QC), identifying the best methods for the various steps of sequencing library generation, and performing careful library QC. In addition, applying correct software tools and parameters for sequence data analysis is critical in order to identify artifacts in RNA-seq data, filter out contamination and low quality reads, assess uniformity of gene coverage, and measure the reproducibility of gene expression profiles among biological replicates. These steps can ensure high accuracy and reproducibility for profiling of very heterogeneous RNA samples. Here we describe the various steps for sample QC, library preparation and QC, sequencing, and data analysis that can help to increase the amount of useful data obtained from low quality RNA, such as that obtained from FFPE tissues.
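One of the analysis steps mentioned above, measuring the reproducibility of expression profiles among biological replicates, can be sketched as a Pearson correlation of log-transformed gene counts. This is an illustrative assumption about how such a check might be done, not the specific method of the chapter. The helper names and the two toy count vectors are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def log2_plus_one(counts):
    """log2(count + 1) transform, which tolerates zero counts."""
    return [math.log2(c + 1) for c in counts]


# Hypothetical per-gene read counts for two biological replicates.
replicate_1 = [0, 12, 150, 980, 45, 3]
replicate_2 = [1, 15, 140, 1010, 50, 2]

r = pearson(log2_plus_one(replicate_1), log2_plus_one(replicate_2))
print(f"replicate correlation (log2 scale): {r:.3f}")
```

The log transform keeps a handful of highly expressed genes from dominating the coefficient, which matters for degraded samples where coverage can be very uneven. In practice, counts would first be normalized for library size.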
Abstract With the rapid advancement of sequencing technologies, next-generation sequencing (NGS) analysis has been widely applied in cancer genomics research. More recently, NGS has been adopted in clinical oncology to advance personalized medicine. Clinical applications of precision oncology require accurate tests that can distinguish tumor-specific mutations from artifacts introduced during NGS processes or data analysis. Therefore, there is an urgent need both for best practices in NGS-based cancer mutation detection and for standard reference data sets with which to systematically measure accuracy and reproducibility across platforms and methods. Within the SEQC2 consortium context, we established paired tumor-normal reference samples and generated whole-genome sequencing (WGS) and whole-exome sequencing (WES) data using sixteen library protocols and seven sequencing platforms at six different centers. We systematically interrogated somatic mutations in the reference samples to identify factors affecting detection reproducibility and accuracy in cancer genomes. These large cross-platform, cross-site WGS and WES datasets, generated from well-characterized reference samples, will represent a powerful resource for benchmarking NGS technologies and bioinformatics pipelines and for cancer genomics studies.