Classifications of proteins into groups of related sequences are in some respects like a periodic table for biology, allowing us to understand the underlying molecular biology of any organism. Pfam is a large collection of protein domains and families. Its scientific goal is to provide a complete and accurate classification of protein families and domains. The next release of the database will contain over 10 000 entries, which leads us to reflect on how far we are from completing this work. Currently Pfam matches 72% of known protein sequences, but for proteins with known structure Pfam matches 95%, which we believe represents the likely upper bound. Based on our analysis a further 28 000 families would be required to achieve this level of coverage for the current sequence database. We also show that as more sequences are added to the sequence databases the fraction of sequences that Pfam matches is reduced, suggesting that continued addition of new families is essential to maintain its relevance.
Accurate detection of minimal residual disease (MRD) can guide individualized management of early stage cancer patients, but current diagnostic approaches lack adequate sensitivity. Circulating tumor DNA (ctDNA) analysis has shown promise for recurrence monitoring, but MRD detection immediately after neoadjuvant therapy or surgical resection has remained challenging. We have developed TARgeted DIgital Sequencing (TARDIS) to simultaneously analyze multiple patient-specific cancer mutations in plasma and improve sensitivity for minute quantities of residual tumor DNA. In 77 reference samples at 0.03%-1% mutant allele fraction (AF), we observed 93.5% sensitivity. Using TARDIS, we analyzed ctDNA in 34 samples from 13 patients with stage II/III breast cancer treated with neoadjuvant therapy. Prior to treatment, we detected ctDNA in 12/12 patients at 0.002%-1.04% AF (0.040% median). After completion of neoadjuvant therapy, we detected ctDNA in 7/8 patients with residual disease observed at surgery and in 1/5 patients with pathological complete response (odds ratio, 18.5; Fisher’s exact p=0.032). These results demonstrate high accuracy for a personalized blood test to detect residual disease after neoadjuvant therapy. With additional clinical validation, TARDIS could identify patients with molecular complete response after neoadjuvant therapy who may be candidates for nonoperative management.
One Sentence Summary: A personalized ctDNA test achieves high accuracy for residual disease.
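The post-treatment comparison above (ctDNA detected in 7/8 patients with residual disease versus 1/5 with pathological complete response) can be checked with a two-sided Fisher's exact test. The following is a minimal sketch using only the Python standard library; the function name and the 2x2 layout are illustrative assumptions, not the authors' analysis code. Note that the simple cross-product odds ratio for this table is 28; the reported 18.5 may reflect a different estimator (for example, a conditional maximum-likelihood estimate), which this sketch does not attempt to reproduce.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def weight(x):
        # Unnormalised probability of a table whose top-left cell equals x.
        return comb(row1, x) * comb(row2, col1 - x)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    observed = weight(a)
    # Integer weights are compared, so there are no floating-point ties.
    extreme = sum(weight(x) for x in range(lo, hi + 1) if weight(x) <= observed)
    return extreme / comb(n, col1)


# Rows: residual disease at surgery, pathological complete response.
# Columns: ctDNA detected, ctDNA not detected (counts from the abstract).
a, b, c, d = 7, 1, 1, 4
p_value = fisher_exact_two_sided(a, b, c, d)
sample_odds_ratio = (a * d) / (b * c)  # cross-product estimate, not 18.5

print(f"Fisher's exact p = {p_value:.3f}")
print(f"cross-product odds ratio = {sample_odds_ratio:.1f}")
```

With these counts the p-value rounds to 0.032, in agreement with the value quoted in the abstract.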
The inter- and intra-tumor heterogeneity of breast cancer needs to be adequately captured in pre-clinical models. We have created a large collection of breast cancer patient-derived tumor xenografts (PDTXs), in which the morphological and molecular characteristics of the originating tumor are preserved through passaging in the mouse. An integrated platform combining in vivo maintenance of these PDTXs along with short-term cultures of PDTX-derived tumor cells (PDTCs) was optimized. Remarkably, the intra-tumor genomic clonal architecture present in the originating breast cancers was mostly preserved upon serial passaging in xenografts and in short-term cultured PDTCs. We assessed drug responses in PDTCs on a high-throughput platform and validated several ex vivo responses in vivo. The biobank represents a powerful resource for pre-clinical breast cancer pharmacogenomic studies (http://caldaslab.cruk.cam.ac.uk/bcape), including identification of biomarkers of response or resistance.
Background: Achieving a pathologic complete response (pCR) has been shown at the patient level to predict excellent long-term event-free survival outcomes. Residual cancer burden (RCB) quantifies the extent of residual disease for patients who did not achieve pCR. We have previously observed in the I-SPY 2 TRIAL that while metastatic events outside the central nervous system (CNS) were dramatically reduced in the setting of pCR, the incidence of CNS metastasis remained similar across RCB classes, raising the possibility that these CNS events may be independent of response in the breast. In this study, we evaluate the type and sites of recurrences by RCB in a large pooled dataset, which allows for analysis within subtype, to validate these findings. Methods: 5161 patients pooled across 12 institutions/trials with available RCB and event-free survival (EFS) data were included in this analysis. EFS was calculated as the interval between treatment initiation and locoregional recurrence, distant recurrence, or death from any cause; patients without an event were censored at the time of last follow-up. The median follow-up was 4.6 years. We summarized the EFS event type, further subdividing the distant recurrence (DR) events by their site of relapse (CNS-only, CNS and other sites, non-CNS). We used a competing risk (Fine-Gray) model to assess which of these site-specific relapses differ between RCB classes and estimated the cumulative incidence of CNS-only and non-CNS events at 5 years. Analyses were performed across the entire study population and within HR/HER2-defined subtypes. Results: Among the 5161 subjects, there were 1164 EFS events, including 92 (7.9%) local recurrences (without distant recurrence and/or death) and 1072 distant recurrence-free survival (DRFS) events. Among the DRFS events, 158 patients died without a distant recurrence.
The remaining 914 patients experienced distant recurrences, including 90 (9.8%) with CNS-only, 145 (15.9%) with CNS and other sites, and 664 (72.6%) with non-CNS distant recurrence; 15 (1.6%) patients had missing recurrence site information. Table 1 summarizes the cumulative incidence of CNS-only and non-CNS recurrence at 5 years and the proportion of CNS-only recurrences among DR events by RCB class, overall and within each HR/HER2 subtype. The incidence of CNS-only recurrences was low and similar across RCB classes. In contrast, the incidence of non-CNS recurrences increased with increasing RCB. As a result, CNS-only recurrences are proportionally higher within the RCB-0 and RCB-I groups than in the RCB-II and RCB-III groups, largely because of the low DR event rate and relatively low frequency of non-CNS recurrence events within the RCB-0 and RCB-I classes. Overall, 27% of the recurrences in the setting of pCR (RCB-0) are due to CNS-only recurrences. Conclusions: Consistent with previous studies, our large pooled analysis confirmed that CNS-only recurrences are uncommon but appear similar across RCB groups, independent of response, suggesting that the CNS is a treatment sanctuary site. In contrast, non-CNS recurrence rates increase as RCB increases. These findings suggest that inclusion of CNS-only recurrences as an outcome event may impact the association between neoadjuvant therapy response and long-term outcomes in the context of current therapies. Novel therapies that cross the blood-brain barrier will be needed to impact CNS recurrence rates.

Table 1: Cumulative incidence of CNS-only and non-CNS distant recurrences at 5 years and proportion of CNS-only events among DR events

RCB class                          0             I             II            III           p

Overall (5161)
  N                                1676          662           2017          806
  Cum. Inc. CNS-only               2%            2%            2%            1%            0.627
  Cum. Inc. non-CNS                3%            6%            16%           27%           <0.001
  # CNS-only / # DR events (%)     26/96 (27%)   14/74 (19%)   39/443 (9%)   11/301 (4%)

HR-HER2- (1774)
  N                                770           212           590           202
  Cum. Inc. CNS-only               2%            3%            2%            4%            0.298
  Cum. Inc. non-CNS                4%            11%           19%           42%           <0.001
  # CNS-only / # DR events (%)     13/50 (26%)   6/32 (19%)    13/148 (9%)   8/111 (7%)

HR-HER2+ (572)
  N                                376           67            100           29
  Cum. Inc. CNS-only               1%            5%            5%            0%            0.022
  Cum. Inc. non-CNS                2%            5%            18%           38%           <0.001
  # CNS-only / # DR events (%)     4/17 (24%)    3/10 (30%)    6/31 (19%)    0/13 (0%)

HR+HER2+ (858)
  N                                313           172           291           82
  Cum. Inc. CNS-only               1%            1%            2%            0%            0.37
  Cum. Inc. non-CNS                2%            3%            15%           26%           <0.001
  # CNS-only / # DR events (%)     3/10 (30%)    2/16 (12%)    7/68 (10%)    0/29 (0%)

HR+HER2- (1957)
  N                                217           211           1036          493
  Cum. Inc. CNS-only               3%            2%            1%            0.2%          0.087
  Cum. Inc. non-CNS                5%            4%            13%           20%           <0.001
  # CNS-only / # DR events (%)     6/19 (32%)    3/16 (19%)    13/196 (7%)   3/148 (2%)

Citation Format: Sonal Shad, Marieke van der Noordaa, Marie Osdoit, Diane de Croze, Anne-Sophie Hamy, Marick Lae, Fabien Reyal, Miguel Martin, María Del Monte-Millán, Sara López-Tarruella, I-SPY 2 TRIAL Consortium, Judy C Boughey, Matthew P Goetz, Tanya Hoskin, Rebekah Gould, Vicente Valero, Gabe Sonke, Tessa G Steenbruggen, Maartje van Seijen, Jelle Wesseling, John Bartlett, Stephen Edge, Mi-Ok Kim, Jean Abraham, Carlos Caldas, Helena Earl, Elena Provenzano, Stephen-John Sammut, David Cameron, Ashley Graham, Peter Hall, Lorna Mackintosh, Fan Fang, Andrew K Godwin, Kelsey Schwensen, Priyanka Sharma, Angela DeMichele, Janet Dunn, Louise Hiller, Larry Hayward, Jeremy Thomas, Kimberly Cole, Lajos Pusztai, Laura Van't Veer, Fraser Symmans, Laura Esserman, Christina Yau. Site of recurrence after neoadjuvant therapy: A multi-center pooled analysis [abstract]. In: Proceedings of the 2020 San Antonio Breast Cancer Virtual Symposium; 2020 Dec 8-11; San Antonio, TX. Philadelphia (PA): AACR; Cancer Res 2021;81(4 Suppl):Abstract nr PD13-02.
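The "# CNS-only / # DR events" row of Table 1 for the overall cohort can be reproduced with simple arithmetic. The sketch below uses only the counts reported in the abstract; the variable and function names are illustrative assumptions. It recomputes the per-class percentages and checks that the class counts add up to the 90 CNS-only recurrences among 914 distant recurrences reported in the Results.

```python
# (CNS-only events, all distant-recurrence events) per RCB class,
# taken from the "Overall" block of Table 1.
overall = {
    "RCB-0": (26, 96),
    "RCB-I": (14, 74),
    "RCB-II": (39, 443),
    "RCB-III": (11, 301),
}


def percent(part, whole):
    """Share of `part` in `whole`, rounded to the nearest whole percent."""
    return round(100 * part / whole)


# Proportion of CNS-only recurrences among DR events within each class.
shares = {rcb: percent(cns, dr) for rcb, (cns, dr) in overall.items()}

# Totals across classes should match the counts given in the Results.
total_cns_only = sum(cns for cns, _ in overall.values())
total_dr = sum(dr for _, dr in overall.values())

for rcb, share in shares.items():
    print(f"{rcb}: {share}% of distant recurrences are CNS-only")
print(f"total: {total_cns_only}/{total_dr} CNS-only among all DR events")
```

The falling share from RCB-0 to RCB-III reflects a growing denominator (more non-CNS recurrences at higher RCB) rather than a change in the CNS-only incidence itself, which is the point made in the Results.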
Residual Cancer Burden (RCB) after neoadjuvant chemotherapy (NAC) is validated to predict event-free survival (EFS) in breast cancer but has not been studied for invasive lobular carcinoma (ILC). We studied patient-level data from a pooled cohort across 12 institutions. Associations between RCB index, RCB class, and EFS were assessed in ILC and non-ILC with mixed-effects Cox models and multivariable analyses. Recursive partitioning was used in an exploratory model to stratify prognosis by RCB components. Of 5106 patients, the diagnosis was ILC in 216 and non-ILC in 4890. Increased RCB index was associated with worse EFS in both ILC and non-ILC (p = 0.002 and p < 0.001, respectively) and remained prognostic when stratified by receptor subtype and adjusted for age, grade, T category, and nodal status. Recursive partitioning identified residual invasive cancer cellularity as the most prognostic RCB component in ILC. These results underscore the utility of RCB for evaluating NAC response in patients with ILC.
H&E slides used in the training dataset described in "Multi-omic machine learning predictor of breast cancer therapy response", published in Nature: https://www.nature.com/articles/s41586-021-04278-5. Metadata associated with these images are also included in the file Slide metadata.xlsx.