ABSTRACT Purpose To evaluate the risk and spectrum of phenotypes in individuals with one or two of the CFTR T5 haplotype variants (TG11T5, TG12T5, and TG13T5) in the absence of the R117H variant. Methods Individuals who were tested through Invitae between 2014 and 2019 at the ordering provider's discretion and had CFTR NGS results were included. TG-T repeats were detected using a custom-developed haplotype caller. Frequencies of the TG-T5 variants (biallelic or in combination with another CF-causing variant [CFvar]) were calculated. Clinical information reported by the ordering provider (via requisition form) or the individual (during genetic counseling appointments) was examined. Results Among 548,300 individuals, the minor allele frequency of the T5 allele was 4.2% (TG repeat distribution: TG11=68.1%, TG12=29.5%, TG13=2.4%). When present with a CFvar, each of the TG[11-13]T5 variants was significantly enriched in individuals with a “high suspicion” of CF/CFTR-RD (personal/family history of CF/CFTR-RD) compared with those with a very “low suspicion” of CF or CFTR-RD (hereditary cancer testing, CFTR not requisitioned). Compared with CFvar/CFvar individuals, TG[11-13]T5/CFvar individuals generally had single-organ involvement, milder symptoms, variable expressivity, and reduced penetrance. Discussion Data from this study provide a better understanding of the disease risks associated with inheriting TG[11-13]T5 variants and have important implications for reproductive genetic counseling.
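For readers unfamiliar with how an allele frequency such as the 4.2% reported above is derived from genotype counts, the following is a minimal sketch. The function name and the cohort counts are hypothetical illustrations chosen to land on a similar figure; they are not the study's data or its haplotype caller.

```python
def minor_allele_frequency(n_individuals, n_heterozygous, n_homozygous):
    """Return the frequency of a variant allele in a diploid cohort."""
    if n_individuals <= 0:
        raise ValueError("cohort must contain at least one individual")
    if n_heterozygous + n_homozygous > n_individuals:
        raise ValueError("carrier counts exceed cohort size")
    # Each heterozygote carries one copy of the allele; each homozygote carries two.
    allele_copies = n_heterozygous + 2 * n_homozygous
    # Each diploid individual contributes two alleles to the denominator.
    return allele_copies / (2 * n_individuals)


# Hypothetical cohort (NOT the study data): 1,000 individuals,
# 80 heterozygous and 2 homozygous for the allele of interest.
maf = minor_allele_frequency(1000, 80, 2)
print(f"{maf:.1%}")
```

The denominator counts alleles rather than people, which is why a cohort of 1,000 individuals contributes 2,000 alleles.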
Lynch syndrome, the most common cause of hereditary colorectal cancer, is caused by pathogenic germline variants in mismatch repair (MMR) genes. Variants of uncertain significance (VUS) in MMR genes hinder diagnosis and clinical management. The continued expansion of genetic sequencing in patients with suspected Lynch syndrome, catalyzed by reduced costs and broadened clinical guidelines, has resulted in an explosion in the number of rare variants observed in the Lynch syndrome genes. Despite a commensurate increase in the amount of data potentially useful for interpreting these variants, approximately half of the variants identified in Lynch syndrome testing are classified as VUS.
Abstract Introduction: Current variant classification (VC) frameworks rely on rules-based approaches that use heuristic weighting of various types of evidence, resulting in > 50% of variants being classified as variants of uncertain significance (VUS), and leaving many patients with uncertainty about their disease risk or diagnosis. We propose a probabilistic and scalable Bayesian approach to model the causal relationships between various types of evidence. A fully quantitative system maximizes the utility and integration of various evidence types, empowering clinicians to make more nuanced management decisions. Methods: Probabilistic graphical models (PGMs) are uniquely suited to the needs of clinical VC. Two different component PGMs were developed to model two of the evidence categories that will ultimately be used in a comprehensive VC system: population allele frequency (Population PGM) and reported phenotype observations (Reported Phenotype PGM). Results: The Population PGM treats population allele frequency observations as a binomial process. By conditioning the model on partial observations, the probabilistic relationships between pathogenicity and allele frequencies can be estimated while stochastic variational inference allows uncertainty to be efficiently propagated. The resulting model performs well across a large number of genes at inferring pathogenicity of a variant from its allele frequency, with an average precision of >99% for benign variants. In the Reported Phenotype PGM, phenotypic features characteristic of a disorder are learned from patients expected to be affected based on genotype. Patient-level predictions are in turn combined at the variant level to derive variant pathogenicity likelihoods. To date, models representing at least 102 inherited conditions and 259 genes have demonstrated high predictive performance (>0.8 AUROC) for both patient-level and variant-level predictions. 
Finally, as a proof-of-concept, we demonstrate how each component PGM can be combined into a probabilistic Bayesian VC framework that also includes protein structure and stability, evolutionary conservation, and sequence context. This framework has high concordance with known, well-accepted pathogenic and benign variants classified with rules-based systems, and could make high-confidence predictions for many variants currently classified as VUS. Conclusion: We present a Bayesian approach that can integrate diverse types of evidence to achieve high VC accuracy while quantifying uncertainty. Future expansion of this Bayesian framework to all evidence types relevant to VC may allow for more accurate risk management guidelines and further inform medical and genetic counseling recommendations. Citation Format: Wolfgang Michael Korn, Yuya Kobayashi, Flavia M. Facio, Arun Nampally, Keith Nykamp, Robert Nussbaum, Alexandre Colavin, Britt Johnson, Toby Manders. Continuous, probabilistic variant interpretation with Bayesian graphical models [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 792.
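The idea of treating allele frequency observations as a binomial process can be illustrated with a deliberately simplified two-hypothesis Bayes update. This is a toy sketch, not the Population PGM described in the abstract: it uses no graphical model and no stochastic variational inference, and the function names, the prior, and the two assumed allele frequencies (`af_pathogenic`, `af_benign`) are hypothetical values chosen only for illustration.

```python
import math


def binomial_log_likelihood(alt_count, total_alleles, allele_freq):
    """Log-likelihood of alt_count variant alleles among total_alleles.

    The binomial coefficient is omitted because it is identical under both
    hypotheses and cancels when the posterior is normalized.
    """
    return (alt_count * math.log(allele_freq)
            + (total_alleles - alt_count) * math.log1p(-allele_freq))


def posterior_pathogenic(alt_count, total_alleles, prior_pathogenic=0.5,
                         af_pathogenic=1e-5, af_benign=1e-2):
    """Posterior probability of the 'pathogenic' hypothesis.

    Assumption (hypothetical): pathogenic variants have population allele
    frequency af_pathogenic and benign variants have af_benign.
    """
    log_path = math.log(prior_pathogenic) + binomial_log_likelihood(
        alt_count, total_alleles, af_pathogenic)
    log_benign = math.log1p(-prior_pathogenic) + binomial_log_likelihood(
        alt_count, total_alleles, af_benign)
    # Subtract the larger log-weight before exponentiating to avoid overflow.
    largest = max(log_path, log_benign)
    weight_path = math.exp(log_path - largest)
    weight_benign = math.exp(log_benign - largest)
    return weight_path / (weight_path + weight_benign)


# A variant never seen in 100,000 alleles versus one seen 500 times.
print(posterior_pathogenic(0, 100_000))
print(posterior_pathogenic(500, 100_000))
```

Under these assumed frequencies, a variant that is absent from a large population sample is pushed strongly toward the pathogenic hypothesis, while a common variant is pushed toward benign. A production model would instead learn these frequency distributions per gene and propagate uncertainty, as the abstract describes.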
Abstract The transition from analog to digital technologies in clinical laboratory genomics is ushering in an era of “big data” in ways that will exceed human capacity to rapidly and reproducibly analyze those data using conventional approaches. Accurately evaluating complex molecular data to facilitate timely diagnosis and management of genomic disorders will require supportive artificial intelligence methods. These are already being introduced into clinical laboratory genomics to identify variants in DNA sequencing data, predict the effects of DNA variants on protein structure and function to inform clinical interpretation of pathogenicity, link phenotype ontologies to genetic variants identified through exome or genome sequencing to help clinicians reach diagnostic answers faster, correlate genomic data with tumor staging and treatment approaches, utilize natural language processing to identify critical published medical literature during analysis of genomic data, and use interactive chatbots to identify individuals who qualify for genetic testing or to provide pre‐test and post‐test education. With careful and ethical development and validation of artificial intelligence for clinical laboratory genomics, these advances are expected to significantly enhance the abilities of geneticists to translate complex data into clearly synthesized information for clinicians to use in managing the care of their patients at scale.
ABSTRACT Nearly 14% of disease-causing germline variants result from disruption of mRNA splicing. Most (67%) DNA variants predicted in silico to disrupt splicing end up classified as variants of uncertain significance (VUS). We developed and validated an analytic workflow, Splice Effect Event Resolver (SPEER), that uses mRNA sequencing to reveal significant deviations in splicing, pinpoints the DNA variants potentially responsible, and measures the deleterious effect of the altered splicing on mRNA transcripts, providing evidence to assess the pathogenicity of the variant. SPEER was used to analyze leukocyte RNA transcribed from 63 hereditary cancer syndrome genes in 20,317 individuals undergoing clinical genetic testing. Among 3,563 (17.5%) individuals with at least one DNA variant predicted to affect splicing, 971 (4.8%) had altered splicing with a deleterious effect on the transcript and 31 had altered splicing due to a DNA variant located outside our laboratory’s reportable range. Integrating SPEER results into variant interpretation allowed reclassification of VUS to pathogenic/likely pathogenic (P/LP) in 0.4% and to benign/likely benign (B/LB) in 5.9% of the 20,317 patients. SPEER evidence had a significantly higher impact on allowing P/LP and B/LB interpretations in non-White individuals than in non-Hispanic White individuals, illustrating that evidence derived from RNA splicing analysis may reduce ethnic/ancestral disparities in genetic testing.
Background Germline variants in fumarate hydratase (FH) are associated with autosomal dominant (AD) hereditary leiomyomatosis and renal cell cancer (HLRCC) and autosomal recessive (AR) fumarase deficiency (FMRD). The prevalence and cancer penetrance across different FH variants remain unclear. Methods A database containing 120,061 records from individuals undergoing cancer germline testing was obtained. FH variants were classified into 3 categories: AD HLRCC variants, AR FMRD variants, and variants of unknown significance (VUSs). Individuals with variants from these categories were compared with those with negative genetic testing. Results FH variants were detected in 1.3% of individuals (AD HLRCC, 0.3%; AR FMRD, 0.4%; VUS, 0.6%). The rate of AD HLRCC variants discovered among reportedly asymptomatic individuals without a clear indication for HLRCC testing was 1 in 2668 (0.04%). In comparison with those with negative genetic testing, the renal cell carcinoma (RCC) prevalence was elevated with AD HLRCC variants (17.0% vs 4.5%; P < .01) and VUSs (6.4% vs 4.5%; P = .02) but not with AR FMRD variants. Conclusions The prevalence of HLRCC discovered incidentally on germline testing is similar to recent population carrier estimates, suggesting that HLRCC is a relatively common cancer syndrome. Compared with those with negative genetic testing, those with VUSs had an elevated prevalence of RCC, whereas those with AR FMRD variants did not.
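The arithmetic behind the figures quoted above can be checked with a short sketch. The helper names are hypothetical, and the prevalence ratios it prints are derived here purely for illustration from the reported percentages; they are not statistics reported in the abstract.

```python
def one_in_n_to_percent(n):
    """Convert a '1 in n' rate to a percentage."""
    if n <= 0:
        raise ValueError("n must be positive")
    return 100.0 / n


def prevalence_ratio(prevalence_group, prevalence_reference):
    """Ratio of two prevalences expressed in the same units."""
    if prevalence_reference <= 0:
        raise ValueError("reference prevalence must be positive")
    return prevalence_group / prevalence_reference


# Incidental AD HLRCC rate: 1 in 2668, quoted as 0.04% after rounding.
print(round(one_in_n_to_percent(2668), 2))

# RCC prevalence relative to individuals with negative testing (4.5%).
# These ratios are illustrative derivations, not values from the abstract.
print(round(prevalence_ratio(17.0, 4.5), 2))  # AD HLRCC variants
print(round(prevalence_ratio(6.4, 4.5), 2))   # VUSs
```

A ratio of unadjusted prevalences like this ignores ascertainment and confounding, so it should be read only as a rough sense of scale.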