Whole-genome sequencing (WGS) is the gold standard for fully characterizing genetic variation but is still prohibitively expensive for large samples. To reduce costs, many studies sequence only a subset of individuals or genomic regions, and genotype imputation is used to infer genotypes for the remaining individuals or regions without sequencing data. However, not all variants can be well imputed, and the current state-of-the-art imputation quality metric, denoted as standard Rsq, is poorly calibrated for lower-frequency variants. Here, we propose MagicalRsq, a machine-learning-based method that integrates variant-level imputation and population genetics statistics to provide a better calibrated imputation quality metric. Leveraging WGS data from the Cystic Fibrosis Genome Project (CFGP) and whole-exome sequence data from the UK Biobank (UKB), we performed comprehensive experiments to evaluate the performance of MagicalRsq compared to standard Rsq for partially sequenced studies. We found that MagicalRsq aligns better with true R² than standard Rsq in almost every situation evaluated, for both European and African ancestry samples. For example, when applying models trained from 1,992 CFGP sequenced samples to an independent 3,103 samples with no sequencing but TOPMed imputation from array genotypes, MagicalRsq, compared to standard Rsq, achieved net gains of 1.4 million rare, 117k low-frequency, and 18k common variants, where net gain is the increase in the number of correctly distinguished variants with MagicalRsq over standard Rsq. MagicalRsq can serve as an improved post-imputation quality metric and will benefit downstream analysis by better distinguishing well-imputed variants from those poorly imputed. MagicalRsq is freely available on GitHub.
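To make the evaluation quantities concrete, the following is a minimal sketch, not the MagicalRsq implementation. It assumes that true R² is the squared Pearson correlation between imputed dosages and sequenced genotypes, and that a variant counts as "correctly distinguished" when a quality metric and true R² fall on the same side of a cutoff. The function names and the 0.8 cutoff are illustrative assumptions, not values taken from the abstract.

```python
import numpy as np


def true_r2(dosage, genotype):
    """Squared Pearson correlation between imputed dosages and sequenced genotypes."""
    dosage = np.asarray(dosage, dtype=float)
    genotype = np.asarray(genotype, dtype=float)
    # Assumption: a variant with no variation in either vector is scored 0.
    if dosage.std() == 0 or genotype.std() == 0:
        return 0.0
    return float(np.corrcoef(dosage, genotype)[0, 1] ** 2)


def correctly_distinguished(metric, truth, threshold=0.8):
    """Count variants where the metric and true R2 agree on pass/fail at the cutoff."""
    metric = np.asarray(metric, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return int(np.sum((metric >= threshold) == (truth >= threshold)))


def net_gain(new_metric, old_metric, truth, threshold=0.8):
    """Extra variants correctly distinguished by new_metric relative to old_metric."""
    return (correctly_distinguished(new_metric, truth, threshold)
            - correctly_distinguished(old_metric, truth, threshold))


# Toy values for four variants (made up purely for illustration).
truth = [0.90, 0.50, 0.85, 0.30]
old_metric = [0.95, 0.90, 0.60, 0.20]   # agrees with truth on 2 of 4 variants
new_metric = [0.88, 0.40, 0.82, 0.35]   # agrees with truth on 4 of 4 variants
gain = net_gain(new_metric, old_metric, truth)
```

In this toy example the net gain is 2, because the second metric classifies two more variants on the correct side of the cutoff than the first.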
Context: Insulin-requiring diabetes affects 7–15% of teens and young adults, and more than 25% of older adults with cystic fibrosis (CF). Pancreatic exocrine disease caused by CF transmembrane conductance regulator (CFTR) dysfunction underlies the high rate of diabetes in CF patients; however, only a subset develops this complication, indicating that other factors are necessary. Objective: Our objective was to estimate the relative contribution of genetic and nongenetic modifiers to the development of diabetes in CF. Design/Patients: This was a twin and sibling study involving 1366 individuals at 109 centers in the CF Twin and Sibling Study, from which were derived 68 monozygous twin pairs, 23 dizygous twin pairs, and 588 sibling pairs, all with CF. Main Outcome Measure: Chronic, insulin-requiring diabetes in the setting of CF, as established using longitudinal clinical and biochemical data, was studied. Results: About 9% of this predominantly pediatric population (mean age = 15.8 yr) had diabetes. Key independent risk factors identified by regression modeling included having a twin or sibling with CF and diabetes, increasing age, pancreatic exocrine insufficiency or two mutations causing severe CFTR dysfunction, decreased lung function or decreased body mass index, and longer duration of glucocorticoid treatment. The concordance rate for diabetes was substantially higher in monozygous twins (0.73) than in dizygous twins and siblings with CF (0.18; P = 0.002). Heritability was estimated as near one (95% confidence interval 0.42–1.0). Conclusions: Diabetes is a frequent complication of CF that is associated with worse outcomes. Although a nongenetic factor (steroid treatment) contributes to risk, genetic modifiers (i.e. genes other than CFTR) are the primary cause of diabetes in CF.
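As an illustration of how a twin concordance rate can be computed, here is a small sketch under stated assumptions. The abstract does not say whether the reported concordance is pairwise or probandwise, so both common definitions are shown; the function names and the pair counts are hypothetical and are not data from the study.

```python
def pairwise_concordance(concordant, discordant):
    """Fraction of affected pairs in which both members are affected."""
    total = concordant + discordant
    if total == 0:
        raise ValueError("need at least one pair with an affected member")
    return concordant / total


def probandwise_concordance(concordant, discordant):
    """Fraction of affected individuals whose co-twin or sibling is also affected."""
    denom = 2 * concordant + discordant
    if denom == 0:
        raise ValueError("need at least one pair with an affected member")
    return 2 * concordant / denom


# Hypothetical counts: 3 pairs with both members affected, 1 pair with one affected.
pw = pairwise_concordance(3, 1)      # 3 / 4
pb = probandwise_concordance(3, 1)   # 6 / 7
```

A markedly higher concordance in monozygous pairs than in dizygous pairs and siblings, as reported above, is the pattern that points toward a genetic contribution.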
Cystic fibrosis (CF) is characterized by recurrent respiratory infections and progressive lung disease. Whereas exercise may contribute to preserving lung function, its benefit is difficult to ascertain given the selection bias of healthier patients being more predisposed to exercise. Our objective was to examine the association of self-reported exercise with longitudinal lung function and body mass index (BMI) measures in CF. A total of 1038 subjects with CF were recruited through the U.S. CF Twin-Sibling Study. Questionnaires were used to determine exercise habits. Questionnaires, chart review, and U.S. CF Foundation Patient Registry data were used to track outcomes. Within the study sample, 75% of subjects self-reported regular exercise. Exercise was associated with an older age of diagnosis (p = 0.002), older age at the time of ascertainment (p < 0.001), and higher baseline FEV1 (p = 0.001), but not CFTR genotype (p = 0.64) or exocrine pancreatic function (p = 0.19). In adjusted mixed models, exercise was associated with a reduced decline in both FEV1 (p < 0.001) and BMI Z-score (p = 0.001) for adults, but not for children aged 10–17 years. In our retrospective study, self-reported exercise was associated with improved longitudinal nutritional and pulmonary outcomes in cystic fibrosis for adults. Although prospective studies are needed to confirm these associations, programs to promote regular exercise among individuals with cystic fibrosis would be beneficial.
OBJECTIVE There are variable reports of risk of concordance for progression to islet autoantibodies and type 1 diabetes in identical twins after one twin is diagnosed. We examined development of positive autoantibodies and type 1 diabetes and the effects of genetic factors and common environment on autoantibody positivity in identical twins, nonidentical twins, and full siblings. RESEARCH DESIGN AND METHODS Subjects from the TrialNet Pathway to Prevention Study (N = 48,026) were screened from 2004 to 2015 for islet autoantibodies (GAD antibody [GADA], insulinoma-associated antigen 2 [IA-2A], and autoantibodies against insulin [IAA]). Of these subjects, 17,226 (157 identical twins, 283 nonidentical twins, and 16,786 full siblings) were followed for autoantibody positivity or type 1 diabetes for a median of 2.1 years. RESULTS At screening, identical twins were more likely to have positive GADA, IA-2A, and IAA than nonidentical twins or full siblings (all P < 0.0001). Younger age, male sex, and genetic factors were significant factors for expression of IA-2A, IAA, one or more positive autoantibodies, and two or more positive autoantibodies (all P ≤ 0.03). Initially autoantibody-positive identical twins had a 69% risk of diabetes by 3 years compared with 1.5% for initially autoantibody-negative identical twins. In nonidentical twins, type 1 diabetes risk by 3 years was 72% for initially multiple autoantibody–positive, 13% for single autoantibody–positive, and 0% for initially autoantibody-negative nonidentical twins. Full siblings had a 3-year type 1 diabetes risk of 47% for multiple autoantibody–positive, 12% for single autoantibody–positive, and 0.5% for initially autoantibody-negative subjects. CONCLUSIONS Risk of type 1 diabetes at 3 years is high for initially multiple and single autoantibody–positive identical twins and multiple autoantibody–positive nonidentical twins. Genetic predisposition, age, and male sex are significant risk factors for development of positive autoantibodies in twins.
Context: Individuals with cystic fibrosis (CF) develop a distinct form of diabetes characterized by β-cell dysfunction and islet amyloid accumulation similar to type 2 diabetes (T2D), but generally have normal insulin sensitivity. CF-related diabetes (CFRD) risk is determined by both CFTR, the gene responsible for CF, and other genetic variants. Objective: To identify genetic modifiers of CFRD and determine the genetic overlap with other types of diabetes. Design and Patients: A genome-wide association study was conducted for CFRD onset on 5740 individuals with CF. Weighted polygenic risk scores (PRSs) for type 1 diabetes (T1D), T2D, and diabetes endophenotypes were tested for association with CFRD. Results: Genome-wide significance was obtained for variants at a novel locus (PTMA) and 2 known CFRD genetic modifiers (TCF7L2 and SLC26A9). PTMA and SLC26A9 variants were CF-specific; TCF7L2 variants also associated with T2D. CFRD was strongly associated with PRSs for T2D, insulin secretion, postchallenge glucose concentration, and fasting plasma glucose, and less strongly with T1D PRSs. CFRD was inconsistently associated with PRSs for insulin sensitivity and was not associated with a PRS for islet autoimmunity. A CFRD PRS comprising variants selected from these PRSs (with a false discovery rate < 0.1) and the genome-wide significant variants was associated with CFRD in a replication population. Conclusions: CFRD and T2D have more etiologic and mechanistic overlap than previously known, aligning along pathways involving β-cell function rather than insulin sensitivity. Two CFRD risk loci are unrelated to T2D and may affect multiple aspects of CF. An 18-variant PRS stratifies risk of CFRD in an independent population.
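For readers unfamiliar with weighted polygenic risk scores, the sketch below shows the usual construction: each individual's score is the sum, over variants, of the risk-allele dosage multiplied by a per-variant weight (typically an effect size from a prior association study). This is a generic illustration, not the pipeline used in the study; the function name, dosages, and weights are hypothetical.

```python
import numpy as np


def weighted_prs(dosages, weights):
    """Weighted PRS per individual: sum over variants of dosage * weight.

    dosages: array of shape (n_individuals, n_variants), values in [0, 2]
    weights: array of shape (n_variants,), e.g. log odds ratios
    """
    dosages = np.asarray(dosages, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if dosages.ndim != 2 or dosages.shape[1] != weights.shape[0]:
        raise ValueError("dosages must be (n_individuals, n_variants) matching weights")
    return dosages @ weights


# Two hypothetical individuals genotyped at three hypothetical variants.
dosages = [[0, 1, 2],
           [2, 0, 1]]
weights = [0.1, -0.2, 0.3]
scores = weighted_prs(dosages, weights)  # approximately [0.4, 0.5]
```

In practice the resulting scores would then be tested for association with the outcome, for example in a regression model, which is outside the scope of this sketch.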