Background Alzheimer's disease is a common debilitating dementia with known heritability, for which 20 late-onset susceptibility loci have been identified, but more remain to be discovered. This study sought to identify new susceptibility genes, using an alternative gene-wide analytical approach which tests for patterns of association within genes, in the powerful genome-wide association dataset of the International Genomics of Alzheimer's Project Consortium, comprising over 7 million genotypes from 25,580 Alzheimer's cases and 48,466 controls. Principal Findings In addition to earlier reported genes, we detected genome-wide significant loci on chromosomes 8 (TP53INP1, p = 1.4×10−6) and 14 (IGHV1-67, p = 7.9×10−8) which indexed novel susceptibility loci. Significance The additional genes identified in this study have an array of functions previously implicated in Alzheimer's disease, including aspects of energy metabolism, protein degradation and the immune system, and add further weight to these pathways as potential therapeutic targets in Alzheimer's disease.
The chromosome 17q21.31 region, containing a 900 Kb inversion that defines H1 and H2 haplotypes, represents the strongest genetic risk locus in progressive supranuclear palsy (PSP). In addition to H1 and H2, various structural forms of 17q21.31, characterized by the copy number of α, β, and γ duplications, have been identified. However, the specific effect of each structural form on the risk of PSP has never been evaluated in a large cohort study.
The Alzheimer's Disease Sequencing Project (ADSP) undertook whole exome sequencing in 5,740 late-onset Alzheimer disease (AD) cases and 5,096 cognitively normal controls primarily of European ancestry (EA), among whom 218 cases and 177 controls were Caribbean Hispanic (CH). An age-, sex- and APOE-based risk score and family history were used to select cases most likely to harbor novel AD risk variants and controls least likely to develop AD by age 85 years. We tested ~1.5 million single nucleotide variants (SNVs) and 50,000 insertion-deletion polymorphisms (indels) for association with AD, using multiple models considering individual variants as well as gene-based tests aggregating rare, predicted functional, and loss-of-function variants. Sixteen single variants and 19 genes that met criteria for significant or suggestive associations after multiple-testing correction were evaluated for replication in four independent samples: three with whole exome sequencing (2,778 cases, 7,262 controls) and one with genome-wide genotyping imputed to the Haplotype Reference Consortium panel (9,343 cases, 11,527 controls). The top findings in the discovery sample were also followed up in the ADSP whole-genome sequenced family-based dataset (197 members of 42 EA families and 501 members of 157 CH families). We identified novel and predicted functional genetic variants in genes previously associated with AD. We also detected associations in three novel genes: IGHG3 (p = 9.8 × 10
The X chromosome is often omitted in disease association studies despite containing thousands of genes which may provide insight into well-known sex differences in the risk of Alzheimer's Disease.
IGAP meta-analyses of genome-wide association studies (GWAS) have previously identified 30 LOAD susceptibility loci in addition to APOE; however, the majority of these were common (minor allele frequency (MAF)>0.02). The Haplotype Reference Consortium (HRC) released a dense reference panel (64,976 haplotypes/39,235,157 SNPs) allowing imputation of rare variants (MAF>0.0008) for discovery association testing. IGAP consortia imputed 42 GWAS datasets to HRC to identify novel rare variant, gene, and pathway associations. We imputed 24,466 cases and 39,951 controls to the HRC r1.1 reference panel using Minimac3 on the University of Michigan Imputation Server. Using imputed genotype probabilities, logistic regression on individual variants with MAF>0.01 was performed in SNPTEST (generalized linear mixed model in R for family-based variants), and fixed-effects meta-analysis was performed using METAL. Variants with MAF≤0.01 were meta-analyzed using score-based tests with the program SeqMeta/R. Both analyses adjusted for age, sex, and population substructure. Gene-based and pathway associations were examined using VEGAS2. Discovery analyses of ∼39.2M genotyped or imputed SNVs confirmed single variant associations in 26 of 30 known IGAP LOAD loci at suggestive levels of significance (P<10−5), with 12 of the known loci attaining genome-wide statistical significance (P<5×10−8).
Newly observed associations included variation at APP (rs112547745, OR[95% CI]=1.10 [1.06-1.14], P=8.70×10−7), an early-onset AD gene; variants in genes associated with cardiovascular traits [IGSF5 (rs16998166, OR[95% CI]=0.74 [0.65-0.84], P=4.23×10−6), ACE (rs116112765, OR[95% CI]=1.28 [1.15-1.42], P=2.95×10−6), LIPG (rs2156500, OR[95% CI]=0.94 [0.91-0.96], P=2.59×10−6), and SCARB1 (rs12229555, OR[95% CI]=0.93 [0.90-0.96], P=5.28×10−6), which encodes an HDL cholesterol receptor]; and variants in genes involved in neurodegenerative processes, such as TRIT1 (rs61781270, OR[95% CI]=1.08 [1.04-1.11], P=3.54×10−6), encoding an amyloid fiber-forming protein. Gene-based analyses identified several genes involved in innate immunity, including BTRC and DCUN1D5 (both P<10−6). Pathway analyses implicated neuronal development [GO:0030182 (Pempirical=10−6) and GO:0048666 (Pempirical=1.20×10−5)]. Replication of the observed signals is ongoing.
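The fixed-effects meta-analysis step described above can be illustrated with the inverse-variance-weighted estimator, a scheme commonly used for combining per-dataset effect estimates. The following is a minimal sketch only: the abstract does not state which weighting scheme was used, and the function name and input values are hypothetical.

```python
import math

def fixed_effects_meta(betas, ses):
    """Combine per-dataset log-odds estimates by inverse-variance weighting.

    betas: per-dataset effect estimates (log odds ratios)
    ses:   their standard errors
    Returns (pooled beta, pooled SE, z score, two-sided p-value).
    """
    weights = [1.0 / se ** 2 for se in ses]          # weight = 1 / variance
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))            # two-sided normal p-value
    return beta, se, z, p

# Hypothetical inputs: three datasets reporting the same variant.
pooled_beta, pooled_se, z, p = fixed_effects_meta(
    betas=[0.10, 0.08, 0.12], ses=[0.05, 0.04, 0.06]
)
pooled_or = math.exp(pooled_beta)  # odds ratio on the original scale
```

Datasets with smaller standard errors dominate the pooled estimate, which is why large contributing cohorts drive most of the signal in a fixed-effects analysis.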
LOAD risk loci may also contribute to variation in age of onset (AAO) of LOAD, as do the allelic variants in APOE; however, roles in AAO for the other newly identified risk loci (CLU, BIN1, and others) have not been explored. We examined variants at ten confirmed LOAD risk loci (APOE, CLU, PICALM, CR1, BIN1, CD2AP, EPHA1, the MS4A region, ABCA7, and CD33) to determine if they contribute to variation in AAO among 9,160 LOAD cases in the Alzheimer's Disease Genetics Consortium (ADGC). We tested association with AAO for each locus using linear modeling with covariate adjustment for population substructure and performed a random-effects meta-analysis across datasets. We also examined genetic burden using genotype scores weighted by risk effect sizes to examine the aggregate contribution of these loci to variation in AAO. Analyses confirmed association of APOE regional variation with AAO (rs6857, P=3.30×10−96), with statistically significant associations with AAO (P<0.005) demonstrated at several other LOAD risk loci, including rs6701713 in CR1 (P=0.00717), rs7561528 in BIN1 (P=0.00478), and rs561655 in PICALM (P=0.00223). Associations remained largely unchanged after additional adjustment for dosage of APOE ε4 alleles. Burden analyses showed that APOE contributes to 3.1% of variation in AAO (R²=0.078) whereas the other nine genes contribute to 1.1% of variation (R²=0.058) over baseline (R²=0.047). Secondary analyses of genome-wide association with AAO among non-risk loci identified several regions with multiple SNPs demonstrating suggestive associations (P<10−6), including one nearing genome-wide statistical significance: MYO16 (47 SNPs; most significant: rs9521011, P=7.62×10−8), CDH20 (4 SNPs; most significant: rs12956834, P=6.17×10−6), and SGCZ (10 SNPs; most significant: rs7016159, P=7.70×10−6). We confirmed the association of APOE variants with AAO among LOAD cases, and observed associations with AAO in CR1, BIN1, and PICALM.
In contrast to earlier hypothetical modeling, we show that the combined effects of other loci do not exceed the effect of APOE on AAO, and if additional genetic contributions to AAO exist, they are likely very small individually or are hidden in gene-gene interactions.
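The burden figures above follow from subtracting the baseline R² from each model's R². The sketch below reproduces that arithmetic from the reported values and shows one plausible form of a genotype score weighted by risk effect sizes. The function names are hypothetical, and the exact score construction used in the study is not given in the abstract.

```python
def weighted_genotype_score(dosages, effect_sizes):
    """Sum of risk-allele dosages (0..2), each weighted by its effect size.

    Assumed form of the score; the abstract only says genotype scores
    were weighted by risk effect sizes.
    """
    return sum(d * b for d, b in zip(dosages, effect_sizes))

def incremental_r2(r2_model, r2_baseline):
    """Share of AAO variance explained beyond the baseline covariates."""
    return r2_model - r2_baseline

# R-squared values as reported in the abstract.
R2_BASELINE = 0.047
R2_APOE = 0.078
R2_OTHER_NINE = 0.058

apoe_share = incremental_r2(R2_APOE, R2_BASELINE)          # about 0.031
other_share = incremental_r2(R2_OTHER_NINE, R2_BASELINE)   # about 0.011

print(f"APOE: {apoe_share:.1%}, other nine loci: {other_share:.1%}")
```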
ABSTRACT Over 90% of variants are rare, and 50% of them are singletons in the Alzheimer’s Disease Sequencing Project Whole Exome Sequencing (ADSP WES) data. However, both single variant tests and unit-based tests have limited statistical power to detect associations between rare variants and phenotypes. To best utilize rare variants and investigate their biological effect, we examine their association with phenotypes in the context of protein structure. We developed a protein structure-based approach, POKEMON (Protein Optimized Kernel Evaluation of Missense Nucleotides), which evaluates rare missense variants based on their spatial distribution on the protein rather than allele frequency. The underlying hypothesis is that the three-dimensional spatial distribution of variants within a protein structure provides functional context and improves the power of association tests. POKEMON identified four candidate genes from the ADSP WES data, namely two known Alzheimer’s disease (AD) genes (TREM2 and SORL) and two novel genes (DUSP18 and CSF1R). For known AD genes, the signal from the spatial cluster is stable even if we exclude known AD risk variants, indicating the presence of additional low frequency risk variants within these genes. DUSP18 has a cluster of variants primarily shared by case subjects around the ligand-binding domain, and this cluster is further validated in a replication dataset with a larger sample size. POKEMON is an open-source tool available at https://github.com/bushlab-genomics/POKEMON.
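To make the idea of weighting variants by spatial proximity concrete, the sketch below builds a subject-by-subject similarity kernel in which two carriers are similar when their variants lie close together in three dimensions. This illustrates the general concept only and is not POKEMON's actual kernel: the Gaussian distance weighting, the `sigma` parameter, and all function names are assumptions made for this example.

```python
import math

def spatial_weights(coords, sigma=10.0):
    """Variant-by-variant weights that decay with 3D distance.

    coords: one (x, y, z) tuple per variant residue, in angstroms.
    sigma:  assumed length scale of the Gaussian decay.
    """
    n = len(coords)
    return [
        [math.exp(-math.dist(coords[i], coords[j]) ** 2 / (2 * sigma ** 2))
         for j in range(n)]
        for i in range(n)
    ]

def subject_kernel(genotypes, weights):
    """K[a][b] = sum over i, j of G[a][i] * W[i][j] * G[b][j]."""
    n_subj, n_var = len(genotypes), len(weights)
    kernel = [[0.0] * n_subj for _ in range(n_subj)]
    for a in range(n_subj):
        for b in range(n_subj):
            kernel[a][b] = sum(
                genotypes[a][i] * weights[i][j] * genotypes[b][j]
                for i in range(n_var) for j in range(n_var)
            )
    return kernel

# Hypothetical data: variants 0 and 1 are neighbors, variant 2 is far away.
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (80.0, 0.0, 0.0)]
genotypes = [[1, 0, 0],   # subject A carries variant 0
             [0, 1, 0],   # subject B carries variant 1
             [0, 0, 1]]   # subject C carries variant 2
K = subject_kernel(genotypes, spatial_weights(coords))
```

Here subjects A and B score as similar even though they carry different singletons, which is the property that lets a structure-based test pool information across rare variants that a frequency-based weighting would treat separately.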
Abstract Background The Genome Center for Alzheimer’s Disease (GCAD) coordinates the integration and meta‐analysis of all available Alzheimer’s disease (AD) relevant whole genome sequencing (WGS) data to facilitate the goal of identifying AD risk or protective genetic variants and eventual therapeutic targets. The WGS datasets are generated via the collaboration of scientists from the Alzheimer’s Disease Sequencing Project (ADSP) and GCAD. To minimize data heterogeneity introduced by different sequencing protocols and machines, GCAD processes all samples using identical pipelines. Methods The raw sequencing data are first mapped to GRCh38/hg38 and variants (SNVs and indels) are called using GATK. Additionally, compact VCF and GDS formatted files are generated to support researchers who want to use smaller pVCFs. SNVs and indels are annotated using the ADSP annotation pipeline. Lastly, structural variants (SVs) are called using Smoove and Manta and jointly genotyped using GraphTyper2. Results The dataset (ADSP Release 5, R5, 2024) includes ∼60,000 genomes from >50 diverse cohorts with 4 major ancestries: 47% Non‐Hispanic White, 29% Hispanic or Latino, 16% Black or African American and 8% Asian. Data are deeply sequenced (average genome coverage: >30x). CRAMs, gVCFs from GATK, and SV VCFs of a subset of the R5 samples (n = 36,361) were deposited into the NIAGADS Data Sharing Service (DSS) (https://dss.niagads.org/) for public distribution in 2022, and similarly, the new samples in R5 will be released after the joint call is complete. In addition, joint‐genotype VCFs on SNVs, indels, and SVs will be available. These will undergo the full quality control and annotation process. Conclusion The ADSP and GCAD generate high quality genotype and SV calls. Currently the project is processing ∼60,000 WGS samples sequenced primarily through the ADSP Follow‐Up Study, which will contain a more ancestrally diverse set of populations.
We anticipate this 2024 release will continue to benefit the research community studying AD genetics.
Abstract For methodological reasons, the X-chromosome has not been featured in the major genome-wide association studies on Alzheimer’s Disease (AD). To address this and better characterize the genetic landscape of AD, we performed an in-depth X-Chromosome-Wide Association Study (XWAS) in 115,841 AD cases or AD proxy cases, including 52,214 clinically-diagnosed AD cases, and 613,671 controls. We considered three approaches to account for the different X-chromosome inactivation (XCI) states in females, i.e. random XCI, skewed XCI, and escape XCI. We did not detect any genome-wide significant signals (P ≤ 5 × 10−8) but identified seven X-chromosome-wide significant loci (P ≤ 1.6 × 10−6). The index variants were common for the Xp22.32, FRMPD4, DMD and Xq25 loci, and rare for the WNK3, PJA1, and DACH2 loci. Overall, this well-powered XWAS found no genetic risk factors for AD on the non-pseudoautosomal region of the X-chromosome, but it identified suggestive signals warranting further investigations.
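The XCI approaches differ mainly in how male hemizygous genotypes are scaled relative to female genotypes. The sketch below shows one widely used coding convention for the random and escape models. It is an assumption for illustration, since the abstract does not specify the coding used, and the skewed XCI model is omitted because it needs additional modeling of heterozygous females. The function name is hypothetical.

```python
def code_x_genotype(alt_alleles, sex, model):
    """Return the genotype value entered into an association model.

    alt_alleles: count of alternate alleles (0, 1, 2 for females; 0, 1 for males)
    sex:         "F" or "M"
    model:       "random" (random XCI) or "escape" (escape from XCI)
    """
    if sex == "F":
        if alt_alleles not in (0, 1, 2):
            raise ValueError("female genotypes must be 0, 1 or 2")
        return float(alt_alleles)
    if sex != "M" or alt_alleles not in (0, 1):
        raise ValueError("male genotypes must be 0 or 1")
    if model == "random":
        # One female copy is silenced, so a male hemizygote is treated
        # as equivalent to a female homozygote.
        return 2.0 * alt_alleles
    if model == "escape":
        # Both female copies are expressed, so the single male copy
        # counts as one allele.
        return float(alt_alleles)
    raise ValueError(f"unsupported model: {model}")

male_random = code_x_genotype(1, "M", "random")
male_escape = code_x_genotype(1, "M", "escape")
```

Running the same variant under both codings and comparing the fits is one way to see which inactivation state a signal is most consistent with.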