ABSTRACT Common SNPs are predicted to collectively explain 40-50% of phenotypic variation in human height, but identifying the specific variants and associated regions requires huge sample sizes. Here we show, using GWAS data from 5.4 million individuals of diverse ancestries, that 12,111 independent SNPs that are significantly associated with height account for nearly all of the common SNP-based heritability. These SNPs are clustered within 7,209 non-overlapping genomic segments with a median size of ~90 kb, covering ~21% of the genome. The density of independent associations varies across the genome and the regions of elevated density are enriched for biologically relevant genes. In out-of-sample estimation and prediction, the 12,111 SNPs account for 40% of phenotypic variance in European ancestry populations but only ~10%-20% in other ancestries. Effect sizes, associated regions, and gene prioritization are similar across ancestries, indicating that reduced prediction accuracy is likely explained by linkage disequilibrium and allele frequency differences within associated regions. Finally, we show that the relevant biological pathways are detectable with smaller sample sizes than needed to implicate causal genes and variants. Overall, this study, the largest GWAS to date, provides an unprecedented saturated map of specific genomic regions containing the vast majority of common height-associated variants.
Abstract Carotid artery intima media thickness (cIMT) and carotid plaque are measures of subclinical atherosclerosis associated with ischemic stroke and coronary heart disease (CHD). Here, we undertake meta-analyses of genome-wide association studies (GWAS) in 71,128 individuals for cIMT, and 48,434 individuals for carotid plaque traits. We identify eight novel susceptibility loci for cIMT, one independent association at the previously identified PINX1 locus, and one novel locus for carotid plaque. Colocalization analysis with nearby vascular expression quantitative trait loci (cis-eQTLs) derived from arterial wall and metabolic tissues obtained from patients with CHD identifies candidate genes at two potentially additional loci, ADAMTS9 and LOXL4. LD score regression reveals significant genetic correlations between cIMT and plaque traits, and both cIMT and plaque with CHD, any stroke subtype and ischemic stroke. Our study provides insights into genes and tissue-specific regulatory mechanisms linking atherosclerosis both to its functional genomic origins and its clinical consequences in humans.
Genetic factors explain a majority of risk variance for age-related macular degeneration (AMD). While genome-wide association studies (GWAS) for late AMD implicate genes in complement, inflammatory and lipid pathways, the genetic architecture of early AMD has been relatively understudied. We conducted a GWAS meta-analysis of early AMD, including 4,089 individuals with prevalent signs of early AMD (soft drusen and/or retinal pigment epithelial changes) and 20,453 individuals without these signs. For various published late AMD risk loci, we also compared effect sizes between early and late AMD using an additional 484 individuals with prevalent late AMD. GWAS meta-analysis confirmed previously reported association of variants at the complement factor H (CFH) (peak P = 1.5×10⁻³¹) and age-related maculopathy susceptibility 2 (ARMS2) (P = 4.3×10⁻²⁴) loci, and suggested an association of Apolipoprotein E (ApoE) polymorphisms (rs2075650; P = 1.1×10⁻⁶) with early AMD. Other possible loci that did not reach GWAS significance included variants in the zinc finger protein gene GLI3 (rs2049622; P = 8.9×10⁻⁶) and upstream of GLI2 (rs6721654; P = 6.5×10⁻⁶), encoding retinal Sonic hedgehog signalling regulators, and in the tyrosinase (TYR) gene (rs621313; P = 3.5×10⁻⁶), involved in melanin biosynthesis. For a range of published late AMD risk loci, estimated effect sizes were significantly lower for early than late AMD. This study confirms the involvement of multiple established AMD risk variants in early AMD, but suggests weaker genetic effects on the risk of early AMD relative to late AMD. Several biological processes were suggested to be potentially specific for early AMD, including pathways regulating RPE cell melanin content and signalling pathways potentially involved in retinal regeneration, generating hypotheses for further investigation.
Objective: The genetic complexity of schizophrenia may be compounded by the diagnostic imprecision inherent in distinguishing schizophrenia from closely related mood and substance use disorders. Further complexity may arise from studying genetically and/or environmentally diverse ethnic groups. Reported here are the ascertainment, demographic features and clinical characteristics of a diagnostically and ethnically homogeneous schizophrenia pedigree sample from Tamil Nadu, India. Also reported is the theoretical power to detect genetic linkage in the subset of affected sibling pairs. Method: Affected sibling pair and trio pedigrees were identified by caste/ethnicity. Affected probands and siblings fulfilled DSM-IV criteria for schizophrenia or schizoaffective disorder. Results: The present sample consisted of 159 affected sibling pairs and 187 parent–offspring trios originating primarily from the Tamil Brahmin caste, but also incorporating pedigrees from genetically similar, geographically proximal caste groups. Consistent with previous studies in Tamil Nadu, a very low prevalence of affective psychoses such as schizoaffective disorder was observed, with most affected individuals having schizophrenia (499/504). Also observed were extremely low rates of nicotine (12.4%), alcohol (1.1%) and illicit drug use (0%). Most affected individuals exhibited negative symptoms (>90%) and a severe, chronic course. All participants lived in the same geographic and climatic region and most affected individuals resided with close family members, increasing uniformity of the sociocultural environment. In affected sibling pairs, power to detect linkage to small-effect risk loci was modest, but this homogeneous sample may be enriched for loci of larger effect. Conclusions: This Indian schizophrenia sample exhibits diagnostic and ethnic homogeneity with high consistency of sociocultural environmental features. These characteristics should assist efforts to identify risk genes for schizophrenia.
Context: Identifying susceptibility genes for schizophrenia may be complicated by phenotypic heterogeneity, with some evidence suggesting that phenotypic heterogeneity reflects genetic heterogeneity.
Objective: To evaluate the heritability and conduct genetic linkage analyses of empirically derived, clinically homogeneous schizophrenia subtypes.
Design: Latent class and linkage analysis.
Setting: Taiwanese field research centers.
Participants: The latent class analysis included 1236 Han Chinese individuals with DSM-IV schizophrenia. These individuals were members of a large affected-sibling-pair sample of schizophrenia (606 ascertained families), original linkage analyses of which detected a maximum logarithm of odds (LOD) of 1.8 (z = 2.88) on chromosome 10q22.3.
Main Outcome Measures: Multipoint exponential LOD scores by latent class assignment and parametric heterogeneity LOD scores.
Results: Latent class analyses identified 4 classes, with 2 demonstrating familial aggregation. The first (LC2) described a group with severe negative symptoms, disorganization, and pronounced functional impairment, resembling “deficit schizophrenia.” The second (LC3) described a group with minimal functional impairment, mild or absent negative symptoms, and low disorganization. Using the negative/deficit subtype, we detected genome-wide significant linkage to 1q23-25 (LOD = 3.78, empiric genome-wide P = .01). This region was not detected using the DSM-IV schizophrenia diagnosis, but has been strongly implicated in schizophrenia pathogenesis by previous linkage and association studies. Variants in the 1q region may specifically increase risk for a negative/deficit schizophrenia subtype. Alternatively, these results may reflect increased familiality/heritability of the negative class, the presence of multiple 1q schizophrenia risk genes, or a pleiotropic 1q risk locus or loci, with stronger genotype-phenotype correlation with negative/deficit symptoms. Using the second familial latent class, we identified nominally significant linkage to the original 10q peak region.
Conclusion: Genetic analyses of heritable, homogeneous phenotypes may improve the power of linkage and association studies of schizophrenia and thus have relevance to the design and analysis of genome-wide association studies.
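The Participants section above reports the original linkage peak both as a LOD score and as a normal deviate (LOD of 1.8, z = 2.88). A minimal sketch of how the two quantities relate, assuming the standard large-sample relation LOD = z² / (2 ln 10) for a one-degree-of-freedom linkage statistic; the function names are illustrative and are not taken from the study.

```python
import math

# Assumption: the linkage statistic behaves like a squared standard normal
# deviate, so that 2 * ln(10) * LOD = z^2.
TWO_LN_10 = 2.0 * math.log(10.0)


def lod_from_z(z: float) -> float:
    """Convert a normal deviate z into the equivalent LOD score."""
    return z * z / TWO_LN_10


def z_from_lod(lod: float) -> float:
    """Convert a non-negative LOD score into the equivalent normal deviate."""
    if lod < 0:
        raise ValueError("LOD must be non-negative for this conversion")
    return math.sqrt(TWO_LN_10 * lod)


# The reported z = 2.88 corresponds to a LOD of about 1.8.
print(round(lod_from_z(2.88), 2))
```

Under the same assumption, `z_from_lod(3.78)` gives the deviate equivalent to the 1q23-25 peak reported in the Results.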
Objective: To quantify genetic overlap between migraine and ischemic stroke (IS) with respect to common genetic variation. Methods: We applied 4 different approaches to large-scale meta-analyses of genome-wide data on migraine (23,285 cases and 95,425 controls) and IS (12,389 cases and 62,004 controls). First, we queried known genome-wide significant loci for both disorders, looking for potential overlap of signals. We then analyzed the overall shared genetic load using polygenic scores and estimated the genetic correlation between disease subtypes using data derived from these models. We further interrogated genomic regions of shared risk using analysis of covariance patterns between the 2 phenotypes using cross-phenotype spatial mapping. Results: We found substantial genetic overlap between migraine and IS using all 4 approaches. Migraine without aura (MO) showed much stronger overlap with IS and its subtypes than migraine with aura (MA). The strongest overlap existed between MO and large artery stroke (LAS; p = 6.4 × 10⁻²⁸ for the LAS polygenic score in MO) and between MO and cardioembolic stroke (CE; p = 2.7 × 10⁻²⁰ for the CE score in MO). Conclusions: Our findings indicate shared genetic susceptibility to migraine and IS, with a particularly strong overlap between MO and both LAS and CE pointing towards shared mechanisms. Our observations on MA are consistent with a limited role of common genetic variants in this subtype.
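The abstract above does not say how its polygenic scores were constructed. As a rough illustration of the general idea only, a polygenic score is commonly computed as a weighted sum of an individual's risk-allele dosages, with per-variant weights taken from a discovery GWAS. The function name and the toy numbers below are hypothetical.

```python
from typing import Sequence


def polygenic_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of risk-allele dosages (0 to 2 per variant).

    `weights` stands in for per-variant effect estimates (for example
    log odds ratios) from a discovery GWAS; the values are assumptions.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Toy example: three variants for one individual (made-up values).
toy_dosages = [0, 1, 2]
toy_weights = [0.10, -0.20, 0.05]
score = polygenic_score(toy_dosages, toy_weights)
print(score)
```

In a cross-phenotype analysis of this kind, scores weighted by one disorder's effect estimates would then be tested for association with the other disorder.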
An increasing number of genome-wide association (GWA) studies are now using the higher resolution 1000 Genomes Project reference panel (1000G) for imputation, with the expectation that 1000G imputation will lead to the discovery of additional associated loci when compared to HapMap imputation. In order to assess the improvement of 1000G over HapMap imputation in identifying associated loci, we compared the results of GWA studies of circulating fibrinogen based on the two reference panels. Using both HapMap and 1000G imputation we performed a meta-analysis of 22 studies comprising the same 91,953 individuals. We identified six additional signals using 1000G imputation, while 29 loci were associated using both HapMap and 1000G imputation. One locus identified using HapMap imputation was not significant using 1000G imputation. The genome-wide significance threshold of 5×10⁻⁸ is based on the number of independent statistical tests using HapMap imputation, and 1000G imputation may lead to further independent tests that should be corrected for. When using a stricter Bonferroni correction for the 1000G GWA study (P-value < 2.5×10⁻⁸), the number of loci significant only using HapMap imputation increased to 4 while the number of loci significant only using 1000G decreased to 5. In conclusion, 1000G imputation enabled the identification of 20% more loci than HapMap imputation, although the advantage of 1000G imputation became less clear when a stricter Bonferroni correction was used. More generally, our results provide insights that are applicable to the implementation of other dense reference panels that are under development.
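The two significance thresholds in the abstract above can be read as Bonferroni corrections of a family-wise error rate of 0.05. A minimal sketch, assuming roughly one million independent tests for the conventional threshold and two million for the stricter one; these test counts are inferred from the thresholds and are not stated in the abstract. The last lines show one possible reading of the 20% figure, using the locus counts given in the abstract.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


ALPHA = 0.05  # assumed family-wise error rate

# Assumed numbers of independent tests (not given in the abstract).
conventional = bonferroni_threshold(ALPHA, 1_000_000)  # about 5e-8
stricter = bonferroni_threshold(ALPHA, 2_000_000)      # about 2.5e-8
print(conventional, stricter)

# One reading of the 20% gain: 6 additional 1000G signals relative to
# the 30 loci found with HapMap imputation (29 shared + 1 HapMap-only).
hapmap_loci = 29 + 1
extra_1000g_signals = 6
relative_gain = extra_1000g_signals / hapmap_loci
print(relative_gain)
```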
Schizophrenia is a severe psychiatric illness characterised by delusions, hallucinations, thought disorder, and social and emotional impairment. The disorder affects >1% of the population and is a major cause of personal disability, often requiring long-term pharmacotherapy and diminishing social and employment prospects. Schizophrenia also impacts upon the lives of caregivers and constitutes a massive economic burden. The principal treatment is pharmacotherapy with antipsychotic medications, which have variable efficacy and adverse side-effects, reflecting inadequate knowledge of the disorder’s aetiological mechanisms. Family, twin and adoption studies suggest that liability to schizophrenia is contributed largely by genetic factors. The mode of genetic transmission is unknown, although segregation analyses suggest the presence of multiple, small-effect genes. Such a model is consistent with the results of >30 genome-wide linkage scans, which provide convergent evidence for various loci, each of which only modestly increases disease risk. No single locus has been consistently observed across samples, suggesting the presence of locus heterogeneity. The number of susceptibility loci, the relative risk conferred by each locus, the extent of heterogeneity and the degree of interaction among loci and with environmental risk factors remain unknown. The reliable detection and replication of small-effect, heterogeneous risk loci may require larger samples than those previously used. A growing trend is to enlarge samples by pooling multiple, independent datasets. However, the power of pooled samples can be reduced by between-sample differences in locus heterogeneity (i.e. the proportion of ‘linked’ pedigrees) at risk loci, which may result from sampling variation or genetic differences between constituent samples. Under these conditions, power of the pooled sample may be best maintained by explicitly modelling within- and between-sample locus heterogeneity. 
Chapters 2 and 3 of this thesis comprise re-analyses of pooled schizophrenia linkage data, incorporating within- and between-sample locus heterogeneity. For each dataset, original linkage analyses utilised linkage statistics which assume locus homogeneity, and detected only suggestive linkage to the most strongly supported region. In both cases, re-analysis incorporating locus heterogeneity provided significant evidence for linkage. These studies demonstrate the impact of locus heterogeneity upon individual and pooled schizophrenia datasets and suggest analytic approaches for reducing its impact. The impact of locus heterogeneity may also be reduced by selectively ascertaining samples with increased genetic homogeneity. Suitable groups include genetic isolates and ethnically homogeneous populations, both of which may demonstrate reduced genetic diversity, greater uniformity of environmental and cultural features and increased power to resolve aetiological determinants. Chapter 4 describes a genome scan for schizophrenia in ethnically homogeneous pedigrees from south India. Population genetic studies demonstrate high genetic homogeneity of caste groups from the study region, and this was confirmed by genetic analyses of the clinical sample. The genome scan detected a region of significant linkage (LOD = 3.91) at 1p31.1. Chapter 5 describes linkage fine-mapping of this peak in an enlarged sample, in which increased linkage evidence to the 1p region was obtained. The identified region contains a number of candidate genes which could plausibly function in schizophrenia pathogenesis. Association analyses in schizophrenia provide support for a number of candidate genes, particularly DTNBP1, COMT, NRG1, RGS4, G72/G30, and DAAO. However, the identified risk alleles/haplotypes vary between populations and no pathogenic variants have yet been identified. 
Discrepant findings may relate to differences in linkage disequilibrium (LD) patterns across study populations and the use of sparse SNP maps which do not adequately reflect local patterns of LD. Chapter 6 describes association analyses of the schizophrenia candidate gene DTNBP1, utilising the largest number of individually genotyped DTNBP1 SNPs reported to date. No SNP or haplotype showed significant association with disease. Potential explanations include insufficient LD between trait and marker loci, the presence of locus and/or allelic heterogeneity at the DTNBP1 locus, and insufficient sample sizes to detect variants of modest effect. Greater consistency of future association studies may be achieved using dense SNP maps designed to capture local patterns of LD in large samples drawn from specific population groups. This thesis reports several novel and exciting findings, including the identification of three genomic regions (1p31.1, 8p23.3 and 10p12.32) demonstrating genome-wide significant linkage to schizophrenia. These findings should assist in the eventual identification of susceptibility variants for this disease. Elucidating schizophrenia’s aetiological mechanisms remains an important challenge for modern medicine, and may both facilitate the development of more efficacious treatments and permit earlier diagnosis of the disease, which can improve disease outcome. Such advancements could substantially reduce the personal, social and financial burden associated with this common and debilitating disorder.
There are a growing number of large cohorts of older persons with genome-wide genotyping data available, but APOE is not included in any of the common microarray platforms. We compared directly measured APOE genotypes with those imputed using microarray data and the "1000 Genomes" dataset in a sample of 320 Caucasians. We find 90% agreement for ε2/ε3/ε4 genotypes and 93% agreement for predicting ε4 status, yielding kappa values of 0.81 and 0.84, respectively. More stringent thresholds around allele number estimates can increase this agreement to 90-97% and kappas of 0.90-0.93.
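Percent agreement and kappa, as reported above, measure different things: kappa discounts the agreement expected by chance given how often each genotype class occurs. A minimal sketch of Cohen's kappa for paired measured and imputed calls follows; the function name and the ten toy calls are hypothetical and are not data from the study.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(measured: Sequence[Hashable], imputed: Sequence[Hashable]) -> float:
    """Cohen's kappa for two paired sets of categorical calls."""
    if len(measured) != len(imputed) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(measured)
    # Observed proportion of samples where both calls agree.
    observed = sum(m == i for m, i in zip(measured, imputed)) / n
    # Agreement expected by chance from the marginal class frequencies.
    m_counts, i_counts = Counter(measured), Counter(imputed)
    expected = sum(m_counts[c] * i_counts[c] for c in m_counts) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one class occurs")
    return (observed - expected) / (1.0 - expected)


# Toy example: e4 carrier status (1 = carrier) for ten made-up samples,
# with one discordant call, so raw agreement is 90%.
toy_measured = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_imputed = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
print(round(cohens_kappa(toy_measured, toy_imputed), 3))
```

In this toy case the chance-expected agreement is 0.54, so 90% raw agreement corresponds to a noticeably lower kappa, the same pattern as in the reported figures.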