Abstract We deployed the Blended Genome Exome (BGE), a DNA library blending approach that generates low-pass whole genome (1-4x mean depth) and deep whole exome (30-40x mean depth) data in a single sequencing run. This technology is cost-effective, empowers most genomic discoveries possible with deep whole genome sequencing, and provides an unbiased method to capture the diversity of common SNP variation across the globe. To evaluate this new technology at scale, we applied BGE to sequence >53,000 samples from the Populations Underrepresented in Mental Illness Associations Studies (PUMAS) Project, which included participants across African, African American, and Latin American populations. We evaluated the accuracy of BGE imputed genotypes against raw genotype calls from the Illumina Global Screening Array. All PUMAS cohorts had R² concordance ≥95% among SNPs with MAF≥1%, and R² never fell below 90% for SNPs with MAF<1%. Furthermore, concordance rates among local ancestries within two recently admixed cohorts were consistent among SNPs with MAF≥1%, with only minor deviations in SNPs with MAF<1%. We also benchmarked the discovery capacity of BGE to access protein-coding copy number variants (CNVs) against deep whole genome data, finding that deletions and duplications spanning at least 3 exons had a positive predictive value of ∼90%. Our results demonstrate BGE scalability and efficacy in capturing SNPs, indels, and CNVs in the human genome at 28% of the cost of deep whole-genome sequencing. BGE is poised to enhance access to genomic testing and empower genomic discoveries, particularly in underrepresented populations.
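The two accuracy metrics named in this abstract can be illustrated with a minimal sketch. It assumes that R² concordance means the squared Pearson correlation between imputed allele dosages and array genotype counts at a single SNP, and that positive predictive value is the fraction of called CNVs confirmed in the deep whole genome data; the abstract does not state the exact computation, and the function names `dosage_r2` and `positive_predictive_value` are hypothetical.

```python
import numpy as np


def dosage_r2(imputed_dosage, array_genotype):
    """Squared Pearson correlation between imputed dosages (0..2) and
    array genotype calls (0, 1, 2) for one SNP. Assumed definition."""
    x = np.asarray(imputed_dosage, dtype=float)
    y = np.asarray(array_genotype, dtype=float)
    if x.std() == 0 or y.std() == 0:
        return float("nan")  # correlation is undefined at monomorphic sites
    r = np.corrcoef(x, y)[0, 1]
    return float(r ** 2)


def positive_predictive_value(true_positives, false_positives):
    """Fraction of calls that are confirmed by the truth set."""
    called = true_positives + false_positives
    return true_positives / called if called else float("nan")
```

In practice such per-SNP values would be aggregated within MAF bins (for example MAF≥1% versus MAF<1%) before comparing cohorts.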
The Kessler Psychological Distress Scale (K-10) is a short screening tool developed to identify, with good sensitivity, non-specific psychological distress in the general population. Sensitivity and specificity of the K-10 have been examined in various clinical populations in South Africa; however, other psychometric properties, such as construct validity and factor structure, have not been evaluated. We present evidence of the prevalence and severity of psychological distress in an outpatient setting in South Africa and evaluate the internal reliability, construct validity, and factor structure of the K-10 in this population. We explored prevalence estimates of psychological distress using previously established cutoffs and assessed the reliability (consistency) of the K-10 by calculating Cronbach's alpha, item-total correlations, and omega total and hierarchical coefficients. Construct validity and factor structure of the K-10 were examined through split-sample exploratory factor analysis (EFA) followed by confirmatory factor analysis (CFA), comparing several theoretical models and the EFA. Overall, there was low prevalence of psychological distress in our sample of 2591 adults, the majority of whom were between the ages of 18 and 44 (77.7%). The K-10 showed good construct validity and reliability, with a Cronbach's alpha of 0.84 and omega total of 0.88. EFA yielded a four-factor solution with likely measurement artifacts. CFA showed that the four-factor model from EFA displayed the best comparative fit indices, but was likely overfitted. The unidimensional model with correlated errors was deemed the best fitting model based on fit indices, prior theory, and previous studies. The K-10 displays adequate psychometric properties, good internal reliability, and good fit with a unidimensional-factor structure with correlated errors. 
Further work is required to determine appropriate cutoff values in different populations and clinical subgroups within South Africa to aid in determining the K-10's clinical utility.
Summary African populations are the most diverse in the world yet are sorely underrepresented in medical genetics research. Here, we examine the structure of African populations using genetic and comprehensive multigenerational ethnolinguistic data from the Neuropsychiatric Genetics of African Populations-Psychosis study (NeuroGAP-Psychosis) consisting of 900 individuals from Ethiopia, Kenya, South Africa, and Uganda. We find that self-reported language classifications meaningfully tag underlying genetic variation that would be missed with consideration of geography alone, highlighting the importance of culture in shaping genetic diversity. Leveraging our uniquely rich multi-generational ethnolinguistic metadata, we track language transmission through the pedigree, observing the disappearance of several languages in our cohort as well as notable shifts in frequency over three generations. We find suggestive evidence for the rate of language transmission in matrilineal groups having been higher than that for patrilineal ones. We highlight both the diversity of variation within the African continent, as well as how within-Africa variation can be informative for broader variant interpretation; many variants appearing rare elsewhere are common in parts of Africa. The work presented here improves the understanding of the spectrum of genetic variation in African populations and highlights the enormous and complex genetic and ethnolinguistic diversity within Africa.
The Mini International Neuropsychiatric Inventory 7.0.2 (MINI-7) is a widely used tool known to have sound psychometric properties, but very little is known about its use in low- and middle-income countries (LMICs). This study aimed to examine the psychometric properties of the MINI-7 psychosis items in a sample of 8609 participants across four countries in Sub-Saharan Africa.
Abstract Background The Populations Underrepresented in Mental Illness Association Studies (PUMAS) project is attempting to remediate the historical underrepresentation of African and Latin American populations in psychiatric genetics through large-scale genetic association studies of individuals diagnosed with a serious mental illness [SMI, including schizophrenia (SCZ), schizoaffective disorder (SZA), bipolar disorder (BP), and severe major depressive disorder (MDD)] and matched controls. Given growing evidence indicating substantial symptomatic and genetic overlap between these diagnoses, we sought to enable transdiagnostic genetic analyses of PUMAS data by conducting phenotype alignment and harmonization for 89,320 participants (48,165 cases and 41,155 controls) from four cohorts, each of which used different ascertainment and assessment methods: PAISA n=9,105; PUMAS-LATAM n=14,638; NGAP n=42,953; and GPC n=22,624. As we describe here, these efforts have yielded harmonized datasets enabling us to analyze PUMAS genetic variation data at three levels: SMI overall, diagnoses, and individual symptoms. Methods In aligning item-level phenotypes obtained from 14 different clinical instruments, we incorporated content, branching nature, and time frame for each phenotype; standardized diagnoses; and selected 19 core SMI item-level phenotypes for analyses. The harmonization was evaluated in PUMAS cases using multiple correspondence analysis (MCA), co-occurrence analyses, and item-level endorsement. Outcomes We mapped >6,895 item-level phenotypes in the aggregated PUMAS data, in which SCZ (44.97%) and severe BP (BP-I, 31.53%) were the most common diagnoses. Twelve of the 19 core item-level phenotypes occurred at frequencies of >10% across all diagnoses, indicating their potential utility for transdiagnostic genetic analyses. 
MCA of the 14 phenotypes that were present for all cohorts revealed consistency across cohorts, and placed MDD and SCZ into separate clusters, while other diagnoses showed no significant phenotypic clustering. Interpretation Our alignment strategy effectively aggregated extensive phenotypic data obtained using diverse assessment tools. The MCA yielded dimensional scores which we will use for genetic analyses along with the item level phenotypes. After successful harmonization, residual phenotypic heterogeneity between cohorts reflects differences in branching structure of diagnostic instruments, recruitment strategies, and symptom interpretation (due to cultural variation).
Abstract Background Genetic studies of biomedical phenotypes in underrepresented populations identify disproportionate numbers of novel associations. However, current genomics infrastructure, including most genotyping arrays and sequenced reference panels, best serves populations of European descent. A critical step for facilitating genetic studies in underrepresented populations is to ensure that genetic technologies accurately capture variation in all populations. Here, we quantify the accuracy of low-coverage sequencing in diverse African populations. Results We sequenced the whole genomes of 91 individuals to high coverage (≥20X) from the Neuropsychiatric Genetics of African Populations-Psychosis (NeuroGAP-Psychosis) study, in which participants were recruited from Ethiopia, Kenya, South Africa, and Uganda. We empirically tested two data generation strategies, GWAS arrays versus low-coverage sequencing, by calculating the concordance of imputed variants from these technologies with those from deep whole genome sequencing data. We show that low-coverage sequencing at a depth of ≥4X captures variants of all frequencies more accurately than all commonly used GWAS arrays investigated and at a comparable cost. Lower depths of sequencing (0.5-1X) performed comparably to commonly used low-density GWAS arrays. Low-coverage sequencing is also sensitive to novel variation, with 4X sequencing detecting 45% of singletons and 95% of common variants identified in high-coverage African whole genomes. Conclusion These results indicate that low-coverage sequencing approaches surmount the problems induced by the ascertainment of common genotyping arrays, including those that capture variation most common in Europeans and Africans. Low-coverage sequencing effectively identifies novel variation (particularly in underrepresented populations), and presents opportunities to enhance variant discovery at a similar cost to traditional approaches.