HLA-G is a promiscuous immune checkpoint molecule. The HLA-G gene presents substantial nucleotide variability in its regulatory regions. However, it encodes a limited number of proteins compared to classical HLA class I genes. We characterized the HLA-G genetic variability in 4640 individuals from 88 different population samples across the globe by using a state-of-the-art method to characterize polymorphisms and haplotypes from high-coverage next-generation sequencing data. We also provide insights regarding the HLA-G genetic diversity and a resource for future studies evaluating HLA-G polymorphisms in different populations and association studies. Despite the great haplotype variability, we demonstrated that: (1) most of the HLA-G polymorphisms are in introns and regulatory sequences, and these are the sites with evidence of balancing selection, (2) linkage disequilibrium is high throughout the gene, extending up to HLA-A, (3) there are few proteins frequently observed in worldwide populations, with lack of variation in residues associated with major HLA-G biological properties (dimer formation, interaction with leukocyte receptors). These observations corroborate the role of HLA-G as an immune checkpoint molecule rather than as an antigen-presenting molecule. Understanding HLA-G variability across populations is relevant for disease association and functional studies.
Abstract Increasingly, the inference of genetic ancestry plays a prominent role in clinical, population and forensic genetics studies. Over the last few decades, several genotyping strategies and analytical methodologies have been developed to assign individuals to specific biogeographic regions. However, despite these efforts, ancestry inference in populations with a recent history of admixture, such as those in America, remains a challenge. In admixed populations, the proportions and components of genetic ancestry vary at different levels: (i) between populations; (ii) between individuals of the same population; (iii) throughout the individual's genome. In the present study we compared and evaluated different sets of markers, from panels with small numbers of ancestry informative markers (AIMs) to high-density SNPs (HDSNP) and whole-genome sequence (WGS) data. To this end, we evaluated 1,675 admixed samples from America, using a tetrahybrid admixture model (Native American, European, African and East Asian). Analyses show greater variation in the correlation coefficient of ancestry components within and between admixed populations, especially for minority ancestral components. We also observed that the higher the number of markers in the AIMs panel, the higher the correlation with HDSNP and WGS. In addition, the greater the number of markers, the more robust the tetrahybrid admixture model.
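As an illustration of the comparison described above, the following minimal sketch computes, for each ancestry component, the correlation between individual ancestry proportions estimated from two marker sets. The abstract does not state which correlation coefficient was used, so Pearson's r is an assumption here. The function names and all numbers are hypothetical, invented only to show the calculation; they are not results from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_by_component(estimates_a, estimates_b):
    """Correlate per-individual ancestry proportions, one component at a time.

    Both arguments map a component name to a list of proportions, with
    individuals in the same order in both marker sets.
    """
    return {
        component: pearson(estimates_a[component], estimates_b[component])
        for component in estimates_a
    }


# Made-up proportions for four individuals (NOT data from the study).
aims_panel = {
    "European": [0.70, 0.55, 0.20, 0.90],
    "African": [0.20, 0.30, 0.70, 0.05],
}
wgs = {
    "European": [0.68, 0.60, 0.25, 0.88],
    "African": [0.22, 0.27, 0.66, 0.07],
}

for component, r in correlation_by_component(aims_panel, wgs).items():
    print(f"{component}: r = {r:.3f}")
```

Computing the coefficient separately for each component keeps minority components visible, since a single pooled coefficient would be dominated by the majority ancestry.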
Abstract Background The APOE gene is a major risk factor for late onset Alzheimer’s disease and has three alleles, ε2, ε3 and ε4, defined by two non‐synonymous substitutions. The presence of the ε4 allele confers an increased risk for the disease in a dose dependent manner (odds ratio 12.9 for homozygous individuals and between 3.2 and 4.2 for heterozygous individuals). The most prevalent form of Alzheimer’s disease has a multifactorial etiology, where several genetic and environmental factors influence its pathology. Factors such as age and sex are relevant in determining the risk conferred by APOE, and recent studies point out that the ancestry around the APOE gene is also relevant to the risk for the disease. Individuals of African‐American ancestry have a reduced risk of developing the disease compared to those of European and Asian ancestry. The regions surrounding APOE may act in a cis‐regulatory manner in modulating gene expression. The Brazilian population results from admixture between indigenous, African and European populations, offering an opportunity to study the effects of different ancestries on the risk of Alzheimer’s disease, to address the lack of genetic information about admixed groups, and to characterize the differential patterns of APOE expression in the Brazilian population. Method This project aims to quantitatively characterize the role of local ancestry in the different APOE genotypes using 36 post‐mortem brain tissue samples with distinct combinations of genotype and local ancestry (33, 34, 44; European‐European, African‐African, European‐African, African‐European), associating them with the expression of this gene through molecular marker genotyping and RT‐qPCR. We expect mRNA expression to be differentially increased when European local ancestry is present, compared to non‐European ancestry, which may explain, in part, the difference in risk observed in populations of different ancestries.
Result Ongoing research. Conclusion Ongoing research.
Despite the high number of individuals infected by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) who develop coronavirus disease 2019 (COVID-19) symptoms worldwide, many exposed individuals remain asymptomatic and/or uninfected and seronegative. This could be explained by a combination of environmental (exposure), immunological (previous infection), epigenetic, and genetic factors. Aiming to identify genetic factors involved in immune response in symptomatic COVID-19 as compared to asymptomatic exposed individuals, we analyzed 83 Brazilian couples where one individual was infected and symptomatic while the partner remained asymptomatic and serum-negative for at least 6 months despite sharing the same bedroom during the infection. We refer to these as "discordant couples". We performed whole-exome sequencing followed by a state-of-the-art method to call genotypes and haplotypes across the highly polymorphic major histocompatibility complex (MHC) region. The discordant partners had comparable ages and genetic ancestry, but women were overrepresented (65%) in the asymptomatic group. In the antigen-presentation pathway, we observed an association between HLA-DRB1 alleles encoding Lys at residue 71 (mostly DRB1*03:01 and DRB1*04:01) and DOB*01:02 with symptomatic infections and HLA-A alleles encoding 144Q/151R with asymptomatic seronegative women. Among the genes related to immune modulation, we detected variants in MICA and MICB associated with symptomatic infections. These variants are related to higher expression of soluble MICA and low expression of MICB. Thus, quantitative differences in these molecules that modulate natural killer (NK) activity could contribute to susceptibility to COVID-19 by downregulating NK cell cytotoxic activity in infected individuals but not in the asymptomatic partners.
ABSTRACT Admixed populations are a resource to study the global genetic architecture of complex phenotypes, which is critical, considering that non-European populations are severely under-represented in genomic studies. Leveraging admixture in Brazilians, whose chromosomes are mosaics of fragments of Native American, European and African origins, we used genome-wide data to perform admixture mapping/fine-mapping of Body Mass Index (BMI) in three population-based cohorts from the Northeast (Salvador), Southeast (Bambuí) and South (Pelotas) of the country. We found significant associations with African-associated alleles in children from Salvador (PALD1 and ZMIZ1 genes), and in young adults from Pelotas (NOD2 and MTUS2 genes). More importantly, in Pelotas, rs114066381, mapped in a potential regulatory region, is significantly associated only in females (p = 2.76e-06). This variant is very rare in Europeans but reaches frequencies of ~3% in West Africa, and has a strong female-specific effect (95% CI: 2.32-5.65 kg/m² per A allele). We confirmed this sex-specific association and replicated its strong effect for an adjusted fat-mass index in the same Pelotas cohort, and for BMI in another Brazilian cohort from São Paulo (Southeast Brazil). A meta-analysis confirmed the significant association. Remarkably, we observed that while the frequency of the rs114066381-A allele ranges from 0.8 to 2.1% in the studied populations, it attains ~9% among morbidly obese women from Pelotas, São Paulo, and Bambuí. The effect size of rs114066381 is at least five times the effect size of the FTO SNPs rs9939609 and rs1558902, already emblematic for their high effects, and for which we replicated associations in Pelotas. We demonstrate how, after a decade of GWAS mostly performed in European-ancestry populations, non-European and admixed populations are a source of new relevant phenotype-associated genetic variants.
Genetic and omics analyses frequently require independent observations, which is not guaranteed in real datasets. When relatedness cannot be accounted for, solutions involve removing related individuals (or observations) and, consequently, reducing the available data. We developed NAToRA, a network-based relatedness-pruning method that minimizes dataset reduction while removing unwanted relationships in a dataset. It uses the node degree centrality metric to identify highly connected nodes (or individuals) and implements heuristics that approximate the minimal reduction of a dataset, which allows its application to complex datasets. When compared with two other popular population genetics methodologies (PLINK and KING), NAToRA shows the best combination of removing all relatives while keeping the largest possible number of individuals in all datasets tested, with effects on the allele frequency spectrum and Principal Component Analysis similar to those of PLINK and KING. NAToRA is freely available, both as a standalone tool that can be easily incorporated into a pipeline, and as a graphical web tool that allows visualization of the relatedness networks. NAToRA also accepts a variety of relationship metrics as input, which facilitates its use. We also release the genealogy simulator software used for different tests performed in this study.
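The idea of degree-based pruning can be sketched as follows. This is a simplified greedy heuristic written only for illustration, not NAToRA's actual implementation: it assumes relatedness has already been reduced to a list of related pairs, and it repeatedly removes the individual with the most remaining relatives until no related pair is left. The function name `prune_related` and the toy network are hypothetical.

```python
from collections import defaultdict


def prune_related(pairs):
    """Greedy degree-based pruning: return the set of individuals to remove."""
    # Build an undirected relatedness network as adjacency sets.
    neighbours = defaultdict(set)
    for a, b in pairs:
        if a != b:
            neighbours[a].add(b)
            neighbours[b].add(a)

    removed = set()
    # Repeat until no related pair remains in the network.
    while any(neighbours.values()):
        # Pick the individual with the most remaining relatives (highest
        # degree); sorting first makes tie-breaking deterministic.
        worst = max(sorted(neighbours), key=lambda n: len(neighbours[n]))
        for other in neighbours.pop(worst):
            neighbours[other].discard(worst)
        removed.add(worst)
    return removed


# Hypothetical toy network: "A" is related to "B", "C" and "D";
# "E" and "F" form a separate related pair.
toy_pairs = [("A", "B"), ("A", "C"), ("A", "D"), ("E", "F")]
print(sorted(prune_related(toy_pairs)))  # → ['A', 'E']
```

In the toy network, removing "A" alone resolves three relationships, so four of the six individuals are kept; removing one member of every related pair without regard to degree could discard more individuals than necessary.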
Abstract Approximately 5% of the human genome consists of structural variants, which are enriched for genes involved in the immune response and cell-cell interactions. A well-established region of extensive structural variation is the glycophorin gene cluster, comprising three tandemly-repeated regions about 120 kb in length, carrying the highly homologous genes GYPA, GYPB and GYPE. Glycophorin A and glycophorin B are glycoproteins present at high levels on the surface of erythrocytes, and they have been suggested to act as decoy receptors for viral pathogens. They act as receptors for invasion of a causative agent of malaria, Plasmodium falciparum. A particular complex structural variant (DUP4) that creates a GYPB/GYPA fusion gene is known to confer resistance to malaria. Many other structural variants exist, and remain poorly characterised. Here, we analyse sequences from 6466 genomes from across the world for structural variation at the glycophorin locus, confirming 15 variants in the 1000 Genomes project cohort, discovering 9 new variants, and characterising a selection using fibre-FISH and breakpoint mapping. We identify variants predicted to create novel fusion genes and a common inversion duplication variant at appreciable frequencies in West Africans. We show that almost all variants can be explained by unequal crossover events (non-allelic homologous recombination, NAHR) and, by comparing the structural variant breakpoints with recombination hotspot maps, show the importance of a particular meiotic recombination hotspot on structural variant formation in this region.
Background Although aging correlates with a worse prognosis for Covid-19, unvaccinated super elderly individuals presenting mild or no symptoms have been reported worldwide. Most of the reported genetic variants responsible for increased disease susceptibility are associated with immune response, involving type I IFN immunity and modulation; HLA cluster genes; inflammasome activation; interleukin genes; and chemokine receptors. On the other hand, little is known about the resistance mechanisms against SARS-CoV-2 infection. Here, we addressed polymorphisms in the MHC region associated with Covid-19 outcome in super elderly resilient patients as compared to younger patients with a severe outcome. Methods SARS-CoV-2 infection was confirmed by RT-PCR test. Aiming to identify candidate genes associated with host resistance, we investigated 87 individuals older than 90 years who recovered from Covid-19 with mild symptoms or who remained asymptomatic following a positive test for SARS-CoV-2, as compared to 55 individuals younger than 60 years who had severe disease or died due to Covid-19, as well as to the general elderly population from the same city. Whole-exome sequencing and an in-depth analysis of the MHC region were performed. All samples were collected in early 2020, before the local vaccination programs started. Results We found that the resilient super elderly group displayed a higher frequency of some missense variants in the MUC22 gene (a member of the mucin family) as one of the strongest signals in the MHC region as compared to the severe Covid-19 group and the general elderly control population. For example, the missense variant rs62399430 at MUC22 is two times more frequent among the resilient super elderly (p = 0.00002, OR = 2.24).
Conclusion Since the pro-inflammatory basal state in the elderly may enhance the susceptibility to severe Covid-19, we hypothesized that MUC22 might play an important protective role against severe Covid-19, by reducing overactive immune responses in the senior population.