Large-scale evolutionary analysis of polymorphic inversions in the human genome

2017 
Chromosomal inversions are structural variants that invert a fragment of the genome, usually without modifying its content, and their subtle but powerful effects in natural populations have long fascinated evolutionary biologists. Discovered a century ago in fruit flies, their association with different evolutionary processes, such as local adaptation and speciation, was soon evident in several species. However, in the current era of genomics and big data, inversions frequently escape detection by current technologies and remain largely overlooked in humans. During the last few years, the InvFEST Project has aimed to fill this gap in knowledge about human inversions by validating and genotyping a large fraction of predicted polymorphisms. In particular, it has generated one of the most useful data sets on human inversions, consisting of 45 common inversions (with sizes from 83 bp to 415 kbp) genotyped at high quality in 550 individuals from seven populations of diverse ancestry. This thesis takes advantage of this population-scale information, combined with whole-genome sequences from the 1000 Genomes Project, to carry out the first detailed analysis of the evolutionary properties of human polymorphic inversions. The methods used combine theoretical models, simulations and empirical comparisons with other mutation types. Besides a complete characterization of the data set, the results confirm fundamental differences between inversions created by different mechanisms. The frequency distribution of the 21 inversions generated by non-homologous mechanisms (NH) is similar to that expected for neutral variants when controlling for detection biases, which indicates that they are not subject to strong negative selection. Recombination is completely inhibited across the whole inversion length, where no clear genetic exchange was found, and possibly for a few kbp beyond the breakpoints.
As a result, NH inversions strongly affect local genome variation levels, as predicted by computer simulations, with older inversions increasing total nucleotide diversity, while younger ones at very high frequency could have the opposite effect. In contrast, most inversions created by non-allelic homologous recombination (NAHR) (19/24) have appeared independently on different haplotypes in the sample. These high recurrence levels are reflected in several measures: NAHR inversions are enriched in intermediate frequencies, share multiple nucleotide polymorphisms between orientations, and show little linkage disequilibrium with neighbouring variants, which limits their detection by tag SNP strategies. Finally, to find inversions that are functional candidates, different signatures of selection on inversions were explored based on their frequencies, population differentiation and sequence variation patterns. Ten candidates were identified, three of which were found to be over 1.5 million years old and maintained at intermediate frequencies, possibly by balancing selection. One of these was also found in archaic hominins. Other candidates seem to have reached high frequencies in a short period of time in some populations, consistent with positive selection. Notably, over half of the candidates are located within gene regions, which suggests that they may have functional effects. Thus, this work offers an overview of inversion dynamics and their role as genomic modifiers, opening interesting avenues of investigation.