Nonsynonymous SNPs: validation characteristics, derived allele frequency patterns, and suggestive evidence for natural selection

2006 
We experimentally investigated more than 1,200 entries in dbSNP that would change amino acids (nsSNPs), using various subsets of DNA samples drawn from 18 global populations (∼1,000 subjects in total). First, we mined the data for any SNP features that correlated with a high validation rate. Useful predictors of valid SNPs included multiple submissions to dbSNP, having a dbSNP validation statement, and being present in a low number of ESTs. Together, these features improved validation rates by almost 10-fold. Higher-abundance SNPs (e.g., T/C variants) also validated more frequently. Second, we considered derived alleles and noted a considerably (∼10%) increased average derived allele frequency (DAF) in Europeans vs. Africans, plus a further increase in some other populations. This was not primarily due to an SNP ascertainment bias, nor to the effects of natural selection. Instead, it can be explained as a drift-based, progressive increase in DAF that occurs over many generations and becomes exaggerated during population bottlenecks. This observation could be used as the basis for novel DAF-based tests for comparing demographic histories. Finally, we considered individual marker patterns and identified 37 SNPs with allele frequency variance or FST values consistent with the effects of population-specific natural selection. Four particularly striking clusters of these markers were apparent, and three of these coincide with genes/regions from among only several dozen such domains previously suggested by others to carry signatures of selection. Hum Mutat 27(2), 173–186, 2006. © 2006 Wiley-Liss, Inc.
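
The abstract refers to two per-SNP summaries: the average derived allele frequency (DAF) within a population and FST across populations. The abstract does not say which FST estimator the authors used, so the sketch below assumes the simplest textbook form, Wright's FST = (H_T - H_S) / H_T for a single biallelic SNP with equally weighted populations. The function names and all frequencies are hypothetical and serve only as illustration; they are not data from the study.

```python
from statistics import mean


def heterozygosity(p):
    """Expected heterozygosity 2p(1-p) for a biallelic SNP with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst(freqs):
    """Wright's F_ST for one SNP from per-population allele frequencies.

    Assumption: populations are weighted equally and no sample-size
    correction is applied. This is not necessarily the paper's estimator.
    """
    p_bar = mean(freqs)                      # pooled allele frequency
    h_t = heterozygosity(p_bar)              # total expected heterozygosity
    if h_t == 0.0:
        return 0.0                           # monomorphic everywhere: no differentiation
    h_s = mean(heterozygosity(p) for p in freqs)  # mean within-population heterozygosity
    return (h_t - h_s) / h_t


def mean_daf(dafs):
    """Average derived allele frequency over a set of SNPs in one population."""
    return mean(dafs)


if __name__ == "__main__":
    # Made-up derived allele frequencies for three SNPs in two populations.
    pop_a = [0.10, 0.40, 0.25]
    pop_b = [0.15, 0.45, 0.80]
    print("mean DAF, population A:", mean_daf(pop_a))
    print("mean DAF, population B:", mean_daf(pop_b))
    for a, b in zip(pop_a, pop_b):
        print("per-SNP F_ST:", round(fst([a, b]), 3))
```

In this toy input the third SNP differs most between the two populations and therefore has the highest FST, which is the kind of per-marker outlier pattern the abstract describes as consistent with population-specific selection.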