
Fixation index

The fixation index ($F_{ST}$) is a measure of population differentiation due to genetic structure. It is frequently estimated from genetic polymorphism data, such as single-nucleotide polymorphisms (SNPs) or microsatellites. Developed as a special case of Wright's F-statistics, it is one of the most commonly used statistics in population genetics.

Two of the most commonly used definitions of $F_{ST}$ at a given locus are based on the variance of allele frequencies between populations and on the probability of identity by descent.

If $\bar{p}$ is the average frequency of an allele in the total population, $\sigma_S^2$ is the variance in the frequency of the allele between different subpopulations, weighted by the sizes of the subpopulations, and $\sigma_T^2$ is the variance of the allelic state in the total population, $F_{ST}$ is defined as

$$F_{ST} = \frac{\sigma_S^2}{\sigma_T^2} = \frac{\sigma_S^2}{\bar{p}(1-\bar{p})}.$$

Wright's definition illustrates that $F_{ST}$ measures the proportion of genetic variance that can be explained by population structure. This can also be thought of as the fraction of total diversity that is not a consequence of the average diversity within subpopulations, where diversity is measured by the probability that two randomly selected alleles are different, namely $2p(1-p)$.
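As a rough illustration of the variance-based definition, the sketch below computes the ratio $\sigma_S^2 / \sigma_T^2$ for a single biallelic locus, taking $\sigma_T^2 = \bar{p}(1-\bar{p})$. The function name `fst_from_frequencies` and the example frequencies are made up for illustration. It treats the supplied allele frequencies as known population values, so it evaluates the definition and is not a sample-based estimator.

```python
def fst_from_frequencies(freqs, sizes):
    """Evaluate Wright's variance-based F_ST at one biallelic locus.

    freqs: allele frequency in each subpopulation (assumed known exactly).
    sizes: subpopulation sizes, used only as weights.
    """
    if len(freqs) != len(sizes):
        raise ValueError("freqs and sizes must have the same length")

    total = sum(sizes)
    weights = [s / total for s in sizes]  # relative subpopulation sizes

    # Average allele frequency in the total population.
    p_bar = sum(w * p for w, p in zip(weights, freqs))

    # Size-weighted variance of allele frequency between subpopulations.
    var_s = sum(w * (p - p_bar) ** 2 for w, p in zip(weights, freqs))

    # Variance of the allelic state in the total population.
    var_t = p_bar * (1.0 - p_bar)
    if var_t == 0:
        raise ValueError("locus is monomorphic in the total population")

    return var_s / var_t


# Hypothetical example: two equally sized subpopulations.
example = fst_from_frequencies([0.2, 0.8], [100, 100])  # approximately 0.36
```

With identical frequencies in every subpopulation the function returns 0, and with alternative alleles fixed in two subpopulations it returns 1, matching the usual interpretation of the two extremes.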
If the allele frequency in the $i$th population is $p_i$ and the relative size of the $i$th population is $c_i$, then

$$F_{ST} = \frac{\bar{p}(1-\bar{p}) - \sum_i c_i\, p_i(1-p_i)}{\bar{p}(1-\bar{p})}.$$

Alternatively,

$$F_{ST} = \frac{f_0 - \bar{f}}{1 - \bar{f}},$$

where $f_0$ is the probability of identity by descent of two individuals given that the two individuals are in the same subpopulation, and $\bar{f}$ is the probability that two individuals from the total population are identical by descent. Using this definition, $F_{ST}$ can be interpreted as measuring how much more closely related two individuals from the same subpopulation are than two individuals from the total population. If the mutation rate is small, this interpretation can be made more explicit by linking the probability of identity by descent to coalescence times. Let $T_0$ and $T$ denote the average time to coalescence for individuals from the same subpopulation and from the total population, respectively. Then

$$F_{ST} \approx 1 - \frac{T_0}{T}.$$

This formulation has the advantage that the expected time to coalescence can easily be estimated from genetic data, which led to the development of various estimators for $F_{ST}$.

In practice, none of the quantities used in the definitions can be easily measured. As a consequence, various estimators have been proposed. A particularly simple estimator applicable to DNA sequence data is

$$F_{ST} = \frac{\pi_{\text{Between}} - \pi_{\text{Within}}}{\pi_{\text{Between}}},$$

where $\pi_{\text{Between}}$ and $\pi_{\text{Within}}$ represent the average number of pairwise differences between two individuals sampled from different subpopulations ($\pi_{\text{Between}}$) or from the same subpopulation ($\pi_{\text{Within}}$). The average pairwise difference within a population can be calculated as the sum of the pairwise differences divided by the number of pairs. However, this estimator is biased when sample sizes are small or when they vary between populations. Therefore, more elaborate methods are used to compute $F_{ST}$ in practice.
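The simple pairwise-difference estimator described above, taken here as $(\pi_{\text{Between}} - \pi_{\text{Within}}) / \pi_{\text{Between}}$, can be sketched as follows. The function names and the toy sequences are hypothetical. The sketch assumes aligned sequences of equal length with no missing data, and it pools all within-population pairs into a single average, which is one of several possible weighting choices. It inherits the bias noted above and is not a substitute for the more elaborate estimators.

```python
from itertools import combinations, product


def pairwise_differences(seq_a, seq_b):
    """Count the sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def pi_within(populations):
    """Mean pairwise differences over all pairs drawn from the same subpopulation."""
    diffs = pairs = 0
    for pop in populations:
        for a, b in combinations(pop, 2):
            diffs += pairwise_differences(a, b)
            pairs += 1
    if pairs == 0:
        raise ValueError("need at least one subpopulation with two sequences")
    return diffs / pairs


def pi_between(populations):
    """Mean pairwise differences over all pairs drawn from different subpopulations."""
    diffs = pairs = 0
    for pop_a, pop_b in combinations(populations, 2):
        for a, b in product(pop_a, pop_b):
            diffs += pairwise_differences(a, b)
            pairs += 1
    if pairs == 0:
        raise ValueError("need at least two non-empty subpopulations")
    return diffs / pairs


def fst_pairwise(populations):
    """Simple F_ST estimate: (pi_between - pi_within) / pi_between."""
    between = pi_between(populations)
    if between == 0:
        raise ValueError("no differences between subpopulations")
    return (between - pi_within(populations)) / between


# Toy data (made up): two subpopulations, two sequences each.
toy = [["AAAA", "AAAT"], ["TTTT", "TTTA"]]
estimate = fst_pairwise(toy)
```

For the toy data, the within-population average is 1 difference per pair and the between-population average is 3.5, so the estimate is 2.5 / 3.5, roughly 0.71.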
Two of the most widely used procedures are the estimator of Weir & Cockerham (1984) and the analysis of molecular variance (AMOVA). A list of implementations is available at the end of this article.
