Accuracy of estimating genetic distance between species from short sequences of mitochondrial DNA.

1990 
Recent advances in molecular techniques allow rapid determination of genetic divergence, and in many circumstances the polymerase chain reaction (PCR) and DNA sequencing have become the technologies of choice for estimating genetic distance. For evolutionary studies, considerable effort is being focused on “universal” oligonucleotide primers that amplify short sequences of mitochondrial DNA. Kocher et al. (1989, p. 6199) encapsulate this approach when they state that a “short sequence from a piece of the cytochrome b gene contains phylogenetic information extending from the infraspecific level to the intergeneric level.” If this is a valid generalization, then the advent of universal primers and the ease of direct sequencing of PCR products provide an efficient and reliable means of determining genetic divergence and inferring phylogenetic relationships among taxa. Indeed, DNA sequences derived from the cytochrome b primers (usually amounting to 250 bp) have proved successful for inferring evolutionary relationships and estimating rates and processes of molecular evolution (Kocher et al. 1989; Thomas et al. 1989).

However, before evolutionary biologists turn their attention to such short DNA sequences, it is important to understand the accuracy of genetic distances estimated from them. It is well known that the distribution of substitution events is leptokurtic. Thomas and Beckenbach (1989) showed that there is spatial heterogeneity in the distribution of substitutions in mitochondrial protein-coding genes from salmonid fishes. A nonrandom distribution of substitutions is a problem in estimating evolutionary relatedness from short, amplified DNA sequences because the sampling scheme is itself nonrandom, being entirely dependent on the choice of primers used for amplification.
Amplification and sequencing of fragments that are too small to encompass the scope of spatial heterogeneity in the distribution of substitutions will give biased estimates of genetic distance.

We investigated the influence of the number of basepairs on genetic distance estimates based on mtDNA sequence data by randomly subsampling known DNA sequences and then determining the variance of genetic distance estimates among subsamples. This approach allows us to investigate the extent to which substitution differences in different parts of a gene interfere with the robustness of genetic distance estimates based on short sequences. The accuracy of between-taxa genetic divergence estimation based on a small number of basepairs was evaluated by subsampling regions of the mitochondrial genomes containing protein-coding genes for five pairs of vertebrate taxa (fig. 1). The analysis focuses on fourfold-degenerate sites within protein-coding regions because these sites are free from selective constraints and therefore provide the best type of data for inferring evolutionary distance (Wu and Li 1985). For each of the five paired vertebrate comparisons, continuous, homologous sequences of 250, 500, 750, 1,000, 1,250, and 1,500 bp were selected at random from the complete sequences. For these sequence lengths, the average numbers of fourfold-
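
The subsampling procedure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes two pre-aligned, gap-free sequences of equal length, and it uses the uncorrected proportion of differing sites across all positions as a stand-in for a distance computed from fourfold-degenerate sites only. The function names (`p_distance`, `subsample_distances`, `summarize_by_length`) and the default of 100 subsamples per length are hypothetical choices made for illustration.

```python
import random
import statistics


def p_distance(seq_a, seq_b):
    """Uncorrected proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    diffs = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return diffs / len(seq_a)


def subsample_distances(seq_a, seq_b, window, n_samples, rng):
    """Distances from n_samples randomly placed contiguous windows."""
    if window > len(seq_a):
        raise ValueError("window is longer than the alignment")
    max_start = len(seq_a) - window
    distances = []
    for _ in range(n_samples):
        # The same start is used for both sequences, so windows stay homologous.
        start = rng.randint(0, max_start)
        end = start + window
        distances.append(p_distance(seq_a[start:end], seq_b[start:end]))
    return distances


def summarize_by_length(seq_a, seq_b,
                        windows=(250, 500, 750, 1000, 1250, 1500),
                        n_samples=100, seed=0):
    """Map each window length to (mean, sample variance) of its subsampled distances."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    summary = {}
    for window in windows:
        d = subsample_distances(seq_a, seq_b, window, n_samples, rng)
        summary[window] = (statistics.mean(d), statistics.variance(d))
    return summary
```

Under these assumptions, a variance that falls as the window length grows would indicate that short fragments sample the spatial heterogeneity of substitutions unevenly, which is the effect the analysis sets out to measure.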