language-icon Old Web
English
Sign In

Error threshold

In evolutionary biology and population genetics, the error threshold (or critical mutation rate) is a limit on the number of base pairs a self-replicating molecule may have before mutation will destroy the information in subsequent generations of the molecule. The error threshold is crucial to understanding 'Eigen's paradox'. In evolutionary biology and population genetics, the error threshold (or critical mutation rate) is a limit on the number of base pairs a self-replicating molecule may have before mutation will destroy the information in subsequent generations of the molecule. The error threshold is crucial to understanding 'Eigen's paradox'. The error threshold is a concept in the origins of life (abiogenesis), in particular of very early life, before the advent of DNA. It is postulated that the first self-replicating molecules might have been small ribozyme-like RNA molecules. These molecules consist of strings of base pairs or 'digits', and their order is a code that directs how the molecule interacts with its environment. All replication is subject to mutation error. During the replication process, each digit has a certain probability of being replaced by some other digit, which changes the way the molecule interacts with its environment, and may increase or decrease its fitness, or ability to reproduce, in that environment. It was noted by Manfred Eigen in his 1971 paper (Eigen 1971) that this mutation process places a limit on the number of digits a molecule may have. If a molecule exceeds this critical size, the effect of the mutations become overwhelming and a runaway mutation process will destroy the information in subsequent generations of the molecule. The error threshold is also controlled by the 'fitness landscape' for the molecules. The fitness landscape is characterized by the two concepts of height (=fitness) and distance (=number of mutations). Similar molecules are 'close' to each other, and molecules that are fitter than others and more likely to reproduce, are 'higher' in the landscape. If a particular sequence and its neighbors have a high fitness, they will form a quasispecies and will be able to support longer sequence lengths than a fit sequence with few fit neighbors, or a less fit neighborhood of sequences. Also, it was noted by Wilke (Wilke 2005) that the error threshold concept does not apply in portions of the landscape where there are lethal mutations, in which the induced mutation yields zero fitness and prohibits the molecule from reproducing. Eigen's paradox is one of the most intractable puzzles in the study of the origins of life. It is thought that the error threshold concept described above limits the size of self replicating molecules to perhaps a few hundred digits, yet almost all life on earth requires much longer molecules to encode their genetic information. This problem is handled in living cells by enzymes that repair mutations, allowing the encoding molecules to reach sizes on the order of millions of base pairs. These large molecules must, of course, encode the very enzymes that repair them, and herein lies Eigen's paradox, first put forth by Manfred Eigen in his 1971 paper (Eigen 1971). Simply stated, Eigen's paradox amounts to the following: This is a chicken-or-egg kind of a paradox, with an even more difficult solution. Which came first, the large genome or the error correction enzymes? A number of solutions to this paradox have been proposed: Consider a 3-digit molecule where A, B, and C can take on the values 0 and 1. There are eight such sequences (, , , , , , , and ). Let's say that the molecule is the most fit; upon each replication it produces an average of a {displaystyle a} copies, where a > 1 {displaystyle a>1} . This molecule is called the 'master sequence'. The other seven sequences are less fit; they each produce only 1 copy per replication. The replication of each of the three digits is done with a mutation rate of μ. In other words, at every replication of a digit of a sequence, there is a probability μ {displaystyle mu } that it will be erroneous; 0 will be replaced by 1 or vice versa. Let's ignore double mutations and the death of molecules (the population will grow infinitely), and divide the eight molecules into three classes depending on their Hamming distance from the master sequence: Note that the number of sequences for distance d is just the binomial coefficient ( L d ) {displaystyle { binom {L}{d}}} for L=3, and that each sequence can be visualized as the vertex of an L=3 dimensional cube, with each edge of the cube specifying a mutation path in which the change Hamming distance is either zero or ±1. It can be seen that, for example, one third of the mutations of the molecules will produce molecules, while the other two thirds will produce the class 2 molecules and . We can now write the expression for the child populations n i ′ {displaystyle n'_{i}} of class i in terms of the parent populations n j {displaystyle n_{j}} . where the matrix 'w’ that incorporates natural selection and mutation, according to quasispecies model, is given by:

[ "Algorithm" ]
Parent Topic
Child Topic
    No Parent Topic