
Conserved sequence

In evolutionary biology, conserved sequences are identical or similar sequences in nucleic acids (DNA and RNA) or proteins across species (orthologous sequences), or within a genome (paralogous sequences), or between donor and receptor taxa (xenologous sequences). Conservation indicates that a sequence has been maintained by natural selection. A highly conserved sequence is one that has remained relatively unchanged far back up the phylogenetic tree, and hence far back in geological time. Examples of highly conserved sequences include the RNA components of ribosomes present in all domains of life, the homeobox sequences widespread among eukaryotes, and the tmRNA in bacteria. The study of sequence conservation overlaps with the fields of genomics, proteomics, evolutionary biology, phylogenetics, bioinformatics and mathematics.

The discovery of the role of DNA in heredity, and observations by Frederick Sanger of variation between animal insulins in 1949, prompted early molecular biologists to study taxonomy from a molecular perspective. Studies in the 1960s used DNA hybridization and protein cross-reactivity techniques to measure similarity between known orthologous proteins, such as hemoglobin and cytochrome c. In 1965, Émile Zuckerkandl and Linus Pauling introduced the concept of the molecular clock, proposing that steady rates of amino acid replacement could be used to estimate the time since two organisms diverged. While initial phylogenies closely matched the fossil record, observations that some genes appeared to evolve at different rates led to the development of theories of molecular evolution.
Margaret Dayhoff's 1966 comparison of ferredoxin sequences showed that natural selection would act to conserve and optimise protein sequences essential to life.

Over many generations, nucleic acid sequences in the genome of an evolutionary lineage can gradually change due to random mutations and deletions. Sequences may also recombine or be deleted due to chromosomal rearrangements. Conserved sequences are sequences that persist in the genome despite such forces, and have slower rates of mutation than the background mutation rate. Conservation can occur in coding and non-coding nucleic acid sequences. Highly conserved DNA sequences are thought to have functional value, although the role of many highly conserved non-coding DNA sequences is poorly understood. The extent to which a sequence is conserved can be affected by varying selection pressures, its robustness to mutation, population size and genetic drift. Many functional sequences are also modular, containing regions which may be subject to independent selection pressures, such as protein domains.

In coding sequences, the nucleic acid and amino acid sequence may be conserved to different extents, as the degeneracy of the genetic code means that synonymous mutations in a coding sequence do not affect the amino acid sequence of its protein product. Amino acid sequences can be conserved to maintain the structure or function of a protein or domain. Conserved proteins undergo fewer amino acid replacements, or are more likely to substitute amino acids with similar biochemical properties. Within a sequence, amino acids that are important for folding, structural stability, or that form a binding site may be more highly conserved. The nucleic acid sequence of a protein coding gene may also be conserved by other selective pressures. The codon usage bias in some organisms may restrict the types of synonymous mutations in a sequence.
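The difference between nucleotide-level and amino-acid-level conservation in a coding sequence can be illustrated with a minimal Python sketch. It assumes the standard genetic code and two short, invented coding sequences (`seq_a` and `seq_b` are hypothetical and not taken from any real gene); the helper names `translate` and `identity` are likewise illustrative.

```python
from itertools import product

# Standard genetic code, listed in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def identity(a, b):
    """Fraction of positions that are identical in two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical sequences that differ only by synonymous substitutions.
seq_a = "ATGGCTCTGAAA"
seq_b = "ATGGCCTTAAAG"

nucleotide_identity = identity(seq_a, seq_b)
protein_identity = identity(translate(seq_a), translate(seq_b))

print(f"nucleotide identity: {nucleotide_identity:.2f}")
print(f"protein identity:    {protein_identity:.2f}")
```

In this toy case the two sequences differ at four of twelve nucleotide positions, yet both translate to the same peptide, so the protein is fully conserved while the underlying DNA is not.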
Nucleic acid sequences that cause secondary structure in the mRNA of a coding gene may be selected against, as some structures may negatively affect translation, or conserved where the mRNA also acts as a functional non-coding RNA.

Non-coding sequences important for gene regulation, such as the binding or recognition sites of ribosomes and transcription factors, may be conserved within a genome. For example, the promoter of a conserved gene or operon may also be conserved. As with proteins, nucleotides that are important for the structure and function of non-coding RNA (ncRNA) can also be conserved. However, sequence conservation in ncRNAs is generally poor compared to protein-coding sequences, and base pairs that contribute to structure or function are often conserved instead.
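The idea that base pairing, rather than the primary sequence, can be what is conserved in an ncRNA may be sketched as follows. This is an illustrative example only: the hairpin structure (written in dot-bracket notation), the two RNA sequences, and the function names are all hypothetical, and treating G-U wobble pairs as valid pairs is an assumption of the sketch.

```python
# Watson-Crick pairs plus G-U wobble pairs (the wobble pairs are an assumption).
VALID_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def paired_positions(dot_bracket):
    """Return (i, j) index pairs for matching brackets in a dot-bracket string."""
    stack, pairs = [], []
    for i, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            pairs.append((stack.pop(), i))
    return pairs


def pairs_maintained(rna, pairs):
    """Fraction of structural pairs whose two bases can still pair in `rna`."""
    return sum((rna[i], rna[j]) in VALID_PAIRS for i, j in pairs) / len(pairs)


def sequence_identity(a, b):
    """Fraction of identical positions in two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical hairpin: a four-base-pair stem closing a four-base loop.
structure = "((((....))))"
rna_a = "GGCAUUCGUGCC"
rna_b = "CACUUUCGAGUG"  # stem bases changed, but changed in compensating pairs

pairs = paired_positions(structure)
print("sequence identity:", sequence_identity(rna_a, rna_b))
print("pairs maintained in rna_a:", pairs_maintained(rna_a, pairs))
print("pairs maintained in rna_b:", pairs_maintained(rna_b, pairs))
```

Here only half of the positions are identical between the two sequences, but every stem pair can still form in both, which is the pattern expected when compensatory substitutions preserve structure.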

See also: Peptide sequence, Base sequence, Phylogenetic footprinting, Conserved non-coding sequence, Sequence space (evolution)