Genome-Wide Motif Statistics are Shaped by DNA Binding Proteins over Evolutionary Time Scales

2017 
Ectopic DNA binding by transcription factors and other DNA-binding proteins can be detrimental to cellular functions and ultimately to organismal fitness. The frequency of protein-DNA binding at non-functional sites depends on the global composition of a genome with respect to all possible short motifs, or k-mer words. To determine whether weak yet ubiquitous protein-DNA interactions could exert significant evolutionary pressures on genomes, we correlate in vitro measurements of binding strength for all 8-mer words, taken from a large collection of transcription factors in several different species, with the relative genomic frequencies of those words. Our analysis reveals a clear signal of purifying selection to reduce the large number of weak binding sites genome-wide. This evolutionary process, which we call global selection, has a detectable hallmark: similar words experience similar evolutionary pressure, a consequence of the biophysics of protein-DNA binding. By analyzing a large collection of genomes, we show that global selection exists in all domains of life and operates through tiny selective steps, maintaining genomic binding landscapes over long evolutionary timescales.
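As a rough illustration of the kind of comparison the abstract describes, the following Python sketch counts overlapping k-mer words in a sequence and computes a rank correlation between per-word binding scores and relative word frequencies. It is not the authors' pipeline: the function names, the `binding_scores` mapping, the choice of Spearman correlation, and the single-strand counting are all assumptions made for illustration. A real analysis would likely also need to handle reverse complements and background nucleotide composition.

```python
from collections import Counter
from math import sqrt


def count_kmers(sequence, k=8):
    """Count every overlapping k-mer that contains only A, C, G, T."""
    sequence = sequence.upper()
    alphabet = set("ACGT")
    counts = Counter()
    for i in range(len(sequence) - k + 1):
        word = sequence[i:i + k]
        if set(word) <= alphabet:  # skip words with N or other symbols
            counts[word] += 1
    return counts


def relative_frequencies(counts):
    """Convert raw word counts into fractions of all counted words."""
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            result[idx] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def binding_frequency_correlation(sequence, binding_scores, k=8):
    """Correlate hypothetical per-word binding scores with word frequencies.

    binding_scores: dict mapping k-mer -> measured binding strength
    (an assumed input format, not one taken from the paper).
    """
    freqs = relative_frequencies(count_kmers(sequence, k))
    words = sorted(binding_scores)
    scores = [binding_scores[w] for w in words]
    observed = [freqs.get(w, 0.0) for w in words]
    return spearman(scores, observed)
```

Under these assumptions, a negative correlation for a given factor would mean that words it binds more strongly tend to be rarer in the sequence, which is the direction a depletion of weak binding sites would produce.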