DNA shape complements sequence-based representations of transcription factor binding sites

2019 
The position weight matrix (PWM) has long been a useful tool for describing variation in the composition of regions of DNA such as transcription factor (TF) binding sites. It is difficult, however, to relate the sequence-based representation of a DNA motif to the biological features of the interaction of a TF with its binding site. Here we present an alternative strategy for representing DNA motifs, called Structural Motif (StruM), that can easily represent different sets of structural features. Structural features are inferred from dinucleotide properties listed in the Dinucleotide Property Database. StruMs are able to specifically model TF binding sites, using an encoding strategy that is distinct from sequence-based models. This difference in encoding strategies makes StruMs complementary to sequence-based methods of TF binding site identification.
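The idea of encoding a site by dinucleotide-derived structural features rather than by base identity can be illustrated with a minimal sketch. It is not the StruM implementation: the property table `TOY_PROPERTY` holds made-up placeholder values (not taken from the Dinucleotide Property Database), and the per-position Gaussian scoring, the `min_sd` floor, and all function names are assumptions chosen for illustration.

```python
import math
import statistics

# Hypothetical placeholder values for ONE dinucleotide property.
# These are NOT real Dinucleotide Property Database values.
BASES = "ACGT"
TOY_PROPERTY = {
    a + b: float(4 * i + j)
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
}

def shape_profile(seq):
    """Map a sequence of length n to n-1 dinucleotide property values."""
    return [TOY_PROPERTY[seq[i:i + 2]] for i in range(len(seq) - 1)]

def fit_motif(sites, min_sd=0.5):
    """Estimate a mean and standard deviation per position from aligned sites.

    min_sd is an assumed floor that avoids zero variance on small samples.
    """
    columns = list(zip(*(shape_profile(s) for s in sites)))
    means = [statistics.fmean(col) for col in columns]
    sds = [max(statistics.pstdev(col), min_sd) for col in columns]
    return means, sds

def log_score(seq, means, sds):
    """Sum of per-position Gaussian log-densities (positions treated as independent)."""
    total = 0.0
    for x, mu, sd in zip(shape_profile(seq), means, sds):
        total += -0.5 * math.log(2 * math.pi * sd * sd) - (x - mu) ** 2 / (2 * sd * sd)
    return total

# Toy aligned "binding sites" (hypothetical data).
sites = ["ACGT", "ACGA", "ACGT"]
means, sds = fit_motif(sites)
score_site = log_score("ACGT", means, sds)
score_other = log_score("TTTT", means, sds)
```

In this sketch a sequence resembling the training sites receives a higher log score than an unrelated one. Because the model sees property values rather than letters, two different sequences with similar dinucleotide properties would score similarly, which is the sense in which such an encoding differs from a PWM.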