Adding sequence context to a Markov background model improves the identification of regulatory elements

2006 
Motivation: Many computational methods for identifying regulatory elements use a likelihood ratio between motif and background models. Often, the methods use a background model of independent bases. At least two different Markov background models have been proposed with the aim of increasing the accuracy of predicting regulatory elements. Both Markov background models suffer theoretical drawbacks, so this article develops a third, context-dependent Markov background model from fundamental statistical principles. Results: Datasets containing known regulatory elements in eukaryotes provided a basis for comparing the predictive accuracies of the different background models. Non-parametric statistical tests indicated that Markov models of order 3 constituted a statistically significant improvement over the background model of independent bases. Our model performed slightly better than the previous Markov background models. We also found that for discriminating between the predictive accuracies of competing background models, the correlation coefficient is a more sensitive measure than the performance coefficient. Availability: Our C++ program is available at ftp://ftp.ncbi.nih.gov/pub/spouge/papers/archive/AGLAM/2006-07-19 Contact: spouge@ncbi.nlm.nih.gov Supplementary information: Supplementary data are available at Bioinformatics online.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    18
    References
    8
    Citations
    NaN
    KQI
    []