An alternative description of power law correlations in DNA sequences

R. Silva, J. R. P. Silva, D. H. A. L. Anselmo, J. S. Alcaniz, W. J. C. da Silva, M. O. Costa

2019

Abstract We analyze the coding sequences of Homo sapiens using a model that naturally accommodates power law correlations (PLC) among the bases in DNA sequences of living organisms. The model is based on a principle of universal optimization, which is the core of all statistical arguments, and is associated with a power law distribution function for the length of DNA, measured in base pairs (bp). Through a nonadditive framework, this distribution introduces a parameter that measures the PLC in the DNA sequence. The results show that the short-range correlations (SRC) always present in coding DNA sequences are appropriately captured by the power law distribution, which adequately describes the cumulative length distribution of DNA bases, in contrast with the traditional exponential statistical model. Using an empirical cumulative distribution function and the database of proteins compiled by the Ensembl Project, we show that the power law distribution provides the best description of the data. A Bayesian analysis of the data further confirms this result.
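
The comparison described in the abstract, an empirical cumulative distribution function set against an exponential model and a power law model, can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the power law model is written as a q-exponential survival function (a common choice in nonadditive statistics), the names `qexp_cdf`, `qexp_quantile`, `sse` and `demo` are hypothetical, the lengths are synthetic rather than Ensembl data, and a sum of squared errors stands in for the Bayesian analysis.

```python
import math


def ecdf(samples):
    """Sorted sample values and their empirical cumulative probabilities."""
    xs = sorted(samples)
    n = len(xs)
    return xs, [(i + 1) / n for i in range(n)]


def exp_cdf(x, scale):
    """Exponential model: P(L <= x) = 1 - exp(-x / scale)."""
    return 1.0 - math.exp(-x / scale)


def qexp_cdf(x, q, scale):
    """Assumed power law model with a q-exponential survival function, q > 1.

    P(L > x) = [1 + (q - 1) * x / scale] ** (-1 / (q - 1)),
    which tends to exp(-x / scale) as q -> 1.
    """
    return 1.0 - (1.0 + (q - 1.0) * x / scale) ** (-1.0 / (q - 1.0))


def qexp_quantile(u, q, scale):
    """Inverse of qexp_cdf; used only to build synthetic lengths."""
    return scale * ((1.0 - u) ** (-(q - 1.0)) - 1.0) / (q - 1.0)


def sse(samples, cdf):
    """Sum of squared differences between a model CDF and the ECDF."""
    xs, ps = ecdf(samples)
    return sum((cdf(x) - p) ** 2 for x, p in zip(xs, ps))


def demo(n=1000, q=1.5, scale=1.0):
    """Compare both models on synthetic lengths (not real DNA data)."""
    # Deterministic synthetic lengths placed at evenly spaced quantiles
    # of the assumed q-exponential model.
    lengths = [qexp_quantile((i + 0.5) / n, q, scale) for i in range(n)]
    # Exponential model with its scale set to the sample mean.
    mean_len = sum(lengths) / n
    sse_exp = sse(lengths, lambda x: exp_cdf(x, mean_len))
    # Assumed power law model evaluated at the generating parameters.
    sse_q = sse(lengths, lambda x: qexp_cdf(x, q, scale))
    return sse_exp, sse_q


if __name__ == "__main__":
    sse_exp, sse_q = demo()
    print("exponential misfit:", sse_exp)
    print("q-exponential misfit:", sse_q)
```

Because the synthetic lengths are generated from the q-exponential form itself, the smaller misfit of that model only demonstrates the mechanics of the comparison; it says nothing about real coding sequences.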
