Statistical analysis of DNA sequences in the neighborhood of splice sites

2008 
Prediction of gene sequences and their exon-intron structure in large eukaryotic genomic sequences is one of the central problems of mathematical biology. Solving this problem requires, in particular, high-accuracy splice site recognition. Through statistical analysis of a database of human gene fragments containing splice sites, characteristic features of the nucleotide sequences in the splice site neighborhood were described, the frequencies of all nucleotides and dinucleotides were determined, and those whose frequencies are increased or decreased relative to a random sequence were identified. The results can be used in sequence annotation, splice site prediction, and recognition of the exon-intron structure of genes.
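
The frequency comparison described above can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes that "random sequence" means equiprobable, independent nucleotides (expected frequency 4^-k for a k-mer), it pools all positions of each fragment rather than treating positions relative to the splice site separately, and the function names and toy fragments are invented for illustration.

```python
from collections import Counter
from itertools import product
from math import log2

ALPHABET = "ACGT"

def kmer_frequencies(fragments, k):
    """Relative frequency of every k-mer over ACGT, pooled across fragments."""
    counts = Counter()
    for seq in fragments:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip windows containing N or any other non-ACGT symbol.
            if set(kmer) <= set(ALPHABET):
                counts[kmer] += 1
    total = sum(counts.values())
    all_kmers = ("".join(p) for p in product(ALPHABET, repeat=k))
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in all_kmers}

def enrichment(freqs, k):
    """log2 of observed frequency over the uniform expectation 4**-k.

    Assumption: the random baseline is a uniform, independent model.
    Positive values mean over-represented, negative under-represented.
    """
    expected = 0.25 ** k
    return {kmer: (log2(f / expected) if f > 0 else float("-inf"))
            for kmer, f in freqs.items()}

# Toy fragments, invented for illustration only (not real data).
fragments = ["CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA"]

mono = kmer_frequencies(fragments, 1)
di = kmer_frequencies(fragments, 2)
di_enrichment = enrichment(di, 2)

# List dinucleotides from most over-represented to most under-represented.
for kmer, score in sorted(di_enrichment.items(), key=lambda kv: kv[1], reverse=True):
    print(kmer, round(di[kmer], 3), score)
```

A baseline built from the observed mononucleotide frequencies (expected frequency of XY equal to the product of the frequencies of X and Y) would be a reasonable alternative to the uniform model, and it would separate dinucleotide effects from overall base composition.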