ExtendAlign: a computational algorithm for delivering multiple, local end-to-end alignments for short sequences

2019 
The performance of local and global alignment algorithms limits the alignment of sequences shorter than 30 nucleotides. Regardless of the computational approach applied, a series of diverse limitations accumulates as the length of the aligned sequences and the similarity between them decrease, often resulting in alignment biases: local alignments have difficulty correctly reporting the number of matches and mismatches (m/mm) flanking the seed, whereas global alignments lengthen the total alignment size and introduce gaps artificially. These biases compromise the accuracy of computational analyses of short sequences. Here we report ExtendAlign, a computational tool that corrects the aforementioned biases generated by local and global alignments. ExtendAlign provides an end-to-end report of the accurate number of m/mm for all the nucleotides that flank a local alignment of short sequences, thus eliminating the artificial lengthening of the query size, the introduction of gaps, and the failure to report flanking m/mm. Because ExtendAlign combines the refinement and strength of global and local multiple sequence alignments, it delivers exceptional accuracy in correcting the alignment of dissimilar sequences in the 35-50% similarity range, also known as the twilight zone, indicating that it can be adopted routinely whenever high accuracy is required for short-sequence alignments.
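The idea of reporting m/mm end to end around a local seed can be illustrated with a minimal Python sketch. This is not the ExtendAlign implementation: the function names (`count_pairs`, `extend_seed`), the 0-based coordinates, the assumption that the seed is ungapped, and the separate "overhang" count for query nucleotides that fall outside the subject are all hypothetical choices made for illustration only.

```python
def count_pairs(query, subject, q_positions, offset):
    """Count matches, mismatches and overhanging query bases for the
    given query positions, pairing query[i] with subject[i + offset]."""
    matches = mismatches = overhang = 0
    for qi in q_positions:
        si = qi + offset
        if not 0 <= si < len(subject):
            # Assumption: query bases with no subject partner are
            # reported separately instead of being scored.
            overhang += 1
        elif query[qi] == subject[si]:
            matches += 1
        else:
            mismatches += 1
    return matches, mismatches, overhang


def extend_seed(query, subject, q_start, s_start, seed_len):
    """Extend an ungapped local seed to both ends of the query without
    introducing gaps, and report m/mm for the seed and each flank.

    q_start and s_start are 0-based start positions of the seed in the
    query and the subject; seed_len is the seed length (hypothetical
    interface, not taken from the original tool).
    """
    offset = s_start - q_start
    q_end = q_start + seed_len
    seed = count_pairs(query, subject, range(q_start, q_end), offset)
    left = count_pairs(query, subject, range(0, q_start), offset)
    right = count_pairs(query, subject, range(q_end, len(query)), offset)
    total = tuple(s + l + r for s, l, r in zip(seed, left, right))
    return {"seed": seed, "left": left, "right": right, "total": total}


if __name__ == "__main__":
    # Toy sequences; the seed "CTTGA" starts at query position 3
    # and subject position 5.
    report = extend_seed("TAGCTTGACA", "CCTTGCTTGACGTT", 3, 5, 5)
    for region, counts in report.items():
        print(region, counts)
```

Because the flanks are paired position by position along the same diagonal as the seed, the alignment length never exceeds the query length and no gaps are introduced, which is the behavior the abstract describes for the flanking nucleotides.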