ExtendAlign: a computational algorithm for delivering multiple, local end-to-end alignments for short sequences
2019
The performance of local and global alignment algorithms limits the alignment of sequences shorter than 30 nucleotides. Regardless of the computational approach applied, a series of diverse limitations accumulates as the length and similarity of the aligned sequences decrease, often resulting in alignment biases: local alignments have difficulty correctly reporting the number of matches and mismatches (m/mm) flanking the seed, while global alignments lengthen the total alignment size and introduce gaps artificially. These biases compromise the accuracy of computational analyses of short sequences. Here we report ExtendAlign, a computational tool that corrects the aforementioned biases generated by local and global alignments. ExtendAlign provides an end-to-end report of the accurate number of m/mm for all the nucleotides that flank a local alignment of short sequences, thus eliminating the artificial lengthening of the query size, the introduction of gaps, and the failure to report flanking m/mm. Since ExtendAlign combines the refinement and strength of global and local multiple sequence alignments, it delivers exceptional accuracy in correcting the alignment of dissimilar sequences in the 35-50% similarity range, also known as the twilight zone, indicating that it can be adopted routinely whenever high accuracy is required for short-sequence alignments.
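The end-to-end m/mm report described above can be illustrated with a minimal Python sketch. This is not the ExtendAlign implementation: the function name `extend_flanks`, the 0-based coordinate convention, the requirement that the local hit is ungapped, and the decision to report query nucleotides that overhang the subject as a separate count are all assumptions made for illustration.

```python
def extend_flanks(query, subject, q_start, s_start):
    """Count matches/mismatches over the whole query, without gaps.

    q_start and s_start are the 0-based positions where an ungapped
    local hit begins in the query and in the subject (hypothetical
    convention). The hit fixes the diagonal; every query nucleotide,
    including those flanking the hit, is then compared on that diagonal.
    """
    offset = s_start - q_start  # diagonal defined by the local hit
    matches = mismatches = overhang = 0
    for i, q_nt in enumerate(query):
        j = i + offset
        if j < 0 or j >= len(subject):
            # Query nucleotide has no subject partner (assumption:
            # reported separately rather than padded with gaps).
            overhang += 1
        elif q_nt == subject[j]:
            matches += 1
        else:
            mismatches += 1
    return matches, mismatches, overhang


# Local hit: query[4:8] == subject[6:10] == "TGCA"
result = extend_flanks("ACGTTGCA", "GGACGATGCATT", q_start=4, s_start=6)
print(result)  # (7, 1, 0)
```

Because the comparison stays on the diagonal set by the local hit, the query is never lengthened and no gaps are introduced; the flanking nucleotides that a local aligner would leave unreported are simply scored in place.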