language-icon Old Web
English
Sign In

Aligning fragmented sequences

2002 
The past few years have seen a remarkable series of accomplishments related to DNA sequencing, with genome sequences being generated for several key organisms including a yeast, a nematode, a plant and the human. Upon completion of the human and mouse genome sequences, world-wide sequencing capacity will turn to other complex organisms. Because of the prohibitive cost of fully sequencing a genome, current strategies call for many of these genomes to be incompletely sequenced. That is, gaps will remain in the known sequence, and the relative order and orientation of the known sequence fragments may not be determined. Sequence comparison between two genomes of this sort may allow some of the fragments to be oriented and ordered relative to each other by computational means. We formalize this as an optimization problem, show that the problem is MAX-SNP hard, and develop a polynomial time algorithm that is guaranteed to produce a solution whose score is within a factor 3 of optimal. We also consider the simpler problem where only one of the genomes is fully sequenced. We describe a heuristic that generates near optimal solutions on real data sets and show that solutions with high score have correspondingly high biological significance.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    44
    References
    1
    Citations
    NaN
    KQI
    []