copMEM: finding maximal exact matches via sampling both genomes

2019 
MOTIVATION: Genome-to-genome comparisons require designating anchor points, which are given by Maximum Exact Matches (MEMs) between their sequences. For large genomes this is a challenging problem and the performance of existing solutions, even in parallel regimes, is not quite satisfactory. RESULTS: We present a new algorithm, copMEM, that allows to sparsely sample both input genomes, with sampling steps being coprime. Despite being a single-threaded implementation, copMEM computes all MEMs of minimum length 100 between the human and mouse genomes in less than 2 minutes, using 7 GB of RAM memory. AVAILABILITY AND IMPLEMENTATION: https://github.com/wbieniec/copmem. SUPPLEMENTARY DATA: Supplementary data are available at Bioinformatics online.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    6
    References
    7
    Citations
    NaN
    KQI
    []