On the Performance of BWA on NUMA Architectures

2015 
Rapid progress in genome sequencing techniques is creating a need for advanced algorithms that can process the resulting data in reasonable time. Alignment applications such as BWA (Burrows-Wheeler Aligner) are essential for genomic variant calling studies. Although BWA takes advantage of multithreaded execution, it exhibits significant scalability limitations on systems with a non-uniform memory architecture (NUMA). Data sharing between independent threads and irregular memory access patterns are performance-limiting factors that affect BWA's scalability. We have analyzed the performance problems of BWA on two NUMA systems, one based on Intel Xeon and the other on AMD Opteron. We present some simple techniques that can be applied at the system level and do not require any modification of the application. Applying these techniques to the execution of BWA yielded significant speedup improvements on both systems.