OpenMP and StarPU Abreast: the Impact of Runtime in Task-Based Block QR Factorization Performance

2019 
A Directed Acyclic Graph (DAG) is a high-level abstraction for describing the activities of parallel applications. In the task-based programming paradigm, a DAG contains tasks (nodes) and dependencies (edges). Application performance depends on the choices of the runtime system. Our work evaluates and compares the performance of three runtime systems, GCC/libgomp, LLVM/libomp, and StarPU, for a task-based dense block QR factorization. The results show that while GCC/libgomp achieves up to 5.4% better performance in the best case, it has scalability problems for fine-grained problems with large DAGs. LLVM/libomp and StarPU are more scalable, and StarPU is much faster in task creation and submission than the other runtimes.
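To make the link between granularity and DAG size concrete, the following is a minimal sketch that enumerates the tasks and dependencies of a textbook tiled QR factorization on an nt x nt grid of tiles. It is an illustration under stated assumptions, not the implementation evaluated in this work: the kernel names (GEQRT, ORMQR, TSQRT, TSMQR) follow the common tile QR formulation, the function name `tiled_qr_dag` is hypothetical, the auxiliary T factors are omitted, and only read-after-write and write-after-write dependencies are tracked.

```python
from collections import Counter


def tiled_qr_dag(nt):
    """Build a simplified task DAG for tile QR on an nt x nt tile grid.

    Returns (tasks, edges): tasks is a list of kernel names in submission
    order, edges is a set of (producer_index, consumer_index) pairs.
    Assumption: dependencies are derived from the last writer of each tile
    only, so write-after-read hazards are not modelled.
    """
    tasks = []
    edges = set()
    last_writer = {}  # tile (i, j) -> index of the task that last wrote it

    def submit(name, reads, writes):
        idx = len(tasks)
        tasks.append(name)
        # A task depends on whoever last wrote any tile it touches.
        for tile in reads + writes:
            if tile in last_writer:
                edges.add((last_writer[tile], idx))
        for tile in writes:
            last_writer[tile] = idx

    for k in range(nt):
        # Factor the diagonal tile.
        submit("GEQRT", [], [(k, k)])
        # Apply its reflectors to the tiles on the right.
        for j in range(k + 1, nt):
            submit("ORMQR", [(k, k)], [(k, j)])
        for i in range(k + 1, nt):
            # Eliminate tile (i, k) against the diagonal tile.
            submit("TSQRT", [], [(k, k), (i, k)])
            # Update the trailing tiles in rows k and i.
            for j in range(k + 1, nt):
                submit("TSMQR", [(i, k)], [(k, j), (i, j)])
    return tasks, edges


if __name__ == "__main__":
    for nt in (3, 4, 8):
        tasks, edges = tiled_qr_dag(nt)
        print(nt, len(tasks), len(edges), dict(Counter(tasks)))
```

In this sketch a 3 x 3 tile grid yields 14 tasks and a 4 x 4 grid yields 30, and the count grows cubically with the number of tiles per dimension. Halving the tile size for a fixed matrix therefore multiplies the number of tasks the runtime must create and schedule by roughly eight, which is why task creation and submission overheads matter for fine-grained problems.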