Leveraging Data-Flow Task Parallelism for Locality-Aware Dynamic Scheduling on Heterogeneous Platforms

2018 
Writing programs for heterogeneous platforms is challenging: programmers must juggle multiple programming models, partition work between CPUs and accelerators with different compute capabilities, and manage memory in several distinct address spaces. We show that with a task-parallel data-flow programming model, in which parallelism is specified in a platform-neutral description that abstracts away the heterogeneity of the hardware, a run-time system can execute the program efficiently using an appropriate task-scheduling and memory-allocation scheme. This is achieved through dynamic task scheduling that reduces dependence exchanges between devices, interleaving of task execution with transfers between host and device memory, and load balancing across CPUs and GPUs. Our results show that this technique increases the number of tasks offloaded to the GPU and improves the data locality of GPU tasks, leading to a significant reduction in GPU idle time and thus to substantial performance improvements.
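To illustrate the kind of platform-neutral, data-flow task parallelism the abstract refers to, the following is a minimal sketch that uses OpenMP task dependences as a stand-in for such a model; it is not the paper's actual programming model or runtime, and the task names (stage_a, stage_b, combine) and the array size N are illustrative assumptions. The point is that the program declares only the data each task reads and writes, leaving the runtime free to decide ordering, placement, and data movement.

```c
/* Sketch: data-flow task parallelism via OpenMP task dependences.
 * Tasks declare their inputs and outputs; the runtime derives the
 * execution order and mapping from this data-flow information. */
#include <stdio.h>
#include <omp.h>

#define N 1024   /* illustrative tile size, not from the paper */

static void stage_a(double *x) { for (int i = 0; i < N; i++) x[i] = i * 0.5; }
static void stage_b(double *y) { for (int i = 0; i < N; i++) y[i] = i * 2.0; }
static void combine(const double *x, const double *y, double *z)
{
    for (int i = 0; i < N; i++) z[i] = x[i] + y[i];
}

int main(void)
{
    static double x[N], y[N], z[N];

    #pragma omp parallel
    #pragma omp single
    {
        /* Producers: each task declares the data it writes. */
        #pragma omp task depend(out: x) shared(x)
        stage_a(x);

        #pragma omp task depend(out: y) shared(y)
        stage_b(y);

        /* Consumer: runs only after both producers complete.
         * The program text says nothing about which worker or
         * device executes each task; that decision is left to
         * the runtime's scheduler. */
        #pragma omp task depend(in: x, y) depend(out: z) shared(x, y, z)
        combine(x, y, z);
    }

    printf("z[0] = %f, z[N-1] = %f\n", z[0], z[N - 1]);
    return 0;
}
```

Because the dependence information is explicit, a locality-aware runtime of the kind described above can, in principle, co-locate consumers with the device that produced their inputs, overlap host-device transfers with unrelated tasks, and balance load across CPUs and GPUs without any change to the program text.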