Improving GPGPU energy-efficiency through concurrent kernel execution and DVFS

2015 
Current generation GPUs can accelerate high-performance, compute-intensive applications by exploiting massive thread-level parallelism. The high performance, however, comes at the cost of increased power consumption. Recently, commercial GPGPU architectures have introduced support for concurrent kernel execution to better utilize the computational/memory resources and thereby improve overall throughput. In this paper, we argue and experimentally validate the benefits of concurrent kernels towards energy-efficient execution. We design power-performance models to carefully select the appropriate kernel combinations to be executed concurrently, the relative contributions of the kernels to the thread mix, along with the frequency choices for the cores and the memory to achieve high performance per watt metric. Our experimental evaluation shows that the concurrent kernel execution in combination with DVFS can improve energy-efficiency by up to 34.5% compared to the most energy-efficient sequential execution.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    13
    References
    37
    Citations
    NaN
    KQI
    []