Enhancing Data Reuse in Cache Contention Aware Thread Scheduling on GPGPU

2016 
GPGPUs have been widely adopted as throughput processing platforms for modern big-data and cloud computing. Attaining high performance on a GPGPU requires careful tradeoffs among various design concerns. Data reuse, cache contention, and thread-level parallelism have been demonstrated to be three imperative performance factors for a GPGPU, and their correlated performance impacts pose non-trivial concerns when scheduling threads on GPGPUs. This paper proposes a three-stage scheduling scheme that co-schedules threads with consideration of all three factors. Experimental results on a set of irregular parallel applications demonstrate up to a 70% execution-time improvement over previous approaches.
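The abstract does not detail the three stages, but the general idea of co-scheduling for reuse, contention, and parallelism can be sketched. The following is a minimal illustration, not the paper's algorithm: all names (ThreadBlock, CACHE_SIZE, MAX_CONCURRENCY) and the grouping heuristics are assumptions made for the example.

```python
# Hypothetical three-stage contention-aware co-scheduler sketch.
# CACHE_SIZE, MAX_CONCURRENCY, and the heuristics below are illustrative
# assumptions; the paper's actual scheme is not described in the abstract.
from dataclasses import dataclass
from collections import defaultdict

CACHE_SIZE = 48 * 1024      # assumed per-core cache capacity in bytes
MAX_CONCURRENCY = 8         # assumed max co-resident thread blocks

@dataclass
class ThreadBlock:
    bid: int
    footprint: int          # estimated working-set size in bytes
    shared_tag: int         # blocks with equal tags reuse the same data

def schedule(blocks):
    # Stage 1 (data reuse): group blocks that touch the same data, so
    # co-scheduling a group turns shared accesses into cache hits.
    groups = defaultdict(list)
    for b in blocks:
        groups[b.shared_tag].append(b)

    # Stage 2 (cache contention): admit groups into a scheduling round
    # only while their combined footprint still fits in the cache.
    rounds, current, used = [], [], 0
    for group in sorted(groups.values(), key=len, reverse=True):
        size = max(b.footprint for b in group)  # shared data counted once
        if current and used + size > CACHE_SIZE:
            rounds.append(current)
            current, used = [], 0
        current.extend(group)
        used += size
    if current:
        rounds.append(current)

    # Stage 3 (thread-level parallelism): cap each round at the hardware
    # concurrency limit, spilling any overflow into later rounds.
    final = []
    for r in rounds:
        for i in range(0, len(r), MAX_CONCURRENCY):
            final.append(r[i:i + MAX_CONCURRENCY])
    return final

if __name__ == "__main__":
    tbs = [ThreadBlock(i, 8 * 1024, i % 3) for i in range(12)]
    for n, rnd in enumerate(schedule(tbs)):
        print(f"round {n}: {[b.bid for b in rnd]}")
```

The staging order reflects the tension named in the abstract: grouping first maximizes reuse, the cache-capacity check then bounds contention, and the final concurrency cap preserves thread-level parallelism without oversubscribing.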