Architectural Implications in Graph Processing of Accelerator with Gardenia Benchmark Suite

2019 
Existing generic benchmarks for accelerators (e.g., Parboil and Rodinia) focus on high-performance computing (HPC) applications, which have limited control-flow and data irregularity. Previously available graph analytics benchmark suites contain straightforwardly implemented workloads that do not employ up-to-date optimization techniques and thus behave quite differently from real-world applications. This paper first briefly presents and characterizes the Graph Analytics Repository for Designing Next-generation Accelerators (GARDENIA), a benchmark suite for studying irregular algorithms on various massively parallel accelerators. It includes emerging irregular big-data and machine learning applications, which mimic massively multithreaded programs deployed not only in datacenters but also on handheld devices. We then characterize an Nvidia GPU with GARDENIA, covering a wide spectrum of metrics such as parallelization, cache locality, off-chip traffic, and irregularity. Based on this characterization, we unveil the performance bottlenecks of the current mainstream accelerator and give architectural insights for building high-performance, energy-efficient domain-specific accelerators for graph applications.