ReDESK: A Reconfigurable Dataflow Engine for Sparse Kernels on Heterogeneous Platforms

2019 
Sparse Matrix-Vector Multiplication (SpMV) is the most important sparse linear algebra kernel in both scientific and engineering applications. Due to its irregular control flow and data access pattern, von Neumann architectures such as CPUs and GPUs cannot fully exploit the inherent parallelism of SpMV. Although FPGAs can efficiently accelerate SpMV in a dataflow manner, their performance degrades on large matrices that exceed the capacity of on-chip memory, because data must be rescheduled excessively. In this paper we propose ReDESK, a Reconfigurable Dataflow Engine for Sparse Kernels, for emerging tightly-coupled CPU-FPGA heterogeneous platforms. To fully exploit the heterogeneity, we design a novel sparse matrix representation that is tailored for data prefetching on the CPU side and streaming processing on the FPGA side. In this way ReDESK can fully utilize the memory bandwidth regardless of the scale of the SpMV problem. We evaluate ReDESK on the Intel HARP-2 platform with a set of matrices from the University of Florida sparse matrix collection. The results demonstrate an average bandwidth utilization of 0.094 GFLOP/GB, which is 1.6-4.3x more efficient than previous SpMV implementations on FPGAs.
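
To illustrate the irregular data access pattern mentioned above, the following is a minimal reference sketch of SpMV. It assumes the standard compressed sparse row (CSR) layout, which is not the representation proposed for ReDESK; the function name `spmv_csr` and the small example matrix are illustrative assumptions only.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a sparse matrix A stored in CSR form.

    Assumption: plain CSR is used here for illustration; it is not
    the ReDESK representation described in the abstract.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # The nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i+1]-1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # x is read at a data-dependent index col_idx[k]; this
            # indirect lookup is the irregular access pattern.
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Hypothetical 3x3 example matrix:
#   [[4, 0, 1],
#    [0, 2, 0],
#    [3, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
x = [1.0, 2.0, 3.0]

y = spmv_csr(row_ptr, col_idx, values, x)
print(y)  # [7.0, 4.0, 18.0]
```

The arrays `values` and `col_idx` are read sequentially, so they stream well, whereas the reads of `x` jump around according to `col_idx`. Each nonzero contributes one multiplication and one addition.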