Towards Scalable and Efficient FPGA Stencil Accelerators Work-In-Progress

2016 
In this paper we propose a design template for stencil computations targeting FPGA-based accelerators. The goal for our design is to provide scalable high throughput designs that can efficiently process iterative stencil programs with large size parameters, i.e., those whose data footprint is too large to fit on-chip. Our context is when we seek to use FPGAs as accelerators attached to CPUs. Minimizing the area is not our primary goal. We propose a family of architectures based on hierarchical tiling, where the inner tiling is used to build coarse-grain data-path operators, increasing computational throughput, and the outer tiling is used to control the memory requirement, specifically data transfers to/from the accelerator. We present preliminary results for Jacobi-style stencils on 1D and 2D data, and are working on fully automating the flow.
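To make the two tiling levels concrete, here is a minimal software sketch of a 3-point Jacobi stencil on 1D data. It is an illustration under stated assumptions, not the architecture from the paper: the abstract does not say how tiles are scheduled, so this sketch assumes overlapped (halo-based) outer tiles, fixed boundary values, and an averaging stencil. All names (`jacobi_1d_tiled`, `inner_tile_operator`, `outer`, `inner`) are hypothetical. The outer loop models loading a block plus halo into on-chip memory; the inner loop models a coarse-grain operator that updates several points per invocation.

```python
def jacobi_1d_reference(a, steps):
    """Untiled 3-point Jacobi; the two end points are held fixed."""
    cur = list(a)
    for _ in range(steps):
        nxt = cur[:]
        for i in range(1, len(cur) - 1):
            nxt[i] = (cur[i - 1] + cur[i] + cur[i + 1]) / 3.0
        cur = nxt
    return cur


def inner_tile_operator(src, dst, start, stop):
    """Models one coarse-grain data-path operator: updates points
    start..stop-1 of the local buffer in a single invocation."""
    for i in range(start, stop):
        dst[i] = (src[i - 1] + src[i] + src[i + 1]) / 3.0


def jacobi_1d_tiled(a, steps, outer=8, inner=4):
    """Two-level tiling sketch (assumed scheme: overlapped outer tiles).

    Each outer tile loads `outer` points plus a halo of `steps` points
    per side, runs all `steps` iterations locally, and writes back only
    its own `outer` points. The halo is recomputed redundantly by
    neighbouring tiles, trading extra work for fewer transfers.
    """
    n = len(a)
    out = list(a)
    for s in range(0, n, outer):
        e = min(s + outer, n)
        lo = max(0, s - steps)       # halo on the left, clipped to the array
        hi = min(n, e + steps)       # halo on the right, clipped to the array
        buf = list(a[lo:hi])         # models the on-chip buffer
        for _ in range(steps):
            nxt = buf[:]
            # Inner tiling: sweep the buffer in chunks of `inner` points.
            for j in range(1, len(buf) - 1, inner):
                inner_tile_operator(buf, nxt, j, min(j + inner, len(buf) - 1))
            buf = nxt
        out[s:e] = buf[s - lo:e - lo]  # write back only the tile's own points
    return out


if __name__ == "__main__":
    data = [float((i * i) % 7) for i in range(37)]
    tiled = jacobi_1d_tiled(data, steps=3, outer=8, inner=4)
    ref = jacobi_1d_reference(data, steps=3)
    print(max(abs(x - y) for x, y in zip(tiled, ref)))
```

In this sketch the `outer` parameter bounds the buffer size (at most `outer + 2 * steps` values), while `inner` sets how many points one operator invocation produces. On real hardware the inner chunk would be unrolled into parallel logic rather than looped over.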