Multi-FPGA Accelerator Architecture for Stencil Computation Exploiting Spacial and Temporal Scalability

2019 
After the introduction of the OpenCL-based FPGA accelerator design method, FPGAs are getting very popular among high-performance computing. The key to achieving high performance using FPGAs is to design pipelined accelerators. We can increase the pipeline depth beyond the border of one FPGA by connecting multiple FPGAs using high-speed QSFP (quad small form-factor pluggable) connectors. Such a deeply-pipelined accelerator using multiple FPGAs works similar to a single very large FPGA. In this paper, we propose a multi-FPGA accelerator architecture for stencil computation by scaling in spacial and temporal dimensions. According to the experimental results, we achieved performance up to 950 GFLOP/s using one FPGA and nearly doubled the performance using two FPGAs. We achieved a high power-efficiency with competitive performances compared to high-end GPUs.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    21
    References
    16
    Citations
    NaN
    KQI
    []