A Time-efficient and High-performance FPGA-based Continuous Floating-point Matrix Computing Accelerating Architecture for Control System

2020 
Matrix computing is one of the most important linear algebra operations and is broadly used in both scientific and engineering applications. Currently, there is still considerable room for optimizing the acceleration of continuous matrix computing. In this study, we first present two memory access optimization schemes that significantly reduce the I/O time and the total delay. Then, we extend the supported data precision of continuous matrix computing from double-precision to single-precision and half-precision floating-point data, which enhances data diversity and improves computing performance. The experiments show that the I/O time is reduced by 40% after coarse-grained parallel optimization. Moreover, the I/O time is almost completely hidden by the calculation time after fine-grained data flow optimization. The accelerator achieves a maximum frequency of 180 MHz with 128 PEs and delivers 184.3 GFLOPS for half-precision floating-point data. Compared with state-of-the-art FPGA-based structures, our design offers better time efficiency and a broader application scope.
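The reported figures can be sanity-checked with a small sketch. The first function derives how many floating-point operations each PE would have to complete per cycle to reach 184.3 GFLOPS at 180 MHz with 128 PEs; the result (about 8) is an inference from the abstract's numbers, not something the abstract states. The second part is a toy timing model of how I/O can be hidden behind computation. It assumes a double-buffered scheme in which the load of the next block overlaps the computation of the current one; the abstract does not describe the actual mechanism, and all function names and block timings here are hypothetical.

```python
def implied_flops_per_pe_per_cycle(gflops, num_pes, freq_mhz):
    """FLOPs each PE must complete per cycle to reach the given throughput."""
    return gflops * 1e9 / (num_pes * freq_mhz * 1e6)


def total_time_serial(io_times, compute_times):
    """Every block is loaded first, then computed, with no overlap."""
    return sum(io_times) + sum(compute_times)


def total_time_overlapped(io_times, compute_times):
    """Assumed double-buffered model: while block i is computed,
    block i+1 is loaded. Only the first load is fully exposed."""
    total = io_times[0]
    for i, compute in enumerate(compute_times):
        next_io = io_times[i + 1] if i + 1 < len(io_times) else 0
        # The slower of the two concurrent activities sets the step time.
        total += max(compute, next_io)
    return total


# Figures taken from the abstract: 184.3 GFLOPS, 128 PEs, 180 MHz.
per_pe = implied_flops_per_pe_per_cycle(184.3, 128, 180)

# Hypothetical per-block timings (arbitrary units), for illustration only.
io = [2, 2, 2]
compute = [5, 5, 5]
serial = total_time_serial(io, compute)
overlapped = total_time_overlapped(io, compute)
```

In this toy model, whenever each block's computation takes at least as long as the next block's load, only the first load contributes to the total time, which matches the sense in which the I/O time is described as almost completely hidden.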