High-Precision BLAS on FPGA-enhanced Computers

2007 
The emergence of high-density reconfigurable hardware devices gives scientists and engineers an option to accelerate their numerical computing applications on low-cost but powerful “FPGA-enhanced computers”. In this paper, we introduce our efforts towards improving the computational performance of the Basic Linear Algebra Subprograms (BLAS) with FPGA-specific algorithms and methods. Our study focuses on three BLAS subroutines: floating-point summation, matrix-vector multiplication, and matrix-matrix multiplication. They represent all three levels of BLAS functionality, and their sustained computational performance is either memory-bandwidth-bound or computation-bound. By proposing a group-alignment-based floating-point summation method and applying this technique to the other subroutines, we significantly improve their sustained computational performance and reduce numerical errors while consuming moderate FPGA resources. Compared with existing FPGA-based implementations, our designs are efficient and compact, with improved numerical accuracy and stability.
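To illustrate the idea behind group-alignment summation, the following is a minimal software sketch (not the paper's hardware design; the accumulator width and helper names are assumptions): every addend in a group is aligned to the group's largest exponent and accumulated in one wide fixed-point register, so rounding happens only once, at the end.

```python
import math

def group_align_sum(values, acc_bits=80):
    """Software sketch of group-alignment summation (hypothetical
    parameters): align each addend's mantissa to the group's largest
    exponent, accumulate in a wide fixed-point register (here a Python
    int), and round back to double precision only once at the end."""
    nonzero = [v for v in values if v != 0.0]
    if not nonzero:
        return 0.0
    # The largest binary exponent in the group sets the shared alignment.
    e_max = max(math.frexp(v)[1] for v in nonzero)
    acc = 0  # wide fixed-point accumulator
    for v in nonzero:
        m, e = math.frexp(v)              # v == m * 2**e, 0.5 <= |m| < 1
        shift = acc_bits - (e_max - e)
        if shift > 0:
            # Addends far below e_max lose low-order bits here,
            # which bounds the alignment error.
            acc += int(m * (1 << shift))
    # One final rounding from the wide accumulator to a double.
    return math.ldexp(acc, e_max - acc_bits)
```

With a naive left-to-right sum, `1e16 + 1.0 - 1e16` collapses to `0.0` because the intermediate result is rounded after every addition; the group-aligned version defers rounding and recovers `1.0`. In hardware, the same idea lets a single wide adder pipeline sustain one accumulation per cycle without the latency of a floating-point normalization inside the loop.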