A Reconfigurable Multiple-Precision Floating-Point Dot Product Unit for High-Performance Computing

2021 
There is an emerging need to optimize floating-point (FP) dot product units (DPUs) for high-performance scientific computing as well as for training deep learning models. Because applications have different precision requirements, a reconfigurable multiple-precision DPU can largely reduce area and power costs. However, existing methods not only introduce redundant bits in the unit multipliers but also leave hardware resources idle when operating in different precisions. In this paper, a reconfigurable multiple-precision FP DPU design is proposed for high-performance computing (HPC) applications. The FP DPU can be reconfigured as follows. A bit-partitioning method is provided to minimize the redundant bits with a configurable mixed-precision multiplier for three operating modes: 20 half-precision dot product (DP), 5 single-precision DP, or 1 double-precision DP operations. Any of the modes can be executed in two successive clock cycles without idle hardware resources. The proposed design is implemented in the UMC 55-nm process and evaluated through simulation. Compared with existing multiple-precision FP methods, the proposed DPU achieves area savings of 88.9% and 35.8% for FP16 and FP32 operations, respectively. Moreover, on benchmarked HPC applications where multiple precisions can be used, the proposed reconfigurable DPU achieves up to 4× and 20× higher maximum throughput than fixed FP32 and FP64 operation, respectively.
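The 4× and 20× figures are consistent with the ratios of operations per mode (20 FP16 versus 5 FP32 versus 1 FP64). The following is a minimal sketch of that arithmetic, not the paper's evaluation method. It assumes every mode takes the same time per pass (the abstract states two clock cycles for any mode); the function names and the mixed-workload model are hypothetical and introduced only for illustration.

```python
from fractions import Fraction

# Dot-product operations completed per pass in each mode (from the abstract).
OPS_PER_PASS = {"FP16": 20, "FP32": 5, "FP64": 1}


def relative_throughput(mode: str, baseline: str) -> Fraction:
    """Throughput of `mode` relative to `baseline`.

    Assumption: every mode has the same latency per pass, so the ratio of
    operations per pass equals the ratio of throughputs.
    """
    return Fraction(OPS_PER_PASS[mode], OPS_PER_PASS[baseline])


def mixed_speedup(share_by_mode: dict, baseline: str) -> Fraction:
    """Speedup over a fixed-`baseline` unit for a mixed-precision workload.

    `share_by_mode` maps a mode to the fraction of operations that may run in
    that mode (shares must sum to 1). This workload model is hypothetical.
    """
    if sum(share_by_mode.values()) != 1:
        raise ValueError("operation shares must sum to 1")
    # Passes needed per operation when each share runs in its own mode.
    passes = sum(share / OPS_PER_PASS[mode] for mode, share in share_by_mode.items())
    # Passes needed per operation when everything runs in the baseline mode.
    baseline_passes = Fraction(1, OPS_PER_PASS[baseline])
    return baseline_passes / passes


if __name__ == "__main__":
    print(relative_throughput("FP16", "FP32"))  # upper bound versus fixed FP32
    print(relative_throughput("FP16", "FP64"))  # upper bound versus fixed FP64
    half_and_half = {"FP16": Fraction(1, 2), "FP64": Fraction(1, 2)}
    print(mixed_speedup(half_and_half, "FP64"))
```

Under this model the stated maxima are reached only when all operations can run in FP16; any share that must stay in a higher precision lowers the achievable speedup.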