Optimising Euroben kernels on Maxwell

2013 
The ability to run certain common numerical kernels quickly is valuable for many applications and fields of scientific research. The University of Edinburgh investigated the possibility of using FPGA devices to accelerate four kernels from the Euroben benchmark suite: dense matrix multiplication, sparse matrix-by-vector multiplication, fast Fourier transform, and random number generation. Each kernel was ported using both the Harwest C compiler and hand-coded VHDL, and for each port the performance gain and porting effort were evaluated. Although all of the kernels ran faster on the FPGA than on the CPU used for comparison, the level of hardware expertise required to port them was high, even when using the Harwest compiler. Furthermore, many of the FPGA ports gave only a modest performance improvement over the much more maintainable C implementation, especially when the time taken to copy input and output data across the relatively slow PCI bus to the FPGAs was taken into account. However, for certain use cases, for example when a large quantity of random numbers is required or when low power consumption is critical, FPGAs could be a good choice for running this type of kernel.