A 28nm Coarse Grain 2D-Reconfigurable Array with Data Forwarding

2020 
To answer the ever-increasing demand for computational power, coarse-grain reconfigurable architectures (CGRAs) are experiencing a revival. Because they provide efficient, dedicated datapaths that match the dataflow graphs of a wide range of applications, CGRAs successfully handle high-performance tasks. Yet modern CGRAs access data through classical memory hierarchies or global scratchpad memories and so fail to take advantage of data locality. This makes memory bandwidth a key bottleneck for these reconfigurable architectures. This letter introduces a heterogeneous 2-D CGRA that distributes data buffers across its computational grid. Together with a single-cycle data forwarding path, this lets the CGRA exploit data locality better and minimize data transfers. A 28-nm CMOS instantiation of this concept maps and executes a variety of compute-intensive kernels, such as a real-time 512-point FFT or a $5\times 5$ convolutional filter, at high power efficiency. The CGRA demonstrates a peak energy efficiency of 584.9 GOPS/W during a 103-tap FIR filter, a $2.9\times$ improvement over state-of-the-art 2-D CGRAs.
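To make the data-locality argument concrete, the following is a minimal Python sketch of a FIR filter computed two ways. It is a toy software model, not the architecture described in the letter: the function names, the read-counting scheme, and the assumption that a globally addressed design re-reads one sample per tap are all illustrative. The first function counts reads when every tap fetches its operand from a shared memory; the second keeps a local delay line, so each sample is read once and then forwarded from one tap to its neighbour.

```python
def fir_reads_global(num_samples, num_taps):
    """Memory reads if every tap fetches its input sample from a shared memory.

    Assumption for illustration: no caching, one read per tap per output.
    """
    return num_samples * num_taps


def fir_with_forwarding(samples, coeffs):
    """FIR filter using a local delay line (hypothetical model of forwarding).

    Each input sample is read from memory once; afterwards it is handed from
    one tap's buffer to the next instead of being fetched again.
    Returns (outputs, number_of_memory_reads).
    """
    delay = [0.0] * len(coeffs)  # one local buffer per tap
    outputs = []
    reads = 0
    for x in samples:
        reads += 1                    # the only memory access for this sample
        delay = [x] + delay[:-1]      # forward buffered values to the neighbours
        outputs.append(sum(c * d for c, d in zip(coeffs, delay)))
    return outputs, reads


if __name__ == "__main__":
    taps = 103                        # tap count quoted in the abstract
    n = 1000                          # arbitrary number of input samples
    _, local_reads = fir_with_forwarding([1.0] * n, [1.0] * taps)
    print(fir_reads_global(n, taps), local_reads)
```

Under these assumptions the number of memory reads drops by a factor equal to the tap count. The sketch only illustrates why buffering data next to the compute elements reduces traffic; it says nothing about the measured efficiency figures reported above.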