Throughput-Oriented Spatio-Temporal Optimization in Approximate High-Level Synthesis

2020 
Current and emerging systems for high-throughput applications, such as machine learning, cloud computing, and real-time video encoding demand real-time processing computations, heavily constrained by latency and power requirements. To deal with the increasing computational complexity, designers may resort to approximate accelerators for error-resilient compute-intensive kernels to meet such requirements with acceptable deviation from the exact implementation. However, since time-to-market is crucial when dealing with evolving applications, technologies, and standards, hand-crafting approximate accelerators may impose prohibitive development time and cost overheads. In this scenario, approximate High-Level Synthesis (HLS) methodologies have been proposed to deal with the complexity of exploring approximation techniques. Nevertheless, current tools are not suitable for exploring throughput optimizations, being instead constrained to perform specific improvements on area, power, and performance. In this work, we propose the use of HLS to generate Pareto-optimal accelerators for throughput-constrained applications. Particularly, we present a throughput-oriented approximate HLS methodology that explores both delay and area optimizations to increase the reuse over time and parallelism of such accelerators. Results show that our method is able to improve throughput by up to 80 % with no additional area costs or to sustain the same throughput of the exact design with about 45 % less area while introducing manageable error for most applications. Moreover, our method can attain throughput improvements of up to 18% when compared with recent works focusing only on performance or area optimizations, with no additional costs.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    26
    References
    0
    Citations
    NaN
    KQI
    []