Accuracy and Performance Trade-Offs of Logarithmic Number Units in Multi-Core Clusters
2016
Compared to the traditional floating-point (FP) number representation, logarithmic number systems (LNS) offer superior performance when evaluating complex functions, since multiplications and divisions reduce to simple additions and subtractions in the logarithmic domain. However, additions and subtractions become costly nonlinear operations. Efficient LNS units (LNUs) that implement ADD/SUB operations in hardware rely on interpolation techniques to save area. Even the most advanced LNUs are still larger than standard single-precision FPUs, which renders them impractical for most general-purpose processors. In this paper, we show that in a multi-core setting, when shared among several processor cores, LNUs become a very attractive solution. We present a methodology to generate LNUs with various error bounds and perform a design space exploration with different parameterizations. We show that even small precision relaxations, on the order of a few units in the last place (ulp), reduce the LNU area significantly. Using examples from several signal processing domains, we demonstrate that shared approximate LNUs can outperform their standard FP counterparts on average by 2.14x in speed and 1.92x in energy efficiency, with insignificant degradation of the output quality.
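To illustrate why multiplication and division are cheap in the logarithmic domain while addition and subtraction are not, the following is a minimal software sketch, not the hardware design described in the paper. All function names are hypothetical, values are restricted to positive reals (sign and zero handling are omitted), and the nonlinear terms are computed exactly with `math.log2` here, whereas a hardware LNU would approximate them on fixed-point operands, for instance by interpolation.

```python
import math

# Illustrative sketch only: positive reals, double-precision logs.
# All names below are hypothetical and not taken from the paper.

def to_lns(x):
    """Encode a positive real as its base-2 logarithm."""
    return math.log2(x)

def from_lns(e):
    """Decode an LNS value back to a real number."""
    return 2.0 ** e

def lns_mul(a, b):
    """Multiplication is a plain addition of the logarithms."""
    return a + b

def lns_div(a, b):
    """Division is a plain subtraction of the logarithms."""
    return a - b

def lns_add(a, b):
    """Addition needs the nonlinear term log2(1 + 2^d) with d <= 0."""
    hi, lo = max(a, b), min(a, b)
    return hi + math.log2(1.0 + 2.0 ** (lo - hi))

def lns_sub(a, b):
    """Subtraction (requires a > b) needs log2(1 - 2^d) with d < 0."""
    return a + math.log2(1.0 - 2.0 ** (b - a))
```

For example, `from_lns(lns_mul(to_lns(3.0), to_lns(4.0)))` recovers the product of 3 and 4 up to rounding, using only one addition in the log domain, while `lns_add` has to evaluate a transcendental function of the operand difference.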