Intercepting Functions for Memoization: A Case Study Using Transcendental Functions

2015 
Memoization is the technique of saving the results of executions so that future executions can be omitted when the input set repeats. Memoization has been proposed in previous literature at the instruction, basic block, and function levels using hardware, as well as pure software--level approaches including changes to programming language. In this article, we focus on software memoization for procedural languages such as C and Fortran at the granularity of a function. We propose a simple linker-based technique for enabling software memoization of any dynamically linked pure function by function interception and illustrate our framework using a set of computationally expensive pure functions—the transcendental functions. Transcendental functions are those that cannot be expressed in terms of a finite sequence of algebraic operations (trigonometric functions, exponential functions, etc.) and hence are computationally expensive. Our technique does not need the availability of source code and thus can even be applied to commercial applications, as well as applications with legacy codes. As far as users are concerned, enabling memoization is as simple as setting an environment variable. Our framework does not make any specific assumptions about the underlying architecture or compiler toolchains and can work with a variety of current architectures. We present experimental results for a x86-64 platform using both gcc and icc compiler toolchains, and an ARM Cortex-A9 platform using gcc. Our experiments include a mix of real-world programs and standard benchmark suites: SPEC and Splash2x. On standard benchmark applications that extensively call the transcendental functions, we report memoization benefits of up to 50p on Intel Ivy Bridge and up to 10p on ARM Cortex-A9. Memoization was able to regain a performance loss of 76p in bwaves due to a known performance bug in the GNU implementation of the pow function. The same benchmark on ARM Cortex-A9 benefited by more than 200p.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    21
    References
    28
    Citations
    NaN
    KQI
    []