Accurate floating-point summation: a new approach

2007 
The aim of this paper is to find an accurate and efficient algorithm for summing large sets of floating-point numbers. We present a new representation of the floating-point number system in which a number is represented as a linear combination of integers whose coefficients are powers of the base of the floating-point system. This representation allows us to build an accurate floating-point summation algorithm, based on the fact that no rounding error occurs when two integers are summed or when a floating-point number is multiplied by a power of the base of the floating-point system. The proposed algorithm appears to be competitive in terms of computational effort and, under some assumptions, the computed sum is highly accurate. Under these assumptions, which are less conservative in practical applications, we prove that the relative error of the computed sum is bounded by the unit roundoff.
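The two exactness facts the abstract relies on can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the function name `exact_sum`, the per-exponent `buckets` dictionary, and the use of arbitrary-precision integers are assumptions made for illustration, and the sketch assumes finite IEEE 754 binary64 inputs (base 2, 53-bit significand).

```python
import math
from fractions import Fraction

def exact_sum(values):
    """Illustrative sketch: sum finite doubles with a single final rounding."""
    buckets = {}  # hypothetical structure: exponent -> exact integer sum
    for x in values:
        m, e = math.frexp(x)        # x == m * 2**e, with 0.5 <= |m| < 1 or m == 0
        n = int(math.ldexp(m, 53))  # scaling by a power of the base is exact
        # x == n * 2**(e - 53); integer addition introduces no rounding error
        buckets[e - 53] = buckets.get(e - 53, 0) + n
    if not buckets:
        return 0.0
    e_min = min(buckets)
    # Bring every bucket to the smallest exponent; shifts on integers are exact
    total = sum(n << (e - e_min) for e, n in buckets.items())
    # The only rounding happens here, when the exact rational becomes a float
    return float(Fraction(total) * Fraction(2) ** e_min)
```

For inputs such as `[1e16, 1.0, -1e16]`, where the small term is lost by left-to-right accumulation, the sketch agrees with the standard library's `math.fsum`. It trades speed for simplicity by leaning on Python's big integers, so it says nothing about the computational cost claimed for the proposed algorithm.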