Performance Evaluation and Modeling of Reduction Operations on the IBM RS/6000 SP Parallel Computer

1996 
We discuss algorithms for global reduction (or combine) operations (e.g., global sums) for numbers of processors that need not be a power of 2, and implement these using standard message-passing techniques on distributed-memory parallel computers. We present performance results measured on an IBM RS/6000 SP parallel computer at UNI•C. Significant performance improvements are obtained by using a recursive doubling method with a vector splice/gather approach.
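To illustrate the kind of algorithm the abstract refers to, the following is a minimal single-process sketch of a recursive doubling global sum for a processor count that need not be a power of 2. It is a simulation in plain Python, not message-passing code, and it is not taken from the paper: the function name `allreduce_sum` and the fold-in/fold-out handling of the surplus ranks are assumptions chosen for illustration, and the vector variant mentioned in the abstract is not shown.

```python
def allreduce_sum(values):
    """Simulate a global sum across len(values) ranks.

    values[r] is the local contribution of rank r. The return value is a
    list in which every rank holds the global sum.
    (Hypothetical helper for illustration only.)
    """
    p = len(values)
    if p == 0:
        return []

    # Largest power of 2 not exceeding p.
    p2 = 1
    while p2 * 2 <= p:
        p2 *= 2
    extra = p - p2  # ranks beyond the power-of-2 core

    buf = list(values)

    # Fold-in: each surplus rank p2 + r "sends" its value to rank r.
    for r in range(extra):
        buf[r] = buf[r] + buf[p2 + r]

    # Recursive doubling among ranks 0 .. p2-1: at distance d, each rank
    # exchanges its partial sum with partner r XOR d and combines them.
    d = 1
    while d < p2:
        nxt = list(buf)
        for r in range(p2):
            nxt[r] = buf[r] + buf[r ^ d]
        buf = nxt
        d *= 2

    # Fold-out: surplus ranks "receive" the final result from rank r.
    for r in range(extra):
        buf[p2 + r] = buf[r]

    return buf
```

In this sketch, a call such as `allreduce_sum([1, 2, 3, 4, 5])` leaves the value 15 on all five ranks after one fold-in step, two exchange steps among the four core ranks, and one fold-out step.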