Performance Evaluation and Modeling of Reduction Operations on the IBM RS/6000 SP Parallel Computer
1996
We discuss algorithms for global reduction (or combine) operations, such as global sums, for processor counts that need not be a power of 2, and implement them using standard message-passing techniques on distributed-memory parallel computers. We present performance results measured on an IBM RS/6000 SP parallel computer at UNI•C. Significant performance improvements are obtained by using a recursive doubling method with a vector splice/gather approach.
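
To make the idea of recursive doubling on a processor count that is not a power of 2 concrete, the following is a minimal sketch that simulates the exchange pattern in a single Python process. It is not taken from the paper: the function name `allreduce_recursive_doubling`, the fold-in/fold-out handling of the surplus ranks, and the assumption that the combine operator is associative and commutative are all illustrative choices, and no real message passing is performed.

```python
from operator import add


def allreduce_recursive_doubling(values, op=add):
    """Simulate a global reduction in which every rank ends with the result.

    values[r] is the local contribution of rank r. `op` is assumed to be
    associative and commutative (hypothetical requirement of this sketch).
    """
    p = len(values)
    if p == 0:
        return []
    buf = list(values)

    # Largest power of 2 not exceeding p; the remaining ranks are "surplus".
    p2 = 1 << (p.bit_length() - 1)
    extra = p - p2

    # Fold-in: each surplus rank r + p2 sends its value to rank r.
    for r in range(extra):
        buf[r] = op(buf[r], buf[r + p2])

    # Recursive doubling among ranks 0 .. p2-1: at distance d, rank r
    # exchanges its partial result with rank r XOR d and both combine.
    d = 1
    while d < p2:
        old = buf[:p2]  # snapshot models the simultaneous exchange
        for r in range(p2):
            partner = r ^ d
            lo, hi = min(r, partner), max(r, partner)
            buf[r] = op(old[lo], old[hi])
        d <<= 1

    # Fold-out: rank r returns the final result to surplus rank r + p2.
    for r in range(extra):
        buf[r + p2] = buf[r]
    return buf


if __name__ == "__main__":
    # Six ranks (not a power of 2), global sum of 1..6.
    print(allreduce_recursive_doubling([1, 2, 3, 4, 5, 6]))
```

In this sketch a power-of-2 count needs log2(P) exchange steps, and any other count adds one step before and one after. On a real machine each loop iteration over ranks would correspond to one round of pairwise send/receive calls.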