Improving the performance of global communication on a three‐dimensional torus network

1996 
This paper proposes a high-speed one-to-all broadcasting algorithm for massively parallel computers whose performance degrades little as the number of processors increases. The network topology considered is the 3D torus. Two methods are discussed for a system that implements a broadcast by repeating one-to-one communications. The first selects paths with a smaller maximum transfer number, which reduces the number of transfers. The second presets the hardware to reduce the overhead of each individual one-to-one communication. The methods are evaluated using a double-loop model consisting of an inner loop for local processing and an outer loop for global communication. With both methods applied, scalability improves, and a 4.2-fold speedup in program execution can be achieved on a 32K-processor system.
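
The abstract does not define "maximum transfer number" precisely. As a rough illustration only, the sketch below assumes it can be read as the largest number of link hops between the broadcast source and any other node, and compares that figure with and without use of the torus wraparound links. The function names (`torus_hops`, `mesh_hops`, `max_hops`) and the 32 × 32 × 32 layout of a 32K-processor system are assumptions made for this example; this is not the algorithm proposed in the paper.

```python
from itertools import product

def torus_hops(src, dst, dims):
    """Minimal hop count between two nodes when wraparound links are used."""
    return sum(min((d - s) % k, (s - d) % k) for s, d, k in zip(src, dst, dims))

def mesh_hops(src, dst):
    """Hop count when wraparound links are ignored (plain 3D mesh routing)."""
    return sum(abs(d - s) for s, d in zip(src, dst))

def max_hops(src, dims, use_wraparound=True):
    """Largest hop count from src to any node: a stand-in for the
    'maximum transfer number' under the assumption stated above."""
    worst = 0
    for dst in product(*(range(k) for k in dims)):
        if use_wraparound:
            hops = torus_hops(src, dst, dims)
        else:
            hops = mesh_hops(src, dst)
        worst = max(worst, hops)
    return worst

if __name__ == "__main__":
    dims = (32, 32, 32)   # assumed layout: 32 * 32 * 32 = 32768 processors
    src = (0, 0, 0)       # hypothetical broadcast source at a corner
    print("with wraparound:   ", max_hops(src, dims, True))
    print("without wraparound:", max_hops(src, dims, False))
```

Under these assumptions, paths that exploit the wraparound links roughly halve the worst-case hop count from a corner source, which is one way shorter paths could reduce the number of transfers in a broadcast built from one-to-one communications.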