Four-ary tree-based barrier synchronization for 2D meshes without nonmember involvement

2001 
This paper proposes a Barrier Tree for Meshes (BTM) to minimize the barrier synchronization latency for two-dimensional (2D) meshes. The proposed BTM scheme has two distinguishing features. First, the synchronization tree is 4-ary. The synchronization latency of the BTM scheme is asymptotically $\Theta(\log_4 n)$, while that of the fastest scheme reported in the literature is bounded between $\Omega(\log_3 n)$ and $\Theta(n^{1/2})$, where $n$ is the number of member nodes. Second, nonmember nodes neither take part in the construction of a BTM nor actively participate in the synchronization operations, which avoids interference among different process groups during synchronization. This not only results in low setup overhead, but also reduces the synchronization latency. The low setup overhead is particularly effective for the dynamic process model provided in MPI-2. An extensive simulation study shows that, for meshes of up to $64 \times 64$ nodes, the BTM scheme results in about 40 to 70 percent shorter synchronization latency and is more scalable than conventional schemes.
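To give a feel for the asymptotic terms quoted above, the following minimal sketch evaluates $\lceil \log_4 n \rceil$, $\lceil \log_3 n \rceil$, and $\sqrt{n}$ for square meshes. It assumes that every node of the mesh is a member (so $n$ equals the side length squared) and ignores all constant factors; the function names `ceil_log` and `growth_terms` are hypothetical and are not taken from the paper. It illustrates growth rates only and does not model the BTM construction or measured latency.

```python
import math


def ceil_log(n: int, base: int) -> int:
    """Smallest integer d with base**d >= n, using integer arithmetic only."""
    d, reach = 0, 1
    while reach < n:
        reach *= base
        d += 1
    return d


def growth_terms(side: int) -> dict:
    """Growth terms for a side x side mesh, assuming all nodes are members."""
    n = side * side
    return {
        "n": n,
        "log4": ceil_log(n, 4),   # term behind the Theta(log_4 n) bound
        "log3": ceil_log(n, 3),   # term behind the Omega(log_3 n) lower bound
        "sqrt": math.isqrt(n),    # term behind the Theta(n^(1/2)) upper bound
    }


if __name__ == "__main__":
    for side in (8, 16, 32, 64):
        print(growth_terms(side))
```

The gap between the logarithmic terms and the square-root term widens quickly as the mesh grows, which is consistent with the scalability claim in the abstract.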