RDMA Managed Buffers: A Case for Accelerating Communication Bound Processes via Fine-Grained Events for Zero-Copy Message Passing

2019 
To take full advantage of modern high-performance architectures, many large-scale data-driven applications require loosely coupled, fine-grained asynchronous communication. Accordingly, efficient lightweight middleware based on state-of-the-art networking technology such as RDMA is becoming a necessity. In existing message-passing runtimes, the performance-critical task of handling RDMA synchronization is mostly coarse-grained and may therefore carry various hidden costs. While low-level RDMA libraries expose fine-grained RDMA communication that can be efficiently controlled, the critical tasks of RDMA buffer management, synchronization, and flow control are left to user-space applications, requiring tedious programming effort that may lead to sub-optimal performance. In this paper we present a user-space RDMA transport layer that manages RDMA-enabled memory internally while still exposing zero-copy, completion-event-based RDMA transfers for message passing. Integrating this RDMA transport layer allows parallel applications to use RDMA-managed buffers to accelerate communication while coexisting with high-level middleware such as MPI or GASNet. We show a speedup of up to 8x in latency/bandwidth benchmarks and a 5%-90% improvement in response time or messaging rate in three reference applications relative to their MPI implementations.