Scalable communication architecture for network-attached accelerators

2015 
On the road to exascale computing, novel communication architectures are required to overcome the limitations of host-centric accelerators. Typically, accelerator devices require a local host CPU to configure and operate them, which limits the number of accelerators per host system. Network-attached accelerators are a new architectural approach that allows the number of accelerators and host CPUs to scale independently. This paper describes a communication architecture for network-attached accelerators that enables remote initialization and control of the accelerator devices, and presents a working prototype implementation. The prototype accelerator node consists of an Intel Xeon Phi coprocessor and an EXTOLL NIC. The EXTOLL interconnect provides new features that enable direct accelerator-to-accelerator communication without a local host. Workloads can be dynamically assigned to CPUs and accelerators at run-time in an N-to-M ratio. The latency, bandwidth, and performance of the low-level implementation and of the MPI communication layer are presented. The LAMMPS molecular dynamics simulator is used to evaluate the communication architecture; the internode communication time is improved by up to 47%.