Multi-axis decomposition of density functional program for strong scaling up to 82,944 Nodes on the K Computer: Compactly folded 3D-FFT communicators in the 6D torus network

2019 
Abstract Density functional calculations with a plane-wave basis set are widely used in materials science. Owing to recent developments in high-performance computers, the number of nodes in such machines now greatly exceeds the number of atoms in a typical simulation. As a result, it is becoming difficult to perform calculations efficiently even when only a portion of the nodes (e.g., 10%) is used. We have developed a multi-axis decomposition scheme in which both the G-vector and band axes are decomposed and the 3D-FFT communicators are folded compactly. The proposed scheme keeps the innermost do-loop lengths sufficiently long and restrains the growth in MPI communication cost as the number of nodes increases. In an investigation of a wide-gap semiconductor material (SiC), our PHASE/0 DFT code exhibits efficient strong scaling (up to 82,944 nodes) even for a relatively small system with 3,848 atoms, and demonstrates a peak performance of 2.25 PFLOPS for a 25,200-atom system despite employing 3D-FFT.
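As a rough illustration of what decomposing two axes at once means, the following Python sketch maps each rank onto a 2D grid of band blocks and G-vector blocks and reports the local loop lengths. It is a minimal sketch under stated assumptions, not the PHASE/0 implementation: the names `block_range`, `layout`, and `local_sizes`, the near-even block split, and the row-major rank numbering are all hypothetical, and the sketch does not model the 6D torus mapping or the communicator folding described in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RankLayout:
    rank: int
    band_group: int  # index of the band block held by this rank
    g_group: int     # index of the G-vector block held by this rank


def block_range(n_items, n_parts, part):
    """Half-open [start, stop) range of a near-even block split."""
    base, extra = divmod(n_items, n_parts)
    start = part * base + min(part, extra)
    stop = start + base + (1 if part < extra else 0)
    return start, stop


def layout(rank, n_band_groups, n_g_groups):
    """Place a rank on a (band_group, g_group) grid, row-major (assumption)."""
    if not 0 <= rank < n_band_groups * n_g_groups:
        raise ValueError("rank outside the process grid")
    # Ranks sharing a band_group are numbered consecutively, so the ranks
    # that would cooperate on one 3D-FFT form a contiguous rank range.
    band_group, g_group = divmod(rank, n_g_groups)
    return RankLayout(rank, band_group, g_group)


def local_sizes(rank, n_bands, n_gvec, n_band_groups, n_g_groups):
    """Return (local band count, local G-vector count) for a rank."""
    lay = layout(rank, n_band_groups, n_g_groups)
    b0, b1 = block_range(n_bands, n_band_groups, lay.band_group)
    g0, g1 = block_range(n_gvec, n_g_groups, lay.g_group)
    return b1 - b0, g1 - g0


if __name__ == "__main__":
    # Toy sizes (hypothetical): 8 bands, 100 G-vectors, 2 x 3 = 6 ranks.
    for r in range(6):
        print(layout(r, 2, 3), local_sizes(r, 8, 100, 2, 3))
```

With the toy sizes above, splitting 100 G-vectors over all 6 ranks along a single axis would leave 16 or 17 per rank, whereas the 2 x 3 grid leaves 33 or 34 per rank, each paired with 4 bands. This is the sense in which splitting along two axes can keep the loop over G-vectors longer for a fixed rank count.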