Performance Evaluation and Analysis of A64FX many-core Processor for the Fiber Miniapp Suite

2021 
In recent years, there has been growing interest in Arm-based processors for high-performance computing systems, such as the supercomputer Fugaku, which uses the Arm-based A64FX processor. We evaluated the performance of the A64FX processor using the Fiber Miniapp Suite, varying the number of MPI processes and OpenMP threads as well as the methods used to assign MPI processes and OpenMP threads. In addition to the performance evaluation, we present a performance comparison with other processors and some performance analysis. Our experiments suggest that shorter OpenMP thread strides perform better in most mini applications, whereas MPI process allocation methods have little impact on performance. For some “as-is” applications with small data sets, the A64FX shows poor performance, but this can be improved by enhancing SIMD vectorization and changing instruction scheduling during compilation. For the other applications and data sets, the performance of the A64FX is better than or comparable to that of other processors.