Sequences of Sparse Matrix-Vector Multiplication on Fugaku’s A64FX processors

2021 
We implement parallel and distributed versions of the sparse matrix-vector product (SpMV) and of sequences of matrix-vector products, using OpenMP, MPI, and the ARM SVE intrinsic functions, for several matrix storage formats. We investigate the efficiency of these implementations on one and two A64FX processors, using a variety of sparse matrices as input. The matrices differ in size, sparsity, and regularity. We observe that the parallel and distributed implementation scales well on two nodes when the matrix is close to diagonal, but performance degrades quickly as the sparsity or regularity of the input varies.
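For readers unfamiliar with the kernel, the following is a minimal shared-memory sketch of SpMV and of a sequence of SpMV operations. It is not the implementation evaluated in the paper. It assumes the CSR (compressed sparse row) storage format, which is only one possible format, and it assumes that a "sequence" means applying the same square matrix repeatedly to a vector. The type `csr_matrix` and the functions `csr_spmv` and `csr_spmv_sequence` are hypothetical names chosen for illustration. MPI distribution and SVE intrinsics are left out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical CSR container; field names are illustrative only. */
typedef struct {
    size_t n_rows;          /* number of rows */
    const size_t *row_ptr;  /* length n_rows + 1: start of each row in val */
    const size_t *col_idx;  /* length nnz: column index of each nonzero */
    const double *val;      /* length nnz: nonzero values */
} csr_matrix;

/* Computes y = A * x. Rows are independent, so the outer loop is
 * parallelized with OpenMP when the compiler enables it. */
void csr_spmv(const csr_matrix *A, const double *x, double *y)
{
#ifdef _OPENMP
#pragma omp parallel for schedule(static)
#endif
    for (long i = 0; i < (long)A->n_rows; ++i) {
        double sum = 0.0;
        for (size_t k = A->row_ptr[i]; k < A->row_ptr[i + 1]; ++k)
            sum += A->val[k] * x[A->col_idx[k]];
        y[i] = sum;
    }
}

/* Assumed meaning of a "sequence": overwrite x with A^steps * x.
 * A must be square; tmp is caller-provided scratch of length n_rows. */
void csr_spmv_sequence(const csr_matrix *A, double *x, double *tmp, int steps)
{
    for (int s = 0; s < steps; ++s) {
        csr_spmv(A, x, tmp);
        memcpy(x, tmp, A->n_rows * sizeof *x);
    }
}
```

The inner loop reads `x` through `col_idx`, so its memory access pattern depends on where the nonzeros sit. For a nearly diagonal matrix these reads stay close to row `i`, which is consistent with the better scaling reported above for that case.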