GPU Acceleration of a High-Order CFD Program

2020 
HNSC (High order Navier-Stokes simulator for Compressible flow) is an aerodynamics numerical simulation application with a high computational cost. This paper presents our efforts to port and optimize HNSC on a heterogeneous architecture consisting of multi-core CPUs and NVIDIA GPUs. The basic GPU parallelization is implemented in CUDA Fortran and exploits the characteristics of the structured mesh and the fourth-order Runge-Kutta time-marching scheme. A data packing method is proposed to optimize CPU-GPU data transfer and to improve MPI communication efficiency. Performance is evaluated on a server with two 12-core Intel Xeon Skylake Gold 5118 CPUs and one NVIDIA Tesla V100 GPU. The results show that the application benefits significantly from GPU acceleration: the GPU version, running on the V100, achieves a maximum speedup of 9.3 over the MPI implementation running on the two Skylake CPUs.
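The abstract does not describe the data packing method in detail. As a rough illustration of the general idea only, the sketch below gathers several small boundary arrays into one contiguous buffer with an offset table, so that a single transfer or message could replace several small ones. The function names, field names, and values are hypothetical, and plain Python stands in for the CUDA Fortran and MPI code of the actual application.

```python
from array import array


def pack_fields(fields):
    """Copy several named 1-D arrays into one contiguous buffer.

    Returns the buffer and an offset table {name: (start, length)}.
    """
    buffer = array("d")
    offsets = {}
    for name, values in fields.items():
        offsets[name] = (len(buffer), len(values))
        buffer.extend(values)
    return buffer, offsets


def unpack_fields(buffer, offsets):
    """Recover the individual arrays from the packed buffer."""
    return {
        name: buffer[start:start + length]
        for name, (start, length) in offsets.items()
    }


# Hypothetical halo (boundary) data for three flow variables.
halo = {
    "density": array("d", [1.0, 1.1, 1.2]),
    "momentum_x": array("d", [0.5, 0.6, 0.7]),
    "energy": array("d", [2.5, 2.6, 2.7]),
}

# One packed buffer would be moved in a single transfer or message
# instead of three separate ones (assumption: this is the kind of
# saving the packing method targets).
packed, table = pack_fields(halo)
restored = unpack_fields(packed, table)
```

In a real port the packed buffer would typically be filled on the device and copied once per exchange. Whether HNSC packs by variable, by boundary face, or by neighbor is not stated in the abstract.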