Conjugate gradient solvers on Intel Xeon Phi and NVIDIA GPUs

Olaf Kaczmarek, Christian Joachim Schmidt, P. Steinbrecher, Mathias Wagner

2014

Lattice Quantum Chromodynamics simulations typically spend most of their runtime inverting the fermion matrix, so this step is frequently optimized for various HPC architectures. Here we compare the performance of the Intel Xeon Phi with that of current Kepler-based NVIDIA Tesla GPUs when running a conjugate gradient solver. By exposing more parallelism to the accelerator through inverting multiple vectors at the same time, we obtain a performance greater than 300 GFlop/s on both architectures, which more than doubles the performance of the inversions. We also give a short overview of the Knights Corner architecture and discuss some details of the implementation and the effort required to obtain the achieved performance.
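
The idea of inverting multiple vectors at the same time can be illustrated with a minimal sketch. It is not the implementation described in the paper: it assumes a small dense symmetric positive-definite matrix in place of the fermion matrix, runs on the CPU in plain Python, and the names `matvec`, `dot` and `cg_multi_rhs` are hypothetical. It only shows how several conjugate gradient solves can advance in lockstep so that every iteration applies the matrix to a whole block of vectors.

```python
def matvec(A, x):
    """Dense matrix-vector product (stand-in for applying the fermion matrix)."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    """Real inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def cg_multi_rhs(A, B, tol=1e-10, max_iter=1000):
    """Solve A x_k = b_k for every right-hand side b_k in B with plain CG.

    Assumes A is symmetric positive definite. All systems advance one
    iteration at a time, each with its own alpha and beta.
    """
    n_rhs = len(B)
    X = [[0.0] * len(b) for b in B]    # initial guess x_k = 0
    R = [list(b) for b in B]           # residuals r_k = b_k - A x_k = b_k
    P = [list(r) for r in R]           # search directions
    rr = [dot(r, r) for r in R]        # squared residual norms
    active = [v > tol * tol for v in rr]

    for _ in range(max_iter):
        if not any(active):
            break
        # Block of independent matrix applications: on an accelerator this
        # is where the additional parallelism would come from.
        AP = [matvec(A, p) if act else None for p, act in zip(P, active)]
        for k in range(n_rhs):
            if not active[k]:
                continue
            alpha = rr[k] / dot(P[k], AP[k])
            X[k] = [x + alpha * p for x, p in zip(X[k], P[k])]
            R[k] = [r - alpha * ap for r, ap in zip(R[k], AP[k])]
            rr_new = dot(R[k], R[k])
            if rr_new <= tol * tol:
                active[k] = False      # this system has converged
            else:
                beta = rr_new / rr[k]
                P[k] = [r + beta * p for r, p in zip(R[k], P[k])]
            rr[k] = rr_new
    return X


if __name__ == "__main__":
    A = [[4.0, 1.0], [1.0, 3.0]]
    B = [[1.0, 2.0], [2.0, 1.0]]       # two right-hand sides solved together
    for x in cg_multi_rhs(A, B):
        print(x)
```

In this sketch each right-hand side keeps its own step sizes, so the result matches running the solver once per vector; only the scheduling of the matrix applications changes.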

Keywords:

Quantum chromodynamics
Lattice (order)
Xeon Phi
Conjugate gradient method
Mathematical analysis
Fermion
Computational science
Kepler
Matrix (mathematics)
Mathematics
Solver
