Parallel Graph Coloring with Applications to the Incomplete-LU Factorization on the GPU

2015 
In this technical report we study dierent parallel graph coloring algorithms and their application to the incomplete-LU factorization. We implement graph coloring based on dierent heuristics and showcase their performance on the GPU. We also present a comprehensive comparison of level-scheduling and graph coloring approaches for the incomplete-LU factorization and triangular solve. We discuss their tradeos and dierences from the mathematics and computer science prospective. Finally we present numerical experiments that showcase the performance of both algorithms. In particular, we show that incomplete-LU factorization based on graph coloring can achieve a speedup of almost 8 on the GPU over the reference MKL implementation on the CPU.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    27
    References
    39
    Citations
    NaN
    KQI
    []