Exploiting Dynamic Parallelism to Efficiently Support Irregular Nested Loops on GPUs

Da Li, Hancheng Wu, Michela Becchi

2015

Graphics Processing Units (GPUs) have been used in general-purpose computing for several years. The Dynamic Parallelism feature, newly introduced with NVIDIA's Kepler GPUs, allows kernels to be launched directly from the GPU. However, naive use of this feature can produce a large number of nested kernel launches, each performing limited work, which leads to GPU underutilization and poor performance. We propose workload consolidation mechanisms at different granularities to maximize the work performed by each nested kernel and to reduce launch overhead. Our end goal is to design automatic code transformation techniques for applications with irregular nested loops.
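The effect of consolidation on the number of nested launches can be illustrated with a small CPU-side model. This is only a sketch under stated assumptions, not the authors' implementation and not GPU code: it assumes the naive scheme issues one nested launch per outer iteration that has inner work, and that consolidation issues one launch per group of outer iterations. The function names, the `group_size` parameter, and the sample data are all hypothetical.

```python
def naive_launches(inner_sizes):
    """Assumed naive scheme: one nested launch per outer iteration with inner work."""
    return sum(1 for n in inner_sizes if n > 0)


def consolidated_launches(inner_sizes, group_size):
    """Assumed consolidated scheme: one nested launch per group of outer iterations.

    group_size is a hypothetical stand-in for the consolidation granularity
    (for example, a small group, a larger group, or all iterations at once).
    A group whose iterations have no inner work issues no launch.
    """
    launches = 0
    for start in range(0, len(inner_sizes), group_size):
        group = inner_sizes[start:start + group_size]
        if sum(group) > 0:
            launches += 1
    return launches


def mean_work_per_launch(inner_sizes, launches):
    """Average number of inner iterations handled by each nested launch."""
    return sum(inner_sizes) / launches if launches else 0.0


# Made-up irregular inner-loop sizes, e.g. node degrees in a sparse graph.
inner_sizes = [3, 0, 1, 7, 2, 0, 0, 5, 1, 1, 4, 0]

naive = naive_launches(inner_sizes)
grouped = consolidated_launches(inner_sizes, group_size=4)
single = consolidated_launches(inner_sizes, group_size=len(inner_sizes))

print("naive launches:", naive, "work per launch:", mean_work_per_launch(inner_sizes, naive))
print("group-of-4 launches:", grouped, "work per launch:", mean_work_per_launch(inner_sizes, grouped))
print("single-group launches:", single, "work per launch:", mean_work_per_launch(inner_sizes, single))
```

In this model, coarser groups mean fewer launches with more work each. The model ignores the cost of gathering work from the participating threads, which a real GPU implementation would have to account for.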

Keywords:

Kernel (linear algebra)
Workload
Parallel computing
Nested loop join
Graphics
Computer science
Theoretical computer science
code transformation
Kepler
workload consolidation
general purpose computing
