Hybrid strategy for stencil computations on the APU

Pacôme Eberhart, Issam Said, Pierre Fortin, Henri Calandra

2014

Stencil computations are highly regular and well suited to GPU execution. However, the PCI-E bus that connects a discrete GPU to system memory has relatively low bandwidth compared to the GPU's compute power. The AMD APU architecture places both a CPU and a GPU on the same chip, with memory shared between them, which makes it possible to bypass the PCI-E bus. In this paper, we devise a strategy for hybrid deployments on the CPU and the integrated GPU of the APU. For the task-parallel deployment, we rely on the CPU to process the diverging parts of the application. For the data-parallel deployment, we balance the workloads of the CPU and the GPU to achieve the best performance. Our strategy is tested on different stencil computations, and we achieve a 20 to 30% performance gain in the best cases.
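To make the data-parallel deployment more concrete, the following is a minimal sketch of splitting one stencil sweep between two workers. It is not the authors' implementation. The 3-point averaging stencil, the function names, and the `gpu_share` parameter are all assumptions chosen for illustration, and both "devices" are ordinary Python functions. Both partitions read from the same input array, which loosely mirrors the shared memory of the APU: no halo copy between partitions is needed.

```python
# Illustrative sketch only: the "GPU" and "CPU" are plain Python calls here.

def stencil_range(src, dst, start, stop):
    """Apply a 3-point averaging stencil to interior indices [start, stop)."""
    for i in range(start, stop):
        dst[i] = (src[i - 1] + src[i] + src[i + 1]) / 3.0


def hybrid_step(src, gpu_share):
    """One stencil sweep with the interior split between two workers.

    gpu_share is a hypothetical tuning knob in [0, 1]: the fraction of
    interior points assigned to the simulated GPU; the rest go to the CPU.
    """
    n = len(src)
    dst = list(src)                         # boundary values are carried over
    interior = n - 2
    split = 1 + int(round(interior * gpu_share))
    stencil_range(src, dst, 1, split)       # stand-in for the integrated GPU
    stencil_range(src, dst, split, n - 1)   # stand-in for the CPU cores
    return dst


def reference_step(src):
    """Same sweep computed by a single worker, used to check the split."""
    dst = list(src)
    stencil_range(src, dst, 1, len(src) - 1)
    return dst
```

In a real deployment, the two ranges would run concurrently, and the split would presumably be tuned so that both workers finish at about the same time. Here the split only demonstrates that the partitioned sweep produces the same result as the unpartitioned one for any value of `gpu_share`.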

Keywords:

Computer architecture
Parallel computing
Computation
Software deployment
Conventional PCI
Central processing unit
Chip
Stencil
Architecture
Shared memory
Computer science
Bandwidth (signal processing)
