Exploiting parallelism of imperfect nested loops with sibling inner loops on coarse-grained reconfigurable architectures

Xinhan Lin, Shouyi Yin, Leibo Liu, Shaojun Wei

2016

Coarse-grained reconfigurable architecture (CGRA) is a promising platform for loop acceleration, but existing software pipelining methods cannot achieve satisfactory performance on a fair number of imperfect nested loops, especially those with sibling inner loops. To tackle this problem, this paper makes two contributions: 1) a two-level pipelining method with an effective initiation interval (II) optimization strategy for imperfect loops with sibling inner loops; 2) a novel kernel compression method to reduce oversized kernels. Experimental results show that our approach achieves much higher performance than state-of-the-art approaches at acceptable cost.
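For readers unfamiliar with the loop shape the abstract targets, the following is a minimal illustrative sketch of an imperfect nested loop with sibling inner loops: the outer loop body contains statements of its own plus two inner loops at the same nesting level. The function name, variable names, and arithmetic are hypothetical and chosen only for illustration; they are not taken from the paper, and the sketch shows the loop structure only, not the proposed pipelining or kernel compression methods.

```python
def sibling_inner_loops(a, b):
    """Illustrative loop nest only (hypothetical example, not from the paper).

    The outer loop over i is "imperfect" because it holds statements of its
    own, and it contains two inner loops (over j and k) that are siblings:
    they sit at the same nesting depth and run one after the other.
    """
    out = [0] * len(a)
    for i in range(len(a)):
        row_sum = 0                      # outer-level statement before the first inner loop
        for j in range(len(a[i])):       # first inner loop
            row_sum += a[i][j]
        scale = row_sum + 1              # outer-level statement between the siblings
        acc = 0
        for k in range(len(b[i])):       # second inner loop, sibling of the first
            acc += b[i][k] * scale
        out[i] = acc                     # outer-level statement after the second inner loop
    return out
```

Because the second inner loop depends on a value produced by the first, and both are interleaved with outer-level statements, the nest cannot be treated as a single perfectly nested loop, which is the situation the abstract says existing software pipelining methods handle poorly.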

Keywords:

Kernel (linear algebra)
For loop
Parallel computing
Imperfect
Real-time computing
Computer science
Software pipelining
Architecture
Nested loop join
Pipeline (computing)
Acceleration
compression method
