Exploiting the Inter-cluster Record Reuse for Stream Processors

2014 
Memory accesses limit the performance of stream processors. The stream compiler exploits the reuse of records distributed across different ALU clusters by introducing inter-cluster communication, which degrades program performance. This paper presents the Stream Transpose (ST) approach to exploiting such reuse. By reorganizing the data, the approach places records that would otherwise be distributed across neighboring ALU clusters on the same ALU cluster, and so exploits the reuse without any inter-cluster communication. Experimental results show that the approach can exploit the reuse of records distributed among ALU clusters without any inter-cluster communication or any degradation of stream accesses, and achieves a speedup of up to 1.46 over the approach that uses inter-cluster communication.
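The data reorganization described in the abstract can be illustrated with a small sketch. It is not the paper's implementation. It assumes a round-robin distribution (the record at stream position p goes to cluster p mod C) and models the reorganization as a transpose of the stream viewed as a C-row matrix. The names `cluster_of`, `stream_transpose`, and `cross_cluster_pairs` are hypothetical and chosen for illustration only.

```python
def cluster_of(position, num_clusters):
    """Cluster that receives the record at a stream position.

    Assumption: records are dealt to clusters round-robin.
    """
    return position % num_clusters


def stream_transpose(stream, num_clusters):
    """Reorder the stream so that, under round-robin distribution,
    each cluster receives one contiguous block of the original stream.

    Assumption: the stream length is a multiple of num_clusters.
    """
    n = len(stream)
    assert n % num_clusters == 0
    block = n // num_clusters
    # New position j * num_clusters + c holds original record c * block + j.
    return [stream[c * block + j]
            for j in range(block)
            for c in range(num_clusters)]


def cross_cluster_pairs(stream, num_clusters):
    """Count neighboring records (i, i + 1), by original index, that land
    on different clusters. Records are the integers 0 .. n - 1."""
    home = {rec: cluster_of(pos, num_clusters)
            for pos, rec in enumerate(stream)}
    return sum(1 for i in range(len(stream) - 1) if home[i] != home[i + 1])


records = list(range(16))
before = cross_cluster_pairs(records, 4)
after = cross_cluster_pairs(stream_transpose(records, 4), 4)
print(before, after)  # 15 3
```

In this toy model, every neighboring pair straddles two clusters before the reorganization, while afterwards only the pairs at block boundaries do. How the paper handles those boundary records is not stated in the abstract.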