Computing En-Route for Near Data Processing

2021 
The data explosion and the demand for faster data analysis have spawned emerging applications that operate over massive amounts of data and exhibit large memory footprints with low data reuse rates. Such characteristics lead to enormous data movement across the memory hierarchy and place significant pressure on modern communication fabrics and memory subsystems. To mitigate the worsening gap between high processor computation density and deficient memory bandwidth, memory networks and near-data processing techniques have been proposed to continue improving system performance and energy efficiency. In this article, we propose Active-Routing, an in-network near-data processing architecture for data-flow execution, which enables computation en-route by exploiting patterns of aggregation over intermediate results. The proposed architecture leverages massive memory cube- and vault-level parallelism as well as network concurrency to optimize aggregation operations along a dynamically built Active-Routing Tree. It also introduces page-granular computation offloading to amortize the offloading overhead and improve throughput. Compared to a state-of-the-art processing-in-memory architecture, our evaluations show that the baseline Active-Routing achieves up to 7× speedup with an average performance improvement of 60 percent, and reduces the energy-delay product by 80 percent across various benchmarks. Further optimizations with vault-level parallelism and page-granular offloading achieve an additional order-of-magnitude improvement.
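To make the idea of en-route aggregation concrete, the following is a minimal conceptual sketch, not the paper's hardware design: it assumes a hypothetical tree of memory-cube routers (the `CubeNode` class and its methods are illustrative names only) in which each node computes a partial result over its local data and combines it with its children's aggregates, so only a single reduced value travels toward the host, mirroring the role of the Active-Routing Tree described above.

```python
# Conceptual sketch of tree-based en-route aggregation (hypothetical names).
# Partial results computed near each memory cube are reduced along a tree
# toward the root, so upstream links carry aggregated values rather than
# raw operands.

from dataclasses import dataclass, field
from typing import List


@dataclass
class CubeNode:
    """A memory-cube router holding operands resident in its local vaults."""
    local_values: List[float]
    children: List["CubeNode"] = field(default_factory=list)

    def local_partial(self) -> float:
        # Near-data compute step: e.g., a partial sum over local operands.
        return sum(self.local_values)

    def aggregate(self) -> float:
        # En-route reduction: combine the local partial with the children's
        # aggregates at this router, forwarding a single value upstream.
        return self.local_partial() + sum(c.aggregate() for c in self.children)


if __name__ == "__main__":
    # A root cube attached to the host, with two child cubes in the network.
    leaf_a = CubeNode(local_values=[1.0, 2.0])
    leaf_b = CubeNode(local_values=[3.0, 4.0])
    root = CubeNode(local_values=[5.0], children=[leaf_a, leaf_b])
    print(root.aggregate())  # 15.0 returned to the host as one scalar
```

In the actual architecture the tree is built dynamically as offloaded operations are routed through the memory network, and the reduction is performed by routing hardware rather than software; the sketch only illustrates why aggregating intermediate results in the network reduces data movement back to the processor.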