Exploring hardware work queue support for lightweight threads in MPSoCs

2012 
Fine-grain thread parallelism using task-based programming models is a new trend in achieving massively parallel computation. Often, software pre-fetching and queuing mechanisms for managing these dynamic environments are inadequate and fail to keep the processor cores busy with computation. At the same time, the CPU-memory performance gap is widening, which puts a strain on the memory subsystem to keep cores busy. We describe a hardware-based pre-fetching and queuing mechanism aimed at supporting the over-subscription of very lightweight threads per core. Experiments with a soft processor and a reconfigurable accelerator core are reported. The hardware demonstrates the ability to block on out-of-order memory transactions and alleviates the software bottleneck.