A task-uncoordinated distributed dataflow model for scalable high performance parallel program execution

2016 
We describe a novel model for executing distributed memory parallel programs using uncoordinated tasks. We describe several off-line optimizations for the proposed model. We examine the effects of these optimizations on modern processors with wider vector units. Increasing levels of task coalescence can improve throughput and increase performance. Increases in performance are observed in both single node and multi node experiments.

We propose a distributed dataflow execution model which utilizes a distributed dictionary for data memoization, allowing each parallel task to schedule instructions without direct inter-task coordination. We provide a description of the proposed model, including autonomous dataflow task selection. We also describe a set of optimization strategies which improve overall throughput of stencil programs executed using this model on modern multi-core and vectorized architectures.
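
The core idea of the abstract, tasks that coordinate only through a memoizing dictionary, can be illustrated with a small sketch. This is not the authors' implementation: the names `Task`, `run`, and `stencil` are hypothetical, a plain in-process `dict` stands in for the distributed dictionary, the three-point averaging stencil with clamped boundaries is an arbitrary choice, and the tasks are interleaved round-robin in one process rather than run on separate nodes. Each task owns a block of cells and fires any of its instructions whose inputs are already present in the dictionary, without exchanging messages with other tasks.

```python
from typing import Dict, List, Tuple

Key = Tuple[int, int]  # (time step, cell index)


def stencil(left: float, mid: float, right: float) -> float:
    """Three-point averaging stencil (an illustrative choice, not from the paper)."""
    return (left + mid + right) / 3.0


class Task:
    """A task that owns a block of cells and schedules its own instructions.

    The only shared state is `store`, a dict standing in for the
    distributed dictionary; tasks never call each other.
    """

    def __init__(self, cells: range, n_cells: int, n_steps: int,
                 store: Dict[Key, float]) -> None:
        self.n_cells = n_cells
        self.store = store
        # Every (time step, cell) result this task is responsible for.
        self.pending: List[Key] = [
            (t, i) for t in range(1, n_steps + 1) for i in cells
        ]

    def _inputs(self, key: Key) -> List[Key]:
        t, i = key
        # Assumed boundary rule: clamp neighbor indices at the domain edges.
        return [
            (t - 1, max(i - 1, 0)),
            (t - 1, i),
            (t - 1, min(i + 1, self.n_cells - 1)),
        ]

    def step(self) -> int:
        """Fire every pending instruction whose inputs are memoized."""
        fired = 0
        still_pending: List[Key] = []
        for key in self.pending:
            inputs = self._inputs(key)
            if all(k in self.store for k in inputs):
                self.store[key] = stencil(*(self.store[k] for k in inputs))
                fired += 1
            else:
                still_pending.append(key)
        self.pending = still_pending
        return fired


def run(initial: List[float], n_steps: int, n_tasks: int) -> List[float]:
    """Advance `initial` by `n_steps` using `n_tasks` uncoordinated tasks."""
    n = len(initial)
    store: Dict[Key, float] = {(0, i): v for i, v in enumerate(initial)}
    block = (n + n_tasks - 1) // n_tasks  # cells per task, rounded up
    tasks = [
        Task(range(s, min(s + block, n)), n, n_steps, store)
        for s in range(0, n, block)
    ]
    # Round-robin interleaving simulates independent progress of the tasks.
    while any(task.pending for task in tasks):
        if sum(task.step() for task in tasks) == 0:
            raise RuntimeError("no instruction is ready; dependency is missing")
    return [store[(n_steps, i)] for i in range(n)]
```

In this sketch a task may run several time steps ahead on the interior of its block while cells near a block edge wait for a neighbor's results to appear in the dictionary. The final values do not depend on how the cells are divided among tasks, since each result is computed once from the same memoized inputs.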