Supporting Iteration in a Heterogeneous Data Flow Engine

Jon Currey, Simon Baker, Christopher J. Rossbach

2013

Dataflow execution engines such as MapReduce, DryadLINQ, and PTask have enjoyed success because they simplify development for a class of important parallel applications. These systems sacrifice generality for simplicity: while many workloads are easily expressed, important idioms like iteration and recursion are difficult to express and support efficiently. We consider the problem of extending a dataflow engine to support data-dependent iteration in a heterogeneous environment, where architectural diversity introduces data migration and scheduling challenges that complicate the problem. We propose constructs that enable a dataflow engine to efficiently support data-dependent control flow in a heterogeneous environment, implement them in a prototype system called IDEA, and use them to implement a variant of optical flow, a well-studied computer vision algorithm. Optical flow relies heavily on nested loops, making it difficult to express without explicit support for iteration. We demonstrate that IDEA enables up to 18× speedup over sequential and 32% speedup over a GPU implementation using synchronous host-based control.
