Parallelization primitives for dynamic sparse computations

2013 
We characterize a general class of algorithms common in machine learning, scientific computing, and signal processing, whose computational dependencies are both sparse and dynamically defined throughout execution. Existing parallel computing runtimes, like MapReduce and GraphLab, are a poor fit for this class because they assume statically defined dependencies for resource allocation and scheduling decisions. As a result, changing load characteristics and straggling compute units degrade performance significantly. However, we show that the sparsity of computational dependencies and these algorithms' natural error tolerance can be exploited to implement a flexible execution model with large efficiency gains, using two simple primitives: selective push-pull and statistical barriers. With reconstruction for compressive time-lapse MRI as a motivating application, we deploy a large Orthogonal Matching Pursuit (OMP) computation on Amazon's EC2 cluster to demonstrate a 19x speedup over current static execution models.
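To illustrate what "sparse and dynamically defined" dependencies can look like, here is a minimal single-machine sketch of textbook Orthogonal Matching Pursuit using NumPy. It is not the distributed implementation described in the abstract, and the names `omp`, `D`, `y`, and `k` are illustrative assumptions. The point is only that each iteration touches a small set of dictionary columns, and which columns those are is not known until the previous iteration finishes.

```python
import numpy as np


def omp(D, y, k):
    """Textbook OMP sketch: approximate y with at most k columns (atoms) of D.

    Assumes the columns of D have unit norm. Returns the sparse coefficient
    vector and the list of selected column indices.
    """
    y = np.asarray(y, dtype=float)
    residual = y.copy()
    support = []          # indices chosen so far; grows by one per iteration
    coef = np.zeros(0)
    for _ in range(k):
        # Correlate every atom with the current residual.
        scores = np.abs(D.T @ residual)
        if support:
            scores[support] = 0.0  # never reselect an atom
        # The next dependency is only known now, at run time.
        support.append(int(np.argmax(scores)))
        # Re-fit y on the selected atoms only (the sparse part of the work).
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x, support
```

In this sketch the correlation step scans all atoms while the least-squares step depends only on the atoms in `support`, a set that changes every iteration. That data-dependent access pattern is the kind of behavior a statically scheduled runtime cannot plan for in advance.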