Efficient and Robust Distributed Matrix Computations via Convolutional Coding

2021 
Distributed matrix computations are well recognized to suffer from the problem of stragglers (slow or failed worker nodes). The majority of prior work in this area has presented straggler mitigation strategies that either (i) are sub-optimal in terms of their straggler resilience, or (ii) suffer from numerical problems, i.e., a blow-up of round-off errors in the decoded result owing to the high condition numbers of the corresponding decoding matrices. This work introduces a novel solution framework, based on embedding the computations into the structure of a convolutional code, that removes these limitations. Our approach is provably optimal in terms of its straggler resilience, and it has excellent numerical robustness, which can be theoretically quantified by deriving a computable upper bound on the worst-case condition number over all possible decoding matrices. All of the above claims are backed by extensive experiments on the AWS cloud platform.
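
The following is a minimal, generic sketch of the coded-computation setting the abstract refers to: a master splits a matrix into blocks, encodes them, and recovers the product from any k of n worker responses, with the condition number of the decoding matrix governing round-off amplification. The dimensions, the random Gaussian generator matrix, and the straggler pattern are illustrative assumptions only; this is not the paper's convolutional-code construction.

```python
# Illustrative sketch of coded distributed matrix-vector multiplication.
# Uses a random Gaussian generator matrix (an assumption for this sketch),
# NOT the convolutional-code scheme proposed in the paper.
import numpy as np

rng = np.random.default_rng(0)

m, d = 600, 400          # dimensions of A (assumed for illustration)
k, n = 4, 6              # k data blocks, n workers -> tolerates n - k stragglers

A = rng.standard_normal((m, d))
x = rng.standard_normal(d)

# Split A into k row-blocks and encode them into n coded blocks.
blocks = np.split(A, k, axis=0)                   # each block is (m/k) x d
G = rng.standard_normal((n, k))                   # generator matrix (assumption)
coded = [sum(G[i, j] * blocks[j] for j in range(k)) for i in range(n)]

# Each worker i would compute coded[i] @ x; here we simulate that locally
# and pretend the last (n - k) workers are stragglers.
worker_results = {i: coded[i] @ x for i in range(n)}
survivors = sorted(worker_results)[:k]            # any k responses suffice

# Decode: solve the k x k system given by the surviving rows of G.
G_s = G[survivors, :]                             # decoding matrix
Y = np.stack([worker_results[i] for i in survivors])   # k x (m/k)
decoded_blocks = np.linalg.solve(G_s, Y)          # recovers blocks[j] @ x
Ax_hat = decoded_blocks.reshape(-1)

print("max error vs. A @ x:", np.max(np.abs(Ax_hat - A @ x)))
print("condition number of decoding matrix:", np.linalg.cond(G_s))
```

In this sketch, a poorly conditioned decoding matrix G_s amplifies round-off error in the recovered product; the paper's stated contribution is a construction whose worst-case condition number over all possible decoding matrices admits a computable upper bound.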