Distributed Optimization, Averaging via ADMM, and Network Topology

2020 
The explosion in the size of data sets and in model complexity in modern machine learning applications has created an increasing need for scalable optimization methods. Scalable solvers often distribute the computation over a network of processing units. For simple algorithms, such as gradient descent, the dependence of the convergence time on the topology of this network is well understood. For more involved algorithms, such as the alternating direction method of multipliers (ADMM), much less is known. At the heart of many distributed optimization algorithms lies a gossip subroutine that averages local information over the network and whose efficiency is crucial for the overall performance of the method. In this article, we review recent research in this area and, with the goal of isolating this communication-exchange behavior, compare different algorithms when applied to a canonical distributed averaging consensus problem. We also show interesting connections between ADMM and lifted Markov chains, and we provide an explicit characterization of ADMM's convergence and optimal parameter tuning in terms of spectral properties of the network. Finally, we empirically study the connection between network topology and convergence rates for different algorithms on a real-world sensor localization problem.
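To make the canonical distributed averaging consensus problem and its dependence on network topology concrete, the following is a minimal sketch of a gossip averaging iteration in Python. The ring topology, the Metropolis-Hastings weight rule, and all variable names are illustrative assumptions, not necessarily the setup or the ADMM-based method studied in the article; the sketch only illustrates how the spectral gap of the gossip matrix governs the convergence of plain averaging.

```python
import numpy as np

def metropolis_weights(adj):
    """Doubly stochastic gossip matrix built from an adjacency matrix
    via the Metropolis-Hastings rule (an illustrative, standard choice)."""
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if adj[i, j]:
                W[i, j] = 1.0 / (1 + max(deg[i], deg[j]))
        W[i, i] = 1.0 - W[i].sum()
    return W

# Ring (cycle) topology on n nodes: a poorly connected graph with a small spectral gap.
n = 20
adj = np.zeros((n, n), dtype=int)
for i in range(n):
    adj[i, (i + 1) % n] = adj[(i + 1) % n, i] = 1

W = metropolis_weights(adj)
theta = np.random.randn(n)   # local values held by each node
target = theta.mean()        # the consensus value every node should reach

# The second-largest eigenvalue modulus of W sets the linear convergence rate of gossip;
# 1 - |lambda_2(W)| is the spectral gap.
eigvals = np.sort(np.abs(np.linalg.eigvals(W)))[::-1]
print("second-largest eigenvalue modulus:", eigvals[1])

x = theta.copy()
for k in range(200):
    x = W @ x                # each node replaces its value by a weighted average of its neighbours'
print("max deviation from the network average after 200 rounds:", np.abs(x - target).max())
```

On this ring, the spectral gap shrinks as the number of nodes grows, so plain gossip averaging slows down accordingly; the article's comparison of ADMM-based schemes against such baselines is framed in terms of exactly these spectral quantities.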