Study of Data Imbalance and Asynchronous Aggregation Algorithm on Federated Learning System

2020 
As machine learning techniques become more widespread, the need for more elaborate datasets grows. Such datasets are usually gathered with data collection methods that pay little to no attention to the data owner’s privacy and consent. Federated learning is an approach that tries to solve this problem: such a system can train a machine learning model without centrally storing the needed data. One weakness of current implementations, however, is their slow convergence time, despite the fact that they distribute the task across many nodes. This is mainly caused by the synchronous nature of the current algorithm. In this paper, we observe the effect that an asynchronous aggregation algorithm has on convergence time and test the two factors that might affect it, staleness and data imbalance, at various levels. We implement the asynchronous aggregation algorithm by adapting the Stale Synchronous Parallel algorithm. We test our system on the MNIST dataset and find that the asynchronous aggregation algorithm improves convergence time in a federated learning system that has large inequality in server-wise update frequency and a relatively balanced data distribution.
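To make the idea of staleness-bounded asynchronous aggregation concrete, the following is a minimal sketch in the spirit of Stale Synchronous Parallel. It is not the implementation evaluated in the paper: the class name `SSPAggregator`, the `push` and `can_proceed` methods, the plain-list weight representation, and the averaging rule are all assumptions made for illustration. Each worker keeps a clock counting its accepted updates, and an update is accepted only while that worker is at most `staleness` clocks ahead of the slowest worker.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SSPAggregator:
    """Toy staleness-bounded aggregator; all names here are hypothetical."""
    weights: List[float]   # global model parameters, flattened
    num_workers: int
    staleness: int         # max allowed clock gap to the slowest worker
    lr: float = 1.0        # scaling applied to each incoming update
    clocks: Dict[int, int] = field(init=False)

    def __post_init__(self) -> None:
        # Every worker starts at clock 0 (no accepted updates yet).
        self.clocks = {w: 0 for w in range(self.num_workers)}

    def can_proceed(self, worker: int) -> bool:
        # A worker may contribute only if it is not too far ahead
        # of the slowest worker.
        gap = self.clocks[worker] - min(self.clocks.values())
        return gap <= self.staleness

    def push(self, worker: int, delta: List[float]) -> bool:
        """Apply one worker's update immediately, without waiting for others.

        Returns False (and changes nothing) if the worker must wait.
        """
        if not self.can_proceed(worker):
            return False
        # Assumed aggregation rule: add the update scaled by lr / num_workers.
        scale = self.lr / self.num_workers
        self.weights = [w + scale * d for w, d in zip(self.weights, delta)]
        self.clocks[worker] += 1
        return True
```

With `staleness=0` this sketch degenerates to lock-step behaviour, where no worker can start a new round until all others have caught up; larger values let fast workers run ahead, at the cost of applying updates computed from older weights.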