An Experimental Comparison Between Genetic Algorithm and Particle Swarm Optimization in Spark Performance Tuning

2017 
Spark, the most popular in-memory computing framework, has a number of performance-critical configuration parameters. Manually tuning these parameters for optimal performance is impractical because the parameter space is huge. Search algorithms such as the genetic algorithm can be used to search for optimal configurations automatically. However, several such algorithms exist, and it is unclear which one is better suited to Spark configuration parameter tuning. To address this issue, we experimentally compare two search algorithms, the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO), in searching for the optimal configurations of Spark applications. We make several interesting observations. First, PSO converges 2x faster than GA, but the configurations it finds yield slightly poorer performance than those found by GA. Second, PSO scales better than GA with respect to the number of configuration parameters. Finally, PSO is more robust than GA across different search processes. Based on these observations, we recommend using PSO in the context of Spark performance tuning.
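To make the search setting concrete, the following is a minimal sketch of a textbook global-best PSO over a small configuration space. It is not necessarily the variant used in the paper, which the abstract does not specify. The parameter labels and ranges in `SPACE` are hypothetical, loosely modeled on Spark settings, and `synthetic_runtime` is an assumed stand-in for actually running a Spark job and measuring its execution time.

```python
import random

# Hypothetical search space: (label, low, high) per tunable parameter.
# Labels and ranges are illustrative assumptions, not taken from the paper.
SPACE = [
    ("executor_memory_gb", 1.0, 16.0),
    ("executor_cores", 1.0, 8.0),
    ("shuffle_partitions", 8.0, 512.0),
]


def synthetic_runtime(x):
    """Assumed stand-in for a measured job runtime: a smooth bowl."""
    target = [8.0, 4.0, 200.0]  # arbitrary location of the minimum
    return sum(
        ((xi - ti) / (hi - lo)) ** 2
        for xi, ti, (_, lo, hi) in zip(x, target, SPACE)
    )


def pso(cost, space, particles=20, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO; returns (best_position, best_cost)."""
    rng = random.Random(seed)
    lows = [lo for _, lo, _ in space]
    highs = [hi for _, _, hi in space]
    dims = len(space)

    # Random initial positions inside the bounds, zero initial velocity.
    pos = [[rng.uniform(lows[d], highs[d]) for d in range(dims)]
           for _ in range(particles)]
    vel = [[0.0] * dims for _ in range(particles)]

    # Each particle remembers its own best; the swarm shares one global best.
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]

    for _ in range(iters):
        for i in range(particles):
            for d in range(dims):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move, then clamp to the allowed parameter range.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lows[d]), highs[d])
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost
```

Calling `pso(synthetic_runtime, SPACE)` returns the best position found and its cost. In a real tuning loop, the cost function would launch the application with the candidate configuration and return its measured runtime, and integer-valued parameters would need rounding before use.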