HPC AI500 V2.0: The Methodology, Tools, and Metrics for Benchmarking HPC AI Systems

2021 
Recent years have witnessed a trend of applying large-scale distributed deep learning algorithms (HPC AI) in both business and scientific computing domains, where the goal is to reduce training time while achieving state-of-the-art quality. HPC AI benchmarks accelerate this process. Unfortunately, benchmarking HPC AI systems at scale raises serious challenges. This paper presents a comprehensive HPC AI benchmarking methodology that achieves equivalence, representativeness, repeatability, and affordability. Among the nineteen AI workloads of AIBench Training, by far the most comprehensive AI benchmark suite, we choose two that are representative and repeatable in terms of both AI model and micro-architectural characteristics. The selected HPC AI benchmarks cover both business and scientific computing: Image Classification and Extreme Weather Analytics. Finally, we propose three high-level benchmarking levels and the corresponding rules to ensure equivalence. To rank the performance of HPC AI systems, we present a new metric named Valid FLOPS, which emphasizes both throughput performance and target quality. The evaluations show that our methodology, benchmarks, and metrics can measure and rank HPC AI systems in a simple, affordable, and repeatable way. The specification, source code, datasets, and HPC AI500 ranking numbers are publicly available from https://www.benchcouncil.org/aibench/hpcai500/index.html.
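To make the Valid FLOPS idea concrete, below is a minimal Python sketch of one way such a metric could couple throughput with achieved quality: raw FLOPS are scaled by a quality-ratio penalty so that runs falling short of the target quality are discounted. The function name valid_flops, the power-law penalty form, and the exponent n are illustrative assumptions, not the paper's verbatim definition.

```python
def valid_flops(measured_flops: float,
                achieved_quality: float,
                target_quality: float,
                n: float = 5.0) -> float:
    """Scale raw throughput by a quality penalty.

    The penalty (achieved_quality / target_quality) ** n discounts
    runs that trade model quality for speed. The exponent n is a
    hypothetical tuning parameter chosen here for illustration.
    """
    penalty = (achieved_quality / target_quality) ** n
    return measured_flops * penalty


# Example: a run measuring 100 PFLOPS that reaches 75.9% top-1
# accuracy against a 76.3% target (hypothetical numbers) is
# credited with slightly less than its raw throughput.
print(valid_flops(100e15, 0.759, 0.763))
```

Under this sketch, a system that hits the target quality exactly keeps its full measured FLOPS, while any quality shortfall shrinks the score multiplicatively, so pure-throughput optimizations cannot game the ranking.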