Performance analysis and prediction for distributed homogeneous clusters

2013 
We present a new performance model, based on the roofline concept, for the analysis and performance prediction of distributed computing clusters. The background for our performance modeling is the 28 km InfiniBand interconnection between two bwGRiD clusters, each consisting of 140 compute nodes, in day-to-day production use. The model is used to compare the MPI performance of intra-cluster communication with that of inter-cluster communication. We compare the new modeling results with our earlier stochastic model (Richling et al. in Proc. of 3PGCIC-2010. IEEE, New York 2010), with which we estimated the bandwidth requirements for doubling the performance of an application (LinPack as the simplest example). We derive bounds for the size of regions in a cluster and for the scaling of the maximal speed-up for the region-region-interconnected network.