MPI performance engineering with the MPI tool interface: the integration of MVAPICH and TAU
2017
MPI implementations are becoming increasingly complex and highly tunable, and thus scalability limitations can come from numerous sources. The MPI Tools Interface (MPI_T) introduced as part of the MPI 3.0 standard provides an opportunity for performance tools and external software to introspect and understand MPI runtime behavior at a deeper level to detect scalability issues. The interface also provides a mechanism to re-configure the MPI library dynamically at runtime to fine-tune performance. In this paper, we propose an infrastructure that extends existing components - TAU, MVAPICH2 and BEACON to take advantage of the MPI_T interface to offer runtime introspection, online monitoring, recommendation generation and autotuning capabilities. We validate our design by developing optimizations for a combination of production and synthetic applications. We use our infrastructure to implement an autotuning policy for AmberMD[1] that monitors and reduces MVAPICH2 library internal memory footprint by 20% without affecting performance. For applications where collective communication is latency sensitive such as MiniAMR[2], our infrastructure is able to generate recommendations to enable hardware offloading of collectives supported by MVAPICH2. By implementing this recommendation, we see a 5% improvement in application runtime.
Keywords:
- Correction
- Source
- Cite
- Save
- Machine Reading By IdeaReader
22
References
10
Citations
NaN
KQI