Distibuted Dense Numerical Linear Algebra Algorithms on massively parallel architectures: DPLASMA

2011 
We present DPLASMA, a new project related to PLASMA, that operates in the distributed memory regime. It uses a new generic distributed Direct Acyclic Graph engine for high performance computing (DAGuE). Our work also takes advantage of some of the features of DAGuE, such as DAG representation that is independent of problem-size, overlapping of communication and computa- tion, task prioritization, architecture-aware scheduling and management of micro-tasks on distributed architectures that feature heterogeneous many-core nodes. The originality of this engine is that it is capable of translating a sequential nested-loop code into a concise and synthetic format which it can be interpret and then execute in a distributed environment. We consider three common dense linear algebra algorithms, namely: Cholesky, LU and QR factorizations, to investigate their data driven expression and execution in a distributed system. We demonstrate from our preliminary results that our DAG-based approach has the potential to bridge the gap between the peak and the achieved performance that is characteristic in the state-of-the-art distributed numerical softwares on current and emerging architectures.
    • Correction
    • Cite
    • Save
    • Machine Reading By IdeaReader
    0
    References
    1
    Citations
    NaN
    KQI
    []