Automatic Generation of Optimized Workflow for Distributed Computations on Large-Scale Matrices

2014 
Efficient evaluation of distributed computation on large-scale data is prominent in modern scientific computation; especially analysis of big data, image processing and data mining applications. This problem is particularly challenging in distributed environments such as campus clusters, grids or clouds on which the basic computation routines are offered as web/cloud services. In this paper, we propose a locality-aware workflow-based solution for evaluation of large-scale matrix expressions in a distributed environment. Our solution is based on automatic generation of BPEL workflows in order to coordinate long running, asynchronous and parallel invocation of services. We optimize the input expression in order to maximize parallel execution of independent operations while reducing the matrix transfer cost to a minimum. Our approach frees the end-user of the system from the burden of writing and debugging lengthy BPEL workflows. We evaluated our solution on realistic mathematical expressions executed on large-scale matrices distributed on multiple clouds.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    26
    References
    0
    Citations
    NaN
    KQI
    []