Effectively Using Remote I/O For Work Composition in Distributed Workflows

2020 
Distributed scientific workflows are becoming more important as interest grows in incorporating AI into their loops. A critical programming and performance question is how to compose workflow tasks when data is produced on one system but must be consumed on another. Because the dominant technique is composition with remote I/O, this paper explores its performance expectations. We describe BigFlowSim, a workflow I/O simulator that captures key implementation choices for remote I/O, including intensity, reuse, locality, access pattern, and data movement. With BigFlowSim, we generate a synthetic benchmark and quantify the effect of each parameter in a performance sensitivity study. We explain the trends in terms of data movement reduction and show that, under certain conditions, it is possible to establish a total order among most parameters.