Automatic Scientific Workflow Composition

2012 
Although the composition of scientific workflows has been widely studied, a general and efficient approach to composing them automatically is still lacking. In this chapter, we present a STRIPS-based formal definition of the scientific workflow composition problem, followed by an algorithm for the automatic composition of high-quality (portable, fault-tolerant, and optimized) scientific workflows. The algorithm consists of two sub-algorithms that handle control flow composition and data flow composition, respectively. The automatic control flow composition algorithm searches for Activity Functions (AFs) and automatically composes them into scientific workflows using an AF Data Dependence (ADD) graph. The composition process consists of three phases: ADD graph creation, workflow extraction, and workflow optimization. The worst-case complexity of the algorithm is quadratic in the number of AFs. An extension of the algorithm that composes scientific workflows with branches and loops is also presented. Once the control flow is established, the data flow composition algorithm composes the data flow of a scientific workflow by locating the possible source data ports of each sink data port through backward traversal of the control flow, and by matching source data ports against sink data ports based on data semantics.
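The first two phases of control flow composition (ADD graph creation and workflow extraction) can be illustrated with a minimal sketch. It is not the chapter's algorithm: the `AF` class, the function names, and the example data names are all hypothetical, each AF is reduced to a STRIPS-like pair of input and output data sets, and the optimization phase, branches, loops, and data port matching are omitted. The forward pass makes at most one sweep per AF, which keeps the sketch quadratic in the number of AFs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AF:
    """Hypothetical Activity Function: consumes and produces named data."""
    name: str
    inputs: frozenset
    outputs: frozenset


def build_add_graph(afs, initial):
    """Forward phase: add every AF whose inputs are already available.

    Returns the AFs in the order they became applicable and a map from
    each data item to the AF that first produced it (the dependence edges).
    """
    available = set(initial)
    producer = {}
    added = []
    remaining = list(afs)
    progress = True
    while progress:  # at most len(afs) sweeps, each over at most len(afs) AFs
        progress = False
        for af in list(remaining):
            if af.inputs <= available:
                remaining.remove(af)
                added.append(af)
                for item in af.outputs - available:
                    producer[item] = af
                available |= af.outputs
                progress = True
    return added, producer


def extract_workflow(added, producer, initial, goal):
    """Backward phase: keep only the AFs the goal data depends on."""
    initial = set(initial)
    selected = set()
    stack = list(set(goal) - initial)
    while stack:
        item = stack.pop()
        af = producer.get(item)
        if af is None:
            raise ValueError(f"no AF produces required data: {item}")
        if af.name not in selected:
            selected.add(af.name)
            stack.extend(af.inputs - initial)
    # 'added' is already a valid execution order, so filtering preserves it.
    return [af.name for af in added if af.name in selected]


def compose(afs, initial, goal):
    added, producer = build_add_graph(afs, initial)
    return extract_workflow(added, producer, initial, goal)


# Hypothetical AF library; 'report' is applicable but irrelevant to the goal.
library = [
    AF("plot", frozenset({"table"}), frozenset({"figure"})),
    AF("clean", frozenset({"raw"}), frozenset({"table"})),
    AF("fetch", frozenset({"url"}), frozenset({"raw"})),
    AF("report", frozenset({"table"}), frozenset({"summary"})),
]
workflow = compose(library, initial={"url"}, goal={"figure"})
print(workflow)
```

In this sketch the forward pass records more AFs than the goal needs, and the backward pass prunes those that do not contribute, which is why `report` does not appear in the resulting workflow.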