Using formal verification to evaluate the execution time of Spark applications
2020
Apache Spark is probably the most widely adopted framework for developing big-data batch applications and for executing them on a
cluster of (virtual) machines. In general, the more resources
(machines) one uses, the faster applications execute, but there is
currently no adequate means to determine the proper size of a Spark
cluster given time constraints, or to foresee execution times given
the number of employed machines. In practice, one can only run these
applications and rely on experience to size the cluster and predict
expected execution times. Wrong estimates of execution times can lead
to costly overruns and overly long executions, thus calling for
analytic sizing/prediction techniques that provide precise time
guarantees. This paper addresses this problem by proposing a
solution based on model checking. The approach exploits a directed
acyclic graph (DAG) to abstract the structure of the execution flows
of Spark programs, annotates each node (Spark stage) with
execution-related data, and formulates the identification of the
global execution time as a reachability problem. To avoid the
well-known state space explosion problem, the paper also proposes a
technique to reduce the size of generated abstract models. This
results in a significant decrease in memory usage and/or verification
time, making the approach feasible for predicting the execution time
of Spark applications given the resources available. The benefits of
the proposed reduction technique are evaluated by using both timed
automata and Constraint LTL over clocks (CLTLoc) to formally
encode and analyze the generated models. The approach is also
successfully validated on some realistic case studies. Since the
optimization is not Spark-specific, we claim that it can be applied
to a wide range of applications whose underlying model can be
abstracted as a DAG.
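
To make the abstraction concrete, the following minimal sketch (with hypothetical names such as Stage and finish_time) shows how a Spark job reduces to a DAG of duration-annotated stages. It is not the paper's formal encoding, which relies on timed automata and CLTLoc; it only illustrates the DAG model under the simplifying assumption of unlimited executors, where the end-to-end time is the length of the critical path.

    from dataclasses import dataclass

    # Hypothetical illustration: a Spark job abstracted as a DAG of stages,
    # each annotated with an estimated duration. With unlimited executors,
    # the end-to-end time is the longest (critical) path through the DAG.

    @dataclass(frozen=True)
    class Stage:
        name: str
        duration: float      # estimated stage execution time (seconds)
        parents: tuple = ()  # stages that must finish before this one starts

    def finish_time(stage: Stage, memo=None) -> float:
        """Earliest finish time of a stage, assuming no resource contention."""
        if memo is None:
            memo = {}
        if stage.name not in memo:
            start = max((finish_time(p, memo) for p in stage.parents),
                        default=0.0)
            memo[stage.name] = start + stage.duration
        return memo[stage.name]

    # Toy example: two parallel map stages feeding a join, then an aggregation.
    m1 = Stage("map_A", 30.0)
    m2 = Stage("map_B", 45.0)
    jn = Stage("join", 20.0, parents=(m1, m2))
    ag = Stage("aggregate", 10.0, parents=(jn,))

    print(finish_time(ag))  # 75.0 = max(30, 45) + 20 + 10

Under a bounded number of executors, stages compete for resources and this simple critical-path bound no longer holds; asking whether every stage can complete within a given deadline then becomes the reachability question that the model checker answers.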