Towards dynamic SQL compilation in Apache Spark
2020
Big-data systems have gained significant momentum, and Apache Spark is becoming a de facto standard for modern data analytics. Spark relies on code generation to optimize the execution performance of SQL queries over a variety of data sources. Although its runtime is already efficient, Spark's code generation still suffers from significant overhead related to data de-serialization during query execution. This penalty is especially pronounced when applications operate on human-readable data formats such as CSV or JSON.