Columnar objects: improving the performance of analytical applications

2015 
As data volumes grow, so does the demand to use that data in analytical applications that support informed decisions. Unfortunately, object-oriented runtimes experience performance problems when dealing with large data volumes. Similar problems have been addressed by column-oriented in-memory databases, whose memory layout is tailored to analytical workloads. As a result, data storage and processing are often delegated to such a database. However, the more domain logic is moved to this separate system, the more of the benefits of object orientation are lost. We propose modifications to dynamic object-oriented runtimes that store collections of objects in a column-oriented memory layout and use a JIT compiler to exploit the adjusted layout by mapping object traversal to array operations. We implemented our concept in PyPy, a Python interpreter equipped with a tracing JIT compiler. We show that analytical algorithms expressed in object-oriented code run up to three times faster with our optimizations, without substantially compromising the object-oriented paradigm. We hope that extending these concepts will mitigate some of the problems that originate from the paradigm mismatch between object-oriented runtimes and databases.
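The idea of a column-oriented layout behind an object-oriented interface can be illustrated with a minimal user-level sketch in plain Python. This is not the implementation described above, which changes the runtime and its JIT so that the layout is transparent to application code; the names `ColumnarCollection`, `RowView`, and `column` are hypothetical and exist only for illustration. The sketch keeps one typed array per attribute and hands out lightweight row views, so the same aggregate can be computed either by traversing objects or by scanning the columns directly.

```python
from array import array


class ColumnarCollection:
    """Hypothetical helper: stores records attribute by attribute."""

    def __init__(self, **field_typecodes):
        if not field_typecodes:
            raise ValueError("at least one field is required")
        # One contiguous typed array per attribute, e.g. price -> array('d').
        self._columns = {
            name: array(code) for name, code in field_typecodes.items()
        }

    def append(self, **values):
        # Every record must supply exactly the declared fields.
        if set(values) != set(self._columns):
            raise ValueError("expected fields: %s" % sorted(self._columns))
        for name, value in values.items():
            self._columns[name].append(value)

    def __len__(self):
        # All columns always have the same length.
        return len(next(iter(self._columns.values())))

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        return RowView(self, index)

    def __iter__(self):
        # Object-style traversal: yield one row view per stored record.
        return (RowView(self, i) for i in range(len(self)))

    def column(self, name):
        # Direct access to the underlying array of a single attribute.
        return self._columns[name]


class RowView:
    """Hypothetical proxy that makes one row look like an ordinary object."""

    __slots__ = ("_collection", "_index")

    def __init__(self, collection, index):
        self._collection = collection
        self._index = index

    def __getattr__(self, name):
        # Called only for names that are not slots, i.e. the data fields.
        try:
            return self._collection.column(name)[self._index]
        except KeyError:
            raise AttributeError(name)


items = ColumnarCollection(price="d", quantity="l")
items.append(price=2.5, quantity=4)
items.append(price=1.0, quantity=10)
items.append(price=4.0, quantity=1)

# Object-oriented traversal over row views.
total_objects = sum(item.price * item.quantity for item in items)

# The same aggregate expressed as a scan over two columns.
total_columns = sum(
    p * q for p, q in zip(items.column("price"), items.column("quantity"))
)
```

In this emulation the object-style loop still pays for a proxy object and a dictionary lookup on every access. The approach summarized above instead relies on the tracing JIT to turn such traversals into array operations, so application code would not need an explicit `column` call.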