Comprehensive Comparison of LSM Architectures for Spatial Data

2020 
Spatial indexes in traditional relational databases supported spatial queries in the pre-big-data era. However, the volume and ingestion rate of spatial data are increasing rapidly in modern applications. Many big data systems use the LSM-tree as their storage structure to support write-intensive, large-volume workloads, but it is usually optimized for single-dimensional data. Prior research has studied how to support spatial indexes on LSM systems, but has mainly focused on local index organization, that is, how data is organized inside a single LSM component. In this paper, we study various aspects of spatial LSM indexing, including spatial merge policies, which determine when and how spatial components are merged. We consider both stack-based and leveled merge policies, which we have implemented on the same big data system. We evaluate write and read performance on various workloads and discuss our findings and recommendations. A key finding is that leveled policies underperform other merge policies for most types of spatial workloads.