High Throughput Log-Based Replication for Many Small In-Memory Objects

2016 
Online graph analytics and large-scale interactive applications such as social media networks require low-latency access to billions of small data objects. These applications have mostly irregular access patterns, which makes caching insufficient. Hence, a growing number of distributed in-memory systems have been proposed that keep all data in memory at all times. These in-memory systems are typically not optimized for very large numbers of small data objects, which demand new concepts for local and global data management as well as for the fault-tolerance mechanisms required to mask node failures and power outages. In this paper, we propose a novel two-level logging architecture with backup-side version control that enables parallel recovery of in-memory objects after node failures. The presented fault-tolerance approach provides high throughput and minimal memory overhead when working with many small objects. We also present a highly concurrent log cleaning approach to keep logs compact. All proposed concepts have been implemented in the DXRAM system and evaluated using two benchmarks: the Yahoo! Cloud Serving Benchmark and RAMCloud's Log Cleaner benchmark. The experiments show that, for the target application domains, our approach has lower memory overhead than, and outperforms, state-of-the-art in-memory systems, including RAMCloud, Redis, and Aerospike.