NewBalance: Efficient Data Space Management and Algorithmic Optimization for Large-Scale Storage Systems

2017 
Fragmentation usually occurs when data space of original storage nodes has to be reallocated to new added storage nodes during the scale-out evolution of the large-scale storage system. It greatly influences its performance and becomes a challenge to manage the whole space. We present an efficient space management framework, called NewBalance, to reduce fragmentation with the minimum data movement while keeping the storage system load balance. The space management framework has two phases including the collection phase and the allocation phase. For the collection phase, we propose a novel algorithm, called the greedy bi-direction collector, which collects enough space for the new storage nodes. For the allocation phase, we formally represent it as a variant of the bin packing problem and then utilize some bin packing heuristics including the first fitting and the best fitting to allocate collected intervals to new added storage nodes. The experimental results show that the amount of intervals can be reduced by 20%~55% and our algorithmic optimization improves the data lookup performance by at least 10% and the scale-out performance by 2X~3X.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    0
    References
    0
    Citations
    NaN
    KQI
    []