Finesse: Fine-Grained Feature Locality based Fast Resemblance Detection for Post-Deduplication Delta Compression

2019 
In storage systems, delta compression is often used as a complementary data reduction technique for data deduplication because it can eliminate redundancy among non-duplicate but highly similar chunks. Currently, one approach dominates the computation of data similarity for detecting delta compression candidates. But our observations suggest that this approach is compute-intensive and can be simplified by exploiting the fine-grained feature locality existing among highly similar chunks. Therefore, we propose Finesse, a fine-grained feature-locality-based fast resemblance detection approach that divides each chunk into several fixed-sized subchunks, computes features from these subchunks individually, and then groups the features into super-features. Experimental results show that, compared with the state-of-the-art approach, Finesse accelerates the similarity computation for resemblance detection by 3.2× to 3.5× and increases the final throughput of a deduplicated and delta compressed prototype system by 41% to 85%, while achieving comparable compression ratios.
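The three steps named in the abstract (split a chunk into fixed-sized subchunks, compute one feature per subchunk, group the features into super-features) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the defaults of 12 subchunks and 3 super-features, the 8-byte sliding window, the use of BLAKE2b as the hash, and the rule of grouping consecutive features are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import hashlib


def _h64(data: bytes) -> int:
    """64-bit hash of a byte string (BLAKE2b is an assumed stand-in hash)."""
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def subchunk_feature(sub: bytes, window: int = 8) -> int:
    """Illustrative feature: the maximum hash over all sliding windows."""
    if len(sub) <= window:
        return _h64(sub)
    return max(_h64(sub[i:i + window]) for i in range(len(sub) - window + 1))


def super_features(chunk: bytes, n_sub: int = 12, n_sf: int = 3) -> list[int]:
    """Split into n_sub fixed-sized subchunks, then group features into n_sf SFs."""
    if n_sub % n_sf != 0:
        raise ValueError("n_sub must be a multiple of n_sf")
    size = -(-len(chunk) // n_sub)  # ceiling division: subchunk length
    feats = [
        subchunk_feature(chunk[i * size:(i + 1) * size]) for i in range(n_sub)
    ]
    per_group = n_sub // n_sf
    sfs = []
    for g in range(n_sf):
        # Assumed grouping rule: hash each run of consecutive features together.
        group = feats[g * per_group:(g + 1) * per_group]
        sfs.append(_h64(b"".join(f.to_bytes(8, "big") for f in group)))
    return sfs


def resembles(a: bytes, b: bytes) -> bool:
    """Treat two chunks as delta candidates if any super-feature matches."""
    return any(x == y for x, y in zip(super_features(a), super_features(b)))


if __name__ == "__main__":
    base = bytes((i * 37 + 11) % 256 for i in range(1200))
    edited = bytearray(base)
    edited[5] ^= 0xFF  # in-place change confined to the first subchunk
    print(resembles(base, bytes(edited)))
```

In this sketch an in-place edit touches only the subchunk that contains it, so the super-features built from the other subchunks still match and the two chunks are reported as similar. Insertions or deletions that shift subchunk boundaries are not handled here.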