Efficient data deduplication scheme for scale-out distributed storage

2021 
Abstract Nowadays, many sources generate large amounts of data, since almost all devices in our environment are connected and communicate with each other. Data storage is becoming increasingly important as the demand for data grows exponentially over time, and storing and managing this growing volume of unstructured data is one of the great challenges. As computing power increases, more efficient data storage becomes essential, and space efficiency is one of the primary concerns in a storage system. Recent storage systems, such as scale-out distributed storage systems, modern data centers, and cloud storage environments, control their storage costs by eliminating storage waste and reducing the need for dedicated storage capacity. Compression techniques have become essential in storage systems because they save large amounts of storage space and help maintain system reliability; large storage footprints otherwise create challenges for both reliability and capacity. As computer systems take on more and more workload responsibilities in critical processes, the need for a better understanding of system reliability keeps increasing. This chapter presents a system that improves storage capacity utilization and reliability through a balanced data deduplication scheme.
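To make the core idea concrete, the following is a minimal sketch of content-addressed, block-level deduplication, the general technique the abstract refers to. The fixed 4 KiB chunk size, the SHA-256 fingerprints, and the DedupStore class are illustrative assumptions for this sketch, not the specific scheme described in the chapter.

```python
# A minimal sketch of block-level data deduplication, assuming fixed-size
# chunking and SHA-256 fingerprints. Chunk size and store layout here are
# illustrative choices, not the chapter's actual design.
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size in bytes


class DedupStore:
    """Content-addressed store: identical chunks are kept only once."""

    def __init__(self):
        self.chunks = {}   # fingerprint -> chunk bytes (stored once)
        self.recipes = {}  # object name -> ordered list of fingerprints

    def put(self, name: str, data: bytes) -> None:
        recipe = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            fp = hashlib.sha256(chunk).hexdigest()
            # Store the chunk only if its fingerprint has not been seen,
            # so duplicate chunks consume no additional space.
            self.chunks.setdefault(fp, chunk)
            recipe.append(fp)
        self.recipes[name] = recipe

    def get(self, name: str) -> bytes:
        # Reassemble the object from its ordered chunk recipe.
        return b"".join(self.chunks[fp] for fp in self.recipes[name])


store = DedupStore()
store.put("a.txt", b"hello world" * 1000)
store.put("b.txt", b"hello world" * 1000)  # duplicate content
assert store.get("a.txt") == store.get("b.txt")
print(f"unique chunks stored: {len(store.chunks)}")
```

In this sketch the space savings come from the fingerprint index: writing the duplicate object `b.txt` adds only recipe entries, not new chunks, which is the storage-waste elimination the abstract describes at a high level.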