Latency Comparison of Replication and Coding for Data Access under Random Scheduling

2018 
Replication and coding are two popular approaches to combat failures in large-scale distributed storage systems. Access latency in such systems greatly impacts user experience. Compared with redundant scheduling, random scheduling can reduce the system load, resulting in lower latency, especially when the request arrival rate is high. Besides, random scheduling can achieve near optimal load balancing at a reduced communication cost. A latency comparison of replication and coding is of great importance. Although it has been much argued that coding can achieve lower latency than replication, the latency comparison under random scheduling is still lacking. In this work, based on random scheduling, we analyze the latency of replication and coding and find that, when each request desires all data in a codeword, they have the same average latency. Additionally, we study a general case that users only request a subset of the erasure-coded content, and propose flexible random scheduling for coding, which can lower latency and realize load balancing. Our analysis demonstrates that, in this case, replication achieves lower latency when the system load is low and suffers higher latency when the system load becomes high. With real service time traces from Amazon S3, we conduct trace-driven simulations to validate our analysis.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    23
    References
    0
    Citations
    NaN
    KQI
    []