Samya: A Geo-Distributed Data System for High Contention Aggregate Data

2021 
Geo-distributed databases are the state-of-the-art tools for managing cloud-based data. But maintaining hot records in geo-distributed databases such as Google’s Spanner can be expensive, because each update is synchronized across a majority of replicas. Frequent synchronization is an obstacle to achieving high throughput for contentious, update-heavy workloads. While such synchronization is inevitable for complex data types, simple data types such as aggregate data can benefit from reduced synchronization. To this end, we propose an alternative data management system, Samya, to manage aggregate cloud resource usage data. Samya disaggregates available resources and stores fractions of these resources across geo-distributed sites. Disaggregation allows sites to serve client requests independently, without synchronizing on each update. Samya incorporates a learning mechanism to predict future resource demands. If the predicted demand cannot be satisfied locally, a synchronization protocol, Avantan, is executed to redistribute available resources in the system. Avantan is a novel fault-tolerant consensus protocol in which sites agree on the global availability of resources prior to redistribution. Experiments conducted on Google Cloud Platform show that disaggregating data and reducing synchronization allow Samya to commit 16x to 18x more transactions than state-of-the-art cloud geo-distributed systems such as Spanner and CockroachDB.
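
The disaggregation idea can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `Site` class, the `redistribute` function, and the proportional split are all hypothetical, and the redistribution step here is a centralized stand-in that ignores the consensus and fault-tolerance aspects of Avantan. Each site holds a fraction of a global resource pool and serves requests from its local share; when a site cannot meet demand, the remaining resources are pooled and re-split according to a predicted demand that is simply passed in as a dictionary.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical site holding a local fraction of the global resource pool."""
    name: str
    tokens: int

    def acquire(self, amount: int) -> bool:
        """Serve a request from the local share only; no cross-site coordination."""
        if amount <= self.tokens:
            self.tokens -= amount
            return True
        return False

    def release(self, amount: int) -> None:
        """Return resources to the local share."""
        self.tokens += amount


def redistribute(sites: list[Site], predicted_demand: dict[str, int]) -> None:
    """Pool all remaining tokens and split them in proportion to predicted demand.

    Assumption: a centralized stand-in for the redistribution step. A real
    protocol would have the sites agree on the global total first.
    """
    total = sum(s.tokens for s in sites)
    demand_sum = sum(predicted_demand[s.name] for s in sites)
    if demand_sum == 0:
        return  # no demand predicted anywhere; leave shares untouched
    shares = [total * predicted_demand[s.name] // demand_sum for s in sites]
    # Integer division may leave a few tokens unassigned; hand them out one by one
    # so the global total is conserved.
    for i in range(total - sum(shares)):
        shares[i % len(shares)] += 1
    for site, share in zip(sites, shares):
        site.tokens = share
```

In this sketch, a site that rejects a request locally would trigger `redistribute` with the latest demand prediction and then retry; the global number of tokens stays constant across the redistribution.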