Big Data Resource Management & Networks: Taxonomy, Survey, and Future Directions

2021 
Big Data (BD) platforms have a long tradition of leveraging trends and technologies from the broader computer network and communication community. For several years, homogeneous clusters of dedicated servers were the dominant paradigm in BD networks. In recent years, the BD landscape has changed, bringing different deployment architectures with various network models. This trend has created opportunities and challenges that push BD practitioners toward the next-generation BD vision, in particular addressing BD velocity with batch and micro-batch processing. Nevertheless, the literature lacks an extensive study of the impacts of adopting these new deployment architectures, even though the topic holds significant research interest. This study addresses that gap, offering a comprehensive review of the architectural elements of BD batch query deployment models and environments. A novel taxonomy is proposed to classify these models based on their underlying communication systems. We first discuss the batch query processing requirements as comparison criteria for BD communication models and compare their salient features. We then present the benefits and challenges of these environments compared with traditional on-premise dedicated BD clusters. Thereafter, we provide an extensive survey of modern BD deployment architectures, categorizing them based on their underlying infrastructure. Finally, we outline several directions for future research on improving the state of the art of the BD landscape and provide recommendations for BD practitioners on emerging environments supporting BD applications and general large-scale data analytics.