A two-phase virtual machine placement policy for data-intensive applications in cloud

2021 
Abstract Cloud computing and big data are two technologies whose combination yields many benefits as well as challenges. One of the most significant challenges is the traffic that data-intensive jobs produce within a data center. One way to manage this traffic is to optimize the placement of virtual machines (VMs) on hosts. However, placing VMs compactly to reduce the communication cost can negatively affect other aspects such as host utilization and load balancing. In this paper, we aim to strike a balance between optimizing host utilization and communication cost while also considering load balancing. We investigate the VM placement problem by modeling it as the minimum-weight K-vertex-connected induced subgraph problem. We prove that the problem is NP-hard and propose a novel two-phase strategy for placing VMs on hosts. In the first phase, to balance traffic and workload among racks, we rank all racks using a fuzzy inference system and select the best ones with a linear programming model. In the second phase, we introduce a novel greedy algorithm that assigns each VM to a host according to a proposed communication cost metric. We evaluate our approach using the CloudSim simulator; the results show that the two-phase strategy balances host utilization and network traffic. It keeps more than 80 percent of the traffic rack-local while reducing the average network link saturation to almost 40 percent with low variance. In addition, the number of hosts in use grows linearly with the number of VMs, which leads to higher host utilization.
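The two-phase idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the fuzzy rules, the linear program, or the communication cost metric, so the sketch swaps in a plain weighted score for rack ranking, a capacity-based cutoff for rack selection, and a traffic-times-distance cost for the greedy step. All names (`Host`, `rank_racks`, `select_racks`, `place_vms`), weights, and hop costs are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    rack: str
    capacity: int                      # free VM slots (hypothetical unit)
    vms: list = field(default_factory=list)


def distance(a: Host, b: Host) -> int:
    """Hypothetical hop cost: 0 same host, 1 same rack, 3 across racks."""
    if a.name == b.name:
        return 0
    return 1 if a.rack == b.rack else 3


def rank_racks(rack_load: dict, rack_traffic: dict) -> list:
    """Phase 1 stand-in: order racks by a weighted score, lower is better.

    The paper uses a fuzzy inference system here; this equal-weight sum
    is only a placeholder for illustration.
    """
    return sorted(rack_load, key=lambda r: 0.5 * rack_load[r] + 0.5 * rack_traffic[r])


def select_racks(ranked: list, hosts: list, n_vms: int) -> list:
    """Phase 1 stand-in for the linear program: take the best-ranked
    racks until their free slots cover all VMs."""
    chosen, slots = [], 0
    for rack in ranked:
        if slots >= n_vms:
            break
        chosen.append(rack)
        slots += sum(h.capacity for h in hosts if h.rack == rack)
    return chosen


def place_vms(vms: list, traffic: dict, hosts: list) -> dict:
    """Phase 2 stand-in: greedily put each VM on the host that adds the
    least communication cost to the VMs already placed."""
    placement = {}
    for vm in vms:
        best, best_cost = None, None
        for host in hosts:
            if len(host.vms) >= host.capacity:
                continue               # host is full
            cost = sum(
                traffic.get(frozenset((vm, other)), 0) * distance(host, placement[other])
                for other in placement
            )
            if best_cost is None or cost < best_cost:
                best, best_cost = host, cost
        if best is None:
            raise ValueError(f"no free host for {vm}")
        best.vms.append(vm)
        placement[vm] = best
    return placement


# Toy data: three hosts in two racks, three VMs with pairwise traffic.
hosts = [Host("h1", "r1", 2), Host("h2", "r1", 2), Host("h3", "r2", 2)]
ranked = rank_racks({"r1": 0.2, "r2": 0.7}, {"r1": 0.1, "r2": 0.5})
racks = select_racks(ranked, hosts, n_vms=3)
candidates = [h for h in hosts if h.rack in racks]
traffic = {frozenset(("a", "b")): 10, frozenset(("b", "c")): 5}
placement = place_vms(["a", "b", "c"], traffic, candidates)
```

In this toy run only rack `r1` is selected, the heavily communicating VMs `a` and `b` share host `h1`, and `c` spills to `h2` in the same rack, so all traffic stays rack-local. Processing VMs in input order is a simplification; a different ordering can change the greedy result.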