An Effective Strategy for Improving Small File Problem in Distributed File System

2015 
Distributed file systems such as HDFS and DFS are adopted to support cloud storage and are designed to optimize access to large files. Unfortunately, the problem of massive numbers of small files has been neglected, and it seriously restricts the performance of distributed file systems. To mitigate, and ideally solve, the small file problem, this paper defines the user access task. The correlations among access tasks, applications, and accessed files are constructed using an improved PLSA (probabilistic latent semantic analysis), and the research object is shifted from the file level to the task level. An effective strategy is then proposed for improving the small file problem in distributed file systems. The strategy merges small files in terms of access tasks and selects prefetching targets based on the transition probabilities of the tasks. Finally, a system efficiency analysis model is established, and experimental results, compared with original HDFS, HAR, and the schemes of Dong, demonstrate that the proposed strategy effectively reduces the metadata server (MDS) workload and the request response delay.
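The two steps of the strategy, merging small files per access task and choosing prefetch targets from task transition probabilities, can be illustrated with a minimal sketch. It is not the paper's implementation: the task labels are assumed to be already known (the paper derives them with an improved PLSA, which is not reproduced here), the transition probabilities are estimated by simple first-order counting over an observed task sequence, and the function names and the `threshold` parameter are hypothetical.

```python
from collections import Counter, defaultdict


def merge_by_task(file_to_task):
    """Group small files by the access task that uses them (one merged unit per task).

    Assumption: file_to_task maps each file name to a single, known task ID.
    """
    merged = defaultdict(list)
    for name, task in sorted(file_to_task.items()):
        merged[task].append(name)
    return dict(merged)


def transition_probabilities(task_sequence):
    """Estimate P(next task | current task) by counting consecutive pairs.

    Assumption: first-order counting stands in for the paper's own model.
    """
    pair_counts = defaultdict(Counter)
    for current, nxt in zip(task_sequence, task_sequence[1:]):
        pair_counts[current][nxt] += 1
    probs = {}
    for current, counter in pair_counts.items():
        total = sum(counter.values())
        probs[current] = {nxt: n / total for nxt, n in counter.items()}
    return probs


def prefetch_targets(probs, current_task, threshold=0.5):
    """Return tasks whose transition probability from current_task is at least threshold."""
    candidates = probs.get(current_task, {})
    return sorted(t for t, p in candidates.items() if p >= threshold)


if __name__ == "__main__":
    merged = merge_by_task({"f1": "A", "f2": "B", "f3": "A"})
    probs = transition_probabilities(["A", "B", "A", "B", "A", "C"])
    print(merged)
    print(prefetch_targets(probs, "A"))
```

In this sketch, the merged file of every task returned by `prefetch_targets` would be the candidate to load ahead of the next request; the threshold trades wasted prefetches against missed ones and would need tuning against a real workload.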