Data adaptively storing approach for Hadoop distributed file system

2017 
Hadoop Distributed File System (HDFS) is an open-source framework that is widely used in cloud storage systems. Owing to advances in storage technology and cost considerations, today's storage systems usually contain both fast devices (such as SSDs) and slow devices (such as hard disks). To optimize performance, frequently accessed data should be stored on fast devices and infrequently used data on slow devices. To address this problem, the latest HDFS provides different storage policies for classifying data. However, traditional methods first assign the same storage policy to all data and then adjust it according to observed access frequency, which usually cannot deliver a high hit rate on fast devices, because frequently accessed data cannot be labeled at upload time. In this paper, we present a data adaptively storing approach (DASA) for improving the hit rate of fast devices. DASA first predicts each file's hot value when it is uploaded, and then sets the storage policy according to that value, so that frequently accessed data are assigned to fast devices. The evaluation results show that DASA achieves a much better hit rate than the traditional method, which illustrates that DASA improves HDFS performance.
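The policy-assignment step described above can be sketched as follows. This is a hypothetical illustration, not the paper's actual prediction model: the hot-value thresholds and the `choose_storage_policy` function are assumptions, while the policy names (`ALL_SSD`, `ONE_SSD`, `HOT`, `COLD`) are HDFS's built-in storage policies.

```python
def choose_storage_policy(hot_value: float) -> str:
    """Map a predicted hot value in [0, 1] to an HDFS storage policy.

    The thresholds below are illustrative assumptions; DASA would learn
    or tune the mapping from predicted access frequency to policy.
    """
    if hot_value >= 0.8:
        return "ALL_SSD"   # all replicas on fast devices (SSD)
    if hot_value >= 0.5:
        return "ONE_SSD"   # one replica on SSD, the rest on disk
    if hot_value >= 0.2:
        return "HOT"       # HDFS default: all replicas on disk
    return "COLD"          # archival tier for rarely accessed data
```

In a real deployment, the chosen policy would then be applied to the newly uploaded file, for example with HDFS's `hdfs storagepolicies -setStoragePolicy -path <file> -policy <name>` command, so the NameNode places its blocks on the corresponding device tier.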