ROS: A Rack-based Optical Storage System with Inline Accessibility for Long-Term Data Preservation

2018 
The combination of the explosive growth in digital data and the demand to preserve much of these data in the long term has made it imperative to find a more cost-effective way than HDD arrays and a more easily accessible way than tape libraries to store massive amounts of data. While modern optical discs are capable of guaranteeing more than 50-year data preservation without media replacement, individual optical discs’ lack of the performance and capacity relative to HDDs or tapes has significantly limited their use in datacenters. This article presents a Rack-scale Optical disc library System, or ROS in short, which provides a PB-level total capacity and inline accessibility on thousands of optical discs built within a 42U Rack. A rotatable roller and robotic arm separating and fetching discs are designed to improve disc placement density and simplify the mechanical structure. A hierarchical storage system based on SSDs, hard disks, and optical discs is proposed to effectively hide the delay of mechanical operation. However, an optical library file system (OLFS) based on FUSE is proposed to schedule mechanical operation and organize data on the tiered storage with a POSIX user interface to provide an illusion of inline data accessibility. We further optimize OLFS by reducing unnecessary user/kernel context switches inheriting from legacy FUSE framework. We evaluate ROS on a few key performance metrics, including operation delays of the mechanical structure and software overhead in a prototype PB-level ROS system. The results show that ROS stacked on Samba and FUSE as network-attached storage (NAS) mode almost saturates the throughput provided by underlying samba via 10GbE network for external users, as well as in this scenario provides about 53ms file write and 15ms read latency, exhibiting its inline accessibility. Besides, ROS is able to effectively hide and virtualize internal complex operational behaviors and be easily deployable in datacenters.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    28
    References
    13
    Citations
    NaN
    KQI
    []