LVFS: A scalable big data scientific storage system

2013 
LVFS is a scalable virtual file storage system developed for a class of scientific data systems that accumulate petabytes of data over time, to the point where the volume seriously degrades response times for user requests. The system has been operational in a real use case, the NASA MODIS Adaptive Processing System (MODAPS), where it has been shown to double data throughput relative to the original system, owing to better performance and easier load balancing. The operational life of MODAPS has now extended beyond a decade, and the system holds more than four petabytes of data in billions of files on more than 500 disks attached to multiple storage nodes. MODAPS is the processing system that delivers calibrated Level 1 data from the MODIS instruments on two NASA satellites launched over a decade ago; each instrument senses 36 multispectral visible and infrared channels. The life cycle of these systems is typical of many scientific instruments and experiments, which continue to generate useful archival data well beyond their original expected lifetimes and must still meet current scientific user needs. The Level 1 Atmosphere Archive and Distribution System (LAADS) is responsible for distributing the products produced by MODAPS. The LAADS Virtual File System (LVFS) has now replaced parts of LAADS and is responsible for the read-only distribution of all LAADS data to the public. In this paper, we describe the unique design of LVFS and our ongoing work to incorporate a distributed hash-based architecture into that design, with the aim of transforming LVFS into a full scientific storage architecture scalable to exabyte sizes.