CompStor: An In-storage Computation Platform for Scalable Distributed Processing

2018 
The explosion of data-centric and data-dependent applications requires new storage devices, interfaces, and software stacks. Big data analytics solutions such as Hadoop, MapReduce, and Spark have addressed the performance challenge with a distributed architecture built on a paradigm of moving computation closer to data. In this paper, we describe a novel approach that pushes the "move computation to data" paradigm to its ultimate limit by enabling highly efficient and flexible in-storage processing in solid-state drives (SSDs). We have designed CompStor, an FPGA-based SSD that implements computational storage through a software stack (devices, protocol, interface, software, and systems) and dedicated hardware for in-storage processing, including a quad-core ARM processor subsystem. The dedicated hardware resources provide in-storage data analytics capability without degrading the performance of common storage device functions such as read, write, and trim. Experimental results show up to 3X energy savings for some applications compared with running them on the host CPU. To the best of our knowledge, the 24TB CompStor SSD is the first SSD capable of supporting in-storage computation while running an operating system, enabling all types of applications and Linux shell commands to be executed in place without modification.