Catalina: In-Storage Processing Acceleration for Scalable Big Data Analytics

2019 
Cloud applications play an increasingly crucial role in big data analytics. New use cases such as autonomous cars and edge computing call for novel approaches that mix heterogeneous computing and machine learning. These applications typically process petabyte-scale datasets and therefore require low-power, scalable storage that provides low-latency, high-throughput data access. While data centers have been migrating from legacy HDDs and SATA SSDs to high-throughput, low-latency NVMe SSDs, data bottlenecks still appear as capacity scales. One approach to this problem is in-storage processing (ISP): performing the computation within the storage device itself, eliminating the need to move the data. In this paper, we investigated the deployment of storage units with embedded low-power application processors along with FPGA-based reconfigurable hardware accelerators to address both performance and energy efficiency. To this end, we developed a high-capacity solid-state drive (SSD) named Catalina, equipped with a quad-core ARM A53 processor running a Linux operating system along with a highly efficient FPGA accelerator for running applications in place. We evaluated our proposed approach in a case study using Faiss, a similarity search library.