UDP: a programmable accelerator for extract-transform-load workloads and more

2017 
Big data analytic applications give rise to large-scale extract-transform-load (ETL) as a fundamental step to transform new data into a native representation. ETL workloads pose significant performance challenges on conventional architectures, so we propose the design of the unstructured data processor (UDP), a software-programmable accelerator that includes multi-way dispatch, variable-size symbol support, flexible-source dispatch (stream buffer and scalar registers), and memory addressing to accelerate ETL kernels for both current and novel future encoding and compression. Specifically, UDP excels at branch-intensive and symbol- and pattern-oriented workloads, and can offload them from CPUs. To evaluate UDP, we use a set of data processing workloads inspired by ETL, but broad enough to also apply to query execution, stream processing, and intrusion detection/monitoring. A single UDP accelerates these data processing tasks 20-fold (geometric mean; the largest increase is from 0.4 GB/s to 40 GB/s) and improves performance per watt by a geometric mean of 1,900-fold. A UDP ASIC implementation in 28 nm CMOS shows a UDP logic area of 3.82 mm² (8.69 mm² with 1 MB local memory) and logic power of 0.149 W (0.864 W with 1 MB local memory); both are much smaller than those of a single core.

CCS CONCEPTS
• Information systems → Extraction, transformation and loading; • Computer systems organization → Parallel architectures; • Hardware → Application specific processors; • Theory of computation → Pattern matching;
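As a rough software illustration of two ideas named in the abstract, multi-way dispatch and variable-size symbols, the sketch below replaces a chain of conditional branches with a single table lookup per symbol. It is not the UDP instruction set or its hardware mechanism: the names `read_symbols`, `run`, and `AB_TABLE`, the table layout, and the restriction to symbol widths that divide 8 are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical transition table: (state, symbol) -> next state.
# Illustration only; this is not the UDP's actual encoding.
Table = Dict[Tuple[int, int], int]


def read_symbols(data: bytes, width: int) -> List[int]:
    """Split a byte stream into `width`-bit symbols, most significant bits first.

    Assumption for simplicity: width is 1, 2, 4, or 8 so symbols never
    straddle a byte boundary.
    """
    if width not in (1, 2, 4, 8):
        raise ValueError("width must divide 8 in this sketch")
    mask = (1 << width) - 1
    symbols = []
    for byte in data:
        for shift in range(8 - width, -1, -width):
            symbols.append((byte >> shift) & mask)
    return symbols


def run(table: Table, symbols: List[int], start: int = 0, default: int = 0) -> int:
    """Consume symbols one at a time; each step is one table lookup
    (a multi-way dispatch) rather than a sequence of if/elif tests."""
    state = start
    for sym in symbols:
        state = table.get((state, sym), default)
    return state


# Example table that ends in state 2 when the input ends with "ab".
AB_TABLE: Table = {
    (0, ord("a")): 1,
    (1, ord("a")): 1,
    (1, ord("b")): 2,
    (2, ord("a")): 1,
}

if __name__ == "__main__":
    print(read_symbols(b"\xa5", 4))                 # 4-bit symbols
    print(run(AB_TABLE, read_symbols(b"xab", 8)))   # 8-bit symbols
```

The point of the sketch is only that the next state is selected directly by the current symbol, so the work per symbol does not grow with the number of possible targets; how UDP realizes this in hardware is described in the paper itself, not here.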