Serverless Data Analytics in the IBM Cloud

Josep Sampé, Gil Vernik, Marc Sánchez Artigas, Pedro García-López

2018

The rise of serverless computing has, somewhat unexpectedly, also begun to "democratize" massive-scale data parallelism. This trend, heralded by PyWren, aims to let untrained users execute single-machine code in the cloud at massive scale through platforms such as AWS Lambda. Inspired by this vision, this industry paper presents IBM-PyWren, which continues the pioneering work begun by PyWren in this field. IBM-PyWren is not, however, a mere reimplementation of PyWren's API atop IBM Cloud Functions. Rather, it should be viewed as an advanced extension of PyWren that runs broader MapReduce jobs. We describe the design, innovative features (API extensions, data discovery and partitioning, composability, etc.), and performance of IBM-PyWren, along with the challenges encountered during its implementation.

Keywords:

Data parallelism
Operating system
Cloud computing
Data analysis
IBM
Composability
Computer data storage
Computer science
