High Performance Serverless Architecture for Deep Learning Workflows

Dheeraj Chahal, Manju Ramesh, Ravi Ojha, Rekha S. Singhal

2021

Serverless architecture is a rapidly growing paradigm for deploying deep learning applications that perform ephemeral computing and serve bursty workloads. It promises automatic scaling and cost efficiency for deep learning inference while minimizing operational logic. However, serverless computing is stateless and places constraints on local resources. Hence, deploying complex deep learning applications containing large models, frameworks, and libraries is a challenge. In this work, we discuss a methodology and architecture for migrating deep vision algorithms and model-based applications to a serverless computing platform. We have tested our methodology using AWS infrastructure (AWS Lambda, Provisioned Concurrency, VPC endpoint, S3, and EFS) to mitigate the challenges of deploying a composition of APIs containing large deep learning models and frameworks. We evaluate the performance and cost of our architecture for a real-life enterprise application used for document processing.
