High Performance Serverless Architecture for Deep Learning Workflows

2021 
Serverless architecture is a rapidly growing paradigm for deploying deep learning applications that perform ephemeral computing and serve bursty workloads. It promises automatic scaling and cost efficiency for deep learning inference while minimizing operational logic. However, serverless computing is stateless and constrains local resources, so deploying complex deep learning applications that contain large models, frameworks, and libraries is a challenge. In this work, we discuss a methodology and architecture for migrating applications based on deep vision algorithms and models to a serverless computing platform. We have tested our methodology on AWS infrastructure (AWS Lambda, Provisioned Concurrency, VPC endpoint, S3, and EFS) to mitigate the challenges of deploying a composition of APIs containing large deep learning models and frameworks. We evaluate the performance and cost of our architecture for a real-life enterprise application used for document processing.
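One way to picture how a stateless function can work around its local resource limits is to keep the large model on a shared file system and cache it in memory for the lifetime of a warm container. The following is a minimal sketch under stated assumptions, not the authors' implementation: the `MODEL_PATH` variable, the `/mnt/models/model.bin` location, and the `load_model` helper are hypothetical, and reading raw bytes stands in for a real framework's model-loading call.

```python
import os
import time

# Hypothetical location: assumes a shared file system (e.g. EFS) is mounted
# into the function under /mnt/. The path and variable name are illustrative.
MODEL_PATH = os.environ.get("MODEL_PATH", "/mnt/models/model.bin")

# Module-level state persists across invocations while the container is warm.
_model_cache = {}


def load_model(path):
    """Read the model from the shared file system once per container."""
    if path not in _model_cache:
        with open(path, "rb") as f:
            # Stand-in for a framework-specific load call (assumption).
            _model_cache[path] = f.read()
    return _model_cache[path]


def handler(event, context):
    """Lambda-style entry point taking (event, context)."""
    path = event.get("model_path", MODEL_PATH)
    cold = path not in _model_cache  # True only on the first load in this container
    start = time.perf_counter()
    model = load_model(path)
    load_seconds = time.perf_counter() - start
    return {
        "cold_load": cold,
        "model_bytes": len(model),
        "load_seconds": load_seconds,
    }
```

In this sketch only the first invocation in a container pays the cost of reading the model; later invocations reuse the cached copy. Keeping containers initialized ahead of time, as Provisioned Concurrency does, would move that first load out of the request path.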