No DNN Left Behind: Improving Inference in the Cloud with Multi-Tenancy.

Amit Samanta, Suhas Shrinivasan, Antoine Kaufmann, Jonathan Mace

2019

With the rise of machine learning, inference on deep neural networks (DNNs) has become a core building block on the critical path of many cloud applications. Applications today rely on isolated, ad-hoc deployments that force users to compromise on consistent latency, elasticity, or cost-efficiency, depending on workload characteristics. We propose elevating DNN inference to a first-class cloud primitive provided by a shared multi-tenant system, akin to cloud storage and cloud databases. A shared system enables cost-efficient operation with consistent performance across the full spectrum of workloads. We argue that DNN inference is an ideal candidate for a multi-tenant system because of its narrow, well-defined interface and predictable resource requirements.

Keywords:

Cloud storage
Critical path method
Cloud computing
Multitenancy
Artificial neural network
Workload
Distributed computing
Computer science
Inference
First class
Latency (engineering)
