Towards Accelerating Generic Machine Learning Prediction Pipelines

Alberto Scolari, Yunseong Lee, Markus Weimer, Matteo Interlandi

2017

Machine learning models are often composed of sequences of transformations. While this design makes it easy to decompose and accelerate single model components at training time, prediction requires low latency and high performance predictability, so end-to-end runtime optimization and acceleration are needed to meet these goals. This paper sheds some light on the problem using a production-like model, showing that redesigning model pipelines for efficient execution over CPUs and FPGAs can yield severalfold performance improvements.

Keywords:

Real-time computing
Parallel computing
Field-programmable gate array
Computer science
Latency (engineering)
Predictability
Machine learning
Artificial intelligence
Pipeline transport
Acceleration
training time
single model
