ApproxML: efficient approximate ad-hoc ML models through materialization and reuse

Sona Hasani, Faezeh Ghaderi, Shohedul Hasan, Saravanan Thirumuruganathan, Abolfazl Asudeh, Nick Koudas, Gautam Das

2019

Machine learning (ML) plays a pivotal role in answering complex predictive analytic queries. Building models over large-scale datasets is one of the most time-consuming parts of the data science pipeline, and during the exploratory phase data scientists are often willing to sacrifice some accuracy to speed it up. In this paper, we propose to demonstrate ApproxML, a system that efficiently constructs approximate ML models for new queries from previously constructed ML models, using the concepts of model materialization and reuse. ApproxML supports a variety of ML models, such as generalized linear models for supervised learning, and K-means and Gaussian mixture models for unsupervised learning.
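
As a rough illustration of what materialization and reuse can look like, the sketch below handles ordinary least squares, a simple linear model. It is not taken from ApproxML: the helper names `materialize` and `reuse`, the synthetic data, and the fixed three-way partitioning are all assumptions made for this example. Each partition stores its sufficient statistics once, and a later ad-hoc query over a subset of partitions is answered by adding the stored statistics instead of rescanning the data.

```python
import numpy as np

def materialize(X, y):
    """Hypothetical helper: precompute least-squares statistics for one partition."""
    return X.T @ X, X.T @ y

def reuse(stats):
    """Hypothetical helper: combine stored statistics and solve the normal equations."""
    gram = sum(s[0] for s in stats)
    moment = sum(s[1] for s in stats)
    return np.linalg.solve(gram, moment)

# Synthetic data (an assumption for this sketch): 300 rows, 3 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.01 * rng.normal(size=300)

# Materialize statistics for three equal-sized partitions ahead of time.
parts = [(X[i:i + 100], y[i:i + 100]) for i in range(0, 300, 100)]
store = [materialize(Xp, yp) for Xp, yp in parts]

# Ad-hoc query covering partitions 0 and 2: reuse the stored statistics only.
w_reused = reuse([store[0], store[2]])

# Baseline: train from scratch on the same rows.
X_q = np.vstack([X[:100], X[200:]])
y_q = np.concatenate([y[:100], y[200:]])
w_scratch, *_ = np.linalg.lstsq(X_q, y_q, rcond=None)
```

In this special case the reused model matches the from-scratch model up to floating-point error, because the query aligns exactly with the materialized partitions and least squares has additive sufficient statistics. Approximation would enter when a query only partially overlaps the stored partitions, or for models such as K-means where no exact additive summary exists.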

Keywords:

Computer science
Reuse
Database
Data mining
Linear regression
k-means clustering
Coreset
Support vector machine
Mixture model
