A Data Quality Management of Chain Stores based on Outlier Detection

Linh Nguyen, Tsukasa Ishigaki

2020

The quality of the data recorded in shops and factories is a key factor in successfully analyzing chain store business data. Data quality management is an important practical issue because data quality varies widely depending on the managers and workers at the many stores in a chain. In this paper, we present a data quality evaluation method for shops in chain businesses based on outlier detection, and we apply it to a dataset collected from real chain stores that provide tire maintenance for vehicles. To evaluate the data quality of each shop, we use truck tire information recorded by the shops at maintenance time, such as tread depth, tread pattern, and distance, and identify low-quality data using outlier detection methods together with reliable experimental data and practical knowledge. Outlier detection methods such as Isolation Forest and one-class Support Vector Machine are applied to detect anomalous tire information, which is then used to calculate the abnormal data rate of each shop. Our results show that, for this kind of data, Isolation Forest outperforms the other methods because it is designed to detect outliers that are few and different. The proposed method can support better maintenance services for customers and help obtain more accurate data from these shops, which will be useful for future research.
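The per-shop abnormal rate described above can be illustrated with a minimal sketch. It is not the authors' implementation: a simple median/MAD rule (modified z-score) stands in for Isolation Forest and one-class Support Vector Machine so that the example runs with the Python standard library alone. The function names, the 3.5 threshold, the shop identifiers, and the tread depth values are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import median


def mad_outlier_flags(values, threshold=3.5):
    """Flag values whose modified z-score exceeds the threshold.

    Stand-in detector (assumption): the paper uses Isolation Forest and
    one-class SVM; a median/MAD rule is used here only for simplicity.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # All values are (nearly) identical, so nothing is flagged.
        return [False] * len(values)
    return [abs(0.6745 * (v - med) / mad) > threshold for v in values]


def abnormal_rate_by_shop(records, threshold=3.5):
    """Return {shop_id: fraction of that shop's records flagged as outliers}.

    records: list of (shop_id, tread_depth_mm) pairs (hypothetical schema).
    Outliers are detected on the pooled data, then counted per shop.
    """
    flags = mad_outlier_flags([depth for _, depth in records], threshold)
    totals = defaultdict(int)
    abnormal = defaultdict(int)
    for (shop, _), flag in zip(records, flags):
        totals[shop] += 1
        abnormal[shop] += int(flag)
    return {shop: abnormal[shop] / totals[shop] for shop in totals}


# Made-up tread depth readings in millimetres, for illustration only.
records = [
    ("shop_A", 12.0), ("shop_A", 11.5), ("shop_A", 12.4), ("shop_A", 11.8),
    ("shop_B", 12.1), ("shop_B", 45.0), ("shop_B", 11.9), ("shop_B", -3.0),
]
rates = abnormal_rate_by_shop(records)
print(rates)
```

In this toy data, the two implausible readings both come from `shop_B`, so its abnormal rate is higher than that of `shop_A`. Shops could then be ranked by this rate to prioritize data quality follow-up.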
