Comparison of machine learning algorithm for Santander dataset

Yudhistira Arie Wijaya, Nana Suarna, Iin, Ryan Hamonangan, R Nining

2021

The Santander bank dataset was released on kaggle.com to predict whether or not a customer will make a transaction. The dataset has 2 classes and 200,000 records. Earlier experiments using a regression algorithm reached a precision of 89%. In this analysis, 6 different algorithms were compared to find the best accuracy, including Support Vector Machine (SVM), Neural Network (NN), Naive Bayes (NB), and Decision Tree (DT). Before data mining with these algorithms, the data are preprocessed by normalization using the range transformation method, which rescales values to the range 0 to 1. In the study, the best result was obtained by the Decision Tree algorithm, with an accuracy of 96.03%; the other reported accuracies were 95.82%, 95.71%, 95.38%, 90.42%, and 90.42%, with Naive Bayes lowest at 14.69%. For the Decision Tree algorithms, values of 95.03%, 95.71%, and 92% are reported. Except for the Naive Bayes algorithm, the results are better than those of the previous study.
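
The range transformation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and the abstract does not say which tool they used. It assumes the usual min-max formula, x' = (x - min) / (max - min), applied to each feature column separately. The function names and the sample values are hypothetical and chosen only for illustration.

```python
def range_transform(column, new_min=0.0, new_max=1.0):
    """Rescale a list of numbers linearly into [new_min, new_max]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column has no spread; map every value to new_min.
        return [new_min for _ in column]
    return [new_min + (x - lo) / (hi - lo) * (new_max - new_min) for x in column]


def normalize_rows(rows):
    """Apply range_transform to each feature column of a row-major table."""
    columns = [range_transform(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]


# Made-up values for illustration only, not taken from the Santander data.
sample = [[2.0, 10.0], [4.0, 30.0], [6.0, 20.0]]
normalized = normalize_rows(sample)
```

After this step every feature lies between 0 and 1, so features with large raw ranges do not dominate scale-sensitive algorithms such as SVM or neural networks.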
