Empirical Study on Different Feature Selection and Classification Algorithms for Prediction of Hepatitis Disease

2021 
Hepatitis is one of the most commonly diagnosed diseases in the world. The medical industry holds an enormous amount of data, which makes it difficult to draw important conclusions, and data mining techniques are used to address this problem. In this study, we apply several classifiers, namely KNN, Logistic Regression, Naive Bayes, Decision Tree, Support Vector Machine (SVM), and Random Forest, to the Hepatitis dataset from the UCI Machine Learning Repository. Two feature selection techniques, the Chi-square test and the Boruta algorithm, are used to improve the performance of the classifiers. Finally, we compare the classifiers on several performance measures to determine which performs best at classifying patients as live or dead. We conclude that Naive Bayes with Chi-square attribute selection performed better in terms of F1 score. Overall, Logistic Regression, Support Vector Machine, Kernel SVM, and KNN performed equally well, each with an accuracy of 90.32%.
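To illustrate the idea behind Chi-square attribute selection, the following is a minimal sketch that scores categorical features by their Pearson chi-square statistic against the class label and ranks them. The abstract does not describe the study's implementation, so the function name `chi_square_score`, the feature names, and the toy data are all hypothetical and are not drawn from the UCI Hepatitis dataset.

```python
from collections import Counter


def chi_square_score(feature, labels):
    """Pearson chi-square statistic of a categorical feature vs. class labels."""
    n = len(feature)
    observed = Counter(zip(feature, labels))  # joint counts per (value, class)
    feature_totals = Counter(feature)         # marginal counts per feature value
    label_totals = Counter(labels)            # marginal counts per class
    score = 0.0
    for value, value_count in feature_totals.items():
        for cls, cls_count in label_totals.items():
            # Expected count if the feature and the class were independent.
            expected = value_count * cls_count / n
            score += (observed[(value, cls)] - expected) ** 2 / expected
    return score


# Hypothetical toy data (assumption): eight patients, two binary features.
labels = ["live", "live", "live", "live", "die", "die", "die", "die"]
informative = [0, 0, 0, 1, 1, 1, 1, 0]    # tends to differ between classes
uninformative = [0, 1, 0, 1, 0, 1, 0, 1]  # same distribution in both classes

scores = {
    "informative": chi_square_score(informative, labels),
    "uninformative": chi_square_score(uninformative, labels),
}
# Rank features from most to least associated with the class label.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```

A higher score indicates a stronger association between the feature and the outcome, so a selection step would typically keep the top-ranked features before training a classifier such as Naive Bayes.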