    Performance analysis of classification algorithms on early detection of liver disease
    Software defect detection (SDD) is concerned with detecting the existence of defects in software modules. There has been growing interest in applying machine learning and deep learning to SDD. However, SDD is a binary classification problem that involves class imbalance, which biases learning, and little work has addressed this problem. In this work, we apply four class balancing techniques (SMOTE, ADASYN, SMOTE-Tomek, and SMOTE-ENN) to the SDD problem using both deep learning and machine learning. We use MLP, CNN, and LSTM for deep learning, and decision tree, random forest, logistic regression, and XGB for machine learning. We evaluate the effect of the class balancing techniques on those models and conduct a comparative analysis. The study found that class balancing techniques have a positive effect on MLP but a negative effect on CNN and LSTM, while having a positive effect on all the machine learning models. Overall, they are more compatible with machine learning.
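The core idea behind SMOTE-style balancing can be sketched in plain Python: synthetic minority samples are created by interpolating between a minority point and one of its nearest minority neighbours. This is a simplified illustration, not the reference SMOTE implementation and not the setup used in the abstract above; the function name `smote_like_oversample`, the toy data, and the parameter defaults are all assumptions made for the example.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Simplified illustration of the SMOTE idea (hypothetical helper).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy data (hypothetical): 3 defective modules versus 6 clean ones.
minority = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.4)]
majority_count = 6
new_points = smote_like_oversample(minority, majority_count - len(minority))
balanced_minority = minority + new_points
```

After the call, the minority class has as many samples as the majority class, and every synthetic point lies on a segment between two real minority points. Variants such as SMOTE-Tomek and SMOTE-ENN additionally remove samples after oversampling, which this sketch does not attempt.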
    Binary classification
    Image classification is a complex process and an important direction in the field of image processing. Image classification methods require learning and training stages. Using machine learning classification models in image classification gives better results. Decision Tree, Random Forest, Gradient Boosting, Bagging Classifier, Multi-Layer Perceptron (MLP) Classifier, and Support Vector Machine (SVM) are different machine learning classification models. The goal of this paper is to analyze these models. The models classify 12 kinds of plant seedlings, of which 3 are crop seedlings and 9 are weed seedlings. On the V2 Plant Seedlings dataset, SVM reaches an accuracy of 0.71, and the other models are less accurate. The experimental results show that SVM performs better and achieves higher recognition accuracy than the other models. This paper focuses on model building, training, and assessing the quality of the model by generating a confusion matrix and a classification report.
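As a rough illustration of how a confusion matrix and an accuracy figure relate, the sketch below builds the matrix from true and predicted labels in plain Python. The two-class toy labels are hypothetical (the abstract above uses 12 seedling classes), and the helper names are assumptions made for the example.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy(matrix):
    """Share of samples on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical labels, collapsed to two classes for brevity.
labels = ["crop", "weed"]
y_true = ["crop", "crop", "weed", "weed", "weed", "crop"]
y_pred = ["crop", "weed", "weed", "weed", "crop", "crop"]
matrix = confusion_matrix(y_true, y_pred, labels)
print(matrix, accuracy(matrix))
```

The same functions work unchanged for 12 classes: only the `labels` list grows, and the matrix becomes 12 by 12.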
    Confusion matrix
    Gradient boosting
    Multilayer perceptron
    Perceptron
    Boosting
    Contextual image classification
    Citations (0)
    All across the world, heart disease is regarded as a fatal disease. Heart disease is a condition that affects both men and women equally and may be a major cause of death around the world. Early diagnosis of this condition is critical for reducing mortality rates. The Chronic kidney disease dataset from the UCI machine learning repository, with 1190 samples and 14 characteristics, was used for this study. To make this research more robust, both machine learning (ML) and deep learning (DL) techniques were used to detect the disease early. The data were normalized with a standard scaler to address a class variance issue. We then used three deep learning techniques, namely Convolutional Neural Network (CNN), Artificial Neural Network (ANN), and Long Short-Term Memory (LSTM), together with two general machine learning approaches, Decision Tree and Support Vector Machine (SVM). To demonstrate replicability, all experiments were run on three different random subsets. For the classification measurement, we also employ the ROC curve and the AUC. Several promising outcomes have been achieved. We calculated accuracy, precision, sensitivity, specificity, and F1-score. CNN provided the best results, with an accuracy of 99.16%.
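The five scores named above (accuracy, precision, sensitivity, specificity, and F1-score) all follow from the four cells of a binary confusion matrix. A minimal sketch, using made-up counts rather than any result from the study:

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification scores from confusion-matrix counts."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)  # also called recall
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


# Hypothetical counts for 100 patients.
scores = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

The sketch assumes every denominator is non-zero; a model that never predicts the positive class would need explicit handling for precision and F1.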
    Breast cancer is one of the most common forms of cancer among women in our country and the world. Artificial intelligence studies are growing in order to reduce mortality and enable the early diagnosis needed for appropriate treatment. The Extreme Learning Machine (ELM) method, one of the machine learning approaches, is applied to the Wisconsin Breast Cancer Diagnostic (WBCD) dataset in this study, and the findings are compared to those of other machine learning methods. For this purpose, the same dataset is also classified using Multi-Layer Perceptron (MLP), Sequential Minimal Optimization (SMO), Decision Tree Learning (J48), Naive Bayes (NB), and K-Nearest Neighbor (KNN) methods. According to the results of the study, the ELM approach is more successful than the other approaches on the WBCD dataset. It is also worth noting that as the number of neurons in the ELM grows, so does the learning ability of the network. However, beyond a certain number of neurons, test performance begins to decline sharply. Finally, the ELM's performance is compared to the results of other studies in the literature.
    Extreme Learning Machine
    Data set
    Students’ academic performance or achievement has long been a subject of discourse among academicians, scholars, researchers, and educational institutions all over the globe. In this regard, schools are expected to play major and active roles in ensuring that students perform well at the end of their programmes. Academic performance is normally used to classify or predict how well students will be able to face future challenges after graduation. Students’ academic performance in any course of study plays a vital role in producing outstanding students who will be viable future leaders. The use of algorithms to classify and predict students’ academic performance is not new in machine learning, with techniques such as neural networks, logistic regression, decision trees, and many more. This study classifies and predicts using a graphical technique called the Decision Tree. The dataset was built from students’ attendance, practical assessment, assignment, ability to complete a free related course on the internet, test score, and examination grade; the dataset was divided into a training set and a testing set. The training set was used to build and validate the decision tree algorithm (CHAID), while the testing set was used to evaluate CHAID on overall accuracy, sensitivity, and specificity. The results show that the decision tree algorithm makes classification and prediction visible and clear by displaying the results graphically. The model built achieves 96% accuracy.
    CHAID
    Graduation (instrument)
    ID3 algorithm
    Tree (set theory)
    Attendance
    Citations (5)
    Prediction is the act of forecasting what will happen in the future. Prediction is gaining importance in almost all fields. Machine learning techniques have been widely used for prediction, and in recent times deep learning algorithms have gained importance as well. In this paper, we perform prediction over a dataset using both machine learning and deep learning techniques, and the performance of each method is identified and compared. We used the house price dataset, which consists of 80 features and helps to explore data visualization methods, data splitting, and data normalization techniques. We implemented five regression-based machine learning models: Simple Linear Regression, Random Forest Regression, Ada Boosting Regression, Gradient Boosting Regression, and Support Vector Regression. Deep learning models, including an artificial neural network, multi-output regression, and regression using Tensorflow-Keras, were also used for regression. The study was further extended to compare the performance of classification models, using six machine learning models and three deep learning models: logistic regression classifier, decision tree classifier, random forest classifier, Naïve Bayes classifier, k-nearest neighbor classifier, support vector machine classifier, feed forward neural network, recurrent neural network, and LSTM recurrent neural network. The models were also fine-tuned, and the results were compared using performance metrics. We split our dataset in a 70:30 ratio for training and testing. Among the regression models, random forest performed best, with an MAE score of 0.12, MSE score of 0.55, RMSE score of 0.230, and R2 score of 0.85; among the deep learning models, the Tensorflow-Keras-based regression model performed well, with an MAE score of 0.12, MSE score of 0.54, RMSE score of 0.210, and R2 score of 0.87. Among the classification models, random forest performed well with an accuracy of 89.21%, and among the deep learning classifiers, the feed forward neural network performed well with an accuracy of 89.52%. Other performance metrics, including Cohen kappa score, Matthews correlation coefficient, average precision, average recall, and F1 score, were also calculated to compare the performance.
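For reference, the regression scores mentioned above (MAE, MSE, RMSE, and R2) can be computed from true and predicted values as in the following sketch. It uses the standard textbook definitions and a handful of made-up numbers; it is not the evaluation code or data from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """MAE, MSE, RMSE and R2 using their standard definitions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1 - (mse * n) / ss_tot  # mse * n is the residual sum of squares
    return {"mae": mae, "mse": mse, "rmse": math.sqrt(mse), "r2": r2}


# Hypothetical targets and predictions.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 10.0]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

Under these definitions RMSE is the square root of MSE, and R2 compares the model's squared error with that of always predicting the mean of the targets.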
    Gradient boosting
    Boosting
    Citations (3)
    Machine learning and deep learning play vital roles in predicting diseases in the medical field. Machine learning algorithms are broadly classified as supervised, unsupervised, and reinforcement learning. This paper contains a detailed description of our experimental research, in which we used supervised machine learning algorithms to build a model for outbreaks of the novel coronavirus, which has spread over the whole world and caused many deaths, making it one of the most disastrous pandemics in history. People suffered physically and economically during the lockdown. This work aims to better understand how machine learning, ensemble, and deep learning models work and how they are applied to a real dataset. We analyze the current trend or pattern of the coronavirus and then predict future Covid-19 confirmed or new cases by training on past Covid-19 data. Machine learning algorithms such as Linear Regression, Polynomial Regression, K-nearest neighbor, Decision Tree, Support Vector Machine, and Random Forest are used to train the model. The Decision Tree and Random Forest algorithms perform better than SVR in this work. The performance of SVR and lasso regression is low in all prediction areas, because it is difficult for SVR to separate the data with a hyperplane for this type of problem, so SVR mostly performs poorly here. Ensemble (Voting, Bagging, and Stacking) and deep learning (ANN) models also predict well. After the prediction, we evaluated the models using MAE, MSE, RMSE, and MAPE. This work aims to find the trend or pattern of Covid-19.
    Ensemble Learning
    Benchmark (surveying)
    Lasso
    Supervised Learning
    Citations (3)
    The significance of the heart as the body's most vital organ cannot be overstated. Heart disease is the leading cause of death worldwide. Heart failure (HF) is a main cause of death that must be successfully predicted. Angiography, the gold standard for clinical diagnosis of HF, is expensive and can have catastrophic repercussions, according to research. In this scenario, machine learning and deep learning are applied. Machine learning and deep learning techniques can be used to forecast the whole range of hazards associated with this project. The dataset is created by combining previously available datasets, and the data are sorted into eleven distinct categories. This investigation would not be possible without this information. According to the findings, machine learning approaches outperformed deep learning in the diagnosis of cardiovascular diseases. The PCA approach was used to estimate the relative relevance of each of the dataset's 11 fields. When sampling approaches were applied, accuracy and recall rates increased. According to the data, Random Forest, Decision Tree, and Naive Bayes classifiers surpass the other ML algorithms.
    Relevance
    Machine learning (ML) algorithms are designed to perform prediction based on features. With the help of machine learning, a system can automatically learn and improve from experience. Machine learning is a branch of artificial intelligence. It is broadly categorized into two types: supervised and unsupervised. Supervised ML performs classification, and unsupervised ML performs clustering. At present, machine learning is used in various areas, such as biometric recognition, handwriting recognition, and medical diagnosis. In the medical field, machine learning plays an important role in identifying diseases based on a patient’s features. Doctors currently use software applications based on machine learning algorithms to diagnose various conditions, such as cancer, cardiac arrest, and many more. In this paper, we used an ensemble learning method to predict heart problems. Our study describes the performance of ML algorithms by comparing various evaluation parameters such as F-measure, recall, ROC, precision, and accuracy. The study was done with various combinations of ML classifiers, namely Decision Tree (DT), Naïve Bayes (NB), Support Vector Machine (SVM), and Random Forest (RF), to predict heart problems. The results showed that combining two ML algorithms, DT with NB, achieved 81.1% accuracy. The Support Vector Machine, Decision Tree, Naïve Bayes, and Random Forest models were also trained and tested individually.
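The abstract above does not say how the two classifiers were combined. One common option, shown here purely as an assumption, is soft voting: average the positive-class probabilities of the models and threshold the result. The probability values below are invented, and `soft_vote` is a hypothetical helper.

```python
def soft_vote(prob_lists, threshold=0.5):
    """Average per-sample probabilities across models, then threshold."""
    n_models = len(prob_lists)
    averaged = [sum(ps) / n_models for ps in zip(*prob_lists)]
    return [1 if p >= threshold else 0 for p in averaged]


# Hypothetical positive-class probabilities from two already trained models.
dt_probs = [0.9, 0.2, 0.6, 0.4]  # decision tree
nb_probs = [0.7, 0.4, 0.3, 0.8]  # naive Bayes
predictions = soft_vote([dt_probs, nb_probs])
print(predictions)
```

With soft voting, a confident model can outweigh an uncertain one on a given sample, which hard majority voting between only two models cannot do.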
    Ensemble Learning