Breast Cancer Prediction Using Machine Learning Classifiers

Jamal Jamal Jahidul Hasan Antor Rajneesh Kumar Pooja Rani

Abstract:

Breast cancer is becoming more prevalent every day, and a lack of detection makes it worse. Early detection may help lower the death rate. Using the Wisconsin Breast Cancer dataset, this study proposes a machine learning-based approach to identifying breast cancer. Five distinct machine learning algorithms were tested. Logistic Regression achieved 94.73% accuracy, Decision Tree 92.98%, Random Forest 98.24%, and Support Vector Machine (SVM) 96.49%. Random Forest achieved the highest accuracy, at 98.24%.
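The comparison described in this abstract can be sketched roughly as follows. This is a minimal sketch, not the authors' pipeline: it assumes scikit-learn and its bundled copy of the Wisconsin Diagnostic Breast Cancer data (`load_breast_cancer`). The 80/20 stratified split, the feature scaling, the random seeds, and the default hyperparameters are all assumptions, because the abstract does not state them, and the function name `compare_classifiers` is hypothetical. Accuracies from this sketch will not necessarily match the reported figures.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(test_size=0.2, seed=42):
    """Fit the four classifiers named in the abstract; return test accuracy per model."""
    X, y = load_breast_cancer(return_X_y=True)
    # Assumed split: the abstract does not say how the data were partitioned.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )
    models = {
        # Scaling is an assumption; it helps Logistic Regression and SVM converge.
        "Logistic Regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "SVM": make_pipeline(StandardScaler(), SVC()),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        results[name] = accuracy_score(y_test, model.predict(X_test))
    return results


if __name__ == "__main__":
    for name, acc in compare_classifiers().items():
        print(f"{name}: {acc:.2%}")
```

A single train/test split gives a noisy estimate on a dataset of this size, so changing `seed` can reorder the models. Cross-validation would be a more stable basis for ranking them.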

Topics:

AI in cancer detection

10.1109/icast55766.2022.10039656

Random-Forest (RF) and Support Vector Machine (SVM) Implementation for Analysis of Gene Expression Data in Chronic Kidney Disease (CKD)

IOP Conference Series Materials Science and Engineering (2019)

Zuherman Rustam Ely Sudarsono Devvi Sarwinda

Abstract: The application of mathematics in the field of bioinformatics has been widely developed. For example, Support Vector Machines (SVM) and Random Forest (RF) are state of the art for the classification of cancer in many applications. One such application is Chronic Kidney Disease (CKD), a kidney disease whose number of sufferers is increasing and whose symptoms are difficult to detect at first. Microarrays in gene expression are important tools for this approach: microarray gene expression provides an overview of all transcription activity in biological samples. The purpose of this research is to show that a hybrid model combining Random Forest (RF) and Support Vector Machine (SVM), called RF-SVM, can be used to classify gene expression data. RF can be highly accurate, generalizes better, and is interpretable, and it is combined with SVM to effectively predict gene expression data with very high dimensions. In addition, simulation results on data from the Gene Expression Omnibus (GEO) database show that the proposed RF-SVM is more accurate on CKD data than RFE-SVM.

10.1088/1757-899x/546/5/052066

Citations (22)

Comparison between Support Vector Machine and Random Forest for Audio Classification

2021 International Conference on Electronics, Communications and Information Technology (ICECIT) (2021)

Md. Rifat Ansari Sadia Alam Tumpa Jannat Ara Ferdouse Raya Mohammad N. Murshed

Machine learning techniques have been used to identify voice samples. We consider Support Vector Machines (SVM) and Random Forest (RF) and compare their performance in detecting specific audio samples from a set of samples pertaining to many subjects. In particular, we propose an algorithm that can deal with multiple features for multiple audio samples from more than one subject. This work discusses the application and other implementation issues. As a demonstration, SVM and RF models are trained using voice samples from two subjects. We found that SVM is 100% accurate and Random Forest is 70% accurate for subject 1, whereas SVM is 60% accurate and Random Forest is 90% accurate for subject 2. Our results show that the performance of SVM and RF varies from subject to subject.

10.1109/icecit54077.2021.9641152

Citations (7)

Prediction of Heart Disease using Random Forest in Comparison with Logistic Regression to Measure Accuracy

2022 International Conference on Advances in Computing, Communication and Applied Informatics (ACCAI) (2023)

Guna Sekhar Reddy Thummala Radhika Baskar S RimlonShibi

Utilizing random forest algorithms to predict cardiac illness and evaluating their performance in comparison to that of logistic regression will be the major focus of this research project. In the course of the investigation, the random forest and the logistic regression methodologies are both taken into account. Two groups are statistically evaluated since the pretest power was 80% and the size of the sample was 20 for each group. According to the results of the experiment, a logistic regression model has an accuracy of 80% in predicting heart disease, whereas a random forest classifier has a mean accuracy of 87.64% in making the same prediction. It is possible to demonstrate, via the use of T-tests on independent samples, that there is a statistically significant difference between the two algorithms' levels of accuracy (p<0.05). This research investigated the efficiency and precision of several methods for predicting heart illness in order to improve the accuracy of heart disease prediction by using machine learning classifiers. The outcomes of the comparison show that the random forest strategy performs much better than the logistic regression methodology by a significant margin.
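The independent-samples t-test mentioned in this abstract compares the mean accuracy of two groups of runs. The sketch below computes only the pooled-variance (Student's) t statistic and its degrees of freedom, using the Python standard library. It assumes equal variances in the two groups, which the abstract does not state. The function name `pooled_t_statistic` is hypothetical, and the sample accuracies are made-up placeholders, not data from the study.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees_of_freedom) for Student's two-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled sample variance, assuming both groups share one variance.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Made-up accuracies, for illustration only (not data from the study).
rf_runs = [0.88, 0.87, 0.89, 0.86, 0.90]
lr_runs = [0.80, 0.81, 0.79, 0.82, 0.78]
t, df = pooled_t_statistic(rf_runs, lr_runs)
print(f"t = {t:.3f}, df = {df}")
```

Turning the statistic into a p-value requires the t distribution's CDF, which the standard library does not provide. SciPy's `scipy.stats.ttest_ind` returns both the statistic and the p-value.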

Margin (machine learning)

Logistic model tree

10.1109/accai58221.2023.10199851

Citations (2)

Comparison of Accuracy Rate in Prediction of Cardiovascular Disease using Random Forest with Logistic Regression

Cardiometry (2023)

T. Vishnuvardhan A. Rama

Aim: To compare the accuracy of cardiovascular disease prediction using Novel Random Forest with that of Logistic Regression. Materials and Methods: The Novel Random Forest (N=20) and Novel Logistic Regression algorithm (N=20) were evaluated as two groups, with 20 samples taken for each algorithm. The sample size was determined using the G power Calculator and found to be 10. Results: Random Forest exhibited 89.06% accuracy, whereas Logistic Regression showed 92.18% accuracy. The difference between the Random Forest algorithm and the Novel Logistic Regression algorithm was statistically significant, with p=0.001 (2-tailed) (p<0.5). Conclusion: Prediction of cardiovascular disease using Logistic Regression is significantly better than using Random Forest.

Logistic model tree

10.18137/cardiometry.2022.25.15261531

Citations (0)

Credit Card Fraud Detection System based on Operational & Transaction features using SVM and Random Forest Classifiers

C. Sudha D. Akila

This paper proposes a Credit Card Fraud Detection system based on Operational & Transaction features using Support Vector Machine (SVM) and Random Forest (RF) classifiers. In the first phase, the operational features of users are extracted, and a random forest classifier is used to classify the features into benign and suspected. In the second phase, the transaction features of users are extracted from the user records, and the M-class SVM classifier is applied to classify the features into benign and suspected. The performance of the system is evaluated in terms of the standard measures of precision, accuracy, recall, and F1 score. The results show that both RF and SVM classifiers achieve a high detection rate with good accuracy.
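The four measures named in this abstract can all be derived from the counts of a binary confusion matrix. The sketch below assumes "suspected" is the positive class and "benign" the negative class. The function name `classification_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts.

    tp: suspected cases flagged as suspected
    fp: benign cases flagged as suspected
    fn: suspected cases missed (flagged as benign)
    tn: benign cases flagged as benign
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=8, fp=2, fn=2, tn=88)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

On imbalanced data such as fraud records, accuracy can stay high even when most fraud is missed, which is why precision, recall, and F1 are usually reported alongside it.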

Credit card fraud

10.1109/iccakm50778.2021.9357709

Citations (6)

Hyperspectral remote sensing image classification based on the integration of support vector machine and random forest

Peijun Du Junshi Xia Jocelyn Chanussot Xiyan He

Support vector machine (SVM) and Random Forest (RF) have been developed in recent years to significantly improve the accuracy of hyperspectral remote sensing (HRS) image classification. Because of the different characteristics and evident diversity of SVM and RF, we propose two integration approaches that combine SVM and Random Forest to classify HRS images. The proposed method, called DWDCS, is evaluated on two hyperspectral images; it achieves higher overall accuracy and also improves the accuracy of each class. Experimental results indicate that the proposed approaches offer considerable advantages in classifying HRS images.

Contextual image classification

10.1109/igarss.2012.6351609

Citations (20)

Comparative Analysis of Credit Card Fraud Detection using Logistic regression with Random Forest towards an Increase in Accuracy of Prediction

2022 International Conference on Edge Computing and Applications (ICECAA) (2022)

M.Vamsi Krishna J. Praveenchandar

The study aims to identify fraud committed with payment cards such as credit and debit cards, and an experiment is performed to find the more suitable algorithm between Random Forest and Logistic Regression. Materials and Methods: Fraud is detected using Random Forest (N=10) and Logistic Regression (N=10), supervised learning methods that draw insights from previous data. Results: The precision of Random Forest is 76.29%, compared with an accuracy of 74.65% for Logistic Regression, with a statistical significance value of p=0.03 (p<0.05) using an independent-samples t-test. Conclusion: These results show that Random Forest was significantly better for fraud detection than Logistic Regression within the study's limits.

Sample (material)

10.1109/icecaa55415.2022.9936488

Citations (9)

Prediction of Heart Disease using Decision Tree over Logistic Regression using Machine Learning with Improved Accuracy

Cardiometry (2023)

R.K.N.S. Shanmukha K. Thinakaran

Aim: To predict heart disease using the Decision Tree and compare its feature extraction precision with that of the Logistic Regression algorithm in order to improve prediction accuracy. Methods and Materials: In the proposed work, heart disease was predicted using machine learning algorithms, namely Logistic Regression (n=10) and Decision Tree (n=10). The pretest power analysis was carried out at 80%, and the sample size for the two groups is 20. Results: In the implemented experiment, the Decision Tree accuracy is significantly better than that of Logistic Regression (80.10%). The 2-tailed difference in accuracy between the two algorithms is statistically significant at 0.001 (p<0.05). Conclusion: The Decision Tree algorithm achieved better accuracy than Logistic Regression for predicting heart disease.

Logistic model tree

Tree (set theory)

Decision tree model

10.18137/cardiometry.2022.25.15141519

Citations (2)

Classification and Prediction of Heart Disease using Novel Random Forest Algorithm by Comparing Logistic Regression for Obtaining Better Accuracy

Cardiometry (2023)

T. Poojitha R. Mahaveerakannan

Aim: Heart attacks are usually caused by partial or complete blockages of the heart’s veins or arteries that constrict the flow of blood to or from the heart. The primary objective of this review is to identify the most appropriate algorithm for prediction. We compare the novel Random Forest with Logistic Regression to find out which gives the better accuracy. Material and Methods: The study used 143 samples; the novel Random Forest and Logistic Regression were executed with varying training and testing splits to assess the accuracy of coronary disease prediction, with a G-power value of 80%. Heart disease data were gathered from multiple web sources, including latest The study’s findings and criterion were 0.05%, with a 95% probability value, average, and confidence interval. The performance accuracy rate of the classifiers is used to evaluate the coronary disease dataset. The statistical significance value between the novel Random Forest and Logistic Regression is 0.046 (p<0.05). Results and Discussion: The accuracy of predicting coronary disease is 90.16% for the novel Random Forest and 85.25% for Logistic Regression. Conclusion: This study concludes that prediction of coronary disease using the novel Random Forest (RF) algorithm appears to be superior to Logistic Regression (LR), with increased precision.

10.18137/cardiometry.2022.25.15381545

Citations (2)

An Innovative Method in Improving the accuracy in Intrusion detection by comparing Random Forest over Support Vector Machine

2022 International Conference on Business Analytics for Technology and Security (ICBATS) (2022)

Marri Ranjith Kumar K. Malathi

The aim is to improve the accuracy of identifying intruders in intrusion detection by comparing Machine Learning classifiers, namely Random Forest (RF) and Support Vector Machine (SVM). Two groups of supervised Machine Learning algorithms are compared: Random Forest (N=20) and Support Vector Machine (N=20), with a G power value of 0.8. Random Forest (99.3198%) has higher accuracy than SVM (9S.56l5%); an independent t-test was carried out (p=0.507) and shows that the difference between RF and SVM is statistically insignificant (p>0.05) at a confidence value of 95%. Conclusion: The comparative examination shows that Random Forest is more productive than Support Vector Machine for identifying intruders.

Value (mathematics)

10.1109/icbats54253.2022.9759062

Citations (0)