    Abstract: Mathematical methods are now widely applied in bioinformatics. For example, Support Vector Machines (SVM) and Random Forest (RF) are state of the art for classification of cancer in many applications. One of them is Chronic Kidney Disease (CKD), a kidney disease whose number of sufferers is increasing and whose early symptoms are difficult to detect. Gene expression microarrays are important tools for this approach because they provide an overview of all transcription activity in a biological sample. The purpose of this research is to show that a hybrid model combining Random Forest (RF) and Support Vector Machine (SVM), called RF-SVM, can be used to classify gene expression data. RF is highly accurate, generalizes well, and is interpretable; combined with SVM, it can effectively predict gene expression data with very high dimensions. Simulation results on data from the Gene Expression Omnibus (GEO) database show that the proposed RF-SVM is more accurate on CKD data than RFE-SVM.
    This research aimed to predict smartphone prices using two supervised machine learning algorithms: Decision Tree and Random Forest Regression. Data was collected from the Indian e-commerce website Flipkart using Python libraries such as Beautiful Soup and Selenium, and was cleaned and pre-processed for analysis. The Decision Tree algorithm achieved an R^2 of 89.3%, and the Random Forest model achieved an R^2 of 82.8%. The study offers a method for accurately predicting smartphone prices that could be useful for determining product costs and could ultimately benefit the entire smartphone market. Keywords: Smartphone, Price Prediction, Machine Learning, Decision Tree, Random Forest Regression.
    Python
    Supervised Learning
    Citations (1)
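As a small illustration of the R^2 metric reported in the smartphone price study above, the following Python sketch computes the coefficient of determination from scratch. The `r2_score` helper and the price values are hypothetical, chosen only for illustration; they are not taken from the study.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the true values around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical prices and predictions, for illustration only.
actual = [10000.0, 15000.0, 20000.0, 25000.0]
predicted = [11000.0, 14000.0, 21000.0, 24000.0]
score = r2_score(actual, predicted)
print(f"R^2 = {score:.3f}")
```

Under this definition, a reported R^2 of 89.3% corresponds to a value of about 0.893, meaning the model accounts for roughly 89% of the variance in the observed prices.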
    Machine learning techniques have been used to identify voice samples. We consider Support Vector Machines (SVM) and Random Forest (RF) and compare their performance in detecting specific audio samples from a set of samples pertaining to many subjects. In particular, we propose an algorithm that can deal with multiple features for multiple audio samples from more than one subject. This work discusses the application and other implementation issues. As a demonstration, SVM and RF models are trained using voice samples from two subjects. We found that SVM is 100% accurate and Random Forest is 70% accurate for subject 1, whereas SVM is 60% accurate and Random Forest is 90% accurate for subject 2. Our results show that the performance of SVM and RF varies from subject to subject.
    Breast cancer is becoming more prevalent every day, and outcomes are worsened by a lack of detection. Quick detection may help lower the death rate. Based on the Wisconsin Breast Cancer dataset, this study proposes a machine learning-based strategy for identifying breast cancer. Five distinct machine learning algorithms were tested. Among the reported results, Logistic Regression achieved 94.73% accuracy, Decision Tree 92.98%, Random Forest 98.24%, and Support Vector Machine (SVM) 96.49%. Random Forest gave the highest accuracy, at 98.24%.
    The main objective of the study is to classify music genre using features extracted from audio files. The classification is done using Novel Random Forest and Decision Tree, and the results are compared in terms of accuracy. Materials and Methods: The GTZAN dataset used in this study is obtained from the MARSYAS website, which is used for Music Information Retrieval, and consists of 1000 music files in the .au format. It is also regarded as the standard dataset to date. The acoustic features called Mel-frequency cepstral coefficients (MFCC), which create patterns and help to predict the genre, are extracted from the music files. The data analysis, model training, and testing are done entirely on the Jupyter platform. The sample size N is 20 for each of the two groups, proposed (N=20) and comparison (N=20). The pretest power obtained was 0.08. Results: The experimental results show that Novel Random Forest gives an accuracy of 71.78%, while Decision Tree gives an accuracy of 59.89%. Conclusion: In this study, the Random Forest model outperforms the Decision Tree model in terms of accuracy in predicting music genre.
    Mel-frequency cepstrum
    Sample (material)
    Tree (set theory)
    The main objective of this paper is to predict Diabetic Retinopathy (DR) using Novel Decision Tree (DT) in comparison with Support Vector Machine (SVM). Prediction is done using the Novel Decision Tree (N=10) and Support Vector Machine (N=10) algorithms. The Kaggle fundus image dataset, which contains more than 50,000 digital retinal images, is used for Diabetic Retinopathy detection. Novel Decision Tree attained an accuracy of 92.8%, whereas Support Vector Machine reached only 85.2%. The difference between DT and SVM is statistically significant (p=0.03). The Novel Decision Tree method performs better than Support Vector Machine for Diabetic Retinopathy detection.
    Fundus (uterus)
    Tree (set theory)
    Decision tree model
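The Diabetic Retinopathy abstract above reports a p-value for two groups of N=10 runs but does not say which statistical test produced it. One common choice for comparing two groups of accuracy scores is Welch's t-test; the Python sketch below computes its test statistic and approximate degrees of freedom. The choice of test is an assumption, and the `welch_t` helper and the sample values are hypothetical, not taken from the paper.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    if len(a) < 2 or len(b) < 2:
        raise ValueError("each group needs at least two observations")
    # Squared standard error of each group mean (sample variance / n).
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    if va + vb == 0:
        raise ValueError("both groups have zero variance")
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical scores from 10 runs per group, for illustration only.
group_a = [0.91, 0.93, 0.92, 0.94, 0.92, 0.93, 0.95, 0.92, 0.93, 0.93]
group_b = [0.84, 0.86, 0.85, 0.87, 0.83, 0.86, 0.85, 0.84, 0.86, 0.86]
t_stat, dof = welch_t(group_a, group_b)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

Turning the statistic into a p-value requires the Student's t distribution, which the Python standard library does not provide; a statistics package would normally supply that step.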
    In recent years, Support Vector Machine (SVM) and Random Forest (RF) have significantly improved the accuracy of hyperspectral remote sensing (HRS) image classification. Because SVM and RF have different characteristics and obvious diversity, we propose two integration approaches that combine SVM and Random Forest to classify HRS images. The proposed method, called DWDCS, is evaluated on two hyperspectral images; it achieves higher overall accuracy and also improves the accuracy of each class. Experimental results indicate that the proposed approaches offer considerable advantages in classifying HRS images.
    Contextual image classification
    Citations (20)
    The purpose of this study was to assess the accuracy of cybercrime predictions made using Novel Random Forest and Support Vector Machine. Materials and Methods: Cybercrime predictions were made using the novel random forest (N=10) and the support vector machine (N=10), and the results were then analysed. Random forest outperforms SVM in terms of precision (by a margin of 84 percent to 85 percent). 81 percent is the percentage. The novel random forest classifier performed better than the conventional SVM classifier. There is a significant difference in accuracy between the two approaches (p>0.005). The new cybercrime prediction system was developed using machine learning. The novel random forest technique outperformed SVM.
    Cybercrime
    Margin (machine learning)
    Statistical classification
    Citations (0)
    The use of credit cards is increasing in today's digital era. This increase has led to many cases of fraud, which have had a negative impact on credit card owners. To counter this, many financial institutions have developed credit card fraud detection systems that can identify suspicious transactions. This study uses two classification methods, random forest and decision tree, to identify illegal credit card transactions, compares their results, and attempts to create a more accurate and effective model for detecting credit card fraud. The accuracy of the Decision Tree classifier is 0.98, while the accuracy of the Random Forest classifier is 0.975. The conclusion is that the Decision Tree, at 98%, has a slightly higher level of accuracy than the Random Forest classification algorithm, at 97.5%.
    Credit card fraud
    Random tree
    Statistical classification
    A decision support system (DSS) that classifies various cancers helps clinicians and researchers make better decisions, which can aid early cancer diagnosis and reduce the chance of incorrect diagnosis. This work therefore aimed to design a classification model that can accurately predict 5 different cancer types, comprising 20 cancer exomes, using the mutations identified from whole exome cancer analysis. Initially, a basic model was designed using supervised machine learning classification algorithms such as K-nearest neighbor (KNN), support vector machine (SVM), decision tree, naïve Bayes, and random forest (RF), among which decision tree and random forest performed better in terms of preliminary model accuracy. However, output predictions were incorrect due to low training scores. Thus, 16 essential features were selected for model improvement using 2 approaches. All imbalanced datasets were balanced using SMOTE. In the first approach, models were trained on all features from the 20 cancer exome datasets using decision tree and random forest. On the balanced datasets, the decision tree model showed an accuracy of 77%, while the RF model improved accuracy to 82%, with all 5 cancer types predicted correctly. The area under the curve was closer to 1 for the RF model than for the decision tree model. In the second approach, 15 datasets were used for training and 5 for testing. However, only 2 cancer types were predicted correctly. To cross-validate the RF model, the Matthews correlation coefficient (MCC) test was performed. For the first approach, the MCC test and MCC cross-validation values were 0.7796 and 0.9356, respectively. Likewise, for the second approach, the MCC was 0.9365, corroborating the accuracy of the designed model. The model was successfully deployed using Streamlit as a web application for easy use. This study presents insights that allow easy cancer classification.
    Tree (set theory)
    Citations (1)
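The cancer classification study above reports Matthews correlation coefficient (MCC) values for a 5-class problem. As a rough illustration of how such a score can be computed, the Python sketch below implements the multiclass generalisation of MCC directly from label lists. The `matthews_corrcoef` helper and the example labels are hypothetical and are not taken from the study, which does not describe its own implementation.

```python
import math
from collections import Counter


def matthews_corrcoef(y_true, y_pred):
    """Multiclass MCC computed from true and predicted label sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    s = len(y_true)                                   # total samples
    c = sum(1 for t, p in zip(y_true, y_pred) if t == p)  # correct predictions
    t_counts = Counter(y_true)                        # occurrences per true class
    p_counts = Counter(y_pred)                        # occurrences per predicted class
    labels = set(t_counts) | set(p_counts)
    cov_tp = c * s - sum(t_counts[k] * p_counts[k] for k in labels)
    cov_tt = s * s - sum(n * n for n in t_counts.values())
    cov_pp = s * s - sum(n * n for n in p_counts.values())
    denom = math.sqrt(cov_tt * cov_pp)
    # By convention, return 0.0 when the denominator vanishes
    # (for example, when every prediction is the same class).
    return cov_tp / denom if denom else 0.0


# Hypothetical labels for five classes, for illustration only.
truth = ["A", "B", "C", "D", "E", "A", "B", "C", "D", "E"]
guess = ["A", "B", "C", "D", "E", "A", "B", "C", "E", "D"]
print(f"MCC = {matthews_corrcoef(truth, guess):.4f}")
```

MCC ranges from -1 to 1, with 1 indicating perfect agreement and 0 indicating performance no better than chance, so it accounts for class balance in a way that plain accuracy does not.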