Improved supervised classification of accelerometry data to distinguish behaviors of soaring birds

PLoS ONE (2017)

Maitreyi Sur Tony Suffredini Stephen M. Wessells Peter Bloom Michael Lanzone Sheldon Blackshire S. S. Sridhar Todd E. Katzner

Abstract:

Soaring birds can balance the energetic costs of movement by switching between flapping, soaring and gliding flight. Accelerometers can allow quantification of flight behavior and thus a context to interpret these energetic costs. However, models to interpret accelerometry data are still being developed, rarely trained with supervised datasets, and difficult to apply. We collected accelerometry data at 140 Hz from a trained golden eagle (Aquila chrysaetos) whose flight we recorded with video that we used to characterize behavior. We applied two forms of supervised classification, random forest (RF) models and K-nearest neighbor (KNN) models. The KNN model was substantially easier to implement than the RF approach but both were highly accurate in classifying basic behaviors such as flapping (85.5% and 83.6% accurate, respectively), soaring (92.8% and 87.6%) and sitting (84.1% and 88.9%) with overall accuracies of 86.6% and 92.3% respectively. More detailed classification schemes, with specific behaviors such as banking and straight flights, were well classified only by the KNN model (91.24% accurate; RF = 61.64% accurate). The RF model maintained its classification accuracy for basic behaviors at sampling frequencies as low as 10 Hz, the KNN model at sampling frequencies as low as 20 Hz. Classification of accelerometer data collected from free-ranging birds demonstrated a strong dependence of predicted behavior on the type of classification model used. Our analyses demonstrate the consequence of different approaches to classification of accelerometry data, the potential to optimize classification algorithms with validated flight behaviors to improve classification accuracy, ideal sampling frequencies for different classification algorithms, and a number of ways to improve commonly used analytical techniques and best practices for classification of accelerometry data.
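The workflow the abstract describes (record at 140 Hz, reduce the sampling frequency, classify windows of data with KNN) can be illustrated with a minimal sketch. This is not the authors' pipeline: the signal is a synthetic sine wave, the two features (mean and standard deviation per window) are assumptions chosen for brevity, and `synthetic_window`, `window_features`, `downsample` and `knn_predict` are hypothetical helper names.

```python
import math
from collections import Counter

def downsample(samples, source_hz, target_hz):
    """Keep every (source_hz // target_hz)-th sample; needs an integer ratio."""
    if source_hz % target_hz != 0:
        raise ValueError("target_hz must divide source_hz evenly")
    return samples[:: source_hz // target_hz]

def window_features(window):
    """Summarize one window as (mean, standard deviation). Illustrative features only."""
    mean = sum(window) / len(window)
    variance = sum((x - mean) ** 2 for x in window) / len(window)
    return (mean, math.sqrt(variance))

def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def synthetic_window(amplitude, hz=140, seconds=1, wingbeat_hz=3):
    """Hypothetical 1-axis signal: a sine wave standing in for wingbeat oscillation."""
    return [amplitude * math.sin(2 * math.pi * wingbeat_hz * t / hz)
            for t in range(hz * seconds)]

# Labelled training windows at the full 140 Hz rate (made-up amplitudes).
train = [(window_features(synthetic_window(a)), "flapping") for a in (0.8, 1.0, 1.2)]
train += [(window_features(synthetic_window(a)), "soaring") for a in (0.02, 0.05, 0.08)]

# Classify a window after reducing it from 140 Hz to 20 Hz.
query_20hz = downsample(synthetic_window(0.9), source_hz=140, target_hz=20)
predicted = knn_predict(train, window_features(query_20hz), k=3)
```

With real tri-axial data one would compute features per axis and label windows from the video record; the sketch only shows how a lower sampling rate can leave simple summary features, and therefore the predicted class, largely unchanged.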

Keywords:

Statistical classification

Topics:

Avian ecology and behavior

Biomimetic flight and propulsion mechanisms

Animal Behavior and Reproduction

10.1371/journal.pone.0174785

Performance Analysis of Machine Learning Classification Algorithms for Breast Cancer Diagnosis

2022 10th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO) (2021)

Gargi Gupta Medhavani Sharma Shalu Choudhary Kavita Pandey

Machine-learning-based algorithms give a machine the ability to learn on its own without being explicitly programmed. They help automate tasks such as classification and clustering, which previously required human intervention. Machine-learning-based classification algorithms are now widely used for the diagnosis of various diseases such as breast cancer. Breast cancer has been one of the primary causes of death among women all over the world in recent years. In this paper, Random Forest, Logistic Regression, Decision Tree, Naive Bayes and SVM classification algorithms have been implemented. The experiment in this article has been conducted on the popular Wisconsin Diagnosis Breast Cancer Dataset (WDBC) [1]. The main objective of the experiment is to analyze how accurately these classification algorithms distinguish cancerous tumours (malignant) from non-cancerous tumours (benign). The experimental results show that Random Forest gave the highest accuracy amongst all the classification algorithms.

Statistical classification

10.1109/icrito51393.2021.9596230

Citations (3)

Evaluation and Implementation of Malware Classification Using Random Forest Machine Learning Algorithm

Saifaldeen Alabadee Karam Thanon

Malware classification is one of the most important issues in information security because of the huge number of new malware samples. Therefore, many classification methods have been proposed. Random forest (RF) is among the most widely used methods in many studies, across different feature extraction methods. It is considered an efficient method for malware classification due to its accurate results. In this paper, a machine-learning-based RF classifier is proposed and the performance of the Random Forest implementation is evaluated. The RF classifier showed high performance as a detector. It is well able to classify data with a huge number of features, including unimportant ones. Both training and classification accuracy increased when the number of training features in the dataset was reduced. The RF classifier achieved an accuracy of 95.3%.

Statistical classification

10.1109/iccitm53167.2021.9677693

Citations (3)

Functional magnetic resonance imaging classification based on random forest algorithm in Alzheimer's disease

2019 International Conference on Image and Video Processing, and Artificial Intelligence (2019)

Yu Wang Changsheng Li

This paper proposes a computer-aided diagnosis method, based on the random forest algorithm, for classifying Alzheimer's disease (AD) from medical image data. Functional magnetic resonance imaging (fMRI) data were collected from 34 AD patients, 35 patients with mild cognitive impairment (MCI) and 35 normal controls (NC). Firstly, the functional connection between the different regions of the whole brain is calculated using the Pearson correlation coefficient. Then the importance of the functional connection between different brain regions is measured and the important features are selected using the random forest algorithm. Finally, classification is performed using a support vector machine (SVM) classifier with ten-fold cross-validation. The classification model based on random forest and SVM performs well in recognizing AD, and the classification accuracy can reach 90.68%. Functional connection characteristics can be effectively analyzed by the random forest algorithm, which can distinguish AD, MCI and NC accurately. At the same time, the abnormal brain regions involved in AD pathogenesis can be identified. The related experimental results can provide an objective reference for the early clinical diagnosis of AD.
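The first step of that abstract, computing functional connection between brain regions with the Pearson correlation coefficient, can be sketched as follows. The region time series are made-up numbers, `pearson` and `connectivity_matrix` are hypothetical helper names, and the later feature-selection and SVM steps are not reproduced here.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)

def connectivity_matrix(regions):
    """Pairwise correlations between region time series (one list per region)."""
    return [[pearson(a, b) for b in regions] for a in regions]

# Three hypothetical regional signals, four time points each.
regions = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],   # scaled copy of the first region
    [4.0, 3.0, 2.0, 1.0],   # reversed trend of the first region
]
matrix = connectivity_matrix(regions)
```

Each off-diagonal entry of `matrix` would then serve as one candidate feature for a downstream classifier; the matrix is symmetric, so only the upper triangle carries distinct values.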

Statistical classification

10.1117/12.2538059

Citations (3)

Feature Selection In Support Vector Machine And Random Forest Algorithms For The Classification Of Recipients Of The Smart Indonesia Program

Nanda Try Luchia Mustakim Mustakim Noviarni Noviarni Kelik Sussolaikah Teguh Arifianto

Statistical classification

Feature (linguistics)

Feature vector

10.1109/iccsc62074.2024.10616886

Citations (1)

Skin Disease Classification: A Comparative Analysis of K-Nearest Neighbors (KNN) and Random Forest Algorithm

2021 International Conference on Electronics, Communications and Information Technology (ICECIT) (2021)

Osim Kumar Pal

Skin disease is a widespread and severe issue in today's world. Skin disorder classification is crucial for diagnosis. Several new data mining algorithms have been developed to classify and interpret medical images. This article describes how the K-Nearest Neighbors (KNN) and Random Forest (RF) algorithms work and analyzes their results. Furthermore, this study demonstrates a high-performing approach that saves both effort and money. The proposed model is based on KNN and the Random Forest algorithm. Patients can use this model for a primary detection of their skin disease, and doctors can use it to confirm their judgment. Traditional skin disease diagnosis is an expensive and time-consuming procedure. The classification model proposed in this paper identifies ten different skin diseases. The Random Forest algorithm has a testing accuracy of 94.22 percent, and K-Nearest Neighbors (KNN) has a testing accuracy of 95.23 percent. The KNN algorithm has an F1 score of 95.98 percent, whereas the Random Forest (RF) algorithm has an F1 score of 95.94 percent. These scores could be increased by expanding the dataset and extracting more features. This approach may benefit individuals with skin illness who are looking to save money and time as well as avoid skin cancer by identifying cancer at an early stage.
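Because that abstract reports both accuracy and F1 score, a short sketch of how the two metrics differ may help. The abstract does not say how F1 was averaged over the ten classes; macro-averaging is assumed here, and the labels below are invented.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_for_class(y_true, y_pred, cls):
    """Harmonic mean of precision and recall for a single class."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0  # no correct predictions for this class
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (an assumed averaging scheme)."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)

# Invented two-class example.
y_true = ["acne", "acne", "eczema", "eczema"]
y_pred = ["acne", "eczema", "eczema", "eczema"]
acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
```

Accuracy counts every sample equally, while macro F1 counts every class equally, so the two can rank classifiers differently when classes are unevenly represented.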

Statistical classification

Feature (linguistics)

10.1109/icecit54077.2021.9641120

Citations (9)

Comparison of Machine Learning Algorithms for Shelter Animal Classification

Katarina Mitrović Danijela Milošević Marian Greconici

Establishing which characteristics of a shelter animal determine its outcome is an important task for solving the problem of homeless and abused animals. The main goal of this research was to identify which machine learning algorithm can provide the most accurate prediction of the outcome for an animal, based on its main features. The first step in this research was the transformation of data into a proper form for the implementation of the algorithms. Furthermore, several machine learning algorithms were trained in order to achieve the best possible classification results. The results of the algorithms were compared and the most suitable algorithms were selected based on their performance metrics. This research proposes using a combination of multiple data preprocessing techniques, imbalanced data and machine learning algorithms for predicting the outcome for a shelter animal based on its characteristics. K-Nearest Neighbors and C4.5 algorithms provided the best classification results in this research.

Statistical classification

Data pre-processing

10.1109/saci46893.2019.9111575

Citations (4)

Machine Learning Based Heart Disease Prediction Using Random Forest

INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT (2024)

Swarna S

A recent study by the World Health Organization sheds light on the alarming increase in cardiovascular diseases, contributing to approximately 17.9 million deaths annually. This study delves into the effectiveness of employing the Random Forest algorithm, a robust machine learning approach, to forecast the likelihood of heart disease based on diverse risk factors. By leveraging a dataset encompassing demographic, clinical, and lifestyle attributes, the Random Forest model underwent training to categorize individuals into two groups: those with or without heart disease. Through meticulous feature selection and ensemble learning, the algorithm adeptly captures intricate relationships among predictors, thereby augmenting prediction accuracy. Evaluation metrics including accuracy and the AUC-ROC curve were employed to determine the model's effectiveness. Impressively, our model achieves a prediction accuracy of 97%. Moreover, a comparative analysis with other prominent machine learning models such as Naive Bayes, Support Vector Machine (SVM), Logistic Regression (LR), XGBoost and Decision Tree revealed that the Random Forest approach outperforms the others in terms of accuracy and efficiency in prediction tasks. Keywords: Random Forest (RF), Machine Learning (ML), Accuracy, Classification.

Ensemble Learning

Predictive modelling

10.55041/ijsrem31320

Citations (0)

Face classification by a random forest

Abbas Z. Kouzani Saeid Nahavandi Khashayar Khoshmanesh

This paper presents a random forest-based face image classification method. The random forest is an ensemble learning method that grows many classification trees. Each tree gives a classification. The forest selects the classification that has the most votes. Three experiments are performed. The random forest-based method together with several existing approaches are trained and evaluated. The experimental results are presented and discussed.
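The voting rule described in that abstract (each tree gives a classification, and the forest selects the one with the most votes) can be sketched in a few lines. The three "trees" below are hypothetical one-rule stand-ins over made-up features, not trained classification trees, and `forest_predict` is an illustrative name.

```python
from collections import Counter

def forest_predict(trees, sample):
    """Collect one vote per tree and return the label with the most votes."""
    votes = Counter(tree(sample) for tree in trees)
    return votes.most_common(1)[0][0]

# Hypothetical stand-ins for trained trees: each is a one-rule classifier
# over a made-up feature dictionary.
trees = [
    lambda s: "face" if s["symmetry"] > 0.5 else "non-face",
    lambda s: "face" if s["edge_density"] > 0.3 else "non-face",
    lambda s: "face" if s["contrast"] > 0.7 else "non-face",
]

sample = {"symmetry": 0.8, "edge_density": 0.4, "contrast": 0.2}
label = forest_predict(trees, sample)  # two of the three trees vote "face"
```

Using an odd number of trees for a two-class problem avoids tied votes; with more classes a tie-breaking rule would be needed.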

Contextual image classification

Statistical classification

Ensemble Learning

Tree (set theory)

10.1109/tencon.2007.4428937

Citations (24)

Attribute Selection Analysis for the Random Forest Classification in Unbalanced Diabetes Dataset

2020 International Seminar on Application for Technology of Information and Communication (iSemantic) (2021)

Eko Hari Rachmawanto De Rosal Ignatius Moses Setiadi Nova Rijati Ajib Susanto Ibnu Utomo Wahyu Mulyono

In the case of diabetes classification, a lot of research has focused on one dataset with limited attributes, so that various methods are used to optimize the classification process and improve accuracy. This research aims to prove that attribute selection has an important role in the classification process, especially in the Random Forest (RF) method. RF has a bagging feature that is quite reliable in the learning process on an unbalanced dataset. In this research, two datasets that have different types and numbers of attributes were compared. The results of the test show that the attributes age, bs_fast, bs_pp, plasma_r, plasma_f, hba1c, and type/class have an important role in improving the accuracy of the classification of diabetes mellitus. By using the RF method, data cleaning and attribute selection can produce 100% accuracy on Abel Vika's Diabetes dataset, even though it uses a relatively small number of trees. This has also been validated by the k-fold cross-validation method.
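The k-fold cross-validation mentioned at the end of that abstract splits the data into k parts and holds each part out once for testing. A minimal index-splitting sketch follows; `k_fold_indices` is a hypothetical helper, and it neither shuffles nor stratifies, which one would normally want on an unbalanced dataset.

```python
def k_fold_indices(n_samples, k):
    """Return k (train_indices, test_indices) pairs covering every sample once as test."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    indices = list(range(n_samples))
    folds, start = [], 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds

folds = k_fold_indices(n_samples=10, k=3)
```

A model would be fitted on each `train` list and scored on the matching `test` list, and the k scores averaged to give the cross-validated estimate.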

Statistical classification

10.1109/isemantic52711.2021.9573181

Citations (3)