Training classifiers on imbalanced data is one of the main challenges in machine learning. Several techniques have been proposed to address this problem; one of the most important is SMOTE, which artificially generates new instances by interpolating between pairs of original instances. This paper proposes a new approach to balancing classifier training data. It is a geometric, spatial approach that uses tetrahedralization by Delaunay tessellation to generate new artificial instances, which allows us to go beyond single-pair interpolation. Applying this new method, we observe an improvement in the classification quality (in terms of AUC) of the kNN classification algorithm compared with SMOTE.
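The contrast between single-pair interpolation and interpolation inside a simplex can be sketched in a few lines. This is a minimal illustration and not the paper's implementation: the function names `interpolate_pair` and `interpolate_simplex` are hypothetical, and drawing the barycentric weights from a flat Dirichlet distribution is an assumption made here for simplicity.

```python
import numpy as np


def interpolate_pair(a, b, rng):
    """SMOTE-style step: a random point on the segment between a and b."""
    gap = rng.random()  # uniform in [0, 1)
    return a + gap * (b - a)


def interpolate_simplex(vertices, rng):
    """Random point inside the simplex whose vertices are the rows of `vertices`.

    Assumption: the weights are drawn from a flat Dirichlet distribution,
    so they are non-negative and sum to 1 (barycentric coordinates).
    """
    weights = rng.dirichlet(np.ones(len(vertices)))
    return weights @ vertices


rng = np.random.default_rng(0)

# A unit tetrahedron standing in for four neighbouring minority instances.
tetrahedron = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

on_edge = interpolate_pair(tetrahedron[0], tetrahedron[1], rng)
inside = interpolate_simplex(tetrahedron, rng)
```

The pairwise step can only reach points on an edge, whereas the simplex step can reach any point in the volume spanned by the four instances.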
One of the significant challenges in machine learning is the classification of imbalanced data. In many situations, standard classifiers cannot learn to distinguish minority-class examples from the others. Since many real-world problems are imbalanced, this problem has become highly relevant and is widely studied. This paper presents a new preprocessing method based on Delaunay tessellation and the preprocessing algorithm SMOTE (Synthetic Minority Over-sampling Technique), which we call DTO-SMOTE (Delaunay Tessellation Oversampling SMOTE). DTO-SMOTE constructs a mesh of simplices (in this paper, tetrahedra) for creating synthetic examples. We compare its results with those of five preprocessing algorithms (GEOMETRIC-SMOTE, SVM-SMOTE, SMOTE-BORDERLINE-1, SMOTE-BORDERLINE-2, and SMOTE), using eight classification algorithms and 61 binary-class data sets. For some classifiers, DTO-SMOTE outperforms the other methods in terms of Area Under the ROC Curve (AUC), Geometric Mean (GEO), and Generalized Index of Balanced Accuracy (IBA).
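The idea of building a simplex mesh and sampling synthetic examples from it can be sketched with `scipy.spatial.Delaunay`. This is a hedged illustration under several assumptions and should not be read as the DTO-SMOTE algorithm itself: the function name `delaunay_oversample` is hypothetical, the minority data are assumed to be three-dimensional already, simplices are chosen uniformly at random, and points inside each tetrahedron are drawn with flat Dirichlet weights.

```python
import numpy as np
from scipy.spatial import Delaunay


def delaunay_oversample(minority, n_new, seed=0):
    """Draw `n_new` synthetic points from a Delaunay mesh of `minority`.

    Assumptions (not taken from the paper): uniform choice of simplex and
    flat Dirichlet barycentric weights inside the chosen simplex.
    """
    rng = np.random.default_rng(seed)
    mesh = Delaunay(minority)  # mesh.simplices holds vertex indices per simplex
    chosen = rng.integers(len(mesh.simplices), size=n_new)
    vertices = minority[mesh.simplices[chosen]]  # shape (n_new, d + 1, d)
    weights = rng.dirichlet(np.ones(vertices.shape[1]), size=n_new)
    # Convex combination of each simplex's vertices.
    return np.einsum("ij,ijk->ik", weights, vertices)


# Made-up minority class: 30 random points in 3-D.
rng = np.random.default_rng(1)
minority = rng.random((30, 3))
synthetic = delaunay_oversample(minority, n_new=50)
```

Because every synthetic point is a convex combination of existing minority instances, it always lies inside the convex hull of the minority class.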
The aim of this research is to analyze CT images of stromatolites from Salt Lake, RJ, Brazil, using a neural network tool, the self-organizing map (SOM), developed by Kohonen in the 1980s. This research analyzed images of various sections of stromatolites and identified three classes: pore, sediment, and rock. The technique is automatic, fast, and highly accurate, so it can be used as an alternative to the conventional methods, generating 3D images.
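A self-organizing map that groups pixel intensities into three clusters can be sketched as follows. This is a minimal illustration and not the configuration used in the study: the function names `train_som` and `assign`, the one-dimensional map with three nodes, the linear decay schedules, and the synthetic intensity values are all assumptions.

```python
import numpy as np


def train_som(samples, n_nodes=3, epochs=20, lr0=0.5, sigma0=1.0, seed=0):
    """Train a 1-D SOM; returns node weights of shape (n_nodes, n_features)."""
    rng = np.random.default_rng(seed)
    # Initialise the nodes from randomly chosen samples.
    weights = rng.choice(samples, size=n_nodes, replace=False).astype(float)
    positions = np.arange(n_nodes)
    n_steps = epochs * len(samples)
    step = 0
    for _ in range(epochs):
        for x in rng.permutation(samples):
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
            sigma = sigma0 * (1.0 - frac) + 1e-3     # neighbourhood width shrinks
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))  # best-matching unit
            influence = np.exp(-((positions - bmu) ** 2) / (2.0 * sigma ** 2))
            weights += lr * influence[:, None] * (x - weights)
            step += 1
    return weights


def assign(samples, weights):
    """Index of the nearest SOM node for every sample."""
    dists = np.linalg.norm(samples[:, None, :] - weights[None, :, :], axis=2)
    return np.argmin(dists, axis=1)


# Made-up grey-level intensities around three values, one feature per pixel.
rng = np.random.default_rng(42)
pixels = np.concatenate([rng.normal(m, 0.02, 200) for m in (0.1, 0.5, 0.9)])[:, None]

weights = train_som(pixels)
labels = assign(pixels, weights)
```

The SOM only produces cluster indices; mapping each node to pore, sediment, or rock would still require inspection by a specialist.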
The objective of this research is to expand the possible applications of the neural network tools SOM and LVQ, developed by Kohonen in the 1980s, by deepening the study of geological correlation between wells. This research analyzed well-log values and found similar patterns that define layers between the analyzed wells. The technique is automatic, fast, and highly accurate, so it can be used as an alternative to conventional methods.
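The supervised side of this approach, LVQ, can be sketched with the basic LVQ1 update rule. This is an illustration only: the abstract does not say how SOM and LVQ are combined, so the sketch shows LVQ1 on its own, and the function names `lvq1_fit` and `lvq1_predict`, the two-feature "well log" samples, and the layer labels are all made up.

```python
import numpy as np


def lvq1_fit(X, y, prototypes, proto_labels, epochs=30, lr0=0.1, seed=0):
    """LVQ1: pull the winning prototype towards same-class samples, push it away otherwise."""
    rng = np.random.default_rng(seed)
    protos = prototypes.astype(float).copy()
    for epoch in range(epochs):
        lr = lr0 * (1.0 - epoch / epochs)  # linearly decaying learning rate
        for i in rng.permutation(len(X)):
            winner = np.argmin(np.linalg.norm(protos - X[i], axis=1))
            direction = 1.0 if proto_labels[winner] == y[i] else -1.0
            protos[winner] += direction * lr * (X[i] - protos[winner])
    return protos


def lvq1_predict(X, protos, proto_labels):
    """Label of the nearest prototype for every sample."""
    dists = np.linalg.norm(X[:, None, :] - protos[None, :, :], axis=2)
    return proto_labels[np.argmin(dists, axis=1)]


# Made-up data: two "layers", each described by two normalised log readings.
rng = np.random.default_rng(7)
layer0 = rng.normal([0.2, 0.3], 0.05, size=(100, 2))
layer1 = rng.normal([0.8, 0.7], 0.05, size=(100, 2))
X = np.vstack([layer0, layer1])
y = np.array([0] * 100 + [1] * 100)

# One prototype per layer, initialised from a sample of that layer.
proto_labels = np.array([0, 1])
protos = lvq1_fit(X, y, X[[0, 100]], proto_labels)
predicted = lvq1_predict(X, protos, proto_labels)
accuracy = (predicted == y).mean()
```

In a correlation setting, prototypes trained on labelled intervals of one well could then be used to label depth samples from another well, provided the logs are normalised consistently.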