This dataset contains key characteristics about the data described in the Data Descriptor "A database for using machine learning and data mining techniques for coronary artery disease diagnosis". Contents: 1. human-readable metadata summary table in CSV format; 2. machine-readable metadata file in JSON format.
Versioning Note: Version 2 was generated when the metadata format was updated from JSON to JSON-LD. This was an automatic process that changed only the format, not the contents, of the metadata.
Image classification (categorization) can be considered one of the most exciting domains of contemporary research. People cannot hide their faces and facial features, since these are needed for daily communication. Therefore, face recognition is extensively used in biometric applications for security and personnel attendance control. In this study, a novel face recognition method based on a perceptual hash is presented. The proposed perceptual hash is used in the preprocessing and feature extraction phases. The Discrete Wavelet Transform (DWT) and a novel graph-based binary pattern, called the quintet triple binary pattern (QTBP), are used. The K-Nearest Neighbors (KNN) and Support Vector Machine (SVM) algorithms are employed for the classification task. The proposed face recognition method is tested on five well-known face datasets: AT&T, Face94, CIE, AR and LFW. It achieved 100.0% classification accuracy on the AT&T, Face94 and CIE datasets, 99.4% on the AR dataset and 97.1% on the LFW dataset. The time cost of the proposed method is O(n log n). The results and comparisons indicate that the proposed method has very good classification capability with a short execution time.
Although neural networks (especially deep neural networks) have achieved \textit{better-than-human} performance in many fields, their real-world deployment is still questionable due to their lack of awareness of the limits of their knowledge. To incorporate such awareness into a machine learning model, prediction with a reject option (also known as selective classification or classification with abstention) has been proposed in the literature. In this paper, we present a systematic review of prediction with the reject option in the context of various neural networks. To the best of our knowledge, this is the first study focusing on this aspect of neural networks. Moreover, we discuss different novel loss functions related to the reject option and post-training processing (if any) of the network output for generating suitable measurements of the model's knowledge awareness. Finally, we address the application of the reject option in reducing the prediction time for real-time problems and present a comprehensive summary of the techniques related to the reject option in the context of an extensive variety of neural networks. Our code is available on GitHub: \url{https://github.com/MehediHasanTutul/Reject_option}
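One common way to realize a reject option is to abstain whenever the top softmax probability falls below a threshold. The sketch below illustrates that idea only; the function names and the default threshold of 0.8 are assumptions made for illustration and are not taken from the reviewed paper or its repository.

```python
import math
from typing import List, Optional


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_reject(logits: List[float], threshold: float = 0.8) -> Optional[int]:
    """Return the predicted class index, or None to abstain.

    Assumption: confidence is the maximum softmax probability, and the
    threshold value is a hypothetical default chosen for illustration.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best if probs[best] >= threshold else None


def coverage(decisions: List[Optional[int]]) -> float:
    """Fraction of inputs on which the model did not abstain."""
    return sum(d is not None for d in decisions) / len(decisions)
```

Raising the threshold lowers coverage but typically raises accuracy on the accepted inputs, which is the trade-off that selective classification methods try to control.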
Credit scoring (CS) is an effective and crucial approach used for risk management in banks and other financial institutions. It provides appropriate guidance on granting loans and reduces risks in the financial area. Hence, companies and banks are trying to use novel automated solutions to deal with the CS challenge and protect their own finances and customers. Different machine learning (ML) and data mining (DM) algorithms have been used to improve various aspects of CS prediction. In this paper, we introduce a novel methodology, named Deep Genetic Hierarchical Network of Learners (DGHNL). The proposed methodology comprises different types of learners, including Support Vector Machines (SVM), k-Nearest Neighbors (kNN), Probabilistic Neural Networks (PNN), and fuzzy systems. The Statlog German (1000 instances) credit approval dataset available in the UCI machine learning repository is used to test the effectiveness of our model in the CS domain. Our DGHNL model encompasses five kinds of learners, two kinds of data normalization procedures, two feature extraction methods, three kinds of kernel functions, and three kinds of parameter optimizations. Furthermore, the model applies deep learning, ensemble learning, supervised training, layered learning, genetic selection of features (attributes), genetic optimization of learners' parameters, and novel genetic layered training (selection of learners) approaches, used along with the stratified 10-fold cross-validation (CV) training-testing method. The novelty of our approach relies on a proper flow and fusion of information (the DGHNL structure and its optimization). We show that the proposed DGHNL model with a 29-layer structure is capable of achieving a prediction accuracy of 94.60% (54 errors per 1000 classifications) on the Statlog German credit approval data. This is the best prediction performance for this well-known credit scoring dataset, compared to the existing work in the field.
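The evaluation protocol named above, stratified 10-fold cross-validation, can be sketched as follows. This is a minimal illustration rather than the authors' code: the helper names are hypothetical, the split is deterministic (a real run would usually shuffle within each class first), and the 700/300 class split in the usage lines is an illustrative input.

```python
from collections import defaultdict
from typing import Hashable, List, Sequence


def stratified_folds(labels: Sequence[Hashable], k: int = 10) -> List[List[int]]:
    """Assign sample indices to k folds so each fold keeps the class ratio."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds: List[List[int]] = [[] for _ in range(k)]
    # Deal each class's indices round-robin across the folds.
    for indices in by_class.values():
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def accuracy_from_errors(errors: int, total: int) -> float:
    """Accuracy implied by a number of misclassifications."""
    return 1.0 - errors / total


# Illustrative labels: 700 samples of one class and 300 of the other.
labels = [0] * 700 + [1] * 300
folds = stratified_folds(labels, k=10)
# The abstract's figure: 54 errors per 1000 classifications.
reported = accuracy_from_errors(54, 1000)
```

Each fold serves once as the test set while the remaining nine are used for training, so every instance is classified exactly once, which is how an error count per 1000 classifications arises.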
In a long opinionated review, there may be several targets described by different potential terms. Traditional review-level techniques for Persian sentiment analysis addressed the problem with a one-method-fits-all solution in which the overall polarity of a review is calculated using all its opinionated words without considering their targets. In this article, a new method is proposed, which first decomposes a long review into its constituent sentences and then detects the main target of each sentence. In the next step, five policies, namely most occurring first (MOF), most general first (MGF), most specific first (MSF), first occurring first (FOF), and last occurring first (LOF), are proposed to determine the main target of the review. Finally, using part-of-speech (POS) tags, potential terms in the sentences are identified and a comprehensive sentiment lexicon is employed to compute the polarity of the sentences. To evaluate the proposed method, three data sets of user reviews about different topics, including digital equipment, hotels, and movies, are created, as no previous study has addressed the problem of target identification in the Persian language. Comparing the proposed method with a state-of-the-art lexicon-based method shows that specifying the main targets of reviews can improve system performance by about 17% and 12% in terms of accuracy and F1-measure, respectively. Moreover, the proposed method using the MGF policy achieves the best performance in finding the main target of reviews, while for finding the ultimate polarity of reviews, the MOF policy outperforms the other policies.
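The five selection policies can be sketched as a single function over the per-sentence targets. This is an illustrative reading of the policy names, not the authors' implementation: the `depth` map used for MGF and MSF (smaller depth meaning a more general target), the tie-breaking by first occurrence, and the example targets are all assumptions.

```python
from collections import Counter
from typing import Dict, List, Optional


def main_target(
    sentence_targets: List[Optional[str]],
    policy: str,
    depth: Optional[Dict[str, int]] = None,
) -> Optional[str]:
    """Pick the review-level target from the per-sentence targets.

    sentence_targets holds one detected target per sentence (None if the
    sentence has no target). depth is a hypothetical generality map:
    smaller values mean more general targets.
    """
    targets = [t for t in sentence_targets if t is not None]
    if not targets:
        return None
    if policy == "FOF":  # first occurring first
        return targets[0]
    if policy == "LOF":  # last occurring first
        return targets[-1]
    if policy == "MOF":  # most occurring first
        counts = Counter(targets)
        # Counter keeps insertion order, so ties go to the earliest target.
        return max(counts, key=counts.__getitem__)
    if policy in ("MGF", "MSF"):  # most general / most specific first
        if depth is None:
            raise ValueError("MGF and MSF need a depth map")
        pick = min if policy == "MGF" else max
        return pick(targets, key=lambda t: depth[t])
    raise ValueError(f"unknown policy: {policy}")


# Hypothetical review with five sentences, one of them without a target.
per_sentence = ["camera", "phone", "camera", None, "battery"]
generality = {"phone": 0, "camera": 1, "battery": 1}
```

With these inputs, MOF and FOF both select "camera", LOF selects "battery", and MGF selects "phone" because it sits highest in the assumed hierarchy.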
Deep neural networks (DNNs) have achieved state-of-the-art performance in numerous fields. However, DNNs need long computation times, and people always expect better performance with less computation. Therefore, we study the human somatosensory system and design a neural network (SpinalNet) to achieve higher accuracy with fewer computations. Hidden layers in traditional NNs receive inputs from the previous layer, apply an activation function, and then transfer the outcomes to the next layer. In the proposed SpinalNet, each layer is divided into three splits: 1) input split, 2) intermediate split, and 3) output split. The input split of each layer receives a part of the inputs. The intermediate split of each layer receives the outputs of the intermediate split of the previous layer and the outputs of the input split of the current layer. The number of incoming weights becomes significantly lower than in traditional DNNs. The SpinalNet can also be used as the fully connected or classification layer of a DNN and supports both traditional learning and transfer learning. We observe significant error reductions with lower computational costs in most of the DNNs. Traditional learning on the VGG-5 network with SpinalNet classification layers provided state-of-the-art (SOTA) performance on the QMNIST, Kuzushiji-MNIST, and EMNIST (Letters, Digits, and Balanced) datasets. Traditional learning with ImageNet pre-trained initial weights and SpinalNet classification layers provided SOTA performance on the STL-10, Fruits 360, Bird225, and Caltech-101 datasets. The scripts of the proposed SpinalNet are available at the following link: https://github.com/dipuk0506/SpinalNet
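The claim about fewer incoming weights can be illustrated with a rough count. The sketch below is not the authors' code and rests on several assumptions: inputs are divided evenly among the layers, the input split simply forwards its chunk, every intermediate split has the same width, and the output split reads all intermediate outputs. Biases are ignored, and the layer sizes in the usage lines are hypothetical.

```python
def dense_weights(n_in: int, hidden: int, n_out: int) -> int:
    """Weights in a conventional one-hidden-layer fully connected head."""
    return n_in * hidden + hidden * n_out


def spinal_weights(n_in: int, n_layers: int, width: int, n_out: int) -> int:
    """Weights in a spinal-style head under the assumptions stated above."""
    chunk = n_in // n_layers  # assumption: even division of the inputs
    total = 0
    for layer in range(n_layers):
        # Each intermediate split sees its input chunk plus, after the
        # first layer, the previous intermediate split's outputs.
        incoming = chunk + (width if layer > 0 else 0)
        total += incoming * width
    # Assumption: the output split is connected to every intermediate split.
    total += n_layers * width * n_out
    return total


# Hypothetical sizes: 784 inputs, 10 classes, 4 layers of width 32,
# compared against a dense head with the same 128 hidden units in total.
dense = dense_weights(784, 128, 10)
spinal = spinal_weights(784, 4, 32, 10)
```

Under these assumptions the spinal-style head uses well under a third of the dense head's weights for the same number of hidden units, because no single unit is connected to the full input vector.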