Feature Selection Method Based on Class Discriminative Degree for Intelligent Medical Diagnosis
Abstract:
Efficient and timely medical diagnostic decision making lets clinicians positively impact the quality and cost of medical care. However, the high similarity of clinical manifestations between diseases and the limits of clinicians’ knowledge both make diagnostic decision making difficult. Therefore, building a decision support system that can assist medical staff in diagnosing and treating diseases has lately received growing attention in the medical domain. In this paper, we employ a multi-label classification framework to classify Chinese electronic medical records, establishing a correspondence between medical records and disease categories, and compare this method with a traditional medical expert system to verify its performance. To select the best subset of patient features, we propose a feature selection method based on the composition and distribution of symptoms in electronic medical records and compare it with traditional feature selection methods such as the chi-square test. We evaluate the feature selection methods and diagnostic models on two measures, false negative rate (FNR) and accuracy. Extensive experiments have been conducted on a real-world Chinese electronic medical record database. The evaluation results demonstrate that our proposed feature selection method improves accuracy and reduces the FNR compared to the traditional feature selection methods, and that the multi-label classification framework has better accuracy and lower FNR than the traditional expert system.
Keywords:
Discriminative model
Feature (linguistics)
Similarity (geometry)
Medical record
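The abstract above names the chi-square test as a baseline feature selector and reports false negative rate (FNR) and accuracy. The following is a minimal sketch of those three quantities for a single binary symptom and a single binary disease label. It is not the paper's proposed method, and the symptom names and toy data are hypothetical; a multi-label setting would typically repeat this per disease label.

```python
from typing import Dict, List, Sequence


def chi_square_2x2(feature: Sequence[int], label: Sequence[int]) -> float:
    """Chi-square statistic for a binary feature against a binary class label."""
    pairs = list(zip(feature, label))
    a = sum(1 for f, y in pairs if f == 1 and y == 1)  # symptom present, disease present
    b = sum(1 for f, y in pairs if f == 1 and y == 0)  # symptom present, disease absent
    c = sum(1 for f, y in pairs if f == 0 and y == 1)  # symptom absent, disease present
    d = sum(1 for f, y in pairs if f == 0 and y == 0)  # symptom absent, disease absent
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a constant feature or label carries no evidence of association
    return n * (a * d - b * c) ** 2 / denom


def false_negative_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """FNR = FN / (FN + TP): the share of true cases the model missed."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return fn / (fn + tp) if (fn + tp) else 0.0


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of predictions that match the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical toy records: 1 means the symptom (or disease) is present.
symptoms: Dict[str, List[int]] = {"fever": [1, 1, 0, 0], "cough": [1, 0, 1, 0]}
disease: List[int] = [1, 1, 0, 0]

# Rank symptoms by chi-square score, highest first.
ranked = sorted(symptoms, key=lambda s: chi_square_2x2(symptoms[s], disease), reverse=True)
print(ranked)
```

A selector built this way would keep the top-ranked symptoms as features. In a diagnostic setting the FNR deserves separate attention from accuracy because a missed disease is usually the costlier error.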
Data set
Supervised Learning
Tree (set theory)
Health care providers that use electronic medical records maintain an administrative database of diagnoses generated by physicians in the course of medical care delivery. This database is subsequently used for billing and reimbursement but can also be used to identify patients for clinical research. In this paper we present a hybrid rule-based and machine learning technique for automatically determining whether a diagnosis is confirmed, is probable, or represents a history of a disorder. The rule-based stage was able to classify 86% of test instances with an accuracy of 98.7%. The machine learning stage was able to classify the remaining 14% of the test instances with an accuracy of 91.61%, using a perceptron neural network as the classification algorithm. A comparison between Naïve Bayes and the perceptron is also presented.
Reimbursement
Multilayer perceptron
Perceptron
Medical record
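The two-stage design described above (rules first, a learned classifier for whatever the rules leave undecided) can be sketched as a simple cascade. The trigger phrases and the fallback function below are hypothetical placeholders, not the rules or the perceptron from the paper. The second helper combines the per-stage figures quoted in the abstract into an overall accuracy, assuming the 86% and 14% portions are disjoint and together cover the whole test set.

```python
from typing import Callable, Sequence, Tuple

Rule = Tuple[str, str]  # (lowercase trigger phrase, label); hypothetical format


def cascade_classify(
    text: str, rules: Sequence[Rule], fallback: Callable[[str], str]
) -> Tuple[str, str]:
    """Return (label, stage): the first matching rule wins, otherwise the model decides."""
    lowered = text.lower()
    for phrase, label in rules:
        if phrase in lowered:
            return label, "rule"
    return fallback(text), "model"


def combined_accuracy(stages: Sequence[Tuple[float, float]]) -> float:
    """Weighted accuracy over disjoint stages given as (share of instances, accuracy)."""
    return sum(share * acc for share, acc in stages)


# Hypothetical rules and a stand-in for the trained classifier.
example_rules = [("history of", "history"), ("probable", "probable")]
label, stage = cascade_classify("History of asthma", example_rules, lambda t: "confirmed")
print(label, stage)

# Figures from the abstract: 86% of instances at 98.7%, the remaining 14% at 91.61%.
overall = combined_accuracy([(0.86, 0.987), (0.14, 0.9161)])
print(round(overall, 4))
```

Under that assumption the overall accuracy works out to roughly 97.7%, which shows how a high-precision rule stage handling most instances dominates the combined figure.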
In recent years, artificial intelligence based disease diagnosis has drawn considerable attention in both academia and industry. In medical scenarios, a well-trained classifier can effectively detect a disease given sufficient features associated with medical tests. However, such features are not always readily available because of the high cost in time and money associated with medical tests. To address this, this study identifies the diagnostic strategy learning problem and proposes a novel framework consisting of three components to learn a diagnostic strategy with limited features. First, as patients' medical records are often incomplete, a sequence encoder is designed to encode any set of information of varying size into fixed-length vectors. Second, taking the output of the encoder as input, a feature selector based on reinforcement learning techniques is proposed to learn the best feature sequence for diagnosis. Finally, with the best feature sequence, an oracle classifier is used to give the final diagnosis. To evaluate the performance of the proposed method, experiments are conducted on nine real medical datasets. The results suggest that the proposed method is effective for providing personalized diagnostic strategies and makes better diagnoses with fewer features compared with existing methods.
ENCODE
Feature (linguistics)
Feature Engineering
Workbench
Feature (linguistics)
Relevance
Automated knowledge acquisition is an important research issue in developing medical expert systems. While several methods of symbolic inductive learning have been proposed, most of the approaches focus on inducing rules to classify cases correctly. Medical experts, in contrast, also learn other information important for medical diagnostic procedures from clinical cases. In order to acquire both kinds of knowledge, we developed a program that extracts not only classification rules for differential diagnosis, but also other medical knowledge needed for diagnosis. This system is based on the diagnostic model of the medical expert system RHINOS, which diagnoses causes of headache and facial pain. We applied this program to the same domain and compared the induced results with expert rules. The results show that combining a rule induction method with resampling methods is effective for estimating the performance of induced results, especially when only small training samples are available without domain knowledge.
Rule induction
Knowledge Acquisition
Resampling
Subject-matter expert
Rule induction
Decision rule
Resampling
Knowledge Acquisition
As a branch of healthcare, medical diagnosis can be defined as finding the disease based on the signs and symptoms of the patient. To this end, the required information is gathered from different sources like physical examination, medical history and general information of the patient. Development of smart classification models for medical diagnosis is of great interest amongst the researchers. This is mainly owing to the fact that the machine learning and data mining algorithms are capable of detecting the hidden trends between features of a database. Hence, classifying the medical datasets using smart techniques paves the way to design more efficient medical diagnostic decision support systems.
Several databases have been provided in the literature to investigate different aspects of diseases. As an alternative to the available diagnosis tools/methods, this research involves machine learning algorithms called Classification and Regression Tree (CART), Random Forest (RF) and Extremely Randomized Trees or Extra Trees (ET) for the development of classification models that can be implemented in computer-aided diagnosis systems. As a decision tree (DT), CART is fast to create, and it applies to both the quantitative and qualitative data. For classification problems, RF and ET employ a number of weak learners like CART to develop models for classification tasks.
We employed the Wisconsin Breast Cancer Database (WBCD), the Z-Alizadeh Sani dataset for coronary artery disease (CAD) and the databanks gathered in Ghaem Hospital’s dermatology clinic on the response of patients with common and/or plantar warts to cryotherapy and/or immunotherapy. To classify the breast cancer type based on the WBCD, the RF and ET methods were employed. It was found that the developed RF and ET models predict the WBCD type with 100% accuracy in all cases. To choose the proper treatment approach for warts as well as for the CAD diagnosis, the CART methodology was employed. The error analysis revealed that the proposed CART models attain the highest precision for the applications of interest, and that no model in the literature rivals them. The outcome of this study supports the idea that methods like CART, RF and ET not only improve diagnostic precision, but also reduce the time and expense needed to reach a diagnosis. However, since these strategies are highly sensitive to the quality and quantity of the input data, more extensive databases with a greater number of independent parameters might be required before the developed models can be applied in practice.
Cart
Statistical classification
Tree (set theory)
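CART classification trees are commonly grown by choosing, at each node, the split that minimizes the weighted Gini impurity of the two child nodes. The sketch below shows that search for one numeric feature. It is an illustration under that assumption, since the abstract does not state the split criterion or settings the study used, and the toy values are hypothetical. RF and ET build many such trees; RF typically searches for the best threshold on bootstrap samples with random feature subsets, whereas ET draws candidate thresholds at random.

```python
from collections import Counter
from typing import List, Optional, Sequence, Tuple


def gini(labels: Sequence[int]) -> float:
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(
    values: Sequence[float], labels: Sequence[int]
) -> Tuple[Optional[float], float]:
    """Try midpoints between distinct values; return (threshold, weighted Gini)."""
    pairs = list(zip(values, labels))
    distinct = sorted(set(values))
    best: Tuple[Optional[float], float] = (None, float("inf"))
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left: List[int] = [y for x, y in pairs if x <= threshold]
        right: List[int] = [y for x, y in pairs if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best  # (None, inf) when the feature has a single distinct value


# Hypothetical feature values and binary class labels.
print(best_threshold([1.0, 2.0, 3.0, 4.0], [0, 0, 1, 1]))
```

A full tree would apply this search to every feature at every node and recurse on the two resulting subsets until a stopping rule is met.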
Data mining is the process of applying relevant information or knowledge extracted from large databases to decision making. It is widely used in every sector, and it is especially helpful in health care, where it allows complicated diseases to be diagnosed easily and accurately. To diagnose disease, a decision support system based on the decision tree technique is proposed, so that the necessary decision can be made after analyzing the patient's input data. The model is built with decision tree classification; various decision tree based techniques are explored in this study and evaluated using accuracy, sensitivity, specificity, precision, recall, F-measure and ROC area. Diagnosing dermatological (skin) diseases is extremely difficult because all six categories of these diseases share similar clinical features. The function tree technique performs very well, with experimental results of 100% accuracy, 100% sensitivity and 100% specificity. Feature selection methods are applied to speed up the model: they remove redundant and unwanted features, so that only a set of effective features is required for diagnosis. Best first search and rank search are the most suitable feature selection methods for strengthening the efficiency of the proposed model for skin diseases.
Tree (set theory)
As a part of electronic healthcare systems, medical diagnostic decision support systems have become more popular in clinical routine. Choosing the best model is critical to providing reliable machine learning based decision support for diagnostic problems. In this study, the performance of common classification algorithms has been comparatively evaluated using public medical datasets. The experimental results reveal that, although there is no single best algorithm for all datasets, the MLP and Naive Bayes methods provided relatively higher success rates.
Statistical classification