Stacked Classifier Model with Prior Resampling for Lung Nodule Rating Prediction

2013 
In this work, we propose a new machine learning strategy for classification on imbalanced data. We use lung image data from the Lung Image Database Consortium (LIDC), since the LIDC data is a good example of an imbalanced dataset. Our dataset is sufficiently large, containing 4,532 nodules extracted from CT images. For the experiments we consider 55 low-level nodule image features together with the radiologists' ratings. The work proceeds in two stages: (1) data-level learning and (2) algorithm-level learning. In the first stage, we balance the dataset prior to classification using a resampling approach. In the second stage, we use an ensemble of classifiers to predict the lung nodule rating. The ensemble is built from a wide range of classifier models: the classifier library contains bagged decision trees, naive Bayes, boosted decision trees, and a Support Vector Machine (SVM). A stacking algorithm combines the different classifier models in the library into a higher-level ensemble. We evaluate the performance of our model on five metrics: accuracy, precision, recall, F-score, and the Kappa statistic. Results show that our method yields much improved scores because we refine at both the data level and the algorithm level.
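To make the data-level stage and one of the evaluation metrics concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say which resampling variant is used, so random oversampling of minority classes is assumed here purely for illustration. The function names `oversample` and `cohen_kappa` are hypothetical.

```python
import random
from collections import Counter


def oversample(features, labels, seed=0):
    """Randomly duplicate minority-class samples until every class
    has as many samples as the largest class.

    Assumption: random oversampling stands in for the unspecified
    resampling approach mentioned in the abstract.
    """
    rng = random.Random(seed)

    # Group feature vectors by their class label (e.g. a nodule rating).
    by_class = {}
    for x, y in zip(features, labels):
        by_class.setdefault(y, []).append(x)

    target = max(len(rows) for rows in by_class.values())

    out_x, out_y = [], []
    for y, rows in by_class.items():
        # Draw extra samples with replacement from the same class.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for x in rows + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between two labelings, corrected for
    the agreement expected by chance."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n

    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement from the marginal class frequencies.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)

    if expected == 1.0:
        # Degenerate case (a single class everywhere); returning 1.0 is
        # a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy data: ratings 3, 1 and 5 with a 6:2:1 imbalance (made-up values).
toy_x = [[float(i)] for i in range(9)]
toy_y = [3] * 6 + [1] * 2 + [5] * 1
balanced_x, balanced_y = oversample(toy_x, toy_y)
```

In a real experiment, resampling would normally be applied to the training split only, so that the reported scores are measured on untouched data. Kappa is useful alongside accuracy on imbalanced data because a classifier that always predicts the majority class can score high accuracy while its kappa stays at zero.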