An Ensemble Filter Feature Selection Method and Outlier Detection Method for Multiclass Classification

2019 
Feature selection methods facilitate the removal of irrelevant attributes. Ineffective features may contain outliers that degrade classifier performance. We propose an ensemble filter-based feature selection technique for multiclass classification. The technique combines the results of four selection methods to create an ensemble list. The study uses a red wine dataset drawn from the UC Irvine Machine Learning Repository and WEKA, a collection of machine learning algorithms for data mining tasks. The multiclass red wine dataset is binarized using WEKA's MultiClassClassifier with the 1-against-1 decomposition scheme and pairwise coupling. Using the random forest algorithm and root mean square error (RMSE) values, a learning curve is generated that establishes an optimal ensemble sub-list. Outliers are detected using Tukey's statistical method. The proposed ensemble method outperformed the single feature selection methods. The study concludes by showing that unnecessary features and the presence of outliers degrade classifier performance. We recommend further studies on the effect of gradual, selective removal of outliers on classification.
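
The Tukey outlier rule mentioned above can be illustrated with a minimal sketch. The abstract does not state the fence multiplier or the quartile convention used in the study, so the conventional multiplier k = 1.5 and Python's "inclusive" quartile method are assumptions here. The function names and the sample values are hypothetical and are not taken from the red wine dataset.

```python
from statistics import quantiles


def tukey_fences(values, k=1.5):
    """Return the (lower, upper) Tukey fences for a list of numbers.

    k=1.5 is the conventional multiplier; the study's actual value
    is not given in the abstract, so this is an assumption.
    """
    # Quartiles via linear interpolation over the sample ("inclusive").
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # interquartile range
    return q1 - k * iqr, q3 + k * iqr


def tukey_outliers(values, k=1.5):
    """Return the values that fall outside the Tukey fences."""
    low, high = tukey_fences(values, k)
    return [v for v in values if v < low or v > high]


# Made-up values for one feature, for illustration only.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 100]
print(tukey_fences(sample))
print(tukey_outliers(sample))
```

Applied per feature, a rule of this kind flags observations far outside the interquartile range. Different quartile conventions can shift the fences slightly on small samples.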