The design of variable-length coding matrix for improving error correcting output codes

2020 
Abstract Thus far, all existing Error Correcting Output Codes (ECOC) algorithms produce coding matrices in which every class has a codeword of the same length. In contrast, this paper proposes a variable-length codeword based ECOC (VL-ECOC), which generates longer codes for hard classes than for easy classes. VL-ECOC consists of two phases: the overall-class phase and the hard-class phase. In the first phase, the centroids of the two toughest classes are selected as the centroids of the positive group and the negative group respectively, and each of the other classes is assigned to the nearer group. The remaining hard classes with high error rates proceed to the second phase, in which the K nearest neighbors of the misclassified samples are used to generate new columns. The codewords generated in the second phase are applied to the decoding process of the hard classes. Consequently, easy and hard classes have different code lengths. To verify the performance of VL-ECOC, comprehensive experiments are carried out on UCI and microarray data sets. The experimental results demonstrate that, owing to the additional codewords for the hard classes, our algorithm can better handle the class imbalance problem and achieve higher performance in most cases.
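The first-phase grouping described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `phase_one_column`, the use of a per-class error rate to rank "toughness", and the Euclidean distance between centroids are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def phase_one_column(centroids, error_rates):
    """Build one +1/-1 coding-matrix column (hypothetical helper).

    centroids:   (n_classes, n_features) class means.
    error_rates: (n_classes,) per-class error estimates, assumed to be
                 available from some earlier evaluation step.
    """
    centroids = np.asarray(centroids, dtype=float)
    error_rates = np.asarray(error_rates, dtype=float)

    # Assumption: the two "toughest" classes are those with the highest error.
    order = np.argsort(-error_rates, kind="stable")
    pos_seed, neg_seed = order[0], order[1]

    # Assumption: Euclidean distance from each class centroid to each seed.
    d_pos = np.linalg.norm(centroids - centroids[pos_seed], axis=1)
    d_neg = np.linalg.norm(centroids - centroids[neg_seed], axis=1)

    # Every class joins the nearer group; the seeds keep their own groups.
    column = np.where(d_pos <= d_neg, 1, -1)
    column[pos_seed], column[neg_seed] = 1, -1
    return column

# Toy data: classes 0 and 1 are the hardest and seed the two groups.
toy_centroids = [[0.0, 0.0], [10.0, 0.0], [1.0, 0.0], [9.0, 0.0]]
toy_errors = [0.5, 0.4, 0.1, 0.2]
toy_column = phase_one_column(toy_centroids, toy_errors)
```

In this toy case class 2 lies nearer to class 0 and class 3 nearer to class 1, so the two seeds each pull one neighbor into their group. The second phase (new columns from the K nearest neighbors of misclassified samples) is not sketched here because the abstract gives too little detail to reconstruct it.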