ISCA: An Improved Sine Cosine Algorithm to select features for text categorization

2019 
Abstract The bag-of-words model is commonly used for text categorization. Its main problem is the large number of features involved, which degrades the performance of the categorization task. Feature selection addresses this problem: it reduces the dimensionality of the problem, which shortens computation time and improves categorization performance. In this paper, we propose an improved version of the original Sine Cosine Algorithm (SCA) for feature selection that explores the search space more thoroughly. Unlike SCA, which relies only on the best solution to generate a new solution, the proposed algorithm (ISCA) takes two positions into account: (i) the position of the best solution found so far and (ii) a random position in the search space. This combination yields a simple algorithm that avoids premature convergence and achieves very satisfactory performance. To validate ISCA, we carried out a series of experiments on nine text collections and compared the results with several search algorithms, including the original SCA, some of its improved versions, and the Moth-Flame Optimization (MFO) algorithm. From the state of the art, we also included the Genetic Algorithm (GA) and Ant Colony Optimization (ACO) in the comparative study. The evaluation results demonstrate the high performance of the proposed ISCA, which makes it well suited to the text categorization problem.
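The abstract does not give the ISCA update rule, so the following is only a rough sketch of the idea it describes. `sca_step` follows the widely used form of the original SCA position update, which moves a solution relative to the best one. `two_guide_step` is a hypothetical illustration of adding a second guide drawn at random from the search space; the averaging of the two pulls, the clipping to bounds, and the 0.5 threshold in `to_mask` are all assumptions made for this example and should not be read as the authors' formulation.

```python
import math
import random


def sca_step(x, best, t, max_iter, a=2.0, rng=random):
    """One original-style SCA update of position x relative to the best solution."""
    r1 = a - t * (a / max_iter)  # amplitude shrinks linearly from a to 0
    new_x = []
    for xj, bj in zip(x, best):
        r2 = rng.uniform(0.0, 2.0 * math.pi)
        r3 = rng.uniform(0.0, 2.0)
        r4 = rng.random()
        wave = math.sin(r2) if r4 < 0.5 else math.cos(r2)
        new_x.append(xj + r1 * wave * abs(r3 * bj - xj))
    return new_x


def two_guide_step(x, best, lower, upper, t, max_iter, a=2.0, rng=random):
    """Hypothetical variant: guided by the best solution AND a random position.

    Averaging the two pulls is an assumption for illustration only.
    """
    r1 = a - t * (a / max_iter)
    new_x = []
    for xj, bj in zip(x, best):
        rand_j = rng.uniform(lower, upper)  # random position in the search space
        r2 = rng.uniform(0.0, 2.0 * math.pi)
        r3 = rng.uniform(0.0, 2.0)
        r4 = rng.random()
        wave = math.sin(r2) if r4 < 0.5 else math.cos(r2)
        pull = 0.5 * (abs(r3 * bj - xj) + abs(r3 * rand_j - xj))
        value = xj + r1 * wave * pull
        new_x.append(min(max(value, lower), upper))  # keep inside the bounds
    return new_x


def to_mask(x, threshold=0.5):
    """Assumed binarization: a feature is kept when its coordinate exceeds threshold."""
    return [v > threshold for v in x]
```

In a feature selection setting, each coordinate would correspond to one bag-of-words feature, and the mask returned by `to_mask` would pick the columns passed to the classifier whose accuracy serves as the fitness value.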