Fast and Robust Wrapper Method for $N$ -gram Feature Template Induction in Structured Prediction

2017 
$N$-gram feature templates, which capture consecutive contextual information, are an important family of feature templates in structured prediction. Some previous studies have considered the $n$-gram feature selection problem, but they focused on one or a few feature types in specific tasks, e.g., consecutive words in a text categorization task. In this paper, we propose a fast and robust bottom-up wrapper method for automatically inducing $n$-gram feature templates, which can induce any type of $n$-gram feature for any structured prediction task. Based on the significance distribution of $n$-gram feature templates over the $n$-gram order and bias (offset), the proposed method first determines the $n$-gram order that best trades off the severity of the sparse-data problem against the richness of the contextual information captured, and then combines the best $n$-gram with lower-order templates in an extremely efficient manner. In addition, our method uses a template pair, i.e., two symmetric templates, rather than a single template as the basic unit of selection (including or excluding a pair rather than an individual template). As a result, when the training data change slightly, our method is robust to the fluctuation and yields a more consistent induction result than a template-based method. Experimental results on three tasks, i.e., Chinese word segmentation, named entity recognition, and text chunking, demonstrate the effectiveness, efficiency, and robustness of the proposed method.
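The pair-based wrapper selection described in the abstract can be sketched as below. The `(n, offset)` template encoding, the mirror convention for symmetric pairs, and the toy scoring function are illustrative assumptions for exposition, not the paper's exact formulation; in practice the score would come from training and evaluating a structured predictor on held-out data.

```python
def mirror(template):
    """Return the symmetric counterpart of an (n, offset) template.

    A template (n, o) covers relative positions o .. o+n-1 around the
    current token; its mirror covers the reflected span. (Assumed
    encoding; the paper's exact convention may differ.)
    """
    n, o = template
    return (n, -(o + n - 1))

def template_pairs(max_n, window):
    """Enumerate all (n, offset) templates whose span fits inside
    [-window, window], grouped into symmetric pairs (self-symmetric
    templates form singleton pairs)."""
    pairs = set()
    for n in range(1, max_n + 1):
        for o in range(-window, window - n + 2):
            t = (n, o)
            pairs.add(tuple(sorted({t, mirror(t)})))
    return sorted(pairs)

def greedy_pair_selection(pairs, score):
    """Bottom-up wrapper: repeatedly include the template pair that most
    improves score(selected_templates); stop when no pair helps.
    `score` stands in for held-out accuracy of a model trained with the
    given templates (a hypothetical callback)."""
    selected = []
    best = score(selected)
    remaining = list(pairs)
    while remaining:
        new_best, best_pair = max(
            (score(selected + list(p)), p) for p in remaining
        )
        if new_best <= best:
            break
        best, selected = new_best, selected + list(best_pair)
        remaining.remove(best_pair)
    return selected, best
```

Because inclusion decisions are made on symmetric pairs rather than single templates, a small perturbation of the training data that flips the apparent usefulness of one template in a pair cannot by itself change the selection, which is the robustness property the abstract emphasizes.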