A Domain-Adapting Word Representation Method for Word Clustering

2020 
Extracting key information from texts is a goal of the natural language processing (NLP) field. A few keywords can convey the main idea of a text, whereas the complete vocabulary carries richer information but is harder to organize. This paper proposes a word representation method based on frequently co-occurring entropy (FCE) and the fuzzy bag-of-words model (FBoW), named the frequently co-occurring entropy and fuzzy bag-of-words model (FCE-FBW). The method is used to cluster words from different domains and to group similar words together. The resulting word clusters can be useful for tasks such as building knowledge-based domain repositories. FCE is used to pick out generalizable features, while FBoW allows the same word to be described along multiple dimensions. Combining the two models, the proposed FCE-FBW method provides good performance.
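To make the idea of describing one word along multiple dimensions concrete, the following is a minimal sketch of a generic fuzzy bag-of-words representation. It is not the paper's FCE-FBW method, whose details the abstract does not give. It assumes that fuzzy membership is the cosine similarity between word vectors, clipped at zero, and the vocabulary, embeddings, and function names are hypothetical toy values chosen for illustration.

```python
import numpy as np


def fuzzy_membership(embeddings):
    """Pairwise cosine similarity between word vectors, clipped at 0.

    Assumption: cosine similarity stands in for the fuzzy membership
    degree; the abstract does not specify the membership function.
    """
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / norms
    sim = unit @ unit.T
    return np.clip(sim, 0.0, None)


def fuzzy_bow(counts, membership):
    """Spread each word's count over all vocabulary dimensions it resembles."""
    return counts @ membership


# Hypothetical toy vocabulary and 2-d embeddings (illustration only).
vocab = ["cat", "dog", "car"]
embeddings = np.array([
    [1.0, 0.0],  # cat
    [0.8, 0.6],  # dog
    [0.0, 1.0],  # car
])

# A text that contains "cat" once and nothing else.
counts = np.array([1.0, 0.0, 0.0])

membership = fuzzy_membership(embeddings)
fbow = fuzzy_bow(counts, membership)
print(dict(zip(vocab, fbow.round(2))))
```

In a hard bag-of-words vector only the "cat" dimension would be non-zero. Here the "dog" dimension is also activated because its vector is similar to that of "cat", which is the sense in which a single word contributes to several dimensions.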