Research on Key Technologies of Customer Consultation Hotspots Mining
2019
To better support customer consultation hotspot analysis, the accuracy of customer intent identification must be improved. This paper focuses on new word discovery, one of the key technologies of customer consultation hotspot mining. In the power industry, most vocabulary consists of professional terms, synthetic words, and abbreviations. To address new word discovery in power industry corpora, we propose a new word recognition method that transforms the problem into two subproblems: calculating the probability of word formation and annotating word positions. Based on the analysis and mining of a large-scale power industry corpus, we compare the effects of the mutual information-information entropy algorithm and the conditional random field algorithm on new word discovery results. Experimental results on an example of 150M power industry documents show that the mutual information-information entropy algorithm tends to identify high-frequency professional power vocabulary and synthetic vocabulary, while the conditional random field also performs well in the mining task and is better at identifying power industry abbreviations.
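As a rough illustration of the mutual information-information entropy idea named in the abstract, the sketch below scores character n-grams by pointwise mutual information (how strongly the characters stick together) and by left/right branching entropy (how freely the candidate combines with its neighbors). It is a minimal sketch under stated assumptions, not the paper's implementation: the function name `score_candidates`, the use of character n-grams, and taking the minimum of the left and right entropies are all illustrative choices.

```python
import math
from collections import Counter, defaultdict


def score_candidates(text, n=2):
    """Score each character n-gram in `text` (hypothetical helper).

    Returns {ngram: (pmi, boundary_entropy)} where
    - pmi is log2 of p(ngram) over the product of its character probabilities,
    - boundary_entropy is the smaller of the left and right neighbor entropies.
    """
    char_counts = Counter(text)
    total_chars = len(text)
    gram_counts = Counter()
    left = defaultdict(Counter)   # characters seen immediately before each n-gram
    right = defaultdict(Counter)  # characters seen immediately after each n-gram

    for i in range(len(text) - n + 1):
        gram = text[i:i + n]
        gram_counts[gram] += 1
        if i > 0:
            left[gram][text[i - 1]] += 1
        if i + n < len(text):
            right[gram][text[i + n]] += 1

    total_grams = sum(gram_counts.values())

    def entropy(counter):
        # Shannon entropy (bits) of the neighbor distribution; 0 if no neighbors.
        total = sum(counter.values())
        if total == 0:
            return 0.0
        return -sum(c / total * math.log2(c / total) for c in counter.values())

    scores = {}
    for gram, count in gram_counts.items():
        p_gram = count / total_grams
        p_independent = math.prod(char_counts[ch] / total_chars for ch in gram)
        pmi = math.log2(p_gram / p_independent)
        scores[gram] = (pmi, min(entropy(left[gram]), entropy(right[gram])))
    return scores


scores = score_candidates("xabyabzabw")
```

In this toy string, `ab` recurs with varied neighbors and so receives a positive PMI and a high boundary entropy, while `by` always sits between the same characters and gets a boundary entropy of zero. A candidate would typically be accepted as a new word only when both scores exceed thresholds tuned on the corpus; any such thresholds are an assumption here, since the abstract does not give them.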