Using Embedding Models for Lexical Categorization in Morphologically Rich Languages
2016
Neural-network-based semantic embedding models are relatively new but popular tools in the field of natural language processing. It has been shown that continuous embedding vectors assigned to words provide an adequate representation of their meaning in the case of English. However, morphologically rich languages have not yet been the subject of experiments with these embedding models. In this paper, we investigate the performance of embedding models for Hungarian, trained on corpora with different levels of preprocessing. The models are evaluated on various lexical categorization tasks. They are used for enriching the lexical database of a morphological analyzer with semantic features automatically extracted from the corpora.
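The categorization approach described above can be sketched as a nearest-centroid classifier over word vectors: a handful of seed words define each lexical category, and an unseen word is assigned to the category whose seed centroid is closest in cosine similarity. The words, 4-dimensional vectors, and category labels below are invented for illustration; in the paper the vectors would come from embedding models trained on the Hungarian corpora.

```python
# Minimal sketch of lexical categorization with embedding vectors,
# assuming a pretrained word -> vector table (toy values here).
import math

def cosine(u, v):
    # Cosine similarity between two dense vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def centroid(vectors):
    # Component-wise mean of a list of equal-length vectors.
    n = len(vectors)
    return [sum(vals) / n for vals in zip(*vectors)]

# Hypothetical embedding table; real vectors would be corpus-trained.
emb = {
    "kutya":  [0.90, 0.10, 0.00, 0.20],   # "dog"   (animal seed)
    "macska": [0.80, 0.20, 0.10, 0.10],   # "cat"   (animal seed)
    "alma":   [0.10, 0.90, 0.00, 0.20],   # "apple" (fruit seed)
    "korte":  [0.20, 0.80, 0.10, 0.10],   # "pear"  (fruit seed)
    "lo":     [0.85, 0.15, 0.05, 0.15],   # "horse" (to be categorized)
}

seeds = {"animal": ["kutya", "macska"], "fruit": ["alma", "korte"]}
centroids = {cat: centroid([emb[w] for w in ws]) for cat, ws in seeds.items()}

def categorize(word):
    # Assign the category whose seed centroid is most similar.
    return max(centroids, key=lambda cat: cosine(emb[word], centroids[cat]))

print(categorize("lo"))  # -> "animal" with these toy vectors
```

A semantic feature extracted this way (e.g. "animal") could then be attached to the corresponding entry in the morphological analyzer's lexical database.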