Enhancing the Performance of Telugu Named Entity Recognition Using Gazetteer Features

SaiKiranmai Gorla, Lalita Bhanu Murthy Neti, Aruna Malapati

2020

Named entity recognition (NER) is a fundamental step in many natural language processing tasks, so improving the performance of NER models is valuable. Because few resources are available, NER for South-East Asian languages like Telugu is a challenging problem. This paper attempts to improve NER performance for Telugu using gazetteer-related features, which are generated automatically from Wikipedia pages. We use these gazetteer features along with other well-known features, such as contextual, word-level, and corpus features, to build NER models. The models are developed using three well-known classifiers: conditional random field (CRF), support vector machine (SVM), and margin infused relaxed algorithm (MIRA). The gazetteer features are shown to improve performance, and the MIRA-based NER model fared better than its SVM and CRF counterparts.
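
To make the feature combination in the abstract concrete, the following is a minimal sketch of how gazetteer membership might be combined with word-level and contextual features for each token. It is not the authors' implementation: the `GAZETTEERS` dictionary, the function names, the feature names, and the romanized example tokens are all hypothetical, and the corpus features mentioned in the abstract are omitted.

```python
from typing import Dict, List, Set

# Hypothetical toy gazetteers with romanized placeholder entries.
# The paper generates its lists automatically from Wikipedia pages.
GAZETTEERS: Dict[str, Set[str]] = {
    "person": {"Ramu"},
    "location": {"Hyderabad"},
}

PAD = "<PAD>"  # placeholder for positions outside the sentence


def token_features(tokens: List[str], i: int, window: int = 1) -> Dict[str, object]:
    """Build a feature dictionary for the token at position i."""
    word = tokens[i]

    # Word-level features (an assumed, illustrative set).
    feats: Dict[str, object] = {
        "word": word,
        "prefix2": word[:2],
        "suffix2": word[-2:],
        "length": len(word),
        "is_digit": word.isdigit(),
    }

    # Contextual features: neighbouring words within the window.
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        j = i + offset
        feats[f"word[{offset:+d}]"] = tokens[j] if 0 <= j < len(tokens) else PAD

    # Gazetteer features: one binary flag per gazetteer list.
    for name, entries in GAZETTEERS.items():
        feats[f"in_{name}_gazetteer"] = word in entries

    return feats


def sentence_features(tokens: List[str]) -> List[Dict[str, object]]:
    """Feature dictionaries for every token in a sentence."""
    return [token_features(tokens, i) for i in range(len(tokens))]
```

Feature dictionaries of this shape can typically be passed to a sequence labeller such as a CRF, or vectorized for an SVM; which toolkit the authors used is not stated here.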
