Bangla Part of Speech Tagging Using Contextual Embeddings and Oversampling Techniques

2021 
Part-of-Speech (PoS) tagging is a long-standing research area in Natural Language Processing. The popularization of neural networks has opened substantial new scope for Bangla PoS tagging, especially with sequential models built on Recurrent Neural Networks such as Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU). Our contribution in this paper is to transform the sequential modeling problem into a non-sequential one using BERT embeddings, which lets us leverage existing, well-understood oversampling algorithms to improve PoS tagging with a shallow feed-forward neural network. Our experimental results indicate that the Synthetic Minority Over-sampling Technique (SMOTE) works well as an oversampling algorithm for BERT embeddings.
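The core idea of the abstract can be sketched in a few lines: once each token is represented by a fixed contextual embedding vector, rare PoS classes can be oversampled by interpolating between minority-class vectors (the SMOTE heuristic) before training a plain classifier. The sketch below is a minimal NumPy illustration of that pipeline step, not the authors' implementation; random vectors stand in for real BERT token embeddings, and the class sizes are invented for the example.

```python
import numpy as np

def smote(X, k=3, n_new=None, seed=0):
    """Generate synthetic minority-class samples by interpolating
    between each sample and one of its k nearest neighbours,
    which is the core idea behind SMOTE."""
    rng = np.random.default_rng(seed)
    n = len(X)
    if n_new is None:
        n_new = n
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(n)
        # Distances from sample i to every other minority sample.
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf                      # exclude the sample itself
        nbrs = np.argsort(d)[:k]           # its k nearest neighbours
        j = rng.choice(nbrs)
        gap = rng.random()                 # interpolation factor in [0, 1)
        synthetic.append(X[i] + gap * (X[j] - X[i]))
    return np.array(synthetic)

# Hypothetical stand-ins for precomputed BERT token embeddings:
# a frequent PoS class (e.g. nouns) and a rare one.
rng = np.random.default_rng(1)
majority = rng.normal(0.0, 1.0, size=(50, 8))
minority = rng.normal(3.0, 1.0, size=(5, 8))

# Oversample the minority class until the two classes are balanced.
new_samples = smote(minority, k=3, n_new=len(majority) - len(minority))
X_balanced = np.vstack([majority, minority, new_samples])
y_balanced = np.array([0] * len(majority)
                      + [1] * (len(minority) + len(new_samples)))
```

Because the problem is no longer sequential, the balanced `(X_balanced, y_balanced)` set can be fed to any per-token classifier, such as the shallow feed-forward network the paper describes.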