Bangla Part of Speech Tagging Using Contextual Embeddings and Oversampling Techniques

2021 
Part-of-Speech (PoS) tagging is a long-standing research area in Natural Language Processing. The popularization of neural networks has opened substantial new scope for Bangla PoS tagging, especially with sequential models built on Recurrent Neural Networks such as Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU). Our contribution in this paper is to transform the sequential modeling problem into a non-sequential one using BERT embeddings, which lets us leverage existing, well-understood oversampling algorithms to improve PoS tagging with a shallow feed-forward neural network. Our experimental results indicate that the Synthetic Minority Over-sampling Technique (SMOTE) works well as an oversampling algorithm for BERT embeddings.
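The core idea of the abstract can be sketched in a few lines: once each token is represented by a fixed contextual embedding vector, rare PoS classes can be oversampled by interpolating between minority-class vectors (the SMOTE heuristic) before training a plain classifier. The sketch below is a minimal NumPy illustration of that pipeline step, not the authors' implementation; random vectors stand in for real BERT token embeddings, and the class sizes are invented for the example.

```python
import numpy as np

def smote(X, k=3, n_new=None, seed=0):
    """Generate synthetic minority-class samples by interpolating
    between each sample and one of its k nearest neighbours,
    which is the core idea behind SMOTE."""
    rng = np.random.default_rng(seed)
    n = len(X)
    if n_new is None:
        n_new = n
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(n)
        # Distances from sample i to every other minority sample.
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf                      # exclude the sample itself
        nbrs = np.argsort(d)[:k]           # its k nearest neighbours
        j = rng.choice(nbrs)
        gap = rng.random()                 # interpolation factor in [0, 1)
        synthetic.append(X[i] + gap * (X[j] - X[i]))
    return np.array(synthetic)

# Hypothetical stand-ins for precomputed BERT token embeddings:
# a frequent PoS class (e.g. nouns) and a rare one.
rng = np.random.default_rng(1)
majority = rng.normal(0.0, 1.0, size=(50, 8))
minority = rng.normal(3.0, 1.0, size=(5, 8))

# Oversample the minority class until the two classes are balanced.
new_samples = smote(minority, k=3, n_new=len(majority) - len(minority))
X_balanced = np.vstack([majority, minority, new_samples])
y_balanced = np.array([0] * len(majority)
                      + [1] * (len(minority) + len(new_samples)))
```

Because the problem is no longer sequential, the balanced `(X_balanced, y_balanced)` set can be fed to any per-token classifier, such as the shallow feed-forward network the paper describes.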