Language Modeling Using Part-of-speech and Long Short-Term Memory Networks

2019 
In recent years, neural networks have been widely used for language modeling across natural language processing tasks. Results show that long short-term memory (LSTM) neural networks are well suited to language modeling due to their ability to process long sequences. Furthermore, many studies have shown that extra information improves the performance of language models (LMs). In this research, we propose parallel structures for incorporating part-of-speech tags into the language modeling task using both unidirectional and bidirectional LSTMs. Words and part-of-speech tags are fed to the network as parallel inputs. Two different structures are proposed for concatenating these two paths, depending on the type of network used in the parallel part. We evaluate the models on the Penn Treebank (PTB) dataset using the perplexity measure. Both proposed structures improve on the baseline models. Not only does the bidirectional LSTM method achieve the lowest perplexity, it also has the fewest training parameters among our proposed methods. The perplexity of the proposed structures is reduced by 1.5% and 13% for the unidirectional and bidirectional LSTMs, respectively.
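
The following is a minimal sketch of the kind of parallel word/POS architecture the abstract describes: the word sequence and its part-of-speech tag sequence are embedded separately, passed through separate LSTMs, and the two paths are concatenated before the softmax over the vocabulary. The class name, layer sizes, and concatenation point are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class ParallelPosLM(nn.Module):
    """Hypothetical parallel word/POS LSTM language model (illustrative only)."""
    def __init__(self, vocab_size, pos_size, word_dim=200, pos_dim=50,
                 hidden=200, bidirectional=False):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        self.pos_emb = nn.Embedding(pos_size, pos_dim)
        self.word_lstm = nn.LSTM(word_dim, hidden, batch_first=True,
                                 bidirectional=bidirectional)
        self.pos_lstm = nn.LSTM(pos_dim, hidden, batch_first=True,
                                bidirectional=bidirectional)
        dirs = 2 if bidirectional else 1
        # Concatenate the two parallel paths and project to the vocabulary.
        self.out = nn.Linear(2 * dirs * hidden, vocab_size)

    def forward(self, words, tags):
        w, _ = self.word_lstm(self.word_emb(words))   # (B, T, dirs*hidden)
        p, _ = self.pos_lstm(self.pos_emb(tags))      # (B, T, dirs*hidden)
        return self.out(torch.cat([w, p], dim=-1))    # next-word logits

# Usage: next-word logits for a toy batch of word and POS-tag indices.
model = ParallelPosLM(vocab_size=10000, pos_size=45, bidirectional=True)
words = torch.randint(0, 10000, (4, 35))
tags = torch.randint(0, 45, (4, 35))
logits = model(words, tags)  # shape: (4, 35, 10000)
```

Under this sketch, the unidirectional and bidirectional variants differ only in the LSTM direction flag and the resulting width of the concatenated representation fed to the output layer.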