An Improved LSTM For Language Identification

Qingran Zhan, Liqiang Zhang, Hui Deng, Xiang Xie

2018

In this paper, we propose a novel framework that combines the phonetic temporal neural model (PTN) with an improved LSTM (IM-LSTM). The IM-LSTM adds an up-down connection from time step t to t+1 in the LSTM structure, which aims to capture latent information from the previous time step. This updated structure better discriminates the frame-level phonetic information produced by the PTN. On the AP16-OLR language identification dataset, our final model achieves relative improvements over the standard PTN of 5.04%, 2.19%, and 2.73% in EER and 6.55%, 5.81%, and 2.23% in C_avg under the 1s, 3s, and full-length utterance conditions, respectively. The proposed framework outperforms the standard PTN and other proposed models, particularly in the 1s condition. This shows the efficacy and flexibility of the proposed method.
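
The relative figures quoted in the abstract can be read as relative reductions of an error metric (EER or C_avg) with respect to the baseline. This reading is an assumption, since the abstract does not spell out the formula. A minimal sketch under that assumption follows; the function name and the numeric inputs are hypothetical and are not results from the paper.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Relative reduction of an error metric, in percent.

    Assumes a lower metric value is better (as with EER and C_avg)
    and that the baseline value is strictly positive.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - improved) / baseline


# Hypothetical values for illustration only (not taken from the paper):
# a baseline error of 10.0 lowered to 9.5.
example = relative_reduction(baseline=10.0, improved=9.5)
print(f"relative reduction: {example:.2f}%")
```

Under this convention, a positive value means the new system has a lower error than the baseline, and a negative value means it is worse.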

Keywords:

Phonetics
Artificial neural network
Language identification
Artificial intelligence
Pattern recognition
Utterance
Computer science
Hidden Markov model
Time step
