Deep recurrent de-noising auto-encoder and blind de-reverberation for reverberated speech recognition

Felix Weninger, Shinji Watanabe, Yuuki Tachioka, Björn W. Schuller

2014

This paper describes our joint efforts to provide robust automatic speech recognition (ASR) for reverberated environments, such as in hands-free human-machine interaction. We investigate blind feature-space de-reverberation and deep recurrent de-noising auto-encoders (DAE) in an early fusion scheme. Results on the 2014 REVERB Challenge development set indicate that the DAE front-end provides performance gains complementary to those of multi-condition training, feature transformations, and model adaptation. The proposed ASR system achieves word error rates of 17.62% and 36.6% on simulated and real data, respectively, a significant improvement over the Challenge baseline (25.16% and 47.2%).
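To put the reported word error rates (WER) in perspective, the relative reduction over the Challenge baseline can be computed from the four figures quoted in the abstract. The following is a minimal sketch; the function name `relative_wer_reduction` and the `RESULTS` table are illustrative assumptions and not part of the paper.

```python
def relative_wer_reduction(baseline_wer: float, system_wer: float) -> float:
    """Return the relative WER reduction over the baseline, in percent."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return 100.0 * (baseline_wer - system_wer) / baseline_wer


# (baseline WER, proposed-system WER) in percent, as quoted in the abstract.
# The dictionary layout is a hypothetical convenience for this sketch.
RESULTS = {
    "simulated": (25.16, 17.62),
    "real": (47.2, 36.6),
}

if __name__ == "__main__":
    for condition, (baseline, proposed) in RESULTS.items():
        reduction = relative_wer_reduction(baseline, proposed)
        print(f"{condition}: {baseline}% -> {proposed}% "
              f"({reduction:.1f}% relative reduction)")
```

Under these figures, the relative reduction is roughly 30% on simulated data and roughly 22% on real data.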

Keywords:

Speech processing
Speech recognition
Autoencoder
Voice activity detection
Artificial intelligence
Feature vector
Recurrent neural network
Computer science
Reverberation
Pattern recognition
Fusion scheme
De-noising
