3R: Word and Phoneme Edition based Data Augmentation for Lexical Punctuation Prediction

Aihua Zheng,Naipeng Ye,Xiao Wang,Xiao Song

2020

Existing lexical punctuation prediction methods are mainly trained on clean data and generalize poorly to practical automatic speech recognition (ASR) systems, where transcription errors are ubiquitous. To bridge the gap between clean training data and noisy test data, we propose three random (3R) data augmentation strategies, applied to the training set at the word and phoneme levels: random word deletion (RWD), random word substitution (RWS), and random phoneme edition (RPE). Specifically, we build an acoustically similar vocabulary from phoneme-level edits, which supports substitution with acoustically similar words. In addition, we are the first to introduce the RoBERTa-large model into the punctuation prediction task, to capture semantics and long-distance dependencies in language. Extensive experiments on the English dataset IWSLT2011 yield a new state of the art compared with prevalent punctuation prediction methods.
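The three augmentation strategies can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the per-token probability `p`, the representation of a sentence as `(word, punctuation_label)` pairs, and the tiny `SIMILAR` lookup table are all assumptions made for illustration. In particular, the abstract does not say how the acoustically similar vocabulary is constructed, so a hand-written table stands in for it here.

```python
import random

# Hypothetical stand-in for an acoustically similar vocabulary.
# The abstract does not specify how the real one is built.
SIMILAR = {
    "their": ["there"],
    "two": ["too", "to"],
    "write": ["right"],
}


def random_word_deletion(pairs, p, rng):
    """RWD sketch: drop each (word, label) pair with probability p.

    Assumption: a deleted word takes its punctuation label with it.
    At least one pair is always kept for non-empty input.
    """
    if not pairs:
        return []
    kept = [pair for pair in pairs if rng.random() >= p]
    return kept if kept else [rng.choice(pairs)]


def random_word_substitution(pairs, p, vocab, rng):
    """RWS sketch: with probability p, swap a word for a random vocabulary word.

    The punctuation label attached to the position is left unchanged.
    """
    return [
        (rng.choice(vocab), label) if rng.random() < p else (word, label)
        for word, label in pairs
    ]


def random_phoneme_edit(pairs, p, similar, rng):
    """RPE sketch: with probability p, swap a word for an acoustically similar one.

    Words without an entry in `similar` are passed through unchanged.
    """
    out = []
    for word, label in pairs:
        candidates = similar.get(word, [])
        if candidates and rng.random() < p:
            out.append((rng.choice(candidates), label))
        else:
            out.append((word, label))
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    sentence = [("their", "O"), ("two", "O"), ("dogs", "COMMA"), ("ran", "PERIOD")]
    print(random_word_deletion(sentence, 0.2, rng))
    print(random_word_substitution(sentence, 0.2, ["cat", "house", "blue"], rng))
    print(random_phoneme_edit(sentence, 0.5, SIMILAR, rng))
```

Each function returns a new list and leaves the input untouched, so the three can be applied separately or chained on copies of a clean training sentence. Keeping the label fixed under substitution reflects the idea that an ASR error changes the word but not where the punctuation belongs; whether the paper treats labels this way is an assumption of this sketch.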
