Improved filter bank on multitaper framework for robust Punjabi-ASR system

2019 
Robustness of the automatic speech recognition (ASR) system relies upon the accuracy of feature extraction and classification in training phase. The mismatch between training and testing conditions during classification of large feature vectors causes a low performance. In this paper, the issue of robustness of acoustic information is addressed for practical Punjabi dataset. Traditional feature extraction approaches: mel frequency cepstral coefficients (MFCC) and gammatone frequency cepstral coefficients (GFCC) face the issue of high variance with leakage of spectral information. Also, handling of the huge number of feature information creates chaos for large speech vocabulary. To overcome this dilemma, a Principal component analysis (PCA) based multi-windowing technique is proposed with the incorporation of baseline GFCC and MFCC based feature approaches after the tuning of taper parameter. The proposed integrated approaches result in better feature vectors, which are further processed using differential evolution + hidden Markov model (DE + HMM) based modelling classifier. The integrated approaches show substantial performance for word recognition as compared to the conventional or fused feature extraction systems.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    29
    References
    2
    Citations
    NaN
    KQI
    []