Noisy word recognition using a feature based on ternarized spectral slope

2007 
In a previous paper, we proposed the feature FTTSS (Fourier transform of ternarized spectral slope), which is based on derivatives of the power spectrum with respect to frequency, in order to develop a word recognition system that is robust in noisy environments. We confirmed that the feature is more robust to noise than MFCC by applying it to word recognition with HMMs. In general, word recognition with HMMs improves when features that express temporal variation, such as DeltaMFCC or DeltaFTTSS, are added, because an HMM can model only piecewise stationary signals. We have in fact examined the effectiveness of DeltaFTTSS in word recognition. We hypothesize that features reflecting the raw temporal variation of spectral power are effective in speech recognition, and that ternarizing such features may reduce the degradation in recognition performance caused by noise. In this research, we therefore propose a new feature, FTTTS (Fourier transform of ternarized temporal slope), to replace DeltaFTTSS. FTTTS is defined as the Fourier transform, along the frequency axis, of the smoothed ternarized temporal variation of spectral power at each frequency. By applying the features to word recognition with HMMs, we confirmed experimentally that the proposed feature FTTTS is more robust to noise at SNRs of 0-20 dB than FTTSS+DeltaFTTSS or the conventional feature MFCC+DeltaMFCC.
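The FTTTS definition can be illustrated with a minimal sketch. The abstract does not give the threshold, the smoothing method, the axis along which smoothing is applied, or the number of coefficients retained, so every one of those choices below is an assumption made for illustration, and the function names `ternarize` and `fttts_sketch` are hypothetical. The sketch takes a log power spectrogram, computes the frame-to-frame difference in each frequency bin, maps it to -1, 0, or +1, smooths the result, and applies a Fourier transform along frequency.

```python
import numpy as np


def ternarize(x, threshold):
    """Map each value to -1, 0, or +1 using a symmetric dead zone."""
    return np.where(x > threshold, 1, np.where(x < -threshold, -1, 0))


def fttts_sketch(log_power, threshold=1.0, smooth_len=3, n_coeffs=12):
    """Illustrative FTTTS-like feature; not the authors' implementation.

    log_power: array of shape (frames, bins) holding log spectral power.
    threshold, smooth_len, n_coeffs: assumed values, not from the paper.
    """
    # Temporal slope: difference between consecutive frames in each bin.
    # The first frame is compared with itself, so its slope is zero.
    slope = np.diff(log_power, axis=0, prepend=log_power[:1])

    # Ternarization discards the magnitude and keeps only the direction.
    tern = ternarize(slope, threshold).astype(float)

    # Assumption: smoothing is a moving average along the frequency axis.
    kernel = np.ones(smooth_len) / smooth_len
    smoothed = np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="same"), 1, tern
    )

    # Fourier transform along frequency; keep low-order magnitudes
    # (assumption: the paper may use a different transform or selection).
    spectrum = np.fft.rfft(smoothed, axis=1)
    return np.abs(spectrum[:, :n_coeffs])
```

Under these assumptions, fluctuations smaller than the threshold are mapped to zero before the transform, which is one plausible reading of why ternarization might limit the effect of noise.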