Speech recognition complexity reduction using decimation of cepstral time trajectories

2000 
Speech recognition technology has become common in a variety of applications, ranging from desktop computers with dictation engines to mobile devices with speaker-dependent name dialing. While dictation software runs solely on powerful desktop PCs with large amounts of memory, mobile devices have limited memory and computational resources. To implement speech recognition algorithms on mobile devices, the complexity of the algorithms has to match the capabilities of the device. This paper addresses the problem of complexity and memory constraints in mobile devices. A specific approach, time domain decimation of feature vectors, is presented. This general signal processing technique can be applied to speech recognition because the modulation spectrum of the feature vector time trajectories is band-limited. By decimating the feature vector stream of 100 frames per second by factors of 2 to 5, the complexity of the speech recognizer can be reduced in proportion to the decimation factor. Experiments on a name dialing task show that a decimation factor of 4 can be used without any significant degradation in the performance of the speech recognizer. With the proposed method, the computational complexity can be reduced by 70% and RAM usage by more than 60%.
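As a rough illustration of time domain decimation of a feature vector stream, the sketch below averages each group of consecutive frames and keeps one output frame per group. It is a minimal sketch under stated assumptions, not the method used in the paper: the abstract does not say which low-pass filter precedes the downsampling, so a simple block average stands in for it here, and the name `decimate_features` and the list-of-lists frame layout are hypothetical.

```python
from typing import List

# One feature vector per frame (hypothetical layout: a list of cepstral coefficients).
Frame = List[float]


def decimate_features(frames: List[Frame], factor: int) -> List[Frame]:
    """Reduce the frame rate of a feature stream by `factor`.

    Assumption: each block of `factor` consecutive frames is averaged
    (a crude low-pass filter on every cepstral trajectory) and replaced
    by a single frame. The last block may be shorter than `factor`.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    if not frames:
        return []
    dim = len(frames[0])
    decimated = []
    for start in range(0, len(frames), factor):
        block = frames[start:start + factor]
        # Average each coefficient over the frames in this block.
        decimated.append([sum(f[d] for f in block) / len(block) for d in range(dim)])
    return decimated


# One second of a 100 frames-per-second stream with 2 coefficients per frame.
stream = [[float(i), 2.0 * i] for i in range(100)]
reduced = decimate_features(stream, 4)
print(len(stream), "->", len(reduced), "frames")
```

With a decimation factor of 4, one second of input at 100 frames per second becomes 25 frames, so any per-frame work in the recognizer is done on a quarter of the frames.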