Robust Speech Recognition From Noise-Type Based Feature Compensation and Model Interpolation in a Multiple Model Framework

2006 
Compared to multi-condition training (MTR), condition-dependent training generates multiple acoustic hidden Markov model sets, each identified by a noisy environment. It is known to perform substantially better than MTR for known noise types (those included in training) but worse for unknown (untrained) noise types. This paper attempts to bridge the performance gap between known and unknown noise types by introducing a Minimum Mean-Square Error (MMSE) noise-type based compensation algorithm. On the basis of a modified Vector Taylor Series and measurements of feature reliability and noise similarity, the MMSE estimation adapts the test features corrupted by the unknown noise type to the corresponding features corrupted by the known noise type. This method significantly improves the recognition performance for unknown noise types while maintaining the good performance for known noise types. Furthermore, in order to benefit directly from MTR, a model interpolation strategy is investigated which combines the MTR and the condition-dependent model sets. Both good performance and low computational cost are achieved by interpolating the mixtures of each condition-dependent model state with only the least weighted mixture in the corresponding MTR model state. The overall system gives promising results.
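The interpolation step described above can be illustrated with a minimal sketch. It is not the paper's formulation: it assumes that "least weighted mixture" means the Gaussian component with the smallest mixture weight in the MTR state, and it introduces a hypothetical interpolation factor `lam` and a hypothetical function name `interpolate_state`. Only the mixture weights are shown; the selected MTR component's mean and variance would be appended to the condition-dependent state unchanged.

```python
import numpy as np

def interpolate_state(cd_weights, mtr_weights, lam=0.1):
    """Combine one condition-dependent (CD) state with one MTR component.

    cd_weights  : mixture weights of the CD state (assumed to sum to 1)
    mtr_weights : mixture weights of the corresponding MTR state
    lam         : hypothetical interpolation factor in [0, 1]

    Returns the new weight vector (CD components first, the selected MTR
    component last) and the index of the selected MTR component.
    """
    cd_weights = np.asarray(cd_weights, dtype=float)
    mtr_weights = np.asarray(mtr_weights, dtype=float)
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")

    # Assumption: "least weighted" = smallest mixture weight in the MTR state.
    picked = int(np.argmin(mtr_weights))

    # Scale the CD weights by (1 - lam) and give the single MTR component
    # weight lam, so the combined weights still sum to 1.
    combined = np.concatenate([(1.0 - lam) * cd_weights, [lam]])
    return combined, picked
```

Because only one component is added per state, the combined state has just one more Gaussian than the condition-dependent state, which is consistent with the low computational cost the abstract claims.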