Optimizing acoustic models for commercial speech recognition using foreground scores and data weighting

Daniel Boies, Brian Strope, Mitchel Weintraub, Su-Lin Wu

2004

This paper describes a data-driven technique for optimizing the acoustic models of speech recognition systems that target commercial applications over the telephone. Frame-averaged foreground log-likelihoods (foreground scores) correlate with recognition errors. These scores are used together with gender to optimize data weighting for the acoustic model. This process can be interpreted as increasing the priors and associated parameters for poorly modeled data. The score-based optimization leads to about 7% fewer semantic errors on a live evaluation set collected after the last data used to estimate the acoustic model.
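The two quantities named in the abstract can be illustrated with a minimal sketch. It assumes that a foreground score is the mean of an utterance's per-frame foreground log-likelihoods, and that each training utterance receives a weight looked up by gender and score bin. The bin edges, the weight values, and the names `foreground_score` and `data_weight` are hypothetical and chosen only for illustration; the abstract does not state the actual bins, weights, or optimization procedure.

```python
from bisect import bisect_right
from typing import Dict, Sequence, Tuple


def foreground_score(frame_log_likelihoods: Sequence[float]) -> float:
    """Return the frame-averaged foreground log-likelihood of one utterance."""
    if not frame_log_likelihoods:
        raise ValueError("utterance has no frames")
    return sum(frame_log_likelihoods) / len(frame_log_likelihoods)


# Hypothetical bin edges on the foreground score (not from the paper).
# Two edges give three bins: low (0), middle (1), high (2).
SCORE_BIN_EDGES: Tuple[float, ...] = (-8.0, -6.0)

# Hypothetical weights (not from the paper). Low-scoring, i.e. poorly
# modeled, utterances get larger weights so they count more in training.
WEIGHTS: Dict[Tuple[str, int], float] = {
    ("female", 0): 3.0, ("female", 1): 1.5, ("female", 2): 1.0,
    ("male", 0): 2.0, ("male", 1): 1.0, ("male", 2): 1.0,
}


def data_weight(score: float, gender: str) -> float:
    """Look up the training weight for an utterance by gender and score bin."""
    bin_index = bisect_right(SCORE_BIN_EDGES, score)
    return WEIGHTS[(gender, bin_index)]


if __name__ == "__main__":
    score = foreground_score([-9.0, -7.0, -8.0])
    print(score, data_weight(score, "female"))
```

In a scheme like this, the weight table is the object being optimized: candidate tables would be compared by the error rate of the acoustic model trained with each one. That search is not shown here.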

Keywords:

Commercial speech
Training set
Boosting (machine learning)
Prior probability
Acoustic model
Machine learning
Maximum likelihood
Artificial intelligence
Weighting
Computer science
Pattern recognition
Speech recognition
Telephony

