Enhanced Tree Clustering with Single Pronunciation Dictionary for Conversational Speech Recognition

Hua Yu, Tanja Schultz

2003

Modeling pronunciation variation is key for recognizing conversational speech. Rather than being limited to dictionary modeling, we argue that triphone clustering is an integral part of pronunciation modeling. We propose a new approach called enhanced tree clustering. This approach, in contrast to traditional decision tree based state tying, allows parameter sharing across phonemes. We show that accurate pronunciation modeling can be achieved through efficient parameter sharing in the acoustic model. Combined with a single pronunciation dictionary, a 1.8% absolute word error rate improvement is achieved on Switchboard, a large vocabulary conversational speech recognition task.
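
The contrast drawn in the abstract, per-phoneme state tying versus a tree that may share parameters across phonemes, can be illustrated with a toy sketch. This is not the authors' implementation: the abstract gives no algorithmic detail, so the single-Gaussian likelihood-gain criterion, the 1-D samples, the phone sets, the `min_gain` threshold, and all function names below are assumptions chosen only for illustration.

```python
import math
from collections import namedtuple

# Hypothetical toy triphone state: context labels plus 1-D acoustic samples.
State = namedtuple("State", "left center right samples")

def log_likelihood(states):
    """Log-likelihood of all pooled samples under one fitted 1-D Gaussian."""
    xs = [x for s in states for x in s.samples]
    n = len(xs)
    mean = sum(xs) / n
    ss = sum((x - mean) ** 2 for x in xs)
    var = max(ss / n, 1e-3)  # variance floor (assumed value)
    return -0.5 * (n * math.log(2 * math.pi * var) + ss / var)

def best_split(states, questions):
    """Return (gain, yes, no) for the question with the largest likelihood gain."""
    base = log_likelihood(states)
    best = (0.0, None, None)
    for position, phone_set in questions:
        yes = [s for s in states if getattr(s, position) in phone_set]
        no = [s for s in states if getattr(s, position) not in phone_set]
        if yes and no:
            gain = log_likelihood(yes) + log_likelihood(no) - base
            if gain > best[0]:
                best = (gain, yes, no)
    return best

def grow_tree(states, questions, min_gain=1.0):
    """Greedy top-down clustering; returns the leaves (groups of tied states)."""
    gain, yes, no = best_split(states, questions)
    if yes is None or gain < min_gain:
        return [states]
    return grow_tree(yes, questions, min_gain) + grow_tree(no, questions, min_gain)

def traditional(states, context_questions):
    """One tree per center phone: states of different phonemes are never tied."""
    leaves = []
    for center in sorted({s.center for s in states}):
        group = [s for s in states if s.center == center]
        leaves += grow_tree(group, context_questions)
    return leaves

def enhanced(states, context_questions, center_questions):
    """Single tree: the center phone is just another question, so leaves may
    contain states from different phonemes."""
    return grow_tree(states, context_questions + center_questions)

# Made-up data: IH between T and N is acoustically close to AX here.
states = [
    State("T", "AX", "N", [1.0, 1.1, 0.9]),
    State("D", "AX", "N", [1.05, 0.95, 1.0]),
    State("T", "IH", "N", [1.0, 1.05, 0.95]),
    State("B", "IH", "G", [3.0, 3.1, 2.9]),
]
context_questions = [("left", {"T", "D"}), ("right", {"N"})]
center_questions = [("center", {"AX"}), ("center", {"IH"})]

traditional_leaves = traditional(states, context_questions)
enhanced_leaves = enhanced(states, context_questions, center_questions)

for name, leaves in [("traditional", traditional_leaves), ("enhanced", enhanced_leaves)]:
    print(name, [[f"{s.left}-{s.center}+{s.right}" for s in leaf] for leaf in leaves])
```

In this toy setup the per-phoneme trees keep the reduced `IH` state in its own leaf, while the single tree ties it to the two `AX` states because splitting them apart yields too little likelihood gain. That is the kind of cross-phoneme sharing the abstract describes, shown here under the stated assumptions only.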

Keywords:

Triphone
Speech recognition
Artificial intelligence
Cluster analysis
Acoustic model
Pattern recognition
Word error rate
Computer science
Decision tree
Vocabulary
Pronunciation
Natural language processing
Conversational speech
