Lightweight Spoken Utterance Classification with CFG, tf-idf and Dynamic Programming

Manny Rayner, Nikos Tsourakis, Johanna Gerlach

2017

We describe a simple spoken utterance classification method suitable for data-sparse domains that can be approximately described by context-free grammars (CFGs). The central idea is to perform robust matching of CFG rules against output from a large-vocabulary recogniser, using a dynamic programming method that optimises the tf-idf score of the matched grammar string. We present results of experiments carried out on a substantial CFG-based medical speech translator and on the publicly available Spoken CALL Shared Task. Robust utterance classification using the tf-idf method strongly outperforms plain CFG-based recognition in both domains. Compared with Naive Bayes classifiers trained on data sampled from the CFGs, the tf-idf/dynamic programming method is much better on the complex speech translation domain, but worse on the simple Spoken CALL Shared Task domain.
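
The central idea can be illustrated with a small sketch. This is not the authors' implementation: it assumes the grammar has already been expanded into a short list of strings per class, it treats each class as one tf-idf "document", and it uses a weighted longest-common-subsequence alignment as a stand-in for the robust matching step. The class labels, sentences, and function names (`tfidf_weights`, `match_score`, `classify`) are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy data: class label -> strings the grammar could generate.
GRAMMAR_STRINGS = {
    "ask_pain_location": ["where is the pain", "where does it hurt"],
    "ask_pain_duration": ["how long have you had the pain",
                          "when did the pain start"],
    "ask_fever": ["do you have a fever", "have you had a fever"],
}


def tfidf_weights(grammar_strings):
    """Return {label: {word: tf * idf}}, treating each class as one document."""
    n_docs = len(grammar_strings)
    counts = {label: Counter(w for s in strings for w in s.split())
              for label, strings in grammar_strings.items()}
    doc_freq = Counter()
    for word_counts in counts.values():
        doc_freq.update(word_counts.keys())
    weights = {}
    for label, word_counts in counts.items():
        total = sum(word_counts.values())
        weights[label] = {w: (c / total) * math.log(n_docs / doc_freq[w])
                          for w, c in word_counts.items()}
    return weights


def match_score(hyp, ref, weight):
    """Dynamic programming: maximum total weight of ref words that can be
    matched, in order, against the recogniser words in hyp (weighted LCS)."""
    prev = [0.0] * (len(ref) + 1)
    for h in hyp:
        cur = [0.0]
        for j, r in enumerate(ref, start=1):
            best = max(prev[j], cur[j - 1])      # skip a hyp word or a ref word
            if h == r:                           # or align the two words
                best = max(best, prev[j - 1] + weight.get(r, 0.0))
            cur.append(best)
        prev = cur
    return prev[-1]


def classify(recognised, grammar_strings=GRAMMAR_STRINGS):
    """Return the label whose best-matching grammar string scores highest."""
    weights = tfidf_weights(grammar_strings)
    hyp = recognised.lower().split()
    best_label, best_score = None, float("-inf")
    for label, strings in grammar_strings.items():
        for s in strings:
            score = match_score(hyp, s.split(), weights[label])
            if score > best_score:
                best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    print(classify("uh where is um the pain please"))
```

Because unmatched recogniser words simply contribute nothing, filler words and misrecognitions lower the score without blocking a match, which is the sense in which the matching is robust. With this toy data, the example input is assigned to `ask_pain_location`.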
