Lightweight Spoken Utterance Classification with CFG, tf-idf and Dynamic Programming
2017
We describe a simple spoken utterance classification method suitable for data-sparse domains which can be approximately described by CFG grammars. The central idea is to perform robust matching of CFG rules against output from a large-vocabulary recogniser, using a dynamic programming method which optimises the tf-idf score of the matched grammar string. We present results of experiments carried out on a substantial CFG-based medical speech translator and the publicly available Spoken CALL Shared Task. Robust utterance classification using the tf-idf method strongly outperforms plain CFG-based recognition for both domains. When comparing with Naive Bayes classifiers trained on data sampled from the CFG grammars, the tf-idf/dynamic programming method is much better on the complex speech translation domain, but worse on the simple Spoken CALL Shared Task domain.
Keywords:
- Correction
- Source
- Cite
- Save
- Machine Reading By IdeaReader
13
References
7
Citations
NaN
KQI