Active Learning for Domain Classification in a Commercial Spoken Personal Assistant.

Xi C. Chen, Adithya Sagar, Justine T. Kao, Tony Y. Li, Christopher Klein, Stephen Pulman, Ashish Garg, Jason D. Williams

2019

We describe a method for selecting relevant new training data for the LSTM-based domain selection component of our personal assistant system. Adding more annotated training data for any ML system typically improves accuracy, but only if it provides examples not already adequately covered in the existing data. However, obtaining, selecting, and labeling relevant data is expensive. This work presents a simple technique that automatically identifies new helpful examples suitable for human annotation. Our experimental results show that the proposed method, compared with random-selection and entropy-based methods, leads to higher accuracy improvements given a fixed annotation budget. Although developed and tested in the setting of a commercial intelligent assistant, the technique is of wider applicability.

Keywords:

Mathematics
Training set
Machine learning
If and only if
Active learning
Artificial intelligence
Annotation
