Task Estimation Using Latent Semantic Analysis of Visual Scenes and Spoken Words

2014 
SUMMARY In this paper, we propose a task estimation method based on multiple subspaces extracted from multimodal information: image objects in visual scenes and spoken words in dialogue that appear in the same task. The subspaces are obtained by latent semantic analysis (LSA). In the proposed method, a task vector composed of the frequencies of spoken words and of image-object appearances is extracted first, and then the similarities between the input task vector and the reference subspaces of the different tasks are compared. Experiments are conducted on the identification of game tasks. The experimental results show that the proposed method with multimodal information outperforms methods that use only a single modality, either image or spoken dialogue. The proposed method also remains accurate when less spoken dialogue is available.
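The pipeline summarized above (task vectors, one LSA subspace per task, similarity comparison) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the summary does not state the similarity measure, so the sketch assumes a projection-based similarity (squared norm of the projection onto the subspace divided by the squared norm of the input). The function names, the toy feature layout, and the task labels `cards` and `board_game` are all hypothetical.

```python
import numpy as np

def fit_subspace(samples, rank):
    """Return an orthonormal basis (n_features x rank) for one task.

    samples: (n_features, n_samples) matrix whose columns are task vectors,
    i.e. counts of spoken words stacked with counts of image objects.
    The truncated SVD used here is the core step of LSA.
    """
    U, _, _ = np.linalg.svd(np.asarray(samples, dtype=float), full_matrices=False)
    return U[:, :rank]

def subspace_similarity(x, basis):
    """Assumed similarity: fraction of the energy of x lying in the subspace."""
    x = np.asarray(x, dtype=float)
    norm_sq = x @ x
    if norm_sq == 0.0:
        return 0.0
    proj = basis.T @ x  # coordinates of x in the task subspace
    return float(proj @ proj / norm_sq)

def estimate_task(x, subspaces):
    """Pick the task whose reference subspace is most similar to x."""
    return max(subspaces, key=lambda task: subspace_similarity(x, subspaces[task]))

# Hypothetical toy data. Feature rows are:
# [word "card", word "dice", object "deck", object "board"]
cards_samples = np.array([[3, 2, 4],
                          [0, 1, 0],
                          [2, 3, 3],
                          [0, 0, 1]])
board_samples = np.array([[0, 1, 0],
                          [3, 4, 2],
                          [0, 0, 1],
                          [2, 3, 3]])

subspaces = {
    "cards": fit_subspace(cards_samples, rank=1),
    "board_game": fit_subspace(board_samples, rank=1),
}

multimodal_input = [2, 0, 3, 0]   # spoken words and image objects observed
image_only_input = [0, 0, 0, 2]   # no spoken words observed
print(estimate_task(multimodal_input, subspaces))
print(estimate_task(image_only_input, subspaces))
```

Because both modalities share one task vector, an input with few or no spoken words can still be matched through its image-object counts, which is one way to read the summary's claim about reduced spoken dialogue.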