Cross-modal topic correlations for multimedia retrieval

Jing Yu, Yonghui Cong, Zengchang Qin, Tao Wan

2012

In this paper, we propose a novel approach to cross-modal multimedia retrieval that jointly models the text and image components of multimedia documents. In this model, the image component is represented by local SIFT descriptors under the bag-of-features model. The text component is represented by a topic distribution learned from a latent topic model such as latent Dirichlet allocation (LDA). The latent semantic relations between texts and images are reflected in the correlations between word topics and topics of image features. We investigate a statistical correlation model conditioned on category information. Experimental results on a benchmark Wikipedia dataset show that the proposed approach outperforms state-of-the-art cross-modal multimedia retrieval systems.
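
The abstract does not specify the form of the statistical correlation model, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each document already has a text topic distribution and an image topic distribution (for example from LDA over words and over quantized SIFT visual words), and it uses plain Pearson correlation as a stand-in measure. The function names `pearson`, `topic_correlation`, and `per_category_correlation` are hypothetical, as is the choice to condition on category by computing one matrix per label.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    dx = [x - mx for x in xs]
    dy = [y - my for y in ys]
    denom = sqrt(sum(d * d for d in dx) * sum(d * d for d in dy))
    return sum(a * b for a, b in zip(dx, dy)) / denom if denom > 0 else 0.0


def topic_correlation(text_topics, image_topics):
    """Correlate every text topic with every image topic over paired documents.

    text_topics[d][i]  : proportion of text topic i in document d
    image_topics[d][j] : proportion of image topic j in the paired image
    Returns a matrix C with C[i][j] = Pearson correlation of the two columns.
    """
    text_cols = list(zip(*text_topics))    # one tuple per text topic
    image_cols = list(zip(*image_topics))  # one tuple per image topic
    return [[pearson(t, v) for v in image_cols] for t in text_cols]


def per_category_correlation(text_topics, image_topics, labels):
    """One correlation matrix per category label (assumed conditioning scheme)."""
    result = {}
    for category in set(labels):
        idx = [d for d, lab in enumerate(labels) if lab == category]
        result[category] = topic_correlation(
            [text_topics[d] for d in idx],
            [image_topics[d] for d in idx],
        )
    return result
```

Given such a matrix, a text query's topic distribution could be mapped toward the image topic space through its strongly correlated entries before ranking images. That retrieval step is likewise an assumption rather than something stated in the abstract.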

Keywords:

Artificial intelligence
Latent Dirichlet allocation
Feature (computer vision)
Pattern recognition
Multimedia
Visual Word
Topic model
Image processing
Text mining
Computer science
Scale-invariant feature transform
Modal
Statistical correlation
Information retrieval
