Online Cross-Modal Scene Retrieval by Binary Representation and Semantic Graph

2017 
In recent years, cross-modal scene retrieval has attracted increasing attention. However, most existing approaches neglect the semantic relationships between objects in a scene and their embedded spatial layouts. Moreover, these methods mostly apply a batch learning strategy, which is ill-suited to processing streaming data. To address these problems, we propose a new framework for online cross-modal scene retrieval based on binary representations and a semantic graph. Specifically, we adopt cross-modal hashing based on the quantization losses of the different modalities. By introducing the semantic graph, we can extract rich semantics and measure their correlation across modalities. Furthermore, we propose a two-step optimization procedure based on stochastic gradient descent for online updates. Experimental results on four datasets show the superiority of our approach over the state of the art.
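The abstract only sketches the method at a high level. As a rough illustration of what online cross-modal hashing with a per-modality quantization loss and a two-step SGD update can look like, here is a minimal sketch in Python; the feature dimensions, learning rate, loss forms, and the cross-modal alignment term are all assumptions for illustration, not the authors' formulation.

```python
import numpy as np

# Hypothetical sketch of online cross-modal hashing with SGD; the exact
# objective is not given in the abstract, so the quantization loss and
# the alignment term below are illustrative assumptions.

rng = np.random.default_rng(0)

d_img, d_txt, n_bits = 512, 300, 64   # assumed feature / code dimensions
W_img = rng.normal(scale=0.01, size=(d_img, n_bits))  # image projection
W_txt = rng.normal(scale=0.01, size=(d_txt, n_bits))  # text projection
lr = 0.01

def quantization_loss_grad(W, x):
    """Gradient of ||sign(xW) - xW||^2 w.r.t. W for one sample (assumed loss)."""
    z = x @ W                      # continuous code
    b = np.sign(z)                 # binary code
    # d/dW of ||b - z||^2, treating b as a constant target
    return -2.0 * np.outer(x, b - z)

def online_step(x_img, x_txt):
    """One streaming update split into two SGD steps, one per modality."""
    global W_img, W_txt
    # Step 1: update the image projection against its quantization loss.
    W_img -= lr * quantization_loss_grad(W_img, x_img)
    # Step 2: update the text projection, adding an (assumed) cross-modal
    # term that pulls the text code toward the paired image's binary code.
    z_img, z_txt = x_img @ W_img, x_txt @ W_txt
    align_grad = 2.0 * np.outer(x_txt, z_txt - np.sign(z_img))
    W_txt -= lr * (quantization_loss_grad(W_txt, x_txt) + align_grad)

# Simulated stream of paired image/text features.
for _ in range(100):
    online_step(rng.normal(size=d_img), rng.normal(size=d_txt))

# Binary code for a new image query.
print(np.sign(rng.normal(size=d_img) @ W_img)[:8])
```

Because each arriving pair triggers only two rank-one gradient updates, a scheme of this shape can process streaming data without revisiting past samples, which is the practical advantage over batch hashing that the abstract highlights.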