An image retrieval method based on semantic matching with multiple positional representations

2019 
Text-based image retrieval requires either manual annotation or automatic labeling by a machine. Manual annotation is time-consuming, and a simple text description can hardly express the full content of an image. Existing deep models rely on a single sentence representation, so they cannot capture contextualized local information well during matching. To address these problems, this paper presents a new retrieval approach based on image captioning. First, a description sentence is generated for each image with an image caption model. Then, for sentence matching, we propose a semantic matching model with multiple positional representations: two interrelated Bi-LSTMs combined with an attention mechanism encode the sentences, and the matching score is produced by aggregating the interactions between these different positional sentence representations. This matching model compares the retrieval sentence against the description sentences in the image library. In our experiments, both the proposed image caption model and the sentence matching model achieve higher accuracy than competitive models, and the method completes the image retrieval task.
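The aggregation step described above — scoring a sentence pair by combining interactions between position-wise representations — can be sketched as follows. This is an illustrative toy in NumPy, not the paper's implementation: the arrays `query_reps` and `cand_reps` stand in for per-position Bi-LSTM hidden states, and the softmax attention and mean aggregation are assumed choices for the interaction weighting.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def match_score(query_reps, cand_reps):
    """Aggregate interactions between positional sentence representations.

    query_reps, cand_reps: (T, d) arrays, one vector per sentence position
    (e.g. Bi-LSTM hidden states; here just toy embeddings).
    """
    # Pairwise interaction matrix between all positions of the two sentences.
    sims = np.array([[cosine(q, c) for c in cand_reps] for q in query_reps])
    # Attention over candidate positions for each query position (softmax).
    attn = np.exp(sims) / np.exp(sims).sum(axis=1, keepdims=True)
    # Attended similarity per query position, then mean-aggregate to a score.
    per_pos = (attn * sims).sum(axis=1)
    return float(per_pos.mean())
```

Under this scheme a sentence matched against its own representations scores higher than against unrelated ones, since the diagonal of the interaction matrix is maximal and attention concentrates on it.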