Deep semantic-aware network for zero-shot visual urban perception

2021 
Visual urban perception has recently attracted considerable research attention owing to its importance in many fields. Traditional methods for visual urban perception typically require collecting adequate training instances for each newly added perception attribute. In this paper, we consider a novel formulation, zero-shot learning, to avoid this cumbersome curation. Based on the idea that images containing similar objects are more likely to share the same perceptual attribute, we learn a semantic correlation space formed by object semantic information and perceptual attributes. For newly added (unseen) attributes, we synthesize their prototypes by transferring object vector representations shared between the unseen attributes and the training (seen) perceptual attributes. To this end, we propose a deep semantic-aware network for zero-shot visual urban perception. It is a new two-step zero-shot learning architecture, comprising a supervised visual urban perception step for seen attributes and a zero-shot prediction step for unseen attributes. In the first step, we highlight the important role of semantic information and introduce it into a supervised deep visual urban perception framework for the seen attributes. In the second step, we use visualization techniques to extract the correlations between semantic information and visual perception attributes from the well-trained supervised model, and we learn prototypes for the unseen attributes and test images to predict perception scores on the unseen attributes. Experimental results on a large-scale benchmark dataset validate the effectiveness of our method.
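The prototype-synthesis idea in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes each attribute (seen or unseen) is described by an object-semantic vector, weights each seen attribute's prototype by its semantic similarity to the unseen attribute (softmax-normalized cosine similarity is an assumption here), and scores test images against the synthesized prototype by a dot product. All function names and the weighting scheme are illustrative.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two semantic vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def synthesize_prototype(unseen_sem, seen_sems, seen_protos):
    """Synthesize a prototype for an unseen attribute as a similarity-weighted
    combination of seen-attribute prototypes (softmax weighting is an assumption).

    unseen_sem:  (d,)   object-semantic vector of the unseen attribute
    seen_sems:   (k, d) object-semantic vectors of the k seen attributes
    seen_protos: (k, m) learned prototypes of the k seen attributes
    """
    sims = np.array([cosine(unseen_sem, s) for s in seen_sems])
    weights = np.exp(sims) / np.exp(sims).sum()  # softmax over similarities
    return weights @ seen_protos                 # (m,) synthesized prototype

def predict_scores(image_feats, prototype):
    """Score test images (n, m) against a prototype (m,) by dot product."""
    return image_feats @ prototype
```

A seen attribute whose object semantics are closer to the unseen attribute contributes more to the synthesized prototype, which is the transfer mechanism the abstract describes.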