Where and Who? Automatic Semantic-Aware Person Composition

2017 
Image compositing is a popular and successful method for generating realistic yet fake imagery. Much previous work in compositing has focused on improving the appearance compatibility between a given object segment and a background image. However, most prior work does not investigate automatically selecting semantically compatible segments or predicting their locations and sizes for a given background image. In this work, we attempt to fill this gap by developing a fully automatic compositing system that learns this information. To simplify the task, we restrict our problem to human instance composition, because human segments exhibit strong correlations with the background scene and are easy to collect. The first problem we investigate is determining where a person segment should be placed in a given background image, and what its size should be. We tackle this by developing a novel Convolutional Neural Network (CNN) model that jointly predicts the potential location and size of the person segment. The second problem we investigate is: given the background image, which person segments (who) can be composited at the previously predicted locations and sizes while remaining compatible with both the local context and the global scene semantics? To achieve this, we propose an efficient context-based segment retrieval method that incorporates pre-trained deep feature representations. To demonstrate the effectiveness of the proposed compositing system, we conduct quantitative and qualitative experiments, including a user study. Experimental results show our system can generate composite images that look semantically and visually convincing. We also develop a proof-of-concept user interface to demonstrate the potential application of our method.
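The context-based retrieval step described above can be illustrated with a minimal sketch: rank candidate person segments by cosine similarity between a feature vector describing the background context and the feature vector of each candidate segment. All names, feature dimensions, and data below are illustrative assumptions, not the paper's actual pipeline or features.

```python
import numpy as np

def retrieve_segments(context_feat, segment_feats, k=2):
    """Rank candidate person segments by cosine similarity between a
    background-context feature vector and each segment's feature vector.
    Illustrative sketch only; shapes and names are assumptions."""
    context = context_feat / np.linalg.norm(context_feat)
    segs = segment_feats / np.linalg.norm(segment_feats, axis=1, keepdims=True)
    scores = segs @ context            # cosine similarity per candidate
    top = np.argsort(-scores)[:k]     # indices of the k best matches
    return top, scores[top]

# Toy example: 4 candidate segments with 3-D stand-in "deep" features.
context = np.array([1.0, 0.0, 0.0])
candidates = np.array([
    [0.9, 0.1, 0.0],    # points mostly along the context direction
    [0.0, 1.0, 0.0],    # orthogonal to the context
    [0.7, 0.7, 0.0],    # partially aligned
    [-1.0, 0.0, 0.0],   # opposite direction
])
idx, scores = retrieve_segments(context, candidates, k=2)
print(idx)   # indices of the two best-matching candidates
```

In practice the feature vectors would come from a pre-trained deep network, and the retrieval would run over a large pool of human segments rather than a toy array.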