Class-balanced Text to Image Synthesis with Attentive Generative Adversarial Network

2021 
Although the text-to-image synthesis task has seen significant progress, generating high-quality images remains a challenge. In this article, we first propose an attention-driven, cycle-refinement generative adversarial network, AGAN-v1, to bridge the domain gap between visual content and semantic concepts by constructing the spatial configurations of objects. Its core component is the generation of image contours, in which an attention mechanism refines local image details by focusing on the objects relevant to each subregion. Second, we propose an advanced class-balanced generative adversarial network, AGAN-v2, to address the problem of long-tailed data distributions; importantly, it is the first method to address this problem in the text-to-image synthesis task. AGAN-v2 introduces a reweighting scheme that adopts the effective number of samples for each class to rebalance the generative loss. Extensive quantitative and qualitative experiments on the CUB and MS-COCO datasets demonstrate that AGAN-v2 significantly outperforms state-of-the-art methods.
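The abstract does not spell out the reweighting formula, but "effective number of samples" conventionally refers to E_n = (1 - β^n) / (1 - β) with hyperparameter β ∈ [0, 1) and per-class count n (Cui et al., 2019). Below is a minimal PyTorch sketch of such a reweighting under that assumption; the function name, the toy class counts, and the stand-in loss are hypothetical and only illustrate how rarer classes receive larger weights in the rebalanced generative loss.

```python
import torch

def class_balanced_weights(samples_per_class, beta=0.9999):
    """Per-class weights from the effective number of samples.

    Assumes the standard formulation E_n = (1 - beta^n) / (1 - beta);
    weights are the inverse of E_n, normalized to sum to the number
    of classes so the overall loss scale is roughly preserved.
    """
    n = torch.as_tensor(samples_per_class, dtype=torch.float)
    effective_num = (1.0 - beta ** n) / (1.0 - beta)
    weights = 1.0 / effective_num
    return weights * len(n) / weights.sum()

# Hypothetical usage: rebalance a per-sample generative loss by the
# weight of each sample's class label in a long-tailed dataset.
samples_per_class = [500, 120, 30, 5]          # illustrative long-tailed counts
w = class_balanced_weights(samples_per_class)  # rarer classes get larger weights
labels = torch.tensor([0, 3, 3, 1])            # class label of each batch sample
per_sample_loss = torch.rand(4)                # stand-in for the generator's loss
balanced_loss = (w[labels] * per_sample_loss).mean()
```

As β approaches 1, E_n approaches n and the scheme reduces to inverse-frequency weighting; as β approaches 0, all classes are weighted equally, so β interpolates between the two regimes.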