TRGAN: Text to Image Generation Through Optimizing Initial Image

2021 
Generative Adversarial Networks (GANs) have shown success in text-to-image generation tasks. Most current methods generate images in multiple stages, so the quality of the final image depends heavily on the quality of the initially generated image: if the first-stage image suffers from low quality, low resolution, irregular shapes, strange colors, or unrealistic relations between entities, it is difficult to produce a high-quality image in the end. In this paper, we address this problem with a novel multi-stage generation model called the Text-representation Generative Adversarial Network (TRGAN). TRGAN contains two modules: a Joint Attention Stacked Generation Module (JASGM) and a Text Generation in the Opposite Direction and Correction Module (TGOCM). The JASGM extracts detailed features from word-level information and generates images conditioned on global sentence attention. The TGOCM generates text descriptions in the reverse direction (from image back to text), which improves the quality of the initial images by matching the word-level feature vectors. Experimental results show that TRGAN outperforms state-of-the-art text-to-image generation methods on the CUB and COCO datasets.
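To make the two ideas in the abstract concrete, below is a minimal, illustrative PyTorch sketch: a stage that refines image-region features using both word-level attention and a global sentence embedding (the JASGM idea), plus a head that maps the refined image features back into word-feature space so a matching loss can "read the text back" from the image (the TGOCM idea). All module names, dimensions, and the specific loss are assumptions made for illustration; this is not the paper's actual architecture.

```python
# Illustrative sketch only: module names, shapes, and the correction loss are
# assumptions, not taken from the TRGAN paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class WordAttention(nn.Module):
    """Attend over word embeddings with image-region queries (word-level detail)."""
    def __init__(self, word_dim, img_dim):
        super().__init__()
        self.proj = nn.Linear(word_dim, img_dim)

    def forward(self, regions, words):
        # regions: (B, N, img_dim) image-region features; words: (B, T, word_dim)
        keys = self.proj(words)                                       # (B, T, img_dim)
        attn = torch.softmax(regions @ keys.transpose(1, 2), dim=-1)  # (B, N, T)
        return attn @ keys                         # word context per image region

class StageGenerator(nn.Module):
    """One generation stage: refine region features using sentence- and
    word-level conditioning (a stand-in for the JASGM idea)."""
    def __init__(self, img_dim, sent_dim, word_dim):
        super().__init__()
        self.attn = WordAttention(word_dim, img_dim)
        self.fuse = nn.Linear(img_dim * 2 + sent_dim, img_dim)

    def forward(self, regions, sentence, words):
        ctx = self.attn(regions, words)            # word-level detail
        sent = sentence.unsqueeze(1).expand(-1, regions.size(1), -1)
        return F.relu(self.fuse(torch.cat([regions, ctx, sent], dim=-1)))

class ReverseTextHead(nn.Module):
    """Map refined image features back to word-feature space; matching this
    output to the input word features approximates the 'text generation in
    the opposite direction' correction signal (the TGOCM idea)."""
    def __init__(self, img_dim, word_dim):
        super().__init__()
        self.decode = nn.Linear(img_dim, word_dim)

    def forward(self, regions):
        return self.decode(regions.mean(dim=1))    # (B, word_dim)

if __name__ == "__main__":
    B, N, T = 2, 16, 8
    img_dim, sent_dim, word_dim = 32, 20, 24
    regions = torch.randn(B, N, img_dim)   # stand-in for first-stage image features
    sentence = torch.randn(B, sent_dim)    # global sentence embedding
    words = torch.randn(B, T, word_dim)    # word embeddings

    stage = StageGenerator(img_dim, sent_dim, word_dim)
    head = ReverseTextHead(img_dim, word_dim)

    refined = stage(regions, sentence, words)
    # Correction loss: refined image features should "read back" as the input text.
    loss = F.mse_loss(head(refined), words.mean(dim=1))
    loss.backward()
    print("correction loss:", loss.item())
```

In a full multi-stage pipeline, several such stages would be stacked so each refines the previous stage's output, and the reverse-text matching loss would be added to the usual adversarial losses; those details are omitted here for brevity.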