Multimodal supervised image translation

2019 
Multimodal image-to-image translation is a class of vision and graphics problems where the goal is to learn a one-to-many mapping between the source domain and target domain. Given an image in the source domain, the model aims to produce as many diverse results as possible. It is an important and challenging problem in the task of image translation. To this end, recent works utilise Gaussian vectors to produce diverse results but with a small difference. It is because of the special probabilistic nature of Gaussian distribution. In this work, the authors propose linearly distributed latent codes instead of conventional Gaussian vectors, which control the style of generated images. Taking advantage of linear distribution, their model can produce much more diverse results and outperform the state-of-the-art baselines in terms of diversity. Qualitative and quantitative comparisons against baselines demonstrate the effectiveness and superiority of their method.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    6
    References
    2
    Citations
    NaN
    KQI
    []