CPTNet: Cascade Pose Transform Network for Single Image Talking Head Animation

2020 
We study the problem of talking head animation from a single image. Most existing methods focus on generating talking heads for humans, while little attention has been paid to creating talking head anime. In this paper, our goal is to synthesize vivid talking heads from a single anime image. To this end, we propose a cascade pose transform network, termed CPTNet, which consists of a face pose transform network and a head pose transform network. Specifically, we introduce a mask generator to animate facial expression (e.g., closing the eyes and opening the mouth) and a grid generator for head movement animation, followed by a fusion module that generates the talking head. To handle large motion and obtain more accurate results, we design a pose vector decomposition and cascaded refinement strategy. In addition, we create an anime talking head dataset, covering various anime characters and poses, to train our model. Extensive experiments on our dataset demonstrate that our model outperforms other methods, generating more accurate and vivid talking heads from a single anime image.
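To make the described pipeline concrete, the following is a minimal PyTorch sketch of a CPTNet-style forward pass, not the authors' implementation. All module names, channel sizes, pose dimensions, and the way the pose vector is split across refinement stages are assumptions for illustration; the fusion module is reduced to simple chaining of the two transforms.

```python
# Hypothetical sketch of a CPTNet-style cascade (assumed architecture details).
import torch
import torch.nn as nn
import torch.nn.functional as F


def conv_block(in_ch, out_ch):
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True))


class MaskGenerator(nn.Module):
    """Animates facial expression (e.g., eye close, mouth open) by predicting
    a per-pixel mask and color change conditioned on the face pose vector."""
    def __init__(self, pose_dim=3, ch=32):
        super().__init__()
        self.net = nn.Sequential(conv_block(3 + pose_dim, ch), conv_block(ch, ch),
                                 nn.Conv2d(ch, 4, 3, padding=1))  # 3 color + 1 mask channels

    def forward(self, img, face_pose):
        b, _, h, w = img.shape
        pose_map = face_pose.view(b, -1, 1, 1).expand(b, face_pose.shape[1], h, w)
        out = self.net(torch.cat([img, pose_map], dim=1))
        color, mask = torch.tanh(out[:, :3]), torch.sigmoid(out[:, 3:])
        return mask * color + (1 - mask) * img  # blend edited region with the input


class GridGenerator(nn.Module):
    """Animates head movement by predicting a 2D sampling offset field
    conditioned on the head pose vector, applied with grid_sample."""
    def __init__(self, pose_dim=3, ch=32):
        super().__init__()
        self.net = nn.Sequential(conv_block(3 + pose_dim, ch), conv_block(ch, ch),
                                 nn.Conv2d(ch, 2, 3, padding=1))

    def forward(self, img, head_pose):
        b, _, h, w = img.shape
        pose_map = head_pose.view(b, -1, 1, 1).expand(b, head_pose.shape[1], h, w)
        flow = self.net(torch.cat([img, pose_map], dim=1)).permute(0, 2, 3, 1)
        # identity sampling grid in [-1, 1], offset by the predicted flow
        ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w),
                                indexing="ij")
        identity = torch.stack([xs, ys], dim=-1).to(img.device).unsqueeze(0)
        return F.grid_sample(img, identity + flow, align_corners=True)


class CPTNetSketch(nn.Module):
    """Cascade: face pose transform, then head pose transform, refined over
    several stages by applying an equal fraction of the pose vector per stage
    (a simplified stand-in for the paper's decomposition/refinement strategy)."""
    def __init__(self, face_dim=3, head_dim=3, stages=2):
        super().__init__()
        self.face_net = MaskGenerator(face_dim)
        self.head_net = GridGenerator(head_dim)
        self.stages = stages

    def forward(self, img, face_pose, head_pose):
        out = img
        for _ in range(self.stages):
            out = self.face_net(out, face_pose / self.stages)
            out = self.head_net(out, head_pose / self.stages)
        return out


if __name__ == "__main__":
    model = CPTNetSketch()
    frame = model(torch.rand(1, 3, 128, 128), torch.rand(1, 3), torch.rand(1, 3))
    print(frame.shape)  # torch.Size([1, 3, 128, 128])
```

Splitting the target pose across stages keeps each warp small, which is one plausible reading of how a cascaded refinement can handle large motion; the paper's actual decomposition may differ.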