Generating Better Search Engine Text Advertisements with Deep Reinforcement Learning

2019 
Deep Reinforcement Learning has been applied in a number of fields to directly optimize non-differentiable reward functions, including in sequence-to-sequence settings using Self-Critical Sequence Training (SCST). Previously, SCST has primarily been applied to bring conditional language models closer to the distribution of their training set, as in traditional neural machine translation and abstractive summarization. We frame the generation of search engine text ads as a sequence-to-sequence problem and consider two related goals: to generate ads similar to those a human would write, and to generate ads with high click-through rates (CTR). We jointly train a model to minimize cross-entropy on an existing corpus of Landing Page/Text Ad pairs using typical sequence-to-sequence training techniques, while also using SCST to optimize the expected CTR as predicted by an existing oracle model. Through joint training we achieve a 6.7% increase in expected CTR without a meaningful drop in ROUGE score. Human experiments demonstrate that SCST training produces significantly more attractive ads without reducing grammatical quality.
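The joint objective described above combines a teacher-forced cross-entropy term with an SCST policy-gradient term whose reward is the oracle's predicted CTR, using the model's own greedy decode as the reward baseline. Below is a minimal PyTorch sketch of that combination; the `decoder` interface, the `ctr_oracle` stand-in, the mixing weight `lam`, and the `BOS` token id are illustrative assumptions, not the authors' actual components.

```python
# A sketch of joint cross-entropy + SCST training, assuming a PyTorch
# decoder that returns per-step vocabulary logits. Not the authors'
# implementation; interfaces here are hypothetical.
import torch
import torch.nn.functional as F

BOS = 1  # assumed beginning-of-sequence token id


def rollout(decoder, src, max_len, greedy):
    """Autoregressively decode, returning tokens and their log-probs."""
    tokens, logps = [], []
    prev = torch.full((src.size(0), 1), BOS, dtype=torch.long)
    for _ in range(max_len):
        step_logits = decoder(src, prev)[:, -1]            # (B, V)
        dist = torch.distributions.Categorical(logits=step_logits)
        tok = step_logits.argmax(-1) if greedy else dist.sample()
        tokens.append(tok)
        logps.append(dist.log_prob(tok))
        prev = torch.cat([prev, tok.unsqueeze(1)], dim=1)
    return torch.stack(tokens, dim=1), torch.stack(logps, dim=1)


def joint_loss(decoder, src, tgt, ctr_oracle, lam=0.5, max_len=30):
    # Cross-entropy term on the reference ad (teacher forcing).
    logits = decoder(src, tgt[:, :-1])                     # (B, T, V)
    ce = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                         tgt[:, 1:].reshape(-1))

    # SCST term: a sampled rollout scored against a greedy baseline.
    # The advantage is treated as a constant (no gradient through the
    # frozen CTR oracle or the discrete samples).
    sampled, logps = rollout(decoder, src, max_len, greedy=False)
    with torch.no_grad():
        greedy_seq, _ = rollout(decoder, src, max_len, greedy=True)
        advantage = ctr_oracle(sampled) - ctr_oracle(greedy_seq)  # (B,)
    scst = -(advantage * logps.sum(dim=1)).mean()

    return lam * ce + (1 - lam) * scst
```

Using the greedy decode as the baseline is the defining feature of SCST: samples that the oracle scores above the model's own best guess are reinforced, and samples scored below it are suppressed, without training a separate value network.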