Entropy Targets for Adaptive Distillation

2020 
The focus of this paper is the problem of targets in knowledge distillation. Compared with hard targets, soft targets provide extra information that compensates for the lack of supervision signals in classification problems, but they still have defects, such as the noise introduced by high-entropy distributions. We address this problem by controlling the information entropy, which lets the student network adapt to the targets. After introducing the concepts of the system and interference labels, we propose an entropy transformation that reduces the information entropy of the system using interference labels while maintaining the supervision signal. Through entropy analysis and the entropy transformation, entropy targets are generated from soft targets and added to the loss function. Owing to the decrease in entropy, the student network can better learn inter-class similarity from this adaptive knowledge and can potentially lower the risk of over-fitting. Our experiments on the MNIST and DISTRACT datasets demonstrate the benefits of entropy targets over soft targets.
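The abstract does not give the exact form of the entropy transformation, but the idea of suppressing interference labels to lower the entropy of a soft-target distribution while keeping the supervision signal can be sketched as follows. The top-k truncation used here is an illustrative assumption, not the paper's actual transformation, and the function names (`entropy`, `entropy_transform`) are hypothetical:

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy (in nats) of a probability vector."""
    p = np.clip(p, eps, 1.0)
    return float(-np.sum(p * np.log(p)))

def entropy_transform(soft_targets, k=3):
    """Illustrative entropy transformation (an assumption, not the paper's
    exact method): keep the k most probable classes of the teacher's soft
    targets (the 'system'), zero out the remaining low-probability
    'interference labels', and renormalise. This lowers the entropy of the
    target distribution while preserving the dominant supervision signal
    and the inter-class similarity among the retained classes."""
    p = np.asarray(soft_targets, dtype=float)
    keep = np.argsort(p)[::-1][:k]   # indices of the k largest probabilities
    q = np.zeros_like(p)
    q[keep] = p[keep]
    return q / q.sum()               # renormalise to a valid distribution

# Example: teacher soft targets over 5 classes
soft = np.array([0.55, 0.25, 0.10, 0.05, 0.05])
tgt = entropy_transform(soft, k=2)   # entropy target with interference removed
```

The resulting entropy target would then replace (or be combined with) the soft targets in the distillation loss; because `entropy(tgt) < entropy(soft)` while the argmax class is unchanged, the student sees a sharper yet still similarity-aware signal.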