PornNet: A Unified Deep Architecture for Pornographic Video Recognition

2021 
In the era of big data, massive harmful multimedia resources publicly available on the Internet greatly threaten children and adolescents. In particular, recognizing pornographic videos is of great importance for protecting the mental and physical health of the underage. In contrast to the conventional methods which are only built on image classifier without considering audio clues in the video, we propose a unified deep architecture termed PornNet integrating dual sub-networks for pornographic video recognition. More specifically, with image frames and audio clues extracted from the pornographic videos from scratch, they are respectively delivered to two deep networks for pattern discrimination. For discriminating pornographic frames, we propose a local-context aware network that takes into account the image context in capturing the key contents, whilst leveraging an attention network which can capture temporal information for recognizing pornographic audios. Thus, we incorporate the recognition scores generated from the two sub-networks into a unified deep architecture, while making use of a pre-defined aggregation function to produce the whole video recognition result. The experiments on our newly-collected large dataset demonstrate that our proposed method exhibits a promising performance, achieving an accuracy at 93.4% on the dataset including 1 k pornographic samples along with 1 k normal videos and 1 k sexy videos.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    12
    References
    0
    Citations
    NaN
    KQI
    []