New Tampered Features for Scene and Caption Text Classification in Video Frame

2016 
The co-occurrence of caption (graphics/superimposed) text and scene text in video frames is a major cause of the poor accuracy of text recognition methods. This paper proposes an approach for classifying caption and scene text by analyzing the spatial distribution of DCT coefficients in a new way to identify tampered (artificially inserted) information. Because caption text is edited or superimposed onto a frame, it is artificially created, whereas scene text occurs naturally in the frame; we exploit this difference, through DCT coefficients, to identify the presence of caption and scene text in video frames. The proposed method analyzes the distributions of zero and non-zero coefficients (considering only positive values) locally by moving a window over each input text-line image and applying histogram operations. This generates line graphs over the coordinates of the zero and non-zero coefficients. We further study the behavior of text lines, namely linearity and smoothness, based on centroid-location analysis and the principal-axis direction of each text line, for classification. Experimental results on standard datasets, namely the ICDAR 2013 video, ICDAR 2015 video, and YVT video datasets, together with our own data, show that the performance of text recognition methods improves significantly after classification compared with before classification.
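The core signal the abstract describes — counting zero and positive DCT coefficients inside a moving window over a text-line image — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the non-overlapping 8x8 window scheme, and the zero tolerance are all assumptions made for the example.

```python
import numpy as np

def dct2_8x8(block):
    """2-D DCT-II of an 8x8 block via the orthonormal DCT matrix."""
    N = 8
    n = np.arange(N)
    C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * N))
    C[0, :] = np.sqrt(1.0 / N)  # DC row scaling for orthonormality
    return C @ block @ C.T

def coefficient_profiles(text_line, win=8, zero_tol=1e-6):
    """Slide a window across a grayscale text-line image and count zero
    and positive DCT coefficients per window position.  (Hypothetical
    simplification of the paper's moving-window histogram analysis;
    superimposed captions tend to yield sparser, more regular spectra
    than naturally occurring scene text.)"""
    h, w = text_line.shape
    zeros, positives = [], []
    for x in range(0, w - win + 1, win):
        block = text_line[:win, x:x + win].astype(float)
        coeffs = dct2_8x8(block)
        zeros.append(int(np.sum(np.abs(coeffs) < zero_tol)))
        positives.append(int(np.sum(coeffs > zero_tol)))
    return zeros, positives
```

For a perfectly flat (synthetic-looking) region, every window has a single positive DC coefficient and 63 zeros; textured natural regions spread energy across many more non-zero coefficients, which is the distributional difference the classifier exploits.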