Optical Character Recognition for Audio-Visual Broadcast Transcription System

2020 
This paper investigates the use of optical character recognition (OCR) in an audio-visual broadcast transcription system. Characters were recognized from video frames with Tesseract, an open-source OCR program. From version 4 onward, the OCR in Tesseract is based on Recurrent Neural Networks (RNN) and post-processes the recognized text with a bigram language model. Nevertheless, the recognized text still contains a number of errors: in some images the text is detected but recognized incorrectly, and in others it is not detected at all. We designed and tested image pre-processing and text post-processing methods to reduce these OCR errors. The word error rate (WER) was reduced from 29.4% to 15.4%.
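As an illustration of the metric reported above, the following minimal sketch computes WER in its usual form: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the OCR output, divided by the number of reference words. The abstract does not describe the evaluation code, so the function names `word_error_rate` and `relative_reduction` and the sample strings are assumptions made purely for this example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from the output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(before: float, after: float) -> float:
    """Fraction by which an error rate dropped, relative to its starting value."""
    return (before - after) / before


if __name__ == "__main__":
    # Made-up caption text, not data from the paper.
    print(word_error_rate("the news at ten", "the new at"))
    # Relative reduction implied by the reported figures (29.4% -> 15.4%).
    print(relative_reduction(29.4, 15.4))
```

Applied to the reported figures, `relative_reduction(29.4, 15.4)` gives roughly 0.48, meaning that the pre-processing and post-processing steps removed a little under half of the word errors in relative terms.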