A Framework for Lecture Video Segmentation from Extracted Speech Content

2021 
The increasing demand for lecture videos in digital libraries has raised the challenge of automatically annotating lecture content so that users can navigate lectures effectively. One approach is to segment lecture videos in advance, which simplifies applications such as indexing, keyword spotting, and targeted search. In this study, we present a lecture video segmentation framework based on the instructor's speech content. The framework is built upon a model that extracts textual and acoustic features from speech and uses them to identify topical segment boundaries in the lecture video. To evaluate the proposed model, we collected a dataset of 37 diverse lecture videos and manually created ground-truth segmentations. Performance was measured with Precision, Recall, and F1 score, yielding 0.69, 0.58, and 0.63, respectively. We also compared our model with similar previously published models, and ours outperformed them. The overall result of the study is a lecture video segmentation model that integrates various tools and techniques and shows promising performance. The findings can support further research on content-based search and retrieval using speech content.
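The abstract reports Precision, Recall, and F1 but does not specify the boundary-matching protocol. As a sanity check, the reported numbers are internally consistent: 2·0.69·0.58/(0.69+0.58) ≈ 0.63. The sketch below shows one common way segmentation boundaries are scored, where a predicted boundary counts as a hit if it falls within a tolerance window of an unmatched reference boundary. This is an illustrative assumption, not the authors' evaluation code; the tolerance value and the example timestamps are hypothetical.

```python
# Illustrative sketch (not the paper's code): boundary-level evaluation for
# lecture segmentation. A predicted boundary counts as correct if it lies
# within `tolerance` seconds of a not-yet-matched ground-truth boundary;
# the tolerance value is an assumption, not taken from the paper.

def evaluate_boundaries(predicted, ground_truth, tolerance=10.0):
    """Return (precision, recall, f1) for boundary timestamps in seconds."""
    matched = set()
    true_positives = 0
    for p in predicted:
        # Greedily match each prediction to the nearest unused reference boundary.
        candidates = [(abs(p - g), i) for i, g in enumerate(ground_truth)
                      if i not in matched and abs(p - g) <= tolerance]
        if candidates:
            matched.add(min(candidates)[1])
            true_positives += 1
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical boundary timestamps (in seconds), for demonstration only.
pred = [62.0, 305.5, 611.0, 940.0]
gold = [60.0, 300.0, 615.0, 900.0, 1200.0]
print(evaluate_boundaries(pred, gold))  # -> (0.75, 0.6, ~0.667)
```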