Natural Language Descriptions of Human Activities Scenes: Corpus Generation and Analysis

2016 
There has been continuous growth in the volume and ubiquity of video material. It has become essential to define video semantics in order to aid the searchability and retrieval of this data. Although the method of annotating this data with keywords is relatively well researched, the quality can be improved by describing videos with natural language. We are exploring approaches to generating natural language descriptions of interrelations between human activities in a video stream. This paper focuses on the creation of a dataset that can be used for development and evaluation. To this end, a corpus of video clips, manually selected from the Hollywood2 dataset, and their natural language descriptions has been generated. Analysis of the hand annotation presents insights into human interests and thoughts. Such a resource can be used to evaluate automatic natural language generation systems for video.