Video in sentences out

Andrei Barbu, Alexander Bridge, Zachary Burchill, Dan Coroian, Sven Dickinson, Sanja Fidler, Aaron Michaux, Sam Mussman, Siddharth Narayanaswamy, Dhaval Salvi, Lara Schmidt, Jiangnan Shangguan, Jeffrey Mark Siskind, Jarrell Waggoner, Song Wang, Jinlian Wei, Yifan Yin, Zhiqi Zhang

2012

We present a system that produces sentential descriptions of video: who did what to whom, and where and how they did it. Action class is rendered as a verb, participant objects as noun phrases, properties of those objects as adjectival modifiers in those noun phrases, spatial relations between those participants as prepositional phrases, and characteristics of the event as prepositional-phrase adjuncts and adverbial modifiers. Extracting the information needed to render these linguistic entities requires an approach to event recognition that recovers object tracks, the track-to-role assignments, and changing body posture.
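The mapping from event structure to sentence structure described above can be illustrated with a minimal sketch. It covers only the final rendering step and assumes the vision side (object tracks, track-to-role assignment, body posture) has already produced a symbolic event. The `Participant` and `Event` classes, their field names, and the example vocabulary are hypothetical and chosen for illustration; they are not taken from the authors' system.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Participant:
    """A tracked object filling a role in the event (hypothetical structure)."""
    noun: str                                             # object class -> head noun
    adjectives: List[str] = field(default_factory=list)   # object properties -> adjectival modifiers
    preposition: Optional[str] = None                     # spatial relation, e.g. "to the left of"
    reference: Optional["Participant"] = None             # the other participant in that relation

    def noun_phrase(self) -> str:
        words = ["the", *self.adjectives, self.noun]
        # A spatial relation is rendered as a prepositional phrase inside the noun phrase.
        if self.preposition and self.reference is not None:
            words += [self.preposition, self.reference.noun_phrase()]
        return " ".join(words)


@dataclass
class Event:
    """Symbolic event record (assumed form; field names are illustrative)."""
    verb: str                              # action class -> verb (already inflected here)
    agent: Participant                     # who
    patient: Optional[Participant] = None  # to whom
    adverb: Optional[str] = None           # how -> adverbial modifier
    adjunct: Optional[str] = None          # where -> prepositional-phrase adjunct


def render(event: Event) -> str:
    """Assemble the sentence: agent NP, adverb, verb, patient NP, adjunct."""
    words = [event.agent.noun_phrase()]
    if event.adverb:
        words.append(event.adverb)
    words.append(event.verb)
    if event.patient is not None:
        words.append(event.patient.noun_phrase())
    if event.adjunct:
        words.append(event.adjunct)
    sentence = " ".join(words)
    return sentence[0].upper() + sentence[1:] + "."


if __name__ == "__main__":
    backpack = Participant("backpack")
    person = Participant("person", ["tall"], "to the left of", backpack)
    chair = Participant("chair", ["red"])
    print(render(Event("picked up", person, chair, adverb="slowly", adjunct="from the right")))
```

In this sketch each linguistic category is filled from exactly one field of the event record, which mirrors the correspondence the abstract lays out; a real system would also need verb inflection and determiner selection, which are omitted here.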

Keywords:

Noun phrase
Computer science
Natural language processing
Machine learning
Adjectival modifiers
Verb
Adverbial
Spatial relation
Artificial intelligence
Event recognition
Body posture
