Deciphering visual gist and its implications for video retrieval and interface design

Meng Yang, Gary Marchionini

2005

How do people make sense of a video after viewing only a few of its frames? What elements constitute the "visual gist" in their minds? Answers to these questions have implications for both content-based video retrieval and the interface design (e.g., key-frame selection) of digital video libraries. A preliminary study with 45 subjects was conducted to explore these questions. After viewing a fast-forward surrogate, the subjects were asked to choose pictures that they thought would "belong to" the video and to think aloud while making their selections. Nine visual gist attributes (e.g., people, objects, and actions) were derived using the grounded theory method, and their frequencies were compared and analyzed.

Keywords:

Human–computer interaction
Grounded theory
Multimedia
Think aloud protocol
GiST
Interface design
Computer science
video retrieval
video based
digital video
user studies
