TokyoTechCanon at TRECVID 2012

Nakamasa Inoue, Yusuke Kamishima, Kotaro Mori, Koichi Shinoda

2012

We aim to develop a high-performance semantic indexing system using Gaussian-mixture-model (GMM) supervectors and tree-structured GMMs [1, 2, 3]. GMM supervectors corresponding to six types of audio and visual features are extracted from video shots. Tree-structured GMMs reduce the computational cost of maximum a posteriori (MAP) adaptation for estimating GMM parameters while maintaining high accuracy. This year, we introduce two new low-level features, HOG-Dense and LBP-Dense, as well as video-clip scores. HOG-Dense and LBP-Dense are extracted by dense sampling from up to 100 frames per shot. The video-clip score is defined as the maximum shot score over all shots in a video clip and is used for re-ranking video shots. Our best result was a Mean InfAP of 32.10%, which ranked first among all semantic indexing runs in the full task.
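The video-clip score described above can be illustrated with a minimal Python sketch. The clip score itself (the maximum shot score within a clip) follows the definition in the abstract. The abstract does not say how that score is combined with the shot scores for re-ranking, so the linear interpolation and the `weight` parameter below are assumptions made purely for illustration, as are the function names and the toy data.

```python
def clip_scores(shot_scores, shot_to_clip):
    """Video-clip score: the maximum shot score among all shots in a clip."""
    best = {}
    for shot, score in shot_scores.items():
        clip = shot_to_clip[shot]
        if clip not in best or score > best[clip]:
            best[clip] = score
    return best


def rerank(shot_scores, shot_to_clip, weight=0.5):
    """Re-rank shots using their clip's score.

    ASSUMPTION: the combination rule (linear interpolation with `weight`)
    is hypothetical; the abstract only states that the clip score is used
    for re-ranking, not how.
    """
    clips = clip_scores(shot_scores, shot_to_clip)
    combined = {
        shot: (1.0 - weight) * score + weight * clips[shot_to_clip[shot]]
        for shot, score in shot_scores.items()
    }
    # Highest combined score first.
    return sorted(combined, key=combined.get, reverse=True)


# Toy data (hypothetical): four shots in two clips.
example_scores = {"s1": 0.9, "s2": 0.2, "s3": 0.5, "s4": 0.4}
example_clips = {"s1": "A", "s2": "A", "s3": "B", "s4": "B"}

if __name__ == "__main__":
    print(clip_scores(example_scores, example_clips))
    print(rerank(example_scores, example_clips))
```

In this toy example, shot `s2` has a low score of its own but shares a clip with the high-scoring `s1`, so it moves up the ranking once the clip score is mixed in. That is the intuition behind using clip-level context, under the assumed combination rule.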

Keywords:

Search engine indexing
Ranking
Artificial intelligence
Sampling (statistics)
Maximum a posteriori estimation
Computer science
Pattern recognition
TRECVID
