Action Coherence Network for Weakly Supervised Temporal Action Localization
2019
Most prominent temporal action localization methods are fully supervised and rely heavily on frame-level labels, which can be prohibitively expensive to annotate. Weakly-supervised Temporal Action Localization (W-TAL) is an alternative paradigm that requires only video-level labels in training, alleviating this annotation effort. We present the Action Coherence Network (ACN) for W-TAL, which features a new coherence loss that better supervises action boundary learning and facilitates proposal regression. In addition, we propose a purpose-built fusion module for localization inference based on features extracted by two streams of convolutional neural networks. Overall, the proposed ACN achieves state-of-the-art W-TAL performance on two challenging datasets, THUMOS14 and ActivityNet1.2; in particular, ACN attains an mAP of 24.2% on THUMOS14 under an IoU threshold of 0.5, which approaches the performance of some recent fully-supervised TAL methods.
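The reported mAP depends on an IoU threshold between predicted and ground-truth action segments. As a minimal sketch of that criterion, assuming segments are given as (start, end) pairs in seconds, temporal IoU could be computed as follows. The function name `temporal_iou` and the example values are illustrative only and are not taken from the paper.

```python
def temporal_iou(pred, gt):
    """Intersection-over-union of two temporal segments.

    Each segment is a (start, end) pair; this representation is an
    assumption made for illustration.
    """
    # Length of the overlap, clamped at zero for disjoint segments.
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    # Union = sum of both lengths minus the overlap.
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


# Hypothetical prediction and ground truth: overlap 4 s, union 8 s.
iou = temporal_iou((2.0, 8.0), (4.0, 10.0))
print(iou)  # 0.5
```

Under a threshold of 0.5, a prediction like the one above would sit exactly on the boundary; whether the comparison is strict or inclusive depends on the evaluation protocol.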