Convolutional Features Combining SL(3) Group for Visual Tracking

2020 
In visual tracking, a tracker's performance depends largely on how effectively it extracts the appearance information and the spatial information of a target. Most state-of-the-art trackers either do not model appearance and spatial information separately or lack a dedicated strategy for handling strong geometric deformation of the target. In this paper, we design an appearance information model and a spatial information model separately, and then combine them to obtain complementary benefits. First, because features from deeper layers of a convolutional neural network (CNN) describe the semantic information of a target better but retain less spatial information, we adopt the features from the deepest layer as the appearance information model. Second, to track targets undergoing drastic geometric deformation, we use the projective transformation group (the SL(3) group) to model the geometric transformation of the target, since the SL(3) group can describe the geometric deformation more accurately. Furthermore, a standard discriminative correlation filter is used to exploit the convolutional features, and it is more efficient than other methods used with CNNs. Extensive experimental results show that our tracker outperforms all the compared trackers.
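To make the role of the SL(3) group concrete, the sketch below builds a projective transformation (a 3x3 matrix with determinant 1) and applies it to an image point. It is only an illustration under stated assumptions, not the paper's implementation: the abstract does not say how the group is parameterized, so the exponential-map construction and the helper names `matmul3`, `det3`, `sl3_exp`, and `warp_point` are hypothetical. The construction relies on the standard identity det(exp(A)) = exp(trace(A)), so any traceless matrix A maps to an element of SL(3).

```python
def matmul3(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def sl3_exp(a, terms=30):
    """Map a traceless 3x3 matrix to SL(3) with a truncated Taylor series.

    Assumption: the entries of `a` are small (roughly below 1 in magnitude),
    so `terms` series terms are enough for the series to converge.
    """
    if abs(a[0][0] + a[1][1] + a[2][2]) > 1e-12:
        raise ValueError("generator must have zero trace")
    result = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    term = [row[:] for row in result]  # holds A^n / n!, starting at n = 0
    for n in range(1, terms):
        term = [[v / n for v in row] for row in matmul3(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(3)]
                  for i in range(3)]
    return result


def warp_point(h, x, y):
    """Apply the homography h to the point (x, y) in homogeneous coordinates."""
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return u / w, v / w


# Hypothetical generator mixing scale, shear, translation and perspective terms.
generator = [[0.10, 0.05, 0.30],
             [-0.02, 0.05, -0.20],
             [0.01, 0.02, -0.15]]
homography = sl3_exp(generator)
warped_corner = warp_point(homography, 1.0, 1.0)
```

Working in the space of traceless generators keeps the determinant constraint satisfied by construction, which is one common reason to parameterize a homography through SL(3) rather than through its nine raw entries.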