Local Semantic Feature Aggregation-Based Transformer for Hyperspectral Image Classification

2022 
Hyperspectral images (HSIs) contain abundant information in the spatial and spectral domains, allowing for a precise characterization of material categories. Convolutional neural networks (CNNs) have achieved great success in HSI classification owing to their excellent local contextual modeling. However, CNNs suffer from fixed filter weights and deep convolutional stacks, which lead to a limited receptive field and a high computational burden. The recent vision transformer (ViT) models long-range dependencies with a self-attention mechanism and has become an alternative to the CNN backbones traditionally used in HSI classification. However, such transformer-based architectures designate all input pixels of the receptive field as feature tokens for embedding and self-attention, which inevitably limits the ability to learn multiscale features and increases the computational cost. To overcome this issue, we propose a local semantic feature aggregation-based transformer (LSFAT) architecture that allows transformers to represent long-range dependencies of multiscale features more efficiently. We introduce the concept of the homogeneous region into the transformer through a pixel aggregation strategy and further propose neighborhood-aggregation-based embedding (NAE) and attention (NAA) modules, which adaptively form multiscale features and capture local spatial semantics among them in a hierarchical transformer architecture. A reusable classification token is included together with the feature tokens in the attention calculation. In the last stage, a fully connected layer performs classification on the reusable token after transformer encoding. Extensive experiments verify the effectiveness of the NAE and NAA modules compared with the traditional ViT, and our results demonstrate the excellent classification performance of the proposed method in comparison to other state-of-the-art approaches on several public HSI datasets.
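The abstract's neighborhood-aggregation idea can be illustrated with a minimal sketch. The paper's exact NAE module is not specified here, so the code below is only an assumption-laden toy: it merges each 2x2 neighborhood of pixel tokens on a spatial grid into one coarser token by averaging, halving each spatial dimension, which is one simple way to form the multiscale tokens a hierarchical transformer stage could then attend over.

```python
# Hedged sketch (not the authors' implementation): aggregate 2x2
# neighborhoods of pixel feature tokens into coarser tokens by
# averaging, producing a smaller token grid for the next stage.

def aggregate_neighborhoods(tokens, h, w):
    """tokens: row-major list of h*w feature vectors (lists of floats).
    Returns a (h//2)*(w//2) list of tokens, each the mean of a
    2x2 spatial neighborhood."""
    dim = len(tokens[0])
    out = []
    for i in range(0, h - 1, 2):
        for j in range(0, w - 1, 2):
            group = [tokens[r * w + c]
                     for r in (i, i + 1) for c in (j, j + 1)]
            out.append([sum(v[d] for v in group) / 4.0
                        for d in range(dim)])
    return out

# Example: a 4x4 grid of 2-D tokens reduces to a 2x2 grid.
grid = [[float(k), float(k) + 1.0] for k in range(16)]
coarse = aggregate_neighborhoods(grid, 4, 4)
```

Stacking such an aggregation between transformer stages shrinks the token count fourfold per stage, which is what makes self-attention over the coarser tokens cheaper while still summarizing local spatial semantics.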