Spatial–Spectral Transformer With Cross-Attention for Hyperspectral Image Classification

2022 
Convolutional neural networks (CNNs) have been widely used in hyperspectral image (HSI) classification because of their excellent local spatial feature extraction capabilities. However, CNNs struggle to establish dependencies across long data sequences, which limits their ability to model hyperspectral spectral sequence features. To overcome these limitations, and inspired by the Transformer model, a spatial–spectral Transformer with cross-attention (CASST) method is proposed. Overall, the method consists of a dual-branch structure, i.e., a spatial branch and a spectral sequence branch. The former captures fine-grained spatial information of the HSI, while the latter extracts spectral features and establishes interdependencies between spectral sequences. Specifically, to enhance consistency among features and relieve the computational burden, a spatial–spectral cross-attention module with weighted sharing is designed to extract interactive spatial–spectral fusion features within each Transformer block, and a spatial–spectral weighted-sharing mechanism is developed to capture robust semantic features across Transformer blocks. Performance evaluation experiments on three hyperspectral classification datasets demonstrate that CASST achieves better accuracy than state-of-the-art Transformer classification models and mainstream classification networks.
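The abstract does not give implementation details, but the described branch interaction can be illustrated with a minimal sketch. The PyTorch code below (all names, dimensions, and the specific weight-sharing scheme are assumptions for illustration, not the authors' released implementation) shows one way a single shared cross-attention module could let the spatial and spectral token sequences attend to each other inside a Transformer block, so that both directions of cross-attention reuse the same projection weights.

```python
import torch
import torch.nn as nn

class SpatialSpectralCrossAttention(nn.Module):
    """Hypothetical sketch of bidirectional spatial-spectral cross-attention.

    A single nn.MultiheadAttention module is reused for both attention
    directions, which is one plausible reading of the paper's "weighted
    sharing": it roughly halves the projection parameters relative to two
    independent cross-attention blocks.
    """
    def __init__(self, dim=64, num_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_spa = nn.LayerNorm(dim)
        self.norm_spe = nn.LayerNorm(dim)

    def forward(self, spatial_tokens, spectral_tokens):
        # spatial_tokens:  (B, N_patches, dim) from the spatial branch
        # spectral_tokens: (B, N_bands,   dim) from the spectral branch
        # Spatial tokens query the spectral sequence ...
        spa, _ = self.attn(self.norm_spa(spatial_tokens),
                           spectral_tokens, spectral_tokens)
        # ... and spectral tokens query the spatial sequence, reusing
        # the exact same attention module (shared weights).
        spe, _ = self.attn(self.norm_spe(spectral_tokens),
                           spatial_tokens, spatial_tokens)
        # Residual connections keep each branch's original features.
        return spatial_tokens + spa, spectral_tokens + spe

# Example usage with assumed token shapes (e.g., a 7x7 spatial patch
# and 30 grouped spectral-band tokens):
block = SpatialSpectralCrossAttention(dim=64, num_heads=4)
spa = torch.randn(2, 49, 64)
spe = torch.randn(2, 30, 64)
out_spa, out_spe = block(spa, spe)
```

Sharing one attention module across both directions is only a sketch of the idea; the paper's actual module may share a different subset of weights or fuse the two outputs differently.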