Mixed Transformer U-Net For Medical Image Segmentation
2021
Though U-Net has achieved tremendous success in medical image segmentation
tasks, it lacks the ability to explicitly model long-range dependencies.
Vision Transformers have therefore recently emerged as alternative segmentation
architectures, owing to their innate ability to capture long-range
correlations through Self-Attention (SA). However, Transformers usually rely on
large-scale pre-training and have high computational complexity. Furthermore,
SA can only model self-affinities within a single sample, ignoring the
potential correlations of the overall dataset. To address these problems, we
propose a novel Transformer module named Mixed Transformer Module (MTM) for
simultaneous inter- and intra-affinity learning. MTM first calculates
self-affinities efficiently through our well-designed Local-Global
Gaussian-Weighted Self-Attention (LGG-SA). Then, it mines inter-connections
between data samples through External Attention (EA). By using MTM, we
construct a U-shaped model named Mixed Transformer U-Net (MT-UNet) for accurate
medical image segmentation. We test our method on two different public
datasets, and the experimental results show that the proposed method achieves
better performance than other state-of-the-art methods. The code is available
at: https://github.com/Dootmaan/MT-UNet.
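
The contrast the abstract draws between intra-sample and inter-sample affinities can be illustrated with a minimal sketch. This is not the code from the repository above: the self-attention here is plain scaled dot-product attention without learned projections (the local-global windows and Gaussian weighting of LGG-SA are not reproduced), and the external attention follows one common formulation with double normalization. The names `mem_k` and `mem_v`, the sizes, and the random inputs are all assumptions made for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Swap rows and columns of a list-of-rows matrix."""
    return [list(col) for col in zip(*a)]


def softmax(values):
    """Numerically stable softmax over a flat list."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Intra-sample affinities: each token attends to tokens of the same sample.

    Simplification (assumption): queries, keys and values are the tokens
    themselves, with no learned projections.
    """
    dim = len(tokens[0])
    scores = matmul(tokens, transpose(tokens))  # (n, n) affinity map
    weights = [softmax([s / math.sqrt(dim) for s in row]) for row in scores]
    return matmul(weights, tokens)  # (n, d)


def external_attention(tokens, mem_k, mem_v):
    """Affinities against memory units that are shared by every sample.

    mem_k and mem_v (each s x d) are hypothetical stand-ins for learned
    parameters; in this sketch they are fixed matrices.
    """
    scores = matmul(tokens, transpose(mem_k))  # (n, s)
    # Double normalization: softmax over tokens, then L1 over memory slots.
    attn = transpose([softmax(list(col)) for col in zip(*scores)])
    attn = [[a / sum(row) for a in row] for row in attn]
    return matmul(attn, mem_v)  # (n, d)


# Toy usage: two samples with n tokens of width d, and s shared memory slots.
random.seed(0)
n, d, s = 4, 3, 2


def rand_matrix(rows, cols):
    return [[random.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]


sample_a, sample_b = rand_matrix(n, d), rand_matrix(n, d)
mem_k, mem_v = rand_matrix(s, d), rand_matrix(s, d)

out_sa = self_attention(sample_a)                      # depends on sample_a only
out_ea_a = external_attention(sample_a, mem_k, mem_v)  # same memory for both
out_ea_b = external_attention(sample_b, mem_k, mem_v)  # samples
```

In this sketch the self-attention output is built only from the tokens of one sample, while both external-attention calls read from the same `mem_k` and `mem_v`. In a trained model those memories would be updated from every sample in the dataset, which is the sense in which external attention can capture correlations across samples. The external-attention cost also grows linearly with the number of tokens, since `s` is fixed.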