RSSGLT: Remote Sensing Image Segmentation Network Based on Global-Local Transformer

Satyawant Kumar, Abhishek Kumar, Dong Gyu Lee

Research output: Contribution to journal › Article › peer-review


Abstract

Remotely captured images exhibit immense variability in scale and object appearance due to complex scenes, which makes it challenging to capture the underlying attributes in both the global and local context for segmentation. Existing networks struggle to extract the inherent features because of cluttered backgrounds. To address these issues, we propose RSSGLT, a network for semantic segmentation of remote sensing images. We capture global and local features by leveraging the complementary strengths of the transformer and convolution mechanisms. RSSGLT follows an encoder-decoder design that exploits multiscale features. We construct an attention map module (AMM) that generates channelwise attention scores for fusing these features, and a global-local transformer block (GLTB) in the decoder network that supports learning robust representations during the decoding phase. Furthermore, we design a feature refinement module (FRM) to refine the fused output of the shallow-stage encoder feature and the deepest GLTB feature of the decoder. Experimental findings on two public datasets show the effectiveness of the proposed RSSGLT.
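The channelwise fusion described above can be sketched as a squeeze-and-excitation-style gate: pool each branch's feature map spatially, pass the pooled vector through a small MLP to get one score per channel, and blend the convolutional (local) and transformer (global) branches with those scores. This is a minimal illustrative sketch, not the paper's actual AMM; the function name, MLP shape, and convex-blend formulation are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention_map_module(feat_local, feat_global, w1, w2):
    """Hypothetical AMM-style fusion sketch (not the paper's exact design).

    feat_local, feat_global: feature maps of shape (C, H, W) from the
    convolution and transformer branches. w1 (C, 2C) and w2 (C, C) are
    weights of a small MLP that maps pooled statistics to per-channel scores.
    """
    # Squeeze: global average pooling over spatial dims, concat both branches -> (2C,)
    pooled = np.concatenate([feat_local.mean(axis=(1, 2)),
                             feat_global.mean(axis=(1, 2))])
    # Excitation: two-layer MLP, sigmoid keeps each channel score in (0, 1)
    scores = sigmoid(w2 @ np.maximum(w1 @ pooled, 0.0))       # (C,)
    s = scores[:, None, None]                                 # broadcast to (C, 1, 1)
    # Channelwise convex blend of the local and global branches
    return s * feat_local + (1.0 - s) * feat_global

rng = np.random.default_rng(0)
C, H, W = 4, 8, 8
a = rng.standard_normal((C, H, W))
b = rng.standard_normal((C, H, W))
w1 = 0.1 * rng.standard_normal((C, 2 * C))
w2 = 0.1 * rng.standard_normal((C, C))
fused = attention_map_module(a, b, w1, w2)
```

Because each channel score lies in (0, 1), every fused value stays between the two branch values at that position, so the gate interpolates rather than amplifies.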

Original language: English
Article number: 8000305
Pages (from-to): 1-5
Number of pages: 5
Journal: IEEE Geoscience and Remote Sensing Letters
Volume: 21
DOIs
State: Published - 2024

Keywords

  • context details
  • multiscale features
  • remote sensing images
  • semantic segmentation
  • transformer

