Feature pyramid network with multi-scale prediction fusion for real-time semantic segmentation

Toan Van Quyen, Min Young Kim

Research output: Contribution to journal › Article › peer-review

14 Scopus citations

Abstract

A feature pyramid network (FPN) is constructed from a bottom-up pathway and a top-down pathway. The method involves multi-scale features, so it can obtain rich contextual information from lower scales and high resolution from the largest scale. Additionally, different receptive fields are effective at capturing both thin and large objects in image scenes. All feature maps are concatenated to predict the targets. However, average pooling has the drawback of combining the best predictions with poorer ones. In this paper, we propose a dual prediction to leverage the useful characteristics of each FPN feature map. The low-scale prediction attains good precision for large objects, while the high-scale prediction is better suited to segmenting narrow objects. Finally, a multi-scale fusion with an attention module is deployed. The attention module finds pixels in the low-scale prediction that have a high probability of being mislabeled and supplements them with the high-scale prediction. The multi-scale fusion allows the network to learn across predictions at different scales. The method achieves 77.9% mIoU at 62 FPS on Cityscapes and 44.1% mIoU on Mapillary Vistas.
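
The contrast the abstract draws between average pooling and attention-weighted fusion can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, the low-scale prediction is assumed to be already upsampled to the resolution of the high-scale one, and the learned attention module is replaced by a simple stand-in that uses the low-scale prediction's maximum class probability as its per-pixel weight.

```python
# Each prediction is a list of pixels; each pixel is a list of class probabilities.
# Assumption: the low-scale prediction has already been upsampled to match the
# high-scale prediction pixel for pixel.

def average_fusion(p_low, p_high):
    """Per-pixel, per-class mean of the two predictions (the averaging baseline)."""
    return [[(a + b) / 2 for a, b in zip(px_low, px_high)]
            for px_low, px_high in zip(p_low, p_high)]

def confidence_attention(p_low):
    """Hypothetical stand-in for a learned attention module:
    weight each pixel by the low-scale prediction's maximum class probability."""
    return [max(px) for px in p_low]

def attention_fusion(p_low, p_high, alpha):
    """Per-pixel blend: alpha * low + (1 - alpha) * high.
    Pixels where alpha is small take most of their prediction from the high scale."""
    return [[w * a + (1 - w) * b for a, b in zip(px_low, px_high)]
            for px_low, px_high, w in zip(p_low, p_high, alpha)]

def labels(pred):
    """Index of the most probable class at each pixel."""
    return [max(range(len(px)), key=px.__getitem__) for px in pred]

# Illustrative numbers only (two pixels, two classes), not data from the paper.
# Pixel 0: the low scale is confident in class 0, the high scale disagrees.
# Pixel 1: the low scale is uncertain, the high scale is confident in class 1.
p_low = [[0.90, 0.10], [0.55, 0.45]]
p_high = [[0.05, 0.95], [0.10, 0.90]]

avg_labels = labels(average_fusion(p_low, p_high))
alpha = confidence_attention(p_low)
att_labels = labels(attention_fusion(p_low, p_high, alpha))

print("average fusion labels:  ", avg_labels)
print("attention fusion labels:", att_labels)
```

With these toy inputs, plain averaging lets the high-scale prediction override the confident low-scale one at pixel 0, whereas the weighted blend keeps the low-scale label there and still defers to the high scale at the uncertain pixel 1. In a trained network the weights would come from a learned module rather than from this confidence heuristic.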

Original language: English
Pages (from-to): 104-113
Number of pages: 10
Journal: Neurocomputing
Volume: 519
State: Published - 28 Jan 2023

Keywords

  • Attention mechanism
  • Feature pyramid network
  • Multi-scale fusion
  • Real time
  • Semantic segmentation
