Learning soft mask based feature fusion with channel and spatial attention for robust visual object tracking

Mustansar Fiaz, Arif Mahmood, Soon Ki Jung

Research output: Contribution to journal, Article, peer-review

13 Scopus citations

Abstract

We propose a soft mask based low-level feature fusion technique to improve visual object tracking, and we strengthen it further by integrating channel and spatial attention mechanisms. The approach is integrated within a Siamese framework to demonstrate its effectiveness for visual object tracking. The soft mask gives more importance to the target region than to the other regions, which enables effective target feature representation and increases discriminative power. The low-level feature fusion improves tracker robustness against distractors. Channel attention identifies the more discriminative channels for better target representation. Spatial attention complements the soft mask to better localize target objects in challenging tracking scenarios. We evaluated the approach on five publicly available benchmark datasets and performed extensive comparisons with 39 state-of-the-art tracking algorithms. The proposed tracker demonstrates excellent performance compared with the existing state-of-the-art trackers.
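The roles of the three components described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' architecture: the Gaussian form of the soft mask, the additive fusion, the parameter-free sigmoid gating, and all function names (`gaussian_soft_mask`, `channel_attention`, `spatial_attention`, `apply_soft_mask_attention`) are assumptions made for illustration, whereas the published tracker learns these components inside a Siamese network.

```python
import numpy as np


def gaussian_soft_mask(height, width, sigma_scale=0.25):
    """Soft mask that peaks at the patch centre and decays outwards.

    The Gaussian shape and sigma_scale are assumptions for illustration.
    """
    ys = np.arange(height) - (height - 1) / 2.0
    xs = np.arange(width) - (width - 1) / 2.0
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    sigma_y = sigma_scale * height
    sigma_x = sigma_scale * width
    return np.exp(-0.5 * ((yy / sigma_y) ** 2 + (xx / sigma_x) ** 2))


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(features):
    """Re-weight each channel of a (C, H, W) map by a gate in (0, 1).

    The gate comes from global average pooling; a real model would pass
    the pooled vector through learned layers first.
    """
    pooled = features.mean(axis=(1, 2))          # shape (C,)
    return features * sigmoid(pooled)[:, None, None]


def spatial_attention(features):
    """Re-weight each spatial location of a (C, H, W) map by a gate in (0, 1)."""
    pooled = features.mean(axis=0)               # shape (H, W)
    return features * sigmoid(pooled)[None, :, :]


def apply_soft_mask_attention(features, mask):
    """Fuse masked and unmasked features, then apply both attentions."""
    masked = features * mask[None, :, :]         # emphasise the target region
    fused = features + masked                    # additive fusion (assumed)
    return spatial_attention(channel_attention(fused))


rng = np.random.default_rng(0)
features = rng.standard_normal((8, 5, 5))        # toy low-level feature map
mask = gaussian_soft_mask(5, 5)
output = apply_soft_mask_attention(features, mask)
print(output.shape)
```

In this sketch the mask multiplies every channel by the same spatial weights, so locations near the assumed target centre contribute more to the fused map than locations near the border.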

Original language: English
Article number: 4021
Pages (from-to): 1-25
Number of pages: 25
Journal: Sensors
Volume: 20
Issue number: 14
State: Published - 2 Jul 2020

Keywords

  • Attentional mechanism
  • Convolutional neural network
  • Siamese networks
  • Visual tracking
