Performance Improvement Method of the Video Visual Relation Detection with Multi-modal Feature Fusion

Kwang Ju Kim, Pyong Kun Kim, Kil Taek Lim, Jong Taek Lee

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Video visual relation detection is a novel research problem that aims to detect instances of visual relations of interest in a video. In this paper, we propose a method for improving the performance of video visual relation detection with multi-modal feature fusion. First, we introduce a spatial feature extraction method designed to capture both the positions of the objects themselves and the relative positions between objects in the image. Next, we propose a relationship classifier designed to accommodate the complexity of the input features. Our proposed method achieves 6.65 mAP and ranked 2nd in the visual relation detection task of the Video Relation Understanding (VRU) Challenge at ACM Multimedia 2020.

Original language: English
Title of host publication: 4th International Conference on Artificial Intelligence in Information and Communication, ICAIIC 2022 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 87-91
Number of pages: 5
ISBN (Electronic): 9781665458184
State: Published - 2022
Event: 4th International Conference on Artificial Intelligence in Information and Communication, ICAIIC 2022 - Jeju Island, Korea, Republic of
Duration: 21 Feb 2022 → 24 Feb 2022

Publication series

Name: 4th International Conference on Artificial Intelligence in Information and Communication, ICAIIC 2022 - Proceedings

Conference

Conference: 4th International Conference on Artificial Intelligence in Information and Communication, ICAIIC 2022
Country/Territory: Korea, Republic of
City: Jeju Island
Period: 21/02/22 → 24/02/22

Keywords

  • component
  • formatting
  • insert
  • style
  • styling
