High-Movement Human Segmentation in Video Using Adaptive N-Frames Ensemble

Yong Woon Kim, Yung Cheol Byun, Dong Seog Han, Dalia Dominic, Sibu Cyriac

Research output: Contribution to journal › Article › peer-review

Abstract

A wide range of camera apps and online video conferencing services let users change the background in real time for aesthetic, privacy, and security reasons. Numerous studies show that deep learning (DL) is well suited to human segmentation, and that an ensemble of multiple DL-based segmentation models can improve the segmentation result. However, these approaches are less effective when applied directly to image segmentation in video. This paper proposes an Adaptive N-Frames Ensemble (AFE) approach for high-movement human segmentation in video using an ensemble of multiple DL models. In contrast to a conventional ensemble, which executes multiple DL models simultaneously for every video frame, the proposed AFE approach executes only a single DL model on the current video frame. When the frame difference is below a given threshold, it combines that result with the segmentation outputs of previous frames to produce the final segmentation output. Our method builds on the N-Frames Ensemble (NFE) method, which ensembles the image segmentation of the current video frame with those of previous video frames. However, NFE is not suitable for segmenting fast-moving objects in a video or for videos with low frame rates. The proposed AFE approach addresses these limitations of the NFE method. Our experiment uses three human segmentation models, namely Fully Convolutional Network (FCN), DeepLabv3, and MediaPipe. We evaluated our approach on 1711 single-person videos from the TikTok50f dataset. TikTok50f is a reconstructed version of the publicly available TikTok dataset, produced by cropping, resizing, and dividing it into videos of 50 frames each. This paper compares the proposed AFE with single models, the Two-Models Ensemble, and the NFE models. The experimental results show that the proposed AFE is suitable for both low-movement and high-movement human segmentation in video.
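
The adaptive step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the frame difference is taken as the mean absolute pixel difference scaled to [0, 1], the models are visited in round-robin order, outputs are combined by averaging probability maps, and the history is discarded when movement is high. The names `frame_difference`, `AdaptiveNFramesEnsemble`, `n_frames`, and `diff_threshold` are hypothetical.

```python
from collections import deque

import numpy as np


def frame_difference(prev_frame, curr_frame):
    """Mean absolute pixel difference scaled to [0, 1] (an assumed metric)."""
    a = prev_frame.astype(np.float32)
    b = curr_frame.astype(np.float32)
    return float(np.mean(np.abs(a - b)) / 255.0)


class AdaptiveNFramesEnsemble:
    """Illustrative sketch: one model per frame, history used only when movement is low."""

    def __init__(self, models, n_frames=3, diff_threshold=0.05):
        # Each model is any callable mapping an HxWx3 uint8 frame to an
        # HxW map of foreground probabilities in [0, 1].
        self.models = list(models)
        self.diff_threshold = diff_threshold
        # Keep the outputs of at most n_frames - 1 previous frames.
        self.history = deque(maxlen=n_frames - 1)
        self.prev_frame = None
        self.index = 0

    def step(self, frame):
        # Run a single model on the current frame (round-robin is an assumption).
        model = self.models[self.index % len(self.models)]
        self.index += 1
        prob = model(frame)

        # High movement: previous outputs are stale, so drop them (assumption).
        if (self.prev_frame is not None
                and frame_difference(self.prev_frame, frame) >= self.diff_threshold):
            self.history.clear()

        # Average the current output with whatever history remains.
        combined = np.mean([prob, *self.history], axis=0)

        self.history.append(prob)
        self.prev_frame = frame
        return combined >= 0.5  # boolean foreground mask
```

In this sketch, a run of nearly static frames is segmented from the average of up to `n_frames` outputs produced by different models, while a frame that differs strongly from its predecessor is segmented from the current model's output alone. The threshold value of 0.05 is a placeholder, not a value taken from the paper.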

Original language: English
Pages (from-to): 4743-4762
Number of pages: 20
Journal: Computers, Materials and Continua
Volume: 73
Issue number: 3
State: Published - 2022

Keywords

  • artificial intelligence
  • deep learning
  • ensemble
  • high movement
  • human segmentation
  • video instance segmentation
