Deep learning-guided video compression for machine vision tasks

Aro Kim, Seung Taek Woo, Minho Park, Dong Hwi Kim, Hanshin Lim, Soon Heung Jung, Sangwoon Kwak, Sang Hyo Park

Research output: Contribution to journal › Article › peer-review

Abstract

In the video compression industry, compression tailored to machine vision tasks has recently emerged as a critical area of focus. Given the unique characteristics of machine vision, the current practice of directly applying conventional codecs is inefficient, as it spends bits compressing regions that machine vision tasks do not need. In this paper, we propose a framework that more aptly encodes video regions distinguished by machine vision to enhance coding efficiency. To that end, the framework consists of deep learning-based adaptive switch networks that select the efficient coding tool for video encoding. Experiments demonstrate that the proposed framework outperforms the latest standardization benchmark, video coding for machines, achieving a Bjontegaard delta (BD)-rate gain of 5.91% on average and up to 19.51%.
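The BD-rate figures quoted above come from the standard Bjontegaard method, which fits each codec's rate-distortion points with a cubic polynomial in the log-rate domain and integrates over the overlapping quality range. A minimal sketch (not the authors' code; function name and interface are illustrative):

```python
import numpy as np

def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Bjontegaard delta rate (%) of test vs. anchor; negative means bitrate savings.

    Each argument is a sequence of operating points (bitrates in any
    consistent unit, PSNR in dB), one per quality setting.
    """
    # Fit log10(rate) as a cubic polynomial of PSNR for each codec.
    p_anchor = np.polyfit(psnr_anchor, np.log10(rate_anchor), 3)
    p_test = np.polyfit(psnr_test, np.log10(rate_test), 3)

    # Integrate both fits over the overlapping PSNR interval.
    lo = max(min(psnr_anchor), min(psnr_test))
    hi = min(max(psnr_anchor), max(psnr_test))
    int_anchor = np.polyval(np.polyint(p_anchor), hi) - np.polyval(np.polyint(p_anchor), lo)
    int_test = np.polyval(np.polyint(p_test), hi) - np.polyval(np.polyint(p_test), lo)

    # Average log-rate difference, mapped back to a percentage rate change.
    avg_diff = (int_test - int_anchor) / (hi - lo)
    return (10 ** avg_diff - 1) * 100
```

For example, a test codec that matches the anchor's PSNR at half the bitrate at every operating point yields a BD-rate of -50%.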

Original language: English
Article number: 32
Journal: EURASIP Journal on Image and Video Processing
Volume: 2024
Issue number: 1
DOIs
State: Published - Dec 2024

Keywords

  • Deep learning
  • Video coding for machines
  • Video compression
