
Multiple object detection and tracking in autonomous vehicles: A survey on enhanced affinity computation and its multimodal applications

  • Kyungpook National University

Research output: Contribution to journal › Review article › peer-review

9 Scopus citations

Abstract

Three-dimensional (3D) object tracking is crucial in computer vision applications, particularly in autonomous driving, robotics, and surveillance. Despite advancements, effectively utilizing multimodal data to improve multi-object detection and tracking (MODT) remains challenging. This study introduces ACMODT, an affinity computation-based multi-object detection and tracking framework that integrates camera (2D) and LiDAR (3D) data for enhanced MODT performance in autonomous driving. This approach leverages EPNet as a backbone, utilizing 2D–3D feature fusion for accurate proposal generation. A deep neural network (DNN) extracts robust appearance and geometric features, while an improved affinity computation module combines Refined Boost Correlation Features (RBCF) and 3D-Extended Geometric IoU (3D-XGIoU) for precise object association. Motion prediction is refined using a Kalman filter (KF), and Gaussian Mixture Model (GMM)-based data association ensures consistent tracking. Experiments on the KITTI car tracking benchmark for quantitative analysis and the RADIATE dataset for visualization demonstrate that our method achieves superior tracking accuracy and precision compared to state-of-the-art multi-object tracking (MOT) approaches, proving its effectiveness for real-time object tracking.
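The affinity step described in the abstract can be illustrated with a minimal sketch. The abstract does not define 3D-XGIoU, RBCF, or the GMM-based association, so this example substitutes the standard generalized IoU (GIoU) on axis-aligned 3D boxes and a simple greedy matcher. The box layout `(xmin, ymin, zmin, xmax, ymax, zmax)`, the function names `giou_3d` and `associate`, and the `min_affinity` threshold are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence, Tuple

# Assumed box layout: (xmin, ymin, zmin, xmax, ymax, zmax), axis-aligned.
Box = Tuple[float, float, float, float, float, float]


def volume(b: Box) -> float:
    """Volume of an axis-aligned box; zero if any extent is non-positive."""
    return (max(0.0, b[3] - b[0])
            * max(0.0, b[4] - b[1])
            * max(0.0, b[5] - b[2]))


def giou_3d(a: Box, b: Box) -> float:
    """Standard generalized IoU in 3D (a stand-in for the paper's 3D-XGIoU).

    Returns a value in [-1, 1]; 1 means identical boxes, and negative
    values mean the boxes are disjoint and far apart relative to their size.
    """
    inter_box = (max(a[0], b[0]), max(a[1], b[1]), max(a[2], b[2]),
                 min(a[3], b[3]), min(a[4], b[4]), min(a[5], b[5]))
    inter = volume(inter_box)
    union = volume(a) + volume(b) - inter
    # Smallest axis-aligned box enclosing both inputs.
    hull = (min(a[0], b[0]), min(a[1], b[1]), min(a[2], b[2]),
            max(a[3], b[3]), max(a[4], b[4]), max(a[5], b[5]))
    hull_vol = volume(hull)
    if union <= 0.0 or hull_vol <= 0.0:
        return 0.0  # degenerate boxes: no meaningful overlap score
    return inter / union - (hull_vol - union) / hull_vol


def associate(tracks: Sequence[Box], detections: Sequence[Box],
              min_affinity: float = 0.0) -> List[Tuple[int, int]]:
    """Greedy one-to-one matching by descending affinity (hypothetical helper)."""
    pairs = sorted(
        ((giou_3d(t, d), i, j)
         for i, t in enumerate(tracks)
         for j, d in enumerate(detections)),
        reverse=True,
    )
    used_tracks, used_dets, matches = set(), set(), []
    for score, i, j in pairs:
        if score < min_affinity:
            break  # remaining pairs score even lower
        if i in used_tracks or j in used_dets:
            continue
        used_tracks.add(i)
        used_dets.add(j)
        matches.append((i, j))
    return matches


if __name__ == "__main__":
    tracks = [(0, 0, 0, 2, 1, 1), (10, 0, 0, 12, 1, 1)]
    detections = [(10.5, 0, 0, 12.5, 1, 1), (0.5, 0, 0, 2.5, 1, 1)]
    print(sorted(associate(tracks, detections)))
```

In a full tracker, the track boxes passed to `associate` would typically be the Kalman-filter predictions for the current frame, and the geometric score would be combined with an appearance-based affinity before matching.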

Original language: English
Pages (from-to): 809-818
Number of pages: 10
Journal: ICT Express
Volume: 11
Issue number: 4
State: Published - Aug 2025

Keywords

  • Affinity computation
  • Autonomous driving
  • Deep neural network
  • Gaussian mixture model-based data association
  • Multi-object tracking
