Abstract
Three-dimensional (3D) object tracking is central to computer vision applications such as autonomous driving, robotics, and surveillance. Despite recent advances, using multimodal data effectively to improve multi-object detection and tracking (MODT) remains challenging. This study introduces ACMODT, an affinity computation-based MODT framework that integrates camera (2D) and LiDAR (3D) data for autonomous driving. The framework uses EPNet as its backbone and applies 2D–3D feature fusion to generate accurate proposals. A deep neural network (DNN) extracts robust appearance and geometric features, and an improved affinity computation module combines Refined Boost Correlation Features (RBCF) with 3D-Extended Geometric IoU (3D-XGIoU) for precise object association. A Kalman filter (KF) refines motion prediction, and Gaussian Mixture Model (GMM)-based data association maintains consistent tracking. Experiments on the KITTI car tracking benchmark (quantitative analysis) and the RADIATE dataset (visualization) show that the proposed method achieves higher tracking accuracy and precision than state-of-the-art multi-object tracking (MOT) approaches, supporting its suitability for real-time object tracking.
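To make the geometric side of the affinity step concrete, the following is a minimal sketch and not the ACMODT implementation. It assumes axis-aligned 3D boxes, uses the standard Generalized IoU as a stand-in for 3D-XGIoU (whose exact definition is not given in this abstract), replaces the Kalman filter with a plain constant-velocity shift, and replaces GMM-based association with greedy matching. All function names and the threshold value are hypothetical.

```python
from typing import List, Tuple

# Axis-aligned 3D box: (x1, y1, z1, x2, y2, z2). Assumption: no rotation.
Box = Tuple[float, float, float, float, float, float]


def volume(b: Box) -> float:
    """Volume of a box; degenerate or inverted boxes count as zero."""
    return max(0.0, b[3] - b[0]) * max(0.0, b[4] - b[1]) * max(0.0, b[5] - b[2])


def giou_3d(a: Box, b: Box) -> float:
    """Standard Generalized IoU in 3D (stand-in for 3D-XGIoU), in [-1, 1]."""
    inter = (max(a[0], b[0]), max(a[1], b[1]), max(a[2], b[2]),
             min(a[3], b[3]), min(a[4], b[4]), min(a[5], b[5]))
    hull = (min(a[0], b[0]), min(a[1], b[1]), min(a[2], b[2]),
            max(a[3], b[3]), max(a[4], b[4]), max(a[5], b[5]))
    inter_vol = volume(inter)
    union_vol = volume(a) + volume(b) - inter_vol
    hull_vol = volume(hull)
    if union_vol <= 0.0 or hull_vol <= 0.0:
        return 0.0
    # IoU minus the fraction of the enclosing box not covered by the union.
    return inter_vol / union_vol - (hull_vol - union_vol) / hull_vol


def predict(box: Box, velocity: Tuple[float, float, float], dt: float) -> Box:
    """Constant-velocity shift (simplified stand-in for a Kalman predict step)."""
    dx, dy, dz = (v * dt for v in velocity)
    return (box[0] + dx, box[1] + dy, box[2] + dz,
            box[3] + dx, box[4] + dy, box[5] + dz)


def associate(tracks: List[Box], detections: List[Box],
              threshold: float = 0.3) -> List[Tuple[int, int]]:
    """Greedy one-to-one matching by descending GIoU (stand-in for GMM association)."""
    scored = sorted(
        ((giou_3d(t, d), ti, di)
         for ti, t in enumerate(tracks)
         for di, d in enumerate(detections)),
        reverse=True,
    )
    used_tracks, used_dets, matches = set(), set(), []
    for score, ti, di in scored:
        if score < threshold:
            break  # remaining pairs score even lower
        if ti in used_tracks or di in used_dets:
            continue
        used_tracks.add(ti)
        used_dets.add(di)
        matches.append((ti, di))
    return sorted(matches)


if __name__ == "__main__":
    # Two tracks from the previous frame, moved forward by one time step.
    tracks = [predict((0, 0, 0, 2, 2, 2), (1, 0, 0), 1.0),
              predict((10, 0, 0, 12, 2, 2), (0, 0, 0), 1.0)]
    detections = [(10.1, 0, 0, 12.1, 2, 2), (1.1, 0, 0, 3.1, 2, 2)]
    print(associate(tracks, detections))
```

In a fuller pipeline, the geometric score would be combined with an appearance-based affinity before association, as the abstract describes; this sketch covers the geometric term only.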
| Original language | English |
|---|---|
| Pages (from-to) | 809-818 |
| Number of pages | 10 |
| Journal | ICT Express |
| Volume | 11 |
| Issue number | 4 |
| State | Published - Aug 2025 |
Keywords
- Affinity computation
- Autonomous driving
- Deep neural network
- Gaussian mixture model-based data association
- Multi-object tracking
Fingerprint
Dive into the research topics of 'Multiple object detection and tracking in autonomous vehicles: A survey on enhanced affinity computation and its multimodal applications'. Together they form a unique fingerprint.