Enhanced Multi-Pill Detection and Recognition Using VFI Augmentation and Auto-Labeling for Limited Single-Pill Data

Seung Hwan Lee, Dong Min Son, Sung Hak Lee

Research output: Contribution to journal, Article, peer-review

Abstract

This study presents an object detection and recognition method that identifies the positions and types of pills in images containing multiple pills, using only a small-scale dataset of single-pill images for training. The method consists of three primary steps. First, a data augmentation technique is introduced to prevent overfitting and improve learning efficiency. The augmentation uses video frame interpolation (VFI) based on a latent diffusion model (LDM); a capturing system is developed for this purpose, and differences between images are used as additional information in weight maps to train the LDM. Second, an automatic labeling system is proposed to generate label data for the paired dataset efficiently. Training requires the position and type of each pill as labels, but manually labeling the augmented dataset of 61,440 images would be costly, so the label data are generated automatically using an attention map and a deep U-Net. Third, a method is presented to detect the position and type of multiple pills from a training dataset containing only single-pill images. Reliable detection and recognition of multiple pills usually require datasets containing various pill combinations, but the number of possible combinations grows exponentially as the number of classes increases. To address this, we propose a system that learns from single-pill images to detect multiple pills accurately. Experiments use a dataset containing 40 types of pills, and the results demonstrate superior precision, recall, individual pill accuracy, and image accuracy compared to other methods.
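The combinatorial argument in the abstract can be illustrated with a short calculation. The sketch below is not taken from the paper: it assumes each pill type appears at most once per image and ignores pill position and order, and the function name `count_pill_combinations` and the parameter `max_pills` are hypothetical. Only the class count of 40 comes from the abstract.

```python
from math import comb


def count_pill_combinations(num_classes: int, max_pills: int) -> int:
    """Count unordered sets of 1..max_pills distinct pill types.

    Assumption (not from the paper): each type appears at most once
    per image, and position and order are ignored.
    """
    return sum(comb(num_classes, k) for k in range(1, max_pills + 1))


NUM_CLASSES = 40  # number of pill types reported in the abstract

if __name__ == "__main__":
    # Show how quickly the count grows with the number of pills per image.
    for max_pills in (1, 2, 3, 5, NUM_CLASSES):
        total = count_pill_combinations(NUM_CLASSES, max_pills)
        print(f"up to {max_pills:2d} pills per image: {total:,} combinations")
```

Under these assumptions, allowing any non-empty subset of the 40 types gives 2^40 - 1 combinations, which is why collecting a multi-pill training image for every combination is impractical and training from single-pill images is attractive.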

Original language: English
Pages (from-to): 60859-60878
Number of pages: 20
Journal: IEEE Access
Volume: 13
State: Published - 2025

Keywords

  • object detection and recognition
  • automatic labeling system
  • data augmentation
  • video frame interpolation
