TY - JOUR
T1 - SRGAN-enhanced unsafe operation detection and classification of heavy construction machinery using cascade learning
AU - Kim, Bubryur
AU - An, Eui Jung
AU - Kim, Sungho
AU - Sri Preethaa, K. R.
AU - Lee, Dong Eun
AU - Lukacs, R. R.
N1 - Publisher Copyright: © The Author(s) 2024.
PY - 2024/8
Y1 - 2024/8
AB - In the inherently hazardous construction industry, where injuries are frequent, the unsafe operation of heavy construction machinery significantly contributes to the injury and accident rates. To reduce these risks, this study introduces a novel framework for detecting and classifying these unsafe operations for five types of construction machinery. Utilizing a cascade learning architecture, the approach employs a Super-Resolution Generative Adversarial Network (SRGAN), Real-Time Detection Transformers (RT-DETR), self-DIstillation with NO labels (DINOv2), and Dilated Neighborhood Attention Transformer (DiNAT) models. The study focuses on enhancing the detection and classification of unsafe operations in construction machinery through upscaling low-resolution surveillance footage and creating detailed high-resolution inputs for the RT-DETR model. This enhancement, by leveraging temporal information, significantly improves object detection and classification accuracy. The performance of the cascaded pipeline yielded an average detection and first-level classification precision of 96%, a second-level classification accuracy of 98.83%, and a third-level classification accuracy of 98.25%, among other metrics. The cascaded integration of these models presents a well-rounded solution for near-real-time surveillance in dynamic construction environments, advancing surveillance technologies and significantly contributing to safety management within the industry.
KW - Safety management
KW - Smart construction sites
KW - Super-resolution generative adversarial network
KW - Transformer
KW - Unsafe operation detection
UR - http://www.scopus.com/inward/record.url?scp=85198364731&partnerID=8YFLogxK
U2 - 10.1007/s10462-024-10839-7
DO - 10.1007/s10462-024-10839-7
M3 - Article
AN - SCOPUS:85198364731
SN - 0269-2821
VL - 57
JO - Artificial Intelligence Review
JF - Artificial Intelligence Review
IS - 8
M1 - 206
ER -