Vision-Based Activity Classification of Excavators by Bidirectional LSTM

In Sup Kim, Kamran Latif, Jeonghwan Kim, Abubakar Sharafat, Dong Eun Lee, Jongwon Seo

Research output: Contribution to journal › Article › peer-review

10 Scopus citations

Abstract

Advances in deep learning and vision-based activity recognition have significantly improved safety, continuous monitoring, productivity, and cost efficiency on earthwork sites. The construction industry has adopted convolutional neural network (CNN) and recurrent neural network (RNN) models to classify the activities of construction equipment and to automate construction operations. However, the methods currently available in the industry classify activities using only the visual information of the current frame. To date, the visual information of frames adjacent to the current frame has not been examined simultaneously for activity recognition in the construction industry. This paper proposes a novel methodology that classifies excavator activities by processing the visual information of the video frames adjacent to the current frame. It follows the standard CNN-BiLSTM deep learning pipeline for excavator activity recognition. First, a pre-trained CNN extracts a sequence of visual features from the video frames. Then a bidirectional long short-term memory (BiLSTM) network classifies the excavator activities by analyzing the output of the pre-trained CNN. The stacked forward and backward LSTM layers let the algorithm compute its output from the visual information of both previous and upcoming frames. Experimental results show an average precision of 87.5% and an average recall of 88.52%.
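
The bidirectional step described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the feature size, hidden size, activity labels, random untrained weights, omission of bias terms, and the choice to classify a clip from the two end states are all assumptions made for illustration, and the per-frame feature vectors stand in for the output of a pre-trained CNN.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def init_lstm(input_size, hidden_size, rng):
    """Random (untrained) weights for one LSTM direction; biases omitted for brevity."""
    def mat():
        return [[rng.uniform(-0.1, 0.1) for _ in range(input_size + hidden_size)]
                for _ in range(hidden_size)]
    return {gate: mat() for gate in ("i", "f", "o", "g")}

def lstm_step(params, x, h, c):
    """One LSTM time step on the concatenated vector [x; h]."""
    z = x + h  # list concatenation
    i = [sigmoid(a) for a in matvec(params["i"], z)]    # input gate
    f = [sigmoid(a) for a in matvec(params["f"], z)]    # forget gate
    o = [sigmoid(a) for a in matvec(params["o"], z)]    # output gate
    g = [math.tanh(a) for a in matvec(params["g"], z)]  # candidate cell state
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new

def run_lstm(params, frames, hidden_size):
    """Run one direction over the frame sequence and return every hidden state."""
    h, c = [0.0] * hidden_size, [0.0] * hidden_size
    outputs = []
    for x in frames:
        h, c = lstm_step(params, x, h, c)
        outputs.append(h)
    return outputs

def bilstm(fwd, bwd, frames, hidden_size):
    """Per-frame state = forward state (past frames) + backward state (future frames)."""
    forward = run_lstm(fwd, frames, hidden_size)
    backward = run_lstm(bwd, frames[::-1], hidden_size)[::-1]
    return [f + b for f, b in zip(forward, backward)]

def classify_clip(states, W_out, hidden_size):
    """Assumed readout: last forward state and first backward state, each of which has seen the whole clip."""
    summary = states[-1][:hidden_size] + states[0][hidden_size:]
    logits = matvec(W_out, summary)
    return max(range(len(logits)), key=logits.__getitem__)

# Hypothetical labels and sizes, chosen only for this illustration.
ACTIVITIES = ["digging", "swinging", "dumping", "idle"]
FEATURES, HIDDEN = 8, 4

rng = random.Random(0)
clip = [[rng.random() for _ in range(FEATURES)] for _ in range(5)]  # stand-in CNN features
fwd, bwd = init_lstm(FEATURES, HIDDEN, rng), init_lstm(FEATURES, HIDDEN, rng)
W_out = [[rng.uniform(-0.1, 0.1) for _ in range(2 * HIDDEN)] for _ in ACTIVITIES]

states = bilstm(fwd, bwd, clip, HIDDEN)
print(ACTIVITIES[classify_clip(states, W_out, HIDDEN)])
```

Because the weights are random, the printed label carries no meaning; the point is only that each frame's state combines a forward pass over earlier frames with a backward pass over later ones.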

Original language: English
Article number: 272
Journal: Applied Sciences (Switzerland)
Volume: 13
Issue number: 1
DOIs
State: Published - Jan 2023

Keywords

  • activity recognition
  • computer vision
  • convolutional neural network (CNN)
  • GoogLeNet
  • long short-term memory (LSTM)
  • visual features
