Abstract
Monocular depth estimation is a long-standing computer vision task that predicts the distance of each pixel from the camera using a single 2D image. The relative height of objects lying on a ground plane can be derived from the resulting depth image, but only through several additional processing steps. In this paper, we propose a height estimation method that predicts the height of objects directly from a 2D image. The proposed method uses an encoder-decoder network for pixel-wise dense prediction based on height consistency. We used the CARLA simulator to generate 40,000 training samples from different camera positions in five areas within the simulator. The experimental results show that the object’s height map can be estimated regardless of the camera’s location.
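For context, the indirect depth-to-height route mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which processing steps are meant, so this is only one common geometric approach and not the authors' pipeline. It assumes a pinhole camera, a flat ground plane, z-depth in metres, and a known camera height and downward pitch. The function name `height_from_depth` and the parameters `fy`, `cy`, `cam_height`, and `pitch_rad` are hypothetical.

```python
import math


def height_from_depth(depth, fy, cy, cam_height, pitch_rad=0.0):
    """Convert a z-depth map (list of rows, metres) to height above the ground.

    Assumptions (not taken from the paper): pinhole camera with vertical focal
    length fy and principal-point row cy in pixels, a flat ground plane, and a
    camera mounted cam_height metres above it, tilted down by pitch_rad.
    """
    cos_p, sin_p = math.cos(pitch_rad), math.sin(pitch_rad)
    heights = []
    for v, row in enumerate(depth):
        out_row = []
        for z in row:
            # Back-project the pixel row: camera-frame y points down the image.
            y_cam = (v - cy) * z / fy
            # Vertical drop of the 3D point below the camera centre.
            drop = y_cam * cos_p + z * sin_p
            out_row.append(cam_height - drop)
        heights.append(out_row)
    return heights


# Tiny 3x1 depth map: one pixel above, at, and below the principal-point row.
example = height_from_depth([[10.0], [10.0], [150.0]],
                            fy=100.0, cy=1.0, cam_height=1.5)
```

Each output value depends on the camera height and pitch, so this route requires known camera extrinsics. The proposed method instead predicts the height map directly from the image.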
Original language | English |
---|---|
Article number | 350 |
Journal | Electronics (Switzerland) |
Volume | 12 |
Issue number | 2 |
State | Published - Jan 2023 |
Keywords
- deep learning
- object height estimation
- virtual dataset