6DoF Object Pose and Focal Length Estimation from Single RGB Images in Uncontrolled Environments

Mayura Manawadu, Soon Yong Park

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Accurate 6DoF (six degrees of freedom) pose and focal length estimation is important in extended reality (XR) applications: it enables precise object alignment and projection scaling, which improves the user experience. This study focuses on improving 6DoF pose estimation from single RGB images with unknown camera metadata. Estimating the 6DoF pose and focal length from an uncontrolled RGB image, such as one obtained from the internet, is challenging because the image often lacks crucial metadata. Existing methods such as FocalPose and FocalPose++ have made progress in this domain but still face challenges due to the projection scale ambiguity between the translation of an object along the z-axis ($t_z$) and the camera's focal length. To overcome this, we propose a two-stage strategy that resolves the projection scale ambiguity by decoupling the estimation of z-axis translation and focal length. In the first stage, $t_z$ is set arbitrarily, and we predict all the other pose parameters and the focal length relative to the fixed $t_z$. In the second stage, we predict the true value of $t_z$ while scaling the focal length based on the $t_z$ update. The proposed two-stage method reduces projection scale ambiguity in RGB images and improves pose estimation accuracy. Iterative update rules constrained to the first stage, together with tailored loss functions (including Huber loss) in the second stage, enhance the accuracy of both 6DoF pose and focal length estimation. Experimental results on benchmark datasets show significant improvements in median rotation and translation errors, as well as better projection accuracy, compared to existing state-of-the-art methods. In an evaluation across the Pix3D datasets (chair, sofa, table, and bed), the proposed two-stage method improves projection accuracy by approximately 7.19%. Additionally, incorporating Huber loss reduced translation and focal length errors by 20.27% and 6.65%, respectively, compared with the FocalPose++ method.
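
The projection scale ambiguity described in the abstract can be illustrated with a minimal pinhole-camera sketch. This is not the authors' implementation. The function names (`project`, `rescale_focal`, `huber`), the omission of rotation, the proportional focal-length rescaling rule, and the Huber threshold `delta` are all assumptions made for illustration. The sketch shows that scaling the z-axis translation (written `tz` here) and the focal length by the same factor leaves the projection of a point on the object's z = 0 plane unchanged, which is why the two quantities are hard to separate from a single image.

```python
def project(point, translation, focal, principal=(0.0, 0.0)):
    """Pinhole projection of an object-frame point.

    Rotation is omitted for brevity (assumption): the camera-frame point
    is simply point + translation.
    """
    x, y, z = (p + t for p, t in zip(point, translation))
    return (focal * x / z + principal[0], focal * y / z + principal[1])


def rescale_focal(focal, tz_old, tz_new):
    """Scale the focal length by the same ratio as the tz update.

    Hypothetical rule: the abstract only states that the focal length is
    scaled based on the tz update, not the exact formula.
    """
    return focal * tz_new / tz_old


def huber(residual, delta=1.0):
    """Standard Huber loss: quadratic near zero, linear for large residuals."""
    a = abs(residual)
    return 0.5 * a * a if a <= delta else delta * (a - 0.5 * delta)


if __name__ == "__main__":
    point = (0.1, -0.2, 0.0)      # point on the object's z = 0 plane
    tx, ty = 0.3, 0.1             # in-plane translation, kept fixed
    tz_stage1, focal_stage1 = 2.0, 600.0   # arbitrary tz, focal relative to it
    tz_stage2 = 5.0                         # updated tz estimate
    focal_stage2 = rescale_focal(focal_stage1, tz_stage1, tz_stage2)

    # Both (tz, focal) pairs produce the same image coordinates for this point.
    print(project(point, (tx, ty, tz_stage1), focal_stage1))
    print(project(point, (tx, ty, tz_stage2), focal_stage2))
```

For points off the z = 0 plane the two projections differ slightly, because perspective foreshortening depends on the true depth. That residual difference is the kind of signal a method would need in order to recover the true z-axis translation.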

Original language: English
Article number: 5474
Journal: Sensors
Volume: 24
Issue number: 17
State: Published - Sep 2024

Keywords

  • 6DoF
  • focal length
  • pose estimation
  • uncontrolled RGB images
  • XR
