A Novel Encoder-Decoder based UNet Framework for Monocular Depth Estimation
Farhan Ghulam Miran, Department of Computer Sciences, University of Engineering and Technology, Taxila, Pakistan.
Aun Irtaza, Department of Computer Sciences, University of Engineering and Technology, Taxila, Pakistan.
Nudrat Nida, Department of Software Engineering, SZABIST, Islamabad, Pakistan.
Iram Abdullah, Department of Computer Sciences, HITEC University, Taxila, Pakistan.
Corresponding Author:
Farhan Ghulam Miran (farhan13693@gmail.com)
Abstract:
Accurate depth estimation is now more important than ever, driven by research in engineering and autonomous vehicles. Monocular depth estimation is a computer vision technique that infers the depth of a scene from a single image. Depth information is crucial for numerous applications, including augmented reality, target tracking, and autonomous driving systems, which rely on it to perceive their surroundings and estimate their own state. Traditional monocular depth estimation systems are not scale-invariant, which makes it difficult to estimate depth accurately for objects of different sizes. The problem is also ill-posed: a single input image admits multiple valid depth solutions, because the image carries no explicit 3D information. Moreover, these systems are sensitive to lighting conditions, so changes in illumination can degrade the accuracy of the depth estimates. Together, these factors make accurate depth estimation challenging. This paper presents a fully automated monocular depth estimation method based on an encoder-decoder UNet architecture. The study is conducted on the DIODE dataset. The mean of two loss functions is computed and used as the loss metric to evaluate the method. The framework achieved a minimum loss of 0.30, which indicates that it is effective for real-time monocular depth estimation tasks.
Keywords:
Monocular Depth Estimation; Deep Learning; Encoder-Decoder; UNet