Deep Learning Based Monocular Depth Estimation for Object Distance Inference in 2D Images


Authors : G. Victor Daniel; Koneru Gnana Shritej; Kosari Hemanth Sai; Sunkara Namith

Volume/Issue : Volume 9 - 2024, Issue 4 - April


Google Scholar : https://tinyurl.com/3avaem9f

Scribd : https://tinyurl.com/4apty5h4

DOI : https://doi.org/10.38124/ijisrt/IJISRT24APR1431



Abstract : Monocular depth estimation, the task of predicting depth from a single 2D image, has advanced significantly with the spread of deep learning techniques. This research applies deep learning to monocular depth estimation in order to infer object distances accurately in 2D images. We explore several convolutional neural network (CNN) architectures and transformer models and analyze their efficacy in predicting depth information. Our approach involves training these models on extensive datasets annotated with depth information, followed by evaluation using standard metrics. The results demonstrate substantial improvements in depth estimation accuracy, highlighting the potential of deep learning for computer vision tasks such as autonomous driving, augmented reality, and robotic navigation. The study underscores the importance of model architecture and also investigates the impact of training data diversity and augmentation strategies. By providing a detailed analysis of the models and their performance, this research contributes to a better understanding of the current state of the art in monocular depth estimation and its potential for real-world applications, paving the way for future advances in object distance inference from 2D images.

Keywords : Monocular Depth Estimation, Deep Learning, Convolutional Neural Network (CNN), Computer Vision, Augmented Reality, Robotic Navigation.
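
The abstract refers to evaluation using standard metrics without naming them. As an illustration only, the sketch below computes three measures that are commonly reported for monocular depth estimation: absolute relative error (AbsRel), root mean squared error (RMSE), and threshold accuracy (the fraction of pixels with max(pred/gt, gt/pred) < 1.25). The choice of these particular metrics, the function name `depth_metrics`, and the `min_depth` validity threshold are assumptions made for this example, not details taken from the paper.

```python
import math


def depth_metrics(pred, gt, min_depth=1e-3):
    """Compute AbsRel, RMSE and delta < 1.25 over pixels with valid ground truth.

    pred, gt: flat sequences of predicted and ground-truth depths (same units).
    min_depth: hypothetical validity threshold; ground-truth values at or
    below it are treated as missing and skipped.
    """
    # Keep only pixels with valid ground truth; clamp predictions so the
    # ratio test below never divides by zero.
    pairs = [(max(p, min_depth), g) for p, g in zip(pred, gt) if g > min_depth]
    if not pairs:
        raise ValueError("no pixels with valid ground-truth depth")

    n = len(pairs)
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    delta1 = sum(1 for p, g in pairs if max(p / g, g / p) < 1.25) / n
    return {"abs_rel": abs_rel, "rmse": rmse, "delta1": delta1}


if __name__ == "__main__":
    # Toy values in metres; the third pixel has no ground truth and is ignored.
    print(depth_metrics([2.0, 6.0, 10.0], [2.0, 4.0, 0.0]))
```

Lower AbsRel and RMSE indicate better predictions, while a higher threshold accuracy is better. In practice these values would be averaged over every valid pixel of every test image rather than over a short list.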


