A review of 3D reconstruction methods based on deep learning

Research Article
Open access

Yuanchun Wang 1*
  • 1 Wuhan University of Technology    
  • *corresponding author wangyuanchun@whut.edu.cn
Published on 4 February 2024 | https://doi.org/10.54254/2755-2721/35/20230362
ACE Vol.35
ISSN (Print): 2755-273X
ISSN (Online): 2755-2721
ISBN (Print): 978-1-83558-295-4
ISBN (Online): 978-1-83558-296-1

Abstract

Three-dimensional (3D) reconstruction is an important research area in computer vision. Using computers to reconstruct 3D models of objects has become an indispensable part of in-depth research in many fields. This paper reviews the development of 3D reconstruction methods based on deep learning. Compared with traditional methods, deep-learning-based 3D reconstruction offers more flexible inputs and outputs and higher efficiency. The paper classifies the methods by the type of 3D model representation and discusses the different frameworks for deep-learning-based 3D reconstruction. The introduction of NeRF (Neural Radiance Field) greatly advanced deep-learning-based 3D reconstruction. NeRF can achieve good results in a very short time across a variety of complex scenes, and continued improvements by researchers have produced even more impressive results. Finally, the existing problems in the field of 3D reconstruction, their causes, and possible solutions are analyzed, and the future development trends and directions of the field are discussed.

Keywords:

Three-Dimensional Reconstruction, Deep Learning, Computer Vision, Neural Radiance Field

Wang, Y. (2024). A review of 3D reconstruction methods based on deep learning. Applied and Computational Engineering, 35, 64-71.

References

[1]. Roberts L G. Machine Perception of Three-Dimensional Solids [Ph.D. dissertation], Massachusetts Institute of Technology, USA, 1963

[2]. Choy C B, Xu D and Gwak J, 2016. 3D-R2N2: A Unified Approach for Single and Multi-view 3D Object Reconstruction//Proceedings of the European Conference on Computer Vision. Amsterdam, Netherlands: Springer: 628-644. [DOI:10.1007/978-3-319-46484-8_38]

[3]. Yang B, Rosa S and Markham A, 2019. Dense 3D Object Reconstruction from a Single Depth View. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(12): 2820-2834. [DOI:10.1109/TPAMI.2018.2868195]

[4]. Tatarchenko M, Dosovitskiy A and Brox T, 2017. Octree Generating Networks: Efficient Convolutional Architectures for High-Resolution 3D Outputs//Proceedings of the IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 2088-2096. [DOI:10.1109/ICCV.2017.230]

[5]. Xie H, Yao H and Sun X, 2019. Pix2Vox: Context-Aware 3D Reconstruction From Single and Multi-View Images//Proceedings of the International Conference on Computer Vision. Seoul, Korea (South): IEEE: 2690-2698. [DOI:10.1109/ICCV.2019.00278]

[6]. Fan H, Su H and Guibas L J, 2017. A Point Set Generation Network for 3D Object Reconstruction From a Single Image//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 605-613. [DOI:10.1109/CVPR.2017.264]

[7]. Mandikal P, Navaneet K L and Agarwal M, 2019. 3D-LMNet: Latent Embedding Matching for Accurate and Diverse 3D Point Cloud Reconstruction from a Single Image//Proceedings of the British Machine Vision Conference. Newcastle, UK: 662-674

[8]. Jiang L, Shi S and Qi X, 2018. GAL: Geometric Adversarial Loss for Single-View 3D-Object Reconstruction//Proceedings of the European Conference on Computer Vision. Munich, Germany: Springer: 802-816. [DOI:10.1007/978-3-030-01237-3_49]

[9]. Groueix T, Fisher M and Kim V G, 2018. A Papier-Mâché Approach to Learning 3D Surface Generation//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 216-224. [DOI:10.1109/CVPR.2018.00030]

[10]. He K, Zhang X, Ren S, et al., 2016. Deep Residual Learning for Image Recognition//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE. [DOI:10.1109/CVPR.2016.90]

[11]. Wang N, Zhang Y and Li Z, 2018. Pixel2Mesh: Generating 3D Mesh Models from Single RGB Images//Proceedings of the European Conference on Computer Vision. Munich, Germany: Springer: 52-67. [DOI:10.1007/978-3-030-01252-6_4]

[12]. Tang J, Han X and Pan J, 2019. A Skeleton-Bridged Deep Learning Approach for Generating Meshes of Complex Topologies From Single RGB Images//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 4541-4550. [DOI:10.1109/CVPR.2019.00467]

[13]. Wang W, Xu Q and Ceylan D, 2019. DISN: Deep Implicit Surface Network for High-Quality Single-View 3D Reconstruction//Proceedings of the 33rd International Conference on Neural Information Processing Systems. Red Hook, USA: Curran Associates, Inc.: 492-502

[14]. Chen W, Ling H and Gao J, 2019. Learning to Predict 3D Objects with an Interpolation-based Differentiable Renderer//Proceedings of the 33rd International Conference on Neural Information Processing Systems. Red Hook, USA: Curran Associates, Inc.: 9609-9619.

[15]. Wen C, Zhang Y and Li Z, 2019. Pixel2Mesh++: Multi-View 3D Mesh Generation via Deformation//Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 1042-1051. [DOI:10.1109/ICCV.2019.00113]

[16]. Bautista M A, Talbott W and Zhai S, 2021. On the Generalization of Learning-Based 3D Reconstruction//Proceedings of the Winter Conference on Applications of Computer Vision. Waikoloa, USA: IEEE: 2180-2189. [DOI:10.1109/WACV48630.2021.00223]

[17]. Shrestha R, Fan Z and Su Q, 2021. MeshMVS: Multi-View Stereo Guided Mesh Reconstruction//International Conference on 3D Vision. London, UK: IEEE: 1290-1300. [DOI:10.1109/3DV53792.2021.00136]

[18]. Wood D N, Azuma D I, Aldinger K, et al., 2000. Surface Light Fields for 3D Photography//SIGGRAPH conference.

[19]. Mildenhall, B., Srinivasan, P. P., Tancik, M., Barron, J. T., Ramamoorthi, R., & Ng, R. (2020). NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis.

[20]. Alex Yu, Vickie Ye, Matthew Tancik, Angjoo Kanazawa, pixelNeRF: Neural Radiance Fields From One or Few Images, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 4578-4587.

[21]. Stephan J. Garbin, Marek Kowalski, Matthew Johnson, Jamie Shotton, Julien Valentin, FastNeRF: High-Fidelity Neural Rendering at 200FPS, in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 14346-14355

[22]. J. T. Barron, B. Mildenhall, M. Tancik, P. Hedman, R. Martin-Brualla, and P. P. Srinivasan, Mip-nerf: A multiscale representation for anti-aliasing neural radiance fields, in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 5855–5864.

[23]. J. Zhang, Y. Zhang, H. Fu, X. Zhou, B. Cai, J. Huang, R. Jia, B. Zhao, and X. Tang, Ray priors through reprojection: Improving neural radiance fields for novel view extrapolation, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 18376–18386.

[24]. Q. Xu, Z. Xu, J. Philip, S. Bi, Z. Shu, K. Sunkavalli, and U. Neumann, Point-nerf: Point-based neural radiance fields, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 5438–5448.

[25]. X. Zhang, S. Bi, K. Sunkavalli, H. Su, and Z. Xu, Nerfusion: Fusing radiance fields for large-scale scene reconstruction, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 5449–5458.

[26]. N. Muller, A. Simonelli, L. Porzi, S. R. Bulo, M. Nießner, and P. Kontschieder, Autorf: Learning 3d object radiance fields from single view observations, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 3971–3980.

[27]. Can Wang, Menglei Chai, Mingming He, Dongdong Chen, Jing Liao, CLIP-NeRF: Text-and-Image Driven Manipulation of Neural Radiance Fields, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 3835-3844


Data availability

The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.

About volume

Volume title: Proceedings of the 2023 International Conference on Machine Learning and Automation

ISBN: 978-1-83558-295-4 (Print) / 978-1-83558-296-1 (Online)
Editor: Mustafa İSTANBULLU
Conference website: https://2023.confmla.org/
Conference date: 18 October 2023
Series: Applied and Computational Engineering
Volume number: Vol.35
ISSN: 2755-2721 (Print) / 2755-273X (Online)

© 2024 by the author(s). Licensee EWA Publishing, Oxford, UK. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
