ICCK Transactions on Intelligent Systematics, Volume 2, Issue 3, 2025
Academic Editor
Seifedine Kadry, Noroff University College, Norway
ICCK Transactions on Intelligent Systematics, Volume 2, Issue 3, 2025: 190-202

Free to Read | Research Article | 25 August 2025
DT-NeRF: A Diffusion and Transformer-Based Optimization Approach for Neural Radiance Fields in 3D Reconstruction
1 College of Computer Sciences, Northeastern University, Cupertino 95014, CA, United States
2 Department of Electrical Engineering and Computer Science, University of California, Irvine, Moreno Valley 92555, CA, United States
3 Desautels Faculty of Management, McGill University, Montréal 27708, Canada
4 Department of Mathematics, Northeastern University, San Jose 95131, CA, United States
* Corresponding Author: Runlong Li, [email protected]
Received: 06 June 2025, Accepted: 05 July 2025, Published: 25 August 2025  
Abstract
This paper proposes DT-NeRF, a diffusion- and Transformer-optimized Neural Radiance Field method aimed at enhancing detail recovery and multi-view consistency in 3D scene reconstruction. By combining diffusion models with Transformers, DT-NeRF effectively restores fine detail under sparse viewpoints and maintains high accuracy in geometrically complex scenes. Experimental results demonstrate that DT-NeRF significantly outperforms traditional NeRF and other state-of-the-art methods on the Matterport3D and ShapeNet datasets, particularly on metrics such as PSNR, SSIM, Chamfer Distance, and Fidelity. Ablation experiments further confirm the critical role of the diffusion and Transformer modules: removing either leads to a measurable drop in performance. The design of DT-NeRF demonstrates the synergy between the two modules and provides an efficient, accurate solution for 3D scene reconstruction. Future research may focus on further optimizing the model and exploring more advanced generative models and network architectures to improve performance in large-scale dynamic scenes.
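The evaluation metrics named in the abstract are standard and easy to reproduce; the paper's own evaluation code is not shown here, but a minimal NumPy sketch of two of them, PSNR for rendered images and a symmetric squared-distance Chamfer Distance for reconstructed point sets, could look as follows (function names and the specific Chamfer variant are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def psnr(img_a: np.ndarray, img_b: np.ndarray, max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio between two images with values in [0, max_val]."""
    mse = np.mean((img_a.astype(np.float64) - img_b.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

def chamfer_distance(pts_a: np.ndarray, pts_b: np.ndarray) -> float:
    """Symmetric Chamfer distance between point sets of shape (N, 3) and (M, 3)."""
    # Pairwise squared distances via broadcasting, shape (N, M).
    d2 = np.sum((pts_a[:, None, :] - pts_b[None, :, :]) ** 2, axis=-1)
    # Average nearest-neighbor distance in both directions.
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())

# Toy sanity check: identical inputs give infinite PSNR and zero Chamfer distance.
img = np.random.rand(16, 16, 3)
pts = np.random.rand(100, 3)
print(psnr(img, img))              # inf
print(chamfer_distance(pts, pts))  # 0.0
```

The brute-force pairwise distance matrix is fine for small point sets; evaluations on large reconstructions typically use a k-d tree for the nearest-neighbor queries instead.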

Graphical Abstract
DT-NeRF: A Diffusion and Transformer-Based Optimization Approach for Neural Radiance Fields in 3D Reconstruction

Keywords
diffusion model
NeRF
3D reconstruction
detail recovery
transformer network

Data Availability Statement
Data will be made available on request.

Funding
This work received no external funding.

Conflicts of Interest
The authors declare no conflicts of interest.

Ethical Approval and Consent to Participate
Not applicable.


Cite This Article
APA Style
Liu, B., Li, R., Zhou, L., & Zhou, Y. (2025). DT-NeRF: A Diffusion and Transformer-Based Optimization Approach for Neural Radiance Fields in 3D Reconstruction. ICCK Transactions on Intelligent Systematics, 2(3), 190–202. https://doi.org/10.62762/TIS.2025.874668

Article Metrics
Citations: Crossref: 0 | Scopus: 0 | Web of Science: 0
Article Access Statistics: Views: 66 | PDF Downloads: 23

Publisher's Note
ICCK stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and Permissions
Institute of Central Computation and Knowledge (ICCK) or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
ICCK Transactions on Intelligent Systematics

ISSN: 3068-5079 (Online) | ISSN: 3069-003X (Print)

Email: [email protected]

Portico

All published articles are preserved here permanently:
https://www.portico.org/publishers/icck/