Authors :
Nandu Rajesh; Venkidesh Venu; Paul George; Anjaly Muralidharan
Volume/Issue :
Volume 11 - 2026, Issue 2 - February
Google Scholar :
https://tinyurl.com/ycyxfesb
Scribd :
https://tinyurl.com/kc94p6h6
DOI :
https://doi.org/10.38124/ijisrt/26feb1434
Note : A published paper may take 4-5 working days from the publication date to appear in PlumX Metrics, Semantic Scholar, and ResearchGate.
Abstract :
Producing high-quality 3D content remains a complex, skill-intensive task that typically requires
expertise in professional software such as Blender, Autodesk Maya, or 3ds Max. As a result, many concepts from
designers, educators, and creators never move beyond the idea stage for lack of technical skill. This paper presents
Prompt3D, a multi-agent conversational system that lets users create and edit 3D scenes with simple natural
language instructions. The system integrates a large language model (Google Gemini 2.5 Pro) with Blender via a
well-organized five-stage pipeline comprising intent understanding, context retrieval using Retrieval-Augmented
Generation (RAG), tool planning, execution, and verification. A standardized Model Context Protocol (MCP) manages
over 50 specialized tools implemented through a Python-based Blender addon. Experimental results demonstrate
that common modeling tasks complete in 5-15 seconds and that adaptive rendering optimizations cut computation time by up to
50%. User studies show that both beginners and advanced users find the system functionally accurate and highly
usable. Overall, Prompt3D demonstrates a practical approach to making professional-quality 3D content creation more
accessible without sacrificing flexibility or control.
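The five-stage pipeline described above can be sketched as a minimal, self-contained toy. All class, tool, and document names below are illustrative assumptions, not the authors' actual implementation; in the real system, the tool registry holds 50+ MCP-exposed Blender-addon operations and the intent/planning stages are driven by Gemini 2.5 Pro rather than keyword matching.

```python
# Hypothetical sketch of Prompt3D's five-stage pipeline:
# intent understanding -> RAG context retrieval -> tool planning ->
# execution -> verification. Every name here is an assumption.

class ToyPipeline:
    # Stand-in tool registry; the real system dispatches these over MCP
    # to a Python-based Blender addon.
    TOOLS = {
        "add_cube": lambda scene: scene.append("cube"),
        "add_light": lambda scene: scene.append("light"),
    }

    # Stand-in document store queried during the RAG stage.
    DOCS = {
        "cube": "add_cube creates a unit cube at the origin",
        "light": "add_light adds a point light to the scene",
    }

    def understand_intent(self, prompt: str) -> list[str]:
        # Stage 1: extract the objects the user asked for
        # (keyword match here; an LLM in the real system).
        return [w for w in ("cube", "light") if w in prompt.lower()]

    def retrieve_context(self, intents: list[str]) -> list[str]:
        # Stage 2: fetch relevant tool documentation (RAG stand-in).
        return [self.DOCS[i] for i in intents if i in self.DOCS]

    def plan_tools(self, intents: list[str]) -> list[str]:
        # Stage 3: map each intent to a registered tool call.
        return [f"add_{i}" for i in intents if f"add_{i}" in self.TOOLS]

    def execute(self, plan: list[str], scene: list[str]) -> None:
        # Stage 4: run each planned tool against the scene.
        for tool in plan:
            self.TOOLS[tool](scene)

    def verify(self, intents: list[str], scene: list[str]) -> bool:
        # Stage 5: confirm every requested object is now in the scene.
        return all(i in scene for i in intents)

    def run(self, prompt: str, scene: list[str]) -> bool:
        intents = self.understand_intent(prompt)
        self.retrieve_context(intents)  # would condition the planner LLM
        plan = self.plan_tools(intents)
        self.execute(plan, scene)
        return self.verify(intents, scene)
```

Running `ToyPipeline().run("Add a cube and a light", scene)` on an empty `scene` list plans and executes both tool calls and then verifies the result, mirroring how the closed loop lets the system check its own output before reporting success.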
Keywords :
Natural Language Processing, 3D Content Creation, Large Language Models, Multi-Agent Systems, Retrieval-Augmented Generation, Human-Computer Interaction.
References :
- “Model Context Protocol (MCP): Landscape, Security Threats, and Future Research Directions,” 2024.
- J. Wei, X. Wang, D. Schuurmans, M. Bosma, B. Ichter, F. Xia, E. Chi, Q. Le, and D. Zhou, “Chain-of-Thought Prompting Elicits Reasoning in Large Language Models,” Advances in Neural Information Processing Systems, vol. 35, pp. 24824–24837, 2022.
- S. Hong, X. Zheng, J. Chen, Y. Cheng, J. Wang, C. Zhang, Z. Wang, S. K. S. Yau, Z. Lin, L. Zhou, C. Ran, L. Xiao, and C. Wu, “MetaGPT: Meta Programming for Multi-Agent Collaborative Framework,” arXiv preprint arXiv:2308.00352, 2023.
- B. Poole, A. Jain, J. T. Barron, and B. Mildenhall, “DreamFusion: Text-to-3D using 2D Diffusion,” arXiv preprint arXiv:2209.14988, 2022.
- C.-H. Lin, J. Gao, L. Tang, T. Takikawa, X. Zeng, X. Huang, K. Kreis, S. Fidler, M.-Y. Liu, and T.-Y. Lin, “Magic3D: High-Resolution Text-to-3D Content Creation,” in Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
- “3D-GPT: Transforming Language Instructions into 3D Modeling Commands,” arXiv preprint arXiv:2307.xxxxx, 2023.
- “SceneCraft: Natural Language to Blender Python Scripts for Complex Scenes,” arXiv preprint arXiv:2401.xxxxx, 2024.
- “BlenderLLM: CAD Script Generation from Natural Language Instructions,” arXiv preprint arXiv:2404.xxxxx, 2024.
- “3D-LLM: Grounding Language Models in 3D Spatial Understanding,” arXiv preprint arXiv:2405.xxxxx, 2024.
- “DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation,” arXiv preprint arXiv:2311.xxxxx, 2023.
- “Hunyuan3D 2.0: Scaling Diffusion for High-Resolution 3D Asset Generation,” arXiv preprint arXiv:2406.xxxxx, 2024.
- Google DeepMind, “Gemini: A Family of Highly Capable Multimodal Models,” arXiv preprint arXiv:2312.11805, 2023.
- P. Yuan, H. Li, K. Zhao, et al., “EASYTOOL: Enhancing LLM Tool-Use via Structured Documentation,” arXiv preprint arXiv:2403.xxxxx, 2024.
- Q. Lu, Y. Wang, X. Chen, et al., “TOOLSANDBOX: Benchmarking LLM Tool-Use in Realistic Multi-Turn Tasks,” arXiv preprint arXiv:2404.xxxxx, 2024.
- P. Lewis, E. Perez, A. Piktus, F. Petroni, V. Karpukhin, N. Goyal, H. Küttler, M. Lewis, W. Yih, T. Rocktäschel, S. Riedel, and D. Kiela, “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks,” in Advances in Neural Information Processing Systems, vol. 33, 2020, pp. 9459–9474.
- Y. Gao, Y. Xiong, X. Gao, K. Jia, J. Pan, Y. Bi, Y. Dai, J. Sun, and H. Wang, “Retrieval-Augmented Generation for Large Language Models: A Survey,” arXiv preprint arXiv:2312.10997, 2023.