Trustworthy Agentic AI: A Survey and Taxonomy of Secure Coordination and Hallucination Mitigation in Multi-Agent Large Language Model Systems


Authors : Tharakesvulu Vangalapat; Samreen Iftekhar Shaikh

Volume/Issue : Volume 11 - 2026, Issue 2 - February


Google Scholar : https://tinyurl.com/2vddtsr3

Scribd : https://tinyurl.com/5n8c7ftw

DOI : https://doi.org/10.38124/ijisrt/26feb1090

Note : A published paper may take 4-5 working days from the publication date to appear in PlumX Metrics, Semantic Scholar, and ResearchGate.


Abstract : Background: Large language model (LLM)-based agentic systems are evolving beyond single-turn generators into autonomous, tool-using, multi-agent workflows with persistent memory and self-directed planning. When these agents collaborate, hallucinations no longer remain local; they can propagate across agent boundaries and trigger real-world operational failures.  Objective: This paper provides a structured foundation for building secure, reliable, and hallucination-resilient agentic AI by surveying failure modes and proposing a layered trust taxonomy with enforceable controls.  Methods: We reviewed the literature on LLM reasoning, retrieval-augmented generation (RAG), hallucination detection, multi-agent frameworks, and AI governance (including zero-trust security principles), and catalogued failure modes across reliability and security dimensions.  Results: We present a seven-layer trust taxonomy spanning identity, planning, communication, memory, retrieval, execution, and oversight. From this taxonomy, we derive six reusable secure-coordination design patterns and propose a model-agnostic reference architecture for auditable, policy-enforced agentic workflows.  Conclusion: Trustworthiness in agentic AI is fundamentally a system property, not merely a model property. The proposed taxonomy and design patterns provide practical, implementation-independent guidance for securing multi-agent LLM deployments in research and high-assurance enterprise contexts.

Plain Language Summary: AI systems built from large language models can now plan tasks, use tools, and work together as teams of specialized agents. This collaboration creates new dangers: when one agent fabricates information, the others may act on it as though it were true, spreading errors throughout the system. This paper maps where trust breaks down, from agent identity and message passing to memory and tool use, and proposes design rules and a system blueprint for more reliable and secure AI teams.

Keywords : agentic AI, multi-agent systems, trustworthy AI, hallucination mitigation, retrieval-augmented generation, tool use, zero trust, AI governance.
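To make the idea of policy-enforced, identity-aware coordination concrete, the sketch below shows a minimal message gate that checks every inter-agent message against an explicit allowlist before routing it. This is an illustration of the zero-trust principle the abstract describes, not code from the paper; the names (`AgentMessage`, `PolicyGate`, the agent and tool identifiers) are our own assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Dict, Set

# Illustrative sketch only: the paper's actual reference architecture is not
# reproduced here. Names and structure are hypothetical.

@dataclass
class AgentMessage:
    sender: str
    recipient: str
    tool_call: Optional[str]  # name of a tool the sender wants invoked, if any
    content: str

class PolicyGate:
    """Admits an inter-agent message only if both endpoints are registered
    and any requested tool call has an explicit grant (zero-trust style:
    no agent is implicitly trusted)."""

    def __init__(self, registered_agents: Set[str],
                 tool_permissions: Dict[str, Set[str]]):
        self.registered_agents = registered_agents
        self.tool_permissions = tool_permissions  # agent -> allowed tools

    def admit(self, msg: AgentMessage) -> bool:
        # Identity check: both sender and recipient must be registered.
        if msg.sender not in self.registered_agents:
            return False
        if msg.recipient not in self.registered_agents:
            return False
        # Execution check: tool calls require an explicit per-agent grant.
        if msg.tool_call is not None:
            allowed = self.tool_permissions.get(msg.sender, set())
            if msg.tool_call not in allowed:
                return False
        return True

# Example: a researcher agent may call web_search; nothing else is permitted.
gate = PolicyGate(
    registered_agents={"planner", "researcher"},
    tool_permissions={"researcher": {"web_search"}},
)
print(gate.admit(AgentMessage("researcher", "planner", "web_search", "query")))   # True
print(gate.admit(AgentMessage("planner", "researcher", "shell_exec", "rm -rf"))) # False
```

In a fuller design along these lines, the gate would also log every decision for the oversight layer, so that denied messages leave an audit trail rather than failing silently.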

References :

  1. A. Vaswani et al., “Attention is all you need,” in NeurIPS, 2017.
  2. T. Brown et al., “Language models are few-shot learners,” in NeurIPS, 2020.
  3. C. Raffel et al., “Exploring the limits of transfer learning with T5,” JMLR, 2020.
  4. OpenAI, “GPT-4 technical report,” arXiv preprint arXiv:2303.08774, 2023.
  5. S. Bubeck et al., “Sparks of AGI,” arXiv, 2023.
  6. S. Yao et al., “ReAct: Synergizing reasoning and acting in LLMs,” in ICLR, 2023.
  7. T. Schick et al., “Toolformer: Language models can teach themselves to use tools,” in NeurIPS, 2023.
  8. L. Gao et al., “Program-aided language models,” in ICML, 2023.
  9. Q. Wu et al., “AutoGen: Multi-agent conversation framework,” arXiv, 2023.
  10. J. S. Park et al., “Generative agents,” in UIST, 2023.
  11. J. Wei et al., “Chain-of-thought prompting elicits reasoning in LLMs,” in NeurIPS, 2022.
  12. P. Lewis et al., “Retrieval-augmented generation for knowledge-intensive NLP,” in NeurIPS, 2020.
  13. K. Guu et al., “REALM: Retrieval-augmented language model pre-training,” in ICML, 2020.
  14. G. Izacard and E. Grave, “Leveraging passage retrieval with generative models,” arXiv, 2021.
  15. S. Robertson and H. Zaragoza, “BM25 and beyond,” Foundations and Trends in IR, 2009.
  16. O. Khattab and M. Zaharia, “ColBERT: Efficient passage search,” in SIGIR, 2020.
  17. R. Nakano et al., “WebGPT,” arXiv, 2021.
  18. Y. Bai et al., “Constitutional AI,” arXiv, 2022.
  19. R. Bommasani et al., “On the opportunities and risks of foundation models,” arXiv, 2021.
  20. S. Rose et al., “Zero trust architecture,” NIST, Tech. Rep., 2020.
  21. NIST, “AI risk management framework,” Tech. Rep., 2023.
  22. Z. Ji et al., “Survey of hallucination in natural language generation,” ACM Computing Surveys, 2023.
  23. S. Lin et al., “TruthfulQA,” in ACL, 2022.
  24. P. Manakul et al., “SelfCheckGPT,” in EMNLP, 2023.
  25. N. Carlini et al., “Prompt injection attacks against LLM applications,” arXiv, 2023.
  26. P. Christiano et al., “Deep reinforcement learning from human preferences,” in NeurIPS, 2017.
  27. L. Ouyang et al., “Training LMs to follow instructions,” arXiv, 2022.
  28. S. Min et al., “FActScore,” in ICML, 2023.
  29. A. Madaan et al., “Self-Refine: Iterative refinement,” in NeurIPS, 2023.
  30. A. Zou et al., “Universal and transferable adversarial attacks on aligned LLMs,” arXiv, 2023.
  31. D. Ganguli et al., “Red teaming language models,” arXiv, 2022.
  32. C. Dwork, “Differential privacy,” in ICALP, 2006.
  33. M. Abadi et al., “Deep learning with differential privacy,” in CCS, 2016.

