Entropy Based Obfuscation for Defending Attention Cache in Shared LLMs


Authors : Saurabh Kansal; Deepak Kejriwal

Volume/Issue : Volume 10 - 2025, Issue 9 - September


Google Scholar : https://tinyurl.com/59j56nxc

Scribd : https://tinyurl.com/yc63jfc9

DOI : https://doi.org/10.38124/ijisrt/25sep1140



Abstract : Large Language Models (LLMs) have rapidly become indispensable to research, business, and everyday applications, offering unmatched capabilities in natural language understanding and generation. Deploying these models in shared or multi-tenant environments, however, introduces serious security risks, most notably the leakage of sensitive information through the attention key-value (KV) cache. Side-channel attacks that target this cache can recover prompts and embeddings, undermining both user privacy and trust in LLM services. To address this problem, this paper proposes an entropy-based obfuscation framework that injects controlled randomness into cached states, making access patterns unpredictable without degrading model accuracy. The framework dynamically modulates the level of perturbation using Shannon and Rényi entropy as guiding metrics, balancing privacy protection against system performance. Experimental results from multi-tenant deployments show that entropy-based obfuscation substantially reduces prompt leakage at a modest computational cost. These findings highlight entropy-guided defenses as a practical and scalable means of strengthening the resilience of LLMs, and the study opens a new line of research on protecting collaborative AI environments by incorporating information-theoretic metrics into model protection.

Keywords : Entropy, Obfuscation, Attention Cache, Large Language Models (LLMs), Privacy Preservation, Side-Channel Defense, Model Security.
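To make the mechanism described in the abstract concrete, the sketch below (Python with NumPy) illustrates one plausible form of entropy-guided KV-cache obfuscation: the entropy of a distribution derived from the cached keys modulates the standard deviation of Gaussian noise added to the key and value tensors. This is an illustrative sketch only, not the authors' implementation; the function names (shannon_entropy, renyi_entropy, obfuscate_kv_cache), the use of normalized key norms as a proxy for the cache access-pattern distribution, and the linear noise schedule are assumptions made for the example.

import numpy as np

def shannon_entropy(p, eps=1e-12):
    # Shannon entropy (in bits) of a discrete probability distribution.
    p = np.clip(p, eps, 1.0)
    return float(-np.sum(p * np.log2(p)))

def renyi_entropy(p, alpha=2.0, eps=1e-12):
    # Rényi entropy of order alpha (alpha != 1), an alternative guiding metric.
    p = np.clip(p, eps, 1.0)
    return float(np.log2(np.sum(p ** alpha)) / (1.0 - alpha))

def obfuscate_kv_cache(keys, values, max_sigma=0.05, entropy_fn=shannon_entropy, rng=None):
    # Add entropy-modulated Gaussian noise to cached key/value tensors.
    # keys, values: arrays of shape (num_cached_tokens, head_dim).
    rng = rng or np.random.default_rng()
    # Proxy for the cache access-pattern distribution: normalized key norms.
    norms = np.linalg.norm(keys, axis=-1)
    p = norms / norms.sum()
    h = entropy_fn(p)
    h_max = np.log2(len(p))                # entropy of a uniform distribution
    sigma = max_sigma * (1.0 - h / h_max)  # low entropy (predictable cache) -> more noise
    noisy_keys = keys + rng.normal(0.0, sigma, size=keys.shape)
    noisy_values = values + rng.normal(0.0, sigma, size=values.shape)
    return noisy_keys, noisy_values, sigma

# Toy usage: perturb a cache of 16 tokens with 64-dimensional heads.
rng = np.random.default_rng(0)
k = rng.normal(size=(16, 64))
v = rng.normal(size=(16, 64))
k_obf, v_obf, sigma = obfuscate_kv_cache(k, v, entropy_fn=shannon_entropy)
print(f"applied noise std = {sigma:.4f}")

In this sketch, a low-entropy (highly concentrated, and therefore more predictable) cache receives stronger noise, while a near-uniform cache is left almost untouched; in practice the schedule, the choice of Shannon versus Rényi entropy, and the maximum noise level would be tuned to the privacy/performance trade-off studied in the paper.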

