Authors :
Sheel Todkar; Gaurav Daund; Krish Vora; Harshad Shinde; Dr. Shyam Deshmukh
Volume/Issue :
Volume 10 - 2025, Issue 10 - October
Google Scholar :
https://tinyurl.com/3j4r7kpc
Scribd :
https://tinyurl.com/2vz27zrd
DOI :
https://doi.org/10.38124/ijisrt/25oct1519
Abstract :
Cloud computing has become the backbone of modern digital infrastructure, supporting millions of applications
that demand high performance, scalability, and cost efficiency. However, dynamic workloads and heterogeneous resources
continue to challenge the design of adaptive resource management systems. Deep Reinforcement Learning (DRL) offers a
promising paradigm by enabling autonomous, data-driven decision-making based on continuous environmental feedback.
This review paper systematically examines the application of DRL algorithms, particularly Deep Q-Network (DQN) and
Proximal Policy Optimization (PPO), to optimizing cloud resource allocation, load balancing, and energy efficiency. The
survey categorizes research advancements into value-based, policy-gradient, and hybrid learning architectures and analyzes
their comparative strengths across diverse scenarios such as auto-scaling, container placement, and dynamic job scheduling.
It further explores recent strategies such as multi-agent systems, federated DRL, and energy-aware reinforcement frameworks
aimed at achieving sustainable cloud operations. The concluding insights identify current challenges, including convergence
stability, reward modeling, and cross-environment generalization, and outline promising directions for integrating DRL
with edge computing, green AI, and real-time orchestration technologies.
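As an illustration of the two algorithm families named in the abstract, the following minimal sketch (not taken from the paper) computes the textbook quantities at the core of each: the one-step temporal-difference target used by value-based DQN, and the clipped surrogate objective used by policy-gradient PPO. The function names, the default values `gamma=0.99` and `epsilon=0.2`, and the auto-scaling interpretation in the comments are assumptions chosen for illustration only.

```python
from typing import Sequence


def dqn_td_target(reward: float,
                  next_q_values: Sequence[float],
                  gamma: float = 0.99,
                  done: bool = False) -> float:
    """One-step TD target for a DQN-style update (hypothetical helper).

    reward        -- immediate reward, e.g. a negative cost for an SLA violation
    next_q_values -- target-network Q-values for each action in the next state,
                     e.g. [scale_down, hold, scale_up]
    gamma         -- discount factor (assumed default)
    done          -- True if the episode ended, so no future value is added
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def ppo_clipped_objective(ratio: float,
                          advantage: float,
                          epsilon: float = 0.2) -> float:
    """Per-sample PPO clipped surrogate objective (hypothetical helper).

    ratio     -- new-policy probability of the action divided by the
                 old-policy probability of the same action
    advantage -- estimated advantage of the action
    epsilon   -- clipping range (assumed default)
    """
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    # Taking the minimum gives a pessimistic bound, which discourages
    # policy updates that move too far from the old policy.
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    # Toy numbers only; they do not come from any experiment in the paper.
    print(dqn_td_target(reward=1.0, next_q_values=[0.5, 2.0], gamma=0.9))
    print(ppo_clipped_objective(ratio=1.5, advantage=2.0))
```

In a full agent, the TD target would be regressed against by a neural Q-network and the clipped objective would be averaged over a batch and maximized by gradient ascent; both steps are omitted here to keep the sketch self-contained.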
Keywords :
Cloud Resource Management, Deep Reinforcement Learning, Deep Q-Network, Proximal Policy Optimization, Dynamic Resource Allocation, Cloud Scheduling, Multi-Agent Reinforcement Learning, Energy-Efficient Cloud Computing, Auto-scaling, Load Balancing, Federated Learning, Cloud-Edge Orchestration, Sustainable Cloud Systems.