Comparative Study on DQN and PPO for Cloud Resource Optimization | International Journal of Innovative Science and Research Technology

Comparative Study on DQN and PPO for Cloud Resource Optimization

Authors : Sheel Todkar; Gaurav Daund; Krish Vora; Harshad Shinde; Dr. Shyam Deshmukh

Volume/Issue : Volume 10 - 2025, Issue 10 - October

Google Scholar : https://tinyurl.com/3j4r7kpc

Scribd : https://tinyurl.com/2vz27zrd

DOI : https://doi.org/10.38124/ijisrt/25oct1519

Abstract : Cloud computing has become the backbone of modern digital infrastructure, supporting millions of applications that demand high performance, scalability, and cost efficiency. However, dynamic workloads and heterogeneous resources continue to challenge the design of adaptive resource management systems. Deep Reinforcement Learning (DRL) offers a promising paradigm by enabling autonomous, data-driven decision-making based on continuous environmental feedback. This review paper systematically examines the application of DRL algorithms, particularly Deep Q-Network (DQN) and Proximal Policy Optimization (PPO), to cloud resource allocation, load balancing, and energy efficiency. The survey categorizes research advancements into value-based, policy-gradient, and hybrid learning architectures and analyzes their comparative strengths across scenarios such as auto-scaling, container placement, and dynamic job scheduling. It further explores recent strategies, including multi-agent systems, federated DRL, and energy-aware reinforcement frameworks, aimed at achieving sustainable cloud operations. The concluding insights identify current challenges, including convergence stability, reward modeling, and cross-environment generalization, and outline promising directions for integrating DRL with edge computing, green AI, and real-time orchestration technologies.

Keywords : Cloud Resource Management, Deep Reinforcement Learning, Deep Q-Network, Proximal Policy Optimization, Dynamic Resource Allocation, Cloud Scheduling, Multi-Agent Reinforcement Learning, Energy-Efficient Cloud Computing, Auto-scaling, Load Balancing, Federated Learning, Cloud-Edge Orchestration, Sustainable Cloud Systems.
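
The abstract contrasts value-based methods (DQN) with policy-gradient methods (PPO). As a minimal sketch of that distinction, the snippet below computes the standard DQN bootstrap target and the standard PPO clipped surrogate objective for a single step. The function names, the three-action auto-scaling setup, and all numeric values are illustrative assumptions and are not taken from the paper.

```python
# Illustrative sketch only: toy numbers, hypothetical names, not the paper's code.

def dqn_td_target(reward, next_q_values, gamma=0.99, done=False):
    """Value-based (DQN): bootstrap target r + gamma * max_a' Q(s', a')."""
    if done:
        # No future value once the episode has terminated.
        return reward
    return reward + gamma * max(next_q_values)


def ppo_clipped_objective(ratio, advantage, clip_eps=0.2):
    """Policy-gradient (PPO): min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    return min(ratio * advantage, clipped_ratio * advantage)


# Hypothetical auto-scaling step with three discrete actions:
# index 0 = scale down, 1 = hold, 2 = scale up.
next_q = [0.5, 1.0, 2.0]
target = dqn_td_target(reward=1.0, next_q_values=next_q, gamma=0.9)

# A probability ratio of 1.5 with a positive advantage is clipped to 1.2,
# which limits how far a single update can move the policy.
objective = ppo_clipped_objective(ratio=1.5, advantage=2.0)

print(f"DQN TD target: {target:.2f}")
print(f"PPO clipped objective: {objective:.2f}")
```

In this sketch DQN learns from a scalar target for the value of one discrete action, while PPO scores a change in action probability and caps it. This is one reason PPO is often paired with continuous or fine-grained scaling actions and DQN with small discrete action sets.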
