Authors :
Jiaqi Wang
Volume/Issue :
Volume 10 - 2025, Issue 10 - October
DOI :
https://doi.org/10.38124/ijisrt/25oct1329
Abstract :
Dynamic portfolio optimization remains one of the most challenging problems in quantitative finance due to the
non-stationary nature of financial markets, complex asset correlations, and the presence of transaction costs. Traditional
portfolio management approaches, including Modern Portfolio Theory and mean-variance optimization, often rely on
restrictive assumptions that fail to capture market dynamics effectively. This paper investigates the application of Deep
Reinforcement Learning techniques to dynamic portfolio optimization, exploring how intelligent agents can learn optimal
allocation strategies through continuous interaction with financial environments. We systematically review recent advances
in DRL-based portfolio management, examining various algorithmic frameworks including convolutional neural network
architectures and actor-critic methods. Our methodology section presents a comprehensive DRL framework employing the
Ensemble of Identical Independent Evaluators topology with convolutional layers for feature extraction from historical price
data. In simulated trading experiments, DRL-based approaches adapt to changing market
conditions while trading at a moderate frequency that keeps transaction costs low. The results indicate that DRL
agents achieve higher risk-adjusted returns than traditional benchmarks while exhibiting disciplined trading
behavior with manageable transaction volumes. This research contributes to the growing body of literature on artificial
intelligence applications in finance and provides practical insights for developing adaptive portfolio management systems.
Keywords :
Deep Reinforcement Learning, Portfolio Optimization, Convolutional Neural Networks, Transaction Costs, Algorithmic Trading, Financial Machine Learning.
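The abstract names the Ensemble of Identical Independent Evaluators topology with convolutional feature extraction but does not spell out the mechanics. The sketch below illustrates the core idea under stated assumptions: one shared set of parameters scores each asset's price window independently, and a softmax over the scores (plus a cash slot) yields portfolio weights. The function names, the single-kernel evaluator, the cash bias, and the 0.25% cost rate are all hypothetical choices made for illustration, not the paper's actual architecture or settings, and the turnover cost is a simple proportional approximation.

```python
import math
import random

def conv1d_valid(series, kernel):
    """Slide one kernel over a 1-D series without padding."""
    k = len(kernel)
    return [sum(series[i + j] * kernel[j] for j in range(k))
            for i in range(len(series) - k + 1)]

def evaluate_asset(window, kernel, readout):
    """Score one asset's price window using the SHARED parameters."""
    # Divide by the latest price so the input does not depend on price scale.
    rel = [p / window[-1] for p in window]
    features = [max(0.0, x) for x in conv1d_valid(rel, kernel)]  # ReLU
    return sum(f * w for f, w in zip(features, readout))

def softmax(scores):
    """Numerically stable softmax: positive outputs that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def eiie_weights(windows, kernel, readout, cash_bias=0.0):
    """Apply the identical evaluator to every asset, then normalise.

    Index 0 of the result is the cash position (assumed here).
    """
    scores = [cash_bias] + [evaluate_asset(w, kernel, readout) for w in windows]
    return softmax(scores)

def turnover_cost(old_w, new_w, rate=0.0025):
    """Proportional cost on traded volume; the rate is illustrative only."""
    return rate * sum(abs(n - o) for n, o in zip(new_w, old_w))

# Toy data: 3 assets, 10 synthetic prices each (not real market data).
random.seed(0)
windows = [[100 + random.gauss(0, 1) for _ in range(10)] for _ in range(3)]
kernel = [0.5, -0.25, 0.1]                           # shared conv kernel
readout = [random.uniform(-1, 1) for _ in range(8)]  # 10 - 3 + 1 = 8 features
weights = eiie_weights(windows, kernel, readout)
cost = turnover_cost([1.0, 0.0, 0.0, 0.0], weights)  # moving out of all-cash
```

Because every asset passes through the same evaluator, reordering the assets only reorders the resulting weights, and the parameter count does not grow with the number of assets. In a trained agent the kernel and readout would be learned from a reward such as the cost-adjusted log return rather than fixed by hand.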