Authors :
Nashak Danaan; Abraham Deme; Gideon Bibu; Mustapha Abdulrahman Lawal; Ismail Zahraddeen Yakubu
Volume/Issue :
Volume 10 - 2025, Issue 6 - June
Google Scholar :
https://tinyurl.com/s6b6bz96
DOI :
https://doi.org/10.38124/ijisrt/25jun1816
Abstract :
Customer churn, in which customers end their relationship with an online business, is a major challenge in the
e-commerce industry. Common causes include dissatisfaction with product quality, poor customer service, pricing concerns,
fierce competition, and changing preferences. This study introduces an integrated machine learning approach to predict customer
churn in e-commerce, combining k-means clustering for customer segmentation and XGBoost for classification within the
Cross-Industry Standard Process for Data Mining (CRISP-DM) framework. The model aims to deliver a comprehensive,
stable, and reliable churn-prediction solution by analyzing customer data such as purchase history and demographics. The
methodology is intended to support a thorough analysis of customer data and thereby improve prediction accuracy. The model
achieved an accuracy of 98.68%, a precision of 96.19%, a recall of 94.39%, and an F1 score of 95%, outperforming the individual
algorithms reported in earlier, comparable studies. These results demonstrate the effectiveness of the integrated approach in
predicting customer churn and offer valuable insights for e-commerce businesses, highlighting the importance of using
advanced machine-learning techniques to boost customer retention and profitability. The study adds to the less-explored
area of churn prediction in e-commerce and shows the potential of combined machine learning approaches to solve this
critical issue.
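As a quick consistency check on the reported metrics, the F1 score can be recomputed from the stated precision and recall, since F1 is their harmonic mean. The sketch below is illustrative only and is not the authors' code: the function names are hypothetical, and the confusion-matrix counts in the second call are made-up values chosen to show the formulas, not data from the study.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in the same units)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def metrics_from_counts(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, recall, and F1 from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


# Precision and recall as reported in the abstract, in percent.
reported_f1 = f1_score(96.19, 94.39)
print(f"F1 from reported precision and recall: {reported_f1:.2f}%")

# Hypothetical counts (NOT from the study), only to illustrate the formulas.
example = metrics_from_counts(tp=8, fp=2, fn=2, tn=88)
print(example)
```

The recomputed value is about 95.28%, which agrees with the rounded F1 score of 95% given in the abstract.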
Keywords :
Customer Churn, E-Commerce, Machine Learning.