Authors :
Nikhil Sharma; Gaurang Shirodkar; Nikhil Singh; Rohan A Mathews; Ragavan R; Shobha T
Volume/Issue :
Volume 10 - 2025, Issue 1 - January
Google Scholar :
https://tinyurl.com/35aptn2x
Scribd :
https://tinyurl.com/hzbkpsm6
DOI :
https://doi.org/10.5281/zenodo.14724989
Abstract :
Since the advent of the automobile by Karl Benz, there has been an exponential increase in the number of automobiles worldwide and
the growth of the automobile industry. In this situation, it becomes increasingly significant to have a precise pricing model, which is
crucial for both buyers and sellers. Such models provide reliable pricing information, enabling better-informed decisions. This project
aims to design a prediction system using machine learning capable of predicting prices for used cars based on the most relevant factors
such as make, model, mileage, year, fuel type, and transmission type. The ML models trained and tested to form the system include
Random Forest, Gradient Boosting, and Stacking Regressor. Model stacking and ensemble voting were con- ducted to increase
prediction accuracy. Experimental results demonstrate that an ensemble model possesses far higher predictive power, accuracy, and
robustness compared to single models. The final ensemble model produced an X RMSE and Y R2
score, proving that vehicle prices
can be predicted fairly well using this approach. Thus, the system becomes more useful for consumer and dealer applications.
Keywords :
Precise Pricing Model, Informed Decisions, Price Prediction, Random Forest, Gradient Boosting, Ensemble voting, Accuracy, Robustness.
References :
- Vehicle Price Prediction by Aggregating Decision Tree Model with Boosting Model, Auwal Tijjani Amshi, 2023.
- Error Reduction from Stacked Regressions, Xin Chen et al., 2024.
- RELF: Robust Regression Extended with Ensemble Loss Function, Hamideh Hajiabadi et al., 2018.
- Stacked Ensemble Machine Learning for Porosity and Absolute Permeability Prediction, Ramanzani Kalule et al., 2023.
- Friedman, J. H. (2001). Greedy function ap- proximation: A gradient boosting machine. Annals of Statistics, 29(5), 1189-1232.
- Wolpert, D. H. (1992). Stacked generaliza- tion. Neural Networks, 5(2), 241-259
- Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785-794.
- Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., & Ma, W. (2017). LightGBM: A highly efficient gradient boosting decision tree. Advances in Neural Information Processing Systems, 30.
- Scikit-learn developers. (2020). Scikit- learn: Machine learning in Python. Retrieved from https://scikit-learn.org/stable/
- Dietterich, T. G. (2000). Ensemble methods in machine learning. International Workshop on Multiple Classifier Systems.
- Molnar, C. (2019). Interpretable Machine Learning.
- Zhao, L., et al. (2021). Comparative analysis of gradient boosting methods in predictive modeling. Journal of Data Science, 19(4), 345-361.
- Singh, R., et al. (2022). Enhancing vehicle price prediction through modelstacking. Applied Artificial Intelligence, 36(5), 425-440.
- Smith & Johnson (2021). Highlights of categorical variables such as fuel type and transmission in improving model performance.
- Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5-32. [16]Bishop,C.M.(2006).Pattern recognition and Machine Learning
Since the advent of the automobile by Karl Benz, there has been an exponential increase in the number of automobiles worldwide and
the growth of the automobile industry. In this situation, it becomes increasingly significant to have a precise pricing model, which is
crucial for both buyers and sellers. Such models provide reliable pricing information, enabling better-informed decisions. This project
aims to design a prediction system using machine learning capable of predicting prices for used cars based on the most relevant factors
such as make, model, mileage, year, fuel type, and transmission type. The ML models trained and tested to form the system include
Random Forest, Gradient Boosting, and Stacking Regressor. Model stacking and ensemble voting were con- ducted to increase
prediction accuracy. Experimental results demonstrate that an ensemble model possesses far higher predictive power, accuracy, and
robustness compared to single models. The final ensemble model produced an X RMSE and Y R2
score, proving that vehicle prices
can be predicted fairly well using this approach. Thus, the system becomes more useful for consumer and dealer applications.
Keywords :
Precise Pricing Model, Informed Decisions, Price Prediction, Random Forest, Gradient Boosting, Ensemble voting, Accuracy, Robustness.