Authors :
N. Bhavana; Avulapalli Sowmya
Volume/Issue :
Volume 10 - 2025, Issue 5 - May
Google Scholar :
https://tinyurl.com/mr7kcknb
DOI :
https://doi.org/10.38124/ijisrt/25may1564
Abstract :
Email has become the main channel for both personal and professional communication as digital communication
has expanded rapidly. Its widespread use has brought a corresponding rise in spam, which ranges from innocuous
advertisements to malicious phishing attempts. Beyond cluttering users' inboxes, spam poses serious security risks.
Conventional rule-based spam filters have not kept pace with the changing nature of spam. Because machine learning (ML)
algorithms learn from data and can improve as more data becomes available, they are increasingly being incorporated
into spam detection systems. This paper applies several machine learning algorithms, including Naive Bayes, Support
Vector Machine (SVM), and Random Forest, to email spam detection. Using a publicly accessible dataset, we assess these
models on accuracy, precision, recall, and F1-score. Our findings indicate that machine learning models, and ensemble
approaches in particular, provide reliable and scalable spam detection systems.
Keywords :
Machine Learning, Spam, Emails.
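The abstract names four evaluation measures: accuracy, precision, recall, and F1-score. As a minimal sketch of how these are conventionally computed for a binary spam/ham task, the snippet below derives them from confusion-matrix counts. The function name `binary_metrics` and the label vectors are hypothetical and chosen for illustration only; they are not the paper's code, dataset, or results.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1, treating label 1 as spam."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one example is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # spam flagged as spam
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # ham flagged as spam
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # spam that got through
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # ham left alone

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted or no actual spam).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels and predictions, for illustration only (1 = spam, 0 = ham).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In spam filtering, precision and recall are usually read together: a false positive sends a legitimate message to the spam folder, while a false negative lets spam through, so accuracy alone can hide which kind of error a model makes.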