Fraudulent Text Detection System Using Hybrid Machine Learning and Natural Language Processing Approaches


Authors : Muhammad Dawaki; Ahmed Mohammed; Dr. Mustapha Ismail

Volume/Issue : Volume 7 - 2022, Issue 10 - October

Google Scholar : https://bit.ly/3IIfn9N

Scribd : https://bit.ly/3NXnTWt

DOI : https://doi.org/10.5281/zenodo.7313563

Abstract : Fraudulent behaviour is increasingly common in today's digital age. Much of it is carried out through text messages containing malicious links, which can disrupt a system and steal a user's confidential personal information. This project developed a system that identifies and classifies fraudulent content within a text string using machine learning algorithms and natural language processing libraries. Because fraudulent activity is sophisticated and constantly changing, detecting it is difficult and calls for up-to-date techniques. This research therefore examined the potential of a machine learning model for the task. The fraud detection model was trained and tested with eleven machine learning algorithms on an SMS spam dataset. Three of them outperformed the others: K-Nearest Neighbor (90% accuracy, 100% precision), Naive Bayesian Classifier (96% accuracy, 100% precision), and Random Forest Classifier (97% accuracy, 100% precision). The count vectorizer technique was used to select and extract the best features. Evaluated with accuracy, precision, recall, and F1-measure, the final optimal model achieved 97% accuracy and 100% precision. The results are promising, and the model was deployed using the Streamlit framework.

Keywords : Fraudulent, Machine Learning, Dataset, Algorithm, Natural Language Processing
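
The pipeline summarized in the abstract (count-based text features, a classifier, and evaluation with accuracy, precision, recall, and F1-measure) can be illustrated with a small sketch. This is not the authors' implementation: it is a pure-Python approximation that assumes whitespace tokenization and a multinomial Naive Bayes classifier with add-one smoothing. The messages, labels, and function names (count_vector, train, predict, evaluate) are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical toy messages (1 = fraudulent, 0 = legitimate), NOT the study's dataset.
TRAIN = [
    ("win a free prize now click this link", 1),
    ("urgent your account is suspended verify now", 1),
    ("free cash claim your reward today", 1),
    ("are we still meeting for lunch today", 0),
    ("please call me when you get home", 0),
    ("see you at the meeting tomorrow", 0),
]
TEST = [
    ("claim your free prize now", 1),
    ("call me after the meeting", 0),
    ("verify your account now", 1),
    ("lunch tomorrow", 0),
    ("free lunch today", 0),  # borderline message, included on purpose
]


def count_vector(text):
    """Turn a message into bag-of-words counts (lowercase, whitespace split)."""
    return Counter(text.lower().split())


def train(samples):
    """Collect per-class document counts, per-class word counts, and the vocabulary."""
    class_docs = Counter()
    word_counts = {}
    vocab = set()
    for text, label in samples:
        vec = count_vector(text)
        class_docs[label] += 1
        word_counts.setdefault(label, Counter()).update(vec)
        vocab.update(vec)
    return class_docs, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log-probability under multinomial Naive Bayes."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_docs):
        score = math.log(class_docs[label] / total_docs)  # class prior
        total_words = sum(word_counts[label].values())
        for word, n in count_vector(text).items():
            if word not in vocab:
                continue  # words never seen in training are ignored
            # Add-one (Laplace) smoothing avoids zero probabilities.
            p = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += n * math.log(p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def evaluate(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


model = train(TRAIN)
y_true = [label for _, label in TEST]
y_pred = [predict(model, text) for text, _ in TEST]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

In this toy data the legitimate message "free lunch today" is flagged as fraudulent, because "free" appears only in the fraudulent training messages. That single false positive lowers precision while leaving recall untouched, which shows why the two metrics are reported alongside accuracy. A library implementation (for example scikit-learn's CountVectorizer combined with one of its classifiers) would replace the hand-written functions in practice.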
