Authors :
Salisu Suleiman; Prashansa Taneja; Ayushi Nainwal
Volume/Issue :
Volume 7 - 2022, Issue 6 - June
Google Scholar :
https://bit.ly/3IIfn9N
Scribd :
https://bit.ly/3HWydLj
DOI :
https://doi.org/10.5281/zenodo.6757912
Abstract :
The internet has pervaded every part of human life, making it easier to link individuals across the globe and to disseminate information to large groups of people. Despite its importance, the cyberworld has a number of negative effects on people today. One of the most dangerous threats in the cyberworld is cyberbullying, which damages individuals' reputation or privacy, threatens or harasses them, and has a long-term impact on the victim. Although the issue has existed for many years, its impact on young people has only recently become more widely recognized. Using machine learning and natural language processing, bullies' harassing tweets or offensive comments can be automatically detected. This paper reviews previous research in the cyberbullying detection domain and, more importantly, proposes a novel cyberbullying detection model to close the gap identified during the review of the related literature. In this study, we employed both standard and ensemble supervised learning methods. The standard methods used three ML classifiers: Gaussian Naïve Bayes (GNV), Logistic Regression (LR), and Decision Tree (DT), while AdaBoost and Random Forest (RF) classifiers were used as ensemble techniques. We trained and tested our model to classify content as either bullying or non-bullying (a binary classification model), using Term Frequency-Inverse Document Frequency (TF-IDF) to extract features from a Twitter dataset downloaded from Kaggle.
Keywords :
Cyberworld, Social Media, Machine Learning, Cyber Bullying.
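The pipeline described in the abstract (TF-IDF features fed to three standard classifiers and two ensemble classifiers) might be sketched as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the choice of library, the toy tweets, the function names `build_classifiers`, `fit_all`, and `predict_all`, and all hyperparameters are assumptions made here for the example, and the abstract does not state which settings were actually used.

```python
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data, NOT the Kaggle Twitter dataset used in the paper.
# Label 1 = bullying, label 0 = non-bullying.
texts = [
    "you are so stupid and ugly",
    "nobody likes you, loser",
    "go away you worthless idiot",
    "you are pathetic and everyone hates you",
    "had a great time at the park today",
    "thanks for the help with my homework",
    "looking forward to the game tonight",
    "happy birthday, hope you have a lovely day",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]


def build_classifiers(seed=0):
    """Return the five classifier families named in the abstract.

    Hyperparameters are illustrative assumptions, not the paper's settings.
    """
    return {
        "GaussianNB": GaussianNB(),
        "LogisticRegression": LogisticRegression(max_iter=1000),
        "DecisionTree": DecisionTreeClassifier(random_state=seed),
        "AdaBoost": AdaBoostClassifier(random_state=seed),
        "RandomForest": RandomForestClassifier(n_estimators=50, random_state=seed),
    }


def fit_all(train_texts, train_labels):
    """Fit a TF-IDF vectorizer, then train every classifier on its output."""
    vectorizer = TfidfVectorizer(lowercase=True)
    # GaussianNB does not accept sparse input, so convert to a dense array.
    features = vectorizer.fit_transform(train_texts).toarray()
    models = {
        name: clf.fit(features, train_labels)
        for name, clf in build_classifiers().items()
    }
    return vectorizer, models


def predict_all(vectorizer, models, new_texts):
    """Return each model's 0/1 predictions for unseen texts."""
    features = vectorizer.transform(new_texts).toarray()
    return {name: clf.predict(features).tolist() for name, clf in models.items()}


if __name__ == "__main__":
    vectorizer, models = fit_all(texts, labels)
    samples = ["you are a stupid loser", "have a lovely day"]
    for name, preds in predict_all(vectorizer, models, samples).items():
        print(name, preds)
```

With a real dataset, the texts would be split into training and test sets before fitting, and the dense conversion would likely need to be limited to the Gaussian Naïve Bayes model, since a full TF-IDF matrix over many tweets can be large.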