Authors :
Mohammed Ali Kawo; Dr. Garba Muhammad; Dr. Danlami Gabi; Dr. Musa Sule Argungu
Volume/Issue :
Volume 9 - 2024, Issue 5 - May
Google Scholar :
https://tinyurl.com/yhzh9ma4
Scribd :
https://tinyurl.com/2yumt3by
DOI :
https://doi.org/10.38124/ijisrt/IJISRT24MAY1751
Abstract :
Sentiment analysis makes it easier to extract subjective
information from online user-generated text. When individual
algorithms are applied to a review dataset for a
classification task, most classifiers produce accurate
results while others produce limited or inaccurate
predictions. This research evaluates four machine learning
algorithms on the same dataset: Naive Bayes, Support Vector
Machine, K-Nearest Neighbor, and Decision Tree. Determining
which machine learning model performs best in sentiment
analysis remains a persistent issue, and our primary goal is
to identify the most effective model for sentiment analysis
of English texts among these classifiers. Their robustness
will be tested on an imbalanced dataset obtained from
Kaggle.com, a machine learning repository. The dataset will
first undergo preprocessing to enable analysis, followed by
feature extraction for the base classifiers. The experiments
will be carried out in a Jupyter notebook from the Anaconda
distribution. Each algorithm's performance will be assessed
using the confusion matrix, precision, recall, and F1-score.
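The evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the toy labels, the helper names `confusion_counts` and `scores`, and the choice of "neg" as the minority class are all assumptions made for illustration. It shows why precision, recall, and F1-score are reported alongside accuracy when the dataset is imbalanced: two classifiers can reach the same accuracy while differing sharply on the minority class.

```python
def confusion_counts(y_true, y_pred, positive):
    """Return (tp, fp, fn, tn), treating `positive` as the class of interest."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def scores(y_true, y_pred, positive):
    """Compute accuracy, precision, recall and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical imbalanced toy data: 8 positive reviews, 2 negative reviews.
y_true = ["pos"] * 8 + ["neg"] * 2

# Assumed classifier A: always predicts the majority class.
y_majority = ["pos"] * 10

# Assumed classifier B: one false alarm, finds one of the two negatives.
y_model = ["pos"] * 7 + ["neg"] + ["neg", "pos"]

# Score both classifiers on the minority ("neg") class.
majority_neg = scores(y_true, y_majority, positive="neg")
model_neg = scores(y_true, y_model, positive="neg")

print("majority baseline:", majority_neg)
print("toy model:        ", model_neg)
```

Both toy classifiers reach the same accuracy, but the majority-class baseline never identifies a negative review, so its minority-class recall and F1-score are zero. In practice the same quantities would be computed for each of the four classifiers on a held-out split of the dataset.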
Keywords :
Machine Learning Algorithms, Sentiment Analysis, Imbalanced, Confusion Matrix.