International Journal of Innovative Science and Research Technology (IJISRT)



Twitter Sentiment Analysis


Authors : Himani H. Patel; Om Mahalle; Anish Shetty; Sachin Hugar

Volume/Issue : Volume 11 - 2026, Issue 6 - June


Google Scholar : https://tinyurl.com/3yewdpjv

Scribd : https://tinyurl.com/mscnwy94

DOI : https://doi.org/10.38124/ijisrt/26jun364



Abstract : Sentiment analysis of social media text plays a vital role in understanding public opinion, consumer behavior, and societal trends. Twitter, a microblogging platform, is a particularly challenging source because tweets are noisy and informal. This paper presents an efficient and scalable sentiment analysis system built on the Sentiment140 dataset, which comprises 1.6 million tweets labeled automatically from emoticons. The system applies lightweight text preprocessing and then converts tweets into numerical features using Term Frequency-Inverse Document Frequency (TF-IDF) with unigrams, bigrams, and sublinear term-frequency scaling. Three classic machine learning classifiers (Multinomial Naive Bayes, Logistic Regression tuned with GridSearchCV, and Linear SVM) are combined using a hard voting classifier. In an experimental study with a 75:25 train-test split, the ensemble achieves higher accuracy, precision, recall, and F1-score than the individual classifiers, and the system is computationally efficient. Error analysis indicates that slang, sarcasm, and emojis are the major sources of misclassification. The results indicate that classical linear models, when trained on large-scale data and combined effectively, provide a strong, scalable baseline for Twitter sentiment analysis that is suitable for real-time deployment. Future work includes incorporating emoji-aware features and contextual embeddings to handle linguistic nuance.
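The pipeline described in the abstract can be sketched with scikit-learn roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the cleaning rules in `clean_tweet`, the `C` grid, the number of cross-validation folds, and the toy tweets standing in for Sentiment140 are all hypothetical choices made for the example.

```python
import re

from sklearn.ensemble import VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")


def clean_tweet(text):
    """Assumed lightweight cleaning: lowercase, mask URLs and @mentions."""
    text = text.lower()
    text = URL_RE.sub(" url ", text)
    text = MENTION_RE.sub(" user ", text)
    return " ".join(text.split())  # collapse repeated whitespace


def build_ensemble(cv_folds=3):
    """TF-IDF features feeding a hard-voting ensemble of three linear models."""
    tfidf = TfidfVectorizer(
        preprocessor=clean_tweet,
        ngram_range=(1, 2),   # unigrams and bigrams
        sublinear_tf=True,    # replace tf with 1 + log(tf)
    )
    # Only Logistic Regression is tuned, as in the abstract; the grid is illustrative.
    logreg = GridSearchCV(
        LogisticRegression(max_iter=1000),
        param_grid={"C": [0.1, 1.0, 10.0]},
        cv=cv_folds,
    )
    voter = VotingClassifier(
        estimators=[("nb", MultinomialNB()), ("lr", logreg), ("svm", LinearSVC())],
        voting="hard",        # majority vote over predicted labels
    )
    return Pipeline([("tfidf", tfidf), ("vote", voter)])


# Toy stand-in for Sentiment140 (1 = positive, 0 = negative); not real data.
TOY_TWEETS = [
    "i love this phone so much", "what a great day with friends",
    "best coffee ever, so happy", "this movie was awesome",
    "really enjoying the new album", "such a wonderful surprise today",
    "feeling great after the gym", "love the weather this morning",
    "i hate waiting in traffic", "this is the worst service ever",
    "so sad my flight got cancelled", "terrible headache all day",
    "my laptop broke again, awful", "really disappointed with the update",
    "feeling sick and tired today", "hate this rainy gloomy weather",
]
TOY_LABELS = [1] * 8 + [0] * 8


def run_demo():
    """Fit on a 75:25 stratified split and return test-set metrics."""
    x_train, x_test, y_train, y_test = train_test_split(
        TOY_TWEETS, TOY_LABELS, test_size=0.25, stratify=TOY_LABELS, random_state=42
    )
    model = build_ensemble().fit(x_train, y_train)
    y_pred = model.predict(x_test)
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average="macro", zero_division=0
    )
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    print(run_demo())
```

Hard voting is used here because `LinearSVC` does not expose class probabilities, so a majority vote over predicted labels is the simplest way to combine it with the other two models. On the full 1.6 million tweets the same structure applies, but metrics from this toy split say nothing about the results reported in the paper.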

