Deep Learning Architectures for Text Classification


Authors : Chitra Desai

Volume/Issue : Volume 10 - 2025, Issue 5 - May


Google Scholar : https://tinyurl.com/3duu49yc

DOI : https://doi.org/10.38124/ijisrt/25may1682



Abstract : Text classification is crucial in natural language processing applications such as sentiment analysis, topic tagging, and news categorization. This paper presents a comparative analysis of three deep learning architectures (LSTM, Bidirectional LSTM, and Character-level Convolutional Neural Networks, or Char-CNN) for the task of news categorization using the AG News dataset. The models were trained using a unified preprocessing pipeline, including tokenization, padding, and label encoding. Performance was evaluated based on classification accuracy, training time, and learning stability across epochs. The results show that the Bidirectional LSTM outperforms the standard LSTM in capturing long-range dependencies by leveraging both past and future context. The Character-level CNN demonstrates robust performance by learning morphological patterns directly from raw text, making it resilient to misspellings and out-of-vocabulary words. The trade-offs between model complexity, training time, and interpretability have also been analyzed. This study offers practical insights into model selection for real-world NLP applications and highlights the importance of architectural choices in deep learning-based text classification.
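
The paper's unified preprocessing pipeline (tokenization, padding, and label encoding) is not spelled out on this page; the sketch below is a minimal illustration of such a pipeline, assuming a TensorFlow/Keras implementation. The toy texts, vocabulary size, and sequence length are placeholders for illustration, not the authors' reported settings.

```python
# Illustrative preprocessing pipeline for AG News-style text (assumed, not the paper's code).
import numpy as np
import tensorflow as tf

# Toy stand-in for AG News samples; labels: 0=World, 1=Sports, 2=Business, 3=Sci/Tech.
texts = [
    "Peace talks resume after ceasefire agreement",
    "Local team wins championship in overtime",
    "Stocks rally as markets rebound sharply",
    "New telescope spots distant galaxy cluster",
]
labels = np.array([0, 1, 2, 3])

VOCAB_SIZE = 20000   # assumed vocabulary cap
MAX_LEN = 100        # assumed padded sequence length
NUM_CLASSES = 4      # AG News has four categories

# Tokenization + padding: map words to integer ids, then pad/truncate to MAX_LEN.
vectorizer = tf.keras.layers.TextVectorization(
    max_tokens=VOCAB_SIZE,
    output_mode="int",
    output_sequence_length=MAX_LEN,
)
vectorizer.adapt(tf.constant(texts))
X = vectorizer(tf.constant(texts)).numpy()

# Label encoding: integer class ids -> one-hot vectors for categorical cross-entropy.
y = tf.keras.utils.to_categorical(labels, num_classes=NUM_CLASSES)

print(X.shape, y.shape)  # (4, 100) (4, 4)
```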

Keywords : Deep Learning for NLP; Text Classification Models; Bidirectional LSTM Performance; Character-level CNN; AG News Dataset.
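
For orientation only, the following sketch shows how the three compared architectures could be defined in Keras. Layer sizes, embedding dimensions, dropout, and the character alphabet and input length are illustrative assumptions, not the configuration reported in the paper.

```python
# Illustrative Keras definitions of the three compared architectures (assumed hyperparameters).
import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB_SIZE, MAX_LEN, NUM_CLASSES = 20000, 100, 4
CHAR_VOCAB, CHAR_LEN = 70, 1014  # character alphabet size and input length as in Zhang et al. (2015)

def build_lstm(bidirectional: bool) -> tf.keras.Model:
    """Word-level LSTM or Bidirectional LSTM classifier."""
    rnn = layers.LSTM(128)
    if bidirectional:
        # Processes the sequence in both directions, so each position sees past and future context.
        rnn = layers.Bidirectional(rnn)
    return models.Sequential([
        layers.Input(shape=(MAX_LEN,)),
        layers.Embedding(VOCAB_SIZE, 128),
        rnn,
        layers.Dropout(0.3),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])

def build_char_cnn() -> tf.keras.Model:
    """Character-level CNN: stacked 1D convolutions over a character embedding."""
    return models.Sequential([
        layers.Input(shape=(CHAR_LEN,)),
        layers.Embedding(CHAR_VOCAB, 16),
        layers.Conv1D(256, 7, activation="relu"),
        layers.MaxPooling1D(3),
        layers.Conv1D(256, 7, activation="relu"),
        layers.MaxPooling1D(3),
        layers.Conv1D(256, 3, activation="relu"),
        layers.GlobalMaxPooling1D(),
        layers.Dense(256, activation="relu"),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])

for model in (build_lstm(False), build_lstm(True), build_char_cnn()):
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
# Training would then call model.fit with word-level ids for the LSTMs
# and character-level ids for the Char-CNN.
```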

