Decoding Complexity and Emotion: A Computational Linguistic Approach to Sentence Comprehensibility and Writer Affect


Authors : Poushali Das; Avishake Kar; Ipsita Pathak; Shibani Mukherjee; Siddhartha Chatterjee

Volume/Issue : Volume 10 - 2025, Issue 7 - July


Google Scholar : https://tinyurl.com/3wwfam8w

DOI : https://doi.org/10.38124/ijisrt/25jul551



Abstract : Understanding the cognitive and emotional foundations of written language is vital for building AI systems that align with human needs and for advancing language understanding technologies. This study examines how both sentence complexity and the writer’s emotional state can be modeled computationally with a multimodal strategy. To assess sentence comprehensibility, we use Cognitive Load Prediction Models trained on eye-tracking and EEG-annotated data from the Zurich Cognitive Language Processing Corpus (ZuCo), which anchors the complexity analysis in measured human cognitive signals. To assess the writer’s emotional state, we present a neuro-symbolic NLP framework that merges rule-based sentiment techniques (such as negation patterns and emotion lexicons) with deep neural representations to improve emotional detail and interpretability. We also include behavioral indicators such as typing speed, keystroke dynamics, and real-time writing hesitations to track cognitive and emotional trends during the writing process. The proposed architecture integrates these textual, physiological, and behavioral modalities to build models that predict the processing difficulty of sentences and the underlying emotions of the writer. Experimental findings indicate that combining symbolic reasoning with contextual embeddings, supplemented by physiological and behavioral information, significantly improves prediction accuracy over text-only models. This research lays the groundwork for linguistic intelligence systems that can interpret not just the content of writing but also the motivations behind it.

Keywords : Sentence Complexity, Memory Load Prediction, Writer Emotion Recognition, Neuro-Symbolic NLP, Multimodal Fusion, Eye-Tracking.
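
The abstract names three ingredients for writer-emotion modeling: rule-based cues (negation patterns and emotion lexicons), neural representations, and keystroke-level behavior. The sketch below is only an illustration of how such pieces could fit together, not the authors' implementation. The toy lexicon, the two-token negation window, the one-second pause threshold, the weighted-average fusion, and every function name are assumptions made for this example; the neural scores are taken as given, since the abstract does not specify the model that produces them.

```python
from statistics import mean

# Hypothetical toy lexicon (word -> (emotion, weight)); a real system would
# load a full word-emotion association lexicon instead.
LEXICON = {
    "happy": ("joy", 1.0),
    "glad": ("joy", 0.8),
    "sad": ("sadness", 1.0),
    "afraid": ("fear", 1.0),
}
NEGATORS = {"not", "never", "no"}


def symbolic_scores(text, window=2):
    """Sum lexicon weights per emotion, flipping the sign of a word's weight
    when a negator appears within `window` tokens before it."""
    tokens = text.lower().replace(".", " ").replace(",", " ").split()
    scores = {}
    for i, tok in enumerate(tokens):
        if tok in LEXICON:
            emotion, weight = LEXICON[tok]
            negated = any(t in NEGATORS for t in tokens[max(0, i - window):i])
            scores[emotion] = scores.get(emotion, 0.0) + (-weight if negated else weight)
    return scores


def keystroke_features(timestamps, pause_threshold=1.0):
    """Derive typing speed, mean inter-key interval, and the number of long
    pauses (hesitations) from key-press timestamps given in seconds."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    duration = timestamps[-1] - timestamps[0] if timestamps else 0.0
    return {
        "keys_per_second": len(gaps) / duration if duration > 0 else 0.0,
        "mean_interval": mean(gaps) if gaps else 0.0,
        "hesitations": sum(g > pause_threshold for g in gaps),
    }


def fuse(symbolic, neural, alpha=0.5):
    """Late fusion: weighted average of symbolic and neural per-emotion scores.
    `neural` stands in for the output of a contextual model (not shown)."""
    emotions = set(symbolic) | set(neural)
    return {
        e: alpha * symbolic.get(e, 0.0) + (1 - alpha) * neural.get(e, 0.0)
        for e in emotions
    }


if __name__ == "__main__":
    rule_based = symbolic_scores("I am not happy, I am sad.")
    behavior = keystroke_features([0.0, 0.2, 0.4, 2.0])
    combined = fuse(rule_based, {"joy": 0.5, "sadness": 0.5})
    print(rule_based, behavior, combined)
```

In this sketch the symbolic scores stay human-readable (each contribution traces back to a lexicon entry and a negation rule), which is the usual motivation for pairing rules with neural scores. The keystroke features are computed separately and could be appended to the model input as additional behavioral signals.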


