Personalized Emotion Detection: Adapting Models to Individual Emotional Expressions


Authors : Diwakar Mainali; Saraswoti Shrestha; Umesh Thapa; Sanjib Nepali

Volume/Issue : Volume 9 - 2024, Issue 10 - October

Google Scholar : https://tinyurl.com/mrxsexjv

Scribd : https://tinyurl.com/586c35ex

DOI : https://doi.org/10.38124/ijisrt/IJISRT24OCT1478

Abstract : Emotion recognition from text and speech has become a critical area of research in artificial intelligence (AI), enhancing human-computer interaction across various sectors. This paper explores the methodologies used in emotion recognition, focusing on Natural Language Processing (NLP) for text and acoustic analysis for speech. It reviews key machine learning and deep learning models, including Support Vector Machines (SVM), neural networks, and transformers, and highlights the datasets commonly used in emotion detection studies. The paper also addresses challenges such as multimodal integration, data ambiguity, and ethical considerations like privacy concerns and bias in models. Applications in customer service, healthcare, education, and entertainment are discussed, showcasing the growing importance of emotion recognition in AI-driven systems. Future research directions, including advancements in deep learning, multimodal systems, and real-time processing, are also explored to address existing limitations.

Keywords : Emotion Recognition, Natural Language Processing (NLP), Acoustic Analysis, Machine Learning, Deep Learning, Speech Emotion Recognition, Text Emotion
