Multimodal Emotional Analysis using XAI for Psychotherapy


Authors : N. Sripriya; Swetha Subramanian; Sriganesh Jagathisan

Volume/Issue : Volume 10 - 2025, Issue 5 - May


Google Scholar : https://tinyurl.com/uw2fyz85

DOI : https://doi.org/10.38124/ijisrt/25may474



Abstract : A person’s mental well-being can be perceived through the emotions they express, and what a person feels can be observed through various physical and physiological cues. People differ, however: some can express what they truly feel, others cannot, and in some scenarios the person expressing an emotion is not fully aware of the emotional state they are in. In such cases even a trained professional is not always right. This raises the need for a solution that can observe a person’s behavioral traits and infer their emotional state. Various deep learning approaches can tackle the problem at hand. One widely used approach is a unimodal system, which predicts a person’s emotional state by processing information collected through a single modality. But relying on a single channel for such a complex classification task is often inefficient. To make more reliable classifications, this study proposes a multimodal approach that incorporates eXplainable Artificial Intelligence (XAI) methodologies, thereby improving psychotherapeutic outcomes. The multimodal emotion recognition approach integrates multiple information channels carrying a person’s physical cues, such as speech and facial expressions; a more accurate prediction can be reached when several complementary channels support it. The addition of XAI algorithms makes it clearer how the model arrived at its conclusion. Overall, this system can be personalized for each client and provides practitioners with a data-driven tool for emotional analysis, helping them design appropriate treatment plans for their clients. By adding this state-of-the-art technology as a supplement to conventional psychotherapy techniques, we can yield more successful treatments.
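The multimodal integration described above is often implemented as late fusion: each modality model outputs a probability distribution over emotion classes, and the distributions are combined into one decision. The sketch below illustrates this idea only; the emotion labels, weights, and probability values are hypothetical placeholders, not taken from the paper.

```python
import numpy as np

# Hypothetical emotion label set for illustration.
EMOTIONS = ["angry", "happy", "neutral", "sad"]

def late_fusion(face_probs, speech_probs, face_weight=0.5):
    """Weighted average of two per-modality probability vectors,
    renormalized so the fused vector sums to 1."""
    face = np.asarray(face_probs, dtype=float)
    speech = np.asarray(speech_probs, dtype=float)
    fused = face_weight * face + (1.0 - face_weight) * speech
    return fused / fused.sum()

# Example: the facial model is uncertain, while the speech model
# strongly favors "sad"; the fused prediction follows the more
# confident complementary channel.
face_p = [0.25, 0.25, 0.25, 0.25]
speech_p = [0.05, 0.05, 0.20, 0.70]
fused = late_fusion(face_p, speech_p, face_weight=0.4)
print(EMOTIONS[int(np.argmax(fused))])  # prints "sad"
```

In a full system the two input vectors would come from trained facial-expression and speech models, and XAI methods such as Grad-CAM or LIME would then be applied to each modality model to show which regions of the face image or segments of the audio drove its contribution.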

Keywords : Multimodal Emotion Recognition; Explainable Artificial Intelligence (XAI); Psychotherapy; Grad-CAM; LIME; Therapy Results.

References :

  1. Khalane, A., Makwana, R., Shaikh, T., & Ullah, A. (2023). Evaluating significant features in context‐aware multimodal emotion recognition with XAI methods. Expert Systems, e13403.
  2. Rahman, M. A., Brown, D. J., Shopland, N., Burton, A., & Mahmud, M. (2022, June). Explainable multimodal machine learning for engagement analysis by continuous performance test. In International Conference on Human Computer Interaction (pp. 386-399). Cham: Springer International Publishing.
  3. Guerdan, L., Raymond, A., & Gunes, H. (2021). Toward affective XAI: facial affect analysis for understanding explainable human-ai interactions. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 3796-3805).
  4. Mylona, A., Avdi, E., & Paraskevopoulos, E. (2022). Alliance rupture and repair processes in psychoanalytic psychotherapy: multimodal in-session shifts from momentary failure to repair. Counselling Psychology Quarterly, 35(4), 814-841.
  5. Terhürne, P., Schwartz, B., Baur, T., Schiller, D., & André, E. (2022). Validation and application of the Non Verbal Behavior Analyzer: An automated tool to assess non verbal emotional expressions in psychotherapy. Frontiers in Psychiatry, 13, 1026015.
  6. Döllinger, L., Högman, L. B., Laukka, P., Bänziger, T., Makower, I., Fischer, H., & Hau, S. (2023). Trainee psychotherapists’ emotion recognition accuracy improves after training: emotion recognition training as a tool for psychotherapy education. Frontiers in Psychology, 14, 1188634.
  7. Tran, T., Yin, Y., Tavabi, L., Delacruz, J., Borsari, B., Woolley, J. D., ... & Soleymani, M. (2023, October). Multimodal Analysis and Assessment of Therapist Empathy in Motivational Interviews. In Proceedings of the 25th International Conference on Multimodal Interaction (pp. 406-415).
  8. Döllinger, L., Letellier, I., Högman, L., Laukka, P., Fischer, H., & Hau, S. (2023). Trainee psychotherapists’ emotion recognition accuracy during 1.5 years of psychotherapy education compared to a control group: no improvement after psychotherapy training. PeerJ, 11, e16235.
  9. Christ, L., Amiriparian, S., Baird, A., Tzirakis, P., Kathan, A., Müller, N., ... & Schuller, B. W. (2022, October). The MuSe 2022 multimodal sentiment analysis challenge: humor, emotional reactions, and stress. In Proceedings of the 3rd International on Multimodal Sentiment Analysis Workshop and Challenge (pp. 5-14).
  10. Cai, C., He, Y., Sun, L., Lian, Z., Liu, B., Tao, J., ... & Wang, K. (2021). Multimodal sentiment analysis based on recurrent neural network and multimodal attention. In Proceedings of the 2nd on Multimodal Sentiment Analysis Challenge (pp. 61-67).

