Authors :
N. Sripriya; Swetha Subramanian; Sriganesh Jagathisan
Volume/Issue :
Volume 10 - 2025, Issue 5 - May
Google Scholar :
https://tinyurl.com/uw2fyz85
DOI :
https://doi.org/10.38124/ijisrt/25may474
Abstract :
A person’s mental well-being can be perceived through the emotions they express, and what a person feels can be observed through various physical and physiological cues. But people are not all the same: some can express what they truly feel, others cannot, and in certain scenarios the person expressing an emotion is not fully aware of the emotional state they are in. In such scenarios, even a trained professional is not always right. This raises the need for a solution that can observe a person’s behavioral traits and infer their emotional state. Various deep learning approaches can tackle the problem at hand. One widely used approach is a unimodal system that predicts a person’s emotional state from information collected through a single modality. But using a single channel for such a complex classification task is often inefficient. To make more accurate classifications, this study proposes a multimodal approach that incorporates eXplainable Artificial Intelligence (XAI) methodologies, thereby improving psychotherapeutic outcomes. The multimodal emotion recognition approach integrates multiple channels of a person’s physical cues, such as speech and facial expressions; a more reliable prediction can be reached when complementary channels back it up. The addition of XAI algorithms makes it clearer how the model arrived at its conclusion. Overall, the system provides a solution that can be personalized for each client and offers a data-driven tool for emotional analysis, helping practitioners design appropriate treatment plans for their clients. By adding this state-of-the-art technology as a supplement to conventional psychotherapy techniques, more successful treatments can be achieved.
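The fusion idea described above can be illustrated with a minimal sketch: per-modality feature vectors (e.g. a speech embedding and a facial-expression embedding) are concatenated and passed to a shared classifier. All names, dimensions, and emotion labels below are illustrative assumptions, not the paper's actual architecture.

```python
# Hypothetical late-fusion sketch: the feature sizes, weights, and
# emotion labels are illustrative, not taken from the paper.
import numpy as np

EMOTIONS = ["angry", "happy", "neutral", "sad"]  # assumed label set

def softmax(z):
    # Numerically stable softmax over class logits.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def fuse_and_classify(speech_feat, face_feat, W, b):
    """Concatenate per-modality features and apply a linear classifier."""
    x = np.concatenate([speech_feat, face_feat])
    return softmax(W @ x + b)

rng = np.random.default_rng(0)
speech = rng.normal(size=8)   # stand-in for a speech/prosody embedding
face = rng.normal(size=8)     # stand-in for a facial-expression embedding
W = rng.normal(size=(len(EMOTIONS), 16))  # untrained weights, for shape only
b = np.zeros(len(EMOTIONS))

probs = fuse_and_classify(speech, face, W, b)
print(EMOTIONS[int(np.argmax(probs))], probs.round(3))
```

In a real system each embedding would come from a trained modality-specific encoder, and an XAI method such as LIME or Grad-CAM would then attribute the prediction back to the input features of each channel.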
Keywords :
Multimodal Emotion Recognition; Explainable Artificial Intelligence (XAI); Psychotherapy; Grad-CAM; LIME; Therapy Results.
References :
- Khalane, A., Makwana, R., Shaikh, T., & Ullah, A. (2023). Evaluating significant features in context‐aware multimodal emotion recognition with XAI methods. Expert Systems, e13403.
- Rahman, M. A., Brown, D. J., Shopland, N., Burton, A., & Mahmud, M. (2022, June). Explainable multimodal machine learning for engagement analysis by continuous performance test. In International Conference on Human Computer Interaction (pp. 386-399). Cham: Springer International Publishing.
- Guerdan, L., Raymond, A., & Gunes, H. (2021). Toward affective XAI: facial affect analysis for understanding explainable human-ai interactions. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 3796-3805).
- Mylona, A., Avdi, E., & Paraskevopoulos, E. (2022). Alliance rupture and repair processes in psychoanalytic psychotherapy: multimodal in-session shifts from momentary failure to repair. Counselling Psychology Quarterly, 35(4), 814-841.
- Terhürne, P., Schwartz, B., Baur, T., Schiller, D., & André, E. (2022). Validation and application of the Non Verbal Behavior Analyzer: An automated tool to assess non verbal emotional expressions in psychotherapy. Frontiers in Psychiatry, 13, 1026015.
- Döllinger, L., Högman, L. B., Laukka, P., Bänziger, T., Makower, I., Fischer, H., & Hau, S. (2023). Trainee psychotherapists’ emotion recognition accuracy improves after training: emotion recognition training as a tool for psychotherapy education. Frontiers in Psychology, 14, 1188634.
- Tran, T., Yin, Y., Tavabi, L., Delacruz, J., Borsari, B., Woolley, J. D., ... & Soleymani, M. (2023, October). Multimodal Analysis and Assessment of Therapist Empathy in Motivational Interviews. In Proceedings of the 25th International Conference on Multimodal Interaction (pp. 406-415).
- Döllinger, L., Letellier, I., Högman, L., Laukka, P., Fischer, H., & Hau, S. (2023). Trainee psychotherapists’ emotion recognition accuracy during 1.5 years of psychotherapy education compared to a control group: no improvement after psychotherapy training. PeerJ, 11, e16235.
- Christ, L., Amiriparian, S., Baird, A., Tzirakis, P., Kathan, A., Müller, N., ... & Schuller, B. W. (2022, October). The MuSe 2022 multimodal sentiment analysis challenge: Humor, emotional reactions, and stress. In Proceedings of the 3rd International on Multimodal Sentiment Analysis Workshop and Challenge (pp. 5-14).
- Cai, C., He, Y., Sun, L., Lian, Z., Liu, B., Tao, J., ... & Wang, K. (2021). Multimodal sentiment analysis based on recurrent neural network and multimodal attention. In Proceedings of the 2nd on Multimodal Sentiment Analysis Challenge (pp. 61-67).