Advanced Computational Intelligence: An International Journal, Vol.12, No.2/3/4, October 2025
[5] Liu, W., Qiu, J.-L., Zheng, W.-L., & Lu, B.-L. (2021). Comparing recognition performance and
robustness of multimodal deep learning models for multimodal emotion recognition. IEEE
Transactions on Cognitive and Developmental Systems, 14(2), 715–729.
https://doi.org/10.1109/TCDS.2021.3071170
[6] Maithri, M., Raghavendra, U., Gudigar, A., Samanth, J., Barua, P. D., Murugappan, M., & Acharya,
U. R. (2022). Automated emotion recognition: Current trends and future perspectives. Computer
Methods and Programs in Biomedicine, 215, 106646. https://doi.org/10.1016/j.cmpb.2022.106646
[7] Pan, B., Hirota, K., Jia, Z., & Dai, Y. (2023). A review of multimodal emotion recognition from
datasets, preprocessing, features, and fusion methods. Neurocomputing, 561, 126866.
https://doi.org/10.1016/j.neucom.2023.126866
[8] Zhang, S., Yang, Y., Chen, C., Zhang, X., Leng, Q., & Zhao, X. (2024). Deep learning-based
multimodal emotion recognition from audio, visual, and text modalities: A systematic review of
recent advances and future prospects. Expert Systems with Applications, 237, 121692.
https://doi.org/10.1016/j.eswa.2023.121692
[9] Lian, H., Lu, C., Li, S., Zhao, Y., Tang, C., & Zong, Y. (2023). A survey of deep learning-based
multimodal emotion recognition: Speech, text, and face. Entropy, 25(10), 1440.
https://doi.org/10.3390/e25101440
[10] Zhao, Z., Wang, Y., Shen, G., Hu, Y., & Zhang, J. (2023). TDFNet: Transformer-based deep-scale
fusion network for multimodal emotion recognition. IEEE/ACM Transactions on Audio, Speech,
and Language Processing, 31, 3771–3782. https://doi.org/10.1109/TASLP.2023.3314318
[11] Alam, M. M., Dini, M. A., Kim, D.-S., & Jun, T. (2025). TMNet: Transformer-fused multimodal
framework for emotion recognition via EEG and speech. ICT Express, in press.
https://doi.org/10.1016/j.icte.2024.09.003
[12] Chen, L., Wang, K., Li, M., Wu, M., Pedrycz, W., & Hirota, K. (2022). K-means clustering-based
kernel canonical correlation analysis for multimodal emotion recognition in human-robot
interaction. IEEE Transactions on Industrial Electronics, 70(1), 1016–1024.
https://doi.org/10.1109/TIE.2022.3156150
[13] Tzirakis, P., Trigeorgis, G., Nicolaou, M. A., Schuller, B. W., & Zafeiriou, S. (2017). End-to-end
multimodal emotion recognition using deep neural networks. IEEE Journal of Selected Topics in
Signal Processing, 11(8), 1301–1309. https://doi.org/10.1109/JSTSP.2017.2764438
AUTHOR
Kurbanov Abdurahmon received his bachelor's degree in Applied Mathematics
and Informatics from Gulistan State University in 2007 and his master's degree in
Applied Mathematics and Information Technologies from the same university in 2009.
From 2009 to 2012, he worked as a senior lecturer at Gulistan College of Computer
Science, and from 2012 to 2022, he was a senior lecturer at the Syrdarya Regional
Institute of Teacher Training. Since 2023, he has been a doctoral student at the
Jizzakh branch of the National University of Uzbekistan named after Mirzo Ulugbek.
His research interests include software engineering, artificial intelligence, deep
learning, web software, and programming languages. He can be contacted at
[email protected].