A review of multimodal aspect-based sentiment analysis

Research Article
Open access

Tian’ang Chen 1*
  • 1 University of Electronic Science and Technology of China    
  • *corresponding author chentianang@foxmail.com
Published on 10 June 2025 | https://doi.org/10.54254/2977-3903/2025.23984
AEI Vol.16 Issue 6
ISSN (Print): 2977-3911
ISSN (Online): 2977-3903

Abstract

In the era of digital communication, the exponential growth of user-generated content across social media and online platforms has intensified the demand for effective emotion analysis tools. Traditional text-based sentiment analysis methods, however, often fall short in accurately capturing the nuances of human emotions due to their reliance on a single modality. Motivated by the need for more comprehensive and context-aware emotion recognition, this study systematically reviews the literature on both unimodal and multimodal aspect-level sentiment analysis. By comparing different approaches within the multimodal domain, we identify existing challenges and emerging trends in this research area. Our findings highlight the potential of integrating multiple modalities—such as text, images, and audio—to enhance the precision of sentiment detection and suggest future directions for advancing multimodal sentiment analysis.

Keywords:

Aspect-Based Sentiment Analysis, Multimodal Aspect-Based Sentiment Analysis, Large Language Models (LLMs)

Chen, T. (2025). A review of multimodal aspect-based sentiment analysis. Advances in Engineering Innovation, 16(6), 43–51.

Data availability

The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.

Disclaimer/Publisher's Note

The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of EWA Publishing and/or the editor(s). EWA Publishing and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

About volume

Journal: Advances in Engineering Innovation

Volume number: Vol.16
Issue number: Issue 6
ISSN: 2977-3903 (Print) / 2977-3911 (Online)

© 2024 by the author(s). Licensee EWA Publishing, Oxford, UK. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license. Authors who publish in this series agree to the following terms:
1. Authors retain copyright and grant the series right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this series.
2. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the series's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this series.
3. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See Open access policy for details).
