Research Article
Open access
Published on 4 February 2024
Ouyang, S. (2024). Deep learning for sentiment analysis on IMDB movie reviews using N-gram features. Applied and Computational Engineering, 35, 56-63.

Deep learning for sentiment analysis on IMDB movie reviews using N-gram features

Sicheng Ouyang *,1
  • 1 Wuhan Britain China School

* Author to whom correspondence should be addressed.

https://doi.org/10.54254/2755-2721/35/20230361

Abstract

Deep learning techniques, combined with large public datasets, have enabled progress across many domains. This study applies them to the film industry, using the Internet Movie Database (IMDB) dataset for sentiment analysis. We classify movie reviews as positive or negative with a deep learning approach. After data preprocessing and N-gram feature extraction, we build a model consisting of an embedding layer, a global average pooling layer, and multiple dense layers. The experimental results indicate that the model performs well on sentiment classification and could support informed decision-making within the film industry.
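
The pipeline named in the abstract (N-gram features, then embedding, global average pooling, and dense layers) can be illustrated with a minimal standard-library sketch. This is not the author's implementation: the function and class names, the whitespace tokenizer, the layer sizes, the random untrained weights, and the omission of bias terms are all assumptions made for illustration.

```python
import math
import random


def featurize(text, max_n=2):
    """Return the unigrams up to max_n-grams of a whitespace-tokenized text."""
    tokens = text.lower().split()
    feats = []
    for n in range(1, max_n + 1):
        feats.extend(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return feats


def build_vocab(feature_lists):
    """Map each distinct feature to an integer id; id 0 is reserved for unknowns."""
    vocab = {}
    for feats in feature_lists:
        for f in feats:
            vocab.setdefault(f, len(vocab) + 1)
    return vocab


def encode(feats, vocab):
    """Convert features to ids, sending unseen features to id 0."""
    return [vocab.get(f, 0) for f in feats]


class AvgPoolClassifier:
    """Embedding -> global average pooling -> ReLU dense -> sigmoid output.

    Hypothetical, untrained model: weights are random and biases are omitted.
    """

    def __init__(self, vocab_size, embed_dim=8, hidden_dim=4, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        self.embedding = init(vocab_size, embed_dim)
        self.w1 = init(embed_dim, hidden_dim)
        self.w2 = init(hidden_dim, 1)

    def predict(self, ids):
        # Embedding lookup: one vector per feature id.
        vectors = [self.embedding[i] for i in ids]
        # Global average pooling: mean over the sequence, per embedding dimension.
        pooled = [sum(col) / len(vectors) for col in zip(*vectors)]
        # Dense layer with ReLU activation.
        hidden = [max(0.0, sum(p * w for p, w in zip(pooled, col)))
                  for col in zip(*self.w1)]
        # Output layer with sigmoid: a score in (0, 1) for the positive class.
        logit = sum(h * w[0] for h, w in zip(hidden, self.w2))
        return 1.0 / (1.0 + math.exp(-logit))


feats = featurize("A great movie")
vocab = build_vocab([feats])
model = AvgPoolClassifier(vocab_size=len(vocab) + 1)
score = model.predict(encode(feats, vocab))
```

Because the weights here are never trained, `score` carries no meaning beyond showing the data flow. In practice the same layers would be trained on labeled reviews with a deep learning framework.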

Keywords

IMDB movie dataset, deep learning, sentiment analysis, N-gram feature extraction, model architecture

Data availability

The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.

Disclaimer/Publisher's Note

The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of EWA Publishing and/or the editor(s). EWA Publishing and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

About volume

Volume title: Proceedings of the 2023 International Conference on Machine Learning and Automation

Conference website: https://2023.confmla.org/
ISBN: 978-1-83558-295-4 (Print) / 978-1-83558-296-1 (Online)
Conference date: 18 October 2023
Editor: Mustafa İSTANBULLU
Series: Applied and Computational Engineering
Volume number: Vol.35
ISSN: 2755-2721 (Print) / 2755-273X (Online)

© 2024 by the author(s). Licensee EWA Publishing, Oxford, UK. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license. Authors who publish in this series agree to the following terms:
1. Authors retain copyright and grant the series right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this series.
2. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the series's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this series.
3. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See Open access policy for details).