Research on web text sentiment analysis and application based on Deep Learning

Ziming Tang

doi:10.54254/2977-3903/2025.23334

1. Introduction

The continuous progress of Internet technology has led to the rapid development of online media. Platforms such as Weibo, Facebook, and Twitter give users a place to express their views and emotions. As a result, the Internet has accumulated a massive amount of text that carries personal opinions and emotions. Analyzing the feelings and views contained in these texts makes it possible to trace how a crowd's emotions toward a particular event change and how those emotions relate to one another, which plays an important role in gauging the direction of public opinion and the social evaluation of products.

Text Sentiment Analysis (SA), also known as opinion mining, refers to extracting and classifying the emotions, attitudes, opinions, and similar information latent in text. It is a fundamental and important task in Natural Language Processing (NLP). This study aims to explore Deep Learning (DL)-based web text sentiment analysis techniques, compare the performance of different DL models, and explore their best practices in practical applications. As both receivers and senders of online information, Internet users generate a large amount of subjective, emotion-bearing text on online platforms by publishing reviews. These texts matter to individuals, businesses, and government agencies. Analyzing them not only enriches theoretical research in NLP but also helps these parties make purchase decisions, set operational strategies, and respond to public opinion, so the task has both theoretical and practical significance [1].

2. Related technologies

2.1. Overview of Deep Learning

Deep Learning (DL) is a branch of Machine Learning (ML) based on Artificial Neural Networks (ANN). Its core idea is to build neural network models with multiple layers so that a computer can automatically learn complex patterns and feature representations from large-scale data. It loosely imitates the connections between neurons in the human brain and adjusts the connection weights between neurons through training on large amounts of data, so that the model can understand and process data accurately.

The development of deep learning (DL) can be traced back to the middle of the last century. Early research on Neural Networks (NN) progressed slowly because of limited computing power and limited data. With rapid improvements in computer hardware, the arrival of the big data era, and continuous algorithmic innovation, deep learning has made breakthrough progress in recent years. From the initial simple Multilayer Perceptron (MLP), to the Convolutional Neural Network (CNN), which shines in image recognition, to the Recurrent Neural Network (RNN) and its variants such as Long Short-Term Memory (LSTM) and the Gated Recurrent Unit (GRU), which are widely used in natural language processing (NLP) and time series analysis, and on to Pre-trained Language Models (PLM) based on the Transformer architecture, such as Bidirectional Encoder Representations from Transformers (BERT) and the Generative Pre-trained Transformer (GPT), deep learning has driven the development of Artificial Intelligence (AI) technology and has shown strong application potential in various fields [2].

2.2. Neural network architecture

2.2.1. Convolutional Neural Network (CNN)

Originally used mainly for image recognition, CNNs have also been widely applied to text sentiment analysis in recent years. CNNs are mainly used for short-text sentiment analysis, where they extract semantic features through local receptive fields. In text processing, a CNN slides convolution kernels over the text sequence to extract local features. Compared with RNNs, CNNs are more computationally efficient, can be computed in parallel, and perform well on short texts. A typical CNN sentiment analysis model includes:

Word embedding layer, which uses methods such as Word2Vec or GloVe to convert text into low-dimensional vectors [3].

Convolutional layer, which extracts local features.

Pooling layer, which reduces dimensionality and improves computational efficiency.

Fully connected layer, which classifies the sentiment of the text.
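The four layers listed above can be illustrated with a minimal, dependency-free sketch. It is not the model of any cited work: the embedding values, the two hand-picked kernels, and the names `embed`, `conv1d`, `max_pool`, and `classify` are all assumptions made for illustration, and a real system would learn these weights from data with a deep learning framework.

```python
import math

# Toy 2-dimensional embeddings (hypothetical values; a real model would
# load Word2Vec or GloVe vectors, or learn them during training).
EMBEDDINGS = {
    "great": [1.0, 0.2],
    "bad": [-1.0, 0.1],
    "movie": [0.0, 0.5],
}
UNKNOWN = [0.0, 0.0]


def embed(tokens):
    """Word embedding layer: map each token to its vector."""
    return [EMBEDDINGS.get(tok, UNKNOWN) for tok in tokens]


def conv1d(vectors, kernel, bias=0.0):
    """Convolutional layer: slide one kernel over windows of adjacent words.

    `kernel` has shape (width, embedding_dim); the result holds one ReLU
    activation per window position.
    """
    width = len(kernel)
    feature_map = []
    for start in range(len(vectors) - width + 1):
        total = bias
        for offset in range(width):
            for w, x in zip(kernel[offset], vectors[start + offset]):
                total += w * x
        feature_map.append(max(0.0, total))  # ReLU
    return feature_map


def max_pool(feature_map):
    """Pooling layer: keep only the strongest activation (max over time)."""
    return max(feature_map) if feature_map else 0.0


def classify(tokens, kernels, fc_weights, fc_bias=0.0):
    """Fully connected layer plus sigmoid: probability of positive sentiment."""
    vectors = embed(tokens)
    pooled = [max_pool(conv1d(vectors, k)) for k in kernels]
    logit = fc_bias + sum(w * p for w, p in zip(fc_weights, pooled))
    return 1.0 / (1.0 + math.exp(-logit))


# Two hand-picked kernels of width 2 (assumed, not learned): the first
# responds to positive words, the second to negative words.
KERNELS = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[-1.0, 0.0], [-1.0, 0.0]],
]
FC_WEIGHTS = [2.0, -2.0]

p_pos = classify(["great", "movie"], KERNELS, FC_WEIGHTS)
p_neg = classify(["bad", "movie"], KERNELS, FC_WEIGHTS)
```

With these toy weights, `p_pos` comes out above 0.5 and `p_neg` below it. Each window position in `conv1d` is independent of the others, which is the property that allows CNNs to be computed in parallel.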

2.2.2. Recurrent Neural Network (RNN)

A recurrent neural network (RNN) is a type of neural network that processes sequential data. Its memory unit stores information from previous time steps and uses it when processing the current time step. The advantage of the RNN is that it captures temporal dependencies in a sequence, which makes it a natural fit for sentiment analysis of longer texts. However, RNNs suffer from vanishing and exploding gradients, which in practice makes them perform poorly on long sequences. To address these problems, the long short-term memory network (LSTM) and the gated recurrent unit (GRU) are usually used as alternatives.

LSTM addresses the long-term dependency problem by introducing a gating mechanism. The LSTM network controls the flow of information through three gates (input gate, forget gate, and output gate), which lets it selectively remember and forget information. This design enables LSTM to retain important context information better than a traditional RNN when dealing with long sequences, and it mitigates the vanishing gradient problem.
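For reference, the three gates can be written out in the commonly used LSTM formulation. The notation is an assumption introduced here, not taken from the paper: $x_t$ is the input at step $t$, $h_t$ the hidden state, $c_t$ the cell state, $\sigma$ the logistic sigmoid, $\odot$ element-wise multiplication, and $W_\ast$, $U_\ast$, $b_\ast$ the learned weights and biases.

```latex
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)} \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(candidate cell state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

The additive update of $c_t$ is what lets gradients pass through many time steps with less attenuation than in a plain RNN: when $f_t$ is close to 1, the previous cell state is carried forward almost unchanged.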

GRU is a variant of LSTM that simplifies its structure, controlling the flow of information mainly through an update gate and a reset gate. Compared with LSTM, GRU has fewer parameters and is therefore more computationally efficient. Although GRU may underperform LSTM on certain tasks, it provides comparable performance in many applications and is better suited to scenarios that require fast training [4].
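The parameter comparison can be made concrete with a small calculation. This sketch assumes the textbook formulation with a single layer and one bias vector per gate block (some libraries keep two bias vectors per block, which changes the totals slightly); the function names and the sizes 100 and 128 are illustrative choices.

```python
def gate_block_params(input_dim, hidden_dim):
    """Parameters of one gate block: input weights, recurrent weights, bias."""
    return hidden_dim * (input_dim + hidden_dim) + hidden_dim


def lstm_params(input_dim, hidden_dim):
    """LSTM has four blocks: input, forget, and output gates plus the candidate."""
    return 4 * gate_block_params(input_dim, hidden_dim)


def gru_params(input_dim, hidden_dim):
    """GRU has three blocks: update and reset gates plus the candidate."""
    return 3 * gate_block_params(input_dim, hidden_dim)


# Example sizes (assumed): 100-dimensional embeddings, 128 hidden units.
n_lstm = lstm_params(100, 128)  # 117248
n_gru = gru_params(100, 128)    # 87936
```

Under these assumptions a GRU layer has three quarters of the parameters of an LSTM layer of the same size, regardless of the dimensions chosen.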

2.3. Attention mechanism

The attention mechanism strengthens the model's focus on key sentiment words and improves the accuracy of sentiment analysis. In particular, adding attention to models such as Bi-LSTM helps capture context information and improves classification performance. When processing text, attention lets the model automatically focus on the important parts and down-weight irrelevant information. In text sentiment analysis, this helps the model capture emotion-related keywords and statements and improves the accuracy of the analysis. For example, in a sentiment analysis model that combines LSTM with an attention mechanism, attention makes the model concentrate on the parts of the text with strong emotional expression [5].
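One simple form of this idea is attention pooling over the hidden states of a recurrent encoder, sketched below in plain Python. The dot-product scoring, the `context` vector, and the hidden state values are assumptions chosen for illustration; in a trained model the hidden states would come from an LSTM or Bi-LSTM and the context vector would be learned.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, context):
    """Score each hidden state against `context`, then average by weight."""
    scores = [sum(h_d * c_d for h_d, c_d in zip(h, context))
              for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states))
              for d in range(dim)]
    return weights, pooled


tokens = ["the", "food", "was", "terrible"]
# Hypothetical encoder outputs, one 2-dimensional vector per token.
hidden_states = [[0.1, 0.0], [0.3, 0.2], [0.0, 0.1], [2.0, 1.5]]
context = [1.0, 1.0]  # assumed; learned in practice

weights, pooled = attention_pool(hidden_states, context)
focus = tokens[weights.index(max(weights))]
```

Here most of the weight falls on "terrible", so the pooled vector passed to the classifier is dominated by the strongly emotional word rather than by function words.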

2.4. Pre-training models

Pre-trained language models such as BERT (Bidirectional Encoder Representations from Transformers) and GPT (Generative Pre-trained Transformer) have been a major breakthrough in natural language processing in recent years and are increasingly used for sentiment analysis. BERT is based on the Transformer architecture and learns rich language knowledge and semantic representations through large-scale unsupervised pre-training. In text sentiment analysis, good performance can be achieved by simply fine-tuning the BERT model. GPT is a generative pre-trained model that can be used not only for sentiment analysis but also for tasks such as text generation and question answering [6].

3. Research on the application of web text sentiment analysis

3.1. Social media sentiment analysis

Social media platforms such as Weibo, Twitter, and Facebook are important places for people to express their views and share their feelings, and they contain massive amounts of text data. Social media sentiment analysis based on deep learning aims to mine users' emotional tendencies from this text data, which provides strong support for public opinion monitoring, brand management, and market research.

In the data collection stage, text data related to the research topic is collected from the major social media platforms. Because social media data is large in volume, varied in format, and noisy, strict preprocessing is required. First, the text is cleaned to remove irrelevant information such as HTML tags, special characters, and links. Then, the text is segmented into words or phrases through word segmentation. Next, stemming or lemmatization converts words into their basic forms to reduce the diversity of the vocabulary. Finally, stop word filtering removes common words that carry no substantive meaning for sentiment analysis (e.g., "of", "is", "in").
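The preprocessing steps above can be sketched with the Python standard library. This is a deliberately simplified pipeline for English text: the stop word list, the suffix-stripping stemmer, and the function names are assumptions made for illustration. Chinese text such as Weibo posts would need a dedicated word segmentation tool instead of whitespace splitting, and production systems usually rely on an established stemmer or lemmatizer.

```python
import re

# Small illustrative stop word list (assumed; real lists are much longer).
STOP_WORDS = {"of", "is", "in", "the", "a", "an", "and", "to"}
# Suffixes for a deliberately crude stemmer, checked in this order.
SUFFIXES = ("ing", "ed", "s")


def clean(text):
    """Step 1: remove HTML tags, links, and special characters."""
    text = re.sub(r"<[^>]+>", " ", text)         # HTML tags
    text = re.sub(r"https?://\S+", " ", text)    # links
    # Keeps ASCII letters and digits only, so emoji are dropped as well.
    text = re.sub(r"[^A-Za-z0-9\s]", " ", text)
    return text.lower()


def tokenize(text):
    """Step 2: word segmentation (whitespace splitting for English)."""
    return text.split()


def stem(token):
    """Step 3: strip one common suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Run cleaning, segmentation, stemming, and stop word filtering."""
    stems = [stem(tok) for tok in tokenize(clean(text))]
    return [tok for tok in stems if tok not in STOP_WORDS]  # Step 4


post = "<p>Enjoyed the camera!!! Delivery is delayed :( http://example.com/r?id=1</p>"
tokens = preprocess(post)
# tokens == ["enjoy", "camera", "delivery", "delay"]
```

Note that this cleaning step throws away emoticons such as ":(", which often carry sentiment in social media text; a practical pipeline might map them to tokens instead of deleting them.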

However, social media sentiment analysis also faces some challenges. First, social media texts contain many colloquial expressions, abbreviations, Internet slang, and emoticons, and these linguistic phenomena make sentiment analysis harder. Second, social media data is updated continuously in real time, which requires the model to adapt quickly to new data and a changing language environment. In addition, emotional expression differs across cultural backgrounds, so making the model perform well on cross-cultural social media sentiment is another problem that needs to be solved. In the future, with the continuous development and innovation of deep learning technology, it is believed that these challenges will be gradually solved, and social media sentiment analysis will play a greater role in more fields.

3.2. Product review analysis

User reviews on e-commerce platforms such as JD.com, Taobao, and Amazon are crucial for consumer decision-making. Through sentiment analysis of these reviews, enterprises can understand consumers' satisfaction with and demand for products, and improve products and services in a timely manner. For example, an e-commerce company used a deep learning model to analyze user reviews and found that negative reviews of a certain product focused mainly on product quality and after-sales service. The company therefore strengthened quality control and after-sales service training to improve customer satisfaction and sales.

3.3. Public opinion monitoring and government decision support

The government and relevant agencies can use text sentiment analysis technology to monitor social media and news reports in real time. By analyzing the sentiment of discussions about hot events, policies, and regulations, they can understand public attitudes and emotions in a timely manner, take corresponding measures to guide and respond, and maintain social harmony and stability.

3.4. Intelligent customer service and chatbots

In intelligent customer service and chatbots, text sentiment analysis can help the bot understand users' emotions and provide more personalized services. For example, when a user expresses dissatisfaction, the chatbot can recognize it in time and respond in a calming way to improve the user experience. In addition, sentiment analysis can help chatbots identify users' needs and preferences, so as to provide more personalized recommendations and solutions. With the continuous progress of sentiment analysis technology, future chatbots will be able to capture changes in user emotion more accurately, further improving service quality and user satisfaction.

4. The challenges and future development trends of sentiment analysis of web text based on deep learning

4.1. Challenges of research

Data quality is a challenge that cannot be ignored when conducting sentiment analysis of Web text. Web text data often contains noise, typos and colloquial expressions. These problems will affect the training effect of the model, and then lead to a decline in the accuracy of sentiment analysis. Therefore, how to effectively clean and process this data to improve its quality is the key to improving the performance of sentiment analysis models.

Another important issue is the interpretability of deep learning models. Because deep learning models are usually considered as "black boxes" with a lack of transparency in their decision-making process, it is difficult for them to be widely used in some fields with high requirements for interpretability, such as healthcare and finance. Therefore, how to improve the interpretability of deep learning models so that they can be fully applied in these fields is an important direction for future research.

In addition, cross-language and cross-domain adaptability is also one of the challenges faced by current sentiment analysis technology. Text in different languages and domains has significant differences in language characteristics and semantic expression, which makes it an urgent problem to ensure that the model can maintain good performance in multi-language and cross-domain scenarios. With the further development of globalization, sentiment analysis models that can effectively deal with different languages and domains will have a wider application prospect.

In the future, with the continuous development and innovation of deep learning technology, it is believed that these challenges will be gradually solved, and network text sentiment analysis will play a greater role in more fields.

4.2. Future development trends

In the future, the field of sentiment analysis will move toward multi-modal analysis, which can significantly improve accuracy by combining multiple data types such as text, speech, and images. By fusing multi-modal information, the model can understand the user's emotion more comprehensively, so as to achieve more accurate sentiment classification and analysis.

In terms of improving the performance of the model, the integration of different deep learning models has become an important research direction. By combining the advantages of multiple models, the accuracy and efficiency of the analysis can be further enhanced. At the same time, the continuous optimization of the model structure and algorithm will make the sentiment analysis model more efficient and have better interpretability, which will broaden its application range in various fields.

The combination of knowledge graph and deep learning can enhance the model's ability to understand text. By using the semantic knowledge and entity relationships in the knowledge graph, the model can process complex texts more accurately, especially when it involves professional terms or texts rich in context information, which can effectively improve the analysis results.

With the rising cost of data annotation, it is becoming more and more important to study how to perform sentiment analysis with a small amount of labeled data or no labeled data. The application of few-shot learning and unsupervised learning methods will reduce the dependence on large-scale labeled data, and promote the wide application of sentiment analysis technology in data-scarce scenarios.

5. Conclusion

This paper comprehensively studies web text sentiment analysis based on deep learning, explores the relevant theoretical basis and model architectures in depth, and demonstrates the superiority of deep learning models in text sentiment analysis. Through the application analysis of different deep learning models, such as the convolutional neural network (CNN), the recurrent neural network (RNN), and pre-trained models, this paper shows that deep learning technology can significantly improve the accuracy and efficiency of sentiment analysis, especially when dealing with long texts and complex sentiment expressions. With the progress of these technologies, sentiment analysis can be better applied to many fields, such as social media analysis, product evaluation, public opinion monitoring, and intelligent customer service.

Although deep learning methods have shown great potential in sentiment analysis, there are still some urgent challenges to be solved. First of all, the interpretability of the model is still an important issue, especially in fields with high interpretability requirements (such as healthcare and finance), and the "black box" nature of the model limits its wide application. In addition, the adaptability of cross-lingual sentiment analysis is also one of the bottlenecks in the current development of technology. Future research can focus on these challenges and further optimize the model structure to achieve more efficient and interpretable sentiment analysis.

Although this paper explores the application of deep learning in sentiment analysis in depth, there is still room for further research on advanced techniques such as multi-modal sentiment analysis, sentiment analysis combined with a knowledge graph, and few-shot learning. In addition, this paper does not cover the deployment and real-time performance of sentiment analysis algorithms in an actual production environment, which is also a direction worthy of attention in future research.

With the continuous development of deep learning technology, text sentiment analysis has a broad application prospect in more practical fields. It is expected that in the future, sentiment analysis will provide more accurate data support for all walks of life, help people make more scientific decisions, and promote the development and progress of society.

References

[1]. Zeng, Y. (2022). Research on Chinese Text Sentiment Analysis Based on Deep Learning. (Doctoral dissertation, Sichuan Agricultural University).

[2]. Yu, K., Jia, L., Chen, Y., & Xu, W. (2013). The Past, Present, and Future of Deep Learning. Computer Research and Development, 50(9), 6.

[3]. Zhou, F., Jin, L., & Dong, J. (2017). A Survey of Convolutional Neural Networks. Journal of Computer Science, 40(6), 23.

[4]. Li, H., Qu, D., Zhang, W., Wang, B., & Liang, Y. (2016). A Recurrent Neural Network Language Model with Global Word Vector Features. Signal Processing, 32(6), 9.

[5]. Rodriguez, P., Gonfaus, J. M., Cucurull, G., Roca, F. X., & Gonzalez, J. (2018). Attend and rectify: a gated attention mechanism for fine-grained recognition. vol 11212. Springer, Cham. https://doi.org/10.1007/978-3-030-01237-3_22

[6]. Zhang, H., Lu, S., Li, Z., Jin, Z., Ma, L., & Liu, Y., et al. (2024). Codebert‐attack: adversarial attack against source code deep learning models via pre‐trained model. Journal of Software: Evolution & Process, 36(3).

Cite this article

Tang, Z. (2025). Research on web text sentiment analysis and application based on Deep Learning. Advances in Engineering Innovation, 16(5), 85-89.

Data availability

The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.

Disclaimer/Publisher's Note

The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of EWA Publishing and/or the editor(s). EWA Publishing and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

About volume

Journal: Advances in Engineering Innovation

Volume number: Vol.16

Issue number: Issue 5

ISSN: 2977-3903 (Print) / 2977-3911 (Online)

© 2024 by the author(s). Licensee EWA Publishing, Oxford, UK. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license. Authors who publish this series agree to the following terms:
1. Authors retain copyright and grant the series right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this series.
2. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the series's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this series.
3. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See Open access policy for details).
