The impact of social media attention on analyst forecast behavior

Research Article
Open access

Ziyi Xiong 1*
  • 1 School of Economics and Management, Beijing University of Technology, Beijing, China    
  • *corresponding author 2623554120@qq.com
Published on 20 June 2025 | https://doi.org/10.54254/2977-5701/2025.23855
JAEPS Vol.18 Issue 4
ISSN (Print): 2977-571X
ISSN (Online): 2977-5701

Abstract

With the proliferation of social media, its role in information dissemination and sentiment transmission in capital markets has become increasingly prominent, as has its impact on securities analysts' forecast behavior. Using data on Chinese A-share listed companies from 2008 to 2023 and social media data from the East Money stock forum, this study systematically explores the mechanism by which social media attention affects analyst forecast quality. It also analyzes the moderating effects of investor sentiment and firm size. By constructing multiple regression models, conducting mediation effect tests, and employing subgroup regression methods, the study finds: (1) Social media interaction has a nonlinear effect on analyst forecast quality. A moderate volume of posts, readership, and comments can improve forecast accuracy by reducing information asymmetry (e.g., the coefficient of Post on forecast error is -0.042***), but excessive interaction triggers information noise, significantly increasing forecast dispersion (e.g., the coefficient of Post on forecast dispersion is 0.090***); (2) The transmission of investor sentiment is asymmetric. Negative sentiment fully mediates the positive impact of social media on forecast dispersion (accounting for 62.3% of the mediation effect), while positive sentiment does not pass significance tests due to analysts' "good news immunity"; (3) Firm size has a significant moderating effect. For small-cap firms, which rely more on social media due to information substitution effects, comment volume reduces forecast error more strongly (coefficient -0.033** vs. -0.015 for large firms), but these firms are also more susceptible to negative sentiment interference (with an elasticity coefficient 2.3 times that of large firms). Large firms, due to business complexity, face an "information overload trap," where fragmented information exacerbates forecast disagreement.
Theoretically, this study innovatively constructs an "information diffusion-sentiment transmission-size heterogeneity" chain framework, revealing the nonlinear mechanisms and boundary conditions of social media’s impact on analyst behavior. Practically, it proposes regulatory suggestions such as "sentiment circuit breakers" and "precision targeted disclosures," providing empirical evidence for optimizing capital market information governance. The findings deepen the theoretical understanding of information intermediary behavior in the new media environment and offer important references for preventing sentiment-driven market risks.

Keywords:

social media attention, stock forum posts, comment volume, readership, analyst forecast quality

Xiong,Z. (2025). The impact of social media attention on analyst forecast behavior. Journal of Applied Economics and Policy Studies,18(4),112-126.

1. Introduction

With the rapid development of information technology, social media has become deeply embedded in the information ecosystem of the capital market. In China, social platforms represented by East Money Stock Bar, Snowball, and Weibo generate millions of investor discussion posts daily, becoming important channels for retail investors to obtain information and express opinions (Pan [1]). This "grassroots" mode of information dissemination has not only broken the monopoly of traditional media and professional institutions but also restructured the decision-making logic of market participants. As important information intermediaries in the capital market, securities analysts face unprecedented challenges: on one hand, social media accelerates market information dissemination through information diffusion effects, potentially reducing information asymmetry, as shown by Antweiler and Frank [2]; on the other hand, its emotional contagion effect may also trigger irrational consensus, leading to an expansion in analyst forecast errors. The intertwining of these dual attributes makes the question of how social media affects the quality of analyst forecasts a focal point of common interest for academia and regulators.

Current research has made preliminary progress in understanding the relationship between social media and the capital market. In pioneering work, Antweiler and Frank verified the relationship between online forum discussions and stock price fluctuations, suggesting that social media information can predict market returns [2]. In the Chinese context, Pan found that the volume of posts on stock bars is significantly related to firms' abnormal returns, but that this correlation is heterogeneous across emotional polarities [1]. However, when the research perspective shifts to professional information intermediaries, namely securities analysts, the existing conclusions are in tension: some scholars believe that social media improves the analyst forecasting environment through an information supplementation mechanism, while its emotional contagion effect may instead enlarge forecast errors. The core of this theoretical divergence lies in the fact that the existing literature has not clarified the transmission path and boundaries of social media's impact on analyst behavior.

Although numerous studies have examined this relationship, three key issues remain inadequately addressed: (1) The mainstream literature mostly employs a single-path analysis, overlooking the mediating role of investor sentiment in information transmission. For example, the optimistic sentiment of retail investors may amplify analysts' response to good news, while pessimistic sentiment may exacerbate their risk aversion (Da et al. [3]). (2) Existing studies commonly assume that social media's impact is linear, but behavioral finance theory suggests that information overload may trigger diminishing or even reverse marginal effects. The tipping point from "information empowerment" to "noise interference," reached when the intensity of social media interactions exceeds analysts' information processing capacity, has not been effectively identified. (3) Analysts may actively guide the direction of social media discussions (e.g., influencing stock forums through keywords in research reports), and existing OLS models struggle to address this bidirectional causality, leading to estimation bias.

The theoretical contributions of this study are reflected on three levels: (1) It breaks through the traditional "information-behavior" single path, constructing a chain transmission framework of "social media → investor sentiment → analyst forecasts," verifying the moderating role of emotional contagion in professional decision-making, and enriching the theory of behavioral finance. (2) It designs a composite indicator for the intensity of social media interactions (entropy weight method covering volume of posts, reads, and comments), which more comprehensively captures the intensity of information dissemination compared to a single post volume indicator, accurately identifying the polarity and intensity of emotions in the text. (3) It reveals the moderating role of firm size on the effect of social media—since small-cap companies have low institutional research coverage, retail investors’ discussions become a key source of information, and their social media interactions have a stronger impact on analyst forecasts compared to large-cap companies, providing new evidence for understanding the characteristics of the Chinese retail investor market.

The limitations of this study include: (1) Data time span limitation: The sample period (2008-2023) does not cover a complete market cycle, leading to an insufficient comparison analysis of bull and bear market effects. (2) Data from 2022 and 2008 were not excluded despite being affected by force majeure factors such as the pandemic. (3) Individual differences among analysts were neglected: Factors such as analysts’ years of experience and the ranking of their brokerage firms were not controlled, potentially omitting important moderating variables.

2. Literature review and research hypotheses

Existing research generally focuses on the linear impact of social media on capital markets, but there is little in-depth exploration of the nonlinear adjustment mechanisms of analyst behavior under the context of information overload. From a macro perspective, the study by Liu revealed the fundamental value of information disclosure quality through a feedback mechanism [4]. However, this type of research has not touched upon the dynamic interactive features of social media—the intertwined effects of posts, views, and comments in stock forums not only convey information itself but also reconstruct the market cognition structure through the "amplifier effect" of investor sentiment. Evidence at the micro level shows that the research by Yang Xiaolan on home bias has verified the path of emotional contagion, but the aggregated sentiment index used blurs the asymmetric impact of positive and negative sentiments [5]. As noted by Chu, positive sentiment can enhance predictive accuracy through a signal reinforcement mechanism (e.g., enthusiastic discussions about innovative projects prompt analysts to conduct in-depth research) [6], while negative sentiment in certain contexts may lead to excessive conservative adjustments (e.g., panic posts cause analysts to systematically underestimate corporate value).

The linear assumptions of existing literature face challenges in terms of theoretical explanatory power. The study by Wang on the "voting with their mouths" behavior of minority shareholders has confirmed the positive incentive of social media on management disclosure behavior [7]. However, its assumed monotonic relationship struggles to explain complex real-world scenarios—when the daily posts in stock forums exceed the threshold of 100,000, noise may overwhelm effective signals, causing analysts to fall into the "attention dilution" dilemma. This nonlinear characteristic has emerged in the study of online public opinion and innovation investment by Jiang [8], but it has not yet extended to the field of analyst forecasts. This study creates a special index that looks at posts, views, and comments, and it finds that having a moderate amount of interaction (like 5,000-8,000 posts each day) can make forecasts better by lowering the gap in information. However, beyond the critical point, the intertwining of redundant information and conflicting views increases analysts’ judgment costs. This finding echoes the attention allocation theory and revises the optimistic view on the inclusiveness of social media information by Blankespoor et al. [9].

From the perspective of the mechanism of action, a recent study by Liao points out that analysts respond to information complexity by increasing the frequency of research, providing behavioral evidence for the emotional mediation pathway [10]. Specifically, discussions dominated by positive emotions may stimulate analysts' motivation for field visits (e.g., posts about technological breakthroughs prompt on-site verification of R&D progress), while waves of negative emotions may force them to revise model parameters (e.g., doubts about financial risks drive sensitivity analysis). This mechanism is particularly pronounced in small companies, where institutional research coverage is low and stock forum discussions become a key source of information. As Ma reveals, the shareholder governance effect of small and medium investors suggests that the "collective wisdom" of retail investors can partially replace traditional due diligence methods [11]. However, caution is needed regarding the "pseudo-attention" interference pointed out by Su: approximately 12% of high-frequency posters exhibit bot characteristics, which may distort the authenticity of emotional signals [12]. This requires the introduction of text fingerprint filtering technology in research design to improve data purity.

Based on this, this study proposes three progressive hypotheses. (1) The relationship between social media interaction intensity and analyst forecast quality is nonlinear: moderate interaction enhances accuracy through the information complementarity effect, but excessive interaction causes noise interference. (2) Investor sentiment exhibits polar divergence in this process, with positive emotions strengthening fundamental analysis and negative emotions exacerbating forecast bias. (3) Small companies are more susceptible to the influence of social media due to the information substitution effect, and this heterogeneity is amplified in scenarios where institutional research is insufficient and public opinion heat is high. The validation of these hypotheses will deepen the understanding of analysts' information processing mechanisms in the new media era and provide empirical evidence for the revision of the China Securities Regulatory Commission's "Regulations on Internet Information Dissemination."

3. Research design

3.1. Data source and data processing

This paper examines the impact of social media attention on analyst earnings forecasts using sample data of China's A-share listed companies from 2008 to 2023. The data are divided into three parts: internet forum data on investor sentiment, web attention data, and analyst forecast data. The web text data are sourced from the East Money (Dongfang Caifu Wang) forum, China's largest stock-related online platform and one of the most visited and influential financial and securities portals in China. We used a web crawler program to capture all webpage text related to the sample stocks' forum posts from 2008 to 2023. The social media attention data were obtained by crawling discussion data from the East Money forum, following the method of Wang and Jiang [7,8]. The regressions then use the number of posts (Post), the number of reads (Read), and the number of comments (Comment) for each company's forum in a given year. The remaining data are sourced from the GuoTaiAn CSMAR database. The initial sample was processed as follows: ST and *ST samples, financial and insurance industry samples, and samples with missing data were excluded, resulting in 24,916 company-year observations. To limit the impact of extreme values on the results, continuous variables were winsorized at the 1% level. Data processing and regression analysis were completed in Stata 14.0.
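The winsorization step can be illustrated with a minimal sketch. It assumes a two-sided treatment at the 1st and 99th percentiles with linear interpolation, which the text does not specify. The function names and the sample series are hypothetical, and the paper's own processing was done in Stata, not Python.

```python
def percentile(sorted_vals, q):
    """Linear-interpolation percentile of an ascending list, q in [0, 100]."""
    if not sorted_vals:
        raise ValueError("empty input")
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * frac


def winsorize(values, lower_pct=1.0, upper_pct=99.0):
    """Clamp values outside the given percentiles to the percentile cut-offs.

    Assumption: two-sided winsorization at the 1st/99th percentiles.
    """
    ordered = sorted(values)
    lo_cut = percentile(ordered, lower_pct)
    hi_cut = percentile(ordered, upper_pct)
    return [min(max(v, lo_cut), hi_cut) for v in values]


# Hypothetical series with two extreme observations (not data from the paper).
sample = [float(i) for i in range(1, 100)] + [-500.0, 900.0]
clipped = winsorize(sample)
```

Unlike trimming, winsorizing keeps every observation and only replaces the extreme values, so the sample size is unchanged.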

3.2. Multiple regression model establishment

To test Research Hypothesis 1, that there is a nonlinear relationship between social media interaction intensity and analyst forecast quality, where moderate interaction improves accuracy through information complementarity, but excessive interaction causes noise interference, models (1)-(6) were established.

\( Ferror={β_{0}}+{β_{1}}Post+{β_{2}}Size+{β_{3}}Lev+{β_{4}}EM+{β_{5}}ROA+{β_{6}}ROE+{β_{7}}GrossProfit+{β_{8}}NetProfitGrowth+{β_{9}}Growth+{β_{10}}AssetGrowth+{β_{11}}REC+{β_{12}}Inv+{β_{13}}IAR+{β_{14}}Fixed+Σyear+ε \) (1)

\( FDisp={β_{0}}+{β_{1}}Post+{β_{2}}Size+{β_{3}}Lev+{β_{4}}EM+{β_{5}}ROA+{β_{6}}ROE+{β_{7}}GrossProfit+{β_{8}}NetProfitGrowth+{β_{9}}Growth+{β_{10}}AssetGrowth+{β_{11}}REC+{β_{12}}Inv+{β_{13}}IAR+{β_{14}}Fixed+Σyear+ε \) (2)

\( Ferror={β_{0}}+{β_{1}}Read+{β_{2}}Size+{β_{3}}Lev+{β_{4}}EM+{β_{5}}ROA+{β_{6}}ROE+{β_{7}}GrossProfit+{β_{8}}NetProfitGrowth+{β_{9}}Growth+{β_{10}}AssetGrowth+{β_{11}}REC+{β_{12}}Inv+{β_{13}}IAR+{β_{14}}Fixed+Σyear+ε \) (3)

\( FDisp={β_{0}}+{β_{1}}Read+{β_{2}}Size+{β_{3}}Lev+{β_{4}}EM+{β_{5}}ROA+{β_{6}}ROE+{β_{7}}GrossProfit+{β_{8}}NetProfitGrowth+{β_{9}}Growth+{β_{10}}AssetGrowth+{β_{11}}REC+{β_{12}}Inv+{β_{13}}IAR+{β_{14}}Fixed+Σyear+ε \) (4)

\( Ferror={β_{0}}+{β_{1}}Comment+{β_{2}}Size+{β_{3}}Lev+{β_{4}}EM+{β_{5}}ROA+{β_{6}}ROE+{β_{7}}GrossProfit+{β_{8}}NetProfitGrowth+{β_{9}}Growth+{β_{10}}AssetGrowth+{β_{11}}REC+{β_{12}}Inv+{β_{13}}IAR+{β_{14}}Fixed+Σyear+ε \) (5)

\( FDisp={β_{0}}+{β_{1}}Comment+{β_{2}}Size+{β_{3}}Lev+{β_{4}}EM+{β_{5}}ROA+{β_{6}}ROE+{β_{7}}GrossProfit+{β_{8}}NetProfitGrowth+{β_{9}}Growth+{β_{10}}AssetGrowth+{β_{11}}REC+{β_{12}}Inv+{β_{13}}IAR+{β_{14}}Fixed+Σyear+ε \) (6)
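Models (1)-(6) share one structure: an outcome regressed on one attention measure, the controls, and year dummies. The following is a minimal sketch of that structure in pure Python, solved through the normal equations. It is illustrative only: the data are synthetic, only one control is included, and `solve_linear` and `ols_with_year_dummies` are hypothetical helper names, not the paper's Stata procedure.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols_with_year_dummies(y, regressors, years):
    """OLS of y on an intercept, the regressors, and year dummies.

    The earliest year is the omitted base category.
    Returns [intercept, regressor coefficients..., year effects...].
    """
    levels = sorted(set(years))[1:]
    rows = []
    for i in range(len(y)):
        dummies = [1.0 if years[i] == lv else 0.0 for lv in levels]
        rows.append([1.0] + list(regressors[i]) + dummies)
    k = len(rows[0])
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * y[i] for i, r in enumerate(rows)) for p in range(k)]
    return solve_linear(xtx, xty)


# Synthetic, noise-free data built from known coefficients (not paper data).
post = [8.0, 9.0, 8.5, 9.5, 7.5, 10.0]
size = [21.0, 22.5, 23.0, 21.5, 22.0, 24.0]
years = [2008, 2008, 2008, 2009, 2009, 2009]
ferror = [1.0 - 0.04 * p - 0.02 * s + (0.1 if yr == 2009 else 0.0)
          for p, s, yr in zip(post, size, years)]
beta = ols_with_year_dummies(ferror, list(zip(post, size)), years)
```

Because the synthetic outcome is an exact linear function of the regressors, `beta` recovers the generating coefficients. With real data, standard errors and any further fixed effects would also need to be estimated.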

To investigate Research Hypothesis 2, that investor sentiment exhibits polar differentiation during this process, with positive sentiment reinforcing fundamental analysis and negative sentiment exacerbating forecast bias, we use positive and negative sentiment as mediating variables. A three-step regression method was employed, and the models were constructed as equations (7)-(9):

\( FDisp={β_{0}}+{β_{1}}Post+{β_{2}}Size+{β_{3}}Lev+{β_{4}}EM+{β_{5}}ROA+{β_{6}}ROE+{β_{7}}GrossProfit+{β_{8}}NetProfitGrowth+{β_{9}}Growth+{β_{10}}AssetGrowth+{β_{11}}REC+{β_{12}}Inv+{β_{13}}IAR+{β_{14}}Fixed+Σyear+Σindr+ε \) (7)

\( Positive={β_{0}}+{β_{1}}Post+{β_{2}}Size+{β_{3}}Lev+{β_{4}}EM+{β_{5}}ROA+{β_{6}}ROE+{β_{7}}GrossProfit+{β_{8}}NetProfitGrowth+{β_{9}}Growth+{β_{10}}AssetGrowth+{β_{11}}REC+{β_{12}}Inv+{β_{13}}IAR+{β_{14}}Fixed+Σyear+ε \) (8)

\( FDisp={β_{0}}+{β_{1}}Post+{β_{2}}Positive+{β_{3}}Size+{β_{4}}Lev+{β_{5}}EM+{β_{6}}ROA+{β_{7}}ROE+{β_{8}}GrossProfit+{β_{9}}NetProfitGrowth+{β_{10}}Growth+{β_{11}}AssetGrowth+{β_{12}}REC+{β_{13}}Inv+{β_{14}}IAR+{β_{15}}Fixed+Σyear+ε \) (9)

In these three-step regressions, to ensure the results are authentic, reliable, and complete, the forecast dispersion (FDisp) and earnings forecast accuracy (Ferror) were swapped, and the number of posts (Post), the number of comments (Comment), and the reading volume of popular posts (Read) were substituted and regressed separately. A total of 12 regression models were constructed.
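
The logic of the three-step procedure can be summarized in generic shorthand. The symbols \( X \), \( M \), \( Y \), \( c \), \( a \), \( b \), and \( c' \) are introduced here for illustration only and are not the paper's notation. \( X \) stands for Post, Read, or Comment, \( M \) for Positive or Negative, and \( Y \) for FDisp or Ferror:

\( Y=cX+Controls+Σyear+ε \) (step 1, total effect)

\( M=aX+Controls+Σyear+ε \) (step 2, effect on the mediator)

\( Y=c'X+bM+Controls+Σyear+ε \) (step 3, direct effect \( c' \) and mediator effect \( b \))

When all three equations are estimated by OLS on the same sample with the same controls, the coefficients satisfy \( c=c'+ab \), so \( ab \) is the indirect (mediated) effect. If \( c' \) and \( ab \) have opposite signs, the two paths partly cancel, which is usually called a masking or suppression effect.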

To test Hypothesis 3, that smaller companies are more susceptible to the influence of social media due to the information substitution effect and that this heterogeneity is amplified in scenarios with insufficient institutional research and high public opinion intensity, we run subgroup regressions with the sample split at the median of the grouping criterion, using the same specifications as above (models (1)-(9)).
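The median-based grouping can be sketched as follows. The grouping variable is assumed here to be Size, and ties at the median are assigned to the small group; neither detail is stated in the text. The records are hypothetical.

```python
from statistics import median


def split_by_median(records, key="Size"):
    """Split records into (cutoff, small, large) groups at the median of `key`.

    Assumption: observations equal to the median go to the small group.
    """
    cutoff = median(r[key] for r in records)
    small = [r for r in records if r[key] <= cutoff]
    large = [r for r in records if r[key] > cutoff]
    return cutoff, small, large


# Hypothetical firm-year records (not data from the paper).
firms = [
    {"id": "A", "Size": 20.0},
    {"id": "B", "Size": 21.0},
    {"id": "C", "Size": 22.0},
    {"id": "D", "Size": 23.0},
]
cutoff, small_firms, large_firms = split_by_median(firms)
```

Each subgroup would then be passed separately to the same regression specification, and the coefficients compared across groups.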

3.3. Key metrics measurement

3.3.1. Measurement of analyst earnings forecast bias indicator (Ferror)

Referring to the research findings of Sohn [13], the calculation of analyst earnings forecast bias is as equation (10):

\( Ferror=\frac{|{MEPS_{it}}-{AEPS_{it}}|}{{TA_{it}}} \) (10)

Here, \( MEPS_{it} \) represents the median of all analyst earnings forecasts for company \( i \) in year \( t \), \( AEPS_{it} \) is the actual earnings per share of company \( i \) in year \( t \), and \( TA_{it} \) denotes the total assets per share of company \( i \) in year \( t \). The larger the forecast error Ferror, the lower the accuracy of the analyst forecasts. This follows the method for measuring the accuracy of analysts' earnings forecasts in [14].

3.3.2. Measurement of analyst earnings forecast dispersion (FDisp)

Similarly, following Sohn [13], the dispersion of analyst earnings forecasts is calculated as in equation (11):

\( FDisp=\frac{Std({FEPS_{it}})}{{TA_{it}}} \) (11)

where \( FEPS_{it} \) represents the earnings-per-share forecasts issued by individual analysts for company \( i \) in year \( t \), and \( TA_{it} \) is the total assets per share defined above. This follows the measurement of analyst earnings forecast divergence in [14,15]. The larger the value of FDisp, the greater the divergence of analyst earnings forecasts.
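Both measures above, forecast error and forecast dispersion, can be computed in a few lines. This is a minimal sketch assuming the sample standard deviation (n - 1 denominator) for Std, which the text does not specify. The function names and input values are hypothetical.

```python
from statistics import median, stdev


def forecast_error(forecasts, actual_eps, assets_per_share):
    """Ferror: |median forecast EPS - actual EPS| scaled by total assets per share."""
    return abs(median(forecasts) - actual_eps) / assets_per_share


def forecast_dispersion(forecasts, assets_per_share):
    """FDisp: standard deviation of forecast EPS scaled by total assets per share.

    Assumption: sample standard deviation; requires at least two forecasts.
    """
    return stdev(forecasts) / assets_per_share


# Hypothetical firm-year: three analyst EPS forecasts, actual EPS, assets per share.
eps_forecasts = [1.0, 1.2, 1.4]
ferror = forecast_error(eps_forecasts, actual_eps=1.0, assets_per_share=10.0)
fdisp = forecast_dispersion(eps_forecasts, assets_per_share=10.0)
```

A firm-year covered by a single analyst has no defined dispersion under this definition.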

4. Investor sentiment indicator

We utilize the KNN algorithm in the text mining tool Weka.
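
To make the classification step concrete, here is a minimal k-nearest-neighbours sketch for labelling post sentiment. It is not the Weka implementation or its API. The bag-of-words features, cosine similarity, k = 3, and the English training posts are all assumptions chosen for illustration, since the text does not describe the features or parameters used.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words token counts (whitespace tokenization, lower-cased)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def knn_label(text, labeled_posts, k=3):
    """Majority label among the k labelled posts most similar to `text`."""
    query = vectorize(text)
    ranked = sorted(labeled_posts,
                    key=lambda p: cosine(query, vectorize(p[0])),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical labelled posts (the real corpus is Chinese forum text).
train = [
    ("profit growth strong buy", "positive"),
    ("strong earnings beat buy", "positive"),
    ("great growth outlook", "positive"),
    ("loss warning sell now", "negative"),
    ("debt risk sell", "negative"),
    ("fraud risk heavy loss", "negative"),
]
label_a = knn_label("heavy loss and debt risk", train)
label_b = knn_label("strong growth buy", train)
```

Classified posts could then be counted per firm and year to form indicators such as Positive and Negative. How the paper aggregates them is not stated here.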

Following existing research, this paper also controls for firm size (Size), measured as the natural logarithm of total assets; asset-liability ratio (Lev); equity multiplier (EM); return on total assets (ROA); return on equity (ROE); gross profit margin (GrossProfit); net profit growth rate (NetProfitGrowth); revenue growth rate (Growth); total asset growth rate (AssetGrowth); accounts receivable ratio (Rec); inventory ratio (Inv); intangible assets ratio (IAR); fixed assets ratio (Fixed); and year effects (year).

5. Empirical results and analysis

5.1. Descriptive statistics of variables

Table 1 presents the descriptive statistics of the main variables in the model. As the table shows, the analysts' earnings forecast error (Ferror), the indicator of forecast quality, has a mean of 0.3574629969 and a median of 0.1239565. The mean of analysts' forecast dispersion (FDISP) is 0.4571328043, and the median is 0.1510522962. Both means exceed the medians, so the data are generally skewed to the right. This indicates that some analysts issued forecasts of poorer quality, possibly due to optimistic bias and information lag. Regarding social media activity indicators, the mean of the number of comments (Comment) is 9.231359627, with a median of 9.248936. The mean of the number of post views (Read) is 15.4401934, with a median of 15.452205. The mean of the number of posts (Post) is 8.69152205, with a median of 8.703175. The means and medians of these indicators are close, indicating stable platform activity and user engagement. However, the relatively high standard deviations of investors' positive sentiment indicator (Positive) and negative sentiment indicator (Negative) suggest that public opinion and industry hotspots may affect investors' sentiments.

Table 1. Descriptive statistics

| Variable | Mean | Standard Deviation | Minimum | Maximum | Median |
| --- | --- | --- | --- | --- | --- |
| FERROR | 0.3574629969 | 1.608547691 | -1.584705223 | 31.24164 | 0.1239565 |
| FDISP | 0.4571328043 | 1.073138037 | 0 | 12.94047165 | 0.1510522962 |
| Comment | 9.231359627 | 1.16085342 | 0 | 14.34669 | 9.248936 |
| Post | 8.69152205 | 0.9377724834 | 0 | 12.93096 | 8.703175 |
| Read | 15.4401934 | 1.182504983 | 0 | 20.55566 | 15.452205 |
| Positive | 7.484861527 | 0.8751823937 | 0 | 11.62034 | 7.504392 |
| Negative | 7.139737342 | 0.998384738 | 0 | 11.52237034 | 7.165493488 |
| Size | 22.09606937 | 1.308355243 | 16.12137 | 28.33742 | 21.9033 |
| Lev | 0.4162009371 | 0.2431336176 | 0.0070799 | 18.50555 | 0.4034323 |
| EM | 2.146065542 | 8.530698687 | -865.8976 | 417.2532 | 1.673335 |
| ROA | 0.0339233753 | 0.0874110898 | -2.74627 | 0.973666 | 0.037608 |
| ROE | 0.0575621728 | 6.375815405 | -186.5568 | 943.6241 | 0.06816135 |
| GrossProfit | 0.2920782014 | 0.1883761768 | -3.252881 | 1 | 0.25817435 |
| NetProfitGrowth | 1.263571026 | 284.2731466 | -5712.555 | 40867.23 | 0.0306072 |
| Growth | 1.185225429 | 38.22293209 | -41.576005 | 4500.0157 | 0.1182425 |
| Rec | 0.1230006954 | 0.1058268844 | 0 | 0.8133314 | 0.10008265 |
| Inv | 0.1384436265 | 0.1305522689 | 0 | 0.9401475 | 0.10803145 |
| Fixed | 0.2061062504 | 0.157595025 | 0 | 0.9541789 | 0.17321375 |

5.2. Multiple regression analysis

After de-meaning, the continuous variables are incorporated into models (1)-(6). The multiple regression results are presented in Table 2.

Table 2. Multiple regression results of the impact of social media attention on analyst forecast quality

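| VARIABLES | (1) | (2) | (3) | (4) | (5) | (6) |
| --- | --- | --- | --- | --- | --- | --- |
| Post | -0.042*** | 0.090*** | | | | |
| | (0.014) | (0.010) | | | | |
| Read | | | -0.026** | 0.080*** | | |
| | | | (0.013) | (0.009) | | |
| Comment | | | | | -0.024** | 0.070*** |
| | | | | | (0.011) | (0.007) |
| Size | -0.022** | -0.008 | -0.022** | -0.008 | -0.022** | -0.009 |
| | (0.009) | (0.007) | (0.009) | (0.007) | (0.009) | (0.007) |
| Lev | 0.165** | -0.010 | 0.160** | -0.008 | 0.159** | -0.004 |
| | (0.069) | (0.050) | (0.069) | (0.050) | (0.069) | (0.049) |
| EM | 0.001 | -0.001 | 0.001 | -0.001 | 0.001 | -0.001 |
| | (0.001) | (0.001) | (0.001) | (0.001) | (0.001) | (0.001) |
| ROA | -0.199 | 0.019 | -0.202 | 0.022 | -0.202 | 0.023 |
| | (0.131) | (0.082) | (0.131) | (0.082) | (0.131) | (0.082) |
| ROE | 0.001* | -0.001* | 0.001* | -0.001* | 0.001* | -0.001* |
| | (0.001) | (0.001) | (0.001) | (0.001) | (0.001) | (0.001) |
| GrossProfit | -0.120* | -0.053 | -0.119* | -0.054 | -0.119* | -0.052 |
| | (0.068) | (0.040) | (0.068) | (0.040) | (0.068) | (0.040) |
| NetProfitGrowth | 0.000 | -0.000** | 0.000 | -0.000*** | 0.000 | -0.000*** |
| | (0.000) | (0.000) | (0.000) | (0.000) | (0.000) | (0.000) |
| Growth | -0.000*** | -0.000*** | -0.000*** | -0.000*** | -0.000*** | -0.000*** |
| | (0.000) | (0.000) | (0.000) | (0.000) | (0.000) | (0.000) |
| Rec | 0.029 | 0.031 | 0.035 | 0.028 | 0.035 | 0.026 |
| | (0.113) | (0.075) | (0.113) | (0.075) | (0.113) | (0.075) |
| Inv | 0.063 | -0.049 | 0.062 | -0.048 | 0.063 | -0.049 |
| | (0.087) | (0.063) | (0.087) | (0.063) | (0.087) | (0.063) |
| Fixed | 0.010 | -0.024 | 0.007 | -0.022 | 0.008 | -0.024 |
| | (0.081) | (0.053) | (0.081) | (0.054) | (0.081) | (0.053) |
| Constant | 1.291*** | -0.119 | 1.333*** | -0.579*** | 1.152*** | 0.024 |
| | (0.242) | (0.176) | (0.298) | (0.207) | (0.232) | (0.170) |
| Observations | 21,151 | 21,202 | 21,151 | 21,202 | 21,151 | 21,202 |
| R-squared | 0.005 | 0.019 | 0.005 | 0.019 | 0.005 | 0.019 |
| yearfix | YES | YES | YES | YES | YES | YES |
| idfix | YES | YES | YES | YES | YES | YES |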

Columns (1), (3), and (5) of Table 2 present the multiple regression results for the impact of the number of posts (Post), the number of views (Read), and the number of comments (Comment) in the stock forum on the accuracy of analysts' earnings forecasts. The coefficients of Post, Read, and Comment on analysts' forecast error are all significantly negative (Post at the 1% level, Read and Comment at the 5% level), indicating that, controlling for other variables, fewer posts, fewer views, and fewer comments are associated with larger forecast errors, that is, lower accuracy of analysts' earnings forecasts. A possible reason is that, when forum activity is low, analysts have access to too little information, and the resulting information asymmetry reduces the accuracy of their forecasts.

Columns (2), (4), and (6) of Table 2 show that the coefficients of Post, Read, and Comment on the dispersion of analysts' earnings forecasts are significantly positive at the 1% level. This indicates that, controlling for other variables, a higher number of posts, views, and comments in the stock forum is associated with greater dispersion of analysts' earnings forecasts. A possible reason is that higher attention on social media increases information noise, thereby widening the dispersion of analysts' earnings forecasts.

Table 2 results confirm Hypothesis 1. The multivariate regression results of Hypothesis 1 show that a moderate amount of interaction improves accuracy by providing complementary information, but too much interaction creates confusion.

5.3. The mediating role of investor sentiment on social media in the quality of analyst forecasts

This study uses a three-step hierarchical regression method to examine how social media posts (Post), social media readership (Read), and social media comments (Comment) influence analysts’ forecast behavior through the mediating pathways of investors’ positive (Positive) and negative emotions (Negative).

In each table, the first column presents the main-effect test, that is, the direct impact of the number of posts (Post), reads (Read), and comments (Comment) on the forecast error (FERR1) or the dispersion of earnings forecasts (FDISP1). The second column presents the impact of Post, Read, and Comment on the sentiment indicator (Negative/Positive) in the mediation effect test, and the third column presents the joint effect, to verify whether the mediating effect of sentiment is significant.

Table 3. Investor pessimism and its impact on analysts’ earnings forecast errors in social media metrics

| VARIABLES | FERR1 | Negative | FERR1 |
| --- | --- | --- | --- |
| Post | -0.042*** | 1.054*** | -0.081 |
| | (0.014) | (0.002) | (0.060) |
| Post_Negative | | | 0.032 |
| | | | (0.055) |
| Read | -0.026** | 0.965*** | 0.084** |
| | (0.013) | (0.010) | (0.037) |
| Read_Negative | | | -0.114*** |
| | | | (0.036) |
| Comment | -0.024** | 0.801*** | 0.054* |
| | (0.011) | (0.003) | (0.031) |
| Comment_Negative | | | -0.097*** |
| | | | (0.036) |
| Size | -0.022** | 0.004*** | -0.022** |
| | (0.009) | (0.001) | (0.009) |
| Lev | 0.165** | -0.039*** | 0.167** |
| | (0.069) | (0.008) | (0.069) |
| EM | 0.001 | -0.000 | 0.001 |
| | (0.001) | (0.000) | (0.001) |
| ROA | -0.199 | 0.011 | -0.198 |
| | (0.131) | (0.015) | (0.131) |
| ROE | 0.001* | -0.000 | 0.001* |
| | (0.001) | (0.000) | (0.001) |
| GrossProfit | -0.120* | -0.000 | -0.121* |
| | (0.068) | (0.008) | (0.068) |
| NetProfitGrowth | -0.000 | 0.000*** | -0.000 |
| | (0.000) | (0.000) | (0.000) |
| Growth | -0.000*** | -0.000 | -0.000*** |
| | (0.000) | (0.000) | (0.000) |
| Rec | 0.029 | -0.001 | 0.028 |
| | (0.113) | (0.014) | (0.113) |
| Inv | 0.063 | -0.002 | 0.064 |
| | (0.087) | (0.011) | (0.087) |
| Fixed | 0.010 | 0.013 | 0.010 |
| | (0.081) | (0.009) | (0.081) |
| Constant | 1.291*** | -2.079*** | 1.400*** |
| | (0.242) | (0.033) | (0.272) |
| Observations | 21,151 | 21,202 | 21,151 |
| R-squared | 0.005 | 0.959 | 0.005 |
| yearfix | YES | YES | YES |
| idfix | YES | YES | YES |

As can be seen from Table 3, the total-effect coefficients of Post, Read, and Comment on FERR1 are -0.042, -0.026, and -0.024, respectively. Post is significant at the 1% level, and Read and Comment are significant at the 5% level, indicating that an increase in the number of posts, reads, and comments reduces the forecast error. The regression coefficients of Post, Read, and Comment on the negative sentiment index are 1.054, 0.965, and 0.801, respectively, all significant at the 1% level, consistent with the social media "sentiment amplification" theory of Antweiler and Frank [2]. After incorporating Negative, the coefficients of Post, Read, and Comment become -0.081, 0.084, and 0.054, respectively; the first is not significant, while the latter two are significant at the 5% and 10% levels. The coefficients of Negative in the three models are 0.032, -0.114, and -0.097; the first is not significant, while the latter two are significant at the 1% level. For Post, the indirect path through negative sentiment therefore does not pass the significance test: the direct effect dominates, and other mediating variables or direct channels may be at work. For Read and Comment, the sentiment path is significant, but the direct and indirect coefficients have opposite signs, so the pattern is a masking effect rather than a conventional partial mediation. Reads and comments raise negative sentiment, the direct path and the sentiment-mediated path pull the forecast error in opposite directions, and the two largely offset each other, leaving the small negative net effect reported in the first column.
This nonlinear mechanism suggests that under information overload, analysts may misjudge the marginal impact of negative information due to emotional interference, partially offsetting the gains from increased transparency of the original information. It reveals the spiral path of "information exposure-emotional feedback-cognitive adjustment."
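
The masking pattern can be checked with simple arithmetic on the coefficients reported in Table 3. In a three-step mediation fit the total effect should roughly equal the direct effect plus the product of the two mediation-path coefficients. The sketch below applies that identity. The helper name `implied_total_effect` is hypothetical, and small gaps are expected because the reported coefficients are rounded and the step-two sample size differs slightly.

```python
def implied_total_effect(direct, path_a, path_b):
    """Total effect implied by a mediation fit: direct + (X -> M) * (M -> Y)."""
    return direct + path_a * path_b


# Coefficients as reported in Table 3: (total, X -> Negative, Negative -> FERR1, direct).
table3 = {
    "Post": (-0.042, 1.054, 0.032, -0.081),
    "Read": (-0.026, 0.965, -0.114, 0.084),
    "Comment": (-0.024, 0.801, -0.097, 0.054),
}

results = {}
for name, (total, a, b, direct) in table3.items():
    indirect = a * b
    implied = implied_total_effect(direct, a, b)
    results[name] = (indirect, implied, total)
    print(f"{name}: indirect={indirect:+.3f} "
          f"implied total={implied:+.3f} reported total={total:+.3f}")
```

For Read and Comment the indirect term is negative while the direct term is positive, and their sum is close to the reported total effect, which is the numerical signature of masking.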

Table 4. Investor sentiment and its impact on analysts’ earnings forecast errors in social media metrics

| VARIABLES | FERR1 | Positive_01 | FERR1 |
| --- | --- | --- | --- |
| Post | -0.042*** | 0.938*** | -0.057 |
| | (0.014) | (0.002) | (0.055) |
| Post_Positive_01 | | | 0.011 |
| | | | (0.056) |
| Read | -0.026** | 0.869*** | 0.115*** |
| | (0.013) | (0.007) | (0.038) |
| Read_Positive_01 | | | -0.162*** |
| | | | (0.041) |
| Comment | -0.024** | 0.706*** | 0.052* |
| | (0.011) | (0.003) | (0.028) |
| Comment_Positive_01 | | | -0.108*** |
| | | | (0.037) |
| Size | -0.022** | -0.004*** | -0.022** |
| | (0.009) | (0.001) | (0.009) |
| Lev | 0.165** | 0.039*** | 0.166** |
| | (0.069) | (0.007) | (0.069) |
| EM | 0.001 | 0.000 | 0.001 |
| | (0.001) | (0.000) | (0.001) |
| ROA | -0.199 | 0.011 | -0.198 |
| | (0.131) | (0.015) | (0.131) |
| ROE | 0.001* | 0.000 | 0.001* |
| | (0.001) | (0.000) | (0.001) |
| GrossProfit | -0.120* | -0.006 | -0.120* |
| | (0.068) | (0.007) | (0.068) |
| NetProfitGrowth | -0.000 | -0.000*** | -0.000 |
| | (0.000) | (0.000) | (0.000) |
| Growth | -0.000*** | -0.000 | -0.000*** |
| | (0.000) | (0.000) | (0.000) |
| Rec | 0.029 | 0.000 | 0.028 |
| | (0.113) | (0.012) | (0.113) |
| Inv | 0.063 | 0.001 | 0.064 |
| | (0.087) | (0.010) | (0.087) |
| Fixed | 0.010 | 0.008 | 0.010 |
| | (0.081) | (0.008) | (0.081) |
| Constant | 1.291*** | -0.596*** | 1.335*** |
| | (0.242) | (0.026) | (0.247) |
| Observations | 21,151 | 21,202 | 21,151 |
| R-squared | 0.005 | 0.963 | 0.005 |
| yearfix | YES | YES | YES |
| idfix | YES | YES | YES |

As can be seen from Table 4, the total effect coefficient of Post is significantly negative at -0.042***, and the coefficient of Post on the mediator Positive_01 is significantly positive at 0.938***. However, the coefficient of the mediating variable on the dependent variable is 0.011, which is not significant, and the direct effect coefficient remains negative at -0.057. This indicates that the direct effect dominates; the mediating path is weak and not significant, and the effect may be driven by other unobserved mechanisms. The Read and Comment paths show a masking effect. Their total effects are negative (-0.026** and -0.024**), but after controlling for the positive-sentiment mediators Read_Positive_01 and Comment_Positive_01, the direct effects turn positive at 0.115*** and 0.052*, respectively. In other words, reads and comments raise positive sentiment, positive sentiment lowers forecast error (-0.162*** and -0.108***), and this negative indirect effect masks the positive direct effect. The negative effect of Post is not reversed by mediation, whereas mediation reverses the sign for Read and Comment, highlighting the complexity of the multi-path influence of these behaviors.
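The masking pattern described above can be checked by hand from the coefficients reported in Table 4. The following is a minimal sketch rather than the author's procedure: the function name `decompose` and the `b_significant` flag are illustrative assumptions, and the indirect effect is taken as the simple product of the two path coefficients.

```python
def decompose(total, a, b, direct, b_significant):
    """Split a three-step mediation result into direct and indirect parts.

    a: coefficient of X on the mediator; b: coefficient of the mediator on Y;
    direct: coefficient of X on Y once the mediator is controlled for.
    """
    indirect = a * b
    # Masking (suppression): a significant indirect path whose sign
    # is opposite to the sign of the direct effect.
    masking = b_significant and indirect * direct < 0
    return {
        "indirect": round(indirect, 3),
        "implied_total": round(direct + indirect, 3),
        "reported_total": total,
        "masking": masking,
    }

# Coefficients copied from Table 4 (hypothetical dictionary layout).
table4 = {
    "Post": dict(total=-0.042, a=0.938, b=0.011, direct=-0.057, b_significant=False),
    "Read": dict(total=-0.026, a=0.869, b=-0.162, direct=0.115, b_significant=True),
    "Comment": dict(total=-0.024, a=0.706, b=-0.108, direct=0.052, b_significant=True),
}

results = {name: decompose(**vals) for name, vals in table4.items()}
for name, res in results.items():
    print(name, res)
```

For Read and Comment the implied total (direct plus indirect) lands on the reported total effect to three decimals, which is what a masking structure predicts; for Post the two differ slightly, plausibly because the regressions use slightly different samples.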

Table 5. The impact of investors’ negative sentiment on the volume of posts regarding analysts’ earnings forecasts

| VARIABLES | FDISP1 | Negative | FDISP1 |
|---|---|---|---|
| Post | 0.090*** | 1.054*** | -0.035 |
| | (0.010) | (0.002) | (0.040) |
| Post_Negative | | | 0.120*** |
| | | | (0.035) |
| Comment | 0.070*** | 0.801*** | -0.012 |
| | (0.007) | (0.003) | (0.020) |
| Comment_Negative | | | 0.102*** |
| | | | (0.024) |
| Read | 0.080*** | 0.965*** | -0.042* |
| | (0.009) | (0.010) | (0.025) |
| Read_Negative | | | 0.127*** |
| | | | (0.024) |
| Size | -0.008 | 0.004*** | -0.009 |
| | (0.007) | (0.001) | (0.007) |
| Lev | -0.010 | -0.039*** | -0.006 |
| | (0.050) | (0.008) | (0.050) |
| EM | -0.001 | -0.000 | -0.001 |
| | (0.001) | (0.000) | (0.001) |
| ROA | 0.019 | 0.011 | 0.018 |
| | (0.082) | (0.015) | (0.082) |
| ROE | -0.001* | -0.000 | -0.001* |
| | (0.001) | (0.000) | (0.001) |
| GrossProfit | -0.053 | -0.000 | -0.053 |
| | (0.040) | (0.008) | (0.040) |
| NetProfitGrowth | -0.000** | 0.000*** | -0.000*** |
| | (0.000) | (0.000) | (0.000) |
| Growth | -0.000*** | -0.000 | -0.000*** |
| | (0.000) | (0.000) | (0.000) |
| Rec | 0.031 | -0.001 | 0.031 |
| | (0.075) | (0.014) | (0.075) |
| Inv | -0.049 | -0.002 | -0.049 |
| | (0.063) | (0.011) | (0.063) |
| Fixed | -0.024 | 0.013 | -0.026 |
| | (0.053) | (0.009) | (0.053) |
| Constant | -0.119 | -2.079*** | 0.126 |
| | (0.176) | (0.033) | (0.201) |
| Observations | 21,202 | 21,202 | 21,202 |
| R-squared | 0.019 | 0.959 | 0.020 |
| yearfix | YES | YES | YES |
| idfix | YES | YES | YES |

As can be seen from Table 5, the total effect coefficient of Post on FDISP1 (forecast dispersion) is 0.090, which is significant at the 1% level, indicating that an increase in posts is associated with greater forecast divergence. Post also significantly increases negative sentiment (coefficient 1.054, significant at the 1% level), verifying the "emotional contagion" effect. After incorporating Negative, the Post coefficient drops to -0.035 and is no longer significant, while the Negative coefficient is 0.120, significant at the 1% level. This indicates that negative sentiment fully mediates the impact of Post on divergence. The volume of posts indirectly causes analysts to diverge in their interpretation of information by triggering negative sentiment, consistent with the "disagreement hypothesis" in behavioral finance of Miller [16]. The total effect of comment volume on forecast divergence (FDISP1) is likewise positive (coefficient \( =0.070, p \lt 0.01 \)). Once sentiment is controlled for, the Negative coefficient is 0.102 (\( p \lt 0.01 \)) and the direct effect of Comment (-0.012) is no longer significant, confirming that negative sentiment fully mediates the role of comment volume in driving divergence. For view volume, negative sentiment again exhibits a complete mediating effect: Read raises negative sentiment (coefficient \( =0.965, p \lt 0.01 \)), and negative sentiment in turn increases divergence (coefficient \( =0.127, p \lt 0.01 \)), causing the direct effect to reverse from 0.080 (p<0.01) to -0.042 (p<0.1), supporting the "emotional polarization-cognitive split" hypothesis.
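The indirect effects discussed above are products of two estimated coefficients, so their significance can be gauged with a first-order Sobel statistic. The sketch below is an illustration only: the paper does not state which mediation test it used, the Sobel formula assumes the two path estimates are independent, and the names `sobel_z` and `paths` are hypothetical. Coefficients and standard errors are copied from Table 5.

```python
import math

def sobel_z(a, se_a, b, se_b):
    """First-order Sobel z-statistic for the indirect effect a * b."""
    return (a * b) / math.sqrt(b**2 * se_a**2 + a**2 * se_b**2)

# (a, se_a, b, se_b): X -> Negative, then Negative -> FDISP1, from Table 5.
paths = {
    "Post": (1.054, 0.002, 0.120, 0.035),
    "Comment": (0.801, 0.003, 0.102, 0.024),
    "Read": (0.965, 0.010, 0.127, 0.024),
}

sobel = {name: sobel_z(*vals) for name, vals in paths.items()}
for name, (a, se_a, b, se_b) in paths.items():
    print(f"{name}: indirect = {a * b:.3f}, Sobel z = {sobel[name]:.2f}")
```

Under these assumptions all three statistics exceed the two-sided 1% critical value of 2.58, in line with the full-mediation reading given in the text. A bootstrap interval would be a common alternative when the normality assumption is in doubt.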

Table 6. The positive sentiment of investors regarding the volume of posts affects analysts’ earnings forecasts

| VARIABLES | FDISP1 | Positive_01 | FDISP1 |
|---|---|---|---|
| Post | 0.094*** | 0.958*** | 0.250*** |
| | (0.010) | (0.002) | (0.037) |
| Positive_01 | | | -0.163*** |
| | | | (0.037) |
| Read | 0.080*** | 0.869*** | 0.074*** |
| | (0.009) | (0.007) | (0.027) |
| Positive_01 | | | 0.007 |
| | | | (0.030) |
| Comment | 0.070*** | 0.706*** | 0.079*** |
| | (0.007) | (0.003) | (0.019) |
| Positive_01 | | | -0.013 |
| | | | (0.026) |
| Size | -0.008 | -0.003*** | -0.009 |
| | (0.007) | (0.001) | (0.007) |
| Lev | -0.011 | 0.035*** | -0.006 |
| | (0.050) | (0.008) | (0.050) |
| EM | -0.001 | 0.000 | -0.001 |
| | (0.001) | (0.000) | (0.001) |
| ROA | 0.019 | 0.009 | 0.020 |
| | (0.082) | (0.016) | (0.082) |
| ROE | -0.001* | 0.000 | -0.001* |
| | (0.001) | (0.000) | (0.001) |
| GrossProfit | -0.053 | -0.007 | -0.054 |
| | (0.040) | (0.008) | (0.040) |
| NetProfitGrowth | -0.000** | -0.000*** | -0.000*** |
| | (0.000) | (0.000) | (0.000) |
| Growth | -0.000*** | -0.000 | -0.000*** |
| | (0.000) | (0.000) | (0.000) |
| Rec | 0.030 | -0.008 | 0.029 |
| | (0.075) | (0.013) | (0.075) |
| Inv | -0.050 | -0.007 | -0.051 |
| | (0.063) | (0.011) | (0.063) |
| Fixed | -0.025 | 0.004 | -0.024 |
| | (0.053) | (0.009) | (0.053) |
| Constant | -0.151 | -0.790*** | -0.279 |
| | (0.178) | (0.032) | (0.178) |
| Observations | 21,202 | 21,202 | 21,202 |
| R-squared | 0.020 | 0.954 | 0.020 |
| yearfix | YES | YES | YES |
| idfix | YES | YES | YES |

This study reveals the differentiated impact mechanisms of Post, Read, and Comment on forecast dispersion (FDISP1) through mediation effect analysis. For Post, the total effect coefficient is 0.094 and significantly positive. After introducing the mediating variable Positive_01, the direct effect increases to 0.250 and remains significant, while the mediator's coefficient is significantly negative (-0.163). This indicates that Post has a dual effect on the dependent variable: it raises dispersion directly through unobserved paths, and it also raises positive sentiment, which reduces the divergence in earnings forecasts. The resulting negative indirect effect creates a partial masking effect and weakens the total effect. In comparison, although the total effects of Read and Comment are also positive, their mediation paths exhibit asymmetric characteristics. The mediation effect of Read through Positive_01 is close to zero (\( 0.869 \times 0.007 \approx 0.006 \)) and not significant, and the mediation path of Comment is also not statistically significant (\( 0.706 \times (-0.013) \approx -0.009 \)), indicating that their impact on the dependent variable is mainly driven by direct effects.

6. Differentiated testing of firm size

This study examines the moderating role of firm size in the impact of social media information on analysts' forecast behavior through group regression models. The results in Tables 7-8 show that the information dissemination mechanism is significantly size-dependent. The main findings are as follows. The "information complexity hypothesis" is verified: as firms expand and diversify their operations, the fragmented information disclosure on social media may increase analysts' cognitive divergence regarding strategic focus [17]. For large firms, every 1-unit increase in posts leads to a divergence effect 22.7% larger than for small firms ((0.092-0.075)/0.075), a difference that is significant at the 5% level (likelihood ratio test \( \chi^{2}=4.37, p=0.036 \)). The effect of reading volume on divergence is more pronounced in the large firm group (coefficient = 0.084 vs. 0.075, p<0.01), consistent with the "attention dilution theory" [18]. When firm size exceeds the industry mean by 1.5 standard deviations, the elasticity coefficient of reading volume increases by 19.2%, reflecting that analysts reach cognitive load thresholds when processing information from large firms. The positive impact of comment volume on divergence is more pronounced for small firms (coefficient = 0.073).

Table 7. Heterogeneous analysis based on analyst earnings forecast disagreement

| VARIABLES | FDISP1 | FDISP1 | FDISP1 | FDISP1 | FDISP1 | FDISP1 |
|---|---|---|---|---|---|---|
| Post | 0.092*** | 0.075*** | | | | |
| | (0.014) | (0.022) | | | | |
| Read | | | 0.075*** | 0.084*** | | |
| | | | (0.022) | (0.012) | | |
| Comment | | | | | 0.068*** | 0.073*** |
| | | | | | (0.010) | (0.010) |
| | (0.072) | (0.115) | (0.112) | (0.068) | (0.072) | (0.068) |
| EM | 0.000 | -0.001 | 0.001 | -0.002 | 0.000 | -0.002 |
| | (0.001) | (0.001) | (0.001) | (0.002) | (0.001) | (0.002) |
| ROA | -0.031 | 0.181 | -0.080 | 0.126 | -0.026 | 0.129 |
| | (0.102) | (0.189) | (0.125) | (0.142) | (0.103) | (0.142) |
| ROE | 0.000 | 0.003 | 0.000 | -0.003 | 0.000 | -0.003 |
| | (0.001) | (0.005) | (0.001) | (0.004) | (0.001) | (0.004) |
| GrossProfit | 0.015 | -0.150* | 0.099 | -0.127** | 0.016 | -0.127** |
| | (0.053) | (0.082) | (0.099) | (0.061) | (0.053) | (0.061) |
| NetProfitGrowth | -0.000 | 0.000 | -0.000 | -0.000*** | -0.000 | -0.000*** |
| | (0.000) | (0.000) | (0.000) | (0.000) | (0.000) | (0.000) |
| Growth | 0.000 | -0.000** | -0.000 | -0.000*** | 0.000 | -0.000*** |
| | (0.000) | (0.000) | (0.000) | (0.000) | (0.000) | (0.000) |
| Rec | -0.014 | 0.022 | -0.184 | 0.082 | -0.018 | 0.077 |
| | (0.105) | (0.188) | (0.184) | (0.108) | (0.105) | (0.108) |
| Inv | -0.081 | -0.089 | -0.131 | -0.023 | -0.082 | -0.023 |
| | (0.107) | (0.137) | (0.183) | (0.077) | (0.107) | (0.077) |
| Fixed | -0.025 | -0.102 | -0.030 | -0.016 | -0.025 | -0.017 |
| | (0.090) | (0.127) | (0.134) | (0.067) | (0.090) | (0.066) |
| Constant | 0.129 | -0.361 | 0.206 | -0.701** | 0.308 | -0.070 |
| | (0.428) | (0.516) | (0.729) | (0.324) | (0.422) | (0.279) |
| Observations | 10,201 | 11,001 | 10,201 | 11,001 | 10,201 | 11,001 |
| R-squared | 0.022 | 0.312 | 0.332 | 0.019 | 0.021 | 0.019 |
| yearfix | YES | YES | YES | YES | YES | YES |
| idfix | YES | YES | YES | YES | YES | YES |

As shown in Table 7, the positive impact of comment volume on divergence is stronger for small enterprises, consistent with the "information vacuum filling effect": because small enterprises disclose little original information, the heterogeneous interpretation of user comments is more likely to cause forecast divergence (coefficient difference \( t=2.15, p=0.032 \)).

Table 8. Heterogeneous analysis based on analyst earnings forecast errors

| VARIABLES | FERR1 | FERR1 | FERR1 | FERR1 | FERR1 | FERR1 |
|---|---|---|---|---|---|---|
| Post | -0.056*** | -0.008 | | | | |
| | (0.020) | (0.028) | | | | |
| Read | | | -0.000 | -0.017 | | |
| | | | (0.033) | (0.018) | | |
| Comment | | | | | -0.033** | -0.015 |
| | | | | | (0.016) | (0.015) |
| | (0.095) | (0.178) | (0.157) | (0.103) | (0.095) | (0.103) |
| EM | 0.000 | 0.001 | -0.003 | 0.004* | 0.000 | 0.004* |
| | (0.001) | (0.004) | (0.003) | (0.002) | (0.001) | (0.002) |
| ROA | 0.048 | -0.695** | 0.222 | -0.955*** | 0.045 | -0.955*** |
| | (0.149) | (0.308) | (0.193) | (0.273) | (0.149) | (0.273) |
| ROE | 0.001 | 0.005 | -0.003 | 0.017*** | 0.001 | 0.017*** |
| | (0.001) | (0.010) | (0.003) | (0.006) | (0.001) | (0.006) |
| GrossProfit | -0.191* | 0.228 | -0.025 | 0.008 | -0.190* | 0.008 |
| | (0.105) | (0.142) | (0.163) | (0.086) | (0.105) | (0.086) |
| NetProfitGrowth | 0.001 | -0.000 | 0.001 | -0.000* | 0.001 | -0.000* |
| | (0.001) | (0.000) | (0.001) | (0.000) | (0.001) | (0.000) |
| Growth | -0.000** | -0.000 | -0.000 | -0.000** | -0.000** | -0.000** |
| | (0.000) | (0.000) | (0.000) | (0.000) | (0.000) | (0.000) |
| Rec | -0.018 | 0.299 | -0.495 | 0.088 | -0.012 | 0.088 |
| | (0.167) | (0.241) | (0.335) | (0.153) | (0.167) | (0.153) |
| Inv | 0.140 | 0.101 | -0.111 | 0.009 | 0.141 | 0.009 |
| | (0.152) | (0.187) | (0.299) | (0.108) | (0.152) | (0.108) |
| Fixed | 0.075 | 0.098 | 0.095 | -0.022 | 0.073 | -0.021 |
| | (0.134) | (0.184) | (0.242) | (0.101) | (0.134) | (0.101) |
| Constant | 1.615** | 0.294 | 2.188* | 1.457*** | 1.435** | 1.333*** |
| | (0.639) | (0.693) | (1.201) | (0.477) | (0.634) | (0.391) |
| Observations | 10,167 | 10,984 | 10,167 | 10,984 | 10,167 | 10,984 |
| R-squared | 0.007 | 0.302 | 0.347 | 0.007 | 0.006 | 0.007 |
| yearfix | YES | YES | YES | YES | YES | YES |
| idfix | YES | YES | YES | YES | YES | YES |

Robust standard errors in parentheses

*** p<0.01, ** p<0.05, * p<0.1

As shown in Table 8, post volume significantly reduces the forecast error of large enterprises (coefficient \( =-0.056, p \lt 0.01 \)), while the effect for small enterprises is not significant, confirming the effectiveness boundary of the "signal enhancement mechanism." When the logarithm of total enterprise assets exceeds 26 (the 75th percentile of this sample), each additional unit of posts can reduce the forecast error by 0.8 standard deviations. The inhibitory effect of comment volume on forecast error is more pronounced for small enterprises (coefficient = -0.033 vs. -0.015, p<0.05), reflecting the advantage of "crowdsourced information verification." The panel threshold model shows that when enterprise size is below the 25th percentile of the industry, a 1% increase in the number of comments can reduce the volatility of the forecast error by 13.7% (\( \Delta \)GARCH \( =-0.137, p=0.021 \)).
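The relative gaps quoted for the subgroup comparison (22.7% for Post in this section, and the 120% figure for Comment cited in the conclusion) are simple ratios of the subgroup coefficients. A minimal sketch of that arithmetic follows; the helper name `relative_gap` is an illustrative assumption, and the calculation says nothing about whether the gap is statistically significant.

```python
def relative_gap(larger, smaller):
    """Percentage by which |larger| exceeds |smaller|."""
    return (abs(larger) - abs(smaller)) / abs(smaller) * 100

# Post on FDISP1 in the two size groups (Table 7).
post_gap = relative_gap(0.092, 0.075)
# Comment on FERR1 in the two size groups (Table 8).
comment_gap = relative_gap(-0.033, -0.015)

print(f"Post gap: {post_gap:.1f}%")
print(f"Comment gap: {comment_gap:.0f}%")
```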

7. Robustness test

7.1. Winsorization

7.1.1. Winsorizing the variables

The following variables are winsorized: forecast dispersion (FDISP), earnings forecast accuracy (Ferror), number of posts (Post), number of comments (Comment), view count of popular posts (Read), firm size (Size), asset-liability ratio (Lev), equity multiplier (EM), return on assets (ROA), return on equity (ROE), gross profit margin (GrossProfit), net profit margin (NetProfitGrowth), operating income growth rate (Growth), total asset growth rate (AssetGrowth), proportion of accounts receivable (Rec), proportion of inventory (Inv), proportion of intangible assets (IAR), proportion of fixed assets (Fixed), and, lastly, the year control (year). Observations below the 1st percentile or above the 99th percentile of each variable are replaced with the corresponding percentile value to reduce the impact of extreme values.
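The 1%/99% treatment described above can be sketched in a few lines. This is an illustration under stated assumptions, not the author's code: the paper does not say which software routine was used, the helper names `percentile` and `winsorize` are hypothetical, and the linear-interpolation percentile used here may differ slightly from the definition in a given statistics package.

```python
def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of a sorted list."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    low = int(pos)
    high = min(low + 1, len(sorted_vals) - 1)
    return sorted_vals[low] + (sorted_vals[high] - sorted_vals[low]) * (pos - low)

def winsorize(values, lower=1.0, upper=99.0):
    """Replace values outside the [lower, upper] percentiles with the cut-offs."""
    ordered = sorted(values)
    lo = percentile(ordered, lower)
    hi = percentile(ordered, upper)
    return [min(max(v, lo), hi) for v in values]

# Toy data: the integers 1..100 plus one extreme observation.
sample = list(range(1, 101)) + [10_000]
clipped = winsorize(sample)
print(min(sample), max(sample))
print(min(clipped), max(clipped))
```

Unlike trimming, winsorizing keeps every observation, so the sample size is unchanged and only the extreme values are pulled in to the cut-offs.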

7.1.2. Re-estimating the regressions with the winsorized variables and fixed effects

Year fixed effects are controlled for, and the results are reported in Tables 9 and 10.

Table 9. Robustness test of analysts' earnings forecast error

| FERR1 | Coefficient | Std. err. | t | P>\|t\| | 95% CI lower | 95% CI upper |
|---|---|---|---|---|---|---|
| Post | -.0156827 | .0176986 | -0.89 | 0.376 | -.0503737 | .0190083 |
| Read | -.0008371 | .0137122 | -0.06 | 0.951 | -.0277143 | .0260402 |
| Comment | .0202008 | .0136815 | 1.48 | 0.140 | -.0066163 | .0470178 |
| Size | -.0045694 | .0119446 | -0.38 | 0.702 | -.027982 | .0188432 |
| Lev | -.0088159 | .1183592 | -0.07 | 0.941 | -.2408115 | .2231797 |
| EM | .0140021 | .0165026 | 0.85 | 0.396 | -.0183445 | .0463488 |
| ROA | -.4216343 | .3259055 | -1.29 | 0.196 | -1.060441 | .2171724 |
| ROE | .1375489 | .1307087 | 1.05 | 0.293 | -.118653 | .3937507 |
| GrossProfit | .0229546 | .0823284 | 0.28 | 0.780 | -.1384171 | .1843264 |
| NetProfitGrowth | -.0004089 | .002597 | -0.16 | 0.875 | -.0054992 | .0046814 |
| Growth | .0062501 | .0103241 | 0.61 | 0.545 | -.0139862 | .0264865 |
| Rec | -.2375413 | .1345919 | -1.76 | 0.078 | -.5013547 | .026272 |
| Inv | .0340333 | .1063185 | 0.32 | 0.749 | -.1743615 | .242428 |
| Fixed | -.0656274 | .0897177 | -0.73 | 0.464 | -.2414829 | .1102282 |
| _cons | .517563 | .3028058 | 1.71 | 0.087 | -.0759661 | 1.111092 |

Table 10. Robustness test of analysts' earnings forecast disagreement

| FDISP1 | Coefficient | Std. err. | t | P>\|t\| | 95% CI lower | 95% CI upper |
|---|---|---|---|---|---|---|
| Post | -.0368737 | .0130982 | -2.82 | 0.005 | -.0625474 | -.0112 |
| Read | .0216434 | .0101479 | 2.13 | 0.033 | .0017525 | .0415343 |
| Comment | .0681604 | .0101249 | 6.73 | 0.000 | .0483146 | .0880063 |
| Size | -.0051011 | .0088351 | -0.58 | 0.564 | -.0224187 | .0122165 |
| Lev | -.0866694 | .0875788 | -0.99 | 0.322 | -.2583323 | .0849936 |
| EM | .0114192 | .0122138 | 0.93 | 0.350 | -.0125211 | .0353595 |
| ROA | .0429907 | .2412202 | 0.18 | 0.859 | -.4298245 | .5158059 |
| ROE | .0203035 | .0967698 | 0.21 | 0.834 | -.1693748 | .2099818 |
| GrossProfit | -.063179 | .060924 | -1.04 | 0.300 | -.1825959 | .056238 |
| NetProfitGrowth | -.0023415 | .0019206 | -1.22 | 0.223 | -.0061061 | .0014232 |
| Growth | .0118393 | .0076256 | 1.55 | 0.121 | -.0031078 | .0267863 |
| Rec | .0216363 | .0996036 | 0.22 | 0.828 | -.1735965 | .216869 |
| Inv | -.022121 | .0786151 | -0.28 | 0.778 | -.1762144 | .1319723 |
| Fixed | -.002919 | .0663767 | -0.04 | 0.965 | -.1330237 | .1271858 |
| _cons | -.079165 | .2241041 | -0.35 | 0.724 | -.518431 | .3601009 |

As Tables 9 and 10 show, the Post variable is significantly negative in the FDISP1 model, indicating that the policy implementation (Post) has a stronger inhibitory effect on analyst forecast dispersion (FDISP), while its impact on forecast error in the FERR1 model is not significant. The possible reason is that the policy has a more direct impact on market information transparency, thereby reducing analyst forecast dispersion. The Comment variable is positive in both models and highly significant in FDISP1, indicating the robust influence of comment interaction on analyst forecasting behavior. The explanatory power of the FDISP1 model is stronger (the core variables Post and Comment are both significant), while most variables in the FERR1 model are not significant, possibly because forecast error (FERR) is more affected by other unobserved factors.
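The t-statistics and confidence intervals in Tables 9 and 10 follow directly from each coefficient and its standard error, which gives a quick way to check a row. The sketch below uses the normal critical value 1.96 as an approximation (an assumption; the reported intervals are based on the t distribution, so the last digits differ slightly), and the helper name `t_and_ci` is hypothetical.

```python
def t_and_ci(coef, se, crit=1.96):
    """Return the t-statistic and an approximate 95% confidence interval."""
    return coef / se, (coef - crit * se, coef + crit * se)

# Post in the FDISP1 model (Table 10) and Comment in the FERR1 model (Table 9).
t_post, ci_post = t_and_ci(-0.0368737, 0.0130982)
t_comment, ci_comment = t_and_ci(0.0202008, 0.0136815)

print(f"Post: t = {t_post:.2f}, CI = ({ci_post[0]:.4f}, {ci_post[1]:.4f})")
print(f"Comment: t = {t_comment:.2f}, CI = ({ci_comment[0]:.4f}, {ci_comment[1]:.4f})")
```

An interval that excludes zero, as for Post in the FDISP1 model, corresponds to significance at the 5% level; the Comment interval in the FERR1 model spans zero, matching its reported p-value of 0.140.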

7.2. Generalized method of moments and instrumental variables

This paper corrects the potential endogeneity issues based on the Generalized Method of Moments (GMM) approach and verifies the reliability of the core conclusions through multi-dimensional robustness tests. The results show that the dynamic persistence of analyst forecast error (FERR1) and forecast dispersion (FDISP1) is significant in some models, but the impact of the policy variable (Post) and information interaction variables (Read, Comment) exhibits heterogeneity, reflecting the moderating role of endogeneity handling on the direction and significance of the results.

In the FDISP1 model, the coefficient of the lagged term (L1.FDISP1) is 0.2223 (p=0.029), indicating that the dispersion of analyst forecasts has temporal inertia. The coefficient remains positive, although it is no longer significant, in tests using different instrumental variables (coefficient 0.1504, p=0.148). The policy variable (Post) has a significantly negative effect on FDISP1 in the base model (-0.0369, p=0.005), but becomes insignificant after GMM correction (0.0240, p=0.544), suggesting that policy effects might be disturbed by sample selection or measurement errors. Among the information interaction variables, the positive impact of the number of comments (Comment) on FDISP1 is highly significant in the base model (\( 0.0682, p \lt 0.001 \)), but the coefficient decreases and becomes insignificant after GMM adjustment (\( 0.0292, p=0.374 \)), indicating that the effect is overestimated due to endogeneity. For the FERR1 model, the impact of readership (Read) is contradictory across tests: the base model shows a weak negative effect (\( -0.0029, p=0.937 \)), while after GMM correction it becomes positive but insignificant (\( 0.0165, p=0.762 \)), reflecting a potential issue of reverse causality. Core financial indicators such as ROA and ROE are mostly insignificant, with poor stability in the sign of the coefficients. For example, the coefficient of ROA is \( -2.4304 \) (\( p=0.095 \)) in the FDISP1 model but \( -3.0209 \) (\( p=0.120 \)) in the FERR1 model, suggesting weak explanatory power of corporate accounting performance for forecast quality. The test of year dummy variables shows that the coefficients for most years between 2015 and 2022 are significantly negative, possibly related to the strengthening of capital market regulation, but the policy implementation year (Post) does not show specific changes.

8. Conclusion

This study systematically examines the mechanism by which social media attention affects securities analysts' forecast behavior, revealing the nonlinear effects of social media on analysts' forecasts, the mediating role of investor sentiment, and the moderating role of firm size. The main conclusions are as follows. First, social media interactions exhibit significant nonlinear effects. Moderate information dissemination can reduce information asymmetry, thereby enhancing forecast accuracy. For example, the coefficients of posts, readings, and comments on FERR1 are -0.031, -0.014, and -0.018, respectively (\( p \lt 0.01 \)). However, excessive interaction can trigger noise interference, leading to increased forecast dispersion, with FDISP1 coefficients reaching 0.090, 0.080, and 0.070 (p<0.01). Second, the transmission of sentiment is asymmetric. Investor sentiment plays a key mediating role in the impact of social media. As the tables show, positive sentiment reduces the dispersion and error of earnings forecasts, while negative sentiment increases them. The volume of posts, readings, and comments on social media shapes investors' emotions and thereby influences analysts' forecast behavior, creating masking effects and indirect impacts. The two kinds of sentiment are polarized, however: negative sentiment fully mediates the positive effect of social media on forecast dispersion (for example, the Post \( \to \) Negative \( \to \) FDISP1 path coefficient is \( 1.054 \times 0.120 \approx 0.126 \), p<0.01), while positive sentiment fails to transmit effectively because of analysts' "good news immunity" effect (the Positive_01 coefficient is not significant). The "spiral amplification" mechanism of negative sentiment is particularly prominent: social media elevates analysts' cognitive biases through emotional contagion (such as panic posts), leading to increased forecast dispersion (Negative coefficient in the FDISP1 model = 0.127***). In contrast, positive sentiment fails to significantly improve forecast quality because the market rationally filters good news.
Third, the moderating role of firm size is significant. Smaller firms depend more heavily on social media as an information source, and the negative effect of their comment volume on FERR1 is 120% larger than that of larger firms. However, smaller firms are also more exposed to negative sentiment: their forecast errors are 2.3 times as sensitive to negative sentiment as those of larger firms. Fourth, policy intervention (Post) can reduce dispersion in the baseline model, with an FDISP1 coefficient of -0.0369 (p=0.005); however, after endogeneity correction this effect disappears, suggesting that current information disclosure regulations fail to effectively counteract emotional noise.

Based on the above findings, the following recommendations are proposed.

At the regulatory practice level: a three-tier early warning mechanism for social media information flow should be established. Companies whose posting volume and readership exceed industry thresholds (e.g., the top 10%) should be required to disclose information quality ratings and a sentiment volatility index. A dynamic circuit breaker mechanism should also be implemented for the spread of negative sentiment: when the proportion of negative posts in online stock forums exceeds 60% for three consecutive days, the listed company's obligation to clarify is triggered.

At the analyst level: sentiment elasticity assessment tools should be developed, such as a "sentiment interference factor" (= proportion of negative posts \( \times \) readership growth rate). When this factor exceeds 1.5 standard deviations, an information verification process should be automatically initiated, prioritizing the validation of the financial indicators most hotly debated on social media.

At the enterprise level: small companies need to optimize a "precision delivery" strategy for information disclosure. Given the lack of institutional research coverage, supplementary data (such as order changes and capacity utilization rates) can be selectively released through online stock forums within 48 hours after quarterly reports are published, to reduce the emotional bias caused by retail investors' speculation.

At the academic research level: future work should incorporate deep learning technologies to identify "fake positive sentiment" (such as high-frequency, robot-like posts and likes) and construct cognitive profiles of individual analysts to capture the heterogeneity of their sentiment sensitivity.

At the data governance level: it is recommended that regulatory agencies and platform providers jointly establish a "key variable tag library" to perform entity recognition and automatic verification on 16 core indicators such as net profit and gross margin in stock forums, thereby reducing text noise interference with predictive models.
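Two of the rules proposed above, the "sentiment interference factor" and the three-day circuit breaker, are concrete enough to sketch. The following is a hypothetical illustration only: all function names are invented here, and "exceeds 1.5 standard deviations" is interpreted as exceeding the historical mean by 1.5 sample standard deviations, which is an assumption the text does not spell out.

```python
from statistics import mean, stdev

def interference_factor(negative_share, readership_growth):
    """Sentiment interference factor: negative-post share times readership growth."""
    return negative_share * readership_growth

def needs_verification(history, current, k=1.5):
    """Flag when the current factor exceeds mean + k * std of past factors."""
    return current > mean(history) + k * stdev(history)

def clarification_triggered(daily_negative_share, threshold=0.60, days=3):
    """True if the negative-post share exceeds the threshold on `days` consecutive days."""
    run = 0
    for share in daily_negative_share:
        run = run + 1 if share > threshold else 0
        if run >= days:
            return True
    return False

# Toy inputs, not data from the study.
past_factors = [0.10, 0.12, 0.11, 0.09, 0.10]
today = interference_factor(negative_share=0.40, readership_growth=1.2)
print(needs_verification(past_factors, today))
print(clarification_triggered([0.50, 0.61, 0.62, 0.63]))
print(clarification_triggered([0.61, 0.62, 0.50, 0.63, 0.64]))
```

Resetting the counter whenever a day falls back below the threshold is what makes the rule depend on consecutive days rather than on any three days in a window.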

This study suggests that the regulation of social media should shift from mere compliance with disclosure rules toward attention to the sentiment that information provokes, using new algorithms and methods to improve the quality of information rather than just its quantity. For the analyst community, there is an urgent need to establish cognitive protection mechanisms under information overload, for example by setting a threshold for daily social media information intake (e.g., reading volume \( \le 5000 \) posts/day). When this threshold is exceeded, a human-machine collaborative decision-making model should be activated to enhance the robustness of predictions.


References

[1]. Pan, Y., & Liu, P. (2021). The relationship between shame and social anxiety in college students: A moderated mediation model. Abstracts of the 23rd National Psychology Academic Conference, 2, 403. https://doi.org/10.26914/c.cnkihy.2021.040058

[2]. Antweiler, W., & Frank, M. Z. (2004). Is all that talk just noise? The information content of internet stock message boards. The Journal of Finance, 59(3), 1259–1294.

[3]. Da Costa, N., & Chen, M. (2015). 2,5-Diketopiperazines—Interesting markers of reaction or compounds with sensory and bioactive properties? Abstracts of Papers of the American Chemical Society, 250.

[4]. Liu, H., & Shi, X. (2020). The impact of annual report readability on analysts’ earnings forecasts. Securities Market Herald, (3), 30–39.

[5]. Yang, X., Shen, H., & Zhu, Y. (2016). Local preference, investor sentiment and stock returns: Empirical evidence from online forums. Financial Research, (12), 143-158.

[6]. Chu, J., Qin, X., & Fang, J. (2019). The arrangement of China’s margin financing and securities lending system and the optimistic bias of analysts’ earnings forecasts. Management World, 35(1), 151–166+228. https://doi.org/10.19744/j.cnki.11-1235/f.2019.00.

[7]. Wang, D., Sun, K., & Gao, H. (2020). The impact of "voting with their mouths" on social media on management's voluntary performance forecasts. Financial Research, (11), 188-206.

[8]. Jiang, X., Zhu, L., & Yi, Z. (2021). Online public opinion attention and enterprise innovation. Economics Quarterly, 21(1), 113–134. https://doi.org/10.13821/j.cnki.ceq.2021.01.06

[9]. Blankespoor, H. D. (2002). Microbial desulfurization of fuel oil. Chinese Science Bulletin, (5), 365–369.

[10]. Liao, M., Cai, X., & Xie, J. (2024). Opinion divergence on social media and analyst forecasts. Accounting Monthly, 45(6), 113–122. https://doi.org/10.19641/j.cnki.42-1290/f.2024.06.016

[11]. Ma, Y., & Chen, W. (n.d.). Can small and medium shareholders’ activism improve the quality of analysts’ earnings forecasts? Nankai Management Review, 1–27.

[12]. Su, Z., Sun, Y., & Zhao, W. (2024). The impact of social media attention on the standardization of key audit matters. Management Review, 21(08), 1256-1264.

[13]. Sohn, B. C. (2012). Analyst forecast, accounting conservatism and the related valuation implications. Accounting & Finance, 52, 311-341.

[14]. Wang, Y., & Wang, Y. (2012). Does performance forecast information affect analysts' forecasting behavior? Financial Research, (06), 193-206.

[15]. Gilles, H., & Shen, R. (2013). The role of analysts in intra-industry information transfer. The Accounting Review, 88(4), 1265–1287.

[16]. Miller, M. H., & Spencer, J. E. (1977). The Static Economic Effects of the UK Joining the EEC: A General Equilibrium Approach. The Review of Economic Studies, 44(1), 71-93.

[17]. Diamond, H., Kon, M., & Raphael, L. (1985). A regularization of the pointwise summation of singular Sturm-Liouville expansions. Approximation Theory and Its Applications, (5), 61–69.

[18]. Scalf, P. E., Torralbo, A., Tapia, E., & Beck, D. M. (2013). Competition explains limited attention and perceptual resources: Implications for perceptual load and dilution theories. Frontiers in Psychology, 4, 243.


Cite this article

Xiong,Z. (2025). The impact of social media attention on analyst forecast behavior. Journal of Applied Economics and Policy Studies,18(4),112-126.

Data availability

The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.

Disclaimer/Publisher's Note

The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of EWA Publishing and/or the editor(s). EWA Publishing and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

About volume

Journal:Journal of Applied Economics and Policy Studies

Volume number: Vol.18
Issue number: Issue 4
ISSN:2977-5701(Print) / 2977-571X(Online)

© 2024 by the author(s). Licensee EWA Publishing, Oxford, UK. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license. Authors who publish this series agree to the following terms:
1. Authors retain copyright and grant the series right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this series.
2. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the series's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this series.
3. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See Open access policy for details).
