Literature Review on Customer Churn Prediction in Telecom Industry

1. Introduction

Customer churn is a major challenge for the telecommunications industry: it reduces revenue and raises the cost of acquiring new customers. In today's competitive market, rising churn can lead to significant financial losses [1, 2]. Customer relationship management (CRM) helps maintain customer satisfaction and loyalty [1].

With the digitalization of telecommunications services and the rapid growth of user data in recent years, telecommunications companies increasingly demand churn prediction models that use historical customer data to identify customers who are likely to churn [3, 4]. Churn prediction lets these companies target their retention activities more effectively, which reduces marketing costs and increases profitability [5, 6].

This review summarizes the current state of research on customer churn prediction in the telecommunications industry. It compares the datasets, methods, and evaluation metrics used in previous studies and discusses open challenges and research gaps. By summarizing recent developments, it also offers guidance for industry practitioners who build churn prediction systems.

2. Data sources and characteristics

Customer churn prediction research in the telecommunications industry uses either public benchmark datasets or proprietary datasets. Public datasets are more accessible and are commonly used for model benchmarking. The IBM Telco Customer Churn dataset has 7043 records with 21 features, covering customers' demographic information, service usage, and contract information [2, 7]. The Cell2Cell dataset has around 50000 customer records from a US telecom operator [2].

Proprietary datasets vary greatly in size and complexity. Some studies analyze more than one million customer records with more than 400 features [8]. Others focus on data from specific countries or regions, e.g., China Mobile [5] or Danish telecom customers [9, 10]. These datasets contain billing history, service and data usage, and detailed call logs. They are richer in information but also raise privacy issues [1].

Typical feature categories are demographic information, service usage patterns, and contract information [3, 6]. Class imbalance is common in these datasets: the number of churned users is usually much smaller than the number of non-churned users. This imbalance biases model training toward the majority class and reduces predictive performance on churners. Resampling methods such as SMOTE [11] or cost-sensitive learning are therefore recommended [4, 12, 13].
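The interpolation idea behind SMOTE-style oversampling can be sketched in a few lines. The example below is a minimal illustration, not the implementation used in any cited study: the function name, the two toy features, and the parameter k are assumptions introduced here. Each synthetic churner is placed at a random point on the segment between a real minority sample and one of its nearest minority neighbours.

```python
import random

def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority points by interpolating between a
    minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Hypothetical toy data: [monthly_charge, tenure_months] of churned customers.
churned = [[70.0, 2.0], [85.0, 5.0], [90.0, 3.0], [75.0, 8.0]]
new_points = smote_like_oversample(churned, n_new=6)
print(len(new_points), "synthetic churners created")
```

Because every synthetic point lies between two real churners, the new samples stay inside the region already occupied by the minority class. In practice, resampling should be applied to the training split only, so that evaluation data keep the original class ratio.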

3. Methodological approaches

Telecom customer churn prediction research has used a wide variety of modeling techniques from traditional machine learning to deep learning and explainable artificial intelligence.

3.1. Traditional machine learning models

Classical models are still widely used because of their interpretability and low computational cost. Logistic regression (LR) is often used as a benchmark model for customer churn prediction, and it has been validated on structured telecom data [5, 9, 13]. Decision trees (DT) and random forests (RF) handle nonlinear relationships and feature interactions better than linear models [2, 7, 14]. Support vector machines (SVMs) are recommended for high-dimensional data, but they are computationally expensive [5]. K-nearest neighbors (KNN) offers an intuitive distance-based classification method, but it is sensitive to irrelevant features and to feature scaling [2].

3.2. Ensemble and hybrid methods

Ensemble models combine multiple base models to improve prediction accuracy. Bagging methods such as RF reduce variance, while boosting methods such as XGBoost, AdaBoost, and Gradient Boosting Machine (GBM) improve predictive ability by focusing on misclassified samples [3, 6, 15].
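How a boosting method shifts attention toward misclassified samples can be illustrated with a single round of the standard AdaBoost weight update. This is a minimal sketch with made-up labels and predictions; it assumes labels in {-1, +1} and a weak learner whose weighted error lies strictly between 0 and 1. Gradient boosting variants such as GBM and XGBoost pursue the same goal by fitting each new learner to the gradient of the loss rather than by reweighting samples.

```python
import math

def adaboost_reweight(weights, y_true, y_pred):
    """One AdaBoost round: return the learner weight alpha and the
    renormalized sample weights. Labels and predictions are in {-1, +1}."""
    # Weighted error of the weak learner (assumed to satisfy 0 < eps < 1).
    eps = sum(w for w, y, h in zip(weights, y_true, y_pred) if y != h)
    alpha = 0.5 * math.log((1.0 - eps) / eps)
    # y * h is +1 for correct predictions and -1 for mistakes, so mistakes
    # are scaled up by exp(alpha) and correct samples down by exp(-alpha).
    raw = [w * math.exp(-alpha * y * h)
           for w, y, h in zip(weights, y_true, y_pred)]
    total = sum(raw)
    return alpha, [w / total for w in raw]

# Hypothetical toy round: +1 = churn, -1 = stay; the second sample is missed.
y_true = [+1, +1, -1, -1, -1]
y_pred = [+1, -1, -1, -1, -1]
weights = [0.2] * 5
alpha, new_weights = adaboost_reweight(weights, y_true, y_pred)
print(round(alpha, 3), [round(w, 3) for w in new_weights])
```

After the update, the misclassified churner carries half of the total weight, so the next weak learner is pushed to classify it correctly.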

Hybrid methods integrate multiple algorithms and take advantage of their complementary strengths. AbdelAziz et al. proposed an ensemble that combines a deep feature-extraction CNN with a lightweight feature-extraction CNN, achieving an accuracy of 95.96% on a single dataset [3]. Ouf et al. proposed a hybrid framework using XGBoost and SMOTE-ENN resampling, which outperformed a single model [4]. Usman-Hamza et al. developed a heterogeneous multi-layer ensemble based on SMOTE to improve performance on imbalanced data [12]. Cost-sensitive ensemble methods can also address the class imbalance problem and improve recall on the minority (churn) class [11].

3.3. Deep learning models

Deep learning has attracted much attention due to its ability to model complex patterns in large datasets. Artificial neural networks (ANNs) are applied for nonlinear feature learning [2, 16]. Convolutional neural networks (CNNs) are good at modeling spatial information in structured customer data [3], while recurrent neural networks (RNNs) and BiLSTM-CNN hybrid models are good at modeling sequential and temporal churn behaviors [1, 6, 17].

These methods commonly rely on regularization, dropout, and sophisticated feature selection to prevent overfitting [1]. They also typically require larger datasets and greater computing power than classical models.

3.4. Explainable Artificial Intelligence (XAI)

The best-performing models are often also the most complex, so making them interpretable has become a research focus in recent years. Explainable AI tools such as SHAP and LIME visualize feature importance and generate local explanations for individual predictions [1, 8, 15, 18]. This transparency is valuable for business decisions and for regulatory compliance when models are applied to customer retention. Empirical studies indicate that pairing XAI with a strong predictive model makes the model more widely trusted and adopted in industry settings [1, 9].
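The kind of local explanation that SHAP targets can be illustrated by computing exact Shapley values for a very small model. The sketch below does not call the SHAP library; it enumerates all feature orderings directly, which is only feasible for a handful of features. The scoring function, its coefficients, the feature names, and the baseline customer are all hypothetical.

```python
from itertools import permutations

def shapley_values(model, x, baseline):
    """Exact Shapley attribution of model(x) - model(baseline), obtained by
    averaging each feature's marginal contribution over all orderings."""
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        previous = model(current)
        for i in order:
            current[i] = x[i]            # switch feature i to the customer's value
            value = model(current)
            phi[i] += value - previous   # marginal contribution in this ordering
            previous = value
    return [p / len(orders) for p in phi]

def churn_score(features):
    # Hypothetical risk score with one interaction term (an assumption).
    monthly_charge, tenure, support_calls = features
    return (0.02 * monthly_charge - 0.03 * tenure + 0.1 * support_calls
            + 0.001 * monthly_charge * support_calls)

customer = [90.0, 3.0, 4.0]    # hypothetical high-risk customer
baseline = [60.0, 24.0, 1.0]   # hypothetical average customer
phi = shapley_values(churn_score, customer, baseline)
for name, value in zip(["monthly_charge", "tenure", "support_calls"], phi):
    print(name, round(value, 4))
```

The attributions sum to the difference between the customer's score and the baseline score, which is the property that makes such explanations easy to communicate: each feature receives a share of the risk gap. LIME takes a different route and fits a simple surrogate model around the individual prediction.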

4. Evaluation and findings

4.1. Evaluation metrics

Telecom customer churn prediction studies typically use classification metrics such as accuracy, precision, recall, F1 score, and area under the receiver operating characteristic curve (ROC-AUC) [1, 2, 6]. Accuracy measures the overall correctness of the predictions. Precision and recall focus on positive (churn) predictions. The F1 score balances precision and recall, which makes it more informative on imbalanced datasets [4]. ROC-AUC provides a threshold-independent measure of a model's discrimination ability.
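These metrics can be computed directly from predicted labels and scores. The following sketch uses a made-up sample of ten customers, with 1 denoting churn; the function names and the 0.5 decision threshold are assumptions for illustration. ROC-AUC is computed through its pairwise interpretation: the probability that a randomly chosen churner receives a higher score than a randomly chosen non-churner.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for the positive (churn = 1) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}

def roc_auc(y_true, scores):
    """Fraction of (churner, non-churner) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical sample: 2 churners among 10 customers.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.1, 0.1, 0.05, 0.2, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

On this sample the accuracy is 0.8 while precision, recall, and F1 are all 0.5, and ROC-AUC is 0.9375. Accuracy looks acceptable even though only one of the two churners is caught, which is why the churn-focused metrics are reported alongside it.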

4.2. Performance trends

Across the literature, ensemble and hybrid methods often achieve relatively high accuracy. AbdelAziz et al. reported that an ensemble CNN achieved 95.96% accuracy on an insurance customer churn dataset, outperforming a single model [3]. Similarly, Ouf et al. showed that their XGBoost + SMOTE-ENN hybrid model outperformed individual classifiers on three telecommunications datasets [4].

Usman-Hamza et al. showed that multi-layer ensembles can further improve performance [12]. Deep learning hybrid models such as BiLSTM-CNN also perform well, especially on datasets with temporal or sequential features [1, 6, 17]. Meanwhile, traditional models such as LR and RF remain competitive on smaller or less complex datasets, where they train faster and are easier to interpret [2, 5, 13].

4.3. Common challenges

Class imbalance remains the most frequently mentioned problem [4, 11, 12]. When churned users make up a small portion of the dataset, the model tends to predict the majority class, resulting in low recall for churn prediction. To address this issue, techniques such as oversampling (SMOTE), undersampling, and cost-sensitive learning are often adopted [4, 9].
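The effect described above, and the reweighting that cost-sensitive learning applies to counter it, can be made concrete with a small calculation. The sketch below assumes a hypothetical dataset of 1000 customers with a 5% churn rate. The class weights follow the common "balanced" heuristic, n_samples / (n_classes * class_count); the function name is an assumption.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}

# Hypothetical dataset: 950 retained customers (0) and 50 churners (1).
labels = [0] * 950 + [1] * 50
weights = balanced_class_weights(labels)

# A degenerate model that always predicts the majority class.
always_stay = [0] * len(labels)
pairs = list(zip(labels, always_stay))

accuracy = sum(1 for t, p in pairs if t == p) / len(labels)
churn_recall = sum(1 for t, p in pairs if t == 1 and p == 1) / labels.count(1)

# Class-weighted accuracy: every churner now counts as much as 19 non-churners.
weighted_accuracy = (sum(weights[t] for t, p in pairs if t == p)
                     / sum(weights[t] for t in labels))

print(weights)
print(accuracy, churn_recall, weighted_accuracy)
```

The majority-class model reaches 95% accuracy with zero recall on churners, but only 50% once errors are weighted by class. A learner trained on the weighted loss therefore has no incentive to ignore the minority class.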

Another challenge is overfitting, especially for deep learning models trained on small or imbalanced datasets [1, 2, 16]. Common countermeasures include regularization, dropout, and careful hyperparameter tuning.

At the same time, interpretability is also an important issue. Even if high-performance models such as deep neural networks and gradient boosting can provide high accuracy, their decision-making process can be more difficult to explain to business stakeholders [1, 8, 18].

5. Research gaps and future directions

Research on telecom customer churn prediction has made progress, but several gaps remain.

Multimodal data integration still has limitations. Most studies rely on structured telecom records, but the accuracy of predictions can be improved by integrating external sources such as social media interactions, geolocation data, and customer service records [1, 8]. Therefore, new methods that can handle heterogeneous data types and ensure user privacy are needed.

Developing real-time customer churn prediction systems is an emerging need. Many current models are trained offline and updated infrequently, which limits their ability to detect churn risk as customer behavior changes [2, 6]. Future work therefore needs to focus on online learning and stream data processing.
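The contrast with offline training can be sketched with an online logistic regression that updates its parameters after every labelled event instead of being refit on a full batch. This is a minimal illustration under stated assumptions: the class name, the two features, the learning rate, and the synthetic event stream are all hypothetical, and a production system would also need feature scaling, drift monitoring, and delayed-label handling.

```python
import math

class OnlineLogisticRegression:
    """Logistic regression trained by one SGD step per incoming example."""

    def __init__(self, n_features, learning_rate=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x):
        z = self.b + sum(w * xi for w, xi in zip(self.w, x))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, y):
        """Score the example first, then learn from its label (y in {0, 1})."""
        p = self.predict_proba(x)
        error = p - y  # gradient of the log loss with respect to z
        self.w = [w - self.learning_rate * error * xi
                  for w, xi in zip(self.w, x)]
        self.b -= self.learning_rate * error
        return p

# Hypothetical stream: [scaled_support_calls, is_month_to_month] -> churned?
events = [([1.0, 1.0], 1), ([0.0, 0.0], 0),
          ([0.8, 1.0], 1), ([0.1, 0.0], 0)] * 200

model = OnlineLogisticRegression(n_features=2)
for features, churned in events:
    model.update(features, churned)

risky = model.predict_proba([1.0, 1.0])
safe = model.predict_proba([0.0, 0.0])
print(round(risky, 3), round(safe, 3))
```

Because each update touches only one example, the model can follow gradual changes in customer behavior without a scheduled retraining job.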

Model interpretability still needs to be given priority. Although XAI methods such as SHAP and LIME are gaining widespread attention, their integration with deep learning models is still at an early stage [1, 9, 15]. Saarela and Podgorelec emphasized that XAI is developing rapidly in healthcare and other fields, but that its adoption in the telecommunications field is still slow [18].

Privacy and ethical issues need to be addressed. Large-scale telecom datasets often involve sensitive personal information, raising privacy and ethical concerns. Therefore, secure data processing, anonymization, and compliance with data protection regulations are crucial for responsible model deployment [1, 4].

Cross-domain transfer learning can help improve predictive capabilities in markets with limited historical data. This allows models trained in one region or company to be adapted to another, but only if differences in customer behavior and service offerings can be accounted for [1, 8, 10].

6. Conclusion

Telecom customer churn prediction has evolved from simple statistical models to advanced deep learning and hybrid methods. Public datasets such as IBM Telco and Cell2Cell have enabled benchmarking, while proprietary datasets support large-scale domain-specific research.

Ensemble methods and deep learning hybrid methods often achieve the best performance, especially when combined with effective data balancing and feature engineering. However, practical deployments still face challenges such as class imbalance, overfitting, and limited interpretability.

Future research needs to focus on integrating diverse data sources, building real-time prediction systems, and improving transparency while maintaining accuracy. Aligning technological advances with business goals will help ensure that customer churn prediction models are both accurate and practical for telecom operators.

References

[1]. Shahabikargar et al., "A Comprehensive Survey on Customer Churn Analysis Studies," Journal of Information and Telecommunication, 2025.

[2]. Barsotti, A., Gianini, G., Mio, C., Lin, J., Babbar, H., Singh, A., Taher, F., Damiani, E., "A Decade of Churn Prediction Techniques in the TelCo Domain: A Survey," SN Computer Science, 5, 404, 2024.

[3]. AbdelAziz et al., "A Comprehensive Evaluation of Machine Learning and Deep Learning Models for Churn Prediction," Information, 16, 537, 2025.

[4]. Ouf, S., Mahmoud, K.T., Abdel-Fattah, M.A., "A Proposed Hybrid Framework to Improve the Accuracy of Customer Churn Prediction in Telecom Industry," Journal of Big Data, 11, 70, 2024.

[5]. Zhang, T., Moro, S., Ramos, R.F., "A Data-Driven Approach to Improve Customer Churn Prediction Based on Telecom Customer Segmentation," Future Internet, 14(3), 94, 2022.

[6]. Sudharsan, R., Ganesh, E.N., "A Swish RNN Based Customer Churn Prediction for the Telecom Industry with a Novel Feature Selection Strategy," Connection Science, 34(1), 1855–1876, 2022.

[7]. Lu, N., Lin, H., Lu, J., Zhang, G., "A Customer Churn Prediction Model in Telecom Industry Using Boosting," IEEE Transactions on Industrial Informatics, 10(2), 1659–1666, 2014.

[8]. Li, W., Zhou, C., "Customer Churn Prediction in Telecom Using Big Data Analytics," IOP Conf. Series: Materials Science and Engineering, 768, 052070, 2020.

[9]. Wagh, S.K., Andhale, A., Wagh, K.S., Pansare, J.R., Ambadekar, S.P., Gawande, S.H., "Customer Churn Prediction in Telecom Sector Using Machine Learning Techniques," Results in Control and Optimization, 14, 100342, 2024.

[10]. Saleh, S., Saha, S., "Customer Retention and Churn Prediction in the Telecommunication Industry: A Case Study on a Danish University," SN Applied Sciences, 5, 173, 2023.

[11]. Ahmad, A.K., Jafar, A., Aljoumaa, K., "Customer Churn Prediction in Telecom Using Machine Learning in Big Data Platform," Journal of Big Data, 6(1), 1–26, 2019.

[12]. Usman-Hamza, F.E., Balogun, A.O., Amosa, R.T., Capretz, L.F., Mojeed, H.A., Salihu, S.A., Akintola, A.G., Mabayoje, M.A., "Sampling-Based Novel Heterogeneous Multi-Layer Stacking Ensemble Method for Telecom Customer Churn Prediction," Scientific African, 24, e02223, 2024.

[13]. Yuan, X., "Telecom Customer Churn Prediction in Context of Composite Model," Second International Conference on Statistics, Applied Mathematics and Computing Science (CSAMCS 2022), SPIE, 2023.

[14]. Chang, V., Hall, K., Xu, Q.A., Amao, F.O., Ganatra, M.A., Benson, V., "Prediction of Customer Churn Behavior in the Telecommunication Industry Using Machine Learning Models," Algorithms, 17(7), 231, 2024.

[15]. Poudel, S.S., Pokharel, S., Timilsina, M., "Explaining Customer Churn Prediction in Telecom Industry Using Tabular Machine Learning Models," Machine Learning with Applications, 17, 100567, 2024.

[16]. Fujo, S.W., Subramanian, S., Khder, M.A., "Customer Churn Prediction in Telecommunication Industry Using Deep Learning," Information Sciences Letters, 11(1), 185–198, 2022.

[17]. Khattak, A., Mehak, Z., Ahmad, H., Asghar, M.U., Asghar, M.Z., Khan, A., "Customer Churn Prediction Using Composite Deep Learning Technique," Scientific Reports, 13, 12345, 2023.

[18]. Saarela, M., Podgorelec, V., "Recent Applications of Explainable AI (XAI): A Systematic Literature Review," Applied Sciences, 14, 8884, 2024.

Cite this article

Liu, S. (2025). Literature Review on Customer Churn Prediction in Telecom Industry. Theoretical and Natural Science, 132, 27-32.

Data availability

The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.

About volume

Volume title: Proceedings of CONF-APMM 2025 Symposium: Simulation and Theory of Differential-Integral Equation in Applied Physics

ISBN: 978-1-80590-305-5 (Print) / 978-1-80590-306-2 (Online)

Editor: Marwan Omar, Shuxia Zhao

Conference website: https://www.confapmm.org/dalian.html

Conference date: 27 September 2025

Series: Theoretical and Natural Science

Volume number: Vol.132

ISSN: 2753-8818 (Print) / 2753-8826 (Online)

© 2024 by the author(s). Licensee EWA Publishing, Oxford, UK. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license. Authors who publish this series agree to the following terms:
1. Authors retain copyright and grant the series right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this series.
2. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the series's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this series.
3. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See Open access policy for details).