
Research on prediction of e-commerce repurchase behavior based on multiple fusion models
Fuzhou University, Fuzhou, Fujian, China
Abstract
With the advent of the Internet era, online shopping has become an integral part of people's lives. To support precision marketing, more and more e-commerce platforms try to predict users' repurchase behavior from massive user behavior data. Although traditional single-model prediction methods are mature, their prediction accuracy remains difficult to improve. Based on real user behavior data from Tmall, this paper compares how different model fusion methods improve prediction performance. Under-sampling is introduced to balance the samples. User behavior features are constructed from three aspects: the user, the merchant, and the user-merchant interaction. With AUC as the evaluation metric, the Soft-Voting and Stacking model fusion methods are used to integrate logistic regression, KNN, XGBoost and RandomForest, and the prediction results are produced with stratified 5-fold cross-validation. The experimental results show that the fusion models effectively improve prediction performance, raising the AUC by 0.2%~4% compared with the single models. The AUC of Soft-Voting increases by approximately 0.4% after weighting.
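To make the terms in the abstract concrete, the following is a minimal sketch of random under-sampling, (weighted) soft voting and AUC in plain Python. It is not the paper's implementation: the function names, the toy labels, the two sets of model probabilities and the weights are all hypothetical, and Stacking and stratified cross-validation are left out for brevity.

```python
import random


def undersample(labels, seed=0):
    """Return sorted indices that keep every minority-class sample and an
    equally sized random subset of the majority class (hypothetical helper)."""
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    kept = random.Random(seed).sample(majority, len(minority))
    return sorted(minority + kept)


def soft_vote(prob_lists, weights=None):
    """Fuse models by (weighted) averaging their positive-class probabilities."""
    if weights is None:
        weights = [1.0] * len(prob_lists)  # plain soft voting: equal weights
    total = sum(weights)
    n_samples = len(prob_lists[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, prob_lists)) / total
        for i in range(n_samples)
    ]


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs in which the positive
    sample receives the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (assumed for illustration only, not taken from the Tmall dataset).
labels = [1, 0, 1, 0]
model_a = [0.9, 0.6, 0.4, 0.2]  # probabilities from a hypothetical model A
model_b = [0.6, 0.3, 0.7, 0.5]  # probabilities from a hypothetical model B

fused = soft_vote([model_a, model_b])
weighted = soft_vote([model_a, model_b], weights=[1.0, 3.0])

print("AUC model A: ", auc(labels, model_a))
print("AUC fused:   ", auc(labels, fused))
print("AUC weighted:", auc(labels, weighted))
print("Balanced indices:", undersample([0, 0, 0, 0, 1, 1]))
```

In practice the weights would be tuned on validation folds rather than fixed by hand, and a library implementation of the base models and of AUC would replace these helpers.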
Keywords
model fusion, repurchase prediction, Stacking integration model, Soft-Voting, e-commerce.
Cite this article
Yuwei, J. (2023). Research on prediction of e-commerce repurchase behavior based on multiple fusion models. Applied and Computational Engineering, 2, 90-104.
Data availability
The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.
About volume
Volume title: Proceedings of the 4th International Conference on Computing and Data Science (CONF-CDS 2022)
© 2024 by the author(s). Licensee EWA Publishing, Oxford, UK. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.