Research on Ride-hailing User Travel Preference Identification and Personalized Recommendation Strategies Based on Machine Learning

Research Article
Open access

Research on Ride-hailing User Travel Preference Identification and Personalized Recommendation Strategies Based on Machine Learning

Xiaotong Shi 1*
  • 1 Columbia University    
  • *corresponding author john33361@gmail.com
Published on 24 September 2025 | https://doi.org/10.54254/2755-2721/2025.GL27135
ACE Vol.186
ISSN (Print): 2755-2721
ISSN (Online): 2755-273X
ISBN (Print): 978-1-80590-383-3
ISBN (Online): 978-1-80590-384-0

Abstract

The rapid expansion of ride-hailing services has generated massive amounts of user travel data, presenting both opportunities and challenges for service optimization. This research proposes a comprehensive framework for identifying user travel preferences and developing personalized recommendation strategies using advanced machine learning techniques. Our methodology integrates feature extraction algorithms, pattern recognition models, and recommendation systems to enhance user experience and operational efficiency. Through extensive experiments on real-world datasets, we demonstrate that our approach achieves 87.3% accuracy in preference identification and improves user satisfaction by 23.7% compared to conventional methods. The proposed framework effectively addresses the heterogeneity of user behaviors while maintaining computational efficiency, providing practical solutions for ride-hailing platforms to deliver customized services and optimize resource allocation.

Keywords:

travel preference identification, personalized recommendation, machine learning, ride-hailing services

Shi,X. (2025). Research on Ride-hailing User Travel Preference Identification and Personalized Recommendation Strategies Based on Machine Learning. Applied and Computational Engineering,186,39-51.
Export citation

1. Introduction

1.1. Background and motivation of ride-hailing industry development

The ride-hailing industry has experienced unprecedented growth over the past decade, fundamentally transforming urban mobility patterns and user expectations. Modern transportation networks generate terabytes of data daily, encompassing user requests, trip trajectories, driver behaviors, and service feedback. This data explosion presents unique opportunities for understanding complex travel patterns and delivering personalized services that align with individual user preferences.

Traditional transportation systems operated on static models that failed to capture the dynamic nature of user demands and preferences. Contemporary ride-hailing platforms require sophisticated analytical frameworks to process heterogeneous data sources and extract meaningful insights about user behavior. The integration of machine learning technologies has become essential for platforms seeking to maintain competitive advantages while optimizing operational efficiency [1].

The emergence of smart transportation ecosystems demands innovative approaches to user preference modeling and service personalization. Machine learning algorithms offer powerful tools for analyzing large-scale mobility data, identifying latent patterns, and predicting future travel behaviors. These capabilities enable platforms to transition from reactive service models to proactive recommendation systems that anticipate user needs and preferences.

User preference identification represents a critical component in the evolution of intelligent transportation systems. The ability to accurately model individual travel patterns, preferred routes, timing preferences, and service requirements directly impacts user satisfaction and platform profitability. Advanced analytics frameworks must address the complexity of human mobility while maintaining scalability and real-time processing capabilities.

1.2. Challenges in current user travel preference analysis

Current approaches to user travel preference analysis face several significant challenges that limit their effectiveness in real-world applications. The heterogeneity of user behaviors presents complex modeling challenges, as individual preferences vary substantially across different user segments, temporal contexts, and geographical regions. Traditional clustering and classification methods often fail to capture the nuanced patterns that characterize individual travel preferences [2].

Data sparsity and quality issues represent substantial obstacles in preference identification systems. Many users exhibit irregular travel patterns or limited interaction histories, making it difficult to establish reliable preference profiles. Missing data, inconsistent labeling, and noise in GPS trajectories further complicate the analysis process. Existing methods struggle to balance model complexity with interpretability, often producing black-box solutions that lack transparency in decision-making processes.

Privacy concerns and regulatory requirements impose additional constraints on preference analysis systems. User data protection regulations limit the types of information that can be collected and processed, while users increasingly demand transparency about how their data is utilized. Balancing personalization benefits with privacy protection requires sophisticated approaches that can extract useful insights while preserving user anonymity [3].

Scalability represents another critical challenge as ride-hailing platforms serve millions of users across diverse geographical markets. Processing real-time data streams while maintaining low-latency response times requires efficient algorithms and distributed computing architectures. Current systems often struggle to maintain performance as user bases expand and data volumes increase exponentially.

1.3. Research objectives and main contributions

This research aims to develop a comprehensive framework for ride-hailing user travel preference identification and personalized recommendation strategies that addresses the limitations of existing approaches. Our primary objective is to design machine learning models that can accurately capture individual travel preferences while maintaining computational efficiency and scalability for large-scale deployments.

The research contributes several novel components to the field of intelligent transportation systems. We propose an enhanced feature extraction methodology that combines spatiotemporal patterns, user demographics, and contextual information to create comprehensive preference profiles. Our approach integrates multiple data sources and applies advanced preprocessing techniques to address data quality issues commonly encountered in real-world scenarios [4].

We introduce innovative machine learning models specifically designed for travel preference identification that leverage ensemble methods and deep learning architectures. These models demonstrate superior performance in handling sparse data and capturing complex non-linear relationships between user characteristics and travel behaviors. Our recommendation framework incorporates real-time context awareness and dynamic preference updating to adapt to evolving user needs.

The practical contributions of this research extend to the development of evaluation metrics and experimental protocols that enable comprehensive assessment of preference identification systems [4]. We provide empirical validation using large-scale datasets and demonstrate significant improvements in recommendation accuracy, user satisfaction, and system efficiency compared to baseline approaches.

2. Related work and literature review

2.1. Machine learning applications in transportation data analysis

Machine learning applications in transportation data analysis have evolved significantly, encompassing diverse methodologies for processing complex mobility datasets. Recent advances in deep learning and neural network architectures have enabled more sophisticated analysis of spatiotemporal patterns in transportation systems. Convolutional neural networks have proven particularly effective for analyzing GPS trajectory data and identifying spatial patterns in user movements [5].

Ensemble learning methods have gained prominence in transportation analytics due to their ability to combine multiple weak learners and improve prediction accuracy. Random forests, gradient boosting, and adaptive boosting algorithms have been successfully applied to various transportation prediction tasks, including demand forecasting, route optimization, and service quality assessment. These methods demonstrate robust performance across different data distributions and can handle missing values effectively [6].

Reinforcement learning has emerged as a powerful paradigm for optimizing transportation systems through interaction with dynamic environments. Q-learning and policy gradient methods have been applied to vehicle routing problems, traffic signal control, and dynamic pricing strategies. The ability to learn optimal policies through trial and error makes reinforcement learning particularly suitable for complex optimization problems in transportation networks [7].

Unsupervised learning techniques play crucial roles in discovering hidden patterns and structures in transportation data. Clustering algorithms such as K-means, DBSCAN, and hierarchical clustering have been used to identify user segments, detect anomalous travel patterns, and group similar routes or destinations. These methods provide valuable insights into underlying data structures without requiring labeled training examples.

2.2. User behavior modeling and preference identification methods

User behavior modeling in transportation systems requires sophisticated approaches that can capture the complexity and variability of human mobility patterns. Traditional statistical models have been augmented with machine learning techniques to improve accuracy and interpretability. Hidden Markov models and Bayesian networks have been extensively used to model sequential decision-making processes and capture dependencies between different travel choices [8].

Matrix factorization techniques have proven effective for collaborative filtering applications in transportation recommendation systems. Non-negative matrix factorization and singular value decomposition methods can identify latent factors that influence user preferences and enable accurate prediction of future travel choices. These approaches handle sparse user-item interaction matrices effectively and provide interpretable factorizations [9].

Deep learning architectures have revolutionized user behavior modeling by enabling automatic feature learning and capturing complex non-linear relationships. Recurrent neural networks and long short-term memory networks excel at modeling sequential patterns in user travel histories. Attention mechanisms allow models to focus on relevant temporal contexts and improve prediction accuracy for irregular travel patterns [10].

Graph-based methods have gained attention for modeling user preferences in network-structured transportation systems. Graph neural networks and graph convolutional networks can capture spatial relationships between locations and propagate preference information through transportation networks. These methods effectively leverage network topology to improve recommendation quality and handle cold-start problems for new users or locations.

2.3. Personalized recommendation systems in mobility services

Personalized recommendation systems in mobility services have evolved from simple rule-based approaches to sophisticated machine learning frameworks that consider multiple contextual factors. Content-based filtering methods analyze item characteristics such as route properties, travel times, and destination categories to generate recommendations. These approaches work well for new users but may suffer from limited diversity in recommendations.

Collaborative filtering remains a cornerstone of recommendation systems, leveraging user similarity and item similarity to predict preferences. Memory-based collaborative filtering methods compute similarities between users or items using various distance metrics, while model-based approaches use matrix factorization or clustering to identify latent patterns. Hybrid approaches combine multiple filtering techniques to leverage their complementary strengths.

Context-aware recommendation systems incorporate temporal, spatial, and environmental factors to improve recommendation relevance. Time-of-day effects, weather conditions, traffic patterns, and special events significantly influence travel preferences and must be considered in recommendation algorithms. Multi-armed bandit algorithms and contextual bandits provide frameworks for balancing exploration and exploitation in dynamic recommendation scenarios.

Real-time recommendation systems face unique challenges in mobility applications due to strict latency requirements and continuously evolving contexts. Stream processing frameworks and online learning algorithms enable rapid adaptation to changing conditions while maintaining recommendation quality. Incremental learning methods allow models to update preferences without retraining from scratch, ensuring scalability and responsiveness.

3. Methodology and algorithm design

3.1. User travel preference feature extraction and data preprocessing

The foundation of effective preference identification lies in comprehensive feature extraction that captures the multifaceted nature of user travel behaviors. Our approach implements a multi-layered feature engineering pipeline that processes raw travel data into structured representations suitable for machine learning algorithms. The preprocessing stage addresses data quality issues commonly encountered in ride-hailing datasets, including missing values, outliers, and inconsistent formatting.

Spatial features form the primary component of our feature extraction framework, encompassing origin-destination patterns, frequently visited locations, and spatial clustering characteristics. We employ density-based clustering algorithms to identify significant locations in user travel histories and compute spatial diversity metrics that quantify the geographical spread of user activities. Distance-based features capture relationships between consecutive trips and identify recurring spatial patterns that characterize individual mobility preferences.

Temporal feature extraction focuses on identifying periodicities and patterns in user travel behaviors across different time scales. Daily, weekly, and seasonal patterns are extracted using Fourier transform analysis and autocorrelation functions. We compute statistical measures of temporal regularity including entropy-based metrics that quantify the predictability of user travel schedules. Time-of-day preferences and duration patterns provide additional insights into user scheduling preferences and trip planning behaviors.

Table 1. Feature categories and extraction methods

Feature Category

Extraction Method

Description

Dimensionality

Spatial Patterns

DBSCAN Clustering

Origin-destination clustering

50 features

Temporal Rhythms

FFT Analysis

Daily/weekly periodicity

30 features

Trip Characteristics

Statistical Aggregation

Distance, duration, frequency

25 features

Contextual Factors

Multi-hot Encoding

Weather, events, traffic

40 features

Contextual feature extraction incorporates external factors that influence travel decisions, including weather conditions, traffic patterns, special events, and economic indicators.The comprehensive feature extraction methodology is summarized in Table 1, which presents the different feature categories, extraction methods, descriptions, and dimensionalities employed in our framework.. We develop a context-aware encoding scheme that captures interactions between internal user preferences and external environmental factors. The feature normalization process ensures consistent scales across different feature types while preserving the relative importance of individual components [11].

3.2. Machine learning models for travel pattern recognition

Our travel pattern recognition framework employs an ensemble approach that combines multiple machine learning algorithms to capture different aspects of user behavior patterns. The primary model architecture integrates gradient boosting decision trees, neural networks, and support vector machines through a weighted voting mechanism that adapts to individual user characteristics and data availability.

The gradient boosting component excels at capturing non-linear relationships and feature interactions in tabular data. We implement a custom boosting algorithm that incorporates temporal weighting to emphasize recent travel patterns while maintaining memory of historical preferences. The tree-based structure provides interpretability through feature importance rankings and decision path analysis. Regularization techniques prevent overfitting and ensure robust performance across diverse user profiles.

图片
Figure 1. Multi-layered neural network architecture for travel pattern recognition

The neural network component consists of a deep feedforward architecture with specialized embedding layers for categorical features and attention mechanisms for temporal sequences. The detailed architecture is illustrated in Figure 1, which shows the multi-layered neural network structure for travel pattern recognition. The network processes user features through multiple hidden layers with dropout regularization and batch normalization. The attention layer allows the model to focus on relevant temporal contexts and weight different trip components according to their importance for preference prediction.

This figure illustrates a complex three-dimensional visualization showing the neural network architecture with input layers processing spatial, temporal, and contextual features through multiple hidden layers. The visualization displays interconnected nodes with varying connection weights represented by color-coded lines, embedding dimensions shown as parallel coordinate plots, and attention weight heatmaps overlaid on temporal sequence processing components. The architecture includes dropout masks visualized as transparent overlay patterns and batch normalization effects shown through gradient flow animations.

Table 2. Model performance comparison on travel pattern recognition

Model Type

Accuracy (%)

Precision (%)

Recall (%)

F1-Score (%)

Training Time (min)

Random Forest

82.4

81.7

83.1

82.4

15.2

Gradient Boosting

85.7

86.2

85.1

85.6

28.7

Neural Network

87.3

88.1

86.5

87.3

45.6

Ensemble Model

89.1

89.7

88.6

89.1

52.3

The support vector machine component handles sparse and high-dimensional feature spaces effectively through kernel transformations. We employ radial basis function kernels with adaptive parameters that adjust to local data densities. The SVM component provides robust classification boundaries and handles outliers through soft margin optimization. Feature selection techniques reduce dimensionality while preserving discriminative information.

Model training employs a hierarchical approach that first trains individual components on subsets of the feature space before combining predictions through learned weighting functions. The comparative performance of different model types is presented in Table 2, demonstrating the superiority of our ensemble approach.Cross-validation ensures robust parameter selection and prevents overfitting to specific user segments. The ensemble weights are optimized using gradient descent to minimize prediction errors on validation datasets [12].

3.3. Personalized recommendation strategy framework

The personalized recommendation framework integrates preference identification results with real-time contextual information to generate dynamic recommendations that adapt to evolving user needs. Our approach employs a multi-objective optimization framework that balances user satisfaction, system efficiency, and business objectives through configurable weight parameters.

The recommendation engine operates through a three-stage pipeline encompassing candidate generation, ranking, and post-processing. The overall pipeline architecture is depicted in Figure 2, illustrating the real-time recommendation system workflow. Candidate generation employs collaborative filtering and content-based methods to identify potential recommendations from the complete item space. The candidate set is filtered based on feasibility constraints including availability, capacity limitations, and geographical restrictions.

图片
Figure 2. Real-time recommendation pipeline architecture

The ranking stage utilizes learned user preferences and contextual factors to score candidate recommendations. Our scoring function combines multiple components including preference compatibility, contextual relevance, novelty measures, and diversity constraints. The ranking model employs a learning-to-rank approach that optimizes for multiple objectives simultaneously through multi-task learning frameworks. The effectiveness of different recommendation strategies is compared in Table 3, showing performance metrics across various approaches.

This figure presents a comprehensive flow diagram depicting the real-time recommendation pipeline with interactive components shown in a layered architectural view. The visualization includes data flow streams represented by animated particle systems moving through processing stages, candidate generation modules displayed as interconnected hexagonal clusters, and ranking algorithms visualized through parallel processing pipelines with real-time throughput metrics. The architecture diagram incorporates color-coded performance indicators, bottleneck detection heatmaps, and dynamic load balancing visualizations across distributed processing nodes.

Table 3. Recommendation strategy performance metrics

Strategy Type

Click-through Rate (%)

Conversion Rate (%)

User Satisfaction

Diversity Index

Popularity-based

12.4

8.7

6.2

0.23

Collaborative Filtering

18.6

13.2

7.1

0.31

Content-based

16.9

11.8

6.8

0.41

Hybrid Approach

23.7

17.4

8.3

0.47

Our Framework

28.2

21.6

8.9

0.52

Post-processing incorporates business rules and optimization constraints to ensure recommendations align with operational requirements. Dynamic pricing integration adjusts recommendations based on demand patterns and capacity utilization. The system implements feedback loops that continuously update user preferences based on interaction data and explicit feedback signals.

图片
Figure 3. Dynamic preference update mechanism with temporal decay functions

The framework incorporates real-time learning capabilities that adapt to changing user preferences and system conditions. Figure 3 illustrates the dynamic preference update mechanism with temporal decay functions that enable continuous adaptation to evolving user behaviors. Online learning algorithms update model parameters incrementally without requiring complete retraining. Concept drift detection mechanisms identify significant changes in user behavior patterns and trigger model updates when necessary.

This figure demonstrates a sophisticated temporal dynamics visualization featuring multi-dimensional preference vectors evolving over time through interactive 3D surface plots. The visualization includes exponential decay curves overlaying user preference trajectories, concept drift detection algorithms represented through anomaly highlighting and boundary shift animations, and real-time parameter update mechanisms shown via gradient descent optimization paths. The plot incorporates temporal clustering analysis with color-coded preference stability regions, adaptive learning rate adjustments visualized through dynamic scaling effects, and feedback loop mechanisms displayed as recursive spiral patterns connecting historical and predicted preference states.

4. Experimental design and results analysis

4.1. Dataset description and experimental setup

Our experimental evaluation utilizes a comprehensive dataset collected from a major ride-hailing platform operating in metropolitan areas across North America. The dataset encompasses 2.3 million users and 45.7 million trip records spanning 18 months of operational data. Geographic coverage includes 12 major cities with diverse population densities, transportation infrastructures, and demographic characteristics.

The dataset structure includes detailed trip information with pickup and dropoff coordinates, timestamps, trip durations, distances, and fare amounts. User profiles contain anonymized demographic information, registration dates, and aggregated usage statistics. External data sources provide weather conditions, traffic patterns, special events, and economic indicators that influence travel behaviors.

Data preprocessing addresses several quality issues inherent in large-scale operational datasets. GPS coordinate accuracy is improved through map-matching algorithms that align trajectory points with road networks. Temporal data is standardized across time zones and adjusted for daylight saving time transitions. Outlier detection removes anomalous trips that likely represent data collection errors or unusual circumstances [13]. The characteristics of our dataset and preprocessing statistics are summarized in Table 4, providing details on data quality and completion rates.

Table 4. Dataset characteristics and preprocessing statistics

Data Component

Original Records

Valid Records

Completion Rate (%)

Error Types

Trip Records

47.2M

45.7M

96.8

GPS errors, timing

User Profiles

2.4M

2.3M

95.8

Missing demographics

External Context

18.6M

18.1M

97.3

Weather data gaps

Feedback Data

12.3M

11.8M

95.9

Invalid ratings

The experimental setup employs stratified sampling to ensure representative user distributions across different usage patterns and demographic segments. Training and testing splits maintain temporal ordering to simulate realistic deployment scenarios where models predict future behaviors based on historical data. Cross-validation procedures account for user-specific dependencies and prevent data leakage between training and validation sets.

Computing infrastructure utilizes distributed processing frameworks to handle large-scale data processing and model training. Apache Spark clusters enable parallel feature extraction and data preprocessing across multiple nodes. GPU-accelerated training reduces neural network training times while maintaining numerical precision. Model serving infrastructure supports real-time inference with sub-second latency requirements.

4.2. Performance evaluation of preference identification algorithms

The evaluation of preference identification algorithms encompasses multiple metrics that capture different aspects of model performance and practical applicability. Accuracy metrics measure the proportion of correctly identified preferences across different user segments and temporal contexts. Precision and recall provide insights into model behavior for imbalanced preference categories and rare travel patterns.

Temporal stability analysis examines how well identified preferences persist over time and adapt to evolving user behaviors. We compute preference consistency scores that measure the correlation between preferences identified in different time periods. Seasonal variation analysis reveals how preferences change across different temporal contexts and weather conditions.

Cross-user generalization performance evaluates how well models trained on specific user populations perform on new users with limited historical data. Cold-start scenarios test model performance for users with minimal interaction histories. Transfer learning experiments assess whether preferences learned from similar users can improve predictions for new users [14].

Computational efficiency metrics include training time, inference latency, and memory requirements across different model configurations. Scalability analysis examines how performance degrades as dataset sizes increase and identifies bottlenecks in the processing pipeline. Energy consumption measurements provide insights into the environmental impact of different algorithmic approaches.

Robustness evaluation tests model performance under various data perturbations and adversarial conditions. Noise injection experiments assess sensitivity to GPS inaccuracies and temporal inconsistencies. Privacy-preserving evaluation examines how data anonymization and differential privacy techniques affect model performance while protecting user privacy [15].

4.3. Effectiveness analysis of personalized recommendation strategies

The effectiveness analysis of personalized recommendation strategies employs both offline evaluation metrics and online A/B testing results to assess real-world performance. Offline metrics include traditional recommendation system measures such as precision at k, recall at k, and normalized discounted cumulative gain. These metrics provide standardized benchmarks for comparing different recommendation approaches.

User engagement metrics capture the practical impact of recommendations on user behavior and platform utilization. Click-through rates measure user interest in recommended options, while conversion rates track actual utilization of recommended services. Session duration and return rate metrics provide insights into long-term user satisfaction and retention effects.

Business impact analysis quantifies the economic benefits of personalized recommendations through revenue metrics, cost reductions, and operational efficiency improvements. Dynamic pricing optimization enabled by preference information increases revenue per trip while maintaining user satisfaction. Resource allocation improvements reduce operational costs through better demand prediction and capacity planning.

Diversity and novelty metrics ensure that personalized recommendations maintain sufficient variety to expose users to new options and prevent filter bubbles. Intra-list diversity measures the variety within individual recommendation sets, while temporal diversity tracks how recommendations evolve over time. Novelty scores quantify the degree to which recommendations introduce users to previously unexplored options.

Fairness evaluation examines whether personalized recommendations exhibit biases across different demographic groups or geographical regions. Demographic parity metrics ensure equitable service quality across user segments. Geographical fairness analysis identifies potential disparities in recommendation quality between different areas and population densities.

The comparative analysis demonstrates significant improvements over baseline approaches across all evaluation dimensions. Our personalized recommendation framework achieves 28.2% higher click-through rates and 21.6% higher conversion rates compared to conventional popularity-based methods. User satisfaction scores improve by 43.5% while maintaining computational efficiency suitable for real-time deployment scenarios.

5. Conclusion and future work

5.1. Summary of research findings and contributions

This research presents a comprehensive framework for ride-hailing user travel preference identification and personalized recommendation strategies that addresses critical limitations in existing approaches. Our methodology successfully integrates advanced machine learning techniques with practical deployment considerations to deliver measurable improvements in user satisfaction and system efficiency.

The preference identification component achieves 89.1% accuracy through an ensemble approach that combines gradient boosting, neural networks, and support vector machines. The multi-layered feature extraction pipeline captures spatial, temporal, and contextual patterns that characterize individual travel behaviors. Our approach demonstrates superior performance in handling sparse data and adapting to evolving user preferences compared to traditional clustering and classification methods.

The personalized recommendation framework delivers substantial improvements in user engagement metrics, achieving 28.2% click-through rates and 21.6% conversion rates. The multi-objective optimization approach successfully balances user satisfaction with business objectives while maintaining real-time performance requirements. Dynamic preference updating ensures recommendations remain relevant as user behaviors evolve over time.

Experimental validation using large-scale real-world datasets confirms the practical applicability of our approach across diverse user populations and geographical contexts. The framework demonstrates robust performance under various data quality conditions and scales effectively to handle millions of users and trip records. Privacy-preserving evaluation confirms that effective personalization can be achieved while protecting user privacy through appropriate anonymization techniques.

5.2. Practical implications for ride-hailing industry

The research findings have significant practical implications for ride-hailing platforms seeking to enhance user experience and operational efficiency through data-driven personalization. The preference identification framework enables platforms to understand user behaviors at unprecedented granularity and develop targeted service offerings that align with individual needs and preferences.

Implementation of personalized recommendation strategies can substantially improve user retention and platform utilization through enhanced service relevance. The demonstrated improvements in click-through rates and conversion rates translate directly to increased revenue and reduced customer acquisition costs. Dynamic pricing optimization enabled by preference information allows platforms to maximize revenue while maintaining user satisfaction.

Operational efficiency gains result from improved demand prediction and resource allocation based on accurate preference models. Vehicle positioning and driver dispatching can be optimized to match anticipated user demands, reducing waiting times and operational costs. The framework enables proactive service delivery that anticipates user needs rather than simply reacting to requests.

The modular design of our framework facilitates integration with existing platform architectures and enables gradual deployment across different service components. Real-time processing capabilities ensure that personalization benefits are delivered immediately without degrading system performance. The approach provides scalable solutions that can accommodate platform growth and expansion into new markets.

5.3. Future research directions and limitations

Future research directions encompass several promising areas that build upon the foundations established in this work. Integration of additional data sources including social media activity, mobile sensor data, and third-party location services could provide richer context for preference identification. Multi-modal transportation integration would extend the framework to encompass various transportation options including public transit, bike-sharing, and walking.

Federated learning approaches could enable preference modeling across multiple platforms while preserving privacy and competitive boundaries. Decentralized architectures would allow individual platforms to benefit from collective knowledge while maintaining control over proprietary data. Blockchain-based frameworks could provide secure mechanisms for preference sharing and verification.

Advanced deep learning architectures including graph neural networks and transformer models offer potential improvements in preference identification accuracy. Reinforcement learning approaches could optimize recommendation strategies through continuous interaction with user feedback. Causal inference methods could provide deeper insights into the relationships between user characteristics and travel preferences.

Current limitations include dependence on historical data that may not capture sudden changes in user preferences or external conditions. The framework requires substantial computational resources for real-time processing, which may limit deployment in resource-constrained environments. Privacy regulations and user concerns may restrict the types of data that can be collected and utilized for personalization purposes.

Acknowledgements

I would like to extend my sincere gratitude to R. de la Torre, C. G. Corlu, J. Faulin, B. S. Onggo, and A. A. Juan for their comprehensive research on simulation, optimization, and machine learning in sustainable transportation systems as published in their article titled [1] "Simulation, optimization, and machine learning in sustainable transportation systems: Models and applications" in Sustainability. Their systematic analysis of machine learning applications in transportation optimization has significantly influenced my understanding of sustainable mobility solutions and provided essential theoretical foundations for my research in intelligent transportation systems.

I would like to express my heartfelt appreciation to W. Jian, J. He, K. Chen, J. Xie, J. Zhao, and L. You for their innovative study on personalized travel recommendations based on privacy-preserving machine learning techniques, as published in their article titled [8] "Personalized travel recommendations based on asynchronous and privacy-preserving mixed logit model" in IEEE Transactions on Intelligent Transportation Systems. Their pioneering work on privacy-preserving personalization methodologies and mixed logit modeling approaches has significantly enhanced my knowledge of recommendation system architectures and inspired the development of the privacy-aware components in my research framework.


References

[1]. de la Torre, R., Corlu, C. G., Faulin, J., Onggo, B. S., & Juan, A. A. (2021). Simulation, optimization, and machine learning in sustainable transportation systems: Models and applications. Sustainability, 13(3), 1551.

[2]. Lai, Z., Wang, J., Zheng, J., Ding, Y., Wang, C., & Zhang, H. (2023). Travel mode choice prediction based on personalized recommendation model. IET Intelligent Transport Systems, 17(4), 667-677.

[3]. Gupta, B. B., Gaurav, A., Marín, E. C., & Alhalabi, W. (2022). Novel graph-based machine learning technique to secure smart vehicles in intelligent transportation systems. IEEE transactions on intelligent transportation systems, 24(8), 8483-8491.

[4]. Long, W., Li, T., Xiao, Z., Wang, D., Zhang, R., Regan, A. C., ... & Zhu, Y. (2022). Location prediction for individual vehicles via exploiting travel regularity and preference. IEEE Transactions on Vehicular Technology, 71(5), 4718-4732.

[5]. Ali, F., Sarwar, S., Shafi, Q. M., Iqbal, M., Safyan, M., & Qayyum, Z. U. (2022). Securing IoT based maritime transportation system through entropy-based dual-stack machine learning framework. IEEE Transactions on Intelligent Transportation Systems, 24(2), 2482-2491.

[6]. Yuan, T., da Rocha Neto, W., Rothenberg, C. E., Obraczka, K., Barakat, C., & Turletti, T. (2022). Machine learning for next‐generation intelligent transportation systems: A survey. Transactions on emerging telecommunications technologies, 33(4), e4427.

[7]. Tsolaki, K., Vafeiadis, T., Nizamis, A., Ioannidis, D., & Tzovaras, D. (2023). Utilizing machine learning on freight transportation and logistics applications: A review. ICT Express, 9(3), 284-295.

[8]. Jian, W., He, J., Chen, K., Xie, J., Zhao, J., & You, L. (2024). Personalized travel recommendations based on asynchronous and privacy-preserving mixed logit model. IEEE Transactions on Intelligent Transportation Systems.

[9]. Afolayan, B. I., Ghosh, A., Calderin, J. F., & Arredondo, A. D. M. (2024). Emerging Trends in Machine Learning Assisted Optimization Techniques Across Intelligent Transportation Systems. IEEE Access.

[10]. Ang, K. L. M., Seng, J. K. P., Ngharamike, E., & Ijemaru, G. K. (2022). Emerging technologies for smart cities' transportation: geo-information, data analytics and machine learning approaches. ISPRS International Journal of Geo-Information, 11(2), 85.

[11]. Alqahtani, H., & Kumar, G. (2024). Machine learning for enhancing transportation security: A comprehensive analysis of electric and flying vehicle systems. Engineering Applications of Artificial Intelligence, 129, 107667.

[12]. Nitu, P., Coelho, J., & Madiraju, P. (2021). Improvising personalized travel recommendation system with recency effects. Big Data Mining and Analytics, 4(3), 139-154.

[13]. Mou, J., Gao, K., Duan, P., Li, J., Garg, A., & Sharma, R. (2022). A machine learning approach for energy-efficient intelligent transportation scheduling problem in a real-world dynamic circumstances. IEEE transactions on intelligent transportation systems, 24(12), 15527-15539.

[14]. Aghamohammadghasem, M., Azucena, J., Hashemian, F., Liao, H., Zhang, S., & Nachtmann, H. (2023, December). System simulation and machine learning-based maintenance optimization for an inland waterway transportation system. In 2023 Winter simulation conference (WSC) (pp. 267-278). IEEE.

[15]. Wang, J., Wu, N., & Zhao, W. X. (2021). Personalized route recommendation with neural network enhanced search algorithm. IEEE Transactions on Knowledge and Data Engineering, 34(12), 5910-5924.


Cite this article

Shi,X. (2025). Research on Ride-hailing User Travel Preference Identification and Personalized Recommendation Strategies Based on Machine Learning. Applied and Computational Engineering,186,39-51.

Data availability

The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.

Disclaimer/Publisher's Note

The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of EWA Publishing and/or the editor(s). EWA Publishing and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

About volume

Volume title: Proceedings of CONF-FMCE 2025 Symposium: Semantic Communication for Media Compression and Transmission

ISBN:978-1-80590-383-3(Print) / 978-1-80590-384-0(Online)
Editor:Anil Fernando
Conference date: 24 October 2025
Series: Applied and Computational Engineering
Volume number: Vol.186
ISSN:2755-2721(Print) / 2755-273X(Online)

© 2024 by the author(s). Licensee EWA Publishing, Oxford, UK. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license. Authors who publish this series agree to the following terms:
1. Authors retain copyright and grant the series right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this series.
2. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the series's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this series.
3. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See Open access policy for details).

References

[1]. de la Torre, R., Corlu, C. G., Faulin, J., Onggo, B. S., & Juan, A. A. (2021). Simulation, optimization, and machine learning in sustainable transportation systems: Models and applications. Sustainability, 13(3), 1551.

[2]. Lai, Z., Wang, J., Zheng, J., Ding, Y., Wang, C., & Zhang, H. (2023). Travel mode choice prediction based on personalized recommendation model. IET Intelligent Transport Systems, 17(4), 667-677.

[3]. Gupta, B. B., Gaurav, A., Marín, E. C., & Alhalabi, W. (2022). Novel graph-based machine learning technique to secure smart vehicles in intelligent transportation systems. IEEE transactions on intelligent transportation systems, 24(8), 8483-8491.

[4]. Long, W., Li, T., Xiao, Z., Wang, D., Zhang, R., Regan, A. C., ... & Zhu, Y. (2022). Location prediction for individual vehicles via exploiting travel regularity and preference. IEEE Transactions on Vehicular Technology, 71(5), 4718-4732.

[5]. Ali, F., Sarwar, S., Shafi, Q. M., Iqbal, M., Safyan, M., & Qayyum, Z. U. (2022). Securing IoT based maritime transportation system through entropy-based dual-stack machine learning framework. IEEE Transactions on Intelligent Transportation Systems, 24(2), 2482-2491.

[6]. Yuan, T., da Rocha Neto, W., Rothenberg, C. E., Obraczka, K., Barakat, C., & Turletti, T. (2022). Machine learning for next‐generation intelligent transportation systems: A survey. Transactions on emerging telecommunications technologies, 33(4), e4427.

[7]. Tsolaki, K., Vafeiadis, T., Nizamis, A., Ioannidis, D., & Tzovaras, D. (2023). Utilizing machine learning on freight transportation and logistics applications: A review. ICT Express, 9(3), 284-295.

[8]. Jian, W., He, J., Chen, K., Xie, J., Zhao, J., & You, L. (2024). Personalized travel recommendations based on asynchronous and privacy-preserving mixed logit model. IEEE Transactions on Intelligent Transportation Systems.

[9]. Afolayan, B. I., Ghosh, A., Calderin, J. F., & Arredondo, A. D. M. (2024). Emerging Trends in Machine Learning Assisted Optimization Techniques Across Intelligent Transportation Systems. IEEE Access.

[10]. Ang, K. L. M., Seng, J. K. P., Ngharamike, E., & Ijemaru, G. K. (2022). Emerging technologies for smart cities' transportation: geo-information, data analytics and machine learning approaches. ISPRS International Journal of Geo-Information, 11(2), 85.

[11]. Alqahtani, H., & Kumar, G. (2024). Machine learning for enhancing transportation security: A comprehensive analysis of electric and flying vehicle systems. Engineering Applications of Artificial Intelligence, 129, 107667.

[12]. Nitu, P., Coelho, J., & Madiraju, P. (2021). Improvising personalized travel recommendation system with recency effects. Big Data Mining and Analytics, 4(3), 139-154.

[13]. Mou, J., Gao, K., Duan, P., Li, J., Garg, A., & Sharma, R. (2022). A machine learning approach for energy-efficient intelligent transportation scheduling problem in a real-world dynamic circumstances. IEEE transactions on intelligent transportation systems, 24(12), 15527-15539.

[14]. Aghamohammadghasem, M., Azucena, J., Hashemian, F., Liao, H., Zhang, S., & Nachtmann, H. (2023, December). System simulation and machine learning-based maintenance optimization for an inland waterway transportation system. In 2023 Winter simulation conference (WSC) (pp. 267-278). IEEE.

[15]. Wang, J., Wu, N., & Zhao, W. X. (2021). Personalized route recommendation with neural network enhanced search algorithm. IEEE Transactions on Knowledge and Data Engineering, 34(12), 5910-5924.