Volume 15
Published on February 2025
The application of fusion technology is of considerable importance in the field of multi-modal viewport prediction. The latest attention-based fusion methods have been shown to perform well in prediction accuracy. However, these methods fail to account for the differing density of information among the three modalities involved in viewport prediction: trajectory, visual, and audio. The visual and audio modalities present primitive signal information, while the trajectory modality carries advanced time-series information. In this paper, a viewport prediction framework based on a Modality Diversity-Aware (MDA) fusion network is proposed to achieve multi-modal feature interaction. First, we design a fusion module that combines the visual and audio modalities, augmenting their efficacy as advanced complementary features. We then use cross-modal attention to integrate the fused visual-audio information with trajectory features. Our method addresses the issue of differing information densities among the three modalities, ensuring a fair and effective interaction between them. To evaluate the proposed approach, we conducted experiments on a widely used public dataset. The experiments demonstrate that our approach predicts accurate viewport areas with a significant decrease in model parameters.
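As a rough illustration of the cross-modal attention step mentioned in this abstract, the sketch below implements plain scaled dot-product attention in which trajectory features act as queries and fused visual-audio features act as keys and values. This is not the MDA network itself: the query/key assignment, the function names, and the toy feature sizes are all assumptions made for illustration, and the learned projection matrices of a real attention layer are omitted.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product attention without learned projections.

    queries: trajectory feature vectors (assumed role)
    keys, values: fused visual-audio feature vectors (assumed role)
    Returns one attended vector per query.
    """
    d = len(keys[0])  # key dimension used for scaling
    attended = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors
        attended.append([sum(w * v[j] for w, v in zip(weights, values))
                         for j in range(len(values[0]))])
    return attended

# Hypothetical toy input: one trajectory step, two visual-audio tokens
trajectory = [[0.0, 0.0]]
av_keys = [[1.0, 0.0], [0.0, 1.0]]
av_values = [[2.0, 0.0], [0.0, 4.0]]
fused = cross_attention(trajectory, av_keys, av_values)
```

With an all-zero query the two tokens receive equal weight, so the output is simply the mean of the two value vectors.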
Record linkage is essential in data integration, healthcare analysis, fraud detection, and other applications that require matching records across datasets. Real-world data, however, is usually noisy (missing values, typos, inconsistent formatting), and this noise substantially degrades the performance of deterministic and probabilistic approaches. In this paper, we introduce a deep learning model with regularization techniques (dropout, weight decay, early stopping) to improve robustness in noisy record linkage. We evaluate the approaches on open data that simulates real-world scenarios with different levels of noise, using data augmentation to inject synthetic noise in the form of realistic input errors. Results show that regularization improves the model’s performance in noisy environments, with up to 20% better accuracy and recall than unregularized models. Dropout in particular tended to generalize better by limiting overfitting to noise. These results show the potential of deep learning and regularization for record linkage in noisy environments and suggest future work on additional techniques, including adversarial training and batch normalization.
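The noise-injection style of data augmentation described in this abstract might look something like the following minimal sketch. The abstract does not specify the noise model, so the two error types (dropped values and adjacent-character transpositions), the function name `add_noise`, and its rate parameters are assumptions chosen only to illustrate the idea.

```python
import random

def add_noise(record, typo_rate=0.1, missing_rate=0.1, seed=None):
    """Return a noisy copy of a record (dict mapping field name to string).

    Hypothetical noise model:
      - with probability missing_rate, a field value is blanked out
      - otherwise, each adjacent character pair is swapped with
        probability typo_rate (a simple transposition typo)
    """
    rng = random.Random(seed)
    noisy = {}
    for field, value in record.items():
        if rng.random() < missing_rate:
            noisy[field] = ""  # simulate a lost value
            continue
        chars = list(value)
        for i in range(len(chars) - 1):
            if rng.random() < typo_rate:
                chars[i], chars[i + 1] = chars[i + 1], chars[i]
        noisy[field] = "".join(chars)
    return noisy

# Hypothetical record; a fixed seed makes the augmentation repeatable
clean = {"name": "Alice Johnson", "city": "Springfield"}
augmented = add_noise(clean, typo_rate=0.15, missing_rate=0.05, seed=42)
```

Pairing each clean record with several noisy copies at different rates gives training data at controlled noise levels.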
In intelligent scenarios, large language models (LLMs) are used to create characters that interact with users, providing guidance and relevant information. The more anthropomorphic these characters are, the better the emotional experience they provide, which benefits user interaction and enhances the user experience. Therefore, evaluating the character-creation capabilities of LLMs is essential. This study used a questionnaire and a second LLM (ChatGPT-4o) to assess the impact of emoji usage and language style on the anthropomorphism and emotional expression of content generated by LLMs. The results indicate that characters exhibit higher levels of anthropomorphism and emotional expression when they use emojis. Informal language styles also enhance both anthropomorphism and emotional expression.
Urbanization is a significant driver of land use change, particularly in rapidly growing metropolitan areas. This research investigates land use change in Greenfield City over a 20-year period (2000-2020) using GIS and satellite data. The mapping shows where the greatest land-use changes occurred, ranging from increased residential and commercial development to the loss of agricultural fields and green space. The work applies multi-temporal analysis of Landsat satellite images taken in 2000, 2010 and 2020 to estimate land cover change and its effects on urban planning and sustainability. The results indicate a clear rise in residential and commercial development, alongside a steep decline in farmland and green space. These transformations affect the environment through habitat loss, biodiversity loss and encroachment on natural resources. The paper concludes by discussing sustainability in urban planning and the need for better land use planning to reduce the negative environmental effects of urban sprawl.
In intelligent warehousing and transportation processes, the centralization of material units significantly enhances storage and handling efficiency. Among these, the centralized unitization of material pallets is in high demand and widely applied in practical operations. In multi-SKU scenarios, achieving efficient palletizing—particularly online mixed palletizing—poses a major challenge in logistics operations. This process aims to save manpower while ensuring operational efficiency. To address this issue, this paper presents a combined heuristic algorithm that integrates an anthropomorphic heuristic algorithm with a greedy algorithm incorporating local perturbations. The proposed approach accounts for constraints such as mass, volume, center of gravity, non-overlapping placement, and stability. Experimental results demonstrate that this algorithm effectively resolves the palletizing challenges for multi-SKU goods, significantly reducing space waste.
Increasing amounts of financial data demand sophisticated analytics to develop sound recommendation models. This article discusses combining time series analysis and association rule mining for big data in Hadoop and Spark to enrich financial product recommendation engines. The paper presents an integrated analysis of two types of prediction algorithms, AutoRegressive Integrated Moving Average (ARIMA) and Long Short-Term Memory (LSTM) networks, to forecast future user behavior and demand for financial services from transactional history. The ARIMA model is used as the baseline, while the LSTM model captures non-linear dependencies and gives a more dynamic forecast. Association rule mining, in particular the Apriori algorithm, is used to find latent patterns and relationships between user transactions and financial products. The article illustrates how time series forecasting and association rule mining can be merged to produce more useful financial recommendations. According to the experiments, the hybrid approach increases user interaction and recommendation accuracy by 20% compared with previous systems. The paper emphasizes the possibilities of using big data to build scalable, individualized financial recommendation systems.
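To make the association rule mining step more concrete, here is a small single-machine sketch of Apriori-style frequent itemset mining with a rule confidence helper. It is not the Hadoop/Spark pipeline the abstract refers to; the product names, the transactions, and the support threshold are hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    sets = [frozenset(t) for t in transactions]
    n = len(sets)

    def support(itemset):
        return sum(1 for t in sets if itemset <= t) / n

    # Level 1 candidates: every single item that appears anywhere
    current = {frozenset([item]) for t in sets for item in t}
    frequent = {}
    k = 1
    while current:
        level = {c: support(c) for c in current}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent
        current = {c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent

def confidence(frequent, antecedent, consequent):
    """Confidence of antecedent -> consequent; both itemsets must be frequent."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]

# Hypothetical product holdings for four customers
baskets = [
    {"savings", "credit_card"},
    {"savings", "credit_card", "mortgage"},
    {"savings", "mortgage"},
    {"credit_card"},
]
itemsets = apriori(baskets, min_support=0.5)
rule_conf = confidence(itemsets, {"mortgage"}, {"savings"})
```

In this toy data every customer holding a mortgage also holds a savings account, so the rule mortgage -> savings has confidence 1.0, while the pair {credit_card, mortgage} falls below the support threshold and is pruned.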
This paper examines the parameter tuning of fractional-order PID (FOPID) controllers. FOPID controllers, which add fractional integral and derivative orders to the traditional PID controller, are better able to handle complex systems. However, effective tuning of their five parameters is challenging. To address this, multiple intelligent algorithms are investigated. The improved sparrow search algorithm (ISSA) uses Chebyshev chaotic mapping initialization, adaptive t-distribution, and the firefly algorithm to overcome the limitations of the basic algorithm, showing high accuracy, speed, and robustness on multi-modal problems. The grey wolf optimizer (GWO), inspired by the hunting behavior of grey wolves, has procedures for encircling, hunting, and attacking but may become trapped in local optima, and several improvement methods have been proposed. The genetic algorithm, based on the survival of the fittest principle, involves encoding, decoding, and other operations. Taking vehicle ABS control as an example, the genetic algorithm-based FOPID controller outperforms the traditional PID controller. In conclusion, different algorithms have their own advantages in FOPID parameter tuning, and the selection depends on system characteristics and control requirements. Future research can focus on further algorithm improvement and hybrid methods to achieve better control performance, providing a valuable reference for FOPID applications in industry.
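For readers unfamiliar with the five FOPID parameters, the sketch below evaluates the control law u = Kp*e + Ki*D^(-lambda) e + Kd*D^(mu) e on a sampled error signal, using the Grünwald-Letnikov approximation for the fractional operators. The discretization choice and all function and parameter names are assumptions for illustration; the abstract does not say how the surveyed work discretizes the controller, and no tuning algorithm is shown here.

```python
def gl_weights(alpha, n):
    """First n Grünwald-Letnikov weights for order alpha.

    w_0 = 1, w_j = w_{j-1} * (1 - (alpha + 1) / j)
    """
    w = [1.0]
    for j in range(1, n):
        w.append(w[-1] * (1.0 - (alpha + 1.0) / j))
    return w

def gl_derivative(samples, alpha, h):
    """Order-alpha derivative at the latest sample (alpha < 0 integrates).

    samples: uniformly spaced signal history, oldest first
    h: sampling interval
    """
    w = gl_weights(alpha, len(samples))
    acc = sum(wj * samples[-1 - j] for j, wj in enumerate(w))
    return acc / h ** alpha

def fopid_output(errors, h, kp, ki, kd, lam, mu):
    """Control output from the five parameters kp, ki, kd, lam, mu."""
    integral = gl_derivative(errors, -lam, h)   # fractional integral of order lam
    derivative = gl_derivative(errors, mu, h)   # fractional derivative of order mu
    return kp * errors[-1] + ki * integral + kd * derivative

# Hypothetical error history and gains
errors = [0.0, 1.0, 2.0, 3.0]
u = fopid_output(errors, h=1.0, kp=2.0, ki=0.5, kd=3.0, lam=0.9, mu=0.8)
```

Setting lam = mu = 1 recovers an ordinary discrete PID (rectangle-rule integral and backward-difference derivative), which is a convenient sanity check. A tuning algorithm such as those surveyed would search over the five parameters to minimize a cost computed from the closed-loop response.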
In recent years, machine learning has emerged as a powerful tool with widespread applications across various domains due to its ability to process and analyze vast amounts of data. This study explores the application of machine learning techniques in predicting hotel booking cancellations using Property Management System (PMS) data. The research involves a comprehensive process, including data cleaning, feature engineering, feature selection, and model development. Feature selection and dimensionality reduction using Principal Component Analysis (PCA) and Lasso regression identified key predictive features, facilitating the rapid creation of neural network models. A diverse set of machine learning and deep learning models, such as Logistic Regression, Decision Tree, Random Forest, XGBoost, Multilayer Perceptron (MLP), Convolutional Neural Network (CNN), Deep Neural Network (DNN), and Long Short-Term Memory (LSTM), were employed. All models achieved accuracies exceeding 80%, with neural networks nearing 100%. These results highlight the efficacy of these models in predicting cancellations across different hotels, revealing consistent cancellation patterns. The study demonstrates the potential of machine learning to optimize hotel management by accurately forecasting booking cancellations, thereby reducing uncertainty and increasing revenue. Future work may focus on exploring more advanced feature engineering techniques and models to further enhance prediction accuracy and generalizability.
The rapid spread of fake news across digital platforms poses a significant challenge to societies, leading to a growing demand for robust detection mechanisms. Traditional fake news detection methods often rely on unimodal data, such as textual content, which limits their effectiveness against the complex, multimodal nature of fake news. This paper introduces a Multimodal Fake News Detector (MFND) that integrates textual, visual, and social context features to make classification more accurate and reliable. The MFND was evaluated on the FakeNewsNet and Sina Weibo datasets, achieving high accuracy and outperforming existing models. The experimental results highlight the importance of multimodal fusion and attention-based weighting mechanisms in improving detection performance, particularly in complex social media environments. This research demonstrates the potential of multimodal approaches for fake news detection.
In the development of side-impact safety performance in automobiles, the injury conditions of far-side occupants have gradually been incorporated into the evaluation system of automotive safety in China. This study, based on three side-impact conditions in the Chinese automotive safety assessment system, employs the finite element method to analyze the injury and motion characteristics of far-side occupants under various impact scenarios. The results indicate that in the C-NCAP 75° POLE condition, certain impact areas of the far-side dummy sustain more severe injuries. These findings provide data references for the development of safety performance measures aimed at protecting far-side occupants.