Volume 109
Published on November 2024
Volume title: Proceedings of the 2nd International Conference on Machine Learning and Automation
Abstract. With the emergence and development of meta-learning algorithms, their remarkable advantages in few-shot learning (FSL) scenarios have led to the increasing application of meta-learning in natural language processing (NLP). However, meta-learning is not without limitations, and various challenges may arise during its practical application. This paper begins by providing a brief introduction to the concept of meta-learning and delves into its complex applications in NLP, such as federated learning (FL), cross-task generalization, and few-shot text classification. The research demonstrates that the advantages of meta-learning in few-shot learning can significantly aid these applications, particularly in scenarios where data scarcity necessitates rapid adaptation. Nevertheless, throughout these studies, meta-learning also exhibits limitations in various contexts. Finally, this paper summarizes the content, analyzes the interrelationships and potential connections between the topics, and highlights current challenges in the field as well as possible future directions for development.
Abstract. The application of large language models (LLMs) in single-agent systems within complex environments has proven successful, prompting a growing interest in their use within multi-agent systems (MAS). Despite the impressive capabilities of LLMs, it remains unclear how they can be optimally integrated and utilized to empower agents in MAS. Understanding how to effectively leverage the advantages of LLMs to enhance agent performance is crucial. This survey provides a comprehensive overview of the application of LLMs in MAS, focusing on their impact on agent cooperation, reasoning, and adaptive abilities. Finally, we discuss future directions and open questions in this evolving field.
Abstract. This paper provides a thorough examination of the utilisation of deep learning in music recommendation systems, which have transformed consumer discovery and engagement with music on streaming platforms. Scalability challenges and the cold-start problem are among the constraints that conventional recommendation methods, such as content-based filtering and collaborative filtering, encounter, which hinder their ability to deliver personalised recommendations that are effective. The processing of multi-modal and sequential data is significantly improved by deep learning methodologies, including Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTM) networks, and Autoencoders. This results in more precise and contextually pertinent music recommendations. Moreover, hybrid models that amalgamate deep learning with conventional techniques augment recommendation precision by synthesising user interaction data with acoustic characteristics. This paper examines essential performance metrics employed in the assessment of music recommendation systems, including precision, recall, F1-score, and Mean Reciprocal Rank (MRR). It also tackles issues including computational complexity, bias, and ethical dilemmas pertaining to data privacy.
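The evaluation metrics named in the abstract above can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names, the track identifiers, and the convention of scoring a query with no relevant hit as 0 in MRR are assumptions made for illustration.

```python
from typing import Sequence, Set


def precision_recall_f1(recommended: Sequence[str], relevant: Set[str]) -> tuple[float, float, float]:
    """Precision, recall and F1 for one recommendation list (hypothetical helper)."""
    hits = sum(1 for item in recommended if item in relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def mean_reciprocal_rank(ranked_lists: Sequence[Sequence[str]], relevant_sets: Sequence[Set[str]]) -> float:
    """Average of 1/rank of the first relevant item per user; 0 if none is found (assumed convention)."""
    if not ranked_lists:
        return 0.0
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        for rank, item in enumerate(ranked, start=1):
            if item in relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)


# Made-up track identifiers, for illustration only.
p, r, f1 = precision_recall_f1(["a", "b", "c", "d"], {"a", "c", "x"})
mrr = mean_reciprocal_rank([["x", "a"], ["b", "c"], ["z"]], [{"a"}, {"b"}, {"q"}])
```

Precision and recall treat the recommendation list as an unordered set, whereas MRR rewards placing the first relevant track near the top of the ranking.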
Abstract. Artificial Intelligence (AI) is revolutionizing the sectors of medicine and media by improving productivity, precision, and customization. In media, AI is changing how content is developed, distributed, and consumed, and it plays an essential role in information verification and moderation. In medicine, AI enhances clinical processes, personalizes patient care, and increases diagnostic accuracy. These developments also bring new difficulties, including preserving media originality and guaranteeing the accuracy and transparency of AI-generated insights in the medical field. To understand their impact on content creation and information distribution, this article investigates the approaches and uses of AI, such as Generative Adversarial Networks (GANs) and the Information Adoption Model (IAM). In media, AI automates video editing, content moderation, and tailored recommendations, while in healthcare, it enhances diagnostic accuracy, personalizes care, and simplifies clinical operations. The article also discusses issues such as bias, the necessity of human oversight, and data privacy, highlighting how crucial it is to create AI systems that are both morally and practically sound.
Abstract. Artificial intelligence (AI) technology has rapidly developed in recent years and has gradually permeated various aspects of everyday life. The integration of AI with driving technology has given rise to autonomous driving, a technology that is expected to profoundly impact human transportation, efficiency, and quality of life. This paper provides a detailed exploration of the specific applications of AI in the field of autonomous driving and its future development prospects, analyzing the advantages and challenges of these technologies. It also briefly introduces practical application scenarios such as autonomous taxis, intelligent traffic management systems, and long-distance freight transportation, discussing the potential impacts of these technologies on society and the environment. Finally, the paper looks ahead to the profound changes that may result from the combination of autonomous driving technology with other cutting-edge technologies, such as 5G and the Internet of Things (IoT), in shaping future transportation systems.
Abstract. Chinese word tokenization is an important task in natural language processing and has undergone significant evolution with artificial intelligence. This paper reviews the historical progression and contemporary methodologies for Chinese word segmentation. It first examines the traditional character-based approaches, which rely on dictionaries and pattern matching, and then turns to machine learning-based techniques that utilize statistical models and neural networks. A particular focus is given to recent developments in deep learning, including the application of recurrent neural networks (RNNs), long short-term memory networks (LSTMs), and transformer models like BERT. The review also highlights innovative approaches such as memory networks and sub-character tokenization, which have shown promising results in improving segmentation accuracy and computational efficiency. Furthermore, the paper discusses the challenges faced in tokenization, such as handling out-of-vocabulary words and integrating syntactic and semantic information. The paper concludes with insights on the future directions of Chinese word tokenization, emphasizing the potential of unsupervised learning and the need for more robust evaluation frameworks.
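As an illustration of the dictionary-based matching mentioned in the abstract above, the following Python sketch implements forward maximum matching, one common dictionary technique. The abstract does not say which matching algorithm the paper covers, so this choice, along with the function name, the tiny dictionary, and the `max_word_len` parameter, is an assumption.

```python
from typing import List, Set


def forward_max_match(text: str, dictionary: Set[str], max_word_len: int = 4) -> List[str]:
    """Greedy left-to-right segmentation: at each position take the longest dictionary word.

    Characters that start no dictionary word are emitted as single-character
    tokens, which is a simple fallback for out-of-vocabulary text.
    """
    tokens: List[str] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to a single character.
        for length in range(min(max_word_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if length == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += length
                break
    return tokens


# Toy dictionary, for illustration only.
toy_dictionary = {"北京", "大学", "北京大学", "学生"}
tokens = forward_max_match("北京大学的学生", toy_dictionary)
# tokens == ["北京大学", "的", "学生"]
```

Because the match is greedy, it prefers "北京大学" over "北京" followed by "大学". The same greediness produces errors on ambiguous strings, which is one motivation for the statistical and neural methods the abstract goes on to mention.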
Abstract. Visual perception and human body recognition are fundamental capabilities required for effective and safe interactions between artificial intelligence (AI), computer vision, and humans in real-world scenarios. Recent groundbreaking developments in AI and computer vision have resulted in major advancements in human body recognition technology. However, research in human body recognition is still in the early stages of the product lifecycle. Identifying the three-dimensional locations of the joints in the human body from pictures or videos is known as 3D pose estimation. Although it is widely used in areas like human motion analysis and robotics, it continues to be a difficult task due to challenges such as depth ambiguity and the scarcity of robust datasets. Over the past decade, numerous methods have been developed, many of which are based on deep learning, significantly improving performance on existing benchmarks. A comprehensive literature review of this field is crucial for future development. However, existing reviews have mainly concentrated on traditional techniques, which creates a need for a comprehensive examination of deep learning-based tools. This paper delivers a thorough overview of current deep learning-based 3D pose estimation algorithms, outlining their advantages and limitations while providing a detailed understanding of the field. It also explores commonly used benchmark datasets and methods for analyzing human poses in unlabeled field images, providing a thorough comparative analysis. Finally, insights are provided to aid in the design of future models and algorithms.
Abstract. Humor recognition is a popular research area in Natural Language Processing (NLP), with the fundamental goal of detecting whether humor is present. With the rapid development of artificial intelligence and the growing demand for human-computer interaction, humor recognition has broad applications in areas such as social media analysis, human-computer interaction systems, and intelligent question-answering assistants. However, due to the highly subjective and complex nature of humor, teaching computers to understand and recognize humor in a manner similar to humans is an exceptionally challenging task. This paper reviews the major research progress in the field of humor recognition and the construction of humor corpora. It discusses humor recognition methods based on traditional machine learning, deep learning, and multimodal approaches, and then highlights the advantages and disadvantages of each. Based on these analyses, the paper identifies the main challenges and difficulties in current research and provides suggestions and prospects for the future development of humor recognition systems.
Abstract. With the widespread use of deep learning models in various applications, people are gradually realizing the vulnerability of these models to adversarial attacks. Adversarial training is an effective strategy to defend against such attacks. Based on the advantages and disadvantages of the two current mainstream approaches, Fast Gradient Sign Method (FGSM) adversarial training and Projected Gradient Descent (PGD) adversarial training, this paper proposes a hybrid adversarial training method that integrates FGSM and PGD, and tests it using the ResNet-18 model and the SVHN dataset. Experimental results show that hybrid adversarial training can effectively reduce training time. Its accuracy on the original dataset is about 2% higher than that of PGD adversarial training. Its performance against FGSM attacks is almost the same as that of FGSM-only adversarial training. Its performance against PGD attacks drops more noticeably, to about 2% to 3% below that of PGD adversarial training. This study not only helps in understanding the robustness of models trained with hybrid adversarial training against adversarial attacks but also supports the study of new adversarial training strategies.
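To make the two attacks named in the abstract above concrete, here is a minimal, dependency-free Python sketch of FGSM and PGD perturbations on a toy logistic-regression model. The abstract does not describe how the paper combines the two methods, so the per-batch alternation in `pick_attack` is purely an assumed schedule. The toy model, all function names, and the values of `eps`, `step`, and `iters` are likewise illustrative and unrelated to the ResNet-18 and SVHN setup.

```python
import math
from typing import List


def input_gradient(w: List[float], b: float, x: List[float], y: int) -> List[float]:
    """Gradient of the logistic loss with respect to the input x (toy linear model)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))
    return [(p - y) * wi for wi in w]


def sign(v: float) -> int:
    return (v > 0) - (v < 0)


def fgsm(w: List[float], b: float, x: List[float], y: int, eps: float) -> List[float]:
    """Single step of size eps along the sign of the input gradient."""
    g = input_gradient(w, b, x, y)
    return [xi + eps * sign(gi) for xi, gi in zip(x, g)]


def pgd(w: List[float], b: float, x: List[float], y: int,
        eps: float, step: float, iters: int) -> List[float]:
    """Several smaller signed steps, each projected back into the eps-ball around x."""
    x_adv = list(x)
    for _ in range(iters):
        g = input_gradient(w, b, x_adv, y)
        x_adv = [xa + step * sign(gi) for xa, gi in zip(x_adv, g)]
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv


def pick_attack(batch_index: int, pgd_every: int = 4) -> str:
    """ASSUMED hybrid schedule: PGD on every pgd_every-th batch, FGSM otherwise."""
    return "pgd" if batch_index % pgd_every == 0 else "fgsm"


w, b, x, y = [1.0, -2.0], 0.0, [0.5, 0.5], 1
x_fgsm = fgsm(w, b, x, y, eps=0.1)
x_pgd = pgd(w, b, x, y, eps=0.1, step=0.05, iters=5)
```

In this sketch FGSM needs one gradient evaluation per example and PGD needs `iters` of them, so any schedule that runs PGD on only a fraction of batches lowers the cost per epoch relative to PGD-only training.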
Abstract. In recent years, corridor lighting has been widely used all over the world as a simple lighting system that provides great convenience for people moving about at night. However, as populations and the number of buildings grow, the number of lighting systems is increasing, the low luminous efficiency of incandescent lamps is becoming more and more obvious, and the rise in electricity consumption has become a huge burden. How to reduce the electricity consumption of induction lamps in residential areas and corridors while preserving convenience in people's lives has become a prominent problem. Drawing on the relevant literature, this paper combines sound control and light control to build a programmable-delay LED lighting system. Beyond basic lighting functions, the lamp turns on when people arrive at night and turns off after they leave. By using a light-emitting element with higher luminous efficiency, energy consumption is greatly reduced, which gives the system practical value. Experiments in different scenarios verify these results.
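The combined sound and light control with a programmable delay described in the abstract above can be sketched as a small state machine in Python. This is only an illustration of the control logic, not the paper's circuit or firmware. The class name, the boolean sensor inputs, and the 10-second delay are assumptions.

```python
from typing import Optional


class DelayLampController:
    """Turns the lamp on when it is dark AND a sound is detected, then holds it for delay_s seconds."""

    def __init__(self, delay_s: float = 30.0) -> None:
        self.delay_s = delay_s              # programmable hold time (assumed unit: seconds)
        self.off_at: Optional[float] = None  # time at which the lamp should switch off

    def update(self, now_s: float, is_dark: bool, sound_detected: bool) -> bool:
        """Feed one sensor reading; returns True if the lamp should be lit."""
        if is_dark and sound_detected:
            # Each new trigger restarts the delay window.
            self.off_at = now_s + self.delay_s
        return self.off_at is not None and now_s < self.off_at


lamp = DelayLampController(delay_s=10.0)
states = [
    lamp.update(0.0, is_dark=True, sound_detected=False),   # dark but quiet: stays off
    lamp.update(1.0, is_dark=True, sound_detected=True),    # footsteps at night: turns on
    lamp.update(5.0, is_dark=True, sound_detected=False),   # still inside the delay window
    lamp.update(11.0, is_dark=True, sound_detected=False),  # delay expired: turns off
    lamp.update(12.0, is_dark=False, sound_detected=True),  # daytime noise: ignored
]
```

Requiring both conditions keeps the lamp off during the day, and the delay window lets it switch off by itself once the corridor is quiet again.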