1. Introduction
With rapid advancements in artificial intelligence (AI), the field of Human-Computer Interaction (HCI) has transitioned from traditional Graphical User Interfaces (GUI) to more advanced Natural User Interfaces (NUI). AI technologies, including speech recognition, computer vision, and natural language processing, are reshaping the way users interact with systems, making these interactions more intuitive, adaptive, and context-aware. Nonetheless, significant research gaps persist in areas such as multimodal interaction, personalized user experiences, and context-aware interfaces, particularly in complex scenarios and with respect to privacy and ethical considerations.
This study aims to explore the integration of AI into HCI, focusing on analyzing technological advancements, identifying current challenges, and forecasting future trends. Specifically, the research investigates how AI can enhance the intelligence of HCI systems, how to facilitate natural and personalized user experiences in multimodal contexts, and how to effectively address privacy and ethical issues.
Methodologically, the study employs a comprehensive literature review combined with case analysis to examine the application of these technologies and to evaluate the associated challenges. The findings of this research provide a theoretical foundation for the deep integration of AI and HCI, contributing to the innovative design of intelligent interaction systems and offering valuable insights for future development and research trajectories.
2. Development of Human-Computer Interaction
2.1. Early Human-Computer Interaction
The command-line interface (CLI) was the main interaction method of early computer systems: users interacted with the computer by typing text commands at a keyboard. Unix and MS-DOS are typical command-line operating systems. Taking Unix as the main example: UNIX V1 (1971) introduced basic multitasking and file system capabilities; UNIX V7 (1979) added the Bourne shell and a C compiler, becoming a widely adopted standard version; and UNIX System V (1983) introduced more robust process control, file system improvements, and standardized inter-process communication. For most users, routine human-computer communication happened through a program called the shell. “The Shell is a command line interpreter: it reads lines typed by the user and interprets them as requests to execute other programs.”[1] The emergence of the CLI was a significant milestone in making computers more accessible. However, despite its utility, the CLI had several notable drawbacks, especially when viewed through the lens of modern AI-driven interaction: a steep learning curve, a lack of adaptability, and little tolerance for errors.
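To make the quoted description concrete, the following is a minimal, hypothetical read-and-execute loop in Python, not the Bourne shell itself: it reads a line, splits it into a program name and arguments, and asks the operating system to run that program.

```python
import shlex
import subprocess

def minimal_shell():
    """Toy illustration of the shell idea: read a line typed by the user
    and interpret it as a request to execute another program."""
    while True:
        try:
            line = input("$ ")
        except EOFError:
            break
        if not line.strip():
            continue
        if line.strip() == "exit":
            break
        # Split the line into a program name and arguments, then run it.
        args = shlex.split(line)
        try:
            subprocess.run(args)
        except FileNotFoundError:
            print(f"{args[0]}: command not found")

if __name__ == "__main__":
    minimal_shell()
```

Even this sketch shows why the CLI demands prior knowledge: the user must already know which program names and arguments exist, and a single typo simply fails.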
2.2. The emergence and development of graphical user interface (GUI)
The emergence and development of the graphical user interface (GUI) was an important turning point in the history of human-computer interaction. In the early 1970s, the Palo Alto Research Center (PARC) of Xerox Corporation developed the GUI, first applied to the Xerox Alto computer. This interface replaced traditional command-line input with graphical elements such as windows, icons, and buttons, enabling users to control the computer by directly manipulating these elements and thus achieving more intuitive interaction. The commercialization of the GUI began in 1984, when Apple released the GUI-based Macintosh computer; Microsoft then launched the GUI-based Windows operating system, gradually making the GUI a standard configuration for personal computers. With the advancement of computing power and display technology, modern GUIs have become more complex and sophisticated, integrating animation effects, multi-touch, drag-and-drop operations, and rich visual designs, greatly enhancing visual appeal and ease of operation[2]. The benefits of the GUI include intuitive operation through visual icons and controls, support for multitasking, and instant feedback through graphics and animation[3]. With the advancement of AI, modern GUI systems increasingly incorporate intelligent functions, such as task automation, voice control, and intelligent search, making the user interface more personalized and intelligent.
2.3. The rise of natural user interfaces (NUI)
The rise of Natural User Interfaces (NUI) has transformed the way humans engage with technology, moving away from conventional graphical user interfaces (GUI) toward more intuitive and fluid forms of interaction. NUI relies on touch, gesture, and voice as primary input methods, aiming to create an interaction experience that feels natural and aligned with human habits. As Norman[4] explains, despite its name, NUI is not inherently natural but depends heavily on design to minimize the cognitive load of interacting with technology. The integration of Artificial Intelligence (AI) has played an essential role in improving NUI's effectiveness. Advances such as deep learning have allowed systems to better understand and respond to complex user inputs, including speech and gestures[5]. Moreover, AI-driven progress in computer vision and natural language processing lets systems interpret user behavior more accurately and personalize the interaction experience[6]. As Wigdor and Wixon[7] stress, the future of NUI depends on delivering genuinely adaptive and intelligent systems, driven by AI, that respond to users in ways that feel effortless and intuitive.
3. Application of AI Technology in Human-Computer Interaction
3.1. Intelligent Voice Interaction
In plain terms, intelligent voice interaction (IVI) refers to users communicating with devices through natural language. Speech recognition and speech synthesis make communication between humans and machines more natural and efficient. With advances in AI and algorithmic optimization, intelligent voice interaction is becoming increasingly capable, aiming to free the hands, improve productivity, and deliver more personalized service. Its core technologies include speech recognition, natural language processing (NLP), and speech synthesis, and it is applicable in many settings, such as smart home appliances, the automotive field, and enterprise services. At the same time, the emergence of intelligent voice assistants has made daily life markedly more intelligent and built a bridge for natural communication between people and devices.
3.1.1. Advances in Speech Recognition Technology
As an important part of intelligent voice interaction, progress in speech recognition is inseparable from deep learning and neural networks. Hinton et al.[8] discussed the application of deep neural networks (DNN) to acoustic models in speech recognition, in particular how multi-layer networks improve feature extraction and pattern recognition for speech signals. This work laid the basis for modern speech recognition and greatly promoted the development of the technology. With the introduction of deep learning, recognition accuracy has improved significantly, especially in complex contexts and noisy environments.
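As a rough illustration of the frame-level acoustic modeling described above (this is not Hinton et al.'s exact architecture; the layer sizes and the number of output states are illustrative assumptions), a DNN acoustic model can be sketched in a few lines of PyTorch: it maps a window of acoustic features to posterior probabilities over phonetic states.

```python
import torch
import torch.nn as nn

class FrameClassifier(nn.Module):
    """Toy DNN acoustic model: a stack of fully connected layers mapping a
    window of acoustic features to log-posteriors over tied phonetic states.
    Dimensions below are illustrative, not taken from any published system."""
    def __init__(self, n_in=11 * 40, n_hidden=2048, n_states=2000, n_layers=5):
        super().__init__()
        layers, d = [], n_in
        for _ in range(n_layers):
            layers += [nn.Linear(d, n_hidden), nn.ReLU()]
            d = n_hidden
        layers.append(nn.Linear(d, n_states))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        # x: (batch, n_in) stacked feature frames -> state scores
        return self.net(x)

model = FrameClassifier()
features = torch.randn(32, 11 * 40)                 # a batch of feature windows
log_posteriors = model(features).log_softmax(dim=-1)
```

In the hybrid systems of that era, such per-frame posteriors were combined with a separate language model and decoder; later end-to-end systems fold these stages into a single network.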
Subsequently, Xiong et al.[9] described Microsoft's 2017 conversational speech recognition system. The system combines deep neural networks with more advanced language models to reach new levels of accuracy in conversational environments. In particular, by optimizing end-to-end deep learning models, Microsoft's system handles the complexity of natural conversation better, making intelligent assistants such as Cortana more capable and reliable in practical applications.
Together, these two studies demonstrate the progress of speech recognition technology, particularly in its core algorithms: the application of deep learning has brought revolutionary changes to the processing of complex speech signals. This has also laid the technical foundation for intelligent voice assistants, improving their understanding of user intent, natural speech processing, and multi-task response.
3.1.2. Development of intelligent voice assistants
In the development of intelligent voice technology, the emergence of products such as Siri and Alexa marked a major shift in the human-computer interaction paradigm. Siri, launched by Apple in 2011, became the first mainstream intelligent voice assistant: users can control their phones by voice to complete tasks such as making calls, sending messages, and querying information. Today, Siri in iOS 18 has become far more powerful, effectively an AI assistant built into the phone. Siri's success spawned further voice assistants, among which Amazon's Alexa, released in 2014, quickly became a core technology for smart home appliances. Through integration with the Internet of Things, Alexa can remotely control home devices such as lights, air conditioners, and humidifiers, enabling voice-controlled smart home management.
In recent years, research has shown that the core technological advances of intelligent voice assistants are concentrated in natural language processing (NLP) and the application of deep models. For example, Hoy[10] discussed the user experience and privacy issues of intelligent voice assistants such as Alexa and Google Assistant, arguing that while these assistants improve the user experience, they also raise concerns about data privacy. In addition, Kepuska and Bohouta[11] pointed out that modern voice assistants achieve more natural and fluid interaction by combining natural language understanding with speech recognition. With the continuous advancement of AI, intelligent voice assistants not only perform well in daily tasks but are also gradually being adopted in fields such as medical care, education, and finance, showing broad application potential.
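Production assistants rely on large neural language-understanding models, but the basic natural language understanding step they all share, mapping a transcribed utterance to an intent, can be sketched with a small text classifier. The utterances and intent labels below are made up purely for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny hypothetical training set: transcribed utterances -> intent labels.
utterances = [
    "turn on the living room lights",
    "switch off the bedroom lamp",
    "what is the weather like tomorrow",
    "will it rain this weekend",
    "set an alarm for seven am",
    "wake me up at six thirty",
]
intents = ["lights_on", "lights_off", "weather", "weather", "alarm", "alarm"]

# Bag-of-words features plus a linear classifier stand in for the
# neural language-understanding stage of a real assistant.
classifier = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
classifier.fit(utterances, intents)

print(classifier.predict(["please turn the lights on"]))   # expected: ['lights_on']
```

Once an intent (and its slots, such as the device or the time) is recognized, the assistant dispatches it to the corresponding skill or smart home API.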
3.2. Computer Vision and Human-Computer Interaction
With the rapid development of artificial intelligence and computer vision, gesture recognition and facial expression recognition have become important parts of human-computer interaction. Through visual understanding and analysis, the system can identify the user's gestures, facial expressions, and emotional states, thereby achieving a more natural and user-friendly mode of interaction.
3.2.1. Gesture recognition and interaction
Gesture recognition is an important research direction in computer vision that enables users to interact with computer systems through hand gestures. With the development of deep learning, the accuracy and robustness of gesture recognition have been substantially improved. Oberweger et al.[12] proposed a hand pose estimation method based on deep learning, which uses convolutional neural networks (CNN) to recognize gestures in real time and shows strong performance in complex environments. This technology is not only suitable for gaming and virtual reality (VR) applications but can also be widely used in medical, smart home, and other fields.
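The core idea, a CNN that maps an image of the hand to joint coordinates, can be sketched as follows. This is a toy model, not Oberweger et al.'s network; the input size, layer widths, and number of joints are illustrative assumptions.

```python
import torch
import torch.nn as nn

class HandPoseCNN(nn.Module):
    """Toy CNN that maps a 96x96 depth crop of a hand to the 3D coordinates
    of 14 joints (14 * 3 = 42 outputs). All sizes are illustrative only."""
    def __init__(self, n_joints=14):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.regressor = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 12 * 12, 1024), nn.ReLU(),
            nn.Linear(1024, n_joints * 3),
        )

    def forward(self, depth):
        # depth: (batch, 1, 96, 96) -> (batch, n_joints * 3) joint coordinates
        return self.regressor(self.features(depth))

poses = HandPoseCNN()(torch.randn(8, 1, 96, 96))    # shape (8, 42)
```

Trained on labeled depth images, such a regressor produces a continuous stream of joint positions that downstream logic can turn into gesture commands.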
In addition, Molchanov et al.[13] studied a dynamic gesture recognition system based on 3D convolutional and recurrent neural networks. The system can detect and classify dynamic gestures online and shows high responsiveness in interactive simulations and other time-sensitive applications. This real-time gesture recognition capability provides a more natural user experience on interactive devices and has significantly promoted the adoption of smart devices.
3.2.2. Facial Expression Recognition and Emotion Analysis
Facial expression recognition technology allows computer systems to understand human emotions, thereby raising the intelligence of human-computer interaction. Li and Deng[14] surveyed facial expression recognition and emphasized the value of deep learning for emotion recognition. Their survey showed that combining convolutional neural networks (CNNs) and recurrent neural networks (RNNs) can improve the detection accuracy of micro-expressions, which is important for real-time human-computer interaction.
In addition, Zeng et al.[15] described multimodal emotion recognition that integrates facial expressions, speech, and physiological signals, significantly improving the accuracy of emotion recognition. Their work shows that this multimodal approach performs well in noisy, complex environments and can effectively capture users' emotional changes.
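One common way to combine modalities is late fusion: each modality gets its own small network, and their emotion scores are merged with learned weights. The sketch below is a generic illustration of that idea, not the method of either cited survey; the feature sizes and the four emotion classes are assumptions.

```python
import torch
import torch.nn as nn

class LateFusionEmotion(nn.Module):
    """Toy late-fusion emotion classifier: one head per modality
    (face, speech, physiological signal), fused with learned weights."""
    def __init__(self, dims=(128, 64, 16), n_emotions=4):
        super().__init__()
        self.heads = nn.ModuleList(
            nn.Sequential(nn.Linear(d, 32), nn.ReLU(), nn.Linear(32, n_emotions))
            for d in dims
        )
        self.fusion_weights = nn.Parameter(torch.ones(len(dims)))

    def forward(self, face, speech, physio):
        scores = [head(x) for head, x in zip(self.heads, (face, speech, physio))]
        w = torch.softmax(self.fusion_weights, dim=0)
        return sum(wi * si for wi, si in zip(w, scores))   # fused emotion logits

model = LateFusionEmotion()
logits = model(torch.randn(2, 128), torch.randn(2, 64), torch.randn(2, 16))
```

Because each modality is scored separately before fusion, a missing or noisy channel degrades the prediction gracefully rather than breaking it outright.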
These studies show that advanced computer vision enables more accurate and nuanced human-computer interaction and improves the user experience. Gesture recognition and facial expression analysis are not only viable in industrial applications but also provide the basis for future intelligent interaction systems.
3.3. Personalization and context awareness
With the development of artificial intelligence and big data technology, personalized and context-aware interaction is becoming increasingly important in modern human-computer interaction. These technologies improve the user experience by dynamically adjusting interactive content according to the user's needs and environment.
3.3.1. User profiles and personalized interaction
A user profile is a detailed description of a user built by collecting and analyzing the user's behavior, preferences, and background information. Such a profile not only helps the system understand the user's interests and needs but also supports personalized content and services during interaction. Ricci and Rokach[16] provided an overview of the basic principles and development of recommender systems and discussed how to construct user profiles through behavior analysis to improve the accuracy of personalized recommendations. This approach significantly improves the user experience on e-commerce, social media, and other platforms, allowing users to receive recommendations and services that better match their interests.
In addition, Zhang and Chen[17] studied collaborative filtering recommendation systems and examined how different methods are applied to user profile generation and personalized recommendation. Their work showed that combining deep learning with traditional machine learning can effectively improve the performance of recommender systems.
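A minimal sketch of the collaborative filtering idea, classic matrix factorization rather than the specific method of either cited work, is shown below; the rating matrix is made up, and zeros denote unrated items whose values the model tries to predict.

```python
import numpy as np

def factorize(ratings, k=2, steps=2000, lr=0.01, reg=0.02):
    """Toy matrix-factorization collaborative filtering: learn user and item
    factor vectors whose dot products approximate the observed ratings.
    Zero entries are treated as 'unrated' and ignored during training."""
    n_users, n_items = ratings.shape
    rng = np.random.default_rng(0)
    P = rng.normal(scale=0.1, size=(n_users, k))    # user factors
    Q = rng.normal(scale=0.1, size=(n_items, k))    # item factors
    users, items = np.nonzero(ratings)
    for _ in range(steps):
        for u, i in zip(users, items):
            err = ratings[u, i] - P[u] @ Q[i]
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * P[u] - reg * Q[i])
    return P @ Q.T                                   # predicted rating matrix

R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)
print(np.round(factorize(R), 2))   # predictions fill in the unrated entries
```

The learned factor vectors act as a compact user profile: users with similar factors receive similar recommendations, which is the behavior-analysis idea described above.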
3.3.2. Context-based intelligent interaction
Context-aware intelligent interaction systems can dynamically adjust their behavior and responses according to the user's environment and current state. Dey and Abowd[18] explored the basic theory of context awareness and provided an important framework for understanding how to build intelligent interactive systems; they note that taking contextual factors into account can significantly improve the quality of interaction between users and systems. Bardram[19] likewise discussed the importance of context in designing context-aware systems and proposed that the user's environment and behavior must be considered in the design to raise the intelligence of the system. Context-based interaction of this kind not only improves the user experience but also increases the efficiency of the system in practical applications.
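As a concrete, if simplistic, illustration of context-driven adaptation (the context fields and rules below are hypothetical, not taken from the cited frameworks), a system might choose its output modality from a few sensed signals:

```python
from dataclasses import dataclass

@dataclass
class Context:
    """A toy context model: the kinds of signals a context-aware system
    might sense. All field names here are illustrative assumptions."""
    hour: int                 # local time of day, 0-23
    moving: bool              # e.g. inferred from an accelerometer
    ambient_noise_db: float   # e.g. measured by the microphone

def choose_output_mode(ctx: Context) -> str:
    """Adapt the interaction style to the sensed context."""
    if ctx.moving:
        return "voice"              # hands and eyes are busy
    if ctx.ambient_noise_db > 70:
        return "visual"             # speech I/O is unreliable in noise
    if ctx.hour >= 22 or ctx.hour < 7:
        return "visual_dimmed"      # avoid audio output at night
    return "voice_and_visual"

print(choose_output_mode(Context(hour=23, moving=False, ambient_noise_db=35.0)))
```

Real context-aware systems replace such hand-written rules with learned models, but the structure (sense context, then adapt behavior) is the same.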
4. Future trends and application prospects of human-computer interaction
4.1. Development of multimodal interaction
Multimodal interaction is one of the key research directions in current human-computer interaction. It improves the system's ability to understand and respond to user intent by combining different input modes, such as voice, gesture, touch, and vision. In recent years, with advances in deep learning and neural networks, multimodal interaction systems have been widely applied in scenarios such as intelligent assistants, virtual reality, and autonomous driving.
Baltrusaitis, Ahuja, and Morency[20] conducted a comprehensive review of multimodal machine learning, noting that multimodal data processing faces challenges such as how to effectively fuse data from different modalities and how to handle redundancy and missing information. Their survey also discussed the application potential of multimodal systems in emotion recognition, visual understanding, and speech generation. In addition, Atzori et al.[21] explored how multimodal interaction can raise the intelligence of human-computer interaction systems by integrating sensing and actuation data in the context of the Internet of Things.
These studies demonstrate the development potential of multimodal interaction across many fields and suggest that future work will place greater emphasis on the accuracy and real-time performance of multimodal input.
4.2. Human-machine collaboration and symbiosis
With the maturing of automation technology and the spread of robotic systems, human-machine collaboration has become an essential research field. In human-machine collaborative systems, robots are no longer simple execution tools but collaborative partners that work with humans to complete complex tasks. To achieve true human-machine symbiosis, researchers focus not only on efficiency at the technical level but also on trust and on the emotional and cognitive interaction between humans and machines.
Sheridan[22] pointed out that human-machine collaboration has shown great potential in many fields, especially in environments such as manufacturing and medical care where humans and robots must work together; making machines better understand human intentions has become a core research question. In addition, Bauer, Wollherr, and Buss[23] classified existing methods of human-machine collaboration and explored how to achieve flexibility and safety, especially in dynamic and uncertain environments.
These studies provide a reference for the design of future human-machine collaborative systems, indicating that as technology continues to advance, the relationship between humans and machines will become more complex and close.
4.3. Human-computer interaction in virtual reality and augmented reality
Virtual reality (VR) and augmented reality (AR) technologies provide a new dimension for human-computer interaction, enabling users to experience immersion through virtual or augmented environments. In this context, the challenge of human-computer interaction lies not only in the design of hardware devices, but also in how to interact in a natural and intuitive way, such as through gestures, eye tracking or voice commands.
Azuma[24] laid the foundation for augmented reality technology and proposed a basic framework for the seamless integration of virtual objects and the real world, emphasizing that the core challenge of AR is accurately registering virtual objects against real scenes. Jerald[25] explored human-computer interaction design for VR systems, emphasizing the importance of user experience in achieving efficient interaction in a virtual environment, and pointed out that immersion and interactivity are key factors in the success of VR systems.
4.4. Ethical and Privacy Issues in Human-Computer Interaction
With the widespread application of human-computer interaction technology, privacy and ethical issues have become challenges that cannot be ignored. The collection, storage, and use of user data raise questions of personal privacy protection, especially in technologies such as voice assistants, virtual reality, and augmented reality. Zhou and Li[26] pointed out that as the frequency of interaction increases, so does the risk of privacy leakage, particularly with cloud-based data storage and processing.
Ethical concerns center on decision-making transparency and algorithmic bias. Shin[27] emphasized that artificial intelligence systems often lack transparency, making it difficult for users to understand their decision-making processes, which may lead to unfairness or bias. This problem is especially significant in autonomous driving and medical care. Additionally, Floridi and Taddeo[28] discussed the attribution of responsibility when machines make decisions autonomously.
To address these challenges, researchers have proposed data minimization and privacy protection techniques, such as differential privacy, to reduce data collection and safeguard user privacy[29]. On the ethical side, value-sensitive design (VSD) provides a way to incorporate ethical considerations into technology design.
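To make the differential privacy idea concrete, here is a minimal sketch of the Laplace mechanism applied to a simple usage statistic. The data, bounds, and epsilon value are hypothetical, and this is not a production-grade implementation.

```python
import numpy as np

def private_mean(values, lower, upper, epsilon, rng=np.random.default_rng()):
    """Toy Laplace-mechanism estimate of a mean under epsilon-differential
    privacy. Values are clipped to [lower, upper] so that any single user's
    contribution to the mean is bounded (its sensitivity)."""
    values = np.clip(np.asarray(values, dtype=float), lower, upper)
    n = len(values)
    sensitivity = (upper - lower) / n              # sensitivity of the clipped mean
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon)
    return values.mean() + noise

# Hypothetical per-user daily interaction time, in minutes.
interaction_minutes = [12, 45, 7, 30, 22, 18, 51, 9]
print(private_mean(interaction_minutes, lower=0, upper=60, epsilon=0.5))
```

The calibrated noise means the released statistic reveals little about whether any individual user's data was included, which is exactly the guarantee that makes aggregate analytics compatible with data minimization.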
5. Future Outlook: Opportunities and Challenges of AI and HCI
The continued development of AI and human-computer interaction (HCI) is expected to drive future interactive systems towards a more natural, smooth and personalized experience. Multimodal interaction will play a central role in this process, combining advanced AI technologies such as speech recognition and computer vision to promote more intuitive device interaction[7]. At the same time, the demand for personalized services will grow, thanks to the improvement of user profiles and the application of AI recommendation systems [30].
The integration of AI and HCI has broad application prospects in multiple industries. In the education field, adaptive learning systems can provide personalized content[31]; in the medical field, AI-driven diagnostic tools can improve patient care[32]. Smart cities will benefit from the optimization of smart transportation and resource management systems[33], while smart home devices improve the comfort and convenience of life by analyzing user habits.
6. Conclusion
This study explored the integration of artificial intelligence into human-computer interaction, highlighting the evolution from early command-line interfaces to modern, intelligent, context-aware systems. The findings demonstrate that incorporating AI technologies, such as natural language processing, deep learning, and computer vision, has significantly enhanced the user experience, making interactions more adaptive and personalized. However, challenges remain, particularly in handling multimodal data integration, privacy concerns, and the ethical implications of AI in decision-making.
While the research addressed key questions raised in the introduction, including the effectiveness of AI in advancing HCI, there are still limitations in the current study. For instance, the analysis was largely theoretical and relied on existing literature without empirical validation through user experiments. Future research should focus on developing robust methods for evaluating user experience in real-world applications, as well as investigating strategies to mitigate privacy risks and algorithmic biases.
In conclusion, the deepening integration of AI and HCI holds great potential for creating more natural, efficient, and user-centered interaction systems. Continued research in this field will likely focus on refining multimodal interfaces, improving the accuracy of context-aware responses, and addressing ethical challenges, ultimately contributing to the next generation of intelligent interactive technologies.
References
[1]. Scrivener, S. (2003). The Xerox Alto: A Retrospective. ACM SIGCHI Bulletin, 35(4), 49-55.
[2]. Hollnagel, E. (2009). The Handbook of Human-Machine Interaction: A Human-Centered Design Approach. CRC Press.
[3]. Nielsen, J. (1993). Usability Engineering. Academic Press.
[4]. Norman, D. A. (2010). Natural user interfaces are not natural. Interactions, 17(3), 6-10.
[5]. Hinton, G., Deng, L., Yu, D., Dahl, G. E., et al. (2012). Deep neural networks for acoustic modeling in speech recognition. IEEE Signal Processing Magazine, 29(6), 82-97.
[6]. LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436-444.
[7]. Wigdor, D., & Wixon, D. (2011). Brave NUI world: Designing natural user interfaces for touch and gesture. Elsevier.
[8]. Hinton, G., Deng, L., Yu, D., Dahl, G. E., Mohamed, A.-r., Jaitly, N., Senior, A., Vanhoucke, V., Nguyen, P., Sainath, T. N., & Kingsbury, B. (2012). Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups. IEEE Signal Processing Magazine, 29(6), 82–97.
[9]. Xiong, W., Wu, L., Alleva, F., Droppo, J., Huang, X., & Stolcke, A. (2018). The Microsoft 2017 Conversational Speech Recognition System. In ICASSP 2018 - IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 5934–5938). IEEE.
[10]. Hoy, M. B. (2018). Alexa, Siri, Cortana, and More: An Introduction to Voice Assistants. Medical Reference Services Quarterly, 37(1), 81–88.
[11]. Kepuska, V., & Bohouta, G. (2018). Next-generation of virtual personal assistants (Microsoft Cortana, Apple Siri, Amazon Alexa and Google Home). 2018 IEEE 8th Annual Computing and Communication Workshop and Conference (CCWC), 99-103.
[12]. Oberweger, M., Rad, M., & Lepetit, V. (2015). Hands Deep in Deep Learning for Hand Pose Estimation. Proceedings of the Computer Vision Winter Workshop (CVWW).
[13]. Molchanov, P., Yang, X., Gupta, S., Kim, K., Tyree, S., & Kautz, J. (2016). Online Detection and Classification of Dynamic Hand Gestures with Recurrent 3D Convolutional Neural Networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[14]. Li, S., & Deng, W. (2020). Deep Facial Expression Recognition: A Survey. IEEE Transactions on Affective Computing.
[15]. Zeng, Z., Pantic, M., Roisman, G. I., & Huang, T. S. (2009). A Survey of Affect Recognition Methods: Audio, Visual, and Spontaneous Expressions. IEEE Transactions on Pattern Analysis and Machine Intelligence.
[16]. Ricci, F., & Rokach, L. (2011). Recommender Systems Handbook. Springer.
[17]. Zhang, T., & Chen, J. (2019). Location-Based Context-Aware Mobile Assistant. Proceedings of the 2019 IEEE International Conference on Pervasive Computing and Communications (PerCom), 45-55.
[18]. Dey, A. K., & Abowd, G. D. (2000). Towards a better understanding of context and context-awareness. Proceedings of the 2000 ACM Conference on Human Factors in Computing Systems (CHI), 304-307.
[19]. Bardram, J.E. (2005). The Java Context Awareness Framework (JCAF)—A Service Infrastructure and Programming Framework for Context-Aware Applications. Proceedings of the Third International Conference on Pervasive Computing, Munich, May 2005, pp. 98-115.
[20]. Baltrusaitis, T., Ahuja, C., & Morency, L.-P. (2019). Multimodal machine learning: A survey and taxonomy. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(2), 423-443.
[21]. Atzori, L., Iera, A., & Morabito, G. (2010). The internet of things: A survey. Computer Networks, 54(15), 2787-2805.
[22]. Sheridan, T. B. (2016). Human–robot interaction: Status and challenges. Human Factors, 58(4), 525-532.
[23]. Bauer, A., Wollherr, D., & Buss, M. (2008). Human–robot collaboration: A survey. International Journal of Humanoid Robotics, 5(01), 47-66.
[24]. Azuma, R. T. (1997). A survey of augmented reality. Presence: Teleoperators and Virtual Environments, 6(4), 355-385.
[25]. Jerald, J. (2015). The VR book: Human-centered design for virtual reality. Morgan & Claypool Publishers.
[26]. Zhou, J., & Li, B. (2017). The rise of privacy concerns in human-computer interaction: Challenges and future directions. Computers in Human Behavior, 76, 511-519.
[27]. Shin, D. (2020). The effects of explainability and causability on algorithmic transparency, trust, and acceptance. Journal of Behavioral and Experimental Economics, 87, 101571.
[28]. Floridi, L., & Taddeo, M. (2016). What is data ethics?. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 374(2083), 20160112.
[29]. Gürses, S., & Berendt, B. (2010). The value of privacy in a digital world. IEEE Security & Privacy, 8(2), 52-55.
[30]. Ricci, F., & Rokach, L. (2011). Recommender Systems Handbook. Springer.
[31]. Chen, Z., Li, M., & Liang, P. (2020). Intelligent Adaptive Learning Systems in Online Education. Journal of Educational Technology & Society, 23(2), 80-94.
[32]. Topol, E. (2019). Deep Medicine: How Artificial Intelligence Can Make Healthcare Human Again. Basic Books.
[33]. Zanella, A., Bui, N., Castellani, A., Vangelista, L., & Zorzi, M. (2014). Internet of Things for Smart Cities. IEEE Internet of Things Journal, 1(1), 22-32.
Data availability
The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.
Disclaimer/Publisher's Note
The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of EWA Publishing and/or the editor(s). EWA Publishing and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.