
GazeLink: A multi-language low-cost mobile eye-gesture communication system with large language models for people with amyotrophic lateral sclerosis
1 The Webb Schools, Claremont, CA, United States
* Author to whom correspondence should be addressed.
Abstract
People with amyotrophic lateral sclerosis (ALS) who have severe motor and speech impairments rely mostly on their eyes and assistive technology to communicate. However, existing high-tech products are expensive and hard to access, while low-tech products are inefficient and restrictive. To address these limitations, this research proposes GazeLink, a multi-language, low-cost mobile application that lets ALS patients communicate efficiently using only eye movements. First, the system recognizes user eye gestures such as left or up with machine learning and a template-matching algorithm. It then converts the gestures to words through a keyboard that supports English, Spanish, and Chinese. For efficiency, the system employs large language models (LLMs) to generate a suitable sentence from the words typed by the user and the conversational context. Finally, the system provides text-to-speech and social media posting services for both verbal and digital eye-gesture communication. Simulations show that sentence generation with LLMs can reduce user keystrokes by 81% while preserving 90% semantic similarity. Usability studies with 30 participants show that GazeLink recognizes eye gestures with 94.1% accuracy under varying lighting. After learning the user interface in fewer than 10 attempts, first-time participants typed sentences of various lengths with their eyes at 15.1 words per minute, 7.2x faster than the common low-tech solution E-Tran. The experiments demonstrate GazeLink's efficiency, learnability, and accuracy in eye-gesture text entry. The system is highly affordable (under $0.10 a month), portable, and easily accessible online, and it supports different users, lighting conditions, smartphones, and languages. Product testing with ALS patients and personalized LLM models will be the next step.
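The template-matching step described above can be illustrated with a minimal sketch: an observed gaze trajectory is compared against a small set of direction templates and labeled by nearest match. The template shapes, lengths, and distance measure here are illustrative assumptions, not the paper's actual implementation.

```python
import math

# Hypothetical gesture templates: each is a short sequence of normalized
# (dx, dy) gaze displacements. Values are illustrative only.
TEMPLATES = {
    "left":  [(-1.0, 0.0)] * 4,
    "right": [(1.0, 0.0)] * 4,
    "up":    [(0.0, -1.0)] * 4,
    "down":  [(0.0, 1.0)] * 4,
}

def distance(seq_a, seq_b):
    """Mean Euclidean distance between two equal-length point sequences."""
    return sum(math.dist(a, b) for a, b in zip(seq_a, seq_b)) / len(seq_a)

def classify(trajectory):
    """Return the label of the template closest to the observed trajectory."""
    return min(TEMPLATES, key=lambda name: distance(trajectory, TEMPLATES[name]))

# A noisy leftward gaze sweep is matched to the "left" template.
print(classify([(-0.9, 0.1), (-1.1, -0.1), (-0.8, 0.0), (-1.0, 0.2)]))  # left
```

In practice a real system would first normalize trajectory length and amplitude (e.g., by resampling), but the nearest-template principle is the same.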
Keywords
Assistive Technology (AT), Eye Gesture, Amyotrophic Lateral Sclerosis (ALS), Human-Computer Interaction (HCI), Large Language Model (LLM), Computer Vision (CV)
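The reported 81% keystroke reduction corresponds to a keystroke savings rate: the fraction of characters the user no longer has to type when the LLM expands a few keywords into a full sentence. A minimal sketch of that metric, with a made-up example sentence, is:

```python
def keystroke_savings_rate(full_sentence: str, typed_keywords: str) -> float:
    """KSR = 1 - (keystrokes actually typed / keystrokes for the full sentence)."""
    return 1.0 - len(typed_keywords) / len(full_sentence)

# Illustrative: the user types two keywords and the LLM produces the sentence.
print(round(keystroke_savings_rate("I would like some water please", "water please"), 2))
```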
Cite this article
Sun, X. (2024). GazeLink: A multi-language low-cost mobile eye-gesture communication system with large language models for people with amyotrophic lateral sclerosis. Applied and Computational Engineering, 88, 88-104.
Data availability
The datasets used and/or analyzed during the current study are available from the authors upon reasonable request.
Disclaimer/Publisher's Note
The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of EWA Publishing and/or the editor(s). EWA Publishing and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
About volume
Volume title: Proceedings of the 6th International Conference on Computing and Data Science
© 2024 by the author(s). Licensee EWA Publishing, Oxford, UK. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license. Authors who publish this series agree to the following terms:
1. Authors retain copyright and grant the series right of first publication, with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this series.
2. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the series's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this series.
3. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (see Open access policy for details).