
Enhanced TransFormer-based Algorithm for Key-frame Action Recognition in Basketball Shooting
- 1 Electronics and Communications Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, China
- 2 Master of Science in Management and Systems, New York University, NY, USA
- 3 Business Administration, California Institute of Advanced Management, CA, USA
* Author to whom correspondence should be addressed.
Abstract
This paper proposes an enhanced TransFormer-based algorithm for key-frame action recognition in basketball shooting. It addresses two challenges in complex basketball environments, accurate temporal localization and feature extraction, through architectural changes to the deep learning model. The approach integrates a multi-scale feature fusion mechanism with an improved spatio-temporal attention structure, enabling robust recognition of shooting actions across varying conditions. A novel position encoding scheme captures temporal relationships in shooting sequences, while the enhanced attention mechanism enables more precise key-frame identification. In experiments on basketball shooting datasets, the proposed model achieves 92.8% accuracy in action recognition and outperforms existing approaches by 4.3% in mean average precision. The architecture remains computationally efficient, processing video sequences in real time at 30 frames per second. Ablation studies confirm the effectiveness of the individual components, with the spatio-temporal attention mechanism contributing the largest performance gain. The system performs robustly across different shooting styles and environmental conditions, making it suitable for practical applications in basketball training and analysis.
Keywords
Key-frame Action Recognition, Enhanced TransFormer Architecture, Spatio-temporal Attention Mechanism, Basketball Shooting Analysis
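The abstract does not spell out the position encoding or the attention structure, so the following is only an illustrative sketch of the general idea: per-frame feature vectors receive a temporal position encoding, self-attention is computed over the frames, and the frame that receives the most attention is taken as the key frame. Everything here is an assumption made for illustration, not the authors' method: the standard sinusoidal encoding stands in for the paper's novel scheme, the single-head attention uses no learned projections, and the function names (sinusoidal_encoding, temporal_attention, key_frame_index) are hypothetical.

```python
import math


def sinusoidal_encoding(num_frames, dim):
    """Standard sinusoidal position encoding, used here as a stand-in."""
    enc = []
    for t in range(num_frames):
        row = []
        for i in range(dim):
            angle = t / (10000 ** (2 * (i // 2) / dim))
            # Even dimensions use sine, odd dimensions use cosine.
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        enc.append(row)
    return enc


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(frames):
    """Scaled dot-product self-attention weights (queries = keys = input)."""
    scale = math.sqrt(len(frames[0]))
    weights = []
    for query in frames:
        scores = [sum(a * b for a, b in zip(query, key)) / scale
                  for key in frames]
        weights.append(softmax(scores))
    return weights


def key_frame_index(frame_features):
    """Return (index, received): the frame with the most total attention."""
    num_frames, dim = len(frame_features), len(frame_features[0])
    pos = sinusoidal_encoding(num_frames, dim)
    # Add the temporal encoding to each frame's feature vector.
    encoded = [[f + p for f, p in zip(feat, enc)]
               for feat, enc in zip(frame_features, pos)]
    weights = temporal_attention(encoded)
    # Sum each column: how much attention frame k receives from all frames.
    received = [sum(weights[q][k] for q in range(num_frames))
                for k in range(num_frames)]
    best = max(range(num_frames), key=received.__getitem__)
    return best, received


# Hypothetical 4-frame clip with 4-dimensional features per frame.
clip = [[0.1, 0.0, 0.0, 0.0],
        [0.1, 0.0, 0.0, 0.0],
        [5.0, 5.0, 5.0, 5.0],
        [0.1, 0.0, 0.0, 0.0]]
index, received = key_frame_index(clip)
print(index, [round(r, 3) for r in received])
```

In a trained model the features would come from a visual backbone and the attention would use learned query, key and value projections; the sketch keeps only the scoring step that links attention weights to key-frame selection.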
Cite this article
Yan, L.; Weng, J.; Ma, D. (2025). Enhanced TransFormer-based Algorithm for Key-frame Action Recognition in Basketball Shooting. Applied and Computational Engineering, 142, 1-11.
Data availability
The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.
About volume
Volume title: Proceedings of MSS 2025 Symposium: Automation and Smart Technologies in Petroleum Engineering
© 2024 by the author(s). Licensee EWA Publishing, Oxford, UK. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.