Volume 151

Published on November 2025

Volume title: Proceedings of CONF-CIAP 2026 Symposium: Applied Mathematics and Statistics

ISBN: 978-1-80590-559-2 (Print) / 978-1-80590-560-8 (Online)
Conference date: 27 January 2026
Editor: Marwan Omar
Research Article
Published on 19 November 2025 DOI: 10.54254/2753-8818/2026.CH29709
Ziqiu Wang

Coaches and athletes need precise, actionable models to predict and interpret sprint-freestyle performance. This study presents an interpretable machine learning model applied to lap-by-lap metrics from A-final 100-yard freestyle swims (n = 67). We construct a 12-dimensional feature vector from three technical metrics (mean stroke rate, cycle count, and breakout distance) across four laps, and define both a regression task (continuous race-time prediction) and a binary classification task (fast/slow, threshold at 41.4 s). Several algorithms (Linear Regression, Random Forest, k-Nearest Neighbors (kNN), and support vector methods) were evaluated on multiple train/test splits using R², MAPE, accuracy, and F1 score. Regression R² values were low (best mean R² ≈ −0.042 for Random Forest), yet MAPE was small (~0.011): absolute errors were modest, but the models explained little of the variance in race time. Classification fared better: kNN recorded the best mean accuracy (≈0.727) and F1 (≈0.717). Most significantly, SHAP (Shapley Additive Explanations) identified Lap2_Stroke_Rate and Lap4_Breakout_Dist as two of the most influential features. Feature-selection tests showed that models trained only on the top-ranked features match the MAPE of the full feature set with significantly fewer inputs, pointing toward useful, interpretable, and data-efficient tools for performance monitoring and coaching decisions.
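The combination of a slightly negative R² with a small MAPE can look contradictory, so a minimal sketch may help. The race times below are made up for illustration and are not data from the study; the constant-output "model" is likewise hypothetical. The point is only that when all times cluster tightly around 41 s, a predictor that does no better than the mean can still have a percentage error below 1%.

```python
# Illustrative only: made-up race times (seconds), not data from the study.
y_true = [40.8, 41.0, 41.2, 41.4, 41.6, 41.8, 42.0]
# Hypothetical model that predicts the same time for every swim.
y_pred = [41.5] * len(y_true)


def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mape(actual, predicted):
    """Mean absolute percentage error, expressed as a fraction."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


r2 = r2_score(y_true, y_pred)
mape_value = mape(y_true, y_pred)
print(f"R^2 = {r2:.4f}, MAPE = {mape_value:.4f}")
```

With these numbers R² comes out slightly below zero (the constant prediction is marginally worse than predicting the sample mean), while MAPE stays under 0.01, because an error of a few tenths of a second is small relative to a 41-second race.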

Wang, Z. (2025). Interpretable Machine Learning for 100-Yard Freestyle Performance: SHAP-Driven Feature Selection of Lap-Level Stroke, Splits, and Breakout Metrics. Theoretical and Natural Science, 151, 1-11.