Abstract
Low back pain (LBP) is common among adolescent cricketers, often due to repetitive lumbar stress. This study investigated LBP among 450 adolescent cricketers in Dhaka City, Bangladesh, considering sociodemographic characteristics, game-related activities, preventive practices, and LBP-related history. Machine learning (ML) algorithms, including K-Nearest Neighbors, Random Forest, Logistic Regression, and Support Vector Machine (SVM), were applied to classify LBP severity. Severity was categorized into three classes (no pain, mild pain, and moderate pain) because there were insufficient data for a severe pain category. Among the models considered, the SVM with a sigmoid kernel performed best, with a test accuracy of 87.6%, precision of 90%, recall of 87.6%, and F1-score of 87.1%. Regression analysis was also applied to identify predictors of LBP. Key correlates included female gender (adjusted odds ratio [AOR] = 2.52), higher educational attainment (e.g., undergraduate: AOR = 5.38), elevated family income (e.g., > 60,000 BDT: AOR = 4.36), longer weekly practice duration (> 20 hours: higher prevalence of 81.7%), inconsistent warm-up (often/sometimes: AOR = 12.48-14.07) and cool-down practices (sometimes: AOR = 2.86), and prior LBP history (AOR = 6.92), all significantly associated with increased LBP risk (p < 0.05). The findings underline the importance of early intervention and prevention protocols for reducing LBP occurrence among junior cricket players. Overall, this work demonstrates that ML and regression models can help identify patterns of sports injury risk, inform data-driven prevention and management protocols, and provide a foundation for future studies on this subject. Limitations include the exclusion of a severe pain category due to insufficient data, which reduces the model's capacity to triage urgent cases requiring immediate intervention.
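As an illustration of how the reported classification metrics relate to one another, the following minimal sketch computes accuracy and support-weighted precision, recall, and F1-score from a three-class confusion matrix. The abstract does not state which averaging scheme was used; support-weighted averaging is assumed here because, under that scheme, recall always equals accuracy, which is consistent with the identical 87.6% values reported. The confusion matrix, the function name `weighted_metrics`, and the variable names are hypothetical and are not taken from the study data.

```python
from typing import Dict, List

# Class order assumed for illustration only.
LABELS = ["no pain", "mild pain", "moderate pain"]


def weighted_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Compute accuracy and support-weighted precision, recall, and F1.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j.
    """
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n_classes))

    precision_w = recall_w = f1_w = 0.0
    for k in range(n_classes):
        tp = cm[k][k]
        support = sum(cm[k])                                  # true members of class k
        predicted = sum(cm[i][k] for i in range(n_classes))   # predicted as class k
        p = tp / predicted if predicted else 0.0
        r = tp / support if support else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support / total                              # class share of the test set
        precision_w += weight * p
        recall_w += weight * r
        f1_w += weight * f1

    return {
        "accuracy": correct / total,
        "precision": precision_w,
        "recall": recall_w,
        "f1": f1_w,
    }


# Hypothetical confusion matrix (rows: true class, columns: predicted class).
# These counts are invented for illustration and are NOT the study's results.
EXAMPLE_CM = [
    [40, 3, 0],
    [4, 25, 2],
    [0, 2, 14],
]

metrics = weighted_metrics(EXAMPLE_CM)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under this assumption, weighted recall reduces to the number of correct predictions divided by the test set size, so it matches accuracy for any confusion matrix, while weighted precision and F1-score can differ from it.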