From Prediction to Prevention: Using Text Mining and Explainable Machine Learning for Urban Bus Accident Analytics



Abstract

Urban bus accidents present major safety and operational challenges, particularly in densely populated metropolitan areas. This study develops a machine learning-based analytical framework to identify, quantify, and interpret the factors associated with severe bus accidents. The framework integrates three components: (i) a structural topic model (STM) to extract latent accident scenarios from unstructured narrative data, (ii) an extreme gradient boosting (XGBoost) classifier to predict accident severity, and (iii) SHapley Additive exPlanations (SHAP) for post hoc interpretation of model outputs at both global and local levels. Using over 15,000 bus accident records (2013-2018) from a Tier-2 city in Jiangsu Province, China, the findings show that incorporating text-derived accident patterns markedly improves both predictive accuracy and interpretability. The analysis highlights elevated risks linked to rear-end collisions involving electric scooters, sudden stops leading to passenger injuries, and left-turn maneuvers in congested areas. SHAP-based explanations yield actionable insights for drivers, transit operators, and policymakers, facilitating targeted safety interventions. Methodologically, this study advances interpretable risk modeling through the integration of structured and unstructured data, and the modular analytical framework provides a transferable foundation for applications across diverse domains of transportation and risk analysis.
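The local and global interpretation step can be illustrated with a minimal sketch of exact Shapley attribution. It is not the study's implementation: the scoring function stands in for a trained XGBoost classifier, the feature names (`topic_rear_end_scooter`, `left_turn`, `congested`) and all weights are hypothetical, and absent features are replaced by an all-zero baseline record rather than by the background-data expectations that the SHAP library uses.

```python
from itertools import combinations
from math import factorial


def toy_severity_score(r):
    """Hypothetical stand-in for a trained severity classifier (assumption)."""
    score = 0.1
    score += 0.3 * r["topic_rear_end_scooter"]    # text-derived topic proportion
    score += 0.2 * r["left_turn"]                 # structured indicator
    score += 0.2 * r["left_turn"] * r["congested"]  # interaction term
    return score


def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline).

    Features outside a coalition are set to their baseline value.
    Cost grows exponentially with the number of features, so this is
    only practical for a handful of features.
    """
    names = list(x)
    n = len(names)
    phi = {}
    for name in names:
        others = [f for f in names if f != name]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                without = {f: (x[f] if f in subset else baseline[f]) for f in names}
                with_feature = dict(without)
                with_feature[name] = x[name]
                total += weight * (predict(with_feature) - predict(without))
        phi[name] = total
    return phi


baseline = {"topic_rear_end_scooter": 0.0, "left_turn": 0, "congested": 0}
records = [
    {"topic_rear_end_scooter": 0.5, "left_turn": 1, "congested": 1},
    {"topic_rear_end_scooter": 0.9, "left_turn": 0, "congested": 1},
]

# Local explanation: per-feature contributions for one accident record.
record = records[0]
phi = shapley_values(toy_severity_score, record, baseline)

# Global explanation: mean absolute contribution across records.
all_phi = [shapley_values(toy_severity_score, r, baseline) for r in records]
global_importance = {
    f: sum(abs(p[f]) for p in all_phi) / len(all_phi) for f in baseline
}
```

The contributions for a record sum to the difference between its score and the baseline score, which is the additivity property that makes per-record explanations comparable across accidents. In this toy model the interaction between `left_turn` and `congested` is split equally between the two features.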
