Validation of an established TW3 artificial intelligence bone age assessment system: a prospective, multicenter, confirmatory study

Abstract

BACKGROUND: In 2020, our center established a Tanner-Whitehouse 3 (TW3) artificial intelligence (AI) system using a convolutional neural network (CNN), built on 9059 radiographs. However, the system, on which this study is based, lacked a gold standard for comparison and had not been thoroughly evaluated in different working environments.

METHODS: To further verify the applicability of the AI system in clinical bone age assessment (BAA) and to enhance the accuracy and homogeneity of BAA, a prospective multicenter validation was conducted. The study used 744 left-hand radiographs from patients aged 1 to 20 years (378 boys and 366 girls), obtained from nine children's hospitals between August and December 2020. BAAs were performed with the TW3 AI system and were also reviewed by experienced reviewers. Accuracy was evaluated by bone age accuracy within 1 year, root mean square error (RMSE), and mean absolute error (MAE). The Kappa test and Bland-Altman (B-A) plots were used to measure diagnostic consistency.

RESULTS: The system performed well, producing results that closely aligned with those of the reviewers. For the radius, ulna, and short bones series, it achieved an RMSE of 0.52 years and an accuracy of 94.55%. For the carpal series, it achieved an RMSE of 0.85 years and an accuracy of 80.38%. Overall, accuracy and RMSE were satisfactory, particularly in patients over 7 years old. The system excelled in evaluating the carpal bone age of patients aged 1-6 years. Both the Kappa test and the B-A plots demonstrated substantial consistency between the system and the reviewers, although the model had difficulty consistently distinguishing specific bones, such as the capitate. Furthermore, the system's performance was acceptable across genders, age groups, and radiography instruments.

CONCLUSIONS: In this multicenter validation, the system showed its potential to enhance the efficiency and consistency of healthcare delivery, ultimately resulting in improved patient outcomes and reduced healthcare costs.
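The agreement statistics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' analysis code. The function names, the use of an inclusive 1-year tolerance, the sample standard deviation with 1.96 multipliers for the Bland-Altman limits of agreement, and the unweighted form of Cohen's kappa are all assumptions made for illustration; the abstract does not specify these details.

```python
import math
from collections import Counter


def agreement_metrics(ai_ages, reviewer_ages, tolerance_years=1.0):
    """Compare AI bone ages with reviewer bone ages (both in years).

    Hypothetical helper: the inclusive tolerance and the 1.96 * SD limits
    of agreement are assumptions, not details taken from the study.
    """
    if len(ai_ages) != len(reviewer_ages) or len(ai_ages) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    diffs = [a - r for a, r in zip(ai_ages, reviewer_ages)]
    n = len(diffs)

    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    mae = sum(abs(d) for d in diffs) / n
    # Share of cases whose absolute difference is within the tolerance.
    within = sum(1 for d in diffs if abs(d) <= tolerance_years) / n

    # Bland-Altman: mean difference (bias) and limits of agreement.
    bias = sum(diffs) / n
    if n > 1:
        sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    else:
        sd = 0.0
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)

    return {
        "rmse": rmse,
        "mae": mae,
        "accuracy_within_tolerance": within,
        "bias": bias,
        "limits_of_agreement": limits,
    }


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters' categorical ratings,
    for example the maturity stage assigned to a single bone."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(ratings_a)
    observed = sum(1 for a, b in zip(ratings_a, ratings_b) if a == b) / n

    # Chance agreement from each rater's marginal category frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)
```

Passing two equal-length lists of bone ages to `agreement_metrics` returns the error and consistency figures in one dictionary, while `cohen_kappa` takes two lists of per-bone stage labels. A weighted kappa may be more appropriate for ordered stages; the unweighted version is used here only to keep the sketch short.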
