Intelligent identification of osteoporosis on hip X-rays using vision transformer

Abstract

OBJECTIVE: This study aimed to develop and evaluate a deep learning model based on the Vision Transformer (ViT) architecture for the automatic classification of hip X-ray images into three categories: normal bone mass, osteopenia, and osteoporosis. The goal was to explore the model's potential for early screening and auxiliary diagnosis of osteoporosis.
METHODS: A total of 3016 anteroposterior hip X-ray images were retrospectively collected from Hefei Hospital Affiliated to Anhui Medical University and affiliated community clinics. After standard preprocessing and extraction of proximal femur regions of interest (ROIs), the dataset was split into training and internal validation sets in an 8:2 ratio. A pretrained ViT model was fine-tuned for the three-class classification task and compared with conventional convolutional neural networks (ResNet50 and InceptionV3). Performance was assessed using accuracy, area under the ROC curve (AUC), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). The model was further validated on an external dataset to assess its generalizability.
RESULTS: On the internal validation set, the ViT model achieved an overall classification accuracy of 97.0%. The AUCs for osteoporosis, osteopenia, and normal bone mass were 99.6%, 99.4%, and 99.9%, respectively. The PPVs were 96.9%, 94.1%, and 100%; the NPVs were 97.9%, 98.5%, and 99.2%. On the external validation set, the ViT model achieved an overall classification accuracy of 89.4%. The AUCs for osteoporosis, osteopenia, and normal bone mass were 96.5%, 91.6%, and 98.4%, respectively. The PPVs were 83.5%, 90.2%, and 91.3%; the NPVs were 94.5%, 91.3%, and 96.4%. The model demonstrated high sensitivity, specificity, PPV, and NPV across all classes, and outperformed both ResNet50 and InceptionV3 in overall diagnostic performance and classification stability.
CONCLUSION: The ViT-based deep learning model showed excellent performance in classifying bone mineral density using hip X-rays, with high accuracy and generalizability. Relying on routine X-ray images, this method provides a cost-effective and efficient tool for osteoporosis screening, with strong potential for clinical implementation in primary care settings.
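
The abstract reports per-class sensitivity, specificity, PPV, and NPV for a three-class problem but does not say how they were derived. A common convention, assumed here, is one-vs-rest: each class in turn is treated as "positive" and the other two as "negative". The minimal Python sketch below illustrates that convention on a small made-up label set. The labels, the class order, and the function names are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Sequence

# Assumed class order for illustration only; the study's encoding is not given.
CLASSES = ["normal", "osteopenia", "osteoporosis"]


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     n_classes: int) -> List[List[int]]:
    """Rows are true classes, columns are predicted classes."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def accuracy(cm: List[List[int]]) -> float:
    """Overall accuracy: correctly classified cases over all cases."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def one_vs_rest_metrics(cm: List[List[int]], k: int) -> Dict[str, float]:
    """Treat class k as positive and every other class as negative."""
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                   # true k, predicted as another class
    fp = sum(row[k] for row in cm) - tp    # another class, predicted as k
    tn = total - tp - fn - fp

    def ratio(num: int, den: int) -> float:
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Toy labels (NOT study data): indices into CLASSES.
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 2, 2, 2, 2, 1]

cm = confusion_matrix(y_true, y_pred, len(CLASSES))
print("accuracy:", accuracy(cm))
for k, name in enumerate(CLASSES):
    print(name, one_vs_rest_metrics(cm, k))
```

Under this convention, PPV and NPV depend on the class mix of the evaluated set, which is one reason they can shift between an internal and an external validation set even when sensitivity and specificity are similar.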
