Abstract
AIMS: Computer vision, the automated interpretation of radiological images using AI algorithms, has seen considerable recent interest in the domain of musculoskeletal disease. Routinely collected healthcare data are a significant potential source of information but are often complex and disorganized. We therefore set out to develop an AI-driven preprocessing pipeline for radiological hip and knee images taken from a regional NHS Picture Archiving and Communication System (PACS).
METHODS: De-identified Scottish regional imaging data were ingested and stored in a specialist platform designed for safe healthcare AI development, as part of the AI to revolutionize the patient Care pathway in Hip and knEe aRthroplastY (ARCHERY) project. The preprocessing pipeline consisted of initial identification and sorting of anteroposterior (AP) hip and knee images using a semisupervised learning approach, followed by separation of images with and without orthopaedic implants. Performance was assessed on designated test sets using standard performance metrics.
RESULTS: A total of 27,550 radiological images were available for inclusion. This comprised 10,111 designated pelvis and 6,496 knee radiographs, from 2,571 and 1,981 patients, respectively. Testing revealed perfect model performance for the identification of AP hip and knee images using a semisupervised ResNet model with a squeeze and excitation block (100% accuracy; recall, precision, area under the receiver operating characteristic curve (AUROC), and kappa all 1.00). Implant identification model performance using a Vision Transformer architecture was excellent for both the hip (accuracy 99.3%, recall 0.99, precision 0.96, AUROC 0.99, kappa 0.97, F1 score 0.97) and knee (accuracy 96.3%, recall 0.86, precision 0.97, AUROC 0.93, kappa 0.89, F1 score 0.91).
CONCLUSION: We demonstrate successful development of an AI-driven preprocessing pipeline for musculoskeletal images collated from routine NHS data sources. Use of such 'real-world' data is likely to be key to the development of clinically useful healthcare AI algorithms.
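
For readers less familiar with the performance metrics reported in the results, the following minimal Python sketch shows how accuracy, recall, precision, F1 score, and Cohen's kappa are conventionally derived from the confusion-matrix counts of a binary classifier. It is an illustration only and is not the study's evaluation code: the names `BinaryCounts`, `accuracy`, `recall`, `precision`, `f1_from_precision_recall`, and `cohen_kappa` are hypothetical, and the counts in the usage example are invented for demonstration rather than taken from the study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryCounts:
    """Confusion-matrix counts for a binary classifier (hypothetical helper)."""
    tp: int  # true positives, e.g. implant present and predicted present
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.fn + self.tn


def accuracy(c: BinaryCounts) -> float:
    """Proportion of all images classified correctly."""
    return (c.tp + c.tn) / c.total


def recall(c: BinaryCounts) -> float:
    """Proportion of truly positive images that were detected (sensitivity)."""
    denom = c.tp + c.fn
    return c.tp / denom if denom else 0.0


def precision(c: BinaryCounts) -> float:
    """Proportion of positive predictions that were correct."""
    denom = c.tp + c.fp
    return c.tp / denom if denom else 0.0


def f1_from_precision_recall(p: float, r: float) -> float:
    """F1 score: the harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


def cohen_kappa(c: BinaryCounts) -> float:
    """Agreement between prediction and label, corrected for chance agreement."""
    n = c.total
    observed = (c.tp + c.tn) / n
    # Chance agreement: product of marginal proportions, summed over both classes.
    expected = ((c.tp + c.fp) * (c.tp + c.fn) + (c.fn + c.tn) * (c.fp + c.tn)) / n**2
    return (observed - expected) / (1 - expected) if expected != 1 else 0.0


if __name__ == "__main__":
    # Invented counts for illustration only; not the study's test-set results.
    counts = BinaryCounts(tp=90, fp=5, fn=10, tn=895)
    p, r = precision(counts), recall(counts)
    print(f"accuracy={accuracy(counts):.3f} recall={r:.3f} precision={p:.3f}")
    print(f"f1={f1_from_precision_recall(p, r):.3f} kappa={cohen_kappa(counts):.3f}")
```

Because the F1 score is the harmonic mean of precision and recall, the reported F1 values can be checked directly against the reported precision and recall: `f1_from_precision_recall(0.96, 0.99)` rounds to 0.97 for the hip model and `f1_from_precision_recall(0.97, 0.86)` rounds to 0.91 for the knee model.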