Abstract
Osteoporosis is a prevalent metabolic bone disease that frequently remains undiagnosed due to limited access to bone mineral density (BMD) tests such as dual-energy X-ray absorptiometry (DXA). To address this issue, recent research has explored alternative indicators from peripheral skeletal sites to enable earlier, more accessible screening. In this paper, we propose a method to predict osteoporosis from hand and wrist X-ray images, which are widely available and cost-effective, though their association with DXA-based diagnoses is not yet fully established. Our approach employs an image segmentation model based on a mixture of probabilistic U-Net decoders, which captures predictive uncertainty when segmenting the ulna, radius, and metacarpal bones. The segmentation task is formulated as an optimal transport (OT) problem, effectively accommodating the anatomical variability inherent in medical images. Additionally, we adopt a self-supervised learning (SSL) strategy that pretrains the model on augmented, unlabeled data to learn robust, augmentation-invariant feature representations. These features are subsequently fine-tuned in a supervised classification task to distinguish osteoporotic cases from normal ones. We evaluate our method on X-rays from 192 individuals with verified DXA diagnoses. By combining uncertainty-aware segmentation and self-supervised feature learning, our framework offers a promising vision-based strategy for early osteoporosis detection using peripheral X-ray imaging.
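The mixture-of-decoders idea above can be illustrated with a minimal NumPy sketch. This is a hypothetical toy, not the authors' implementation: we assume K decoder heads each emit a per-pixel foreground probability map, the mixture combines them with learned weights, and the binary predictive entropy of the mixture serves as a per-pixel uncertainty map.

```python
import numpy as np

def mixture_uncertainty(decoder_probs, weights):
    """Combine K decoder probability maps into a mixture prediction
    and a per-pixel predictive-entropy uncertainty map.

    decoder_probs: (K, H, W) foreground probabilities from K decoder heads
    weights:       (K,) mixture weights summing to 1
    """
    weights = np.asarray(weights, dtype=float)
    # Weighted average over the decoder axis -> (H, W) mixture probability
    p = np.tensordot(weights, decoder_probs, axes=1)
    eps = 1e-12  # guard against log(0)
    # Binary predictive entropy of the mixture (high = decoders disagree)
    u = -(p * np.log(p + eps) + (1 - p) * np.log(1 - p + eps))
    return p, u

# Toy example: two equally weighted decoders that fully disagree,
# so the mixture is maximally uncertain everywhere.
probs = np.stack([np.full((4, 4), 0.9), np.full((4, 4), 0.1)])
p, u = mixture_uncertainty(probs, [0.5, 0.5])
# p is 0.5 everywhere; u is ln(2), the maximal binary entropy
```

Regions where the decoders disagree get high entropy, which is the kind of uncertainty signal the segmentation stage can expose to downstream analysis.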