Abstract
Background/Objectives: In this study we address the problem of medical image segmentation by introducing a new loss-function envelope derived from the Top-k loss strategy. We exploit the fact that, for semantic segmentation, the training loss is computed at two levels: the pixel level and the image level. The problems of interest often share particular characteristics: noisy annotations at the pixel level and limited data, but accurate annotations at the image level.

Methods: To address these issues, we adopt the Top-k strategy at the image level and a "Bottom all but σ" strategy at the pixel level. To handle the discontinuities of the derivatives encountered during training, a derivative-smoothing procedure is introduced.

Results: The method is thoroughly and successfully tested, in conjunction with a variety of backbone models, on several medical image segmentation tasks spanning different acquisition modalities and body regions: burned-skin-area segmentation in standard color images, segmentation of fetal abdominal structures in ultrasound images, and segmentation of the ventricles and myocardium in cardiac MRI, in all cases yielding performance improvements.

Conclusions: The proposed mechanism enhances model training by selectively emphasizing certain loss values through two complementary strategies. Its benefits are clearest in challenging scenarios, where the segmentation problem is inherently difficult or where the quality of the pixel-level annotations is degraded by noise or inconsistencies. The approach performs equally well with convolutional neural network (CNN) and vision transformer (ViT) architectures.
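The two selection strategies named in the abstract can be sketched as follows. This is a minimal illustration of the general idea only: the function names are hypothetical, the hard truncation shown here is a simplification, and the paper's derivative-smoothing procedure (which removes the discontinuities such truncation introduces) is omitted.

```python
import numpy as np

def topk_mean(losses, k):
    # Top-k aggregation (image level): average only the k largest
    # per-image loss values, focusing training on the hardest images.
    losses = np.sort(np.asarray(losses, dtype=float))
    return losses[-k:].mean()

def bottom_all_but_sigma_mean(losses, sigma):
    # "Bottom all but sigma" (pixel level): discard the sigma largest
    # per-pixel losses (suspected annotation noise) and average the rest.
    losses = np.sort(np.asarray(losses, dtype=float))
    kept = losses[:-sigma] if sigma > 0 else losses
    return kept.mean()

# Toy per-image losses in a batch, and per-pixel losses with one outlier.
image_losses = [0.2, 0.9, 0.1, 0.7]
pixel_losses = [0.05, 0.1, 2.5, 0.08]

print(topk_mean(image_losses, k=2))                     # -> 0.8
print(bottom_all_but_sigma_mean(pixel_losses, sigma=1)) # outlier excluded
```

In practice both aggregations would be applied to differentiable loss tensors inside the training loop, with the hard sort-and-cut replaced by the smoothed variant the paper proposes.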